Archive | Python

# NUFFT with Julia

Julia language offers an interesting alternative to python when crunching numbers. Python has several ways to improve its inherently low performance, such as numpy, cython, or numba.
In an excellent post Jake Vanderplas discusses how to use numba to achieve a performance similar to Fortran, without writing any cython or Fortran code. The results are amazing.
Because the philosophies behind Julia and numba are similar, I wanted to see how Julia would perform. As I am new to Julia, this was also a way to learn the language, and I am sure I have made several mistakes that I would appreciate if you readers can point out.

We’ll start with a Direct Fourier Transform (DFT). We’ll import some python things into Julia, to compare different results.

using PyCall
@pyimport nufft as nufft_fortran
@pyimport numpy as np

Next we define 3 versions of the DFT. The first one uses Python and is as close as I could get to In[1] in Jake’s post. The other two use element-wise operations instead of whole-array operations.

function nufftfreqs(M,df=1.0)
"""Compute the frequency range used in nufft for M frequency bins"""
df * [-fld(M, 2): M - fld(M, 2)-1]
end

function nudftpy(x, y, M, df=1.0, iflag=1)
"""Non-Uniform Direct Fourier Transform. Using numpy"""
sign = iflag < 0 ? -1 : 1
(1 / length(x)) * np.dot(y, np.exp(sign * 1im * x*transpose(nufftfreqs(M, df))))
end

function nudft(x, y, M, df=1.0, iflag=1)
"""Non-Uniform Direct Fourier Transform. Using whole array operations"""
sign = iflag < 0 ? -1 : 1
(1 / length(x)) * *(transpose(y''), exp(sign * 1im * x*transpose(nufftfreqs(M, df))))
end

function nudft2(x::Vector, y::Vector, M::Int, df=1.0, iflag=1)
"""Non-Uniform Direct Fourier Transform. Using comprehensions"""
freqs = nufftfreqs(M, df)
sign = iflag < 0 ? -1 : 1
n = size(x,1)
m = size(freqs, 1)
r = (1 / n) * [y[i]*exp(sign*1im*x[i]*freqs[j]) for i=1:n, j=1:m]
sum(r,1)
end

Now we test them:

x = 100 * rand(1000)
y = sin(x)
Y0 = @time nudftpy(x, y, 1000)
Y1 = @time nudft(x, y, 1000)
Y2 = @time nudft2(x, y, 1000)
Yf = @time nufft_fortran.nufft1(x, y, 1000)
print([np.allclose(Y0, Yf), np.allclose(Y1, Yf),np.allclose(Y2, Yf))
elapsed time: 0.272123779 seconds (32289652 bytes allocated)
elapsed time: 0.211703597 seconds (32309564 bytes allocated, 26.77% gc time)
elapsed time: 0.14629479 seconds (32874852 bytes allocated)
elapsed time: 0.207123126 seconds (32918144 bytes allocated, 28.28% gc time)
Bool[true,true,true]

The results agree! But they are incredibly inefficient. The functions are also slower than the same numpy function called from python (about twice as fast in python). Probably there is some overhead in calling python from Julia. Besides, the @time macro in Julia works differently than the @timeit magic function…

Our real interest is comparing Julia with numba, so I’ll go on. Here is an FFT implementation of the nuft_numpy function of Jake’s post (In[4]). To be easily identifiable, I’ve kept the same doc-strings and function names, which kind-of make no sense here…

function _compute_grid_params(M, epsilon)
# Choose Msp & tau from eps following Dutt & Rokhlin (1993)
if epsilon <= 1E-33 || epsilon >= 1E-1
error(@sprintf("eps = %f; must satisfy 1e-33 < eps < 1e-1.", epsilon))
end
ratio = epsilon > 1E-11 ? 2 : 3
Msp = itrunc(-log(epsilon) / (pi * (ratio - 1) / (ratio - 0.5)) + 0.5)
Mr = max(ratio * M, 2 * Msp)
lambda_ = Msp / (ratio * (ratio - 0.5))
tau = pi * lambda_ / M^2
Msp, Mr, tau
end

function nufft_python(x, c, M, df=1.0, epsilon=1E-15, iflag=1):
"""Fast Non-Uniform Fourier Transform with Python"""
Msp, Mr, tau = _compute_grid_params(M, epsilon)
N = length(x)

# Construct the convolved grid
ftau = zeros(typeof(c[1]), Mr)
Mr = size(ftau,1)
hx = 2pi / Mr
mm = [-Msp:Msp-1]
for i=1:N
xi = (x[i] * df) % (2 * pi)
m = 1 + div(xi,hx)
spread = exp(-0.25 * (xi - hx * (m + mm)).^2 / tau)
ftau[1+mod(m + mm, Mr)] += c[i] * spread
end
# Compute the FFT on the convolved grid
if iflag < 0
Ftau = (1 / Mr) * fft(ftau)
else
Ftau = ifft(ftau)
end
Ftau = [Ftau[end-div(M,2)+1:end], Ftau[1:div(M,2)+M%2]]
# Deconvolve the grid using convolution theorem
k = nufftfreqs(M)
(Ftau.*(1 / N) * sqrt(pi / tau)).* exp(tau * k.^2)
end

Following again Jake’s post, I write a couple of functions to test our implementations. They are a litteral translation from the Python code.

function test_nufft(nufft_func, M=1000, Mtime=100000)
# Test vs the direct method
print(repeat("-",30), "\n")
print("testing ",nufft_func, "\n")
x = 100 * rand(M + 1)
y = sin(x)
for df in [1.0, 2.0]
for iflag in [1, -1]
F1 = nudft(x, y, M, df, iflag)
F2 = nufft_func(x, y, M, df, 1E-15, iflag)
assert(all(x -> isapprox(x...), zip(F1, F2)))
end
end
print("- Results match the DFT\n")

# Time the nufft function
x = 100 * rand(Mtime)
y = sin(x)
times = Float64[]
for i = 1:5
tic()
F = nufft_func(x, y, Mtime)
t1 = toq()
push!(times,t1)
end
@printf("- Execution time (M=%d): %.2f sec\n",Mtime, median(times))
end

Let’s test it:

test_nufft(nufft_python)
test_nufft(nufft_fortran.nufft1)
------------------------------
testing nufft_python
- Results match the DFT
- Execution time (M=100000): 1.07 sec
------------------------------
testing fn
- Results match the DFT
- Execution time (M=100000): 0.12 sec

The results are an order of magnitude slower than the fortran code. They are about three times faster than the numpy_python code in my computer (3.7 sec). Good! So pure Julia is faster than simple python!

Python spends most of the time in the loop. This can be improved using numpy add function. As this function does not exist in Julia, I used a loop instead. The result is pretty cumbersome, but it was just a game to see if I could get something close to the numpy version.

function nufft_numpy(x, y, M, df=1.0, epsilon=1E-15, iflag=1):
"""Fast Non-Uniform Fourier Transform"""
Msp, Mr, tau = _compute_grid_params(M, epsilon)
N = length(x)
# Construct the convolved grid
ftau = zeros(typeof(y[1]), Mr)
hx = 2pi / Mr
xmod = map(mod2pi, x*df)
m = 1+ int(xmod/hx)
mm = [-Msp:Msp-1]
for (i,s) in zip(map(xi->1+mod(xi, Mr), mpmm), spread)
ftau[i] += s
end
# Compute the FFT on the convolved grid
if iflag < 0
Ftau = (1 / Mr) * fft(ftau)
else
Ftau = ifft(ftau)
end
Ftau = [Ftau[end-div(M,2)+1:end], Ftau[1:div(M,2)+M%2]]
# Deconvolve the grid using convolution theorem
k = nufftfreqs(M)
(Ftau.*(1 / N) * sqrt(pi / tau)).* exp(tau * k.^2)
end

test_nufft(nufft_numpy)
test_nufft(nufft_fortran.nufft1)
------------------------------
testing nufft_numpy
- Results match the DFT
- Execution time (M=100000): 4.58 sec
------------------------------
testing fn
- Results match the DFT
- Execution time (M=100000): 0.12 sec

So our attempt to emulate the numpy setup was a disaster. This could have been expected. After all, in python we were trying to remove a for loop, but these loops are not inherently slower in Julia, so that the complicated broadcastic resulted in a degradation of performance. It’s nice to see that complex code works worse!

Let’s see if the numba code results in more efficient Julia code. This is the line-by-line translation of the numba code (In[11]):

function build_grid(x, c, tau, Msp, ftau)
Mr = size(ftau,1)
hx = 2pi / Mr
for i=1:size(x,1)
xi = mod2pi(x[i])
m = 1 + int(xi/hx)
for mm=-Msp:Msp-1
ftau[1 + mod((m + mm) , Mr)] += c[i] * exp(-0.25 * (xi - hx * (m + mm))^2 / tau)
end
end
ftau
end

function nufft_numba(x, c, M, df=1.0, eps=1E-15, iflag=1)
"""Fast Non-Uniform Fourier Transform from Python numba code"""
Msp, Mr, tau = _compute_grid_params(M, eps)
N = length(x)

# Construct the convolved grid
ftau = build_grid(x * df, c, tau, Msp, zeros(typeof(c[1]), Mr))

# Compute the FFT on the convolved grid
if iflag < 0
Ftau = (1 / Mr) * fft(ftau)
else
Ftau = ifft(ftau)
end
Ftau = [Ftau[end-div(M,2)+1:end], Ftau[1:div(M,2)+M%2]]

# Deconvolve the grid using convolution theorem
k = nufftfreqs(M)
(1 / N) * sqrt(pi / tau) .* exp(tau * k.^2).*Ftau
end

test_nufft(nufft_numba)
test_nufft(nufft_fortran.nufft1)
------------------------------
testing nufft_numba
- Results match the DFT
- Execution time (M=100000): 0.32 sec
------------------------------
testing fn
- Results match the DFT
- Execution time (M=100000): 0.12 sec

This is a better performance! Last, we can potentially gain some more speed pre-calculating the exponentials. Again, this is a pure translation of Jake’s code.

function build_grid_fast(x, c, tau, Msp, ftau, E3)
Mr = size(ftau,1)
hx = 2pi / Mr
# precompute some exponents
for j=0:Msp
E3[j+1] = exp(-(pi * j / Mr)^2 / tau)
end
for i=1:size(x,1)
xi = mod2pi(x[i])
m = 1 + int(xi/hx)
xi = xi - hx * m
E1 = exp(-0.25 * xi^2 / tau)
E2 = exp((xi * pi) / (Mr * tau))
E2mm = 1
for mm=0:Msp-1
ftau[1+mod((m + mm) , Mr)] += c[i] * E1 * E2mm * E3[mm+1]
E2mm *= E2
ftau[1+mod((m - mm - 1) , Mr)] += c[i] * E1 / E2mm *E3[mm+2]
end
end
ftau
end

function nufft_numba_fast(x, c, M, df=1.0, eps=1E-15, iflag=1)
"""Fast Non-Uniform Fourier Transform from Python numba code"""
Msp, Mr, tau = _compute_grid_params(M, eps)
N = length(x)

# Construct the convolved grid
ftau = build_grid_fast(x * df, c, tau, Msp,
zeros(typeof(c[1]), Mr), zeros(typeof(x[1]), Msp+1) )

# Compute the FFT on the convolved grid
if iflag < 0
Ftau = (1 / Mr) * fft(ftau)
else
Ftau = ifft(ftau)
end
Ftau = [Ftau[end-div(M,2)+1:end], Ftau[1:div(M,2)+M%2]]

# Deconvolve the grid using convolution theorem
k = nufftfreqs(M)
(1 / N) * sqrt(pi / tau) .* exp(tau * k.^2).*Ftau
end

test_nufft(nufft_numba_fast)
test_nufft(nufft_fortran.nufft1)
------------------------------
testing nufft_numba_fast
- Results match the DFT
- Execution time (M=100000): 0.70 sec
------------------------------
testing fn
- Results match the DFT
- Execution time (M=100000): 0.12 sec

Here I am surprised to see that the performance is worse than before. I really don’t see any reason for that, and of course, a profiler should be the next step. Whereas the nufft_numba_fast in python is almost as efficient as the fortran code (0.14 sec vs. 0.11 sec), with Julia it is 100% slower than the simpler nufft_numba.

Conclusion? Julia is easy and powerful, but for those used to python, numba is a great alternative that can produce even faster code with less effort (for a Python programmer).

As I am new to Julia I may have made several mistakes and I would appreciate if readers can point them out.

You can find a very similar version of this post as a python notebook.

# PYMC potentials and priors

When I started learning Bayesian statistics I found very useful PYMC, as I needed to play with examples without having to implement MCMC myself or going through complicated integrals. A good source of information is Bayesian Methods for Hakers. But the outstanding blog by Jake VanderPlas has proven extremely enlightening, both in coding examples and underlying fundamental concepts.

PYMC implements potentials, but there are few examples of their uses. In this post I will show how priors can be implemented as potentials using two previously published examples. I hope this will show potentials, just as priors and likelihood, are just one more term in the posterior distribution that MCMC samples.

### Linear fit with non-uniform priors

Here I reproduce the linear fit from Jake Vanderplas post but using just two points:

xdata = np.array([0, 1.])
ydata = np.array([0, 2.])
edata = 1.* np.ones_like(xdata)

We will also use the same visualization functions defined in his post:

import pymc

# Create some convenience routines for plotting

def compute_sigma_level(trace1, trace2, nbins=20):
"""From a set of traces, bin by number of standard deviations"""
L, xbins, ybins = np.histogram2d(trace1, trace2, nbins)
L[L == 0] = 1E-16
logL = np.log(L)

shape = L.shape
L = L.ravel()

# obtain the indices to sort and unsort the flattened array
i_sort = np.argsort(L)[::-1]
i_unsort = np.argsort(i_sort)

L_cumsum = L[i_sort].cumsum()
L_cumsum /= L_cumsum[-1]

xbins = 0.5 * (xbins[1:] + xbins[:-1])
ybins = 0.5 * (ybins[1:] + ybins[:-1])

return xbins, ybins, L_cumsum[i_unsort].reshape(shape)

def plot_MCMC_trace(ax, xdata, ydata, trace, scatter=False, **kwargs):
"""Plot traces and contours"""
xbins, ybins, sigma = compute_sigma_level(trace[0], trace[1])
ax.contour(xbins, ybins, sigma.T, levels=[0.683, 0.955], **kwargs)
if scatter:
ax.plot(trace[0], trace[1], ',k', alpha=0.1)
ax.set_xlabel(r'$\alpha$')
ax.set_ylabel(r'$\beta$')

def plot_MCMC_model(ax, xdata, ydata, trace):
"""Plot the linear model and 2sigma contours"""
ax.plot(xdata, ydata, 'ok')

alpha, beta = trace[:2]
xfit = np.linspace(-1, 2, 10) #This is the only change from Jake's post!
yfit = alpha[:, None] + beta[:, None] * xfit
mu = yfit.mean(0)
sig = 2 * yfit.std(0)

ax.plot(xfit, mu, '-k')
ax.fill_between(xfit, mu - sig, mu + sig, color='lightgray')

ax.set_xlabel('x')
ax.set_ylabel('y')

def plot_MCMC_results(xdata, ydata, trace, colors='k'):
"""Plot both the trace and the model together"""
fig, ax = plt.subplots(1, 2, figsize=(10, 4))
plot_MCMC_trace(ax[0], xdata, ydata, trace, True, colors=colors)
plot_MCMC_model(ax[1], xdata, ydata, trace)

First we will use uniform priors. With uniform priors one has to make sure their range is large enough to include all values of the parameters. Otherwise the sampled parameters cluster around the boundary of the uniform range. You can try this by changing the range for beta to (-1, 1).

# Define the variables needed for the routine, with their prior distributions
alpha = pymc.Uniform('alpha', -10, 10)
beta = pymc.Uniform('beta', -10, 10)

# Define the form of the model and likelihood
@pymc.deterministic
def y_model(x=xdata, alpha=alpha, beta=beta):
return alpha + beta * x
# (We define the error as 2 std)
y = pymc.Normal('y', mu=y_model, tau=1. / (edata/2) ** 2, observed=True, value=ydata)

# package the full model in a dictionary
model1 = dict(alpha=alpha, beta=beta, y_model=y_model, y=y)

# run the basic MCMC:
S = pymc.MCMC(model1)
S.sample(iter=100000, burn=50000)

# extract the traces and plot the results
pymc_trace_unifo = [S.trace('alpha')[:],
S.trace('beta')[:]]

plot_MCMC_results(xdata, ydata, pymc_trace_unifo)
print()
print("alpha mean= {:.2f}".format(pymc_trace_unifo[0].mean()))
print("beta  mean= {:.2f}".format(pymc_trace_unifo[1].mean()))

[—————–100%—————–] 100000 of 100000 complete in 16.8 sec
alpha mean= 0.00
beta mean= 1.99

Now with potentials. A uniform prior results in a constant log-p term, so we need to add the lop-p of the prior as a potential with the same effects (the constant term does not affect the MCMC sampling):

# Define the variables needed for the routine, with their prior distributions
alpha = pymc.Uniform('alpha', -10, 10)
beta = pymc.Uniform('beta', -10, 10)

#Define a potential equivalent to a prior
@pymc.potential
def pot_prior(val=beta):
return -1.5 * np.log(1 + val ** 2)

# Define the form of the model and likelihood
@pymc.deterministic
def y_model(x=xdata, alpha=alpha, beta=beta):
return alpha + beta * x

y = pymc.Normal('y', mu=y_model, tau=1. / (edata/2) ** 2, observed=True, value=ydata)

# package the full model in a dictionary
model1 = dict(alpha=alpha, beta=beta, y_model=y_model, pot_prior=pot_prior, y=y)

# run the basic MCMC: we'll do 100000 iterations to match emcee above
S = pymc.MCMC(model1)
S.sample(iter=100000, burn=50000)

# extract the traces and plot the results
pymc_trace_pot = [S.trace('alpha')[:],
S.trace('beta')[:]]

plot_MCMC_results(xdata, ydata, pymc_trace_pot)
print()
print("alpha mean= {:.2f}".format(pymc_trace_prior[0].mean()))
print("beta  mean= {:.2f}".format(pymc_trace_prior[1].mean()))

[—————–100%—————–] 100000 of 100000 complete in 18.6 sec
alpha mean= 0.31
beta mean= 1.38

I find it surprising that the average fit does not run through the data points. On the other hand this is not unexpected: our prior for beta favours values close to 0. I guess an asymmetrical error model in the likelihood could oppose this trend, but this is really beyond the scope of my knowledge and I would really appreciate comments on that.

### Ridge regression

As a second example of potentials, we’ll analyze the post by Austin Rochford, which I also enjoyed reading. It discusses, among others, Bayesian Ridge regression. Ridge regression adds a quadratic potential to the least-squares objective function to minimize. This is done to achieve regularization. A Bayesian approach reveals that this is equivalent to using a normal prior for the slope beta parameters.

I refer again to his post for explanations and more insight. In that post, instead of sampling the posterior he just finds the Maximum a posteriori (MAP) estimate. Here I just show that the same value is obtained when using uniform priors and an external potential is added ad hoc as is done in traditional Ridge regression.

from scipy.stats import norm

n = 1000

x1 = norm.rvs(0, 1, size=n)
x2 = -x1 + norm.rvs(0, 10**-3, size=n)
x3 = norm.rvs(0, 1, size=n)

X = np.column_stack([x1, x2, x3])
y = 10 * x1 + 10 * x2 + 0.1 * x3

def get_coefficients(map_):
return [{str(variable): variable.value} for variable in map_.variables if str(variable).startswith('beta')]

beta1_pot = pymc.Uniform('beta1', -100.,100.)
beta2_pot = pymc.Uniform('beta2', -100.,100.)
beta3_pot = pymc.Uniform('beta3', -100.,100.)

@pymc.potential
def pot(beta1=beta1_pot, beta2=beta2_pot, beta3=beta3_pot):
tau = 1.
return -tau/2.*(beta1**2+beta2**2+beta3**2)

@pymc.deterministic
def y_hat_pot(beta1=beta1_pot, beta2=beta2_pot, beta3=beta3_pot, x1=x1, x2=x2, x3=x3):
return beta1 * x1 + beta2 * x2 + beta3 * x3

Y_pot = pymc.Normal('Y', mu=y_hat_pot, tau=1.0, value=y, observed=True)

pot_model = pymc.Model([Y_pot, beta1_pot, beta2_pot, beta3_pot, pot])
pot_map = pymc.MAP(pot_model)
pot_map.fit()

print("With potentials:\n",get_coefficients(pot_map))

#
# With priors. Based on
# http://www.austinrochford.com/posts/2013-09-02-prior-distributions-for-bayesian-regression-using-pymc.html
#

beta1_ridge = pymc.Normal('beta1', mu=0, tau=1.0)
beta2_ridge = pymc.Normal('beta2', mu=0, tau=1.0)
beta3_ridge = pymc.Normal('beta3', mu=0, tau=1.0)

@pymc.deterministic
def y_hat_ridge(beta1=beta1_ridge, beta2=beta2_ridge, beta3=beta3_ridge, x1=x1, x2=x2, x3=x3):
return beta1 * x1 + beta2 * x2 + beta3 * x3
Y_ridge = pymc.Normal('Y', mu=y_hat_ridge, tau=1.0, value=y, observed=True)

ridge_model = pymc.Model([Y_ridge, beta1_ridge, beta2_ridge, beta3_ridge])
ridge_map = pymc.MAP(ridge_model)
ridge_map.fit()

print("With priors:\n",get_coefficients(ridge_map))

With potentials:
[{‘beta1’: array(0.004691201218377386)}, {‘beta3’: array(0.09966093745608606)}, {‘beta2’: array(0.004958704169379989)}]
With priors:
[{‘beta3’: array(0.09966049909698146)}, {‘beta1’: array(0.004779905863422812)}, {‘beta2’: array(0.005045738743703612)}]

# A periodic table of objects

… or a first step into Python objects.

Let us assume that for some large project we need an easy way to
translate atomic symbols into element names. The first thing that comes
to your mind is to use a dictionary. As this seems a bijective relation,
keys can be symbols and values names, or vice-versa. This is an
important decision as it fixes the structure of the dictionary. I
believe keys should be symbols. Let me give you a couple of reasons.
First, there is one element W, that has 2 names, Tungsten and
Wolfram. Second, you might image extending the dictionary to include
names in other languages. Therefore:

elementsDict=dict(H='hydrogen',
He='helium',
Li='lithium' ) # And so on...

Now your project has grown and you would like to also keep the atomic
number of an element. Two simple solutions are creating a second
dictionary (duplicating work is not nice and error prone) or keeping a
tuple (or a list) for the values, such as:

elementsDict=dict(H=('hydrogen', 1),
He=('helium', 2),
Li=('lithium', 3) ) # And so on...

This makes accessing to the attributes of an element ugly, as you have
to remember the arbitrary order of the attribute in the tuple. Did I say
attributes? That sounds object oriented! In fact, it is easy to
visualize an element as something (an object!) with different
attributes such as its name, its atomic symbol, its atomic weight, etc.
These attributes are of different type and do not bear an ordered
relationship among themselves as a list assumes. So let’s make an
object:

class Element:
def __init__(self, name, atomicNumber):
self.name = name
self.atomicNumber = atomicNumber

elementsDict=dict(H=Element('hydrogen', 1),
He=Element('helium', 2),
Li=Element('lithium', 3) )

Let’s try it:

for symbol in elementsDict:
print("The atomic number of {} is {}".format(elementsDict[symbol].name,
elementsDict[symbol].atomicNumber))

The atomic number of hydrogen is 1
The atomic number of lithium is 3
The atomic number of helium is 2

At this point, you might think it would be useful for the element object
to also contain its symbol. That’s easy, because you can add attributes
to an object dynamically:

for symbol in elementsDict:
elementsDict[symbol].symbol = symbol

print("{} name is {}".format(elementsDict['H'].symbol,
elementsDict['H'].name))

H name is hydrogen

### The other way around

For the sake of completeness, imagine you have created a list of
elements with a class that contains the atomic symbol, and you then want
to create a dictionary from that list. I find this more clear as the
__init__ method then contains all the attributes. Here is how you
can do it:

class Element2:
def __init__(self, symbol, name, atomicNumber):
self.symbol = symbol
self.name = name
self.atomicNumber = atomicNumber

elements2List =[Element2('H', 'hydrogen', 1),
Element2('He', 'helium', 2),
Element2('Li', 'lithium', 3) ]
# Then create a dictionary with a comprehension:
elements2Dict = {element.symbol: element for element in elements2List}

Obviously it works as before:

for symbol in elements2Dict:
print("The atomic number of {} is {}".format(elements2Dict[symbol].name,
elements2Dict[symbol].atomicNumber))

The atomic number of hydrogen is 1
The atomic number of lithium is 3
The atomic number of helium is 2

# The IPython Notebook

The IPython notebook is amazing. It is so amazing that I could not resist writing yet another blog entry about it, when there are probably hundreds of sources of information about the notebook.

Imagine you have run some calculations with a certain code. Then you have a short script that copies output files and/or extracts important data into a text file, which you then import into a spreadsheet. You work with the data, and produce some plots which you copy&paste into a word-processor file where you have explained your aims, your set-up and the data manipulation.

Are you familiar with this? Don’t you think that the lab notebooks your colleagues have are more efficient? and more trustful? Aren’t you tired of repeating this cycle again and again every time you re-run some of the calculations?
If you answered ‘yes’ to some if these questions, the IPython notebook is for you. Even if you have never used python!

The Ipython notebook is the lab notebook of the computational scientists. You can insert complex mathematical formulas as with .\LaTeX. The recent extension to use a spell-check render the notebook even more useful. You can also insert videos and images from the web. For example to compare your results with that of a published paper a couple of lines suffice:

from IPython.display import Image
Image(url='http://www.ploscompbiol.org/article/fetchObject.action?uri=info:doi/10.1371/journal.pcbi.1000497\
.g001&amp;amp;representation=PNG_L')

Of course, you can also have python code in the notebook, But also bash, Ruby or Fortran code!

%%fortran
real function sincos(x)
implicit none
real*8 :: x
sincos = sin(x)*cos(x)
end function

And then use the Fortran function directly in your python cells (you’ll need to install the
%fortranmagic extension)

sincos(1.5) - math.sin(1.5)*math.cos(1.5) < 1e-7
True

You can share the code as static notebooks with the http://nbviewer.ipython.org or convert them into LaTeX, PDFs or HTML.

I’ve only sketched a few capabilities of the notebook. Take a look at these examples to decide where you want to focus. And after all the praise, some criticism:

• The notebook is evolving so fast that you never know if your installed version is able to do some of the awesome tricks you read somewhere. And the versions packed in the repositories usually lag behind. That means you are forced to do a manual installation.

I finish this post with a confession. I should have used IPython to blog directly. This is possible but I am new to wordpress.com and I still have to check how to do it. Shame on me!

# Python for Fortran programmers 8: Looking ahead

This series of posts were not a Python tutorial, just some tips for those Fortran programmers who are learning python. Once you know the basics of Python, you can focus on extensions useful to scientists.

The most essential one is Numpy, which gives python the ability to work efficiently with arrays. If you install Numpy, it is worth also installing Scipy. Scipy includes a wealth of algorithms that you probably need but you don’t want to code. From Fourier transforms and splines to minimizations and numerical integrations. There are excellent tutorials for both Numpy and Scipy and a good place to start is at http://www.scipy.org/.
Remember that although you can translate a Fortran code almost line by line into Python, the resulting code will not be optimal, neither for clarity nor efficiency. Learn to be Pythonic:
http://www.cafepy.com/article/be_pythonic/
http://blog.startifact.com/posts/older/what-is-pythonic.html
Use dictionaries, use sets, use list comprehension, and even consider using classes! Remember that almost everything is iterable in Python.
This is my last post for the series Python for Fortran programmers, but I will continue writing about Python tools that I find useful for my research. I hope they will also help other computational chemists and biophysicists.

# Python for Fortran programmers 7: default function arguments

Python arguments can have have a default value, in which case, they become optional. Its similar to Fortran optional arguments, but the implementation and the syntax is pretty different. For immutable objects, everything works as a Fortran programmer would expect. But for mutable objects we tend to be surprised.

Type the following function:

def function(data=[]):
data.append(1)
return data

and call the function several times, without arguments (function()). What is going on?

As usual, you can better visualize this process if you execute the code in: http://www.pythontutor.com
But it’s easy to understand, and if you do, you won’t be confused again. The first thing to remember is that lists are mutable. The second one is that the def statement is an executable statement, which is executed when the function is defined in that line. As an example, will Hello from f1! be printed if you execute only these lines of code?

def f1():
print("Hello from f1!")
return 1

def f2(x=f1()):
pass

See? You are defining functions but you are actually producing output!
So once the function is called and the object data is created, the same object is called all the time, pointing to the same data mutable object.

If you have understood everything, you should be able to predict the output of these lines:

def function(data=[]):
data.append(1)
return data

print(function())
print(function())
print(function())

f1 = function

def function(data=[]):
data.append(1)
print("Hello!")
return data

print(function())
print(function())
print(function())
print("f1 is function:", f1 is function)

You can expand this fascinating subject here: http://effbot.org/zone/default-values.htm from where I stole the example and some ideas, or here:
http://stackoverflow.com/questions/1132941/least-astonishment-in-python-the-mutable-default-argument

# Python for Fortran programmers 6: mutable and immutable

Before we deal with the “problem” of default values, we need to clearly understand what mutable and immutable objects are, because this is a concept that does not appear in Fortran.
Objects in Python can be mutable or immutable. This may seem pretty irrelevant. After all, what is the difference between:

a = 5 #immutable
a = [1, 2, 3 ] #mutable

True, mutable objects have some methods such as a.pop() that immutable objects don’t have. But again, you can think of workarounds:

b = (1, 2, 3)  #an immutable tuple
b = b[:-1] #all elements except the last one.

However it is important to realize that b is now a new object (it has a different memory address) having the same name as the previous one, whereas a remains the same object (same memory address) after the pop method is called. You can check that by calling id(a) and id(b) before and after the assignments. When we modify a mutable object, the object remains the same, but when we “modify” b, we create a new object:

a = [1, 2, 3]
>>> id(a)
139878335741320
>>> a.pop()
3
>>> a
[1, 2]
>>> id(a) #the same as before
139878335741320
>>> b = (1, 2, 3)
>>> id(b)
139878160519888
>>> b = b[:-1]
>>> b
(1, 2)
>>> id(b) # a NEW object with the same name
139878160530536

Now we can deal with the surprising behaviour of default arguments.

# Python for Fortran programmers 5: Objects?

For a Fortran programmer, reading that “everything in python is an object” does not help. It mostly frightens. Here I’ll try to give some oversimplified definitions so that you can read Python literature without being blocked by the language.

At the beginning you can forget about how objects work, just remember that anything like a Fortran variable (integers, arrays, etc) is an object.

Objects are class instances. Somehow the class is like the definition of the object, whereas the objects are the actual variables. In REAL :: x,y,z, REAL would be the class and x,y,z would be the objects.

Objects have methods and attributes. Again, for simplicity, methods are like Fortran functions, but associated with a certain object. And attributes are variables associated with that object. In Fortran, the closest thing to an attribute would be the fields in a TYPE variable.

As attributes and methods are associated with an object, its syntax is special: object.attribute and object.method().

In fact, some modules such as numpy (we’ll see more about numpy in the future) have methods implemented also as “normal” module functions. For example, if x is a numpy array we can call its method to get the mean:

x.mean()

or call the module function:

numpy.mean(x)

It is not easy to see the utility of objects, because many of the tutorial examples are too trivial. But here is one more example. There are many diagonalization subroutines in lapack. One for real matrices, one for complex, another one for symmetric, etc. They all need different names, and you have to check which one to apply to your particular matrix.

In Python, you could have all these different algorithms as methods in different classes, so that they could have the same name without collisions. Then:

x.diagonalize()

would call the right method depending on what the class of x.

# Python for Fortran programmers IV: only functions

Python only has functions, whereas Fortran has two types of procedures: functions and subroutines. Functions do something with the arguments and return a value. The analogy with mathematical functions is obvious: f(x)=cos2(x) is asking to be coded as a function. Subroutines are supposed to perform some tasks and/or return several arguments than cannot be returned with a function. A typical example could be a diagonalization subroutine, which returns the eigenvalues and the eigenvectors.
Fortran programmers find no problem in understanding Python functions, but how can we make a function that returns eigenvalues and eigenvectors?
Python is not the only language having only functions, in fact its pretty common, the exception being Fortran… But in Fortran we can code everything as a subroutine or everything as a function. subroutine cos2(x, res) could return the “function” value as an argument. And function diag(m, eigvec, eigval) could have the same arguments as the analogous subroutine and have a return value that just informs if the diagonalization succeeded.
This diagonalization function could be easily translated into a python function. Done? Not yet…
As Fortran programmers we tend to overuse the arguments to output results, and in Python the return value is much more flexible than in Fortran, because what the function returns is not fixed in size or type. So in Python it would make more sense to return the eigenvalues and the eigenvectors in the return statement and leave the arguments only as inputs to the function.
Let’s see two examples:
The Lapack diagonalization subroutine DSYEV has these arguments:

SUBROUTINE dsyev( JOBZ, UPLO, N, A, LDA, W, WORK, LWORK, INFO )

A contains the matrix to be diagonalized on input and the eigenvectors on output. The eigenvalues are returned in W. INFO returns 0 if successful. How is does the interface look like in Python?
Scipy contains wrappers to many Lapack subroutines. Although with Python we should call the higher-level routine scipy.linalg.eigh we can also call directly the Lapack routine:

>>> import numpy as np
>>> import scipy.linalg as la
>>> a = np.array(([[1,2,3], [2,4,5], [3,5,6]]))
>>> la.lapack.dsyev(a)
(array([ -0.51572947, 0.17091519, 11.34481428]), array([[ 0.73697623, 0.59100905, 0.32798528],
[ 0.32798528, -0.73697623, 0.59100905],
[-0.59100905, 0.32798528, 0.73697623]]), 0)

So it returns the eigenvalues, the eigenvectors and INFO. These arrays are created in the subroutine and can be assigned to variables which need not be defined. For example:

>>> vals, vecs, out = la.lapack.dsyev(a)

If the interface of dsyev was something like:

la.lapack.dsyev(a, w)

w should be defined before the call (with arbitrary values), something very unnatural in Python. I will discuss in the next post why I didn’t include info as an argument…

The second example is a function that concatenates two lists, only if the last element of the first list is the same as the first element of the second list. Otherwise it returns nothing (None):

def concat(l1, l2):
if l1[-1]==l2[0]:
return l1+l2
else:
return None

Because Python is dynamically typed, it is ok to return different types of variables in the same function depending on the case. Assigning a result to a variable will not produce an error:

l3 = concat([1,2,3], [4,5])

The code could continue by checking what is l3:

if l3 is None:
do something
else:
do something_else

So do not be afraid of returning complex lists from a function and avoid changing the arguments values as much as possible, as this is a more clear coding style.

# Python for Fortran programmers III: Learning Python

Learning Python is easier than learning Fortran. First, because the language has a simpler syntax and its interactive behaviour makes errors easier to catch and more understandable. Second, because there are lots of resources, many of them free: tutorials, books, etc.
These two interactive web-pages are a good example of the type of help you cannot find with Fortran:

That said, at some point you might feel dismayed by the size of the language. Python is much larger than Fortran. Both the core language and the available extensions. You’ll have to assume that, unless you become too hooked on it, there’ll be many areas of the language that you will – and can – ignore. With the risk of blasphemy, I dare say you can write hundreds of useful scripts completely ignoring Object Oriented Programming.

The number of available packages is also overwhelming. Take, for example, the calculation of the square root. In most tutorials you are told to import the math module which contains a sqrt function. math.sqrt will give an error for negative arguments. math.sqrt only returns floats. If you want deal with complex numbers, you need to import cmath. But cmath.sqrt always returns a complex number, even if its imaginary part is zero. All this is similar to the Fortran sqrt function (which implicitly calls two different functions for real and complex arguments). But with python you can both have your cake and eat it. There is a third sqrt function which returns a complex number only when the argument is negative. It can be found in the numpy.lib.scimath module. This third function can be useful to you, but you can also live without it. Getting used to ignoring all these potentially useful features is part of the learning process. Eventually you will remember and look for the ones you really need.

One last confusing issue is which Python version to learn. As with Fortran, the newest versions are better, but the problem with Python is that versions 3.x are not backward compatible with the wide-spread versions 2.x. My recommendation is that you learn python 3.x. Mainly because it is the future. Most packages have already been translated into Python 3. I think installing python3, numpy, scipy, sympy, and ipython is enough to start using python in scientific problems. Make sure you install these packages for python3. Each python version can host its own modules completely independently. If you are using an older distribution, you can also install them quite easily with pip. Of course, things can get more complex, as Konrad Hinsen reports in his interesting blog.

Despite their incompatibility, Python3 and Python2 are not that different. Probably, they are less different than FORTRAN77 and Fortran 2000. The most significant differences that you will find are that print() is a function in Python3 (it uses parenthesis) and that input() works differently.

And if you start programming, please do it in a nice way.