We’ll start with a Direct Fourier Transform (DFT). We’ll import some python things into Julia, to compare different results.
using PyCall @pyimport nufft as nufft_fortran @pyimport numpy as np
Next we define 3 versions of the DFT. The first one uses Python and is as close as I could get to In[1] in Jake’s post. The other two use element-wise operations instead of whole-array operations.
function nufftfreqs(M,df=1.0) """Compute the frequency range used in nufft for M frequency bins""" df * [-fld(M, 2): M - fld(M, 2)-1] end function nudftpy(x, y, M, df=1.0, iflag=1) """Non-Uniform Direct Fourier Transform. Using numpy""" sign = iflag < 0 ? -1 : 1 (1 / length(x)) * np.dot(y, np.exp(sign * 1im * x*transpose(nufftfreqs(M, df)))) end function nudft(x, y, M, df=1.0, iflag=1) """Non-Uniform Direct Fourier Transform. Using whole array operations""" sign = iflag < 0 ? -1 : 1 (1 / length(x)) * *(transpose(y''), exp(sign * 1im * x*transpose(nufftfreqs(M, df)))) end function nudft2(x::Vector, y::Vector, M::Int, df=1.0, iflag=1) """Non-Uniform Direct Fourier Transform. Using comprehensions""" freqs = nufftfreqs(M, df) sign = iflag < 0 ? -1 : 1 n = size(x,1) m = size(freqs, 1) r = (1 / n) * [y[i]*exp(sign*1im*x[i]*freqs[j]) for i=1:n, j=1:m] sum(r,1) end
Now we test them:
x = 100 * rand(1000) y = sin(x) Y0 = @time nudftpy(x, y, 1000) Y1 = @time nudft(x, y, 1000) Y2 = @time nudft2(x, y, 1000) Yf = @time nufft_fortran.nufft1(x, y, 1000) print([np.allclose(Y0, Yf), np.allclose(Y1, Yf),np.allclose(Y2, Yf))
elapsed time: 0.272123779 seconds (32289652 bytes allocated)
elapsed time: 0.211703597 seconds (32309564 bytes allocated, 26.77% gc time)
elapsed time: 0.14629479 seconds (32874852 bytes allocated)
elapsed time: 0.207123126 seconds (32918144 bytes allocated, 28.28% gc time)
Bool[true,true,true]
The results agree! But they are incredibly inefficient. The functions are also slower than the same numpy function called from python (about twice as fast in python). Probably there is some overhead in calling python from Julia. Besides, the @time
macro in Julia works differently than the @timeit
magic function…
Our real interest is comparing Julia with numba, so I’ll go on. Here is an FFT implementation of the nuft_numpy
function of Jake’s post (In[4]). To be easily identifiable, I’ve kept the same doc-strings and function names, which kind-of make no sense here…
function _compute_grid_params(M, epsilon) # Choose Msp & tau from eps following Dutt & Rokhlin (1993) if epsilon <= 1E-33 || epsilon >= 1E-1 error(@sprintf("eps = %f; must satisfy 1e-33 < eps < 1e-1.", epsilon)) end ratio = epsilon > 1E-11 ? 2 : 3 Msp = itrunc(-log(epsilon) / (pi * (ratio - 1) / (ratio - 0.5)) + 0.5) Mr = max(ratio * M, 2 * Msp) lambda_ = Msp / (ratio * (ratio - 0.5)) tau = pi * lambda_ / M^2 Msp, Mr, tau end function nufft_python(x, c, M, df=1.0, epsilon=1E-15, iflag=1): """Fast Non-Uniform Fourier Transform with Python""" Msp, Mr, tau = _compute_grid_params(M, epsilon) N = length(x) # Construct the convolved grid ftau = zeros(typeof(c[1]), Mr) Mr = size(ftau,1) hx = 2pi / Mr mm = [-Msp:Msp-1] for i=1:N xi = (x[i] * df) % (2 * pi) m = 1 + div(xi,hx) spread = exp(-0.25 * (xi - hx * (m + mm)).^2 / tau) ftau[1+mod(m + mm, Mr)] += c[i] * spread end # Compute the FFT on the convolved grid if iflag < 0 Ftau = (1 / Mr) * fft(ftau) else Ftau = ifft(ftau) end Ftau = [Ftau[end-div(M,2)+1:end], Ftau[1:div(M,2)+M%2]] # Deconvolve the grid using convolution theorem k = nufftfreqs(M) (Ftau.*(1 / N) * sqrt(pi / tau)).* exp(tau * k.^2) end
Following again Jake’s post, I write a couple of functions to test our implementations. They are a litteral translation from the Python code.
function test_nufft(nufft_func, M=1000, Mtime=100000) # Test vs the direct method print(repeat("-",30), "\n") print("testing ",nufft_func, "\n") x = 100 * rand(M + 1) y = sin(x) for df in [1.0, 2.0] for iflag in [1, -1] F1 = nudft(x, y, M, df, iflag) F2 = nufft_func(x, y, M, df, 1E-15, iflag) assert(all(x -> isapprox(x...), zip(F1, F2))) end end print("- Results match the DFT\n") # Time the nufft function x = 100 * rand(Mtime) y = sin(x) times = Float64[] for i = 1:5 tic() F = nufft_func(x, y, Mtime) t1 = toq() push!(times,t1) end @printf("- Execution time (M=%d): %.2f sec\n",Mtime, median(times)) end
Let’s test it:
test_nufft(nufft_python) test_nufft(nufft_fortran.nufft1)
------------------------------
testing nufft_python
- Results match the DFT
- Execution time (M=100000): 1.07 sec
------------------------------
testing fn
- Results match the DFT
- Execution time (M=100000): 0.12 sec
The results are an order of magnitude slower than the fortran code. They are about three times faster than the numpy_python code in my computer (3.7 sec). Good! So pure Julia is faster than simple python!
Python spends most of the time in the loop. This can be improved using numpy add
function. As this function does not exist in Julia, I used a loop instead. The result is pretty cumbersome, but it was just a game to see if I could get something close to the numpy version.
function nufft_numpy(x, y, M, df=1.0, epsilon=1E-15, iflag=1): """Fast Non-Uniform Fourier Transform""" Msp, Mr, tau = _compute_grid_params(M, epsilon) N = length(x) # Construct the convolved grid ftau = zeros(typeof(y[1]), Mr) hx = 2pi / Mr xmod = map(mod2pi, x*df) m = 1+ int(xmod/hx) mm = [-Msp:Msp-1] mpmm = broadcast(+, transpose(m), mm) spread = broadcast(*, exp(-0.25 * (transpose(xmod).- hx*mpmm).^ 2 / tau), transpose(y)) for (i,s) in zip(map(xi->1+mod(xi, Mr), mpmm), spread) ftau[i] += s end # Compute the FFT on the convolved grid if iflag < 0 Ftau = (1 / Mr) * fft(ftau) else Ftau = ifft(ftau) end Ftau = [Ftau[end-div(M,2)+1:end], Ftau[1:div(M,2)+M%2]] # Deconvolve the grid using convolution theorem k = nufftfreqs(M) (Ftau.*(1 / N) * sqrt(pi / tau)).* exp(tau * k.^2) end test_nufft(nufft_numpy) test_nufft(nufft_fortran.nufft1)
------------------------------
testing nufft_numpy
- Results match the DFT
- Execution time (M=100000): 4.58 sec
------------------------------
testing fn
- Results match the DFT
- Execution time (M=100000): 0.12 sec
So our attempt to emulate the numpy setup was a disaster. This could have been expected. After all, in python we were trying to remove a for loop, but these loops are not inherently slower in Julia, so that the complicated broadcastic resulted in a degradation of performance. It’s nice to see that complex code works worse!
Let’s see if the numba code results in more efficient Julia code. This is the line-by-line translation of the numba code (In[11]):
function build_grid(x, c, tau, Msp, ftau) Mr = size(ftau,1) hx = 2pi / Mr for i=1:size(x,1) xi = mod2pi(x[i]) m = 1 + int(xi/hx) for mm=-Msp:Msp-1 ftau[1 + mod((m + mm) , Mr)] += c[i] * exp(-0.25 * (xi - hx * (m + mm))^2 / tau) end end ftau end function nufft_numba(x, c, M, df=1.0, eps=1E-15, iflag=1) """Fast Non-Uniform Fourier Transform from Python numba code""" Msp, Mr, tau = _compute_grid_params(M, eps) N = length(x) # Construct the convolved grid ftau = build_grid(x * df, c, tau, Msp, zeros(typeof(c[1]), Mr)) # Compute the FFT on the convolved grid if iflag < 0 Ftau = (1 / Mr) * fft(ftau) else Ftau = ifft(ftau) end Ftau = [Ftau[end-div(M,2)+1:end], Ftau[1:div(M,2)+M%2]] # Deconvolve the grid using convolution theorem k = nufftfreqs(M) (1 / N) * sqrt(pi / tau) .* exp(tau * k.^2).*Ftau end test_nufft(nufft_numba) test_nufft(nufft_fortran.nufft1)
------------------------------
testing nufft_numba
- Results match the DFT
- Execution time (M=100000): 0.32 sec
------------------------------
testing fn
- Results match the DFT
- Execution time (M=100000): 0.12 sec
This is a better performance! Last, we can potentially gain some more speed pre-calculating the exponentials. Again, this is a pure translation of Jake’s code.
function build_grid_fast(x, c, tau, Msp, ftau, E3) Mr = size(ftau,1) hx = 2pi / Mr # precompute some exponents for j=0:Msp E3[j+1] = exp(-(pi * j / Mr)^2 / tau) end # spread values onto ftau for i=1:size(x,1) xi = mod2pi(x[i]) m = 1 + int(xi/hx) xi = xi - hx * m E1 = exp(-0.25 * xi^2 / tau) E2 = exp((xi * pi) / (Mr * tau)) E2mm = 1 for mm=0:Msp-1 ftau[1+mod((m + mm) , Mr)] += c[i] * E1 * E2mm * E3[mm+1] E2mm *= E2 ftau[1+mod((m - mm - 1) , Mr)] += c[i] * E1 / E2mm *E3[mm+2] end end ftau end function nufft_numba_fast(x, c, M, df=1.0, eps=1E-15, iflag=1) """Fast Non-Uniform Fourier Transform from Python numba code""" Msp, Mr, tau = _compute_grid_params(M, eps) N = length(x) # Construct the convolved grid ftau = build_grid_fast(x * df, c, tau, Msp, zeros(typeof(c[1]), Mr), zeros(typeof(x[1]), Msp+1) ) # Compute the FFT on the convolved grid if iflag < 0 Ftau = (1 / Mr) * fft(ftau) else Ftau = ifft(ftau) end Ftau = [Ftau[end-div(M,2)+1:end], Ftau[1:div(M,2)+M%2]] # Deconvolve the grid using convolution theorem k = nufftfreqs(M) (1 / N) * sqrt(pi / tau) .* exp(tau * k.^2).*Ftau end test_nufft(nufft_numba_fast) test_nufft(nufft_fortran.nufft1)
------------------------------
testing nufft_numba_fast
- Results match the DFT
- Execution time (M=100000): 0.70 sec
------------------------------
testing fn
- Results match the DFT
- Execution time (M=100000): 0.12 sec
Here I am surprised to see that the performance is worse than before. I really don’t see any reason for that, and of course, a profiler should be the next step. Whereas the nufft_numba_fast
in python is almost as efficient as the fortran code (0.14 sec vs. 0.11 sec), with Julia it is 100% slower than the simpler nufft_numba
.
Conclusion? Julia is easy and powerful, but for those used to python, numba is a great alternative that can produce even faster code with less effort (for a Python programmer).
As I am new to Julia I may have made several mistakes and I would appreciate if readers can point them out.
You can find a very similar version of this post as a python notebook.
Implying that science stands on three feet and that each of them is connected to the other two. As a scientist working with computer models I like this idea. I am not the only one: 4 out the first 5 images I get come from sites about computation! [1]
However I dare say that our beloved structure of sciences is wrong: experiment and computation cannot talk to each other directly. I believe the following scheme would be more appropriate.
Experiments produce measurements of several quantities: X-ray diffraction patterns, SAXS curves, NMR relaxation times, IR spectra, etc. None of these can be directly compared to a computation. In the molecular sciences, computations mainly produce structures and their energies. Producing those already involves a lot of theory (equations of motion, interactions between particles) but it is pretty well established. Reproducing the experimental result from the simulated coordinates is usually much more complex.
That means that we need an important amount of theory to generate experiment-like data from a computation. And we also need to be aware of the modelling that comes after any experimental data.
Take the structures of a PDB file: they are modelled from a diffraction pattern. This pattern is the experimental data, the PDB structures are not. Even if the diffraction works fine, some assumptions in the model could be responsible for a disagreement between structures from a simulation and structures computed from the diffraction. Bojan Zagrovic has a nice work on that issue.
On the other hand, if we fail to reproduce the SAXS curve of a protein, it can be for several reasons. In most of the cases we assume it is because our calculated structure (or structural ensemble) does not reproduce the ensemble in the experiment. But we could have the exact structures and even fail to reproduce the experimental curve because we are computing that curve incorrectly. Try different SAXS predictors and you will get different curves for the same structure. Do no blame everything on the limitations of current force fields or an insufficient sampling!
In many cases, the computations needed to go from a structural ensemble to an experimental data is a no-man’s land. Computational scientists do not feel confident with the models behind every experimental technique. They (we) put a lot of emphasis on generating the right structures and use the codes to generate the experimental result as black boxes.
Experimentalists that use these codes try to avoid long discussions of why a code works better than another. After all, that is not the aim of their research and may distract the reader from the main point of their results.
Added to that is the difficulty in calculating the errors and the precision of these predictors, as we can never know exactly both the structure in solution and its measured experimental data. So it is very difficult to get benchmarks.
So the next time you see a disagreement between calculated and expermental results remember to question all the steps in the computation. Including those used by experimentalists!
[1] http://cds130.org/sheng/Introduction
http://www.shodor.org/master/environmental/water/runoff/about.html
http://www.socrates.if.usp.br/~wtc/?q=role-computing-science
http://www.tchpc.tcd.ie/computational-science
http://phycomp.technion.ac.il/~phr76ja/doh/doh1.html
PYMC implements potentials, but there are few examples of their uses. In this post I will show how priors can be implemented as potentials using two previously published examples. I hope this will show potentials, just as priors and likelihood, are just one more term in the posterior distribution that MCMC samples.
Here I reproduce the linear fit from Jake Vanderplas post but using just two points:
xdata = np.array([0, 1.]) ydata = np.array([0, 2.]) edata = 1.* np.ones_like(xdata)
We will also use the same visualization functions defined in his post:
import pymc # Create some convenience routines for plotting def compute_sigma_level(trace1, trace2, nbins=20): """From a set of traces, bin by number of standard deviations""" L, xbins, ybins = np.histogram2d(trace1, trace2, nbins) L[L == 0] = 1E-16 logL = np.log(L) shape = L.shape L = L.ravel() # obtain the indices to sort and unsort the flattened array i_sort = np.argsort(L)[::-1] i_unsort = np.argsort(i_sort) L_cumsum = L[i_sort].cumsum() L_cumsum /= L_cumsum[-1] xbins = 0.5 * (xbins[1:] + xbins[:-1]) ybins = 0.5 * (ybins[1:] + ybins[:-1]) return xbins, ybins, L_cumsum[i_unsort].reshape(shape) def plot_MCMC_trace(ax, xdata, ydata, trace, scatter=False, **kwargs): """Plot traces and contours""" xbins, ybins, sigma = compute_sigma_level(trace[0], trace[1]) ax.contour(xbins, ybins, sigma.T, levels=[0.683, 0.955], **kwargs) if scatter: ax.plot(trace[0], trace[1], ',k', alpha=0.1) ax.set_xlabel(r'$\alpha$') ax.set_ylabel(r'$\beta$') def plot_MCMC_model(ax, xdata, ydata, trace): """Plot the linear model and 2sigma contours""" ax.plot(xdata, ydata, 'ok') alpha, beta = trace[:2] xfit = np.linspace(-1, 2, 10) #This is the only change from Jake's post! yfit = alpha[:, None] + beta[:, None] * xfit mu = yfit.mean(0) sig = 2 * yfit.std(0) ax.plot(xfit, mu, '-k') ax.fill_between(xfit, mu - sig, mu + sig, color='lightgray') ax.set_xlabel('x') ax.set_ylabel('y') def plot_MCMC_results(xdata, ydata, trace, colors='k'): """Plot both the trace and the model together""" fig, ax = plt.subplots(1, 2, figsize=(10, 4)) plot_MCMC_trace(ax[0], xdata, ydata, trace, True, colors=colors) plot_MCMC_model(ax[1], xdata, ydata, trace)
First we will use uniform priors. With uniform priors one has to make sure their range is large enough to include all values of the parameters. Otherwise the sampled parameters cluster around the boundary of the uniform range. You can try this by changing the range for beta
to (-1, 1)
.
# Define the variables needed for the routine, with their prior distributions alpha = pymc.Uniform('alpha', -10, 10) beta = pymc.Uniform('beta', -10, 10) # Define the form of the model and likelihood @pymc.deterministic def y_model(x=xdata, alpha=alpha, beta=beta): return alpha + beta * x # (We define the error as 2 std) y = pymc.Normal('y', mu=y_model, tau=1. / (edata/2) ** 2, observed=True, value=ydata) # package the full model in a dictionary model1 = dict(alpha=alpha, beta=beta, y_model=y_model, y=y) # run the basic MCMC: S = pymc.MCMC(model1) S.sample(iter=100000, burn=50000) # extract the traces and plot the results pymc_trace_unifo = [S.trace('alpha')[:], S.trace('beta')[:]] plot_MCMC_results(xdata, ydata, pymc_trace_unifo) print() print("alpha mean= {:.2f}".format(pymc_trace_unifo[0].mean())) print("beta mean= {:.2f}".format(pymc_trace_unifo[1].mean()))
[—————–100%—————–] 100000 of 100000 complete in 16.8 sec
alpha mean= 0.00
beta mean= 1.99
Now with potentials. A uniform prior results in a constant log-p term, so we need to add the lop-p of the prior as a potential with the same effects (the constant term does not affect the MCMC sampling):
# Define the variables needed for the routine, with their prior distributions alpha = pymc.Uniform('alpha', -10, 10) beta = pymc.Uniform('beta', -10, 10) #Define a potential equivalent to a prior @pymc.potential def pot_prior(val=beta): return -1.5 * np.log(1 + val ** 2) # Define the form of the model and likelihood @pymc.deterministic def y_model(x=xdata, alpha=alpha, beta=beta): return alpha + beta * x y = pymc.Normal('y', mu=y_model, tau=1. / (edata/2) ** 2, observed=True, value=ydata) # package the full model in a dictionary model1 = dict(alpha=alpha, beta=beta, y_model=y_model, pot_prior=pot_prior, y=y) # run the basic MCMC: we'll do 100000 iterations to match emcee above S = pymc.MCMC(model1) S.sample(iter=100000, burn=50000) # extract the traces and plot the results pymc_trace_pot = [S.trace('alpha')[:], S.trace('beta')[:]] plot_MCMC_results(xdata, ydata, pymc_trace_pot) print() print("alpha mean= {:.2f}".format(pymc_trace_prior[0].mean())) print("beta mean= {:.2f}".format(pymc_trace_prior[1].mean()))
[—————–100%—————–] 100000 of 100000 complete in 18.6 sec
alpha mean= 0.31
beta mean= 1.38
I find it surprising that the average fit does not run through the data points. On the other hand this is not unexpected: our prior for beta
favours values close to 0. I guess an asymmetrical error model in the likelihood could oppose this trend, but this is really beyond the scope of my knowledge and I would really appreciate comments on that.
As a second example of potentials, we’ll analyze the post by Austin Rochford, which I also enjoyed reading. It discusses, among others, Bayesian Ridge regression. Ridge regression adds a quadratic potential to the least-squares objective function to minimize. This is done to achieve regularization. A Bayesian approach reveals that this is equivalent to using a normal prior for the slope beta
parameters.
I refer again to his post for explanations and more insight. In that post, instead of sampling the posterior he just finds the Maximum a posteriori (MAP) estimate. Here I just show that the same value is obtained when using uniform priors and an external potential is added ad hoc as is done in traditional Ridge regression.
from scipy.stats import norm n = 1000 x1 = norm.rvs(0, 1, size=n) x2 = -x1 + norm.rvs(0, 10**-3, size=n) x3 = norm.rvs(0, 1, size=n) X = np.column_stack([x1, x2, x3]) y = 10 * x1 + 10 * x2 + 0.1 * x3 def get_coefficients(map_): return [{str(variable): variable.value} for variable in map_.variables if str(variable).startswith('beta')] beta1_pot = pymc.Uniform('beta1', -100.,100.) beta2_pot = pymc.Uniform('beta2', -100.,100.) beta3_pot = pymc.Uniform('beta3', -100.,100.) @pymc.potential def pot(beta1=beta1_pot, beta2=beta2_pot, beta3=beta3_pot): tau = 1. return -tau/2.*(beta1**2+beta2**2+beta3**2) @pymc.deterministic def y_hat_pot(beta1=beta1_pot, beta2=beta2_pot, beta3=beta3_pot, x1=x1, x2=x2, x3=x3): return beta1 * x1 + beta2 * x2 + beta3 * x3 Y_pot = pymc.Normal('Y', mu=y_hat_pot, tau=1.0, value=y, observed=True) pot_model = pymc.Model([Y_pot, beta1_pot, beta2_pot, beta3_pot, pot]) pot_map = pymc.MAP(pot_model) pot_map.fit() print("With potentials:\n",get_coefficients(pot_map)) # # With priors. Based on # http://www.austinrochford.com/posts/2013-09-02-prior-distributions-for-bayesian-regression-using-pymc.html # beta1_ridge = pymc.Normal('beta1', mu=0, tau=1.0) beta2_ridge = pymc.Normal('beta2', mu=0, tau=1.0) beta3_ridge = pymc.Normal('beta3', mu=0, tau=1.0) @pymc.deterministic def y_hat_ridge(beta1=beta1_ridge, beta2=beta2_ridge, beta3=beta3_ridge, x1=x1, x2=x2, x3=x3): return beta1 * x1 + beta2 * x2 + beta3 * x3 Y_ridge = pymc.Normal('Y', mu=y_hat_ridge, tau=1.0, value=y, observed=True) ridge_model = pymc.Model([Y_ridge, beta1_ridge, beta2_ridge, beta3_ridge]) ridge_map = pymc.MAP(ridge_model) ridge_map.fit() print("With priors:\n",get_coefficients(ridge_map))
With potentials:
[{‘beta1’: array(0.004691201218377386)}, {‘beta3’: array(0.09966093745608606)}, {‘beta2’: array(0.004958704169379989)}]
With priors:
[{‘beta3’: array(0.09966049909698146)}, {‘beta1’: array(0.004779905863422812)}, {‘beta2’: array(0.005045738743703612)}]
Let us assume that for some large project we need an easy way to
translate atomic symbols into element names. The first thing that comes
to your mind is to use a dictionary. As this seems a bijective relation,
keys can be symbols and values names, or vice-versa. This is an
important decision as it fixes the structure of the dictionary. I
believe keys should be symbols. Let me give you a couple of reasons.
First, there is one element W, that has 2 names, Tungsten and
Wolfram. Second, you might image extending the dictionary to include
names in other languages. Therefore:
elementsDict=dict(H='hydrogen', He='helium', Li='lithium' ) # And so on...
Now your project has grown and you would like to also keep the atomic
number of an element. Two simple solutions are creating a second
dictionary (duplicating work is not nice and error prone) or keeping a
tuple (or a list) for the values, such as:
elementsDict=dict(H=('hydrogen', 1), He=('helium', 2), Li=('lithium', 3) ) # And so on...
This makes accessing to the attributes of an element ugly, as you have
to remember the arbitrary order of the attribute in the tuple. Did I say
attributes? That sounds object oriented! In fact, it is easy to
visualize an element as something (an object!) with different
attributes such as its name, its atomic symbol, its atomic weight, etc.
These attributes are of different type and do not bear an ordered
relationship among themselves as a list assumes. So let’s make an
object:
class Element: def __init__(self, name, atomicNumber): self.name = name self.atomicNumber = atomicNumber elementsDict=dict(H=Element('hydrogen', 1), He=Element('helium', 2), Li=Element('lithium', 3) )
Let’s try it:
for symbol in elementsDict: print("The atomic number of {} is {}".format(elementsDict[symbol].name, elementsDict[symbol].atomicNumber))
The atomic number of hydrogen is 1
The atomic number of lithium is 3
The atomic number of helium is 2
At this point, you might think it would be useful for the element object
to also contain its symbol. That’s easy, because you can add attributes
to an object dynamically:
for symbol in elementsDict: elementsDict[symbol].symbol = symbol print("{} name is {}".format(elementsDict['H'].symbol, elementsDict['H'].name))
H name is hydrogen
For the sake of completeness, imagine you have created a list of
elements with a class that contains the atomic symbol, and you then want
to create a dictionary from that list. I find this more clear as the
__init__
method then contains all the attributes. Here is how you
can do it:
class Element2: def __init__(self, symbol, name, atomicNumber): self.symbol = symbol self.name = name self.atomicNumber = atomicNumber elements2List =[Element2('H', 'hydrogen', 1), Element2('He', 'helium', 2), Element2('Li', 'lithium', 3) ] # Then create a dictionary with a comprehension: elements2Dict = {element.symbol: element for element in elements2List}
Obviously it works as before:
for symbol in elements2Dict: print("The atomic number of {} is {}".format(elements2Dict[symbol].name, elements2Dict[symbol].atomicNumber))
The atomic number of hydrogen is 1
The atomic number of lithium is 3
The atomic number of helium is 2
Imagine you have run some calculations with a certain code. Then you have a short script that copies output files and/or extracts important data into a text file, which you then import into a spreadsheet. You work with the data, and produce some plots which you copy&paste into a word-processor file where you have explained your aims, your set-up and the data manipulation.
Are you familiar with this? Don’t you think that the lab notebooks your colleagues have are more efficient? and more trustful? Aren’t you tired of repeating this cycle again and again every time you re-run some of the calculations?
If you answered ‘yes’ to some if these questions, the IPython notebook is for you. Even if you have never used python!
The Ipython notebook is the lab notebook of the computational scientists. You can insert complex mathematical formulas as with .\LaTeX. The recent extension to use a spell-check render the notebook even more useful. You can also insert videos and images from the web. For example to compare your results with that of a published paper a couple of lines suffice:
from IPython.display import Image Image(url='http://www.ploscompbiol.org/article/fetchObject.action?uri=info:doi/10.1371/journal.pcbi.1000497\ .g001&amp;representation=PNG_L')
Of course, you can also have python code in the notebook, But also bash, Ruby or Fortran code!
%%fortran real function sincos(x) implicit none real*8 :: x sincos = sin(x)*cos(x) end function
And then use the Fortran function directly in your python cells (you’ll need to install the
%fortranmagic extension)
sincos(1.5) - math.sin(1.5)*math.cos(1.5) < 1e-7 True
You can share the code as static notebooks with the http://nbviewer.ipython.org or convert them into LaTeX, PDFs or HTML.
I’ve only sketched a few capabilities of the notebook. Take a look at these examples to decide where you want to focus. And after all the praise, some criticism:
For some reason the image linked in my notebooks disappeared from the exported PDF version.
I finish this post with a confession. I should have used IPython to blog directly. This is possible but I am new to wordpress.com and I still have to check how to do it. Shame on me!
The most essential one is Numpy, which gives python the ability to work efficiently with arrays. If you install Numpy, it is worth also installing Scipy. Scipy includes a wealth of algorithms that you probably need but you don’t want to code. From Fourier transforms and splines to minimizations and numerical integrations. There are excellent tutorials for both Numpy and Scipy and a good place to start is at http://www.scipy.org/.
Remember that although you can translate a Fortran code almost line by line into Python, the resulting code will not be optimal, neither for clarity nor efficiency. Learn to be Pythonic:
http://www.cafepy.com/article/be_pythonic/
http://blog.startifact.com/posts/older/what-is-pythonic.html
Use dictionaries, use sets, use list comprehension, and even consider using classes! Remember that almost everything is iterable in Python.
This is my last post for the series Python for Fortran programmers, but I will continue writing about Python tools that I find useful for my research. I hope they will also help other computational chemists and biophysicists.
Type the following function:
def function(data=[]): data.append(1) return data
and call the function several times, without arguments (function()
). What is going on?
As usual, you can better visualize this process if you execute the code in: http://www.pythontutor.com
But it’s easy to understand, and if you do, you won’t be confused again. The first thing to remember is that lists are mutable. The second one is that the def
statement is an executable statement, which is executed when the function is defined in that line. As an example, will Hello from f1!
be printed if you execute only these lines of code?
def f1(): print("Hello from f1!") return 1 def f2(x=f1()): pass
See? You are defining functions but you are actually producing output!
So once the function is called and the object data is created, the same object is called all the time, pointing to the same data mutable object.
If you have understood everything, you should be able to predict the output of these lines:
def function(data=[]): data.append(1) return data print(function()) print(function()) print(function()) f1 = function def function(data=[]): data.append(1) print("Hello!") return data print(function()) print(function()) print(function()) print("f1 is function:", f1 is function)
You can expand this fascinating subject here: http://effbot.org/zone/default-values.htm from where I stole the example and some ideas, or here:
http://stackoverflow.com/questions/1132941/least-astonishment-in-python-the-mutable-default-argument
a = 5 #immutable a = [1, 2, 3 ] #mutable
True, mutable objects have some methods such as a.pop()
that immutable objects don’t have. But again, you can think of workarounds:
b = (1, 2, 3) #an immutable tuple b = b[:-1] #all elements except the last one.
However it is important to realize that b
is now a new object (it has a different memory address) having the same name as the previous one, whereas a
remains the same object (same memory address) after the pop
method is called. You can check that by calling id(a)
and id(b)
before and after the assignments. When we modify a mutable object, the object remains the same, but when we “modify” b
, we create a new object:
a = [1, 2, 3] >>> id(a) 139878335741320 >>> a.pop() 3 >>> a [1, 2] >>> id(a) #the same as before 139878335741320 >>> b = (1, 2, 3) >>> id(b) 139878160519888 >>> b = b[:-1] >>> b (1, 2) >>> id(b) # a NEW object with the same name 139878160530536
Now we can deal with the surprising behaviour of default arguments.
At the beginning you can forget about how objects work, just remember that anything like a Fortran variable (integers, arrays, etc) is an object.
Objects are class instances. Somehow the class is like the definition of the object, whereas the objects are the actual variables. In REAL :: x,y,z
, REAL
would be the class and x,y,z
would be the objects.
Objects have methods and attributes. Again, for simplicity, methods are like Fortran functions, but associated with a certain object. And attributes are variables associated with that object. In Fortran, the closest thing to an attribute would be the fields in a TYPE
variable.
As attributes and methods are associated with an object, its syntax is special: object.attribute
and object.method()
.
In fact, some modules such as numpy (we’ll see more about numpy in the future) have methods implemented also as “normal” module functions. For example, if x is a numpy array we can call its method to get the mean:
x.mean()
or call the module function:
numpy.mean(x)
It is not easy to see the utility of objects, because many of the tutorial examples are too trivial. But here is one more example. There are many diagonalization subroutines in lapack. One for real matrices, one for complex, another one for symmetric, etc. They all need different names, and you have to check which one to apply to your particular matrix.
In Python, you could have all these different algorithms as methods in different classes, so that they could have the same name without collisions. Then:
x.diagonalize()
would call the right method depending on what the class of x
.
subroutine cos2(x, res)
could return the “function” value as an argument. And function diag(m, eigvec, eigval)
could have the same arguments as the analogous subroutine and have a return value that just informs if the diagonalization succeeded.DSYEV
has these arguments:
SUBROUTINE dsyev( JOBZ, UPLO, N, A, LDA, W, WORK, LWORK, INFO )
A
contains the matrix to be diagonalized on input and the eigenvectors on output. The eigenvalues are returned in W
. INFO
returns 0 if successful. How is does the interface look like in Python?
Scipy contains wrappers to many Lapack subroutines. Although with Python we should call the higher-level routine scipy.linalg.eigh we can also call directly the Lapack routine:
>>> import numpy as np >>> import scipy.linalg as la >>> a = np.array(([[1,2,3], [2,4,5], [3,5,6]])) >>> la.lapack.dsyev(a) (array([ -0.51572947, 0.17091519, 11.34481428]), array([[ 0.73697623, 0.59100905, 0.32798528], [ 0.32798528, -0.73697623, 0.59100905], [-0.59100905, 0.32798528, 0.73697623]]), 0)
So it returns the eigenvalues, the eigenvectors and INFO. These arrays are created in the subroutine and can be assigned to variables which need not be defined. For example:
>>> vals, vecs, out = la.lapack.dsyev(a)
If the interface of dsyev
was something like:
la.lapack.dsyev(a, w)
w
should be defined before the call (with arbitrary values), something very unnatural in Python. I will discuss in the next post why I didn’t include info
as an argument…
The second example is a function that concatenates two lists, only if the last element of the first list is the same as the first element of the second list. Otherwise it returns nothing (None
):
def concat(l1, l2): if l1[-1]==l2[0]: return l1+l2 else: return None
Because Python is dynamically typed, it is ok to return different types of variables in the same function depending on the case. Assigning a result to a variable will not produce an error:
l3 = concat([1,2,3], [4,5])
The code could continue by checking what is l3:
if l3 is None: do something else: do something_else
So do not be afraid of returning complex lists from a function and avoid changing the arguments values as much as possible, as this is a more clear coding style.