This series of posts were not a Python tutorial, just some tips for those Fortran programmers who are learning python. Once you know the basics of Python, you can focus on extensions useful to scientists.
The most essential one is Numpy, which gives python the ability to work efficiently with arrays. If you install Numpy, it is worth also installing Scipy. Scipy includes a wealth of algorithms that you probably need but you don’t want to code. From Fourier transforms and splines to minimizations and numerical integrations. There are excellent tutorials for both Numpy and Scipy and a good place to start is at http://www.scipy.org/.
Remember that although you can translate a Fortran code almost line by line into Python, the resulting code will not be optimal, neither for clarity nor efficiency. Learn to be Pythonic:
Use dictionaries, use sets, use list comprehension, and even consider using classes! Remember that almost everything is iterable in Python.
This is my last post for the series Python for Fortran programmers, but I will continue writing about Python tools that I find useful for my research. I hope they will also help other computational chemists and biophysicists.
Python only has functions, whereas Fortran has two types of procedures: functions and subroutines. Functions do something with the arguments and return a value. The analogy with mathematical functions is obvious: f(x)=cos2(x) is asking to be coded as a function. Subroutines are supposed to perform some tasks and/or return several arguments than cannot be returned with a function. A typical example could be a diagonalization subroutine, which returns the eigenvalues and the eigenvectors.
Fortran programmers find no problem in understanding Python functions, but how can we make a function that returns eigenvalues and eigenvectors?
Python is not the only language having only functions, in fact its pretty common, the exception being Fortran… But in Fortran we can code everything as a subroutine or everything as a function.
subroutine cos2(x, res) could return the “function” value as an argument. And
function diag(m, eigvec, eigval) could have the same arguments as the analogous subroutine and have a return value that just informs if the diagonalization succeeded.
This diagonalization function could be easily translated into a python function. Done? Not yet…
As Fortran programmers we tend to overuse the arguments to output results, and in Python the return value is much more flexible than in Fortran, because what the function returns is not fixed in size or type. So in Python it would make more sense to return the eigenvalues and the eigenvectors in the return statement and leave the arguments only as inputs to the function.
Let’s see two examples:
The Lapack diagonalization subroutine
DSYEV has these arguments:
SUBROUTINE dsyev( JOBZ, UPLO, N, A, LDA, W, WORK, LWORK, INFO )
A contains the matrix to be diagonalized on input and the eigenvectors on output. The eigenvalues are returned in
INFO returns 0 if successful. How is does the interface look like in Python?
Scipy contains wrappers to many Lapack subroutines. Although with Python we should call the higher-level routine scipy.linalg.eigh we can also call directly the Lapack routine:
>>> import numpy as np >>> import scipy.linalg as la >>> a = np.array(([[1,2,3], [2,4,5], [3,5,6]])) >>> la.lapack.dsyev(a) (array([ -0.51572947, 0.17091519, 11.34481428]), array([[ 0.73697623, 0.59100905, 0.32798528], [ 0.32798528, -0.73697623, 0.59100905], [-0.59100905, 0.32798528, 0.73697623]]), 0)
So it returns the eigenvalues, the eigenvectors and INFO. These arrays are created in the subroutine and can be assigned to variables which need not be defined. For example:
>>> vals, vecs, out = la.lapack.dsyev(a)
If the interface of
dsyev was something like:
w should be defined before the call (with arbitrary values), something very unnatural in Python. I will discuss in the next post why I didn’t include
info as an argument…
The second example is a function that concatenates two lists, only if the last element of the first list is the same as the first element of the second list. Otherwise it returns nothing (
def concat(l1, l2): if l1[-1]==l2: return l1+l2 else: return None
Because Python is dynamically typed, it is ok to return different types of variables in the same function depending on the case. Assigning a result to a variable will not produce an error:
l3 = concat([1,2,3], [4,5])
The code could continue by checking what is l3:
if l3 is None: do something else: do something_else
So do not be afraid of returning complex lists from a function and avoid changing the arguments values as much as possible, as this is a more clear coding style.