Theory, experiment and computation

If you put these three words in Google Images, you can see images such as this one:

Such images imply that science stands on three feet and that each of them is connected to the other two. As a scientist working with computer models, I like this idea. I am not the only one: 4 out of the first 5 images I get come from sites about computation! [1]

However, I dare say that our beloved structure of science is wrong: experiment and computation cannot talk to each other directly. I believe the following scheme would be more appropriate.

[Figure: experiment_theory_computation]

Experiments produce measurements of several quantities: X-ray diffraction patterns, SAXS curves, NMR relaxation times, IR spectra, etc. None of these can be directly compared to a computation. In the molecular sciences, computations mainly produce structures and their energies. Producing those already involves a lot of theory (equations of motion, interactions between particles) but it is pretty well established. Reproducing the experimental result from the simulated coordinates is usually much more complex.
That means that we need a substantial amount of theory to generate experiment-like data from a computation. And we also need to be aware of the modelling that is applied to any experimental data after it is measured.
Take the structures in a PDB file: they are modelled from a diffraction pattern. This pattern is the experimental data; the PDB structures are not. Even if the diffraction works fine, some assumptions in the model could be responsible for a disagreement between structures from a simulation and structures computed from the diffraction. Bojan Zagrovic has some nice work on that issue.
On the other hand, if we fail to reproduce the SAXS curve of a protein, it can be for several reasons. In most cases we assume it is because our calculated structure (or structural ensemble) does not reproduce the ensemble in the experiment. But we could have the exact structures and still fail to reproduce the experimental curve because we are computing that curve incorrectly. Try different SAXS predictors and you will get different curves for the same structure. Do not blame everything on the limitations of current force fields or on insufficient sampling!
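To make that last point concrete, here is a minimal sketch, not a real SAXS predictor. It assumes the textbook Debye equation, I(q) = sum over i,j of f_i(q) f_j(q) sin(q r_ij)/(q r_ij), applied to a toy set of random coordinates, with every particle sharing the same form factor. The function names (debye_curve, point_scatterer, gaussian_bead), the bead radius and the coordinates are all hypothetical choices made for illustration. The only thing that changes between the two curves is the modelling assumption about the form factor; the structure is identical.

```python
import numpy as np

def debye_curve(coords, q, form_factor):
    """Scattering curve from the Debye equation, same form factor for all particles."""
    coords = np.asarray(coords, dtype=float)
    q = np.asarray(q, dtype=float)
    # Pairwise distance matrix r_ij, shape (N, N)
    diff = coords[:, None, :] - coords[None, :, :]
    r = np.linalg.norm(diff, axis=-1)
    # np.sinc(x) = sin(pi x) / (pi x), so sinc(q r / pi) = sin(q r) / (q r),
    # and it returns 1 where q r = 0 (self terms, or q = 0)
    qr = q[:, None, None] * r[None, :, :]
    kernel = np.sinc(qr / np.pi)
    f = form_factor(q)
    return f**2 * kernel.sum(axis=(1, 2))

# Two hypothetical "predictors" that differ only in the form factor they assume
def point_scatterer(q):
    return np.ones_like(q)

def gaussian_bead(q, radius=2.0):
    # Assumed Gaussian bead; the radius (in Angstrom) is an arbitrary choice
    return np.exp(-0.5 * (q * radius) ** 2)

# Toy "structure": 30 random points, coordinates assumed to be in Angstrom
rng = np.random.default_rng(0)
coords = rng.normal(scale=10.0, size=(30, 3))
q = np.linspace(0.0, 0.5, 51)  # assumed to be in inverse Angstrom

curve_point = debye_curve(coords, q, point_scatterer)
curve_bead = debye_curve(coords, q, gaussian_bead)

# Same structure, two different curves
print("largest absolute difference:", np.max(np.abs(curve_point - curve_bead)))
```

Both curves agree at q = 0 and drift apart as q grows, although nothing about the structure has changed. Real predictors differ in much subtler ways (hydration layer, excluded solvent, atomic form factors), but the logic is the same.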
In many cases, the computation needed to go from a structural ensemble to experimental data is a no-man’s land. Computational scientists do not feel confident with the models behind every experimental technique. They (we) put a lot of emphasis on generating the right structures and use the codes that generate the experimental result as black boxes.
Experimentalists who use these codes try to avoid long discussions of why one code works better than another. After all, that is not the aim of their research and may distract the reader from the main point of their results.
Added to that is the difficulty of estimating the errors and the precision of these predictors, as we can never know exactly both the structure in solution and its measured experimental data. So it is very difficult to build benchmarks.
So the next time you see a disagreement between calculated and experimental results, remember to question all the steps in the computation, including those used by experimentalists!





About Ramon Crehuet

I'm a Computational Biochemist working at the IQAC institute of the Spanish National Research Council (CSIC)
