The following codes and software have been developed by members of the Molecular Geochemistry Laboratory


Molecular dynamics

The atom scripts — a collection of 50+ matlab scripts and a framework for manipulating atomistic simulation cells and generate Gromacs and Lammps input files for Clayff and INTERFACE topologies. Download them from the Matlab Filexchange or GitHub.



Mass spectrometry

pyIsocalc — an isotope calculator in python 2.7x for mass spectrometry written in python. Very accurate — wrote it out of frustration with the inability of existing calculators to generate realistic  isotope envelopes for transition metal systems. Can change resolution. See also

Usage: ./pyisocalc -f ‘Fe(ClO3)5’ -p y -g 0.25 -o ironperchlorate.dat -c -2 -r 250

Computational chemistry


Andy Ohlin has together with Matt Asplund decided to fork the Extensible Computational Chemistry Environment (ECCE) suite. See here


nwchem 6.6 patch set — contains patches that improves interopability of NWChem with GabEdit (see e.g. this), and a few other things. Should work with future nwchem versions with minimal changes


monolith — Python code that does all of the below, but only needs to inputs: the fchk file and the temperature. Usage: ./monolith my.fck 298.15 — Python 2.7x. Generates a list of default atomic masses from an xyz file. You can edit the output by hand to replace the standard mass with an isotope. Usage: ./ — Python 2.7x. extracts a hessian from a gaussian formchk (fck) file from a vibration calculation. Usage: ./ example.fchk test.hess –Python 2.7x. uses a mass file (from xyz2mass) and a hessian file (from fck2hess) to compute normal modes/frequencies. Usage: ./ test.hess test.mass –Python 2.7x. computes the reduced partition function ratio from a file with two columns, each column containing the normal modes/frequencies of one isotope of a molecule. Usage: ./ tests.freq 298.15 –Python 2.7x. same as freq2rpfr, but can generate the rpfrs for a series of temperatures. Usage: ./ tests.freq 273.15 303.15 10

example input –Python 2.7x. gaussian 09 example.chk and example.out (opt + vib) are found in this file.

Full example usage of the above codes  on Linux (also invokes below, as well as the linux programs gawk and paste):

formchk example.chk 
./ example.out 
./ > test.mass
./ example.fchk test.hess
./ test.hess test.mass |gawk '{print $1}' > test.freq

cp test.mass test2.mass

I changed 1.2D01 in test2.mass to 13.003355 manually

./ test.hess test2.mass |gawk '{print $1}' > test2.freq
paste test.freq test2.freq > tests.freq
./ tests.freq 298.15
./ tests.freq 273.15 303.15 10

Old octave codes: nwhessian (frequencies=nwhess(“input.hess”, “input.mass”)), rpfr.m and rpfrvec.m (rpfr(‘infile.hess’,298.15) and rpfrvec(‘infile.hess’,273.15,313.15,10))

Normal modes

ramanlit — this is a fun one. It allows you to work out to which vibration (frequency) a specific atom contributes mostly to by varying the mass of that atom.
Usage: ramanlit gaussian.fchk 298 1 200
It’s a modified version of the fractionation script.  The third and fourth arguments are the atom number (according to the xyz) and the mass (in amu). By looking at the differences in computed frequencies, you can see which vibration an atom contributes to.

parsing –Python 2.7x. Extract all structures (and energies) from a gaussian output file/log file. NOTE: they don’t have to be optimised (I wrote it to extract structures from a QST/IRC search). Usage: ./ example.out –Python 2.7x. Extract all optimised structures (and energies) from a gaussian output file/log file. I wrote it to extract structures from a gaussian PES scan, but it’s great for simply extracting the final, optimised structure from a geometry optimisation in gaussian. Usage: ./ example.out — Python 2.7x. Same as qst_parse_g09, but for nwchem. –Python 2.7x.Same as pes_parse_g09, but for nwchem.

g09freq —

g09raman —

efgparse_g09 –Python 2.7x. program that parses gaussian EFG calculation output and processes it. Includes example input. Usage: ./efparse_g09 g09.g09out water

Misc. actions on xyz files

autorotate –Python 2.7x. An interesting program that can take two very similar molecules that have been rotated relative to one-another, and try to rotate them back into the same coordinate system. This can be useful before moving on to nebinterpolate (e.g. if you used autoz when optimised the reactant and product systems). contains example input. Instructions are found in the code.

nebinterpolate — This python script takes a multi-xyz file as input and interpolates the atom positions. If you had two structures in the file, you’ll get three (the average of the two). If you had three, you’ll get five. If you had five, you’ll get nine. It’s sometimes useful for generating input for nudge-elastic band computations in e.g. nwchem.
Usage: -i -o

pair_xyz –Python 2.7x. Tracks the distance between two atoms (e.g. the first and the second atom) in a multi-xyz file. Useful for parsing MD output or IRCs. Usage: ./pair_xyz 1 2

polish_xyz –Python 2.7x. Takes an ugly-looking unreadable xyz file and converts it to a standardised xyz file (can’t remember which standard)  that can be read by most software

flip_xyz –Python 2.7x. Generates the mirror image of a molecule by changing the sign of the z-coordinates of the atoms.

genstruc — Python 2.7x. Takes an xyz file and can generate a list over all bonds and/or angles and/or dihedrals in it. Need to provide a maximum bond distance in nm to consider (if it”s too long then everything is bonded to everything). I used it to look at the environment/bond distances of specific metals in clusters as a function of computational method. Can also be used to generate output for bar charts etc.

To get bonds only:

Usage: ./genstruc 2.3 1

To get bonds and angles only:

Usage: ./genstruc 2.3 2

To get bonds, angles and dihedrals

Usage: ./genstruc 2.3 3


nmrsim — The following two Octave scripts, homog.m and hmegen.m, simulate a very simple model spectrum of an arbritary number of uncoupled spins wich are in chemical equilibrium. It loosely follows P. Allard et al. (J.Mag. Res., 1997, 129, 19-29; J. Biomol. NMR, 2000, 18, 49-63). You only need to edit homog.m to change parameters. Leave hmegen.m alone.

kinsim — two simple Octave scripts, twosite.m and getpulse.m,  for simulating the NMR spectrum of a two-site exchange system using classical Bloch equations. . Put the files in a directory, open octave there, and run ‘twosite’. The real value of the scripts are that they are very easy to edit if you want e.g. multiple pulses, more species in equilibrium, or different rate laws.


pOCPCA — Collection of Octave scripts for doing some basic chemometric analysis of data. In particular, I’ve used it for kinetic UV/VIS data, pH-dependent UV/VIS data and cone voltage-dependent mass spec data. It’s quite old now and I haven’t used it for a while, but it performs as well as the SPECFIT program that caused me to write it (SPECFIT used an opaque algorithm, which is always frustrating to a scientist who wants to UNDERSTAND the data). pOCPA does both the Orthogonal Projection Approach and Evolving Factor Analysis and can handle vectors generated from pKa-based models, in addition to simple linear ones. To be absolutely fair, the fitting routines in pOCPCA could do with an overhaul as they are not altogether reliable in finding the minimum every time.

Here’s an old description of the collection of scripts, which is still accurate:
“pOCPCA is a collection of octave scripts controlled by newmain.m which can do: evolving factor analysis, orthogonal projection approach, pKa extraction and rate law fitting of large data sets, such as UV/VIs titration data, ESI-MS cone voltage data and so on. The best way to learn how to use the scripts is by looking at newmain.m. The biggest challenge is really constructing the data input: the input file should have the vector of change (e.g. time or pH) in the first column, and the static vector (e.g. wavelength) in the first row. The first row, first column position can be anything (e.g. 0) as it’s just used for symmetry. The rest of the fields contain the data.”

MOCPCA — matlab-compatible version of pOCPCA. As I had to ‘port’ minimize() from Octave I wouldn’t put as much trust in this version as in the Octave one since ‘port’ means ‘brutally fiddling with the code until it magically worked on matlab’.

Data processing

rotate — shell script which depends on gawk. Rotates an m times n tab-separated matrix

Usage: rotate filename

findpeaks –Python script which parses an xyyy.. tab-separated file and returns signal (y) with intensities larger than the supplied cutoff value.

Usage: findpeaks filename column cutoff

where the filename is the name of the file, the column is the column which to parse (numbering starts at 1, not 0), and the cutoff value is a floating point number. — Very simple shell script (bash) using gawk which harvests the first column of a series of xy tab-separated files ({0..20..10}.dat and {220..300..20}.dat) to a single file.

Usage: sh > output.dat — Python script. It is fairly specific in its purpose as it

  • opens a tab-separated file generated by
  • collects all unique numbers in the file
  • sorts them in descending order. These represent all possible x values for a system
  • opens a series of files with names in the {0..200..10}.dat and {220..300..20}.dat ranges. These are xy tab-separated files.
  • collects all the data corresponding to the unique x values
  • Returns a matrix with the X and Y values

This script is used to generate a matrix from a collection of mass spectra.

Usage: filename — sh