Codes

The following codes and software have been developed by members of the Molecular Geochemistry Laboratory.

Holmboe

Molecular dynamics

The atom scripts — a collection of 50+ matlab scripts and a framework for manipulating atomistic simulation cells.

Ohlin

Mass spectrometry

pyIsocalc — an isotope calculator for mass spectrometry, written in Python 2.7x. It is very accurate; I wrote it out of frustration with the inability of existing calculators to generate realistic isotope envelopes for transition metal systems. The resolution can be changed. See also https://sourceforge.net/projects/pyisocalc/.

Usage: ./pyisocalc -f 'Fe(ClO3)5' -p y -g 0.25 -o ironperchlorate.dat -c -2 -r 250
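As a rough illustration of what an isotope calculator does (this is not the pyIsocalc source), the envelope of a formula can be built by repeatedly convolving the isotope distributions of its atoms. The function names, the rounding to five decimals and the small isotope table are assumptions made for this sketch; charge state and peak broadening are ignored.

```python
from collections import defaultdict

# Approximate isotope masses (u) and natural abundances.
# Illustrative values for two elements only; a real table is much larger.
ISOTOPES = {
    "Cl": [(34.96885, 0.7576), (36.96590, 0.2424)],
    "Br": [(78.91834, 0.5069), (80.91629, 0.4931)],
}


def convolve(dist_a, dist_b, digits=5):
    """Combine two isotope distributions given as {mass: probability}."""
    out = defaultdict(float)
    for mass_a, p_a in dist_a.items():
        for mass_b, p_b in dist_b.items():
            # Rounding merges peaks that differ only by floating-point noise.
            out[round(mass_a + mass_b, digits)] += p_a * p_b
    return dict(out)


def envelope(composition):
    """Isotope envelope for a composition such as {"Cl": 2}, sorted by mass."""
    dist = {0.0: 1.0}
    for element, count in composition.items():
        single = dict(ISOTOPES[element])
        for _ in range(count):
            dist = convolve(dist, single)
    return dict(sorted(dist.items()))


if __name__ == "__main__":
    for mass, prob in envelope({"Cl": 2}).items():
        print(f"{mass:10.5f}  {prob:.4f}")
```

For Cl2 this gives three peaks, two mass units apart, whose probabilities sum to one.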

Computational chemistry


Suites

Andy Ohlin, together with Matt Asplund, has decided to fork the Extensible Computational Chemistry Environment (ECCE) suite. See here

Patches

nwchem 6.6 patch set — contains patches that improve the interoperability of NWChem with GabEdit (see e.g. this), and a few other things. Should work with future nwchem versions with minimal changes.

Fractionation

monolith — Python code that does all of the below, but only needs two inputs: the fchk file and the temperature. Usage: ./monolith my.fck 298.15

xyz2mass.py — Python 2.7x. Generates a list of default atomic masses from an xyz file. You can edit the output by hand to replace the standard mass with an isotope. Usage: ./xyz2mass.py structure.xyz 

fck2hess.py — Python 2.7x. Extracts the Hessian from a Gaussian formchk (fck) file from a vibrational calculation. Usage: ./fck2hess.py example.fchk test.hess
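A minimal sketch of the extraction step, not the fck2hess.py source. It assumes the formatted checkpoint file stores the Hessian as a packed lower triangle under a "Cartesian Force Constants" header ending in N=<count>, which is how I understand the format; the function name read_hessian is made up for the example.

```python
import math


def read_hessian(fchk_text):
    """Return the full symmetric Hessian (list of lists) from fchk text."""
    lines = fchk_text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("Cartesian Force Constants"):
            n_values = int(line.split("N=")[1])
            values = []
            j = i + 1
            # Values follow the header, several per line.
            while len(values) < n_values:
                values.extend(float(x) for x in lines[j].split())
                j += 1
            break
    else:
        raise ValueError("no 'Cartesian Force Constants' section found")

    # A packed lower triangle of an n x n matrix holds n(n+1)/2 numbers.
    n = (math.isqrt(8 * n_values + 1) - 1) // 2
    hessian = [[0.0] * n for _ in range(n)]
    k = 0
    for row in range(n):
        for col in range(row + 1):
            hessian[row][col] = hessian[col][row] = values[k]
            k += 1
    return hessian
```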

hess2freq.py – Python 2.7x. Uses a mass file (from xyz2mass) and a Hessian file (from fck2hess) to compute normal modes/frequencies. Usage: ./hess2freq.py test.hess test.mass
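The underlying calculation can be sketched as follows: mass-weight the Hessian, diagonalise it, and convert the eigenvalues to wavenumbers. This is an illustration using NumPy, not the hess2freq.py source. It assumes a Hessian in Hartree/Bohr^2 and masses in atomic mass units, and it does not project out translations and rotations.

```python
import numpy as np

HARTREE = 4.3597447222e-18   # J
BOHR = 5.29177210903e-11     # m
AMU = 1.66053906660e-27      # kg
C_CM = 2.99792458e10         # speed of light, cm/s


def frequencies(hessian, masses):
    """Harmonic wavenumbers (cm^-1), ascending, from a (3N, 3N) Hessian.

    Negative eigenvalues are returned as negative wavenumbers,
    the usual way of flagging imaginary modes.
    """
    hessian = np.asarray(hessian, dtype=float)
    m = np.repeat(np.asarray(masses, dtype=float), 3)  # one mass per x, y, z
    mass_weighted = hessian / np.sqrt(np.outer(m, m))
    eigenvalues = np.linalg.eigvalsh(mass_weighted)
    omega = np.sign(eigenvalues) * np.sqrt(
        np.abs(eigenvalues) * HARTREE / (BOHR**2 * AMU)
    )  # angular frequency, rad/s
    return omega / (2.0 * np.pi * C_CM)
```

Replacing one entry in the mass list with an isotope mass and rerunning gives the frequencies of the substituted molecule.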

freq2rpfr.py – Python 2.7x. Computes the reduced partition function ratio from a file with two columns, each column containing the normal modes/frequencies of one isotope of a molecule. Usage: ./freq2rpfr.py tests.freq 298.15
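For reference, here is a sketch of the reduced partition function ratio in the harmonic (Bigeleisen-Mayer) form, with u = hc*nu/(kT). It is not the freq2rpfr.py source. It assumes frequencies in cm^-1, the light isotope in the first list and the heavy one in the second, and a cutoff (an assumption of this sketch) below which modes are treated as translations/rotations and skipped.

```python
import math

HC_OVER_K = 1.438776877  # second radiation constant, cm K


def rpfr(light, heavy, temperature, cutoff=1.0):
    """Reduced partition function ratio (heavy over light) at temperature in K."""
    ratio = 1.0
    for nu, nu_heavy in zip(light, heavy):
        if nu < cutoff or nu_heavy < cutoff:
            continue  # skip (near-)zero modes, which would give 0/0
        u = HC_OVER_K * nu / temperature
        u_h = HC_OVER_K * nu_heavy / temperature
        ratio *= (
            (u_h / u)
            * math.exp((u - u_h) / 2.0)
            * (1.0 - math.exp(-u))
            / (1.0 - math.exp(-u_h))
        )
    return ratio


if __name__ == "__main__":
    # Temperature series from 273.15 K to 303.15 K in steps of 10 K.
    for step in range(4):
        t = 273.15 + 10.0 * step
        print(t, rpfr([2000.0], [1950.0], t))
```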

freq2rpfrvec.py – Python 2.7x. Same as freq2rpfr, but can generate the rpfrs for a series of temperatures. Usage: ./freq2rpfrvec.py tests.freq 273.15 303.15 10

example input – Python 2.7x. Gaussian 09 example.chk and example.out (opt + vib) are found in this file.

Full example usage of the above codes on Linux (also invokes pes_parse_g09.py below, as well as the Linux programs gawk and paste):

formchk example.chk 
./pes_parse_g09.py example.out 
./xyz2mass.py structure_001.xyz > test.mass
./fck2hess.py example.fchk test.hess
./hess2freq.py test.hess test.mass |gawk '{print $1}' > test.freq

cp test.mass test2.mass

At this point I manually changed 1.2D01 in test2.mass to 13.003355 (the mass of 13C).

./hess2freq.py test.hess test2.mass |gawk '{print $1}' > test2.freq
paste test.freq test2.freq > tests.freq
./freq2rpfr.py tests.freq 298.15
./freq2rpfrvec.py tests.freq 273.15 303.15 10

Old Octave codes: nwhessian (frequencies=nwhess("input.hess", "input.mass")), rpfr.m and rpfrvec.m (rpfr('infile.hess',298.15) and rpfrvec('infile.hess',273.15,313.15,10))

Parsing

qst_parse_g09.py – Python 2.7x. Extracts all structures (and energies) from a Gaussian output file/log file. NOTE: they don’t have to be optimised (I wrote it to extract structures from a QST/IRC search). Usage: ./qst_parse_g09.py example.out

pes_parse_g09.py – Python 2.7x. Extracts all optimised structures (and energies) from a Gaussian output file/log file. I wrote it to extract structures from a Gaussian PES scan, but it’s great for simply extracting the final, optimised structure from a geometry optimisation in Gaussian. Usage: ./pes_parse_g09.py example.out

qst_parse_nw.py — Python 2.7x. Same as qst_parse_g09, but for nwchem.

pes_parse_nw.py – Python 2.7x. Same as pes_parse_g09, but for nwchem.

g09freq —

g09raman —

efgparse_g09 – Python 2.7x. Parses the output of a Gaussian EFG calculation and processes it. Includes example input. Usage: ./efparse_g09 g09.g09out water

Misc. actions on xyz files

autorotate – Python 2.7x. An interesting program that takes two very similar molecules that have been rotated relative to one another and tries to rotate them back into the same coordinate system. This can be useful before moving on to nebinterpolate (e.g. if you used autoz when optimising the reactant and product systems). ab.zip contains example input. Instructions are found in the code.

nebinterpolate — This Python script takes a multi-xyz file as input and interpolates the atom positions. If you had two structures in the file, you’ll get three (the two originals plus their average). If you had three, you’ll get five. If you had five, you’ll get nine. It’s sometimes useful for generating input for nudged elastic band computations in e.g. nwchem.
Usage: nebinterpolate.py -i test_input.xyz -o test_output.xyz
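The doubling behaviour (n structures in, 2n-1 out) can be sketched like this. It is an illustration, not the nebinterpolate source; frames are assumed to be lists of (symbol, x, y, z) tuples with the atoms in the same order in every frame, and the function name is made up.

```python
def interpolate(frames):
    """Insert the midpoint structure between each pair of consecutive frames."""
    out = []
    for current, following in zip(frames, frames[1:]):
        out.append(current)
        midpoint = [
            (sym, (xa + xb) / 2.0, (ya + yb) / 2.0, (za + zb) / 2.0)
            for (sym, xa, ya, za), (_, xb, yb, zb) in zip(current, following)
        ]
        out.append(midpoint)
    out.append(frames[-1])
    return out
```

Running it twice on two structures gives five, which matches the three-to-five step described above.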

pair_xyz – Python 2.7x. Tracks the distance between two atoms (e.g. the first and the second atom) in a multi-xyz file. Useful for parsing MD output or IRCs. Usage: ./pair_xyz multi.xyz 1 2
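A small sketch of the same idea, not the pair_xyz source. It assumes a standard multi-xyz layout (atom count, comment line, then one "symbol x y z" line per atom, repeated per frame) and 1-based atom numbers as in the usage line; math.dist needs Python 3.8 or later.

```python
import math


def read_frames(text):
    """Split multi-xyz text into frames of (symbol, x, y, z) tuples."""
    lines = text.splitlines()
    frames = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # tolerate blank lines between frames
            continue
        n_atoms = int(lines[i])
        atoms = []
        for line in lines[i + 2:i + 2 + n_atoms]:  # skip the comment line
            sym, x, y, z = line.split()[:4]
            atoms.append((sym, float(x), float(y), float(z)))
        frames.append(atoms)
        i += 2 + n_atoms
    return frames


def pair_distances(frames, first, second):
    """Distance between atoms `first` and `second` (1-based) in every frame."""
    return [
        math.dist(frame[first - 1][1:], frame[second - 1][1:])
        for frame in frames
    ]
```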

polish_xyz – Python 2.7x. Takes an ugly-looking, unreadable xyz file and converts it to a standardised xyz file (can’t remember which standard) that can be read by most software.

flip_xyz – Python 2.7x. Generates the mirror image of a molecule by changing the sign of the z-coordinates of the atoms.

genstruc — Python 2.7x. Takes an xyz file and generates a list of all bonds and/or angles and/or dihedrals in it. You need to provide a maximum bond distance in nm to consider (if it’s too long then everything is bonded to everything). I used it to look at the environment/bond distances of specific metals in clusters as a function of computational method. Can also be used to generate output for bar charts etc.

To get bonds only:

Usage: ./genstruc structure.xyz 2.3 1

To get bonds and angles only:

Usage: ./genstruc structure.xyz 2.3 2

To get bonds, angles and dihedrals:

Usage: ./genstruc structure.xyz 2.3 3
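How a distance cutoff turns coordinates into bonds and angles can be sketched as below. This is not the genstruc source: the function names are made up, atoms are (symbol, x, y, z) tuples, indices are 0-based, and the cutoff is in whatever length unit the coordinates use. Dihedrals are left out to keep it short.

```python
import math
from itertools import combinations


def bonds(atoms, max_dist):
    """All atom pairs (i, j, distance) closer than max_dist."""
    found = []
    for (i, a), (j, b) in combinations(enumerate(atoms), 2):
        d = math.dist(a[1:], b[1:])
        if d <= max_dist:
            found.append((i, j, d))
    return found


def angles(atoms, max_dist):
    """Angles (i, centre, k, degrees) between bonds sharing a central atom."""
    neighbours = {}
    for i, j, _ in bonds(atoms, max_dist):
        neighbours.setdefault(i, []).append(j)
        neighbours.setdefault(j, []).append(i)
    result = []
    for centre, nbrs in neighbours.items():
        for i, k in combinations(sorted(nbrs), 2):
            v1 = [p - q for p, q in zip(atoms[i][1:], atoms[centre][1:])]
            v2 = [p - q for p, q in zip(atoms[k][1:], atoms[centre][1:])]
            dot = sum(p * q for p, q in zip(v1, v2))
            cos_angle = dot / (math.hypot(*v1) * math.hypot(*v2))
            cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
            result.append((i, centre, k, math.degrees(math.acos(cos_angle))))
    return result
```

With too generous a cutoff every pair counts as a bond, which is the "everything is bonded to everything" case mentioned above.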

NMR

nmrsim — The following two Octave scripts, homog.m and hmegen.m, simulate a very simple model spectrum of an arbitrary number of uncoupled spins which are in chemical equilibrium. It loosely follows P. Allard et al. (J. Magn. Reson., 1997, 129, 19-29; J. Biomol. NMR, 2000, 18, 49-63). You only need to edit homog.m to change parameters. Leave hmegen.m alone.

kinsim — two simple Octave scripts, twosite.m and getpulse.m, for simulating the NMR spectrum of a two-site exchange system using classical Bloch equations. Put the files in a directory, open Octave there, and run ‘twosite’. The real value of the scripts is that they are very easy to edit if you want e.g. multiple pulses, more species in equilibrium, or different rate laws.
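To give a feel for two-site exchange, here is a Python sketch that evaluates the steady-state Bloch-McConnell lineshape directly in the frequency domain. Note that this is a different route from the time-domain, pulse-based approach of twosite.m, and none of the names below come from the Octave scripts. It assumes both sites share one transverse relaxation rate r2, and that all frequencies are angular (rad/s) and all rates are in 1/s.

```python
def two_site_spectrum(omegas, omega_a, omega_b, k_ab, k_ba, r2):
    """Absorption lineshape for A <-> B exchange at the given frequencies.

    Solves (i(w - Omega) + R2 - K) M = p for each frequency w, where K is
    the exchange matrix and p holds the equilibrium populations, then sums
    the two components and keeps the real part.
    """
    p_a = k_ba / (k_ab + k_ba)  # equilibrium populations
    p_b = k_ab / (k_ab + k_ba)
    spectrum = []
    for w in omegas:
        a = complex(r2 + k_ab, w - omega_a)
        d = complex(r2 + k_ba, w - omega_b)
        det = a * d - k_ab * k_ba  # determinant of the 2x2 matrix
        total = (d * p_a + k_ba * p_b + k_ab * p_a + a * p_b) / det
        spectrum.append(total.real)
    return spectrum


if __name__ == "__main__":
    grid = [-1000.0 + 10.0 * n for n in range(201)]
    for rate in (1.0, 1.0e3, 1.0e6):  # slow, intermediate, fast exchange
        s = two_site_spectrum(grid, -500.0, 500.0, rate, rate, 10.0)
        print(rate, grid[s.index(max(s))])
```

In slow exchange the intensity sits at the two site frequencies; in fast exchange it collapses to a single line at the population-weighted average.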

Chemometrics

pOCPCA — Collection of Octave scripts for doing some basic chemometric analysis of data. In particular, I’ve used it for kinetic UV/VIS data, pH-dependent UV/VIS data and cone voltage-dependent mass spec data. It’s quite old now and I haven’t used it for a while, but it performs as well as the SPECFIT program that caused me to write it (SPECFIT used an opaque algorithm, which is always frustrating to a scientist who wants to UNDERSTAND the data). pOCPCA does both the Orthogonal Projection Approach and Evolving Factor Analysis and can handle vectors generated from pKa-based models, in addition to simple linear ones. To be absolutely fair, the fitting routines in pOCPCA could do with an overhaul as they do not reliably find the minimum every time.

Here’s an old description of the collection of scripts, which is still accurate:
“pOCPCA is a collection of Octave scripts controlled by newmain.m which can do: evolving factor analysis, orthogonal projection approach, pKa extraction and rate law fitting of large data sets, such as UV/VIS titration data, ESI-MS cone voltage data and so on. The best way to learn how to use the scripts is by looking at newmain.m. The biggest challenge is really constructing the data input: the input file should have the vector of change (e.g. time or pH) in the first column, and the static vector (e.g. wavelength) in the first row. The first row, first column position can be anything (e.g. 0) as it’s just used for symmetry. The rest of the fields contain the data.”
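The input layout described above can be illustrated with a short Python sketch that splits such a file into its three parts. This is only an illustration of the layout, not part of pOCPCA; the function name is made up and whitespace-separated fields are an assumption.

```python
def parse_matrix(text):
    """Split a pOCPCA-style table into (change, static, data).

    change: first column below the corner, e.g. times or pH values
    static: first row right of the corner, e.g. wavelengths
    data:   remaining fields, one row per entry in `change`
    """
    rows = [line.split() for line in text.strip().splitlines()]
    static = [float(v) for v in rows[0][1:]]  # the corner value is ignored
    change = [float(row[0]) for row in rows[1:]]
    data = [[float(v) for v in row[1:]] for row in rows[1:]]
    return change, static, data
```

For example, a file whose first row is "0 200 210" and whose next rows start with 1.0 and 2.0 describes two spectra, recorded at 200 and 210 (e.g. nm), taken at the points 1.0 and 2.0 of the changing variable.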

MOCPCA — MATLAB-compatible version of pOCPCA. As I had to ‘port’ minimize() from Octave I wouldn’t put as much trust in this version as in the Octave one, since ‘port’ means ‘brutally fiddling with the code until it magically worked on MATLAB’.

Data processing

rotate — shell script which depends on gawk. Rotates an m times n tab-separated matrix

Usage: rotate filename

findpeaks – Python script which parses an xyyy.. tab-separated file and returns the signals (y) with intensities larger than the supplied cutoff value.

Usage: findpeaks filename column cutoff

where the filename is the name of the file, the column is the column which to parse (numbering starts at 1, not 0), and the cutoff value is a floating point number.
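The filtering step can be sketched as follows. This is not the findpeaks source; it assumes that column 1 is the x column (so the first y column is number 2) and that lines that cannot be parsed as numbers, such as headers, should be skipped.

```python
def find_peaks(text, column, cutoff):
    """Return (x, y) pairs whose y in the chosen 1-based column exceeds cutoff."""
    hits = []
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < column:
            continue  # line too short for the requested column
        try:
            x = float(fields[0])
            y = float(fields[column - 1])
        except ValueError:
            continue  # header or otherwise non-numeric line
        if y > cutoff:
            hits.append((x, y))
    return hits
```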

homogenise.sh — Very simple shell script (bash) using gawk which harvests the first column of a series of xy tab-separated files ({0..200..10}.dat and {220..300..20}.dat) to a single file.

Usage: sh homogenise.sh > output.dat

makelist.py — Python script. It is fairly specific in its purpose as it

  • opens a tab-separated file generated by homogenise.sh
  • collects all unique numbers in the file
  • sorts them in descending order. These represent all possible x values for a system
  • opens a series of files with names in the {0..200..10}.dat and {220..300..20}.dat ranges. These are xy tab-separated files.
  • collects all the data corresponding to the unique x values
  • Returns a matrix with the X and Y values

This script is used to generate a matrix from a collection of mass spectra.

Usage: makelist.py filename
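The steps listed above can be sketched in Python as follows. This is an illustration, not the makelist.py source: each spectrum is assumed to be already loaded as an {x: y} dictionary, and filling missing points with 0.0 is an assumption of the sketch.

```python
def build_matrix(spectra, missing=0.0):
    """Merge spectra into rows of [x, y_1, y_2, ...], with x descending.

    spectra: list of {x: y} dictionaries, one per input file.
    """
    # All unique x values across the spectra, sorted in descending order.
    xs = sorted({x for spectrum in spectra for x in spectrum}, reverse=True)
    return [[x] + [spectrum.get(x, missing) for spectrum in spectra] for x in xs]
```

Each row then holds one x value followed by the intensity from every spectrum, which is the kind of matrix the chemometrics scripts expect once it has been oriented correctly.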

procms.sh — sh procms.sh