Codes – Molecular Geochemistry at Umeå University

The following codes and software have been developed by members of the Molecular Geochemistry Laboratory

Holmboe

Matlab scripts for MD/MC simulation systems

The Atomistic Topology Operations in Matlab (ATOM) scripts – a collection of many Matlab scripts and a general framework for manipulating atomistic simulation cells, i.e. the xyz-coordinates, bonds, angles and simple dihedrals (across the PBC). This Matlab library was specifically developed to build systems and generate Gromacs, Lammps, NAMD2 and RASPA2 input files for Clayff and INTERFACE topologies. Download from the Matlab Filexchange or GitHub, or go to the html documentation.

Furthermore, i) bond/neighbour distances can be compared to empirical atom-specific distances by comparison to the Revised Shannon radii, and ii) the valence of atoms can be calculated using the Bond Valence Method, using for instance the Matlab function properties_atom(atom,Box_dim). Furthermore, basic powder X-ray diffractograms can be generated by the xrd_atom() function

Ohlin

Mass spectrometry

pyIsocalc — an isotope calculator in python 2.7x for mass spectrometry written in python. Very accurate — wrote it out of frustration with the inability of existing calculators to generate realistic isotope envelopes for transition metal systems. Can change resolution. See also https://sourceforge.net/projects/pyisocalc/.

Usage: ./pyisocalc -f ‘Fe(ClO3)5’ -p y -g 0.25 -o ironperchlorate.dat -c -2 -r 250

Computational chemistry

Suites

Andy Ohlin has together with Matt Asplund decided to fork the Extensible Computational Chemistry Environment (ECCE) suite. See here

Patches

nwchem 6.6 patch set — contains patches that improves interopability of NWChem with GabEdit (see e.g. this), and a few other things. Should work with future nwchem versions with minimal changes

Fractionation

nwchem:

reorderhess.py — Python (2.7) code that reorders a vibrational hessian from nwchem. If you compute a hessian for an XYZ where the atomic order is O, H, H, you can reorder it so that the order is H,H,O by doing

python reorderhess.py H2O_OHH.hess H2O.xyz “2,3,1” > H2O_HHO.hess

“2,3,1” means that atom 2 will be first, then atom 3, then atom 1 (in the original xyz). You also get a new xyz file called newxyz.xyz

monolith_nw — Python 2.7 code that computes the RPFRs for Mg24 vs Mg26 (to change this, edit xyzgetmassesmg26()) using a hessian generated by nwchem. The system you’re interested in needs to be at the beginning of the hessian (see reorderhess.py), but all atoms can be active during the computation of the hessian. to change the temperature, edit ‘temperature=298.15’ in the code.

python monolith_nw H2O_HHO.hess newxyz.xyz 3

3 means that the frequencies and hessian will be computed using the first three atoms. In the present example that’s all atoms, but you could also do it for e.g. just AlO6 in a Al(H2O)6.12H2O cluster

Gaussian:

monolith — Python code that does all (xyz2mass, fck2hess, hess2freq, freq2rpfr) of the below, but only needs to inputs: the fchk file and the temperature. Note that it’s hardcoded to compute the RPFRs for Mg 23 vs Mg 26, AND you need to have the system of interest at the beginning of your hessian AND all other atoms should be frozen during the computation of the hessian. You need to edit the Mg26 thing in the code (sub-routine xyzgetmassesmg26() ). To fix the other items, look at monolith_nw above.

Usage: ./monolith my.fck 298.15

xyz2mass.py — Python 2.7x. Generates a list of default atomic masses from an xyz file. You can edit the output by hand to replace the standard mass with an isotope. Usage: ./xyz2mass.py structure.xyz

fck2hess.py — Python 2.7x. extracts a hessian from a gaussian formchk (fck) file from a vibration calculation. Usage: ./fck2hess.py example.fchk test.hess

hess2freq.py –Python 2.7x. uses a mass file (from xyz2mass) and a hessian file (from fck2hess) to compute normal modes/frequencies. Usage: ./hess2freq.py test.hess test.mass

freq2rpfr.py –Python 2.7x. computes the reduced partition function ratio from a file with two columns, each column containing the normal modes/frequencies of one isotope of a molecule. Usage: ./freq2rpfr.py tests.freq 298.15

freq2rpfrvec.py –Python 2.7x. same as freq2rpfr, but can generate the rpfrs for a series of temperatures. Usage: ./freq2rpfrvec.py tests.freq 273.15 303.15 10

example input –Python 2.7x. gaussian 09 example.chk and example.out (opt + vib) are found in this file.

Full example usage of the above codes on Linux (also invokes pes_parse_g09.py below, as well as the linux programs gawk and paste):

formchk example.chk 
./pes_parse_g09.py example.out 
./xyz2mass.py structure_001.xyz > test.mass
./fck2hess.py example.fchk test.hess
./hess2freq.py test.hess test.mass |gawk '{print $1}' > test.freq

cp test.mass test2.mass

I changed 1.2D01 in test2.mass to 13.003355 manually

./hess2freq.py test.hess test2.mass |gawk '{print $1}' > test2.freq
paste test.freq test2.freq > tests.freq
./freq2rpfr.py tests.freq 298.15
./freq2rpfrvec.py tests.freq 273.15 303.15 10

Old octave codes: nwhessian (frequencies=nwhess(“input.hess”, “input.mass”)), rpfr.m and rpfrvec.m (rpfr(‘infile.hess’,298.15) and rpfrvec(‘infile.hess’,273.15,313.15,10))

Normal modes

ramanlit — this is a fun one. It allows you to work out to which vibration (frequency) a specific atom contributes mostly to by varying the mass of that atom.
Usage: ramanlit gaussian.fchk 298 1 200
It’s a modified version of the fractionation script. The third and fourth arguments are the atom number (according to the xyz) and the mass (in amu). By looking at the differences in computed frequencies, you can see which vibration an atom contributes to.

parsing

qst_parse_g09.py –Python 2.7x. Extract all structures (and energies) from a gaussian output file/log file. NOTE: they don’t have to be optimised (I wrote it to extract structures from a QST/IRC search). Usage: ./qst_parse_g09.py example.out

pes_parse_g09.py –Python 2.7x. Extract all optimised structures (and energies) from a gaussian output file/log file. I wrote it to extract structures from a gaussian PES scan, but it’s great for simply extracting the final, optimised structure from a geometry optimisation in gaussian. Usage: ./pes_parse_g09.py example.out

qst_parse_nw.py — Python 2.7x. Same as qst_parse_g09, but for nwchem.

pes_parse_nw.py –Python 2.7x.Same as pes_parse_g09, but for nwchem.

g09freq —

g09raman —

efgparse_g09 –Python 2.7x. program that parses gaussian EFG calculation output and processes it. Includes example input. Usage: ./efparse_g09 g09.g09out water

Misc. actions on xyz files

autorotate –Python 2.7x. An interesting program that can take two very similar molecules that have been rotated relative to one-another, and try to rotate them back into the same coordinate system. This can be useful before moving on to nebinterpolate (e.g. if you used autoz when optimised the reactant and product systems). ab.zip contains example input. Instructions are found in the code.

nebinterpolate — This python script takes a multi-xyz file as input and interpolates the atom positions. If you had two structures in the file, you’ll get three (the average of the two). If you had three, you’ll get five. If you had five, you’ll get nine. It’s sometimes useful for generating input for nudge-elastic band computations in e.g. nwchem.
Usage: nebinterpolate.py -i test_input.xyz -o test_output.xyz

pair_xyz –Python 2.7x. Tracks the distance between two atoms (e.g. the first and the second atom) in a multi-xyz file. Useful for parsing MD output or IRCs. Usage: ./pair_xyz multi.xyz 1 2

polish_xyz –Python 2.7x. Takes an ugly-looking unreadable xyz file and converts it to a standardised xyz file (can’t remember which standard) that can be read by most software

flip_xyz –Python 2.7x. Generates the mirror image of a molecule by changing the sign of the z-coordinates of the atoms.

genstruc — Python 2.7x. Takes an xyz file and can generate a list over all bonds and/or angles and/or dihedrals in it. Need to provide a maximum bond distance in nm to consider (if it”s too long then everything is bonded to everything). I used it to look at the environment/bond distances of specific metals in clusters as a function of computational method. Can also be used to generate output for bar charts etc.

To get bonds only:

Usage: ./genstruc structure.xyz 2.3 1

To get bonds and angles only:

Usage: ./genstruc structure.xyz 2.3 2

To get bonds, angles and dihedrals

Usage: ./genstruc structure.xyz 2.3 3

NMR

nmrsim — The following two Octave scripts, homog.m and hmegen.m, simulate a very simple model spectrum of an arbritary number of uncoupled spins wich are in chemical equilibrium. It loosely follows P. Allard et al. (J.Mag. Res., 1997, 129, 19-29; J. Biomol. NMR, 2000, 18, 49-63). You only need to edit homog.m to change parameters. Leave hmegen.m alone.

kinsim — two simple Octave scripts, twosite.m and getpulse.m, for simulating the NMR spectrum of a two-site exchange system using classical Bloch equations. . Put the files in a directory, open octave there, and run ‘twosite’. The real value of the scripts are that they are very easy to edit if you want e.g. multiple pulses, more species in equilibrium, or different rate laws.

Chemometrics

pOCPCA — Collection of Octave scripts for doing some basic chemometric analysis of data. In particular, I’ve used it for kinetic UV/VIS data, pH-dependent UV/VIS data and cone voltage-dependent mass spec data. It’s quite old now and I haven’t used it for a while, but it performs as well as the SPECFIT program that caused me to write it (SPECFIT used an opaque algorithm, which is always frustrating to a scientist who wants to UNDERSTAND the data). pOCPA does both the Orthogonal Projection Approach and Evolving Factor Analysis and can handle vectors generated from pKa-based models, in addition to simple linear ones. To be absolutely fair, the fitting routines in pOCPCA could do with an overhaul as they are not altogether reliable in finding the minimum every time.

Here’s an old description of the collection of scripts, which is still accurate:
“pOCPCA is a collection of octave scripts controlled by newmain.m which can do: evolving factor analysis, orthogonal projection approach, pKa extraction and rate law fitting of large data sets, such as UV/VIs titration data, ESI-MS cone voltage data and so on. The best way to learn how to use the scripts is by looking at newmain.m. The biggest challenge is really constructing the data input: the input file should have the vector of change (e.g. time or pH) in the first column, and the static vector (e.g. wavelength) in the first row. The first row, first column position can be anything (e.g. 0) as it’s just used for symmetry. The rest of the fields contain the data.”

MOCPCA — matlab-compatible version of pOCPCA. As I had to ‘port’ minimize() from Octave I wouldn’t put as much trust in this version as in the Octave one since ‘port’ means ‘brutally fiddling with the code until it magically worked on matlab’.

Data processing

rotate — shell script which depends on gawk. Rotates an m times n tab-separated matrix

Usage: rotate filename

findpeaks –Python script which parses an xyyy.. tab-separated file and returns signal (y) with intensities larger than the supplied cutoff value.

Usage: findpeaks filename column cutoff

where the filename is the name of the file, the column is the column which to parse (numbering starts at 1, not 0), and the cutoff value is a floating point number.

homogenise.sh — Very simple shell script (bash) using gawk which harvests the first column of a series of xy tab-separated files ({0..20..10}.dat and {220..300..20}.dat) to a single file.

Usage: sh homogenise.sh > output.dat

makelist.py — Python script. It is fairly specific in its purpose as it

opens a tab-separated file generated by homogenise.sh
collects all unique numbers in the file
sorts them in descending order. These represent all possible x values for a system
opens a series of files with names in the {0..200..10}.dat and {220..300..20}.dat ranges. These are xy tab-separated files.
collects all the data corresponding to the unique x values
Returns a matrix with the X and Y values

This script is used to generate a matrix from a collection of mass spectra.

Usage: makelist.py filename

procms.sh — sh procms.sh