Saturday, 23 February 2013

cclib 1.1 and GaussSum 2.2.6 released

cclib is a Python library for parsing and analysing comp chem log files from many different QM packages. GaussSum is a GUI that uses cclib to monitor the progress of comp chem calculations and to calculate predicted spectra for comparison with experimental results. cclib is the work of Adam Tenderholt, Karol Langner and myself, while GaussSum is just by me.

New releases of both of these are now available for download at their respective websites, and help is available by email. Here's what's new in these releases:

cclib 1.1
New Features:
  • Add progress info for all parsers
  • Support ONIOM calculations in Gaussian (Karen Hemelsoet)
  • New attribute atomcharges extracts Mulliken and Lowdin atomic charges if present
  • New attribute atomspins extracts Mulliken and Lowdin atomic spin densities if present
  • New thermodynamic attributes: freeenergy, temperature, enthalpy (Edward Holland)
  • Extract PES information: scanenergies, scancoords, scanparm, scannames (Edward Holland)
Bugfixes:
  • Handle coupled cluster energies in Gaussian 09 (Björn Dahlgren)
  • Vibrational displacement vectors missing for Gaussian 09 (Björn Dahlgren)
  • Fix problem parsing vibrational frequencies in some GAMESS-US files
  • Fix missing final scfenergy in ADF geometry optimisations
  • Fix missing final scfenergy for ORCA where a specific number of SCF cycles has been specified
  • ORCA scfenergies not parsed if COSMO solvent effects included
  • Allow spin unrestricted calculations to use the fragment MO overlaps correctly for the MPA and CDA calculations
  • Handle Gaussian MO energies that are printed as a row of asterisks (Jerome Kieffer)
  • Add more explicit license notices, and allow LGPL versions after 2.1
  • Support Firefly calculations where nmo != nbasis (Pavel Solntsev)
  • Fix problem parsing vibrational frequency information in recent GAMESS (US) files (Chengju Wang)
  • Apply patch from Chengju Wang to handle GAMESS calculations with more than 99 atoms
  • Handle Gaussian files with more than 99 atoms having pseudopotentials (Björn Baumeier)
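
As a rough illustration of how the new attributes might be used, here is a minimal sketch. The attribute names come from the list above; the helper names (available_attributes, parse_logfile) and the stand-in data object with made-up numbers are mine, and I'm assuming the usual ccopen(...).parse() entry point from cclib.parser. An attribute is only set when the corresponding information is in the log file, so the sketch checks for each one rather than assuming it exists.

```python
from types import SimpleNamespace

# Attribute names taken from the cclib 1.1 release notes above.
NEW_ATTRIBUTES = (
    "atomcharges", "atomspins", "freeenergy", "temperature", "enthalpy",
    "scanenergies", "scancoords", "scanparm", "scannames",
)


def available_attributes(data):
    """Return the new attributes that are actually present on a parsed data object."""
    return {name: getattr(data, name) for name in NEW_ATTRIBUTES if hasattr(data, name)}


def parse_logfile(path):
    """Parse a log file with cclib (imported lazily, so the helper above works without it)."""
    from cclib.parser import ccopen  # assumed entry point; requires cclib to be installed
    return ccopen(path).parse()


# Hypothetical stand-in for a parsed file; the numbers are invented for illustration.
fake = SimpleNamespace(atomcharges={"mulliken": [-0.4, 0.2, 0.2]}, temperature=298.15)
found = available_attributes(fake)
print(sorted(found))
```

With a real calculation, available_attributes(parse_logfile("mycalc.log")) would do the same job, where mycalc.log is a placeholder file name.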

GaussSum 2.2.6
New Features:
  • A patch from Thomas Pijper was integrated to enable calculation of Raman intensities (from Raman activity).
  • Support has been added for calculating charge density changes for unrestricted calculations (requested by Phil Schauer).
  • Blank lines in Groups.txt are now ignored.
  • Parser updated to cclib 1.1
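
For the curious, the conversion from Raman activity to Raman intensity can be sketched as below. This uses a relation commonly quoted in the literature, I = f (v0 - v)^4 S / (v (1 - exp(-hcv/kT))); whether the patch uses exactly this form is an assumption on my part, as are the function name and the default excitation line (1064 nm, i.e. about 9398.5 cm-1), temperature and scale factor.

```python
import math

HC_OVER_K = 1.438777  # h*c/k in cm*K (second radiation constant)


def raman_intensity(activity, freq_cm, excitation_cm=9398.5, temperature=293.15, scale=1.0):
    """Convert a Raman activity S into a relative Raman intensity.

    activity      -- Raman activity of the mode
    freq_cm       -- vibrational wavenumber of the mode in cm-1 (must be > 0)
    excitation_cm -- wavenumber of the exciting laser in cm-1 (assumed default: 1064 nm)
    temperature   -- temperature in K (assumed default)
    scale         -- arbitrary normalisation factor f
    """
    # Thermal population factor of the vibrational mode.
    boltzmann = 1.0 - math.exp(-HC_OVER_K * freq_cm / temperature)
    return scale * (excitation_cm - freq_cm) ** 4 * activity / (freq_cm * boltzmann)


# Two hypothetical modes (wavenumber in cm-1, activity), normalised to the strongest.
modes = [(500.0, 10.0), (1600.0, 25.0)]
raw = [raman_intensity(s, v) for v, s in modes]
relative = [x / max(raw) for x in raw]
print(relative)
```

Since the scale factor is arbitrary, only the relative intensities are meaningful, hence the normalisation at the end.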

Sunday, 17 February 2013

A new home for Linux4Chemistry?

Are you interested in taking over the stewardship of the Linux4Chemistry website? This website was set up by Nikodem Kuznik in 2001 to promote the usage of Linux for chemistry by listing the chemistry software available for Linux, whether commercial, free, or open source. In 2005 I took over its maintenance, moved it to its present location and made some changes including the awesome logo. I am currently on the lookout for someone interested and willing to take over the running of this site. Update (25/02/2013): A new home has been found.

Why am I looking for someone new? Well, I think I am no longer the right person to maintain this, as the goals I wished to achieve have already been met. To me the idea of promoting Linux as a platform for chemistry software today seems obsolete when almost all chemistry software vendors target Linux at a minimum. Every docking program is on Linux, every QM package runs on Linux, etc.

In taking over the website I also had the goal of clarifying the distinction between Open Source software and software available for free. To this end, I added license information to all of the software, and made it possible to filter the results by license type (I was very proud of my tongue-in-cheek logo for Shareware software). I think that today more people are aware of this distinction and in any case, I'm not sure that L4C is really playing a role here.

It is also true that while at the time my desktop machine was running Linux (Debian Sarge I think), I now do most of my work on Windows and run Linux in a VM. And it's harder to maintain something which is outside my day-to-day usage.

So I'm looking for someone to bring a new vision to Linux4Chemistry and shake it up a bit. If you are this person, get in touch.

Wednesday, 6 February 2013

A compilation of speeds - Compiler face-off

Let's get right into this one. I've compiled Open Babel with g++ in various ways, and am going to compare the speed with the MSVC++ release. Specifically I'm going to compare the wallclock time to convert 10000 molecules (the first 10000 in ChEMBL 13) from an SDF file to SMILES.

Our starting point is the time for the MSVC++ compiled release:
29.6s (MSVC++ 2010 Express 32-bit)

I have a Linux Mint 12 VM (VMWare) on the same machine, so let's run the same executable under Wine on Linux:
37.3s (MSVC++ 32-bit under Wine/Linux)
It's slower, pretty much as expected. The not-an-emulation layer slows things down a bit.

How about the MinGW compilation described in the previous post?:
24.1s (MinGW g++ 4.6.2 32-bit)
g++ beats MSVC++. To be honest, I was a bit surprised to see this, although I understand from Roger that g++ is surprisingly highly-optimised for cheminformatics toolkits. Maybe we should look into an official MinGW release in future.

What about Open Babel compiled with Cygwin's g++?:
39.5s (Cygwin g++ 4.5.3 32-bit)
As expected it runs like a pig compared to the MinGW version. Cygwin's handy, but when you're in a hurry it's maybe not the best choice.

So far, so not very unexpected. Now we will enter the realm of weirdness. Let's compile it on Linux in the VM and run it there:
14.8s (Linux Mint 12 g++ 4.6.1 64-bit)

So, in short, the fastest way to run Open Babel on Windows is to use a VM to run Linux. Huh? The like-with-like comparison of MinGW's 24.1s versus Linux's 14.8s is the most intriguing. It suggests that the slowdown is due either to rubbish file I/O by Windows, or to sub-optimal platform-specific code in Open Babel's I/O handling.

Either way, it's a pretty interesting result.
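
To put the numbers side by side, here is a small sketch that expresses each timing relative to the MSVC++ release. The timings are the ones reported above; the choice of MSVC++ as the baseline and the variable names are just mine.

```python
# Wall-clock timings in seconds, as reported above.
timings = {
    "MSVC++ 2010 Express 32-bit": 29.6,
    "MSVC++ 32-bit under Wine/Linux": 37.3,
    "MinGW g++ 4.6.2 32-bit": 24.1,
    "Cygwin g++ 4.5.3 32-bit": 39.5,
    "Linux Mint 12 g++ 4.6.1 64-bit": 14.8,
}

baseline = timings["MSVC++ 2010 Express 32-bit"]

# Speed relative to the MSVC++ release: values above 1 are faster than it.
relative = {name: baseline / secs for name, secs in timings.items()}

for name, speedup in sorted(relative.items(), key=lambda item: -item[1]):
    print(f"{name:32s} {timings[name]:5.1f}s  {speedup:.2f}x")

# The like-with-like g++ comparison: MinGW on Windows versus g++ on Linux.
mingw_vs_linux = timings["MinGW g++ 4.6.2 32-bit"] / timings["Linux Mint 12 g++ 4.6.1 64-bit"]
print(f"MinGW takes {mingw_vs_linux:.2f} times as long as Linux g++")
```

On these figures the Linux build is twice as fast as the MSVC++ release, and the MinGW build takes a bit over 1.6 times as long as the Linux one.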

1. Hardware was a Dell Latitude E6400 bought 3 years ago (Core 2 Duo 2.4 GHz, 4 GB RAM) running Win 7 64-bit. The timing was the best of three after timings had stabilised (the first one or two are usually a second or two slower).
2. After the initial post, I compiled clang on Linux, and then used it to compile Open Babel. Running the conversion took 15.3s.
3. Also, I ran the MinGW compiled version under Linux, and it took 30.7s.
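
The "best of three after the timings have stabilised" procedure from note 1 could be scripted along these lines. This is a sketch rather than what I actually ran: the function name, the number of warm-up runs and the demo command are all assumptions.

```python
import subprocess
import sys
import time


def best_wallclock(cmd, repeats=3, warmup=2):
    """Return the best wall-clock time (in seconds) of `repeats` runs of `cmd`.

    The first `warmup` runs are discarded, since the first one or two
    tend to be slower before things such as file caches have settled.
    """
    for _ in range(warmup):
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        times.append(time.perf_counter() - start)
    return min(times)


# Demo with a trivial command so that the sketch runs anywhere.
demo = best_wallclock([sys.executable, "-c", "pass"], repeats=3, warmup=1)
print(f"best of three: {demo:.3f}s")
```

For the conversion itself, cmd would be something like ["obabel", "first10000.sdf", "-O", "out.smi"], where both file names are placeholders.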

Image credit: Venkatesh Srinivas (Extrudedaluminiu on Flickr)

Compiling Open Babel with MinGW on Windows

If you want to compile on Windows using GCC, you have two alternatives: Cygwin's GCC and MinGW's. The one from Cygwin is easier to use (easier installation) but has the disadvantage that the resulting software does not run natively on Windows; various system calls go through Cygwin's emulation layer, which slows things down. Here I'll show how to compile Open Babel with MinGW.

Installing MinGW

I've previously found this a bit confusing. This time I did a manual installation by creating a folder C:\MinGW, and then downloading all the relevant dlls listed on the installation page. To do this quickly just middle-click on several links, wait a few seconds, and then hit Save on all the dialog boxes. Once they are all downloaded, move them to C:\MinGW and unzip them there.

Installing MSYS

No need to install MSYS (a kind of build environment for MinGW) for a project such as Open Babel that uses CMake to build. Why do I mention it then? Because the MinGW page talks all about it.

Compiling Open Babel

1. Add C:\MinGW\bin to the PATH
2. Get Cygwin's stuff off the PATH (if it's there). This is most easily accomplished by renaming C:\Cygwin to C:\oldCygwin or so.
3. Configure CMake to create makefiles for MinGW. I had some problems (at runtime) with a shared library version, so I went with the static one:
cmake -G "MinGW Makefiles" ../openbabel-2.3.2 -DWITH_INCHI=FALSE -DBUILD_SHARED=FALSE
4. Build it with MinGW's make.
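
The steps above can be sketched as a small script. Note the differences from what I describe: instead of renaming the Cygwin folder, the sketch just filters Cygwin entries out of the PATH it hands to the build, and it assumes that MinGW's make is installed under its usual name, mingw32-make. The function names and the build directory are hypothetical.

```python
import os
import subprocess


def mingw_path(path, mingw_bin=r"C:\MinGW\bin", sep=";"):
    """Return a Windows PATH with MinGW first and any Cygwin entries removed."""
    entries = [p for p in path.split(sep) if p and "cygwin" not in p.lower()]
    entries = [p for p in entries if p.lower() != mingw_bin.lower()]
    return sep.join([mingw_bin] + entries)


def build_commands(source_dir="../openbabel-2.3.2"):
    """Return the configure and build commands for a static MinGW build."""
    configure = ["cmake", "-G", "MinGW Makefiles", source_dir,
                 "-DWITH_INCHI=FALSE", "-DBUILD_SHARED=FALSE"]
    build = ["mingw32-make"]  # assumed name of MinGW's make executable
    return [configure, build]


def run_build(build_dir):
    """Run the configure and build steps in build_dir (Windows with MinGW only)."""
    env = dict(os.environ, PATH=mingw_path(os.environ.get("PATH", "")))
    for cmd in build_commands():
        subprocess.run(cmd, cwd=build_dir, env=env, check=True)


# Show what would be run, without running it.
for cmd in build_commands():
    print(" ".join(cmd))
```

Calling run_build(r"C:\build\openbabel") would then configure and build in that (hypothetical) folder, which has to sit next to the openbabel-2.3.2 source folder for the relative path to work.
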
Hmmm...I wonder if it's as fast as the MSVC-compiled version we distribute?