Wednesday, 22 August 2012

Transforming molecules into...well...other molecules

It's fairly straightforward to filter structures by SMARTS patterns using Open Babel, but what if you want to transform all occurrences of a particular substructure into something else? This may be useful for structure cleanup, tautomer normalisation, or even to carry out a reaction in silico.

Anyhoo, that's enough background. Let's suppose we want to hydrogenate all instances of C=C and C=O. Just copy and paste the following into the end of plugindefines.txt:
hydrogenate      # ID used for commandline option
*                # Asterisk means "no datafile specified"
Hydrogenate C=C and C=O double bonds
TRANSFORM [C:1]=[C:2] >> [C:1][C:2]
TRANSFORM [C:1]=[O:2] >> [C:1][O:2]
This gives a new obabel option, --hydrogenate, that will do the job:
C:\Tools\tmp>obabel -L ops
hydrogenate    Hydrogenate C=C and C=O double bonds
C:\Tools\tmp>obabel -:C=C -osmi --hydrogenate
C:\Tools\tmp>obabel -:O=CSC=CC=S -osmi --hydrogenate
(1) This works best in the latest SVN as Chris sorted out some longstanding bugs.
(2) I've just enabled this in Python where it works as follows:
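import pybel  # with Open Babel 3.x, use: from openbabel import pybel
transform = pybel.ob.OBChemTsfm()
# Init takes the reactant and product SMARTS patterns
success = transform.Init("[C:1]=[C:2]", "[C:1][C:2]")
assert success
mol = pybel.readstring("smi", "C=C")
# Apply the transformation in place to the underlying OBMol
transform.Apply(mol.OBMol)
assert mol.write("smi").rstrip() == "CC"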
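The TRANSFORM lines in the plugindefines.txt entry and the two arguments to Init in the Python example carry the same information: a reactant pattern and a product pattern. Here is a rough sketch of how the two could be tied together. The helper names parse_transforms and apply_transforms are made up for illustration and are not part of Open Babel, and the parsing is a guess at the layout shown above, not a copy of what Open Babel does internally.

```python
# The plugindefines.txt entry from above, as a string.
ENTRY = """\
hydrogenate      # ID used for commandline option
*                # Asterisk means "no datafile specified"
Hydrogenate C=C and C=O double bonds
TRANSFORM [C:1]=[C:2] >> [C:1][C:2]
TRANSFORM [C:1]=[O:2] >> [C:1][O:2]
"""


def parse_transforms(entry_text):
    """Hypothetical helper: collect (reactant, product) pattern pairs
    from the lines that start with the keyword TRANSFORM."""
    pairs = []
    for line in entry_text.splitlines():
        words = line.split(None, 1)
        if len(words) == 2 and words[0] == "TRANSFORM":
            reactant, product = words[1].split(">>")
            pairs.append((reactant.strip(), product.strip()))
    return pairs


def apply_transforms(smiles, pairs):
    """Hypothetical helper: apply each pair in turn to one molecule.
    Needs the Open Babel Python bindings to be installed."""
    try:
        from openbabel import pybel  # Open Babel 3.x
    except ImportError:
        import pybel  # Open Babel 2.x
    mol = pybel.readstring("smi", smiles)
    for reactant, product in pairs:
        transform = pybel.ob.OBChemTsfm()
        if not transform.Init(reactant, product):
            raise ValueError("Could not initialise %s >> %s" % (reactant, product))
        transform.Apply(mol.OBMol)  # modifies the molecule in place
    return mol.write("smi").rstrip()


PAIRS = parse_transforms(ENTRY)
print(PAIRS)
```

With the bindings installed, apply_transforms("O=CSC=CC=S", PAIRS) should run the same two transformations as the --hydrogenate option on the command line.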


Geoff said...

It's funny that you posted this today, since I ran into Evan Bolton (of PubChem) who said that in the next few months, they're going to release a set of normalization transforms.

Egon Willighagen said...

What is the relevance of this? That is, why make this an OB option? Is this very common then? Or is this more to just demo the plugin framework?

baoilleach said...

I haven't added this option to the codebase. The example is just a demo, not so much of the plugin framework, but of how to carry out a transformation with OB. There's no other way (beyond using the API, as shown in the Python example).

Chris Morley said...

OpenBabel has lacked good support for reactions and transformations for a long time. (I actually started taking an interest in OB nine years ago because I had a need for them.) It is true that the feature demonstrated here by Noel is a bit obscure and I have been playing with more obvious ways for doing transformations (SMIRKS). The lack of a working '.' functionality in the SMARTS parser is a barrier proving difficult to put right. A 2D depiction of a reaction (and sets of reactions) is also needed and illustrates a common problem - it would have been better to design it in from the beginning.