Saturday, 25 February 2012

Portrait of the molecule as a green substructure

I've already mentioned using Open Babel for depicting with SVG. That's a handy way to view a large set of molecules; you can zoom in and out and so forth. Let's look at some more of the features, as I've just been adding them to PNG depiction from SVG.

Let's start with the following basic depiction. Note that information on the SVG output options (e.g. -xC) is available in the docs or via "obabel -H svg", and that you can use the mouse to zoom in, etc.
obabel dataset.sdf -O output1.svg -xC

With some magic, we can convert carboxylic acid groups (and anything else listed in the user-editable superatom.txt) into COOH in the depiction. Let's add thick lines too:
obabel dataset.sdf -O output2.svg -xC --genalias -xA -xt

We can also do some fun stuff with descriptors (see "obabel -L descriptors" for a list). Let's sort by molecular weight and replace the title with the molecular formula and molecular weight:
obabel dataset.sdf -O output3.svg -xC --sort MW \
  --title "" --append "formula MW"

You might have noticed that all of the molecules have a substructure in common. Let's highlight some of this in green, and get rid of the other colours:
obabel dataset.sdf -O output4.svg -xC \
  -xu -s "[#6]~2~[#6]NCCN=C~2 green"

And finally, if the molecules are related, it can be useful to align the depictions using a substructure in order to identify similarities and differences (this has been improved in the development version):
obabel dataset.sdf -O output5.svg -xC \
  -xu -s "[#6]~2~[#6]NCCN=C~2 green" --align

What other depiction features would you find useful?

Thursday, 23 February 2012

On reflection, transforming molecules is tricky

I've just spent quite some time trying to get the mirror-image of a chiral ligand in a metal-ligand complex. It took me a good while to figure out how to do it, so here it is for the record.

The solution uses a combination of Open Babel's API and Numpy. Open Babel can calculate the required transformation matrix given the normal to the mirror plane. To get the normal, we just need three points in the plane (not all on one line), from which we can derive two vectors that lie in the plane (it doesn't matter which two). The cross product of the two vectors is a vector that is orthogonal to both (that's a property of cross products), and thus is a normal to the plane.

It should be possible to do all of the maths with Open Babel, but whatever way the matrix3x3 and vector3 classes are implemented, it doesn't translate into Python (at least not without segfaults). So that explains why I've had to convert to Numpy arrays and do the maths there.
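
For the record, here is a minimal sketch of the Numpy side of the calculation. It is not the exact code I used: it assumes the atom coordinates have already been pulled out of the molecule into an N x 3 array, and it builds the reflection matrix directly in Numpy (as I - 2nn^T for a unit normal n) rather than asking Open Babel for it. The function names plane_normal and mirror_coords are made up for this illustration.

```python
import numpy as np


def plane_normal(p1, p2, p3):
    """Return the unit normal to the plane through three points."""
    p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p1, p2, p3))
    # Two vectors lying in the plane; their cross product is
    # orthogonal to both, and hence normal to the plane.
    normal = np.cross(p2 - p1, p3 - p1)
    length = np.linalg.norm(normal)
    if length == 0.0:
        raise ValueError("the three points are collinear")
    return normal / length


def mirror_coords(coords, p1, p2, p3):
    """Reflect an N x 3 array of coordinates through the plane
    defined by the points p1, p2 and p3 (hypothetical helper)."""
    coords = np.asarray(coords, dtype=float)
    origin = np.asarray(p1, dtype=float)
    n = plane_normal(p1, p2, p3)
    # Reflection through a plane that passes through the origin
    # and has unit normal n: R = I - 2 n n^T
    reflection = np.eye(3) - 2.0 * np.outer(n, n)
    # Shift so the plane passes through the origin, reflect, shift back.
    # R is symmetric, so multiplying row vectors on the right is fine.
    return (coords - origin) @ reflection + origin
```

As a sanity check, mirroring the point (1, 2, 3) through the xy-plane, using (0, 0, 0), (1, 0, 0) and (0, 1, 0) as the three points, should just flip the sign of z. Writing the new coordinates back into the molecule is left out here, since that is the part that depends on the Open Babel bindings.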

Wednesday, 15 February 2012

Towards a Novel Future

The recent JCAMD issue celebrating 25 years of that journal is full of articles speculating on the future of the field over the coming 25 years (note that these are available free online for 3 months and 6 of the 32 are Open Access). For me, as a science fiction fan, one particular article stands out: Alpha Shock by Mark Murcko and Pat Walters, which imagines a future where drug design will have virtually (le mot juste, eh?) eliminated those pesky experiments.

Science fiction articles don't appear very often in scientific journals (the double entendre aside), so I was intrigued by how the reviewers handled this. Through various nefarious and downright dastardly means, I managed to get a peek at the reviewers' reports and here are some highlights:

"A real Page-Downer of a PDF. I was gripped from the Introduction right through to the Conclusion."
- Reviewer 1
"8 out of 10. A story of great entertainment in the field."
- Reviewer 2
"P2, paragraph 3: Several misspellings. In the future, are there no spell checkers?"
- Reviewer 3
"Rattles along at a rollicking pace. And that's just the abstract!"
- Reviewer 1
"I laughed. I cried. Then I read the article."
- Reviewer 2
"Publish with minor revisions to add tension."
- Reviewer 3

Can anyone think of other examples where fiction has been included in a chemistry article?

Friday, 10 February 2012

Resources for Computational Drug Discovery

As mentioned by John Overington, the Joint EMBL-EBI/Wellcome Trust Course: Resources for Computational Drug Discovery on 2-5 July (see links here and here) is now open for registration. I'll be there as one of the trainers.

From previous experience of similar meetings at the EBI, if the topics covered are relevant to any problems you want to work on, I recommend going. It's very much hands-on; you will be introduced to the theory, then you will put it into practice. From the preliminary agenda (halfway down on this page), it seems that you will have the chance to apply the methods to your own problems of interest with help from the trainers. Also the whole meeting is quite small (in a good way) so you'll get to know the trainers and the other participants (notice my skillful avoidance of the word networking :-).

If you're going to be there and you read this blog, don't forget to say hi!

Friday, 3 February 2012

LICSS hits the press, and an update on the OB/BO papers

I've written here before about how Kevin Lawson (of Syngenta) has developed a way to incorporate chemistry into Excel using only freely-available software, namely the CDK and JChemPaint (and also now OPSIN it seems). This system is called LICSS, and the corresponding paper has just appeared in Journal of Cheminformatics where it has been highlighted as an Editor's Pick.

So go check it out. I'm particularly interested in the use of this software in an academic teaching setting. It would seem to be ideal for introducing students to cheminformatics.

While on the subject of papers in J. Cheminf., I've been keeping an eye on accesses and citations of the Open Babel and Blue Obelisk papers since publication in October of last year (see also my earlier post on the topic).

Both have remained in the top 10 most accessed papers in the last 30 days (now at positions 4 and 6 for OB and BO respectively). In terms of accesses over the last year, the OB paper is now at position 5 (BO at 23) behind Peter Ertl and Ansgar Schuffenhauer, Mikhail Elyashberg et al (including Tony Williams), Peter Ertl again, and Matthias Samwald et al (including Egon Willighagen) at #1. In terms of all-time accesses, there's still some way to go for OB (now at 24) and BO (now at 46).

Keeping an eye on accesses is fun, but do they translate into the traditional academic coin of citations? Well, the Open Babel paper has already been cited four times, although the Blue Obelisk paper still has only the initial citation from the corresponding editorial (early days yet though).