Thursday, 20 October 2011

Open Babel 2.3.1 released

Following close on the heels of the Open Babel paper comes the release of Open Babel 2.3.1.

As announced by Geoff on the mailing list:
I am very happy to finally announce the release of Open Babel 2.3.1, a major update release of the open source chemistry toolbox.

Open Babel has been downloaded nearly 200,000 times and is used in over 45 projects and over 400 publications.
http://www.jcheminf.com/content/3/1/33

This release represents a major bug-fix-release and should be a stable upgrade, strongly recommended for all users of Open Babel. Many bugs and enhancements have been added in the last year since the 2.3.0 release.

What's new? See the full release notes

See the new user guide.

See the updated developer documentation.

To download, see:
http://sourceforge.net/projects/openbabel/files/

For more information, see the project website.

I'd particularly like to thank Chris Morley, Noel O'Boyle, and many others who put large amounts of time testing and improving this release.

This is a community project and we couldn't have made this release without you. Many thanks to all the contributors to Open Babel including those of you who submitted feedback, bug reports, and code.

Cheers,
-Geoff

---
Prof. Geoffrey Hutchison
Department of Chemistry
University of Pittsburgh
email: geoffh@pitt.edu
web: http://hutchison.chem.pitt.edu/

Monday, 17 October 2011

The Blue Obelisk - An update after 5 years

The Blue Obelisk group was established at the Spring ACS Meeting in 2005. Following on from the presentation/poster I gave at PMR's Symposium in January, I put together an overview of the activities of the group over the past 5 years with help from Rajarshi Guha, Egon Willighagen and Peter Murray-Rust, and further contributions on particular projects from many more.

This paper has just appeared in Journal of Cheminformatics as part of the PMR Symposium themed issue:
Open Data, Open Source and Open Standards in chemistry: The Blue Obelisk five years on. Noel M O'Boyle, Rajarshi Guha, Egon L Willighagen, Samuel E Adams, Jonathan Alvarsson, Jean-Claude Bradley, Igor V Filippov, Robert M Hanson, Marcus D Hanwell, Geoffrey R Hutchison, Craig A James, Nina Jeliazkova, Andrew SID Lang, Karol M Langner, David C Lonie, Daniel M Lowe, Jerome Pansanel, Dmitry Pavlov, Ola Spjuth, Christoph Steinbeck, Adam L Tenderholt, Kevin J Theisen, Peter Murray-Rust.
Journal of Cheminformatics 2011, 3:37.

Here's the abstract:
Background

The Blue Obelisk movement was established in 2005 as a response to the lack of Open Data, Open Standards and Open Source (ODOSOS) in chemistry. It aims to make it easier to carry out chemistry research by promoting interoperability between chemistry software, encouraging cooperation between Open Source developers, and developing community resources and Open Standards.

Results
This contribution looks back on the work carried out by the Blue Obelisk in the past 5 years and surveys progress and remaining challenges in the areas of Open Data, Open Standards, and Open Source in chemistry.

Conclusions
We show that the Blue Obelisk has been very successful in bringing together researchers and developers with common interests in ODOSOS, leading to development of many useful resources freely available to the chemistry community.

Check out the other papers in the themed series over at Journal of Cheminformatics.

Thursday, 13 October 2011

Recognise this? Roundtripping chemical images

With the imminent release of Open Babel 2.3.1, I thought I'd come up with some examples of how to use a new feature: PNG depiction.

To generate a PNG with Open Babel you just use the PNG output format:
obabel -:CC(=O)Cl -O tmp.png

Open Babel actually allows you to embed the chemical structure (in any format) directly into a new or existing PNG file. If you do this, then you can roundtrip as follows:
> obabel -:CC(=O)Cl -O tmp.png -xO smi
> obabel tmp.png -osmi
CC(=O)Cl
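
If you're curious where the embedded structure ends up, here is a minimal sketch, using only the Python standard library, that lists the text chunks in a PNG file. It assumes that the embedded data sits in a tEXt chunk whose keyword is the format ID (I haven't shown that above, so treat it as an assumption), and the helper names and the tiny demo image are made up for illustration:

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialise one PNG chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def read_text_chunks(png_bytes: bytes) -> dict:
    """Return {keyword: text} for every tEXt chunk in a PNG byte string."""
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    texts = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(png_bytes):
        length, chunk_type = struct.unpack(">I4s", png_bytes[pos:pos + 8])
        data = png_bytes[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            # A tEXt payload is: keyword, a NUL separator, then the text.
            keyword, _, text = data.partition(b"\x00")
            texts[keyword.decode("latin-1")] = text.decode("latin-1")
        if chunk_type == b"IEND":
            break
        pos += 12 + length  # 4 (length) + 4 (type) + data + 4 (CRC)
    return texts


def demo_png(keyword: str, text: str) -> bytes:
    """Build a 1x1 greyscale PNG carrying a single tEXt chunk (demo data only)."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")  # filter byte + one black pixel
    payload = keyword.encode("latin-1") + b"\x00" + text.encode("latin-1")
    return (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr) + make_chunk(b"tEXt", payload)
            + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))


if __name__ == "__main__":
    # The keyword "smi" is an assumed example, not read from a real Open Babel file.
    print(read_text_chunks(demo_png("smi", "CC(=O)Cl")))
```

To look inside a real file, pass its bytes instead, e.g. read_text_chunks(open("tmp.png", "rb").read()).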

If you haven't embedded a chemical structure in the image, you'll have to use optical chemical structure recognition software such as the open source OSRA (Igor Filippov) or Imago (GGA Software). Both of these can output a MOL/SDF file, which contains the 2D coordinates of the perceived structure, and this can be depicted. I did this for a set of 450 images from the Japanese Patent Office as follows:
> for %a in (*.tif) do "C:\Program Files (x86)\osra\1.3.8\osra.bat" %a --format sdf | obabel -isdf -O %~na_osra.png -d
> for %a in (*_chem.png) do "C:\Program Files\GGA Software\Imago Toolkit\alter_ego.exe" %a -o tmp.mol -q && obabel tmp.mol -O %~na_imago.png -d

The results are here: Subsets 1, 2 and 3.
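
If you're not on Windows, the OSRA loop above could be sketched in Python along the following lines. It reuses only the flags from the command above; the assumption that an executable called osra is on the PATH, and the function names themselves, are mine:

```python
import subprocess
from pathlib import Path


def osra_depiction_commands(image: Path, osra: str = "osra"):
    """Return the (osra, obabel) argument lists for one image.

    Mirrors: osra IMAGE --format sdf | obabel -isdf -O STEM_osra.png -d
    """
    out_png = image.with_name(image.stem + "_osra.png")
    osra_cmd = [osra, str(image), "--format", "sdf"]
    obabel_cmd = ["obabel", "-isdf", "-O", str(out_png), "-d"]
    return osra_cmd, obabel_cmd


def depict_with_osra(image: Path, osra: str = "osra") -> None:
    """Run OSRA on one image and pipe its SDF output into obabel."""
    osra_cmd, obabel_cmd = osra_depiction_commands(image, osra)
    sdf = subprocess.run(osra_cmd, check=True, capture_output=True).stdout
    subprocess.run(obabel_cmd, input=sdf, check=True)


if __name__ == "__main__":
    # Process every TIFF in the current directory, as the "for" loop does.
    for tif in sorted(Path(".").glob("*.tif")):
        depict_with_osra(tif)
```

Keeping the command construction separate from the execution makes it easy to print and check the commands before running them on a few hundred images.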

Notes:
1. Open Babel depiction for large molecules needs to be fixed, as the lines get faint and disappear in some cases. [Update (26/03/2012): Now fixed]
2. The TIFF files needed to be converted to PNGs for Imago (I used a "for" loop with ImageMagick's convert).
3. In the case of multiple molecules in the OSRA output, only the first molecule is depicted (I think).
4. Several structures gave error messages when depicting the Imago output due to unrecognised labels. I think there's a way around this but I didn't look into it.

Friday, 7 October 2011

Open Babel paper published in Journal of Cheminformatics


After almost 10 years as an independent project, the paper is finally here...

Open Babel: An open chemical toolbox. N. M. O'Boyle, M. Banck, C. A. James, C. Morley, T. Vandermeersch and G. R. Hutchison. Journal of Cheminformatics 2011, 3:33.

Here's the abstract:
Background
A frequent problem in computational modeling is the interconversion of chemical structures between different formats. While standard interchange formats exist (for example, Chemical Markup Language) and de facto standards have arisen (for example, SMILES format), the need to interconvert formats is a continuing problem due to the multitude of different application areas for chemistry data, differences in the data stored by different formats (0D versus 3D, for example), and competition between software along with a lack of vendor-neutral formats.

Results
We discuss, for the first time, Open Babel, an open-source chemical toolbox that speaks the many languages of chemical data. Open Babel version 2.3 interconverts over 110 formats. The need to represent such a wide variety of chemical and molecular data requires a library that implements a wide range of cheminformatics algorithms, from partial charge assignment and aromaticity detection, to bond order perception and canonicalization. We detail the implementation of Open Babel, describe key advances in the 2.3 release, and outline a variety of uses both in terms of software products and scientific research, including applications far beyond simple format interconversion.

Conclusions
Open Babel presents a solution to the proliferation of multiple chemical file formats. In addition, it provides a variety of useful utilities from conformer searching and 2D depiction, to filtering, batch conversion, and substructure and similarity searching. For developers, it can be used as a programming library to handle chemical data in areas such as organic chemistry, drug design, materials science, and computational chemistry. It is freely available under an open-source license from http://openbabel.org.