Monday, 13 December 2010

Go dig in Indigo

About a year ago, SciTouch LLC announced the release of the open source cheminformatics library Indigo. At that time, the full API of the library was not exposed to the user. Instead, a set of simplified APIs was available along with a number of command-line applications.

Just a few weeks ago at Goslar, Dmitry and Mikhail of Indigo made a couple of announcements. First, the library, documentation, etc. are no longer hosted at SciTouch, but at GGA Software. Second, the full API is now being made available. It's actually a C library, but wrappers are available for C#, Java and Python. If you're interested in accessing it from Python, follow the Download link to the Python API.

The Indigo library is currently at the version 1.0 Beta 3 stage, but new API functions are being added on request over at the Indigo mailing list. So if there's something missing that you'd like to have, get over there now and ask about it.

Naturally, I'm interested in adding support for Indigo to Cinfony. I've already pretty much done the Python bindings. Just put into the same directory as and away you go. As usual, let me know if you find any bugs.


Igor said...

Noel, I'd like to point out that they also have a free/open source chemical image recognition program Imago. I would be very interested to hear your thoughts about it (I am of course biased towards OSRA myself :)
I have urged SciTouch to run it against one of the two freely available validation sets (one containing images from the USPTO, the other from the Japanese patent office), but to the best of my knowledge this hasn't happened yet.

Noel O'Boyle said...

Thanks Igor - I didn't realise. I see some discussion about this on the mailing list a few months ago now.

For sure that would be interesting. Even better would be to see if they have different strengths and weaknesses. If so, it might be possible for each to borrow some ideas from the other.

Dmitry Pavlov said...

Hello Noel,

Thank you for the update. We are very much interested in Indigo-Cinfony integration, and please do not hesitate to write us if anything else is missing.

Minor point: actually, the indigo-dev group is more appropriate for requesting new features (just as indigo-bugs is for bug reports), but of course we read all three groups anyway.

As for Imago, sure we will run it against the validation sets, but currently we are missing the superatom label extraction procedure, and so we cannot properly validate the obtained results. I believe we will be able to extract the labels and do the validation before Spring 2011.

Also, this month we are going to release a major update of Imago with a new interface.

Best regards,