Friday, 20 November 2009

Chemical Identifier Resolver + TwirlyMol = Easily add molecules to a webpage

Markus Sitzmann of the NCI/CADD team has been busy. He has combined the Chemical Identifier Resolver with TwirlyMol to enable you to convert any chemical identifier to a 3D model that you can interact with in your webpage. I'm very excited about this, as I think people will find it very useful.

Just put this in your webpage or blog post (note, however, that the Blogger preview does not show the TwirlyMol):
<div id="DIVNAME" height="200" width="200"></div>
<script src="http://cactus.nci.nih.gov/chemical/structure/CHEMICAL_IDENTIFIER/twirl?div_id=DIVNAME" type="text/javascript"></script>
Replace DIVNAME with a unique name, and replace CHEMICAL_IDENTIFIER with any of the chemical identifiers accepted by the Chemical Identifier Resolver; for example, a common name for a chemical, an InChI, or a SMILES string. More details over at the /Chemical/Structure blog. For now, let's just see it in action.

Replacing DIVNAME with 'buckyball' and CHEMICAL_IDENTIFIER with 'buckminsterfullerene' gives the following (go on, give it a twirl! - right mouse button to zoom in):

That was too easy - let's take one of Henry Rzepa's crazy Mobius aromatic molecules. Steven Bachrach has reviewed some of them and has very thoughtfully included the InChIs. Replacing DIVNAME with 'crazymolecule' and CHEMICAL_IDENTIFIER with 'InChI=1/C14H14/c1-2-6-10-14-12-8-4-3-7-11-13(14)9-5-1/h1-14H/b2-1-,4-3-,9-5-,10-6-,11-7-,12-8-/t13-,14+' we have:

3D NanoPutians, anyone? Here's the SMILES for a NanoKid: "c1(C2OCCO2)c(C#CC(C)(C)C)cc(C#Cc2cc(C#CCCC)cc(C#CCCC)c2)c(C#CC(C)(C)C)c1". Happy twirling.
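The substitution described above is easy to script. The following is a minimal Python 3 sketch, not part of the resolver or TwirlyMol: the helper name `twirl_embed` and its `size` parameter are made up for illustration. It percent-encodes the identifier, on the assumption that the resolver decodes escaped characters in the usual way, because a raw `#` (as in the triple bonds of the NanoKid SMILES) would otherwise be read as the start of a URL fragment.

```python
from html import escape
from urllib.parse import quote

# Base URL taken from the snippet in the post.
RESOLVER_BASE = "http://cactus.nci.nih.gov/chemical/structure"


def twirl_embed(identifier, div_name, size=200):
    """Return the div + script HTML for one molecule (hypothetical helper)."""
    # Keep "/" unescaped so InChI layers stay readable; escape everything
    # else that is not URL-safe, such as "#", "(", ")" and "=".
    encoded_id = quote(identifier, safe="/")
    src = "%s/%s/twirl?div_id=%s" % (
        RESOLVER_BASE, encoded_id, quote(div_name, safe=""))
    return (
        '<div id="%s" height="%d" width="%d"></div>\n'
        '<script src="%s" type="text/javascript"></script>'
        % (escape(div_name, quote=True), size, size, escape(src, quote=True))
    )


# Example: the buckyball from the post.
print(twirl_embed("buckminsterfullerene", "buckyball"))
```

Each call needs its own `div_name`, since the script looks up the div by that id.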

Wednesday, 18 November 2009

My beaker overfloweth - New chemistry Q&A sites

Stackoverflow is one of the best Question & Answer websites for computer programming. It uses a carefully designed social model to build a community where people compete to give the best answer to questions in order to be rewarded with a better response to their own questions.

Recently, the people behind Stackoverflow have opened up the software to allow people to set up their own websites...but just for a beta period (money will then be required). Several chemistry 'stackoverflows' have already been set up. Here are a few I've heard about:

BlueObelisk: Questions about cheminformatics and computational chemistry leaning towards the open source or open data side of things. Update (07/10/2010): This website has moved to Shapado.
Chempedia Lab: Questions about experimental chemistry.
Chemistry: General chemistry (?)

These sites are all new so you won't find many questions there already. But give them a go. Go there and ask a question or two (even if you already know the answer), answer a question or two, and check back in a day to see what happens. You can log in with your Gmail address (among others) but do note that questions are not anonymous.

Such websites require a community. Some will gain such a community and flourish; others won't, and will fail. In the meantime, go get some answers.

Image credit: Question Everything (Nullius in verba) Take nobody's word for it by Duncan Hull (CC BY 2.0)

Monday, 16 November 2009

Cheminformatics Tutorial using Python and Silverlight

Recently I introduced Webel, a Python cheminformatics module that runs entirely on web services. One of the advantages of such a module is that it can be used in places where it is difficult to install a traditional cheminformatics toolkit. Like in your browser.

It turns out that Silverlight ("Microsoft's Flash") provides a Python interpreter that runs in your browser. Using this, Michael Foord (of IronPython in Action) has developed an interactive Python interpreter which you can use at trypython.org. It consists of two windows, one with a Python tutorial and the other with a Python prompt so that you can work through the tutorial.

After a little work, I present Try Python...with Cheminformatics. This adds Webel as well as a short tutorial that introduces many of its features. With a few more tutorials that cover SMILES, InChI and so on in more detail, this could be useful for teaching purposes as well as bridging the gap to having students develop their own Python scripts that use the CDK, OpenBabel or the RDKit.

Here is the obligatory screenshot (click for a larger version):

Tuesday, 10 November 2009

In memory of Warren DeLano

Many of you will have heard the sad news about Warren. He passed away suddenly on November 3rd. His contribution to science through the development of PyMol is known to all of us.

I only met him once, but I was struck both by his enthusiasm for new ideas and his belief in open source software. He believed that such software was an enabler of science and its development should be supported.

His family are collecting memories and photographs of Warren, and have set up a fund in his memory to support achievements in Open Source scientific software:
"Through PyMol and Open Source software, Warren DeLano exhibited the genius and generosity of science at its best. But he still had so much to give. In memory of his passing, we are creating a foundation to ensure that achievements in the field of Open Source scientific software are encouraged, recognized and rewarded. Warren was committed to the development of Open Source programs and how they would benefit humanity by allowing science to flourish in a collaborative environment. Your contribution will help us keep Warren's commitment alive."

Ní bheidh a leithéid arís ann. (We will not see his like again.)

Thursday, 5 November 2009

Introducing Webel - A cheminformatics toolkit built solely on webservices

I'd like to introduce a new Cinfony module, Webel. Like the other components of Cinfony, Webel implements a standard API (see for example, the Pybel API) that covers a large proportion of common cheminformatics operations including reading/writing SMILES strings and InChIs, calculation of molecular weight and formula, molecular fingerprints, SMARTS searching, and descriptor calculation.

However, unlike the other components, Webel runs entirely off web services. All cheminformatics analysis is carried out using Rajarshi's REST services (which use the CDK and are hosted at Uppsala) and the NIH's Chemical Identifier Resolver (by Markus Sitzmann, and which uses Cactvs for much of its backend).

To use Webel, all you need to do is download webel.py, and type "import webel" at a Python prompt (see example code below - it's basically the same as using Pybel if you're familiar with that).

So what are the advantages of running off web services? First, as should be clear, there is the ease of installation. This means that Webel could easily be bundled with other software to provide some useful functionality. Second, Webel can still be used in environments where installation of a cheminformatics toolkit is simply not possible (more on this next week!). Third, web services may provide additional functionality not available elsewhere (e.g. the Chemical Identifier Resolver provides name-to-structure conversion as well as InChIKey resolution). Fourth, web services are accessed across HTTP rather than through some type of language binding. As a result, Webel works equally well from CPython, Jython or IronPython. And finally, it's just a cool idea. :-)

If you can think of any other advantages or potential applications, I'd be interested to hear them. In the meantime, here's some code that calculates the molecular weight of aspirin, its InChI and its LogP, lists alternate names for it, and creates the PNG above:

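import webel

# Look up the structure by name (resolved through the web services)
mol = webel.readstring("name", "aspirin")
print "The molecular weight is %.1f" % mol.molwt
print "The InChI is %s" % mol.write("inchi")
# calcdesc returns a dictionary of descriptor names and values
print "LogP values are: %s" % mol.calcdesc(["ALOGPDescriptor"])
print "Aspirin is also known as: %s" % mol.write("names")
# Write a 2D depiction to disk without opening a viewer
mol.draw(filename="aspirin.png", show=False)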
...which gives...
C:\Tools\cinfony\trunk\cinfony>python example.py
The molecular weight is 180.2
The InChI is InChI=1/C9H8O4/c1-6(10)13-8-5-3-2-4-7(8)9(11)12/h2-5H,1H3,(H,11,12)
/f/h11H AuxInfo=1/1/N:5,3,4,1,2,12,6,7,11,9,8,10,13/E:(11,12)/F:5,3,4,1,2,12,6,7
,11,9,10,8,13/rA:21CCCCCCCOOOCCOHHHHHHHH/rB:;a1;a2a3;;a1;a2a6;;;;s6d8s10;s5d9;s7
s12;s10;s1;s2;s3;s4;s5;s5;s5;/rC:6.3301,-.56,0;4.5981,-1.56,0;6.3301,-1.56,0;5.4
641,-2.06,0;2,-.06,0;5.4641,-.06,0;4.5981,-.56,0;4.5981,1.44,0;2.866,-1.56,0;6.3
301,1.44,0;5.4641,.94,0;2.866,-.56,0;3.7321,-.06,0;6.3301,2.06,0;6.8671,-.25,0;4
.0611,-1.87,0;6.8671,-1.87,0;5.4641,-2.68,0;2.31,.4769,0;1.4631,.25,0;1.69,-.596
9,0;
LogP values are: {'ALOGPDescriptor_ALogp2': 0.10304100000000004, 'ALOGPDescripto
r_AMR': 18.935400000000001}
Aspirin is also known as: ['2-Acetoxybenzoic acid', '50-78-2', '2-Acetoxybenzene
carboxylic acid', 'Acetylsalicylate', 'Acetylsalicylic acid', 'Aspirin', ...
'Claradin', 'Clariprin', 'Colfarit', 'Decaten', 'Dolean pH 8', ...
'Acetylsalicylsaure [German]', 'Acide acetylsalicylique [French]', ...
'A6810_SIGMA', 'Spectrum5_000740', 'CHEBI:15365',...]
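Because everything goes over HTTP, the idea behind Webel can be illustrated without Webel itself. The sketch below (Python 3, unlike the Python 2 example above) is not Webel's code: the function names `resolver_url` and `resolve` are hypothetical, and it assumes that the Chemical Identifier Resolver answers URLs of the form `/chemical/structure/<identifier>/<representation>`, with representation names such as `mw`, `stdinchi` and `names`. The `fetch` argument is there only so that the function can be tried without a network connection.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the NIH Chemical Identifier Resolver.
BASE = "http://cactus.nci.nih.gov/chemical/structure"


def resolver_url(identifier, representation):
    """Build the URL for one identifier/representation pair."""
    return "%s/%s/%s" % (
        BASE, quote(identifier, safe="/"), quote(representation, safe=""))


def _http_get(url):
    """Fetch a URL and return the response body as text."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


def resolve(identifier, representation, fetch=_http_get):
    """Return the non-empty lines of the resolver's plain-text answer."""
    text = fetch(resolver_url(identifier, representation))
    return [line.strip() for line in text.splitlines() if line.strip()]


# Usage (needs network access):
#   resolve("aspirin", "mw")
#   resolve("aspirin", "names")
```

No language binding is involved, which is why the same approach works from any Python implementation that can open a URL.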

Wednesday, 4 November 2009

In I go with Indigo, the new open source cheminformatics toolkit

SciTouch LLC have just announced the release of a dual-licensed (GPL or commercial) cheminformatics toolkit, Indigo. See Depth-First and Rajarshi for some initial reactions.

It's a C++ toolkit, and right now what seems to be available are several .NET wrappers that enable specific uses as well as an Oracle cartridge. Access from Python, etc. is on the to-do list, and hopefully this will also give access to the core Molecule object so that all aspects of the toolkit will be available.

Charlie Zhu has already written an example application using C#. Rather than wait for CPython bindings, I installed IronPython and used it to access Indigo's .NET libraries (Dingo, in this case) to do a SMILES-to-PNG conversion:
C:\Tools\Indigo\dingonet-1.0-3669>"C:\Program Files\IronPython 2.6\ipy.exe"
IronPython 2.6 (2.6.10920.0) on .NET 2.0.50727.3603
Type "help", "copyright", "credits" or "license" for more information.
>>> import clr
>>> clr.AddReference("dingonet")
>>> import indigo
>>> dir(indigo)
['Dingo', 'DingoException']
>>> dingo = indigo.Dingo()
>>> dir(dingo)
['Dispose', 'Equals', ......, 'getResult', 'isEmpty', 'loadMolecule', 'loadMolec
uleFromFile', 'loadReaction', 'loadReactionFromFile', 'render', 'renderToBitmap'
, 'renderToMetafile', 'setAAMColor', 'setBackgroundColor', 'setBondLength', 'set
Coloring', 'setHighlightBold', 'setHighlightColor', 'setImageSize', 'setImplicit
HydrogenMode', 'setLabelMode', 'setLoadHighlighting', 'setLogPath', 'setMarginFa
ctor', 'setOutputFile', 'setOutputFormat', 'setOutputHDC', 'setOutputPrintingHDC
', 'setRelativeThickness', 'setStereoOldStyle']
>>> dingo.loadMolecule("CC(=O)Cl")
>>> dingo.setOutputFile("test.png")
>>> dingo.setOutputFormat("png")
>>> dingo.render()
>>> ^Z
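The same steps can be collected into a function for use in a script. This is a sketch rather than a tested program: it uses only the Dingo methods that appear in the session above (`loadMolecule`, `setOutputFile`, `setOutputFormat`, `render`), the function names `make_dingo` and `smiles_to_image` are hypothetical, and `make_dingo` will only work under IronPython with `dingonet` available.

```python
def make_dingo():
    """Create a Dingo renderer (IronPython only; assumes dingonet is found)."""
    import clr  # provided by IronPython, not by CPython
    clr.AddReference("dingonet")
    import indigo
    return indigo.Dingo()


def smiles_to_image(dingo, smiles, filename, fmt="png"):
    """Render a SMILES string to an image file, in the order used above."""
    dingo.loadMolecule(smiles)
    dingo.setOutputFile(filename)
    dingo.setOutputFormat(fmt)
    dingo.render()
    return filename


# Usage under IronPython:
#   smiles_to_image(make_dingo(), "CC(=O)Cl", "test.png")
```

Passing the renderer in as an argument keeps the .NET-specific import in one place, so the rendering steps can be exercised with a stand-in object elsewhere.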