Sunday 25 October 2009

How to correct 3D coordinates at stereocenters

Given a set of 3D coordinates for a molecule, and whether the stereochemistry at particular atoms is correct or not, how would you fix any errors?

This is a problem that I've been working on for the 3D builder in OpenBabel. Given a connection table (e.g. a SMILES string), OpenBabel builds up the structure of a molecule using some basic geometric rules as well as ring templates (SMARTS strings for rings, and associated coordinates). Afterwards, the stereochemistry is corrected where necessary.

Well, for any tetrahedral center with at least two non-ring bonds, those two bonds can be swapped to correct stereochemistry. For the special case of a spiro atom (an atom with four ring bonds which, if broken, split the molecule into three fragments), one of the rings involved can be rotated 180 degrees to correct the stereochemistry.
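
As a rough illustration of why swapping two bonds works (a toy sketch, not OpenBabel's actual code): the handedness of a tetrahedral center can be read off the sign of the signed volume spanned by its four neighbors, and exchanging the positions of any two neighbors flips that sign. The function names and the coordinates below are made up for the example.

```python
def signed_volume(a, b, c, d):
    """Return 6x the signed volume of the tetrahedron a, b, c, d.

    The sign flips whenever two of the four points are exchanged, so it can
    serve as a simple handedness test for a tetrahedral center.
    """
    ux, uy, uz = (a[i] - d[i] for i in range(3))
    vx, vy, vz = (b[i] - d[i] for i in range(3))
    wx, wy, wz = (c[i] - d[i] for i in range(3))
    return (ux * (vy * wz - vz * wy)
            - uy * (vx * wz - vz * wx)
            + uz * (vx * wy - vy * wx))


def swap_positions(coords, i, j):
    """Return a copy of coords with the positions of entries i and j exchanged."""
    swapped = list(coords)
    swapped[i], swapped[j] = swapped[j], swapped[i]
    return swapped


# Hypothetical neighbor positions around a center at the origin, listed in
# the order the neighbors appear in the connection table.
neighbors = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]

before = signed_volume(*neighbors)
after = signed_volume(*swap_positions(neighbors, 0, 1))
print(before, after)
```

In a real molecule each of the two swapped bonds carries its whole substituent, so the entire fragments would have to be moved rather than single atoms; the sign argument is the same.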

How about for a stereocenter with three ring bonds? This is typically found where two rings join along an edge, or in bridged ring systems. Well, that's a bit tricky as you can't swap bonds around. But what you can do is invert the coordinates of the entire ring system. Of course, the ring system may contain more than one stereocenter (actually, I think such a ring system is guaranteed to contain at least one other stereocenter) in which case it will not always be possible to satisfy the stereochemistry at all centers simultaneously.
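
To see why inverting a ring system cannot always satisfy every center, here is a minimal sketch under the same assumption as before (handedness taken as the sign of a signed volume; all names and coordinates are hypothetical). Inverting the coordinates flips every stereocenter at once, so it only helps when all of the centers in the ring system are wrong.

```python
def signed_volume(a, b, c, d):
    """Return 6x the signed volume of the tetrahedron a, b, c, d."""
    ux, uy, uz = (a[i] - d[i] for i in range(3))
    vx, vy, vz = (b[i] - d[i] for i in range(3))
    wx, wy, wz = (c[i] - d[i] for i in range(3))
    return (ux * (vy * wz - vz * wy)
            - uy * (vx * wz - vz * wx)
            + uz * (vx * wy - vy * wx))


def invert(points):
    """Invert every point through the centroid of the whole set."""
    n = len(points)
    centroid = tuple(sum(p[i] for p in points) / n for i in range(3))
    return [tuple(2 * centroid[i] - p[i] for i in range(3)) for p in points]


def parities(points, centers):
    """Return +1 or -1 per center; each center is four indices into points.

    Degenerate (flat) centers with zero volume are not treated specially here.
    """
    return [1 if signed_volume(*(points[i] for i in quad)) > 0 else -1
            for quad in centers]


def fixable_by_inversion(current, wanted):
    """Inversion flips all centers together, so every one must be wrong."""
    return all(c != w for c, w in zip(current, wanted))


# Toy "ring system": two tetrahedral arrangements, the second a shifted copy
# of the first with two neighbors listed in the opposite order.
points = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1),
          (6, 1, 1), (6, -1, -1), (4, 1, -1), (4, -1, 1)]
centers = [(0, 1, 2, 3), (5, 4, 6, 7)]

before = parities(points, centers)
after = parities(invert(points), centers)
print(before, after)
```

With these toy coordinates both parities change sign together, which is the crux of the problem: if only one of the two centers were wrong, inversion would fix it and break the other.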

This is as far as I've currently gotten.

The next step is to include some stereochemistry information in the ring templates themselves. That is, to include different versions of the ring templates for the various stereochemistry arrangements. This should increase the coverage of ring systems that OpenBabel can successfully handle.

Of course, there is a limit to how far one can get with ring templates, but it'll be interesting to find out where that limit is.

Image credit: nickzeff

Avogadro is 1.0 today

The 1.0 release of Avogadro has just come out as announced by Geoff, reported by Depth-First, blogged by Marcus (check out the video), and interviewed and microblogged by SourceForge.

To quote:
Avogadro is an advanced molecular editor designed for cross-platform use in computational chemistry, molecular modeling, bioinformatics, materials science, and related areas. It offers flexible rendering and a powerful plugin architecture.
Why am I interested in this? Well, firstly it's useful for comp chem, an area in which I still dabble a bit. Secondly, it's going to become more useful for cheminformatics with time (it will need to add handling for multi-molecule SDF files first). And thirdly, many new features of OpenBabel have been added to address requirements for Avogadro, such as 3D conformer generation from SMILES and forcefields, both of which I now use regularly.

Well done, and best of luck to all involved. And what better release date? 6:02 on the 23rd of the 10th.

Wednesday 21 October 2009

Really really final deadline extended to 23rd Oct for CINF symposia

It seems that all of the CINF symposia have had their final deadlines extended to this Friday, 23rd October. So it's your last chance (again) to send in an abstract to the Visual Analysis of Chemical Data symposium, or any of the other symposia listed on the CINF website. For anything that doesn't fit a specific symposium, there's General Papers (I've one in here myself). The COMP division also has several symposia of interest to cheminformaticians (I'd link to the list of symposia but their website doesn't list them).

Tuesday 13 October 2009

One week left to submit - Symposium on Visual Analysis of Chemical Data (ACS Spring 2010)

Final Call for Papers:
Visual Analysis of Chemical Data
239th ACS National Meeting
San Francisco, March 21-25, 2010
CINF Division

Update(20/Oct): Closing date now 23rd Oct.

Dear Colleagues,

The submission deadline of 23rd Oct is approaching for an upcoming symposium focusing on innovative methods for visual representation and analysis of chemical data. Just as Edward Tufte has championed maximizing clarity and information content in statistical graphics, there is a need for methods to display chemical information that will maximize understanding, and allow rapid analysis and decision making.

We invite you to submit contributions that address various aspects of visualization of chemical data (such as structures, SAR data, literature, patents) including, but not limited to, the following topics:
  • With an ever increasing pool of descriptors, along with new and more sophisticated machine learning methods, QSAR models are becoming more difficult to interpret. How can information on model reliability, the presence of activity cliffs, and the range of applicability of a model and other relevant model properties be easily depicted?
  • Recently, 3D virtual worlds such as Second Life have presented new opportunities and challenges for the representation of chemical data. What is the potential of such a medium in education and communicating with the chemistry community?
  • Social software allows for rapid and convenient sharing of chemical data. Examples include Google Spreadsheets, ManyEyes, DabbleDB, and wikis, including Wikipedia. What are the implications for chemical research and education?
  • The visualization of the contents of large chemical datasets presents particular problems. How can an overview of the dataset be visualized so that it presents both the nature of the contents as well as the degree of diversity and similarity within the dataset? How can different datasets be visually compared?
  • Depicting 3D chemical information in 2D involves a loss of information. However, innovative 2D visualization methods can restore the most relevant information.
  • Chemical information comprises a diverse array of data types including chemical structures and diagrams (2D and 3D), associated assay results, conformations, QSAR models and their predictions. The visualization and integration of all these data into a single interface that aids interpretation and analysis is a continuing challenge.

We would also like to point out that sponsorship opportunities are available.

The on-line abstract submission system (PACS) will be open for submissions until 23rd October.

Please contact Andrew, Jean-Claude or myself if you have any questions.

Yours sincerely,
Noel O'Boyle

On behalf of the symposium organizers:

Dr. Jean-Claude Bradley,
Drexel University, PA

Dr. Andrew Lang,
Oral Roberts University, OK

Dr. Noel O’Boyle,
University College Cork, Ireland

Image credit: process/rum do/radial by Henry Cooke (CC BY-SA-NC 2.0)

Thursday 8 October 2009

Browser-based chemistry is here - its name is ChemDoodle Web Components

So...what's to say? Just check out ChemDoodle Web Components. It's Javascript. It's Open Source. It's running in your browser. It's doing funky chemistry.

Don't think it's going to affect you? Hear that noise? That's a paradigm shift.

Let's chart a brief timeline of what has led up to this:
  • 1995 Nov - JavaScript (then LiveScript) first released
  • 2008 Jul - Rich surveys all prior work at the intersection of Javascript and Chemistry, and identifies where Javascript can make the most impact on the web
  • 2008 Oct - blahbleh implements a Javascript 3D molecular editor and viewer, molecools
  • 2008 Dec - Duan Lian uses GWT to translate Rich Apodaca's lightweight Java cheminformatics toolkit, MX, into Javascript (website, demo)
  • 2009 Jan - I develop a Javascript 3D molecule viewer, TwirlyMol
  • 2009 Jan-Feb - Duan Lian releases a preview of the world's first Javascript molecular editor, jsMolEditor
  • 2009 Aug - Kevin Theisen releases ChemDoodle Web Components

Sunday 4 October 2009

Keep your publication list up to date with Javascript and Google Spreadsheet

Adding a new publication to an HTML page is a fiddly business, especially if you want to add some markup or links. This might explain why there are so many websites of scientists whose last publication appears to be four years ago. If only adding a new publication were as easy as, oh, let's say, adding a row to a spreadsheet.

Well, you're in luck. The following procedure makes it as easy as just that. You can maintain the same list of publications on several web sites all of which will automatically be kept up-to-date. If you're familiar with Javascript and CSS, you can also easily change the markup used and its appearance. The result should look something like the following image:

Here's how it's done:
(1) Create a Google spreadsheet, and use the same column names as shown in this spreadsheet.
(2) Add some information on your papers. Again, see the example spreadsheet for the format (note especially the author list format).
(3) Click on Share/Publish as Web Page, and make note of the key (i.e. the text between "key=" and "&single").
(4) Download addpapers.js, edit the line 'me = "N. M. O'Boyle";', and the line with the email address, and put it in the same directory as an HTML page, papers.html (for example).
(5) Edit papers.html to load addpapers.js in its HEAD ("<script type='text/javascript' src='addpapers.js'></script>")
(6) Download publishious.css, and put it in the same directory as papers.html.
(7) Edit papers.html to apply publishious.css in its HEAD ("<link rel='stylesheet' media='all' type='text/css' href='publishious.css' />").
(8) Add the following to papers.html after replacing MYKEY by the value of the key for your spreadsheet:
<div id="paperentries"></div>
<script src=""

Hopefully that works. If it doesn't, check your browser's error console (in Firefox Tools/Error console) for some idea of the problem.

It's probably not a good idea to rely totally on Google spreadsheets, so what I do is view the generated HTML code using the Web Developer plugin and paste it into the HTML page as the content of the paperentries div. That way, even if Google spreadsheets goes down (or changes its API), a couple of papers will still appear.

Feel free to adapt this code for your own use, although I'd appreciate it if you could add a comment below with a link to the resulting webpage.