Tuesday, 18 May 2010

Caution: This post contains dangerous amounts of R

Only a volcano could have stopped me attending Rajarshi's CDK/R workshop at the EBI yesterday. And guess what? - it did.

However Rajarshi has just posted his mammoth CDK/R presentation on the web so all is not lost (thanks also to Duan Lian):

What does the pairwise RMSD of all possible conformations look like?

I have been doing some work on conformer generation and trying to think about how to measure the quality of one method over another. Since I'm a great believer in looking at the data (see also this post), I decided to calculate the pairwise RMSD of the systematically generated conformers of a molecule with 3 rotatable bonds, and then visualise these data with classical multidimensional scaling (cmdscale in R):
It turned out to be quite a nice graph. Each point here represents a different conformer. Of the three torsion angles, it seems that two were incremented in steps of 30°, while for the third Δang was 120°. Each torsion angle had a different magnitude of effect on the RMSD (or at least, this is how I interpret this graph - I haven't drilled down into the figures yet); presumably the torsion angle closer to the centre of the molecule gives rise to the three clusters of values.
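
As a rough check on how many points such a systematic search produces, here is a minimal Python sketch. It assumes each torsion starts at 0° and is stepped in equal increments up to (but not including) 360°; the step sizes are the ones inferred from the plot above, and the function name `torsion_grid` is purely illustrative, not part of OpenBabel.

```python
from itertools import product

def torsion_grid(steps_deg):
    """Return every combination of torsion angles for the given step sizes.

    steps_deg holds one step size (in degrees) per rotatable bond; each
    torsion runs over 0, step, 2*step, ... up to but excluding 360.
    """
    per_bond = [range(0, 360, step) for step in steps_deg]
    return list(product(*per_bond))

# Assumed step sizes: two torsions at 30 degrees, one at 120 degrees.
conformers = torsion_grid([30, 30, 120])
print(len(conformers))  # 12 * 12 * 3 = 432 combinations
```

Under these assumptions the grid holds 432 torsion combinations. The real search may produce fewer conformers if some combinations are discarded, for example because of clashes or symmetry.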

The points in red are the lowest energy conformers (by MMFF94). In this graph it is difficult to see what the correspondence between these conformers is, although it is clear that the overall RMSD between some of these conformers is high.

All data was generated using a locally-modified development version of OpenBabel. Here is the R code I used:
m <- read.table("tmp_dist.txt")
location <- cmdscale(m)

# Read the energies and scale them to the range 0-1
mycol <- read.table("tmp_en.txt")
max_scaled <- max(mycol)
min_scaled <- min(mycol)
scaled <- (-min_scaled + mycol) / (max_scaled - min_scaled)

# colmap <- rainbow(11)
lookup <- function(x) {
        # colmap[floor( (x + 0.15)*10) ]
        if(x < 0.1) {"red"}
        else {"black"}
}
colors = apply(scaled, 1, lookup)
plot(location, col = colors, pch=19)
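
For anyone without R to hand, here is a rough numpy sketch of the same two steps. `classical_mds` follows the textbook Torgerson procedure (square the distances, double-centre, take the top eigenvectors), which is the method cmdscale is documented to implement. The function names and the toy rectangle data are illustrative assumptions, not the code or data used for the plot.

```python
import numpy as np

def classical_mds(dist, k=2):
    """Embed an n x n distance matrix in k dimensions (classical MDS)."""
    d2 = np.asarray(dist, dtype=float) ** 2
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ d2 @ centering       # double-centred Gram matrix
    eigvals, eigvecs = np.linalg.eigh(b)        # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:k]         # indices of the k largest
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale

def lowest_energy_mask(energies, cutoff=0.1):
    """Min-max scale energies to 0-1 and flag those below the cutoff."""
    e = np.asarray(energies, dtype=float)
    scaled = (e - e.min()) / (e.max() - e.min())
    return scaled < cutoff

# Toy data: corners of a 3 x 4 rectangle stand in for the RMSD matrix.
points = np.array([[0.0, 0.0], [3.0, 0.0], [3.0, 4.0], [0.0, 4.0]])
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
coords = classical_mds(dist)
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
print(np.allclose(recovered, dist))  # checks the embedding keeps the distances

# Made-up energies; True marks the points that would be drawn in red.
mask = lowest_energy_mask([0.0, 5.0, 10.0, 0.5])
print(mask)
```

The recovered coordinates are only defined up to rotation and reflection, so a plot made this way may look like a mirrored or rotated copy of the R one.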

Wednesday, 12 May 2010

Advanced guide to unsubscribing from an ACS marketing list

I used to think I was pretty good at avoiding spam. I always tick the box (or sometimes untick it - you have to be on your toes) that stops companies sending me marketing 'literature'. It all worked well until I met the ACS...

You see, I did tick that box with the ACS (I've just checked) but the spam keeps coming. And there isn't just one list - oh no, they have got lists for authors, conference attendees, everything and anything. They used to send me a pre-approved membership form even though I was already a member! Naturally I complained, someone responded, and I continued to receive the requests. So it goes. I subsequently moved countries - seems to have worked.

So...back to the main topic. Today I received an email from marketing@acs.org about something called ACS Spam (I'm summarising some details here). As required by some internet law (in the US?), they include the following helpful unsubscribe link at the bottom of the email:
If you do not wish to receive any further correspondence from ACS via email, please send a blank e–mail to purge-allcontributors2010-7753700d73f2fd1010f8a0584006fffc5bb2fdf@listmanager1.acs.org and send the message.

Apart from the fact that the sentence doesn't make any sense, it simply doesn't work - and I've tried it before for other ACS spam. Here's the response I received:
Delivery to the following recipient failed permanently:

purge-allcontributors2010-7753700d73f2fd1010f8a0584006fffc5bb2fdf@listmanager1.acs.org

Technical details of permanent failure:
Google tried to deliver your message, but it was rejected by the recipient domain. We recommend contacting the other email provider for further information about the cause of this error. The error that the other server returned was: 550 550 ... Hostname of recipient unknown to Lyris ListManager (state 14).

State 14, eh? This means that the ACS wants to play hardball (hurling perhaps). I was about to email the ACS directly, when I noticed a feature of Gmail I hadn't seen before. If you click on "Show Details" for the original email, you will see a link for "Unsubscribe me from this sender". And believe it or not, clicking on this appeared to work. I got an email back from "Lyris ListManager" saying that I had been unsubscribed from 'allcontributors2010'. This appears to be a Gmail-only feature (see this blog post) so good luck unsubscribing if you use a different email provider.
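
If you are not on Gmail, you may be able to do by hand what the Gmail link presumably does. The following is a minimal sketch, assuming the link is driven by the standard List-Unsubscribe mail header (RFC 2369); the post above does not confirm this, and the message and addresses in the example are made-up placeholders, not the ACS ones.

```python
from email import message_from_string
import re

def unsubscribe_targets(raw_message):
    """Return the URIs listed in a message's List-Unsubscribe header, if any."""
    msg = message_from_string(raw_message)
    header = msg.get("List-Unsubscribe", "")
    # RFC 2369 wraps each URI in angle brackets, separated by commas.
    return re.findall(r"<([^>]+)>", header)

# Hypothetical raw message (what "Show original" would give you).
raw = (
    "From: marketing@example.org\n"
    "Subject: Newsletter\n"
    "List-Unsubscribe: <mailto:leave-list@example.org>, <http://example.org/unsub>\n"
    "\n"
    "Body text\n"
)
targets = unsubscribe_targets(raw)
print(targets)
```

Any mailto: entry found this way is the address the list software itself advertises, which may differ from the one quoted in the body of the email.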

Friday, 7 May 2010

Talking the talk - ACS talks now up on the web

You may have missed one or two talks at this year's ACS. If so, break open your internet and head over to the recorded talks. If you want a retro fix, you can check out last Fall's ACS too.

About half the talks from the Visual Analysis of Chemical Data symposium are available (just search the page for that phrase), but you know what's really missing - some sort of social voting system. In its absence, feel free to leave comments below about which talks are really great.

Tuesday, 4 May 2010

Give your talks and lectures a worldwide audience

I have recently become a convert to the idea of putting as many as possible of my talks, posters and lectures on the web (see current progress here).

On March 25th I gave a talk on Cinfony to about 30 people at the ACS National Meeting. As soon as I got back, I put the talk up on Slideshare and inserted it in my blog. There have been 451 views in the month and a half since.

Because of this, I'm now going back and digging up all of my talks and posters and putting them up both on Slideshare and on my website. Naturally, I'm going to include the original file (typically a PowerPoint file) as one day Slideshare will be no more.

The funny thing is that although few scientists tend to provide their talks (notable exceptions are Rajarshi and Jean-Claude Bradley), I find it hard to think of any disadvantage. The whole point of giving talks and presentations is to publicise your work and surely your website is equally important in this regard. Indeed, many people might prefer to click through a presentation that explains your paper in bullet points rather than wade through your scintillating prose.

Another good question is whether there's any difference between the audience that saw the talk at the conference and the audience that sees it online. The conference audience (especially at the ACS) would be more of the PI variety than the postgrad/postdoc. And I would guess that it would be the other way around online.

If you have any thoughts on whether this is a good/bad idea or know anyone else that provides all their talks, please leave a comment below...

Saturday, 10 April 2010

Plug cclib into Avogadro

cclib is an open source Python library for parsing and analysing computational chemistry log files.

Avogadro is a cross-platform molecular visualiser with a lot of functionality geared towards setting up and analysing computational chemistry calculations. It is an open source project led by Marcus Hanwell and is built on top of OpenBabel.

Although Avogadro has increasing support for parsing comp chem log files from a variety of packages, there may be times when it chokes on a file. If this happens, it's nice to have a backup option. This is where cclib can come in handy.

If you're on Windows, have Avogadro 1.0.0, cclib 1.0, OpenBabel 2.2.3 and its Python bindings (these latter shouldn't be necessary in a future version of Avogadro), just copy the following code into a .py file and save it in C:\Program Files\Avogadro\bin\extensionScripts.

When you next start Avogadro, it will have an "Open with cclib" option in the Scripts menu.

from PyQt4.QtCore import *
from PyQt4.QtGui import *

import sys
sys.path.append("C:\\Python26\\Lib\\site-packages")
sys.path.append("C:\\Python26\\lib")

import cclib

import numpy
import Avogadro as avo
import openbabel as ob

class Extension(QObject):
    def __init__(self):
        QObject.__init__(self)

    def name(self): # Recommended
        return "Open with cclib"

    def description(self): # Recommended
        return "Open comp chem files with cclib"

    def actions(self): # Required
        actions = []

        # Actions are just instances of the QAction class from PyQt4
        action = QAction(self)
        action.setText("Open with cclib")
        actions.append(action)

        return actions

    def performAction(self, action, glwidget): # Required
        # Only one action so no need to check its identity

        filename = str(QFileDialog.getOpenFileName())
        if not filename: # You hit cancel
            return None

        # Parse the log file with cclib
        logfile = cclib.parser.ccopen(filename)
        data = logfile.parse()

        # Build the final geometry in both OpenBabel and Avogadro
        obmol = ob.OBMol()
        avomol = avo.molecules.addMolecule()
        avoatoms = []
        for atomcoord, atomno in zip(data.atomcoords[-1], data.atomnos):
            coord = atomcoord.tolist()
            obatom = ob.OBAtom()
            obatom.SetAtomicNum(int(atomno))
            obatom.SetVector(*coord)
            obmol.AddAtom(obatom)

            newatom = avomol.addAtom()
            newatom.atomicNumber = int(atomno)
            newatom.pos = atomcoord
            avoatoms.append(newatom)

        # Let OpenBabel perceive the bonds from the 3D coordinates
        obmol.ConnectTheDots()
        obmol.PerceiveBondOrders()
        obmol.SetTotalSpinMultiplicity(data.mult)
        obmol.SetTotalCharge(data.charge)

        # Copy the perceived bonds across to the Avogadro molecule
        for bond in ob.OBMolBondIter(obmol):
            newbond = avomol.addBond()
            newbond.setBegin(avomol.atom(bond.GetBeginAtomIdx() - 1))
            newbond.setEnd(avomol.atom(bond.GetEndAtomIdx() - 1))
            newbond.order = bond.GetBO()

        avo.GLWidget.current().molecule = avomol

        return None
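
To try the cclib half of this outside Avogadro, here is a minimal sketch that pulls out the final geometry using the same calls as the script above (`cclib.parser.ccopen`, `parse()`, `atomnos`, `atomcoords`). The helper names `final_geometry` and `load_geometry` are hypothetical, the water-like coordinates are invented stand-in data, and newer cclib releases may expose `ccopen` from a different module.

```python
from types import SimpleNamespace

def final_geometry(data):
    """Pair each atomic number with its coordinates from the last geometry."""
    return [(int(atomno), tuple(float(x) for x in coord))
            for atomno, coord in zip(data.atomnos, data.atomcoords[-1])]

def load_geometry(filename):
    """Parse a log file with cclib and return its final geometry."""
    import cclib  # imported here so final_geometry can be tried without cclib
    logfile = cclib.parser.ccopen(filename)
    return final_geometry(logfile.parse())

# Stand-in for parsed data: three atoms, two optimisation steps (made up).
fake = SimpleNamespace(
    atomnos=[8, 1, 1],
    atomcoords=[
        [[0.0, 0.0, 0.0], [0.0, 0.8, 0.6], [0.0, -0.8, 0.6]],
        [[0.0, 0.0, 0.1], [0.0, 0.76, 0.59], [0.0, -0.76, 0.59]],
    ],
)
geometry = final_geometry(fake)
for atomno, xyz in geometry:
    print(atomno, xyz)
```

With a real log file, `load_geometry("my_calc.log")` (a placeholder filename) would return the same kind of list, provided cclib recognises the file format.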

Notes:
  • There should be no need to have the OpenBabel Python bindings installed separately, but there is no way to access the OpenBabel library in Avogadro (at least on Windows). This prevents me, for example, from calling ConnectTheDots (I had to use my own installation of OpenBabel) or adding conformers (which was what I wanted to do).
  • There are two Script menus after installing this plugin!
  • How do I emit a debug message?
  • The Python prompt in Avogadro requires you to "print" everything to see its value. This should not be necessary.
  • Cutting and pasting multiple lines into the Python prompt works fine, but it looks pretty weird as the prompt (>>>) is missing.

Friday, 9 April 2010

ANN: Cinfony 1.0 released

Cinfony presents a common API to several cheminformatics toolkits. It uses the Python programming language, and builds on top of OpenBabel, RDKit, the CDK, and cheminformatics webservices.

Cinfony 1.0 is now available for download.

The two major additions in this release are support for using OpenBabel from IronPython (the ironable module), and webel - a cheminformatics toolkit that uses webservices (read more about this here).

As usual, Cinfony has been updated to use the latest stable releases of each toolkit: OpenBabel 2.2.3, CDK 1.2.5 and RDKit Q4_2009.

Thanks to Fredrik Wallner, Tom Sheldon and Cedric Moretti for reporting bugs.

Feedback, both positive and negative, is very much appreciated - I want to make Cinfony a useful toolkit for the community. That means you.