Noel O'Blog

Thursday, 22 August 2013

Conference etiquette - Don't mention the ----!

Conferences are all about communication. But what can and cannot be communicated at a conference, or rather what should and should not?

Probably top of the list of "should not" is to use inappropriate examples or images to liven up proceedings. Recently I attended the 6th Joint Sheffield Conference on Chemoinformatics (more here at NM) and saw red when a speaker talking about ways to rank docking programs decided to use contestants in Miss World as an illustrative example, complete with pictures, references to "lovely girls" and "let's just treat them as objects". I noticed someone got up and walked out at this point.

In other fields such as technology, some conferences have found it necessary to develop a Code of Conduct to spell out appropriate behaviour for people who need it spelled out; e.g. here's an excerpt from PyCon's:

All communication should be appropriate for a professional audience including people of many different backgrounds. Sexual language and imagery is not appropriate for any conference venue, including talks.

Be kind to others. Do not insult or put down other attendees. Behave professionally. Remember that harassment and sexist, racist, or exclusionary jokes are not appropriate for PyCon.

Attendees violating these rules may be asked to leave the conference without a refund at the sole discretion of the conference organizers.

Meanwhile over in Vermont, the Gordon Research Conference on CADD was taking place at the same time. This particular meeting had a focus on the use or abuse of statistics with a goal to improve the situation. The thing is, GRCs are subject to the Chatham House Rule. Before I checked the Wikipedia link I thought that this meant that everything discussed at such a meeting is confidential, and I think this is the common understanding (see for example Peter Kenny's reference to this over at his blog). However, apparently it simply means that the discussions must not be attributed to anyone. Either way, probably more people misunderstand the Rule (like me) than understand it.

I understand that this may be out of the control of the organisers, but it seems to me that this rule holds back the communication of the information to a wider audience: can you tweet the talks? can you post the slides? can you write a blog post about the conference? can you even mention it to co-workers at coffee? Do people really present controversial work that needs protecting by the Rule?

I guess what I'm really asking is, could everyone just follow Craig Bruce and Peter Kenny's lead and post their slides on Lanyrd already? (Just don't mention the Rule) :-)

Poll question: Time for Python 3?

This year's poll question is...if you use Python, which Python do you use? (See the sidebar on the left.)

By way of background, Python 3.0 was first released back in Dec 2008. Almost five years later, many are still using a Python 2.x version. For scientists, the main blocker was numpy, which has only supported Python 3 since Sep 2010, but additionally IPython took until July 2011 and matplotlib has only this year (Jan) supported Python 3.

In cheminformatics, Open Babel has supported Python 3 since March 2009 but neither RDKit nor OEChem has yet added Python 3 support. However Greg is currently thinking about it (if you're for it, you can support this idea on Google+).

It's interesting to look in more detail at the Windows downloads for the OB Python releases. For OB 2.3.1 released in Oct 2011, the ratio of Python 2 to Python 3 downloads is 4:1 (2.5:148, 2.6:238, 2.7:983, 3.1:41, 3.2:301) while for OB 2.3.2 released in Nov 2012, the ratio is 2.5:1 (2.6:81, 2.7:424, 3.2:73, 3.3:129). The trend is obvious but it seems that change is slow.
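The ratios quoted above can be checked with a few lines of Python. This is just a sketch of the arithmetic: the download counts are the ones given in the paragraph, while the dictionary layout and the function name `py2_to_py3_ratio` are made up for illustration.

```python
# Windows download counts per Python version, as quoted in the post.
downloads = {
    "OB 2.3.1": {"2.5": 148, "2.6": 238, "2.7": 983, "3.1": 41, "3.2": 301},
    "OB 2.3.2": {"2.6": 81, "2.7": 424, "3.2": 73, "3.3": 129},
}


def py2_to_py3_ratio(counts):
    """Return total Python 2 downloads divided by total Python 3 downloads."""
    py2 = sum(n for version, n in counts.items() if version.startswith("2."))
    py3 = sum(n for version, n in counts.items() if version.startswith("3."))
    return py2 / py3


ratios = {release: py2_to_py3_ratio(counts) for release, counts in downloads.items()}
for release, ratio in ratios.items():
    print(f"{release}: {ratio:.1f}:1")
```

Summing the counts gives 1369 versus 342 downloads for OB 2.3.1 and 505 versus 202 for OB 2.3.2, which is where the roughly 4:1 and 2.5:1 figures come from.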

Image credit: Fox Fotography (Oliver on Flickr)

Thursday, 25 July 2013

So shall we be social?

In case you are upset that I ignore your LinkedIn requests, don't follow you back on G+, do not RT your Ts, won't answer your Open Babel emails, mark your blog comments as spam, throw your letters into the recycling bin, and never buy you flowers, let me explain:

If we have never met in person, I will ignore your LinkedIn request. I use LinkedIn to remember the names/faces/bios of people I've met.
I only follow people with public posts on Google Plus. Non-public posts cannot be reshared publicly, so it's just a pain to follow people who only generate content for Google.
I don't log into Twitter very often.
Anyone who emails me personally (even if I know you) with questions about Open Babel is politely requested to resend to the mailing list. Open Babel has a lot of users and I don't scale.
If I spammed your blog comment by mistake, get in touch. Like, you know, leaving a non-spammy comment below.
Any communications from the ACS marked Important, Very Important, Your Eyes Only, This Week Only, or Find-a-Member-and-Get-a-Periodic-Table-Thing will automatically be recycled without opening.

None of these social media requests upset me though - don't get me wrong. I'm just trying to keep my digital life manageable.

A new home for Linux4Chemistry - Part II

Back in February, I was looking for someone to take over Linux4Chemistry. Thanks to those of you who got in touch.

I'm happy to announce that Riccardo Vianello and Gianluca Sforna have stepped forward to take over the site, and have established a new home for it at http://www.linux4chemistry.info.

I wish them all the best. Good luck guys!

Saturday, 20 July 2013

Cheminformatics in Science Fiction

Ever read a science fiction story, or indeed any story, that featured cheminformatics as a major plot point?

I've just come across a short story by Asimov that involves a murder mystery in a chemistry library where the Beilstein catalogue plays a major role. Strictly speaking this isn't science fiction, but let's go with science-based fiction. The story in question is Asimov's "What's in a name?". Since it's a mystery story, better not to google the plot unless you like spoilers. It's available as part of a collection, Asimov's Mysteries.

(And a shout out to last year's Alpha Shock by Murcko and Walters - there's a bit of cheminformatics in there too.)

Sunday, 14 July 2013

Using Open Babel to package chemistry software

Let's suppose you want to write a piece of C++ code that manipulates molecules to do something or other, and generates an output, either more molecules (or indeed the original molecules filtered), some descriptors, or some text (a report, or table, or something).

I'm going to propose that you should write or adapt this code as an Open Babel plugin. I've just done this for Confab, the conformer generator I wrote some time back.

If you do this, you don't need to consider how to put together the build infrastructure, write the code for reading/writing file formats, or for handling command-line options and arguments (in fact, you get a lot of additional functionality for free). More generally, the software will compile cross-platform, be included in every major Linux distribution and be available to a very large number of people. It will also have a lifetime beyond the end of the grant that funded it.

I'm by no means the first to see the advantages of this. For example, Jiahao Chen added the QTPIE charge model he developed as one of Open Babel's charge model plugins.

Coming back to Confab, the original code was written as a modified version of Open Babel. I don't quite know what I was thinking but I meant to integrate it properly with the main Open Babel code at some point. In the end, the required push was provided by David Hall, who started off this integration early this year.

If you want to try it out now, clone the development version of Open Babel on GitHub. The "--confab" option to obabel will invoke the Confab operation and "-oconfabreport" will replace calcrmsd in the original release. For command-line help on either, use "obabel -L confab" or "obabel -L confabreport".

Image credit: Plug In by Sebastian Anthony (Mr Seb) on Flickr (CC-BY ND)

Thursday, 20 June 2013

Least Publishable Unit #3: Novel Approach to Pharmacophore Discovery

Here's an idea I had towards the end of my last postdoc, but I only played around with it a little. It relates to the pharmacophore search software Pharmer.

I've mentioned David Koes's Pharmer here before. For the purposes of this discussion the key point is that it can do ultra-fast pharmacophore searching. The catch is that you need to generate conformers and index them in advance, but once done the search software whizzes through them. For example, check out the ZINC Pharmer applet over at the Camacho Lab's website.

Once something becomes ultra-fast, it opens up new possibilities for its use. In particular, we can consider using an ultra-fast pharmacophore search tool to discover pharmacophores in the first place.* So imagine tuning the pharmacophore definition by searching against a set of inactives seeded with actives; you could have a GA handling the tweaking of the pharmacophore and so on.

This method has a couple of nice features. If you are interested in selectivity (think of the 5-HT_2A versus 5-HT_2C I mentioned in an earlier installment) you could penalise matches against molecules known to bind to only the wrong target and reward matches to molecules known to bind to only the right one. This was why I was interested in this method, but in the end I didn't have enough data to proceed down this road; I did put Pharmer through its paces though and tried some basic optimisations of pharmacophore definitions using this approach.
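To make the reward/penalty idea concrete, here is a minimal sketch of the kind of fitness function a GA could optimise. It is not what I actually ran, and it does not call Pharmer: it assumes the search has already been done and simply takes the set of molecule IDs that a candidate pharmacophore matched. The function name, the `penalty` weight and the molecule IDs are all hypothetical.

```python
def selectivity_score(matched, right_only, wrong_only, penalty=1.0):
    """Score a candidate pharmacophore from the molecule IDs it matched.

    matched    -- IDs returned by the pharmacophore search (assumed input)
    right_only -- IDs of molecules known to bind only the desired target
    wrong_only -- IDs of molecules known to bind only the undesired target
    penalty    -- hypothetical weight on matches to the wrong target
    """
    matched = set(matched)
    # Fraction of the "right target only" molecules that were retrieved.
    reward = len(matched & set(right_only)) / len(right_only)
    # Fraction of the "wrong target only" molecules that were retrieved.
    cost = len(matched & set(wrong_only)) / len(wrong_only)
    return reward - penalty * cost


# Hypothetical molecule IDs, for illustration only.
right_only = {"a1", "a2", "a3", "a4"}
wrong_only = {"b1", "b2"}
candidate_hits = {"a1", "a2", "a3", "b1", "x9"}

score = selectivity_score(candidate_hits, right_only, wrong_only)
print(score)
```

A GA would then mutate the pharmacophore definition, rerun the search, and keep the variants with the highest score. Using fractions rather than raw counts is a design choice here, so that a large set of wrong-target molecules does not swamp a small set of right-target ones.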

In short, as far as I know no-one has ever carried out pharmacophore discovery in this way, so it could be something interesting to try especially in the context of developing a selective pharmacophore.

* I think this possibility is mentioned at the end of the Pharmer paper. But as far as I remember I thought of this independently. Makes no difference in any case.