Tuesday 30 October 2007

A few brief words about journal names

If you solved the cryptic crossword that is today's blog title, you will have realised that I'm talking about journal abbreviations. For some reason, almost all journals require you to abbreviate the names of the journals you cite, while simultaneously (and this is the bit that really gets me) not letting you know what abbreviations you should use: is it "J. Mol. Struc." or "J. Mol. Struct." or even "J. Mol. Struct. THEOCHEM"?

Perhaps you do as I do; in that last-minute tidy-up of citations, I google "journal abbreviations" and try to find suitable abbreviations for the last few remaining articles. Or sometimes I spend a quiet afternoon searching through the list of citations in recent articles from that journal just to find out how to cite that particular journal!

But sometimes a little voice inside me cries "Why!" - why am I wasting the best years of my life solving the non-problem that is how to abbreviate the title of a particular journal? Don't journals want to be cited correctly? In this age of electronic linking of data, misspelling the journal abbreviation could mean that a citation to a particular journal would be missed. Since journals seem to think that Impact Factors are so important, couldn't they at least give some hint what abbreviation they think you should use?

So, on to my favourite part: "The Case Study", aka "The one that pushed me over the edge". I wanted to cite a paper from Nucleic Acids Research. This is commonly known (to me) as NAR. Indeed, the website is nar.oxfordjournals.org. And the blurb on the front page says "Nucleic Acids Research (NAR) is a fully Open Access journal". Fantastic, NAR must be the accepted journal abbreviation. Job done.

Just to be sure, let's check the Instructions for Authors. Here they give an example of how they wish references to be formatted. The great thing is that the examples they give are from NAR:
1. Schmitt,E., Panvert,M., Blanquet,S. and Mechulam,Y. (1995) Transition state stabilisation by the 'high' motif of class I aminoacyl-tRNA synthetases: the case of Escherichia coli methionyl-tRNA synthetase. Nucleic Acids Res., 23, 4793-4798.
...but wait a second, that's "Nucleic Acids Res." not "NAR"!

Now our curiosity (aka annoyance) is piqued, so let's check out the current issue of Nucleic Acids Research. But what's that in the title of the page? "Nucl. Acids Res. -- Table of Contents...".

So NAR, or "Nucl. Acids Res." or "Nucleic Acids Res."? And don't even get me started on why journals require these abbreviations in the first place even if they are web-only...

Image credit: Raistrick's Index to Legal Citations and Abbreviations by ex_libris_gul (CC BY-NC-SA 2.0)

Saturday 20 October 2007

At last - merge, split and create PDFs with open source tools

I recently had to manipulate some PDFs, and I was pleasantly surprised that things had improved so much since the last time I had to do this, a couple of years ago.

Whatever you feel about PDFs (for example, you may believe they are a very effective way of destroying all scientific data in the literature), your opinion will suddenly take a nosedive the first time you have to, for example, extract a single page from a PDF, or merge two PDFs together. At this point, you will suddenly realise that Adobe own you, and that you will need to buy Adobe's software if you want to perform this trivial task.

But help is at hand in the form of Open Source software. I found that I was able to manipulate PDFs by using Open Source tools, and on Windows. The first thing to do is to install PDFCreator (GPL). Once it's installed, when you print from any application (for example, Word) you just choose the PDFCreator 'printer', and click OK. After a couple of seconds, a dialog box will pop up where you can just click "Save" (or you might want to adjust the page size to A4 or Letter), and it will make a PDF with the same name as your original document.

A couple of years ago, I used PDFCreator and it worked 96% of the time. That is, Word documents sometimes had strange symbols inserted instead of Greek letters or bullet points; also, extra spacing was sometimes inserted in lines with some text in superscript. This time, it worked perfectly. Well done, PDFCreator creators.

Scanned documents are now often provided as PDFs. I needed to merge some pages of one scanned document with my new PDF. For this, I needed Pdftk (GPL). The blurb says it all:
If PDF is electronic paper, then pdftk is an electronic staple-remover, hole-punch, binder, secret-decoder-ring, and X-Ray-glasses. Pdftk is a command-line tool for doing everyday things with PDF documents.
Here's an example command line which is similar to the one I actually used (again I did this on Windows). It creates a combined PDF which consists of pages 1-7 from one.pdf, 1-5 from two.pdf, and ends with page 8 from one.pdf.
pdftk A=one.pdf B=two.pdf cat A1-7 B1-5 A8 output combined.pdf
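If you need to build this sort of command for lots of files, a small script can assemble it for you. Here is a minimal Python sketch that reproduces the command line above; the helper name pdftk_cat_command is made up for illustration and is not part of pdftk. It only builds the argument list, and actually running it assumes pdftk is installed and on your PATH.

```python
from typing import Dict, List


def pdftk_cat_command(inputs: Dict[str, str], ranges: List[str], output: str) -> List[str]:
    """Build (but do not run) a pdftk 'cat' command.

    inputs maps a handle such as "A" to a PDF file name,
    ranges lists page ranges such as "A1-7" in the order they should appear,
    output is the name of the combined PDF.
    """
    handles = [f"{handle}={path}" for handle, path in inputs.items()]
    return ["pdftk", *handles, "cat", *ranges, "output", output]


# The same job as the command line above.
cmd = pdftk_cat_command(
    {"A": "one.pdf", "B": "two.pdf"},
    ["A1-7", "B1-5", "A8"],
    "combined.pdf",
)
print(" ".join(cmd))

# To actually run it (assumes pdftk is installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than one long string means file names with spaces do not need any extra quoting.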
If you ever need to deal with PDFs, hopefully these tools can help reduce the pain.

Friday 12 October 2007

ANN: Frog donates code to OpenBabel for SMILES to 3D conversion

Recently, researchers at the French research institutes INSERM and CNRS developed an online service for converting SMILES strings to 3D conformers: "FRee Online druG 3D conformation generator (Frog)". A description of this service was published in T. Bohme Leite, D. Gomes, M.A. Miteva, J. Chomilier, B.O. Villoutreix and P. Tufféry. Nucleic Acids Research, 2007, 35, W568-W572:
Frog is an on-line service aimed at generating 3D conformations for drug-like compounds starting from their 1D or 2D descriptions. Given the atomic constitution of the molecules and connectivity information, Frog can identify the different unambiguous isomers corresponding to each compound, and generate single or multiple low-to-medium energy 3D conformations, using an assembly process that does not presently consider ring flexibility. Tests show that Frog is able to generate bioactive conformations close to those observed in crystallographic complexes.

On behalf of the OpenBabel project, I am pleased to announce that Dr. Bruno Villoutreix (INSERM, University of Paris 5) and Dr. Pierre Tufféry (INSERM, University of Paris 7) have generously donated their code to OpenBabel. This code will be incorporated into OpenBabel under the GPL in the coming months, making fast and accurate SMILES-to-3D conformer generation available to the open source community for the first time.

The absence of an open source 3D conformer generation algorithm has increasingly become a problem in recent years due to the popularity of SMILES strings for the description of molecular information. Fortunately, this problem has now been solved. Thanks again to all those involved in the development and release of this code.

For further information on Frog, please contact the corresponding author of the Frog paper.

Image credit: gottcha78

O'woe is me - It's a capostrophe

"The username contains an illegal character." This is the final straw. I can take no more. I need to share the pain of living with an apostrophe.

Yesterday I read Pedro's post and decided to get on the crest of that Web 2.0 wave. So I requested a beta account on JournalFire, which sounds like an interesting way to review the literature. Today, I received permission to create an account. It asked me to: "Enter your name as it would appear for publication. For example: John M. Delacruz". So I entered "Noel M. O'Boyle". And I received the infamous message, "The username contains an illegal character"!!

What's the story with this apostrophe anyway? Well, at some point in history, when my ancestor was called "Ó Baoighill", he had this great idea: "hey, what's the story with this accent on the O; I want to use this new thing called an apostrophe that's going to mess up web accounts for my descendants" (or more likely "Yó, cad é an scéal leis an fada seo; ba mhaith liom usaid a bhaint as an uaschamóg nua sin, a scriosfaidh saol mo pháistí ar an idirlíon"). I'd have appreciated a bit more foresight, forefather.

On the web, the Irish are outlaws. This isn't the first time this has happened. A lot of websites think I'm trying to hack into their systems with my carefully-crafted surname. SourceForge is one of the only sites that allows me to use the apostrophe. But even there, they spell my name "Noel O\'Boyle" just to say..."okay, you can use the apostrophe, but we're keeping an eye on you to make sure you don't try anything crazy with it". Actually, I've just noticed that it's gotten even worse. Now, my name is spelt "Noel O\\\'Boyle". That's like two pairs of handcuffs.
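I don't know what SourceForge actually does behind the scenes, but one plausible explanation for the extra handcuffs is that the name gets backslash-escaped more than once on its way to the page. Here is a minimal Python sketch of that guess; the function escape_quotes is hypothetical and simply stands in for whatever escaping routine a site might use.

```python
def escape_quotes(text: str) -> str:
    """Backslash-escape backslashes first, then apostrophes."""
    return text.replace("\\", "\\\\").replace("'", "\\'")


name = "Noel O'Boyle"
once = escape_quotes(name)    # one backslash before the apostrophe
twice = escape_quotes(once)   # the earlier backslash is itself escaped

print(once)
print(twice)
```

Escaping once gives the single-backslash spelling, and escaping that result again gives three backslashes, which matches the two spellings above. If this guess is right, the fix is to escape exactly once, at the point where the text is actually used.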

Certain organisations should be more familiar with author names than others; for example, a journal. A paper of mine was recently accepted by J. Comp. Chem. On the PDF proof, the names are all in capitals, i.e. "NOEL M. O'BOYLE" for mine. However, on the web abstract, in the Table of Contents, and in the XML they provided to PubMed, my name is "O'boyle". I feel diminished.

It's not only on the web, of course. When spelling your name over the phone you find yourself saying things like "it's a comma in the air" when trying to describe an apostrophe. Sometimes you just give up and spell your name "Oboyle" with the result that people think it's "O-boy-lay". I've even gotten letters in the post addressed to "Noel O?Boyle". Which, right now, is how I feel.

Friday 5 October 2007

Bring PDB codes to life on web pages

Wouldn't it be nice if you could click on a PDB code on a web page and you could instantly see the actual structure in 3D? For example, you might be reading the HTML version of a paper which is discussing a particular protein structure.

Some time ago I wrote a Greasemonkey userscript to do just that. You can get it on the Blue Obelisk web site, as well as several other userscripts. Since I haven't previously mentioned it on this blog I thought I might do so now.

To avoid false positives, it only runs on web pages containing the words "protein", "pdb" or "enzyme". For any PDB codes found, it adds the appropriate link to Eric Martz's FirstGlance in Jmol site. For example, the PDB codes on this page are transformed to: (The yellow Jmol links were added by the script.)
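The general idea can be sketched in a few lines of Python. This is not the userscript itself (which is JavaScript and runs in the browser), just an illustration under a few assumptions of mine: a PDB code is taken to be four characters, a digit from 1 to 9 followed by three letters or digits; I additionally require at least one letter so that years like 2007 are skipped; and VIEWER_URL is a made-up placeholder rather than the real FirstGlance address.

```python
import re
from typing import List

# Only look for codes on pages that mention one of these words.
KEYWORDS = ("protein", "pdb", "enzyme")

# Assumed pattern: a digit 1-9 followed by three alphanumerics,
# at least one of which must be a letter (so "2007" is ignored).
PDB_CODE = re.compile(r"\b[1-9](?=[A-Za-z0-9]*[A-Za-z])[A-Za-z0-9]{3}\b")

# Hypothetical viewer address, for illustration only.
VIEWER_URL = "https://viewer.example.org/?pdb={code}"


def find_pdb_links(page_text: str) -> List[str]:
    """Return one viewer link per distinct PDB-like code in page_text."""
    lowered = page_text.lower()
    if not any(word in lowered for word in KEYWORDS):
        return []  # page does not look like it is about proteins
    codes: List[str] = []
    for match in PDB_CODE.finditer(page_text):
        code = match.group(0).upper()
        if code not in codes:
            codes.append(code)
    return [VIEWER_URL.format(code=code) for code in codes]


print(find_pdb_links("The enzyme structure 1crn was solved in 2007."))
print(find_pdb_links("Flight 1abc departs at noon."))
```

The keyword check is what keeps the pattern from firing on every four-character code on unrelated pages, at the cost of missing the occasional structure paper that never uses any of those three words.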