Monday 20 September 2010

Depict a chemical structure...without graphics

Sometimes you just need to identify the chemical structure in an SD file, but don't have access to a graphics terminal. For example, you could be logged into a server at a remote location. What to do?

Well, it turns out that you can depict pretty much anything using text symbols - this is known as ASCII art. Ubuntu has two of the main ASCII art libraries available, aa-lib and libcaca (named by a 10-year-old). Each has an associated viewer: asciiview and cacaview, respectively.

There are a few ways to go here: either a cheminformatics library could directly depict a molecule using ASCII art, or it could depict it using one of these libraries, or we can be lazy and just convert an existing PNG to text. The first case is likely to produce a better-quality image - it is actually the subject of a paper by Raymond Carhart in JCICS in 1976 (via Pat Walters). Naturally, since this is a blog post, we will take the lazy route here and just convert from PNG to text.

So, this is the original image:

I found it better to convert to B&W by thresholding all non-white pixels to black:

    convert orig.png -threshold 99% blackwhite.png
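
For the curious, here is roughly what that thresholding step amounts to, as a minimal pure-Python sketch rather than ImageMagick's actual implementation. It assumes a grayscale image held as a list of rows with values from 0 to 255; the function name `threshold_to_bw` and its parameters are made up for illustration.

```python
def threshold_to_bw(gray, cutoff=0.99, max_value=255):
    """Return a black-and-white copy of a grayscale image.

    `gray` is a list of rows, each a list of intensities in 0..max_value
    (an assumed in-memory layout, not a real image file format).
    Pixels brighter than cutoff * max_value become white (max_value);
    everything else becomes black (0).
    """
    limit = cutoff * max_value
    return [[max_value if px > limit else 0 for px in row] for row in gray]


if __name__ == "__main__":
    # A tiny made-up image: near-white background with a grey "bond".
    image = [
        [255, 254, 200],
        [0, 255, 128],
    ]
    for row in threshold_to_bw(image):
        print(row)
```

With a cutoff this close to white, even pale grey anti-aliased pixels end up black, which keeps thin bond lines from fading out.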


Running asciiview, we have the following:

Note that the structure is immediately clear. Still - we can do better. If we "-negate" the image first, we have:
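
Why might negating help? A hedged guess: if the viewer draws brighter pixels with denser characters, then negating turns the black bonds into bright pixels that get drawn with the strongest characters. The sketch below illustrates that idea in pure Python; the two-character palette and the function names `negate` and `render_ascii` are assumptions for illustration, not how asciiview actually works.

```python
def negate(bw, max_value=255):
    """Invert every pixel, so black becomes white and vice versa."""
    return [[max_value - px for px in row] for row in bw]


def render_ascii(bw, bright="#", dark=" ", max_value=255):
    """Render an image as text, one character per pixel.

    Assumption: bright pixels map to a dense character and dark pixels
    to a space, as on a light-on-dark terminal.
    """
    half = max_value / 2
    return "\n".join(
        "".join(bright if px >= half else dark for px in row) for row in bw
    )


if __name__ == "__main__":
    # A made-up 2x3 image: white background with black "bond" pixels.
    image = [
        [255, 0, 255],
        [0, 0, 0],
    ]
    print(render_ascii(image))          # background drawn, bonds left blank
    print(render_ascii(negate(image)))  # bonds drawn, background left blank
```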

How about for cacaview?

Not so good. However, both asciiview and cacaview have zoom and pan functionality and once we zoom in, the structure can be clearly identified:

I was originally thinking of including this functionality in Pybel (which, with the help of OASA, can generate 2D depictions as PNG files), but I think that generating such text images is best done through these ASCII art viewers, as you might need to zoom and pan to get the "full picture".

In a comment on FriendFeed, Hari wondered whether an exact depiction of a chemical structure could be made using Unicode characters. Good question. But first, how close can we get with ASCII characters? Here's my best attempt:

          O
        //
  Cl--{/
       \
        \_____
        / --- \
       /       \
       \\     //
        \_____/

Can you do better?

6 comments:

Egon Willighagen said...

Nice Blue Obelisk eXchange answer!

Noel O'Boyle said...

That's true. Maybe that was in my subconscious - I've added a link at BOX to this blog post.

Anonymous said...

Can't wait for some MD trajectories visualised in mplayer with -vo aa :-)

Egon Willighagen said...

Anonymous++

Noel, was just about to ask you about an inverted image, when I just noticed you added it somewhere this afternoon :) Noel++

harijay said...

Hi Noel and Egon..felt the need to do this again and came across this post..and was surprised to see a mention to a comment I made two years ago..:-)
Wondering if this can be done now in rdkit/openbabel/cinfony

Noel O'Boyle said...

See the latest post in this series. This functionality is available in the dev version of OB, soon to be released as OB 2.3.2.