Wednesday, 20 November 2013

Anatomy of a bug fix

There's been a longstanding problem with the 2d layout in Open Babel. When support for correct layout of cis/trans stereo was added, it was done as a correction to the initial layout rather than as an intrinsic part of the layout. This means that after nicely laying out a molecule to avoid overlaps, it may then flip two groups at the end of a double bond, crashing one part of a molecule into another.

At the time, given that the depiction code had just moved from no support for stereo to supporting it correctly (at a small expense of layout), sorting this out was way down my priority list. Anyhoo, a couple of days ago I decided to revisit this issue. Here's an example before and after:
I thought it might be interesting for some (or perhaps not) to see behind-the-scenes how one might go about sorting out something like this. In this case, I already knew the technical fix, but it still took some time to ensure it was done right. It's not quite #realtimechem, but probably as close I'll get.

Day one
1. Update local git repo by pulling from the main branch
2. Create (and switch to) branch for bugfix
3. Build Open Babel under MSVC (debug and release)
4. Some build issues with recent code (presumably only tested on Linux) for chemicaljson and chemdoodle plugins. Ignore for now as it doesn't impact this work (note to self: fix).
5. Need to generate problematic test cases. Convert several thousand ChEMBL molecules to SMIs and use grep to remove those with "." and keep those with either "\" or "/". Write Python script to sort by size and added line number as title (these are useful for debugging; smaller test cases are easier to handle).
6. Add print statements to the code to print out the measured "overlap error" before and after calling the code to fix the 2D layout
7. Use obabel to convert my test cases to SVG and pipe the output to a file
Night came. Morning came. The second day.
8. Coding time: move the double-bond correction to immediately after the initial layout.
9. When doing the somewhat exhaustive rotation around bonds for layout, do not alter bonds with defined double bond stereo.
10. Add print statements to the code to print out the measured "overlap error" afterwards
11. Use obabel to convert the test cases to SVG and pipe the output to a file
12. Write a Python script to compare the overlap errors for the old and new code
13. Visually inspect some of the cases where an improvement is supposed to have occured.
14. Oh oh! I see an example where the layout has improved but the stereo is now wrong.
Night came. Morning came. The third day.
15. I run the problem case in the debugger and step through the code inspecting the molecule object in the depiction code. Funny thing is, my code appears to be correct but none of the bonds are marked with stereo refs and so it doesn't do anything.
Night came. Morning came. The fourth day.
16. I run the problem case in the debugger with the original code. The stereo refs are there.
17. I update to the new code and run it again. The stereo refs are there originally but must be disappearing later. I step further along. A copy is made of the molecule; it seems like it uses its own special clone method. Looking further along, the bonds are cloned but the stereo refs are not being copied at that point. I fix this and all is well.
18. I run the test cases again, run my comparison Python script and start visually inspecting problem cases.
19. Seems like everything is working better than before.
20. I write a blog post before I forget everything.
21. Still need to remove the print statements, inspect the diff, commit the code, and send a pull request.

No comments: