Saturday, 23 September 2023

SmiZip at the 12th RDKit UGM


I really enjoyed the recent RDKit UGM in Mainz, hosted by Paul Czodrowski and his group and presided over as usual by Greg Landrum. The talks were great, and it was good to catch up with friends old and new. In particular, it was great to meet Tim Vandermeersch again, with whom I worked on Open Babel many years ago. He blew people's minds with a talk about converting SMARTS patterns to highly efficient C++ at compile time.
My talk was more about shrinking things, specifically SMILES strings. Back in 2001 Roger gave a talk where he described SmiZip, a method to compress SMILES strings. In my talk I revisit this and provide the first public implementation of SmiZip. I also discuss some potential applications, including its use as a measure of structural complexity.


If SmiZip sounds interesting, check out the info on the GitHub site and then "pip install smizip".
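
For a rough feel of what compressing a short string like a SMILES can look like, here is a toy sketch of dictionary-based compression in Python. It is only an illustration under my own assumptions, not the smizip package or its API: the n-gram table `NGRAMS` and the functions `compress` and `decompress` are hypothetical names invented for this example, the table is hand-picked rather than trained on a corpus, and the greedy longest-match strategy is a simplification that does not always give the shortest encoding.

```python
# Toy illustration only: this is NOT the smizip package or its API.
# NGRAMS is a hypothetical, hand-picked table; each entry maps to one byte.
NGRAMS = ["c1ccccc1", "C(=O)", "CC", "c1", "(", ")", "=", "C", "O", "N", "c", "1"]


def compress(smiles, ngrams=NGRAMS):
    """Encode a string as bytes, one byte per matched n-gram (greedy longest match)."""
    if len(ngrams) > 256:
        raise ValueError("at most 256 n-grams fit in a single byte")
    index = {ng: i for i, ng in enumerate(ngrams)}
    by_length = sorted(ngrams, key=len, reverse=True)  # try longer n-grams first
    out = bytearray()
    pos = 0
    while pos < len(smiles):
        for ng in by_length:
            if smiles.startswith(ng, pos):
                out.append(index[ng])
                pos += len(ng)
                break
        else:
            # No n-gram in the table matches at this position.
            raise ValueError(f"cannot encode character {smiles[pos]!r} at position {pos}")
    return bytes(out)


def decompress(data, ngrams=NGRAMS):
    """Decode by looking up each byte in the same n-gram table."""
    return "".join(ngrams[b] for b in data)


if __name__ == "__main__":
    aspirin = "CC(=O)Oc1ccccc1C(=O)O"
    packed = compress(aspirin)
    print(len(aspirin), "characters ->", len(packed), "bytes")
    print(decompress(packed) == aspirin)
```

The same table has to be used for compression and decompression, and any character that the table does not cover (for example "B" here) raises an error rather than being silently dropped.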