Sunday 30 May 2021

Combining protein structure with deep generative models for ligands

Journal of Cheminformatics has just published the first result from a collaboration between ourselves at Sosei Heptares and the Andreas Bender group. Morgan Thomas, the PhD student who did all the work, has presented early versions of this at various AI/Chemistry meetings but it's finally out there:

Morgan Thomas, Robert T. Smith, Noel M. O'Boyle, Chris de Graaf, Andreas Bender. Comparison of structure- and ligand-based scoring functions for deep generative models: a GPCR case study. J. Cheminform. 2021, 13, 39.

Deep generative models have shown the ability to devise both valid and novel chemistry, which could significantly accelerate the identification of bioactive compounds. Many current models, however, use molecular descriptors or ligand-based predictive methods to guide molecule generation towards a desirable property space. This restricts their application to relatively data-rich targets, neglecting those where little data is available to sufficiently train a predictor. Moreover, ligand-based approaches often bias molecule generation towards previously established chemical space, thereby limiting their ability to identify truly novel chemotypes.

In this work, we assess the ability of using molecular docking via Glide—a structure-based approach—as a scoring function to guide the deep generative model REINVENT and compare model performance and behaviour to a ligand-based scoring function. Additionally, we modify the previously published MOSES benchmarking dataset to remove any induced bias towards non-protonatable groups. We also propose a new metric to measure dataset diversity, which is less confounded by the distribution of heavy atom count than the commonly used internal diversity metric. 
With respect to the main findings, we found that when optimizing the docking score against DRD2, the model improves predicted ligand affinity beyond that of known DRD2 active molecules. In addition, generated molecules occupy complementary chemical and physicochemical space compared to the ligand-based approach, and novel physicochemical space compared to known DRD2 active molecules. Furthermore, the structure-based approach learns to generate molecules that satisfy crucial residue interactions, which is information only available when taking protein structure into account.
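
To make the "internal diversity" idea mentioned above a bit more concrete, here is a minimal sketch of one common formulation: one minus the mean pairwise Tanimoto similarity across a set of molecules. This is not the new metric proposed in the paper, and it is not the paper's code. The function names and the toy fingerprints (sets of made-up "on" bit indices) are assumptions for illustration only, and averaging over distinct pairs is just one of several conventions. In practice the fingerprints would come from a cheminformatics toolkit.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    union = len(a | b)
    # Two empty fingerprints are treated as identical (an arbitrary convention).
    return len(a & b) / union if union else 1.0


def internal_diversity(fingerprints) -> float:
    """One minus the mean Tanimoto similarity over all distinct pairs."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        raise ValueError("need at least two fingerprints")
    mean_similarity = sum(tanimoto(a, b) for a, b in pairs) / len(pairs)
    return 1.0 - mean_similarity


# Hypothetical toy fingerprints: two similar "molecules" and one unrelated one.
toy_fps = [
    frozenset({1, 2, 3, 4}),
    frozenset({1, 2, 3, 5}),
    frozenset({7, 8, 9}),
]

print(round(internal_diversity(toy_fps), 3))
```

Because the value is an average over pairs, anything that systematically shifts pairwise similarity (such as the size of the molecules, and hence how many bits are set) will shift the score too, which is the kind of confounding the abstract refers to.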

Overall, this work demonstrates the advantage of using molecular docking to guide de novo molecule generation over ligand-based predictors with respect to predicted affinity, novelty, and the ability to identify key interactions between ligand and protein target. Practically, this approach has applications in early hit generation campaigns to enrich a virtual library towards a particular target, and also in novelty-focused projects, where de novo molecule generation either has no prior ligand knowledge available or should not be biased by it.

For further background, a Q&A with Morgan appears over on Andreas's blog.

Monday 18 January 2021

Data/cheminf/compchem openings at Sosei Heptares

A year and a half into my new life in pharma, and I'm really enjoying it. And now we're looking for new members of the team at Sosei Heptares in Cambridge (UK), with the advertised posts covering everything from data management and cheminformatics through to computational chemistry, at both junior and senior levels.

I've pasted in the basic details of the posts below, but there are more details if you follow the links. Feel free to reach out to me if you have questions, or contact Chris de Graaf, who heads the Computational Chemistry team.

Computational Chemist – 3 positions at Sosei Heptares (Cambridge, UK) (Research Scientist, Senior Scientist, Principal Scientist)

We are growing our Computer-Aided Drug Design and Cheminformatics/AI capabilities by extending the Sosei Heptares Computational Chemistry team with three additional positions:

Link to advertised positions

These cover all experience levels, from a recent PhD to a highly experienced senior computational chemist in drug discovery. The positions are flexible, so different combinations of skills and/or experience are acceptable for the right candidate. Please forward this to anyone you feel would be passionate about joining the Sosei Heptares CompChem team, where scientific excellence and passion combine in a friendly, fun environment to impact drug discovery projects and create new cutting-edge approaches.

Discovery Data Manager at Sosei Heptares (Cambridge, UK)

In addition, Sosei Heptares is looking to recruit an experienced Discovery Data Manager to support our Research team:

Link to advertised position

This position is an exciting opportunity to work at the interface between Computational Chemistry, Medicinal Chemistry, Molecular Pharmacology, Translational Sciences, and Platform groups to streamline the GPCR structure-based drug discovery process in an industry-leading biotech company.

The closing date for all applications is 14th March.