In cheminformatics, there are two reasons why one might want to calculate the RMSD between two conformers. The first is to check whether two conformers are very close in structure, e.g. for the purpose of generating a diverse set of conformers. This problem is solved using (1) a least-squares alignment followed by (2) calculation of the RMSD.
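As a rough illustration of steps (1) and (2), here is a minimal NumPy sketch that uses the Kabsch algorithm for the least-squares alignment. Kabsch is one common choice and not necessarily what any particular toolkit does internally. The function name `aligned_rmsd` is made up for this example, and the sketch assumes both conformers list the same atoms in the same order.

```python
import numpy as np

def aligned_rmsd(coords_a, coords_b):
    """RMSD between two conformers after optimal superposition (Kabsch)."""
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    if a.shape != b.shape or a.ndim != 2 or a.shape[1] != 3:
        raise ValueError("expected two (N, 3) coordinate arrays")
    # Step 1a: remove translation by centring both conformers on the origin
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    # Step 1b: rotation that best maps a onto b, from the SVD of the covariance
    u, _, vt = np.linalg.svd(a.T @ b)
    # Guard against an improper rotation (a reflection)
    sign = -1.0 if np.linalg.det(vt.T @ u.T) < 0 else 1.0
    rotation = vt.T @ np.diag([1.0, 1.0, sign]) @ u.T
    a_rotated = a @ rotation.T
    # Step 2: RMSD over the superimposed atom pairs
    return float(np.sqrt(((a_rotated - b) ** 2).sum() / len(a)))
```

A conformer that has only been rotated and translated as a rigid body should come out with an RMSD of essentially zero, which is a handy sanity check.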

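The other situation is comparing two sets of 3D coordinates to see whether a prediction method has accurately reproduced experimental coordinates (e.g. docking). This just requires step (2) above: both sets of coordinates already share the same frame of reference (that of the protein), and aligning them first would hide any error in the predicted position and orientation.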
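For this second case, a minimal pure-Python sketch of the RMSD on its own might look like the following. The function name `rmsd_in_place` and the toy coordinates are invented for illustration, and the two lists are assumed to hold the same atoms in the same order.

```python
import math

def rmsd_in_place(coords_a, coords_b):
    """RMSD between two coordinate sets without any alignment."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Toy example: the "docked" pose is the "crystal" pose shifted by 2 along z
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
docked = [(0.0, 0.0, 2.0), (1.5, 0.0, 2.0)]
print(rmsd_in_place(crystal, docked))  # prints 2.0
```

Note that an alignment step would superimpose these two toy poses perfectly and report zero, even though the docked pose is displaced from the crystal one.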

The situation is complicated a little bit by the fact that only "heavy" atoms (i.e. non-H atoms in this context) are typically used to calculate the RMSD. A much greater complication is that automorphisms (well, isomorphisms of two molecules which are identical, to be exact) must be taken into account in both cases above. For example, consider the case where two para-substituted benzene rings must be compared; the RMSD calculation must take into account the fact that a 180 degree flip of the ring might yield a smaller RMSD.
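To make the ring-flip example concrete, here is a small pure-Python sketch of a symmetry-corrected RMSD. It assumes the coordinate lists already contain heavy atoms only, and that the list of atom mappings (the automorphisms) has been worked out beforehand, for instance by a toolkit's automorphism search. The function names and the idealised hexagon coordinates are made up for this example.

```python
import math

def rmsd_for_mapping(coords_a, coords_b, mapping):
    """RMSD where atom i of A is compared with atom mapping[i] of B."""
    total = 0.0
    for i, j in enumerate(mapping):
        total += sum((p - q) ** 2 for p, q in zip(coords_a[i], coords_b[j]))
    return math.sqrt(total / len(mapping))

def symmetry_corrected_rmsd(coords_a, coords_b, mappings):
    """Smallest RMSD over all supplied atom mappings."""
    return min(rmsd_for_mapping(coords_a, coords_b, m) for m in mappings)

# Ring carbons of a para-substituted benzene; substituents sit on atoms 0 and 3
ring = [(1.4, 0.0, 0.0), (0.7, 1.21, 0.0), (-0.7, 1.21, 0.0),
        (-1.4, 0.0, 0.0), (-0.7, -1.21, 0.0), (0.7, -1.21, 0.0)]
# Same ring after a 180 degree flip about the axis through atoms 0 and 3
flipped = [ring[0], ring[5], ring[4], ring[3], ring[2], ring[1]]

identity = [0, 1, 2, 3, 4, 5]
flip = [0, 5, 4, 3, 2, 1]  # swaps 1 with 5 and 2 with 4

naive = rmsd_for_mapping(ring, flipped, identity)
corrected = symmetry_corrected_rmsd(ring, flipped, [identity, flip])
print(naive > corrected, corrected)  # prints True 0.0
```

The naive atom-by-atom comparison reports a large RMSD for two poses that are chemically indistinguishable, while taking the minimum over the mappings recovers the expected value of zero.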

Anyhoo, here's some Pybel code that will calculate the RMSD between a crystal pose and a set of docked poses. The code also illustrates how to access the isomorphisms. You should modify the code for your specific purpose:

## 7 comments:

Thanks Noel, that's neat!

There is an algorithm that deals with what you are describing: the Hungarian algorithm. It has been implemented in a recent paper, doi:10.1021/ci100219f

Thanks for the link. I think that the Hungarian algorithm is particularly useful for comparing different molecules.

An ideal RMSD calculation algorithm should include graph matching and, of course, isomorphism detection. But RMSD is neither the best nor the only measure for comparing docking results.

@Vladimir: What is the best measure to compare docking results?

There is no best.

RMSD is good when you have a classic ligand with no really flexible parts, and the binding pocket is rigid and not too big.

The real-space R-factor is an alternative to RMSD: http://pubs.acs.org/doi/abs/10.1021/ci800084x

You can also use the similarity between two interaction fingerprints (IFP, SIFt) as a measure of docking quality, although this method has its limitations too.

Thanks for the link to the R-factor paper.
