tag:blogger.com,1999:blog-7844526396210378482.post2965125765890086186..comments2024-01-31T09:23:26.925+00:00Comments on Noel O'Blog: Pybel - Hack that SD fileNoel O'Boylehttp://www.blogger.com/profile/03288289351940689018noreply@blogger.comBlogger12125tag:blogger.com,1999:blog-7844526396210378482.post-20644294833706677962014-06-17T09:20:53.398+01:002014-06-17T09:20:53.398+01:00Since Open Babel can do that, it's also possib...Since Open Babel can do that, it's also possible to do that conversion with Pybel. Noel O'Boylehttps://www.blogger.com/profile/03288289351940689018noreply@blogger.comtag:blogger.com,1999:blog-7844526396210378482.post-14166954001876448152014-06-17T06:58:02.870+01:002014-06-17T06:58:02.870+01:00It's quite interesting. My question here is wh...It's quite interesting. My question here is whether Pybel is useful for converting SMILES to SDF format or not?Anonymoushttps://www.blogger.com/profile/16721590487971384687noreply@blogger.comtag:blogger.com,1999:blog-7844526396210378482.post-52657267286417074002009-06-12T09:46:17.429+01:002009-06-12T09:46:17.429+01:00I am afraid that I'm not very familiar with SM...I am afraid that I'm not very familiar with SMARTS. Perhaps you should email the CCL.net list.Noel O'Boylehttps://www.blogger.com/profile/03288289351940689018noreply@blogger.comtag:blogger.com,1999:blog-7844526396210378482.post-78812219051603877002009-06-10T23:31:02.972+01:002009-06-10T23:31:02.972+01:00I can ask Chris Lipinski but unfortunately (to my ...I can ask Chris Lipinski but unfortunately (to my mind), whatever his original intention was, the term HBD has already come to be interpreted in different ways by different groups.<br /><br />In my work we routinely use the Accord for Excel (Accelrys) add-in, which counts NH2 twice, and I would prefer to stick with this interpretation. 
Could you please suggest what SMARTS could give such a result?<br />RegardsUnknownhttps://www.blogger.com/profile/16641913350284932180noreply@blogger.comtag:blogger.com,1999:blog-7844526396210378482.post-31907665787057374362009-06-10T12:49:02.890+01:002009-06-10T12:49:02.890+01:00@Dast: I think there is only one way to find out f...@Dast: I think there is only one way to find out for certain what Lipinski's intention was - ask him! If you Google "Chris Lipinski" his email address is contained in the first hit. If you find out the answer, I would be very interested to hear.Noel O'Boylehttps://www.blogger.com/profile/03288289351940689018noreply@blogger.comtag:blogger.com,1999:blog-7844526396210378482.post-85003619740980863162009-06-08T01:20:23.370+01:002009-06-08T01:20:23.370+01:00I've just started using pybel and came across ...I've just started using pybel and came across your interesting blog.<br /><br />However, I'm curious about the definition of HBD used here. Lipinski defines this as the sum of OHs and NHs which I interpret differently to the other posters - to my mind this means that NH2 is counted twice.<br /><br />Indeed, in his original Table 1, Atenolol has HBD value of 4 (similarly Aciclovir) which arises from counting NH2 as having HBD value of 2.<br /><br />Could someone comment on this? I'm confused!Unknownhttps://www.blogger.com/profile/16641913350284932180noreply@blogger.comtag:blogger.com,1999:blog-7844526396210378482.post-69744430370925033372008-01-15T16:24:00.000+00:002008-01-15T16:24:00.000+00:00The problem is in calcdesc(), and will be fixed in...The problem is in calcdesc(), and will be fixed in the next release of Pybel. 
Until then, here's a workaround:<BR/><BR/>import pybel<BR/>import openbabel as ob<BR/><BR/>HBD = pybel.Smarts("[!#6;!H0]")<BR/>HBA = pybel.Smarts("[$([$([#8,#16]);!$(*=N~O);" +<BR/> "!$(*~N=O);X1,X2]),$([#7;v3;" +<BR/> "!$([nH]);!$(*(-a)-a)])]")<BR/><BR/>def moredesc(mol):<BR/> ans = {}<BR/> ans['RotBonds'] = mol.OBMol.NumRotors()<BR/> ans['HBD'] = len(HBD.findall(mol))<BR/> ans['HBA'] = len(HBA.findall(mol))<BR/> ans['molwt'] = mol.molwt<BR/> return ans<BR/><BR/>descs = {'LogP': ob.OBLogP(), 'PSA': ob.OBPSA(), 'MR': ob.OBMR()}<BR/>def newcalcdesc(mol):<BR/> return dict([(x,y.Predict(mol.OBMol)) for x,y in descs.items()]) <BR/><BR/>output = pybel.Outputfile("sdf", "LipinskiRulesOK.sdf")<BR/><BR/>for mol in pybel.readfile("sdf", "ace_ligands.sdf"):<BR/> mol.data.update(newcalcdesc(mol))<BR/> mol.data.update(moredesc(mol))<BR/> output.write(mol)<BR/><BR/>output.close()Noel O'Boylehttps://www.blogger.com/profile/03288289351940689018noreply@blogger.comtag:blogger.com,1999:blog-7844526396210378482.post-77780328040174885562008-01-15T15:33:00.000+00:002008-01-15T15:33:00.000+00:00I'll check this up, and get back to you...I'll check this up, and get back to you...Noel O'Boylehttps://www.blogger.com/profile/03288289351940689018noreply@blogger.comtag:blogger.com,1999:blog-7844526396210378482.post-36095736019590034262008-01-14T17:52:00.000+00:002008-01-14T17:52:00.000+00:00Noel,thank you for the useful post. In fact I lear...Noel,<BR/>thank you for the useful post. In fact I learnt about pybel from your blog first and only came to pybel wiki and other docs some time after.<BR/><BR/>The script calculating HBD and HBA appears to fail on a large sdf file. 
My view is that HBD.findall leads to a memory leak: at least, what I see is that the process consumes all the memory available on my Ubuntu box and quits with:<BR/>terminate called after throwing an instance of 'std::bad_alloc'<BR/> what(): St9bad_alloc<BR/>Aborted<BR/>Please let me know if you know a remedy.<BR/>Regards,<BR/>PeterPeter Fedichev (Quantum CTO)https://www.blogger.com/profile/06881436001010579010noreply@blogger.comtag:blogger.com,1999:blog-7844526396210378482.post-15935650283536572902007-07-24T23:06:00.000+01:002007-07-24T23:06:00.000+01:00A real quick comment because I'm going to be out o...A real quick comment because I'm going to be out of circulation for the next month or so. Your point about the open nature of the SMARTS is very important because it allows us to have our debate.<BR/><BR/>The acceptor definitions look weird to me. I would normally treat all nitrogens as acceptors unless they are hypervalent, cationic or adjacent to sp2 or sp carbons or nitrogens. Sulfonamide nitrogens are typically pyramidal but the electronic pull of the sulfonyl group is likely to zap any acceptor ability. Oxygen atoms are all likely to have some acceptor ability (except in oxonium cations) but aromatic ethers, 2-connected ester O & furans will be very weak.<BR/><BR/>The definitions you have appear to allow sulfur without eliminating hypervalent sulfur and thioethers. The definitions appear to go to some length to eliminate nitro oxygen but will accept aromatic oxygen.<BR/><BR/>I think the v3 that qualifies the #7 will (correctly) eliminate hypervalent nitrogen and I would agree that pyrrole-like nitrogen [nH] should not be counted as an acceptor. However any 3-connected aromatic nitrogen [nX3] should be included in that category with [nH]. As mentioned above a case can be made for treating all 3-connected nitrogen next to sp2 or sp carbon ( [NX3][c,C&X3,C&X2] ) as non-hydrogen accepting. 
Amide N will be the most frequently encountered example of this type.<BR/><BR/>Finally the !$(*(-a)-a) looks fishy. It eliminates nitrogens singly bonded to two aromatic atoms. Although correct, it's an odd thing to code and it'd be a good idea to find out why it's there. I wondered if the single bonds were meant to be aromatic bonds. If this were the case, it would incorrectly eliminate pyridine nitrogen.<BR/><BR/>Hope this helps a bit. Maybe the original creators of the SMARTS can shed more light.Georg-Martin Krapperhttps://www.blogger.com/profile/15416686863175197568noreply@blogger.comtag:blogger.com,1999:blog-7844526396210378482.post-18338708565578760972007-07-17T20:50:00.000+01:002007-07-17T20:50:00.000+01:00How about Lipinski-like? Well, in any case, the be...How about Lipinski-like? Well, in any case, the beauty of open source is that you can see exactly what definition I used for the HBA and HBD and agree or disagree as you see fit.<BR/><BR/>The definitions I used are from:<BR/><A HREF="http://openbabel.svn.sourceforge.net/viewvc/openbabel/openbabel/trunk/src/smartsdescriptors.cpp?view=markup#l_57" REL="nofollow">Lines 57 and 58</A> of smartsdescriptors.cpp in the OpenBabel development code. This is Chris Morley's code, but it takes the definitions from JOElib, which has <A HREF="http://www-ra.informatik.uni-tuebingen.de/software/joelib/api2/joelib2/feature/types/count/HBA1.html" REL="nofollow">references</A> for HBA and HBD.<BR/><BR/>Apart from not being Lipinski-compliant, if you can see any other problems with these definitions (you seem to read SMARTS and SMILES as a native language), future generations of OpenBabel users will thank you.Noel O'Boylehttps://www.blogger.com/profile/03288289351940689018noreply@blogger.comtag:blogger.com,1999:blog-7844526396210378482.post-88967282399989506682007-07-12T23:23:00.000+01:002007-07-12T23:23:00.000+01:00Thanks for the reference to GMC! I think the SMAR...Thanks for the reference to GMC! 
<BR/><BR/>I think the SMARTS for the Lipinski rule of 5 should be:<BR/><BR/>HBD [#7,#8;!H0]<BR/>HBA [#7,#8]<BR/><BR/>Lipinski defines hydrogen bond acceptors as nitrogen or oxygen and donors as nitrogen or oxygen with one or more hydrogens (section 2.6 of paper). There appears to be a typo in section 3.1 of his paper but he does appear to count heteroatoms rather than hydrogens for donors.<BR/><BR/>Your HBD SMARTS will match thiols which Lipinski does not count as donors. One can criticise the Lipinski definitions but they are what were used for his analysis and should be used for anything claiming to be a Lipinski Ro5 descriptor. Hope this doesn't come across as overly pedantic.Georg-Martin Krapperhttps://www.blogger.com/profile/15416686863175197568noreply@blogger.com
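
The thread turns on two readings of the Lipinski donor count: counting N and O atoms that carry at least one hydrogen (what a SMARTS such as [#7,#8;!H0] matches), or counting N-H and O-H bonds so that NH2 scores 2. The sketch below is only an illustration of that difference in plain Python. It does not call Open Babel or Pybel; the heteroatom list for atenolol is hand-written for this example, and the function names are hypothetical.

```python
# Hand-written (assumed) description of the N and O atoms in atenolol,
# CC(C)NCC(O)COc1ccc(CC(N)=O)cc1, as (element, attached hydrogens).
# This list is an illustration, not output from any toolkit.
ATENOLOL_HETEROATOMS = [
    ("N", 1),  # secondary amine NH
    ("O", 1),  # hydroxyl OH
    ("O", 0),  # aryl ether oxygen
    ("N", 2),  # primary amide NH2
    ("O", 0),  # amide carbonyl oxygen
]


def hbd_atom_count(atoms):
    """Count N/O atoms with at least one hydrogen (NH2 counts once)."""
    return sum(1 for elem, n_h in atoms if elem in ("N", "O") and n_h > 0)


def hbd_hydrogen_count(atoms):
    """Count N-H and O-H bonds (NH2 counts twice)."""
    return sum(n_h for elem, n_h in atoms if elem in ("N", "O"))


def hba_count(atoms):
    """Count every N and O atom, the simple acceptor rule quoted above."""
    return sum(1 for elem, _ in atoms if elem in ("N", "O"))


if __name__ == "__main__":
    print("HBD (atoms):    ", hbd_atom_count(ATENOLOL_HETEROATOMS))
    print("HBD (hydrogens):", hbd_hydrogen_count(ATENOLOL_HETEROATOMS))
    print("HBA (N + O):    ", hba_count(ATENOLOL_HETEROATOMS))
```

For this hand-written atenolol description the per-atom count is 3 and the per-hydrogen count is 4; the latter is the value one commenter reports from Lipinski's Table 1.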