Comments on Noel O'Blog: A non-random method to improve your QSAR results - works every time! Part II

Vincent (2012-11-27T17:14:50.057+00:00):
Yep, my comment was more targeted at George's post :)

Noel O'Boyle (2012-11-26T09:15:09.116+00:00):
Don't get me wrong - diversity selection has its place. Selecting a diverse set of molecules, for example! My comments are specifically regarding unbiased testing of a predictive QSAR model.

Vincent (2012-11-26T08:31:13.082+00:00):
There are some other studies that state otherwise regarding diversity picking for HTS (funny part is that it is also a Novartis study!): http://www.ncbi.nlm.nih.gov/pubmed/16562980

Difficult to compare the two purposes though... (QSAR training/test set vs. diversity for screening)

Noel O'Boyle (2012-11-02T16:25:08.662+00:00):
Thanks for that George. The evidence is mounting...

Anonymous (2012-11-02T14:01:31.439+00:00):
I couldn't agree more with your post Noel.
There's also related evidence that sophisticated diversity selection methods do not actually perform better than random picking:
http://www.ncbi.nlm.nih.gov/pubmed/16562980

Noel O'Boyle (2012-11-01T14:24:41.259+00:00):
I certainly agree. What I had in mind by way of a paper would be to try all possible training/test set splits and make an RMSD histogram, and then highlight where on the histogram a "rationally-selected" data set would appear.

Egon Willighagen (2012-11-01T14:10:57.798+00:00):
I always found bootstrapping a nice way to estimate model prediction error due to data set composition. Using that, you can easily show that below some 100 molecules, the effect of 'selecting' a training set becomes dangerously large.