I recently attended the 4th Joint Sheffield Conference on Chemoinformatics, and enjoyed it a lot.
Some of the things I noted were:
(1) Industry talks have got more interesting (than they were, that is). After describing methods run on their own in-house data, the speakers now say "In order to compare with other methods, I ran this method on a publicly available dataset". Great. About time. The big free datasets are now so well established that they've even heard about them in industry. Thanks go to ZINC, DUD, PubChem, and the NIH guys who are even making available some HTS data (is this correct? I cannot easily find it on the web).
(2) In this postmodern age, it is now a requirement for all cheminformatics conferences to start with a talk that tells us we're wasting our time trying to dock anything, as it doesn't work. Full marks for shock value, but perhaps the more interesting content of Anthony Nicholls' talk related to statistical comparisons of AUC (area under the ROC curve) for a published study of multiple docking problems (Warren et al.). Basically, the error bars are so large that we cannot say any program is significantly different from any other (according to him :-) [disclaimer, I'm developing GOLD]).
(3) Of course, no cheminformatics conference would be complete without dodgy statistics, and I'm as guilty of it (not knowingly, I hope) as anyone. Multiple tests on the same dataset require a correction to the significance threshold, such as the Bonferroni correction, which divides the threshold by the number of tests - "if it passes the Bonferroni it's probably true" was the quote from Martin Packer (AZ). Everyone wrote that one down when it was mentioned. But for the cheminformaticians who skipped Statistics 101, there was extra homework. Jonathan Hirst directed us to read the appendix of one of his papers for some more light reading on hard-core statistics such as the Nemenyi test and the improved Friedman statistic.
(4) Open Source chemistry software got a mention from some of the academics speaking. Jonathan Hirst in particular gave Joelib2 a big thumbs up, and made it clear that his own software is Openly available from his web page (although no license is mentioned in the README there). The author of Joelib2, Joerg Kurt Wegner, was in the audience, and it would have been nice if the speaker had put Joerg's name on his slide along with the name of the program and the website. After all, it's nice to get some personal recognition if you put a lot of work into such a program and then make it Openly available. Jonathan had to skip his next-to-last slide promoting the Blue Obelisk group, but it was still good to see the reference flash by. Irilenia Nobeli used the CDK, as did David Wild, who is very active in the development of Web services with open source software.
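
For anyone who wants to play with the statistics in points (2) and (3), here is a rough Python sketch. It is not the method used in any of the talks: the percentile bootstrap is just one simple way to put error bars on an AUC, and the function names, scores and p-values are all made up for illustration.

```python
import random


def auc(actives, decoys):
    """AUC as the chance that a random active outscores a random decoy.

    Ties count as half a win. Higher score is assumed to mean "more active".
    """
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


def bootstrap_auc_interval(actives, decoys, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed, simple choice).

    Actives and decoys are resampled separately, with replacement.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_boot):
        a = rng.choices(actives, k=len(actives))
        d = rng.choices(decoys, k=len(decoys))
        samples.append(auc(a, d))
    samples.sort()
    lower = samples[int((alpha / 2) * n_boot)]
    upper = samples[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def bonferroni(p_values, alpha=0.05):
    """Return the per-test threshold (alpha / number of tests) and pass flags."""
    threshold = alpha / len(p_values)
    return threshold, [p <= threshold for p in p_values]


# Made-up docking scores for one hypothetical program on one hypothetical target.
example_actives = [0.9, 0.8, 0.75, 0.6, 0.4]
example_decoys = [0.7, 0.5, 0.45, 0.3, 0.2, 0.1]

print("AUC:", auc(example_actives, example_decoys))
print("95% interval:", bootstrap_auc_interval(example_actives, example_decoys))

# Made-up p-values from three pairwise comparisons between programs.
print("Bonferroni:", bonferroni([0.001, 0.02, 0.04]))
```

With only a handful of actives the bootstrap interval comes out wide, which is the flavour of the error-bar problem in point (2). The Bonferroni threshold shrinks as the number of comparisons grows, so comparing many programs pairwise makes it harder for any single difference to pass.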