Thursday 14 April 2011

Review of Data Analysis with Open Source Tools

A little while ago, I reviewed Gnuplot in Action by Philipp Janert. One of the aspects of that book that I liked the most was the inclusion of more general discussions about data analysis and appropriate methods.

Philipp has now published a book in which those discussions are the focus and the software is mentioned only in passing. Data Analysis with Open Source Tools was published in January 2011 by O'Reilly.

I got my hands on a review copy and wrote it up for the Journal of Cheminformatics:
...This is a real practitioner's book. Janert, a former physicist and software engineer, is a consultant in data analysis and mathematical modelling. He has taken his hard-won knowledge and tried to get it all down on paper for the reader's benefit. For example, in a chapter with the provocative title of "What you really need to know about classical statistics" he explains why introductory statistics textbooks seem to cover methods and topics at odds with the problems data analysts deal with day-to-day; essentially classical methods were developed at a time of small and expensive datasets and no computational power, and hypothesis testing focused on determining whether an effect existed. Today we have ample computing power and may be dealing with very large datasets; also, we are usually more interested in the size of an effect (practical significance) rather than just whether it exists (statistical significance)...

To read the full review, head on over to J. Cheminf. 2011, 3:10.
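
The excerpt's distinction between statistical and practical significance is easy to see with a little arithmetic. The following is my own minimal sketch, not code from the book: it assumes a two-sample z-test with equal group sizes and unit variance, and the function name and the effect size of 0.02 standard deviations are made up for illustration. The same tiny effect goes from "not significant" to overwhelmingly "significant" simply because the dataset grows.

```python
import math


def two_sample_z_p_value(effect_size: float, n_per_group: int) -> float:
    """Two-sided p-value for a two-sample z-test.

    Assumes equal group sizes and unit variance in both groups, so
    effect_size is the standardized difference in means.
    """
    # Standard error of the difference in means is sqrt(2 / n).
    z = effect_size * math.sqrt(n_per_group / 2)
    # Two-sided tail probability of a standard normal beyond |z|.
    return math.erfc(abs(z) / math.sqrt(2))


tiny_effect = 0.02  # hypothetical: 2% of a standard deviation

for n in (100, 10_000, 1_000_000):
    p = two_sample_z_p_value(tiny_effect, n)
    print(f"n per group = {n:>9,}   effect = {tiny_effect}   p = {p:.3g}")
```

The effect size is identical in all three rows; only the p-value changes. With a large enough dataset almost any difference becomes statistically significant, which is why the size of the effect is usually the more useful thing to report.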
