I met Roger Sayle the other day and he was showing me some of the work he's been doing since leaving OpenEye. Apart from OpenEye, you may remember Roger from such programs as RasMol and also GCC (he's the middle-end maintainer; this is apparently the part that fixes your code to be more efficient - see this quiz for examples). He's also the author of the only paper in JCIM to contain Klingon, the Lexichem paper (incidentally this is also, as far as I know [1], the only paper in JCIM that is freely available through Author Choice).
Early this year, Roger set up NextMove Software, which is still in its early days but is one to keep an eye on. It currently has software to aid in checking and correcting the spelling of chemical nomenclature (e.g. in a user interface, a Word document, or text mining) and has database search software in the works.
What I want to highlight here is a bizarre way he found to speed up a Python script (via Andrew Dalke) by turning it into a function. If you go to his page on PPAccelerator, which is all about a potential product that speeds up Pipeline Pilot scripts (by up to 37 times if interpreted, or 1083 times if compiled), you will see timing comparisons of code for the Mandelbrot fractal written in different languages.
You will see two Python v2.5 examples, the second of which is about twice as fast. Look at the code for each. The only difference in the second case is that the entire script has been wrapped in a function (main() in this case), which is called at the end of the script. This is just plain weird! I don't think I've ever heard of this effect before. Apparently, Python optimizes the bytecode of a function but not module level code. If this is written anywhere in the docs, I certainly can't find it.
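As a minimal sketch of the effect (using a hypothetical counting loop rather than the Mandelbrot code from Roger's page), the snippet below runs the same source once at module scope and once wrapped in a main() function, using exec() so that the first version really is compiled as module-level code:

```python
import time

# A hypothetical micro-benchmark loop (not the Mandelbrot code).
loop = """
total = 0
for i in range(5_000_000):
    total += i
"""

# The same loop wrapped in a function and then called, as in the
# second Python example on the PPAccelerator page.
wrapped = "def main():\n" + "".join(
    "    " + line + "\n" for line in loop.strip().splitlines()
) + "main()"

def bench(src):
    """Compile and exec src in a fresh module-level namespace, timed."""
    t0 = time.perf_counter()
    exec(compile(src, "<bench>", "exec"), {})
    return time.perf_counter() - t0

t_module = bench(loop)       # names compile to LOAD_NAME/STORE_NAME
t_function = bench(wrapped)  # names compile to LOAD_FAST/STORE_FAST
print(f"module level: {t_module:.2f}s, in a function: {t_function:.2f}s")
```

On CPython the function version is consistently the faster of the two, though the exact ratio varies with the Python version and the machine.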
On a final note, a quick Google search brings up promise, which may be useful for exploiting such optimizations further.
Notes:
[1] The ACS don't provide any means to list Author Choice articles on their website. Supposedly Author Choice articles should receive increased exposure but the ACS seem to want to hide them.
I think the reason it's faster inside a function is that it doesn't use global variables but local ones, and accessing local variables is faster.
I guess that from within a function it's faster to access local variables versus global variables. But the difference here is so large that there must be something else happening.
Global variable lookup goes through a dictionary. Local variable lookups can be detected during compile time and are done through an index lookup in a local array, in constant time.
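The compile-time distinction described here can be seen directly in the bytecode. A small sketch using CPython's dis module, comparing module-level compiled code with a function body (the names y, z and f are just placeholders):

```python
import dis

# Module-level assignment and lookup compile to STORE_NAME / LOAD_NAME,
# which resolve names through a dictionary at run time.
mod_code = compile("y = 0\nz = y", "<module>", "exec")
mod_ops = {i.opname for i in dis.get_instructions(mod_code)}

# The same names inside a function compile to STORE_FAST / LOAD_FAST,
# an index into a fixed-size array of locals decided at compile time.
def f():
    y = 0
    return y

func_ops = {i.opname for i in dis.get_instructions(f)}

print(sorted(mod_ops & {"LOAD_NAME", "STORE_NAME"}))   # dict-based ops
print(sorted(func_ops & {"LOAD_FAST", "STORE_FAST"}))  # array-based ops
```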
While dict lookup is also O(1) time, the prefactor is the difference here.
The problem is that global variables can be changed (meaning, added or deleted) at any time. Local ones cannot. People have suggested optimizations, on the assumption that the global dict usually doesn't change the number of elements. I don't know if that's gone anywhere.
Where is it documented? It's an implementation-dependent aspect of CPython and might not be true for other implementations. I knew about it from reading the newsgroup years ago and doing testing. You can find mentions of it in passing in various places, like http://www.python.org/doc/essays/list2str.html , but nothing really authoritative.
Out of curiosity, I timed mandelbrot2 as 27 seconds in a main(), and 52 seconds using a global scope. I then tried the same programs using PyPy and got 9.7 and 14 seconds, respectively.
Thanks Andrew. Looks like Xavier was right so.
Yes, he was. I just used more words. ;)