Saturday 21 December 2013

Top 5 favourite blogs of chemistry bloggers

I'm always looking for ways to free up time on the webs, and my New Year's resolution is to avoid reading all Top N lists. These ubiquitous lists somehow compel me to read the contents no matter how trivial. No more. Well, no more after this, my own top N list.

A list of all chemistry blogs is maintained by Peter Maas over at Chemical Blogspace (CB). You can subscribe to one of the feeds over there to keep on top of all of them at the same time (or a relevant subset). If you have a chemistry blog and it's not on there, you should submit it to the site and your readership will balloon overnight.

Keeping on top of new chemistry blogs is tricky though. One way I thought to find new blogs was to collate the blog rolls of all of the existing blogs on CB; to a first approximation one can do this by collating all of the links on each blog's front page. This gives some idea of which blogs other bloggers read.
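
As a rough illustration of this collation step, here is a minimal Python 3 sketch on toy data (the actual script, at the end of the post, is Python 2 and works on downloaded pages). The function name `count_linking_pages`, the `pages` dictionary and the `example` URLs are all made up for the illustration.

```python
import re
from collections import Counter

# Matches quoted absolute http(s) links, e.g. href="http://x.example/"
LINK_RE = re.compile(r'"(https?://[^"]*)"')

def count_linking_pages(pages):
    """Count, for each link, how many pages contain it at least once.

    `pages` maps a blog URL to the HTML of its front page (toy data here).
    """
    counts = Counter()
    for html in pages.values():
        # A set, so that a page votes at most once for any given link
        counts.update(set(LINK_RE.findall(html)))
    return counts

# Made-up front pages for three hypothetical blogs
pages = {
    "http://a.example/": '<a href="http://x.example/">x</a> <a href="http://x.example/">x again</a>',
    "http://b.example/": '<a href="http://x.example/">x</a> <a href="http://y.example/">y</a>',
    "http://c.example/": '<a href="http://y.example/">y</a> <a href="http://x.example/">x</a>',
}

counts = count_linking_pages(pages)
for link, n in counts.most_common():
    print(n, link)
```

Counting pages rather than raw occurrences means a blog that links to the same site ten times still contributes only one vote.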

Ignoring links to the blogging platforms themselves, the Top 5 are Derek Lowe's "In the Pipeline", Egon Willighagen's "Chem-bla-ics", Paul Bracher's "ChemBark", Milkshake's "Org Prep Daily" and Nature Chemistry's "The Sceptical Chymist".

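And here are the raw results (code at the end of the post), sorted by frequency of occurrence. Each count is the number of blog front pages on which the link was found; links found on only one page are left out.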
34 http://pipeline.corante.com/
31 https://subscribe.wordpress.com/
27 http://wordpress.org/
19 http://chem-bla-ics.blogspot.com/
18 http://blog.chembark.com/
17 http://wordpress.com/
16 http://orgprepdaily.wordpress.com/
16 http://blogs.nature.com/thescepticalchymist/
15 http://www.chemspider.com/blog/
12 http://gaussling.wordpress.com/
12 http://ashutoshchemist.blogspot.com/
12 http://www.blogger.com/
11 http://depth-first.com/
11 http://wwmm.ch.cam.ac.uk/blogs/murrayrust/
11 http://www.chemistry-blog.com/
10 http://www.thechemblog.com/
10 http://curlyarrow.blogspot.com/
10 http://usefulchem.blogspot.com/
10 http://www.statcounter.com/
9 http://miningdrugs.blogspot.com/
9 http://cultureofchemistry.blogspot.com/
9 http://chemjobber.blogspot.com/
9 http://www.sciencebase.com/science-blog/
9 http://scienceblogs.com/moleculeoftheday/
9 http://www.chemspider.com/
8 http://kinasepro.wordpress.com/
8 http://cenblog.org/
8 http://liquidcarbon.livejournal.com/
8 http://chemistrylabnotebook.blogspot.com/
7 http://www.chemicalforums.com/
7 http://omicsomics.blogspot.com/
7 http://transitionstate.wordpress.com/
7 http://blog.chemicalforums.com/
7 http://homebrewandchemistry.blogspot.com/
6 http://www.fieldofscience.com/
6 http://www.coronene.com/blog/
6 http://blog.everydayscientist.com/
6 http://totallymechanistic.wordpress.com/
6 http://scienceblogs.com/insolence/
6 http://www.bccms.uni-bremen.de/cms/
6 http://www.researchblogging.org/
6 http://www.nature.com/
6 http://outonalims.wordpress.com/
6 http://www.totallysynthetic.com/blog/
6 http://www.tiberlab.com/
6 http://www.rscweb.org/blogs/cw/
5 http://wiki.cubic.uni-koeln.de/cb/
5 http://organometallics.blogspot.com/
5 http://baoilleach.blogspot.com/
5 http://wavefunction.fieldofscience.com/
5 http://atompusher.blogspot.com/
5 http://feedburner.google.com/
5 http://chemical-quantum-images.blogspot.com/
5 http://propterdoc.blogspot.com/
5 http://allthingsmetathesis.com/
5 http://cb.openmolecules.net/
5 http://the-half-decent-pharmaceutical-chemistry-blog.chemblogs.org/
5 http://www.ch.ic.ac.uk/rzepa/blog/
5 http://usefulchem.wikispaces.com/
4 http://www.rsc.org/chemistryworld/
4 http://justlikecooking.blogspot.com/
4 http://synthreferee.wordpress.com/
4 http://mndoci.com/blog/
4 http://scienceblogs.com/pharyngula/
4 http://jmgs.wordpress.com/
4 http://pubchem.ncbi.nlm.nih.gov/
4 http://walkerma.wordpress.com/
4 http://gmc2007.blogspot.com/
4 http://purl.org/dc/elements/1.1/
4 http://coronene.blogspot.com/
4 http://naturalproductman.wordpress.com/
4 http://blog.rguha.net/
4 http://syntheticnature.wordpress.com/
4 http://www.organic-chemistry.org/
4 http://www.emolecules.com/
4 http://www.livejournal.com/
4 http://carbontet.blogspot.com/
4 http://therealmoforganicsynthesis.blogspot.com/
4 http://syntheticenvironment.blogspot.com/
4 http://totallymedicinal.wordpress.com/
4 http://comporgchem.com/blog/
4 http://www.jungfreudlich.de/
4 http://graphiteworks.wordpress.com/
4 http://chemicalcrystallinity.blogspot.com/
4 http://www.ebi.ac.uk/
4 http://scienceblogs.com/principles/
4 http://sanjayat.wordpress.com/
4 http://greenchemtech.blogspot.com/
4 http://www.feedburner.com/
4 https://www.ebi.ac.uk/chembl/
3 http://blogs.nature.com/
3 http://chiraljones.wordpress.com/
3 http://pubs.acs.org/journals/joceah/
3 http://cen07.wordpress.com/
3 http://www.natureasia.com/
3 http://masterorganicchemistry.com/
3 http://chem-eng.blogspot.com/
3 http://pubs.acs.org/journals/orlef7/
3 http://molecularmodelingbasics.blogspot.com/
3 http://chemistswithoutborders.blogspot.com/
3 http://www.typepad.com/
3 http://wordpress.org/extend/ideas/
3 http://www.sciencebase.com/
3 http://codex.wordpress.org/
3 http://chemicalmusings.blogspot.com/
3 http://chemicalblogspace.blogspot.com/
3 http://wordpress.org/extend/themes/
3 http://drexel-coas-elearning.blogspot.com/
3 http://cenblog.org/terra-sigillata/
3 http://planet.wordpress.org/
3 http://l-stat.livejournal.com/
3 http://wordpress.org/extend/plugins/
3 http://jmol.sourceforge.net/
3 http://chembl.blogspot.com/
3 http://theme.wordpress.com/themes/regulus/
3 http://scientopia.org/blogs/ethicsandscience/
3 http://www.google.com/
3 http://youngfemalescientist.blogspot.com/
3 http://www.badscience.net/
3 http://paulingblog.wordpress.com/
3 http://l-api.livejournal.com/
3 http://waterinbiology.blogspot.com/
3 http://gmpg.org/xfn/
3 http://totallysynthetic.com/blog/
3 http://chemicalmusings.wordpress.com/
3 http://liberalchemistry.blogspot.com/
3 http://www.hdreioplus.de/wordpress/
3 http://pubs.acs.org/
3 http://wordpress.org/news/
3 http://verpa.wordpress.com/
3 http://www.opentox.org/
3 http://pubs.acs.org/journals/jacsat/
3 http://www.chemical-chimera.blogspot.com/
3 http://www.kilomentor.com/
3 http://invivoblog.blogspot.com/
3 http://blog.khymos.org/
3 http://www.livejournal.com/search/
3 http://chemicalsabbatical.blogspot.com/
3 http://wordpress.org/support/
3 http://scienceblogs.com/clock/
3 http://profmaster.blogspot.com/
3 http://www.orgsyn.org/
3 http://www.surechembl.org/
3 http://www.ebi.ac.uk/chebi/
3 http://scienceblogs.com/aetiology/
3 http://www.mazepath.com/uncleal/
3 http://syntheticremarks.com/
3 http://www.nature.com/nchem/
2 http://www.syntheticpages.org/
2 http://www.eyeonfda.com/
2 http://genchemist.wordpress.com/
2 http://cenblog.org/newscripts/2013/12/amusing-news-aliquots-128/
2 http://blogs.scientificamerican.com/the-curious-wavefunction/2013/10/09/computational-chemistry-wins-2013-nobel-prize-in-chemistry/
2 http://cenblog.org/just-another-electron-pusher/
2 http://www.uu.se/
2 http://bugs.bioclipse.net/
2 http://www.ch.cam.ac.uk/magnus/
2 http://archive.tenderbutton.com/
2 http://cenblog.org/the-safety-zone/2013/12/csb-report-on-chevron-refinery-fire-urges-new-regulatory-approach/
2 http://pubs.acs.org/cen/
2 http://interfacialscience.blogspot.com/
2 http://www.blogtopsites.com/
2 http://www.amazingcounters.com/
2 http://www.chemheritage.org/
2 http://drexelisland.wikispaces.com/
2 http://intermolecular.wordpress.com/
2 http://www.aldaily.com/
2 http://www.plos.org/
2 http://kashthealien.wordpress.com/
2 http://web.expasy.org/groups/swissprot/
2 http://theme.wordpress.com/themes/enterprise/
2 http://cenboston.wordpress.com/
2 http://infiniflux.blogspot.com/
2 http://laserjock.wordpress.com/
2 http://bkchem.zirael.org/
2 http://www.qdinformation.com/qdisblog/
2 http://syntheticorganic.blogspot.com/
2 http://www.sciencebasedmedicine.org/
2 http://scienceblogs.com/pontiff/
2 http://www.chemistryguide.org/
2 http://pubs.acs.org/journals/jmcmar/
2 http://blog.openwetware.org/scienceintheopen/
2 http://cenblog.org/transition-states/
2 http://theme.wordpress.com/themes/contempt/
2 http://www.fiercebiotech.com/
2 http://www.paulbracher.com/blog/
2 http://scienceblogs.com/ethicsandscience/
2 http://tripod.nih.gov/
2 http://sciencegeist.net/
2 http://spectroscope.blogspot.com/
2 http://kilomentor.chemicalblogs.com/
2 http://pharmagossip.blogspot.com/
2 http://chembioinfo.com/
2 http://philipball.blogspot.com/
2 http://browsehappy.com/
2 http://www.realclimate.org/
2 http://chem.vander-lingen.nl/
2 http://cenblog.org/the-safety-zone/2013/12/lab-safety-is-critical-in-high-school-too/
2 http://joaquinbarroso.com/
2 http://eristocracy.co.uk/brsm/
2 http://daneelariantho.wordpress.com/
2 http://blogs.discovermagazine.com/cosmicvariance/
2 http://johnirwin.docking.org/
2 http://www.scienceblog.com/cms/
2 http://www.chemistry-blog.com/2013/12/11/number-11-hydrogen/
2 http://thebioenergyblog.blogspot.com/
2 http://www.eclipse.org/
2 http://www.ebyte.it/stan/
2 http://scienceblogs.com/
2 http://boscoh.com/
2 http://www.nature.com/nchem/journal/v6/n1/
2 http://cdavies.wordpress.com/
2 http://chemistandcook.blogspot.com/
2 http://www.rheothing.blogspot.com/
2 http://openbabel.org/
2 http://www.metabolomics2012.org/
2 http://cenblog.org/grand-central/
2 http://www.scilogs.es/
2 http://openflask.blogspot.com/
2 http://d3js.org/
2 http://stuartcantrill.com/
2 http://blogs.discovermagazine.com/gnxp/
2 http://www.bioclipse.net/
2 http://rajcalab.wordpress.com/
2 http://bacspublish.blogspot.com/
2 https://cszamudio.wordpress.com/
2 http://scienceblogs.com/sciencewoman/
2 http://theorganicsolution.wordpress.com/2013/12/12/my-top-10-chemistry-papers-of-2013/
2 http://chem242.wikispaces.com/
2 http://icpmassspectrometry.blogspot.com/
2 http://theme.wordpress.com/themes/ocean-mist/
2 http://theme.wordpress.com/themes/andreas09/
2 http://impactstory.org/
2 http://blog.metamolecular.com/
2 http://lamsonproject.org/
2 http://agilemolecule.wordpress.com/
2 http://proteinsandwavefunctions.blogspot.com/
2 http://cenblog.org/newscripts/2013/12/heirloom-chemistry-set/
2 http://beautifulphotochemistry.wordpress.com/
2 http://www.etracker.com/
2 http://chem241.wikispaces.com/
2 http://practicalfragments.blogspot.com/
2 http://www.nobelprize.org/nobel_prizes/chemistry/laureates/2013/
2 http://researchblogging.org/
2 http://retractionwatch.wordpress.com/
2 http://u-of-o-nmr-facility.blogspot.com/
2 http://www3.interscience.wiley.com/cgi-bin/jhome/26293/
2 http://scienceblogs.com/goodmath/
2 http://creativecommons.org/licenses/by-nc-sa/3.0/
2 http://cenblog.org/the-haystack/
2 http://www.simbiosys.com/
2 http://www.steinbeck-molecular.de/steinblog/
2 http://chemistry.about.com/
2 http://cen.acs.org/
2 http://cniehaus.livejournal.com/
2 http://chem.chem.rochester.edu/~nvd/
2 http://www.chemtube3d.com/
2 http://news.google.com/
2 http://theme.wordpress.com/themes/digg3/
2 http://theme.wordpress.com/themes/mistylook/
2 http://www.wordpress.org/
2 http://luysii.wordpress.com/
2 http://disqus.com/
2 http://openbabel.sourceforge.net/
2 http://networkedblogs.com/
2 http://www.chemspy.com/
2 http://www.openphacts.org/
2 http://cic-fachgruppe.blogspot.com/
2 http://weconsent.us/
2 http://cdktaverna.wordpress.com/
2 http://gilleain.blogspot.com/
2 http://scienceblogs.com/scientificactivist/
2 http://synchemist.blogspot.com/
2 http://www.compchemhighlights.org/
2 http://acdlabs.typepad.com/elucidation/
2 http://cdk.sf.net/
2 http://www.phds.org/
2 http://zinc.docking.org/
2 http://www.ch.imperial.ac.uk/rzepa/blog/
2 http://www.cas.org/
2 http://brsmblog.com/
2 http://www.sciencetext.com/
2 http://altchemcareers.wordpress.com/
2 http://theeccentricchemist.blogspot.com/
2 http://www.sciscoop.com/
2 http://www.agile2robust.com/
2 http://mwclarkson.blogspot.com/
2 http://www.jcheminf.com/
2 http://www.tns-counter.ru/V13a****sup_ru/ru/UTF-8/tmsec=lj_noncyr/
2 http://www.slideshare.net/
2 http://scienceblogs.com/eruptions/
2 http://www.scilogs.com/
2 http://scienceblogs.com/seejanecompute/
2 http://blog.tenderbutton.com/
2 http://www.amazingcounter.com/
2 http://creativecommons.org/licenses/by/3.0/
2 http://scienceblogs.com/greengabbro/
2 http://madskills.com/public/xml/rss/module/trackback/
2 http://edheckarts.wordpress.com/
2 http://www.scilogs.fr/
2 http://feed.informer.com/
2 http://scienceblogs.com/chaoticutopia/
2 http://www.chemaxon.com/
2 http://www.rsc.org/mpemba-competition/
2 http://neksa.blogspot.com/
2 http://orgchem.livejournal.com/
2 http://scienceblogs.com/transcript/
2 http://drexel-coas-elearning-transcripts.blogspot.com/
2 http://pmgb.wordpress.com/
2 http://www.acs.org/
2 http://www.scienceblogs.com/
2 http://www.milomuses.com/chemicalmusings/
2 http://www.the-scientist.com/
2 http://calvinus.wordpress.com/
2 http://www.pandasthumb.org/

import re
import os
import urllib
from collections import defaultdict

from bs4 import BeautifulSoup

def seedFromCB():
    # Download the CB blog listing, page by page, unless it is already cached in ./CB
    if not os.path.isdir("CB"):
        os.mkdir("CB")
        url = "http://cb.openmolecules.net/blogs.php?skip=%d"
        N = 0
        while True:
            page = urllib.urlopen(url % N)
            # Need to remove apostrophes from tag URLs or BeautifulSoup will choke
            html = page.read().replace("you'll", "youll").replace("what's", "whats")
            if html.find("0 total") >= 0: break
            print >> open(os.path.join("CB", "CB_%d.html" % N), "w"), html
            N += 10
    N = 0
    allurls = []
    while True:
        filename = os.path.join("CB", "CB_%d.html" % N)
        if not os.path.isfile(filename): break

        soup = BeautifulSoup(open(filename).read())
        divs = soup.find_all("div", class_="blogbox_byline")
        urls = []
        for div in divs:
            children = div.find_all("a")
            anchor = children[2]
            urls.append(anchor['href'])

        print filename, len(urls)
        allurls.extend(urls)
        N += 10

    notfound = ["http://imagingchemistry.com", "http://www.caspersteinmann.dk"]
    allurls = [url for url in allurls if url not in notfound]
    return allurls

def url2file(url):
    # Turn a URL into a usable filename by replacing "/", ":" and "?"
    for x in "/:?":
        url = url.replace(x, "_")
    return url

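def norm(url):
    # Normalise a link so that different spellings of the same address are counted together
    # Bare domains get a trailing slash
    for x in [".org", ".com", ".es", ".fr", ".net"]:
        if url.endswith(x):
            return url + "/"
    # Default page names are stripped
    for txt in ["index.html", "index.php", "index.htm", "blog.html"]:
        if url.endswith(txt):
            return url[:-len(txt)]
    # "In the Pipeline" is linked under two addresses; map one onto the other
    if url == "http://www.corante.com/pipeline/":
        return "http://pipeline.corante.com/"
    return url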

def getAllLinks(urls):
    if not os.path.isdir("Blogs"):
        os.mkdir("Blogs")
    for url in urls:
        filename = os.path.join("Blogs", url2file(url))
        print filename
        if not os.path.isfile(filename):
            html = urllib.urlopen(url).read()
            print >> open(filename, "w"), html

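    # Count the front pages on which each link appears
    countlinks = defaultdict(int)
    for url in urls:
        filename = os.path.join("Blogs", url2file(url))
        # Find every quoted http(s)/ftp(s) URL, then keep only the http(s) ones
        links = re.findall('"((http|ftp)s?://.*?)"', open(filename).read())
        # A set, so that repeated links on one page are counted once
        links = set([x[0] for x in links if x[1]=='http'])
        for link in links:
            countlinks[norm(link)] += 1
    return countlinks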

if __name__ == "__main__":
    blogURLs = seedFromCB()

    countlinks = getAllLinks(blogURLs)

    tmp = countlinks.items()
    tmp.sort(key=lambda x:x[1], reverse=True)
    err = open("err.txt", "w")
    for x, y in tmp:
        if y > 1:
            if x.endswith("/") and not (y==6 and "accelrys" in x) and not (y==3 and "fieldofscience" in x):
                print y, x
            else:
                print >> err, y, x
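
The raw list mixes real blogs with links to blogging platforms and widgets (wordpress.org, blogger.com, statcounter.com and so on). One possible way to get from the raw counts to a Top 5 is to skip a list of such hosts. The following is only a sketch in Python 3, not part of the script above: the `PLATFORM_HOSTS` blocklist and the `top_blogs` function are hypothetical, and the counts are just the first eight entries of the raw results.

```python
from urllib.parse import urlparse

# Hypothetical blocklist of platform and widget hosts; extend as needed
PLATFORM_HOSTS = {
    "subscribe.wordpress.com", "wordpress.org", "wordpress.com",
    "www.blogger.com", "www.statcounter.com",
}

def top_blogs(countlinks, n=5, skip_hosts=PLATFORM_HOSTS):
    """Return the n most-linked URLs whose host is not in skip_hosts."""
    # sorted() is stable, so ties keep their original order
    ranked = sorted(countlinks.items(), key=lambda kv: kv[1], reverse=True)
    blogs = [(url, count) for url, count in ranked
             if urlparse(url).hostname not in skip_hosts]
    return blogs[:n]

# The first eight entries of the raw results
raw = {
    "http://pipeline.corante.com/": 34,
    "https://subscribe.wordpress.com/": 31,
    "http://wordpress.org/": 27,
    "http://chem-bla-ics.blogspot.com/": 19,
    "http://blog.chembark.com/": 18,
    "http://wordpress.com/": 17,
    "http://orgprepdaily.wordpress.com/": 16,
    "http://blogs.nature.com/thescepticalchymist/": 16,
}

for url, count in top_blogs(raw):
    print(count, url)
```

Matching on the exact hostname keeps blogs hosted on a platform (such as orgprepdaily.wordpress.com) while dropping links to the platform's own pages.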

Sunday 15 December 2013

QM Speed Test: ERKALE

I've checked in the first results from the speed test: ERKALE, an open-source QM package by Susi Lehtola. You can find full details at https://github.com/baoilleach/qmspeedtest but here's the summary:

HF

QM Package   Time (min)   Steps   Time per step (min)   Total E          HOMO        LUMO
erkale       810          90      9                     -644.67570139    -0.353712   0.074269

B3LYP

QM Package   Time (min)   Steps   Time per step (min)   Total E          HOMO        LUMO
erkale       933          58      16.1                  -686.49566820    -0.260899   -0.064457
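
The "per step" column is simply the total time divided by the number of steps. A quick check in Python 3, reading the rows as 810 min over 90 steps for HF and 933 min over 58 steps for B3LYP (the `results` and `per_step` names are just for this illustration):

```python
# (total time in minutes, number of steps) as reported in the two tables
results = {"HF": (810, 90), "B3LYP": (933, 58)}

# Minutes per step, rounded to one decimal place
per_step = {method: round(minutes / steps, 1)
            for method, (minutes, steps) in results.items()}

for method, value in per_step.items():
    print(method, value)
```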

ERKALE is a very interesting project as it shows how the existing ecosystem of open source QM and maths libraries can be exploited as part of a modern QM package. But is it fast or slow? Who knows - we'll have to measure some more programs first...

Feel free to play along by forking the project and checking in your own results.

Note to self: The ERKALE version tested was SVN r1013 (2013-10-21). I had to do an extra single point calculation afterwards to find out the energy of the HOMO/LUMO. This time was not included in the assessment (as it shouldn't really have been necessary).