A list of all chemistry blogs is maintained by Peter Maas over at Chemical Blogspace (CB). You can subscribe to one of the feeds over there to keep on top of all of them at the same time (or a relevant subset). If you have a chemistry blog and it's not on there, you should submit it to the site and your readership will balloon overnight.
Keeping on top of new chemistry blogs is tricky though. One way I thought to find new blogs was to collate the blog rolls of all of the existing blogs on CB; to a first approximation one can do this by collating all of the links on each blog's front page. This gives some idea of which blogs other bloggers read.
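The collation idea can be sketched in a few lines of Python 3. This is only an illustration under stated assumptions, not the script used for the results: the `blog-*.example` addresses and the page contents are invented, and `count_links` is a hypothetical helper name. Each front page contributes at most one vote per linked URL.

```python
import re
from collections import Counter

# Hypothetical front pages: the blog-*.example URLs and the HTML are invented.
pages = {
    "http://blog-a.example/": (
        '<a href="http://pipeline.corante.com/">In the Pipeline</a> '
        '<a href="http://blog-b.example/">B</a> '
        '<a href="http://pipeline.corante.com/">same link again</a>'
    ),
    "http://blog-b.example/": '<a href="http://pipeline.corante.com/">Pipeline</a>',
    "http://blog-c.example/": (
        '<a href="http://pipeline.corante.com/">Pipeline</a> '
        '<a href="http://blog-b.example/">B</a>'
    ),
}

# Treat any double-quoted http(s) URL as a link (a crude first approximation).
LINK_RE = re.compile(r'"(https?://[^"]+)"')


def count_links(pages):
    """Count, for each URL, how many front pages link to it at least once."""
    counts = Counter()
    for html in pages.values():
        # The set means a page that repeats a link still counts only once.
        counts.update(set(LINK_RE.findall(html)))
    return counts


counts = count_links(pages)
for url, n in counts.most_common():
    print(n, url)
```

Counting pages rather than raw link occurrences stops a single blog that repeats a link in its sidebar and footer from inflating the tally.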
Setting aside links to blogging platforms such as WordPress, the Top 5 are Derek Lowe's "In the Pipeline", Egon Willighagen's "Chem-bla-ics", Paul Bracher's "ChemBark", Milkshake's "Org Prep Daily" and Nature Chemistry's "The Sceptical Chymist".
And here are the raw results (code at the end of post) sorted by frequency of occurrence.
34 http://pipeline.corante.com/
31 https://subscribe.wordpress.com/
27 http://wordpress.org/
19 http://chem-bla-ics.blogspot.com/
18 http://blog.chembark.com/
17 http://wordpress.com/
16 http://orgprepdaily.wordpress.com/
16 http://blogs.nature.com/thescepticalchymist/
15 http://www.chemspider.com/blog/
12 http://gaussling.wordpress.com/
12 http://ashutoshchemist.blogspot.com/
12 http://www.blogger.com/
11 http://depth-first.com/
11 http://wwmm.ch.cam.ac.uk/blogs/murrayrust/
11 http://www.chemistry-blog.com/
10 http://www.thechemblog.com/
10 http://curlyarrow.blogspot.com/
10 http://usefulchem.blogspot.com/
10 http://www.statcounter.com/
9 http://miningdrugs.blogspot.com/
9 http://cultureofchemistry.blogspot.com/
9 http://chemjobber.blogspot.com/
9 http://www.sciencebase.com/science-blog/
9 http://scienceblogs.com/moleculeoftheday/
9 http://www.chemspider.com/
8 http://kinasepro.wordpress.com/
8 http://cenblog.org/
8 http://liquidcarbon.livejournal.com/
8 http://chemistrylabnotebook.blogspot.com/
7 http://www.chemicalforums.com/
7 http://omicsomics.blogspot.com/
7 http://transitionstate.wordpress.com/
7 http://blog.chemicalforums.com/
7 http://homebrewandchemistry.blogspot.com/
6 http://www.fieldofscience.com/
6 http://www.coronene.com/blog/
6 http://blog.everydayscientist.com/
6 http://totallymechanistic.wordpress.com/
6 http://scienceblogs.com/insolence/
6 http://www.bccms.uni-bremen.de/cms/
6 http://www.researchblogging.org/
6 http://www.nature.com/
6 http://outonalims.wordpress.com/
6 http://www.totallysynthetic.com/blog/
6 http://www.tiberlab.com/
6 http://www.rscweb.org/blogs/cw/
5 http://wiki.cubic.uni-koeln.de/cb/
5 http://organometallics.blogspot.com/
5 http://baoilleach.blogspot.com/
5 http://wavefunction.fieldofscience.com/
5 http://atompusher.blogspot.com/
5 http://feedburner.google.com/
5 http://chemical-quantum-images.blogspot.com/
5 http://propterdoc.blogspot.com/
5 http://allthingsmetathesis.com/
5 http://cb.openmolecules.net/
5 http://the-half-decent-pharmaceutical-chemistry-blog.chemblogs.org/
5 http://www.ch.ic.ac.uk/rzepa/blog/
5 http://usefulchem.wikispaces.com/
4 http://www.rsc.org/chemistryworld/
4 http://justlikecooking.blogspot.com/
4 http://synthreferee.wordpress.com/
4 http://mndoci.com/blog/
4 http://scienceblogs.com/pharyngula/
4 http://jmgs.wordpress.com/
4 http://pubchem.ncbi.nlm.nih.gov/
4 http://walkerma.wordpress.com/
4 http://gmc2007.blogspot.com/
4 http://purl.org/dc/elements/1.1/
4 http://coronene.blogspot.com/
4 http://naturalproductman.wordpress.com/
4 http://blog.rguha.net/
4 http://syntheticnature.wordpress.com/
4 http://www.organic-chemistry.org/
4 http://www.emolecules.com/
4 http://www.livejournal.com/
4 http://carbontet.blogspot.com/
4 http://therealmoforganicsynthesis.blogspot.com/
4 http://syntheticenvironment.blogspot.com/
4 http://totallymedicinal.wordpress.com/
4 http://comporgchem.com/blog/
4 http://www.jungfreudlich.de/
4 http://graphiteworks.wordpress.com/
4 http://chemicalcrystallinity.blogspot.com/
4 http://www.ebi.ac.uk/
4 http://scienceblogs.com/principles/
4 http://sanjayat.wordpress.com/
4 http://greenchemtech.blogspot.com/
4 http://www.feedburner.com/
4 https://www.ebi.ac.uk/chembl/
3 http://blogs.nature.com/
3 http://chiraljones.wordpress.com/
3 http://pubs.acs.org/journals/joceah/
3 http://cen07.wordpress.com/
3 http://www.natureasia.com/
3 http://masterorganicchemistry.com/
3 http://chem-eng.blogspot.com/
3 http://pubs.acs.org/journals/orlef7/
3 http://molecularmodelingbasics.blogspot.com/
3 http://chemistswithoutborders.blogspot.com/
3 http://www.typepad.com/
3 http://wordpress.org/extend/ideas/
3 http://www.sciencebase.com/
3 http://codex.wordpress.org/
3 http://chemicalmusings.blogspot.com/
3 http://chemicalblogspace.blogspot.com/
3 http://wordpress.org/extend/themes/
3 http://drexel-coas-elearning.blogspot.com/
3 http://cenblog.org/terra-sigillata/
3 http://planet.wordpress.org/
3 http://l-stat.livejournal.com/
3 http://wordpress.org/extend/plugins/
3 http://jmol.sourceforge.net/
3 http://chembl.blogspot.com/
3 http://theme.wordpress.com/themes/regulus/
3 http://scientopia.org/blogs/ethicsandscience/
3 http://www.google.com/
3 http://youngfemalescientist.blogspot.com/
3 http://www.badscience.net/
3 http://paulingblog.wordpress.com/
3 http://l-api.livejournal.com/
3 http://waterinbiology.blogspot.com/
3 http://gmpg.org/xfn/
3 http://totallysynthetic.com/blog/
3 http://chemicalmusings.wordpress.com/
3 http://liberalchemistry.blogspot.com/
3 http://www.hdreioplus.de/wordpress/
3 http://pubs.acs.org/
3 http://wordpress.org/news/
3 http://verpa.wordpress.com/
3 http://www.opentox.org/
3 http://pubs.acs.org/journals/jacsat/
3 http://www.chemical-chimera.blogspot.com/
3 http://www.kilomentor.com/
3 http://invivoblog.blogspot.com/
3 http://blog.khymos.org/
3 http://www.livejournal.com/search/
3 http://chemicalsabbatical.blogspot.com/
3 http://wordpress.org/support/
3 http://scienceblogs.com/clock/
3 http://profmaster.blogspot.com/
3 http://www.orgsyn.org/
3 http://www.surechembl.org/
3 http://www.ebi.ac.uk/chebi/
3 http://scienceblogs.com/aetiology/
3 http://www.mazepath.com/uncleal/
3 http://syntheticremarks.com/
3 http://www.nature.com/nchem/
2 http://www.syntheticpages.org/
2 http://www.eyeonfda.com/
2 http://genchemist.wordpress.com/
2 http://cenblog.org/newscripts/2013/12/amusing-news-aliquots-128/
2 http://blogs.scientificamerican.com/the-curious-wavefunction/2013/10/09/computational-chemistry-wins-2013-nobel-prize-in-chemistry/
2 http://cenblog.org/just-another-electron-pusher/
2 http://www.uu.se/
2 http://bugs.bioclipse.net/
2 http://www.ch.cam.ac.uk/magnus/
2 http://archive.tenderbutton.com/
2 http://cenblog.org/the-safety-zone/2013/12/csb-report-on-chevron-refinery-fire-urges-new-regulatory-approach/
2 http://pubs.acs.org/cen/
2 http://interfacialscience.blogspot.com/
2 http://www.blogtopsites.com/
2 http://www.amazingcounters.com/
2 http://www.chemheritage.org/
2 http://drexelisland.wikispaces.com/
2 http://intermolecular.wordpress.com/
2 http://www.aldaily.com/
2 http://www.plos.org/
2 http://kashthealien.wordpress.com/
2 http://web.expasy.org/groups/swissprot/
2 http://theme.wordpress.com/themes/enterprise/
2 http://cenboston.wordpress.com/
2 http://infiniflux.blogspot.com/
2 http://laserjock.wordpress.com/
2 http://bkchem.zirael.org/
2 http://www.qdinformation.com/qdisblog/
2 http://syntheticorganic.blogspot.com/
2 http://www.sciencebasedmedicine.org/
2 http://scienceblogs.com/pontiff/
2 http://www.chemistryguide.org/
2 http://pubs.acs.org/journals/jmcmar/
2 http://blog.openwetware.org/scienceintheopen/
2 http://cenblog.org/transition-states/
2 http://theme.wordpress.com/themes/contempt/
2 http://www.fiercebiotech.com/
2 http://www.paulbracher.com/blog/
2 http://scienceblogs.com/ethicsandscience/
2 http://tripod.nih.gov/
2 http://sciencegeist.net/
2 http://spectroscope.blogspot.com/
2 http://kilomentor.chemicalblogs.com/
2 http://pharmagossip.blogspot.com/
2 http://chembioinfo.com/
2 http://philipball.blogspot.com/
2 http://browsehappy.com/
2 http://www.realclimate.org/
2 http://chem.vander-lingen.nl/
2 http://cenblog.org/the-safety-zone/2013/12/lab-safety-is-critical-in-high-school-too/
2 http://joaquinbarroso.com/
2 http://eristocracy.co.uk/brsm/
2 http://daneelariantho.wordpress.com/
2 http://blogs.discovermagazine.com/cosmicvariance/
2 http://johnirwin.docking.org/
2 http://www.scienceblog.com/cms/
2 http://www.chemistry-blog.com/2013/12/11/number-11-hydrogen/
2 http://thebioenergyblog.blogspot.com/
2 http://www.eclipse.org/
2 http://www.ebyte.it/stan/
2 http://scienceblogs.com/
2 http://boscoh.com/
2 http://www.nature.com/nchem/journal/v6/n1/
2 http://cdavies.wordpress.com/
2 http://chemistandcook.blogspot.com/
2 http://www.rheothing.blogspot.com/
2 http://openbabel.org/
2 http://www.metabolomics2012.org/
2 http://cenblog.org/grand-central/
2 http://www.scilogs.es/
2 http://openflask.blogspot.com/
2 http://d3js.org/
2 http://stuartcantrill.com/
2 http://blogs.discovermagazine.com/gnxp/
2 http://www.bioclipse.net/
2 http://rajcalab.wordpress.com/
2 http://bacspublish.blogspot.com/
2 https://cszamudio.wordpress.com/
2 http://scienceblogs.com/sciencewoman/
2 http://theorganicsolution.wordpress.com/2013/12/12/my-top-10-chemistry-papers-of-2013/
2 http://chem242.wikispaces.com/
2 http://icpmassspectrometry.blogspot.com/
2 http://theme.wordpress.com/themes/ocean-mist/
2 http://theme.wordpress.com/themes/andreas09/
2 http://impactstory.org/
2 http://blog.metamolecular.com/
2 http://lamsonproject.org/
2 http://agilemolecule.wordpress.com/
2 http://proteinsandwavefunctions.blogspot.com/
2 http://cenblog.org/newscripts/2013/12/heirloom-chemistry-set/
2 http://beautifulphotochemistry.wordpress.com/
2 http://www.etracker.com/
2 http://chem241.wikispaces.com/
2 http://practicalfragments.blogspot.com/
2 http://www.nobelprize.org/nobel_prizes/chemistry/laureates/2013/
2 http://researchblogging.org/
2 http://retractionwatch.wordpress.com/
2 http://u-of-o-nmr-facility.blogspot.com/
2 http://www3.interscience.wiley.com/cgi-bin/jhome/26293/
2 http://scienceblogs.com/goodmath/
2 http://creativecommons.org/licenses/by-nc-sa/3.0/
2 http://cenblog.org/the-haystack/
2 http://www.simbiosys.com/
2 http://www.steinbeck-molecular.de/steinblog/
2 http://chemistry.about.com/
2 http://cen.acs.org/
2 http://cniehaus.livejournal.com/
2 http://chem.chem.rochester.edu/~nvd/
2 http://www.chemtube3d.com/
2 http://news.google.com/
2 http://theme.wordpress.com/themes/digg3/
2 http://theme.wordpress.com/themes/mistylook/
2 http://www.wordpress.org/
2 http://luysii.wordpress.com/
2 http://disqus.com/
2 http://openbabel.sourceforge.net/
2 http://networkedblogs.com/
2 http://www.chemspy.com/
2 http://www.openphacts.org/
2 http://cic-fachgruppe.blogspot.com/
2 http://weconsent.us/
2 http://cdktaverna.wordpress.com/
2 http://gilleain.blogspot.com/
2 http://scienceblogs.com/scientificactivist/
2 http://synchemist.blogspot.com/
2 http://www.compchemhighlights.org/
2 http://acdlabs.typepad.com/elucidation/
2 http://cdk.sf.net/
2 http://www.phds.org/
2 http://zinc.docking.org/
2 http://www.ch.imperial.ac.uk/rzepa/blog/
2 http://www.cas.org/
2 http://brsmblog.com/
2 http://www.sciencetext.com/
2 http://altchemcareers.wordpress.com/
2 http://theeccentricchemist.blogspot.com/
2 http://www.sciscoop.com/
2 http://www.agile2robust.com/
2 http://mwclarkson.blogspot.com/
2 http://www.jcheminf.com/
2 http://www.tns-counter.ru/V13a****sup_ru/ru/UTF-8/tmsec=lj_noncyr/
2 http://www.slideshare.net/
2 http://scienceblogs.com/eruptions/
2 http://www.scilogs.com/
2 http://scienceblogs.com/seejanecompute/
2 http://blog.tenderbutton.com/
2 http://www.amazingcounter.com/
2 http://creativecommons.org/licenses/by/3.0/
2 http://scienceblogs.com/greengabbro/
2 http://madskills.com/public/xml/rss/module/trackback/
2 http://edheckarts.wordpress.com/
2 http://www.scilogs.fr/
2 http://feed.informer.com/
2 http://scienceblogs.com/chaoticutopia/
2 http://www.chemaxon.com/
2 http://www.rsc.org/mpemba-competition/
2 http://neksa.blogspot.com/
2 http://orgchem.livejournal.com/
2 http://scienceblogs.com/transcript/
2 http://drexel-coas-elearning-transcripts.blogspot.com/
2 http://pmgb.wordpress.com/
2 http://www.acs.org/
2 http://www.scienceblogs.com/
2 http://www.milomuses.com/chemicalmusings/
2 http://www.the-scientist.com/
2 http://calvinus.wordpress.com/
2 http://www.pandasthumb.org/
import re
import os
import urllib
from collections import defaultdict

from bs4 import BeautifulSoup

def seedFromCB():
    if not os.path.isdir("CB"):
        os.mkdir("CB")
    url = "http://cb.openmolecules.net/blogs.php?skip=%d"
    N = 0
    while True:
        page = urllib.urlopen(url % N)
        # Need to remove apostrophes from tag URLs or BeautifulSoup will choke
        html = page.read().replace("you'll", "youll").replace("what's", "whats")
        if html.find("0 total") >= 0:
            break
        print >> open(os.path.join("CB", "CB_%d.html" % N), "w"), html
        N += 10

    N = 0
    allurls = []
    while True:
        filename = os.path.join("CB", "CB_%d.html" % N)
        if not os.path.isfile(filename):
            break
        soup = BeautifulSoup(open(filename).read())
        divs = soup.find_all("div", class_="blogbox_byline")
        urls = []
        for div in divs:
            children = div.find_all("a")
            anchor = children[2]
            urls.append(anchor['href'])
        print filename, len(urls)
        allurls.extend(urls)
        N += 10

    notfound = ["http://imagingchemistry.com", "http://www.caspersteinmann.dk"]
    allurls = [url for url in allurls if url not in notfound]
    return allurls

def url2file(url):
    for x in "/:?":
        url = url.replace(x, "_")
    return url

def norm(url):
    for x in [".org", ".com", ".es", ".fr", ".net"]:
        if url.endswith(x):
            return url + "/"
    for txt in ["index.html", "index.php", "index.htm", "blog.html"]:
        if url.endswith(txt):
            return url[:-len(txt)]
    if url == "http://www.corante.com/pipeline/":
        return "http://pipeline.corante.com/"
    return url

def getAllLinks(urls):
    if not os.path.isdir("Blogs"):
        os.mkdir("Blogs")
    for url in urls:
        filename = os.path.join("Blogs", url2file(url))
        print filename
        if not os.path.isfile(filename):
            html = urllib.urlopen(url).read()
            print >> open(filename, "w"), html

    countlinks = defaultdict(int)
    for url in urls:
        filename = os.path.join("Blogs", url2file(url))
        links = re.findall('"((http|ftp)s?://.*?)"', open(filename).read())
        links = set([x[0] for x in links if x[1]=='http'])
        for link in links:
            countlinks[norm(link)] += 1
    return countlinks

if __name__ == "__main__":
    blogURLs = seedFromCB()
    countlinks = getAllLinks(blogURLs)
    tmp = countlinks.items()
    tmp.sort(key=lambda x:x[1], reverse=True)
    err = open("err.txt", "w")
    for x, y in tmp:
        if y > 1:
            if x.endswith("/") and not (y==6 and "accelrys" in x) and not (y==3 and "fieldofscience" in x):
                print y, x
            else:
                print >> err, y, x
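To see what the `norm` step in the script does to individual links, here is a small Python 3 port of that one function with a few sample inputs. The port and the choice of sample URLs are mine, for illustration only; the original script targets Python 2.

```python
def norm(url):
    """Python 3 port of the script's norm(): canonicalise a link before counting."""
    # A bare domain gets a trailing slash so it matches the slashed form.
    for suffix in (".org", ".com", ".es", ".fr", ".net"):
        if url.endswith(suffix):
            return url + "/"
    # Strip a trailing index page so it matches the directory URL.
    for page in ("index.html", "index.php", "index.htm", "blog.html"):
        if url.endswith(page):
            return url[:-len(page)]
    # Special case from the script: the old address of "In the Pipeline".
    if url == "http://www.corante.com/pipeline/":
        return "http://pipeline.corante.com/"
    return url


examples = [
    "http://depth-first.com",
    "http://www.chemspider.com/blog/index.php",
    "http://www.corante.com/pipeline/",
    "http://www.wordpress.org",
]
for url in examples:
    print(url, "->", norm(url))
```

The last example shows a limit of this normalisation: a `www.` prefix is left alone, which is why `http://wordpress.org/` and `http://www.wordpress.org/` appear as separate entries in the raw results.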