Thursday 14 July 2016

The 7th Joint Sheffield Conference on Chemoinformatics - a real tweet

Just back from the Sheffield meeting, which takes place every 3rd year. Great meeting as ever - tribute was paid to John Holliday for the lion's share of the organisation. I got to meet some old friends and some new. For the first time at a meeting I decided to live-tweet the talks, joining such Twitter luminaries as Wendy Warr, Mireille Krier, Nathan Brown, and Jérémy Besnard.

It worked out quite well, and kept me completely engaged and awake. When you are aware that what you write is instantly publicly visible, you really make an effort to follow method descriptions etc. so that you can adequately describe what's going on. To speed things up I decided to avoid editorialising; if the author described their method/result as the best thing since sliced bread, I dutifully reported a major advance in the field of baked goods even if I was thinking "bread is dead, baby, bread is dead". I have since learned that this is referred to as journalism.

With about 750 tweets covering 27 talks (I missed one due to flat batteries), I averaged about 27 tweets per talk, which may be just over one per slide. Afterwards I asked on Twitter whether people were annoyed or found my avalanche of tweets useful; based on 13 respondents, the results were 3 to 1 in favour of the tweets. If I do a repeat performance, next time I'll give a heads-up so people can mute me if uninterested.

I don't like my efforts disappearing into the void, so I've archived the complete list of #ShefChem16 tweets from all attendees and remotes that used that hashtag. You can relive the build-up, the talks themselves, the scones/doughnuts, the conference dinner, not to mention the queuing for taxis to the station. The talks and posters are being made available by-and-by on the conference website so you might find it interesting to look at the tweets in combination with the slides.

Notes on creating the download of tweets:
I tried to do this the hi-tech route via the Twitter API, but as far as I could tell it's impossible if there were more than 100 tweets in a day. The API is geared towards streaming, not historical analysis. In the end, I went to the Twitter website, searched for #ShefChem16, hit "All tweets", zoomed out and kept hitting Page Down until all the conference tweets were shown. Next I saved the generated HTML via Firebug (right click on the <body> element and choose "Copy HTML") to shefchem16.html, and extracted the tweets with the following script. Unfortunately, although the HTML shows to whom a reply was made, the id of the tweet being replied to does not seem to be available, so I didn't bother handling replies in a special way.

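# Extract the tweets from the saved Twitter search page (Python 3)
from bs4 import BeautifulSoup as bs

with open("shefchem16.html", encoding="utf-8") as f:
    soup = bs(f, "lxml")

HANDLESIZE = 10  # width of the column that holds the Twitter handle

data = []
name = None
for tag in soup.find_all("div"):
    classes = tag.get("class")
    if not classes:
        continue
    if "stream-item-header" in classes:
        # The first link in the header points at the profile, e.g. "/handle"
        name = tag.a["href"][1:]
    if "js-tweet-text-container" in classes and name is not None:
        # Drop the " …" that Twitter appends to some tweet texts
        tweet = tag.get_text().replace(" …", "")
        # Indent continuation lines so they line up under the tweet text
        indent = "\n" + " " * (HANDLESIZE + 1)
        data.append("%*s %s" % (HANDLESIZE, name[:HANDLESIZE], tweet.strip().replace("\n", indent)))

# Write the tweets in the reverse of page order (the search page shows newest first)
with open("tmp.txt", "w", encoding="utf-8") as f:
    for d in reversed(data):
        f.write(d + "\n")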
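
If BeautifulSoup and lxml aren't to hand, roughly the same extraction can be sketched with the standard library's html.parser instead. This is only a sketch under assumptions: the TweetExtractor class, the format_tweets function and the SAMPLE fragment are names and data made up for illustration, and the fragment merely imitates the two class names used above rather than reproducing real Twitter markup.

```python
from html.parser import HTMLParser

HANDLESIZE = 10  # same column width as in the script above


class TweetExtractor(HTMLParser):
    """Collect (handle, text) pairs from saved Twitter search HTML."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tweets = []          # (handle, text) in page order
        self._name = None         # handle taken from the most recent header
        self._in_header = False   # in a header div, still waiting for its first link
        self._depth = 0           # div nesting depth inside a tweet text container
        self._chunks = []         # pieces of text for the current tweet

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "div":
            if self._depth:
                self._depth += 1
            elif "js-tweet-text-container" in classes:
                self._depth = 1
                self._chunks = []
            elif "stream-item-header" in classes:
                self._in_header = True
        elif tag == "a" and self._in_header:
            # Assumption: the first link in the header is "/handle"
            self._name = (attrs.get("href") or "")[1:]
            self._in_header = False

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1
            if self._depth == 0 and self._name is not None:
                text = "".join(self._chunks).replace(" …", "").strip()
                self.tweets.append((self._name, text))

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def format_tweets(html):
    """Return formatted tweet lines in the reverse of page order."""
    parser = TweetExtractor()
    parser.feed(html)
    parser.close()
    indent = "\n" + " " * (HANDLESIZE + 1)
    return ["%*s %s" % (HANDLESIZE, name[:HANDLESIZE], text.replace("\n", indent))
            for name, text in reversed(parser.tweets)]


# Made-up fragment for illustration only; not real Twitter output
SAMPLE = """
<div class="stream-item-header"><a href="/alice">Alice</a></div>
<div class="js-tweet-text-container"><p>Second tweet\nwith two lines</p></div>
<div class="stream-item-header"><a href="/a_very_long_handle">Bob</a></div>
<div class="js-tweet-text-container"><p>First tweet …</p></div>
"""

for line in format_tweets(SAMPLE):
    print(line)
```

The parser has to track div nesting itself, which is the bookkeeping BeautifulSoup otherwise does for you; on a real saved page the BeautifulSoup version is probably the more robust choice.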

Image credit: Egon Willighagen on Twitter

2 comments:

Egon Willighagen said...

I have been using the twitteR package for years now. Works well, also with more than 100 tweets per day. You may have to request follow-up tweets repeatedly. I have code for that. Email me.

Noel O'Boyle said...

Good to know - maybe I just wasn't using the API correctly.