Thursday, 16 June 2011

Using Zotero for Chemistry

Zotero keeps improving, and I was thinking it was time I started using it for my own papers. But how well do the translators work for Chemistry journals?

For each of several chemistry publishers, I took a paper from a current issue of one journal and tested whether Zotero could extract the following from its abstract page: the correct metadata, the abstract, the journal abbreviation, the PDF, and the full-text HTML.

The following results are ranked by how well the translator works, starting with the best:
  • ACS Journals - Missing journal abbreviation.
  • BMC Journals - Misses author initials, doesn't recognise J Cheminf, missing journal abbreviation.
  • Elsevier (J Mol Struct THEOCHEM) - Metadata has slightly wrong DOI, markup included in abstract, missing journal abbreviation.
  • Wiley (J Comput Chem) - Only metadata (no PDF or full-text HTML).
  • Springer (JCAMD) - Only metadata (no PDF or full-text HTML).
  • Oxford (Nucleic Acids Res) - Only metadata (no PDF or full-text HTML).
  • RSC (Chem Comm) - Only metadata, and even that is missing the page numbers (RSC must not be providing them to CrossRef).
I'm going to have a go at improving these by and by (keep an eye on my bitbucket account), but feel free to sort them out yourself if you want (leave a comment below if you do).

I was initially hesitant to do this, as there was no test framework in place for translators, and there didn't seem to be much point in writing a translator that might break at any point without anyone knowing. But Avram Lyon is currently adding support for a test framework to Scaffold (the tool you use for writing Zotero translators), so this should soon be available.
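
As a rough illustration of the kind of check a translator test could run, here is a minimal sketch in Python. It is not Scaffold's actual test format: the function name compare_item, the field names, and both records are made up for illustration. The idea is simply to compare what a translator extracted against a stored "known good" record and report every field that is missing or different.

```python
def compare_item(expected, extracted):
    """Return {field: (expected_value, extracted_value)} for every mismatch.

    A field that the translator did not extract at all shows up with
    an extracted value of None.
    """
    problems = {}
    for field, want in expected.items():
        got = extracted.get(field)
        if got != want:
            problems[field] = (want, got)
    return problems


# Placeholder records (hypothetical, not taken from any real paper).
expected = {
    "title": "An example paper",
    "DOI": "10.0000/example.1",
    "journalAbbreviation": "J. Example Chem.",
    "pages": "1-10",
}
extracted = {
    "title": "An example paper",
    "DOI": "10.0000/example.1",
    "pages": "1-10",
}

problems = compare_item(expected, extracted)
for field, (want, got) in problems.items():
    print(f"{field}: expected {want!r}, got {got!r}")
```

Run against a saved copy of each publisher's abstract page, a check like this would flag a translator that has quietly stopped returning a field, which is exactly the kind of breakage that otherwise goes unnoticed.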

3 comments:

Avram Lyon said...

Note that the RSC page uses the DOI default translator, and the advance article you refer to simply isn't in the CrossRef DOI database yet. As for the others, post to zotero-dev if you run into any issues. Most translators are currently limited to what they're given in the form of RIS or BibTeX (leading to things like missing abbreviations), but there's no reason we can't augment that with additional scraped data.

baoilleach said...

@Avram: Re RSC, my bad, I meant to avoid advance articles. I'll update the text.

debbie said...

Thanks for supporting Zotero! Your dedication to testing/improving translators for the scientific community is much appreciated on this end.

Many thanks.

Debbie, Zotero Community Lead