Given a DOI or PMID, how can you find metadata for a publication using Python? The EUtils module of BioPython, by Andrew Dalke, is your friend.
For a DOI, you need to do a search:
from Bio import EUtils
from Bio.EUtils import DBIdsClient
doi = "10.1016/j.jmb.2007.02.065"
client = DBIdsClient.DBIdsClient()
result = client.search(doi + "[aid]", retmax = 1)
summary = result[0].summary()
For a PMID, there's a more direct method:
from Bio import EUtils
from Bio.EUtils import DBIdsClient
PMID = "17238260"
result = DBIdsClient.from_dbids(EUtils.DBIds("pubmed", PMID))
summary = result[0].summary()
So, what can you do with the summary? Something like the following maybe:
>>> data = summary.dataitems
>>> print data.keys()
['DOI', 'Title', 'Source', 'Volume', ...., ]
>>> print "%s. %s %s %s, %s, %s." % (
... ", ".join(data['AuthorList'].allvalues()),
... data['Title'], data['Source'], data['PubDate'].year,
... data['Volume'], data['Pages'])
...
O'Boyle NM, Holliday GL, Almonacid DE, Mitchell JB. Using
reaction mechanism to measure enzyme similarity. J Mol Biol
2007, 368, 1484-99.
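Bio.EUtils has since been removed from Biopython in favour of Bio.Entrez, so here is a rough sketch of the same DOI-to-citation flow with the newer module. The function names (find_pmid, fetch_summary, format_citation) are made up for illustration. The sketch assumes the esummary record has the same keys as the summary above, and that PubDate is a string starting with the year. The two lookup functions need network access and a real contact email.

```python
def find_pmid(doi, email):
    """Return the first PubMed ID matching a DOI, or None (needs network access)."""
    # Imported here so format_citation() below works without Biopython installed.
    from Bio import Entrez

    Entrez.email = email  # NCBI asks for a contact address
    handle = Entrez.esearch(db="pubmed", term=doi + "[aid]", retmax=1)
    record = Entrez.read(handle)
    handle.close()
    ids = record["IdList"]
    return ids[0] if ids else None


def fetch_summary(pmid, email):
    """Return the esummary record for one PubMed ID (needs network access)."""
    from Bio import Entrez

    Entrez.email = email
    handle = Entrez.esummary(db="pubmed", id=pmid)
    records = Entrez.read(handle)
    handle.close()
    return records[0]


def format_citation(summary):
    """Build a one-line citation from a dict-like summary record."""
    authors = ", ".join(summary["AuthorList"])
    # Assumption: PubDate looks like "2007 May 18", so the first token is the year.
    year = summary["PubDate"].split()[0]
    return "%s. %s %s %s, %s, %s." % (
        authors, summary["Title"], summary["Source"],
        year, summary["Volume"], summary["Pages"])
```

Usage would be something like format_citation(fetch_summary(find_pmid(doi, "you@example.com"), "you@example.com")), after checking that find_pmid did not return None.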
For more info:
- PubMed search terms
- Bio.EUtils
- Andrew's introduction to (an earlier version of) Bio.EUtils
7 comments:
Thankyouthankyouthankyou.
This is just what I needed, so score one victory for Noel O'Blog and Google-Fu!
Update: just spent an hour or so figuring out that EUtils is gone, correct module is Bio.Entrez.
The link http://biopython.org/DIST/docs/api/public/Bio.EUtils-module.html now returns a 404.
@Bennest, Skylar: Yes - the blog post is now out of date. If you figure out the new code, I will be happy to update the post.
Download BioPython from
http://biopython.org/wiki/Download
Then use the following code:
from Bio import Entrez
Entrez.email='foo@bar.com'
database='pubmed'
myPMID='12345678'
handle = Entrez.efetch(db=database, id=myPMID, retmode="text", rettype="medline", tool="BioPython, Bio.Entrez")
record = handle.read()
Note that you can use other retmodes, for instance HTML or XML.
Hope this helps, feel free to modify and repost.
Ben
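With rettype="medline" the record comes back as plain tagged text (PMID, TI, AU and so on). Biopython ships Bio.Medline for parsing it properly. The sketch below is only a tiny standard-library stand-in to show the shape of the data. It assumes the usual layout: a tag padded to four characters, then "- ", with wrapped values continued on lines indented by six spaces. The name parse_medline is made up for illustration.

```python
def parse_medline(text):
    """Collect MEDLINE-tagged lines into a dict mapping tag -> list of values."""
    record = {}
    tag = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("      ") and tag is not None:
            # Continuation line: append to the most recent value of the current tag
            record[tag][-1] += " " + line.strip()
        elif line[4:6] == "- ":
            # New field: columns 1-4 hold the tag, the value starts at column 7
            tag = line[:4].strip()
            record.setdefault(tag, []).append(line[6:].strip())
    return record
```

Every tag maps to a list, because fields such as AU (author) repeat once per value.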
Thanks Bennest. I will publish an updated post in the next few weeks and add a link to it on this blog post.
Just adding a few lines to Bennest's post.
from Bio import Entrez
Entrez.email='foo@bar.com'
database='pubmed'
myPMID='12345678'
# esearch (not efetch) returns a result with an "IdList"; Entrez.read parses it
handle = Entrez.esearch(db=database, term=myPMID)
record = Entrez.read(handle)
firstPMID = record["IdList"][0]
# fetch the MEDLINE record for the first hit
handle = Entrez.efetch(db="pubmed", id=firstPMID, retmode="text", rettype="medline")
print(handle.read())