Hacking PubChem with Ruby
PubChem is an increasingly popular, free-access, online molecular database operated by the National Institutes of Health. Web services are a hot topic, with sites such as Flickr, Google, and eBay offering developers the tools to build rich content through "mashups" of several web APIs. Although there is no formal PubChem API, it's possible to roll your own. As a demonstration, this article will show how structural information can be retrieved from PubChem using some simple Ruby code. The inspiration for this article came from the PubChem module that is part of Chemruby.
The only thing you'll need for this tutorial is Ruby, preferably version 1.8.2 or higher. Create a directory called pubchem and make it your working directory. Then create a file called pubchem.rb containing the following code:
require 'net/http'

# A very simple PubChem Web API.
class PubChem
  # Returns a molfile (as a String) for the molecule with PubChem
  # CID matching compound_id. compound_id must be a String.
  def self.get_molfile(compound_id)
    molfile = nil
    path = '/summary/summary.cgi?cid=' + compound_id + '&disopt=DisplaySDF'
    Net::HTTP.start('pubchem.ncbi.nlm.nih.gov') do |http|
      response = http.get(path)
      molfile = response.body
    end
    molfile
  end

  # Writes a PNG image, for the molecule with PubChem
  # CID matching compound_id, to the file specified by filename.
  def self.write_image(compound_id, filename)
    path = '/image/imgsrv.fcgi?t=l&cid=' + compound_id
    Net::HTTP.start('pubchem.ncbi.nlm.nih.gov') do |http|
      response = http.get(path)
      image = response.body
      # Open in binary mode ("wb") so the PNG bytes are written unchanged.
      File.open(filename, "wb") do |file|
        file << image
      end
    end
  end
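end
With pubchem.rb in place, a molfile can be retrieved like this:
# On Ruby 1.9.2 or later, use require './pubchem' instead.
require 'pubchem'
molfile = PubChem::get_molfile('13109') #=> returns the molfile for Levonorgestrel as a String
and an image can be written to disk like this:
require 'pubchem'
PubChem::write_image('13109', 'image.png') #=> writes a PNG image of Levonorgestrel
These examples can be saved to a file and run from the command line:
$ ruby filename.rb
or they can be entered interactively in your console with irb:
$ irb
irb(main):001:0>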
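The two methods shown earlier build the request path by string concatenation, so they expect the CID as a String (an Integer would raise a TypeError), and they use the response body without looking at the HTTP status. The following is a minimal sketch of two guards that could be added. The module name PubChemGuards and both of its methods are hypothetical names chosen for illustration; they are not part of Chemruby or Net::HTTP.

```ruby
require 'net/http'

# Hypothetical helpers for illustration; not part of the PubChem class.
module PubChemGuards
  # Accepts a CID as an Integer or a String and returns it as a String
  # suitable for concatenation into a request path. Anything that is
  # not a positive whole number is rejected.
  def self.cid_param(compound_id)
    cid = compound_id.to_s.strip
    unless cid =~ /\A[1-9]\d*\z/
      raise ArgumentError, "invalid CID: #{compound_id.inspect}"
    end
    cid
  end

  # Returns the response unchanged when it is a 2xx success;
  # otherwise raises a RuntimeError carrying the status code and message.
  def self.ensure_success(response)
    return response if response.is_a?(Net::HTTPSuccess)
    raise "PubChem request failed: #{response.code} #{response.message}"
  end
end
```

Inside get_molfile, for example, the path could be built with PubChemGuards.cid_param(compound_id), and response.body read only after PubChemGuards.ensure_success(response) has returned.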
As you can see, there's not much to building a PubChem API in Ruby. The same principles discussed here should apply in any programming language. Future articles in this series will show how to build more complex PubChem APIs and integrate them with other software packages and web services.

