An Introduction to the Rubidium Cheminformatics Toolkit: Interconvert SMILES, InChI, and Molfile with an Open Babel-Like Interface

Posted by Rich Apodaca Mon, 15 Oct 2007 14:59:00 GMT

Interconverting molecular languages is a very common operation in cheminformatics, so convenient conversion tools are desirable. Recent articles have discussed JRuby as a functional cheminformatics scripting environment. In this article, we'll see how that functionality can be combined with a convenient interface for molecular language conversions.

In addition to illustrating a technique, this article is the first in a series aimed at documenting a new cheminformatics toolkit for Ruby called "Rubidium". Rubidium will provide a unified set of Ruby APIs for working with diverse Open Source cheminformatics tools.

Rubidium will be distributed under the highly permissive MIT License.

Prerequisites

This Rubidium library requires JRuby and the Chemistry Development Kit (CDK). Copying the CDK jarfile into your JRuby lib directory is all that's needed.

The Library

The goal of this library is to provide a simple yet flexible way to interconvert SMILES, InChI, and molfile formats. It was inspired by the Open Babel library, in which an OBConversion object is configured with input and output formats prior to performing one or more conversions. In today's library, a similar Ruby interface is created for the CDK. Because of its length, the library won't be presented in its entirety. Instead, it can be downloaded here.
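To make the configure-then-convert pattern concrete, here is a minimal sketch in plain Ruby. It is not the code in cdk.rb: the Sketch module, the reader and writer tables, and the toy 'csv' and 'dash' formats are all hypothetical stand-ins. In the real library the readers and writers would presumably wrap CDK objects instead.

```ruby
# A hypothetical sketch of an OBConversion-style interface; NOT the actual cdk.rb.
module Sketch
  class Conversion
    # readers: Hash of format name => callable that parses text into a model
    # writers: Hash of format name => callable that serializes a model to text
    def initialize(readers, writers)
      @readers = readers
      @writers = writers
      @reader = nil
      @writer = nil
    end

    # Select the input and output formats once, before any conversions.
    # Returns the output format name, mirroring the jirb session in this article.
    def set_formats(input, output)
      reader = @readers.fetch(input) { raise ArgumentError, "unsupported input format: #{input}" }
      writer = @writers.fetch(output) { raise ArgumentError, "unsupported output format: #{output}" }
      @reader = reader
      @writer = writer
      output
    end

    # Convert one string using the currently configured formats.
    def convert(text)
      raise "call set_formats before convert" unless @reader && @writer
      @writer.call(@reader.call(text))
    end
  end
end

# Toy formats (assumptions for illustration only): a comma-separated list of
# atom symbols is read into an Array, then written as a dash-separated string.
readers = { 'csv'  => ->(text)  { text.split(',') } }
writers = { 'dash' => ->(atoms) { atoms.join('-') } }

conversion = Sketch::Conversion.new(readers, writers)
conversion.set_formats 'csv', 'dash'
puts conversion.convert('C,C,O')
```

Keeping format selection separate from conversion means a single configured object can be reused for many input strings, which is the property borrowed from OBConversion.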

Testing the Library

The library can be tested by saving it as a file called cdk.rb and invoking jirb. We can then convert a SMILES for benzene into the InChI for benzene:

$ jirb
irb(main):001:0> require 'cdk'
=> true
irb(main):002:0> c=CDK::Conversion.new
=> #<CDK::Conversion:0x4c6320 ... >
irb(main):003:0> c.set_formats 'smi', 'inchi'
=> "inchi"
irb(main):004:0> c.convert 'c1ccccc1'
=> "InChI=1/C6H6/c1-2-4-6-5-3-1/h1-6H"

Upcoming articles will show more examples of interconversions using this library, and discuss some of its limitations.

An Aside

It might be useful for Rubidium to support multiple Conversions, each using its own cheminformatics toolkit. For example, a recent article discussed SMILES and InChI interconversion with Ruby Open Babel. With a little tweaking, the Ruby Open Babel OBConversion interface could be made identical to the Ruby interface used in today's tutorial. We could also configure JOELib and Rosetta Conversions in an analogous fashion.

Rubidium would then offer a family of molecular language converters, each of which used exactly the same API. We could then pick the best converter based on the situation at hand.
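As a rough illustration of that idea, the sketch below defines stand-in converters that share the set_formats/convert interface and picks the first one that handles a given format pair. Everything here is hypothetical: StubConversion, the supports? method, pick_converter, and the toolkit names are placeholders, and the "conversion" only tags the input string rather than doing any chemistry.

```ruby
# Hypothetical stand-in for a toolkit-backed converter; performs no real chemistry.
class StubConversion
  attr_reader :name

  # pairs: Array of [input, output] format pairs this stand-in claims to handle
  def initialize(name, pairs)
    @name = name
    @pairs = pairs
    @input = nil
    @output = nil
  end

  # Not part of the interface shown in this article; added here (an assumption)
  # so a caller can ask whether a format pair is handled before configuring.
  def supports?(input, output)
    @pairs.include?([input, output])
  end

  def set_formats(input, output)
    raise ArgumentError, "#{name} cannot convert #{input} to #{output}" unless supports?(input, output)
    @input = input
    @output = output
    output
  end

  # Placeholder: a real converter would return the molecule in the output format.
  def convert(text)
    raise "call set_formats before convert" unless @output
    "#{name}:#{@input}->#{@output}:#{text}"
  end
end

# Return the first converter that handles the pair, or nil if none does.
def pick_converter(converters, input, output)
  converters.find { |converter| converter.supports?(input, output) }
end

converters = [
  StubConversion.new('toolkit_a', [['smi', 'mol']]),
  StubConversion.new('toolkit_b', [['smi', 'inchi'], ['smi', 'mol']])
]

converter = pick_converter(converters, 'smi', 'inchi')
converter.set_formats 'smi', 'inchi'
puts converter.convert('c1ccccc1')
```

Because every converter responds to the same two methods, the calling code does not change when a different toolkit is selected.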

Conclusions

With just a little Ruby code, we've created a convenient Ruby interface for interconverting SMILES, InChI, and molfile formats. JRuby supports even more interconversions through the CDK as well as other Java and Java Native Interface libraries. Future articles will discuss some of the possibilities.

Comments

  1. Egon Willighagen Tue, 16 Oct 2007 03:49:56 GMT

    Rich, Bioclipse will have Ruby scripting support, and I'm looking forward to the Rubidium plugin! Cheers!

  2. Kim Hanjo Tue, 16 Oct 2007 05:27:42 GMT

    Just a trivial report on Mac OS.

    It seems that Rubidium requires the JNI-InChI wrapper, which in turn requires InChI itself. As InChI is available on Windows and Linux, but not on Mac OS right now, I couldn't use this library on my MacBook. Of course, I can modify the cdk.rb file (deleting the InChI-related code) to make it work on Mac OS.

  3. Kim Hanjo Tue, 16 Oct 2007 05:33:03 GMT

    One more report on this blog. I can't leave a comment in the Safari browser, while Firefox and IE work well. I don't know why or how to fix it. I have tried several times to write comments on your posts, but failed each time; the problem was Safari (3 beta version).

  4. Rich Apodaca Tue, 16 Oct 2007 12:45:55 GMT

    Kim,

    Thanks for that reminder about Mac. You might try working with Sam Adams to get a Mac OS binary distributed w/ the JNI-InChI library. You can reach him via the cdk-devel mailing list.

    Thanks for the info about Safari and this site. Safari for Windows also can't be used to post comments. At least it's consistent...
