Interconverting molecular languages is a very common operation in cheminformatics, so convenient conversion tools are desirable. Recent articles have discussed JRuby as a functional cheminformatics scripting environment. In this article, we'll see how this functionality can be combined with convenience for molecular language conversions.
In addition to illustrating a technique, this article is the first in a series aimed at documenting a new cheminformatics toolkit for Ruby called "Rubidium". Rubidium will provide a unified set of Ruby APIs for working with diverse Open Source cheminformatics tools.
Rubidium will be distributed under the highly permissive MIT License.
The goal of this library is to provide a simple yet flexible way to interconvert SMILES, InChI, and molfile formats. It was inspired by the Open Babel library, in which an OBConversion object is configured with input and output formats prior to performing one or more conversions. In today's library, a similar Ruby interface is created for the CDK. Because of its length, the library won't be presented in its entirety. Instead, it can be downloaded here.
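The configure-then-convert pattern described above can be sketched in plain Ruby. This is not the code in cdk.rb: the Sketch module, the SUPPORTED list (including 'mol' as a name for the molfile format), and the BENZENE_TABLE stand-in backend are all assumptions made for illustration. A real implementation would hand the work to CDK classes where this sketch uses a lookup table.

```ruby
# Hypothetical sketch of an Open Babel-style conversion object.
# This is NOT the code in cdk.rb. The backend is a stand-in lookup
# table; the real library would call into the CDK instead.
module Sketch
  class Conversion
    # Assumed format names; 'smi' and 'inchi' appear in the article,
    # 'mol' is a guess at the name used for molfiles.
    SUPPORTED = %w[smi inchi mol].freeze

    # backend: any object responding to call(input, in_format, out_format)
    def initialize(backend)
      @backend = backend
      @in_format = nil
      @out_format = nil
    end

    # Configure once, then reuse the object for any number of conversions.
    def set_formats(in_format, out_format)
      [in_format, out_format].each do |f|
        raise ArgumentError, "unsupported format: #{f}" unless SUPPORTED.include?(f)
      end
      @in_format = in_format
      @out_format = out_format
    end

    # Delegate the actual conversion to the configured backend.
    def convert(input)
      raise 'call set_formats first' if @in_format.nil?
      @backend.call(input, @in_format, @out_format)
    end
  end
end

# Stand-in backend holding only the benzene example from this article.
BENZENE_TABLE = {
  ['c1ccccc1', 'smi', 'inchi'] => 'InChI=1/C6H6/c1-2-4-6-5-3-1/h1-6H'
}.freeze
fake_backend = ->(input, from, to) { BENZENE_TABLE.fetch([input, from, to]) }

c = Sketch::Conversion.new(fake_backend)
c.set_formats 'smi', 'inchi'
puts c.convert('c1ccccc1')
```

Keeping the format configuration separate from the convert call means one object can process many inputs without restating the formats each time, which is the convenience the OBConversion design offers.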
Testing the Library
The library can be tested by saving it as a file called cdk.rb and invoking jirb. We can then convert a SMILES for benzene into the InChI for benzene:
jirb
irb(main):001:0> require 'cdk'
=> true
irb(main):002:0> c=CDK::Conversion.new
=> #<CDK::Conversion:0x4c6320 ... >
irb(main):003:0> c.set_formats 'smi', 'inchi'
=> "inchi"
irb(main):004:0> c.convert 'c1ccccc1'
=> "InChI=1/C6H6/c1-2-4-6-5-3-1/h1-6H"
Upcoming articles will show more examples of interconversions using this library, and discuss some of its limitations.
It might be useful for Rubidium to support multiple Conversions, each using its own cheminformatics toolkit. For example, a recent article discussed SMILES and InChI interconversion with Ruby Open Babel. With a little tweaking, the Ruby Open Babel OBConversion interface could be made identical to the Ruby interface used in today's tutorial. We could also configure JOELib and Rosetta Conversions in an analogous fashion.
Rubidium would then offer a family of molecular language converters, each of which used exactly the same API. We could then pick the best converter based on the situation at hand.
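One way such a family of converters might look is sketched below in plain Ruby. Everything here is hypothetical: StubConversion, pick_converter, the toolkit labels, and the supported-format lists are invented for illustration and say nothing about what the CDK, Open Babel, JOELib, or Rosetta can actually convert.

```ruby
# Hypothetical sketch: several converters behind one shared interface.
# Toolkit labels and format lists are made up for illustration only.
class StubConversion
  attr_reader :toolkit

  # pairs: list of [in_format, out_format] pairs this converter accepts
  def initialize(toolkit, pairs)
    @toolkit = toolkit
    @pairs = pairs
    @formats = nil
  end

  def supports?(in_format, out_format)
    @pairs.include?([in_format, out_format])
  end

  def set_formats(in_format, out_format)
    unless supports?(in_format, out_format)
      raise ArgumentError, "#{toolkit} cannot convert #{in_format} to #{out_format}"
    end
    @formats = [in_format, out_format]
    out_format
  end

  # A real converter would call its toolkit here; the stub only tags the input.
  def convert(input)
    raise 'call set_formats first' if @formats.nil?
    "#{toolkit}(#{@formats.join('->')}): #{input}"
  end
end

# Return the first converter that can handle the format pair, already configured.
def pick_converter(converters, in_format, out_format)
  chosen = converters.find { |c| c.supports?(in_format, out_format) }
  raise ArgumentError, "no converter for #{in_format} to #{out_format}" if chosen.nil?
  chosen.set_formats(in_format, out_format)
  chosen
end

CONVERTERS = [
  StubConversion.new('toolkit_a', [%w[smi inchi]]),
  StubConversion.new('toolkit_b', [%w[smi mol], %w[mol smi]])
].freeze

c = pick_converter(CONVERTERS, 'smi', 'mol')
puts c.convert('c1ccccc1')
```

Because every converter answers to the same set_formats and convert methods, the calling code does not change when a different toolkit is selected. Here the selection rule is simply "first converter that supports the pair"; a real library might rank converters by speed or fidelity instead.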
With just a little Ruby code, we've created a convenient Ruby interface for interconverting SMILES, InChI, and molfile formats. JRuby supports even more interconversions through the CDK as well as other Java and Java Native Interface libraries. Future articles will discuss some of the possibilities.