Painless Installation of Ruby Open Babel

Posted by Rich Apodaca Mon, 09 Apr 2007 10:09:00 GMT

Open Babel 2.1.0 has just been released. Among its new features is a Ruby interface containing most of the functionality of the C++ library. Installation is quick and easy, as shown in this article.

Prerequisites

In addition to a working build system, you'll need Ruby and the Ruby development libraries. Although any recent version should do, this tutorial was written with version 1.8.5.

Step 0: Compile and Install Open Babel

Given the right tools on your system, compiling and installing Open Babel from source is trivial. This page gives instructions for doing so on Linux, Windows, and Mac OS X.

Step 1: Create the Wrapper's Makefile

After unpacking, compiling, and installing Open Babel, change into the scripts/ruby directory of your source distribution. Next, run the extconf.rb script:

$ ruby extconf.rb
checking for main() in -lopenbabel... yes
creating Makefile

As you've probably guessed, the purpose of this script is to generate a Makefile specific to your platform. This script uses the standard Ruby library mkmf.

Step 2: Compile the Wrapper

After creating a Makefile, we're ready to compile the C++ Ruby wrapper, contained in openbabel_ruby.cpp:

$ make
g++ -I. -I. -I/usr/lib/ruby/1.8/x86_64-linux-gnu -I. -I../../include  -fPIC -O2 -g -pipe -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -Wall  -fPIC   -c openbabel_ruby.cpp

This output will be followed by other lines as the compiler builds the wrapper library.

Step 3: Install the Wrapper

After compiling the wrapper, we're ready to install it. You can probably guess that the next command will be (as root):

# make install
/usr/bin/install -c -m 0755 openbabel.so /usr/lib/ruby/site_ruby/1.8/x86_64-linux-gnu

Your install directory is chosen by Ruby to be appropriate for your platform and Ruby version.
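
To see which directory Ruby would pick on your own machine, you can ask Ruby directly. The following is a minimal sketch, assuming that the generated Makefile installs into the directory Ruby reports as sitearchdir, which matches the path in the output above. The method name wrapper_install_dir is made up for illustration.

```ruby
require 'rbconfig'

# Hypothetical helper: returns the platform- and version-specific
# site_ruby directory that Ruby reports for compiled extensions.
# Assumption: this is where `make install` places openbabel.so.
def wrapper_install_dir
  RbConfig::CONFIG['sitearchdir']
end

puts "Compiled extensions are installed into: #{wrapper_install_dir}"
```

If openbabel.so is missing from that directory after installation, comparing it with the path printed by make install is a reasonable first check.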

Hello, Benzene!

Congratulations, you've installed Ruby Open Babel! You can verify that your new library works with interactive Ruby (irb):

$ irb
irb(main):001:0> require 'openbabel'
=> true
irb(main):002:0> c=OpenBabel::OBConversion.new
=> #<OpenBabel::OBConversion:0x2acedbadd020>
irb(main):003:0> c.set_in_format 'smi'
=> true
irb(main):004:0> benzene=OpenBabel::OBMol.new
=> #<OpenBabel::OBMol:0x2acedbacfa10>
irb(main):005:0> c.read_string benzene, 'c1ccccc1'
=> true
irb(main):006:0> benzene.num_atoms
=> 6

Creating Canonical SMILES with Ruby Open Babel

Posted by Rich Apodaca Tue, 03 Apr 2007 11:59:00 GMT

Unlike many data types, molecular structure representations are not normally unique. Each numbering system you choose for the atoms and bonds of a molecule gives rise to completely accurate, but degenerate molecular representations. This is one of the fundamental peculiarities of chemical information - and the focus of much research activity over the last sixty or so years. One of the most widely-used approaches to this problem is canonicalization.

This article discusses the SMILES canonicalization capability in the upcoming Open Babel 2.1 release. Among several other enhancements, this release will also feature a brand new Ruby interface. By way of preview, this article will demonstrate just how convenient it has now become to generate canonical SMILES strings with Ruby.

Consider the putative rodenticide aminopterin, the structure of which is shown above. Regardless of whether it turns out to be the culprit in the recent pet food poisoning case, it's a relatively complex molecule. And with this complexity come many possible representations. Here's one of just hundreds, if not thousands, of possible SMILES strings for this molecule:

Nc3nc(N)c2nc(CNc1ccc(C(=O)N[C@@H](CCC(=O)O)C(=O)O)cc1)cnc2n3

If you were developing a database of molecules and needed to support exact structure searching, how would you do it? One way would be to convert a query molecule to a canonical SMILES string, and then simply look for that string in an index of your database's canonical SMILES. This is useful because it allows us to convert a chemistry-specific problem (exact structure search) into a generic computer science problem (text matching).

We can create a simple Ruby library to convert any SMILES string into an Open Babel canonical SMILES string:

require 'openbabel'

class Can
  def initialize
    @conversion = OpenBabel::OBConversion.new
    @conversion.set_in_and_out_formats 'smi', 'can'
  end

  def convert smiles
    mol = OpenBabel::OBMol.new

    @conversion.read_string mol, smiles
    @conversion.write_string mol
  end
end

Save this code as a file called can.rb in your working directory. The library can then be used, for example, via interactive Ruby (irb):

$ irb
irb(main):001:0> require 'can'
=> true
irb(main):002:0> c=Can.new
=> #>
irb(main):003:0> puts c.convert('Nc3nc(N)c2nc(CNc1ccc(C(=O)N[C@@H](CCC(=O)O)C(=O)O)cc1)cnc2n3')
OC(=O)CC[C@@H](NC(=O)c1ccc(NCc2cnc3nc(N)nc(N)c3n2)cc1)C(=O)O
=> nil
irb(main):004:0> puts c.convert('C1=CC(=CC=C1C(=O)N[C@@H](CCC(=O)O)C(=O)O)NCC2=CN=C3C(=N2)C(=NC(=N3)N)N')
OC(=O)CC[C@@H](NC(=O)c1ccc(NCc2cnc3nc(N)nc(N)c3n2)cc1)C(=O)O
=> nil

As you can see, both SMILES strings for aminopterin were converted into the same canonical SMILES string.
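
The exact-structure search idea described earlier can be sketched in a few lines. This is only an illustration under stated assumptions: the class ExactStructureIndex and its methods are hypothetical, and a stand-in canonicalizer built from the two conversions shown above is used so the code runs without Open Babel.

```ruby
# Hypothetical index for exact-structure lookup. The canonicalizer is any
# object responding to #call that maps a SMILES string to its canonical form.
class ExactStructureIndex
  def initialize(canonicalizer)
    @canonicalizer = canonicalizer
    @records = {}
  end

  # Store a record under the canonical form of its SMILES string.
  def add(smiles, record)
    @records[key_for(smiles)] = record
  end

  # Return the record stored for the same structure, or nil if absent.
  def find(smiles)
    @records[key_for(smiles)]
  end

  private

  # Strip surrounding whitespace so keys compare cleanly.
  def key_for(smiles)
    @canonicalizer.call(smiles).strip
  end
end

# Stand-in canonicalizer (an assumption for this sketch): it only knows the
# two aminopterin strings converted above and returns anything else unchanged.
CANONICAL = 'OC(=O)CC[C@@H](NC(=O)c1ccc(NCc2cnc3nc(N)nc(N)c3n2)cc1)C(=O)O'
KNOWN = {
  'Nc3nc(N)c2nc(CNc1ccc(C(=O)N[C@@H](CCC(=O)O)C(=O)O)cc1)cnc2n3' => CANONICAL,
  'C1=CC(=CC=C1C(=O)N[C@@H](CCC(=O)O)C(=O)O)NCC2=CN=C3C(=N2)C(=NC(=N3)N)N' => CANONICAL
}
stand_in = lambda { |smiles| KNOWN.fetch(smiles, smiles) }

index = ExactStructureIndex.new(stand_in)
index.add('Nc3nc(N)c2nc(CNc1ccc(C(=O)N[C@@H](CCC(=O)O)C(=O)O)cc1)cnc2n3', 'aminopterin')

# Query with a differently written SMILES string for the same molecule.
puts index.find('C1=CC(=CC=C1C(=O)N[C@@H](CCC(=O)O)C(=O)O)NCC2=CN=C3C(=N2)C(=NC(=N3)N)N')
```

With Ruby Open Babel installed, the stand-in could presumably be replaced by Can.new.method(:convert), so that the index keys come from Open Babel itself.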

Unlike InChI, which uses a "standard" canonicalization algorithm, SMILES canonicalization varies by software package. As a result, the SMILES canonicalization described here will be most useful within a software package, but probably not externally to it, at least initially.

Ruby is still an upstart language in cheminformatics. But tools like Ruby CDK and Ruby Open Babel offer ample opportunities for learning what this remarkable language can do for the development of chemistry applications.

Making the Case: In Silico Prediction of Ames Test Mutagenicity

Posted by Rich Apodaca Thu, 28 Dec 2006 15:09:00 GMT

The two models (SAm and AIm) and the RHC [robust hybrid classifier] were implemented in C++ using OpenBabel 1.100.2 libraries (http://openbabel.sourceforge.net/wiki/Main_Page).

The AI model (AIm) is based on the LAZAR system (http://www.predictive-toxicology.org/lazar/index.html) developed by C. Helma...

-Paolo Mazzatorta, Liên-Anh Tran, Benoît Schilter, and Martin Grigorov J. Chem. Inf. Model.

Yet another appearance of Open Source software in the primary cheminformatics literature comes by way of a paper from Mazzatorta, Tran, Schilter, and Grigorov of the Nestlé Research Center. This work employs two Open Source libraries: lazar, a tool for the prediction of toxic properties of chemical structures; and Open Babel, a widely-used, low-level library for cheminformatics. lazar, in turn, is based on both Open Babel and the GNU Scientific Library (GSL), a numerical library. Unfortunately, the Nestlé authors don't indicate whether the source code for their system is publicly available. Nevertheless, their work gives a taste of the kinds of synergies that inevitably develop through the use of Open Source software.

From SMILES to InChI with OBRuby

Posted by Rich Apodaca Fri, 03 Nov 2006 15:50:00 GMT

SMILES and InChI are two commonly-used molecular line notations. Although each has its advantages and limitations, the novelty of InChI and the ubiquity of SMILES make the SMILES to InChI conversion especially useful. Many of the situations in which the need for this conversion will arise are particularly well-suited for the Ruby programming language. A recent article described how RCDK and Rino could be used to accomplish this conversion. This article will show how Open Babel can be used from Ruby to effect the same conversion.

OBRuby

OBRuby is a SWIG-generated Ruby interface to the Open Babel library. Although OBRuby doesn't expose all aspects of the Open Babel API, nearly everything that can be done in C++ Open Babel can now be done in Ruby. For example, all OBConversion permutations should be available, including SMILES to InChI.

A Small Ruby Library

Let's create a small Ruby library for converting SMILES strings into InChI identifiers. Save the following into a file called convert.rb:

require 'openbabel'

class Convertor
  def initialize
    @conv = OpenBabel::OBConversion.new

    @conv.set_in_and_out_formats('smi', 'inchi')
  end

  def get_inchi(smiles)
    mol = OpenBabel::OBMol.new

    @conv.read_string(mol, smiles)
    @conv.write_string(mol)
  end
end 

There's nothing tricky here. We've simply created a Ruby class that makes the SMILES to InChI conversion as simple as one method call to an instance.

Testing the Library

A good way to test this library is through Interactive Ruby (irb). For example, to find the InChI of caffeine:

require 'convert'

c = Convertor.new

puts c.get_inchi('Cn1cnc2c1c(=O)n(C)c(=O)n2C') # caffeine
# =>InChI=1/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3

Chiral SMILES

I applied this simple Ruby conversion library to the (S)-methamphetamine record in PubChem:

  • Isomeric SMILES: C[C@@H](CC1=CC=CC=C1)NC
  • PubChem InChI: InChI=1/C10H15N/c1-9(11-2)8-10-6-4-3-5-7-10/h3-7,9,11H,8H2,1-2H3/t9-/m0/s1

My results were:

  • Isomeric SMILES: C[C@@H](CC1=CC=CC=C1)NC
  • OBRuby InChI: InChI=1/C10H15N/c1-9(11-2)8-10-6-4-3-5-7-10/h3-7,9,11H,8H2,1-2H3/t9-/m1/s1

As you can see, the two InChIs differ in their stereo layers ('m0' vs. 'm1'). The same InChI is generated by Open Babel using either OBRuby or the Worldwide Molecular Matrix. Substituting the SMILES string representing the opposite configuration at carbon generates the InChI with the opposite configuration (R), which again is opposite to that of (R)-methamphetamine in PubChem.

At this point, it is unclear whether Open Babel or PubChem is producing the correct InChI for the methamphetamine enantiomers. I suspect Open Babel is correct. By creating a molfile of (S)-methamphetamine with JME and running cInChI over it, I got the same output as with the Open Babel conversions. I've found similar differences between PubChem and Open Babel InChIs in every chiral molecule I've looked at.
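
Spotting this kind of difference by eye gets tedious with longer identifiers. Here is a minimal sketch of a layer-by-layer comparison in plain Ruby. The method names are hypothetical, and the parsing is a simplification that assumes each layer prefix letter occurs at most once, which holds for the two InChIs above but not for every InChI.

```ruby
# Split an InChI string into its layers. The first two segments are the
# version and the formula; every later layer is keyed by its one-letter prefix.
# Simplification: assumes each prefix letter occurs at most once.
def inchi_layers(inchi)
  segments = inchi.split('/')
  layers = { 'version' => segments[0], 'formula' => segments[1] }
  (segments[2..-1] || []).each { |segment| layers[segment[0, 1]] = segment }
  layers
end

# Return a hash of layer name => [left, right] for each layer that differs.
def differing_layers(left, right)
  a = inchi_layers(left)
  b = inchi_layers(right)
  diff = {}
  (a.keys | b.keys).each do |name|
    diff[name] = [a[name], b[name]] unless a[name] == b[name]
  end
  diff
end

pubchem = 'InChI=1/C10H15N/c1-9(11-2)8-10-6-4-3-5-7-10/h3-7,9,11H,8H2,1-2H3/t9-/m0/s1'
obruby  = 'InChI=1/C10H15N/c1-9(11-2)8-10-6-4-3-5-7-10/h3-7,9,11H,8H2,1-2H3/t9-/m1/s1'

differing_layers(pubchem, obruby).each do |name, (left, right)|
  puts "#{name}: PubChem #{left} vs. OBRuby #{right}"
end
# prints "m: PubChem m0 vs. OBRuby m1"
```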

Conclusions

The conversion of SMILES, and other molecular languages, into InChI identifiers can be expected to become a recurring need as the popularity of InChI increases. Combining the formidable translation capabilities of Open Babel with the comfort and convenience of Ruby offers a powerful new technique for doing so.

OBRuby: A Ruby Interface to Open Babel

Posted by Rich Apodaca Tue, 31 Oct 2006 14:20:00 GMT

And the LORD said, Behold, the people is one, and they have all one language; and this they begin to do: and now nothing will be restrained from them, which they have imagined to do.

-Genesis 11:6

Open Babel is a widely-used Open Source chemical informatics toolkit written in C++. Although originally designed as a molecular language translator, Open Babel also supports SMARTS pattern recognition, molecular fingerprints, molecular superposition, and other features.

Open Babel currently offers interfaces for two scripting languages: Python and Perl. Recently, Geoff Hutchison and I have been working to add Ruby to that list. This article reports our success in doing so and provides a glimpse of what might now be possible.

OBRuby

The upcoming release of Open Babel (version 2.1.0) will come complete with a Ruby interface. For those interested in trying it out sooner, a package called OBRuby can be downloaded now. OBRuby compiles against revision 1577 of the Open Babel SVN trunk. It has been tested with Linux and Mac OS X, and will probably work on Windows with minor modifications. The approach outlined here is known to fail with Open Babel 2.0.2.

OBRuby is a technology demonstration. The Ruby scripting support included with Open Babel 2.1.0 may differ in some details from OBRuby. My purpose in this article is simply to demonstrate what is now possible. Please read through the install scripts (they're short) to be sure you're comfortable with what they do.

Here was my OBRuby installation process:

  1. Download the Open Babel SVN trunk revision 1577 or later.
  2. cd trunk
  3. configure, make, (as root) make install
  4. (as root) ldconfig (necessary on my system - perhaps not on yours)
  5. cd OBRUBY_DIR
  6. ruby build.rb
  7. (as root) make install

One last wrinkle: the build.rb script included with OBRuby is something of a hack. It hardcodes the location of the Open Babel library on line 6:

@@ob_dir='/usr/local'

Change this line to match your Open Babel installation and you should be ready to go. make install places a single file, openbabel.so, into your Ruby site_ruby directory. To verify from irb that the installation worked:

$ irb
irb(main):001:0> require 'openbabel'
=> true

A return value of true shows that the installation was successful. An error message about libopenbabel.so not being found indicates that your system can't find your Open Babel libraries. Be sure you've installed Open Babel and either run ldconfig or set LD_LIBRARY_PATH.
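
The same check can be done from a script rather than from irb. This is a small sketch, not part of OBRuby: the method name openbabel_available? is hypothetical, and it relies only on the fact that Ruby raises LoadError when a required library cannot be loaded.

```ruby
# Hypothetical helper: try to load the wrapper and report why it failed.
def openbabel_available?
  require 'openbabel'
  true
rescue LoadError => e
  warn "Could not load openbabel: #{e.message}"
  warn 'Check that Open Babel is installed, then run ldconfig or set LD_LIBRARY_PATH.'
  false
end

puts(openbabel_available? ? 'Ruby Open Babel is ready.' : 'Ruby Open Babel is not usable yet.')
```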

The majority of OBRuby was autogenerated by SWIG. A future article will detail how this was done - with an eye toward developing a Java interface to Open Babel.

Building an OBMol From SMILES

With installation out of the way, let's fire up OBRuby and take her for a test drive. The following code can either be entered with IRB or saved to a file and executed with the ruby interpreter:

require 'openbabel'
include OpenBabel

smi2mol = OBConversion.new
smi2mol.set_in_format("smi")

mol = OBMol.new
smi2mol.read_string(mol, 'CC(C)CCCC(C)C1CCC2C1(CCC3C2CC=C4C3(CCC(C4)O)C)C') # cholesterol, no chirality
mol.add_hydrogens

puts "Cholesterol has #{mol.num_atoms} atoms, including hydrogens."
puts "Its molecular weight is #{mol.get_mol_wt} and its molecular formula is #{mol.get_formula}."

This simple code illustrates some important points. All OBRuby classes reside in the OpenBabel module. These classes can be referenced directly by including the OpenBabel module. Also notice how Ruby underscore_delimited method names are used, rather than C++ UpperCamelCase names.
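
When working from the C++ API documentation, it can help to guess the Ruby name from the C++ one. The sketch below approximates the naming pattern seen in this article. It is not the rule the wrapper itself applies, the helper name ruby_method_name is made up, and the C++ names listed are the ones the Ruby methods used here appear to correspond to.

```ruby
# Hypothetical helper: approximate the Ruby name for a C++ method name by
# inserting an underscore at each lowercase-to-uppercase boundary.
def ruby_method_name(cpp_name)
  cpp_name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
end

%w[NumAtoms AddHydrogens GetMolWt GetFormula GetUMapList].each do |cpp_name|
  puts "#{cpp_name} -> #{ruby_method_name(cpp_name)}"
end
```

If a guessed name doesn't respond, listing the object's methods from irb is the more reliable way to find the real one.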

SMARTS Matching

One of the most useful features of Open Babel is its SMARTS pattern matching capability. This can conveniently be accessed from OBRuby by instantiating an OBSmartsPattern, passing the SMARTS pattern of interest to the instance's init method, calling match on the molecule, and retrieving the hit set:

require 'openbabel'
include OpenBabel

smi2mol = OBConversion.new
smi2mol.set_in_format("smi")

mol = OBMol.new
smiles = 'CC(C)CCCC(C)C1CCC2C1(CCC3C2CC=C4C3(CCC(C4)O)C)C' # cholesterol, no chirality
smi2mol.read_string(mol, smiles) 
mol.add_hydrogens

pattern=OBSmartsPattern.new
smarts = 'C1CCCCC1'

pattern.init(smarts)
pattern.match(mol)
hits = pattern.get_umap_list # => indices of two cyclohexane rings

puts "Found #{hits.size} instances of the SMARTS pattern '#{smarts}' in the SMILES string #{smiles}. Here are the atom indices:"
hits.each_with_index do |hit, index|
  print "Hit #{index}: [ "

  hit.each do |atom_index|
    print "#{atom_index} "
  end

  puts "]"
end

Notice the Rubyesque each_with_index block that iterates over the elements in the hit set. Running the above code produces the following output:

Found 2 instances of the SMARTS pattern 'C1CCCCC1' in the SMILES string CC(C)CCCC(C)C1CCC2C1(CCC3C2CC=C4C3(CCC(C4)O)C)C. Here are the atom indices:
Hit 0: [ 12 17 16 15 14 13 ]
Hit 1: [ 20 25 24 23 22 21 ]

Finding Your Way

Using a new library like OBRuby can take some getting used to. An excellent source of information is Open Babel's online API documentation. Another source is Ruby itself.

For example, let's say you've instantiated an OBMol, but can't remember the exact name of the method that counts the number of atoms. Just call methods.sort on the instance:

require 'openbabel'

mol = OpenBabel::OBMol.new

mol.methods.sort # => see output below

When run from Interactive Ruby (irb), this code produces the following alphabetized list of methods, which I've truncated:

... "is_corrected_for_ph", "kekulize", "kind_of?", "method", "methods", "new_atom", "new_perceive_kekule_bonds", "new_residue", "next_atom", "next_bond", "next_conformer", "next_internal_coord", "next_residue", "nil?", "num_atoms", "num_bonds", "num_conformers", "num_edges", "num_hvy_atoms", "num_nodes", "num_residues", "num_rotors", "object_id", "perceive_bond_orders", "perceive_kekule_bonds", "private_methods", "protected_methods", "public_methods", "renumber_atoms", "reserve_atoms", "reset_visit_flags" ...
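
The full list is long, so it can help to narrow it down. The following sketch filters the list by a pattern and leaves out the methods every Ruby object already has. The helper matching_methods and the class StandInMol are hypothetical; the stand-in is there only so the example runs without Open Babel installed.

```ruby
# Hypothetical helper: list an object's methods whose names match a pattern,
# leaving out the methods every Ruby object already has.
def matching_methods(object, pattern)
  (object.methods - Object.instance_methods).map { |name| name.to_s }.grep(pattern).sort
end

# Stand-in class so the sketch runs without Open Babel; with the wrapper
# installed, pass OpenBabel::OBMol.new instead.
class StandInMol
  def num_atoms; 0; end
  def num_bonds; 0; end
  def add_hydrogens; true; end
end

puts matching_methods(StandInMol.new, /^num_/)
# prints num_atoms and num_bonds, one per line
```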

Conclusions

OBRuby combines the dynamic programming language Ruby with the highly-functional toolkit Open Babel. Further augmenting OBRuby's capabilities with the web application framework Rails and/or Ruby Chemistry Development Kit offers even more possibilities. Future articles will bring some of them to life.
