Run Babel Anywhere Java Runs with JBabel

Posted by Rich Apodaca Mon, 10 Dec 2007 13:50:00 GMT

A recent series of D-F articles has discussed using NestedVM to compile cheminformatics programs written in C/C++ into pure Java bytecode that can be run on any system with a JVM. In particular, an earlier attempt to compile Open Babel's babel program to bytecode was only partially successful. With the help of Geoff Hutchison, that problem has now been resolved. This article introduces JBabel, a platform-independent, pure Java build of Open Babel's babel program.

A Little About JBabel

JBabel was compiled from the Open Babel 2.1.1 source release and can be downloaded from SourceForge. The same JAR file was tested successfully on Linux, Windows, and Mac OS X. You can verify that JBabel works on your platform with the following command:

$ java -jar jbabel-20071209.jar -Hsmi
smi  SMILES format
A linear text format which can describe the connectivity
and chirality of a molecule
Write Options e.g. -xt
  n no molecule name
  t molecule name only
  r radicals lower case eg ethyl is Cc

This version of JBabel was compiled with support for three formats:

  • SMILES (smi). Non-canonical SMILES.

  • MDL (mol). Molfiles and SD Files.

  • Canonical SMILES (can). Canonical SMILES implementation donated by eMolecules.

I'll discuss exactly how support for these formats was added in a subsequent post. More formats will be added in the future. For now, let's just try JBabel out.

Testing JBabel

One way to use JBabel is interactively from the command line: just leave out the input or output file parameter. For example, to get the eMolecules canonical SMILES for sertraline, you might do something like this (be sure to press Return twice to begin processing):

$ java -jar jbabel-20071209.jar -ismi -ocan
CN[C@H]1CC[C@H](C2=CC=CC=C12)C3=CC(=C(C=C3)Cl)Cl

CN[C@H]1CC[C@H](c2ccc(Cl)c(Cl)c2)c2ccccc12
1 molecule converted
34 audit log messages
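
The same exchange could also be driven from another Java program by launching JBabel as a child process and writing the SMILES to its standard input. What follows is only a minimal sketch under several assumptions: the class and method names (JBabelPipe, buildCommand, convert) are hypothetical, the JAR is assumed to sit in the working directory, and it is assumed (not verified here) that JBabel starts processing when its piped input is closed, much as it does after the blank line in an interactive session.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper: not part of JBabel or Open Babel.
final class JBabelPipe {

    // Builds the same command line used in the interactive example,
    // e.g. java -jar jbabel-20071209.jar -ismi -ocan
    static List<String> buildCommand(String jarPath, String inFormat, String outFormat) {
        return List.of("java", "-jar", jarPath, "-i" + inFormat, "-o" + outFormat);
    }

    // Sends one record to JBabel on stdin and returns whatever it prints on stdout.
    // Assumption: closing stdin tells JBabel that the input is complete.
    static String convert(String jarPath, String inFormat, String outFormat, String input)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(jarPath, inFormat, outFormat))
                .redirectError(ProcessBuilder.Redirect.INHERIT) // pass warnings through to our stderr
                .start();
        try (OutputStream stdin = process.getOutputStream()) {
            stdin.write((input + "\n").getBytes(StandardCharsets.UTF_8));
        }
        String output = new String(process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        process.waitFor();
        return output;
    }

    public static void main(String[] args) throws Exception {
        String jar = "jbabel-20071209.jar";
        if (!Files.exists(Path.of(jar))) {
            System.err.println("JAR not found in working directory: " + jar);
            return;
        }
        String sertraline = "CN[C@H]1CC[C@H](C2=CC=CC=C12)C3=CC(=C(C=C3)Cl)Cl";
        System.out.print(convert(jar, "smi", "can", sertraline));
    }
}
```

Starting a new JVM for every molecule is slow, so a wrapper like this would only suit the few-molecules-at-a-time cases. The sketch writes all input before reading any output, which is fine for a single record but could block on large inputs.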

This canonical SMILES can be converted into a molfile with the following:

$ java -jar jbabel-20071209.jar -ismi -omol
CN[C@H]1CC[C@H](c2ccc(Cl)c(Cl)c2)c2ccccc12


 OpenBabel12090723182D

 22 24  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 C   0  0  0  0  0

...

To convert using input and output files, we could use a medium-sized dataset such as the PubChem benzodiazepine dataset prepared for Rubidium:

$ java -jar jbabel-20071209.jar -imol pubchem_benzodiazepine_20071110.sdf -ocan pubchem_benzodiazepine_20071110.smi
==============================
*** Open Babel Warning  in ReadMolecule
  WARNING: Problems reading a MDL file
Cannot read title line

2117 molecules converted

This test, which parses 2117 records, took four minutes forty-five seconds (285 seconds) on my system. For comparison, the natively compiled binary did the same thing in about thirteen seconds, which makes JBabel roughly 22 times slower. Clearly, the performance hit is substantial.
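
To put the two timings on a per-record footing, here is a small worked calculation using only the figures quoted above (285 seconds and about 13 seconds for 2117 records). The class and method names are hypothetical, and the native timing is approximate, so the results are rough estimates: about 135 ms per record for JBabel against about 6 ms for the native binary.

```java
// Hypothetical helper for the arithmetic only; it does not run JBabel.
final class JBabelTiming {

    // Ratio of JVM run time to native run time.
    static double slowdown(double jvmSeconds, double nativeSeconds) {
        return jvmSeconds / nativeSeconds;
    }

    // Average time spent on each record, in milliseconds.
    static double millisPerRecord(double seconds, int records) {
        return seconds * 1000.0 / records;
    }

    public static void main(String[] args) {
        int records = 2117;
        double jvmSeconds = 4 * 60 + 45;   // four minutes forty-five seconds
        double nativeSeconds = 13;         // "about thirteen seconds"

        System.out.printf("Slowdown: %.1fx%n", slowdown(jvmSeconds, nativeSeconds));
        System.out.printf("JBabel: %.1f ms/record%n", millisPerRecord(jvmSeconds, records));
        System.out.printf("Native: %.1f ms/record%n", millisPerRecord(nativeSeconds, records));
    }
}
```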

Uses

Although it's very unlikely that JBabel will ever be useful in performance-critical situations, its portability makes it attractive for other uses. Examples include:

  • application development in heterogeneous computing environments;

  • use on systems in which native compilation may be difficult, such as those with unusual configurations or operating systems;

  • cases in which native binaries work poorly or not at all, such as in applets and Java applications;

  • situations in which performance is a minor consideration, such as in end-user applications that process only a few molecules at a time, or during application prototyping.

Conclusions

This article has described JBabel, the first portable binary version of Open Babel's babel program for interconverting molecular file formats. The next article in this series will describe in detail the steps that were used to compile it.

Comments

  1. baoilleach Tue, 11 Dec 2007 08:19:07 GMT

    Is Jar size important for applets? Is this an issue with JBabel?

  2. Rich Apodaca Tue, 11 Dec 2007 14:36:18 GMT

    Noel, sometimes size matters and sometimes it doesn't ;-).

    The more frequently and widely deployed an applet is, the more important a small footprint becomes.

    Still, in some situations having a page that initially loads slowly will be a great trade-off for a site that's faster to develop.

    As you can see, JBabel in its current form is about 1.7 MB. This gives some room to maneuver.

  3. Geoffrey Hutchison Thu, 13 Dec 2007 00:23:27 GMT

    Well, the reason JBabel can be 1.7MB is that it doesn't include very many file formats.

    The reason that the JNI interface is likely to still be useful is that it can dynamically load formats as needed.

  4. Rich Apodaca Thu, 13 Dec 2007 01:19:18 GMT

    Good point, Geoff. It looks like 1.7 MB or so is the lower limit for size. In some cases, having access to all formats will be useful. In others, less so. So number of formats vs. binary size is one of the trade-offs to consider.

    The Open Babel JNI will also remain useful because it offers a Java development library through which the Open Babel API can be invoked. JBabel, in contrast, is at this point a program that can be interfaced through the main method only.

  5. Bob Hanson Wed, 19 Mar 2008 03:37:19 GMT

    Very interesting! What I see there is an application, not an applet, right? Does this compilation method allow for an applet with an associated JavaScript API?

  6. Rich Apodaca Wed, 19 Mar 2008 12:59:57 GMT

    Bob, yes, the demo is a Java application. It could be turned into an applet by wrapping it in a class that extends java.applet.Applet.

    The only problem with that is that the code as it stands uses the filesystem. It would need to be edited to use Strings if you wanted to deploy as an unsigned applet.
