Merb on JRuby
My company is developing a commercial cheminformatics Web application with Ruby on Rails. Due to the nature of the problem domain, this application will require interaction with MX, an essential cheminformatics library written in Java.
This article will first explain how and why Merb entered the picture and then explain how to get Merb installed on a clean JRuby-1.1.6 installation.
JRuby on Rails: Express-Service Deployment, but a Trainwreck for Development
The initial approach, JRuby on Rails, works amazingly well for deployment (especially using Glassfish and Warbler), but is nothing but heartache for development and testing. My biggest complaints include:
Startup time is prohibitively slow. It typically takes anywhere from 10 to 15 seconds for JRuby and the Rails environment to completely load on my system. This leads to the next point...
RSpec on JRuby: FAIL. A startup time of 10-15 seconds may not sound like a big deal, but consider that in Behavior Driven Development with RSpec you don't begin writing application code until you have written a failing test and watched it fail. You then write a small bit of code to make the test pass, write another test, and so on. At every step in this process, you wait 10-15 seconds (compared to 1-2 seconds on MRI) for JRuby and Rails to start up. That adds up to a lot of waiting around, which is anything but Agile.
Autospec on JRuby: FAIL. Autospec runs in the background, constantly monitoring modifications to your project and automatically running tests. It's a wonderful idea that, once you've tried it, makes it very difficult to go back to manually running tests. The problem is that Autospec is broken on JRuby: (1) rspec only runs by monkeypatching the startup scripts and then dies when the first failing test is run; (2) spec_server doesn't work.
One Solution: Encapsulate Java Dependency in Web Service with Merb on JRuby
Given that the interaction with the MX library actually affects very little of the Rails application, one possibility would be to factor out this functionality into a private Web service. One way to do that would be with Merb on JRuby. At that point, the Rails application could be run under MRI for development and testing, and under JRuby for deployment.
For the unfamiliar, Merb is a lightweight, fast MVC framework for Ruby that is ideal for building simple, RESTful Web services.
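To make the split concrete, here is a minimal sketch of how the Rails side might talk to such a private service. Everything specific in it is an assumption made for illustration: the class name MassServiceClient, the /mass endpoint, the smiles query parameter, and the plain-text response format are all hypothetical and not part of Merb, Rails, or MX.

```ruby
require 'net/http'
require 'uri'

# Hypothetical client for a private, MX-backed Web service.
# The endpoint path and parameter name are assumptions for this sketch.
class MassServiceClient
  # fetcher is any callable that takes a URI and returns the response body.
  # The default performs a real HTTP GET; tests can pass a stub instead.
  def initialize(base_url, fetcher = ->(uri) { Net::HTTP.get(uri) })
    @base_url = base_url
    @fetcher = fetcher
  end

  # Builds the request URI, escaping characters such as '=' in the SMILES.
  def mass_uri(smiles)
    URI("#{@base_url}/mass?smiles=#{URI.encode_www_form_component(smiles)}")
  end

  # Assumes the service answers with a bare number in the response body.
  def mass(smiles)
    Float(@fetcher.call(mass_uri(smiles)))
  end
end
```

Because the transport is injected, the Rails application can be developed and tested under MRI against a stub, with the real JRuby-hosted service only needed at integration time.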
Installing Merb on JRuby
Although Merb has reached the version 1.0 milestone, specific information on installing it on non-standard platforms like JRuby is not easy to find. My solution was based on information provided by Arun Gupta in a recent article.
Starting with a clean installation of JRuby-1.1.6, here's what worked for me:
$ jruby -S gem install merb-core --no-ri --no-rdoc
$ jruby -S gem install merb-more --no-ri --no-rdoc
$ jruby -S gem install activerecord --no-ri --no-rdoc
$ jruby -S gem install merb_activerecord --no-ri --no-rdoc
$ jruby -S gem install mongrel --no-ri --no-rdoc
The reason for installing the gems individually is that the full Merb metagem depends on DataMapper, which uses native extensions and is not yet available for JRuby. If you try to install the metagem, you'll get something like this:
$ jruby -S gem install merb --no-ri --no-rdoc
Building native extensions. This could take a while...
/home/rich/local/jruby-1.1.5/lib/ruby/1.8/mkmf.rb:7: JRuby does not support native extensions. Check wiki.jruby.org for alternatives. (NotImplementedError)
from /home/rich/local/jruby-1.1.5/lib/ruby/1.8/mkmf.rb:21:in `require'
from extconf.rb:21
ERROR: Error installing merb:
ERROR: Failed to build gem native extension.
/home/rich/local/jruby-1.1.5/bin/jruby extconf.rb install merb --no-ri --no-rdoc
Gem files will remain installed in /home/rich/local/jruby-1.1.5/lib/ruby/gems/1.8/gems/do_sqlite3-0.9.10.1 for inspection.
Results logged to /home/rich/local/jruby-1.1.5/lib/ruby/gems/1.8/gems/do_sqlite3-0.9.10.1/ext/do_sqlite3_ext/gem_make.out
Installing only merb-core and merb-more along with ActiveRecord and Mongrel gives us everything we need to create a simple Merb on JRuby application.
A Simple Merb on JRuby Application
We need to tell Merb to use ActiveRecord as the ORM rather than DataMapper when the project is created. Let's call our project 'ws':
$ jruby -S merb-gen core --orm activerecord ws
$ cd ws
$ jruby -S rake db:create
$ nano config/database.yml.sample
...
$ mv config/database.yml.sample config/database.yml
$ jruby -S merb
Pointing your browser at http://localhost:4000/ should render a page. It will complain about "No routes match the request: /", but that's to be expected since we haven't created any.
Conclusions
Merb isn't the only way to create a fast, private, restful web service for accessing Java libraries. One alternative would be to use a pure-Java solution such as Restlet. Still, with Merb on JRuby installed and running, it's possible to explore the tradeoffs. And the coming merge of Merb and Rails for Rails 3.0 makes doing so all the more compelling.
Exhaustive Ring Perception With MX
The latest release of MX now supports exhaustive ring perception. Both a platform-independent jarfile and source distribution can be downloaded.
Background
The ability to perceive all rings in a chemical structure is essential for a number of important cheminformatics capabilities including Structure Diagram Generation, aromaticity detection, and binary fingerprint generation.
A recent Depth-First article described a ring-perception algorithm that efficiently returns the set of all rings for any molecule. The algorithm, developed by Hanser and coworkers, has now been implemented in MX.
MX is a platform-independent, cross-language cheminformatics toolkit written in Java and made available to the cheminformatics community by Metamolecular, LLC.
Examples
Ring perception can be tested conveniently using either JRuby or Jython. In these examples, we'll use JRuby.
To find all rings in benzene, we'd use something like:
$ jirb
irb(main):001:0> require 'mx-0.108.1.jar'
=> true
irb(main):002:0> import com.metamolecular.mx.ring.HanserRingFinder
=> Java::ComMetamolecularMxRing::HanserRingFinder
irb(main):003:0> import com.metamolecular.mx.io.Molecules
=> Java::ComMetamolecularMxIo::Molecules
irb(main):004:0> benzene = Molecules.create_benzene
=> #<Java::ComMetamolecularMxModel::DefaultMolecule:0x1971eb3 @java_object=com.metamolecular.mx.model.DefaultMolecule@126ba64>
irb(main):005:0> finder = HanserRingFinder.new
=> #<Java::ComMetamolecularMxRing::HanserRingFinder:0x76f2e8 @java_object=com.metamolecular.mx.ring.HanserRingFinder@1458dcb>
irb(main):006:0> rings = finder.find_rings benzene
=> #<Java::JavaUtil::ArrayList:0x1b83048 @java_object=[[com.metamolecular.mx.model.DefaultMolecule$AtomImpl@169dd64, com.metamolecular.mx.model.DefaultMolecule$AtomImpl@145f5e3, com.metamolecular.mx.model.DefaultMolecule$AtomImpl@122d9c, com.metamolecular.mx.model.DefaultMolecule$AtomImpl@170984c, com.metamolecular.mx.model.DefaultMolecule$AtomImpl@11ed166, com.metamolecular.mx.model.DefaultMolecule$AtomImpl@45aa2c, com.metamolecular.mx.model.DefaultMolecule$AtomImpl@169dd64]]>
irb(main):007:0> rings[0].collect{|atom| atom.get_index}.join("-")
=> "5-0-1-2-3-4-5"
irb(main):008:0> rings.size
=> 1
Here, we're taking advantage of Ruby's Array#join method to place a dash between each atom index.
To really push the system, we could find all rings in cubane:
$ jirb
irb(main):001:0> require 'mx-0.108.1.jar'
=> true
irb(main):002:0> import com.metamolecular.mx.ring.HanserRingFinder
=> Java::ComMetamolecularMxRing::HanserRingFinder
irb(main):003:0> import com.metamolecular.mx.io.Molecules
=> Java::ComMetamolecularMxIo::Molecules
irb(main):004:0> cubane = Molecules.create_cubane
=> #<Java::ComMetamolecularMxModel::DefaultMolecule:0xe391c4 @java_object=com.metamolecular.mx.model.DefaultMolecule@182a033>
irb(main):005:0> finder = HanserRingFinder.new
=> #<Java::ComMetamolecularMxRing::HanserRingFinder:0x1458dcb @java_object=com.metamolecular.mx.ring.HanserRingFinder@1603522>
irb(main):006:0> rings = finder.find_rings cubane
=> #collection with many objects
irb(main):007:0> rings.size
=> 28
irb(main):008:0> rings[0].collect{|atom| atom.get_index}.join("-")
=> "3-0-1-2-3"
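The ring counts reported above can be cross-checked without MX. The sketch below is not the Hanser algorithm; it is a plain brute-force depth-first enumeration of simple cycles, written in pure Ruby. The molecules are modeled as adjacency hashes with a hypothetical atom numbering (a six-membered cycle for benzene, and the cube graph for cubane), which need not match the indices MX assigns.

```ruby
# Brute-force enumeration of every simple cycle in an undirected graph.
# This is NOT the Hanser algorithm, only an independent cross-check.
def all_rings(adjacency)
  rings = []
  # Treat each atom in turn as the lowest-numbered atom of a ring.
  adjacency.keys.sort.each { |start| extend_path([start], adjacency, rings) }
  rings
end

def extend_path(path, adjacency, rings)
  start = path.first
  adjacency[path.last].each do |neighbor|
    if neighbor == start
      # Ring closed. Each ring is reached once per direction,
      # so keep only the direction where path[1] < path.last.
      rings << path.dup if path.length >= 3 && path[1] < path.last
    elsif neighbor > start && !path.include?(neighbor)
      # Only visit atoms numbered above the start so that every
      # ring is reported from its lowest-numbered atom alone.
      extend_path(path + [neighbor], adjacency, rings)
    end
  end
end

# Hypothetical numbering: atom i of benzene is bonded to i+1 and i-1 (mod 6).
benzene = (0...6).to_h { |i| [i, [(i + 1) % 6, (i - 1) % 6]] }
# Cube graph: atoms are 3-bit labels; bonded atoms differ in exactly one bit.
cubane = (0...8).to_h { |i| [i, [i ^ 1, i ^ 2, i ^ 4]] }

puts all_rings(benzene).size
puts all_rings(cubane).size
```

For the cube graph the total breaks down into 6 four-membered, 16 six-membered, and 6 eight-membered rings, which agrees with the 28 rings found by HanserRingFinder.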
Other Improvements
The MX-0.108.1 release includes some other changes as well.
Fixes a bug in which multiline SD file data was not read.
Adds a resources directory containing atomic_system.xml so that the source distribution can compile and all tests will pass.
Conclusions
This first implementation of the Hanser algorithm focuses on correctness, readability, and test coverage over performance. Future releases will address performance in the context of an open, multi-toolkit cheminformatics benchmark suite.
Calculating Molecular Mass With MX: Using a Complete Hydrogen to Uranium System of Atomic Masses Linked to the Primary Literature
Calculating molecular mass is an important capability in cheminformatics. Performing the calculation itself is trivial, but deciding which atomic masses to use can be tricky. Ideally, each measurement in a system of atomic masses and isotopic distributions would be traceable to the primary literature.
MX is an open source cheminformatics toolkit written in Java that can be used without modification in a variety of other languages including Python and Ruby. The latest release of MX, v0.106.0, includes a complete hydrogen to uranium system of atomic masses linked to the primary literature.
You can try out the new feature with an interactive Jython shell (or interactive JRuby, if you'd prefer). Let's calculate the molecular mass of cubane. After downloading the mx-0.106.0 jarfile to your working directory, use:
Jython Completion Shell
Jython 2.5b0 (trunk:5540, Oct 31 2008, 13:55:41)
[Java HotSpot(TM) Client VM (Sun Microsystems Inc.)] on java1.5.0_16
>>> import sys
>>> sys.path.append("mx-0.106.0.jar")
>>> from com.metamolecular.mx.io.daylight import SMILESReader
>>> from com.metamolecular.mx.calc import MassCalculator
>>> c=MassCalculator()
>>> c.findAveragedMass(SMILESReader.read("C12C3C4C1C5C4C3C25"))
104.14912000000001
As you can see, using this new calculator is quite easy. One minor detail is the "1" at the end of the result, which is an artifact of binary floating-point arithmetic and is unrelated to the system of atomic masses.
MX uses an XML file compiled from the definitive IUPAC publication on atomic masses. The source of each measurement is cited, complete with uncertainty. More information on this file and its uses is available here.
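The arithmetic behind the cubane result can be sketched in a few lines of plain Ruby. This is not how MassCalculator is implemented; it only sums atomic masses over the molecular formula C8H8. The two values used, 12.0107 for carbon and 1.00794 for hydrogen, are standard atomic weights assumed for this sketch, and the names ATOMIC_MASS and averaged_mass are hypothetical.

```ruby
# Standard atomic weights assumed for this sketch. MX reads its own
# values from its XML file, which may differ in the last digits.
ATOMIC_MASS = { 'C' => 12.0107, 'H' => 1.00794 }.freeze

# Sums average atomic masses for a composition given as { symbol => count }.
# Hydrogens must be listed explicitly; nothing here parses SMILES.
def averaged_mass(composition)
  composition.sum { |symbol, count| ATOMIC_MASS.fetch(symbol) * count }
end

cubane = { 'C' => 8, 'H' => 8 } # molecular formula C8H8
puts averaged_mass(cubane).round(5)
```

Working through it by hand gives 8 x 12.0107 + 8 x 1.00794 = 96.0856 + 8.06352 = 104.14912, which matches the Jython output apart from the last-place rounding digit.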
Reading SMILES with MX
The latest release of MX, the Java toolkit for cheminformatics, now supports reading a subset of SMILES strings. The implementation is currently incomplete, but full support for this feature is planned within a few releases.
To get an idea of how to use the new SMILES reader, we can use interactive JRuby. Assuming we've downloaded the mx-0.105.0 jarfile to our working directory, we can use:
$ jirb
irb(main):001:0> require 'mx-0.105.0.jar'
=> true
irb(main):002:0> import com.metamolecular.mx.io.daylight.SMILESReader
=> Java::ComMetamolecularMxIoDaylight::SMILESReader
irb(main):003:0> bromobenzene = SMILESReader.read 'C1=CC=CC=C1Br'
=> #<Java::ComMetamolecularMxModel::DefaultMolecule:0x8a2023 @java_object=com.metamolecular.mx.model.DefaultMolecule@182a70>
irb(main):004:0> bromobenzene.count_atoms
=> 7
irb(main):005:0> bromobenzene.get_atom(6).get_symbol
=> "Br"
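To see why the reader reports seven atoms with bromine at index 6, it helps to look at how atom symbols appear in the string. The sketch below is not MX's parser; it is a hypothetical pure-Ruby tokenizer that only picks out unbracketed, uppercase organic-subset atom symbols and ignores bonds, ring-closure digits, branches, and aromatic lowercase atoms.

```ruby
# Two-letter halogens must be tried before single-letter symbols,
# otherwise "Br" would be read as boron followed by a stray "r".
ATOM_PATTERN = /Br|Cl|[BCNOPSFI]/

# Returns atom symbols in the order they appear in the SMILES string.
# Hydrogens are implicit in SMILES, so they do not show up here.
def atom_symbols(smiles)
  smiles.scan(ATOM_PATTERN)
end

symbols = atom_symbols('C1=CC=CC=C1Br')
puts symbols.size
puts symbols[6]
```

The six ring carbons take indices 0 through 5 in reading order and the bromine follows at index 6, consistent with the count_atoms and get_atom(6) results above.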
Flexible Depth-First Search With MX
Graph theory is an essential component of cheminformatics, if you dig deeply enough. MX is a lightweight cheminformatics toolkit written in Java with a major goal of exposing the most important cheminformatics graph manipulations in a flexible, Java-centric way. Previous releases have focused on implementing subgraph monomorphism functionality for use in substructure search. The new MX release, 0.104.0, introduces support for depth-first traversal. This article will give a simple example using this feature.
Downloading MX
MX can be downloaded in source or binary form:
mx-0.104.0.jar Platform-independent bytecode.
mx-0.104.0-src.tar.gz Source code and regression tests.
Scripting MX with JRuby
A previous article outlined the simple steps needed to install JRuby on unix-based systems for scripting MX.
Finding All Paths From a Given Atom
A fundamental graph operation in cheminformatics is finding all paths through a molecule from a starting atom. MX makes this easy with the com.metamolecular.mx.path.PathFinder class. Depth-first traversal is used in creating molecular fingerprints. Another use is in creating SMILES strings, although a limited form of depth-first traversal is used in which each atom in a molecule is traversed only once.
We can create a short library to print out all of the paths through a molecule in JRuby:
require 'mx-0.104.0.jar'
import 'com.metamolecular.mx.path.PathFinder'

class PathPrinter
  def initialize
    @finder = PathFinder.new
  end

  # Prints every path that starts at the given atom.
  def print_paths atom
    paths = @finder.find_all_paths atom
    puts "printing all paths through the molecule"
    paths.each do |path|
      print_path path
    end
  end

  private

  # Prints atom indices separated by dashes, with no trailing dash.
  def print_path path
    path.each do |atom|
      print atom.get_index
      print '-' unless path.get(path.length - 1).equals(atom)
    end
    puts
  end
end

Saving the above code in a file called pathprinter.rb, we can test it from interactive JRuby:
$ jirb
irb(main):001:0> require 'pathprinter'
=> true
irb(main):002:0> import com.metamolecular.mx.io.Molecules
=> Java::ComMetamolecularMxIo::Molecules
irb(main):003:0> benzene=Molecules.create_benzene
=> #<Java::ComMetamolecularMxModel::DefaultMolecule:0x43da1b @java_object=com.metamolecular.mx.model.DefaultMolecule@8a2023>
irb(main):004:0> p=PathPrinter.new
=> #<PathPrinter:0x19ed7e @finder=#<Java::ComMetamolecularMxPath::PathFinder:0x3727c5 @java_object=com.metamolecular.mx.path.PathFinder@1140709>>
irb(main):005:0> p.print_paths benzene.get_atom(0)
printing all paths through the molecule
0-5-4-3-2-1
0-1-2-3-4-5
=> nil
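The traversal itself can be sketched in plain Ruby without MX. The code below is an assumption about what "all paths" means, inferred from the benzene output: it follows every unvisited neighbor and records a path only when it can no longer be extended. The function names and the adjacency-hash representation are hypothetical and do not mirror PathFinder's internals.

```ruby
# Returns every maximal simple path that begins at the start atom.
def find_all_paths(adjacency, start)
  paths = []
  walk([start], adjacency, paths)
  paths
end

def walk(path, adjacency, paths)
  # Feasible branches are neighbors not already on the current path.
  branches = adjacency[path.last].reject { |atom| path.include?(atom) }
  if branches.empty?
    # Dead end: the path cannot grow further, so record it.
    paths << path
    return
  end
  # path + [atom] builds a new array, so no explicit backtracking is needed.
  branches.each { |atom| walk(path + [atom], adjacency, paths) }
end

# Hypothetical numbering: atom i of benzene is bonded to i+1 and i-1 (mod 6).
benzene = (0...6).to_h { |i| [i, [(i + 1) % 6, (i - 1) % 6]] }
find_all_paths(benzene, 0).each { |path| puts path.join('-') }
```

For benzene this yields the same two paths as the MX session, one in each direction around the ring; the order in which they are printed depends on the order of neighbors in the adjacency hash.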
How It Works
Two classes collaborate in this traversal: com.metamolecular.mx.path.PathFinder and com.metamolecular.mx.path.DefaultStep.
Creating a depth-first traversal of your own is as simple as creating a DefaultStep from an Atom and implementing a walk method similar to the one shown below:
public void walk(Step step)
{
  if (!step.hasNextBranch())
  {
    // do something with the completed branch
    return;
  }

  while (step.hasNextBranch())
  {
    Atom next = step.nextBranch();

    if (step.isBranchFeasible(next))
    {
      walk(step.nextStep(next));
      step.backTrack();
    }
  }
}

Conclusions
Depth-first traversal is an important tool in any cheminformatics library. MX offers an implementation of this traversal strategy that can be easily customized.

