<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/stylesheets/rss.css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>Depth-First: JRuby for Cheminformatics: Parsing SMILES Simply</title>
    <link>http://depth-first.com/articles/2007/10/09/jruby-for-cheminformatics-parsing-smiles-simply</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description>Walking the Web of Chemical Informatics</description>
    <item>
      <title>JRuby for Cheminformatics: Parsing SMILES Simply</title>
      <description>&lt;p&gt;&lt;a href="http://cdk.sf.net"&gt;&lt;img src="http://depth-first.com/files/cdk_logo.png" align="right"&gt;&lt;/img&gt;&lt;/a&gt;&lt;a href="http://ruby-lang.org"&gt;&lt;img src="http://depth-first.com/files/ruby_logo_new.gif" align="right"&gt;&lt;/img&gt;&lt;/a&gt;The previous article in this series outlined some &lt;a href="http://depth-first.com/articles/2007/10/08/five-reasons-to-start-using-jruby-now"&gt;reasons to consider JRuby for cheminformatics&lt;/a&gt;. Now I'll show how easy it is to get started by describing how to parse SMILES strings with the help of the &lt;a href="http://cdk.sf.net"&gt;Chemistry Development Kit&lt;/a&gt; (CDK).&lt;/p&gt;

&lt;h4&gt;What About Ruby CDK?&lt;/h4&gt;

&lt;p&gt;A number of Depth-First articles have discussed &lt;a href="http://depth-first.com/articles/2007/10/04/ruby-cdk-for-newbies"&gt;Ruby CDK&lt;/a&gt;. This library runs on top of C-Ruby, otherwise known as Matz's Ruby Interpreter (MRI). Under the hood, Ruby CDK uses &lt;a href="http://rjb.rubyforge.org/"&gt;Ruby Java Bridge&lt;/a&gt; to connect MRI to a Java Virtual Machine.&lt;/p&gt;

&lt;p&gt;This article, and the others to follow, will instead discuss the use of the CDK and other Java libraries from JRuby. In contrast to MRI, JRuby is a pure Java implementation of the Ruby language, so Java classes can be loaded directly without a bridge. This approach offers some important advantages, which will be highlighted along the way.&lt;/p&gt;

&lt;h4&gt;Installing JRuby&lt;/h4&gt;

&lt;p&gt;JRuby is not difficult to install. On Linux, the steps are:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Install &lt;a href="http://java.sun.com"&gt;JDK Version 1.4 or higher&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Download and unpack the most recent JRuby release; at the time of this writing, this is &lt;a href="http://dist.codehaus.org/jruby"&gt;version 1.0.1&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Add the JRuby &lt;tt&gt;bin&lt;/tt&gt; directory to your path.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;There is no Step 4. ;-)&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;Installing CDK for JRuby&lt;/h4&gt;

&lt;p&gt;Installing the CDK so that it works with JRuby is similarly simple:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Download the most recent CDK jarfile; at the time of this writing, this is &lt;a href="http://downloads.sourceforge.net/cdk/cdk-1.0.1.jar?modtime=1182877138&amp;amp;big_mirror=0"&gt;version 1.0.1&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Move the CDK jarfile to your JRuby &lt;tt&gt;lib&lt;/tt&gt; directory.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;Testing CDK for JRuby&lt;/h4&gt;

&lt;p&gt;You can verify that your new CDK for JRuby installation works with &lt;tt&gt;jirb&lt;/tt&gt;:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ jirb
irb(main):001:0&gt; require 'java'
=&gt; true
irb(main):002:0&gt; include_class 'org.openscience.cdk.smiles.SmilesParser'
=&gt; ["org.openscience.cdk.smiles.SmilesParser"]
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;You should notice that &lt;tt&gt;jirb&lt;/tt&gt; takes a few seconds to initialize the JVM, whereas &lt;tt&gt;irb&lt;/tt&gt; starts almost instantly.&lt;/p&gt;

&lt;h4&gt;A Library to Read SMILES&lt;/h4&gt;

&lt;p&gt;We can write a short library to read SMILES strings using the CDK:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;java&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;include_class&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;org.openscience.cdk.smiles.SmilesParser&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="keyword"&gt;module &lt;/span&gt;&lt;span class="module"&gt;Daylight&lt;/span&gt;
  &lt;span class="attribute"&gt;@@smiles_parser&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;SmilesParser&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;

  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;read_smiles&lt;/span&gt; &lt;span class="ident"&gt;smiles&lt;/span&gt;
    &lt;span class="attribute"&gt;@@smiles_parser&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;parse_smiles&lt;/span&gt; &lt;span class="ident"&gt;smiles&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Notice the use of the Rubyesque method name &lt;tt&gt;parse_smiles&lt;/tt&gt; rather than the Java name &lt;tt&gt;parseSmiles&lt;/tt&gt;. JRuby makes Java methods callable under snake_case aliases, which is just one of its built-in conveniences.&lt;/p&gt;

&lt;h4&gt;Testing the Library&lt;/h4&gt;

&lt;p&gt;Saving the library as a file called &lt;strong&gt;daylight.rb&lt;/strong&gt; lets us test it using interactive JRuby:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ jirb
irb(main):001:0&gt; require 'daylight'
=&gt; true
irb(main):002:0&gt; include Daylight
=&gt; Object
irb(main):003:0&gt; mol = read_smiles 'c1ccccc1'
=&gt; #&lt;Java::OrgOpenscienceCdk:: [truncated] ...&gt;
irb(main):004:0&gt; mol.atom_count
=&gt; 6
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;As you can see, the benzene SMILES has been parsed correctly. Again, notice the use of the Rubyesque method name &lt;tt&gt;atom_count&lt;/tt&gt;, rather than the JavaBeans-style getter name &lt;tt&gt;getAtomCount&lt;/tt&gt; used by the CDK. This feature makes it easy to ignore the fact that you're using a Java library and get on with writing your Ruby code. Brilliant!&lt;/p&gt;
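
&lt;p&gt;The two name conversions seen so far can be illustrated in plain Ruby. The following is only a sketch of the naming convention, not JRuby's actual implementation; the helper &lt;tt&gt;ruby_alias&lt;/tt&gt; is a hypothetical name introduced here, and it runs on any Ruby without Java or the CDK:&lt;/p&gt;

```ruby
# Illustration only: mimics the naming convention, not JRuby internals.
# ruby_alias is a hypothetical helper, not part of JRuby or the CDK.
def ruby_alias(java_name)
  # Insert an underscore at each lowercase-to-uppercase boundary, then downcase.
  snake = java_name.gsub(/([a-z0-9])([A-Z])/, '\1_\2').downcase
  # Drop a leading "get_" so bean-style getters read like Ruby attributes.
  snake.sub(/\Aget_/, '')
end

puts ruby_alias('parseSmiles')   # prints "parse_smiles"
puts ruby_alias('getAtomCount')  # prints "atom_count"
```

&lt;p&gt;JRuby keeps the original Java names available as well, so &lt;tt&gt;mol.getAtomCount&lt;/tt&gt; and &lt;tt&gt;mol.atom_count&lt;/tt&gt; call the same method.&lt;/p&gt;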

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;This article has shown how to install JRuby and begin to write some simple cheminformatics programs with a distinctive Ruby flavor. Although the focus was on SMILES parsing, there's much more functionality to be found within the CDK and other cheminformatics libraries written in Java. Future articles will outline some of the possibilities.&lt;/p&gt;</description>
      <pubDate>Tue, 09 Oct 2007 08:40:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:9007f034-5aa0-458c-b4e1-f9dc182d19be</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/10/09/jruby-for-cheminformatics-parsing-smiles-simply</link>
      <category>Tools</category>
      <category>jruby</category>
      <category>java</category>
      <category>ruby</category>
      <category>rubidium</category>
      <category>cdk</category>
      <category>smiles</category>
    </item>
  </channel>
</rss>
