<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/stylesheets/rss.css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>Depth-First: Tag rcdk</title>
    <link>http://depth-first.com/articles/tag/rcdk</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description>Walking the Web of Chemical Informatics</description>
    <item>
      <title>Ruby CDK for Newbies</title>
      <description>&lt;p&gt;&lt;img src="http://depth-first.com/demo/20071004/newbies.png" align="right"&gt;&lt;/img&gt;Scripting languages and cheminformatics can be a highly-effective combination. With their relaxed syntax, compilation-free execution, and interactive testing environments, scripting languages offer fast development iteration cycles. And scripting languages' support for manipulating libraries written in other languages can be key in today's heterogeneous cheminformatics software environment.&lt;/p&gt;

&lt;p&gt;Although there are many &lt;a href="http://depth-first.com/articles/2006/11/14/eleven-free-cheminformatics-scripting-environments"&gt;cheminformatics scripting environments&lt;/a&gt; to choose from, Ruby offers some important advantages. Number one on the list is the wildly-popular &lt;a href="http://rubyonrails.org"&gt;Ruby on Rails&lt;/a&gt; Web development framework. Others worth mentioning include &lt;a href="http://tryruby.hobix.com/"&gt;interactive ruby&lt;/a&gt; (irb), the &lt;a href="http://www.rubygems.org/"&gt;RubyGems&lt;/a&gt; package manager, the &lt;a href="http://martinfowler.com/articles/rake.html"&gt;Rake&lt;/a&gt; build system, the &lt;a href="http://jruby.org"&gt;JRuby&lt;/a&gt; Ruby implementation, &lt;a href="http://rubyforge.org"&gt;RubyForge&lt;/a&gt;, and a host of other productivity-boosters.&lt;/p&gt;

&lt;p&gt;A major focus of Depth-First over the last few months has been &lt;a href="http://depth-first.com/articles/tag/rubycdk"&gt;Ruby CDK&lt;/a&gt;. This library consists of a thin Ruby wrapper around the open source &lt;a href="http://cdk.sf.net"&gt;Chemistry Development Kit&lt;/a&gt; (CDK), &lt;a href="http://depth-first.com/articles/2006/08/28/drawing-2-d-structures-with-structure-cdk"&gt;Structure-CDK&lt;/a&gt;, an open source 2D rendering toolkit, and &lt;a href="http://depth-first.com/articles/tag/opsin"&gt;OPSIN&lt;/a&gt;, an open source chemical nomenclature parser. A recent comment on Depth-First by &lt;a href="http://chem-bla-ics.blogspot.com/"&gt;Egon Willighagen&lt;/a&gt;, one of CDK's creators, got me thinking about centralizing this documentation. The following collection of links is a step in that direction.&lt;/p&gt;

&lt;h4&gt;Overview and Installation&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2006/10/30/agile-chemical-informatics-development-with-cdk-and-ruby-rcdk-0-3-0"&gt;Agile Chemical Informatics Development with CDK and Ruby CDK 0.3.0&lt;/a&gt; Installation of Ruby CDK on Linux.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2006/10/12/running-ruby-java-bridge-on-windows"&gt;Running Ruby Java Bridge on Windows&lt;/a&gt; Special installation instructions for Ruby CDK on Windows.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;Ruby CDK in Its Environment&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2006/10/17/from-iupac-nomenclature-to-2-d-structures-with-opsin"&gt;From IUPAC Nomenclature to 2D Structures with OPSIN&lt;/a&gt; OPSIN converts IUPAC nomenclature into molecular representations and is now part of Ruby CDK.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2006/11/14/eleven-free-cheminformatics-scripting-environments"&gt;Eleven Free Cheminformatics Scripting Environments&lt;/a&gt; So many choices, so little time.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2006/10/24/metaprogramming-with-ruby-mapping-java-packages-onto-ruby-modules"&gt;Metaprogramming with Ruby: Mapping Java Packages Onto Ruby Modules&lt;/a&gt; A behind-the-scenes look at a trick used in Ruby CDK.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;Using Ruby CDK&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2006/11/21/build-a-rails-cheminformatics-application-in-thirty-minutes"&gt;Build a Rails Cheminformatics Application in Thirty Minutes&lt;/a&gt; First article in a series on building a SMILES Depict application.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2006/11/27/anatomy-of-a-cheminformatics-web-application-beautifying-depict"&gt;Anatomy of a Cheminformatics Web Application: Beautifying Depict&lt;/a&gt; Second article in the series - cleaning up the Depict user interface.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2006/12/04/anatomy-of-a-cheminformatics-web-application-ajaxifying-depict"&gt;Anatomy of a Cheminformatics Web Application: Ajaxifying Depict&lt;/a&gt; Third article in the series - use Ajax to automatically update the Depict drawing.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2006/11/13/cheminformatics-for-the-web-convert-sd-files-to-html-with-ruby-cdk"&gt;Cheminformatics for the Web: Convert SD Files to HTML with Ruby CDK&lt;/a&gt; SD Files are both everywhere and useless by themselves to chemists - why not convert them into HTML and post them to the Web?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2006/11/15/diversity-oriented-chemical-informatics"&gt;Diversity-Oriented Chemical Informatics&lt;/a&gt; CDK is chock-full of nifty little tidbits, like the ability to enumerate all molecules of a given empirical formula.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2006/11/22/scripting-molecular-fingerprints-with-ruby-cdk"&gt;Scripting Molecular Fingerprints with Ruby CDK&lt;/a&gt; To borrow a phrase from a cheminformatics master: "It's just that easy."&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2006/12/11/hacking-molbank-creating-a-graphical-table-of-contents"&gt;Hacking Molbank: Creating a Graphical Table of Contents&lt;/a&gt; The intersection of Open Access, Open Source, and rapid application development.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2006/12/18/anatomy-of-a-cheminformatics-web-application-structure-cleanup-in-java-molecular-editor"&gt;Anatomy of a Cheminformatics Web Application: Structure Cleanup in Java Molecular Editor&lt;/a&gt; The structure editor &lt;em&gt;can&lt;/em&gt; be lean, mean, and still highly functional - just offload resource-hungry features to the server.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2007/03/13/from-iupac-name-to-molecular-formula-with-ruby-cdk"&gt;From IUPAC Name to Molecular Formula with Ruby CDK&lt;/a&gt; &lt;a href="http://wwmm.ch.cam.ac.uk/blogs/corbett/"&gt;Peter Corbett's&lt;/a&gt; awesome OPSIN Library plays nice with Ruby CDK.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2007/09/06/from-inchi-to-image-with-ruby-open-babel-and-ruby-cdk"&gt;From InChI to Image with Ruby Open Babel and Ruby CDK&lt;/a&gt; InChIs are not easy to interpret - fortunately, this little library will do it for you.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2007/09/19/easily-calculate-tpsa-descriptors-from-smiles-strings-using-ruby-cdk"&gt;Easily Calculate TPSA Descriptors from SMILES Strings Using Ruby CDK&lt;/a&gt; It just works.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2007/09/20/ruby-cdk-one-liners-create-a-molfile-with-2d-atom-coordinates-from-arbitrary-smiles-strings"&gt;Ruby CDK One-Liners: Create a Molfile with 2D Atom Coordinates from Arbitrary SMILES Strings&lt;/a&gt; Extremely short library for solving a very common cheminformatics problem.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2006/12/05/source-code-documentation-in-ruby-rdoc-for-ruby-cdk"&gt;Source Code Documentation in Ruby: RDoc for Ruby CDK&lt;/a&gt; When all else fails, read the documentation.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Image Generation Credit: &lt;a href="http://txt2pic.com/"&gt;txt2pic.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;</description>
      <pubDate>Thu, 04 Oct 2007 10:01:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:e2e8719d-5cef-4a64-813a-bf49cd1aa41f</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/10/04/ruby-cdk-for-newbies</link>
      <category>Tools</category>
      <category>rubycdk</category>
      <category>rcdk</category>
      <category>newbies</category>
      <category>ruby</category>
    </item>
    <item>
      <title>Ruby CDK One-Liners: Create a Molfile With 2D Atom Coordinates From Arbitrary SMILES Strings</title>
      <description>&lt;p&gt;&lt;a href="http://ruby-lang.org"&gt;&lt;img src="http://depth-first.com/files/ruby_logo_new.gif" align="right" border="0"&gt;&lt;/img&gt;&lt;/a&gt;A very common operation in cheminformatics is the interconversion of molfiles and SMILES strings. Usually, converting from SMILES gives a molfile in which all atoms have coordinates of (0,0,0). Sometimes you just need more than that. The following &lt;a href="http://depth-first.com/articles/tag/rcdk"&gt;Ruby CDK&lt;/a&gt; code will accept an arbitrary SMILES string and return a molfile with fully-assigned 2D atom coordinates:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rubygems&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk/util&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;include&lt;/span&gt; &lt;span class="constant"&gt;RCDK&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Util&lt;/span&gt;

&lt;span class="constant"&gt;XY&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;coordinate_molfile&lt;/span&gt; &lt;span class="constant"&gt;Lang&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;smiles_to_molfile&lt;/span&gt;&lt;span class="punct"&gt;('&lt;/span&gt;&lt;span class="string"&gt;c1ccccc1&lt;/span&gt;&lt;span class="punct"&gt;')&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Looking at it this way, those four lines of require/include statements seem pretty darn verbose.&lt;/p&gt;</description>
      <pubDate>Thu, 20 Sep 2007 14:18:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:8c347f16-8a0c-4d35-a02c-a2560fdc5f79</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/09/20/ruby-cdk-one-liners-create-a-molfile-with-2d-atom-coordinates-from-arbitrary-smiles-strings</link>
      <category>Tools</category>
      <category>rubycdk</category>
      <category>rcdk</category>
      <category>smiles</category>
      <category>molfile</category>
      <category>interconversion</category>
      <category>sdg</category>
      <category>coordinates</category>
    </item>
    <item>
      <title>Easily Calculate TPSA Descriptors from SMILES Strings Using Ruby CDK</title>
      <description>&lt;p&gt;&lt;a href="http://ruby-lang.org"&gt;&lt;img src="http://depth-first.com/files/ruby_logo_new.gif" align="right" border="0"&gt;&lt;/img&gt;&lt;/a&gt;A D-F reader wrote in to ask how to calculate Topological Polar Surface Area (TPSA) using &lt;a href="http://depth-first.com/articles/tag/rcdk"&gt;Ruby CDK&lt;/a&gt;. TPSA is one of the most widely used descriptors for predicting membrane permeability and, by extension, other important ADME properties. This article shows how to calculate TPSA with Ruby CDK.&lt;/p&gt;

&lt;h4&gt;The Library&lt;/h4&gt;

&lt;p&gt;Our library consists of nothing more than a few method calls to manipulate the underlying &lt;a href="http://cdk.sf.net"&gt;CDK&lt;/a&gt; library. The &lt;tt&gt;tpsa_for&lt;/tt&gt; method accepts any SMILES string and returns the calculated TPSA:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rubygems&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require_gem&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk/util&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="ident"&gt;jrequire&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;org.openscience.cdk.qsar.descriptors.molecular.TPSADescriptor&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="keyword"&gt;module &lt;/span&gt;&lt;span class="module"&gt;TPSA&lt;/span&gt;
  &lt;span class="attribute"&gt;@@calc&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;Org&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Openscience&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Cdk&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Qsar&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Descriptors&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Molecular&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;TPSADescriptor&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;

  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;tpsa_for&lt;/span&gt; &lt;span class="ident"&gt;smiles&lt;/span&gt;
    &lt;span class="ident"&gt;mol&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;RCDK&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Util&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Lang&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;read_smiles&lt;/span&gt; &lt;span class="ident"&gt;smiles&lt;/span&gt;

    &lt;span class="attribute"&gt;@@calc&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;calculate&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;mol&lt;/span&gt;&lt;span class="punct"&gt;).&lt;/span&gt;&lt;span class="ident"&gt;getValue&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;doubleValue&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
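
&lt;p&gt;The &lt;tt&gt;jrequire&lt;/tt&gt; line and the constant assigned to &lt;tt&gt;@@calc&lt;/tt&gt; follow a visible naming pattern: each Java package segment is capitalized and the segments are joined with &lt;tt&gt;::&lt;/tt&gt;. The following is only a sketch of that naming rule, not Ruby CDK's actual implementation, and &lt;tt&gt;java_to_ruby_constant&lt;/tt&gt; is a hypothetical helper name:&lt;/p&gt;

```ruby
# Hypothetical helper illustrating the naming pattern seen above;
# it does not load any Java class and is not part of Ruby CDK.
def java_to_ruby_constant(java_name)
  parts = java_name.split('.')
  class_name = parts.pop                       # keep the class name as-is
  modules = parts.map { |segment| segment.capitalize }
  (modules + [class_name]).join('::')
end

java_to_ruby_constant('org.openscience.cdk.qsar.descriptors.molecular.TPSADescriptor')
# => "Org::Openscience::Cdk::Qsar::Descriptors::Molecular::TPSADescriptor"
```

&lt;p&gt;The same rule should give the Ruby constant path for any other descriptor class in that package, assuming it is first loaded with &lt;tt&gt;jrequire&lt;/tt&gt; as shown above.&lt;/p&gt;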

&lt;h4&gt;An Interactive Test&lt;/h4&gt;

&lt;p&gt;Saving the library to a file called &lt;strong&gt;tpsa.rb&lt;/strong&gt; lets us test it through interactive Ruby (irb):&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ irb
irb(main):001:0&gt; require 'tpsa'
./tpsa.rb:2:Warning: require_gem is obsolete.  Use gem instead.
/usr/local/lib/ruby/gems/1.8/gems/rcdk-0.3.0/lib/rcdk/java.rb:26:Warning: require_gem is obsolete.  Use gem instead.
=&gt; true
irb(main):002:0&gt; include TPSA
=&gt; Object
irb(main):003:0&gt; tpsa_for 'COCCc1ccc(OCC(O)CNC(C)C)cc1' # metoprolol
=&gt; 50.72
irb(main):004:0&gt; tpsa_for 'O=C3Nc1ccc(Cl)cc1C(c2ccccc2)=NC3O' # oxazepam
=&gt; 61.69
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The results we obtain for metoprolol and oxazepam are 50.72 and 61.69, respectively. These values compare well with those reported by Ertl et al. in the &lt;a href="http://dx.doi.org/10.1021/jm000942e"&gt;definitive paper on TPSA&lt;/a&gt; (50.7 and 61.7, respectively).&lt;/p&gt;
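
&lt;p&gt;To make the comparison explicit, the calculated values can be rounded to the single decimal place used for the literature values quoted above. This is just a small arithmetic check on the numbers already shown; the variable names are illustrative:&lt;/p&gt;

```ruby
# Values returned by tpsa_for in the irb session above.
computed = { 'metoprolol' => 50.72, 'oxazepam' => 61.69 }
# Literature values as quoted above.
reported = { 'metoprolol' => 50.7, 'oxazepam' => 61.7 }

# Round each calculated value to one decimal place and compare.
agreement = computed.map do |name, value|
  [name, value.round(1) == reported[name]]
end

agreement.each { |name, ok| puts "#{name}: #{ok ? 'matches' : 'differs'}" }
```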

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;It doesn't take much Ruby to command a wide range of cheminformatics functionality - in this case TPSA calculations. But the fun doesn't stop there. The CDK, and by extension Ruby CDK, offer access to a &lt;a href="http://cheminfo.informatics.indiana.edu/~rguha/code/java/nightly/api/org/openscience/cdk/qsar/descriptors/molecular/package-frame.html"&gt;wide array of descriptor calculations&lt;/a&gt;, each of which follows the same basic pattern outlined here. All of it can be prototyped, debugged, and deployed through one of the most flexible programming languages currently available.&lt;/p&gt;</description>
      <pubDate>Wed, 19 Sep 2007 09:27:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:f9b7229e-0e55-4299-a5ce-4d035a424398</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/09/19/easily-calculate-tpsa-descriptors-from-smiles-strings-using-ruby-cdk</link>
      <category>Tools</category>
      <category>ruby</category>
      <category>rcdk</category>
      <category>rubycdk</category>
      <category>tpsa</category>
      <category>descriptor</category>
      <category>rjb</category>
    </item>
    <item>
      <title>From InChI to Image with Ruby Open Babel and Ruby CDK</title>
      <description>&lt;p&gt;&lt;a href="http://ruby-lang.org"&gt;&lt;img src="http://depth-first.com/files/ruby_logo_new.gif" align="right" border="0"&gt;&lt;/img&gt;&lt;/a&gt;Like SMILES, InChI is a line notation that can be used to encode and store chemical information relatively efficiently. Although there are a number of scenarios where this strategy is used, what many of them have in common is the need to eventually convert an InChI into a human-readable form. In most cases, this form will be a 2D chemical structure. This article will show how a small Ruby library can convert InChI strings into color PNG images with the help of &lt;a href="http://depth-first.com/articles/tag/rubyopenbabel"&gt;Ruby Open Babel&lt;/a&gt; and &lt;a href="http://depth-first.com/articles/tag/rcdk"&gt;Ruby CDK&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;The Library&lt;/h4&gt;

&lt;p&gt;Our library accepts an InChI as input and produces a scaled PNG image as output. It re-uses part of a &lt;a href="http://depth-first.com/articles/2007/06/25/interconvert-almost-any-smiles-and-inchi-with-ruby-open-babel"&gt;previously-discussed&lt;/a&gt; library for the interconversion of SMILES and InChI.&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rubygems&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;openbabel&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require_gem&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk/util&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="keyword"&gt;module &lt;/span&gt;&lt;span class="module"&gt;InChI&lt;/span&gt;
  &lt;span class="attribute"&gt;@@to_smiles&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;OpenBabel&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;OBConversion&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;
  &lt;span class="attribute"&gt;@@to_smiles&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;set_in_and_out_formats&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;inchi&lt;/span&gt;&lt;span class="punct"&gt;',&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;smi&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;inchi_to_png&lt;/span&gt; &lt;span class="ident"&gt;inchi&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;path_to_png&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;width&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;height&lt;/span&gt;
    &lt;span class="ident"&gt;smiles&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;inchi_to_smiles&lt;/span&gt; &lt;span class="ident"&gt;inchi&lt;/span&gt;

    &lt;span class="constant"&gt;RCDK&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Util&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Image&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;smiles_to_png&lt;/span&gt; &lt;span class="ident"&gt;smiles&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;path_to_png&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;width&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;height&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;

  &lt;span class="ident"&gt;private&lt;/span&gt;

    &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;inchi_to_smiles&lt;/span&gt; &lt;span class="ident"&gt;inchi&lt;/span&gt;
      &lt;span class="ident"&gt;mol&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;OpenBabel&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;OBMol&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;

      &lt;span class="attribute"&gt;@@to_smiles&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;read_string&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;mol&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;inchi&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt; &lt;span class="keyword"&gt;or&lt;/span&gt; &lt;span class="keyword"&gt;raise&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;Can't parse InChI: &lt;span class="expr"&gt;#{inchi}&lt;/span&gt;.&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;
      &lt;span class="attribute"&gt;@@to_smiles&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;write_string&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;mol&lt;/span&gt;&lt;span class="punct"&gt;).&lt;/span&gt;&lt;span class="ident"&gt;strip&lt;/span&gt;
    &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
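
&lt;p&gt;The library raises an error when Open Babel can't parse its input. A cheap check that could be run before calling &lt;tt&gt;inchi_to_png&lt;/tt&gt; is to confirm that the string at least starts with the &lt;tt&gt;InChI=&lt;/tt&gt; prefix. The sketch below is purely syntactic and says nothing about whether the InChI is valid; &lt;tt&gt;looks_like_inchi?&lt;/tt&gt; is a hypothetical helper, not part of Ruby Open Babel or Ruby CDK:&lt;/p&gt;

```ruby
# Hypothetical pre-check: true only for strings that begin with the
# "InChI=" prefix (ignoring surrounding whitespace). It does not
# validate the layers that follow the prefix.
def looks_like_inchi?(text)
  text.is_a?(String) and text.strip.start_with?('InChI=')
end

looks_like_inchi?('InChI=1/CH4/h1H4')  # prefix present
looks_like_inchi?('c1ccccc1')          # a SMILES string, no prefix
```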

&lt;h4&gt;Testing&lt;/h4&gt;

&lt;p&gt;Our library can be tested by saving it to a file called &lt;strong&gt;inchi.rb&lt;/strong&gt; and using interactive Ruby (the warning can safely be ignored for now):&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ irb
irb(main):001:0&gt; require 'inchi'
./inchi.rb:3:Warning: require_gem is obsolete.  Use gem instead.
/usr/local/lib/ruby/gems/1.8/gems/rcdk-0.3.0/lib/rcdk/java.rb:26:Warning: require_gem is obsolete.  Use gem instead.
=&gt; true
irb(main):002:0&gt; include InChI
=&gt; Object
irb(main):003:0&gt; inchi='InChI=1/C23H27FN4O2/c1-15-18(23(29)28-10-3-2-4-21(28)25-15)9-13-27-11-7-16(8-12-27)22-19-6-5-17(24)14-20(19)30-26-22/h5-6,14,16H,2-4,7-13H2,1H3' #risperidone
=&gt; "InChI=1/C23H27FN4O2/c1-15-18(23(29)28-10-3-2-4-21(28)25-15)9-13-27-11-7-16(8-12-27)22-19-6-5-17(24)14-20(19)30-26-22/h5-6,14,16H,2-4,7-13H2,1H3"
irb(main):004:0&gt; inchi_to_png inchi, 'risperidone.png', 300, 300
=&gt; nil
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This code produces the following image:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;a href="http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=5073"&gt;&lt;img src="http://depth-first.com/demo/20070906/risperidone.png"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;Our library can also be used on more complicated molecules, for example Brevetoxin:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ irb
irb(main):001:0&gt; require 'inchi'
./inchi.rb:3:Warning: require_gem is obsolete.  Use gem instead.
/usr/local/lib/ruby/gems/1.8/gems/rcdk-0.3.0/lib/rcdk/java.rb:26:Warning: require_gem is obsolete.  Use gem instead.
=&gt; true
irb(main):002:0&gt; include InChI
=&gt; Object
irb(main):003:0&gt; inchi='InChI=1/C49H70O13/c1-26-17-36-39(22-45(52)58-36)57-44-21-38-40(62-48(44,4)23-26)18-28(3)46-35(55-38)11-7-6-10-31-32(59-46)12-8-14-34-33(54-31)13-9-15-43-49(5,61-34)24-42-37(56-43)20-41-47(60-42)30(51)19-29(53-41)16-27(2)25-50/h6-8,14,25-26,28-44,46-47,51H,2,9-13,15-24H2,1,3-5H3/b7-6-,14-8-' #brevetoxin a
=&gt; "InChI=1/C49H70O13/c1-26-17-36-39(22-45(52)58-36)57-44-21-38-40(62-48(44,4)23-26)18-28(3)46-35(55-38)11-7-6-10-31-32(59-46)12-8-14-34-33(54-31)13-9-15-43-49(5,61-34)24-42-37(56-43)20-41-47(60-42)30(51)19-29(53-41)16-27(2)25-50/h6-8,14,25-26,28-44,46-47,51H,2,9-13,15-24H2,1,3-5H3/b7-6-,14-8-"
irb(main):004:0&gt; inchi_to_png inchi, 'brevetoxin.png', 300, 200
=&gt; nil
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This produces the following image:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;a href="http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=6437089"&gt;&lt;img src="http://depth-first.com/demo/20070906/brevetoxin.png"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/center&gt;&lt;/p&gt;

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;While our library could certainly be improved, it conveniently solves what would otherwise be a very difficult problem. Areas for further work include error handling and improving the appearance of the images (the latter is the aim of &lt;a href="http://depth-first.com/articles/tag/firefly"&gt;Firefly&lt;/a&gt;). Although three programming languages are involved (Ruby, C++, and Java), this complexity is neatly encapsulated behind a simple Ruby interface.&lt;/p&gt;</description>
      <pubDate>Thu, 06 Sep 2007 08:25:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:242847bf-aa4f-474d-979f-7b73ed072a28</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/09/06/from-inchi-to-image-with-ruby-open-babel-and-ruby-cdk</link>
      <category>Tools</category>
      <category>ruby</category>
      <category>rubycdk</category>
      <category>rcdk</category>
      <category>rubyopenbabel</category>
      <category>depict</category>
      <category>inchi</category>
      <category>sdg</category>
    </item>
    <item>
      <title>From IUPAC Name to Molecular Formula with Ruby CDK</title>
      <description>&lt;p&gt;&lt;img src="http://depth-first.com/files/ruby_logo_new.gif" align="right"&gt;&lt;/img&gt;Recently, a question was raised on the &lt;a href="http://tech.groups.yahoo.com/group/chemoinf/"&gt;Yahoo chemoinf group list&lt;/a&gt; regarding the conversion of IUPAC names into molecular formulas. This can be done quickly with Ruby CDK, as this article will show.&lt;/p&gt;

&lt;h4&gt;Prerequisites&lt;/h4&gt;

&lt;p&gt;This tutorial requires &lt;a href="http://depth-first.com/articles/2006/10/30/agile-chemical-informatics-development-with-cdk-and-ruby-rcdk-0-3-0"&gt;Ruby CDK&lt;/a&gt;, which in turn requires &lt;a href="http://rjb.rubyforge.org/"&gt;Ruby Java Bridge&lt;/a&gt; (RJB). A recent Depth-First article described the minimal system configuration required to run &lt;a href="http://depth-first.com/articles/2006/08/26/scripting-java-libraries-with-ruby-java-bridge"&gt;RJB on Linux&lt;/a&gt;. Another article showed how to install &lt;a href="http://depth-first.com/articles/2006/10/12/running-ruby-java-bridge-on-windows"&gt;RJB on Windows&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;A Small Library&lt;/h4&gt;

&lt;p&gt;The following library will convert IUPAC nomenclature into molecular formulas with Ruby:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rubygems&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require_gem&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk/util&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="keyword"&gt;module &lt;/span&gt;&lt;span class="module"&gt;Formulator&lt;/span&gt;
  &lt;span class="attribute"&gt;@@hydrogen_adder&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;Rjb&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="ident"&gt;import&lt;/span&gt;&lt;span class="punct"&gt;('&lt;/span&gt;&lt;span class="string"&gt;org.openscience.cdk.tools.HydrogenAdder&lt;/span&gt;&lt;span class="punct"&gt;').&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;

  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;get_formula&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;iupac_name&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
    &lt;span class="ident"&gt;mol&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;RCDK&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Util&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Lang&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;read_iupac&lt;/span&gt; &lt;span class="ident"&gt;iupac_name&lt;/span&gt;
    &lt;span class="attribute"&gt;@@hydrogen_adder&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;addExplicitHydrogensToSatisfyValency&lt;/span&gt; &lt;span class="ident"&gt;mol&lt;/span&gt;
    &lt;span class="ident"&gt;analyzer&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;Rjb&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="ident"&gt;import&lt;/span&gt;&lt;span class="punct"&gt;('&lt;/span&gt;&lt;span class="string"&gt;org.openscience.cdk.tools.MFAnalyser&lt;/span&gt;&lt;span class="punct"&gt;').&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;mol&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;

    &lt;span class="ident"&gt;analyzer&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;getMolecularFormula&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Save this code as a file named &lt;strong&gt;formulator.rb&lt;/strong&gt; in your working directory.&lt;/p&gt;

&lt;h4&gt;Testing the Library&lt;/h4&gt;

&lt;p&gt;The Formulator library can be tested with the following code:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;formulator&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;include&lt;/span&gt; &lt;span class="constant"&gt;Formulator&lt;/span&gt;

&lt;span class="ident"&gt;get_formula&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;benzene&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt; &lt;span class="comment"&gt;# =&amp;gt; &amp;quot;C6H6&amp;quot;&lt;/span&gt;
&lt;span class="ident"&gt;get_formula&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;4-(3,4-dichlorophenyl)-N-methyl-1,2,3,4-tetrahydronaphthalen-1-amine&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt; &lt;span class="comment"&gt;# =&amp;gt; &amp;quot;C17H17NCl2&amp;quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h4&gt;Limitations&lt;/h4&gt;

&lt;p&gt;You may run across classes of structures that are not recognized by Ruby CDK. This is due to limitations of the underlying &lt;a href="http://depth-first.com/articles/tag/opsin"&gt;OPSIN library&lt;/a&gt;. For example, OPSIN does not yet recognize fused heterocycle names such as 'imidazo[2,1-b][1,3]thiazole'.&lt;/p&gt;

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;Ruby CDK makes short work of converting IUPAC names into molecular formulas. This is just one of the conversions that are possible: &lt;a href="http://depth-first.com/articles/2006/10/30/agile-chemical-informatics-development-with-cdk-and-ruby-rcdk-0-3-0"&gt;a recent article&lt;/a&gt;, for instance, discussed the conversion of IUPAC names to color 2-D structures.&lt;/p&gt;

&lt;p&gt;Due to Ruby's position as both a highly functional scripting language and as the foundation for the popular Web application framework &lt;a href="http://www.rubyonrails.org/"&gt;Ruby on Rails&lt;/a&gt;, a variety of IUPAC nomenclature translation applications are just a few lines of code away.&lt;/p&gt;</description>
      <pubDate>Tue, 13 Mar 2007 10:25:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:6529cee0-0821-45b1-865a-267a3254d85a</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/03/13/from-iupac-name-to-molecular-formula-with-ruby-cdk</link>
      <category>Tools</category>
      <category>rubycdk</category>
      <category>rcdk</category>
      <category>iupac</category>
      <category>formula</category>
    </item>
    <item>
      <title>Hacking Molbank: Creating a Graphical Table of Contents</title>
      <description>&lt;p&gt;&lt;a href="http://www.mdpi.org/"&gt;&lt;img src="http://depth-first.com/files/mdpi-small.gif" border="0" align="right"&gt;&lt;/img&gt;&lt;/a&gt;&lt;a href="http://www.mdpi.org/"&gt;Molbank&lt;/a&gt; is an Open Access collection of single-compound articles on synthetic chemistry. Previous articles on Depth-First have highlighted Molbank's practice of including &lt;a href="http://depth-first.com/articles/2006/11/30/molbank-and-the-convergence-of-open-access-open-data-and-open-source-in-chemistry"&gt;machine-readable molecular representations of its content&lt;/a&gt;, and its very &lt;a href="http://depth-first.com/articles/2006/12/01/hacking-molbank-downloading-a-complete-chemistry-journal"&gt;liberal policy on mirroring and robots&lt;/a&gt;. In this article, we'll take advantage of both of these features to build something that was left out of Molbank: a graphical table of contents.&lt;/p&gt;

&lt;h4&gt;The Graphical Table of Contents (GTOC)&lt;/h4&gt;

&lt;p&gt;&lt;a href="http://depth-first.com/demo/20061211/molbank/index.html"&gt;The Molbank Graphical Table of Contents&lt;/a&gt; (Molbank GTOC) is available online. It consists of a single Web page containing a grid of color 2-D chemical structures representing the contents of Molbank. Each structure is hyperlinked into the Molbank site itself. Clicking on the structure takes you to the complete synthetic procedure and characterization data.&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;a href="http://depth-first.com/demo/20061211/molbank/index.html"&gt;&lt;img src="http://depth-first.com/demo/20061211/screenshot_1.png" border="0"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/center&gt;&lt;/p&gt;

&lt;h4&gt;Prerequisites, Downloading, and Running&lt;/h4&gt;

&lt;p&gt;To run this project, you'll need &lt;a href="http://depth-first.com/articles/2006/10/30/agile-chemical-informatics-development-with-cdk-and-ruby-rcdk-0-3-0"&gt;Ruby CDK&lt;/a&gt;. A recent article described the small amount of system configuration required for &lt;a href="http://depth-first.com/articles/2006/09/25/cdk-the-ruby-way-rcdk-0-2-0"&gt;Ruby CDK on Linux&lt;/a&gt;. Another article showed how to install &lt;a href="http://depth-first.com/articles/2006/10/12/running-ruby-java-bridge-on-windows"&gt;Ruby CDK on Windows&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The complete source code for this project can be &lt;a href="http://rubyforge.org/frs/download.php/15500/molbank-0.0.1.tar.gz"&gt;downloaded from RubyForge&lt;/a&gt;. A subdirectory called &lt;strong&gt;demo&lt;/strong&gt; contains the pre-built final result.&lt;/p&gt;

&lt;p&gt;After unpacking the &lt;strong&gt;molbank-0.0.1&lt;/strong&gt; archive, the demo application can be run:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ cd molbank-0.0.1
$ ruby test.rb
&lt;/pre&gt;
&lt;/div&gt;

&lt;h4&gt;Problems, We've Got Problems&lt;/h4&gt;

&lt;p&gt;Several problems were uncovered while building the Molbank GTOC. This is to be expected with any data produced "in the wild" rather than within the safety of an Ivory Tower. Here are the main categories:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Blank Images&lt;/strong&gt; The entry for M52 is blank. Checking the &lt;a href="http://www.mdpi.net/molbank/m0052.mol"&gt;underlying molfile&lt;/a&gt; reveals four instances of bond stereo flags set to "6," a problem common to many of the blank images in the GTOC. According to the Molfile specification, a value of 6 marks a single bond as "Down"; the "Double bonds" heading that follows it in the specification introduces a separate list of values. Given that the &lt;a href="http://www.mdpi.net/molbank/m0052.htm"&gt;molecules shown in M52&lt;/a&gt; only have one possible stereo bond, and that the Molfile specification relies on 2-D coordinates to encode double-bond geometry, an encoding inconsistency or incorrect stereo interpretation may be the cause.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Images Containing an "R" Atom Label&lt;/strong&gt; Entry M53 shows an "R" group at what should be the carbonyl carbon. &lt;a href="http://www.mdpi.net/molbank/m0053.mol"&gt;The underlying molfile&lt;/a&gt; contains several less-common entries in the properties block, a common feature of images containing "R" in the GTOC.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Molfile not Found&lt;/strong&gt; Entry M95 has no associated Molfile because it simply reports errata for other articles. M253-M259, on the other hand, lack molfiles because the articles were "withdrawn before publication." M347 describes a cyclodextrin for which, understandably, no molfile was provided. There are also a couple of cases in which a link to a molfile is provided, but is not available, such as M352.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Broken Molfiles&lt;/strong&gt; &lt;a href="http://www.mdpi.net/molbank/m0162.mol"&gt;The Molfile for M162&lt;/a&gt; encodes its line endings as two carriage returns and a newline, giving rise to the appearance of blank lines after data lines. This is something the Molfile specification strictly forbids. Apparently, the underlying CDK molfile reader can only handle one carriage return and a newline. Perhaps the extra return was introduced as the file was copied into and out of text editors on various operating systems in preparation for uploading it to Molbank. Another common problem was binary files being used for molfiles, such as with &lt;a href="http://www.mdpi.net/molbank/molbank2005/m402.mol"&gt;M402&lt;/a&gt;. These files don't appear to be compressed with either Zip or GZip and their nature is currently unknown.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Bogus Molfiles&lt;/strong&gt; For reasons I still can't understand, &lt;a href="http://www.mdpi.net/molbank/molbank2005/m407.mol"&gt;the Molfile for M407&lt;/a&gt; encodes ethylene. So do several other Molbank molfiles. Other common dummy molfiles include toluene, benzene, and ethane.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
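
&lt;p&gt;Two of the problems listed above, doubled carriage returns and unexpected bond stereo flags, can be checked mechanically before a molfile ever reaches CDK. The following is a minimal sketch in plain Ruby, not the code used to build the GTOC. The method names &lt;tt&gt;normalize_line_endings&lt;/tt&gt; and &lt;tt&gt;bond_stereo_counts&lt;/tt&gt; are made up for illustration, the sample molfile is invented, and the parser assumes a well-formed V2000 molfile with fixed-width columns.&lt;/p&gt;

```ruby
# Hypothetical helpers for inspecting a V2000 molfile; not part of Ruby CDK.

# Collapse any run of carriage returns before a newline (e.g. "\r\r\n")
# into a single "\n".
def normalize_line_endings(text)
  text.gsub(/\r+\n/, "\n")
end

# Tally the bond stereo field (columns 10-12 of each bond line).
# Assumes the counts line is the fourth line of the file, with the atom
# count in columns 1-3 and the bond count in columns 4-6.
def bond_stereo_counts(molfile)
  lines = normalize_line_endings(molfile).split("\n")
  atom_count = lines[3][0, 3].to_i
  bond_count = lines[3][3, 3].to_i
  counts = Hash.new(0)
  lines[4 + atom_count, bond_count].each do |bond_line|
    counts[bond_line[9, 3].to_i] += 1
  end
  counts
end

# An invented two-atom molfile with doubled carriage returns and one bond
# whose stereo flag is 6.
sample = [
  "",
  "  example",
  "",
  "  2  1  0  0  0  0  0  0  0  0999 V2000",
  "    0.0000    0.0000    0.0000 C   0  0",
  "    1.0000    0.0000    0.0000 O   0  0",
  "  1  2  1  6",
  "M  END"
].join("\r\r\n")

# Prints a hash mapping each stereo flag value to the number of bonds using it.
puts bond_stereo_counts(sample).inspect
```

&lt;p&gt;A tally like this makes it easy to flag every molfile that uses stereo value 6 more often than the structure would seem to allow, without rendering anything.&lt;/p&gt;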

&lt;p&gt;After cataloging the problems that exist with the Molbank dataset and the software used to mine it, two interesting questions come into focus:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;What can be done to help Molbank fix the most obvious problems in their molfiles and would they accept these improvements?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;How can "real" datasets like Molbank help developers build better cheminformatics software? (a graphical Molfile Debugger Utility would come in handy...)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Clearly, the connection between Open Access, Open Source, and Open Data is very strong and runs very deep.&lt;/p&gt;

&lt;h4&gt;Behind the Scenes&lt;/h4&gt;

&lt;p&gt;The Ruby Molbank GTOC generator works by connecting to the &lt;a href="http://www.mdpi.net"&gt;www.mdpi.net&lt;/a&gt; server to get its data in real-time. Internally, the software creates a map of the Molbank website so that the molfile (and URL) for any article can be retrieved on demand. Each readable molfile is used to create a 2-D image using &lt;a href="http://rubyforge.org/projects/rcdk"&gt;Ruby CDK&lt;/a&gt;. As a final step, the &lt;strong&gt;index.html&lt;/strong&gt; page is generated, linking the 2-D images to a specific URL for a Molbank article. This file is &lt;a href="http://depth-first.com/articles/2006/11/13/cheminformatics-for-the-web-convert-sd-files-to-html-with-ruby-cdk"&gt;produced with eRuby&lt;/a&gt; using a previously-described technique.&lt;/p&gt;

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;Building a Graphical Table of Contents for Molbank is not that difficult given the power of Ruby, and Molbank's forward-thinking attitude toward mirroring and robots. In working on this project, several problems were uncovered, both with Molbank's data, and the software used to mine it.&lt;/p&gt;

&lt;p&gt;In some ways, the software described here and its output are less interesting than the larger questions they raise:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;How do scientific journals best serve not only their readers, but developers who want to provide new ways to use the journal?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;How far does copyright extend in scientific publications? For example, are molfiles copyrightable? If so, at what level of detail are they not? If atom coordinates or some other kind of non-essential information is left out, does that change anything?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In what other practical ways could the connection between Open Source, Open Data, and Open Access be explored?&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These and many related questions are waiting just around the corner. As Open Access becomes more viable, both &lt;a href="http://depth-first.com/articles/2006/10/19/disruptive-innovation-in-scientific-publishing-free-journal-management-systems"&gt;technically&lt;/a&gt; and &lt;a href="http://depth-first.com/articles/2006/10/26/more-open-access-in-the-sciences-metal-based-drugs-and-hindawi-publishing"&gt;commercially&lt;/a&gt;, look to Open Source and Open Data to provide the synergies that will unlock its true potential.&lt;/p&gt;</description>
      <pubDate>Mon, 11 Dec 2006 15:00:00 -0500</pubDate>
      <guid isPermaLink="false">urn:uuid:6c2f002b-3d8d-40fc-a4a5-8008c473e7d7</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2006/12/11/hacking-molbank-creating-a-graphical-table-of-contents</link>
      <category>Web</category>
      <category>molbank</category>
      <category>gtoc</category>
      <category>2d</category>
      <category>rcdk</category>
      <category>ruby</category>
      <category>mdpi</category>
      <category>opensource</category>
      <category>openaccess</category>
      <category>opendata</category>
    </item>
    <item>
      <title>Scripting Molecular Fingerprints with Ruby CDK</title>
      <description>&lt;p&gt;&lt;img src="http://depth-first.com/files/ruby_logo_new.gif" align="right"&gt;&lt;/img&gt;A &lt;a href="http://www.daylight.com/dayhtml/doc/theory/theory.finger.html"&gt;molecular fingerprint&lt;/a&gt; represents a molecule as a series of bits. There are many situations in which this reduced form of molecular representation is useful. For example, fingerprints are frequently used as a fast prescreen for database substructure searches. They can also be used for "fuzzy" comparisons involving molecular similarity, a nice complement to binary queries such as substructure search.&lt;/p&gt;

&lt;p&gt;Fingerprints have their limitations. Being a form of hashing, they are imprecise in that two different molecules can have exactly the same fingerprint. The converse is also true: many molecular fingerprints exaggerate small differences between two molecules that most chemists would say are similar - for example between oxygen and sulfur analogs of the same structure.&lt;/p&gt;

&lt;p&gt;Despite their limitations, the advantages of fingerprints make them useful in many situations. As a result, numerous fingerprinting systems have become popular. This tutorial will focus on creating and manipulating molecular fingerprints from Ruby using the Ruby Chemistry Development Kit (RCDK).&lt;/p&gt;

&lt;h4&gt;Prerequisites&lt;/h4&gt;

&lt;p&gt;For this tutorial, you'll need &lt;a href="http://depth-first.com/articles/2006/10/30/agile-chemical-informatics-development-with-cdk-and-ruby-rcdk-0-3-0"&gt;Ruby CDK&lt;/a&gt; (RCDK). A recent article described the small amount of system configuration required for &lt;a href="http://depth-first.com/articles/2006/09/25/cdk-the-ruby-way-rcdk-0-2-0"&gt;RCDK on Linux&lt;/a&gt;. Another article showed how to install &lt;a href="http://depth-first.com/articles/2006/10/12/running-ruby-java-bridge-on-windows"&gt;RCDK on Windows&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;A Small Fingerprint Library&lt;/h4&gt;

&lt;p&gt;Let's build a small Ruby library for working with fingerprints. Place the following code into a file called &lt;strong&gt;fingerprint.rb&lt;/strong&gt; in your working directory:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rubygems&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require_gem&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk/util&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="ident"&gt;jrequire&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;org.openscience.cdk.fingerprint.Fingerprinter&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;jrequire&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;org.openscience.cdk.similarity.Tanimoto&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="comment"&gt;# Molecule fingerprinting&lt;/span&gt;
&lt;span class="keyword"&gt;class &lt;/span&gt;&lt;span class="class"&gt;Fingerprinter&lt;/span&gt;
  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;initialize&lt;/span&gt;
    &lt;span class="attribute"&gt;@fingerprinter&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;Org&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Openscience&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Cdk&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Fingerprint&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Fingerprinter&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;

  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;fingerprint&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;smiles&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
    &lt;span class="ident"&gt;mol&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;RCDK&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Util&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Lang&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;read_smiles&lt;/span&gt; &lt;span class="ident"&gt;smiles&lt;/span&gt;

    &lt;span class="ident"&gt;fp&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="attribute"&gt;@fingerprinter&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;getFingerprint&lt;/span&gt; &lt;span class="ident"&gt;mol&lt;/span&gt;

    &lt;span class="comment"&gt;# Metaprogramming!&lt;/span&gt;
    &lt;span class="ident"&gt;fp&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;extend&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="constant"&gt;Fingerprint&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;

&lt;span class="comment"&gt;# BitSet comparison&lt;/span&gt;
&lt;span class="keyword"&gt;module &lt;/span&gt;&lt;span class="module"&gt;Fingerprint&lt;/span&gt;
  &lt;span class="comment"&gt;# Returns true if all of the bits set to true in this fingerprint are also set to true in the specified fingerprint&lt;/span&gt;
  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;subset?&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;fingerprint&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
    &lt;span class="constant"&gt;Org&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Openscience&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Cdk&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Fingerprint&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Fingerprinter&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;isSubset&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;fingerprint&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="constant"&gt;self&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;

  &lt;span class="comment"&gt;# Tanimoto similarity of this fingerprint and the specified fingerprint&lt;/span&gt;
  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;tanimoto&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;fingerprint&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
    &lt;span class="constant"&gt;Org&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Openscience&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Cdk&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Similarity&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Tanimoto&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;calculate&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="constant"&gt;self&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;fingerprint&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Of particular note is the use of Ruby's &lt;tt&gt;Object.extend&lt;/tt&gt; method. This method allows a single instance of an object to be extended at runtime - a form of &lt;a href="http://depth-first.com/articles/2006/10/24/metaprogramming-with-ruby-mapping-java-packages-onto-ruby-modules"&gt;metaprogramming&lt;/a&gt;. In this case, we add the &lt;tt&gt;subset?&lt;/tt&gt; and &lt;tt&gt;tanimoto&lt;/tt&gt; methods for determining whether all of the bits in one fingerprint are present in another, and for determining similarity, respectively. We use this technique here because currently RJB doesn't provide the complete interface into Java classes that would be required to create a Ruby class that directly inherits from Java's BitSet class.&lt;/p&gt;
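
&lt;p&gt;As a stand-alone illustration of &lt;tt&gt;extend&lt;/tt&gt;, the following sketch adds a method to one object without touching its class. It runs without RJB or CDK; the &lt;tt&gt;BitCount&lt;/tt&gt; module and the arrays of bit positions are invented for the example.&lt;/p&gt;

```ruby
# Hypothetical module standing in for the Fingerprint module above.
module BitCount
  # Number of distinct bit positions held by the receiver.
  def cardinality
    uniq.length
  end
end

plain    = [2, 8, 11]
extended = [2, 8, 11].extend(BitCount)

puts extended.cardinality              # only this instance gained the method
puts plain.respond_to?(:cardinality)   # other arrays are unaffected
```

&lt;p&gt;The same mechanism is what lets a Java object returned through RJB pick up Ruby methods after it has been created.&lt;/p&gt;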

&lt;h4&gt;Testing the Library&lt;/h4&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20061122/loratadine.png"&gt;&lt;/img&gt;&lt;img src="http://depth-first.com/demo/20061122/desloratadine.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=3957"&gt;Claritin&lt;/a&gt; (loratadine, left) and &lt;a href="http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=124087"&gt;Clarinex&lt;/a&gt; (desloratadine, right) are two structurally-related antihistamines. Can we quantitate the degree of similarity between these two structures? Fingerprints provide one way. The following code creates fingerprints for the two structures, determines whether one is a subset of the other, and assigns a Tanimoto similarity value:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;fingerprint&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="ident"&gt;f&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;Fingerprinter&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;

&lt;span class="ident"&gt;loratadine&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;f&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;fingerprint&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;CCOC(=O)N1CCC(=C2C3=C(CCC4=C2N=CC=C4)C=C(C=C3)Cl)CC1&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;desloratadine&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;f&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;fingerprint&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;C1CC2=C(C=CC(=C2)Cl)C(=C3CCNCC3)C4=C1C=CC=N4&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;Loratadine is a subset of desloratadine: &lt;span class="expr"&gt;#{loratadine.subset? desloratadine}&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="comment"&gt;# =&amp;gt; false&lt;/span&gt;
&lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;Desloratadine is a subset of loratadine: &lt;span class="expr"&gt;#{desloratadine.subset? loratadine}&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="comment"&gt;# =&amp;gt; true&lt;/span&gt;
&lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;Tanimoto similarity of desloratadine and loratadine: &lt;span class="expr"&gt;#{loratadine.tanimoto desloratadine}&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="comment"&gt;# =&amp;gt; 0.895683467388153&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
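
&lt;p&gt;For reference, both comparisons reduce to simple set operations on the positions of the bits set to true. The sketch below uses Ruby's standard &lt;tt&gt;Set&lt;/tt&gt; class rather than CDK, so it is not how the library computes its results internally. The method names and the two small example fingerprints are invented, and the Tanimoto value is taken as the number of shared bits divided by the number of bits set in either fingerprint.&lt;/p&gt;

```ruby
require 'set'

# Tanimoto similarity of two sets of "on" bit positions:
# size of the intersection divided by size of the union.
def tanimoto(a, b)
  union = a.union(b)
  return 0.0 if union.empty?   # convention chosen here for two empty fingerprints
  a.intersection(b).size.to_f / union.size
end

# True if every bit set in +a+ is also set in +b+.
def bits_subset?(a, b)
  a.subset?(b)
end

# Made-up fingerprints, far smaller than a real one.
small = Set[2, 8, 11]
large = Set[2, 8, 11, 16, 18]

puts bits_subset?(small, large)
puts bits_subset?(large, small)
puts tanimoto(small, large)
```

&lt;p&gt;Note that the subset test is directional while the Tanimoto value is symmetric, which matches the pattern of results in the loratadine example.&lt;/p&gt;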

&lt;h4&gt;Variations&lt;/h4&gt;

&lt;p&gt;CDK's &lt;tt&gt;&lt;a href="http://cdk.sourceforge.net/api/org/openscience/cdk/fingerprint/Fingerprinter.html"&gt;Fingerprinter&lt;/a&gt;&lt;/tt&gt; class returns an instance of the Java class &lt;tt&gt;&lt;a href="http://java.sun.com/j2se/1.5.0/docs/api/java/util/BitSet.html"&gt;BitSet&lt;/a&gt;&lt;/tt&gt;. This &lt;tt&gt;BitSet&lt;/tt&gt; can be further manipulated in Ruby. For example, to find the size (the total number of bits) of the &lt;tt&gt;BitSet&lt;/tt&gt;, we could use:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;loratadine&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;size&lt;/span&gt; &lt;span class="comment"&gt;# =&amp;gt; 1024&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Similarly, to find the number of bits set to true, we would use:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;loratadine&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;cardinality&lt;/span&gt; &lt;span class="comment"&gt;# =&amp;gt; 278&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;To print out a list of all bits set to true, we could use the &lt;tt&gt;toString&lt;/tt&gt; method:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;loratadine&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;toString&lt;/span&gt; &lt;span class="comment"&gt;# =&amp;gt; &amp;quot;{2, 8, 11, 16, 18, 22, 32, 37, 38, 41, 42, 46, 47, 51, 57, 64, 65, 66, 69 ... }&amp;quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
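
&lt;p&gt;Because &lt;tt&gt;toString&lt;/tt&gt; returns the set bits as a brace-delimited, comma-separated list, the result is easy to turn into a Ruby array. A minimal sketch, assuming the string has the form shown above; &lt;tt&gt;bit_positions&lt;/tt&gt; is a hypothetical helper, not part of Ruby CDK:&lt;/p&gt;

```ruby
# Extract the integer bit positions from a BitSet-style string such as "{2, 8, 11}".
def bit_positions(bitset_string)
  bitset_string.scan(/\d+/).map { |digits| digits.to_i }
end

positions = bit_positions("{2, 8, 11, 16, 18, 22}")
puts positions.inspect
puts positions.length
```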

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;Fingerprints enable many useful and fast comparisons between molecules. The form of fingerprint we've used here is but one of the possibilities offered by CDK. The next article in this series will discuss fingerprints in &lt;a href="http://openbabel.sourceforge.net/wiki/Fingerprint"&gt;Open Babel&lt;/a&gt; using both Ruby and Python.&lt;/p&gt;</description>
      <pubDate>Wed, 22 Nov 2006 15:44:00 -0500</pubDate>
      <guid isPermaLink="false">urn:uuid:b5807052-d051-4121-b89f-1d8cc908ef4f</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2006/11/22/scripting-molecular-fingerprints-with-ruby-cdk</link>
      <category>Tools</category>
      <category>fingerprint</category>
      <category>bitset</category>
      <category>similarity</category>
      <category>rcdk</category>
      <category>ruby</category>
    </item>
    <item>
      <title>Build a Rails Cheminformatics Application in Thirty Minutes</title>
      <description>&lt;p&gt;&lt;a href="http://www.rubyonrails.org/"&gt;&lt;img src="http://depth-first.com/files/rails_logo.png" align="right" border="0"&gt;&lt;/img&gt;&lt;/a&gt;A &lt;a href="http://depth-first.com/articles/2006/11/20/unchaining-chemistry-from-the-desktop"&gt;recent article&lt;/a&gt; highlighted the Web as a new cheminformatics platform. Advocacy is one thing, but a working, open demo built with modern technologies is far more compelling. In the following tutorial we'll build a first-generation cheminformatics Web application using the &lt;a href="http://www.rubyonrails.org/"&gt;Ruby on Rails&lt;/a&gt; framework and 100% Open Source components. We'll just cover the essentials here - look for future articles to discuss the underlying technology in more detail.&lt;/p&gt;

&lt;h4&gt;The Problem&lt;/h4&gt;

&lt;p&gt;&lt;a href="http://www.daylight.com/smiles/index.html"&gt;Simplified Molecular Input Line Entry System&lt;/a&gt; (SMILES) is one of the most compact and easy-to-learn molecular representation systems ever developed. Part of a larger family of molecular languages called &lt;a href="http://depth-first.com/articles/2006/08/18/107-years-of-line-formula-notations-1861-1968"&gt;line notations&lt;/a&gt;, SMILES strings are always written as a single line of ASCII text. This makes them perfect in situations calling for data entry; witness their use in a wide range of new free &lt;a href="http://depth-first.com/articles/2006/11/07/twelve-free-chemistry-databases"&gt;online chemistry databases&lt;/a&gt;. Typically, a chemist draws a structure in a graphical editor, copies a SMILES string from the editor, and pastes this string into a search window in the database application.&lt;/p&gt;

&lt;p&gt;SMILES is a great language for computers, but not for chemists, who are trained to communicate through 2-D structure diagrams. Although SMILES strings can be decoded manually, this is a tedious and error-prone process, especially for SMILES encoding high degrees of branching and ring content. It's preferable for the computer to do this hard work for us, providing a perfectly laid-out 2-D structure diagram for use in debugging or inclusion in documents.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.daylight.com/daycgi/depict"&gt;Depict&lt;/a&gt; is a Web application originally developed by &lt;a href="http://www.daylight.com/"&gt;Daylight&lt;/a&gt; for the conversion of SMILES strings into 2-D structure diagrams. Type a SMILES string into the form, press enter, and get a raster image of the encoded molecule. Daylight's Depict does a good enough job, but you can't change the interface or output. You also can't take the software apart to see how it works. Wouldn't it be great if you could?&lt;/p&gt;

&lt;h4&gt;About This Tutorial Series&lt;/h4&gt;

&lt;p&gt;This tutorial is the first in a series describing how to build a Depict server using 100% &lt;a href="http://opensource.org"&gt;Open Source&lt;/a&gt; components. The application will accept a SMILES string in a Webpage text field, and then produce a 2-D structure diagram. It won't be designed for ease of use, appearance, or configurability - these improvements will be described in subsequent articles. When this application is finished, I'll deploy it on a Web server. At every step in this process, I'll provide enough detail for anyone to do the same.&lt;/p&gt;

&lt;p&gt;It won't be necessary to finish every step yourself before you can work with the finished product. Near the beginning of each installment will be a "Download and Prerequisites" section containing a link to the complete source code. Simply download this code and run it to see what it does.&lt;/p&gt;

&lt;h4&gt;Download and Prerequisites&lt;/h4&gt;

&lt;p&gt;For this tutorial, you'll need &lt;a href="http://depth-first.com/articles/2006/10/30/agile-chemical-informatics-development-with-cdk-and-ruby-rcdk-0-3-0"&gt;Ruby CDK&lt;/a&gt; (RCDK). A recent article described the small amount of system configuration required for &lt;a href="http://depth-first.com/articles/2006/09/25/cdk-the-ruby-way-rcdk-0-2-0"&gt;RCDK on Linux&lt;/a&gt;. Another article showed how to install &lt;a href="http://depth-first.com/articles/2006/10/12/running-ruby-java-bridge-on-windows"&gt;RCDK on Windows&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.amazon.com/gp/product/0977616630?ie=UTF8&amp;amp;tag=depthfirst-20&amp;amp;linkCode=as2&amp;amp;camp=1789&amp;amp;creative=9325&amp;amp;creativeASIN=0977616630"&gt;&lt;img border="0" src="http://depth-first.com/demo/20061122/0977616630.01._AA_SCMZZZZZZZ_V36350687_.jpg" align="right"&gt;&lt;/a&gt;&lt;img src="http://www.assoc-amazon.com/e/ir?t=depthfirst-20&amp;amp;l=as2&amp;amp;o=1&amp;amp;a=0977616630" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /&gt;In addition, you'll need to install &lt;a href="http://www.rubyonrails.org/down"&gt;Ruby on Rails&lt;/a&gt; - something that can be done through &lt;a href="http://docs.rubygems.org/"&gt;RubyGems&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The complete Depict application can be &lt;a href="http://depth-first.com/demo/20061121/depict.tar.gz"&gt;downloaded from this link&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;A Note on Ruby Java Bridge and AMD64 Linux Platforms&lt;/h4&gt;

&lt;p&gt;Our Depict application will use &lt;a href="http://rjb.rubyforge.org/"&gt;Ruby Java Bridge&lt;/a&gt; (RJB) as a Ruby interface to Java bytecode. Recently, &lt;a href="http://rubyforge.org/pipermail/rjb-users/2006-November/000008.html"&gt;a problem with RJB on AMD64-Linux&lt;/a&gt; was uncovered. This problem prevents third-party jarfiles from being loaded after Rails has been loaded.&lt;/p&gt;

&lt;p&gt;In practice, this means that the command to start the Rails server (Step 2) needs to be prefixed with an assignment of &lt;tt&gt;LD_PRELOAD&lt;/tt&gt;. You also need to make &lt;tt&gt;LD_LIBRARY_PATH&lt;/tt&gt; point to your native Java libraries. On my platform, which is AMD64-Linux running Sun's JVM, the commands are:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ export LD_LIBRARY_PATH=/usr/java/jdk1.5.0_09/jre/lib/amd64:/usr/java/jdk1.5.0_09/jre/lib/amd64/server
$ LD_PRELOAD=/usr/java/jdk1.5.0_09/jre/lib/amd64/libzip.so ruby script/server
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;If you get an "Internal Error" due to an "unknown exception" while running Depict, chances are good that you've hit the same problem. Starting the Rails server as above should resolve it.&lt;/p&gt;
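&lt;p&gt;If your JDK lives somewhere else, the two settings can be derived from its location. The following is a minimal sketch, not part of Depict: the helper name &lt;tt&gt;rjb_environment&lt;/tt&gt; is hypothetical, and the &lt;tt&gt;jre/lib/amd64&lt;/tt&gt; layout is an assumption taken from the paths shown above.&lt;/p&gt;

```ruby
# Hypothetical helper: derives the two environment settings described above
# from a JDK location. The jre/lib/ARCH layout is an assumption based on the
# paths used in this article and may differ for other JVMs.
def rjb_environment(java_home, arch = 'amd64')
  lib = File.join(java_home, 'jre', 'lib', arch)
  {
    'LD_LIBRARY_PATH' => [lib, File.join(lib, 'server')].join(':'),
    'LD_PRELOAD'      => File.join(lib, 'libzip.so')
  }
end

env = rjb_environment('/usr/java/jdk1.5.0_09')
env.each { |name, value| puts "#{name}=#{value}" }
```

&lt;p&gt;The printed values can be pasted into the shell commands above in place of the hard-coded paths.&lt;/p&gt;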

&lt;h4&gt;Step 1: Create the Application&lt;/h4&gt;

&lt;p&gt;Getting started with Rails is as simple as issuing the &lt;tt&gt;rails&lt;/tt&gt; command with the name of your application as an argument:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ rails depict
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Executing this command creates a complete Rails application template under the &lt;strong&gt;depict&lt;/strong&gt; subdirectory in your working directory. You build your application by editing the files and directories that were generated.&lt;/p&gt;

&lt;h4&gt;Step 2: Start the Server&lt;/h4&gt;

&lt;p&gt;You can start the Depict application by running the included server script:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ cd depict
$ ruby script/server
=&gt; Booting WEBrick...
=&gt; Rails application started on http://0.0.0.0:3000
=&gt; Ctrl-C to shutdown server; call with --help for options
[2006-11-18 10:17:08] INFO  WEBrick 1.3.1
[2006-11-18 10:17:08] INFO  ruby 1.8.5 (2006-08-25) [x86_64-linux-gnu]
[2006-11-18 10:17:08] INFO  WEBrick::HTTPServer#start: pid=4036 port=3000
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Let's see what Depict looks like so far. Point your browser to &lt;a href="http://localhost:3000"&gt;http://localhost:3000&lt;/a&gt;. You should see the following page:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20061122/step_2.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;Congratulations! You're now running Ruby on Rails.&lt;/p&gt;

&lt;h4&gt;Step 3: Create the SmilesController&lt;/h4&gt;

&lt;p&gt;Rails adapts the Model-View-Controller application paradigm to the Web. It also automates many of the steps in building models, views, and controllers. Let's create a controller to handle the manipulation of SMILES strings:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ ruby script/generate controller Smiles
      exists  app/controllers/
      exists  app/helpers/
      create  app/views/smiles
      exists  test/functional/
      create  app/controllers/smiles_controller.rb
      create  test/functional/smiles_controller_test.rb
      create  app/helpers/smiles_helper.rb
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Currently, &lt;tt&gt;SmilesController&lt;/tt&gt; is just a skeleton:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="keyword"&gt;class &lt;/span&gt;&lt;span class="class"&gt;SmilesController&lt;/span&gt; &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt; &lt;span class="constant"&gt;ApplicationController&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Let's give &lt;tt&gt;SmilesController&lt;/tt&gt; the ability to accept a SMILES string as input by adding an &lt;tt&gt;input&lt;/tt&gt; method.&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="keyword"&gt;class &lt;/span&gt;&lt;span class="class"&gt;SmilesController&lt;/span&gt; &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt; &lt;span class="constant"&gt;ApplicationController&lt;/span&gt;
  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;input&lt;/span&gt;

  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h4&gt;Step 4: Create a Form&lt;/h4&gt;

&lt;p&gt;At this stage, pointing your browser to &lt;a href="http://localhost:3000/smiles/input"&gt;http://localhost:3000/smiles/input&lt;/a&gt; gives a screen containing an error message:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20061122/step_4_1.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;Rails is looking for a view that doesn't exist, so let's create one. To your &lt;strong&gt;depict/app/views/smiles&lt;/strong&gt; directory, add the following file, called &lt;strong&gt;input.rhtml&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="ident"&gt;html&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="ident"&gt;head&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="ident"&gt;title&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;&lt;span class="constant"&gt;Enter&lt;/span&gt; &lt;span class="ident"&gt;a&lt;/span&gt; &lt;span class="constant"&gt;SMILES&lt;/span&gt; &lt;span class="constant"&gt;String&lt;/span&gt;&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="regex"&gt;title&amp;gt;
  &amp;lt;&lt;/span&gt;&lt;span class="punct"&gt;/&lt;/span&gt;&lt;span class="ident"&gt;head&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="ident"&gt;body&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;%=&lt;/span&gt;&lt;span class="string"&gt; form_tag :action&lt;/span&gt;&lt;span class="punct"&gt;=&amp;gt;'&lt;/span&gt;&lt;span class="string"&gt;depict&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt; &lt;span class="punct"&gt;%&amp;gt;&lt;/span&gt;&lt;span class="string"&gt;
      Enter a SMILES String: &amp;lt;br /&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;%=&lt;/span&gt;&lt;span class="string"&gt; text_field('smiles', 'value') %&amp;gt;&amp;lt;br /&amp;gt;
    &amp;lt;%&lt;/span&gt;&lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;end_form_tag&lt;/span&gt; &lt;span class="punct"&gt;%&amp;gt;&lt;/span&gt;&lt;span class="string"&gt;
  &amp;lt;/body&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="regex"&gt;html&amp;gt; &lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This HTML view is an example of Ruby's templating mechanism, eRuby, which was discussed earlier in the context of &lt;a href="http://depth-first.com/articles/2006/11/15/diversity-oriented-chemical-informatics"&gt;converting SD files to HTML&lt;/a&gt;. In the template above, we've configured our form to invoke an action called &lt;tt&gt;depict&lt;/tt&gt; when submitted. This action does not yet exist, but will be created in Step 5 below.&lt;/p&gt;

&lt;p&gt;Now, pointing your browser to &lt;a href="http://localhost:3000/smiles/input"&gt;http://localhost:3000/smiles/input&lt;/a&gt; should give an input field:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20061122/step_4_2.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;Let's try it. Submitting the SMILES string for benzene gives the following error screen:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20061122/step_4_3.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;We haven't defined the &lt;tt&gt;depict&lt;/tt&gt; action yet, a fact that Rails is communicating with this error message.&lt;/p&gt;
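&lt;p&gt;The error follows from the way a URL path is mapped to a controller and an action. The sketch below is a deliberately simplified illustration of that convention in plain Ruby. The method &lt;tt&gt;route_for&lt;/tt&gt; is hypothetical, and real Rails routing is configurable and handles many more cases.&lt;/p&gt;

```ruby
# Simplified illustration of the conventional /controller/action mapping.
# route_for is a hypothetical helper, not part of Rails.
def route_for(path)
  controller, action = path.split('/').reject { |part| part.empty? }
  action ||= 'index' # assume a missing action falls back to index
  class_name = controller.split('_').map { |word| word.capitalize }.join
  ["#{class_name}Controller", action]
end

puts route_for('/smiles/input').inspect
puts route_for('/smiles/depict').inspect
```

&lt;p&gt;Under this convention, a request for &lt;tt&gt;/smiles/depict&lt;/tt&gt; needs a &lt;tt&gt;depict&lt;/tt&gt; method on &lt;tt&gt;SmilesController&lt;/tt&gt;, which is exactly what is missing at this point.&lt;/p&gt;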

&lt;p&gt;Have you noticed how we haven't had to restart the Rails Web server as we've made changes? This is but one of the many conveniences that makes Rails such a productive platform.&lt;/p&gt;

&lt;h4&gt;Step 5: Add a Depict Action&lt;/h4&gt;

&lt;p&gt;We need a way to pass a SMILES string from the Web page text field in which it's entered to our application and back to another view. To do this we'll add a &lt;tt&gt;depict&lt;/tt&gt; method to &lt;strong&gt;depict/app/controllers/smiles_controller.rb&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;depict&lt;/span&gt;
  &lt;span class="attribute"&gt;@smiles&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="attribute"&gt;@params&lt;/span&gt;&lt;span class="punct"&gt;[&lt;/span&gt;&lt;span class="symbol"&gt;:smiles&lt;/span&gt;&lt;span class="punct"&gt;][&lt;/span&gt;&lt;span class="symbol"&gt;:value&lt;/span&gt;&lt;span class="punct"&gt;]&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Of course, our application still won't run properly because we haven't created a view for the new &lt;tt&gt;depict&lt;/tt&gt; method to use. Let's do this by adding the following file, named &lt;strong&gt;depict.rhtml&lt;/strong&gt;, to the &lt;strong&gt;depict/app/views/smiles&lt;/strong&gt; directory:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="ident"&gt;html&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="ident"&gt;head&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="ident"&gt;title&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;&lt;span class="constant"&gt;Depict&lt;/span&gt; &lt;span class="constant"&gt;SMILES&lt;/span&gt;&lt;span class="punct"&gt;:&lt;/span&gt; &lt;span class="punct"&gt;&amp;lt;%=&lt;/span&gt;&lt;span class="string"&gt; @smiles %&amp;gt;&amp;lt;/title&amp;gt;
  &amp;lt;/head&amp;gt;
  &amp;lt;body&amp;gt;
    &amp;lt;h1&amp;gt;SMILES: &amp;lt;%&lt;/span&gt;&lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="attribute"&gt;@smiles&lt;/span&gt; &lt;span class="punct"&gt;%&amp;gt;&lt;/span&gt;&lt;span class="string"&gt;&amp;lt;/h1&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="regex"&gt;body&amp;gt;
&amp;lt;&lt;/span&gt;&lt;span class="punct"&gt;/&lt;/span&gt;&lt;span class="ident"&gt;html&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Notice how the instance variable &lt;tt&gt;@smiles&lt;/tt&gt; is available for use within the template.&lt;/p&gt;
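&lt;p&gt;The two-level lookup in the &lt;tt&gt;depict&lt;/tt&gt; method mirrors the name of the form field: &lt;tt&gt;text_field('smiles', 'value')&lt;/tt&gt; produces an input named &lt;tt&gt;smiles[value]&lt;/tt&gt;. The sketch below shows, in plain Ruby, roughly how such a field name becomes a nested hash. The helper &lt;tt&gt;nested_params&lt;/tt&gt; is hypothetical and handles only one level of nesting. It also uses string keys, whereas Rails accepts symbols as well.&lt;/p&gt;

```ruby
require 'uri'

# Hypothetical helper: turns form-encoded pairs such as "smiles[value]=CCO"
# into a nested hash, roughly the way a params hash is built.
# Only one level of bracket nesting is handled.
def nested_params(body)
  params = {}
  URI.decode_www_form(body).each do |key, value|
    match = key.match(/\A(\w+)\[(\w+)\]\z/)
    if match
      outer, inner = match[1], match[2]
      params[outer] ||= {}
      params[outer][inner] = value
    else
      params[key] = value
    end
  end
  params
end

# Simulate the body a browser would submit for the benzene SMILES string.
body = URI.encode_www_form('smiles[value]' => 'c1ccccc1')
params = nested_params(body)
puts params['smiles']['value']
```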

&lt;p&gt;Let's have a look at Depict so far. Pointing your browser to &lt;a href="http://localhost:3000/smiles/input"&gt;http://localhost:3000/smiles/input&lt;/a&gt;, entering the SMILES string for benzene, and pressing return produces the page shown below:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20061122/step_5_1.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;So far, so good. We've been able to read user input from an HTML form and reprocess it into some simple HTML output. Now, let's render a 2-D molecular image to go with it.&lt;/p&gt;

&lt;h4&gt;Step 6: Generate the 2-D Image&lt;/h4&gt;

&lt;p&gt;We'll use a method called &lt;tt&gt;image_for&lt;/tt&gt;, which we'll define shortly. The file &lt;strong&gt;depict/app/views/smiles/depict.rhtml&lt;/strong&gt; should now look like this:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="ident"&gt;html&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="ident"&gt;head&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="ident"&gt;title&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;&lt;span class="constant"&gt;Depict&lt;/span&gt; &lt;span class="constant"&gt;SMILES&lt;/span&gt;&lt;span class="punct"&gt;:&lt;/span&gt; &lt;span class="punct"&gt;&amp;lt;%=&lt;/span&gt;&lt;span class="string"&gt; @smiles %&amp;gt;&amp;lt;/title&amp;gt;
  &amp;lt;/head&amp;gt;
  &amp;lt;body&amp;gt;
    &amp;lt;h1&amp;gt;SMILES:&amp;lt;%&lt;/span&gt;&lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="attribute"&gt;@smiles&lt;/span&gt; &lt;span class="punct"&gt;%&amp;gt;&lt;/span&gt;&lt;span class="string"&gt;&amp;lt;/h1&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="ident"&gt;img&lt;/span&gt; &lt;span class="ident"&gt;src&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;&amp;lt;%= url_for :action =&amp;gt; &lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="ident"&gt;image_for&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;, :smiles =&amp;gt; @smiles %&amp;gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="regex"&gt;img&amp;gt;
  &amp;lt;&lt;/span&gt;&lt;span class="punct"&gt;/&lt;/span&gt;&lt;span class="ident"&gt;body&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="regex"&gt;html&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The added &lt;tt&gt;img&lt;/tt&gt; tag is a placeholder for now. It loads an image dynamically generated from the &lt;tt&gt;image_for&lt;/tt&gt; method, which we'll shortly add to &lt;tt&gt;SmilesController&lt;/tt&gt;. We pass the SMILES string as a parameter.&lt;/p&gt;
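&lt;p&gt;Because the SMILES string travels in the query string of the image URL, characters such as &lt;tt&gt;#&lt;/tt&gt;, &lt;tt&gt;=&lt;/tt&gt;, and parentheses have to be percent-encoded. &lt;tt&gt;url_for&lt;/tt&gt; takes care of this in the view. The sketch below reproduces the idea with Ruby's standard &lt;tt&gt;uri&lt;/tt&gt; library. The helper &lt;tt&gt;image_url&lt;/tt&gt; and the exact path it builds are assumptions for illustration only.&lt;/p&gt;

```ruby
require 'uri'

# Hypothetical helper: builds an image URL similar to the one the view's
# url_for call is expected to produce, with the SMILES string encoded.
# The path layout is an assumption based on the default routing convention.
def image_url(smiles)
  '/smiles/image_for?' + URI.encode_www_form('smiles' => smiles)
end

puts image_url('c1ccccc1') # benzene: nothing needs encoding
puts image_url('CC(=O)O')  # acetic acid: parentheses and = are encoded
puts image_url('C#N')      # hydrogen cyanide: # is encoded
```

&lt;p&gt;Without encoding, a &lt;tt&gt;#&lt;/tt&gt; in a SMILES string would be read by the browser as the start of a URL fragment, and everything after it would never reach the server.&lt;/p&gt;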

&lt;p&gt;The &lt;tt&gt;image_for&lt;/tt&gt; method does all of the real work in the Depict application. It accepts a SMILES string as a parameter and produces a laid-out 2-D color molecular image as output. The method draws on the Java standard library (&lt;tt&gt;ByteArrayOutputStream&lt;/tt&gt; and &lt;tt&gt;ImageIO&lt;/tt&gt;) and on Ruby CDK.&lt;/p&gt;

&lt;p&gt;In addition to an &lt;tt&gt;image_for&lt;/tt&gt; method, we'll need to add some accessory code to make it work. Edit &lt;strong&gt;depict/app/controllers/smiles_controller.rb&lt;/strong&gt; so that it looks like this:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="comment"&gt;# Load the RCDK library&lt;/span&gt;
&lt;span class="ident"&gt;require_gem&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk/util&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="comment"&gt;# New jrequire calls.&lt;/span&gt;
&lt;span class="ident"&gt;jrequire&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;java.io.ByteArrayOutputStream&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;jrequire&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;net.sf.structure.cdk.util.ImageKit&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;jrequire&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;javax.imageio.ImageIO&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="keyword"&gt;class &lt;/span&gt;&lt;span class="class"&gt;SmilesController&lt;/span&gt; &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt; &lt;span class="constant"&gt;ApplicationController&lt;/span&gt;

  &lt;span class="comment"&gt;# Already defined.&lt;/span&gt;
  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;input&lt;/span&gt;

  &lt;span class="keyword"&gt;end&lt;/span&gt;

  &lt;span class="comment"&gt;# Already defined.&lt;/span&gt;
  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;depict&lt;/span&gt;
    &lt;span class="attribute"&gt;@smiles&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="attribute"&gt;@params&lt;/span&gt;&lt;span class="punct"&gt;[&lt;/span&gt;&lt;span class="symbol"&gt;:smiles&lt;/span&gt;&lt;span class="punct"&gt;][&lt;/span&gt;&lt;span class="symbol"&gt;:value&lt;/span&gt;&lt;span class="punct"&gt;]&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;

  &lt;span class="comment"&gt;# New method.&lt;/span&gt;
  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;image_for&lt;/span&gt;
    &lt;span class="ident"&gt;smiles&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="attribute"&gt;@params&lt;/span&gt;&lt;span class="punct"&gt;[&lt;/span&gt;&lt;span class="symbol"&gt;:smiles&lt;/span&gt;&lt;span class="punct"&gt;]&lt;/span&gt;
    &lt;span class="ident"&gt;mol&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;RCDK&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Util&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Lang&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;read_smiles&lt;/span&gt; &lt;span class="ident"&gt;smiles&lt;/span&gt;
    &lt;span class="ident"&gt;mol&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;RCDK&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Util&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;XY&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="ident"&gt;coordinate_molecule&lt;/span&gt; &lt;span class="ident"&gt;mol&lt;/span&gt;
    &lt;span class="ident"&gt;out&lt;/span&gt;&lt;span class="punct"&gt;=&lt;/span&gt;&lt;span class="constant"&gt;Java&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Io&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;ByteArrayOutputStream&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;
    &lt;span class="ident"&gt;image&lt;/span&gt;&lt;span class="punct"&gt;=&lt;/span&gt;&lt;span class="constant"&gt;Net&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Sf&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Structure&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Cdk&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Util&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;ImageKit&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;createRenderedImage&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;mol&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="number"&gt;300&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="number"&gt;300&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;

    &lt;span class="constant"&gt;Javax&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Imageio&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;ImageIO&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;write&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;image&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;png&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;,&lt;/span&gt; &lt;span class="ident"&gt;out&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;

    &lt;span class="ident"&gt;send_data&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;out&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;toByteArray&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="symbol"&gt;:type&lt;/span&gt; &lt;span class="punct"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;image/png&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;,&lt;/span&gt; &lt;span class="symbol"&gt;:disposition&lt;/span&gt; &lt;span class="punct"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;inline&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;,&lt;/span&gt; &lt;span class="symbol"&gt;:filename&lt;/span&gt; &lt;span class="punct"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;molecule.png&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;)&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Let's test the application with a real-world example. The achiral SMILES string for &lt;a href="http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=14950"&gt;Carmine&lt;/a&gt; is:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_smiles "&gt;CC1=C2C(=CC(=C1C(=O)O)O)C(=O)C3=C(C2=O)C(=C(C(=C3O)O)C4C(C(C(C(O4)CO)O)O)O)O&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Pointing your browser to &lt;a href="http://localhost:3000/smiles/input"&gt;http://localhost:3000/smiles/input&lt;/a&gt; and entering the above SMILES string produces a color 2-D image of the structure of the red food coloring:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20061122/step_6_1.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;Ruby on Rails is a fun and agile framework for rapid Web development. Although Depict isn't much to look at yet, it demonstrates many key Rails concepts. Several techniques could be used to improve the application's look and usability. For example, we could use &lt;a href="http://www.ajaxian.com/"&gt;AJAX&lt;/a&gt; to depict SMILES strings as they are being typed - without the need to hit return. We could also provide options for changing image format, size, and color scheme. Future articles will describe these and other improvements.&lt;/p&gt;</description>
      <pubDate>Tue, 21 Nov 2006 15:06:00 -0500</pubDate>
      <guid isPermaLink="false">urn:uuid:28d193f3-f174-45ca-84f1-cac552672c29</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2006/11/21/build-a-rails-cheminformatics-application-in-thirty-minutes</link>
      <category>Tools</category>
      <category>webapplication</category>
      <category>web</category>
      <category>rails</category>
      <category>2d</category>
      <category>rcdk</category>
      <category>ruby</category>
      <category>rjb</category>
      <category>amd64</category>
    </item>
    <item>
      <title>Diversity-Oriented Chemical Informatics</title>
      <description>&lt;p&gt;&lt;img src="http://depth-first.com/files/cdk_logo.png" align="right"&gt;&lt;/img&gt;&lt;img src="http://depth-first.com/files/ruby_logo_new.gif" align="right"&gt;&lt;/img&gt;How would you enumerate all of the molecules represented by a molecular formula? This question was recently posed to members of the &lt;a href="http://hardly.cubic.uni-koeln.de/pipermail/blue-obelisk/2006-November/000970.html"&gt;Blue Obelisk mailing list&lt;/a&gt;. Formula-based exhaustive structure enumeration may seem on the surface to be just another esoteric problem. Nevertheless, playing with open, interactive software that can perform such enumerations can be a great source of new ideas for applications and unit tests.&lt;/p&gt;

&lt;p&gt;The &lt;a href="http://cdk.sf.net"&gt;Chemistry Development Kit&lt;/a&gt; offers a fully-functional exhaustive structure enumerator through its &lt;tt&gt;GENMDeterministicGenerator&lt;/tt&gt; class. This article will use &lt;tt&gt;GENMDeterministicGenerator&lt;/tt&gt; through the &lt;a href="http://depth-first.com/articles/2006/10/30/agile-chemical-informatics-development-with-cdk-and-ruby-rcdk-0-3-0"&gt;Ruby CDK&lt;/a&gt; interface to generate color 2-D images for all molecules of a given molecular formula.&lt;/p&gt;

&lt;h4&gt;A Solution&lt;/h4&gt;

&lt;p&gt;The software described in this article will generate a collection of 2-D molecular PNG images based on a user-supplied molecular formula. When viewed in a file browser such as Windows Explorer or &lt;a href="http://www.konqueror.org/"&gt;Konqueror&lt;/a&gt;, the output is visible as a matrix of images. The filename of each image is given by the SMILES string of the corresponding molecule. All molecules are enumerated, whether they look "reasonable" or not. As an example, consider a section of the output for 'C4H8ClNO', which looks like this on my system:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20061115/screenshot.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;h4&gt;Enumerator: A Small Ruby Library&lt;/h4&gt;

&lt;p&gt;We'll create a small Ruby class to do most of the work. Save the following in a file called &lt;strong&gt;enum.rb&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rubygems&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require_gem&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk/util&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="ident"&gt;jrequire&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;org.openscience.cdk.structgen.deterministic.GENMDeterministicGenerator&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;jrequire&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;net.sf.structure.cdk.util.ImageKit&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="keyword"&gt;class &lt;/span&gt;&lt;span class="class"&gt;Enumerator&lt;/span&gt;

  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;initialize&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;formula&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
    &lt;span class="attribute"&gt;@generator&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;Org&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Openscience&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Cdk&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Structgen&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Deterministic&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;GENMDeterministicGenerator&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;formula&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;&lt;/span&gt;&lt;span class="punct"&gt;')&lt;/span&gt;
    &lt;span class="attribute"&gt;@width&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="number"&gt;150&lt;/span&gt;
    &lt;span class="attribute"&gt;@height&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="number"&gt;150&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;

  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;set_size&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;width&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;height&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
    &lt;span class="attribute"&gt;@width&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;width&lt;/span&gt;
    &lt;span class="attribute"&gt;@height&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;height&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;

  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;write_images&lt;/span&gt;
    &lt;span class="ident"&gt;mols&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="attribute"&gt;@generator&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;getStructures&lt;/span&gt;
    &lt;span class="ident"&gt;iterator&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;mols&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;iterator&lt;/span&gt;

    &lt;span class="keyword"&gt;while&lt;/span&gt; &lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;iterator&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;hasNext&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
      &lt;span class="ident"&gt;mol&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;RCDK&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Util&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;XY&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;coordinate_molecule&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;iterator&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;next&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
      &lt;span class="ident"&gt;smiles&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;RCDK&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Util&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Lang&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;get_smiles&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;mol&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;

      &lt;span class="constant"&gt;Net&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Sf&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Structure&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Cdk&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Util&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;ImageKit&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;writePNG&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;mol&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="attribute"&gt;@width&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="attribute"&gt;@height&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;&lt;span class="expr"&gt;#{smiles}&lt;/span&gt;.png&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;)&lt;/span&gt;
    &lt;span class="keyword"&gt;end&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt; &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;As you can see, this class is nothing more than a thin wrapper around a large amount of CDK functionality. Most of the action happens in the &lt;tt&gt;write_images&lt;/tt&gt; method, where three things take place:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;We retrieve from the &lt;tt&gt;GENMDeterministicGenerator&lt;/tt&gt; instance a list of molecules that satisfy the molecular formula passed to the &lt;tt&gt;Enumerator&lt;/tt&gt; constructor.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;We iterate over these molecules.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;For each molecule, we write a PNG image whose filename is the molecule's SMILES string followed by &lt;tt&gt;.png&lt;/tt&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;Testing the Library&lt;/h4&gt;

&lt;p&gt;To test the library, the following code can either be entered interactively via Interactive Ruby (irb) or saved to a file and run with the Ruby interpreter (ruby):&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;enum&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="ident"&gt;e&lt;/span&gt;&lt;span class="punct"&gt;=&lt;/span&gt;&lt;span class="constant"&gt;Enumerator&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;C4H8ClNO&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="ident"&gt;e&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;write_images&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Running this code will produce a collection of PNG images in your working directory. By changing the argument passed to the &lt;tt&gt;Enumerator&lt;/tt&gt; constructor, you can change the makeup of the image set.&lt;/p&gt;

&lt;h4&gt;Prerequisites&lt;/h4&gt;

&lt;p&gt;For this tutorial, you'll need &lt;a href="http://depth-first.com/articles/2006/10/30/agile-chemical-informatics-development-with-cdk-and-ruby-rcdk-0-3-0"&gt;Ruby CDK&lt;/a&gt; (RCDK). A recent article described the small amount of system configuration required for &lt;a href="http://depth-first.com/articles/2006/09/25/cdk-the-ruby-way-rcdk-0-2-0"&gt;RCDK on Linux&lt;/a&gt;. Another article showed how to install &lt;a href="http://depth-first.com/articles/2006/10/12/running-ruby-java-bridge-on-windows"&gt;RCDK on Windows&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;Unexpected Behavior&lt;/h4&gt;

&lt;p&gt;After testing the Enumerator library, you may notice a new file in your working directory called &lt;strong&gt;structuredata.txt&lt;/strong&gt;. This file is written automatically by &lt;tt&gt;GENMDeterministicGenerator&lt;/tt&gt; on instantiation, providing information on each structure that is generated. The &lt;a href="http://cdk.sourceforge.net/api/org/openscience/cdk/structgen/deterministic/GENMDeterministicGenerator.html"&gt;CDK API&lt;/a&gt; does not mention the creation of this file, and it would be preferable for this file to be created only on request. I'll be submitting a &lt;a href="http://sourceforge.net/tracker/?group_id=20024&amp;amp;atid=370024"&gt;feature request&lt;/a&gt; to this effect shortly.&lt;/p&gt;

&lt;h4&gt;Food for Thought&lt;/h4&gt;

&lt;p&gt;If you plan to explore larger areas of chemical space with the Enumerator library, be prepared to wait. The generation of molecules, determination of 2-D coordinates, and rendering can take some time. Of course, the number of molecules increases dramatically with the number of atoms in the molecular formula - a concrete demonstration of what makes organic chemistry the fascinating discipline that it is.&lt;/p&gt;

&lt;p&gt;An interesting variation on the ideas presented here would be to filter out molecules based on one or more criteria. One approach would be to remove molecules containing reactive functionality such as nitrogen substituted with chlorine. A SMARTS pattern search could easily form the basis for this filter. By applying this and similar filters, larger areas of interesting chemical space could be sampled in a reasonable amount of time.&lt;/p&gt;

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;CDK's &lt;tt&gt;GENMDeterministicGenerator&lt;/tt&gt; class, when combined with 2-D structure layout and 2-D rendering, provides the foundation of an intriguing tool for exploring chemical diversity. Further combining this capability with that offered by other freely-available tools offers some thought-provoking possibilities.&lt;/p&gt;</description>
      <pubDate>Wed, 15 Nov 2006 15:03:00 -0500</pubDate>
      <guid isPermaLink="false">urn:uuid:16ee911f-73ea-4056-9f9d-dcad5a698a91</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2006/11/15/diversity-oriented-chemical-informatics</link>
      <category>Tools</category>
      <category>diversity</category>
      <category>cdk</category>
      <category>ruby</category>
      <category>rcdk</category>
      <category>enumeration</category>
      <category>integration</category>
    </item>
    <item>
      <title>Agile Chemical Informatics Development with CDK and Ruby: RCDK-0.3.0</title>
      <description>&lt;p&gt;&lt;img src="http://depth-first.com/files/cdk_logo.png" align="right"&gt;&lt;/img&gt;Ruby Chemistry Development Kit (RCDK) version 0.3.0 is now available from RubyForge. &lt;a href="http://depth-first.com/articles/2006/09/25/cdk-the-ruby-way-rcdk-0-2-0"&gt;RCDK&lt;/a&gt; enables the complete CDK API to be accessed from Ruby. This release adds support for &lt;a href="http://depth-first.com/articles/2006/10/17/from-iupac-nomenclature-to-2-d-structures-with-opsin"&gt;IUPAC nomenclature translation&lt;/a&gt;  and &lt;a href="http://depth-first.com/articles/2006/10/24/metaprogramming-with-ruby-mapping-java-packages-onto-ruby-modules"&gt;tighter Java integration&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;Dependencies&lt;/h4&gt;

&lt;p&gt;RCDK requires Ruby, the Ruby developer libraries, a working build toolchain, and &lt;a href="http://rjb.rubyforge.org"&gt;Ruby Java Bridge&lt;/a&gt; (RJB). This latter dependency can be satisfied during the RCDK installation process if the RubyGems method is used (see 'Installation').&lt;/p&gt;

&lt;h4&gt;Installation&lt;/h4&gt;

&lt;p&gt;RCDK can be conveniently installed using the &lt;a href="http://rubygems.org/"&gt;RubyGems&lt;/a&gt; packaging mechanism:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
# gem install rcdk
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Alternatively, the source package and RubyGem can be downloaded &lt;a href="http://rubyforge.org/frs/?group_id=2199"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;Tighter Java Integration&lt;/h4&gt;

&lt;p&gt;RCDK-0.3.0 introduces a previously-described &lt;a href="http://depth-first.com/articles/2006/10/24/metaprogramming-with-ruby-mapping-java-packages-onto-ruby-modules"&gt;Java package to Ruby module mapping mechanism&lt;/a&gt;. For example, if you'd like to create a Java &lt;tt&gt;ArrayList&lt;/tt&gt;, it can be done through the new &lt;tt&gt;jrequire&lt;/tt&gt; command:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rubygems&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require_gem&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;jrequire&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;java.util.ArrayList&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="ident"&gt;list&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;Java&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Util&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;ArrayList&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;

&lt;span class="ident"&gt;list&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;size&lt;/span&gt; &lt;span class="comment"&gt;# =&amp;gt; 0&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h4&gt;IUPAC Nomenclature Translation&lt;/h4&gt;

&lt;p&gt;RCDK's most important new chemical informatics feature is made possible by &lt;a href="http://wwmm.ch.cam.ac.uk/blogs/corbett/"&gt;Peter Corbett's&lt;/a&gt; excellent IUPAC nomenclature translation library &lt;a href="http://depth-first.com/articles/2006/10/17/from-iupac-nomenclature-to-2-d-structures-with-opsin"&gt;OPSIN&lt;/a&gt;. It can either be used directly with &lt;tt&gt;jrequire&lt;/tt&gt;, or indirectly through RCDK's convenience library &lt;tt&gt;RCDK::Util&lt;/tt&gt;:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rubygems&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require_gem&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk/util&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="ident"&gt;mol&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;RCDK&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Util&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Lang&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;read_iupac&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;quinoline&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;mol&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;getAtomCount&lt;/span&gt; &lt;span class="comment"&gt;# =&amp;gt; 10&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;There are two things to notice here. First, no &lt;tt&gt;jrequire&lt;/tt&gt; statement is needed when using the &lt;tt&gt;RCDK::Util&lt;/tt&gt; library. Second, there is a multisecond delay after &lt;tt&gt;read_iupac&lt;/tt&gt; is invoked. OPSIN itself introduces this delay during the &lt;tt&gt;NameToStructure&lt;/tt&gt; constructor call, and RCDK inherits this behavior. However, after the first invocation of &lt;tt&gt;read_iupac&lt;/tt&gt;, subsequent calls to this method are very fast.&lt;/p&gt;

&lt;p&gt;Let's decorate the quinoline nucleus with some substituents and render a 2-D image of the result. Execute the following code, either through the Ruby interpreter (&lt;tt&gt;ruby&lt;/tt&gt;) or through Interactive Ruby (&lt;tt&gt;irb&lt;/tt&gt;):&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rubygems&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require_gem&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk/util&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="constant"&gt;RCDK&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Util&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Image&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;iupac_to_png&lt;/span&gt;&lt;span class="punct"&gt;('&lt;/span&gt;&lt;span class="string"&gt;3-chloro-4-(2-aminopropyl)-6-mercapto-8-(2-hydroxyphenyl)-quinoline-2-carboxylic acid&lt;/span&gt;&lt;span class="punct"&gt;',&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;test.png&lt;/span&gt;&lt;span class="punct"&gt;',&lt;/span&gt; &lt;span class="number"&gt;300&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="number"&gt;300&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Running this code produces the following image in your working directory:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20061030/test.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;h4&gt;Be Agile&lt;/h4&gt;

&lt;p&gt;RCDK marries the &lt;a href="http://www.martinfowler.com/articles/newMethodology.html"&gt;agility&lt;/a&gt; of the Ruby language with the functionality of three Open Source chemical informatics libraries: &lt;a href="http://cdk.sf.net"&gt;CDK&lt;/a&gt;, &lt;a href="http://depth-first.com/articles/2006/10/14/decoding-iupac-names-with-opsin"&gt;OPSIN&lt;/a&gt;, and &lt;a href="http://depth-first.com/articles/2006/08/28/drawing-2-d-structures-with-structure-cdk"&gt;Structure-CDK&lt;/a&gt;. Future articles will discuss some simple applications of this powerful combination.&lt;/p&gt;</description>
      <pubDate>Mon, 30 Oct 2006 14:03:00 -0500</pubDate>
      <guid isPermaLink="false">urn:uuid:a5e77e1e-7c24-47e4-90e9-ed1e068b19c2</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2006/10/30/agile-chemical-informatics-development-with-cdk-and-ruby-rcdk-0-3-0</link>
      <category>Tools</category>
      <category>rcdk</category>
      <category>ruby</category>
      <category>cdk</category>
      <category>opsin</category>
      <category>agile</category>
      <category>java</category>
      <category>integration</category>
    </item>
  </channel>
</rss>
