<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/stylesheets/rss.css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>Depth-First: Tag integration</title>
    <link>http://depth-first.com/articles/tag/integration</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description>Walking the Web of Chemical Informatics</description>
    <item>
      <title>Anatomy of a Cheminformatics Web Application: Ajaxifying Depict</title>
      <description>&lt;p&gt;&lt;a href="http://rubyonrails.org"&gt;&lt;img src="http://depth-first.com/files/rails_logo.png" align="right" border="0"&gt;&lt;/img&gt;&lt;/a&gt;The &lt;a href="http://depth-first.com/articles/2006/11/27/anatomy-of-a-cheminformatics-web-application-beautifying-depict"&gt;previous tutorial in this series&lt;/a&gt; showed some techniques for improving the appearance and usability of a simple cheminformatics Web application. That application, Depict, rendered color images of 2-D molecular structures when given a SMILES string. Still, something is missing. Wouldn't it be better if the application responded to individual keystrokes in the input field, rather than waiting for the user to hit the return key? In this tutorial, we'll see how to quickly accomplish this effect with a technology called "Ajax."&lt;/p&gt;

&lt;h4&gt;Downloads and Prerequisites&lt;/h4&gt;

&lt;p&gt;For this tutorial, you'll need &lt;a href="http://depth-first.com/articles/2006/10/30/agile-chemical-informatics-development-with-cdk-and-ruby-rcdk-0-3-0"&gt;Ruby CDK&lt;/a&gt; (RCDK). A recent article described the small amount of system configuration required for &lt;a href="http://depth-first.com/articles/2006/09/25/cdk-the-ruby-way-rcdk-0-2-0"&gt;RCDK on Linux&lt;/a&gt;. Another article showed how to install &lt;a href="http://depth-first.com/articles/2006/10/12/running-ruby-java-bridge-on-windows"&gt;RCDK on Windows&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In addition, you'll need to install &lt;a href="http://www.rubyonrails.org/down"&gt;Ruby on Rails&lt;/a&gt; - something that can be done through &lt;a href="http://docs.rubygems.org/"&gt;RubyGems&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The Rails application that this tutorial starts with can be downloaded from &lt;a href="http://depth-first.com/demo/20061127/depict-20061127.tar.gz"&gt;this link&lt;/a&gt;. If you'd rather start working directly with the version of Depict produced by applying the changes outlined in this tutorial, the full source code can be downloaded from &lt;a href="http://depth-first.com/demo/20061204/depict-20061204.tar.gz"&gt;this link&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you'll be running Depict on an AMD64 Linux system, you'll need to prefix your invocation of &lt;tt&gt;script/server&lt;/tt&gt; with an &lt;tt&gt;LD_PRELOAD&lt;/tt&gt; setting. For example, on my system running Sun's JVM, the full command looks like:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ LD_PRELOAD=/usr/java/jdk1.5.0_09/jre/lib/amd64/libzip.so ruby script/server
&lt;/pre&gt;
&lt;/div&gt;

&lt;h4&gt;A Brief Introduction to Ajax&lt;/h4&gt;

&lt;p&gt;When stripped down to its essentials, &lt;a href="http://ajaxian.com/"&gt;Ajax&lt;/a&gt; is nothing more than an asynchronous communication channel between Web browsers and Web servers. In the pre-Ajax model of client-server Web interactions, a browser would make a request to a server and then wait until getting a server response, which would take the form of a complete Web page. In the Ajax model, a browser makes a request to a server, continuing to function while the server generates a response, which takes the form of a small section of the page that gets replaced. For this reason, Ajax-enabled Web sites are far more application-like than the document-centric sites that preceded them.&lt;/p&gt;
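
&lt;p&gt;To make the asynchronous model concrete, here is a minimal sketch in plain Ruby rather than browser JavaScript. A thread stands in for the Ajax request and a hash stands in for the page. The names &lt;tt&gt;page&lt;/tt&gt;, &lt;tt&gt;gate&lt;/tt&gt;, and &lt;tt&gt;log&lt;/tt&gt; are illustrative assumptions and are not part of Depict, Rails, or any browser API.&lt;/p&gt;

```ruby
# Sketch only: Ruby threads stand in for a browser and its Ajax request.
# The page hash, the gate queue, and the strings are all hypothetical.
require 'thread'

page = { header: 'Depict', results: 'old image' }
log  = []
gate = Queue.new   # lets the simulated server finish only when we allow it

# Fire the request in the background; the caller does not block.
request = Thread.new do
  gate.pop         # simulated server latency
  'new image'      # the response is a fragment, not a whole page
end

# The "browser" keeps working while the request is still pending.
log.push('user keeps typing')

gate.push(:go)                   # the server finishes its work
page[:results] = request.value   # only one region of the page is replaced
log.push('results region replaced')
```

&lt;p&gt;Only the &lt;tt&gt;results&lt;/tt&gt; entry changes and the rest of the page is left alone, which is the essential difference from a full page reload.&lt;/p&gt;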

&lt;h4&gt;Ajax Support in Rails&lt;/h4&gt;

&lt;p&gt;&lt;a href="http://www.amazon.com/gp/product/0977616630?ie=UTF8&amp;amp;tag=depthfirst-20&amp;amp;linkCode=as2&amp;amp;camp=1789&amp;amp;creative=9325&amp;amp;creativeASIN=0977616630"&gt;&lt;img border="0" src="http://depth-first.com/demo/20061127/0977616630.01._AA_SCMZZZZZZZ_V36350687_.jpg" align="right"&gt;&lt;/a&gt;&lt;img src="http://www.assoc-amazon.com/e/ir?t=depthfirst-20&amp;amp;l=as2&amp;amp;o=1&amp;amp;a=0977616630" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /&gt;Ajax is implemented in JavaScript using the &lt;tt&gt;XMLHttpRequest&lt;/tt&gt; object, although working at this level can require a lot of code to do anything meaningful. Fortunately, Rails and other Web application frameworks provide high-level interfaces to Ajax. In Rails, Ajax support takes the form of a variety of helper methods, one of which we'll use in this tutorial: &lt;tt&gt;observe_field&lt;/tt&gt;. This method, an instance of the &lt;a href="http://en.wikipedia.org/wiki/Observer_pattern"&gt;Observer Pattern&lt;/a&gt;, assigns an Observer to monitor input activity in a text field.&lt;/p&gt;

&lt;h4&gt;The Problem at Hand&lt;/h4&gt;

&lt;p&gt;We'd like Depict to provide immediate feedback by rendering a SMILES string as it is keyed into the input field. If the partial SMILES string is valid, it will be rendered; otherwise, an error image will be shown. At no point will the user need to press the return key to see an image of the SMILES string they are typing.&lt;/p&gt;
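
&lt;p&gt;The decision described above can be sketched in a few lines of Ruby. This is only an illustration: &lt;tt&gt;plausible_smiles?&lt;/tt&gt; and &lt;tt&gt;action_for&lt;/tt&gt; are hypothetical helpers, and the bracket-balancing check is a crude stand-in for a real SMILES parser such as the one in CDK.&lt;/p&gt;

```ruby
# Sketch only. plausible_smiles? is a crude, hypothetical stand-in for a real
# SMILES parser: it checks nothing except that parentheses and square
# brackets are balanced and properly nested.
def plausible_smiles?(smiles)
  stack = []
  smiles.each_char do |char|
    case char
    when '(', '['
      stack.push(char)
    when ')'
      return false unless stack.pop == '('
    when ']'
      return false unless stack.pop == '['
    end
  end
  stack.empty?
end

# Decide what to do with whatever is currently in the input field.
def action_for(smiles)
  plausible_smiles?(smiles) ? :render : :error
end

action_for('CC(=O)O')   # :render
action_for('CC(=O')     # :error, the branch is still open
```

&lt;p&gt;A half-typed string such as &lt;tt&gt;CC(=O&lt;/tt&gt; takes the error path until the closing parenthesis arrives. Note that this stand-in accepts strings a real parser would reject, such as one with an unclosed ring.&lt;/p&gt;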

&lt;h4&gt;Step 1: Ajaxify the View&lt;/h4&gt;

&lt;p&gt;Let's start by adding an observer to Depict's input field. These changes will occur to the SMILES View, contained in &lt;strong&gt;depict/app/views/smiles/depict.rhtml&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_default "&gt;&amp;lt;html&amp;gt;
  &amp;lt;head&amp;gt;
    &amp;lt;title&amp;gt;Depict&amp;lt;/title&amp;gt;
    &amp;lt;%= stylesheet_link_tag &amp;quot;default&amp;quot;, :media =&amp;gt; &amp;quot;all&amp;quot; %&amp;gt;

    &amp;lt;!-- Nothing works without this line. --&amp;gt;
    &amp;lt;%= javascript_include_tag :defaults %&amp;gt;
  &amp;lt;/head&amp;gt;
  &amp;lt;body&amp;gt;
    &amp;lt;h1&amp;gt;Depict a SMILES String&amp;lt;/h1&amp;gt;

    &amp;lt;!-- New id attribute needed by Ajax --&amp;gt;
    &amp;lt;div class=&amp;quot;image&amp;quot; id=&amp;quot;results&amp;quot; &amp;gt;
      &amp;lt;img src=&amp;quot;&amp;lt;%= image_for_smiles :smiles =&amp;gt; @smiles %&amp;gt;&amp;quot;&amp;gt;&amp;lt;/img&amp;gt;
    &amp;lt;/div&amp;gt;
    &amp;lt;br /&amp;gt;&amp;lt;br /&amp;gt;
      &amp;lt;div class=&amp;quot;smiles&amp;quot;&amp;gt;
      &amp;lt;%= form_tag :action=&amp;gt;'depict' %&amp;gt;
        &amp;lt;label&amp;gt;SMILES: &amp;lt;/label&amp;gt;

        &amp;lt;!-- Ajaxified text field. --&amp;gt;
        &amp;lt;!-- We turn off autocomplete to simplify the interface. --&amp;gt;
        &amp;lt;%= text_field_tag :smiles, @smiles, {:autocomplete =&amp;gt; &amp;quot;off&amp;quot;} %&amp;gt;
        &amp;lt;%= observe_field( :smiles,
                           :frequency =&amp;gt; 0.5,
                           :update    =&amp;gt; :results,
                           :url       =&amp;gt; { :action =&amp;gt; :ajax_depict } ) %&amp;gt;
      &amp;lt;%= end_form_tag %&amp;gt;
      &amp;lt;/div&amp;gt;

    &amp;lt;div class=&amp;quot;about&amp;quot;&amp;gt;
      &amp;lt;!-- Update the URL to point to the new Depth-First article --&amp;gt;
      &amp;lt;a href=&amp;quot;http://depth-first.com/articles/2006/12/04/anatomy-of-a-cheminformatics-web-application-ajaxifying-depict&amp;quot;&amp;gt;About this Application&amp;lt;/a&amp;gt;
    &amp;lt;/div&amp;gt;
  &amp;lt;/body&amp;gt;
&amp;lt;/html&amp;gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The above code introduces three key elements:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The &lt;tt&gt;javascript_include_tag&lt;/tt&gt; method is called, which is surprisingly easy to forget to do.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The original &lt;tt&gt;text_field&lt;/tt&gt; method call is replaced by &lt;tt&gt;text_field_tag&lt;/tt&gt; to simplify coding. We disable browser-based autocompletion by setting the &lt;tt&gt;autocomplete&lt;/tt&gt; attribute to &lt;tt&gt;off&lt;/tt&gt;. This removes a feature unlikely to ever be used, and leads to a more streamlined interface.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The &lt;tt&gt;observe_field&lt;/tt&gt; method is called, linking activity in the text field to an Ajax action, &lt;tt&gt;ajax_depict&lt;/tt&gt;, that will update the image area. To accomplish this, we assign the &lt;tt&gt;div&lt;/tt&gt; containing our image the id "results."&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
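
&lt;p&gt;The behavior we expect from &lt;tt&gt;observe_field&lt;/tt&gt; with a &lt;tt&gt;:frequency&lt;/tt&gt; setting can be modeled in plain Ruby: the field is sampled at a fixed interval, and a request goes out only when the value has changed since the last sample. The sketch below is an approximation under that assumption. &lt;tt&gt;FieldObserver&lt;/tt&gt; and &lt;tt&gt;poll&lt;/tt&gt; are hypothetical names, not part of Rails or its JavaScript libraries.&lt;/p&gt;

```ruby
# Sketch only: a plain-Ruby model of polling observation. FieldObserver and
# its methods are hypothetical and are not part of Rails.
class FieldObserver
  def initialize(initial_value, callback)
    @last_value = initial_value
    @callback   = callback
  end

  # Called once per polling interval with the current field contents.
  # The callback fires only if the value changed since the last poll.
  def poll(current_value)
    return false if current_value == @last_value
    @last_value = current_value
    @callback.call(current_value)
    true
  end
end

requests = []
observer = FieldObserver.new('', lambda { |value| requests.push(value) })

# Simulated field contents at successive 0.5 second ticks.
['C', 'C', 'CC', 'CCO', 'CCO'].each { |value| observer.poll(value) }
```

&lt;p&gt;Five ticks produce only three requests here, because two of the samples found the field unchanged. Polling in this way bounds the number of requests per second no matter how fast the user types.&lt;/p&gt;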

&lt;p&gt;Making these changes and refreshing the browser window gives a screen like the one below:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20061204/step_1_1.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;Although the client side of the Ajax communication channel is working, the server side is not. Let's fix that.&lt;/p&gt;

&lt;h4&gt;Step 2: Ajaxify the Server&lt;/h4&gt;

&lt;p&gt;Depict needs an Action and View that will be invoked in response to keyboard events in the SMILES input box. To do this, first add a new &lt;tt&gt;ajax_depict&lt;/tt&gt; method to &lt;tt&gt;SmilesController&lt;/tt&gt;, the source for which is found in &lt;strong&gt;depict/app/controllers/smiles_controller.rb&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="keyword"&gt;class &lt;/span&gt;&lt;span class="class"&gt;SmilesController&lt;/span&gt; &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt; &lt;span class="constant"&gt;ApplicationController&lt;/span&gt;
  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;depict&lt;/span&gt;
    &lt;span class="keyword"&gt;if&lt;/span&gt; &lt;span class="ident"&gt;params&lt;/span&gt;&lt;span class="punct"&gt;[&lt;/span&gt;&lt;span class="symbol"&gt;:smiles&lt;/span&gt;&lt;span class="punct"&gt;]&lt;/span&gt;
      &lt;span class="attribute"&gt;@smiles&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;params&lt;/span&gt;&lt;span class="punct"&gt;[&lt;/span&gt;&lt;span class="symbol"&gt;:smiles&lt;/span&gt;&lt;span class="punct"&gt;][&lt;/span&gt;&lt;span class="symbol"&gt;:value&lt;/span&gt;&lt;span class="punct"&gt;]&lt;/span&gt;
    &lt;span class="keyword"&gt;else&lt;/span&gt;
      &lt;span class="attribute"&gt;@smiles&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
    &lt;span class="keyword"&gt;end&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;

  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;image_for&lt;/span&gt;
    &lt;span class="keyword"&gt;if&lt;/span&gt; &lt;span class="ident"&gt;flash&lt;/span&gt;&lt;span class="punct"&gt;[&lt;/span&gt;&lt;span class="symbol"&gt;:bytes&lt;/span&gt;&lt;span class="punct"&gt;]&lt;/span&gt;
      &lt;span class="ident"&gt;send_data&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;flash&lt;/span&gt;&lt;span class="punct"&gt;[&lt;/span&gt;&lt;span class="symbol"&gt;:bytes&lt;/span&gt;&lt;span class="punct"&gt;],&lt;/span&gt; &lt;span class="symbol"&gt;:type&lt;/span&gt; &lt;span class="punct"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;image/png&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;,&lt;/span&gt; &lt;span class="symbol"&gt;:disposition&lt;/span&gt; &lt;span class="punct"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;inline&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;,&lt;/span&gt; &lt;span class="symbol"&gt;:filename&lt;/span&gt; &lt;span class="punct"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;&lt;span class="expr"&gt;#{flash[:smiles]}&lt;/span&gt;.png&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;)&lt;/span&gt;
    &lt;span class="keyword"&gt;end&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;

  &lt;span class="comment"&gt;# The new ajax_depict method.&lt;/span&gt;
  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;ajax_depict&lt;/span&gt;
    &lt;span class="attribute"&gt;@smiles&lt;/span&gt;&lt;span class="punct"&gt;=&lt;/span&gt;&lt;span class="ident"&gt;request&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;raw_post&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Making the above changes and refreshing your browser should give an error message:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20061204/step_2_1.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;The new &lt;tt&gt;ajax_depict&lt;/tt&gt; method is being called, but no associated template exists. This template contains the HTML that will be inserted into the &lt;tt&gt;div&lt;/tt&gt; with the &lt;tt&gt;results&lt;/tt&gt; id attribute that we set up in Step 1. We can resolve the error we're getting by simply creating a new file (&lt;strong&gt;depict/app/views/smiles/ajax_depict.rhtml&lt;/strong&gt;) containing the following partial template:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_default "&gt;&amp;lt;img src=&amp;quot;&amp;lt;%= image_for_smiles :smiles =&amp;gt; @smiles %&amp;gt;&amp;quot;&amp;gt;&amp;lt;/img&amp;gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Now, refreshing your browser should produce a screen like that shown below. We have now Ajaxified Depict, but we're not quite done yet.&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20061204/step_2_2.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;h4&gt;Step 3: Update the Cascading Style Sheet&lt;/h4&gt;

&lt;p&gt;As you type a SMILES string into the input window, you may have noticed the input box being repositioned toward the top of the application window just prior to the display of a new image. This is due to the image area being resized to zero height as the new image is generated.&lt;/p&gt;

&lt;p&gt;Fortunately, the fix is simple; we'll just specify that the image area must be 400 pixels high, whether an image is being displayed or not. This is done by editing the &lt;tt&gt;image&lt;/tt&gt; selector in the CSS file at &lt;strong&gt;depict/public/stylesheets/default.css&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_default "&gt;.image {
    margin-left: auto;
    margin-right: auto;
    width: 400px;
    /* Keeps the input box from moving during image refresh.*/
    height: 400px;
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Refreshing the Depict window should now give a statically-positioned SMILES input field.&lt;/p&gt;

&lt;h4&gt;Step 4: Backward Compatibility&lt;/h4&gt;

&lt;p&gt;As it stands, if the user presses the return key, they will see the "Enter SMILES Below" message. This is due to a change in the way the SMILES string reaches the application: the field created by &lt;tt&gt;text_field_tag&lt;/tt&gt; is submitted as &lt;tt&gt;params[:smiles]&lt;/tt&gt;, whereas the &lt;tt&gt;depict&lt;/tt&gt; action still reads &lt;tt&gt;params[:smiles][:value]&lt;/tt&gt;. To fix this problem, we simply change the way that &lt;tt&gt;SmilesController&lt;/tt&gt; assigns the &lt;tt&gt;@smiles&lt;/tt&gt; instance variable (&lt;strong&gt;depict/app/controllers/smiles_controller.rb&lt;/strong&gt;):&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;depict&lt;/span&gt;
  &lt;span class="comment"&gt;# Uses new input method.&lt;/span&gt;
  &lt;span class="keyword"&gt;if&lt;/span&gt; &lt;span class="ident"&gt;params&lt;/span&gt;&lt;span class="punct"&gt;[&lt;/span&gt;&lt;span class="symbol"&gt;:smiles&lt;/span&gt;&lt;span class="punct"&gt;]&lt;/span&gt;
    &lt;span class="attribute"&gt;@smiles&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;params&lt;/span&gt;&lt;span class="punct"&gt;[&lt;/span&gt;&lt;span class="symbol"&gt;:smiles&lt;/span&gt;&lt;span class="punct"&gt;]&lt;/span&gt;
  &lt;span class="keyword"&gt;else&lt;/span&gt;
    &lt;span class="attribute"&gt;@smiles&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Making this change produces an interface that will render the correct image whether the return key is pressed or not. If JavaScript is disabled, Depict will work exactly the same way as it did in the non-Ajax version.&lt;/p&gt;
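
&lt;p&gt;If you wanted the action to tolerate both the old nested parameter and the new flat one, the lookup could be factored into a small helper. The sketch below is one possible approach, not what Depict actually does: &lt;tt&gt;smiles_from&lt;/tt&gt; is a hypothetical name, and a plain Ruby hash stands in for the Rails &lt;tt&gt;params&lt;/tt&gt; object.&lt;/p&gt;

```ruby
# Sketch only: smiles_from is a hypothetical helper, and a plain Hash stands
# in for the Rails params object.
def smiles_from(params)
  value = params[:smiles]
  value = value[:value] if value.is_a?(Hash)   # older nested form field
  value.nil? ? '' : value.to_s
end

smiles_from({ smiles: 'CCO' })              # flat form: returns "CCO"
smiles_from({ smiles: { value: 'CCO' } })   # nested form: returns "CCO"
smiles_from({})                             # nothing submitted: returns ""
```

&lt;p&gt;Falling back to an empty string keeps the view from ever receiving &lt;tt&gt;nil&lt;/tt&gt;, which mirrors the &lt;tt&gt;else&lt;/tt&gt; branch in the action above.&lt;/p&gt;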

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;Ajax makes the Web more attractive than ever as an application development platform. In this tutorial, we've seen how using Rails made it very easy to give Depict the feel of an interactive SMILES depiction tool using Ajax. But a few details remain before we're ready to deploy this application on a Web server for the public to use. For example, we need to take server load and network latency into account, and we need to make sure Depict works well on all major browsers. The next articles in this series will address these issues.&lt;/p&gt;</description>
      <pubDate>Mon, 04 Dec 2006 15:06:00 -0500</pubDate>
      <guid isPermaLink="false">urn:uuid:04460726-dd5b-40ce-aeb8-1b8f055932a8</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2006/12/04/anatomy-of-a-cheminformatics-web-application-ajaxifying-depict</link>
      <category>Tools</category>
      <category>depict</category>
      <category>rails</category>
      <category>cheminformatics</category>
      <category>2d</category>
      <category>integration</category>
      <category>ajax</category>
    </item>
    <item>
      <title>Molbank and the Convergence of Open Access, Open Data, and Open Source in Chemistry</title>
      <description>&lt;p&gt;&lt;a href="http://www.mdpi.org/"&gt;&lt;img src="http://depth-first.com/files/mdpi-small.gif" border="0" align="right"&gt;&lt;/img&gt;&lt;/a&gt;&lt;a href="http://www.mdpi.org/molbank/"&gt;Molbank&lt;/a&gt;, published by &lt;a href="http://www.mdpi.org/"&gt;Molecular Diversity Preservation International&lt;/a&gt;, is one of the oldest of a handful of &lt;a href="http://depth-first.com/articles/2006/10/18/disruptive-innovation-in-scientific-publishing-directory-of-open-access-journals"&gt;Open Access journals in chemistry&lt;/a&gt;. Although its longevity is a remarkable accomplishment in itself, there is much more to Molbank than meets the eye. Just below the surface is a feature so revolutionary, yet simple, that chemistry publishers years from now will wonder why &lt;em&gt;they&lt;/em&gt; didn't implement it sooner.&lt;/p&gt;

&lt;p&gt;A Molbank article consists of a short monograph on a single compound, or possibly two. This may strike some scientists as a strange way to publish results, and it is unusual. On the other hand, this system offers vast potential to capture useful, but "unpublishable" findings that would otherwise be lost. Back when scientists actually read hardcopy journals, such a system would never have been feasible. Today, with hard drive space measured in terabytes, fiber optics cables crisscrossing the planet, Internet connectivity for almost everyone, and servers that can be had for virtually nothing, this system not only looks perfectly feasible, but preferable in many ways to the status quo.&lt;/p&gt;

&lt;p&gt;Here's the revolutionary part: each article that Molbank publishes is accompanied by a publicly-available, machine-readable file encoding the structure of the article's subject molecule. That's it. There's nothing tricky or high-tech about it. In fact, the practice is about as low-tech as you could imagine. The file format in which structures are encoded, molfile, dates back at least fifteen years, and nearly every piece of chemistry software - both end-user and developer tools - can handle it. What makes Molbank's practice revolutionary is that no other chemistry journal, Open Access or subscription-based, currently does this.&lt;/p&gt;

&lt;p&gt;Why does the simple inclusion of a publicly-available molfile encoding molecular structures in a paper matter so much? This is where the other two members of the trinity named in this article's title come into play: Open Source and Open Data. By providing a mechanism for a computer to decipher the chemistry in a paper, Molbank has opened the door to a host of highly-productive integration activities that nobody outside of &lt;a href="http://www.cas.org/"&gt;Chemical Abstracts Service&lt;/a&gt; has even been able to contemplate, let alone prepare for.&lt;/p&gt;

&lt;p&gt;This article is the first in a series aimed at exploring the wide-open space that Molbank has created. Rather than arguing my point with words, I'll actually build working demonstrations of what is now easily within reach. At the same time, I'll document my work on this blog. I'm not sure where all of this will end up, but I do hope to shine some light on a vital, although currently obscure, component of the Open Access debate.&lt;/p&gt;</description>
      <pubDate>Thu, 30 Nov 2006 15:01:00 -0500</pubDate>
      <guid isPermaLink="false">urn:uuid:0ec69fe1-07ac-46d0-9112-95afd038e81f</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2006/11/30/molbank-and-the-convergence-of-open-access-open-data-and-open-source-in-chemistry</link>
      <category>Open X</category>
      <category>opensource</category>
      <category>opendata</category>
      <category>openaccess</category>
      <category>mdpi</category>
      <category>molbank</category>
      <category>integration</category>
      <category>molfile</category>
    </item>
    <item>
      <title>Diversity-Oriented Chemical Informatics</title>
      <description>&lt;p&gt;&lt;img src="http://depth-first.com/files/cdk_logo.png" align="right"&gt;&lt;/img&gt;&lt;img src="http://depth-first.com/files/ruby_logo_new.gif" align="right"&gt;&lt;/img&gt;How would you enumerate all of the molecules represented by a molecular formula? This question was recently posed to members of the &lt;a href="http://hardly.cubic.uni-koeln.de/pipermail/blue-obelisk/2006-November/000970.html"&gt;Blue Obelisk mailing list&lt;/a&gt;. Formula-based exhaustive structure enumeration may seem on the surface to be just another esoteric problem. Nevertheless, playing with open, interactive software that can perform such enumerations can be a great source of new ideas for applications and unit tests.&lt;/p&gt;

&lt;p&gt;The &lt;a href="http://cdk.sf.net"&gt;Chemistry Development Kit&lt;/a&gt; offers a fully-functional exhaustive structure enumerator through its &lt;tt&gt;GENMDeterministicGenerator&lt;/tt&gt; class. This article will use &lt;tt&gt;GENMDeterministicGenerator&lt;/tt&gt; through the &lt;a href="http://depth-first.com/articles/2006/10/30/agile-chemical-informatics-development-with-cdk-and-ruby-rcdk-0-3-0"&gt;Ruby CDK&lt;/a&gt; interface to generate color 2-D images for all molecules of a given molecular formula.&lt;/p&gt;

&lt;h4&gt;A Solution&lt;/h4&gt;

&lt;p&gt;The software described in this article will generate a collection of 2-D molecular PNG images based on a user-supplied molecular formula. When viewed in a file browser such as Windows Explorer or &lt;a href="http://www.konqueror.org/"&gt;Konqueror&lt;/a&gt;, the output is visible as a matrix of images. The filename of each image is given by the SMILES string of the corresponding molecule. All molecules are enumerated, whether they look "reasonable" or not. As an example, consider a section of the output for 'C4H8ClNO', which looks like this on my system:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20061115/screenshot.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;h4&gt;Enumerator: A Small Ruby Library&lt;/h4&gt;

&lt;p&gt;We'll create a small Ruby class to do most of the work. Save the following in a file called &lt;strong&gt;enum.rb&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rubygems&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require_gem&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk/util&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="ident"&gt;jrequire&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;org.openscience.cdk.structgen.deterministic.GENMDeterministicGenerator&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;jrequire&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;net.sf.structure.cdk.util.ImageKit&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="keyword"&gt;class &lt;/span&gt;&lt;span class="class"&gt;Enumerator&lt;/span&gt;

  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;initialize&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;formula&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
    &lt;span class="attribute"&gt;@generator&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;Org&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Openscience&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Cdk&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Structgen&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Deterministic&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;GENMDeterministicGenerator&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;formula&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;&lt;/span&gt;&lt;span class="punct"&gt;')&lt;/span&gt;
    &lt;span class="attribute"&gt;@width&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="number"&gt;150&lt;/span&gt;
    &lt;span class="attribute"&gt;@height&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="number"&gt;150&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;

  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;set_size&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;width&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;height&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
    &lt;span class="attribute"&gt;@width&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;width&lt;/span&gt;
    &lt;span class="attribute"&gt;@height&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;height&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;

  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;write_images&lt;/span&gt;
    &lt;span class="ident"&gt;mols&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="attribute"&gt;@generator&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;getStructures&lt;/span&gt;
    &lt;span class="ident"&gt;iterator&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;mols&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;iterator&lt;/span&gt;

    &lt;span class="keyword"&gt;while&lt;/span&gt; &lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;iterator&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;hasNext&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
      &lt;span class="ident"&gt;mol&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;RCDK&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Util&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;XY&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;coordinate_molecule&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;iterator&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;next&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
      &lt;span class="ident"&gt;smiles&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;RCDK&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Util&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Lang&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;get_smiles&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;mol&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;

      &lt;span class="constant"&gt;Net&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Sf&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Structure&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Cdk&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Util&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;ImageKit&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;writePNG&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;mol&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="attribute"&gt;@width&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="attribute"&gt;@height&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;&lt;span class="expr"&gt;#{smiles}&lt;/span&gt;.png&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;)&lt;/span&gt;
    &lt;span class="keyword"&gt;end&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt; &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;As you can see, this class is nothing more than a thin wrapper around a large amount of CDK functionality. Most of the action happens in the &lt;tt&gt;write_images&lt;/tt&gt; method, where three things take place:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;We retrieve from the &lt;tt&gt;GENMDeterministicGenerator&lt;/tt&gt; instance a list of molecules that satisfy the molecular formula passed to the &lt;tt&gt;Enumerator&lt;/tt&gt; constructor.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;We iterate over these molecules.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;For each molecule, an image is written with the filename given by its SMILES string.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
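
&lt;p&gt;The &lt;tt&gt;hasNext&lt;/tt&gt; and &lt;tt&gt;next&lt;/tt&gt; loop in &lt;tt&gt;write_images&lt;/tt&gt; is the usual Java iterator idiom. If you find yourself writing it often, one option is to wrap it in a block-taking method. The sketch below runs without a JVM: &lt;tt&gt;JavaStyleIterator&lt;/tt&gt; is a hypothetical stand-in for the iterator object returned through RCDK, and &lt;tt&gt;each_java&lt;/tt&gt; is a hypothetical helper, not part of RCDK.&lt;/p&gt;

```ruby
# Sketch only: JavaStyleIterator is a hypothetical stand-in for the Java
# iterator returned through RCDK, so this example runs without a JVM.
class JavaStyleIterator
  def initialize(items)
    @items = items.dup
  end

  def hasNext
    !@items.empty?
  end

  def next
    @items.shift
  end
end

# Hypothetical helper: yields each remaining element of a Java-style iterator.
def each_java(iterator)
  yield iterator.next while iterator.hasNext
end

# Build one filename per SMILES string, as write_images does for each molecule.
filenames = []
each_java(JavaStyleIterator.new(['CCO', 'COC'])) do |smiles|
  filenames.push("#{smiles}.png")
end
```

&lt;p&gt;With a helper like this, the body of &lt;tt&gt;write_images&lt;/tt&gt; could be expressed as a single block, which reads more like ordinary Ruby.&lt;/p&gt;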

&lt;h4&gt;Testing the Library&lt;/h4&gt;

&lt;p&gt;To test the library, the following code can either be entered interactively via Interactive Ruby (irb) or saved to a file and run with the Ruby interpreter (ruby):&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;enum&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="ident"&gt;e&lt;/span&gt;&lt;span class="punct"&gt;=&lt;/span&gt;&lt;span class="constant"&gt;Enumerator&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;C4H8ClNO&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="ident"&gt;e&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;write_images&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Running this code will produce a collection of PNG images in your working directory. By changing the argument passed to the &lt;tt&gt;Enumerator&lt;/tt&gt; constructor, you can change the makeup of the image set.&lt;/p&gt;

&lt;h4&gt;Prerequisites&lt;/h4&gt;

&lt;p&gt;For this tutorial, you'll need &lt;a href="http://depth-first.com/articles/2006/10/30/agile-chemical-informatics-development-with-cdk-and-ruby-rcdk-0-3-0"&gt;Ruby CDK&lt;/a&gt; (RCDK). A recent article described the small amount of system configuration required for &lt;a href="http://depth-first.com/articles/2006/09/25/cdk-the-ruby-way-rcdk-0-2-0"&gt;RCDK on Linux&lt;/a&gt;. Another article showed how to install &lt;a href="http://depth-first.com/articles/2006/10/12/running-ruby-java-bridge-on-windows"&gt;RCDK on Windows&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;Unexpected Behavior&lt;/h4&gt;

&lt;p&gt;After testing the Enumerator library, you may notice a new file in your working directory called &lt;strong&gt;structuredata.txt&lt;/strong&gt;. This file is written automatically by &lt;tt&gt;GENMDeterministicGenerator&lt;/tt&gt; on instantiation, providing information on each structure that is generated. The &lt;a href="http://cdk.sourceforge.net/api/org/openscience/cdk/structgen/deterministic/GENMDeterministicGenerator.html"&gt;CDK API&lt;/a&gt; does not mention the creation of this file, and it would be preferable for this file to be created only on request. I'll be submitting a &lt;a href="http://sourceforge.net/tracker/?group_id=20024&amp;amp;atid=370024"&gt;feature request&lt;/a&gt; to this effect shortly.&lt;/p&gt;

&lt;h4&gt;Food for Thought&lt;/h4&gt;

&lt;p&gt;If you plan to explore larger areas of chemical space with the Enumerator library, be prepared to wait. The generation of molecules, determination of 2-D coordinates, and rendering can take some time. Of course, the number of molecules increases dramatically with the number of atoms in the molecular formula - a concrete demonstration of what makes organic chemistry the fascinating discipline that it is.&lt;/p&gt;

&lt;p&gt;An interesting variation on the ideas presented here would be to filter out molecules based on one or more criteria. One approach would be to remove molecules containing reactive functionality such as nitrogen substituted with chlorine. A SMARTS pattern search could easily form the basis for this filter. By applying this and similar filters, larger areas of interesting chemical space could be sampled in a reasonable amount of time.&lt;/p&gt;
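
&lt;p&gt;Here is a minimal sketch of such a filter, assuming the enumerated structures are available as SMILES strings. The &lt;tt&gt;reject_reactive&lt;/tt&gt; helper and the &lt;tt&gt;n_chloro&lt;/tt&gt; predicate are hypothetical names, and the predicate is only a crude text test standing in for a real SMARTS search, which a toolkit would run against the molecular graph:&lt;/p&gt;

```ruby
# Hypothetical helper: drop every SMILES for which any predicate returns true.
def reject_reactive(smiles_list, predicates)
  smiles_list.reject do |smiles|
    predicates.any? { |predicate| predicate.call(smiles) }
  end
end

# Crude stand-in for a SMARTS search for N-Cl: looks for N written directly
# next to Cl in the string. It misses branches and ring closures, so a real
# filter should use a proper SMARTS matcher instead.
n_chloro = lambda { |smiles| smiles.include?('NCl') || smiles.include?('ClN') }

# Three illustrative isomers of C4H8ClNO, written by hand for this sketch.
candidates = ['CCCC(=O)NCl', 'ClCCCC(N)=O', 'CCC(Cl)C(N)=O']

kept = reject_reactive(candidates, [n_chloro])
puts kept.inspect
```

&lt;p&gt;Because the predicates are passed in as a list, additional filters can be added without touching the helper.&lt;/p&gt;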

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;CDK's &lt;tt&gt;GENMDeterministicGenerator&lt;/tt&gt; class, when combined with 2-D structure layout and 2-D rendering, provides the foundation of an intriguing tool for exploring chemical diversity. Further combining this capability with that offered by other freely-available tools offers some thought-provoking possibilities.&lt;/p&gt;</description>
      <pubDate>Wed, 15 Nov 2006 15:03:00 -0500</pubDate>
      <guid isPermaLink="false">urn:uuid:16ee911f-73ea-4056-9f9d-dcad5a698a91</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2006/11/15/diversity-oriented-chemical-informatics</link>
      <category>Tools</category>
      <category>diversity</category>
      <category>cdk</category>
      <category>ruby</category>
      <category>rcdk</category>
      <category>enumeration</category>
      <category>integration</category>
    </item>
    <item>
      <title>Debabelization</title>
      <description>&lt;blockquote&gt;
    &lt;p&gt;Today, we find &lt;em&gt;Chemical Abstracts&lt;/em&gt; with over two million compounds coded in a connectivity table system and ISI with close to a million compounds coded in WLN. The U.S. Patent Office has large files coded in the Hayward notation; the IDC has large numbers of compounds in its CT and GREMAS Code. Derwent has a sizable patent file coded in one fragment code, and many journal literature compounds coded in the Ring Code fragment code. There are a number of individual companies and government agencies with over 100,000 compounds coded in "a" system. And almost all companies synthesizing new compounds have some internal system for their compounds. Finally, there are many universities with a wide variety of coded structure files.&lt;/p&gt;

    &lt;p&gt;-&lt;cite&gt;Charles E. Granito &lt;a href="http://dx.doi.org/10.1021/c160049a009"&gt;J. Chem. Doc. 1973, 13, 72-74&lt;/a&gt;&lt;/cite&gt; &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The situation described by Granito in 1973 seems eerily familiar today. The names of the players, the technologies, and encoding systems have changed, but the problem of multiple incompatible molecular languages has persisted for over 30 years.&lt;/p&gt;

&lt;p&gt;This problem will become even more pronounced in the near future as &lt;a href="http://depth-first.com/articles/2006/11/07/twelve-free-chemistry-databases"&gt;free chemistry databases on the Web&lt;/a&gt; continue their rapid proliferation. In Granito's world of closed, proprietary databases and unevenly distributed computer power, interoperability was an afterthought; in the coming world of free, open databases, and ubiquitous computer networks that connect to them, interoperability will be taken for granted.&lt;/p&gt;

&lt;p&gt;Granito goes on to observe that "there is no one 'best' system" for molecular representation. And he's right. Molecular languages evolve within a particular problem domain, just as human languages evolve within a specific cultural context. This isn't to say that a molecular language can't be creatively &lt;a href="http://dx.doi.org/10.1021/ci0496797"&gt;adapted to serve purposes for which it was never designed&lt;/a&gt;. Trying to do so is, after all, how new languages are conceived.&lt;/p&gt;

&lt;p&gt;Consider the case of InChI, which is both a molecular identification system and a &lt;a href="http://depth-first.com/articles/2006/08/18/107-years-of-line-formula-notations-1861-1968"&gt;line notation&lt;/a&gt;, or &lt;a href="http://cml.sourceforge.net/"&gt;Chemical Markup Language&lt;/a&gt; (CML), an XML language. There are vast areas of chemistry in which using either InChI or CML will be problematic - particularly polymers, organometallics, and inorganic chemistry. And let's not ignore new molecular representation problems brewing on the horizon like &lt;a href="http://dx.doi.org/10.1002/anie.200602173"&gt;small molecule tertiary structure&lt;/a&gt;. Yet for pure organic chemistry as most of us know it today, InChI and CML may well be optimal.&lt;/p&gt;

&lt;p&gt;The problem is that both InChI and CML compete with simpler, entrenched alternatives - SMILES and molfile, respectively. Even MDL, the author of the original molfile specification, is having difficulty gaining acceptance for its new molfile format, despite significant technical advantages.&lt;/p&gt;

&lt;p&gt;If history is any guide, we can look forward to at least as many molecular languages in the next thirty years as we've seen in the last thirty. It wasn't long ago that WLN was viewed as &lt;a href="http://dx.doi.org/10.1021/ci00034a005"&gt;the language of the future&lt;/a&gt;. Now it just looks cryptic. For this we can thank a combination of technology advances and the emergence of a far simpler alternative, SMILES. A similar fate, more likely than not, awaits all molecular languages currently in use.&lt;/p&gt;

&lt;p&gt;Will there ever be a universal molecular language and is there any point in trying to invent one? Every area of chemistry introduces its own peculiarities not shared by any of the others. Yet all users want the simplest language possible. These two contradictory forces ensure that a universal language is unlikely to ever appear. In other words, the most successful new molecular languages are likely to be &lt;em&gt;agile&lt;/em&gt; - simple, easy to learn, cheap to implement, and quickly adaptable in the face of new chemical concepts and advances in computer technology.&lt;/p&gt;</description>
      <pubDate>Wed, 08 Nov 2006 14:32:00 -0500</pubDate>
      <guid isPermaLink="false">urn:uuid:f484e38a-6ec5-41d1-81fd-5eeb35465ead</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2006/11/08/debabelization</link>
      <category>Meta</category>
      <category>molecularlanguage</category>
      <category>inchi</category>
      <category>cml</category>
      <category>integration</category>
      <category>databases</category>
    </item>
    <item>
      <title>OBRuby: A Ruby Interface to Open Babel</title>
      <description>&lt;p&gt;&lt;img src="http://depth-first.com/files/Babel256.png" align="right"&gt;&lt;/img&gt;&lt;/p&gt;

&lt;blockquote&gt;
    &lt;p&gt;And the LORD said, Behold, the people is one, and they have all one language; and this they begin to do: and now nothing will be restrained from them, which they have imagined to do.&lt;/p&gt;

    &lt;p&gt;-&lt;cite&gt;Genesis 11:6&lt;/cite&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="http://openbabel.sf.net"&gt;Open Babel&lt;/a&gt; is a &lt;a href="http://sourceforge.net/project/stats/detail.php?group_id=40728&amp;amp;ugn=openbabel&amp;amp;mode=alltime&amp;amp;&amp;amp;type=prdownload"&gt;widely-used&lt;/a&gt; Open Source chemical informatics toolkit written in C++. Although originally designed as a &lt;a href="http://openbabel.sourceforge.net/wiki/Formats"&gt;molecular language translator&lt;/a&gt;, Open Babel also supports &lt;a href="http://openbabel.sourceforge.net/wiki/SMARTS"&gt;SMARTS pattern recognition&lt;/a&gt;, &lt;a href="http://openbabel.sourceforge.net/wiki/Fingerprint"&gt;molecular fingerprints&lt;/a&gt;, &lt;a href="http://openbabel.sourceforge.net/wiki/Obfit"&gt;molecular superposition&lt;/a&gt;, and other features as well.&lt;/p&gt;

&lt;p&gt;Open Babel currently offers interfaces for two scripting languages: &lt;a href="http://openbabel.sourceforge.net/wiki/Python"&gt;Python&lt;/a&gt; and &lt;a href="http://openbabel.sourceforge.net/wiki/Perl"&gt;Perl&lt;/a&gt;. Recently, &lt;a href="http://geoffhutchison.net/blog/"&gt;Geoff Hutchison&lt;/a&gt; and I have been working to add Ruby to that list. This article reports our success in doing so and provides a glimpse of what might now be possible.&lt;/p&gt;

&lt;h4&gt;OBRuby&lt;/h4&gt;

&lt;p&gt;The upcoming release of Open Babel (version 2.1.0) will come complete with a Ruby interface. For those interested in trying it out sooner, a package called &lt;a href="http://depth-first.com/demo/20061031/obruby.tar.gz"&gt;OBRuby&lt;/a&gt; can be downloaded now. OBRuby compiles against revision 1577 of the Open Babel SVN trunk. It has been tested with Linux and Mac OS X, and will probably work on Windows with minor modifications. &lt;em&gt;The approach outlined here is known to fail with Open Babel 2.0.2.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;OBRuby is a technology demonstration. The Ruby scripting support included with Open Babel 2.1.0 may differ in some details from OBRuby. My purpose in this article is simply to demonstrate what is now possible. Please read through the install scripts (they're short) to be sure you're comfortable with what they do.&lt;/p&gt;

&lt;p&gt;Here was my OBRuby installation process:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Download the Open Babel SVN trunk revision 1577 or later.&lt;/li&gt;
&lt;li&gt;cd trunk&lt;/li&gt;
&lt;li&gt;configure, make, (as root) make install&lt;/li&gt;
&lt;li&gt;(as root) ldconfig (necessary on my system - perhaps not on yours)&lt;/li&gt;
&lt;li&gt;cd OBRUBY_DIR&lt;/li&gt;
&lt;li&gt;ruby build.rb&lt;/li&gt;
&lt;li&gt;(as root) make install&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;One last wrinkle: the &lt;strong&gt;build.rb&lt;/strong&gt; script included with OBRuby is something of a hack. It hardcodes the location of the Open Babel library on line 6:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="attribute"&gt;@@ob_dir&lt;/span&gt;&lt;span class="punct"&gt;='&lt;/span&gt;&lt;span class="string"&gt;/usr/local&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Change this line to match your Open Babel installation and you should be ready to go. &lt;tt&gt;make install&lt;/tt&gt; places a single file, &lt;strong&gt;openbabel.so&lt;/strong&gt;, into your Ruby site_ruby directory.&lt;/p&gt;
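
&lt;p&gt;One way to avoid editing a hardcoded path, sketched here rather than taken from OBRuby, is to read the prefix from an environment variable and fall back to the default. The variable name &lt;tt&gt;OB_DIR&lt;/tt&gt; and the local variable &lt;tt&gt;ob_dir&lt;/tt&gt; are assumptions made for illustration:&lt;/p&gt;

```ruby
# OB_DIR is a hypothetical environment variable; build.rb does not read it.
# When it is not set, fall back to the prefix that build.rb hardcodes.
ob_dir = ENV.fetch('OB_DIR', '/usr/local')

puts "Using Open Babel prefix: #{ob_dir}"
puts "Expecting libraries in: #{File.join(ob_dir, 'lib')}"
```

&lt;p&gt;A script written this way could be pointed at another installation from the shell without being modified.&lt;/p&gt;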

&lt;p&gt;To verify that the installation worked, load the library from IRB:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ irb
irb(main):001:0&gt; require 'openbabel'
=&gt; true
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;A return value of &lt;tt&gt;true&lt;/tt&gt; shows that the installation was successful. An error message about &lt;strong&gt;libopenbabel.so&lt;/strong&gt; not being found indicates that your system can't find your Open Babel libraries. Be sure you've installed Open Babel and either run &lt;tt&gt;ldconfig&lt;/tt&gt; or set &lt;tt&gt;LD_LIBRARY_PATH&lt;/tt&gt;.&lt;/p&gt;
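
&lt;p&gt;If the library can't be found, a quick check of the likely directories can help narrow things down. The following is a sketch only: &lt;tt&gt;dirs_containing&lt;/tt&gt; is a hypothetical helper, and the two fallback directories are assumptions that may not match your system:&lt;/p&gt;

```ruby
# Hypothetical diagnostic: report which directories contain a given file.
def dirs_containing(filename, dirs)
  dirs.select { |dir| File.exist?(File.join(dir, filename)) }
end

# Directories named in LD_LIBRARY_PATH, plus two common install locations
# (the latter are assumptions; adjust them for your system).
search_dirs = ENV.fetch('LD_LIBRARY_PATH', '').split(File::PATH_SEPARATOR)
search_dirs += ['/usr/local/lib', '/usr/lib']

found = dirs_containing('libopenbabel.so', search_dirs.uniq)
if found.empty?
  puts 'libopenbabel.so was not found in the directories searched'
else
  puts "libopenbabel.so found in: #{found.join(', ')}"
end
```

&lt;p&gt;This only checks that the file exists. It does not confirm that the dynamic linker will actually pick it up.&lt;/p&gt;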

&lt;p&gt;The majority of OBRuby was autogenerated by &lt;a href="http://www.swig.org/"&gt;SWIG&lt;/a&gt;. A future article will detail how this was done - with an eye toward developing a Java interface to Open Babel.&lt;/p&gt;

&lt;h4&gt;Building an OBMol From SMILES&lt;/h4&gt;

&lt;p&gt;With installation out of the way, let's fire up OBRuby and take her for a test drive. The following code can either be entered with IRB or saved to a file and executed with the ruby interpreter:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;openbabel&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;include&lt;/span&gt; &lt;span class="constant"&gt;OpenBabel&lt;/span&gt;

&lt;span class="ident"&gt;smi2mol&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;OBConversion&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;
&lt;span class="ident"&gt;smi2mol&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;set_in_format&lt;/span&gt;&lt;span class="punct"&gt;(&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;smi&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;)&lt;/span&gt;

&lt;span class="ident"&gt;mol&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;OBMol&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;
&lt;span class="ident"&gt;smi2mol&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;read_string&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;mol&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;CC(C)CCCC(C)C1CCC2C1(CCC3C2CC=C4C3(CCC(C4)O)C)C&lt;/span&gt;&lt;span class="punct"&gt;')&lt;/span&gt; &lt;span class="comment"&gt;# cholesterol, no chirality&lt;/span&gt;
&lt;span class="ident"&gt;mol&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;add_hydrogens&lt;/span&gt;

&lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;Cholesterol has &lt;span class="expr"&gt;#{mol.num_atoms}&lt;/span&gt; atoms, including hydrogens.&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;
&lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;Its molecular weight is &lt;span class="expr"&gt;#{mol.get_mol_wt}&lt;/span&gt; and its molecular formula is &lt;span class="expr"&gt;#{mol.get_formula}&lt;/span&gt;.&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This simple code illustrates some important points. All OBRuby classes reside in the &lt;tt&gt;OpenBabel&lt;/tt&gt; module. These classes can be referenced without the module prefix after an &lt;tt&gt;include OpenBabel&lt;/tt&gt; statement. Also notice that Ruby-style &lt;tt&gt;underscore_delimited&lt;/tt&gt; method names are used rather than the C++ &lt;tt&gt;UpperCamelCase&lt;/tt&gt; names.&lt;/p&gt;

&lt;h4&gt;SMARTS Matching&lt;/h4&gt;

&lt;p&gt;One of the most useful features of Open Babel is its SMARTS pattern matching capability. It can be accessed from OBRuby by instantiating an &lt;tt&gt;OBSmartsPattern&lt;/tt&gt;, passing the SMARTS pattern of interest to the instance's &lt;tt&gt;init&lt;/tt&gt; method, calling &lt;tt&gt;match&lt;/tt&gt; with the molecule, and retrieving the hit set with &lt;tt&gt;get_umap_list&lt;/tt&gt;:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;openbabel&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;include&lt;/span&gt; &lt;span class="constant"&gt;OpenBabel&lt;/span&gt;

&lt;span class="ident"&gt;smi2mol&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;OBConversion&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;
&lt;span class="ident"&gt;smi2mol&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;set_in_format&lt;/span&gt;&lt;span class="punct"&gt;(&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;smi&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;)&lt;/span&gt;

&lt;span class="ident"&gt;mol&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;OBMol&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;
&lt;span class="ident"&gt;smiles&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;CC(C)CCCC(C)C1CCC2C1(CCC3C2CC=C4C3(CCC(C4)O)C)C&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt; &lt;span class="comment"&gt;# cholesterol, no chirality&lt;/span&gt;
&lt;span class="ident"&gt;smi2mol&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;read_string&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;mol&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;smiles&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt; 
&lt;span class="ident"&gt;mol&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;add_hydrogens&lt;/span&gt;

&lt;span class="ident"&gt;pattern&lt;/span&gt;&lt;span class="punct"&gt;=&lt;/span&gt;&lt;span class="constant"&gt;OBSmartsPattern&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;
&lt;span class="ident"&gt;smarts&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;C1CCCCC1&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="ident"&gt;pattern&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;init&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;smarts&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
&lt;span class="ident"&gt;pattern&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;match&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;mol&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
&lt;span class="ident"&gt;hits&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;pattern&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;get_umap_list&lt;/span&gt; &lt;span class="comment"&gt;# =&amp;gt; indices of two cyclohexane rings&lt;/span&gt;

&lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;Found &lt;span class="expr"&gt;#{hits.size}&lt;/span&gt; instances of the SMARTS pattern '&lt;span class="expr"&gt;#{smarts}&lt;/span&gt;' in the SMILES string &lt;span class="expr"&gt;#{smiles}&lt;/span&gt;. Here are the atom indices:&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;
&lt;span class="ident"&gt;hits&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;each_with_index&lt;/span&gt; &lt;span class="keyword"&gt;do&lt;/span&gt; &lt;span class="punct"&gt;|&lt;/span&gt;&lt;span class="ident"&gt;hit&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;index&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt;
  &lt;span class="ident"&gt;print&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;Hit &lt;span class="expr"&gt;#{index}&lt;/span&gt;: [ &lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;

  &lt;span class="ident"&gt;hit&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;each&lt;/span&gt; &lt;span class="keyword"&gt;do&lt;/span&gt; &lt;span class="punct"&gt;|&lt;/span&gt;&lt;span class="ident"&gt;atom_index&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt;
    &lt;span class="ident"&gt;print&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;&lt;span class="expr"&gt;#{atom_index}&lt;/span&gt; &lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;

  &lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;]&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

Notice the Rubyesque &lt;tt&gt;each_with_index&lt;/tt&gt; block that iterates over the elements in the hit set.

Running the above code produces the following output:

&lt;div class="console"&gt;
&lt;pre&gt;
Found 2 instances of the SMARTS pattern 'C1CCCCC1' in the SMILES string CC(C)CCCC(C)C1CCC2C1(CCC3C2CC=C4C3(CCC(C4)O)C)C. Here are the atom indices:
Hit 0: [ 12 17 16 15 14 13 ]
Hit 1: [ 20 25 24 23 22 21 ]
&lt;/pre&gt;
&lt;/div&gt;
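
&lt;p&gt;The printing loop can also be written more compactly. The sketch below uses plain Ruby arrays as a stand-in for the value returned by &lt;tt&gt;get_umap_list&lt;/tt&gt;, with the numbers copied from the output above. Whether the real return value supports &lt;tt&gt;join&lt;/tt&gt; and &lt;tt&gt;flatten&lt;/tt&gt; directly is an assumption:&lt;/p&gt;

```ruby
# Plain arrays standing in for the value returned by get_umap_list;
# the numbers are copied from the output shown above.
hits = [
  [12, 17, 16, 15, 14, 13],
  [20, 25, 24, 23, 22, 21]
]

# Build one formatted line per hit instead of printing atom by atom.
lines = hits.each_with_index.map do |hit, index|
  "Hit #{index}: [ #{hit.join(' ')} ]"
end
puts lines

# Every atom index that takes part in at least one hit, in ascending order.
matched_atoms = hits.flatten.uniq.sort
puts "Atoms matched: #{matched_atoms.join(' ')}"
```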

&lt;h4&gt;Finding Your Way&lt;/h4&gt;

&lt;p&gt;Using a new library like OBRuby can take some getting used to. An excellent source of information is Open Babel's &lt;a href="http://openbabel.sourceforge.net/dev-api/classes.shtml"&gt;online API documentation&lt;/a&gt;. Another source is Ruby itself.&lt;/p&gt;

&lt;p&gt;For example, let's say you've instantiated an &lt;tt&gt;OBMol&lt;/tt&gt;, but can't remember the exact name of the method that counts the number of atoms. Just call &lt;tt&gt;methods.sort&lt;/tt&gt; on the instance:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;openbabel&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="ident"&gt;mol&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;OpenBabel&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;OBMol&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;

&lt;span class="ident"&gt;mol&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;methods&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;sort&lt;/span&gt; &lt;span class="comment"&gt;# =&amp;gt; see output below&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

When run from Interactive Ruby (irb), this code produces the following alphabetized list of methods, which I've truncated:

&lt;div class="console"&gt;
... "is_corrected_for_ph", "kekulize", "kind_of?", "method", "methods", "new_atom", "new_perceive_kekule_bonds", "new_residue", "next_atom", "next_bond", "next_conformer", "next_internal_coord", "next_residue", "nil?", &lt;strong&gt;"num_atoms"&lt;/strong&gt;, "num_bonds", "num_conformers", "num_edges", "num_hvy_atoms", "num_nodes", "num_residues", "num_rotors", "object_id", "perceive_bond_orders", "perceive_kekule_bonds", "private_methods", "protected_methods", "public_methods", "renumber_atoms", "reserve_atoms", "reset_visit_flags" ...
&lt;/div&gt;
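
&lt;p&gt;A long list like this can be narrowed with standard Ruby. The sketch below uses a hypothetical stand-in class, &lt;tt&gt;FakeMol&lt;/tt&gt;, so that it runs without OBRuby. The same calls should work on any Ruby object. Note that newer Ruby versions return method names as symbols rather than strings:&lt;/p&gt;

```ruby
# FakeMol is a hypothetical stand-in so the example runs without OBRuby.
class FakeMol
  def num_atoms
    0
  end

  def num_bonds
    0
  end

  def add_hydrogens
    self
  end
end

mol = FakeMol.new

# Keep only the method names that start with "num_".
counting_methods = mol.methods.grep(/\Anum_/).sort
puts counting_methods.inspect

# Hide the methods that every Ruby object inherits from Object.
own_methods = (mol.methods - Object.instance_methods).sort
puts own_methods.inspect
```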

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;OBRuby combines the dynamic programming language Ruby with the highly-functional toolkit Open Babel. Further augmenting OBRuby's capabilities with the web application framework &lt;a href="http://www.rubyonrails.org/"&gt;Rails&lt;/a&gt; and/or &lt;a href="http://depth-first.com/articles/2006/10/30/agile-chemical-informatics-development-with-cdk-and-ruby-rcdk-0-3-0"&gt;Ruby Chemistry Development Kit&lt;/a&gt; offers even more possibilities. Future articles will bring some of them to life.&lt;/p&gt;</description>
      <pubDate>Tue, 31 Oct 2006 14:20:00 -0500</pubDate>
      <guid isPermaLink="false">urn:uuid:09873e0e-eda9-4496-a1f1-28ab6d11930e</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2006/10/31/obruby-a-ruby-interface-to-open-babel</link>
      <category>Tools</category>
      <category>openbabel</category>
      <category>obruby</category>
      <category>ruby</category>
      <category>integration</category>
      <category>smiles</category>
      <category>smarts</category>
    </item>
    <item>
      <title>Agile Chemical Informatics Development with CDK and Ruby: RCDK-0.3.0</title>
      <description>&lt;p&gt;&lt;img src="http://depth-first.com/files/cdk_logo.png" align="right"&gt;&lt;/img&gt;Ruby Chemistry Development Kit (RCDK) version 0.3.0 is now available from RubyForge. &lt;a href="http://depth-first.com/articles/2006/09/25/cdk-the-ruby-way-rcdk-0-2-0"&gt;RCDK&lt;/a&gt; enables the complete CDK API to be accessed from Ruby. This release adds support for &lt;a href="http://depth-first.com/articles/2006/10/17/from-iupac-nomenclature-to-2-d-structures-with-opsin"&gt;IUPAC nomenclature translation&lt;/a&gt;  and &lt;a href="http://depth-first.com/articles/2006/10/24/metaprogramming-with-ruby-mapping-java-packages-onto-ruby-modules"&gt;tighter Java integration&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;Dependencies&lt;/h4&gt;

&lt;p&gt;RCDK requires Ruby, the Ruby developer libraries, a working build toolchain, and &lt;a href="http://rjb.rubyforge.org"&gt;Ruby Java Bridge&lt;/a&gt; (RJB). This latter dependency can be satisfied during the RCDK installation process if the RubyGems method is used (see 'Installation').&lt;/p&gt;

&lt;h4&gt;Installation&lt;/h4&gt;

&lt;p&gt;RCDK can be conveniently installed using the &lt;a href="http://rubygems.org/"&gt;RubyGems&lt;/a&gt; packaging mechanism:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
# gem install rcdk
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Alternatively, the source package and RubyGem can be downloaded &lt;a href="http://rubyforge.org/frs/?group_id=2199"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;Tighter Java Integration&lt;/h4&gt;

&lt;p&gt;RCDK-0.3.0 introduces a previously-described &lt;a href="http://depth-first.com/articles/2006/10/24/metaprogramming-with-ruby-mapping-java-packages-onto-ruby-modules"&gt;Java package to Ruby module mapping mechanism&lt;/a&gt;. For example, if you'd like to create a Java &lt;tt&gt;ArrayList&lt;/tt&gt;, it can be done through the new &lt;tt&gt;jrequire&lt;/tt&gt; command:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rubygems&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require_gem&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;jrequire&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;java.util.ArrayList&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="ident"&gt;list&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;Java&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Util&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;ArrayList&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;

&lt;span class="ident"&gt;list&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;size&lt;/span&gt; &lt;span class="comment"&gt;# =&amp;gt; 0&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
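
&lt;p&gt;The naming convention visible in this example can be sketched in a few lines of plain Ruby. The helper &lt;tt&gt;ruby_constant_path&lt;/tt&gt; is hypothetical. It only reproduces the name translation suggested by the example; it does not load any Java class and is not RCDK's actual implementation:&lt;/p&gt;

```ruby
# Hypothetical helper: translate a fully qualified Java class name into the
# Ruby constant path suggested by the example above. Each dot-separated
# segment gets an upper-case first letter and the segments are joined by "::".
def ruby_constant_path(java_name)
  segments = java_name.split('.')
  segments.map { |part| part[0, 1].upcase + part[1..-1] }.join('::')
end

puts ruby_constant_path('java.util.ArrayList')
```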

&lt;h4&gt;IUPAC Nomenclature Translation&lt;/h4&gt;

&lt;p&gt;RCDK's most important new chemical informatics feature is made possible by &lt;a href="http://wwmm.ch.cam.ac.uk/blogs/corbett/"&gt;Peter Corbett's&lt;/a&gt; excellent IUPAC nomenclature translation library &lt;a href="http://depth-first.com/articles/2006/10/17/from-iupac-nomenclature-to-2-d-structures-with-opsin"&gt;OPSIN&lt;/a&gt;. It can either be used directly with &lt;tt&gt;jrequire&lt;/tt&gt;, or indirectly through RCDK's convenience library &lt;tt&gt;RCDK::Util&lt;/tt&gt;:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rubygems&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require_gem&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk/util&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="ident"&gt;mol&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;RCDK&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Util&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Lang&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;read_iupac&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;quinoline&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;mol&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;getAtomCount&lt;/span&gt; &lt;span class="comment"&gt;# =&amp;gt; 10&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;There are two things to notice here. First, no &lt;tt&gt;jrequire&lt;/tt&gt; statement is needed when using the &lt;tt&gt;RCDK::Util&lt;/tt&gt; library. Second, there is a multisecond delay after &lt;tt&gt;read_iupac&lt;/tt&gt; is invoked. OPSIN itself introduces this delay during the &lt;tt&gt;NameToStructure&lt;/tt&gt; constructor call, and RCDK inherits this behavior. However, after the first invocation of &lt;tt&gt;read_iupac&lt;/tt&gt;, subsequent calls to this method are very fast.&lt;/p&gt;
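
&lt;p&gt;A slow first call followed by fast ones is what you would expect when an expensive object is built once and then reused. The sketch below illustrates that pattern with hypothetical stand-ins (&lt;tt&gt;SlowParser&lt;/tt&gt; and &lt;tt&gt;Translator&lt;/tt&gt;) and a simulated delay. It does not use OPSIN, and it is not a description of how RCDK is implemented:&lt;/p&gt;

```ruby
require 'benchmark'

# Hypothetical stand-in for an object that is expensive to construct.
class SlowParser
  def initialize
    sleep 0.2 # simulates a costly one-time setup
  end

  def parse(name)
    name.upcase
  end
end

# Builds the parser on first use and reuses the same instance afterwards.
module Translator
  def self.parser
    @parser ||= SlowParser.new
  end

  def self.read(name)
    parser.parse(name)
  end
end

first = Benchmark.realtime { Translator.read('quinoline') }
second = Benchmark.realtime { Translator.read('pyridine') }
puts format('first call: %.3f s, second call: %.3f s', first, second)
```

&lt;p&gt;The same measurement approach, wrapping a call in &lt;tt&gt;Benchmark.realtime&lt;/tt&gt;, can be used to time &lt;tt&gt;read_iupac&lt;/tt&gt; on your own system.&lt;/p&gt;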

&lt;p&gt;Let's decorate the quinoline nucleus with some substituents and render a 2-D image of the result. Execute the following code, either through the Ruby interpreter (&lt;tt&gt;ruby&lt;/tt&gt;) or through Interactive Ruby (&lt;tt&gt;irb&lt;/tt&gt;):&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rubygems&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require_gem&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk/util&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="constant"&gt;RCDK&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Util&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Image&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;iupac_to_png&lt;/span&gt;&lt;span class="punct"&gt;('&lt;/span&gt;&lt;span class="string"&gt;3-chloro-4-(2-aminopropyl)-6-mercapto-8-(2-hydroxyphenyl)-quinoline-2-carboxylic acid&lt;/span&gt;&lt;span class="punct"&gt;',&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;test.png&lt;/span&gt;&lt;span class="punct"&gt;',&lt;/span&gt; &lt;span class="number"&gt;300&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="number"&gt;300&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Running this code produces the following image in your working directory:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20061030/test.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;h4&gt;Be Agile&lt;/h4&gt;

&lt;p&gt;RCDK marries the &lt;a href="http://www.martinfowler.com/articles/newMethodology.html"&gt;agility&lt;/a&gt; of the Ruby language with the functionality of three Open Source chemical informatics libraries: &lt;a href="http://cdk.sf.net"&gt;CDK&lt;/a&gt;; &lt;a href="http://depth-first.com/articles/2006/10/14/decoding-iupac-names-with-opsin"&gt;OPSIN&lt;/a&gt;; and &lt;a href="http://depth-first.com/articles/2006/08/28/drawing-2-d-structures-with-structure-cdk"&gt;Structure-CDK&lt;/a&gt;. Future articles will discuss some simple applications of this powerful combination.&lt;/p&gt;</description>
      <pubDate>Mon, 30 Oct 2006 14:03:00 -0500</pubDate>
      <guid isPermaLink="false">urn:uuid:a5e77e1e-7c24-47e4-90e9-ed1e068b19c2</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2006/10/30/agile-chemical-informatics-development-with-cdk-and-ruby-rcdk-0-3-0</link>
      <category>Tools</category>
      <category>rcdk</category>
      <category>ruby</category>
      <category>cdk</category>
      <category>opsin</category>
      <category>agile</category>
      <category>java</category>
      <category>integration</category>
    </item>
    <item>
      <title>Scripting Java with Ruby: Yet Another Java Bridge</title>
      <description>&lt;p&gt;&lt;img src="http://depth-first.com/files/ruby_logo_new.gif" align="right"&gt;&lt;/img&gt;New technologies that compete with older technologies must provide a clear upgrade path if they are to succeed. A case in point is Ruby. Many Java developers' reaction to this language has less to do with its capabilities and more to do with previous investments in Java. What good is a new language if the special library X that you depend on needs to be rewritten from scratch?&lt;/p&gt;

&lt;p&gt;Previous articles, starting with &lt;a href="http://depth-first.com/articles/2006/08/26/scripting-java-libraries-with-ruby-java-bridge"&gt;this one&lt;/a&gt;, have discussed &lt;a href="http://rjb.rubyforge.org"&gt;Ruby Java Bridge&lt;/a&gt; (RJB) as a Java-Ruby integration tool. Two additional articles discussed RJB in the context of &lt;a href="http://depth-first.com/articles/2006/10/24/metaprogramming-with-ruby-mapping-java-packages-onto-ruby-modules"&gt;mapping Java packages onto Ruby modules&lt;/a&gt; and &lt;a href="http://depth-first.com/articles/2006/10/12/running-ruby-java-bridge-on-windows"&gt;Java-Ruby integration on Windows&lt;/a&gt;. RJB currently provides the mechanism whereby the full &lt;a href="http://cdk.sf.net"&gt;Chemistry Development Kit&lt;/a&gt; (CDK) API can be used in Ruby with &lt;a href="http://rubyforge.org/projects/rcdk"&gt;Ruby CDK&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Another option for Java-Ruby integration is &lt;a href="http://jruby.codehaus.org/"&gt;JRuby&lt;/a&gt;, a Java implementation of the Ruby interpreter. JRuby offers tight integration with the Java Virtual Machine, which will be ideal in many situations. In other situations, it will not be the best choice. For example, one of the advantages of RJB over JRuby is that the standard C-Ruby implementation can be used. This in turn offers, for example, full &lt;a href="http://www.rubyonrails.org/"&gt;Rails&lt;/a&gt; functionality and access to C extensions. A disadvantage of RJB is that, being written in C, it requires a working build toolchain for installation.&lt;/p&gt;

&lt;p&gt;I've seen &lt;a href="http://www.jaredrichardson.net/blog/2006/09/01/"&gt;one report&lt;/a&gt; of a Macintosh installation of RJB that failed. Without a Mac of my own, I can't confirm if this is indeed a problem. But this report also pointed me to a third approach to Ruby-Java integration, &lt;a href="http://www.cmt.phys.kyushu-u.ac.jp/~M.Sakurai/cgi-bin/fw/wiki.cgi?page=YAJB"&gt;Yet Another Java Bridge&lt;/a&gt; (YAJB). YAJB is different from both JRuby and RJB in that it extends the C implementation of Ruby with a Java bridge written in pure Java. In theory, it should run on any platform that both Ruby and Java run on.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.cmt.phys.kyushu-u.ac.jp/~M.Sakurai/java/ruby/yajb-0.8.1.tar.gz"&gt;YAJB-0.8.1&lt;/a&gt; installed on my system without a hitch. From the root directory of the distribution:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
# ruby setup.rb
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Using YAJB was straightforward. A Java &lt;tt&gt;Vector&lt;/tt&gt; instance could be instantiated and manipulated using familiar syntax:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;yajb/jbridge&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;include&lt;/span&gt; &lt;span class="constant"&gt;JavaBridge&lt;/span&gt;

&lt;span class="ident"&gt;v&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;jnew&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;java.util.Vector&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;

&lt;span class="ident"&gt;v&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;add&lt;/span&gt;&lt;span class="punct"&gt;(&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;one&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;)&lt;/span&gt;
&lt;span class="ident"&gt;v&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;add&lt;/span&gt;&lt;span class="punct"&gt;(&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;two&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;)&lt;/span&gt;
&lt;span class="ident"&gt;v&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;size&lt;/span&gt; &lt;span class="comment"&gt;# =&amp;gt; 2&lt;/span&gt;
&lt;span class="ident"&gt;v&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;elementAt&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="number"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt; &lt;span class="comment"&gt;# =&amp;gt; &amp;quot;two&amp;quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Good integration tools can make the difference between actually using new technologies and simply observing them. Java developers interested in using Ruby now have at least three good options to choose from: JRuby; RJB; and YAJB.&lt;/p&gt;</description>
      <pubDate>Wed, 25 Oct 2006 14:53:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:65a72a41-d6a6-4728-82b4-c93b9fed8421</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2006/10/25/scripting-java-with-ruby-yet-another-java-bridge</link>
      <category>Tools</category>
      <category>ruby</category>
      <category>java</category>
      <category>integration</category>
      <category>rjb</category>
      <category>jruby</category>
      <category>yajb</category>
    </item>
    <item>
      <title>From IUPAC Nomenclature to 2-D Structures With OPSIN</title>
      <description>&lt;p&gt;A &lt;a href="http://depth-first.com/articles/2006/10/14/decoding-iupac-names-with-opsin"&gt;previous article&lt;/a&gt; introduced OPSIN, an Open Source Java library for decoding IUPAC chemical nomenclature. In this tutorial, you'll see how OPSIN can, when interfaced with freely-available chemical informatics software, generate 2-D structure diagrams from IUPAC names.&lt;/p&gt;

&lt;h4&gt;Prerequisites&lt;/h4&gt;

&lt;p&gt;This tutorial requires &lt;a href="http://depth-first.com/articles/2006/09/25/cdk-the-ruby-way-rcdk-0-2-0"&gt;Ruby CDK&lt;/a&gt; (RCDK), which in turn requires Ruby, Java, and the &lt;a href="http://rjb.rubyforge.org"&gt;Ruby Java Bridge&lt;/a&gt;. Tutorials detailing the installation of RCDK on both &lt;a href="http://depth-first.com/articles/2006/10/12/running-ruby-java-bridge-on-windows"&gt;Windows&lt;/a&gt; and &lt;a href="http://depth-first.com/articles/2006/09/25/cdk-the-ruby-way-rcdk-0-2-"&gt;Linux&lt;/a&gt; platforms are available.&lt;/p&gt;

&lt;p&gt;In addition, you'll need a copy of the standalone jarfile &lt;a href="http://prdownloads.sourceforge.net/oscar3-chem/opsin-big-0.1.0.jar?download"&gt;opsin-big-0.1.0.jar&lt;/a&gt;. Future versions of RCDK will integrate the OPSIN jarfile, making this step unnecessary.&lt;/p&gt;

&lt;h4&gt;Outlining the Problem and a Solution&lt;/h4&gt;

&lt;p&gt;We'd like to create a simple Ruby class with a method that accepts an IUPAC chemical name as input and produces a PNG image of the corresponding molecule as output. OPSIN accepts IUPAC names as input, but it only produces &lt;a href="http://www.xml-cml.org/"&gt;Chemical Markup Language&lt;/a&gt; (CML) as output. The CML output lacks 2-D coordinates, and OPSIN itself has no 2-D rendering capabilities.&lt;/p&gt;

&lt;p&gt;We'll use RCDK to augment OPSIN's capabilities. Thanks to CDK's built-in CML support, RCDK can read CML and generate an &lt;tt&gt;AtomContainer&lt;/tt&gt; representation. RCDK also supports the assignment of 2-D coordinates to an &lt;tt&gt;AtomContainer&lt;/tt&gt; via CDK's &lt;tt&gt;StructureDiagramGenerator&lt;/tt&gt;. To produce the PNG image, we'll use the 2-D rendering capability made possible through &lt;a href="http://depth-first.com/articles/2006/08/28/drawing-2-d-structures-with-structure-cdk"&gt;Structure-CDK&lt;/a&gt;, which is a built-in component of RCDK.&lt;/p&gt;

&lt;h4&gt;A Simple Ruby Library&lt;/h4&gt;

&lt;p&gt;Create a working directory and copy &lt;a href="http://prdownloads.sourceforge.net/oscar3-chem/opsin-big-0.1.0.jar?download"&gt;opsin-big-0.1.0.jar&lt;/a&gt; into it. Next, create a file called &lt;strong&gt;depictor.rb&lt;/strong&gt; containing the following Ruby code:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rubygems&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require_gem&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="constant"&gt;Java&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Classpath&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;add&lt;/span&gt;&lt;span class="punct"&gt;('&lt;/span&gt;&lt;span class="string"&gt;opsin-big-0.1.0.jar&lt;/span&gt;&lt;span class="punct"&gt;')&lt;/span&gt;

&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;util&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="comment"&gt;# A simple IUPAC-&amp;gt;2-D structure convertor.&lt;/span&gt;
&lt;span class="keyword"&gt;class &lt;/span&gt;&lt;span class="class"&gt;Depictor&lt;/span&gt;
  &lt;span class="attribute"&gt;@@StringReader&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;import&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;java.io.StringReader&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
  &lt;span class="attribute"&gt;@@NameToStructure&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;import&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;uk.ac.cam.ch.wwmm.opsin.NameToStructure&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
  &lt;span class="attribute"&gt;@@CMLReader&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;import&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;org.openscience.cdk.io.CMLReader&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
  &lt;span class="attribute"&gt;@@ChemFile&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;import&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;org.openscience.cdk.ChemFile&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;initialize&lt;/span&gt;
    &lt;span class="attribute"&gt;@nts&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="attribute"&gt;@@NameToStructure&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;
    &lt;span class="attribute"&gt;@cml_reader&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="attribute"&gt;@@CMLReader&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;

  &lt;span class="comment"&gt;# Writes a &amp;lt;tt&amp;gt;width&amp;lt;/tt&amp;gt; by &amp;lt;tt&amp;gt;height&amp;lt;/tt&amp;gt; PNG to&lt;/span&gt;
  &lt;span class="comment"&gt;# &amp;lt;tt&amp;gt;filename&amp;lt;/tt&amp;gt; for the molecule described by&lt;/span&gt;
  &lt;span class="comment"&gt;# &amp;lt;tt&amp;gt;iupac_name&amp;lt;/tt&amp;gt;.&lt;/span&gt;
  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;depict_png&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;iupac_name&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;filename&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;width&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;height&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
    &lt;span class="ident"&gt;cml&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="attribute"&gt;@nts&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;parseToCML&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;iupac_name&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;

    &lt;span class="ident"&gt;throw&lt;/span&gt;&lt;span class="punct"&gt;(&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;Can't parse name: &lt;span class="expr"&gt;#{iupac_name}&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;)&lt;/span&gt; &lt;span class="keyword"&gt;unless&lt;/span&gt; &lt;span class="ident"&gt;cml&lt;/span&gt;

    &lt;span class="ident"&gt;molfile&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;cml_to_molfile&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;cml&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;

    &lt;span class="constant"&gt;RCDK&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Util&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Image&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;molfile_to_png&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;molfile&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;filename&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;width&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;height&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;

  &lt;span class="ident"&gt;private&lt;/span&gt;

  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;cml_to_molfile&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;cml&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
    &lt;span class="ident"&gt;string_reader&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;StringReader&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;cml&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;toXML&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;

    &lt;span class="attribute"&gt;@cml_reader&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;setReader&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;string_reader&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;

    &lt;span class="ident"&gt;chem_file&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="attribute"&gt;@cml_reader&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;read&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="attribute"&gt;@@ChemFile&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
    &lt;span class="ident"&gt;molecule&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;chem_file&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;getChemSequence&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="number"&gt;0&lt;/span&gt;&lt;span class="punct"&gt;).&lt;/span&gt;&lt;span class="ident"&gt;getChemModel&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="number"&gt;0&lt;/span&gt;&lt;span class="punct"&gt;).&lt;/span&gt;&lt;span class="ident"&gt;getSetOfMolecules&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;getMolecule&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="number"&gt;0&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;

    &lt;span class="ident"&gt;molecule&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;RCDK&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Util&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;XY&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;coordinate_molecule&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;molecule&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;

    &lt;span class="constant"&gt;RCDK&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Util&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Lang&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;get_molfile&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;molecule&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h4&gt;Testing, Testing&lt;/h4&gt;

&lt;p&gt;A short test will demonstrate the capabilities of the &lt;tt&gt;Depictor&lt;/tt&gt; library. Add the following to a file called &lt;strong&gt;test.rb&lt;/strong&gt; in your working directory (or enter it interactively with irb):&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;depictor&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="ident"&gt;depictor&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;Depictor&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;
&lt;span class="ident"&gt;name&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;3,3-dimethyl-7-oxo-6-[(2-phenylacetyl)amino]-4-thia-1-azabicyclo[3.2.0]heptane-2-carboxylic acid&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt; &lt;span class="comment"&gt;#Penicillin G&lt;/span&gt;

&lt;span class="ident"&gt;depictor&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;depict_png&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;name&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;out.png&lt;/span&gt;&lt;span class="punct"&gt;',&lt;/span&gt; &lt;span class="number"&gt;300&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="number"&gt;300&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Running this test produces a 300x300 PNG image of Penicillin G, named &lt;strong&gt;out.png&lt;/strong&gt;, in your working directory:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20061017/out.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;br /&gt;&lt;/p&gt;

&lt;p&gt;As you can see, this simple library and test code have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;correctly parsed the rather complex IUPAC name (3,3-dimethyl-7-oxo-6-[(2-phenylacetyl)amino]-4-thia-1-azabicyclo[3.2.0]heptane-2-carboxylic acid) to a valid CML representation&lt;/li&gt;
&lt;li&gt;converted this representation to a CDK &lt;tt&gt;AtomContainer&lt;/tt&gt;&lt;/li&gt;
&lt;li&gt;assigned 2-D coordinates&lt;/li&gt;
&lt;li&gt;rendered a PNG image in color&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Notice how the thiaazabicyclo[3.2.0] system, complete with properly placed substituents, was flawlessly identified and parsed.&lt;/p&gt;

&lt;p&gt;If you entered the above test code interactively via IRB, you may have noticed a multi-second delay in instantiating &lt;tt&gt;Depictor&lt;/tt&gt;. This latency results from a sluggish &lt;tt&gt;NameToStructure&lt;/tt&gt; constructor in OPSIN. A similar delay also occurs in OPSIN's pure-Java unit tests. Once &lt;tt&gt;Depictor&lt;/tt&gt; is instantiated, however, image generation occurs relatively quickly.&lt;/p&gt;

&lt;p&gt;The unusual orientation of the beta-lactam carbonyl group is determined by CDK's &lt;tt&gt;StructureDiagramGenerator&lt;/tt&gt;. The source of this behavior will be explored in a future article.&lt;/p&gt;

&lt;h4&gt;More Examples&lt;/h4&gt;

&lt;p&gt;To illustrate some of the capabilities of the OPSIN-RCDK combination, a few more examples are provided below.&lt;/p&gt;

&lt;p&gt;One of OPSIN's more surprising features is how well it handles heterocycles. For example, the IUPAC name for caffeine (&lt;a href="http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=2519"&gt;1,3,7-trimethylpurine-2,6-dione&lt;/a&gt;) is translated to:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;
&lt;img src="http://depth-first.com/demo/20061017/caffeine.png"&gt;&lt;/img&gt;
&lt;/center&gt;
&lt;br /&gt;&lt;/p&gt;

&lt;p&gt;As another example, consider the tetrazole (&lt;a href="http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=180603"&gt;1-[2-hydroxy-3-propyl-4-[3-(2H-tetrazol-5-yl)propoxy]phenyl]ethanone&lt;/a&gt;):&lt;/p&gt;

&lt;p&gt;&lt;center&gt;
&lt;img src="http://depth-first.com/demo/20061017/180603.png"&gt;&lt;/img&gt;
&lt;/center&gt;
&lt;br /&gt;&lt;/p&gt;

&lt;p&gt;Highly substituted benzene rings and carboxylic acids are also translated accurately, as in &lt;a href="http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=2528"&gt;3-acetamido-5-(acetyl-methyl-amino)-2,4,6-triiodo-benzoic acid&lt;/a&gt; (Metrizoate):&lt;/p&gt;

&lt;p&gt;&lt;center&gt;
&lt;img src="http://depth-first.com/demo/20061017/metrizoate.png"&gt;&lt;/img&gt;
&lt;/center&gt;
&lt;br /&gt;&lt;/p&gt;

&lt;p&gt;How about a hairy-looking macrocycle name with multiple levels of morpheme nesting (&lt;a href="http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=2547"&gt;3,6-diamino-N-[[15-amino-11-(2-amino-3,4,5,6-tetrahydropyrimidin-4-yl)-8- [(carbamoylamino)methylidene]-2-(hydroxymethyl)-3,6,9,12,16-pentaoxo- 1,4,7,10,13-pentazacyclohexadec-5-yl]methyl]hexanamide&lt;/a&gt;)? Not a problem:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;
&lt;img src="http://depth-first.com/demo/20061017/2547.png"&gt;&lt;/img&gt;
&lt;/center&gt;
&lt;br /&gt;&lt;/p&gt;

&lt;h4&gt;Limitations&lt;/h4&gt;

&lt;p&gt;In my tests of the OPSIN library, one structure appeared to be incorrectly parsed - &lt;a href="http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=180591"&gt;N-(5-chloro-2-methyl-phenyl)-2-methoxy-N-(2-oxooxazolidin-3-yl)acetamide&lt;/a&gt;:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;
&lt;img src="http://depth-first.com/demo/20061017/180591.png"&gt;&lt;/img&gt;
&lt;/center&gt;
&lt;br /&gt;&lt;/p&gt;

&lt;p&gt;There are actually two problems with the output. First, an oxygen atom and a methyl group overlap near the top of the diagram. This cosmetic issue is related to CDK's &lt;tt&gt;StructureDiagramGenerator&lt;/tt&gt;. Second, the oxazolidine nitrogen atom is misplaced by OPSIN. The correct 2-D image of this molecule, obtained from PubChem, is shown below:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;
&lt;img src="http://depth-first.com/demo/20061017/180591_pc.png"&gt;&lt;/img&gt;
&lt;/center&gt;
&lt;br /&gt;&lt;/p&gt;

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;It's not common to find an early-development Open Source project with the sophistication of OPSIN. The smooth handling of nested morphemes, aromatic heterocycles, macrocycles, and a good fraction of what I threw at it leads me to believe that a well-designed and extensible nomenclature parsing engine lies at OPSIN's core. More on that later, though.&lt;/p&gt;

&lt;p&gt;What could you do with a powerful Open Source IUPAC nomenclature parser? The answer to that one question could fill a three-volume series. Suffice it to say that OPSIN, in combination with other Open Source software, offers virtually limitless potential for indexing, collecting, repackaging, reprocessing, and mashing up vast amounts of chemical information. Because of its Open Source license, OPSIN can be extended and otherwise modified to fit your particular needs. Future articles will highlight some of the possibilities.&lt;/p&gt;</description>
      <pubDate>Tue, 17 Oct 2006 13:57:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:fd6de2ae-23c8-4e50-9765-344e9a7a9545</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2006/10/17/from-iupac-nomenclature-to-2-d-structures-with-opsin</link>
      <category>Graphics</category>
      <category>opsin</category>
      <category>nametostruct</category>
      <category>iupac</category>
      <category>rcdk</category>
      <category>structure</category>
      <category>cdk</category>
      <category>integration</category>
      <category>mashup</category>
    </item>
    <item>
      <title>Compiling C to Java Bytecode</title>
      <description>&lt;p&gt;In the ideal world for many Java developers, all software would be written in Java. The reality is that a great deal of software is written in other languages, one of the most widespread of which is C. This article discusses a unique approach to working with C code from Java, producing 100% pure Java bytecode that runs anywhere Java does.&lt;/p&gt;

&lt;h4&gt;JNI - The Standard Solution&lt;/h4&gt;

&lt;p&gt;The standard solution to working with C code from Java has been the Java Native Interface (JNI). In this approach, the Java Virtual Machine (JVM) is able to treat a native binary library as if it were written in Java. This is a clever solution that does what it claims to.&lt;/p&gt;

&lt;p&gt;Unfortunately, JNI introduces a platform dependency - the very thing Java was designed to avoid. Depending on the details of the native library, this platform dependency may effectively banish your software from platforms that it would otherwise run on without modification. The Eclipse team, for example, has had to deal with the platform dependence issues of the Standard Widget Toolkit (SWT) for some time now. Even if a workable solution is developed, deployment is an order of magnitude more complex when native libraries are involved.&lt;/p&gt;

&lt;p&gt;It doesn't have to be this way. What if it were possible to compile C source code directly into Java bytecode?&lt;/p&gt;

&lt;h4&gt;A Better Way&lt;/h4&gt;

&lt;p&gt;&lt;a href="http://www.axiomsol.com/"&gt;Axiomatic Solutions&lt;/a&gt; has an answer to this problem called Axiomatic Multi-Platform C (AMPC). This software can compile C source files directly into Java class files.&lt;/p&gt;

&lt;p&gt;Axiomatic offers a free demo version of AMPC, which can be downloaded &lt;a href="http://www.download.com/Axiomatic-Multi-Platform-C/3000-2069-10395882.html?part=dl-Axiomatic&amp;amp;subj=dl&amp;amp;tag=button"&gt;here&lt;/a&gt;. The demo is rather limited; it expires after fifteen days and lacks certain key features available in the full version, such as multiplication and division.&lt;/p&gt;

&lt;p&gt;For those serious about AMPC, the full version can be had for $2999.00. This is a hefty sum. But depending on who you are and what you're trying to do, AMPC may be the most cost-effective solution available.&lt;/p&gt;

&lt;p&gt;AMPC is not the only C to Java conversion option. Another program, &lt;a href="http://tech.novosoft-us.com/product_c2j.jsp"&gt;C2J&lt;/a&gt;, is free (as in beer) software from &lt;a href="http://novosoft-us.com/"&gt;Novosoft&lt;/a&gt; that translates C source into Java source. &lt;a href="http://www.jazillian.com/"&gt;Jazillian&lt;/a&gt; also converts C source into Java source, with an emphasis on readability. Links to more C to Java solutions are available from &lt;a href="http://www.jazillian.com/competition.html"&gt;this page&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;A Simple Demo&lt;/h4&gt;

&lt;p&gt;To learn more about AMPC, I downloaded and installed the Demo version 1.5.1. It installed without a hitch.&lt;/p&gt;

&lt;p&gt;AMPC actually consists of two components - a command-line utility and an IDE. Those of you used to &lt;a href="http://www.eclipse.org/"&gt;Eclipse&lt;/a&gt; will be somewhat disappointed with AMPC's IDE, which is based on &lt;a href="http://www.scintilla.org/SciTE.html"&gt;SciTE&lt;/a&gt;. For this reason, I spent most of my time with the command-line utility.&lt;/p&gt;

&lt;p&gt;I decided the venerable Hello World application should be my first stop. I saved this version to a file called &lt;strong&gt;hello.c&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_C "&gt;#include &amp;lt;stdio.h&amp;gt;

int main(void)
{
  printf(&amp;quot;Hello World - From C!\n&amp;quot;);

  return 0;
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;From a DOS prompt, I then issued:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
&gt;compile hello.c
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This produced the file &lt;strong&gt;hello.class&lt;/strong&gt;. Running this class with Java confirmed that this process does indeed work:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
&gt;java hello
Hello World - From C!
&lt;/pre&gt;
&lt;/div&gt;

&lt;h4&gt;A More Complex Demo&lt;/h4&gt;

&lt;p&gt;One of the key differences between C and Java is that C has pointers and Java does not. So how does AMPC handle a simple program that uses pointers? Very well, it turns out. For this test, I used the following source code, which I lifted from &lt;a href="http://www.gamedev.net/reference/articles/article1697.asp"&gt;this tutorial&lt;/a&gt;:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_C "&gt;#include &amp;lt;stdio.h&amp;gt;

int j, k;
int *ptr;

int main(void)
{
    j = 1;
    k = 2;
    ptr = &amp;amp;k;
    printf(&amp;quot;\n&amp;quot;);
    printf(&amp;quot;j has the value %d and is stored at %p\n&amp;quot;, j, (void *)&amp;amp;j);
    printf(&amp;quot;k has the value %d and is stored at %p\n&amp;quot;, k, (void *)&amp;amp;k);
    printf(&amp;quot;ptr has the value %p and is stored at %p\n&amp;quot;, (void *)ptr, (void *)&amp;amp;ptr);
    printf(&amp;quot;The value of the integer pointed to by ptr is %d\n&amp;quot;, *ptr);

    return 0;
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Compiling and running this code in the same way as the Hello World example produced the following output:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
j has the value 1 and is stored at 0x2824
k has the value 2 and is stored at 0x2826
ptr has the value 0x2826 and is stored at 0x2c78
The value of the integer pointed to by ptr is 2
&lt;/pre&gt;
&lt;/div&gt;
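
&lt;p&gt;How AMPC represents pointers internally is not something the demo reveals. One plausible approach, and this is purely an assumption for illustration, is to keep all C data in a single flat array and treat each pointer as an integer index into it. The sketch below models that idea in plain C. The names &lt;strong&gt;memory&lt;/strong&gt;, &lt;strong&gt;allocate_cell&lt;/strong&gt;, &lt;strong&gt;store&lt;/strong&gt;, and &lt;strong&gt;load&lt;/strong&gt; are hypothetical and do not come from AMPC:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_C "&gt;#include &amp;lt;stdio.h&amp;gt;

/* Hypothetical model, not AMPC's actual design: every value lives in
   one flat array, and an address is just an index into that array. */
#define MEMORY_CELLS 16

static int memory[MEMORY_CELLS];
static int next_free = 1; /* cell 0 is reserved so that 0 can act as NULL */

/* Reserve one cell and return its address, or 0 if memory is full. */
static int allocate_cell(void)
{
    if (next_free &amp;gt;= MEMORY_CELLS)
        return 0;
    return next_free++;
}

/* Equivalent of writing through a pointer: *address = value. */
static void store(int address, int value)
{
    memory[address] = value;
}

/* Equivalent of reading through a pointer: *address. */
static int load(int address)
{
    return memory[address];
}

int main(void)
{
    int j = allocate_cell();   /* stands in for &amp;amp;j */
    int k = allocate_cell();   /* stands in for &amp;amp;k */
    int ptr = allocate_cell(); /* the cell that holds the pointer itself */

    store(j, 1);
    store(k, 2);
    store(ptr, k);             /* ptr = &amp;amp;k */

    printf(&amp;quot;j has the value %d and is stored at %d\n&amp;quot;, load(j), j);
    printf(&amp;quot;k has the value %d and is stored at %d\n&amp;quot;, load(k), k);
    printf(&amp;quot;ptr has the value %d and is stored at %d\n&amp;quot;, load(ptr), ptr);
    printf(&amp;quot;The value of the integer pointed to by ptr is %d\n&amp;quot;, load(load(ptr)));

    return 0;
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Because this model needs nothing beyond an integer array and integer arithmetic, the same scheme could in principle be expressed on a platform without pointers, such as the Java Virtual Machine. Whether AMPC actually works this way is an open question.&lt;/p&gt;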

&lt;h4&gt;So What?&lt;/h4&gt;

&lt;p&gt;Libraries written in C are of course quite common in chemical informatics and computational chemistry. Although most of these are legacy libraries developed long ago, some are more recent.&lt;/p&gt;

&lt;p&gt;A case in point is the &lt;a href="http://www.iupac.org/inchi/"&gt;InChI library&lt;/a&gt;, the only implementation of which is written in C. It has been suggested that the best solution to using InChI from Java is JNI. However, for the reasons outlined above, this is not really the solution that Java developers want. I, and others, have argued that a pure Java implementation is the best solution - but porting is an expensive proposition, given the complexity of the InChI code.&lt;/p&gt;

&lt;p&gt;Perhaps applying AMPC, C2J, Jazillian, or similar software to the InChI library would offer the best of both worlds. That is, assuming these approaches can be made to work.&lt;/p&gt;

&lt;p&gt;A future article will detail my attempts to translate the InChI library to Java with C2J.&lt;/p&gt;

&lt;h4&gt;The Final Word&lt;/h4&gt;

&lt;p&gt;The limited nature of the AMPC demo prevents me from evaluating whether the full version can be used to compile real libraries, like InChI, directly into Java bytecode. However, if my experiences with the demo version are predictive, AMPC may well be a viable option for chemical informatics integration efforts.&lt;/p&gt;</description>
      <pubDate>Mon, 16 Oct 2006 01:53:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:570cc10a-241e-42f7-8d68-7cb1b1eb9bfa</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2006/10/16/compiling-c-to-java-bytecode</link>
      <category>Tools</category>
      <category>c</category>
      <category>java</category>
      <category>integration</category>
      <category>ampc</category>
      <category>c2j</category>
      <category>compiler</category>
    </item>
    <item>
      <title>Toward an Open, Worldwide Chemical Information Network</title>
      <description>&lt;blockquote&gt;
    &lt;p&gt;...Whatever your views of the present situation may be, I think there is general agreement that more attention will be given in the next few years to the information network concept. The hardware capability for such a network is well assured; in fact, the capability exists today. The real question is when, and under what conditions, the chemical community will determine that an economic need exists for a network that will tie together a wide range of chemical information services.&lt;/p&gt;

    &lt;p&gt;-&lt;cite&gt;Walter M. Carlson &lt;a href="http://dx.doi.org/10.1021/c160016a001"&gt;J. Chem. Doc. 1965, 5, 1-3&lt;/a&gt;&lt;/cite&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Several online chemical information services, including &lt;a href="http://depth-first.com/articles/2006/08/30/hacking-pubchem-with-ruby"&gt;PubChem&lt;/a&gt;, &lt;a href="http://depth-first.com/articles/2006/09/04/hacking-nmrshiftdb"&gt;NMRShiftDB&lt;/a&gt;, and &lt;a href="http://blaster.docking.org/zinc/"&gt;ZINC&lt;/a&gt;, have emerged in a relatively short period of time. As these systems go from being toys for hackers to essential components of scientific workflow, their true potential will be unlocked by developing innovative ways to tie these disparate systems together.&lt;/p&gt;

&lt;p&gt;This is not unlike the situation Carlson was describing in his 1964 luncheon speech before the ACS Division of Chemical Literature. Technologies have changed radically, but the fundamental problem of integrating disparate chemical information systems remains unsolved and ripe with possibilities.&lt;/p&gt;

&lt;p&gt;A future in which Chemical Abstracts Service no longer dominates the collection and distribution of chemical information is looking more possible than ever before. If recent history is any guide to this future, we can look to an array of semi-independent, open systems using open standards and operating on a global scale to become the new focal point. In fact, the capability exists today.&lt;/p&gt;</description>
      <pubDate>Sun, 17 Sep 2006 16:09:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:b1d944ca-c9ef-49e1-8577-294f2d635521</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2006/09/17/toward-an-open-worldwide-chemical-information-network</link>
      <category>Open X</category>
      <category>acs</category>
      <category>pubchem</category>
      <category>integration</category>
    </item>
  </channel>
</rss>
