<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/stylesheets/rss.css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>Depth-First: Tag inchi</title>
    <link>http://depth-first.com/articles/tag/inchi</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description>Walking the Web of Chemical Informatics</description>
    <item>
      <title>A Simple and Portable Ruby Interface to InChI - Part 2: Silencing Console Output</title>
      <description>&lt;p&gt;&lt;a href="http://ruby-lang.org/"&gt;&lt;img src="http://depth-first.com/files/ruby_logo_new.gif" align="right"&gt;&lt;/img&gt;&lt;/a&gt;The previous article in this series described a &lt;a href="http://depth-first.com/articles/2008/05/29/a-simple-and-portable-ruby-interface-to-inchi"&gt;simple and portable method&lt;/a&gt; for interfacing Ruby to the cInChI-1 binary. One disadvantage was noisy console output. This article offers a minor modification to disable it.&lt;/p&gt;

&lt;h4&gt;The Code&lt;/h4&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="keyword"&gt;module &lt;/span&gt;&lt;span class="module"&gt;InChI&lt;/span&gt;
  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;inchi_for&lt;/span&gt; &lt;span class="ident"&gt;molfile&lt;/span&gt;
    &lt;span class="ident"&gt;output&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="punct"&gt;%x[&lt;/span&gt;&lt;span class="string"&gt;echo &amp;quot;&lt;span class="expr"&gt;#{molfile}&lt;/span&gt;&amp;quot; | cInChI-1 -STDIO 2&amp;gt;/dev/null&lt;/span&gt;&lt;span class="punct"&gt;]&lt;/span&gt;

    &lt;span class="ident"&gt;output&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;eql?&lt;/span&gt;&lt;span class="punct"&gt;(&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;)&lt;/span&gt; &lt;span class="punct"&gt;?&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="punct"&gt;:&lt;/span&gt; &lt;span class="ident"&gt;output&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;split&lt;/span&gt;&lt;span class="punct"&gt;(/&lt;/span&gt;&lt;span class="regex"&gt;&lt;span class="escape"&gt;\n&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;/)[&lt;/span&gt;&lt;span class="number"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;]&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

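&lt;p&gt;Here, we're taking advantage of the shell's ability to redirect output streams: cInChI-1 writes its log messages to standard error, and &lt;tt&gt;2&amp;gt;/dev/null&lt;/tt&gt; sends that stream to &lt;tt&gt;/dev/null&lt;/tt&gt; while leaving standard output, which carries the InChI, untouched.&lt;/p&gt;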
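
&lt;p&gt;The same effect can be had without a shell redirect. What follows is only a sketch, not the method used above: it swaps command expansion for Ruby's standard &lt;tt&gt;Open3.capture3&lt;/tt&gt;, which hands back standard output and standard error separately. The &lt;tt&gt;stdout_only&lt;/tt&gt; helper and the stand-in command are made up for illustration; for InChI the command would presumably be &lt;tt&gt;['cInChI-1', '-STDIO']&lt;/tt&gt;.&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;
```ruby
require 'open3'
require 'rbconfig'

# Hypothetical helper: run a command, feed it text on standard input,
# and return only what it wrote to standard output.
# Standard error is captured separately and dropped, so it never reaches the console.
def stdout_only(command, input)
  stdout, _stderr, _status = Open3.capture3(*command, stdin_data: input)
  stdout
end

# Stand-in command (an assumption, not cInChI-1): it writes a log line to
# standard error and the upcased input to standard output.
demo = [RbConfig.ruby, '-e', 'STDERR.puts "log noise"; puts STDIN.read.upcase']
puts stdout_only(demo, 'benzene')
```
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Because the command is passed as an array, no shell is involved, which also sidesteps quoting problems with the molfile text.&lt;/p&gt;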

&lt;h4&gt;Testing the Code&lt;/h4&gt;

&lt;p&gt;Saving the above in a file called &lt;strong&gt;inchi.rb&lt;/strong&gt;, we can test it from IRB. To make things interesting, let's pull a molfile from &lt;a href="http://chempedia.com"&gt;Chempedia&lt;/a&gt;:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ irb
irb(main):001:0&amp;gt; require 'open-uri'
=&amp;gt; true
irb(main):002:0&amp;gt; require 'inchi'
=&amp;gt; true
irb(main):003:0&amp;gt; include InChI
=&amp;gt; Object
irb(main):004:0&amp;gt; open 'http://chempedia.com/compounds/83490.mol' do |f|
irb(main):005:1*   puts inchi_for(f.read)
irb(main):006:1&amp;gt; end
InChI=1/C15H15NO3S/c17-14(16-18)11-20(19)15(12-7-3-1-4-8-12)13-9-5-2-6-10-13/h1-10,15,18H,11H2,(H,16,17)
=&amp;gt; nil
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;We should be able to run this code unmodified on any UNIX-like system in which the &lt;strong&gt;cInChI-1&lt;/strong&gt; binary is on the path. And of course we could take this one step further by allowing &lt;a href="http://depth-first.com/articles/2007/03/19/customize-inchi-output-with-rino"&gt;command line options&lt;/a&gt; to be passed in as parameters to the &lt;tt&gt;inchi_for&lt;/tt&gt; method.&lt;/p&gt;
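
&lt;p&gt;One way that extension might look is sketched below. It is an assumption-laden variant rather than tested code from this series: the &lt;tt&gt;inchi_command&lt;/tt&gt; helper and the &lt;tt&gt;options&lt;/tt&gt; parameter are invented here, option names are given without the leading dash, and &lt;tt&gt;Open3.capture3&lt;/tt&gt; replaces command expansion so that each option is passed as a separate argument. Whether a given option combines well with &lt;tt&gt;-STDIO&lt;/tt&gt; would need checking against the cInChI-1 usage text.&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;
```ruby
require 'open3'

module InChI
  # Hypothetical helper: build the argument list for cInChI-1 from
  # option names given without the leading dash, e.g. ['SNon'].
  def inchi_command(options = [])
    ['cInChI-1', '-STDIO'] + options.map { |option| "-#{option}" }
  end

  # Same contract as before: the InChI, or an empty String if none was produced.
  # The log written to standard error is captured and ignored.
  def inchi_for(molfile, options = [])
    output, _log, _status = Open3.capture3(*inchi_command(options), stdin_data: molfile)
    output.empty? ? '' : output.split(/\n/)[1].to_s
  end
end

include InChI
p inchi_command(['SNon'])
```
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The last line only prints the argument list, so the sketch can be tried even where the binary isn't installed; calling &lt;tt&gt;inchi_for&lt;/tt&gt; itself still requires &lt;strong&gt;cInChI-1&lt;/strong&gt; on the path.&lt;/p&gt;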

&lt;p&gt;Simplicity has its advantages.&lt;/p&gt;</description>
      <pubDate>Fri, 30 May 2008 10:04:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:86f5938d-6519-4d2a-87e1-14b281f1323b</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2008/05/30/a-simple-and-portable-ruby-interface-to-inchi-part-2-silencing-console-output</link>
      <category>Tools</category>
      <category>inchi</category>
      <category>ruby</category>
      <category>designingtheobvious</category>
      <category>chempedia</category>
      <category>console</category>
      <category>unix</category>
    </item>
    <item>
      <title>A Simple and Portable Ruby Interface to InChI</title>
      <description>&lt;p&gt;&lt;a href="http://ruby-lang.org"&gt;&lt;img src="http://depth-first.com/files/ruby_logo_new.gif" align="right"&gt;&lt;/img&gt;&lt;/a&gt;Although the &lt;a href="http://depth-first.com/articles/2007/09/27/inchi-for-newbies"&gt;InChI&lt;/a&gt; software itself is written in C, it can still be used via Ruby. &lt;a href="http://depth-first.com/articles/2007/03/19/customize-inchi-output-with-rino"&gt;Rino&lt;/a&gt; offers one implementation of a Ruby InChI interface that makes use of a C extension. This article describes a more concise and portable solution.&lt;/p&gt;

&lt;h4&gt;The Code&lt;/h4&gt;

&lt;p&gt;The following code will accept a String encoding a molfile and return either its InChI, or an empty String if no InChI could be found:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="keyword"&gt;module &lt;/span&gt;&lt;span class="module"&gt;InChI&lt;/span&gt;
  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;inchi_for&lt;/span&gt; &lt;span class="ident"&gt;molfile&lt;/span&gt;
    &lt;span class="ident"&gt;output&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="punct"&gt;%x[&lt;/span&gt;&lt;span class="string"&gt;echo &amp;quot;&lt;span class="expr"&gt;#{molfile}&lt;/span&gt;&amp;quot; | cInChI-1 -STDIO&lt;/span&gt;&lt;span class="punct"&gt;]&lt;/span&gt;

    &lt;span class="ident"&gt;output&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;eql?&lt;/span&gt;&lt;span class="punct"&gt;(&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;)&lt;/span&gt; &lt;span class="punct"&gt;?&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="punct"&gt;:&lt;/span&gt; &lt;span class="ident"&gt;output&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;split&lt;/span&gt;&lt;span class="punct"&gt;(/&lt;/span&gt;&lt;span class="regex"&gt;&lt;span class="escape"&gt;\n&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;/)[&lt;/span&gt;&lt;span class="number"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;]&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This code takes advantage of Ruby's built-in support for &lt;a href="http://www.ruby-doc.org/docs/ProgrammingRuby/html/tut_expressions.html#UA"&gt;Command Expansion&lt;/a&gt;.&lt;/p&gt;
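
&lt;p&gt;For readers new to command expansion, here is a small sketch of how it behaves on a UNIX-like system. It is separate from the InChI code: the strings are arbitrary, and the use of &lt;tt&gt;Shellwords.escape&lt;/tt&gt; from Ruby's standard library is a suggestion for guarding interpolated text, not something the module above does.&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;
```ruby
require 'shellwords'

# %x[...] hands its text to a shell and returns whatever the command
# wrote to standard output, trailing newline included.
greeting = %x[echo hello]
puts greeting.inspect

# Interpolated text is interpreted by the shell, so quotes and dollar signs
# in the input matter. Shellwords.escape makes the text safe to interpolate.
tricky = 'name with "quotes" and $HOME'
echoed = %x[printf '%s' #{Shellwords.escape(tricky)}]
puts echoed == tricky
```
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The second half matters for molfiles, whose header lines are free text and could in principle contain characters the shell treats specially.&lt;/p&gt;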

&lt;h4&gt;Testing the Code&lt;/h4&gt;

&lt;p&gt;The code below tests the library:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;inchi&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;include&lt;/span&gt; &lt;span class="constant"&gt;InChI&lt;/span&gt;

&lt;span class="ident"&gt;molfile&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt;
&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;http://chempedia.com/compounds/106.mol
  -OEChem-03010811072D

 12 12  0     0  0  0  0  0  0999 V2000
    2.8660    1.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.0000    0.5000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.7321    0.5000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.0000   -0.5000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.7321   -0.5000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.8660   -1.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.8660    1.6200    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    1.4631    0.8100    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    4.2690    0.8100    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    1.4631   -0.8100    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    4.2690   -0.8100    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    2.8660   -1.6200    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  2  0  0  0  0
  1  3  1  0  0  0  0
  1  7  1  0  0  0  0
  2  4  1  0  0  0  0
  2  8  1  0  0  0  0
  3  5  2  0  0  0  0
  3  9  1  0  0  0  0
  4  6  2  0  0  0  0
  4 10  1  0  0  0  0
  5  6  1  0  0  0  0
  5 11  1  0  0  0  0
  6 12  1  0  0  0  0
M  END&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;

&lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;Found InChI: &lt;span class="expr"&gt;#{inchi_for(molfile)}&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;We can run the test by saving it in a file called &lt;strong&gt;test.rb&lt;/strong&gt; and executing it:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ ruby test.rb
InChI version 1, Software version 1.02-beta August 2007
Log file not specified. Using standard error output.
Input file not specified. Using standard input.
Output file not specified. Using standard output.
Options: Mobile H Perception ON
Isotopic ON, Absolute Stereo ON
Omit undefined/unknown stereogenic centers and bonds
Full Aux. info
Input format: MOLfile
Output format: Plain text
Timeout per structure: 60.000 sec; Up to 1024 atoms per structure
End of file detected after structure #1.
Finished processing 1 structure: 0 errors, processing time 0:00:00.00
Found InChI: InChI=1/C6H6/c1-2-4-6-5-3-1/h1-6H
&lt;/pre&gt;
&lt;/div&gt;

&lt;h4&gt;Prerequisites&lt;/h4&gt;

&lt;p&gt;The above approach has only two requirements: a UNIX-like system, and a copy of the &lt;strong&gt;cInChI-1&lt;/strong&gt; binary on your path.&lt;/p&gt;

&lt;h4&gt;Advantages&lt;/h4&gt;

&lt;p&gt;The approach described here offers some important advantages over Rino:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;It works without modification on both the &lt;a href="http://en.wikipedia.org/wiki/Ruby_MRI"&gt;Matz Ruby Interpreter&lt;/a&gt; (C-Ruby) and &lt;a href="http://jruby.codehaus.org/"&gt;JRuby&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It neither creates nor uses files.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;Disadvantages&lt;/h4&gt;

&lt;p&gt;This approach creates a lot of noisy log output to the console. There must be a way to suppress it, but so far I haven't found out how.&lt;/p&gt;

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;Using Ruby's support for Command Expansion has enabled the creation of a concise and portable Ruby interface to the InChI toolkit. Similar principles would apply to any Unix command-line binary, including, for example, &lt;a href="http://openbabel.org/wiki/Babel"&gt;Open Babel&lt;/a&gt;.&lt;/p&gt;</description>
      <pubDate>Thu, 29 May 2008 12:12:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:a79f3191-e044-4db5-8676-38f97fcaeedf</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2008/05/29/a-simple-and-portable-ruby-interface-to-inchi</link>
      <category>Tools</category>
      <category>ruby</category>
      <category>inchi</category>
      <category>rino</category>
      <category>commandexpansion</category>
    </item>
    <item>
      <title>From C Source Code to Platform-Independent Executable Jarfile: Using NestedVM to Build JInChI</title>
      <description>&lt;p&gt;&lt;a href="http://flickr.com/photos/smithco/76749170/"&gt;&lt;img src="http://depth-first.com/demo/20071203/nested.jpg" align="right"&gt;&lt;/img&gt;&lt;/a&gt;A &lt;a href="http://depth-first.com/articles/tag/nestedvm"&gt;recent series of articles&lt;/a&gt; discussed in some detail the process of compiling source code written in C and C++ to pure Java bytecode with &lt;a href="http://nestedvm.ibex.org/"&gt;NestedVM&lt;/a&gt;. But the full conversion process, starting with source and finishing with an executable jarfile, has to my knowledge never been documented. This article uses the InChI toolkit to illustrate the complete process for converting a real-world C source distribution into a platform-independent, executable jarfile that can be run with any modern Java Virtual Machine (JVM).&lt;/p&gt;

&lt;h4&gt;About InChI&lt;/h4&gt;

&lt;p&gt;The previous article in this series &lt;a href="http://depth-first.com/articles/2007/10/31/jinchi-run-inchi-anywhere-java-runs"&gt;introduced JInChI&lt;/a&gt;, the first and only pure Java implementation of the &lt;a href="http://www.iupac.org/inchi/"&gt;IUPAC/NIST InChI toolkit&lt;/a&gt;. This toolkit is used to convert molecular connection tables encoded in &lt;a href="http://www.mdli.com/downloads/public/ctfile/ctfile.jsp"&gt;MDL's SD File format&lt;/a&gt; into ASCII character strings called 'InChIs' that have a &lt;a href="http://depth-first.com/articles/2007/09/27/inchi-for-newbies"&gt;variety of applications in the field of cheminformatics&lt;/a&gt;. Although an excellent &lt;a href="http://depth-first.com/articles/2007/10/10/jruby-for-cheminformatics-reading-and-writing-inchis-via-the-java-native-interface"&gt;JNI-InChI&lt;/a&gt; interface is available, JNI won't be a viable option in every situation. Our pure Java implementation nicely complements the JNI-InChI library.&lt;/p&gt;

&lt;p&gt;In this tutorial, we'll build version 1.0.2b of the InChI toolkit. This version, among other features, supports the generation of &lt;a href="http://depth-first.com/articles/2007/05/09/hashing-inchis"&gt;InChI Keys&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;Prerequisites&lt;/h4&gt;

&lt;p&gt;This article assumes you've already &lt;a href="http://wiki.brianweb.net/NestedVM/QuickStartGuide"&gt;installed NestedVM&lt;/a&gt; on your system. Building NestedVM required the installation of many dependencies and was a fairly lengthy, but straightforward, process on my Linux system.&lt;/p&gt;

&lt;h4&gt;Step 1: Prepare Your Environment&lt;/h4&gt;

&lt;p&gt;Before building anything, we'll need to set up our environment. NestedVM makes this simple:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ cd /your/path/to/nestedvm/
$ source env.sh
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Next, let's create a directory to hold the various components we'll need during the build process:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ cd /your/projects/directory
$ mkdir jinchi
$ cd jinchi
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Next, we'll download and unpack the InChI source distribution:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ wget http://www.iupac.org/inchi/download/inchi102b.zip
$ unzip inchi102b.zip
&lt;/pre&gt;
&lt;/div&gt;

&lt;h4&gt;Step 2: Cross-Compile InChI&lt;/h4&gt;

&lt;p&gt;We now have everything we need to begin cross-compiling. NestedVM uses a two-part process in which source code is first cross-compiled to a MIPS binary. That MIPS binary is then translated to Java bytecode. We start by invoking &lt;tt&gt;make&lt;/tt&gt; with the appropriate cross-compiler flags (which I found by looking through the InChI &lt;strong&gt;Makefile&lt;/strong&gt;):&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ make C_COMPILER=mips-unknown-elf-gcc LINKER=mips-unknown-elf-gcc
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This creates a MIPS binary (&lt;tt&gt;cInChI-1&lt;/tt&gt;). Unless you're running on a MIPS machine, this binary won't be executable.&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ ./cInChI-1
bash: ./cInChI-1: cannot execute binary file
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;We can now translate the MIPS binary into pure Java bytecode:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ java org.ibex.nestedvm.Compiler -outfile JInChI.class JInChI cInChI-1
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This produces a Java class file:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ ll JInChI.class
-rw-r--r-- 1 rich rich 4372362 Nov 30 08:27 JInChI.class
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;We can verify that the classfile has been compiled correctly by running it:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ java JInChI
InChI ver 1, Software version 1.02-beta August 2007.

Usage:
cInChI-1 inputFile [outputFile [logFile [problemFile]]] [-option[ -option...]]

Options:
  SNon        Exclude stereo (Default: Include Absolute stereo)
  SRel        Relative stereo

-- truncated --
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;We have now done something truly remarkable: we've taken a standard C source code distribution and converted it into an executable Java class file. It runs, but only because the NestedVM runtime is on our classpath (thanks to the &lt;tt&gt;source&lt;/tt&gt; command we used at the beginning of the process).&lt;/p&gt;

&lt;p&gt;What we really want is a self-contained, executable jarfile that can be run, unmodified, on any system with Java installed.&lt;/p&gt;

&lt;h4&gt;Step 3: Build the JInChI Jarfile&lt;/h4&gt;

&lt;p&gt;We begin by moving up to the root directory of our jinchi project, creating a new directory to hold our Java-specific files (the &lt;strong&gt;JInChI.class&lt;/strong&gt; file and the NestedVM runtime), and copying them into it:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ cd ../../..
$ mkdir jinchi-1.0.2b.1
$ mv InChI-1-software-1-02-beta/cInChI/gcc_makefile/JInChI.class jinchi-1.0.2b.1/
$ cp -r /your/path/to/nestedvm/build/org/ jinchi-1.0.2b.1
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;An executable jarfile generally needs a manifest to point to the main execution class. One way to do that is to first create a manifest:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ vi jinchi-1.0.2b.1/MANIFEST.MF
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;It's essential that this file end with a newline.&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ cat jinchi-1.0.2b.1/MANIFEST.MF
Main-Class: JInChI
&lt;/pre&gt;
&lt;/div&gt;
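
&lt;p&gt;If you'd rather not rely on an editor to add that final newline, the manifest can be written from a script. The sketch below is one way to do it in Ruby; the &lt;tt&gt;write_manifest&lt;/tt&gt; helper is made up for illustration, and a temporary directory stands in for &lt;strong&gt;jinchi-1.0.2b.1/&lt;/strong&gt;.&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;
```ruby
require 'tmpdir'

# Hypothetical helper: write a one-line manifest naming the main class.
# The trailing newline is written explicitly.
def write_manifest(path, main_class)
  File.write(path, "Main-Class: #{main_class}\n")
end

# A temporary directory is used here so the sketch leaves nothing behind.
Dir.mktmpdir do |dir|
  path = File.join(dir, 'MANIFEST.MF')
  write_manifest(path, 'JInChI')
  puts File.read(path).end_with?("\n")
end
```
&lt;/pre&gt;&lt;/div&gt;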

&lt;p&gt;With everything in place, we can create the jarfile:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ cd jinchi-1.0.2b.1/
$ ls
JInChI.class  MANIFEST.MF  org/
$ jar -cfm jinchi-1.0.2b.1.jar MANIFEST.MF *
$ ls
jinchi-1.0.2b.1.jar  JInChI.class  MANIFEST.MF  org/
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;We've successfully converted standard C source code into a platform independent executable jarfile. But does it work?&lt;/p&gt;

&lt;h4&gt;Step 4: Test JInChI&lt;/h4&gt;

&lt;p&gt;We can confirm that the process has worked by running the jarfile (you should do this in a new shell session to verify that the jarfile is indeed independent of your NestedVM installation).&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ java -jar jinchi-1.0.2b.1.jar
InChI ver 1, Software version 1.02-beta August 2007.

Usage:
cInChI-1 inputFile [outputFile [logFile [problemFile]]] [-option[ -option...]]

Options:
  SNon        Exclude stereo (Default: Include Absolute stereo)
  SRel        Relative stereo
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;That's all there is to it! Your shiny new jarfile can be run on any system with a JVM installed. The one created here has been successfully tested on Mac OS X, Linux, and Windows.&lt;/p&gt;

&lt;p&gt;If you'd prefer to download the JInChI jarfile, it can be obtained &lt;a href="http://sourceforge.net/project/showfiles.php?group_id=142870&amp;amp;package_id=250448&amp;amp;release_id=558625"&gt;from SourceForge&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;This article has illustrated in detail the process of converting a standard C source distribution into a platform-independent executable jarfile. Given the appropriate MIPS cross-compiler (many of which come with the NestedVM distribution), the same process can be repeated with code written in a variety of other languages.&lt;/p&gt;

&lt;p&gt;You may be wondering what kind of performance hit you can expect with the approach outlined here. After all, we'd be comparing a native binary to something running on top of two abstraction layers: the NestedVM runtime and a JVM. It's not as bad as you might think, but that's a story for another time.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Image Credit: &lt;a href="http://flickr.com/photos/smithco/"&gt;smithco&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;</description>
      <pubDate>Mon, 03 Dec 2007 08:42:00 -0500</pubDate>
      <guid isPermaLink="false">urn:uuid:d7475edf-ad10-4358-af89-35c3c830f422</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/12/03/from-c-source-code-to-platform-independent-executable-jarfile-using-nestedvm-to-build-jinchi</link>
      <category>Tools</category>
      <category>nestedvm</category>
      <category>java</category>
      <category>inchi</category>
      <category>jinchi</category>
      <category>jarfile</category>
    </item>
    <item>
      <title>JInChI: Run InChI Anywhere Java Runs</title>
      <description>&lt;p&gt;&lt;a href="http://flickr.com/photos/smithco/76749170/"&gt;&lt;img src="http://depth-first.com/demo/20071031/russian_doll.jpg" align="right" border="0"&gt;&lt;/img&gt;&lt;/a&gt;Regardless of your views on Java the Programming Language, Java the Platform has a lot going for it. The ability to run the same executable on any system with a Java Virtual Machine (JVM), without recompilation, is a significant advantage in today's heterogeneous computing environment. Combine that with Java the Platform's battle-tested security model, stability and performance, and you have some compelling reasons to actually prefer that code execute on a JVM rather than bare metal.&lt;/p&gt;

&lt;p&gt;Cheminformatics has many useful libraries, legacy and otherwise, that don't yet run on a JVM. Many of these can trace their roots back to the 1960s and 1970s and FORTRAN; others were written in C or C++ more recently. What they all have in common is that they're compiled to native binaries rather than Java bytecode.&lt;/p&gt;

&lt;p&gt;Wouldn't it be great if this software could be easily compiled to Java bytecode instead?&lt;/p&gt;

&lt;p&gt;A recent Depth-First article described how the &lt;a href="http://www.iupac.org/inchi/"&gt;InChI toolkit&lt;/a&gt;, an open source C library distributed by IUPAC, &lt;a href="http://depth-first.com/articles/2007/10/29/compiling-the-inchi-toolkit-to-pure-java-bytecode-with-nestedvm"&gt;was successfully compiled to a Java classfile&lt;/a&gt; with the remarkable &lt;a href="http://nestedvm.ibex.org/"&gt;NestedVM&lt;/a&gt; library. This article describes the creation and use of a new platform-independent jarfile that runs the InChI program.&lt;/p&gt;

&lt;p&gt;The procedure was not difficult. The two files previously released (&lt;a href="http://downloads.sourceforge.net/ninja/JInChI.class?modtime=1193672654&amp;amp;big_mirror=0"&gt;JInChI.class&lt;/a&gt; and &lt;a href="http://downloads.sourceforge.net/ninja/nestedvm.jar?modtime=1193673646&amp;amp;big_mirror=0"&gt;nestedvm.jar&lt;/a&gt;) were combined into a single executable jarfile whose manifest names the JInChI classfile as the main class.&lt;/p&gt;

&lt;p&gt;The full JInChI jarfile can be &lt;a href="http://sourceforge.net/project/showfiles.php?group_id=142870&amp;amp;package_id=250448&amp;amp;release_id=550857"&gt;downloaded here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;jinchi.jar&lt;/strong&gt; file can be tested from the command line:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ java -jar jinchi.jar
InChI ver 1, Software version 1.01 release 07/21/2006.

Usage:
cInChI-1 inputFile [outputFile [logFile [problemFile]]] [-option[ -option...]]

Options:
  SNon        Exclude stereo (Default: Include Absolute stereo)
  SRel        Relative stereo
  SRac        Racemic stereo

[truncated]
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;If we wanted to process a molfile representing toluene, we'd use something like the following:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ java -jar jinchi.jar test/toluene.mol
InChI version 1, Software version 1.01 release 07/21/2006
Opened log file 'test/toluene.mol.log'
Opened input file 'test/toluene.mol'
Opened output file 'test/toluene.mol.txt'
Opened problem file 'test/toluene.mol.prb'
Options: Mobile H Perception ON
Isotopic ON, Absolute Stereo ON
Omit undefined/unknown stereogenic centers and bonds
Full Aux. info
Input format: MOLfile
Output format: Plain text
Timeout per structure: 60.000 sec; Up to 1024 atoms per structure
End of file detected after structure #1.
Finished processing 1 structure: 0 errors, processing time 0:00:00.00
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This command would produce the following output file, just like the cInChI program:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ cat test/toluene.mol.txt
* Input_File: "test/toluene.mol"
Structure: 1
InChI=1/C7H8/c1-7-5-3-2-4-6-7/h2-6H,1H3
AuxInfo=1/0/N:1,2,3,4,5,6,7/E:(3,4)(5,6)/rA:7nCCCCCCC/rB:;d2;s2;s3;d4;s1d5s6;/rC:3.6373,2.8,0;0,.7,0;0,2.1,0;1.2124,0,0;1.2124,2.8,0;2.4249,.7,0;2.4249,2.1,0;
&lt;/pre&gt;
&lt;/div&gt;
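
&lt;p&gt;Since the output is plain text, pulling the InChI out of it from a script is straightforward. The following Ruby sketch assumes the layout shown above, in which the identifier sits on a line of its own beginning with &lt;tt&gt;InChI=&lt;/tt&gt;; the &lt;tt&gt;extract_inchi&lt;/tt&gt; helper is made up for illustration.&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;
```ruby
# Hypothetical helper: return the first line beginning with "InChI=",
# or nil if the text contains no such line.
def extract_inchi(text)
  text.lines.map { |line| line.chomp }.find { |line| line.start_with?('InChI=') }
end

# The first lines of the toluene output file shown above (AuxInfo line omitted).
sample = [
  '* Input_File: "test/toluene.mol"',
  'Structure: 1',
  'InChI=1/C7H8/c1-7-5-3-2-4-6-7/h2-6H,1H3'
].join("\n")

puts extract_inchi(sample)
```
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;With a real file, &lt;tt&gt;extract_inchi(File.read('test/toluene.mol.txt'))&lt;/tt&gt; would do the same job.&lt;/p&gt;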

&lt;p&gt;We can also convert InChIs into molfiles (command line options work the same as in cInChI):&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ java -jar jinchi.jar test/toluene.mol.txt -OutputSDF
InChI version 1, Software version 1.01 release 07/21/2006
Opened log file 'test/toluene.mol.txt.log'
Opened input file 'test/toluene.mol.txt'
Opened output file 'test/toluene.mol.txt.txt'
Opened problem file 'test/toluene.mol.txt.prb'
Options: Output SDfile only
End of file detected after structure #1.
Finished processing 1 structure: 0 errors, processing time 0:00:00.00
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;In this case the output is:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ cat test/toluene.mol.txt.txt
Structure #1
  InChI v1 SDfile Output

  7  7  0  0  0  0  0  0  0  0  1 V2000
    3.6373    2.8000    0.0000 C   0  0  0     0  0  0  0  0  0
    0.0000    0.7000    0.0000 C   0  0  0     0  0  0  0  0  0
    0.0000    2.1000    0.0000 C   0  0  0     0  0  0  0  0  0
    1.2124    0.0000    0.0000 C   0  0  0     0  0  0  0  0  0
    1.2124    2.8000    0.0000 C   0  0  0     0  0  0  0  0  0
    2.4249    0.7000    0.0000 C   0  0  0     0  0  0  0  0  0
    2.4249    2.1000    0.0000 C   0  0  0     0  0  0  0  0  0
  1  7  1  0  0  0  0
  2  3  2  0  0  0  0
  2  4  1  0  0  0  0
  3  5  1  0  0  0  0
  4  6  2  0  0  0  0
  5  7  2  0  0  0  0
  6  7  1  0  0  0  0
M  END
$$$$
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Similar tests worked on both Linux and Windows using the same jarfile.&lt;/p&gt;

&lt;p&gt;There are still some issues to be addressed with this approach. For example, various reports indicate that NestedVM code runs about four to ten times slower than native execution. Benchmarking may be useful at this point.&lt;/p&gt;
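
&lt;p&gt;A rough timing harness for such a benchmark might look like the Ruby sketch below. It is only a starting point under stated assumptions: the &lt;tt&gt;average_seconds&lt;/tt&gt; helper is made up, the command actually timed is a do-nothing stand-in so the sketch runs anywhere, and the two invocations to compare would be the jarfile command shown earlier and the native &lt;strong&gt;cInChI-1&lt;/strong&gt; binary on the same molfile.&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;
```ruby
require 'benchmark'
require 'rbconfig'

# Hypothetical helper: average wall-clock seconds per run of a command,
# with the command's own output discarded.
def average_seconds(command, runs = 5)
  total = Benchmark.realtime do
    runs.times { system(*command, out: File::NULL, err: File::NULL) }
  end
  total / runs
end

# Stand-in command (an assumption) so the sketch runs without Java or InChI.
# For a real comparison, substitute for example:
#   ['java', '-jar', 'jinchi.jar', 'test/toluene.mol']
stand_in = [RbConfig.ruby, '-e', '']
puts format('%.4f s per run', average_seconds(stand_in, 3))
```
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;For a single small molfile, JVM startup time will likely dominate the measurement, so a file containing many structures would probably give a fairer picture of the translation overhead itself.&lt;/p&gt;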

&lt;p&gt;Another issue is how to go about making a Java InChI library with NestedVM. If you decompile the &lt;strong&gt;jinchi.jar&lt;/strong&gt; file, you'll find that the &lt;strong&gt;JInChI.class&lt;/strong&gt; file is a large and complex beast in which almost all methods are named as hex numbers. It may be possible to create a library by renaming certain methods and breaking the code into smaller classfiles, but the NestedVM documentation seems sparse on this subject.&lt;/p&gt;

&lt;p&gt;Despite these difficulties, this article demonstrates the power of NestedVM and describes the first (and currently only) example of a 100% Java InChI implementation.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Image Credit: &lt;a href="http://flickr.com/photos/smithco/"&gt;smithco&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;</description>
      <pubDate>Wed, 31 Oct 2007 10:59:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:8d75b727-adf7-4722-9c60-2b9b91c94f17</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/10/31/jinchi-run-inchi-anywhere-java-runs</link>
      <category>Tools</category>
      <category>inchi</category>
      <category>java</category>
      <category>nestedvm</category>
    </item>
    <item>
      <title>Compiling the InChI Toolkit to Pure Java Bytecode with NestedVM</title>
      <description>&lt;p&gt;&lt;a href="http://flickr.com/photos/pmorgan/32606683/"&gt;&lt;img src="http://depth-first.com/demo/20071029/shanghai.jpg" align="right" border="0"&gt;&lt;/img&gt;&lt;/a&gt;Some time ago, a Depth-First article discussed some methods for &lt;a href="http://depth-first.com/articles/2006/10/16/compiling-c-to-java-bytecode"&gt;compiling C to Java bytecode&lt;/a&gt;. Many factors make this approach attractive compared to the JNI approach. Some of them include security, portability, and use within applets. Unfortunately, none of the approaches discussed in the earlier article seemed particularly general.&lt;/p&gt;

&lt;p&gt;Many cheminformatics libraries are written in C and C++; being able to reliably and automatically port them to Java could potentially save a great deal of effort.&lt;/p&gt;

&lt;p&gt;One of the more important cheminformatics C libraries written in recent years is the &lt;a href="http://www.iupac.org/inchi/"&gt;InChI toolkit&lt;/a&gt;. With no pure Java port of this library, JNI is the &lt;a href="http://depth-first.com/articles/2007/10/10/jruby-for-cheminformatics-reading-and-writing-inchis-via-the-java-native-interface"&gt;only way&lt;/a&gt; to use InChI with Java. In some situations, this approach is either overly complicated or simply unacceptable.&lt;/p&gt;

&lt;p&gt;All of this leaves us with the question: how can the InChI toolkit be converted into a pure Java library without writing any new code?&lt;/p&gt;

&lt;p&gt;A partial answer to this question came from Evan Jones, who suggested I look at &lt;a href="http://nestedvm.ibex.org/"&gt;NestedVM&lt;/a&gt;. From the website:&lt;/p&gt;

&lt;blockquote&gt;
    &lt;p&gt;NestedVM provides binary translation for Java Bytecode. This is done by having GCC compile to a MIPS binary which is then translated to a Java class file. Hence any application written in C, C++, Fortran, or any other language supported by GCC can be run in 100% pure Java with no source changes.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And it worked!&lt;/p&gt;

&lt;p&gt;NestedVM was successfully used to compile the InChI-API distribution to a Java class file that executed on nothing more than a standard JVM -- and with no JNI code. The InChI classfile and NestedVM runtime jarfile can be &lt;a href="http://sourceforge.net/project/showfiles.php?group_id=142870&amp;amp;package_id=250448&amp;amp;release_id=550390"&gt;downloaded from SourceForge&lt;/a&gt;. Future articles in this series will describe the compilation, installation, and use of NestedVM, as well as the Java class file that it produced.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Image Credit: &lt;a href="http://flickr.com/photos/pmorgan/"&gt;pmorgan&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;</description>
      <pubDate>Mon, 29 Oct 2007 11:13:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:77ab0a8f-6209-462f-a536-41ef737fc7a4</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/10/29/compiling-the-inchi-toolkit-to-pure-java-bytecode-with-nestedvm</link>
      <category>Tools</category>
      <category>inchi</category>
      <category>nestedvm</category>
      <category>bytecode</category>
    </item>
    <item>
      <title>Easily Convert IUPAC Nomenclature to SMILES, InChI, or Molfile with Rubidium</title>
      <description>&lt;p&gt;&lt;img src="http://depth-first.com/demo/20071015/rubidium.png" align="right"&gt;&lt;/img&gt;A recent article &lt;a href="http://depth-first.com/articles/2007/10/15/an-introduction-to-the-rubidium-cheminforamtics-toolkit-interconvert-smiles-inchi-and-molfile-with-an-open-babel-like-interface"&gt;introduced Rubidium&lt;/a&gt;, a cheminformatics toolkit written in Ruby. One of Ruby's strengths is the speed with which it enables disparate pieces of code to be glued together - even if they're written in different programming languages. In this article, we'll see how Rubidium can be extended to provide support for converting IUPAC nomenclature into SMILES, InChI, or Molfile formats.&lt;/p&gt;

&lt;h4&gt;About Rubidium&lt;/h4&gt;

&lt;p&gt;Rubidium is a cheminformatics toolkit written in Ruby. Rubidium is currently configured to run on &lt;a href="http://jruby.codehaus.org/"&gt;JRuby&lt;/a&gt;, although future versions may also work with &lt;a href="http://en.wikipedia.org/wiki/Ruby_(programming_language)"&gt;Matz' Ruby Implementation&lt;/a&gt; (MRI) via &lt;a href="http://rjb.rubyforge.org/"&gt;Ruby Java Bridge&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Rubidium will eventually be packaged as a &lt;a href="http://www.rubygems.org/"&gt;RubyGem&lt;/a&gt; and hosted on &lt;a href="http://rubyforge.org"&gt;RubyForge&lt;/a&gt;. For now, the toolkit consists of a running library that will be updated and documented on this blog.&lt;/p&gt;

&lt;h4&gt;The Library&lt;/h4&gt;

&lt;p&gt;The library extends the CDK module presented in the &lt;a href="http://depth-first.com/articles/2007/10/15/an-introduction-to-the-rubidium-cheminforamtics-toolkit-interconvert-smiles-inchi-and-molfile-with-an-open-babel-like-interface"&gt;previous article in this series&lt;/a&gt;. The main change is the addition of an &lt;tt&gt;IUPACReader&lt;/tt&gt; class, based on Peter Corbett's excellent &lt;a href="http://depth-first.com/articles/2007/10/12/jruby-for-cheminformatics-parsing-iupac-nomenclature-with-opsin"&gt;OPSIN library&lt;/a&gt;:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="keyword"&gt;class &lt;/span&gt;&lt;span class="class"&gt;IUPACReader&lt;/span&gt;
  &lt;span class="ident"&gt;import&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;java.io.StringReader&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
  &lt;span class="ident"&gt;import&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;uk.ac.cam.ch.wwmm.opsin.NameToStructure&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
  &lt;span class="ident"&gt;import&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;org.openscience.cdk.io.CMLReader&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
  &lt;span class="ident"&gt;import&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;org.openscience.cdk.ChemFile&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;initialize&lt;/span&gt;
    &lt;span class="attribute"&gt;@iupac_reader&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;NameToStructure&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;
    &lt;span class="attribute"&gt;@cml_reader&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;CMLReader&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;

  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;read&lt;/span&gt; &lt;span class="ident"&gt;name&lt;/span&gt;
    &lt;span class="ident"&gt;cml&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="attribute"&gt;@iupac_reader&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;parse_to_cml&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;name&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;

    &lt;span class="keyword"&gt;raise&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;Could not parse '&lt;span class="expr"&gt;#{name}&lt;/span&gt;'.&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="keyword"&gt;unless&lt;/span&gt; &lt;span class="ident"&gt;cml&lt;/span&gt;

    &lt;span class="attribute"&gt;@cml_reader&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;set_reader&lt;/span&gt; &lt;span class="constant"&gt;StringReader&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;cml&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;to_xml&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;

    &lt;span class="ident"&gt;chem_file&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="attribute"&gt;@cml_reader&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;read&lt;/span&gt; &lt;span class="constant"&gt;ChemFile&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;

    &lt;span class="ident"&gt;chem_file&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;chem_sequence&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="number"&gt;0&lt;/span&gt;&lt;span class="punct"&gt;).&lt;/span&gt;&lt;span class="ident"&gt;chem_model&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="number"&gt;0&lt;/span&gt;&lt;span class="punct"&gt;).&lt;/span&gt;&lt;span class="ident"&gt;molecule_set&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;molecule&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="number"&gt;0&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Using this additional functionality requires nothing more than copying the &lt;a href="http://prdownloads.sourceforge.net/oscar3-chem/opsin-big-0.1.0.jar?download"&gt;OPSIN jarfile&lt;/a&gt; into the &lt;strong&gt;lib&lt;/strong&gt; directory of your JRuby installation. You'll also need to place the &lt;a href="http://downloads.sourceforge.net/cdk/cdk-1.0.1.jar?modtime=1182877138&amp;amp;big_mirror=0"&gt;CDK jarfile&lt;/a&gt; in this directory if you haven't done so already.&lt;/p&gt;

&lt;p&gt;The complete Rubidium library can be &lt;a href="http://depth-first.com/demo/20071019/cdk.rb"&gt;downloaded here&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;A Test&lt;/h4&gt;

&lt;p&gt;We can test Rubidium's IUPAC nomenclature parsing abilities with &lt;tt&gt;jirb&lt;/tt&gt;. For example, to convert from name to SMILES:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ jirb
irb(main):001:0&gt; require 'cdk'
=&gt; true
irb(main):002:0&gt; c=CDK::Conversion.new
=&gt; #&amp;lt;CDK::Conversion:0x46ca65 ... &amp;gt;
irb(main):003:0&gt; c.set_formats 'iupac', 'smi'
=&gt; "smi"
irb(main):004:0&gt; c.convert '1,4-dichlorobenzene'
=&gt; "C=1C=C(C=CC=1Cl)Cl"
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;To convert from name to InChI (in the same &lt;tt&gt;jirb&lt;/tt&gt; session):&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
irb(main):005:0&gt; c.set_out_format 'inchi'
=&gt; "inchi"
irb(main):006:0&gt; c.convert '1,4-dichlorobenzene'
=&gt; "InChI=1/C6H4Cl2/c7-5-1-2-6(8)4-3-5/h1-4H"
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;And to convert from name to Molfile (also in the same &lt;tt&gt;jirb&lt;/tt&gt; session):&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
irb(main):007:0&gt; c.set_out_format 'mol'
=&gt; "mol"
irb(main):008:0&gt; c.convert '1,4-dichlorobenzene'
=&gt; "\n  CDK    10/19/07,7:59\n\n  8  8  0  0  0  0  0  0  0  0999 V2000\n    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n    0.0000    0.0000    0.0000 Cl  0  0  0  0  0  0  0  0  0  0  0  0\n    0.0000    0.0000    0.0000 Cl  0  0  0  0  0  0  0  0  0  0  0  0\n  1  2  2  0  0  0  0 \n  2  3  1  0  0  0  0 \n  3  4  2  0  0  0  0 \n  4  5  1  0  0  0  0 \n  5  6  2  0  0  0  0 \n  6  1  1  0  0  0  0 \n  7  1  1  0  0  0  0 \n  8  4  1  0  0  0  0 \nM  END\n"
&lt;/pre&gt;
&lt;/div&gt;

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;By re-using a simple conversion API together with another Java library, we've given Rubidium the ability to translate IUPAC nomenclature into other molecular languages. The additional code was both easy to write and easy to test. Future articles will discuss the packaging, distribution, and further elaboration of Rubidium.&lt;/p&gt;</description>
      <pubDate>Fri, 19 Oct 2007 10:05:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:1b7e76b3-93a7-4372-982f-cd60c9ed40d0</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/10/19/easily-convert-iupac-nomenclature-to-smiles-inchi-or-molfile-with-rubidium</link>
      <category>Tools</category>
      <category>rubidium</category>
      <category>iupac</category>
      <category>smiles</category>
      <category>inchi</category>
      <category>molfile</category>
    </item>
    <item>
      <title>JRuby for Cheminformatics: Reading and Writing InChIs Via the Java Native Interface</title>
      <description>&lt;p&gt;&lt;a href="http://ruby-lang.org"&gt;&lt;img src="http://depth-first.com/files/ruby_logo_new.gif" align="right"&gt;&lt;/img&gt;&lt;/a&gt;The increased use of the &lt;a href="http://depth-first.com/articles/2007/09/27/inchi-for-newbies"&gt;InChI identifier&lt;/a&gt; is making the reading and writing of InChIs a standard cheminformatics capability. Recent articles have discussed the &lt;a href="http://depth-first.com/articles/tag/jruby"&gt;advantages of JRuby for cheminformatics&lt;/a&gt;. One disadvantage of JRuby is that code written in C can't be directly used. The presents a potential problem for libraries, such as the InChI toolkit, that are written in C. Fortunately, the solution is simple. Today's tutorial will demonstrate how InChIs can be both read and written using the C-InChI toolkit via JRuby and the excellent &lt;a href="http://jni-inchi.sourceforge.net/"&gt;JNI-InChI library&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;About JNI-InChI&lt;/h4&gt;

&lt;p&gt;The &lt;a href="http://jni-inchi.sourceforge.net/"&gt;JNI-InChI&lt;/a&gt; library, written by Jim Downing and Sam Adams, wraps the &lt;a href="http://www.iupac.org/inchi/"&gt;C InChI toolkit&lt;/a&gt; in a Java Native Interface. This low-level toolkit is suitable for building more complex software, but lacks many features present in the C InChI toolkit. For example, JNI-InChI doesn't directly interconvert SMILES or molfile with InChI. For that you'd need to build a support library. If you're building a toolkit from scratch, this lightweight approach can be a significant advantage.&lt;/p&gt;

&lt;p&gt;The JNI-InChI binary distribution jarfile includes the compiled native InChI library. In this sense it's virtually indistinguishable from any other Java library. This simplified packaging makes it exceptionally easy to use JNI-InChI from JRuby, as we'll see below.&lt;/p&gt;

&lt;h4&gt;Installation&lt;/h4&gt;

&lt;p&gt;JRuby &lt;a href="http://depth-first.com/articles/2007/10/09/jruby-for-cheminformatics-parsing-smiles-simply"&gt;can be installed&lt;/a&gt; as described previously. To install the JNI-InChI library for JRuby, simply copy the &lt;a href="http://sourceforge.net/project/showfiles.php?group_id=173262"&gt;current release jarfile&lt;/a&gt; into the &lt;tt&gt;lib&lt;/tt&gt; directory of your JRuby installation. That's all there is to it.&lt;/p&gt;

&lt;h4&gt;A Simple Library&lt;/h4&gt;

&lt;p&gt;We can now write a simple library to read InChIs via JRuby:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;java&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="ident"&gt;include_class&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;net.sf.jniinchi.JniInchiInput&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;include_class&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;net.sf.jniinchi.JniInchiInputInchi&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;include_class&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;net.sf.jniinchi.JniInchiWrapper&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="keyword"&gt;module &lt;/span&gt;&lt;span class="module"&gt;IUPAC&lt;/span&gt;
  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;read_inchi&lt;/span&gt; &lt;span class="ident"&gt;inchi&lt;/span&gt;
    &lt;span class="ident"&gt;input&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;JniInchiInputInchi&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt; &lt;span class="ident"&gt;inchi&lt;/span&gt;

    &lt;span class="constant"&gt;JniInchiWrapper&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;getStructureFromInchi&lt;/span&gt; &lt;span class="ident"&gt;input&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h4&gt;Testing the Library&lt;/h4&gt;

&lt;p&gt;By saving the above library to a file called &lt;strong&gt;iupac.rb&lt;/strong&gt;, we can parse InChIs via JRuby:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ jirb
irb(main):001:0&gt; require 'iupac'
=&gt; true
irb(main):002:0&gt; include IUPAC
=&gt; Object
irb(main):003:0&gt; output = read_inchi 'InChI=1/C14H10/c1-3-7-13-11(5-1)9-10-12-6-2-4-8-14(12)13/h1-10H'
=&gt; #&amp;lt;Java::NetSfJniinchi::JniInchiOutputStructure:0x1ed5459 @java_object=net.sf.jniinchi.JniInchiOutputStructure@313170&amp;gt;
irb(main):004:0&gt; output.num_atoms
=&gt; 14
irb(main):005:0&gt; output.num_bonds
=&gt; 16
&lt;/pre&gt;
&lt;/div&gt;

&lt;h4&gt;Writing InChIs&lt;/h4&gt;

&lt;p&gt;Because JNI-InChI is a low-level toolkit, writing InChIs is feasible, but not trivial. We must first construct a representation, and then get the InChI for it. For example, we could get the InChI for methane as follows:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ jirb
irb(main):001:0&gt; require 'java'
=&gt; true
irb(main):002:0&gt; include_class 'net.sf.jniinchi.JniInchiInput'
=&gt; ["net.sf.jniinchi.JniInchiInput"]
irb(main):003:0&gt; include_class 'net.sf.jniinchi.JniInchiAtom'
=&gt; ["net.sf.jniinchi.JniInchiAtom"]
irb(main):004:0&gt; include_class 'net.sf.jniinchi.JniInchiWrapper'
=&gt; ["net.sf.jniinchi.JniInchiWrapper"]
irb(main):005:0&gt; input = JniInchiInput.new
=&gt; #&amp;lt;Java::NetSfJniinchi::JniInchiInput:0x2f2295 @java_object=net.sf.jniinchi.JniInchiInput@15b0333&amp;gt;
irb(main):006:0&gt; a1 = input.add_atom JniInchiAtom.new(0,0,0, "C")
=&gt; #&amp;lt;Java::NetSfJniinchi::JniInchiAtom:0x1b22920 @java_object=net.sf.jniinchi.JniInchiAtom@2f356f&amp;gt;
irb(main):007:0&gt; a1.set_implicit_h(4)
=&gt; nil
irb(main):008:0&gt; output = JniInchiWrapper.get_inchi input
=&gt; #&amp;lt;Java::NetSfJniinchi::JniInchiOutput:0xf894ce @java_object=net.sf.jniinchi.JniInchiOutput@132ae7&amp;gt;
irb(main):009:0&gt; output.get_inchi
=&gt; "InChI=1/CH4/h1H4"
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Fortunately, we don't have to work that hard. The &lt;a href="http://cdk.sf.net"&gt;Chemistry Development Kit&lt;/a&gt;, through JNI-InChI, supports reading and writing of InChIs via a variety of molecular languages, including SMILES and molfile. More on that later, though.&lt;/p&gt;

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;Provided that a Java Native Interface exists for a C library, it can be used from JRuby. Future articles will discuss the use of other cheminformatics libraries written in either C or C++ from JRuby, and their integration with pure Java and Ruby libraries.&lt;/p&gt;</description>
      <pubDate>Wed, 10 Oct 2007 08:21:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:0348fa93-7376-488d-9afc-789590ac9fcb</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/10/10/jruby-for-cheminformatics-reading-and-writing-inchis-via-the-java-native-interface</link>
      <category>Tools</category>
      <category>rubidium</category>
      <category>jruby</category>
      <category>ruby</category>
      <category>java</category>
      <category>jni</category>
      <category>inchi</category>
      <category>cdk</category>
    </item>
    <item>
      <title>Streamlining Cheminformatics on the Web: Let InChI Do the Heavy Lifting and Get Some REST</title>
      <description>&lt;p&gt;&lt;a href="http://chemspider.com"&gt;&lt;img src="http://depth-first.com/demo/20070917/chemspider.jpg" align="right"&gt;&lt;/img&gt;&lt;/a&gt;A recent Depth-First article discussed the advantages of &lt;a href="http://depth-first.com/articles/2007/08/13/the-best-api-may-be-no-api-at-all-pubchem-and-pdb"&gt;minimal Web APIs in Cheminformatics&lt;/a&gt;. Recently, Antony Williams unveiled some &lt;a href="http://www.chemspider.com/blog/?p=179"&gt;simplified ChemSpider URL schemes&lt;/a&gt;, mainly from the perspective of enabling Google indexing. However, it's possible to take this scheme much, much further. Here I present a proposal for radically simplifying (and unifying) the development of cheminformatics Web APIs and the software that interacts with them.&lt;/p&gt;

&lt;h4&gt;The New ChemSpider URLs&lt;/h4&gt;

&lt;p&gt;ChemSpider now has several new kinds of URLs. For the purposes of this article, the most interesting of these are of the format:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://www.chemspider.com/InChIKey=DEIYFTQMQPDXOT-RERXVCSDCZ"&gt;http://www.chemspider.com/InChIKey=DEIYFTQMQPDXOT-RERXVCSDCZ&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://www.chemspider.com/InChI=1/C6H6/c1-2-4-6-5-3-1/h1-6H"&gt;http://www.chemspider.com/InChI=1/C6H6/c1-2-4-6-5-3-1/h1-6H&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These URLs may seem unremarkable, but there's much more than meets the eye. They let anonymous developers query ChemSpider about specific substances - without needing to know much at all about how ChemSpider itself works. Goodbye API. Goodbye API support. Goodbye API documentation. Goodbye angle brackets. Hello to getting stuff done. It's all very &lt;a href="http://depth-first.com/articles/2007/05/30/restful-cheminformatics"&gt;RESTful&lt;/a&gt;. Well, at least it could be that way with some minor modification.&lt;/p&gt;

&lt;h4&gt;Some Recommendations&lt;/h4&gt;

&lt;p&gt;ChemSpider hasn't quite reached that place where the API &lt;a href="http://wwmm.ch.cam.ac.uk/blogs/downing/?p=128"&gt;just disappears&lt;/a&gt;. The problem is that the ChemSpider URLs listed above point to query results pages, not compound summary pages. Were these URLs to redirect to a summary page, we could construct the following URLs to extract ChemSpider resources (I've replaced the '=' sign with a '/' for simplicity):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;.../InChIKey/DEIYFTQMQPDXOT-RERXVCSDCZ&lt;/strong&gt; Get all resources for the molecule identified by the given InChIKey, i.e., the "Compound summary page".&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;.../InChIKey/DEIYFTQMQPDXOT-RERXVCSDCZ/molfile.mol&lt;/strong&gt; Get the molfile for the molecule identified by the given InChIKey.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;.../InChIKey/DEIYFTQMQPDXOT-RERXVCSDCZ/small_image.png&lt;/strong&gt; Get the small image for the molecule identified by the given InChIKey.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;.../InChIKey/DEIYFTQMQPDXOT-RERXVCSDCZ/large_image.png&lt;/strong&gt; Get the large image for the molecule identified by the given InChIKey.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;.../InChIKey/DEIYFTQMQPDXOT-RERXVCSDCZ/citations.xml&lt;/strong&gt; Get the list of citations for the molecule identified by the given InChIKey, in XML format.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Jane, a developer building Web applications on top of this new ChemSpider API, would immediately notice that things just work. Let's say her online database stores IC&lt;sub&gt;50&lt;/sub&gt;s at the dopamine D&lt;sub&gt;2&lt;/sub&gt; receptor. On the summary page for each molecule, she wants to link out to the ChemSpider compound summary page, if available. She would simply construct the InChIKey on her server, build the needed ChemSpider URL and GET it. An HTTP 404 would indicate no molecule with that Key exists on ChemSpider and so no link would be shown. An HTTP 200 would indicate ChemSpider has the molecule, and so the link would appear.&lt;/p&gt;
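
&lt;p&gt;Jane's check could be sketched in a few lines of Ruby. This is only an illustration of the proposal above: the &lt;tt&gt;.../InChIKey/...&lt;/tt&gt; layout and the resource names are the hypothetical scheme suggested in this article, not URLs ChemSpider is known to serve, and the names &lt;tt&gt;CHEMSPIDER_BASE&lt;/tt&gt;, &lt;tt&gt;resource_url&lt;/tt&gt;, &lt;tt&gt;http_status&lt;/tt&gt;, and &lt;tt&gt;summary_link&lt;/tt&gt; are made up for the example. The status lookup is passed in as an argument so the logic can be tried without touching the network:&lt;/p&gt;

```ruby
require 'net/http'
require 'uri'

# Assumed base URL for the hypothetical scheme proposed above
# ('/' used in place of '='); not a documented ChemSpider endpoint.
CHEMSPIDER_BASE = 'http://www.chemspider.com/InChIKey'

# Build the URL for the summary page (no resource given) or for one
# named resource such as 'molfile.mol' or 'citations.xml'.
def resource_url(inchikey, resource = nil)
  [CHEMSPIDER_BASE, inchikey, resource].compact.join('/')
end

# Default status lookup: issue a GET and return the numeric HTTP status.
def http_status(url)
  Net::HTTP.get_response(URI.parse(url)).code.to_i
end

# Return the summary URL if the service answers 200, or nil otherwise
# (for example on a 404), in which case no link would be shown.
def summary_link(inchikey, status_for = method(:http_status))
  url = resource_url(inchikey)
  status_for.call(url) == 200 ? url : nil
end

# Exercise the logic with canned statuses instead of a live request.
key = 'DEIYFTQMQPDXOT-RERXVCSDCZ'
puts summary_link(key, lambda { |url| 200 }).inspect
puts summary_link(key, lambda { |url| 404 }).inspect
```

&lt;p&gt;The point of the sketch is how little the client needs to know: one string concatenation and one status code. Anything beyond that (redirects, timeouts, caching) is left out here and would need handling in a real application.&lt;/p&gt;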

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;It would be interesting enough if ChemSpider adopted a system like that described here. But the real power of this approach would emerge if multiple Web services were to adopt it. By following a simple set of conventions, these services would enable third-party developers to elegantly &lt;a href="http://depth-first.com/articles/2006/09/23/mashups-for-fun-and-profit"&gt;mash up&lt;/a&gt; all manner of cheminformatics resources into applications unimaginable today.&lt;/p&gt;

&lt;p&gt;Technically, there's nothing that prevents this system from being implemented on every &lt;a href="http://depth-first.com/articles/2007/01/24/thirty-two-free-chemistry-databases"&gt;free chemistry database&lt;/a&gt; in existence today. However, doing so would transfer a significant degree of control from service operators to third-party developers. Not all providers will be comfortable with that idea.&lt;/p&gt;

&lt;p&gt;Cheminformatics Web service providers need to carefully consider whether they're trying to develop a &lt;a href="http://depth-first.com/articles/2007/07/04/pubchem-is-a-platform"&gt;platform or an integrated service&lt;/a&gt;. As history has shown, the strategies, and upside potential, for each approach can differ dramatically.&lt;/p&gt;</description>
      <pubDate>Mon, 01 Oct 2007 10:53:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:e3caeb1b-58a7-4a3a-b215-131825ee9f2e</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/10/01/streamlining-cheminformatics-on-the-web-let-inchi-do-the-heavy-lifting-and-get-some-rest</link>
      <category>Meta</category>
      <category>chemspider</category>
      <category>rest</category>
      <category>inchi</category>
      <category>inchikey</category>
      <category>web</category>
    </item>
    <item>
      <title>InChI for Newbies</title>
      <description>&lt;p&gt;&lt;img src="http://depth-first.com/demo/20070927/newbies.png" align="right"&gt;&lt;/a&gt;The &lt;a href="http://www.iupac.org/inchi/"&gt;IUPAC International Chemical Indentifier&lt;/a&gt; (InChI) provides an open system for converting molecular representations into strings of text. Because text processing is one of the things computers do very well, InChI serves as an important link between chemistry and computer science.&lt;/p&gt;

&lt;p&gt;Unfortunately, the InChI documentation is rather scattered. To help remedy this problem, I have selected some links to Depth-First articles discussing InChI. These articles span a wide range of perspectives. They have been divided into categories for easier navigation, although many contain information that any user of InChI could find useful.&lt;/p&gt;

&lt;h4&gt;InChI in Context&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2007/06/25/interconvert-almost-any-smiles-and-inchi-with-ruby-open-babel"&gt;Interconvert Almost Any SMILES and InChI with Ruby Open Babel&lt;/a&gt; The risk-free way to test InChI.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2007/03/14/eleven-qualities-of-the-perfect-line-notation-for-the-web"&gt;Eleven Qualities of the Perfect Line Notation for the Web&lt;/a&gt; Whatever limitations it may have, designing InChI was no easy task.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2007/01/26/how-to-find-chemical-information-on-the-internet-why-open-source-open-access-and-open-data-matter"&gt;How to Find Chemical Information on the Internet: Why Open Source, Open Access and Open Data Matter&lt;/a&gt; InChI is also part of the process.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2007/01/24/thirty-two-free-chemistry-databases"&gt;Thirty-Two Free Chemistry Databases&lt;/a&gt; Many of them use InChI already.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2006/09/05/the-automatic-encoding-of-chemical-structures"&gt;The Automatic Encoding of Chemical Structures&lt;/a&gt; Why InChI needs to be heard but not seen.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2006/11/08/debabelization"&gt;Debabelization&lt;/a&gt; Why InChI probably isn't the last line notation you'll ever need.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2006/08/18/107-years-of-line-formula-notations-1861-1968"&gt;107 Years of Line Formula Notations 1861-1968&lt;/a&gt; InChI is the latest in a long ...ummm... line of indexing systems.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2007/07/20/everything-old-is-new-again-wiswesser-line-notation-wln"&gt;Everything Old is New Again: Wiswesser Line Notation (WLN)&lt;/a&gt; The early roots of InChI.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;Hacking InChI&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2007/09/17/hacking-chemspider-query-by-smiles-and-inchi-with-ruby"&gt;Hacking ChemSpider: Query by SMILES and InChI with Ruby&lt;/a&gt; The easy way to mash-up ChemSpider.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2007/01/12/decoding-inchis-an-introduction-to-ninja"&gt;Decoding InChIs: An Introduction to Ninja&lt;/a&gt; Putting InChI together again.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2007/09/06/from-inchi-to-image-with-ruby-open-babel-and-ruby-cdk"&gt;From InChI to Image with Ruby Open Babel and Ruby CDK&lt;/a&gt; Inscrutable InChIs got you down? Visualize them in color!&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2006/09/19/decoding-inchis-with-rino"&gt;Decoding InChIs with Rino&lt;/a&gt; Instructive, but no longer recommended.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2006/09/16/taking-a-swig-of-inchi"&gt;Taking a SWIG of InChI&lt;/a&gt; It may leave a doozy of a hangover, but it'll cure what ails ya.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2006/08/12/inchi-canonicalization-algorithm"&gt;InChI Canonization Algorithm&lt;/a&gt; More detail than you probably wanted to know.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;InChI and the Web&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2007/05/17/my-inchi-runneth-over"&gt;My InChI Runneth Over&lt;/a&gt; Why InChI makes grown Web designers cry.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2007/05/09/hashing-inchis"&gt;Hashing InChIs&lt;/a&gt; One-way ticket to Web harmony.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2007/03/05/why-the-web-isnt-ready-for-chemistry"&gt;Why the Web Isn't Ready for Chemistry&lt;/a&gt; Still!?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2007/02/28/inchi-spam"&gt;InChI Spam&lt;/a&gt; Remember when Spam was funny?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2006/09/13/the-chemically-aware-web-are-we-there-yet"&gt;The Chemically-Aware Web: Are We There Yet?&lt;/a&gt; Google and InChI don't always mix.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;InChIMatic&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2007/02/19/google-for-molecules-with-inchimatic"&gt;Google for Molecules with InChIMatic&lt;/a&gt; Title says it all.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2007/09/24/building-the-chemically-aware-web-totallysynthetic-and-inchimatic"&gt;Building the Chemically-Aware Web: TotallySynthetic and InChIMatic&lt;/a&gt; Hey, this thing is useful for something!&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2007/08/15/googling-for-molecules-with-inchimatic-and-firefly"&gt;Googling for Molecules with InChIMatic and Firefly&lt;/a&gt; A comfortable editor for structure searching the Web.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2007/06/21/open-notebook-science-using-inchimatic"&gt;Open Notebook Science Using InChIMatic&lt;/a&gt; One way to use InChI in Open Science.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2006/12/15/anatomy-of-a-cheminformatics-web-application-inchimatic"&gt;Anatomy of a Cheminformatics Web Application: InChIMatic&lt;/a&gt; It's never been so easy to build software for chemists.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Please feel free to link to your favorite InChI resource in the comments section.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Image Generator Credit: &lt;a href="http://txt2pic.com"&gt;txt2pic.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;</description>
      <pubDate>Thu, 27 Sep 2007 08:39:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:f6c3ac58-6f0a-4afe-a553-87d423e3d2ca</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/09/27/inchi-for-newbies</link>
      <category>Tools</category>
      <category>inchi</category>
      <category>newbies</category>
    </item>
    <item>
      <title>Building the Chemically-Aware Web: TotallySynthetic and InChIMatic</title>
      <description>&lt;p&gt;&lt;a href="http://depth-first.com/articles/tag/inchimatic"&gt;Recent D-F articles&lt;/a&gt; have discussed &lt;a href="http://inchimatic.com"&gt;InChIMatic&lt;/a&gt;, a Web application that lets you search the Web for chemical structures by simply drawing them. InChIMatic takes advantage of &lt;a href="http://www.iupac.org/inchi/"&gt;InChI&lt;/a&gt;, a system for representing molecular structures as a strings of text, and Google, which indexes these text strings. In this article, I'll show InChIMatic in action as it quickly finds a molecule discussed in a &lt;a href="http://totallysynthetic.com/blog/?p=762"&gt;review&lt;/a&gt; of &lt;a href="http://dx.doi.org/10.1021/ja074300t"&gt;Overman's Sarain A synthesis&lt;/a&gt; appearing in Paul Docherty's &lt;a href="http://totallysynthetic.com/blog"&gt;TotallySynthetic blog&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;You Can Skip this Step&lt;/h4&gt;

&lt;p&gt;The TotallySynthetic review lists three InChIs at the bottom, but which structures, out of the many discussed, do these represent? We need to know so that we can enter these structures into InChIMatic. This is, of course, a step needed only because we're testing the system, not because this is how the system was designed to be used.&lt;/p&gt;

&lt;p&gt;A recent D-F article discussed a method for &lt;a href="http://depth-first.com/articles/2007/09/06/from-inchi-to-image-with-ruby-open-babel-and-ruby-cdk"&gt;converting InChIs into 2D structures&lt;/a&gt; using Ruby. It has the advantage of being easily adaptable to building chemically-aware Web spiders. And it's 100% Open Source.&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20070924/first.png"&gt;&lt;/img&gt;&lt;img src="http://depth-first.com/demo/20070924/second.png"&gt;&lt;/img&gt;&lt;img src="http://depth-first.com/demo/20070924/third.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;Running this library over TotallySynthetic's InChIs yields the three images above. Notice that we have some problems. The first and third images lack stereochemistry. The second has a trans- double bond instead of the cis- stereochemistry encoded by the InChI. There are good reasons for each of these problems, which I hope to address in later articles. For now, it's sufficient that we can clearly make the connection between the TotallySynthetic InChIs and structures in the Sarain A review.&lt;/p&gt;

&lt;h4&gt;Run the Search&lt;/h4&gt;

&lt;p&gt;We can test this system by pointing our browser to &lt;a href="http://inchimatic.com"&gt;inchimatic.com&lt;/a&gt;. Entering one of the structures and clicking "Search" takes us directly to a link for the TotallySynthetic site, courtesy of Google. Unfortunately, the link doesn't currently point to &lt;a href="http://totallysynthetic.com/blog/?p=762"&gt;the article itself&lt;/a&gt;. This issue may resolve itself as the Googlebot continues to index the TotallySynthetic site.&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;a href="http://inchimatic.com"&gt;&lt;img src="http://depth-first.com/demo/20070924/inchimatic.png"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/center&gt;&lt;/p&gt;

&lt;h4&gt;A Technical Note&lt;/h4&gt;

&lt;p&gt;If you spend any time working with InChIs, you'll notice that they're very long. So long, in fact, that they break many Web page layouts. There have been many attempts to &lt;a href="http://depth-first.com/articles/2007/03/05/why-the-web-isnt-ready-for-chemistry"&gt;fix the long-InChI problem&lt;/a&gt;, but Paul may have found the answer by trying the simplest thing that could possibly work.&lt;/p&gt;

&lt;p&gt;If you inspect the HTML source for the TotallySynthetic article, you'll find that Paul has inserted hard returns (&lt;tt&gt;br&lt;/tt&gt; elements) to manually break his InChIs, including &lt;del&gt;the one we just located with InChIMatic (first in the list)&lt;/del&gt; the first and last structures above, both of which can be found with InChIMatic:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_xml "&gt;&lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;p&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;small&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;InChI=1/C29H33NO4Si/c1-5-32-28(31)26-25(34-27(30-26)22-15-9-6-10-16-22)21-33-35(29(2,3)4,23-17-11-7-12-18-23)24-19-13-8-14-20-24&lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;br&lt;/span&gt; &lt;span class="punct"&gt;/&amp;gt;&lt;/span&gt;
/h6-20,25-26H,5,21H2,1-4H3/t25-,26-/m0/s1 InChI=1/C18H25NO6S/c1-14-9-11-15(12-10-14)26(22,23)19(17(21)25-18(2,3)4)13-7-6-8-16(20)24-5/h6,8-12H,7,13H2,1-5H3/b8-6- InChI=1/C47H58N2O10SSi/c1-10-56-43(51)47(36(32-41(50)55-9)30-31-49(44(52)59-45(3,4)5)60(53,54)37-28-26-34(2)27-29-37)40&lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;br&lt;/span&gt; &lt;span class="punct"&gt;/&amp;gt;&lt;/span&gt;

(58-42(48-47)35-20-14-11-15-21-35)33-57-61(46(6,7)8,38-22-16-12-17-23-38)39-24-18-13-19-25-39/h11-29,36,40H,10,30-33H2,1-9H3&lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;br&lt;/span&gt; &lt;span class="punct"&gt;/&amp;gt;&lt;/span&gt;
/t36-,40-,47-/m0/s1&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;small&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;p&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;In other words, fixing the long InChI/Google indexing problem may be as simple as just inserting &lt;tt&gt;br&lt;/tt&gt; elements when needed. More on this later, though.&lt;/p&gt;
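
&lt;p&gt;As a rough illustration of that idea, here is a minimal Ruby sketch that splits a long InChI into fixed-width pieces. Paul inserted his breaks by hand, so this is not his method. The &lt;tt&gt;break_inchi&lt;/tt&gt; helper, its &lt;tt&gt;width&lt;/tt&gt; and &lt;tt&gt;separator&lt;/tt&gt; parameters, and the 40-character width are all hypothetical choices made for the example.&lt;/p&gt;

```ruby
# Hypothetical helper (not from the TotallySynthetic page or any library):
# split a long InChI into pieces of at most `width` characters and join
# the pieces with `separator`.
def break_inchi(inchi, width = 80, separator = "\n")
  raise ArgumentError, "width must be positive" unless width.positive?

  # scan walks the string left to right, taking up to `width` characters per match
  inchi.scan(/.{1,#{width}}/).join(separator)
end

# The second InChI listed in the TotallySynthetic review
inchi = "InChI=1/C18H25NO6S/c1-14-9-11-15(12-10-14)26(22,23)19(17(21)25-18(2,3)4)13-7-6-8-16(20)24-5/h6,8-12H,7,13H2,1-5H3/b8-6-"

# The 40-character width is an arbitrary value picked for this example
puts break_inchi(inchi, 40)
```

&lt;p&gt;The sketch joins the pieces with a newline so the output is easy to read in a terminal. When writing HTML, you would pass the markup for a &lt;tt&gt;br&lt;/tt&gt; element as the separator instead. Whether a search engine still treats the broken pieces as one searchable string is exactly the open question raised above, and this sketch does not settle it.&lt;/p&gt;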

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;This article has shown a working demonstration that uses free tools to build self-organizing, highly distributed, searchable chemical databases. Although the system is far from perfect, it does provide a glimpse of what can be done right now with relatively little effort. Starting with this basic idea, we can begin to think about a variety of fast, free, user-friendly services that make finding molecules on the Web, and publishing their whereabouts, as easy as using Google and WordPress. But that's a story for another time.&lt;/p&gt;</description>
      <pubDate>Mon, 24 Sep 2007 10:54:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:1151053d-f13b-4431-b5a3-9eaaab658cec</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/09/24/building-the-chemically-aware-web-totallysynthetic-and-inchimatic</link>
      <category>Tools</category>
      <category>inchimatic</category>
      <category>inchi</category>
      <category>totallysynthetic</category>
      <category>sarain</category>
      <category>ruby</category>
    </item>
  </channel>
</rss>
