<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/stylesheets/rss.css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>Depth-First: Looking at InChIs</title>
    <link>http://depth-first.com/articles/2006/09/26/looking-at-inchis</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description>Walking the Web of Chemical Informatics</description>
    <item>
      <title>Looking at InChIs</title>
      <description>&lt;p&gt;InChI identifiers can be viewed both as unique molecular keys and as a language encoding molecular structure. With the right software, it is possible to decode any InChI to arrive at a human-readable molecular structure. This tutorial will show how to convert InChI identifiers into 2-D molecular renderings using open source tools.&lt;/p&gt;

&lt;h4&gt;Prerequisites&lt;/h4&gt;

&lt;p&gt;The InChI to 2-D image conversion process requires two pieces of software:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2006/09/19/decoding-inchis-with-rino"&gt;Rino&lt;/a&gt; decodes InChI identifiers into molfiles. The resulting atomic coordinates are set to zero.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2006/09/25/cdk-the-ruby-way-rcdk-0-2-0"&gt;RCDK&lt;/a&gt; assigns coordinates to the molfile produced by Rino, and renders the result.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;Bring on the Code&lt;/h4&gt;

&lt;p&gt;The following Ruby code illustrates how the InChI for the pesticide fipronil (Regent) can be translated into a PNG image:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rubygems&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require_gem&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rino&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require_gem&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;util&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="ident"&gt;inchi&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;InChI=1/C12H4Cl2F6N4OS/c13-5-1-4(11(15,16)17)2-6(14)8(5)24-10(22)9(7(3-21)23-24)26(25)12(18,19)20/h1-2H,22H2&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt; &lt;span class="comment"&gt;#fipronil&lt;/span&gt;
&lt;span class="ident"&gt;reader&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;Rino&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;InChIReader&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;
&lt;span class="ident"&gt;molfile1&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;reader&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;read&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;inchi&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt; &lt;span class="comment"&gt;# lacks 2-D atomic coordinates&lt;/span&gt;
&lt;span class="ident"&gt;molfile2&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;RCDK&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Util&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;XY&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;coordinate_molfile&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;molfile1&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt; &lt;span class="comment"&gt;# has 2-D atomic coordinates&lt;/span&gt;

&lt;span class="constant"&gt;RCDK&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Util&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Image&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;molfile_to_png&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;molfile2&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;fipronil.png&lt;/span&gt;&lt;span class="punct"&gt;',&lt;/span&gt; &lt;span class="number"&gt;350&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="number"&gt;300&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Running this code produces the image &lt;strong&gt;fipronil.png&lt;/strong&gt; in your working directory:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20060926/fipronil.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;h4&gt;Limitations&lt;/h4&gt;

&lt;p&gt;The technique illustrated here is subject to the same limitations as the underlying software. For Rino, this means that stereochemistry is ignored. For RCDK, this means that implicit hydrogen atoms, isotopes, and charges are omitted, and that layout of macrocycles and other complex ring systems may not subjectively appear very refined.&lt;/p&gt;

&lt;h4&gt;Other Software that Does This&lt;/h4&gt;

&lt;p&gt;To my knowledge, only one other Open Source package, &lt;a href="http://bkchem.zirael.org/inchi_en.html"&gt;BKChem&lt;/a&gt;, is capable of rendering InChIs as described here. BKChem's underlying InChI translation and depiction software, OASA, can also be &lt;a href="http://inchi.info/converter_en.html"&gt;accessed online&lt;/a&gt;. For comparison, OASA produces the following image for for the fipronil InChI:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20060926/fipronil_bkchem.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;The &lt;a href="http://pubchem.ncbi.nlm.nih.gov/edit/"&gt;PubChem editor&lt;/a&gt; can also translate and render InChIs, but no source code appears to be available. PubChem's InChI translation and rendering output for fipronil is:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20060926/fipronil_pubchem.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="http://cdk.sf.net"&gt;The Chemistry Development Kit&lt;/a&gt;, on which RCDK is based, was recently upgraded to support reading InChI identifiers. For some time, CDK has been able to generate 2-D atomic coordinates.&lt;/p&gt;

&lt;p&gt;More information on InChI software can be found at Beda Kosata's &lt;a href="http://inchi.info/index.html"&gt;InChI.info&lt;/a&gt; site.&lt;/p&gt;

&lt;h4&gt;The Final Word&lt;/h4&gt;

&lt;p&gt;Within certain limitations, it is quite feasible to programatically obtain a 2-D molecular image for any InChI identifier. Combining this capability with other chemical informatics software and services offers numerous possibilities to use InChI in innovative ways.&lt;/p&gt;</description>
      <pubDate>Tue, 26 Sep 2006 14:35:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:a6f32cae-5226-4b4d-9ab6-6ecd0a788c40</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2006/09/26/looking-at-inchis</link>
      <category>Graphics</category>
      <category>inchi</category>
      <category>rcdk</category>
      <category>rino</category>
      <category>depict</category>
      <category>2d</category>
      <category>ruby</category>
    </item>
  </channel>
</rss>
