<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/stylesheets/rss.css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>Depth-First: From SMILES to InChI with OBRuby</title>
    <link>http://depth-first.com/articles/2006/11/03/from-smiles-to-inchi-with-obruby</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description>Walking the Web of Chemical Informatics</description>
    <item>
      <title>From SMILES to InChI with OBRuby</title>
      <description>&lt;p&gt;&lt;img src="http://depth-first.com/files/Babel256.png" align="right"&gt;&lt;/img&gt;&lt;a href="http://www.daylight.com/dayhtml/doc/theory/theory.smiles.html"&gt;SMILES&lt;/a&gt; and &lt;a href="http://www.iupac.org/inchi/"&gt;InChI&lt;/a&gt; are two commonly used &lt;a href="http://depth-first.com/articles/2006/08/18/107-years-of-line-formula-notations-1861-1968"&gt;molecular line notations&lt;/a&gt;. Although each has its advantages and limitations, the novelty of InChI and the ubiquity of SMILES make the SMILES to InChI conversion especially useful. Many of the situations that call for this conversion are particularly well suited to the &lt;a href="http://ruby-lang.org"&gt;Ruby&lt;/a&gt; programming language. A &lt;a href="http://depth-first.com/articles/2006/08/26/from-smiles-to-inchi-rino-cdk-and-java-ruby-bridge"&gt;recent article&lt;/a&gt; described how &lt;a href="http://depth-first.com/articles/2006/10/30/agile-chemical-informatics-development-with-cdk-and-ruby-rcdk-0-3-0"&gt;RCDK&lt;/a&gt; and &lt;a href="http://depth-first.com/articles/2006/09/26/looking-at-inchis"&gt;Rino&lt;/a&gt; could be used to accomplish this conversion. This article shows how Open Babel can be used from Ruby to perform the same conversion.&lt;/p&gt;

&lt;h4&gt;OBRuby&lt;/h4&gt;

&lt;p&gt;&lt;a href="http://depth-first.com/articles/2006/10/31/obruby-a-ruby-interface-to-open-babel"&gt;OBRuby&lt;/a&gt; is a SWIG-generated Ruby interface to the &lt;a href="http://openbabel.sf.net"&gt;Open Babel&lt;/a&gt; library. Although OBRuby doesn't expose all aspects of the Open Babel API, nearly everything that can be done in C++ Open Babel can now be done in Ruby. For example, all &lt;tt&gt;OBConversion&lt;/tt&gt; permutations should be available, including SMILES to InChI.&lt;/p&gt;

&lt;h4&gt;A Small Ruby Library&lt;/h4&gt;

&lt;p&gt;Let's create a small Ruby library for converting SMILES strings into InChI identifiers. Save the following into a file called &lt;strong&gt;convert.rb&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;openbabel&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="keyword"&gt;class &lt;/span&gt;&lt;span class="class"&gt;Convertor&lt;/span&gt;
  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;initialize&lt;/span&gt;
    &lt;span class="attribute"&gt;@conv&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;OpenBabel&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;OBConversion&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;

    &lt;span class="attribute"&gt;@conv&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;set_in_and_out_formats&lt;/span&gt;&lt;span class="punct"&gt;('&lt;/span&gt;&lt;span class="string"&gt;smi&lt;/span&gt;&lt;span class="punct"&gt;',&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;inchi&lt;/span&gt;&lt;span class="punct"&gt;')&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;

  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;get_inchi&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;smiles&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
    &lt;span class="ident"&gt;mol&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;OpenBabel&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;OBMol&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;

    &lt;span class="attribute"&gt;@conv&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;read_string&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;mol&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;smiles&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
    &lt;span class="attribute"&gt;@conv&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;write_string&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;mol&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt; &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;There's nothing tricky here. We've simply created a Ruby class that makes the SMILES to InChI conversion as simple as one method call to an instance.&lt;/p&gt;
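
&lt;p&gt;Because the class exposes a single method, it is easy to wrap for batch work. The following is only a sketch: &lt;tt&gt;convert_all&lt;/tt&gt; is a hypothetical helper name, not part of OBRuby, and it works with any object that responds to &lt;tt&gt;get_inchi&lt;/tt&gt;. Stripping trailing whitespace assumes the underlying writer may append a newline to its output.&lt;/p&gt;

```ruby
# Convert a list of SMILES strings with any object that responds to
# get_inchi, such as an instance of the Convertor class defined above.
# convert_all is a hypothetical helper name, not part of OBRuby.
def convert_all(convertor, smiles_list)
  smiles_list.each_with_object({}) do |smiles, results|
    # to_s guards against a nil result; strip removes any trailing
    # whitespace the underlying writer may append (an assumption).
    results[smiles] = convertor.get_inchi(smiles).to_s.strip
  end
end
```

&lt;p&gt;With &lt;strong&gt;convert.rb&lt;/strong&gt; loaded, calling &lt;tt&gt;convert_all(Convertor.new, list_of_smiles)&lt;/tt&gt; should return a hash that maps each SMILES string to its InChI.&lt;/p&gt;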

&lt;h4&gt;Testing the Library&lt;/h4&gt;

&lt;p&gt;A good way to test this library is through Interactive Ruby (irb). For example, to find the InChI of caffeine:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;convert&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="ident"&gt;c&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;Convertor&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;

&lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="ident"&gt;c&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;get_inchi&lt;/span&gt;&lt;span class="punct"&gt;('&lt;/span&gt;&lt;span class="string"&gt;Cn1cnc2c1c(=O)n(C)c(=O)n2C&lt;/span&gt;&lt;span class="punct"&gt;')&lt;/span&gt; &lt;span class="comment"&gt;# caffeine&lt;/span&gt;
&lt;span class="comment"&gt;# =&amp;gt;InChI=1/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h4&gt;Chiral SMILES&lt;/h4&gt;

&lt;p&gt;I applied this simple Ruby conversion library to the &lt;a href="http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=10836"&gt;(S)-methamphetamine record in PubChem&lt;/a&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Isomeric SMILES: C[C@@H](CC1=CC=CC=C1)NC&lt;/li&gt;
&lt;li&gt;PubChem InChI: InChI=1/C10H15N/c1-9(11-2)8-10-6-4-3-5-7-10/h3-7,9,11H,8H2,1-2H3/t9-/m0/s1&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;My results were:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Isomeric SMILES: C[C@@H](CC1=CC=CC=C1)NC&lt;/li&gt;
&lt;li&gt;OBRuby InChI: InChI=1/C10H15N/c1-9(11-2)8-10-6-4-3-5-7-10/h3-7,9,11H,8H2,1-2H3/t9-/m1/s1&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As you can see, the two InChIs differ in their stereo layers ('m0' vs. 'm1'). Open Babel generates the same InChI whether it is run through OBRuby or the &lt;a href="http://wwmm-svc.ch.cam.ac.uk/wwmm/html/observer.html"&gt;Worldwide Molecular Matrix&lt;/a&gt;. Substituting the SMILES string that represents the opposite (R) configuration at the stereocenter generates the InChI with the opposite configuration, which again disagrees with that of &lt;a href="http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=36604"&gt;(R)-methamphetamine in PubChem&lt;/a&gt;.&lt;/p&gt;
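
&lt;p&gt;Spotting a one-character difference in two long identifiers by eye is error-prone, so it can help to compare them layer by layer. The sketch below uses plain Ruby and no Open Babel calls. The names &lt;tt&gt;inchi_layers&lt;/tt&gt; and &lt;tt&gt;differing_layers&lt;/tt&gt; are hypothetical, and the parsing assumes each layer prefix occurs at most once, which holds for the two InChIs above but not for every InChI.&lt;/p&gt;

```ruby
# Split an InChI string into a hash of layers. The version and formula
# get descriptive keys; every other layer is keyed by its prefix letter.
# Assumes each prefix occurs at most once, as in the InChIs shown above.
def inchi_layers(inchi)
  version, formula, *rest = inchi.sub(/\AInChI=/, '').split('/')
  layers = {}
  layers['version'] = version
  layers['formula'] = formula
  rest.each { |layer| layers[layer[0]] = layer[1..-1] }
  layers
end

# Return the keys of the layers whose contents differ between two InChIs.
def differing_layers(first, second)
  a = inchi_layers(first)
  b = inchi_layers(second)
  (a.keys | b.keys).reject { |key| a[key] == b[key] }
end

pubchem = 'InChI=1/C10H15N/c1-9(11-2)8-10-6-4-3-5-7-10/h3-7,9,11H,8H2,1-2H3/t9-/m0/s1'
obruby  = 'InChI=1/C10H15N/c1-9(11-2)8-10-6-4-3-5-7-10/h3-7,9,11H,8H2,1-2H3/t9-/m1/s1'

puts differing_layers(pubchem, obruby).inspect
```

&lt;p&gt;For the two methamphetamine InChIs, only the 'm' layer is reported. The connectivity, hydrogen and 't' layers are identical.&lt;/p&gt;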

&lt;p&gt;At this point, it is unclear whether Open Babel or PubChem is producing the correct InChI for the methamphetamine enantiomers. I suspect Open Babel is correct. By creating a molfile of (S)-methamphetamine with &lt;a href="http://depth-first.com/articles/2006/08/21/four-free-2-d-structure-editors-for-web-applications"&gt;JME&lt;/a&gt; and running cInChI over it, I got the same output as with the Open Babel conversions. I've found similar differences between PubChem and Open Babel InChIs in every chiral molecule I've looked at.&lt;/p&gt;

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;Converting SMILES and other molecular languages into InChI identifiers is likely to become a recurring need as the popularity of InChI increases. Combining the formidable translation capabilities of Open Babel with the comfort and convenience of Ruby offers a powerful new technique for doing so.&lt;/p&gt;</description>
      <pubDate>Fri, 03 Nov 2006 15:50:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:9cb1d0d4-a7ec-426f-8e4f-6d82a9363aef</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2006/11/03/from-smiles-to-inchi-with-obruby</link>
      <category>Tools</category>
      <category>inchi</category>
      <category>smiles</category>
      <category>ruby</category>
      <category>openbabel</category>
      <category>pubchem</category>
      <category>chiral</category>
    </item>
  </channel>
</rss>
