<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/stylesheets/rss.css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>Depth-First: Tag smarts</title>
    <link>http://depth-first.com/articles/tag/smarts</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description>Walking the Web of Chemical Informatics</description>
    <item>
      <title>Strings and Things</title>
      <description>&lt;p&gt;&lt;a href="http://daylight.com/meetings/mug01/Bradshaw/History/800x600/Strings_and_Things/sld029.htm"&gt;&lt;img src="http://depth-first.com/demo/20070425/the_future.jpg" align="right" border="0"&gt;&lt;/img&gt;&lt;/a&gt;I ran across John Bradshaw's excellent presentation &lt;a href="http://daylight.com/meetings/mug01/Bradshaw/History/800x600/Strings_and_Things/sld001.htm"&gt;Strings and Things&lt;/a&gt;. Part historical overview, part explanation of the SMILES/SMARTS &lt;a href="http://depth-first.com/articles/2007/03/14/eleven-qualities-of-the-perfect-line-notation-for-the-web"&gt;line notation&lt;/a&gt; systems, Bradshaw's slides are chock full of interesting tidbits.&lt;/p&gt;

&lt;p&gt;My favorite: &lt;a href="http://daylight.com/meetings/mug01/Bradshaw/History/800x600/Strings_and_Things/sld029.htm"&gt;slide 29&lt;/a&gt; - "Line notations are dead." It's a wonderful illustration of why predicting the future of technology is so tricky. The light pen became the mouse, the computer display became color, and Digital fell off a cliff. SMILES and SMARTS are the only things to have survived.&lt;/p&gt;</description>
      <pubDate>Wed, 25 Apr 2007 09:28:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:00ee8bee-8341-48d4-bc91-afb685b90acb</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/04/25/strings-and-things</link>
      <category>Meta</category>
      <category>smiles</category>
      <category>smarts</category>
      <category>linenotation</category>
    </item>
    <item>
      <title>OBRuby: A Ruby Interface to Open Babel</title>
      <description>&lt;p&gt;&lt;img src="http://depth-first.com/files/Babel256.png" align="right"&gt;&lt;/img&gt;&lt;/p&gt;

&lt;blockquote&gt;
    &lt;p&gt;And the LORD said, Behold, the people is one, and they have all one language; and this they begin to do: and now nothing will be restrained from them, which they have imagined to do.&lt;/p&gt;

    &lt;p&gt;-&lt;cite&gt;Genesis 11:6&lt;/cite&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="http://openbabel.sf.net"&gt;Open Babel&lt;/a&gt; is a &lt;a href="http://sourceforge.net/project/stats/detail.php?group_id=40728&amp;amp;ugn=openbabel&amp;amp;mode=alltime&amp;amp;&amp;amp;type=prdownload"&gt;widely-used&lt;/a&gt; Open Source chemical informatics toolkit written in C++. Although originally designed as a &lt;a href="http://openbabel.sourceforge.net/wiki/Formats"&gt;molecular language translator&lt;/a&gt;, Open Babel also supports &lt;a href="http://openbabel.sourceforge.net/wiki/SMARTS"&gt;SMARTS pattern recognition&lt;/a&gt;, &lt;a href="http://openbabel.sourceforge.net/wiki/Fingerprint"&gt;molecular fingerprints&lt;/a&gt;, &lt;a href="http://openbabel.sourceforge.net/wiki/Obfit"&gt;molecular superposition&lt;/a&gt;, and other features as well.&lt;/p&gt;

&lt;p&gt;Open Babel currently offers interfaces for two scripting languages: &lt;a href="http://openbabel.sourceforge.net/wiki/Python"&gt;Python&lt;/a&gt; and &lt;a href="http://openbabel.sourceforge.net/wiki/Perl"&gt;Perl&lt;/a&gt;. Recently, &lt;a href="http://geoffhutchison.net/blog/"&gt;Geoff Hutchison&lt;/a&gt; and I have been working to add Ruby to that list. This article reports our success in doing so and provides a glimpse of what might now be possible.&lt;/p&gt;

&lt;h4&gt;OBRuby&lt;/h4&gt;

&lt;p&gt;The upcoming release of Open Babel (version 2.1.0) will come complete with a Ruby interface. For those interested in trying it out sooner, a package called &lt;a href="http://depth-first.com/demo/20061031/obruby.tar.gz"&gt;OBRuby&lt;/a&gt; can be downloaded now. OBRuby compiles against revision 1577 of the Open Babel SVN trunk. It has been tested with Linux and Mac OS X, and will probably work on Windows with minor modifications. &lt;em&gt;The approach outlined here is known to fail with Open Babel 2.0.2.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;OBRuby is a technology demonstration. The Ruby scripting support included with Open Babel 2.1.0 may differ in some details from OBRuby. My purpose in this article is simply to demonstrate what is now possible. Please read through the install scripts (they're short) to be sure you're comfortable with what they do.&lt;/p&gt;

&lt;p&gt;Here is the OBRuby installation process I followed:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Download the Open Babel SVN trunk revision 1577 or later.&lt;/li&gt;
&lt;li&gt;cd trunk&lt;/li&gt;
&lt;li&gt;configure, make, (as root) make install&lt;/li&gt;
&lt;li&gt;(as root) ldconfig (necessary on my system - perhaps not on yours)&lt;/li&gt;
&lt;li&gt;cd OBRUBY_DIR&lt;/li&gt;
&lt;li&gt;ruby build.rb&lt;/li&gt;
&lt;li&gt;(as root) make install&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;One last wrinkle: the &lt;strong&gt;build.rb&lt;/strong&gt; script included with OBRuby is something of a hack. It hardcodes the location of the Open Babel library on line 6:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="attribute"&gt;@@ob_dir&lt;/span&gt;&lt;span class="punct"&gt;='&lt;/span&gt;&lt;span class="string"&gt;/usr/local&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Change this line to match your Open Babel installation and you should be ready to go. &lt;tt&gt;make install&lt;/tt&gt; places a single file, &lt;strong&gt;openbabel.so&lt;/strong&gt;, into your Ruby site_ruby directory.&lt;/p&gt;

&lt;p&gt;To verify the installation, load the library from IRB:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ irb
irb(main):001:0&gt; require 'openbabel'
=&gt; true
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;A return value of &lt;tt&gt;true&lt;/tt&gt; shows that the installation was successful. An error message about &lt;strong&gt;libopenbabel.so&lt;/strong&gt; not being found indicates that your system can't find your Open Babel libraries. Be sure you've installed Open Babel and either run &lt;tt&gt;ldconfig&lt;/tt&gt; or set &lt;tt&gt;LD_LIBRARY_PATH&lt;/tt&gt;.&lt;/p&gt;
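&lt;p&gt;The same check can be scripted rather than typed into IRB. What follows is a minimal sketch and not part of OBRuby: the helper name &lt;tt&gt;openbabel_available?&lt;/tt&gt; is hypothetical, and the code relies only on Ruby's &lt;tt&gt;require&lt;/tt&gt; raising &lt;tt&gt;LoadError&lt;/tt&gt; when a library cannot be loaded.&lt;/p&gt;

```ruby
# Hypothetical helper: try to load the extension and report the outcome
# as true or false instead of letting LoadError propagate.
def openbabel_available?
  require "openbabel"
  true
rescue LoadError
  false
end

if openbabel_available?
  puts "openbabel loaded"
else
  puts "openbabel not loaded; check the install step, ldconfig, or LD_LIBRARY_PATH"
end
```

&lt;p&gt;A script that depends on OBRuby could call a helper like this first and exit with a readable message instead of a stack trace.&lt;/p&gt;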

&lt;p&gt;The majority of OBRuby was autogenerated by &lt;a href="http://www.swig.org/"&gt;SWIG&lt;/a&gt;. A future article will detail how this was done - with an eye toward developing a Java interface to Open Babel.&lt;/p&gt;

&lt;h4&gt;Building an OBMol From SMILES&lt;/h4&gt;

&lt;p&gt;With installation out of the way, let's fire up OBRuby and take her for a test drive. The following code can either be entered with IRB or saved to a file and executed with the ruby interpreter:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;openbabel&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;include&lt;/span&gt; &lt;span class="constant"&gt;OpenBabel&lt;/span&gt;

&lt;span class="ident"&gt;smi2mol&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;OBConversion&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;
&lt;span class="ident"&gt;smi2mol&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;set_in_format&lt;/span&gt;&lt;span class="punct"&gt;(&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;smi&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;)&lt;/span&gt;

&lt;span class="ident"&gt;mol&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;OBMol&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;
&lt;span class="ident"&gt;smi2mol&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;read_string&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;mol&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;CC(C)CCCC(C)C1CCC2C1(CCC3C2CC=C4C3(CCC(C4)O)C)C&lt;/span&gt;&lt;span class="punct"&gt;')&lt;/span&gt; &lt;span class="comment"&gt;# cholesterol, no chirality&lt;/span&gt;
&lt;span class="ident"&gt;mol&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;add_hydrogens&lt;/span&gt;

&lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;Cholesterol has &lt;span class="expr"&gt;#{mol.num_atoms}&lt;/span&gt; atoms, including hydrogens.&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;
&lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;Its molecular weight is &lt;span class="expr"&gt;#{mol.get_mol_wt}&lt;/span&gt; and its molecular formula is &lt;span class="expr"&gt;#{mol.get_formula}&lt;/span&gt;.&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This simple code illustrates some important points. All OBRuby classes reside in the &lt;tt&gt;OpenBabel&lt;/tt&gt; module. Pulling that module in with &lt;tt&gt;include&lt;/tt&gt; lets you reference the classes directly, without an &lt;tt&gt;OpenBabel::&lt;/tt&gt; prefix. Also notice that method names follow the Ruby &lt;tt&gt;underscore_delimited&lt;/tt&gt; convention rather than the C++ &lt;tt&gt;UpperCamelCase&lt;/tt&gt; convention.&lt;/p&gt;
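&lt;p&gt;When reading the C++ API documentation, it can help to guess the Ruby name of a method from its C++ name. The sketch below approximates the renaming seen in this article. The &lt;tt&gt;underscore&lt;/tt&gt; helper is hypothetical and is not the rule the binding generator actually applies, so names containing runs of capital letters may come out differently. Treat the result as a guess and confirm it against the live object.&lt;/p&gt;

```ruby
# Hypothetical helper: approximate the C++ UpperCamelCase to Ruby
# underscore_delimited renaming. It only mimics the pattern seen in this
# article and is not the binding generator's own rule.
def underscore(cpp_name)
  cpp_name
    .gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2') # split an acronym from a following word
    .gsub(/([a-z\d])([A-Z])/, '\1_\2')     # split lower-to-upper transitions
    .downcase
end

%w[NumAtoms AddHydrogens GetMolWt GetFormula].each do |cpp_name|
  puts "#{cpp_name} maps to #{underscore(cpp_name)}"
end
```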

&lt;h4&gt;SMARTS Matching&lt;/h4&gt;

&lt;p&gt;One of the most useful features of Open Babel is its SMARTS pattern matching capability. It can be accessed from OBRuby in four steps: instantiate an &lt;tt&gt;OBSmartsPattern&lt;/tt&gt;, pass the SMARTS pattern of interest to the instance's &lt;tt&gt;init&lt;/tt&gt; method, call &lt;tt&gt;match&lt;/tt&gt; on the molecule, and retrieve the hit set with &lt;tt&gt;get_umap_list&lt;/tt&gt;:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;openbabel&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;include&lt;/span&gt; &lt;span class="constant"&gt;OpenBabel&lt;/span&gt;

&lt;span class="ident"&gt;smi2mol&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;OBConversion&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;
&lt;span class="ident"&gt;smi2mol&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;set_in_format&lt;/span&gt;&lt;span class="punct"&gt;(&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;smi&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;)&lt;/span&gt;

&lt;span class="ident"&gt;mol&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;OBMol&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;
&lt;span class="ident"&gt;smiles&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;CC(C)CCCC(C)C1CCC2C1(CCC3C2CC=C4C3(CCC(C4)O)C)C&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt; &lt;span class="comment"&gt;# cholesterol, no chirality&lt;/span&gt;
&lt;span class="ident"&gt;smi2mol&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;read_string&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;mol&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;smiles&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt; 
&lt;span class="ident"&gt;mol&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;add_hydrogens&lt;/span&gt;

&lt;span class="ident"&gt;pattern&lt;/span&gt;&lt;span class="punct"&gt;=&lt;/span&gt;&lt;span class="constant"&gt;OBSmartsPattern&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;
&lt;span class="ident"&gt;smarts&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;C1CCCCC1&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="ident"&gt;pattern&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;init&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;smarts&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
&lt;span class="ident"&gt;pattern&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;match&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;mol&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
&lt;span class="ident"&gt;hits&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;pattern&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;get_umap_list&lt;/span&gt; &lt;span class="comment"&gt;# =&amp;gt; indicies of two cyclohexane rings&lt;/span&gt;

&lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;Found &lt;span class="expr"&gt;#{hits.size}&lt;/span&gt; instances of the SMARTS pattern '&lt;span class="expr"&gt;#{smarts}&lt;/span&gt;' in the SMILES string &lt;span class="expr"&gt;#{smiles}&lt;/span&gt;. Here are the atom indices:&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;
&lt;span class="ident"&gt;hits&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;each_with_index&lt;/span&gt; &lt;span class="keyword"&gt;do&lt;/span&gt; &lt;span class="punct"&gt;|&lt;/span&gt;&lt;span class="ident"&gt;hit&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;index&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt;
  &lt;span class="ident"&gt;print&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;Hit &lt;span class="expr"&gt;#{index}&lt;/span&gt;: [ &lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;

  &lt;span class="ident"&gt;hit&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;each&lt;/span&gt; &lt;span class="keyword"&gt;do&lt;/span&gt; &lt;span class="punct"&gt;|&lt;/span&gt;&lt;span class="ident"&gt;atom_index&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt;
    &lt;span class="ident"&gt;print&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;&lt;span class="expr"&gt;#{atom_index}&lt;/span&gt; &lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;

  &lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;]&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Notice the Rubyesque &lt;tt&gt;each_with_index&lt;/tt&gt; block that iterates over the elements in the hit set.&lt;/p&gt;

&lt;p&gt;Running the above code produces the following output:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
Found 2 instances of the SMARTS pattern 'C1CCCCC1' in the SMILES string CC(C)CCCC(C)C1CCC2C1(CCC3C2CC=C4C3(CCC(C4)O)C)C. Here are the atom indices:
Hit 0: [ 12 17 16 15 14 13 ]
Hit 1: [ 20 25 24 23 22 21 ]
&lt;/pre&gt;
&lt;/div&gt;
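&lt;p&gt;The nested &lt;tt&gt;print&lt;/tt&gt; calls can also be collapsed with &lt;tt&gt;join&lt;/tt&gt;. The sketch below uses a plain Ruby array holding the hit set printed above, so it runs without Open Babel. It assumes each hit behaves like a Ruby Array, which may require a conversion on the object that &lt;tt&gt;get_umap_list&lt;/tt&gt; actually returns.&lt;/p&gt;

```ruby
# Hit set copied from the output above; in a real script it would come
# from pattern.get_umap_list.
hits = [[12, 17, 16, 15, 14, 13], [20, 25, 24, 23, 22, 21]]

# Build one formatted line per hit, matching the layout of the original output.
lines = hits.each_with_index.map do |hit, index|
  "Hit #{index}: [ #{hit.join(' ')} ]"
end

puts lines
```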

&lt;h4&gt;Finding Your Way&lt;/h4&gt;

&lt;p&gt;Using a new library like OBRuby can take some getting used to. An excellent source of information is Open Babel's &lt;a href="http://openbabel.sourceforge.net/dev-api/classes.shtml"&gt;online API documentation&lt;/a&gt;. Another source is Ruby itself.&lt;/p&gt;

&lt;p&gt;For example, let's say you've instantiated an &lt;tt&gt;OBMol&lt;/tt&gt;, but can't remember the exact name of the method that counts the number of atoms. Just call &lt;tt&gt;methods.sort&lt;/tt&gt; on the instance:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;openbabel&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="ident"&gt;mol&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;OpenBabel&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;OBMol&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;

&lt;span class="ident"&gt;mol&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;methods&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;sort&lt;/span&gt; &lt;span class="comment"&gt;# =&amp;gt; see output below&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;When run from Interactive Ruby (irb), this code produces the following alphabetized list of methods, which I've truncated:&lt;/p&gt;

&lt;div class="console"&gt;
... "is_corrected_for_ph", "kekulize", "kind_of?", "method", "methods", "new_atom", "new_perceive_kekule_bonds", "new_residue", "next_atom", "next_bond", "next_conformer", "next_internal_coord", "next_residue", "nil?", &lt;strong&gt;"num_atoms"&lt;/strong&gt;, "num_bonds", "num_conformers", "num_edges", "num_hvy_atoms", "num_nodes", "num_residues", "num_rotors", "object_id", "perceive_bond_orders", "perceive_kekule_bonds", "private_methods", "protected_methods", "public_methods", "renumber_atoms", "reserve_atoms", "reset_visit_flags" ...
&lt;/div&gt;
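&lt;p&gt;Scanning the full list by eye gets tedious, so it can help to filter it with &lt;tt&gt;grep&lt;/tt&gt;. The sketch below uses a stand-in class named &lt;tt&gt;FakeMol&lt;/tt&gt; (hypothetical, defined here only so the code runs without Open Babel). With OBRuby installed, the same chain should apply to an &lt;tt&gt;OpenBabel::OBMol&lt;/tt&gt; instance.&lt;/p&gt;

```ruby
# Stand-in class so the sketch runs without Open Babel installed; with
# OBRuby you would call the same chain on OpenBabel::OBMol.new instead.
class FakeMol
  def num_atoms; 0; end
  def num_bonds; 0; end
  def kekulize; true; end
end

mol = FakeMol.new

# Keep only the methods whose names start with "num_", as sorted strings.
counters = mol.methods.grep(/\Anum_/).map { |name| name.to_s }.sort
puts counters
```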

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;OBRuby combines the dynamic programming language Ruby with the highly functional toolkit Open Babel. Further augmenting OBRuby's capabilities with the web application framework &lt;a href="http://www.rubyonrails.org/"&gt;Rails&lt;/a&gt; and/or the &lt;a href="http://depth-first.com/articles/2006/10/30/agile-chemical-informatics-development-with-cdk-and-ruby-rcdk-0-3-0"&gt;Ruby Chemistry Development Kit&lt;/a&gt; offers even more possibilities. Future articles will bring some of them to life.&lt;/p&gt;</description>
      <pubDate>Tue, 31 Oct 2006 14:20:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:09873e0e-eda9-4496-a1f1-28ab6d11930e</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2006/10/31/obruby-a-ruby-interface-to-open-babel</link>
      <category>Tools</category>
      <category>openbabel</category>
      <category>obruby</category>
      <category>ruby</category>
      <category>integration</category>
      <category>smiles</category>
      <category>smarts</category>
    </item>
  </channel>
</rss>
