<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/stylesheets/rss.css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>Depth-First: Tag commandline</title>
    <link>http://depth-first.com/articles/tag/commandline</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description>Walking the Web of Chemical Informatics</description>
    <item>
      <title>Fast Substructure Search Using Open Source Tools Part 4: Creating Fingerprints from Chemical Structures</title>
      <description>&lt;p&gt;&lt;a href="http://flickr.com/photos/adrenalin/4250667/"&gt;&lt;img src="http://depth-first.com/demo/20081015/falls.jpg" align="right"&gt;&lt;/img&gt;&lt;/a&gt;The previous articles in this series have detailed the steps needed to build a working fingerprint screening system using nothing more than the open source tools &lt;a href="http://www.mysql.com/"&gt;MySQL&lt;/a&gt;, &lt;a href="http://ruby-lang.org"&gt;Ruby&lt;/a&gt;, and &lt;a href="http://ar.rubyonrails.org/"&gt;ActiveRecord&lt;/a&gt;. With this system we can create, read, update, and destroy fingerprints in persistent storage. Although the system meets all of the requirements of a fingerprint screening system, it isn't a substructure search system - yet. For that, we need a way to convert chemical structure representations into fingerprints. This article describes a very simple method for doing so.&lt;/p&gt;

&lt;p&gt;All Articles in this Series:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="http://depth-first.com/articles/2008/10/02/fast-substructure-search-using-open-source-tools-part-1-fingerprints-and-databases"&gt;Part 1: Fingerprints and Databases&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://depth-first.com/articles/2008/10/03/fast-substructure-search-using-open-source-tools-part-2-fingerprint-screen-with-sql"&gt;Part 2: Fingerprint Screen With SQL&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://depth-first.com/articles/2008/10/06/fast-substructure-search-using-open-source-tools-part-3-a-crud-api-for-fingerprints-in-ruby"&gt;Part 3: A CRUD API for Fingerprints in Ruby&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Part 4: Creating Fingerprints from Chemical Structures&lt;/li&gt;
&lt;li&gt;&lt;a href="http://depth-first.com/articles/2008/10/21/fast-substructure-search-using-open-source-tools-part-5-relating-molecules-to-fingerprints-with-sql"&gt;Part 5: Relating Molecules to Fingerprints with SQL&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://depth-first.com/articles/2008/10/29/fast-substructure-search-using-open-source-tools-part-6-modelling-a-one-to-many-relationship-between-fingerprints-and-compounds-in-ruby"&gt;Part 6: Modelling a One-To-Many Relationship Between Fingerprints and Compounds in Ruby&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;A Ruby Fingerprinter in Eight Lines&lt;/h4&gt;

&lt;p&gt;Let's create a &lt;tt&gt;Fingerprinter&lt;/tt&gt; class that's capable of converting a SMILES string into a &lt;tt&gt;Fingerprint&lt;/tt&gt; that can be stored and queried. The Ruby code below makes use of Open Babel's &lt;a href="http://openbabel.org/wiki/Babel"&gt;&lt;tt&gt;babel&lt;/tt&gt;&lt;/a&gt; command-line utility:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;fingerprint&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="keyword"&gt;class &lt;/span&gt;&lt;span class="class"&gt;Fingerprinter&lt;/span&gt;  
  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;fingerprint_smiles&lt;/span&gt; &lt;span class="ident"&gt;smiles&lt;/span&gt;
    &lt;span class="ident"&gt;raw&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="punct"&gt;%x[&lt;/span&gt;&lt;span class="string"&gt;echo '&lt;span class="expr"&gt;#{smiles}&lt;/span&gt;' | babel -ismi -ofpt 2&amp;gt;/dev/null&lt;/span&gt;&lt;span class="punct"&gt;]&lt;/span&gt;
    &lt;span class="ident"&gt;bytes&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;raw&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;gsub&lt;/span&gt;&lt;span class="punct"&gt;(/&lt;/span&gt;&lt;span class="regex"&gt;&amp;gt;.*?&lt;span class="escape"&gt;\n&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;/,&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;&lt;/span&gt;&lt;span class="punct"&gt;').&lt;/span&gt;&lt;span class="ident"&gt;gsub&lt;/span&gt;&lt;span class="punct"&gt;(/&lt;/span&gt;&lt;span class="regex"&gt;&lt;span class="escape"&gt;\n&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;/,&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;&lt;/span&gt;&lt;span class="punct"&gt;').&lt;/span&gt;&lt;span class="ident"&gt;split&lt;/span&gt;

    &lt;span class="constant"&gt;Fingerprint&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;fill_bytes&lt;/span&gt;&lt;span class="punct"&gt;{|&lt;/span&gt;&lt;span class="ident"&gt;i&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;&lt;span class="expr"&gt;#{bytes[2*i]}#{bytes[2*i+1]}&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;.&lt;/span&gt;&lt;span class="ident"&gt;hex&lt;/span&gt;&lt;span class="punct"&gt;}&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This class takes advantage of Ruby's ability to interface directly with the command line through the &lt;tt&gt;%x&lt;/tt&gt; operator in a way similar to that previously described for the &lt;a href="http://depth-first.com/articles/2008/05/30/a-simple-and-portable-ruby-interface-to-inchi-part-2-silencing-console-output"&gt;cInChI command line tool&lt;/a&gt;. The &lt;tt&gt;babel&lt;/tt&gt; output is then converted into a form suitable for use with our &lt;a href="http://depth-first.com/articles/2008/10/06/fast-substructure-search-using-open-source-tools-part-3-a-crud-api-for-fingerprints-in-ruby"&gt;previously-defined&lt;/a&gt; &lt;tt&gt;Fingerprint&lt;/tt&gt; class.&lt;/p&gt;

&lt;p&gt;Although quite easy to implement, this approach may not work in every situation. For example, because the &lt;tt&gt;fingerprint_smiles&lt;/tt&gt; method interpolates its argument directly into a shell command, a malicious user could execute arbitrary shell commands by submitting a specially crafted string in place of a SMILES. Windows users may also need to adapt the code, which relies on &lt;tt&gt;echo&lt;/tt&gt;, a pipe, and &lt;tt&gt;/dev/null&lt;/tt&gt;. But for trusted SMILES on Unix machines, this implementation works well, and the same technique can be used in many different programming environments.&lt;/p&gt;
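
&lt;p&gt;One way to close the shell-injection hole is to bypass the shell altogether. The following is a minimal sketch, not part of the original &lt;tt&gt;Fingerprinter&lt;/tt&gt;: it uses &lt;tt&gt;Open3.capture3&lt;/tt&gt; from Ruby's standard library to pass the SMILES to &lt;tt&gt;babel&lt;/tt&gt; on standard input. The helper names &lt;tt&gt;babel_fpt&lt;/tt&gt; and &lt;tt&gt;parse_fpt&lt;/tt&gt; are hypothetical, and the parser assumes that fingerprint data lines contain only hexadecimal words and whitespace.&lt;/p&gt;

```ruby
require 'open3'

# Hypothetical helper: turn babel's fpt text output into 64-bit integers.
# Lines made up only of hex digits and whitespace are treated as fingerprint
# data; anything else (such as the header line) is skipped.
def parse_fpt(raw)
  data_lines = raw.lines.select { |line| line.match?(/\A[\h\s]+\z/) }
  words = data_lines.join(' ').split
  # Pair neighbouring 32-bit hex words, as the original block does.
  words.each_slice(2).map { |pair| pair.join.hex }
end

# Hypothetical replacement for the %x call: babel is started with an
# argument list and the SMILES arrives on standard input, so no shell
# ever parses the user-supplied string.
def babel_fpt(smiles)
  out, _err, _status = Open3.capture3('babel', '-ismi', '-ofpt', stdin_data: "#{smiles}\n")
  out
end
```

&lt;p&gt;The integers returned by &lt;tt&gt;parse_fpt(babel_fpt('c1ccccc1'))&lt;/tt&gt; could then be supplied to &lt;tt&gt;fill_bytes&lt;/tt&gt; through its block. Standard error is captured separately and discarded here, which plays the role of the &lt;tt&gt;/dev/null&lt;/tt&gt; redirect in the original.&lt;/p&gt;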

&lt;h4&gt;Testing the Fingerprinter&lt;/h4&gt;

&lt;p&gt;We can test the &lt;tt&gt;Fingerprinter&lt;/tt&gt; through interactive Ruby (irb):&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ irb
irb(main):001:0&amp;gt; require 'lib/fingerprinter'
=&amp;gt; true
irb(main):002:0&amp;gt; fp=Fingerprinter.new
=&amp;gt; #&amp;lt;Fingerprinter:0xb7498038&amp;gt;
irb(main):003:0&amp;gt; f=fp.fingerprint_smiles 'c1ccccc1'
=&amp;gt; #&amp;lt;Fingerprint id: nil, byte0: 0, byte1: 512, byte2: 0, byte3: 0, byte4: 2112, byte5: 32768, byte6: 0, byte7: 0, byte8: 0, byte9: 0, byte10: 134217728, byte11: 0, byte12: 0, byte13: 0, byte14: 131072, byte15: 0, hex: nil&amp;gt;
irb(main):004:0&amp;gt; f.cardinality
=&amp;gt; 6
irb(main):005:0&amp;gt; f.bitstring
=&amp;gt; "0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000100000000000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001000000000000000000000000000000000000000000000000000000000000000000000000000000000"
&lt;/pre&gt;
&lt;/div&gt;
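
&lt;p&gt;The &lt;tt&gt;cardinality&lt;/tt&gt; of 6 reported above can be checked by hand. The sketch below assumes that each &lt;tt&gt;byteN&lt;/tt&gt; attribute holds one word of the fingerprint as a plain integer, which is what the sixteen values in the session suggest; it does not use the &lt;tt&gt;Fingerprint&lt;/tt&gt; class itself.&lt;/p&gt;

```ruby
# The sixteen byteN values printed by irb for benzene, copied by hand.
words = [0, 512, 0, 0, 2112, 32768, 0, 0, 0, 0, 134217728, 0, 0, 0, 131072, 0]

# Count the 1 bits in each word by inspecting its binary representation.
bits_per_word = words.map { |word| word.to_s(2).count('1') }
cardinality = bits_per_word.sum

puts cardinality
```

&lt;p&gt;Five of the sixteen words are non-zero, and 2112 (2048 + 64) contributes two bits, giving the same total of 6 that &lt;tt&gt;f.cardinality&lt;/tt&gt; returns.&lt;/p&gt;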

&lt;p&gt;As we previously saw, any &lt;tt&gt;Fingerprint&lt;/tt&gt; we create can be stored in, and later retrieved from, a MySQL database. If we've already stored the fingerprint for benzene, it can be found with the following:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ irb
irb(main):001:0&amp;gt; require 'lib/fingerprinter'
=&amp;gt; true
irb(main):002:0&amp;gt; fp=Fingerprinter.new
=&amp;gt; #&amp;lt;Fingerprinter:0xb74ae284&amp;gt;
irb(main):003:0&amp;gt; f=fp.fingerprint_smiles 'c1ccccc1'
=&amp;gt; #&amp;lt;Fingerprint id: nil, byte0: 0, byte1: 512, byte2: 0, byte3: 0, byte4: 2112, byte5: 32768, byte6: 0, byte7: 0, byte8: 0, byte9: 0, byte10: 134217728, byte11: 0, byte12: 0, byte13: 0, byte14: 131072, byte15: 0, hex: nil&amp;gt;
irb(main):004:0&amp;gt; Fingerprint.find_by_fingerprint f
=&amp;gt; #&amp;lt;Fingerprint id: 12687, byte0: 0, byte1: 512, byte2: 0, byte3: 0, byte4: 2112, byte5: 32768, byte6: 0, byte7: 0, byte8: 0, byte9: 0, byte10: 134217728, byte11: 0, byte12: 0, byte13: 0, byte14: 131072, byte15: 0, hex: "000000000000000000000000000002000000000000000000000..."&amp;gt;
&lt;/pre&gt;
&lt;/div&gt;

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;We now have the ability to create, store, and query fingerprints generated from arbitrary SMILES strings. If there were a 1:1 relationship between molecules and fingerprints, we'd be nearly done. But things are not quite that simple. The next article in this series will show how to relate molecules to fingerprints.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Image Credit: &lt;a href="http://flickr.com/photos/adrenalin/"&gt;adrenalin&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;</description>
      <pubDate>Wed, 15 Oct 2008 14:42:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:ad16d97d-9183-4e25-8b88-26a28ffdca48</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2008/10/15/fast-substructure-search-using-open-source-tools-part-4-creating-fingerprints-from-chemical-structures</link>
      <category>Tools</category>
      <category>ruby</category>
      <category>activerecord</category>
      <category>openbabel</category>
      <category>commandline</category>
      <category>fingerprint</category>
      <category>database</category>
      <category>substructuresearch</category>
      <category>query</category>
    </item>
    <item>
      <title>Recombining Compressed PubChem SD Files with Open Babel</title>
      <description>&lt;p&gt;&lt;a href="http://openbabel.org"&gt;&lt;img src="http://depth-first.com/files/Babel256.png" align="right"&gt;&lt;/img&gt;&lt;/a&gt;While testing &lt;a href="http://metamolecular.com/chemphoto"&gt;ChemPhoto&lt;/a&gt;, it became necessary to test the &lt;a href="http://depth-first.com/articles/2008/09/08/smarter-cheminformatics-from-sd-file-to-image-collection-with-chemphoto"&gt;chemical structure imaging application&lt;/a&gt; with SD Files containing several hundred thousand records. Although it's tempting to meet this need by constructing "dummy" files with the same record or small set of records repeated, tests are always far more illuminating when real data is used.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://pubchem.ncbi.nlm.nih.gov/"&gt;PubChem&lt;/a&gt; is an excellent source of large molecular datasets, and the entire database can be &lt;a href="http://depth-first.com/articles/2006/09/29/hacking-pubchem-direct-access-with-ftp"&gt;downloaded by FTP&lt;/a&gt;. Because of PubChem's massive size, the download is split into gzipped SD Files (*.sdf.gz) of about 25,000 records each. Although this is an excellent resource, it creates a problem: how can you conveniently recombine this set of compressed SD Files into a single SD File?&lt;/p&gt;

&lt;p&gt;You might think about writing some "quick" code in your language of choice. Fortunately, &lt;a href="http://openbabel.org"&gt;Open Babel&lt;/a&gt; gets the job done - without any of the coding or debugging.&lt;/p&gt;

&lt;p&gt;The following command will create a single SD File from all of the compressed SD Files in a given directory, while also stripping explicit hydrogens and removing all fields except PUBCHEM_COMPOUND_CID.&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
babel *.sdf.gz pubchem.sdf -d --delete PUBCHEM_COMPOUND_CANONICALIZED,PUBCHEM_CACTVS_COMPLEXITY,PUBCHEM_CACTVS_HBOND_ACCEPTOR,PUBCHEM_CACTVS_HBOND_DONOR,PUBCHEM_CACTVS_ROTATABLE_BOND,PUBCHEM_CACTVS_SUBSKEYS,PUBCHEM_IUPAC_OPENEYE_NAME,PUBCHEM_IUPAC_CAS_NAME,PUBCHEM_IUPAC_NAME,PUBCHEM_IUPAC_SYSTEMATIC_NAME,PUBCHEM_IUPAC_TRADITIONAL_NAME,PUBCHEM_NIST_INCHI,PUBCHEM_EXACT_MASS,PUBCHEM_MOLECULAR_FORMULA,PUBCHEM_MOLECULAR_WEIGHT,PUBCHEM_OPENEYE_CAN_SMILES,PUBCHEM_OPENEYE_ISO_SMILES,PUBCHEM_CACTVS_TPSA,PUBCHEM_MONOISOTOPIC_WEIGHT,PUBCHEM_TOTAL_CHARGE,PUBCHEM_HEAVY_ATOM_COUNT,PUBCHEM_ATOM_DEF_STEREO_COUNT,PUBCHEM_ATOM_UDEF_STEREO_COUNT,PUBCHEM_BOND_DEF_STEREO_COUNT,PUBCHEM_BOND_UDEF_STEREO_COUNT,PUBCHEM_ISOTOPIC_ATOM_COUNT,PUBCHEM_COMPONENT_COUNT,PUBCHEM_CACTVS_TAUTO_COUNT,PUBCHEM_BONDANNOTATIONS,PUBCHEM_CACTVS_XLOGP

865543 molecules converted
7 info messages 15372962 audit log messages 
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Apparently, there is no way to tell &lt;tt&gt;babel&lt;/tt&gt; to &lt;em&gt;keep&lt;/em&gt; just one particular field in an SD File; every unwanted field has to be named individually in the &lt;tt&gt;--delete&lt;/tt&gt; list.&lt;/p&gt;
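
&lt;p&gt;If typing out the list by hand is unappealing, the &lt;tt&gt;--delete&lt;/tt&gt; argument can be generated. The Ruby sketch below is only an illustration: &lt;tt&gt;all_fields&lt;/tt&gt; is a shortened sample of the field names mentioned in this article, and the variable names are hypothetical. It prints the command rather than running it.&lt;/p&gt;

```ruby
# Shortened sample of PubChem field names taken from this article.
all_fields = %w[
  PUBCHEM_COMPOUND_CID
  PUBCHEM_IUPAC_NAME
  PUBCHEM_EXACT_MASS
  PUBCHEM_MOLECULAR_FORMULA
]

# The fields that should survive in the combined SD File.
keep = %w[PUBCHEM_COMPOUND_CID]

# Everything that is not kept goes into the comma-separated --delete argument.
delete_arg = (all_fields - keep).join(',')

puts "babel *.sdf.gz pubchem.sdf -d --delete #{delete_arg}"
```

&lt;p&gt;With a complete list of field names in &lt;tt&gt;all_fields&lt;/tt&gt;, changing which fields are kept only requires editing &lt;tt&gt;keep&lt;/tt&gt;.&lt;/p&gt;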

&lt;p&gt;Still, not bad for a few seconds on the command line.&lt;/p&gt;</description>
      <pubDate>Wed, 01 Oct 2008 01:25:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:725a5f70-77e1-4aee-a79d-e7fb9f7c3401</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2008/10/01/recombining-compressed-pubchem-sd-files-with-open-babel</link>
      <category>Tools</category>
      <category>openbabel</category>
      <category>sdfile</category>
      <category>pubchem</category>
      <category>sdfgz</category>
      <category>commandline</category>
    </item>
    <item>
      <title>Rethinking the Command Line for Chemistry</title>
      <description>&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20070327/yubnub.png" align="center"&gt;&lt;/img&gt;&lt;/center&gt;
&lt;br /&gt;&lt;br /&gt;
A &lt;a href="http://depth-first.com/articles/2007/03/15/do-you-use-the-command-line"&gt;recent article&lt;/a&gt; discussed the renaissance of the command line. Particularly on the Web, command line interfaces have become so advanced that most of us don't even realize we're using them. Consider the Google search box, which is one of the most powerful command line interfaces ever developed.&lt;/p&gt;

&lt;p&gt;A service called &lt;a href="http://yubnub.org/"&gt;YubNub&lt;/a&gt; takes this idea one step further. YubNub is a meta command line interface for the Web. The following YubNub command will do a &lt;a href="http://flickr.com"&gt;Flickr&lt;/a&gt; search for benzene.&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20070327/ducatisearch.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;If this were all YubNub did, it would be merely interesting. What makes YubNub remarkable is that you can create your own commands that other people can use. I recently added the "ginchi" command to query Google for an InChI. Now you can try it out:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20070327/benzenesearch1.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;By itself, this isn't particularly useful because you can just go to Google and query the InChI directly. However, it's not too hard to imagine several commands like &lt;tt&gt;ginchi&lt;/tt&gt; that could be added. Some would use Google; others would use different services. How about something that searches Mitch Garcia's &lt;a href="http://www.sciencebase.com/science-blog/chemical-pipe-works.html"&gt;chemistry journal Yahoo pipe&lt;/a&gt;? It would be very convenient to have all of those commands accessible from the same Web page.&lt;/p&gt;

&lt;p&gt;Command line interfaces can be phenomenally useful for both beginning and advanced users. The hardest part to get right is not what the user sees as they type, but what happens after they hit the enter key.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://depth-first.com/articles/tag/linenotation"&gt;Line notations&lt;/a&gt; are the perfect match for command line interfaces. The widespread use of SMILES and the precision of InChI offer many possibilities for innovative chemistry Web services.&lt;/p&gt;</description>
      <pubDate>Tue, 27 Mar 2007 12:30:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:8e30edda-82a2-4800-95cf-0c34b669a056</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/03/27/rethinking-the-command-line-for-chemistry</link>
      <category>Tools</category>
      <category>commandline</category>
      <category>linenotation</category>
      <category>yubnub</category>
      <category>web20</category>
      <category>ginchi</category>
    </item>
    <item>
      <title>Customize InChI Output with Rino</title>
      <description>&lt;p&gt;&lt;a href="http://rubyforge.org/projects/rino"&gt;Rino&lt;/a&gt; is a toolkit for working with the &lt;a href="http://en.wikipedia.org/wiki/International_Chemical_Identifier"&gt;IUPAC International Chemical Identifier&lt;/a&gt; (InChI) in Ruby. Because it's based on the IUPAC/NIST InChI toolkit, Rino can be configured using a variety of useful options. This article summarizes those options and provides an illustrative example.&lt;/p&gt;

&lt;h4&gt;Complete List of InChI Command Line Options&lt;/h4&gt;

&lt;p&gt;The following is a complete summary of the IUPAC/NIST InChI toolkit command line options:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;SNon&lt;/strong&gt; Exclude stereo (Default: Include Absolute stereo)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;SRel&lt;/strong&gt; Relative stereo&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;SRac&lt;/strong&gt; Racemic stereo&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;SUCF&lt;/strong&gt; Use Chiral Flag: On means Absolute stereo, Off - Relative&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;SUU&lt;/strong&gt; Include omitted unknown/undefined stereo&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;NEWPS&lt;/strong&gt; Narrow end of wedge points to stereocenter (default: both)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;SPXYZ&lt;/strong&gt; Include Phosphines Stereochemistry&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;SAsXYZ&lt;/strong&gt; Include Arsines Stereochemistry&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;RecMet&lt;/strong&gt; Include reconnected metals results&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;FixedH&lt;/strong&gt; Mobile H Perception Off (Default: On)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;AuxNone&lt;/strong&gt; Omit auxiliary information (default: Include)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;NoADP&lt;/strong&gt; Disable Aggressive Deprotonation (for testing only)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Compress&lt;/strong&gt; Compressed output&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;DoNotAddH&lt;/strong&gt; Don't add H according to usual valences: all H are explicit&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Wnumber&lt;/strong&gt; Set time-out per structure in seconds; W0 means unlimited&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;SDF:DataHeader&lt;/strong&gt; Read from the input SDfile the ID under this DataHeader&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;NoLabels&lt;/strong&gt; Omit structure number, DataHeader and ID from InChI output&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Tabbed&lt;/strong&gt; Separate structure number, InChI, and AuxInfo with tabs&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;OutputSDF&lt;/strong&gt; Convert InChI created with default aux. info to SDfile&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;InChI2InChI&lt;/strong&gt; Convert InChI string into InChI string for validation purposes&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;SdfAtomsDT&lt;/strong&gt; Output Hydrogen Isotopes to SDfile as Atoms D and T&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;STDIO&lt;/strong&gt; Use standard input/output streams&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;FB&lt;/strong&gt; (or FixSp3Bug) Fix bug leading to missing or undefined sp3 parity&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;WarnOnEmptyStructure&lt;/strong&gt; Warn and produce empty InChI for empty structure&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;A Test&lt;/h4&gt;

&lt;p&gt;The following code displays the InChI for benzoic acid with and without mobile hydrogen atom perception. It requires both &lt;a href="http://depth-first.com/articles/tag/rino"&gt;Rino&lt;/a&gt; and &lt;a href="http://depth-first.com/articles/tag/rcdk"&gt;Ruby CDK&lt;/a&gt;. The latter library is used to convert a SMILES string into a molfile for use by Rino.&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rubygems&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require_gem&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require_gem&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rino&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rcdk/util&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="ident"&gt;molfile&lt;/span&gt;&lt;span class="punct"&gt;=&lt;/span&gt;&lt;span class="constant"&gt;RCDK&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Util&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Lang&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;smiles_to_molfile&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;c1ccccc1C(=O)O&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt; &lt;span class="comment"&gt;# benzoic acid&lt;/span&gt;
&lt;span class="ident"&gt;reader&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;Rino&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;MolfileReader&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;
&lt;span class="ident"&gt;inchi&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;reader&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;read&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;molfile&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;

&lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;Default (mobile hydrogen perception on):&lt;span class="escape"&gt;\n&lt;/span&gt;&lt;span class="expr"&gt;#{inchi}&lt;/span&gt;&lt;span class="escape"&gt;\n\n&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;

&lt;span class="ident"&gt;reader&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;options&lt;/span&gt; &lt;span class="punct"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;-FixedH&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;inchi&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;reader&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;read&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;molfile&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;

&lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;With -FixedH (mobile hydrogen perception off):&lt;span class="escape"&gt;\n&lt;/span&gt;&lt;span class="expr"&gt;#{inchi}&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;As the option list above notes, the &lt;tt&gt;-FixedH&lt;/tt&gt; flag used by the reader the second time turns mobile hydrogen perception off, which appends a fixed-hydrogen (&lt;tt&gt;/f&lt;/tt&gt;) layer to the InChI output. Some InChI authors use this form of InChI and others don't. PubChem is an example of a large InChI author that does use it, as their entry for &lt;a href="http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=243"&gt;benzoic acid&lt;/a&gt; demonstrates. To perform an exact match of your InChIs with theirs, the &lt;tt&gt;-FixedH&lt;/tt&gt; flag must be set.&lt;/p&gt;

&lt;h4&gt;Running the Test&lt;/h4&gt;

&lt;p&gt;Running the test code produces the following output:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
Default (mobile hydrogen perception on):
InChI=1/C7H6O2/c8-7(9)6-4-2-1-3-5-6/h1-5H,(H,8,9)

With -FixedH (mobile hydrogen perception off):
InChI=1/C7H6O2/c8-7(9)6-4-2-1-3-5-6/h1-5H,(H,8,9)/f/h8H
&lt;/pre&gt;
&lt;/div&gt;
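
&lt;p&gt;The two InChIs in the output above differ only in the trailing &lt;tt&gt;/f&lt;/tt&gt; layer. When an InChI from another author cannot be regenerated with matching options, a rough fallback is to compare only the part before that layer. The sketch below is a naive string operation rather than a Rino feature, and the helper name &lt;tt&gt;mobile_h_part&lt;/tt&gt; is hypothetical; it assumes that the first occurrence of &lt;tt&gt;/f&lt;/tt&gt; marks the start of the fixed-hydrogen layer.&lt;/p&gt;

```ruby
# Hypothetical helper: drop everything from the first "/f" onwards.
def mobile_h_part(inchi)
  inchi.split('/f', 2).first
end

# The two InChIs for benzoic acid printed by the test.
default_inchi = 'InChI=1/C7H6O2/c8-7(9)6-4-2-1-3-5-6/h1-5H,(H,8,9)'
fixed_h_inchi = 'InChI=1/C7H6O2/c8-7(9)6-4-2-1-3-5-6/h1-5H,(H,8,9)/f/h8H'

# The strings differ as written but agree once the trailing layer is removed.
puts default_inchi == fixed_h_inchi
puts default_inchi == mobile_h_part(fixed_h_inchi)
```

&lt;p&gt;This comparison discards information, so regenerating the InChI with the other author's options remains the more reliable route.&lt;/p&gt;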

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;When matching InChIs generated by other authors, it's best to adopt their processing conventions. Rino makes it convenient to do so through its full support for the standard IUPAC/NIST command line options.&lt;/p&gt;</description>
      <pubDate>Mon, 19 Mar 2007 10:30:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:1975ab63-5e0e-4ef7-9227-46b1fb0f0939</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/03/19/customize-inchi-output-with-rino</link>
      <category>Tools</category>
      <category>rino</category>
      <category>inchi</category>
      <category>pubchem</category>
      <category>commandline</category>
    </item>
    <item>
      <title>Do You Use the Command Line?</title>
      <description>&lt;blockquote&gt;
    &lt;p&gt;&lt;a href="http://flickr.com/photos/bartholomule/65432558/"&gt;&lt;img src="http://depth-first.com/demo/20070315/keyboard.jpg" align="right" border="0"&gt;&lt;/img&gt;&lt;/a&gt;In the run to abandon command line interfaces for the GUI, we've left behind the versatility of language.&lt;/p&gt;
    
    &lt;p&gt;...&lt;/p&gt;
    
    &lt;p&gt;[Imagine] using a drop-down menu to select the one web site you want to go to out of the 100 million web sites in existence. Ludicrous! How do we actually surf to a site? By typing an address into the address bar. When we want to go to the mail "application", we type in "gmail.com"; when we want to open a news "application", we type in "nytimes.com". On the old unix command lines, we would type "pine" and "rn". See a similarity? The address bar is just a primitive command line. A command line that your grandmother can&#8212;and does&#8212;use.&lt;/p&gt;
    
    &lt;p&gt;-&lt;cite&gt;Aza Raskin, &lt;a href="http://www.humanized.com/weblog/2007/02/24/your_grandmothers_command_line_the_command_line_co/"&gt;Get Humanized&lt;/a&gt;&lt;/cite&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The command line is alive and well. It's simply become so sophisticated that most of us don't realize we're using it. Whether we're entering a URL into a browser address bar, taking advantage of autocomplete to look up a co-worker's name in an address book, or using Google to search the Web, the command line is hard at work. Most people wouldn't want it any other way.&lt;/p&gt;

&lt;p&gt;To an end user, a command line is nothing more than a box to enter text. The magic happens when this text is processed. Aza Raskin's company &lt;a href="http://humanized.com"&gt;Humanized&lt;/a&gt; uses this simple idea to build text-driven applications that save time and effort. &lt;/p&gt;

&lt;p&gt;What would happen if the same thinking were applied to chemical informatics?&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Image credit: &lt;a href="http://flickr.com/photos/bartholomule/"&gt;Bartholomule&lt;/a&gt; - &lt;a href="http://flickr.com"&gt;Flickr&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;</description>
      <pubDate>Thu, 15 Mar 2007 11:18:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:5ae758c1-3e43-42fb-81ad-7aca498bea13</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/03/15/do-you-use-the-command-line</link>
      <category>Meta</category>
      <category>humanized</category>
      <category>commandline</category>
      <category>text</category>
    </item>
  </channel>
</rss>
