Customize InChI Output with Rino
Rino is a toolkit for working with the IUPAC International Chemical Identifier (InChI) in Ruby. Because it's based on the IUPAC/NIST InChI toolkit, Rino can be configured using a variety of useful options. This article summarizes those options and provides an illustrative example.
Complete List of InChI Command Line Options
The following is a complete summary of the IUPAC/NIST InChI toolkit command line options:
- SNon Exclude stereo (Default: Include Absolute stereo)
- SRel Relative stereo
- SRac Racemic stereo
- SUCF Use Chiral Flag: On means Absolute stereo, Off - Relative
- SUU Include omitted unknown/undefined stereo
- NEWPS Narrow end of wedge points to stereocenter (default: both)
- SPXYZ Include Phosphines Stereochemistry
- SAsXYZ Include Arsines Stereochemistry
- RecMet Include reconnected metals results
- FixedH Mobile H Perception Off (Default: On)
- AuxNone Omit auxiliary information (default: Include)
- NoADP Disable Aggressive Deprotonation (for testing only)
- Compress Compressed output
- DoNotAddH Don't add H according to usual valences: all H are explicit
- Wnumber Set time-out per structure in seconds; W0 means unlimited
- SDF:DataHeader Read from the input SDfile the ID under this DataHeader
- NoLabels Omit structure number, DataHeader and ID from InChI output
- Tabbed Separate structure number, InChI, and AuxInfo with tabs
- OutputSDF Convert InChI created with default aux. info to SDfile
- InChI2InChI Convert InChI string into InChI string for validation purposes
- SdfAtomsDT Output Hydrogen Isotopes to SDfile as Atoms D and T
- STDIO Use standard input/output streams
- FB (or FixSp3Bug) Fix bug leading to missing or undefined sp3 parity
- WarnOnEmptyStructure Warn and produce empty InChI for empty structure
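Because these options are passed around as plain strings, a misspelled name is easy to miss. The following is a minimal sketch of a guard against that, assuming the dash-prefixed form used elsewhere in this article. The names INCHI_OPTIONS and inchi_flag are hypothetical and not part of Rino or the InChI toolkit. The sketch covers only the fixed-name options and the numeric W time-out, and does not handle SDF:DataHeader.

```ruby
# Hypothetical helper, not part of Rino: checks an option name against the
# fixed-name options listed above and returns it with a leading dash.
INCHI_OPTIONS = %w[
  SNon SRel SRac SUCF SUU NEWPS SPXYZ SAsXYZ RecMet FixedH AuxNone NoADP
  Compress DoNotAddH NoLabels Tabbed OutputSDF InChI2InChI SdfAtomsDT
  STDIO FB FixSp3Bug WarnOnEmptyStructure
].freeze

def inchi_flag(name)
  # Accept the name with or without its leading dash.
  bare = name.to_s.sub(/\A-/, '')
  # Wnumber carries a numeric argument (for example W0 or W30).
  return "-#{bare}" if bare =~ /\AW\d+\z/
  unless INCHI_OPTIONS.include?(bare)
    raise ArgumentError, "unknown InChI option: #{name}"
  end
  "-#{bare}"
end

flags = %w[FixedH RecMet W30].map { |name| inchi_flag(name) }
puts flags.join(' ')
```

The resulting strings could then be appended one at a time to a reader's options, in the same way the test in this article appends '-FixedH'.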
A Test
The following code displays the InChI for benzoic acid with and without mobile hydrogen atom perception. It requires both Rino and Ruby CDK. The latter library is used to convert a SMILES string into a molfile for use by Rino.
require 'rubygems'
require_gem 'rcdk'
require_gem 'rino'
require 'rcdk/util'
molfile = RCDK::Util::Lang.smiles_to_molfile('c1ccccc1C(=O)O') # benzoic acid
reader = Rino::MolfileReader.new
inchi = reader.read(molfile)
puts "With mobile hydrogen perception (default):\n#{inchi}\n\n"
reader.options << '-FixedH'
inchi = reader.read(molfile)
puts "Without mobile hydrogen perception (-FixedH):\n#{inchi}"
The -FixedH flag used by the reader the second time switches mobile hydrogen perception off, as the option list above describes. Rino then appends a fixed-hydrogen layer (/f) that assigns the acidic hydrogen to a specific oxygen atom. Some InChI authors use this form of InChI and others don't. PubChem is an example of a large InChI author that does include the fixed-hydrogen layer, as their entry for benzoic acid demonstrates. To perform an exact match of your InChIs with theirs, the -FixedH flag must be set.
Running the Test
Running the test code produces the following output:
With mobile hydrogen perception (default):
InChI=1/C7H6O2/c8-7(9)6-4-2-1-3-5-6/h1-5H,(H,8,9)
Without mobile hydrogen perception (-FixedH):
InChI=1/C7H6O2/c8-7(9)6-4-2-1-3-5-6/h1-5H,(H,8,9)/f/h8H
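The two strings above differ only in the trailing /f layer, which is why an exact string comparison between them fails. The following sketch makes that difference explicit by splitting an InChI at its fixed-hydrogen layer. The helper name split_fixed_h is hypothetical, and the sketch assumes there is no reconnected (/r) layer after the /f layer.

```ruby
# Hypothetical helper: splits an InChI string at its fixed-hydrogen layer.
# Returns [main_part, fixed_h_part]; fixed_h_part is nil when there is no
# /f layer. Assumes no reconnected (/r) layer follows the /f layer.
def split_fixed_h(inchi)
  main, fixed = inchi.split('/f', 2)
  [main, fixed]
end

# The two InChIs for benzoic acid taken from the output above.
default_inchi = 'InChI=1/C7H6O2/c8-7(9)6-4-2-1-3-5-6/h1-5H,(H,8,9)'
fixed_h_inchi = 'InChI=1/C7H6O2/c8-7(9)6-4-2-1-3-5-6/h1-5H,(H,8,9)/f/h8H'

main, layer = split_fixed_h(fixed_h_inchi)
puts default_inchi == fixed_h_inchi  # false: the full strings differ
puts main == default_inchi           # true: they agree up to the /f layer
puts layer.inspect                   # "/h8H"
```

Comparing only the part before /f is a looser match than comparing full strings, so it suits cases where the two sources are known to differ only in their use of -FixedH.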
Conclusions
When matching InChIs generated by other authors, it's best to adopt their processing conventions. Rino makes it convenient to do so through its full support for the standard IUPAC/NIST command line options.