<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/stylesheets/rss.css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>Depth-First: Tag xml</title>
    <link>http://depth-first.com/articles/tag/xml</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description>Walking the Web of Chemical Informatics</description>
    <item>
      <title>The Best API May Be No API At All: PubChem and PDB</title>
      <description>&lt;p&gt;&lt;a href="http://flickr.com/photos/druclimb/325661568/"&gt;&lt;img src="http://depth-first.com/demo/20070813/invisible.jpg" align="right" border="0"&gt;&lt;/img&gt;&lt;/a&gt;Both &lt;a href="http://pubchem.ncbi.nlm.nih.gov/"&gt;PubChem&lt;/a&gt; and the &lt;a href="http://www.rcsb.org/pdb/home/home.do"&gt;Protein Data Bank&lt;/a&gt; (PDB) maintain vast collections of molecular data. Individual users are free to view and search these collections via standard Web browsers. But what are the options if you're developing software to interact with these databases?&lt;/p&gt;

&lt;p&gt;Various application programming interfaces (APIs) are available for accessing PubChem and PDB records. For example, PubChem recently introduced its &lt;a href="http://depth-first.com/articles/tag/pug"&gt;Power User Gateway&lt;/a&gt; (PUG), an XML-based query language. But writing APIs is extremely difficult; reconciling the need for simplicity with the need for rich functionality is a tough balancing act. Where do you draw the line?&lt;/p&gt;

&lt;p&gt;Recently, &lt;a href="http://boscoh.com/"&gt;Bosco&lt;/a&gt; described a &lt;a href="http://boscoh.com/protein/fetching-pdb-files-remotely-in-pure-python-code"&gt;remarkably short method&lt;/a&gt; to retrieve PDB records using nothing more than standard Python. Given the similarities between Python and Ruby, it seemed reasonable that his method could be adapted to Ruby.&lt;/p&gt;

&lt;p&gt;The following Ruby library accepts a PDB identifier and returns the corresponding PDB record:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;net/http&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="keyword"&gt;module &lt;/span&gt;&lt;span class="module"&gt;PDB&lt;/span&gt;
  &lt;span class="comment"&gt;# Returns a PDB record for the given id&lt;/span&gt;
  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;self.get_record&lt;/span&gt; &lt;span class="ident"&gt;id&lt;/span&gt;
    &lt;span class="constant"&gt;Net&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;HTTP&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;get_response&lt;/span&gt;&lt;span class="punct"&gt;('&lt;/span&gt;&lt;span class="string"&gt;www.rcsb.org&lt;/span&gt;&lt;span class="punct"&gt;',&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;/pdb/files/&lt;span class="expr"&gt;#{id}&lt;/span&gt;.pdb&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;).&lt;/span&gt;&lt;span class="ident"&gt;body&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Notice how the business end of this library is nothing more than a single line of Ruby code.&lt;/p&gt;
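
&lt;p&gt;One caveat: the one-liner returns whatever body the server sends, even when the response is not a success (for example, a 404 for an unknown identifier). A minimal sketch of a guard, assuming you would rather raise than hand back an error page; &lt;tt&gt;body_or_raise&lt;/tt&gt; is a hypothetical helper name, not part of the library above:&lt;/p&gt;

```ruby
require 'net/http'

# Hypothetical helper (not from the original library): returns the body
# of a successful (2xx) response and raises for anything else.
def body_or_raise(response)
  return response.body if response.is_a?(Net::HTTPSuccess)

  raise "request failed: #{response.code} #{response.message}"
end
```

&lt;p&gt;Inside &lt;tt&gt;PDB.get_record&lt;/tt&gt;, this would wrap the value returned by &lt;tt&gt;Net::HTTP.get_response&lt;/tt&gt; in place of calling &lt;tt&gt;body&lt;/tt&gt; directly.&lt;/p&gt;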

&lt;p&gt;The library can be tested by saving it in a file called &lt;strong&gt;pdb.rb&lt;/strong&gt; and invoking interactive Ruby (irb):&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ irb
irb(main):001:0&gt; require './pdb'
=&gt; true
irb(main):002:0&gt; puts PDB::get_record('1hpn')
HEADER    GLYCOSAMINOGLYCAN                       17-JAN-95   1HPN
TITLE     N.M.R. AND MOLECULAR-MODELLING STUDIES OF THE SOLUTION
TITLE    2 CONFORMATION OF HEPARIN

[truncated]
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Several months ago, a D-F article described a related, but somewhat lengthier approach to &lt;a href="http://depth-first.com/articles/2006/08/30/hacking-pubchem-with-ruby"&gt;retrieving PubChem molfiles&lt;/a&gt;. Using the same approach we used for PDB, we can create the world's shortest PubChem library:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;net/http&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="keyword"&gt;module &lt;/span&gt;&lt;span class="module"&gt;PubChem&lt;/span&gt;
  &lt;span class="comment"&gt;# Returns a molfile for the given PubChem CID&lt;/span&gt;
  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;self.get_molfile&lt;/span&gt; &lt;span class="ident"&gt;cid&lt;/span&gt;
    &lt;span class="constant"&gt;Net&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;HTTP&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;get_response&lt;/span&gt;&lt;span class="punct"&gt;('&lt;/span&gt;&lt;span class="string"&gt;pubchem.ncbi.nlm.nih.gov&lt;/span&gt;&lt;span class="punct"&gt;',&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;/summary/summary.cgi?cid=&lt;span class="expr"&gt;#{cid}&lt;/span&gt;&amp;amp;disopt=DisplaySDF&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;).&lt;/span&gt;&lt;span class="ident"&gt;body&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt; &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This library can be tested by saving it in a file called &lt;strong&gt;pubchem.rb&lt;/strong&gt; and then running irb:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ irb
irb(main):001:0&gt; require './pubchem'
=&gt; true
irb(main):002:0&gt; puts PubChem::get_molfile('969472') #eszopiclone (Lunesta)
969472
  -OEChem-08130700422D

 44 47  0     1  0  0  0  0  0999 V2000
    9.2619   -2.2732    0.0000 Cl  0  0  0  0  0  0  0  0  0  0  0  0

[truncated]
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Both of these Ruby libraries leverage one of the most versatile and robust protocols ever developed: plain old HTTP. The last few years have witnessed a renaissance in using bare HTTP as a platform for building simplified yet powerful Web APIs with less software. Referred to as &lt;a href="http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm"&gt;REST&lt;/a&gt;, the approach has gained traction partly in response to the wasteful complexities introduced by various XML-based approaches. Although &lt;a href="http://depth-first.com/articles/2007/05/30/restful-cheminformatics"&gt;slow to catch on in cheminformatics&lt;/a&gt;, REST has enormous potential in unifying &lt;a href="http://depth-first.com/articles/2007/01/24/thirty-two-free-chemistry-databases"&gt;a diverse array&lt;/a&gt; of isolated database systems.&lt;/p&gt;

&lt;p&gt;One limitation of the approach described here is that the PubChem (or PDB) folks may get upset if you use it a lot. For example, if you examine the &lt;a href="http://pubchem.ncbi.nlm.nih.gov/robots.txt"&gt;PubChem robots.txt file&lt;/a&gt;, you'll notice that access to the &lt;tt&gt;summary.cgi&lt;/tt&gt; resource, which our library makes use of, is prohibited to robots:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_robots "&gt;...

User-agent: *

...
Disallow: /summary/summary.cgi
...&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
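
&lt;p&gt;A client that wants to honor this rule could check a path before requesting it. The following is a rough sketch under simplifying assumptions: it only looks at the &lt;tt&gt;User-agent: *&lt;/tt&gt; group, treats each &lt;tt&gt;Disallow&lt;/tt&gt; value as a plain path prefix, and ignores &lt;tt&gt;Allow&lt;/tt&gt; lines and wildcards. The method name &lt;tt&gt;disallowed_for_all?&lt;/tt&gt; is hypothetical:&lt;/p&gt;

```ruby
# Hypothetical helper: true when a Disallow rule in the "User-agent: *"
# group is a prefix of the given path. Simplified on purpose: it ignores
# Allow lines, wildcards, and groups that list several user agents.
def disallowed_for_all?(robots_txt, path)
  applies = false
  robots_txt.each_line do |line|
    # Drop comments, then split "Field: value" on the first colon only.
    field, value = line.sub(/#.*/, '').split(':', 2).map { |part| part.strip }
    next if field.nil? || value.nil?

    case field.downcase
    when 'user-agent'
      applies = (value == '*')
    when 'disallow'
      next unless applies
      next if value.empty? # an empty Disallow permits everything
      return true if path.start_with?(value)
    end
  end
  false
end
```

&lt;p&gt;Given the excerpt above, a call with the path &lt;tt&gt;/summary/summary.cgi?cid=969472&lt;/tt&gt; would report the request as disallowed.&lt;/p&gt;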

&lt;p&gt;What makes a "robot" and does your software qualify for exclusion? The answer is not entirely clear-cut, especially in the era of browser-side scripting.&lt;/p&gt;

&lt;p&gt;Regardless, it looks like PubChem's policy was put in place in 2004, long before PubChem had experience with usage patterns for its service. It may be that this restriction could be relaxed without adversely affecting PubChem's ability to operate efficiently. It may even be possible to offer a low-level http retrieval method alongside PubChem's PUG interface on a machine dedicated to automated queries (i.e., &lt;a href="http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html"&gt;Entrez eUtils&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;As developers, our mission is to deliver functionality, not to write software. We should extract every possible ounce of value from established protocols and APIs before writing a single line of additional code. REST, and the creative use of good old HTTP, are powerful tools for doing so.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Image Credit: &lt;a href="http://flickr.com/photos/druclimb/"&gt;Dru!&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;</description>
      <pubDate>Mon, 13 Aug 2007 07:55:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:0632ab5e-4c6a-4bb5-b898-5606e7743230</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/08/13/the-best-api-may-be-no-api-at-all-pubchem-and-pdb</link>
      <category>Tools</category>
      <category>pubchem</category>
      <category>pdb</category>
      <category>pug</category>
      <category>xml</category>
      <category>rest</category>
      <category>http</category>
      <category>ruby</category>
    </item>
    <item>
      <title>Hacking PubChem: Learning to Speak PUG</title>
      <description>&lt;p&gt;&lt;a href="http://flickr.com/photos/40121670@N00/478093105/"&gt;&lt;img src="http://depth-first.com/demo/20070611/pug.jpg" align="right" border="0"&gt;&lt;/img&gt;&lt;/a&gt;A previous article introduced PubChem's &lt;a href="http://depth-first.com/articles/2007/06/04/hacking-pubchem-power-user-gateway"&gt;Power User Gateway&lt;/a&gt; (PUG), an XML-based communication channel. Although NIH kindly supplies a &lt;a href="ftp://ftp.ncbi.nlm.nih.gov/pubchem/specifications/pug.xsd"&gt;commented schema&lt;/a&gt; for PUG queries and responses, there's nothing like seeing real examples when learning a new language. This article will describe one method for conveniently generating PUG XML queries.&lt;/p&gt;

&lt;h4&gt;Let PubChem Build Your Query&lt;/h4&gt;

&lt;p&gt;One of the options on the &lt;a href="http://pubchem.ncbi.nlm.nih.gov/search/search.cgi"&gt;PubChem search page&lt;/a&gt; is "Save Query." As it turns out, PubChem saves queries in PUG XML (I'll just call it PUGML). In other words, preparing a query using the PubChem search page and saving it gives a simple method for creating PUGML queries. Let's try it.&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20070611/screenshot.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;Using the "Sketch" button, draw the structure of benzimidazole. Under "Search Type", select "Substructure." Now click "Save Query", and you'll download a substructure query for benzimidazole in PUGML:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_xml "&gt;&lt;span class="punct"&gt;&amp;lt;?&lt;/span&gt;&lt;span class="tag"&gt;xml&lt;/span&gt; &lt;span class="attribute"&gt;version&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;1.0&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;?&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;!&lt;/span&gt;&lt;span class="tag"&gt;DOCTYPE&lt;/span&gt; &lt;span class="attribute"&gt;PCT-Data&lt;/span&gt; &lt;span class="attribute"&gt;PUBLIC&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;-//NCBI//NCBI PCTools/EN&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;http://pubchem.ncbi.nlm.nih.gov/pug/pug.dtd&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data_input&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-InputData&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-InputData_query&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Query&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Query_type&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryType&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
              &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryType_css&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryCompoundCS&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryCompoundCS_query&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryCompoundCS_query_data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;C1=CC=CC2=C1N=C[N]2&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryCompoundCS_query_data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                  &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryCompoundCS_query&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryCompoundCS_type&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryCompoundCS_type_subss&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-CSStructure&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                        &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-CSStructure_bonds&lt;/span&gt; &lt;span class="attribute"&gt;value&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;true&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;/&amp;gt;&lt;/span&gt;
                      &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-CSStructure&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryCompoundCS_type_subss&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                  &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryCompoundCS_type&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryCompoundCS_results&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;2000000&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryCompoundCS_results&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryCompoundCS&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
              &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryType_css&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryType&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Query_type&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Query&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-InputData_query&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-InputData&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data_input&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The &lt;tt&gt;PCT-QueryCompoundCS_type_subss&lt;/tt&gt; element will tell PUG to look for substructures.&lt;/p&gt;

&lt;h4&gt;Using the Saved Query with PUG&lt;/h4&gt;

&lt;p&gt;Saving this file as &lt;strong&gt;benzimidazole_sss.xml&lt;/strong&gt; lets us feed it to PUG:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ curl -d @benzimidazole_sss.xml "http://pubchem.ncbi.nlm.nih.gov/pug/pug.cgi"
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;and get the following PUGML response:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_xml "&gt;&lt;span class="punct"&gt;&amp;lt;?&lt;/span&gt;&lt;span class="tag"&gt;xml&lt;/span&gt; &lt;span class="attribute"&gt;version&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;1.0&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;?&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;!&lt;/span&gt;&lt;span class="tag"&gt;DOCTYPE&lt;/span&gt; &lt;span class="attribute"&gt;PCT-Data&lt;/span&gt; &lt;span class="attribute"&gt;PUBLIC&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;-//NCBI//NCBI PCTools/EN&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;http://pubchem.ncbi.nlm.nih.gov/pug/pug.dtd&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data_output&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_status&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message_status&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status&lt;/span&gt; &lt;span class="attribute"&gt;value&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;queued&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;/&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message_status&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_status&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_output&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_output_waiting&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Waiting&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Waiting_reqid&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;62668946396085905&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Waiting_reqid&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Waiting_message&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;Structure search job was submitted&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Waiting_message&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Waiting&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_output_waiting&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_output&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data_output&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;We can then check on the status of our query by saving the following as &lt;strong&gt;status.xml&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_xml "&gt;&lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data_input&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-InputData&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-InputData_request&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Request&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Request_reqid&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;62668946396085905&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Request_reqid&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Request_type&lt;/span&gt; &lt;span class="attribute"&gt;value&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;status&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;/&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Request&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-InputData_request&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-InputData&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data_input&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;POSTing this to PUG:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ curl -d @status.xml "http://pubchem.ncbi.nlm.nih.gov/pug/pug.cgi"
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;gives us the following PUGML:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_xml "&gt;&lt;span class="punct"&gt;&amp;lt;?&lt;/span&gt;&lt;span class="tag"&gt;xml&lt;/span&gt; &lt;span class="attribute"&gt;version&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;1.0&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;?&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;!&lt;/span&gt;&lt;span class="tag"&gt;DOCTYPE&lt;/span&gt; &lt;span class="attribute"&gt;PCT-Data&lt;/span&gt; &lt;span class="attribute"&gt;PUBLIC&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;-//NCBI//NCBI PCTools/EN&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;http://pubchem.ncbi.nlm.nih.gov/pug/pug.dtd&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data_output&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_status&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message_status&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status&lt;/span&gt; &lt;span class="attribute"&gt;value&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;success&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;/&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message_status&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message_message&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;Your search has already been completed successfully!.&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message_message&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_status&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_output&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_output_entrez&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Entrez&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Entrez_db&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;pccompound&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Entrez_db&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Entrez_query-key&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;1&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Entrez_query-key&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Entrez_webenv&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;0CPrI_peUmUtWDooyjxpJ1XAXPcOl-ESZZxj8sJV9ZDR8musMjh1oBTib@1EDD43FA66AE1BE0_0001SID&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Entrez_webenv&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Entrez&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_output_entrez&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_output&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data_output&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
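
&lt;p&gt;The two responses shown so far differ in the &lt;tt&gt;value&lt;/tt&gt; attribute of &lt;tt&gt;PCT-Status&lt;/tt&gt; ("queued" versus "success"), and only the queued one carries a &lt;tt&gt;PCT-Waiting_reqid&lt;/tt&gt;. A script could read both fields rather than relying on eyeballing the XML. Here is a minimal sketch using REXML, assuming it is available in your Ruby installation; the method names &lt;tt&gt;pug_status&lt;/tt&gt; and &lt;tt&gt;pug_reqid&lt;/tt&gt; are hypothetical:&lt;/p&gt;

```ruby
require 'rexml/document'

# Hypothetical helpers (illustrative names, not part of PUG itself).

# Returns the value attribute of the first PCT-Status element, or nil.
def pug_status(pugml)
  node = REXML::XPath.first(REXML::Document.new(pugml), '//PCT-Status')
  node ? node.attributes['value'] : nil
end

# Returns the request id of a waiting job, or nil when none is reported.
def pug_reqid(pugml)
  node = REXML::XPath.first(REXML::Document.new(pugml), '//PCT-Waiting_reqid')
  node ? node.text : nil
end
```

&lt;p&gt;A polling loop could then resubmit the status request while &lt;tt&gt;pug_status&lt;/tt&gt; reports that the job is still queued.&lt;/p&gt;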

&lt;p&gt;&lt;a href="http://depth-first.com/articles/2007/06/04/hacking-pubchem-power-user-gateway"&gt;Last time&lt;/a&gt;, we got a URL to download a gzipped SD File. This time, our query specified results to be returned as an Entrez Key through the &lt;tt&gt;PCT-Entrez_webenv&lt;/tt&gt; element. We can construct a URL that will let us view these results:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_default "&gt;http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=HistorySearch&amp;amp;WebEnvRq=1&amp;amp;db=pccompound&amp;amp;query_key=1&amp;amp;WebEnv=0CPrI_peUmUtWDooyjxpJ1XAXPcOl-ESZZxj8sJV9ZDR8musMjh1oBTib%401EDD43FA66AE1BE0_0001SID&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
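
&lt;p&gt;Building this URL by hand means remembering to escape the "@" in the WebEnv value. As a sketch, the same URL can be assembled with Ruby's standard &lt;tt&gt;URI.encode_www_form&lt;/tt&gt;; the parameter names are copied from the URL above, and &lt;tt&gt;entrez_history_url&lt;/tt&gt; is a hypothetical method name:&lt;/p&gt;

```ruby
require 'uri'

# Hypothetical helper: builds the Entrez history URL shown above from the
# db, query key, and WebEnv values found in a PUG response.
def entrez_history_url(db, query_key, webenv)
  query = URI.encode_www_form(
    cmd: 'HistorySearch',
    WebEnvRq: 1,
    db: db,
    query_key: query_key,
    WebEnv: webenv # encode_www_form percent-escapes the "@" as %40
  )
  "http://www.ncbi.nlm.nih.gov/sites/entrez?#{query}"
end
```

&lt;p&gt;Calling it with &lt;tt&gt;'pccompound'&lt;/tt&gt;, &lt;tt&gt;1&lt;/tt&gt;, and the WebEnv string from the response should reproduce the URL above.&lt;/p&gt;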

&lt;h4&gt;Where to Next?&lt;/h4&gt;

&lt;p&gt;If we wanted to get a gzipped SD File instead, we'd need to edit our original query. But manually editing XML is a lot like mowing a lawn with scissors. What we'd really like is a simple API in a language like Ruby that will let us build sophisticated PUG queries, process the results, and pipe them into other queries with little effort. But that's a story for another time.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Image Credit: &lt;a href="http://flickr.com/photos/40121670@N00/"&gt;sutterbabe68&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;</description>
      <pubDate>Mon, 11 Jun 2007 09:04:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:c9cf69b1-a86f-4a3b-ba3c-a8dcded2fa9f</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/06/11/hacking-pubchem-learning-to-speak-pug</link>
      <category>Databases</category>
      <category>pubchem</category>
      <category>pug</category>
      <category>xml</category>
      <category>api</category>
      <category>powerusergateway</category>
      <category>ruby</category>
    </item>
    <item>
      <title>Hacking PubChem: Power User Gateway</title>
      <description>&lt;p&gt;&lt;a href="http://flickr.com/photos/40121670@N00/519042653/"&gt;&lt;img src="http://depth-first.com/demo/20070604/pug.jpg" border="0" align="right"&gt;&lt;/img&gt;&lt;/a&gt;If you've been waiting for a simple way to programmatically query PubChem without screen scraping, the wait is over. An (apparently) new service called the Power User Gateway (PUG) now offers a direct, XML-based PubChem data channel.&lt;/p&gt;

&lt;h4&gt;See PUG&lt;/h4&gt;

&lt;p&gt;Previous articles have discussed various methods for hacking PubChem: screen scraping (&lt;a href="http://depth-first.com/articles/2006/08/30/hacking-pubchem-with-ruby"&gt;link&lt;/a&gt;, &lt;a href="http://depth-first.com/articles/2006/08/30/hacking-pubchem-with-ruby"&gt;link&lt;/a&gt;); with the &lt;a href="http://depth-first.com/articles/2006/09/23/hacking-pubchem-entrez-programming-utilities"&gt;Entrez Utilities&lt;/a&gt;; and by simply &lt;a href="http://depth-first.com/articles/2006/09/29/hacking-pubchem-direct-access-with-ftp"&gt;replicating the database&lt;/a&gt;. PUG is different in that it is both very simple and apparently quite powerful.&lt;/p&gt;

&lt;p&gt;From the &lt;a href="ftp://ftp.ncbi.nlm.nih.gov/pubchem/specifications/pubchem_pug.txt"&gt;PUG documentation&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
    &lt;p&gt;... There is a single CGI (pug.cgi, referred to hereafter as simply PUG) that is the central gateway to multiple PubChem functions. PUG takes no URL arguments; all communication with PUG is done by XML. To perform any request, you will formulate your input in XML and then HTTP POST it to PUG. The CGI interprets your incoming request, initiates the appropriate action, then returns results (also) in XML format. ...&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h4&gt;See PUG Run&lt;/h4&gt;

&lt;p&gt;Let's perform a simple query using PUG. As the documentation states, all communication with PUG is done through HTTP POST. In contrast to other approaches to interfacing with PubChem, parameters and results are encoded in raw XML, the schema for which is available &lt;a href="ftp://ftp.ncbi.nlm.nih.gov/pubchem/specifications/pug.xsd"&gt;here&lt;/a&gt;. To use PUG, your first step is to find software capable of sending this kind of HTTP request.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://curl.haxx.se/"&gt;cURL&lt;/a&gt; is such a utility. Among many capabilities, cURL offers a quick and easy way to POST XML to a server and view the response. For example, to POST the file called &lt;strong&gt;foo.xml&lt;/strong&gt; to PUG, the command would be: &lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ curl -d @foo.xml "http://pubchem.ncbi.nlm.nih.gov/pug/pug.cgi"
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Our query will request PubChem's first fifty Compounds in &lt;a href="http://depth-first.com/articles/2006/09/29/hacking-pubchem-direct-access-with-ftp"&gt;sdf.gz&lt;/a&gt; format.&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_xml "&gt;&lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data_input&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-InputData&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-InputData_download&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Download&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Download_uids&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryUids&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
              &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryUids_ids&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-ID-List&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-ID-List_db&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;pccompound&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-ID-List_db&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-ID-List_uids&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-ID-List_uids_E&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;1&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-ID-List_uids_E&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-ID-List_uids_E&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;50&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-ID-List_uids_E&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                  &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-ID-List_uids&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-ID-List&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
              &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryUids_ids&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryUids&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Download_uids&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Download_format&lt;/span&gt; &lt;span class="attribute"&gt;value&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;sdf&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;/&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Download_compression&lt;/span&gt; &lt;span class="attribute"&gt;value&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;gzip&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;/&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Download&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-InputData_download&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-InputData&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data_input&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;After saving this file as &lt;strong&gt;pugtest.xml&lt;/strong&gt;, we can POST it to PUG using cURL:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ curl -d @pugtest.xml "http://pubchem.ncbi.nlm.nih.gov/pug/pug.cgi"
&lt;/pre&gt;
&lt;/div&gt;
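
&lt;p&gt;The same POST can be prepared from Ruby using nothing but the standard library. This is a minimal sketch rather than a tested client: the method names &lt;tt&gt;build_pug_request&lt;/tt&gt; and &lt;tt&gt;post_to_pug&lt;/tt&gt; are ours, and the &lt;tt&gt;text/xml&lt;/tt&gt; content type is an assumption, since the documentation excerpt quoted above doesn't say which one PUG expects:&lt;/p&gt;

```ruby
require 'net/http'
require 'uri'

PUG_URI = URI('http://pubchem.ncbi.nlm.nih.gov/pug/pug.cgi')

# Wrap a PUG query (an XML string) in an HTTP POST request object.
def build_pug_request(xml)
  request = Net::HTTP::Post.new(PUG_URI)
  request.body = xml
  request['Content-Type'] = 'text/xml' # assumed; not confirmed by the PUG docs
  request
end

# Send the query to PUG and return the raw XML response body.
def post_to_pug(xml)
  Net::HTTP.start(PUG_URI.host, PUG_URI.port) do |http|
    http.request(build_pug_request(xml)).body
  end
end

# Usage (needs network access):
#   puts post_to_pug(File.read('pugtest.xml'))
```

&lt;p&gt;Keeping request construction separate from sending makes it possible to inspect the request without touching the network.&lt;/p&gt;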

&lt;h4&gt;Run PUG, Run!&lt;/h4&gt;

&lt;p&gt;After POSTing our query, PUG gives one of two possible responses: we're informed of the status of our query, or we're given a URL to download our results.&lt;/p&gt;

&lt;p&gt;Here's an example of a status result:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_xml "&gt;&lt;span class="punct"&gt;&amp;lt;?&lt;/span&gt;&lt;span class="tag"&gt;xml&lt;/span&gt; &lt;span class="attribute"&gt;version&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;1.0&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;?&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;!&lt;/span&gt;&lt;span class="tag"&gt;DOCTYPE&lt;/span&gt; &lt;span class="attribute"&gt;PCT-Data&lt;/span&gt; &lt;span class="attribute"&gt;PUBLIC&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;-//NCBI//NCBI PCTools/EN&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;http://pubchem.ncbi.nlm.nih.gov/pug/pug.dtd&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data_output&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_status&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message_status&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status&lt;/span&gt; &lt;span class="attribute"&gt;value&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;success&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;/&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message_status&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_status&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_output&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_output_waiting&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Waiting&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Waiting_reqid&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;638302818484957496&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Waiting_reqid&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Waiting&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_output_waiting&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_output&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data_output&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The &lt;tt&gt;PCT-Waiting_reqid&lt;/tt&gt; informs us of our query's ID. We could then prepare and POST another query to monitor its status:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_xml "&gt;&lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data_input&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-InputData&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-InputData_request&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Request&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Request_reqid&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;638302818484957496&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Request_reqid&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Request_type&lt;/span&gt; &lt;span class="attribute"&gt;value&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;status&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;/&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Request&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-InputData_request&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-InputData&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data_input&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
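
&lt;p&gt;Because only the request ID changes from one status query to the next, the document above is an easy candidate for generation. Here is one possible sketch using REXML, which ships with Ruby. The method name &lt;tt&gt;status_query&lt;/tt&gt; is hypothetical, and the element names are copied from the query shown above:&lt;/p&gt;

```ruby
require 'rexml/document'

# Hypothetical helper: builds the status query shown above for a request ID.
def status_query(reqid)
  doc = REXML::Document.new
  # Each add_element call returns the new child, so the nesting can be chained.
  request = doc.add_element('PCT-Data')
               .add_element('PCT-Data_input')
               .add_element('PCT-InputData')
               .add_element('PCT-InputData_request')
               .add_element('PCT-Request')
  request.add_element('PCT-Request_reqid').text = reqid.to_s
  request.add_element('PCT-Request_type').add_attribute('value', 'status')
  doc.to_s
end

puts status_query('638302818484957496')
```

&lt;p&gt;The output is a single unindented line, which should make no difference to an XML parser on the receiving end.&lt;/p&gt;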

&lt;p&gt;Eventually, we'll get a response containing a &lt;tt&gt;PCT-Download-URL_url&lt;/tt&gt; element. Inside this element is the URL through which we can download our results:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_xml "&gt;&lt;span class="punct"&gt;&amp;lt;?&lt;/span&gt;&lt;span class="tag"&gt;xml&lt;/span&gt; &lt;span class="attribute"&gt;version&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;1.0&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;?&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;!&lt;/span&gt;&lt;span class="tag"&gt;DOCTYPE&lt;/span&gt; &lt;span class="attribute"&gt;PCT-Data&lt;/span&gt; &lt;span class="attribute"&gt;PUBLIC&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;-//NCBI//NCBI PCTools/EN&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;http://pubchem.ncbi.nlm.nih.gov/pug/pug.dtd&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data_output&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_status&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message_status&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status&lt;/span&gt; &lt;span class="attribute"&gt;value&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;success&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;/&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message_status&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_status&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_output&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_output_download-url&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Download-URL&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Download-URL_url&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;ftp://ftp-private.ncbi.nlm.nih.gov/pubchem/.fetch/766964770894289974.sdf.gz&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Download-URL_url&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Download-URL&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_output_download-url&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_output&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data_output&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
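
&lt;p&gt;Telling the two kinds of response apart comes down to checking which element is present. The following is a rough sketch with REXML, assuming responses shaped like the two examples above. The method name &lt;tt&gt;interpret_pug_response&lt;/tt&gt; and the returned hash layout are invented for illustration, and error responses are not handled:&lt;/p&gt;

```ruby
require 'rexml/document'

# Hypothetical helper: classifies a PUG response as finished or still waiting.
# Takes the response body as a string and returns a small hash.
def interpret_pug_response(xml)
  doc = REXML::Document.new(xml)

  # A finished download carries the result URL.
  url = REXML::XPath.first(doc, '//PCT-Download-URL_url')
  return { state: :ready, url: url.text } if url

  # A queued or running request carries the ID to poll with.
  reqid = REXML::XPath.first(doc, '//PCT-Waiting_reqid')
  return { state: :waiting, reqid: reqid.text } if reqid

  { state: :unknown }
end

# Usage, given a response body from PUG:
#   result = interpret_pug_response(response_body)
#   puts result[:url] if result[:state] == :ready
```

&lt;p&gt;A polling loop would call this after each status query and stop once the state is no longer &lt;tt&gt;:waiting&lt;/tt&gt;.&lt;/p&gt;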

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;PUG offers the basic foundation for building a variety of innovative and useful cheminformatics Web services. But before that can happen, high-level APIs will be needed in languages like Ruby, Python, and Java. With these APIs in hand, what kinds of applications will result? Fortunately, imagination is now the only barrier.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Image Credit: &lt;a href="http://flickr.com/photos/40121670@N00/"&gt;shutterbabe68&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;</description>
      <pubDate>Mon, 04 Jun 2007 07:06:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:51a56e2f-5ac3-4fd4-92e8-74978045eae2</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/06/04/hacking-pubchem-power-user-gateway</link>
      <category>Databases</category>
      <category>pubchem</category>
      <category>pug</category>
      <category>xml</category>
      <category>api</category>
      <category>ruby</category>
      <category>powerusergateway</category>
      <category>curl</category>
    </item>
    <item>
      <title>Octet Fundamentals: A Documented System of Atomic Masses</title>
      <description>&lt;p&gt;&lt;a href="http://flickr.com/photos/stinkypeter/136646214/"&gt;&lt;img src="http://depth-first.com/demo/20070202/scale.jpg" border="0" align="right"&gt;&lt;/img&gt;&lt;/a&gt;The way that atoms, and particularly their masses, are modeled sets the stage for the kinds of problems a cheminformatics environment can solve. Many systems are currently in use, a reflection of the many different ways there are to think about this problem. This article will introduce the atomic mass system used by Octet, which provides atomic mass values and uncertainties cross-referenced to the primary literature.&lt;/p&gt;

&lt;h4&gt;A Documented System of Atomic Masses&lt;/h4&gt;

&lt;p&gt;Mass and isotopic composition are fundamental atomic properties. In addition to the mass values themselves, the errors of these determinations are also important. Because these quantities are sometimes in dispute, it is essential that they be cross-referenced to the primary literature. Fortunately, a landmark work titled &lt;a href="http://www.iupac.org/publications/pac/2003/7506/7506x0683.html"&gt;"Atomic weights of the elements"&lt;/a&gt; (AWOTE) accomplishing exactly this objective was published in 2000 by a team led by J. K. B&amp;#246;hlke from the U.S. Geological Survey.&lt;/p&gt;

&lt;p&gt;Octet uses an XML representation of the data contained in AWOTE. To view the entire document, &lt;a href="http://depth-first.com/demo/20070202/mass.xml"&gt;click here&lt;/a&gt;. To illustrate the kind of data included in this document, consider this entry for the element carbon:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_xml "&gt;&lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;entry&lt;/span&gt; &lt;span class="attribute"&gt;symbol&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;atomic-number&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;6&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;natural-abundance&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;mass&lt;/span&gt; &lt;span class="attribute"&gt;value&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;12.0107&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;error&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;0.0008&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="punct"&gt;/&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;isotope&lt;/span&gt; &lt;span class="attribute"&gt;mass-number&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;12&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;mass&lt;/span&gt; &lt;span class="attribute"&gt;value&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;12&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;error&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;0&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="punct"&gt;/&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;abundance&lt;/span&gt; &lt;span class="attribute"&gt;value&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;0.9893&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;error&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;0.0008&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="punct"&gt;/&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;isotope&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;isotope&lt;/span&gt; &lt;span class="attribute"&gt;mass-number&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;13&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;mass&lt;/span&gt; &lt;span class="attribute"&gt;value&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;13.003354838&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;error&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;0.000000005&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="punct"&gt;/&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;abundance&lt;/span&gt; &lt;span class="attribute"&gt;value&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;0.0107&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;error&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;0.0008&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="punct"&gt;/&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;isotope&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;natural-abundance&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;entry&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Carbon has two naturally occurring stable isotopes, &lt;sup&gt;12&lt;/sup&gt;C and &lt;sup&gt;13&lt;/sup&gt;C. They have relative abundances of 98.93% and 1.07%, and masses of 12 (exactly) and 13.003354838&amp;plusmn;0.000000005 unified atomic mass units (u), respectively. Every element from hydrogen to uranium is included, excluding technetium. By reference to AWOTE, every value in the XML file can be traced to the primary literature.&lt;/p&gt;
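
&lt;p&gt;As a quick sanity check on the entry, the natural-abundance mass on its first line can be reproduced, to the four decimal places given, as the abundance-weighted mean of the two isotope masses. The snippet below is plain Ruby and independent of Octet; the constant and method names are just for illustration:&lt;/p&gt;

```ruby
# Isotope masses (u) and fractional abundances copied from the carbon entry above.
CARBON_ISOTOPES = [
  { mass: 12.0,         abundance: 0.9893 },
  { mass: 13.003354838, abundance: 0.0107 }
].freeze

# Abundance-weighted mean of the isotope masses.
def average_mass(isotopes)
  isotopes.sum { |isotope| isotope[:mass] * isotope[:abundance] }
end

puts average_mass(CARBON_ISOTOPES).round(4) # prints 12.0107
```

&lt;p&gt;This sketch ignores the uncertainties entirely; propagating them is what the error values in the XML are for.&lt;/p&gt;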

&lt;h4&gt;Using the Atomic Mass System&lt;/h4&gt;

&lt;p&gt;As a demonstration of Octet's system of atomic masses, consider the following Ruby code:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rubygems&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rjb&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="ident"&gt;atomic_system&lt;/span&gt;&lt;span class="punct"&gt;=&lt;/span&gt;&lt;span class="constant"&gt;Rjb&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="ident"&gt;import&lt;/span&gt;&lt;span class="punct"&gt;('&lt;/span&gt;&lt;span class="string"&gt;net.sf.octet.model.BasicAtomicSystem&lt;/span&gt;&lt;span class="punct"&gt;').&lt;/span&gt;&lt;span class="ident"&gt;getInstance&lt;/span&gt;
&lt;span class="ident"&gt;carbon_distribution&lt;/span&gt;&lt;span class="punct"&gt;=&lt;/span&gt;&lt;span class="ident"&gt;atomic_system&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;getNaturalAbundance&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;atomic_system&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;getAtomicSymbol&lt;/span&gt;&lt;span class="punct"&gt;(&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;))&lt;/span&gt;

&lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="ident"&gt;carbon_distribution&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;countNuclei&lt;/span&gt; &lt;span class="comment"&gt;# =&amp;gt; 2&lt;/span&gt;
&lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="ident"&gt;carbon_distribution&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;getNucleus&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="number"&gt;0&lt;/span&gt;&lt;span class="punct"&gt;).&lt;/span&gt;&lt;span class="ident"&gt;getMassNumber&lt;/span&gt; &lt;span class="comment"&gt;# =&amp;gt; 12&lt;/span&gt;
&lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="ident"&gt;carbon_distribution&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;getNucleus&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="number"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;).&lt;/span&gt;&lt;span class="ident"&gt;getMassNumber&lt;/span&gt; &lt;span class="comment"&gt;# =&amp;gt; 13&lt;/span&gt;
&lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="ident"&gt;atomic_system&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;getAtomicMass&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;carbon_distribution&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;getNucleus&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="number"&gt;0&lt;/span&gt;&lt;span class="punct"&gt;)).&lt;/span&gt;&lt;span class="ident"&gt;getValue&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;toString&lt;/span&gt; &lt;span class="comment"&gt;# =&amp;gt; 12.0&lt;/span&gt;
&lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="ident"&gt;atomic_system&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;getAtomicMass&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;carbon_distribution&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;getNucleus&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="number"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;)).&lt;/span&gt;&lt;span class="ident"&gt;getValue&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;toString&lt;/span&gt; &lt;span class="comment"&gt;# =&amp;gt; 13.003354838&lt;/span&gt;
&lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="ident"&gt;atomic_system&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;getAtomicMass&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;carbon_distribution&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;getNucleus&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="number"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;)).&lt;/span&gt;&lt;span class="ident"&gt;getUncertainty&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;toString&lt;/span&gt; &lt;span class="comment"&gt;# =&amp;gt; 5.0E-9&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The &lt;a href="http://depth-first.com/articles/2007/01/31/a-molecular-language-for-modern-chemistry-reading-flexmol-documents-with-octet"&gt;previous article in this series&lt;/a&gt; described the small number of steps needed to execute Ruby code such as that shown above on Windows and Linux systems. For more information on the &lt;tt&gt;AtomicSystem&lt;/tt&gt; API, consult the &lt;a href="http://depth-first.com/doc/octet/"&gt;Octet Javadoc&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;Octet provides a comprehensive system of atomic masses containing both measurements and uncertainties. This system is furthermore cross-referenced to the primary literature. As a result, the mass of every Octet Molecule can be determined to high precision and with error analysis. Not every application will require this level of detail and documentation, but for those that do the capability exists.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.numly.com/numly/verify.asp?id=34181-070204-258949-40"&gt;&lt;img alt="numly esn" src="http://numly.com/numly/icon.asp?id=3418107020425894940" border="0"&gt; 34181-070204-258949-40&lt;/a&gt;&lt;br&gt;&lt;br&gt;&lt;!--Creative Commons License--&gt;&lt;a rel="license" href="http://creativecommons.org/licenses/by/2.5/"&gt;&lt;img alt="Creative Commons License" style="border-width: 0" src="http://i.creativecommons.org/l/by/2.5/88x31.png"/&gt;&lt;/a&gt;&lt;br/&gt;This work is licensed under a &lt;a rel="license" href="http://creativecommons.org/licenses/by/2.5/"&gt;Creative Commons Attribution 2.5 License&lt;/a&gt;.&lt;!--/Creative Commons License--&gt;&lt;!-- &lt;rdf:RDF xmlns="http://web.resource.org/cc/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"&gt;&lt;Work rdf:about=""&gt;&lt;license rdf:resource="http://creativecommons.org/licenses/by/2.5/"/&gt;&lt;/Work&gt;&lt;License rdf:about="http://creativecommons.org/licenses/by/2.5/"&gt;&lt;permits rdf:resource="http://web.resource.org/cc/Reproduction"/&gt;&lt;permits rdf:resource="http://web.resource.org/cc/Distribution"/&gt;&lt;requires rdf:resource="http://web.resource.org/cc/Notice"/&gt;&lt;requires rdf:resource="http://web.resource.org/cc/Attribution"/&gt;&lt;permits rdf:resource="http://web.resource.org/cc/DerivativeWorks"/&gt;&lt;/License&gt;&lt;/rdf:RDF&gt; --&gt;&lt;/p&gt;</description>
      <pubDate>Fri, 02 Feb 2007 15:10:00 -0500</pubDate>
      <guid isPermaLink="false">urn:uuid:3cd8acc0-eb27-44fc-b238-f14536238e5d</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/02/02/octet-fundamentals-a-documented-system-of-atomic-masses</link>
      <category>Tools</category>
      <category>octet</category>
      <category>xml</category>
      <category>atomicmass</category>
      <category>awote</category>
    </item>
    <item>
      <title>A Molecular Language for Modern Chemistry: Reading FlexMol Documents with Octet</title>
      <description>&lt;p&gt;&lt;a href="http://www.amazon.com/gp/product/0596004206?ie=UTF8&amp;amp;tag=depthfirst-20&amp;amp;linkCode=as2&amp;amp;camp=1789&amp;amp;creative=9325&amp;amp;creativeASIN=0596004206"&gt;&lt;img border="0" src="http://depth-first.com/files/learning_xml.jpg" align="right"&gt;&lt;/a&gt;&lt;img src="http://www.assoc-amazon.com/e/ir?t=depthfirst-20&amp;amp;l=as2&amp;amp;o=1&amp;amp;a=0596004206" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /&gt;An XML language is only as useful as the software tools that take advantage of it. &lt;a href="http://depth-first.com/articles/tag/flexmol"&gt;Previous articles&lt;/a&gt; have discussed how the XML language FlexMol can solve a variety of molecular representation problems ranging from the &lt;a href="http://depth-first.com/articles/2006/12/20/a-molecular-language-for-modern-chemistry-getting-started-with-flexmol"&gt;multiatom bonding of metallocenes&lt;/a&gt; to the &lt;a href="http://depth-first.com/articles/2006/12/20/a-molecular-language-for-modern-chemistry-getting-started-with-flexmol"&gt;axial chirality of biaryls&lt;/a&gt;. &lt;a href="http://depth-first.com/articles/2007/01/30/an-object-oriented-framework-for-molecular-representation-getting-started-with-octet"&gt;Octet&lt;/a&gt; is a framework written in Java that speaks FlexMol natively. In this article, I'll show how Octet can be used to read a sample FlexMol document.&lt;/p&gt;

&lt;h4&gt;Prerequisites&lt;/h4&gt;

&lt;p&gt;For this tutorial, you'll need &lt;a href="http://rubyforge.org/projects/rjb/"&gt;Ruby Java Bridge&lt;/a&gt; (RJB). Previous articles have discussed the installation and use of RJB on &lt;a href="http://depth-first.com/articles/2006/10/12/running-ruby-java-bridge-on-windows"&gt;Windows&lt;/a&gt; and &lt;a href="http://depth-first.com/articles/2006/08/26/scripting-java-libraries-with-ruby-java-bridge"&gt;Linux&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;A Sample Molecule&lt;/h4&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20070131/s_monolaterol.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;A &lt;a href="http://depth-first.com/articles/2007/01/25/a-molecular-language-for-modern-chemistry-flexmol-tetrahedral-chirality-and-monolaterol"&gt;recent article&lt;/a&gt; discussed a FlexMol representation of the chiral natural product monolaterol. Using a slightly modified numbering system for this molecule (shown above), we can construct a &lt;a href="http://depth-first.com/demo/20070131/s_monolaterol.xml"&gt;complete FlexMol representation&lt;/a&gt;. The only change is that numbering now starts at zero: every index from the previous example is reduced by one to match the zero-based indices used in Octet.&lt;/p&gt;

&lt;h4&gt;A Demonstration Package&lt;/h4&gt;

&lt;p&gt;To illustrate the process of reading a FlexMol document, I've prepared a small package (&lt;strong&gt;demo-20070131.tar.gz&lt;/strong&gt;) that can be &lt;a href="https://sourceforge.net/project/showfiles.php?group_id=96108&amp;amp;package_id=220177&amp;amp;release_id=482855"&gt;downloaded from SourceForge&lt;/a&gt;. In it, you'll find an Octet jarfile (&lt;strong&gt;octet-0.8.2.jar&lt;/strong&gt;), a FlexMol representation of monolaterol (&lt;strong&gt;s_monolaterol.xml&lt;/strong&gt;), a Ruby library (&lt;strong&gt;reader.rb&lt;/strong&gt;), and some Ruby test code (&lt;strong&gt;test.rb&lt;/strong&gt;). Inflate this archive and make it your working directory.&lt;/p&gt;

&lt;h4&gt;A Simple Test&lt;/h4&gt;

&lt;p&gt;The following sequence of commands will run the test included with the demonstration package:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ export CLASSPATH=./octet-0.8.2.jar
$ ruby test.rb
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;You should see several lines of output terminated with the line:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
The exact mass of monolaterol is 276.115029755.
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;You can get more hands-on experience with loading and processing the monolaterol FlexMol document using interactive Ruby (irb). For example:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ irb
irb(main):001:0&amp;gt; require 'reader'
=&amp;gt; true
irb(main):002:0&amp;gt; r=Reader.new
=&amp;gt; #&amp;lt;Reader:0x2b9ab173a1f0 @xml_reader=#&amp;lt;#&amp;lt;Class:0x2b9ab1741680&amp;gt;:0x2b9ab1736690&amp;gt;, @handler=#&amp;lt;#&amp;lt;Class:0x2b9ab1741680&amp;gt;:0x2b9ab1736e10&amp;gt;, @builder=#&amp;lt;#&amp;lt;Class:0x2b9ab1741680&amp;gt;:0x2b9ab1736b90&amp;gt;&amp;gt;
irb(main):003:0&amp;gt; mol=r.read_file 's_monolaterol.xml'
=&amp;gt; #&amp;lt;#&amp;lt;Class:0x2b9ab1741680&amp;gt;:0x2b9ab172cd48&amp;gt;
irb(main):004:0&amp;gt; mol.countAtoms
=&amp;gt; 21
irb(main):005:0&amp;gt; mol.countBondingSystems
=&amp;gt; 24
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Of course, this is just scratching the surface of what can be done once a FlexMol document has been loaded by Octet.&lt;/p&gt;

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;Octet makes it possible to convert FlexMol documents into Java object representations that can be accessed through Ruby. With an object representation, the possibilities are limitless. Some simple examples have been provided here. Future articles will illustrate more advanced uses.&lt;/p&gt;</description>
      <pubDate>Wed, 31 Jan 2007 14:56:00 -0500</pubDate>
      <guid isPermaLink="false">urn:uuid:ab1f670a-3c7e-407d-af0e-c4343d7082d2</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/01/31/a-molecular-language-for-modern-chemistry-reading-flexmol-documents-with-octet</link>
      <category>Tools</category>
      <category>flexmol</category>
      <category>octet</category>
      <category>ruby</category>
      <category>java</category>
      <category>rjb</category>
      <category>monolaterol</category>
      <category>xml</category>
    </item>
    <item>
      <title>A Molecular Language for Modern Chemistry: FlexMol and Axial Chirality</title>
      <description>&lt;p&gt;A &lt;a href="http://depth-first.com/articles/2007/01/08/the-axial-chirality-problem"&gt;recent article&lt;/a&gt; introduced FlexMol as a molecular language with the unique capability of encoding axial chirality. &lt;a href="http://depth-first.com/articles/2007/01/02/a-molecular-language-for-modern-chemistry-flexmol-and-alkene-geometrical-isomerism"&gt;A previous article&lt;/a&gt; showed how E/Z geometrical isomerism is encoded with FlexMol. Using the popular chiral reagent and ligand 1,1'-bi-2-naphthol (BINOL) as an example, this tutorial will illustrate in detail how axial chirality is encoded in FlexMol.&lt;/p&gt;

&lt;h4&gt;Configuration or Conformation?&lt;/h4&gt;

&lt;p&gt;In contrast to configurational stereoisomers, conformational stereoisomers can be interconverted through bond rotations. So we'll need to use a &lt;tt&gt;conformationWheel&lt;/tt&gt; to represent stereochemistry in BINOL - &lt;a href="http://depth-first.com/articles/2007/01/02/a-molecular-language-for-modern-chemistry-flexmol-and-alkene-geometrical-isomerism"&gt;just as we did with 2-butene&lt;/a&gt;. For more rigorous definitions of these concepts, see the original &lt;a href="http://dx.doi.org/10.1021/ci00027a001"&gt;specification by Dietz&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;(&lt;em&gt;R&lt;/em&gt;)-BINOL&lt;/h4&gt;

&lt;p&gt;A FlexMol representation of (&lt;em&gt;R&lt;/em&gt;)-BINOL and the associated atom numbering scheme are shown below:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_default "&gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; standalone=&amp;quot;yes&amp;quot;?&amp;gt;
&amp;lt;!-- (R)-BINOL --&amp;gt;

&amp;lt;molecule&amp;gt;
  &amp;lt;constitution&amp;gt;
    &amp;lt;atoms&amp;gt;
      &amp;lt;atom id=&amp;quot;C0&amp;quot; symbol=&amp;quot;C&amp;quot; hydrogens=&amp;quot;0&amp;quot; ionization=&amp;quot;4&amp;quot;&amp;gt;&amp;lt;/atom&amp;gt;
      &amp;lt;atom id=&amp;quot;C1&amp;quot; symbol=&amp;quot;C&amp;quot; hydrogens=&amp;quot;0&amp;quot; ionization=&amp;quot;4&amp;quot;&amp;gt;&amp;lt;/atom&amp;gt;
      &amp;lt;atom id=&amp;quot;C2&amp;quot; symbol=&amp;quot;C&amp;quot; hydrogens=&amp;quot;1&amp;quot; ionization=&amp;quot;4&amp;quot;&amp;gt;&amp;lt;/atom&amp;gt;
      &amp;lt;atom id=&amp;quot;C3&amp;quot; symbol=&amp;quot;C&amp;quot; hydrogens=&amp;quot;1&amp;quot; ionization=&amp;quot;4&amp;quot;&amp;gt;&amp;lt;/atom&amp;gt;
      &amp;lt;atom id=&amp;quot;C4&amp;quot; symbol=&amp;quot;C&amp;quot; hydrogens=&amp;quot;1&amp;quot; ionization=&amp;quot;4&amp;quot;&amp;gt;&amp;lt;/atom&amp;gt;
      &amp;lt;atom id=&amp;quot;C5&amp;quot; symbol=&amp;quot;C&amp;quot; hydrogens=&amp;quot;1&amp;quot; ionization=&amp;quot;4&amp;quot;&amp;gt;&amp;lt;/atom&amp;gt;
      &amp;lt;atom id=&amp;quot;C6&amp;quot; symbol=&amp;quot;C&amp;quot; hydrogens=&amp;quot;1&amp;quot; ionization=&amp;quot;4&amp;quot;&amp;gt;&amp;lt;/atom&amp;gt;
      &amp;lt;atom id=&amp;quot;C7&amp;quot; symbol=&amp;quot;C&amp;quot; hydrogens=&amp;quot;1&amp;quot; ionization=&amp;quot;4&amp;quot;&amp;gt;&amp;lt;/atom&amp;gt;
      &amp;lt;atom id=&amp;quot;C8&amp;quot; symbol=&amp;quot;C&amp;quot; hydrogens=&amp;quot;0&amp;quot; ionization=&amp;quot;4&amp;quot;&amp;gt;&amp;lt;/atom&amp;gt;
      &amp;lt;atom id=&amp;quot;C9&amp;quot; symbol=&amp;quot;C&amp;quot; hydrogens=&amp;quot;0&amp;quot; ionization=&amp;quot;4&amp;quot;&amp;gt;&amp;lt;/atom&amp;gt;
      &amp;lt;atom id=&amp;quot;C10&amp;quot; symbol=&amp;quot;C&amp;quot; hydrogens=&amp;quot;0&amp;quot; ionization=&amp;quot;4&amp;quot;&amp;gt;&amp;lt;/atom&amp;gt;
      &amp;lt;atom id=&amp;quot;C11&amp;quot; symbol=&amp;quot;C&amp;quot; hydrogens=&amp;quot;0&amp;quot; ionization=&amp;quot;4&amp;quot;&amp;gt;&amp;lt;/atom&amp;gt;
      &amp;lt;atom id=&amp;quot;C12&amp;quot; symbol=&amp;quot;C&amp;quot; hydrogens=&amp;quot;1&amp;quot; ionization=&amp;quot;4&amp;quot;&amp;gt;&amp;lt;/atom&amp;gt;
      &amp;lt;atom id=&amp;quot;C13&amp;quot; symbol=&amp;quot;C&amp;quot; hydrogens=&amp;quot;1&amp;quot; ionization=&amp;quot;4&amp;quot;&amp;gt;&amp;lt;/atom&amp;gt;
      &amp;lt;atom id=&amp;quot;C14&amp;quot; symbol=&amp;quot;C&amp;quot; hydrogens=&amp;quot;1&amp;quot; ionization=&amp;quot;4&amp;quot;&amp;gt;&amp;lt;/atom&amp;gt;
      &amp;lt;atom id=&amp;quot;C15&amp;quot; symbol=&amp;quot;C&amp;quot; hydrogens=&amp;quot;1&amp;quot; ionization=&amp;quot;4&amp;quot;&amp;gt;&amp;lt;/atom&amp;gt;
      &amp;lt;atom id=&amp;quot;C16&amp;quot; symbol=&amp;quot;C&amp;quot; hydrogens=&amp;quot;1&amp;quot; ionization=&amp;quot;4&amp;quot;&amp;gt;&amp;lt;/atom&amp;gt;
      &amp;lt;atom id=&amp;quot;C17&amp;quot; symbol=&amp;quot;C&amp;quot; hydrogens=&amp;quot;1&amp;quot; ionization=&amp;quot;4&amp;quot;&amp;gt;&amp;lt;/atom&amp;gt;
      &amp;lt;atom id=&amp;quot;C18&amp;quot; symbol=&amp;quot;C&amp;quot; hydrogens=&amp;quot;0&amp;quot; ionization=&amp;quot;4&amp;quot;&amp;gt;&amp;lt;/atom&amp;gt;
      &amp;lt;atom id=&amp;quot;C19&amp;quot; symbol=&amp;quot;C&amp;quot; hydrogens=&amp;quot;0&amp;quot; ionization=&amp;quot;4&amp;quot;&amp;gt;&amp;lt;/atom&amp;gt;
      &amp;lt;atom id=&amp;quot;O20&amp;quot; symbol=&amp;quot;O&amp;quot; hydrogens=&amp;quot;1&amp;quot; ionization=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/atom&amp;gt;
      &amp;lt;atom id=&amp;quot;O21&amp;quot; symbol=&amp;quot;O&amp;quot; hydrogens=&amp;quot;1&amp;quot; ionization=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/atom&amp;gt;
    &amp;lt;/atoms&amp;gt;
    &amp;lt;bonding&amp;gt;
      &amp;lt;bond source=&amp;quot;C0&amp;quot; target=&amp;quot;C1&amp;quot; bondingElectrons=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/bond&amp;gt;
      &amp;lt;bond source=&amp;quot;C1&amp;quot; target=&amp;quot;C2&amp;quot; bondingElectrons=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/bond&amp;gt;
      &amp;lt;bond source=&amp;quot;C2&amp;quot; target=&amp;quot;C3&amp;quot; bondingElectrons=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/bond&amp;gt;
      &amp;lt;bond source=&amp;quot;C3&amp;quot; target=&amp;quot;C4&amp;quot; bondingElectrons=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/bond&amp;gt;
      &amp;lt;bond source=&amp;quot;C4&amp;quot; target=&amp;quot;C5&amp;quot; bondingElectrons=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/bond&amp;gt;
      &amp;lt;bond source=&amp;quot;C0&amp;quot; target=&amp;quot;C5&amp;quot; bondingElectrons=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/bond&amp;gt;
      &amp;lt;bond source=&amp;quot;C0&amp;quot; target=&amp;quot;C6&amp;quot; bondingElectrons=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/bond&amp;gt;
      &amp;lt;bond source=&amp;quot;C6&amp;quot; target=&amp;quot;C7&amp;quot; bondingElectrons=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/bond&amp;gt;
      &amp;lt;bond source=&amp;quot;C7&amp;quot; target=&amp;quot;C8&amp;quot; bondingElectrons=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/bond&amp;gt;
      &amp;lt;bond source=&amp;quot;C8&amp;quot; target=&amp;quot;C9&amp;quot; bondingElectrons=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/bond&amp;gt;
      &amp;lt;bond source=&amp;quot;C9&amp;quot; target=&amp;quot;C1&amp;quot; bondingElectrons=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/bond&amp;gt;
      &amp;lt;bondingSystem bondingElectrons=&amp;quot;10&amp;quot;&amp;gt;
        &amp;lt;connections&amp;gt;
          &amp;lt;atomPair source=&amp;quot;C0&amp;quot; target=&amp;quot;C1&amp;quot;&amp;gt;&amp;lt;/atomPair&amp;gt;
          &amp;lt;atomPair source=&amp;quot;C1&amp;quot; target=&amp;quot;C2&amp;quot;&amp;gt;&amp;lt;/atomPair&amp;gt;
          &amp;lt;atomPair source=&amp;quot;C2&amp;quot; target=&amp;quot;C3&amp;quot;&amp;gt;&amp;lt;/atomPair&amp;gt;
          &amp;lt;atomPair source=&amp;quot;C3&amp;quot; target=&amp;quot;C4&amp;quot;&amp;gt;&amp;lt;/atomPair&amp;gt;
          &amp;lt;atomPair source=&amp;quot;C4&amp;quot; target=&amp;quot;C5&amp;quot;&amp;gt;&amp;lt;/atomPair&amp;gt;
          &amp;lt;atomPair source=&amp;quot;C0&amp;quot; target=&amp;quot;C5&amp;quot;&amp;gt;&amp;lt;/atomPair&amp;gt;
          &amp;lt;atomPair source=&amp;quot;C0&amp;quot; target=&amp;quot;C6&amp;quot;&amp;gt;&amp;lt;/atomPair&amp;gt;
          &amp;lt;atomPair source=&amp;quot;C6&amp;quot; target=&amp;quot;C7&amp;quot;&amp;gt;&amp;lt;/atomPair&amp;gt;
          &amp;lt;atomPair source=&amp;quot;C7&amp;quot; target=&amp;quot;C8&amp;quot;&amp;gt;&amp;lt;/atomPair&amp;gt;
          &amp;lt;atomPair source=&amp;quot;C8&amp;quot; target=&amp;quot;C9&amp;quot;&amp;gt;&amp;lt;/atomPair&amp;gt;
          &amp;lt;atomPair source=&amp;quot;C9&amp;quot; target=&amp;quot;C1&amp;quot;&amp;gt;&amp;lt;/atomPair&amp;gt;
        &amp;lt;/connections&amp;gt;
      &amp;lt;/bondingSystem&amp;gt;
      &amp;lt;bond source=&amp;quot;C10&amp;quot; target=&amp;quot;C11&amp;quot; bondingElectrons=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/bond&amp;gt;
      &amp;lt;bond source=&amp;quot;C11&amp;quot; target=&amp;quot;C12&amp;quot; bondingElectrons=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/bond&amp;gt;
      &amp;lt;bond source=&amp;quot;C12&amp;quot; target=&amp;quot;C13&amp;quot; bondingElectrons=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/bond&amp;gt;
      &amp;lt;bond source=&amp;quot;C13&amp;quot; target=&amp;quot;C14&amp;quot; bondingElectrons=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/bond&amp;gt;
      &amp;lt;bond source=&amp;quot;C14&amp;quot; target=&amp;quot;C15&amp;quot; bondingElectrons=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/bond&amp;gt;
      &amp;lt;bond source=&amp;quot;C10&amp;quot; target=&amp;quot;C15&amp;quot; bondingElectrons=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/bond&amp;gt;
      &amp;lt;bond source=&amp;quot;C10&amp;quot; target=&amp;quot;C16&amp;quot; bondingElectrons=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/bond&amp;gt;
      &amp;lt;bond source=&amp;quot;C16&amp;quot; target=&amp;quot;C17&amp;quot; bondingElectrons=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/bond&amp;gt;
      &amp;lt;bond source=&amp;quot;C17&amp;quot; target=&amp;quot;C18&amp;quot; bondingElectrons=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/bond&amp;gt;
      &amp;lt;bond source=&amp;quot;C18&amp;quot; target=&amp;quot;C19&amp;quot; bondingElectrons=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/bond&amp;gt;
      &amp;lt;bond source=&amp;quot;C19&amp;quot; target=&amp;quot;C11&amp;quot; bondingElectrons=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/bond&amp;gt;
      &amp;lt;bondingSystem bondingElectrons=&amp;quot;10&amp;quot;&amp;gt;
        &amp;lt;connections&amp;gt;
          &amp;lt;atomPair source=&amp;quot;C10&amp;quot; target=&amp;quot;C11&amp;quot;&amp;gt;&amp;lt;/atomPair&amp;gt;
          &amp;lt;atomPair source=&amp;quot;C11&amp;quot; target=&amp;quot;C12&amp;quot;&amp;gt;&amp;lt;/atomPair&amp;gt;
          &amp;lt;atomPair source=&amp;quot;C12&amp;quot; target=&amp;quot;C13&amp;quot;&amp;gt;&amp;lt;/atomPair&amp;gt;
          &amp;lt;atomPair source=&amp;quot;C13&amp;quot; target=&amp;quot;C14&amp;quot;&amp;gt;&amp;lt;/atomPair&amp;gt;
          &amp;lt;atomPair source=&amp;quot;C14&amp;quot; target=&amp;quot;C15&amp;quot;&amp;gt;&amp;lt;/atomPair&amp;gt;
          &amp;lt;atomPair source=&amp;quot;C10&amp;quot; target=&amp;quot;C15&amp;quot;&amp;gt;&amp;lt;/atomPair&amp;gt;
          &amp;lt;atomPair source=&amp;quot;C10&amp;quot; target=&amp;quot;C16&amp;quot;&amp;gt;&amp;lt;/atomPair&amp;gt;
          &amp;lt;atomPair source=&amp;quot;C16&amp;quot; target=&amp;quot;C17&amp;quot;&amp;gt;&amp;lt;/atomPair&amp;gt;
          &amp;lt;atomPair source=&amp;quot;C17&amp;quot; target=&amp;quot;C18&amp;quot;&amp;gt;&amp;lt;/atomPair&amp;gt;
          &amp;lt;atomPair source=&amp;quot;C18&amp;quot; target=&amp;quot;C19&amp;quot;&amp;gt;&amp;lt;/atomPair&amp;gt;
          &amp;lt;atomPair source=&amp;quot;C19&amp;quot; target=&amp;quot;C11&amp;quot;&amp;gt;&amp;lt;/atomPair&amp;gt;
        &amp;lt;/connections&amp;gt;
      &amp;lt;/bondingSystem&amp;gt;
      &amp;lt;bond source=&amp;quot;C9&amp;quot; target=&amp;quot;C19&amp;quot; bondingElectrons=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/bond&amp;gt;
      &amp;lt;bond source=&amp;quot;C8&amp;quot; target=&amp;quot;O20&amp;quot; bondingElectrons=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/bond&amp;gt;
      &amp;lt;bond source=&amp;quot;C18&amp;quot; target=&amp;quot;O21&amp;quot; bondingElectrons=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;/bond&amp;gt;
    &amp;lt;/bonding&amp;gt;
  &amp;lt;/constitution&amp;gt;
  &amp;lt;conformation&amp;gt;
    &amp;lt;conformationWheel&amp;gt;
      &amp;lt;gammaSequence source=&amp;quot;C19&amp;quot; target=&amp;quot;C9&amp;quot;&amp;gt;
        &amp;lt;connections&amp;gt;
          &amp;lt;atomPair source=&amp;quot;C9&amp;quot; target=&amp;quot;C19&amp;quot;&amp;gt;&amp;lt;/atomPair&amp;gt;
        &amp;lt;/connections&amp;gt;
      &amp;lt;/gammaSequence&amp;gt;
      &amp;lt;halfPlane&amp;gt;
        &amp;lt;lower atom=&amp;quot;C11&amp;quot;&amp;gt;&amp;lt;/lower&amp;gt;
      &amp;lt;/halfPlane&amp;gt;
      &amp;lt;halfPlane&amp;gt;
        &amp;lt;upper atom=&amp;quot;C1&amp;quot;&amp;gt;&amp;lt;/upper&amp;gt;
      &amp;lt;/halfPlane&amp;gt;
      &amp;lt;halfPlane&amp;gt;
        &amp;lt;lower atom=&amp;quot;C18&amp;quot;&amp;gt;&amp;lt;/lower&amp;gt;
      &amp;lt;/halfPlane&amp;gt;
      &amp;lt;halfPlane&amp;gt;
        &amp;lt;upper atom=&amp;quot;C8&amp;quot;&amp;gt;&amp;lt;/upper&amp;gt;
      &amp;lt;/halfPlane&amp;gt;
    &amp;lt;/conformationWheel&amp;gt;
  &amp;lt;/conformation&amp;gt;
&amp;lt;/molecule&amp;gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20070109/r_binol.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;We've elected to represent BINOL's two pi-systems as ten-atom, ten-electron &lt;tt&gt;bondingSystems&lt;/tt&gt;. We could have just as easily represented each naphthalene ring using alternating single/double &lt;tt&gt;bonds&lt;/tt&gt; containing two and four electrons, respectively. For an explanation of multi-atom pi-system bonding in FlexMol, see &lt;a href="http://depth-first.com/articles/2006/12/20/a-molecular-language-for-modern-chemistry-getting-started-with-flexmol"&gt;this article&lt;/a&gt;.&lt;/p&gt;
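
&lt;p&gt;As a quick sanity check on the two alternatives, the short Ruby sketch below counts the electrons assigned to the ring bonds of one naphthalene unit under each representation. It is plain arithmetic rather than Octet code, and the variable names are illustrative only. It assumes the usual Kekule drawing of naphthalene, with five double bonds and six single bonds among its eleven ring connections.&lt;/p&gt;

```ruby
# Electron bookkeeping for the ring bonds of one naphthalene unit of BINOL.
# Plain arithmetic; nothing here calls Octet or parses FlexMol.
ring_bonds = 11 # carbon-carbon connections in one ten-atom ring system

# Representation used in the listing: eleven two-electron bonds plus
# one ten-electron bondingSystem spanning the same connections.
delocalized = ring_bonds * 2 + 10

# Kekule-style alternative (assumed drawing): five double bonds holding
# four electrons each and six single bonds holding two electrons each.
double_bonds = 5
single_bonds = ring_bonds - double_bonds
kekule = double_bonds * 4 + single_bonds * 2

puts delocalized # prints 32
puts kekule      # prints 32
```

&lt;p&gt;Both approaches place the same 32 electrons on the ring framework; they differ only in whether the ten pi electrons are localized in pairs or shared across the whole ring system.&lt;/p&gt;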

&lt;p&gt;The stereochemically-relevant part of this document is contained within the &lt;tt&gt;conformation&lt;/tt&gt; element. A &lt;tt&gt;gammaSequence&lt;/tt&gt;, or conformational axis, is defined along with four non-empty &lt;tt&gt;halfPlanes&lt;/tt&gt;. Notice how the basic structure of this &lt;tt&gt;conformation&lt;/tt&gt; element closely resembles &lt;a href="http://depth-first.com/articles/2007/01/02/a-molecular-language-for-modern-chemistry-flexmol-and-alkene-geometrical-isomerism"&gt;the one for 2-butene&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To better visualize the &lt;tt&gt;conformation&lt;/tt&gt; element of (&lt;em&gt;R&lt;/em&gt;)-BINOL, consider the following diagram:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20070109/r_binol_planes.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;The &lt;tt&gt;conformationWheel&lt;/tt&gt; defines a conformational axis vector from atom C19 to atom C9. Arranged about this axis in a clockwise fashion are four non-empty &lt;tt&gt;halfPlanes&lt;/tt&gt;. Picking an arbitrary &lt;tt&gt;halfPlane&lt;/tt&gt; to start with, atom C11 is positioned first in the lower half. This is then followed by the next &lt;tt&gt;halfPlane&lt;/tt&gt;, which contains atom C1 in its upper half. The next &lt;tt&gt;halfPlane&lt;/tt&gt; contains atom C18 in the lower half. Finally, atom C8 is located in the last &lt;tt&gt;halfPlane&lt;/tt&gt;'s upper half.&lt;/p&gt;

&lt;p&gt;This procedure completely specifies the axial chirality of (&lt;em&gt;R&lt;/em&gt;)-BINOL. Notice how no arbitrary stereodescriptors or chiral templates were used. Of course, we could derive the Cahn-Ingold-Prelog stereodescriptor of (&lt;em&gt;R&lt;/em&gt;), given the right software.&lt;/p&gt;

&lt;p&gt;Many representations of the same chiral axis are possible, just as each connection table can be represented in many different ways. For example, we could have started the &lt;tt&gt;conformation&lt;/tt&gt; element with the &lt;tt&gt;halfPlane&lt;/tt&gt; containing atom C1. In this case, the ordering of atoms would be C1, C18, C8, C11. Similarly, the orientation of our chiral axis could have been defined from atom C9 to atom C19. In this case the ordering of &lt;tt&gt;halfPlanes&lt;/tt&gt; would be reversed, and the upper/lower designations would be inverted.&lt;/p&gt;

&lt;h4&gt;(&lt;em&gt;S&lt;/em&gt;)-BINOL&lt;/h4&gt;

&lt;p&gt;How is (&lt;em&gt;S&lt;/em&gt;)-BINOL encoded in FlexMol? As you might expect, completely analogously to the (&lt;em&gt;R&lt;/em&gt;) enantiomer:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_xml "&gt;&lt;span class="comment"&gt;&amp;lt;!-- snip --&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;conformation&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;conformationWheel&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;gammaSequence&lt;/span&gt; &lt;span class="attribute"&gt;source&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C19&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;target&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C9&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;connections&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;atomPair&lt;/span&gt; &lt;span class="attribute"&gt;source&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C9&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;target&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C19&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;atomPair&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;connections&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;gammaSequence&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;halfPlane&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;lower&lt;/span&gt; &lt;span class="attribute"&gt;atom&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C11&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;lower&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;halfPlane&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;halfPlane&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;upper&lt;/span&gt; &lt;span class="attribute"&gt;atom&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C8&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;upper&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;halfPlane&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;halfPlane&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;lower&lt;/span&gt; &lt;span class="attribute"&gt;atom&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C18&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;lower&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;halfPlane&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;halfPlane&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;upper&lt;/span&gt; &lt;span class="attribute"&gt;atom&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C1&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;upper&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;halfPlane&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;conformationWheel&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;conformation&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="comment"&gt;&amp;lt;!-- snip --&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20070109/s_binol.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;As with (&lt;em&gt;R&lt;/em&gt;)-BINOL, we can create a diagram representing the &lt;tt&gt;conformationWheel&lt;/tt&gt; of (&lt;em&gt;S&lt;/em&gt;)-BINOL:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20070109/s_binol_planes.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;
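
&lt;p&gt;The equivalence rules described for (&lt;em&gt;R&lt;/em&gt;)-BINOL can be sketched in a few lines of Ruby. This is only an illustration of the rules as stated in this article, not part of FlexMol or Octet: the Hash layout and the helpers &lt;tt&gt;rotations&lt;/tt&gt;, &lt;tt&gt;reverse_axis&lt;/tt&gt;, and &lt;tt&gt;same_wheel?&lt;/tt&gt; are hypothetical names introduced here. Two wheels are treated as the same if one can be turned into the other by starting at a different &lt;tt&gt;halfPlane&lt;/tt&gt;, by reversing the axis (which reverses the &lt;tt&gt;halfPlane&lt;/tt&gt; order and inverts upper/lower), or both.&lt;/p&gt;

```ruby
# A conformationWheel modeled as a plain Hash (hypothetical layout, not Octet's API):
#   axis:   [source, target] of the gammaSequence
#   planes: halfPlane entries in document order, each [position, atom id]

# All cyclic rotations of the halfPlane list. Starting from a different
# halfPlane describes the same wheel.
def rotations(planes)
  (0...planes.size).map { |shift| planes.rotate(shift) }
end

# Reverse the axis direction: the halfPlane order is reversed and the
# upper/lower designations are inverted, as the article describes.
def reverse_axis(wheel)
  flipped = wheel[:planes].reverse.map do |position, atom|
    [position == :upper ? :lower : :upper, atom]
  end
  { axis: wheel[:axis].reverse, planes: flipped }
end

# True when b describes the same wheel as a, either directly or after
# reversing its axis, allowing any starting halfPlane.
def same_wheel?(a, b)
  [b, reverse_axis(b)].any? do |candidate|
    candidate[:axis] == a[:axis] and rotations(candidate[:planes]).include?(a[:planes])
  end
end

r_binol = {
  axis: %w[C19 C9],
  planes: [[:lower, "C11"], [:upper, "C1"], [:lower, "C18"], [:upper, "C8"]]
}
s_binol = {
  axis: %w[C19 C9],
  planes: [[:lower, "C11"], [:upper, "C8"], [:lower, "C18"], [:upper, "C1"]]
}

# The same (R) wheel, started at the halfPlane containing C1.
r_rotated = { axis: %w[C19 C9], planes: r_binol[:planes].rotate(1) }
# The same (R) wheel, with the axis defined from C9 to C19.
r_reversed = reverse_axis(r_binol)

puts same_wheel?(r_binol, r_rotated)  # prints true
puts same_wheel?(r_binol, r_reversed) # prints true
puts same_wheel?(r_binol, s_binol)    # prints false
```

&lt;p&gt;Under these rules the two alternative (&lt;em&gt;R&lt;/em&gt;) encodings match the original, while the (&lt;em&gt;S&lt;/em&gt;) wheel, which differs only in the positions of C1 and C8, does not.&lt;/p&gt;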

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;As you can see, FlexMol completely encodes axial chirality using just a few basic XML elements, rather than chiral templates or stereodescriptors. These were, in fact, the same elements used to encode alkene geometrical isomerism. This modular approach to stereoisomerism results in an extensible system. Future articles will discuss other forms of stereoisomerism that can be represented in FlexMol, including the all-important tetrahedral stereogenic center. &lt;/p&gt;</description>
      <pubDate>Tue, 09 Jan 2007 15:50:00 -0500</pubDate>
      <guid isPermaLink="false">urn:uuid:f00f8003-6896-4636-948c-82acf1194e43</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/01/09/a-molecular-language-for-modern-chemistry-flexmol-and-axial-chirality</link>
      <category>Tools</category>
      <category>flexmol</category>
      <category>binol</category>
      <category>conformation</category>
      <category>xml</category>
    </item>
    <item>
      <title>A Molecular Language for Modern Chemistry: Getting Started with FlexMol</title>
      <description>&lt;p&gt;Existing molecular languages are limited in their ability to represent such commonplace features as multi-center bonding and axial chirality. The practical outcome of these limitations can be seen in &lt;a href="http://depth-first.com/articles/2006/12/12/the-problem-with-ferrocene"&gt;PubChem's four separate entries for ferrocene&lt;/a&gt; and the inability to fully represent &lt;a href="http://depth-first.com/articles/2006/12/19/ferrocene-and-beyond-a-solution-to-the-molecular-representation-problem"&gt;many molecules now in common use by organic chemists&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://depth-first.com/articles/2006/12/19/ferrocene-and-beyond-a-solution-to-the-molecular-representation-problem"&gt;A recent article&lt;/a&gt; touched on a molecular representation system with far greater expressive power than those currently in use. In this article, I'll introduce FlexMol, an XML implementation of this advanced molecular representation system.&lt;/p&gt;

&lt;h4&gt;What is FlexMol?&lt;/h4&gt;

&lt;p&gt;FlexMol is an XML-based molecular language designed to represent any molecule faithfully, regardless of its peculiarities. FlexMol can encode the following features:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Multi-atom, multi-electron bonds&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;All known forms of stereochemistry, including axial chirality (e.g., allenes and biaryls), planar chirality (e.g., metallocenes), and non-tetrahedral stereocenters (e.g., square planar and octahedral metal complexes)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Non-natural isotopic distributions and pure isotopes&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Virtual hydrogens (similar to "implicit hydrogens") through mandatory, explicit enumeration&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Electronic spin, enabling the differentiation of spin states&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;What Does FlexMol Look Like?&lt;/h4&gt;

&lt;p&gt;Let's start with the simple example of benzene:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_xml "&gt;&lt;span class="comment"&gt;&amp;lt;!-- Benzene, represented as &amp;quot;1,3,5-cyclohexatriene&amp;quot; --&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;?&lt;/span&gt;&lt;span class="tag"&gt;xml&lt;/span&gt; &lt;span class="attribute"&gt;version&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;1.0&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;standalone&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;yes&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;?&amp;gt;&lt;/span&gt;

&lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;molecule&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;constitution&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;atoms&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;atom&lt;/span&gt; &lt;span class="attribute"&gt;id&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C0&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;symbol&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;hydrogens&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;ionization&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;4&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;atom&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;atom&lt;/span&gt; &lt;span class="attribute"&gt;id&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C1&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;symbol&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;hydrogens&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;ionization&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;4&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;atom&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;atom&lt;/span&gt; &lt;span class="attribute"&gt;id&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C2&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;symbol&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;hydrogens&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;ionization&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;4&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;atom&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;atom&lt;/span&gt; &lt;span class="attribute"&gt;id&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C3&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;symbol&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;hydrogens&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;ionization&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;4&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;atom&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;atom&lt;/span&gt; &lt;span class="attribute"&gt;id&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C4&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;symbol&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;hydrogens&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;ionization&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;4&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;atom&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;atom&lt;/span&gt; &lt;span class="attribute"&gt;id&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C5&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;symbol&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;hydrogens&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;ionization&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;4&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;atom&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;atoms&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;bonding&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;bond&lt;/span&gt; &lt;span class="attribute"&gt;source&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C0&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;target&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C1&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;bondingElectrons&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;2&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;bond&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;bond&lt;/span&gt; &lt;span class="attribute"&gt;source&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C1&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;target&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C2&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;bondingElectrons&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;4&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;bond&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;bond&lt;/span&gt; &lt;span class="attribute"&gt;source&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C2&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;target&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C3&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;bondingElectrons&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;2&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;bond&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;bond&lt;/span&gt; &lt;span class="attribute"&gt;source&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C3&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;target&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C4&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;bondingElectrons&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;4&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;bond&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;bond&lt;/span&gt; &lt;span class="attribute"&gt;source&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C4&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;target&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C5&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;bondingElectrons&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;2&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;bond&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;bond&lt;/span&gt; &lt;span class="attribute"&gt;source&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C0&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;target&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C5&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;bondingElectrons&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;4&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;bond&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;bonding&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;constitution&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;molecule&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The above representation divides the structure of benzene into two main elements: &lt;tt&gt;atoms&lt;/tt&gt; and &lt;tt&gt;bonding&lt;/tt&gt;. Both are in turn subelements of the &lt;tt&gt;constitution&lt;/tt&gt; element, which specifies atom connectivity. Had we been representing a molecule with stereochemical features, the document could also have contained a &lt;tt&gt;configuration&lt;/tt&gt; element, a &lt;tt&gt;conformation&lt;/tt&gt; element, or both.&lt;/p&gt;

&lt;p&gt;Within the &lt;tt&gt;atoms&lt;/tt&gt; element are definitions for each of the six equivalent carbon atoms of benzene. Each atom is assigned a unique ID for use elsewhere in the document, an atomic symbol, the number of hydrogens bonded to it, and its effective ionization state. The mandatory &lt;tt&gt;hydrogens&lt;/tt&gt; attribute specifies "virtual" hydrogens, or those associated with an atom without being full-fledged nodes in the graph representation.&lt;/p&gt;
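
&lt;p&gt;To make this bookkeeping concrete, here is a minimal Python sketch (not part of FlexMol itself) that builds the &lt;tt&gt;atoms&lt;/tt&gt; element for benzene with the standard library's &lt;tt&gt;xml.etree.ElementTree&lt;/tt&gt; and derives element counts from the &lt;tt&gt;symbol&lt;/tt&gt; and &lt;tt&gt;hydrogens&lt;/tt&gt; attributes. The helper names &lt;tt&gt;make_benzene_atoms&lt;/tt&gt;, &lt;tt&gt;check_atoms&lt;/tt&gt;, and &lt;tt&gt;element_counts&lt;/tt&gt; are hypothetical, and the checks assume only what is stated above: IDs are unique and &lt;tt&gt;hydrogens&lt;/tt&gt; is mandatory.&lt;/p&gt;

```python
import xml.etree.ElementTree as ET
from collections import Counter


def make_benzene_atoms():
    """Build an atoms element matching the six carbon entries in the listing."""
    atoms = ET.Element("atoms")
    for index in range(6):
        ET.SubElement(atoms, "atom", {
            "id": f"C{index}",
            "symbol": "C",
            "hydrogens": "1",
            "ionization": "4",  # value copied from the listing, not interpreted here
        })
    return atoms


def check_atoms(atoms):
    """Raise ValueError if an ID repeats or a hydrogens attribute is missing."""
    seen = set()
    for atom in atoms.iter("atom"):
        atom_id = atom.get("id")
        if atom_id in seen:
            raise ValueError(f"duplicate atom id: {atom_id}")
        seen.add(atom_id)
        if atom.get("hydrogens") is None:
            raise ValueError(f"atom {atom_id} lacks the hydrogens attribute")


def element_counts(atoms):
    """Count heavy atoms by symbol and add their virtual hydrogens."""
    counts = Counter()
    for atom in atoms.iter("atom"):
        counts[atom.get("symbol")] += 1
        counts["H"] += int(atom.get("hydrogens"))
    return counts


atoms = make_benzene_atoms()
check_atoms(atoms)
counts = element_counts(atoms)
print(counts["C"], counts["H"])
```

&lt;p&gt;For benzene this prints &lt;tt&gt;6 6&lt;/tt&gt;, the expected C6H6 composition, even though no hydrogen appears as a node of its own.&lt;/p&gt;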

&lt;p&gt;The &lt;tt&gt;bonding&lt;/tt&gt; element defines all of the bonding arrangements within benzene. In this case, benzene is being represented as "cyclohexatriene" with alternating single and double bonds; below we'll see how to use FlexMol to represent delocalized (aromatic) bonding. Each bond specifies a source atom, a target atom, and the number of bonding electrons.&lt;/p&gt;
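
&lt;p&gt;As a rough illustration of how the &lt;tt&gt;bondingElectrons&lt;/tt&gt; attribute might be consumed, the following Python sketch rebuilds the "cyclohexatriene" &lt;tt&gt;bonding&lt;/tt&gt; element with &lt;tt&gt;xml.etree.ElementTree&lt;/tt&gt; and converts each electron count to a conventional bond order. The function names are hypothetical, and dividing the electron count by two is an assumption that holds only for ordinary two-center bonds.&lt;/p&gt;

```python
import xml.etree.ElementTree as ET

# Ring bonds in the same order as the listing: C0-C1, C1-C2, ..., C4-C5, C0-C5.
RING = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)]
ELECTRONS = [2, 4, 2, 4, 2, 4]


def build_cyclohexatriene_bonding():
    """Build the bonding element with alternating 2- and 4-electron bonds."""
    bonding = ET.Element("bonding")
    for (source, target), count in zip(RING, ELECTRONS):
        ET.SubElement(bonding, "bond", {
            "source": f"C{source}",
            "target": f"C{target}",
            "bondingElectrons": str(count),
        })
    return bonding


def bond_orders(bonding):
    """Map each (source, target) pair to its bond order (electrons // 2)."""
    return {
        (bond.get("source"), bond.get("target")): int(bond.get("bondingElectrons")) // 2
        for bond in bonding.iter("bond")
    }


bonding = build_cyclohexatriene_bonding()
orders = bond_orders(bonding)
total = sum(int(bond.get("bondingElectrons")) for bond in bonding.iter("bond"))
print(orders[("C0", "C1")], orders[("C1", "C2")], total)
```

&lt;p&gt;This prints &lt;tt&gt;1 2 18&lt;/tt&gt;: a single bond, a double bond, and 18 bonding electrons among the ring carbons.&lt;/p&gt;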

&lt;p&gt;In many situations, the above representation of benzene will not suffice. What if we want to describe the one-electron ionization of benzene to form the benzene radical cation? Using the "cyclohexatriene" form of benzene makes it impossible to select the correct bond from which to remove the electron.&lt;/p&gt;

&lt;p&gt;Instead, we could use a more physically meaningful representation of benzene, such as that shown below:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_xml "&gt;&lt;span class="comment"&gt;&amp;lt;!-- Benzene, represented with a delocalized pi-system --&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;?&lt;/span&gt;&lt;span class="tag"&gt;xml&lt;/span&gt; &lt;span class="attribute"&gt;version&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;1.0&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;standalone&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;yes&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;?&amp;gt;&lt;/span&gt;

&lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;molecule&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;constitution&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;atoms&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;atom&lt;/span&gt; &lt;span class="attribute"&gt;id&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C0&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;symbol&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;hydrogens&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;ionization&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;4&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;atom&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;atom&lt;/span&gt; &lt;span class="attribute"&gt;id&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C1&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;symbol&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;hydrogens&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;ionization&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;4&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;atom&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;atom&lt;/span&gt; &lt;span class="attribute"&gt;id&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C2&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;symbol&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;hydrogens&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;ionization&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;4&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;atom&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;atom&lt;/span&gt; &lt;span class="attribute"&gt;id&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C3&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;symbol&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;hydrogens&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;ionization&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;4&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;atom&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;atom&lt;/span&gt; &lt;span class="attribute"&gt;id&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C4&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;symbol&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;hydrogens&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;ionization&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;4&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;atom&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;atom&lt;/span&gt; &lt;span class="attribute"&gt;id&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C5&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;symbol&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;hydrogens&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;ionization&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;4&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;atom&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;atoms&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;bonding&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;bond&lt;/span&gt; &lt;span class="attribute"&gt;source&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C0&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;target&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C1&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;bondingElectrons&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;2&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;bond&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;bond&lt;/span&gt; &lt;span class="attribute"&gt;source&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C1&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;target&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C2&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;bondingElectrons&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;2&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;bond&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;bond&lt;/span&gt; &lt;span class="attribute"&gt;source&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C2&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;target&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C3&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;bondingElectrons&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;2&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;bond&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;bond&lt;/span&gt; &lt;span class="attribute"&gt;source&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C3&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;target&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C4&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;bondingElectrons&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;2&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;bond&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;bond&lt;/span&gt; &lt;span class="attribute"&gt;source&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C4&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;target&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C5&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;bondingElectrons&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;2&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;bond&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;bond&lt;/span&gt; &lt;span class="attribute"&gt;source&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C0&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;target&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C5&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;bondingElectrons&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;2&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;bond&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;bondingSystem&lt;/span&gt; &lt;span class="attribute"&gt;bondingElectrons&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;6&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;connections&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;atomPair&lt;/span&gt; &lt;span class="attribute"&gt;source&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C0&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;target&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C1&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;atomPair&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;atomPair&lt;/span&gt; &lt;span class="attribute"&gt;source&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C1&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;target&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C2&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;atomPair&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;atomPair&lt;/span&gt; &lt;span class="attribute"&gt;source&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C2&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;target&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C3&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;atomPair&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;atomPair&lt;/span&gt; &lt;span class="attribute"&gt;source&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C3&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;target&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C4&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;atomPair&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;atomPair&lt;/span&gt; &lt;span class="attribute"&gt;source&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C4&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;target&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C5&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;atomPair&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;atomPair&lt;/span&gt; &lt;span class="attribute"&gt;source&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C0&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;target&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;C5&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;atomPair&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;connections&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;bondingSystem&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;bonding&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;constitution&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;molecule&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This is certainly more verbose, but what does it buy us? Notice the &lt;tt&gt;bondingSystem&lt;/tt&gt; subelement at the end of the &lt;tt&gt;bonding&lt;/tt&gt; element. Here we define an extended six-atom, six-electron bonding system that much more closely reflects the true nature of benzene's pi-system. Now it's obvious that this is the bonding motif from which to take an electron to make the benzene radical cation.&lt;/p&gt;
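
&lt;p&gt;One way to picture the ionization is the Python sketch below, which builds the delocalized &lt;tt&gt;bonding&lt;/tt&gt; element with &lt;tt&gt;xml.etree.ElementTree&lt;/tt&gt;, totals its electrons, and then removes one electron from the &lt;tt&gt;bondingSystem&lt;/tt&gt;. The &lt;tt&gt;ionize&lt;/tt&gt; helper is hypothetical; how FlexMol records the resulting charge and spin is not covered here, so the sketch only touches the electron count of the bonding system.&lt;/p&gt;

```python
import xml.etree.ElementTree as ET

# Ring connectivity in the same order as the listing.
RING = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)]


def build_delocalized_bonding():
    """Build six 2-electron bonds plus one six-atom, 6-electron bondingSystem."""
    bonding = ET.Element("bonding")
    for source, target in RING:
        ET.SubElement(bonding, "bond", {
            "source": f"C{source}",
            "target": f"C{target}",
            "bondingElectrons": "2",
        })
    system = ET.SubElement(bonding, "bondingSystem", {"bondingElectrons": "6"})
    connections = ET.SubElement(system, "connections")
    for source, target in RING:
        ET.SubElement(connections, "atomPair", {
            "source": f"C{source}",
            "target": f"C{target}",
        })
    return bonding


def total_bonding_electrons(bonding):
    """Sum electrons over two-center bonds and delocalized bonding systems."""
    elements = list(bonding.iter("bond")) + list(bonding.iter("bondingSystem"))
    return sum(int(element.get("bondingElectrons")) for element in elements)


def ionize(bonding):
    """Remove one electron from the bondingSystem (hypothetical helper)."""
    system = bonding.find("bondingSystem")
    system.set("bondingElectrons", str(int(system.get("bondingElectrons")) - 1))


bonding = build_delocalized_bonding()
neutral = total_bonding_electrons(bonding)
ionize(bonding)
cation = total_bonding_electrons(bonding)
print(neutral, cation)
```

&lt;p&gt;This prints &lt;tt&gt;18 17&lt;/tt&gt;. The neutral total matches the 18 bonding electrons of the "cyclohexatriene" form (three 2-electron and three 4-electron bonds), but here there is exactly one place from which the electron can be removed.&lt;/p&gt;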

&lt;p&gt;Next, consider the cyclopentadienyl anion, which possesses a five-atom, six-electron Hueckel aromatic bonding system. We can appl