<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/stylesheets/rss.css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>Depth-First: Hacking PubChem: Learning to Speak PUG</title>
    <link>http://depth-first.com/articles/2007/06/11/hacking-pubchem-learning-to-speak-pug</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description>Walking the Web of Chemical Informatics</description>
    <item>
      <title>Hacking PubChem: Learning to Speak PUG</title>
      <description>&lt;p&gt;&lt;a href="http://flickr.com/photos/40121670@N00/478093105/"&gt;&lt;img src="http://depth-first.com/demo/20070611/pug.jpg" align="right" border="0"&gt;&lt;/img&gt;&lt;/a&gt;A previous article introduced PubChem's &lt;a href="http://depth-first.com/articles/2007/06/04/hacking-pubchem-power-user-gateway"&gt;Power User Gateway&lt;/a&gt; (PUG), an XML-based communication channel. Although NIH kindly supplies a &lt;a href="ftp://ftp.ncbi.nlm.nih.gov/pubchem/specifications/pug.xsd"&gt;commented schema&lt;/a&gt; for PUG queries and responses, there's nothing like seeing real examples when learning a new language. This article will describe one method for conveniently generating PUG XML queries.&lt;/p&gt;

&lt;h4&gt;Let PubChem Build Your Query&lt;/h4&gt;

&lt;p&gt;One of the options on the &lt;a href="http://pubchem.ncbi.nlm.nih.gov/search/search.cgi"&gt;PubChem search page&lt;/a&gt; is "Save Query." As it turns out, PubChem saves queries in PUG XML (I'll just call it PUGML). In other words, preparing a query using the PubChem search page and saving it gives a simple method for creating PUGML queries. Let's try it.&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20070611/screenshot.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;Using the "Sketch" button, draw the structure of benzimidazole. Under "Search Type", select "Substructure." Now click "Save Query", and you'll download a substructure query for benzimidazole in PUGML:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_xml "&gt;&lt;span class="punct"&gt;&amp;lt;?&lt;/span&gt;&lt;span class="tag"&gt;xml&lt;/span&gt; &lt;span class="attribute"&gt;version&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;1.0&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;?&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;!&lt;/span&gt;&lt;span class="tag"&gt;DOCTYPE&lt;/span&gt; &lt;span class="attribute"&gt;PCT-Data&lt;/span&gt; &lt;span class="attribute"&gt;PUBLIC&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;-//NCBI//NCBI PCTools/EN&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;http://pubchem.ncbi.nlm.nih.gov/pug/pug.dtd&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data_input&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-InputData&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-InputData_query&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Query&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Query_type&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryType&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
              &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryType_css&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryCompoundCS&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryCompoundCS_query&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryCompoundCS_query_data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;C1=CC=CC2=C1N=C[N]2&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryCompoundCS_query_data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                  &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryCompoundCS_query&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryCompoundCS_type&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryCompoundCS_type_subss&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-CSStructure&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                        &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-CSStructure_bonds&lt;/span&gt; &lt;span class="attribute"&gt;value&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;true&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;/&amp;gt;&lt;/span&gt;
                      &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-CSStructure&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryCompoundCS_type_subss&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                  &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryCompoundCS_type&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryCompoundCS_results&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;2000000&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryCompoundCS_results&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
                &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryCompoundCS&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
              &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryType_css&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-QueryType&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Query_type&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Query&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-InputData_query&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-InputData&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data_input&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

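&lt;p&gt;The &lt;tt&gt;PCT-QueryCompoundCS_type_subss&lt;/tt&gt; element tells PUG to run a substructure search, using the SMILES string in &lt;tt&gt;PCT-QueryCompoundCS_query_data&lt;/tt&gt; as the query structure.&lt;/p&gt;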
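
&lt;p&gt;The same document can also be generated in code. What follows is a minimal Python sketch, not anything PubChem provides: it rebuilds the saved query element by element with the standard library's &lt;tt&gt;xml.etree.ElementTree&lt;/tt&gt;. The helper names &lt;tt&gt;nest&lt;/tt&gt; and &lt;tt&gt;build_substructure_query&lt;/tt&gt; are made up for illustration, and the output leaves out the DOCTYPE line of the saved file on the assumption that PUG doesn't insist on it.&lt;/p&gt;

```python
import xml.etree.ElementTree as ET


def nest(parent, *tags):
    """Append a chain of nested child elements and return the innermost one."""
    node = parent
    for tag in tags:
        node = ET.SubElement(node, tag)
    return node


def build_substructure_query(smiles, max_results=2000000):
    """Build a PUG substructure query mirroring the saved file shown above."""
    root = ET.Element("PCT-Data")
    cs = nest(
        root,
        "PCT-Data_input",
        "PCT-InputData",
        "PCT-InputData_query",
        "PCT-Query",
        "PCT-Query_type",
        "PCT-QueryType",
        "PCT-QueryType_css",
        "PCT-QueryCompoundCS",
    )
    # The query structure; the saved file carries it as a SMILES string.
    data = nest(cs, "PCT-QueryCompoundCS_query", "PCT-QueryCompoundCS_query_data")
    data.text = smiles
    # The *_type_subss element is what asks for a substructure search.
    structure = nest(
        cs,
        "PCT-QueryCompoundCS_type",
        "PCT-QueryCompoundCS_type_subss",
        "PCT-CSStructure",
    )
    ET.SubElement(structure, "PCT-CSStructure_bonds", value="true")
    # Value copied from the saved query; presumably the maximum number of hits.
    ET.SubElement(cs, "PCT-QueryCompoundCS_results").text = str(max_results)
    return root


query = build_substructure_query("C1=CC=CC2=C1N=C[N]2")
print(ET.tostring(query, encoding="unicode"))
```

&lt;p&gt;Swapping in a different SMILES string is then a one-argument change rather than an edit to a deeply nested file.&lt;/p&gt;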

&lt;h4&gt;Using the Saved Query with PUG&lt;/h4&gt;

&lt;p&gt;Saving this file as &lt;strong&gt;benzimidazole_sss.xml&lt;/strong&gt; lets us feed it to PUG:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ curl -d @benzimidazole_sss.xml "http://pubchem.ncbi.nlm.nih.gov/pug/pug.cgi"
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;and get the following PUGML response:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_xml "&gt;&lt;span class="punct"&gt;&amp;lt;?&lt;/span&gt;&lt;span class="tag"&gt;xml&lt;/span&gt; &lt;span class="attribute"&gt;version&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;1.0&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;?&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;!&lt;/span&gt;&lt;span class="tag"&gt;DOCTYPE&lt;/span&gt; &lt;span class="attribute"&gt;PCT-Data&lt;/span&gt; &lt;span class="attribute"&gt;PUBLIC&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;-//NCBI//NCBI PCTools/EN&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;http://pubchem.ncbi.nlm.nih.gov/pug/pug.dtd&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data_output&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_status&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message_status&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status&lt;/span&gt; &lt;span class="attribute"&gt;value&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;queued&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;/&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message_status&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_status&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_output&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_output_waiting&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Waiting&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Waiting_reqid&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;62668946396085905&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Waiting_reqid&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Waiting_message&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;Structure search job was submitted&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Waiting_message&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Waiting&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_output_waiting&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_output&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data_output&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
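
&lt;p&gt;Only two pieces of this reply matter for the next step: the status value and the request ID. Here's a rough Python sketch of pulling them out with &lt;tt&gt;xml.etree.ElementTree&lt;/tt&gt;. The function names are hypothetical, and &lt;tt&gt;sample_queued_reply&lt;/tt&gt; merely rebuilds the relevant parts of the reply above so the example runs without touching the network.&lt;/p&gt;

```python
import xml.etree.ElementTree as ET


def read_submission(response_xml):
    """Return (status, request id) from a PUG reply; either may be None."""
    root = ET.fromstring(response_xml)
    status = root.find(".//PCT-Status")
    reqid = root.find(".//PCT-Waiting_reqid")
    return (
        status.get("value") if status is not None else None,
        reqid.text if reqid is not None else None,
    )


def sample_queued_reply(reqid):
    """Rebuild the relevant parts of the queued reply shown above."""
    root = ET.Element("PCT-Data")
    out = ET.SubElement(ET.SubElement(root, "PCT-Data_output"), "PCT-OutputData")
    # Status branch: carries the value attribute "queued".
    node = out
    for tag in ("PCT-OutputData_status", "PCT-Status-Message",
                "PCT-Status-Message_status"):
        node = ET.SubElement(node, tag)
    ET.SubElement(node, "PCT-Status", value="queued")
    # Output branch: carries the request id to use when asking for status.
    node = out
    for tag in ("PCT-OutputData_output", "PCT-OutputData_output_waiting",
                "PCT-Waiting"):
        node = ET.SubElement(node, tag)
    ET.SubElement(node, "PCT-Waiting_reqid").text = reqid
    return ET.tostring(root, encoding="unicode")


status, reqid = read_submission(sample_queued_reply("62668946396085905"))
print(status, reqid)
```

&lt;p&gt;Run against the sample, this prints &lt;tt&gt;queued 62668946396085905&lt;/tt&gt;.&lt;/p&gt;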

&lt;p&gt;We can then check on the status of our query by saving the following as &lt;strong&gt;status.xml&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_xml "&gt;&lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data_input&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-InputData&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-InputData_request&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Request&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Request_reqid&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;62668946396085905&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Request_reqid&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Request_type&lt;/span&gt; &lt;span class="attribute"&gt;value&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;status&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;/&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Request&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-InputData_request&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-InputData&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data_input&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;POSTing this to PUG:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ curl -d @status.xml "http://pubchem.ncbi.nlm.nih.gov/pug/pug.cgi"
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;gives us the following PUGML:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_xml "&gt;&lt;span class="punct"&gt;&amp;lt;?&lt;/span&gt;&lt;span class="tag"&gt;xml&lt;/span&gt; &lt;span class="attribute"&gt;version&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;1.0&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;?&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;!&lt;/span&gt;&lt;span class="tag"&gt;DOCTYPE&lt;/span&gt; &lt;span class="attribute"&gt;PCT-Data&lt;/span&gt; &lt;span class="attribute"&gt;PUBLIC&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;-//NCBI//NCBI PCTools/EN&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;http://pubchem.ncbi.nlm.nih.gov/pug/pug.dtd&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data_output&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_status&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message_status&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status&lt;/span&gt; &lt;span class="attribute"&gt;value&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;success&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;/&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message_status&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message_message&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;Your search has already been completed successfully!.&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message_message&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Status-Message&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_status&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_output&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_output_entrez&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Entrez&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Entrez_db&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;pccompound&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Entrez_db&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Entrez_query-key&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;1&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Entrez_query-key&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;PCT-Entrez_webenv&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;0CPrI_peUmUtWDooyjxpJ1XAXPcOl-ESZZxj8sJV9ZDR8musMjh1oBTib@1EDD43FA66AE1BE0_0001SID&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Entrez_webenv&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Entrez&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_output_entrez&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData_output&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-OutputData&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data_output&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;PCT-Data&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
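
&lt;p&gt;The submit-then-ask-again exchange above lends itself to a small loop. The Python sketch below is one way it might look, under a few assumptions: the function names are invented, &lt;tt&gt;post&lt;/tt&gt; is a hypothetical callable that sends an XML string to PUG and returns the reply (a wrapper around curl would do), and the loop only watches for the &lt;tt&gt;PCT-Entrez&lt;/tt&gt; element, so it does not handle any error status PUG might report. A stand-in &lt;tt&gt;fake_post&lt;/tt&gt; with a placeholder WebEnv keeps the example self-contained.&lt;/p&gt;

```python
import time
import xml.etree.ElementTree as ET


def build_status_request(reqid):
    """Build the status.xml document shown above for a given request id."""
    root = ET.Element("PCT-Data")
    node = root
    for tag in ("PCT-Data_input", "PCT-InputData",
                "PCT-InputData_request", "PCT-Request"):
        node = ET.SubElement(node, tag)
    ET.SubElement(node, "PCT-Request_reqid").text = str(reqid)
    ET.SubElement(node, "PCT-Request_type", value="status")
    return ET.tostring(root, encoding="unicode")


def read_entrez(response_xml):
    """Return the Entrez fields of a reply as a dict, or None if absent."""
    entrez = ET.fromstring(response_xml).find(".//PCT-Entrez")
    if entrez is None:
        return None
    return {
        "db": entrez.findtext("PCT-Entrez_db"),
        "query_key": entrez.findtext("PCT-Entrez_query-key"),
        "webenv": entrez.findtext("PCT-Entrez_webenv"),
    }


def poll(reqid, post, delay=2.0, attempts=30):
    """Ask for status until Entrez fields appear; give up after `attempts`."""
    for _ in range(attempts):
        result = read_entrez(post(build_status_request(reqid)))
        if result is not None:
            return result
        time.sleep(delay)
    return None


def fake_post(request_xml):
    """Stand-in for the HTTP POST: always answers with a finished search."""
    root = ET.Element("PCT-Data")
    node = root
    for tag in ("PCT-Data_output", "PCT-OutputData", "PCT-OutputData_output",
                "PCT-OutputData_output_entrez", "PCT-Entrez"):
        node = ET.SubElement(node, tag)
    ET.SubElement(node, "PCT-Entrez_db").text = "pccompound"
    ET.SubElement(node, "PCT-Entrez_query-key").text = "1"
    ET.SubElement(node, "PCT-Entrez_webenv").text = "EXAMPLE_WEBENV"
    return ET.tostring(root, encoding="unicode")


print(poll("62668946396085905", fake_post, delay=0))
```

&lt;p&gt;Passing the sender in as an argument keeps the XML handling separate from the HTTP details, which makes the loop easy to try out offline.&lt;/p&gt;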

&lt;p&gt;&lt;a href="http://depth-first.com/articles/2007/06/04/hacking-pubchem-power-user-gateway"&gt;Last time&lt;/a&gt;, we got a URL for downloading a gzipped SD File. This time, the results come back as an Entrez history key, carried by the &lt;tt&gt;PCT-Entrez_query-key&lt;/tt&gt; and &lt;tt&gt;PCT-Entrez_webenv&lt;/tt&gt; elements. Plugging those two values into a URL lets us view the results:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_default "&gt;http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=HistorySearch&amp;amp;WebEnvRq=1&amp;amp;db=pccompound&amp;amp;query_key=1&amp;amp;WebEnv=0CPrI_peUmUtWDooyjxpJ1XAXPcOl-ESZZxj8sJV9ZDR8musMjh1oBTib%401EDD43FA66AE1BE0_0001SID&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
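
&lt;p&gt;Note that the "@" in the WebEnv value has to be percent-encoded as &lt;tt&gt;%40&lt;/tt&gt;. Rather than escaping by hand, the URL can be assembled with Python's &lt;tt&gt;urllib.parse.urlencode&lt;/tt&gt;. This is only a sketch: the function name &lt;tt&gt;entrez_history_url&lt;/tt&gt; is invented, and the parameter names are simply copied from the URL above.&lt;/p&gt;

```python
from urllib.parse import urlencode

ENTREZ_URL = "http://www.ncbi.nlm.nih.gov/sites/entrez"


def entrez_history_url(db, query_key, webenv):
    """Assemble the history-search URL from the three PCT-Entrez values."""
    params = {
        "cmd": "HistorySearch",
        "WebEnvRq": "1",
        "db": db,
        "query_key": query_key,
        "WebEnv": webenv,  # urlencode percent-encodes the "@" as %40
    }
    return ENTREZ_URL + "?" + urlencode(params)


print(entrez_history_url(
    "pccompound",
    "1",
    "0CPrI_peUmUtWDooyjxpJ1XAXPcOl-ESZZxj8sJV9ZDR8musMjh1oBTib@1EDD43FA66AE1BE0_0001SID",
))
```

&lt;p&gt;With the values from the reply above, this prints the same URL as the one shown.&lt;/p&gt;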

&lt;h4&gt;Where to Next?&lt;/h4&gt;

&lt;p&gt;If we wanted to get a gzipped SD File instead, we'd need to edit our original query. But manually editing XML is a lot like mowing a lawn with scissors. What we'd really like is a simple API in a language like Ruby that will let us build sophisticated PUG queries, process the results, and pipe them into other queries with little effort. But that's a story for another time.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Image Credit: &lt;a href="http://flickr.com/photos/40121670@N00/"&gt;sutterbabe68&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;</description>
      <pubDate>Mon, 11 Jun 2007 09:04:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:c9cf69b1-a86f-4a3b-ba3c-a8dcded2fa9f</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/06/11/hacking-pubchem-learning-to-speak-pug</link>
      <category>Databases</category>
      <category>pubchem</category>
      <category>pug</category>
      <category>xml</category>
      <category>api</category>
      <category>powerusergateway</category>
      <category>ruby</category>
    </item>
    <item>
      <title>"Hacking PubChem: Learning to Speak PUG" by Egon Willighagen</title>
      <description>&lt;p&gt;Rich, please let me know when you are going to hack up libraries. At this moment I retrieve info from PubChem within a Bioclipse plugin [1], and for finding molecules in blog items in Chemical blogspace [2].&lt;/p&gt;

&lt;p&gt;1. &lt;a href="http://bioclipse.svn.sf.net/svnroot/bioclipse/trunk/plugins/net.bioclipse.pubchem/" rel="nofollow"&gt;http://bioclipse.svn.sf.net/svnroot/bioclipse/trunk/plugins/net.bioclipse.pubchem/&lt;/a&gt;
2. &lt;a href="http://cb.openmolecules.net/" rel="nofollow"&gt;http://cb.openmolecules.net/&lt;/a&gt;&lt;/p&gt;</description>
      <pubDate>Tue, 19 Jun 2007 03:24:34 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:8e3df0da-e070-424b-8825-f0ca63826f37</guid>
      <link>http://depth-first.com/articles/2007/06/11/hacking-pubchem-learning-to-speak-pug#comment-70</link>
    </item>
  </channel>
</rss>
