<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/stylesheets/rss.css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>Depth-First: Tag cas</title>
    <link>http://depth-first.com/articles/tag/cas</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description>Walking the Web of Chemical Informatics</description>
    <item>
      <title>Validating CAS Numbers</title>
      <description>&lt;p&gt;The Chemical Abstracts Service (CAS) &lt;a href="http://en.wikipedia.org/wiki/CAS_registry_number"&gt;registry number system&lt;/a&gt; was designed to be fault-tolerant. Built into every CAS number is a &lt;a href="http://www.cas.org/expertise/cascontent/registry/checkdig.html"&gt;check-digit&lt;/a&gt; that makes it possible to detect mis-typed numbers. Validation is a mathematical and repetitive process well-suited for software.&lt;/p&gt;

&lt;p&gt;The Ruby program below validates arbitrary CAS numbers:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="keyword"&gt;module &lt;/span&gt;&lt;span class="module"&gt;CAS&lt;/span&gt;
  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;validate&lt;/span&gt; &lt;span class="ident"&gt;cas_number&lt;/span&gt;
    &lt;span class="keyword"&gt;return&lt;/span&gt; &lt;span class="constant"&gt;false&lt;/span&gt; &lt;span class="keyword"&gt;unless&lt;/span&gt; &lt;span class="ident"&gt;cas_number&lt;/span&gt; &lt;span class="punct"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="ident"&gt;cas_number&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;match&lt;/span&gt;&lt;span class="punct"&gt;(/&lt;/span&gt;&lt;span class="regex"&gt;&lt;span class="escape"&gt;\A&lt;/span&gt;[0-9]{2,7}-[0-9]{2}-[0-9]&lt;span class="escape"&gt;\z&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;/)&lt;/span&gt;

    &lt;span class="ident"&gt;check_digit&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;cas_number&lt;/span&gt;&lt;span class="punct"&gt;[-&lt;/span&gt;&lt;span class="number"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt;&lt;span class="number"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;].&lt;/span&gt;&lt;span class="ident"&gt;to_i&lt;/span&gt;
    &lt;span class="ident"&gt;sum&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="number"&gt;0&lt;/span&gt;

    &lt;span class="ident"&gt;cas_number&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;reverse&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;scan&lt;/span&gt;&lt;span class="punct"&gt;(/&lt;/span&gt;&lt;span class="regex"&gt;[0-9]&lt;/span&gt;&lt;span class="punct"&gt;/).&lt;/span&gt;&lt;span class="ident"&gt;each_with_index&lt;/span&gt; &lt;span class="keyword"&gt;do&lt;/span&gt; &lt;span class="punct"&gt;|&lt;/span&gt;&lt;span class="ident"&gt;digit&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;i&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt;
      &lt;span class="ident"&gt;sum&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;sum&lt;/span&gt; &lt;span class="punct"&gt;+&lt;/span&gt; &lt;span class="ident"&gt;digit&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;to_i&lt;/span&gt; &lt;span class="punct"&gt;*&lt;/span&gt; &lt;span class="ident"&gt;i&lt;/span&gt;
    &lt;span class="keyword"&gt;end&lt;/span&gt;

    &lt;span class="ident"&gt;check_digit&lt;/span&gt; &lt;span class="punct"&gt;==&lt;/span&gt; &lt;span class="ident"&gt;sum&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;remainder&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="number"&gt;10&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;

&lt;span class="ident"&gt;include&lt;/span&gt; &lt;span class="constant"&gt;CAS&lt;/span&gt;

&lt;span class="keyword"&gt;while&lt;/span&gt; &lt;span class="constant"&gt;true&lt;/span&gt; &lt;span class="keyword"&gt;do&lt;/span&gt;
  &lt;span class="ident"&gt;print&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;CAS Number: &lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;

  &lt;span class="ident"&gt;cas_number&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;gets&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;strip&lt;/span&gt;

  &lt;span class="keyword"&gt;break&lt;/span&gt; &lt;span class="keyword"&gt;if&lt;/span&gt; &lt;span class="ident"&gt;cas_number&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;empty?&lt;/span&gt;

  &lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="constant"&gt;CAS&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;validate&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;cas_number&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt; &lt;span class="punct"&gt;?&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;valid&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="punct"&gt;:&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;invalid&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The program can be tested from the command line:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ ruby cas.rb
CAS Number: 107-07-3
valid
CAS Number: 107-87-3
invalid
CAS Number:
&lt;/pre&gt;
&lt;/div&gt;
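
&lt;p&gt;To see why the first number passes and the second fails, it can help to work the arithmetic by hand. The sketch below is not part of the program above; the helper name &lt;code&gt;expected_check_digit&lt;/code&gt; is made up for illustration. It assumes the usual weighting, in which the digit just left of the check digit has weight 1, the next has weight 2, and so on, with the sum taken modulo 10:&lt;/p&gt;

```ruby
# Hypothetical helper (not from the original program): computes the
# check digit expected for the digits that precede the final hyphen.
def expected_check_digit(cas_number)
  body = cas_number.split("-")[0, 2].join   # "107-07-3" -> "10707"
  weighted = body.reverse.chars.each_with_index.map do |digit, i|
    digit.to_i * (i + 1)                    # rightmost digit has weight 1
  end
  weighted.sum % 10
end

puts expected_check_digit("107-07-3")  # 7*1 + 0*2 + 7*3 + 0*4 + 1*5 = 33
puts expected_check_digit("107-87-3")  # 7*1 + 8*2 + 7*3 + 0*4 + 1*5 = 49
```

&lt;p&gt;The first call prints 3, which matches the final digit of 107-07-3. The second prints 9, which does not match the final 3 of 107-87-3, so that number is rejected.&lt;/p&gt;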

&lt;p&gt;Note that a validated CAS number can still be absent from the CAS database; validation only says that a CAS number &lt;em&gt;could&lt;/em&gt; be valid based on its format and check digit.&lt;/p&gt;</description>
      <pubDate>Wed, 23 Jul 2008 12:30:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:37a66468-b6ed-4237-8fdc-81ac981466c8</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2008/07/23/validating-cas-numbers</link>
      <category>Tools</category>
      <category>cas</category>
      <category>casnumber</category>
      <category>validate</category>
      <category>ruby</category>
    </item>
    <item>
      <title>Simple CAS Number Lookup (and More) with Chempedia</title>
      <description>&lt;p&gt;&lt;a href="http://chempedia.com"&gt;&lt;img src="http://depth-first.com/demo/20080513/chempedia.png" align="right"&gt;&lt;/img&gt;&lt;/a&gt;Despite many ingenious and energetic attempts, CAS Registry Numbers&amp;reg; remain chemistry's only universal method for referencing chemical structures and substances. They're so woven into the fabric of chemistry and trade that the &lt;a href="http://www.uspto.gov/main/profiles/otherid.htm"&gt;US Patent and Trademark Office discusses them&lt;/a&gt; in the same context as Domain Names, Drivers License Numbers, ZIP Codes, and UPC Barcodes. But for all of the system's advantages, it suffers from a significant limitation: without access to the CAS Registry database, it can be difficult if not impossible to link a chemical structure with a CAS Number. This article discusses a unique approach to this problem.&lt;/p&gt;

&lt;h4&gt;Finding CAS Numbers on Chempedia&lt;/h4&gt;

&lt;p&gt;Let's say you've found the CAS Number &lt;strong&gt;[58-08-2]&lt;/strong&gt; and need to look up the chemical structure it refers to. How would you do it?&lt;/p&gt;

&lt;p&gt;We can use &lt;a href="http://chempedia.com"&gt;Chempedia&lt;/a&gt; to find the answer. Entering "58-08-2" into the &lt;a href="http://chempedia.com/queries/new"&gt;text query&lt;/a&gt; box takes us to the corresponding &lt;a href="http://chempedia.com/registry_numbers/58-08-2"&gt;Registry Number Summary&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Under the heading "Compound Monographs", this page tells us that one &lt;a href="http://chempedia.com/monographs/caffeine"&gt;Compound Monograph&lt;/a&gt; referencing the CAS Number &lt;strong&gt;[58-08-2]&lt;/strong&gt; exists. We can easily see that both the name and chemical structure are consistent with &lt;a href="http://chempedia.com/monographs/caffeine"&gt;Caffeine&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;a href="http://chempedia.com/registry_numbers/58-08-2"&gt;&lt;img src="http://depth-first.com/demo/20080526/caffeine.png"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;However, the section under the heading "Compounds" gives us something unique. Rather than simply telling us that the structure of Caffeine is linked to CAS Number &lt;strong&gt;[58-08-2]&lt;/strong&gt;, Chempedia tells us how it arrived at this conclusion. As you can see, there are over a dozen references matching CAS Number &lt;strong&gt;[58-08-2]&lt;/strong&gt; with a single chemical structure.&lt;/p&gt;

&lt;p&gt;More than that, Chempedia gives us links to the organizations making these assertions and the actual Web pages recording them.&lt;/p&gt;

&lt;p&gt;Rather than just giving the answer, Chempedia says how it found the answer.&lt;/p&gt;

&lt;h4&gt;Authority, Confidence, and the Electronic Paper Trail&lt;/h4&gt;

&lt;p&gt;By definition, the only authority on CAS Registry Numbers is &lt;a href="http://www.cas.org/"&gt;Chemical Abstracts Service&lt;/a&gt; itself. But for many, many organizations, full-time access to the CAS Registry database is hopelessly out of reach, and access of the form required to incorporate the CAS Registry system into third-party products is a non-starter.&lt;/p&gt;

&lt;p&gt;In other words, there is a widespread need to work with CAS Numbers independently of the CAS Registry System, but any such attempt is inherently non-authoritative. How can we work within this constraint?&lt;/p&gt;

&lt;p&gt;What's lacking is the concept of confidence.&lt;/p&gt;

&lt;p&gt;To illustrate, let's try to find the structure associated with the CAS Number &lt;strong&gt;[480-41-1]&lt;/strong&gt;. In contrast to our earlier search, this one takes us to a &lt;a href="http://chempedia.com/registry_numbers/480-41-1"&gt;summary page&lt;/a&gt; with three different structures! (see below)&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;a href="http://chempedia.com/registry_numbers/480-41-1"&gt;&lt;img src="http://depth-first.com/demo/20080526/naringenin.png"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;All three structures share the same connectivity but differ in stereochemistry. The first structure (presumably) represents a racemate, the second represents the (&lt;em&gt;S&lt;/em&gt;) enantiomer, and the third represents the (&lt;em&gt;R&lt;/em&gt;) enantiomer.&lt;/p&gt;

&lt;p&gt;At this point, we need to decide whether we've really found the structure for the CAS Number &lt;strong&gt;[480-41-1]&lt;/strong&gt;. We can use Chempedia's electronic paper trail to guide our thinking. Both the racemate and the (&lt;em&gt;S&lt;/em&gt;) enantiomer have five references linking structure and CAS number, whereas the (&lt;em&gt;R&lt;/em&gt;) enantiomer lists only one.&lt;/p&gt;

&lt;p&gt;We can also see that the racemate and the pure (&lt;em&gt;S&lt;/em&gt;) enantiomer are each associated with yet another CAS Registry Number, &lt;a href="http://chempedia.com/registry_numbers/67604-48-2"&gt;[67604-48-2]&lt;/a&gt;. Examination of this record shows that two structures are cited, the same two structures we were originally considering.&lt;/p&gt;

&lt;p&gt;Clearly, there's some confusion regarding the exact identity of the structure represented by CAS Number &lt;strong&gt;[480-41-1]&lt;/strong&gt;. Nevertheless, we can guess that the Registry Numbers &lt;a href="http://chempedia.com/registry_numbers/67604-48-2"&gt;[67604-48-2]&lt;/a&gt; and &lt;a href="http://chempedia.com/registry_numbers/480-41-1"&gt;[480-41-1]&lt;/a&gt; refer to the racemate and (&lt;em&gt;S&lt;/em&gt;) enantiomer of the flavonoid &lt;a href="http://chempedia.com/monographs/naringenin"&gt;Naringenin&lt;/a&gt;, although we don't know which is which.&lt;/p&gt;

&lt;p&gt;For some applications this answer would be sufficient. For others, however, it wouldn't. The key point is that Chempedia has enabled us to arrive at this conclusion by exposing the electronic paper trail of third-party CAS Registry Number assignments.&lt;/p&gt;

&lt;p&gt;Chempedia offers a way to debug CAS Registry Numbers.&lt;/p&gt;

&lt;p&gt;Chempedia currently contains just over 380,000 unique CAS Numbers. To browse through the entire set, ten at a time, you can &lt;a href="http://chempedia.com/registry_numbers"&gt;begin with this page&lt;/a&gt;. Notice how &lt;a href="http://depth-first.com/articles/2007/05/30/restful-cheminformatics"&gt;RESTful URLs&lt;/a&gt; are used throughout.&lt;/p&gt;

&lt;h4&gt;Web 2.0 and All That&lt;/h4&gt;

&lt;p&gt;Those who have spent time using or developing "Web 2.0" applications may recognize a potentially powerful analogy between CAS Registry Numbers and the concept of &lt;a href="http://en.wikipedia.org/wiki/Tag_%28metadata%29"&gt;tagging&lt;/a&gt;. A tag is an alphanumeric string associated with some resource of interest, for example, &lt;a href="http://flickr.com/photos/tags/chemistry/clusters/"&gt;a photo&lt;/a&gt;, &lt;a href="http://www.connotea.org/tag/chemistry"&gt;a scientific paper&lt;/a&gt;, &lt;a href="http://del.icio.us/tag/chemistry"&gt;a URL&lt;/a&gt;, or &lt;a href="http://depth-first.com/articles/tag/chemistry"&gt;a blog post&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Although originally designed to uniquely identify a chemical substance or structure, when used in the wild, CAS Registry Numbers sometimes more closely resemble the fuzzy semantics of tagging.&lt;/p&gt;

&lt;p&gt;Chemical information systems that use CAS Numbers processed by third parties need to take this reality into account or run the risk of misleading users. Chempedia offers one method for doing so.&lt;/p&gt;

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;Chempedia currently contains just over 380,000 CAS registry numbers. Although this is a minuscule fraction of the total CAS Registry, Chempedia's collection comprises some of the most widely used and most important substances known. More importantly, Chempedia now offers a tool for understanding the often complex associations between chemical structure and CAS Registry numbers that exist in real-world chemical information sources.&lt;/p&gt;

&lt;p&gt;In this sense, Chempedia could be a useful tool for small organizations to double check their CAS Number assignments and for individuals to quickly look up the chemical structure of a given CAS number and understand ambiguities.&lt;/p&gt;

&lt;p&gt;Chempedia also lays bare both the confusion and consensus around CAS Registry Numbers used in the real world. If CAS Numbers in the wild are more like tags than unique identifiers, what can we do with this insight? Future articles will describe some possibilities.&lt;/p&gt;</description>
      <pubDate>Mon, 26 May 2008 13:06:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:f0fedea9-3156-42d9-a0ce-5e0fdb3d3f8b</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2008/05/26/simple-cas-number-lookup-and-more-with-chempedia</link>
      <category>Tools</category>
      <category>cas</category>
      <category>registrynumber</category>
      <category>tagging</category>
      <category>tags</category>
      <category>chempedia</category>
      <category>manytomany</category>
    </item>
    <item>
      <title>Wikipedia for Cheminformatics: A Simple Web API for Finding CAS Numbers in Compound Monographs</title>
      <description>&lt;p&gt;&lt;a href="http://wikipedia.org"&gt;&lt;img src="http://depth-first.com/demo/20070123/wikipedia.jpg" align="right"&gt;&lt;/img&gt;&lt;/a&gt;Good news for cheminformatics: Chemical Abstracts Service (CAS) &lt;a href="http://en.wikipedia.org/wiki/Wikipedia_talk:WikiProject_Chemistry/CAS_validation"&gt;has agreed&lt;/a&gt; to help Wikipedia users curate its collection of CAS numbers. As a result of the diligence of some hard-working volunteers, chemistry's most universal system for referring to chemicals can now be used far more effectively by the worlds biggest open repository of knowledge.&lt;/p&gt;

&lt;p&gt;Wouldn't it be great to be able to pull these CAS numbers from Wikipedia programmatically?&lt;/p&gt;

&lt;h4&gt;Perspective&lt;/h4&gt;

&lt;p&gt;Estimates place the number of Wikipedia pages dealing with individual &lt;a href="http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Chemicals/Inorganics"&gt;inorganic&lt;/a&gt; and &lt;a href="http://en.wikipedia.org/wiki/List_of_organic_compounds"&gt;organic&lt;/a&gt; substances in the thousands. (I'll use the term "compound monographs" to describe them.) One factor acting to keep this number low is poor visibility of these entries. Unlike most &lt;a href="http://depth-first.com/articles/2007/01/24/thirty-two-free-chemistry-databases"&gt;chemical databases&lt;/a&gt;, Wikipedia can't, by itself, be easily searched by structure. As chemically-aware tools for indexing Wikipedia begin to emerge, look for six things to happen:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The number of Wikipedia compound monographs will increase significantly.&lt;/li&gt;
&lt;li&gt;The quality of monographs for intermediate- to well-known compounds will increase substantially.&lt;/li&gt;
&lt;li&gt;Demand for user-friendly interfaces to Wikipedia's chemical content will increase.&lt;/li&gt;
&lt;li&gt;Wikipedia users will become interested in storing and finding ever more diverse kinds of information about each compound.&lt;/li&gt;
&lt;li&gt;Bench chemists will start to include Wikipedia as one of their preferred literature search techniques, leading to...&lt;/li&gt;
&lt;li&gt;More creative tools for using the chemical content of Wikipedia.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;As noted previously, it wasn't too long ago that indexing of the chemical literature &lt;a href="http://depth-first.com/articles/2006/08/19/history-of-abstracting-at-chemical-abstracts-service"&gt;was done solely by volunteers&lt;/a&gt;. Wikipedia offers an intriguing way to channel the innate drive for chemists to combine their own work and experience with that of others to build useful information tools for the community.&lt;/p&gt;

&lt;p&gt;But for now we are left with the question of how to index the chemical content of Wikipedia. Although a few systems have been proposed, the only practical method is through the use of CAS numbers. Which brings us to the subject of today's tutorial.&lt;/p&gt;

&lt;h4&gt;A Quick CAS Number API for Wikipedia&lt;/h4&gt;

&lt;p&gt;The Ruby program below will accept the title of any Wikipedia compound monograph and return the CAS number for the compound being discussed, or an error message if none was found:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rubygems&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;hpricot&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;open-uri&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;cgi&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="keyword"&gt;class &lt;/span&gt;&lt;span class="class"&gt;Wikikemi&lt;/span&gt;
  &lt;span class="attribute"&gt;@cas&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;nil&lt;/span&gt;

  &lt;span class="ident"&gt;attr_reader&lt;/span&gt; &lt;span class="symbol"&gt;:cas&lt;/span&gt;

  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;initialize&lt;/span&gt; &lt;span class="ident"&gt;title&lt;/span&gt;
    &lt;span class="ident"&gt;uri&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;URI&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;escape&lt;/span&gt;&lt;span class="punct"&gt;(&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;http://en.wikipedia.org/wiki/&lt;span class="expr"&gt;#{title}&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;)&lt;/span&gt;
    &lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;loading... &lt;span class="expr"&gt;#{uri}&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;
    &lt;span class="ident"&gt;doc&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;Hpricot&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;open&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;uri&lt;/span&gt;&lt;span class="punct"&gt;))&lt;/span&gt;
    &lt;span class="ident"&gt;table&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;doc&lt;/span&gt;&lt;span class="punct"&gt;/&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;table&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;)[&lt;/span&gt;&lt;span class="number"&gt;0&lt;/span&gt;&lt;span class="punct"&gt;]&lt;/span&gt;

    &lt;span class="ident"&gt;table&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;inner_html&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;match&lt;/span&gt;&lt;span class="punct"&gt;(/&lt;/span&gt;&lt;span class="regex"&gt;([0-9]{2,7}?&lt;span class="escape"&gt;\-&lt;/span&gt;[0-9]{2}&lt;span class="escape"&gt;\-&lt;/span&gt;[0-9])&lt;/span&gt;&lt;span class="punct"&gt;/)&lt;/span&gt; &lt;span class="keyword"&gt;if&lt;/span&gt; &lt;span class="ident"&gt;table&lt;/span&gt;

    &lt;span class="attribute"&gt;@cas&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="global"&gt;$1&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;

&lt;span class="comment"&gt;# Returns the CAS number present in the Wikipedia monograph with&lt;/span&gt;
&lt;span class="comment"&gt;# the indicated title, or an error message if none is found. Try, for example,&lt;/span&gt;
&lt;span class="comment"&gt;# &amp;quot;benzene&amp;quot;.&lt;/span&gt;
&lt;span class="keyword"&gt;while&lt;/span&gt; &lt;span class="constant"&gt;true&lt;/span&gt;
  &lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;Enter the title of the Wikipedia page, for example: 'benzene'&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;
  &lt;span class="ident"&gt;monograph_title&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;gets&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;chomp&lt;/span&gt;
  &lt;span class="ident"&gt;w&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;Wikikemi&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt; &lt;span class="ident"&gt;monograph_title&lt;/span&gt;
  &lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="ident"&gt;w&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;cas&lt;/span&gt; &lt;span class="punct"&gt;?&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;[&lt;span class="expr"&gt;#{w.cas}&lt;/span&gt;]&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="punct"&gt;:&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;CAS number not found&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This program makes use of the excellent Ruby HTML parser, &lt;a href="http://code.whytheluckystiff.net/hpricot/"&gt;Hpricot&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Saving the above code to a file called &lt;strong&gt;wikikemi.rb&lt;/strong&gt;, we can run it with:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ ruby wikikemi.rb
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;For example, we can look up the CAS numbers for Ferrocene, Lipitor, or 1,2,3,4,4a,5,6,7,8,8a-Decahydronaphthalene:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ ruby wikikemi.rb
Enter the title of the Wikipedia page, for example: 'benzene'
ferrocene
loading... http://en.wikipedia.org/wiki/ferrocene
[102-54-5]
Enter the title of the Wikipedia page, for example: 'benzene'
lipitor
loading... http://en.wikipedia.org/wiki/lipitor
[134523-00-5]
Enter the title of the Wikipedia page, for example: 'benzene'
1,2,3,4,4a,5,6,7,8,8a-Decahydronaphthalene
loading... http://en.wikipedia.org/wiki/1,2,3,4,4a,5,6,7,8,8a-Decahydronaphthalene
[91-17-8]
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;All this method requires is that the Wikipedia page lists the correct CAS number in its &lt;a href="http://en.wikipedia.org/wiki/Template:Drugbox"&gt;Drugbox&lt;/a&gt; or &lt;a href="http://en.wikipedia.org/wiki/Template:Chembox_new"&gt;Chembox&lt;/a&gt; template. Fortunately, CAS has agreed to help make this happen.&lt;/p&gt;
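
&lt;p&gt;One caveat: a pattern match alone will accept any string that merely looks like a CAS number. As a rough sketch of one way to guard against that, the check-digit rule can be applied to every pattern match before trusting it. The names &lt;code&gt;CAS_PATTERN&lt;/code&gt;, &lt;code&gt;check_digit_ok?&lt;/code&gt;, and &lt;code&gt;cas_candidates&lt;/code&gt; are hypothetical and not part of the program above, and the sample string is invented rather than taken from a real Wikipedia page:&lt;/p&gt;

```ruby
# Hypothetical names throughout; none of this is part of Wikikemi.
CAS_PATTERN = /\b[0-9]{2,7}-[0-9]{2}-[0-9]\b/

# True when the final digit matches the weighted sum of the others.
def check_digit_ok?(cas)
  digits = cas.delete("-").chars.map { |c| c.to_i }
  check = digits.pop
  weighted = digits.reverse.each_with_index.map { |d, i| d * (i + 1) }
  weighted.sum % 10 == check
end

# Every distinct CAS-shaped string in the text that passes the check.
def cas_candidates(text)
  text.scan(CAS_PATTERN).select { |cas| check_digit_ok?(cas) }.uniq
end

# Invented sample text, used only to exercise the functions.
sample = "Ferrocene, CAS number 102-54-5 (not 102-54-6), is an orange solid."
p cas_candidates(sample)
```

&lt;p&gt;Here only 102-54-5 survives, because 102-54-6 has the right shape but the wrong check digit. A passing check digit still does not prove that the number belongs to the compound on the page.&lt;/p&gt;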

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;A little Ruby code is all it takes to build a working CAS number lookup system using Wikipedia. Although this may be useful as a standalone tool, it becomes much more powerful when made part of &lt;a href="http://depth-first.com/articles/2007/05/21/simple-cas-number-lookup-with-pubchem"&gt;a larger cheminformatics system&lt;/a&gt;. But that's a story for another time.&lt;/p&gt;

&lt;p&gt;See also &lt;a href="http://www.chemspider.com/blog/a-message-of-support-and-public-service-from-the-chemical-abstracts-service.html"&gt;Antony Williams' announcement on CAS and Wikipedia&lt;/a&gt;.&lt;/p&gt;</description>
      <pubDate>Wed, 02 Apr 2008 17:29:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:c11402b2-406a-4ec9-8b65-fc34da179c1a</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2008/04/02/wikipedia-for-cheminformatics-a-simple-web-api-for-finding-cas-numbers-in-compound-monographs</link>
      <category>Tools</category>
      <category>cas</category>
      <category>acs</category>
      <category>casnumber</category>
      <category>lookup</category>
      <category>wikipedia</category>
      <category>ruby</category>
    </item>
    <item>
      <title>ACS Loses $27 Million Case Against Leadscope</title>
      <description>&lt;p&gt;&lt;a href="http://www.dispatch.com"&gt;The Columbus Dispatch&lt;/a&gt; reported yesterday that an &lt;a href="http://www.dispatch.com/live/content/local_news/stories/2008/03/27/leadscope.html?sid=101"&gt;Ohio jury had decided against the ACS&lt;/a&gt; for $27 million in a dispute with the founders of &lt;a href="http://www.leadscope.com/"&gt;Leadscope, Inc.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The case was a countersuit by Leadscope, who alleged that Chemical Abstracts Service (CAS) had damaged their company in pursuing an earlier lawsuit. That original suit claimed that the founders of Leadscope, who were former CAS employees, had used intellectual property belonging to CAS in the development of their products.&lt;/p&gt;

&lt;p&gt;Some background on the original case is available in &lt;a href="http://www.sconet.state.oh.us/rod/docs/pdf/10/2005/2005-ohio-2557.pdf"&gt;&lt;em&gt;Am. Chem. Soc. v. Leadscope, Inc.&lt;/em&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Disclaimer: &lt;a href="http://depth-first.com/articles/2006/12/29/dispelling-open-source-confusion-an-introduction-to-licenses"&gt;I am not a lawyer&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;</description>
      <pubDate>Fri, 28 Mar 2008 09:40:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:d77ea087-9962-4d49-b981-4a18b59ba9a9</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2008/03/28/acs-loses-27-million-case-against-leadscope</link>
      <category>Meta</category>
      <category>cas</category>
      <category>leadscope</category>
      <category>lawsuit</category>
    </item>
    <item>
      <title>Hacking PubChem: Visually Inspect Results for CAS Number and Keyword Searches</title>
      <description>&lt;p&gt;&lt;a href="http://pubchem.ncbi.nlm.nih.gov/"&gt;&lt;img src="http://depth-first.com/files/pubchemlogo.gif" align="right"&gt;&lt;/img&gt;&lt;/a&gt;A recent article described how PubChem could be used to &lt;a href="http://depth-first.com/articles/2007/09/13/hacking-pubchem-convert-cas-numbers-into-pubchem-cids-with-ruby"&gt;quickly search for CAS numbers&lt;/a&gt;. Although useful, the approach is limited in that only an array of PubChem CIDs was returned. What would be really useful would be a simple way to create a report with entries hyperlinked into the PubChem site itself to aid in visual inspection. In this tutorial, we'll see how an HTML template and a few extra lines of code can do just that.&lt;/p&gt;

&lt;h4&gt;The Template&lt;/h4&gt;

&lt;p&gt;Ruby supports a number of HTML templating mechanisms. In this example, we'll use an ERB template resurrected from the &lt;a href="http://depth-first.com/articles/2006/12/11/hacking-molbank-creating-a-graphical-table-of-contents"&gt;Molbank graphical table of contents&lt;/a&gt; tutorial:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_xml "&gt;&lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;html&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;head&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;title&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;PubChem Search for #{term}&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; %&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;title&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;head&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;body&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;h1&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;Search: #{term}&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; %&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;h1&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;table&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;tr&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="attribute"&gt;col&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="number"&gt;0&lt;/span&gt; %&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="attribute"&gt;cids.each&lt;/span&gt; &lt;span class="attribute"&gt;do&lt;/span&gt; |&lt;span class="attribute"&gt;cid|&lt;/span&gt; %&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;td&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="attribute"&gt;image&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;http://pubchem.ncbi.nlm.nih.gov/image/imgsrv.fcgi?cid=#{cid}&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; %&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="attribute"&gt;summary&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=#{cid}&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; %&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;a&lt;/span&gt; &lt;span class="attribute"&gt;href&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;&amp;lt;%= summary %&amp;gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;img&lt;/span&gt; &lt;span class="attribute"&gt;src&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;&amp;lt;%= image %&amp;gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="attribute"&gt;border&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;2&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;img&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;a&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;center&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;span&lt;/span&gt; &lt;span class="attribute"&gt;style&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;font-size: 8px&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&lt;/span&gt;
              &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;a&lt;/span&gt; &lt;span class="attribute"&gt;href&lt;/span&gt;&lt;span class="punct"&gt;=&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;&amp;lt;%= summary %&amp;gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&amp;gt;&amp;lt;%=&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;CID-#{cid}&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; %&lt;span class="punct"&gt;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;a&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;span&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;center&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;td&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="attribute"&gt;col&lt;/span&gt; +&lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="number"&gt;1&lt;/span&gt; %&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="attribute"&gt;if&lt;/span&gt; &lt;span class="attribute"&gt;col&lt;/span&gt; &lt;span class="punct"&gt;&amp;gt;&lt;/span&gt; 5 %&amp;gt;
          &lt;span class="punct"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="attribute"&gt;col&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="number"&gt;0&lt;/span&gt; %&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;tr&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt;&lt;span class="tag"&gt;tr&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="punct"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="attribute"&gt;end&lt;/span&gt; %&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;%&lt;/span&gt;&lt;span class="attribute"&gt;end&lt;/span&gt; %&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;tr&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;table&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;body&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="punct"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="tag"&gt;html&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The above template uses a search term and an array of CIDs to build a table of results. Each cell in the table contains a color 2D image and the CID, both hyperlinked into PubChem itself.&lt;/p&gt;
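
&lt;p&gt;The row-wrapping logic in the template (start a new row once &lt;tt&gt;col&lt;/tt&gt; exceeds 5) can also be expressed without a counter. The following is a minimal sketch, not part of the original library: the helper names &lt;tt&gt;rows_of_six&lt;/tt&gt; and &lt;tt&gt;image_url&lt;/tt&gt; are hypothetical, and it assumes a Ruby version in which &lt;tt&gt;each_slice&lt;/tt&gt; is available without extra requires.&lt;/p&gt;

```ruby
# Hypothetical helper (not in the original library): group CIDs into
# rows of six, the same row width the template's col counter produces.
def rows_of_six(cids)
  cids.each_slice(6).to_a
end

# Hypothetical helper: the image URL the template builds for one CID.
def image_url(cid)
  "http://pubchem.ncbi.nlm.nih.gov/image/imgsrv.fcgi?cid=#{cid}"
end

# Print one line per table row for fourteen placeholder CIDs.
rows_of_six((1..14).to_a).each do |row|
  puts row.join(' ')
end
```

&lt;p&gt;Grouping the CIDs up front would also avoid the empty trailing row that the counter-based template emits when the number of CIDs is an exact multiple of six.&lt;/p&gt;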

&lt;p&gt;Saving this template to a file called &lt;strong&gt;template.rhtml&lt;/strong&gt; is all we need to do.&lt;/p&gt;

&lt;h4&gt;The Library&lt;/h4&gt;

&lt;p&gt;The library is a modification of the one shown in &lt;a href="http://depth-first.com/articles/2007/09/13/hacking-pubchem-convert-cas-numbers-into-pubchem-cids-with-ruby"&gt;the previous article&lt;/a&gt; in this series:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rubygems&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;mechanize&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;erb&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="keyword"&gt;module &lt;/span&gt;&lt;span class="module"&gt;PubChemTerms&lt;/span&gt;
  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;report&lt;/span&gt; &lt;span class="ident"&gt;term&lt;/span&gt;
    &lt;span class="ident"&gt;cids&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;get_cids&lt;/span&gt; &lt;span class="ident"&gt;term&lt;/span&gt;
    &lt;span class="ident"&gt;erb&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;ERB&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="constant"&gt;IO&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;read&lt;/span&gt;&lt;span class="punct"&gt;(&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;template.rhtml&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;))&lt;/span&gt;

    &lt;span class="constant"&gt;File&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;open&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;output.html&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;,&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;w+&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt; &lt;span class="keyword"&gt;do&lt;/span&gt; &lt;span class="punct"&gt;|&lt;/span&gt;&lt;span class="ident"&gt;file&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt;
      &lt;span class="ident"&gt;file&lt;/span&gt; &lt;span class="punct"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="ident"&gt;erb&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;result&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;binding&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
    &lt;span class="keyword"&gt;end&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;

  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;get_cids&lt;/span&gt; &lt;span class="ident"&gt;term&lt;/span&gt;
    &lt;span class="ident"&gt;agent&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;WWW&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;Mechanize&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;
    &lt;span class="ident"&gt;page&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;agent&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;get&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;http://www.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pccompound&amp;amp;retmax=100&amp;amp;term=&lt;span class="expr"&gt;#{term}&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;

    &lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;page&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;parser&lt;/span&gt;&lt;span class="punct"&gt;/&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;id&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;).&lt;/span&gt;&lt;span class="ident"&gt;collect&lt;/span&gt; &lt;span class="punct"&gt;{|&lt;/span&gt;&lt;span class="ident"&gt;id&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt; &lt;span class="ident"&gt;id&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;innerHTML&lt;/span&gt;&lt;span class="punct"&gt;}&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The method &lt;tt&gt;report&lt;/tt&gt; accepts a search term and uses our template to render a report.&lt;/p&gt;
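
&lt;p&gt;Note that &lt;tt&gt;get_cids&lt;/tt&gt; interpolates the term directly into the URL, so a term containing spaces or other reserved characters may need escaping first. The following is a minimal sketch, assuming Ruby's standard &lt;tt&gt;uri&lt;/tt&gt; library; &lt;tt&gt;esearch_url&lt;/tt&gt; is a hypothetical helper, not part of the original library.&lt;/p&gt;

```ruby
require 'uri'

# Hypothetical helper: builds the same esearch URL that get_cids uses,
# but form-encodes the query values instead of interpolating them raw.
def esearch_url(term, retmax = 100)
  query = URI.encode_www_form(db: 'pccompound', retmax: retmax, term: term)
  "http://www.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?#{query}"
end

# A multi-word term is encoded; a CAS number passes through unchanged.
puts esearch_url('acetic acid')
puts esearch_url('119141-88-7')
```

&lt;p&gt;The resulting string could be passed to &lt;tt&gt;agent.get&lt;/tt&gt; in place of the hand-built URL.&lt;/p&gt;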

&lt;h4&gt;Testing&lt;/h4&gt;

&lt;p&gt;By saving the above library in a file called &lt;strong&gt;pubchem.rb&lt;/strong&gt;, we can search by keyword via interactive ruby (irb):&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ irb
irb(main):001:0&gt; require 'pubchem'
=&gt; true
irb(main):002:0&gt; include PubChemTerms
=&gt; Object
irb(main):003:0&gt; report 'esomeprazole'
=&gt; #&amp;lt;File:output.html (closed)&amp;gt;
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This produces a file called &lt;strong&gt;output.html&lt;/strong&gt; that can be viewed with any browser:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20070925/screenshot.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;As in the original version of the library, we can also query by CAS number:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ irb
irb(main):001:0&gt; require 'pubchem'
=&gt; true
irb(main):002:0&gt; include PubChemTerms
=&gt; Object
irb(main):003:0&gt; report '119141-88-7'
=&gt; #&amp;lt;File:output.html (closed)&amp;gt;
&lt;/pre&gt;
&lt;/div&gt;
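
&lt;p&gt;Because a mistyped CAS number may return no hits or unrelated ones, it can be worth verifying the check digit before querying. The sketch below illustrates the CAS check-digit rule: each digit before the check digit is weighted by its position counting from the right, and the weighted sum modulo 10 must equal the check digit. The method name &lt;tt&gt;valid_cas?&lt;/tt&gt; is hypothetical and independent of the library above.&lt;/p&gt;

```ruby
# Hypothetical helper: true when the string has the CAS number shape
# and its final digit matches the weighted checksum of the others.
def valid_cas?(cas_number)
  return false unless cas_number.to_s.match(/\A\d{2,7}-\d{2}-\d\z/)

  digits = cas_number.delete('-').chars.map { |c| c.to_i }
  check_digit = digits.pop

  # Weight the remaining digits 1, 2, 3, ... starting from the right.
  sum = digits.reverse.each_with_index.map { |d, i| d * (i + 1) }.sum
  sum % 10 == check_digit
end

puts valid_cas?('119141-88-7')
puts valid_cas?('119141-88-8')
```

&lt;p&gt;A call such as &lt;tt&gt;report term if valid_cas?(term)&lt;/tt&gt; would then skip malformed numbers before any request is made.&lt;/p&gt;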

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;The simple approach outlined here could be extended in many ways. For example, we could easily retrieve molfiles based on keyword or CAS number search. We could pipe queries together or work with query lists. We could &lt;a href="http://depth-first.com/articles/2007/09/17/hacking-chemspider-query-by-smiles-and-inchi-with-ruby"&gt;blend in ChemSpider data&lt;/a&gt;. We could even build a simple Web application (with &lt;a href="http://rubyonrails.org"&gt;Rails&lt;/a&gt;) that returned customized reports. Mixing in &lt;a href="http://depth-first.com/articles/tag/rcdk"&gt;Ruby CDK&lt;/a&gt; or &lt;a href="http://depth-first.com/articles/tag/rubyopenbabel"&gt;Ruby Open Babel&lt;/a&gt; offers still more possibilities.&lt;/p&gt;

&lt;p&gt;Increasingly, the most important question in cheminformatics is not "What can we build?", but rather "What should we build?" Success in this new world requires a much deeper understanding of how cheminformatics software is being used by real chemists and where it's not.&lt;/p&gt;</description>
      <pubDate>Tue, 25 Sep 2007 10:55:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:5b1ef92b-4ed3-443e-a683-dc37d23c4352</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/09/25/hacking-pubchem-visually-inspect-results-for-cas-number-and-keyword-searches</link>
      <category>Tools</category>
      <category>pubchem</category>
      <category>casnumber</category>
      <category>cas</category>
      <category>ruby</category>
      <category>keyword</category>
      <category>erb</category>
      <category>html</category>
      <category>entrez</category>
    </item>
    <item>
      <title>PubChem is a Platform</title>
      <description>&lt;p&gt;&lt;img src="http://depth-first.com/demo/20070704/apple-II.jpg" align="right"&gt;&lt;/img&gt;Two recent &lt;a href="http://pubs.acs.org/journals/jcisd8/index.html"&gt;&lt;em&gt;J. Chem. Inf. Model.&lt;/em&gt;&lt;/a&gt; articles support the idea that &lt;a href="http://pubchem.ncbi.nlm.nih.gov/"&gt;PubChem&lt;/a&gt; is rapidly evolving into a Chemical Informatics platform:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://dx.doi.org/10.1021/ci700092v"&gt;Large-Scale Annotation of Small-Molecule Libraries Using Public Databases&lt;/a&gt;. Using PubChem and other databases, the authors categorize the level of annotation (data, metadata, and links) of free chemical databases, with PubChem as the centerpiece. The work is part of a larger effort designed to integrate this free resource into the Genomics Institute of the Novartis Research Foundation (GNF) workflow.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://dx.doi.org/10.1021/ci6004349"&gt;Web Service Infrastructure for Chemoinformatics&lt;/a&gt;. Among other interesting initiatives, the article describes a desktop application front-end for PubChem. (As a bonus, the authors also &lt;a href="http://depth-first.com/articles/search?q=making+the+case"&gt;make the case&lt;/a&gt;).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Platforms are essential because they focus the attention and effort of self-interested third parties around a common goal. They become so integrated into society that they eventually become invisible. There is outrage when they stop working. Think of highways, sewers, phone lines, communications satellites, the patent system, and the Internet, among others. We don't just use these services, we build on top of them.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.cas.org/"&gt;Chemical Abstracts Service&lt;/a&gt; is an important tool for many, but it is not a platform. By placing high costs on access to its service and severely restricting its use, the ACS has effectively shut out anyone wanting to build another service on top of CAS. Clearly this was part of the plan. Small and large third-party players alike are shut out, with the inevitable chilling effect on innovation.&lt;/p&gt;

&lt;p&gt;Contrast this situation with PubChem. The public is free to &lt;a href="http://depth-first.com/articles/2006/09/29/hacking-pubchem-direct-access-with-ftp"&gt;download&lt;/a&gt; and re-use the entire database of molecules and associated data. PubChem has recently unveiled a new Web API called &lt;a href="http://depth-first.com/articles/2007/06/04/hacking-pubchem-power-user-gateway"&gt;PUG&lt;/a&gt; that will make it even easier to layer on additional functionality. These kinds of capabilities create an entirely different dynamic: witness both &lt;a href="http://emolecules.com"&gt;eMolecules&lt;/a&gt; and &lt;a href="http://chemspider.com"&gt;ChemSpider&lt;/a&gt;, two services that unashamedly exploit the PubChem resource. Expect to see more of this in the months ahead.&lt;/p&gt;

&lt;p&gt;Remember the &lt;a href="http://en.wikipedia.org/wiki/Apple_II"&gt;Apple II&lt;/a&gt;? This product became so successful that it played a major role in undermining dozens of highly profitable and well-established businesses. Why was it so successful? One of the key reasons was its open architecture, compared to what had preceded it. Within a very short time, third parties had developed a large number of innovative products that exploited the underlying platform - both with and without Apple's encouragement. One of those products, &lt;a href="http://en.wikipedia.org/wiki/VisiCalc"&gt;VisiCalc&lt;/a&gt;, was so successful that at one point many buyers of Apple's machine did so for no other purpose than to run it.&lt;/p&gt;

&lt;p&gt;Whether PubChem itself ends up becoming the standard cheminformatics platform is hard to say. Perhaps this role will be filled by a system not yet built, or which evolves from PubChem. Whatever the outcome, PubChem has unmasked a deep need (and opportunity) for an open cheminformatics platform. As Apple's experience demonstrates, often you get more in the end by giving something up.&lt;/p&gt;</description>
      <pubDate>Wed, 04 Jul 2007 10:45:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:60236f1c-8b1b-4aaf-b0c4-5c0f17a563a8</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/07/04/pubchem-is-a-platform</link>
      <category>Meta</category>
      <category>pubchem</category>
      <category>cas</category>
      <category>platform</category>
      <category>open</category>
    </item>
    <item>
      <title>Simple CAS Number Lookup with PubChem</title>
      <description>&lt;p&gt;&lt;a href="http://pubchem.ncbi.nlm.nih.gov/"&gt;&lt;img src="http://depth-first.com/files/pubchemlogo.gif" align="right" border="none"&gt;&lt;/img&gt;&lt;/a&gt;&lt;a href="http://www.cas.org/expertise/cascontent/registry/regsys.html"&gt;CAS Registry Numbers&lt;/a&gt; simplify the thorny problem of referring to chemical substances. These short numerical sequences are arguably the most widely used form of molecular identifier, appearing on reagent bottles, in publications, in patents and patent applications, and on material safety data sheets (MSDSs).&lt;/p&gt;

&lt;p&gt;During my time as a synthetic organic chemist, I would sometimes run into the problem of finding the structure of a molecule represented by a CAS number. A common case was when an ambiguous, incomprehensible, or blurred IUPAC name was printed on a reagent bottle along with a CAS number. By looking up the CAS number, I could confirm the bottle's contents.&lt;/p&gt;

&lt;p&gt;Your first impulse when looking up a CAS number might be to fire up &lt;a href="http://www.cas.org/SCIFINDER/"&gt;SciFinder&lt;/a&gt;. For years this was the only option. Those days are quickly starting to seem as quaint as when people actually wrote on pieces of paper and dropped them in mailboxes (&lt;a href="http://netflix.com"&gt;dropping DVDs in a mailbox&lt;/a&gt; is a different matter).&lt;/p&gt;

&lt;p&gt;A little-publicized feature of PubChem makes it an ideal way to quickly find the structure associated with a CAS Number. To use it, you need nothing more than a computer, a browser, and an internet connection.&lt;/p&gt;

&lt;p&gt;Browse over to the &lt;a href="http://pubchem.ncbi.nlm.nih.gov/"&gt;PubChem&lt;/a&gt; welcome page. At the top you'll find a search box. Enter your CAS number and press "Go." For this example, I'm using the CAS number for 2,5-Pyrazinedicarboxylic acid dihydrate:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20070521/screenshot.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;If all goes well, you should see a results screen containing the structure of your compound and a link to its summary page:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20070521/screenshot2.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;Does this seem a little too good to be true? Try it for yourself. Pick up a copy of the Aldrich catalog, the Merck Index, or anything else that lists lots of CAS numbers. Choose several structures at random and see how PubChem performs.&lt;/p&gt;

&lt;p&gt;There are limitations to this method. PubChem generally doesn't index large molecules such as polymers and peptides, so they won't be found by this method. Similarly, if a CAS number doesn't point to a distinct molecular entity (e.g. "mineral oil"), PubChem won't find it either. But these are hardly limitations in the vast majority of cases.&lt;/p&gt;

&lt;p&gt;With the &lt;a href="http://www.corporate-ir.net/ireye/ir_site.zhtml?ticker=SIAL&amp;amp;script=410&amp;amp;layout=-6&amp;amp;item_id=984368"&gt;recent addition of Sigma-Aldrich&lt;/a&gt; as a PubChem compound supplier, it won't be long before smaller companies begin following suit. What we're seeing with PubChem is a classic example of a &lt;a href="http://en.wikipedia.org/wiki/Network_effect"&gt;network effect&lt;/a&gt;. The end result should come as a surprise to nobody.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Update: &lt;a href="http://chempedia.com"&gt;Chempedia&lt;/a&gt; offers a more detailed &lt;a href="http://depth-first.com/articles/2008/05/26/simple-cas-number-lookup-and-more-with-chempedia"&gt;CAS Number Lookup&lt;/a&gt; service.&lt;/em&gt;&lt;/p&gt;</description>
      <pubDate>Mon, 21 May 2007 11:46:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:e20e2fc2-e99e-4171-8055-1493bcb31d65</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/05/21/simple-cas-number-lookup-with-pubchem</link>
      <category>Databases</category>
      <category>cas</category>
      <category>pubchem</category>
      <category>casnumber</category>
      <category>lookup</category>
      <category>networkeffect</category>
    </item>
    <item>
      <title>SciLink: Science Meets Facebook</title>
      <description>&lt;p&gt;&lt;a href="http://www.scilink.com/"&gt;&lt;img src="http://depth-first.com/demo/20070320/scilink.gif" align="right" border="0"&gt;&lt;/img&gt;&lt;/a&gt;Whether you're in academia or industry, a big part of being a scientist is getting your work noticed by other scientists. In years past, scientists relied on subscription-only services such as &lt;a href="http://www.cas.org/"&gt;Chemical Abstracts Service&lt;/a&gt; and &lt;a href="http://scientific.thomson.com/products/sci/"&gt;Science Citation Index&lt;/a&gt;. But these services are starting to show their limitations, particularly with respect to being able to sell their products at an affordable price. Can you afford to trust this important part of your scientific career to companies with &lt;a href="http://depth-first.com/articles/2007/03/01/bryan-vickery-on-whats-broken-in-cheminformatics"&gt;broken business models&lt;/a&gt;?&lt;/p&gt;

&lt;p&gt;Enter &lt;a href="http://scilink.com"&gt;SciLink&lt;/a&gt;, a service that could end up doing for science what &lt;a href="http://facebook.com"&gt;Facebook&lt;/a&gt; has done for college campuses. Search for a scientist's name and get a (partial) list of their publications along with a list of other scientists they've worked with as co-authors. No small database, SciLink already contains information on 5.8 million scientists. Although you can create your own user profile after registering, you're probably already in SciLink courtesy of its creative use of &lt;a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed"&gt;PubMed&lt;/a&gt; bibliographical information.&lt;/p&gt;

&lt;p&gt;It's not clear what the future holds for SciLink. One thing is certain: free services like SciLink can be expected to proliferate over the next few years as cost squeezes and technological advances continue to take their toll on the established scientific information industry.&lt;/p&gt;</description>
      <pubDate>Tue, 20 Mar 2007 11:00:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:48bdfb65-5f78-4891-835d-fd708c72b16d</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/03/20/scilink-science-meets-facebook</link>
      <category>Meta</category>
      <category>scilink</category>
      <category>facebook</category>
      <category>cas</category>
      <category>sci</category>
      <category>pubmed</category>
    </item>
    <item>
      <title>From Famine to Feast: A Bumper Crop of Free Chemistry Databases</title>
      <description>&lt;p&gt;&lt;a href="http://pubchem.ncbi.nlm.nih.gov/"&gt;&lt;img src="http://depth-first.com/files/pubchemlogo.gif" align="right" border="0"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
    &lt;p&gt;"Until PubChem came on the scene, the state of chemoinformatics compared to bioinformatics was 20 years behind," says Christopher Lipinski, who formulated the eponymous rule-of-five criteria for drug bioavailability.&lt;/p&gt;

    &lt;p&gt;-&lt;cite&gt;Monya Baker, &lt;a href="http://dx.doi.org/10.1038/nrd2148"&gt;Nature Reviews Drug Discovery&lt;/a&gt;&lt;/cite&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The number of free chemistry databases on the Web just keeps growing. A recent Depth-First article discussed &lt;a href="http://depth-first.com/articles/2006/11/07/twelve-free-chemistry-databases"&gt;twelve of them&lt;/a&gt;. It turns out that &lt;a href="http://www.indiana.edu/~cheminfo/cicc/"&gt;Chembiogrid&lt;/a&gt; from Indiana University maintains a &lt;a href="http://www.indiana.edu/~cheminfo/cicc/databases.html#free"&gt;list of forty free chemistry databases&lt;/a&gt;, most of which are alive and well.&lt;/p&gt;

&lt;p&gt;As this trend continues, the need for database standards will become painfully obvious. Not only will interoperable infrastructure technologies and user interface standards need to be developed, but thorny intellectual property issues including &lt;a href="http://depth-first.com/articles/2006/09/27/hacking-pubchem-free-speech-or-free-beer"&gt;access, chain of title&lt;/a&gt;, and &lt;a href="http://depth-first.com/articles/2006/09/22/hacking-pubchem-why-the-open-access-fight-is-just-the-beginning"&gt;digital rights&lt;/a&gt; will need to be resolved. However, the most immediate need is much more down-to-earth: to involve chemists with the growing number of free alternatives to the &lt;a href="http://www.cas.org/"&gt;chemical information monopoly&lt;/a&gt; they've come to rely on.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.numly.com/numly/verify.asp?id=98219-070105-899330-68"&gt;&lt;img alt="numly esn" src="http://numly.com/numly/icon.asp?id=9821907010589933068" border="0"&gt; 98219-070105-899330-68&lt;/a&gt; &lt;/p&gt;</description>
      <pubDate>Fri, 05 Jan 2007 14:53:00 -0500</pubDate>
      <guid isPermaLink="false">urn:uuid:701a5e92-94f4-4c6f-af18-f39018caec88</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/01/05/from-famine-to-feast-a-bumper-crop-of-free-chemistry-databases</link>
      <category>Databases</category>
      <category>pubchem</category>
      <category>opendata</category>
      <category>openaccess</category>
      <category>chembiogrid</category>
      <category>cas</category>
    </item>
    <item>
      <title>Chemical Reviews on Wikipedia</title>
      <description>&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/files/500px-Sharpless_Dihydroxylation_Scheme.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;Until 1966, Chemical Abstracts Service used &lt;a href="http://depth-first.com/articles/2006/08/19/history-of-abstracting-at-chemical-abstracts-service"&gt;volunteers&lt;/a&gt; exclusively to abstract the chemical literature. At the system's peak, thousands of scientists were willing, even eager, to perform this tedious, demanding work for very little pay. The system was eventually phased out in favor of a professional abstracting service.&lt;/p&gt;

&lt;p&gt;What motivated these volunteer abstracters? Enlightened self-interest probably played a role. After all, preparing a set of abstracts in a field you do research in can pay off in your own increased productivity. It's also a good way to stay current with the literature, something you would do anyway. If your abstracts help your fellow scientist at the same time, so much the better. Another motivation could have been a simple desire to create order out of chaos, not unlike the many &lt;a href="http://digg.com"&gt;social networking activities&lt;/a&gt; flourishing on the internet today. &lt;a href="http://almost.cubic.uni-koeln.de/jrg/"&gt;Christoph Steinbeck&lt;/a&gt; will be giving a &lt;a href="http://wiki.cubic.uni-koeln.de/blog/pivot/entry.php?id=5#body"&gt;talk&lt;/a&gt; at the Fall 2006 ACS touching on this theme, and it's likely others will too as the field gathers momentum.&lt;/p&gt;

&lt;p&gt;In browsing &lt;a href="http://blog.tenderbutton.com"&gt;Dylan Stiles' blog&lt;/a&gt;, I came across &lt;a href="http://blog.tenderbutton.com/?p=250"&gt;an entry&lt;/a&gt; on the aldehyde-to-alkyne homologation. In it, Stiles cited a brief but informative &lt;a href="http://en.wikipedia.org/wiki/Seyferth-Gilbert_homologation"&gt;Wikipedia review&lt;/a&gt; on this reaction.&lt;/p&gt;

&lt;p&gt;Surely this couldn't be the only example of online volunteer-created reviews in chemistry on Wikipedia. A quick search resulted in numerous examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Wittig_reaction"&gt;Wittig Reaction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Grignard_reaction"&gt;Grignard Reaction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Dihydroxylation"&gt;Sharpless Dihydroxylation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Diels-Alder"&gt;Diels-Alder Reaction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Thermite"&gt;Thermite Reaction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Danishefsky_Taxol_total_synthesis"&gt;Danishefsky Taxol Total Synthesis&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Olefin_metathesis"&gt;Olefin Metathesis&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/McMurry_reaction"&gt;McMurry Coupling&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Robinson_annulation"&gt;Robinson Annulation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Swern_oxidation"&gt;Swern Oxidation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Cholesterol"&gt;Cholesterol&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The proliferation of this kind of volunteer, peer-reviewed chemical documentation is similar in spirit to that used by CAS in earlier times, although the technology couldn't be more different. Of course, this approach is not without its limitations and &lt;a href="http://www.youtube.com/watch?v=zmHm0rGns4I"&gt;potential pitfalls&lt;/a&gt;, but it is remarkably &lt;a href="http://en.wikipedia.org/wiki/Elephants"&gt;self-correcting&lt;/a&gt;. This emerging system offers something that CAS will never be able to provide - involvement in, and ownership of, the documentation process itself.&lt;/p&gt;

&lt;p&gt;Unfortunately, chemical informatics technologies have not kept up with internet technologies and the people currently using them. The reliance on &lt;a href="http://depth-first.com/articles/2006/08/25/computational-perception-and-recognition-of-digitized-molecular-structures"&gt;raster images of 2-D structures&lt;/a&gt; and the lack of a reliable web-enabled chemical indexing system both loom especially large as future problems to be addressed. What tools does this new kind of chemical publishing need to become more effective and efficient? How can these tools be made as &lt;a href="http://depth-first.com/articles/2006/09/05/the-automatic-encoding-of-chemical-structures"&gt;invisible&lt;/a&gt; as possible?&lt;/p&gt;</description>
      <pubDate>Fri, 08 Sep 2006 14:34:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:10cd0b85-a082-4aca-bbe6-296e7d78fdd7</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2006/09/08/chemical-reviews-on-wikipedia</link>
      <category>Web</category>
      <category>wikipedia</category>
      <category>volunteer</category>
      <category>cas</category>
    </item>
  </channel>
</rss>
