<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/stylesheets/rss.css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>Depth-First: Validating CAS Numbers</title>
    <link>http://depth-first.com/articles/2008/07/23/validating-cas-numbers</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description>Walking the Web of Chemical Informatics</description>
    <item>
      <title>Validating CAS Numbers</title>
      <description>&lt;p&gt;The Chemical Abstracts Service (CAS) &lt;a href="http://en.wikipedia.org/wiki/CAS_registry_number"&gt;registry number system&lt;/a&gt; was designed to be fault-tolerant. Built into every CAS number is a &lt;a href="http://www.cas.org/expertise/cascontent/registry/checkdig.html"&gt;check-digit&lt;/a&gt; that makes it possible to detect mis-typed numbers. Validation is a mathematical and repetitive process well-suited for software.&lt;/p&gt;

&lt;p&gt;The Ruby program below validates arbitrary CAS numbers:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="keyword"&gt;module &lt;/span&gt;&lt;span class="module"&gt;CAS&lt;/span&gt;
  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;validate&lt;/span&gt; &lt;span class="ident"&gt;cas_number&lt;/span&gt;
    &lt;span class="keyword"&gt;return&lt;/span&gt; &lt;span class="constant"&gt;false&lt;/span&gt; &lt;span class="keyword"&gt;unless&lt;/span&gt; &lt;span class="ident"&gt;cas_number&lt;/span&gt; &lt;span class="punct"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="ident"&gt;cas_number&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;match&lt;/span&gt;&lt;span class="punct"&gt;(/&lt;/span&gt;&lt;span class="regex"&gt;[0-9]{2,7}-[0-9]{2}-[0-9]&lt;/span&gt;&lt;span class="punct"&gt;/)&lt;/span&gt;

    &lt;span class="ident"&gt;check_digit&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;cas_number&lt;/span&gt;&lt;span class="punct"&gt;[-&lt;/span&gt;&lt;span class="number"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt;&lt;span class="number"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;].&lt;/span&gt;&lt;span class="ident"&gt;to_i&lt;/span&gt;
    &lt;span class="ident"&gt;sum&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="number"&gt;0&lt;/span&gt;

    &lt;span class="ident"&gt;cas_number&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;reverse&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;scan&lt;/span&gt;&lt;span class="punct"&gt;(/&lt;/span&gt;&lt;span class="regex"&gt;[0-9]&lt;/span&gt;&lt;span class="punct"&gt;/).&lt;/span&gt;&lt;span class="ident"&gt;each_with_index&lt;/span&gt; &lt;span class="keyword"&gt;do&lt;/span&gt; &lt;span class="punct"&gt;|&lt;/span&gt;&lt;span class="ident"&gt;digit&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;i&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt;
      &lt;span class="ident"&gt;sum&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;sum&lt;/span&gt; &lt;span class="punct"&gt;+&lt;/span&gt; &lt;span class="ident"&gt;digit&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;to_i&lt;/span&gt; &lt;span class="punct"&gt;*&lt;/span&gt; &lt;span class="ident"&gt;i&lt;/span&gt;
    &lt;span class="keyword"&gt;end&lt;/span&gt;

    &lt;span class="ident"&gt;check_digit&lt;/span&gt; &lt;span class="punct"&gt;==&lt;/span&gt; &lt;span class="ident"&gt;sum&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;remainder&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="number"&gt;10&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;

&lt;span class="ident"&gt;include&lt;/span&gt; &lt;span class="constant"&gt;CAS&lt;/span&gt;

&lt;span class="keyword"&gt;while&lt;/span&gt; &lt;span class="constant"&gt;true&lt;/span&gt; &lt;span class="keyword"&gt;do&lt;/span&gt;
  &lt;span class="ident"&gt;print&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;CAS Number: &lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;

  &lt;span class="ident"&gt;cas_number&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;gets&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;strip&lt;/span&gt;

  &lt;span class="keyword"&gt;break&lt;/span&gt; &lt;span class="keyword"&gt;if&lt;/span&gt; &lt;span class="ident"&gt;cas_number&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;empty?&lt;/span&gt;

  &lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="constant"&gt;CAS&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;validate&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;cas_number&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt; &lt;span class="punct"&gt;?&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;valid&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="punct"&gt;:&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;invalid&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

The program can be tested from the command line:

&lt;div class="console"&gt;
&lt;pre&gt;
$ ruby cas.rb
CAS Number: 107-07-3
valid
CAS Number: 107-87-3
invalid
CAS Number:
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Note that a validated CAS number can still be absent from the CAS database; validation only says that a CAS number &lt;em&gt;could&lt;/em&gt; be valid based on its format.&lt;/p&gt;</description>
      <pubDate>Wed, 23 Jul 2008 12:30:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:37a66468-b6ed-4237-8fdc-81ac981466c8</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2008/07/23/validating-cas-numbers</link>
      <category>Tools</category>
      <category>cas</category>
      <category>casnumber</category>
      <category>validate</category>
      <category>ruby</category>
    </item>
    <item>
      <title>"Validating CAS Numbers" by ChemSpiderMan</title>
      <description>&lt;p&gt;Rich..all of the CAS Numbers on ChemSpider labeled as [RN] following the number are validated using a similar approach.&lt;/p&gt;</description>
      <pubDate>Thu, 24 Jul 2008 11:25:46 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:0b3c483f-4d98-4423-a317-8ab16d798189</guid>
      <link>http://depth-first.com/articles/2008/07/23/validating-cas-numbers#comment-654</link>
    </item>
  </channel>
</rss>
