<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/stylesheets/rss.css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>Depth-First: Tag wikipedia</title>
    <link>http://depth-first.com/articles/tag/wikipedia</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description>Walking the Web of Chemical Informatics</description>
    <item>
      <title>Building Chempedia: Learning About Contributors</title>
      <description>&lt;p&gt;&lt;a href="http://chempedia.com"&gt;&lt;img src="http://depth-first.com/demo/20080513/chempedia.png" align="right"&gt;&lt;/img&gt;&lt;/a&gt;&lt;a href="http://chempedia.com/"&gt;Chempedia&lt;/a&gt; is a free online chemical encyclopedia similar in concept to the Merck Index, but &lt;a href="http://depth-first.com/articles/2008/04/28/building-chempedia-indexing-wikipedias-6-411-compound-monographs"&gt;radically different&lt;/a&gt; in implementation. One key difference: the Merck Index is compiled by a small number of paid professionals while Chempedia is compiled by thousands of unpaid volunteers. Although this distinction raises a host of intriguing questions, one of the most basic revolves around what can be said about these volunteers in the aggregate. This article, the first in a series, explores this issue with some statistics compiled from Chempedia.&lt;/p&gt;

&lt;h4&gt;Learning About Contributors&lt;/h4&gt;

&lt;p&gt;Chempedia works in part by aggregating content from Wikipedia dealing with single molecular entities, or "Compound Monographs." This content is created by the now &lt;a href="http://en.wikipedia.org/wiki/Wikipedia:Introduction"&gt;famous process&lt;/a&gt; of individuals taking upon themselves the responsibility of fixing what's broken in Wikipedia. (Some take it upon themselves to &lt;a href="http://en.wikipedia.org/wiki/Wikipedia:Vandalism"&gt;break what's working&lt;/a&gt;, but that's another topic.)&lt;/p&gt;

&lt;p&gt;Chempedia associates each of its Compound Monographs with the last Wikipedia user to edit it. The current interface to these relationships is available on the &lt;a href="http://chempedia.com/contributors"&gt;Chempedia contributors page&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The interface to this page is currently limited. The analyses reported here were made for the most part by querying the Chempedia database directly.&lt;/p&gt;

&lt;p&gt;Each contributor is linked to a contributor summary page containing links to that user's Wikipedia homepage and talk page, as well as a complete listing of all active contributions. For example, you can view the contributor page for one of Chempedia's most active contributors, &lt;a href="http://chempedia.com/contributors/40"&gt;Arcadian&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The data model is also limited. Because Chempedia only records the last Contributor to edit a Monograph, when another Contributor edits that Monograph, the link between the previous Contributor and the Monograph is lost. As a result, many Contributors have no associated Monographs.&lt;/p&gt;

&lt;h4&gt;How Many Monographs?&lt;/h4&gt;

&lt;p&gt;Chempedia currently hosts 6,308 Compound Monographs.&lt;/p&gt;

&lt;h4&gt;How Many Contributors?&lt;/h4&gt;

&lt;p&gt;Chempedia currently lists &lt;a href="http://chempedia.com/contributors"&gt;2,516 Contributors&lt;/a&gt;. Of these, 1,046 (42%) are associated with one or more Monographs, meaning that they were the last to edit them. The remaining 1,470 are not currently the last editor of any Monograph.&lt;/p&gt;

&lt;p&gt;Here is a list of the top 20 Contributors and the number of Monographs they were the last to edit:&lt;/p&gt;

&lt;table&gt;
&lt;tr&gt;&lt;td&gt;&lt;a href="http://chempedia.com/contributors/2"&gt;anonymous&lt;/a&gt;&lt;/td&gt;&lt;td&gt;1022&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;a href="http://chempedia.com/contributors/2"&gt;DOI bot&lt;/a&gt;&lt;/td&gt;&lt;td&gt;904&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;a href="http://chempedia.com/contributors/1"&gt;Edgar181&lt;/a&gt;&lt;/td&gt;&lt;td&gt;378&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;a href="http://chempedia.com/contributors/66"&gt;Fvasconcellos&lt;/a&gt;&lt;/td&gt;&lt;td&gt;170&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;a href="http://chempedia.com/contributors/31"&gt;Meodipt&lt;/a&gt;&lt;/td&gt;&lt;td&gt;151&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;a href="http://chempedia.com/contributors/40"&gt;Arcadian&lt;/a&gt;&lt;/td&gt;&lt;td&gt;144&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;a href="http://chempedia.com/contributors/59"&gt;Chem-awb&lt;/a&gt;&lt;/td&gt;&lt;td&gt;133&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;a href="http://chempedia.com/contributors/22"&gt;Chowbok&lt;/a&gt;&lt;/td&gt;&lt;td&gt;122&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;a href="http://chempedia.com/contributors/2"&gt;Rifleman 82&lt;/a&gt;&lt;/td&gt;&lt;td&gt;114&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;a href="http://chempedia.com/contributors/10"&gt;SmackBot&lt;/a&gt;&lt;/td&gt;&lt;td&gt;105&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;a href="http://chempedia.com/contributors/19"&gt;Thijs!bot&lt;/a&gt;&lt;/td&gt;&lt;td&gt;99&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;a href="http://chempedia.com/contributors/1236"&gt;ChemNerd&lt;/a&gt;&lt;/td&gt;&lt;td&gt;85&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;a href="http://chempedia.com/contributors/127"&gt;Puppy8800&lt;/a&gt;&lt;/td&gt;&lt;td&gt;80&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;a href="http://chempedia.com/contributors/48"&gt;DumZiBoT&lt;/a&gt;&lt;/td&gt;&lt;td&gt;78&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;a href="http://chempedia.com/contributors/182"&gt;Axiosaurus&lt;/a&gt;&lt;/td&gt;&lt;td&gt;63&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;a href="http://chempedia.com/contributors/6"&gt;Chempedia&lt;/a&gt;&lt;/td&gt;&lt;td&gt;63&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;a href="http://chempedia.com/contributors/174"&gt;Carlo Banez&lt;/a&gt;&lt;/td&gt;&lt;td&gt;55&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;a href="http://chempedia.com/contributors/13"&gt;Benjah-bmm27&lt;/a&gt;&lt;/td&gt;&lt;td&gt;52&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;a href="http://chempedia.com/contributors/93"&gt;OKBot&lt;/a&gt;&lt;/td&gt;&lt;td&gt;51&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;a href="http://chempedia.com/contributors/45"&gt;Cacycle&lt;/a&gt;&lt;/td&gt;&lt;td&gt;50&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;These 20 Contributors represent 1.9% of the 1,046 active Contributors and collectively were the last to edit 62% of all Monographs. Although not performed here, a histogram plotting number of contributions would be expected to follow a &lt;a href="http://en.wikipedia.org/wiki/Power_law"&gt;power law&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;'Anonymous' is an aggregation of all users who edited a Monograph without a Wikipedia account. 16% of all Monographs were last edited by an anonymous user. Leaving out the aggregated 'anonymous' users indicates that roughly half of all Monographs were last edited by the top 19 Contributors.&lt;/p&gt;
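
&lt;p&gt;The percentages quoted above can be reproduced from the table and the totals given in this article. The following is only a minimal sketch: the counts are copied from the table, while the variable names and the &lt;code&gt;percent&lt;/code&gt; helper are made up for illustration and are not part of Chempedia.&lt;/p&gt;

```python
# Monograph counts for the top 20 Contributors, copied from the table above.
top20 = {
    "anonymous": 1022, "DOI bot": 904, "Edgar181": 378, "Fvasconcellos": 170,
    "Meodipt": 151, "Arcadian": 144, "Chem-awb": 133, "Chowbok": 122,
    "Rifleman 82": 114, "SmackBot": 105, "Thijs!bot": 99, "ChemNerd": 85,
    "Puppy8800": 80, "DumZiBoT": 78, "Axiosaurus": 63, "Chempedia": 63,
    "Carlo Banez": 55, "Benjah-bmm27": 52, "OKBot": 51, "Cacycle": 50,
}

# Totals quoted in the text of this article.
TOTAL_MONOGRAPHS = 6308
TOTAL_CONTRIBUTORS = 2516
ACTIVE_CONTRIBUTORS = 1046  # last editor of one or more Monographs


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


top20_total = sum(top20.values())
named_total = top20_total - top20["anonymous"]  # the 19 named accounts

summary = {
    "active_of_all_contributors": percent(ACTIVE_CONTRIBUTORS, TOTAL_CONTRIBUTORS),
    "top20_of_active_contributors": percent(len(top20), ACTIVE_CONTRIBUTORS),
    "top20_of_monographs": percent(top20_total, TOTAL_MONOGRAPHS),
    "anonymous_of_monographs": percent(top20["anonymous"], TOTAL_MONOGRAPHS),
    "top19_named_of_monographs": percent(named_total, TOTAL_MONOGRAPHS),
}

for label, value in summary.items():
    print(label, value)
```

&lt;p&gt;The "roughly half" figure for the top 19 named Contributors is computed here against all Monographs; dividing by only the Monographs not last edited anonymously would give a somewhat higher share.&lt;/p&gt;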

&lt;h4&gt;What is a Contributor?&lt;/h4&gt;

&lt;p&gt;It's difficult to say a lot about individual Contributors, but most appear to have some training in science, although that training may not have involved chemistry or biology. Still others (for example, &lt;a href="http://chempedia.com/contributors/2404"&gt;SJP&lt;/a&gt;) appear to have been drawn to contribute to a Monograph based on their nonscientific experience with the title compound or in an effort to fight vandalism or otherwise improve the nonscientific content of the Monograph. The ability of services like Wikipedia (and by extension Chempedia) to provide a platform for those without formal training in a particular area to make useful contributions is without question one of their most useful (and controversial) features.&lt;/p&gt;

&lt;p&gt;Some Contributors are not even human, but rather robots designed to improve the quality of Wikipedia articles in general. For example, &lt;a href="http://chempedia.com/contributors/10"&gt;SmackBot&lt;/a&gt; performs an array of tedious quality control jobs such as fixing bad checksum ISBNs (&lt;a href="http://www.cas.org/expertise/cascontent/registry/checkdig.html"&gt;CAS Numbers, anyone?&lt;/a&gt;) and capitalization errors.&lt;/p&gt;
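
&lt;p&gt;CAS Numbers carry a check digit much as ISBNs do, so the same kind of automated repair is conceivable for them. As a rough sketch, assuming the usual CAS rule (each digit before the check digit is weighted by its position counted from the right, and the weighted sum modulo 10 must equal the check digit), a validator might look like this. The function names are hypothetical, and the code only tests the hyphenated layout and the check digit, not whether the number is actually assigned to a compound.&lt;/p&gt;

```python
def cas_check_digit(body_digits):
    """Compute the check digit for the digits that precede it (hyphens removed)."""
    # The rightmost digit gets weight 1, the next one weight 2, and so on.
    weighted = (
        weight * int(digit)
        for weight, digit in enumerate(reversed(body_digits), start=1)
    )
    return sum(weighted) % 10


def is_valid_cas(cas_number):
    """Return True if the text looks like NNNN-NN-N and its check digit matches."""
    parts = cas_number.split("-")
    if len(parts) != 3:
        return False
    if not all(part.isascii() and part.isdigit() for part in parts):
        return False
    # The middle group has two digits and the check digit is a single digit.
    if len(parts[1]) != 2 or len(parts[2]) != 1:
        return False
    return cas_check_digit(parts[0] + parts[1]) == int(parts[2])


print(is_valid_cas("7732-18-5"))  # water
print(is_valid_cas("58-08-2"))    # caffeine
print(is_valid_cas("7732-18-4"))  # water with a deliberately wrong check digit
```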

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;Wikipedia's collaboration model has made the creation of a free and continuously-updated chemical encyclopedia feasible. Applying chemistry-specific user interfaces and data models exposes this hidden treasure. Although it's tempting to think of this process as mainly being the work of a handful of trained scientists, the numbers suggest a much broader base of contributors. Future articles will explore this idea.&lt;/p&gt;

&lt;p&gt;Related Article: &lt;a href="http://depth-first.com/articles/2008/05/21/building-chempedia-social-networking-applied-to-chemistry"&gt;&lt;em&gt;Building Chempedia: Social Networking Applied to Chemistry&lt;/em&gt;&lt;/a&gt;.&lt;/p&gt;</description>
      <pubDate>Wed, 02 Jul 2008 11:50:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:cc2cc82d-b3d9-4bba-89de-69f685033389</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2008/07/02/building-chempedia-learning-about-contributors</link>
      <category>Tools</category>
      <category>chempedia</category>
      <category>wikipedia</category>
      <category>collectiveintelligence</category>
      <category>socialnetworking</category>
      <category>merckindex</category>
    </item>
    <item>
      <title>Building Chempedia: Social Networking Applied to Chemistry</title>
      <description>&lt;p&gt;&lt;a href="http://chempedia.com"&gt;&lt;img src="http://depth-first.com/demo/20080513/chempedia.png" align="right"&gt;&lt;/img&gt;&lt;/a&gt;&lt;a href="http://chempedia.com"&gt;Chempedia&lt;/a&gt; is a free online chemical encyclopedia; it's also a work in progress, the contents of which are being written by numerous volunteers worldwide. A previous article described initial work toward &lt;a href="http://depth-first.com/articles/2008/05/15/building-chempedia-the-human-element"&gt;connecting the people behind Chempedia's content with the compound monographs they're writing&lt;/a&gt;. This article will describe new features that take this idea much further.&lt;/p&gt;

&lt;h4&gt;Contributors&lt;/h4&gt;

&lt;p&gt;Chempedia now uses the concept of a "Contributor" as part of its data model. Each Compound Monograph has one associated Contributor, the Wikipedia user who last edited it. In other words, a one-to-many relationship exists between a Contributor and a Monograph.&lt;/p&gt;

&lt;p&gt;Of course, this model is simplistic; Compound Monographs are edited by multiple users over time, and so the relationship should be many-to-many. Nevertheless, for now a one-to-many relationship works well enough.&lt;/p&gt;
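
&lt;p&gt;To make the limitation concrete, here is a small in-memory sketch of the "last editor only" model. It is not Chempedia's actual schema: the class, its method names, and the contributor names "alice" and "bob" are all invented for illustration.&lt;/p&gt;

```python
from collections import defaultdict


class MonographIndex:
    """Hypothetical stand-in for a one-to-many Contributor/Monograph model."""

    def __init__(self):
        # Maps each Monograph title to exactly one Contributor name.
        self.last_editor = {}

    def record_edit(self, monograph, contributor):
        # Overwrites any earlier Contributor: only the last editor is kept.
        self.last_editor[monograph] = contributor

    def monographs_by_contributor(self):
        """Group Monographs under the Contributor who last edited them."""
        grouped = defaultdict(list)
        for monograph, contributor in self.last_editor.items():
            grouped[contributor].append(monograph)
        return dict(grouped)


index = MonographIndex()
index.record_edit("Methylparaben", "alice")
index.record_edit("Modafinil", "bob")
index.record_edit("Modafinil", "alice")  # a later edit replaces bob

result = index.monographs_by_contributor()
print(result)
```

&lt;p&gt;After the second edit to Modafinil, "bob" no longer appears in the result at all, which is exactly the information a many-to-many model would preserve.&lt;/p&gt;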

&lt;h4&gt;Learning About Contributors&lt;/h4&gt;

&lt;p&gt;You can view a &lt;a href="http://chempedia.com/contributors"&gt;complete list of contributors to Chempedia&lt;/a&gt;. As you can see, over 1,000 Wikipedia users are currently listed. The number in parentheses appearing after each contributor's username is the number of Monographs for which Wikipedia lists them as the last editor.&lt;/p&gt;

&lt;p&gt;Chempedia contains just over 6,400 Compound Monographs; the fact that 1,000 Wikipedia users contributed to making that happen is remarkable. That such a large number of users contribute relative to the number of Monographs may be surprising given the rule of thumb that &lt;a href="http://depth-first.com/articles/2007/10/05/what-makes-wikipedia-tick"&gt;only 2-10% of users are responsible for the majority of the work on community-driven projects&lt;/a&gt;. While the majority of work may well be done by a relatively small group of Contributors, these numbers demonstrate a &lt;a href="http://depth-first.com/articles/2006/08/19/history-of-abstracting-at-chemical-abstracts-service"&gt;widespread interest in creating and maintaining information about chemical compounds&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Chempedia lets you learn more about what an individual Contributor has done. Clicking on a Contributor name takes us to a Contributor summary page showing all of the Monographs on which they are listed as Contributor, as well as their Wikipedia home and talk pages. The latter can be used to take part in discussions.&lt;/p&gt;

&lt;p&gt;A particularly active contributor (one of several) goes by the name of &lt;a href="http://chempedia.com/contributors/1"&gt;Edgar181&lt;/a&gt;. As of this writing, s/he is listed as the Contributor on 458 Compound monographs, and the last one s/he edited was &lt;a href="http://chempedia.com/monographs/methylparaben"&gt;Methylparaben&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;And 1,000 Contributors represents only a lower limit due to the large number of &lt;a href="http://chempedia.com/contributors/2"&gt;anonymous contributions&lt;/a&gt;, which on Chempedia are lumped together. As you can see, over 1,100 Compound Monographs were last edited by a Wikipedia user who didn't log in.&lt;/p&gt;

&lt;h4&gt;Thank Goodness for Robots&lt;/h4&gt;

&lt;p&gt;Quite a few &lt;a href="http://chempedia.com/contributors"&gt;Contributors&lt;/a&gt; have the letters 'Bot' in their names. A 'Bot is a script designed to do work on Wikipedia that would be tedious and/or error prone if done by humans.&lt;/p&gt;

&lt;p&gt;One of my favorites is &lt;a href="http://chempedia.com/contributors/17"&gt;ClueBot&lt;/a&gt;. From the &lt;a href="http://en.wikipedia.org/wiki/User:ClueBot"&gt;ClueBot Wikipedia user page&lt;/a&gt;, this script's purpose in life is to revert Wikipedia vandalism, a job it does with breathtaking efficiency and accuracy.&lt;/p&gt;

&lt;p&gt;For example, one of ClueBot's last pieces of work was to &lt;a href="http://en.wikipedia.org/w/index.php?title=Lysergic_acid_diethylamide&amp;amp;diff=213850493&amp;amp;oldid=213850472"&gt;revert an edit&lt;/a&gt; made to &lt;a href="http://chempedia.com/monographs/lysergic-acid-diethylamide"&gt;Lysergic Acid Diethylamide&lt;/a&gt; in which a user tried to include some enthusiastic, but subjective, comments about &lt;a href="http://en.wikipedia.org/wiki/Albert_Hofmann"&gt;Albert Hofmann's&lt;/a&gt; discovery.&lt;/p&gt;

&lt;p&gt;In less than one minute, ClueBot had not only identified the comment as vandalism (despite the fact that no 'offensive' language was used), but had removed it as well. Amazing.&lt;/p&gt;

&lt;h4&gt;Connecting Contributors to Monographs&lt;/h4&gt;

&lt;p&gt;Chempedia is quickly forming a densely connected network of people and molecules. What can we do with this?&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20080521/edit_status.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;A new edit status line has been added to Monograph summaries (above). With it, you can easily see the number of edits that have occurred, when the last one happened, and who did it. Links will take you directly to the Wikipedia edit history page and to the Chempedia Contributor page for the last editor.&lt;/p&gt;

&lt;p&gt;For example, the entry for &lt;a href="http://chempedia.com/monographs/modafinil"&gt;Modafinil&lt;/a&gt; currently lists &lt;a href="http://chempedia.com/contributors/314"&gt;Paul gene&lt;/a&gt; as the last contributor. Bringing up his Chempedia contributor page, we can see that he's listed as the last Contributor on three other Monographs, all of which are organic compounds of pharmacological interest. Curious about whether this might be one of Paul gene's interests, we click on the &lt;a href="http://en.wikipedia.org/wiki/User:Paul%20gene"&gt;User Page&lt;/a&gt; link at the top right of the contributor page and find out that this Wikipedia user received a Ph.D., works in academia, and has an interest in pharmacology, immunology, chemistry, kinase inhibitors, and antidepressants.&lt;/p&gt;

&lt;h4&gt;Newly-Edited Monographs&lt;/h4&gt;

&lt;p&gt;It might be of interest to know when Compound Monographs are edited. This can be done from the &lt;a href="http://chempedia.com/monographs"&gt;Browse&lt;/a&gt; link at the top-left main menu. On this page Monographs are sorted in descending order according to the last edit timestamp. The most recently-edited monographs appear on the first page, which is currently updated once every 30 minutes or so.&lt;/p&gt;

&lt;h4&gt;Hot Monographs&lt;/h4&gt;

&lt;p&gt;We may also be interested in which Compound Monographs are receiving the most edit activity. We can do that by choosing the &lt;a href="http://chempedia.com/monographs?sortby=activity"&gt;Active&lt;/a&gt; link at the top-right submenu. As of this writing, &lt;a href="http://chempedia.com/monographs/heroin"&gt;Heroin&lt;/a&gt; is the most actively edited Monograph, with 10 edits since May 19&lt;sup&gt;th&lt;/sup&gt;. Clicking on &lt;a href="http://en.wikipedia.org/w/index.php?title=Heroin&amp;amp;action=history"&gt;the link&lt;/a&gt; in the edit status line, we can see what all the activity is about.&lt;/p&gt;

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;None of the technology described here is especially new or innovative; social networking has been part of information systems for several years now and relational databases are designed to make discoveries possible by linking disparate pieces of information. What is new is Chempedia's application of social networking, facilitated by relational databases, to chemistry. I'm unaware of any other chemical information system that takes the possibilities of social networking as far as Chempedia has taken them.&lt;/p&gt;

&lt;p&gt;There's quite a bit more that could be done to link people and molecules on Chempedia, but for now, it's time to move on to some related areas. It turns out that the use of CAS numbers outside of the CAS database system itself raises all kinds of difficult and interesting questions around trust and authority to which social networking ideas can be applied. But that's a story for another time.&lt;/p&gt;</description>
      <pubDate>Wed, 21 May 2008 10:40:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:6e4b6bf3-e0db-417e-9f28-87412710fdf1</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2008/05/21/building-chempedia-social-networking-applied-to-chemistry</link>
      <category>Tools</category>
      <category>chempedia</category>
      <category>socialnetworking</category>
      <category>wikipedia</category>
      <category>vandalism</category>
      <category>cluebot</category>
      <category>contributor</category>
    </item>
    <item>
      <title>Building Chempedia: The Human Element</title>
      <description>&lt;p&gt;&lt;a href="http://chempedia.com" align="right"&gt;&lt;img src="http://depth-first.com/demo/20080513/chempedia.png" align="right"&gt;&lt;/img&gt;&lt;/a&gt;The study of chemistry is an inherently social activity. From the papers we use and cite, to the conferences we attend, to the informal discussions we engage in daily, being a chemist means interacting with your fellow chemists. Yet strangely, most chemical information systems either totally ignore this central fact, or provide only the most meager of tools to harness it to its full potential. This article discusses how &lt;a href="http://chempedia.com"&gt;Chempedia&lt;/a&gt; currently integrates the social with the scientific, and what may be in store for the future.&lt;/p&gt;

&lt;h4&gt;Chempedia as a Tool for Scientific Collaboration&lt;/h4&gt;

&lt;p&gt;Like all chemical reference works, Chempedia is written by people with their own interests, skills, and ambitions. Unlike almost every other chemical reference work, Chempedia (through Wikipedia, on which it's based) offers intriguing possibilities to directly collaborate and learn from its contributors - or even become one of them.&lt;/p&gt;

&lt;p&gt;How can Chempedia better facilitate scientific collaboration?&lt;/p&gt;

&lt;h4&gt;A Simple But Possibly Useful Feature&lt;/h4&gt;

&lt;p&gt;Yesterday, a new feature was added to Chempedia that makes it easier to understand the recent history of a Compound Monograph. The new feature shows the date that a Compound Monograph was last edited, and the Wikipedia user who edited it:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/demo/20080515/screen.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;Clicking on the link takes you to the Wikipedia user's page, in this case the one for &lt;a href="http://en.wikipedia.org/wiki/User:Meodipt"&gt;Meodipt&lt;/a&gt;. (Wikipedia users frequently use handles rather than their given names.) From Meodipt's page, we can see that s/he received degrees in chemistry and pharmacology and is currently studying law. Meodipt's interests include pharmacology, chemistry, law, and science. We can also see that Meodipt is maintaining a &lt;a href="http://en.wikipedia.org/wiki/User:Meodipt/casnumbers"&gt;good-sized list of CAS numbers for drugs&lt;/a&gt;, grouped by indication.&lt;/p&gt;

&lt;p&gt;We might be curious about what Meodipt found worth changing, and how s/he changed it. We could do so by first clicking the Chempedia &lt;a href="http://chempedia.com/monographs/pravadoline/edit"&gt;edit link&lt;/a&gt;. In the Wikipedia box (framed by the red dotted lines), we would then click on the 'history' tab. Clicking on the 'last' link for the top entry shows us exactly what Meodipt changed on Pravadoline's compound monograph (also visible through &lt;a href="http://en.wikipedia.org/w/index.php?title=Pravadoline&amp;amp;diff=200731945&amp;amp;oldid=200731624"&gt;this link&lt;/a&gt;).&lt;/p&gt;

&lt;h4&gt;Looking Ahead&lt;/h4&gt;

&lt;p&gt;Linking a real person to changes in a Compound Monograph could be enormously useful, if done properly. After all, bringing people with highly focussed interests together is the essence of scientific collaboration. The Chempedia/Wikipedia combination provides one way to do that.&lt;/p&gt;

&lt;p&gt;As Chris Anderson puts it, "&lt;a href="http://www.longtail.com/the_long_tail/2007/09/social-networki.html"&gt;social networking should be a feature, not a destination&lt;/a&gt;." Scientists were social networking long before the Internet, the computer, and the telephone were invented; indeed scientists who fail to connect with their fellow scientists have a difficult time prospering. When seen from this perspective, it's surprising that good 'social networking' features would not be viewed as a top priority in chemical information systems.&lt;/p&gt;

&lt;p&gt;The Chempedia author credit system in its current form is rather simplistic and may not actually promote scientific collaboration at all. But it's not hard to imagine ways to make it far more effective. Future articles will discuss some of the possibilities.&lt;/p&gt;</description>
      <pubDate>Thu, 15 May 2008 14:50:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:ae862028-7efd-4e91-b5ee-36b91cbed66e</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2008/05/15/building-chempedia-the-human-element</link>
      <category>Tools</category>
      <category>chempedia</category>
      <category>wikipedia</category>
      <category>socialnetworking</category>
      <category>collaboration</category>
      <category>author</category>
    </item>
    <item>
      <title>The Daily Molecule: The Wonders of Chemistry - One Molecule at a Time</title>
      <description>&lt;p&gt;&lt;a href="http://blog.chempedia.com"&gt;&lt;img src="http://depth-first.com/demo/20080513/chempedia.png" align="right"&gt;&lt;/img&gt;&lt;/a&gt;Chemistry is a big field judged by any standard, including the &lt;a href="http://depth-first.com/articles/2008/05/07/1908-and-all-that-the-long-tail-and-chemistry"&gt;proliferation of American Chemical Society (ACS) divisions&lt;/a&gt;. Each subdiscipline in chemistry is in turn so big, that once a chemist becomes 'differentiated' it's easy to lose touch even with neighboring subdisciplines. It doesn't have to be that way. This article introduces a new service, &lt;a href="http://blog.chempedia.com"&gt;&lt;em&gt;The Daily Molecule&lt;/em&gt;&lt;/a&gt; designed to make it just a little bit easier (and hopefully fun) to stay in the chemical loop.&lt;/p&gt;

&lt;h4&gt;What Is It?&lt;/h4&gt;

&lt;p&gt;The idea is simple: every weekday, a new molecule will be featured on &lt;em&gt;The Daily Molecule&lt;/em&gt; with a short write-up and some leading references. Although molecules in the news will get first priority, any molecule is fair game.&lt;/p&gt;

&lt;p&gt;The material for &lt;em&gt;The Daily Molecule&lt;/em&gt; will be drawn from &lt;a href="http://chempedia.com"&gt;Chempedia&lt;/a&gt;, which in turn gets some of its content from &lt;a href="http://wikipedia.org"&gt;Wikipedia&lt;/a&gt;. In other words, the entries on the Daily Molecule will be largely written by my fellow chemists.&lt;/p&gt;

&lt;p&gt;The process of creating a &lt;em&gt;Daily Molecule&lt;/em&gt; entry is not time-consuming, but much of what is being done manually now could be automated in the future. The technology platform lends itself well to many forms of chemistry-specific modification (see below).&lt;/p&gt;

&lt;p&gt;I hesitate to use the term 'blog' to describe &lt;em&gt;The Daily Molecule&lt;/em&gt;, but the description may be helpful to an extent.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The Daily Molecule&lt;/em&gt; is unlike a blog in that most content will be generated by others, selected by some criteria, reformatted for consistency, and published. In that sense, &lt;em&gt;The Daily Molecule&lt;/em&gt; is something like a mini scientific journal, but it turns the process of acquiring content on its head.&lt;/p&gt;

&lt;p&gt;If chemistry ever evolves beyond the &lt;a href="http://depth-first.com/articles/2007/07/16/go-west-young-man-does-open-access-really-matter-in-the-long-run"&gt;current model of publication&lt;/a&gt;, which seems inevitable at this point, the journals of the future may resemble &lt;em&gt;The Daily Molecule&lt;/em&gt; in one or more ways.&lt;/p&gt;

&lt;h4&gt;Technology&lt;/h4&gt;

&lt;p&gt;The software running &lt;em&gt;The Daily Molecule&lt;/em&gt; is a modified version of &lt;a href="http://simplelog.net/"&gt;SimpleLog&lt;/a&gt;, a Web application based on &lt;a href="http://www.rubyonrails.org/"&gt;Ruby on Rails&lt;/a&gt;. Unlike most blogging engines, SimpleLog focuses on implementing only the most basic publication features, and doing them to perfection. If you know a little Ruby and can work with Rails, you can do a lot with SimpleLog.&lt;/p&gt;

&lt;p&gt;One of the first items of business will be to implement &lt;a href="http://depth-first.com/articles/2007/09/18/six-reasons-i-like-recaptcha-or-how-to-build-a-web-service-worth-talking-about"&gt;reCAPTCHA&lt;/a&gt; support and activate comments on articles.&lt;/p&gt;

&lt;p&gt;Some ideas for chemically-enabling &lt;em&gt;The Daily Molecule&lt;/em&gt; include a graphical abstract sidebar and (sub)structure search. Currently, the 2D chemical structure images posted to &lt;em&gt;The Daily Molecule&lt;/em&gt; &lt;a href="http://depth-first.com/articles/2007/08/08/never-draw-the-same-molecule-twice-viewing-image-metadata"&gt;have complete connection tables embedded as metadata&lt;/a&gt;, a feature with some interesting possibilities.&lt;/p&gt;

&lt;h4&gt;The Molecule of the Day/Week/Month&lt;/h4&gt;

&lt;p&gt;The basic idea behind &lt;em&gt;The Daily Molecule&lt;/em&gt; is not new. Many other services have sprung up over the last ten years that operate, at least on the surface, similarly. Some examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="http://www.moleculeoftheday.com/"&gt;Molecule of the Day&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://portal.acs.org/portal/acs/corg/content?_nfpb=true&amp;amp;_pageLabel=PP_TRANSITIONMAIN&amp;amp;node_id=677&amp;amp;use_sec=false&amp;amp;sec_url_var=region1"&gt;ACS Molecule of the Week&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.drugsandpoisons.com/"&gt;Drugs and Poisons&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://the-half-decent-pharmaceutical-chemistry-blog.chemblogs.org/category/saturday-night-synthesis"&gt;Saturday Night Synthesis&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.chm.bris.ac.uk/motm/motm.htm"&gt;The Molecule of the Month&lt;/a&gt; (may be the oldest continuously-operated MOTM site in existence)&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.3dchem.com/motm.asp"&gt;3dchem.com Molecule of the Month&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.expasy.org/spotlight/"&gt;Protein Spotlight&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://mgl.scripps.edu/people/goodsell/illustration/pdb"&gt;PDB Molecule of the Month&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.prous.com/molecules/default.asp"&gt;Prous Molecule of the Month&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Quite a few others don't appear on this list.&lt;/p&gt;

&lt;p&gt;The different idea behind &lt;em&gt;The Daily Molecule&lt;/em&gt; is that chemical content already exists on the Web in machine-readable format with licenses that permit its re-use; all that's needed is a way to aggregate, format, and package that information in a form suitable for once-daily scanning and cheminformatics manipulation.&lt;/p&gt;

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;Like no other medium, the Web blurs artificial distinctions: between work and play; between private and public; between on-topic and off-topic; between fame and obscurity; between mine and yours; between big and small; and between profit and non-profit. Chemistry may be late to the party, but is not immune to its call.&lt;/p&gt;</description>
      <pubDate>Wed, 14 May 2008 11:58:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:804a7467-98a1-47ae-975a-b1fdd172f1c0</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2008/05/14/the-daily-molecule-the-wonders-of-chemistry-one-molecule-at-a-time</link>
      <category>Meta</category>
      <category>dailymolecule</category>
      <category>scientificpublication</category>
      <category>chempedia</category>
      <category>wikipedia</category>
      <category>journal</category>
      <category>web</category>
      <category>rails</category>
      <category>ruby</category>
      <category>simplelog</category>
    </item>
    <item>
      <title>Building Chempedia: Indexing Wikipedia's 6,411 Compound Monographs</title>
      <description>&lt;p&gt;&lt;img src="http://depth-first.com/demo/20080428/merck.png" align="right"&gt;&lt;/img&gt;&lt;a href="http://www.merckbooks.com/mindex/"&gt;The Merck Index&lt;/a&gt; is one of chemistry's most useful reference works. Organized like an encyclopedia, each entry, or "Compound Monograph," describes a single compound complete with chemical structure, CAS Number, IUPAC name, trivial names, physical properties, and leading primary literature references describing uses. Unlike other chemistry databases, the Merck Index focuses on only those compounds with important industrial, biological, medical, or technical applications.&lt;/p&gt;

&lt;h4&gt;What's Wrong with the Merck Index?&lt;/h4&gt;

&lt;p&gt;Wonderful product though it may be, the Merck Index has some limitations. For starters, online versions are not free. The disadvantages of this access model go well beyond a simple price barrier; it prevents the very thing the Web was designed to promote: linking. Another limitation is the time it takes for new versions to appear, which is typically measured in years. Still another limitation is in the cost of adding entries for niche compounds that may not be suitable for a general audience, a major barrier to exposing &lt;a href="http://depth-first.com/articles/2007/08/27/the-long-tail-and-chemistry-why-so-many-acs-meeting-talks-are-uninteresting"&gt;chemistry's long tail&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;What's Chempedia?&lt;/h4&gt;

&lt;p&gt;If we wanted to create a free, online service that worked like the Merck Index but which took full advantage of today's powerful collaboration and information technology tools, how could we go about doing so?&lt;/p&gt;

&lt;p&gt;This article, the first in a series, discusses &lt;a href="http://chempedia.com"&gt;Chempedia&lt;/a&gt;, a free, structure-oriented online encyclopedia of useful chemical compounds designed to answer this question.&lt;/p&gt;

&lt;h4&gt;Background&lt;/h4&gt;

&lt;p&gt;The following articles may be useful in understanding Chempedia's approach and underlying technology:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2008/04/17/user-created-compound-monographs-on-chempedia-net-open-sourcing-the-collation-and-indexing-of-chemical-information"&gt;User-Created Compound Monographs on Chempedia.net: Open Sourcing the Collation and Indexing of Chemical Information&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2008/04/04/chempedia-net-mashing-up-pubchem-and-wikipedia"&gt;Chempedia.net: Mashing Up PubChem and Wikipedia&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2008/04/02/wikipedia-for-cheminformatics-a-simple-web-api-for-finding-cas-numbers-in-compound-monographs"&gt;Wikipedia for Cheminformatics: A Simple Web API for Finding CAS Numbers in Compound Monographs&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://depth-first.com/articles/2007/01/24/thirty-two-free-chemistry-databases"&gt;Thirty-Two Free Chemistry Databases&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;Where to Begin?&lt;/h4&gt;

&lt;p&gt;One of the first problems we'd face in building a free Web-based version of the Merck Index is where to get the compound monographs.&lt;/p&gt;

&lt;p&gt;It turns out that &lt;a href="http://wikipedia.org"&gt;Wikipedia&lt;/a&gt; (yes, Wikipedia) hosts a growing collection of compound monographs that, when viewed together, bear a striking resemblance to the Merck Index. And the effort is becoming increasingly organized with respect to content and data provenance.&lt;/p&gt;

&lt;p&gt;Why not start here?&lt;/p&gt;

&lt;h4&gt;The Task at Hand&lt;/h4&gt;

&lt;p&gt;To get an idea of just how Wikipedia's collection of compound monographs compares to the Merck Index, it helps to know: (1) how to find Wikipedia compound monographs; and (2) the range of information available for each entry.&lt;/p&gt;

&lt;p&gt;This tutorial will describe a simple method to index Wikipedia's compound monographs using nothing but free tools and data. Subsequent articles will discuss qualitative aspects of Wikipedia's compound monographs and the challenges involved in organizing them into a chemically-aware service.&lt;/p&gt;

&lt;h4&gt;Indexing Wikipedia's Compound Monographs&lt;/h4&gt;

&lt;p&gt;We can index Wikipedia compound monographs via a simple procedure.&lt;/p&gt;

&lt;p&gt;Most compound monographs employ one of four precompiled Wikipedia templates: &lt;a href="http://en.wikipedia.org/wiki/Template:Chembox"&gt;Chembox&lt;/a&gt; (deprecated); &lt;a href="http://en.wikipedia.org/wiki/Template:Chembox_new"&gt;Chembox new&lt;/a&gt;; &lt;a href="http://en.wikipedia.org/wiki/Template:Drugbox"&gt;Drugbox&lt;/a&gt;; and &lt;a href="http://en.wikipedia.org/wiki/Template:Explosivebox"&gt;Explosivebox&lt;/a&gt;. As an example of what these templates look like, see the right-hand box on Wikipedia's entry on &lt;a href="http://en.wikipedia.org/wiki/Modafinil"&gt;modafinil&lt;/a&gt;. To index Wikipedia's compound monographs, all we need to do is find the titles of all articles using one of these four templates.&lt;/p&gt;

&lt;p&gt;To get started, we'll need a local copy of Wikipedia. The complete set of all Wikipedia articles, as of March 12, 2008, can be &lt;a href="http://download.wikimedia.org/enwiki/20080312/enwiki-20080312-pages-articles.xml.bz2"&gt;downloaded here&lt;/a&gt;. This data dump is updated periodically, so you may have access to a more recent version.&lt;/p&gt;

&lt;p&gt;The Wikipedia dump, which contains the full text of every article in Wikipedia, consists of a 3.5 GB file in &lt;a href="http://www.bzip.org/"&gt;BZip2&lt;/a&gt; format. Fortunately, we won't need to inflate it to index its chemical content.&lt;/p&gt;

&lt;p&gt;The following code will scan the raw Wikipedia dump and produce a list of all compound monograph titles:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;title&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;
&lt;span class="ident"&gt;log&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;File&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;monographs.txt&lt;/span&gt;&lt;span class="punct"&gt;',&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;w&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;

&lt;span class="keyword"&gt;while&lt;/span&gt;&lt;span class="punct"&gt;((&lt;/span&gt;&lt;span class="ident"&gt;line&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;STDIN&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;gets&lt;/span&gt;&lt;span class="punct"&gt;))&lt;/span&gt;
  &lt;span class="ident"&gt;line&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;match&lt;/span&gt; &lt;span class="punct"&gt;/&amp;lt;&lt;/span&gt;&lt;span class="ident"&gt;title&lt;/span&gt;&lt;span class="punct"&gt;&amp;gt;(.*)&amp;lt;\/&lt;/span&gt;&lt;span class="regex"&gt;title&amp;gt;&lt;/span&gt;&lt;span class="punct"&gt;/&lt;/span&gt;

  &lt;span class="keyword"&gt;if&lt;/span&gt; &lt;span class="global"&gt;$1&lt;/span&gt;
    &lt;span class="ident"&gt;title&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="global"&gt;$1&lt;/span&gt;

    &lt;span class="keyword"&gt;next&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;

  &lt;span class="keyword"&gt;if&lt;/span&gt; &lt;span class="ident"&gt;line&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;match&lt;/span&gt; &lt;span class="punct"&gt;/\{\{(&lt;/span&gt;&lt;span class="ident"&gt;chembox&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt;&lt;span class="ident"&gt;drugbox&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt;&lt;span class="ident"&gt;explosivebox&lt;/span&gt;&lt;span class="punct"&gt;)/&lt;/span&gt;&lt;span class="ident"&gt;i&lt;/span&gt;
    &lt;span class="keyword"&gt;unless&lt;/span&gt; &lt;span class="ident"&gt;title&lt;/span&gt; &lt;span class="punct"&gt;==&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="punct"&gt;||&lt;/span&gt; &lt;span class="ident"&gt;title&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;match&lt;/span&gt;&lt;span class="punct"&gt;(/&lt;/span&gt;&lt;span class="regex"&gt;:&lt;/span&gt;&lt;span class="punct"&gt;/)&lt;/span&gt;
      &lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="ident"&gt;title&lt;/span&gt;
      &lt;span class="ident"&gt;log&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="ident"&gt;title&lt;/span&gt;
      &lt;span class="ident"&gt;log&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;flush&lt;/span&gt;

      &lt;span class="ident"&gt;title&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;
    &lt;span class="keyword"&gt;end&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;

&lt;span class="ident"&gt;log&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;close&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Saving this code into a file called &lt;strong&gt;filter.rb&lt;/strong&gt;, we can run it by piping the output of &lt;tt&gt;bzcat&lt;/tt&gt; on the raw dump file:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ bzcat &amp;lt;path_to_dump&amp;gt;/enwiki-20080312-pages-articles.xml.bz2 | ruby filter.rb
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Alphabetizing the output file gives a complete listing of Wikipedia's compound monograph titles (all 6,411 of them), which for convenience can be &lt;a href="http://depth-first.com/demo/20080428/compound_monographs_20080315.txt"&gt;downloaded here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;We can construct a URL for each Wikipedia compound monograph by prepending &lt;strong&gt;http://wikipedia.org/wiki/&lt;/strong&gt; to its title. In other words, our program's output can be used both as a list of chemical names and as a hash of chemical names to Wikipedia URLs. And with the URL in hand, &lt;a href="http://depth-first.com/articles/2008/04/02/wikipedia-for-cheminformatics-a-simple-web-api-for-finding-cas-numbers-in-compound-monographs"&gt;all kinds of interesting things can be done&lt;/a&gt;.&lt;/p&gt;
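
&lt;p&gt;As a rough sketch of that idea (it is not part of the original script), the Ruby below turns a list of titles into an alphabetized hash of names to URLs. The names &lt;tt&gt;WIKI_PREFIX&lt;/tt&gt;, &lt;tt&gt;monograph_url&lt;/tt&gt;, and &lt;tt&gt;index_monographs&lt;/tt&gt; are made up for illustration, and writing spaces as underscores is an assumption about how Wikipedia forms its page URLs:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;# Hypothetical helpers for illustration; not part of filter.rb.
WIKI_PREFIX = 'http://wikipedia.org/wiki/'

# Builds a URL by prepending the prefix to the title.
# Assumption: spaces in titles are written as underscores in URLs.
def monograph_url(title)
  WIKI_PREFIX + title.strip.tr(' ', '_')
end

# Maps each unique, non-blank title to its URL, in alphabetical order.
def index_monographs(titles)
  names = titles.map { |t| t.strip }.reject { |t| t.empty? }.uniq.sort
  names.each_with_object({}) { |name, index| index[name] = monograph_url(name) }
end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Calling &lt;tt&gt;index_monographs(File.readlines('monographs.txt'))&lt;/tt&gt; would then index the file written by the filter script in one step, sorting included.&lt;/p&gt;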

&lt;h4&gt;Limitations&lt;/h4&gt;

&lt;p&gt;Although easy to carry out, the procedure described here has some limitations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Monographs added after March 12, 2008 are not visible.&lt;/li&gt;
&lt;li&gt;Monographs that don't use the chembox, chembox new, drugbox, or explosivebox templates are not visible.&lt;/li&gt;
&lt;li&gt;A very small number of articles erroneously use the chembox template, for example &lt;a href="http://en.wikipedia.org/wiki/Iraq%27s_Chemical_Warfare"&gt;this one&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;Chempedia Redesign&lt;/h4&gt;

&lt;p&gt;Currently, Chempedia doesn't include all 6,411 monographs but rather a subset created by a much less comprehensive indexing method. As part of a major redesign of the site, all Wikipedia compound monographs will be available on Chempedia, which should result in a much more useful service.&lt;/p&gt;

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;Wikipedia is fast becoming a major storehouse of chemical information with tantalizing potential for creating powerful new services for chemists. More to the point for cheminformatics, the entire Wikipedia dataset can be downloaded and reprocessed free of charge; Wikipedia is one of those rare cheminformatics datasets that is &lt;a href="http://depth-first.com/articles/2006/09/27/hacking-pubchem-free-speech-or-free-beer"&gt;both free as in speech and free as in beer&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;As this article has shown, some simple programming is all it takes to begin doing useful things with Wikipedia's chemical content. Future articles will discuss some of the possibilities.&lt;/p&gt;</description>
      <pubDate>Mon, 28 Apr 2008 18:22:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:6980ce0d-0482-48ba-9489-ca1235632f66</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2008/04/28/building-chempedia-indexing-wikipedias-6-411-compound-monographs</link>
      <category>Meta</category>
      <category>chempedia</category>
      <category>wikipedia</category>
      <category>compoundmonograph</category>
      <category>bzip2</category>
      <category>merckindex</category>
    </item>
    <item>
      <title>User-Created Compound Monographs on Chempedia.net: Open Sourcing the Collation and Indexing of Chemical Information</title>
      <description>&lt;p&gt;&lt;a href="http://chempedia.com"&gt;&lt;img src="http://chempedia.net/images/global/logo.png" align="right"&gt;&lt;/img&gt;&lt;/a&gt;Printed encyclopedias of chemical information like the &lt;a href="http://www.merckbooks.com/mindex/"&gt;Merck Index&lt;/a&gt; suffer from the problem of becoming obsolete on publication. When new compounds are discovered, or when the information about a compound changes, those changes can take many months or years to appear in print form due to the high cost of publication. It doesn't have to be that way. This article introduces a new feature to the free online chemical encyclopedia &lt;a href="http://chempedia.com"&gt;Chempedia&lt;/a&gt; that lets working scientists update its contents via &lt;a href="http://wikipedia.org"&gt;Wikipedia&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;About Chempedia.net&lt;/h4&gt;

&lt;p&gt;A &lt;a href="http://depth-first.com/articles/2008/04/04/chempedia-net-mashing-up-pubchem-and-wikipedia"&gt;recent article&lt;/a&gt; introduced &lt;a href="http://chempedia.com"&gt;Chempedia&lt;/a&gt;, the free online chemical encyclopedia. This service is built on two of the largest &lt;a href="http://depth-first.com/articles/2007/01/24/thirty-two-free-chemistry-databases"&gt;free and open repositories of chemical information&lt;/a&gt; in existence: &lt;a href="http://wikipedia.org"&gt;Wikipedia&lt;/a&gt; and &lt;a href="http://pubchem.ncbi.nlm.nih.gov/"&gt;PubChem&lt;/a&gt;. PubChem supplies low-level chemical information such as connection tables, and Wikipedia supplies free-text descriptions of the properties and uses of certain molecules.&lt;/p&gt;

&lt;h4&gt;Which Molecules?&lt;/h4&gt;

&lt;p&gt;Currently, Chempedia.net only includes &lt;a href="http://depth-first.com/articles/2008/04/02/wikipedia-for-cheminformatics-a-simple-web-api-for-finding-cas-numbers-in-compound-monographs"&gt;compound monographs&lt;/a&gt; for about 1,000 of its more than 300,000 molecules. These monographs were located by a manual process in which the titles for all Wikipedia articles were downloaded in alphabetized form; alphabetizing clustered the titles written in IUPAC nomenclature, because such names tend to start with numbers and symbols. IUPAC nomenclature titles were extracted, and then a script was written to extract the chemical information from these titles and combine it with that from PubChem.&lt;/p&gt;

&lt;p&gt;This method, although useful for getting a service running, is clearly flawed. The biggest problem is in how to discover new compound monographs.&lt;/p&gt;

&lt;h4&gt;Why Not Put Users in Control?&lt;/h4&gt;

&lt;p&gt;Chempedia users themselves are in the best position to know when an existing Wikipedia compound monograph should appear in Chempedia but doesn't, when an existing monograph needs to be updated, or when a new monograph is written and needs to be linked.&lt;/p&gt;

&lt;p&gt;How can the process be &lt;a href="http://depth-first.com/articles/2006/08/19/history-of-abstracting-at-chemical-abstracts-service"&gt;automated&lt;/a&gt;?&lt;/p&gt;

&lt;p&gt;As a partial answer to this question, users &lt;a href="http://chempedia.net/articles/new"&gt;now have the ability to notify Chempedia of any changes to a Wikipedia compound monograph&lt;/a&gt;, and to have those changes immediately reflected in the next viewing of a Chempedia compound monograph.&lt;/p&gt;

&lt;h4&gt;An Example&lt;/h4&gt;

&lt;p&gt;As an example, let's take &lt;a href="http://en.wikipedia.org/wiki/anandamide"&gt;anandamide&lt;/a&gt;, a compound I've had some experience with during my time as a medicinal chemist. Although the &lt;a href="http://chempedia.net/compounds/6030"&gt;Chempedia entry for anandamide&lt;/a&gt; exists, there is (or, as of today, was) no link to the Wikipedia compound monograph. Let's create one.&lt;/p&gt;

&lt;p&gt;At the top of &lt;a href="http://chempedia.com/"&gt;Chempedia's main menu&lt;/a&gt;, you'll see a link titled '&lt;a href="http://chempedia.net/articles/new"&gt;Update&lt;/a&gt;'. Choosing this link leads to a form that will ask for two pieces of information: (1) the title of the Wikipedia article to which you want Chempedia to link - in this case '&lt;a href="http://en.wikipedia.org/wiki/anandamide"&gt;anandamide&lt;/a&gt;'; and (2) &lt;a href="http://depth-first.com/articles/2007/09/18/six-reasons-i-like-recaptcha-or-how-to-build-a-web-service-worth-talking-about"&gt;reCaptcha&lt;/a&gt; text to keep robots from making mischief.&lt;/p&gt;

&lt;p&gt;Submitting this information is all that's needed to create a new or updated link from Chempedia to Wikipedia. Chempedia handles the rest.&lt;/p&gt;

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;Wikipedia is a vast source of free, high-quality, semi-structured chemical information just waiting to have good chemically-aware interfaces applied to it. Chempedia.net is an attempt to do just that, but it's a bit more as well. Although it may appear that Chempedia is the major beneficiary in this relationship, Wikipedia also benefits. When chemists have a tool that allows them to query and visualize Wikipedia using their native language (the chemical structure) they're in a better position to both use and contribute to Wikipedia itself - something I've started to do.&lt;/p&gt;

&lt;p&gt;This positive feedback effect is the real value of exposing Web services. The question is: who in cheminformatics is willing and able to take the risk to discover this simple principle and its benefits?&lt;/p&gt;</description>
      <pubDate>Thu, 17 Apr 2008 17:50:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:9db0f83e-ebaf-49cc-af9d-03d44250c05d</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2008/04/17/user-created-compound-monographs-on-chempedia-net-open-sourcing-the-collation-and-indexing-of-chemical-information</link>
      <category>Tools</category>
      <category>chempedia</category>
      <category>wikipedia</category>
      <category>webservice</category>
      <category>mashup</category>
      <category>compoundmonograph</category>
      <category>merckindex</category>
    </item>
    <item>
      <title>Chempedia.net: Mashing Up PubChem and Wikipedia</title>
      <description>&lt;p&gt;&lt;a href="http://chempedia.com"&gt;&lt;img src="http://chempedia.net/images/global/logo.png" align="right"&gt;&lt;/img&gt;&lt;/a&gt;&lt;a href="http://pubchem.ncbi.nlm.nih.gov/"&gt;PubChem&lt;/a&gt; and &lt;a href="http://wikipedia.net"&gt;Wikipedia&lt;/a&gt; represent two of the largest open repositories of chemical information in the world. And they complement each other very nicely. PubChem contains mainly low-level chemical structure information whereas Wikipedia contains free-text descriptions of chemical compounds in the form of &lt;a href="http://depth-first.com/articles/2008/04/02/wikipedia-for-cheminformatics-a-simple-web-api-for-finding-cas-numbers-in-compound-monographs"&gt;compound monographs&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Both services offer permission and access to copy and reuse their contents. But neither service is, by itself, nearly as useful as it could be.&lt;/p&gt;

&lt;p&gt;Why not mash them up?&lt;/p&gt;

&lt;p&gt;To explore that question, my company, &lt;a href="http://metamolecular.com"&gt;Metamolecular, LLC&lt;/a&gt;, has launched &lt;a href="http://chempedia.com"&gt;Chempedia&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To my knowledge, Chempedia represents the first publicly-facing database of compounds to incorporate Wikipedia's collection of organic compound monographs. And it's one of the few cheminformatics services to make use of free-text descriptions generated by individual chemists.&lt;/p&gt;

&lt;p&gt;Chempedia has been somewhat selective about the compounds it includes. To date, it has spidered over 2,500 monographs, combining them with over 300,000 of the most interesting compounds from PubChem. Not every Chempedia.net molecule has a monograph, but now there's a tool that can actually make that absence apparent.&lt;/p&gt;

&lt;p&gt;Chempedia is both an experiment and a service. It's immediately useful for anyone in the business of making or doing things with organic molecules. It's created several unexpected moments of "Oh, that's actually a useful molecule!" It also will serve as a platform to test some of the ideas discussed in Depth-First over the last year or so on the advantages of the Web for collaboration in chemistry.&lt;/p&gt;

&lt;p&gt;Stay tuned for more details about how Chempedia was created and some of its applications in chemistry.&lt;/p&gt;</description>
      <pubDate>Fri, 04 Apr 2008 10:06:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:168432fb-c064-43c2-a60d-728c7c29c406</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2008/04/04/chempedia-net-mashing-up-pubchem-and-wikipedia</link>
      <category>Tools</category>
      <category>chempedia</category>
      <category>wikipedia</category>
      <category>pubchem</category>
      <category>rails</category>
      <category>ruby</category>
      <category>chemwriter</category>
      <category>applet</category>
      <category>java</category>
      <category>jruby</category>
    </item>
    <item>
      <title>Wikipedia for Cheminformatics: A Simple Web API for Finding CAS Numbers in Compound Monographs</title>
      <description>&lt;p&gt;&lt;a href="http://wikipedia.org"&gt;&lt;img src="http://depth-first.com/demo/20070123/wikipedia.jpg" align="right"&gt;&lt;/img&gt;&lt;/a&gt;Good news for cheminformatics: Chemical Abstracts Service (CAS) &lt;a href="http://en.wikipedia.org/wiki/Wikipedia_talk:WikiProject_Chemistry/CAS_validation"&gt;has agreed&lt;/a&gt; to help Wikipedia users curate its collection of CAS numbers. As a result of the diligence of some hard-working volunteers, chemistry's most universal system for referring to chemicals can now be used far more effectively by the world's biggest open repository of knowledge.&lt;/p&gt;

&lt;p&gt;Wouldn't it be great to be able to pull these CAS numbers from Wikipedia programmatically?&lt;/p&gt;

&lt;h4&gt;Perspective&lt;/h4&gt;

&lt;p&gt;Estimates place the number of Wikipedia pages dealing with individual &lt;a href="http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Chemicals/Inorganics"&gt;inorganic&lt;/a&gt; and &lt;a href="http://en.wikipedia.org/wiki/List_of_organic_compounds"&gt;organic&lt;/a&gt; substances in the thousands. (I'll use the term "compound monographs" to describe them.) One factor acting to keep this number low is poor visibility of these entries. Unlike most &lt;a href="http://depth-first.com/articles/2007/01/24/thirty-two-free-chemistry-databases"&gt;chemical databases&lt;/a&gt;, Wikipedia can't, by itself, be easily searched by structure. As chemically-aware tools for indexing Wikipedia begin to emerge, look for six things to happen:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The number of Wikipedia compound monographs will increase significantly.&lt;/li&gt;
&lt;li&gt;The quality of monographs for intermediate- to well-known compounds will increase substantially.&lt;/li&gt;
&lt;li&gt;Demand for user-friendly interfaces to Wikipedia's chemical content will increase.&lt;/li&gt;
&lt;li&gt;Wikipedia users will become interested in storing and finding ever more diverse kinds of information about each compound.&lt;/li&gt;
&lt;li&gt;Bench chemists will start to include Wikipedia as one of their preferred literature search techniques, leading to...&lt;/li&gt;
&lt;li&gt;More creative tools for using the chemical content of Wikipedia.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;As noted previously, it wasn't too long ago that indexing of the chemical literature &lt;a href="http://depth-first.com/articles/2006/08/19/history-of-abstracting-at-chemical-abstracts-service"&gt;was done solely by volunteers&lt;/a&gt;. Wikipedia offers an intriguing way to channel the innate drive for chemists to combine their own work and experience with that of others to build useful information tools for the community.&lt;/p&gt;

&lt;p&gt;But for now we are left with the question of how to index the chemical content of Wikipedia. Although a few systems have been proposed, the only practical method is through the use of CAS numbers. Which brings us to the subject of today's tutorial.&lt;/p&gt;

&lt;h4&gt;A Quick CAS Number API for Wikipedia&lt;/h4&gt;

&lt;p&gt;The Ruby program below accepts the title of any Wikipedia compound monograph and prints the CAS number for the compound being discussed, or an error message if none is found:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rubygems&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;hpricot&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;open-uri&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;cgi&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="keyword"&gt;class &lt;/span&gt;&lt;span class="class"&gt;Wikikemi&lt;/span&gt;
  &lt;span class="attribute"&gt;@cas&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;nil&lt;/span&gt;

  &lt;span class="ident"&gt;attr_reader&lt;/span&gt; &lt;span class="symbol"&gt;:cas&lt;/span&gt;

  &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;initialize&lt;/span&gt; &lt;span class="ident"&gt;title&lt;/span&gt;
    &lt;span class="ident"&gt;uri&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;URI&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;escape&lt;/span&gt;&lt;span class="punct"&gt;(&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;http://en.wikipedia.org/wiki/&lt;span class="expr"&gt;#{title}&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;)&lt;/span&gt;
    &lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;loading... &lt;span class="expr"&gt;#{uri}&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;
    &lt;span class="ident"&gt;doc&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;Hpricot&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;open&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;uri&lt;/span&gt;&lt;span class="punct"&gt;))&lt;/span&gt;
    &lt;span class="ident"&gt;table&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;doc&lt;/span&gt;&lt;span class="punct"&gt;/&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;table&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;)[&lt;/span&gt;&lt;span class="number"&gt;0&lt;/span&gt;&lt;span class="punct"&gt;]&lt;/span&gt;

    &lt;span class="ident"&gt;table&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;inner_html&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;match&lt;/span&gt;&lt;span class="punct"&gt;(/&lt;/span&gt;&lt;span class="regex"&gt;([0-9]{2,7}?&lt;span class="escape"&gt;\-&lt;/span&gt;[0-9]{2}&lt;span class="escape"&gt;\-&lt;/span&gt;[0-9])&lt;/span&gt;&lt;span class="punct"&gt;/)&lt;/span&gt; &lt;span class="keyword"&gt;if&lt;/span&gt; &lt;span class="ident"&gt;table&lt;/span&gt;

    &lt;span class="attribute"&gt;@cas&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="global"&gt;$1&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;

&lt;span class="comment"&gt;# Returns the CAS number present in the Wikipedia monograph with&lt;/span&gt;
&lt;span class="comment"&gt;# the indicated title, or an error message if none is found. Try, for example,&lt;/span&gt;
&lt;span class="comment"&gt;# &amp;quot;benzene&amp;quot;.&lt;/span&gt;
&lt;span class="keyword"&gt;while&lt;/span&gt; &lt;span class="constant"&gt;true&lt;/span&gt;
  &lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;Enter the title of the Wikipedia page, for example: 'benzene'&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;
  &lt;span class="ident"&gt;monograph_title&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;gets&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;chomp&lt;/span&gt;
  &lt;span class="ident"&gt;w&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;Wikikemi&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt; &lt;span class="ident"&gt;monograph_title&lt;/span&gt;
  &lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="ident"&gt;w&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;cas&lt;/span&gt; &lt;span class="punct"&gt;?&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;[&lt;span class="expr"&gt;#{w.cas}&lt;/span&gt;]&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt; &lt;span class="punct"&gt;:&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;CAS number not found&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This program makes use of the excellent Ruby HTML parser, &lt;a href="http://code.whytheluckystiff.net/hpricot/"&gt;Hpricot&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Saving the above code to a file called &lt;strong&gt;wikikemi.rb&lt;/strong&gt;, we can run it with:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ ruby wikikemi.rb
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;For example, we can look up the CAS numbers for Ferrocene, Lipitor, or 1,2,3,4,4a,5,6,7,8,8a-Decahydronaphthalene:&lt;/p&gt;

&lt;div class="console"&gt;
&lt;pre&gt;
$ ruby wikikemi.rb
Enter the title of the Wikipedia page, for example: 'benzene'
ferrocene
loading... http://en.wikipedia.org/wiki/ferrocene
[102-54-5]
Enter the title of the Wikipedia page, for example: 'benzene'
lipitor
loading... http://en.wikipedia.org/wiki/lipitor
[134523-00-5]
Enter the title of the Wikipedia page, for example: 'benzene'
1,2,3,4,4a,5,6,7,8,8a-Decahydronaphthalene
loading... http://en.wikipedia.org/wiki/1,2,3,4,4a,5,6,7,8,8a-Decahydronaphthalene
[91-17-8]
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;All this method requires is that the Wikipedia page lists the correct CAS number in its &lt;a href="http://en.wikipedia.org/wiki/Template:Drugbox"&gt;Drugbox&lt;/a&gt; or &lt;a href="http://en.wikipedia.org/wiki/Template:Chembox_new"&gt;Chembox&lt;/a&gt; template. Fortunately, CAS has agreed to help make this happen.&lt;/p&gt;
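
&lt;p&gt;Whether a listed number belongs to the right compound can't be checked from the page alone, but simple typos can be caught with the check digit built into every CAS number: the final digit equals the weighted sum of the other digits (weights 1, 2, 3, and so on, counting from the right) modulo 10. The following is a minimal sketch of that test; the method name &lt;tt&gt;valid_cas?&lt;/tt&gt; is hypothetical, and the check is not part of wikikemi.rb:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;# Hypothetical helper for illustration; not part of wikikemi.rb.
# Returns true if the string looks like a CAS number and its final
# digit matches the CAS check-digit rule.
def valid_cas?(cas)
  return false unless cas =~ /\A\d{2,7}-\d{2}-\d\z/

  digits = cas.delete('-').chars.map { |c| c.to_i }
  check = digits.pop
  # Weight the remaining digits 1, 2, 3, ... starting from the right.
  sum = digits.reverse.each_with_index.inject(0) do |total, (digit, i)|
    total + digit * (i + 1)
  end
  sum % 10 == check
end&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;With this in place, &lt;tt&gt;valid_cas?('102-54-5')&lt;/tt&gt; is true for ferrocene, while a mistyped &lt;tt&gt;'102-54-6'&lt;/tt&gt; is rejected. A number that passes can still be the wrong compound's number, so this complements curation rather than replacing it.&lt;/p&gt;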

&lt;h4&gt;Conclusions&lt;/h4&gt;

&lt;p&gt;A little Ruby code is all it takes to build a working CAS number lookup system using Wikipedia. Although this may be useful as a standalone tool, it becomes much more powerful when made part of &lt;a href="http://depth-first.com/articles/2007/05/21/simple-cas-number-lookup-with-pubchem"&gt;a larger cheminformatics system&lt;/a&gt;. But that's a story for another time.&lt;/p&gt;

&lt;p&gt;See also &lt;a href="http://www.chemspider.com/blog/a-message-of-support-and-public-service-from-the-chemical-abstracts-service.html"&gt;Antony Williams' announcement on CAS and Wikipedia&lt;/a&gt;.&lt;/p&gt;</description>
      <pubDate>Wed, 02 Apr 2008 17:29:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:c11402b2-406a-4ec9-8b65-fc34da179c1a</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2008/04/02/wikipedia-for-cheminformatics-a-simple-web-api-for-finding-cas-numbers-in-compound-monographs</link>
      <category>Tools</category>
      <category>cas</category>
      <category>acs</category>
      <category>casnumber</category>
      <category>lookup</category>
      <category>wikipedia</category>
      <category>ruby</category>
    </item>
    <item>
      <title>What Makes Wikipedia Tick?</title>
      <description>&lt;p&gt;&lt;a href="http://wikipedia.org"&gt;&lt;img src="http://depth-first.com/demo/20070123/wikipedia.jpg" align="right"&gt;&lt;/img&gt;&lt;/a&gt;Whatever your views on &lt;a href="http://wikipedia.org"&gt;Wikipedia&lt;/a&gt;, it's clear that the volunteer online encyclopedia has left its mark on society. But the most important things about Wikipedia have less to do with its contents and more to do with the people contributing to and using the service. To understand how and why people collaborate on the Web, you have to understand Wikipedia.&lt;/p&gt;

&lt;p&gt;An &lt;a href="http://www.riehle.org/computer-science/research/2006/wikisym-2006-interview.html"&gt;interview with three leading Wikipedia figures&lt;/a&gt; sheds some light on Wikipedia as a collaborative activity.&lt;/p&gt;

&lt;p&gt;There is a myth about online collaboration that Open Source practitioners are very familiar with. It goes something like this: "I'll start building something and release it to the community. I'll get feedback from a lot of users, some of whom will fix bugs, write documentation, and build extensions. All of that feedback will create a better product."&lt;/p&gt;

&lt;p&gt;Now, this does happen, of course. The reason I consider it a myth is that it happens so rarely that you might as well not count on it. Virtually all Open Source software is designed, written, documented, debugged, and promoted by a single developer with the help of a tiny fraction (say &lt;a href="http://depth-first.com/articles/2007/09/05/name-that-graph-revealed-oligarchy-2-0"&gt;2-10%&lt;/a&gt;) of the committed user base. Pick any good example of Open Source software that works and behind it you'll find a committed user base large enough to make 2-10% a number greater than or equal to one. It's not clear this is necessarily &lt;a href="http://depth-first.com/articles/2007/01/18/collective-intelligence-and-the-dumbness-of-crowds"&gt;a bad thing&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The interview with the Wikipedia leaders confirmed this view. When asked about the idea that lots of contributors make a good article, Elisabeth Bauer, of the English Wikipedia, had this to say:&lt;/p&gt;

&lt;blockquote&gt;
    &lt;p&gt;The best articles are typically written by a single or a few authors with expertise in the topic. In this respect, Wikipedia is not different from classical encyclopedias.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Her view was shared by Kizo Naoko, of the Japanese Wikipedia, who added that short articles tend to remain short and of poor quality.&lt;/p&gt;

&lt;p&gt;There doesn't seem to be anything complicated here. Wikipedia places a very low barrier to contribution. It has created a system where active contributors with specialized knowledge feel a sense of ownership over their contributions. Checks and balances ensure that these contributors can monitor changes to their work and correct errors. Finally, the subject matter is so broadly appealing (All of Human Knowledge) that 2-10% of the user base is a massive number.&lt;/p&gt;

&lt;p&gt;It may not be complicated, but it's far from easy.&lt;/p&gt;</description>
      <pubDate>Fri, 05 Oct 2007 10:28:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:2400028b-e0d2-44b2-b257-c7b7d3425820</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2007/10/05/what-makes-wikipedia-tick</link>
      <category>Meta</category>
      <category>wikipedia</category>
      <category>web20</category>
      <category>collaboration</category>
      <category>collectiveintelligence</category>
    </item>
    <item>
      <title>Chemical Reviews on Wikipedia</title>
      <description>&lt;p&gt;&lt;center&gt;&lt;img src="http://depth-first.com/files/500px-Sharpless_Dihydroxylation_Scheme.png"&gt;&lt;/img&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;Until 1966, Chemical Abstracts Service used &lt;a href="http://depth-first.com/articles/2006/08/19/history-of-abstracting-at-chemical-abstracts-service"&gt;volunteers&lt;/a&gt; exclusively to abstract the chemical literature. At the system's peak, thousands of scientists were willing, even eager, to perform this tedious, demanding work for very little pay. The system was eventually phased out in favor of a professional abstracting service.&lt;/p&gt;

&lt;p&gt;What motivated these volunteer abstracters? Enlightened self-interest probably played a role. After all, preparing a set of abstracts in a field you do research in can pay off in your own increased productivity. It's also a good way to stay current with the literature, something you would do anyway. If your abstracts help your fellow scientist at the same time, so much the better. Another motivation could have been a simple desire to create order out of chaos, not unlike the many &lt;a href="http://digg.com"&gt;social networking activities&lt;/a&gt; flourishing on the internet today. &lt;a href="http://almost.cubic.uni-koeln.de/jrg/"&gt;Christoph Steinbeck&lt;/a&gt; will be giving a &lt;a href="http://wiki.cubic.uni-koeln.de/blog/pivot/entry.php?id=5#body"&gt;talk&lt;/a&gt; at the Fall 2006 ACS touching on this theme, and it's likely others will too as the field gathers momentum.&lt;/p&gt;

&lt;p&gt;In browsing &lt;a href="http://blog.tenderbutton.com"&gt;Dylan Stiles' blog&lt;/a&gt;, I came across &lt;a href="http://blog.tenderbutton.com/?p=250"&gt;an entry&lt;/a&gt; on the aldehyde-&gt;alkyne homologation. In it, Stiles cited a brief but informative &lt;a href="http://en.wikipedia.org/wiki/Seyferth-Gilbert_homologation"&gt;Wikipedia review&lt;/a&gt; on this reaction.&lt;/p&gt;

&lt;p&gt;Surely this couldn't be the only example of online volunteer-created reviews in chemistry on Wikipedia. A quick search turned up numerous examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Wittig_reaction"&gt;Wittig Reaction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Grignard_reaction"&gt;Grignard Reaction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Dihydroxylation"&gt;Sharpless Dihydroxylation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Diels-Alder"&gt;Diels-Alder Reaction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Thermite"&gt;Thermite Reaction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Danishefsky_Taxol_total_synthesis"&gt;Danishefsky Taxol Total Synthesis&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Olefin_metathesis"&gt;Olefin Metathesis&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/McMurry_reaction"&gt;McMurry Coupling&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Robinson_annulation"&gt;Robinson Annulation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Swern_oxidation"&gt;Swern Oxidation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Cholesterol"&gt;Cholesterol&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This proliferating kind of volunteer, peer-reviewed chemical documentation is similar in spirit to the system used by CAS in earlier times, although the technology couldn't be more different. Of course, this approach is not without its limitations and &lt;a href="http://www.youtube.com/watch?v=zmHm0rGns4I"&gt;potential pitfalls&lt;/a&gt;, but it is remarkably &lt;a href="http://en.wikipedia.org/wiki/Elephants"&gt;self-correcting&lt;/a&gt;. This emerging system offers something that CAS will never be able to provide: involvement in, and ownership of, the documentation process itself.&lt;/p&gt;

&lt;p&gt;Unfortunately, chemical informatics technologies have not kept up with internet technologies and the people currently using them. The reliance on &lt;a href="http://depth-first.com/articles/2006/08/25/computational-perception-and-recognition-of-digitized-molecular-structures"&gt;raster images of 2-D structures&lt;/a&gt; and the lack of a reliable web-enabled chemical indexing system both loom especially large as future problems to be addressed. What tools does this new kind of chemical publishing need to become more effective and efficient? How can these tools be made as &lt;a href="http://depth-first.com/articles/2006/09/05/the-automatic-encoding-of-chemical-structures"&gt;invisible&lt;/a&gt; as possible?&lt;/p&gt;</description>
      <pubDate>Fri, 08 Sep 2006 14:34:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:10cd0b85-a082-4aca-bbe6-296e7d78fdd7</guid>
      <author>Rich Apodaca</author>
      <link>http://depth-first.com/articles/2006/09/08/chemical-reviews-on-wikipedia</link>
      <category>Web</category>
      <category>wikipedia</category>
      <category>volunteer</category>
      <category>cas</category>
    </item>
  </channel>
</rss>
