Casting a Wide Net in Cheminformatics

November 18, 2009

I've always viewed the term 'cheminformatics' pretty broadly. To me, cheminformatics is simply the application of information technology to solve problems in chemistry.

Unsolved Problems in Cheminformatics

For my money, some of the most important unsolved problems in cheminformatics today have to do with the many issues that come with the transition from a paper-centric scientific workflow to a Web-centric scientific workflow. We're not just talking about publishing the same stuff in a new medium; we're talking about a completely different way of working, and a different set of non-technical issues ranging from intellectual property to culture to how to make a living.

One thing the Web excels at is connecting people who share the same interests in the long tail. If this is true for society at large, why shouldn't it be true in science - and chemistry in particular?

But connecting people on the Web is a very tricky business. Millions - probably billions - of dollars have been spent on schemes that ultimately went nowhere.

A Platform for Chemical Knowledge-Gathering

So it was with great interest that I noticed the public beta of StackExchange, the software behind the wildly popular programming site StackOverflow.

For the unfamiliar, StackOverflow is a question and answer site that combines aspects of peer-review, wikis, forums, and mailing lists into a single, coherent, fun, search engine-optimized package.

After actively using StackOverflow for a while now, some things stand out in my mind:

  1. The very first question I asked had an answer within 30 minutes.
  2. I now know of a number of smart, friendly people who I've never met but who share my narrow interests with respect to the developer tools and methods I use regularly.
  3. I no longer read or post to mailing lists for the products I use. I go to StackOverflow instead.
  4. I no longer use Google for some questions - I go directly to StackOverflow.

I had been sitting on the fence with respect to starting a StackExchange site in chemistry. The idea is, after all, a pretty radical one. See, StackExchange isn't just about enabling peer-to-peer communication. It's about promoting it, and then leaving a public record of the event behind for others to use.

This is supposed to be what scientific papers are for. The problem is that papers do such an awful job of it that the similarity may not be immediately obvious.

But Egon Willighagen gave me the inspiration I needed (not the first time, either) with his BlueObelisk SE site. As Noel O'Boyle notes, the idea can only work, regardless of the software being used, if people actually find something that fills a need.

Chempedia Lab

So, without further ado, I give you Chempedia Lab, a site for asking and answering questions about experimental chemistry.

The choice of topic is deliberate. There are many sites where you can go to get answers to general chemistry questions. But none of them are dedicated to collecting, organizing, centralizing, and freely distributing the massive body of knowledge that is experimental chemistry.

Being a trained synthetic organic chemist with eight years' experience in drug discovery has given me a particular perspective on cheminformatics - it's not worth much until put into practice by someone who makes decisions in the lab.

Experimental chemistry is where cheminformatics begins, and the thing that keeps it going.

The format of Chempedia Lab is free text and images. But fear not, there are many things we can do to layer on deeper chemical meaning.

For example, see this question on the synthesis of 2,3,6,7-tetrabromoanthracene. The chemical structure image I embedded in the body is served by the corresponding Substance summary page on the Chempedia substance registry. In exchange for a user getting a convenient way to add a structure image to her question, the Chempedia registry gets some higher-level chemical information.

Links between sites are a powerful way to join bits of information together. In this case, the links join a substance, its machine-readable structure, a question about its synthesis, and a primary literature citation to its preparation.
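To make the linking idea concrete, here is a minimal sketch in Python. It is not how Chempedia or StackExchange actually store anything. The class names, the labels, and the choice of which resource links to which are all my own assumptions for illustration. The SMILES string was written by hand for this example rather than taken from the registry.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resource:
    """One linkable piece of information (hypothetical model)."""
    kind: str   # e.g. "substance", "structure", "question", "citation"
    label: str


@dataclass
class LinkGraph:
    """Undirected links between resources, stored as adjacency sets."""
    links: dict = field(default_factory=dict)

    def link(self, a: Resource, b: Resource) -> None:
        # Record the link in both directions so either side can find the other.
        self.links.setdefault(a, set()).add(b)
        self.links.setdefault(b, set()).add(a)

    def reachable(self, start: Resource) -> set:
        """Return everything connected to `start` by following links."""
        seen, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for neighbor in self.links.get(node, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        seen.discard(start)
        return seen


# The four pieces of information from the example above.
substance = Resource("substance", "2,3,6,7-tetrabromoanthracene")
# Hand-written SMILES, an assumption for illustration only.
structure = Resource("structure", "Brc1cc2cc3cc(Br)c(Br)cc3cc2cc1Br")
question = Resource("question", "synthesis of 2,3,6,7-tetrabromoanthracene")
citation = Resource("citation", "primary literature preparation (placeholder)")

graph = LinkGraph()
graph.link(question, substance)   # the embedded structure image
graph.link(substance, structure)  # registry entry to machine-readable form
graph.link(substance, citation)   # assumed: the citation hangs off the substance

found = graph.reachable(question)
```

Starting from the question alone, following links reaches the substance, its structure, and the citation. That is the whole point: no single site has to hold everything, as long as each piece points at its neighbors.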

Wrapup

I have no idea what to expect from Chempedia Lab. The newness of the concept, and some of those non-technical concerns noted above, might take a while to work themselves out. Or there may simply be no need for it.

One thing is certain - if you pay attention, there will be plenty to learn.