Raiding Chemistry's Data Tombs

February 04, 2008

Duncan Hull offers an interesting commentary on the rapid increase in the number of biologically-oriented databases. He asks whether all of this abundance is leading to nothing more than a bad case of data indigestion, in which data is dumped into write-only "data tombs," never to be seen again.

A data tomb is created whenever the ability to generate data outstrips the ability to do useful things with it. Like the burial tombs of ancient civilizations, data tombs are created for many reasons and take many forms.

Where are chemistry's data tombs and what do they look like? Given that the number of free chemistry databases pales in comparison to the number of free biological databases, the question may seem irrelevant.

Nevertheless, data tombs in chemistry are ubiquitous. The most obvious examples are the supplementary data sections of major chemical journals. These write-only databases suffer from the dual afflictions of copyright restriction and electronic degradation.

The collective experimental sections of the world's chemical literature are, in effect, a vast catacomb of jealously guarded but poorly catalogued treasures.

Data silos are an especially prevalent kind of data tomb. They result when data is created for a single use and, for technical or political reasons, never placed in a real database. SD files containing SAR data, PowerPoint slides containing tables of synthetic yields, and Word documents containing experimental procedures are some of the forms these chemical data silos take.
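As one illustration of raiding such a silo, here is a minimal sketch that pulls the data fields out of an SD file so they can be loaded into a real database. It assumes V2000-style records, each with an "M  END" line, "> <FIELD>" data headers, and a "$$$$" terminator. The function names, the "_title" key, and the sample record (including its IC50_nM and ASSAY values) are hypothetical and chosen only for the example.

```python
import re

# A data header looks like "> <FIELD_NAME>"; capture the field name.
FIELD_HEADER = re.compile(r"^>.*<([^>]+)>")


def _parse_record(lines):
    """Turn the lines of one SD record into a dict of its data fields."""
    # The first line of a record is the molecule title.
    record = {"_title": lines[0].strip() if lines else ""}
    in_data = False  # becomes True once the connection table has ended
    field = None     # name of the data field currently being read
    for line in lines:
        if not in_data:
            in_data = line.startswith("M  END")
            continue
        match = FIELD_HEADER.match(line)
        if match:
            field = match.group(1)
            record[field] = ""
        elif field is not None and line.strip():
            # Multi-line values are joined with newlines.
            record[field] = f"{record[field]}\n{line}".strip()
        else:
            # A blank line ends the current field.
            field = None
    return record


def parse_sd(text):
    """Return one dict per record in the text of an SD file."""
    records, lines = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            records.append(_parse_record(lines))
            lines = []
        else:
            lines.append(line)
    return records


# Hypothetical one-record SD file with an empty connection table.
SAMPLE = """\
aspirin
  hypothetical

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <IC50_nM>
120

> <ASSAY>
COX-1

$$$$
"""

records = parse_sd(SAMPLE)
```

Each resulting dict can then be written as a row in a relational table or a CSV file. For real-world files, an established cheminformatics toolkit such as RDKit or Open Babel would be a sturdier choice than a hand-rolled parser, since it also interprets the structures themselves.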

What chemical data tombs have you run into, and what methods did you use to raid them?

Image Credit: Duncan Hull