PubChem is a Platform

Two recent J. Chem. Inf. Model. articles support the idea that PubChem is rapidly evolving into a Chemical Informatics platform:

Large-Scale Annotation of Small-Molecule Libraries Using Public Databases. Using PubChem and other databases, the authors categorize the level of annotation (data, metadata, and links) of free chemical databases, with PubChem as the centerpiece. The work is part of a larger effort to integrate this free resource into the workflow of the Genomics Institute of the Novartis Research Foundation (GNF).

Web Service Infrastructure for Chemoinformatics. Among other interesting initiatives, the article describes a desktop application front-end for PubChem. (As a bonus, the authors also make the case).

Platforms are essential because they focus the attention and effort of self-interested third parties around a common goal. They become so integrated into society that they eventually become invisible, and there is outrage when they stop working. Think of highways, sewers, phone lines, communications satellites, the patent system, and the Internet, among others. We don't just use these services; we build on top of them.

Chemical Abstracts Service (CAS) is an important tool for many, but it is not a platform. By placing high costs on access to its service and severely restricting its use, the ACS has effectively shut out anyone wanting to build another service on top of CAS. Clearly this was part of the plan. Small and large third-party players alike are shut out, with the inevitable chilling effect on innovation.

Contrast this situation with PubChem. The public is free to download and re-use the entire database of molecules and associated data. PubChem has recently unveiled a new Web API called PUG that will make it even easier to layer on additional functionality. These kinds of capabilities create an entirely different dynamic: witness both eMolecules and ChemSpider, two services that unashamedly exploit the PubChem resource. Expect to see more of this in the months ahead.
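To make "layering on additional functionality" concrete, here is a minimal sketch of how a third-party tool might address PubChem data programmatically. It assumes PubChem's later REST-style interface (PUG REST) rather than the XML-based PUG mentioned above, and the helper names `compound_property_url` and `fetch_json` are hypothetical, chosen only for illustration.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumption: the PUG REST base URL, not the original XML-based PUG endpoint.
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def compound_property_url(name, properties, output="JSON"):
    """Build a PUG REST URL requesting properties for a compound looked up by name.

    Hypothetical helper: it only assembles the URL and performs no network access.
    """
    if not properties:
        raise ValueError("at least one property is required")
    encoded_name = quote(name, safe="")  # escape spaces, slashes, etc.
    property_list = ",".join(properties)
    return (
        f"{PUG_REST_BASE}/compound/name/{encoded_name}"
        f"/property/{property_list}/{output}"
    )


def fetch_json(url, timeout=10):
    """Fetch a URL and decode the response body as JSON (needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    url = compound_property_url("aspirin", ["MolecularFormula", "MolecularWeight"])
    print(url)
    # Uncomment to query PubChem directly:
    # print(fetch_json(url))
```

A service built on top of PubChem could wrap calls like these behind its own search or annotation features, which is the kind of reuse a closed service rules out.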

Remember the Apple II? This product became so successful that it played a major role in undermining dozens of highly profitable and well-established businesses. Why was it so successful? One of the key reasons was its architecture, which was far more open than that of the machines that preceded it. Within a very short time, third parties had developed a large number of innovative products that exploited the underlying platform, both with and without Apple's encouragement. One of those products, VisiCalc, was so successful that at one point many people bought Apple's machine for no other purpose than to run it.

Whether PubChem itself ends up becoming the standard cheminformatics platform is hard to say. Perhaps this role will be filled by a system not yet built, or by one that evolves from PubChem. Whatever the outcome, PubChem has unmasked a deep need (and opportunity) for an open cheminformatics platform. As Apple's experience demonstrates, you often get more in the end by giving something up.