Web-Centric Science

February 04, 2009

From The Realm of Organic Synthesis comes a common feeling of frustration with the way scientific information is distributed, and an increasingly common proposal for a solution:

I envision a hybrid of Doug Taber's Organic Chemistry Portal, Wikipedia and a condensed version of SciFinder. I'll gladly contribute! How do we get the ball rolling?

Today we're witnessing a major re-evaluation of the scientific publication system. At issue is the fundamental inefficiency of the way things work - in terms of time, effort, and especially money.

To change any system, you must first understand its parts and how they work together. In chemistry, the workflow for scientific publication goes something like this (capitalized terms are key components):

  1. You perform several Experiments over the course of several months to several years.
  2. You record your Observations in a Notebook, typically visible only to You and those You choose to share with.
  3. You prepare a Manuscript summarizing your Observations using a Word Processor. This Manuscript contains machine-readable Tables, Chemical Structures, Characterization Data, and Cross-References.
  4. After internal review within your Organization, You submit the Manuscript to a Publisher.
  5. Publisher finds 2-3 (semi)qualified Reviewers for the Manuscript. The ease of finding (semi)qualified Reviewers may be a function of the Prestige of the Publisher.
  6. Reviewers, together with the Journal Editor, decide whether the Manuscript is publishable at all and, if so, what Revisions need to be made.
  7. You make Revisions to your Manuscript and send the result back to the Publisher.
  8. Publisher publishes a Paper from your revised Manuscript. The form this document takes varies, but generally consists of Physical Paper, PDF, or HTML. All of these formats prevent, to varying degrees, the machine-readability of the Tables, Chemical Structures, Characterization Data, and Cross-References in your Manuscript.
  9. Readers of the Journal who find your Paper immediately useful may Bookmark it (either literally, or by printing/copying it) for future reference. In most cases, however, your Paper will not be immediately read or noted.
  10. To place your Paper into a larger Context, an Abstractor attempts to once again make machine-readable those elements from your Paper of broadest interest: Tables, Chemical Structures, Characterization Data, and Cross-References.
  11. To enable the efficient location of your Paper by Researchers looking for answers to scientific questions, Abstractor creates and maintains a Database Service. Finding your Paper relies on Abstractor re-generating as much machine-readable information as possible.
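
The loss of machine-readability in step 8, and the effort needed to recover it in steps 10 and 11, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the CompoundRecord type, its field names, the two publish functions, and the placeholder citation are hypothetical and do not describe any real Publisher's or Abstractor's system.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class CompoundRecord:
    """Hypothetical machine-readable entry taken from a Manuscript."""
    name: str
    smiles: str               # Chemical Structure as a SMILES string
    melting_point_c: float    # Characterization Data
    cites: List[str]          # Cross-References


def publish_structured(record: CompoundRecord) -> str:
    """Publish the record as JSON, keeping every field addressable."""
    return json.dumps(asdict(record))


def publish_flat(record: CompoundRecord) -> str:
    """Publish the record as prose, the way a PDF or printed page would.

    The structure and cross-references are dropped; a human can read the
    sentence, but software can no longer look up the fields.
    """
    return f"{record.name} was obtained as a solid (mp {record.melting_point_c} C)."


# Illustrative values only; the citation is a placeholder, not a real DOI.
record = CompoundRecord(
    name="aspirin",
    smiles="CC(=O)OC1=CC=CC=C1C(=O)O",
    melting_point_c=135.0,
    cites=["doi:10.0000/example"],
)

# The structured form round-trips without any manual abstraction step.
recovered = CompoundRecord(**json.loads(publish_structured(record)))

# The flat form keeps only what a human reader needs to see.
flat_text = publish_flat(record)
```

In this sketch the structured form can be turned back into the original record with a single call, while the flat form no longer contains the structure or the citation at all. Regenerating them by hand is the work the Abstractor is paid to do.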

Lavishly Inefficient

To a non-scientist who has used the Web for their entire adult life, this system appears lavishly inefficient. Each step requires people-power and toll-gates. There's nothing wrong with employing people to do work, of course. The problem is in employing expensive people to do work that cheap machines do far more efficiently. The problem is when the expensive people you employ work to maintain unnecessary steps in the production process. The problem is when passionate volunteers (or cheap labor) can do more with less than your paid staff. The problem is when cheaper technologies make your product look less appealing.

Just ask the automakers. Or the corner booksellers. Or the newspapers. Or textile manufacturers. Or the recording industry.

Reinventing the Wheel?

If you were going to re-invent the scientific publication system using any modern technology, how would you do it?

While I sympathize with the desire expressed by "J" (author of The Realm of Organic Synthesis - who might want to reconsider anonymously blogging science), I believe the Web offers a much more compelling range of solutions to the problem. The bad news is that it requires change at every level in the scientific publication process - and change is painful.

Who Profits from Inefficiency?

There's a case to be made that the inefficiency of the current publication system is actually helpful in certain situations. For example, the de-digitization and re-digitization of scientific content creates a profitable market for Abstractors. Another example: limited peer review of Manuscripts may be seen as benefiting authors anxious about being scooped by their competitors. Still another example: the lack of scalability in how Publishers currently operate can lead to a Prestige factor for Journals that maintain high quality standards.

Nevertheless, all of these advantages (and more) could be built into systems with the Web as their organizing principle.

Stirrings

For a glimpse of what the future of chemistry publication might hold, consider Open Notebook Science, ChemSpider, and Collaborative Drug Discovery. Each of these services shares a Web-centric view of information management and collaboration. And each contains at its core a fundamentally unique view of the role publication plays in the daily workflow of scientists.

Finally, consider GitHub, a Web-centric developer tool that demonstrates more clearly than any other I'm aware of how tenuous the distinctions between individual work, collaboration, and publication can actually be.

Conclusions

There's no question that a Web-centric scientific publication system can work much more effectively than what we have today - for authors, readers, and abstractors. The question is - are we ready for it?

Image Credit: Robert Scoble