GlaxoSmithKline Donates Cancer Genomics Dataset to Public Information Network

June 27, 2008

In a move likely to up the ante in the emerging Open Source Drug Discovery movement, GlaxoSmithKline has announced the donation of genomic profiling data for over 300 cancer cell lines to the National Cancer Institute's cancer Biomedical Informatics Grid (caBIG).

According to NCI's FAQ, caBIG is "an open-source, open-access information network enabling cancer researchers to share tools, data, applications, and technologies according to agreed-upon standards and identified needs." caBIG consists of publicly available datasets and the open source software tools designed to work with them.

This move has potential significance on a number of levels:

  • Large pharmaceutical companies haven't generally made a habit out of donating their hard-earned raw intellectual property on this scale. For something this far outside the industry norm, nobody wants to go first; GSK's actions have made the practice a little more respectable.
  • caBIG is simultaneously a publicly accessible database, a set of open data specifications, and an open source software platform. In other words, it's striving to become an end-to-end solution to the problem of open collaboration in the biological sciences.
  • Donating raw data directly into an open repository bypasses the established scientific publication model, in which data are communicated only as part of a peer-reviewed publication. Should this new model continue to gain popularity, it could fundamentally change the way science is conducted.

Credit: Peter Murray-Rust, Peter Suber, and Wired.