JavaScript Survival: Screw.Unit

Posted by Rich Apodaca Mon, 23 Nov 2009 14:53:00 GMT

Over the last year or so, I've become increasingly convinced that implementing Behavior-Driven Development (BDD) practices can lead to significantly cleaner code and better design. And in the case of dynamic languages like Ruby, a good testing framework can be essential in enabling good maintenance practices.

JavaScript, that language that for years was written off as a toy at best and an abomination at worst, is now much better understood. It turns out JavaScript is quite powerful and surprisingly well-designed, provided that you stick to the good parts and avoid the bad parts.

But how do you do BDD in JavaScript? One tool I was pleasantly surprised by is Screw.Unit. If you've ever worked with RSpec, you'll immediately recognize the connection. The ability to create nested contexts and tests that read like plain English makes this my JavaScript testing tool of choice.
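To make "nested contexts" concrete, here is a minimal, self-contained sketch of the RSpec-style pattern. It is not Screw.Unit's own code: the describe, it, and expectEqual helpers below are hypothetical stand-ins defined only for this illustration, so that the example runs without the library or a browser.

```javascript
// Hypothetical stand-ins for an RSpec-style runner (NOT Screw.Unit's code).
const contexts = []; // names of the describe() blocks currently open
const results = [];  // one entry per executed example

// Open a named context, run its body, then close the context again.
function describe(name, body) {
  contexts.push(name);
  try {
    body();
  } finally {
    contexts.pop();
  }
}

// Run one example; its full name is the enclosing context names plus its own.
function it(name, body) {
  const fullName = contexts.concat(name).join(" ");
  try {
    body();
    results.push({ name: fullName, passed: true });
  } catch (err) {
    results.push({ name: fullName, passed: false, error: err.message });
  }
}

// Tiny assertion helper: throws when the two values are not strictly equal.
function expectEqual(actual, expected) {
  if (actual !== expected) {
    throw new Error("expected " + expected + " but got " + actual);
  }
}

describe("A stack", function () {
  describe("when empty", function () {
    it("has a length of zero", function () {
      expectEqual([].length, 0);
    });
  });

  describe("after a push", function () {
    it("has a length of one", function () {
      const stack = [];
      stack.push("x");
      expectEqual(stack.length, 1);
    });

    it("returns the pushed item on pop", function () {
      const stack = ["x"];
      expectEqual(stack.pop(), "x");
    });
  });
});

// Report each example under its sentence-like full name.
for (const r of results) {
  console.log((r.passed ? "PASS " : "FAIL ") + r.name);
}
```

Each example's name is built from the contexts that enclose it, which is what lets a spec read like a sentence ("A stack when empty has a length of zero"). Screw.Unit supplies its own versions of these functions along with a matcher library, so consult its documentation for the exact names and syntax.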

Casting a Wide Net in Cheminformatics

Posted by Rich Apodaca Wed, 18 Nov 2009 18:56:00 GMT

I've always viewed the term 'cheminformatics' pretty broadly. To me, cheminformatics is simply the application of information technology to solve problems in chemistry.

Unsolved Problems in Cheminformatics

For my money, some of the most important unsolved problems in cheminformatics today have to do with the many issues that come with the transition from a paper-centric scientific workflow to a Web-centric scientific workflow. We're not just talking about publishing the same stuff in a new medium; we're talking about a completely different way of working, and a different set of non-technical issues ranging from intellectual property to culture to how to make a living.

One thing the Web excels at is connecting people who share the same interests in the long tail. If this is true for society at large, why shouldn't it be true in science - and chemistry in particular?

But connecting people on the Web is a very tricky business. Millions, probably billions, of dollars have been spent on schemes that ultimately went nowhere.

A Platform for Chemical Knowledge-Gathering

So it was with great interest that I noticed the public beta of StackExchange, the software behind the wildly popular programming site StackOverflow.

For the unfamiliar, StackOverflow is a question and answer site that combines aspects of peer-review, wikis, forums, and mailing lists into a single, coherent, fun, search engine-optimized package.

After having actively used StackOverflow for a while now, some things stand out in my mind:

  1. The very first question I asked had an answer within 30 minutes.
  2. I now know of a number of smart, friendly people who I've never met but who share my narrow interests with respect to the developer tools and methods I use regularly.
  3. I no longer read or post to mailing lists for the products I use. I go to StackOverflow instead.
  4. I no longer use Google for some questions - I go directly to StackOverflow.

I had been sitting on the fence with respect to starting a StackExchange site in chemistry. The idea is, after all, a pretty radical one. See, StackExchange isn't just about enabling peer-to-peer communication; it's about promoting it, and then leaving a public record of the event behind for others to use.

This is supposed to be what scientific papers are for. The problem is that papers do such an awful job of it that the similarity may not be immediately obvious.

But Egon Willighagen gave me the inspiration I needed (not the first time, either) with his BlueObelisk SE site. As Noel O'Boyle notes, the idea can only work, regardless of the software being used, if people actually find something that fills a need.

Chempedia Lab

So, without further ado, I give you Chempedia Lab, a site for asking and answering questions about experimental chemistry.

The choice of topic is deliberate. There are many sites where you can go to get answers to general chemistry questions. But none of them is dedicated to collecting, organizing, centralizing, and freely distributing the massive body of knowledge that is experimental chemistry.

As a trained synthetic organic chemist with eight years' experience in drug discovery, I have a particular perspective on cheminformatics: it's not worth much until put into practice by someone who makes decisions in the lab.

Experimental chemistry is where cheminformatics begins, and the thing that keeps it going.

The format of Chempedia Lab is free text and images. But fear not, there are many things we can do to layer on deeper chemical meaning.

For example, see this question on the synthesis of 2,3,6,7-tetrabromoanthracene. The chemical structure image I embedded in the body is served by the corresponding Substance summary page on the Chempedia substance registry. In exchange for a user getting a convenient way to add a structure image to her question, the Chempedia registry gets some higher-level chemical information.

Links between sites are a powerful way to join bits of information together. In this case, they join a substance, its machine-readable structure, a question about its synthesis, and a primary literature citation to its preparation.

Wrapup

I have no idea what to expect from Chempedia Lab. The newness of the concept, and some of those non-technical concerns noted above, might take a while to work themselves out. Or there may simply be no need for it.

One thing is certain - if you pay attention, there will be plenty to learn.

Tech Fridays: Cloud Computing (in Plain English)

Posted by Rich Apodaca Fri, 13 Nov 2009 15:53:00 GMT

Chempedia Data Downloads: Free as in Free

Posted by Rich Apodaca Tue, 10 Nov 2009 15:16:00 GMT

Few non-technical topics lead to more heated debate in the field of scientific databases than terms of data reuse.

Chempedia, the free chemical substance registry, aims to be different in every significant way from the services that came before it. One of the ways this can be seen is in the terms of use for its data downloads.

There are really two bits of news here. The first is that you can now download the Chempedia Registry at no cost.

The second bit of news is that the files you download are licensed under the ultra-permissive Creative Commons CC0 License. In the most basic terms, this no-B.S. license lets anybody mash up the Chempedia Registry data for any purpose, commercial or otherwise. There is no need for attribution, no confusing legal mumbo-jumbo, no impenetrable reciprocity provisions, no dangling questions about copyright. Use the data for any purpose you desire, but don't blame Metamolecular if something bad happens.

By using CC0, Chempedia short-circuits the rancorous debate that has plagued past attempts to license chemical data. I challenge every operator of a public-facing chemical database to take similar steps.

Credit: Kudos to Egon Willighagen for introducing me to the CC0 License.

A Clean, Well-Lit Place for Spectra

Posted by Rich Apodaca Mon, 09 Nov 2009 17:17:00 GMT

Bonnie Swoger, Science and Technology Librarian at SUNY Geneseo, asks a very interesting question on the CHMINF-L mailing list:

Theoretical question here from a still-new science librarian:

Let's say that a faculty member wants a particular spectra (IR, NMR, etc). He has exhausted the free options (available on many of the excellent guides put together by many members of this list) and the spectra available in SciFinder (which we subscribe to). We don't subscribe to any other specialty chemical databases that provide this information.

What would his options be? With our shrinking budgets, I would be hard pressed to justify subscribing to one of the spectra databases, especially since it would get limited use.

Is there a way to "ILL" particular spectra? Or would the faculty member need to determine which journal article a spectra appeared in and ILL that? Or are there other options?

This question is purely theoretical at this point. Faculty members have asked me about additional access to spectra, but there hasn't been a lot of follow-up.

Any guidance you can provide would be useful!

For those not familiar with the term, "ILL" means "interlibrary loan" - a way for a library to offer materials by borrowing them from other libraries.

There are several services, commercial and free, that I could mention. The problem is that none of them make it easy to get a spectrum that isn't already in "the collection".

For that, we need a new kind of spectroscopy marketplace. One built on the principles of openness and peer-review. One that's free to use in every sense of the word. One that takes advantage of the best technologies the Web currently has to offer to enable scientists' innate desire to help each other and make their results widely known.

Chempedia may offer a glimpse of what this new kind of service could look like. In Chempedia, scientists themselves take center stage. We're currently working on ways to take this simple idea in innovative directions. Chempedia's problem domain is substance registration (on which nearly all chemistry information resources depend), but the approach can be applied to any area - for example, spectroscopy.

If you're looking for ways to make big contributions to the scientific enterprise through information technology, create the solution to Bonnie's problem.