Designing the Obvious: Permalinks and Paradigms
Pssssst. Want to know a secret? Some of the best inventions are completely obvious. That is, given a half dozen years or so. At the time they're conceived, however, most good, obvious ideas just seem dumb, dangerous, or uninteresting. They have to - otherwise they'd have been developed already.
Case in point: the blog permalink. If you've ever read a blog, you know what a permalink is. It's the link you click when looking at a story headline in an RSS reader like Google Reader.
If you run a blog, you definitely know what a permalink is. It's the link that a Google user follows when they search for a topic you've written about. It's what other authors link to in their own writing, thereby increasing your ranking in Google. If your blog is anything like mine, Google drives a lot of your traffic, and the permalink makes it all possible.
A permalink is nothing more than a fixed, unique identifier (URL) for online content. Blogging would have never caught on without it.
I recently ran across Tom Coates' excellent essay on the lowly permalink. He describes the time around 1999-2000 when permalinks didn't exist. If you ran a blog back then and wanted to write about someone else's blog post, you had to link to the other blog's home page. As the author you linked to continued to post, the content you had discussed in your own blog disappeared from the other author's front page, making your link irrelevant.
It was a huge problem, yet few perceived it as such. Interestingly, Coates even admits to having been against the idea of permalinks because of their hacky nature. Besides, they didn't seem to do anything useful.
So the next time you're stumped while trying to find something to work on that matters, try picking up a dumb, dangerous, or uninteresting - yet obvious - idea and running with it. In six years' time your invention may become so well known that most people couldn't imagine the world without it.
Image Credit: mklingo
Streamlining Cheminformatics on the Web: Let InChI Do the Heavy Lifting and Get Some REST
A recent Depth-First article discussed the advantages of minimal Web APIs in cheminformatics. Antony Williams recently unveiled some simplified ChemSpider URL schemes, mainly to enable Google indexing. However, it's possible to take this scheme much further. Here I present a proposal for radically simplifying (and unifying) the development of cheminformatics Web APIs and the software that interacts with them.
The New ChemSpider URLs
ChemSpider now has several new kinds of URLs. For the purposes of this article, the most interesting of these are of the format:
These URLs may seem unremarkable, but there's much more than meets the eye. They let anonymous developers query ChemSpider about specific substances - without needing to know much at all about how ChemSpider itself works. Goodbye API. Goodbye API support. Goodbye API documentation. Goodbye angle brackets. Hello to getting stuff done. It's all very RESTful. Well, at least it could be that way with some minor modification.
Some Recommendations
ChemSpider hasn't quite reached that place where the API just disappears. The problem is that the ChemSpider URLs listed above point to query results pages, not compound summary pages. Were these URLs to redirect to a summary page, we could construct the following URLs to extract ChemSpider resources (I've replaced the '=' sign with a '/' for simplicity):
.../InChIKey/DEIYFTQMQPDXOT-RERXVCSDCZ Get all resources for the molecule identified by the given InChIKey - i.e., "Compound summary page"
.../InChIKey/DEIYFTQMQPDXOT-RERXVCSDCZ/molfile.mol Get the molfile for the molecule identified by the given InChIKey
.../InChIKey/DEIYFTQMQPDXOT-RERXVCSDCZ/small_image.png Get the small image for the molecule identified by the given InChIKey.
.../InChIKey/DEIYFTQMQPDXOT-RERXVCSDCZ/large_image.png Get the large image for the molecule identified by the given InChIKey.
.../InChIKey/DEIYFTQMQPDXOT-RERXVCSDCZ/citations.xml Get the list of citations for the molecule identified by the given InChIKey, in XML format.
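To make the proposed scheme concrete, here is a minimal Python sketch of how a client might build these URLs. Nothing here is an actual ChemSpider API: the base URL is a made-up placeholder standing in for the elided "..." prefix, and the function and dictionary names are hypothetical. Only the path layout and the resource file names come from the list above.

```python
from urllib.parse import quote

# Hypothetical placeholder for the elided ".../" prefix; not a real service URL.
BASE_URL = "https://chem.example.org"

# Resource names taken from the proposed scheme; "summary" maps to the bare
# InChIKey URL (the compound summary page).
RESOURCES = {
    "summary": "",
    "molfile": "molfile.mol",
    "small_image": "small_image.png",
    "large_image": "large_image.png",
    "citations": "citations.xml",
}


def resource_url(inchikey, resource="summary", base_url=BASE_URL):
    """Build the proposed URL for one resource of the molecule with this InChIKey."""
    if resource not in RESOURCES:
        raise ValueError(f"unknown resource: {resource!r}")
    # Percent-encode the key so it is always safe to embed in a URL path.
    url = f"{base_url.rstrip('/')}/InChIKey/{quote(inchikey, safe='')}"
    suffix = RESOURCES[resource]
    return f"{url}/{suffix}" if suffix else url


if __name__ == "__main__":
    key = "DEIYFTQMQPDXOT-RERXVCSDCZ"
    for name in RESOURCES:
        print(name, resource_url(key, name))
```

The point of the sketch is that the client needs nothing but the InChIKey and a naming convention. There is no session, no token, and no service-specific identifier to look up first.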
Jane, a developer building Web applications on top of this new ChemSpider API, would immediately notice that things just work. Let's say her online database stores IC50s at the dopamine D2 receptor. On the summary page for each molecule, she wants to link out to the ChemSpider compound summary page, if available. She would simply construct the InChIKey on her server, build the needed ChemSpider URL and GET it. An HTTP 404 would indicate no molecule with that Key exists on ChemSpider and so no link would be shown. An HTTP 200 would indicate ChemSpider has the molecule, and so the link would appear.
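Jane's server-side check might look something like the following Python sketch. It assumes the proposed URL scheme were actually implemented, which the article does not claim. The base URL is a made-up placeholder, and `fetch_status` and `summary_link` are hypothetical names. The status lookup is passed in as a parameter so the decision logic can be exercised without touching the network.

```python
import urllib.error
import urllib.request

# Hypothetical placeholder; the real prefix is elided in the article.
BASE_URL = "https://chem.example.org"


def fetch_status(url, timeout=10.0):
    """GET the URL and return the HTTP status code, including error codes."""
    request = urllib.request.Request(url, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the code is still the answer we want.
        return err.code


def summary_link(inchikey, base_url=BASE_URL, get_status=fetch_status):
    """Return the compound summary URL if the service has the molecule, else None."""
    url = f"{base_url}/InChIKey/{inchikey}"
    status = get_status(url)
    if status == 200:
        return url   # molecule exists: show the link
    if status == 404:
        return None  # no molecule with that key: show no link
    # Anything else (e.g. 500) is neither "present" nor "absent".
    raise RuntimeError(f"unexpected HTTP status {status} for {url}")


if __name__ == "__main__":
    # Stubbed status function, so this demo runs without a network connection.
    print(summary_link("DEIYFTQMQPDXOT-RERXVCSDCZ", get_status=lambda url: 200))
    print(summary_link("DEIYFTQMQPDXOT-RERXVCSDCZ", get_status=lambda url: 404))
```

In production Jane would probably cache the result per InChIKey rather than issue a GET on every page view, but that is a design choice outside the scheme itself.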
Conclusions
It would be interesting enough if ChemSpider adopted a system like that described here. But the real power of this approach would emerge if multiple Web services were to adopt it. By following a simple set of conventions, these services would enable third-party developers to elegantly mash up all manner of cheminformatics resources into applications unimaginable today.
Technically, there's nothing that prevents this system from being implemented on every free chemistry database in existence today. However, doing so would transfer a significant degree of control from service operators to third-party developers. Not all providers will be comfortable with that idea.
Cheminformatics Web service providers need to carefully consider whether they're trying to develop a platform or an integrated service. As history has shown, the strategies, and upside potential, for each approach can differ dramatically.


