Chemistry, The Web, and Netflix
If you've ever rented movies from Netflix, you've probably noticed the information box that pops up when you hover over a movie image. If you just want a quick peek at what a movie is all about, this simple feature can save a great deal of time and effort in mousing around, clicking, and general navigation annoyance. It turns out that chemical compounds have a lot in common with movies in that they both can be referred to through one or more identifiers and they both have a lot of interesting metadata linked to them. This article shows that what works for Netflix can also work for chemistry.
The Problem
Interpreting IUPAC nomenclature and references to compound numbers is a major chore when working with chemistry experimental sections. When paper documents are used, this typically means flipping back and forth many times between the narrative and the experimental section. With Web documents, doing so is usually either impossible or very inconvenient, so the PDF ends up being printed on paper.
A Demonstration
The following text is an edited and re-formatted passage taken from the experimental section of a paper published in Beilstein Journal of Organic Chemistry. If you hover over any hyperlink for half a second or more, a balloon will pop up showing you the chemical structure of the substance being referred to. Mousing away from the link hides the balloon.
1-[(1R)-1-(2-{[tert-Butyl(dimethyl)silyl]oxy}ethyl)hexyl]-2-piperidinone (34)
5-Bromopentanoyl chloride (1.84 g, 9.25 mmol) was added to a stirred solution of primary amine 32 (2.00 g, 7.71 mmol) in dry 1,2-dichloroethane (30 cm3), followed by anhydrous NaHCO3 (0.78 g, 9.25 mmol). The reaction mixture was left to stir at room temperature for 16 h. The resulting mixture was filtered through a pad of celite, which was then washed with CH2Cl2. The combined filtrate and washings were then evaporated in vacuo to yield a crude orange oil (4.06 g), which was purified by column chromatography on silica gel with hexane-EtOAc (7:3) as eluent to give the 5-bromo-N-[(1R)-1-(2-{[tert-butyl(dimethyl)silyl]oxy}ethyl)hexyl]pentanamide 33 as an orange oil (2.92 g, 89%).
A portion of the bromoamide 33 (0.20 g, 0.47 mmol) was dissolved in dry THF (3 cm3) containing a suspension of potassium tert-butoxide (587 mg, 0.52 mmol), and the mixture was stirred at room temperature for 25 min before being diluted with EtOAc (10 cm3). The mixture was then washed with saturated aqueous sodium chloride solution (5 x 2 cm3). The combined organic extracts were dried (MgSO4), filtered and evaporated in vacuo to yield a crude yellow oil (0.16 g), which was purified by column chromatography on silica gel with hexane-EtOAc (85:15) as eluent to give 1-[(1R)-1-(2-{[tert-butyl(dimethyl)silyl]oxy}ethyl)hexyl]-2-piperidinone 34 as a pale yellow oil (0.13 g, 81%).
-Michael, Accone, Koning, and Westhuyzen, Beilstein J. Org. Chem. 2008, 4, 5
This demo has been tested on Internet Explorer 6/7, Firefox 2, and Safari 3.
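As a rough illustration of how the markup for a demo like this might be prepared, the sketch below wraps compound numbers in anchors that a hover script could later pick up. The lookup table, the file names, the "compound" class, and the data-structure attribute are all assumptions made for illustration; none of them come from the demo itself or from Balloon.js.

```python
import html
import re

# Hypothetical lookup table: compound number -> structure image file.
# These paths are illustrative only.
STRUCTURES = {
    "32": "structures/32.svg",
    "33": "structures/33.svg",
    "34": "structures/34.svg",
}

# Match a known compound number that is not part of a larger number,
# so "amine 32" matches but "2.32 g" and "132" do not.
_COMPOUND = re.compile(
    r"(?<![\w.])(" + "|".join(map(re.escape, STRUCTURES)) + r")(?!\w|\.\d)"
)


def link_compounds(text: str) -> str:
    """Return HTML in which each known compound number is a hoverable anchor."""
    parts = []
    last = 0
    for match in _COMPOUND.finditer(text):
        number = match.group(1)
        url = html.escape(STRUCTURES[number], quote=True)
        # Escape the plain text between matches, then emit the anchor.
        parts.append(html.escape(text[last:match.start()]))
        parts.append(f'<a class="compound" data-structure="{url}">{number}</a>')
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)


print(link_compounds("a stirred solution of primary amine 32 (2.00 g, 7.71 mmol)"))
```

A pattern this simple will also link any unrelated standalone "32", "33", or "34" in the text, so a real pipeline would likely need expert review or smarter entity recognition before publishing the result.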
Technologies
Although this demonstration is built on numerous Web technologies, two are at the top of the stack: the vector graphics rendering engine of ChemWriter and the open source JavaScript library Balloon.js.
Chemical structures are displayed as lightweight Adobe Flash SWF files, as described in a previous Depth-First article. Software based on ChemWriter converts a molecular connection table into vector graphics commands for the Flash runtime with the help of the open source Transform SWF library.
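To give a feel for what converting a connection table into vector graphics commands involves, here is a minimal sketch that emits SVG rather than SWF. It is not ChemWriter's method and does not use Transform SWF; the Atom class, the to_svg function, and the scale and margin parameters are hypothetical names chosen for this example. Bond orders and stereochemistry are ignored, and labels are simply drawn over the bond ends.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    symbol: str
    x: float  # 2D coordinates in arbitrary units, y pointing up
    y: float


def to_svg(atoms, bonds, scale=30.0, margin=10.0):
    """Render atoms and bonds (pairs of atom indices) as an SVG string."""
    xs = [a.x for a in atoms]
    ys = [a.y for a in atoms]
    min_x, max_y = min(xs), max(ys)
    width = (max(xs) - min_x) * scale + 2 * margin
    height = (max_y - min(ys)) * scale + 2 * margin

    def project(atom):
        # Molecule coordinates have y pointing up; SVG has y pointing down.
        return ((atom.x - min_x) * scale + margin,
                (max_y - atom.y) * scale + margin)

    parts = [
        '<svg xmlns="http://www.w3.org/2000/svg" '
        f'viewBox="0 0 {width:g} {height:g}">'
    ]
    for i, j in bonds:
        x1, y1 = project(atoms[i])
        x2, y2 = project(atoms[j])
        parts.append(
            f'<line x1="{x1:g}" y1="{y1:g}" x2="{x2:g}" y2="{y2:g}" '
            'stroke="black"/>'
        )
    for atom in atoms:
        if atom.symbol != "C":  # carbon atoms are left implicit
            x, y = project(atom)
            parts.append(
                f'<text x="{x:g}" y="{y:g}" text-anchor="middle" '
                f'dominant-baseline="central">{atom.symbol}</text>'
            )
    parts.append("</svg>")
    return "\n".join(parts)


# Ethanol heavy atoms as a zigzag: C-C-O
ethanol = [Atom("C", 0.0, 0.0), Atom("C", 0.87, 0.5), Atom("O", 1.73, 0.0)]
print(to_svg(ethanol, [(0, 1), (1, 2)]))
```

Because the viewBox records the drawing's natural size, the same output can be shown at any scale without a separate image for each view.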
Playing to the Web's Strengths
The Web is a new medium with a completely different set of rules compared to print media. One of its biggest strengths is interactivity: the ability to see something of interest and to immediately be able to find out more about it. One of its biggest weaknesses, even today, is technology standards. It's not enough to create interactivity; that interactivity must also fit within the technical constraints imposed by a medium that is still a work in progress.
As journal publishers and others grapple with how to approach the inevitable transition to purely Web-based scientific communication, it's important to keep both the strengths and limitations of the Web in mind. To date, nearly all attempts to create Web-based versions of chemistry journals have simply tried to duplicate the form of the print medium. This has resulted, if anything, in an even greater reliance on paper, leaving valuable information used well below its full potential.
Conclusions
This article has demonstrated a simple labor-saving technique in which chemical structures can be visualized by hovering the cursor over specially-designated chemical identifiers. There's quite a bit more that can be done with chemical vector graphics, chemical information, and Web technologies commonly used in consumer services like Netflix. Future articles will discuss some possibilities.


That is very awesome. That would make reading articles a whole lot handier. I sure hope this idea of yours gets picked up quickly by the publishers!
This idea is an old idea and HAS been used by publishers. Project Prospect from the RSC has been using this technology for a while. SureChem's fee patent portal with structure searching has been using this technology (both Project Prospect and SureChem technologies are supplied by ChemAxon but there are other providers including OpenEye). It's a very enabling technology for helping to visualize text-based articles with embedded structures.
The comment about SureChem should read FREE patent portal..not fee. You can access it here: http://www.surechem.org/
Tony, I didn't say that this general idea had never been used before - it has. For example, OpenEye sells (or used to sell) a plugin for Adobe Acrobat that works in a similar manner to that described here. But desktop software plugins are a world away from doing this inside the browser.
I've worked with Prospect's implementation of this general idea and my critique is available by following the link given in the article above.
What's presented here is significantly different from the way Prospect implements it. Here, the balloon pops up without the user needing to enable the feature and without the user needing to click on the reference or click on the window to dispose of it when they're done. Eliminating unnecessary mousing and clicking is critical to creating responsive Web applications. The approach described here also offers the possibility of the text reference being hyperlinked in addition to being hoverable, something that looked impossible with Prospect.
In addition, this approach uses lightweight vector graphics, not bitmaps, to display the structures. This means that the same image file can be used in a tiny view for search results and in a large balloon view. Moreover, the size of the balloon can be scaled to accommodate the size of the structure being displayed - with larger structures getting a larger balloon and smaller structures getting a smaller balloon.
The SureChem implementation is new to me, but more closely related to the approach described here.
Aside from SureChem and Prospect, I know of no publisher, traditional or electronic, who implements the feature. Prospect has only been at it since Feb '07, and I suspect SureChem's implementation is more recent than that.
If this doesn't qualify the idea as "new", what else could?
Very cool, looks great.
Rich...my response regarding publishers doing things like this was to Johan who said "I sure hope this idea of yours gets picked up quickly by the publishers!". RSC is leading the way with such implementations and I, like yourself, hope that more will do so. I agree that reducing the number of clicks is important. I'm all for ease of use! I like the vector graphics implementation and the scaling. I also don't know of any other publishers who have implemented the feature but I have to hope that others are watching Prospect and planning to implement similar technologies. It's certainly possible of course! You, SureChem, ChemAxon, RSC and OpenEye are all showing it's possible. Now the work has to be done on the editorial side to gather the structures and link up the technologies. Again...all possible...just work.
Tony, a good chunk of the needed work might be done by automated systems like Sciborg and OSCAR, which I believe are being used behind the scenes at RSC/Prospect.
There are a lot of technologies lying around just waiting to be assembled into coherent chemical information systems...
It also works fine on the Konqueror browser (Fedora 8, KDE).
Nice! The Userscripts paper (doi:10.1186/1471-2105-8-487) shows the use of such popups for comments from the Chemical blogspace for articles.
One big problem here is annotation. I value expert markup of chemicals over using OSCAR for this kind of thing. We all too often mess our chemistry up already. Semantic markup with InChIs (see my blog) would be well suited to this, as explained in the userscripts paper mentioned above.
@ChemSpiderMan: Well, okay, I suppose there might indeed be a few publishers who use something akin to this. Fact is that the big ones, Elsevier, SpringerLink and such, do not have this, and that is too bad. I especially like this implementation due to its ease of use. I doubt we'll see them implementing something like this quickly. It took Elsevier over half a year to fix a single missing closing tag that messed up Safari. I doubt they are in for something as "advanced" as this.
In your article you mention the concept of interactivity, and I think it can be picked up by the publishers very soon. You are right that the Web is a new medium with a completely different set of rules compared to print media.
I hope your next article comes soon, because I want to know more about "a simple labor-saving technique..." It sounds simple, but on closer inspection, saving a minute a day adds up to many free moments a year. Thanks a lot!
The demonstration of the Netflix approach applied to text is great. I really hope that future articles will discuss some more possibilities.