Building WebSpex: Putting Custom Data Types In Their Place
The previous article in this series introduced WebSpex, a spectroscopic data visualization tool being designed especially for use in a Web browser. Previously, the platform on which the user interface would be built was discussed. This article will discuss the question of where to put the spectroscopy data that WebSpex will display.
Tag Soup
We've decided to target WebSpex for use on the Web, which means that spectroscopy data would need to be referenced or embedded in a Web page. How should we do this? The answer, it turns out, is far from obvious.
If we knew that WebSpex were going to be created as a Java or Flash applet, which is not the current plan, we might be tempted to pass a reference to the data (or the data itself) as a parameter in the <object> tag. For an applet, this might look something like:
<object type="application/x-java-applet;version=1.4.2" width="520" height="350">
<param name="code" value="com/metamolecular/webspex/applet/FullApplet.class">
<param name="archive" value="http://metamolecular.com/applets/webspex.jar">
<param name="jcamp" value="http://base-url/spectrum.jdx">
</object>
In the example above, the parameter jcamp would encode the path to a JCAMP-DX file for WebSpex to load.
Alternatively, if we were going to develop WebSpex as a Flash applet, we might use an object tag like this:
<object type="application/x-shockwave-flash" width="520" height="350">
<param name="movie" value="webspex.swf">
<param name="FlashVars" value="filename=http://spectrum.jdx">
</object>
In this example, we use FlashVars to associate the parameter filename with the location of the spectrum file.
This works well enough, but what if we need to load a custom data type in a Web page without a plugin?
Some Options
There are a few options for including custom data in an HTML document:
Invent Our Own Tag: Browsers are designed to ignore content they don't understand. We could just hack our own tag; let's call it <spectrum>. But for a variety of reasons, this is a bad idea. Most importantly, we'd be breaking with conventions used worldwide, which is never a good idea without a very good reason. For another, any developer tools we'd use would probably complain about a malformed HTML document. Still another reason is that browsers could parse our invented tag in unpredictable ways. We may also run into problems with search engines not indexing our content properly.
Use XHTML: We could try inventing a tag the right way: with XHTML. This might be a worthwhile option if our data type (JCAMP-DX) were XML-based, but it's not. At best we'd expend a lot of effort learning about namespaces, schemas, and HTTP response headers only to end up with an amorphous, flat <spectrum> tag containing freeform text.
Use JSON: We could encode our JCAMP-DX files as JSON. JSON is a data-interchange format which, unlike XML, can be evaluated directly by the JavaScript interpreter. This has the advantage that either a filename or the actual data could be encoded. We could, in fact, create the entire object model for our spectrum, ready to be displayed, if we had software that could make the conversion from JCAMP-DX to JSON. This approach has the disadvantage that it could require significant amounts of JavaScript code to be mixed in with our HTML, a less than ideal solution.
Use the Object Tag: Given that none of the three options above is especially appealing, we might ask ourselves whether we've really tried everything possible to use plain old HTML to encode our data. More specifically, what if we were to use the object tag itself, without actually having a plugin?
Encoding Custom Data Types With The Object Tag
The HTML 4 specification has this to say about the object tag:
Most user agents have built-in mechanisms for rendering common data types such as text, GIF images, colors, fonts, and a handful of graphic elements. To render data types they don't support natively, user agents generally run external applications. The OBJECT element allows authors to control whether data should be rendered externally or by some program, specified by the author, that renders the data within the user agent.
In the most general case, an author may need to specify three types of information:
- The implementation of the included object. For instance, if the included object is a clock applet, the author must indicate the location of the applet's executable code.
- The data to be rendered. For instance, if the included object is a program that renders font data, the author must indicate the location of that data.
- Additional values required by the object at run-time. For example, some applets may require initial values for parameters.
The OBJECT element allows authors to specify all three types of data, but authors may not have to specify all three at once. For example, some objects may not require data (e.g., a self-contained applet that performs a small animation). Others may not require run-time initialization. Still others may not require additional implementation information, i.e., the user agent itself may already know how to render that type of data (e.g., GIF images).
In other words, we could place a reference to a spectrum object in an HTML page with code like this:
<object width="520" height="350">
<param name="data" value="http://base-url/spectrum.jdx">
</object>
After loading the document, we could have WebSpex walk the DOM looking for object tags that could be replaced with an instance of WebSpex. That instance could actually be placed inside the original object tag like this:
<object width="520" height="350">
<param name="data" value="http://base-url/spectrum.jdx">
<div class="webspex">
<!-- WebSpex visual presentation -->
</div>
</object>
The HTML 4 specification states that when a user agent is unable to render an object, it must try to render the object's contents instead (fallback content). So dynamically inserting the div into the object tag as shown above would have the effect of giving the browser something to display in place of the object tag.
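The DOM walk described above might be sketched roughly as follows. This is only a sketch under stated assumptions: the names getSpectrumUrl and attachViewers, and the render callback that would draw the spectrum, are hypothetical and not part of any existing WebSpex code. A fuller version would probably also want to skip object tags that a plugin already handles.

```javascript
// Return the value of the <param name="data"> child of an object
// element, or null if the element has no such parameter.
function getSpectrumUrl(objectElement) {
  var params = objectElement.getElementsByTagName('param');
  for (var i = 0; i < params.length; i++) {
    if (params[i].getAttribute('name') === 'data') {
      return params[i].getAttribute('value');
    }
  }
  return null;
}

// Hypothetical entry point: for every object tag that references
// spectrum data, insert a <div class="webspex"> and hand it to the
// caller-supplied render(container, url) callback.
// Returns the number of viewers attached.
function attachViewers(doc, render) {
  var objects = doc.getElementsByTagName('object');
  var attached = 0;
  for (var i = 0; i < objects.length; i++) {
    var url = getSpectrumUrl(objects[i]);
    if (url === null) {
      continue; // not a spectrum reference; leave it alone
    }
    var container = doc.createElement('div');
    container.className = 'webspex';
    objects[i].appendChild(container);
    render(container, url);
    attached++;
  }
  return attached;
}
```

In a browser, this could be called as attachViewers(document, drawSpectrum), where drawSpectrum stands in for whatever viewer implementation is eventually chosen.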
Advantages of Using This Approach
This approach has several advantages worth mentioning:
It's fully compliant with the HTML 4 specification.
It provides a natural anchor point to attach both the custom data and the visual presentation of that data.
It's pure HTML, requiring minimal mixing in of JavaScript content.
Web spiders can be taught a single method to associate a spectrum with a URL, regardless of how the viewer is implemented.
It's technology-agnostic. This approach lets us implement WebSpex as a Java or Flash applet (or some other plugin technology) just as easily as a pure JavaScript UI. To change our viewer implementation, we just change a JavaScript file.
It allows spectra to be inlined, or placed directly into the HTML. Using a Data URI, we could replace "http://base-url/spectrum.jdx" with something like "data:chemical/x-jcamp-dx;base64,iVBORw0KGgoAA...". This would be important in those situations in which a public URL to a JCAMP file was not feasible and/or desirable. It could also accelerate the rendering of multiple spectra in the same page by eliminating the need to create a separate HTTP request for each file.
The method carries an important limitation: if a user has disabled JavaScript, they may see nothing to indicate a problem. We could address this issue by always placing fallback content in the object tag that would then be overwritten by the JavaScript code.
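To make the Data URI idea from the last point concrete, here is a minimal sketch of how a JCAMP-DX file's text could be packed into, and recovered from, such a URI. The function names are hypothetical, and the sketch assumes the file contains only single-byte characters (JCAMP-DX files are normally plain ASCII).

```javascript
var JCAMP_PREFIX = 'data:chemical/x-jcamp-dx;base64,';

// Base64-encode a string of single-byte characters.
// Uses btoa where available and falls back to Node's Buffer.
function toBase64(text) {
  if (typeof btoa === 'function') {
    return btoa(text);
  }
  return Buffer.from(text, 'latin1').toString('base64');
}

// Decode a base64 string back into single-byte characters.
function fromBase64(encoded) {
  if (typeof atob === 'function') {
    return atob(encoded);
  }
  return Buffer.from(encoded, 'base64').toString('latin1');
}

// Wrap JCAMP-DX text in a Data URI suitable for a param value.
function toDataUri(jcampText) {
  return JCAMP_PREFIX + toBase64(jcampText);
}

// Recover the JCAMP-DX text from a Data URI, or return null if the
// value is something else (for example an ordinary http URL).
function fromDataUri(uri) {
  if (uri.indexOf(JCAMP_PREFIX) !== 0) {
    return null;
  }
  return fromBase64(uri.substring(JCAMP_PREFIX.length));
}
```

A viewer could call fromDataUri on each param value first and fall back to an HTTP request only when it returns null.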
Implementation Detail
This approach relies on Unobtrusive JavaScript techniques to keep JavaScript as separate from HTML as possible. One way to implement such a scheme would be to include a single reference to the relevant JavaScript somewhere in the document, probably within the <head> tag or right after the opening <body> tag:
<script type="text/javascript" src="webspex.js"></script>
The file webspex.js would then execute code to place a function into the document's onload queue that would scan for object tags containing JCAMP-DX content and insert the needed viewer.
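One common way to build such a queue is to chain each new handler onto whatever onload handler is already present, so that other scripts on the page keep working. The sketch below shows that pattern; addLoadEvent is an illustrative name and the scanning step is left as a comment, since neither is taken from actual WebSpex code.

```javascript
// Add a handler to target's onload "queue" without discarding any
// handler that another script has already installed.
function addLoadEvent(target, handler) {
  var previous = target.onload;
  if (typeof previous !== 'function') {
    target.onload = handler;
  } else {
    target.onload = function () {
      previous.apply(this, arguments);
      handler.apply(this, arguments);
    };
  }
}

// In a browser, webspex.js could register its scan like this.
// The guard keeps the file loadable outside a browser as well.
if (typeof window !== 'undefined') {
  addLoadEvent(window, function () {
    // Scan for object tags that reference JCAMP-DX data and
    // insert the viewer here.
  });
}
```

Handlers run in the order in which they were added, so WebSpex would not interfere with scripts loaded before it.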
Previous Uses
I'm unaware of any previous applications of this technique, although it seems like something that may have been used before.
Conclusions
Encoding and displaying custom data types in HTML is possible by using the HTML 4 object tag coupled with client-side JavaScript to rewrite the DOM. It offers the potential to create HTML documents that are both human- and machine-readable. Although the approach described here was developed for the special case of spectroscopy data, it could in principle be used for any data type requiring a visual presentation.
Image Credit: mdezemery
JavaScript for Cheminformatics: An Introduction to WebSpex, a Spectroscopy Tool for the Internet
The previous article in this series discussed the untapped potential of JavaScript for building rich, chemistry-oriented Web applications. This article will describe the design of WebSpex, a spectrum viewer designed for the Web and written entirely in JavaScript.
Warning: Potential Vaporware Ahead
WebSpex can't yet be downloaded or deployed, and it may never be finished. Articles like this one will document the tool's transition from concept to, hopefully, something more substantial. There may be false starts and dead ends along the way. But I'm hoping that there will also be feedback from readers like you. Feel free to chime in, regardless of your background.
This process will be necessarily non-linear. But I believe that being able to incorporate feedback in real-time increases the chances of creating a better product, and seeing mistakes, false-starts, and dead-ends in context can be useful to everyone.
The Problem
Of all the forms of chemical data generated in labs around the world today, spectra are one of the most common. Although several tools can display and manipulate these data, few are capable of being used on the Web. If you agree with the hypothesis that the Web is fast becoming the only information exchange platform that matters, this presents a significant problem (or opportunity, depending on your perspective).
Recently, a series of articles on the Web-based spectroscopy tool JSpecView appeared here. JSpecView is one of the few free tools currently available that enable spectra to be displayed and manipulated on the Web.
JSpecView is written in Java and deployed as an applet. In many situations, applets are the best technology for delivering interactive Web content of the kind required by spectroscopy.
But no technology is perfect. One of the biggest limitations of applets is that they require the correct version of the Java plug-in to be installed on a user's machine (a limitation shared by all plug-in technologies, including Flash). In situations where users either can't or won't install the plug-in, or in which adequate resources are not available to ensure smooth applet deployment, Java applets may not be the best technology platform.
What if there were a spectroscopy tool that worked on any browser without any plugin dependencies?
Scenarios
For a better idea of what WebSpex will be about, consider some scenarios:
Viewing: Sally logs into her research group's spectroscopy archive website. She clicks on the link to her most recently collected IR spectrum and is taken to a page displaying an image of her spectrum along with textual metadata.
Zooming: Fred is viewing an IR spectrum published on a chemical supplier's website. Wanting a better view of the fingerprint region, he uses the mouse to zoom the spectrum.
Peak Picking: Victor is using a publisher's website to view the IR spectrum of a compound described in a recent paper. Wanting to match the carbonyl stretching frequency of the material he prepared, he uses the mouse to pick the peak.
Data
WebSpex will use as its input format JCAMP-DX, the de facto standard for spectral data encoding. Although JCAMP-DX has been extended in many ways over the last several years, for now the goal of WebSpex will be to simply read and display error-free examples of the original format specification.
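A JCAMP-DX file is organized as a series of labeled data records, each starting with "##", followed by a label, an "=" sign, and the record's content. As a rough idea of what reading such a file involves, the sketch below splits the text into records keyed by label. The function name is hypothetical, and the sketch makes no attempt to decode the compressed data forms that the format allows.

```javascript
// Split JCAMP-DX text into labeled data records.
// Returns an object mapping normalized labels to their raw text.
function parseJcampRecords(text) {
  var records = {};
  var currentLabel = null;
  var lines = text.split(/\r?\n/);
  for (var i = 0; i < lines.length; i++) {
    // "$$" starts a comment that runs to the end of the line.
    var line = lines[i].split('$$')[0];
    var match = /^\s*##([^=]*)=(.*)$/.exec(line);
    if (match) {
      // Labels are compared without regard to case, spaces,
      // or the characters - / _.
      currentLabel = match[1].toUpperCase().replace(/[\s\-\/_]/g, '');
      records[currentLabel] = match[2].trim();
    } else if (currentLabel !== null && line.trim() !== '') {
      // A line without "##" continues the most recent record.
      records[currentLabel] += '\n' + line.trim();
    }
  }
  return records;
}
```

Given a file whose first line is "##TITLE=Sample IR", parseJcampRecords(text).TITLE would be "Sample IR", and the tabulated points would be found as continuation lines under the XYDATA record.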
Conclusions
Good Web-based spectroscopy tools are a prerequisite for the open sharing of this important form of experimental data. WebSpex could fill this need by providing an installation-free tool that can be used in any browser without plugins. WebSpex currently consists of nothing more than some ideas; its successes and failures will be documented here in several installments.
Image Credit: estaticist

