Adobe Flash for Cheminformatics: Chemul, a 3D Structure Viewer
Previous articles have discussed the use of Adobe Flash for cheminformatics. Tetsuya Hoshi has created a 3D structure viewer (embedded above) called Chemul that can be used with the Flash Player and which is written in ActionScript 3.0. Although the documentation is written in Japanese, it appears that Chemul supports multiple display options, as evidenced here.
Building WebSpex: Putting Custom Data Types In Their Place
The previous article in this series introduced WebSpex, a spectroscopic data visualization tool being designed especially for use in a Web browser. Previously, the platform on which the user interface would be built was discussed. This article will discuss the question of where to put the spectroscopy data that WebSpex will display.
Tag Soup
We've decided to target WebSpex for use on the Web, which means that spectroscopy data would need to be referenced or embedded in a Web page. How should we do this? The answer, it turns out, is far from obvious.
If we knew that WebSpex were going to be created as a Java or Flash applet, which is not the current plan, we might be tempted to pass a reference to the data (or the data itself) as a parameter in the <object> tag. For a Java applet, this might look something like:
<object type="application/x-java-applet;version=1.4.2" width="520" height="350">
<param name="code" value="com/metamolecular/webspex/applet/FullApplet.class">
<param name="archive" value="http://metamolecular.com/applets/webspex.jar">
<param name="jcamp" value="http://base-url/spectrum.jdx">
</object>
In the example above, the parameter jcamp would encode the path to a JCAMP-DX file for WebSpex to load.
Alternatively, if we were going to develop WebSpex as a Flash applet, we might use an object tag like this:
<object type="application/x-shockwave-flash" width="520" height="350">
<param name="movie" value="webspex.swf">
<param name="FlashVars" value="filename=http://spectrum.jdx">
</object>
In this example, we use FlashVars to associate the parameter filename with the location of the spectrum file.
This works well enough, but what if we need to load a custom data type in a Web page without a plugin?
Some Options
There are a few options for including custom data in an HTML document:
Invent Our Own Tag. Browsers are designed to ignore content they don't understand, so we could just hack our own tag; let's call it <spectrum>. But for a variety of reasons, this is a bad idea. Most importantly, we'd be breaking with conventions used worldwide, which should never be done without a very good reason. For another, any developer tools we'd use would probably complain about a malformed HTML document. Still another reason is that browsers could parse our invented tag in unpredictable ways. We may also run into problems with search engines not indexing our content properly.
Use XHTML. We could try inventing a tag the right way: with XHTML. This might be a worthwhile option if our data type (JCAMP-DX) were XML-based, but it's not. At best we'd expend a lot of effort learning about namespaces, schemas, and HTTP response headers only to end up with an amorphous flat <spectrum> tag containing freeform text.
Use JSON. We could encode our JCAMP-DX files as JSON. JSON is a data-interchange format that, like XML, can represent structured data, but with the difference that it can be evaled directly by the JavaScript interpreter. This has the advantage that either a filename or the actual data could be encoded. We could, in fact, create the entire object model for our spectrum, ready to be displayed, if we had software that could make the conversion from JCAMP-DX to JSON. This approach has the disadvantage that it could require significant amounts of JavaScript code to be mixed in with our HTML, a less than ideal solution.
Use the Object Tag. Given that none of the three options above is especially appealing, we might ask ourselves whether we've really tried everything possible to use plain old HTML to encode our data. More specifically, what if we were to use the object tag itself, without actually having a plugin?
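To make the JSON option above a little more concrete, here is a minimal sketch of what a JCAMP-DX to JSON conversion might start from. It only collects the labeled records of the form ##LABEL=value and skips the numeric lines of any data table, which a real converter would have to decode. The function name jcampHeaderToJson is hypothetical, and the result is read back with JSON.parse rather than eval.

```javascript
// Hypothetical helper: collects "##LABEL=value" records from JCAMP-DX text
// and returns them as a JSON string. Numeric data lines are skipped.
function jcampHeaderToJson(jcampText) {
  var header = {};
  var lines = jcampText.split(/\r?\n/);
  for (var i = 0; i < lines.length; i++) {
    var match = /^##([^=]+)=(.*)$/.exec(lines[i]);
    if (match) {
      var label = match[1].replace(/^\s+|\s+$/g, '');
      var value = match[2].replace(/^\s+|\s+$/g, '');
      header[label] = value;
    }
  }
  return JSON.stringify(header);
}
```

A page script could then call JSON.parse on the returned string and read, for example, the TITLE record of the spectrum.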
Encoding Custom Data Types With The Object Tag
The HTML 4 specification has this to say about the object tag:
Most user agents have built-in mechanisms for rendering common data types such as text, GIF images, colors, fonts, and a handful of graphic elements. To render data types they don't support natively, user agents generally run external applications. The OBJECT element allows authors to control whether data should be rendered externally or by some program, specified by the author, that renders the data within the user agent.
In the most general case, an author may need to specify three types of information:
- The implementation of the included object. For instance, if the included object is a clock applet, the author must indicate the location of the applet's executable code.
- The data to be rendered. For instance, if the included object is a program that renders font data, the author must indicate the location of that data.
- Additional values required by the object at run-time. For example, some applets may require initial values for parameters.
The OBJECT element allows authors to specify all three types of data, but authors may not have to specify all three at once. For example, some objects may not require data (e.g., a self-contained applet that performs a small animation). Others may not require run-time initialization. Still others may not require additional implementation information, i.e., the user agent itself may already know how to render that type of data (e.g., GIF images).
In other words, we could place a reference to a spectrum object in an HTML page with code like this:
<object width="520" height="350">
<param name="data" value="http://base-url/spectrum.jdx">
</object>
After loading the document, we could have WebSpex walk the DOM looking for object tags that could be replaced with an instance of WebSpex. That instance could actually be placed inside the original object tag like this:
<object width="520" height="350">
<param name="data" value="http://base-url/spectrum.jdx">
<div class="webspex">
<!-- WebSpex visual presentation -->
</div>
</object>
The HTML 4 documentation states that any content contained within object tags not recognized by the user agent will be rendered (fallback content). So dynamically inserting the div into the object tag as shown above would have the effect of giving the browser something to display in place of the object tag.
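The DOM walk and div insertion described above might be sketched as follows. This is only an illustration under some assumptions: the function names spectrumUrl and attachViewers are made up, createViewer stands in for whatever code would actually draw the spectrum, and nested object tags are not handled.

```javascript
// Returns the URL given by <param name="data"> inside an object element,
// or null if the element has no such param.
function spectrumUrl(objectElement) {
  var params = objectElement.getElementsByTagName('param');
  for (var i = 0; i < params.length; i++) {
    if (params[i].getAttribute('name') === 'data') {
      return params[i].getAttribute('value');
    }
  }
  return null;
}

// Appends a <div class="webspex"> to every object element that references
// a spectrum and hands it to createViewer (a hypothetical callback).
// Returns the number of viewers attached.
function attachViewers(doc, createViewer) {
  var objects = doc.getElementsByTagName('object');
  var attached = 0;
  for (var i = 0; i < objects.length; i++) {
    var url = spectrumUrl(objects[i]);
    if (url !== null) {
      var div = doc.createElement('div');
      div.className = 'webspex';
      objects[i].appendChild(div);
      createViewer(div, url);
      attached++;
    }
  }
  return attached;
}
```

In a page, attachViewers(document, someViewerFunction) would be called once the document has loaded. Passing the document in as an argument keeps the function easy to exercise outside a browser.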
Advantages of Using This Approach
This approach has several advantages worth mentioning:
It's fully compliant with the HTML 4 specification.
It provides a natural anchor point to attach both the custom data and the visual presentation of that data.
It's pure HTML, requiring minimal mixing in of JavaScript content.
Web spiders can be taught a single method to associate a spectrum with a URL, regardless of how the viewer is implemented.
It's technology-agnostic. This approach lets us implement WebSpex as a Java or Flash applet (or some other plugin technology) just as easily as a pure JavaScript UI. To change our viewer implementation, we just change a JavaScript file.
It allows spectra to be inlined, or placed directly into the HTML. Using a Data URI, we could replace "http://base-url/spectrum.jdx" with something like "data:chemical/x-jcamp-dx;base64,iVBORw0KGgoAA...". This would be important in those situations in which a public URL to a JCAMP file was not feasible and/or desirable. It could also accelerate the rendering of multiple spectra in the same page by eliminating the need to create a separate HTTP request for each file.
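As a rough sketch of the inlining idea, the function below builds a Data URI from the text of a JCAMP-DX file. The names toBase64 and jcampDataUri are hypothetical. The sketch assumes the file is plain ASCII, and it falls back to Buffer when btoa is not available, so it has not been checked against the older browsers discussed here.

```javascript
// Base64-encodes text, using btoa where it exists and Buffer otherwise.
// Only safe for plain ASCII input, which is assumed here.
function toBase64(asciiText) {
  if (typeof btoa === 'function') {
    return btoa(asciiText);
  }
  return Buffer.from(asciiText, 'binary').toString('base64');
}

// Builds a Data URI using the chemical/x-jcamp-dx MIME type from the text.
function jcampDataUri(jcampText) {
  return 'data:chemical/x-jcamp-dx;base64,' + toBase64(jcampText);
}
```

The resulting string could be used as the value of the data param in place of an http URL.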
The method carries an important limitation: if a user has disabled JavaScript, they may see nothing to indicate a problem. We could address this issue by always placing fallback content in the object tag that would then be overwritten by the JavaScript code.
Implementation Detail
This approach relies on Unobtrusive JavaScript techniques to keep JavaScript as separate from HTML as possible. One way to implement such a scheme would be to include a single reference to the relevant JavaScript somewhere in the document, probably within the <head> tag or right after the opening <body> tag:
<script type="text/javascript" src="webspex.js"></script>
The file webspex.js would then execute code to place a function into the document's onLoad queue that would scan for object tags containing JCAMP-DX content and insert the needed viewer.
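One common way to add a function to the onload queue without clobbering handlers that other scripts have installed is to wrap the existing handler. The sketch below assumes nothing about how webspex.js is actually written; addLoadEvent and scanForSpectra are illustrative names only.

```javascript
// Chains fn after whatever onload handler is already installed on win.
function addLoadEvent(win, fn) {
  var previous = win.onload;
  win.onload = function () {
    if (typeof previous === 'function') {
      previous.apply(this, arguments);
    }
    fn.apply(this, arguments);
  };
}

// In webspex.js this might be used as follows (scanForSpectra is hypothetical):
// addLoadEvent(window, scanForSpectra);
```

Handlers run in the order in which they were added, so a page's own onload code would still run before the viewer scan if it was registered first.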
Previous Uses
I'm unaware of any previous applications of this technique, although it seems like something that may have been used before.
Conclusions
Encoding and displaying custom data types in HTML is possible by using the HTML 4 object tag coupled with client-side JavaScript to rewrite the DOM. It offers the potential to create HTML documents that are both human- and machine-readable. Although the approach described here was developed for the special case of spectroscopy data, it could in principle be used for any data type requiring a visual presentation.
Image Credit: mdezemery
Chemistry, The Web, and Netflix
If you've ever rented movies from Netflix, you've probably noticed the information box that pops up when you hover over a movie image. If you just want a quick peek at what a movie is all about, this simple feature can save a great deal of time and effort in mousing around, clicking, and general navigation annoyance. It turns out that chemical compounds have a lot in common with movies in that they both can be referred to through one or more identifiers and they both have a lot of interesting metadata linked to them. This article shows that what works for Netflix can also work for chemistry.
The Problem
Interpreting IUPAC nomenclature and references to compound numbers is a major chore when working with chemistry experimental sections. When paper documents are used, this typically involves flipping pages back and forth many times between the narrative and the experimental section. With Web documents, this is usually either impossible or very inconvenient, and so the PDF is printed to paper.
A Demonstration
The following text is an edited and re-formatted passage taken from the experimental section of a paper published in Beilstein Journal of Organic Chemistry. If you hover over any hyperlink for half a second or more, a balloon will pop up showing you the chemical structure of the substance being referred to. Mousing away from the link hides the balloon.
1-[(1R)-1-(2-{[tert-Butyl(dimethyl)silyl]oxy}ethyl)hexyl]-2-piperidinone (34)
5-Bromopentanoyl chloride (1.84 g, 9.25 mmol) was added to a stirred solution of primary amine 32 (2.00 g, 7.71 mmol) in dry 1,2-dichloroethane (30 cm3), followed by anhydrous NaHCO3 (0.78 g, 9.25 mmol). The reaction mixture was left to stir at room temperature for 16 h. The resulting mixture was filtered through a pad of celite, which was then washed with CH2Cl2. The combined filtrate and washings were then evaporated in vacuo to yield a crude orange oil (4.06 g), which was purified by column chromatography on silica gel with hexane-EtOAc (7:3) as eluent to give the 5-bromo-N-[(1R)-1-(2-{[tert-butyl(dimethyl)silyl]oxy}ethyl)hexyl]pentanamide 33 as an orange oil (2.92 g, 89%).
A portion of the bromoamide 33 (0.20 g, 0.47 mmol) was dissolved in dry THF (3 cm3) containing a suspension of potassium tert-butoxide (587 mg, 0.52 mmol), and the mixture was stirred at room temperature for 25 min before being diluted with EtOAc (10 cm3). The mixture was then washed with saturated aqueous sodium chloride solution (5 x 2 cm3). The combined organic extracts were dried (MgSO4), filtered and evaporated in vacuo to yield a crude yellow oil (0.16 g), which was purified by column chromatography on silica gel with hexane-EtOAc (85:15) as eluent to give 1-[(1R)-1-(2-{[tert-butyl(dimethyl)silyl]oxy}ethyl)hexyl]-2-piperidinone 34 as a pale yellow oil (0.13 g, 81%).
-Michael, Accone, Koning, and Westhuyzen, Beilstein J. Org. Chem. 2008, 4, 5
This demo has been tested on Internet Explorer 6/7, Firefox 2, and Safari 3.
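The hover behavior in the demonstration (show after a half-second delay, hide on mouse-out) can be sketched with a pair of timers. This is not the Balloon.js API, which isn't reproduced here; createHoverController and its show and hide callbacks are hypothetical names, and the timers argument exists only so the logic can be exercised without a browser.

```javascript
// Calls show(target) once the pointer has rested on a link for delayMs,
// and hide(target) as soon as it leaves. A pending show is cancelled if
// the pointer leaves before the delay has elapsed.
function createHoverController(show, hide, delayMs, timers) {
  timers = timers || {
    set: function (fn, ms) { return setTimeout(fn, ms); },
    clear: function (id) { clearTimeout(id); }
  };
  var pending = null;
  return {
    enter: function (target) {
      if (pending !== null) {
        timers.clear(pending);
      }
      pending = timers.set(function () {
        pending = null;
        show(target);
      }, delayMs);
    },
    leave: function (target) {
      if (pending !== null) {
        timers.clear(pending);
        pending = null;
      }
      hide(target);
    }
  };
}
```

The enter and leave methods would be wired to the mouseover and mouseout events of each compound link, with a delay of 500 to match the demonstration.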
Technologies
Although this demonstration is built on numerous Web technologies, two are at the top of the stack: the vector graphics rendering engine of ChemWriter and the open source JavaScript library Balloon.js.
Chemical structures are displayed as lightweight Adobe Flash SWF files, as described in a previous Depth-First article. Software based on ChemWriter converts a molecular connection table into vector graphics commands for the Flash runtime with the help of the open source Transform SWF library.
Playing to the Web's Strengths
The Web is a new medium with a completely different set of rules compared to print media. One of its biggest strengths is interactivity: the ability to see something of interest and to immediately be able to find out more about it. One of its biggest weaknesses, even today, is technology standards. It's not enough to create interactivity; that interactivity must also fit within the technical constraints imposed by a medium that is still a work in progress.
As journal publishers and others grapple with how to approach the inevitable transition to purely Web-based scientific communication, it's important to keep both the strengths and limitations of the Web in mind. To date, nearly all attempts to create Web-based versions of chemistry journals have simply tried to duplicate the form of the print medium. This has resulted, if anything, in an even greater reliance on paper, with valuable information being used well below its full potential.
Conclusions
This article has demonstrated a simple labor-saving technique in which chemical structures can be visualized by hovering the cursor over specially-designated chemical identifiers. There's quite a bit more that can be done with chemical vector graphics, chemical information, and Web technologies commonly used in consumer services like Netflix. Future articles will discuss some possibilities.
Adobe Flash for Cheminformatics: Fast, Scalable, and Attractive 2D Depiction of Chemical Structures with Vector Graphics
The previous article in this series discussed the use of vector graphics markup languages for cheminformatics, in particular for the display of 2D chemical structures. Although vector graphics are well-suited for creating responsive and appealing cheminformatics Web applications, the lack of universal native browser support makes both Scalable Vector Graphics (SVG) and its cousin Vector Markup Language (VML) unattractive at this time. This article highlights Adobe Flash as a 2D chemical structure renderer for Web applications, and features a fully-functional proof of concept based on the ChemWriter rendering engine.
About Adobe Flash
Although Adobe Flash is practically an industry unto itself today, at its core, Flash is a lightweight vector graphics renderer. Introduced in 1996, the Flash Player can be found on millions of Internet-enabled devices today. According to a study by Adobe, the Flash Player was running on nearly 99% of Internet-enabled desktops as of March 2008. The player has also found its way onto a host of handheld devices and phones.
Many technologies have been layered on top of the Flash Player. One of the first was the ActionScript scripting language. More recently, Adobe has introduced Flex, a full-fledged application development framework.
Unlike SVG and other vector graphics systems, Flash is ready today, proven, and about as close to universal as is possible on the Web. If you want to do vector graphics on the Web with the most convenient user and developer experience, Flash is your tool.
But what can Flash do for cheminformatics?
A Demonstration
The table below is composed of twelve cells, each of which displays a chemical structure through the Flash Player.
| zoom | zoom | zoom |
| zoom | zoom | zoom |
| zoom | zoom | zoom |
| zoom | zoom | zoom |
Several points are worth mentioning:
Each of the structures can be zoomed by clicking on its 'zoom' link.
Each cell contains a lightweight embedded SWF ("Shockwave Flash") file, and the zoomed view displays exactly the same file. No matter how the SWF file is resized, it will always be scaled proportionally to fit its smallest dimension and centered.
The size of each SWF file ranges from a low of 563 bytes to a high of 8.5 KB, with an average of around 1.5 KB. The larger the molecule, the more space is required. A comparable PNG with a resolution of 150x150 pixels would require about 6-8 KB on average for each structure.
Each image was generated from a molfile using a development version of the ChemWriter rendering engine via the open source Transform SWF Java toolkit.
SWF files, unlike applets, are highly optimized for multiple instance display on all major platforms and browsers. In every case, startup will be nearly instantaneous and scrolling will be smooth. The performance of Flash should be at least as good as, if not better than, raster images.
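The scale-and-center behavior noted in the list above comes down to a small calculation: pick the largest uniform scale at which the drawing still fits its container, then split the leftover space evenly. The sketch below illustrates that arithmetic only; fitAndCenter is a hypothetical name and this is not how the Flash Player itself is implemented.

```javascript
// Returns the uniform scale factor and top-left offset that fit a drawing
// of contentW x contentH inside a box of boxW x boxH, centered.
function fitAndCenter(contentW, contentH, boxW, boxH) {
  var scale = Math.min(boxW / contentW, boxH / contentH);
  return {
    scale: scale,
    x: (boxW - contentW * scale) / 2,
    y: (boxH - contentH * scale) / 2
  };
}
```

For instance, a 100x50 drawing placed in a 300x300 cell is limited by its width: it is scaled by a factor of 3 and shifted down by 75 pixels so that it sits in the middle of the cell.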
The Right Tool for the Job (is Probably not a Raster Image)
One of the first challenges developers of cheminformatics Web applications are faced with is how to render 2D chemical structures. For an overview of the technologies now in use, see the previous article in this series. Each option has its own set of trade-offs.
The most widely-used 2D structure rendering option, raster images, is both inflexible and inefficient. Unlike a vector image, a raster image by definition has only one resolution, which is fixed at creation time. If image dimensions need to change, then all structures must be re-imaged. Given the size of many of today's chemistry databases, such a system-wide re-imaging of structures can involve a non-trivial amount of processor power and bandwidth.
To compensate, many sites store relatively large images, say 300x300 pixels, and then use the HTML <img> tag to shrink them as needed. But this creates problems of its own: both storage and bandwidth requirements are far larger than they need to be, resulting in the need for more powerful server hardware and poorer application scalability. And then there are the application's users, who must wait through a 30KB or higher download for each 2D image.
A significant number of structures in any compound collection will be so large that even a 300x300 pixel image will be insufficient to render the necessary detail. For example, a recent Depth-First article discussed a vector graphics solution to this problem within the context of Chempedia, the free chemical encyclopedia. Vector graphics simply eliminate this issue.
Many cheminformatics applications would benefit from being able to show 50 or more structures at a time, with each structure having a zoom view for closer inspection. To a non-chemist, this might seem unnecessary. But for today's chemists dealing with large chemical catalogs and high-throughput screens, it's not only possible, but a routine part of the practice of chemistry. The raster image approach makes it extremely difficult to meet this important need on the Web. Vector graphics, possibly delivered through the Flash Player, offer a much simpler and more efficient way to do it.
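To put rough numbers on a 50-structure page, the sketch below multiplies out the per-structure sizes quoted in this article: about 1.5 KB for an average SWF in the demonstration, 6-8 KB for a 150x150 PNG (7 KB is used as a midpoint, which is an assumption), and at least 30 KB for a 300x300 image. Real collections will vary, so treat the results as back-of-the-envelope estimates.

```javascript
// Total download size in KB for a page showing structureCount structures.
function pageWeightKB(structureCount, averageKBPerStructure) {
  return structureCount * averageKBPerStructure;
}

var swfPage = pageWeightKB(50, 1.5);     // 75 KB
var smallPngPage = pageWeightKB(50, 7);  // 350 KB
var largePngPage = pageWeightKB(50, 30); // 1500 KB
```

On these figures, the vector version of the page is about a twentieth the size of the version built from shrunken 300x300 images.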
2D chemical structures are vectorial in nature; using raster images to depict them is in most cases the more costly and lower quality option.
Summary
Vector graphics are a near-perfect match for the job of depicting 2D chemical structures on the Web. Although there are many vector graphics platforms to choose from, the Flash Player is by far the most universal option. This article has demonstrated a working example of multiple 2D chemical structures rendered as lightweight vector images via the Adobe Flash Player, the first and only such demonstration of which I'm aware.
The key technologies behind this demonstration are the ChemWriter rendering engine and the open source Flash developer toolkits available from Flagstone Software. If you're interested in learning more about how vector graphics and Flash can improve both the user and developer experience in your cheminformatics Web applications, I'd be happy to hear from you.
The Other Vector Graphics Markup Language
Scalable Vector Graphics (SVG) is a technology that enables the creation and publication of high quality images that can be scaled to any resolution. SVG is ideally suited for the Web, and all major browsers now support it - except Internet Explorer (IE). This poses a problem: vector graphics are by far superior to raster images for many applications, but the lack of native IE support makes SVG a non-starter for most developers. This article discusses a little known IE capability that might provide a solution.
Oh Brother, Where Art Thou?
Way back in 1998 a group of companies including Microsoft submitted a proposal for a vector graphics language called Vector Markup Language (VML) to the W3C. This set in motion a series of events that culminated in the development of what we know today as SVG. But while use of SVG quickly expanded, VML remained almost exclusively limited to Microsoft products.
Soon after, IE 5 introduced the ability to decode and display VML - a capability that exists today in IE 7.
SVG and VML are two vector graphics languages, each designed to do essentially the same thing. For basic shape rendering, their similarities outweigh their differences.
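To illustrate how close the two languages are for basic shapes, here is a sketch that emits the same line segment (say, a single bond) in each. The function names are hypothetical. The SVG fragment would need to sit inside an svg element, and the VML fragment assumes the v prefix is bound to the VML namespace and that VML rendering is enabled in the page; neither has been checked against a particular browser here.

```javascript
// One line segment as an SVG element.
function svgLine(x1, y1, x2, y2) {
  return '<line x1="' + x1 + '" y1="' + y1 +
         '" x2="' + x2 + '" y2="' + y2 + '" stroke="black"/>';
}

// The same segment as a VML element (assumes the v: namespace prefix).
function vmlLine(x1, y1, x2, y2) {
  return '<v:line from="' + x1 + ',' + y1 +
         '" to="' + x2 + ',' + y2 + '" strokecolor="black"/>';
}
```

The coordinates are the same in both cases; only the element and attribute names differ.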
About VML
To understand why VML never caught on, you need look no further than the documentation - or the lack thereof. The original VML submission is a decade old and has not been updated.
For the most part, VML documentation is scattered and incomplete. Nevertheless, there are some useful resources. Here, in no particular order, are some of them:
Microsoft Documentation. Authoritative, but lacking in examples.
VML, SVG, and Canvas. Discusses some of the differences between VML and SVG.
Cum mortuis in lingua mortua. Good history of VML.
Examples of the Vector Markup Language. There are far too few of this kind of site.
VectorConverter. A PHP library that uses XSLT to interconvert SVG and VML. Unfortunately, the stylesheet didn't work in my hands under Xalan or Ruby/xslt - and I know almost nothing about PHP.
Julie Nabong's Master's Thesis. Julie wrote and documented an SVG/VML XSLT for interconverting the two languages.
JSDrawing: Interconverting Vector Languages on the Fly
One VML resource deserves special note - JSDrawing. This library seems to be capable of generating Flash, VML, or SVG from a common vector graphics language precursor. I'm not sure how practical this approach would be, but it does provide some food for thought.
Why It Matters
Chemistry is in a good position to take advantage of vector graphics. Chemical structures, being closely based on graph theoretical constructs, would seem to be a perfect match for vector languages like SVG and VML, especially on the Web. So far it hasn't happened, primarily for the reasons outlined above.
Currently, if you want to display 2D chemical structures in Web pages you're faced with some tradeoffs:
Raster Images. This is by far the most common practice. This option unfortunately makes it very difficult to redesign the layout of a site or support multiple views of the same structure, especially with databases of one million plus compounds becoming commonplace. Even if images are never regenerated, they need to be stored and retrieved, adding to cost and complexity. Images could be dynamically generated, but at the expense of substantial memory and CPU requirements.
Applets. This is the approach currently taken by Chempedia, the free chemical encyclopedia, and gives complete flexibility in page layout and structure appearance. Changing the dimensions of a structure is as simple as changing the size of a div. Unfortunately, some browsers handle multiple applets better than others. Firefox on OS X is very slow at refreshing applets while scrolling, and IE requires a JavaScript trick to remove the 'click to activate' message that causes some flashing when in progress.
Vector Graphics Through Plugins. There are at least two SVG plugins for IE (one by Adobe and the other from Examotion). Will all of your users be able to find them, and will they be able to install them? Unless the answer to both questions is 'yes', this option is probably best left as a last resort. Another option is to render SVG on IE through the Flash or Silverlight plugins. But as far as I can tell, neither approach is ready for prime time.
Native Vector Graphics. Available on all major browsers including Internet Explorer 5/6/7, Firefox 1/2, and Opera 8/9. Combines the flexibility, lossless depiction, inlineability and low data storage/retrieval overhead of applets with the speed of images. Interactivity and other special effects can be achieved through DOM manipulation. All of this depends, of course, on the vector graphics format being compatible with the rendering engine.
In some circumstances, serving VML to IE clients and SVG to everyone else would be a viable option - if it were possible to generate VML.
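Choosing which format to serve could be as simple as inspecting the client's User-Agent string. The sketch below is one crude way to do it and makes several assumptions: vectorFormatFor is a hypothetical name, the pattern only recognizes IE 5 through 7, and Opera is excluded explicitly because some Opera versions include an MSIE token in their User-Agent string. User-Agent sniffing is brittle, so this is a starting point rather than a recommendation.

```javascript
// Returns 'vml' for Internet Explorer 5-7 and 'svg' for everything else.
function vectorFormatFor(userAgent) {
  var looksLikeIE = /MSIE [5-7]\./.test(userAgent);
  var isOpera = /Opera/.test(userAgent);
  return (looksLikeIE && !isOpera) ? 'vml' : 'svg';
}
```

The same test could run on the server against the request's User-Agent header, or in the browser against navigator.userAgent.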
Conclusions
Vector graphics have a lot to offer chemistry, especially when used with Web applications. The combination of VML and SVG offers a proven technology platform that's ready today, but only if you can generate VML.

