The Structure Editor: (Forgotten) Link Between Chemistry and Cheminformatics
Of the many cheminformatics technologies developed over the last fifty years, which ones can no bench chemist function without today? There are many possibilities to choose from: molecular databases, fast substructure searching, the SD/Molfile format, SMILES and other line notations, a variety of molecular descriptors and in silico property calculations, and QSAR, to name a few. Throw in molecular modeling and 3-D visualization if you'd like.
I would propose that the lowly structure editor is more important to the daily activities of most chemists than any of these other technologies combined. Try this experiment: ask some practicing bench chemists what application you absolutely couldn't take away from them. More likely than not, the answer will either be a structure editor (ChemDraw) or something that intensively uses one (SciFinder).
No other kind of software gets as much "face time" with chemists as a 2D structure editor. This means that the user experience with the editor plays a disproportionate role in forming opinions about the application as a whole. In other words: if you make your editor suck a little bit less, your whole application will seem to suck a lot less.
Structure editors play a pivotal role in the end user experience. Strangely, understanding what makes a good structure editor, and putting this knowledge into practice, has not been given the time and effort it deserves. There have been plenty of attempts to build better editors, but very little analysis of the user experience with them.
In the series of articles to follow, I'll describe the design of a new 2D structure editor. Codenamed "Firefly", its purpose is to make the drawing of 2D chemical structures as intuitive and easy as it can be, and to do so seamlessly on the cheminformatics platform of the future - the Internet.
Twist and Shout
-Vivekananda Vrudhula, Bireshwar Dasgupta, Jingfang Qian-Cutrone, Edward Kozlowski, Christopher Boissard, Steven Dworetzky, Dedong Wu, Qi Gao, Roy Kimura, Valentin Gribkoff, and John Starrett, Jr., J. Med. Chem.
Yet another case of axial chirality in the recent literature comes in the form of ion channel openers described by Vrudhula et al. A previous Depth-First article illustrated how most popular cheminformatics tools are incapable of distinguishing axially chiral enantiomers such as those shown above. If your application suddenly needed to do so, could it cope?
FlexMol is an XML language designed to solve the molecular representation problems of today and, hopefully, those of tomorrow. Some of its capabilities have already been introduced:
- planar chirality
- square-planar isomerism
- axially chiral amides
- axially chiral biaryls
- tetrahedral chirality
- alkene geometrical isomerism
- multiatom bonding in metallocenes
Given the two previous articles on axial chirality, it should be clear how to represent the two enantiomers of the structure shown above using FlexMol. What is far from clear at this point is how to bring this capability to chemists. For example, no 2-D structure editor I'm aware of can systematically encode axial chirality. Likewise, no 2-D rendering toolkit can draw it. FlexMol and languages like it are important first steps toward solving these problems, but they are by no means the last.
Postscript: the two structures depicted above are actually identical! You can prove this to yourself by verifying that the phenol oxygen points into the plane of your screen and the chlorine atom points out of the plane in both structures. Clearly the authors intended for the phenyl ring to be flipped by 180 degrees, but they hand-placed the wedges on the wrong side of the benzene ring. The fact that this error appears in none other than J. Med. Chem. further underscores the need for tools that understand axial chirality.
Look Ma, No Applets!

The state of the art in structure editors for chemical Web services is the Java applet. Although closed-source editors have long dominated this field, open source editors may be a viable option. Java applets are great from a developer's perspective. But some end users and IT support staff avoid them because they require a Java plug-in of some kind and have long startup times.
Through David Bradley's Sciencebase, I came across a non-Java solution to the structure editor problem. The software is called WebME. WebME looks and feels similar to Java Molecular Editor. It loads quickly and provides a clean, inviting user interface. It should work in any modern browser. Most interesting of all, WebME works without a browser plugin of any kind.
The magic behind WebME is AJAX, which has been summarized by Paul Graham as "Javascript now works". By asynchronously interacting with a Web server as user interface events occur, WebME is able to cram a lot of functionality into a relatively small deployment package. The user interface is written in a mixture of HTML and JavaScript, thus eliminating the need for Java.
Although WebME's use of AJAX is innovative, other non-Java solutions to the structure editor problem have also been implemented. For example, PubChem have developed an editor in JavaScript/HTML for use with their popular service.
Despite its advantages, the AJAX approach does involve some trade-offs. For example, WebME is not nearly as responsive as, say, JME. I would imagine that unusually high network latency could further erode WebME's responsiveness. Furthermore, the subtle visual cues that make JME a productive tool, such as highlighting the node or edge the cursor is about to edit, are non-existent. It's unclear whether this is a limitation of the particular version of WebME I used or of the underlying technology.
AJAX is a promising new technology that may well have a place in producing ergonomic user interfaces for chemistry Web services. On the other hand, it wasn't too long ago that JavaScript "didn't work", was loathed, or simply ignored altogether. It may well be that Java applets undergo a revival similar to that of JavaScript, perhaps triggered by built-in support for applets made possible through an open source Java implementation. Regardless of how Java applet technology evolves, WebME's approach is worth serious consideration.
The Automatic Encoding of Chemical Structures
No advantages accrue to the chemist from knowing how to generate and how to interpret a chemical code. Codes are needed only for the mechanical manipulation of chemical structures. Clearly then, if the coding of chemical compounds could be accomplished automatically, this automatic conversion would relieve the chemist of a considerable burden.
-Alfred Feldman et al. J. Chem. Doc. 1963, 3, 187-189
The success of any new molecular encoding method relies, in part, on its invisibility to its prospective users. After all, why should anyone bother to learn yet another molecular language, especially one designed with computers in mind? Yet these encoding systems are critical in connecting chemical information and information technologies. How can any new encoding method be made part of existing workflows invisibly?
Feldman and his group at Walter Reed faced a similar problem in the early 1960s. American Cyanamid had been using a modified typewriter to prepare attractive 2-D chemical structures, purely for human consumption. Feldman's idea was to modify the typewriter design still further such that a computer-usable molecular code would be recorded as a byproduct of preparing the structure diagram. The typist could remain blissfully unaware of the mechanical magic beneath, and get on with his or her job. The idea was later adapted by Shell to produce a more cost-effective device.
The structure editor has long since replaced the chemical typewriter. But the same forces are at work with today's new molecular encoding methods, especially InChI. To what extent are scientists themselves being given the tools to leverage these new technologies, without having to become aware of them? What will these new tools look like and how will they differ from what came before?
Four Free 2-D Structure Editors for Web Applications
The increasing trend toward hosting free chemical databases and other services on the web brings with it the need for a free, ergonomic, capable, and fast 2-D structure editor. For years, the options were rather limited. However, this situation has started to change. Four web-enabled editors are discussed here, with an emphasis on the steps needed to deploy them within a webpage and retrieve a text-based molecular representation. A sample webpage is provided for each editor that allows a user to draw a molecule and view the corresponding output in a browser.
Building a Web Application: The Key Players
Consider the case of John, who would like to know the TPSA of caffeine. John finds a new website, http://tpsacalculate.com, that calculates the TPSA of any molecule. This site presents John with a 2-D structure editor applet and a "Submit" button. John uses the applet to draw caffeine and then presses the button. After one second, John sees a new page showing the structure of caffeine and its TPSA descriptor.
By pressing the "Submit" button, John sets in motion a series of transactions between the editor applet, the webpage, and the server. First, the webpage extracts a molfile representation of caffeine from the editor using JavaScript. This molfile is then submitted to the server using an HTTP POST request. After processing the molfile, the server returns a page containing the TPSA that John requested.
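The POST step in this exchange can be sketched in JavaScript as follows. This is only an illustration under stated assumptions: the form field name `molfile` and the helper `buildMolfilePost` are hypothetical, and the article does not say how the server expects to receive the structure.

```javascript
// Hypothetical field name; the article does not specify one.
const MOLFILE_FIELD = 'molfile';

// Describes the HTTP POST that would carry the drawn structure to the server.
// Nothing is sent here; the function only assembles the request.
function buildMolfilePost(url, molfile) {
  const params = new URLSearchParams();
  params.append(MOLFILE_FIELD, molfile); // URL-encodes spaces and newlines
  return {
    url: url,
    method: 'POST',
    headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
    body: params.toString()
  };
}

// Example with a stand-in string, not a real molfile for caffeine.
const request = buildMolfilePost('http://tpsacalculate.com', 'line one\nline two');
console.log(request.method, request.url);
console.log(request.body);
```

Encoding matters here because a molfile is a multi-line, whitespace-sensitive format; the server would need to decode the field before parsing it.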
Several variations on this pattern are conceivable, each involving varying levels of involvement by the browser, the applet, and the server. Advanced use of JavaScript can lead to elimination of the applet entirely, an approach taken by the PubChem structure search. Even more interesting is the use of AJAX, which would eliminate both the applet and the page refresh step, setting the stage for highly interactive chemical content using only a browser and JavaScript. Although no AJAX-powered 2-D structure editors currently exist, this situation can be expected to change in the future.
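A rough sketch of the AJAX variation might look like the following. The endpoint `/tpsa`, the field name `molfile`, and the function names are all assumptions made for illustration. The transport is passed in as a parameter so the example can run without a browser or a server; in a page, `xhrTransport` would be the one to use.

```javascript
// Hypothetical endpoint; not taken from the article.
const ENDPOINT = '/tpsa';

// Browser transport built on XMLHttpRequest. Defined for reference only;
// it is not called in this sketch because it needs a browser and a server.
function xhrTransport(url, body, onDone) {
  const xhr = new XMLHttpRequest();
  xhr.open('POST', url, true); // third argument true = asynchronous
  xhr.setRequestHeader('Content-Type', 'application/x-www-form-urlencoded');
  xhr.onreadystatechange = function () {
    if (xhr.readyState === 4 && xhr.status === 200) {
      onDone(xhr.responseText);
    }
  };
  xhr.send(body);
}

// Posts the molfile and hands the server's reply to showResult.
// The page itself is never reloaded.
function submitStructure(molfile, transport, showResult) {
  const body = 'molfile=' + encodeURIComponent(molfile);
  transport(ENDPOINT, body, showResult);
}

// Stand-in transport that echoes a placeholder reply.
function fakeTransport(url, body, onDone) {
  onDone('reply for ' + body.length + ' characters sent to ' + url);
}

submitStructure('placeholder molfile text', fakeTransport, function (text) {
  console.log(text);
});
```

In a real page, `showResult` would write the reply into an element of the current document rather than logging it.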
Obtaining Text Output From a 2-D Editor
Extracting text-based output requires the same boilerplate code for all four editors. This code consists of four main components: (1) an editor applet into which the user draws a structure; (2) a JavaScript function that collects the output from the applet; (3) an HTML text field into which the JavaScript function inserts the output; and (4) an HTML form containing a button that, when pressed, sets the process in motion.
These commonalities make it possible to factor out editor-specific code and logic. The HTML below gives an example of what one basic template looks like.
<html>
<head><title>Molfile Test</title></head>
<body>
<!-- JavaScript -->
<script type="text/javascript">
function writeOutput()
{
document.form.output.value = document.applet.OUTPUT_METHOD();
}
</script>
<!-- Applet -->
<applet code="APPLET_CLASS" name="applet"
archive="APPLET_JARFILE.jar"
width=510 height=360>
Please enable Java and JavaScript on your machine.
</applet>
<br />
<!-- Form -->
<form method="post" name="form">
<input type="button"
value="Get Output"
onclick="writeOutput()" />
<br /><br />
<textarea name="output" rows=20 cols=80></textarea>
</form>
</body>
</html>
The above HTML contains three editor-specific pieces of information: (1) APPLET_JARFILE; (2) APPLET_CLASS; and (3) OUTPUT_METHOD. APPLET_JARFILE is the name of the Java archive file (*.jar) containing the applet code. This name is chosen by the developer when saving the archive to the web server. APPLET_CLASS is the fully-qualified class name of the editor applet. OUTPUT_METHOD is the name of the applet method that returns output. These last two pieces of editor-specific information are listed in the summary that follows.
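One way to fill in the three placeholders is a small substitution helper, sketched here in JavaScript. The helper name `fillTemplate` and the jar file base name `JME` are assumptions; the class name and output method for JME come from the summary in this article. Only the editor-specific lines of the template are reproduced.

```javascript
// The editor-specific lines of the page template, with placeholders intact.
const TEMPLATE = [
  'document.form.output.value = document.applet.OUTPUT_METHOD();',
  '<applet code="APPLET_CLASS" name="applet"',
  'archive="APPLET_JARFILE.jar"',
  'width=510 height=360>'
].join('\n');

// Replaces every occurrence of each placeholder with the editor's value.
function fillTemplate(template, editor) {
  return template
    .split('APPLET_JARFILE').join(editor.jarFile)
    .split('APPLET_CLASS').join(editor.appletClass)
    .split('OUTPUT_METHOD').join(editor.outputMethod);
}

// JME settings; the jar file base name is a guess at what a developer might use.
const jme = { jarFile: 'JME', appletClass: 'JME', outputMethod: 'molFile' };

console.log(fillTemplate(TEMPLATE, jme));
```

The same template could then serve all four editors by swapping in a different settings object.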
Java Molecular Editor (JME)
Homepage: Molinspiration
License: Free for noncommercial development.
Source Code: N/A
Size: 39 Kb
APPLET_CLASS: JME
OUTPUT_METHOD: molFile(); smiles(); nonisomericSmiles(); jmeFile();
JChemPaint
Homepage: CDK
License: GPL
Source Code: SourceForge
Size: up to 6.2 Mb
APPLET_CLASS: org.openscience.cdk.applications.jchempaint.applet.JChemPaintEditorApplet
OUTPUT_METHOD: getMolFile();
Notes: Although getSmiles() and getSmilesChiral() methods are available, neither produced the desired output during this test (version 2.1.5). The applet consists of 35 jar files, only some of which are necessary for minimal functionality.
JMolDraw
Homepage: SourceForge
License: GPL
Source Code: SourceForge
Size: up to 1.4 Mb
APPLET_CLASS: org.jmd.editor.main.JMolDraw
OUTPUT_METHOD: getContentsAsMolfile(); getContentsAsJMEString()
Notes: In contrast to the other three editors, there is no option to display this applet in the browser itself; it must be rendered as a separate window. In addition, this editor requires that several configuration and resource files be accessible on the server. Molfile output uses V3000 ctabs. Although V2000 ctabs are supported, the only way to activate this functionality is to modify the source code.
MCDL
Homepage: SourceForge
License: Public Domain
Source Code: SourceForge
Size: 256 Kb
APPLET_CLASS: mcdl.MCDLEditor
OUTPUT_METHOD: getMDCL()
Notes: This editor only supports output in Modular Chemical Descriptor Language format.
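The summary above could be collected into a lookup table so that a page can pick the right applet class and output method for a given editor. The sketch below is one possible arrangement: the format labels (`molfile`, `smiles`, `jme`, `mcdl`) and the function name `outputMethodFor` are hypothetical, while the class and method names are copied as listed in this article.

```javascript
// Editor settings as listed in the summary; format labels are illustrative.
const EDITORS = {
  JME: {
    appletClass: 'JME',
    outputMethods: { molfile: 'molFile', smiles: 'smiles', jme: 'jmeFile' }
  },
  JChemPaint: {
    appletClass: 'org.openscience.cdk.applications.jchempaint.applet.JChemPaintEditorApplet',
    outputMethods: { molfile: 'getMolFile' }
  },
  JMolDraw: {
    appletClass: 'org.jmd.editor.main.JMolDraw',
    // Molfile output from this editor uses V3000 ctabs.
    outputMethods: { molfile: 'getContentsAsMolfile', jme: 'getContentsAsJMEString' }
  },
  MCDL: {
    appletClass: 'mcdl.MCDLEditor',
    // Method name reproduced exactly as printed in the summary.
    outputMethods: { mcdl: 'getMDCL' }
  }
};

// Returns the applet method name for a format, or null if unsupported.
function outputMethodFor(editorName, format) {
  const has = Object.prototype.hasOwnProperty;
  if (!has.call(EDITORS, editorName)) return null;
  const methods = EDITORS[editorName].outputMethods;
  return has.call(methods, format) ? methods[format] : null;
}

console.log(outputMethodFor('JChemPaint', 'molfile'));
console.log(outputMethodFor('MCDL', 'molfile'));
```

Returning null for an unsupported format lets the calling page fall back to another editor or hide the corresponding option.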
Conclusions
This review has only scratched the surface of what is possible with these editors. For example, all accept input as well as provide output. As a result, they can be used to render 2-D molecular images, with more or less Java coding. Both MCDL and JME are especially attractive from the developer's perspective because each is distributed as a single jar file with a small footprint.
Although numerous 2-D structure editors are available, those reviewed here meet the minimum requirements for the development of free chemical web applications: they work on nearly all computing platforms thanks to Java, and they are themselves free.


