Reading (and Rendering) ChemDraw CDX Files in JavaScript
About four years ago I speculated that JavaScript might be capable of filling some important holes in chemistry software. Since then, a number of events have moved JavaScript onto center stage as a programming language: the release of Chrome, with its huge JavaScript performance gains and rocket trajectory to overtake IE as the most popular browser; the runaway success of iOS as a computing platform free of browser plugins; the renaissance of server-side JavaScript through Node.js and other frameworks; and a dazzling array of languages like CoffeeScript that compile to JavaScript.
JavaScript for cheminformatics has started to look a lot less improbable as well. In late 2010, my company began distributing the first copies of its chemical structure editor ChemWriter - a complete rewrite of the original Java applet in JavaScript. Following this, we released the plugin- and server-free chemical structure rendering package ChemVector. Other companies now offer products in this space as well, with indications that big players are now starting to take notice.
This article highlights a small part of some work I'm now doing that demonstrates the readiness of JavaScript to solve difficult cheminformatics problems.
Demo: Read a ChemDraw File
Given an arbitrary binary ChemDraw CDX file, it's now possible to print out the encoded object graph in a JavaScript console. For example, this file generates this output:
Document 0 [0x0000]
Creation Program:ChemDraw 12.0.3.1216
Name:benzene.cdx
Font Table:Windows fonts: Font[id:20, characterSet: 10000, name: Times],Font[id:21, characterSet: 10000, name: Helvetica]
Bounding Box:(10444645t, 6456556r, 14452480b, 9927443l)
Color Table:Color[red: 65535, green: 65535, blue: 65535],Color[red: 0, green: 0, blue: 0],Color[red: 65535, green: 0, blue: 0],Color[red: 65535, green: 65535, blue: 0],Color[red: 0, green: 65535, blue: 0],Color[red: 0, green: 65535, blue: 65535],Color[red: 0, green: 0, blue: 65535],Color[red: 65535, green: 0, blue: 65535]
Foreground Color:0
Background Color:1
Atom Show Query:true
Atom Show Stereo:false
Atom Show Atom Number:false
Atom Show Terminal Carbon Labels:true
Atom Show Non-Terminal Carbon Labels:true
Atom Hide Implicit Hydrogens:true
Atom Show Enhanced Stereo:true
Bond Show Query:true
Bond Show Stereo:false
Bond Show Reaction:true
Interpret Chemically:true
Mac Print Info:0,3,0,0,2,208,2,208,0,0,0,0,29,254,22,246,255,134,255,136,30,118,23,112,3,103,5,40,3,252,0,2,0,0,0,72,0,72,0,0,0,0,2,216,2,40,0,1,0,0,0,100,0,0,0,1,0,3,3,3,0,0,0,1,39,15,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,104,0,0,25,1,144,0,0,0,0,0,32,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
Print Margins:(2359296t, 2359296r, 2359296b, 2359296l)
Chain Angle:7864320
Bond Spacing:120
Bond Length:1966080
Bold Width:262144
Line Width:65536
Margin Width:131072
Hash Spacing:176896
Label Style:Font Style [id: 21 type: 96 size: 200 color: 3]
Caption Style:Font Style [id: 20 type: 0 size: 240 color: 3]
Caption Justification:0
Fractional Widths:true
Label Justification:5
? [828]: unknown
? [829]: unknown
? [82a]: unknown
Window is Zoomed:true
Window Position:(3997696, 4915200)
Window Size:(38731776, 50528256)
Page 43 [0x002b]
Bounding Box:(0t, 0r, 47179366b, 35389440l)
Width Pages:1
Height Pages:1
Header Position:2359296
Footer Position:2359296
Print Trim Marks:true
Fragment 27 [0x001b]
Bounding Box:(10444645t, 6456556r, 14452480b, 9927443l)
Node 24 [0x0018]
Z Order:1
Position 2D:(6489324, 11465523)
Atom CIP Stereochemistry:1
Node End
Node 26 [0x001a]
Z Order:3
Position 2D:(6489324, 13431603)
Atom CIP Stereochemistry:1
Node End
Node 28 [0x001c]
Z Order:5
Position 2D:(8192000, 14414643)
Atom CIP Stereochemistry:1
Node End
Node 30 [0x001e]
Z Order:7
Position 2D:(9894675, 13431603)
Atom CIP Stereochemistry:1
Node End
Node 32 [0x0020]
Z Order:9
Position 2D:(9894675, 11465523)
Atom CIP Stereochemistry:1
Node End
Node 34 [0x0022]
Z Order:11
Position 2D:(8192000, 10482483)
Atom CIP Stereochemistry:1
Node End
Bond 36 [0x0024]
Z Order:13
Bond Order:2
Bond Double Position:1
Bond Begin:24
Bond End:26
Bond Bond Ordering:41,0,0,37
Bond End
Bond 37 [0x0025]
Z Order:14
Bond Begin:26
Bond End:28
Bond CIP Stereochemistry:1
Bond End
Bond 38 [0x0026]
Z Order:15
Bond Order:2
Bond Double Position:1
Bond Begin:28
Bond End:30
Bond Bond Ordering:37,0,0,39
Bond End
Bond 39 [0x0027]
Z Order:16
Bond Begin:30
Bond End:32
Bond CIP Stereochemistry:1
Bond End
Bond 40 [0x0028]
Z Order:17
Bond Order:2
Bond Double Position:1
Bond Begin:32
Bond End:34
Bond Bond Ordering:39,0,0,41
Bond End
Bond 41 [0x0029]
Z Order:18
Bond Begin:34
Bond End:24
Bond CIP Stereochemistry:1
Bond End
Fragment End
Page End
Document End
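The large integers in the listing above are easier to read once they are scaled. Here is a minimal sketch, assuming that CDX stores coordinates, lengths, and angles as 32-bit fixed-point integers in units of 1/65536 of a point (or of a degree). That is my reading of the format description, and fromFixed is a name made up for illustration, not part of any library.

```javascript
// Assumption: CDX coordinates, lengths, and angles are integers in
// units of 1/65536 (points for lengths, degrees for angles).
const FIXED_ONE = 65536;

// Hypothetical helper: convert a raw fixed-point value to a plain number.
function fromFixed(raw) {
  return raw / FIXED_ONE;
}

// Raw values copied from the listing above.
const bondLength = fromFixed(1966080); // "Bond Length"
const lineWidth = fromFixed(65536);    // "Line Width"
const chainAngle = fromFixed(7864320); // "Chain Angle"

console.log(bondLength, lineWidth, chainAngle); // prints 30 1 120
```

Under this assumption the benzene file uses a 30 pt bond length, a 1 pt line width, and a 120 degree chain angle, which are plausible drawing settings.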
For background, see previous D-F articles on the ChemDraw CDX file format and the CDX HexDumper utility.
Being able to visualize the contents of arbitrary CDX files this way is an important aid to debugging and a prerequisite to developing full-featured graphical display capability.
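To give a feel for what producing such a listing involves, here is a rough sketch of a tag walker. It is not the code behind the demo. It assumes the tagged layout as I understand it from the CDX format description: 16-bit little-endian tags, object tags with the high bit set followed by a 32-bit id, property tags followed by a 16-bit length and that many data bytes, and a zero tag closing the current object. The function name walkTags, the start parameter (for skipping a file header), and the tag values in the sample bytes are illustrative assumptions.

```javascript
// Sketch: walk a CDX-style stream of tagged objects and properties.
// Assumed layout (verify against the CDX format description):
//   tag with high bit set -> object start, followed by a uint32 id
//   tag 0x0000            -> end of the current object
//   any other tag         -> property, followed by uint16 length + data
// All integers are read as little-endian. Truncated input is not handled.
function walkTags(bytes, start = 0) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const hex = (n) => '0x' + n.toString(16).padStart(4, '0');
  const indent = (depth) => '  '.repeat(depth);
  const lines = [];
  let depth = 0;
  let pos = start;

  while (pos + 2 <= view.byteLength) {
    const tag = view.getUint16(pos, true);
    pos += 2;

    if (tag === 0x0000) {
      if (depth === 0) break; // stray terminator, nothing to close
      depth -= 1;
      lines.push(indent(depth) + 'End');
      if (depth === 0) break; // outermost object closed
    } else if (tag & 0x8000) {
      const id = view.getUint32(pos, true);
      pos += 4;
      lines.push(indent(depth) + `Object ${hex(tag)} id ${id}`);
      depth += 1;
    } else {
      const length = view.getUint16(pos, true);
      pos += 2;
      const data = Array.from(bytes.subarray(pos, pos + length));
      pos += length;
      lines.push(indent(depth) + `Property ${hex(tag)} [${data.join(',')}]`);
    }
  }
  return lines;
}

// Hand-built sample: one object with one two-byte property.
const sample = new Uint8Array([
  0x05, 0x80, 0x24, 0x00, 0x00, 0x00, // object tag 0x8005, id 36
  0x00, 0x06, 0x02, 0x00, 0x02, 0x00, // property tag 0x0600, length 2, data 2,0
  0x00, 0x00,                         // end of object
]);
const listing = walkTags(sample);
console.log(listing.join('\n'));
```

A real reader would additionally map each tag to a name and decode each property's bytes according to its type, which is where most of the work lies.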
Why Read ChemDraw Files in JavaScript?
It's a pretty safe bet that any organic chemist who has written a report, submitted a journal manuscript, or filed a patent has used ChemDraw. As a result, ChemDraw files are everywhere in chemistry.
If you've spent any time watching how chemists work, the need to manipulate CDX files may be obvious. Even so, I'd like to offer a specific example from a recent project showing that this need can come from some unlikely places.
Reagents is an iOS organic chemistry reference app that I co-developed with James Ashenhurst. It makes heavy use of ChemDraw CDX files to display structures and reaction schemes.
These structures and reaction schemes are automatically pre-rendered through AppleScript automation of the ChemDraw application. Although this works, it's far from optimal. For example, Reagents runs on iPad 3 (retina) displays. To prevent pixelation, our horizontal resolution measures in the thousands of pixels. The large size of the resulting images means more storage space and longer download times. But the biggest problem is that every time we change one of these ChemDraw files, we must regenerate the corresponding raster image.
It would be so much better to dynamically render these ChemDraw files inside an iOS UI element such as UIWebView.
We'd also like to do interesting things with the ChemDraw files themselves. I won't spill the beans here, but suffice it to say that if we're successful you'll be seeing one or more new iOS apps that repurpose the Reagents content in some interesting ways.
Yet another use case for a pure JavaScript CDX reader and display component would be in interfacing the system clipboard with a paste target in a Web page. ChemWriter currently supports this capability on Windows, but there's much more we could do.
Larger Vision
JavaScript has established a strong foothold in every area of modern computing: tablets, mobile, web browsers, and servers. Yet there is no full-featured cheminformatics platform, commercial or otherwise, for building chemistry software in JavaScript.
We have many of the components for such a platform, but they're spread over different products and applications. Having been developed over the course of a couple of years, the code itself shows signs of needing a refresh. A unified, well-designed JavaScript platform centralizing core cheminformatics functionality would be a valuable asset in the continued development of useful tools and applications.
Given the centrality of the CDX file format in chemistry, that's where we're starting.