Reading and Translating ChemDraw CDX Files with Open Babel

The previous article in this series gave a brief introduction to the ChemDraw CDX file format. It also described the important role this file format plays in the business and science of chemistry. Given the ubiquity of CDX files, how can we read them in an inexpensive way? This article describes one approach that uses Open Babel.

Get Open Babel

Open Babel is both a command line utility and a developer toolkit. In this tutorial, we'll be using the command line utility to read CDX files generated by a drawing program. If you haven't done so already, install Open Babel on your system. I'm running MacPorts, so it was this simple:

sudo port install openbabel
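Before going further, it's worth confirming that the command line utility actually landed on your PATH. Here is a minimal Python sketch that does so. The function name find_babel is my own invention, and checking for obabel as well as babel is an assumption on my part, since newer Open Babel releases ship the command under that name.

```python
import shutil


def find_babel(candidates=("babel", "obabel")):
    """Return the path of the first Open Babel executable found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)  # None when the name is not on PATH
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    path = find_babel()
    print(path if path else "Open Babel not found on PATH")
```

If this prints a path, the commands in the rest of this article should be runnable from any directory.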

Get Some CDX Files

Now we'll need to make some CDX files. If you're fortunate enough to already have access to ChemDraw, you're all set. If not, you have a couple of options:

1. Download a free 2-week evaluation copy of ChemDraw. I'm not sure if CDX writing is supported in the evaluation version. After taking the time to download it, the installer looked sufficiently non-standard on my Mac to make me re-think letting this software loose on my system.
2. Buy a license to ChemDoodle. ChemDoodle is a full-featured desktop chemical drawing package. Unfortunately, the 30-day free trial does not support writing CDX files. It does, however, read CDX files quite nicely.
3. Use the Marvin online demo. Free to use with no installation required.

In the interest of time and money, I decided to go with Option (3) for now. To save even more time, you can download the files I created.

Get Translating

With a selection of CDX files in hand, we can now convert them. Change into the directory containing the CDX files. If you downloaded the examples, you might use:

babel -icdx benzene.cdx -omol benzene.mol
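If you have more than a handful of files, the single-file command above generalizes to a whole directory. The following is a rough sketch rather than a polished tool: it reuses the -icdx and -omol options shown above, while the helper names build_command and convert_all, and the choice to write each molfile next to its CDX source, are my own assumptions.

```python
import subprocess
from pathlib import Path


def build_command(cdx_path, babel="babel"):
    """Build the babel argument list that converts one CDX file to a molfile."""
    cdx_path = Path(cdx_path)
    mol_path = cdx_path.with_suffix(".mol")  # benzene.cdx -> benzene.mol
    return [babel, "-icdx", str(cdx_path), "-omol", str(mol_path)]


def convert_all(directory=".", babel="babel"):
    """Run babel once per *.cdx file in directory; return the output paths requested."""
    requested = []
    for cdx_path in sorted(Path(directory).glob("*.cdx")):
        cmd = build_command(cdx_path, babel)
        subprocess.run(cmd, check=True)  # raises CalledProcessError on a non-zero exit
        requested.append(Path(cmd[-1]))
    return requested


if __name__ == "__main__":
    # Show the command that would be run for the example file, without running it.
    print(" ".join(build_command("benzene.cdx")))
```

Calling convert_all() from the directory holding the examples would run one conversion per file. Keeping the command construction in its own function makes it easy to inspect what will be executed before anything touches the disk.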

For benzene.cdx, babel writes a properly formatted molfile:

benzene.cdx
OpenBabel09171008482D

  6  6  0  0  0  0  0  0  0  0999 V2000
   34.4702  -47.1348    0.0000 C   0  0  0  0  0
   31.0649  -49.1009    0.0000 C   0  0  0  0  0
   31.0649  -53.0331    0.0000 C   0  0  0  0  0
   34.4702  -54.9993    0.0000 C   0  0  0  0  0
   37.8756  -53.0331    0.0000 C   0  0  0  0  0
   37.8756  -49.1009    0.0000 C   0  0  0  0  0
  1  2  1  0  0  0
  1  6  2  0  0  0
  2  3  2  0  0  0
  3  4  1  0  0  0
  4  5  2  0  0  0
  5  6  1  0  0  0
M  END
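One way to convince yourself the molfile is sound is to read it back. Below is a small, whitespace-tolerant Python sketch that pulls the atoms and bonds out of the listing above. The function name parse_molfile is hypothetical, and splitting on whitespace instead of reading the format's fixed-width columns is a shortcut that I'd only trust for small molecules like this one (it would misread the counts line once a molecule reaches 100 atoms).

```python
def parse_molfile(text):
    """Parse a V2000 molfile into (atoms, bonds).

    atoms: list of (element, x, y, z)
    bonds: list of (begin, end, order), using the file's 1-based atom indices
    """
    lines = text.splitlines()
    # Locate the counts line by its version tag rather than assuming it is line 4.
    start = next(i for i, line in enumerate(lines) if line.rstrip().endswith("V2000"))
    counts = lines[start].split()
    n_atoms, n_bonds = int(counts[0]), int(counts[1])

    atoms = []
    for line in lines[start + 1 : start + 1 + n_atoms]:
        x, y, z, element = line.split()[:4]
        atoms.append((element, float(x), float(y), float(z)))

    bonds = []
    first_bond = start + 1 + n_atoms
    for line in lines[first_bond : first_bond + n_bonds]:
        begin, end, order = (int(token) for token in line.split()[:3])
        bonds.append((begin, end, order))
    return atoms, bonds


# The benzene molfile from this article.
BENZENE = """benzene.cdx
OpenBabel09171008482D

  6  6  0  0  0  0  0  0  0  0999 V2000
   34.4702  -47.1348    0.0000 C   0  0  0  0  0
   31.0649  -49.1009    0.0000 C   0  0  0  0  0
   31.0649  -53.0331    0.0000 C   0  0  0  0  0
   34.4702  -54.9993    0.0000 C   0  0  0  0  0
   37.8756  -53.0331    0.0000 C   0  0  0  0  0
   37.8756  -49.1009    0.0000 C   0  0  0  0  0
  1  2  1  0  0  0
  1  6  2  0  0  0
  2  3  2  0  0  0
  3  4  1  0  0  0
  4  5  2  0  0  0
  5  6  1  0  0  0
M  END
"""

atoms, bonds = parse_molfile(BENZENE)
print(len(atoms), len(bonds))                      # 6 6
print(sorted(order for _, _, order in bonds))      # [1, 1, 1, 2, 2, 2]
```

Six carbons joined by alternating single and double bonds is exactly what we drew, so the structure survived the trip from CDX intact.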

Now let's try converting a reaction:

babel -icdx benzene-to-aniline-reaction.cdx -omol benzene-to-aniline-reaction.mol

The output is an SD file in which the first molecule is benzene and the second is aniline:

benzene-to-aniline-reaction.cdx
OpenBabel09171008512D

  6  6  0  0  0  0  0  0  0  0999 V2000
   19.7159  -48.6707    0.0000 C   0  0  0  0  0
   16.3106  -50.6368    0.0000 C   0  0  0  0  0
   16.3106  -54.5691    0.0000 C   0  0  0  0  0
   19.7159  -56.5352    0.0000 C   0  0  0  0  0
   23.1212  -54.5691    0.0000 C   0  0  0  0  0
   23.1212  -50.6368    0.0000 C   0  0  0  0  0
  1  2  1  0  0  0
  1  6  2  0  0  0
  2  3  2  0  0  0
  3  4  1  0  0  0
  4  5  2  0  0  0
  5  6  1  0  0  0
M  END

benzene-to-aniline-reaction.cdx
OpenBabel09171008512D

  7  7  0  0  0  0  0  0  0  0999 V2000
   47.6623  -48.6707    0.0000 C   0  0  0  0  0
   44.2570  -50.6368    0.0000 C   0  0  0  0  0
   44.2570  -54.5691    0.0000 C   0  0  0  0  0
   47.6623  -56.5352    0.0000 C   0  0  0  0  0
   51.0676  -54.5691    0.0000 C   0  0  0  0  0
   51.0676  -50.6368    0.0000 C   0  0  0  0  0
   54.4730  -48.6707    0.0000 N   0  0  0  0  0
  1  2  1  0  0  0
  1  6  2  0  0  0
  2  3  2  0  0  0
  3  4  1  0  0  0
  4  5  2  0  0  0
  5  6  1  0  0  0
  6  7  1  0  0  0
M  END

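Notice that the reaction arrow and text have been lost in this conversion. This is due in part to an impedance mismatch between the CDX format and the molfile format: a molfile describes a single molecule and has no place for drawing objects such as arrows and captions.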
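To see what did survive, we can split the converted output into its individual records and tally the elements in each. The sketch below is illustrative only: the function names split_records and element_counts are hypothetical, the two-record SAMPLE is a made-up stand-in and not babel output, and the splitter assumes every record ends with an "M  END" line and carries no SD data fields.

```python
from collections import Counter


def split_records(text):
    """Split concatenated molfile records, each terminated by an 'M  END' line."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            continue  # skip SD record delimiters if the file has them
        current.append(line)
        if line.strip() == "M  END":
            records.append("\n".join(current))
            current = []
    return records


def element_counts(record):
    """Count element symbols in the atom block of one V2000 record."""
    lines = record.splitlines()
    start = next(i for i, line in enumerate(lines) if line.rstrip().endswith("V2000"))
    n_atoms = int(lines[start].split()[0])
    # The element symbol is the fourth whitespace-separated token on an atom line.
    return Counter(line.split()[3] for line in lines[start + 1 : start + 1 + n_atoms])


# Hypothetical two-record sample (one carbon, then one nitrogen), for illustration.
SAMPLE = """first
 demo

  1  0  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 C   0  0  0  0  0
M  END

second
 demo

  1  0  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 N   0  0  0  0  0
M  END
"""

for record in split_records(SAMPLE):
    print(dict(element_counts(record)))
# {'C': 1}
# {'N': 1}
```

Run against the reaction output above, the same functions should report six carbons for the first record, and six carbons plus one nitrogen for the second. Nothing in either record says which molecule is the reactant and which is the product; that role information lived in the arrow.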

Wrapup

Open Babel can read and convert a subset of the ChemDraw CDX file format. But what if we wanted to do more? Future articles will offer some ideas.