ChemWriter 1.3.1
ChemWriter 1.3.1 can now be downloaded. This version resolves an EditorApplet issue in which molfiles containing two or more atoms with exactly the same 2D coordinates were not displayed properly.
Details are available on the Metamolecular Company Blog.
ChemWriter is the 2D chemical structure editor designed for Web applications. Lightweight and intuitive, ChemWriter makes an excellent choice for both creating and displaying 2D chemical structures on the Web.
The Fundamental Cheminformatics Toolset

Reference: W.J. Howe and T.R. Hogadone, J. Chem. Inf. Model.
Imagine you need to create a cheminformatics system that's useful to chemists in their daily work. What tools would you absolutely need, regardless of the specific system you're building?
The answer to this question is hardly academic. If you're looking for ways to disproportionately improve the state of cheminformatics, improving the performance of one or more of its fundamental tools would seem to be a logical path.
Here, in no particular order, are my picks for the five fundamental cheminformatics tools:
2D Structure Editor. Ubiquitous yet mostly-ignored, the 2D structure editor is the last mile connecting cheminformaticians with laboratory chemists. Take away the structure editor for data entry and building queries, and most cheminformatics systems become useless to the average chemist.
2D Structure Renderer. Chemists expect their cheminformatics systems to communicate with them the way that other chemists do - through 2D chemical structures. Rendering software makes this possible. Like the 2D structure editor, structure renderers are a widely-ignored yet critical link between producers and consumers of cheminformatics software. Although the 2D renderer and editor need not necessarily be related, the two technologies are so similar that most 2D editors are based on a related 2D rendering engine.
Structure Query System. The purpose of the vast majority of cheminformatics systems is to produce a set of chemical structure results based on a structure query. The structure query system makes this possible. As the datasets that chemists deal with become ever larger, the ability to specify query structures at a high level of detail, and retrieve the results efficiently, becomes increasingly important. This is an area ripe for big improvements.
Low-Level Cheminformatics Toolkit. Most cheminformatics systems involve one or more elements specific to their problem domain. For example, predictive tools may use molecular descriptors. A robust and versatile low-level cheminformatics toolkit makes it possible to build problem-specific cheminformatics libraries. This toolkit may or may not be used in the 2D structure editor and renderer, depending on whether an adequate text-based molecular language is available (see below).
Text-Based Molecular Language. Cheminformatics systems are frequently built from components developed independently by multiple groups. These systems may be developed in different programming languages, may even run on different operating systems, and may need to communicate over a network connection. A well-specified, open, text-based molecular language makes it possible for these systems to interoperate. Two widely-used examples include MDL's molfile format and Daylight's SMILES, both of which have significant limitations.
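To make the contrast between the two formats concrete, here is a minimal sketch that writes ethanol both ways: as the SMILES string CCO and as a V2000 molfile. The helper name to_molfile and the 2D coordinates are my own assumptions, chosen purely for illustration, and the writer covers only the header, counts line, atom block, bond block, and terminator.

```python
def to_molfile(name, atoms, bonds):
    """Build a minimal V2000 molfile string (illustrative helper, not a full writer).

    atoms: list of (symbol, x, y, z) tuples
    bonds: list of (first_atom, second_atom, order) tuples, 1-based atom indexes
    """
    lines = [
        name,  # header line 1: molecule name
        "",    # header line 2: program and timestamp, left blank here
        "",    # header line 3: comment, left blank here
        # counts line: atom count and bond count in three-character columns
        f"{len(atoms):3d}{len(bonds):3d}  0  0  0  0  0  0  0  0999 V2000",
    ]
    for symbol, x, y, z in atoms:
        # atom line: three 10.4 coordinates, the symbol, then twelve unused fields
        lines.append(
            f"{x:10.4f}{y:10.4f}{z:10.4f} {symbol:<3} 0"
            "  0  0  0  0  0  0  0  0  0  0  0"
        )
    for first, second, order in bonds:
        # bond line: two atom indexes and the bond order, then unused fields
        lines.append(f"{first:3d}{second:3d}{order:3d}  0  0  0  0")
    lines.append("M  END")
    return "\n".join(lines) + "\n"


# Ethanol as SMILES: connectivity only, no coordinates.
ETHANOL_SMILES = "CCO"

# Ethanol as a molfile: the same connectivity plus assumed 2D coordinates.
ETHANOL_MOLFILE = to_molfile(
    "ethanol",
    [("C", 0.0, 0.0, 0.0), ("C", 1.2990, 0.7500, 0.0), ("O", 2.5981, 0.0, 0.0)],
    [(1, 2, 1), (2, 3, 1)],
)

print(ETHANOL_SMILES)
print(ETHANOL_MOLFILE)
```

The SMILES form fits in three characters but says nothing about layout, while the molfile carries coordinates at the cost of a rigid fixed-column format.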
One of the reasons I consider this set of cheminformatics tools in particular to be fundamental is the perennial need to use and improve them. Elements of each of these tools can be seen, for example, in the COUSIN system developed by Howe and Hogadone at Upjohn over 25 years ago. Comparison of this system with PubChem shows just how little the basic problems change, despite major changes in underlying technology.
What are your fundamental cheminformatics tools and which of them are you working to improve?
Simple 3D Conformer Generation with Smi23D
Three-dimensional conformer generation is a common problem in cheminformatics. The most convenient and generally-useful method for creating chemical structures is the 2D chemical structure editor; applications that require three-dimensional representations need a way to generate reasonable coordinates from 2D user input. Until recently, there were no options for doing so with Open Source software. This article shows how the Open Source package smi23d can be used to convert ordinary SMILES strings into three-dimensional molfile representations.
About smi23d
smi23d uses a two-stage process to generate 3D coordinates: an initial pass with smi2sdf generates rough coordinates, and subsequent refinement by mengine produces the final coordinates. The package was originally written in C by Kevin Gilbert and updated by Rajarshi Guha. As part of what appears to be a growing trend in cheminformatics, smi23d is licensed under the highly permissive Apache License.
On a related note, the source code for a program called Frog is reportedly on its way into the Open Babel project.
Prerequisites
To build smi23d, you'll need to install Scons, a Make-like build utility written in Python. I was able to install the Scons rpm on my Linux system without a problem. smi23d has no other dependencies.
Download smi23d
smi23d can be downloaded with Subversion:
$ svn co https://cicc-grid.svn.sourceforge.net/svnroot/cicc-grid/cicc-grid/smi23d/trunk smi23d
Building smi23d
With the source code in place, compilation is just a matter of running Scons:
$ cd smi23d
$ scons
...
Once the sources are compiled, we'll want to configure our system a bit:
$ cd build
$ ls
mmff94.prm mmxconst.prm
$ cp ../src/smi2sdf/smi2sdf .
$ cp ../src/mengine/mengine .
The two files mmff94.prm and mmxconst.prm are parameter files needed by both smi2sdf and mengine.
With smi2sdf and mengine both in the build directory, we can create a simple test with the SMILES for Ivabradine:
$ vi test.smi
...
$ cat test.smi
CN(CCCN1CCC2=CC(=C(C=C2CC1=O)OC)OC)C[C@H]3CC4=CC(=C(C=C34)OC)OC
With everything ready to go, we can begin Stage one:
$ ./smi2sdf test.smi
Found 1 structures in test.smi
field : MMX
Atom Types: 169
Bonds: 580 Bond3: 0 Bond4: 0 Bond5: 0
Angle: 434 Angle3: 41 Angle4: 60 Angle5: 0
Torsion: 697 Torsion4: 58 Torsion5: 0
Vdw: 172 OOP: 91 Dipole: 474 Charge: 0 Improper: 0
STBN: 26 ANGANG: 0 STRTOR: 0 VDWPR: 4
Input file = test.smi
Output file = output.sdf
Param file = mmxconst.prm
Log file = error.log
Inorganic file = test_inorg.smi
Structure: 0 CN(CCCN1CCC2=CC(=C(C=C2CC1=O)OC)OC)C[C@H]3CC4=CC(=C(C=C34)OC)OC
You can view the result in an application like Jmol:

It's not much to look at, but we're not quite done yet.
Stage two is accomplished by using the output of Stage one as input to mengine:
$ ./mengine -o optimized.sdf output.sdf
field : MMX
Atom Types: 169
Bonds: 580 Bond3: 0 Bond4: 0 Bond5: 0
Angle: 434 Angle3: 41 Angle4: 60 Angle5: 0
Torsion: 697 Torsion4: 58 Torsion5: 0
Vdw: 172 OOP: 91 Dipole: 474 Charge: 0 Improper: 0
STBN: 26 ANGANG: 0 STRTOR: 0 VDWPR: 4
field : MMFF94
Atom Types: 181
Bonds: 448 Bond3: 0 Bond4: 0 Bond5: 0
Angle: 1801 Angle3: 21 Angle4: 61 Angle5: 0
Torsion: 674 Torsion4: 38 Torsion5: 95
Vdw: 182 OOP: 112 Dipole: 0 Charge: 0 Improper: 0
STBN: 286 ANGANG: 0 STRTOR: 0 VDWPR: 0
We now have a file called optimized.sdf. As you can see, it's a pretty good 3D representation of Ivabradine:
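If you'd rather check the result without opening a viewer, a quick sanity test is to confirm that the atoms no longer all lie in a single plane. The sketch below assumes that mengine writes a standard V2000 SD file; the function names read_atoms and has_depth are my own, and the check only looks at the first record. A genuinely planar molecule would fail it, so treat it as a rough heuristic rather than proof of a good conformer.

```python
def read_atoms(molblock):
    """Return (symbol, x, y, z) tuples from the first V2000 record in an SD file."""
    lines = molblock.splitlines()
    counts = lines[3]              # the fourth line of a molfile is the counts line
    atom_count = int(counts[0:3])  # its first three columns hold the atom count
    atoms = []
    for line in lines[4:4 + atom_count]:
        # coordinates occupy three fixed 10-character columns
        x = float(line[0:10])
        y = float(line[10:20])
        z = float(line[20:30])
        symbol = line[31:34].strip()
        atoms.append((symbol, x, y, z))
    return atoms


def has_depth(atoms, tolerance=1e-4):
    """True if any atom sits measurably off the z = 0 plane."""
    return any(abs(z) > tolerance for _, _, _, z in atoms)


# Possible usage, assuming the file produced by the mengine step is present:
#     with open("optimized.sdf") as handle:
#         print(has_depth(read_atoms(handle.read())))
```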

Conclusions
In this tutorial, we've seen how the Open Source program smi23d can be used to assign reasonable 3D coordinates to an arbitrary SMILES string. One very practical use of smi23d would be to process the output of 2D chemical structure editors prior to use in a 3D program. Future articles will discuss some of the possibilities.
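As a starting point for that kind of processing, the two stages can be chained from a short script. The following is only a sketch under the assumptions of this tutorial: both binaries and both parameter files sit in the build directory, smi2sdf writes its result to output.sdf as its log reported, and mengine accepts the -o option used earlier. The function names are hypothetical, and since Python is already required by Scons, nothing extra needs to be installed.

```python
import subprocess
from pathlib import Path

ROUGH_SDF = "output.sdf"  # file name that smi2sdf reported as its output
PARAMETER_FILES = ("mmff94.prm", "mmxconst.prm")
PROGRAMS = ("smi2sdf", "mengine")


def pipeline_commands(smiles_file, optimized_sdf="optimized.sdf"):
    """Return the two command lines used in this tutorial, in order."""
    return [
        ["./smi2sdf", smiles_file],                      # stage one: rough coordinates
        ["./mengine", "-o", optimized_sdf, ROUGH_SDF],   # stage two: refinement
    ]


def missing_files(build_dir):
    """List the programs and parameter files not found in build_dir."""
    build = Path(build_dir)
    return [name for name in PARAMETER_FILES + PROGRAMS
            if not (build / name).exists()]


def run_pipeline(smiles_file, build_dir=".", optimized_sdf="optimized.sdf"):
    """Run both stages inside build_dir and return the path of the final file.

    smiles_file is interpreted relative to build_dir, because both
    commands are executed with build_dir as their working directory.
    """
    missing = missing_files(build_dir)
    if missing:
        raise FileNotFoundError("missing from build directory: " + ", ".join(missing))
    for command in pipeline_commands(smiles_file, optimized_sdf):
        subprocess.run(command, cwd=build_dir, check=True)
    return Path(build_dir) / optimized_sdf
```

Calling run_pipeline("test.smi") from the build directory should reproduce the two manual steps above, stopping with an error if either program exits with a failure status.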
Image Credit: Mary Mactavish

