Small Molecule 3D Coordinates From PubChem

May 23, 2008

The PubChem team has quietly introduced a new feature - 3D coordinates for many of the small molecules in its compound collection. To my knowledge, these coordinates are currently available only via FTP. From the README:

The data contained here consists of a theoretical 3D description of PubChem Compound records computed using the MMFF94s force field without coulombic terms, including MMFF charges. Each provided theoretical 3D conformer is not a stationary point on the hyper-potential surface (i.e., is not at a minimum energy). Rather, the theoretical 3D description is a low energy conformer selected from a conformer model (a theoretical description of the conformational flexibility of a chemical structure consisting of multiple 3D representations or poses sampled using an RMSD {root mean squared distance} threshold) describing energetically-accessible and (potentially) biologically relevant conformations of a chemical structure.

Not every PubChem Compound record will have a theoretical 3D description. Structures considered too large (containing more than 50 non-hydrogen atoms) or too flexible (containing more than 15 rotatable bonds) are excluded. Furthermore, chemical structures containing elements other than H, C, N, O, F, P, S, Cl, Br, and I are also excluded.

Generation of theoretical 3D descriptions of small molecules is computationally intensive. As such, some PubChem Compound records may be added at a later time.
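The exclusion rules quoted above are simple enough to check before looking a compound up. Here is a minimal sketch of such a check; the function name and its inputs (a list of element symbols and a precomputed rotatable bond count) are my own assumptions, not part of any PubChem tool, and a record that passes may still be missing simply because it hasn't been computed yet.

```python
# Limits taken from the PubChem README quoted above.
ALLOWED_ELEMENTS = frozenset({"H", "C", "N", "O", "F", "P", "S", "Cl", "Br", "I"})
MAX_HEAVY_ATOMS = 50        # non-hydrogen atoms
MAX_ROTATABLE_BONDS = 15


def passes_3d_exclusion_rules(elements, rotatable_bonds):
    """Return True if a structure is not excluded by the README's rules.

    elements: one element symbol per atom, e.g. ["C", "C", "O", "H", ...]
    rotatable_bonds: rotatable bond count, computed elsewhere (hypothetical input)
    """
    heavy_atoms = sum(1 for symbol in elements if symbol != "H")
    if heavy_atoms > MAX_HEAVY_ATOMS:
        return False  # too large
    if rotatable_bonds > MAX_ROTATABLE_BONDS:
        return False  # too flexible
    # Every atom must be one of the supported elements.
    return all(symbol in ALLOWED_ELEMENTS for symbol in elements)


if __name__ == "__main__":
    ethanol = ["C", "C", "O"] + ["H"] * 6
    print(passes_3d_exclusion_rules(ethanol, rotatable_bonds=1))
```

Note that both limits are "more than", so a structure with exactly 50 heavy atoms or exactly 15 rotatable bonds is not excluded by this check.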

(A few open source packages for generating 3D conformers are also available.)

Recently, Geoff Hutchison wrote in to suggest that a potentially useful new feature of Chempedia could be the ability to directly obtain 3D coordinates for a molecule of interest.

One very economical way to do that would be to use PubChem's 3D dataset. It would also be trivial to display these coordinates in a resizable Jmol applet, in analogy to Chempedia's recently added 2D molecule resizing feature.
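To give a feel for how little work is involved, here is a rough sketch of pulling atom coordinates out of a single record. It assumes the downloaded records use the fixed-width V2000 molfile layout found in SD files, which the README excerpt above doesn't state. The function name is hypothetical, and the water coordinates in the sample are illustrative values I made up, not data taken from PubChem.

```python
def parse_molfile_coordinates(molblock):
    """Extract (symbol, x, y, z) tuples from a V2000 molfile block.

    Assumes the standard fixed-width layout: three header lines, a counts
    line whose first three characters hold the atom count, then one line
    per atom with x, y and z in three 10-character columns.
    """
    lines = molblock.splitlines()
    atom_count = int(lines[3][0:3])
    atoms = []
    for line in lines[4:4 + atom_count]:
        x = float(line[0:10])
        y = float(line[10:20])
        z = float(line[20:30])
        symbol = line[31:34].strip()
        atoms.append((symbol, x, y, z))
    return atoms


# Illustrative record only: a hand-written water molecule.
SAMPLE = "\n".join([
    "water",
    "  hand-written example",
    "",
    "  3  2  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.1173 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.0000    0.7572   -0.4692 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.0000   -0.7572   -0.4692 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0",
    "  1  3  1  0",
    "M  END",
])

if __name__ == "__main__":
    for symbol, x, y, z in parse_molfile_coordinates(SAMPLE):
        print(symbol, x, y, z)
```

A real application would more likely lean on an existing toolkit than on hand-rolled column slicing, but the point stands: once the file is in hand, getting at the coordinates is not the hard part.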

Of course, there are many other potential uses for the PubChem conformer dataset, especially when applied to Web applications.