Can Your Cheminformatics Tool Do This? 2

Posted by Rich Apodaca Wed, 13 Jun 2007 12:36:00 GMT

Can Your Cheminformatics Tool Do This?

Posted by Rich Apodaca Thu, 05 Apr 2007 16:46:00 GMT

Axial chirality isn't the first thing most chemists come up with when they think of indole. Yet a recent J. Org. Chem. article by Kamikawa et al. describes not only axially chiral indoles, but an enantioselective method for their synthesis.

Axial chirality has been largely ignored by the cheminformatics community. There was once a time when the phenomenon was esoteric enough to be reasonably ignored. That time has long since passed, as the Kamikawa study and many others demonstrate.

Several years ago, Andreas Dietz proposed a conceptual framework for solving this problem. More recently, it was put into practice in the FlexMol language and the Octet framework. These may not represent the best solutions to the axial chirality problem, but they clearly demonstrate that a practical solution, fully compatible with modern information technologies, does exist.

Axially chiral molecules like those in the Kamikawa study will increasingly find their way into chemical databases as they become more synthetically accessible. When this happens, users will want to distinguish stereoisomers, just as they wanted (and got) this capability for tetrahedral chirality ten to twenty years ago. When the inevitable request to distinguish axially chiral stereoisomers in your database comes, how will you respond?

Octet Fundamentals: Immutable Molecules

Posted by Rich Apodaca Tue, 20 Feb 2007 05:44:00 GMT

The Immutable pattern increases the robustness of objects that share references to the same object and reduces the overhead of concurrent access to an object. It accomplishes this by forbidding any of an object's state information to change after the object is constructed. The Immutable pattern also avoids the need to synchronize multiple threads of execution that share an object.

-Mark Grand, Patterns in Java, Volume 1

Peruse the Octet API Documentation and you may find something surprising about the Molecule interface: it lacks mutator methods. Given that mutators enable the state of an object to change, how can a Molecule ever be created in the first place? Why would anyone even need immutable Molecules?

Mutable Molecules are Unnecessary

Most cheminformatics tools permit the unfettered modification of molecules after their creation. Octet takes a completely different approach. The Molecule contract says that the state of every Molecule will remain constant over its lifetime. Octet then backs up this promise by deliberately defining only accessor methods in the Molecule interface.
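As a rough illustration of what an accessor-only contract looks like, here is a minimal sketch. Only the method names countAtoms and countBondingSystems appear elsewhere on this page; the classes ImmutableSketch and FixedMolecule and the constructor are hypothetical and are not taken from the Octet source.

```java
public class ImmutableSketch {
    /** Accessor-only contract: no method here can change a molecule. */
    interface Molecule {
        int countAtoms();
        int countBondingSystems();
    }

    /** Hypothetical implementation: every field is final and set once. */
    static final class FixedMolecule implements Molecule {
        private final int atomCount;
        private final int bondingSystemCount;

        FixedMolecule(int atomCount, int bondingSystemCount) {
            this.atomCount = atomCount;
            this.bondingSystemCount = bondingSystemCount;
        }

        @Override public int countAtoms() { return atomCount; }
        @Override public int countBondingSystems() { return bondingSystemCount; }
    }

    public static void main(String[] args) {
        Molecule mol = new FixedMolecule(21, 24);
        System.out.println(mol.countAtoms());          // prints 21
        System.out.println(mol.countBondingSystems()); // prints 24
    }
}
```

A client holding a Molecule reference has no way to alter it, so the constant-state promise is enforced by the compiler and does not rely on convention.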

Perhaps the best reason for an immutable Molecule interface is that there are vanishingly few situations in which a Molecule needs to be changed after it's created. The creative use of Design Patterns obviates 80-90% of the perceived need for Molecule mutability.

The Virtues of Immutability

When clients know that a Molecule can never be changed, programming becomes a lot easier and less bug-prone. For example, consider that when a Molecule is immutable:

  • defensive copying of Molecules is unnecessary;
  • a mechanism that reports changes to internal Molecule state (listeners) is unnecessary;
  • memory leaks resulting from failure to disconnect a Molecule listener are eliminated;
  • the clone method, and all of its complexities, are unnecessary;
  • no special precautions need to be taken to achieve robust, thread-safe code;
  • Molecules can safely be used as keys in Hashtables;
  • the Molecule API is greatly simplified because all mutators have been removed.
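To make the hash-key point from the list above concrete, here is a small sketch. Formula is a hypothetical immutable value class, not part of Octet. Because its fields are final, its hash code cannot drift after the object has been stored in a map, so a later lookup with an equal key always finds the entry.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class HashKeySketch {
    /** Hypothetical immutable value: state is fixed at construction. */
    static final class Formula {
        private final String symbols;
        private final int heavyAtoms;

        Formula(String symbols, int heavyAtoms) {
            this.symbols = symbols;
            this.heavyAtoms = heavyAtoms;
        }

        @Override public boolean equals(Object other) {
            if (!(other instanceof Formula)) return false;
            Formula f = (Formula) other;
            return symbols.equals(f.symbols) && heavyAtoms == f.heavyAtoms;
        }

        @Override public int hashCode() {
            return Objects.hash(symbols, heavyAtoms);
        }
    }

    /** Stores a name under one key, then looks it up with an equal key. */
    static String lookupWithEqualKey() {
        Map<Formula, String> names = new HashMap<>();
        names.put(new Formula("C6H6", 6), "benzene");
        return names.get(new Formula("C6H6", 6));
    }

    public static void main(String[] args) {
        System.out.println(lookupWithEqualKey()); // prints benzene
    }
}
```

With a mutable key, changing a field after the put call could change the hash code and leave the entry unreachable. An immutable key rules that failure out.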

Octet fully embraces the productivity gains made possible by immutable Molecules. In fact, many of Octet's key interfaces are immutable for precisely the reasons cited above.

How to Build Immutable Molecules

Immutable Molecules may be a good idea, but how can they get created in the first place? After all, there are no methods such as addAtom through which a Molecule can be built up!

Java inner classes and the Builder Pattern provide one solution to this problem. Consider the following Java snippet, which is adapted from the Octet source code:

import java.util.ArrayList;
import java.util.List;

public class BasicMoleculeBuilder implements MoleculeBuilder
{
  // the Molecule currently under construction
  private MoleculeImpl molecule;

  public BasicMoleculeBuilder()
  {
    molecule = new MoleculeImpl();

    // implement the rest
  }

  public void addAtom(IsotopicDistribution distribution)
  {
    AtomImpl atom = new AtomImpl(distribution);

    // changing the molecule!
    molecule.atoms.add(atom);
  }

  public Molecule releaseMolecule()
  {
    // hand off the finished Molecule and start over with a fresh one
    Molecule result = molecule;
    molecule = new MoleculeImpl();

    return result;
  }

  // implement the remaining MoleculeBuilder methods

  private class MoleculeImpl implements Molecule
  {
    private final List<Atom> atoms;

    private MoleculeImpl()
    {
      atoms = new ArrayList<Atom>();
    }

    // implement the remaining Molecule methods
  }

  private class AtomImpl implements Atom
  {
    private final IsotopicDistribution dist;

    private AtomImpl(IsotopicDistribution dist)
    {
      this.dist = dist;
    }

    // implement the remaining Atom methods
  }
}

Notice that the addAtom method changes the state of the Molecule under construction. Strictly speaking, this does violate immutability. In practice, it makes no difference because the changes occur only in the context of BasicMoleculeBuilder, which keeps these changes from propagating to the outside world. Once a client invokes releaseMolecule, BasicMoleculeBuilder loses all contact with the Molecule it created, and so is incapable of further modification.
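The hand-off described above can be tried out with a stripped-down, self-contained sketch. It is not the Octet code: SimpleBuilder is a hypothetical stand-in for BasicMoleculeBuilder, and atoms are reduced to plain String labels so that the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

public class BuilderSketch {
    /** Accessor-only view handed to clients. */
    interface Molecule {
        int countAtoms();
    }

    /** Hypothetical builder: the only code able to touch molecule state. */
    static class SimpleBuilder {
        private MoleculeImpl molecule = new MoleculeImpl();

        void addAtom(String label) {
            molecule.atoms.add(label);
        }

        Molecule releaseMolecule() {
            Molecule result = molecule;
            molecule = new MoleculeImpl(); // the builder forgets the released object
            return result;
        }

        private static class MoleculeImpl implements Molecule {
            private final List<String> atoms = new ArrayList<>();

            @Override public int countAtoms() { return atoms.size(); }
        }
    }

    public static void main(String[] args) {
        SimpleBuilder builder = new SimpleBuilder();
        builder.addAtom("C");
        builder.addAtom("O");
        Molecule first = builder.releaseMolecule();

        builder.addAtom("N"); // lands in a fresh molecule, not in first
        Molecule second = builder.releaseMolecule();

        System.out.println(first.countAtoms());  // prints 2
        System.out.println(second.countAtoms()); // prints 1
    }
}
```

The call to addAtom made after the first release does not show up in the first molecule, which is the behavior the paragraph above relies on.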

Although it may not be immediately apparent, Alan Holub's brilliant series of articles in JavaWorld on user interface design is directly applicable here. If you've never been exposed to rigorous object-oriented design, Holub's claims can seem rather bizarre ("get and set functions are evil"). But if you stick with it, you'll be rewarded with a valuable new appreciation for object-oriented programming in Java.

Although a discussion of the role of the Builder Pattern in the above code is beyond the scope of this article, look to future installments of the "Octet Fundamentals" series for more details.

Conclusions

Molecule immutability is a core Octet principle that results in cleaner, simpler, and more robust client code. Situations that might appear to require the ability to edit Molecules can usually be handled through the creative application of Design Patterns.

Octet Fundamentals: A Documented System of Atomic Masses

Posted by Rich Apodaca Fri, 02 Feb 2007 20:10:00 GMT

The way that atoms, and particularly their masses, are modeled sets the stage for the kinds of problems a cheminformatics environment can solve. Many systems are currently in use, a reflection of the many different ways there are to think about this problem. This article will introduce the atomic mass system used by Octet, which provides atomic mass values and uncertainties cross-referenced to the primary literature.

A Documented System of Atomic Masses

Mass and isotopic composition are fundamental atomic properties. In addition to the mass values themselves, the errors of these determinations are also important. Because these quantities are sometimes in dispute, it is essential that they be cross-referenced to the primary literature. Fortunately, a landmark work titled "Atomic weights of the elements" (AWOTE) accomplishing exactly this objective was published in 2000 by a team led by J. K. Böhlke from the U.S. Geological Survey.

Octet uses an XML representation of the data contained in AWOTE. To view the entire document, click here. To illustrate the kind of data included in this document, consider this entry for the element carbon:

<entry symbol="C" atomic-number="6">
  <natural-abundance>
    <mass value="12.0107" error="0.0008" />
    <isotope mass-number="12">
      <mass value="12" error="0" />
      <abundance value="0.9893" error="0.0008" />
    </isotope>
    <isotope mass-number="13">
      <mass value="13.003354838" error="0.000000005" />
      <abundance value="0.0107" error="0.0008" />
    </isotope>
  </natural-abundance>
</entry>

Carbon has two naturally occurring stable isotopes, 12C and 13C. They have relative abundances of 98.93% and 1.07%, and masses of 12 (exactly) and 13.003354838±0.000000005 unified atomic mass units (u), respectively. Every element from hydrogen to uranium is included, excluding technetium. By reference to AWOTE, the determination of every value in the XML file can be found in the primary literature.
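As a quick consistency check on the carbon entry, the abundance-weighted mean of the two isotope masses should reproduce the natural-abundance mass of 12.0107. The sketch below reads the entry with the JDK's standard DOM parser; it does not use Octet's own reader, and the names AbundanceSketch and weightedMass are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class AbundanceSketch {
    // The carbon entry shown above, copied as a string.
    static final String CARBON_ENTRY =
        "<entry symbol=\"C\" atomic-number=\"6\"><natural-abundance>"
        + "<mass value=\"12.0107\" error=\"0.0008\" />"
        + "<isotope mass-number=\"12\"><mass value=\"12\" error=\"0\" />"
        + "<abundance value=\"0.9893\" error=\"0.0008\" /></isotope>"
        + "<isotope mass-number=\"13\">"
        + "<mass value=\"13.003354838\" error=\"0.000000005\" />"
        + "<abundance value=\"0.0107\" error=\"0.0008\" /></isotope>"
        + "</natural-abundance></entry>";

    /** Sums mass * abundance over every isotope element in the entry. */
    static double weightedMass(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList isotopes = doc.getElementsByTagName("isotope");
            double total = 0.0;
            for (int i = 0; i < isotopes.getLength(); i++) {
                Element isotope = (Element) isotopes.item(i);
                Element mass = (Element) isotope.getElementsByTagName("mass").item(0);
                Element abundance = (Element) isotope.getElementsByTagName("abundance").item(0);
                total += Double.parseDouble(mass.getAttribute("value"))
                       * Double.parseDouble(abundance.getAttribute("value"));
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse entry", e);
        }
    }

    public static void main(String[] args) {
        // 0.9893 * 12 + 0.0107 * 13.003354838 = 12.01073...
        System.out.println(String.format(Locale.ROOT, "%.4f", weightedMass(CARBON_ENTRY))); // prints 12.0107
    }
}
```

The sketch ignores the error attributes; propagating them would be a separate step.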

Using the Atomic Mass System

As a demonstration of Octet's system of atomic masses, consider the following Ruby code:

require 'rubygems'
require 'rjb' # require_gem is no longer available in current RubyGems

# Look up the natural isotopic distribution of carbon
atomic_system = Rjb::import('net.sf.octet.model.BasicAtomicSystem').getInstance
carbon_distribution = atomic_system.getNaturalAbundance(atomic_system.getAtomicSymbol("C"))

puts carbon_distribution.countNuclei # => 2
puts carbon_distribution.getNucleus(0).getMassNumber # => 12
puts carbon_distribution.getNucleus(1).getMassNumber # => 13

# Masses and uncertainties of the individual nuclei
puts atomic_system.getAtomicMass(carbon_distribution.getNucleus(0)).getValue.toString # => 12.0
puts atomic_system.getAtomicMass(carbon_distribution.getNucleus(1)).getValue.toString # => 13.003354838
puts atomic_system.getAtomicMass(carbon_distribution.getNucleus(1)).getUncertainty.toString # => 5.0E-9

The previous article in this series described the small number of steps needed to execute Ruby code such as that shown above on Windows and Linux systems. For more information on the AtomicSystem API, consult the Octet Javadoc.

Conclusions

Octet provides a comprehensive system of atomic masses containing both measurements and uncertainties. This system is furthermore cross-referenced to the primary literature. As a result, the mass of every Octet Molecule can be determined to high precision and with error analysis. Not every application will require this level of detail and documentation, but for those that do the capability exists.

This work is licensed under a Creative Commons Attribution 2.5 License.

A Molecular Language for Modern Chemistry: Reading FlexMol Documents with Octet

Posted by Rich Apodaca Wed, 31 Jan 2007 19:56:00 GMT

An XML language is only as useful as the software tools that take advantage of it. Previous articles have discussed how the XML language FlexMol can solve a variety of molecular representation problems ranging from the multiatom bonding of metallocenes to the axial chirality of biaryls. Octet is a framework written in Java that speaks FlexMol natively. In this article, I'll show how Octet can be used to read a sample FlexMol document.

Prerequisites

For this tutorial, you'll need Ruby Java Bridge (RJB). Previous articles have discussed the installation and use of RJB on Windows and Linux.

A Sample Molecule

A recent article discussed a FlexMol representation of the chiral natural product monolaterol. Using a slightly modified numbering system for this molecule (shown above), we can construct a complete FlexMol representation. In this case, we simply start numbering at index zero, subtracting one from every index in the previous example to match the zero-based indices used in Octet.

A Demonstration Package

To illustrate the process of reading a FlexMol document, I've prepared a small package (demo-20070131.tar.gz) that can be downloaded from SourceForge. In it, you'll find an Octet jarfile (octet-0.8.2.jar), a FlexMol representation of monolaterol (s_monolaterol.xml), a Ruby library (reader.rb), and some Ruby test code (test.rb). Inflate this archive and make it your working directory.

A Simple Test

The following sequence of commands will run the test included with the demonstration package:

$ export CLASSPATH=./octet-0.8.2.jar
$ ruby test.rb

You should see several lines of output terminated with the line:

The exact mass of monolaterol is 276.115029755.
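The reported value can be reproduced by hand as a sum of isotope masses. The sketch below is not how Octet computes it and rests on two assumptions that this article does not state: a molecular formula of C19H16O2 (consistent with the 21 atoms reported by countAtoms in the irb session), and masses of 1.0078250319 u for 1H and 15.9949146223 u for 16O. Only the 12C mass of exactly 12 is quoted on this page. Check the other values against the AWOTE data file before relying on them.

```java
import java.math.BigDecimal;

public class ExactMassSketch {
    // 12C is exactly 12 u; the 1H and 16O values are assumptions (see lead-in).
    static final BigDecimal C12 = new BigDecimal("12");
    static final BigDecimal H1 = new BigDecimal("1.0078250319");
    static final BigDecimal O16 = new BigDecimal("15.9949146223");

    /** Sums the masses of the lightest isotopes for a CxHyOz formula. */
    static BigDecimal exactMass(int carbons, int hydrogens, int oxygens) {
        return C12.multiply(BigDecimal.valueOf(carbons))
            .add(H1.multiply(BigDecimal.valueOf(hydrogens)))
            .add(O16.multiply(BigDecimal.valueOf(oxygens)));
    }

    public static void main(String[] args) {
        // Assumed formula C19H16O2: 228 + 16.1252005104 + 31.9898292446
        BigDecimal mass = exactMass(19, 16, 2);
        System.out.println(mass.stripTrailingZeros().toPlainString()); // prints 276.115029755
    }
}
```

BigDecimal is used here so that the decimal digits are summed exactly, without binary floating-point rounding.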

You can get more hands-on experience with loading and processing the monolaterol FlexMol document using interactive Ruby (irb). For example:

$ irb
irb(main):001:0> require 'reader'
=> true
irb(main):002:0> r=Reader.new
=> #:0x2b9ab1736690>, @handler=#<#:0x2b9ab1736e10>, @builder=#<#:0x2b9ab1736b90>>
irb(main):003:0> mol=r.read_file 's_monolaterol.xml'
=> #<#:0x2b9ab172cd48>
irb(main):004:0> mol.countAtoms
=> 21
irb(main):005:0> mol.countBondingSystems
=> 24

Of course, this is just scratching the surface of what can be done once a FlexMol document has been loaded by Octet.

Conclusions

Octet makes it possible to convert FlexMol documents into Java object representations that can be accessed through Ruby. With an object representation, the possibilities are limitless. Some simple examples have been provided here. Future articles will illustrate more advanced uses.
