Flexible Depth-First Search With MX 2

Posted by Rich Apodaca Wed, 26 Nov 2008 16:13:00 GMT

Graph theory is an essential component of cheminformatics, if you dig deeply enough. MX is a lightweight cheminformatics toolkit written in Java with a major goal of exposing the most important cheminformatics graph manipulations in a flexible, Java-centric way. Previous releases have focused on implementing subgraph monomorphism functionality for use in substructure search. The new MX release, 0.104.0, introduces support for depth-first traversal. This article will give a simple example using this feature.

Downloading MX

MX can be downloaded in source or binary form:

Scripting MX with JRuby

A previous article outlined the simple steps needed to install JRuby on unix-based systems for scripting MX.

Finding All Paths From a Given Atom

A fundamental graph operation in cheminformatics is finding all paths through a molecule from a starting atom. MX makes this easy with the com.metamolecular.mx.path.PathFinder class. Depth-first traversal is used in creating molecular fingerprints. Another use is in creating SMILES strings, although a limited form of depth-first traversal is used in which each atom in a molecule is traversed only once.

We can create a short library to print out all of the paths through a molecule in JRuby:

require 'mx-0.104.0.jar'
import 'com.metamolecular.mx.path.PathFinder'

class PathPrinter
  def initialize
    @finder = PathFinder.new
  end

  def print_paths atom
    paths = @finder.find_all_paths atom

    puts "printing all paths through the molecule"

    paths.each do |path|
      print_path path
    end
  end

  private

  def print_path path
    path.each do |atom|
      print atom.get_index
      print '-' unless path.get(path.length - 1).equals(atom)
    end

    puts
  end
end

Saving the above code in a file called pathprinter.rb, we can test it from interactive JRuby:

$ jirb
irb(main):001:0> require 'pathprinter'                   
=> true
irb(main):002:0> import com.metamolecular.mx.io.Molecules
=> Java::ComMetamolecularMxIo::Molecules
irb(main):003:0> benzene=Molecules.create_benzene        
=> #<Java::ComMetamolecularMxModel::DefaultMolecule:0x43da1b @java_object=com.metamolecular.mx.model.DefaultMolecule@8a2023>
irb(main):004:0> p=PathPrinter.new                       
=> #<PathPrinter:0x19ed7e @finder=#<Java::ComMetamolecularMxPath::PathFinder:0x3727c5 @java_object=com.metamolecular.mx.path.PathFinder@1140709>>
irb(main):005:0> p.print_paths benzene.get_atom(0)       
printing all paths through the molecule
0-5-4-3-2-1
0-1-2-3-4-5
=> nil

How It Works

Two classes collaborate in this traversal: com.metamolecular.mx.path.PathFinder and com.metamolecular.mx.path.DefaultStep.

Creating a depth-first traversal of your own is as simple as creating a DefaultStep from an Atom and implementing a walk method similar to the one shown below:

public void walk(Step step)
{
  if (!step.hasNextBranch())
  {
    // do something with the completed branch

    return;
  }

  while(step.hasNextBranch())
  {
    Atom next = step.nextBranch();

    if (step.isBranchFeasible(next))
    {
      walk(step.nextStep(next));

      step.backTrack();
    }
  }
}

Conclusions

Depth-first traversal is an important tool in any cheminformatics library. MX offers an implementation of this traversal strategy that can be easily customized.

Substructure Search From Scratch in Java Part 1: The Atom Mapping Problem 2

Posted by Rich Apodaca Mon, 17 Nov 2008 19:17:00 GMT

One of the most important capabilities in cheminformatics is mapping the atoms of a query structure onto the atoms of a target structure. Although useful in itself, the main value of atom mapping comes from the software that gets built on top of it: exact structure comparators, substructure search systems, and query atom/bond search systems such as SMARTS. The fundamental nature of atom mapping means that correctness, efficiency and adaptability are essential features of a good mapping implementation. Recently, a D-F article made the case that atom mapping software written in Java needs to be Java-centric to achieve these goals. This article, the first in a series that describes a complete substructure search system written in Java, takes the first step by offering some simple interface definitions and code for the atom mapping problem.

The Problem

Given a query molecule (query) and a target molecule (target), our atom mapping software needs to find ways to match the atoms of query onto target such that the mapping describes a substructure embedded in target. The software might stop at one mapping, continue on to find all of them, or stop at some point in the middle. It all depends on the specific cheminformatics problem we're trying to solve.

The Recursive Function

Our implementation will gradually build up an atom mapping by traversing the atoms of query in depth-first order and trying to map each found atom onto an atom in target. At each step in the process, we will have a partial atom map that maps some of the atoms in query onto target. That map, and any other information needed to complete the analysis will be kept in an instance of a class implementing the State interface.

A State will be manipulated by a recursive method, mapFirst that returns when the first atom map is found:

// create a list to hold atom maps
List<Map<Atom, Atom>> maps = new ArrayList<Map<Atom, Atom>>();

// create initial state
State state = ...; 

boolean mapFirst(State state)
{
  if (state.isDead())
  {
    return false;
  }

  if (state.isGoal())
  {
    maps.add(state.getMap());

    return true;
  }

  boolean found = false;

  while (!found && state.hasNextCandidate())
  {
    Match candidate = state.nextCandidate();

    if (state.isMatchFeasible(candidate))
    {
      State nextState = state.nextState(candidate);
      found = mapFirst(nextState);

      nextState.backTrack();
    }
  }

  return found;
}

Comparison of the mapFirst method to the pseudocode VF algorithm Match procedure given in the previous article shows some similarities. In fact, something similar to the mapFirst method forms the basis of many atom mappers in use today.

Although it may be clear from the code, it's worth re-iterating that each time mapFirst is recursively called, an attempt is made to branch off a new State that maps an additional pair of atoms from query to target. If that branch leads to a possible solution, it's followed. Otherwise the next possible mapping is explored.

The State Interface

The recursive mapFirst method determines all of the methods the State interface needs to define:

public interface State
{
  /**
   * Returns the current mapping of query atoms onto target atoms.
   * This map is shared among all states obtained through nextState.
   */
  public Map<Atom, Atom> getMap();

  /**
   * Returns true if another candidate match can be found or
   * false otherwise.
   */
  public boolean hasNextCandidate();

  /**
   * Returns the next candidate match.
   */
  public Match nextCandidate();

  /**
   * Returns true if the given match will work with the current
   * map, or false otherwise.
   */
  public boolean isMatchFeasible(Match match);

  /**
   * Returns true if all atoms in the query molecule have been
   * mapped.
   */
  public boolean isGoal();

  /**
   * Returns true if no match will come from this State.
   */
  public boolean isDead();

  /**
   * Returns a state in which the atoms in match have been
   * added to the current mapping.
   */
  public State nextState(Match match);

  /**
   * Returns this State's atom map to its original condition.
   */
  public void backTrack();
}

Finally, State uses an instance of the Match class:

public class Match
{
  private Atom query;
  private Atom target;

  public Match(Atom query, Atom target)
  {
    this.query = query;
    this.target = target;
  }

  public Atom getQueryAtom()
  {
    return query;
  }

  public Atom getTargetAtom()
  {
    return target;
  }
}

Conclusions

With just a few lines of Java, we've managed to reduce the fundamental cheminformatics problem of atom mapping to the far simpler problem of implementing the State interface.

How many ways are there to implement the State interface? Probably as many as there are subgraph isomorphism algorithms. Notice that the way we've set up the problem lets us use the same recursive method to test all State implementations, an essential prerequisite for benchmarking and optimization.

Future articles in this series will describe one way to implement the State interface.

Image Credit: Dollar Bin

What's Broken in Cheminformatics?

Posted by Rich Apodaca Wed, 14 Feb 2007 10:17:00 GMT

A while back, Seth Godin gave a talk on things that are broken, why they are broken, and why they stay broken (video). Seth is one of those rare speakers who can entertain and inform at the same time. I especially liked how his talk identifies seven (maybe more) reasons for brokenness:

  • Not My Job
  • Selfish Jerks
  • The World Changed
  • I Didn't Know
  • I'm not a Fish
  • Contradictions
  • Broken on Purpose

As I watched Seth's talk, it wasn't hard to think of many examples in which chem(o|i)informatics is broken. Of course, the way you feel about something being broken depends on your perspective. It's far too easy to become cynical when dealing with the frustration of broken things. But for anyone wanting to do research that matters or build useful products, things that are broken are like manna from heaven.

Mongrel and Rails: It's Just not Fair

Posted by Rich Apodaca Mon, 05 Feb 2007 14:51:00 GMT

Rails excels as a rapid Web development platform. Deployment of Rails applications, on the other hand, has until recently been a lot less rapid and a lot more complicated. A new Rails technology called Mongrel is now set to change all of that.

Mongrel is a fast Rails application container. It can be used in standalone mode - exactly like WEBrick. For hassle-free scaling, your Rails application can also be run on multiple Mongrel instances (either located on the same server or on multiple servers) behind a load balancer such as Pen, Balance, or Apache 2.1+. Using Mongrel is very simple: it's packaged as a Ruby Gem and requires minimal configuration. Mongrel is also stable (Version 1.0 was just released), actively maintained (unlike lighttpd), and apparently quite secure.

What does this have to do with cheminformatics? Quite a lot if you, like me, see the Web as the ideal platform for developing cheminformatics applications. Sure, a lot of technologies will need to be created first. But this is just another way of underscoring the big opportunities ahead. Future articles will discuss some of them.

Anatomy of a Cheminformatics Web Application: Ajaxifying Depict

Posted by Rich Apodaca Mon, 04 Dec 2006 15:06:00 GMT

The previous tutorial in this series showed some techniques for improving the appearance and usability of a simple cheminformatics Web application. That application, Depict, rendered color images of 2-D molecular structures when given a SMILES string. Still, something is missing. Wouldn't it be better if the application responded to individual keystrokes in the input field, rather than waiting for the user to hit the return key? In this tutorial, we'll see how to quickly accomplish this effect with a technology called "Ajax."

Downloads and Prerequisites

For this tutorial, you'll need Ruby CDK (RCDK). A recent article described the small amount of system configuration required for RCDK on Linux. Another article showed how to install RCDK on Windows.

In addition, you'll need to install Ruby on Rails - something that can be done through RubyGems.

The Rails application that this tutorial starts with can be downloaded from this link. If you'd rather start working directly with the version of Depict produced by applying the changes outlined in this tutorial, the full source code can be downloaded from this link.

If you'll be running Depict on an AMD64 Linux system, you'll need to prepend your invocation of script/server with LD_PRELOAD. For example, on my system running Sun's JVM, the full command looks like:

$ LD_PRELOAD=/usr/java/jdk1.5.0_09/jre/lib/amd64/libzip.so ruby script/server

A Brief Introduction to Ajax

When stripped down to its essentials, Ajax is nothing more than an asynchronous communication channel between Web browsers and Web servers. In the pre-Ajax model of client-server Web interactions, a browser would make a request to a server and then wait until getting a server response, which would take the form of a complete Web page. In the Ajax model, a browser makes a request to a server, continuing to function while the server generates a response, which takes the form of a small section of the page that gets replaced. For this reason, Ajax-enabled Web sites are far more application-like than the document-centric sites that preceded them.

Ajax Support in Rails

Ajax is implemented in JavaScript using the HTMLHttpRequest object, although working at this level can require a lot of code to do anything meaningful. Fortunately, Rails and other Web application frameworks provide high-level interfaces to Ajax. In Rails, Ajax support takes the form of a variety of helper methods, one of which we'll use in this tutorial: observe_field. This method, an instance of the Observer Pattern, assigns an Observer to monitor input activity in a text field.

The Problem at Hand

We'd like Depict to provide immediate feedback by rendering a SMILES string as it is keyed into the input field. If the partial SMILES string is valid, it will be rendered, otherwise, an error image will be rendered. At no point will the user need to press the return key to see an image of the SMILES string they are typing.

Step 1: Ajaxify the View

Let's start by adding an observer to Depict's input field. These changes will occur to the SMILES View, contained in depict/app/views/smiles/depict.rhtml:

<html>
  <head>
    <title>Depict</title>
    <%= stylesheet_link_tag "default", :media => "all" %>

    <!-- Nothing works without this line. -->
    <%= javascript_include_tag :defaults %>
  </head>
  <body>
    <h1>Depict a SMILES String</h1>

    <!-- New id attribute needed by Ajax -->
    <div class="image" id="results" >
      <img src="<%= image_for_smiles :smiles => @smiles %>"></img>
    </div>
    <br /><br />
      <div class="smiles">
      <%= form_tag :action=>'depict' %>
        <label>SMILES: </label>

        <!-- Ajaxified text field. -->
        <!-- We turn off autocomplete to simplfify the interface. -->
        <%= text_field_tag :smiles, @smiles, {:autocomplete => "off"} %>
        <%= observe_field( :smiles,
                           :frequency => 0.5,
                           :update    => :results,
                           :url       => { :action => :ajax_depict } ) %>
      <%= end_form_tag %>
      </div>
  </body>

  <div class="about">
    <!-- Update the URL to point to the new Depth-First article -->
    <a href="http://depth-first.com/articles/2006/12/04/anatomy-of-a-cheminformatics-web-application-ajaxifying-depict">About this Application</a>
  </div>
</html>

The above code introduces three key elements:

  • The javascript_include_tag method is called, which is surprisingly easy to forget to do.

  • The original text_field method call is replaced by text_field_tag to simplify coding. We disable browser-based autocompletion by setting the autocomplete attribute to off. This removes a feature unlikely to ever be used, and leads to a more streamlined interface.

  • The observe_field method is called, linking activity in the text field to an Ajax action, ajax_depict, that will update the image area. To accomplish this, we assign the div containing our image the id "results."

Making these changes and refreshing the browser window gives a screen like the one below:

Although the client side of the Ajax communication channel is working, the server side is not. Let's fix that.

Step 2: Ajaxify the Server

Depict needs an Action and View that will be invoked in response to keyboard events in the SMILES input box. To do this, first add a new ajax_depict method to SmilesController, the source for which is found in depict/app/controllers/smiles_controller.rb:

class SmilesController < ApplicationController
  def depict
    if params[:smiles]
      @smiles = params[:smiles][:value]
    else
      @smiles = ''
    end
  end

  def image_for
    if flash[:bytes]
      send_data(flash[:bytes], :type => "image/png", :disposition => "inline", :filename => "#{flash[:smiles]}.png")
    end
  end

  # The new ajax_depict method.
  def ajax_depict
    @smiles=request.raw_post
  end
end

Making the above changes and refreshing your browser should give an error message:

The new ajax_depict method is being called, but no associated template exits. This template contains the HTML that will be inserted into the div with the results id attribute that we set up in Step 1. We can resolve the error we're getting by simply creating a new file (depict/app/views/smiles/ajax_depict.rhtml) containing the following partial template:

<img src="<%= image_for_smiles :smiles => @smiles %>"></img>

Now, refreshing your browser should produce a screen like that shown below. We have now Ajaxified Depict, but we're not quite done yet.

Step 3: Update the Cascading Style Sheet

As you type a SMILES string into the input window, you may have noticed the input box being repositioned toward the top of the application window just prior to the display of a new image. This is due to the image area being resized to zero height as the new image is generated.

Fortunately, the fix is simple; we'll just specify that the image area must be 400 pixels high, whether an image is being displayed or not. This is done by editing the image selector in the CSS file at depict/public/stylesheets/default.css:

.image {
    margin-left: auto;
    margin-right: auto;
    width: 400px;
    /* Keeps the input box from moving during image refresh.*/
    height: 400px;
}

Refreshing the Depict window should now give a statically-positioned SMILES input field.

Step 4: Backward Compatibility

As it stands, if the user presses the return key, they will see the "Enter SMILES Below" message. This is due to the change in the way SMILES strings are transmitted into the application. To fix this problem, we simply change the way that SmilesController assigns the smiles instance variable (depict/app/controllers/smiles_controller.rb):

def depict
  # Uses new input method.
  if params[:smiles]
    @smiles = params[:smiles]
  else
    @smiles = ''
  end
end

Making this change produces an interface that will render the correct image whether the return key is typed or not. If JavaScript is disabled, Depict will work exactly the same way as it did in the non-Ajax version.

Conclusions

Ajax makes the Web more attractive than ever as an application development platform. In this tutorial, we've seen how using Rails made it very easy to give Depict the feel of an interactive SMILES depiction tool using Ajax. But a few details remain before we're ready to deploy this application on a Web server for the public to use. For example, we need to take server load and network latency into account, and we need to make sure Depict works well on all major browsers. The next articles in this series will address these issues.

Older posts: 1 2