Easily Calculate TPSA Descriptors from SMILES Strings Using Ruby CDK

Posted by Rich Apodaca Wed, 19 Sep 2007 13:27:00 GMT

A D-F reader wrote in to ask how to calculate Topological Polar Surface Area (TPSA) using Ruby CDK. TPSA is one of the most widely used descriptors for predicting membrane permeability and, by extension, other important ADME properties. This article shows how to calculate TPSA in Ruby with Ruby CDK.

The Library

Our library consists of nothing more than a few method calls to manipulate the underlying CDK library. The tpsa_for method accepts any SMILES string and returns the calculated TPSA:

require 'rubygems'
require_gem 'rcdk'
require 'rcdk/util'

jrequire 'org.openscience.cdk.qsar.descriptors.molecular.TPSADescriptor'

module TPSA
  @@calc = Org::Openscience::Cdk::Qsar::Descriptors::Molecular::TPSADescriptor.new

  def tpsa_for smiles
    mol = RCDK::Util::Lang.read_smiles smiles

    @@calc.calculate(mol).getValue.doubleValue
  end
end

An Interactive Test

Saving the library to a file called tpsa.rb lets us test it through interactive Ruby (irb):

$ irb
irb(main):001:0> require 'tpsa'
./tpsa.rb:2:Warning: require_gem is obsolete.  Use gem instead.
/usr/local/lib/ruby/gems/1.8/gems/rcdk-0.3.0/lib/rcdk/java.rb:26:Warning: require_gem is obsolete.  Use gem instead.
=> true
irb(main):002:0> include TPSA
=> Object
irb(main):003:0> tpsa_for 'COCCc1ccc(OCC(O)CNC(C)C)cc1' # metoprolol
=> 50.72
irb(main):004:0> tpsa_for 'O=C3Nc1ccc(Cl)cc1C(c2ccccc2)=NC3O' # oxazepam
=> 61.69

The results we obtain for metoprolol and oxazepam are 50.72 and 61.69, respectively. These values compare well with those reported by Ertl et al. in the definitive paper on TPSA (50.7 and 61.7, respectively).
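The comparison can be made explicit with a few lines of plain Ruby (no CDK required). This is only a minimal sketch: the constant names, the agrees? helper, and the 0.05 tolerance are assumptions chosen for illustration, the tolerance reflecting that the literature values are quoted to one decimal place.

```ruby
# Calculated values from the irb session and literature values quoted in the text.
CALCULATED = { 'metoprolol' => 50.72, 'oxazepam' => 61.69 }
REPORTED   = { 'metoprolol' => 50.7,  'oxazepam' => 61.7 }

# ASSUMPTION: half a unit in the last reported decimal place.
TOLERANCE = 0.05

# True when the two values differ by no more than the tolerance.
def agrees?(calculated, reported, tolerance = TOLERANCE)
  (calculated - reported).abs <= tolerance
end

CALCULATED.each do |name, value|
  status = agrees?(value, REPORTED[name]) ? 'agrees' : 'differs'
  puts "#{name}: #{value} vs #{REPORTED[name]} (#{status})"
end
```

A check like this is easy to turn into a regression test once more reference compounds are added.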

Conclusions

It doesn't take much Ruby to command a wide range of cheminformatics functionality - in this case TPSA calculations. But the fun doesn't stop there. The CDK, and by extension Ruby CDK, offer access to a wide array of descriptor calculations, each of which follows the same basic pattern outlined here. All of it can be prototyped, debugged, and deployed through one of the most flexible programming languages currently available.

A Molecular Language for Modern Chemistry: Reading FlexMol Documents with Octet

Posted by Rich Apodaca Wed, 31 Jan 2007 19:56:00 GMT

An XML language is only as useful as the software tools that take advantage of it. Previous articles have discussed how the XML language FlexMol can solve a variety of molecular representation problems ranging from the multiatom bonding of metallocenes to the axial chirality of biaryls. Octet is a framework written in Java that speaks FlexMol natively. In this article, I'll show how Octet can be used to read a sample FlexMol document.

Prerequisites

For this tutorial, you'll need Ruby Java Bridge (RJB). Previous articles have discussed the installation and use of RJB on Windows and Linux.

A Sample Molecule

A recent article discussed a FlexMol representation of the chiral natural product monolaterol. Using a slightly modified numbering system for this molecule (shown above), we can construct a complete FlexMol representation. In this case, we simply start numbering at index zero, subtracting one from every index in the previous example to match the zero-based indices used in Octet.

A Demonstration Package

To illustrate the process of reading a FlexMol document, I've prepared a small package (demo-20070131.tar.gz) that can be downloaded from SourceForge. In it, you'll find an Octet jarfile (octet-0.8.2.jar), a FlexMol representation of monolaterol (s_monolaterol.xml), a Ruby library (reader.rb), and some Ruby test code (test.rb). Inflate this archive and make it your working directory.

A Simple Test

The following sequence of commands will run the test included with the demonstration package:

$ export CLASSPATH=./octet-0.8.2.jar
$ ruby test.rb

You should see several lines of output terminated with the line:

The exact mass of monolaterol is 276.115029755.
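For readers curious where a number like this comes from: a monoisotopic ("exact") mass is the sum of the masses of the most abundant isotope of each atom. The sketch below is plain Ruby and does not use Octet. It assumes the molecular formula C19H16O2, which this article does not state; the formula was chosen because it is consistent with the mass printed above and, if hydrogens are implicit, with the 21 atoms counted in the irb session below. The isotope masses are standard values for carbon-12, hydrogen-1, and oxygen-16.

```ruby
# Monoisotopic masses (unified atomic mass units) of the most abundant isotopes.
ISOTOPE_MASS = {
  'C' => 12.0,            # carbon-12, exact by definition
  'H' => 1.00782503207,   # hydrogen-1
  'O' => 15.99491461956   # oxygen-16
}

# ASSUMPTION: formula inferred from the reported mass, not taken from the article.
ASSUMED_FORMULA = { 'C' => 19, 'H' => 16, 'O' => 2 }

# Sums isotope mass times atom count over every element in the formula.
def exact_mass(formula)
  formula.inject(0.0) { |sum, (element, count)| sum + ISOTOPE_MASS[element] * count }
end

puts "Exact mass: %.6f" % exact_mass(ASSUMED_FORMULA)
```

Small differences in the trailing digits are to be expected, depending on which table of isotope masses a given toolkit uses.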

You can get more hands-on experience with loading and processing the monolaterol FlexMol document using interactive Ruby (irb). For example:

$ irb
irb(main):001:0> require 'reader'
=> true
irb(main):002:0> r=Reader.new
=> #:0x2b9ab1736690>, @handler=#<#:0x2b9ab1736e10>, @builder=#<#:0x2b9ab1736b90>>
irb(main):003:0> mol=r.read_file 's_monolaterol.xml'
=> #<#:0x2b9ab172cd48>
irb(main):004:0> mol.countAtoms
=> 21
irb(main):005:0> mol.countBondingSystems
=> 24

Of course, this is just scratching the surface of what can be done once a FlexMol document has been loaded by Octet.

Conclusions

Octet makes it possible to convert FlexMol documents into Java object representations that can be accessed through Ruby. With an object representation, the possibilities are limitless. Some simple examples have been provided here. Future articles will illustrate more advanced uses.

Build a Rails Cheminformatics Application in Thirty Minutes

Posted by Rich Apodaca Tue, 21 Nov 2006 20:06:00 GMT

A recent article highlighted the Web as a new cheminformatics platform. Advocacy is one thing, but a working, open demo built with modern technologies is far more compelling. In the following tutorial we'll build a first-generation cheminformatics Web application using the Ruby on Rails framework and 100% Open Source components. We'll just cover the essentials here - look for future articles to discuss the underlying technology in more detail.

The Problem

Simplified Molecular Input Line Entry System (SMILES) is one of the most compact and easy-to-learn molecular representation systems ever developed. SMILES belongs to a larger family of molecular languages called line notations, and SMILES strings are always written as a single line of ASCII text. This makes them perfect in situations calling for data entry; witness their use in a wide range of new free online chemistry databases. Typically, a chemist draws a structure in a graphical editor, copies the resulting SMILES string from the editor, and pastes it into a search window in the database application.

SMILES is a great language for computers, but not for chemists, who are trained to communicate through 2-D structure diagrams. Although SMILES strings can be decoded manually, this is a tedious and error-prone process, especially for SMILES encoding high degrees of branching and ring content. It's preferable for the computer to do this hard work for us, providing a perfectly laid-out 2-D structure diagram for use in debugging or inclusion in documents.

Depict is a Web application originally developed by Daylight for the conversion of SMILES strings into 2-D structure diagrams. Type a SMILES string into the form, press enter, and get a raster image of the encoded molecule. Daylight's Depict does a good enough job, but you can't change the interface or output. You also can't take the software apart to see how it works. Wouldn't it be great if you could?

About This Tutorial Series

This tutorial is the first in a series describing how to build a Depict server using 100% Open Source components. The application will accept a SMILES string in a Webpage text field, and then produce a 2-D structure diagram. It won't be designed for ease of use, appearance, or configurability - these improvements will be described in subsequent articles. When this application is finished, I'll deploy it on a Web server. At every step in this process, I'll provide enough detail for anyone to do the same.

It won't be necessary to finish every step yourself before you can work with the finished product. Near the beginning of each installment will be a "Download and Prerequisites" section containing a link to the complete source code. Simply download this code and run it to see what it does.

Download and Prerequisites

For this tutorial, you'll need Ruby CDK (RCDK). A recent article described the small amount of system configuration required for RCDK on Linux. Another article showed how to install RCDK on Windows.

In addition, you'll need to install Ruby on Rails - something that can be done through RubyGems.

The complete Depict application can be downloaded from this link.

A Note on Ruby Java Bridge and AMD64 Linux Platforms

Our Depict application will use Ruby Java Bridge (RJB) as a Ruby interface to Java bytecode. Recently, a problem with RJB on AMD64-Linux was uncovered. This problem prevents third-party jarfiles from being loaded after Rails has been loaded.

In practice, this means that the command to start the Rails server (Step 2) needs to be prefixed with an assignment of LD_PRELOAD. You also need to make LD_LIBRARY_PATH point to your native Java libraries. On my platform, which is AMD64-Linux running Sun's JVM, the commands are:

$ export LD_LIBRARY_PATH=/usr/java/jdk1.5.0_09/jre/lib/amd64:/usr/java/jdk1.5.0_09/jre/lib/amd64/server
$ LD_PRELOAD=/usr/java/jdk1.5.0_09/jre/lib/amd64/libzip.so ruby script/server

If you get an "Internal Error" due to an "unknown exception" while running Depict, chances are good that you've hit the same problem. Starting the Rails server as above should resolve it.

Step 1: Create the Application

Getting started with Rails is as simple as issuing the rails command with the name of your application as an argument:

$ rails depict

Executing this command creates a complete Rails application template under the depict subdirectory in your working directory. You build your application by editing the files and directories that were generated.

Step 2: Start the Server

You can start the Depict application by running the included server script:

$ cd depict
$ ruby script/server
=> Booting WEBrick...
=> Rails application started on http://0.0.0.0:3000
=> Ctrl-C to shutdown server; call with --help for options
[2006-11-18 10:17:08] INFO  WEBrick 1.3.1
[2006-11-18 10:17:08] INFO  ruby 1.8.5 (2006-08-25) [x86_64-linux-gnu]
[2006-11-18 10:17:08] INFO  WEBrick::HTTPServer#start: pid=4036 port=3000

Let's see what Depict looks like so far. Point your browser to http://localhost:3000. You should see the following page:

Congratulations! You're now running Ruby on Rails.

Step 3: Create the SmilesController

Rails adapts the Model-View-Controller application paradigm to the Web. It also automates many of the steps in building models, views, and controllers. Let's create a controller to handle the manipulation of SMILES strings:

$ ruby script/generate controller Smiles
      exists  app/controllers/
      exists  app/helpers/
      create  app/views/smiles
      exists  test/functional/
      create  app/controllers/smiles_controller.rb
      create  test/functional/smiles_controller_test.rb
      create  app/helpers/smiles_helper.rb

Currently, SmilesController is just a skeleton:

class SmilesController < ApplicationController
end

Let's give SmilesController the ability to accept a SMILES string as input by adding an input method.

class SmilesController < ApplicationController
  def input

  end
end
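Rails connects a URL such as /smiles/input to the input method of SmilesController through its default :controller/:action/:id route. The following is a rough sketch of that naming convention only, written in plain Ruby; resolve is a hypothetical helper invented for illustration, and Rails' actual routing code is far more general.

```ruby
# Hypothetical helper illustrating the :controller/:action convention.
# Returns the controller class name and the action name for a URL path.
def resolve(path)
  controller, action = path.split('/').reject { |segment| segment.empty? }
  action ||= 'index' # Rails falls back to the index action when none is given
  class_name = controller.split('_').map { |word| word.capitalize }.join + 'Controller'
  [class_name, action]
end

p resolve('/smiles/input') # => ["SmilesController", "input"]
```

This is why adding a method to the controller is all it takes to make a new URL available.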

Step 4: Create a Form

At this stage, pointing your browser to http://localhost:3000/smiles/input gives a screen containing an error message:

Rails is looking for a view that doesn't exist, so let's create one. To your depict/app/views/smiles directory, add the following file, called input.rhtml:

<html>
  <head>
    <title>Enter a SMILES String</title>
  </head>
  <body>
    <%= form_tag :action=>'depict' %>
      Enter a SMILES String: <br />
      <%= text_field('smiles', 'value') %><br />
    <%= end_form_tag %>
  </body>
</html> 

This HTML view is an example of Ruby's templating mechanism, eRuby, which was discussed earlier in the context of converting SD files to HTML. In the template above, we've configured our form to invoke an action called depict when submitted. This action does not yet exist, but will be created in Step 5 below.

Now, pointing your browser to http://localhost:3000/smiles/input should give an input field:

Let's try it. Submitting the SMILES string for benzene gives the following error screen:

We haven't defined the depict action yet, a fact that Rails is communicating with this error message.

Have you noticed how we haven't had to restart the Rails Web server as we've made changes? This is but one of the many conveniences that make Rails such a productive platform.

Step 5: Add a Depict Action

We need a way to pass a SMILES string from the Web page text field in which it's entered to our application and back to another view. To do this we'll add a depict method to depict/app/controllers/smiles_controller.rb:

def depict
  @smiles = @params[:smiles][:value]
end
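The nested lookup @params[:smiles][:value] mirrors the name Rails gives the form field: text_field('smiles', 'value') produces an input named smiles[value]. The sketch below shows, outside of Rails and using only Ruby's standard uri library, how such a bracketed name can be unpacked into a nested hash. The nested_params helper is hypothetical and handles just one level of nesting; Rails' own parameter parser does considerably more, and also lets you use symbols as keys.

```ruby
require 'uri'

# Hypothetical, simplified parser: handles only "outer[inner]=value" pairs
# and plain "key=value" pairs.
def nested_params(body)
  URI.decode_www_form(body).each_with_object({}) do |(key, value), params|
    if key =~ /\A(\w+)\[(\w+)\]\z/
      (params[$1] ||= {})[$2] = value
    else
      params[key] = value
    end
  end
end

# Simulate the body a browser would submit for the form in Step 4.
body = URI.encode_www_form('smiles[value]' => 'c1ccccc1')
params = nested_params(body)
puts params['smiles']['value']
```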

Of course, our application still won't run properly because we haven't created a view for the new depict method to use. Let's do this by adding the following file, named depict.rhtml, to the depict/app/views/smiles directory:

<html>
  <head>
    <title>Depict SMILES: <%= @smiles %></title>
  </head>
  <body>
    <h1>SMILES: <%= @smiles %></h1>
  </body>
</html>

Notice how the instance variable @smiles is available for use within the template.
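Outside of Rails, the same mechanism can be tried with Ruby's standard erb library. The sketch below is a stand-alone approximation, not a description of how Rails wires controllers to views: DepictPage is a hypothetical class that plays the controller's role by holding @smiles and handing its binding to the template. It also escapes the value with h (from ERB::Util), a sensible precaution for user-supplied text that the template above omits.

```ruby
require 'erb'

# Hypothetical stand-in for a controller plus view: holds @smiles and
# renders a template that can see it.
class DepictPage
  include ERB::Util # provides html_escape, aliased as h

  TEMPLATE = <<~HTML
    <html>
      <head>
        <title>Depict SMILES: <%= h @smiles %></title>
      </head>
      <body>
        <h1>SMILES: <%= h @smiles %></h1>
      </body>
    </html>
  HTML

  def initialize(smiles)
    @smiles = smiles
  end

  # Evaluates the template in this object's context, so @smiles is visible.
  def render
    ERB.new(TEMPLATE).result(binding)
  end
end

html = DepictPage.new('c1ccccc1').render
puts html
```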

Let's have a look at Depict so far. Pointing your browser to http://localhost:3000/smiles/input, entering the SMILES string for benzene, and pressing return produces the page shown below:

So far, so good. We've been able to read user input from an HTML form and reprocess it into some simple HTML output. Now, let's render a 2-D molecular image to go with it.

Step 6: Generate the 2-D Image

We'll use a method called image_for, which we'll define shortly. The file depict/app/views/smiles/depict.rhtml should look like this:

<html>
  <head>
    <title>Depict SMILES: <%= @smiles %></title>
  </head>
  <body>
    <h1>SMILES: <%= @smiles %></h1>
    <img src="<%= url_for :action => "image_for", :smiles => @smiles %>"></img>
  </body>
</html>

The added img tag is a placeholder for now. It loads an image dynamically generated from the image_for method, which we'll shortly add to SmilesController. We pass the SMILES string as a parameter.
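SMILES strings routinely contain characters such as #, =, ( and + that have special meaning in URLs, so the value has to be percent-encoded before it can travel in a query string. Inside Rails, url_for is expected to take care of this; the sketch below only illustrates the idea with Ruby's standard uri library. The helper name image_path_for and the exact path layout are assumptions made for illustration.

```ruby
require 'uri'

# Hypothetical helper: builds the kind of URL the img tag would request.
def image_path_for(smiles)
  '/smiles/image_for?' + URI.encode_www_form('smiles' => smiles)
end

acetonitrile = 'CC#N'
path = image_path_for(acetonitrile)
puts path

# Decoding the query string recovers the original SMILES.
query = path.split('?', 2).last
decoded = Hash[URI.decode_www_form(query)]
puts decoded['smiles']
```

Without encoding, a browser would treat everything after the # in CC#N as a page fragment and never send it to the server.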

The image_for method does all of the real work in the Depict application. It accepts a SMILES string as a parameter, and produces a laid-out 2-D color molecular image as output. The method uses a variety of functionality contained in the Java API itself, and in Ruby CDK.

In addition to an image_for method, we'll need to add some accessory code to make it work. Edit depict/app/controllers/smiles_controller.rb so that it looks like this:

# Load the RCDK library
require_gem 'rcdk'
require 'rcdk/util'

# New jrequire calls.
jrequire 'java.io.ByteArrayOutputStream'
jrequire 'net.sf.structure.cdk.util.ImageKit'
jrequire 'javax.imageio.ImageIO'

class SmilesController < ApplicationController

  # Already defined.
  def input

  end

  # Already defined.
  def depict
    @smiles = @params[:smiles][:value]
  end

  # New method.
  def image_for
    smiles = @params[:smiles]
    mol = RCDK::Util::Lang.read_smiles smiles
    mol = RCDK::Util::XY::coordinate_molecule mol
    out=Java::Io::ByteArrayOutputStream.new
    image=Net::Sf::Structure::Cdk::Util::ImageKit.createRenderedImage(mol, 300, 300)

    Javax::Imageio::ImageIO.write(image, "png", out)

    send_data(out.toByteArray, :type => "image/png", :disposition => "inline", :filename => "molecule.png")
  end
end

Let's test the application with a real-world example. The achiral SMILES string for Carmine is:

CC1=C2C(=CC(=C1C(=O)O)O)C(=O)C3=C(C2=O)C(=C(C(=C3O)O)C4C(C(C(C(O4)CO)O)O)O)O

Pointing your browser to http://localhost:3000/smiles/input and entering the above SMILES string produces a color 2-D image of the structure of the red food coloring:

Conclusions

Ruby on Rails is a fun and agile framework for rapid Web development. Although Depict isn't much to look at yet, it demonstrates many key Rails concepts. Several techniques could be used to improve the application's look and usability. For example, we could use AJAX to depict SMILES strings as they are being typed - without the need to hit return. We could also provide options for changing image format, size, and color scheme. Future articles will describe these and other improvements.

Scripting Java with Ruby: Yet Another Java Bridge

Posted by Rich Apodaca Wed, 25 Oct 2006 18:53:00 GMT

New technologies attempting to compete with older technologies need to provide a clear upgrade path if they are to succeed. A case in point is Ruby. Many Java developers' reaction to this language has less to do with its capabilities and more to do with previous investments in Java. What good is a new language if the special library X that you depend on needs to be rewritten from scratch?

Previous articles, starting with this one, have discussed Ruby Java Bridge (RJB) as a Java-Ruby integration tool. Two additional articles discussed RJB in the context of mapping Java packages onto Ruby modules and Java-Ruby integration on Windows. RJB currently provides the mechanism whereby the full Chemistry Development Kit (CDK) API can be used in Ruby with Ruby CDK.

Another option for Java-Ruby integration is JRuby, a Java implementation of the Ruby interpreter. JRuby offers tight integration with the Java Virtual Machine, which will be ideal in many situations. In other situations, it will not be the best choice. For example, one of the advantages of RJB over JRuby is that the standard C-Ruby implementation can be used. This in turn offers, for example, full Rails functionality and access to C extensions. A disadvantage of RJB is that, being written in C, it requires a working build toolchain for installation.

I've seen one report of a Macintosh installation of RJB that failed. Without a Mac of my own, I can't confirm if this is indeed a problem. But this report also pointed me to a third approach to Ruby-Java integration, Yet Another Java Bridge (YAJB). YAJB is different from both JRuby and RJB in that it extends the C implementation of Ruby with a Java bridge written in pure Java. In theory, it should run on any platform that both Ruby and Java run on.

YAJB-0.8.1 installed on my system without a hitch. From the root directory of the distribution:

# ruby setup.rb

Using YAJB was straightforward. A Java Vector instance could be instantiated and manipulated using familiar syntax:

require 'yajb/jbridge'
include JavaBridge

v = jnew "java.util.Vector"

v.add("one")
v.add("two")
v.size # => 2
v.elementAt(1) # => "two"

Good integration tools can make the difference between actually using new technologies and simply observing them. Java developers interested in using Ruby now have at least three good options to choose from: JRuby, RJB, and YAJB.

Metaprogramming with Ruby: Mapping Java Packages Onto Ruby Modules

Posted by Rich Apodaca Tue, 24 Oct 2006 18:11:00 GMT

Metaprogramming lets you define new constructs in your programming language. In safety languages like Java, metaprogramming is not a standard feature, although it can be done. Not surprisingly, the reaction most Java developers have to metaprogramming typically ranges from "useless" to "catastrophic". I was once in that category. But in the words of Paul Graham, sometimes doing the right thing involves "changing the language to suit the problem". This is a powerful concept that, when used with care, can greatly reduce development time. The Ruby language is especially well-equipped for metaprogramming. This article will show a simple metaprogramming technique that extends Ruby so that Java packages are mapped onto Ruby modules.

Prerequisites

This tutorial uses Ruby Java Bridge (RJB). RJB can be installed either on Windows or Linux using the RubyGems packaging mechanism.

Defining the Problem

The large amount of existing Java code makes RJB one of Ruby's most useful integration tools. RJB provides a lightweight mechanism for working with Java classes from Ruby, while simultaneously allowing for the use of C-extensions and everything else the C implementation of Ruby offers.

Let's say you'd like to do image processing with Java's ImageIO class. This could be accomplished with RJB via:

require 'rubygems'
require_gem 'rjb'
require 'rjb'

@ImageIO = Rjb::import 'javax.imageio.ImageIO'

@ImageIO.getReaderFormatNames # => an array of format names

Here, the instance variable @ImageIO holds a reference to the Java class ImageIO. This isn't too bad, but can we do better?

Imagine a situation in which many Java classes need to be imported. This would lead to many variable assignments like the one above. The resulting duplication of Java class names and variable names doesn't smell especially good, nor does it scale well. In addition, the large number of variable assignments would add a lot of mental overhead when reading and writing the code.

What we'd really like is a way to map a Java package/class hierarchy onto a Ruby module/class hierarchy. We could then forget about the differences between Ruby and Java, and just get to work. For example:

#...

jrequire 'javax.imageio.ImageIO'

#...

Javax::Imageio::ImageIO.getReaderFormatNames

A Solution

Our solution will involve translating nested Java package/class constructs into nested Ruby module/class constructs at runtime. We'll need the ability to create new module hierarchies in running code, one of the problems metaprogramming solves. We'll also have to deal with capitalization: Java package names are all lowercase, but Ruby module names start with a capital letter. So java.lang.System will become Java::Lang::System. By mapping the Java package namespace onto the Ruby module namespace, we'll reduce the odds of creating a Ruby/Java class name collision, such as with java.lang.String.
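The naming rule on its own is easy to sketch in plain Ruby, with no Java involved. The helper below, ruby_constant_path, is a hypothetical name introduced just for this illustration: it capitalizes each package segment and leaves the class name untouched.

```ruby
# Hypothetical helper: returns the Ruby constant path that corresponds to
# a fully-qualified Java class name.
def ruby_constant_path(qualified_class_name)
  segments = qualified_class_name.split('.')
  class_name = segments.pop                         # last segment is the class
  modules = segments.map { |name| name.capitalize } # packages become modules
  (modules + [class_name]).join('::')
end

puts ruby_constant_path('java.lang.System')      # => Java::Lang::System
puts ruby_constant_path('javax.imageio.ImageIO') # => Javax::Imageio::ImageIO
```

Note that the class name must not be passed through capitalize, which would turn ImageIO into Imageio.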

Let's create a small library to illustrate these points. The code will take advantage of Ruby's const_set method, which allows new constants (and therefore new modules and classes) to be defined at runtime. Save the following code into a file called java.rb:

require 'rubygems'
require_gem 'rjb'
require 'rjb'

module Kernel

  def jrequire(qualified_class_name)
    java_class = Rjb::import(qualified_class_name)
    package_names = qualified_class_name.to_s.split('.')
    # pop removes only the final element (the class name); delete would
    # remove every element equal to it.
    java_class_name = package_names.pop
    new_module = self.class

    package_names.each do |package_name|
      module_name = package_name.capitalize

      if !new_module.const_defined?(module_name)
        new_module = new_module.const_set(module_name, Module.new)
      else
        new_module = new_module.const_get(module_name)
      end
    end

    return false if new_module.const_defined?(java_class_name)

    new_module.const_set(java_class_name, java_class)

    return true
  end
end
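The module-building loop can be tried without a JVM. The sketch below is a hypothetical variant, not the library code itself: define_nested takes the class object as an argument instead of importing it through RJB, and builds the hierarchy under a throwaway Sandbox module rather than Object. It also passes false as the second argument to const_defined? and const_get; on Ruby 1.9 and later these methods otherwise search Object as well, so a top-level constant such as String would be mistaken for one already defined in the nested module.

```ruby
# Hypothetical variant of jrequire's inner loop. Defines klass under root,
# creating one nested module per package segment. Returns false if the
# class constant already exists there, true otherwise.
def define_nested(root, qualified_name, klass)
  names = qualified_name.split('.')
  class_name = names.pop

  target = names.inject(root) do |mod, name|
    const = name.capitalize
    if mod.const_defined?(const, false)
      mod.const_get(const, false)
    else
      mod.const_set(const, Module.new)
    end
  end

  return false if target.const_defined?(class_name, false)
  target.const_set(class_name, klass)
  true
end

Sandbox = Module.new
FakeSystem = Class.new # stand-in for the proxy RJB would return

first  = define_nested(Sandbox, 'java.lang.System', FakeSystem)
second = define_nested(Sandbox, 'java.lang.System', FakeSystem)

puts first  # true: the constant was created
puts second # false: it was already there
puts Sandbox::Java::Lang::System.equal?(FakeSystem)
```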

Usage

Using this library consists of requiring it, applying the new jrequire command, and manipulating the resulting class:

require 'java'
jrequire 'javax.imageio.ImageIO'

Javax::Imageio::ImageIO.getReaderFormatNames # => ["BMP", "jpeg", "bmp", "wbmp", "gif", "JPG", "png", "jpg", "WBMP", "JPEG"]

You can eliminate the need to use the fully-qualified module name by adding an include statement, just as you would with any other Ruby module:

require 'java'
jrequire 'javax.imageio.ImageIO'
include Javax::Imageio

ImageIO.getReaderFormatNames # => ["BMP", "jpeg", "bmp", "wbmp", "gif", "JPG", "png", "jpg", "WBMP", "JPEG"]

One Possible Variation

An interesting variation on the approach given here would be to override Ruby's require method itself to accept fully-qualified Java class names. Then something even more Rubyesque could be used:

require 'java'
require 'javax/imageio/ImageIO'

Javax::Imageio::ImageIO.getReaderFormatNames
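One question such an override has to answer is how to tell a Java class path from an ordinary Ruby library path. The following is only a rough sketch of one possible answer and does not override require itself. The helper java_class_name_for and its heuristic are assumptions made for illustration: a path is treated as a Java class if it has at least one lowercase package segment and its final segment starts with an uppercase letter.

```ruby
# Hypothetical helper: returns the dotted Java class name for a require-style
# path, or nil if the path looks like an ordinary Ruby library.
def java_class_name_for(path)
  segments = path.split('/')
  return nil if segments.length < 2                # need a package and a class
  return nil unless segments.last =~ /\A[A-Z]/     # class names are capitalized
  packages = segments[0..-2]
  return nil unless packages.all? { |s| s =~ /\A[a-z][a-z0-9_]*\z/ }
  segments.join('.')
end

p java_class_name_for('javax/imageio/ImageIO') # => "javax.imageio.ImageIO"
p java_class_name_for('rcdk/util')             # => nil
```

An overridden require could call jrequire whenever this helper returns a name, and fall back to the original require otherwise. A naive heuristic like this would misfire on any Ruby library whose file name happens to be capitalized, so it would need refinement before real use.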

Other Examples of Ruby-Java Metaprogramming

Ola Bini has written an article on JRuby metaprogramming that takes a slightly different approach than the one detailed here.

Conclusions

This tutorial has shown a simple and practical application of Ruby's built-in metaprogramming capabilities. The careful use of metaprogramming is a powerful way to reduce code complexity and build a more consistent programming environment. Look for more metaprogramming techniques to appear in future releases of the Ruby CDK library.
