CampDepict: Building a Simple SMILES Depict Web Application With JRuby, Structure CDK, and Camping

Posted by Rich Apodaca Wed, 23 Apr 2008 15:16:00 GMT

Today's tribute to the power of simplicity comes by way of John Jaeger, who has built one of the simplest cheminformatics Web applications ever written. His creation, CampDepict, interactively produces a raster image of a 2D chemical structure given a SMILES string, not unlike Daylight's Depict application.

CampDepict uses the Ruby Web microframework Camping. From the README:

Camping is a web framework which consistently stays at less than 4kb of code. You can probably view the complete source code on a single page. But, you know, it's so small that, if you think about it, what can it really do?

The idea here is to store a complete fledgling web application in a single file like many small CGIs. But to organize it as a Model-View-Controller application like Rails does. You can then easily move it to Rails once you've got it going.

John's application is loosely based on the Rails Depict application first described in 2006 here on Depth-First. His code makes use of CDK and Structure CDK, and it runs on JRuby.

If you've ever been curious about what Ruby has to offer cheminformatics, CampDepict could be just the application to get your feet wet.

Chempedia.net: Mashing Up PubChem and Wikipedia

Posted by Rich Apodaca Fri, 04 Apr 2008 14:06:00 GMT

PubChem and Wikipedia represent two of the largest open repositories of chemical information in the world. And they complement each other very nicely. PubChem contains mainly low-level chemical structure information whereas Wikipedia contains free-text descriptions of chemical compounds in the form of compound monographs.

Both services offer permission and access to copy and reuse their contents. But neither service is, by itself, nearly as useful as it could be.

Why not mash them up?

To explore that question, my company, Metamolecular, LLC, has launched Chempedia.

To my knowledge, Chempedia represents the first public-facing database of compounds to incorporate Wikipedia's collection of organic compound monographs. And it's one of the few cheminformatics services to make use of free-text descriptions generated by individual chemists.

Chempedia has been somewhat selective about the compounds it includes. To date, it has spidered over 2,500 monographs, combining them with over 300,000 of the most interesting compounds from PubChem. Not every Chempedia.net molecule has a monograph, but now there's a tool that can actually make that absence apparent.

Chempedia is both an experiment and a service. It's immediately useful for anyone in the business of making or doing things with organic molecules, and it has already produced several unexpected moments of "Oh, that's actually a useful molecule!" It will also serve as a platform to test some of the ideas discussed in Depth-First over the last year or so on the advantages of the Web for collaboration in chemistry.

Stay tuned for more details about how Chempedia was created and some of its applications in chemistry.

Simple Installation of Rubidium

Posted by Rich Apodaca Wed, 21 Nov 2007 14:26:00 GMT

Rubidium is a Ruby cheminformatics scripting environment. Previously, a problem was reported with the RubyForge gem repository that prevented the simple installation of the Rubidium gem. After a bug report was filed, the problem was resolved.

The problem, which led to a 404 being issued when trying to install the gem from the remote RubyGems repository, was a variant of a known RubyForge issue.

You can now install Rubidium like this:

$ jruby -S gem install rbtk

Installation takes a few minutes due to the large size of the included Chemistry Development Kit jarfile.

Parsing SD Files with Ruby and Rubidium

Posted by Rich Apodaca Mon, 12 Nov 2007 16:27:00 GMT

Reading SD files is a bread-and-butter cheminformatics operation. At a minimum, a cheminformatics toolkit needs to parse the individual entries of an SD file, and provide access to the embedded molfile and data hash for each.
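To make that requirement concrete, here is a minimal pure-Ruby sketch of what "parse the entries and expose the molfile and data hash" can look like. It is not Rubidium's parser: the names SDEntry, each_sd_entry, and build_sd_entry are hypothetical, and the code assumes well-formed records that end with a "$$$$" line and data headers of the form "> <NAME>".

```ruby
require 'stringio'

# Hypothetical container for one SD record: the molfile text plus a data hash.
class SDEntry
  attr_reader :molfile, :data

  def initialize(molfile, data)
    @molfile = molfile
    @data = data
  end

  # Look up a data item by its field name.
  def [](name)
    data[name]
  end
end

# Yields one SDEntry per "$$$$"-terminated record read from io.
# A trailing record without a "$$$$" terminator is ignored in this sketch.
def each_sd_entry(io)
  lines = []
  io.each_line do |line|
    line = line.chomp
    if line == '$$$$'
      yield build_sd_entry(lines)
      lines = []
    else
      lines << line
    end
  end
end

# Splits a record's lines into the molfile (through "M  END") and data items.
def build_sd_entry(lines)
  end_index = lines.index { |l| l.start_with?('M  END') }
  # Without "M  END", treat the whole record as molfile text.
  end_index ||= lines.length - 1
  molfile = lines[0..end_index].join("\n")

  data = {}
  name = nil
  lines[(end_index + 1)..-1].each do |line|
    if (match = line.match(/\A>.*<([^>]+)>/))
      name = match[1]          # start of a new data item
      data[name] = []
    elsif line.empty?
      name = nil               # a blank line ends the current data item
    elsif name
      data[name] << line       # value line for the current data item
    end
  end
  SDEntry.new(molfile, data.transform_values { |v| v.join("\n") })
end

# A tiny made-up record: an empty molfile with a single NAME data item.
SAMPLE = [
  'benzene', '  sketch', '',
  '  0  0  0  0  0  0  0  0  0  0999 V2000',
  'M  END',
  '> <NAME>',
  'benzene',
  '',
  '$$$$'
].join("\n") + "\n"

each_sd_entry(StringIO.new(SAMPLE)) do |entry|
  puts entry['NAME']   # prints: benzene
end
```

Reading line by line rather than slurping the whole file keeps memory use flat, which matters for large PubChem downloads.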

Recent articles have introduced Rubidium, a Ruby cheminformatics scripting environment. The Rubidium team now announces the release of Rubidium-0.1.1, which, among other features, introduces the ability to parse SD files.

Prerequisites

Rubidium is designed to run on JRuby. Installing JRuby is straightforward on Unix-like systems. First, download the JRuby-1.1b1 binary release. Then, unpack the archive to your directory of choice. Set $JRUBY_HOME and $JAVA_HOME. Finally, add $JRUBY_HOME/bin to your path.

Installing Rubidium-0.1.1

Generally speaking, it should be possible to install Rubidium with a one-line command to RubyGems:

$ jruby -S gem install rbtk

Unfortunately, at the time of this writing, I was receiving the mysterious RubyGems 404 error with the RubyForge remote repository:

$ jruby -S gem install rbtk
Select which gem to install for your platform (java)
 1. rbtk 0.1.1 (java)
 2. rbtk 0.1.0 (java)
 3. Skip this gem
 4. Cancel installation
> 1
ERROR:  While executing gem ... (OpenURI::HTTPError)
    404 Not Found

This appears to affect only certain RubyGems on RubyForge - possibly only those with multiple versions. It seems to be an error on the RubyForge server that occasionally appears and then disappears.

As a workaround, you can download the Rubidium gem and install it manually:

$ jruby -S gem install tmp/rbtk-0.1.1-jruby.gem

Because Rubidium-0.1.1 introduces an Active Support dependency, you will need to install that library before installing Rubidium:

$ jruby -S gem install tmp/rbtk-0.1.1-jruby.gem
ERROR:  While executing gem ... (RuntimeError)
    Error instaling tmp/rbtk-0.1.1-jruby.gem:
        rbtk requires activesupport >= 1.4.2
$ jruby -S gem install activesupport
Successfully installed activesupport-1.4.4
Installing ri documentation for activesupport-1.4.4...
Installing RDoc documentation for activesupport-1.4.4...
$ jruby -S gem install tmp/rbtk-0.1.1-jruby.gem
Successfully installed rbtk, version 0.1.1
Installing ri documentation for rbtk-0.1.1-jruby...
Installing RDoc documentation for rbtk-0.1.1-jruby...

It's possible that the RubyForge 404 issue will be resolved by the time you read this article, so try jruby -S gem install rbtk first.

Parsing an SD File

Let's say we'd like to extract all InChIs from a PubChem dataset. If you don't have one handy, a compilation of about 2000 PubChem benzodiazepines has been deposited on RubyForge.

With our unzipped datafile in our working directory, we can now test the SD File parser by saving the following library to a file called parse.rb:

require 'rubygems'
gem 'rbtk'
require 'rubidium/sdf'

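# Print the InChI stored in each entry of the SD file at filename.
def parse_sd(filename)
  # The block form of File.open closes the file once parsing finishes.
  File.open(filename) do |file|
    parser = Rubidium::SDF::Parser.new(file)

    parser.each do |entry|
      puts "InChI: #{entry['PUBCHEM_NIST_INCHI']}"
    end
  end
end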

This can be tested with jirb:

$ jirb
irb(main):001:0> require 'parse'
=> true
irb(main):002:0> parse_sd 'pubchem_benzodiazepine_20071110.sdf'
InChI: InChI=1/C16H12Cl2N2O/c1-20-14-7-6-12(18)8-13(14)16(19-9-15(20)21)10-2-4-11(17)5-3-10/h2-8H,9H2,1H3

[truncated]

RSpec and Behavior-Driven Development

If you check out the Rubidium source distribution, you'll notice that the SD parser library is tested with RSpec, the BDD framework for Ruby. Ultimately, all components of Rubidium will be tested and documented this way.

Acknowledgments

Rubidium's new SD file parser was written by Moses Hohman. It was kindly donated by Collaborative Drug Discovery, who have built their drug discovery application using Ruby on Rails.

Future Directions

One problem in working with SD files is pinpointing encoding errors. A parser should not only raise an exception, but point to a line number and identify offending text to aid debugging. Rubidium's SD parser will eventually incorporate these enhancements.
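As a rough idea of what such error reporting could look like, the sketch below raises an exception that carries both the line number and the offending text. It is an illustration only, not Rubidium code: SDParseError and check_data_headers are hypothetical names, and the single check shown (a data header must name its field in angle brackets) stands in for whatever validation a real parser would perform. It also assumes that only data headers start with ">".

```ruby
require 'stringio'

# Hypothetical error class that records where parsing failed and on what text.
class SDParseError < StandardError
  attr_reader :line_number, :text

  def initialize(message, line_number, text)
    @line_number = line_number
    @text = text
    super("#{message} at line #{line_number}: #{text.inspect}")
  end
end

# Scans io and raises SDParseError at the first line that starts with ">"
# but does not contain a field name in angle brackets. Returns true otherwise.
def check_data_headers(io)
  io.each_line.with_index(1) do |line, number|
    line = line.chomp
    next unless line.start_with?('>')
    next if line =~ /<[^<>]+>/
    raise SDParseError.new('Malformed data header', number, line)
  end
  true
end

begin
  check_data_headers(StringIO.new("M  END\n> <OK>\n1\n\n> BROKEN\n2\n"))
rescue SDParseError => e
  puts e.message
end
```

Keeping the line number and text as attributes, rather than only in the message, lets calling code log or highlight the bad line without parsing the message string.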

Because Rubidium runs on JRuby, performance gains may be achievable by rewriting select portions in Java.

Parsing SD files is only the beginning of the story. Many cheminformatics applications need a convenient, fast, and robust method for writing molfiles. This is also something Rubidium will attempt to provide.

If your company or organization is curious about Ruby and cheminformatics, give Rubidium a try. Rubidium is licensed under the permissive MIT License to make collaboration as simple as possible.

An Introduction to the Rubidium Cheminformatics Toolkit: Interconvert SMILES, InChI, and Molfile with an Open Babel-Like Interface

Posted by Rich Apodaca Mon, 15 Oct 2007 14:59:00 GMT

Interconverting molecular languages is a very common operation in cheminformatics, so convenient conversion tools are desirable. Recent articles have discussed JRuby as a functional cheminformatics scripting environment. In this article, we'll see how this functionality can be combined with convenience for molecular language conversions.

In addition to illustrating a technique, this article is the first in a series aimed at documenting a new cheminformatics toolkit for Ruby called "Rubidium". Rubidium will provide a unified set of Ruby APIs for working with diverse Open Source cheminformatics tools.

Rubidium will be distributed under the highly permissive MIT License.

Prerequisites

This Rubidium library requires JRuby and the Chemistry Development Kit (CDK). Copying the CDK jarfile into your JRuby lib directory is all that's needed.

The Library

The goal of this library is to provide a simple, yet flexible way to interconvert SMILES, InChI, and molfile formats. It was inspired by the Open Babel library, in which an OBConversion object is configured with input and output formats prior to performing one or more conversions. In today's library, a similar Ruby interface is created for the CDK. Because of its length, it won't be presented in its entirety. Instead, it can be downloaded here.

Testing the Library

The library can be tested by saving it as a file called cdk.rb and invoking jirb. We can then convert a SMILES for benzene into the InChI for benzene:

$ jirb
irb(main):001:0> require 'cdk'
=> true
irb(main):002:0> c=CDK::Conversion.new
=> #<CDK::Conversion:0x4c6320 ... >
irb(main):003:0> c.set_formats 'smi', 'inchi'
=> "inchi"
irb(main):004:0> c.convert 'c1ccccc1'
=> "InChI=1/C6H6/c1-2-4-6-5-3-1/h1-6H"

Upcoming articles will show more examples of interconversions using this library, and discuss some of its limitations.

An Aside

It might be useful for Rubidium to support multiple Conversions, each using its own cheminformatics toolkit. For example, a recent article discussed SMILES and InChI interconversion with Ruby Open Babel. With a little tweaking, the Ruby Open Babel OBConversion interface could be made identical to the Ruby interface used in today's tutorial. We could also configure JOELib and Rosetta Conversions in an analogous fashion.

Rubidium would then offer a family of molecular language converters, each of which used exactly the same API. We could then pick the best converter based on the situation at hand.
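One way to picture this is a single conversion class whose toolkit-specific work is delegated to a pluggable backend. The sketch below is purely illustrative and runs without any cheminformatics toolkit: SketchConversion, TABLE_BACKEND, and ECHO_BACKEND are hypothetical names, and the backends are stand-ins where real ones would call into CDK, Open Babel, and so on. The only chemistry in it is the benzene SMILES and InChI pair from the jirb session above.

```ruby
# Hypothetical conversion class: every instance offers the same
# set_formats / convert interface, whatever backend does the work.
class SketchConversion
  # backend must respond to call(input, input_format, output_format).
  def initialize(backend)
    @backend = backend
  end

  # Stores the formats and returns the output format.
  def set_formats(input_format, output_format)
    @input_format = input_format
    @output_format = output_format
  end

  def convert(input)
    unless @input_format && @output_format
      raise ArgumentError, 'call set_formats before convert'
    end
    @backend.call(input, @input_format, @output_format)
  end
end

# Stand-in backend: a lookup table holding one known conversion.
KNOWN = {
  ['smi', 'inchi', 'c1ccccc1'] => 'InChI=1/C6H6/c1-2-4-6-5-3-1/h1-6H'
}.freeze

TABLE_BACKEND = lambda do |input, in_format, out_format|
  KNOWN.fetch([in_format, out_format, input]) do
    raise ArgumentError, "no #{in_format}->#{out_format} entry for #{input.inspect}"
  end
end

# Stand-in backend: only "converts" when both formats are the same.
ECHO_BACKEND = lambda do |input, in_format, out_format|
  raise ArgumentError, 'echo backend cannot change formats' unless in_format == out_format
  input
end

# Callers pick a converter by name but use each one identically.
CONVERTERS = {
  table: SketchConversion.new(TABLE_BACKEND),
  echo: SketchConversion.new(ECHO_BACKEND)
}.freeze

c = CONVERTERS[:table]
c.set_formats 'smi', 'inchi'
puts c.convert('c1ccccc1')
```

Because calling code only ever touches set_formats and convert, swapping one toolkit for another becomes a matter of choosing a different entry in the table.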

Conclusions

With just a little Ruby code, we've created a convenient Ruby interface for interconverting SMILES, InChI, and molfile formats. JRuby supports even more interconversions through the CDK as well as other Java and Java Native Interface libraries. Future articles will discuss some of the possibilities.
