Easily Calculate TPSA Descriptors from SMILES Strings Using Ruby CDK
A D-F reader wrote in to ask how to calculate Topological Polar Surface Area (TPSA) using Ruby CDK. TPSA is one of the most widely used descriptors for predicting membrane permeability and, through it, other important ADME properties. This article shows how to calculate TPSA in Ruby with Ruby CDK.
The Library
Our library consists of nothing more than a few method calls to manipulate the underlying CDK library. The tpsa_for method accepts any SMILES string and returns the calculated TPSA:
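require 'rubygems'
require_gem 'rcdk' # newer RubyGems warn that require_gem is obsolete; use gem instead
require 'rcdk/util'

# Make the CDK descriptor class available to Ruby.
jrequire 'org.openscience.cdk.qsar.descriptors.molecular.TPSADescriptor'

module TPSA
  # One shared descriptor instance, reused across calls.
  @@calc = Org::Openscience::Cdk::Qsar::Descriptors::Molecular::TPSADescriptor.new

  # Returns the TPSA of the molecule encoded by the given SMILES string.
  def tpsa_for smiles
    mol = RCDK::Util::Lang.read_smiles smiles
    @@calc.calculate(mol).getValue.doubleValue
  end
end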
An Interactive Test
Saving the library to a file called tpsa.rb lets us test it through interactive Ruby (irb):
irb
irb(main):001:0> require 'tpsa'
./tpsa.rb:2:Warning: require_gem is obsolete. Use gem instead.
/usr/local/lib/ruby/gems/1.8/gems/rcdk-0.3.0/lib/rcdk/java.rb:26:Warning: require_gem is obsolete. Use gem instead.
=> true
irb(main):002:0> include TPSA
=> Object
irb(main):003:0> tpsa_for 'COCCc1ccc(OCC(O)CNC(C)C)cc1' # metoprolol
=> 50.72
irb(main):004:0> tpsa_for 'O=C3Nc1ccc(Cl)cc1C(c2ccccc2)=NC3O' # oxazepam
=> 61.69
The results we obtain for metoprolol and oxazepam are 50.72 and 61.69, respectively. These values compare well with those reported by Ertl et al. in the definitive paper on TPSA (50.7 and 61.7, respectively).
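To see where these numbers come from, it helps to remember that TPSA is an additive descriptor: each polar atom type contributes a fixed amount, and the contributions are summed. The sketch below reproduces both values in plain Ruby, without the CDK. The contribution values are quoted from the Ertl et al. fragment table as best recalled, and the fragment counts were assigned by hand for these two molecules, so treat both as assumptions to check against the paper. The names CONTRIBUTIONS, FRAGMENTS, and fragment_tpsa are hypothetical and not part of Ruby CDK.

```ruby
# Per-fragment contributions in square angstroms.
# Assumption: values recalled from the Ertl et al. table; verify before reuse.
CONTRIBUTIONS = {
  :ether_o    => 9.23,   # oxygen with two single bonds (C-O-C)
  :hydroxyl_o => 20.23,  # O-H
  :carbonyl_o => 17.07,  # oxygen double-bonded to carbon
  :nh         => 12.03,  # N-H with two single bonds (amine or amide)
  :imine_n    => 12.36   # nitrogen with one single and one double bond
}

# Hand-assigned fragment counts (an assumption of this sketch).
# The CDK descriptor perceives these from the molecular graph instead.
FRAGMENTS = {
  'metoprolol' => { :ether_o => 2, :hydroxyl_o => 1, :nh => 1 },
  'oxazepam'   => { :carbonyl_o => 1, :hydroxyl_o => 1, :nh => 1, :imine_n => 1 }
}

# Sums count * contribution over all fragment types, rounded to two decimals.
def fragment_tpsa(counts)
  total = counts.inject(0.0) do |sum, (type, n)|
    sum + CONTRIBUTIONS.fetch(type) * n
  end
  total.round(2)
end

FRAGMENTS.each do |name, counts|
  puts "#{name}: #{fragment_tpsa(counts)}"
end
```

Under these assumptions the hand-built sums come out to 50.72 for metoprolol and 61.69 for oxazepam, matching the values returned by tpsa_for above.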
Conclusions
It doesn't take much Ruby to command a wide range of cheminformatics functionality - in this case TPSA calculations. But the fun doesn't stop there. The CDK, and by extension Ruby CDK, offers access to a wide array of descriptor calculations, each of which follows the same basic pattern outlined here. All of it can be prototyped, debugged, and deployed through one of the most flexible programming languages currently available.