Depth-First
Cheminformatics in Context
Reading and Writing SD Files With MX
MDL Structure Data Files (SD Files) are the de facto standard for the exchange of chemical structures and associated data. As a result, methods for efficiently reading and writing these files play an important part in any cheminformatics toolkit.
The latest release of MX, the open source cheminformatics toolkit, adds support for reading and writing SD Files. Both source and platform-independent binary distributions are available.
The new release introduces SDFileReader. In interactive JRuby:
$ jirb
irb(main):001:0> require 'mx-0.107.0.jar'
=> true
irb(main):002:0> import com.metamolecular.mx.io.mdl.SDFileReader
=> Java::ComMetamolecularMxIoMdl::SDFileReader
irb(main):003:0> r=SDFileReader.new 'pubchem_sample_33.sdf'
=> #<Java::ComMetamolecularMxIoMdl::SDFileReader:0x40b181 @java_object=com.metamolecular.mx.io.mdl.SDFileReader@145b02f>
irb(main):004:0> r.next_record
=> nil
irb(main):005:0> m=r.get_molecule
=> #<Java::ComMetamolecularMxModel::DefaultMolecule:0xcb754f @java_object=com.metamolecular.mx.model.DefaultMolecule@60b407>
irb(main):006:0> m.count_atoms
=> 31
irb(main):007:0> r.get_keys
=> #<Java::JavaUtil::ArrayList:0x381d92 @java_object=[PUBCHEM_COMPOUND_CID, PUBCHEM_COMPOUND_CANONICALIZED, PUBCHEM_CACTVS_COMPLEXITY, PUBCHEM_CACTVS_HBOND_ACCEPTOR, PUBCHEM_CACTVS_HBOND_DONOR, PUBCHEM_CACTVS_ROTATABLE_BOND, PUBCHEM_CACTVS_SUBSKEYS, PUBCHEM_IUPAC_OPENEYE_NAME, PUBCHEM_IUPAC_CAS_NAME, PUBCHEM_IUPAC_NAME, PUBCHEM_IUPAC_SYSTEMATIC_NAME, PUBCHEM_IUPAC_TRADITIONAL_NAME, PUBCHEM_NIST_INCHI, PUBCHEM_EXACT_MASS, PUBCHEM_MOLECULAR_FORMULA, PUBCHEM_MOLECULAR_WEIGHT, PUBCHEM_OPENEYE_CAN_SMILES, PUBCHEM_OPENEYE_ISO_SMILES, PUBCHEM_CACTVS_TPSA, PUBCHEM_MONOISOTOPIC_WEIGHT, PUBCHEM_TOTAL_CHARGE, PUBCHEM_HEAVY_ATOM_COUNT, PUBCHEM_ATOM_DEF_STEREO_COUNT, PUBCHEM_ATOM_UDEF_STEREO_COUNT, PUBCHEM_BOND_DEF_STEREO_COUNT, PUBCHEM_BOND_UDEF_STEREO_COUNT, PUBCHEM_ISOTOPIC_ATOM_COUNT, PUBCHEM_COMPONENT_COUNT, PUBCHEM_CACTVS_TAUTO_COUNT, PUBCHEM_BONDANNOTATIONS]>
irb(main):008:0> r.get_data 'PUBCHEM_COMPOUND_CID'
=> "1"
SDFileReader iterates lazily: a Molecule and its associated data are created only when they are requested.
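To make the idea of lazy iteration concrete, here is a minimal pure-Ruby sketch. It is not MX's implementation: the class LazySDReader and its methods are hypothetical names chosen for illustration, and it assumes the usual SD File layout (a molfile block ending in "M  END", data items introduced by "> <KEY>" lines, and records terminated by "$$$$"). Advancing to a record only buffers its raw lines; the data block is parsed the first time keys or a value are requested.

```ruby
require 'stringio'

# Hypothetical illustration of lazy SD File reading; not part of MX.
class LazySDReader
  RECORD_END = '$$$$'

  def initialize(io)
    @io = io
    @record = nil
    @data = nil
  end

  # Buffers the raw lines of the next record without parsing them.
  # Returns true if a record was read, false at end of input.
  def next_record
    lines = []
    while (line = @io.gets)
      line = line.chomp
      break if line == RECORD_END
      lines << line
    end
    # End of input with nothing but blank lines left: no more records.
    return false if line.nil? && lines.all? { |l| l.strip.empty? }
    @record = lines
    @data = nil # forget data parsed for the previous record
    true
  end

  # Raw molfile text, through "M  END", extracted on request.
  def molfile
    @record[0..(mol_end_index || -1)].join("\n")
  end

  # Data item keys, in file order.
  def keys
    parsed_data.keys
  end

  # Value stored under key, or nil if the record has no such item.
  def data(key)
    parsed_data[key]
  end

  private

  def mol_end_index
    @record.index { |l| l.start_with?('M  END') }
  end

  # Parses the data block once per record, on first use.
  def parsed_data
    @data ||= parse_data
  end

  def parse_data
    result = {}
    key = nil
    start = (mol_end_index || -1) + 1
    @record[start..-1].each do |line|
      if line.start_with?('>') && (m = line.match(/<([^>]+)>/))
        key = m[1]
        result[key] = []
      elsif line.empty?
        key = nil # a blank line closes the current data item
      elsif key
        result[key] << line
      end
    end
    result.transform_values { |v| v.join("\n") }
  end
end

# A one-atom record built inline so the example is self-contained.
sample = [
  'methane', '  example', '',
  '  1  0  0  0  0  0  0  0  0  0999 V2000',
  '    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0',
  'M  END',
  '> <ID>', '1', '',
  '> <NAME>', 'methane', '',
  '$$$$'
].join("\n")

reader = LazySDReader.new(StringIO.new(sample))
reader.next_record # buffers raw lines only
reader.keys        # the data block is parsed here, on first request
reader.data('ID')
```

The benefit of this design is that skipping through a large file costs little more than reading lines; parsing is only paid for on records that are actually inspected.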
SD Files can be written with SDFileWriter. In interactive JRuby:
$ jirb
irb(main):001:0> require 'mx-0.107.0.jar'
=> true
irb(main):002:0> import com.metamolecular.mx.io.mdl.SDFileWriter
=> Java::ComMetamolecularMxIoMdl::SDFileWriter
irb(main):003:0> import com.metamolecular.mx.io.Molecules
=> Java::ComMetamolecularMxIo::Molecules
irb(main):004:0> w=SDFileWriter.new 'output.sdf'
=> #<Java::ComMetamolecularMxIoMdl::SDFileWriter:0x8a2023 @java_object=com.metamolecular.mx.io.mdl.SDFileWriter@43da1b>
irb(main):005:0> w.write_molecule Molecules.create_benzene
=> nil
irb(main):006:0> w.write_data 'key', 'value'
=> nil
irb(main):007:0> w.close
=> nil
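For readers curious about what ends up on disk, the following pure-Ruby sketch shows the general shape of an SD File record as it is written. It is an illustration only and makes no claim about how SDFileWriter works internally: SimpleSDWriter, write_molfile, and finish are hypothetical names, and the molfile is assumed to be supplied as already-formatted text.

```ruby
require 'stringio'

# Hypothetical illustration of the SD record layout; not MX's SDFileWriter.
class SimpleSDWriter
  RECORD_END = '$$$$'

  def initialize(io)
    @io = io
    @open_record = false
  end

  # Starts a new record with a pre-formatted molfile block,
  # terminating the previous record first if one is open.
  def write_molfile(molfile)
    end_record
    @io.puts(molfile.chomp)
    @open_record = true
  end

  # Appends one data item: a "> <KEY>" header, the value, a blank line.
  def write_data(key, value)
    raise 'write a molfile before writing data' unless @open_record
    @io.puts("> <#{key}>")
    @io.puts(value.to_s)
    @io.puts
  end

  # Terminates the last record. The caller owns and closes the IO.
  def finish
    end_record
    @io.flush
  end

  private

  def end_record
    return unless @open_record
    @io.puts(RECORD_END)
    @open_record = false
  end
end

# An empty molfile (no atoms or bonds) keeps the example short.
empty_molfile = [
  'empty', '', '',
  '  0  0  0  0  0  0  0  0  0  0999 V2000',
  'M  END'
].join("\n")

out = StringIO.new
writer = SimpleSDWriter.new(out)
writer.write_molfile(empty_molfile)
writer.write_data('key', 'value')
writer.finish
puts out.string
```

Deferring the "$$$$" terminator until the next molfile or the final call lets any number of data items be attached to a record after its structure has been written.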
For an up-to-date summary of MX's current capabilities, please check out the MX Homepage.
© 2006-2010 Richard L. Apodaca | Content Licensed CC 3.0 BY-SA