Open Benchmarks for Cheminformatics: First Performance Comparison Between CDK and MX

The previous article in this series discussed Japex in the context of creating open cheminformatics benchmarks. If you're not familiar with it, Japex is a microbenchmarking framework written in Java that does for benchmarking what Ant does for building projects. Among its many interesting features is the ability to generate bar charts for performance comparisons.

I recently finished building the first direct performance comparison between CDK and MX, two open-source cheminformatics toolkits. The chart below summarizes the test.

[Chart: Benchmark]

You can read the full report for yourself here. This test compares the relative speed of loading a 33-record SD file and summing the calculated molecular masses of its records. As you can see, CDK is about 19% faster than MX on the system I tested.
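To make a figure like "about 19% faster" concrete, here is a minimal sketch of how two timings can be turned into a relative-speed percentage. It is not the Japex setup used for the test above: the class `RelativeSpeedSketch`, its methods, and the two placeholder workloads are hypothetical, and it assumes "X% faster" means the ratio of average run times minus one. The report may define the figure differently.

```java
import java.util.function.DoubleSupplier;

public class RelativeSpeedSketch {

    // Average wall-clock nanoseconds per call, measured after a warm-up
    // phase so the JIT compiler has a chance to optimize the task first.
    static double averageNanos(DoubleSupplier task, int warmupRuns, int timedRuns) {
        double sink = 0.0; // accumulate results so the work is not optimized away
        for (int i = 0; i < warmupRuns; i++) {
            sink += task.getAsDouble();
        }
        long start = System.nanoTime();
        for (int i = 0; i < timedRuns; i++) {
            sink += task.getAsDouble();
        }
        long elapsed = System.nanoTime() - start;
        if (Double.isNaN(sink)) {
            System.out.println("unexpected NaN result");
        }
        return (double) elapsed / timedRuns;
    }

    // Assumed definition: percent by which the faster task's throughput
    // exceeds the slower task's, i.e. (slow time / fast time - 1) * 100.
    static double percentFaster(double fastNanos, double slowNanos) {
        return (slowNanos / fastNanos - 1.0) * 100.0;
    }

    // Placeholder workload standing in for "load the file and sum the masses".
    static double placeholderWork(int iterations) {
        double sum = 0.0;
        for (int i = 1; i <= iterations; i++) {
            sum += Math.sqrt(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        // Hypothetical stand-ins for the two toolkits' versions of the task.
        DoubleSupplier toolkitA = () -> placeholderWork(100_000);
        DoubleSupplier toolkitB = () -> placeholderWork(120_000);

        double a = averageNanos(toolkitA, 200, 1_000);
        double b = averageNanos(toolkitB, 200, 1_000);

        double fast = Math.min(a, b);
        double slow = Math.max(a, b);
        System.out.printf("A: %.0f ns/run, B: %.0f ns/run%n", a, b);
        System.out.printf("Faster task leads by %.1f%%%n", percentFaster(fast, slow));
    }
}
```

Under this definition, average run times of 100 ms and 119 ms would correspond to a 19% lead. A framework such as Japex takes care of warm-up, repetition, and reporting, which is why a hand-rolled loop like this one is only useful as an illustration of the arithmetic.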

I'm no expert with Japex, so it's possible that I've introduced a source of error into this comparison that affects the outcome.

Benchmarking is clearly a process, not an endpoint. In the months ahead, expect to see many more benchmarking comparisons, both between MX and other toolkits, and within MX itself.