PerlMol: A Case Study in Open Source Cheminformatics Software
How does open source software happen? Although many factors come into play, the majority of answers seem to revolve around a simple theme: developers building solutions to fill their own needs. Yet only a fraction of these solutions end up becoming open source software. And only a fraction of those end up being used by a wider audience. What's the key ingredient? There's still a lot to learn from studying individual cases.
Readable discussions about the origins of specific open source projects are pretty rare, but those dealing with the origins of open source cheminformatics software are more uncommon still. So it was with great interest that I came across Ivan Tubert-Brohman's account of how PerlMol was created.
PerlMol is an open source "collection of Perl modules for cheminformatics and computational chemistry." Many software packages fit into this category, and some of them are open source, so why write another? For Tubert-Brohman, the deciding factor was being able to work in his preferred environment, Perl:
I was surprised that CPAN [The Comprehensive Perl Archive Network] was sorely lacking in terms of modules for chemistry. The only available modules were Chemistry::Element, which allows you to convert between atomic number, element symbol, and element name and store other elemental information; and Chemistry::MolecularMass, which calculates the mass from the molecular formula. There were no modules that actually dealt with the structure of molecules. While some of the options in other languages are not bad, I was looking for something with the simplicity and conciseness of Perl that could allow me to write "chemical one-liners" to solve small problems very quickly, without having to compile anything. Hence, PerlMol was born.
Eliminating the need to compile and offering a relaxed syntax that promotes succinct code are two of the biggest reasons to try a cheminformatics scripting environment.
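To make the idea of a quick, compile-free chemical script concrete, here is a minimal sketch of the task the quote attributes to Chemistry::MolecularMass: computing a mass from a molecular formula. It is written in Python rather than Perl and uses neither PerlMol nor any CPAN module. The function name `molecular_mass`, the abbreviated mass table, and the restriction to flat formulas are all assumptions made for illustration.

```python
import re

# Approximate standard atomic masses (g/mol) for a handful of elements.
# Illustrative subset only; a real tool would carry the full periodic table.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06, "Cl": 35.45}


def molecular_mass(formula):
    """Sum atomic masses for a flat formula such as 'C2H6O'.

    Parentheses, charges, and isotopes are not handled in this sketch.
    """
    # Each token is an element symbol followed by an optional count.
    tokens = re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    # Reject input that the pattern could not consume completely.
    if "".join(symbol + count for symbol, count in tokens) != formula:
        raise ValueError(f"cannot parse formula: {formula!r}")
    return sum(ATOMIC_MASS[symbol] * int(count or 1) for symbol, count in tokens)


print(round(molecular_mass("C2H6O"), 3))  # ethanol: prints 46.069
```

The whole thing runs directly from a file or an interactive prompt, which is the kind of edit-and-run loop the quote has in mind.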
There's a lot of great software still to be written in cheminformatics, and some of it will be open source. Although open sourcing that side project you've been working on may not be the best option for your career or your company, case studies like PerlMol's offer plenty of food for thought.


Just for your information: Have you tried MayaChemTools, another Perl-based cheminformatics tool? It is a very practical and functional tool, though it is not listed in your post on cheminformatics scripting environments.
Hanjo, hadn't seen that one - thanks for pointing it out. At my last count that makes three environments for Perl: Open Babel/Perl, PerlMol, and MayaChemTools. Any others?
Thanks, I feel honored to be a case study! In retrospect, I have found another advantage in PerlMol: while some of it may seem like a reinvented wheel to some people, and a square one at that, it is *my* wheel, in the sense that I know it perfectly well inside and out. Just today I found a minor bug, and it took me less than 5 min to fix it. When I find a bug in someone else's open source software, it usually takes me at least hours to fix, or I can file a report and pray...
Of course, this is only an advantage for me. :)
Ivan,
I think those of us who have been strongly involved in one open source project or another would completely understand. There are numerous times when you realize how much faster something could be -- because you just need to add (or fix) "one little thing."