Building a Molecule Preview with Firefly: The Joy of Swing

Posted by Rich Apodaca Wed, 18 Jul 2007 10:52:00 GMT

Previous articles have discussed Firefly, the codename for a new 2D structure editor for the Web. Although it can be deployed as a self-contained applet on Web pages, Firefly is composed of modules that are readily re-used. By taking advantage of this design and Java's native UI toolkit Swing, new UI elements can be built with relatively little effort. This article outlines one such use - the creation of a file dialog that contains a molecule preview.

Many image processing applications such as Photoshop or GIMP provide an image preview that appears in file browser dialogs. Wouldn't it be nice if applications that process molecular structure files came with a similar feature? The screenshot below shows a file chooser with an embedded molecule preview based on Firefly's Painter component:

When a new molfile is highlighted, a new preview is automatically generated:

The molecule preview is capable of all of the customizations available in Firefly including background, bond, and atom colors, borders, and atom label fonts. No matter how large or small the molecule, and regardless of its starting coordinates, it will always be exactly scaled to fit the available space and precisely centered.
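
The fit-and-center behavior can be sketched with plain Java 2D. This is not Firefly's code: the class name FitToView and its method are hypothetical, and the sketch assumes the molecule's bounding box has already been computed from its atom coordinates. It also ignores the y-axis flip that molfile coordinates usually need on screen.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;

// Hypothetical helper (not part of Firefly): maps a molecule's bounding box
// into a viewport using one uniform scale factor, centered in the viewport.
class FitToView {
  static AffineTransform fit(Rectangle2D bounds, Rectangle2D view) {
    // A zero-width or zero-height box (single atom, straight line) must not
    // cause a division by zero, so that axis is treated as unconstrained.
    double sx = bounds.getWidth() > 0
        ? view.getWidth() / bounds.getWidth() : Double.POSITIVE_INFINITY;
    double sy = bounds.getHeight() > 0
        ? view.getHeight() / bounds.getHeight() : Double.POSITIVE_INFINITY;
    double scale = Math.min(sx, sy);
    if (Double.isInfinite(scale)) {
      scale = 1.0; // both extents are zero: nothing to scale
    }

    // Transforms are concatenated, so the last call is applied to points first:
    // move the box center to the origin, scale, then move to the view center.
    AffineTransform transform = new AffineTransform();
    transform.translate(view.getCenterX(), view.getCenterY());
    transform.scale(scale, scale);
    transform.translate(-bounds.getCenterX(), -bounds.getCenterY());
    return transform;
  }
}
```

Taking the smaller of the two scale factors keeps the aspect ratio intact, so the structure fills the viewport along its longer dimension and is padded along the other. A real painter would probably also reserve a margin for atom labels.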

This dialog was rapidly implemented using the accessory capability provided by Swing's JFileChooser:

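JFileChooser chooser = new JFileChooser();
PreviewAccessory accessory = new PreviewAccessory(chooser);

// Reserve a square area for the preview beside the file list
accessory.setPreferredSize(new Dimension(150, 150));

// Install the preview and notify it whenever the chooser's state changes
chooser.setAccessory(accessory);
chooser.addPropertyChangeListener(accessory);

// ...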

Defining a JComponent implementing the PropertyChangeListener interface is all that's needed to get a working molecule preview:

class PreviewAccessory extends DefaultPainter implements PropertyChangeListener
{
  // implementation
}
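
For readers without Firefly at hand, here is a minimal sketch of the listener half of such an accessory. It uses only standard Swing. FileNamePreview is a hypothetical stand-in that shows the selected file's name where the real accessory would parse the molfile and repaint the structure, and it takes no constructor argument, unlike PreviewAccessory above.

```java
import java.beans.PropertyChangeEvent;
import java.beans.PropertyChangeListener;
import java.io.File;
import javax.swing.JFileChooser;
import javax.swing.JLabel;
import javax.swing.SwingConstants;

// Hypothetical stand-in for a molecule preview: displays the selected file's
// name instead of rendering a structure.
class FileNamePreview extends JLabel implements PropertyChangeListener {
  static final String EMPTY_TEXT = "(no selection)";

  FileNamePreview() {
    super(EMPTY_TEXT, SwingConstants.CENTER);
  }

  @Override
  public void propertyChange(PropertyChangeEvent event) {
    // JFileChooser fires many kinds of property changes; react only when
    // the selected file changes.
    if (!JFileChooser.SELECTED_FILE_CHANGED_PROPERTY.equals(event.getPropertyName())) {
      return;
    }
    Object value = event.getNewValue();
    if (value instanceof File) {
      // A real preview would read the file and repaint the molecule here.
      setText(((File) value).getName());
    } else {
      // The new value is null when the selection is cleared.
      setText(EMPTY_TEXT);
    }
  }
}
```

It is wired up the same way as the accessory above: pass it to setAccessory and addPropertyChangeListener on the chooser.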

Swing has come a long way since the dark days of JDK 1.2. What started out as the dog-slow ugly duckling of user interface toolkits has developed into one of the best platforms for building desktop applications out there. Advanced tools such as the WYSIWYG interface builder Matisse and the polished components offered by JIDE make Swing an even more attractive option. The example described here is just one instance of how Swing's well-conceived design simplifies the job of building rich user interfaces.

Go West, Young Man: Does Open Access Really Matter in the Long Run?

Posted by Rich Apodaca Mon, 16 Jul 2007 13:31:00 GMT

Making a name for yourself in science is no easy job. Aside from the technical challenge of doing noteworthy science while working under constraints, there's the compounding challenge of making your work known to influential colleagues. Excellent work done in a vacuum is lost to science, only to be "rediscovered" by those more willing to promote themselves, or more capable of doing so. Look around at the most successful scientists in your field, and you'll find that they are extraordinarily adept both at doing noteworthy science and at promoting their work.

Scientists have been using the scientific publication system for hundreds of years as a channel for promoting their work. This system is now breaking down before our eyes, and for many reasons. Consider these:

  • The printed page doesn't matter anymore. Old-guard scientific publishers have been able to prosper by acting as gatekeepers of a precious resource: the printed page. The arrival of immediate, ultra-cheap, ubiquitous, interactive, and persistent communication through the Internet means that printed journals are increasingly viewed as wasteful and irrelevant.

  • Printed journals have priced themselves out of the marketplace. How many printed journals has your library dropped over the last year? Does your "library" even carry current printed journals anymore?

  • Electronic information wants to be free. Few things are more frustrating than knowing the answer to your question exists on a server somewhere, but you are forbidden from accessing it. Yes, you can pay $15-$30 for each article you need or get multiple subscriptions costing thousands per year, but is that any way to spend your budget?

  • The minimum publishable unit is shrinking. Scientists have legitimate interests in maximizing the number of papers they publish, in minimizing their size, and in shortening the interval between them. Submissions to top-tier journals continue to increase. New journals are started to catch the overflow, placing additional strain on the system's ability to find readers and "qualified" reviewers, and driving up production costs in the process.

  • Too much information. How many scientific papers have you actually read, from start to finish, in the last month? How many important, relevant papers in your field could you have completely read in the last month? How many of them did you find through an automatic notification system? How many papers have you used solely for one specific piece of information they contain? How many of these papers did you find through a database of some kind?

Open Access has become a hot topic, mainly in response to the points above. Although well-intended, the debate assumes that scientific publication in the Internet age will continue to work essentially the same way it always has - with scientists submitting manuscripts to publishers who act as editors, distributors, and in many cases quality-assurance agencies.

But what if it doesn't end up working out that way?

The reason that the existing scientific publication system has flourished for hundreds of years is that it solved the fundamental problems of two key groups: (1) scientists who wanted to be informed of new developments; and (2) scientists who wanted to promote their work and careers.

If you accept this premise, then nothing prevents entirely new publication models from replacing the existing ones - provided that they solve the same basic problems scientists face. If anything, the Internet is replete with examples of powerful old-guard gatekeepers of all stripes first being undermined as they denied that their business models were failing, then lashing out at everything but the root cause of their problems, and finally being driven into oblivion.

Why should old-guard scientific publishers be immune to this process?

Some scientists are discovering value in bypassing scientific publishers altogether. In chemistry, the best-known example is Jean-Claude Bradley and his group at Drexel. As Bradley's group is joined by others willing to experiment in this area, they will uncover a variety of problems that need to be addressed. Some of the most significant (at this point) include:

  • tools to create content

  • services that host and archive that content indefinitely

  • peer-review mechanisms that fully leverage the power of collaboration over the Internet

  • utilities for finding and promoting the work

These are the new high-payoff areas in scientific publication. Like all high-payoff areas, they start out looking dangerous or insignificant to most people.

This is not to say that the Internet eliminates the need for gatekeepers. Instead, it creates tremendous opportunities for new gatekeepers. Google, eBay, and Wikipedia are gatekeepers. Facebook and YouTube are also gatekeepers. By all accounts, these services have done phenomenally well and will continue to flourish for some time. Significantly, each service addresses the basic need of information consumers to be informed and information producers to have their message heard. These systems have found powerful mechanisms for quality control that in many cases put the current practice of scientific peer-review to shame. And in no case will you find a business model requiring pay-per-view.

Google, eBay, Wikipedia, Facebook, YouTube, and hundreds of other gatekeepers thrive because each has found a new precious resource to allocate, not because it tried to extract every last drop of value from the old ones. Both scientific publishers and Open Access proponents would be wise to consider their example.

Image Credit: Seamus Murray

Making Your 2D Structures Look Good: Firefly, Styles and Stylesheets

Posted by Rich Apodaca Fri, 13 Jul 2007 15:19:00 GMT

Chemists can be very discerning when it comes to chemical structure aesthetics. This is not surprising, given the central role played by 2D chemical structures in the day-to-day work of many chemists. For example, consider the Wikipedia Chemistry/Structure drawing workgroup's ongoing discussion about achieving a consistent look for chemical structures on the online encyclopedia.

Several articles have discussed Firefly, a 2D chemical structure editor specifically designed for the Web. With major work on the rendering engine and structure manipulation interface complete, recent efforts have turned toward exposing drawing settings through a graphical user interface. Here I'll provide some screenshots of an interface prototype along with sample structures. I'll also briefly discuss the larger question of making 2D structure drawing styles portable.

Drawing styles are edited through a tabbed dialog containing a live preview window that uses the current structure or a default structure if none is available. The dialog is resizable, enabling users to immediately see the effects of changes on structures of varying sizes. Although this dialog could be bundled and deployed with the editor, its large footprint makes it more appropriate for use as an optional feature or as a standalone configuration tool in a Web application.

Changes can be rolled back entirely ("Reset"), canceled ("Cancel"), or accepted ("OK").

Let's say we'd like to apply a black background with white bonds, as used in some PowerPoint presentations:

  

After applying this change, we decide that we'd rather not use atom coloring:

  

After looking at this structure for a few seconds, we decide that narrower stereo bonds are needed:

  

After some experimentation, we find a more appropriate non-stereo bond width and double bond offset:

  

What about a serif font? No, I don't think so:

  

But we could certainly reduce the size of the atom labels:

  

On second thought, the original atom label sizes were fine, although changing the font may require us to reconsider the atom label heights:

  

As you can see, the possibilities for customization are nearly endless. In practice, however, most chemists will adopt only two structure drawing styles that they re-use as needed: one for reports and manuscripts; and one for presentations. It will be interesting to see whether a third style makes its way into the standard repertoire: Web.

Each chemist will want a way to save their styles, possibly share them, and easily apply them. Although a few systems for doing so are feasible, the most practical approach would be a stylesheet. Applying a stylesheet to any structure diagram would change its appearance, offering a simple mechanism to achieve a consistent look across documents.
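
To make the idea concrete, here is a purely hypothetical sketch of what a minimal stylesheet could look like if it were stored as a plain Java properties file. None of this is Firefly's format: the class DrawingStyle and the key names (background, bond.color, bond.width, atom.coloring) are assumptions invented for illustration.

```java
import java.awt.Color;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical drawing style; the key names are illustrative only and do not
// correspond to any real editor's settings.
class DrawingStyle {
  final Color background;
  final Color bondColor;
  final double bondWidth;
  final boolean atomColoring;

  DrawingStyle(Properties p) {
    // Any key missing from the stylesheet falls back to a default value.
    background = Color.decode(p.getProperty("background", "#FFFFFF"));
    bondColor = Color.decode(p.getProperty("bond.color", "#000000"));
    bondWidth = Double.parseDouble(p.getProperty("bond.width", "1.0"));
    atomColoring = Boolean.parseBoolean(p.getProperty("atom.coloring", "true"));
  }

  // Parses stylesheet text in java.util.Properties format (key=value lines).
  static DrawingStyle parse(String text) {
    Properties p = new Properties();
    try {
      p.load(new StringReader(text));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return new DrawingStyle(p);
  }
}
```

Under these assumptions, a presentation style with a black background, white bonds, and no atom coloring would be three lines of text: background=#000000, bond.color=#FFFFFF, and atom.coloring=false. A file that small is easy to save, share, and apply.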

Developing a universal (cross-editor) stylesheet system would be no easy task, given the wildly divergent capabilities of 2D structure rendering software. Despite the technical difficulty, the payoff for users is obvious.

Waldorf Salad

Posted by Rich Apodaca Wed, 11 Jul 2007 13:36:00 GMT

One thing that really irritates me is badly drawn ChemDraw structures (maybe I should get out more...). ... Simple really... so why do a lot of ChemDraw structures that appear in papers or on slides at conferences look like a six-year old has drawn them with a crayon (and I mean a particularly untalented six-year old at that - one that probably wouldn't even win a Blue Peter drawing competition)? I think it’s just people being lazy...

-Stuart Cantrill, Badly Drawn Bonds

Chemical structures are a language. Like any language, grammar and spelling convey meaning and become "standardized" over time. But as native speakers of any language will tell you, languages also take on important cultural nuances that are difficult to convey to non-native speakers. Chemical structures are no different, as Stuart Cantrill's comments show.

Just about every chemist I've met has opinions on the "right way" to draw chemical structures that go beyond "grammar" and "spelling." These opinions come out most clearly in interdisciplinary environments, where non-chemists create chemical structures. Features such as "unconventional" line proportions, bond angles, or atom label proportions don't affect meaning. But they do violate conventions subconsciously learned over years of training. They look weird and because of this they distract attention and annoy audiences.

Not every badly-drawn structure is the result of laziness. Rather, it's more likely that: (1) the author didn't know there was a problem; (2) the author did know, but underestimated the benefit of correcting the problem; or (3) the author did know but wasn't using a tool that made it possible to correct the problem.

The increasing importance of computers in generating, manipulating, transmitting, and rendering chemical structures means that software developers now face the same choices as individual chemists with respect to quality. Before discounting structure aesthetics as irrelevant, it's worth considering that chemists can and do apply the same judgments to software as they apply to their fellow chemists.

Yet Another Free Chemistry Database: Heterocycles Web Edition

Posted by Rich Apodaca Fri, 06 Jul 2007 13:57:00 GMT

Yet another free chemistry database comes in the form of a service run by the journal Heterocycles. The Heterocycles Web Edition offers two ways to search for heterocyclic ring systems: by structure or by synthesis.

You may assume that these services would only search the contents of Heterocycles. It would then be a pleasant surprise to find a number of highly-regarded journals being covered. Here are some of the titles:

  • Angew. Chem. Int. Ed. Engl.
  • Chem. Eur. J.
  • Eur. J. Org. Chem.
  • Heterocycles
  • J. Am. Chem. Soc.
  • J. Med. Chem.
  • J. Nat. Prod.
  • J. Org. Chem.
  • Org. Lett.
  • Synlett
  • Tetrahedron
  • Tetrahedron Lett.

The current query interface supports text only, although a number of important criteria can be used. I haven't searched for many heterocycles, but my results for indolizidine give a flavor of what you might expect (the actual number of hits was 115):

It would be interesting to know how Heterocycles populated its database. Is it text-mining, manual curation, both, or something else? Regardless of how it's done, Heterocycles Web Edition is definitely worth looking at.
