Casual Saturdays: Google AppEngine

Posted by Rich Apodaca Sat, 07 Jun 2008 15:47:00 GMT

Flex, Rich Internet Applications, and Cheminformatics

Posted by Rich Apodaca Fri, 25 May 2007 15:32:00 GMT

Building Rich Internet Applications (RIAs) is no easy task. The technologies are all there, but twelve years after the release of Netscape Navigator, Web developers are still hacking around browser idiosyncrasies.

From JavaScript to HTML to CSS, browser quirks are still a fact of life. Browsers tend to disagree most on the more advanced features, but even basic features that require cross-browser work-arounds continue to suck up large amounts of developer time.

Even Java applets get in on the act. The interface between applets and the browser is the most poorly-defined aspect of the technology. Particularly with respect to keyboard focus, persisting applet state, and even initial activation, it's "write once, test everywhere."

Recently, I saw a presentation by James Ward on a technology called Flex that's aimed squarely at solving the Rich Internet Application problem.

Flex is a platform for building software that runs on the Flash multimedia player. If you've developed Java applets, you already know a lot about how Flex works. Here are some of the key points James brought up:

  • Flex applications are written in ActionScript 3, a dialect of ECMAScript (the standard behind JavaScript), and run on the Flash 9 runtime. The language offers, among other features, variable typing. This stands in contrast to the untyped form of JavaScript typically used in browsers.

  • Sometime in 2007, Flex will be open sourced. The developer kit is currently a free download.

  • A desktop-native runtime for Flex called Apollo is nearing release.

  • A variety of eye-popping examples of Flex in action are available. My favorites so far: FlexBook and Picnik.

Adobe is clearly trying to position Flash/Flex as the successor to Java applets. By ensuring that Flash works the same way in all browsers, has a rich API for interacting with the browser, and is available on all platforms, they just might succeed.

Adobe claims that Flash is the most widely-installed multimedia browser plug-in. According to a recent study, nearly 99% of browsers have some version of Flash installed (congratulations to Rajarshi Guha for almost immediately getting the Name that Graph). Adoption of Flash 9, the latest version and the one required to run Flex applications, is significantly lower. My guess is that these numbers will rise quickly given the availability of Flash 9 for Linux and the mere 1.5 MB download footprint.

If Flex has anything going against it, it's probably the small footprint of the Flash player. One of the first things you'll notice about many feature-rich Flex applications is how long they take to download relative to comparable Java applets. What the Flash runtime lacks, your application may well have to supply on deployment.

I'm unaware of a single use of Flex/Flash in cheminformatics. This is really quite surprising given the highly visual nature of cheminformatics and the refined graphics capabilities of the Flash player. Even with its limitations, Flex may offer solutions to a variety of difficult problems in developing Rich Internet Applications for chemistry.

Update: Four Free 2-D Structure Editors for Web Applications

Posted by Rich Apodaca Sun, 22 Apr 2007 15:26:00 GMT

A previous article discussing the deployment of four free 2D structure editors has been fixed. The sample pages demonstrating how to obtain a molfile from each have also been restored.

Google for Molecules with InChIMatic

Posted by Rich Apodaca Mon, 19 Feb 2007 15:18:00 GMT

InChIMatic is a simple Web application that uses Google to perform exact structure searches on the Web. After drawing your structure in the editor window, click the "InChI!" button to get a link. This link takes you to a Google query that displays matches for your molecule. You'll need both Java and JavaScript enabled in your browser to use InChIMatic.

The Technical Details

The technology at the heart of InChIMatic is the IUPAC International Chemical Identifier (InChI). An InChI is an alphanumeric string that uniquely identifies a molecular structure. By converting molecular structures to text, InChI makes it easy to use standard Internet tools to do exact structure searches.

The earliest reference in the peer-reviewed literature to using Google for searching InChIs is contained in a 2005 paper. More recently, a service called QueryChem has taken this idea one step further by using the Google API to perform substructure searches based on InChI.

InChIMatic works differently. Unlike a raw Google search, InChIMatic builds a Google query link for you. Unlike QueryChem, InChIMatic doesn't use the Google API and so has none of its restrictions. This does result in a limitation: InChIMatic can currently be used only for exact structure queries.
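Building such a link amounts to quoting the InChI and URL-encoding it. The JavaScript below is a minimal sketch of that idea, not InChIMatic's actual code: the function name `googleInchiLink` is hypothetical, and the exact query format InChIMatic generates is an assumption.

```javascript
// Hypothetical helper (not taken from InChIMatic itself): build a Google
// search link for an InChI string.
// Assumption: wrapping the InChI in double quotes requests an exact match.
function googleInchiLink(inchi) {
  const phrase = '"' + inchi.trim() + '"';
  // encodeURIComponent escapes the quotes, "=" and "/" found in every InChI.
  return 'https://www.google.com/search?q=' + encodeURIComponent(phrase);
}

// Example: the InChI for methane.
console.log(googleInchiLink('InChI=1/CH4/h1H4'));
// prints https://www.google.com/search?q=%22InChI%3D1%2FCH4%2Fh1H4%22
```

Because the whole query lives in the URL, the link can be bookmarked or shared without any API key.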

The InChIMatic Web application has been discussed in greater technical detail in a previous article. The rapid Web application development framework Ruby on Rails made building InChIMatic a snap. InChIMatic is served by the Ruby application container Mongrel, which is hosted on a Linux server running Apache. Rino provided the Ruby interface to the IUPAC/NIST InChI toolkit. The 2-D structure editor is Java Molecular Editor (JME) by Peter Ertl, which is used with his kind permission.

Aside from JME, all components of InChIMatic, from the operating system it runs on to the InChI system itself, are Open Source software.

Using InChI to Raise the Visibility of Your Content

InChIMatic returns many Google results for common molecules. But less common, known molecules return no hits at all. Three factors are responsible: (1) Google doesn't index all InChIs on the Internet; (2) few content providers currently use InChI; and (3) there is no standard and convenient mechanism to embed InChIs into Web pages for indexing by Google.

For these reasons, I consider InChI to be bleeding edge technology. Some will find it useful, most will not. Unfortunately, this state of affairs will persist until problems (1) and (3) are solved.

Nevertheless, if you're technically adventurous, InChIMatic offers a relatively painless way to begin incorporating InChIs into your content and verifying that they get indexed. There's no software to download, install, or upgrade. Forget about operating system incompatibilities (hopefully!). Just point your Java-enabled browser to inchimatic.com.

Although there's no standard method to encode InChIs in Web pages, some interesting ideas have been put forward. Egon Willighagen has proposed a system based on RDFa. Future iterations of InChIMatic may include support for generating scripts and/or markup for including InChIs into blogs and other online content.

Conclusions

InChI is a complex new technology in need of easy-to-use tools. InChIMatic is one such tool that makes it possible to perform exact structure queries using Google.

One of the exciting things about Web applications is how quickly they can evolve. If in trying out InChIMatic you find something you'd like changed or added, please feel free to write me.

Four Free 2-D Structure Editors for Web Applications

Posted by Rich Apodaca Mon, 21 Aug 2006 04:13:00 GMT

The increasing trend toward hosting free chemical databases and other services on the web brings with it the need for a free, ergonomic, capable, and fast 2-D structure editor. For years, the options were rather limited. However, this situation has started to change. Four web-enabled editors are discussed here, with an emphasis on the steps needed to deploy them within a webpage and retrieve a text-based molecular representation. A sample webpage is provided for each editor that allows a user to draw a molecule and view the corresponding output in a browser.

Building a Web Application: The Key Players

Consider the case of John, who would like to know the TPSA of caffeine. John finds a new website, http://tpsacalculate.com, that calculates the TPSA of any molecule. This site presents John with a 2-D structure editor applet and a "Submit" button. John uses the applet to draw caffeine and then presses the button. After one second, John sees a new page showing the structure of caffeine and its TPSA descriptor.

By pressing the "Submit" button, John sets in motion a series of transactions between the editor applet, the webpage, and the server. First, the webpage extracts a molfile representation of caffeine from the editor using JavaScript. This molfile is then submitted to the server using an HTTP POST request. After processing the molfile, the server returns a page containing the TPSA that John requested.
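The client-side half of this exchange can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions: the function name `buildMolfilePost` and the form field name `molfile` are hypothetical, and a real server may expect something different.

```javascript
// Hypothetical sketch: package a molfile as a form-encoded HTTP POST.
// Assumption: the server reads the structure from a field named "molfile".
function buildMolfilePost(molfile) {
  const params = new URLSearchParams();
  params.set('molfile', molfile); // newlines and spaces are escaped for us
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
    body: params.toString(),
  };
}

// Example with a placeholder string standing in for real applet output.
const exampleRequest = buildMolfilePost('first line\nsecond line');
console.log(exampleRequest.body);
```

In a page, the argument would come from the editor applet (for example `document.applet.molFile()` with JME), and the returned object describes the request that the browser sends when the form is submitted.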

Several variations on this pattern are conceivable, each involving varying levels of involvement by the browser, the applet, and the server. Advanced use of JavaScript can lead to elimination of the applet entirely, an approach taken by the PubChem structure search. Even more interesting is the use of AJAX, which would eliminate both the applet and the page refresh step, setting the stage for highly-interactive chemical content using only a browser and JavaScript. Although no AJAX-powered 2-D structure editors currently exist, this situation can be expected to change in the future.

Obtaining Text Output From a 2-D Editor

Extracting text-based output requires the same boilerplate code for all four editors. This code consists of four main components: (1) an editor applet into which the user draws a structure; (2) a JavaScript function that collects the output from the applet; (3) an HTML text field into which the JavaScript function inserts the output; and (4) an HTML form containing a button that when pressed sets the process in motion.

These commonalities make it possible to factor out editor-specific code and logic. The HTML below gives an example of what one basic template looks like.

<html>
  <head><title>Molfile Test</title></head>
  <body>
    <!-- JavaScript: copy the applet's output into the text area -->
    <script type="text/javascript">
      function writeOutput()
      {
        document.form.output.value = document.applet.OUTPUT_METHOD();
      }
    </script>

    <!-- Applet -->
    <applet code="APPLET_CLASS" name="applet"
            archive="APPLET_JARFILE.jar"
            width="510" height="360">
      Please enable Java and JavaScript on your machine.
    </applet>

    <br />

    <!-- Form -->
    <form method="post" name="form">
      <input type="button"
             value="Get Output"
             onclick="writeOutput()" />
      <br /><br />
      <textarea name="output" rows="20" cols="80"></textarea>
    </form>
  </body>
</html>

The above HTML contains three editor-specific pieces of information: (1) APPLET_JARFILE; (2) APPLET_CLASS; and (3) OUTPUT_METHOD. APPLET_JARFILE is the name of the Java archive file (*.jar) containing the applet code. This name is created by the developer when s/he saves the archive to the webserver. APPLET_CLASS is the fully-qualified class name of the editor applet. OUTPUT_METHOD is the name of the applet method that returns output. These last two pieces of editor-specific information are listed in the summary that follows.
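Filling in the template is then a matter of substituting the three placeholders. The JavaScript below is a rough sketch of one way to do that; `fillTemplate` and the shape of the `editor` object are hypothetical, and the jar file name `JME` is an assumption, since that name is chosen by whoever saves the archive to the server.

```javascript
// Hypothetical helper: substitute the three editor-specific placeholders
// used in the template above.
function fillTemplate(template, editor) {
  // Function replacers avoid special handling of "$" in replacement text.
  return template
    .replace(/APPLET_JARFILE/g, () => editor.jarFile)
    .replace(/APPLET_CLASS/g, () => editor.appletClass)
    .replace(/OUTPUT_METHOD/g, () => editor.outputMethod);
}

// Example for JME; jarFile is an assumed name chosen at deployment time.
const jme = { jarFile: 'JME', appletClass: 'JME', outputMethod: 'molFile' };
const line = 'document.form.output.value = document.applet.OUTPUT_METHOD();';
console.log(fillTemplate(line, jme));
// prints document.form.output.value = document.applet.molFile();
```

Keeping the editor-specific values in one small object means that switching editors touches a single place rather than the whole page.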

Java Molecular Editor (JME)

Homepage: Molinspiration

License: Free for noncommercial development.

Source Code: N/A

Size: 39 KB

APPLET_CLASS: JME

OUTPUT_METHOD: molFile(); smiles(); nonisomericSmiles(); jmeFile();

View the Sample Page

JChemPaint

Homepage: CDK

License: GPL

Source Code: SourceForge

Size: up to 6.2 MB

APPLET_CLASS: org.openscience.cdk.applications.jchempaint.applet.JChemPaintEditorApplet

OUTPUT_METHOD: getMolFile();

Notes: Although getSmiles() and getSmilesChiral() methods are available, neither produced the desired output during this test (version 2.1.5). The applet consists of 35 jar files, only some of which are necessary for minimal functionality.

View the Sample Page

JMolDraw

Homepage: SourceForge

License: GPL

Source Code: SourceForge

Size: up to 1.4 MB

APPLET_CLASS: org.jmd.editor.main.JMolDraw

OUTPUT_METHOD: getContentsAsMolfile(); getContentsAsJMEString()

Notes: In contrast to the other three editors, there is no option to display this applet in the browser itself; it must be rendered as a separate window. In addition, this editor requires that several configuration and resource files be accessible on the server. Molfile output uses V3000 ctabs. Although V2000 ctabs are supported, the only way to activate this functionality is to modify the source code.
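Since the editors differ in which ctab version they emit, code that receives a molfile may want to check the version before parsing. The JavaScript below is a minimal sketch that relies on the molfile convention of a version stamp at the end of the counts line (the fourth line); the function name `molfileVersion` is hypothetical and the sample text is a hand-written placeholder, not output captured from any of the editors.

```javascript
// Hypothetical helper: report the ctab version of a molfile, or null when
// it cannot be determined. The counts line is the fourth line of a molfile
// and ends with "V2000" or "V3000".
function molfileVersion(molfile) {
  const lines = molfile.split(/\r?\n/);
  if (lines.length < 4) return null;
  const match = /V(2000|3000)\s*$/.exec(lines[3]);
  return match ? 'V' + match[1] : null;
}

// Hand-written single-atom example (three header lines, then the counts line).
const sample = [
  '',
  '  example',
  '',
  '  1  0  0  0  0  0  0  0  0  0999 V2000',
  '    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0',
  'M  END',
].join('\n');
console.log(molfileVersion(sample)); // prints V2000
```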

View the Sample Page

MCDL

Homepage: SourceForge

License: Public Domain

Source Code: SourceForge

Size: 256 KB

APPLET_CLASS: mcdl.MCDLEditor

OUTPUT_METHOD: getMDCL()

Notes: This editor only supports output in Modular Chemical Descriptor Language format.

View the Sample Page

Conclusions

This review has only scratched the surface of what is possible with these editors. For example, all four accept input as well as produce output. As a result, they can be used to render 2-D molecular images, with varying amounts of Java coding. Both MCDL and JME are especially attractive from the developer perspective because each is distributed as a single jar file with a small footprint.

Although numerous 2-D structure editors are available, those reviewed here meet the minimum requirements for the development of free chemical web applications: they work on nearly all computing platforms, thanks to Java, and they are themselves free.