Compiling Open Babel to Pure Java Bytecode with NestedVM: Building a Runnable Classfile that Almost Works
Previously, I described an unsuccessful first attempt to compile the popular cheminformatics C/C++ library Open Babel to pure Java bytecode using NestedVM. This article follows that topic one step further, and shows how to obtain a runnable Java classfile. Although major functionality is missing, the principle of compiling arbitrary C/C++ code to both Java source code and Java bytecode is illustrated.
Getting Started
This article assumes that you've installed NestedVM and downloaded Open Babel on your system. You'll then need to set up your environment (from the nestedvm installation directory):
$ source env.sh
Run the Configure Script
The configure script we used last time didn't attempt to statically compile the binary utilities in the tools directory. This time, we'll add flags to allow this:
$ ./configure --disable-dynamic-modules --enable-static=yes --enable-shared=no --enable-inchi --host=mips-unknown-elf
$ make
Note: leaving out the static compile directives does not produce a fully-functioning classfile either.
Next, we'll attempt to directly create the babel binary in Java classfile format, as we did last time:
$ cd tools
$ java org.ibex.nestedvm.Compiler -outfile Babel.class Babel babel
Exception in thread "main" java.lang.IllegalStateException: unresolved phantom target
at org.ibex.classgen.MethodGen.resolveTarget(MethodGen.java:555)
at org.ibex.classgen.MethodGen._generateCode(MethodGen.java:664)
at org.ibex.classgen.MethodGen.generateCode(MethodGen.java:618)
at org.ibex.classgen.MethodGen.dump(MethodGen.java:888)
at org.ibex.classgen.ClassFile._dump(ClassFile.java:193)
at org.ibex.classgen.ClassFile.dump(ClassFile.java:160)
at org.ibex.nestedvm.ClassFileCompiler.__go(ClassFileCompiler.java:380)
at org.ibex.nestedvm.ClassFileCompiler._go(ClassFileCompiler.java:72)
at org.ibex.nestedvm.Compiler.go(Compiler.java:259)
at org.ibex.nestedvm.Compiler.main(Compiler.java:183)
We're getting the same error as before. Although a bugfix was announced on the NestedVM list, in my hands the new version of NestedVM produced the same error.
As a workaround, we can compile to Java source code first:
$ java org.ibex.nestedvm.Compiler -outformat java -outfile Babel.java Babel babel
We now have a Java source file encoding the babel program. Does it compile?
$ javac Babel.java
The system is out of resources.
Consult the following stack trace for details.
java.lang.OutOfMemoryError: Java heap space
at com.sun.tools.javac.util.Position$LineMapImpl.build(Position.java:139)
at com.sun.tools.javac.util.Position.makeLineMap(Position.java:63)
at com.sun.tools.javac.parser.Scanner.getLineMap(Scanner.java:1105)
at com.sun.tools.javac.main.JavaCompiler.parse(JavaCompiler.java:512)
at com.sun.tools.javac.main.JavaCompiler.parse(JavaCompiler.java:550)
at com.sun.tools.javac.main.JavaCompiler.parseFiles(JavaCompiler.java:801)
at com.sun.tools.javac.main.JavaCompiler.compile(JavaCompiler.java:727)
at com.sun.tools.javac.main.Main.compile(Main.java:353)
at com.sun.tools.javac.main.Main.compile(Main.java:279)
at com.sun.tools.javac.main.Main.compile(Main.java:270)
at com.sun.tools.javac.Main.compile(Main.java:69)
at com.sun.tools.javac.Main.main(Main.java:54)
Not exactly. But this is a massive source file, so we'll need to increase the Java compiler's memory allowance:
$ javac Babel.java -J-Xms256m -J-Xmx256m
Note: Babel.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
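The -J prefix hands the option that follows to the JVM that runs javac, so -J-Xmx256m raises that JVM's maximum heap to 256 MB. As a rough illustration of the same idea, the sketch below runs the compiler in-process through the standard javax.tools API, so the heap limit is simply whatever the hosting JVM was started with. The class name CompileWithHeap is hypothetical, and this is an alternative route, not what was done above.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Hypothetical helper: compiles one source file inside the current JVM,
// so the heap available to the compiler is set by this JVM's -Xmx flag.
public class CompileWithHeap {

    /** Maximum heap this JVM may use, in MiB. */
    public static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** Compiles the given source file; returns the compiler's exit code (0 means success). */
    public static int compile(String sourcePath) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            // Happens when running on a JRE that ships without a compiler.
            throw new IllegalStateException("No system Java compiler found; run this on a JDK");
        }
        // Passing null for the three streams selects System.in, System.out and System.err.
        return compiler.run(null, null, null, sourcePath);
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java -Xmx256m CompileWithHeap <File.java>");
            System.exit(2);
        }
        System.out.println("Max heap available to the compiler: " + maxHeapMiB() + " MiB");
        System.exit(compile(args[0]));
    }
}
```

Assuming the class has been compiled, it would be run with the heap setting on the java command itself:
$ java -Xmx256m CompileWithHeap Babel.java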
This seems to have worked. Can we run the classfile?
$ java Babel -H
Open Babel converts chemical structures from one file format to another
Usage: Babel <input spec> <output spec> [Options]
Each spec can be a file whose extension decides the format.
Optionally the format can be specified by preceding the file by
-i<format-type> e.g. -icml, for input and -o<format-type> for output
--truncated--
Success! But before we get too excited, let's make sure Open Babel's file formats are recognized by testing for "SMILES":
$ java Babel -Hsmi
Format type: smi was not recognized
As you can see, we have successfully converted the babel program to an executable classfile, but this classfile is missing most of the features of the native binary.
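The recognition check can be repeated for several formats at once. The sketch below is a minimal, hypothetical smoke test: it launches "java Babel -H<format>" for each format ID and looks for the "was not recognized" message shown above. The class name FormatSmokeTest and the choice of format IDs (smi, mol, cml) are illustrative assumptions, and the program assumes Babel.class is in the current directory.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Arrays;
import java.util.List;

// Hypothetical smoke test for the generated Babel classfile.
public class FormatSmokeTest {

    /**
     * True unless the output contains the "was not recognized" message.
     * Note: any other failure (for example a missing class) would also
     * count as "recognized" here, so read the raw output when in doubt.
     */
    public static boolean isRecognized(String helpOutput) {
        return !helpOutput.contains("was not recognized");
    }

    /** Runs "java Babel -H<format>" and returns its combined stdout and stderr. */
    public static String runBabelHelp(String format) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder("java", "Babel", "-H" + format);
        pb.redirectErrorStream(true); // merge stderr into stdout
        Process process = pb.start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        process.waitFor();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        List<String> formats = Arrays.asList("smi", "mol", "cml"); // illustrative choices
        for (String format : formats) {
            boolean ok = isRecognized(runBabelHelp(format));
            System.out.println(format + ": " + (ok ? "recognized" : "NOT recognized"));
        }
    }
}
```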
This may seem hopeless, but consider that natively compiling Open Babel using the above configure flags also produces a binary that doesn't know about SMILES or any other format.
So, it's very likely that if we can produce a native, statically compiled, self-contained babel executable, then we will have solved the problem of running Open Babel entirely on a JVM.
This doesn't seem like a difficult problem, but apparently it is.
Compiling Open Babel to Pure Java Bytecode with NestedVM: An Unsuccessful First Attempt
Wouldn't it be great to be able to compile code written in languages like FORTRAN, C, and C++ to Java bytecode? NestedVM - almost magically - can do just that. This article documents a failed first attempt to compile the popular cheminformatics toolkit Open Babel, which is written in C and C++, to pure Java bytecode with NestedVM.
A previous article described the successful compilation of the InChI toolkit, a C library, to a platform-independent executable jarfile.
The Problem
Open Babel is one of cheminformatics' most widely-used open source packages. It interconverts dozens of molecular languages, performs a host of cheminformatics analyses, and serves as a platform for many programs and Web services.
As useful as Open Babel is, it doesn't run directly on a Java Virtual Machine (JVM). Although an Open Babel JNI interface does exist, using it introduces a platform dependency, which in many cases is not acceptable. JNI is a great solution in some cases, but when maintaining a single version of a program is important, or when applets need to be used, or when code needs to work with unusual system configurations, it's a poor choice.
Our goal is to compile Open Babel's "babel" command-line utility into pure Java bytecode that can be run on any recent JVM without using JNI.
Overview of NestedVM
In a nutshell, NestedVM converts MIPS binaries to Java class files. In theory, this allows software written in any language that can be compiled to a MIPS binary to be run on a JVM.
To do this, NestedVM distributes two categories of tools: (1) a complete MIPS cross-compiler toolchain; and (2) a MIPS binary to Java bytecode compiler and accessories.
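To make the idea of translating MIPS code to Java concrete, here is a toy, hand-written sketch. It is not NestedVM's actual output or internal design; the class name, the register fields, and the word-addressed memory array are all assumptions chosen for illustration. It only shows that a short MIPS instruction sequence maps naturally onto Java int arithmetic and array accesses.

```java
// Toy illustration only: a hand translation of four MIPS instructions
// into Java. All names here are hypothetical.
public class ToyMipsTranslation {
    // MIPS registers modeled as plain 32-bit int fields.
    private int t0, t1, t2;
    private int sp = 0;                        // stack pointer (byte address)
    private final int[] memory = new int[16];  // word-addressed toy memory

    /** MIPS addu: 32-bit addition that wraps on overflow, like Java int addition. */
    public static int addu(int a, int b) {
        return a + b;
    }

    /** Executes the translated sequence and returns the word stored at 0($sp). */
    public int run() {
        t0 = addu(0, 5);               // addiu $t0, $zero, 5
        t1 = addu(0, 7);               // addiu $t1, $zero, 7
        t2 = addu(t0, t1);             // addu  $t2, $t0, $t1
        memory[(sp + 0) >>> 2] = t2;   // sw    $t2, 0($sp)
        return memory[sp >>> 2];       // read the stored word back
    }

    public static void main(String[] args) {
        System.out.println(new ToyMipsTranslation().run());
    }
}
```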
Building NestedVM
The preferred method to install NestedVM is to compile it from source found in the project repository. There are a number of prerequisites your system must meet in order to be able to do so. For now, this article assumes your system has all of them. Some of the following steps can be found in these instructions as well.
To obtain the source code from the NestedVM darcs repository:
$ darcs get --repo-name=nestedvm http://nestedvm.ibex.org
Then change into the nestedvm directory and build the main code:
$ cd nestedvm
$ make
On my machine, this step takes 10-15 minutes.
To make sure your build works, run the tests:
$ make test
...
1.574000e+00 -4.315000e+01l -43 -4.315000e+01 4.315000e+01 Hello, World 7F fabs(-2.24) = 2.34 Destructor!
NestedVM doesn't build the g++ compiler by default - it's something that needs to be done manually. Fortunately, it's not difficult to do:
$ make cxxtest
...
java -cp build tests.CXXTest
Test's constructor Name: 0x50b40 Name: PKc Is pointer: 1 Name: 0x50b3c Name: i Is pointer: 0 Hello, World from Test Now throwing an exception sayhi threw: const char *:Hello, Exception Handling! Test's destructor
Finally, with all tools built, we need to set up our environment:
$ make env.sh
$ source env.sh
We're now ready to cross-compile Open Babel.
Cross-Compiling Open Babel
For this tutorial, we'll use the Open Babel 2.1.1 source distribution. Unpack the tarball and change into the directory.
Next, we'll need to set up our cross-compiler environment. Fortunately, NestedVM has made this easy. If you check your environment variables, you'll find that CXX and CC have both been set. All that remains is to notify the configure script that we'll be cross-compiling:
$ ./configure --host=mips-unknown-elf
Then we build the MIPS binaries:
$ make
Peeking into the tools directory, we can see all of the Open Babel command line tools have been built, including babel.
Unless you're running a MIPS machine, though, this binary won't be executable.
So far, it looks like everything worked. Although it didn't work the first time I tried it, the NestedVM team were most helpful.
Building the Java Class File
We're now ready for the final stage in the process, converting the MIPS binary to a Java class file. Again, NestedVM makes this simple:
$ cd tools
$ java org.ibex.nestedvm.Compiler -outfile Babel.class Babel babel
Exception in thread "main" java.lang.IllegalStateException: unresolved phantom target
at org.ibex.classgen.MethodGen.resolveTarget(MethodGen.java:555)
at org.ibex.classgen.MethodGen._generateCode(MethodGen.java:664)
at org.ibex.classgen.MethodGen.generateCode(MethodGen.java:618)
at org.ibex.classgen.MethodGen.dump(MethodGen.java:888)
at org.ibex.classgen.ClassFile._dump(ClassFile.java:193)
at org.ibex.classgen.ClassFile.dump(ClassFile.java:160)
at org.ibex.nestedvm.ClassFileCompiler.__go(ClassFileCompiler.java:380)
at org.ibex.nestedvm.ClassFileCompiler._go(ClassFileCompiler.java:72)
at org.ibex.nestedvm.Compiler.go(Compiler.java:259)
Unfortunately, NestedVM has blown up with an exception. Although our target class file, Babel.class, is now in our working directory, it is not complete and won't run.
What Went Wrong?
I brought this problem to the NestedVM mailing list, and it appears to be a NestedVM bug.
However, the way babel works is to load its various language modules dynamically. It may be possible to fix the problem by producing a version of babel containing all of its modules in a single binary.
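Why dynamically loaded modules matter can be illustrated with a generic plugin-registry pattern. The sketch below is not Open Babel's code; the class and method names are hypothetical, and it assumes only what is stated above, namely that format support lives in modules that must be loaded before the program knows about them. A format that was never loaded is simply absent from the registry.

```java
import java.util.HashMap;
import java.util.Map;

// Generic plugin-registry sketch; all names are hypothetical.
public class FormatRegistryDemo {

    /** A file format that can describe itself. */
    public interface Format {
        String description();
    }

    private final Map<String, Format> registry = new HashMap<>();

    /** Called by a module when it is loaded. */
    public void register(String id, Format format) {
        registry.put(id, format);
    }

    /** Stand-in for loading a SMILES module: the format exists only after this runs. */
    public void loadSmilesModule() {
        register("smi", () -> "SMILES format");
    }

    /** Returns help text for a format ID, or a "not recognized" message. */
    public String help(String id) {
        Format format = registry.get(id);
        if (format == null) {
            return "Format type: " + id + " was not recognized";
        }
        return id + " -- " + format.description();
    }

    public static void main(String[] args) {
        FormatRegistryDemo demo = new FormatRegistryDemo();
        System.out.println(demo.help("smi")); // module not loaded yet
        demo.loadSmilesModule();
        System.out.println(demo.help("smi")); // module loaded
    }
}
```

In this picture, a single binary containing all modules corresponds to a program in which every load call has already happened by the time the first lookup is made.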
Although there is a major issue to be resolved, this tutorial illustrates the full process of compiling C++ code to Java bytecode using NestedVM.

