08 November 2008

The Digital World

Digital maps and geographical databases have become a commonplace commodity. To locate a friend's address, you simply open Google Maps. When you go to their place by car, your nav system will guide you, and if you feel challenged by the patent folding of your Falk city map, you can download a map to your mobile phone to cover the last mile to your friend's home.

The raw geographical data for such services are supplied by companies like NAVTEQ and Tele Atlas, but it takes a lot of additional data processing to build user-friendly services based on these data.

My daytime work is dedicated to the design and development of a map compiler for Harman/Becker, which will produce the map databases for the next generation of car navigation systems to hit the market within a few years from now.

Doing this job in Java does not sound natural to everyone, but for us, the choice of Java has been the key to building our work on top of a whole lot of ready-to-use components and libraries, both for general-purpose and domain-specific tasks. So it is really Java and all these excellent open source components that help us get around the world.

Let me just mention some of the Java libraries or applications that are useful for dealing with geographical data. I will cover some of them in more detail in future posts.

The best thing about geo data is that you can have a look at them - if you have some viewer tool that renders a map for you. A couple of years ago, I was searching for a Java-based alternative to the home-grown tools we use at work, and I came across JUMP. This gave me a jump start into geo data processing in Java...

JUMP is a map viewer and editor for a number of open standard data formats, and it has "plug-in" interfaces for supporting alternative data formats, which was a key requirement for me. I put the word "plug-in" in quotes, because JUMP has absolutely nothing to do with Eclipse plug-ins or extension points or OSGi bundles. JUMP does its own bit of class loader magic to use extension classes from JAR files you drop into a given folder.

I found JUMP easy to use and fairly easy to extend. Some people complained to me about its bland and somewhat out-dated GUI, but that has never really bothered me. JUMP is a solid and mature piece of work, and the major downside I see is the fact that it is no longer under active development. There are now more flexible extension mechanisms, and some of the features I would like to extend or modify are simply not laid open, so I would have to modify the sources.

Actually, Mama JUMP may have retired, but her offspring is alive and kicking: There is a whole family of JUMP spin-off projects.

JUMP introduced me to the JTS Topology Suite, which it uses for all operations on two-dimensional geometrical shapes. I found JTS to be extremely useful and well-designed: it is based on a solid mathematical background and succeeds in hiding most of the maths from the average user who just needs to get his job done. It also has a rather interesting history.

JTS has an LGPL license, whereas JUMP is under the GPL. The GPL may be too restrictive for some commercial applications, and when I wrote to Vivid Solutions, the company behind both JUMP and JTS, about this licensing issue, they pointed me to uDig as an LGPLed alternative.

That must have been some time in 2005. My first encounter with uDig was not very successful: I did see the power of its architecture, built on top of Eclipse RCP, but at that time, I was simply overwhelmed by all the prerequisites you had to understand to implement so much as a Hello World plug-in in uDig. Back then, I knew little or nothing about Eclipse plug-in development, so I concluded that uDig was not (yet) for me, but I kept watching its progress.

In the meantime, uDig has grown a lot in terms of functionality, stability and documentation, whereas my own knowledge of OSGi and Eclipse technologies has increased. Getting started with uDig extensions is still rather a challenge, but my efforts have started paying off. With support from the developer mailing list, my uDig plug-ins have grown to a level of functionality almost equivalent to my JUMP extensions, so I am planning to stop working with JUMP and use uDig as a more flexible, powerful and (hopefully) future-proof platform.

The abstractions of cartographic features and datastores used by uDig are based on Geotools, another open source project providing a set of Java libraries for processing a vast number of geospatial data formats, including plain files, databases or web services. Both Geotools and uDig use JTS for geometrical operations.

All these projects directly or indirectly rely on OpenGIS standards and specifications, in particular, the Simple Features Specification. Geotools provides implementations of the OpenGIS Java interfaces published by the GeoAPI project.

To sum up, for geospatial data processing, there is a wealth of open source Java libraries based on open standards. For any Java-based development in this area, one of these projects may save you some work.

06 November 2008

Hibernate and OSGi: An elaborate solution

A much cleaner and more flexible way of osgifying a plain old JAR with lots of dependencies is adding the required OSGi headers to each JAR, so you replace the megabundle with a bunch of small bundles, one per JAR, which you can reuse in other contexts.

In an ideal OSGi world, every JAR would be an OSGi bundle, so you would have nothing to worry about. Unfortunately, most Java libraries still come as plain old JARs and you have to add the headers on your own.

If you are lucky, someone has done the job for you already. There are a number of third-party repositories offering osgified versions of popular Java libraries, e.g. the OSGi bundle repository or the SpringSource Enterprise Bundle Repository.

SpringSource even has an OSGi bundle of Hibernate itself. However, their version does not work, at any rate not in my setup, so I had to build my own Hibernate bundle.

At least, I was able to use the SpringSource OSGi version for each dependency of Hibernate.

These are the problems I had with the SpringSource version of Hibernate (3.2.6.ga):

  • The javax.transaction package does not resolve. It is imported with a specific version range. However, in Java 1.6.0, this package is contained in the JRE and comes from there without a version. I had to drop the version directive to make Hibernate use the javax.transaction package from the JRE.
  • I got an exception on running an HQL query, since ANTLR could not load the Hibernate token class.
  • I had a very mysterious ClassNotFoundException when accessing some of my model classes. As it turned out, the reason was my usage of lazy loading, where Hibernate injects CGLIB proxy classes into my model classes. The exception was due to the fact that my model bundle did not have access to the CGLIB classes. Rather than declaring a dependency on CGLIB for each of my model bundles, I added the following header to my Hibernate manifest:

Require-Bundle: com.springsource.net.sf.cglib;visibility:=reexport

This means that every bundle with a dependency on Hibernate automatically inherits the dependency on CGLIB. Even St. Peter the Evangelist who usually preaches about using Import-Package instead of Require-Bundle admits this to be one of the rare cases where the latter has its merits.

So far, with this approach, we have not used any Equinox buddy policies, but we still need to deal with the application model classes and resources.

For a while, I thought a fragment bundle would be the definitive solution, which would work not only on Equinox.

I created a bundle com.acme.myapp.hibernate.fragment, containing no Java classes, but only a manifest and some resources, i.e. the Hibernate configuration file and all my Hibernate mapping files. I added all model packages and all relevant JDBC driver packages to the Import-Package header (using resolution:=optional for the JDBC packages). The host for this fragment is the org.hibernate.osgi bundle, of course.

This worked perfectly when launching my application from within the Eclipse IDE. However, I had an unpleasant surprise when running our batch builds which are based on the Eclipse PDE Ant runner.

PDE kept complaining about cyclic dependencies in my bundles. And not only in my own bundles, but also between some of the third-party libraries used by Hibernate, in particular between jaxen and dom4j.

(I had a look at the sources of these two libraries to understand what was going on here: each of them contains some helper classes for the other one, and each must have been compiled against an older version of its friend - really scary...)

I had already spent half a day working around this problem by repackaging each cycle of libraries in one bundle, and I was going to file an Eclipse bug report. When searching for similar issues, I found bug 208011, which not only describes the problem, but also offers a partial solution:

allowBinaryCycles = true

is a property you can set in your top-level build.properties file as of Eclipse 3.4. (Not a word about this in the Eclipse Online Help, and not even in the comments of the batch build template files!)

According to Chris Aniszczyk, this option will be accessible from the IDE UI in Eclipse 3.5M3.

allowBinaryCycles did suppress the error message regarding the dom4j-jaxen cycle.

But alas, now I had another error complaining about a cycle between Hibernate, my Hibernate fragment and my model bundles. Apparently, Eclipse regards the fragment as part of its host (which is ok). Since the fragment depends on an application bundle, and the application bundle depends on Hibernate, the PDE batch build now complains about a cycle between Hibernate and my application bundle.
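The reasoning behind this error can be reproduced with a toy dependency graph. The following is only a sketch under stated assumptions: the class BundleCycleCheck and its methods are made up for illustration, the model bundle name com.acme.myapp.model is hypothetical, and the depth-first search is a generic cycle check, not the algorithm the PDE build actually uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy dependency-cycle check; a sketch, not the PDE build's actual algorithm. */
final class BundleCycleCheck {

    /** Depth-first search: meeting a bundle that is still on the current path closes a cycle. */
    static boolean hasCycle(Map<String, List<String>> requires) {
        Set<String> visited = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String bundle : requires.keySet()) {
            if (visit(bundle, requires, visited, onPath)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String bundle, Map<String, List<String>> requires,
                                 Set<String> visited, Set<String> onPath) {
        if (onPath.contains(bundle)) {
            return true;                 // back edge: we returned to a bundle on the current path
        }
        if (!visited.add(bundle)) {
            return false;                // already fully explored without finding a cycle
        }
        onPath.add(bundle);
        List<String> deps = requires.get(bundle);
        for (String dep : deps == null ? Collections.<String>emptyList() : deps) {
            if (visit(dep, requires, visited, onPath)) {
                return true;
            }
        }
        onPath.remove(bundle);
        return false;
    }

    /** Treats the fragment as part of its host: the host inherits the fragment's dependencies. */
    static Map<String, List<String>> mergeFragmentIntoHost(
            Map<String, List<String>> requires, String fragment, String host) {
        Map<String, List<String>> merged = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : requires.entrySet()) {
            if (!e.getKey().equals(fragment)) {
                merged.put(e.getKey(), new ArrayList<>(e.getValue()));
            }
        }
        List<String> hostDeps = merged.computeIfAbsent(host, k -> new ArrayList<>());
        List<String> fragmentDeps = requires.get(fragment);
        if (fragmentDeps != null) {
            for (String dep : fragmentDeps) {
                if (!dep.equals(host) && !hostDeps.contains(dep)) {
                    hostDeps.add(dep);
                }
            }
        }
        return merged;
    }

    /** The bundles from the text; the model bundle name is a made-up placeholder. */
    static Map<String, List<String>> exampleGraph() {
        Map<String, List<String>> g = new LinkedHashMap<>();
        g.put("org.hibernate.osgi", Collections.<String>emptyList());
        g.put("com.acme.myapp.hibernate.fragment",
                Arrays.asList("org.hibernate.osgi", "com.acme.myapp.model"));
        g.put("com.acme.myapp.model", Arrays.asList("org.hibernate.osgi"));
        return g;
    }

    public static void main(String[] args) {
        Map<String, List<String>> separate = exampleGraph();
        Map<String, List<String>> merged = mergeFragmentIntoHost(
                separate, "com.acme.myapp.hibernate.fragment", "org.hibernate.osgi");
        System.out.println(hasCycle(separate)); // false: the fragment is its own node
        System.out.println(hasCycle(merged));   // true: host -> model -> host
    }
}
```

As long as the fragment is counted as a separate node, the graph is acyclic. The cycle only appears once the fragment's dependency on the model bundle is attributed to the Hibernate host.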

So for now, I left all the resources and the JDBC dependencies in the Hibernate fragment, but reverted to using buddy declarations in Hibernate and my model bundles to break the dependency cycle.

I also created a fragment com.acme.myapp.antlr.fragment which imports org.hibernate.hql.ast.HqlToken and thus makes the custom token class of Hibernate visible to ANTLR. (See the SpringSource bug report.)

All in all, this looks 95 % clean to me, so I think I'm going to leave it at that for a while...

Hibernate and OSGi: A pragmatic solution

The simplest thing you can do to turn a plain old JAR with a number of external dependencies into an OSGi bundle is wrapping this JAR and all its dependencies in another JAR and adding an OSGi manifest with all dependencies on the bundle classpath.

For example:
Bundle-SymbolicName: org.hibernate.osgi
Bundle-ClassPath: hibernate3.jar,
 lib/antlr.jar,
 lib/cglib.jar,
 lib/commons-logging.jar,
 ...
Export-Package: org.hibernate,
 org.hibernate.cfg,
 ...
With this megabundle, you can trivially solve all classpath or visibility issues between Hibernate and its dependencies.

You still have to do something to make the model classes of your application and the mapping files accessible to Hibernate.

Eclipse Equinox has an extension of the OSGi standard called buddy policies, which enables you to define some kind of classloader callback.

Your application depends on Hibernate, but you do not want Hibernate to depend on your application. Even if you wrap your own Hibernate bundle, you want to be able to use it in more than one of your applications.

There are several flavours of buddy policies; I will mention just one of them. Adding the header
Eclipse-BuddyPolicy: registered
to your Hibernate bundle manifest will tell Hibernate to ask all its buddies for classes it cannot load using its own classloader.

In each bundle of your application containing some Hibernate model classes, you then add a dependency on Hibernate and a header telling Hibernate that your bundle is a buddy:
Eclipse-RegisterBuddy: org.hibernate.osgi
Require-Bundle: org.hibernate.osgi
(The buddy thing does not work unless there is an actual dependency by Require-Bundle or Import-Package.) This will enable Hibernate to load classes and resources from your bundle, so it will also be able to locate your Hibernate mapping files if you put them into your bundle next to the model classes. Make sure to define your mappings in terms of resources, not files.
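To make the "classloader callback" idea more tangible, here is a minimal sketch of a loader that asks registered buddies when its own lookup fails. It is a toy model under stated assumptions: BuddyClassLoader, registerBuddy, canLoad and ModelEntity are hypothetical names invented for this illustration, and the code is not how Equinox implements buddy policies.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Stand-in for an application model class that lives outside the host "bundle". */
final class ModelEntity { }

/**
 * Toy model of a "registered" buddy lookup. Hypothetical illustration only;
 * this is not the Equinox implementation.
 */
final class BuddyClassLoader extends ClassLoader {
    private final List<ClassLoader> buddies = new CopyOnWriteArrayList<>();

    BuddyClassLoader(ClassLoader parent) {
        super(parent);
    }

    /** Loosely corresponds to a bundle declaring Eclipse-RegisterBuddy for this host. */
    void registerBuddy(ClassLoader buddy) {
        buddies.add(buddy);
    }

    /** Invoked by loadClass() only after the normal parent delegation has failed. */
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        for (ClassLoader buddy : buddies) {
            try {
                return buddy.loadClass(name);
            } catch (ClassNotFoundException e) {
                // This buddy does not know the class; ask the next one.
            }
        }
        throw new ClassNotFoundException(name);
    }

    /** Convenience for demos: true if this loader can resolve the given class name. */
    boolean canLoad(String name) {
        try {
            loadClass(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }
}

final class BuddyDemo {
    public static void main(String[] args) {
        String modelName = ModelEntity.class.getName();
        // Parent null: the host sees only JDK bootstrap classes, like a bundle without imports.
        BuddyClassLoader host = new BuddyClassLoader(null);
        System.out.println(host.canLoad(modelName)); // false: no buddy registered yet
        host.registerBuddy(ModelEntity.class.getClassLoader());
        System.out.println(host.canLoad(modelName)); // true: found via the buddy
    }
}
```

Note the direction of the lookup: the host has no static dependency on the model class, yet it can load the class by name once the model side has registered itself.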

You can use the same approach for your JDBC drivers. More likely than not, your JDBC JAR does not come with an OSGi manifest, so you have to wrap it in a bundle anyway. Add a dependency on Hibernate and make your JDBC bundle a buddy of Hibernate. Logically, this is upside-down, but it has the advantage that you can exchange your JDBC drivers without changing the Hibernate bundle.

Alternatively, you can add optional dependencies to the Hibernate manifest on any JDBC driver you are planning to use, e.g.
Require-Bundle: com.microsoft.sqljdbc;resolution:=optional,
 org.postgresql;resolution:=optional
Of course, this solution is a nightmare to any OSGi purist, but it does work and it is easy to set up.

There are the following major drawbacks:
  • You are bound to Equinox. Other OSGi implementations do not have a buddy policy equivalent. To the best of my knowledge, you would have to resort to DynamicImport-Package, which opens the gates much wider.
  • The megabundle approach does not scale. It will be okay if Hibernate is the only component in your system you treat this way. If you have three or four such heavyweight components, you will end up wrapping general-purpose JARs like commons-logging.jar over and over again.
  • And you will be in real trouble if some of the wrapped third-party dependencies occur in more than one megabundle and are used in the API of your components. With multiple copies of the same JAR in different bundles, and each bundle having its own classloader, you will end up with class cast or assignment exceptions, because classes loaded via different class loaders are always incompatible, even if they have the same fully qualified name.
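The last point can be demonstrated without OSGi at all. The sketch below defines the same class twice through two separate loaders and compares the results. All names (SharedType, CopyingLoader, LoaderIdentityDemo) are hypothetical, and the sketch assumes that the compiled class file is readable as a resource, which is the case for classes run from a directory or JAR on the classpath.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

/** Stand-in for a type from a third-party JAR that is wrapped into two megabundles. */
final class SharedType { }

/** Loader that defines its own copy of one class instead of delegating to its parent. */
final class CopyingLoader extends ClassLoader {
    private final String copiedName;

    CopyingLoader(ClassLoader parent, String copiedName) {
        super(parent);
        this.copiedName = copiedName;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        if (!name.equals(copiedName)) {
            return super.loadClass(name, resolve);   // everything else: normal delegation
        }
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                byte[] bytes = readClassFile(name);
                c = defineClass(name, bytes, 0, bytes.length);
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    /** Reads the class file bytes through the parent loader's resource lookup. */
    private byte[] readClassFile(String name) throws ClassNotFoundException {
        String path = name.replace('.', '/') + ".class";
        try (InputStream in = getParent().getResourceAsStream(path)) {
            if (in == null) {
                throw new ClassNotFoundException(name);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}

final class LoaderIdentityDemo {
    /** Loads a fresh copy of SharedType through a new CopyingLoader. */
    static Class<?> loadCopy() {
        String name = SharedType.class.getName();
        try {
            return new CopyingLoader(SharedType.class.getClassLoader(), name).loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> a = loadCopy();
        Class<?> b = loadCopy();
        System.out.println(a.getName().equals(b.getName())); // true: same fully qualified name
        System.out.println(a == b);                          // false: two distinct classes
        System.out.println(a.isInstance(new SharedType()));  // false: a cast would fail
    }
}
```

At runtime, a class is identified by its name together with its defining loader, so the two copies are unrelated types as far as the JVM is concerned.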

05 November 2008

Hibernate and OSGi: The problem

Making Hibernate work in an OSGi environment is not trivial. Over the last couple of months, I have been experimenting with various approaches, and I'm not yet fully satisfied with the solutions I have found.

The funny thing is, there must be a whole bunch of people facing the same problem, and you do find a couple of postings or example code with the help of your favourite search engine. However, none of the articles or examples seem to match my environment or my requirements.

Now here is an outline of my approaches, which may or may not work for you. Take a look and give me some feedback.

Before talking about solutions: we all know the answer is 42, but we haven't really worked out the question yet. So let me first list the issues you have to deal with:
  • Hibernate has a lot of third-party dependencies.
  • When you use an XML configuration file for your session factory, Hibernate must be able to load it.
  • Hibernate needs to load your JDBC driver classes.
  • When using Hibernate mapping files for your entity classes, Hibernate needs to be able to load them as resources.
  • While processing the mapping files, Hibernate reads the names of your entity classes and loads them using Class.forName().
  • Hibernate internally uses ANTLR to parse HQL queries, and it passes a custom token class to ANTLR. Unfortunately, ANTLR requires you to specify this class by name and uses Class.forName() to load the custom token class.
  • When using lazy loading with the CGLIB proxy generator, Hibernate injects a dependency on CGLIB into your model classes. Thus, any bundle using your model classes may have to load CGLIB classes.
There may be more issues, depending on the way you work with Hibernate. In my project environment, some issues mentioned by other people simply do not occur, while others lead to new problems.
  • We do not use Hibernate annotations.
  • Our OSGi configuration is static. All bundles are installed and started during the framework startup phase, registering all services. We do not have to worry about adding new classes to the session factory when a bundle gets installed at run-time. (For this reason, the extender pattern approach is a bit too heavyweight for my taste.)
  • We do not use a web container, or Spring, or anything else, just naked OSGi - well, Eclipse Equinox, in fact.
  • Our IDE is Eclipse and we want to be able to launch our OSGi application including Hibernate from Eclipse.
  • We also use Eclipse PDE batch builds.
Having described the problem in some detail, I will explain a pragmatic and a more elaborate solution in separate articles.

03 November 2008

Why I use OSGi

I want modules and controlled dependencies in my Java applications. OSGi gives me both, that's why I use it.

It's as simple as that. Really.

Yes, OSGi gives you a lot more into the bargain, most notably services, which I also use, but the unique selling point for me is modules.

When you look at the sources and binaries of any software system (not only in Java), preferably one you haven't written yourself, you see a lot of files in a directory structure. Usually, the directory structure reflects the system architecture, subdirectories corresponding to subsystems or components.

Some of these directories correspond to binary artifacts (JARs, DLLs, whatever) which you may want to reuse independently or in another system.

Now when you start pulling out this library, you have to satisfy all its runtime dependencies. So you need to take a bunch of other libraries for this library to work. Identifying the dependencies may be a tedious trial-and-error process.

There may be additional compile-time dependencies you cannot recognize by looking at the binaries only. In C/C++, there is often some global.h directly or indirectly included by every source file in your system.

Even if there was some clever architect who designed the system in terms of components and allowed dependencies, there is usually no guarantee that this architecture will be adhered to, because nothing ever prevents a developer from including or importing something they are not supposed to use.

Java has packages and classes with visibility levels, but classes and even packages are rather too fine-grained from an architect's perspective. I want my modules to be larger than packages, and I want to use them as configuration units. The modules should be in one-to-one correspondence to libraries or JARs.

This is just what OSGi provides. A plain old JAR is turned into an OSGi bundle (that's what modules are called in OSGi-land) simply by adding a couple of special headers to its manifest.

The manifest headers declare the import and export relations between modules. A bundle B may have a package with public classes, but at runtime, no other bundle C can use these classes unless bundle B exports them.
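The visibility rule can be written down as a tiny model. This is a sketch only: BundleManifest and its methods are invented for this illustration, the package names com.acme.b.api and com.acme.b.internal are hypothetical, and a real OSGi framework enforces the rule through bundle class loaders and a resolver rather than a simple set lookup.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of a bundle manifest: only the package names it exports and imports. */
final class BundleManifest {
    private final Set<String> exportedPackages;
    private final Set<String> importedPackages;

    BundleManifest(List<String> exports, List<String> imports) {
        this.exportedPackages = new HashSet<>(exports);
        this.importedPackages = new HashSet<>(imports);
    }

    /** A class is usable only if the provider exports its package and this bundle imports it. */
    boolean canUse(String className, BundleManifest provider) {
        String pkg = packageOf(className);
        return provider.exportedPackages.contains(pkg) && importedPackages.contains(pkg);
    }

    /** Returns the package part of a fully qualified class name ("" for the default package). */
    static String packageOf(String className) {
        int dot = className.lastIndexOf('.');
        return dot < 0 ? "" : className.substring(0, dot);
    }
}

final class VisibilityDemo {
    public static void main(String[] args) {
        // Bundle B exports only its API package; the internal package stays private.
        BundleManifest b = new BundleManifest(
                Arrays.asList("com.acme.b.api"), Arrays.<String>asList());
        // Bundle C imports the API package of B.
        BundleManifest c = new BundleManifest(
                Arrays.<String>asList(), Arrays.asList("com.acme.b.api"));

        System.out.println(c.canUse("com.acme.b.api.Service", b));          // true
        System.out.println(c.canUse("com.acme.b.internal.ServiceImpl", b)); // false, although the class may be public
    }
}
```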

When working with Eclipse, its Plug-in Development Environment (PDE) already controls bundle imports and exports at compile time. You get a warning when you import a class that was not exported.

Of course this does not prevent you from adding import or export directives to your bundle manifests, thus diluting the original architecture. But Eclipse offers tools for the architect (or yourself) to inspect the bundle hierarchy and to easily detect any unintentional dependencies.

This is just one aspect of OSGi, but it's the one I find most valuable.

02 November 2008

Joining the Blogosphere

To blog, or not to blog - that never used to be the question, but now it somehow makes sense to start.

Blogo ergo sum?

Well, recently I found myself reading a growing number of blog postings from other people working on related topics, which gave me some help on problems I was dealing with or showed me some new tools or technologies I found useful, so I think it's time to reciprocate.

Topics that are likely to show up here include Software Development in general, with a special focus on Java, Eclipse, OSGi and all sorts of tools and tricks that help you solve problems you never had without them.