27 July 2010

A Fish in the Clouds

How long does it take to find a hosting company and get an enterprise web application up and running from scratch? Could be weeks if there are enough lawyers and pointy-haired bosses involved. And even if you can directly talk to the providers, it usually takes a couple of days to request quotes and compare them.

That's the way we started for our current project, but now we've decided to have a go at cloud computing. (So at last this adds another buzzword to my CV...)

I vaguely remembered Arun Gupta's blog about Glassfish on Amazon EC2, and using that together with the Ubuntu EC2 Starter's Guide, it took me only about half an hour to set up a virtual machine in the Amazon Cloud with Ubuntu Server 10.04, JDK 1.6.0_20 and Glassfish 3.0.1, start the Glassfish domain and access the admin console from my local web browser.

And downloading the 78 MB Glassfish zip file on the EC2 instance from Oracle's server took less than 2 seconds...

Not bad for a start. I would have expected to spend about a day to get as far as that.

I was also using Ubuntu 10.04 on my local machine, working with the command line interface from the ec2-api-tools Ubuntu package most of the time.

Of course it will take some more time to set up the database server, deploy our web app and to secure the system. And it's too early to tell if we really require all the elasticity of EC2, which is likely to be more expensive in the long run than a conventional dedicated server cluster.

At any rate, installing a site in virtually no time and running it for less than 10 cents per hour is rather impressive. Thumbs up for Amazon Web Services!

26 July 2010

Building a Java EE 6 Web Application with Eclipse Helios, Maven and Glassfish

This is a short tutorial showing how to use Eclipse 3.6 (Helios) with Glassfish and Maven for building and running a Java EE 6 web application from a multi-module source tree.

We are going to import and run the Wicket gmap2-examples, demonstrating the use of Google Maps via Apache Wicket, but this is purely incidental - even if you prefer mainstream JSF or another web framework to Wicket, you may find this tutorial useful. There are absolutely no Wicket specifics involved; gmap2 was just a handy small but non-trivial example.

If you haven't worked with Maven before, you will get an idea how Maven can save you a lot of work managing the third-party dependencies of your project automatically.

This is what you'll see in the end:


There are quite a few things on your shopping list before you can start. If you are reading this, you probably already have the following on your disk:
You'll need to install these features into Eclipse:
  • Maven Integration for Eclipse (also known as m2eclipse)
  • Maven Integration for Eclipse (Extras)
  • Glassfish Java EE Application Server Plugin for Eclipse
Eclipse 3.6 has a new way of installing plugins via Help | Eclipse Marketplace. This is an integrated web client which lets you search for plugins by free text, so you no longer have to copy and paste update site URLs into the Update Manager - but you can still do so if you prefer.

The m2eclipse book has step-by-step instructions and screenshots explaining the installation process: start here with the m2eclipse installation. When you get to the Extras, the only ones required for the rest of this tutorial are Maven Integration for WTP and Maven SCM Integration. The WTP integration will let Eclipse recognize your Maven projects with war packaging as Dynamic Web Projects.

The Maven SCM Integration lets you fetch Maven projects directly from source code repositories like Subversion, Mercurial or others. Note that this integration simply invokes the corresponding command line clients like /usr/bin/svn.

Having installed m2eclipse via the Eclipse Marketplace, use the same procedure to install the Glassfish Plugin.

After restarting Eclipse, set your Glassfish preferences via Window | Preferences | Glassfish Preferences. For this tutorial, you should check Start the Glassfish Enterprise Server in verbose mode and uncheck the other items.

Next, define a server runtime environment via Window | Preferences | Server | Runtime Environment | Add... Select the server type Glassfish | Glassfish Server Open Source Edition 3 (Java EE 6), check Create a new local server and click Next. Select your Glassfish installation directory - this is required to be the parent of the modules and domains directories, e.g. /home/hwellmann/glassfish-3.0.1/glassfishv3/glassfish. Click Next and fill in the domain name and the administrator credentials. If you did not change the Glassfish defaults, you can probably just click on Finish.

The Servers view has opened automatically. You can select your server and click on the Run button to launch Glassfish from Eclipse. The Eclipse plugin launches Glassfish indirectly via an asadmin subprocess.

You will see some Glassfish log messages in the console. To stop Glassfish, do not hit the stop button in the Console view. This will just kill the asadmin process, but not Glassfish itself, and Eclipse and Glassfish will get terribly out of sync. Make sure to select the Servers view and hit the stop button there to avoid trouble.

Now for the interesting part: Let us import and build the gmap2 example project.

This project has a parent project with two subprojects, or modules in Maven terms. Maven requires you to store the module subprojects of a parent project in subdirectories of the parent directory. Eclipse normally cannot handle overlapping or nested directory structures for projects in the same workspace. Fortunately, m2eclipse works some magic to flatten the project structure, making Eclipse happy.

To import the example sources directly from the Subversion repository, select File | Import... | Maven | Check out Maven Projects from SCM and click Next. Fill in the SCM URL

https://wicket-stuff.svn.sourceforge.net/svnroot/wicket-stuff/tags/wicketstuff-core-1.4.9.2/gmap2-parent

and select the SCM provider svn, then click Next and Finish.

After a while, you will see three projects in your workspace:
  • gmap2
  • gmap2-examples
  • gmap2-parent

The parent project gmap2-parent has two subfolders gmap2 and gmap2-examples which are also represented as top-level Eclipse projects - this is a special feature of the m2eclipse integration.

The gmap2-examples subproject is a web application, as indicated by the Maven packaging type war. Using this and other information specified in the Maven POM, m2eclipse was able to turn this Eclipse project into a Dynamic Web Modules project - have a look at the Project Facets in the project properties to convince yourself.

Opening the project, you will notice two classpath containers Maven Dependencies and Web App Libraries. The latter contains just one item gmap2 with an open folder icon, indicating a reference to another subproject in our workspace that will go into WEB-INF/lib. The Maven Dependencies container displays all JARs required by our project, e.g. wicket-1.4.9.jar and slf4j-api-1.5.8.jar. All these were downloaded automatically by Maven into your local Maven repository on your hard drive, and m2eclipse makes Eclipse reference them from that location.

We are now ready to launch the web app: Select the gmap2-examples project folder and then Run As... | Run on Server from the context menu. Select the existing Glassfish server you created before and click Finish. After a while, you see the gmap2-examples welcome page in a web browser window.

Click on one of the links, e.g. marker listener, to see Google Maps in Firefox in Eclipse via Wicket on Glassfish on Java on Linux. (OK, maybe it's not Firefox or Linux on your machine...)

Note that the Java compilation and the WAR assembly and deployment were done by Eclipse, not by Maven.

However, you can also run a Maven build from Eclipse. To build our entire project hierarchy, select gmap2-parent and open Run As | Maven build... from the context menu. Enter the goals clean install and click Run. You will see the Maven messages in the console window. When Maven has finished, click Refresh (F5) on gmap2-parent, and open gmap2-examples/target to see the WAR file compiled by Maven.

20 July 2010

JPA 2.0: Querying a Map

Welcome back to more merriment with Maps in JPA 2.0!

After watching 3 out of 4 persistence providers choke on a model with a map in the previous post, let us now continue our experiments and see how our guinea pigs can handle JPQL queries for maps.

Recall that the JPQL query language has three special operators for building map queries: KEY(), VALUE() and ENTRY().

Now let us try and run the following query on a slightly modified model, compared to the previous post.

select m.text from MultilingualString s join s.map m where KEY(m) = 'de'

The corresponding model is:

@Embeddable
public class LocalizedString {

    private String language;

    private String text;

} 
 
@Entity
@Table(schema = "jpa", name = "multilingual_string")
public class MultilingualString {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    @Column(name = "string_id")
    private long id;

    @ElementCollection(fetch=FetchType.EAGER)
    @MapKeyColumn(name = "language_key")
    @CollectionTable(schema = "jpa", name = "multilingual_string_map", 
                     joinColumns = @JoinColumn(name = "string_id"))
    private Map<String, LocalizedString> map = new HashMap<String, LocalizedString>();
}

This time I've changed the model so that the map key is stored in its own column, which gives Hibernate and Eclipselink at least a chance to digest the model and proceed to the query. OpenJPA is fine with either version of the model.
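For readers who have not used the map operators before, the query above can be read as a pair of nested loops over plain Java maps. The following is only a rough analogy, not JPA code: the class MapQueryAnalogy and the method textsForLanguage are made up for illustration, and the map values are simplified to the text alone. A real persistence provider would translate the query to SQL instead.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MapQueryAnalogy {

    // Hypothetical in-memory counterpart of
    //   select m.text from MultilingualString s join s.map m where KEY(m) = 'de'
    // Each element of 'strings' plays the role of one MultilingualString.map.
    static List<String> textsForLanguage(List<Map<String, String>> strings, String lang) {
        List<String> result = new ArrayList<String>();
        for (Map<String, String> map : strings) {                 // from MultilingualString s
            for (Map.Entry<String, String> m : map.entrySet()) {  // join s.map m, cf. ENTRY(m)
                if (m.getKey().equals(lang)) {                    // where KEY(m) = 'de'
                    result.add(m.getValue());                     // select m.text, cf. VALUE(m)
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> greeting = new LinkedHashMap<String, String>();
        greeting.put("de", "Hallo");
        greeting.put("en", "Hello");

        Map<String, String> farewell = new LinkedHashMap<String, String>();
        farewell.put("en", "Goodbye");

        List<Map<String, String>> all = new ArrayList<Map<String, String>>();
        all.add(greeting);
        all.add(farewell);

        System.out.println(textsForLanguage(all, "de")); // prints [Hallo]
    }
}
```

So the query should return the German text of every string that has one, and nothing for strings without a 'de' entry.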

DataNucleus is out of the game by now. I even tried replacing the @Embeddable by an @Entity and a few other things to cheat it into accepting my model, but in the end I gave up.

Now, Ladies and Gentlemen, the winner and sole survivor is: OpenJPA again!

Both Hibernate and Eclipselink fail, merrily throwing exceptions. Hibernate seems to have only stubbed out the KEY() and VALUE() operators in its parser code (see HHH-5396 for the gory details and elaborate stack traces).

And Eclipselink's famous last words are:

Error compiling the query [select m.text from MultilingualString s join s.map m where KEY(m) = 'de'], 
line 1, column 9: unknown state or association field [text] of class [LocalizedString]. 
 

Not sure what the poor soul is trying to tell me.

To sum up: Should you ever consider working with persistent maps à la JPA 2.0, beware! Here be dragons...

17 July 2010

JPA 2.0: Mapping a Map

JPA 2.0 has added support for persistent maps where keys and values may be any combination of basic types, embeddables or entities.

Let's start with a use case:

The Use Case


In an internationalized application, working with plain old Strings is not enough: sometimes you also need to know the language of a string, and given a string in English, you may need to find an equivalent string in German.

So you come up with a LocalizedString, which is nothing but a plain old String together with a language code, and then you build a MultilingualString as a map of language codes to LocalizedStrings. Since you want to reuse LocalizedStrings in other contexts, and you don't need to address them individually, you model them as an embeddable class, not as an entity.

The special thing about this map is that the keys are part of the value. The map contents look like

'de' -> ('de', 'Hallo')
'en' -> ('en', 'Hello')

The Model


This is the resulting model:


[Update 20 July 2010: There is a slight misconception in my model, as pointed out by Mike Keith in his first comment on this post. Editing the post in-place would render the comments meaningless, so I think I'd better leave the original text unchanged and insert a few Editor's Notes. The @MapKey annotation below should be replaced by @MapKeyColumn(name = "language", insertable = false, updatable = false) to make the model JPA 2.0 compliant.]

@Embeddable
public class LocalizedString {

    private String language;

    private String text;

    public LocalizedString() {}

    public LocalizedString(String language, String text) {
        this.language = language;
        this.text = text;
    }
    
    // autogenerated getters and setters, hashCode(), equals()
} 
 
@Entity
@Table(schema = "jpa", name = "multilingual_string")
public class MultilingualString {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    @Column(name = "string_id")
    private long id;

    @ElementCollection(fetch=FetchType.EAGER)
    @MapKey(name = "language")
    @CollectionTable(schema = "jpa", name = "multilingual_string_map", 
                     joinColumns = @JoinColumn(name = "string_id"))
    private Map<String, LocalizedString> map = new HashMap<String, LocalizedString>();

    public MultilingualString() {}
    
    public MultilingualString(String lang, String text) {
        addText(lang, text);
    }
    
    public void addText(String lang, String text) {
        map.put(lang, new LocalizedString(lang, text));
    }

    public String getText(String lang) {
        if (map.containsKey(lang)) {
            return map.get(lang).getText();
        }
        return null;
    }
    
    // autogenerated getters and setters, hashCode(), equals()
}
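To see how the model is meant to behave before any persistence provider gets involved, here is a minimal in-memory sketch. It assumes annotation-free copies of the two classes so that it runs with the JDK alone; the names PlainLocalizedString, PlainMultilingualString, PlainModelDemo and the helper keysMatchValues() are made up for this illustration and are not part of the model above.

```java
import java.util.HashMap;
import java.util.Map;

// Annotation-free stand-in for the embeddable LocalizedString.
class PlainLocalizedString {
    private final String language;
    private final String text;

    PlainLocalizedString(String language, String text) {
        this.language = language;
        this.text = text;
    }

    String getLanguage() { return language; }

    String getText() { return text; }
}

// Annotation-free stand-in for the MultilingualString entity.
class PlainMultilingualString {
    private final Map<String, PlainLocalizedString> map =
            new HashMap<String, PlainLocalizedString>();

    void addText(String lang, String text) {
        // The map key is always taken from the language of the value.
        map.put(lang, new PlainLocalizedString(lang, text));
    }

    String getText(String lang) {
        PlainLocalizedString value = map.get(lang);
        return value == null ? null : value.getText();
    }

    // Hypothetical helper: true if every key equals the language of its value.
    boolean keysMatchValues() {
        for (Map.Entry<String, PlainLocalizedString> e : map.entrySet()) {
            if (!e.getKey().equals(e.getValue().getLanguage())) {
                return false;
            }
        }
        return true;
    }
}

class PlainModelDemo {
    public static void main(String[] args) {
        PlainMultilingualString s = new PlainMultilingualString();
        s.addText("de", "Hallo");
        s.addText("en", "Hello");

        System.out.println(s.getText("de"));     // prints Hallo
        System.out.println(s.getText("fr"));     // prints null
        System.out.println(s.keysMatchValues()); // prints true
    }
}
```

Because addText() is the only way to add entries, the key can never differ from the language stored in the value. That is the invariant the mapping is supposed to preserve in the database.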



The SQL statements for creating the corresponding tables:

CREATE TABLE jpa.multilingual_string
(
  string_id bigint NOT NULL,
  CONSTRAINT multilingual_string_pkey PRIMARY KEY (string_id)
);

CREATE TABLE jpa.multilingual_string_map
(
  string_id bigint,
  language character varying(255) NOT NULL,
  text character varying(255)
);

The Specification


The most important and most difficult annotation in this example is @MapKey. According to JSR-317, section 2.1.7 Map Keys:

If the map key type is a basic type, the MapKeyColumn annotation can be used to specify the column mapping for the map key. [...]
The MapKey annotation is used to specify the special case where the map key is itself the primary key or a persistent field or property of the entity that is the value of the map.

Unfortunately, in our case it is not quite clear whether we should use @MapKey or @MapKeyColumn to define the table column for our map key. Our map key is a basic type and our map value is not an entity, so this seems to imply we should use @MapKeyColumn.

On the other hand, our key is a persistent field of the map value, and I think the whole point of the @MapKey annotation is to indicate the fact that we simply reuse a property of the map value as the map key, so we do not need to provide an extra table column, as the given property is already mapped to a column.

The way I see it, replacing @MapKey by @MapKeyColumn(name = "language_key") - note the _key suffix! - is also legal, but then we get a different table model and different semantics: The table jpa.multilingual_string_map would have a fourth column language_key, and this language_key would not necessarily have to be equal to the language of the map value.
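The difference between the two readings can be made concrete with a small plain-Java sketch. It does not show how any provider actually loads a map; the class MapKeySemanticsDemo, the Row type and both load methods are hypothetical and only model the two semantics as I understand them: a key derived from the value's own language column versus a key read from a separate language_key column.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MapKeySemanticsDemo {

    // One row of the collection table, including the hypothetical
    // extra language_key column.
    static final class Row {
        final String languageKey;
        final String language;
        final String text;

        Row(String languageKey, String language, String text) {
            this.languageKey = languageKey;
            this.language = language;
            this.text = text;
        }
    }

    // Reading 1: the map key is derived from the language column of the value.
    static Map<String, String> loadWithDerivedKey(Row[] rows) {
        Map<String, String> map = new LinkedHashMap<String, String>();
        for (Row row : rows) {
            map.put(row.language, row.text);
        }
        return map;
    }

    // Reading 2: the map key comes from the independent language_key column.
    static Map<String, String> loadWithKeyColumn(Row[] rows) {
        Map<String, String> map = new LinkedHashMap<String, String>();
        for (Row row : rows) {
            map.put(row.languageKey, row.text);
        }
        return map;
    }

    public static void main(String[] args) {
        // A row that only the four-column table can represent:
        // key 'de', but the value claims to be English.
        Row[] rows = { new Row("de", "en", "Hello") };

        System.out.println(loadWithDerivedKey(rows)); // prints {en=Hello}
        System.out.println(loadWithKeyColumn(rows));  // prints {de=Hello}
    }
}
```

With a separate key column, nothing in the table model stops the key and the language of the value from drifting apart, which is exactly the redundancy I wanted to avoid.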

Another open question: Is it legal to write @MapKeyColumn(name = "language")? If so, this should indicate that the language column is to be used as the map key, so this would be equivalent to the @MapKey annotation. On the other hand, you might say that this annotation indicates that the application is free to use map keys that are independent of the map values, so this contract would be violated if the column name indicated by the annotation is already mapped.

The Persistence Providers


I've tried implementing this example with the current versions of Hibernate, Eclipselink, OpenJPA and DataNucleus. I did not succeed with any of them. Only OpenJPA provided a workable solution using @MapKeyColumn, but as I said, I'm not sure if this usage is really intended by the specification.

[Update 20 July 2010: With the corrected model, the updated verdict is: Only OpenJPA passes the test, the other three bail out for various reasons.]

Let's look at the contestants in turn:

Hibernate


Using the mapping defined above, Hibernate 3.5.3-Final complains:

org.hibernate.AnnotationException: Associated class not found: LocalizedString

Apparently Hibernate is expecting the map value to be an entity, not an embeddable.

Using @MapKeyColumn(name = "language"), the exception is

org.hibernate.MappingException: Repeated column in mapping for collection: MultilingualString.map column: language

Finally, with @MapKeyColumn(name = "language_key"), Hibernate no longer complains about duplicate columns, but I end up with a redundant table column in my database which I was trying to avoid.

Another problem with Hibernate is different behaviour when working with XML mapping data instead of annotations (which is what I prefer for various reasons, but that's a topic for another post).

Using XML metadata for this example, Hibernate happily ignores the table names from the metadata and simply uses the default names. I filed a bug report in April 2010 (HHH-5136), with no reaction ever since.


Eclipselink


Using Eclipselink 2.1.0, I simply get a rather cryptic exception

java.lang.NullPointerException
 at org.eclipse.persistence.internal.queries.MapContainerPolicy.compareKeys(MapContainerPolicy.java:234)

With @MapKeyColumn(name = "language"), Eclipselink also complains about a duplicate column. After changing the name to language_key, my test finally passes, at the expense of a redundant column, as with Hibernate.

OpenJPA


With OpenJPA 2.0.0, the message is

org.apache.openjpa.persistence.ArgumentException: Map field "MultilingualString.map" is attempting to use a map table, 
but its key is mapped by another field.  Use an inverse key or join table mapping.

which I can't make sense of. Switching to @MapKeyColumn(name = "language"), the new message is

org.apache.openjpa.persistence.ArgumentException: 
"LocalizedString.text" declares a column that is not compatible with the expected type "varchar".  

It seems OpenJPA is confused by the column name text, which sounds like a column data type. After adding @Column(name = "_text") to LocalizedString.text, my test case works and my database table only has three columns.

DataNucleus


DataNucleus 2.1.1 complains

javax.persistence.PersistenceException: Persistent class "LocalizedString" has no table in the database, 
but the operation requires it. Please check the specification of the MetaData for this class.

I'm getting the same message with all three variants of the annotation, so it appears that DataNucleus simply cannot handle embeddable map values and expects them to be entities.

Conclusion


Mapping maps with JPA is much harder than you would think, both for the user and for the implementor. Hibernate, Eclipselink and OpenJPA have all passed the JPA TCK. DataNucleus would have liked to do so, but they have not yet been granted access to the TCK.

All four implementors failed this simple map example to various degrees, which implies that there are features in the JPA 2.0 specification which are not sufficiently covered by the TCK.

An Open Source TCK for JPA would help in detecting and eliminating such gaps instead of leaving that to the initiative of individuals.

16 July 2010

JPA 2.0 Frustration

JPA 2.0 is one of the areas where Java EE 6 can make your life a lot easier compared to Java EE 5 - at least if your life is somehow connected to Java software development.

So much for the marketing blurb. In practice, working with JPA 2.0 means
  • trying to understand a specification (JSR 317) which may be quite challenging to read for the average developer and occasionally somewhat too vague even for experts
  • making sense of obscure stack traces from the JPA provider of your choice
  • discovering numerous bugs and omissions in implementations claiming to be JPA 2.0 compliant.
If you think this sounds fun, then read on...

Actually, this post is just an introductory note to a series of articles on specific use cases that seem to be particularly hard to get right.

I'm not unhappy about the spec as such - Object Relational Mapping (ORM) is a challenging topic and not exactly the stuff you expect first year computer science students to understand. JPA 2.0 narrows the gap between the JPA standard and vendor-specific extensions or native ORM features.

Still, there are some areas not covered by the standard where you have to fall back to vendor extensions or even write your own code: for instance, I would really like to see more flexible support for enum types, a standard for user-defined column types and an addition for spatial objects and queries, based on OGC standards.

The main source of frustration with JPA 2.0 is simply the lack of specification compliance of the available implementations. Implementing any but the most trivial persistence mappings and queries at application level can require hours of trial and error to get the expected results. Yes, most of the time the problem may be in your application code. But with JPA 2.0, chances are really high that your persistence provider has a bug.

The situation is not helped by the policy of Sun/Oracle/the JCP (I'm not really sure who's in charge of that) to keep the JPA TCK (Technology Compatibility Kit) under a non-disclosure agreement - see this blog post from DataNucleus, which was linked in a related thread in the Glassfish forums recently. (DataNucleus is a JPA implementor not belonging to the inner circle of the JCP.)

So far, I've looked at
  • Hibernate
  • Eclipselink
  • OpenJPA
  • DataNucleus,
and sadly, I've had problems with all of them. I'm going to provide specific examples in the following posts.