28 February 2010

Misconceptions about Java Internationalization

Let me start with a joke:
What do you call someone who speaks three languages?
Trilingual.
What do you call someone who speaks two languages?
Bilingual.
What do you call someone who speaks one language?
American.
To be fair on Americans, even most of us multilingual Europeans tend to be biased when it comes to internationalization, tacitly assuming that text is written left-to-right and can be sorted from A to Z.

Most Java developers are familiar with resource bundles backed by properties files. The basics can be found in the Internationalization Trail of the Java Tutorial. Multilingual Java applications often come with a set of properties files, e.g.
  • MyApp_de_AT.properties
  • MyApp_de.properties
  • MyApp_es.properties
  • MyApp.properties
where MyApp.properties contains the "default" message resources in English, MyApp_de.properties and MyApp_es.properties contain the German and Spanish resources, respectively, and MyApp_de_AT.properties contains some country-specific variants for the Austrian flavour of German. Usually, files for country-specific variants are sparsely populated, containing only those properties that actually differ from the mainstream language version, like Jänner (de_AT) vs. Januar (de) vs. January (en).

With this set of files, however, you may be surprised to end up with a German string even though you requested a resource for an English locale.

"Assume nothing" is a sound principle for robust software development, and you should not assume that English is the default or fallback language. In fact, when no resources exist for the requested locale, the fallback is the system default locale, which is derived from the host environment.

See the documentation for ResourceBundle.getBundle() and Locale.getDefault() for more details.

So when the default locale of your system is de_DE and you request a resource for locale en_US, the lookup order for the properties files is
  1. MyApp_en_US.properties
  2. MyApp_en.properties
  3. MyApp_de_DE.properties
  4. MyApp_de.properties
  5. MyApp.properties
Hence, ResourceBundle.getString() will return a German string from MyApp_de.properties, since the first three files do not exist and the English resources are preceded by the German ones in this sequence.
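The sequence above can be reproduced with the JDK's own ResourceBundle.Control helper. The following is a minimal sketch, not the actual getBundle() implementation: the class name LookupOrder and the method fileNames are made up for illustration, and the system default locale is passed in as a parameter instead of being read from Locale.getDefault(), so that the result does not depend on the host.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.ResourceBundle;

// Hypothetical helper, for illustration only.
class LookupOrder {

    /** Lists the properties files in the order in which they are tried in the worst case. */
    static List<String> fileNames(String baseName, Locale requested, Locale systemDefault) {
        ResourceBundle.Control control =
                ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_PROPERTIES);

        List<Locale> locales = new ArrayList<>();
        // Candidates derived from the requested locale, e.g. en_US, en.
        for (Locale candidate : control.getCandidateLocales(baseName, requested)) {
            if (!candidate.equals(Locale.ROOT)) {
                locales.add(candidate);
            }
        }
        // Candidates derived from the fallback (system default) locale, e.g. de_DE, de.
        if (!requested.equals(systemDefault)) {
            for (Locale candidate : control.getCandidateLocales(baseName, systemDefault)) {
                if (!candidate.equals(Locale.ROOT) && !locales.contains(candidate)) {
                    locales.add(candidate);
                }
            }
        }
        // The base bundle without any locale suffix comes last.
        locales.add(Locale.ROOT);

        List<String> names = new ArrayList<>();
        for (Locale locale : locales) {
            names.add(control.toResourceName(control.toBundleName(baseName, locale), "properties"));
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : fileNames("MyApp", Locale.US, Locale.GERMANY)) {
            System.out.println(name);
        }
    }
}
```

For the requested locale en_US and the default locale de_DE, this prints the five file names in the same order as the list above.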

There are two solutions:
  1. As a user, set your default locale to en when launching the application.
  2. As a developer, make sure to provide a properties file for locale en (which may be empty).
The method for changing the default locale depends on your Java VM and your operating system. Setting the system property user.language may work on some platforms, but not with the Sun JDK 1.6.0 under Linux. Instead, you need to set the environment variable LANG before launching the Java VM.

The preferred solution is the second one, of course. Even when MyApp_en.properties is empty, it will be picked up as entry point for resource lookup. If a given key cannot be found in this file, the parent file MyApp.properties will be used as fallback, which is just the desired behaviour.
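To see the effect of the empty file, here is a small self-contained experiment. It is only a sketch under a few assumptions: the key greeting and the class name EmptyEnglishBundleDemo are invented, the properties files are written to a temporary directory (which is not cleaned up), and the default locale is switched to de_DE for the duration of the lookup to simulate a German host.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.ResourceBundle;

// Hypothetical demo class, for illustration only.
class EmptyEnglishBundleDemo {

    /** Looks up "greeting" for en_US while the default locale is de_DE. */
    static String greeting(boolean withEmptyEnglishFile) {
        Locale saved = Locale.getDefault();
        try {
            Path dir = Files.createTempDirectory("bundles");
            write(dir, "MyApp.properties", "greeting=Hello\n");
            write(dir, "MyApp_de.properties", "greeting=Hallo\n");
            if (withEmptyEnglishFile) {
                write(dir, "MyApp_en.properties", "");
            }
            Locale.setDefault(Locale.GERMANY);
            // A fresh class loader per call keeps the ResourceBundle cache out of the picture.
            try (URLClassLoader loader = new URLClassLoader(new URL[] { dir.toUri().toURL() })) {
                return ResourceBundle.getBundle("MyApp", Locale.US, loader).getString("greeting");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            Locale.setDefault(saved);
        }
    }

    private static void write(Path dir, String name, String content) throws IOException {
        Files.write(dir.resolve(name), content.getBytes(StandardCharsets.ISO_8859_1));
    }

    public static void main(String[] args) {
        System.out.println(greeting(false)); // Hallo (German fallback wins)
        System.out.println(greeting(true));  // Hello (empty MyApp_en delegates to MyApp)
    }
}
```

Without MyApp_en.properties the lookup falls through to the German file; with the empty file in place, the English bundle is found first and the missing key is resolved from its parent MyApp.properties.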

24 February 2010

Editing Resource Bundles in Eclipse

Playing around with the Apache Roller blog engine, I noticed that some of the localized German text messages were missing or broken. Roller uses plain old Java resource bundles instead of the NLS mechanisms offered by Eclipse. Editing resource bundles for multiple languages in parallel is rather a pain with a plain text editor, so I was looking for an Eclipse plugin to do this job.

(Just to avoid any confusion, even though I've been writing a lot about OSGi bundles, the term bundle is only used in the sense of a resource bundle, or properties file, in this article.)

I found two solutions, both of which have minor bugs and lack some documentation but are very helpful nevertheless. And it turned out that the second solution uses code from the first one.

At first, I tried the Resource Bundle Editor. The latest version is from 2007, and some people have reported conflicts with newer Eclipse versions. I installed the plugin in my Eclipse 3.5.1 and did not notice any version conflicts at all.

However, the Resource Bundle Editor does not parse the properties files correctly. It does not recognize exclamation marks as comment signs. For comment lines of the form
!some.key = some value
the editor will display a bogus key !some.key.
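For comparison, java.util.Properties treats such a line as a comment, just like a line starting with #. A minimal sketch to check this (the class name BangCommentDemo and the sample keys are made up):

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical demo class, for illustration only.
class BangCommentDemo {

    /** Parses properties file content with the JDK's own parser. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = parse(
                "!some.key = some value\n"      // comment, introduced by '!'
                + "# another comment\n"         // comment, introduced by '#'
                + "real.key = real value\n");   // the only real entry
        System.out.println(props.stringPropertyNames()); // [real.key]
    }
}
```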

Looking at the sources, I found that the PropertiesParser class only recognizes a subset of the valid properties file syntax.

After that, I had a look at the Eclipse Babel editor. Unfortunately, the Babel project does not yet provide binary downloads, so you have to build the two plugins from source.

As it turned out, parts of the Babel sources are derived from the Resource Bundle Editor sources, and the same incomplete parser code is also used in the Eclipse project in class PropertiesDeserializer.
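A regular expression that accepts both comment markers could look like the one below. This is purely an illustration and not the code from Babel or the Resource Bundle Editor; the class name CommentLines is invented, and the sketch ignores the corner case of a line that continues a previous line ending in a backslash.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class CommentLines {

    // Optional leading blanks (space, tab, form feed), then '#' or '!', then anything.
    private static final Pattern COMMENT = Pattern.compile("[ \\t\\f]*[#!].*");

    /** Returns true if the given single line is a properties file comment. */
    static boolean isComment(String line) {
        return COMMENT.matcher(line).matches();
    }

    public static void main(String[] args) {
        System.out.println(isComment("!some.key = some value")); // true
        System.out.println(isComment("  # indented comment"));   // true
        System.out.println(isComment("some.key = wow!"));        // false
    }
}
```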

I changed a regular expression in the source to fix the "!"-problem. You can get the binary plugins including my patch from here:

After installing the plugins, go to Window | Preferences | Messages Editor and deselect the option Setup validation builder on Java projects automatically, or else you may get lots of error markers on other properties files which are not used as message bundles at all. I also set the Reports severities to Ignore and the Displayed Locales to de to narrow the Editor display to the language I'm actually working on.

To edit a resource bundle, select the properties file in the Package Explorer and open it with the Messages Editor via the context menu.

Here is a screenshot of the Messages Editor in action:

With the additional toolbar buttons, you can limit the view to missing or unused translations.

    23 February 2010

    Setting up Eclipse for Roller

    There is an Eclipse plug-in for almost any task, and most of them do their job rather nicely. On the other hand, even some of the more or less official ones may give you a hard time if you try to use them in combination.

    Recently, I've been playing around with Apache Roller, a Java blog engine, much like Blogger or WordPress. This project is currently in beta for the next major release 5.0, so I checked out the sources from trunk, ran the Maven build, created a PostgreSQL database and got my own blog engine up and running within minutes.

    Some minor things did not quite work as expected, so I thought I'd just create an Eclipse workspace for Roller and build and run it from there. As it turned out, the combination of Maven, Subversion, and a Web Application was a rather fatal mix, and it took me a day to figure out what was going wrong.

    Most of this was not a Roller issue at all: Eclipse Web application tooling (WTP) and Maven Integration (m2eclipse) just make too many implicit and conflicting assumptions which make it hard to set things up correctly, so this article is really about working with a mavenized web application in Eclipse, and Roller is just an example.

    There are a couple of threads on the Roller developer mailing list dealing with Eclipse setups, but none of them really provides a working solution, so maybe this post can fill the gap.

    Step 1: Get Eclipse and all required plug-ins


    To avoid any conflicts with other plug-ins or features not required for this project, I used a separate Eclipse installation consisting of
    • Eclipse for Java EE Developers 3.5.1
    • Subversive SVN Team Provider 0.7.8
    • Subversive SVN Connectors 2.2.1
    • Maven Integration for Eclipse 0.10.0
    • Maven Integration for WTP 0.10.0
    Even if you do not intend to commit any changes from Eclipse, you will need the Subversion integration to deploy your web application from a Subversion working copy, otherwise Eclipse will try to pack the hidden .svn directories into the JARs and WARs and complain about duplicate paths.

    Step 2: Set up m2eclipse

    m2eclipse has a built-in pre-release version of Maven 3.0.0 which is not compatible with most existing projects based on Maven 2.x. Get a local installation of Maven 2.1.0 and define it as default for m2eclipse in Window | Preferences | Maven | Installations.

    Step 3: Set up Tomcat

    Download and install Tomcat 6.0.24 to a local directory. Create a Tomcat server instance for Eclipse via Window | Preferences | Server | Runtime Environments pointing to your Tomcat installation directory.

    Install the additional prerequisites of Roller in the Tomcat lib directory:
    • mail.jar
    • activation.jar
    • your JDBC driver

    Step 4: Get Roller into your Eclipse workspace

    Create a new empty workspace. There is supposed to be an integration of m2eclipse and Subversive, but I never managed to get it to work, which is why I use the following somewhat clumsy procedure to populate my workspace:
    • Switch to the SVN Repository Exploring perspective and define a new repository location for https://svn.apache.org/repos/asf.
    • Check out Roller from roller/trunk. This will create a new project roller-project in your workspace.
    • Unfortunately, the Maven modules of this project do not yet appear as separate Eclipse projects. To change this, delete the project from your workspace and use File | Import | Maven | Existing Maven Projects. Select the workspace folder from your initial checkout.
    • After this, you should have six Maven projects in your workspace, all shared via Subversive.

    Step 5: Apply some fixes in the workspace

    • Go to roller-weblogger-business and delete src/test/resources/org/apache/roller/weblogger/business/package.html, since this file would cause a clash with another copy from src/main/resources.
    • Open /roller-weblogger-web/src/main/webapp/WEB-INF/security.xml and replace spring-security-2.0.1-openidfix.xsd by spring-security-2.0.4.xsd.
    • Copy your roller-custom.properties to /roller-weblogger-web/src/main/resources.

    Step 6: Configure your web application

    • Open the project properties of roller-weblogger-web.
    • Select the Java EE Module Dependencies and activate roller-planet-business, roller-core and roller-weblogger-business.
    • Make sure that the resources from all dependent projects will get copied into the web application by modifying the build path settings of roller-planet-business, roller-weblogger-business and roller-weblogger-web. Select Java Build Path from the project properties and remove the Excluded: ** entry from src/main/resources for each of these projects.

    Step 7: Run a Maven build

    • Select roller-project/pom.xml. From the context menu, select Run As | Maven build...
    • In the launcher dialog, fill in the goals clean install and (optionally) check Skip Tests to save some time during each build.
    • When the build has completed, select all projects and press F5 so that Eclipse will see all the resources created by Maven.
    • This step is required, since the Maven build generates some additional resources and runs the OpenJPA Enhancer. These two steps would not be handled by the Eclipse automatic build.

    Step 8: Make sure that Eclipse picks up the generated resources

    • Create a folder /roller-weblogger-web/src/main/sql and turn it into a source folder.
    • Copy /roller-weblogger-business/target/dbscripts into this folder.

    Step 9: Get Rolling!

    • Select roller-weblogger-web and invoke Run As | Run on Server from the context menu.
    • Select the Tomcat instance created in Step 3, optionally set it as the default, and click Finish.

    Troubleshooting

    • If you get stuck, clean the Tomcat instance. Open the Servers view and select Tomcat. From the context menu, invoke Clean...
    • To check the web application assembled by Eclipse, have a look into <Eclipse workspace>/.metadata/.plugins/org.eclipse.wst.server.core/tmp0/wtpwebapps/roller-weblogger-web/
    All of this should have been a lot easier. m2eclipse will have to mature, and Eclipse should be more flexible about resource locations...