17 December 2009

Running parameterized JUnit tests in parallel

We use JUnit 4 for Anaconda, not just for unit tests, but also for integration or system tests. Typically, we iterate over all features of a given class in a database and validate each feature.

Note: I'm using the term feature in the sense of map feature, not in the sense of application feature or implemented requirement.

A simple pattern for such tests is:

public class FeatureTest
{ 
    @Test
    public void testAllFeatures()
    {
        for (Feature feature : findAllFeatures())
        {
            testOneFeature(feature);
        }
    }

    private void testOneFeature(Feature feature)
    {
        // some logic with one or more JUnit assertions
    }
}

Obviously, this naive approach has the following drawbacks:
  • The test fails and terminates on the first incorrect feature. The remaining features will not be tested.
  • All features get tested sequentially. This may take awfully long for a large database.

JUnit 4 has a specialized runner, Parameterized, which runs all tests of a given class once for each parameter set in a given list. An instance of the test class is created for each parameter set, and the parameters are passed to the constructor via reflection:

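import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;
import org.junit.runners.Parameterized.Parameters;

@RunWith(Parameterized.class)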
public class FeatureTest
{
    // This is the parameter for each instance of the test.
    private final Feature feature;

    public FeatureTest(Feature feature)
    {
        this.feature = feature;
    }

    // Returns one parameter set (a one-element array) per feature.
    @Parameters
    public static Collection<Object[]> getParameters()
    {
        List<Feature> features = findAllFeatures();
        List<Object[]> parameters = new ArrayList<Object[]>(features.size());
        for (Feature feature : features)
        {
            parameters.add(new Object[] { feature });
        }
        return parameters;
    }
 
    @Test
    public void testOneFeature()     
    {
        // assertions acting on the feature member
    }
} 
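
To make the reflection step a bit more concrete, here is a minimal sketch in plain Java, without JUnit. The names ReflectiveInstantiationSketch, NameCheck and instantiateAll are made up for this illustration, and simply taking the first public constructor is a simplification on my part; this is not the actual code of the Parameterized runner.

```java
import java.lang.reflect.Constructor;
import java.util.ArrayList;
import java.util.List;

class ReflectiveInstantiationSketch
{
    // Hypothetical stand-in for a parameterized test class.
    static class NameCheck
    {
        private final String name;

        public NameCheck(String name)
        {
            this.name = name;
        }

        public boolean passes()
        {
            return !name.isEmpty();
        }
    }

    // Creates one instance per parameter set, passing each Object[] to the
    // first public constructor of the given class (assumed to be the only one).
    static List<Object> instantiateAll(Class<?> testClass, List<Object[]> parameterSets)
    {
        Constructor<?> constructor = testClass.getConstructors()[0];
        List<Object> instances = new ArrayList<Object>();
        for (Object[] parameters : parameterSets)
        {
            try
            {
                instances.add(constructor.newInstance(parameters));
            }
            catch (ReflectiveOperationException exc)
            {
                throw new IllegalStateException(exc);
            }
        }
        return instances;
    }

    public static void main(String[] args)
    {
        List<Object[]> parameterSets = new ArrayList<Object[]>();
        parameterSets.add(new Object[] { "road" });
        parameterSets.add(new Object[] { "river" });
        List<Object> instances = instantiateAll(NameCheck.class, parameterSets);
        System.out.println("created " + instances.size() + " instances");
    }
}
```

Each Object[] in the list plays the role of one parameter set returned by the @Parameters method, which is why that method has to return a collection of arrays rather than a collection of features.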

The Parameterized runner solves the first problem: each feature gets tested in its own test instance, so one incorrect feature no longer stops the others from being tested. The second problem remains: if there is a large number of features, or if each individual test is very expensive, we would like to run the test instances in parallel, using a thread pool, or maybe even a grid of multiple computers.

Browsing through the JUnit sources, I found a surprisingly easy way of parallelizing the tests with a thread pool, simply by using a custom runner:


@RunWith(Parallelized.class)
public class FeatureTest
{
   // same class body as above
}


All you need is a simple extension of the Parameterized runner:

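import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import org.junit.runners.Parameterized;
import org.junit.runners.model.RunnerScheduler;

public class Parallelized extends Parameterized
{
    // Runs each scheduled child in a fixed-size thread pool.
    private static class ThreadPoolScheduler implements RunnerScheduler
    {
        private final ExecutorService executor;

        public ThreadPoolScheduler()
        {
            // The pool size can be set with -Djunit.parallel.threads=N, default is 16.
            String threads = System.getProperty("junit.parallel.threads", "16");
            int numThreads = Integer.parseInt(threads);
            executor = Executors.newFixedThreadPool(numThreads);
        }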
        
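        @Override
        public void finished()
        {
            // Accept no new tasks and wait up to 10 minutes for the submitted ones.
            executor.shutdown();
            try
            {
                executor.awaitTermination(10, TimeUnit.MINUTES);
            }
            catch (InterruptedException exc)
            {
                // Restore the interrupt flag before propagating.
                Thread.currentThread().interrupt();
                throw new RuntimeException(exc);
            }
        }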

        @Override
        public void schedule(Runnable childStatement)
        {
            executor.submit(childStatement);
        }
    }

    public Parallelized(Class<?> klass) throws Throwable
    {
        super(klass);
        setScheduler(new ThreadPoolScheduler());
    }
}
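
As far as I can tell from the JUnit sources, the parent runner calls schedule() once for each child and then finished() exactly once. The following sketch replays that lifecycle in plain Java, without JUnit. The Scheduler interface here is a hypothetical local mirror of the two methods used above, not the JUnit interface itself, and runChildren is a made-up stand-in for the runner.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SchedulerLifecycleSketch
{
    // Hypothetical local mirror of the two scheduler methods.
    interface Scheduler
    {
        void schedule(Runnable childStatement);

        void finished();
    }

    static class PoolScheduler implements Scheduler
    {
        private final ExecutorService executor;

        PoolScheduler(int numThreads)
        {
            executor = Executors.newFixedThreadPool(numThreads);
        }

        public void schedule(Runnable childStatement)
        {
            executor.submit(childStatement);
        }

        public void finished()
        {
            // Accept no new tasks and wait for the submitted ones.
            executor.shutdown();
            try
            {
                executor.awaitTermination(10, TimeUnit.MINUTES);
            }
            catch (InterruptedException exc)
            {
                Thread.currentThread().interrupt();
                throw new RuntimeException(exc);
            }
        }
    }

    // Plays the role of the parent runner: schedule every child, then call
    // finished(). Returns the number of children that actually ran.
    static int runChildren(int numChildren, int numThreads)
    {
        final AtomicInteger completed = new AtomicInteger();
        Scheduler scheduler = new PoolScheduler(numThreads);
        for (int i = 0; i < numChildren; i++)
        {
            scheduler.schedule(new Runnable()
            {
                public void run()
                {
                    completed.incrementAndGet();
                }
            });
        }
        scheduler.finished();
        return completed.get();
    }

    public static void main(String[] args)
    {
        String threads = System.getProperty("junit.parallel.threads", "16");
        int completed = runChildren(100, Integer.parseInt(threads));
        System.out.println(completed + " children completed");
    }
}
```

Because schedule() returns immediately, the waiting in finished() is what keeps the run from ending while children are still executing. Any child still running after the timeout is simply no longer waited for, so the 10 minutes need to be generous enough for the slowest test run.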


The RunnerScheduler interface is fairly new in JUnit and marked as experimental. I discovered it in the current version, JUnit 4.8.1, and found it missing in JUnit 4.4.0, which we have been using so far. RunnerScheduler is also available in JUnit 4.7.0, but I did not check whether this is the earliest version.