Recent posts about “Clover”

Kate Ellingburg

We here in Internal Systems have started work on a fairly typical-for-us project: we're rewriting the application that sends data from our ordering system (HAMS) to our financial system (Netsuite).

FISC, the new application, will provide a REST-based interface which HAMS will call upon invoice payment. It's not a technically difficult or even a large application, but because it deals with financial information it's very, very important to get right.

Developing our New Application

We didn't quite embrace test-driven development as we wrote FISC, but we did keep asking ourselves, "how can we unit test this?". This is a bit of a departure from our typical development process, where we ask ourselves, "how are we going to implement this?". We're using dependency injection and interfaces to make mocking out services easy. We've made sure that each bit of code does one thing well rather than trying to do everything at once. In one case we split one class into four, just to make it easier to test. It may sound like overkill, but it's meant that we've ended up with very modular code.
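To make the dependency injection point concrete, here is a minimal sketch of the pattern. All of the names (LedgerClient, InvoicePaymentHandler, RecordingLedgerClient) are hypothetical and are not taken from FISC; the sketch only shows how a constructor-injected interface lets a test swap in a hand-rolled fake.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface standing in for a call to the financial system.
interface LedgerClient {
    void postPayment(String invoiceId, long amountCents);
}

// The class under test receives its collaborator through the constructor,
// so a unit test can pass in a fake instead of a live client.
final class InvoicePaymentHandler {
    private final LedgerClient ledger;

    InvoicePaymentHandler(LedgerClient ledger) {
        this.ledger = ledger;
    }

    void onInvoicePaid(String invoiceId, long amountCents) {
        if (amountCents <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        ledger.postPayment(invoiceId, amountCents);
    }
}

// Hand-rolled fake that records every call for later assertions.
final class RecordingLedgerClient implements LedgerClient {
    final List<String> calls = new ArrayList<>();

    @Override
    public void postPayment(String invoiceId, long amountCents) {
        calls.add(invoiceId + ":" + amountCents);
    }
}
```

A test can then construct the handler with a RecordingLedgerClient and assert on the recorded calls, with no network or database involved.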

Embracing Code Coverage

Clover has become an essential tool for us in writing this application. We've been using the Clover IDEA Plugin to track two main metrics: our total code coverage and our branch coverage.

Exclusions for Fun and Profit

We wanted to make sure that we were focusing on covering the right code - rather than focusing on covering all code. I always exclude private methods and property methods, because I know we're not interested in covering them:

excludingMethods.jpg

I also used Clover's exclusion feature to exclude all the automatically generated Netsuite code. This menu option is new in Clover 2.6 and has saved me umpteen units of effort trying to remember how to format Ant style patternsets:

excludingFiles.jpg

Prioritising Effort

We've been able to achieve 100% code coverage for the vast majority of the application. We're not aiming for 100% for completeness' sake, but to help us sleep at night! There's a lot of business logic, with a lot of different branches, and a lot of pain if we get this wrong. The Clover Cloud Report helped us identify which classes we needed to prioritise testing for. It managed to find the class with the scariest business logic fairly easily:

cloverCloud.jpg

By far the most helpful function I've found with Clover is the new "Hide elements with full coverage" option in the Coverage pane:

clover.jpg

We've been using this view as a quick way to identify which methods require unit testing. It's helped us pick up branches that we thought we'd covered but hadn't, and helps us ignore the classes we're already covering. It's also been pretty satisfying to see the list of methods in the pane get smaller and smaller.

Sleeping Well at Night

Unit testing is the key to the successful deployment of FISC - we won't begin integration testing until we are happy with our unit test coverage. And Clover has become integral to our unit testing effort, because it's the tool that's let us know what we've tested. Sure, it's still up to us to make sure the right values go in and the right values come out, but that's a lot less stressful than lying awake wondering whether you've covered the case where an Australian partner is paying in USD with an AMEX for a new two-year Confluence license with three additional nodes.

Tips for young players: Some of the Clover features we used are new in Clover 2.6, which is scheduled for release in early September.

Nick Pellow


Don't you hate committing code and then waiting hours to find out you broke the build? Even worse is when other people commit code at a similar time to you, and you get dragged into the 'who broke the build' witch-hunt by pure circumstance.

If your build times are blowing out because of long test runs (greater than ten minutes), then you are most likely suffering from CI (Continuous Integration) latency and the above problems are real problems for you and your team.

Clover can help alleviate these problems by optimizing both unit and acceptance tests, drastically reducing the feedback time for each commit. Below is a case study of how Clover's Test Optimization is run on the Confluence project.

Serious CI

With about 55 different CI plans set up in Bamboo, the Confluence team are very serious about CI. So serious in fact, that if each build were to run end-to-end, a single commit would take over two days to be tested by each of those plans. Fortunately, Bamboo provides a pretty impressive CI-Cloud comprising 20 different agents that run build plans in parallel. This makes it possible to get feedback in a couple of hours, as opposed to days.

Often, those few hours can mean the difference between one changeset being included in the build, or many. The main build can run for up to 40 minutes before a failure is detected. In that time, several other commits may have been made, making it more difficult to track down the root cause of the failure. 40 minutes is also long enough for a developer to be tempted by the ultimate SCM sin: commit-and-run.

For the past month, the Clover team have run a shadow plan of the Confluence trunk build which only runs the acceptance tests that cover code modified since the previous build. This is made possible by Clover's per-test coverage data, which reports which tests hit which lines of code during a test run.

Results

The optimized build (charted on the left) is configured to do a complete test run every 10 builds to refresh the per-test coverage data.

optimized-build-results-1.png main-build-results.png

The optimized build provides faster feedback on average compared with the main build, and because it completes on average a lot faster than the main build, more builds get run in the same amount of time.

Faster Feedback

A specific case of where the Clover Optimized build failed before the full Confluence build can be seen in CCD-CONFDF-164, where it took 7 minutes to detect the Acceptance test failure, as opposed to 38 minutes in CONFFUNC-MAIN-5247. This was one case where the changeset that triggered the Optimized build was identical to the changeset that triggered the main build. Quite often, the Optimized build was started on another agent while the main build was still churning through each and every JWebUnit acceptance test.

test-optimized.png
Clover Optimized build failing in 7 minutes

full-main-build.png
The full build took 38 minutes to fail

On average the Clover Optimized Build takes just 7 minutes. This currently runs all unit and integration tests, and optimizes the long running Acceptance Tests. The main Confluence build takes 40 minutes on average to complete.

The Clover optimized build is a 'gateway' build for the Confluence CI-pipeline. It is the canary down the CI-mineshaft, if you will. If the optimized build smells danger, then it fails; preventing other builds from being triggered and hogging valuable CI cycles.

Greater CI Throughput == Clearer CI Results

The faster a build can run, the greater its throughput will be. What does this mean for a build that typically takes 40 minutes to run? This means that you get a much clearer picture of exactly which changeset has caused a build failure.

These next two screenshots show the Bamboo build history page for the full build plan and the Optimized build plan:

non-optimized-2.png
Full build results for the past two hours

You can see that build 5285 failed fairly spectacularly. However, who do we blame for this failure? Three developers made changes that triggered the build, which means a possible disturbance for all three devs as each tries to clear their name.

The Clover Optimized build paints a clearer picture of the situation:

optimized-high-fidel-1.png
Optimized build results for the past two hours

Over the past seven days there were 74 Full Confluence builds triggered. For the same time period there were 116 optimized builds, which represents an increase in build throughput of approximately 57% (42 extra builds on a base of 74).

Where's the catch?

Of course, Test Optimization of acceptance tests is not a silver bullet.

A full build that runs all tests regularly should still play an important role in the CI stack. This is because there are still cases where an optimized build may pass, but a full build will fail. Since Clover tracks per-test coverage for Java source files only, it will not detect which tests to run when a non-Java source file is modified. This means that if the only modification in a changeset is to a non-Java file, such as web.xml, a velocity macro, pom.xml, build.xml, a .jsp (and so on), then possibly no tests will be run, causing the build to pass when it should have failed!

The aim of an optimized build plan is for it to fail faster than a full build, and also to run more often than a full build does. An optimized build should be about quantity, whereas the main build is there for quality.

What is per-test Coverage?

Per-test coverage is a mapping of each line of code that was covered by one or more tests, back to the tests that covered the line. As an example, here is a screenshot of the Clover report for a test run of Confluence's Acceptance Tests showing all the tests that covered the "AllQuery" class:

per-test-class-coverage.png
Per-test Coverage for the AllQuery class

This shows us that "AllQuery" was covered by 16 test methods and 6 TestCases. If the class containing this code is modified in any way Clover will ensure only those 6 TestCases get run. This means the Confluence Acceptance tests can complete in just a few minutes as opposed to 40.

Per-test coverage data is also excellent for answering the question: "Is there already a test for this Class and if so, which one?".
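As a rough illustration of how that question can be answered, the sketch below inverts a "test to covered classes" map into a "class to covering tests" index. It assumes the per-test data is already available as plain Java collections; this is not Clover's API or storage format, and the class name PerTestCoverageIndex is made up for the example.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class PerTestCoverageIndex {
    // Inverts "test -> classes it covered" into "class -> tests that covered it".
    static Map<String, Set<String>> testsByClass(Map<String, Set<String>> classesByTest) {
        Map<String, Set<String>> index = new TreeMap<>();
        for (Map.Entry<String, Set<String>> entry : classesByTest.entrySet()) {
            for (String coveredClass : entry.getValue()) {
                index.computeIfAbsent(coveredClass, k -> new TreeSet<>())
                     .add(entry.getKey());
            }
        }
        return index;
    }
}
```

Looking up a class name in the returned map lists every test that touched it; a missing key means no test covered that class at all.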

The Benefits of Test Optimization

The Confluence team benefit from having a Clover optimized build in the following ways:


  • On average, they are alerted earlier to build breakages.

  • When a build does break, fewer committers are involved with the breakage, making it easier to discern who broke the build.

  • Fewer CI resources on our build server are consumed by Confluence's long running MAIN build, thereby reducing latency of other builds.

Nick Pellow

Via Twitter on the weekend I came across this pastie (see lines 259-274), supposedly containing the code for the Real-Time-Clock running on the 30GB Zune.

As you can see in the code, an infinite loop occurs as soon as the local variable 'days' is exactly 366 and 'year' is a leap year. This was the case on January 1, 2009 GMT and is therefore aptly known as a Y2K9 bug. The glitch resulted in headlines such as:


This could have been prevented had the code coverage report been inspected ;)

I converted the buggy function to Java:

final int ORIGINYEAR = 1980;

/**
Function: ConvertDays
Local helper function that split total days since Jan 1,
ORIGINYEAR into year, month and day
Parameters:
Returns:
Returns TRUE if successful, otherwise returns FALSE.
*/
boolean ConvertDays(int days, Date lpTime) {
 int dayofweek, month, year;
 int month_tab;
 
 year = ORIGINYEAR;
 
 while (days > 365) {
  if (IsLeapYear(year)) {
   if (days > 366) {
    days -= 366;
    year += 1;
   }
  } else {
   days -= 365;
   year += 1;
  }
 }
 return true;
}

and wrote the following test case:

public void testConvertDays() {
 final RTC rtc = new RTC();
 final ReadableInstant originDate = new DateTime(rtc.ORIGINYEAR, 1, 1, 0, 0, 0, 0);
 // test on 2008, 12, 31.
 final ReadableInstant date20081231 = new DateTime(2008, 12, 31, 0, 0, 0, 0);
 assertTrue(rtc.ConvertDays(Days.daysBetween(originDate, date20081231).getDays(), new Date(date20081231.getMillis())));
 
}

Running the test with Clover enabled shows the following:

The if (days > 366) branch never evaluates to false. Adding the next test for this case causes the infinite loop that Zune users experienced on Y2K9:


final ReadableInstant date20090101 = new DateTime(2009, 1, 1, 0, 0, 0, 0);
rtc.ConvertDays(Days.daysBetween(originDate, date20090101).getDays(), new Date(date20090101.getMillis()));
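For comparison, here is one way the loop could be written so that it always terminates. This is a sketch rather than the fix that was actually shipped: it assumes 'days' is a zero-based offset from Jan 1 of the origin year (which is what a days-between calculation yields), and the names RtcSketch, daysInYear and convertDays are invented for the example. The key change is that the loop condition and the subtraction both use the same per-year length, so every iteration makes progress.

```java
final class RtcSketch {
    static final int ORIGIN_YEAR = 1980;

    static boolean isLeapYear(int year) {
        return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    }

    static int daysInYear(int year) {
        return isLeapYear(year) ? 366 : 365;
    }

    // Splits a zero-based day offset from Jan 1, ORIGIN_YEAR into
    // {year, zero-based day of that year}.
    static int[] convertDays(int daysSinceOrigin) {
        int year = ORIGIN_YEAR;
        int days = daysSinceOrigin;
        // Each iteration subtracts a whole year, so the loop cannot stall
        // on the last day of a leap year.
        while (days >= daysInYear(year)) {
            days -= daysInYear(year);
            year += 1;
        }
        return new int[] {year, days};
    }
}
```

With this version, both branches of the leap-year check are exercised by the two dates used in the tests above, and neither can spin forever.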


And in the interest of fair and balanced reporting, don't forget that the iPod was born in an infinite loop...

Brendan Humphreys

Stop testing so much!

Brendan Humphreys talks about Clover November 5, 2008 3:22 PM


Automated testing is a great way of maintaining quality on a software project by providing quick feedback to developers when things break. Problem is, often teams find themselves with long-running suites of tests that become a time killer in the iterative development process. If the tests take too long to run, developers are less likely to run the full suite locally before a commit. Instead they commit their changes untested and rely on the Continuous Integration (CI) server to test it. In many teams this can mean the CI server gets quickly overloaded with changes to test, and developers wait hours to find out they broke the build.

With the release of Clover 2.4 we've added a new test optimization feature that can dramatically reduce build times by selectively running only the tests relevant to a particular change. This makes it practical for developers to run the test suite locally prior to a commit. It also means CI server throughput is greatly improved, both of which mean faster feedback to development teams.

xkcd_compiling.png
On Java projects, it's more likely running tests than compiling (thanks xkcd)

When too much testing is ...probably too much

In many teams it can take far too long for the impact of a code change to be known by the submitting developer. The developer might wait many minutes or even hours before the Continuous Integration server gets to building and testing their change. If instead they've run the suite locally, their machine is tied up running tests, leaving the developer expensively idle.

Build breakages can often derail a whole development team, with all work grinding to a halt while the spotlight shines on the developer who introduced the problem as they attempt to fix it.

If a particular change is going to cause one or more tests to fail, the team needs to know about it as fast as possible, and preferably before it is committed.

Two approaches to smarter testing

So much of the testing effort is wasted because many tests are needlessly run; they do not test the code change that prompted the test run. So the first step to improving test times is to only run the tests applicable to the change. It turns out that in practice this is a huge win, with test-run times dramatically reduced.

The second approach, used in conjunction with the first or independently, is to prioritise those tests that are run, so as to flush out any test failures as quickly as possible. There are several ways to prioritise tests, based on failure history of each test, running time, and coverage results.

Clover's new test optimization

As a code coverage tool, Clover measures per-test code coverage - that is, it measures which tests hit what code. Armed with this information, Clover can determine exactly which tests are applicable to a given source file. Clover uses this information combined with information about which source files have been modified to build a subset of tests applicable to a set of changed source files. This set is then passed to the test runner, along with any tests that failed in the previous build, and any tests that were added since the last build.
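The selection step described above can be sketched in a few lines. This is an illustration of the idea only, not Clover's implementation: it assumes per-test coverage is available as a map from test name to the source files that test touched, and it treats any test with no coverage record as new. The class and parameter names are hypothetical.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class TestSelectionSketch {
    // Picks the tests to run for a change: tests that covered a modified file,
    // tests that failed last time, and tests with no coverage data yet (new tests).
    static Set<String> select(Map<String, Set<String>> filesByTest,
                              Set<String> modifiedFiles,
                              Set<String> previouslyFailed,
                              Set<String> allTests) {
        Set<String> selected = new TreeSet<>(previouslyFailed);
        for (String test : allTests) {
            Set<String> covered = filesByTest.get(test);
            if (covered == null) {
                selected.add(test); // never seen before, so always run it
            } else if (!Collections.disjoint(covered, modifiedFiles)) {
                selected.add(test); // touched at least one modified file
            }
        }
        return selected;
    }
}
```

Note that a change touching only files that no test covered yields an empty selection apart from failed and new tests, which is why periodic full runs remain necessary.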

The set of tests composed by Clover can also be ordered using a number of strategies:

  • Failfast - Clover runs the tests in order of likelihood to fail, so any failure will happen as fast as possible.
  • Random - Running tests in random order is a good way to flush out inter-test dependencies.
  • Normal - no reordering is performed. Tests are run in the order they were given to the test runner.

Note that Clover will always run tests that are either new to the build or failed on the last run.
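The three strategies can be pictured with a small sketch. How Clover actually ranks "likelihood to fail" is not described here, so the failfast ordering below is an assumption made purely for illustration: tests that failed on the last run go first, then shorter tests before longer ones. TestInfo, Ordering and TestOrderingSketch are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

final class TestInfo {
    final String name;
    final boolean failedLastRun;
    final long lastDurationMillis;

    TestInfo(String name, boolean failedLastRun, long lastDurationMillis) {
        this.name = name;
        this.failedLastRun = failedLastRun;
        this.lastDurationMillis = lastDurationMillis;
    }
}

enum Ordering { FAILFAST, RANDOM, NORMAL }

final class TestOrderingSketch {
    // Returns a reordered copy; the input list is left untouched.
    static List<TestInfo> order(List<TestInfo> tests, Ordering ordering, long seed) {
        List<TestInfo> result = new ArrayList<>(tests);
        switch (ordering) {
            case FAILFAST:
                // Assumed heuristic: recent failures first, then quickest first.
                result.sort(Comparator
                        .comparing((TestInfo t) -> !t.failedLastRun)
                        .thenComparingLong(t -> t.lastDurationMillis));
                break;
            case RANDOM:
                // A seed makes a surprising order reproducible afterwards.
                Collections.shuffle(result, new Random(seed));
                break;
            case NORMAL:
            default:
                break; // keep the order the test runner was given
        }
        return result;
    }
}
```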

Optimization safeguards

Clover's test optimization uses per-test code coverage to determine a minimal set of tests to run for a given code change. In some builds, changes with non-local effects or changes to non-source files (e.g. a Spring XML config file) mean that Clover's selected subset of tests won't adequately test the change. For this reason we recommend still running the full test suite periodically. Clover has a number of strategies to help with this:
  1. Clover can watch for modifications of specific files or filesets, and trigger a full test run if any of them change.
  2. Clover can trigger a full test run every Nth build.
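The two safeguards amount to a simple decision made before each build. The sketch below is a simplification under stated assumptions: watched files are matched by path suffix rather than by Ant filesets, and FullRunPolicy is a hypothetical class, not part of Clover.

```java
import java.util.Collection;
import java.util.Set;

final class FullRunPolicy {
    private final int fullRunInterval;
    private final Set<String> watchedSuffixes;

    FullRunPolicy(int fullRunInterval, Set<String> watchedSuffixes) {
        if (fullRunInterval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        this.fullRunInterval = fullRunInterval;
        this.watchedSuffixes = watchedSuffixes;
    }

    // True when the optimized subset should be skipped in favour of a full run.
    boolean requiresFullRun(int buildNumber, Collection<String> changedPaths) {
        if (buildNumber % fullRunInterval == 0) {
            return true; // safeguard 2: every Nth build
        }
        for (String path : changedPaths) {
            for (String suffix : watchedSuffixes) {
                if (path.endsWith(suffix)) {
                    return true; // safeguard 1: a watched file changed
                }
            }
        }
        return false;
    }
}
```

For example, a policy built with an interval of 10 and the suffixes "pom.xml" and ".jsp" would force a full run on builds 10, 20, 30 and so on, and on any build whose changeset touches a POM or a JSP.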

Practical integration for Ant and Maven

We've worked hard on this feature to make it easy to integrate into your existing Ant or Maven2 build. You don't need to use a specialized Java environment or standalone test runner.

Clover's Ant integration for test optimization is designed to work with the existing <junit> Ant task. The new <clover-optimized-testset> container wraps existing <fileset>s to control which tests are actually run:

<junit ...>
<batchtest fork="true" todir="${test.results.dir}/results">
    <clover-optimized-testset snapshotfile="${clover.snapshot.file}">
        <fileset dir="src/tests" includes="${test.includes}" excludes="${test.excludes}"/>
    </clover-optimized-testset>
</batchtest>
</junit>

In Maven2, the optimization feature is enabled via a profile added to the project POM:

<profiles>
        ...
        <profile>
            <id>clover</id>
            <build>
                <plugins>
                    <plugin>
                        <groupId>com.atlassian.maven.plugins</groupId>
                        <artifactId>maven-clover2-plugin</artifactId>
                        <version>2.4.1</version>
                        <executions>
                            <execution>
                                <goals>
                                    <goal>setup</goal>
                                    <goal>optimize</goal>
                                    <goal>snapshot</goal>
                                </goals>
                            </execution>
                        </executions>
                    </plugin>
                </plugins>
            </build>
        </profile>
</profiles>

Some real world results

The FishEye team maintain an automated test suite that takes between 20 and 30 minutes to run, owing to some expensive setup and teardown. We enabled Clover's test optimization feature on the project and measured performance compared to the normal, non-optimized build. Over the 10 day trial period the FishEye team committed 142 changesets as part of their ongoing development effort. For each changeset, two builds were triggered - a "normal" build, where all tests were executed, and a test-optimised build, where only relevant tests were executed. The following chart shows cumulative times for both the normal and test-optimised builds:

c-opt-build-times.png
By only running tests that were applicable to each particular change, test execution time was reduced by a factor of four - a dramatic reduction. The number of tests run is shown in the following chart:

c-opt-tests-run.png

For this trial we configured Clover to run the full test suite every 10 builds, which explains the regular spikes in the number of tests run under the optimized scenario. This safeguard measure ensures that non-local or non-source changes that don't feed into Clover's optimal test subset calculation are still tested.

Some unexpected results

The next chart compares the number of test failures between optimized and normal builds:

c-opt-test-errors.png

Correlation of test failures was very good, with the optimized build detecting all but one of the test failures detected by the normal build, in a fraction of the time the normal build took. The missed failure was caused by an XML config file change. The change was corrected in a subsequent checkin before the optimisation safeguard test run kicked in, which would have detected the failure.

Curiously, in several builds some tests failed when run as part of the optimized build, but not when run in the normal test execution. After some investigation, we found several implicit inter-test dependencies - the execution of one test was required to make a subsequent test pass. This code smell is something the FishEye team are now working to remove :-)

Fail faster: Optimize your tests

Clover 2.4's new test optimization can dramatically reduce your build times, taking the load off your CI server and making it practical for you to run your automated test suite locally, prior to a commit.

You can download a free 30 day trial of Clover 2.4 and try test optimization on your project today. The Quick Start guides for Ant or Maven 2 will get you up and running fast.

Want to learn more?

Join me for a webinar this week to see Clover in action. Just click on the time below to register:

Tues, Nov 11, 2008 8:00 AM - 9:00 AM PST/16:00 GMT
Tues, Nov 11, 2008 5:00 PM - 6:00 PM PST

Check out all of the details about what's new in Clover 2.4 including demo videos.

Nick Pellow

Code Coverage as a "Static Debugger"

Nick Pellow talks about Clover July 17, 2008 7:23 PM

I recently added the following test to ensure any runtime exceptions thrown during multi-threaded report generation were being logged correctly.

1    public void testExceptionHandling() {
2        CloverExecutor executor = CloverExecutors.newCloverExecutor(10, "CLOVER-EXCEPTION-TEST");
3        Logger logger = Logger.getInstance();
4        try {
5            RecordingLogger bufferLogger = new RecordingLogger();
6            Logger.setInstance(bufferLogger);
7
8            executor.submit(new ExceptionCallable());
9
10           assertTrue(bufferLogger.contains(runtimeException));
11           assertTrue(bufferLogger.contains(runtimeException.getMessage()));
12       } catch (Exception e) {
13           fail("Exception thrown, which should have been caught and logged." + e);
14       } finally {
15           Logger.setInstance(logger);
16       }
17   }


RecordingLogger is a test utility class that stores any log messages in a list. The #contains methods return true if the given argument was logged during execution.

This test initially failed and not because I was doing TDD ;) The assertion on line 10 was throwing an AssertionFailedError.

Can you spot the bug?


Bugger.

Instead of diving into the debugger, I had a look at the code coverage of the RecordingLogger class:

find_Matcher.jpg

#find is a method that gets called by #contains to do the actual lookup in the list.
Viewing the coverage made it clear that the buffer was empty for both calls to #find.
This led me to the reason why the test was failing - the executor did not have enough time to fire up its thread, pop my Callable on the queue and execute it.

The one line fix was easy:


executor.awaitTermination(1000, TimeUnit.MILLISECONDS);

Viewing code coverage often provides enough insight into the cause of a failing test and is a cheap alternative to adding printlns or firing up the debugger. During testing, I think of it as a "static debugger". In fact, when stepping through the above code in a debugger, the test passes.
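The same race can be reproduced, and avoided, with the standard java.util.concurrent classes. The sketch below is an analogy only: it uses a plain ExecutorService and a thread-safe list in place of CloverExecutor and RecordingLogger, whose internals are not shown in this post. Note that with the JDK's ExecutorService, awaitTermination only completes early once shutdown() has been requested.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class AwaitBeforeAssertSketch {
    // Runs a failing task on a worker thread and returns what it "logged".
    static List<String> runAndCollect() {
        List<String> logged = new CopyOnWriteArrayList<>(); // stand-in for a recording logger
        ExecutorService executor = Executors.newFixedThreadPool(2);
        executor.submit(() -> {
            try {
                throw new IllegalStateException("boom");
            } catch (RuntimeException e) {
                logged.add(e.getMessage());
            }
        });
        executor.shutdown(); // accept no new work, let queued work finish
        try {
            // Without this wait, the caller may inspect 'logged' before the worker has run.
            if (!executor.awaitTermination(1000, TimeUnit.MILLISECONDS)) {
                throw new IllegalStateException("worker did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return logged;
    }
}
```

Asserting on the returned list is safe because the method only returns after the worker has terminated.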

Nick Pellow

My tests touched what?!

Nick Pellow talks about Clover October 29, 2007 10:16 PM

Picture%2034.png
Mapping a test to the classes, methods, even lines that it executed ... that has to be my favorite new feature in Clover2. This is called per-test coverage in Clover2, and helps answer the following questions:

"Which tests cover class X?"


Many projects dutifully begin following a test-naming convention such as XTestCase.testYYY. This convention falls apart quicker than it takes you to google "Behavior Driven Development", or as soon as you have to write a regression test for a bug you just found.

Every line of source code has the following pop-up in a Clover2 report:

Per-test pop-up screenshot

The popup displays a table of the tests which entered the get method. The table contains the following sortable columns:


  • Test-Contribution: how much of this class's coverage is attributed to that test - showing which tests are the biggest contributors to the coverage.
  • Test Name: the name of the test and links to the test source code and to the test result summary page.
  • Result: whether the test passed or failed.

"How well tested is the code I'm about to change?"

Before jumping out of a plane, it's good to know how well your parachute is packed.

Changing code always involves taking some risks. It's reassuring to be able to estimate what that risk is and how best to reduce it. Being able to view which unit tests exercise the code to be changed helps in doing this. Most importantly, you can see immediately whether or not a test case already exists that specifically tests that code. If not, it's handy to see which module of your test suite is the best one to add the new test case to.


"How did this bug escape our unit tests?"

When a bug is found in Clover, my first reaction is to ask the question: "Which unit test didn't discover that?". Being able to go from the location of the bug in the source view, to each of the tests which covered the buggy line, method or class is something I've not been able to do without Clover2.

On large or unfamiliar projects, finding the best spot to add a unit test is sometimes a daunting task. It's often tempting to simply add a brand new test. The reasons against this are:

  • You will not know if one of your existing tests is broken i.e. it passes when it should in fact fail
  • The new test may 'overlap' i.e. performs the same test/check as another test. This adds to the burden of maintaining tests.
Writing the test to reproduce a bug becomes much less of a task once you are already looking at a Test Class which tests code near to that containing the bug.


Go both ways.

Navigate from your covered source code to your tests, and from your tests to the code they cover.

More than that, find which classes are only covered by a given test. This is called unique coverage.

Test Result Summary Screenshot
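Unique coverage can be derived from per-test data with a simple count. The sketch below assumes the data is a map from test name to the classes that test covered; it is an illustration of the concept, not how Clover2 computes its report, and UniqueCoverageSketch is a made-up name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class UniqueCoverageSketch {
    // For each test, returns the classes that no other test covered.
    static Map<String, Set<String>> uniqueCoverage(Map<String, Set<String>> classesByTest) {
        // First pass: count how many tests cover each class.
        Map<String, Integer> coveringTests = new HashMap<>();
        for (Set<String> covered : classesByTest.values()) {
            for (String coveredClass : covered) {
                coveringTests.merge(coveredClass, 1, Integer::sum);
            }
        }
        // Second pass: keep only the classes with exactly one covering test.
        Map<String, Set<String>> unique = new TreeMap<>();
        for (Map.Entry<String, Set<String>> entry : classesByTest.entrySet()) {
            Set<String> only = new TreeSet<>();
            for (String coveredClass : entry.getValue()) {
                if (coveringTests.get(coveredClass) == 1) {
                    only.add(coveredClass);
                }
            }
            unique.put(entry.getKey(), only);
        }
        return unique;
    }
}
```

A test with a large unique set is one you would think twice about deleting, while a test with an empty unique set overlaps entirely with others.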

Test Insight

Clover2 reports provide insight into your project's tests that was not previously possible. This helps boost code quality by:

  • encouraging new tests to be written;
  • reducing the possibility of test overlap, and
  • giving you confidence to refactor knowing that your tests are solid.