libavg contains extensive unit tests that cover a large part of its functionality. They are run on every invocation of make check and as part of the continuous build. The tests are very fast - they complete in about 10 seconds on a typical laptop. Considerable effort goes into maintaining the test suite, but in return we gain several major benefits:

  • Stability: Even subtle bugs are often detected by the tests before they land in production code.
  • Development speed: The development cycle (code change -> compile -> test) is extremely fast. Bugs are quickly pinpointed by the tests.
  • Cleaner architecture: The tests allow us to incrementally make changes to the libavg architecture without undue fear of introducing new bugs.

libavg tests come in two main flavors: High-level functional tests and internal unit tests. make check runs all available tests. In addition, individual tests can be invoked at varying levels of granularity to pinpoint errors.

High-Level Functional Tests

The functional tests are written in python and use the libavg API like any other python client would. The over 250 tests are divided into a number of test suites. There should be a test for every public API in libavg; feel free to report a bug if an API has no test. Ideally, all code paths in libavg should be exercised with a test - including error conditions.

The functional tests reside in their own directory, src/test. Just invoking

$ ./Test.py

runs all functional tests. Command-line parameters can be used to select a test suite or an individual test to run. This is extremely useful for pinpointing errors:

$ ./Test.py image
$ ./Test.py image testImageMipmap

A list of available test suites is output if the command line contains an unknown suite. Test.py has a number of additional command line parameters that can be used to set the graphics configuration to use. Run it with --help to get a listing of the parameters.

Result Images

Many of the tests rely on image comparisons to verify correct execution. The test compares a screenshot of the rendered scene with a baseline image checked into source control. If the images are 'similar enough' (determined by calculating average and standard deviation of the difference image), the test passes. Otherwise, it fails. Mismatches - even minor mismatches that don't cause a test failure - are saved in src/test/resultimages. Per mismatch, the test saves three images:

  • a baseline image that shows what was expected,
  • the actual image generated and
  • a difference image.
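The 'similar enough' check can be sketched as follows. This is a minimal sketch: the function name and the threshold values are illustrative, not libavg's actual parameters.

```python
import numpy as np

def images_similar(baseline, actual, avg_threshold=3.0, stdev_threshold=6.0):
    """Compare two images given as equally-shaped uint8 numpy arrays.

    Computes the per-pixel difference image, then passes only if both
    the average and the standard deviation of the difference stay
    below the (illustrative) thresholds.
    """
    diff = np.abs(baseline.astype(np.int16) - actual.astype(np.int16))
    return diff.mean() <= avg_threshold and diff.std() <= stdev_threshold
```

In the real test suite, the difference image computed here is the third image saved to src/test/resultimages for human inspection.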

The resultimages directory is easily inspected in a file browser set to preview with three or six images per line - see the image at the right for an example.

The system ignores minor differences in images because these have several benign causes, mostly related to the libraries and systems libavg builds upon. For instance, many minor mismatches are caused by differing interpretations of the OpenGL standard on different platforms. Text rendering is an exception: there, the differences between platforms are too large. For this reason, image mismatches in text rendering don't cause test failures; the result images are simply saved for human inspection.

Writing Tests

In code, tests manifest themselves as methods in one of the test suites. These methods are registered at the bottom of each test source file. Here is a very simple test method:

    def testSample(self):
        def getFramerate():
            framerate = player.getEffectiveFramerate()
            self.assert_(framerate > 0)

        root = self.loadEmptyScene()
        avg.ImageNode(href="rgb24-65x65.png", parent=root)
        self.start(False,
                (getFramerate,
                 lambda: self.compareImage("testsample"),
                ))

Typical test code consists of three parts:

  • Scene setup: In the sample, we simply create a scene containing one image node.
  • A series of commands to execute in successive rendered frames: The start method initializes libavg playback and takes a list of python callables that are invoked using an ON_FRAME handler. In the sample, the test executes getFramerate() in the first frame and compares the rendered image to a baseline image in the second frame. Then it terminates.
  • Local functions: Functions to be used during execution of this test.
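The list-of-callables mechanism can be sketched as a small standalone driver. This is a simplification, not libavg's actual implementation; a real test hands the tuple to self.start(), which invokes one entry per rendered frame from an ON_FRAME handler.

```python
def run_frame_commands(commands):
    """Toy stand-in for the start() mechanism: invoke one callable per
    simulated frame, then stop 'playback' when the list is exhausted."""
    for command in commands:
        if command is not None:  # in this sketch, None just skips a frame
            command()

# Compose per-frame actions just as a test method would:
executed = []
run_frame_commands((
    lambda: executed.append("frame 0: check framerate"),
    lambda: executed.append("frame 1: compare image"),
))
```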

The call to compareImage() checks whether the current screen contents are similar to an image in src/test/baseline. If the image doesn't exist in baseline, the results are placed in resultimages/. A new baseline image can simply be copied from there to baseline/ (after inspecting its contents first, obviously!).

All test suites are derived from testcase.AVGTestCase, which in turn is derived from standard python unittest.TestCase. AVGTestCase exposes a number of additional entry points.

To test input and user interface functionality, test methods can use several _sendXxxEvent functions to simulate mouse and touch events. There is also a generic MessageTester class that can be used to check if publishers send out the messages they are expected to. A MessageTester is initialized with a publisher and a collection of MessageIDs. It subscribes itself to the MessageIDs and remembers which messages were sent. MessageTester.isState() compares the messages sent to a baseline list of messages expected.
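A MessageTester-style recorder can be approximated like this. The class and method names and the subscribe API below are illustrative assumptions for the sketch; consult the libavg reference for the real signatures.

```python
class SimplePublisher:
    """Minimal illustrative publisher: callbacks keyed by message ID."""
    def __init__(self):
        self._subscribers = {}

    def subscribe(self, msg_id, callback):
        self._subscribers.setdefault(msg_id, []).append(callback)

    def notify(self, msg_id):
        for callback in self._subscribers.get(msg_id, ()):
            callback()


class RecordingSubscriber:
    """Illustrative stand-in for MessageTester: subscribes to a set of
    message IDs on a publisher and records which ones were sent."""
    def __init__(self, publisher, message_ids):
        self.received = []
        for msg_id in message_ids:
            # Default argument binds the current msg_id to the lambda.
            publisher.subscribe(msg_id, lambda mid=msg_id: self.received.append(mid))

    def is_state(self, expected):
        """Compare the recorded messages to the expected baseline list,
        then reset the recording for the next check."""
        ok = self.received == list(expected)
        self.received = []
        return ok
```

A test would trigger events on the publisher and then assert `tester.is_state([...])` against the expected message sequence.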

Low-Level Unit Tests

The low-level unit tests are C++ programs in the individual source directories. They are named with a test prefix and are run by executing the program on the command line (no parameters). Running low-level tests can significantly shorten the code change -> compile -> test cycle, since it suffices to compile up to the directory in question and run its test. For instance, the basic OpenGL functionality is tested in testgpu; after changes in src/graphics, it is enough to run

$ cd src/graphics
$ make
$ ./testgpu

to verify that the basic functionality is in place.

The mechanisms are similar to the ones used for the high-level tests. Each unit test is written as a single C++ class, derived either from Test or from GraphicsTest. The macros TEST, TEST_FAILED and QUIET_TEST are available to register test failures and successes. GraphicsTest provides an additional testEqual() function to compare images - similar to the python compareImage() function explained above. Using the two overloads of this function, generated bitmaps can be compared either to a file or to another in-memory bitmap.
