Fixtures as Caches

I am responsible — for better or for worse — for the library which eventually became clojure.test. It has remained largely the same since it was first added to the language distribution back in the pre-1.0 days. While there are many things about clojure.test which I would do differently now — dynamic binding, var metadata, side effects — it has held up remarkably well.

I consider fixtures to be one of the less-well-thought-out features of clojure.test. A clojure.test fixture is a function which wraps a test function, typically for the purpose of setting up and tearing down the environment in which the test should run. Because test functions do not take arguments, the only way for a fixture to pass state to the test function is through dynamic binding. A typical fixture looks like this:

 (ns fixtures-example
   (:require [clojure.test :as test :refer [deftest is]]))
 
 (def ^:dynamic *fix*)
 
 (defn my-fixture [test-fn]
   (println "Set up *fix*")
   (binding [*fix* 42]
     (test-fn))
   (println "Tear down *fix*"))
 
 (test/use-fixtures :each my-fixture)
 
 (deftest t1
   (println "Do test t1")
   (is (= *fix* 42)))
 
 (deftest t2
   (println "Do test t2")
   (is (= *fix* (* 7 6))))

There are two kinds of fixtures in clojure.test:

:each fixtures run once per test, for every test in the namespace.

:once fixtures run once per namespace, wrapped around all tests in that namespace.

I think the design of fixtures has a lot of problems. Firstly, attaching them to namespaces was a bad idea, since namespaces typically contain many different tests, only some of which actually need the fixture. This increases the likelihood of unintended coupling between fixtures and test code.

Secondly, :each fixtures are redundant. If you need to wrap every test in some piece of shared code, all you need to do is put the shared code in a function or macro and call it in the body of each test function. There’s a small amount of duplication, but you gain flexibility to add tests which do not use the same shared code.
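As a minimal sketch of that alternative, assume a hypothetical helper named with-fix (not part of clojure.test) that does the same set-up and tear-down as my-fixture but hands the value to the test as an argument instead of through a dynamic var:

```clojure
(ns shared-code-example
  (:require [clojure.test :refer [deftest is]]))

(defn with-fix
  "Hypothetical helper: sets up a value, passes it to f,
  and tears down afterwards, even if f throws."
  [f]
  (println "Set up fix")
  (try
    (f 42)
    (finally
      (println "Tear down fix"))))

(deftest t1
  (with-fix
    (fn [fix]
      (is (= fix 42)))))

;; A test that does not need the shared code simply does not call it.
(deftest t2
  (is (= 4 (+ 2 2))))
```

Each test now states its own dependency, and nothing is attached to the namespace.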

(Another common complaint about fixtures is that they make it difficult to execute single tests in isolation, although the addition of test-vars in Clojure 1.6 ameliorated that problem.)
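For illustration, here is roughly how test-vars can run a single test with its namespace's fixtures applied. This is a sketch that assumes Clojure 1.6 or later; the namespace name is made up for the example:

```clojure
(ns test-vars-example
  (:require [clojure.test :as test :refer [deftest is]]))

(def ^:dynamic *fix*)

(defn my-fixture [test-fn]
  (binding [*fix* 42]
    (test-fn)))

(test/use-fixtures :each my-fixture)

(deftest t1
  (is (= *fix* 42)))

;; Runs only t1, wrapped in the fixtures registered for its namespace.
;; Calling (t1) directly would skip the fixture and leave *fix* unbound.
(test/test-vars [#'t1])
```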

So :once fixtures are the only ones that matter. But if you want true isolation between your tests then they should not share any state at all. The only reason for sharing fixtures across tests is when the fixture does something expensive or time-consuming. Here again, namespaces are often the wrong level of granularity. If some resource is expensive to prepare, you may only want to pay the cost of preparing it once for all tests in your project, not once per namespace.

So the purpose of :once fixtures is to cache their initialized state in between tests. What if we were to use fixtures only for caching? It might look something like this:

 (ns caching-example
   (:require [clojure.test :refer [deftest is]]))
 
 (def ^:dynamic ^:private *fix* nil)
 
 (defn new-fix
   "Computes a new 'fix' value for tests."
   []
   (println "Computing fixed value")
   42)
 
 (defn fix
   "Returns the current 'fix' value for
   tests, creating one if needed."
   []
   (or *fix* (new-fix)))
 
 (defn fix-fixture
   "A fixture function to provide a reusable
   'fix' value for all tests in a namespace."
   [test-fn]
   (binding [*fix* (new-fix)]
     (test-fn)))
 
 (clojure.test/use-fixtures :once fix-fixture)
 
 (deftest t1
   (is (= (fix) 42)))
 
 (deftest t2
   (is (= (fix) (* 7 6))))

This still avoids repeated computation of the fix value, but clearly shows exactly which tests use it. The :once fixture is just an optimization: You could remove it and the tests would still work, perhaps more slowly. Best of all, you can run the individual test functions in the REPL without any additional setup.

The same idea works even if the fixture requires tear-down after tests are finished:

 (ns resource-example
   (:require [clojure.test :refer [deftest is]]))
 
 (defn acquire-resource []
   (println "Acquiring resource")
   :the-resource)
 
 (defn release-resource [resource]
   (println "Releasing resource"))
 
 (def ^:dynamic ^:private *resource* nil)
 
 (defmacro with-resource
   "Acquires resource and binds it locally to
   symbol while executing body. Ensures resource
   is released after body completes. If called in
   a dynamic context in which *resource* is
   already bound, reuses the existing resource and
   does not release it."
   [symbol & body]
   `(let [~symbol (or *resource*
                      (acquire-resource))]
      (try ~@body
           (finally
             (when-not *resource*
               (release-resource ~symbol))))))
 
 (defn resource-fixture
   "Fixture function to acquire a resource for all
   tests in a namespace."
   [test-fn]
   (with-resource r
     (binding [*resource* r]
       (test-fn))))
 
 (clojure.test/use-fixtures :once resource-fixture)
 
 (deftest t1
   (with-resource r
     (is (keyword? r))))
 
 (deftest t2
   (with-resource r
     (is (= "the-resource" (name r)))))
 
 (deftest t3
   (with-resource r
     (is (nil? (namespace r)))))

Again, each of these tests can be run individually at the REPL with no extra ceremony. If you don’t want to keep paying the resource-setup cost in the REPL, you could temporarily redefine the *resource* var in its initialized state.
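One way to do that is alter-var-root. The sketch below uses stand-in definitions that mirror the example above so that it can be run on its own; the acquired counter is a hypothetical addition, present only to make the reuse visible:

```clojure
(ns repl-example)

;; Stand-ins mirroring the resource-example definitions.
(def acquired (atom 0))   ; hypothetical: counts acquisitions

(defn acquire-resource []
  (swap! acquired inc)
  :the-resource)

(defn release-resource [_resource] nil)

(def ^:dynamic *resource* nil)

(defmacro with-resource [sym & body]
  `(let [~sym (or *resource* (acquire-resource))]
     (try ~@body
          (finally
            (when-not *resource*
              (release-resource ~sym))))))

;; Set the root value once; every later with-resource reuses it.
(alter-var-root #'*resource* (constantly (acquire-resource)))

(with-resource r (name r))      ; no new acquisition
(with-resource r (keyword? r))  ; still the same resource

;; When finished, release the resource and restore the root value.
(release-resource *resource*)
(alter-var-root #'*resource* (constantly nil))
```

The important part is the last step: if the root value is left in place, later runs will silently reuse a resource that may have gone stale.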

The key in both cases is that the “fixtures” are designed to nest without duplicating effort. Each test function specifies exactly what state or resources it needs, but only creates them if they do not already exist. Some of those resources may be shared among multiple tests, but that fact is hidden from the individual tests.

With this in mind, it becomes possible to share a resource across all tests in a project, not just within a namespace. All you need is an “entry point” which kicks off all the tests. clojure.test provides run-tests for specifying individual namespaces and run-all-tests to search for namespaces by regex. All you have to do is make sure your test namespaces are loaded, either via direct require or a utility such as tools.namespace. Then you can run a full test suite that only executes the expensive setup/teardown code once:

 (ns main-test
   (:require [clojure.test :as test]
             [my.app.a-test]))
 
 (defn -main [& _]
   (with-resource-1
     (with-resource-2
       ;;; ... more fixture wrappers ...
       (test/run-all-tests #"^my\.app\..+-test$"))))

Typed Assertions Tell You What Hurts

One thing clojure.test did reasonably well was tell you why an assertion failed. Currently, Lazytest fails in this regard.

The problem with requiring test functions to return true/false to indicate pass/fail is that they can’t attach any additional information to a failure to explain why it failed.

I realized that function return values are insufficient for describing failure conditions. Fortunately, we’ve long had another means for functions to signal failure: typed exceptions.

Typed exceptions seem to be out of favor at the moment. Clojure itself only uses a handful of generic exception types, and defines none of its own.

It’s slightly awkward to define new exceptions in Clojure because the JVM requires any thrown exception to be derived from the concrete base class java.lang.Throwable.

Sure, you could use gen-class, but generating a stub class that maps to a Clojure namespace seems like overkill for such a simple task. All I need is something that I can throw with an arbitrary payload attached.

So I did that thing that will make everyone cringe: I wrote it in Java. All nine lines of it:

package lazytest;

public class ExpectationFailed extends Error {
    public final Object reason;

    public ExpectationFailed(Object reason) {
        this.reason = reason;
    }
}

Now I can define any number of typed objects representing different failure conditions, and I still only have to worry about catching one exception type.

Next I can write functions that test for different conditions and throw ExpectationFailed when they are not met, attaching the appropriate failure object. I can even write a macro, expect, that transforms an ordinary predicate expression into an “expectation expression” by reflecting on the code.

The expect macro fills the same role in Lazytest as the is macro in clojure.test.
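To give a feel for the idea, here is a rough sketch of such a macro. It is not Lazytest's implementation: it substitutes ex-info (available in Clojure 1.4 and later) for the ExpectationFailed class, the payload keys are invented for the example, and it assumes the predicate is an ordinary function rather than a macro or special form.

```clojure
(ns expect-sketch)

(defmacro expect
  "Hypothetical sketch: evaluates (pred & args). On failure, throws
  with the original form and the evaluated arguments attached."
  [[pred & args :as form]]
  `(let [args# (list ~@args)]
     (or (apply ~pred args#)
         (throw (ex-info "Expectation failed"
                         {:form '~form
                          :evaluated (cons '~pred args#)})))))

(expect (= 4 (+ 2 2)))   ; passes, returns true

(try
  (expect (= 5 (+ 2 2)))
  (catch clojure.lang.ExceptionInfo e
    (ex-data e)))
;; => {:form (= 5 (+ 2 2)), :evaluated (= 5 4)}
```

Because the macro sees the expression before it is evaluated, the failure can carry both the code as written and the values it produced.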

Now I just need to figure out how to merge all this back into the master branch.

A Journey of a Thousand Lines Begins with a Single Test

I have a curious obsession with testing frameworks. The first thing I do with any new programming language is try to write a test framework in it. It’s a useful exercise for exploring the metaprogramming facilities provided by any language. So in C, I use preprocessor macros; in Java, annotations; and in a Lisp, macros.

When I started playing with Clojure, there was no testing framework. So I wrote one, borrowing ideas from Common Lisp test frameworks such as LIFT and Chapter 9 of Peter Seibel’s Practical Common Lisp.

This was clojure.contrib.test-is. By virtue of being first out of the gate, it became the de facto standard testing framework for Clojure, and release 1.1 gave it an official position as clojure.test.

After seeing clojure.test used in the wild, and using it on my own projects, I found some problems. I set out to fix them in a totally new framework called Lazytest, which I have been working on since February. Lazytest has gone through three major revisions already, and will probably get at least one more before I release it.

Lazytest started with the simple desire to fix all the problems I found with clojure.test, but it evolved into an attempt to make the perfect behavior-driven development framework for Clojure, incorporating all the best ideas from TDD/BDD frameworks in other languages.

This post is an attempt to document where Lazytest is now, the thought processes that got it there, and where it’s headed.

First, I’ll cover what clojure.test did wrong (and right).

Things clojure.test got wrong:

Test code is tightly coupled to reporting. Every assertion is responsible for calling clojure.test/report, which immediately prints the result and updates a global counter for tests passed/failed. The only way to change the report output format is to rebind report while tests are running. This makes it awkward to implement alternative result formats such as TAP and JUnit XML.

Tests can only be grouped by dynamic scope. Following the style of Seibel, the only way clojure.test can combine tests into groups (other than namespaces) is to call one test within the body of another. This conflicts with the default run-tests behavior of running all tests defined in a namespace, leading to the poorly-understood test-ns-hook hack. There is no way to group tests by lexical scope.

Fixtures can only be assigned per-namespace. Fixtures were a late addition to clojure.test and were not integrated well with the rest of the design. The fact that they are globally applied to an entire namespace makes them useless for all but the simplest cases.

Fixtures rely on dynamic scope. The only way to pass values from a fixture to a test function is with dynamic binding. Not only is this awkward to use (every value shared between fixtures and tests needs a global Var), it also makes test functions dependent on the dynamic context provided by the framework. Individual tests cannot be run outside of run-tests.

Code templates. This was a clever idea that didn’t pan out. clojure.template/do-template is a really complicated way to do map and never should have been promoted from clojure-contrib to Clojure proper.

Tree-walking. clojure.walk was another clever idea that didn’t pan out. It is still useful in a handful of situations, such as recursively changing all keywords to strings, but it could probably be replaced with something simpler.
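Two of the points above, grouping through test-ns-hook and changing output by rebinding report, are easier to see in code. The following is only a sketch; the namespace and the collect-results helper are made up for the example:

```clojure
(ns coupling-example
  (:require [clojure.test :as test :refer [deftest is]]))

(deftest addition (is (= 4 (+ 2 2))))
(deftest subtraction (is (= 1 (- 3 2))))

;; Grouping by dynamic scope: a "group" is a test that calls other tests.
(deftest arithmetic
  (addition)
  (subtraction))

;; Without this hook, run-tests would run all three vars, so addition
;; and subtraction would each run twice.
(defn test-ns-hook []
  (arithmetic))

(defn collect-results
  "Runs the tests in ns-sym with report rebound to a function that
  records each event instead of printing it. The default report
  methods also update the pass/fail counters, so a replacement has
  to take that over if it wants a meaningful summary."
  [ns-sym]
  (let [events (atom [])]
    (binding [test/report (fn [event] (swap! events conj event))]
      (test/run-tests ns-sym))
    @events))
```

Calling (frequencies (map :type (collect-results 'coupling-example))) shows which events the framework emits. Any alternative output format has to be assembled from that stream while the tests are still running.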

Things clojure.test got right:

An explicit assertion form, a.k.a. the is macro. In most drafts of Lazytest I omitted this form, instead treating the last expression of any test body as an assertion. I wanted to discourage the use of multiple assertions in a single test, but such usage is frequently necessary when testing real-world code.

Recognizing assertions by syntactic form. The is macro uses a multimethod to dispatch on the first symbol in the assertion expression. The multimethod can generate different code for different kinds of assertions, such as equality, instance? checks, or exceptions thrown.

I had hoped that people would extend the is macro with their own assertion forms, but almost no one did. It was too hard to understand and, like the rest of clojure.test, too tightly coupled to the reporting subsystem.
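For reference, extending is means adding a method to the clojure.test/assert-expr multimethod. A minimal sketch, using a hypothetical approx= assertion that is not part of clojure.test:

```clojure
(ns assert-expr-example
  (:require [clojure.test :as test :refer [deftest is]]))

;; The method receives the unevaluated form and returns the code that
;; `is` expands into, including the call that reports the result.
(defmethod test/assert-expr 'approx= [msg form]
  (let [[_ expected actual tolerance] form]
    `(let [expected# ~expected
           actual# ~actual
           ok# (<= (Math/abs (double (- expected# actual#))) ~tolerance)]
       (test/do-report {:type (if ok# :pass :fail)
                        :message ~msg
                        :expected '~form
                        :actual actual#})
       ok#)))

(deftest sqrt-test
  (is (approx= 1.414 (Math/sqrt 2) 0.001)))
```

Note that the generated code has to call do-report itself. That is the coupling to the reporting subsystem described above.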

Goals for Lazytest

Separation of concerns. There should be well-defined interfaces for creating tests, running tests, and reporting test results. It should be trivial to replace any of those components with another that respects the same interface.

Separation of syntax from internal representations. There should be a simple, functional interface for defining tests without any need for macros. Different text syntaxes, implemented as macros, can be layered on top of this interface.

Support for continuous testing. It should be possible to start a “watcher” process to monitor directories and re-run tests when files change.

Lexical grouping. It should be possible to combine tests into groups, with unlimited nesting, using lexical scopes.

Composable per-test fixtures. Fixtures (called “contexts” at the moment) may be attached to individual tests or groups of tests, and may be composed.

Support for tagging. Tests may be tagged with arbitrary metadata, including “skip”, “pending”, and “focus” to control which tests are run.

Useful reporting. A test failure report should include enough information to diagnose the problem without referring back to the test code. This is probably the hardest goal, partly because it is dependent on having clearly-written tests.

If I can do all this, it will be TDD-nirvana, but that’s a big if. Even though I have code for most of the pieces, making them all work together will be a significant challenge.

The basics are already there, on my Lazytest github page. Please try it out and send me any feedback you have, but be aware that everything in the code, including the test syntax, is still alpha and subject to change.

I will make a proper release at some point, but not until I am satisfied that I have implemented the proper abstractions.

Tests Are Code

It’s interesting to see the first signs of rebellion against RSpec. I jumped on the RSpec bandwagon when it first appeared, mostly so I wouldn’t have to write “assert_equals” all the time. But while I liked and used RSpec, I don’t think it made my tests any better. If anything, they were a little bit worse. I found myself testing things that were not really relevant to the code I was writing, asserting obvious things like “a newly-created record should be empty.”

When I got interested in Clojure, one of the first things I wrote was a testing library called “test-is”. I borrowed from a lot of Common Lisp testing frameworks, especially the idea of a generic assertion macro called “is”. It looks like this:

(deftest test-my-function
  (is (= 7 (my-function 3 4)))
  (is (even? (my-function 10 2))))

This is pretty basic, but it’s sufficient for low-level unit testing. So far, I think that’s how the library has been typically used. There have been occasional requests, however, for RSpec-style syntax. I can see how this would be useful for testing at a level higher than individual functions, but I have come to believe that the added semantics of RSpec are not really necessary.

Right now, the test-is library is built on the same abstractions as Clojure itself. Tests are functions, so you can apply all the same tools that already exist for handling functions. Tests can be called by name, organized into namespaces, and composed. There is almost no extra bookkeeping code that I need to write to make all of this work.

In contrast, if I were to adopt the RSpec style, I would have to write code to call, store, and organize tests. That’s more work for me, and ultimately restricts the flexibility of the library for people who use it. Furthermore, RSpec has its own set of semantics, above and beyond the language itself, which must be learned.

This is my first experience supporting a library for anyone other than myself, and I don’t want to force anyone into a particular style. A library like RSpec is a complete environment that attempts to anticipate all possible usage scenarios, so it’s grown correspondingly complicated. I want to provide a set of small tools that can be combined with other tools to do interesting things.

Of course, by making that decision I’m already dictating, to some extent, how the library can be used. But really, what I’m trying to do is set limits for myself. I will commit to providing a flexible, extensible set of functions and macros for writing tests. I am explicitly not trying to provide a complete testing framework. If someone wants to build an RSpec-style framework on top of test-is, more power to them. I will happily try to make test-is easier to integrate into that framework.

But there’s one other thing that struck me about that article that I linked to at the beginning — the idea of putting tests and code in the same file. I think that’s a great idea, and Clojure comes ready-made to implement it. Clojure supports the idea of “metadata” on definitions. You can attach a set of arbitrary properties to any object, without affecting the value of that object.

It’s easy to attach a test function as metadata on a definition in Clojure, but the syntax is a little ugly, and there is no easy way to remove the tests from production code. So I came up with an addition to my library: the “with-test” macro. It lets you wrap any definition in a set of tests. It looks like this:

(with-test
 (defn add-numbers [a b]
   (+ a b))
 (is (= 7 (add-numbers 3 4)))
 (is (= -4 (add-numbers -6 2))))

This is equivalent to adding metadata to the function, but the syntax is a little cleaner. I’ve also added a global variable, “*load-tests*”, which can be set to false to omit tests when loading production code.
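For comparison, here is roughly what the metadata version looks like when written by hand. This is a sketch using the clojure.test names the library later acquired; the namespace is made up for the example:

```clojure
(ns with-test-example
  (:require [clojure.test :as test :refer [is]]))

;; The test function is stored under the :test key
;; of the var's metadata.
(defn add-numbers
  {:test (fn []
           (is (= 7 (add-numbers 3 4)))
           (is (= -4 (add-numbers -6 2))))}
  [a b]
  (+ a b))

;; The test runner finds and runs it like any other test.
(test/test-var #'add-numbers)
```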

I like having each function right next to its tests. It makes it easier to remember to write tests, and easier to see how the function is supposed to behave. So to the extent that test-is will promote a testing style, this is it. But it’s a pretty radical departure from the traditional style of testing, so I’m not sure how others will react to it.