Clojure Don’ts: Single-branch if

A short Clojure don’t for today. This one is my style preference.

You have a single expression which should run if a condition is true, otherwise return nil.

Most Clojure programmers would probably write this:

(when (condition? ...)
  (then-expression ...))

But you could also write this:

(if (condition? ...)
  (then-expression ...)
  nil)

Or even this, because the “else” branch of if defaults to nil:

(if (condition? ...)
  (then-expression ...))

There’s an argument to be made for any one of these.

The second variant, if ... nil, makes it very explicit that you want to return nil. The nil might be semantically meaningful in this context instead of just a “default” value.

Some people like the third variant, if with no “else” branch, because they think when is only for side-effects, leaving the single-branch if for “pure” code.

But for me it comes down, as usual, to readability.

The vast majority of the time, if contains both “then” and “else” expressions.

Sometimes a long “then” branch leaves the “else” branch dangling below it. I’m expecting this, so when I read an if my eyes automatically scan down to find the “else” branch.

If I see an if but don’t find an “else” branch, I get momentarily confused. Maybe a line is missing or the code is mis-indented.

Likewise, if I see an if explicitly returning nil, it looks like a mistake because I know it could be written as when. This is a universal pattern in Clojure: lots of expressions (cond, get, some) return nil as their default case, so it’s jarring to see a literal nil as a return value.

So my preferred style is the first version. In general terms:

An if should always have both “then” and “else” branches.
Use when for a condition which should return nil in the negative case.
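
As a small illustration of the second rule, here is a sketch of a function that returns a value in the positive case and nil otherwise. The function name discount-for and the threshold of 100 are made up for this example.

```clojure
;; Hypothetical example: `discount-for` and its threshold are
;; illustrative names, not taken from any library.
(defn discount-for
  "Returns a discount rate for orders over 100, otherwise nil."
  [order-total]
  (when (> order-total 100)
    0.1))

(discount-for 250) ;;=> 0.1
(discount-for 40)  ;;=> nil
```

A reader sees when and knows right away that there is no “else” branch to look for.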

Clojure Don’ts: The Heisenparameter

A pattern I particularly dislike: Function parameters which may or may not be collections.

Say you have a function that does some operation on a batch of inputs:

(defn process-batch [items]
  ;; ... do some work with items ...
  )

Say further that, for this process, the fundamental unit of work is always a batch. Processing one thing is just a batch size of one.

Lots of processes are like this: I/O (arrays of bytes), database APIs (transactions of rows), and so on.

But maybe you have lots of code that mostly deals with one thing at a time, and only occasionally makes a larger batch. In the name of “convenience,” people write things like this:

(defn wrap-coll
  "Wraps argument in a vector if it is not already a collection."
  [arg]
  (if (coll? arg)
    arg
    [arg]))

(defn process
  "Processes a single input or a collection of inputs."
  [input]
  (process-batch (wrap-coll input)))

This is prevalent in dynamically-typed languages of all stripes. I think it’s a case of mistakenly choosing convenience over clarity.

This leads easily to mistakes like iterating over a collection, calling process on each element, when the same work could be done more efficiently in a batch.
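
To make that cost visible, here is a rough sketch that counts how many batches actually run. The batch-calls counter and the stand-in work (incrementing each item) are assumptions made for this example only.

```clojure
;; Hypothetical stand-ins: `batch-calls` counts invocations, and the
;; "work" is just incrementing each item.
(def batch-calls (atom 0))

(defn process-batch [items]
  (swap! batch-calls inc)
  (mapv inc items))

(defn wrap-coll [arg]
  (if (coll? arg) arg [arg]))

(defn process [input]
  (process-batch (wrap-coll input)))

;; Calling process on each element runs one batch per element.
(reset! batch-calls 0)
(doseq [x [1 2 3]]
  (process x))
@batch-calls ;;=> 3

;; Passing the collection runs a single batch.
(reset! batch-calls 0)
(process-batch [1 2 3])
@batch-calls ;;=> 1
```

Both versions give the same results; only the number of batches differs, which matters when each batch is a database transaction or an I/O call.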

Now imagine reading some code when you encounter a call to this function:

(process stuff)

Is stuff a collection or a single object? Who knows?

When you read code, there’s a kind of ad-hoc, mental type-inference going on. This is true regardless of what typing scheme your language uses. Narrowing the range of possible types something can be makes it easier to reason about what type it actually is.

The more general principle:
Be explicit about your types even when they’re dynamic.

If the operation requires a collection, then pass it a collection every time.

A “helper” like wrap-coll saves you a whopping two characters over just wrapping the argument in a literal vector, at the cost of lost clarity and specificity.

If you often forget to wrap the argument correctly, consider adding a type check:

(defn process-batch [items]
  {:pre [(coll? items)]}
  ;; ...
  )
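
Here is a sketch of how that check behaves at the call site, assuming assertions are enabled (the default). The body of process-batch is a stand-in that just counts its inputs.

```clojure
;; Stand-in implementation: the real work is replaced by `count`.
(defn process-batch [items]
  {:pre [(coll? items)]}
  (count items))

;; A single item is wrapped explicitly at the call site.
(process-batch [:a])       ;;=> 1
(process-batch [:a :b :c]) ;;=> 3

;; Forgetting to wrap the argument fails fast with an AssertionError.
(try
  (process-batch :a)
  (catch AssertionError _
    :rejected))
;;=> :rejected
```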

If there actually are two distinct operations, one for a single object and one for a batch, then they should be separate functions:

(defn process-one [item]
  ;; ...
  )

(defn process-batch [items]
  ;; ...
  )

Clojure Don’ts: Optional Arguments with Varargs

Another Clojure don’t today. This one is a personal style preference, but I’ll try to back it up.

Say you want to define a function with a mix of required and optional arguments. I’ve often seen this:

(defn foo [a & [b]]
  (println "Required argument a is" a)
  (println "Optional argument b is" b))

This is a clever trick. It works because & [b] destructures the sequence of arguments passed to the function after a. Sequential destructuring doesn’t require that the number of symbols match the number of elements in the sequence being bound. If there are more symbols than values, they are bound to nil.

(foo 3 4)
;; Required argument a is 3
;; Optional argument b is 4
;;=> nil

(foo 9)
;; Required argument a is 9
;; Optional argument b is nil
;;=> nil
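
The same destructuring behavior can be seen in isolation with a plain let, outside of any function signature:

```clojure
;; More symbols than values: the extra symbol is bound to nil.
(let [[a b c] [1 2]]
  [a b c])
;;=> [1 2 nil]

;; More values than symbols: the extra values are ignored.
(let [[a b] [1 2 3 4]]
  [a b])
;;=> [1 2]
```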

I don’t like this pattern for two reasons.

One. Because it’s variable arity, the function foo accepts any number of arguments. You won’t get an error if you call it with extra arguments; they will just be silently ignored.

(foo 5 6 7 8)
;; Required argument a is 5
;; Optional argument b is 6
;;=> nil

Two. It muddles the intent. The presence of & in the parameter vector suggests that this function is meant to be variable-arity. Reading this code, I might start to wonder why. Or I might miss the & and think this function is meant to be called with a sequence as its second argument.

A couple more lines make it clearer:

(defn foo
  ([a]
   (foo a nil))
  ([a b]
   (println "Required argument a is" a)
   (println "Optional argument b is" b)))

The intent here is unambiguous: The function takes either one or two arguments, with b defaulting to nil. Trying to call it with more than two arguments will throw an exception, telling you that you did something wrong.
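
The same shape works when the default is something other than nil. The following is a sketch with a made-up greet function; the name and the default greeting are assumptions for illustration.

```clojure
;; Hypothetical function: the one-argument arity supplies a default
;; and delegates to the two-argument arity.
(defn greet
  ([name]
   (greet name "Hello"))
  ([name greeting]
   (str greeting ", " name "!")))

(greet "Ada")           ;;=> "Hello, Ada!"
(greet "Ada" "Welcome") ;;=> "Welcome, Ada!"

;; A third argument is an error rather than being silently dropped.
(try
  (greet "Ada" "Welcome" "extra")
  (catch clojure.lang.ArityException _
    :arity-error))
;;=> :arity-error
```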

And one more thing: it’s faster. Variable-arity function calls have to allocate a sequence to hold the arguments, then go through apply. Timothy Baldridge did a quick performance comparison showing that calls to a function with multiple, fixed arities can be much faster than variable-arity (varargs) function calls.

Clojure Do’s: Uncaught Exceptions

Some more do’s and don’ts for you. This time it’s a ‘do.’

In the JVM, when an exception is thrown on a thread other than the main thread, and nothing is there to catch it, nothing happens. The thread dies silently.

This is bad news if you needed that thread to do some work. If all the worker threads die, the application could appear to be “up” but cease to do any useful work. And you’ll never know why.

In Clojure, this could happen on any thread you created with core.async/thread, a worker thread used by core.async/go, or a thread that was created for you by a Java framework such as a Servlet container.

One solution is to just wrap the body of every thread or go in a try/catch block. There are good reasons for doing this: you can get fine-grained control over how exceptions are handled. But it’s easy to forget, and it’s tedious to repeat if you can’t do anything useful with the exception besides log it.

So at a minimum, I recommend always including this snippet of code somewhere in the start-up procedure of your application:

;; Assuming log aliases a logging namespace, e.g. [clojure.tools.logging :as log]
(Thread/setDefaultUncaughtExceptionHandler
 (reify Thread$UncaughtExceptionHandler
   (uncaughtException [_ thread ex]
     (log/error ex "Uncaught exception on" (.getName thread)))))

This bit of code has saved my bacon more times than I can count.

This is a global, JVM-wide setting. There can be only one default uncaught exception handler. Individual Threads and ThreadGroups can have their own handlers, which get called in preference to the default handler. See Thread.setDefaultUncaughtExceptionHandler.
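
To see the default handler fire without a logging library, here is a self-contained sketch that records the failure in an atom instead of logging it. The atom last-uncaught and the thread name are assumptions for this example. Note that evaluating it replaces whatever default handler is currently installed.

```clojure
;; Hypothetical holder for the most recent uncaught exception.
(def last-uncaught (atom nil))

(Thread/setDefaultUncaughtExceptionHandler
 (reify Thread$UncaughtExceptionHandler
   (uncaughtException [_ thread ex]
     (reset! last-uncaught {:thread  (.getName thread)
                            :message (.getMessage ex)}))))

;; Start a plain thread that throws, then wait for it to die.
(doto (Thread. (fn [] (throw (IllegalStateException. "worker failed")))
               "demo-worker")
  (.start)
  (.join))

@last-uncaught
;;=> {:thread "demo-worker", :message "worker failed"}
```

The handler runs on the dying thread before it terminates, so the atom is populated by the time .join returns.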

I’ve tried more aggressive measures, such as terminating the whole JVM process on any uncaught exception. While I think this is technically the correct thing to do, it turns out to be annoying in development.

Also annoying is the fact that some Java frameworks are designed to let threads fail silently. They just allocate a new thread in a pool and keep going. If your application is logging lots of uncaught exceptions but appears to be working normally, look to your container framework to see if that’s expected behavior.

The Hidden Future

Another wrinkle: exceptions inside a future are always caught by the Future. The exception will not be thrown until something calls Future.get (deref in Clojure).
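
A minimal sketch of that behavior: the exception stays inside the future until something dereferences it, at which point it arrives wrapped in an ExecutionException.

```clojure
;; The body throws on another thread, but nothing is reported here.
(def failing
  (future (throw (IllegalStateException. "hidden failure"))))

;; Only deref surfaces the problem, with the original exception
;; available as the cause.
(try
  @failing
  (catch java.util.concurrent.ExecutionException e
    (.getMessage (.getCause e))))
;;=> "hidden failure"
```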

Be aware that ExecutorService.submit returns a Future, so if you’re using an ExecutorService you need to make sure something is eventually going to consume that Future to surface any exceptions it might have caught.

The parent interface Executor.execute does not return a Future, so exceptions will reach the default exception handler.

Using ExecutorService.submit instead of Executor.execute was a bug in very early versions of core.async.
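
Here is a sketch of the submit case using Java interop directly. The pool and task are made up for this example. The ^Callable hint picks the submit(Callable) overload, since a Clojure function is both a Runnable and a Callable.

```clojure
(import '(java.util.concurrent Callable Executors ExecutorService
                               ExecutionException Future))

;; Hypothetical single-thread pool, shut down at the end of the example.
(def ^ExecutorService pool (Executors/newSingleThreadExecutor))

(let [^Callable task (fn [] (throw (IllegalStateException. "lost in the pool")))
      ^Future fut    (.submit pool task)]
  (try
    ;; Without this call to .get, the exception would never be seen.
    (.get fut)
    (catch ExecutionException e
      (.getMessage (.getCause e)))
    (finally
      (.shutdown pool))))
;;=> "lost in the pool"
```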

Record Constructors

Some more Clojure Do’s and Don’ts for you. This week: record constructors.

Don’t use interop syntax to construct records

defrecord and deftype compile into Java classes, so it is possible to construct them using Java interop syntax like this:

(defrecord Foo [a b])

(Foo. 1 2)
;;=> #user.Foo{:a 1, :b 2}

But don’t do that. Interop syntax is for interop with Java libraries.

Since Clojure version 1.3, defrecord and deftype automatically create constructor functions. Use those instead of interop syntax.

For records, you get two constructor functions: one taking the values of fields in the same order they appear in the defrecord:

(defrecord Foo [a b])

(->Foo 1 2)
;;=> #user.Foo{:a 1, :b 2}

And another taking a map whose keys are keywords with the same names as the fields:

(map->Foo {:b 4 :a 3})
;;=> #user.Foo{:a 3, :b 4}

deftype only creates the first kind of constructor, taking the field values in order.

(deftype Bar [c d])

(->Bar 5 6)
;;=> #<Bar user.Bar@2168aeae>

Constructor functions are ordinary Clojure Vars. You can pass them to higher-order functions and :require :as or :refer them into other namespaces just like any other function.
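
For instance, here is a sketch of passing both kinds of constructor to map. The Point record is made up for this example.

```clojure
;; Hypothetical record used only for this example.
(defrecord Point [x y])

;; The positional constructor is a normal two-argument function.
(def points (map ->Point [1 2 3] [4 5 6]))

(first points)
;; equal to (->Point 1 4)

;; The map constructor works the same way with a collection of maps.
(def more-points (mapv map->Point [{:x 0 :y 0} {:x 7 :y 9}]))

(:y (second more-points))
;;=> 9
```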

Do add your own constructor functions

You cannot modify or customize the constructor functions that defrecord and deftype create.

It’s common to want additional functionality around constructing an object, such as validation and default values. To get this, just define your own constructor function that wraps the default constructor.

(defrecord Customer [id name phone email])

(defn customer
  "Creates a new customer record."
  [{:keys [name phone email]}]
  {:pre [(string? name)
         (valid-phone? phone)
         (valid-email? email)]}
  (->Customer (next-id) name phone email))

You don’t necessarily have to use :pre conditions for validation; that’s just how I wrote this example.
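
To try the example at a REPL you need definitions for the helper functions. The versions below are deliberately naive placeholders, not real phone or email validation; next-id, valid-phone?, and valid-email? are all assumptions of this sketch.

```clojure
(defrecord Customer [id name phone email])

;; Placeholder helpers, defined only so the example runs.
(def id-counter (atom 0))

(defn next-id []
  (swap! id-counter inc))

(defn valid-phone? [s]
  (boolean (and (string? s) (re-matches #"\+?[0-9 -]{7,15}" s))))

(defn valid-email? [s]
  (boolean (and (string? s) (re-matches #"[^@\s]+@[^@\s]+" s))))

(defn customer
  "Creates a new customer record."
  [{:keys [name phone email]}]
  {:pre [(string? name)
         (valid-phone? phone)
         (valid-email? email)]}
  (->Customer (next-id) name phone email))

(:name (customer {:name "Ada" :phone "555-0100" :email "ada@example.com"}))
;;=> "Ada"

;; Invalid input is rejected by the preconditions.
(try
  (customer {:name "Ada" :phone "n/a" :email "ada@example.com"})
  (catch AssertionError _
    :invalid))
;;=> :invalid
```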

It’s up to you to maintain a convention to always use your custom constructor function instead of the automatically-generated one.[1]

I frequently define a custom constructor function for every new record type, even if I don’t need it right away. That gives me a place to add validation later, without searching for and replacing every instance of the default constructor.

Even custom constructor functions should follow the rules for safe constructors. In general, that means no side effects and no “publishing” the object to another place before the constructor is finished. Keep the “creation” of an object (the constructor) separate from “starting” or “using” it, whatever that means for your code.



[1] Theoretically you could make the default constructors private with alter-meta!, but I’ve never found it necessary.

Clojure Do’s: Namespace Aliases

Third in a series, this time with some style recommendations based on my personal experience.

In a small project with only a few developers, things like naming and style conventions don’t matter all that much, because almost everyone has worked with almost all of the code.

With bigger teams and bigger code bases — think tens of developers, tens of thousands of lines of Clojure — there’s a good chance that anyone reading the code has never seen it before. For that reader, a few conventions can be a big help.

Optimizing for readability usually means being more verbose. Don’t abbreviate unless you have to.

It also means optimizing for a reader who is not necessarily familiar with the entire code base, or even an entire file. They’ve just jumped to a function definition in their editor, or maybe pulled a line number from a stack trace. They don’t want to take the time to understand how all the different namespaces relate. They especially don’t want to have to scroll to the top of the file just to see where a symbol comes from.

So these conventions are about maximizing readability at the level of single function definitions. Yes, it means more typing. But it makes it much easier to navigate a large codebase maintained by multiple people.

As a general first rule, make the alias the same as the namespace name with the leading parts removed.

(ns com.example.application
  (:require
   [clojure.java.io :as io]
   [clojure.string :as string]))

Keep enough trailing parts to make each alias unique. Did you know that namespace aliases can have dots in them?

[clojure.data.xml :as data.xml]
[clojure.xml :as xml]

Eliminate redundant words such as “core” and “clj” in aliases.

[clj-http :as http]
[clj-time.core :as time]
[clj-time.format :as time.format]

Use :refer sparingly. It’s good for symbols that have no alphabetic characters, such as >! <! >!! <!! in core.async, or heavily-used macros such as those in clojure.test.

You can combine :refer and :as in the same :require clause.

[clojure.core.async :as async :refer [<! >! <!! >!!]]
[clojure.test :refer [deftest is]]

There are always exceptions. For example, some namespaces have established conventions for aliases:

[datomic.api :as d]

Whatever convention you adopt, use consistent aliases everywhere. This makes it easier for everyone to read the code, and makes it possible to search for code with text-based tools like grep.
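
Put together, a namespace following these conventions might look like the sketch below. The namespace com.example.report and the tag-line function are made up for this example, which uses only namespaces that ship with Clojure.

```clojure
;; Hypothetical namespace; aliases are the last segment of each name.
(ns com.example.report
  (:require
   [clojure.set :as set]
   [clojure.string :as string]
   [clojure.test :refer [deftest is]]))

(defn tag-line
  "Joins the union of two tag sets into a sorted, comma-separated string."
  [tags-a tags-b]
  (string/join ", " (sort (set/union tags-a tags-b))))

(tag-line #{"clojure" "jvm"} #{"jvm" "lisp"})
;;=> "clojure, jvm, lisp"

(deftest tag-line-test
  (is (= "a, b" (tag-line #{"a"} #{"b"}))))
```

Someone who lands in the middle of tag-line can tell where string/join and set/union come from without scrolling to the top of the file.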

Clojure Don’ts: isa?

Dynamic typing is cool, but sometimes you just want to know the type of something.

I’ve seen people write this:

(isa? (type x) SomeJavaClass)

As its docstring describes, isa? checks inheritance relationships, which may come from either Java class inheritance or Clojure hierarchies.

isa? manually walks the class inheritance tree, and has special cases for vectors of types to support multiple-argument dispatch in multimethods.

;; isa? with a vector of types.
;; Both sequences and vectors are java.util.List.
(isa? [(type (range)) (type [1 2 3])]
      [java.util.List java.util.List])
;;=> true

Hierarchies are an interesting but rarely-used feature of Clojure.

(derive java.lang.String ::immutable)

(isa? (type "Hello") ::immutable)
;;=> true

If all you want to know is “Is x a Foo?” where Foo is a Java type, then (instance? Foo x) is simpler and faster than isa?.

Some examples:

(instance? String "hello")
;;=> true

(instance? Double 3.14)
;;=> true

(instance? Number 3.14)
;;=> true

(instance? java.util.Date #inst "2015-01-01")
;;=> true

Note that instance? takes the type first, opposite to the argument order of isa?. This works nicely with condp:

(defn make-bigger [x]
  (condp instance? x
    String (clojure.string/upper-case x)
    Number (* x 1000)))

(make-bigger 42)
;;=> 42000

(make-bigger "Hi there")
;;=> "HI THERE"

instance? maps directly to Java’s Class.isInstance(Object). It works for both classes and interfaces, but does not accept nil as a type.

(isa? String nil)      ;;=> false

(instance? String nil) ;;=> false

(isa? nil nil)         ;;=> true

(instance? nil nil)    ;; NullPointerException

Remember that defrecord and deftype produce Java classes as well:

(defrecord FooBar [a])

(instance? FooBar (->FooBar 42))
;;=> true

Remember also that records and types are classes, not Vars, so to reference them from another namespace you must :import instead of :require them.

instance? won’t work correctly with Clojure protocols. To check if something supports a protocol, use satisfies?.
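
Here is a short sketch of satisfies? with a protocol extended to an existing Java class. The Describable protocol is made up for this example.

```clojure
;; Hypothetical protocol used only for this example.
(defprotocol Describable
  (describe [this] "Returns a short description string."))

;; Extend the protocol to java.lang.String after the fact.
(extend-protocol Describable
  String
  (describe [s] (str "the string " (pr-str s))))

(satisfies? Describable "hi") ;;=> true
(satisfies? Describable 42)   ;;=> false

(describe "hi")
;;=> "the string \"hi\""
```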

Clojure Don’ts: Concat

Welcome to what I hope will be an ongoing series of Clojure do’s and don’ts. I want to demonstrate not just good patterns to use, but also anti-patterns to avoid.

Some of these will be personal preferences, others will be warnings from hard-won experience. I’ll try to indicate which is which.

First up: concat.

Concat, the lazily-ticking time bomb

concat is a tricky little function. The name suggests a way to combine two collections. And it is, if you have only two collections. But it’s not as general as you might think. It’s not really a collection function at all. It’s a lazy sequence function. The difference can be important.

Here’s an example that I see a lot in the wild. Say you have a loop that builds up some result collection as the concatenation of several intermediate results:[1]

(defn next-results
  "Placeholder for function which computes some intermediate
  collection of results."
  [n]
  (range 1 n))

(defn build-result [n]
  (loop [counter 1
         results []]
    (if (< counter n)
      (recur (inc counter)
             (concat results (next-results counter)))
      results)))

The devilish thing about this function is that it works just fine when n is small.

(take 21 (build-result 100))
;;=> (1 1 2 1 2 3 1 2 3 4 1 2 3 4 5 1 2 3 4 5 6)

But when n gets sufficiently large,[2] suddenly this happens:

(first (build-result 4000))
;; StackOverflowError   clojure.core/seq (core.clj:133)

In the stack trace, we see concat and seq repeated over and over:

(.printStackTrace *e *out*)
;; java.lang.StackOverflowError
;;      at clojure.core$seq.invoke(core.clj:133)
;;      at clojure.core$concat$fn__3955.invoke(core.clj:685)
;;      at clojure.lang.LazySeq.sval(
;;      at clojure.lang.LazySeq.seq(
;;      at clojure.lang.RT.seq(
;;      at clojure.core$seq.invoke(core.clj:133)
;;      at clojure.core$concat$fn__3955.invoke(core.clj:685)
;;      at clojure.lang.LazySeq.sval(
;;      at clojure.lang.LazySeq.seq(
;;      at clojure.lang.RT.seq(
;;      at clojure.core$seq.invoke(core.clj:133)
;;      at clojure.core$concat$fn__3955.invoke(core.clj:685)
;;      at clojure.lang.LazySeq.sval(
;;      at clojure.lang.LazySeq.seq(
;;      ... hundreds more ...

So we have a stack overflow. But why? We used recur. Our code has no stack-consuming recursion. Or does it? (cue ominous music)

Call the bomb squad

Let’s look at the definition of concat more closely. Leaving out the extra arities and chunked sequence optimizations, it looks like this:

(defn concat [x y]
  (lazy-seq
    (if-let [s (seq x)]
      (cons (first s) (concat (rest s) y))
      y)))

lazy-seq is a macro that wraps its body in a function and then wraps the function in a LazySeq object.

The loop in build-result calls concat on the LazySeq returned by the previous concat, creating a chain of LazySeqs in which each LazySeq’s function closes over the LazySeq from the previous iteration.


Calling seq forces the LazySeq to invoke its function to realize its value. Most Clojure sequence functions, such as first, call seq for you automatically. Printing a LazySeq also forces it to be realized.

In the case of our concat chain, each LazySeq’s fn returns another LazySeq. seq has to recurse through them until it finds an actual value. If this recursion goes too deep, it overflows the stack.

Just constructing the sequence doesn’t trigger the error:

(let [r (build-result 4000)]
  nil)
;;=> nil

It only overflows when we try to realize it:

(let [r (build-result 4000)]
  (seq r)
  nil)
;; StackOverflowError   clojure.lang.RT.seq (

This is a nasty bug in production code, because it could occur far away from its source, and the accumulated stack frames of seq prevent us from seeing where the error originated.

Don’t concat

The fix is to avoid concat in the first place. Our loop is building up a result collection immediately, not lazily, so we can use a vector and call into to accumulate the results:

(defn build-result-2 [n]
  (loop [counter 1
         results []]
    (if (< counter n)
      (recur (inc counter)
             (into results (next-results counter)))
      results)))

This works, at the cost of realizing the entire collection up front:

(time (doall (take 21 (build-result-2 4000))))
;; "Elapsed time: 830.66655 msecs"
;;=> (1 1 2 1 2 3 1 2 3 4 1 2 3 4 5 1 2 3 4 5 6)

This specific example could also be written as a proper lazy sequence like this:

(defn build-result-3 [n]
  (mapcat #(range 1 %) (range 1 n)))

Which avoids building the whole sequence in advance:

(time (doall (take 21 (build-result-3 4000))))
;; "Elapsed time: 0.075421 msecs"
;;=> (1 1 2 1 2 3 1 2 3 4 1 2 3 4 5 1 2 3 4 5 6)

Don’t mix lazy and strict

There’s a more general principle here:
Don’t use lazy sequence operations in a non-lazy loop.

If you’re using lazy sequences, make sure everything is truly lazy (or small). If you’re in a non-lazy loop, don’t build up a lazy result.

There are many variations of this bug, such as:

(first (reduce concat (map next-results (range 1 4000))))
;; StackOverflowError   clojure.core/seq (core.clj:133)
(nth (iterate #(concat % [1 2 3]) [1 2 3]) 4000)
;; StackOverflowError   clojure.core/seq (core.clj:133)
(first (:a (apply merge-with concat
                  (map (fn [n] {:a (range 1 n)})
                       (range 1 4000)))))
;; StackOverflowError   clojure.core/seq (core.clj:133)

It’s not just concat either — any lazy sequence function could potentially cause this. concat is just the most common culprit.
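
For the reduce concat variation, here is a sketch of two alternatives that avoid the nested chain. The chunks collection is a made-up stand-in for a large number of small intermediate results.

```clojure
;; Hypothetical input: 10,000 small vectors.
(def chunks
  (map (fn [i] [i (- i)]) (range 10000)))

;; Strict: accumulate into a vector, so no chain of lazy seqs is built.
(def strict-result (reduce into [] chunks))

(count strict-result) ;;=> 20000
(first strict-result) ;;=> 0

;; Lazy: a single call to concat with many arguments walks the
;; collections one after another instead of nesting one LazySeq
;; per collection.
(def lazy-result (apply concat chunks))

(count lazy-result) ;;=> 20000
```

Which one to pick follows the principle above: use the strict version inside strict code, and the lazy version when the consumer is itself lazy.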

Update October 3, 2015: My friend Jon Distad has come up with a way to avoid this bug with a different implementation of concat. See Concat implementation without stack overflow on the Clojure mailing list.



[1] All these examples use Clojure version 1.6.0.


[2] Depending on your JVM settings, it may take more or fewer iterations to trigger a StackOverflowError.

Clojure 2014 Year in Review

My unscientific, incomplete, thoroughly biased view of interesting things that happened with Clojure in 2014.

Who’s Using Clojure?

No doubt about it: Clojure is making inroads in big business.

Cisco acquired ThreatGRID, a malware/threat analysis company using Clojure.

There hasn’t been what I’d call an official announcement from Amazon, but it’s clear from tweets and job listings that they’re using Clojure in production.

Also on the sort-of-announced front, WalmartLabs showed their love for Clojure in tweets and job listings.

Puppet Labs announced a big move towards Clojure and released their own framework, Trapperkeeper.

The U.K. Daily Mail reported on how they use Clojure at a Newspaper.

Greenius wrote about their Tech Roots: Clojure and Datomic.

Beanstalk told us that Beanstalk + Clojure = Love (and 20x better performance)

Cognitect published case studies from companies succeeding with Clojure and/or Datomic.

On the education front, Elena Machkasova has started gathering references for Clojure in undergraduate CS curriculum.

Radars & Rankings

Thoughtworks Radar January 2014 (PDF) placed Clojure firmly in the “adopt” category, as did element 84’s Technology Radar 2014.

Also in January, Clojure entered the top 20 in The RedMonk Programming Language Rankings.

By the time of Thoughtworks Radar July 2014 (PDF), the editors didn’t even consider Clojure a question, having moved on to “trial” for core.async and “assess” for Om.

Conferences & Events

We started the year off right with Clojure/West in San Francisco (videos on YouTube). I introduced Component, my not-quite-a-framework. Aaron Bedra threw down the gauntlet for securing Clojure web applications, leading to a flurry of activity making Clojure web frameworks more secure by default.

EuroClojure 2014 came to Krakow, Poland (videos on Vimeo).

At Lambda Jam 2014 in Chicago (videos on YouTube), Rich Hickey introduced Transit (Transit on GitHub).

At Strange Loop 2014 in St. Louis (videos on YouTube), Rich Hickey introduced Transducers. Ramsey Nasser and Tims Gardner introduced Clojure + Unity 3D, now named Arcadia. Ambrose Bonnaire Sergeant talked about Typed Clojure in Practice. Michael Nygard talked about Simulation Testing.

Rich Hickey made an appearance at JavaOne 2014 in San Francisco with Clojure Made Simple (video on YouTube).

In November, Clojure/conj 2014 in Washington, D.C. (videos on YouTube) was the biggest Clojure/conj yet. Over 500 attendees filled the beautiful Warner Theater.

Meanwhile, ClojureBridge held workshops throughout the year in Sydney, San Francisco, Edinburgh, and Minneapolis, just to name a few.

Language Ecosystem

The Clojure language itself continues to feature new and innovative ideas, this time Transducers.

ClojureDocs got a huge update: it now covers Clojure 1.6 and some important libraries like core.async and core.logic. There are also two new additions to the Clojure documentation sphere: Grimoire and CrossClj.

I for one am loving the surge in diversity of Clojure tooling. Cursive for IntelliJ garnered some serious attention, while CIDER and Counterclockwise both got major new releases. Boot is a new build tool with a radically different approach from the still-solid Leiningen.

Generative testing really started to catch on. QuickCheck creator John Hughes gave a great keynote (video) at Clojure/West. Ashton Kemerling talked about Generative Integration Tests at Clojure/conj (also blogged). And of course the Clojure library simple-check became test.check, and has grown steadily in both capability and adoption.

Most of the Clojure contrib projects have gotten improvements and new releases.

ClojureScript growth accelerated, with three (and counting) frameworks built on top of Facebook’s React: Om, Quiescent, and Reagent.

Speaking of frameworks, there was a fair amount of activity around Component. No, I haven’t ported it to ClojureScript yet :) but there’s another ClojureScript port. People have started building things on top of Component, including juxt/modular and danielsz/system. uSwitch published some Example Component-based Apps.

Like a Phoenix rising from the ashes, new Pedestal releases appeared with support for fully non-blocking I/O, Transit, and Immutant.

What else is going on? The State of Clojure Survey 2014 analysis gave some insight into what people are thinking about Clojure.

Onward to 2015!

Thanks to Michael Fogus, Lake Denman, Alex Miller, and Paul deGrandis for their help in assembling this post.