Quick, Clojure programmers, what does the following expression do?
(get x k)
If you answered, “It looks up the key k in an associative data structure x and returns its associated value,” you’re right, but only partially.
What if x is not an associative data structure? In every released version of Clojure up to and including 1.5.0, get will return nil in that case.
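A quick REPL sketch of that behavior, assuming a Clojure release in which get has not been changed to throw. The sample values are illustrative only.

```clojure
;; Lookups on associative (or indexed) values behave as expected.
(get {:a 1} :a)        ;;=> 1
(get [10 20 30] 1)     ;;=> 20

;; Lookups on values that get does not support quietly return nil.
(get 42 :a)            ;;=> nil
(get (ref {:a 1}) :a)  ;;=> nil

;; The optional third argument (not-found) is returned instead of nil.
(get 42 :a :missing)   ;;=> :missing
```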
Is that a bug or a feature? It can certainly lead to some hard-to-find bugs, such as this one which I’ve often found in my own code:
(def person (ref {:name "Stuart" :job "Programmer"}))
(get person :name) ;;=> nil
Spot the bug? person is not a map but rather a Ref whose state is a map. I should have written (get @person :name). One character between triumph and defeat! To make matters worse, that nil might not show up until it triggers a NullPointerException several pages of code later.
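To make the delayed failure concrete, here is a small sketch. The helper shout-name is hypothetical and exists only to show where the exception surfaces. It assumes a Clojure version in which get returns nil for a Ref.

```clojure
(def person (ref {:name "Stuart" :job "Programmer"}))

;; shout-name is a hypothetical helper, not part of any library.
(defn shout-name [p]
  (.toUpperCase ^String (get p :name)))

;; Dereferencing first gives get the map it expects.
(shout-name @person)  ;;=> "STUART"

;; Passing the Ref itself makes get return nil, and the failure only
;; surfaces inside .toUpperCase, away from the actual mistake:
;; (shout-name person)  ; throws NullPointerException
```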
It turns out that several core functions in Clojure behave this way: if called on an object which does not implement the correct interface, they return nil rather than throwing an exception.
The contains? function is a more bothersome example. Not only is the name difficult to remember (it’s an associative function that checks for keys, not a linear search of values like java.util.Collection#contains), but it also returns false, rather than throwing, on objects which do not implement clojure.lang.Associative. Or at least it did up through Clojure 1.4.0. I submitted a patch (CLJ-932), included in Clojure 1.5.0, which changed contains? to throw an exception instead.[1]
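The key-versus-value distinction is easy to see at the REPL. The sketch below assumes Clojure 1.5.0 or later for the exception case. The some idiom is one common way to search values, shown here only for contrast.

```clojure
;; contains? checks keys (or indexes), not values.
(contains? {:a 1} :a)     ;;=> true
(contains? [10 20] 1)     ;;=> true, index 1 exists
(contains? [10 20] 20)    ;;=> false, there is no index 20

;; For a linear search of values, some with a set is one common idiom.
(some #{20} [10 20])      ;;=> 20

;; From Clojure 1.5.0 on, an unsupported argument throws:
;; (contains? '(10 20) 10)  ; IllegalArgumentException
```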
I submitted a similar patch (CLJ-1107) to do the same thing for get, but not in time for consideration in the 1.5.0 release.
A few weeks later, I was writing some code that looked like this:
(defn my-type [x]
  (or (get x :my-namespace/type)
      (get (meta x) :my-namespace/type)
      (get x :type)
      (clojure.core/type x)))
I wanted a flexible definition of “type” which worked on maps or records with different possible keys, falling back on the clojure.core/type function, which looks for a :type key in metadata before falling back to clojure.core/class.
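A few sample calls show how each branch of the or is reached. This sketch assumes a Clojure version in which get still returns nil for unsupported arguments. The keyword values such as :person and :pair are made up for illustration.

```clojure
(defn my-type [x]
  (or (get x :my-namespace/type)
      (get (meta x) :my-namespace/type)
      (get x :type)
      (clojure.core/type x)))

;; First branch: the namespaced key is on the map itself.
(my-type {:my-namespace/type :person})                  ;;=> :person

;; Second branch: the key lives in the metadata.
(my-type (with-meta [1 2] {:my-namespace/type :pair}))  ;;=> :pair

;; Third branch: a plain :type key on the map.
(my-type {:type :legacy})                               ;;=> :legacy

;; Last branch: a number has no keys and no metadata,
;; so every get returns nil and type falls back to the class.
(my-type 42)                                            ;;=> java.lang.Long
```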
Before the patch to get in CLJ-1107, this code works perfectly well. After the patch, it won’t. I would have to write this instead:
(defn my-type [x]
  (or (when (associative? x) (get x :my-namespace/type))
      (get (meta x) :my-namespace/type)
      (when (associative? x) (get x :type))
      (clojure.core/type x)))
But wait! The meta function also returns nil for objects which do not support metadata. Maybe that should be “fixed” too. Then I would have to write this:
(defn my-type [x]
  (or (when (associative? x) (get x :my-namespace/type))
      (when (instance? clojure.lang.IMeta x) (get (meta x) :my-namespace/type))
      (when (associative? x) (get x :type))
      (clojure.core/type x)))
And so on.
Every language decision means trade-offs. Clojure accepts nil as a logical false value in boolean contexts, like Common Lisp (and also many scripting languages). This “nil punning” enables a concise style in which nil stands in for an empty collection or missing data.[2] For example, Clojure 1.5.0 introduces two new macros, some-> and some->>, which keep evaluating expressions until one of them returns nil.
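A short sketch of how these macros short-circuit on nil, assuming Clojure 1.5.0 or later. The sample data is illustrative only.

```clojure
;; Each step runs only if the previous one returned non-nil.
(some-> {:a {:b 1}} :a :b inc)   ;;=> 2
(some-> {:a {}} :a :b inc)       ;;=> nil, stops before inc

;; The plain threading macro keeps going and fails on (inc nil):
;; (-> {:a {}} :a :b inc)        ; throws NullPointerException

;; some->> threads into the last argument position instead.
(some->> [1 2 3] (map inc) seq first)  ;;=> 2
(some->> [] seq (map inc))             ;;=> nil, because (seq []) is nil
```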
Is Clojure’s get wrong? It depends on what you think get should mean. If you’re a fan of more strictly-typed functional languages you might think get should be defined to return an instance of the Maybe monad:
;; made-up syntax: get [Associative[K,V], K] -> Maybe[V]
You can implement the Maybe monad in Clojure, but there’s less motivation to do so without the support of a static type checker. You could also argue that, since Clojure is dynamically-typed, get can have a more general type:
;; made-up syntax: get [Any, Any] -> Any | nil
This latter definition is effectively the type of get in Clojure right now.
Which form is better is a matter of taste. What I do know is that the current behavior of get doesn’t give much affordance to a Clojure programmer, even an experienced one.[3]
Again, trade-offs. Clojure’s definition of get is flexible but can lead to subtle bugs. The stricter version would be safer but less flexible.
An even stricter version of get would throw an exception if the key is not present instead of returning nil. Sometimes that’s what you want. The Simulant testing framework defines a utility function getx that does just that.
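One way such a strict lookup could be written is sketched below. The name get-strict and the error message are hypothetical, and this is not Simulant’s actual getx. It assumes Clojure 1.4.0 or later for ex-info.

```clojure
;; get-strict is a hypothetical name; this is not Simulant's getx.
(defn get-strict
  "Returns the value mapped to k in m, or throws if k is absent."
  [m k]
  ;; A private sentinel distinguishes a missing key from a nil value.
  (let [v (get m k ::not-found)]
    (if (identical? v ::not-found)
      (throw (ex-info "Missing required key" {:key k :in m}))
      v)))

(get-strict {:a 1} :a)    ;;=> 1
(get-strict {:a nil} :a)  ;;=> nil, the key is present with a nil value
;; (get-strict {:a 1} :b) ; throws clojure.lang.ExceptionInfo
```

The sentinel matters because a plain nil check could not tell “key absent” apart from “key present with value nil”.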
Over the past five years, Rich Hickey has gradually led Clojure in the direction of “fast but correct by default.” This is particularly evident in the numeric primitives since release 1.3.0, which throw an exception on overflow (correct) but do not automatically promote from fixed- to arbitrary-precision types (slow).
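The arithmetic behavior can be checked at the REPL. This is a sketch assuming Clojure 1.3.0 or later.

```clojure
;; Default arithmetic on longs is checked: overflow is an error.
;; (+ Long/MAX_VALUE 1)   ; throws ArithmeticException

;; The primed variants promote to arbitrary precision instead.
(+' Long/MAX_VALUE 1)     ;;=> 9223372036854775808N
(*' Long/MAX_VALUE 2)     ;;=> 18446744073709551614N
```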
I believe the change to get in CLJ-1107 will ultimately be more help than hindrance. But it might also be useful to have a function which retains the “more dynamic” behavior. We might call it get' in the manner of the auto-promoting arithmetic functions such as +'. Or perhaps, with some cleverness, we could define a higher order function that transforms any function into a function that returns nil when called on a type it does not support. This would be similar in spirit to fnil but harder to define.[4]
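For comparison, here is fnil next to a naive version of such a wrapper. The name nil-on-unsupported is hypothetical, and the sketch takes exactly the shortcut that footnote 4 warns about: it catches IllegalArgumentException, so it also hides unrelated argument errors and pays the cost of exception handling. It assumes Clojure 1.5.0 or later, where contains? throws on unsupported types.

```clojure
;; fnil patches nil *arguments* before calling the wrapped function.
((fnil inc 0) nil)   ;;=> 1

;; nil-on-unsupported is a hypothetical name, not a core function.
;; Caveat: it swallows every IllegalArgumentException thrown by f,
;; not only "unsupported type" errors.
(defn nil-on-unsupported [f]
  (fn [& args]
    (try
      (apply f args)
      (catch IllegalArgumentException _ nil))))

(def lenient-contains? (nil-on-unsupported contains?))

(lenient-contains? {:a 1} :a)    ;;=> true
(lenient-contains? '(10 20) 10)  ;;=> nil on Clojure 1.5.0 and later
```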
Update #1: changed (instance? x clojure.lang.Associative) to (associative? x), suggested by Luke VanderHart.
Update #2: Some readers have pointed out that I could make my-type polymorphic, thereby avoiding the conditional checks. But that would be even longer and, in my opinion, more complicated than the conditional version. The get function is already polymorphic, a fact which I exploited in the original definition of my-type. It’s a contrived example anyway, not a cogent design.
Footnotes:
[1] We can’t do anything about the name of contains? without breaking a lot more code. This change, at least, is unlikely to break any code that wasn’t already broken.
[2] There’s a cute poem about nil-punning in Common Lisp versus Scheme or T.
[3] I am slightly abusing the definition of affordance here, but I think it works to convey what I mean: the implementation of get in the Clojure runtime does not help me to write my code correctly.
[4] I don’t actually know how to do it without catching IllegalArgumentException, which would be bad for performance and potentially too broad. Left as an exercise for the reader!
This mirrors the debate about power vs. protection in programming languages. Chuck Moore has described Forth as amplifying the power of individual programmers, power to do good as well as harm. He and his collaborators think of Forth as a way to move the mass of programmers toward the ends of the bell curve. In contrast, protective languages–such as Java and Ada–tend to diminish the harm done by the worst, but also diminish the productivity of the best. Thus they move programmers toward the middle of the curve.
I think we are seeing a decade-long shift away from protective languages. (Michael Feathers has called it the “End of the Era of Paternalistic Languages.”) I’m enjoying it.
Good article Stuart, and a very important point.
I think the changes like CLJ-932 are in the right direction: silent failure is *much* worse than throwing exceptions and forcing people to write a couple of extra manual checks. Silent failure can lead to some very nasty bugs.
> We can’t do anything about the name of contains? without breaking a lot more code.
Of course we can; rename it to contains-key? and introduce an alias for backwards-compatibility. At some point you have all calls to contains? emit a warning, and then at a major version bump you remove it. Easy.