Threading with Style – Digital Digressions by Stuart Sierra

No, not multi-threading. I’m talking about Clojure’s threading macros, better known as “the arrows.”

The -> (“thread-first”) and ->> (“thread-last”) macros are small but significant innovations in Clojure’s Lisp-like syntax. They support a functional style of programming with return values and make composition intuitive. They answer the two chief complaints about Lisp syntax: too many parentheses and “unnatural” ordering.
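
To make the mechanics concrete, here is a minimal sketch of what the two macros do to their arguments. The symbols x, f, g, a, and b are placeholders invented for this illustration, and the forms are quoted, so nothing is evaluated. The single-step expansions shown assume a reasonably recent Clojure version.

```clojure
;; -> inserts the threaded value as the FIRST argument of each form.
(def thread-first-expansion
  (macroexpand-1 '(-> x (f a) (g b))))
;; => (g (f x a) b)

;; ->> inserts the threaded value as the LAST argument of each form.
(def thread-last-expansion
  (macroexpand-1 '(->> x (f a) (g b))))
;; => (g b (f a x))

;; A bare symbol is treated as a call with one argument.
(def bare-symbol-expansion
  (macroexpand-1 '(-> x f)))
;; => (f x)
```

Reading the expansions from the inside out shows the "unnatural" ordering that the arrows let you avoid.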

There’s something delightful about expressing a complex procedure as a neat sequence of operations with values threaded through it:

(-> input
    step-one
    step-two
    step-three
    result)

I love writing code like this, and I love reading it. It’s like a table of contents, a shining path guiding me through the rest of the code.

All too often, however, I see the threading macros used in places where they make the code less clear. Early in Clojure’s public existence, there was a fad for “point-free” programming, a kind of code golf without any local variables. This is a fun exercise, but it does not lead to readable code, and I’m relieved that the enthusiasm for this style has worn off.

What has remained is a tendency to use threading macros any place they can be made to fit. I think this is a mistake. Rather than making the code easier to read, it makes it harder, forcing the reader to mentally “unwind” the syntax to figure out what it does.

With that in mind, here are my recommendations for situations where you should use threading macros.

Navigation

Thread-first -> is ideal for navigating large, nested structures. The shape even suggests direction. The function calls should be short and simple, usually a mix of keywords, ordinals (first, nth), and lookups (get). For example:

(-> results :matches (nth 3) :scores (get "total_points"))
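
As a runnable sketch of the same navigation, the block below supplies a made-up results map (its shape is an assumption for illustration only) and compares the threaded form with the equivalent nested calls.

```clojure
;; Hypothetical data: a map of matches, each carrying a scores map.
(def results
  {:matches [{:scores {"total_points" 10}}
             {:scores {"total_points" 20}}
             {:scores {"total_points" 30}}
             {:scores {"total_points" 40}}]})

;; Threaded: read left to right, outermost structure first.
(def threaded
  (-> results :matches (nth 3) :scores (get "total_points")))

;; Nested: the same lookups, read from the inside out.
(def nested
  (get (:scores (nth (:matches results) 3)) "total_points"))

;; Both evaluate to 40.
```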

This works with Java objects too:

(-> results .getMatches (nth 3) .getScores (.getKey "total_points"))

Be aware, though, that if you mix Java methods and generic Clojure functions (like nth in the example above) you will lose type inference on the arguments and might introduce reflection warnings. For a (contrived) example:

(set! *warn-on-reflection* true)
(import (java.util Date))

(let [m {:now (Date.)}]
  (-> m :now .getTime))
;; Reflection warning ...
;; reference to field getTime can't be resolved.

You could add type-hints to eliminate the reflection:

(let [m {:now (Date.)}]
  ;; We cannot type-hint a keyword,
  ;; so wrap it in a list:
  (-> m ^Date (:now) .getTime))
;; No reflection warning

As even this small example shows, type hints interrupt the linear flow of ->, so they are better off in a let (see below).
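
One way to sketch the let-based alternative: bind the looked-up value to a type-hinted local, then call the Java method on it. The function name now-millis is hypothetical, and the hint assumes the value under :now is always a java.util.Date.

```clojure
(import (java.util Date))

(defn now-millis
  "Returns the epoch milliseconds of the Date stored under :now.
  Assumes (:now m) is a java.util.Date; the hint does not check this."
  [m]
  (let [^Date now (:now m)]   ; the hint sits on the local, outside any ->
    (.getTime now)))
```

With *warn-on-reflection* enabled, a hinted local like this should compile without a reflection warning, and the hint no longer interrupts a threaded expression.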

If you have only Java calls and you want to save a few characters, you can use .. instead of ->, which lets you omit the leading dot on the method names:

(.. results getMatches (getItem 3) getScores (getKey "total_points"))

The .. macro is nice for navigating deep object hierarchies or “builder” patterns in Java APIs.
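
For a self-contained illustration of the builder style, here is a small sketch using java.lang.StringBuilder from the JDK rather than the hypothetical results object above. It shows the same chain written with .. and with ->.

```clojure
;; With .. every step is a Java member, so no leading dots are needed.
(def built-with-dots
  (.. (StringBuilder.)
      (append "total")
      (append "_")
      (append "points")
      toString))

;; With -> each method name carries its own leading dot.
(def built-with-arrow
  (-> (StringBuilder.)
      (.append "total")
      (.append "_")
      (.append "points")
      .toString))

;; Both evaluate to "total_points".
```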

I don’t have strong opinions about whether you use .. or -> for interop: The .. macro clearly signals that everything which follows is a Java method, but the leading dots on the method names in -> do that just as well. (Why have both? The .. macro pre-dates the .method syntax for Java interop, so early versions of Clojure could not use Java methods in ->.)

Transformation

The second use case for -> is performing a series of functional transformations on a single value:

(-> username
    string/trim
    string/lower-case
    (string/replace #"[^a-z]" "-"))
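
Wrapped in a function, the transformation above might look like the following sketch. The name normalize-username is made up for this example, and the string alias is assumed to refer to clojure.string.

```clojure
(require '[clojure.string :as string])

(defn normalize-username
  "String in, String out: trim, lower-case, then replace every
  character outside a-z with a hyphen."
  [username]
  (-> username
      string/trim
      string/lower-case
      (string/replace #"[^a-z]" "-")))

(normalize-username "  Stuart Sierra ")
;; => "stuart-sierra"
```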

You might have noticed that most standard Clojure functions take the “primary thing” as their first argument. Arguments which control the behavior of the function come after. Maps are an excellent example:

(-> game-state
    (assoc :next-player :player2)
    (update :turn-counter inc)
    (update-in [:scores :player1] + 10)
    (update-in [:scores :player2] - 3))
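
To see the map-in, map-out flow with real values, here is the same chain applied to a sample game-state. The starting values are invented for illustration.

```clojure
;; Hypothetical starting state.
(def game-state
  {:next-player  :player1
   :turn-counter 4
   :scores       {:player1 20 :player2 15}})

(def next-state
  (-> game-state
      (assoc :next-player :player2)
      (update :turn-counter inc)
      (update-in [:scores :player1] + 10)
      (update-in [:scores :player2] - 3)))

;; next-state is
;; {:next-player :player2
;;  :turn-counter 5
;;  :scores {:player1 30 :player2 12}}
```

Every step receives a map and returns a map, so the value keeps the same shape from top to bottom.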

In general, I expect these transformations to start and end with data of the same or similar “type.” String in, String out. Map in, map out. It’s easy to imagine an object “flowing” through the -> if it keeps the same “shape.”

Changing types mid-stream is sometimes necessary, but try to avoid too many different types in a single ->. The exception to this rule is when you have to walk through several intermediate “types” to get to the one you want.

This contrived example converts a java.time.LocalDate into a DayOfWeek, then an integer:

(import (java.time LocalDate))

(let [date (LocalDate/of 2018 1 1)]
  (-> date
      .getDayOfWeek
      .getValue))

This is more like navigation through a type hierarchy, so the section above applies.

Sequences

Whereas most Clojure functions take the “main thing” as their first argument, the Sequence API functions take the sequence as their last argument. There are precedents for this in other languages, particularly cons in older Lisps. Since most Clojure sequence functions are lazy, you can think of them as “wrapping” the sequence in a transformation, such as map or filter. The introduction of transducers carries this design even further.

I believe the ->> (“thread-last”) macro should be used only with sequence functions:

(->> data
     (map :players)
     (mapcat :scores)
     (filter #(< 100 %))
     sort)

The last operation in ->> will often be something that collects the elements of the sequence into a singular result, such as reduce, into, or (for side effects) run!.

This follows the same rules as “Transformation,” above, keeping the sequence “shape” throughout.
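
Here is a small runnable sketch of a sequence pipeline, with and without a collecting step at the end, plus one possible transducer equivalent. The matches data is hypothetical and deliberately simpler than the data in the example above.

```clojure
;; Hypothetical input: each match carries a vector of scores.
(def matches
  [{:scores [50 120 300]}
   {:scores [99 101]}
   {:scores []}])

;; Sequence in, sequence out.
(def high-scores
  (->> matches
       (mapcat :scores)
       (filter #(< 100 %))
       sort))
;; => (101 120 300)

;; Same pipeline, ending in a step that collects a single result.
(def high-score-total
  (->> matches
       (mapcat :scores)
       (filter #(< 100 %))
       (reduce +)))
;; => 521

;; A transducer version of the same computation: the sequence is
;; still the last argument, but no intermediate lazy seqs are built.
(def high-score-total-xf
  (transduce (comp (mapcat :scores)
                   (filter #(< 100 %)))
             +
             matches))
;; => 521
```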

I don’t like reading code that uses ->> for anything other than sequences, because there’s rarely any other group of operations that consistently place the “main thing” at the end of their arguments. Just because a few calls happen to fit that pattern doesn’t necessarily make it a good use of ->>.

Don’ts

Navigation, transformation, and sequences: That’s it for the “do’s.” The rest are all “don’ts.”

Don’t mix -> and ->>

It might be tempting to do something like this:

;; Bad
(-> data
    :matches
    (->> (map :final-score)
         (reduce +)))

It even works. I often do things like this at the REPL, to explore large data structures. But I refactor before committing it to code. It’s just too much mental effort to read when the argument is flipping positions constantly.
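
One possible refactoring, sketched under the assumption that data is a map with a :matches sequence of maps: name the intermediate value in a let, then use a single ->> for the sequence work. The function name total-final-score is made up for this example.

```clojure
(defn total-final-score
  "Sums the :final-score of every match in data."
  [data]
  (let [matches (:matches data)]   ; navigation step, given a name
    (->> matches                   ; sequence step, one arrow only
         (map :final-score)
         (reduce +))))

(total-final-score {:matches [{:final-score 3} {:final-score 4}]})
;; => 7
```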

Don’t thread arithmetic

;; Bad
(-> celsius
    (* 9)
    (/ 5)
    (+ 32))

Maybe someone finds this readable. As far as I’m concerned, it might as well be FORTH.
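
For comparison, here is the same conversion as ordinary nested arithmetic. The function name celsius->fahrenheit is invented for this sketch.

```clojure
(defn celsius->fahrenheit
  "Converts a Celsius temperature to Fahrenheit: C * 9 / 5 + 32."
  [celsius]
  (+ 32 (/ (* celsius 9) 5)))

(celsius->fahrenheit 100)
;; => 212
```

The formula is visible at a glance, which is the point.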

Don’t use anonymous functions to change argument position

Never do this:

(-> results
    :matches
    (#(filter winning-match? %))   ; BAD
    (nth 3)
    :scores
    (get "total_points"))

This is a clever trick that happens to work, but it’s syntactically confusing. Instead, use a let to pull out intermediate values:

(let [wins (filter winning-match? (:matches results))]
  (-> wins
      (nth 3)
      :scores
      (get "total_points")))
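
For anyone wondering why the anonymous-function version needs its extra pair of parentheses, the expansion can be inspected directly. This sketch uses fn instead of the #() reader syntax so the output has no generated symbol names; x, y, and g are placeholders, and the single-step expansions assume a reasonably recent Clojure version.

```clojure
;; With the extra parentheses, the function form ends up in call
;; position and the threaded value becomes its argument.
(def with-extra-parens
  (macroexpand-1 '(-> x ((fn [y] (g y))))))
;; => ((fn [y] (g y)) x)

;; Without them, -> treats the fn form like any other list and
;; inserts x as its second element, which is not what was meant.
(def without-extra-parens
  (macroexpand-1 '(-> x (fn [y] (g y)))))
;; => (fn x [y] (g y))
```

The trick works only because of how the macro rewrites lists, which is part of what makes it confusing to read.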

The as-> macro is another way to work around this issue:

(-> results
    :matches
    (as-> matches (filter winning-match? matches))   ; not great
    (nth 3)
    :scores
    (get "total_points"))

I don’t dislike as->, but I don’t like it much either. as-> can save you from rearranging a lot of code, but I would caution against overusing it.

Update: This is not a good example, as I explain in a followup post about as->.

Don’t mix threading macros with other syntactic macros

Never do this:

;; BAD
(->> rdr
     line-seq
     (map parse-line)
     set
     (with-open [rdr (io/reader file)])
     (def lines))

Can you work out what that does? It took me three tries just to write it.

The threading macros are syntactic transformations. Don’t mix them with other syntax-like macros such as with-open or def.
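
A sketch of one way to untangle that example: keep with-open as ordinary syntax and confine the threading to the sequence steps inside it. The function name read-parsed-lines is hypothetical, and parse-line is passed in as an argument because the original does not define it.

```clojure
(require '[clojure.java.io :as io])

(defn read-parsed-lines
  "Reads every line from source, applies parse-line to each, and
  returns the results as a set. The set is built eagerly, so all
  lines are consumed before with-open closes the reader."
  [source parse-line]
  (with-open [rdr (io/reader source)]
    (->> rdr
         line-seq
         (map parse-line)
         set)))

;; Usage with an in-memory reader instead of a file:
(read-parsed-lines (java.io.StringReader. "a\nbb\na") count)
;; => #{1 2}
```

The result can then be bound with a plain def at the call site, outside any threading macro.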

Alternatives

What to do if you have something that almost fits neatly into a -> or ->>, but breaks one of these rules?

99% of the time, the answer is simple: use let. That’s what it’s for. Local symbols are an opportunity to add semantic meaning to your code by giving names to values.

The other 1% of the time, you can add or modify functions to make the process fit more neatly into the threaded pattern. If this is an easy change to make and it makes the code significantly more readable, then go for it. But don’t do it just to save 1 line from what would otherwise be a let.

Context Maps

I started this article with this example:

(-> input
    step-one
    step-two
    step-three
    result)

In one sense, this is just a variation on transformation. Each of the step-* functions takes a single map argument and returns the same map, possibly with some additional data assoc’d in. Mentally I call this the “context map.”

You could say it’s a simplified version of the Blackboard Pattern, with the context map serving as the “blackboard.”

I find this pattern useful when trying to make sense of long procedures that implement a lot of complex business logic. It shows up often in frameworks such as Liberator and Pedestal.

When you have code that produces a lot of intermediate results, and you’re not sure when you might need one of those results again, a context map is a good place to stash them. If you use namespace-qualified keywords in the map, then different namespaces can share the same context map without risk of clashing keys. (Coincidentally, Clashing Keys is the name of my new synth-pop band.)
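
Here is a minimal sketch of the context-map pattern, using an invented order-pricing example. The keys and the arithmetic inside each step are assumptions for illustration; only the names step-one, step-two, step-three, and result come from the opening example.

```clojure
;; Each step takes the context map and returns it with one more
;; namespace-qualified key assoc'd in.
(defn step-one [ctx]
  (assoc ctx ::subtotal (reduce + (::line-items ctx))))

(defn step-two [ctx]
  (assoc ctx ::tax (quot (::subtotal ctx) 10)))

(defn step-three [ctx]
  (assoc ctx ::total (+ (::subtotal ctx) (::tax ctx))))

;; The final step pulls the answer out of the context.
(defn result [ctx]
  (::total ctx))

(def input {::line-items [100 250 150]})

(def answer
  (-> input
      step-one
      step-two
      step-three
      result))
;; => 550
```

Every intermediate value (::subtotal, ::tax) stays in the map, so a later step can use it without any change to the steps in between.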

All that said, there are some aspects of this pattern that have always made me slightly uncomfortable. First, it’s easy to make mistakes, such as misspelling a keyword, that manifest as NullPointerExceptions. clojure.spec may help with this, but we’re still learning how best to apply it.

Second, and more critically, the functions that operate on these context maps often have implicit ordering constraints that are not expressed directly in the code. To modify code using this pattern, you have to be aware of the dependencies: which function adds a key to the context, and which functions use that key. I don’t have a good solution to this right now. Again, maybe clojure.spec will help.
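
The ordering problem is easy to reproduce. In this sketch (function names and keys are made up), add-tax silently depends on a key that add-subtotal provides, and nothing in the threaded form says so.

```clojure
(defn add-subtotal [ctx]
  (assoc ctx ::subtotal (reduce + (::prices ctx))))

;; Implicit dependency: requires ::subtotal to be present already.
(defn add-tax [ctx]
  (assoc ctx ::tax (quot (::subtotal ctx) 10)))

;; Correct order works.
(def good
  (-> {::prices [100 200]}
      add-subtotal
      add-tax))
;; (::tax good) => 30

;; Swapping the steps still compiles, but ::subtotal is nil when
;; add-tax runs, so the arithmetic throws at runtime.
(def bad
  (try
    (-> {::prices [100 200]}
        add-tax
        add-subtotal)
    (catch NullPointerException _
      :null-pointer-exception)))
;; bad => :null-pointer-exception
```

A misspelled keyword fails the same way, because looking up a missing key returns nil rather than signalling an error.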

Summary

So there you have it. Like many of Clojure’s “cool” features — lazy seqs, varargs, optional dynamic scope, polymorphism — the threading macros are powerful tools that should not be overused. Just because something can be written as -> or ->> doesn’t mean it should. As with anything else, the question to ask yourself is: Does this make the intention of the code more or less clear?

3 Replies to “Threading with Style”

Howard Lewis Ship says:

July 6, 2018 at 3:07 pm

Don’t mix threading macros with other syntactic macros

That’s a good rule; I’ve violated it in the past in my quest to thread all the things, and wasn’t sure why I was dissatisfied with the result.

There was a rule in another style guide that essentially said to end the thread when the basic type of the thing has changed; I’m not sure I follow that guideline as I prefer the narrative of changes over the challenge of coming up with intermediate names in a let.

Ben Hammond says:

July 6, 2018 at 6:32 pm

Why is it ok to mix (-> with (as-> in your above example?

Would it be better to use (as-> all the way through?
- Stuart says:
  
  July 7, 2018 at 3:02 pm
  
  I think as-> should be used only as a temporary step in a larger ->. The argument order of as->, with the binding symbol coming second, is designed to be used this way. Long chains of as-> are no better than a let with repeated bindings of the same symbol.
