List Processing and the Efficiency of CONS

One big difference between Lisp and most other programming languages is its use of recursion instead of iteration.

So while I was working on some text-parsing code, I fell in to this simple pattern:

(defun process (list)
  (if list
      (cons (do-something (first list))
	    (process (rest list)))))

Ah, the joys of Lisp-2.

If do-something were actually a function, this example would be better written:

(mapcar #'do-something list)

But the CL Hyperspec explicitly states, “For list traversal operations, the cdr chain of the list is not allowed to be destructively modified.” Stepping through the list myself means I can modify its structure along the way. I can insert new CONS cells into the list, or even rebind list to something else. This makes it easy to convert a flat list like this:

  • Heading 1
  • Paragraph
  • Paragraph
  • Heading 2
  • Paragraph
  • Paragraph

Into a structured tree like this:

  • Heading 1
    • Paragraph
    • Paragraph
  • Heading 2
    • Paragraph
    • Paragraph

Here I encounter the Lisper’s Dilemma: to CONS or not to CONS. Articles and books about Lisp frequently say “CONS is expensive, avoid it where possible.” The pattern above does a lot of CONSing, effectively rebuilding its input. But when I tried to modify my code to use destructive list operations like rplaca and nconc I quickly ran into problems. Destructive operations are tricky at the best of times. Using them while experimenting with new code is just plain asking for trouble. The CONSing code I ended up with was plenty fast enough for my purposes.

I think the constant repetition of “CONS is expensive” is disingenuous for Lisp, especially when it comes to teaching new programmers about its advantages. Yes, CONS is computationally inefficient, but its benefits in readability and maintainability make it is very programmer-efficient.

10 thoughts on “List Processing and the Efficiency of CONS”

  1. Lisp doesn’t use recursion, people do. Part of the trick of programming in any language is learning what is available and when it’s appropriate to employ it. Fortunately, CL gives you a lot of options.

  2. In Exscribe, I have a macro to recurse through the user-constructed tree of text and tags, side-effecting the structure as I go. It’s very effective to rewrite the Scribe-like tree into HTML.

  3. Do articles and books really say “CONS is expensive”, let alone “avoid [CONS] where possible”? I can believe that they say that “consing” is potentially undesireable; the Lisp community uses the verb “cons” to mean heap-allocation, and typical lisp implementations with their garbage collectors will suffer latency at some later point when enough heap allocation has been done to trigger a collection; for time-critical or low-latency code, these collections can cause trouble.

    As you note, there’s no need to optimize code beyond its speed requirements, and certainly no need to do so at the expense of clarity or correctness. But I don’t think that anyone is actually advocating that, and it’s a little disingenuous to suggest that they are.

  4. Another thing is that in Lisp, to “cons” means to allocate memory, not necessarily by means of calling the CONS function. Actually, calling other functions, like MAKE-ARRAY will usually “cons” a lot more.

    And my advice: do things the functional way and create new lists — it will save you a lot of debugging time.

  5. Christophe Rhodes Says: Do articles and books really say “CONS is expensive”, let alone “avoid [CONS] where possible”?

    I can only find one specific example right now, which gives the “eleventh commandment” for Lisp programmers as “Thou shalt not cons in vain.” I think it’s an impression I got from newsgroup and mailing list discussions. One frequently hears “consing is expensive,” which can lead the less-experienced Lisp programmer (e.g. me) to believe “consing is bad.”

  6. These days, with modern generational garbage collectors, you shouldn’t have to worry about consing. I rarely give it a second thought.

  7. Right – “shalt not cons in vain.” That’s hardly a condemnation of CONS (is it ever, in fact, good to do something in vain?).

    CONS is not “computationally inefficient.” At least, I don’t think it is, since it’s not clear what that phrase is supposed to mean. CONS is a totally efficient function, but calling it unnecessarily will lead to a more painful GC. But this is only a problem when it’s a problem. Avoiding CONS “just because” is silly.

    I find simple recursive patterns like PROCESS repulsive. Every time I look at a function like PROCESS, I have to walk through it in my head and then realize that, “Oh, it’s just an instance of Pattern X. I’ve seen this a million times.” I would try to use LOOP here, so that it’s immediately apparent what you’re doing.

  8. As in any programming language, you should “just write” the code first, then profile. Often (not always, but often) you’ll find some function that’s consing (not necessarily literally `cons’) some huge amount, possibly non-obviously, and you can get a dramatic speedup just by changing that one spot.

    The advice isn’t unique to Lisp. The difference is that in Lisp this particular issue happens to be one of the easier ways to find and get improvements.
    And there’s more awareness (with a catchy name) for this issue in Lisp.

  9. Recursion in Lisp should be used with caution. Most of the Lisp implementation do not support tail recursion by default so you more likely end up with stack overflow on long lists. loop macro and other iteration macros should be used in Lisp.

Comments are closed.