Design Philosophies of Developer Tools

I’ve been thinking about some of the tools that I use every day, and about the different design philosophies they reflect.

Git

First and foremost, Git. We use Git on every single project, internal and external. Git is a great example of the Unix design philosophy: many small programs — 153 of them by my count — each of which does exactly one thing and does it well. But this is not “loose coupling.” The components of Git are tightly integrated: they all depend on the same repository structure and file formats.

One of the nice things about Git is how its internals are both exposed for the world to see and thoroughly documented. We can easily write scripts to automate common tasks or create different workflows. With a bit more effort, we could even write new tools that integrate with the Git suite. These tools can do things that Git’s authors never intended, as long as they follow the documented repository structure. Git isn’t so much a version control system as the means to construct one.
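
To make the point about exposed internals concrete, here is a minimal Python sketch (not part of Git) of how a script can compute the ID Git assigns to a file's contents and store those contents as a loose object. It assumes a repository using the default SHA-1 object format. The function names `blob_id` and `write_loose_blob` are made up for illustration.

```python
import hashlib
import zlib
from pathlib import Path


def blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git uses for a blob with this content."""
    # A blob is hashed as the header "blob <size>\0" followed by the raw bytes.
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def write_loose_blob(objects_dir: Path, content: bytes) -> Path:
    """Store the blob as a loose object under objects_dir (hypothetical helper)."""
    oid = blob_id(content)
    # Loose objects live at <objects>/<first 2 hex chars>/<remaining 38 chars>.
    path = objects_dir / oid[:2] / oid[2:]
    path.parent.mkdir(parents=True, exist_ok=True)
    # The file holds the zlib-compressed header plus content.
    path.write_bytes(zlib.compress(b"blob %d\0" % len(content) + content))
    return path
```

For a file containing the single line "hello world", `blob_id(b"hello world\n")` should match what `git hash-object` prints for the same file. Nothing here calls Git itself; the script only follows the documented object format.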

Still, Git is one project with many components, not many separate projects. All 153 executables in Git are governed by a single release cycle, tested and known to work together. We never have to worry about incompatible versions of, say, git-branch and git-merge on the same machine. Older versions of Git can read repositories created with newer versions even if they don’t provide all the same features.

Maven

In stark contrast to Git, we have tools from the Java world like Ant and Maven. The JVM cannot fork itself, and starting a fresh JVM for every small task is slow, so the many-small-programs design is a non-starter. Instead, the Java tools usually favor some sort of plugin architecture, which is a great idea in theory but hard to get right in practice.

I’ve tried writing a Maven plugin. Hacking up a one-off for a single project is not too difficult, but designing a general-purpose plugin that works everywhere is maddeningly complicated. Maven plugins are just Java code, so they can do whatever they want, but the APIs for interacting with the rest of the Maven system are woefully underdocumented. The contract of a Maven plugin, what it can and cannot do, is not well-defined. The internals of Maven itself are largely a black box.

The core Maven plugins have independent release cycles, so there is the possibility of unexpected incompatibilities, but I’ve never encountered any. On the whole, the Maven ecosystem is quite stable. The struggle comes once you venture outside the realm of what the standard plugins provide. Maven plugins are not designed to be composed, so adding new capabilities is rarely as simple as scripting plugins that already exist. You have to start from scratch every time.

Ruby / Rubygems / RVM / Bundler

Finally, we have tools from the Ruby world, the ever-changing cornucopia of Ruby implementations, libraries, and tools to manage it all. The problem with the Ruby tools is that they are both tightly coupled and uncoordinated. Despite having separate tools for each task, each tool reaches into at least one of the others: Rubygems modifies the behavior of the Ruby interpreter, Bundler modifies the behavior of Rubygems, RVM modifies the behavior of the shell, and so on. Each one adds another layer of indirection, making debugging harder.

All of the Ruby development tools have independent release cycles, and they don’t seem to plan or coordinate with one another in advance of each release. Integration testing is left up to the users.

I admire the speed and eagerness with which the Ruby community produces new tools. But on almost every Ruby project I’ve worked on, we’ve spent hours or days sorting out incompatibilities among some combination of libraries, language implementations, and development tools. Our internal mailing list is littered with advice like “Don’t use Bundler version X with RVM version Y.” The speed of development comes with its own cost.

Thoughts

So what do I take from all this? Just a few principles to keep in mind when writing software tools:

  1. Plan for integration
  2. Rigorously specify the boundaries and extension points of your system
  3. Do not depend on unspecified behavior

And a couple of ideas if you’re starting a new project from scratch:

  1. The filesystem is the universal integration point
  2. Fork/exec is the universal plugin architecture
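
The two ideas above can be sketched in a few lines of Python. This is only an illustration, assuming a hypothetical tool named `mytool` that, in the style of Git's `git-<name>` convention, treats any executable called `mytool-<subcommand>` found on the search path as a plugin. The function names and the exit code 127 for a missing plugin are assumptions made for the example, not part of any real tool.

```python
import shutil
import subprocess
import sys
from typing import List, Optional


def find_plugin(tool: str, subcommand: str,
                search_path: Optional[str] = None) -> Optional[str]:
    """Return the path of the executable `<tool>-<subcommand>`, or None."""
    # The filesystem is the registry: installing a plugin means dropping an
    # executable into a directory on the search path (PATH by default).
    return shutil.which(f"{tool}-{subcommand}", path=search_path)


def run_plugin(tool: str, subcommand: str, args: List[str],
               search_path: Optional[str] = None) -> int:
    """Run the plugin as a child process and return its exit code."""
    exe = find_plugin(tool, subcommand, search_path)
    if exe is None:
        print(f"{tool}: '{subcommand}' is not a {tool} command", file=sys.stderr)
        return 127  # assumed convention for "command not found"
    # The plugin runs in its own process, so it can be written in any
    # language and cannot touch the dispatcher's in-memory state.
    return subprocess.run([exe, *args]).returncode
```

With this in place, `run_plugin("mytool", "hello", ["--loud"])` would run an executable named `mytool-hello` if one exists on the PATH. The only contract between the dispatcher and a plugin is the naming convention, the argument list, and the exit code.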

Update 8/31: More comments at Hacker News.

5 thoughts on “Design Philosophies of Developer Tools”

  1. There’s a quote from Alan Perlis I keep reading on all the Clojure blogs about the number of functions that operate on a data structure. It seems directly applicable to Git. Admittedly, Git has more than one data structure: I would consider a “diff” a data structure in addition to the internal changeset representation. But I’m not sure I would label its design tight coupling, a term with a highly negative connotation that is usually used to describe one module’s dependence on another. Rather, all of Git’s “modules” seem to depend on a few data structures, and the higher-level commands (porcelain) freely mix and match lower-level commands (plumbing) to provide complex functionality. And most of these porcelain commands input or output “diffs”. It’s a really interesting architecture, which seems loosely coupled to me. I’d love to read your thoughts on this. Great post, and thanks for writing it.

  2. Wow, I love your last two points. One question, though. Given that the JVM has no fork, how does your universal plugin comment fit with Clojure?

  3. Christian said: “Given that the JVM has no fork, how does your universal plugin comment fit with Clojure?”

    It doesn’t. And that’s the problem with every extant build management tool for Clojure.

    Clojure does have first-class functions, which are easier to compose than any plugin architecture. But Clojure functions don’t provide a way to isolate different environments, making them less appropriate for something like a build tool.
