I’ve been thinking about some of the tools that I use every day, and about the different design philosophies they reflect.
First and foremost, Git. We use Git on every single project, internal and external. Git is a great example of the Unix design philosophy: many small programs — 153 of them by my count — each of which does exactly one thing and does it well. But this is not “loose coupling.” The components of Git are tightly integrated: they all depend on the same repository structure and file formats.
One of the nice things about Git is how its internals are both exposed for the world to see and thoroughly documented. We can easily write scripts to automate common tasks or create different workflows. With a bit more effort, we could even write new tools that integrate with the Git suite. These tools can do things that Git’s authors never intended, as long as they follow the documented repository structure. Git isn’t so much a version control system as the means to construct one.
Still, Git is one project with many components, not many separate projects. All 153 executables in Git are governed by a single release cycle, tested and known to work together. We never have to worry about incompatible versions of, say,
git-merge on the same machine. Older versions of Git can read repositories created with newer versions even if they don’t provide all the same features.
In stark contrast to Git, we have tools from the Java world like Ant and Maven. The JVM cannot fork/exec, so the many-small-programs design is a non-starter. Instead, the Java tools usually favor some sort of plug-in architecture, which is a great idea in theory but hard to get right in practice.
I’ve tried writing a Maven plugin. Hacking up a one-off for a single project is not too difficult, but designing a general-purpose plugin that works everywhere is maddeningly complicated. Maven plugins are just Java code, so they can do whatever they want, but the APIs for interacting with the rest of the Maven system are woefully underdocumented. The contract of a Maven plugin, what it can and cannot do, is not well-defined. The internals of Maven itself are largely a black box.
The core Maven plug-ins have independent release cycles, so there is the possibility for unexpected incompatibilities, but I’ve never encountered such. On the whole, the Maven ecosystem is quite stable. The struggle comes once you venture outside the realm of what the standard plugins provide. Maven plugins are not designed to be composed, so adding new capabilities is rarely as simple as scripting plugins that already exist. You have to start from scratch every time.
Ruby / Rubygems / RVM / Bundler
Finally, we have tools from the Ruby world, the ever-changing cornucopia of Ruby implementations, libraries, and tools to manage it all. The problem with the Ruby tools is that they are both tightly-coupled and uncoordinated. Despite having separate tools for each task, each tool reaches into at least one of the others: Rubygems modifies the behavior of the Ruby interpreter, Bundler modifies the behavior of Rubygems, RVM modifies the behavior of the shell, and so on. Each one adds another layer of indirection, making debugging harder.
All of the Ruby development tools have independent release cycles, and they don’t seem to plan or coordinate with one another in advance of each release. Integration testing is left up to the users.
I admire the speed and eagerness with which the Ruby community produces new tools. But on almost every Ruby project I’ve worked on, we’ve spent hours or days sorting out incompatibilities among some combination of libraries, language implementations, and development tools. Our internal mailing list is littered with advice like “Don’t use Bundler version X with RVM version Y.” The speed of development comes with its own cost.
So what do I take from all this? Just a few principles to keep in mind when writing software tools:
- Plan for integration
- Rigorously specify the boundaries and extension points of your system
- Do not depend on unspecified behavior
And a couple of ideas if you’re starting a new project from scratch:
- The filesystem is the universal integration point
- Fork/exec is the universal plugin architecture
Update 8/31: More comments at Hacker News.