Thoughts on Clojure Package Management

Update Sept. 3: Maven’s Not So Bad.

A lot of Ruby types come to Clojure and ask, “Where’s the package manager?” The answer is usually, “Maven or Ivy,” which isn’t really an answer.

I discussed this in the latter half of my Philly Lambda talk (PDF slides). The problem is that Clojure is built on Java, and any Clojure library that does something interesting is going to need some Java libraries beyond what the JDK provides.

Java has only one established dependency management system, Maven. (Ivy is an alternative, but it uses the Maven repositories.) Maven works, but it’s a big, complicated beast, built in the best giant-XML-configuration-file Java tradition. It’s also slow to accept new libraries into the public repositories. The central Maven 2 repository contains fewer than 700 libraries. Rubyforge, by contrast, lists over 8,000.

Maven seems to work well for large organizations that can benefit from setting up their own, private repositories, but it’s kind of a headache for the independent developer.

There’s a Clojure Maven plugin, some shell-based hacks like Corkscrew, and some Ivy-related code floating around, but none really provides what people want: one simple command to download and install all the dependencies for a project, without needing any XML.

What everyone wants, of course, is CPAN. Thousands of documented, tested modules for just about any task you could imagine, and quite a few you couldn’t (e.g., Acme::Buffy).

But CPAN was not created in a day. Most of its imitators (Rubygems, PEAR, Python Eggs) have failed to reach the same level of quality. Perl is also much older, and therefore more stable, than Python or Ruby. 10-year-old Perl code probably still works.

Part of CPAN’s success, I think, has to do with the environment in which it evolved. When Perl was the hot new language, running a web server was an expensive proposition. Even domain names weren’t cheap. If you were going to publish code on the web, there was a cost to doing so, either in time or money, so you wanted to make sure that it was worth publishing.

These days, when everyone has a blog and a Github account, sharing code is easy. Doing “git push” requires almost no thought, no investment of time. Why not release everything, even when it’s untested, undocumented, or unfinished?

So this weekend I started working on a package repository for Clojure. I modeled it after CPAN, but designed it to support anything that could be packaged in a JAR file, including compiled Java libraries and Clojure source code.

I got started. Then I thought, who would actually use this? Of the few dozen Clojure libraries that have been published on Github, only a handful are “production-ready.” Most aren’t even finished. Very few have been thoroughly tested. (I’m equally guilty in this regard.)

I concluded that it’s just too early. Clojure is scarcely two years old. It just released “1.0” this year, and is still developing rapidly. The libraries are evolving equally rapidly. If you want to build a project using, say, Compojure, the best way to do it is with Git submodules.

The one place a package manager would really be useful is in downloading and installing the standard Java packages that get used in almost every project, like the Apache Commons libraries. For this, Maven/Ivy works, if not brilliantly.
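
To make the Maven side a little more concrete, here is a rough sketch of how a set of Maven coordinates (group ID, artifact ID, version) maps onto a JAR location in a Maven 2 style repository. It is only an illustration, not how Maven or Ivy is implemented: the function name `jar_url` is made up, the default base URL is an assumption about where the central repository is served, and `commons-lang` 2.4 is just an example artifact.

```python
# Illustrative sketch: map Maven coordinates to a JAR URL in a
# Maven 2 style repository layout. The base URL and the example
# coordinates below are assumptions chosen for illustration.

def jar_url(group_id, artifact_id, version,
            base="https://repo1.maven.org/maven2"):
    """Build the conventional JAR location for a set of Maven coordinates."""
    # Dots in the group ID become directory separators.
    group_path = group_id.replace(".", "/")
    # The JAR file is named after the artifact ID and the version.
    file_name = f"{artifact_id}-{version}.jar"
    return f"{base}/{group_path}/{artifact_id}/{version}/{file_name}"


if __name__ == "__main__":
    # Example coordinates for an Apache Commons library.
    print(jar_url("commons-lang", "commons-lang", "2.4"))
```

A tool built on this convention still has to read each library's POM to find its own dependencies, which is the part that Maven and Ivy actually handle.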

Update: another Maven helper: Clojure-POM

11 thoughts on “Thoughts on Clojure Package Management”

  1. > some shell-based hacks like Corkscrew

    The fact that corkscrew shells out is a temporary implementation detail due to the fact that I haven’t figured out what to make of the plexus classworlds framework that using the Maven Java API requires. But the idea has always been to call it through the Java API. It actually currently does provide “what people want: one simple command to download and install all the dependencies for a project, without needing any XML.”

    Removing the shell-out hackiness in corkscrew will simply mean it can run on systems that don’t have Maven already installed. But it works right now.

  2. Though I think you’re right in general. I wrote Corkscrew because it was easy (it’s about 200 LOC), but sinking a lot of effort into a Clojure-specific solution probably wouldn’t offer too much benefit over the current state of the art.

    As long as we can get people to quit including jar files in the lib/ directory of their repositories I’m happy. =)

  3. Sorry, Phil, I actually meant nothing ill by “shell-based.” I think Corkscrew is quite clever at what it does. My real issue is that it pulls directly from source code repositories. There’s no continuity if a project disappears from Github. (Rubygems has the same problem.) Also, there are no numbered releases or multi-level dependencies. As I wrote, that’s more a fault of the libraries themselves at the moment.

  4. Phil wrote: “As long as we can get people to quit including jar files in the lib/ directory of their repositories I’m happy. =)”

    But that’s just it: everyone does this. Even big, BIG Java projects like Hadoop have JAR files in their version control systems. And Hadoop and Maven are both Apache projects! Automated dependency management for Java libraries has failed.

  5. We use clojure-pom and maven. It took a couple hours to set up and it works. Introducing a new non-maven package manager into clojure is an uphill battle IMO.

  6. Stuart: yeah, if you can avoid using source repos directly then you’ll be in better shape, but sometimes you just don’t have an option given the state of libraries. Maybe if corkscrew or some other tool made it easier to package Clojure projects up as Maven artifacts we would see more libraries publishing them.

    It’s true that you don’t get transitive dependencies with source repos; you need a full-fledged dependency system for that. But git can work with tags instead of arbitrary revisions, so you do have the option to use numbered releases if they are present.

    The other problem is that lots of people see the word “maven” and run away screaming with their hair on fire. The corkscrew readme is pretty explicit about the fact that you don’t need to write any XML, but multiple people have told me that they simply stopped reading it as soon as they saw the word “maven”, so that sucks.

  7. I should make that a little clearer: “It took me an initial few days to figure out clojure + maven.” After that initial investment, new projects take only seconds to set up using “mvn archetype” and clojure-pom. They all build the exact same way and the dependency management is exact.

  8. Even if you don’t have “official production ready code” sharing it in a way that can resolve dependencies is very useful. IMHO we need to stop re-inventing package management for every single language and every single operating system distribution. We should all just pick something good like nix http://nixos.org/index.html
    Speaking of re-use, I wonder if you can use rubygems given that it works with jRuby.

  9. Greg wrote: “we need to stop re-inventing package management for every single language and every single operating system distribution.”

    I agree. But package/dependency management is one of those things that’s incredibly hard to get right. Witness all the attempts in the Linux world: rpm, yum, apt-get, yast, pkgtool, …. None of them is perfect. Then you’ve got one for every scripting language ever invented. Nix is pretty cool, but Unix-specific, and anything built for the JDK should work on any JDK platform.

    Using Rubygems sounds … interesting. I’m not enthusiastic about Rubygems’ dependency resolution, and using it would require loading the entire JRuby runtime.

    The long and short of it is, as Tim said, we’re stuck with Maven. Kudos to the first person who wraps Maven, including ALL the packaging and deployment features, in a simple Clojure interface.

  10. I agree that git submodules are rather useful, but I don’t think they’re a replacement for a good package manager. In my view, Clojure has reached the stage where there are enough third-party libraries that it would be useful to start thinking about package management.

    I’m a little biased, though, as I’ve started working on a Clojure package manager called Capra (http://github.com/weavejester/capra). It’ll probably be a few months before the project is in a usable state, but I’m hoping it’ll be unique enough to justify another re-invention of package management.

  11. James wrote: “I’m hoping it’ll be unique enough to justify another re-invention of package management.”

    I wish you luck, but I have my doubts that there’s anything unique left to be done here.
