Apathy of the Commons

Eight years ago, I filed a bug on an open-source project.

HADOOP-3733 appeared to be a minor problem with special characters in URLs. I hadn’t bothered to examine the source code, but I assumed it would be an easy fix. Who knows, maybe it would even give some eager young programmer the opportunity to make their first contribution to open-source.

I moved on; I wasn’t using Hadoop day-to-day anymore. About once a year, though, I got a reminder email from JIRA when someone else stumbled across the bug and chimed in. Three patches were submitted, with a brief discussion around each, but the bug remained unresolved. A clumsy workaround was suggested.

Linus’s Law decrees that Given enough eyeballs, all bugs are shallow. But there’s a correlary: Given enough hands, all bugs are trivial. Which is not the same as easy.

The bug I reported clearly affected other people: It accumulated nine votes, making it the fourth-most-voted-on Hadoop ticket. And it seems like something easy to fix: just a simple character-escaping problem, a missed edge case. A beginning Java programmer should be able to fix it, right?

Perhaps that’s why no one wanted to fix it. HADOOP-3733 is not going to give anyone the opportunity to flex their algorithmic muscles or show off to their peers. It’s exactly the kind of tedious, persistent bug that programmers hate. It’s boring. And hey, there’s an easy workaround. Somebody else will fix it, right?

Eventually it was fixed. The final patch touched 12 files and added 724 lines: clearly non-trivial work requiring knowledge of Hadoop internals, a “deep” bug rather than a shallow one.

One day later, someone reported a second bug for the same issue with a different special character.

If there’s a lesson to draw from this, it’s that programming is not just hard, it’s often slow, tedious, and boring. It’s work. When programmers express a desire to contribute to open-source software, we think of grand designs, flashy new tools, and cheering crowds at conferences.

A reward system based on ego satisfaction and reputation optimizes for interesting, novel work. Everyone wants to be the master architect of the groundbreaking new framework in the hip new language. No one wants to dig through dozens of Java files for a years-old parsing bug.

But sometimes that’s the work that needs to be done.

* * *

Edit 2016-07-19: The author of the final patch, Steve Loughran, wrote up his analysis of the problem and its solution: Gardening the Commons. He deserves a lot of credit for being willing to take the (considerable) time needed to dig into the details of such an old bug and then work out a solution that addresses the root cause.