Archive for May, 2008

EC2 Authorizations for Hadoop

Wednesday, May 14th, 2008

I just did my first test run of a Hadoop cluster on Amazon EC2. It’s not as tricky as it appears, although I ran into some snags, which I’ll document here. I also found these pages helpful: the EC2 page on the Hadoop Wiki, and manAmplified.
First, make sure the EC2 API tools are installed and on your [...]

Stop Your Java SAX Parser from Downloading DTDs

Thursday, May 8th, 2008

Back in February, in a slightly plaintive post, the W3C sysadmins asked that people stop hammering their servers with requests for XHTML DTDs. Everyone said yes, this is a stupid problem that wouldn’t have happened if a) the XML spec were less dumb, or b) XML libraries were less dumb.
After that post, I spent [...]

We Don’t Know How We Program

Thursday, May 8th, 2008

Paul Johnson, in the U.K., wrote a piece about how there is no known “process” for programming. At some point, all the theory and methodology go out the window and someone has to sit down, think about the problem, and write some code.
I’m sure I won’t be the only one to suggest this, but I [...]

Calling Java Constructors with this()

Monday, May 5th, 2008

The things I don’t know about Java… could fill a book. Here’s a new one, from the Hadoop sources:

public ArrayWritable(Class valueClass) {
    // …
}

public ArrayWritable(Class valueClass, Writable[] values) {
    this(valueClass);
    this.values = values;
}

The second constructor uses the syntax this(arg) to call a different constructor, then follows with initialization code [...]
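For anyone who wants to try the pattern outside of Hadoop, here is a minimal, self-contained sketch. The class LabeledValues and its fields are made up for illustration and are not part of the Hadoop sources. In classic Java the this(...) call has to be the first statement in the constructor body.

```java
import java.util.Arrays;

// Hypothetical class for illustration only; not taken from Hadoop.
public class LabeledValues {
    private final String label;
    private int[] values;

    // Base constructor: sets the label and starts with no values.
    public LabeledValues(String label) {
        this.label = label;
        this.values = new int[0];
    }

    // Delegates to the one-argument constructor via this(label),
    // then continues with its own initialization.
    public LabeledValues(String label, int[] values) {
        this(label);
        this.values = values;
    }

    public String describe() {
        return label + " " + Arrays.toString(values);
    }

    public static void main(String[] args) {
        System.out.println(new LabeledValues("empty").describe());
        System.out.println(new LabeledValues("primes", new int[] {2, 3, 5}).describe());
    }
}
```

Both constructors share the label setup, so a change to that logic only has to be made in one place.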

Astronauts Without Mission Control

Thursday, May 1st, 2008

Joel Spolsky complains that architecture astronauts are taking over at big, rich companies like Google and Microsoft, pushing out elaborate architectural systems that don’t solve actual problems.
He’s right in that smart, technical people like to take on any large, abstract problem that is, as he puts it, “a fun programming exercise that you’re doing because [...]