Monthly Archives: April 2008

A Million Little Files

My PC-oriented brain says it’s easier to work with a million small files than one gigantic file. Hadoop says the opposite — big files are stored contiguously on disk, so they can be read/written efficiently. UNIX tar files work on … Continue reading
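As a rough illustration of turning many small files into one big one, here is a minimal sketch using Python's standard `tarfile` module. The function names `pack_directory` and `list_members` are made up for this example, and it says nothing about how Hadoop itself stores or reads the result.

```python
import os
import tarfile


def pack_directory(source_dir, archive_path):
    """Bundle every regular file in source_dir into one uncompressed tar file.

    Returns the number of files that were added.
    """
    count = 0
    with tarfile.open(archive_path, "w") as archive:
        # Sort the names so the archive layout is deterministic.
        for name in sorted(os.listdir(source_dir)):
            path = os.path.join(source_dir, name)
            if os.path.isfile(path):
                # arcname drops the directory prefix inside the archive.
                archive.add(path, arcname=name)
                count += 1
    return count


def list_members(archive_path):
    """Return the member names stored in the tar file, in archive order."""
    with tarfile.open(archive_path, "r") as archive:
        return archive.getnames()
```

Calling `pack_directory("small_files", "bundle.tar")` would write a single archive that can then be read back sequentially; both paths are placeholders.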

Posted in Programming | 20 Comments

The Great Database Rewrite

I just discovered the paper “The End of an Architectural Era (It’s Time for a Complete Rewrite),” about redesigning database software from the ground up. It contains some unsurprising predictions — “the next decade will bring domination by shared-nothing computer … Continue reading

Posted in Programming | Leave a comment

Power At Your Fingertips

I just ran my first Amazon EC2 instance.  Kind of a heady feeling, having nearly unlimited computing power just a few keystrokes away.  I got the same feeling the first time I logged in as root on a dedicated web … Continue reading

Posted in Programming | Leave a comment

There Is No Database

I think I’m starting to get a handle on how Hadoop is supposed to work. The MapReduce model isn’t what troubles me.  The mind-bending part is that there is no database. Everything happens by scanning big files from beginning to … Continue reading
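To make the "scan everything" idea concrete, here is a toy sketch of the map and reduce steps in plain Python. It is not Hadoop's API; every function name below is hypothetical, and the whole thing runs in memory on a list of lines rather than on big files.

```python
from collections import defaultdict


def map_phase(lines, mapper):
    """Scan every line once, in order, and emit (key, value) pairs."""
    for line in lines:
        yield from mapper(line)


def reduce_phase(pairs, reducer):
    """Group the emitted values by key, then reduce each group to one result."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


def count_words(line):
    """Example mapper: emit (word, 1) for every word on the line."""
    for word in line.split():
        yield word.lower(), 1


def total(key, values):
    """Example reducer: add up the counts collected for one word."""
    return sum(values)


def word_counts(lines):
    """Answer 'how often does each word occur?' with one full scan, no index."""
    return reduce_phase(map_phase(lines, count_words), total)
```

There is no lookup by key anywhere: the only way to answer the question is to read the input from the first line to the last.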

Posted in Programming | 1 Comment

Disk is the New Tape

An interesting scenario from Doug Cutting: Say you have a terabyte of data, on a disk with 10ms seek time and 100MB/s max throughput. You want to update 1% of the records. If you do it with random-access seeks, it … Continue reading
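To put rough numbers on that scenario, here is a back-of-the-envelope sketch in Python. The terabyte of data, 10 ms seek time, 100 MB/s throughput and 1% update rate come from the scenario above. The 100-byte record size and the default of one seek per update are assumptions added for this example, not part of the excerpt, so the results are only illustrative.

```python
TERABYTE = 10**12  # bytes
MEGABYTE = 10**6   # bytes


def random_update_seconds(total_bytes, record_bytes, update_percent,
                          seek_ms, seeks_per_update=1):
    """Time spent seeking when each updated record is reached by random access.

    record_bytes and seeks_per_update are assumed values; transfer time for
    the record itself is ignored because the seek dominates.
    """
    records = total_bytes // record_bytes
    updates = records * update_percent // 100
    return updates * seeks_per_update * seek_ms / 1000


def sequential_rewrite_seconds(total_bytes, bytes_per_second):
    """Time to stream the whole data set once for reading and once for writing."""
    return 2 * total_bytes / bytes_per_second


random_s = random_update_seconds(TERABYTE, record_bytes=100,
                                 update_percent=1, seek_ms=10)
sequential_s = sequential_rewrite_seconds(TERABYTE, 100 * MEGABYTE)

print(f"random access: {random_s / 86400:.1f} days")
print(f"sequential rewrite: {sequential_s / 3600:.1f} hours")
```

Under these assumptions the random-access path costs about 11.6 days for every seek each update needs, while reading and rewriting the entire terabyte sequentially takes about 5.6 hours.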

Posted in Programming | 4 Comments

Continuous Integration for Data

As I told a friend recently, I’m pretty happy with the front-end code of AltLaw.  It’s just a simple Ruby on Rails app that uses Solr for search and storage.  The code is small and easy to maintain. What I’m … Continue reading

Posted in Programming | Leave a comment

Privacy, Open Access, and the Law

Since we started putting court cases on the interwebs, first with Project Posner and then with AltLaw, we’ve had the occasional angry email from someone who Googles himself/herself and finds a court case from 20 years ago that reveals embarrassing … Continue reading

Posted in Uncategorized | Leave a comment

Going Non-Linear

I recently read the expression “going non-linear” describing a person, where most people would say something like “going nuts.”  Incredibly geeky; I like it.

Posted in Programming | Leave a comment

Useful Reminders

Epigrams on Programming

Posted in Programming | Leave a comment

The Problem With Common Lisp

… as explained by Sir Kenny:

From: Ken Tilton
Newsgroups: comp.lang.lisp
Date: Tue, 01 Apr 2008 14:53:07 -0400
Subject: Re: Newbie FAQ #2: Where’s the GUI?

Jonathan Gardner wrote:
> I know this is a FAQ, but I still don’t … Continue reading

Posted in Programming | Leave a comment