Three Kinds of Error

Warning! This post contains strong, New York City-inflected language. If you are discomfited or offended by such language, do not read further …

further …

further …

further …

This is about three categories of software error. I have given them catchy names for purposes of illustration. The three kinds of error are the Fuck-Up, the Oh, Fuck, and the What the Fuck?.

One

The Fuck-Up is a simple programmer mistake. In prose writing, it would be called a typo. You misspelled the name of a function or variable. You forgot to include all the arguments to a function. You misplaced a comma, bracket, or semicolon.

Fuck-Up errors are usually caught early in the development process and very soon after they are written. You made a change, and suddenly your program doesn’t work. You look back at what you just wrote and the mistake jumps right out at you.

Statically-typed languages can often catch Fuck-Ups at compile time, but not always. The mistake may be syntactically valid but semantically incorrect, or it may be a literal value such as a string or number which is not checked by the compiler. I find that one of the more insidious Fuck-Ups occurs when I misspell the name of a field, property, or keyword. This is more common in dynamically-typed languages that use literal keywords for property accesses, but even strongly-typed Java APIs sometimes use strings for property names. Compile-time type checkers cannot save you from all your Fuck-Ups.
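To make the string-key case concrete, here is a minimal Java sketch. The settings map and its key names are invented for illustration and do not come from any real API. The misspelled lookup compiles without complaint and quietly returns null.

```java
import java.util.HashMap;
import java.util.Map;

class StringKeyTypo {
    // Hypothetical constant: naming the key once means a misspelled reference
    // to TIMEOUT fails to compile, unlike a misspelled string literal.
    static final String TIMEOUT = "timeout";

    // Hypothetical settings map; the key names are invented for this example.
    static Map<String, String> settings() {
        Map<String, String> m = new HashMap<>();
        m.put("timeout", "30");
        return m;
    }

    // Compiles fine, but the key is misspelled, so the lookup returns null.
    static String misspelledLookup() {
        return settings().get("timeuot");
    }

    // Looks the value up through the shared constant instead of a literal.
    static String constantLookup() {
        return settings().get(TIMEOUT);
    }

    public static void main(String[] args) {
        System.out.println(misspelledLookup()); // prints "null"
        System.out.println(constantLookup());   // prints "30"
    }
}
```

A shared constant does not eliminate the problem (the literal can still be wrong in its one definition), but it shrinks the number of places a typo can hide.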

I’ve occasionally wished for a source code checker that would look at all syntactic tokens in my program and warn me whenever I use a token exactly once: that’s a good candidate for a typo. Editors can help: even without the kind of semantic auto-completion found in Java IDEs, I’ve found I can avoid some misspellings by using auto-completion based solely on other text in the project.
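A rough sketch of such a checker, in Java, might look like the following. It assumes a very crude definition of "token" (identifier-like words matched by a regular expression); a real tool would use the language's own lexer and ignore comments and string contents. The class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SingleUseTokens {
    // Crude assumption: a token is any identifier-like word.
    private static final Pattern WORD = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    // Returns the words that appear exactly once, in order of first appearance.
    static List<String> usedOnce(String source) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        Matcher m = WORD.matcher(source);
        while (m.find()) {
            counts.merge(m.group(), 1, Integer::sum);
        }
        List<String> once = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() == 1) {
                once.add(e.getKey());
            }
        }
        return once;
    }

    public static void main(String[] args) {
        String source = "price = 10; tax = 2; total = price + tax; print(totl); print(total);";
        System.out.println(usedOnce(source)); // prints "[totl]"
    }
}
```

On real code this would produce plenty of false positives, since many legitimate names are used only once, so the output is better treated as a list of candidates than as a list of errors.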

Fuck-Ups become harder to diagnose the longer they go unnoticed. They are particularly dangerous in edge-case code that rarely gets run. The application seems to work until it encounters that unusual path, at which point it fails mysteriously. The failure could be many layers removed from the source line containing the Fuck-Up. This is where rapid feedback cycles and test coverage are helpful.

Two

Said with a mixture of resignation and annoyance, Oh, Fuck names the category of error when a program makes a seemingly-reasonable assumption about the state of the world that turns out not to be true. A file doesn’t exist. There isn’t enough disk space. The network is unreachable. We have wandered off the happy path and stumbled into the wilderness of the unexpected.

Oh, Fuck errors are probably the most common kind to make it past tests, due to positive bias: tests tend to confirm that the code works under expected conditions rather than probe how it fails. They’re also the most commonly ignored during development, because they are essentially unrelated to the problem at hand. You don’t care why the file wasn’t there, and it’s not necessarily something you can do anything about. But your code still has to deal with the possibility.

I would venture that most errors which make it through static typing, testing, and QA to surface in front of production users are Oh, Fuck errors. It’s difficult to anticipate everything that could go wrong.

However, I believe that Oh, Fuck errors are often inappropriately categorized as exceptions, because they are not really “exceptional,” i.e. rare. Exceptions are a form of non-local control flow, the last relic of GOTO. Whenever a failed condition causes an Oh, Fuck error, it typically needs to be handled locally, near the code that attempted to act on the condition, not in some distant error handler. Java APIs frequently use exceptions to indicate that an operation failed, but really they’re working around Java’s lack of union types. The return type of a file-read operation, for example, is effectively the union of its normal return value and IOException. You have to handle both cases, but there’s rarely a good reason for the IOException to jump all the way out of the current call stack.
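One way to approximate that union in plain Java is a small result class that carries either the value or the exception. The sketch below is only an illustration: ReadResult and ReadResultDemo are hypothetical names, not part of any standard library, and the only real API calls are Files.readAllBytes and friends.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical result type: holds either the file contents or the IOException.
final class ReadResult {
    final String contents;   // non-null on success
    final IOException error; // non-null on failure

    private ReadResult(String contents, IOException error) {
        this.contents = contents;
        this.error = error;
    }

    boolean succeeded() {
        return error == null;
    }

    // Converts the exception into a return value at the point where it occurs.
    static ReadResult read(Path path) {
        try {
            byte[] bytes = Files.readAllBytes(path);
            return new ReadResult(new String(bytes, StandardCharsets.UTF_8), null);
        } catch (IOException e) {
            return new ReadResult(null, e);
        }
    }
}

class ReadResultDemo {
    // The caller deals with both cases locally, right next to the read.
    static String firstLineOrDefault(Path path, String fallback) {
        ReadResult r = ReadResult.read(path);
        if (!r.succeeded()) {
            return fallback;
        }
        return r.contents.split("\n", 2)[0];
    }

    public static void main(String[] args) {
        Path missing = Paths.get("no-such-file-for-this-example.txt");
        System.out.println(firstLineOrDefault(missing, "default"));
    }
}
```

The failure never leaves the function that asked for the file, which is the point: the missing file becomes an ordinary value to branch on rather than a jump to a distant handler.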

Programming “defensively” is not a bad idea, but filling every function with try/catch clauses is tedious and clutters up the code with non-essential concerns. I would advocate, instead, trying to isolate problem-domain code behind a “defensive” barrier of condition checking. Enumerate all the assumptions your code depends on, then encapsulate it in code which checks those assumptions. Then the problem-domain code can remain concise and free of extraneous error-checking.
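As a small sketch of that separation in Java, assume the problem-domain code computes the mean of some numbers that arrive as text. All names here are invented for the example. The barrier method checks every assumption up front, so the domain method contains no error handling at all.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class DefensiveBarrier {
    // Problem-domain code: assumes a non-empty array of valid numbers.
    private static double mean(double[] values) {
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Barrier: verifies each assumption mean() depends on before calling it.
    static Optional<Double> meanOf(List<String> raw) {
        if (raw == null || raw.isEmpty()) {
            return Optional.empty();
        }
        double[] values = new double[raw.size()];
        for (int i = 0; i < values.length; i++) {
            String s = raw.get(i);
            if (s == null) {
                return Optional.empty();
            }
            try {
                values[i] = Double.parseDouble(s);
            } catch (NumberFormatException e) {
                return Optional.empty();
            }
        }
        return Optional.of(mean(values));
    }

    public static void main(String[] args) {
        System.out.println(meanOf(Arrays.asList("1", "2", "3"))); // prints "Optional[2.0]"
        System.out.println(meanOf(Arrays.asList("1", "oops")));   // prints "Optional.empty"
    }
}
```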

Java APIs also frequently use null return values to indicate failure. Every non-primitive Java type declaration is an implicit union with null, but it’s easy to forget this, leading to the dreaded and difficult-to-diagnose NullPointerException. The possibility of a null return value really should be part of the type declaration. For languages which do not support such declarations, rigorous documentation is the only recourse.
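In Java 8 and later, java.util.Optional is one way to put that possibility into the declared type. The sketch below contrasts the two styles; UserDirectory and its contents are hypothetical and exist only for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class UserDirectory {
    private final Map<String, String> emailsByName = new HashMap<>();

    UserDirectory() {
        emailsByName.put("ada", "ada@example.com");
    }

    // Null-returning style: the signature gives no hint the result may be absent.
    String emailOrNull(String name) {
        return emailsByName.get(name);
    }

    // Optional-returning style: absence is part of the declared return type.
    Optional<String> email(String name) {
        return Optional.ofNullable(emailsByName.get(name));
    }

    public static void main(String[] args) {
        UserDirectory directory = new UserDirectory();
        // Calling directory.emailOrNull("bob").length() would throw NullPointerException.
        System.out.println(directory.email("bob").map(String::length).orElse(0)); // prints "0"
        System.out.println(directory.email("ada").orElse("unknown"));             // prints "ada@example.com"
    }
}
```

Optional is a convention rather than a guarantee, since an Optional reference can itself be null, but it does force callers to acknowledge the empty case before they can reach the value.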

Three

Finally, we have the errors that really are exceptional circumstances. You ran out of memory, divided by zero, overflowed an integer. In rare cases, these errors are caused by intermittent hardware failures, making them virtually impossible to reproduce consistently. More commonly, they are caused by emergent properties of the code that you did not anticipate. What the Fuck? errors are almost always encountered in production, when the program is exposed to new circumstances, longer runtimes, or heavier loads than it was ever tested with.

By definition, What the Fuck? errors are those you did not expect. The best you can do is try to ensure that such errors are noticed quickly and are not allowed to compromise the correct behavior of the system. Depending on requirements, this may mean the system should immediately shut down on encountering such an error, or it may mean selectively aborting and restarting the affected sub-processes. In either case, non-local control flow is probably your best hope. What the Fuck? errors are a crisis in your code: forget whatever you were trying to do and concentrate on minimizing the damage. The worst response is to ignore the error and continue as if nothing had happened: the system is in a failed state, and nothing it produces can be trusted.
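The "abort and restart" option can be sketched in a few lines of Java. This is a toy supervisor under stated assumptions, not a recommendation for any particular system: the Supervisor class is hypothetical, the task is assumed to be safe to rerun from scratch, and giving up is modelled as throwing an exception that the caller can treat as a shutdown signal.

```java
class Supervisor {
    // Runs the task; on any unexpected failure, reports it, discards the
    // attempt, and starts over. Returns the number of the attempt that succeeded.
    static int runWithRestarts(Runnable task, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                task.run();
                return attempt;
            } catch (RuntimeException | Error e) {
                // Never swallow the failure silently: make it visible first.
                System.err.println("attempt " + attempt + " failed: " + e);
            }
        }
        // Restarting did not help, so stop rather than continue in a failed state.
        throw new IllegalStateException("giving up after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated task that fails on its first two runs.
        int used = runWithRestarts(() -> {
            if (++calls[0] < 3) {
                throw new ArithmeticException("simulated failure");
            }
        }, 5);
        System.out.println("succeeded on attempt " + used); // prints "succeeded on attempt 3"
    }
}
```

Note that the catch block sits at the outermost level of the task, far from whatever line actually failed. This is the non-local control flow that suits this category of error.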

Conclusion

All errors, even What the Fuck? errors, are ultimately programmer errors. But programmers are human, and software is hard. These categories I’ve named are not the only kinds of errors software can have, nor are they mutually exclusive. What starts as a simple Fuck-Up could trigger an Oh, Fuck that blossoms into a full-blown What the Fuck?.

Be careful out there.
