Oh, Those Hyphens!

Why does Lisp use hyphens instead of underscores or CamelCase?

Lisp differs from almost every other programming language on Earth by using hyphens in its function/variable names. C/Perl/Java/etc. programmers, who can only use hyphens for subtraction, like to argue about the relative merits of NamesWithCamelCaseInThem, which are hard to read, or names_with_lots_of_underscores_in_them, which are hard to type. What the debaters rarely mention, it seems, is that neither style is particularly logical.

When you define a function or variable name, you are effectively defining a new word in the vocabulary of your program. Most Western languages allow the definition of “compound words” composed of multiple words. In normal language, these compound words are occasionally mashed together (e.g. “weekend,” “weblog”) but more often left separate (e.g. “dining room,” “miles per hour”).

No programming language parser has yet been written that can understand compound words with spaces in them, so we have to compromise. Western languages, English in particular, already have a compromise in place: they use hyphens to form compound words which would be awkward to write as single words or as separate ones (e.g. “all-inclusive,” “president-elect”).
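
Here, for instance, is the same compound name in all three styles, attached to a small Common Lisp definition (the function name and its parameters are invented purely for illustration):

;; milesPerHour    -- CamelCase: hard to read
;; miles_per_hour  -- underscores: hard to type
;; miles-per-hour  -- a hyphenated compound word, just like English
(defun miles-per-hour (distance-in-miles elapsed-hours)
  "Speed as a compound word, spelled the way English spells it."
  (/ distance-in-miles elapsed-hours))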

So Lisp’s hyphenated names are not only easier to read than CamelCase and easier to type than underscores, they are actually the most logical choice for creating new compound words. Yet another example of Lisp’s innate superiority. ;)

Programming by Linguists I

What would a programming language designed by a linguist look like?

Why do we speak of programming languages instead of programming notation? The latter seems a more accurate term for the mixture of words, punctuation, symbols, and spacing that makes up most programming, which owes more to mathematics than to language. We even call it code, an admission that it is not a real human language. Real languages allow infinitely more variety of expression, but are correspondingly harder to define precisely. Programming languages presumably grew out of a need to express concepts particular to computers in a way that could be engineered in electronic circuits. Engineering does not allow for ambiguity, which is a good thing. This all makes sense given that the pioneers of computer science were engineers and mathematicians. But I like to speculate: what would a programming language designed by a linguist look like?

It would almost certainly use far less punctuation. Natural languages tend to have a much higher ratio of words to punctuation and layout. Chinese, at one extreme, has almost no punctuation; English, at the other, has a fair amount. In either case, however, the punctuation and layout serve only as indicators for what would normally be conveyed by tone, inflection, and rate of speech.

Most programming languages are difficult, if not impossible, to understand when spoken, unless one “reads” the punctuation characters aloud, which disrupts the flow of speech. Who ever reads this:

for (int i = 1; i <= 10; i++) {
        printf("%d\n", i);
}

as “for open parenthesis int i equal one semicolon i less than equal ten semicolon i plus plus close parenthesis open brace print f open parenthesis double quotation mark percent d backslash n double quotation mark comma i close parenthesis semicolon close brace”? No one, except maybe a beginning C programmer. On the other hand, how does one read the above program aloud? One can describe what it does in a sentence, “Print the integers from one to ten, one per line,” but that version does not even scratch the surface of what the computer does.
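
For contrast, consider one way the same loop might be written in Common Lisp, whose LOOP macro happens to lean on English-like keywords rather than punctuation (a sketch for comparison, not a claim that Lisp solves the problem):

(loop for i from 1 to 10       ; reads aloud as "loop for i from one to ten"
      do (format t "~d~%" i))  ; the format string is still pure punctuation

Most of the structure here can be spoken as ordinary words; only the format directive falls back into symbol-speak.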

To be continued…