Production Languages and Toy Languages

January 27, 2008

(Note January 4, 2009: Boy, did this post generate a lot of anger. I regret using the troll-like word “toy”. It detracted from the point of the entry.)


Neal Gafter recently wrote an entry called Is the Java Language Dying? In it he makes a distinction between a “living language” (one you’d consider for new code) and a “legacy language” (one only used to maintain an existing code base). For a living language, changes should focus on making it easy to design, implement, and support new code. For legacy languages, changes should focus on making it easy to support existing code. He asks whether Java is a living or legacy language. (This is in the context of designing features for Java 7, and specifically closures.)

I think this distinction is a bit odd. A more useful distinction, when considering new Java features, is a “production language” (used for large programs maintained over a period of years) and a “toy language” (used for quick scripts, experiments, tools, personal stuff). A production language should favor reading the code over writing it, and a toy language should do the opposite. Note that I don’t mean that the language is a toy, but that it’s used to make toys. Then we can ask whether Java is a production language or a toy language. (I think most will agree it’s a production language.)

Examples of production languages are C, C++, C#, and Java. Most other languages are toy languages. Lisp advocates, for example, won’t shut up about how “expressive” Lisp is. By this they mean that high-level concepts can be expressed concisely, by gluing together little functions that return or accept functions, or by doing voodoo meta-programming. This definitely makes it easier to write code, but it then takes 20 minutes for anyone reading to figure out what’s going on in the tiniest of functions.

A great example is in Structure and Interpretation of Computer Programs. They start with a simple function that calculates the square root of a number using Newton’s Method. They then use the expressiveness of Lisp to write a set of functions that take and generate various functions. Their final top-level function is:

(define (sqrt x)
    (fixed-point-of-transform (lambda (y) (- (square y) x))
                        newton-transform
                        1.0))

Good luck figuring that out. I’m sure it was fun to decompose the problem into those functions, but in production two years from now some programmer is going to be in a world of pain when he has to add a feature to that program. Many modern languages similarly give up readability for writability. Ruby is one example: you can literally redefine the + operator on integers so that it no longer gives you their sum.
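
For contrast, here is roughly what the simple Newton’s Method version they start with might look like in Java. This is my own sketch, not SICP’s code: the class name, the helper names, the tolerance, and the iteration cap are all assumptions made for illustration.

```java
public class NewtonSqrt {
    // Assumed tolerance for this sketch; SICP's own example uses a different cutoff.
    private static final double TOLERANCE = 1e-9;

    // Assumed safety cap so the loop always terminates.
    private static final int MAX_ITERATIONS = 100;

    // Approximates the square root of x by repeatedly improving a guess.
    public static double sqrt(double x) {
        if (x < 0) {
            throw new IllegalArgumentException("x must be non-negative");
        }
        double guess = 1.0;
        for (int i = 0; i < MAX_ITERATIONS && !goodEnough(guess, x); i++) {
            guess = improve(guess, x);
        }
        return guess;
    }

    // The guess is good enough when its square is close to x
    // (scaled by x so that large inputs still converge).
    private static boolean goodEnough(double guess, double x) {
        return Math.abs(guess * guess - x) <= TOLERANCE * Math.max(1.0, x);
    }

    // Newton's step for y*y - x = 0: average the guess with x / guess.
    private static double improve(double guess, double x) {
        return (guess + x / guess) / 2.0;
    }

    public static void main(String[] args) {
        System.out.println(sqrt(2.0));
    }
}
```

It is several times longer than the Lisp, but every step of the algorithm is on the page: start with a guess, check it, improve it, repeat.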

Another example is Paul Graham’s Viaweb, implemented in Lisp, which was re-written in C++ and Perl when it was sold to Yahoo. The official story is that Yahoo couldn’t find any Lisp programmers. I find that hard to believe. There have got to be thousands of people who love Lisp and would die to program in it for a living. Even if there weren’t, it would be cheaper to teach people Lisp than to re-write a giant system like Viaweb. I suspect (but can’t prove) that the Lisp code was unmaintainable, even by a good Lisp person. There are probably two ways to write Lisp code: with all the expressive constructs (in which case it’s unmaintainable) and without (in which case you may as well use Java). Paul Graham surely did the former, and Yahoo was forced to toss the work when he left. Lisp is effectively a write-only language, like all toy languages.

Programmers are always surrounded by complexity; we cannot avoid it. … If our basic tool, the language in which we design and code our programs, is also complicated, the language itself becomes part of the problem rather than part of its solution. —C. A. R. Hoare, 1980 Turing Award lecture

Java is a fairly readable language. It’s verbose, but that verbosity is part of what makes it readable. Will Neal’s closure proposal tip the balance toward reading or writing? Josh Bloch seems to think it will tip towards writing, pushing the language toward a toy language. I don’t know enough about the proposal to answer that, but the examples I’ve seen so far are very dense to read (partly because of the ugly => operator).

As I write more and more enterprise software, the value of having a readable language (at the expense of writing bulk unexpressive code) grows on me. I would rather write more clunky code now and avoid having my code tossed two years from now by a maintenance programmer who can’t figure out what I did.