Static type annotation vs. the entire world of contemporary web development

January 17, 2020

This is a guest post, written by my friend Dan Collens in 2011.

There are two references below to “short beards”. Pictures of Dennis Ritchie and Ken Thompson usually show them with long beards. These programmers understood software from top to bottom and wrote software in a fundamentally different way than most of today’s web & mobile app programmers (whom we derisively refer to as “short beards”). Long beards are often the older and more experienced programmers, but there are plenty of young long beards, too.


People often ask me why I prefer Java to PHP.

It’s not that I prefer Java to PHP. It’s that PHP belongs to a class of languages I view as thoroughly unsuitable for large projects, while Java belongs to a class of languages I do not view as such.

The primary objection I have to PHP (and other languages which lack static type annotations permitting a rigorous static analysis pass at compile-time, including Python, Ruby, Perl, etc.) is that I do not believe such languages are effective for software projects whose scope extends non-trivially in time, quantity of code, and number of people. Anyone can reasonably convince themselves of the correctness of a page or two of carefully-written code, annotated with static types or not. But once you have thousands of pages of code, written over a period of years, by people who don’t even work on the project anymore, it is enormously hard to reason successfully about the code.

Static type annotations help you here because they permit automated verification of things such as: that every method you call actually exists on the object in question, that each call passes the right number of arguments in the right order, and that the values flowing between functions have the types their recipients expect.

Furthermore, static type annotations serve as a form of explicit (human- and machine-readable, and compile-time checked) documentation for the code. They are far more effective and expressive than comments (which eventually become inscrutable or false after enough time passes and enough changes accumulate to the code), or unit tests (which are hopeless for reasons I will get to shortly), or external documentation (which quickly diverges from the reality of the code under the competitive pressure to add features). They force the programmer to write down his assumptions and expectations explicitly, so that other programmers can read them, and so that the compiler can check and enforce them.

For a few quick examples, think about what happens in a language that lacks static type annotations if you want to rename a method. You need to find all the call sites of that method. If you miss one, you’ve introduced a bug that won’t be found until it actually breaks a test case, is caught in QA, or makes it out to the field. In a language with static type annotations, you simply recompile, and the compiler will tell you what you missed (or, in most Java IDEs, you get a real-time report of the remaining places you haven’t fixed, or better still, you use the IDE’s automated refactoring command to rename all the references to the method along with its declaration, all rigorously correctly thanks to the type annotations). You may think this is a trivial matter, but it’s not. To find these cases yourself, you have to search the source tree, and you’d better not miss any files (what about generated files, or files that someone else is working on and checks in after your rename, still referring to the old name?). And don’t forget to send out an email to the team, because all your team members need to know not to call the old method name in new code, since the compiler won’t catch it if they do.
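To make the hazard concrete, here is a minimal Python sketch (the names `Account`, `shut_down`, and `nightly_cleanup` are hypothetical). The module containing the stale call site loads without complaint; nothing flags the missed rename until the line actually executes:

```python
class Account:
    def shut_down(self):        # renamed last week from close()
        self.active = False

def nightly_cleanup(account):
    # A call site the rename missed. This file imports and loads fine;
    # no tool has checked that Account actually has a close() method.
    account.close()

acct = Account()
try:
    nightly_cleanup(acct)
except AttributeError as e:
    print("runtime failure:", e)   # only discovered when this path runs
```

If `nightly_cleanup` happens to sit on a code path your tests never exercise, the bug ships.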

Even if you do find all the files and all the references, some may not actually be referring to the method you renamed. For example, if you have two kinds of object in your system that both have a foo() method, but you’re only renaming the foo() method in one of those two objects, then you need to be careful to find only the foo() calls that are operating on the objects of the correct variety. Worse yet, you may have written some code that ambiguously accepts either kind of object, so now you can’t rename the foo() call at those sites, because the method you need to call now depends on the runtime type of the object passed in. Are you sure now that you’re always renaming the right method calls at the right place? To do this robustly, you’d have to trace through the entire program to convince yourself which type of objects can appear in which contexts (essentially trying to do whole-program type inference… in your head).
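A sketch of the ambiguity, again with hypothetical names: both classes happen to have a `foo()` method, and one call site accepts either kind of object, so no textual search can decide whether that particular call should be rewritten:

```python
class Invoice:
    def foo(self):
        return "invoice-foo"

class Report:
    def foo(self):              # suppose only Report.foo is being renamed
        return "report-foo"

def process(item):
    # This site accepts either kind of object. Which method item.foo()
    # refers to depends on the runtime type of item, so a search-and-replace
    # cannot safely decide whether to rewrite this call.
    return item.foo()

print(process(Invoice()))   # invoice-foo
print(process(Report()))    # report-foo
```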

The identical problem arises if you wish to simply change the number of parameters that a method takes, or change the order of some of the parameters to a method, or even just adjust the assumptions on a particular method’s runtime type (e.g. now you expect argument 3 of this method to be an object which has a foo() call, when it used to expect an object with a bar() call). In each of these cases, in a large program with hundreds of thousands of lines of code, written over a period of years, by dozens of people, most of whom are no longer working on the project, it is a real challenge to prevent bugs from being introduced. And there are much more subtle cases than these — I’m just scratching the surface for a few easily-explained examples.
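The parameter-order case is arguably the most insidious, because it can fail to fail: no exception is raised, the program just quietly computes the wrong thing. A hypothetical sketch:

```python
def transfer(amount, account_id):
    # The signature used to be transfer(account_id, amount).
    return {"account": account_id, "amount": amount}

# A call site still written against the old argument order. Both
# parameters happily accept plain values, so nothing raises; the
# program simply credits the wrong amount to the wrong account.
result = transfer(42, 100.0)
print(result)   # {'account': 100.0, 'amount': 42}
```

With static type annotations, making `amount` and `account_id` distinct types would turn every stale call site into a compile error.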

After explaining this once to a young man in his 20s who was using Python for his web app, he said “Oh, I just assumed that you’d never rename any methods in Python… just always add new ones”. This is, in a word, appalling. Imagine what the code will look like over time. Basically, in such languages, you’re building a brittle house of cards that will eventually fall in on you.

Now I hear the short-beards crying out, “But wait! Unit tests will save us!” This is wrong on at least two levels. First, much of what people do when they write unit tests for software written in languages lacking static type annotations is write tests that effectively document the runtime type assumptions of the packages they’re testing. In short, it’s a tedious, error-prone way of trying to recapture the lost value of static type annotations, but in a bumbling way, in a separate place from the code itself. There are cases where unit tests are helpful: when you have no satisfactory way to prove the correctness of the program, and the type system is too weak to express the guarantees you want, you can write some tests to shore up your beliefs (and indeed, I’ve done that for my Java code from time to time).
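As an illustration (the function and test names are hypothetical), a typical such test spends its assertions re-stating facts that a single statically-checked signature would declare once and a compiler would verify at every call site:

```python
import unittest

def make_user(name, age):
    # Hypothetical function under test.
    return {"name": name, "age": age}

class MakeUserTypeTest(unittest.TestCase):
    def test_returns_expected_shape(self):
        u = make_user("Ada", 36)
        # Each assertion re-states a runtime type assumption that a
        # static type annotation would document and enforce automatically.
        self.assertIsInstance(u, dict)
        self.assertIsInstance(u["name"], str)
        self.assertIsInstance(u["age"], int)
```

And note that these checks only run for the one call site the test happens to contain.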

But there are very strict limits to what you can learn from passing a test suite. Edsger Dijkstra was a famous and brilliant computer scientist (and winner of the Turing Award), and he had this to say on the topic: “Program testing can be used to show the presence of bugs, but never to show their absence!” What this means is that when you write a test, and run it, and the program fails the test, you know there is a bug that needs to be fixed. But when the program passes the test, you have only proven an absence of bugs in an infinitesimally tiny subset of the possible execution states of the program.

For a contrived example to illustrate what is meant by this remark, consider a function add(a,b) which returns the sum of a and b. If you test that add(1,1)=2, you know that it works for 1+1, but you know nothing of any other cases. You’d have to test every possible set of values to be certain it worked. But actually even then you’d be wrong, because what if add() uses some internal state that can be modified over the course of running the program? For example, let’s say it maintains an internal cache of previously-computed values (assume for the moment that addition is very expensive on this computer). Then it matters how many add() calls you do in a row, it matters what order you do them in, it matters what other things are using heap space in the system, it matters what other threads are calling add(), and so forth. For testing to rigorously establish the correctness of a piece of code, it has to not just cover every line of code, but every line of code must be covered in every possible program state (the program state space is essentially the vector space of the entire set of bits in the static data, stack variables and heap of the program, so its size is exponential in the number of bits of data/stack/heap state). This is an intractable computation for all but the tiniest of programs.
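The contrived example above can be written down directly. Here the internal cache is deliberately keyed on only the first argument, a bug that the lone test never visits:

```python
_cache = {}

def add(a, b):
    # Deliberate bug: the cache key should be (a, b), not just a.
    if a not in _cache:
        _cache[a] = a + b
    return _cache[a]

assert add(1, 1) == 2   # the test suite passes...
print(add(1, 5))        # prints 2, not 6: a stale cached result
```

The test passes, yet the function is wrong for almost every input whose first argument has been seen before; which calls misbehave depends on the entire history of prior calls, i.e. on program state the test never explores.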

What’s needed instead are proofs of correctness. When I was Director of Engineering at a start-up I used to tell new hires that I wanted the code, first and foremost, to be correct. Testing was deemed neither necessary nor sufficient. This is precisely because of the above argument, nicely summarized by Dijkstra’s quote.

The first line of defence in programming is to write code that is so simple, elegant, and clear, that you are personally certain it is correct before you even run it. The second line of defence is static type analysis done by the compiler (which is a form of automated theorem-proving on a formal system defined by the static type annotations in the program). As you may know from incompleteness results in math and computer science, no consistent formal system can prove all true theorems, so there is an infinite class of correctness proofs which lie outside the reach of conventional static type analysis, but at least we know that everything inside that scope is correct once compilation succeeds. (Some languages like Haskell, ML, and their relatives seek to extend the type system much further to allow even more aggressive automated correctness proofs by the compiler, with some really beautiful results, but there are diminishing returns where getting each successive win in proof power costs ever greater amounts of complexity in the software’s type annotations.) Finally, one relies on testing as a third-string approach to sanity-check a few isolated points in the program’s immense state-space which lie outside the domain of the static type system’s theorem-prover.

In short, unit tests are no substitute for static analysis, and indeed there is no known substitute in the arsenal of the software engineer today. This is why I am appalled that in 2011, over 40 years after Dijkstra’s pithy remarks on the subject, we find ourselves in a whirlwind of breathless hype and excitement over a collection of web frameworks based on languages such as PHP, Python, Ruby, and JavaScript — all of which lack any form of static type annotations! It’s bad enough that the browser platform (such as it is) forces us to use one such language (JavaScript), but then to have people willingly propagate this atrocity onto the server side (e.g. node.js) in the name of progress is almost more than I can bear. If it were just that, though, I’d probably cope, but I often read about these short-bearded Y Combinator startups that are “language agnostic”, only to find out that what they mean is that they use not just one but all of the languages lacking static type annotations, and do so more or less willy-nilly throughout their stack.

So as to not end on a bitter note, let me present by contrast the set of languages with (acceptably) sound static type annotation systems, and stable, broadly-available tools and runtimes, used widely in the field, with many practitioners from which to hire staff: C, C++, C#, Java, and maybe Scala if you’re feeling like a gambling man. (It looks like Go will likely join this party soon, but it’s still too immature to use in production.) Of these languages, only C#, Java, and Scala are also garbage-collected, and pointer-safe (i.e. no wild or uninitialized pointers). I personally avoid Microsoft’s products because of all the proprietary lock-in, and single-source vendor problems, but C# is otherwise a perfectly great language. So that leaves Java and Scala, and Java is the simpler, more conservative of the two.

I actually think Java is missing tons of great things, and could be vastly improved, but trying to do so by first discarding static type analysis is pure folly.

I didn’t want to muddy the waters of the above discussion, but amongst languages in common use for web development today which lack static type annotations, PHP is the second-worst designed (ahead of only Perl). Its runtime library is woefully inconsistent, with conflicting naming conventions, duplicate routines with slightly different names and subtly different semantics, and so on. The syntax choices are equally appalling (e.g. “\” as the namespace scoping operator). In short, it was designed by someone who (literally) hates programming, and it shows. By comparison, languages such as Java were carefully designed by a group of thoughtful people of great intellect, experience, and wisdom.