Joe Lewis's Blog

Thoughts, Musings & Stories

The Type System of the Future

Strongly typed vs More-strongly typed

We know types help us eliminate certain classes of errors from programs. For instance, it’s impossible to write an executable program in Java that adds a string to a number. If you’re a big company already using Java, you’ve hypothetically saved some significant amount of \$\$\$ in terms of developer productivity. But we still lose \$\$\$ with the most expensive mistake ever made in the software industry - Null. I’m pretty sure while you are reading this, at least one programmer across the world is either running into a null access exception or debugging one!

Looks like Java is not a strongly typed language after all. It’s just less-weakly typed than python. Well, we have Swift or Kotlin, which when written in a certain way can be statically verified that the program is free of null errors. Or even better, Haskell – which eliminates possibilities of an invalid program state altogether, with its strict immutability.

What I’ve really found is that type-checking is an incremental concept. As in, for every strongly typed language, there’s going to be a stronger-typed language. The interesting question is, what will happen if we push this to its extreme. Can I write a program that is free of all classes of errors? Can there be a language that eliminates the possibility of making any mistakes in your program? Of course, it’d be a nightmare to even code in such a strict language but it nevertheless is an interesting question to ask!

What are Types really?

For the answer to be obvious let’s step back and dig a little deeper into what Types really are?

Surprisingly, Types didn’t originate from the computer science industry. It actually came from mathematics! The type system is just one of the many secret affairs computer science had with math.

Consider the phrase “I’m lying”. How will you communicate that intention to a computer or an alien? You convert the phrase to math. It’d become

“For any proposition p that I affirm, p is not true”

But wait, “I’m lying” is a proposition itself. If this proposition is p, then “I’m lying” is not true too. So now we’ve ended up with an error. If he claims he’s lying, the alien will understand everything he’s saying is true - which defeats the entire point.

Here’s another such error:

A barber in a village shaves all men who don’t shave themselves. Now the question is, who shaves the barber?

If the barber shaves himself then he comes under the category of people who shave themselves, so he shouldn’t shave himself. But if he doesn’t shave himself then he comes under “people who won’t shave themselves” and therefore should shave himself!

We’ve ended up with a paradox. Actually a pretty famous one at that. It’s called Russell’s paradox. Russell tried to base all of mathematics on top of logic and was troubled by this paradox.

He came up with something awesome. He went into Zen-like mode and said there simply cannot exist a predicate (I’m lying) that includes everything. There should be a class of results that hold true for a predicate and he called them, Types.

For example the “I’m lying” proposition holds true for all propositions I say, except that one. “The barber shaves all who don’t shave themselves” holds true for all men, except the barber. These are valid exceptions because they are of higher-order! (If the term higher-order rings a bell, it’s because this is how first-order logic and higher-order logic was born)

This is just like how when we write int sum (int, int); we say sum() is a function, but that function only holds true for inputs of type (int, int). See the analogy? This is where we stole types from.

The twin faces of Types

Remember our original question? Is it possible to write programs that won’t allow us to make any mistakes? We dug into the history of Types because as you can see what type means in math is (though similar) very much more than what we call datatypes in software. It implies we can do better than just data-types.

For a function f(x) = x*x that outputs the square of its input, current languages that we use allow us to express that the function takes in only integer values and not strings. i.e the data-type of arguments.

Is there a language with a type system powerful enough to express that the function adheres to the constraint: for every input the result will be the square of the input?

If such an arrangement exists, then it’s safe to assume that a language in which no erroneous programs will exist is possible. This is where the deeper meaning of Types in math, helps. While Russell proposed the Ramified theory of types to contain paradoxes, following his lead a new theory emerged - Proposition as types by Brouwer.

This theory says that any proposition has a type. The type of the proposition is the proof of the proposition. For example, consider the proposition: “Chennai is in India”

The proof is based on the below facts. Let’s call it, Proof-A.

Chennai is in Tamil Nadu. ( Chennai ⊆ TamilNadu )

Tamil Nadu is in India. ( Tamil Nadu ⊆ India )

So Chennai ⊆ India is true by transitivity.

Here the proposition “Chennai is in India” is of the type “Proof-A”. This proof part is interesting because this proof is nothing but some logic, that is in-fact closely related to a computer program. This is another interesting observation. It’s called the Curry-Howard isomorphism. Curry observed that the properties of a proof in Logic, is very similar to the properties of a computer program.

And that is again how math and computer science is married again!

The world of statically verified languages

So it is possible to statically verify that a function always returns the square of its input, it just needs a proof (aka its type). And as Curry observed, proofs are nothing but computer programs, except that they work with facts from the compile-time, instead of the runtime!

For example, you can write a proof that your sort() function, can never return an unsorted collection. And the compiler will run your verification at compile-time and guarantee that for you. Much like unit tests, but at compile-time.

As expected, there are programming languages out there, that use this Curry-Howard isomorphism to statically verify its correctness. Languages like CoQ, Agda, Idris. Folks use these languages to build mission-critical software for rockets and airplanes, but the rocket-sciency nature of these languages is stopping the general business world from adopting these languages.

I have a strong feeling that the industry will slowly drift towards languages like these that have a type system powerful enough to express the logic itself as statically-verifiable types. A world where every library I use is statically verified and proved to be correct will be an awesome world to live in!

Edits

  1. Grammatical mistakes

  2. Wrong links