Type Systems Aren’t Magic

There’s a lot of enthusiasm for typed languages these days, now that languages with good type systems are closer to the mainstream. I’ve even seen some people go so far as to suggest that the future of programming involves typed languages, and dynamic languages are on their way out.

For practical reasons, most of my career has involved completely dynamic languages like Ruby and Clojure, or inferior typed languages whose type system confers poor value to pain ratios, like C and Java. I’ve been learning Haskell recently to gain more exposure into what it’s like to have a good type system, and what I’ve seen so far is that types aren’t magic or a panacea.

What is a Modern Type System?

I’ve come to think of a type system as an intentional engineering tradeoff where you provide constraints around values as they flow through your source code. In return, the compiler can do some fancy constraint solving to double-check that what you’ve written makes some semblance of sense.

In a good typed language, functions are values, too, and the language uses some super cool techniques to infer things you’ve left unspecified. It’s almost like having miniKanren in your compiler.

What kind of information can you specify about values in your type system? Well, that’s really up to the language, and it’s totally orthogonal to whether the language has any type inferencing built-in. When you hear talk of a strong type system, it’s common to, at a minimum, decomplect the possibility of nil from other values. For example, by indicating that a value has a type of string, it cannot possibly be null.

If you really need to indicate that a value could either be a string or no value at all, Haskell provides algebraic data types that can express this (among other things). This particular type is very common and has a name: Maybe. Swift, on the other hand, allows you to approximate this using enum types, which are very similar but more cumbersome in syntax and less flexible with recursion.

What Can Types Express?

Let’s consider what a Haskell type would look like for a coordinate in a two-dimensional Cartesian system:

data Point = Point Int Int

This is an algebraic type: Point is a type that has an Int AND another Int. You can use the first as the x coordinate, and the second as the y coordinate.

What if you wanted to use the type system to constrain a value such that it could only be a point in the second or fourth quadrant? That is, either the x coordinate or the y coordinate (but not both) must be negative? As it turns out, you can’t do this.

As another example, let’s say that you’re creating a data type that describes the state of a game such as Othello, using an array of Ints or some other type. There’s no way to define your data type so that it is impossible to represent invalid or nonsensical game states, such as having pieces that don’t touch any others on the board.

If this is a concern, the common solution to this problem would be to create two constructors for your data type. The first constructor would blindly accept anything that type-checks, and it would be declared private or isolated from the rest of your application. The second constructor would apply some domain logic or validation to the values that are passed in and conditionally leverage the former constructor.

When used carefully, this will give you some confidence that, across the rest of your application, you’re not likely to have an instance of your data type that doesn’t actually make sense.

Conclusion

Types aren’t a panacea, and they’re not magic. It’s possible to use a strongly typed language to express logic that is actually not correct for your problem domain. Types won’t prove your software for you, except in an extremely narrow fashion. For some reason, I was slightly surprised when I realized this.

Now that I’ve gained some more familiarity with types, I’m looking forward to contrasting this to Clojure’s Spec. In Spec, for example, it would be possible to express and validate (and generate!) that data matches my above examples.

Inherently, when you use a type system, you are giving up some flexibility. You trade the ability to express certain things. In return, you are reassured that, from a narrow perspective, you’ve expressed something that makes sense in a broader scope.

The real magic of a language like Haskell comes from its algebraic types. By going a bit further than, say, Java, and encoding some simple logic in the types themselves, you gain some limited ability to check the logic across your application.

Conversation
  • Ibrahim Tencer says:

    “What if you wanted to use the type system to constrain a value such that it could only be a point in the second or fourth quadrant? That is, either the x coordinate or the y coordinate (but not both) must be negative? As it turns out, you can’t do this.”

    It’s possible to do this in more powerful type systems (with “dependent types”). With dependent types you can express essentially whatever condition you want inside of a type definition. I’ve heard the Haskell people are trying to incorporate some kind of dependent typing into the language, but there are other, less-established languages that already support it (Idris, Coq).

  • Comments are closed.