Awhile back, I mentioned the idea that abstractions are not merely repeated patterns. While principles like DRY encourage you to factor out a function when you encounter code repetition, it’s not the repetition that’s important. Abstractions have properties, and it’s the properties that are arguably more important. It’s properties that give abstractions the meaning that allows us to reason about what our code does.

Most of the time, the properties of an abstraction are pretty, um, abstract, and rarely made concrete. For many abstractions, it can be extremely difficult to state them formally. But never mind formality, there’s often no direct statement of what they are, except perhaps as documentation. If the documentation even goes so far as to think about things in terms of properties, that is. Usually the best we can hope for is a human-level description of what a function does with its inputs, and what its outputs and effects could be.

So randomized property testing is a pretty marvelous tool. To use it, we have to think in terms of properties of our designs. And once we’ve used it, we have now stated those properties explicitly, and in running code as part of the test suite, at that. If we start to deviate from the properties we intended, we’re hopefully going to find out as a result.

Property testing should perhaps be regarded as one of the biggest breakthroughs in programming in the last couple decades (though the concept is older of course, I think it’s more recently that we’ve accumulated convincing enough evidence that it’s viable as a practical tool). Thinking about properties improves design. The test suite that results is often simpler and more powerful than a traditional unit test suite. There’s a lot of upsides.

One of the things I think is most impressive about property testing is that it actually peels off a piece of formal verification in an incredibly practical manner. Properties are things we could imagine proving about code, but instead we generate random instances of the variables and just run the code instead. “Real world” programmers can actually use this stuff, but at the same time its a technique even those doing formal verification can make use of. (How better to quickly and mechanically know you’ve made a mistake and are in a dead-end of an attempted proof than to randomly test out examples and see if they fail. If the sample of examples work, then maybe it’s the right path, but if one fails, you know you’ve definitely made a mistake!)

The basics of property testing

Property testing is not a functional programming thing, but simple examples of it almost always involve functional programming because of “equational reasoning.” Many functions have equivalences between different or specialized ways of using them, which means you have a simple equality between two different simple expressions.

Not every property is something that is property-testable. You have to be able to usefully generate random inputs to check the property on. Again, with functional programming, many of the simple examples are ones where any input is a useful input, and so this part of the problem can be ignored at first

So let’s start with everyone’s favorite example:

id :: a -> a
map :: (a -> b) -> [a] -> [b]

forall x, map(id, x) == x

Here we have the identity function id, the map function over lists, and we state (using a more mathematical notation, rather than the Haskell type notation I start with) the following property: if we map the identity function on every element of a list, we get the same list back.

We see the equational reasoning origins of this property. Haskellers would sometimes prefer to write this as map id == id, but this is not as useful for property testing. For one thing, there’s no obvious variables in that expression to generate examples of, and for another, we don’t know how to test for equality between functions. But, we can decide that the functions are equal if their results are equal for the same inputs, expand that out, and we get the property I wrote above, which is now more obviously testable.

In order to test it, though, we need the ability to generate values for x. In the real world, this almost always involves some choice, design, and implementation. Traditionally, in examples of property testing using Haskell and QuickCheck, the generator is inferred from the type. This only works in some cases, because all too often you need a “smarter” generator to get interesting inputs for a particular property. Even in the case above, we need to help it along, because we don’t specify a truly concrete type for x. As written it’s [a] and how do we generate concrete examples of “any type of list?” But, we can just choose to specialize to, say, [Integer] and then the default generator is good enough for this simple property.

Because the random inputs generated to check properties can be complex, when violations are found, property testing frameworks will also simplify the erroneous values as best they can with some basic heuristics. If we wrote a dummy implementation like map func list = [], the error we got back would tell us that map id [0] evaluates to [] not [0]. The shortest list with the simplest integer, and not whatever was originally generated.

Check this out! Here again are the two constraints we placed on the implementation of map above:

map :: (a -> b) -> [a] -> [b]

forall x :: [Integer], map(id, x) == x

These two checks are sufficient to 100% prove the implementation of map correct. Let me spell that out: knowing its type, and just this single property, using only the identity function and lists of integers, is enough to know the implementation of map is correct.

I find this fact so interesting, I’m sure I’ll have more to say about it in a future post.

But isn’t this incredible? The secret here is that the type signature is doing an incredible amount of heavy lifting for us, thanks to parametricity (“Theorems for free!” PDF). Barring “unsafe” (type system breaking) code or infinite loops or crashes, the only way to get a b is to apply the given function, because we can’t know what b actually is. The only place we can get arguments of type a to give to that function is from our input list of [a]. So just from the type, we already know the function is being applied correctly, so we never have to try anything more complicated than id.

The remaining task, that the type system can’t easily accomplish and so we reach for a property to check, is pretty much just to ensure we aren’t repeating or skipping elements of the list, which we’re perfectly capable of checking on just lists of integers. And as a bonus, because it’s running the code, it’s also ensuring those caveats above about crashing or infinite loops are off the table. While randomized testing of this property can’t be 100% the way a proof of this property could be, we can actually be incredibly confident that if the property test passes, the property actually holds. The type system has done that much work for us. It’s virtually impossible for an honest attempt to implement map to sneak past this property test, you’d have to be a malicious adversary.

(Credit where due: I first learned of this cool example from Favonia though this observation seems to go way back.)

Embracing properties as a part of design

We are, slowly, starting to see property testing develop as an integral part of the design process. It’s still the early days. Haskellers are sometime criticized for having a (possibly slightly unhealthy) obsession with category theory. I think this criticism partially stems from the way in which Haskellers know they’ve found something profoundly interesting but perhaps aren’t quite sure just exactly what that is yet, and as a result sometimes talk about it… oddly. Maybe. (But I don’t mean this as criticism: every new idea isn’t created fully formed and deeply understood. This is the process we go through to understand something new. Learning is hard.)

My personal theory is that the “category theory-y” parts of Haskell culture boil down to two big innovations, besides the direct inspiration from category theory itself:

  1. Designing abstractions around properties. (In particular, that type classes should have laws.)
  2. Designing for composition: building bigger pieces out of smaller parts. (Which I’ll have to drop for now and write some future post about, but almost every idea borrowed from category theory is about composition.)

That a type class should have associated laws is a fairly new idea that’s only recently starting to get taken seriously. The type classes introduced to Haskell that were inspired by category theory came with these laws from the beginning, but going beyond that to the notion that all type classes should have laws is relatively new, as far as I can tell. At least, we’ve had plenty of type classes designed without laws in the past.

To start with a traditional example, the “container types that are mappable” abstraction gets called Functor and is defined like so:

class Functor f where
  fmap :: (a -> b) -> f a -> f b 

  -- forall x, fmap id x  ==  x
  -- forall x g h, fmap (g . h) x  ==  (fmap g . fmap h) x

The type f is a type constructor, like lists. So when f is the list type, f a just means [a]. Or f could be a optional wrapper like Maybe with Just and Nothing data constructors, etc.

So this type class is just supposed to give use the function map but for any type constructor f instead of specifically the list type.

We see (now written above using the Haskell notation) the first property we wrote for map previously. Since that’s all we needed (besides types) to show map was correct, I think we understand why it’s there! But now we also see a second property as well. And unfortunately, this property is harder to test randomly because how do you generate random functions g and h? We can certainly write a generator, but we can’t write a very good one, so we might not be too confident we’re checking this property well.

A curious problem. We previously saw that the type signature and the single property involving id was enough to show map worked correctly on lists. What is it about functors that requires this second property?

Well, the answer is… nothing. Haskell and pretty much all of its documentation and blog posts have kinda just copied this second property from category theory (where it is necessary) when it’s actually unnecessary in Haskell. If you look at the Typeclassopedia there’s a note about how the second law is technically unnecessary (though of course it still gives it as one of the laws). You can read another decent explanation here. For functors in general, not just lists, it’s enough to have the type signature and the first property alone, thanks to parametricity.

(Though some clever people have discovered particular implementations involving seq, we’re usually willing to ignore such strange cases.)

Nevertheless, stating something like the second property can be helpful in reassuring readers that the property holds. Even if it’s unnecessary, and even if we can’t actually generate very good tests with it. We don’t want to get too far off into clever mathematics—the point of these properties is to communicate the meaning of the abstraction to another person, not merely to be technically correct.

One classic objection to the idea that type classes should have laws is the Show type class that turns a value into a printable string. But part of the idea of designing with properties in mind is that you may realize your design is less than ideal and needs fixing. So consider this alternate design for Show:

class ValueSerializable t where
  show :: t -> String
  read :: String -> t
  
  -- forall x, read (show x) == x

If we include parsing that string back into its type, we now have a natural way to describe what the properties of this type class really are. When we render it as a string, we should be able to parse it back into the same value. Tadaa! This is a simple and natural way to encode the property that the string should be a representation of the value. What else do you want to know about how these functions behave?

And there’s one last very interesting effect of thinking about properties on interfaces (not just Haskell type classes, but even OO interfaces, etc). Notice how we’re describing a property, but we don’t immediately have any concrete implementations to test it on. But, especially if we actually had language support for property tests, every time we create a concrete instance (an instance of a type class or a class that implements an interface), we automatically have property tests assigned to our new type. Before we’ve ever written a line of code for that concrete implementation, there’s a test suite that applies to us. I’m not aware of anything like this for ordinary unit testing, this ability to “port over” or “instantiate” a test suite on another type like this is quite a powerful capability. A single property test can not only stand in for a number of unit tests, it can stand in for a number of unit test suites.

End notes

I’ve said property testing is not a functional programming thing, but today that’s all I’ve shown, really. Next week: property testing imperative code.