Why variance matters
One basic and omnipresent part of program design is variance, and it deserves more attention than it gets. I gave a brief crash course in variance back when I discussed the design impact of inheritance. I’ve decided to elevate this discussion to a full post of its own.
What is variance?
Variance is about our ability to substitute types for other types.
We generally think of this in terms of more or less specific types.
A typical inheritance hierarchy or subtyping relationship can create a chain from less constrained to more constrained types, like num :> int :> nat.
We can treat the appearance of a type in three useful ways:
- Invariant behavior means that only that exact type is acceptable. Many functional languages lack subtypes, and so are generally invariant.
- Covariant behavior means subtypes are acceptable in lieu of the specified type. Many object-oriented languages have a default preference for covariance.
- Contravariant behavior means supertypes are acceptable substitutions for the specified type. Wherever covariance is possible, contravariance becomes possible as well.
The key takeaway I’d like to offer today is simply that variance is a form of added complexity, with some qualifications. This doesn’t make variance bad, of course. When that complexity is called for, it’s desirable, because it helps us cope with the actual challenge of the problem we’re facing.
Where does variance come from?
There are two main sources of variance. The first is simply functions.
If we have a hierarchy like num :> int :> nat and a function type like int -> int, how are we allowed to reinterpret this type?
We have a couple of options.
We are free to make its argument type more constrained, and we are free to relax and forget constraints about its result.
So we can cast that function type safely to nat -> num, but not to num -> nat.
Arguments are contravariant, while results are covariant.
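To make that concrete, here’s a minimal sketch (in Kotlin, purely for illustration), leaning on its built-in chain Any :> Number :> Int as a stand-in for num :> int :> nat:

val f: (Number) -> Number = { n -> n }

// Safe: the parameter narrows (Int <: Number) and the result widens (Number <: Any),
// because Kotlin function types are contravariant in parameters and covariant in results.
val g: (Int) -> Any = f

// Rejected by the compiler: this would widen the parameter and narrow the result.
// val h: (Any) -> Number = f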
But there’s a second major source of variance in our programs, and that’s IO or state.
Any time we’re looking at a type that describes what can be written or sent to it, we’re looking at something that’s contravariant.
That is, a Sink<int> can be reinterpreted as a Sink<nat>.
We’re free to insist we only write more specific things to it.
Any time we have a type describing what can be read from something, we’re looking at something that’s covariant.
A Source<int> can be reinterpreted as a Source<num>.
We’re free to relax our expectations about what we’ll read from it.
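Here’s a rough Kotlin sketch of both, with Number standing in for int and Int for nat; the in/out markers are declaration-site variance annotations, which come up again below:

// Sink only consumes its parameter; Source only produces it.
interface Sink<in T> { fun put(value: T) }
interface Source<out T> { fun get(): T }

fun reinterpret(numberSink: Sink<Number>, intSource: Source<Int>) {
    // We may promise to write only the more specific thing:
    val intSink: Sink<Int> = numberSink
    // We may relax what we expect to read:
    val numberSource: Source<Number> = intSource

    // Neither works in the other direction:
    // val anySink: Sink<Any> = numberSink
    // val intSource2: Source<Int> = numberSource
}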
But this is where mutation has something big to say, because a mutable variable is essentially both a sink and a source!
This means that any time you have something mutable that we can both read from and write to, like a RefCell<int>, we can no longer legally reinterpret it as a RefCell<num> or a RefCell<nat>.
It’s fixed.
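Here’s a sketch of why, with a hypothetical RefCell written in Kotlin; the names are just for illustration:

// A mutable cell is both a sink (writes to value) and a source (reads from value),
// so T can be neither covariant nor contravariant.
class RefCell<T>(var value: T)

fun refCellDemo(cell: RefCell<Int>) {
    cell.value = cell.value + 1           // reading and writing at exactly Int is fine

    // If the covariant reinterpretation were allowed...
    // val widened: RefCell<Number> = cell
    // ...this write would become legal, and cell would silently stop holding an Int:
    // widened.value = 3.14

    // The contravariant direction is ruled out by the mirror-image argument about reads.
}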
How do we interact with variance?
There are two general modes of interaction with variance: use-site variance, and declaration-site variance.
Sink and Source are good examples of where declaration-site variance is useful.
It’s simple, and it does the job.
We’re able to say that the type constructor itself has particular variance on each parameter, just like we can with the function type.
But use-site variance is more flexible.
It allows local code to make promises about how a type will be used that aren’t guaranteed for just any old use of that type.
The classic example here is dealing with array copying.
Arrays are invariant, so a copy from Array<T> to Array<T> could only ever copy between arrays of exactly the same T.
But clearly this function is only reading from its input and writing to its output, so its input could be treated covariantly and its output contravariantly.
With use-site annotations, we can have an invariant Array type, but have a copy function that will happily read from an Array<nat> and write to an Array<num>, permitting variance where it is locally acceptable.
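A sketch of that copy in Kotlin, whose use-site projections use the out/in spelling mentioned below:

// Array<T> itself stays invariant; the projections apply only at this use site.
// from is read-only here (a covariant use), to is write-only (a contravariant use).
fun <T> copy(from: Array<out T>, to: Array<in T>) {
    require(from.size <= to.size)
    for (i in from.indices) to[i] = from[i]
}

fun copyDemo() {
    val ints: Array<Int> = arrayOf(1, 2, 3)
    val nums: Array<Number> = arrayOf<Number>(0, 0, 0)
    copy<Number>(ints, nums)   // read from an Array<Int>, write into an Array<Number>
}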
The notation for describing variance, uh, varies.
Java uses the rather irritating Array<? extends Type> for “covariant Type” and Array<? super Type> for “contravariant Type”.
Scala and friends use Array[+Type] and Array[-Type] respectively.
C# and friends use Array<out Type> and Array<in Type>.
Each language has varying support and syntax for declaration and use-site variance.
Variance appears even without subtyping
Although the complexity of variance is avoided without some kind of subtyping to create it, the underlying causes of variance still exist even when all of your type constructors are invariant. There’s a pretty neat example of this with Haskell.
One of the standard abstractions in Haskell is the Functor typeclass, which provides the fmap function we’ve talked about before.
But it turns out, this is the covariant functor.
Let’s introduce two simplish types:
data Sink t = Sink (t -> IO ())
data Source t = Source (IO t)
These types represent IO actions that either consume a t or produce a t.
And here’s the thing: we can give a Functor instance for Source, but we can’t for Sink.
Try it.
For a covariant functor, we operate on the values of type t in the data structure.
But when that t appears as the argument to a function, we don’t have that value anymore.
We have a function that takes that value.
Its “polarity” gets flipped, and now to deal with those, we need a contravariant functor.
-- found in Data.Functor.Contravariant:
class Contravariant f where
  contramap :: (a -> b) -> f b -> f a
For people coming from ordinary functors, this looks weird.
The types are backwards!
How are we going to get to an a from a b when all we have is a -> b?
But the key thing to remember is that with a contravariant functor we’re not dealing with a data structure that has values of type a in it, we’re dealing with a data structure that has functions from type a in it. (Typically.)
That’s what makes it contravariant.
Now we can write that instance for Sink:
instance Contravariant Sink where
  contramap f (Sink g) = Sink (g . f)
So this lets us transform a Sink b into a Sink a by writing a function a -> b.
The meaning is simply that when we “sink” a value of type a into our new Sink, we apply a -> b and then “sink” the resulting b into the original.
Avoiding the complexity of variance
I have this hypothesis that I’m not entirely sure how to substantiate. I think a significant part of the complexity that arises from variance and subtyping is a result of nominal subtyping, rather than being inherent.
I think that structural subtyping (or row polymorphism) results in a lattice structure that reduces or removes most of the complexity involved.
If y’all have any references I should read related to this, I’d like to hear about them.
But this is part of the reason I think variance impacts dynamic languages less than it does static. It’s not that variance doesn’t apply, or that variance is just complexity that’s created by a static type system. The complexity is partially inherent to variance, but partially created by nominal subtyping in particular. Dynamic languages often make use of “duck typing” which roughly corresponds to structural subtyping or row polymorphism, and so the reasoning required is less cognitively complex.
Or at least, that’s my guess.
End Notes
Besides invariant, covariant, and contravariant, there’s also bivariant, but that’s usually reserved for trivialities, and mostly gets brought up due to a misplaced sense of completeness. I think it complicates initial understanding of the concept.
The interaction of variance and the “is a” notion of inheritance was part of the reason I criticized both inheritance and the “is a” philosophy of what inheritance even means a while back.
So much of functional programming seems to be about making code as simple as it looks: invariance everywhere, immutability by default, more direct data structures, default non-nullability. Sometimes it strikes me as weird that functional programming could ever have been criticized as being complicated. I’m starting to agree with the idea that we do programmers a disservice by teaching the “machine model” of programming first. I’m not sure how else this stuff is more complicated except that people are struggling to map it down to the machine model, instead of understanding it on its own terms.