Using dependent types to write proofs in Haskell
Table of Contents
Target audience: This blog is meant for intermediate/advanced Haskell readers, specially those interested in logic. If you are new to Haskell, this blog will probably not be easy to follow.
We all know that we can use Haskell to write functional programs that compute stuff. But can we also use Haskell to write mathematical proofs? Yes! In case you have never been exposed to dependent types, the concept of writing proofs with a programming language will surely sound alien to you. In this blog I hope to give you an informal introduction to dependent types in Haskell that will allow you to understand what does it mean to prove something in Haskell and how to do it. I strongly recommend that you follow along with ghci on your side.
Clarification. GHC does not have full dependent types, however, as we will see it has some features that allow us to get pretty close to them. If you are looking for a language with full dependent types I suggest that you look into Agda. In this other blog I present an Agda implementation of a type safe domain specific language to write Hilbert style proofs. Also, GHC should not be used as a serious proof assistant, because, as we will see in this blog, it is very easy to prove false statements in it.
You can find all related code to what I will present in this gitlab repository.
In order to get close to dependent types we need the help of some GHC extensions. If you want to follow along, make sure to have the following extensions enabled in ghci. I will comment on some of them when they become relevant to the code I'm showing. Note that each extension links to the corresponding GHC manual page.
GADTs. Allow use of Generalised Algebraic Data Types (GADTs).
ScopedTypeVariables. Enable lexical scoping of type variables explicitly introduced with
DataKinds. Allow promotion of data types to kind level.
PolyKinds. Allow kind polymorphic types.
TypeFamilies. Allow use and definition of indexed type and data families.
TypeOperators. Allow the use and definition of types with operator names.
UndecidableInstances. Permit definition of instances which may lead to type-checker non-termination.
Equality on types
All proofs which I will present in this blog are proofs of equality. In fact, we will only show theorems which can be expressed in a formula of the form \[∀x₁…∀xₙ(t₁ ⇒ … ⇒ tₘ ⇒ r = s),\] where $xᵢ$ are type variables and $t₁,…,tₘ,r,s$ are types. So a first obvious step should be to define a type in Haskell which expresses type equality (according to the rules of GHC). We should agree that a the following header is a good place to start.
data Equal a b where ...
Now we should provide a constructor for this datatype. Of course, the
constructor should be a witness that
b are equal.
So we can define the desired constructor like this:
data Equal a b where Witness :: (a ~ b) => Equal a b
Recall that the constraint
a ~ b means precisely what we want. Namely, that
b are equal types according to GHC.
However, we are going to use an alternative definition, which is slightly
simpler and is already defined in
data a :~: b where Refl :: a :~: a
Don't get confused by the infix notation of
:~:. It serves the same purpose
Equal, but with infix notation (see
TypeOperators). The constructor is
Refl for Reflexivity.
We are ready to write our first proof. Let us show that the equality relation defined above is symmetric. Recall that in mathematics we say that a relation $R$ is symmetric iff $∀x∀y(xRy⇒yRx)$.
Our next job is to find a type which faithfully represents the symmetry proposition. Let's use the following type.
sym :: (x :~: y) -> (y :~: x)
First we need to convince ourselves that the type
(x :~: y) -> (y :~: x)
faithfully represents a proposition stating the property of symmetry. In fact
if we use explicit quantification for
y and we interpret the
function type (
->) as an implication, it looks exactly like the mathematical
forall x y. (x :~: y) -> (y :~: x)
Now, if we can find a Haskell term which has the above type we will have a
term such that given any
y and a proof of
x :~: y, will return
a proof of
y :~: x. If such term exists, it should be clear that the
proposition is true and thus we should accept that term as a proof! In
fact, this interpretation of a proof corresponds precisely to the BHK
interpretation. If you want to read more on the BHK interpretation I suggest
this Stanford Encyclopedia article.
On we go to find a suitable term (proof)! We see that
(x :~: y) -> (y :~: x)
is a function type with one argument, so a good place to start would be this:
sym :: (x :~: y) -> (y :~: x) sym p = _
If we load this into ghci it will tell us that it has found a hole:
Found hole: _ :: y :~: x
How can we fill this hole? Well, the only way to build something of type
:~: x is by using the constructor
Refl, but doing that would give us either
x :~: x or
y :~: y, which is not what we want. So we are only left with
the option to pattern match against
p. Here the type of the constructor
Refl will play a crucial role. Let's proceed by asking ghci about its type.
ghci> :t Refl ghci> Refl :: forall k (a :: k). a :~: a
What this says is: For any kind
k and for any type
a of kind
k we have a
term of type
a :~: a. We can think of a kind as the type of a type. I will
go back to types and kinds later in this section.
Let's reload the code after replacing
that I write
p@Refl so it is easier for me to refer to this specific
in the text, but we could simply drop
p@ to the same effect.
sym :: (x :~: y) -> (y :~: x) sym p@Refl = _
Now we get that there is still a hole, but now it has type
x :~: x instead
y :~: x.
Found hole: _ :: x :~: x
We see that because of pattern matching and the type of
Refl, GHC has
unified the type variables
y. Why? well, we know that before pattern
matching we had
p : x :~: y. Then we have that
p@Refl : x :~: y, and
therefore by the type of the constructor
Refl it must be that
x ~ y.
Filling the hole and thus finishing the proof is trivial! We only need to
Refl constructor, which is a suitable term of the desired type
sym :: (x :~: y) -> (y :~: x) sym Refl = Refl
Yay! we finished our first proof! However, there is one critical condition for
a proof that we haven't yet commented: termination. In the case of
it obviously terminates for any input, since we just return the constructor
Refl. However, as proofs get more complicated and we need to perform
induction (as we will see in the natural numbers section), it is easy to end up
writing a well typed term that will not terminate on some inputs. If a term
does not terminate on some inputs, then it obviously cannot be taken as a
proof. Unfortunately, we are on our own because GHC does not perform any kind
of termination checking for terms. As an example of a well typed but
non-terminating term consider the following definition:
anyProof :: forall a. a anyProof = anyProof
Of course, we should always be very careful not to introduce a non-terminating (pseudo)proof, as the proof may be accepted by GHC and will lead us to believe that we have proven something which may be false.
As a simple exercise, I propose that you try to prove that
:~: is a
transitive relation. In Haskell terms, this means that you ought to find a
suitable definition for this type signature.
trans :: (x :~: y) -> (y :~: z) -> (x :~: z)
Hint: The proof is really similar to the proof of
sym. After you have
proven this, you will have proven that
:~: is symmetric, transitive and
reflexive (this property is given directly by the constructor
you can conclude that
:~: is an equivalence relation, which is the minimum
requirement a definition of equality should satisfy.
We end this section by summarizing the steps we should follow when writing a proof in Haskell.
Write a type which represents the proposition that we want to prove.
Find a term that has that type. GHC makes sure that this step is correct by type checking the term.
Make sure (outside of GHC/in our meta reasoning) that the given term in the previous step terminates for any input. If it doesn't, we do not have a valid proof.
A note on types and kinds
This section is a small parenthesis on types and kinds. It is a bit technical and tangential to the main topic so you may want to skip it on a first read.
If we inspect closely the type of
sym :: forall x y. (x :~: y) -> (y :~:
x), one may ask the questions: what are
y? They are not defined
anywhere, yet we can use them? The answer lies in the fact that GHC types are
not fully explicit most of the time. Consider the identity function type:
id :: a -> a
We could have raised a similar question by looking at this type. If we rewrite the previous type more explicitly, we would have:
id :: forall a. a -> a
We see that we universally quantify a type variable
a. But still this is not
fully explicit. One could ask: What is the type of
a? Or in GHC terms, what
is the kind of
a? Let's rewrite the type to make it even more explicit.
id :: forall k (a :: k). a -> a
Ok, we have solved the problem for
a. Now we know that
a has kind
But what is the kind of
k? We need to make it more explicit one last time.
-- Type is imported from Data.Kind id :: forall (k :: Type) (a :: k). a -> a
But how can it be that type of a kind
Type? This is because in GHC
we have the axiom
Type :: Type. This may raise some eyebrows since in the
field of logic typing relations which include at least a reflexive element,
Type :: Type, are well known to lead to inconsistent logics, and
thus are undesirable (if this topic is intriguing to you, I suggest that you
read the article on the Russell's Paradox in the Stanford Encyclopedia).
However, GHC allows
Type :: Type because it has certain advantages. Moreover,
inconsistencies are present even without it. This makes sense because GHC is
not meant to be used as a proof assistant. In the paper System FC with
Explicit Kind Equality, Weirich, Hsu and Eisenberg argue that having the
Type :: Type axiom does not break GHC's type system.
If you want to know how the paradox has been avoided in a programming language with dependent types, I suggest that you look into the Agda documentation for universe levels.
There is a lot more to types and kinds in GHC than what I have written (and probably a significant portion escapes my knowledge at this point in time). If you wish to know more, I suggest some references:
Proofs on natural numbers
Finally something that is more tangible: natural numbers. As we know, a natural numbers are the elements of the infinite sequence $0,1,2,3,…$. We will begin by providing a Haskell inductive definition for them.
data Nat where Zero :: Nat Suc :: Nat -> Nat
We can define particular naturals as expected:
one :: Nat one = Suc Zero two :: Nat two = Suc one ...
Say that we want to prove that
one + one = two. It should be pretty
straightforward right? Well, we have seen that we need to use types to express
two are terms, so we can not use them in a type.
Moreover, how do we define addition on the type level?
In order to solve this problem we will use some help from the
extension. This extension promotes type constructors, such as
Nat, to kinds.
Likewise, the extension promotes constructors, such as
Suc, to type constructors. Note that when we want to refer to a promoted
constructor we should prefix it with an apostrophe
'. Now we can define
naturals on the type level thus:
type One = 'Suc 'Zero type Two = 'Suc One
The first part of the problem is solved. We now need to define addition on the
Nat. We can do so with the help of the
extension. With this extension we can define type families, which
approximately are Haskell functions on types.
First let's define addition on terms as we would normally do. This is only to have a point of reference.
(+) :: Nat -> Nat -> Nat where Zero + b = b (Suc a) + b = Suc (a + b)
And now see how we can translate the previous definition to type families.
type family (a :: Nat) :+: (b :: Nat) :: Nat where 'Zero :+: b = b 'Suc a :+: b = 'Suc (a :+: b)
One can observe that both definitions look quite alike. In fact, they would be exactly the same definition if we had full dependent types in GHC.
Let's try an example. Notice that the we need a
:k! in order to ask
ghci to give us the normalized type.
ghci> :k! ('Suc 'Zero) :+: ('Suc 'Zero) ghci> ('Suc 'Zero) :+: ('Suc 'Zero) :: Nat ghci> = 'Suc ('Suc 'Zero)
Neat! We are finally ready to express in a type the property
one + one =
two and prove it.
onePlusOne :: (One :+: One) :~: Two onePlusOne = Refl
It suffices to use reflection since both types
One :+: One and
'Suc ('Suc 'Zero).
Let's prove another property:
n + zero = n.
nPlusZero :: (n :+: 'Zero) :~: n nPlusZero = Refl
But now we get an error!
Couldn't match type ‘n’ with ‘n :+: 'Zero’
The problem is that
n :+: 'Zero cannot be normalized further. This is
because when we defined
:+: we pattern matched the left argument, but now we
'Zero on the right. Hence the expected way to proceed is to pattern match
n. Unfortunately, we cannot pattern match on types directly so we
need to figure something out.
The strategy is to define a data type which will serve as a bridge between the terms universe and the types universe.
data SNat (n :: Nat) where SZero :: SNat 'Zero SSuc :: SNat n -> SNat ('Suc n)
The new data type
SNat which we just defined has the same structure as a
Nat, but it is indexed by a
Nat on the type level. Note that the
the first line is not a term of type
n is a type variable
Nat. With the help of
GADTs, we can use the promoted constructors
'Suc to create the desired link between terms and types. The
following theorem makes such a link explicit.
Theorem. For any type
n of kind
Nat, the type
SNat n has exactly one term.
We prove it by induction on
SZero :: SNat 'Zero. To see that
SZerois the only term of type
SNat 'Zero, assume that there is a term
b :: SNat 'Zero. We continue by cases on
SZerowe are done. Otherwise
bis of the form
SSuc _, but then
b's type must be of the form
SNat ('Suc _), which is a contradiction.
By induction hypothesis we have that there exists exactly one term
i :: SNat m. Then we have that
SSuc i :: SNat ('Suc m). Now assume that there is a term
j :: SNat ('Suc m). From the type we know that
jis of the form
k :: SNat m. Then by the induction hypothesis we have that
Because of the previous property we call the type
SNat a singleton type.
If you want a more extensive introduction to singleton types, I recommend this
blog by Justin Le.
We can now use the singleton type
SNat to pattern match on types of kind
nPlusZero :: SNat n -> (n :+: 'Zero) :~: n nPlusZero SZero = _ nPlusZero (SSuc m) = _
After loading the previous code into ghci we get the following information:
For the base case (first hole):
Found hole: _ :: 'Zero :~: 'Zero
This hole can be trivially proven with
For the inductive case (second hole):
Found hole: _ :: 'Suc (n1 :+: 'Zero) :~: 'Suc n1
We see that we need to prove an equality of the form
'Suc a :~: 'Suc b, and this equality is implied by
a :~: b. Let's prove this fact as an auxiliary lemma.
congSuc :: a :~: b -> 'Suc a :~: 'Suc b congSuc Refl = Refl
We see that
congSucis a very simple lemma that can be proved in the same way we proved
In order to use the previous lemma we only need to apply it as a regular function.
nPlusZero :: SNat n -> (n :+: 'Zero) :~: n nPlusZero SZero = Refl nPlusZero (SSuc m) = congSuc _
congSucthe hole becomes:
Found hole: _ :: (n1 :+: 'Zero) :~: n1
It worked, the hole type became simpler! In fact, it looks much like the property that we wanted to prove initially. This is often telling us that we need to apply the induction hypothesis. Applying the induction hypothesis is as simple as doing a recursive call to the theorem that we are proving.
nPlusZero :: SNat n -> (n :+: 'Zero) :~: n nPlusZero SZero = Refl nPlusZero (SSuc m) = congSuc (nPlusZero m)
We have finished our first proof by induction! Now imagine that we want to use
the previous theorem in another proof. In particular, imagine that we need to
fill a goal of type
SNat a -> SNat b -> (a :+: b) :+: 'Zero. It is clear
that the strategy should be to instantiate the
n in the lemma
a :+: b, but then we should provide an argument of type
SNat (a :+: b).
We can achieve that by implementing addition for the type
SNat as follows.
(.+.) :: SNat a -> SNat b -> SNat (a :+: b) SZero .+. b = b SSuc a .+. b = SSuc (a .+. b)
Finally we can use the theorem
someThm :: SNat a -> SNat b -> (a :+: b) :+: 'Zero someThm a b = nPlusZero (a .+. b)
At this point I have given you enough tools so that you can start writing your own proofs.
Below I provide a list of theorems that you may want to prove as practice. If you get stuck you can refer to my solutions in this repository, specifically this file. I wrote these solutions about 3 years ago when I was still a complete novice at writing proofs. So it is likely that you will find better proofs.
In order to solve the exercises you will need the following definitions.
type family (a :: Nat) :*: (b :: Nat) :: Nat where 'Zero :*: b = 'Zero 'Suc a :*: b = b :+: (a :*: b) type family (a :: Nat) :^: (b :: Nat) :: Nat where a :^: 'Zero = 'Suc 'Zero a :^: 'Suc b = a :*: (a :^: b)
plusSucR :: forall a b. SNat a -> SNat b -> (a :+: 'Suc b) :~: 'Suc (a :+: b)
plusAssoc :: forall a b c. SNat a -> SNat b -> SNat c -> (a :+: b) :+: c :~: a :+: (b :+: c)
plusCommut :: forall a b. SNat a -> SNat b -> (a :+: b) :~: (b :+: a)
prodZeroL :: SNat n -> 'Zero :*: n :~: 'Zero
prodZeroR :: SNat sn -> (sn :*: 'Zero) :~: 'Zero
prodOneL :: SNat sn -> (One :*: sn) :~: sn
prodOneR :: SNat sn -> (sn :*: One) :~: sn
prodSucL :: forall a b. SNat a -> SNat b -> ('Suc a :*: b) :~: b :+: (a :*: b)
prodSucR :: forall a b. SNat a -> SNat b -> (a :*: 'Suc b) :~: a :+: (a :*: b)
prodDistribR :: forall a b c. SNat a -> SNat b -> SNat c -> (a :+: b) :*: c :~: (a :*: c) :+: (b :*: c)
prodDistribL :: forall a b c. SNat a -> SNat b -> SNat c -> a :*: (b :+: c) :~: (a :*: b) :+: (a :*: c)
prodAssoc :: forall a b c. SNat a -> SNat b -> SNat c -> (a :*: b) :*: c :~: a :*: (b :*: c)
prodCommut :: forall a b. SNat a -> SNat b -> (a :*: b) :~: (b :*: a)
powerZero :: a :^: 'Zero :~: One
powerOne :: SNat a -> a :^: One :~: a
prodPower :: forall a b c. SNat a -> SNat b -> SNat c -> (a :^: b) :*: (a :^: c) :~: a :^: (b :+: c)
powerProd :: forall a b c. SNat a -> SNat b -> SNat c -> (a :^: c) :*: (b :^: c) :~: (a :*: b) :^: c
Proofs on lists
More exercises for lists.
You will need some common functions on lists on the type level.
type family Length (l :: [*]) :: Nat where Length ' = 'Zero Length (_':r) = 'Suc (Length r) type family (a :: [*]) :++: (b :: [*]) :: [*] where ' :++: b = b (a ': as) :++: b = a : (as :++: b) type family Reverse (a :: [*]) :: [*] where Reverse ' = ' Reverse (a ': as) = Reverse as :++: '[a]
A singleton list type can be defined thus:
data HList (l :: [*]) :: * where HNil :: HList ' HCons :: t -> HList l -> HList (t ': l)
List of exercises.
concatNilL :: ' :++: a :~: a
concatNilR :: HList a -> a :++: ' :~: a
lengthCons :: HList a -> Length (t ': a) :~: 'Suc (Length a)
lengthConcat :: HList a -> HList b -> Length a :+: Length b :~: Length (a :++: b)
concatAssoc :: forall a b c. HList a -> HList b -> HList c -> (a :++: b) :++: c :~: a :++: (b :++: c)
lengthConcatCommut :: forall a b. HList a -> HList b -> Length (a :++: b) :~: Length (b :++: a)
lengthReverse :: HList a -> Length (Reverse a) :~: Length a
concatReverse :: forall a b . HList a -> HList b -> Reverse a :++: Reverse b :~: Reverse (b :++: a)
consReverse :: forall a t. HList a -> t ': Reverse a :~: Reverse (a :++: '[t])
reverseReverse :: forall a. HList a -> Reverse (Reverse a) :~: a
If you plan on using singleton types on your projects, you should definitely
look into the
singletons library. According to its documentation,
singletons contains the basic types and definitions needed to support
dependently typed programming techniques in Haskell. Additionally, it is very
convenient to combine it with
singletons-th library, which will
automatically generate a lot of boilerplate code for you. For instance, in
order to define the natural numbers we needed to define
latter can be automatically generated by
singletons-th. Moreover, when
defining addition we needed to define
:+: on the type level for the kind
.+. on the singleton term level for
SNat. Again, with the
singletons-th library, we could automatically generate both
from the simple definition of
+ on the type
If you want to know more or you want different takes on this topic, I would suggest that you look into these references.
An introduction to typeclass metaprogramming by Alexis King.
Tweag youtube channel featuring Richard Eisenberg.
Basic Type Level Programming in Haskell by Matt Parsons.
nLab: propositions as types (not Haskell, more theoretical).
If you know of a nice reference which is missing here, let me know and I'll add it.
I hope you found this interesting. If you find any mistake, please let me know so I can correct it and give you credit.