Sorting

Samuel Mimram

The goal of this TD is to prove a (somewhat) realistic algorithm: a sorting algorithm on lists. It is supposed to be done over two sessions.

Create a new file Sort.agda and begin by importing the libraries necessary to have access to equality, truth, conjunction, disjunction, natural numbers, products and lists:

open import Relation.Binary.PropositionalEquality
open import Data.Unit using (; tt)
open import Data.Product renaming (_×_ to __ ; proj₁ to fst ; proj₂ to snd)
open import Data.Sum renaming (__ to __ ; inj₁ to left ; inj₂ to right)
open import Data.Nat using (; zero ; suc ; _+_)
open import Data.Product renaming (proj₁ to fst ; proj₂ to snd)
open import Data.List hiding (length ; head)

For simplicity, we are going to sort lists of natural numbers. We must thus first define the order on those.

1 Order on natural numbers

The usual order on natural numbers can be defined as the inductive type

data __ : Set where
  z≤n : {n :}                  zero  ≤ n
  s≤s : {m n :} (m≤n : m ≤ n)  suc m ≤ suc n

(the symbol is typed by \le).

1.1 Compatibility with successor

Show that it is compatible with successor by constructing terms

≤-pred : {m n :}  (suc m ≤ suc n)  m ≤ n

and

≤-suc : {m n :}  m ≤ n  suc m ≤ suc n

and

≤s : (n :)  n ≤ suc n

1.2 Reflexivity

Show that the order is reflexive:

≤-refl : (n : ℕ) → n ≤ n

1.3 Transitivity

Show that the order is transitive:

≤-trans : {m n p :}  (m ≤ n)  (n ≤ p)  (m ≤ p)

1.4 Totality

Show that the order is total, meaning that any pair of natural number can be compared:

_≤?_ : (m n :)  (m ≤ n)(n ≤ m)

2 Insertion sort

We are interested in the insertion sort algorithm, which is a simple sorting algorithm, although not very efficient.

2.1 Insertion

Define a function

insert : (x :)  (l : List ℕ)  List ℕ

which given an element x and a list l, which is supposed to be sorted, inserts x in l so that the result is sorted.

2.2 Sorting

Use this function to program a function

sort : List ℕ  List ℕ

which sorts a list (by iteratively inserting its elements to the empty list).

In order to test your function, you can define a term such as

test : List ℕ
test = sort (4145832121 ∷ [])

and normalize it using the C-c C-n shortcut.

2.3 The bounded below predicate

Define a predicate

_≤*_ : (x :)  (l : List ℕ)  Set

such that x ≤* l means that x ≤ y is satisfied for every element of l (this function can be defined recursively, without defining the predicate of belonging to a list). It is simpler to define it as an inductive type.

2.4 The sorted predicate

Define a predicate

sorted : (l : List ℕ)  Set

which expresses that a list is sorted.

2.5 Insert is sorting

Show that inserting an element in a sorted list produces a sorted list:

insert-sorting : (x :)  (l : List ℕ)  sorted l  sorted (insert x l)

This proof might first require you to show (two) auxiliary lemmas.

2.6 Sort is sorting

Finally, show that the function sort always produces a sorted list.

2.7 The problem of specification

Actually, the specification is not complete. Construct a (very simple) function

f : List ℕ  List ℕ

which is not sorting its argument, and for which we can prove

f-sorting : (l : List ℕ)  sorted (f l)

What is missing from the specification of the sort function? We shall address this issue later on. For now, we are only interested in proving that sort is returning a sorted list.

2.8 Intrinsic approach

The above proof method was extrinsic: we first defined the sorting function and then showed that it was correct. We now investigate the intrinsic approach, which consists in directly defining the function with a type ensuring its correctness.

Try to define an inductive Sorted type with two constructors, such that the terms of this type can be thought of as the sorted lists. Do you manage to do it? If not, why?

In Agda, we can define functions which are mutually recursive: one is calling the other and vice versa. For instance, the functions even and odd, which determine whether a natural number is even or odd, can be defined by mutual recursion by

mutual
  even : Bool
  even zero = true
  even (suc n) = odd n

  odd : Bool
  odd zero = false
  odd (suc n) = even n

Define by mutual recursion the inductive type Sorted (whose constructors are called nil and cons) together with a function head which returns the first element of a sorted list or a default value if the list is empty:

mutual
  data Sorted : Set where
    ...

  head : Sorted 
  head d l = ?

Define an insertion function

insert' : (x :)  Sorted  Sorted

For the last case of the definition, you might need to prove a lemma, formulated as a mutually recursive function.

Finally, define a sorting function

sort' : (l : List ℕ)  Sorted

3 Preservation of elements

Please do next sections first and come back here afterward.

The goal of this section is to show that the insertion sort function is actually returning a sorted version of its input list. As we have seen, it is not enough to show that it returns a sorted list since the sorted list might have nothing to do with the original list. What is missing is that the list in the output of sort l has the same elements of the input list l. We might want to be even more picky and show that the sorting function preserves the multiplicities of the elements, i.e,

sort (311 ∷ [])

is

113 ∷ []

and not

133 ∷ []

nor

13 ∷ []

3.1 Permutations

A way to show this is to prove that the sorted list is a permutation of the original one. Since permutations are generated by transpositions (exchanging two consecutive elements), we can define the following relation which expresses that a list is a permutation of another:

data __ {A : Set} : List A  List A  Set where
  ∼-nil : [] ∼ []
  ∼-drop : (x : A) {l l' : List A}  l ∼ l'  (x ∷ l)(x ∷ l')
  ∼-swap : (x y : A) (l : List A)  (x ∷ y ∷ l)(y ∷ x ∷ l)
  ∼-trans : {l l' l'' : List A}  l ∼ l'  l' ∼ l''  l ∼ l''

3.2 Properties

Show that the relation is reflexive

∼-refl : {A : Set} {l : List A}  l ∼ l

and symmetric

∼-sym : {A : Set} {l l' : List A}  l ∼ l'  l' ∼ l

3.3 Insertion and permutation

Show that insertion permutes the first element:

insert-∼ : (x :) (l : List ℕ)  (x ∷ l)(insert x l)

Show that insertion is compatible with permutation

∼-insert : (x :) {l l' : List ℕ}  l ∼ l'  insert x l ∼ insert x l'

(recall that the C-c C-a shortcut instructs Agda to try to automatically fill a hole).

3.4 Sorting and permutation

Show that sorting produces a permutation of the original list:

sort-∼ : (l : List ℕ)  l ∼ (sort l)

4 Merge sort

As indicated above, the insertion sort we studied is not realistic because not very efficient. We now turn to merge sort which has a decent complexity. This sorting algorithm proceeds in three phases:

  1. split the input list \(l\) into two lists \(l_1\) and \(l_2\) of roughly the same length (up to a difference of 1 if the length of the input list is odd),
  2. recursively sort the lists \(l_1\) and \(l_2\),
  3. recombine the two sorted lists into a sorted list.

We recall that A × B is the notation for the product of the types A and B: terms of this type are pairs (a , b) consisting of a term a of type A and a term b of type B. The functions fst and snd can be used to retrieve the first and second component of a pair.

4.1 Splitting

Define a function

split : {A : Set}  List A  List A × List A

which splits a list in two (by returning the elements at even / odd position). For instance, the result of

split (4145832121 ∷ [])

will be the pair

(445321 ∷ [] , 1812 ∷ [])

4.2 Merging

Define a function

merge : (l m : List ℕ)  List ℕ

such that if l and m are sorted lists then their merge is the sorted list containing the elements of both. It can be defined recursively by comparing the first elements of l and m with ≤?. It might be the case that Agda thinks that your function is not terminating whereas it should, simply ignore this for now (see below).

4.3 Sorting

Finally, define the function

mergesort : List ℕ  List ℕ

You will certainly get an error message such as

Termination checking failed for the following functions: mergesort

because Agda could not ensure that the function is terminating. Of course, the function is terminating, because the lengths of the lists in the recursive calls do decrease, but Agda is not able to show it because it only considers functions which are structurally decreasing, i.e., those for which the recursive calls are performed on subterms of the arguments.

You can instruct Agda to trust it is by adding, just before the declaration of the function,

{-# TERMINATING #-}

Do this and test your function by normalizing (using the shortcut C-c C-n) the term test-merge defined by

test-merge : List ℕ
test-merge = mergesort (4145832121 ∷ [])

The answer should be

1148123245 ∷ []

and should be computed instantaneously. If you do not obtain an answer, this might indicate that your program is looping. This is case, use the menu Agda > Kill and restart Agda (or the shortcut C-c C-x C-r) to stop Agda (and C-c C-l should restart it).

Using {-# TERMINATING #-} in Agda is cheating since we might not detect that your program is not terminating. Moreover, it might lead to inconsistencies:

open import Data.Empty

{-# TERMINATING #-}
absurd :
absurd = absurd

We would thus rather avoid that, and prove the termination of our program, which is the goal of the questions below.

4.4 Splitting is decreasing

Define the strict order < on natural numbers (this can be done in one sort line, using and suc).

Define the length function on lists by

length : {A : Set} (l : List A) 
length [] = zero
length (x ∷ l) = suc (length l)

Show that, given a list l, the length of the first component of split l is strictly smaller than the length of l, for almost all lists l.

Do the same proof for the second component.

4.5 The fuel definition of merge sort

In order for Agda to know that the function is terminating, we construct a variant of the mergesort function with an extra integer argument n, the fuel, which is a bound on the number of steps the functions can make. The last added argument, of type length l ≤ n, is here to ensure that we can actually make recursive calls when we are not in the base case. Assuming that the above merge function is terminating, define the function

mergesort-fuel : (n :)  (l : List ℕ)  (length l ≤ n)  List ℕ

Use it to give a proper definition of

mergesort : (l : List ℕ)  List ℕ

Test it as above.

Optional. If your merge function was not recognized as terminating, use the fuel technique to make a proper definition of it (you might use addition and introduce a few easy lemmas about natural numbers).

4.6 Merge sort is sorting

Show that your implementation is returning a sorted list:

mergesort-sorting : (l : List ℕ)  sorted (mergesort l)

You are left with the choice of using the extrinsic or intrinsic approach (although the second one is advised). As in the first part, you will certainly have to prove some preliminary lemmas.

5 Well-founded definition of merge sort

5.1 Basic definitions

The types of relations on a given type is

Rel : (A : Set)  Set₁
Rel A = A  A  Set

We define the predicates of accessibility and well-foundedness by

data Acc {A : Set} (_<_ : Rel A) (x : A) : Set where
  acc : ((y : A)  y < x  Acc _<_ y)  Acc _<_ x

WellFounded : {A : Set}  (_<_ : Rel A)  Set
WellFounded {A} _<_ = (x : A)  Acc _<_ x

For now, we assume that ℕ is well-founded:

postulate wfNat : WellFounded _<_

5.2 Definition of merge sort

Define merge sort by well-founded induction on the length of the list. You should first define an auxiliary function which acts like merge sort, but takes as extra argument a proof that the length of the list is accessible.

5.3 Auxiliary lemmas

Our goal is now to show the above postulate. First, show the two auxiliary lemmas

≤-< : {m n :}  (m ≤ n)  m ≡ n ∨ m < n

and

<-last : {m n : ℕ} → (m < suc n) → m ≡ n ∨ m < n

5.4 ℕ is well-founded

Remove the above postulate and show

wfNat : WellFounded _<_