Sorting

The goal of this TD is to prove a (somewhat) realistic algorithm: a sorting algorithm on lists. It is supposed to be done over two sessions.

Create a new file Sort.agda and begin by importing the libraries necessary to have access to equality, truth, conjunction, disjunction, natural numbers, products and lists:

open import Relation.Binary.PropositionalEquality
open import Data.Unit using (⊤ ; tt)
open import Data.Product renaming (_×_ to _∧_ ; proj₁ to fst ; proj₂ to snd)
open import Data.Sum renaming (_⊎_ to _∨_ ; inj₁ to left ; inj₂ to right)
open import Data.Nat using (ℕ ; zero ; suc ; _+_)
open import Data.Product renaming (proj₁ to fst ; proj₂ to snd)
open import Data.List hiding (length ; head)

For simplicity, we are going to sort lists of natural numbers. We must thus first define the order on those.

1 Order on natural numbers

data _≤_ : ℕ → ℕ → Set where
  z≤n : {n : ℕ}                 → zero  ≤ n
  s≤s : {m n : ℕ} (m≤n : m ≤ n) → suc m ≤ suc n

1.1 Compatibility with successor

≤-pred : {m n : ℕ} → (suc m ≤ suc n) → m ≤ n

≤-suc : {m n : ℕ} → m ≤ n → suc m ≤ suc n

≤s : (n : ℕ) → n ≤ suc n

1.2 Reflexivity

1.3 Transitivity

≤-trans : {m n p : ℕ} → (m ≤ n) → (n ≤ p) → (m ≤ p)

1.4 Totality

Show that the order is total, meaning that any pair of natural number can be compared:

_≤?_ : (m n : ℕ) → (m ≤ n) ∨ (n ≤ m)

2 Insertion sort

We are interested in the insertion sort algorithm, which is a simple sorting algorithm, although not very efficient.

2.1 Insertion

insert : (x : ℕ) → (l : List ℕ) → List ℕ

which given an element x and a list l, which is supposed to be sorted, inserts x in l so that the result is sorted.

2.2 Sorting

sort : List ℕ → List ℕ

test : List ℕ
test = sort (4 ∷ 1 ∷ 45 ∷ 8 ∷ 32 ∷ 12 ∷ 1 ∷ [])

2.3 The bounded below predicate

_≤*_ : (x : ℕ) → (l : List ℕ) → Set

such that x ≤* l means that x ≤ y is satisfied for every element of l (this function can be defined recursively, without defining the predicate of belonging to a list). It is simpler to define it as an inductive type.

2.4 The sorted predicate

sorted : (l : List ℕ) → Set

2.5 Insert is sorting

insert-sorting : (x : ℕ) → (l : List ℕ) → sorted l → sorted (insert x l)

2.6 Sort is sorting

Finally, show that the function sort always produces a sorted list (sort-sorting).

2.7 The problem of specification

f : List ℕ → List ℕ

f-sorting : (l : List ℕ) → sorted (f l)

What is missing from the specification of the sort function? We shall address this issue later on. For now, we are only interested in proving that sort is returning a sorted list.

2.8 Intrinsic approach

The above proof method was extrinsic: we first defined the sorting function and then showed that it was correct. We now investigate the intrinsic approach, which consists in directly defining the function with a type ensuring its correctness.

Try to define an inductive Sorted type with two constructors, such that the terms of this type can be thought of as the sorted lists. Do you manage to do it? If not, why?

In Agda, we can define functions which are mutually recursive: one is calling the other and vice versa. For instance, the functions even and odd, which determine whether a natural number is even or odd, can be defined by mutual recursion by

mutual
  even : ℕ → Bool
  even zero = true
  even (suc n) = odd n

  odd : ℕ → Bool
  odd zero = false
  odd (suc n) = even n

Define by mutual recursion the inductive type Sorted (whose constructors are called nil and cons) together with a function head which returns the first element of a sorted list or a default value if the list is empty:

mutual
  data Sorted : Set where
    ...

  head : ℕ → Sorted → ℕ
  head d l = ?

insert' : (x : ℕ) → Sorted → Sorted

For the last case of the definition, you might need to prove a lemma, formulated as a mutually recursive function.

sort' : (l : List ℕ) → Sorted

3 Preservation of elements

The goal of this section is to show that the insertion sort function is actually returning a sorted version of its input list. As we have seen, it is not enough to show that it returns a sorted list since the sorted list might have nothing to do with the original list. What is missing is that the list in the output of sort l has the same elements of the input list l. We might want to be even more picky and show that the sorting function preserves the multiplicities of the elements, i.e,

sort (3 ∷ 1 ∷ 1 ∷ [])

1 ∷ 1 ∷ 3 ∷ []

1 ∷ 3 ∷ 3 ∷ []

1 ∷ 3 ∷ []

3.1 Permutations

A way to show this is to prove that the sorted list is a permutation of the original one. Since permutations are generated by transpositions (exchanging two consecutive elements), we can define the following relation ∼ which expresses that a list is a permutation of another:

data _∼_ {A : Set} : List A → List A → Set where
  ∼-nil : [] ∼ []
  ∼-drop : (x : A) {l l' : List A} → l ∼ l' → (x ∷ l) ∼ (x ∷ l')
  ∼-swap : (x y : A) (l : List A) → (x ∷ y ∷ l) ∼ (y ∷ x ∷ l)
  ∼-trans : {l l' l'' : List A} → l ∼ l' → l' ∼ l'' → l ∼ l''

3.2 Properties

∼-refl : {A : Set} {l : List A} → l ∼ l

∼-sym : {A : Set} {l l' : List A} → l ∼ l' → l' ∼ l

3.3 Insertion and permutation

insert-∼ : (x : ℕ) (l : List ℕ) → (x ∷ l) ∼ (insert x l)

∼-insert : (x : ℕ) {l l' : List ℕ} → l ∼ l' → insert x l ∼ insert x l'

(recall that the C-c C-a shortcut instructs Agda to try to automatically fill a hole).

3.4 Sorting and permutation

sort-∼ : (l : List ℕ) → l ∼ (sort l)

4 Merge sort

As indicated above, the insertion sort we studied is not realistic because not very efficient. We now turn to merge sort which has a decent complexity. This sorting algorithm proceeds in three phases:

We recall that A × B is the notation for the product of the types A and B: terms of this type are pairs (a , b) consisting of a term a of type A and a term b of type B. The functions fst and snd can be used to retrieve the first and second component of a pair.

4.1 Splitting

split : {A : Set} → List A → List A × List A

which splits a list in two (by returning the elements at even / odd position). For instance, the result of

split (4 ∷ 1 ∷ 45 ∷ 8 ∷ 32 ∷ 12 ∷ 1 ∷ [])

(4 ∷ 45 ∷ 32 ∷ 1 ∷ [] , 1 ∷ 8 ∷ 12 ∷ [])

4.2 Merging

merge : (l m : List ℕ) → List ℕ

such that if l and m are sorted lists then their merge is the sorted list containing the elements of both. It can be defined recursively by comparing the first elements of l and m with ≤?. It might be the case that Agda thinks that your function is not terminating whereas it should, simply ignore this for now (see below).

4.3 Sorting

mergesort : List ℕ → List ℕ

because Agda could not ensure that the function is terminating. Of course, the function is terminating, because the lengths of the lists in the recursive calls do decrease, but Agda is not able to show it because it only considers functions which are structurally decreasing, i.e., those for which the recursive calls are performed on subterms of the arguments.

You can instruct Agda to trust it is by adding, just before the declaration of the function,

{-# TERMINATING #-}

Do this and test your function by normalizing (using the shortcut C-c C-n) the term test-merge defined by

test-merge : List ℕ
test-merge = mergesort (4 ∷ 1 ∷ 45 ∷ 8 ∷ 32 ∷ 12 ∷ 1 ∷ [])

1 ∷ 1 ∷ 4 ∷ 8 ∷ 12 ∷ 32 ∷ 45 ∷ []

and should be computed instantaneously. If you do not obtain an answer, this might indicate that your program is looping. This is case, use the menu Agda > Kill and restart Agda (or the shortcut C-c C-x C-r) to stop Agda (and C-c C-l should restart it).

Using {-# TERMINATING #-} in Agda is cheating since we might not detect that your program is not terminating. Moreover, it might lead to inconsistencies:

open import Data.Empty

{-# TERMINATING #-}
absurd : ⊥
absurd = absurd

We would thus rather avoid that, and prove the termination of our program, which is the goal of the questions below.

4.4 Splitting is decreasing

Define the strict order < on natural numbers (this can be done in one sort line, using ≤ and suc).

length : {A : Set} (l : List A) → ℕ
length [] = zero
length (x ∷ l) = suc (length l)

Show that, given a list l, the length of the first component of split l is strictly smaller than the length of l, for almost all lists l (split-fst-decr).

4.5 The fuel definition of merge sort

In order for Agda to know that the function is terminating, we construct a variant of the mergesort function with an extra integer argument n, the fuel, which is a bound on the number of steps the functions can make. The last added argument, of type length l ≤ n, is here to ensure that we can actually make recursive calls when we are not in the base case. Assuming that the above merge function is terminating, define the function

mergesort-fuel : (n : ℕ) → (l : List ℕ) → (length l ≤ n) → List ℕ

mergesort' : List ℕ → List ℕ

Optional. If your merge function was not recognized as terminating, use the fuel technique to make a proper definition of it (you might use addition and introduce a few easy lemmas about natural numbers).

4.6 Merge sort is sorting

mergesort-sorting : (l : List ℕ) → sorted (mergesort' l)

You are left with the choice of using the extrinsic or intrinsic approach (although the second one is advised). As in the first part, you will certainly have to prove some preliminary lemmas.

5 Well-founded definition of merge sort

Rel : (A : Set) → Set₁
Rel A = A → A → Set

data Acc {A : Set} (_<_ : Rel A) (x : A) : Set where
  acc : ((y : A) → y < x → Acc _<_ y) → Acc _<_ x

WellFounded : {A : Set} → (_<_ : Rel A) → Set
WellFounded {A} _<_ = (x : A) → Acc _<_ x

postulate wfNatAxiom : WellFounded _<_

5.1 Definition of merge sort

Define merge sort by well-founded induction on the length of the list (mergesort''). You should first define an auxiliary function which acts like merge sort, but takes as extra argument a proof that the length of the list is accessible.

5.2 Auxiliary lemmas

Our goal is now to show the above postulate (stating that ℕ is well-founded). First, show the two auxiliary lemmas

≤-< : {m n : ℕ} → (m ≤ n) → m ≡ n ∨ m < n

5.3 ℕ is well-founded

wfNat : WellFounded _<_