Lab 1 (first-order data types)

General instructions

The lab contains both programming exercises and theory exercises, which can be done in any order. You should try to complete the non-optional questions if you can before the deadline.
All of the programming exercises in this lab can be solved using just pattern-matching and recursion, without using any higher-order functions (though some admit more concise solutions using higher-order functions, which will be covered in Lecture 2). You should not use any external libraries, but it is okay to import functions from the standard library (although, again, it is not necessary for this lab).
Download the template file Lab1.hs, put all your solutions in this file, and upload it to Moodle before the deadline. Make sure that you submit a working Haskell file!

Programming exercises

--- Programming exercises

Exercise 0 (warm-up)

Complete the function definitions below:

-- Exercise 0a
doubleList :: [a] -> [a]
doubleList = undefined

-- Exercise 0b
firstDoubled :: Eq a => [a] -> Maybe a
firstDoubled = undefined

where doubleList takes a list and returns a new list with every element replaced by two copies of itself, while firstDoubled takes a list and returns Just the first element that appears at least twice in order with no elements in between, or else Nothing if no such element exists. Some examples:

> doubleList [1..3]
[1,1,2,2,3,3]
> doubleList "yes!"
"yyeess!!"
> firstDoubled [1,2,2,3,3]
Just 2
> firstDoubled "hello"
Just 'l'
> firstDoubled "world"
Nothing

Exercise 1

[Adapted from Hameer & Pientka 2018]

You work at a cupcake shop that has recently taken the step of migrating their operations to Haskell, to improve reliability and customer satisfaction. When selecting a cupcake, the most important criterion for customers is whether it contains any ingredients they are allergic to, and afterwards they are mainly interested in its price. You therefore begin by introducing the following data type of allergens:

data Allergen = Nuts | Gluten | Soy | Dairy      deriving (Show, Eq)

(The “deriving (Show,Eq)” statement is for convenience, automatically deriving routines for pretty-printing and equality-testing.) Next, you define a “recipe” as a list of allergens (your bakers are highly experienced, and can deduce all of the ingredients in the recipe from the list of allergens):

type Recipe   = [Allergen]

(Note that Recipe is a type synonym rather than a data type declaration, which means that it doesn’t introduce any new constructors and there is no need for a “deriving” statement. A data type declaration with a single constructor would also work here though.) Finally, you introduce a data type of cupcakes, with a single constructor taking the name of the cupcake (a string), a recipe, and a price (in integer units).

type Name     = String
type Price    = Int
data Cupcake  = CC Name Recipe Price             deriving (Show,Eq)

Here are the recipes currently used at the bakery, and the list of cupcakes on sale at the shop:

r1, r2, r3, r4, r5 :: Recipe
r1 = [Gluten]
r2 = []
r3 = [Nuts]
r4 = [Dairy,Gluten]
r5 = [Soy]

onsale :: [Cupcake]
onsale = [CC "Chocolate Surprise" r1 200,
          CC "Lemon Mayhem" r2 150,
          CC "Peanut Butter Bliss" r3 150,
          CC "Yogurt Truly" r4 250,
          CC "Caramel Karma" r5 200]

For this exercise, you need to implement two functions

-- Exercise 1a
priceRange :: Price -> Price -> [Cupcake] -> [Name]
priceRange = undefined

-- Exercise 1b
allergyFree :: [Allergen] -> [Cupcake] -> [Name]
allergyFree = undefined

which will help clients quickly filter cupcakes by different criteria as follows:

priceRange minPrice maxPrice cs should return the list of names of cupcakes in cs (in any order) whose price is between minPrice and maxPrice inclusive. For example, priceRange 200 300 onsale should evaluate to ["Chocolate Surprise","Yogurt Truly","Caramel Karma"] (or some permutation thereof).
allergyFree as cs should similarly return the list of names of cupcakes in cs that do not contain any of the allergens in as. For example, allergyFree [Gluten,Dairy] onsale should evaluate to ["Lemon Mayhem","Peanut Butter Bliss","Caramel Karma"] (or some permutation thereof).

Exercise 2

You are tasked with assuring quality at the bakery for the cupcake shop. Under normal operations, cupcakes are baked together in batches on a cupcake tin before being priced. To improve quality control, you would like to include an additional step of checking that every tin satisfies some formal specification before it is baked. You introduce the following types:

type Tin = [Recipe]
data Spec = And Spec Spec | Or Spec Spec | Not Spec | HasCup Int Allergen  deriving (Show,Eq)

Such specifications should be interpreted as follows:

a tin satisfies the specification And s1 s2 iff it satisfies both s1 and s2
a tin satisfies the specification Or s1 s2 iff it satisfies either s1 or s2
a tin satisfies the specification Not s iff it does not satisfy the specification s
a tin satisfies the specification HasCup k x iff the recipe at position k (using 0-indexing) contains the allergen x

For example, the sample tin pictured below (image not to scale)

sampletin :: Tin
sampletin = [r3,r4,r2,r5]

satisfies the specification

And (HasCup 0 Nuts) (And (HasCup 1 Gluten) (And (Not (HasCup 2 Dairy)) (HasCup 3 Soy)))

but does not satisfy the specification

Or (HasCup 0 Dairy) (HasCup 1 Soy)

Write a function

-- Exercise 2a
checkSpec :: Spec -> Tin -> Bool
checkSpec = undefined

which takes a specification s and a tin t as inputs, and returns True just in case t satisfies the specification s, or else False. For this question, you can assume that all HasCup k x specifications use valid indices 0 <= k < n, where n is the number of cups in the tin t.

Then (optionally) write a function

-- Exercise 2b (optional)
checkSpec' :: Spec -> Tin -> Maybe Bool
checkSpec' = undefined

which behaves similarly to checkSpec, except that it also ensures that the specification is well-formed in the sense that it does not contain any subspecs of the form HasCup k x with k out of bounds. If the specification is well-formed then it should return Just the result of testing the specification on the tin, and otherwise it should return Nothing.

Exercise 3

During a lunch break, one of your more mathematically-minded coworkers at the bakery writes down the following data type:

data Tree a b = Leaf a | Node b [Tree a b]  deriving (Show,Eq)

She explains that this is a very general type of tree where every leaf contains a value of type a, and where every node contains a value of type b together with an ordered (possibly empty) list of children (You might compare this type with the type BinTree a of binary trees with labelled nodes introduced in Section 6 of the Lecture 1 notes.) For example, the tree

              ________1_______
             |        |       |
texample =  _2_     __3__     4
           |   |   |  |  |
           a   b   c  d  e

(which contains integers at the nodes and characters at the leaves) can be represented by the following value:

texample :: Tree Char Integer
texample = Node 1 [Node 2 [Leaf 'a', Leaf 'b'], Node 3 [Leaf 'c', Leaf 'd', Leaf 'e'], Node 4 []]

Likewise, the tree

            c
           / \
          /   \
bst =    a     d
        / \   / \
           b
          / \

(which contains characters at the nodes and nothing at the leaves) can be represented by the following value:

bst :: Tree () Char
bst = Node 'c' [Node 'a' [Leaf (), Node 'b' [Leaf (), Leaf ()]], Node 'd' [Leaf (), Leaf ()]]

Implement the functions:

-- Exercise 3a
canopy :: Tree a b -> [a]
canopy = undefined

-- Exercise 3b (optional)
preorder :: Tree a b -> [Either a b]
preorder = undefined

which compute respectively: the canopy of a tree, defined as the list of its leaves, ordered from left to right; and the left-to-right preorder traversal of a tree, listing both its nodes and its leaves embedded within a sum type.

For example, you should see the following outputs:

> canopy texample
"abcde"
> canopy bst
[(),(),(),(),()]
> preorder texample
[Right 1,Right 2,Left 'a',Left 'b',Right 3,Left 'c',Left 'd',Left 'e',Right 4]
> preorder bst
[Right 'c',Right 'a',Left (),Right 'b',Left (),Left (),Right 'd',Left (),Left ()]

Hint: write canopy in mutual recursion with a function forest_canopy :: [Tree a b] -> [a] that computes the canopy of a forest (= list of trees). Similarly write preorder in mutual recursion with a function forest_preorder. (Alternatively, you can avoid introducing an auxiliary function to deal with forests by instead using a higher-order function from the standard library.)

Exercise 4

Sometimes during slow hours at the cupcake shop you do a bit of tinkering with sorting algorithms, and now you believe you’ve devised an algorithm for sorting in linear time! It applies to lists of any type of values with a comparison operation, and the basic idea is to use an auxiliary stack to keep track of what value to output first. In more detail, here’s how the algorithm works:

Begin by initializing an empty stack.
Scan the input values in order. For each value x, do the following:
- If the stack is nonempty and x is larger than the top item, then pop the top item on the stack to the output.
- Repeat the previous step until either the stack is empty or x is less than or equal to the top item, and then push x onto the stack.
Once there are no more input values, pop the remaining values on the stack over to the output.

For example, suppose the input is the list of integers [4,2,1,3]. The algorithm performs the following actions:

Read 4 from the input. Since the stack is empty, push 4 onto the stack. (Current stack = [4])
Read 2 from the input. Since 2 <= 4, push 2 onto the stack. (Current stack = [2,4])
Read 1 from the input. Since 1 <= 2, push 1 onto the stack. (Current stack = [1,2,4])
Read 3 from the input. Since 3 > 1, pop 1 to the output. (Current stack = [2,4])
Since 3 > 2, pop 2 to the output. (Current stack = [4])
Since 3 <= 4, push 3 onto the stack. (Current stack = [3,4])
Since there are no more input values, pop 3 and 4 to the output.

Observe that the input values have been popped in the order [1,2,3,4]. In general, since each input value is pushed and later popped exactly once, the algorithm always runs in linear time. Fantastic!

Implement the above algorithm in Haskell as a function

-- Exercise 4
linearSort :: Ord a => [a] -> [a]
linearSort = undefined

from input lists to output lists. For example, linearSort [4,2,1,3] should evaluate to [1,2,3,4].

Hint: first figure out how to represent the auxiliary stack.

Exercise 5 (optional)

You remember reading somewhere that comparison-based sorting algorithms need at least Ω(n log n) time in general, and so you’re starting to have doubts whether your linear-time sorting algorithm really always sorts the input list.

Give a counterexample, in the form of a list of integers

-- Exercise 5a (optional)
counterexample :: [Int]
counterexample = undefined

such that the list returned by linearSort counterexample is not sorted.

Oh no!

After your initial disappointment, you decide to go back and look more carefully at the cases for which the algorithm does successfully sort the input. You try running linearSort on all possible permutations of the list [1..n] for increasing values of n (here importing the standard library function permutations :: [a] -> [[a]] from Data.List is useful), and notice a remarkable pattern in the number of such permutations that the algorithm successfully sorts:

> [length [xs | xs <- permutations [1..n], linearSort xs == [1..n]] | n <- [0..]]
[1,1,2,5,14,42,132,429,1430,4862,16796,  C-c C-cInterrupted.

Entering this sequence of numbers into the On-Line Encyclopedia of Integer Sequences reaffirms your suspicion that these are the Catalan numbers! Since you know that the Catalan numbers also count unlabelled binary trees, you wonder whether there is a connection between binary trees and linearSortable permutations…

Let Bin be a data type of unlabelled binary trees, defined as follows:

data Bin = L | B Bin Bin  deriving (Show,Eq)

Write a pair of functions

-- Exercise 5b (optional)
fromBin :: Bin -> [Int]
fromBin = undefined
toBin :: [Int] -> Maybe Bin
toBin = undefined

that convert a binary tree to a linearSortable permutation, and vice versa, in a reversible way.

More precisely, fromBin t should take a binary tree t with n B-nodes and return a permutation xs of [1..n] such that linearSort xs == [1..n]. Conversely, toBin should take a permutation xs of [1..n], and in the case that linearSort xs == [1..n] should return Just a binary tree with n B-nodes, otherwise it should return Nothing. (If xs is not a permutation of [1..n], then toBin can return an arbitrary value.) Finally, these functions should be partial inverses in the sense that

toBin (fromBin t) == Just t

for all t :: Bin, and

fromBin (fromJust (toBin xs)) == xs

for all xs :: [Int] such that linearSort xs == [1..n]. (Here the standard library function fromJust :: Maybe a -> a is defined by the equation fromJust (Just x) = x. It may be imported from Data.Maybe.)

Theory exercises

Do Exercises 2.2 and 3.1 from the Lecture 1 notes. If you have extra time, you can also do 3.2-3.4.

You can write your solutions as comments in your Haskell file:

--- Theory exercises (from Lecture 1 notes)

-- Exercise 2.2
{-
   (This is an example of a multi-line comment, delimited by
   curly braces.)
-}

-- Exercise 3.1

-- Exercises 3.2-3.4 (optional)