CSE 428: Lecture Notes 22


Higher Order in ML

ML allows the definition of Higher Order functions. Essentially this means that a function can
  1. Take a function as an argument
  2. Give a function as result
More generally, full higher-order support requires the capability of treating functions as ordinary values. The slogan is: ``functions as first-class citizens''. This means, in particular, that we can have expressions of function type (i.e. expressions representing functions) without being obliged to give such a function a name. Such expressions are called ``anonymous functions''.

Some languages, like Pascal, support only a limited form of higher order. In Pascal, for instance, it is possible to define a function which takes another function as a parameter, but the actual parameter must be the name of a function.

Functions as arguments

We illustrate the utility of this feature with an example. Consider again the quicksort function seen a couple of lectures ago:
   fun quicksort(nil)  = nil
     | quicksort(h::t) = let val (low,high) = split(h,t) 
                          in append(quicksort(low), (h::quicksort(high)))
                         end
   and
       split(x,nil)    = (nil, nil)
     | split(x,(h::t)) = let val (k,m) = split(x,t) 
                          in if h < x 
                                then ((h::k),m)
                                else (k,(h::m))
                         end;
This function has type int list -> int list, which means that it can only be applied to lists of integers. However, the algorithm is very general, and one could easily imagine applying the same method to sort other kinds of lists, under the appropriate ordering relations: for instance, lists of strings under the lexicographic ordering, lists of lists under the ordering "shorter than", etc. The only thing that one needs to do is to replace the "<" relation in the test h < x in split with the appropriate ordering relation.

Of course, one would wish to write only one program, and to express the dependence on the ordering relation by using a parameter. Note however that the ordering relation is a function (its most general type is 'a * 'a -> bool), hence the language must allow quicksort to be expressed as a higher-order function. ML, being a higher-order language, allows it, and we can define quicksort as a function with two parameters, the list to be ordered and the ordering relation, as follows:

   fun quicksort(nil,ord)  = nil
     | quicksort(h::t,ord) = let val (low,high) = split(h,t,ord) 
                              in append(quicksort(low,ord), (h::quicksort(high,ord)))
                             end
   and
       split(x,nil,ord)    = (nil, nil)
     | split(x,(h::t),ord) = let val (k,m) = split(x,t,ord) 
                              in if ord(h,x)
                                    then ((h::k),m)
                                    else (k,(h::m))
                             end;
The type of this definition of quicksort is 'a list * ('a * 'a -> bool) -> 'a list.

For instance, we can use quicksort with the orderings "<" and ">" on integers. The only problem is that they are infix operators, while the parameter ord is used in prefix position. This problem can be solved with the keyword "op", which turns an infix operator into an ordinary prefix function:

   - quicksort([3,2,5,1],op <);
   val it = [1,2,3,5] : int list
   - quicksort([3,2,5,1],op >);
   val it = [5,3,2,1] : int list
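The same definition also covers the other orderings mentioned above. The sketch below is self-contained: append is assumed to be the usual list concatenation from the earlier lectures, and str_less and shorter are hypothetical helper names introduced only for this illustration.

```sml
(* Assumed helper: list concatenation, as used by quicksort in these notes. *)
fun append(nil, l)  = l
  | append(h::t, l) = h :: append(t, l);

fun quicksort(nil, ord)  = nil
  | quicksort(h::t, ord) = let val (low, high) = split(h, t, ord)
                            in append(quicksort(low, ord), h :: quicksort(high, ord))
                           end
and
    split(x, nil, ord)  = (nil, nil)
  | split(x, h::t, ord) = let val (k, m) = split(x, t, ord)
                           in if ord(h, x)
                                 then (h::k, m)
                                 else (k, h::m)
                          end;

(* Hypothetical orderings, not part of the original notes. *)
fun str_less(s : string, t : string) = s < t;     (* lexicographic order on strings *)
fun shorter(l1, l2) = length l1 < length l2;      (* "shorter than" on lists *)

val by_alphabet = quicksort(["pear", "apple", "fig"], str_less);
val by_length   = quicksort([[1,2,3], [4], [5,6]], shorter);
```

Here by_alphabet is ["apple","fig","pear"] and by_length is [[4],[5,6],[1,2,3]].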
Let us see some other examples of functions with higher-order parameters.

Example: map

The function map : 'a list * ('a -> 'b) -> 'b list takes a list L and a function f, and returns the list obtained by applying f to all the elements of L. This function can be defined as follows:
   fun map([],f) = []
     | map(a::L,f) = (f a)::map(L,f);
For example, if we have previously defined fact as the factorial function, we have:
   - map([1,2,3,4],fact);
   val it = [1,2,6,24] : int list

Example: reduce

The function reduce : 'a list * ('a * 'b -> 'b) * 'b -> 'b takes a list L, a binary function f, and a value v, and returns the result of combining all the elements of L with f, starting from v. Typically v is the neutral element of f; it is the result returned for the empty list.
   fun reduce([],f,v) = v
     | reduce(a::L,f,v) = f(a,reduce(L,f,v));
Examples:
   - reduce([1,2,3,4],op +,0);            (* sum all the elements of the list *)
   val it = 10 : int
   - reduce([1,2,3,4],op *,1);            (* multiply all the elements of the list *)
   val it = 24 : int
   - reduce([[1,2],[],[5,6,7]],op @,[]);  (* flatten the list *)
   val it = [1,2,5,6,7] : int list
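The element type of the list and the type of the result need not coincide. As a small sketch (count and join are hypothetical helper names introduced here for illustration), reduce can compute the length of a list, or concatenate the decimal representations of its elements:

```sml
fun reduce([], f, v)   = v
  | reduce(a::L, f, v) = f(a, reduce(L, f, v));

(* Hypothetical helpers, not part of the original notes. *)
fun count(_, n) = n + 1;                  (* ignores the element, just counts it *)
fun join(x, s)  = Int.toString x ^ s;     (* prepends the digits of x to s *)

val len    = reduce(["a", "b", "c"], count, 0);
val digits = reduce([1, 2, 3], join, "");
```

Here len is 3 and digits is "123".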

Anonymous functions

An anonymous function, or ``function without a name'', is an expression representing a function. Such expressions have the following syntax (in the simplest case):
   fn Id => Exp
where Id is a name representing the parameter of the function, and Exp is an expression representing the body of the function. For example:
   fn x => x * 2;
represents a function that, given a number n, returns the double of n, i.e. n * 2.

More generally, instead of a name as parameter we can use a pattern, like a tuple. For instance

   fn (x,y) => if x < y then y - x else x - y;
represents a binary function that, given two numbers n and m, returns the distance (namely the absolute value of the difference) between n and m.

Anonymous functions can be used anywhere a function of the same type is expected. For instance:

   - map([1,2,3,4],fn x => x * 2);
   val it = [2,4,6,8] : int list
   - reduce([1,2,3,4],fn (x,y) => if x = 0 then true else y,false); 
   val it = false : bool
Note: the expression
   - reduce(L,fn (x,y) => if x = 0 then true else y,false); 
gives true if and only if the list L contains at least one 0.
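A declaration with fun can be read as binding a name to an anonymous function; for non-recursive functions the two forms are interchangeable. A minimal sketch, where double and double' are names chosen for this illustration:

```sml
fun double x = x * 2;            (* named definition *)
val double' = fn x => x * 2;     (* the same function, bound with val *)

(* An anonymous function can also be applied directly, without ever naming it. *)
val six      = (fn x => x * 2) 3;
val distance = (fn (x, y) => if x < y then y - x else x - y) (10, 4);
```

Both six and distance evaluate to 6.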

Functions as result

In a higher order language it is possible to define functions which give a function as result.

Consider for instance the functional composition, defined as follows:

The functional composition of f and g is a function that, for every x, gives as result g(f(x))
In ML we can define the functional composition in the following way:
   fun comp(f,g) = fn x => g(f(x))
The type of comp is
   comp : ('a -> 'b) * ('b -> 'c) -> ('a -> 'c)
where 'a -> 'b represents the type of f, 'b -> 'c represents the type of g, and 'a -> 'c represents the type of comp(f,g).

For instance, comp(fact,fn x => x*2) represents the function which, given n, first computes the factorial of n, and then multiplies it by 2. Thus we have:

   - map([1,2,3,4],comp(fact,fn x => x * 2));
   val it = [2,4,12,48] : int list
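Standard ML also predefines an infix composition operator o, with (f o g) x = f(g x), so comp(f,g) behaves like g o f. The following self-contained sketch checks this on one input; fact is assumed to be the usual recursive factorial.

```sml
(* Assumed definition of the factorial function. *)
fun fact 0 = 1
  | fact n = n * fact(n - 1);

fun comp(f, g) = fn x => g(f(x));

val double_fact  = comp(fact, fn x => x * 2);   (* factorial first, then doubling *)
val double_fact' = (fn x => x * 2) o fact;      (* the same function written with o *)

val a = double_fact 4;
val b = double_fact' 4;
```

Both a and b evaluate to 48.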

Some clarification about ML

Parentheses

Due to the higher-order features of ML, the rules about parentheses in expressions might be, in certain cases, a bit different than in other languages. Consider for instance the following definition of the identity function:
   - fun f(x) = x;
   val f = fn : 'a -> 'a
Now, suppose that we want to write the expression
   not(f(true));
In a language like C or Pascal you could drop the outer parentheses, and write
   not f(true);
In ML, however, if we write such a thing, we get
   - not f(true);
   stdIn:117.1-117.12 Error: operator and operand don't agree [tycon mismatch]
     operator domain: bool
     operand:         'Z -> 'Z
     in expression:
       not f
The reason is that ML tries to interpret the above expression as
   (not f)(true)
This is because in an expression of the form
   op1 op2 const
it might be the case that op1 is a higher-order function, which takes a function op2 as an argument and gives as result something that can be applied to const. In a language without higher order this interpretation would not make sense, and therefore the rules of priority are usually defined so that op1 op2 const is parsed as op1 (op2 const).

In general, in ML, an expression of the form

   e1 e2 e3 ... en 
is parsed as
   (...((e1 e2) e3) ... en) 
Of course this is just the default. In ML, like in other languages, there are some priority rules (like the precedence of "*" over "+") that may alter this order. In general, if you are unsure of how an expression is parsed, just put explicit parentheses to make sure that it is parsed the way you want.

One simplification that is always allowed in ML, however, is the elimination of the innermost parentheses around a "token". In other words, we can always write f x instead of f(x). Hence, for instance, we can write the identity function as

   fun f x = x;
and the expression not(f(true)) as
   not (f true);
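Left-to-right grouping is exactly what makes functions that return functions convenient to apply. In the sketch below, twice is a hypothetical higher-order function introduced for illustration: twice g returns a function, which is then applied to the next argument.

```sml
fun f x = x;
fun twice g = fn x => g (g x);   (* hypothetical: applies g two times *)

val a = not (f true);            (* parentheses needed: not f true would be (not f) true *)
val b = twice not true;          (* parsed as (twice not) true *)
val c = f 1 + 2;                 (* application binds tighter than +, i.e. (f 1) + 2 *)
```

Here a is false, b is true and c is 3.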

Match nonexhaustive

When you define a function by pattern matching, you may get a warning of the form "match nonexhaustive". This means that, in the patterns, you have not covered all possible cases, i.e. all possible patterns that the input data may present.

For instance, consider the following function which gives the maximum of a list of integers:

   fun maxlist [x] = x
     | maxlist (x::l) = let val y = maxlist l
                         in  if x < y then y else x
                        end;
If you compile this definition in SML, you'll get a non-exhaustive matching warning. In fact, the case of the empty list is missing: if you write the expression
   maxlist [] 
you will get a run-time error (an uncaught Match exception).

When the missing cases correspond to arguments for which the value of the function is undefined, the correct way to eliminate the warning is to introduce exceptions (ML supports raising and handling exceptions in a way similar to C++ and Java).
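A minimal sketch of that approach, assuming a hypothetical exception name EmptyList (any name would do): the missing case is added explicitly and raises the exception, which a caller can then handle.

```sml
exception EmptyList;             (* hypothetical name, introduced for this sketch *)

fun maxlist []     = raise EmptyList
  | maxlist [x]    = x
  | maxlist (x::l) = let val y = maxlist l
                      in  if x < y then y else x
                     end;

val m = maxlist [3, 7, 2];
(* handle catches the exception and supplies a fallback value *)
val n = maxlist [] handle EmptyList => 0;
```

With this definition the match is exhaustive, so the warning disappears; m is 7 and n is 0.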

However, if you are sure that the missing cases cannot occur (i.e. they never appear in the input), then you can just ignore the non-exhaustive matching warnings.

For instance, suppose that you want to write a function which constructs a balanced tree from a list, and that for this purpose you need to define an internal auxiliary function

   half: int * 'a list -> 'a list * 'a list
such that half(n,l) gives as result the two lists obtained by dividing l into two lists of equal length (plus or minus 1), and suppose that, in your intention, n represents the length of the original list l.

Then you probably will give a definition of the following form:

   fun half(0, nil) = ...
     | half(n,(x::l)) = ... ;
This will give you a non-exhaustive matching warning, since the case (n,nil) with n different from 0 is missing. However, if you make sure that in your main program you always call your auxiliary function with an expression of the form
   half(n,l);
where n is defined as the length of l, then your program will never give a runtime error and you don't need to worry about the warning.

Equality types

Consider the following two declarations in ML, the first defining a function which checks whether two lists have the same number of elements, and the second checking that they are also the same elements.
   - fun same_length (nil,nil) = true 
       | same_length (x::l,y::k) = same_length(l,k)
       | same_length(l,k) = false;
   val same_length = fn : 'a list * 'b list -> bool
   -
   - fun same_list(nil,nil) = true
       | same_list(x::l,y::k) = x=y andalso same_list(l,k)
       | same_list(l,k) = false;
   val same_list = fn : ''a list * ''a list -> bool
As we can see, the ML type inference mechanism gives for same_list a type different from the one given for same_length, i.e. we get a parameter ''a instead of 'a. This is due to the fact that, in the definition of same_list, we make an equality test between two elements of the lists (x and y). The double quote notation, in ML, is used to represent a type where equality is defined (equality type).

Examples of equality types are integers, booleans, characters, strings, and any other structure (predefined or user-defined) built from equality types: for instance pairs of equality types, lists of equality types, trees of equality types, etc. (In the 1997 revision of Standard ML, real is no longer an equality type.)

Examples of types on which equality is not defined are functional types and everything constructed with functional types. Thus, if we call the two functions above with lists of functions as arguments, same_length will give an answer true or false while same_list will give a type error.

Examples:

   - same_length([1,2],[2,3]);
   val it = true : bool
   - same_list([1,2],[2,3]);
   val it = false : bool
   - fun f x = x;
   val f = fn : 'a -> 'a
   - fun g x = x;
   val g = fn : 'a -> 'a
   - same_length([f],[g]);
   val it = true : bool
   - same_list([f],[g]);
   stdIn:24.1-24.19 Error: operator and operand don't agree [equality type required]
     operator domain: ''Z list * ''Z list
     operand:         ('Y -> 'Y) list * ('X -> 'X) list
     in expression:
       same_list (f :: nil,g :: nil)
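When the elements do not admit equality, or when a looser notion of equality is wanted, one possible workaround, in the spirit of the higher-order functions seen earlier, is to pass the comparison as a parameter. The function same_list_by and the test close below are hypothetical names introduced for this sketch.

```sml
fun same_list_by(eq, nil, nil)   = true
  | same_list_by(eq, x::l, y::k) = eq(x, y) andalso same_list_by(eq, l, k)
  | same_list_by(eq, l, k)       = false;

(* Hypothetical test: two reals count as equal if they differ by less than 0.001. *)
fun close(x : real, y : real) = abs(x - y) < 0.001;

val r1 = same_list_by(close, [1.0, 2.0], [1.0, 2.0004]);
val r2 = same_list_by(op =, [1, 2], [2, 3]);
```

Since same_list_by never uses "=" itself, its type contains only ordinary type variables ('a and 'b) and no equality type variables. Here r1 is true and r2 is false.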

Some exercises with binary trees

Consider the following definition of binary trees, which has the empty tree as base case:
   datatype 'a btree = emptybt | consbt of 'a * 'a btree * 'a btree;

Traversal

Below we define the three typical traversals of a binary tree. Each function puts the result of the traversal in a list. Examples of evaluations:
   - in_traverse (consbt(1,consbt(2,emptybt,emptybt),consbt(3,emptybt,emptybt)));
   val it = [2,1,3] : int list
   - pre_traverse(consbt(1,consbt(2,emptybt,emptybt),consbt(3,emptybt,emptybt)));
   val it = [1,2,3] : int list
   - post_traverse(consbt(1,consbt(2,emptybt,emptybt),consbt(3,emptybt,emptybt)));
   val it = [2,3,1] : int list
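The definitions of the three traversals do not appear in this copy of the notes. A possible reconstruction, consistent with the evaluations shown (a sketch, not necessarily the original code), is:

```sml
datatype 'a btree = emptybt | consbt of 'a * 'a btree * 'a btree;

(* left subtree, then the node, then the right subtree *)
fun in_traverse emptybt             = []
  | in_traverse (consbt(x,t1,t2))   = in_traverse t1 @ (x :: in_traverse t2);

(* the node first, then the two subtrees *)
fun pre_traverse emptybt            = []
  | pre_traverse (consbt(x,t1,t2))  = x :: (pre_traverse t1 @ pre_traverse t2);

(* the two subtrees first, then the node *)
fun post_traverse emptybt           = []
  | post_traverse (consbt(x,t1,t2)) = post_traverse t1 @ post_traverse t2 @ [x];
```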

Changing the value of the elements

Suppose we want to define a function which takes a binary tree of integers, and returns a binary tree with the same structure and the same positive nodes, while the negative ones are converted to 0's. The following is a possible definition:
   fun convert_to_0 emptybt = emptybt
     | convert_to_0 (consbt(x,t1,t2)) = let val u1 = convert_to_0(t1) 
                                            val u2 = convert_to_0(t2)
                                         in if x < 0 
                                               then consbt(0,u1,u2)
                                               else consbt(x,u1,u2)
                                        end;
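The same recursion scheme works for any transformation of the node values, so it can be abstracted into a higher-order function, in the style of map on lists. In this sketch map_bt is a hypothetical name, and convert_to_0 is recovered by passing an anonymous function.

```sml
datatype 'a btree = emptybt | consbt of 'a * 'a btree * 'a btree;

(* Hypothetical tree analogue of map: applies f to every node, keeping the structure. *)
fun map_bt(emptybt, f)         = emptybt
  | map_bt(consbt(x,t1,t2), f) = consbt(f x, map_bt(t1,f), map_bt(t2,f));

fun convert_to_0 t = map_bt(t, fn x => if x < 0 then 0 else x);

val t = consbt(~1, consbt(2,emptybt,emptybt), consbt(~3,emptybt,emptybt));
val u = convert_to_0 t;
```

Here u is consbt(0,consbt(2,emptybt,emptybt),consbt(0,emptybt,emptybt)).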