Fall 99, CSE 520: Lectures 6 and 8

Recursive definitions and fixpoints

In this section we review the concepts of recursive definition and fixpoint operator, and try to give some further explanation about them. Let us start with some basic definition.

Equations and solutions of equations

Given an equality theory, an equation on x is a formula of the form

E₁ = E₂

where E₁ are E₂ are expressions containing some (zero, one, or several) occurrences of x.

A solution for the above equation is any value v such that

E₁[v/x] = E₂[v/x] holds (i.e. this equality is a consequence of the theory)

where [v/x] represents the substitution of v for x. Of course, an equation can have zero, one, or several solutions.

Example

In the theory of rational numbers, the equation

x = (2*x³ - 6) / 5

has solution x = 2, in fact

2 = (2*2³ - 6) / 5 holds.

The equation of the example above is in a particular format: it has the form

x = f(x)

where f is the function defined as

f(y) =def= (2 y³ - 6) / 5

Equations in this format are called fixpoint equations.

For certain theories, the solution of fixpoint equations can be obtained in a uniform way, by applying an operator to the function f of the equation. Such operator is called fixpoint operator. (This is not the case for the equations on rational numbers, unless we introduce some restrictions on the functions to be used in the equations.)

The fixpoint operators in lambda calculus

The theory of lambda calculus has fixpoint operators. One fixpoint operator is the lambda term Y defined in previous lectures. In fact, as previously proved, we have

Y F = F (Y F)

for every lambda term F. This means that every fixpoint equation

X = F (X)

has a solution X =def= Y F.

Note that Y is not the only fixpoin operator in the lambda calculus. We have, actually, infinitely many such operators. The operator Y is due to Curry, and was called by him "paradoxical combinator". Another fixpoint operator, due to Turing, is the term (\x y. y (x x y))(\x y. y (x x y)).

Example Suppose that we want to find a term M such that, for every P, we have:

M P = P M

If we are able to reduce the above equation to a fixpoint equation, then we can use the fixpoint operator to solve this problem.

Observe that the above equation is equivalent to

M = \p. p M

Now use one further abstraction step on the left, and obtain

M = (\u p. p u) M

The latter is in the format of a fixpoint equation. Then we have that a solution is

M =def= Y (\u p. p u)

Recursive definitions

Consider a recursive definition of a function g, like for instance

g(n) =def= if n = 0 then ... else ... g(n-1) ...

or equivalently

g =def= \n. if n = 0 then ... else ... g(n-1) ...

what we mean by such a definition is that we want to define g as the (or better, any) solution of a fixpoint equation on g of the form

g = G (g)

where G is the function defined as

G =def= (\u. \n. if n = 0 then ... else ... u(n-1) ...)

The domain of the lambda-definable functions enjoys the property of the uniform solvability of fixpoint equations. One fixpoint operator is the lambda term Y defined in previous lectures. In fact, as previously proved, we have

Y G = G (Y G)

which means that Y G is a solution of the equation

g = G (g).

Consistency of the Lambda Calculus

We will show that the Lambda Calculus is consistent, in the sense that not all terms are lambda-convertible. In particular, different Church's numerals are not lambda-convertible. Note that if the Church's numerals were all identified by lambda-conversion, then the result of lambda-definability would not be of any interest.

In order to prove the consistency of the lambda calculus, it is convenient to introduce the notion of beta-reduction.

Beta reduction

The notion of conversion is symmetric, but from a computational point of view the beta rule has a direction. This directionality is captured in the concept of beta-reduction.

The one-step beta reduction, denoted by ->, is the least relation such that:

(\x. M) N -> M[N/x],
M =_alpha N, N -> N', and N' _alpha M' => M -> M'
M -> N => M P -> N P
M -> N => P M -> P N
M -> N => \x. M -> \x. N

Note: The second rule expresses the fact that we want to reason modulo alpha-renaming, i.e. to identify terms which are alpha-convertible.

The multi-step beta reduction, denoted by ->>, is the reflexive and transitive closure of ->, namely:

M -> N => M ->> N
M ->> M (reflexivity)
M ->> N and N ->> P => M ->> P (transitivity)

We will say that M is in beta normal form if there exist no N such that M -> N. Note that M is in beta normal form iff M contains no beta-redexes, i.e. no terms of the form (\x. M') N. If M ->> N, and N is in beta normal form, we say that N is the beta normal form of M.

Let us now consider the relation between conversion and reduction. Clearly, by their definitions, we have the following:

Proposition: the lambda-conversion is the reflexive, symmetric, and transitive closure of -> (or equivalently, of ->>).

Let us illustrate what are the difficulties in proving the consistency of the lambda calculus. Intuitively, M = N holds iff one of the following cases hold:

M ->> N, or
N ->> M, or
there exists P such that M ->> P and N ->> P
there exists P such that P ->> M and P ->> N
a combination of the above cases.

Suppose that we want to prove that [0] and [1] are not lambda convertible. Since [0] and [1] are both in normal form (they contain no beta-redexes), and they are not alpha convertible, then cases (1), (2) and (3) are not possible. Case (5) can be reduced to the others. The only difficult case is (4). In fact, there are infinitely many terms which reduce to [0] and [1]. For example, [plus] [0] [0] ->> [0] and [plus] [1] [0] ->> [1]. Thus, we should now prove that [plus] [0] [0] and [plus] [1] [0] are not lambda convertible, which is a problem more difficult than the original one.

Fortunately, case (4) can be reduced to (3) thanks to the following theorem:

Theorem (Church-Rosser) ->> is confluent. Namely, if P ->> M and P ->> N, then there exists Q such that M ->> Q and N ->> Q.

The property of confluence is also called "diamond property", because of the shape of the diagram that illustrates the property.

Confluence means, essentially, that it does not matter in which order we reduce the beta-redexes inside a term: we can always "rejoin" towards the same term. ("all roads bring to Rome" :-)

Example Consider P =def= [plus] ([times] [1] [2]) ([plus] [3] [4]). Then we have P ->> M and P ->> N, where M =def= [plus] [2] ([plus] [3] [4]) and N =def= [plus] ([times] [1] [2]) [7], i.e. M and N are obtained by reducing different parts of P. Now, by reducing, in both M and N, the other part, we get M ->> [plus] [2] [7] and N ->> [plus] [2] [7].

The example of reduction of operations on numerals is particularly simple; things are much more complicated when we consider reduction of higher-order terms. In that context the confluence property is not so obvious. The proof of this important result, in fact, is rather involved, and we will not see it in the course. The interested reader can find it on the Barendregt's references.

From the confluence property, we have the following:

Corollary If M = N then there exists P such that M ->> P and N ->> P.

Proof. Remember that = is the reflexive, symmetric and transitive closure of ->. This means that = is the least relation such that

M -> N => M = N
M = M
N = M => M = N
M = Q and Q = N => M = N

We prove now the statement of the corollary. By induction:

1 (Base case). Assume M -> N. Then the statement holds for P =def= N. In fact N ->> N.
2 (Base case). Assume M and N are identical. Then define P =def= M and observe that M ->> M.
3 (Inductive case). Assume N = M. By inductive hypothesis, we have that here exists P such that N ->> P and M ->> P.
4 (Inductive case). Assume M = Q and Q = N. By inductive hypothesis, we have that there exist P₁ and P₂ such that M ->> P₁, Q ->> P₁, Q ->> P₂, and N ->> P₂. Again by confluence, we have that there exists P such that P₁ ->> P and P₂ ->> P. Therefore we have M ->> P and N ->> P.

As a consequence of the above corollary we have:

Theorem The Lambda Calculus is consistent. In particular, different Church's numerals are not lambda-convertible.

Proof. If [m] = [n], then from previous corollary there must exist P such that [m] ->> P and [n] ->> P. However, Church's numerals are in normal form, i.e. they cannot be reduced. Hence we mush have that [m], P and [n] are identical (modulo alpha-renaming). But, by definition, [m] and [n] are identical only if m and n are the same number.

On the existence of normal forms

The above results give a method to prove whether two terms M and N are lambda-convertible or not: try to reduce reduce them to their normal forms, say P and Q, and check whether P and Q are identical (modulo alpha conversion). Of course, if a term has a normal form, then it is unique (by confluence).

This method however is not complete, because there exist terms which do not have a normal form. One example of such term is the fixpoint operator Y=def= \y.(\x.y(xx))(\x.y(xx)). In fact, we have

\y.(\x.y(xx))(\x.y(xx)) -> \y.y((\x.y(xx))(\x.y(xx))) -> \y.y(y((\x.y(xx))(\x.y(xx)))) -> ... -> \y.yⁿ ((\x.y(x x))(\x.y(xx))) -> ...

Another example is the term Omega =def= (\x.xx)(\x.xx). We have in fact

Omega -> Omega -> Omega -> ...

In general lambda conversion is semi-decidable, but not decidable. In other words, it is not provable, in general, that two terms are not lambda convertible. The exceptions are, of course, the terms which have a normal form, like the numerals.

Note that there are terms which have a normal form, but also the possibility of an infinite chain of reductions. For example, take the term M =def= [true] [0] Omega. We have M ->> [0], but also M -> M -> M ->... because of the possibility of reducing the last term, Omega. A term which only gives rise to finite chains of reductions (obviously resulting in the same normal form) is called strongly normalizing. A term which has a normal form, but also infinite chains of reductions, is called weakly normalizing.