Pi exists

Note that I have not included any diagrams. You will have to use your imagination!

Problem: To prove that the definition of pi as the ratio of the circumference of a circle to its diameter makes sense.

Do this in two parts, by first defining pi(C) to be the ratio of the circumference of circle C to its diameter.

(1) Prove that for any circle C one has 3 < pi(C) < 4.
(2) Prove that for any two circles C_1, C_2 one has pi(C_1) = pi(C_2).

You will need to state all your assumptions. You may need to define length.

Solution: I will give an outline of a proof and state at every point what assumptions I am making and what proofs I am omitting or leaving to the reader. These gaps can be filled in by referring to any geometry text. A complete proof of this result from a set of axioms and postulates (what is the difference between these two?) appears in the book [E. Moise, Elementary Geometry from an Advanced Viewpoint, Addison-Wesley, Reading MA, 1974].

To prove part (1), consider a circle C of radius r (and diameter 2r). Note that a circle is defined to be the set of points equidistant from a given point (this common distance is the definition of the radius). Now, inscribe a regular hexagon (how would you construct this by ruler and compass?). Each side of the regular hexagon has length r (explain why), so the total perimeter of the hexagon is 6r. But the length of the perimeter of the circle is greater than the perimeter of the hexagon. The reason is that each arc cut off by a side of the hexagon is longer than that side, because it is an axiom that a straight line is the shortest distance between two points.

It follows that

pi(C) = (perimeter of C)/(diameter of C) = (perimeter of C)/(2r) > (6r)/(2r) = 3,

which proves the first part.
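As a numerical illustration of the lower bound (not a substitute for the proof), here is a minimal Python sketch that places the vertices of an inscribed regular hexagon and adds up its side lengths. The function name inscribed_polygon_perimeter and the sample radius are assumptions made for this example. Note that the code uses the library value math.pi and trigonometric functions to place the vertices, so it presupposes the very constant whose existence is being proved; it only checks the arithmetic 6r/(2r) = 3.

```python
import math

def inscribed_polygon_perimeter(r, n):
    """Perimeter of a regular n-gon inscribed in a circle of radius r,
    computed by summing straight-line distances between adjacent vertices."""
    vertices = [
        (r * math.cos(2 * math.pi * k / n), r * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]
    return sum(math.dist(vertices[k], vertices[(k + 1) % n]) for k in range(n))

r = 2.5  # arbitrary sample radius
hexagon_perimeter = inscribed_polygon_perimeter(r, 6)
hexagon_ratio = hexagon_perimeter / (2 * r)

# Each side has length r, so the ratio should be 3 up to rounding error.
print(hexagon_ratio)
```

Changing r should leave the printed ratio unchanged up to rounding, which is a first hint of part (2).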

Showing that pi(C) < 4 is similar, but requires more foundational work. The basic idea is to circumscribe a square of side 2r and since the square has a longer perimeter, the result should follow. Actually, circumscribing a square only shows that pi(C) <= 4, and I leave showing that pi(C) < 4 as an exercise.

Unfortunately, it is no longer an axiom that the perimeter of the square is longer than the perimeter of the circle. Of course, you can make it an axiom, which is exactly what Archimedes did (the main point is that Archimedes recognized that this needs to be based on some unproved assumption). Archimedes' axiom says [Archimedes, On the Sphere and Cylinder I, Assumption 2, in ``The Works of Archimedes,'' translated by T.L. Heath, Dover, New York, 1953]:

``Of other lines in a plane and having the same extremities, [any two] such are unequal whenever both are concave in the same direction and one of them is either wholly included between the other and the straight line which has the same extremities with it, or is partly included by, and is partly common with, the other; and that [line] which is included is the lesser [of the two].''

This assumption says that if two curves are concave in the same direction and one is inside the other except that their ends meet, then the outside one is longer. Since the circle and the square are both concave and the four pieces of the square that touch the circle satisfy the assumption, it follows that the square has a longer perimeter than the circle. However, since the circle is much simpler than the general concave curve, it is possible in this case to prove Archimedes' axiom. Before you can do this, though, you have to define what you mean by length. One definition can go as follows:

Definition. Consider a curve in the plane given by a map f:[0, 1] -> R^2 (explain why this assumption does not really lose any generality). Then the length of the curve from f(0) to f(1) will be the least upper bound of the numbers

|f(a_0) - f(a_1)| + |f(a_1) - f(a_2)| +... + |f(a_{n-1}) - f(a_n)|,

where 0=a_0 < a_1 < a_2 <... < a_n = 1 ranges over all partitions of [0, 1].

In other words, you approximate the curve by little segments, and add up the lengths of all the little segments and look at the values this takes. It is actually a theorem that must be proved that no matter how you decide to partition [0, 1], as long as you have |a_i - a_{i+1}| -> 0 (i.e., your partitions become finer and finer), the sum will approach the same number. Note that this definition requires the least upper bound axiom of the real numbers, i.e., that every nonempty set of real numbers that is bounded above has a least upper bound that is a real number.
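The definition can be tried out numerically. The following is a minimal Python sketch, assuming the unit circle is parametrized by f(t) = (cos 2 pi t, sin 2 pi t); the names polygonal_length, unit_circle and uniform_partition are made up for this example. It computes the polygonal sum for uniform partitions into 6, 12, 24, 48 and 96 pieces, each of which refines the previous one.

```python
import math

def polygonal_length(f, partition):
    """Sum of |f(a_i) - f(a_{i+1})| over consecutive points of the partition."""
    points = [f(t) for t in partition]
    return sum(math.dist(points[i], points[i + 1]) for i in range(len(points) - 1))

def unit_circle(t):
    """Assumed parametrization of the unit circle by t in [0, 1]."""
    angle = 2 * math.pi * t
    return (math.cos(angle), math.sin(angle))

def uniform_partition(n):
    """The partition 0 = a_0 < a_1 < ... < a_n = 1 with equal spacing."""
    return [k / n for k in range(n + 1)]

# Each partition below refines the previous one, so by the triangle
# inequality the polygonal sums cannot decrease.
lengths = [polygonal_length(unit_circle, uniform_partition(n))
           for n in (6, 12, 24, 48, 96)]

for n, length in zip((6, 12, 24, 48, 96), lengths):
    print(n, length)
```

The sums increase with each refinement but stay bounded, which is what makes the least upper bound in the definition meaningful.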

Given this definition, in order to show that pi(C) <= 4, one has to show that given any partition of the circle, i.e., an inscribed polygon, its perimeter is less than that of the circumscribed square.

The proof of this is based on a simple lemma:

Lemma. Consider an isosceles triangle ABC with AB = AC. If AB and AC are extended to AD and AE, then DE >= BC.

Before proving the lemma, let's see how it proves the result. Given an inscribed polygon, take any side meeting the circle at P and at Q. If the circle has center O, then OP = OQ. Now let OP extended meet the square at R and OQ extended meet the square at S; then by the lemma RS >= PQ. The segment RS is in turn no longer than the part of the square between R and S (again because a line is the shortest distance between two points). Doing this for every side of the polygon shows that the perimeter of the polygon is at most the perimeter of the square.

Thus no inscribed polygon has perimeter larger than that of the square, so the least upper bound of the perimeters is smaller than or equal to the perimeter of the square, which is equal to 8r. Since the least upper bound of the perimeters is equal to the length of the perimeter of C, one has

pi(C) = (perimeter of C)/(diameter of C) = (perimeter of C)/(2r) <= (8r)/(2r) = 4.

It follows that pi(C) <= 4, so you need to do even more work to show that pi(C) < 4. The main point, however, is that you have at least shown that pi(C) is finite.
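The projection argument can be checked numerically. Here is a minimal Python sketch, assuming the circle is centered at the origin and the circumscribed square has its sides parallel to the axes; the helper names circle_point and project_to_square, the choice of a 7-sided polygon and the angle offset 0.3 are all assumptions for this example. For each side PQ of the inscribed polygon it computes the corresponding segment RS on the square and records both lengths.

```python
import math

def circle_point(r, theta):
    """Point of the circle of radius r (center at the origin) at angle theta."""
    return (r * math.cos(theta), r * math.sin(theta))

def project_to_square(r, theta):
    """Point where the ray from the center at angle theta meets the
    axis-parallel square of side 2r circumscribed about the circle."""
    c, s = math.cos(theta), math.sin(theta)
    scale = r / max(abs(c), abs(s))
    return (scale * c, scale * s)

r = 1.0
n = 7
# Vertices of an inscribed polygon, rotated by 0.3 radians so that
# they are not aligned with the sides of the square.
angles = [2 * math.pi * k / n + 0.3 for k in range(n)]

sides = []  # pairs (PQ, RS), one per side of the polygon
for k in range(n):
    a, b = angles[k], angles[(k + 1) % n]
    pq = math.dist(circle_point(r, a), circle_point(r, b))
    rs = math.dist(project_to_square(r, a), project_to_square(r, b))
    sides.append((pq, rs))

polygon_perimeter = sum(pq for pq, _ in sides)
print(all(rs >= pq for pq, rs in sides))  # the lemma, side by side
print(polygon_perimeter / (2 * r))        # should not exceed 4
```

A segment RS may cut across a corner of the square, which is why the argument also needs the fact that RS is no longer than the part of the square between R and S.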

Proof of Lemma: Assume that AE >= AD (the other case is similar), and let F be on AE such that AF = AD.

The triangles ABC and ADF are similar, so it follows that DF >= BC (explain), and it is sufficient to prove that DE >= DF. To do this, note that the angle AFD is less than 90 degrees. This is shown by appealing to the elementary theorem that the base angles of an isosceles triangle are equal. Since the angle sum of a triangle is 180 degrees (why?), this implies that the base angles are each less than 90 degrees. Since angle AFD is less than 90 degrees, angle EFD is greater than 90 degrees (why?), so that the other angles in triangle EFD are less than 90 degrees (why?), so that angle EFD is the largest angle in this triangle. The result now follows from the next proposition.

Proposition. If two sides of a triangle are not congruent, then the angles opposite them are not congruent, and the larger side is opposite the larger angle.

Proof: Let the triangle be PQR, where PQ != PR.

Assume that PR > PQ (the other case is similar), and let S be on PR such that PS = PQ. It follows from the isosceles triangle theorem that angle PQS is equal to angle PSQ. Since S is in the interior of PR, it follows that angle PQS is less than angle PQR (though obvious, this must be proved from the axioms). Another elementary fact (though obvious, this must be proved from the axioms) is that S lying in the interior of PR implies that angle PSQ is greater than the angle PRQ. Combining all these facts shows that angle PQR is greater than angle PRQ, proving the proposition. Q.E.D.
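The chain BC <= DF <= DE used in the proof of the lemma can be spot-checked on a grid of configurations. This is a minimal Python sketch, assuming A is placed at the origin with AB along the positive x-axis; the names point_on_ray and lemma_lengths, the grid of apex angles and lengths, and the rounding tolerance are choices made for this example, and a finite grid of course proves nothing.

```python
import math

def point_on_ray(angle, distance):
    """Point at the given distance from A = (0, 0) along the ray at the given angle."""
    return (distance * math.cos(angle), distance * math.sin(angle))

def lemma_lengths(apex_angle, ab, ad, ae):
    """Return (BC, DF, DE) where AB = AC = ab, D is on ray AB with AD = ad,
    E is on ray AC with AE = ae, and F is on ray AC with AF = AD."""
    b = point_on_ray(0.0, ab)
    c = point_on_ray(apex_angle, ab)
    d = point_on_ray(0.0, ad)
    e = point_on_ray(apex_angle, ae)
    f = point_on_ray(apex_angle, ad)
    return math.dist(b, c), math.dist(d, f), math.dist(d, e)

TOLERANCE = 1e-12  # allowance for floating-point rounding
violations = []
for degrees in range(10, 180, 20):          # apex angle at A
    for ad in (1.0, 1.5, 2.0, 3.0):         # AD >= AB = 1
        for stretch in (1.0, 1.2, 2.0):     # AE = stretch * AD >= AD
            bc, df, de = lemma_lengths(math.radians(degrees), 1.0, ad, ad * stretch)
            if not (bc <= df + TOLERANCE and df <= de + TOLERANCE):
                violations.append((degrees, ad, stretch))

print(len(violations))  # number of grid points where the chain fails
```

The grid only covers the case AE >= AD treated in the proof; the other case follows by exchanging the roles of D and E.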

Proof of part (2): To show that pi(C) is the same for all circles, one uses the method of exhaustion invented by Eudoxus (ca. 408 B.C.-355 B.C.) and used extensively by Archimedes to prove his results on areas and volumes of circles, spheres, etc. You will note that this method is essentially the method of limits used in calculus (so the ancient Greeks already understood limits).

This method uses a proof by contradiction. So assume that there are two circles C_1 and C_2 such that pi(C_1) != pi(C_2). One can assume that pi(C_1) > pi(C_2) since the other case works the same way.

The first step is to move circle C_1 so it has the same center as circle C_2, i.e., they are concentric. You have to prove that this does not change the length of the perimeter of circle C_1 nor its radius. This is a good exercise in seeing if you have understood the definitions.

Given that you have proved that this does not change pi(C_1), I will arrive at a contradiction. Now, since pi(C_1) > pi(C_2), the definition of least upper bound implies that there is a partition P_1 of the circle C_1 such that the length of P_1, i.e., the perimeter of the inscribed polygon given by P_1, divided by the diameter of C_1 is greater than pi(C_2).

Now consider the partition P_2 of C_2 generated by P_1 as follows: For each point p_1 of P_1 find the corresponding point p_2 on C_2 by drawing a radius through p_1 and letting p_2 be the point on C_2 meeting this radius (on the same side of the common center).

Now for any two adjacent points p_2,p_2' of P_2, consider the ratio |p_2 - p_2'|/r_2, where r_2 is the radius of C_2. By similar triangles and the construction of P_2, this is equal to |p_1 - p_1'|/r_1, where r_1 is the radius of C_1. It follows that

1/r_1 \sum |p_1 - p_1'| = 1/r_2 \sum |p_2 - p_2'|

where the summation is over all adjacent points in each partition. Dividing this by 2 gives

1/d_1 \sum |p_1 - p_1'| = 1/d_2 \sum|p_2 - p_2'|

where d_1,d_2 are the diameters of C_1,C_2, respectively. But the assumption that the partition P_1 gave a ratio greater than pi(C_2) and the above equation leads to

1/d_2 \sum|p_2 - p_2'| > pi(C_2)

which contradicts that pi(C_2) is the least upper bound for such ratios (why is it the least upper bound of such ratios?). The result follows by contradiction. Q.E.D.
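The key step, that corresponding partitions of two concentric circles give the same ratio of polygon perimeter to diameter, can be illustrated numerically. This is a minimal Python sketch, assuming both circles are centered at the origin so that corresponding points share the same angle; the function name polygon_ratio, the sample angles and the two radii are arbitrary choices for this example.

```python
import math

def polygon_ratio(radius, angles):
    """Perimeter of the inscribed polygon with vertices at the given angles
    on the circle of the given radius, divided by the diameter."""
    points = [(radius * math.cos(t), radius * math.sin(t)) for t in angles]
    n = len(points)
    perimeter = sum(math.dist(points[k], points[(k + 1) % n]) for k in range(n))
    return perimeter / (2 * radius)

# An irregular partition, given by increasing angles in radians.
angles = [0.1, 0.9, 1.7, 2.2, 3.5, 4.4, 5.8]

ratio_1 = polygon_ratio(1.0, angles)    # partition P_1 on C_1 with r_1 = 1
ratio_2 = polygon_ratio(37.5, angles)   # corresponding partition P_2 on C_2

print(ratio_1, ratio_2)  # equal up to rounding error
```

Since every ratio attainable on C_1 is also attainable on C_2 and vice versa, the two sets of ratios have the same least upper bound, which is the content of part (2).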
