WHAT IS MATHEMATICS?
AN ELEMENTARY APPROACH TO
IDEAS AND METHODS
Second Edition
BY
RICHARD COURANT
AND
HERBERT ROBBINS
Revised by
IAN STEWART
Mathematics Institute
University of Warwick
New York
Oxford
1996
DEDICATED TO
ERNEST, GERTRUDE, HANS,
AND LEONORE COURANT
FOREWORD
CONTENTS
Introduction
§1. Calculation with Integers
1. Laws of Arithmetic. 2. The Representation of Integers. 3. Computation in Systems Other than the Decimal.
§2. The Infinitude of the Number System. Mathematical Induction
1. The Principle of Mathematical Induction. 2. The Arithmetical Progression. 3. The Geometrical Progression. 4. The Sum of the First n Squares. 5. An Important Inequality. 6. The Binomial Theorem ...
... 5. General Definition of Irrational Numbers by Nested Intervals. 6. Alternative Methods of Defining Irrational Numbers. Dedekind Cuts.
§3. Remarks on Analytic Geometry
1. The Basic Principle. 2. Equations of Lines and Curves.
The Mathematical Analysis of Infinity
1. Fundamental Concepts. 2. The Denumerability of the Rational Numbers and the Non-Denumerability of the Continuum. 3. Cantor's "Cardinal Numbers." 4. The Indirect Method of Proof. 5. The Paradoxes of the In...
Introduction
Part I. Impossibility Proofs and Algebra
§1. Fundamental Geometrical Constructions
1. Construction of Fields and Square Root ... gons. 3. Apollonius' Problem.
§2. Constructible Numbers and Number Fields
1. General Theory. 2. All Constructible Nu...
§3. The Unsolvability of the Three Greek Problems
1. Doubling the Cube. 2. A Theorem on ... Trisecting the Angle. 4. The Regular Heptagon. ... on the Problem of Squaring the Circle.
Part II. Various Methods for Performing Constructions
§4. Geometrical Transformations. Inversion
1. General Remarks. 2. ...
1. Restriction to the ... of the Compass ... Mechanical Instruments. Mechanical Curves. Cycloids. 4. ... Peaucellier's and Hart's Inversors.
§6. More About Inversions and Its Applications
1. Invariance of Angles. Families of Circles. 2. Application to the Problem of Apollonius. 3. Repeated Reflections.
CHAPTER IV. PROJECTIVE GEOMETRY. AXIOMATICS. NON-EUCLIDEAN GEOMETRIES
§1. Introduction
§1. Variable and Function
... and Examples. 3. Radian Measure of Angles. ... of a Function. Inverse Functions. ... Continuity. ... and Transformations. 6. Functions of Several Variables.
2. Monotone Sequences. ... Continued Fractions. ...
Example on Continuity
... Obtuse Triangles. ... Remarks Concerning Problems of ...
Steiner's Problem
1. Problem and Solution. 2. Analysis of the Alternatives. 3. A Complementary Problem. 4. Remarks and Exercises. 5. Generalization to the Street Network Problem.
Extrema and Inequalities
1. The Arithmetical and Geometrical Mean of Two Positive ... 2. Generalization to n Variables. 3. The Method of Least ...
CHAPTER VIII. THE CALCULUS
Introduction
§1. The Integral
1. Area as a Limit. 2. The Integral. 3. General Remarks on the Integral Concept. General Definition. 4. Examples of Integration. Integration of $x^r$. 5. Rules for the "Integral Calculus."
§2. The Derivative
1. The Derivative as a Slope. 2. The Derivative as a Limit. 3. Examples. 4. Derivatives of Trigonometrical Functions.
INDEX
What Is Mathematics? is one of the great classics, a sparkling collection of mathematical gems, one of whose aims was to counter the idea that "mathematics is nothing but a system of conclusions drawn from definitions and postulates that must be consistent but otherwise may be created by the free will of the mathematician." In short, it wanted to put the meaning back into mathematics. But it was meaning of a very different kind from physical reality, for the meaning of mathematical objects states "only the relationships between mathematically 'undefined objects' and the rules governing operations with them." It doesn't matter what mathematical things are: it's what they do that counts. Thus mathematics hovers uneasily between the real and the not-real; its meaning does not reside in formal abstractions, but neither is it tangible. This may cause problems for philosophers who like tidy categories, but it is the great strength of mathematics, what I have elsewhere called its "unreal reality." Mathematics links the abstract world of mental concepts to the real world of physical things without being located completely in either.
I first encountered What Is Mathematics? in 1963. I was about to take up a place at Cambridge University, and the book was recommended reading for prospective mathematics students. Even today, anyone who wants an advance look at university mathematics could profitably skim through its pages. However, you do not have to be a budding mathematician to get a great deal of pleasure and insight out of Courant and Robbins's masterpiece. You do need a modest attention span, an interest in mathematics for its own sake, and enough background not to feel out of your depth. High-school algebra, basic calculus, and trigonometric functions are enough, although a bit of Euclidean geometry helps.
One might expect a book whose most recent edition was prepared nearly fifty years ago to seem old-fashioned, its terminology dated, its viewpoint out of line with current fashions. In fact, What Is Mathematics? has worn amazingly well. Its emphasis on problem-solving is up to date, and its choice of material has lasted so well that not a single word or symbol had to be deleted from this new edition.
In case you imagine this is because nothing ever changes in mathe-
During the last years the force of events has led to an increased demand for mathematical information and training. Now more than ever there exists the danger of frustration and disillusionment unless students and teachers try to look beyond mathematical formalism and manipulation and to grasp the real essence of mathematics. This book was written for such students and teachers, and the response to the first edition encourages the authors in the hope that it will be helpful.
Criticism by many readers has led to numerous corrections and improvements. For generous help with the preparation of the third revised edition cordial thanks are due to Mrs. Natascha Artin.
R. Courant
New Rochelle, N. Y.
March 18, 1943
October 10, 1945
October 28, 1947
WHAT IS MATHEMATICS?
Mathematics as an expression of the human mind reflects the active will, the contemplative reason, and the desire for aesthetic perfection. Its basic elements are logic and intuition, analysis and construction, generality and individuality. Though different traditions may emphasize different aspects, it is only the interplay of these antithetic forces and the struggle for their synthesis that constitute the life, usefulness, and supreme value of mathematical science.
Without doubt, all mathematical development has its psychological roots in more or less practical requirements. But once started under the pressure of necessary applications, it inevitably gains momentum in itself and transcends the confines of immediate utility. This trend from applied to theoretical science appears in ancient history as well as in many contributions to modern mathematics by engineers and physicists.
Recorded mathematics begins in the Orient, where, about 2000 B.C., the Babylonians collected a great wealth of material that we would classify today under elementary algebra. Yet as a science in the modern sense mathematics only emerges later, on Greek soil, in the fifth and fourth centuries B.C. The ever-increasing contact between the Orient and the Greeks, beginning at the time of the Persian empire and reaching a climax in the period following Alexander's expeditions, made the Greeks familiar with the achievements of Babylonian mathematics and astronomy. Mathematics was soon subjected to the philosophical discussion that flourished in the Greek city states. Thus Greek thinkers became conscious of the great difficulties inherent in the mathematical concepts of continuity, motion, and infinity, and in the problem of measuring arbitrary quantities by given units. In an admirable effort the challenge was met, and the result, Eudoxus' theory of the geometrical continuum, is an achievement that was only paralleled more than two thousand years later by the modern theory of irrational numbers. The deductive-postulational trend in mathematics originated at the time of Eudoxus and was crystallized in Euclid's Elements.
However, while the theoretical and postulational tendency of Greek mathematics remains one of its important characteristics and has exercised an enormous influence, it cannot be emphasized too strongly that application and connection with physical reality played just as important a part in the mathematics of antiquity, and that a manner of presentation less rigid than Euclid's was very often preferred.
It may be that the early discovery of the difficulties connected with "incommensurable" quantities deterred the Greeks from developing the art of numerical reckoning achieved before in the Orient. Instead they forced their way through the thicket of pure axiomatic geometry. Thus one of the strange detours of the history of science began, and perhaps a great opportunity was missed. For almost two thousand years the weight of Greek geometrical tradition retarded the inevitable evolution of the number concept and of algebraic manipulation, which later formed the basis of modern science.
After a period of slow preparation, the revolution in mathematics and science began its vigorous phase in the seventeenth century with analytic geometry and the differential and integral calculus. While Greek geometry retained an important place, the Greek ideal of axiomatic crystallization and systematic deduction disappeared in the seventeenth and eighteenth centuries. Logically precise reasoning, starting from clear definitions and non-contradictory, "evident" axioms, seemed immaterial to the new pioneers of mathematical science. In a veritable orgy of intuitive guesswork, of cogent reasoning interwoven with nonsensical mysticism, with a blind confidence in the superhuman power of formal procedure, they conquered a mathematical world of immense riches.
Gradually the ecstasy of progress gave way to a spirit of critical self-control. In the nineteenth century the immanent need for consolidation and the desire for more security in the extension of higher learning that was prompted by the French revolution, inevitably led back to a revision of the foundations of the new mathematics, in particular of the differential and integral calculus and the underlying concept of limit. Thus the nineteenth century not only became a period of new advances, but was also characterized by a successful return to the classical ideal of precision and rigorous proof. In this respect it even surpassed the model of Greek science. Once more the pendulum swung toward the side of logical purity and abstraction. At present we still seem to be in this period, although it is to be hoped that the resulting unfortunate separation between pure mathematics and the vital applications, perhaps inevitable in times of critical revision, will be followed by an era of closer unity. The regained internal strength and, above all, the enormous simplification attained on the basis of clearer comprehension make it
concepts has been one of the most important and fruitful results of the modern postulational development.
Fortunately, creative minds forget dogmatic philosophical beliefs whenever adherence to them would impede constructive achievement. For scholars and laymen alike it is not philosophy but active experience in mathematics itself that alone can answer the question: What is mathematics?
CHAPTER I
THE NATURAL NUMBERS
INTRODUCTION
Number is the basis of modern mathematics. But what is number? What does it mean to say that $\frac{1}{2} + \frac{1}{2} = 1$, $\frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}$, and $(-1)(-1) = 1$? We learn in school the mechanics of handling fractions and negative numbers, but for a real understanding of the number system we must go back to simpler elements. While the Greeks chose the geometrical concepts of point and line as the basis of their mathematics, it has become the modern guiding principle that all mathematical statements should be reducible ultimately to statements about the natural numbers, 1, 2, 3, .... "God created the natural numbers; everything else is man's handiwork." In these words Leopold Kronecker (1823-1891) pointed out the safe ground on which the structure of mathematics can be built.
Created by the human mind to count the objects in various assemblages, numbers have no reference to the individual characteristics of the objects counted. The number six is an abstraction from all actual collections containing six things; it does not depend on any specific qualities of these things or on the symbols used. Only at a rather advanced stage of intellectual development does the abstract character of the idea of number become clear. To children, numbers always remain connected with tangible objects such as fingers or beads, and primitive languages display a concrete number sense by providing different sets of number words for different types of objects.
Fortunately, the mathematician as such need not be concerned with the philosophical nature of the transition from collections of concrete objects to the abstract number concept. We shall therefore accept the natural numbers as given, together with the two fundamental operations, addition and multiplication, by which they may be combined.
§1. CALCULATION WITH INTEGERS
1. Laws of Arithmetic
The mathematical theory of the natural numbers or positive integers is known as arithmetic. It is based on the fact that the addition and multiplication of integers are governed by certain laws. In order to state these laws in full generality we cannot use symbols like 1, 2, 3 which refer to specific integers. The statement
1 + 2 = 2 + 1
is only a particular instance of the general law that the sum of two integers is the same regardless of the order in which they are considered. Hence, when we wish to express the fact that a certain relation between integers is valid irrespective of the values of the particular integers involved, we shall denote integers symbolically by letters a, b, c, .... With this agreement we may state five fundamental laws of arithmetic with which the reader is familiar:
1) a + b = b + a,
2) ab = ba,
3) a + (b + c) = (a + b) + c,
4) a(bc) = (ab)c,
5) a(b + c) = ab + ac.
The first two of these, the commutative laws of addition and multiplication, state that one may interchange the order of the elements involved
in addition or multiplication. The third, the associative law of addition,
states that addition of three numbers gives the same result whether we
add to the first the sum of the second and third, or to the third the sum
of the first and second. The fourth is the associative law of multiplication. The last, the distributive law, expresses the fact that to multiply
a sum by an integer we may multiply each term of the sum by this integer
and then add the products.
These laws of arithmetic are very simple, and may seem obvious. But they might not be applicable to entities other than integers. If a and b are symbols not for integers but for chemical substances, and if "addition" is used in a colloquial sense, it is evident that the commutative law will not always hold. For example, if sulphuric acid is added to water, a dilute solution is obtained, while the addition of water to pure sulphuric acid may result in disaster to the experimenter. Similar illustrations will show that in this type of chemical "arithmetic" the associative and distributive laws of addition may also fail. Thus one can imagine types of arithmetic in which one or more of the laws 1)-5) do not hold. Such systems have actually been studied in modern mathematics.
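The contrast drawn above can be tried out numerically. The following is only an illustrative sketch: the function name `check_laws` and the word-joining "addition" are assumptions made for this example, and a finite check over a few samples is evidence, not a proof.

```python
from itertools import product

def check_laws(samples):
    """Return True if laws 1)-5) hold for every triple (a, b, c) of samples."""
    for a, b, c in product(samples, repeat=3):
        if a + b != b + a:                  # 1) commutative law of addition
            return False
        if a * b != b * a:                  # 2) commutative law of multiplication
            return False
        if a + (b + c) != (a + b) + c:      # 3) associative law of addition
            return False
        if a * (b * c) != (a * b) * c:      # 4) associative law of multiplication
            return False
        if a * (b + c) != a * b + a * c:    # 5) distributive law
            return False
    return True

# The five laws hold for every triple of integers from 1 to 7.
integers_ok = check_laws(range(1, 8))

# A colloquial "addition" whose result depends on the order: joining words.
acid_into_water = "water" + "acid"
water_into_acid = "acid" + "water"
order_matters = acid_into_water != water_into_acid
```

Here string concatenation stands in for the chemical "addition" of the text: it is associative but not commutative, so law 1) fails for it.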
A concrete model for the abstract concept of integer will indicate the intuitive basis on which the laws 1)-5) rest. Instead of using the usual number symbols 1, 2, 3, etc., let us denote the integer that gives the
[Fig. 1. Addition.]
To multiply a and b, we arrange the dots in the two boxes in rows, and form a new box with a rows and b columns of dots. The rules 1)-5)
[Fig. 3. Multiplication.]
On the basis of the definition of addition of two integers we may define the relation of inequality. Each of the equivalent statements, a < b (read, "a is less than b") and b > a (read, "b is greater than a"), means that box b may be obtained from box a by the addition of a properly chosen third box c, so that b = a + c. When this is so we write
c = b - a,
which defines the operation of subtraction.
[Fig. 4. Subtraction.]
(a + d) - d = a.
a + 0 = a,
a · 0 = 0,
for every integer a. For a + 0 denotes the addition of an empty box to the box a, while a · 0 denotes a box with no columns; i.e., an empty box. It is then natural to extend the definition of subtraction by setting
a - a = 0
for every integer a. These are the characteristic arithmetical properties of zero.
Geometrical models like these boxes of dots, such as the ancient abacus, were widely used for numerical calculation until late in the middle ages, when they were slowly displaced by greatly superior symbolic methods based on the decimal system.
$z = a \cdot 10^3 + b \cdot 10^2 + c \cdot 10 + d,$
The integer z ... abcd.
We note in passing that the coefficients d, c, b, a are the remainders left after successive divisions of z by 10. Thus
10)372    Remainder
10) 37        2
10)  3        7
     0        3
The particular expression given above for z can only represent integers
less than ten thousand, since larger integers will require five or more digit
symbols. If z is an integer between ten thousand and one hundred
thousand, we can express it in the form
$z = a \cdot 10^4 + b \cdot 10^3 + c \cdot 10^2 + d \cdot 10 + e,$
$z = a_n \cdot 10^n + a_{n-1} \cdot 10^{n-1} + \cdots + a_1 \cdot 10 + a_0,$
$b_n \cdot 7^n + b_{n-1} \cdot 7^{n-1} + \cdots + b_1 \cdot 7 + b_0,$
where the b's are digits from zero to six, and denoted by the symbol
$b_n b_{n-1} \cdots b_1 b_0.$
Thus "one hundred and nine" would be denoted in the septimal system by the symbol 214, meaning
$2 \cdot 7^2 + 1 \cdot 7 + 4.$
As an exercise the reader may prove that the general rule for passing from the base ten to any other base B is to perform successive divisions of the number z by B; the remainders will be the digits of the number in the system with base B. For example:
7)109    Remainder
7) 15        4
7)  2        1
    0        2
109 (decimal system) = 214 (septimal system).
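The rule of successive divisions translates almost directly into a short program. This is a minimal sketch, assuming a positive base of at least 2; the function name `to_base` is chosen here for illustration and does not come from the text.

```python
def to_base(z, base):
    """Digits of the non-negative integer z in the given base, most significant first."""
    if z == 0:
        return [0]
    digits = []
    while z > 0:
        # Each division by the base leaves one digit as remainder.
        z, remainder = divmod(z, base)
        digits.append(remainder)
    # The remainders appear from the last digit to the first, so reverse them.
    return digits[::-1]

septimal_109 = to_base(109, 7)
```

The remainders are produced in the order 4, 1, 2, exactly as in the division scheme, and reading them backwards gives the symbol 214.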
it from the Moslems. The positional system has the agreeable property that all numbers, however large or small, can be represented by the use of a small set of different digit symbols (in the decimal system, these are the "Arabic numerals" 0, 1, 2, ..., 9). Along with this goes the more important advantage of ease of computation. The rules of reckoning with numbers represented in positional notation can be stated in the form of addition and multiplication tables for the digits that can be memorized once and for all. The ancient art of computation, once confined to a few adepts, is now taught in elementary school. There are not many instances where scientific progress has so deeply affected and facilitated everyday life.
Addition
     1   2   3   4   5   6
1    2   3   4   5   6  10
2    3   4   5   6  10  11
3    4   5   6  10  11  12
4    5   6  10  11  12  13
5    6  10  11  12  13  14
6   10  11  12  13  14  15

Multiplication
     1   2   3   4   5   6
1    1   2   3   4   5   6
2    2   4   6  11  13  15
3    3   6  12  15  21  24
4    4  11  15  22  26  33
5    5  13  21  26  34  42
6    6  15  24  33  42  51
Let us now multiply 265 by 24, where these number symbols are written in the septimal system. (In the decimal system this would be
  265
   24
 ----
 1456
 563
-----
10416
To check this result we may multiply the same numbers in the decimal system. 10,416 (septimal system) may be written in the decimal system by finding the powers of 7 up to the fourth: $7^2 = 49$, $7^3 = 343$, $7^4 = 2{,}401$. Hence $10{,}416 = 2{,}401 + 4 \cdot 49 + 7 + 6$, this evaluation being in the decimal system. Adding these numbers we find that 10,416 in the septimal system is equal to 2,610 in the decimal system. Now we multiply 145 by 18 in the decimal system; the result is 2,610, so the calculations check.
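The same check can be run mechanically. The sketch below relies on Python's built-in `int(text, base)` to read a septimal symbol; the helper `to_base_string` is a name introduced here for illustration only.

```python
def to_base_string(z, base):
    """Write the positive integer z as a digit string in the given base (base <= 10)."""
    digits = ""
    while z > 0:
        z, remainder = divmod(z, base)
        digits = str(remainder) + digits
    return digits or "0"

a = int("265", 7)        # value of the septimal symbol 265
b = int("24", 7)         # value of the septimal symbol 24
product_decimal = a * b  # ordinary decimal multiplication
product_septimal = to_base_string(product_decimal, 7)
```

Converting both factors to decimal, multiplying, and converting back reproduces the septimal symbol obtained by hand.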
Exercises: 1) Set up the addition and multiplication tables in the duodecimal system and work some examples of the same sort.
2) Express "thirty" and "one hundred and thirty-three" in the systems with the bases 5, 7, 11, 12.
3) What do the symbols 11111 and 21212 mean in
4) Form the addition and multiplication tables for
mathematical induction.
as anything can be
$1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}$
$1 + 2 + 3 + \cdots + r = \frac{r(r+1)}{2},$
$1 + 2 + 3 + \cdots + r + (r+1) = \frac{r(r+1)}{2} + (r+1) = \frac{(r+1)(r+2)}{2},$
which is precisely the statement $A_{r+1}$. b) The statement $A_1$ is obviously true, since $1 = \frac{1 \cdot 2}{2}$.
$S_n = 1 + 2 + \cdots + (n-1) + n$
and
$S_n = n + (n-1) + \cdots + 2 + 1.$
On adding, we see that each pair of numbers in the same column yields the sum n + 1, and, since there are n columns in all, it follows that
$2S_n = n(n+1),$
From (1) we may immediately derive the formula for the sum of the first (n + 1) terms of any arithmetical progression,
(2) $P_n = a + (a+d) + (a+2d) + \cdots + (a+nd) = \frac{(n+1)(2a+nd)}{2}.$
For
$P_n = (n+1)a + (1 + 2 + \cdots + n)d = (n+1)a + \frac{n(n+1)d}{2} = \frac{(n+1)(2a+nd)}{2}.$
For the
$G_1 = a + aq = a(1 + q).$
$G_r = a + aq + \cdots + aq^r = a\,\frac{1 - q^{r+1}}{1 - q},$
Set
$G_n = a + aq + \cdots + aq^n,$
$qG_n = aq + aq^2 + \cdots + aq^{n+1}.$
$G_n = a\,\frac{1 - q^{n+1}}{1 - q}.$
(4) $1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6},$
and one might guess that this remarkable formula is valid for all integers n. To prove this, we shall again use the principle of mathematical induction. We begin by observing that if the assertion $A_n$, which in this case is the equation (4), is true for the case n = r, so that
$1^2 + 2^2 + 3^2 + \cdots + r^2 = \frac{r(r+1)(2r+1)}{6},$
then on adding $(r+1)^2$ to both sides of this equation we obtain
$1^2 + 2^2 + 3^2 + \cdots + r^2 + (r+1)^2 = \frac{r(r+1)(2r+1)}{6} + (r+1)^2 = \frac{(r+1)[r(2r+1) + 6(r+1)]}{6} = \frac{(r+1)(2r^2 + 7r + 6)}{6} = \frac{(r+1)(r+2)(2r+3)}{6},$
which is the assertion $A_{r+1}$. The statement $A_1$, $1^2 = \frac{1 \cdot 2 \cdot 3}{6}$, is obviously true.
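Before trusting a formula found by guessing, it can be reassuring to test it on many cases. The sketch below compares running sums with the closed forms $n(n+1)/2$ and $n(n+1)(2n+1)/6$; the function name and the chosen limit are illustrative assumptions, and such a finite check is no substitute for the induction proof.

```python
def check_sum_formulas(limit):
    """Compare running sums with the closed formulas for n = 1, ..., limit."""
    total = 0          # 1 + 2 + ... + n
    total_squares = 0  # 1^2 + 2^2 + ... + n^2
    for n in range(1, limit + 1):
        total += n
        total_squares += n * n
        if total != n * (n + 1) // 2:
            return False
        if total_squares != n * (n + 1) * (2 * n + 1) // 6:
            return False
    return True

formulas_agree = check_sum_formulas(1000)
```

The loop mirrors the inductive step: each pass adds the next term to a sum already known to agree with the formula.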
$1^3 + 2^3 + \cdots$
$(1 + p)^n \geq 1 + np,$
which holds for every number p > -1 and positive integer n. (For the sake of generality we are anticipating here the use of negative and nonintegral numbers by allowing p to be any number greater than -1. The proof for the general case is exactly the same as in the case where p is a positive integer.) Again we use mathematical induction.
a) If it is true that $(1 + p)^r \geq 1 + rp$, then on multiplying both sides of this inequality by the positive number 1 + p, we obtain
$(1 + p)^{r+1} \geq 1 + rp + p + rp^2.$
Dropping the positive term $rp^2$ only strengthens this inequality, so that
$(1 + p)^{r+1} \geq 1 + (r + 1)p,$
which shows that the inequality (6) will also hold for the next integer, r + 1. b) It is obviously true that $(1 + p)^1 \geq 1 + p$. This completes the proof that (6) is true for every n. The restriction to numbers p > -1 is essential. If p < -1, then 1 + p is negative and the argument in a) breaks down, since if both members of an inequality are multiplied by a negative quantity, the sense of the inequality is reversed. (For example, if we multiply both sides of the inequality 3 > 2 by -1 we obtain -3 > -2, which is false.)
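The role of the restriction p > -1 can be seen in a small experiment. This is a sketch using exact rational arithmetic from the standard `fractions` module; the function name `inequality_holds` and the sampled values of p and n are assumptions made for illustration.

```python
from fractions import Fraction

def inequality_holds(p, n):
    """Test (1 + p)^n >= 1 + n*p exactly, for a rational p and positive integer n."""
    p = Fraction(p)
    return (1 + p) ** n >= 1 + n * p

# Sample p = -0.9, -0.8, ..., 3.0 (all greater than -1) and n = 1, ..., 20.
holds_for_samples = all(
    inequality_holds(Fraction(k, 10), n)
    for k in range(-9, 31)
    for n in range(1, 21)
)

# With p = -4 and n = 3 the left side is -27 and the right side is -11.
fails_below_minus_one = not inequality_holds(-4, 3)
```

The failing case shows that the inequality itself, and not merely the proof, can break down once 1 + p is negative.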
$(a + b)^1 = a + b,$
$(a + b)^2 = a^2 + 2ab + b^2,$
$(a + b)^3 = a^3 + 3a^2b + 3ab^2 + b^3,$
which gives at once the general rule for forming the coefficients in the expansion of $(a + b)^n$. We construct a triangular array of numbers,
                      1   1
                    1   2   1
                  1   3   3   1
                1   4   6   4   1
              1   5  10  10   5   1
            1   6  15  20  15   6   1
          1   7  21  35  35  21   7   1
The nth row of this array gives the coefficients in the expansion of $(a + b)^n$ in descending powers of a and ascending powers of b; thus
$(a + b)^7 = a^7 + 7a^6b + 21a^5b^2 + 35a^4b^3 + 35a^3b^4 + 21a^2b^5 + 7ab^6 + b^7.$
$C_0^n = C_n^n = 1.$
(8) $C_i^n = C_{i-1}^{n-1} + C_i^{n-1}.$
$C_i^n = \frac{n(n-1)(n-2)\cdots(n-i+1)}{i!} = \frac{n!}{i!(n-i)!}.$
(For any positive integer n, the symbol n! (read, "n factorial") denotes the product of the first n integers: n! = 1·2·3···n. It is convenient also to define 0! = 1, so that (9) is valid for i = 0 and i = n.)
This explicit formula for the coefficients in the binomial expansion is sometimes called the binomial theorem. (See also p. 475.)
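The two descriptions of the binomial coefficients, by the addition rule of the triangular array and by the factorial formula, can be compared directly. This is a minimal sketch; the function names `pascal_row` and `binomial` are chosen here and are not the book's notation.

```python
from math import factorial

def pascal_row(n):
    """Coefficients of (a + b)^n, built row by row with the addition rule."""
    row = [1]
    for _ in range(n):
        # Each inner entry is the sum of the two entries above it.
        row = [1] + [row[i - 1] + row[i] for i in range(1, len(row))] + [1]
    return row

def binomial(n, i):
    """The explicit formula n! / (i! (n - i)!)."""
    return factorial(n) // (factorial(i) * factorial(n - i))

row_7 = pascal_row(7)
formula_7 = [binomial(7, i) for i in range(8)]
```

Both lists give the coefficients that appear in the expansion of $(a + b)^7$, one by repeated addition and one by the closed formula.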
Exercises: Prove by mathematical induction:
1) ...
2) ...
3) $1 + 2q + 3q^2 + \cdots + nq^{n-1} = \frac{1 - (n+1)q^n + nq^{n+1}}{(1-q)^2}.$
4) $(1 + q)(1 + q^2)(1 + q^4) \cdots (1 + q^{2^n}) = \frac{1 - q^{2^{n+1}}}{1 - q}.$
...
$1^3 + 3^3 + \cdots + (2n+1)^3 = (n+1)^2(2n^2 + 4n + 1).$
{1
we define
""a""b.
"'n,then
a- b."
a) Suppose $A_r$ to be true. Let a and b be any two positive integers such that max (a, b) = r + 1. Consider the two integers
α = a - 1,
β = b - 1;
then max (α, β) = r. Hence α = β, for we are assuming $A_r$ to be true. It follows that a = b; hence $A_{r+1}$ is true.
b) $A_1$ is obviously true, for if max (a, b) = 1, then since a and b are by hypothesis positive integers they must both be equal to 1. Therefore, by mathematical induction, $A_n$ is true for every n.
Now if a and b are any two positive integers whatsoever, denote max (a, b) by r. Since $A_n$ has been shown to be true for every n, in particular $A_r$ is true. Hence a = b.
SUPPLEMENT TO CHAPTER I
THE THEORY OF NUMBERS
INTRODUCTION
The integers have gradually lost their association with superstition and mysticism, but their interest for mathematicians has never waned. Euclid (circa 300 B.C.), whose fame rests on the portion of his Elements that forms the foundation of geometry studied in high school, seems to have made original contributions to number theory, while his geometry was largely a compilation of previous results. Diophantus of Alexandria (circa 275 A.D.), an early algebraist, left his mark on the theory of numbers. Pierre de Fermat (1601-1665), a jurist of Toulouse, and one of the greatest mathematicians of his time, initiated the modern work in this field. Euler (1707-1783), the most prolific of mathematicians, included much number-theoretical work in his researches. Names prominent in the annals of mathematics (Legendre, Dirichlet, Riemann) can be added to the list. Gauss (1777-1855), the foremost mathematician of modern times, who devoted himself to many different branches of mathematics, is said to have expressed his opinion of number theory in the remark, "Mathematics is the queen of the sciences and the theory of numbers is the queen of mathematics."
in mathematics as a whole,
the number 5 or the number
2, 4, 6, 8, ...,
or the class of all integers divisible by 3,
3, 6, 9, 12, ...,
or the class of all squares of integers,
1, 4, 9, 16, ...,
and so on.
composite.
One of the first questions that arises concerning the class of primes is whether there is only a finite number of different primes or whether the class of primes contains infinitely many members, like the class of all integers, of which it forms a part. The answer is: There are infinitely many primes.
+ 1,
Although this proof is indirect, it can easily be modified to give a method for constructing, at least in theory, an infinite sequence of primes. Starting with any prime number, such as $p_1 = 2$, suppose we have found n primes $p_1, p_2, \ldots, p_n$; we then observe that the number $p_1 p_2 \cdots p_n + 1$ either is itself a prime or contains as a factor a prime which differs from those already found. Since this factor can always be found by direct trial, we are sure in any case to find at least one new prime $p_{n+1}$; proceeding in this way we see that the sequence of constructible primes can never end.
Exercise: Carry out this construction starting with $p_1 = 2$, $p_2 = 3$ and find 5 more primes.
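The construction can be carried out by machine. The sketch below makes one choice that the text leaves open: when the product plus one is composite, it takes the smallest prime factor, found by direct trial. The function names are illustrative assumptions.

```python
def smallest_prime_factor(n):
    """Smallest prime dividing n (n >= 2), found by direct trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor up to sqrt(n), so n itself is prime

def construct_primes(start, count):
    """Extend the list of primes `start` until it has `count` members."""
    primes = list(start)
    while len(primes) < count:
        product = 1
        for p in primes:
            product *= p
        # product + 1 leaves remainder 1 on division by every prime found so far,
        # so any prime factor of it is new.
        primes.append(smallest_prime_factor(product + 1))
    return primes

sequence = construct_primes([2, 3], 7)
```

The new primes do not come out in increasing order, since the smallest factor of the product plus one may be smaller than primes already found.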
(1) $m = p_1 p_2 \cdots p_r = q_1 q_2 \cdots q_s,$
where the p's and q's are primes. By rearranging the order of the p's and q's if necessary, we may suppose that
$p_1 \leq p_2 \leq \cdots \leq p_r, \quad q_1 \leq q_2 \leq \cdots \leq q_s.$
Now
factor from each side of equation (1) and obtain two essentially different
(3) $m' = p_1(p_2 p_3 \cdots p_r - q_2 q_3 \cdots q_s),$
(4) $m' = (q_1 - p_1)(q_2 q_3 \cdots q_s).$
Since $p_1 < q_1$, it follows from (4) that m' is a positive integer, while from (2) it follows that m' is smaller than m. Hence the prime decomposition of m' must be unique, aside from the order of the factors. But from (3) it appears that the prime $p_1$ is a factor of m', hence from (4) $p_1$ must appear as a factor of either $(q_1 - p_1)$ or $(q_2 q_3 \cdots q_s)$. (This follows from the assumed uniqueness of the prime decomposition of m'; see the reasoning in the next paragraph.) The latter is impossible, since all the q's are larger than $p_1$. Hence $p_1$ must be a factor of $q_1 - p_1$, so that for some integer h,
$q_1 - p_1 = p_1 \cdot h$ or $q_1 = p_1(h + 1).$
But this shows that $p_1$ is a factor of $q_1$, contrary to the fact that $q_1$ is a prime. This contradiction shows our initial assumption to be untenable and hence completes the proof of the fundamental theorem of arithmetic.
An important corollary of the fundamental theorem is the following: If a prime p is a factor of the product ab, then p must be a factor of either a or b. For if p were a factor of neither a nor b, then the product of the prime decompositions of a and b would yield a prime decomposition of the integer ab not containing p. On the other hand, since p is assumed to be a factor of ab, there exists an integer t such that
ab = pt.
Hence the product of p by a prime decomposition of t would yield a prime decomposition of the integer ab containing p, contrary to the fact that the prime decomposition of ab is unique.
Examples: If one has verified the fact that 13 is a factor of 2652, and the fact that 2652 = 6·442, one may conclude that 13 is a factor of 442. On the other hand, 6 is a factor of 240, and 240 = 15·16, but 6 is not a factor of either 15 or 16. This shows that the assumption that p is prime is an essential one.
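Exercise: In order to find all the divisors of any number a we need only decompose a into a product
$a = p_1^{\alpha_1} \cdot p_2^{\alpha_2} \cdots p_r^{\alpha_r},$
where the p's are distinct primes, each raised to a certain power. All the divisors of a are the numbers
$b = p_1^{\beta_1} \cdot p_2^{\beta_2} \cdots p_r^{\beta_r},$
where the β's are any integers satisfying the inequalities
$0 \leq \beta_1 \leq \alpha_1, \quad 0 \leq \beta_2 \leq \alpha_2, \quad \ldots, \quad 0 \leq \beta_r \leq \alpha_r.$
For example, $144 = 2^4 \cdot 3^2$ has $5 \cdot 3$ divisors. They are 1, 2, 4, 8, 16, 3, 6, 12, 24, 48, 9, 18, 36, 72, 144.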
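The recipe of this exercise, namely decomposing the number into prime powers and then letting each exponent range from zero up to its value in the decomposition, can be sketched as follows. The function names `factorize` and `divisors` are assumptions of this example, and trial division is used only because it is the simplest method.

```python
from itertools import product

def factorize(a):
    """Prime decomposition of a >= 1 as a dict {prime: exponent}, by trial division."""
    factors = {}
    d = 2
    while d * d <= a:
        while a % d == 0:
            factors[d] = factors.get(d, 0) + 1
            a //= d
        d += 1
    if a > 1:
        factors[a] = factors.get(a, 0) + 1
    return factors

def divisors(a):
    """All divisors of a, formed from every admissible choice of exponents."""
    factors = factorize(a)
    primes = list(factors)
    found = []
    # Each exponent runs independently from 0 up to its value in the decomposition.
    for exponents in product(*(range(factors[p] + 1) for p in primes)):
        d = 1
        for p, e in zip(primes, exponents):
            d *= p ** e
        found.append(d)
    return sorted(found)

divisors_of_144 = divisors(144)
```

For 144 the exponents of 2 take five values and those of 3 take three, which accounts for the fifteen divisors listed in the text.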
$2^{2^n} + 1$
are primes.
$F(1) = 2^{2} + 1 = 5,$
$F(2) = 2^{2^2} + 1 = 2^4 + 1 = 17,$
$F(3) = 2^{2^3} + 1 = 2^8 + 1 = 257,$
$F(4) = 2^{2^4} + 1 = 2^{16} + 1 = 65{,}537,$
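These values are small enough to test for primality by direct trial. The sketch below does so with a simple trial-division routine; the function names are illustrative. It also checks one fact that is not taken from the lines above but is well known: the next number of this form, $2^{32} + 1$, is divisible by 641.

```python
def fermat(n):
    """The number 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1

def is_prime(m):
    """Primality by trial division; adequate for the small values used here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

values = [fermat(n) for n in range(1, 5)]
all_prime = all(is_prime(v) for v in values)

# Known fact, stated here as an addition to the text: 641 divides 2^32 + 1.
fifth_is_composite = fermat(5) % 641 == 0
```

So the pattern of primes seen for n = 1, 2, 3, 4 does not continue to n = 5.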
$(4a + 1)(4b + 1) = 16ab + 4a + 4b + 1 = 4(4ab + a + b) + 1.$
Now suppose there were but a finite number of primes, $p_1, p_2, \ldots, p_n$, of the form 4n + 3, and consider the number
$N = 4(p_1 p_2 \cdots p_n) - 1 = 4(p_1 p_2 \cdots p_n - 1) + 3.$
will also increase without limit (although more slowly). For we know that there are infinitely many primes, so the values of $A_n$ will sooner or later exceed any finite number. The "density" of the primes among the first n integers is given by the ratio $A_n/n$, and from a table of primes the values of $A_n/n$ may be computed empirically for fairly large values of n.
n        A_n/n
10^3     0.168
10^6     0.078498
10^9     0.050847478
The last entry in this table may be regarded as giving the probability that an integer picked at random from among the first $10^9$ integers will be a prime, since there are $10^9$ possible choices, of which $A_{10^9}$ are primes.
The distribution of the individual primes among the integers is extremely
irregular. But this irregularity "in the small" disappears if
we fix our attention on the average distribution of the primes as given
by the ratio $A_n/n$. The simple law that governs the behavior of
this ratio is one of the most remarkable discoveries in the whole of
mathematics. In order to state the prime number theorem we must
define the "natural logarithm" of an integer n. To do this we take two
perpendicular axes in a plane, and consider the locus of all points in
the plane the product of whose distances x and y from these axes is
equal to one. In terms of the coordinates x and y this locus, an equilateral
hyperbola, is defined by the equation xy = 1. We now define log n
to be the area in Figure 5 bounded by the hyperbola, the x-axis, and
the two vertical lines x = 1 and x = n. (A more detailed discussion of
the logarithm will be found in Chapter VIII.) From an empirical study
of prime number tables Gauss observed that the ratio $A_n/n$ is approximately
equal to 1/log n, and that this approximation appears to improve
as n increases.
n        A_n/n          1/log n        (A_n/n) / (1/log n)
10^3     0.168          0.145          1.159
10^6     0.078498       0.072382       1.084
10^9     0.050847478    0.048254942    1.053

Fig. 5. The area of the shaded region under the hyperbola defines log n.
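The comparison of $A_n/n$ with 1/log n can be repeated by machine for moderate n. The following is a minimal sketch, not part of the original text: the function name `prime_count` is an illustrative choice, and the sieve of Eratosthenes is simply one convenient way to count primes. It stops at $10^6$, since sieving up to $10^9$ would need far more memory.

```python
import math

def prime_count(n: int) -> int:
    """Return A_n, the number of primes among 1, 2, ..., n (sieve of Eratosthenes)."""
    if n < 2:
        return 0
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, math.isqrt(n) + 1):
        if is_prime[p]:
            # Strike out every multiple of p, starting at p*p.
            is_prime[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return sum(is_prime)

for exponent in (3, 6):
    n = 10 ** exponent
    density = prime_count(n) / n      # A_n / n
    estimate = 1 / math.log(n)        # 1 / log n (natural logarithm)
    print(f"n = 10^{exponent}: A_n/n = {density:.6f}, "
          f"1/log n = {estimate:.6f}, ratio = {density / estimate:.3f}")
```

The printed ratios may differ slightly in the last digit from a table that divides already rounded values.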
On the basis of such empirical evidence Gauss made the conjecture that
the ratio $A_n/n$ is "asymptotically equal" to 1/log n. By this is meant
that if we take a sequence of larger and larger values of n, say n equal to
$10, 10^2, 10^3, 10^4, \ldots,$
then the ratio of $A_n/n$ to 1/log n,
$\frac{A_n/n}{1/\log n},$
calculated for these successive values of n, will become more and more
nearly equal to 1, and that the difference of this ratio from 1 can be
made as small as we please by confining ourselves to sufficiently large
values of n. This assertion is symbolically expressed by the sign $\sim$:
$\frac{A_n}{n} \sim \frac{1}{\log n}$
means that $\frac{A_n/n}{1/\log n}$ tends to 1 as n increases.
speaking, it is difficult to establish connections between the multiplicative and the additive properties of integers.
Until recently, a proof of Goldbach's conjecture seemed completely
inaccessible. Today a solution no longer seems out of reach. An
important success, very unexpected and startling to all experts, was
achieved in 1931 by a then unknown young Russian mathematician,
Schnirelmann (1905-1938), who proved that every positive integer can
be represented as the sum of not more than 300,000 primes. Though this
result seems ludicrous in comparison with the original goal of proving
1. General Concepts
Whenever the question of the divisibility of integers by a fixed integer
d occurs, the concept and the notation of "congruence" (due to Gauss)
 0 = 0·5 + 0       7 = 1·5 + 2      -1 = -1·5 + 4
 1 = 0·5 + 1       8 = 1·5 + 3      -2 = -1·5 + 3
 2 = 0·5 + 2       9 = 1·5 + 4      -3 = -1·5 + 2
 3 = 0·5 + 3      10 = 2·5 + 0      -4 = -1·5 + 1
 4 = 0·5 + 4      11 = 2·5 + 1      -5 = -1·5 + 0
 5 = 1·5 + 0      12 = 2·5 + 2      -6 = -2·5 + 4
 6 = 1·5 + 1      etc.              etc.
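The decompositions of integers in the form q·5 + r with remainder r among 0, 1, 2, 3, 4 can be reproduced with a few lines of code. This is a small sketch under the assumption that the built-in `divmod` is acceptable; the wrapper name `quotient_and_remainder` is hypothetical. Python floors the quotient, so the remainder stays non-negative for negative integers as well, matching the convention used for -1, -2, and so on.

```python
def quotient_and_remainder(n: int, d: int = 5) -> tuple[int, int]:
    """Return (q, r) with n = q*d + r and 0 <= r < d, for a modulus d > 0."""
    # divmod floors the quotient, so r is never negative when d > 0.
    return divmod(n, d)

for n in (-6, -1, 0, 4, 7, 12):
    q, r = quotient_and_remainder(n)
    print(f"{n} = {q}*5 + {r}")
```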
= c (mod d).
4') $a + b \equiv a' + b' \pmod d$.
5') $a - b \equiv a' - b' \pmod d$.
6') $ab \equiv a'b' \pmod d$.
Congruences with respect to the same modulus may thus be added, subtracted, and multiplied. For if
$a = a' + rd, \quad b = b' + sd,$
then
$a + b = a' + b' + (r + s)d,$
$a - b = a' - b' + (r - s)d,$
$ab = a'b' + (a's + b'r + rsd)d,$
since they leave the same remainder. In order to show this geometrically,
we use a circle divided into d equal parts. Any integer when
divided by d leaves as remainder one of the d numbers 0, 1, ..., d - 1,
which are placed at equal intervals on the circumference of the circle.
Every integer is congruent modulo d to one of these numbers, and hence
is represented geometrically by one of these points; two numbers are
congruent if they are represented by the same point. Figure 7 is drawn
for the case d = 6. The face of a clock is another illustration from
daily life.
Fig. 6. Geometrical representation of the integers.
Fig. 7. Geometrical representation of the integers modulo 6.
$10 \equiv -1 \pmod{11}$, since $10 = -1 + 11$.
Successively multiplying this congruence by itself, we obtain
$10^2 \equiv (-1)(-1) = 1 \pmod{11},$
$10^3 \equiv -1 \pmod{11}$, etc.
$z - t = a_1 \cdot 11 + a_2(10^2 - 1) + a_3(10^3 + 1) + a_4(10^4 - 1) + \cdots.$
Since all the numbers $11, 10^2 - 1, 10^3 + 1, \ldots$ are congruent to 0 modulo
11, z - t is also, and therefore z leaves the same remainder on division
by 11 as does t. It follows in particular that a number is divisible by 11
(i.e. leaves the remainder 0) if and only if the alternating sum of its digits
is divisible by 11. For example, since 3 - 1 + 6 - 2 + 8 - 1 + 9 =
22, the number z = 3162819 is divisible by 11. To find a rule for
divisibility by 3 or 9 is even simpler, since $10 \equiv 1$ (mod 3 or 9), and
therefore $10^n \equiv 1$ (mod 3 or 9) for any n. It follows that a number z
is divisible by 3 or 9 if and only if the sum of its digits
$s = a_0 + a_1 + a_2 + \cdots + a_n$
is likewise divisible by 3 or 9, respectively.
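The two digit rules above lend themselves to a direct check. The sketch below is illustrative only; the helper names `digits`, `alternating_sum`, and `digit_sum` are assumptions introduced here, and the digits are indexed from the units digit $a_0$ upward, as in the sum s.

```python
def digits(z: int) -> list[int]:
    """Decimal digits of a non-negative integer, units digit first (a_0, a_1, ...)."""
    return [int(ch) for ch in reversed(str(z))]

def alternating_sum(z: int) -> int:
    """t = a_0 - a_1 + a_2 - a_3 + ..., congruent to z modulo 11."""
    return sum(d if i % 2 == 0 else -d for i, d in enumerate(digits(z)))

def digit_sum(z: int) -> int:
    """s = a_0 + a_1 + ... + a_n, congruent to z modulo 3 and modulo 9."""
    return sum(digits(z))

z = 3162819
print(alternating_sum(z), z % 11)   # alternating sum and remainder of z on division by 11
print(digit_sum(z), z % 9)          # digit sum and remainder of z on division by 9
```

Because z and t differ by a multiple of 11, comparing `alternating_sum(z) % 11` with `z % 11` gives the same remainder for every z, not only for multiples of 11.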
For congruences modulo 7 we have
is divisible by 7.
Exercise: Find a similar rule for divisibility by 13
0, 1, 2, 3, 4

a + b (mod 5)
         b ≡ 0  1  2  3  4
a ≡ 0        0  1  2  3  4
    1        1  2  3  4  0
    2        2  3  4  0  1
    3        3  4  0  1  2
    4        4  0  1  2  3

a · b (mod 5)
         b ≡ 0  1  2  3  4
a ≡ 0        0  0  0  0  0
    1        0  1  2  3  4
    2        0  2  4  1  3
    3        0  3  1  4  2
    4        0  4  3  2  1
7) $ab \equiv 0 \pmod d$ only if either $a \equiv 0$ or $b \equiv 0 \pmod d$,
which is an extension of the ordinary law for integers which states that
ab = 0 only if a = 0 or b = 0. The law 7) holds only when the modulus d
is a prime. For the congruence
$ab \equiv 0 \pmod d$
means that d divides ab, and we have seen that a prime d divides a
product ab only if it divides a or b; that is, only if
$a \equiv 0 \pmod d$ or $b \equiv 0 \pmod d$.
If d is not a prime the law need not hold; for we can write d = r·s,
where r and s are less than d, so that
$r \not\equiv 0 \pmod d, \quad s \not\equiv 0 \pmod d,$
but
$rs = d \equiv 0 \pmod d.$
For example, $2 \not\equiv 0 \pmod 6$ and $3 \not\equiv 0 \pmod 6$, but $2 \cdot 3 = 6 \equiv 0 \pmod 6$.
Exercise: Show that the following law of cancellation holds for congruences with respect to a prime modulus:
If $ab \equiv ac$ and $a \not\equiv 0$, then $b \equiv c$.
inclusive
2. Fermat's Theorem
In the seventeenth century, Fermat, the founder of modern number
theory, discovered a most important theorem: If p is any prime which
does not divide the integer a, then
$a^{p-1} \equiv 1 \pmod p.$
This means that the (p - 1)st power of a leaves the remainder 1 upon
division by p.
Some of our previous calculations confirm this theorem; for example,
we found that $10^6 \equiv 1 \pmod 7$, $10^2 \equiv 1 \pmod 3$, and $10^{10} \equiv 1 \pmod{11}$.
Likewise we may show that $2^{12} \equiv 1 \pmod{13}$ and $5^{10} \equiv 1 \pmod{11}$.
To check the latter congruences we need not actually calculate
such high powers, since we may take advantage of the multiplicative
property of congruences:
$2^4 = 16 \equiv 3, \quad 2^8 \equiv 9 \equiv -4, \quad 2^{12} \equiv -4 \cdot 3 = -12 \equiv 1 \pmod{13},$
$5^2 \equiv 3, \quad 5^4 \equiv 9 \equiv -2, \quad 5^8 \equiv 4, \quad 5^{10} \equiv 3 \cdot 4 = 12 \equiv 1 \pmod{11}.$
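$m_1 = a, \quad m_2 = 2a, \quad m_3 = 3a, \quad \ldots, \quad m_{p-1} = (p - 1)a.$
$m_1 m_2 \cdots m_{p-1} = 1 \cdot 2 \cdot 3 \cdots (p - 1)\, a^{p-1} \equiv 1 \cdot 2 \cdot 3 \cdots (p - 1) \pmod p,$
or, if for brevity we write K for $1 \cdot 2 \cdot 3 \cdots (p - 1)$,
$K(a^{p-1} - 1) \equiv 0 \pmod p.$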
But K is not divisible by p, since none of its factors is; hence by the
law 7), $(a^{p-1} - 1)$ must be divisible by p, i.e.
$a^{p-1} - 1 \equiv 0 \pmod p.$
This is Fermat's theorem.
To check the theorem once more, let us take p = 23 and a = 5.
We then have, all modulo 23, $5^2 \equiv 2$, $5^4 \equiv 4$, $5^8 \equiv 16 \equiv -7$, $5^{16} \equiv 49 \equiv 3$,
$5^{20} \equiv 12$, $5^{22} \equiv 24 \equiv 1$. With a = 4 instead of 5, we get,
again modulo 23, $4^2 \equiv -7$, $4^3 \equiv -28 \equiv -5$, $4^4 \equiv -20 \equiv 3$, $4^8 \equiv 9$,
$4^{11} \equiv -45 \equiv 1$, $4^{22} \equiv 1$.
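The numerical checks of Fermat's theorem above can be repeated mechanically. A brief sketch follows, assuming Python's three-argument `pow`, which performs modular exponentiation by repeated squaring in much the same spirit as the hand computation; the function name `fermat_holds` is hypothetical.

```python
def fermat_holds(a: int, p: int) -> bool:
    """Check a^(p-1) ≡ 1 (mod p), for a prime p that does not divide a."""
    return pow(a, p - 1, p) == 1

# The cases worked out in the text.
for a, p in [(10, 7), (10, 3), (10, 11), (2, 13), (5, 11), (5, 23), (4, 23)]:
    print(f"{a}^{p - 1} mod {p} = {pow(a, p - 1, p)}")

# Intermediate powers from the example with p = 23.
print(pow(5, 16, 23), pow(5, 20, 23), pow(4, 11, 23))
```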
where 0 $ r
ke
+ r,
art .. a
1 (mod p).)
3. Quadratic Residues
Referring to the examples for Fermat's theorem, we find that not
only is $a^{p-1} \equiv 1 \pmod p$ always, but (if p is a prime different from 2,
therefore odd and of the form p = 2p' + 1) that for some values of a,
$a^{p'} = a^{(p-1)/2} \equiv 1 \pmod p$. This fact suggests a chain of interesting
investigations. We may write the theorem in the following form:
$a^{p-1} - 1 = (a^{p'} - 1)(a^{p'} + 1) \equiv 0 \pmod p,$
so that
$a^{(p-1)/2} \equiv 1$ or $a^{(p-1)/2} \equiv -1 \pmod p.$
$a \equiv x^2 \pmod p.$
$(p - x)^2 \equiv x^2 \pmod p$
(e.g., $2^2 \equiv 5^2 \pmod 7$),
since $(p - x)^2 = p^2 - 2px + x^2 \equiv x^2 \pmod p$. Hence half the numbers 1, 2, ..., p - 1 are quadratic residues of p and half are quadratic
non-residues.
To illustrate the quadratic reciprocity law, let us choose p = 5,
q = 11. Since $11 \equiv 1^2 \pmod 5$, 11 is a quadratic residue (mod 5);
since the product [(5 - 1)/2][(11 - 1)/2] is even, the reciprocity law
tells us that 5 is a quadratic residue (mod 11). In confirmation of this,
we observe that $5 \equiv 4^2 \pmod{11}$. On the other hand, if p = 7, q = 11,
the product [(7 - 1)/2][(11 - 1)/2] is odd, and indeed 11 is a residue
(mod 7) (since $11 \equiv 2^2 \pmod 7$) while 7 is a non-residue (mod 11).
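The residue computations in this illustration can be checked by listing squares modulo p. The following sketch is an assumption-laden illustration rather than the book's method: `quadratic_residues` enumerates the squares directly, and `is_residue` uses the criterion that $a^{(p-1)/2}$ is congruent to 1 exactly for the residues (a standard fact usually attributed to Euler). Both names are hypothetical.

```python
def quadratic_residues(p: int) -> set[int]:
    """All values x^2 mod p for x = 1, ..., p - 1 (p an odd prime)."""
    return {x * x % p for x in range(1, p)}

def is_residue(a: int, p: int) -> bool:
    """True if a^((p-1)/2) ≡ 1 (mod p), i.e. a is a quadratic residue of p."""
    return pow(a, (p - 1) // 2, p) == 1

print(sorted(quadratic_residues(5)))
print(sorted(quadratic_residues(7)))
print(sorted(quadratic_residues(11)))
print(is_residue(11, 5), is_residue(5, 11))   # the case p = 5, q = 11
print(is_residue(11, 7), is_residue(7, 11))   # the case p = 7, q = 11
```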
tx -
y = -l,
+ ty
= 1
$\frac{a}{c} = \frac{v^2 - u^2}{u^2 + v^2}, \quad \frac{b}{c} = \frac{2uv}{u^2 + v^2}.$
Therefore
(2) $a = (v^2 - u^2)r, \quad b = (2uv)r, \quad c = (u^2 + v^2)r,$
$a^2 = (u^4 - 2u^2v^2 + v^4)r^2,$
$b^2 = (4u^2v^2)r^2,$
$c^2 = (u^4 + 2u^2v^2 + v^4)r^2,$
so that $a^2 + b^2 = c^2$.
This result may be simplified somewhat. From any Pythagorean
number triple (a, b, c) we may derive infinitely many other Pythagorean
triples (sa, sb, sc) for any positive integer s. Thus, from (3, 4, 5) we
obtain (6, 8, 10), (9, 12, 15), etc. Such triples are not essentially distinct,
since they correspond to similar right triangles. We shall therefore
define a primitive Pythagorean number triple to be one where a,
b, and c have no common factor. It can then be shown that the formulas
$a = v^2 - u^2, \quad b = 2uv, \quad c = u^2 + v^2,$
for any positive integers u and v with v > u, where u and v have no common
factor and are not both odd, yield all primitive Pythagorean number
triples.
Exercise: Prove the last statement.
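The formulas for primitive Pythagorean triples translate directly into a short generator. This is a sketch only: the name `primitive_triples` and the cut-off parameter `limit` (an upper bound on c) are assumptions added for illustration. The condition "no common factor and not both odd" is tested as gcd(u, v) = 1 together with u + v odd.

```python
from math import gcd

def primitive_triples(limit: int):
    """Yield (a, b, c) with c <= limit from a = v^2 - u^2, b = 2uv, c = u^2 + v^2."""
    v = 2
    while v * v + 1 <= limit:
        for u in range(1, v):
            # u, v coprime and not both odd  <=>  gcd(u, v) == 1 and u + v odd
            if (u + v) % 2 == 1 and gcd(u, v) == 1:
                c = u * u + v * v
                if c <= limit:
                    yield (v * v - u * u, 2 * u * v, c)
        v += 1

for a, b, c in primitive_triples(30):
    print(a, b, c, a * a + b * b == c * c)
```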
1. General Theory
The reader is familiar with the ordinary process of long division of one
integer a by another integer b and knows that the process can be carried
out until the remainder is smaller than the divisor. Thus if a = 648
and b = 7 we have a quotient q = 92 and a remainder r = 4.
      92
 7 | 648
     63
      18
      14
       4
648 = 7·92 + 4.
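We may state this as a general theorem: If a is any integer and b is
any integer greater than 0, then we can always find an integer q such that
(1) $a = b \cdot q + r$, where $0 \le r < b$.
To prove this statement without making use of the process of long division we
need only observe that any integer a is either itself a multiple of b,
$a = bq,$
or lies between two successive multiples of b,
$bq < a < b(q + 1) = bq + b.$
In the first case the equation (1) holds with r = 0. In the second case we have,
from the first of the inequalities above,
$a - bq = r > 0,$
while from the second inequality we have
$a - bq = r < b,$
so that 0 < r < b, as required by (1).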
$a = bq + r$
it follows that
(3) $(a, b) = (b, r).$
For any integer u which divides both a and b,
$a = su, \quad b = tu,$
also divides r, since r = a - bq = su - qtu = (s - qt)u; and conversely, every number v which divides b and r,
$b = s'v, \quad r = t'v,$
we find that
$1804 = 5 \cdot 328 + 164,$
so that, by (3),
$(1804, 328) = (328, 164).$
Since
$328 = 2 \cdot 164 + 0,$
by a process.
(4)
$a = bq_1 + r_1 \quad (0 < r_1 < b)$
$b = r_1 q_2 + r_2 \quad (0 < r_2 < r_1)$
$r_1 = r_2 q_3 + r_3 \quad (0 < r_3 < r_2)$
$r_2 = r_3 q_4 + r_4 \quad (0 < r_4 < r_3)$
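Hence after at most b steps (often many fewer, since the difference
between two successive r's is usually greater than 1) the remainder 0 must
appear:
$r_{n-1} = r_n q_{n+1} + 0.$
When this occurs, $(a, b) = r_n$; that is, (a, b) is the last positive
remainder in the sequence (4). This follows from successive application
of the equality (3), since from successive lines of (4) we have
$(a, b) = (b, r_1), \quad (b, r_1) = (r_1, r_2), \quad \ldots, \quad (r_{n-1}, r_n) = (r_n, 0) = r_n.$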
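The chain of divisions in the Euclidean algorithm can be traced step by step in code. The sketch below is one possible rendering, with the function name `euclid` and the recorded step tuples chosen here for illustration; it repeats the replacement (a, b) → (b, r) until the remainder 0 appears and returns the last non-zero remainder.

```python
def euclid(a: int, b: int) -> tuple[int, list[tuple[int, int, int, int]]]:
    """Return the greatest common divisor (a, b) and the division steps.

    Each step is (dividend, divisor, quotient, remainder), one line of the chain.
    """
    steps = []
    while b != 0:
        q, r = divmod(a, b)
        steps.append((a, b, q, r))
        a, b = b, r          # (a, b) = (b, r)
    return a, steps

d, steps = euclid(1804, 328)
for dividend, divisor, q, r in steps:
    print(f"{dividend} = {q}*{divisor} + {r}")
print("greatest common divisor:", d)
```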
(5).
the eq-
$r_2 = b - q_2 r_1 = b - q_2(k_1 a + l_1 b) = (-q_2 k_1)a + (1 - q_2 l_1)b = k_2 a + l_2 b.$
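$61 = 2 \cdot 24 + 13,$
$24 = 1 \cdot 13 + 11,$
$13 = 1 \cdot 11 + 2,$
$11 = 5 \cdot 2 + 1,$
$2 = 2 \cdot 1 + 0.$
Working through these equations in turn, we find
$13 = 61 - 2 \cdot 24,$
$11 = 24 - 13 = 24 - (61 - 2 \cdot 24) = -61 + 3 \cdot 24,$
$2 = 13 - 11 = (61 - 2 \cdot 24) - (-61 + 3 \cdot 24) = 2 \cdot 61 - 5 \cdot 24,$
$1 = 11 - 5 \cdot 2 = (-61 + 3 \cdot 24) - 5(2 \cdot 61 - 5 \cdot 24) = -11 \cdot 61 + 28 \cdot 24.$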
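The back-substitution that expresses the greatest common divisor as ka + lb can also be carried out recursively. What follows is a hedged sketch, not the book's procedure verbatim: the name `extended_euclid` is an assumption, and the recursion simply unwinds the same division steps in reverse order.

```python
def extended_euclid(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, k, l) with d = (a, b) = k*a + l*b."""
    if b == 0:
        return a, 1, 0
    d, k, l = extended_euclid(b, a % b)
    # d = k*b + l*(a - (a // b)*b) = l*a + (k - (a // b)*l)*b
    return d, l, k - (a // b) * l

d, k, l = extended_euclid(61, 24)
print(d, k, l, k * 61 + l * 24)
```

For a = 61 and b = 24 this reproduces the coefficients found by hand, k = -11 and l = 28.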
$b = kab + lpb.$
$b = kpr + lpb = p(kr + lb).$
$N = p_1 p_2 \cdots p_r = q_1 q_2 \cdots q_s.$
Since $p_1$ divides the left side of this equation, it must divide the
right, and hence, by the previous exercise, must divide one of the
factors $q_k$. But $q_k$ is a prime, therefore $p_1$ must be equal to this $q_k$.
After these equal factors have been cancelled from the equation, it
follows that $p_2$ must divide one of the remaining factors $q_t$, and hence
must be equal to it. Striking out $p_2$ and $q_t$, we proceed similarly with
$p_3, \ldots, p_r$. At the end of this process all the p's will be cancelled,
leaving only 1 on the left side. No q can remain on the right side,
since all the q's are larger than one. Hence the p's and q's will be
paired off into equal couples, which proves that, except perhaps for the
order of the factors, the two decompositions were identical.
$(a, b) = 1.$
For example, 24 and 35 are relatively prime, while 12 and 18 are not.
If a and b are relatively prime, then for suitably chosen positive or negative
integers k and l we can write
$ka + lb = 1.$
This follows from the property of (a, b) stated on page 45.
Exercise: Prove the theorem: If an integer r divides a product ab and is relatively
prime to a, then r must divide b. (Hint: if r is relatively prime to a then we can
find integers k and l such that
$kr + la = 1.$
Multiply both sides of this equation by b.) This theorem includes the lemma
of page 46 as a special case, since a prime p is relatively prime to an integer a if
and only if p does not divide a.
For any positive integer n, let $\varphi(n)$ denote the number of integers from
1 to n which are relatively prime to n. This function $\varphi(n)$, first introduced by Euler, is a "number-theoretical function" of great importance.
The values of $\varphi(n)$ for the first few values of n are easily computed:
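φ(1)  = 1   since 1 is relatively prime to 1,
φ(2)  = 1   since 1 is relatively prime to 2,
φ(3)  = 2   since 1 and 2 are relatively prime to 3,
φ(4)  = 2   since 1 and 3 are relatively prime to 4,
φ(5)  = 4   since 1, 2, 3, 4 are relatively prime to 5,
φ(6)  = 2   since 1, 5 are relatively prime to 6,
φ(7)  = 6   since 1, 2, 3, 4, 5, 6 are relatively prime to 7,
φ(8)  = 4   since 1, 3, 5, 7 are relatively prime to 8,
φ(9)  = 6   since 1, 2, 4, 5, 7, 8 are relatively prime to 9,
φ(10) = 4   since 1, 3, 7, 9 are relatively prime to 10,
etc.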
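The values of Euler's function listed above follow directly from its definition, which a short program can apply literally. The sketch below assumes nothing beyond the definition; the function name `phi` is an illustrative choice, and counting with `gcd` is deliberately naive rather than efficient.

```python
from math import gcd

def phi(n: int) -> int:
    """Number of integers from 1 to n that are relatively prime to n."""
    return sum(1 for m in range(1, n + 1) if gcd(m, n) == 1)

for n in range(1, 11):
    coprime = [m for m in range(1, n + 1) if gcd(m, n) == 1]
    print(f"phi({n}) = {phi(n)}  since {coprime} are relatively prime to {n}")
```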
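where the p's represent distinct primes, each raised to a certain power,
then
$\varphi(n) = n\left(1 - \frac{1}{p_1}\right)\left(1 - \frac{1}{p_2}\right) \cdots \left(1 - \frac{1}{p_r}\right).$
For example, since $12 = 2^2 \cdot 3$,
$\varphi(12) = 12\left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{3}\right) = 12\left(\frac{1}{2}\right)\left(\frac{2}{3}\right) = 4,$
as it should be. The proof is quite elementary, but will be omitted here.
generalize Fermat's theorem of page 37.
If n is any integer, and a is relatively prime to n, then
$a^{\varphi(n)} \equiv 1 \pmod n.$
4. Continued Fractions. Diophantine Equations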
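$840 = 1 \cdot 611 + 229,$
$611 = 2 \cdot 229 + 153,$
$229 = 1 \cdot 153 + 76,$
$153 = 2 \cdot 76 + 1,$
$\frac{840}{611} = 1 + \frac{229}{611} = 1 + \frac{1}{611/229},$
$\frac{611}{229} = 2 + \frac{153}{229} = 2 + \frac{1}{229/153},$
$\frac{229}{153} = 1 + \frac{76}{153} = 1 + \frac{1}{153/76},$
$\frac{153}{76} = 2 + \frac{1}{76}.$
Combining these equations, we obtain the development of the rational
number
$\frac{840}{611} = 1 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{76}}}}.$
An expression of the form
(7) $a = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots + \cfrac{1}{a_n}}},$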
when' the a's are positive integers, is called a continued fraction. The
Euclidean algorithm gives us a method for expressing any rational
number in this form.
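Since the quotients produced by the Euclidean algorithm are exactly the terms of the continued fraction, the development of a rational number can be computed in a few lines. The following is a sketch with hypothetical names (`continued_fraction`, `evaluate`); it uses 840/611 as the sample fraction and rebuilds the fraction from its terms as a check.

```python
from fractions import Fraction

def continued_fraction(numerator: int, denominator: int) -> list[int]:
    """Terms a_0, a_1, ..., a_n: the successive quotients of the Euclidean algorithm."""
    terms = []
    while denominator != 0:
        q, r = divmod(numerator, denominator)
        terms.append(q)
        numerator, denominator = denominator, r
    return terms

def evaluate(terms: list[int]) -> Fraction:
    """Rebuild a_0 + 1/(a_1 + 1/(a_2 + ...)) from the innermost term outward."""
    value = Fraction(terms[-1])
    for a in reversed(terms[:-1]):
        value = a + 1 / value
    return value

terms = continued_fraction(840, 611)
print(terms, evaluate(terms))
```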
Exercise: Find the continued fraction developments of
$\frac{2}{5}, \quad \frac{43}{30}, \quad \frac{169}{70}.$
Continued fractions are of great importance in the branch of higher arithmetic known as Diophantine analysis. A Diophantine equation is an algebraic equation in one or more unknowns with integer coefficients, for which integer
solutions are sought. Such an equation may have no solutions, a finite number,
or an infinite number of solutions. The simplest case is the linear Diophantine
equation in two unknowns,
(8) $ax + by = c,$
where a, b, and c are given integers, and integer solutions x, y are desired. The
complete solution of an equation of this form may be found by the Euclidean
algorithm.
To begin with, let us find d = (a, b) by the Euclidean algorithm; then for
proper choice of the integers k and l,
(9) $ak + bl = d.$
Hence the equation (8) has the particular solution x = k, y = l for the case c = d.
More generally, if c is any multiple of d:
$c = dq,$
then from (9) we obtain
$a(kq) + b(lq) = dq = c,$
so that (8) has the particular solution x = x* = kq, y = y* = lq. Conversely,
if (8) has any solution x, y for a given c, then c must be a multiple of d = (a, b),
for d divides both a and b, and hence must divide c. We have therefore proved
that the equation (8) has a solution if and only if c is a multiple of (a, b).
To determine the other solutions of (8) we observe that if x = x', y = y' is
any solution other than the one, x = x*, y = y*, found above by the Euclidean
algorithm, then x = x' - x*, y = y' - y* is a solution of the "homogeneous"
equation
(10) $ax + by = 0.$
For if
$ax' + by' = c \quad \text{and} \quad ax^* + by^* = c,$
then on subtracting the second equation from the first we find that
$a(x' - x^*) + b(y' - y^*) = 0.$
Now the most general solution of the equation (10) is x = rb/(a, b), y = -ra/(a, b),
where r is any integer. (We leave the proof as an exercise. Hint: Divide by
(a, b) and use the Exercise on page 48.) It follows immediately that
$x = x^* + rb/(a, b), \quad y = y^* - ra/(a, b).$
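$11 = 1 \cdot 7 + 4, \quad 7 = 1 \cdot 4 + 3, \quad 4 = 1 \cdot 3 + 1, \quad (7, 11) = 1.$
$1 = 4 - 3 = 4 - (7 - 4) = 2 \cdot 4 - 7 = 2(11 - 7) - 7 = 2 \cdot 11 - 3 \cdot 7.$
Hence
$7(-3) + 11(2) = 1,$
$7(-39) + 11(26) = 13.$
$x = -39 + 11r, \quad y = 26 - 7r,$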
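The procedure for the linear Diophantine equation ax + by = c can be summarized as a short program. This is a minimal sketch under stated assumptions: the function names `extended_euclid` and `solve_linear_diophantine` are hypothetical, a and b are taken to be positive, and the sample equations 7x + 11y = 13 and 3x + 6y = 22 are chosen here for illustration.

```python
def extended_euclid(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, k, l) with d = (a, b) = k*a + l*b."""
    if b == 0:
        return a, 1, 0
    d, k, l = extended_euclid(b, a % b)
    return d, l, k - (a // b) * l

def solve_linear_diophantine(a: int, b: int, c: int):
    """Return one integer solution (x, y) of a*x + b*y = c, or None if none exists."""
    d, k, l = extended_euclid(a, b)
    if c % d != 0:
        return None              # solvable only if c is a multiple of (a, b)
    q = c // d
    return k * q, l * q          # x* = kq, y* = lq

a, b, c = 7, 11, 13
x0, y0 = solve_linear_diophantine(a, b, c)
print("particular solution:", x0, y0)
# The remaining solutions are x0 + r*b/(a, b), y0 - r*a/(a, b); here (a, b) = 1.
for r in range(-1, 3):
    x, y = x0 + r * b, y0 - r * a
    print(r, x, y, a * x + b * y)
print(solve_linear_diophantine(3, 6, 22))   # no solution: 3 does not divide 22
```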
CHAPTER II
THE NUMBER SYSTEM OF MATHEMATICS
INTRODUCTION
We must greatly extend the original concept of number as natural
number in order to create an instrument powerful enough for the needs
of practice and theory. In a long and hesitant evolution zero, negative
integers, and fractions were gradually accepted on the same footing
as the positive integers, and today the rules of operation with these
numbers are mastered by the average school child. But to gain complete freedom in algebraic operations we must go further by including
irrational and complex quantities in the number concept. Although
these extensions of the concept of natural number have been in use for
centuries and are at the basis of all modern mathematics it is only in
recent times that they have been put on a logically sound basis. In
the present chapter we shall give an account of this development.
1. THE RATIONAL NUMBERS
tiples of this unit, say between 53 and 54 pounds. When this occurs,
we take a further step by introducing new sub-units, obtained by subdividing
the original unit into a number n of equal parts. In ordinary
language, these new sub-units may have special names; for example, the
foot is divided into 12 inches, the meter into 100 centimeters, the pound
into 16 ounces, the hour into 60 minutes, the minute into 60 seconds,
etc. In the symbolism of mathematics, however, a sub-unit obtained
by dividing the original unit 1 into n equal parts is denoted by the
symbol 1/n; and if a given quantity contains exactly m of these sub-units, its measure is denoted by the symbol m/n. This symbol is
called a fraction or ratio (sometimes written m:n). The next and decisive
step was consciously taken only after centuries of groping effort:
the symbol m/n was divested of its concrete reference to the process of
measuring and the quantities measured, and instead considered as a
pure number, an entity in itself, on the same footing with the natural
numbers. When m and n are natural numbers, the symbol m/n is
called a rational number.
The use of the word number (originally meaning natural number only)
for these new symbols is justified by the fact that addition and multiplication of these symbols obey the same laws that govern the operations
with natural numbers. To show this, addition, multiplication, and equality of rational numbers must first be defined. As everyone knows,
these definitions are:
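(1) $\frac{a}{b} + \frac{c}{d} = \frac{ad + bc}{bd}, \quad \frac{a}{b} \cdot \frac{c}{d} = \frac{ac}{bd}, \quad \frac{a}{a} = 1, \quad \frac{a}{b} = \frac{c}{d} \text{ if } ad = bc.$
For example,
$\frac{2}{3} + \frac{4}{5} = \frac{2 \cdot 5 + 3 \cdot 4}{3 \cdot 5} = \frac{22}{15}, \quad \frac{2}{3} \cdot \frac{4}{5} = \frac{8}{15}, \quad \frac{3}{3} = 1, \quad \frac{8}{12} = \frac{6}{9}.$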
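The definitions of addition, multiplication, and equality for the symbols a/b can be written out literally on pairs of integers. The sketch below is illustrative: the function names `add`, `multiply`, and `equal` are assumptions, fractions are represented as unreduced (numerator, denominator) pairs, and the standard-library `Fraction` type is used only as an independent check.

```python
from fractions import Fraction

def add(a: int, b: int, c: int, d: int) -> tuple[int, int]:
    """a/b + c/d = (ad + bc)/bd, returned as an unreduced pair."""
    return a * d + b * c, b * d

def multiply(a: int, b: int, c: int, d: int) -> tuple[int, int]:
    """(a/b)(c/d) = ac/bd, returned as an unreduced pair."""
    return a * c, b * d

def equal(a: int, b: int, c: int, d: int) -> bool:
    """a/b = c/d if and only if ad = bc."""
    return a * d == b * c

print(add(2, 3, 4, 5), multiply(2, 3, 4, 5), equal(8, 12, 6, 9))
# Commutative law of addition, checked through the definition of equality.
print(equal(*add(2, 3, 4, 5), *add(4, 5, 2, 3)))
```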
(2)
$p + q = q + p, \quad p + (q + r) = (p + q) + r,$
$pq = qp, \quad p(qr) = (pq)r,$
$p(q + r) = pq + pr.$
For example, the proof of the commutative law of addition for fractions is
exhibited by the equations
$\frac{a}{b} + \frac{c}{d} = \frac{ad + bc}{bd} = \frac{cb + da}{db} = \frac{c}{d} + \frac{a}{b},$
of which the first and last equality signs correspond to the definition (1)
of addition, while the middle one is a consequence of the commutative
laws of addition and multiplication of natural numbers. The reader may
verify the other four laws in the same way.
For a real understanding of these facts it must be emphasized once
more that the rational numbers are our own creations, and that the rules
(1) are imposed at our volition. We might whimsically decree some
other rule for addition, such as [
$(-1)(-1) = 1,$
$ax = b,$
exists as an integer only if a is a factor of b. If this is not the case, as for
example when a = 2, b = 3, we simply introduce a new symbol b/a,
as new number symbols makes division possible without restriction, except for division by zero, which is excluded once for all.
Expressions like 1/0, 3/0, 0/0, etc. will be for us meaningless symbols.
For if division by 0 were permitted, we could deduce from the true equation
0·1 = 0·2 the absurd consequence 1 = 2. It is, however, sometimes useful to denote such expressions by the symbol ∞ (read, "infinity"), provided that one does not attempt to operate with the symbol ∞ as
though it were subject to the ordinary rules of calculation with numbers.
The purely arithmetical significance of the system of all rational
numbers (integers and fractions, positive and negative) is now apparent. For in this extended number domain not only do the formal associative, commutative, and distributive laws hold, but the equations
a + x = b and ax = b now have solutions, x = b - a and x = b/a, without
restriction, provided in the latter case that a ≠ 0. In other words, in
the domain of rational numbers the so-called rational operations (addition, subtraction, multiplication, and division) may be performed
without restriction and will never lead out of this domain. Such a
closed domain of numbers is called a field. We shall meet with other
examples of fields later in this chapter and in Chapter III.
Extending a domain by introducing new symbols in such a way
that the laws which hold in the original domain continue to hold in the
larger domain is one aspect of the characteristic mathematical process
of generalization. The generalization from the natural to the rational
numbers satisfies both the theoretical need for removing the restrictions
on subtraction and division, and the practical need for numbers to
express the results of measurement. It is the fact that the rational
numbers fill this two-fold need that gives them their true significance.
As we have seen, this extension of the number concept was made possible
by the creation of new numbers in the form of abstract symbols like
0, -2, and 3/4. Today, when we deal with such numbers as a matter
of course, it is hard to believe that as late as the seventeenth century
they were not generally credited with the same legitimacy as the positive integers,
and that they were used, when necessary, with a certain
amount of doubt and trepidation. The inherent human tendency to
cling to the "concrete," as exemplified by the natural numbers, was
responsible for this slowness in taking an inevitable step. Only in the
realm of the abstract can a satisfactory system of arithmetic be created.
Fig. 8. The number axis.
negative numbers to the left. To represent fractions with the denominator n, we divide each of the segments of unit length into n equal parts;
the points of subdivision then represent the fractions with denominator
n. If we do this for every integer n, then all the rational numbers will
be represented by points of the number axis. We shall call such points
rational points, and we shall use the terms "rational number" and "rational point" interchangeably.
This is achieved by the following definition: The rational number A is said to be
less than the rational number B (A < B), and B is said to be greater than
A (B > A), if B - A is positive. It then follows that, if A < B, the
points (numbers) between A and B are those which are both > A and < B.
Any such pair of distinct points, together with the points between them,
is called an interval, denoted by [A, B].
The distance of a point A from the origin, considered as positive, is called
the absolute value of A and is indicated by the symbol |A|.
$|A + B| < |A| + |B|.$
$|A + B| \le |A| + |B|,$
which is valid irrespective of the signs of A and B.
A fact of fundamental importance is expressed in the statement: The
rational points are dense on the line. By this we mean that within each
interval, no matter how small, there are rational points. We need only
take a denominator n large enough so that the interval [0, 1/n] is smaller
than the interval [A, B] in question; then at least one of the fractions
m/n must lie within the interval. Hence there is no interval on the line,
however small, which is free from rational points. It follows, moreover,
that there must be infinitely many rational points in any interval; for,
if there were only a finite number, the interval between any two adjacent
rational points would be devoid of rational points, which we have just
seen to be impossible.
(1) $b = \frac{m}{n} a.$
When an equation of the form (1) holds we say that the two segments
a and b are commensurable, since they have as a common measure the
segment a/n which goes n times into a and m times into b. The totality
of all segments commensurable with the unit segment will correspond to all the rational points
on the number axis. For all practical purposes of
measuring, the rational numbers are entirely sufficient. Even from a
theoretical viewpoint, since the set of rational points covers the line
densely, it might seem that all points on the line are rational points.
If this were true, then
would be
usually omitted
the
high-school versions
work. The theory became fully appreciated only in the late nineteenth century, after Dedekind, Cantor, and Weierstrass had constructed
a rigorous theory of irrational numbers. We shall present the theory
in the modern arithmetical way.
First we show: The diagonal of a square is incommensurable with its
side. We may suppose that the side of the given square is chosen as
the unit of length, and that the diagonal has the length x. Then, by
the Pythagorean theorem, we have
$x^2 = 1^2 + 1^2 = 2.$
$4r^2 = 2q^2$, or $2r^2 = q^2$
and hence
which
$\frac{1}{250} = \frac{4}{1000} = 0.004$; but
For example,
$\frac{1}{3} = \frac{b}{10^n}$
would imply
$10^n = 3b,$
intervals of the decimal division with any desired degree of approximation. This approximation process may be described as follows.
Suppose that P lies in the first unit interval. We subdivide this
interval into 10 equal parts, each of length $10^{-1}$, and find, say, that P
lies in the third such interval. At this stage we can say that P lies
between the decimal fractions 0.2 and 0.3. We subdivide the interval
from 0.2 to 0.3 into 10 equal parts, each of length $10^{-2}$, and find that P
lies, say, in the fourth such interval. Subdividing this in turn, we find
that P lies in the first interval of length $10^{-3}$. We can now say that P
lies between 0.230 and 0.231. This process can be continued indefinitely,
and leads to an unending sequence of digits, $a_1, a_2, a_3, \ldots, a_n, \ldots$,
with the following property: whatever number n we choose, the point P
is included in the interval $I_n$ whose left-hand end-point is the decimal
fraction $0.a_1a_2a_3 \ldots a_{n-1}a_n$ and whose right-hand end-point is
$0.a_1a_2a_3 \ldots a_{n-1}(a_n + 1)$, the length of $I_n$ being $10^{-n}$. If we choose in
succession n = 1, 2, 3, 4, ..., we see that each of these intervals,
$I_1, I_2, I_3, \ldots$, is contained in the preceding one, while their lengths,
$10^{-1}, 10^{-2}, 10^{-3}, \ldots$, tend to zero. We say that the point P is contained in a nested
sequence of decimal intervals. For example, if P is
the rational point 1/3, then all the digits $a_1, a_2, a_3, \ldots$ are equal to 3,
and P is contained in every interval $I_n$ which extends from 0.333...33
to 0.333...34; i.e., 1/3 is greater than 0.333...33 but less than
0.333...34, where the number of digits may be taken arbitrarily large.
We express this fact by saying that the n-digit decimal fraction 0.333...33
"tends to 1/3" as n increases. We write
$\frac{1}{3} = 0.333\ldots$
$\sqrt{2}$
1^2 = 1                   < 2 <   2^2 = 4
(1.4)^2 = 1.96            < 2 <   (1.5)^2 = 2.25
(1.41)^2 = 1.9881         < 2 <   (1.42)^2 = 2.0164
(1.414)^2 = 1.999396      < 2 <   (1.415)^2 = 2.002225
(1.4142)^2 = 1.99996164   < 2 <   (1.4143)^2 = 2.00024449
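The nested decimal intervals that enclose the square root of 2 can be generated with exact integer arithmetic, avoiding any rounding. The sketch below is one possible way to do this, assuming `math.isqrt` (the integer square root) is available; the function name `sqrt2_bracket` is hypothetical. For each n it returns the numerators of the n-digit decimals just below and just above the square root of 2.

```python
from math import isqrt

def sqrt2_bracket(n: int) -> tuple[int, int]:
    """Return (lo, hi) with hi = lo + 1 and (lo/10^n)^2 < 2 < (hi/10^n)^2."""
    lo = isqrt(2 * 10 ** (2 * n))   # largest integer whose square is below 2 * 10^(2n)
    return lo, lo + 1

for n in range(5):
    lo, hi = sqrt2_bracket(n)
    scale = 10 ** n
    print(f"({lo}/{scale})^2 = {lo * lo}/{scale ** 2} < 2 < ({hi}/{scale})^2 = {hi * hi}/{scale ** 2}")
```

The squares are printed as exact fractions of a power of ten, so each line can be compared digit by digit with the corresponding line of the table.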
$\sqrt{2}$ and
$s_n = \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \cdots + \frac{1}{2^n}.$
We see that $s_n$ differs from 1 by $(1/2)^n$, and that this difference becomes arbitrarily
small, or "tends to zero" as n increases indefinitely. It makes no
sense to say that the difference is zero if n is infinite. The infinite enters
only in the unending procedure and not as an actual quantity. We
describe the behavior of $s_n$ by saying that the sum $s_n$ approaches the
limit 1 as n tends to infinity, and by writing
(4) $1 = \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \cdots$
In abbreviated but expressive form we write
$s_n \to 1$ as $n \to \infty$.
(6) $q^n \to 0$ as $n \to \infty$, for $-1 < q < 1$.
(Incidentally, if q > 1 or q < -1 then $q^n$ does not tend to zero, but increases in magnitude without limit.)
To give a rigorous proof of the assertion (6) we start with the inequality proved on page 15, which states that $(1 + p)^n \ge 1 + np$ for any
positive integer n and p > -1. If q is any fixed number between 0 and
1, e.g. q = 9/10, we have q = 1/(1 + p), where p > 0. Hence
$\frac{1}{q^n} = (1 + p)^n \ge 1 + np > np,$
or
$0 < q^n < \frac{1}{p} \cdot \frac{1}{n}.$
$q^n$ is therefore included between the fixed bound 0 and the bound
(1/p)(1/n) which approaches zero as n increases, since p is fixed. This
makes it evident that $q^n \to 0$. If q is negative, we have q = -1/(1 + p)
= 1
+ q + (/ + q' + ... +
q"'.
(Sa)
q2
+ qa + qt + ' + q"+l,
and by subtJaction of (Sa) from (8) we .~ee that all terms except 1 and
qnq cancel out We obtain by this device
(1 - q)s,. = 1 - q"+t,
or, by division,
L.
1-q
1-q
s,.---+
~ qasn-
(:(I,
for -1
< q < 1.
l+q+q~+l+=
For example,
1
+ ~ + b+ ~ + ... ~ 1 ~. ~ 2,
~1-
1
1/10 = 1
66
so that 0.99999 = L
[III
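The closed form for the partial sums of the geometric series can be compared with a term-by-term summation using exact rational arithmetic. This is a brief sketch with hypothetical function names (`partial_sum`, `closed_form`); `Fraction` is used so that the two sides agree exactly and the shrinking gap to the limit 1/(1 - q) is visible without rounding.

```python
from fractions import Fraction

def partial_sum(q: Fraction, n: int) -> Fraction:
    """s_n = 1 + q + q^2 + ... + q^n, summed term by term."""
    return sum((q ** k for k in range(n + 1)), Fraction(0))

def closed_form(q: Fraction, n: int) -> Fraction:
    """s_n = (1 - q^(n+1)) / (1 - q), valid for q != 1."""
    return (1 - q ** (n + 1)) / (1 - q)

q = Fraction(1, 2)
limit = 1 / (1 - q)
for n in (1, 5, 10, 20):
    s = partial_sum(q, n)
    print(n, s, s == closed_form(q, n), limit - s)
```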
+ q1
q3
=
~ q, if j q I < 1.
1
, where a., = n/(n + 1)?
""1 l((n 1) and observt;>.
qj -
4) Prove, for I q /
<
1. that 1
1 - 2q
+ 2q + 3q' +
+ 3q'-
4q!
iq'+ . . ?
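$p = 0.3322222\ldots = \frac{33}{100} + 10^{-3} \cdot 2\,(1 + 10^{-1} + 10^{-2} + \cdots).$
The expression in parentheses is the infinite geometric series
$1 + 10^{-1} + 10^{-2} + \cdots = \frac{1}{1 - \frac{1}{10}} = \frac{10}{9}.$
Hence
$p = \frac{33}{100} + 10^{-3} \cdot 2 \cdot \frac{10}{9} = \frac{2970 + 20}{9 \cdot 10^3} = \frac{2990}{9000} = \frac{299}{900}.$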
we set O.ll!b2
decimal. Then p
p = O.a1a2
The
q =
is I/(1 -
a.,+
+ 10-h ).
68
5) Write .11212121... as a fraction. Find the value of this symbol if it is
meant in the systems with the bases 3 or 5.
has a l!
Ti
b*
hall'wn.r hctwccn
A and smaller than the smallest element of B, and hence could belong
to neither.
In the third case, where there is neither a largest rational number in
A nor a smallest rational number in B, the cut is said by Dedekind to
define or simply to be an irrational number. It is easily seen that this
definition is in agreement with the definition by nested intervals; any
sequence $I_1, I_2, I_3, \ldots$ of nested intervals defines a cut if we place in
the class A all those rational numbers which are exceeded by the left-hand end-point of at least one of the intervals $I_n$, and in B all other
rational numbers.
Philosophically, Dedekind's definition of irrational numbers involves a rather
high degree of abstraction, since it places no restrictions on the nature of the
mathematical law which defines the two classes A and B. A more concrete
method of defining the real number continuum is due to Georg Cantor (1845-1918). Although at first sight quite different from the method of nested intervals
or of cuts, it is equivalent to either of them, in the sense that the number systems
defined in these three ways have the same properties. Cantor's idea was suggested
by the facts that 1) real numbers may be regarded as infinite decimals,
and 2) infinite decimals are limits of finite decimal fractions. Freeing ourselves
from dependence on the decimal system, we may state with Cantor that any
sequence $a_1, a_2, a_3, \ldots$ of rational numbers defines a real number if it "converges."
Convergence is understood to mean that the difference $(a_m - a_n)$
between any two members of the sequence tends to zero when $a_m$ and $a_n$ are sufficiently far out in the sequence, i.e. as m and n tend to infinity. (The successive
decimal approximations to any number have this property, since any two after
the nth can differ by at most $10^{-n}$.) Since there are many ways of approaching
the same real number by a sequence of rational numbers, we say that two convergent
sequences of rationals $a_1, a_2, a_3, \ldots$ and $b_1, b_2, b_3, \ldots$ define the
same real number if $a_n - b_n$ tends to zero as n increases indefinitely. The operations
of addition, etc., for such sequences are quite easy to define.
3. REMARKS ON ANALYTIC GEOMETRY
Introducing the continuum of numbers makes it possible to associate
with each line segment a definite real number as its length. But we may
go much farther.
Fig. 12. Rectangular coordinates of a point.
Fig. 13. The four quadrants.
d~ = (Xt - X2)
'
~.~::
;:,:
/1
- .
~.
x.
Fig, 14. The dita!lce
b~twoon
two P<>inta.
(x
a) 2
+ (y
b/
= r
This is called the equation of the circle, because it expresses the complete
(necessary and sufficient) condition on the coordinates x, y of a point P
$(x - a)^2 + (y - b)^2 = r^2,$
where $r^2 = k + a^2 + b^2$. It follows that the equation (3) defines a
circle of radius r around the point C with coordinates a and b.
The equations of straight lines are even simpler in form. For example,
the x-axis has the equation y = 0, since y = 0 for all points on the
x-axis and for no other points. The y-axis has the equation x = 0.
The lines through the origin bisecting the angles between the axes have
the equations x = y and x = -y. It is easily shown that any straight
line has an equation of the form
(4) $ax + by = c,$
(6) $\frac{x^2}{p^2} - \frac{y^2}{q^2} = 1$
AA', of length 2p, is called the transverse axis of the hyperbola. The
hyperbola approaches more and more nearly the two straight lines
$qx \pm py = 0$ as we go out farther and farther from the origin, but it
never actually reaches these lines. They are called the asymptotes of
the hyperbola. The hyperbola is the locus of all points P the
difference of whose distances to the two points $F(\sqrt{p^2 + q^2}, 0)$ and
$F'(-\sqrt{p^2 + q^2}, 0)$ is 2p. These points are again called the foci of
the hyperbola.
Fig. 17. The hyperbola.
The equation
(7) $xy = 1$
also defines a hyperbola, whose asymptotes now are the two axes (Fig. 18).
The equation of this "equilateral" hyperbola indicates that the area
of the rectangle determined by P is equal to 1 for every point P on the
curve. An equilateral hyperbola whose equation is
(7a) $xy = c,$
c being a constant, is only a special case of the general hyperbola, just as
the circle is a special case of the ellipse. The special character of the
equilateral hyperbola lies in the fact that its two asymptotes (in this
case the two coordinate axes) are perpendicular to each other.
For us the main point here is
(8) $ax + by = c,$
$a'x + b'y = c'.$
The point common to the two lines is then found simply by determining
its coOrdinates as the solution x, y of the two simultaneous equations
(8). Similarly, the points of intersection of any two curves, such as
the circle x 2 + y 2 - 2ax - 2by = k and the straight line ax + by = c,
are found by solving the two corresponding equations simultaneously.
Fig. 18. The equilateral hyperbola xy = 1. The area xy of the rectangle determined by the point (x, y) is equal to 1.
are usuallY
members of classes or aggregate;
FUNDAMENTAL CONCEPTS
! ! : ! !
246SIQ.;2n
! !
is biunique.
described.
4 .. n ..
t t
Every rational number can be written in the form a/b, where a and b
are integers, and all these numbers can be put in an array, with a/b in
the ath column and bth row. For example, 3/4 is found in the third
column and fourth row of the table below. All the positive rational
numbers may now be arranged, by following a zigzag path through this array and omitting every fraction that is not in lowest terms, in a sequence
1, 2, 1/2, 1/3, 3, 4, 3/2, 2/3, 1/4, 1/5, 5, ... which contains each positive
rational number once and only once. This shows that the set of all
positive rational numbers is denumerable. In view of the fact that the
rational numbers correspond in a biunique manner with the rational
points on a line, we have proved at the same time that the set of positive rational points on a line is denumerable.
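The enumeration can be sketched in a few lines of Python. The traversal order used here (walk the array diagonal by diagonal, alternating direction, and skip fractions not in lowest terms) is an assumption inferred from the sequence 1, 2, 1/2, 1/3, 3, 4, ... quoted above; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import islice
from math import gcd

def positive_rationals():
    """Yield every positive rational exactly once.

    a/b sits in column a and row b; all entries with a + b = s form one
    diagonal.  Odd diagonals are walked downwards (a decreasing), even
    ones upwards (a increasing), which gives one continuous zigzag path.
    """
    s = 2
    while True:
        numerators = range(1, s) if s % 2 == 0 else range(s - 1, 0, -1)
        for a in numerators:
            b = s - a
            if gcd(a, b) == 1:          # keep only fractions in lowest terms
                yield Fraction(a, b)
        s += 1

first_terms = list(islice(positive_rationals(), 11))
```

Because each diagonal is finite, every fraction a/b is reached after finitely many steps, which is exactly what denumerability requires.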
2) Show that the set S + T (see p. 110) is denumerable if S and T are denumerable
sets. Show the same for the sum of three, four, or any number, n, of sets,
and finally for a set composed of denumerably many denumerable sets.
where the N's denote the integral parts and the small letters denote the
digits after the decimal point. We assume that this sequence of decimal
fractions contains all the real numbers. The essential point in the proof
is now to construct by a "diagonal process" a new number which we can
show to be not included in this sequence. To do this we first choose a
digit a which differs from a₁ and is neither 0 nor 9 (to avoid possible
ambiguities which may arise from equalities like 0.999... = 1.000...),
then a digit b different from b₂ and again unequal to 0 or 9, similarly c
different from c₃, and so on. (For example, we might simply choose
a = 1 unless a₁ = 1, in which case we choose a = 2, and similarly down
the table for all the digits b, c, d, e, ....) Then consider the infinite decimal
z = 0.abcde ...
This new number z is certainly different from any one of the numbers
in the table above; it cannot be equal to the first because it differs
from it in the first digit after the decimal point; it cannot be equal to the
second since it differs from it in the second digit; and, in general, it
cannot be identical with the nth number in the table since it differs
from it in the nth digit. This shows that our table of consecutively
arranged decimals does not contain all the real numbers. Hence this set
is not denumerable.
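The diagonal rule can be tried on a finite table. The sketch below applies the choice suggested in the text (take 1 unless the diagonal digit is 1, in which case take 2) to a made-up table of digit strings; the table entries and the function name are illustrative assumptions, and a finite table can of course only illustrate the argument, not prove it.

```python
def diagonal_number(rows):
    """rows[i] holds the digits after the decimal point of the (i+1)-th number.

    The i-th digit of the result differs from the i-th digit of rows[i],
    and is never 0 or 9.
    """
    digits = []
    for i, row in enumerate(rows):
        digits.append("2" if row[i] == "1" else "1")
    return "0." + "".join(digits)

# A hypothetical table of seven decimals, seven digits each.
table = ["1415926", "7182818", "3333333", "5000000",
         "1234567", "9999999", "0101010"]
z = diagonal_number(table)
```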
The reader may perhaps imagine that the reason for the non-denumerability of the number continuum lies in the fact that the straight
line is infinite in extent, and that a finite segment of the line would
contain only a denumerable infinity of points. This is not the case, for
Fig. 20. Biunique correspondence between the points of a bent segment and a whole straight line.
Fig. 21. Biunique correspondence between the points of two segments of different length.
suppose that the set of all points on the line between 0 and 1 can be arranged in a sequence
(1)
serirH
1/10
integers.
We have seen that these sets are
suspect that all infinite sets are equivalent
than that between finite numbers and infinity could not be made, but
Cantor's result disproves this; there is a set, the real number continuum,
which is not equivalent to any denumerable set.
Thus there are at least two different types of "infinity," the denumerable infinity of the integers and the non-denumerable infinity of the
continuum. If two sets A and B, finite or infinite, are equivalent, we
shall say that they have the same cardinal number. This reduces to the
ordinary notion of same natural number if A and B are finite, and may
be regarded as a valid generalization of this concept. Moreover, if a
set A is equivalent with some subset of B, while B is not equivalent to
A or to any of its subsets, we shall say, following Cantor, that the set
B has a greater cardinal number than the set A. This use of the word
"number" also agrees with the ordinary notion of greater number for
finite sets. The set of integers is a subset of the set of real numbers,
while the set of real numbers is neither equivalent to the set of integers
nor to any subset of it (i.e. the set of real numbers is neither denumerable
nor finite). Hence, according to our definition, the continuum of real
numbers has a greater cardinal number than the set of integers.
As a matter of fact, Cantor actually showed how to construct a whole sequence
of infinite sets with greater and greater cardinal numbers. Since we may start
with the set of positive integers, it clearly suffices to show that given any set A
it is possible to construct another set B with a greater cardinal number. Because
of the great generality of this theorem, the proof is necessarily somewhat abstract.
We define the set B to be the set whose elements are all the different subsets of
the set A. By the word "subset" we shall include not only the proper subsets
of A but also the set A itself, and the empty "subset" 0, containing no elements
at all. (Thus, if A consists of the three integers 1, 2, 3, then B contains the 8
different elements {1, 2, 3}, {1, 2}, {1, 3}, {2, 3}, {1}, {2}, {3}, and 0.) Each
element of the set B is itself a set, consisting of certain elements of A. Now
suppose that B is equivalent to A or to some subset of it, i.e. that there is some
rule which correlates in a biunique manner the elements of A or of a subset of
A with all the elements of B, i.e. with the subsets of A:
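For a finite set the construction of B can be carried out explicitly. The sketch below builds all subsets of A = {1, 2, 3} and then, for one arbitrarily chosen correspondence between elements and subsets, forms the set of elements that do not belong to their own partner. That this set is never among the partners is the usual diagonal step of Cantor's argument; it is stated here as an assumption about how the proof continues, and the names `power_set`, `missed_subset` and the sample correspondence `f` are hypothetical.

```python
from itertools import combinations

def power_set(a):
    """All subsets of a, including the empty set and a itself."""
    items = sorted(a)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def missed_subset(correspondence):
    """Given a map element -> subset, return a subset that is not a value.

    The result differs from the partner of x in whether it contains x.
    """
    return frozenset(x for x, s in correspondence.items() if x not in s)

B = power_set({1, 2, 3})
f = {1: frozenset({1, 2}), 2: frozenset(), 3: frozenset({3})}
T = missed_subset(f)
```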
a......._ lal
y =
O.btb,b~l;,
86
(II]
not
~: ,~::u':::.::;~'\'::;,';: ::;,:j";~t~the
ticians haw
from mathematiC's
INDIRECT METHOD OF PROOF
program were desirable, it would at present involve tremendous complication and even the partial destruction of the body of living mathematics. For this reason it is no wonder that the school of "intuitionism," which has adopted this program, has met with strong resistance,
and that even the most thoroughgoing intuitionists cannot always live
up to their convictions.
§5. COMPLEX NUMBERS
1. The Origin of Complex Numbers
For many reasons the concept of number has had to be extended even
beyond the real number continuum by the introduction of the so-called
complex numbers. One must realize that in the historical and psychological
development of mathematics, all these extensions and new inventions
were by no means the product of some one individual's efforts.
They appear rather as the outcome of a gradual and hesitant evolution
for which no single person can receive major credit. It was the need
for more freedom in formal calculations that brought about the use of
negative and rational numbers. Only at the end of the middle ages
(2)    x² = −1
has no real solution, since the square of any real number is never
negative.
We must either be content with the statement that this simple equation is not solvable, or follow the familiar path of extending our concept
of number by introducing numbers that will make the equation
solvable. This is exactly what is done when we introduce the new
symbol i by defining i² = −1. Of course this object i, the "imaginary
unit," has nothing to do with the concept of a number as a means of
counting. It is purely a symbol, subject to the fundamental rule i² = −1,
and its value will depend entirely on whether by this introduction a
really useful and workable extension of the number system can be
effected.
Since we wish to add and multiply with the symbol i as with an ordinary real number,
we should be able to form symbols like 2i, 3i, −i,
2 + 5i, or more generally, a + bi, where a and b are any two real numbers.
If these symbols are to obey the familiar commutative, associative,
and distributive laws of addition and multiplication, then, for
example,
(2 + 3i) + (1 + 4i) = (2 + 1) + (3 + 4)i = 3 + 7i,
(2 + 3i)(1 + 4i) = 2 + 8i + 3i + 12i² = (2 − 12) + (8 + 3)i = −10 + 11i.
a.;s
though
(3)
In particular, we have
(4)    (a + bi)(a − bi) = a² − abi + abi − b²i² = a² + b².
(a + bi) − (c + di) = (a − c) + (b − d)i,
(5)
(The second
c² + d² = 0.
For example,
(2 + 3i)/(1 + 4i) = (2 + 3i)(1 − 4i)/[(1 + 4i)(1 − 4i)] = (2 − 8i + 3i + 12)/(1 + 16) = 14/17 − (5/17)i.
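The rules for multiplication and division can be checked mechanically. The following sketch represents a + bi as the pair (a, b) of integers and applies nothing but the rule i² = −1; the function names are illustrative and the exact-fraction output is a choice made for this example.

```python
from fractions import Fraction

def c_mul(z, w):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def c_div(z, w):
    """(a + bi)/(c + di), obtained by multiplying top and bottom by c - di."""
    (a, b), (c, d) = z, w
    n = c * c + d * d                   # (c + di)(c - di) = c^2 + d^2
    if n == 0:
        raise ZeroDivisionError("division by 0 + 0i")
    return (Fraction(a * c + b * d, n), Fraction(b * c - a * d, n))

product = c_mul((2, 3), (1, 4))
quotient = c_div((2, 3), (1, 4))
```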
i)g ~ ~~(
+ i) in the form
a+ bi.
2) Express
in the form a + bi
3) Express in the form
a + bi:
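x² = −1
has the two solutions x = i and x = −i, since i·i = (−i)(−i) = i² = −1.
We can easily verify that now every quadratic equation, which we may
write in the form
(6)    ax² + bx + c = 0,
has a solution. For from (6) we have
x² + (b/a)x = −c/a,
x² + (b/a)x + b²/4a² = b²/4a² − c/a,
(x + b/2a)² = (b² − 4ac)/4a²,
x + b/2a = ±√(b² − 4ac)/2a,
(7)    x = (−b ± √(b² − 4ac))/2a.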
Now if b² − 4ac ≥ 0, then √(b² − 4ac) is an ordinary real number, and
the solutions (7) are real, while if b² − 4ac < 0, then 4ac − b² > 0
and √(b² − 4ac) = √(−(4ac − b²)) = √(4ac − b²)·i, so that the solutions
(7) are complex numbers. For example, the solutions of the equation
x² − 5x + 6 = 0
are x = (5 ± √(25 − 24))/2 = (5 ± 1)/2 = 2 or 3, while the solutions
of the equation
x² − 2x + 2 = 0
are x = (2 ± √(4 − 8))/2 = (2 ± 2i)/2 = 1 + i or 1 − i.
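Formula (7) translates directly into code once square roots of negative numbers are allowed. This is a small sketch using Python's standard `cmath` module; the function name is an assumption of the example.

```python
import cmath

def solve_quadratic(a, b, c):
    """Both solutions of ax^2 + bx + c = 0, real or complex, by formula (7)."""
    if a == 0:
        raise ValueError("not a quadratic equation")
    root = cmath.sqrt(b * b - 4 * a * c)     # works for a negative discriminant too
    return ((-b + root) / (2 * a), (-b - root) / (2 * a))

real_roots = solve_quadratic(1, -5, 6)       # x^2 - 5x + 6 = 0
complex_roots = solve_quadratic(1, -2, 2)    # x^2 - 2x + 2 = 0
```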
GEOMETRICAL INTERPRETATION
z = x + 0i, while the points on the y-axis correspond to the pure
imaginary numbers z = 0 + yi.
If z = x + yi is any complex number, its conjugate is z̄ = x − yi (Fig. 22). If we denote
the distance of the point z from the origin by ρ, then by the Pythagorean
theorem
ρ² = x² + y² = (x + yi)(x − yi) = z·z̄.
The real number ρ is called the modulus of z, and is written |z|.
If z lies on the real axis, its modulus is its ordinary absolute value. The
numbers with modulus 1 lie on the "unit circle" with center at the origin
and radius 1.
If |z| = 0 then z = 0. This follows from the definition of |z| as the distance of z
from the origin. Moreover the modulus of the product
of two complex numbers is equal to the product of their moduli:
|z₁·z₂| = |z₁|·|z₂|.
This will follow from a more general theorem to be proved on page 95.
2. From the fact that the product of two real numbers is 0 only if one of the
factors is 0, prove the corresponding theorem for complex numbers. (Hint: Use
the two theorems just stated.)
z₁ + z₂ = (x₁ + x₂) + (y₁ + y₂)i
Fig. 23. Parallelogram law of addition of complex numbers.
points 0, z₁, z₂.
|z₁ + z₂| ≤ |z₁| + |z₂|.
This follows from the fact that the length of any side of a triangle
cannot exceed the sum of the lengths of the other two sides.
Exercise: When does the equality |z₁ + z₂| = |z₁| + |z₂| hold?
The angle between the positive direction of the x-axis and the line
Oz is called the angle of z, and is denoted by φ (Fig. 22). The modulus
of z̄ is the same as the modulus of z,
|z̄| = |z|,
but the angle of z̄ is the negative of the angle of z, namely −φ.
(8)    z = x + yi = ρ(cos φ + i sin φ);
x = ρ cos φ,    y = ρ sin φ.
E.g. for z = i, ρ = 1, φ = 90°, so that
i = 1(cos 90° + i sin 90°);
for z = 1 + i, ρ = √2, φ = 45°, so that
1 + i = √2 (cos 45° + i sin 45°);
for z = 1 − i, ρ = √2, φ = −45°, so that
1 − i = √2 [cos (−45°) + i sin (−45°)];
for z = −1 + √3 i, ρ = 2, φ = 120°, so that
−1 + √3 i = 2 (cos 120° + i sin 120°).
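The four examples can be reproduced numerically. The sketch below relies on `cmath.polar` from the Python standard library, which returns the modulus and the angle in radians; converting to degrees and the helper name `polar_degrees` are choices made for this illustration.

```python
import cmath
import math

def polar_degrees(z):
    """Return (modulus, angle in degrees) of the complex number z."""
    rho, phi = cmath.polar(z)
    return rho, math.degrees(phi)

examples = [1j, 1 + 1j, 1 - 1j, complex(-1, math.sqrt(3))]
polar_forms = [polar_degrees(z) for z in examples]
```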
If
z = ρ(cos φ + i sin φ),
z′ = ρ′(cos φ′ + i sin φ′),
then
zz′ = ρρ′{(cos φ cos φ′ − sin φ sin φ′) + i(cos φ sin φ′ + sin φ cos φ′)}.
Now, by the fundamental addition theorems for the sine and cosine,
cos φ cos φ′ − sin φ sin φ′ = cos (φ + φ′),
cos φ sin φ′ + sin φ cos φ′ = sin (φ + φ′).
Hence
(9)    zz′ = ρρ′{cos (φ + φ′) + i sin (φ + φ′)}.
This is the trigonometrical form of the complex number with modulus
ρρ′ and angle φ + φ′. In other words, to multiply two complex numbers,
we multiply their moduli and add their angles (Fig. 24). Thus we
Fig. 24. Multiplication of two complex numbers; the angles are added and the moduli multiplied.
z² = ρ²(cos 2φ + i sin 2φ),
z³ = ρ³(cos 3φ + i sin 3φ),
zⁿ = ρⁿ(cos nφ + i sin nφ).
(11)   (cos φ + i sin φ)ⁿ = cos nφ + i sin nφ.
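Both the multiplication rule (multiply the moduli, add the angles) and formula (11) lend themselves to a quick numerical check. The sketch below is only a floating-point sanity test for a few sample values, not a proof; the function names and the sample angle 0.7 radians are arbitrary assumptions.

```python
import cmath
import math

def multiply_polar(rho1, phi1, rho2, phi2):
    """Product of two numbers given in polar form: moduli multiply, angles add."""
    return cmath.rect(rho1 * rho2, phi1 + phi2)

def de_moivre_gap(phi, n):
    """|(cos phi + i sin phi)^n - (cos n*phi + i sin n*phi)|, ideally 0."""
    lhs = complex(math.cos(phi), math.sin(phi)) ** n
    rhs = complex(math.cos(n * phi), math.sin(n * phi))
    return abs(lhs - rhs)

gaps = [de_moivre_gap(0.7, n) for n in range(1, 8)]
direct = cmath.rect(2.0, 0.3) * cmath.rect(3.0, 0.5)
via_rule = multiply_polar(2.0, 0.3, 3.0, 0.5)

# Real part of the case n = 3: cos 3phi = 4 cos^3 phi - 3 cos phi.
triple_gap = abs(math.cos(3 * 0.7) - (4 * math.cos(0.7) ** 3 - 3 * math.cos(0.7)))
```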
(u + v)³ = u³ + 3u²v + 3uv² + v³,
cos 3φ + i sin 3φ = cos³ φ − 3 cos φ sin² φ + i(3 cos² φ sin φ − sin³ φ).
we have finally
cos 3φ = cos³ φ − 3 cos φ (1 − cos² φ) = 4 cos³ φ − 3 cos φ,
sin 3φ = −4 sin³ φ + 3 sin φ.
Similar formulas, expressing sin nφ and cos nφ in terms of powers of
sin φ and cos φ respectively, can easily be obtained for any value of n.
Exercises: 1) Find the corresponding formulas for sin 4φ and cos 4φ.
2) Prove that for a point, z = cos φ + i sin φ, on the unit circle, 1/z = cos φ − i sin φ.
3) Prove without calculation that (a + bi)/(a − bi) always has the absolute
value 1.
4) If z₁ and z₂ are two complex numbers prove that the angle of z₁ − z₂ is equal
to the angle between the real axis and the vector pointing from z₂ to z₁.
5) Interpret the angle of the complex number (z₁ − z₂)/(z₁ − z₃) in the triangle
formed by the points z₁, z₂, and z₃.
6) Prove that the quotient of two complex numbers with the same angle is real.
7) Prove that if for four complex numbers z₁, z₂, z₃, z₄ the angles of
sml
z-= ~:
:l.r('
iarea.l.
Fig. 25. The twelve twelfth roots of 1.
The nth roots of 1 are represented by the vertices of the regular n-sided polygon inscribed in the unit circle and having the point z = 1 as
one of its vertices. This is almost immediately clear from Figure 25
(drawn for the case n = 12). The first vertex of the polygon is 1. The
next is
(12)   α = cos (360°/n) + i sin (360°/n),
since its angle must be the nth part of the total angle 360°. The next
vertex is α·α = α², since we obtain it by rotating the vector α through
the angle 360°/n.
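By formula (11), α is an nth root of 1:
[cos (360°/n) + i sin (360°/n)]ⁿ = cos 360° + i sin 360° = 1 + 0i.
The same is true for the next vertex,
α² = cos (720°/n) + i sin (720°/n).
We can see this by writing
(α²)ⁿ = α²ⁿ = (αⁿ)² = (1)² = 1,
or, from formula (11),
[cos (720°/n) + i sin (720°/n)]ⁿ = cos 720° + i sin 720° = 1 + 0i = 1.
In the same way we see that all the n numbers
1, α, α², α³, ..., αⁿ⁻¹
are nth roots of 1.
(13)   xⁿ − 1 = 0
xⁿ − 1 = (x − 1)(xⁿ⁻¹ + xⁿ⁻² + xⁿ⁻³ + ... + 1).
(15)   xⁿ⁻¹ + xⁿ⁻² + xⁿ⁻³ + ... + x + 1 = 0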
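A short numerical sketch can confirm that the powers of α = cos (360°/n) + i sin (360°/n) are nth roots of 1, and that every one of them except 1 itself makes the sum xⁿ⁻¹ + xⁿ⁻² + ... + x + 1 vanish. The function names are assumptions of this example, and the comparison is made only up to floating-point rounding.

```python
import cmath
import math

def roots_of_unity(n):
    """The n numbers 1, alpha, alpha^2, ..., alpha^(n-1)."""
    alpha = cmath.rect(1.0, 2 * math.pi / n)   # angle 360 degrees / n, in radians
    return [alpha ** k for k in range(n)]

def cyclotomic_value(x, n):
    """x^(n-1) + x^(n-2) + ... + x + 1."""
    return sum(x ** k for k in range(n))

roots = roots_of_unity(12)                     # the case drawn in Figure 25
```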
equation.
For example, the complex cube roots of 1,
α = cos 120° + i sin 120° = ½(−1 + i√3),
α² = cos 240° + i sin 240° = ½(−1 − i√3),
are roots of the equation
x² + x + 1 = 0,
as the reader will readily see by direct substitution. Likewise the fifth
roots of 1, other than 1 itself, satisfy the equation
(16)   x⁴ + x³ + x² + x + 1 = 0.
Dividing by x², we obtain
x² + 1/x² + x + 1/x + 1 = 0,
or, since (x + 1/x)² = x² + 1/x² + 2, by writing w = x + 1/x,
w² + w − 1 = 0.
The roots of this equation are w = ½(−1 ± √5).
Hence the complex fifth roots of 1 are the roots of the two quadratic
equations
x² + ½(√5 + 1)x + 1 = 0,
x² − ½(√5 − 1)x + 1 = 0,
<I
2) Find (I
+ i)ll
.yt{,
..Y=i.
101
or Algebra
but far more is
real or complex
+a1x+au=O,
ha8 solutions in
of the 3rd and 4th degrees was established in the sixteenth
century by Tartaglia, Cardan, and others, who solved such equations by formulas essentially similar to that for the quadratic equation, although much more
complicated. For almost two hundred years the general equations of
5th and higher degree were intensively studied, but all efforts to solve
them by similar methods failed. It was a great achievement when the
young Gauss in his doctoral thesis (1799) succeeded in giving the first
complete proof that solutions exist, although the question of generalizing
the classical formulas, which express the solutions of equations of degree
less than 5 in terms of the rational operations
plus root extraction, remained unanswered at the time. (See p. 118.)
~:~.:~:~:,:~::~~i:,~;::~~,;(~l.~7):,
/(a)~
uhere a1,
f(x) ~ U.
(x -
1)(x - i)(x
+ i)(x + 1).
That the α's are roots of the equation f(x) = 0 is evident from the
factorization (19), since for x = αᵢ one factor of f(x), and hence f(x)
itself, is equal to zero.
In some cases the factors (x − α₁), (x − α₂), ... of a polynomial
f(x) of degree n will not all be distinct, as in the example
f(x) = x² − 2x + 1 = (x − 1)(x − 1),
which has but one root, x = 1, "counted twice" or "of multiplicity 2."
In any case, a polynomial of degree n can have no more than n distinct
factors (x − α) and the corresponding equation n roots.
To prove the factorization theorem we again make use of the algebraic identity
(20)   xᵏ − αᵏ = (x − α)(xᵏ⁻¹ + αxᵏ⁻² + α²xᵏ⁻³ + ... + αᵏ⁻²x + αᵏ⁻¹),
which for α = 1 is merely the formula for the geometrical series. Since
we are assuming the truth of Gauss's theorem, we may suppose that
α = α₁ is a root of equation (17), so that
f(α₁) = α₁ⁿ + aₙ₋₁α₁ⁿ⁻¹ + ... + a₁α₁ + a₀ = 0.
Subtracting this from f(x) and applying (20) to each difference xᵏ − α₁ᵏ, we obtain the identity
f(x) = (xⁿ − α₁ⁿ) + aₙ₋₁(xⁿ⁻¹ − α₁ⁿ⁻¹) + ... + a₁(x − α₁) = (x − α₁)g(x),
where g(x) = xⁿ⁻¹ + bₙ₋₂xⁿ⁻² + ... + b₁x + b₀ is a polynomial of degree n − 1.
Repeating the argument for g(x), and so on, we arrive at
(22)   f(x) = (x − α₁)(x − α₂)(x − α₃) ... (x − αₙ).
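Splitting off a factor (x − α) is a mechanical process. The sketch below uses Horner's scheme (synthetic division), a standard technique named here as a swapped-in method rather than the book's own procedure; it returns the quotient g(x) and the remainder, and the remainder equals f(α), so it is 0 exactly when α is a root. Coefficients are listed from the highest power down, and the function name is hypothetical.

```python
def divide_by_linear(coeffs, alpha):
    """Divide a polynomial by (x - alpha).

    coeffs: coefficients from the highest power to the constant term.
    Returns (quotient coefficients, remainder); the remainder is f(alpha).
    """
    acc = []
    carry = 0
    for c in coeffs:
        carry = carry * alpha + c       # Horner step
        acc.append(carry)
    return acc[:-1], acc[-1]

q1, r1 = divide_by_linear([1, -2, 1], 1)          # x^2 - 2x + 1 by (x - 1)
q2, r2 = divide_by_linear([1, 0, 0, 0, -1], 1j)   # x^4 - 1 by (x - i)
```

In the first case the quotient is x − 1 again, which is the "root counted twice" of the example above.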
From (22) it follows not only that the complex numbers α₁, α₂, ..., αₙ
are roots of the equation (17), but also that they are the only roots. For
if y were a root of equation (17), then by (22)
f(y) = (y − α₁)(y − α₂) ... (y − αₙ) = 0.
VZ
is an algebraic number,
2 = 0.
numbers whose
can arrange
the algebraic numbers
a sequence by
those of height 1, then taking those of height 2, and so on.
This proof that the set of algebraic numbers is denumerable assures
the existence of real numbers which are not algebraic; such numbers are
called transcendental, for, as Euler said, they "transcend the power of
algebraic methods."
Cantor's proof of the existence of transcendental numbers can hardly
be called constructive. Theoretically, one could construct a transcendental number by applying Cantor's diagonal process to a denumerated
table of decimal expressions for the roots of algebraic equations, but this
procedure would be quite impractical and would not lead to any number
whose expression in the decimal or any other system could actually be
written down. Moreover, the most interesting problems concerning
transcendental numbers lie in proving that certain definite numbers
such as π and e (these numbers will be defined on pages 297 and 299)
are actually transcendental.
(a."' 0),
LIOUVILLE'S THEOREM
P1 p,
q; q; ...
of rational numbers with larger and larger denominators such that
E:-z.
q,
+ ...
+ a,.-tJlO-(m+l)T + .
O.a1a,000a300000000000000000a40000000
where the a; are arbitrary digits from 1 to 9 (we could, for example,
choose all the a, equal to 1). Such a number is characterized by
rapidly inereasing stretches of O's, interrupted by single non-zero digits.
Let us denote by z,. the finite decimal fraction formed by taking only
the terms of z up to and including a.,..J0-"'
Then
(4)
1 z ~ z, 1 < 10. w-<m+l)!.
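The estimate (4) can be observed with exact fractions. Since z itself has infinitely many digits, the sketch below uses the partial sum up to the term with exponent 5! = 120 as a stand-in for z; that substitution, the choice of all digits equal to 1, and the function name are assumptions of this illustration.

```python
from fractions import Fraction
from math import factorial

def liouville_partial(m, digit=1):
    """z_m = digit * (10^(-1!) + 10^(-2!) + ... + 10^(-m!)) as an exact fraction."""
    return sum(Fraction(digit, 10 ** factorial(i)) for i in range(1, m + 1))

reference = liouville_partial(5)        # stand-in for z, accurate to about 10^(-720)
gaps = {m: reference - liouville_partial(m) for m in range(1, 4)}
bounds = {m: Fraction(10, 10 ** factorial(m + 1)) for m in range(1, 4)}
```

The gap shrinks like 10^(−(m+1)!) while the denominator of zₘ is only 10^(m!), which is the unusually rapid rational approximation that Liouville's theorem rules out for algebraic numbers.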
1
= z,
<
10
]Q(m+lf! =
NUMBER SYSTEM OF MATHEMATICS
Hence z
is transcendental.
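It remains to prove Liouville's theorem. Suppose z is an algebraic
number of degree n > 1 which satisfies (1), so that
(5)    f(z) = 0.
Let zₘ = pₘ/qₘ be a sequence of rational numbers tending to z.
Then
f(zₘ) = f(zₘ) − f(z) = a₁(zₘ − z) + a₂(zₘ² − z²) + ... + aₙ(zₘⁿ − zⁿ).
Dividing both sides of this equation by zₘ − z, and using
the algebraic formula
(uⁿ − vⁿ)/(u − v) = uⁿ⁻¹ + uⁿ⁻²v + ... + uvⁿ⁻² + vⁿ⁻¹,
we obtain
(6)    f(zₘ)/(zₘ − z) = a₁ + a₂(zₘ + z) + a₃(zₘ² + zₘz + z²) + ... + aₙ(zₘⁿ⁻¹ + ... + zⁿ⁻¹).
Since zₘ tends to z as a limit, it will differ from z by less than 1 for sufficiently
large m. We can therefore write the following rough estimate
for sufficiently large m: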
( )
7
I~('::);
I< I
a,
I+ 21
a,
Zm
= 11,[,
If now we
then
(8)
Then
lf(z,) I= i ~_-~tq"-~~~~!(!.
0.
2^√2
is a transcendental, or even that it is an irrational number. For almost
three decades there was not the slightest suggestion of a promising line
of attack on this problem. Finally Siegel and, independently, the young
Russian, A. Gelfond, discovered new methods for proving the transcendental character of many numbers significant in mathematics, including
the Hilbert number 2^√2 and, more generally, any number a^b where
a is an algebraic number ≠ 0 or 1 and b is any irrational algebraic
number.
SUPPLEMENT TO CHAPTER II
in mathematics.
GENERAL THEORY
B ⊃ A.
For example, the set A of all integers that are multiples of 10 is a subset of the set B of all integers that are multiples of 5, since every
multiple of 10 is also a multiple of 5. The statement A ⊂ B does not
exclude the possibility that B ⊂ A. If both relations hold, we say that
the sets A and B are equal, and write
A = B.
For this to be true every element of A must be an element of B, and
conversely, so that the sets A and B contain exactly the same elements.
The relation A ⊂ B has many similarities with the order relation
a ≤ b between real numbers. In particular, it is true that
1)  A ⊂ A.
2)  If A ⊂ B and B ⊂ A, then A = B.
3)  If A ⊂ B and B ⊂ C, then A ⊂ C.
A = {1, 2, 3},
and B the set consisting of the integers 2, 3, 4,
B = {2, 3, 4},
A ⊂ B it follows that
4)
5)  A ⊂ I,
and since the empty set contains no objects at all, this is impossible no
matter what the set A.
We shall now define two operations on sets which have many of the
algebraic properties of ordinary addition and multiplication of numbers,
though they are conceptually quite distinct from those operations. To
this end, let A and B be any two sets. By the "union" or "logical
sum" of A and B we mean the set which consists of all the elements
which are in either A or B (including any that may be in both). This
set we denote by the symbol A + B. By the "intersection" or "logical
product" of A and B we mean the set consisting only of those elements
which are in both A and B. This set we denote by the symbol A·B or
simply AB. To illustrate these operations, we may again choose as
A and B the sets
A = {1, 2, 3},    B = {2, 3, 4}.
Then
A + B = {1, 2, 3, 4},    AB = {2, 3}.
6)  A + B = B + A                    7)  AB = BA
8)  A + (B + C) = (A + B) + C        9)  A(BC) = (AB)C
10) A + A = A                        11) AA = A
12) A(B + C) = (AB + AC)             13) A + (BC) = (A + B)(A + C)
14) A + 0 = A                        15) AI = A
16) A + I = I                        17) A0 = 0
Fig. 26. Union and intersection of sets.
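For a small universal set the laws can be verified exhaustively. The sketch below checks the laws numbered 6 to 17 (numbering as far as it can be read from the list above, which is an assumption) for every triple of subsets of I = {1, 2, 3}, using Python's `|` for the union A + B and `&` for the intersection AB. Such a check on one finite universe illustrates the laws; it does not replace the general verification.

```python
from itertools import combinations, product

I = frozenset({1, 2, 3})      # universal set (a small stand-in)
O = frozenset()               # empty set
SUBSETS = [frozenset(c)
           for r in range(len(I) + 1)
           for c in combinations(sorted(I), r)]

def laws_6_to_17_hold(A, B, C):
    return all((
        A | B == B | A,                      # 6
        A & B == B & A,                      # 7
        A | (B | C) == (A | B) | C,          # 8
        A & (B & C) == (A & B) & C,          # 9
        A | A == A,                          # 10
        A & A == A,                          # 11
        A & (B | C) == (A & B) | (A & C),    # 12
        A | (B & C) == (A | B) & (A | C),    # 13
        A | O == A,                          # 14
        A & I == A,                          # 15
        A | I == I,                          # 16
        A & O == O,                          # 17
    ))

all_hold = all(laws_6_to_17_hold(A, B, C)
               for A, B, C in product(SUBSETS, repeat=3))
example_union = frozenset({1, 2, 3}) | frozenset({2, 3, 4})
example_intersection = frozenset({1, 2, 3}) & frozenset({2, 3, 4})
```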
The reader will have observed that the laws 6, 7, 8, 9, and 12 are
identical with the familiar commutative, associative, and distributive
laws of algebra. It follows that all rules of the ordinary algebra of
numbers which are consequences of the commutative, associative, and
distributive laws are also valid in the algebra of sets. The laws 10, 11,
and 13, however, have no numerical analogs, and give the algebra of
sets a simpler structure than the algebra of numbers. For example,
the binomial theorem of ordinary algebra is replaced in the algebra of
sets by the equality
(A + B)ⁿ = (A + B)(A + B) ... (A + B) = A + B
19) A + A′ = I                       20) AA′ = 0
21) 0′ = I                           22) I′ = 0
23) A″ = A
24) The relation A ⊂ B is equivalent to the relation B′ ⊂ A′.
25) (A + B)′ = A′B′                  26) (AB)′ = A′ + B′.
The laws 1 to 26 remain valid if the symbols ⊂ and ⊃, 0 and I, + and ·
are everywhere interchanged (insofar as they appear).
For example, the law 6 becomes 7, 12 becomes 13, 17 becomes 16, etc.
It follows that to any theorem which can be proved on the basis of the laws
1 to 26 there corresponds another, "dual," theorem, obtained by making
the interchanges above. For, since the proof of any theorem will consist
of the successive application at each step of certain of the laws 1 to 26,
the application at each step of the dual law will provide a proof of
the dual theorem. (For a similar duality in geometry, see Chapt. IV.)
2. Application to Mathematical Logic
The verification of the laws of the algebra of sets rested on the analysis
of the logical meaning of the relation A ⊂ B and the operations A + B,
AB, and A′. We can now reverse this process and use the laws 1 to 26
as the basis for an "algebra of logic." More precisely, that part of
logic which concerns sets, or what is equivalent, properties or attributes
of objects, may be reduced to a formal algebraic system based on the
laws 1 to 26. The logical "universe of discourse" defines the set I;
each property or attribute of objects defines
the set A consisting of all objects in I which possess this attribute. The
rules for translating the usual logical terminology into the
language of sets may be illustrated
by the following examples:
"Either A or B"                                      A + B
"Both A and B"                                       AB
"Not A"                                              A′
"Neither A nor B"                                    (A + B)′, or equivalently, A′B′
"Not both A and B"                                   (AB)′, or equivalently, A′ + B′
"All A are B" or "If A then B" or "A implies B"      A ⊂ B
"Some A are B"                                       AB ≠ 0
"No A are B"                                         AB = 0
"Some A are not B"                                   AB′ ≠ 0
"There are no A"                                     A = 0
If A ⊂ B and B ⊂ C then A ⊂ C.
AA′ = 0,
A + A′ = I,
A + B = B + A
(A + B) + C = A + (B + C)
B)'~
A.
+B
= B.
p(A) = (number of elements in A)/(number of elements in I).
If we denote the number of elements in any set A by the symbol n(A), then this
definition may be written in the form
(1)    p(A) = n(A)/n(I).
(2)
We have
n(A + B) = n(A) + n(B) − n(AB),
13, n(I) = 52,
p(A + B + C) = p[(A + B) + C] = p(A + B) + p(C) − p[(A + B)C],
p[(A + B)C] = p(AC + BC) = p(AC) + p(BC) − p(ABC).
Hence
p(A + B + C) = p(A) + p(B) + p(C) − p(AB) − p(AC) − p(BC) + p(ABC).
p(A) = p(B) = p(C) = 1/3,
for when one digit occupies its proper place there are two possible orders for the
remaining digits, out of a total of 3·2·1 = 6 possible arrangements of the three
digits. Moreover,
p(AB) = p(AC) = p(BC) = 1/6
and
p(ABC) = 1/6.
p(A + B + C) = 3(1/3) − 3(1/6) + 1/6 = 1 − 1/2 + 1/6 = 2/3 = 0.6666....
where thf' symbols~ ~' ~, ~~ttand for summation of the possible com
the n digits 1, 2, 3, ..., n are written down in random order, the probability that
at least one digit will occupy its proper place is
(5)    pₙ = 1 − 1/2! + 1/3! − 1/4! + ... ± 1/n!,
where the last term is taken with a plus or minus sign according as n is odd or
even. In particular, for n = 5 the probability is
p₅ = 1 − 1/2! + 1/3! − 1/4! + 1/5! = 19/30 = 0.6333....
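The values 2/3 for three digits and 19/30 for five digits can be confirmed by brute force. The sketch below counts, among all orderings of n digits, those in which at least one digit occupies its proper place, and compares the result with the alternating sum 1 − 1/2! + 1/3! − ... ± 1/n!; the function names are assumptions of this example.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def p_by_counting(n):
    """Probability that a random ordering of n digits has a digit in its place."""
    hits = sum(1 for perm in permutations(range(n))
               if any(perm[i] == i for i in range(n)))
    return Fraction(hits, factorial(n))

def p_by_formula(n):
    """1 - 1/2! + 1/3! - ... with last term +1/n! for odd n, -1/n! for even n."""
    return sum(Fraction((-1) ** (k + 1), factorial(k)) for k in range(1, n + 1))

p3 = p_by_counting(3)
p5 = p_by_counting(5)
```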
te11d~
s~..-2I-3r+4r-
Since from
CHAPTER III
GEOMETRICAL CONSTRUCTIONS. THE ALGEBRA OF
NUMBER FIELDS
INTRODUCTION
Construction problems have always been a favorite subject in geometry. With ruler and compass alone a great variety of constructions
may be performed, as the reader will remember from school; a line segment or an angle may be bisected, a line may be drawn from a point
perpendicular to a given line, a regular hexagon may be inscribed in a
circle, etc. In all these problems the ruler is used merely as a straightedge, an instrument for drawing a straight line but not for measuring
or marking off distances. The traditional restriction to ruler and compass alone goes back to antiquity, although the Greeks themselves did
not hesitate to use other instruments.
One of the most famous of the classical construction problems is
the so-called contact problem of Apollonius (circa 200 B.C.) in which
three arbitrary circles in the plane are given and a fourth circle tangent
to all three is required. In particular, it is permitted that one or more
of the given circles have degenerated into a point or a straight line
(a "circle" with radius zero or "infinity," respectively). For example,
it may be required to construct a circle tangent to two given straight
lines and passing through a given point. While such special cases are
rather easily dealt with, the general problem is considerably more
difficult.
Of all construction problems, that of constructing with ruler and
compass a regular polygon of n sides has perhaps the greatest interest.
For certain values of n (e.g. n = 3, 4, 5, 6) the solution has been known
since antiquity, and forms an important part of school geometry. But
for the regular heptagon (n = 7) the construction has been proved
impossible. There are three other classical Greek problems for which
a solution has been sought in vain: to trisect an arbitrary given angle,
to double a given cube (i.e. to find the edge of a cube whose volume shall
be twice that of a cube with a given segment as its edge) and to square
the circle (i.e. to construct a square having the same area as a given
circle). In all these problems, ruler and compass are the only instruments permitted.
Unsolved problems of this sort gave rise to one of the most remarkable
and novel developments in mathematics, when, after centuries of futile
search for solutions, the suspicion grew that these problems might be
definitely unsolvable. Thus mathematicians were challenged to investigate
the question: How is it possible to prove that certain problems cannot
be solved?
'f' +I.
The first Fermat numbers are 3, 5, 17, 257, 65537 (see p. 26). So
overwhelmed was young Gauss by his discovery that he at once gave
up his intention of becoming a philologist and resolved to devote his
life to mathematics and its applications. He always looked back on
this first of his great feats with particular pride. After his death, a
bronze statue of him was erected in Göttingen, and no more fitting
honor could be devised than to shape the pedestal in the form of a
regular 17-gon.
When dealing with a geometrical construction, one must never forget
that the problem is not that of drawing figures in practice with a certain
degree of accuracy, but of whether, by the use of straightedge and
compass alone, the solution can be found theoretically, supposing our
instruments to have perfect precision. What Gauss proved is that his
constructions could be performed in principle. His theory does not
concern the simplest way actually to perform them or the devices which
could be used to simplify and to cut down the number of necessary steps.
This is a question of much less theoretical importance. From a practical
point of view, no such construction would give as satisfactory a
result as could be obtained by the use of a good protractor. Failure
properly to understand the theoretical character of the
question of geometrical construction and stubbornness in refusing to take
cognizance of well-established scientific facts are responsible for the
PART I
IMPOSSIBILITY PROOFS AND ALGEBRA
§1. FUNDAMENTAL GEOMETRICAL CONSTRUCTIONS
1. Construction of Fields and Square Root Extraction
a + b and a − b.
Fig. 28. Construction of a/3.
Fig. 30. Construction of ab.
x = √a.
2. Regular Polygons
A somewhat more elaborate construction occurs in connection
with the regular decagon. Suppose that a regular decagon is
inscribed in a circle with radius 1, and call its
side x. Since x will subtend an angle of 36° at the center of the circle,
the other two angles of the large triangle will each be 72°, and hence
the dotted line which bisects angle A divides triangle OAB into two
isosceles triangles, each with equal sides of length x. The radius of the
circle is thus divided into two segments, x and 1 − x. Since OAB is
REGULAR POI,YGONS
V5
it as the hypotenuse of a right triangle whose other sides have l<'ngths 1 and 2
We then obtain x by subtracting the unit length from~ and bisecting the result
The ratio OB:AB of the preceding problem has bePn called the
golden ratio, because the Greek mathematicians considered a rectangle
he~a.gon.
whose two sides are in this ratio to be &sthetically the most pleasing.
Its value, incidentally, is about 1.62.
Of all the regular polygons the hexagon is simplest to construct. We
start with a circle of radius r; the length of the side of a regular hexagon
inscribed in this circle will then be equal to r. The hexagon itself can
be constructed by successively marking off from any point of the circle
chords of length r until all six vertices are obtained.
If sₙ denotes the length of the side of the regular n-gon inscribed in the unit
circle (circle with radius 1), then the side of the 2n-gon is of length
s₂ₙ = √(2 − √(4 − sₙ²)).
This may be proved as follows: In Figure 34 sₙ is equal to DE = 2DC, s₂ₙ equal
to DB, and AB equal to 2. The area of the right triangle ABD is given by
½BD·AD and by ½AB·CD. Since AD = √(AB² − DB²), we find, by substituting
AB = 2, BD = s₂ₙ, CD = ½sₙ, and by equating the two expressions for the area,
sₙ = s₂ₙ√(4 − s₂ₙ²)    or    sₙ² = s₂ₙ²(4 − s₂ₙ²).
From this formula and the fact that s₄ (the side of the square) is equal to √2
it follows that
s₈ = √(2 − √2),    s₁₆ = √(2 − √(2 + √2)),
and in general, for n > 2, the side of the 2ⁿ-gon is
√(2 − √(2 + √(2 + ... + √2)))
with n − 1 nested square roots. The circumference of the 2ⁿ-gon in the circle
is 2ⁿ times this side. As n tends to infinity, the 2ⁿ-gon tends to the circle. Hence this circumference
approaches the length of the circumference of the unit circle, which is by definition 2π. Thus we obtain, by substituting m for n − 1 and cancelling a factor 2,
the limiting formula for π:
2ᵐ·√(2 − √(2 + √(2 + ... + √2))) → π as m → ∞,
with m square roots.
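The doubling formula gives a simple way to approximate π numerically. Starting from the square (side √2) the sketch below doubles the number of sides repeatedly and records half the circumference of each polygon. The number of doublings is kept small on purpose, because in floating-point arithmetic the subtraction 2 − √(4 − s²) loses accuracy when s becomes very small; the function name is an assumption of this example.

```python
import math

def pi_by_doubling(doublings):
    """Half-circumferences of the inscribed 4-gon, 8-gon, 16-gon, ..."""
    sides, s = 4, math.sqrt(2.0)            # the square inscribed in the unit circle
    approximations = [sides * s / 2]
    for _ in range(doublings):
        s = math.sqrt(2.0 - math.sqrt(4.0 - s * s))   # side of the 2n-gon
        sides *= 2
        approximations.append(sides * s / 2)
    return approximations

approx = pi_by_doubling(10)                 # ends with the 4096-gon
```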
n square roots
or
(1a)   x² + y² − r² − 2xx₁ − 2yy₁ ∓ 2rr₁ + x₁² + y₁² − r₁² = 0,
etc. The plus or minus sign is to be chosen in each of these equations
according as the circles are to be externally or internally tangent. (See
Fig. 35.) Equations (1), (2), (3) are three quadratic equations in three
unknowns x, y, r with the property that the second degree terms are
the same in each equation, as is seen from the expanded form (1a).
Hence, by subtracting (2) from (1), we get a linear equation in x, y, r:
(4)    ax + by + cr = d,
where a = 2(x₂ − x₁), etc. Similarly, by subtracting (3) from (1), we
get another linear equation,
(5)    a′x + b′y + c′r = d′.
Solving (4) and (5) for x and y in terms of r and then substituting in (1)
we get a quadratic equation in r, which can be solved by rational operations
and the extraction of a square root (see p. 91). There will in
general be two solutions of this equation, of which only one will be
positive. After finding r from this equation we obtain x and y from the
two linear equations (4) and (5). The circle with center (x, y) and
radius r will be tangent to the three given circles. In the whole process
we have used only rational operations and square root extractions. It
follows that r, x, and y can be constructed by ruler and compass alone.
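The algebraic procedure just described can be sketched in code for the case in which the required circle touches all three given circles externally, so that each condition reads (x − xᵢ)² + (y − yᵢ)² = (r + rᵢ)². The sign choice, the handling of special cases, and all names below are assumptions of this illustration; collinear centers are simply rejected.

```python
import math

def externally_tangent_circle(c1, c2, c3):
    """Each ci is (xi, yi, ri). Returns (x, y, r) with r > 0, or None."""
    (x1, y1, r1), (x2, y2, r2), (x3, y3, r3) = c1, c2, c3
    # Linear equations (4) and (5): a*x + b*y + c*r = d.
    a, b, c = 2 * (x2 - x1), 2 * (y2 - y1), 2 * (r2 - r1)
    d = (x2**2 + y2**2 - r2**2) - (x1**2 + y1**2 - r1**2)
    a_, b_, c_ = 2 * (x3 - x1), 2 * (y3 - y1), 2 * (r3 - r1)
    d_ = (x3**2 + y3**2 - r3**2) - (x1**2 + y1**2 - r1**2)
    det = a * b_ - a_ * b
    if det == 0:
        return None                     # centers collinear: not handled here
    # Solve for x and y in terms of r: x = px + qx*r, y = py + qy*r.
    px, qx = (d * b_ - d_ * b) / det, (c_ * b - c * b_) / det
    py, qy = (a * d_ - a_ * d) / det, (a_ * c - a * c_) / det
    # Substitute into (x - x1)^2 + (y - y1)^2 - (r + r1)^2 = 0.
    u, v = px - x1, py - y1
    A = qx * qx + qy * qy - 1.0
    B = 2.0 * (u * qx + v * qy - r1)
    C = u * u + v * v - r1 * r1
    if A == 0:
        roots = [-C / B] if B != 0 else []
    else:
        disc = B * B - 4.0 * A * C
        if disc < 0:
            return None
        sq = math.sqrt(disc)
        roots = [(-B + sq) / (2 * A), (-B - sq) / (2 * A)]
    positive = [r for r in roots if r > 0]
    if not positive:
        return None
    r = min(positive)                   # arbitrary choice if both are positive
    return (px + qx * r, py + qy * r, r)

# Three unit circles centered at the vertices of an equilateral triangle of side 4.
circles = [(-2.0, 0.0, 1.0), (2.0, 0.0, 1.0), (0.0, 2.0 * math.sqrt(3.0), 1.0)]
x, y, r = externally_tangent_circle(*circles)
```

Only rational operations and a single square root occur, in line with the constructibility claim of the text.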
for
APOLLONIUS' PROBLEM
is again a rational number,
Any set of numbers possessing
this property of closure with respect to the four rational
operations is called a number field.
Exercise:
Show that every field contains all the rational numbers at least.
(Hint: If a ≠ 0 is a number in the field F, then a/a = 1 belongs to F, and from 1
we can obtain any rational number by rational operations.)
Starting from the unit, we can thus construct the whole rational
number field and hence all the rational points (i.e. points with both
coordinates rational) in the x, y plane. We can reach new, irrational,
numbers by using the compass to construct e.g. the number √2 which,
as we know from Chapter II, §2, is not in the rational field. Having
constructed √2 we may then, by the "rational" constructions of §1,
find all numbers of the form
(1)    a + b√2,
where a, b are rational, and therefore are themselves constructible. We
may likewise construct all numbers of the form
(a + b√2)/(c + d√2)    or    (a + b√2)(c + d√2),
where a, b, c, d are rational. These numbers, however, may always be written in the form (1). For we have
(a + b√2)/(c + d√2) = (a + b√2)(c − d√2)/[(c + d√2)(c − d√2)] = (ac − 2bd)/(c² − 2d²) + [(bc − ad)/(c² − 2d²)]√2 = p + q√2,
(a + b√2)(c + d√2) = (ac + 2bd) + (bc + ad)√2 = r + s√2.
The constructibility of
and with it, according to 1, the field consisting of all the numbers
(2)
+ ,.yk,
Represent
+ +
+ b,._ 1 a~-=i+- . .~ + b o: + bo
b~
Exercise: If two segments of lengths 1 and α are given, give actual constructions
for $1 + \alpha + \alpha^2$, $(1 + \alpha)/(1 - \alpha)$, $\alpha^3$.
Now let us assume more generally that we are able to construct all
the numbers of some number field F. We shall show that the use of the
ruler alone will never lead us out of the field F. The equation of the
straight line through two points whose coördinates $a_1, b_1$ and $a_2, b_2$ are
in F is $(b_1 - b_2)x + (a_2 - a_1)y + (a_1 b_2 - a_2 b_1) = 0$ (see p. 491); its
coefficients are rational expressions formed from numbers in F, and
therefore, by definition of a field, are themselves in F. Moreover,
if we have two lines, $\alpha x + \beta y - \gamma = 0$ and $\alpha' x + \beta' y - \gamma' = 0$, with
coefficients in F, then the coördinates of their point of intersection,
found by solving these two simultaneous equations, are
$x = \frac{\gamma\beta' - \beta\gamma'}{\alpha\beta' - \beta\alpha'}$,    $y = \frac{\alpha\gamma' - \gamma\alpha'}{\alpha\beta' - \beta\alpha'}$.
Since these are likewise numbers of F, it is evident that
the use of the ruler alone cannot take us beyond the confines of the
field F.
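As an illustration for the case where F is the rational field, the two ruler operations can be carried out in exact arithmetic. This is only a sketch; the function names `line_through` and `intersect` are hypothetical, and a line is stored as the triple (α, β, γ) of the equation $\alpha x + \beta y - \gamma = 0$.

```python
from fractions import Fraction as Fr

def line_through(p, q):
    """Coefficients (alpha, beta, gamma) of the line through points p and q."""
    (a1, b1), (a2, b2) = p, q
    # (b1 - b2)x + (a2 - a1)y + (a1*b2 - a2*b1) = 0
    return (b1 - b2, a2 - a1, -(a1 * b2 - a2 * b1))

def intersect(l1, l2):
    """Intersection point of two non-parallel lines."""
    (al, be, ga), (al_, be_, ga_) = l1, l2
    den = al * be_ - be * al_            # zero exactly when the lines are parallel
    return ((ga * be_ - be * ga_) / den, (al * ga_ - ga * al_) / den)
```

Since only sums, differences, products and quotients occur, rational input points always give a rational intersection point.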
v'i
(1
respectivf'ly
We can only break through the walls of F by using the compass.
For this purpose we select an element k of F which is such that $\sqrt{k}$
is not in F. Then we can construct $\sqrt{k}$ and therefore all the numbers
(3)    $a + b\sqrt{k}$,
+
+
;; ~i
(a+
=~7 +~~;::vii,
bf:(~ dy'li2 ~ j
v'2
(1
+ y'2)
v'2
+ v"2
v'2
4 (1 + jy'2) ,92
2 -
v'2-0
,92) ~ -2
2
_ (2__v'22 0
4 - 2
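The equation of a circle can be written in the form
$x^2 + y^2 + 2\alpha x + 2\beta y + \gamma = 0$,
with coefficients α, β, γ in F. A straight line,
$ax + by + c = 0$,
joining any two points whose coördinates are in F, has coefficients a, b, c
in F, as we have seen on page 129. By eliminating y from these simultaneous
equations, we obtain for the x-coördinate of a point of intersection
of the circle and line a quadratic equation of the form
$Ax^2 + Bx + C = 0$,
with coefficients A, B, C in F (explicitly: $A = a^2 + b^2$, $B = 2(ac + b^2\alpha - ab\beta)$, $C = c^2 - 2bc\beta + b^2\gamma$). The solution is given by the
formula
$x = \frac{-B \pm \sqrt{B^2 - 4AC}}{2A}$,
which is of the form $p + q\sqrt{k}$, with p, q, k in F. A similar formula
holds for the y-coördinate of a point of intersection.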
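The elimination of y can be checked in exact arithmetic. The sketch below assumes the circle is written as $x^2 + y^2 + 2\alpha x + 2\beta y + \gamma = 0$ and the line as $ax + by + c = 0$ with $b \neq 0$; the function name is hypothetical. It returns A, B, C together with the radicand $k = B^2 - 4AC$, the only quantity whose square root may lie outside F.

```python
from fractions import Fraction as Fr

def line_circle_quadratic(al, be, ga, a, b, c):
    """Coefficients of A*x^2 + B*x + C = 0 and the radicand k = B^2 - 4AC."""
    A = a * a + b * b
    B = 2 * (a * c + b * b * al - a * b * be)
    C = c * c - 2 * b * c * be + b * b * ga
    return A, B, C, B * B - 4 * A * C
```

For the unit circle and the line x - y = 0 this gives $2x^2 - 1 = 0$ with k = 8, so that $x = \pm\sqrt{8}/4$, a number of the form $p + q\sqrt{k}$ with rational p, q, k.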
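Again, if we have two circles,
$x^2 + y^2 + 2\alpha x + 2\beta y + \gamma = 0$,
$x^2 + y^2 + 2\alpha' x + 2\beta' y + \gamma' = 0$,
then by subtracting the second equation from the first we obtain the
linear equation
$2(\alpha - \alpha')x + 2(\beta - \beta')y + (\gamma - \gamma') = 0$,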
one or two new points, and these new quantities are of the form
$p + q\sqrt{k}$, with p, q, k in F. In particular, of course, $\sqrt{k}$ may itself
= 4,
a new
when~
whose
root docs not lie in F 1 Repeating this
shall
a field $F_n$ after n adjunctions of square roots. Constructible
numbers are those and only those which can be reached by such a sequence of
extension fields; that is, which lie in a field $F_n$ of the type described.
+ 5.
Let $F_0$ denote the rational field. Putting $k_0 = 2$, we obtain the field $F_1$,
which contains the number $1 + \sqrt{2}$. We now take $k_1 = 1 + \sqrt{2}$
and $k_2 = 3$. As a matter of fact, 3 is in the original field $F_0$, and
a fortiori in the field $F_2$, so that it is perfectly permissible to take
$k_2 = 3$. We then take $k_3 = \sqrt{1 + \sqrt{2}} + \sqrt{3}$, and finally
$k_4 = \sqrt{\sqrt{1 + \sqrt{2}} + \sqrt{3}} + 5$.
<" ' ' + vm/(1 + vi~;:?:l),
(v'2+V'3l(0 + v'l+-v, .;:~r:;: -\/3~0).
2. All Constructible Numbers are Algebraic
If the initial field $F_0$ is the rational field generated by a single segment, then
all constructible numbers will be algebraic. (For the definition of algebraic
numbers see p. 103.) The numbers of the field $F_1$ are roots of quadratic equations,
those of $F_2$ are roots of fourth degree equations, and, in general, the numbers
of $F_k$ are roots of equations of degree $2^k$, with rational coefficients. To show
this for a field $F_2$ we may first consider as an example $x = \sqrt{2} + \sqrt{3 + \sqrt{2}}$. We
have $(x - \sqrt{2})^2 = 3 + \sqrt{2}$, $x^2 + 2 - 2\sqrt{2}x = 3 + \sqrt{2}$, or $x^2 - 1 = \sqrt{2}(2x + 1)$,
a quadratic equation with coefficients in a field $F_1$. By squaring, we finally
obtain
$(x^2 - 1)^2 = 2(2x + 1)^2$,
which is an equation of the fourth degree with rational coefficients.
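The example can be confirmed numerically. Squaring $x^2 - 1 = \sqrt{2}(2x + 1)$ gives $(x^2 - 1)^2 = 2(2x + 1)^2$, and the sketch below, which uses floating-point arithmetic and a hypothetical function name, evaluates the difference of the two sides at $x = \sqrt{2} + \sqrt{3 + \sqrt{2}}$.

```python
import math

def quartic_residual():
    """Value of (x^2 - 1)^2 - 2(2x + 1)^2 at x = sqrt(2) + sqrt(3 + sqrt(2))."""
    x = math.sqrt(2) + math.sqrt(3 + math.sqrt(2))
    return (x * x - 1) ** 2 - 2 * (2 * x + 1) ** 2
```

The residual vanishes up to rounding error, so x is a root of the rational quartic $x^4 - 10x^2 - 8x - 1 = 0$.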
I""
p+
qVW~
p-a+b.y;:q=
From (4) we have
:.e' - 2p:z
p' - q"w,
+ ux + v =
x1
where
of the
(5)
v'B(rx
+ 1),
+ ux + v)'
= B(rx
+ t)l
3
-
2 = 0.
x=p+q.YW.
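(2)    $x^3 - 2 = a + b\sqrt{w}$,
where a and b are in $F_{k-1}$. By an easy calculation we can show that
$a = p^3 + 3pq^2w - 2$, $b = 3p^2q + q^3w$. If we put
$y = p - q\sqrt{w}$,
then a substitution of $-q$ for q in these expressions for a and b shows that
(2')    $y^3 - 2 = a - b\sqrt{w}$.
Now x was supposed to be a root of $x^3 - 2 = 0$, hence
(3)    $a + b\sqrt{w} = 0$.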
This implies, and here is the key to the argument, that a and b must
both be zero. If b were not zero, we would infer from (3) that $\sqrt{w} = -a/b$.
But then $\sqrt{w}$ would be a number of the field $F_{k-1}$ in which a
and b lie, contrary to our assumption. Hence b = 0, and it follows
immediately from (3) that a = 0 also.
Now that we have shown that a = b = 0, we immediately infer
from (2') that $y = p - q\sqrt{w}$ is also a solution of the cubic equation (1),
since $y^3 - 2$ is equal to zero. Furthermore, $y \neq x$, i.e. $x - y \neq 0$;
for, $x - y = 2q\sqrt{w}$ can only vanish if q = 0, and if this were so then
x = p would lie in $F_{k-1}$, contrary to our assumption.
We have therefore shown that, if $x = p + q\sqrt{w}$ is a root of the
cubic equation (1), then $y = p - q\sqrt{w}$ is a different root of this equation.
This leads immediately to a contradiction. For there is only
one real number x which is a cube root of 2, the other cube roots of 2
being imaginary (see p. 98); $y = p - q\sqrt{w}$ is obviously real, since
p, q, and $\sqrt{w}$ were real.
Thus our basic assumption has led to an absurdity, and hence is
proved to be wrong; a solution of (1) cannot lie in a field $F_k$, so that
doubling the cube by ruler and compass is impossible.
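The "easy calculation" behind this argument, namely that $x^3 - 2 = a + b\sqrt{w}$ with $a = p^3 + 3pq^2w - 2$ and $b = 3p^2q + q^3w$, and that replacing q by -q changes only the sign of b, can be verified with exact fractions. The sketch represents $a + b\sqrt{w}$ by the pair (a, b); the function name is hypothetical.

```python
from fractions import Fraction as Fr

def cube_minus_two(p, q, w):
    """(p + q*sqrt(w))^3 - 2, returned as (a, b) meaning a + b*sqrt(w)."""
    a = p**3 + 3 * p * q * q * w - 2
    b = 3 * p * p * q + q**3 * w
    return a, b
```

For p = q = 1, w = 2 this gives $(1 + \sqrt{2})^3 - 2 = 5 + 5\sqrt{2}$, while the conjugate gives $5 - 5\sqrt{2}$; the two results always differ only in the sign of b, which is the step the proof relies on.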
2. A Theorem on Cubic Equations
Our concluding algebraic argument was especially adapted to the particular problem at hand. If we want to dispose of the two other Greek
problems, we shall use the fact about the cubic equation
(4)    $z^3 + az^2 + bz + c = 0$
that, if $x_1, x_2, x_3$ are the three roots of this equation, then
(5)    $x_1 + x_2 + x_3 = -a$.†
Let us consider any cubic equation (4) where the coefficients a, b, c are
rational numbers. It may be that one of the roots of the equation is
rational; for example, the equation $x^3 - 1 = 0$ has the rational root 1,
while the two other roots, given by the quadratic equation $x^2 + x + 1 = 0$,
are necessarily imaginary. But we can easily prove the general theorem:
If a cubic equation with rational coefficients has no rational root, then
none of its roots is constructible starting from the rational field $F_0$.
Again we give the proof by an indirect method. Suppose x were a
constructible root of (4). Then x would lie in the last field $F_k$ of some
chain of extension fields, $F_0, F_1, \ldots, F_k$, as above. We may assume
that k is the smallest integer such that a root of the cubic equation (4)
lies in an extension field $F_k$. Certainly k must be greater than zero,
since in the statement of the theorem it is assumed that no root x lies
in the rational field $F_0$. Hence x can be written in the form
$x = p + q\sqrt{w}$,
where p, q, w are in the preceding field, $F_{k-1}$, but $\sqrt{w}$ is not. It follows,
exactly as for the special equation, $z^3 - 2 = 0$, of the preceding article,
that another number of $F_k$,
$y = p - q\sqrt{w}$,
will also be a root of the equation (4). As before, $q \neq 0$
and hence $x \neq y$.
From (5) we know that the third root u of the equation (4) is given
by $u = -a - x - y$. But since $x + y = 2p$, this means that
$u = -a - 2p$,
† The polynomial $z^3 + az^2 + bz + c$ may be factored into the product
$(z - x_1)(z - x_2)(z - x_3)$, where $x_1, x_2, x_3$ are the three roots of the equation
(4) (see p. 101). Hence
$z^3 + az^2 + bz + c = z^3 - (x_1 + x_2 + x_3)z^2 + (x_1x_2 + x_1x_3 + x_2x_3)z - x_1x_2x_3$,
so that, since the coefficient of each power of z must be the same on both sides,
$-a = x_1 + x_2 + x_3$,    $b = x_1x_2 + x_1x_3 + x_2x_3$,    $-c = x_1x_2x_3$.
We shall now prove that the trisection of the angle by ruler and
compass alone is in general impossible. Of course, there are angles, such
as 90° and 180°, for which the trisection can be performed. What we
have to show is that the trisection cannot be effected by a procedure
valid for every angle. For the proof, it is quite sufficient to exhibit
only one angle that cannot be trisected, since a valid general method
would have to cover every single example. Hence the non-existence of
a general method will be proved if we can demonstrate, for example,
that the angle 60° cannot be trisected by ruler and compass alone.
We can obtain an algebraic equivalent of this problem in different
ways; the simplest is to consider an angle θ as given by its cosine: cos θ = g.
Then the problem is equivalent to that of finding the quantity x =
cos (θ/3). By a simple trigonometrical formula (see p. 97), the cosine
of θ/3 is connected with that of θ by the equation
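$\cos\theta = g = 4\cos^3(\theta/3) - 3\cos(\theta/3)$.
In other words, the problem of trisecting the angle θ with cos θ = g amounts to constructing a solution of the cubic equation
(6)    $4z^3 - 3z - g = 0$.
To show that this cannot in general be done, we take θ = 60°, so that g = cos 60° = 1/2. Equation (6) then becomes
(7)    $8z^3 - 6z = 1$.
By virtue of the theorem proved in the preceding article, we need only show that this equation has no rational root. Let v = 2z. Then the equation becomes
(8)    $v^3 - 3v = 1$.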
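That the equations $8z^3 - 6z - 1 = 0$ and $v^3 - 3v - 1 = 0$ have no rational root can also be confirmed by a finite search. The sketch below uses the standard rational-root test (a root p/q in lowest terms must have p dividing the constant term and q dividing the leading coefficient), which is a swapped-in technique and not necessarily the argument of the text; the function names are hypothetical.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a nonzero integer."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(coeffs):
    """Rational roots of a polynomial with integer coefficients.

    coeffs lists the coefficients from the highest power down; the leading
    and the constant coefficient are assumed to be nonzero.
    """
    lead, const = coeffs[0], coeffs[-1]
    found = set()
    for p in divisors(const):
        for q in divisors(lead):
            for cand in (Fraction(p, q), Fraction(-p, q)):
                value = Fraction(0)
                for c in coeffs:          # Horner evaluation in exact arithmetic
                    value = value * cand + c
                if value == 0:
                    found.add(cand)
    return sorted(found)
```

An empty result for a cubic means, by the theorem on cubic equations, that none of its roots is constructible; the same test applied to $x^3 - 2 = 0$ recovers the impossibility of doubling the cube.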
Fig. 36. Archimedes' trisection of an angle.
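know that the vertices of the heptagon are given by the roots of the
equation
(9)    $z^7 - 1 = 0$,
the coördinates x, y of the vertices being considered as the real and
imaginary parts of complex numbers z = x + yi. One root of this
equation is z = 1, and the others are the roots of the equation
(10)    $z^6 + z^5 + z^4 + z^3 + z^2 + z + 1 = 0$.
Dividing (10) by $z^3$, we obtain
(11)    $z^3 + 1/z^3 + z^2 + 1/z^2 + z + 1/z + 1 = 0$.
Denoting the quantity z + 1/z by y, this becomes
(12)    $y^3 + y^2 - 2y - 1 = 0$.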
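Equation (12) can be checked numerically. For a seventh root of unity z on the unit circle with angle 360°/7, the quantity y = z + 1/z equals 2 cos (360°/7); the sketch below, with a hypothetical function name and floating-point arithmetic, substitutes this value into $y^3 + y^2 - 2y - 1$.

```python
import math

def heptagon_residual():
    """Value of y^3 + y^2 - 2y - 1 at y = 2*cos(2*pi/7)."""
    y = 2 * math.cos(2 * math.pi / 7)
    return y**3 + y**2 - 2 * y - 1
```

The residual is zero up to rounding error. The only candidates for a rational root of (12) are +1 and -1, and neither satisfies the equation.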
(13)    $z = \cos\phi + i\sin\phi$,
(14)
+rs 2
2d -
t/
= 0;
§4. GEOMETRICAL TRANSFORMATIONS. INVERSION
1. General Remarks
problems
('tlli
~~f~,;;::,:;~:::~~~~i::~~:~;~~::::~'
general
standpoint{'on.:;truction,
of "gPometrical
ing
an individual
\H' Hhall
class of problems connected by certain processes of
The clarifying power of the concept of a class of geometrical transformations
line.
By a transformation, or mapping, of the plane onto itself we mean a
rule which assigns to every point P of the plane another point P', called
the image of P under the transformation; the point P is called the
antecedent of P'. A simple example of such a transformation is given
by the reflection of the plane in a given straight line L as in a mirror:
a point P on one side of L has as its image the point P', on the other side
of L, and such that L is the perpendicular bisector of the segment PP'.
A transformation may leave certain points of the plane fixed; in the
case of a reflection this is true of the points on L.
Fig. 38. Inversion of a point.
(1)    $OP \cdot OP' = r^2$
'vith
inverse
to C.
of P,
r.
& ~.
2. Properties of Inversion
The most important property of an inversion is that it transforms
straight lines and circles into straight lines and circles. More precisely, we shall show that after an inversion
(a) a line through O becomes a line through O,
(b) a line not through O becomes a circle through O,
(c) a circle through O becomes a line not through O,
(d) a circle not through O becomes a circle not through O.
Statement (a) is obvious, since from the definition of inversion any
point on the straight line has as image another point on the same line,
so that although the points on the line are interchanged, the line as a
whole is transformed into itself.
To prove statement (b), drop a perpendicular from O to the straight
line L (Fig. 39). Let A be the point where this perpendicular meets L,
and let A' be the inverse point to A. Mark any point P on L, and let P'
be its inverse point. Since $OA' \cdot OA = OP' \cdot OP = r^2$, it follows that
$\frac{OA'}{OP'} = \frac{OP}{OA}$.
Hence the triangles OP'A' and OAP are similar and angle OP'A' is a
right angle. From elementary geometry it follows that P' lies on the
circle K with diameter OA', so that the inverse of L is this circle. This
proves (b). Statement (c) now follows from the fact that since the inverse of L is K, the inverse of K is L.
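Statement (b) can be tested numerically. The sketch below places O at the origin and takes L to be the vertical line x = a, so that A = (a, 0); these coördinates and the function names are assumptions made for illustration. It measures how far the images of several points of L deviate from the circle with diameter OA'.

```python
import math

def invert(p, r=1.0):
    """Inverse of the point p in the circle of radius r about the origin O."""
    x, y = p
    d2 = x * x + y * y                  # squared distance OP; O itself has no image
    return (r * r * x / d2, r * r * y / d2)

def line_image_deviation(a=2.0, r=1.0):
    """Largest deviation of images of points on x = a from the circle on OA'."""
    ax, _ = invert((a, 0.0), r)         # A' is the inverse of the foot A = (a, 0)
    center, radius = (ax / 2, 0.0), ax / 2
    return max(abs(math.dist(invert((a, t), r), center) - radius)
               for t in (-5.0, -1.0, 0.0, 0.5, 3.0))
```

The deviation is zero up to rounding, in agreement with the similar-triangle argument.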
It remains to prove statement (d). Let K be any circle not passing
through O, with center M and radius k. To obtain its image, we draw
a line through O intersecting K at A and B, and then determine how the
images A', B' vary when the line through O intersects K in all possible
ways. Denote the distances OA, OB, OA', OB', OM by a, b, a', b', m,
and let t be the length of a tangent to K from O. We have $aa' = bb' = r^2$,
by definition of inversion, and $ab = t^2$, by an elementary geometrical property of the circle. If we divide the first relations by
the second, we get
$a'/b = b'/a = r^2/t^2 = c^2$,
where $c^2$ is a constant that depends only upon r and t, and is the same
for all positions of A and B. Through A' we draw a line parallel to BM
meeting OM at Q. Let OQ = q and A'Q = p. Then q/m = a'/b =
p/k, or
$q = ma'/b = mc^2$,    $p = ka'/b = kc^2$.
This means that for all positions of A and B, Q will always be the same
point on OM, and the distance A'Q will always have the same value.
Likewise B'Q = p, since a'/b = b'/a. Thus the images of all points
A, B on K are points whose distance from Q is always p, i.e. the image
of K is a circle. This proves (d).
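The formulas $q = mc^2$, $p = kc^2$ with $c^2 = r^2/t^2$ can be checked numerically. The sketch assumes that O is the origin, that O lies outside K (so that $t^2 = m^2 - k^2$ is positive), and that the center M lies on the positive x-axis; the function names are hypothetical.

```python
import math

def invert(p, r=1.0):
    """Inverse of the point p in the circle of radius r about the origin O."""
    x, y = p
    d2 = x * x + y * y
    return (r * r * x / d2, r * r * y / d2)

def image_circle(m, k, r=1.0):
    """Distance OQ = q and radius p of the image of K (O outside K)."""
    t2 = m * m - k * k                   # squared length of the tangent from O
    c2 = r * r / t2
    return m * c2, k * c2

def circle_image_deviation(m=3.0, k=1.0, r=1.0):
    """Largest deviation of inverted points of K from the predicted circle."""
    q, p = image_circle(m, k, r)
    deviations = []
    for i in range(12):
        th = 2 * math.pi * i / 12
        image = invert((m + k * math.cos(th), k * math.sin(th)), r)
        deviations.append(abs(math.dist(image, (q, 0.0)) - p))
    return max(deviations)
```

With m = 3, k = 1, r = 1 the image is the circle with center at distance 3/8 from O and radius 1/8. Note that Q is not the inverse of the center M.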
Fig. 42. Inversion of an outside point in a circle.
= r
that the triangles AOP, OPQ, OQC are equilateral, so that OA and OC
form an angle of 180°, and OC = OQ = AO. By repeating this procedure, we can easily extend AO any desired number of times. Incidentally, since the length of the segment AQ is $r\sqrt{3}$, as the reader can
easily verify, we have at the same time constructed $\sqrt{3}$ from the unit
without using the straightedge.
Now we can find the inverse of any point P inside the circle C. First
we find a point R on the line OP whose distance from O is an integral
multiple of OP and which lies outside C,
$OR = n \cdot OP$.
We can do this by successively measuring off the distance OP with the
compass until we land outside C. Now we find the point R' inverse
to R by the construction previously given. Then
$r^2 = OR' \cdot OR = OR' \cdot (n \cdot OP) = (n \cdot OR') \cdot OP$.
Therefore the point P' for which $OP' = n \cdot OR'$ is the desired inverse.
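In terms of distances alone, the procedure for an inside point can be followed step by step. The sketch below works only with the distances from O (not with the compass operations themselves); the function name is hypothetical, and d stands for OP with 0 < d < r.

```python
def invert_inner_distance(d, r=1.0):
    """Return (n, OP') for a point P at distance d from O, with 0 < d < r."""
    n = 1
    while n * d <= r:             # measure off OP until R lies outside C
        n += 1
    dist_r = n * d                # OR = n * OP
    dist_r_inverse = r * r / dist_r   # OR', from the outside-point construction
    return n, n * dist_r_inverse      # OP' = n * OR'
```

For r = 1 and OP = 0.3 four steps are needed, and the result OP' = 4 · OR' agrees with $r^2/OP$, as the identity $r^2 = (n \cdot OR') \cdot OP$ requires.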
Fig. 43. Inversion of an inside point in a circle.
4. How to Bisect a Segment and Find the Center of a Circle with the
Compass Alone
Now that we have learned how to find the inverse of a given point by
using the compass alone, we can perform some interesting constructions.
For example, we consider the problem of finding the point midway
between two given points A and B using the compass alone (no
straight lines may be drawn!). Here is the solution: Draw the circle
with radius AB about B as center, and mark off three arcs with radius
AB, starting from A. The final point C will be on the line AB, with
AB = BC. Now draw the circle with radius AB and center A, and
let C' be the point inverse to C with respect to this circle. Then
$AC' \cdot AC = AB^2$,
$AC' \cdot 2AB = AB^2$,
$2AC' = AB$.
Hence C' is the desired midpoint.
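The relation $AC' \cdot AC = AB^2$ with AC = 2AB can be checked in coördinates. The sketch below computes C and its inverse algebraically rather than by compass arcs; the function names are hypothetical.

```python
def invert_about(center, radius, p):
    """Inverse of p in the circle with the given center and radius."""
    cx, cy = center
    dx, dy = p[0] - cx, p[1] - cy
    s = radius * radius / (dx * dx + dy * dy)
    return (cx + s * dx, cy + s * dy)

def compass_midpoint(a, b):
    """Midpoint of AB obtained as the inverse of C, where AB = BC."""
    c = (2 * b[0] - a[0], 2 * b[1] - a[1])      # C on line AB beyond B
    ab = ((b[0] - a[0]) ** 2 + (b[1] - a[1]) ** 2) ** 0.5
    return invert_about(a, ab, c)               # circle about A with radius AB
```

For A = (1, 2) and B = (4, 6) the point C is (7, 10), and its inverse in the circle about A with radius 5 is (2.5, 4), the midpoint of AB.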
Fig. 44. Finding the midpoint of a segment.
Fig. 45. Finding the center of a circle.
Fig. 46. An instrument for doubling the cube.
construction is as follows: from A and B as centers, swing two arcs with
radius AO. From O lay off arcs OP and OQ equal to AB. Then swing
two arcs with PB and QA as radii and with P and Q as centers, intersecting at R. Finally, with OR as radius, describe an arc with either
P or Q as center until it intersects AB; this point of intersection is the
required midpoint of the arc AB. The proof is left as an exercise for
the reader.
It would be impossible to prove Mascheroni's general theorem by
actually giving a construction by compass alone for every construction
possible with ruler and compass, since the number of possible constructions is not finite. But we may arrive at the same goal by proving
Fig. 47. Bisecting an arc with the compass.
X and X' are equidistant from O and P, since A and B are so by construction. This follows from the fact that the inverse of Q is a point
whose distance from X and X' is equal to the radius of C (p. 144).
Note that the circle through X, X', and O is the inverse of the line AB,
since this circle and the line AB intersect C at the same points. (Points
on the circumference of a circle are their own inverses.)
The construction is invalid only if the line AB goes through the center
of C. But then the points of intersection can be found, by the construction given on page 148, as the midpoints of arcs on C obtained
by swinging around B an arbitrary circle which intersects C in $B_1$
and $B_2$.
Fig. 48. Intersection of circle and line not through center.
Fig. 49. Intersection of circle and line through center.
The method of determining the circle inverse to the line joining two
given points permits an immediate solution of problem 4. Let the lines
be given by AB and A'B' (Fig. 50). Draw any circle C in the plane,
and by the preceding method find the circles inverse to AB and A'B'.
These circles intersect at O and at a point Y. The point X inverse
to Y is the required point of intersection, and can be constructed by
the process already used. That X is the required point is evident from
the fact that Y is the only point that is inverse to a point of both AB
and A'B'; hence the point X inverse to Y must lie on both AB and A'B'.
With these two constructions we have completed the proof of the
equivalence between Mascheroni constructions using only the compass
and the conventional geometrical constructions with ruler and compass.
Fig. 50. Intersection of two lines.
Let A be any point on the given circle K. The side of a regular inscribed hexagon is equal to the radius of K. Hence we can find points
B, C, D on K such that arc AB = arc BC = arc CD = 60° (Fig. 51). With A
(avz.
of a given
for all geometrical C{}nstructions in the
sense
It is all the more remarkable that Steiner was able to restrict the use
in the plane which are
possible with the straightedge alone, provided that a single fixed circle
and its center are given. These constructions require projective methods and will be indicated later (see page 197).
This circle and its center cannot be dispensed with. For example, if a circle,
but not its center, is given, it is impossible to construct the latter by the use of
the straightedge alone. To prove this we shall make use of a fact that will be
discussed later (p. 220): There exists a transformation of the plane into itself
which has the following properties: (a) The given circle is fixed under the transformation.
(b) Any straight line is carried into a straight line. (c) The center
of the circle is carried into some other point. The mere existence of such a transformation shows the impossibility of constructing with the straightedge alone the
center of the given circle. For, whatever the construction might be, it would
consist in drawing a certain number of straight lines and finding their intersections with one another and with the given circle. Now if the whole figure, consisting of the given circle together with all points and lines of the construction,
is subjected to the transformation whose existence we have assumed, the transformed figure will satisfy all the requirements of the construction, but will yield
as result a point other than the center of the given circle. Hence such a construction is impossible.
(1)    $ax^3 + bx^2 + cx = k$.
For if
(2)    $xy = k$,    $y = ax^2 + bx + c$,
then solving equation (1) amounts to solving the simultaneous equations (2) by eliminating y; i.e. the roots of (1) are the x-coördinates of
the points of intersection of the hyperbola and parabola in (2). Thus
the solutions of (1) can be constructed if we have instruments with
which to draw the hyperbola and parabola of equations (2).
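The equivalence between the cubic $ax^3 + bx^2 + cx = k$ and the pair of curves $xy = k$, $y = ax^2 + bx + c$ is easy to check for a cubic with known roots. In the sketch below the sample cubic $x^3 - 6x^2 + 11x = 6$, with roots 1, 2, 3, is an illustrative choice and the function name is hypothetical.

```python
def on_both_curves(a, b, c, k, x):
    """True if the point of the parabola above x also lies on the hyperbola."""
    y = a * x * x + b * x + c     # parabola y = ax^2 + bx + c
    return x * y == k             # hyperbola xy = k
```

Substituting the parabola into the hyperbola gives $x(ax^2 + bx + c) = k$, which is the cubic itself, so the x-coördinates of the intersection points are exactly its roots.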
Since antiquity mathematicians have known that many interesting
curves can be defined and drawn by simple mechanical instruments.
Of these "mechanical curves" the cycloids are among the most remarkable.
Ptolemy (circa 200 A.D.) used them in a very ingenious way to
describe the movements of the planets in the heavens.
Fig. 52. Graphical solution of a cubic equation.
Fig. 54. General cycloids.
Fig. 55. Three-cusped hypocycloid.
*4. Linkages.
~n
engine
to a point on the flywheel iin~~~c~:~,::~,:l;~;,th\~?:;~::o:~~~~:~,~
flywhePI would move the piston al
was only approximate, and despit-e
mathematicians, the problem of constructing a linkage to """''" ['oint
precisely on a straight line remained unsolved. At one time,
proofs for the impossibility of solutions to certain problems were attracting wide attention, the conjecture was made that the construction of
such a linkage was impossible. It was a great surprise when, in 1864, a
French naval officer. Peaucellier, invented a simple linkage that solved
Fig. 58. Peaucellier's transformation of rotation into true linear motion.
$OP \cdot OQ = (OT - PT)(OT + PT) = OT^2 - PT^2 = (OT^2 + ST^2) - (PT^2 + ST^2) = t^2 - s^2$.
$OP/BD = AO/AB = m/(m + n)$
and
$OQ/AC = OB/AB = n/(m + n)$.
Thus
$OP \cdot OQ = [mn/(m + n)^2]\,BD \cdot AC = [mn/(m + n)^2](AD^2 - AB^2)$.
$\frac{OP}{OA} = \frac{OA'}{OP'}$,
i.e. the triangles OAP and OA'P' are similar. Hence angle x is equal
to angle OA'P', which we call y. Our final step consists in letting the
point A move along C and approach the point P. This causes the
secant line AP to revolve into the position of the tangent line to C at P,
while the angle x tends to $x_0$. At the same time A' will approach P',
and A'P' will revolve into the tangent at P'. The angle y approaches $y_0$.
Since x is equal to y at every position of A, we must have in the limit,
$x_0 = y_0$.
Our proof is only partially completed, however, since we have considered only the case of a curve intersecting a line through O. The
general case of two curves C, C* forming an angle z at P is now easily
disposed of. For it is evident that the line OPP' divides z into two
angles, each of which we know to be preserved by the inversion.
It should be noted that although inversion preserves the magnitude of angles,
it reverses their sense; i.e. if a ray through P sweeps out the angle $x_0$ in a counterclockwise
direction, its image will sweep out the angle $y_0$ in a clockwise direction.
straight lines appears to be quite different from that of the circles, yet
we see that they are closely related; indeed from the standpoint of
the theory of inversion they are entirely equivalent.
Fig. 62. Tangent circles transformed into parallel lines.
Fig. 63. Preliminary to Apollonius' construction.
little more complicated. Here the circles and their images reflect successively into one another, growing smaller with each reflection, until
they narrow down to two points, one in each circle. (These points have
the property of being mutually inverse with respect to both circles.)
The situation is shown in Figure 67. The use of three circles leads
CHAPTER IV
PROJECTIVE GEOMETRY. AXIOMATICS. NON-EUCLIDEAN
GEOMETRIES
§1. INTRODUCTION
1. Classification of Geometrical Properties. Invariance under Transformations
Instead, we may
say that the theorems
of elementary geometry concern magnitudes (lengths, measures of angles, and areas). Two figures are equivalent from
this point of view if they are congruent, that is, if one can be obtained
from the other by a rigid motion, in which merely position but no magnitude is changed. The question now arises whether the concept of
magnitude and the related concepts of congruence and similarity are
essential to geometry, or whether geometrical figures may have even
deeper properties that are not destroyed by transformations more
drastic than the rigid motions. We shall see that this is indeed the case.
Suppose we draw a circle and a pair of its perpendicular diameters on
a rectangular block of soft wood, as in Figure 69. If we place this
block between the jaws of a powerful vise and compress it to half its
original width, the circle will become an ellipse and the angles between
the diameters of the ellipse will no longer be right angles. The circle
has the property that its points are equidistant from the center, while
this does not hold true of the ellipse. Thus it might seem that all the
geometrical properties of the original configuration are destroyed by
the compression. But this is far from being the case; for example, the
statement that the center bisects each diameter is true of both the
circle and the ellipse. Here we have a property which persists even
after a rather drastic change in the magnitudes of the original figure.
Fig. 69. Compression of a circle.
This observation suggests the possibility of classifying theorems about a
geometrical figure according to whether they remain true or become false
when the figure is subjected to a uniform compression. More generally,
given any definite class of transformations of a figure (such as the class
of all rigid motions, compressions, inversion in circles, etc.), we may ask
what properties of the figure will be unchanged under this class of
transformations. The body of theorems dealing with these properties
will be the geometry associated with this class of transformations.
§2. FUNDAMENTAL CONCEPTS
1. The Group of Projective Transformations
We first define the class, or "group,"† of projective transformations.
Suppose we have two planes π and π' in space, not necessarily parallel
to each other. We may then perform a central projection of π onto π'
from a given center O not lying in π or π' by defining the image of each
point P of π to be that point P' of π', such that P and P' lie on the same
straight line through O. We may also perform a parallel projection
where the projecting lines are all parallel. In the same way, we can
define the projection of a line l in a plane π onto another line l' in π
from a point O in π or by a parallel projection.
Any mapping of one figure onto another by a central or parallel projection,
or by a finite succession of such projections, is called a projective
transformation.† The projective geometry of the plane or of the line
consists of the body of those geometrical propositions which are unaffected by arbitrary projective transformations of the figures to which
they refer. In contrast, we shall call metric geometry the body of those
propositions dealing with the magnitudes of figures, invariant only under
the class of rigid motions.
Fig. 71. Parallel projection.
Thus the incidence of a point and a line is invariant under the projective
group. From this fact many simple but important consequences follow.
If three or more points are collinear, i.e. incident with some straight line,
then their images will also be collinear. Likewise, if in the plane π
three or more straight lines are concurrent, i.e. incident with some point,
then their images will also be concurrent straight lines. While these
simple properties (incidence, collinearity, and concurrence) are projective
properties (i.e. properties invariant under projections), measures of
length and angle, and ratios of such magnitudes, are generally altered
by projection. Isosceles or equilateral triangles may project into
triangles all of whose sides have different lengths. Hence, although
"triangle" is a concept of projective geometry, "equilateral triangle"
is not, and belongs to metric geometry only.
2. Desargues's Theorem
One of the earliest discoveries of projective geometry was the famous
triangle theorem of Desargues (1593-1662): If in a plane two triangles
ABC and A'B'C' are situated so that the straight lines joining corresponding
vertices are concurrent in a point O, then the corresponding sides, if
extended, will intersect in three collinear points. Figure 72 illustrates
the theorem, and the reader should draw other figures to test it by
experiment. The proof is not trivial, in spite of the simplicity of the
figure, which involves only straight lines. The theorem clearly belongs
to projective geometry, for if we project the whole figure onto
another plane, it will retain all the properties involved in the theorem.
planes, and
that this Desargues's theorem of
geometry is very
easily proved. Suppose that the lines AA', BB', and CC' intersect at
O (Fig. 73), according to hypothesis. Then AB lies in the same plane
limit,
the whole
and the
§3. CROSS-RATIO
1. Definition and Proof of Invariance
Just as the length of a line segment is the key to metric geometry, so
there is one fundamental concept of projective geometry in terms of
which all distinctively projective properties of figures can be expressed.
If three points A, B, C lie on a straight line, a projection will in
general change not only the distances AB and BC but also the ratio
AB/BC. In fact, any three points A, B, C on a straight line l can
always be coördinated with any three points A', B', C' on another line
l' by two successive projections. To do this, we rotate the line l'
about the point C' until it assumes a position l'' parallel to l (see Fig.
74). We then project l onto l'' by a projection parallel to the line
joining C and C', defining three points, A'', B'', and C'' (= C'). The
lines joining A', A'' and B', B'' will intersect in a point O, which we
choose as the center of a second projection. These two projections
accomplish the desired result.†
As we have just seen, no quantity that involves only three points on
a line can be invariant under projection. But, and this is the decisive
discovery of projective geometry, if we have four points A, B, C, D
on a straight line, and project these into A', B', C', D' on another line,
then there is a certain quantity, called the cross-ratio of the four points,
that retains its value under the projection. Here is a mathematical
property of a set of four points on a line that is not destroyed by projection and that can be recognized in any image of the line. The cross-ratio
is neither a length, nor the ratio of two lengths, but the ratio of
two such ratios: if we consider the ratios CA/CB and DA/DB, then
their ratio,
$x = \frac{CA}{CB} \Big/ \frac{DA}{DB}$,
is by definition the cross-ratio of the four points A, B, C, D, taken in that order.
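The proof follows by elementary means. We recall that the area of a
triangle is equal to ½(base × altitude) and is also given by half the
product of any two sides by the sine of the included angle. We then
have, in Figure 75,
area OCA = ½ h·CA = ½ OA·OC sin ∠COA,
area OCB = ½ h·CB = ½ OB·OC sin ∠COB,
area ODA = ½ h·DA = ½ OA·OD sin ∠DOA,
area ODB = ½ h·DB = ½ OB·OD sin ∠DOB.
It follows that
$\frac{CA}{CB} \Big/ \frac{DA}{DB} = \frac{CA}{CB} \cdot \frac{DB}{DA} = \frac{OA \cdot OC \sin\angle COA}{OB \cdot OC \sin\angle COB} \cdot \frac{OB \cdot OD \sin\angle DOB}{OA \cdot OD \sin\angle DOA} = \frac{\sin\angle COA}{\sin\angle COB} \cdot \frac{\sin\angle DOB}{\sin\angle DOA}$.
Fig. 75. Invariance of cross-ratio under central projection.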
where the numbers CA, CB, DA, DB are understood to be taken \\ith
the proper sign. Since a reversal of the chosen positive direction on l
will merely change the sign of every term of this ratio, the value of
(ABCD) w:ill not depend on the direction chosen. It is easily seen that
(ABCD) will be negative or positive according as the pair of points
A, B is or is not separated (i.e. interlocke-d) by the pair C, D. Since
this separation property is invariant under projection, the signed crossratio (ABCD) is invariant also. If we select a fixed point 0 on l as
A
(ABCD)>O
B
C
D
(ABCD)<O
A
FiJ:.77. Signo1crOMtat!o
origin and choose as the coi.irdinatc x of each point on l its directed di.'!
tancc from 0, so that the coi.irdinatcs of A, B, C, D are x1 , x~, x0 , x,,
respectively, then
j1J.4
(ABCD) = {J,1
=
CB DB
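The coördinate formula for the cross-ratio lends itself to a direct check of invariance. The sketch below rests on an assumption not stated in the text: that a projection of one line onto another, expressed in coördinates, takes the fractional-linear form x' = (ax + b)/(cx + d) with $ad - bc \neq 0$. The function names are hypothetical.

```python
from fractions import Fraction as Fr

def cross_ratio(x1, x2, x3, x4):
    """(ABCD) for points with coordinates x1, x2, x3, x4 on a line."""
    return ((x3 - x1) / (x3 - x2)) / ((x4 - x1) / (x4 - x2))

def fractional_linear(x, a, b, c, d):
    """Image of x under x -> (a*x + b)/(c*x + d), assuming c*x + d != 0."""
    return (a * x + b) / (c * x + d)
```

For the points 0, 1, 2, 3 the cross-ratio is 4/3, and it keeps this value after the map x' = (2x + 1)/(x + 3). Taking the points in the order 0, 2, 1, 3, where the pair A, B is separated by the pair C, D, gives the negative value -1/3.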
When (ABCD) = -1, so that CA/CB = -DA/DB, then C and D
Fig. 78. Cross-ratio in terms of coördinates.
$\lambda$,    $1 - \lambda$,    $1/\lambda$,    $\frac{\lambda - 1}{\lambda}$,    $\frac{1}{1 - \lambda}$,    $\frac{\lambda}{\lambda - 1}$.
These six quantities are in general distinct, but two of them may coincide, as in the case of harmonic division, when λ = -1.
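That the 24 orderings of four points give only the six values λ, 1 - λ, 1/λ, (λ - 1)/λ, 1/(1 - λ), λ/(λ - 1) can be confirmed by enumeration. The sketch below uses exact fractions; the function names are hypothetical.

```python
from fractions import Fraction as Fr
from itertools import permutations

def cross_ratio(x1, x2, x3, x4):
    """(ABCD) for points with coordinates x1, x2, x3, x4 on a line."""
    return ((x3 - x1) / (x3 - x2)) / ((x4 - x1) / (x4 - x2))

def cross_ratio_values(points):
    """Set of cross-ratios obtained from all orderings of four distinct points."""
    return {cross_ratio(*order) for order in permutations(points)}
```

For the points 0, 1, 2, 3 (λ = 4/3) the set has six elements; for the harmonic set 0, 3, 2, 6 (λ = -1) the values coincide in pairs and only -1, 2 and 1/2 remain.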
We may also define the cross-ratio of four coplanar (i.e. lying in a
common plane) and concurrent straight lines 1, 2, 3, 4 as the cross-ratio
of the four points of intersection of these lines with another straight
line lying in the same plane. The position of this fifth line is immaterial because of the invariance of the cross-ratio under projection.
Equivalent to this is the definition
$(1\,2\,3\,4) = \frac{\sin(1, 3)}{\sin(2, 3)} \Big/ \frac{\sin(1, 4)}{\sin(2, 4)}$,
taken with a plus or minus sign according as one pair of lines does not
or does separate the other. (In this formula, (1, 3), for example, means
the angle between the lines 1 and 3.) Finally, we may define the cross-ratio
of four coaxial planes (four planes in space intersecting in a line l,
their axis). If a straight line intersects the planes in four points, these
points will always have the same cross-ratio, whatever the position of
the line may be. (The proof of this fact is left as an exercise.) Hence
we may assign this value as the cross-ratio of the four planes. Equivalently, we may define the cross-ratio of four coaxial planes as the cross-ratio
of the four lines in which they are intersected by any fifth plane
(see Fig. 79).
general, the problem has one and only one solution; for, if x is the coördinate of the desired point D, then the equation
(2)
equation that x
land l".
Fig. 80. Projective correspondence between the points on two lines.
Exercises: 1) Prove that, given two lines together with a projective correspondence between their points, one can shift one line by a parallel displacement into such a position that the given correspondence is obtained by a simple
projection. (Hint: Bring a pair of corresponding points of the two lines into
coincidence.)
2) On the basis of the preceding result, show that if the points of two lines
l and l' are coördinated by any finite succession of projections onto various intermediate lines, using arbitrary centers of projection, the same result can be obtained by only two projections.
Fig. 81. Complete quadrilateral.
x = (ABCD) = (IFHD)
by projection from E,
(IFHD) = (BACD)
by projection from G.
PROJECTIVE GEOMETRY. AXIOMATICS [IV]
Fig. 82.
say that the two lines intersect at a "point at infinity." The essential
thing is then to give this vague statement a precise meaning, so that
points at infinity, or, as they are sometimes called, ideal points, can be
dealt with exactly as though they were ordinary points in the plane or
in space. In other words, we want all rules concerning the behavior of
points, lines, planes, etc. to persist, even when these geometric elements
are ideal. To achieve this goal we can proceed either intuitively or
formally, just as we did in extending the number system, where one
approach was from the intuitive idea of measuring, and another from
the formal rules of arithmetical operations.
First, let us realize that in synthetic geometry even the basic concepts of
"ordinary" point and line are not mathematically defined. The so-called
definitions of these concepts which are frequently found in textbooks on
elementary geometry are only suggestive descriptions. In the case of
ordinary geometrical elements our intuition makes us feel at ease
as far as their "existence" is concerned. But all we really need in
geometry, considered as a mathematical system, is the validity of certain
rules by means of which we can operate with these concepts, as in
joining points, finding the intersection of lines, etc. Logically considered, a "point" is not a "thing in itself," but is completely described
by the totality of statements by which it is related to other objects.
The mathematical existence of "points at infinity" will be assured as soon
as we have stated in a clear and consistent manner the mathematical
properties of these new entities, i.e. their relations to "ordinary" points
and to each other. The ordinary axioms of geometry (e.g. Euclid's)
are abstractions from the physical world of pencil and chalk marks,
stretched strings, light rays, rigid rods, etc. The properties which these
axioms attribute to mathematical points and lines are highly simplified
and idealized descriptions of the behavior of their physical counterparts.
Through any two actual pencil dots not one but many pencil lines can
be drawn. If the dots become smaller and smaller in diameter then all
these lines will have approximately the same appearance. This is what
we have in mind when we state as an axiom of geometry that "through
any two points one and only one straight line may be drawn"; we are
referring not to physical points and lines but to the abstract and conceptual points and lines of geometry. Geometrical points and lines
have simpler properties than do any physical objects, and
this simplification provides the essential condition for the development
of geometry as a deductive science.
As we have noticed, the ordinary geometry of points and lines is
greatly complicated by the fact that a pair of parallel lines do not intersect in a point. We are therefore led to make a further simplification
in the structure of geometry by enlarging the concept of geometrical
point in order to remove this exception, just as we enlarged the concept
of number in order to remove the restrictions on subtraction and division. Here also we shall be guided throughout by the desire to preserve
in the extended domain the laws which governed the original domain.
We shall therefore agree to add to the ordinary points on each line a single "ideal" point. This
point shall be considered to belong to all the lines parallel to the given line
and to no other lines. As a consequence of
this convention every pair of lines in the plane will now intersect in a
single point; if the lines are not parallel they will intersect in an ordinary
point, while if the lines are parallel they will intersect in the ideal point
common to the two lines. For intuitive reasons the ideal point on a
line is called the point at infinity on the line.
We shall also agree to add to the ordinary lines in a plane a single "ideal"
line (also called the line at infinity in the plane), containing all the ideal
points in the plane and no other points. Precisely this convention is
forced upon us if we wish to preserve the original law that through
every two points one line may be drawn, and the newly gained law that
every two lines intersect in a point. To see this, let us choose any two
ideal points. Then the unique line which is required to pass through
these points cannot be an ordinary line, since by our agreement any
ordinary line contains but one ideal point. Moreover, this line cannot
contain any ordinary points, since an ordinary point and one ideal point
determine an ordinary line. Finally, this line must contain all the
ideal points, since we wish it to have a point in common with every
ordinary line. Hence this line must have precisely the properties which
we have assigned to the ideal line.
a sequence
The statement that the intersection of two parallel lines is a point at infinity has no
mysterious connotation, but is only a convenient way of stating that the
lines are parallel. This way of expressing parallelism, in the language
Fig. 83. Projection into elements at infinity.
lishes a correspondence between the points and lines of π and those of π'.
To every point A of π corresponds a unique point A' of π', with the
following exceptions: if the projecting ray through O is parallel to the
the space and containing all lines at infinity. Each ordinary plane
intersects the plane at infinity in its line at infinity.
(ABCP) = (CA/CB) / (PA/PB),
and as P recedes to infinity, PA/PB approaches 1. Hence we define
(ABC∞) = CA/CB.
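The limiting process behind this definition can be watched numerically. The following is a small sketch under assumed sample coördinates (A = 0, B = 1, C = 3, chosen only for illustration); the function names are hypothetical.

```python
from fractions import Fraction


def cross_ratio(a, b, c, p):
    """(ABCP) = (CA/CB) / (PA/PB) in terms of coordinates on the line."""
    return ((c - a) / (c - b)) / ((p - a) / (p - b))


def cross_ratio_at_infinity(a, b, c):
    """The defined value (ABC, infinity) = CA/CB."""
    return (c - a) / (c - b)


# Sample coordinates for A, B, C (an assumption made for this illustration).
a, b, c = Fraction(0), Fraction(1), Fraction(3)
limit = cross_ratio_at_infinity(a, b, c)

# Let P recede along the line: p = 10, 100, ..., 100000.
gaps = [abs(cross_ratio(a, b, c, Fraction(10) ** k) - limit) for k in range(1, 6)]

for k, gap in enumerate(gaps, start=1):
    print(f"p = 10^{k}: |(ABCP) - CA/CB| = {gap}")
```

For these sample points the gap equals 3/(2p), so it shrinks in proportion to 1/p as P moves away.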
5. APPLICATIONS
1. Preliminary Remarks
With the introduction of elements at infinity it is no longer necessary
to state explicitly the exceptional cases that arise in constructions and
theorems when two or more lines are parallel. We need merely remember that when a point is at infinity all the lines through that point
are parallel. The distinction between central and parallel projection
need no longer be made, since the latter simply means projection from
a point at infinity. In Figure 72 the point O or the line PQR may be
state~
F>~
cen~r
o,tinfinity.
Not only the statement but even the proof of a projective theorem is
often made simpler by the use of elements at infinity. The general
principle is the following. By the "projective class" of a geometrical
figure F we mean the class of all figures into which F may be carried
by projective transformations. The projective properties of F will be
identical with those of any member of its projective class, since projective properties are by definition invariant under projection. Thus,
any projective theorem (one involving only projective properties) that
is true of F will be true of any member of the projective class of F,
and conversely. Hence, in order to prove any such theorem for F, it
suffices to prove it for any other member of the projective class of F.
We may often take advantage of this by finding a special member of
the projective class of F for which the theorem is simpler to prove than
for F itself. For example, any two points A, B of a plane π can be
projected to infinity by projecting from a center O onto a plane π'
parallel to the plane of O, A, B; the straight lines through A and those
through B will be transformed into two families of parallel lines. In
the projective theorems to be proved in this section we shall make such
a preliminary transformation.
The following elementary fact about parallel lines will be useful. Let
two straight lines, intersecting at a point O, be cut by a pair of lines l1
If l1 and l2 are parallel, then
OA/OC = OB/OD;
and conversely, if
OA/OC = OB/OD,
then l1 and l2 are parallel.
Fig. 86.
The proof
We now give the proof that for two triangles ABC and A'B'C' in a
plane situated as shown in Figure 72, where the lines through corresponding vertices meet in a point, the intersections P, Q, R of the corresponding sides lie on a straight line. To do this we first project the
figure so that Q and R go to infinity. After the projection, AB will be
parallel to A'B', AC to A'C', and the figure will appear as shown in
Figure 87. As we have pointed out in Article 1 of this section, to
B'C'; then P, Q, R will indeed be collinear (since they will lie on the line
at infinity). Now
AB ∥ A'B' implies
and
AC ∥ A'C' implies
cross-ratio (see p. 177), then this proof remains entirely in the plane.
Exercise: Prove, in a similar manner, the converse of Desargues's theorem: If
triangles ABC and A'B'C' have the property that P, Q, R are collinear, then the
lines AA', BB', CC' are concurrent.
3. Pascal's Theorem†
This theorem states: If the vertices of a hexagon lie alternately on a
pair of intersecting lines, then the three intersections P, Q, R of the opposite
sides of the hexagon are collinear (Fig. 88). (The hexagon may intersect
itself. The "opposite" sides can be recognized from the schematic
a/(a + x) = (b + y)/(b + y + s),
b/(b + y) = (a + x)/(a + x + r).
Therefore
a/b = (a + x + r)/(b + y + s),
so that 16 ∥ 34, as was to be proved.
†On p. 209 we shall discuss a more general theorem of the same type. The
present special case is also known by the name of its discoverer, Pappus of Alexandria (third century A.D.).
4. Brianchon's Theorem
This theorem states: If the sides of a hexagon pass alternately through
two fixed points P and Q, then the three diagonals joining opposite pairs
of vertices of the hexagon are concurrent (see Fig. 91). By a projection
we may send to infinity the point P and the point where two of the
diagonals, say 14 and 36, intersect. The situation will then appear
as in Figure 92. Since 14 ∥ 36 we have a/b = u/v. But x/y = a/b
and u/v = r/s. Therefore x/y = r/s and 36 ∥ 25, so that all three of
the diagonals are parallel and therefore concurrent. This suffices to
prove the theorem in the general case.
5. Remark on Duality
The reader may have noticed the remarkable similarity between the
theorems of Pascal (1623-1662) and Brianchon (1785-1864). This similarity becomes particularly striking if we write the theorems side by side:
Pascal's Theorem
Brianchon's Theorem
Not only the theorems of Pascal and Brianchon, but all the theorems
of projective geometry occur in pairs, each similar to the other, and, so
to speak, identical in structure. This relationship is called duality. In
plane geometry point and line are called dual elements. Drawing a line
through a point, and marking a point on a line are dual operations. Two
figures are dual if one may be obtained from the other by replacing each
element and operation by its dual element or operation. Two theorems
are dual if one becomes the other when all elements and operations are
replaced by their duals. For example, Pascal's and Brianchon's theorems are dual, and the dual of Desargues's theorem is precisely its converse. This phenomenon of duality gives projective geometry a character quite distinct from that of elementary (metric) geometry, in which
no such duality exists. (For example, it would be meaningless to speak
of the dual of an angle of 37° or of a segment of length 2.) In many
textbooks on projective geometry the principle of duality, which states
that the dual of any true theorem of projective geometry is likewise a true
theorem of projective geometry, is exhibited by placing the dual theorems
together with their dual proofs in parallel columns on the page, as we
have done above. The basic reason for this duality will be considered
in the following section (see also p. 217).
6. ANALYTIC REPRESENTATION
1. Introductory Remarks
In the early development of projective geometry there was a strong
tendency to build everything on a synthetic and "purely geometric"
basis, avoiding the use of numbers and of algebraic methods. This
program met with great difficulties, since there always remained places
where some algebraic formulation seemed unavoidable. Complete success in building up a purely synthetic projective geometry was only
ax + by + c = 0.    (1)
(x - a)^2 + (y - b)^2 = r^2
x^2/a^2 + y^2/b^2 = 1
etc.
The naive approach to analytic geometry is to start with purely
"geometric" concepts (point, line, etc.) and then to translate these
into the language of numbers. The modern viewpoint is the reverse.
We start with the set of all pairs of numbers x, y and call each such pair
a point, since we can, if we choose, interpret or visualize such a pair of
ax + by + cz = 0.
Hence this is the equation in homogeneous coördinates of a straight
line in π.
Now that the geometrical model of the points of π as lines through O
has served its purpose, we may lay it aside and give the following
purely analytic definition of the extended plane:
A point is an ordered triple of real numbers (x, y, z), not all of which
are zero. Two such triples, (x1, y1, z1) and (x2, y2, z2), define the same
point if for some t ≠ 0,
ax + by + cz = 0,
where a, b, c are any three constants, not all zero. In particular, the
points at infinity in π all satisfy the linear equation
z = 0.    (2)
This is by definition a line, and is called the line at infinity in π. Since
a line is defined by an equation of the form (1'), we call the triple of
numbers (a, b, c) the homogeneous coördinates of the line (1'). It follows
that (ta, tb, tc), for any t ≠ 0, are also coördinates of the line (1'), since
the equation
(ta)x + (tb)y + (tc)z = 0    (3)
is satisfied
In these
ax + by + cz = 0,
and this is likewise the condition that the point whose coördinates are
(a, b, c) lie on the line whose coördinates are (x, y, z). For example,
the identity
2·3 + 1·4 - 5·2 = 0
aX + bY + c = 0
will represent a straight line in π. On substituting X = x/z, Y = y/z and multiplying through by z we find that the equation of the same line in homogeneous
coördinates is, as stated on page 195,
ax + by + cz = 0.
x' = a1x + b1y + c1z,
y' = a2x + b2y + c2z,    (4)
z' = a3x + b3y + c3z,
connecting the homogeneous coördinates x', y', z' of the points in the plane π'
with the homogeneous coördinates x, y, z of the points in the plane π. From
our present point of view we may now define a projective transformation as one
given by any set of linear equations of the form (4). The theorems of projective
geometry then become theorems on the behavior of number triples (x, y, z) under
such transformations. For example, the proof that the cross-ratio of four points
on a line is unchanged by such transformations becomes simply an exercise in
the algebra of linear transformations. We cannot go further into the details of
this analytic procedure. Instead we shall return to the more intuitive aspects
of projective geometry.
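The analytic definitions of this section translate directly into arithmetic on number triples. The following is a minimal sketch, not the authors' procedure: it assumes the standard device of using the cross product of two triples to obtain the line joining two points or the point common to two lines, and all function names and the sample matrix are hypothetical.

```python
def cross(u, v):
    """Cross product of two triples: join of two points, or meet of two lines."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """The expression ax + by + cz for a line (a, b, c) and a point (x, y, z)."""
    return sum(s * t for s, t in zip(u, v))


def incident(point, line):
    """A point lies on a line exactly when ax + by + cz = 0."""
    return dot(line, point) == 0


def same_element(u, v):
    """Two triples define the same point (or line) when they are proportional."""
    return cross(u, v) == (0, 0, 0)


def transform(matrix, point):
    """Apply linear equations of the form (4) to a point triple."""
    return tuple(dot(row, point) for row in matrix)


def collinear(p, q, r):
    """Three points are collinear when the third lies on the join of the others."""
    return dot(cross(p, q), r) == 0


# The identity 2*3 + 1*4 - 5*2 = 0, read in two dual ways.
print(incident((3, 4, 2), (2, 1, -5)), incident((2, 1, -5), (3, 4, 2)))

# The parallel lines X + Y = 1 and X + Y = 2 meet in a point with z = 0.
ideal = cross((1, 1, -1), (1, 1, -2))
print(ideal, incident(ideal, (0, 0, 1)))

# A sample transformation with non-zero determinant keeps collinear points collinear.
m = ((1, 2, 0), (0, 1, 1), (1, 0, 1))
p, q, r = (0, 0, 1), (1, 1, 1), (2, 2, 1)
print(collinear(p, q, r), collinear(*(transform(m, v) for v in (p, q, r))))
```

Integer triples are used throughout so that every incidence test is exact.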
7. PROBLEMS ON CONSTRUCTIONS WITH THE STRAIGHTEDGE ALONE
ax^2 + by^2 + cxy + dx + ey + f = 0,
is either one of the three conics, a straight line, a pair of straight lines,
a point, or imaginary. This is usually proved by introducing a new and
suitable coördinate system, as is done in any course in analytic geometry.
These definitions of the conic sections are essentially metric, since
they make use of the concept of distance. But there is another definition that establishes the place of the conic sections in projective
geometry: The conic sections are simply the projections of a circle on a plane.
If we project a circle C from a point O, then the projecting lines will
form an infinite double cone, and the intersection of this cone with a
plane π will be the projection of C. This intersection will be an ellipse
or a hyperbola according as the plane cuts one or both portions of the
cone. The intermediate case of the parabola occurs if π is parallel to
one of the lines through O (see Fig. 94).
The projecting cone need not be a right circular cone with its vertex O
perpendicularly above the center of the circle C; it may also be oblique.
In all cases, as we shall here accept without proof, the intersection of
a cone with a plane will be a curve whose equation is of second degree;
and conversely, every curve of second degree can be obtained from a
circle by such a projection. It is for this reason that the curves of
second degree are called conic sections.
When the plane intersects only one portion of a right circular cone
we have stated that the curve of intersection E is an ellipse. We may
prove that E satisfies the usual focal definition of the ellipse, as given
above, by a simple but beautiful argument given in 1822 by the Belgian
mathematician G. P. Dandelin. The proof is based on the introduction
of the two spheres S1 and S2 (Fig. 95), which are tangent to π at the
points F1 and F2, respectively, and which touch the cone along the
parallel circles K1 and K2 respectively. We join an arbitrary point
P of E with F1 and F2 and draw the line joining P to the vertex O of the
cone. This line lies entirely on the surface of the cone, and intersects
the circles K1 and K2 in the points Q1 and Q2 respectively. Now PF1
and PQ1 are both tangents from P to S1, so that
PF1 = PQ1.
Similarly,
PF2 = PQ2.
Adding these two equations we obtain
PF1 + PF2 = PQ1 + PQ2.
Fig. 94. Conic sections.
But PQ1 + PQ2 = Q1Q2 is just the distance along the surface of the
cone between the parallel circles K1 and K2, and is therefore independent
of the particular choice of the point P on E. The resulting equation,
PF1 + PF2 = constant
E is
circle into any conic K, we shall obtain on K four points, again called
A, B, C, D, two other points O, O', and the two quadruples of lines
a, b, c, d and a', b', c', d'. These quadruples will not be congruent,
since equality of angles is in general destroyed by projection. But since
cross-ratio is invariant under projection, the equality (a b c d) =
(a' b' c' d') will still hold. This leads to a fundamental theorem:
If any four given points A, B, C, D of a conic K are joined to a fifth point
O of K by lines a, b, c, d, then the value of the cross-ratio (a b c d) is
independent of the position of O on K.
Then any four lines a, b, c, d of the pencil O will have the same cross-ratio
as the four corresponding lines a', b', c', d' of O'. Any biunique correspondence between two pencils of lines which has this property is
called a projective correspondence.
0"
circumference,
degenerate into a straight line;
and all its points are counted as belonging to the locus. Hence the
conic degenerates into a pair of lines, which agrees with the fact that there
are sections of a cone (those obtained by planes through the vertex)
which consist of two lines.
Fig. 99. Circle and equilateral hyperbola generated by projective pencils.
invariant under projection, a proof for the case of the circle will suffice
to establish the theorem in general.
Fig. 100. A circle as a set of tangents.
Similarly,
by the arc TQ at a
where
†The set of points on a straight line is called a range of points. This is the
dual of a pencil of lines.
A conic as a set of points consists of the points of intersection of corresponding lines in two projectively related pencils of lines.
A conic as a set of lines consists of the lines joining corresponding points in two projectively related ranges of points.
Fig. 103. A parabola.
If we regard the
the point itself, and
tangents) as the dual of a "point curve" (the set of all its points), then
the complete duality between these two statements is apparent. In the
translation from one statement to the other, replacing each concept by
its dual, the word "conic" remains the same: in one case it is a "point
conic," defined by its points; in the other a "line conic," defined by its
tangents. (See Fig. 100, p. 206.)
An important consequence of this fact is that the principle of duality
in plane projective geometry, originally stated for points and lines only,
may now be extended to cover conics. If, in the statement of any theorem
concerning points, lines, and conics, each element is replaced by its dual
(keeping in mind that the dual of a point on a conic is a tangent to the
conic), the result will also be a true theorem. An example of the working
of this principle will be found in Article 4 of this section.
The construction of conics as line curves is shown in Figures 103-104.
If, on the two projectively related point ranges, the two points at
infinity correspond to each other (as must be the case with congruent or
similar ranges), the conic will be a parabola; the converse is also true.
Exercise:
Mark the points of intersection of (1, 2) with (4, 5), (2, 3) with (5, 6),
and (3, 4) with (6, 1). Then these three points of intersection lie on a
straight line.
Brianchon's theorem: Given six tangents, 1, 2, 3, 4, 5, 6, to a conic.
Successive tangents intersect in the points (1, 2), (2, 3), (3, 4), (4, 5),
(5, 6), (6, 1). Draw the lines joining (1, 2) with (4, 5), (2, 3) with
(5, 6), and (3, 4) with (6, 1). Then these lines go through a point.
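Both statements can be tested on a concrete conic. The sketch below is an illustration under stated assumptions rather than a proof: it takes the unit circle x^2 + y^2 = z^2 in homogeneous coördinates, picks six rational points on it by the usual parametrization (an arbitrary choice of parameters), uses the cross product of triples for joins and intersections, and uses the fact that the tangent at (x0, y0, z0) has line coördinates (x0, y0, -z0). All names are hypothetical.

```python
def cross(u, v):
    """Join of two points or intersection of two lines, as triples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(s * t for s, t in zip(u, v))


def circle_point(t):
    """Integer homogeneous coordinates of a point on x^2 + y^2 = z^2."""
    return (1 - t * t, 2 * t, 1 + t * t)


def tangent_at(point):
    """Line coordinates of the tangent to x^2 + y^2 = z^2 at the given point."""
    x, y, z = point
    return (x, y, -z)


# Six points on the circle (the parameter values are an arbitrary sample).
p1, p2, p3, p4, p5, p6 = (circle_point(t) for t in (0, 1, 2, 3, -1, -2))

# Pascal: intersections of (1,2) with (4,5), (2,3) with (5,6), (3,4) with (6,1).
a = cross(cross(p1, p2), cross(p4, p5))
b = cross(cross(p2, p3), cross(p5, p6))
c = cross(cross(p3, p4), cross(p6, p1))
pascal_collinear = dot(cross(a, b), c) == 0

# Brianchon: tangents at the same six points, then the dual construction.
t1, t2, t3, t4, t5, t6 = (tangent_at(p) for p in (p1, p2, p3, p4, p5, p6))
d1 = cross(cross(t1, t2), cross(t4, t5))
d2 = cross(cross(t2, t3), cross(t5, t6))
d3 = cross(cross(t3, t4), cross(t6, t1))
brianchon_concurrent = dot(cross(d1, d2), d3) == 0

print(pascal_collinear, brianchon_concurrent)
```

The second half of the code is obtained from the first by exchanging the words point and line, which is the duality described in the text.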
Fig. 105. Pascal's general configuration.
BA, we obtain
k = (XAB∞) = BX:BA.
Hence we have
BX:BA = YP:YA,
tion, while each line of one family intersects all the lines of the other
family.
Fig. 108. Construction of lines intersecting three fixed lines in general position.
Fig. 109. The hyperboloid.
ratio of the four points where any four given lines of one family intersect
a given line of the other family is independent of the position of the
choice of the propositions selected as axioms is to a large extent arbitrary. But little is gained by the axiomatic method unless the postulates are simple and not too great in number. Moreover, the postulates
must be consistent, in the sense that no two theorems deducible from
them can be mutually contradictory, and complete, so that every theorem
of the system is deducible from them. For reasons of economy it is
also desirable that the postulates be independent, in the sense that no
one of them is a logical consequence of the others. The question of the
consistency and of the completeness of a set of axioms has been the
subject of much controversy. Different philosophical convictions concerning the ultimate roots of human knowledge have led to apparently
irreconcilable views on the foundations of mathematics. If mathematical entities are considered as substantial objects in a realm of "pure intuition", independent of definitions and of individual acts of the human
mind, then of course there can be no contradictions, since mathematical
facts are objectively true statements describing existing realities. From
this "Kantian" point of view there is no problem of consistency. Unfortunately, however, the actual body of mathematics cannot be fitted
into such a simple philosophical framework. The modern mathematical
intuitionists do not rely on pure intuition in the broad Kantian sense.
They accept the denumerably infinite as the legitimate child of intuition,
and they admit only constructive properties; but thus basic concepts
such as the number continuum would be banished, important parts
of actual mathematics excluded, and the rest almost hopelessly complicated.
Quite different is the view taken by the "formalists." They do not
attribute an intuitive reality to mathematical objects, nor do they claim
that axioms express obvious truths concerning the realities of pure
intuition; their concern is only with the formal logical procedure of
reasoning on the basis of postulates. This attitude has a definite advantage over intuitionism, since it grants to mathematics all the freedom
necessary for theory and applications. But it imposes on the formalist
the necessity of proving that his axioms, now appearing as arbitrary
creations of the human mind, cannot possibly lead to a contradiction.
Great efforts have been made during the last twenty years to find such
consistency proofs, at least for the axioms of arithmetic and algebra
and for the concept of the number continuum. The results are
significant, but success is still far off. Indeed, recent results indicate
that such efforts cannot be completely successful, in the sense that
proofs for consistency and completeness are not possible within strictly
closed
work in the field was stimulated by the ideas of the English geometer
Cayley (1821-1895). In this model, infinitely many "straight lines"
can be drawn "parallel" to a given line through an external point.
is called Bolyai-Lobachevskian or
Klein's model is
that
any "point" not on a
line" infinitely many
"straight
can be drawn having no
in common with the
given "line." The first "straight line" i:,; a Euclidean chord of the
circle, while the second "straight line" may be any one of the chords
which
through th0 given "point" and do not intersect the first
the circle. This simple model is quite sufficient to settle
the fundamental question whith gave rL~e to non-Euclidean geometry;
that
postulate cannot be deduced from the other
"inm"''' l<:nnli<Jnon geometry. For if it could be so deduced, it would
be a true theorem in the geometry of Klein's model, and we have seen
that it is not.
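The failure of the parallel postulate in this model can be observed by direct computation. The following is a rough sketch under assumptions chosen only for illustration: the circle is the unit circle, the given "line" is the chord x = 1/2, the given "point" is the center, and chords through it are sampled at whole-degree directions. The function names are hypothetical.

```python
import math


def line_through(p, q):
    """Coefficients (a, b, c) of the Euclidean line ax + by + c = 0 through p and q."""
    (x1, y1), (x2, y2) = p, q
    return (y1 - y2, x2 - x1, x1 * y2 - x2 * y1)


def meet(l, m):
    """Homogeneous coordinates of the intersection of two lines."""
    return (l[1] * m[2] - l[2] * m[1],
            l[2] * m[0] - l[0] * m[2],
            l[0] * m[1] - l[1] * m[0])


def is_klein_parallel(p, theta, a, b):
    """True if the chord through p in direction theta has no point in common
    with the chord ab inside the unit circle."""
    q = (p[0] + math.cos(theta), p[1] + math.sin(theta))
    x, y, z = meet(line_through(p, q), line_through(a, b))
    if abs(z) < 1e-12:
        return True  # the Euclidean lines themselves are parallel
    return (x / z) ** 2 + (y / z) ** 2 >= 1.0


# The given "line": the chord x = 1/2, with its endpoints on the unit circle.
half_height = math.sqrt(0.75)
a, b = (0.5, -half_height), (0.5, half_height)

# The given "point": the center of the circle.
p = (0.0, 0.0)

# Sample the chords through p at whole-degree directions.
parallels = [deg for deg in range(180)
             if is_klein_parallel(p, math.radians(deg), a, b)]
print(len(parallels), parallels[:3], parallels[-3:])
```

For this configuration the non-intersecting directions fill the range from about 60 to about 120 degrees, so a whole family of "parallels" passes through the one "point".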
+ (OSRQ)
;< (OSRP).
(!)
(OSRP).
tion
the geometry
physical world? As we have already seen,
experiment can never decide whether there is but one or whether there
are infinitely many straight lines through a point and parallel to a given
line. In Euclidt'an geometry, however, the sum of the angles of any
triangle iR 180, while it can be shown that in hyperbolic
the
sum is less than 180. Gauss accordingly performed an
to
settle the question. Be accurately measured the angles
a triangle
formed by three fairly distant mountain peaks, and found the angle-sum
to be 180°, within the limits of experimental error. Had the result been
noticeably less than 180°, the consequence would have been that hyperbolic geometry is preferable to describe physical reality. But, as it
turned out, nothing was settled by this experiment, since for small triangles whose sides are only a few miles in length the deviation from
180° in the hyperbolic geometry might be so small as to have been undetectable by Gauss's instruments. Thus, although the experiment was
inconclusive, it showed that the Euclidean and hyperbolic geometries,
which differ widely in the large, coincide so closely for relatively small
figures that they are experimentally equivalent. Therefore, as long
as purely local properties of space are under consideration, the choice
between the two geometries is to be made solely on the basis of simplicity
and convenience. Since the Euclidean system is rather simpler to
deal with, we are justified in using it exclusively, as long as fairly small
distances (of a few million miles!) are under consideration. But we
should not necessarily expect it to be suitable for describing the universe
as a whole, in its largest aspects. The situation here is precisely
analogous to that which exists in physics, where the systems of Newton
and Einstein give the same results for small distances and velocities,
but diverge when very large magnitudes are involved.
The revolutionary importance of the discovery of non-Euclidean geometry
lay in the fact that it demolished the notion of the axioms of Euclid
as the immutable mathematical framework into which our
experimental knowledge of physical reality must be fitted.
4. Poincare's Model
The mathematician is free to consider a "geometry" as defined by any
set of consistent axioms about "points," "straight lines," etc.; his investigations will be useful to the physicist only if
these axioms correspond to the physical behavior of objects in the world. From this
point of view we wish to examine the meaning of the statement "light
travels in a straight line." If this is regarded as the physical definition
of "straight line," then the axioms of geometry must be so chosen as to
correspond with the behavior of light rays. Let us imagine, with Poincaré, a world composed of the interior of a circle C, and such that the
velocity of light at any point inside the circle is equal to the distance of
that point from the circumference. It can be proved that rays of light
will then take the form of circular arcs perpendicular at their extremities
to the circumference C. In such a world, the geometrical properties of
"straight lines" (defined as light rays) will differ from the Euclidean
(in
thiE~
light ray
geometry.
Lobachcvc:kian gcomct,y,
or Bolvai<thctadt
the line is
infinite (the infinite extent of the line is essentially tied up with the
concept and the axioms of "betweenness"). But after hyperbolic
geometry had opened the way for freedom in constructing geometries,
it was only natural to ask whether different non-Euclidean geometries
could be constructed in which a straight line is not infinite but finite
and closed. Of course, in such geometries not only the parallel postulate, but also the axioms of "betweenness" will have to be abandoned.
Modern developments have brought out the physical importance of
these geometries. They were first considered in the inaugural address
delivered in 1851 by Riemann upon his admission as an unpaid instructor ("Privat-Docent") at the University of Goettingen. Geometries with closed finite lines can be constructed in a completely consistent
Fig. 113. "Straight lines" in a Riemannian geometry.
sarily a sphere, and let us define the "straight line" joining any two
points to be the curve of shortest length or "geodesic" joining these
points. The points of the surface can be divided into two classes: 1.
Points in the neighborhood of which the surface is like a sphere in that
it lies wholly on one side of the tangent plane at the point. 2. Points
in the neighborhood of which the surface is saddle-shaped, and lies on
both sides of the tangent plane at the point. Points of the first kind
Fig. 115. Hyperbolic point.
APPENDIX
GEOMETRY IN MORE THAN THREE DIMENSIONS
1. Introduction
ordinary sense is
on many occasions it is quite convenient to speak
of "spaces" having four or more dimensions. What is the meaning of
an n-dimensional space when n is greater than three, and what purposes
can it serve? An answer can be given from the analytic as well as from
the purely geometric point of view. The terminology of n-dimensional
2. Analytie Approach
We have already remarked on the inversion of meaning which came
about in the course of development of analytic geometry. Points, lines,
curves, etc. were originally considered to be purely "geometrical"
entities, and the task of analytic geometry was merely to assign to them
systems of numbers or equations, and to interpret or to develop g<'ometri~
cal theory by algebraic or analytic methods. In the course of time the
opposite point of view began increasingly to assert itself, A number x,
or a pair of numbers x, y, or a triple of numbers x, y, z were considered
as the fundamental objects, and these analytic entities we-re then "visual~
ized" a<; points on a line, in a plane, or in spa<'e. From this
of
view Jeometrieal language serves only to state relation:-;
numbers. We may discard the primary or pven the indPpendent Pharacter of geometrical objects by saying that a numher pair x, y is a point
in the plane, the set of all number pairs x, y that ><ati:-;fy the linear
equation L(x, y) = ax + by + c = 0 v.ith fixed numbers a, b, c
a
line, etc. Similar definitions may be made in space of three dimensions.
Even if we are primarily interested in an algebraic
be that the language of geometry lends itself to an
scription of it, and that geometrical intuition suggests
appropriate
algebraic procedure. For example, if we wish to solve three simultaneous linear equations for three unknown quantities x, y, z:
$L(x, y, z) = ax + by + cz + d = 0,$
$L'(x, y, z) = a'x + b'y + c'z + d' = 0,$
$L''(x, y, z) = a''x + b''y + c''z + d'' = 0,$
L(x, y) = ax + by + d > 0
L(x, y, z) = ax + by + cz + d > 0
"half-space"
L(x, y, z) = 0.
The introduction of a "four-dimensional space" or even an "n-dimensional space" is now quite natural. Let us consider a quadruple of
numbers x, y, z, t. Such a quadruple is said to be represented by, or
simply, to be, a point in four-dimensional space $R_4$. More generally,
a point of n-dimensional space $R_n$ is by definition simply an ordered set
of n real numbers $x_1, x_2, \ldots, x_n$. It does not matter that we cannot
visualize such a point. The geometrical language remains just as
suggestive for algebraic properties involving four or n variables. The
reason for this is that many of the algebraic properties of linear equations, etc. are essentially independent of the number of variables involved, or, as we may say, of the dimension of the space of the
variables. For example, we call "hyperplane" the totality of all points
$x_1, x_2, \ldots, x_n$ in the n-dimensional space $R_n$ which satisfy a linear
equation
$L(x_1, x_2, \ldots, x_n) = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n + b = 0.$
$L_1(x_1, x_2, \ldots, x_n) = 0,$
$L_2(x_1, x_2, \ldots, x_n) = 0,$
$\cdots$
$L_n(x_1, x_2, \ldots, x_n) = 0,$
quadruples x, y, z, t. By the introduction of a non-Euclidean hyperbolic geometry into this analytic framework, it became possible to describe many otherwise complex situations with remarkable simplicity.
Similar advantages have accrued in mechanics and statistical physics,
as well as in purely mathematical fields.
Here are some examples from mathematics. The totality of all circles
in the plane forms a three-dimensional manifold, because a circle with
center x, y and radius t can be represented by a point with the coordinates x, y, t. Since the radius of a circle is a positive number, the
totality of points representing circles fills out a half-space. In the
same way, the totality of all spheres in ordinary three-dimensional space
forms a four-dimensional manifold, since each sphere with center
x, y, z and radius t can be represented by a point with coordinates
x, y, z, t. A cube in three-dimensional space with edge of length 2,
sides parallel to the coordinate planes, and center at the origin, consists
of the totality of all points $x_1, x_2, x_3$ for which $|x_1| \le 1$, $|x_2| \le 1$,
$|x_3| \le 1$. In the same way a "cube" in n-dimensional space $R_n$ with
edge 2, sides parallel to the coordinate planes, and center at the origin,
is defined as the totality of points $x_1, x_2, \ldots, x_n$ for which simultaneously
$|x_1| \le 1, \quad |x_2| \le 1, \quad \ldots, \quad |x_n| \le 1.$
The "surface" of this cube consists of all points for which at least one
equality sign holds. The surface elements of dimension n - 2 consist
of those points where at least two equality signs hold, etc.
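The description of the n-dimensional cube and its surface can be tried out numerically. The following is only a minimal sketch: the function name `classify_cube_point` is a hypothetical helper, not part of the text, and it assumes exact coordinates (integers or simple decimals) so that the equality test is meaningful.

```python
def classify_cube_point(point):
    """Classify a point relative to the cube |x_i| <= 1 in n-dimensional space.

    Returns None if the point lies outside the cube. Otherwise returns the
    number of coordinates for which the equality sign |x_i| = 1 holds:
    0 means an interior point, 1 or more means a point of the surface.
    """
    if any(abs(x) > 1 for x in point):
        return None
    return sum(1 for x in point if abs(x) == 1)


# A point of four-dimensional space with two equality signs: it belongs to
# the surface, and in fact to a surface element of dimension 4 - 2 = 2.
print(classify_cube_point((0.5, 1, -1, 0)))
```

A point for which exactly k equality signs hold lies on a surface element of dimension n - k, which agrees with the count of equality signs described above.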
Exercise: Describe the surface of such a cube in the three-, four-, and n-dimensional cases.
Fig. 116. Triangle.
Exercise: Carry out this reduction for all the regular polyhedra (see p. 237).
[Figure: ... by coordination of vertices and edges.]
of this article,
$C_i^r = \frac{r!}{i!\,(r - i)!}$
different subsets of i objects each that can be formed from a given set
of r objects. Hence an n-dimen.sional "tetrahedron" contains
$C_1^{n+1} = n + 1$ vertices ($T_0$'s),
$C_2^{n+1} = \frac{(n+1)!}{2!\,(n-1)!}$ segments ($T_1$'s),
$C_3^{n+1} = \frac{(n+1)!}{3!\,(n-2)!}$ triangles ($T_2$'s),
$C_4^{n+1} = \frac{(n+1)!}{4!\,(n-3)!}$ $T_3$'s,
$\cdots$
$C_{n+1}^{n+1} = 1$ $T_n$'s.
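The counts in this table are binomial coefficients, so they are easy to tabulate. A minimal sketch follows; the function name `simplex_element_counts` is introduced here only for illustration.

```python
from math import comb


def simplex_element_counts(n):
    """Return [number of T_0's, T_1's, ..., T_n's] contained in an
    n-dimensional "tetrahedron", which has n + 1 vertices.

    Each T_i is determined by choosing i + 1 of the n + 1 vertices.
    """
    return [comb(n + 1, i + 1) for i in range(n + 1)]


print(simplex_element_counts(3))  # ordinary tetrahedron
print(simplex_element_counts(4))  # four-dimensional case
```

For n = 3 this gives 4 vertices, 6 segments, 4 triangles, and the tetrahedron itself.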
CHAPTER V
TOPOLOGY
INTRODUCTION
In the middle of the nineteenth century there began a completely new
development in geometry that was soon to become one of the great
forces in modern mathematics. The new subject, called analysis situs
(1) V - E + F = 2.
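Formula (1) can be checked directly on the five regular polyhedra. The sketch below uses the standard vertex, edge, and face counts of these solids; the dictionary name `REGULAR_POLYHEDRA` and the helper `euler_characteristic` are illustrative names, not notation from the text.

```python
# (V, E, F) for the five regular polyhedra.
REGULAR_POLYHEDRA = {
    "tetrahedron": (4, 6, 4),
    "cube": (8, 12, 6),
    "octahedron": (6, 12, 8),
    "dodecahedron": (20, 30, 12),
    "icosahedron": (12, 30, 20),
}


def euler_characteristic(v, e, f):
    """Return V - E + F for the given counts."""
    return v - e + f


for name, (v, e, f) in REGULAR_POLYHEDRA.items():
    print(name, euler_characteristic(v, e, f))
```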
one of the faces of the hollow polyhedron, we can deform the remaining
surface until it stretches out flat on a plane. Of course, the areas of the
will have
[Figure: V - E + F = 16 - 32 + 16 = 0.]
original polyhedron, while the number of polygons will be one less than
in the original polyhedron, since one face was removed. We shall now
show that for the plane network, V - E + F = 1, so that, if the removed
face is counted, the result is V - E + F = 2 for the original polyhedron.
First we "triangulate" the plane network in the following way: In
some polygon of the network which is not already a triangle we draw a
diagonal. The effect of this is to increase both E and F by 1, thus
preserving the value of V - E + F. We continue drawing diagonals
joining pairs of points (Fig. 122) until the figure consists entirely of
triangles, as it must eventually. In the triangulated network,
V - E + F has the value that it had before the division into triangles,
since the drawing of diagonals has not changed it. Some of the
triangles have edges on the boundary of the plane network. Of these
some, such as ABC, have only one edge on the boundary, while other
triangles may have two edges on the boundary. We take any boundary
triangle and remove that part of it which does not also belong to some
other triangle. Thus, from ABC we remove the edge AC and the face,
leaving the vertices A, B, C and the two edges AB and BC; while from
DEF we remove the face, the two edges DF and FE, and the vertex F.
The removal of a triangle of type ABC decreases E and F by 1, while V
is unaffected, so that V - E + F remains the same. The removal of a
triangle of type DEF decreases V by 1, E by 2, and F by 1, so that
V - E + F again remains the same. By a properly chosen sequence of
these operations we can remove triangles with edges on the boundary
(which changes with each removal), until finally only one triangle
remains, with its three edges, three vertices, and one face. For this
(2) nF = 2E;
belongs
(3) rV = 2E,
$\frac{2E}{n} + \frac{2E}{r} - E = 2$
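Relations (2) and (3), combined with Euler's formula, leave only a few possibilities for n and r. The following sketch enumerates them under the assumption that n (edges per face) and r (edges at each vertex) are both at least 3; the function name `regular_polyhedra` and the search bound `max_sides` are choices made for this illustration. Dividing the last equation by 2E gives 1/n + 1/r - 1/2 = 1/E, which is the form used in the code.

```python
from fractions import Fraction


def regular_polyhedra(max_sides=10):
    """Enumerate integer solutions of nF = 2E, rV = 2E, V - E + F = 2.

    Dividing 2E/n + 2E/r - E = 2 by 2E gives 1/n + 1/r - 1/2 = 1/E,
    so the left side must be positive and E must come out as an integer.
    Returns a list of tuples (n, r, V, E, F).
    """
    found = []
    for n in range(3, max_sides + 1):
        for r in range(3, max_sides + 1):
            excess = Fraction(1, n) + Fraction(1, r) - Fraction(1, 2)
            if excess <= 0:
                continue  # no positive E is possible
            e = 1 / excess
            if e.denominator != 1:
                continue  # E must be a whole number
            e = int(e)
            found.append((n, r, 2 * e // r, e, 2 * e // n))
    return found


for n, r, v, e, f in regular_polyhedra():
    print(f"n={n} r={r}: V={v} E={e} F={f}")
```

Only five pairs (n, r) survive, corresponding to the tetrahedron, octahedron, icosahedron, cube, and dodecahedron.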
TOPOLOGICAL PROPERTIES
p ↔ p'
between the points p of A and the points p' of A' which has the following
Fig. 123. Topologically equivalent surfaces.
into a circle or an ellipse, and hence these figures have exactly the same
topological properties. But one cannot deform a circle into a line segment, nor the surface of a sphere into the surface of an inner tube.
The general concept of topological transformation is wider than the
concept of deformation. For example, if a figure is cut during a deformation and the edges of the cut sewn together after the deformation
in exactly the same way as before, the process still defines a topological
transformation of the original figure, although it is not a deformation.
Thus the two curves of Figure 134 (p. 256) are topologically equivalent
to each other or to a circle, since they may be cut, untwisted, and the
cut sewn up. But it is impossible to deform one curve into the other
or into a circle without first cutting the curve.
Topological properties of figures (such as are given by Euler's theorem
and others to be discussed in this section) are of the greatest interest
2. Connectivity
As another example of two figures that are not topologically equivalent
we may consider the plane domains of Figure 125. The first of
these consists of all points interior to a circle, while the second consists
of all points contained between two concentric circles. Any closed
curve lying in the domain a can be continuously deformed or "shrunk"
down to a single point within the domain. A domain with this property
is said to be simply connected. The domain b is not simply connected.
For example, a circle concentric with the two boundary circles and midway
between them cannot be shrunk to a single point within the domain,
since during this process the curve would necessarily pass over the center
of the circles, which is not a point of the domain. A domain which is
not simply connected is said to be multiply connected. If the multiply
connected domain b is cut along a radius, as in Figure 126, the resulting
domain is simply connected.
More generally, we can construct domains with two, three, or more
"holes," such as the domain of Figure 127. In order to convert this
domain into a simply connected domain, two cuts are necessary. If
Fig. 127. Reduction of a triply connected domain.
length of the curve and the area that it encloses can be changed by a
deformation. But there is a topological property of the configuration
which is so simple that it may seem trivial: A simple closed curve C in
the plane divides the plane into exactly two domains, an inside and an
outside. By this is meant that the points of the plane fall into two
classes (A, the outside of the curve, and B, the inside) such that any
pair of points of the same class can be joined by a curve which does not
cross C, while any curve joining a pair of points belonging to different
classes must cross C. This statement is obviously true for a circle or
an ellipse, but the self-evidence fades a little if one contemplates a
complicated curve like the twisted polygon in Figure 128.
set, containing no points at all, has dimension -1. Then a point set S is of dimension 0 if it is not of dimension -1 (i.e. if S contains at least one point), and if
each point of S can be enclosed within an arbitrarily small region whose boundary
intersects S in a set of dimension -1 (i.e. whose boundary contains no points of S).
For example, the set of rational points on the line is of dimension 0, since each
rational point can be made the center of an arbitrarily small interval with irrational
endpoints. The Cantor set C is also seen to be of dimension 0, since, like
the set of rational points, it is formed by removing a dense set of points from
the line.
So far we have defined only the concepts of dimension -1 and dimension 0.
The definition of dimension 1 suggests itself at once: a set S of points is of dimension 1 if it is not of dimension -1 or 0, and if each point of S can be enclosed
within an arbitrarily small region whose boundary intersects S in a set of dimension 0. A line segment has this property, since the boundary of any interval is
a pair of points, which is a set of dimension 0 according to the preceding definition.
Moreover, by proceeding in the same manner, we can successively define the concepts of dimension 2, 3, 4, 5, ..., each resting on the previous definitions. Thus
a set S will be of dimension n if it is not of any lower dimension, and if each point
of S can be enclosed within an arbitrarily small region whose boundary intersects
S in a set of dimension n - 1. For example, the plane is of dimension 2, since
each point of the plane can be enclosed within an arbitrarily small circle, whose
circumference is of dimension 1. No point set in ordinary space can have dimension higher than 3, since each point of space can be made the center of an arbitrarily small sphere whose surface is of dimension 2.
But in modern mathematics
the word "space" is used to denote any system of objects for which a notion of
"distance" or "neighborhood" is defined (see p. 316), and these abstract "spaces"
may have dimensions higher than 3. A simple example is Cartesian n-space,
whose "points" are ordered arrays of n real numbers:
$P = (x_1, x_2, x_3, \ldots, x_n),$
$Q = (y_1, y_2, y_3, \ldots, y_n);$
the
boundary), then there will necessarily be points where three or more of these
regions meet, no matter what the shapes of the regions. In addition, there will exist
subdivisions of the figure in which each point belongs to at most three regions of
the subdivision. Thus, if the two-dimensional figure is a square, as in Figure 131,
then a point will belong to the three regions, 1, 2, and 3, while for this particular
subdivision no point belongs to more than three regions. Similarly, in the three-dimensional
case it may be proved that, if a volume is covered by sufficiently
small volumes, there always exist points common to at least four of the latter,
while for a properly chosen subdivision no more than four will have a point in
Fig. 131. The tiling theorem.
We consider a circular disk in the plane. By this we mean the region
consisting of the interior of some circle, together with its circumference.
Let us suppose that the points of this disk are subjected to any continuous transformation (which need not even be biunique) in which each
point remains within the circle, although differently situated. For
example, a thin rubber disk might be shrunk, turned, folded,
or deformed in any way, so long as the final position of each point of
the disk lies within its original circumference. Again, if the liquid in a
glass is set into motion by stirring it in such a manner that particles on
the surface remain on the surface but move around on it to other positions, then at any given instant the position of the particles on the
surface defines a continuous transformation of the original distribution.
The theorem of Brouwer now states: Each such transformation leaves at least
one point fixed; that is, there exists at least one
point whose position after the transformation is the same as its original
position. (In the example of the surface of the liquid, the fixed point
will in general change with the time, although for a simple circular
rotation it is the center that is always fixed.) The proof of the existence
of a fixed point is typical of the reasoning used to establish many topological theorems.
Consider the disk before and after the transformation, and assume,
contrary to the statement of the theorem, that no point remains fixed,
so that under the transformation each point moves to another
Now consider the points on the boundary of the circle, with their associated vectors. All of these vectors point into the circle, since, by assumption, no points are transformed into points outside the circle. Let
us begin at some point $P_1$ on the boundary and travel in the counterclockwise direction around the circle. As we do so, the direction of
the vector will change, for the points on the boundary have variously
pointed vectors associated with them. The directions of these vectors
may be shown by drawing parallel arrows that issue from a single point
in the plane. We notice that in traversing the circle once from $P_1$
around to $P_1$, the vector turns around and comes back to its original
position. Let us call the number of complete revolutions made by this
vector the "index" of the vectors on the circle; more
define the index as the
is always directed
inside the circle and never along the tangent. Now, if this transformation vector turns through a total angle different from the total angle
through which the tangent vector turns (which is 360°, because the
tangent vector obviously makes one complete positive revolution), then
the difference between the total angles through which the tangent vector
and the transformation vector turn will be some non-zero multiple of
360°, since each makes an integral number of revolutions. Hence the
anY
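The "index" used in this argument, the net number of revolutions made by a vector as its base point travels once around the circle, can be estimated numerically. The sketch below is an illustration only: `index_on_circle` is a hypothetical helper, the unit circle is assumed, and the vector field is sampled at finitely many points, so it presumes that the vector never vanishes on the circle and does not turn by as much as half a revolution between neighboring samples.

```python
import math


def index_on_circle(field, samples=3600):
    """Net number of counterclockwise revolutions made by field(x, y)
    as (x, y) travels once counterclockwise around the unit circle."""
    total = 0.0
    previous = None
    for k in range(samples + 1):
        t = 2 * math.pi * k / samples
        vx, vy = field(math.cos(t), math.sin(t))
        angle = math.atan2(vy, vx)
        if previous is not None:
            step = angle - previous
            # Bring each step into (-pi, pi] so only the small change counts.
            while step > math.pi:
                step -= 2 * math.pi
            while step <= -math.pi:
                step += 2 * math.pi
            total += step
        previous = angle
    return round(total / (2 * math.pi))


def tangent(x, y):
    """Tangent vector to the circle, pointing counterclockwise."""
    return (-y, x)


def shrink_and_turn(x, y):
    """Displacement T(p) - p for the map T that turns the disk a quarter
    turn and shrinks it by one half (an example chosen for this sketch)."""
    return (-0.5 * y - x, 0.5 * x - y)


print(index_on_circle(tangent))
print(index_on_circle(shrink_and_turn))
print(index_on_circle(lambda x, y: (1.0, 0.0)))
```

The tangent vector has index 1, as stated in the text. The displacement field of the example map, which points into the circle everywhere on the boundary, also has index 1, while a constant field has index 0.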
5. Knots
answer is by no means simple, and still less so is the complete mathematical analysis of the various kinds of knots and the differences between
them. Even for the simplest case this has proved to be a sizable task.
Consider the two trefoil knots shown in Figure 134. These two knots
Fig. 135. Cuts on sphere and torus.
separate the surface into two parts. To say that C separates the sphere
into two parts means that if the sphere is cut along C it will fall into
two distinct and unconnected pieces, or, what amounts to the same
thing, that we can find two points on the sphere such that any curve
on the sphere which joins them must intersect C. On the other hand,
if the torus is cut along the closed curve C', the resulting surface still
hangs together; any point of the surface can be joined to any other
point by a curve that does not intersect C'. This difference between
the sphere and the torus marks the two types of surfaces as topologically
distinct, and shows that it is impossible to deform one into the other
in a continuous way.
Next let us consider the surface with two holes shown in Figure 136.
On this surface we can draw two non-intersecting closed curves A and B
which do not separate the surface. The torus is always separated into
two parts by any two such curves. On the other hand, three closed non-intersecting curves always separate the surface with two holes.
These facts suggest that we define the genus of a surface as the largest
number of non-intersecting simple closed curves that can be drawn on
the surface without separating it. The genus of the sphere is 0, that of
the torus is 1, while that of the surface in Figure 136 is 2. A similar
The genus is
propCon-
are both closed surfaces of genus 2, and it is clear that either of these
surfaces may be continuously deformed into the other. Since the
doughnut with p holes, or its equivalent, the sphere with p handles, is
Fig. 137. Surfaces of genus 2.
V - E + F = 2 - 2p,
Now let us cut the surface S along the curves $A_2, B_2, \ldots$ and
straighten the handles out. Each handle will have a free edge bounded
by a new curve $A^*, B^*, \ldots$ with the same number of vertices and arcs
as $A_2, B_2, \ldots$ respectively. Hence V - E + F will not change, since
the additional vertices exactly counterbalance the additional arcs, while
no new regions are created. Next, we deform the surface by flattening
out the projecting handles, until the resulting surface is simply a sphere
from which 2p regions have been removed. Since V - E + F is known
to equal 2 for any subdivision of the whole sphere, we have
V - E + F = 2 - 2p
for the sphere with 2p regions removed, and hence for the original sphere
with p handles, as was to be proved.
Figure 121 illustrates the application of formula (1) to a surface S
consisting of flat polygons. This surface may be continuously deformed
into a torus, so that the genus p is 1 and 2 - 2p = 2 - 2 = 0. As
predicted by formula (1),
V - E + F = 16 - 32 + 16 = 0.
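The relation V - E + F = 2 - 2p can be turned around to read off the genus from the counts of a subdivision. A minimal sketch, assuming a closed two-sided surface; the function name `genus_from_counts` is hypothetical.

```python
def genus_from_counts(v, e, f):
    """Solve V - E + F = 2 - 2p for the genus p of a closed two-sided surface."""
    characteristic = v - e + f
    if characteristic > 2 or (2 - characteristic) % 2 != 0:
        raise ValueError("counts do not fit V - E + F = 2 - 2p")
    return (2 - characteristic) // 2


print(genus_from_counts(16, 32, 16))  # the surface of Figure 121
print(genus_from_counts(8, 12, 6))    # a cube, deformable into a sphere
```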
3. One-Sided Surfaces
An ordinary surface has two sides. This applies both to closed
surfaces like the sphere or the torus and to surfaces with boundary
curves, such as the disk or a torus from which a piece has been removed. The two sides of such a surface could be painted with different
colors to distinguish them. If the surface is closed, the two colors never
meet. If the surface has boundary curves, the two colors meet only
along these curves. A bug crawling along such a surface and prevented
from crossing boundary curves, if any exist, would always remain on
the same side.
Moebius made the
only one side. The simplest such surface is the so-called Moebius strip,
formed by taking a long rectangular strip of paper and pasting its two
ends together after giving one a half-twist, as in Figure
A bug
crawling along this surface, keeping always to the middle of the strip,
will return to its original position upside down. The Moebius strip has
only one edge, for its boundary consists of a single closed curve. The
ordinary two-sided surface formed by pasting together the two ends of a
can be deformed into a flat one, e.g. a circle. During the deformation,
the strip may be allowed to intersect itself so that a one-sided self-intersecting
surface results as in Figure 140 known as a cross-cap. The locus
of self-intersection is regarded as two different lines, each belonging to
plane boundary
The one-
the deformation in such a way that the boundary of the Moebius strip
becomes flat, e.g. triangular, while the strip remains free from self-intersections. Figure 141 indicates such a model, due to Dr. B. Tuckermann; the boundary is a triangle defining one half of one diagonal square
of a regular octahedron; the strip itself consists of six faces of the octahedron and four rectangular triangles, each one fourth of a diagonal
plane.
Another interesting one-sided surface is the "Klein bottle." This
surface is closed, but it has no inside or outside. It is topologically
equivalent to a pair of cross-caps with their boundaries coinciding.
The proof is analogous to that for two-sided surfaces. First we show that the
Euler characteristic of a cross-cap or Moebius strip is 0. To do this we observe
that, by cutting across a Moebius strip which has been subdivided into a number
of regions, we obtain a rectangle that contains two more vertices, one more edge,
and the same number of regions as the Moebius strip. For the rectangle,
V - E + F = 1, as we proved on page 239. Hence for the Moebius strip
V - E + F = 0. As an exercise, the reader may complete the proof.
Fig. 143. Closed surfaces defined by coordination of edges in plane figure. Panels: cylinder, torus, Moebius strip, Klein bottle.
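The edge identifications shown in the figure can be imitated on a subdivided square, and V - E + F counted for each resulting surface. The following is a rough sketch under stated assumptions: the square is cut into an n by n grid with n at least 3 (so that distinct edges never share the same pair of end points after gluing), and the labels "sides", "sides_twisted", and "top_bottom" are names invented here for the three kinds of gluing.

```python
SURFACES = {
    "rectangle": set(),
    "cylinder": {"sides"},
    "torus": {"sides", "top_bottom"},
    "moebius strip": {"sides_twisted"},
    "klein bottle": {"sides_twisted", "top_bottom"},
}


def euler_characteristic_of_square(identifications, n=3):
    """V - E + F for an n-by-n grid on a square whose boundary edges are
    glued as requested. Requires n >= 3."""
    if n < 3:
        raise ValueError("use n >= 3 so that glued edges stay distinguishable")
    parent = {(i, j): (i, j) for i in range(n + 1) for j in range(n + 1)}

    def find(p):
        # Follow parent links to the representative of p's class.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for k in range(n + 1):
        if "sides" in identifications:          # left edge glued to right edge
            parent[find((0, k))] = find((n, k))
        if "sides_twisted" in identifications:  # glued after a half-twist
            parent[find((0, k))] = find((n, n - k))
        if "top_bottom" in identifications:     # bottom edge glued to top edge
            parent[find((k, 0))] = find((k, n))

    vertices = {find(p) for p in parent}
    edges = set()
    for i in range(n + 1):
        for j in range(n + 1):
            if i < n:
                edges.add(frozenset((find((i, j)), find((i + 1, j)))))
            if j < n:
                edges.add(frozenset((find((i, j)), find((i, j + 1)))))
    faces = n * n  # the small squares are never glued to one another
    return len(vertices) - len(edges) + faces


for name, glue in SURFACES.items():
    print(name, euler_characteristic_of_square(glue))
```

The unglued rectangle gives 1, in agreement with the value used in the proof above, while the cylinder, torus, Moebius strip, and Klein bottle all give 0.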
Fig. 144. Three-dimensional torus defined by boundary identification.
equivalent to the space between two concentric torus surfaces, one inside
the other, in which corresponding points of the two torus surfaces are
identified (Fig. 145). For the latter manifold is obtained from the cube
if two pairs of conceptually identified faces are brought together.
Fig. 145. Another representation of three-dimensional torus. (Figure cut to show identification.)
APPENDIX
THEOREM
polygon with fewer than six sides. Denote by $F_n$ the number of regions
of n sides in a regular map; then, if F denotes the total number of regions,
(1) $F = F_2 + F_3 + F_4 + \cdots$
Each arc has two ends, and three arcs end at each vertex. Hence, if E
denotes the number of arcs in the map, and V the number of vertices,
(2) $2E = 3V.$
Further, each arc borders two regions, so that
$2E = 2F_2 + 3F_3 + 4F_4 + \cdots$
By Euler's formula,
$V - E + F = 2,$
or
$6V - 6E + 6F = 12.$
From (2), $6V = 4E$, so that $6F - 2E = 12$. Hence
$6(F_2 + F_3 + F_4 + \cdots) - (2F_2 + 3F_3 + 4F_4 + \cdots) = 12,$
or
$(6 - 2)F_2 + (6 - 3)F_3 + (6 - 4)F_4 + (6 - 5)F_5 + (6 - 6)F_6 + (6 - 7)F_7 + \cdots = 12.$
Hence at least one of the terms on the left must be positive, so that at
least one of the numbers $F_2, F_3, F_4, F_5$ is positive, as we wished to show.
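The final identity can be checked on a few familiar regular maps, that is, maps in which exactly three arcs end at each vertex. The examples below (surfaces of certain polyhedra, regarded as maps on the sphere) and the function name `weighted_region_sum` are choices made for this sketch.

```python
def weighted_region_sum(region_counts):
    """Given a dict mapping n (number of sides) to F_n, return the sum
    (6 - 2)F_2 + (6 - 3)F_3 + (6 - 4)F_4 + ... over all listed n."""
    return sum((6 - n) * count for n, count in region_counts.items())


# Each map has three arcs meeting at every vertex.
REGULAR_MAPS = {
    "tetrahedron": {3: 4},
    "cube": {4: 6},
    "dodecahedron": {5: 12},
    "hexagonal prism": {4: 6, 6: 2},
    "truncated icosahedron": {5: 12, 6: 20},
}

for name, counts in REGULAR_MAPS.items():
    print(name, weighted_region_sum(counts))
```

In every case the sum is 12, and in every case there is at least one region with fewer than six sides.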
Now to prove the five color theorem. Let M be any regular map on
the sphere with n regions in all. We know that at least one of these
regions has fewer than six sides.
Case 1. M contains a region A with 2, 3, or 4 sides. In this case,
remove the boundary between A and one of the regions adjoining A. If
A has 4 sides, one region may come around and touch two non-adjacent
sides of A. In this case, by the Jordan curve theorem, the
[Figure: maps M and M'.]
be colored with
proof is conpracticable, although wearisome,
map with n regions in a finite number
First we observe that all the points on any line segment not intersecting
P have the same parity. For the parity of a point p moving
along such a segment can only change when the ray in the fixed direction
through p passes through a vertex of P, and in neither of the two
possible cases will the parity actually change, because of the agreement
made in the
From this it follows that
any two
polygonal path which does not intersect P.
p
q.
If the straight segment pq joining p to q does not intersect P it is the
desired path. Otherwise, let p' be the first point of intersection of this
segment with P, and let q' be the last such point (Fig. 149). Construct
the path starting from p along the segment pp', then turning off just
before p' and following along P until P returns to pq at q'. If we can
prove that this path will intersect pq between q' and q, rather than
between p' and q', then the path may be continued to q along q'q without
intersecting P. It is clear that any two points r and s near enough to
each other, but on opposite sides of some segment of P, must have
different parity, for the ray through r will intersect P in one more point
than will the ray through s. Thus we see that the parity changes as
we cross the point q' along the segment pq. It follows that the dotted
path crosses pq between q' and q, since p and q (and hence every point
on the dotted path) have the same parity.
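The parity used in this argument can be computed for a concrete polygon. The sketch below is an assumption-laden illustration: it fixes the ray direction as horizontal and to the right, and it handles rays through vertices by the usual half-open rule (an edge is counted when exactly one of its end points lies strictly above the ray), which is a stand-in for the agreement about vertices made in the text. The function name `parity` and the sample polygon are invented here.

```python
def parity(point, polygon):
    """Return 1 (odd) or 0 (even): the parity of the number of times the
    horizontal ray from `point` to the right meets the closed polygon.

    `polygon` is a list of (x, y) vertices in order; `point` is assumed
    not to lie on the polygon itself.
    """
    x, y = point
    crossings = 0
    count = len(polygon)
    for i in range(count):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % count]
        # Half-open rule: the edge counts only if its end points lie on
        # different sides of the ray's line (one strictly above, one not).
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:
                crossings += 1
    return crossings % 2


# A U-shaped polygon: the notch between the two arms is outside.
U_SHAPE = [(0, 0), (6, 0), (6, 6), (4, 6), (4, 2), (2, 2), (2, 6), (0, 6)]

print(parity((1, 4), U_SHAPE))  # inside the left arm
print(parity((3, 4), U_SHAPE))  # in the notch
print(parity((3, 1), U_SHAPE))  # inside the base
```

Points of odd parity form the inside of the polygon and points of even parity the outside.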
show
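$f(z) = z^n + a_{n-1}z^{n-1} + a_{n-2}z^{n-2} + \cdots + a_1 z + a_0$
$f(z) = (z - \alpha_1)(z - \alpha_2) \cdots (z - \alpha_n),$
$z = x + yi = r(\cos\theta + i\sin\theta),$
where $r = \sqrt{x^2 + y^2}$.
$z^n = r^n(\cos n\theta + i\sin n\theta).$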
$f(z) \neq 0.$
On this assumption, if we now allow z to describe any closed curve in
the x, y-plane, f(z) will describe a closed curve Γ which never passes
Fig. 150. Proof of fundamental theorem of algebra.
paragraph that $\phi(t) = n$ for large values of t. But the order $\phi(t)$ depends
continuously on t, since f(z) is a continuous function of z. Hence we
shall have a contradiction, for the function $\phi(t)$ can assume only integral
values and therefore cannot pass continuously from the value 0 to the
value n.
It remains only to show that $\phi(t) = n$ for large values of t. We observe that on a circle of radius $|z| = t$ so large that
$t > 1$ and $t > |a_0| + |a_1| + \cdots + |a_{n-1}|$,
we have the inequality
$|f(z) - z^n| = |a_{n-1}z^{n-1} + a_{n-2}z^{n-2} + \cdots + a_0|$
$\le |a_{n-1}||z|^{n-1} + |a_{n-2}||z|^{n-2} + \cdots + |a_0|$
$= t^{n-1}\left[|a_{n-1}| + \frac{|a_{n-2}|}{t} + \cdots + \frac{|a_0|}{t^{n-1}}\right]$
$\le t^{n-1}\left[|a_{n-1}| + |a_{n-2}| + \cdots + |a_0|\right] < t^n = |z^n|.$
Since the expression on the left is the distance between the two points
$z^n$ and f(z), while the last expression on the right is the distance of the
point $z^n$ from the origin, we see that the straight line segment joining
the two points f(z) and $z^n$ cannot pass through the origin so long as z
is on the circle of radius t about the origin. This being so, we may
continuously deform the curve traced out by f(z) into the curve traced
out by $z^n$ without ever passing through the origin, simply by pushing
each point f(z) along the segment joining it to $z^n$. Since the order of
the origin will vary continuously and can assume only integral values
during this deformation, it must be the same for both curves. Since
the order for $z^n$ is n, the order for f(z) must also be n. This completes
the proof.
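The order of the origin with respect to the curve traced by f(z) can be estimated numerically for a sample polynomial. This is a sketch under stated assumptions: the polynomial $z^3 - 2z + 5$ is an arbitrary example not taken from the text, the name `order_of_origin` is hypothetical, and the circle is sampled finely enough that the direction of f(z) changes by less than half a turn between neighboring samples.

```python
import cmath
import math


def order_of_origin(coeffs, t, samples=20000):
    """Number of times f(z) winds around the origin as z describes the
    circle |z| = t once counterclockwise.

    coeffs = [a_0, a_1, ..., a_{n-1}] for
    f(z) = z^n + a_{n-1} z^{n-1} + ... + a_1 z + a_0.
    Assumes f has no zero on the circle.
    """
    def f(z):
        value = 1  # leading coefficient of z^n
        for a in reversed(coeffs):
            value = value * z + a  # Horner's scheme
        return value

    total = 0.0
    previous = cmath.phase(f(complex(t, 0)))
    for k in range(1, samples + 1):
        z = cmath.rect(t, 2 * math.pi * k / samples)
        current = cmath.phase(f(z))
        step = current - previous
        # Keep each step in (-pi, pi] so that only the small change counts.
        if step > math.pi:
            step -= 2 * math.pi
        elif step <= -math.pi:
            step += 2 * math.pi
        total += step
        previous = current
    return round(total / (2 * math.pi))


example = [5, -2, 0]  # f(z) = z^3 - 2z + 5
print(order_of_origin(example, 8.0))  # t > 1 and t > 5 + 2 + 0
print(order_of_origin(example, 0.1))  # a small circle about the origin
```

On the large circle the order equals the degree 3, while on the small circle it is 0, so somewhere in between the curve must pass through the origin.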
CHAPTER VI
$x^2 + 2x - 3$
has no definite numerical value until the value of x is assigned. We
say that the value of this expression is a function of the value of x,
and write
$x^2 + 2x - 3 = f(x).$
For example, when x = 2 then $2^2 + 2 \cdot 2 - 3 = 5$, so that f(2) = 5.
In the same way we may find by direct substitution the value of f(x)
for any integral, fractional, irrational, or even complex number x.
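Direct substitution into this expression is easy to carry out by machine. A minimal sketch; the particular fractional and complex arguments are examples chosen here.

```python
from fractions import Fraction


def f(x):
    """The function f(x) = x^2 + 2x - 3."""
    return x**2 + 2 * x - 3


print(f(2))               # an integral value of x
print(f(Fraction(1, 2)))  # a fractional value of x
print(f(1j))              # a complex value of x
```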
The number of primes less than n is a function π(n) of the integer n.
When a value of n is given, the value π(n) is determined, even though
no algebraic expression for computing it is known. The area of a
triangle is a function of the lengths of its three sides; it varies as the
lengths of the sides vary and is determined when these lengths are given
definite values. If a plane is subjected to a projective or a topological
transformation, then the coordinates of a point after the transformation
depend on, i.e. are functions of, the original coordinates of the point.
The concept of function enters whenever quantities are connected by a
relationship. The volume of a gas enclosed in a
cylinder is a function of the temperature and of the pressure on the
piston. The atmospheric pressure as observed in a balloon is a function
of the altitude above sea level. The whole domain of periodic phenomena
(the motion of the tides, the vibrations of a plucked string, the
emission of light waves from an incandescent filament) is governed by
the simple trigonometric functions sin x and cos x.
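Although no algebraic expression is available for the number of primes less than n, the value is determined by a definite procedure and can be computed. The sketch below uses the sieve of Eratosthenes, a technique chosen for this illustration rather than one discussed in this passage; the function name `prime_count` is hypothetical.

```python
def prime_count(n):
    """Return the number of primes strictly less than the integer n."""
    if n < 3:
        return 0
    is_prime = [True] * n          # is_prime[k] refers to the integer k < n
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(n**0.5) + 1):
        if is_prime[p]:
            # Strike out the multiples of p, starting at p * p.
            for multiple in range(p * p, n, p):
                is_prime[multiple] = False
    return sum(is_prime)


for n in (10, 100, 1000):
    print(n, prime_count(n))
```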
To Leibniz (1646-1716), who first used the word "function," and to
is a convenient symbolic expression of the fact that the sum of any two
integers is independent of the order in which they are taken. A particular
case is expressed by the equation
2 + 3 = 3 + 2,
involving constants, but to express the general law, valid for all pairs
of numbers, symbols having the meaning of variables are needed.
It is by no means necessary that the domain S of a variable X be a
set of numbers. For example, S might be the set of all circles in the
plane; then X would denote any individual circle. Or S might be the
set of all closed polygons in the plane, and X any individual polygon.
Nor is it necessary that the domain of a variable contain an infinite
number of elements. For example, X might denote any member of the
population S of a given city at a given time. Or X might denote any
one of the possible remainders when an integer is divided by 5; in this
case the domain S would consist of the five numbers 0, 1, 2, 3, 4.
If X ranges over the set S, then the variable U will range over another
set, say T. For example, if S is the set of all triangles X in the plane,
a function F(X) may be defined by assigning to each triangle X the
length, U = F(X), of its perimeter; T will be the set of all positive
numbers. Here we note that two different triangles, $X_1$ and $X_2$, may
have the same perimeter, so that the equation $F(X_1) = F(X_2)$ is possible
even though $X_1 \ne X_2$. A projective transformation
onto another, T, assigns to each point X of S a single
according to a definite rule which we
by
symbol U = F(X). In this case
and we say that the mapping of
Functions of a continuous variable are often defined by algebraic expressions. Examples are the functions
$u = x^2,$
=X'
+x
$1 + 2 + \cdots + n = \frac{n(n + 1)}{2},$
$1^2 + 2^2 + \cdots + n^2 = \frac{n(n + 1)(2n + 1)}{6},$
$1^3 + 2^3 + \cdots + n^3 = \frac{n^2(n + 1)^2}{4}.$
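These three closed expressions define functions of the integer variable n, and they can be compared with the sums computed term by term. A minimal sketch; the helper name `power_sum` is introduced here for illustration.

```python
def power_sum(n, k):
    """Return 1^k + 2^k + ... + n^k, added up term by term."""
    return sum(i**k for i in range(1, n + 1))


for n in (1, 5, 10):
    print(
        n,
        power_sum(n, 1) == n * (n + 1) // 2,
        power_sum(n, 2) == n * (n + 1) * (2 * n + 1) // 6,
        power_sum(n, 3) == n**2 * (n + 1) ** 2 // 4,
    )
```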
$y = g(t),$
$z = h(t).$
z-axis
under the
influence
$x = 0,$
$y = 0,$
$z = -\tfrac{1}{2} g t^2,$
so-called
P'
u = f(x) =
l1{l
l1{l,
a,. .
+1
+ 5'
Inverse Functions
$u = ax + b$
are represented by straight lines; quadratic functions such as
$u = ax^2 + bx + c$
by parabolas; the function
:r:.
G(U).
the two sets, and the inverse function n = m/2 is uniquely defined.
Another example of a biunique mapping is provided by the function
$u = x^3.$
As x ranges over the set of all real numbers, u will likewise range over
the set of all real numbers, assuming each value once and only once.
The uniquely defined inverse function is
$x = \sqrt[3]{u}.$
$\sqrt{u}$
defined.
by rotating the original graph through an angle of 180° about the dotted
line (Fig. 154), so that the positions of the x-axis and the u-axis are interchanged.
The new position of the graph will depict x as a function of u.
In its original position the graph shows u as the height above the horizontal x-axis, while after the rotation the same graph shows x as the
height above the horizontal u-axis.
u = tan x.
This function is monotone for $-\pi/2 < x < \pi/2$ (Fig. 152). The values
of u, which increase steadily with x, range from $-\infty$ to $+\infty$; hence
the inverse function,
$x = g(u),$
4. Compound Functions
A second important method for creating new functions from two or
more given ones is the compounding of functions. For example, the
function
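$u = f(x) = \sqrt{1 + x^2}$
is compounded from the two simpler functions
$z = g(x) = 1 + x^2, \quad u = h(z) = \sqrt{z},$
and can be written as
$u = f(x) = h(g[x]).$
Likewise,
$u = f(x) = \frac{1}{\sqrt{1 - x^2}}$
is compounded from the three functions
$z = g(x) = 1 - x^2, \quad w = h(z) = \sqrt{z}, \quad u = k(w) = \frac{1}{w},$
so that
$u = f(x) = k(h[g(x)]).$
The function $u = f(x) = \sin\frac{1}{x}$ is compounded from the two functions
$z = g(x) = \frac{1}{x}, \quad u = h(z) = \sin z.$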
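The compounding of functions can be written out directly in code. The component functions below are assumptions for the purpose of illustration: the printed formulas in this passage are damaged, so the choices of 1 + x², 1 - x², the square root, and the reciprocal are a plausible reading rather than a certain one, and the names `g1`, `g2`, `h`, `k`, `f1`, `f2` are invented here.

```python
import math


def g1(x):
    return 1 + x**2


def g2(x):
    return 1 - x**2


def h(z):
    return math.sqrt(z)


def k(w):
    return 1 / w


def f1(x):
    """Compound of two functions: f1(x) = h(g1[x]) = sqrt(1 + x^2)."""
    return h(g1(x))


def f2(x):
    """Compound of three functions: f2(x) = k(h[g2(x)]) = 1 / sqrt(1 - x^2)."""
    return k(h(g2(x)))


print(f1(0.75))
print(f2(0.6))
```

Note that f2 is defined only for -1 < x < 1, since the inner functions must supply a positive value to the square root and a non-zero value to the reciprocal.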
The function f(x) is not defined for x = 0, since for x = 0 the expression 1/x has
no meaning. The graph of this remarkable function is obtained from that of the
sine. We know that sin z = 0 for z = kπ, where k is any positive or negative
integer. Furthermore,
$\sin z = 1$ for $z = (4k + 1)\frac{\pi}{2}$,
$\sin z = -1$ for $z = (4k - 1)\frac{\pi}{2}$,
if k is any integer. Hence
$\sin\frac{1}{x} = 0$ for $x = \frac{1}{k\pi}$,
$\sin\frac{1}{x} = 1$ for $x = \frac{2}{(4k + 1)\pi}$,
$\sin\frac{1}{x} = -1$ for $x = \frac{2}{(4k - 1)\pi}$.
If we set successively
k = 1, 2, 3, 4, ...
5. Continuity
$f(x) = 1 + x$ for $x > 0$,
$f(x) = -1 + x$ for $x \le 0$
$f(x) = 0$ for $x \ne 0$,
$f(0) = 1,$
discontinuities.
*3) Show that the function arc tan (1/x) has a discontinuity of the second type
(jump) at x = 0.
f(x, y).
This notation is also used if, as often happens, two quantities x and y
appear from the outset as independent variables. For example, the
pressure u of a gas is a function of the volume x and the temperature y,
and the area u of a triangle is a function u = f(x, y, z) of the lengths
x, y, and z of its three sides.
In the same way that a graph gives a geometrical representation of a
function of one variable, a geometrical representation of a function
u = f(x, y) of two variables is afforded by a surface in the three-dimensional
space with x, y, u as coordinates. To each point x, y in the
x, y-plane we assign the point in space whose coordinates are x, y, and
u = f(x, y). Thus the function $u = \sqrt{1 - x^2 - y^2}$ is represented by a
spherical surface with the equation $u^2 + x^2 + y^2 = 1$, the linear function u = ax + by + c by a plane, the function u = xy by a hyperbolic
paraboloid, etc.
A different representation of the function u = f(x, y) may be given
in the x, y-plane alone by means of contour lines. Instead of considering
the three-dimensional "landscape" u = f(x, y), we draw, as on a contour
map, the level curves of the function, indicating the projections on the
x, y-plane of all points with equal vertical elevation u. These level
curves are simply the curves f(x, y) = c, where c remains constant for
each curve. Thus the function u = x + y is characterized by Figure 163.
Fig. 159. Half sphere.
Fig. 160. Hyperbolic paraboloid.
u =c.
Fig. 163. Level curves of u = x + y.

Functions of several variables occur in physics when the motion of a
continuous substance is to be described. For example, suppose a string
is stretched between two points on the x-axis and then deformed so that
the particle with position x is moved a certain distance perpendicular
to the axis. If the string is then released, it will vibrate in such
a way that the particle with the original coordinate x will have at the
x' = (ax + by + c)/(gx + hy + k),
y' = (dx + ey + f)/(gx + hy + k),
where a, b, ..., k are constants, and where x, y and x', y' are coordinates
in the two planes respectively. From this point of view the idea of
an inverse transformation makes sense. We simply have to solve this
system of equations for x and y in terms of x' and y'. Geometrically,
this amounts to finding the inverse mapping of π' onto π. This will be
uniquely defined, provided the correspondence between the points of
the two planes is biunique.
The transformations of the plane studied in topology are given, not
by simple algebraic equations, but by any system of functions,
x' = f(x, y),
1,
~) ~
...
'~.
The only trouble with this explanation is that the meaning of the
italicized phrases is not entirely clear. How far is "far enough," and how
little is "as little as we please"? If we can attach a precise meaning
to these phrases then we can give a precise meaning to the limiting
relation (2).
A geometric interpretation will help to make the situation clearer.
If we represent the terms of the sequence (1) by their corresponding
points on the number axis we observe that the terms of the sequence
appear to cluster around the point 0. Let us choose any interval I on
the number axis with center at the point 0 and total width 2ε, so that
the interval extends a distance ε on each side of the point 0. If we
choose ε = 10, then, of course, all the terms a_n = 1/n of the sequence
will lie inside the interval I. If we choose ε = 1/10, then the first few
terms of the sequence will lie outside I, but all the terms from a_11 on,
1/11, 1/12, 1/13, 1/14, ...,
will lie within I.
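Even if we choose ε = 1/1000, only the first thousand terms of the
sequence will fail to lie within I, while all the infinitely many terms
from a_1001 on
will lie within I. Clearly, this reasoning holds for any positive number ε:
as soon as a positive ε is chosen, no matter how small it may be, we can
then find an integer N so large that
1/N < ε.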
From this it follows that all the terms a_n of the sequence for which
n ≥ N will lie within I, and only the finite number of terms a_1, a_2, ...,
a_{N−1} can lie outside. The important point is this: First the width of
the interval I is assigned at pleasure by choosing ε. Then a suitable
integer N can be found. This process of first choosing a number ε and
then finding a suitable integer N can be carried out for any positive
number ε, no matter how small, and gives a precise meaning to the
statement that all the terms of the sequence (1) will differ from 0 by
as little as we please, provided we go out far enough in the sequence.
To summarize: Let ε be any positive number. Then we can find an
integer N such that all the terms a_n of the sequence (1) for which n ≥ N
will lie within the interval I of total width 2ε and with center at the
point 0. This is the precise meaning of the limiting relation (2).
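The two-step procedure just described (first choose ε, then find N) can be made concrete for the sequence a_n = 1/n. The following Python sketch is an illustration added here, not part of the original text; the function name `smallest_N` is invented, and exact fractions are used so that boundary cases such as ε = 1/10 are not disturbed by rounding.

```python
from fractions import Fraction
import math

def smallest_N(eps):
    """Smallest integer N with 1/N < eps, so that a_n = 1/n lies
    within distance eps of 0 for every n >= N."""
    eps = Fraction(eps)
    if eps <= 0:
        raise ValueError("eps must be positive")
    # 1/N < eps is equivalent to N > 1/eps.
    return math.floor(1 / eps) + 1

for eps in (Fraction(10), Fraction(1, 10), Fraction(1, 1000)):
    print(eps, smallest_N(eps))
```

For ε = 1/10 the function returns 11, in agreement with the remark that all terms from a_11 on lie inside the interval. However small ε is chosen, the function still returns some finite N, which is exactly what the definition demands.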
On the basis of this example we are now ready to give an exact definition of the general statement: "The sequence of real numbers a_1, a_2,
a_3, ... has the limit a." We include a in the interior of an interval I
of the number axis: if the interval is small, some of the numbers a_n may
lie outside the interval, but as soon as n becomes sufficiently large, say
greater than or equal to some integer N, then all the numbers a_n for
which n ≥ N must lie within the interval I. Of course, the integer N
may have to be taken very large if a very small interval I is chosen,
but no matter how small the interval I, such an integer N must exist
if the sequence is to have a as its limit.
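The fact that a sequence a_n has the limit a is expressed symbolically
by writing
lim a_n = a as n → ∞,
or simply
a_n → a as n → ∞.
The definition may be stated concisely: the sequence a_1, a_2, a_3, ...
has the limit a as n tends to infinity if, corresponding to any positive
number ε, no matter how small, there may be found an integer N
(depending on ε) such that
(3) |a − a_n| < ε for all n ≥ N.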
This is the abstract formulation of the notion of the limit of a sequence. Small wonder that when confronted with it for the first time
one may not fathom it in a few minutes. There is an unfortunate,
almost snobbish attitude on the part of some writers of textbooks, who
present the reader with this definition without a thorough preparation,
where
a_n = n/(n + 1), a = 1,
whose center is the point 1 and for which ε = 1/10, then I can satisfy
your requirement (3) by choosing N = 10; for
0 < 1 − n/(n + 1) = (n + 1 − n)/(n + 1) = 1/(n + 1) < 1/10
as soon as n ≥ 10.
some, infinitely many, or even all the numbers a_n to be equal to the
limit value a. For example, the sequence for which a_1 = 0, a_2 = 0, ...,
a_n = 0, ... is a legitimate sequence, and its limit, of course, is 0.
A sequence a_n with a limit a is called convergent. A sequence a_n
without a limit is called divergent.
Exercises: Prove:
is less
than~
a~
= nt :
(Hint: aA .. _I_]_
n+~
2.
Thesequenrea~ ""nn'
++
1
basthelimit0. (Hint a
1
1+~
=~lies
."
n+!~
between
Oand~.)
3. The sequence 1, 2, 3, 4, ... and the oscillating sequences
1, 2, 1, 2, 1, 2, ...,
−1, 1, −1, 1, −1, ... (i.e. a_n = (−1)^n),
and 1, 1/2, 1, 1/3, 1, 1/4, ...,
and an""
n:
,:
11
1
2. Monotone Sequences
In the general definition of page 291, no specific type of approach of a
convergent sequence a_1, a_2, a_3, ... to its limit a is required. The
simplest type is exhibited by a so-called monotone sequence, such as
the sequence
1/2, 2/3, 3/4, ..., n/(n + 1), ....
Each term of this sequence is greater than the preceding one. For
a_{n+1} = (n + 1)/(n + 2) = 1 − 1/(n + 2) > 1 − 1/(n + 1) = n/(n + 1) = a_n.
A sequence
of this sort, where a_{n+1} > a_n, is said to be monotone increasing. Similarly, a sequence for which a_n > a_{n+1}, such as the sequence 1, 1/2,
1/3, ..., is called monotone decreasing. Such sequences can approach
their limits from one side only. In contrast to these, there are sequences
that oscillate, such as the sequence −1, +1/2, −1/3, +1/4, ....
This sequence approaches its limit 0 from both sides (see Fig. 11, p. 69).
The behavior of a monotone sequence is especially easy to determine.
Such a sequence may have no limit, but run away completely, like the
sequence
1, 2, 3, 4, ...,
where a_n = n, or the sequence
2, 3, 5, 7, 11, 13, ...,
where a_n is the nth prime number, p_n. In this case the sequence tends
to infinity. But if the terms of a monotone increasing sequence remain
bounded, that is, if every term is less than an upper bound B, known
in advance, then it is intuitively clear that the sequence must tend to a
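certain limit a which will be less than or at most equal to B. We
formulate this fact as the principle of monotone sequences: any monotone
increasing sequence that has an upper bound must converge to a limit.
(A similar statement holds for any monotone decreasing sequence with a
lower bound.)
Fig. 166. Monotone bounded sequence.
It is remarkable that the value of the limit a need not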
be given or known in advance; the theorem states that under the prescribed
conditions the limit exists. Of course, this theorem depends on
the introduction of irrational numbers and would otherwise not always
be true; for, as we have seen in Chapter II, any irrational number (such
a_1 = A_1.p_1p_2p_3...,
a_2 = A_2.q_1q_2q_3...,
a_3 = A_3.r_1r_2r_3...,
where the A_i are integers and the p_i, q_i, etc. are digits from 0 to 9. Now run
down the column of integers A_1, A_2, A_3, .... Since the sequence a_1, a_2,
a_3, ... is bounded, these integers cannot increase indefinitely, and since the
sequence is monotone increasing, the sequence of integers A_1, A_2, A_3, ... will
remain constant after attaining its maximum value. Call this maximum value A,
and suppose that it is attained at the N_0th row. Now run down the second
column p_1, q_1, r_1, ..., confining attention to the terms of the N_0th and
subsequent rows. If x_1 is the largest digit to appear in this column after the N_0th
row, then x_1 will appear constantly after its first appearance, which we may suppose to occur in the N_1th row, where N_1 ≥ N_0. For if the digit in this column
decreased at any time thereafter, the sequence a_1, a_2, a_3, ...
would not be monotone increasing. Next we consider the digits p_2, q_2, r_2, ... of the third column.
A similar argument shows that after a certain integer N_2 > N_1 the digits of the
third column are constantly equal to some digit x_2. If we repeat
this process
for the 4th, 5th, ... columns we obtain digits x_3, x_4, x_5, ... and corresponding
integers N_3, N_4, N_5, .... It is easy to see that the number
a = A.x_1x_2x_3x_4...
is the limit of the sequence a_1, a_2, a_3, .... For if ε is chosen ≥ 10^(−m), then for
all n ≥ N_m the integral part and first m places of digits after the decimal point
in a_n will coincide with those of a, so that the difference |a − a_n| cannot exceed
10^(−m). Since this can be done for any positive ε, however small, by choosing m
sufficiently large, the theorem is proved.
It is also possible to prove this theorem on the basis of any one of the other
definitions of real numbers given in Chapter II; for example, the definition by
nested intervals or by Dedekind cuts. Such proofs are to be found in most textbooks
on advanced calculus.
in Chapter II to
a = A.a_1a_2a_3....
Two such expressions cannot be added or multiplied in the ordinary way, starting
3. Euler's Number e
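Using the notation
n! = 1·2·3·4···n
for the product of the first n integers, we consider the sequence
a_1, a_2, a_3, ..., where
(4) a_n = 1 + 1/1! + 1/2! + ... + 1/n!.
The terms a_n form a monotone increasing sequence, since a_{n+1}
originates from a_n by the addition of the positive increment 1/(n + 1)!.
Moreover, the values of a_n are bounded above:
(5) a_n < B = 3.
For we have
1/s! = (1/2)(1/3)···(1/s) ≤ (1/2)(1/2)···(1/2) = 1/2^(s−1),
and hence
a_n ≤ 1 + 1 + 1/2 + 1/2² + ... + 1/2^(n−1) = 1 + 2(1 − (1/2)^n) < 3,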
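using the formula given on page 13 for the sum of the first n terms of a
geometric series. Hence, by the principle of monotone sequences, a_n
must approach a limit as n tends to infinity, and this limit we call e.
To express the fact that e = lim a_n, we may write e as the "infinite
series"
(6) e = 1 + 1/1! + 1/2! + 1/3! + ... + 1/n! + ....
This "equality," with a row of dots at the end, is simply another way
of expressing the content of the two statements
a_n = 1 + 1/1! + 1/2! + ... + 1/n!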
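and
a_n → e as n → ∞.
The series for e permits the calculation of e to any desired degree of
accuracy. For example, if Σ denotes the sum of the terms up to and
including 1/12!, then Σ = 2.71828183..., and the difference e − Σ satisfies
1/13! + 1/14! + ... < (1/13!)(1 + 1/13 + 1/13² + ...) = (1/13!)·(13/12) = 1/(12·12!).
This is so small that it cannot affect the ninth digit of Σ. Hence, allowing for a possible error in the last figure of the value given above, we
have e = 2.7182818, to eight digits.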
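The estimate above can be checked by exact arithmetic. The sketch below is an illustration added here, not part of the original text: it computes the partial sums a_n = 1 + 1/1! + ... + 1/n! as fractions and bounds the remainder by comparison with a geometric series, which gives e − a_n < 1/(n·n!). The cutoff n = 12 and the function names are assumptions chosen for this example.

```python
from fractions import Fraction
from math import factorial

def partial_sum(n):
    """a_n = 1 + 1/1! + 1/2! + ... + 1/n!, computed exactly."""
    return sum(Fraction(1, factorial(k)) for k in range(n + 1))

def tail_bound(n):
    """Upper bound for e - a_n: the remainder 1/(n+1)! + 1/(n+2)! + ...
    is less than (1/(n+1)!) * (1 + 1/(n+1) + 1/(n+1)^2 + ...) = 1/(n * n!)."""
    return Fraction(1, n * factorial(n))

n = 12  # assumed cutoff for this illustration
a_n = partial_sum(n)
print(float(a_n))          # decimal value of the partial sum
print(float(tail_bound(n)))  # guaranteed bound on the error
```

Because the bound for n = 12 is below 2·10^(−10), the first eight digits of the partial sum are already those of e.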
If e were a rational number p/q, multiplication of the series for e by q! would give
(7) e·q! = p·(q − 1)! = [q! + q! + 3·4···q + 4·5···q + ... + (q − 1)q + q + 1]
+ 1/(q + 1) + 1/((q + 1)(q + 2)) + ....
On the left side we obviously have an integer. On the right side, the term in
brackets is likewise an integer. The remainder of the right side, however, is a
positive number that is less than 1/2 and hence no integer. For q ≥ 2, and hence
the terms of the series 1/(q + 1) + 1/((q + 1)(q + 2)) + ... are respectively not greater than the
corresponding terms of the geometrical series 1/3 + 1/3² + 1/3³ + ..., whose
sum is (1/3)[1/(1 − 1/3)] = 1/2. Hence (7) presents a contradiction; the integer on
the left side cannot be equal to the number on the right side; for this latter number,
being the sum of an integer and a positive number less than 1/2, is not an
integer.
4. The Number π
sired, for it gives no information about the nature of π as a real number:
is it rational or irrational, algebraic or transcendental? As we have
π/4 = 1 − 1/3 + 1/5 − 1/7 + ...,
expressing π/4 as the limit for increasing n of the partial sums
s_n = 1 − 1/3 + 1/5 − ... + (−1)^n/(2n + 1).
the Eng!i~h
that
CONTINUED FRACTIONS
57/17 = 3 + 1/(2 + 1/(1 + 1/5)),

x² + 2x = 1,
x(2 + x) = 1,
x = 1/(2 + x),
x = 1/(2 + 1/(2 + x)),
and then
x = 1/(2 + 1/(2 + 1/(2 + x))),
and so on, so that after n steps we obtain the equation
x = 1/(2 + 1/(2 + ... + 1/(2 + x))) (n steps),
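The repeated substitution x = 1/(2 + x) can be followed numerically. In the sketch below (an illustration added here, not the book's own procedure) the innermost x is replaced by 0 after n steps; this truncation is an assumption made for the example, as is the function name. The positive root of x² + 2x = 1 is √2 − 1, and the truncated fractions are compared with it.

```python
from fractions import Fraction

def truncated_fraction(n, start=Fraction(0)):
    """Apply x -> 1/(2 + x) n times, beginning from `start`.
    With start = 0 this yields the n-step continued fraction with the
    innermost x dropped."""
    x = Fraction(start)
    for _ in range(n):
        x = 1 / (2 + x)
    return x

target = 2 ** 0.5 - 1  # positive root of x^2 + 2x = 1
for n in (1, 2, 3, 4, 8):
    approx = truncated_fraction(n)
    print(n, approx, float(approx) - target)
```

The first few values are 1/2, 2/5, 5/12, and the differences from √2 − 1 alternate in sign while shrinking rapidly.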
x² = ax + 1,
we obtain the expansion
x = a + 1/(a + 1/(a + 1/(a + ...))).
For a = 1 the positive root is
(1 + √5)/2 = 1 + 1/(1 + 1/(1 + 1/(1 + ...)))
+ -------------'~----1 + -----"--,----2+
--------~',-----
1+1
+ -----'-,-1+
General Definition
(x
= 0, where the
x~xa -1
= x+:a-x=
~.
Fi1<.l68 u
(x +z>)/%,
f(x) - 1 =
r1; for
tk,
f(x) - 1 =
10
More
11 <
v'
then
n 2: N."
In the case of a function f(x) of a continuous variable x as x tends to a
finite value x_1, we merely replace the "sufficiently large" n given by
lf(x)- al <
for all x
'.
x-'"Xt.
itself.
Xt;
a_n as
n → ∞, e.g. a_n = 1/n, we never substitute n = ∞ in the formula.
However, as x tends to x_1, f(x) may approach the limit a in such a
way that there are values x ≠ x_1 for which f(x) = a. For example, in
considering the function f(x) = x/x as x tends to 0 we never allow x to
equal 0, but f(x) = 1 for all x ≠ 0 and the limit a exists and is equal to 1
according to our definition.
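3. The Limit of (sin x)/x
symbol 0/0. The reader with access to a table of trigonometric functions will be able to compute the value of
(sin x)/x for small values of x. If y denotes the degree measure of an
angle, its radian measure is x = (π/180)y = 0.01745y, to 5 places.
From such a table we find:

angle    x        sin x    (sin x)/x
10°      0.1745   0.1736   0.9948
5°       0.0873   0.0872   0.9988
2°       0.0349   0.0349   1.0000
1°       0.0175   0.0175   1.0000

Although these values are only four-place approximations,
it would appear
that
(1) (sin x)/x → 1 as x → 0.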
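In place of a printed table, the same values can be produced by a few lines of Python. This is a sketch added for illustration, not part of the original text; the function name `ratio_table` is invented. Because the quotient is formed at full precision rather than from four-place table entries, its last digit may differ slightly from a value obtained by dividing rounded entries.

```python
import math

def ratio_table(degrees=(10, 5, 2, 1)):
    """Return rows (angle in degrees, x in radians, sin x, sin x / x)."""
    rows = []
    for deg in degrees:
        x = math.radians(deg)          # x = (pi/180) * degrees
        s = math.sin(x)
        rows.append((deg, x, s, s / x))
    return rows

for deg, x, s, ratio in ratio_table():
    print(f"{deg:>3} deg   x = {x:.4f}   sin x = {s:.4f}   (sin x)/x = {ratio:.6f}")
```

The quotients stay below 1 and move toward 1 as the angle shrinks, in line with the limiting relation (1).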
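for 0 < x < π/2 we have
1 < x/(sin x) < 1/(cos x),
or
(2) cos x < (sin x)/x < 1.
Now
1 − cos x = (1 − cos x)(1 + cos x)/(1 + cos x) = (1 − cos² x)/(1 + cos x) = sin² x/(1 + cos x) < sin² x.
Since sin x < x, this shows that
(3) 1 − cos x < x²,
or
1 − x² < cos x.
Together with (2) this yields
(4) 1 − x² < (sin x)/x < 1.
Although we have been assuming that 0 < x < π/2, this inequality is
also true for −π/2 < x < 0, since sin(−x)/(−x) = (sin x)/x, and
(−x)² = x².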
Er~rcises:
a~~x-o
For
<
~ =
y' ~.
~ : 08 x ->0
;;i:.
3) z(:n_x I} .
4)
ta:
:1~-~~.
8)
7)
1
l
9 );-tanx'
x.
5) !!n:tlaz.
IO)sin:z:-~
4. Limits as x → ∞
f(x) → a as x → ∞,
if, corresponding to each positive number ε, no matter how small, there can
be found a positive number K (depending on ε) such that
|f(x) − a| < ε
provided only that
|x| > K. (Compare with the corresponding definition on p. 305.)
In the case of the function f(x) = 1/x, for which a = 0, it suffices to
choose K = 1/ε, as the reader may at once verify.
1/x--->0.
relatinn~
hold:
2<
4.
si:
x __, 0
7
6<
8.
Defino "j(x)
-+ "' 9.11
x-+ "'."
There is one difference between the case of a function f(x) and a sequence a_n.
In the case of a sequence, n can tend to infinity only by increasing, but for a
function we may allow x to become infinite either positively or negatively. If it
is desired to restrict attention to the behavior of f(x) when x assumes large positive values only, we may replace the condition |x| > K by the condition x > K;
for large negative values of x we use the condition x < −K. To symbolize these
two methods of "one-sided" approach to infinity we write
x → +∞ or x → −∞
respectively.
0.
"
oa
= 0.
1 ~0 .
Then
1 ~,
Now let us assign any positive value toE, for example e ...
1~~ and 1f 00 .
{j
i~ byE=
Xi
= 0
defined for every positive number ε, for which I can prove that |x − x_1| < δ
implies always |f(x) − f(x_1)| < ε. In
case of the function u "" f(z) = ~
at the value x 1 .,. 0, the function~ =
was~ = ~Exerciaes: 1) Prove that sin x, cos x are continuous functionB.
2) Prove the continuity of 1/(l
x) and of VI+ if:
(p. 68)
To this end we consider tl1e interval I, a ::.:; x ::::; b, in which the func-
If at
haps more than anyone else, was responsible for the modern trend
towards rigor in mathematical analysis. This theorem states: If a function
f(x) is continuous in an interval I, a ≤ x ≤ b, including the endpoints a and b of the interval, then there must exist at least one point
in I where f(x) attains its largest value M, and another point where f(x)
attains its least value m. Intuitively speaking, this means that the
graph of the continuous function u = f(x) must have at least one highest
and one lowest point.
It is important to observe that the statement need not be true if the
function f(x) fails to be continuous at the endpoints of I. For example,
the function f(x) =
~has no largest
for rational x,
The existence of a least value m may be proved in the same way, or it follows
directly from what has already been proved, since the least value of f(x) is the
greatest value of g(x) = −f(x).
Weierstrass' theorem can be proved in a similar way for continuous functions
of two or more variables x, y, .... Instead of an interval with its endpoints we
have to consider a closed domain, e.g. a rectangle in the x, y-plane which includes
its boundary.
Exercise: Where, in the proofs of Bolzano's and Weierstrass' theorems, did
we use the fact that f(x) was assumed to be defined and continuous in the whole
closed interval a ≤ x ≤ b and not merely in the interval a < x ≤ b or a < x < b?
To prove this theorem we divide the interval I into two closed subintervals I' and I'' by marking the midpoint (a + b)/2 of I:
I': a ≤ x ≤ (a + b)/2,
I'': (a + b)/2 ≤ x ≤ b.
In at least one of these, which we may call I_1, there must lie infinitely
many terms x_n of the original sequence. Choose any one of these terms,
say x_{n_1}, and call it y_1. Now proceed in the same way with the interval I_1. Since there are infinitely many terms x_n in I_1, there must be
infinitely many terms in at least one of the halves of I_1, which we may
call I_2. Hence we can certainly find a term x_n in I_2 for which n > n_1.
+ BB' + CC',
where AA', etc. denotes the ordinary distance between the points A and A'.
Whenever there exists such a notion of "distance" in a set S we may define the
concept of a sequence of elements X_1, X_2, X_3, ... tending to a limit element
X of S. By this we mean that d(X, X_n) → 0 as n → ∞. We shall now say that
the set S is compact if from any sequence X_1, X_2, X_3, ... of elements of S we
can always extract a subsequence which tends to some limit element X of S. We
have shown in the preceding paragraph that a closed interval a ≤ x ≤ b is compact
in this sense. Hence the concept of a compact set may be regarded as a generalization of a closed interval of the number axis. Note that the number axis as a whole is
not compact, since the sequence of integers 1, 2, 3, 4, 5, ... neither tends to a limit
nor contains any subsequence that does. Nor is an open interval such as
0 < x < 1, not including its endpoints, compact, since the sequence 1/2, 1/3,
1/4, ... or any subsequence of it tends to the limit 0, which is not a point of
the open interval. In the same way it may be shown that the region of the plane
consisting of the points interior to a square or rectangle is not compact, but becomes
so if the boundary points are added. Furthermore, the set of all
triangles whose vertices lie within or on the circumference of a given circle is
compact.
We may also extend the
u = F(X) is any
always exists an element of S for
one for which it attains its smallest value.
The proof is simple once one has grasped the general concepts involved, but
we shall not go further into this subject. It will appear in Chapter VII that
the general theorem of Weierstrass is of great importance in the theory of maxima
and minima.
GEOMETRICAL APPLICATIONS
A_1 + A_2 = A_3 + A_4
and
A_2 + A_3 = A_1 + A_4,
from which it follows, on subtracting the second equation from the first,
that
A_1 − A_3 = A_3 − A_1,
i.e.
A_1 = A_3,
and hence
A_2 = A_4.
Thus if we can show the existence of an angle α such that for l_α
A_1(α) = A_2(α),
then our theorem will be proved, since for such an angle all four areas
will be equal. To do this, we define a function y = f(x) by drawing l_x
and setting
f(x) = A_1(x) − A_2(x).
For x = 0°, f(0) = A_1(0) − A_2(0) may be positive. In that case, for
x = 90°, f(90) = A_1(90) − A_2(90) = A_2(0) − A_3(0) = A_2(0) − A_1(0) will be
negative. Therefore, since f(x) varies continuously as x increases from
0° to 90°, there will be some value α between 0° and 90° for which
f(α) = A_1(α) − A_2(α) = 0. The lines l_α and l_{α+90} then divide the area
into four equal pieces.
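The existence argument above is not constructive, but the same sign-change idea underlies the bisection method for locating such an angle numerically. The sketch below is added for illustration and is not part of the original text. Since the areas A_1(x) and A_2(x) depend on the particular region, a purely hypothetical stand-in `toy_difference` is used for f(x) = A_1(x) − A_2(x); it shares only the two properties the argument needs, continuity and f(90) = −f(0).

```python
import math

def bisect_root(f, lo, hi, tol=1e-10):
    """Locate a zero of a continuous f on [lo, hi], given a sign change,
    by repeatedly halving the interval."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0:
        return lo
    if f_hi == 0:
        return hi
    if f_lo * f_hi > 0:
        raise ValueError("f must take opposite signs at lo and hi")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if f_mid == 0:
            return mid
        if f_lo * f_mid < 0:
            hi = mid                 # the sign change lies in the left half
        else:
            lo, f_lo = mid, f_mid    # the sign change lies in the right half
    return (lo + hi) / 2

def toy_difference(x_degrees):
    """Hypothetical stand-in for A_1(x) - A_2(x): continuous, with f(90) = -f(0)."""
    return math.cos(math.radians(2 * x_degrees))

alpha = bisect_root(toy_difference, 0.0, 90.0)
print(alpha)
```

For a real region one would replace `toy_difference` by a routine that computes the two areas cut off by the perpendicular bisecting lines at angle x.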
It is interesting to observe that these problems may be generalized
to three and higher dimensions. In three dimensions the first problem
becomes: Given three volumes in space, to find a plane which bisects
all three simultaneously. The proof that this is always possible again
depends on Bolzano's theorem. In more than three dimensions the
theorem is still true but the proof requires more advanced methods.
*2. Application to a Problem of Mechanics
We shall conclude this section by discussing an apparently difficult
problem in mechanics that is easily answered by an argument based on
continuity concepts. (This problem was suggested by H. Whitney.)
Suppose a train travels from station A to station B along a straight
section of track. The journey need not be of uniform speed or acceleration. The train may act in any manner, speeding up, slowing down,
coming to a halt, or even backing up for a while, before reaching B.
But the exact motion of the train is supposed to be known in advance;
that is, the function s = f(t) is given, where s is the distance of the train
from station A, and t is the time, measured from the instant of departure.
On the floor of one of the cars a rod is pivoted so that it may move without friction either forward or backward until it touches the floor. If it
does touch the floor, we assume that it remains on the floor henceforth;
this will be the case if the rod does not bounce. Is it possible to place
the rod in such a position that, if it is released at the instant when the
train starts and allowed to move solely under the influence of gravity
and the motion of the train, it will not fall to the floor during the entire
journey from A to B?
It might seem quite unlikely that for any given schedule of motion
the interplay of gravity and reaction forces will always permit such
a maintenance of balance under the single condition that the initial position of the rod is suitably chosen. Yet we state that such a position
always exists.
Paradoxical as this assertion might seem at first sight, it can be proved
easily once one concentrates on its essentially topological character.
No detailed knowledge of the laws of dynamics is needed; only the
following simple assumption of a physical nature need be granted: The
motion of the rod depends continuously on its initial position. Let us
characterize the initial position of the rod by the initial angle x which it
makes with the floor, and by y the angle which the rod makes with the
floor at the end of the journey, when the train reaches the point B. If
the rod has fallen to the floor we have either y = 0 or y = π. For a
given initial position x the end position y is, according to our assumption, uniquely determined as a function y = g(x) which is continuous
and has the values y = 0 for x = 0 and y = π for x = π (the latter
assertion simply expressing that the rod will remain flat on the floor if
it starts in this position). Now we recall that g(x), as a continuous
function in the interval 0 ≤ x ≤ π, assumes all the values between g(0) =
0 and g(π) = π; consequently, for any such values y, e.g. for the value
y = π/2, there exists a specific value of x such that g(x) = y; in particular,
there exists an initial position for which the end position of the rod at
B is perpendicular to the floor. (Note: In this argument it should not
be forgotten that the motion of the train is fixed once for all.)
Of course, the reasoning is entirely theoretical. If the journey is of
long duration or if the train schedule, expressed by s = f(t), is very
erratic, then the range of initial positions x for which the end position
Exercises: 1. Using the theorem of page 315, show that the reasoning above
may be generalized to the case where the journey is of infinite duration.
2. Generalize to the case where the motion of the train is along any curve in
the plane and the rod may fall in any direction. (Hint: It is not possible to map
a circular disk continuously onto its circumference alone by a mapping which
leaves every point of the circumference fixed (see p. 255).)
3. Show that the time required for the rod to fall to the floor, if the car is
stationary and the rod is released at an angle ε from the vertical position, tends
to infinity as ε tends to zero.
SUPPLEMENT TO CHAPTER VI
MORE EXAMPLES ON LIMITS AND CONTINUITY
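(2) (1 + h)^n ≥ 1 + nh > nh.
We set q = 1 + h, where h > 0, so that
q^n = (1 + h)^n > nh.
If k is any positive number, no matter how large, then for all n > k/h
q^n > nh > k;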
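3. The Limit of ⁿ√p
The sequence a_n = ⁿ√p, for any fixed positive number p, has the limit 1:
(3) ⁿ√p → 1 as n → ∞.
(By the symbol ⁿ√p we mean, as always, the positive nth root. For
negative numbers p there are no real nth roots when n is even.)
To prove (3), suppose first that p > 1; then ⁿ√p is also greater than 1. For
we may set
ⁿ√p = 1 + h_n,
where h_n is a positive quantity depending on n; the inequality (2)
then shows that
p = (1 + h_n)^n > n·h_n,
so that
0 < h_n < p/n.
Since p/n tends to 0 as n increases, h_n also tends to 0, and the relation (3) is
obtained.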
Incidentally, we have derived an estimate for the difference h_n between
ⁿ√p and 1; this difference must always be less than p/n.
If 0 < p < 1, then ⁿ√p < 1, and we may set
ⁿ√p = 1/(1 + h_n),
where h_n is again a positive number depending on n.
It follows that
p = 1/(1 + h_n)^n < 1/(n·h_n),
so that
0 < h_n < 1/(np).
From this we conclude that h_n tends to 0 as n increases. Hence, since
ⁿ√p = 1/(1 + h_n), it follows that ⁿ√p → 1.
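The two estimates just derived, h_n < p/n for p > 1 and h_n < 1/(np) for 0 < p < 1, can be checked numerically. The sketch below is an illustration added here, not part of the original text; the function names are invented, and floating-point roots stand in for exact ones.

```python
def excess(p, n):
    """h_n defined by p**(1/n) = 1 + h_n if p > 1,
    and by p**(1/n) = 1/(1 + h_n) if 0 < p < 1."""
    root = p ** (1.0 / n)
    return root - 1.0 if p > 1 else 1.0 / root - 1.0

def bound(p, n):
    """The estimate from the text: p/n for p > 1, 1/(n*p) for 0 < p < 1."""
    return p / n if p > 1 else 1.0 / (n * p)

for p in (10.0, 0.01):
    for n in (1, 10, 100, 1000):
        print(p, n, excess(p, n), bound(p, n))
```

In every row the computed h_n lies strictly between 0 and the bound, and both shrink as n grows, so the nth root is squeezed toward 1.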
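The equalizing effect of nth root extraction, which tends to push
every positive number towards 1 as n increases, is even strong enough
to do this in some cases if the radicand does not remain constant. We
shall prove that the sequence 1, √2, ∛3, ⁴√4, ⁵√5, ... tends to 1,
i.e. that
ⁿ√n → 1
as n increases. This again follows from the inequality (1 + h)^n > nh,
applied not to the nth root of n but to the nth root of √n. If we set
ⁿ√(√n) = 1 + k_n,
where k_n is a positive quantity depending on n (for n > 1), then
√n = (1 + k_n)^n > n·k_n,
so that
k_n < √n/n = 1/√n.
Hence
1 < ⁿ√n = (1 + k_n)² = 1 + 2k_n + k_n² < 1 + 2/√n + 1/n.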
y;;,
lim j.(x).
1-+~XOJ~.
= 1 and hencej,.(x)
= 1/2
[1
fo, i xI< I,
=~1/2for[xl
f(x) =lim
\0
= 1,
IX I >
fM
I.
f()
2+ .fTX2
x
"X =X
+ ~~-2
:l
+ +_x_'_
(1 + i)""
IT?'
has such a law of formation; each term after the first is formed by taking
the square root of 1 plus its predecessor. Thus the formula
a_1 = 1, a_{n+1} = √(1 + a_n)
defines the whole sequence. Let us find its limit. Obviously a_n is
greater than 1 for n > 1. Furthermore a_n is a monotone increasing
sequence, for
a_{n+1}² − a_n² = (1 + a_n) − (1 + a_{n−1}) = a_n − a_{n−1}.
Hence whenever a_n > a_{n−1} it will follow that a_{n+1} > a_n. But we know
that a_2 − a_1 = √2 − 1 > 0, from which we conclude by mathematical
induction that a_{n+1} > a_n for all n, i.e. that the sequence is monotone
increasing. Moreover it is bounded; for by the previous results we
have
a_{n+1} = (1 + a_n)/a_{n+1} < (1 + a_{n+1})/a_{n+1} = 1 + 1/a_{n+1} < 2.
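The iteration a_1 = 1, a_{n+1} = √(1 + a_n) is easy to run. The sketch below is an illustration added here, not part of the original text. It relies on one step stated as an assumption of this note: if the sequence has a limit a, then passing to the limit in the recursion gives a = √(1 + a), i.e. a² = a + 1, whose positive root is (1 + √5)/2. The function name is invented.

```python
import math

def iterate_sqrt(n_terms):
    """Return the first n_terms of a_1 = 1, a_{n+1} = sqrt(1 + a_n)."""
    terms = [1.0]
    for _ in range(n_terms - 1):
        terms.append(math.sqrt(1.0 + terms[-1]))
    return terms

terms = iterate_sqrt(30)
candidate = (1.0 + math.sqrt(5.0)) / 2.0  # positive root of a^2 = a + 1
for n in (1, 2, 3, 5, 10, 30):
    print(n, terms[n - 1], candidate - terms[n - 1])
```

The printed terms increase, stay below 2, and the gap to (1 + √5)/2 shrinks by roughly a constant factor at each step.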
+ v'n).J
2) Find the limit of v'n + a
yl7ii+b
3) Find the limit of v'n +an+ b- n.
4) Find the Hmit of ::j;l +1~+--y;~
the limit of Vn+~l is 1.
6) WhRt is
limit of va~~+-s;; if I}_ > I>> 0?
7) What is the limit of Va".+h;;~+C if a> b > c > o?
8) What i~ thl' limit of-\la;;b"--+a-;;;;;-+ ~"if a> b
9) Wr
later (p. 449) that. e = lim (I
lim (I+
5) Prove
O?
What then is
2. EXAMPLE ON CONTINUITY
To give a precise proof of the continuity of a function requires the
explicit verification of the definition of page 310. Sometimes this is a
lengthy procedure, and therefore it is fortunate that, as we shall see in
Chapter VIII, continuity is a consequence of differentiability. Since
the latter will be established systematically for all elementary functions,
we may follow the usual course of omitting tedious individual proofs of
continuity. But as a further illustration of the general definition we
shall analyze one further example, the function f(x) = 1/(1 + x²). We
may restrict x to a fixed interval |x| ≤ M, where M is an arbitrarily
selected number. Writing
f(x_1) − f(x) = 1/(1 + x_1²) − 1/(1 + x²) = (x² − x_1²)/((1 + x²)(1 + x_1²)) = (x − x_1)(x + x_1)/((1 + x²)(1 + x_1²)),
we find for |x| ≤ M and |x_1| ≤ M
|f(x_1) − f(x)| ≤ |x − x_1||x + x_1| ≤ |x − x_1|·2M.
Hence it is clear that the difference on the left side will be smaller than any
positive number ε
if only
|x_1 − x| < δ = ε/2M.
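The choice δ = ε/2M can be tested on sample points. The sketch below is an illustration added here, not part of the original text; the function names and the use of seeded pseudo-random sampling are assumptions of this example, and a passing check is only evidence, not a proof.

```python
import random

def f(x):
    """The function under discussion, f(x) = 1/(1 + x^2)."""
    return 1.0 / (1.0 + x * x)

def delta_for(eps, M):
    """The delta from the text for the interval |x| <= M."""
    return eps / (2.0 * M)

def check(eps, M, trials=10000, seed=0):
    """Sample pairs x, x1 in [-M, M] with |x1 - x| < delta and report
    whether |f(x1) - f(x)| < eps held for every sampled pair."""
    rng = random.Random(seed)
    d = delta_for(eps, M)
    for _ in range(trials):
        x = rng.uniform(-M, M)
        x1 = min(M, max(-M, x + rng.uniform(-d, d)))
        if abs(x1 - x) < d and abs(f(x1) - f(x)) >= eps:
            return False
    return True

print(check(eps=0.01, M=5.0))
```

The bound is far from sharp, because the denominator (1 + x²)(1 + x_1²) was simply replaced by 1; a larger δ would also work, but the definition of continuity only asks that some δ exist.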
CHAPTER VII
MAXIMA AND MINIMA
2. Heron's Theorem.
and from there to Q. This would give a path PRSQ (see Fig. 179)
determined in a manner similar to the previous path PRSQ. The length
of the first path may be greater than, equal to, or less than that of the
second.
*Exercise: Show that the first path is smaller than the second if O and R lie
on the same side of the line PQ. When will the two paths be of equal length?
Fig. 180. Triangle of minimum perimeter.
point R such that the distance from R to the line PQ is equal to the
given h, and such that the sum a + b is a minimum. From the first
condition it follows that R must lie on the line parallel to PQ at a distance h. The answer is given by Heron's theorem for the special case
where P and Q are equally distant from L: the required triangle PRQ
is isosceles.
b) In a triangle let one side c and the sum a + b of the two other
sides be given; to find among all such triangles the one with the largest
area. This is just the converse of problem a). The solution is again
the isosceles triangle for which a = b. As we have just shown, this
triangle has the minimum value of a + b for its area; that is, any
other triangle with the base c and the same area has a greater value
of a + b. Moreover, it is clear from a) that any triangle with base c
and an area greater than that of the isosceles triangle also has a greater
value of a + b. Hence any other triangle with the same values of a + b
and of c must have a smaller area, so that the isosceles triangle provides
the maximum area for given c and a + b.
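Problem a) lends itself to a quick numerical check. In the sketch below (an illustration added here, not part of the original text) the base PQ of length c is placed on the x-axis with P = (0, 0) and Q = (c, 0), and the apex R = (t, h) slides along the parallel at height h, so that the area ch/2 stays fixed. This coordinate setup, the grid search and the function names are assumptions of the example.

```python
import math

def side_sum(t, c, h):
    """a + b for the triangle with P = (0, 0), Q = (c, 0), R = (t, h)."""
    return math.hypot(t, h) + math.hypot(c - t, h)

def best_apex(c, h, steps=2000):
    """Grid search for the apex position t in [-c, 2c] minimizing a + b."""
    candidates = [c * i / steps for i in range(-steps, 2 * steps + 1)]
    return min(candidates, key=lambda t: side_sum(t, c, h))

c, h = 3.0, 2.0
t_min = best_apex(c, h)
print(t_min, side_sum(t_min, c, h))
```

The search settles on t = c/2, the isosceles triangle, in agreement with Heron's theorem; moving the apex to either side along the parallel lengthens a + b.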
Corresponding
segment of L lying inside the ellipse; for each point of this segment
p + q would be less than 2a, since it is easily seen that p + q is less
than 2a inside the ellipse and greater than 2a outside. Since we know
that p + q ≥ 2a on L, this is impossible. Hence L must be tangent
to the ellipse at R. But we know that PR and RQ make equal angles
with L; hence we have incidentally proved the important theorem: A
tangent to an ellipse makes equal angles with the lines joining the foci
to the point of tangency.
Fig. 182. |PR − QR| = maximum.

Closely related to the foregoing discussion is the following problem:
Given a straight line L and two points P and Q on opposite sides of L
(see Fig. 182), to find a point R on L such that the quantity |p − q|,
that is, the absolute value of the difference of the distances from P and
Q to R, is a maximum. (We shall assume that L is not the perpendicular
bisector of PQ; for then p − q would be zero for every point R on L
and the problem would be meaningless.) To solve this problem, we
first reflect P in L, obtaining the point P' on the same side of L as Q.
For any point R' on L, we have p = R'P = R'P', q = R'Q. Since R',
Q, and P' can be regarded as the vertices of a triangle, the quantity
|p − q| = |R'P' − R'Q| is never greater than P'Q, for the difference
between two sides of a triangle is never greater than the third side. If
R', P', and Q all lie on a straight line, |p − q| will be equal to P'Q,
as is seen from the figure. Therefore the desired point R is the intersection
of L with the line through P' and Q. As in the previous case,
it is easily seen that
the angles which RP and RQ make with L are equal,
since the triangles
is connected with
I
Fig. 183. Tangent property of hyperbola.
A point R on C for which the distance PR has its smallest or its largest
value must be such that the line PR is perpendicular to the tangent to
C at R; in other words, PR is perpendicular to C. The proof is as
follows: the circle with center at P and passing through R must be
The problems in Article 4 concerning the sum or difference of distances can now be generalized. Consider, instead of a straight line L,
a simple closed curve C with a tangent at every point, and two points,
P and Q, not on C. We wish to characterize the points on C for which
the sum, p + q, and the difference, p − q, take on their extreme values,
where p and q denote the distances from any point on C to P and Q
respectively. No use can be made of the simple construction of reflection
with which we solved the problems for the case where C is a straight
line. But we
use the properties of che ellipse and hyperbola to
Since C is a closed curve and no longer a
line
both the minimum and maximum problems
be taken as
that the quantities
p + q is a maximum,
sider the ellipse with foci at P and
which p + q = 2a.
is left as an exercise for the
PR and QR make equal
the ellipse at R; since the ellipse is
tangent to C at R, the lines PR and QR must also make equal angles
with C at R. If p + q is a minimum for R, we see in the same way that
PR and QR make equal angles with C at R. Thus we have the theorem:
Given a closed curve C and two points P and Q on the same side of C;
then at a point R of C where the sum p + q takes on its greatest or
least value on C, the lines PR and QR make equal angles with the curve
C (i.e. with its tangent) at R.
If P is inside C and Q outside, this theorem also holds for the greatest
value of p + q, but fails for the least value, since the ellipse degenerates
into a straight line.
[Figure: ... + QR.]
those of Q
then
$p = \sqrt{(x - x_1)^2 + (y - y_1)^2}$,
$q = \sqrt{(x - x_2)^2 + (y - y_2)^2}$,
$f(x, y) = p + q$.
extreme value a,
2. Examples
The results of the preceding section are easily seen to be special cases
of this general theorem. If p + q is to have an extreme value, the
function f(x, y) is p + q, and the curves f(x, y) = c are the confocal
ellipses with foci P and Q. As predicted by the general theorem, the
ellipses passing through the points on C where f(x, y) takes on its extreme
values were seen to be tangent to C at these points. In the case
where the extrema of p - q are sought, the function f(x, y) is p - q,
the curves f(x, y) = c are the confocal hyperbolas with P and Q as their
foci, and the hyperbolas passing through the points of extreme value
of f(x, y) were seen to be tangent to C.
[Figure: confocal ellipses.]
subtends the same angle at all points of the circumference on the same side
of the chord. As is seen from Figure 190, two of these circles will, in general, be tangent to L, with centers on opposite sides of PQ. One of the
points of tangency gives the absolute maximum for θ, while the other
point gives a "relative" maximum (that is, the value of θ will be less
in a neighborhood of this point than at the point itself). The
greater of the two maxima, the absolute maximum, is given by that
point of tangency which lies in the acute angle formed by the extension
of PQ and L, and the smaller one by the point which lies in the obtuse
angle formed by these two lines. (The point where the extension of the
segment PQ intersects L gives the minimum value of θ, zero.)
As a generalization of this problem we may replace L by a curve C
and seek the point R on C at which a given line segment PQ (not intersecting C) subtends the greatest or least angle. Here again, the circle
through P, Q, and R must be tangent to C at R.
represents only one side of the case, for the vitality of mathematics
depends most decidedly on the individual color of problems and methods.
In its historic development, the differential calculus was strongly influenced by individual maximum and minimum problems. The connection between extrema and the differential calculus arises as follows.
In Chapter VIII we shall make a detailed study of the derivative f'(x)
of a function f(x) and of its geometrical meaning. In brief, the derivative f'(x) is the slope of the tangent to the curve y = f(x) at the point
(x, y). It is geometrically evident that at a maximum or minimum of
a smooth curve y = f(x) the tangent to the curve must be horizontal,
that is, its slope must be equal to zero. Thus we have the condition
f'(x) = 0 for the extreme values of f(x).
To see what the vanishing of f'(x) means, let us examine the curve
of Figure 191. There are five points, A, B, C, D, E, at which the tangent
Fig. 193. The corresponding contour map.
Figure 192 two mountains A and B on a range and two points C and D
on different sides of the mountain range, and suppose that we wish to
go from C to D. Let us first consider only the paths leading from C to D
obtained by cutting the surface with some plane through C and D.
Each such path will have a highest point. By changing the position of
the plane, we change the path, and there will be one path CD for which
C to C' in such a way that one's path does not rise higher than necessary.
Exercise: Repeat the reasoning with the other type L' of closed curve on C
that cannot be contracted to a point, as in Figure 196.
Hermann Amandus Schwarz (1843-1921) was a distinguished mathematician of the University of Berlin and one of the great contributors
to modern function theory and analysis. He did not disdain to write
on elementary subjects, and one of his papers treats the following
problem: Given an acute-angled triangle, to inscribe in it another
triangle with the least possible perimeter. (By an inscribed triangle
we mean one with a vertex on each side of the original triangle.) We
shall see that there is exactly one such triangle, and that its vertices
are the foot-points of the altitudes of the given triangle. We shall call
this triangle the altitude triangle.
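The minimum property of the altitude triangle can be probed numerically. The following is only a sketch under stated assumptions: the triangle coordinates, the grid resolution n, and the helper names foot, perimeter and point_on_side are all hypothetical choices made for illustration. It compares the perimeter of the altitude triangle with the perimeters of inscribed triangles whose vertices run over a grid on the three sides.

```python
import math

def foot(p, a, b):
    """Foot of the perpendicular dropped from p onto the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    return (a[0] + t * dx, a[1] + t * dy)

def perimeter(u, v, w):
    """Perimeter of the triangle with vertices u, v, w."""
    return math.dist(u, v) + math.dist(v, w) + math.dist(w, u)

def point_on_side(a, b, t):
    """Point dividing the segment from a to b in the ratio t : (1 - t)."""
    return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))

# Hypothetical acute-angled triangle (an assumption for this sketch).
A, B, C = (0.0, 0.0), (4.0, 0.0), (1.5, 3.0)

# Altitude triangle: the foot-points of the three altitudes.
altitude_perimeter = perimeter(foot(A, B, C), foot(B, C, A), foot(C, A, B))

# Inscribed triangles with one vertex on each side, sampled on a grid.
n = 40
best_on_grid = min(
    perimeter(point_on_side(B, C, i / n),
              point_on_side(C, A, j / n),
              point_on_side(A, B, k / n))
    for i in range(n + 1) for j in range(n + 1) for k in range(n + 1)
)

print(altitude_perimeter, best_on_grid)
```

A grid search of this kind proves nothing; it merely illustrates that no sampled inscribed triangle has a smaller perimeter than the altitude triangle.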
perpendiculars from B to QP, QR, and PR, thus obtaining the points
L, M, and N. Then QL and QM are the projections of the altitude QB
on the lines QP and QR respectively. Consequently, QL + QM < 2QB.
Now QL + QM equals p, the perimeter of the altitude triangle. For
triangles MRB and NRB are congruent, since angles MRB and NRB
are equal, and the angles at M and N are right angles. Hence
RM = RN; therefore QM = QR + RN. In the same way, we see that
PN = PL, so that QL = QP + PN. We therefore have QL + QM =
QP + QR + PN + NR = QP + QR + PR = p. But we have
shown that 2QB > QL + QM. Therefore p is less than twice the
altitude QB; by exactly the same argument, p is less than twice any
3. Obtuse Triangles
In both of the foregoing proofs it has been assumed that the angles
A, B, and C are all acute. If, say, C is obtuse, as in Figure 202, the
points P and Q will lie outside the triangle. Therefore the altitude
triangle can no longer, strictly speaking, be said to be inscribed in the
triangle, unless by an inscribed triangle we merely mean one whose
vertices are on the sides or on the extensions of the sides of the original
triangle. At any rate, the altitude triangle does not now give the
minimum perimeter, for PR > CR and QR > CR; hence p = PR +
QR + PQ > 2CR. Since the reasoning in the first part of the last
proof showed that the minimum perimeter, if not given by the altitude
triangle, must be twice an altitude, we conclude that for obtuse triangles
the "inscribed triangle" of smallest perimeter is the shortest altitude
counted twice, although this is not properly a triangle. Still, one can
find a proper triangle whose perimeter differs from twice the altitude by
as little as we please. For the boundary case, the right triangle, the
two solutions, the shortest altitude and the altitude triangle, coincide.
Fig. 202. Altitude triangle for obtuse triangle.
The interesting question whether the altitude triangle has any sort
curve C, where the wall C is supposed to act as a perfect mirror, reflecting
the otherwise free particle at the same angle at which it hits the boundary. For example, a rectangular box (an idealized billiard table with
perfect reflection and a mass point as billiard ball) leads in general to
an ergodic path; the ideal billiard ball going on for ever will reach the
vicinity of every point, except for certain singular initial positions and
directions. We omit the proof, although it is not difficult in principle.
Of
interest is the case of an elliptical table with the foci
F1 and F2. Since the tangent to an ellipse makes equal angles with
the lines joining the point of tangency to the two foci, every trajectory
through a focus will be reflected through the other focus, and so on.
It is not hard to see that, irrespective of the initial direction, the trajectory after n reflections tends with increasing n to the major axis F1F2.
then there are two possibilities
then all the reflected rays will
to a certain hyperbola
If the initial ray does not separate F1 and F2,
5. STEINER'S PROBLEM
1. Problem and Solution
Fig. 208. Least sum of distances to three points.
subtends an angle of 120°. If, however, an angle of ABC, e.g. the angle
at C, is equal to or larger than 120°, then the point P coincides with the
vertex C.
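The 120° property of the minimum point can be illustrated numerically. The sketch below does not follow the geometric argument of the text; it swaps in a standard fixed-point iteration (Weiszfeld's method) to locate the point of least total distance and then measures the angles subtended there. The triangle coordinates, the iteration count and all function names are assumptions chosen for illustration, and the triangle is picked so that every angle is below 120°.

```python
import math

def total_distance(p, pts):
    """Sum of the distances from p to the given points."""
    return sum(math.dist(p, q) for q in pts)

def steiner_point(pts, iters=2000):
    """Weiszfeld-style iteration for the point minimizing the sum of distances."""
    x = sum(q[0] for q in pts) / len(pts)   # start at the centroid
    y = sum(q[1] for q in pts) / len(pts)
    for _ in range(iters):
        wsum = xs = ys = 0.0
        for qx, qy in pts:
            d = math.hypot(x - qx, y - qy)
            if d < 1e-15:                   # iterate landed on a given point
                return (qx, qy)
            wsum += 1.0 / d
            xs += qx / d
            ys += qy / d
        x, y = xs / wsum, ys / wsum         # distance-weighted average
    return (x, y)

def angle_at(p, a, b):
    """Angle in degrees subtended at p by the segment ab."""
    v1 = (a[0] - p[0], a[1] - p[1])
    v2 = (b[0] - p[0], b[1] - p[1])
    cos_angle = (v1[0] * v2[0] + v1[1] * v2[1]) / (math.hypot(*v1) * math.hypot(*v2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))

# Hypothetical triangle with all angles below 120 degrees.
A, B, C = (0.0, 0.0), (4.0, 0.0), (1.0, 3.0)
P = steiner_point([A, B, C])
angles = [angle_at(P, A, B), angle_at(P, B, C), angle_at(P, C, A)]
print(P, angles)
```

For a triangle with an angle of 120° or more the same iteration would be expected to drift toward the obtuse vertex, in line with the second half of the statement, but that case is not exercised here.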
It is an easy matter to obtain this solution if we use our previous
results concerning extrema. Suppose P is the required minimum point.
There are these alternatives: either P coincides with one of the vertices
A, B, C, or P differs from these vertices. In the first case it is clear that
P must be the vertex of the largest angle C of ABC, because the sum
CA + CB is less than any other sum of two sides of the triangle ABC.
Thus, to complete the proof of our statement, we must analyze the
second case. Let K be the circle with radius c around C. Then P
must be the point on K such that PA + PB is a minimum. If A and B
+b+c~
AB
+ AC,
Figure 211. In this case there is no point P from which all three sides
subtend 120°. However, K1 and K2 determine at their intersection a
point P' from which AC and BC subtend angles of 60° each, while the
side AB opposite the obtuse angle subtends 120°.
For a triangle ABC having an angle greater than 120° there is, then,
no point at which each side subtends 120°. Hence the minimum point P
must coincide with a vertex, since that was shown to be the only other
alternative, and this must be the vertex at the obtuse angle. If, on the
other hand, all the angles of a triangle are less than 120°, we have seen
that a point P can be constructed from which each side subtends 120°.
But to complete the proof of our theorem we have yet to show that
a + b + c will actually be less here than if P coincided with any vertex,
for we have only shown that P gives a minimum if the smallest total
distance PA_i. (For four points, arranged as in Fig. 215, the point P
selves here with pointing out the answer in the typical cases shown in
Figures 216-8. In the first case the solution consists of five segments with
two multiple intersections where three segments meet at angles of 120°.
In the second case the solution contains three multiple intersections. If
the points are differently arranged, figures such as these may not be
possible. One or more of the multiple intersections may degenerate
and be replaced by one or more of the given points, as in the third case.
In the case of n given points, there will be at most n - 2 multiple
intersections, at each of which three segments meet at angles of 120°.
The solution of the problem is not always uniquely determined. For
four points A, B, C, D forming a square we have the two equivalent
solutions shown in Figures 219-20. If the points A_1, A_2, ..., A_n are
the vertices of a simple polygon with sufficiently flat angles, then the
polygon itself will give the minimum.
We begin with a
which occurs very often
in pure mathematics
its
In geometrical language it
amounts to the following: Among all rectangles with a prescribed perimeter, to find the one with largest area. The solution, as one might
expect, is the square. To prove this we reason as follows. Let 2a be
the prescribed perimeter of the rectangle. Then the fixed sum of the
lengths x and y of two adjacent edges is x + y, while the variable area
xy is to be made as large as possible. The "arithmetical mean" of x
and y is simply
$m = \frac{x + y}{2}$.
We shall also introduce the quantity
$d = \frac{x - y}{2}$,
so that
$x = m + d, \quad y = m - d$,
and therefore
$xy = (m + d)(m - d) = m^2 - d^2 = \frac{(x + y)^2}{4} - d^2$.
Since $d^2$ is greater than zero except when d = 0, we obtain immediately the inequality
(1) $\sqrt{xy} \le \frac{x + y}{2}$,
where the equality sign holds only when d = 0 and x = y = m.
Since x + y is fixed, it follows that $\sqrt{xy}$, and therefore the area xy,
is a maximum when x = y. The expression
$g = \sqrt{xy}$,
expression
is necessarily non-negative.
A geometrical derivation of
considering the fixed straight line $x + y = 2m$
the family of curves xy = c, where c is constant for each of these curves
(hyperbolas) and varies from curve to curve. As is evident from Figure
221, the curve with the greatest value of c having a point in common
with the given straight line will be the hyperbola tangent to the line at
the point x = y = m; for this hyperbola, therefore, $c = m^2$. Hence
$xy \le \left(\frac{x + y}{2}\right)^2$.
2. Generalization to n Variables
and geometrical means
number n of positive
$\sqrt[n]{x_1 x_2 \cdots x_n}$
The
m,
start
with
$x_1 = a_1, \ldots, x_n = a_n$.
$x_1 = s, \quad x_2 = s, \quad x_3 = a_3, \ldots, x_n = a_n$,
where
$s = \frac{a_1 + a_2}{2}$.
$a_1 = s + d, \quad a_2 = s - d$,
where
$(s + d)(s - d)\,a_3 \cdots a_n = (s^2 - d^2)\,a_3 \cdots a_n < P'$,
is any one of the a's; it follows that all the a's are equal. Since g = m
when all the x_i are equal, and since we have shown that only this gives
the maximum value of g, it follows that g < m otherwise, as stated in
the theorem.
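The inequality between the geometrical and arithmetical means, and the equalizing step used in its proof, can be checked on a few numbers. This is a small sketch only; the sample values and the function names are hypothetical and chosen for illustration.

```python
import math

def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def geometric_mean(xs):
    """n-th root of the product, computed through logarithms (xs must be positive)."""
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

def equalize_first_two(xs):
    """Replace the first two values by their common mean s; the sum is unchanged."""
    s = (xs[0] + xs[1]) / 2
    return [s, s] + list(xs[2:])

# Hypothetical sample values, all positive.
samples = [[1.0, 9.0], [2.0, 3.0, 7.0], [0.5, 0.5, 8.0, 1.0], [4.0, 4.0, 4.0]]
for xs in samples:
    print(xs, geometric_mean(xs), arithmetic_mean(xs))

# One equalizing step: same sum, larger product whenever the two values differ.
before = [2.0, 3.0, 7.0]
after = equalize_first_two(before)
print(math.prod(before), math.prod(after))
```

The last line mirrors the step in the proof: the product (s + d)(s - d) is replaced by s·s, which is larger unless d = 0.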
$x_1 + \cdots + x_n$
for this assumption one must enter into a detailed discussion of the
theory of probability. But we can at least point out a minimum property of m which makes it a reasonable choice. Let u be any possible
value for the quantity measured. Then the differences $u - x_1, \ldots, u - x_n$ are the deviations of this value from the different readings.
These deviations can be partly positive, partly negative, and the tendency
will
be to assume as the optimal value for u one for
sense as small as possible. Following
not the deviations
but
to
$(u - x_i) = (m - x_i) + (u - m)$,
we obtain
$(u - x_i)^2 = (m - x_i)^2 + (u - m)^2 + 2(m - x_i)(u - m)$.
Now add all these equations for i = 1, 2, ..., n. The last terms yield
$2(u - m)(nm - x_1 - \cdots - x_n)$, which is zero because of the definition
of m; consequently we retain
$(u - x_1)^2 + \cdots + (u - x_n)^2 = (m - x_1)^2 + \cdots + (m - x_n)^2 + n(m - u)^2$.
This shows that
$(u - x_1)^2 + \cdots + (u - x_n)^2 \ge (m - x_1)^2 + \cdots + (m - x_n)^2$,
and that the equality sign holds only for u = m, which is exactly what
we were to prove.
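The identity behind this minimum property is easy to verify on sample data. The following sketch uses hypothetical readings and trial values u, and an illustrative helper name; it checks that the sum of squared deviations from u exceeds the sum from the mean m by exactly n(m - u)^2.

```python
import math

def sum_squared_deviations(u, xs):
    """Sum of the squares of the deviations u - x over all readings x."""
    return sum((u - x) ** 2 for x in xs)

# Hypothetical readings of a measured quantity.
readings = [2.0, 3.5, 3.0, 4.5]
n = len(readings)
m = sum(readings) / n                      # arithmetical mean

checks = []
for u in [2.0, 3.0, m, 4.0]:
    lhs = sum_squared_deviations(u, readings)
    rhs = sum_squared_deviations(m, readings) + n * (m - u) ** 2
    checks.append((u, lhs, rhs))
    print(u, lhs, rhs)
```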
The general method of least squares takes this result as a guiding
principle in more complicated cases when the problem is to decide on a
plausible result from slightly incompatible measurements. For example,
suppose we have measured the coordinates of n points x_i, y_i of a
theoretically straight line, and suppose that these measured points do
not lie exactly on a straight line. How shall we draw the line that
best fits the n observed points? Our previous result suggests the following procedure, which, it is true, might be replaced by equally reasonable variants. Let y = ax + b represent the equation of the line,
so that the problem is to find the coefficients a and b. The distance
in the y direction from the line to the point x_i, y_i is given by
$y_i - (ax_i + b) = y_i - ax_i - b$, with a positive or negative sign according as the point is above or below the line. Hence the square of this
distance is $(y_i - ax_i - b)^2$, and the method is simply to determine a
and b in such a way that the expression
$(y_1 - ax_1 - b)^2 + \cdots + (y_n - ax_n - b)^2$
attains its least possible value.
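The text states the minimum condition but does not carry out the computation. One common way to do so, assumed here and not taken from the text, is to set the partial derivatives of the sum of squares with respect to a and b equal to zero and to solve the two resulting linear equations. The sketch below does this for a few hypothetical measured points; the function names are illustrative.

```python
import math

def fit_line(points):
    """Coefficients a, b of y = ax + b minimizing the sum of (y_i - a*x_i - b)^2.

    Uses the closed-form solution of the two linear equations obtained by
    setting both partial derivatives of the sum of squares to zero.
    """
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

def sum_of_squares(a, b, points):
    """The expression to be minimized."""
    return sum((y - a * x - b) ** 2 for x, y in points)

# Hypothetical measurements lying close to, but not exactly on, a line.
points = [(0.0, 1.1), (1.0, 2.9), (2.0, 5.2), (3.0, 6.8)]
a, b = fit_line(points)
print(a, b, sum_of_squares(a, b, points))
```

Nudging a or b slightly away from the fitted values increases the sum of squares, which is a quick way to sanity-check the result.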
DIRICHLET'S PRINCIPLE
1. General Remarks
some enthusiastic
mathematics and mathematical physics became one of the great triumphs in the history of modern mathematical analysis.
In Riemann's paper the point open to critical attack is the question of
the existence of a minimum. Riemann based much of his theory on
what he called Dirichlet's principle. (Dirichlet had been Riemann's
teacher at
and had lectured but never written about this
principle.) Let us suppose that part of a plane or of any
surface is covered with tinfoil and that a stationary electric current is
set up in the layer of tinfoil by connecting it at two points with the
poles of an electric battery. There is no doubt that the physical experiment leads to a definite result. But how about the corresponding
mathematical problem, which is of the utmost importance in function
theory and other fields? According to the theory of electricity, the
physical phenomenon is described by a "boundary value problem of a
partial differential equation". It is this mathematical problem that concerns us; its solvability is made plausible by its assumed equivalence to
a physical phenomenon but is by no means mathematically proved by
this argument. Riemann disposed of the mathematical question in two
steps. First he showed that the problem is equivalent to a minimum
problem: a certain quantity expressing the energy of the electric flow is
minimized by the actual flow in comparison to the other flows possible
under the prescribed conditions. Then he stated as "Dirichlet's principle" that such a minimum problem has a solution. Riemann took
not the slightest step towards a mathematical proof of the second assertion, and this was the point attacked by Weierstrass. Not only was the
existence of the minimum not at all evident, but, as it turned out, it
was an extremely delicate question for which the mathematics of that
time was not yet prepared and which was finally settled only after many
decades of intensive research.
2. Examples
We shall illustrate the sort of difficulty involved by two examples.
1) We mark two points A and B at a distance d on a straight line L,
and ask for the polygon of shortest length that starts at A in a direction
perpendicular to L and ends at B. Since the straight segment AB is
the shortest connection between A and B for all paths, we can be certain
that any path admissible in the competition has a length greater than d,
for the only path giving the value d is the straight segment AB, which
violates the restriction imposed on the direction at A, and hence is not
admissible under the terms of the problem. On the other hand, con-
But among all the surfaces bounded by C only the disk itself has
this area, and since the disk does not go through S it violates the
con-
strass.
The two examples just considered show well enough that the existence of
For
the distance between a point A_1 on C_1 and a point A_2 on C_2 is a continuous function on the compact set consisting of the pairs A_1, A_2 of
points under consideration. However, if the two curves are not bounded
but extend to infinity, then the problem may not have a solution. In
the case shown in Figure 224 neither a smallest nor a largest distance
between the curves is attained; the lower bound for the distance is zero,
the upper bound is infinity, and neither is attained. In some cases a
minimum but no maximum exists. For the case of two branches of a
hyperbola (Fig. 17, p. 76) only a minimum distance is attained, by A and
A', since obviously no two points exist with a maximum distance apart.
structure.
curve C, since it includes the additional areas I and II. This contradicts the assumption that C contains the largest area for a closed curve
of length L. Hence C must be convex.
Now choose two points, A, B, dividing the solution curve C into arcs
of equal length. Then the line AB must divide the area of C into two
equal parts, for otherwise the part of greater area could be reflected in
AB (Fig. 227) to give another curve of length L with greater included
area than C. It follows that half of the solution C must solve the
following problem: To find the arc of length L/2 having its endpoints
A, B on a straight line and enclosing a maximum area between it and
this line. Now we shall show that the solution to this new
problem is a semicircle, so that the whole curve C solving the isoperimetric problem is a circle. Let the arc
It is sufficient to show that every inscribed
Figure 228 is a right angle, for this will
Suppose, on the contrary, that the angle AOB is not 90°. Then we can
replace Figure 228 by another one, 229, in which the shaded areas and
the length of the arc AOB are not changed, while the triangular area is
increased by making ∠AOB equal to or at least nearer to 90°. Thus
Figure 229 gives a larger area than the original (see page 330). But we
started with the assumption that Figure 228 solves the problem, so that
Figure 229 could not possibly yield a larger area. This contradiction
shows that for every point O, ∠AOB must be a right angle, and this
completes the proof.
The isoperimetric property of the circle can be expressed in the form
of an inequality. If L is the circumference of the circle, its area is
$L^2/4\pi$, and therefore we must have the isoperimetric inequality, $A \le L^2/4\pi$, between the area A and length L of any closed curve, the equality
sign holding only for the circle.
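The isoperimetric inequality can be illustrated with regular polygons of a fixed perimeter. The sketch below assumes the standard area formula L^2 / (4n tan(π/n)) for a regular n-gon of perimeter L, which is not stated in the text; the perimeter value and the function name are likewise illustrative.

```python
import math

def regular_polygon_area(n, L):
    """Area of a regular n-gon with total perimeter L."""
    return L * L / (4 * n * math.tan(math.pi / n))

L = 1.0                                   # hypothetical prescribed length
bound = L * L / (4 * math.pi)             # area of the circle of circumference L

sides = (3, 4, 6, 12, 100)
areas = [regular_polygon_area(n, L) for n in sides]
for n, area in zip(sides, areas):
    print(n, area, bound)
```

The areas increase with the number of sides and approach, without reaching, the circular bound.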
*As is apparent from the discussion in §7, Steiner's proof has only a
conditional value: "If there is a curve of length L with maximal area
paper
$36\pi V^2 \le A^3$
between the area A and the volume V of any closed three-dimensional
the equality holding only for the sphere.
*9. EXTREMUM PROBLEMS WITH BOUNDARY CONDITIONS. CONNECTION BETWEEN STEINER'S PROBLEM AND THE ISOPERIMETRIC PROBLEM
Interesting results arise in extremum problems when the domain of
the variable is restricted by boundary conditions. The theorem of
Weierstrass
largest and
attained in the interior of the triangle ABC, which is the case of the
three equal angles, or it is attained at a boundary point C. A similar
pair of alternatives exists for the complementary problem.
As a last example we may consider the isoperimetric problem modified
by restrictive boundary conditions. We shall thus obtain a surprising
connection between the isoperimetric problem and Steiner's problem
and at the same time what is perhaps the simplest instance of a new type
of extremum problem. In the original problem the independent variable, the closed curve of given length, can be arbitrarily varied from the
circular shape, and any such deformed curve is admissible into the
competition, so that we have a genuine free minimum. Now let us
consider the following modified problem: the curves C under consideration shall include in their interior, or pass through, three given points, P,
Q, R, the area A is prescribed, and the length L is to be made a minimum.
This represents a genuine boundary condition.
It is clear that, if A is prescribed sufficiently large, the three points
P, Q, R will not affect the problem at all. Whenever the circle circumscribed about the triangle PQR has an area less than or equal to A, the
solution will simply be a circle of area A including the three points.
But what if A is smaller? We state the answer here but omit the somewhat detailed proof, although it would not be beyond our reach. Let
us characterize the solutions for a sequence of values of A which decreases to zero. As soon as A falls below the area of the circumscribed
circle, the original isoperimetric circle breaks up into three arcs, all
having the same radius, which form a convex circular triangle with
P, Q, R as vertices (Fig. 232). This triangle is the solution; its dimensions can be determined from the given value of A. If A decreases
further, the radius of these arcs will increase, and the arcs will become
more and more nearly straight, until when A is exactly the area of the
triangle PQR the solution is the triangle itself. If A now becomes
even smaller, then the solution will again consist of three circular arcs
having the same radius and forming a triangle with corners at P, Q, R.
This time, however, the triangle is concave and the arcs are inside the
triangle PQR (Fig. 233). As A continues to decrease, there will come
a moment when, for a certain value of A, two of the concave arcs become
tangent to each other in a corner R. With an additional decrease of A,
it is no longer possible to construct a circular triangle of the previous
type. A new phenomenon occurs: the solution is still given by a concave circular triangle, but one of its corners R' has become detached
from the corresponding corner R, and the solution now consists of a
circular triangle PQR' plus the straight line RR' counted twice (because
it travels from R' to R and back). This straight segment is tangent to
the two arcs tangent to each other at R'. If A decreases further, the
separation process will also set in at the other vertices.
Eventually we
obtain as solution a circular triangle consisting of three arcs of equal
radius tangent to each other and forming an equilateral circular triangle
P'Q'R', and in addition three doubly counted segments P'P,
Q'Q, R'R (Fig. 234). If, finally, A shrinks to zero, the circular
triangle reduces to a point, and we return to the solution of Steiner's
problem; the latter is thus seen to be a limiting case of the modified
isoperimetric problem.
If P, Q, R form an obtuse triangle with an angle of more than 120°,
then the shrinking process leads to the corresponding solution of Steiner's
problem, for then the circular arcs shrink toward the obtuse vertex.
The solutions of the generalized Steiner problem (Figs. 216-8 on p.
360) may be obtained by limiting processes
nature.
10. THE CALCULUS OF VARIATIONS
1. Introduction
under the influence of gravity alone, along which such curve will the
time required for the descent be least? It is easy to see that the falling
particle will require different lengths of time for different paths. The
straight line by no means affords the quickest journey, nor is the circular
arc or any other elementary curve the answer. Bernoulli boasted of
having a wonderful solution which he would not immediately publish
in order to incite the greatest mathematicians of the time to try their
skill at this new type of mathematical question. In particular, he
challenged his elder brother Jacob, with whom he was at the time engaged in a bitter feud, and whom he publicly described as incompetent,
to solve the problem. Mathematicians immediately recognized the
different character of the brachistochrone problem. While heretofore,
in problems treated by the differential calculus, the quantity to be minimized depended only on one or more numerical variables, in this problem
the quantity under consideration, the time of descent, depends on the
whole curve, and this makes for an essential difference, taking the problem
out of the reach of the differential calculus or any other method known
at the time.
The novelty of the problem--apparently the isoperimetric property
of the circle was not clearly recognized as of the same nature--fascinated
the contemporary mathematicians all the more when the solution turned
out to be the cycloid, a curve that had
been discovered. (We
circumference of a circle that
without slipping along a straight line, as
shown in Fig. 236.) This curve had been brought into connection with
interesting mechanical
with the construction of an
ideal pendulum. Huygens
that an ideal mass point
which oscillates without friction under the influence of gravity on a
vertical cycloid has a period of oscillation independent of the amplitude
of the motion. On a circular path, such as is provided by an ordinary
pendulum, this independence is only approximately true, and this was
method for solving such problems was called the calculus of
the velocity is w will follow a path PQR. The empirical law found by
Snell (1591-1626) states that the path consists of two straight segments,
PQ and QR, forming angles α, α' with the normal determined by the
condition sin α/sin α' = v/w. By means of the calculus Fermat proved
that this path is such that the time taken for the light ray to go from P
to R is a minimum, i.e. smaller than it would be along any other
connecting path. Thus Heron's law of reflection was supplemented sixteen
hundred years later by a similar and equally important law of refraction.
Fermat generalized the statement of this law so as to include curved
surfaces of discontinuity between media, such as the spherical surfaces
used in lenses. In this case the statement still holds that light follows
a path along which the time taken is a minimum relative to the time
that would be required for the light to describe any other possible path
between the same two points. Finally, Fermat considered any optical
system in which the velocity of light varies in a prescribed way from
point to point, as it does in the atmosphere. He divided the
continuous inhomogeneous medium into thin slabs, in each of which the
velocity of light is approximately constant, and imagined this medium
replaced by another in which the velocity is actually constant in each
slab. Then he could again apply his principle, going from each slab to
the next. By letting the thickness of the slabs tend to zero, he arrived
at the general Fermat principle of geometrical optics: In an inhomogeneous
medium, a light ray travelling between two points follows a path along
which the time taken is a minimum with respect to all paths joining the
two points. This principle has been of the utmost importance, not only
theoretically, but in practical geometrical optics. The technique of the
calculus of variations applied to this principle provides the basis for
calculating lens systems.
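The link between least time and Snell's law for a single plane boundary can be checked numerically. The sketch below is an illustration under assumptions: the positions of P and R, the speeds v and w, and the function names are hypothetical, and a ternary search stands in for the calculus argument. It finds the crossing point Q on the boundary that minimizes the travel time and then compares sin α/sin α' with v/w.

```python
import math

def travel_time(x, v, w, P=(0.0, 1.0), R=(2.0, -1.0)):
    """Time along P -> Q = (x, 0) -> R, with speed v above the line y = 0 and w below."""
    return math.hypot(x - P[0], P[1]) / v + math.hypot(R[0] - x, R[1]) / w

def fastest_crossing(v, w, lo=0.0, hi=2.0, steps=200):
    """Ternary search for the crossing abscissa of least travel time.

    The travel time is a convex function of x, so shrinking the bracket
    toward the smaller of two interior values converges to the minimum.
    """
    for _ in range(steps):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if travel_time(m1, v, w) < travel_time(m2, v, w):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

v, w = 1.0, 0.75                          # hypothetical speeds in the two media
x = fastest_crossing(v, w)

# Angles are measured from the normal to the boundary (the vertical here).
sin_alpha = x / math.hypot(x, 1.0)
sin_alpha_prime = (2.0 - x) / math.hypot(2.0 - x, 1.0)
ratio = sin_alpha / sin_alpha_prime
print(ratio, v / w)
```

The search uses only the travel times themselves, so the agreement of the two printed numbers is a genuine consequence of minimizing the time rather than something built into the code.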
principles have also become dominant in other branches of
It was observed that stable
of a mechanical
energy" is a minimum. As an example, let us consider a
homogeneous chain suspended at its two ends and allowing full play to the
force of gravity. The chain will then assume a form in which its potential
energy is a minimum. In this case the potential energy is determined
by the height
of the center of gravity above some fixed axis. The
curve in which the chain hangs is called a catenary, and resembles superficially a parabola.
Not only
the laws of equilibrium, but also those of motion, are dominated by maximum and minimum principles. It was Euler who obtained the first clear ideas about these principles, while philosophically
and mystically inclined speculators, such as Maupertuis (1698-1759),
were not able to separate the mathematical statements from hazy ideas
about "God's intention to regulate physical phenomena by a general
principle of highest perfection." Euler's variational principles of physics, rediscovered and extended by the Irish mathematician W. R.
Hamilton (1805-1865), have proved to be among the most powerful
tools in mechanics, optics, and electrodynamics, with many applications
to engineering. Recent developments in physics--relativity and quantum theory--are full of examples revealing the power of the calculus of
variations.
[Fig. 238.]
edge. We start with the fact, taken from mechanics, that a mass point
falling from rest at A along any curve C will have at any point P a
velocity proportional to $\sqrt{h}$, where h is the vertical distance from A
to P; that is, $v = c\sqrt{h}$, where c is a constant. Now we replace the
given problem by a slightly different one. We dissect the space into
many thin horizontal slabs, each of thickness d, and assume for the
moment that the velocity of the moving particle changes, not continuously, but in little jumps from slab to slab, so that in the first slab
adjacent to A the velocity is $c\sqrt{d}$, in the second $c\sqrt{2d}$, and in the nth
$c\sqrt{nd} = c\sqrt{h}$, where h is the vertical distance from A to P (see
Fig. 238). If this problem
is considered, then there are really only a
finite number of
In each slab the path must be a straight
segment, no existence problem arises, the solution must be a polygon,
and the only question is how to determine its corners. According to the
minimum principle for the law of simple refraction, in each pair of successive
of Q must be such
possible time. Hence
$\frac{\sin \alpha}{\sqrt{nd}} = \frac{\sin \alpha'}{\sqrt{(n + 1)d}}$.
(1) $\frac{\sin \alpha_1}{\sqrt{d}} = \frac{\sin \alpha_2}{\sqrt{2d}} = \cdots$,
where $\alpha_n$ is the angle between the polygon in the nth slab and the vertical.
Now Bernoulli imagines the thickness d to become smaller and smaller,
tending to zero, so that the polygon just obtained as the solution of the
approximate problem tends to the desired solution of the original problem. In this passage to the limit the equalities (1) are not affected, and
therefore Bernoulli concludes that the solution must be a curve C with
the following property: If α is the angle between the tangent and the
vertical at any point P of C, and h is the vertical distance of P from
the horizontal line through A, then $\sin \alpha/\sqrt{h}$ is constant for all points
P of C. It can be shown very simply that this property characterizes
the cycloid.
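That the cycloid has the stated property can be checked numerically. The sketch below assumes the usual parametrization x = r(θ - sin θ), h = r(1 - cos θ) of the cycloid traced by a circle of radius r, with h measured downward from the horizontal line through A; this parametrization and the function name are assumptions not given in the text. It evaluates sin α/√h at several points.

```python
import math

def cycloid_ratio(theta, r=1.0):
    """Value of sin(alpha) / sqrt(h) at parameter theta on the cycloid.

    Assumed parametrization: x = r(theta - sin theta), h = r(1 - cos theta),
    with h measured vertically downward from the starting level.
    """
    dx = r * (1 - math.cos(theta))         # horizontal component of the tangent
    dh = r * math.sin(theta)               # vertical component of the tangent
    sin_alpha = dx / math.hypot(dx, dh)    # alpha: angle between tangent and vertical
    h = r * (1 - math.cos(theta))
    return sin_alpha / math.sqrt(h)

thetas = [0.3, 1.0, 2.0, 3.0]
ratios = [cycloid_ratio(t) for t in thetas]
wide_ratios = [cycloid_ratio(t, r=2.5) for t in thetas]
print(ratios, wide_ratios)
```

Under this parametrization the ratio works out to 1/√(2r) at every point, so it depends on the size of the generating circle but not on the position along the curve.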
Bernoulli's "proof" is a typical example of ingenious and valuable
mathematical reasoning which, at the same time, is not at all rigorous.
There are several tacit assumptions in the argument, and their justification would be more complicated and lengthy than the argument itself.
For example, the existence of a solution C, and the fact that the solution
of the approximate problem approximates the actual solution, were both
assumed. The question as to the intrinsic value of heuristic considerations of this type certainly deserves discussion, but would lead us too
far astray.
4. Geodesics on a Sphere.
Geodesics and Maxi-Minima
length, nor can it give the maximum length for curves joining P and Q, since arbitrarily long curves between P and Q can be drawn. The answer is that c' solves a maxi-minimum problem. Consider a point S on a fixed great circle separating P and Q; we ask for the shortest connection between P and Q on the sphere passing through S. Of course, the minimum is given by a curve consisting of two small arcs of great circles PS and QS. Now we seek a position of the point S for which this smallest distance PSQ becomes as large as possible. The solution is: S must be such that PSQ is the longer arc c' of the great circle PQ. We may modify the problem by first seeking the path of shortest length from P to Q passing through n prescribed points, $S_1, S_2, \ldots, S_n$, on the sphere, and then seeking to determine the points $S_1, \ldots, S_n$ so that this minimum length becomes as large as possible. The solution is given by a path on the great circle joining P and Q, but this path winds around the sphere so often that it passes through the points diametrically opposite P and Q exactly n times.
Fig. 239. Geodesics on a sphere.
This example of a maximum-minimum problem is typical of a wide class of questions in the calculus of variations that have been studied with great success by methods developed by Morse and others.
11.
EXPERIMENTAL
SOAP
It is usually very difficult, and sometimes impossible, to solve variational problems explicitly in terms of formulas or geometrical constructions.
the existence of the solution for the general case was proved only recently by J. Douglas and by T. Radó. Plateau's experiments immediately yield physical solutions for very general contours. If one dips any closed contour made of wire into a liquid of low surface tension and then withdraws it, a film in the form of a minimal surface of least area will span the contour. (We assume that we may neglect gravity and other forces which interfere with the tendency of the film to assume a position of stable equilibrium by attaining the smallest possible area and thus the least possible value of the potential energy due to surface tension.) A good recipe for such a liquid is the following: Dissolve 10 grams of pure dry sodium oleate in 500 grams of distilled water, and mix 15 cubic units of the solution with 11 cubic units of glycerin. Films obtained with this solution and with frames of brass wire are relatively stable. The frames should not exceed five or six inches in diameter.
With this method it is very easy to "solve" Plateau's problem simply by shaping the wire into the desired form. Beautiful models are obtained in polygonal wire frames formed by a sequence of edges of a regular polyhedron. In particular, it is interesting to dip the whole frame of a cube into such a solution. The result is first a system of different surfaces meeting each other at angles of 120° along lines of intersection. (If the cube is withdrawn carefully, there will be thirteen nearly plane surfaces.) Then, we may pierce and destroy enough of
the sum of these two curvatures is the mean
Fig. 241. One-sided surface (Moebius strip).
Frame spanning three different surfaces of genus 0 and 1.
tion, we return to the initial position of the frame, but now with the
other solution in it.
system of vertical planes between the plates and joining the fixed bars.
The projection appearing on the glass plates is the solution of the problem discussed on page 359.
curves formed
will illustrate new
others two congruent circular arcs. The result is shown in Figure 251. If the planes of the arcs form an angle of less than 120°, we obtain three surfaces meeting at angles of 120°; if we turn the two arcs, increasing the included angle, the solution changes continuously into two plane circular segments.
Finally, a word about soap bubbles. The spherical soap bubble shows that among all closed surfaces including a given volume (given by the amount of air inside), the sphere has the least area. If we consider soap bubbles of given volume which tend to contract to a minimum area but which are restricted
then the
Fig. 254. Isoperimetric figures with boundary restrictions.
different states of
are a source of experiments that are very illuminating from the mathematical point of view. The experiments
illustrate the theory of stationary values, since the transitions can be
made to take place so as to lead through an unstable equilibrium which
is a "stationary state."
Fig. 258
CHAPTER VIII
THE CALCULUS
INTRODUCTION
With an absurd oversimplification, the "invention" of the calculus is sometimes ascribed to two men, Newton and Leibniz. In reality, the calculus is the product of a long evolution that was neither initiated nor terminated by Newton and Leibniz, but in which both played a decisive part. Scattered over seventeenth century Europe, for the most part outside the schools, was a group of spirited scientists who strove to continue the mathematical work of Galileo and Kepler. By correspondence and travel these men maintained close contact. Two central problems held their attention. First, the problem of tangents: to determine the tangent lines to a given curve, the fundamental problem of the differential calculus. Second, the problem of quadrature: to determine the area within a given curve, the fundamental problem of the integral calculus. Newton's and Leibniz' great merit is to have clearly recognized the intimate connection between these two problems. In their hands the new unified methods became powerful instruments of science. Much of the success was due to the marvelous symbolic notation invented by Leibniz. His achievement is in no way diminished by the fact that it was linked with hazy and untenable ideas, which are apt to perpetuate
§1. THE INTEGRAL
1. Area as a Limit
$$S_1, S_2, S_3, \ldots, \tag{1}$$
$$S_n \to A, \tag{2}$$
and this limit A, the area under the curve, is independent of the particular way in which the sequence (1) is chosen, so long as the widths of the approximating rectangles tend to zero. (For example, $S_n$ can arise from $S_{n-1}$ by adding one or more new points of subdivision to those defining $S_{n-1}$, or the choice of points of subdivision for $S_n$ can be entirely independent of the choice for $S_{n-1}$.) The area A of the domain, expressed by this limiting process, we call by definition the integral of the function f(x) from a to b. With a special symbol, the "integral sign," it is written
$$A = \int_a^b f(x)\,dx. \tag{3}$$
The symbol $\int$, the "dx," and the name "integral" were introduced by Leibniz in order to suggest the way in which the limit is obtained. To explain this notation we shall repeat in more detail the process of approximation to the area A. At the same time the analytic formulation of the limiting process will make it possible to discard the restrictive assumptions $f(x) \ge 0$ and b > a, and finally to eliminate the prior intuitive concept of area as the basis of our definition of integral (the latter will be done in the supplement, §1).
Let us subdivide the interval from a to b into n small subintervals, which, for simplicity only, we shall assume to be of equal width, (b - a)/n. We denote the points of subdivision by
$$x_0 = a,\quad x_1 = a + \frac{b-a}{n},\quad x_2 = a + \frac{2(b-a)}{n},\quad \ldots,\quad x_n = a + \frac{n(b-a)}{n} = b.$$
We introduce for the quantity (b - a)/n, the difference between consecutive x-values, the notation $\Delta x$ (read, "delta x"),
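$$S_n = f(x_1)\,\Delta x + f(x_2)\,\Delta x + \cdots + f(x_n)\,\Delta x,$$
which is abbreviated as
$$S_n = \sum_{j=1}^{n} f(x_j)\,\Delta x. \tag{5}$$
The use of the summation symbol $\sum$ may be illustrated by the following examples:
$$2 + 3 + 4 + \cdots + 10 = \sum_{j=2}^{10} j,$$
$$1 + 2 + 3 + \cdots + n = \sum_{j=1}^{n} j,$$
$$1^2 + 2^2 + 3^2 + \cdots + n^2 = \sum_{j=1}^{n} j^2,$$
$$aq + aq^2 + \cdots + aq^n = \sum_{j=1}^{n} aq^j,$$
$$a + (a + d) + (a + 2d) + \cdots + (a + nd) = \sum_{j=0}^{n} (a + jd).$$
$$\int_a^b f(x)\,dx = \lim \sum_{j=1}^{n} f(x_j)\,\Delta x \quad \text{as } n \to \infty. \tag{6}$$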
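The sum (5) and its limit (6) can be tried out numerically. The following is a minimal sketch, not part of the original text: the function name `riemann_sum` and the choice of test functions are assumptions made only for illustration. It forms $S_n$ with equal widths $\Delta x = (b - a)/n$ and the points $x_j = a + j\,\Delta x$.

```python
def riemann_sum(f, a, b, n):
    """Return S_n = sum_{j=1}^{n} f(x_j) * dx with x_j = a + j*dx, dx = (b - a)/n."""
    dx = (b - a) / n
    return sum(f(a + j * dx) for j in range(1, n + 1)) * dx


if __name__ == "__main__":
    # Hypothetical test function f(x) = x^2 on the interval from 0 to 1.
    for n in (10, 100, 1000):
        print(n, riemann_sum(lambda x: x * x, 0.0, 1.0, n))
```

As n grows, the printed values settle toward a fixed number, which is what the definition (6) calls the integral.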
General Definition
Fig. 261. Positive and negative areas.
positive.
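We must emphasize that the value of the integral remains the same even if we do not restrict ourselves to equidistant points $x_j$ of subdivision, or, what is the same, to equal x-differences $\Delta x = x_{j+1} - x_j$. We may choose the $x_j$ in other ways, so that the differences $\Delta x_j = x_{j+1} - x_j$ are not equal (and must accordingly be distinguished by subscripts). Even then the sums
$$S_n = f(x_0)\,\Delta x_0 + f(x_1)\,\Delta x_1 + \cdots + f(x_{n-1})\,\Delta x_{n-1}$$
and also the sums
$$S_n' = f(x_1)\,\Delta x_0 + f(x_2)\,\Delta x_1 + \cdots + f(x_n)\,\Delta x_{n-1}$$
will tend to the same limit, the value of the integral, if only care is taken that all the differences $\Delta x_j = x_{j+1} - x_j$ tend to zero as n increases. Accordingly, the integral is given by
$$\int_a^b f(x)\,dx = \lim \sum_{j=0}^{n-1} f(v_j)\,\Delta x_j \tag{6a}$$
as $n \to \infty$. In this limit $v_j$ may denote any point of the interval $x_j \le v_j \le x_{j+1}$, and the only restriction for the subdivision is that the longest interval $\Delta x_j = x_{j+1} - x_j$ must tend to zero as n increases.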
Fig. 262.
The existence of the limit (6a) does not need a proof if we take for granted the concept of the area under a curve and the possibility of approximating this area by sums of rectangles. However, as will appear in a later discussion (p. 464), a closer analysis shows that it is desirable and even necessary for a logically complete presentation of the notion of integral to prove the existence of the limit for any continuous function f(x) without reference to a prior geometrical concept of area.
4. Examples of Integration.
Integration of x
Until now our discussion of the integral has been merely theoretical. The crucial question is whether the general pattern of forming a sum $S_n$ and then passing to the limit actually leads to tangible results in concrete cases. Of course, this will require some additional reasoning adapted to the specific function f(x) for which the integral is to be found. When Archimedes two thousand years ago found the area of the parabolic segment, he performed what we now call the integration of the function $f(x) = x^2$ in a very ingenious way; and in the seventeenth century the forerunners of the modern calculus succeeded in solving problems of integration for simple functions such as $x^n$, again by special devices. Only after much experience with specific cases was a general approach to the problem of integration found in the systematic methods of the calculus, and thus the scope of solvable individual problems was greatly widened. In the present article we shall discuss a few of the instructive special problems belonging to the "pre-calculus" stage, for nothing can better illustrate integration as a limiting process.
a) We start with a quite trivial example. If y = f(x) is a constant,
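for example f(x) = 2, then in the sum (5) we have $f(x_j) = 2$ for every j, so that
$$S_n = \sum f(x_j)\,\Delta x = \sum 2\,\Delta x = 2\sum \Delta x = 2(b - a),$$
since
$$\sum \Delta x = (x_1 - x_0) + (x_2 - x_1) + \cdots + (x_n - x_{n-1}) = x_n - x_0 = b - a.$$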
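Here $\int_a^b x\,dx$ is the area of a trapezoid, namely
$$(b - a)\,\frac{b + a}{2} = \frac{b^2 - a^2}{2}.$$
This result again agrees with the definition (6) of the integral, as is seen by forming the sum with the points $x_j = a + j\,\Delta x$:
$$S_n = \sum_{j=1}^{n} (a + j\,\Delta x)\,\Delta x = na\,\Delta x + (1 + 2 + \cdots + n)(\Delta x)^2 = na\,\Delta x + \frac{n(n+1)}{2}(\Delta x)^2.$$
Since $\Delta x = \frac{b-a}{n}$, this is equal to
$$S_n = a(b - a) + \tfrac{1}{2}(b - a)^2 + \frac{1}{2n}(b - a)^2.$$
If now we let n tend to infinity, the last term tends to zero, and we obtain
$$\lim S_n = \int_a^b x\,dx = a(b - a) + \tfrac{1}{2}(b - a)^2 = \tfrac{1}{2}(b^2 - a^2).$$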
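The closed expression for $S_n$ in the case f(x) = x can be compared with the sum itself. The sketch below is illustrative only; the names `s_n` and `s_n_closed` and the sample values a = 1, b = 3 are assumptions, not part of the text.

```python
def s_n(a, b, n):
    """Direct sum S_n = sum_{j=1}^{n} (a + j*dx) * dx for f(x) = x."""
    dx = (b - a) / n
    return sum((a + j * dx) * dx for j in range(1, n + 1))


def s_n_closed(a, b, n):
    """Closed form a(b - a) + (b - a)^2 / 2 + (b - a)^2 / (2n)."""
    return a * (b - a) + 0.5 * (b - a) ** 2 + (b - a) ** 2 / (2 * n)


if __name__ == "__main__":
    a, b = 1.0, 3.0  # hypothetical sample interval
    for n in (5, 50, 500):
        print(n, s_n(a, b, n), s_n_closed(a, b, n), 0.5 * (b * b - a * a))
```

The last term, $(b - a)^2/(2n)$, is the only part that depends on n, which makes the passage to the limit visible in the printed columns.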
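Fig. 264. Area under a parabola.
For $f(x) = x^2$ on the interval from 0 to b we take $\Delta x = b/n$. The function values at the points of subdivision are $f(j\,\Delta x) = j^2(\Delta x)^2$, and we obtain for $S_n$ the expression
$$S_n = (1^2 + 2^2 + \cdots + n^2)(\Delta x)^3.$$
Using $1^2 + 2^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}$ and substituting $b/n$ for $\Delta x$, we obtain
$$S_n = \frac{n(n+1)(2n+1)}{6}\cdot\frac{b^3}{n^3} = \frac{b^3}{6}\left(1 + \frac{1}{n}\right)\left(2 + \frac{1}{n}\right).$$
As n tends to infinity this tends to $\frac{b^3}{6}\cdot 1\cdot 2 = \frac{b^3}{3}$, and hence
$$\int_0^a x^2\,dx = \frac{a^3}{3}, \qquad \int_a^b x^2\,dx = \frac{b^3 - a^3}{3}.$$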
Exercise: Prove in the same way, using formula (5) on page 15, that
$$\int_a^b x^3\,dx = \frac{b^4 - a^4}{4}.$$
By developing general formulas for the sum $1^k + 2^k + \cdots + n^k$ of the kth powers of the integers from 1 to n, one can obtain the result
$$\int_a^b x^k\,dx = \frac{b^{k+1} - a^{k+1}}{k+1}. \tag{7}$$
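Instead of proceeding in this way, we can obtain more simply an even more general result by utilizing our previous remark that we may calculate the integral by means of non-equidistant points of subdivision. We shall establish formula (7) not only for any positive integer k but for an arbitrary positive or negative rational number
$$k = u/v,$$
with the exception of k = -1, and for positive a and b with a smaller than b. We choose the points of subdivision
$$x_0 = a,\ x_1,\ x_2,\ \ldots,\ x_n = b$$
in geometrical progression. We set $\sqrt[n]{b/a} = q$, so that $x_j = aq^j$. Since $f(x_j) = a^k q^{jk}$ and $\Delta x_j = aq^{j+1} - aq^j$, the sum is
$$S_n = a^k(aq - a) + a^k q^k(aq^2 - aq) + \cdots + a^k q^{(n-1)k}(aq^n - aq^{n-1}).$$
Since each term contains the factor $a^k(aq - a)$, we may write
$$S_n = a^{k+1}(q - 1)\{1 + q^{k+1} + q^{2(k+1)} + \cdots + q^{(n-1)(k+1)}\}.$$
Substituting t for $q^{k+1}$ we see that the expression in braces is the geometrical series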
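$$1 + t + t^2 + \cdots + t^{n-1},$$
whose sum is $\frac{t^n - 1}{t - 1}$. But
$$t^n = q^{n(k+1)} = \left(\frac{b}{a}\right)^{k+1} = \frac{b^{k+1}}{a^{k+1}}.$$
Hence
$$S_n = (q - 1)\left\{\frac{b^{k+1} - a^{k+1}}{q^{k+1} - 1}\right\} = \frac{b^{k+1} - a^{k+1}}{N}, \tag{8}$$
where
$$N = \frac{q^{k+1} - 1}{q - 1}.$$
Now we shall let n increase, so that q tends to 1. For a positive integer k we have $N = q^k + q^{k-1} + \cdots + q + 1$, which tends to k + 1, so that $S_n$ tends to $\frac{b^{k+1} - a^{k+1}}{k+1}$, as was to be proved.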
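The subdivision in geometrical progression can be checked numerically. This is a minimal sketch under stated assumptions: the function names and the sample exponents are hypothetical, and the sum uses the left endpoints $x_j = aq^j$ with $\Delta x_j = aq^{j+1} - aq^j$, as in the derivation above.

```python
def geometric_sum(k, a, b, n):
    """S_n = sum_{j=0}^{n-1} x_j^k * (x_{j+1} - x_j) with x_j = a*q^j, q = (b/a)^(1/n)."""
    q = (b / a) ** (1.0 / n)
    return sum((a * q**j) ** k * (a * q ** (j + 1) - a * q**j) for j in range(n))


def formula_7(k, a, b):
    """Right side of formula (7): (b^(k+1) - a^(k+1)) / (k + 1), for k != -1."""
    return (b ** (k + 1) - a ** (k + 1)) / (k + 1)


if __name__ == "__main__":
    # Hypothetical sample exponents, including a fractional and a negative one.
    for k in (2, 0.5, -2):
        print(k, geometric_sum(k, 1.0, 4.0, 20000), formula_7(k, 1.0, 4.0))
```

The two printed columns agree more closely as n is increased, for rational exponents as well as for integers.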
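Exercise: Prove that for any rational $k \ne -1$ the same limit formula, $N \to k + 1$, and therefore the result (7), remains valid. First give the proof, according to our model, for negative integers k. Then, if k = u/v, write $q^{1/v} = s$ and
$$N = \frac{s^{(k+1)v} - 1}{s^v - 1} = \frac{s^{u+v} - 1}{s^v - 1} = \frac{s^{u+v} - 1}{s - 1} \Big/ \frac{s^v - 1}{s - 1}.$$
If n increases, both s and q tend to 1, and therefore the two quotients on the right hand side tend to u + v and v respectively, which yields again $\frac{u + v}{v} = k + 1$ for the limit of N.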
In S we shall see how this lengthy and somewhat artificial discussion may be
replaced by the simpler and more powerful methods of the calculus.
Exercises: 1) Check the preceding integration of $x^k$ for the cases k = 1/2, -1/2, 2, -2, 3, -3.
2) Find the values of the integrals:
) r "''
b)
f ,,, I:,.,,
o)
L:'
x' d:x.
b)
1:
c)
1:
d)
r ,.,,
e)
1
x cos1 x sin' x dx.
d)
J: ,,,
[~'tan x dx
(Hint: Consider the graphs of the functions under the integral sign, take into account their symmetry with respect to x = 0, and interpret the integrals as areas.)
4) Integrate sin x and cos x from 0 to b by substituting $\Delta x = h$ and using the formulas of page 488.
$$f(x) = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n,$$
$$\int_a^b f(x)\,dx = a_0(b - a) + a_1\frac{b^2 - a^2}{2} + \cdots + a_n\frac{b^{n+1} - a^{n+1}}{n+1}.$$
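Another rule, obvious both from the analytic definition and the geometric interpretation, is
$$\int_a^b f(x)\,dx + \int_b^c f(x)\,dx = \int_a^c f(x)\,dx. \tag{10}$$
The rule
$$\int_b^a f(x)\,dx = -\int_a^b f(x)\,dx$$
is in agreement with the last two rules, since it corresponds to (10) for c = a.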
Sometimes it is convenient to use the fact that the value of the integral in no way depends upon the particular name x chosen for the independent variable in f(x); for example
$$\int_a^b f(x)\,dx = \int_a^b f(u)\,du.$$
For a mere change in the name of the coordinates in the system to which the graph of the function refers does not alter the area under the curve.
The same remark applies even if we make certain changes in the coordinate system itself. For example, let us shift the origin to the right by one unit from O to O', as in Figure 265, so that x is replaced by a new coordinate x' such that x = 1 + x'. A curve with the equation y = f(x) will have in the new coordinate system the equation y = f(1 + x'). (E.g. y = 1/x = 1/(1 + x').) A given area A under this curve, say between x = 1 and x = b, is, in the new coordinate system, the area under the arch between x' = 0 and x' = b - 1. Thus we have
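$$\int_1^b f(x)\,dx = \int_0^{b-1} f(1 + u)\,du. \tag{12}$$
For example,
$$\int_1^b \frac{1}{x}\,dx = \int_0^{b-1} \frac{1}{1 + u}\,du. \tag{12a}$$
Similarly,
$$\int_0^b x^k\,dx = \int_{-1}^{b-1} (1 + u)^k\,du \qquad (k \ge 0), \tag{12c}$$
so that
$$\int_{-1}^{b-1} (1 + u)^k\,du = \frac{b^{k+1}}{k+1}.$$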
We suppose that b > a and that the values of f(x) in the interval nowhere exceed those of another function g(x). Then we have
$$\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx, \tag{13}$$
as is immediately clear either from Figure 266 or from the analytic definition.
Fig. 266. Comparison of integrals.
(14)
Since $|-f(x)| = |f(x)|$, we also have
$$-\int_a^b f(x)\,dx \le \int_a^b |f(x)|\,dx,$$
which, together with (15), yields the somewhat stronger inequality
$$\left|\int_a^b f(x)\,dx\right| \le \int_a^b |f(x)|\,dx. \tag{16}$$
§2. THE DERIVATIVE
The length PR is taken as positive, while RQ is taken as positive or negative according as the direction from R to Q is up or down, so that the slope gives the rise or fall per unit length along the horizontal when we proceed along the line from left to right. In Figure 267 the slope of the first line is %, while the slope of the second line is -1.
Fig. 267. Slopes of lines.
of the area under a curve. This limiting process is the basis of the differential calculus. We consider on the curve another point $P_1$, near P, with coordinates $x_1, y_1$. The straight line joining P to $P_1$
Fig. 268. The derivative as a limit.
$P_1 \to P$, $t_1 \to t$, and $\alpha_1 \to \alpha$.
VI
confusion
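$$\text{slope of } t_1 = \frac{\Delta y}{\Delta x} = \frac{\Delta f(x)}{\Delta x}.$$
$$\text{slope of } t = \lim \frac{f(x_1) - f(x)}{x_1 - x} = \lim \frac{\Delta y}{\Delta x},$$
The word "differentiation" comes from the fact that f'(x) is the limit of the difference $f(x_1) - f(x)$ divided by the difference $x_1 - x$:
$$f'(x) = \lim \frac{f(x_1) - f(x)}{x_1 - x} \quad \text{as } x_1 \to x. \tag{1}$$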
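Definition (1) can be watched at work by evaluating the difference quotient for points $x_1$ closer and closer to x. The sketch below is only an illustration; the helper name `difference_quotient` and the choice $f(x) = x^2$ at x = 3 are assumptions.

```python
def difference_quotient(f, x, x1):
    """Slope of the secant through (x, f(x)) and (x1, f(x1)); requires x1 != x."""
    return (f(x1) - f(x)) / (x1 - x)


if __name__ == "__main__":
    # Hypothetical example: f(x) = x^2 at the point x = 3.
    for k in range(1, 6):
        x1 = 3.0 + 10.0 ** (-k)
        print(x1, difference_quotient(lambda t: t * t, 3.0, x1))
```

For this particular function the quotient equals $x_1 + x$, so the printed slopes approach 6 as $x_1$ approaches 3.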
f'(x)
Df(x),
418
THE CALCULUS
[VIIlJ
df(x)
dx
dX'
for x we may find the positions of the maxima and minima, as was
first done by Fermat.
3. Examples
The considerations leading to the definition (1) might seem to be without practical value. One problem has been replaced by another: instead of being asked to find the slope of the tangent to a curve y = f(x) at a point, we are asked to evaluate a limit, (1), which at first sight appears equally difficult. But as soon as we leave the domain of generalities and consider specific functions f(x) we shall obtain tangible results.
f'(x)
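For the constant function f(x) = c we have
$$\frac{f(x_1) - f(x)}{x_1 - x} = \frac{c - c}{x_1 - x} = \frac{0}{x_1 - x} = 0,$$
so that, trivially,
$$\lim \frac{f(x_1) - f(x)}{x_1 - x} = 0 \quad \text{as } x_1 \to x.$$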
f'(x)
for all values of x, and the analytic definition (I) again yieldli
1,
so that
lim
!J!..!2_::_f(x)
as
X1- X
x\
6x
If we should try to pass to the limit directly in numerator and denominator we should obtain the meaningless expression 0/0. But we can avoid this impasse by rewriting the difference quotient and cancelling, before passing to the limit, the disturbing factor $x_1 - x$. (In evaluating the limit of the difference quotient we consider only values $x_1 \ne x$, so that this is permissible; see p. 307.) Thus we obtain the expression:
$$\frac{x_1^2 - x^2}{x_1 - x} = \frac{(x_1 - x)(x_1 + x)}{x_1 - x} = x_1 + x.$$
Now, after the cancellation, there is no longer any difficulty with the limit as $x_1 \to x$. The limit is obtained "by substitution"; for the new form $x_1 + x$ of the difference quotient is continuous, and its limit as $x_1 \to x$ is simply its value for $x_1 = x$, in our case x + x = 2x, so that $f'(x) = 2x$ for $f(x) = x^2$.
In a similar way, for $f(x) = x^3$,
$$\frac{\Delta y}{\Delta x} = \frac{x_1^3 - x^3}{x_1 - x} = x_1^2 + x_1x + x^2,$$
which tends to $3x^2$. In general, for
$$f(x) = x^n,$$
we use
$$x_1^n - x^n = (x_1 - x)(x_1^{n-1} + x_1^{n-2}x + x_1^{n-3}x^2 + \cdots + x_1x^{n-2} + x^{n-1}).$$
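As a further example of simple devices that permit explicit determination of the derivative we consider the function
$$f(x) = \frac{1}{x}.$$
We have
$$\frac{\Delta y}{\Delta x} = \left(\frac{1}{x_1} - \frac{1}{x}\right)\frac{1}{x_1 - x} = \frac{x - x_1}{x_1x}\cdot\frac{1}{x_1 - x} = -\frac{1}{x_1x},$$
which tends to $-1/x^2$ as $x_1 \to x$. Of course, neither the derivative nor the function itself is defined for x = 0.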
~, j'(:r.)
= -
= (1
+ x)~, j'(:r.)
.. n(l
~' f'{:r.)
= -;.;for
+ :t)-t.
Yi-v'X
+ VX)
we can cancel
~ v'X. + v'x
Passing to the limit yields
f(x)-
2~;;
f'(x) = -
( ~;); for:'..;)
vr;,
... .y;,f'{x)-
$$f(x_1) = \sin(x + h) = \sin x\cos h + \cos x\sin h.$$
Hence
$$\frac{\sin(x + h) - \sin x}{h} = \cos x\left(\frac{\sin h}{h}\right) + \sin x\left(\frac{\cos h - 1}{h}\right). \tag{2}$$
As h tends to zero,
$$\lim \frac{\sin h}{h} = 1 \quad \text{and} \quad \lim \frac{\cos h - 1}{h} = 0.$$
Hence the right side of (2) approaches cos x, giving the result:
The function f(x) = sin x has the derivative f'(x) = cos x, or briefly,
$$D \sin x = \cos x.$$
Exercise: Prove that $D \cos x = -\sin x$.
and obtain
-[~~) = (~~-~)
- ~~-:) 1
1
+
+
cos (x
cos x h
sin
n) cosx(x +h)sinx
--- (x______
h cos
____
""'___
~in
-,-~,.~~-
(The last equality follows from the formula sin (A -B) =sin A cos Bcos A sin B, with A = x + h and B = h.) If now we let h approach
(x
l)
cot x = -
~2 -x,
or
=
I
~~
The preceding discussion of the derivative was carried out in connection with the geometrical concept of the graph of a function. But the significance of the derivative concept is by no means limited to the problem of finding the slope of the tangent to a curve. Even more important in the natural sciences is the problem of calculating the rate of change of some quantity f(t) which varies with the time t. It was from this angle that Newton made his approach to the differential calculus. Newton wished in particular to analyze the phenomenon of velocity, where the time and the position of a moving particle are considered as the variable elements, or, as Newton expressed it, as the "fluent quantities."
If a particle moves along a straight line, the x-axis, its motion is completely described by giving the position x at any time t as a function x = f(t). A "uniform motion" with constant velocity b along the x-axis is defined by a linear function x = a + bt, where a is the coordinate of the particle at the time t = 0.
In a plane the motion of a particle is described by two functions,
$$x = f(t), \qquad y = g(t),$$
characterizing the two coordinates as functions of the time. In particular, a uniform motion corresponds to a pair of linear functions,
$$x = a + bt, \qquad y = c + dt,$$
+ bt,
y = c
+ dt
!gi,
424
THE CALCULUS
[VIII!
+ ~ (x- a)
- ig (z
~t a)~,
$$v = \text{velocity} = \frac{\text{distance}}{\text{time}} = \frac{f(t_1) - f(t)}{t_1 - t}. \tag{3}$$
But when the motion is not uniform, as in the case of a freely falling body whose velocity increases as it falls, then the quotient (3) does not give the velocity at the instant t, but merely the average velocity during the time interval from t to $t_1$. To obtain the velocity at the exact instant t we must take the limit of the average velocity as $t_1$ approaches t. Thus we define with Newton
$$\text{velocity at the time } t = \lim \frac{f(t_1) - f(t)}{t_1 - t} = f'(t). \tag{4}$$
For the freely falling body,
$$f(t) = \tfrac{1}{2}gt^2, \tag{5}$$
$$f'(t) = gt, \tag{6}$$
and the rate of change of the velocity is $f''(t) = g$, which is constant.
Suppose it is required to find the velocity of the body 2 seconds after it has been released. The average velocity during the time interval from t = 2 to t = 2.1 is
?u(z.;?i~4f1_~
~.:p~
th~
time mterval
For motion in the plane the two derivatives f'(t) and g'(t) of the functions x = f(t) and y = g(t) define the components of the velocity. For motion along a fixed curve the velocity will be defined by the derivative of the function s = f(t), where s is the arc length.
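The passage from average velocity to velocity at an instant can be followed numerically for the freely falling body of formula (5). The sketch below assumes the value g = 32 (feet per second per second); that value, like the function names, is an assumption made for illustration and is not fixed by the surviving text.

```python
G = 32.0  # assumed value of g, in feet per second per second


def position(t):
    """Distance fallen after t seconds, f(t) = (1/2) g t^2."""
    return 0.5 * G * t * t


def average_velocity(t, t1):
    """Quotient (3): (f(t1) - f(t)) / (t1 - t) over the interval from t to t1."""
    return (position(t1) - position(t)) / (t1 - t)


if __name__ == "__main__":
    # Shrinking intervals starting at t = 2; the limit should be g*t = 64.
    for t1 in (2.1, 2.01, 2.001, 2.0001):
        print(t1, average_velocity(2.0, t1))
```

Since the quotient here equals $\tfrac{1}{2}g(t_1 + t)$, the printed averages decrease toward gt as the interval shrinks.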
Similarly, if f"(x)
(Fig. 271).
= f"(x)
(1
+ (f'(x)) )
2
11
~.
We can find the maxima and minima of a given function f(x) by first forming f'(x), then finding the values for which this derivative vanishes,
$$f(x) = 2x^3 - 9x^2 + 12x + 1,$$
and obtain
$$f'(x) = 6x^2 - 18x + 12,$$
+ bg(x),
428
THE CALCULUS
! VIII I
af'(x)
+ bg'(x).
f(x)g(x),
the derivative is
p'(x)
f(x)g'(x)
+ g(x)f'(x).
Thio; is eatJily proved by the following device: we write, adding and subtracting the same term,
p(x
h) - p(x)
~
f(x
f(x
h)g(x
h)g(x
h) - f(x
+ h) - f(x)g(x)
+ h)g(x) + f(x +
h)g(x) - f(x)g(x),
and obtain, by combining the first two and the second two Wl'IW:l,
p(x
~ f(x
h) g(x
+_
"1--
g(x)
g(x) f(x
+ h~_- f(x)
Now we let h approach zero; sincef(x +h) approachesf(x), the statement to be proved follows immediately.
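Exercise: Prove that the function $p(x) = x^n$ has the derivative $p'(x) = nx^{n-1}$. (Hint: Write $x^n = x \cdot x^{n-1}$ and use mathematical induction.)
For a polynomial
$$f(x) = a_0 + a_1x + \cdots + a_nx^n;$$
the derivative is
$$f'(x) = a_1 + 2a_2x + 3a_3x^2 + \cdots + na_nx^{n-1}.$$
As an application, consider
$$f(x) = (1 + x)^n = 1 + a_1x + a_2x^2 + \cdots + a_nx^n.$$
Differentiating both sides gives
$$n(1 + x)^{n-1} = a_1 + 2a_2x + 3a_3x^2 + \cdots + na_nx^{n-1}. \tag{3}$$
In this formula we now set x = 0 and find that $n = a_1$, which is (2) for k = 1. Then we differentiate (3) again, obtaining
$$n(n - 1)(1 + x)^{n-2} = 2a_2 + 3\cdot 2a_3x + \cdots + n(n - 1)a_nx^{n-2}.$$
Substituting x = 0, we find $n(n - 1) = 2a_2$, in agreement with (2) for k = 2.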
If
$$q(x) = \frac{f(x)}{g(x)},$$
then
$$q'(x) = \frac{g(x)f'(x) - f(x)g'(x)}{(g(x))^2}.$$
Exercise: Derive by this rule the formulas of page 422 for the derivatives of tan x and cot x from those for sin x and cos x. Prove that the derivatives of sec x = 1/cos x and cosec x = 1/sin x are $\sin x/\cos^2 x$ and $-\cos x/\sin^2 x$ respectively.
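The product and quotient rules can be spot-checked against a numerical derivative. The sketch below is illustrative only: the sample functions f(x) = sin x and $g(x) = x^2 + 1$ are assumptions, and the symmetric difference quotient used in `numerical_derivative` is a convenient approximation technique, not the one-sided quotient of definition (1).

```python
import math


def numerical_derivative(func, x, h=1e-6):
    """Symmetric difference quotient (func(x+h) - func(x-h)) / (2h)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


# Hypothetical sample functions with their known derivatives.
def f(x):
    return math.sin(x)


def f_prime(x):
    return math.cos(x)


def g(x):
    return x * x + 1.0


def g_prime(x):
    return 2.0 * x


def product_rule(x):
    """p'(x) = f(x) g'(x) + g(x) f'(x)."""
    return f(x) * g_prime(x) + g(x) * f_prime(x)


def quotient_rule(x):
    """q'(x) = (g(x) f'(x) - f(x) g'(x)) / g(x)^2."""
    return (g(x) * f_prime(x) - f(x) * g_prime(x)) / g(x) ** 2


if __name__ == "__main__":
    x = 0.7
    print(product_rule(x), numerical_derivative(lambda t: f(t) * g(t), x))
    print(quotient_rule(x), numerical_derivative(lambda t: f(t) / g(t), x))
```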
~ ~:
(!
+ x)'"
Exercise: Differentiate
$$f(x) = \frac{1}{x^m} = x^{-m}.$$
The result is
$$f'(x) = -mx^{-m-1}.$$
If
and
f(x)
2
~ JTx!
o'
g(y)
VY),
Dq(y)-Dj(x)
1.
430
THE CALCULUS
[VIII]
quotients~
and
~=
f(x)
~ yx ~ )
!ny~--1
i?
x~
andy-"' = x-L,
D(xll"') =
kx~-~
kvy-"',
f
(x) =
~ x~-\
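or
$$D \text{ arc tan } x = \frac{1}{1 + x^2}.$$
In the same way the reader may derive the following formulas:
$$D \text{ arc cot } x = -\frac{1}{1 + x^2},$$
$$D \text{ arc sin } x = \frac{1}{\sqrt{1 - x^2}},$$
$$D \text{ arc cos } x = -\frac{1}{\sqrt{1 - x^2}}.$$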
compounded from two (or more) simpler ones (see p. 282). For example, $z = \sin(\sqrt{x})$ is compounded from $z = \sin y$ and $y = \sqrt{x}$; the function $z = \sqrt{x} + \sqrt{x^5}$ is compounded from $z = y + y^5$ and $y = \sqrt{x}$; z =
Fig. 272
Fig. 273
If two functions
$$z = g(y) \quad \text{and} \quad y = f(x)$$
are given, they define the compound function $z = k(x) = g[f(x)]$. We assert that
$$k'(x) = g'(y)f'(x). \tag{4}$$
For if we write
2~,
k'(z) ~ (1 + 5z') ~
2
k'(z)
yx + VXO,
k(x)
= sin (x2),
k(x)
=sin~
(cosyx)
0) ~
vro . . xi;
f(x)
J.-t.
x'.
f'(x) = rx.,...1
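Rule (4) for compound functions can be checked on two of the examples named in the text, $\sin\sqrt{x}$ and $\sin(x^2)$. The following sketch is illustrative: the function names are hypothetical, and the symmetric difference quotient is used only as a convenient numerical approximation to the derivative.

```python
import math


def numerical_derivative(func, x, h=1e-6):
    """Symmetric difference quotient (func(x+h) - func(x-h)) / (2h)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


def chain_sin_sqrt(x):
    """k(x) = sin(sqrt(x)): g'(y) = cos(y), f'(x) = 1/(2 sqrt(x))."""
    return math.cos(math.sqrt(x)) * (1.0 / (2.0 * math.sqrt(x)))


def chain_sin_square(x):
    """k(x) = sin(x^2): g'(y) = cos(y), f'(x) = 2x."""
    return math.cos(x * x) * (2.0 * x)


if __name__ == "__main__":
    x = 2.0  # hypothetical sample point
    print(chain_sin_sqrt(x), numerical_derivative(lambda t: math.sin(math.sqrt(t)), x))
    print(chain_sin_square(x), numerical_derivative(lambda t: math.sin(t * t), x))
```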
Exercises: 1) Carry out the differentiations of the exercises on page 421 by
using the rules of this section
1
2) Differentiate the fol!owing functions: x ~in x, i+x~ ~in nx, (;r' - 3x x
~~ond
of
(Remark: The function possesses only one point with vanishing derivative; therefore, since a minimum but obviously no maximum occurs, there is no need to study the second derivative.)
More Problems on Maxima and Minima: 5) Find the extrema of the following functions, sketch their graphs, determine the intervals of increase, decrease,
2, x/(1
6) Study the maxima and minima of the function $x^3 + 3ax + 1$ in their dependence on a.
7) Which point of the hyperbola $2y^2 - x^2 = 2$ is nearest to the point x = 0, y = 3?
8) Of all rectangles with given area find the one with the shortest diagonal.
9) Inscribe the rectangle of greatest area in the ellipse $x^2/a^2 + y^2/b^2 = 1$.
10) Of all circular cylinders with given volume find the one with the least
ox
For the limit, the derivative, which we called f'(x) (following the usage introduced later by Lagrange), Leibniz wrote
$$\frac{dy}{dx},$$
$$\frac{dx}{dy}\cdot\frac{dy}{dx} = 1,$$
"as if" the "differentials" may be cancelled out from something like an ordinary fraction. Likewise, rule (e) of page 431 for differentiating a compound function z = k(x), where
$$z = g(y), \qquad y = f(x),$$
now reads
$$\frac{dz}{dx} = \frac{dz}{dy}\cdot\frac{dy}{dx}.$$
THE CALCULUS
l VIlli
been adop~d, some of which haw' proved quit.e useful in the calculus
and in its application~ to goonwtry,
F(x)
f(u) du,
FJ&:.214.
Tbelntea;~Mfuno!Jonoluppatltuut
437
F'(x)
f(x).
In other words, the process of integration, leading from the function f(x) to F(x), is undone, inverted, by the process of differentiation, applied to F(x).
On an intuitive basis the proof is very easy. It depends on the interpretation of the integral F(x) as an area, and would be obscured if one tried to represent F(x) by a graph and the derivative F'(x) by its slope. Instead of this original geometrical interpretation of the derivative we retain the geometrical explanation of the integral F(x) but proceed in an analytical way with the differentiation of F(x). The difference
$$F(x_1) - F(x)$$
is simply the area between x and $x_1$, and this area lies between the values $(x_1 - x)m$ and $(x_1 - x)M$,
$$(x_1 - x)m \le F(x_1) - F(x) \le (x_1 - x)M,$$
where M and m are respectively the greatest and least values of f(u) in the interval between x and $x_1$. For these two products are the areas
Fig. 275. Proof of the fundamental theorem.
THE CALCULUS
[VIII]
m$
$M.
(2)
&8
stated.
1 =:(x~ =
F(x~
f(x),
f(x).
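The inequality $m \le \frac{F(x_1) - F(x)}{x_1 - x} \le M$ and its limit can be observed numerically. In the sketch below, F is approximated by a sum of rectangles evaluated at midpoints, which is a choice of convenience and an assumption of this illustration; the sample integrand cos u and the function names are likewise hypothetical.

```python
import math


def integral(f, a, x, n=20000):
    """Approximate F(x) = integral of f from a to x by n midpoint rectangles."""
    dx = (x - a) / n
    return sum(f(a + (j + 0.5) * dx) for j in range(n)) * dx


def area_quotient(f, a, x, x1):
    """Difference quotient (F(x1) - F(x)) / (x1 - x) of the area function F."""
    return (integral(f, a, x1) - integral(f, a, x)) / (x1 - x)


if __name__ == "__main__":
    # Hypothetical example: f(u) = cos(u), a = 0, x = 1.
    for x1 in (1.1, 1.01, 1.001):
        print(x1, area_quotient(math.cos, 0.0, 1.0, x1), math.cos(1.0))
```

On the interval from 1 to $x_1$ the function cos u is decreasing, so here M = cos 1 and m = cos $x_1$, and the printed quotient stays between them.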
+c
(c any constant.)
This leads to a most important rule for finding the value of an integral between a and b, provided we know a primitive function G(x) of f(x). According to our main theorem,
$$F(x) = \int_a^x f(u)\,du$$
is also a primitive function of f(x). Hence F(x) = G(x) + c, where c is a constant. The constant c is determined if we remember that $F(a) = \int_a^a f(u)\,du = 0$. This gives 0 = G(a) + c, so that c = -G(a). Then
$$\int_a^b f(u)\,du = G(b) - G(a). \tag{3}$$
To evaluate the integral we need only find a function G(x) such that G'(x) = f(x), and then form the difference G(b) - G(a).
2. First Applications. Integration of $x^r$, cos x, sin x. Arc tan x
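$$G'(x) = \frac{n + 1}{n + 1}x^n = x^n,$$
so that
$$\int_a^b x^n\,dx = G(b) - G(a) = \frac{b^{n+1} - a^{n+1}}{n + 1}.$$
This process is much simpler than the laborious procedure of finding the integral directly as the limit of a sum.
More generally, we found in §3 that for any rational s, positive or negative, the function $x^s$ has the derivative $sx^{s-1}$, and therefore, for s = r + 1, the function
$$G(x) = \frac{1}{r + 1}x^{r+1}$$
has the derivative $G'(x) = x^r$ (we assume $r \ne -1$). Hence
$$\int_a^b x^r\,dx = \frac{1}{r + 1}(b^{r+1} - a^{r+1}). \tag{4}$$
In (4) we suppose that in the interval of integration the integrand $x^r$ is defined and continuous, which excludes x = 0 if r is negative. We therefore make the assumption that in this case a and b are positive.
For G(x) = -cos x we have G'(x) = sin x, hence
$$\int_0^a \sin x\,dx = -(\cos a - \cos 0) = 1 - \cos a.$$
Likewise, since for G(x) = sin x we have G'(x) = cos x, it follows that
$$\int_0^a \cos x\,dx = \sin a - \sin 0 = \sin a.$$
Since $D \text{ arc tan } x = \frac{1}{1 + x^2}$, the function arc tan x is a primitive function of $\frac{1}{1 + x^2}$, and we obtain
$$\text{arc tan } b - \text{arc tan } 0 = \int_0^b \frac{1}{1 + x^2}\,dx.$$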
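Now we have arc tan 0 = 0 because to the value 0 of the tangent the value 0 of the angle is attached. Hence we find
$$\text{arc tan } b = \int_0^b \frac{1}{1 + x^2}\,dx. \tag{5}$$
If in particular b = 1, then arc tan b is equal to $\pi/4$, because to the value 1 of the tangent corresponds an angle of 45°, and we obtain
$$\frac{\pi}{4} = \int_0^1 \frac{1}{1 + x^2}\,dx.$$
This shows that the area under the graph of the function $y = 1/(1 + x^2)$ from x = 0 to x = 1 is one-fourth of the area of a circle of radius 1.
Figure: $\pi/4$ as area under $y = 1/(1 + x^2)$.
$$\frac{\pi}{4} = \frac{1}{1} - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \frac{1}{11} + \cdots \tag{7}$$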
By the symbol $+ \cdots$ we mean that the sequence of finite "partial sums", formed by breaking off the expression on the right after n terms, converges to the limit $\pi/4$ as n increases.
To prove this famous formula, we have only to recall the finite geometrical series
$$\frac{1 - q^n}{1 - q} = 1 + q + q^2 + \cdots + q^{n-1},$$
or
$$\frac{1}{1 - q} = 1 + q + q^2 + \cdots + q^{n-1} + \frac{q^n}{1 - q}.$$
Substituting $q = -x^2$, we obtain
$$\frac{1}{1 + x^2} = 1 - x^2 + x^4 - x^6 + \cdots + (-1)^{n-1}x^{2n-2} + R_n,$$
where
$$R_n = (-1)^n\frac{x^{2n}}{1 + x^2}.$$
1'
o
[b
x"' dx = 1/(m
1-!+!-!++(-l)"_, __I_+T.,
3
f. --"+-z
o I
2n-1
dx,
ia equal to 11'/4. The difference between r/4 and the partial sum
S.
~I - ~ + ~ + .. + ~-~;~
is rr/4- S,. = T,., What remains is to show tha.t T,. approaches zero
8':1 n increases. Now
forO:::;; x::; 1.
Recalling formula (13) of 1, which states that { f(x) dx :::; { g(x) dx
< b,
IT,.j=
we see that
{ 1 ~"xzdx:s;[x2 "dx;
since the right side is equal to 1/(Zn + 1), as we saw before (formula
(4)), we find I T. I < l/(2n + 1). Hence
li- s..
< 2n~l
But this shows that S,. tends with increasing n to 11'/4, since 1/(2n
tends to zero. Thus Leibniz' formula is proved.
+ 1)
443
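The estimate for $\pi/4 - S_n$ obtained in the proof can be confirmed for particular values of n. This is a small illustrative sketch; the function name `leibniz_partial_sum` is an assumption.

```python
import math


def leibniz_partial_sum(n):
    """S_n = 1 - 1/3 + 1/5 - ... + (-1)^(n-1) / (2n - 1)."""
    return sum((-1) ** (j - 1) / (2 * j - 1) for j in range(1, n + 1))


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        s = leibniz_partial_sum(n)
        # Compare the actual error with the bound 1/(2n + 1) from the proof.
        print(n, s, abs(math.pi / 4 - s), 1.0 / (2 * n + 1))
```

The printed errors show how slowly the series converges: the bound 1/(2n + 1) shrinks only in proportion to 1/n.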
There one usually begins with the integral powers $a^n$ of a positive number a, and then defines $a^{1/m} = \sqrt[m]{a}$, thus obtaining the value of $a^r$ for every rational r = n/m. The value of $a^x$ for any irrational x is next defined so as to make $a^x$ a continuous function of x, a delicate point which is omitted in elementary instruction. Finally, the logarithm of y to the base a,
$$x = \log_a y,$$
is defined as the inverse function of $y = a^x$.
In the following theory of these functions on the basis of the calculus the order in which they are considered is reversed. We begin with the logarithm and then obtain the exponential function.
Euler's Number e
$$F(x) = \log x = \int_1^x \frac{1}{u}\,du \tag{1}$$
(see Fig. 5, p. 29). The variable x may be any positive number. Zero is excluded because the integrand 1/u becomes infinite as u tends to 0.
It is quite natural to study the function F(x). For we know that the primitive function of any power $x^n$ is a function $x^{n+1}/(n + 1)$ of the same type, except for n = -1. In the latter case the denominator n + 1 would vanish and formula (4), p. 440 would be meaningless. Thus we might expect that the integration of 1/x or 1/u would lead to some new and interesting type of function.
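Although we consider (1) the definition of the function log x, we do not "know" the function until we have derived its properties and have found means for its numerical calculation. It is typical of the modern approach that we start with general concepts, establish definitions such as (1) on this basis, then deduce properties of the objects defined and, only at the very end, arrive at explicit expressions for numerical calculation.
$$F'(x) = 1/x. \tag{2}$$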
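From (2) it follows that the derivative is always positive, which confirms the obvious fact that the function log x is a monotone increasing function. The principal property of the logarithm is expressed by the formula
$$\log a + \log b = \log(ab). \tag{3}$$
By (2), and the rule for compound functions, the function k(x) = log (ax) has the derivative $k'(x) = \frac{1}{ax}\cdot a = 1/x$. Hence k(x) and log x have the same derivative, so that
$$\log(ax) = \log x + c,$$
where c is a constant. Now
$$\log 1 = 0,$$
because the defining integral has for x = 1 equal upper and lower limits. Hence we obtain
$$k(1) = \log(a \cdot 1) = \log a = \log 1 + c = c,$$
so that log (ax) = log a + log x; setting x = b gives (3). In particular, $\log(x^2) = 2\log x$, $\log(x^3) = 3\log x$, and in general
$$\log(x^n) = n\log x. \tag{4}$$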
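Definition (1) and the addition law (3) can be tried out by computing the defining integral numerically. In the sketch below the integral of 1/u is approximated by rectangles evaluated at midpoints; that choice, the function name `log_integral`, and the sample values are assumptions of this illustration.

```python
import math


def log_integral(x, n=10000):
    """Approximate log x = integral of 1/u from 1 to x by n midpoint rectangles (x > 0)."""
    dx = (x - 1.0) / n
    return sum(1.0 / (1.0 + (j + 0.5) * dx) for j in range(n)) * dx


if __name__ == "__main__":
    a, b = 2.0, 3.0  # hypothetical sample values
    print(log_integral(a) + log_integral(b), log_integral(a * b))
    print(log_integral(1.0), log_integral(0.5), -log_integral(2.0))
```

For x between 0 and 1 the step dx is negative, so the same sum yields a negative value, in keeping with the convention that interchanging the limits changes the sign of the integral.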
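Equation (4) shows that for increasing values of x the values of log x tend to infinity. Furthermore we have
$$0 = \log 1 = \log\left(x\cdot\frac{1}{x}\right) = \log x + \log\frac{1}{x},$$
so that
$$\log\frac{1}{x} = -\log x. \tag{5}$$
Finally,
$$\log x^r = r\log x \tag{6}$$
for any rational number r = m/n, so that
$$\log x^{m/n} = \frac{m}{n}\log x.$$
$$\log e = 1.$$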
THE CALCULUS
{VIII I
(8)
E(a)-E(b)
E(a +b)
for any pair of values a and b. This law is merely another form of the
law (3) for the lobarithm. For if we set
E(b) = x,
E'(a) = z
{i.e. b = log x, a ""' log z),
we have
lugxz = logx + logz = b +a,
and therefore
E(b +a) ~ xz ~ E(a)-E(b),
which was to Le proved,
Since by definition Jog e = I, we have
E(l)
f'-
e,
In general,
E(n) = e"
for any integer ~-P Likewise E(l/n) = en, so that E(p/q) = E(ljq)
E(l/q) = [eq] ; hence, setting pjq
E(') ~e'
r, we have
447
$$e^y = E(y)$$
for any real number y, since the E-function is continuous for all values of y, and identical with the value of $e^y$ for rational y. We can now express the fundamental law (8) of the E-function, or exponential function, as it is called, by the equation
$$e^ae^b = e^{a+b}. \tag{9}$$
Now we define $a^x$ by the compound expression
$$z = a^x = e^{x\log a}. \tag{10}$$
For example,
$$10^x = e^{x\log 10}.$$
We call the inverse function of $a^x$ the logarithm to the base a, and we see immediately that the natural logarithm of z is x times log a; in other words, the logarithm of a number z to the base a is obtained by dividing the natural logarithm of z by the fixed natural logarithm of a. For a = 10 this is (to four significant figures)
$$\log 10 = 2.303.$$
~-
f. ~ Vx dX
x - E(y),
448
THE CALCULUS
{VIII]
i.e.
(11)
E'(y) - E(y).
-fxe" =
(lla)
e~.
'
I and
therefore
Zt
If we set x1 = x
sequence
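By writing z = 1/x and using again the laws for the logarithm we obtain
$$z = \lim \log\left[\left(1 + \frac{z}{n}\right)^n\right] \quad \text{as } n \to \infty.$$
In terms of the exponential function,
$$e^z = \lim\left(1 + \frac{z}{n}\right)^n \quad \text{as } n \to \infty. \tag{12}$$
In particular, for z = 1,
$$e = \lim\,(1 + 1/n)^n, \tag{13}$$
and for z = -1,
$$\frac{1}{e} = \lim\,(1 - 1/n)^n. \tag{13a}$$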
21
n2
3!
n'
nn
by
replacing~
(14)
by 0 in each term.
450
lVIII}
THE CALCULUS
$e = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots,$
which establishes the identity of e with the number defined on page 298.
For $x = -1$ we obtain the series
$\frac{1}{e} = \frac{1}{2!} - \frac{1}{3!} + \frac{1}{4!} - \frac{1}{5!} + \cdots,$
which gives an excellent numerical approximation with very few terms,
the total error involved in breaking off the series at the nth term being
less than the magnitude of the (n + 1)st term.
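The error estimate for the alternating series $1/e = 1/2! - 1/3! + 1/4! - \cdots$ can be checked directly. This is a small Python sketch under that reading of the series; the name `inv_e_partial` is introduced only for the illustration.

```python
import math

def inv_e_partial(terms):
    """Sum the first `terms` terms of 1/2! - 1/3! + 1/4! - ..."""
    total, sign = 0.0, 1.0
    for k in range(2, 2 + terms):
        total += sign / math.factorial(k)
        sign = -sign
    return total

for terms in (2, 4, 8):
    approx = inv_e_partial(terms)
    first_dropped = 1.0 / math.factorial(2 + terms)
    # Columns: number of terms, approximation, actual error, first term dropped.
    print(terms, approx, abs(approx - 1.0 / math.e), first_dropped)
```

In each row the actual error stays below the first term dropped, as the text asserts.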
By exploiting the differentiation formula for the exponential function
we can obtain another expression for the logarithm. We have
$\frac{e^h - 1}{h} \to 1$
as h tends to 0, since
this limit is the derivative of $e^y$ for y = 0, and
this is equal to
$e^0 = 1$. In this formula we substitute for h the values
z/n, where z is an arbitrary number and n ranges over the sequence of
positive integers.
This gives
$n \frac{e^{z/n} - 1}{z} \to 1$, or $n(\sqrt[n]{e^z} - 1) \to z$
as n tends to infinity.
Writing $z = \log x$ or $e^z = x$, we finally obtain
(15)  $\log x = \lim n(\sqrt[n]{x} - 1)$ as $n \to \infty$.
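The limit $\log x = \lim n(\sqrt[n]{x} - 1)$ can be observed with a few lines of Python. This is a sketch for illustration; `log_via_roots` is a name invented here.

```python
import math

def log_via_roots(x, n):
    """Approximate log x by n * (x**(1/n) - 1)."""
    return n * (x ** (1.0 / n) - 1.0)

for n in (10, 1000, 100000):
    print(n, log_via_roots(2.0, n), math.log(2.0))
```

The error shrinks only like 1/n, and for very large n the subtraction of two nearly equal numbers loses floating-point accuracy, so the formula is not a practical way to compute logarithms.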
6) Show that the nth derivative of $e^{-1/x^2}$ has the form $e^{-1/x^2} \cdot 1/x^{3n}$ multiplied
by a polynomial of degree 2n - 2.
7) Logarithmic differentiation. By using the fundamental property of the
logarithm, the differentiation of products can sometimes be effected in a simplified
manner. We have for a product of the form
$p(x) = f_1(x) f_2(x) \cdots f_n(x),$
$D(\log p(x)) = D(\log f_1(x)) + D(\log f_2(x)) + \cdots + D(\log f_n(x)).$
+ n),
b) xe-..1
Numerical Calculation
It is not formula (15) that serves as the basis for numerical calculation
of the logarithm. A quite different and more useful explicit expression
of great theoretical importance is far better suited to this purpose. We
shall obtain this expression by the method used on page 441 for finding
$\pi$, exploiting the definition of the logarithm by formula (1). One small
preparatory step is needed; instead of aiming at log x, we shall try to
express $y = \log (1 + x)$, composed of the functions $y = \log z$ and
$z = 1 + x$. We have
$\frac{dy}{dx} = \frac{dy}{dz} \cdot \frac{dz}{dx} = \frac{1}{z} \cdot 1 = \frac{1}{1 + x}$. Hence
(16)  $\log (1 + x) = \int_0^x \frac{1}{1 + u}\,du.$
(Of course, this formula could just as well have been obtained intuitively
from the geometrical interpretation of the logarithm as an area. Compare
p. 413.)
In formula (16) we insert, as on page 442, the geometrical series for
$(1 + u)^{-1}$, writing
$\frac{1}{1 + u} = 1 - u + u^2 - \cdots + (-1)^{n-1} u^{n-1} + R_n, \qquad R_n = (-1)^n \frac{u^n}{1 + u}.$
Substituting this series in (16) we may use the rule that such a (finite)
sum can be integrated term by term. The integral of $u^s$ from 0 to x
yields $\frac{x^{s+1}}{s+1}$, and thus
(17)  $\log (1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots + (-1)^{n-1} \frac{x^n}{n} + T_n, \qquad T_n = (-1)^n \int_0^x \frac{u^n}{1 + u}\,du.$
We shall now show that $T_n$ tends to zero for increasing n provided that
x is chosen greater than -1 and not greater than +1, in other words,
for
$-1 < x \le 1,$
and therefore
-11
IT. I~ 1- a
1
u"dul,
i
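(18)  $\log (1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots$, valid for $-1 < x \le 1$.
In particular, for x = 1 we obtain
(19)  $\log 2 = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots.$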
This formula has a structure similar to that of the series for $\pi/4$.
The series (18) is not a very practical means for finding numerical
values for the logarithm, since its range is limited to values of 1 + x
between 0 and 2, and since its convergence is so slow that one must
include many terms before obtaining a reasonably accurate result.
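How slow the convergence is can be seen by summing the series for log 2, that is $1 - \frac12 + \frac13 - \frac14 + \cdots$, the case x = 1 of the logarithmic series. The following Python sketch is an illustration only; `log2_partial` is a name chosen for this example.

```python
import math

def log2_partial(n):
    """Partial sum 1 - 1/2 + 1/3 - ... with n terms."""
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))

for n in (10, 100, 1000):
    # The error is roughly 1/(2n): a thousand terms give only about three decimals.
    print(n, log2_partial(n), abs(log2_partial(n) - math.log(2)))
```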
By the following device we can obtain a more convenient expression.
Replacing x by -x in (18) we find
(20)  $\log (1 - x) = -x - \frac{x^2}{2} - \frac{x^3}{3} - \frac{x^4}{4} - \cdots.$
Subtracting (20) from (18) and using the fact that log a - log b = log a
+ log (1/b) = log (a/b), we obtain
(21)  $\log \frac{1 + x}{1 - x} = 2\left(x + \frac{x^3}{3} + \frac{x^5}{5} + \cdots\right).$
Not only does this series converge much faster, but now the left side
can express the logarithm of any positive number z, since $\frac{1 + x}{1 - x} = z$
always has a solution x between -1 and +1. Thus, if we want to
calculate log 3 we set $x = \frac{1}{2}$ and obtain
$\log 3 = \log \frac{1 + \frac12}{1 - \frac12} = 2\left(\frac{1}{2} + \frac{1}{3 \cdot 2^3} + \frac{1}{5 \cdot 2^5} + \cdots\right).$
With only 6 terms, up to the term with $11 \cdot 2^{11}$ in the denominator, we find the value
log 3 = 1.0986,
which is accurate to five digits.
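The computation of log 3 from the series $\log \frac{1+x}{1-x} = 2(x + \frac{x^3}{3} + \frac{x^5}{5} + \cdots)$ with $x = \frac12$ can be repeated in a few lines. This is a sketch for illustration; the function name `log_ratio_series` is an assumption of the example.

```python
import math

def log_ratio_series(x, terms):
    """Sum 2*(x + x**3/3 + x**5/5 + ...) using the given number of terms."""
    return 2.0 * sum(x ** (2 * k + 1) / (2 * k + 1) for k in range(terms))

# log 3 corresponds to (1 + x)/(1 - x) = 3, i.e. x = 1/2.
approx = log_ratio_series(0.5, 6)
print(approx, math.log(3))

# Any positive z can be handled by solving (1 + x)/(1 - x) = z for x.
z = 5.0
x = (z - 1.0) / (z + 1.0)
print(log_ratio_series(x, 60), math.log(z))
```

Six terms already reproduce the value 1.0986, in contrast to the hundreds of terms the series for log (1 + x) would need at the edge of its range.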
7. DIFFERENTIAL EQUATIONS
1. Definition
= f(x) with derivative u' = f'(x); the notation u' is a very useful abbreviation for f'(x)
as long as the quantity u and its dependence on x as the function f(x)
u'
+ sin (xu)
+ 3u =
:l.
More generally, a differential equation may involve the second derivative, u" = f"(x), or higher derivatives, as in the example
u" + 2u' - 3u = 0.
In any case the problem is to find a function u = f(x) that satisfies
the given equation. Solving a differential equation is a wide generalization of the problem of integration in the sense of finding the primitive
function of a given function g(x), which amounts to solving the simple
differential equation
u' = g(x).
u' = u
u = ce,
u' = ku,
then according to the rule for finding the derivative of an inverse function we have
h' =
But
~=
of
fu.
~,
so that x = h(u) =
lo~ u + b,
f(t),
and in which the quantity u is changing at each instant at a rate proportional to the value of u at that instant. In such a case, the rate of
change at the instant t,
Uo
which was
0.
that
(4)
Note that we start with a knowledge of the rate of change of u and deduce
the law (4), which gives the actual amount of u at any time t. This is
just the inverse of the problem of finding the derivative of a function.
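The passage from a known rate of change to the law itself can be imitated numerically. The sketch below assumes that the law in question is the exponential law $u = u_0 e^{kt}$ for the equation u' = ku; it advances u in many small steps, treating the rate ku as constant over each step. The function name `euler_growth` and the sample values are hypothetical.

```python
import math

def euler_growth(u0, k, t, steps):
    """Integrate u' = k*u from time 0 to time t in `steps` small steps."""
    h = t / steps
    u = u0
    for _ in range(steps):
        u += h * k * u  # rate of change k*u held fixed over one short interval
    return u

u0, k, t = 1.0, -0.5, 2.0  # hypothetical initial amount, constant, and time
for steps in (10, 1000, 100000):
    print(steps, euler_growth(u0, k, t, steps), u0 * math.exp(k * t))
```

As the steps become finer, the stepwise values approach $u_0 e^{kt}$, which is the sense in which the rate of change determines the amount.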
A typical example is that of radioactive disintegration. Let u = f(t)
be the amount of some radioactive substance at the time t; then on the
hypothesis that each individual particle of the substance has a certain
probability of disintegrating in a given time, and that the probability
is unaffected by the presence of other such particles, the rate at which
u is disintegrating at a given time t will be proportional to u, i.e. to the
total amount present at that time. Hence u will satisfy (3) with a
negative constant k that measures the speed of the disintegration process, and therefore
""'.
$\frac{u_2}{u_1} = \frac{u_0 e^{kt_2}}{u_0 e^{kt_1}} = e^{k(t_2 - t_1)},$
which depends only on $t_2 - t_1$. To find out how long it will take for
a given amount of the substance to disintegrate until only half of it is
left, we must determine $s = t_2 - t_1$ so that
$\frac{1}{2} = e^{ks}.$
k =
~~
-0.0000447.
It follows that
An example of a law of growth that is approximately exponential
is provided by the phenomenon of compound interest. A given amount
of money, $u_0$ dollars, is placed at 3% compound interest, which is to be
compounded yearly. After 1 year, the amount of money will be
$u_1 = u_0(1 + 0.03),$
after 2 years it will be
$u_2 = u_1(1 + 0.03) = u_0(1 + 0.03)^2,$
and after t years it will be
(6)  $u_t = u_0(1 + 0.03)^t.$
$\frac{100}{3} \log 2 = 23.10.$
twenty-three years.
Instead of following this step-by-step procedure and then passing to
the limit, we could have derived the formula (7) simply by saying that
the rate of increase u' of the capital is proportional to u with the factor
k = .03, so that
u' = ku,
where
k = .03.
The formula (7) then follows from the general result (4).
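A short numerical comparison may make the two modes of compounding concrete. The sketch assumes that the continuously compounded capital follows $u = u_0 e^{0.03 t}$; the function names are introduced here for illustration only.

```python
import math

def yearly_compound(u0, rate, years):
    """Capital after a whole number of years with interest added once a year."""
    return u0 * (1.0 + rate) ** years

def continuous_compound(u0, k, t):
    """Capital after time t when it grows at the rate u' = k*u."""
    return u0 * math.exp(k * t)

# Time needed for continuously compounded capital to double at k = 0.03.
doubling_time = math.log(2) / 0.03
print(round(doubling_time, 2))

for years in (1, 10, 23, 24):
    print(years, yearly_compound(1.0, 0.03, years), continuous_compound(1.0, 0.03, years))
```

With yearly compounding the capital first exceeds twice its initial value in the twenty-fourth year, slightly later than under continuous compounding.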
3. Other Examples. Simplest Vibrations
Differential equations such as
$u'' + u = 0,$
or more generally
$u'' + k^2 u = 0,$
for which u = cos kt and u = sin kt are solutions, occur in the study of
vibrations. This is why the oscillating curves u = sin kt and u = cos kt
and the solutions now are "damped" vibrations, mathematically expressed by the formula
2
e--rl/ cos
wf;
R~) :
(As an exercise the l'eader
efficient r.
-ex -
mx"
+ rx' + ex
= 0.
1. Differentiability
We have linked the concept of derivative of a function y = f(x) with
the intuitive idea of tangent to the graph of the function. Since the
general concept of function is so wide, it is necessary in the interests of
logical completeness to do away with this dependence on geometrical
[Fig. 282]  [Fig. 283]  [Fig. 284]
y=x+x=2x
y = X - X = 0
+ I x I,
for
x~O,
for
<
+1:<-tl.
where
! x I is
0.
+I
xI)+ j,j(x-
ll
one~half
of a. regular
non-differentiability,
we consider the function
$y = f(x) = x \sin \frac{1}{x},$
which is obtained from the function sin 1/x (see p. 283) by multiplication by the factor x; we define f(x) to be zero for x = 0. This function,
whose graph for positive values of x is shown in Figure 285, is con
$f'(x) = \sin \frac{1}{x} - \frac{1}{x} \cos \frac{1}{x}$
$\frac{f(0 + h) - f(0)}{h} = \frac{h \sin \frac{1}{h}}{h} = \sin \frac{1}{h}$
oscillates between -1 and +1 and does not approach any limit; hence the function cannot be differentiated at x = 0.
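The behaviour of the difference quotient of $f(x) = x \sin \frac{1}{x}$ at x = 0 can be watched numerically. The following Python sketch is illustrative; the sample values of h are chosen here so that sin (1/h) equals +1 or -1.

```python
import math

def f(x):
    """x * sin(1/x), with the value 0 assigned at x = 0."""
    return x * math.sin(1.0 / x) if x != 0 else 0.0

def difference_quotient(h):
    """(f(0 + h) - f(0)) / h, which equals sin(1/h) for h != 0."""
    return (f(0.0 + h) - f(0.0)) / h

for k in (1, 10, 100, 1000):
    h_top = 1.0 / (math.pi / 2 + 2 * math.pi * k)         # here sin(1/h) = +1
    h_bottom = 1.0 / (3 * math.pi / 2 + 2 * math.pi * k)  # here sin(1/h) = -1
    print(h_top, difference_quotient(h_top), difference_quotient(h_bottom))
```

However small h becomes, quotients near +1 and near -1 keep occurring, so no limit exists although the function itself is continuous at 0.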
2. The Integral
to the integral of a continuous
the "area under the curve"
exists and which can be exthe
$S_n = \sum_{j=1}^{n} f(v_j)(x_j - x_{j-1}) = \sum_{j=1}^{n} f(v_j) \Delta x_j,$
where $x_0 = a, x_1, \ldots, x_n = b$ is a subdivision of the interval of integration,
$\Delta x_j = x_j - x_{j-1}$ is the x-difference or length of the jth subinterval,
and $v_j$ is an arbitrary value of x in this subinterval, i.e. $x_{j-1} \le v_j \le x_j$.
(We may take, for example, $v_j = x_j$ or $v_j = x_{j-1}$.) Now we form a
sequence of such sums in which the number n of subintervals increases
and at the same time the maximum length of the subintervals decreases
to zero. Then the main fact is: The sum $S_n$ for a given continuous
function f(x) tends to a definite limit A, which is independent of the
specific way in which the subintervals and points $v_j$ are chosen. By
definition, this limit is the integral $A = \int_a^b f(x)\,dx$.
The existence of this limit requires analytical proof if we do not wish to rely on
an intuitive geometrical notion of area. This proof is given in every
rigorous textbook on the calculus.
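The independence of the limit from the choice of the intermediate points can be illustrated with a small computation. This is a sketch with equal subintervals and three common choices of the point $v_j$; the function name `riemann_sum` and the test function are assumptions of the example.

```python
def riemann_sum(f, a, b, n, choose="left"):
    """Sum f(v_j) * dx over n equal subintervals of [a, b].

    `choose` selects v_j as the left endpoint, the right endpoint,
    or (for any other value) the midpoint of each subinterval.
    """
    dx = (b - a) / n
    total = 0.0
    for j in range(1, n + 1):
        x_prev = a + (j - 1) * dx
        x_j = a + j * dx
        if choose == "left":
            v = x_prev
        elif choose == "right":
            v = x_j
        else:
            v = 0.5 * (x_prev + x_j)
        total += f(v) * dx
    return total

square = lambda x: x * x
for n in (10, 100, 10000):
    print(n, [riemann_sum(square, 0.0, 1.0, n, c) for c in ("left", "right", "mid")])
```

For f(x) = x² on the interval from 0 to 1 all three columns approach 1/3, although they do so at different speeds.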
Comparing differentiation and integration, we are confronted with
the following antithetical situation. Differentiability is definitely a restrictive
condition on a continuous function, but the actual carrying out
of the differentiation, i.e. the algorithm of the differential calculus, is in
practice a straightforward procedure based on a few simple rules. On
the other hand, every continuous function without exception possesses
an integral between given limits. But the explicit calculation
of such integrals, even for quite simple functions, is in general a very
difficult task. At this point the fundamental theorem of the calculus
becomes in many cases the decisive instrument for carrying out the
integration. However, for most functions, even for elementary
ones, integration does not yield simple explicit expressions, and the
numerical computation of integrals requires
Length
Dissociating the analytical notion of integral from its original geometrical
interpretation, we meet a number of other, equally important,
interpretations and applications. For example, the integral can be
interpreted in mechanics as expressing the concept of work. The following
simplest case will suffice for our explanation. Suppose a mass
moves along the x-axis under the influence of a force directed along
the axis. This mass is thought of as concentrated at the point with
the coordinate x, and the force is given as a function f(x) of the position,
the sign of f(x) indicating whether it points in the positive or negative
x-direction. If the force is constant and moves the mass from a to b,
then the work done is given by the product, (b - a)f, of the intensity f
of the force and the distance traversed by the mass. But if the intensity
varies with x, we shall have to define the amount of work done
by a limiting process (as we defined velocity). To this end we divide
the interval from a to b as before into small subintervals by the points
$x_0 = a, x_1, \ldots, x_n = b$; then we imagine that in each subinterval the
force is constant and equal, say, to $f(x_\nu)$, the actual value at the endpoint,
and calculate the work that would correspond to this stepwise
varying force:
$S_n = \sum_{\nu=1}^{n} f(x_\nu) \Delta x_\nu.$
If we now refine the subdivision as before and let n increase, we see that
the sum tends to the integral
$\int_a^b f(x)\,dx.$
Thus the work done by a continuously varying force is defined by an
integral.
As an example let us consider a mass m fastened by an elastic spring
to the origin x = 0. The force f(x) will,
in line with the discussion on
page 461, be proportional to x,
$f(x) = -k^2 x,$
where $k^2$ is a positive constant. Then the work done by this force if
the mass moves from the origin to the position x = b will be
$\int_0^b -k^2 x\,dx = -k^2 \frac{b^2}{2},$
and the work we must do against this force, if we want to pull out the
spring to this position, is $+k^2 \frac{b^2}{2}$.
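The stepwise definition of work can be carried out numerically for the spring. The sketch below assumes the force law $f(x) = -k^2 x$ and uses hypothetical values for the constant $k^2$ and the final position b; `work_by_force` is a name chosen for this illustration.

```python
def work_by_force(force, a, b, n):
    """Work of a stepwise constant force: on each of n equal subintervals
    the force is given the value it actually has at the right endpoint."""
    dx = (b - a) / n
    return sum(force(a + j * dx) * dx for j in range(1, n + 1))

k2 = 4.0  # hypothetical value of the positive constant k^2
b = 0.5   # hypothetical final position
spring = lambda x: -k2 * x

for n in (10, 1000, 100000):
    print(n, work_by_force(spring, 0.0, b, n), -k2 * b * b / 2)
```

The stepwise sums approach $-k^2 b^2 / 2$; the work to be done against the spring is the same quantity with the opposite sign.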
tivef'(:z:)
$L = \lim L_n$
as the length of the arc AB. (In Chapter VI the length of a circle
was obtained in this way as the limit of the perimeters of inscribed
regular n-gons.) It can be shown that for sufficiently smooth curves
this limit exists and is independent of the specific way in which the sequence
of inscribed polygons is chosen. Curves for which this holds
are said to be rectifiable. Any "reasonable" curve that arises in theory
or applications will be rectifiable, and we shall not dwell on the investigation
of pathological cases. It will suffice to show that the arc AB,
for a function y = f(x) with a continuous derivative f'(x), has a length
L in this sense, and that L can be expressed by an integral.
To this end, let us denote the x-coordinates of A and B by a and b
respectively, then subdivide the x-interval from a to b as before by the
points $x_0 = a, x_1, \ldots, x_j, \ldots, x_n = b$, with the differences $\Delta x_j = x_j - x_{j-1}$,
and consider the polygon with the vertices $x_j$, $y_j = f(x_j)$
above these points of subdivision. A single edge of the polygon will
have the length $\sqrt{\Delta x_j^2 + \Delta y_j^2} = \Delta x_j \sqrt{1 + (\Delta y_j / \Delta x_j)^2}$, so that the total length of the polygon is
$L_n = \sum_{j=1}^{n} \sqrt{1 + \left(\frac{\Delta y_j}{\Delta x_j}\right)^2}\,\Delta x_j.$
if
derivative
~ =
expression
(2)
vi-.,
Vl-f--j'(x)
by the integral
V'll
1'
a
v~~-~2'
whence
$\int_a^b \frac{dx}{\sqrt{1 - x^2}} = \arcsin b - \arcsin a.$
For the parabola $y = x^2$ we have f'(x) = 2x and the arc length from
x = 0 to x = b is
$\int_0^b \sqrt{1 + 4x^2}\,dx.$
For the curve y = log sin x we have f'(x) = cot x and the arc length
is expressed by
$\int_a^b \sqrt{1 + \cot^2 x}\,dx.$
We shall be content with merely writing down these integral expressions. They could be evaluated with a little more
technique than we
have at our command, but we shall go no farther in this direction.
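Although the integrals are not evaluated here, the definition of length by inscribed polygons lends itself to direct computation. The following Python sketch compares polygon lengths for the parabola $y = x^2$ with a closed form of $\int_0^b \sqrt{1 + 4x^2}\,dx$; the closed form is a standard result quoted as an assumption of this example, not something derived in the text, and both function names are invented for the illustration.

```python
import math

def polygon_length(f, a, b, n):
    """Length of the polygon inscribed in the graph of f with n equal x-steps."""
    xs = [a + (b - a) * j / n for j in range(n + 1)]
    return sum(math.hypot(xs[j] - xs[j - 1], f(xs[j]) - f(xs[j - 1]))
               for j in range(1, n + 1))

def parabola_arc_length(b):
    """Closed form of the integral of sqrt(1 + 4x^2) from 0 to b (assumed formula)."""
    return 0.5 * b * math.sqrt(1 + 4 * b * b) + 0.25 * math.asinh(2 * b)

parabola = lambda x: x * x
for n in (1, 10, 1000):
    print(n, polygon_length(parabola, 0.0, 1.0, n), parabola_arc_length(1.0))
```

The polygon lengths increase toward the value of the integral, each inscribed polygon being shorter than the arc it approximates.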
2. ORDERS OF MAGNITUDE
J;. -o.
and
~=
One might think that with the powers of n as a yardstick one could
measure the different degrees of becoming infinite for any sequence $a_n$
that tends to infinity. To do this one would have to find a suitable
power $n^s$ with the same order of magnitude as $a_n$; i.e. such that $a_n / n^s$
tends to a finite constant different from zero. It is a remarkable fact
that this is by no means
an with a > 1 (e.g.
tends to
large we choose s,
log n tends to
however small the positive exponent s. In
tions
(1)
and
(2)
~.-o
as n increases. Let $b = a^{1/s}$; since a is assumed to be greater than
1, b and also $\sqrt{b} = b^{1/2}$ will be greater than 1. We may write
$b^{1/2} = 1 + q,$
where q is positive.
$b^{n/2} = (1 + q)^n \ge 1 + nq > nq,$
so that
and
a;r.
<~=
mt
~<~=a~-o.
This remark may be used to prove (2). Setting $x = \log n$ and
$e^s = a$, so that $n = e^x$ and $n^s = (e^s)^x$, the ratio in (2) becomes
$\frac{x}{a^x},$
which is the special case of (3) for s = 1.
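The different orders of magnitude can be made tangible with a few numbers. The sketch below assumes that the two relations under discussion state that $n^s / a^n$ and $\log n / n^s$ both tend to zero (for a > 1 and s > 0); the function names and sample values are chosen here for illustration.

```python
import math

def power_over_exponential(n, s, a):
    """n**s / a**n, computed through logarithms so that large n does not overflow."""
    return math.exp(s * math.log(n) - n * math.log(a))

def log_over_power(n, s):
    """log n / n**s."""
    return math.log(n) / n ** s

# Even a large exponent s = 10 loses against a base a = 1.1 barely above 1.
for n in (100, 1000, 10000):
    print(n, power_over_exponential(n, 10, 1.1))

# Even a small exponent s = 0.1 eventually wins against the logarithm.
for n in (1e10, 1e50, 1e100):
    print(n, log_over_power(n, 0.1))
```

Both ratios are still large for moderate n, which shows that the statements concern the eventual behaviour only.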
Exercises:
1) Prove that for $x \to \infty$ the function log log x tends to infinity
more slowly than log x. 2) The derivative of x/log x is $1/\log x - 1/(\log x)^2$.
Prove that for large x it is "asymptotically" equivalent to the first term,
1/log x, i.e. that their ratio tends to 1 as $x \to \infty$.
of the rectangles whose tops are marked by solid lines, and which
together do not exceed the area
$\int_1^{n+1} \log x\,dx = (n + 1) \log (n + 1) - (n + 1) + 1$
under the logarithmic curve from 1 to n + 1 (see p. 450, Exercise 1).
But the sum $P_n$ is likewise equal to the total area of the rectangles whose
tops are marked by broken lines and which together exceed the area
under the curve from 1 to n, given by
$\int_1^n \log x\,dx = n \log n - n + 1.$
Thus we have
$n \log n - n + 1 < P_n < (n + 1) \log (n + 1) - n,$
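The two-sided estimate can be tested numerically. The sketch assumes that $P_n$ denotes $\log n! = \log 2 + \log 3 + \cdots + \log n$, and evaluates it with the library function `math.lgamma`, using the fact that lgamma(n + 1) equals log n!.

```python
import math

def log_factorial_bounds(n):
    """Return (lower bound, log n!, upper bound) for the estimate of P_n."""
    lower = n * math.log(n) - n + 1
    upper = (n + 1) * math.log(n + 1) - n
    p_n = math.lgamma(n + 1)  # log n!
    return lower, p_n, upper

for n in (2, 10, 100, 10**6):
    print(n, log_factorial_bounds(n))
```

Since both bounds are asymptotically equal to n log n, the same holds for log n!.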
1) - lo; n
iOg;i
~:;tated,
bt+bz+b~+,
where
(2)
Thus the equation (1) is equivalent to the limiting relation
(3)
lim
Sn
= sasn-+C(!,
where sn is defined by (2). When the limit (3) exists we say that the
series (1) comergcs to the values, while if the limit (3) does not exist,
we say that the series diverges.
Thus, the series
1-l+!-++
1+ ..
1-~+~
1+1+1+1+
diverges because the partial sums tend to infinity.
We have already encountered series whose terms $b_i$ are functions of
x of the form
$b_i = c_i x^i,$
with constant factors $c_i$. Such series are called power series; they are
limits of polynomials representing the partial sums
$S_n = c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n$
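(4)  $\frac{1}{1 + x} = 1 - x + x^2 - x^3 + \cdots$,  valid for $-1 < x < +1$
(5)  $\tan^{-1} x = x - \frac{x^3}{3} + \frac{x^5}{5} - \cdots$,  valid for $-1 \le x \le +1$
(6)  $\log (1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots$,  valid for $-1 < x \le +1$
(7)  $\frac{1}{2} \log \frac{1 + x}{1 - x} = x + \frac{x^3}{3} + \frac{x^5}{5} + \cdots$,  valid for $-1 < x < +1$
(8)  $e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots$
(9)  $\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots,$
(10)  $\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots$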
i"
X.$ 1.
1-cosx:::;
x'
x'
sinx~x-fs=x ~~
Proceeding indefinitely in this manner, we get the two sets of inequalities
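$\sin x \le x,$   $\cos x \le 1,$
$\sin x \ge x - \frac{x^3}{3!},$   $\cos x \ge 1 - \frac{x^2}{2!},$
$\sin x \le x - \frac{x^3}{3!} + \frac{x^5}{5!},$   $\cos x \le 1 - \frac{x^2}{2!} + \frac{x^4}{4!},$
$\sin x \ge x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!},$   $\cos x \ge 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!}.$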
ft
= C m
It follows that
$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots,$
$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots.$
Since the terms of the series are of alternating sign and decreasing
magnitude (at least for $|x| \le 1$), it follows that the error committed by
breaking off either series at any term will not exceed in magnitude the value
of the first term dropped.
Remarks. These series can be used for the computation of tables.
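Example: What is sin 1°? 1° is $\pi/180$ in radian measure; hence
$\sin \frac{\pi}{180} = \frac{\pi}{180} - \frac{1}{6}\left(\frac{\pi}{180}\right)^3 + \cdots.$
The error committed by breaking off here is not greater
than $\frac{1}{120}\left(\frac{\pi}{180}\right)^5$,
which is less than 0.000 000 000 02. Hence sin 1° = 0.017 452 406 4, to
10 places of decimals.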
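The computation of sin 1° from two terms of the sine series, together with the bound given by the first term dropped, can be reproduced as follows. This is a short Python sketch for illustration; the variable names are chosen here.

```python
import math

x = math.pi / 180        # one degree in radian measure
approx = x - x ** 3 / 6  # sine series broken off after two terms
bound = x ** 5 / 120     # magnitude of the first term dropped

print(approx)
print(math.sin(x))
print(abs(approx - math.sin(x)), bound)
```

The actual error lies below the bound, and the bound itself is smaller than 0.000 000 000 02, so ten decimal places are secured by only two terms.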
Finally, we mention without proof the "binomial series"
(11)  $(1 + x)^a = 1 + ax + C_2^a x^2 + C_3^a x^3 + \cdots,$
where
$C_s^a = \frac{a(a - 1)(a - 2) \cdots (a - s + 1)}{s!}.$
$\sqrt{1 + x} = 1 + \frac{1}{2} x - \cdots$
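$f(x) = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \cdots,$
$c_1 = f'(0).$
$f''(x) = 2c_2 + 2 \cdot 3 \cdot c_3 x + \cdots + (n - 1) \cdot n \cdot c_n x^{n-2} + \cdots;$
$c_2 = \frac{1}{2!} f''(0),$
$c_n = \frac{1}{n!} f^{(n)}(0),$
where $f^{(n)}(0)$ is the value of the nth derivative of f(x) at x = 0.
The result is the Taylor series
(14)  $f(x) = f(0) + x f'(0) + \frac{x^2}{2!} f''(0) + \frac{x^3}{3!} f'''(0) + \cdots.$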
$\cos n\varphi + i \sin n\varphi = (\cos \varphi + i \sin \varphi)^n.$
In this we substitute
If'
sm-
-i"-d
~-o
n
(see p. 307), we see that $\sin \frac{x}{n}$ is asymptotically equal to $\frac{x}{n}$.
We may therefore find it plausible to proceed to the limit formula
(14)  $\cos x + i \sin x = \lim \left(1 + \frac{ix}{n}\right)^n$ as $n \to \infty$.
Comparing the right side of this equation with the formula (p. 449)
$e^z = \lim \left(1 + \frac{z}{n}\right)^n$ as $n \to \infty$,
we have
(15)  $\cos x + i \sin x = e^{ix},$
comparing the right hand side with the series for sin x and cos x we
again obtain Euler's formula.
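A numerical experiment makes the limit formula plausible, although of course it proves nothing. The following Python sketch evaluates $(1 + ix/n)^n$ with complex arithmetic and compares it with cos x + i sin x; the function name `euler_limit` is invented for the illustration.

```python
import cmath
import math

def euler_limit(x, n):
    """Return (1 + i*x/n)**n as a complex number."""
    return (1 + 1j * x / n) ** n

x = 1.0
for n in (10, 1000, 100000):
    print(n, euler_limit(x, n), complex(math.cos(x), math.sin(x)))

# Library value of e^{ix} for comparison.
print(cmath.exp(1j * x))
```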
Such reasoning is by no means an actual proof of the relation (15).
The objection to our second argument is that the series expansion for
$e^z$ was derived under the assumption that z is a real number; therefore
the substitution z = ix requires justification. Likewise the validity of
the first argument is destroyed by the fact that the formula
$e^z = \lim \left(1 + \frac{z}{n}\right)^n$ as $n \to \infty$
1
1
right behaves most strangely, becoming
i sin x =
~i
the series
$1 - 1 + 1 - 1 + \cdots.$
This series does not converge, since its partial sums oscillate between 1
and 0. This indicates that functions may give rise to divergent series
even when the functions themselves do not show any irregularity. Of
course, the function
~-X
--l'
-1.
Since it
~-X for
x = -1.
$\frac{1}{1 + x^2}$ may
be expanded into
the series
$\frac{1}{1 + x^2} = 1 - x^2 + x^4 - x^6 + \cdots$
by substituting $x^2$ for x in (4). This series will also converge for $|x| < 1$,
while for x = 1 it again leads to the divergent series
$1 - 1 + 1 - 1 + \cdots$, and for $|x| > 1$ it diverges explosively, although the function itself is everywhere regular.
It has turned out that a complete explanation of such phenomena is
possible only when the functions are studied for complex values of the
2
x2
3. The Harmonic Series and the Zeta Function. Euler's Product for
the Sine
Series whose terms are simple combinations of the integers are particularly interesting. As an example we consider the "harmonic series"
(16)  $1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{n} + \cdots$
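even-numbered
terms only.
$s_n = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}$
tends to a finite limit. Although the terms of the series (16) approach
0 as we go out farther and farther, it is easy to see that the series does
not converge. For by taking enough terms we can exceed any positive
number whatever, so that $s_n$ increases without limit and hence the
series (16) "diverges to infinity." To see this we observe that
$s_2 = 1 + \frac{1}{2},$
$s_4 = s_2 + \left(\frac{1}{3} + \frac{1}{4}\right) > s_2 + \left(\frac{1}{4} + \frac{1}{4}\right) = 1 + \frac{2}{2},$
$s_8 = s_4 + \left(\frac{1}{5} + \cdots + \frac{1}{8}\right) > s_4 + \left(\frac{1}{8} + \cdots + \frac{1}{8}\right) = 1 + \frac{3}{2},$
and in general
(18)  $s_{2^m} \ge 1 + \frac{m}{2}.$
Thus, for example, the partial sums $s_{2^m}$ exceed 100 as soon as $m \ge 200$.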
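The grouping argument for the harmonic series $1 + \frac12 + \frac13 + \cdots$ can be checked for the first few powers of two. The sketch assumes the estimate in the form $s_{2^m} \ge 1 + m/2$; the function name `harmonic_partial` is chosen here for illustration.

```python
def harmonic_partial(n):
    """Partial sum 1 + 1/2 + ... + 1/n of the harmonic series."""
    return sum(1.0 / k for k in range(1, n + 1))

for m in range(0, 16):
    # Columns: m, the partial sum with 2**m terms, the lower bound 1 + m/2.
    print(m, harmonic_partial(2 ** m), 1 + m / 2)
```

The sums grow without bound but extremely slowly: doubling the number of terms adds only a little more than one half.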
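(19)  $1 + \frac{1}{2^s} + \frac{1}{3^s} + \frac{1}{4^s} + \cdots + \frac{1}{n^s} + \cdots$
may be shown to converge for any value of s greater than 1, and defines
for all s > 1 the so-called zeta function,
(20)  $\zeta(s) = \lim \left(1 + \frac{1}{2^s} + \frac{1}{3^s} + \cdots + \frac{1}{n^s}\right)$ as $n \to \infty$,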
0
so that
< ~<
1,
1
[~
-I - ljp;
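$1 + \frac{1}{2^s} + \frac{1}{3^s} + \cdots = \zeta(s),$
by virtue of the fact that every integer greater than 1 can be expressed
uniquely as the product of powers of distinct primes. Thus we have
represented the zeta-function as a product:
(21)  $\zeta(s) = \frac{1}{1 - 2^{-s}} \cdot \frac{1}{1 - 3^{-s}} \cdot \frac{1}{1 - 5^{-s}} \cdots$
$\zeta(1) = 1 + \frac{1}{2} + \frac{1}{3} + \cdots$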
diverges to infinity. This argument, which can easily be made into a rigorous proof, shows that there are infinitely many primes. Of course, this
is much more involved and sophisticated than the proof given by Euclid
(see p. 22). But it has the fascination of a difficult ascent of a mountain
peak which could be reached from the other side by a comfortable road.
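Both the agreement of the product over primes with the zeta series and the growth of the product for s = 1 can be observed numerically. The Python sketch below assumes the product form $\prod_p 1/(1 - p^{-s})$ taken over the primes p; the helper names are invented, and the comparison value $\pi^2/6$ for s = 2 is a classical result quoted without proof.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes not exceeding limit (limit >= 1)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            # Strike out the multiples of p, starting at p*p.
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [i for i in range(limit + 1) if sieve[i]]

def zeta_sum(s, terms):
    """Partial sum 1 + 1/2**s + ... + 1/terms**s."""
    return sum(1.0 / n ** s for n in range(1, terms + 1))

def euler_product(s, limit):
    """Product of 1/(1 - p**(-s)) over the primes p <= limit."""
    product = 1.0
    for p in primes_up_to(limit):
        product *= 1.0 / (1.0 - p ** (-s))
    return product

print(zeta_sum(2, 10000), euler_product(2, 10000), math.pi ** 2 / 6)

# For s = 1 the product keeps growing as more primes are included.
for limit in (10, 100, 10000):
    print(limit, euler_product(1, limit))
```

If there were only finitely many primes, the last column would eventually stop growing, in contradiction to the divergence of the harmonic series.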
Infinite products such as (21) are sometimes just as useful as infinite
series for representing functions. Another infinite product, whose discovery was one more of Euler's achievements, concerns the trigonometric
function sin x. To understand this formula we start with a remark on
polynomials. If
$f(x) = a_0 + a_1 x + \cdots + a_n x^n$ is a polynomial of
degree n and has n distinct zeros
$x_1, \ldots, x_n$, then it is known from
algebra that f(x) can be decomposed into linear factors:
$f(x) = a_n (x - x_1) \cdots (x - x_n)$
ao,
This infinite product converges for all values of x, and is one of the most
beautiful formulas in mathematics. For $x = \frac{1}{2}$ it yields
+ I)
'
22446688
$A(n) \sim \frac{n}{\log n}.$
By this is meant that the ratio of A(n) to n/log n tends to the limit 1
as n tends to infinity.
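The ratio in question can be tabulated by counting primes with a sieve. The sketch below assumes that A(n) denotes the number of primes not exceeding n; the function name `prime_count` is introduced for the illustration.

```python
import math

def prime_count(n):
    """Number of primes not exceeding n, by the sieve of Eratosthenes (n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            # Strike out the multiples of p, starting at p*p.
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return sum(sieve)

for n in (10**2, 10**4, 10**6):
    count = prime_count(n)
    print(n, count, count / (n / math.log(n)))
```

The ratio decreases toward 1 only slowly, which is typical of statements about asymptotic equality.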
We start by making the assumption that there exists a mathematical
law which describes the distribution of the primes in the following sense:
for large values of n the function A(n) is approximately equal to the
integral
f'
[a],,
Hen<'e
[n'},,
[1},
Mt +2M2 + 3M3+
= (Nt- Nz)
=
N1
+ 2(N2
- Na)
+ 3(N3 -
N,)
+ Nz + N, +
I:__!!:____
:r<" p- 1
< n.
Thus we
logp.
Comparing this with our previous asymptotic relation for log n! we find,
writing x instead of n,
(1)  $\log x \sim \sum_{p \le x} \frac{\log p}{p - 1}.$
(2)  $\log x \sim \int_2^x W(t) \frac{\log t}{t}\,dt.$
From this we shall determine the unknown function W(x). If we replace the sign $\sim$ by ordinary equality and differentiate both sides with
respect to x, then by the fundamental theorem of the calculus
$\frac{1}{x} = W(x) \frac{\log x}{x},$
(3)  $W(x) = \frac{1}{\log x}.$
approxi~
integral
X- 1
z xlogX dx.
(4)
\Ve
l~X
1
(logx) 2 '
1
(logx)l'
Iogx- xlogx
are approximately equal, since for large x the second term in both cases
will be much smaller than the first. Hence the integral (4) will be
asymptotically equal to the integral
$\int_2^x f'(x)\,dx = f(x) - f(2) = \frac{x}{\log x} - \frac{2}{\log 2},$
since the integrands will be almost equal over most of the range of integration. The term 2/log 2 can be neglected for large x since it is a
constant, and thus we obtain the final result
$A(x) \sim \frac{x}{\log x}.$
CHAPTER IX
RECENT DEVELOPMENTS
§1. A FORMULA FOR PRIMES
(see page 25)
+ h + j
qj 0
[(gk + 'l.g -c- k + l) (II + j) ..,. h - z]'
(k + 2)(1--[1/'Z
tl"
[IX]
1) + 1 - [((a +
- a))"- 1)
+ 4dy 2 ) + 1 [n + l + I' - :IJ!" - [(a"' - I)l2 + 1 - [ai + k + 1 - l - [p ...- f(a - n - 1) + b(2an + 2a
nJ - 2n - 2) - [q + y(a- p- 1) +
+ 2a p~
2)- .r]'
- [16riy'(a 2
+ 1(2ap -
- [z +pi(a - p)
- 1) - pm] 2 !
variab!f~S) =
2. THE GOLDBACH CONJECTURE
(page 30)
$y^2 = ax^3 + bx^2 + cx + d$
Say that (X, Y) is a rational point if both X and Y are rational numbers.
Then Fermat's Last Theorem is equivalent to the assertion that no rational point can lie on the Fermat curve (3) when $n \ge 3$. Between 1970
and 1975, Yves Hellegouarch investigated a curious connection between
Fermat curves (3) and elliptic curves (2). Jean-Pierre Serre suggested
trying the converse: to exploit properties of elliptic curves to prove results on Fermat's Last Theorem. In 1985 Gerhard Frey made this suggestion
precise by introducing what is now called the Frey elliptic curve
associated with a presumptive solution of the Fermat equation. Suppose
that there is a nontrivial solution $A^n + B^n = C^n$ of the Fermat equation,
and form the elliptic curve
(4)  $y^2 = x(x + A^n)(x - B^n)$
This is the Frey elliptic curve, and it exists if and only if Fermat's Last
Theorem is false. So in order to prove Fermat's Last Theorem it is
enough to prove that Frey's curve (4) cannot exist. The way to do this
is to follow the "indirect" method of proof (see p. 86): that is, to assume
that it does exist and deduce a contradiction. This implies that the Frey
curve does not exist after all, which implies that Fermat's Last Theorem
is true. Frey found strong evidence that his curve "ought not to exist"
by proving that it has several extremely curious and unlikely sounding
properties. In 1986 Kenneth Ribet pinned the problem down by proving
that Frey's curve cannot exist provided that a big unsolved problem in
number theory, the Taniyama conjecture, is true. He thereby reduced
one major unsolved problem, Fermat's Last Theorem, to another major
unsolved problem. This kind of reduction is often unhelpful, just replacing one hard problem by a harder one, but in this case it hit paydirt,
because it provided a context in which to tackle Fermat's Last Theorem.
The Taniyama conjecture is again technical, but it can be explained
with reference to a special case. There is an intimate relationship between the "Pythagorean equation" $a^2 + b^2 = c^2$, the unit circle, and the
trigonometric functions sin and cos. To find this relationship, observe
that the Pythagorean equation can be rewritten in the form $(a/c)^2 + (b/c)^2 = 1$,
which implies that the point (x, y) = (a/c, b/c) lies on the unit
circle, whose equation is
$x^2 + y^2 = 1$. It is well known that the trigonometric functions provide a simple way to represent the unit circle.
Specifically, Pythagoras's Theorem and the geometric definition of sin
and cos imply that the equation
(5)  $\cos^2 \theta + \sin^2 \theta = 1$
holds for any angle $\theta$ (see p. 277). If we set $x = \cos \theta$, $y = \sin \theta$, then
(5) states that the point (x, y) lies on the unit circle. To sum up: solving
the Pythagorean equation in integers is equivalent to finding an angle $\theta$
such that both $\cos \theta$ and $\sin \theta$ are rational numbers (equal respectively
to a/c and b/c). Because the trigonometric functions have all sorts of
pleasant properties, this idea is the basis of a really fruitful theory of
the Pythagorean equation.
The Taniyama conjecture says that (in a rather technical setting) a
similar kind of idea can be applied to any elliptic curve, but replacing
sin and cos by more sophisticated "modular" functions. So problems
about elliptic curves can be replaced by problems about modular functions,
just as problems about the circle can be replaced by problems
about trigonometric functions.
Wiles realized that Frey's approach can be pushed through to a satisfactory conclusion without using the full force of the Taniyama conjecture.
Instead, a particular case suffices, one that applies to a class of
elliptic curves known as "semistable." In a 100-page paper he marshalled
enough powerful machinery to prove the semistable case of the Taniyama conjecture, leading to the following theorem. Suppose that M and
N are distinct nonzero relatively prime integers such that MN(M - N)
is divisible by 16. Then the elliptic curve $y^2 = x(x + M)(x + N)$ can be
parametrized by modular functions. Indeed the condition on divisibility
by 16 implies that this curve is semistable, so the semistable Taniyama
conjecture establishes the desired property.
We now apply Wiles's theorem to Frey's curve (4) by letting $M = A^n$,
$N = -B^n$. Then $M - N = A^n + B^n = C^n$, so $MN(M - N) = -A^n B^n C^n$,
and we must show this is a multiple of 16. Now at least one of A, B, C
must be even: for if A and B are both odd then $C^n$ is a sum of two odd
numbers, hence even, which implies that C is even. We may further
assume that $n \ge 5$, because Euler long ago proved Fermat's Last Theorem for n = 3. But since the fifth or higher power of an even number is
divisible by $2^5 = 32$, the number $-A^n B^n C^n$ is a multiple of 32, hence
certainly a multiple of 16. Therefore Frey's curve satisfies the hypothesis
of Wiles's theorem, implying that it can be parametrized by modular
functions. However, Ribet's proof that the Taniyama conjecture implies
the nonexistence of Frey's curve works by proving that the Frey curve
cannot be parametrized by modular functions. This is a contradiction,
so Fermat's Last Theorem is true.
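The divisibility step of the argument is pure arithmetic and can be checked mechanically. The Python sketch below runs through small triples A, B, C that are of course not solutions of the Fermat equation (none exist); it only confirms that $-A^n B^n C^n$ is a multiple of 32 whenever at least one of the three numbers is even and $n \ge 5$. The function name is hypothetical.

```python
from itertools import product

def abc_power_product(A, B, C, n):
    """The integer -(A**n) * (B**n) * (C**n) appearing in the argument."""
    return -(A ** n) * (B ** n) * (C ** n)

checked = 0
for A, B, C in product(range(1, 8), repeat=3):
    if A % 2 == 0 or B % 2 == 0 or C % 2 == 0:
        for n in (5, 7, 11):
            # An even factor raised to the n-th power contributes at least 2**5.
            assert abc_power_product(A, B, C, n) % 32 == 0
            checked += 1
print(checked, "cases checked")
```

When all three numbers are odd the product is odd, which is why the parity remark about A, B and C is needed first.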
This proof is very indirect and requires sophisticated ideas. Moreover,
some difficulties emerged concerning the first version of Wiles's proof,
which added to the sense of drama. He circulated a message by electronic mail to the mathematical community, acknowledging these difficulties
but asserting his confidence that his methods would overcome
them. Repairing the proof took longer than hoped, but on 26 October
1994 Karl Rubin circulated another message: "As most of you know, the
argument described by Wiles
turned out to have a serious gap,
namely the construction of an Euler system. After trying unsuccessfully
to repair that construction, Wiles went back to a different approach,
which he had tried earlier but abandoned in favour of the Euler system
idea. He was then able to complete his proof."
§4. THE CONTINUUM HYPOTHESIS
(page 88)
now known that the Continuum Hypothesis is neither true nor false, but
undecidable. In order to understand what this means, we must briefly
recall the axiomatic method (p. 214). The axiomatic method specifies a
mathematical object by stating an explicit system of conditions, axioms,
that the object is required to satisfy. This focuses attention on the abstract relationships between that object and others, rather than on the
raw materials from which it is "built." Simple presentations of set theory
assume that notions such as "set" are defined, and describe how to
manipulate them. In order to set up a rigorous framework in which to
discuss the Continuum Hypothesis, it is necessary to specify a system
of axioms for set theory.
In 1964 Paul Cohen proved that the truth of the Continuum Hypothesis depends upon which axioms for set theory are chosen. The situation
is similar to that for geometry. The truth or falsity of Euclid's parallel
axiom depends upon the type of geometry: there is a "Euclidean" geometry for which it is true, but there are also "non-Euclidean" geometries for which it is false (see p. 218). Similarly, there are "Cantorian"
set theories in which the Continuum Hypothesis is true and "non-Cantorian" ones in which it is false. Earlier Kurt Gödel had proved that
the Continuum Hypothesis is true in some axiomatizations of set theory.
Using a new technique called "forcing," Cohen proved that in other axiomatizations it is false. In particular, there is no distinguished choice
of axioms that leads to a unique 'natural' theory of sets.
§5. SET-THEORETIC NOTATION
The complement A' is often written $A^c$, but A' is still common. The current notation for subsets is either $\subset$ or $\subseteq$. Unlike < and $\le$, the expression $A \subset B$ does not imply that $A \ne B$, either today or in Courant and
Robbins's time. In order to denote inequality in a subset relation, the
cumbersome notation $A \subsetneq B$ is used.
The notations A + B, AB, and A' do still survive in computer science
and electronic engineering, where they are used to describe circuits
formed from logic gates.
Ironically, the modern notation obscures the algebraic analogies in
properties (6-17) on p. 110. In view of (10, 11, 13), however, this may
not be entirely a bad thing.
§6. THE FOUR COLOR THEOREM
(see pages 247, 264)
The four color theorem was proved in June 1976 by Kenneth Appel
and Wolfgang Haken. Their proof depends upon showing that some two
thousand specific maps behave in a particular rather complicated way.
Checking all these cases is immensely tedious, so they used a computer,
which required several thousand hours to complete the checks. The
proof can now be verified in a few hours, thanks to better theoretical
methods and faster computers, but no "pencil and paper" proof has yet
been found. Does a simpler proof exist? Nobody knows, although it has
been shown that no substantially simpler proof can run along similar
lines.
Courant and Robbins's proof of the Five Color Theorem (p. 264) is
an adaptation of work of Arthur Kempe, an attorney and amateur mathematician, who published a purported proof of the Four Color Theorem
in 1879. It employs a variant on the method of mathematical induction
(pp. 9-20), the existence of a so-called "minimal criminal." The basic
idea is that if the Four Color Theorem is false, then there must exist
maps that require a fifth color. If such "bad" maps exist, they can be
incorporated into bigger maps in all sorts of ways, all of which will need
a fifth color too. Since there is no point in making bad maps bigger, we
go the opposite way
and look at the smallest bad maps, colloquially
known as minimal criminals.
The existence of a minimal criminal follows from the principle of the smallest integer (p. 18), which is equiv-
flaws and tried again. More subtle problems emerged and were duly
corrected. After some six months of this dialogue, Appel and Haken
became convinced that their method of proving unavoidability had a
good chance of success. In 1975 their research program moved from the
exploratory phase to the final attack. In January 1976 they began construction of an unavoidable set with some 2000 regions, and by June
1976 the work was complete. Then they tested each configuration in this
set for reducibility. Here the computer proved indispensable, duly reporting that every one of the 2000 configurations in Appel and Haken's
unavoidable set is reducible. This contradicts the assumed existence of
a minimal criminal, so four colors alone suffice to color any planar map.
To what extent can an argument that relies on an enormous computation, which an unaided human brain cannot possibly check, be considered a proof? Stephen Tymoczko, a philosopher, wrote: "If we accept
the four-color theorem as a theorem, then we are committed to changing
the sense of 'theorem,' or more to the point, to changing the sense of
the underlying concept of 'proof.'" However, few practicing research
mathematicians agree. One reason is that there exist mathematical
proofs that do not rely on a computer, yet are so long and complicated
that even after studying them for a decade, nobody could put his hand
on his heart and declare them to be totally unflawed. For example the
so-called "classification theorem for finite simple groups" is at least
10,000 pages long, required the efforts of over a hundred people, and can
be followed only by a highly trained specialist. However, mathematicians are generally convinced that the proof is correct. The reason is
that the strategy makes sense, the details hang together, nobody has
found a serious error, and the judgment of the people doing the work
is at least as trustworthy as that of an outsider. That conviction would
of course vanish if anybody, insider or outsider, found a mistake, but
so far nobody has.
There is nothing in the Appel-Haken proof that is any less convincing
than the classification theorem for finite simple groups. In fact, a computer is much less likely to make an error than a human, provided its
program is correct. Appel and Haken's proof strategy makes good logical sense; their unavoidable set was in any case obtained by hand; and
there seems little reason to doubt the accuracy of the program used to
check reducibility. Random "spot tests" have found nothing amiss. In a
newspaper interview, Haken summed up the consensus view: "Anyone,
anywhere along the line, can fill in the details and check them. The fact
that a computer can run through more details in a few hours than a
[Fig. 289: doubling the size of a line segment (1 dimension, 2 copies), a square (2 dimensions, 4 copies), and a cube (3 dimensions, 8 copies).]
does not tend to infinity. Here each term in the sequence is the square
of the previous term, plus c.
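The rule just stated (square the previous term, then add c) is easy to try numerically. The following is only a minimal sketch under stated assumptions: the sequence is assumed to start at 0, and "does not tend to infinity" is approximated by "stays within distance 2 of the origin for `max_iter` steps". The function name, the cutoff radius, and the iteration limit are illustrative choices, not taken from the text.

```python
def stays_bounded(c: complex, max_iter: int = 200, radius: float = 2.0) -> bool:
    """Iterate z -> z*z + c from z = 0 (assumed start) and report whether
    the terms stay within `radius` for `max_iter` steps."""
    z = 0j
    for _ in range(max_iter):
        z = z * z + c          # square the previous term, then add c
        if abs(z) > radius:    # heuristic stand-in for "tends to infinity"
            return False
    return True


if __name__ == "__main__":
    for c in (0, -1, 1, 0.5, -0.75 + 0.1j):
        print(c, stays_bounded(complex(c)))
```

A finite number of iterations can only suggest, never prove, that a sequence stays bounded, so the result should be read as an experiment rather than a decision procedure.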
The Hausdorff-Besicovitch dimension of a set, now often called its
fractal dimension, has many applications in different branches of science, because it is a precise quantity that can be measured experimentally
and compared with theory. Surprisingly, it need not be an integer. This curious feature, and the reason why the number is still reasonably considered as
a dimension, can be understood by thinking about a simpler version
known as scaling dimension, as follows. Some shapes can be assembled
to form larger copies of themselves. For example (see Fig. 289), it requires
two copies of a line segment (a 1-dimensional object) to make a line segment twice the size. It requires four copies of a (2-dimensional) square to
make one twice the size, and it requires eight copies of a (3-dimensional)
cube to make one twice the size. In general it requires $2^d$ copies of a d-dimensional hypercube (see p. 230) to make one twice the size, and it requires $c = a^d$ copies to make one a times the size.
We can solve this equation for d by taking logarithms (see p. 445,
equation (6)):
$$\log c = d \log a,$$
so that
$$d = \frac{\log c}{\log a}. \tag{6}$$
We can now work the other way round and use this equation to define
d, given c and a. The result is called the scaling dimension of the set
concerned. In examples this leads to intriguing conclusions. For
instance, the Cantor set (see p. 248) can be made three times as big
(a = 3) by assembling two copies (c = 2) (see Fig. 290).
According to definition (6) the scaling dimension d of the Cantor set
is therefore
$$d = \frac{\log 2}{\log 3} = 0.6309\ldots,$$
a real number but not an integer. Similarly the Sierpiński gasket (Fig.
291) can be doubled in size (a = 2) by assembling three copies, so its
scaling dimension is
$$d = \frac{\log 3}{\log 2} = 1.584962\ldots$$
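The formula d = log c / log a can be checked directly on the examples in the text. This is a small illustrative sketch; the function name `scaling_dimension` and the dictionary of examples are hypothetical names introduced here, while the copy counts and scale factors are the ones quoted above.

```python
import math


def scaling_dimension(copies: int, scale: float) -> float:
    """Return d = log(c) / log(a), where c copies assemble into a
    version of the shape that is a times the size."""
    return math.log(copies) / math.log(scale)


# (copies c, scale factor a) for the shapes discussed in the text
examples = {
    "line segment": (2, 2),
    "square": (4, 2),
    "cube": (8, 2),
    "Cantor set": (2, 3),
    "Sierpinski gasket": (3, 2),
}

if __name__ == "__main__":
    for name, (c, a) in examples.items():
        print(f"{name}: d = {scaling_dimension(c, a):.6f}")
```

The base of the logarithm does not matter, since it cancels in the quotient.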
This quantity is called a dimension because it takes the same value
as the usual dimension for "nice" sets such as intervals, squares, cubes,
and so on. The fractal dimension agrees with the scaling dimension for
many sets but is defined for sets that cannot be enlarged by assembling
copies of themselves. The fractal dimension of a fractal set is usually
not an integer, although sometimes it can be. For example, in 1991 Mitsuhiro Shishikura proved that the fractal dimension of the boundary of
the Mandelbrot set is 2. The true significance of the fractal dimension
is as a measure of "how well the set fills space" or "how rough the set
is." For example, the Cantor set, with dimension strictly between 0 and
1, fills space better than a point (dimension 0) but less well than a line
segment (dimension 1). Thus the fractal dimension resolves the question
whether the Cantor set should have dimension 0 or 1 (see p. 249) in a
very different manner from Poincaré's approach.
8. KNOTS
(see page 255)
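[Figure: knots labeled with their polynomials in t.
Left trefoil: $t^2 - t + 1$
Right trefoil: $t^2 - t + 1$
Figure-eight: $t^2 - 3t + 1$
Reef: $t^4 - 2t^3 + 3t^2 - 2t + 1$
Granny: $t^4 - 2t^3 + 3t^2 - 2t + 1$]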
[Displayed polynomials in the variables x and y, including P(granny); not recoverable from this copy.]
Here x and y are the two variables required to define the polynomial.
These results obviously prove not only that the two types of trefoil are
topologically inequivalent, but also that the reef knot and granny knot
are topologically inequivalent.
9. A PROBLEM IN MECHANICS
the floor, assume it stays there throughout the subsequent motion. Suppose we specify in advance how the train moves. The motion need not
be uniform: the train can speed up, stop suddenly, even go into reverse
for a time. It must start at one station and end at the other.
Courant and Robbins ask whether it is always possible to place the
rod in such a position that it never hits the floor during the journey.
Their solution is to note that the final position of the rod depends continuously on its initial position. There is a continuous range of starting
angles, from 0° to 180°. Because the final position depends continuously
on the initial position, Bolzano's theorem (p. 312) implies that the range
of final angles is also continuous. If we start with the rod lying down
forwards at 0°, it stays there. If we start with it lying down backwards
at 180°, it stays there. So the range of final angles includes all values
between 0° and 180°. In particular, it includes 90°, so we can arrange for
the rod to finish up vertical. Since it stays on the floor when it hits it, it
cannot hit the floor at all.
The difficulty is that the continuity assumption made in the above
discussion is arguably not justified. The problem is not the intricacies
of Newton's laws of motion, but those "absorbing boundary conditions":
if the rod hits the floor, then it stays there. In order to see why the
boundary conditions cause trouble, we introduce a topological picture
of the possible motions of the system. This approach, known as a phase
portrait, goes back to Poincaré. The idea is to draw a kind of space-time diagram of the motion, not just for a single initial position of the
rod, but for many different positions (in principle, all of them). The position of the rod is an angle between 0° and 360°, and we can graph this
in the horizontal direction (see Fig. 294). Let time run in the vertical
direction. Note that the left and right hand edges of this picture should
be identified because 0° = 360°: conceptually, the rectangle is rolled into
a cylinder.
Now, the path in space and time of the angle that determines the
position of the rod forms some curve that runs up the cylinder, what
Albert Einstein called a "world-line." Different initial angles lead to different curves. The laws of dynamics show that these curves vary continuously as the initial angle varies continuously, provided the
boundary conditions are not enforced. Without those conditions the rod
is free to turn a full 360°: there is no floor to prevent it turning all the
way round. A possible history is shown in Fig. 294a, and here the final
position does depend continuously upon the initial position.
However, when the absorbing boundary conditions are put back (Fig.
294b), the final position need not depend continuously on the initial one.
Curves that just graze the left-hand boundary can swing all the way over
to the right. Indeed, in this particular picture all initial positions end up
on the floor: contrary to what Courant and Robbins claim, there is no
choice that keeps the rod off the floor throughout the motion.
[Fig. 294: phase portraits (a) and (b), with the initial angle on the horizontal axis.]
This error in Courant and Robbins's reasoning was first pointed out
by Tim Poston in 1976, but it is still not widely known. The continuity
assumption can be resuscitated by imposing extra constraints on the
motion, for example a perfectly level track, no springs on the train, and
so forth. But it seems more instructive, as an exercise in the application
of topology to dynamics, to understand why the absorbing boundary
conditions destroy continuity. This difficulty is important in advanced
topological dynamics, where it has given rise to the concept of an "isolating block," which is a region such that no dynamical trajectories are
tangent to its boundary.
10. STEINER'S PROBLEM
(see page 359)
120°, is that P is the unique point such that the lines PA, PB, and PC
meet at 120° to each other (pp. 355-6). Steiner's problem can be generalized to the street network problem, which asks for the shortest network of lines (streets) joining a given set of points (towns) to each other
(p. 359). It has given rise to a fascinating conjecture, only recently
proved.
Suppose we wish to find a network of lines that will connect a set of
towns. One way to do this is to use a so-called spanning network, which
uses only the straight lines joining pairs of towns. Another is to use a
Steiner network, in which extra towns are permitted, such that the lines
running into them meet at 120° angles. Let the length of the shortest
spanning network for a given set of towns be called the spanning length,
and let the length of the shortest Steiner network be the Steiner length.
The problem of finding the Steiner length is discussed by Courant and
Robbins (p. 359) under the title "Street Network Problem." Obviously
the Steiner length is less than or equal to the spanning length. How much
smaller can it get?
Suppose, for example, that there are three towns at the vertices of
an equilateral triangle of side 1 unit. Fig. 295 shows the shortest Steiner
network and the shortest spanning network. The new point introduced
in the center is called a Steiner point: in general, a Steiner point is one
at which three lines (joining it to other points in the set of towns) meet
at angles of 120°. The spanning length is 2 and the Steiner length is √3.
In this case, the ratio between the Steiner length and the spanning length
is √3/2 = 0.866, and the saving in length obtained by using the shortest
Steiner network rather than the shortest spanning network is about
13.4%.
In 1968, Edgar Gilbert and Henry Pollak conjectured that no matter
how the towns are initially located, the Steiner length never falls short
of the spanning length by more than 13.4%. Equivalently,
$$\frac{\text{Steiner length}}{\text{spanning length}} \ge \frac{\sqrt{3}}{2} \tag{7}$$
for any set of towns. This statement has become known as the Steiner
ratio conjecture. After considerable effort it was finally proved by Ding
Zhu Du and Frank Hwang in 1991; we describe their approach once we
have set up the necessary background.
Finding the spanning length is a simple computation, even for a huge
number of towns. It is solved by the greedy algorithm: start with the
shortest connecting line you can find, and at each stage thereafter add
on the shortest remaining line that does not complete a closed loop,
until every town is included. Finding the Steiner length is nowhere near
as easy. You cannot just take all possible triples of towns, find their
Steiner points, and look for the shortest network that joins the towns
and meets either at towns or at these particular Steiner points. For example, suppose there are six towns arranged at the corners of two adjacent squares, as in Fig. 296. One possible Steiner tree is shown in Fig.
296a: it is found by solving the problem for a square of four towns first,
and then linking in the two remaining towns via their Steiner point with
one that is already hooked in. However, the shortest Steiner tree is that
shown in Fig. 296b. The grey squares are included only to indicate where
the towns are placed.
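The greedy rule for the spanning length (repeatedly add the shortest remaining line that closes no loop) is commonly known as Kruskal's algorithm. The sketch below is one possible way to write it, not a prescribed implementation: the function name `spanning_length`, the representation of towns as (x, y) tuples, and the union-find bookkeeping used to detect closed loops are all choices made here for illustration.

```python
import math
from itertools import combinations


def spanning_length(towns):
    """Length of the shortest spanning network of `towns` (a list of
    (x, y) points), found by the greedy rule."""
    parent = list(range(len(towns)))  # union-find forest, one root per town

    def find(i):
        # Follow parent links to the representative of i's component.
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # All straight lines between pairs of towns, shortest first.
    edges = sorted(
        (math.dist(towns[i], towns[j]), i, j)
        for i, j in combinations(range(len(towns)), 2)
    )

    total = 0.0
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:          # different components: no closed loop is formed
            parent[ri] = rj
            total += length
    return total


if __name__ == "__main__":
    # Three towns at the vertices of an equilateral triangle of side 1.
    triangle = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]
    spanning = spanning_length(triangle)
    steiner = math.sqrt(3)  # Steiner length for this triangle, as in the text
    print("spanning length:", spanning)
    print("Steiner ratio:", steiner / spanning)
```

Sorting all pairs of towns takes on the order of n² log n steps for n towns, which is polynomial. No comparably simple procedure is offered here for the Steiner length, in keeping with the difficulty the text describes.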
You cannot build up shortest Steiner trees piecemeal. The correct
generalization of Steiner point to a set of many towns is any point at
which links can meet at 120°. For as simple an example as four towns
at the vertices of a square, these points are not Steiner points of any
subset of three towns (Fig. 297). There are infinitely many points in the
plane, and even though most of them are probably irrelevant, it is not
obvious that any algorithms exist. In fact they do; the first was invented
by Z. A. Melzak, but in practice his method becomes unwieldy even for
moderate numbers of towns. It has since been improved, but not dramatically.
We now know that there are good reasons why these algorithms are
inefficient. The growing use of computers has led to the development
of a new branch of mathematics, Algorithmic Complexity Theory. This
studies not just algorithms (methods for solving problems) but how
efficient those algorithms are. Given a problem involving some number
n of objects (here towns), how fast does the running time of the solution
grow as n grows? If the running time grows no faster than a constant
multiple of a fixed power of n, such as $5n^2$ or $1066n^4$, then the algorithm
is said to run in polynomial time, and the problem is considered to be
"easy." Usually this means that the algorithm is practical (but it will not
be if the constant is absolutely huge). If the running time grows nonpolynomially (faster than any constant multiple of powers of n, for
instance exponentially, like $2^n$ or $10^n$), then the problem has nonpolynomial running time and is "hard." Usually this means that the algorithm is totally impractical. In between polynomial time and
exponential time is a wilderness of "fairly easy" or "moderately hard"
problems, where practicality is more a matter of experience.
For instance, adding two n-digit numbers requires at most 2n one-digit additions, including carries, so the time taken is bounded by a
constant multiple (namely 2) of the first power of n. Long multiplication
of two such numbers involves about $n^2$ one-digit multiplications and no
more than $2n^2$ additions, or $3n^2$ operations on digits, so now the bound
involves only the second power of n. The opinions of schoolchildren
notwithstanding, these problems are therefore "easy." In contrast, consider the Traveling Salesman Problem: find the shortest route that takes
a salesman through a given set of cities. If there are n cities then the
number of routes that we have to consider is $n! = n(n-1)(n-2)\cdots 3\cdot 2\cdot 1$,
which grows faster than any power of n. So case-by-case enumeration is hopelessly inefficient.
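To get a feel for these growth rates, the operation counts quoted above can simply be tabulated. This is an illustrative sketch only: the function names are hypothetical, and the counts are the bounds stated in the text (2n for addition, 3n² for long multiplication, n! routes), not measurements of any real program.

```python
import math


def addition_ops(n: int) -> int:
    """Upper bound from the text: at most 2n one-digit additions."""
    return 2 * n


def multiplication_ops(n: int) -> int:
    """Bound from the text: about 3n^2 operations on digits."""
    return 3 * n ** 2


def routes_to_check(n: int) -> int:
    """Number of routes counted in the text for n cities: n!."""
    return math.factorial(n)


if __name__ == "__main__":
    print(f"{'n':>4} {'2n':>6} {'3n^2':>8} {'n!':>28}")
    for n in (5, 10, 15, 20, 25):
        print(f"{n:>4} {addition_ops(n):>6} {multiplication_ops(n):>8} "
              f"{routes_to_check(n):>28}")
```

The two polynomial columns stay modest as n grows, while the factorial column quickly outruns any fixed power of n.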
Oddly enough, the big problem in Algorithmic Complexity Theory is
to prove that the subject actually exists. That is, to prove that some
"interesting" problem really is hard. The difficulty is that it is easy to
prove a problem is easy, but hard to prove that it is hard! To show a
problem is easy, you just exhibit an algorithm that solves it in polynomial time. It does not have to be the best or the cleverest: any will do.
But to prove that a problem is hard, it is not enough to exhibit some
algorithm with non-polynomial running time. Maybe you chose the
wrong algorithm, maybe there is a better one which does run in polynomial time. In order to rule that possibility out, you have to find some
mathematical way to consider all possible algorithms for the problem
and show that none of them runs in polynomial time. And that is extremely difficult.
There are lots of candidates for hard problems: the traveling salesman problem, the bin-packing problem (how can you best fit a set of
items of given sizes into a set of sacks of given sizes?), and the knapsack
problem (given a fixed size sack and many objects, does any set of
objects fill the bag exactly?). So far nobody has managed to prove any
of them are hard. However, in 1971 Stephen Cook of the University of
Toronto showed that if you can prove that any one problem in this
candidate group really is hard, then they all are. Roughly speaking, you
can "code" any one of them to become a special case of one of the
others: they sink or swim together. These problems are called NP-complete, where NP stands for nondeterministic polynomial. Everyone believes that
NP-complete problems really are hard, but this has never been proved.
NP-completeness relates to the Steiner problem because Ronald Graham, Michael Garey and David Johnson have proved that the problem
of computing the Steiner length is NP-complete. That is, any efficient
algorithm to find the precise Steiner length for any set of towns would
automatically lead to efficient solutions to all sorts of computational
problems that are widely believed not to possess such solutions.
The Steiner ratio conjecture (7) is therefore important, because it
proves you can replace a hard problem by an easy one without losing
very much. Gilbert and Pollak had quite a lot of positive evidence when
they stated that conjecture. In particular they could prove that something weaker must be true: the ratio of Steiner length to spanning length
is always at least 0.5. By 1990 various people had performed heroic
calculations to verify the conjecture completely for networks of 4, 5,
and 6 towns. For general arrangements of as many towns as you like,
they also pushed up the limits on the ratio from 0.5 to 0.57, 0.74,
and
0.8. Around 1990 Graham and Fan Chung raised it to 0.824, in a computation that they described as "really horrible; it was clear it was the
wrong approach."
To make further progress possible, the horrible calculations had to
be simplified. Du and Hwang found an approach that is so much better,
it does away with the horrible calculations completely. The basic question is how to get equilateral triangles in on the act. There is a big gap
between the triangle example in Fig. 295, which sets the bound on the
ratio, and a general system of towns, which is supposed to obey the
same bound. How can this No Man's Land be crossed? There is a kind
of halfway house. Imagine the plane tiled with identical equilateral triangles, in a triangular lattice (Fig. 298). Put towns only at the corners
of the tiles. It turns out that the only Steiner points that need be considered are the centers of the tiles. In short, you have a lot of control, not
just on computations, but on theoretical analyses.
Of course, not every set of towns conveniently lies on a triangular
lattice. Du and Hwang's insight is that the crucial ones do. Again the
proof is indirect, by contradiction. Suppose the conjecture is false. Then
there must exist a counterexample: some set of towns for which the
ratio is less than √3/2. Du and Hwang show that if a counterexample to
the conjecture exists then there must be one for which all the towns lie
on a triangular lattice. This introduces an element of regularity into the
problem, and it is then relatively simple to complete the proof.
In order to prove this lattice property they reformulate the conjecture
as a problem in game theory, where players compete and try to limit
the gains made by their opponents. Game theory was invented by John
von Neumann and Oskar Morgenstern in their classic Theory of Games
and Economic Behavior of 1947. In the Du-Hwang version of the Steiner
ratio conjecture, one player selects the general "shape" of the Steiner
tree, and the other picks the shortest one of that shape that they can
find. Du and Hwang deduce the existence of a lattice counterexample
by observing that the payoff for their game has a special "convexity"
property.
§11.
minimizes the total area. Perhaps surprisingly, the difficult step is the
first and most qualitative of Plateau's principles: that the shape consists
of a finite number of surfaces. The other two principles follow relatively
easily from geometric arguments, just as the 120° angle in Steiner's problem does. We first indicate this deduction and then discuss the proof of
Plateau's first principle.
The first step in the deduction of the second and third principles from
the first is to use the smoothness of the surfaces to reduce the problem
to one about planes. If a very small region near a line of intersection of
three surfaces, or a point of intersection of four, is magnified, then the
surfaces appear nearly flat, and the greater the magnification the flatter
they seem to be. By thinking about the errors involved in such an approximation, it turns out to be sufficient to prove Plateau's second and
third principles under the simplifying assumption that the surfaces are
planar. The second step is to reduce this question to one about lines on
a sphere. Consider how the planar regions intersect a sphere centered
on the line or point of intersection. The system of planes is then replaced
by a system of arcs of great circles (see Fig. 300). The analogue of the
requirement of minimum area is that the total length of these arcs should
be minimal. By a spherical version of Steiner's theorem (p. 354), proved
in a similar manner, the arcs meet in threes at angles of 120°. The third
On page 435 Courant and Robbins remark that "'differentials' as infinitely small quantities are now definitely and dishonorably discarded,"
an accurate reflection of the consensus view when What Is Mathematics? was written. Despite Courant and Robbins's verdict, there has always been something intuitive and appealing about the old-style
arguments with infinitesimals. They are still embedded in our language,
in ideas such as "instants" of time, "instantaneous" velocities, a curve
as a series of infinitely small straight lines, the area bounded by a curve
as an infinite sum of areas of infinitesimal rectangles. This kind of intuition turns out to be justified, for it has recently been discovered that
(8)
first order, and so are all the usual laws of algebra; but the "Archimedean axiom"
(9)
if x ,
is second order. Most of the usual axioms for the real numbers are first
order, but the list includes some that are second order. In fact the second order axiom (9) is the crucial one that rules out both infinitesimals
and infinities in R. However, it turns out that if the axioms are weakened
to comprise only the first order properties of R, then other models exist,
including some that violate (9) above. Let R* be such a model and call
it the system of hyperreal numbers. This idea, the basis of nonstandard
analysis, was discovered by Abraham Robinson around 1960. We have
already seen that there are non-Euclidean geometries and non-Cantorian
set theories; now we find that there are non-Archimedean number systems.
The set R* contains several important subsets. There is a set of "standard" natural numbers N = {0, 1, 2, 3, ...}, and there is also a larger
system of "nonstandard" natural numbers N*. There are the standard
integers Z and a corresponding extension to nonstandard integers Z*.
There are the standard rationals Q, and a corresponding extension to
nonstandard rationals Q*. And there are standard reals R and nonstandard reals (or hyperreals) R*.
Every first order property of R has a unique natural extension to R*.
However, (9) expresses a second order property, and it is false in R*.
The hyperreals contain actual infinities, actual infinitesimals. For example x ∈ R* is infinitesimal if and only if x ≠ 0 and x < 1/n for all
n ∈ N. The usual argument that "infinitesimals do not exist" actually
proves that real infinitesimals do not exist; that is, that the infinitesimals
in R* do not belong to R. But that is entirely reasonable, because R* is
bigger than R. Incidentally, the "correct" analogue of (9) in R* is
(10)
0,
number is finite if it is smaller than some standard real. It is infinitesimal if it is smaller than all positive standard reals. Anything not finite
is infinite, and anything not in R is nonstandard. If x is infinitesimal
then 1/x is infinite, and vice versa.
None of this would be of any great importance if all that could be
done was invent a new number system. But even though R and R* are
different, they are intimately connected. In fact, every finite hyperreal x
has a unique standard part std(x) which is infinitely close to x, that is,
x - std(x) is infinitesimal. In other words, each finite hyperreal has a
unique expression as "standard real plus infinitesimal." It is as if each
standard real is surrounded by a cloud of infinitely close hyperreals,
often called its halo. And each such halo surrounds a single real, which
for some obscure reason is usually called its shadow, although a word
like "core" or "center" would convey the image better. By using the
standard part we can transfer properties from R* to R, or vice versa.
To see how proofs in nonstandard analysis differ from their standard
counterparts, consider Leibniz's calculation of the derivative of the function y = f(x) = x². What he does is take a small number Δx and form
the ratio [f(x + Δx) - f(x)]/Δx. (Newton's approach was basically the
same, except that he used the symbol o in place of Δx.) Following Leibniz we calculate
$$\frac{(x + \Delta x)^2 - x^2}{\Delta x} = \frac{2x\,\Delta x + (\Delta x)^2}{\Delta x} = 2x + \Delta x.$$
Leibniz then argued that since Δx is infinitesimal, it can be ignored,
leaving 2x. However, Δx must be nonzero in order for [f(x + Δx) - f(x)]/Δx to make sense, in which case 2x + Δx is not equal to 2x. It
was this difficulty that led Bishop Berkeley to write his famous critique
The Analyst, Or a Discourse Addressed to an Infidel Mathematician,
pressed similar ideas, but not with the same crystal clarity as
Weierstrass's ε and δ.) Because nonzero values of Δx can tend to zero,
we may assume all values of Δx that are encountered during the calculation are nonzero, so that dividing by Δx is meaningful. Then we take
the limit as Δx → 0 to get rid of that awkward extra term Δx and leave
the required answer 2x.
In nonstandard analysis there is a simpler way. Take x to be finite
and standard (that is, let x ∈ R) and assume that Δx is a genuine infinitesimal. Instead of 2x + Δx take its standard part std(2x + Δx), which
is 2x. In other words, define the derivative of f(x) to be
$$\operatorname{std}\!\left(\frac{f(x + \Delta x) - f(x)}{\Delta x}\right),$$
where x is a standard real and Δx is any infinitesimal. The innocent-looking idea of the standard part is exactly what is needed to make the
derivative a real function of x instead of a hyperreal function of x and
Δx. It is a perfectly rigorous way of removing the Δx term, because
std(x) is a uniquely defined real. Instead of the extra Δx being swept
under the carpet with much special pleading, it is neatly expunged.
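The bookkeeping in this calculation can be imitated on a computer with a deliberately crude toy model. This is an assumption-laden sketch, not a construction of R*: a quantity a0 + a1·Δx + a2·Δx² + ... is stored as the list of its real coefficients, Δx is treated as a formal symbol, and the "standard part" is simply the coefficient a0. All function names are hypothetical, and the model only handles polynomial expressions in Δx.

```python
def poly_mul(p, q):
    """Multiply two polynomials in the formal infinitesimal dx.
    p[k] is the coefficient of dx**k."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_sub(p, q):
    """Subtract q from p, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [0.0] * (n - len(p))
    q = list(q) + [0.0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]


def divide_by_dx(p):
    """Divide by dx. Only finite if the dx-free term vanishes."""
    if p[0] != 0:
        raise ValueError("quotient would be infinite")
    return p[1:]


def std(p):
    """Toy 'standard part': discard every term containing dx."""
    return p[0]


def derivative_of_square(x: float) -> float:
    """std( [(x + dx)^2 - x^2] / dx ) for the function f(x) = x^2."""
    x_plus_dx = [x, 1.0]                                   # x + dx
    numerator = poly_sub(poly_mul(x_plus_dx, x_plus_dx), [x * x])
    quotient = divide_by_dx(numerator)                     # 2x + dx
    return std(quotient)


if __name__ == "__main__":
    for x in (0.0, 1.0, 3.0, -1.5):
        print(x, derivative_of_square(x))
```

The point of the sketch is only that the extra Δx term is removed by a well-defined operation (taking the dx-free coefficient) rather than by ignoring it; genuine hyperreals require the logical machinery described in the text.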
A course in nonstandard analysis looks like an extended parade of
exactly those errors that Courant and Robbins spend so many pages
teaching us to avoid. For example:
1. A sequence $s_n$ converges to a limit L if $s_\omega - L$ is infinitesimal for
all infinite ω. (Compare with p. 291.)
2. A function f is continuous at x if f(x + ε) is infinitely close to f(x)
(that is, f(x + ε) - f(x) is infinitesimal) for all infinitesimal ε. (Compare with p. 310.)
3. The function f has derivative d at x if and only if [f(x + Δx) -
f(x)]/Δx is infinitely close to d for all infinitesimals Δx. (Compare
with p. 417.)
4. The area of a curved region is an infinite sum of infinitesimal rectangles. (Compare with p. ...)
However, within the framework of
nonstandard analysis these statements can be given a rigorous meaning.
In fact, nonstandard analysis does not lead to any conclusions about
R that differ from standard analysis. It is easy to conclude from this that
there is no point in using the nonstandard approach, because "it does
not lead to anything new." But this criticism is not conclusive: the question is not "does it give the same results?" so much as "is it a simpler
or more natural way to derive those results?" As Newton showed in his
Principia, anything that can be proved with calculus can also be proved
by classical geometry. In no way does this imply that calculus is worthless, and the same goes for nonstandard analysis.
Experience suggests that proofs via nonstandard analysis are usually
shorter and more direct than the classical epsilon-delta proofs. This is
because they avoid complicated estimates of the sizes of things, which
form the bulk of the classical proof. The main obstacle to the widespread adoption of nonstandard analysis is that its appreciation requires
a background with an emphasis on mathematical logic, very different
from traditional analysis.
APPENDIX
SUPPLEMENTARY REMARKS, PROBLEMS, AND EXERCISES
Many of the following problems are intended for the somewhat advanced
reader. They are designed not so much to develop routine
technique as to stimulate inventive ability.
(1) How do we know that 3 does not divide any power of 10, as
stated on page 61? (See p. 47.)
(2) Prove that the principle of the smallest integer is a consequence
of the theorem of mathematical induction. (See p. 19.)
(3) By the binomial theorem applied to the expansion of $(1 + 1)^n$,
show that $C_0^n + C_1^n + C_2^n + \cdots + C_n^n = 2^n$.
(*4) Take any integer, z = abc..., form the sum of its digits,
a + b + c + ..., subtract this from z, cross out any one digit from
the result, and denote the sum of the remaining digits by w. From a
knowledge of w alone, can a rule be found for determining the value
of the digit crossed out? (There will be one ambiguous case, when
w = 0.) Like many other simple facts about congruences, this can be
used as the basis for a parlor trick.
(5) An arithmetical progression of first order is a sequence of numbers,
$a, a + d, a + 2d, a + 3d, \ldots$, such that the difference between successive members of the sequence is a constant. An arithmetical progression of second order is a sequence of numbers, $a_1, a_2, a_3, \ldots$, such
that the differences $a_{i+1} - a_i$ form an arithmetical progression of first
order. Similarly, an arithmetical progression of kth order is a sequence
such that the differences form an arithmetical progression of order
k - 1. Prove that the squares of the integers form an arithmetical
progression of second order, and prove by induction that the kth powers
of the integers form an arithmetical progression of order k. Prove that
any sequence whose nth term, $a_n$, is given by the expression
$c_0 + c_1 n + c_2 n^2 + \cdots + c_k n^k$, where the c's are constants, is an arithmetical progression of order k. *Prove the converse of this statement for k = 2;
k = 3; for general k.
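The definition in Exercise 5 lends itself to experiment: one forms successive differences until a constant sequence appears. The sketch below is an illustration only, with hypothetical function names; it treats a constant sequence as having order 0, a convention not stated in the text, and it can only inspect finitely many terms.

```python
def differences(seq):
    """Differences between successive members of a sequence."""
    return [b - a for a, b in zip(seq, seq[1:])]

def progression_order(seq):
    """Number of differencing steps needed before the sequence becomes constant.

    Returns None if the given terms run out before a constant row appears.
    """
    k = 0
    while len(seq) > 1:
        if all(x == seq[0] for x in seq):
            return k
        seq = differences(seq)
        k += 1
    return None

print(progression_order([n ** 2 for n in range(10)]))  # squares
print(progression_order([n ** 3 for n in range(10)]))  # cubes
```

With ten terms supplied, the squares become constant after two differencing steps and the cubes after three, in agreement with the statement to be proved.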
(6) Prove that the sum of the first n terms of an arithmetical progression of order k is an arithmetical progression of order k + 1.
(7) How many divisors has 10,296? (See p.
(8) From the algebraic formula $(a^2 + b^2)(c^2 + d^2) = (ac - bd)^2 + (ad + bc)^2$,
prove by induction that any integer $r = a_1 a_2 \cdots a_n$, where
all the a's are sums of two squares, is itself a sum of two squares. Check
this with $2 = 1^2 + 1^2$, $5 = 1^2 + 2^2$, $8 = 2^2 + 2^2$, etc. for r = 160,
r = 1600, r = 1300, r = 625. If possible, give several different representations of these numbers as sums of two squares.
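The numerical checks suggested in Exercise 8 can be carried out by a direct search. The following sketch is offered under the assumption that a brute-force search is acceptable for numbers of this size; the function names `two_square_representations` and `compose` are illustrative and do not come from the text.

```python
from math import isqrt

def two_square_representations(r):
    """All pairs (x, y) with 0 <= x <= y and x*x + y*y == r."""
    reps = []
    for x in range(isqrt(r // 2) + 1):
        y2 = r - x * x
        y = isqrt(y2)
        if y * y == y2:
            reps.append((x, y))
    return reps

def compose(p, q):
    """Combine two representations by (a^2+b^2)(c^2+d^2) = (ac-bd)^2 + (ad+bc)^2."""
    (a, b), (c, d) = p, q
    return abs(a * c - b * d), a * d + b * c

for r in (160, 1600, 1300, 625):
    print(r, two_square_representations(r))

# 2 = 1^2 + 1^2 and 5 = 1^2 + 2^2 combine to a representation of 10.
print(compose((1, 1), (1, 2)))
```

The search lists every representation with x not exceeding y, while `compose` produces one representation of a product from representations of its factors, which is the step used in the induction.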
(9) Apply the result of Exercise 8 to construct new Pythagorean
number triples from given ones.
(10) Set up rules for divisibility similar to those on page 35 for number
systems with the bases 7, 11, 12.
(11) Show that for two positive rational numbers, r = a/b and
s = c/d, the inequality r > s is equivalent to ad - bc > 0.
(12) Show that for positive r and s, with r < s, we always have
r<r
~ <8
8
and
(r
+ s)~.
+
+
2sin ~
+ cosn'f'
= sin (:in+!;)rp.
(15) Find what the formula of Exercise 3 on page 18 yields, if we substitute $q = E(\varphi)$.
Analytic Geometry
A careful study of the following exercises, supplemented by drawings
and numerical examples, will help in mastering the elements of analytic
we mutit
(16) Prove: If
ordinates of the
(xJ + x2)/2. '!/o =
P, are distinct, then
the ratio P 11'0 : P1P2 of
coOrdinates
then
be said
m =tan a=~.
X2- X1
xcof;8
usin{3
d = 0.
This is the normal form of the equation of the line l. Note that this
equation does not depend on the direction assigned to l, for a change
in direction would change the sign of every term on the left side, and
hence would leave the equation unchanged.
By multiplying the normal equation with an arbitrary factor, we obtain the general form of the equation of the line:
$ax + by + c = 0$.
To retrieve from this general form the geometrically significant normal
form we must multiply by a factor which will reduce the first two coefficients to $\cos\beta$ and $\sin\beta$, whose squares add up to 1. This may be
done by the factor $1/\sqrt{a^2 + b^2}$, which yields the normal form
$\frac{a}{\sqrt{a^2 + b^2}}\,x + \frac{b}{\sqrt{a^2 + b^2}}\,y + \frac{c}{\sqrt{a^2 + b^2}} = 0$,
so that we have
$\frac{a}{\sqrt{a^2 + b^2}} = \cos\beta$, $\frac{b}{\sqrt{a^2 + b^2}} = \sin\beta$, $-\frac{c}{\sqrt{a^2 + b^2}} = d$.
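The passage from the general form $ax + by + c = 0$ to the normal form can be sketched in a few lines. The code below assumes the normal form is written as $x\cos\beta + y\sin\beta - d = 0$, so that $d = -c/\sqrt{a^2 + b^2}$; the function names are hypothetical, and `signed_distance` evaluates the left side of the normal form at a point (u, v).

```python
from math import hypot

def normal_form(a, b, c):
    """Return (cos_beta, sin_beta, d) such that x*cos_beta + y*sin_beta - d = 0."""
    n = hypot(a, b)  # sqrt(a^2 + b^2)
    if n == 0:
        raise ValueError("a and b must not both be zero")
    return a / n, b / n, -c / n

def signed_distance(a, b, c, u, v):
    """Value of u*cos_beta + v*sin_beta - d for the line ax + by + c = 0."""
    cos_b, sin_b, d = normal_form(a, b, c)
    return u * cos_b + v * sin_b - d

print(normal_form(3, 4, -10))
print(signed_distance(3, 4, -10, 0, 0))
```

For the line 3x + 4y - 10 = 0 the two coefficients have squares adding up to 1, and the value at the origin has absolute value 2, the distance of the line from the origin.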
m(x - xo),
y=mx+yo
mx.1-
Prove that the line through two given points, $P_1(x_1, y_1)$, $P_2(x_2, y_2)$,
has an equation
factor, show that the equation of a line may be written in the intercept
form,
= 1,
it~;
h = ucos{J
+ vsin{3-
d,
or by
= 0.
the equation ax + by + c = 0 of a
Let $\lambda$ and $\lambda'$ be constants, with
$\lambda + \lambda' = 1$. Show that, if l and l' intersect in $P_0(x_0, y_0)$, then every
line through $P_0$ has an equation
$\lambda\, l(x, y) + \lambda'\, l'(x, y) = 0$,
and conversely; and that every such line is uniquely determined by the
choice of a pair of values for $\lambda$ and $\lambda'$. (Hint: $P_0$ lies on l if and only
if $l(x_0, y_0) = ax_0 + by_0 + c = 0$.) What lines are represented if l
and l' are parallel? Note that the condition $\lambda + \lambda' = 1$ is unnecessary,
but serves to determine a unique equation for each line through $P_0$.
(29) Use the result of the previous exercise to find the equation of a
line through the intersection $P_0$ of l and l' and through another point,
$P_1(x_1, y_1)$, without finding the coordinates of $P_0$. (Hint: Find $\lambda$ and
$\lambda'$ from the conditions $\lambda\, l(x_1, y_1) + \lambda'\, l'(x_1, y_1) = 0$, $\lambda + \lambda' = 1$.) Check
by finding the coordinates of $P_0$ (see pp. 76-77) and showing that
$P_0$ lies on the line whose equation you have found.
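The hint of Exercise 29 amounts to solving two linear conditions for the two multipliers. A minimal sketch follows, assuming each line is stored as a triple of coefficients (a, b, c) of ax + by + c = 0; the names `evaluate` and `line_through_intersection` are illustrative only.

```python
def evaluate(line, x, y):
    """Value of a*x + b*y + c for a line given as (a, b, c)."""
    a, b, c = line
    return a * x + b * y + c

def line_through_intersection(l, l_prime, x1, y1):
    """Line lam*l + lam'*l' = 0 through (x1, y1), with lam + lam' = 1."""
    v, v_prime = evaluate(l, x1, y1), evaluate(l_prime, x1, y1)
    if v == v_prime:
        raise ValueError("the two conditions do not determine the multipliers")
    lam = v_prime / (v_prime - v)   # from lam*v + (1 - lam)*v' = 0
    lam_prime = 1 - lam
    return tuple(lam * p + lam_prime * q for p, q in zip(l, l_prime))

# l: x - y = 0 and l': x + y - 2 = 0 intersect in (1, 1); the new line passes through (3, 2).
line = line_through_intersection((1, -1, 0), (1, 1, -2), 3, 2)
print(line, evaluate(line, 1, 1), evaluate(line, 3, 2))
```

The coordinates of the point of intersection are used here only for the final check, never for finding the line.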
(30) Prove that the equations of the bisectors of the angles formed
by intersecting lines l and l' are
$\sqrt{a'^2 + b'^2}\; l(x, y) = \pm\sqrt{a^2 + b^2}\; l'(x, y)$.
(Hint: See Ex. 27.) What do these equations represent if l and l'
are parallel?
(31) Find the equation of the perpendicular bisector of the segment
$P_1P_2$ by each of the following methods: (a) Find the equation of line
$P_1P_2$; find the coordinates of the midpoint $P_0$ of segment $P_1P_2$; find
the equation of the line through $P_0$ perpendicular to $P_1P_2$. (b) Write
the equation expressing the fact that the distance (p. 74) between $P_1$
and any point P(x, y) on the perpendicular bisector is equal to the
distance between $P_2$ and P; square both sides of the equation and
simplify.
(32) Find the equation of the circle through three non-collinear
points $P_1$, $P_2$, $P_3$, by each of the following methods: (a) Find the
equations of the perpendicular bisectors of the segments $P_1P_2$ and $P_2P_3$; find the coordinates of the center as the point of intersection of these lines; find
the radius as the distance between the center and $P_1$. (b) The equation
must be of the form $x^2 + y^2 - 2ax - 2by = k$ (see p. 74). Since
each of the given points lies on the circle we must have
$x_1^2 + y_1^2 - 2ax_1 - 2by_1 = k$,
$x_2^2 + y_2^2 - 2ax_2 - 2by_2 = k$,
$x_3^2 + y_3^2 - 2ax_3 - 2by_3 = k$,
for a point lies on a curve if and only if its coordinates satisfy the equation of the curve. Solve these simultaneous equations for a, b, k.
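Method (b) of Exercise 32 leads to three linear equations in the unknowns a, b, k. The sketch below solves them by Cramer's rule with exact fractions; this choice of method and the names `det3` and `circle_through` are assumptions made for illustration, not something prescribed by the text.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def circle_through(p1, p2, p3):
    """Solve 2*x_i*a + 2*y_i*b + k = x_i^2 + y_i^2 (i = 1, 2, 3) for (a, b, k)."""
    pts = [(Fraction(x), Fraction(y)) for x, y in (p1, p2, p3)]
    rows = [[2 * x, 2 * y, Fraction(1)] for x, y in pts]
    rhs = [x * x + y * y for x, y in pts]
    d = det3(rows)
    if d == 0:
        raise ValueError("the three points are collinear")
    def replaced(col):
        # Matrix with the chosen column replaced by the right-hand sides.
        return [[rhs[i] if j == col else rows[i][j] for j in range(3)] for i in range(3)]
    return tuple(det3(replaced(j)) / d for j in range(3))

a, b, k = circle_through((0, 0), (2, 0), (0, 2))
print(a, b, k, "radius squared:", k + a * a + b * b)
```

The center is (a, b), and completing the squares in $x^2 + y^2 - 2ax - 2by = k$ gives $k + a^2 + b^2$ for the square of the radius.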
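(33) To find the equation of the ellipse with major axis 2p, minor
axis 2q, and foci at F(e, 0) and F'(-e, 0), where $e^2 = p^2 - q^2$, use the
distances r and r' from F and F' to any point on the curve. By definition
of the ellipse, $r + r' = 2p$. By using the distance formula on
page 74, show that
$r'^2 - r^2 = (x + e)^2 - (x - e)^2 = 4ex$.
Since
$r'^2 - r^2 = (r' + r)(r' - r) = 2p(r' - r)$,
show that $r' - r = 2ex/p$. Solve this relation and $r' + r = 2p$ to find
$r = -\frac{e}{p}x + p$, $r' = \frac{e}{p}x + p$.
Since (again by the distance formula) $r^2 = (x - e)^2 + y^2$, equate this
expression for $r^2$ with the expression found just above,
$(x - e)^2 + y^2 = \left(-\frac{e}{p}x + p\right)^2$.
Expand, collect terms, substitute $p^2 - q^2$ for $e^2$, and simplify.
Show that the result may be expressed in the form
$\frac{x^2}{p^2} + \frac{y^2}{q^2} = 1$.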
Carry out the same procedure for the hyperbola, defined as the
locus of all points P for which the absolute value of the difference
$r - r'$ is equal to a given quantity 2p. Here $e^2 = p^2 + q^2$.
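The relations arising in the ellipse exercise can be checked numerically on a particular curve. The sketch below assumes the parametrization (p cos t, q sin t) of the ellipse $x^2/p^2 + y^2/q^2 = 1$, which is a convenience not used in the text, and the function name is hypothetical.

```python
from math import cos, sin, sqrt, hypot

def focal_distances(p, q, t):
    """For the point (p cos t, q sin t), return (r, r_prime, x, e)."""
    e = sqrt(p * p - q * q)      # e^2 = p^2 - q^2
    x, y = p * cos(t), q * sin(t)
    r = hypot(x - e, y)          # distance to F(e, 0)
    r_prime = hypot(x + e, y)    # distance to F'(-e, 0)
    return r, r_prime, x, e

p, q = 5.0, 3.0
for t in (0.0, 0.7, 1.9, 3.0, 4.4):
    r, r_prime, x, e = focal_distances(p, q, t)
    # r + r' should equal 2p, and r should equal p - (e/p) x.
    print(round(r + r_prime, 9), round(r - (p - e * x / p), 9))
```

Every row should print the value 2p = 10 followed by a number indistinguishable from zero.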
(34) The parabola is defined as the locus of a point whose distance
from a fixed line (the directrix) is equal to its distance from a fixed
point (the focus). If we choose the line $x = -a$ as directrix and
the point F(a, 0) as focus, show that the equation of the parabola may
be written in the form
$y^2 = 4ax$.
Geometrical Constructions
(35) Prove
the numbers
,.
',,c. ;"
ofp:~;t~hc~!n~h:i~~J:1;~('~:~~~
of
x'
Find algebraically the equations giving x, y in terms of x', y'.
(*40) Prove analytically by using Exercise 39 that by inversion the
totality of circles and straight lines is transformed into itself. Check
the properties a)-d) on page 142 separately, and likewise the transformations corresponding to Figure 61.
(41) What becomes of the two families of lines, x = const. and
y = const., parallel to the coordinate axes, after inversion in the unit
circle about the origin? Find the answer without and with analytic
geometry. (See p. 160.)
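For Exercise 41 a numerical experiment may suggest the answer before it is proved. The sketch below assumes the usual formulas for inversion in the unit circle about the origin, namely that (x, y) goes to $(x, y)/(x^2 + y^2)$; the function names are illustrative. It tests whether the images of points on the line x = c lie on the circle with center (1/(2c), 0) and radius 1/(2|c|), which passes through the origin.

```python
def invert(x, y):
    """Image of (x, y) under inversion in the unit circle about the origin."""
    s = x * x + y * y
    if s == 0:
        raise ValueError("the origin has no image under inversion")
    return x / s, y / s

def circle_residual(c, t):
    """How far the image of (c, t) is from satisfying (X - 1/(2c))^2 + Y^2 = 1/(4c^2)."""
    X, Y = invert(c, t)
    return (X - 1 / (2 * c)) ** 2 + Y ** 2 - 1 / (4 * c * c)

for t in (-3.0, -1.0, 0.0, 0.5, 4.0):
    print(invert(2.0, t), circle_residual(2.0, t))
```

The residuals should vanish up to rounding; by symmetry the lines y = const. should behave in the same way with the roles of the axes exchanged.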
Carry out the Apollonius constructions in simple cases of your
Try the solution analytically according to the method
of page 125.
Projective and Non-Euclidean Geometry
the plane of the conic with the plane through the circle in which the
Dandelin sphere touches the cone. (Since the circle does not come under
this characterization except as a limiting case, it is not
appropriate to choose this property as a definition of the conics, though
this is sometimes done.)
(50) Discuss: "A conic, regarded as both a set of points and a set
of lines, is self-dual." (See p. 209.)
(51) Try to prove Desargues's theorem in the plane by carrying out
the passage to the limit from the three-dimensional configuration of
Figure 73. (See p. 172.)
(52) How many lines intersecting four given skew lines can be
drawn? How can they be characterized? (Hint: Draw a hyperboloid
through three of the given lines, see p. 212.)
(*53) If the Poincaré circle is the unit circle of the complex plane,
then two points $z_1$ and $z_2$ and the z-values
'2 of the two
intersection of the "straight line" through
two points with the
unit circle define a cross-ratio ~!- ~~ 1 ;z~
z1
~ z~
Topology
which V - E + F = 3 - 3 + 1 = 1. How can we be sure that the
final result will not be a pair of triangles with no vertices in
common, so that V - E + F = 6 - 6 + 2 = 2? (Hint: We can assume that
the original network is connected, i.e. that one can pass from any vertex
to any other along edges of the network. Show that this property
cannot be destroyed by the two fundamental operations.)
We have admitted only two fundamental operations in the reduction.
Might it not happen at some stage that a triangle appears
having only one vertex in common with the other triangles of the
network? (Construct an example.) This would require a third
operation: Removal of two vertices, three edges, and a face. Would
this affect the proof?
Can a wide rubber band be wrapped three times around a broomstick
so as to lie flat (i.e. untwisted) on the broomstick? (Of course, the
rubber band must cross itself somewhere.)
(59) Show that a circular disk from which the point at the center
has been removed admits a continuous, fixed-point-free transformation
into itself.
of a disk one unit
Of course, this is not a transformation of the disk into itself, since
some points will be taken into points outside the disk. Why does
not the argument of page 255, based on the transformation
$P \to P^*$, hold in this case?
(61) Suppose we have a rubber inner tube, the inside of which is
painted white and the outside black. Is it possible, by cutting a small
hole, deforming the tube, and then sealing up the hole, to turn the tube
inside out, so that the inside will be black and the outside white?
(*62) Show that there is no "four color problem" in three dimensions
by showing that for any desired number n, n bodies can be placed in
space in such a way that each touches all the others.
a
14a),
all the others
118 consi~ts of five
Evtn if
(77) Find the continued fraction expansion for the ratio OB:AB
of page 123.
(78) Show that the ,equence a~ =
V2,
f(xJ)
'
'J..zf(x2)
f(htXt
h2x1)
v'l
+ xf ~
1
- (~
2 Xt
+ X"J~)
v'l
2:
+ -x~
~2 -
Xt
+ X2
2:
~~"-+' r-.:t-~
l'
(83) The same for $u = x^2$, $u = x^n$ for $x > 0$, $u = \sin x$ for $\pi \le x \le 2\pi$,
$u = \tan x$ for $0 \le x \le \pi/2$, $u = -\sqrt{1 - x^2}$ for $|x| \le 1$.
Maxima and Minima
between P and Q as in Figure
two given lines alternat-ely n
$(a_1 b_1 + \cdots + a_n b_n)^2 \le (a_1^2 + \cdots + a_n^2)(b_1^2 + \cdots + b_n^2)$,
valid for any set of pairs of numbers $a_i$, $b_i$; prove that the equality
sign holds only if the $a_i$ are proportional to the $b_i$. (Hint: Generalize
the algebraic formula of Ex. 8.)
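The hint can be made concrete in a small experiment. The sketch below assumes that the intended generalization of the formula of Ex. 8 is the identity expressing the difference of the two sides as a sum of squares $(a_i b_j - a_j b_i)^2$ taken over all pairs i < j; this reading of the hint, and the function names, are assumptions.

```python
from itertools import combinations

def cauchy_gap(a, b):
    """(a1^2 + ... + an^2)(b1^2 + ... + bn^2) - (a1 b1 + ... + an bn)^2."""
    dot = sum(x * y for x, y in zip(a, b))
    return sum(x * x for x in a) * sum(y * y for y in b) - dot * dot

def sum_of_squared_minors(a, b):
    """Sum of (a_i b_j - a_j b_i)^2 over all pairs i < j."""
    return sum((a[i] * b[j] - a[j] * b[i]) ** 2
               for i, j in combinations(range(len(a)), 2))

a, b = [1, 2, 3], [4, 5, 6]
print(cauchy_gap(a, b), sum_of_squared_minors(a, b))
print(cauchy_gap([1, 2, 3], [2, 4, 6]))  # proportional sets of numbers
```

The two quantities in the first row agree, and the gap disappears when the a's are proportional to the b's.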
(*92) With n positive numbers $x_1, \ldots, x_n$ we form the expressions
$s_k$ defined by
$s_k = (x_1 x_2 \cdots x_k + \cdots)/C_k^n$,
where the symbol "$+ \cdots$" means that all the $C_k^n$ products of combinations
of k of these quantities are to be added. Then prove that
$\sqrt[k+1]{s_{k+1}} \le \sqrt[k]{s_k}$,
where the equality sign holds only if all the quantities $x_i$ are equal.
(93) For n = 3 these inequalities state that for three positive numbers
a, b, c
The Calculus
(105) Differentiate the functions
$\sqrt{1 + x}$, $\sqrt{1 + x^2}$,
;(~j ~
t, -
oo
we have
(p ~ n2 + 2~ ! n2 + + n2 ! na)---+ i
~(sin~+ sin~+
...
+sin~)---+ cos b- 1.
(11~
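$u = \sinh x = \tfrac12(e^x - e^{-x})$,
$v = \cosh x = \tfrac12(e^x + e^{-x})$,
$w = \tanh x = \dfrac{e^x - e^{-x}}{e^x + e^{-x}}$,
called hyperbolic sine, hyperbolic cosine, and hyperbolic tangent,
respectively. These functions have many properties analogous to those of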
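$D \sinh x = \cosh x$,
$D \tanh x = 1/\cosh^2 x$,
$D\,\text{arc sinh}\,u = \dfrac{1}{\sqrt{1 + u^2}}$;
$D\,\text{arc cosh}\,v = \dfrac{1}{\sqrt{v^2 - 1}}$,
$D\,\text{arc tanh}\,w = \dfrac{1}{1 - w^2}$,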
(lwi
>
+ VU2:tl);
~;}~----~
1).
+ sinh 2x
+ ...
+ sinh nx
and
+ cosh x
cosh 2x
+ ..
+ cosh nx
G(x)
f(x) dx,
TECHNIQUE OF INTEGRATION
$H(u) = G(x)$,
where
$x = \psi(u)$, $u = \varphi(x)$,
and
$H'(u) = G'(x)\psi'(u)$.
If
$G'(x) = f(x)$,
we can write
$G(x) = \int f(x)\,dx$
and also
$G'(x)\psi'(u) = f(x)\psi'(u)$,
which gives
(I) $\int f(x)\,dx = \int f(\psi(u))\,\psi'(u)\,du$.
Written in Leibniz' notation (see p. 434) this rule takes the very
suggestive form
$\int f(x)\,dx = \int f(x)\,\frac{dx}{du}\,du$,
which means that the symbol dx may be replaced by the symbol $\frac{dx}{du}\,du$,
just as if dx and du were numbers and $\frac{dx}{du}$ a fraction.
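$J = \int \frac{1}{u \log u}\,du$.
Here we start with the right hand side of (I), with $x = \log u = \psi(u)$.
Then $\psi'(u) = \frac{1}{u}$, $f(x) = \frac{1}{x}$;
hence
$J = \int \frac{dx}{x} = \log x$, or
$\int \frac{du}{u \log u} = \log \log u$.
We can verify this result by differentiating both sides.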
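The verification by differentiation can also be imitated numerically. The sketch below uses a central difference quotient as a stand-in for the derivative; the step size and the function names are assumptions, and agreement at a few points is an illustration rather than a proof.

```python
from math import log

def derivative(f, u, h=1e-6):
    """Central difference quotient approximating f'(u)."""
    return (f(u + h) - f(u - h)) / (2 * h)

def antiderivative(u):
    return log(log(u))       # the result log log u, defined for u > 1

def integrand(u):
    return 1 / (u * log(u))  # the function under the integral sign

for u in (2.0, 3.0, 10.0):
    print(u, derivative(antiderivative, u), integrand(u))
```

The last two columns should agree to several decimal places for every u greater than 1.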
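$J = \int \cot u\,du = \int \frac{\cos u}{\sin u}\,du$.
Setting $x = \sin u = \psi(u)$, we have
$\psi'(u) = \cos u$, $f(x) = \frac{1}{x}$;
hence
$J = \int \frac{dx}{x} = \log x$, or
$\int \cot u\,du = \log \sin u$.
$J = \int \frac{\psi'(u)}{\psi(u)}\,du$.
We set $\psi(u) = x$ and find
$J = \int \frac{dx}{x} = \log x = \log \psi(u)$.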
We find
('O!'!Wt.
u = ift(u) we fir1d
d) J
c) J =
Jx dx
Then
= }(bg u?,
In the examples below (I) is used, starting from the left side.
f) $J = \int \frac{dx}{\sqrt{x}}$.
Set $\sqrt{x} = u$.
Then $x = u^2$ and $\frac{dx}{du} = 2u$.
Therefore
$J = \int \frac{1}{u}\,2u\,du = 2u = 2\sqrt{x}$.
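$J = \int \sqrt{1 - x^2}\,dx$. Set $x = \cos u$, so that $\frac{dx}{du} = -\sin u$. Then
$J = -\int \sin^2 u\,du = -\int \frac{1 - \cos 2u}{2}\,du = -\frac{u}{2} + \frac{\sin 2u}{4}$.
Using $\sin 2u = 2 \sin u \cos u = 2 \cos u \sqrt{1 - \cos^2 u}$,
we have
$J = -\tfrac12\,\text{arc cos}\,x + \tfrac12\,x\sqrt{1 - x^2}$.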
119)
JU2~~1~1'
12
120)
J ueu' du.
125)
121)
-!)
126)
f X2+-~:x+b
f VCt-f at.
1x~~ dt
t
122)
123)
J3 f\x dx.
J0 +d: +
J(;~~
1.
127)
f ~;.t
128)
dt.
arc sinh
~.
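$(p(x)\cdot q(x))' = p(x)\cdot q'(x) + p'(x)\cdot q(x)$,
so that
$p(x)\cdot q(x) = \int p(x)\cdot q'(x)\,dx + \int p'(x)\cdot q(x)\,dx$,
or
(II) $\int p(x)\cdot q'(x)\,dx = p(x)\cdot q(x) - \int p'(x)\cdot q(x)\,dx$.
$J = \int \log x\,dx$.
Set $p(x) = \log x$, $q'(x) = 1$, so that $q(x) = x$.
Then
$\int \log x\,dx = x \log x - \int \frac{x}{x}\,dx = x \log x - x$.
b) $J = \int x \log x\,dx$.
$J = \int x \sin x\,dx$.
Then
$\int x \sin x\,dx = -x \cos x + \sin x$.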
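The two results obtained by integration by parts, $x \log x - x$ for the integral of $\log x$ and $-x \cos x + \sin x$ for the integral of $x \sin x$, can be checked by differentiating them numerically. As before, the central difference quotient and its step size are assumptions of this sketch, and the function names are hypothetical.

```python
from math import log, sin, cos

def derivative(f, x, h=1e-6):
    """Central difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def primitive_of_log(x):
    return x * log(x) - x

def primitive_of_x_sin(x):
    return -x * cos(x) + sin(x)

for x in (0.5, 1.0, 2.0, 5.0):
    print(x,
          derivative(primitive_of_log, x) - log(x),
          derivative(primitive_of_x_sin, x) - x * sin(x))
```

Both differences should be negligibly small at every sample point.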
J
131) J
130)
J
133) Jx~e"'dx.
xe/ dx.
132)
x 2 cosxdx.
(Hint:
xa log x dx
(a #- -1).
(Hint: UseEx.l30)
Jsin"'
= -(m- 1)
.,
1
sin'-~2 xdx,
1'
sin"'xdx =
sin"'-2 xdx,
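because the first term on the right side of (II), pq, is equal to zero for
the values 0 and $\pi/2$. By repeated application of the last formula we
obtain the following values for $I_m = \int_0^{\pi/2} \sin^m x\,dx$:
$I_{2n} = \frac{2n-1}{2n}\cdot\frac{2n-3}{2n-2}\cdots\frac{1}{2}\cdot\frac{\pi}{2}$,
$I_{2n+1} = \frac{2n}{2n+1}\cdot\frac{2n-2}{2n-1}\cdots\frac{2}{3}$.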
Since $0 < \sin x < 1$ for $0 < x < \pi/2$, we have
$\sin^{2n-1} x > \sin^{2n} x > \sin^{2n+1} x$, so that
$I_{2n-1} > I_{2n} > I_{2n+1}$
(see p. 414)
or
.J
n-t,
Zn
2-24466 .. (2n)(2n)
in~
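If we now pass to the limit as $n \to \infty$ we see that the middle term tends
to 1, hence we obtain Wallis' product representation for $\pi/2$:
$\frac{\pi}{2} = \lim_{n\to\infty} \frac{2\cdot 2\cdot 4\cdot 4\cdot 6\cdot 6 \cdots 2n\cdot 2n}{1\cdot 3\cdot 3\cdot 5\cdot 5\cdot 7 \cdots (2n-1)(2n+1)} = \lim_{n\to\infty} \frac{2^{4n}(n!)^4}{[(2n)!]^2\,(2n+1)}$.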
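The slow convergence of Wallis' product is easily observed. The sketch below computes the partial products in two ways, as a running product of the factors $(2k)(2k)/((2k-1)(2k+1))$ and by the closed expression in factorials; the function names are illustrative, and floating-point arithmetic is assumed to be accurate enough for the purpose.

```python
from fractions import Fraction
from math import factorial, pi

def wallis_partial(n):
    """Product of (2k)(2k) / ((2k - 1)(2k + 1)) for k = 1, ..., n."""
    prod = 1.0
    for k in range(1, n + 1):
        prod *= (2 * k) * (2 * k) / ((2 * k - 1) * (2 * k + 1))
    return prod

def closed_form(n):
    """The same partial product written as 2^(4n) (n!)^4 / ([(2n)!]^2 (2n + 1))."""
    return Fraction(2 ** (4 * n) * factorial(n) ** 4,
                    factorial(2 * n) ** 2 * (2 * n + 1))

for n in (1, 10, 100, 1000, 100000):
    print(n, wallis_partial(n), pi / 2 - wallis_partial(n))
```

The partial products increase toward $\pi/2$, with an error that shrinks roughly in proportion to 1/n.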
Ber~
London: Allen
New York:
Norton, 1938.
D. E. Smith. A Source Book in Mathematics. New York: McGraw-Hill, 1929.
H. Steinhaus. Mathematical Snapshots. New York: Stechert, 1938.
H. Weyl. "The Mathematical Way of Thinking," Science, XCII (1940),
p. 437 ff.
CHAPTER I
L. E. Dickson. Introduction to the Theory of Numbers. Chicago:
University of Chicago Press, 1931.
- - . Modern Elementary Theory of Numbers. Chicago: University
of Chicago Press, 1939.
G. H. Hardy. "An Introduction to the Theory of Numbers," Bulletin
of the American Mathematical Society, XXXV (1929), p. 789 ff.
G. H. Hardy and E. M. Wright. An Introduction to the Theory of
Numbers. Oxford: Clarendon Press, 1938.
J. V. Uspensky and M. H. Heaslet. Elementary Number Theory.
New York: McGraw-Hill, 1939.
CHAPTER II
G. Birkhoff and S. MacLane. A Survey of Modern Algebra. New
York: Macmillan, 1941.
M. Black. The Nature of Mathematics. New York: Harcourt, Brace,
193.1.
CHAPTER III
J. L. Coolidge.
Oxford: Clarendon
Press, 1940.
A. De Morgan. A Budget of Paradoxes, 2 vols. Chicago: Open Court,
1915.
L. E. Dickson. New First Course in the Theory of Equations. New
York: Wiley, 1939.
CHAPTER IV
W. C. Graustein. Introduction to Higher Geometry. New York: Macmillan, 1930.
D. Hilbert. The Foundations of Geometry, translated by E. J. Townsend,
3rd edition. La Salle, Ill.: Open Court, 1938.
C. W. O'Hara and D. R. Ward. An Introduction to Projective Geometry.
Oxford: Clarendon Press, 1937.
G. de B. Robinson. The Foundations of Geometry. Toronto: University
of Toronto Press, 1940.
Girolamo Saccheri. Euclides ab omni naevo vindicatus, translated by
G. B. Halsted. Chicago: Open Court, 1920.
R. G. Sanger. Synthetic Projective Geometry. New York: McGraw-Hill, 1939.
CHAPTER V
Alexandroff
Einfachste Grundbegriffe der Topologie.
Berlin:
1932
D.
und S. Cohn-Vossen. Anschauliche Geometrie. Berlin:
Springer, 1932
M. H. A. Newman. Elements of the Topology of Plane Sets of Points.
P.
1939
Topulogie.
ncr, 1934.
Leipzig: Teub-
CHAPTER VI
R. Courant. Differential and Integral Calculus, translated by E. J.
McShane, revised edition, 2 vols. New York: Nordemann, 1940.
G. H. Hardy. A Course of Pure Mathematics, 7th edition. Cambridge:
University Press, 1938.
W. L. Ferrar. A Text-book of Convergence. Oxford: Clarendon Press,
1938.
For the theory of continued fractions see, e.g.
S. Barnard and J. M. Child. Advanced Algebra. London:
Macmillan, 1939.
CHAPTER VII
R. Courant. "Soap Film Experiments with Minimal Surfaces," American Mathematical Monthly, XLVII (1940), pp. 167-174.
J. Plateau. "Sur les figures d'équilibre d'une masse liquide sans
pesanteur," Mémoires de l'Académie Royale de Belgique, nouvelle
série, XXIII (1849).
- - . Statique expérimentale et théorique des Liquides. Paris: 1873.
CHAPTER VIII
C. B. Boyer. The
of the Calculus. New York: Columbia
University Press,
R. Courant. Differential and Integral Calculus, translated by E. J. McShane, revised edition, 2 vols. New York: Nordemann, 1940.
G. H. Hardy. A Course of Pure Mathematics, 7th edition. Cambridge:
University Press, 1938.
SUGGESTIONS FOR ADDITIONAL READING
5. Set-Theoretic Notation
Concepts of Modern Mathematics. New York
WHG
8. Knots
V. F. R. Jones. "A Polynomial Invariant for Knots via von Neumann
Algebras." Bulletin of the American Mathematical Society 12
(1985):103-111.
V. F. R. Jones. "Knot Theory and Statistical Mechanics." Scientific American 263, no. 5 (1990):52-57.
W. B. R. Lickorish and K. C. Millett. "The New Polynomial Invariants of
Knots and Links." Mathematics Magazine 61 (1988):3-23.
C. Livingston. Knot Theory. Carus Mathematical Monographs 24. Washington: Mathematical Association of America, 1993.
I. Stewart. From Here to Infinity. Oxford: Oxford University Press, 1996.
- - . "Knots, Links, and Videotape." Scientific American 270, no. 1
(1994):136-138.
9. A Problem in Mechanics
T. Poston. "Au Courant with Differential Equations." Manifold 18
(Spring 1976):6-9.
I. Stewart. Game, Set, and Math. Oxford: Blackwell, 1989.
10. Steiner's Problem
M. W. Bern and R. L. Graham. "The Shortest-Network Problem." Scientific American 260, no. 1 (1989):66-71.
E. N. Gilbert and H. O. Pollak. "Steiner Minimal Trees." SIAM Journal
of Applied Mathematics 16 (1968):1-29.
Z. A. Melzak. Companion to Concrete Mathematics. New York: Wiley,
1973.
I. Stewart. "Trees, Telephones, and Tiles." New Scientist 1795 (1991):
26-29.
P. Winter. "Steiner Problems in Networks: A Survey." Networks 17
(1987):129-167.
C. Isenberg. The Science of Soap Films and Soap Bubbles. New York:
Dover Publications, 1992.
12. Nonstandard Analysis
J. W. Dauben. Abraham Robinson: The Creation of Nonstandard Analysis. Princeton: Princeton University Press, 1995.
A. E. Hurd and P. A. Loeb. An Introduction to Nonstandard Real Analysis. New York: Academic Press, 1985.
H. J. Keisler. Foundations of Infinitesimal Calculus. New York: Prindle,
Weber, and Schmidt, 1976.
A. Robinson. Introduction to Model Theory and to the Metamathematics
of Algebra. Amsterdam: North-Holland, 1963.
K. D. Stroyan and W. A. J. Luxemburg. Introduction to the Theory of
Infinitesimals. New York: Academic Press, 1976.
INDEX
absolute value, 57
acceleration, 425
addition, of complex numbers, 90
of natural numbers, 1-3
of rational numbers, 53
of real numbers, 70
of sets, 110
adjunction of irrationals, 132
Alexander polynomial, 503, 505
algebra, Boolean, 114
fundamental theorem of, 101-103, 269-271
compact sets,
compass constructions, 145-146, 147-151
complement of a set, 111
complete quadrilateral, 170-180
complex conjugate, 93
complex numbers, 88-103
absolute value of, 93
angle of, 94
modulus of, 93
operations with, 90-91
trigonometric representation of, 95
complex variable, theory of functions of a, 478-479
composite numbers, 22
compound functions, 281-283
compound interest, 457
concurrent lines, 170
congruence of geometrical figures, 166
congruences (arithmetical), 31-40
conics, 198-212
equations of, 74-77
line, 207
metric definition of, 199, 494, 496
point, 204
projective definition of, 204
conjugate, complex, 93
conjugate, harmonic, 175-176
connectivity, 243-244
constant, 273
constructible numbers and number fields, 127-134
definition, 132-133
continuous variable, 274
continuum hypothesis, 88, 493-494
continuum of real numbers, 68
denumerability, 79-80
contour lines, 286-287
least squares, method of, 365-366
Leibniz' formula for π, 441
Leibniz and nonstandard analysis, 519, 521-522
length of a curve, 466-469
level lines, 286-287
metamathematics, 88
metric geometry, 109
minimax, points of, 343-345
modulo d, 32
modulus of complex number, 93
Moebius strip, 259-262
monotone function, 280
monotone sequence, 295-297
Morse relations, 345
motion, equations of, 460-461
ergodic, 353-354
rigid, 141
multiplicity of roots of algebraic equation, 102
natural numbers, 1-20, 520
n-dimensional geometry, 227-234
negative numbers, 54-55
Newtonian dynamics, 460-461, 506
non-denumerability of continuum, 81-83
non-Euclidean geometry, 218-227
nonstandard analysis, 518-523
NP-complete, 512
number fields, 127-134
number system, 51-107, 501
numbers, algebraic, 103-104
cardinal, 83-86, 493
complex, 88-103
composite, 22
constructible, 127-134
Fermat, 25, 119
natural, 1-20, 520
negative, 54-55
numbers, prime, 21-31
Pythagorean, 40-42
rational, 52-58, 520
real, 58-72
transcendental, 103-104
MATHEMATICS
"A lucid representation of the fundamental concepts and methods of the whole field of
mathematics.... Easily understandable."
Albert Einstein*
Written for beginners and scholars, for students and teachers, for philosophers and
engineers, What Is Mathematics? is a sparkling collection of mathematical gems that
offers an entertaining and accessible portrait of the mathematical world. Brought up to
date with a new chapter by Ian Stewart, this second edition offers new insights into
recent mathematical developments and describes proofs of the Four-Color Theorem
and Fermat's Last Theorem, problems that were still open when Courant and Robbins
wrote this masterpiece, but ones that have since been solved.
A marvelously literate story, What Is Mathematics? opens a window onto the world
of mathematics.
*Praise for the first edition:
"Without doubt, the work will have great influence. It should be in the hands of everyone,
professional or otherwise, who is interested in scientific thinking."
The New York Times
Mathematical Reviews
"Excellent. ... Should prove a source of great pleasure and satisfaction."
Journal of Applied Physics
Marston Morse
The late Richard Courant headed the Department of Mathematics at New York University
and was Director of the Institute of Mathematical Sciences, which was subsequently
renamed the Courant Institute of Mathematical Sciences. His book Mathematical Physics is
familiar to every physicist, and his book Differential and Integral Calculus is acknowledged
to be one of the best presentations of the subject written in modern times. Herbert Robbins
is New Jersey Professor of Mathematical Statistics at Rutgers University. Ian Stewart is
Professor of Mathematics at the University of Warwick, and author of Nature's Numbers and
Does God Play Dice? He also writes the "Mathematical Recreations" column in Scientific
American. In 1995 he was awarded the Royal Society's Michael Faraday medal for significant
contribution to the public understanding of science.
Oxford Paperbacks
Oxford University Press
U.S.$!9.95
ISBN 0-19-510519-2