
EVALUATION OF POWER SERIES

TSOGTGEREL GANTUMUR

Abstract. We consider the evaluation of elementary transcendental functions such as e^x, log x, sin x, arctan x, with the help of power series. We also discuss power series algorithms for computing the digits of π. Some notes on historically important algorithms are included.

Contents
1. Taylor polynomials
2. Roundoff error analysis
3. The exponential function
4. Logarithmic functions
5. Trigonometric functions
6. Inverse trigonometric functions
7. Computation of π

1. Taylor polynomials

After the basic arithmetic and comparison operations, and the n-th root extraction ⁿ√x, the next important operations for real number computations are the evaluation of elementary transcendental functions such as exp x, log x, sin x, and arctan x. Each of these functions can be represented as a power series, which yields an efficient way to approximately evaluate the function. In fact, the discovery of power series in the 17-th century led to a huge leap in the computational capability of humans.
Let us start by fixing some terminology.
Definition 1. A function f ∶ (a, b) → R is called analytic at c ∈ (a, b) if it is developable into a power series around c, i.e., if there are coefficients a_n ∈ R and r > 0 such that

f(x) = ∑_{n=0}^∞ a_n (x − c)^n, for all x ∈ (c − r, c + r). (1)
Moreover, f is said to be analytic in (a, b) if it is analytic at each c ∈ (a, b).
Remark 2. (a) This definition can be extended to complex valued functions f ∶ Ω ⊂ C → C
in a straightforward way.
(b) Power series can be differentiated term-wise, implying that the coefficients of the power series of f about c are given by a_n = f^(n)(c)/n!. In other words, if f is analytic at c, then the following Taylor series converges in a neighbourhood of c.

f(x) = ∑_{n=0}^∞ f^(n)(c)/n! ⋅ (x − c)^n. (2)
This formula was first published by Brook Taylor in 1715, although it was known previously to several mathematicians, including James Gregory as early as 1671.
Date: April 3, 2018.

Example 3. (a) Arguably the most important power series is the geometric series

1/(1 − x) = ∑_{n=0}^∞ x^n, (3)

which converges for all x satisfying ∣x∣ < 1.
(b) The next in line is perhaps

e^x = ∑_{n=0}^∞ x^n/n!, (4)

which converges for all x ∈ R. This was discovered by Leonhard Euler in 1748.
Assume that (1) converges, and let

T_n(x) = ∑_{k=0}^n a_k (x − c)^k, R_n(x) = f(x) − T_n(x), (5)

where Tn is called the n-th degree Taylor polynomial of f , and Rn is called the remainder.
The idea now is that in order to approximate f (x), we compute Tn (x) for some large n, such
that the remainder Rn (x) is small.

Figure 1. The exponential function e^x and its Taylor polynomials T₁, T₂, and T₃. To model roundoff error, the graph on the right is generated by introducing random error in the computation of the Taylor polynomials.

We need a way to estimate the remainder term. The simplest case is when the series (1) is
alternating, in which case we have
∣R_n(x)∣ ≤ ∣a_{n+1}(x − c)^{n+1}∣. (6)
This is in fact approximately true in general, as the following theorem shows.
Theorem 4 (Lagrange 1797). Let f ∈ C([c, x]) be n + 1 times differentiable in (c, x), with the n-th derivative f^(n) continuous in [c, x). Then there exists ξ ∈ (c, x), such that

f(x) = ∑_{k=0}^n f^(k)(c)/k! ⋅ (x − c)^k + f^(n+1)(ξ)/(n + 1)! ⋅ (x − c)^{n+1}. (7)
Proof. The case n = 0 is simply the mean value theorem. We will give a proof only for the
case n = 1, which contains all essential ideas of the general case.
We look for a quadratic polynomial q(z) = α + β(z − c) + γ(z − c)2 satisfying
q(c) = f (c), q(x) = f (x), and q ′ (c) = f ′ (c), (8)

which turns out to be the following unique polynomial

q(z) = f(c) + f′(c)(z − c) + [f(x) − f(c) − f′(c)(x − c)] ⋅ (z − c)²/(x − c)². (9)
Let g(z) = f(z) − q(z), so that g(c) = g(x) = 0 and g′(c) = 0. Then g is twice differentiable in (c, x), with

g′(z) = f′(z) − f′(c) − [f(x) − f(c) − f′(c)(x − c)] ⋅ 2(z − c)/(x − c)², (10)

and

g″(z) = f″(z) − 2[f(x) − f(c) − f′(c)(x − c)]/(x − c)². (11)
Moreover, g ′ (c) exists and g ′ ∈ C ([c, x)). Since g(c) = g(x), by Rolle’s theorem, there is
η ∈ (c, x) such that g ′ (η) = 0. Now recalling that g ′ (c) = 0 and g ′ ∈ C ([c, x)), another
application of Rolle’s theorem gives the existence of ξ ∈ (c, η) such that g ′′ (ξ) = 0. In other
words, we have

f(x) = f(c) + f′(c)(x − c) + ½ f″(ξ)(x − c)², (12)

for some ξ ∈ (c, x). □
Example 5 (exp). As (e^x)′ = e^x, the exponential series (4) has the remainder term

∣R_n(x)∣ = e^ξ ∣x∣^{n+1}/(n + 1)!, (13)

for some ξ ∈ (0, x) or ξ ∈ (x, 0), depending on whether x > 0 or x < 0. In case x > 0, we have e^ξ < e^x, which yields the following estimate on the relative error of T_n(x):

∣R_n(x)∣/e^x < x^{n+1}/(n + 1)!. (14)

In case x < 0, the best we can do is e^ξ ≤ 1, and so

∣R_n(x)∣ < ∣x∣^{n+1}/(n + 1)!. (15)
Note that the latter estimate is identical to (6), which comes from the alternating character
of the series (4) for x < 0.
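To make this concrete, here is a short Python sketch (our own illustration, not part of the original notes) that sums the Taylor polynomial T_n(x) of the exponential series and checks it against the alternating-series remainder bound (15) for a negative argument:

```python
import math

def exp_taylor(x, n):
    """Degree-n Taylor polynomial T_n(x) of e^x about c = 0."""
    term, total = 1.0, 1.0            # the k = 0 term
    for k in range(1, n + 1):
        term *= x / k                 # x^k / k! from the previous term
        total += term
    return total

# For x < 0 the series is alternating, so (15) bounds the remainder:
x, n = -0.5, 10
bound = abs(x) ** (n + 1) / math.factorial(n + 1)
assert abs(exp_taylor(x, n) - math.exp(x)) < bound
```

Each term is obtained from the previous one by a single multiplication and division, which is also how the roundoff analysis in the next section counts operations.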

Exercise 1. (a) Show that

1 + x + . . . + x^n → 1/(1 − x) as n → ∞, (16)

for ∣x∣ < 1.
(b) Look for a function f satisfying f′(x) = f(x) and f(0) = 1 in the form

f(x) = ∑_{n=0}^∞ a_n x^n, (17)

and arrive at the exponential series (4). Show that the series converges for all x ∈ R.
Exercise 2. Show that the binomial series

(1 + x)^a = 1 + ax + a(a − 1)/2! ⋅ x² + a(a − 1)(a − 2)/3! ⋅ x³ + . . . , (18)

converges for ∣x∣ < 1, where a ∈ R is a constant.

Table 1. Approximation of e^x by its Taylor polynomials. The last row shows the exact value e^x. For x = 1, we see that each additional term brings about 1 correct decimal digit. For x = 1/4 and x = 1/16, each additional term gives approximately 1.5 and 2.5 correct decimal digits, respectively. This type of convergence is called linear convergence. In the last column, we never get the last digit correct, because of roundoff errors. The use of guard digits in the intermediate computations would be needed to settle this issue, cf. Example 6.

n    T_n(1) ≈ e    T_n(1/4) ≈ e^{1/4}    T_n(1/16) ≈ e^{1/16}
1 2.0000000000000000 1.2500000000000000 1.0625000000000000
2 2.5000000000000000 1.2812500000000000 1.0644531250000000
3 2.6666666666666665 1.2838541666666667 1.0644938151041667
4 2.7083333333333330 1.2840169270833335 1.0644944508870444
5 2.7166666666666663 1.2840250651041669 1.0644944588343304
6 2.7180555555555554 1.2840254041883683 1.0644944589171146
7 2.7182539682539684 1.2840254162985183 1.0644944589178538
8 2.7182787698412700 1.2840254166769605 1.0644944589178595
9 2.7182815255731922 1.2840254166874727 1.0644944589178595
10 2.7182818011463845 1.2840254166877356 1.0644944589178595
∞ 2.7182818284590452 1.2840254166877415 1.0644944589178594

Exercise 3 (Cauchy's form of the Taylor remainder). In the setting of Theorem 4, show that

f(x) = ∑_{k=0}^n f^(k)(c)/k! ⋅ (x − c)^k + f^(n+1)(η)/n! ⋅ (x − η)^n (x − c), (19)

for some η between c and x.

2. Roundoff error analysis


We shall consider the computation of the Taylor polynomial (5) in inexact arithmetic. Let us denote the k-th term of the Taylor polynomial by b_k, and its computed value by b̃_k, as

b_k = a_k (x − c)^k, b̃_k = (1 + β_k)b_k, k = 0, . . . , n, (20)

where β_k accounts for the error made during the computation of b_k. If we assume that the coefficients a_k are given, we need at most n multiplications in a_k (x − c)^k, and hence we can estimate ∣β_k∣ ≤ ρ×(n, ε), with ε being the machine precision, and ρ×(n, ε) = nε/(1 − nε). In general, ρ×(n, ε) depends on how the coefficients a_k are computed, on how the product a_k (x − c)^k is computed, and on the error in the input value x. Proceeding further, introduce the notations

y_n = b_0 + b_1 + . . . + b_n,
y′_n = b̃_0 + b̃_1 + . . . + b̃_n, (21)
ỹ_n = b̃_0 ⊕ b̃_1 ⊕ . . . ⊕ b̃_n,
where y_n = T_n(x) is the true value, and our goal is to estimate y_n − ỹ_n. Assuming the "naive summation" algorithm, we get the intermediate estimate

∣ỹ_n − y′_n∣ ≤ ((1 + ε)^n − 1)∣b̃_0 + b̃_1∣ + ((1 + ε)^{n−1} − 1)∣b̃_2∣ + . . . + ε∣b̃_n∣
            ≤ ρ+(n, ε) ∑_{k=0}^n (1 + ∣β_k∣)∣b_k∣ ≤ ρ+(n, ε)(1 + ρ×(n, ε)) ∑_{k=0}^n ∣b_k∣, (22)

with ρ+(n, ε) = nε/(1 − nε). In general, ρ+(n, ε) should depend on how the summation in (21) is carried out. Finally, an application of the triangle inequality yields

∣ỹ_n − y_n∣ ≤ ∣ỹ_n − y′_n∣ + ∣y′_n − y_n∣ ≤ ρ+(n, ε)(1 + ρ×(n, ε)) ∑_{k=0}^n ∣b_k∣ + ∑_{k=0}^n ∣β_k∣∣b_k∣
           ≤ (ρ+ + ρ× + ρ+ρ×) ∑_{k=0}^n ∣b_k∣, (23)

with ρ+ = ρ+(n, ε) and ρ× = ρ×(n, ε), and hence

∣ỹ_n − y_n∣/∣y_n∣ ≤ (ρ+ + ρ× + ρ+ρ×) ⋅ (∣b_0∣ + . . . + ∣b_n∣)/∣b_0 + . . . + b_n∣ = (ρ+ + ρ× + ρ+ρ×) κ+(b_0, . . . , b_n), (24)

where κ+(b_0, . . . , b_n) is the condition number of the summation b_0 + . . . + b_n.
Example 6 (exp). With the notations of the preceding paragraph, for the exponential series (4), we have b_k = x^k/k!, which may be computed by using 2k − 2 multiplications and divisions, with the relative error

∣β_k∣ = ∣b̃_k − b_k∣/∣b_k∣ ≤ 4kε ≤ 4nε =∶ ρ×(n, ε), (25)

assuming that k ≤ n ≤ 1/(4ε). From the latter assumption, we infer ρ+(n, ε) ≤ 2nε, and hence

∣ỹ_n − y_n∣/∣y_n∣ ≤ (2nε + 4nε + 8n²ε²)κ+(b_0, . . . , b_n) ≤ 8nε κ+(b_0, . . . , b_n), (26)

where we have used nε ≤ 1/4 once again in the last step. Furthermore, we have

κ+(b_0, . . . , b_n) = e^{∣x∣}/e^x = e^{∣x∣−x} = e^{max{0,−2x}}, (27)

indicating a potentially catastrophic cancellation for x large negative. Thus keeping in mind that e^x for x < 0 can be computed by e^x = 1/e^{∣x∣}, in the following, we assume that x ≥ 0.
Taking into account that y_n ≤ e^x, we get

∣ỹ_n − y_n∣ ≤ 8∣y_n∣nε ≤ 8e^x nε, (28)

and invoking (14), we infer

∣ỹ_n − e^x∣/e^x ≤ ∣ỹ_n − y_n∣/e^x + ∣y_n − e^x∣/e^x ≤ 8nε + x^{n+1}/(n + 1)!. (29)

To simplify it a bit, assuming x ≤ b and n + 1 ≥ m for some constants 0 < b < 1 and m ≥ 2, we can replace (29) by

∣ỹ_n − e^x∣/e^x ≤ 8nε + b^{n+1}/m!. (30)

If ε is given, then even though the second term decays with n, the first term grows unboundedly. The right hand side is minimized when n is such that

b^{n+1}/m! ≈ 8ε/log(1/b). (31)

For example, taking ε = 2^{−52}, b = e^{−2}, and m = 9, we get n ≈ 10, which suggests that 80ε + 4ε = 84ε is the best possible relative error, if we do all computations in double precision. This is consistent with (especially the results for x = 1/16 in) Table 1.
On the other hand, if a target accuracy, say, δ > 0 is specified, then one could choose n so large that x^{n+1}/(n + 1)! ≤ δ/2, and then choose ε > 0 so that 8nε ≤ δ/2. For example, if we want δ = 2^{−52} for x ≤ e^{−1}, then n = 13 would guarantee x^{n+1}/(n + 1)! ≤ δ/2. This implies that all computations must

be performed with relative error ε ≤ δ/(16n) ≈ 2^{−60}, i.e., in order to have the value e^x in double precision, one must compute with 8 guard bits, and sum the first 14 terms of the power series.
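The cancellation predicted by (27) is easy to observe numerically. The following Python sketch (our own illustration, with x = −20 chosen for effect) compares summing the series directly at a negative argument against the reciprocal trick e^x = 1/e^∣x∣:

```python
import math

def exp_taylor(x, n):
    """Partial sum of the exponential series (4) up to degree n."""
    term, total = 1.0, 1.0
    for k in range(1, n + 1):
        term *= x / k
        total += term
    return total

x = -20.0
naive = exp_taylor(x, 120)            # sums terms as large as 20^20/20! ~ 4e7
flipped = 1.0 / exp_taylor(-x, 120)   # e^x = 1/e^|x|: condition number ~ 1
exact = math.exp(x)
err_naive = abs(naive - exact) / exact
err_flipped = abs(flipped - exact) / exact
# err_naive is catastrophically large; err_flipped is near machine precision
```

The naive sum is ruined by the huge condition number κ+ = e^{2∣x∣} of the alternating sum, while the flipped computation is accurate to almost full precision.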
Exercise 4. There is a way to efficiently evaluate polynomials, known as Horner's scheme:

a_n x^n + a_{n−1} x^{n−1} + . . . + a_1 x + a_0 = ((⋯((a_n x + a_{n−1})x + a_{n−2})x⋯)x + a_1)x + a_0. (32)

Since this requires n to be known beforehand, it is not a practical method for power series. However, we can rescue the method by writing

a_0 + a_1 x + . . . + a_{n−1} x^{n−1} + a_n x^n = [((⋯((a_0 y + a_1)y + a_2)y⋯)y + a_{n−1})y + a_n] x^n, (33)

with y = 1/x. Perform a roundoff error analysis for the modified Horner scheme (33).
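The reversed scheme (33) can be sketched as follows; `horner_reversed` is a hypothetical helper name of ours, and the sketch assumes x ≠ 0:

```python
def horner_reversed(coeffs, x):
    """Evaluate a_0 + a_1 x + ... + a_n x^n via the reversed Horner form (33).
    coeffs lists a_0, ..., a_n; assumes x != 0."""
    y = 1.0 / x
    acc = 0.0
    for a in coeffs:                  # accumulate in y = 1/x, a_0 first
        acc = acc * y + a
    return acc * x ** (len(coeffs) - 1)

# 1 + 2x + 3x^2 at x = 2 gives 17
assert horner_reversed([1.0, 2.0, 3.0], 2.0) == 17.0
```

Note that the coefficients are consumed in their natural order a_0, a_1, . . . , which is what makes the scheme usable when terms of a power series are generated one at a time.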

3. The exponential function


Recall the power series

e^x = ∑_{n=0}^∞ x^n/n! = 1 + x + x²/2 + x³/3! + x⁴/4! + . . . (x ∈ R), (34)

for the exponential function. The first thing to notice here is that since e^x ≈ 0 for x ≪ 0, we expect cancellation of digits in this regime, cf. Example 6. On the other hand, from the condition number κ(x) = ∣x∣ of the exponential, we should expect the computations of e^x and of e^{−x} to have roughly the same difficulty. Indeed, this expectation can be realized by taking advantage of the relation

e^{−x} = 1/e^x, (35)

to flip the sign of x.
From now on, we assume that x > 0. If x is large, the terms of the series (34) will grow with
n, until around the point where n! ≈ xn . This is undesirable, as it would inflate the number
of terms needed to sum to achieve a desired accuracy. To deal with the problem, we perform
an argument reduction before utilizing any power series, e.g., by expressing ex in terms of ey
with small y. In this regard, the law of addition comes in handy. Thus let b > 1 be a constant,
and let m ∈ N and y ∈ R be such that
x = y + m log b, (36)
where the idea is of course that we choose y small. For instance, we can ensure 0 ≤ y < log b, or −½ log b < y ≤ ½ log b. Then in light of

e^x = e^{y+m log b} = b^m e^y, (37)

the power series computation can now be done only for e^y. Here, the choice b = e simplifies (36), but we would need to compute the power e^m in (37). On the other hand, the choice b = 2 would lead to the simple power 2^m in (37), but we would need a value of log 2 in (36).
The preceding method may be called an additive argument reduction. We can also approach
the argument reduction problem multiplicatively. Let r = x/n, where n is some large integer,
preferably of the form n = 2^k. Then we have

e^x = e^{nr} = (e^r)^n, (38)

where e^r is to be computed with power series. If n = 2^k, then the power (e^r)^n can be computed by repeated squaring, as in, e.g., (e^r)^8 = (((e^r)²)²)².
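A minimal sketch of this multiplicative reduction in Python (our own illustration; the function name and the defaults k = 8 and 12 series terms are our choices, not prescribed by the notes):

```python
import math

def exp_by_squaring(x, k=8, terms=12):
    """e^x via the reduction (38): r = x/2^k, a short series for e^r,
    then k repeated squarings."""
    r = x / 2.0 ** k
    term, er = 1.0, 1.0               # Taylor polynomial for e^r
    for j in range(1, terms + 1):
        term *= r / j
        er += term
    for _ in range(k):                # (e^r)^(2^k) by repeated squaring
        er *= er
    return er
```

Because r is small, a handful of terms already puts the truncation error far below double precision; the squaring loop is where the roundoff analyzed in Remark 7 accumulates.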
Remark 7. Let us look at the roundoff error. We assume n = 2^k, and that the division r = x/n can be done exactly. Furthermore, suppose that z = e^r is computed with relative precision η > 0, as

z̃ = e^r(1 + δ_0), ∣δ_0∣ ≤ η. (39)

We perform the repeated squaring

ỹ = [. . . [z̃²(1 + δ_1)]²(1 + δ_2) . . .]²(1 + δ_k)
  = e^x (1 + δ_0)^{2^k} (1 + δ_1)^{2^{k−1}} (1 + δ_2)^{2^{k−2}} ⋯ (1 + δ_k), (40)

with the same relative precision: ∣δ_j∣ ≤ η. Noting that

(1 + δ_j)^m ≤ (1 + η)^m ≤ 1 + 2mη, for 2mη ≤ 1, (41)

we infer

∣ỹ − e^x∣/e^x ≤ 2^{k+1}η(1 + 2^{k+1}η) + 2^k η(1 + 2^k η) + . . . + η
            ≤ 2^{k+2}η + 2^{2k+3}η² = 4nη(1 + 2nη) ≤ 8nη. (42)

This means that we lose approximately k correct significant bits, and in particular, if we want, say, ε > 0 as the accuracy of the final computation, then the intermediate computations must be done with the relative precision η = 2^{−k−3}ε.
Remark 8. We can do a bit better if x > 0 is small, by working with E(x) = e^x − 1 instead of e^x. To be more precise, note that

E(x) = x + x²/2 + x³/3! + x⁴/4! + . . . , and E(x) ≤ x/(1 − x), (43)

with the latter being true for 0 ≤ x < 1. This function has the following "doubling formula"

E(2r) = e^{2r} − 1 = (e^r − 1)(e^r + 1) = (e^r − 1)(2 + e^r − 1) = E(r)(2 + E(r)). (44)
As before, we assume that n = 2^k, and that the division r = x/n is done exactly. Suppose that each application of the doubling formula is performed with relative precision η, that is,

Ẽ(2r) = Ẽ(r)(2 + Ẽ(r))(1 + δ), (45)

with ∣δ∣ ≤ η. Writing Ẽ(r) = E(r)(1 + ξ), we have

Ẽ(2r) − E(2r) = E(r)(2[(1 + ξ)(1 + δ) − 1] + E(r)[(1 + ξ)²(1 + δ) − 1]), (46)

which yields

∣Ẽ(2r) − E(2r)∣ ≤ E(r)(2(∣ξ∣ + ∣δ∣) + E(r)(2∣ξ∣ + ∣δ∣)) + O(η²)
              = E(2r)(∣ξ∣ + ∣δ∣) + E(r)²∣ξ∣ + O(η²), (47)

and

∣Ẽ(2r) − E(2r)∣/E(2r) ≤ ∣ξ∣ + η + E(r)∣ξ∣/(2 + E(r)) + O(η²) ≤ (1 + x)∣ξ∣ + η + O(η²), (48)

where we have used the estimate E(r) ≤ 2r ≤ 2x, under the assumption that x ≤ ½. We then apply the latter formula k times, to arrive at

∣Ẽ(x) − E(x)∣/E(x) ≤ η[1 + (1 + x) + (1 + x)² + . . . + (1 + x)^k] + O(η²)
                  ≤ η ⋅ ((1 + x)^{k+1} − 1)/x + O(η²) (49)
                  ≤ (k + 1)η/(1 − (k + 1)x) + O(η²).

This shows that the loss of accuracy is logarithmic in n. Writing Ẽ(x) = E(x)(1 + ξ), and assuming x ≤ (k − 1)/(2(k + 1)) for convenience, we have

∣ξ∣ ≤ kη + O(η²). (50)

Finally, the computation of e^x = 1 + E(x) is

ỹ = 1 ⊕ Ẽ(x) = (1 + Ẽ(x))(1 + δ) = (1 + E(x)(1 + ξ))(1 + δ), (51)

with ∣δ∣ ≤ ε, giving the error estimate

∣ỹ − e^x∣/e^x ≤ ε + ∣ξ∣(1 + ε)E(x)/e^x ≤ ε + ½∣ξ∣(1 + ε) ≤ ε + kη + O(η² + ηε). (52)
Remark 9. In view of the relation

e^x = sinh x + √(1 + sinh² x), (53)

it is possible to replace the exponential series by the series

sinh x = ∑_{k=0}^∞ x^{2k+1}/(2k + 1)! = x + x³/3! + x⁵/5! + x⁷/7! + . . . (x ∈ R). (54)

The latter series converges about twice as fast, because the terms have the "stepsize" x² as opposed to x. However, this reduction can only be used once, meaning that the usual argument reduction must still be performed beforehand.
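A small sketch of this variant (ours; the function name and term count are illustrative, and the argument is assumed to be already reduced):

```python
import math

def exp_via_sinh(x, terms=10):
    """e^x = sinh x + sqrt(1 + sinh^2 x), with sinh x from its series (54);
    meant for an already-reduced (small) argument x."""
    term, s = x, x                    # k = 0 term of (54)
    for k in range(1, terms + 1):
        term *= x * x / ((2 * k) * (2 * k + 1))   # next odd-degree term
        s += term
    return s + math.sqrt(1.0 + s * s)
```

The recurrence multiplies by x²/((2k)(2k + 1)) at each step, reflecting the "stepsize" x² of the series.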
Exercise 5. From the definitions

sinh x = (e^x − e^{−x})/2, and cosh x = (e^x + e^{−x})/2, (55)

derive the power series (54), and

cosh x = ∑_{k=0}^∞ x^{2k}/(2k)! = 1 + x²/2 + x⁴/4! + x⁶/6! + . . . (x ∈ R). (56)

Prove the relation (53).
Exercise 6. Perform a roundoff error analysis of the additive argument reduction (36).

4. Logarithmic functions
For logarithms, the basic power series is

log(1 + x) = ∑_{n=1}^∞ (−1)^{n−1} x^n/n = x − x²/2 + x³/3 − x⁴/4 + . . . (−1 < x ≤ 1), (57)
which was discovered independently by Gerardus Mercator and Isaac Newton around 1667.
The series only converges for −1 < x ≤ 1, and so we need argument reduction to compute log y
for arbitrary y > 0. To this end, let us perform the multiplicative reduction

y = r ⋅ 2^n, so that log y = n log 2 + log r, (58)

where we require n ∈ Z and 1 ≤ r < 2, ensuring r = 1 + x with 0 ≤ x < 1. Note that we can replace the condition 1 ≤ r < 2 by the more general ρ ≤ r < 2ρ, where ρ < 1 but ρ ≈ 1.
Further reduction is possible, by the recipe

log(1 + x) = 2 log √(1 + x) = 2 log [√(1 + x)(1 + √(1 + x))/(1 + √(1 + x))] = 2 log [(√(1 + x) + 1 + x)/(1 + √(1 + x))]
           = 2 log (1 + x/(1 + √(1 + x))) =∶ 2 log(1 + z), (59)

which can be applied repeatedly. Note that

z = x/(1 + √(1 + x)) < x/2 for x > 0, (60)

and also that this form is numerically better behaved than the alternative z = √(1 + x) − 1.

Finally, once the problem is reduced to evaluating log(1 + z) with z small, we can resort to the series

½ log((1 + x)/(1 − x)) = ∑_{n=0}^∞ x^{2n+1}/(2n + 1) = x + x³/3 + x⁵/5 + x⁷/7 + . . . (∣x∣ < 1), (61)

discovered by James Gregory in 1668. Solving (1 + x)/(1 − x) = 1 + z for x gives x = z/(2 + z) ≈ z/2, so not only is the "stepsize" of Gregory's series x² as in Remark 9, but also the argument is about twice as small as the argument of the corresponding Mercator series. However, note that as in Remark 9, this reduction can only be used once.
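Putting the pieces together, here is a Python sketch (our own; the function name, two reductions, and 25 series terms are illustrative choices) that combines the reduction (58), repeated applications of (59), and Gregory's series (61). The required constant log 2 is taken from the library here, standing in for a precomputed value such as (64):

```python
import math

def log_gregory(y, reductions=2, terms=25):
    """log y for y > 0: reduction (58) to 1 <= r < 2, then `reductions`
    applications of (59), then Gregory's series (61)."""
    n = math.frexp(y)[1] - 1          # y = r * 2^n with 1 <= r < 2
    r = y / 2.0 ** n
    x = r - 1.0                       # log r = log(1 + x), 0 <= x < 1
    scale = 1.0
    for _ in range(reductions):       # (59): log(1 + x) = 2 log(1 + z)
        x = x / (1.0 + math.sqrt(1.0 + x))
        scale *= 2.0
    t = x / (2.0 + x)                 # then (1 + t)/(1 - t) = 1 + x
    term, s = t, t                    # Gregory's series in t
    for k in range(1, terms + 1):
        term *= t * t
        s += term / (2 * k + 1)
    return n * math.log(2.0) + scale * 2.0 * s
```

After two reductions the Gregory argument t is below 1/16, so the series converges to double precision in only a few terms.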
Example 10. Note that we need an accurate value of log 2 for the reduction (58).
(a) Mercator's series for log 2 is

log 2 = 1 − 1/2 + 1/3 − 1/4 + . . . , (62)

whose convergence is extremely slow. We can also use

log 2 = −log ½ = 1/2 + 1/(2 ⋅ 2²) + 1/(3 ⋅ 2³) + 1/(4 ⋅ 2⁴) + . . . , (63)

which is much better.
(b) On the other hand, for Gregory's series, 2 = (1 + x)/(1 − x) gives x = 1/3, and hence we have

log 2 = 2/3 + 2/(3 ⋅ 3³) + 2/(5 ⋅ 3⁵) + 2/(7 ⋅ 3⁷) + . . . , (64)

which is way faster.
(c) We do the reduction (59) once, to get

log 2 = 2 log (1 + 1/(1 + √2)), (65)

and compute the latter logarithm by Gregory's series, with

x = 1/(3 + 2√2). (66)

Compared to (b), the magnitude of x has almost been halved.

Table 2. Approximation of log 2 by the methods presented in Example 10. The last row shows the exact value log 2. In the last column, the maximum possible precision under the floating point arithmetic has been attained.

n Mercator (62) Mercator (63) Gregory (64) Gregory (66)


1 1.0000000000000000 0.5000000000000000 0.6913580246913580 0.6930256795263684
2 0.5000000000000000 0.6250000000000000 0.6930041152263374 0.6931446209503476
3 0.8333333333333333 0.6666666666666666 0.6931347573322881 0.6931471218850720
4 0.5833333333333333 0.6822916666666666 0.6931460473908271 0.6931471791455733
5 0.7833333333333332 0.6885416666666666 0.6931470737597851 0.6931471805246939
6 0.6166666666666666 0.6911458333333332 0.6931471702560119 0.6931471805590457
7 0.7595238095238095 0.6922619047619046 0.6931471795482411 0.6931471805599221
8 0.6345238095238095 0.6927501860119046 0.6931471804592440 0.6931471805599448
9 0.7456349206349207 0.6929671999007935 0.6931471805498115 0.6931471805599454
10 0.6456349206349207 0.6930648561507935 0.6931471805589162 0.6931471805599454
∞ 0.6931471805599453 0.6931471805599453 0.6931471805599453 0.6931471805599453

Remark 11 (Briggs' method). Logarithms were independently discovered by John Napier and Jost Bürgi around 1614. The property ℓ(x ⋅ y) = ℓ(x) + ℓ(y) was the main focus of their investigations, and what they called logarithms were instances of

ℓ(x) = α log(x/β), (67)

where α and β are constants. Specifically, Napier's logarithm is set up so that ℓ(10⁷) = 0 and ℓ(10⁷ − 1) = 1, while Bürgi's logarithm satisfies ℓ(10⁸) = 0 and ℓ(10⁸ + 10⁴) = 10. Upon consultation with Napier, Henry Briggs compiled large tables of logarithms around 1620, with the normalization ℓ(1) = 0 and ℓ(10) = 1. This is of course the common logarithm with base 10. Essentially, the method they used is to pick a large number n, and compute the powers

x_k = (1 + 1/n)^k, k = 0, . . . , (68)

after which, they set

ℓ(x_k) = k. (69)
Hence the name logarithm (logos – ratio, arithmos – number). It is easy to see that

k = log x_k / log(1 + 1/n) = n log x_k / log(1 + 1/n)^n, (70)

and that for large n,

k/n = log x_k / log(1 + 1/n)^n ≈ log x_k, (71)

which would be one way to arrive at the natural logarithm. Of course, n does not have to be an integer. Thus, for instance, in order to have ℓ(x_k) = 1 for x_k = 10, we take repeated square roots of 10, to find, e.g., α = 10^{1/N} with N = 2⁵⁴, as Briggs did. Then we use 1 + 1/n = α in (68), which ensures that x_N = 10. Finally, to set ℓ(10) = 1, we do a scaling in (69), so that

x_k = α^k ⟺ log₁₀ x_k = ℓ(x_k) = k/N. (72)
Obviously, computing all powers α^k for k = 1, 2, . . . , N is an impossible task. To find the logarithm of a specific number, say x = 2, we need k such that α^k = x. Taking the N-th root from both sides, we get α^{k/N} = x^{1/N}. The right hand side can be computed by repeated square roots, and since α ≈ 1, the left hand side can be written as

α^{k/N} ≈ 1 + (k/N)(α − 1), (73)

yielding the following formula

log₁₀ x = k/N ≈ (x^{1/N} − 1)/(α − 1) = (x^{1/N} − 1)/(10^{1/N} − 1). (74)
In fact, from our perspective, Briggs' method corresponds to the reduction (59) taken to its extreme: It is equivalent to applying the reduction 54 times, which makes x^{1/N} − 1 very small, and finally invoking the "power series" approximation log x ≈ N(x^{1/N} − 1). The main work involved here is the square root operation performed 54 times. In order to get 14 correct decimals in the final result, Briggs used 32 to 40 decimals in the intermediate calculations. The improvements brought about by Mercator's and Gregory's series were clearly phenomenal, especially when all computations were performed by hand. It is a general observation that before power series, square root extractions were the workhorse of heavy computations.
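Briggs' recipe (74) is easy to try in double precision (the sketch below is ours, not from the notes). The subtraction x^{1/N} − 1 loses digits, which is precisely why Briggs carried 32 to 40 decimals; in standard double precision we therefore cannot push k anywhere near 54, and k = 24 already caps the accuracy at roughly 7–8 correct digits:

```python
import math

def briggs_log10(x, k=24):
    """Approximate log10 x by Briggs' formula (74): k repeated square roots
    of x and of 10 give the N-th roots, N = 2^k, and then
    log10 x ~ (x^(1/N) - 1) / (10^(1/N) - 1)."""
    rx, r10 = x, 10.0
    for _ in range(k):
        rx = math.sqrt(rx)
        r10 = math.sqrt(r10)
    # both rx and r10 are now extremely close to 1, so the subtraction
    # below loses about k bits -- the precision loss Briggs fought with
    # extra guard decimals
    return (rx - 1.0) / (r10 - 1.0)
```

Increasing k shrinks the linearization error of (73) but amplifies the cancellation, so the achievable accuracy in fixed precision is limited, in sharp contrast with the series methods above.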
Exercise 7. (a) By integrating

1/(1 + x) = 1 − x + x² − x³ + . . . , (75)

derive the Mercator series (57).

(b) Expanding each of the logarithms in log(1 + x) − log(1 − x) into Mercator's series, derive the series (61). Show that

arctanh x = ½ log((1 + x)/(1 − x)), (76)

where arctanh x is the inverse function of the hyperbolic tangent

tanh x = sinh x / cosh x = (e^x − e^{−x})/(e^x + e^{−x}). (77)
Exercise 8. Perform a roundoff error analysis of the argument reduction (59) applied k times.
Exercise 9. (a) Suppose that we want to compute log 2 by writing log 2 = log(3/2) + log(4/3), and employing Gregory's series to the resulting two logarithms. Would this method be faster than that described in Example 10(c)?
(b) Suggest fast methods to compute log 5 and log 7, in the spirit of (a).
(c) Suggest several methods to compute log 3, in the spirit of Example 10.

5. Trigonometric functions
Recall the well known Maclaurin series for the sine and cosine functions

sin x = ∑_{k=0}^∞ (−1)^k x^{2k+1}/(2k + 1)! = x − x³/3! + x⁵/5! − x⁷/7! + . . . (x ∈ R),
cos x = ∑_{k=0}^∞ (−1)^k x^{2k}/(2k)! = 1 − x²/2 + x⁴/4! − x⁶/6! + . . . (x ∈ R). (78)
Both series were discovered independently by Isaac Newton and Gottfried Leibniz during the
period 1669–1676. It turns out that this was a rediscovery, in that the series (78) appeared
in the writings of early 16-th century Indian mathematicians, who attributed the discovery
to Madhava of Sangamagramma. Unfortunately, their knowledge was not widespread, and
apparently had no influence on the development of calculus in Europe.
The fundamental properties of these functions are periodicity
sin(x + 2πn) = sin x, cos(x + 2πn) = cos x, n ∈ Z, (79)
symmetry
sin(−x) = − sin x, cos(−x) = cos x, (80)
and the law of addition
sin(x + y) = sin x cos y + cos x sin y,
(81)
cos(x + y) = cos x cos y − sin x sin y.
Their interrelations are summed up in

sin² x + cos² x = 1, and cos x = sin(π/2 − x). (82)

Thus by periodicity and symmetry, the arguments of both functions sin x and cos x can be reduced to the case 0 ≤ x < π/2. This makes the series alternating, which means straightforward remainder estimates. Moreover, the law of addition implies the double angle formulas

sin 2x = 2 sin x cos x, cos 2x = cos² x − sin² x, (83)

which in turn give us the following argument reduction recipes

sin x = 2 sin(x/2) √(1 − sin²(x/2)),
cos x = 2 cos²(x/2) − 1. (84)
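The halving-and-doubling recipe (84) can be sketched as follows (our own Python illustration; the name and the defaults of 3 halvings and 8 series terms are illustrative, and the argument is assumed to satisfy 0 ≤ x < π/2):

```python
import math

def sin_reduced(x, halvings=3, terms=8):
    """sin x for 0 <= x < pi/2: halve the argument, sum the alternating
    series (78), then undo the halvings via (84)."""
    r = x / 2.0 ** halvings
    term, s = r, r
    for k in range(1, terms + 1):
        term *= -r * r / ((2 * k) * (2 * k + 1))  # next term of the sine series
        s += term
    for _ in range(halvings):
        s = 2.0 * s * math.sqrt(1.0 - s * s)      # sin 2t = 2 sin t cos t
    return s
```

Since the series is alternating, the truncation error is bounded by the first omitted term, as in (6).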

Remark 12 (Tangent series). In view of (82), we only need to be able to compute either sin x or cos x, and all the other functions tan x, cot x, sec x, and csc x can be expressed in terms of a single function. However, other series may be of independent interest. For instance, take the tangent series

tan x = ∑_{k=0}^∞ t_{2k+1} x^{2k+1}/(2k + 1)! = t_1 x + t_3 x³/3! + t_5 x⁵/5! + t_7 x⁷/7! + . . . (∣x∣ < π/2), (85)

where t_1 = 1, t_3 = 2, t_5 = 16, t_7 = 272, etc., are positive integers, called the tangent numbers. These numbers are related to the Bernoulli numbers by

B_{2n} = (−1)^{n−1} 2n/(4^{2n} − 2^{2n}) ⋅ t_{2n−1}, n = 1, 2, . . . . (86)

To compute the tangent numbers, note that

t_{2k+1} = (tan x)^{(2k+1)} ∣_{x=0}. (87)
It is immediate from

(tan x)′ = 1 + tan² x, (88)

that (tan x)^{(n)} is a polynomial in tan x, of degree not exceeding n + 1. Then by writing

(tan x)^{(n)} = a_{n,0} + a_{n,1} tan x + . . . + a_{n,n+1} tan^{n+1} x
             = a_{n−1,1}(tan x)′ + . . . + a_{n−1,n}(tan^n x)′
             = a_{n−1,1}(1 + tan² x) + . . . + n a_{n−1,n} tan^{n−1} x (1 + tan² x), (89)

we derive the recurrence relation

a_{n,j} = (j − 1)a_{n−1,j−1} + (j + 1)a_{n−1,j+1}, (90)

with the understanding that a_{n−1,j} = 0 for all j < 0 and j > n. Thus starting with a_{0,j} = δ_{1,j}, we can compute the coefficients a_{n,j}, and then set t_{2k+1} = a_{2k+1,0}. An attractive feature of this algorithm is that all operations are over positive integers. Table 3 illustrates the algorithm.
Table 3. Computation of the tangent numbers, cf. (90).

n = 0: 0 1
n = 1: 1 0 1
n = 2: 0 2 0 2
n = 3: 2 0 8 0 6
n = 4: 0 16 0 40 0 24
n = 5: 16 0 136 0 240 0 120
n = 6: 0 272 0 1232 0 1680 0 720
n = 7: 272 0 272+3⋅1232 0 3⋅1232+5⋅1680 0 5⋅1680+7⋅720 0 7⋅720
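The recurrence (90) is a few lines of integer-only Python (our own sketch; the function name is illustrative):

```python
def tangent_numbers(count):
    """First `count` tangent numbers t_1, t_3, t_5, ... via the integer
    recurrence (90); a[j] holds a_{n,j}."""
    a = [0, 1]                        # row n = 0: a_{0,j} = delta_{1,j}
    result = []
    for n in range(1, 2 * count):
        prev = a
        a = [0] * (n + 2)
        for j in range(n + 2):
            left = (j - 1) * prev[j - 1] if 1 <= j <= len(prev) else 0
            right = (j + 1) * prev[j + 1] if j + 1 < len(prev) else 0
            a[j] = left + right
        if n % 2 == 1:
            result.append(a[0])       # t_n = a_{n,0} for odd n
    return result
```

Running it reproduces the rows of Table 3, including t_1 = 1, t_3 = 2, t_5 = 16, t_7 = 272, with exact integer arithmetic throughout.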

We end this section with a couple of historical remarks, summarizing the computational
methods for trigonometric functions that existed before power series.
Remark 13 (Using square roots only). Trigonometric functions have been around much
longer than logarithms, since the times of ancient Greeks. While the Greeks only used the
chord function
chord x = 2 sin(x/2), (91)
explicitly, 5-th century Indian mathematicians introduced sin x, cos x, arcsin x, and others.
Let us try to compute the exact value of cos x for as many values of x as possible. Thus, the sine series (78), in combination with (82), implies

cos(π/2) = sin 0 = 0, sin(π/2) = cos 0 = 1. (92)

The double angle formula for the cosine

cos 2x = 2 cos² x − 1, (93)

cf. (83), can be solved to yield the angle bisection formula

2 cos x = √(2 + 2 cos 2x), (0 ≤ 2x ≤ π). (94)

Starting with 2x = π/2, this gives

2 cos(π/4) = √2, 2 cos(π/8) = √(2 + √2), 2 cos(π/16) = √(2 + √(2 + √2)), . . . (95)

and then by using the law of addition, we can work with the angles 3π/16, 5π/8, etc.
To generate more angles, we trisect angles by finding cos x from the triple angle formula

cos 3x = 4 cos³ x − 3 cos x. (96)

The special case 3x = π/2 is the cubic equation

(4t² − 3)t = 4t³ − 3t = cos(π/2) = 0, (97)

which gives

cos(π/6) = √3/2, and so sin(π/6) = 1/2. (98)

Repeated bisections then yield

2 cos(π/12) = √(2 + √3), 2 cos(π/24) = √(2 + √(2 + √3)), . . . (99)
On the other hand, π/6 cannot be trisected any further by using square roots, i.e., by using
only straightedge and compass. This impossibility was proved only in the 19-th century,
which means that countless ancient and medieval mathematicians tried to solve the problem
in vain. They were extremely reluctant to accept solutions involving tools more general than
straightedge and compass. In this light, the bisection formula (94) is a recipe to bisect an angle,
and the special cubic equation (97) corresponds to a problem of constructing an equilateral
triangle (or equivalently, a regular hexagon), all with only straightedge and compass.
We can go further, and consider angle quadrisection, quintisection, etc. Quadrisection is
simply repeated bisections, so it would offer nothing new. The next possibility is to consider
the quintuple angle formula

cos 5x = 16 cos⁵ x − 20 cos³ x + 5 cos x. (100)

In general, this is worse than the trisection problem, but the special case 5x = π/2 leads to

16t⁴ − 20t² + 5 = 0, (101)

where we have already factored out the monomial t. This is a biquadratic equation, which can easily be solved to yield

cos(π/10) = √(5 + √5)/(2√2). (102)
Note that the corresponding classical Greek mathematics is a construction of a regular pentagon with straightedge and compass. Again, π/10 cannot be quintisected any further, but it can be trisected once (in fact, we can simply apply the law of addition to π/60 = π/10 − π/12). Thus we get some new angles, such as π/60, π/30, π/20, π/120, π/240, and the repertoire of these angles remained the same for over 2000 years, until Carl Friedrich Gauss discovered that the regular 17-gon can be constructed with straightedge and compass. This fact can be expressed as

16 cos(2π/17) = −1 + √17 + √(34 − 2√17) + 2√(17 + 3√17 − √(170 + 38√17)). (103)

It was Gauss' first major discovery, and was so special to Gauss that he wanted a regular 17-gon to be engraved on his tombstone.
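The nested radicals (95) are easy to check numerically (a quick sketch of ours, not part of the notes): starting from 2 cos(π/4) = √2, each application of the bisection formula (94) produces 2 cos of half the angle.

```python
import math

# Check the nested radicals (95): starting from 2 cos(pi/4) = sqrt(2),
# each step of the bisection formula (94) halves the angle.
val, angle = math.sqrt(2.0), math.pi / 4.0
for _ in range(10):
    assert abs(val - 2.0 * math.cos(angle)) < 1e-12
    val = math.sqrt(2.0 + val)        # (94): 2 cos x = sqrt(2 + 2 cos 2x)
    angle /= 2.0
```

This is, of course, exactly the repeated square root extraction that dominated pre-power-series computation.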

Remark 14 (More general methods). In view of the preceding remark, the classical Greeks could not get an acceptable exact answer for chord(π/360) = 2 sin(π/720). The value chord(π/480) was accessible, and so Claudius Ptolemy invoked the approximation chord x ≈ x for x ≈ 0, to find

chord(π/360) ≈ (4/3) chord(π/480), (104)

and used bisection and the law of addition to compute chord x for x = 2π/720, 3π/720, 4π/720, . . ., resulting in a table with 3 hexagesimal digits of accuracy. From our perspective, what Ptolemy does is an argument reduction that ensures ∣x∣ ≤ π/720, followed by the "power series" approximation sin x ≈ x. Notice the similarity with Briggs' method, cf. Remark 11, both in the repeated use of square root extractions in the argument reduction, and in the simplicity of the final approximation. Ptolemy's method was taken to the extreme by the early renaissance scholars, such as Regiomontanus (1436–1476), who computed the sines for every minute, with 7 decimals, and Rheticus (1514–1574), who produced, e.g., a table of sines for every 10″, with the accuracy of 10 decimals.
On the other hand, around 650, Bhaskara I gave the excellent approximation

cos x = (π² − 4x²)/(π² + x²) + E(x), (105)

which (we now know) satisfies the error bound ∣E(x)∣ ≤ 0.002 for ∣x∣ ≤ π/2. It appears that the continued attempts to improve this formula by the medieval Indian mathematicians culminated in the discovery of power series by Madhava.
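The stated error bound is easy to verify by sampling (a sketch of ours; the sampling grid is an arbitrary choice):

```python
import math

def bhaskara_cos(x):
    """Bhaskara I's rational approximation (105) to cos x."""
    return (math.pi ** 2 - 4.0 * x * x) / (math.pi ** 2 + x * x)

# Sample the error E(x) = bhaskara_cos(x) - cos(x) on [-pi/2, pi/2].
samples = [-math.pi / 2 + i * math.pi / 2000.0 for i in range(2001)]
worst = max(abs(bhaskara_cos(x) - math.cos(x)) for x in samples)
# worst stays below the bound 0.002 quoted in (105)
```

The approximation is exact at x = 0 and x = ±π/2, with the worst error occurring in between.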
Starting with digit-by-digit algorithms for extracting square and cubic roots, medieval Indian, Chinese, and Islamic mathematicians developed iterative algorithms for solving polynomial equations. Thus, Al-Kashi (c. 1420) solved the cubic (96) for cos x, to trisect π/120, and computed cos(π/360) with 9 hexagesimal digits of accuracy. In an apparently independent development, François Viète (c. 1600) discovered iterative procedures to solve polynomial equations
associated to dividing an angle into 3, 5, and 7 equal pieces. Note that the solution of general
cubics, published in 1545 by Gerolamo Cardano, means that angle trisection was possible by
that time, if one allows cube root extractions. Further research in iterative methods eventually
led to the discovery of the Newton-Raphson method by Henry Briggs, Isaac Newton, and
Joseph Raphson. Briggs applied his iterative method to compute trigonometric functions of
very small angles, which served as the basis for his famous trigonometric tables. It appears
that Briggs discovered the Newton-Raphson method well before the birth of either Newton
or Raphson. Finally, we must mention Jost Bürgi, who discovered (c. 1592) an ingenious
iterative method that produces an entire table of sines at once.
Exercise 10. Look for a function f satisfying f′′(x) + f(x) = 0 in the form
f(x) = ∑∞n=0 aₙxⁿ. (106)
(a) Putting f(0) = 0 and f′(0) = 1, arrive at the sine series in (78). Show that the series
converges for all x ∈ R.
(b) Putting f(0) = 1 and f′(0) = 0, arrive at the cosine series in (78). Show that the series
converges for all x ∈ R.
Exercise 11. Perform a roundoff error analysis of the argument reduction (84) applied k
times.
Exercise 12. Come up with an argument reduction formula for tan x, and design an algo-
rithm to compute tan x in the Ptolemy-Briggs style that performs the argument reduction
sufficiently many times before invoking tan x ≈ x.
POWER SERIES 15

6. Inverse trigonometric functions


The basic series for inverse trigonometric functions are the arctangent series
arctan x = ∑∞n=0 (−1)ⁿ x²ⁿ⁺¹/(2n + 1) = x − x³/3 + x⁵/5 − x⁷/7 + . . . (|x| ≤ 1), (107)
which was discovered independently by Madhava in the early 15th century, James Gregory in
1671, and Gottfried Leibniz in 1673, and the arcsine series
arcsin x = x + (1/2)(x³/3) + ((1 ⋅ 3)/(2 ⋅ 4))(x⁵/5) + ((1 ⋅ 3 ⋅ 5)/(2 ⋅ 4 ⋅ 6))(x⁷/7) + . . . (−1 ≤ x ≤ 1), (108)
which was discovered independently by Isaac Newton and Gottfried Leibniz, during the period
1669–1676. The law of addition for the tangent implies the corresponding law
arctan a + arctan b = arctan((a + b)/(1 − ab)) (ab < 1). (109)
Putting a = b in turn gives the doubling formula
arctan x = 2 arctan(x/(1 + √(1 + x²))), (110)
which can be used to reduce the argument of (107). Moreover, we have relations among the
inverse trigonometric functions, such as
arcsin x = arctan(x/√(1 − x²)), arccos x = arctan(√(1 − x²)/x), (111)
meaning that the ability to compute either (107) or (108) would be sufficient.
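As a small illustration of how the reduction (110) combines with a truncated series (107) (this anticipates Exercise 14; the function name and the parameter choices k = 10, 10 terms are ours, not a prescription from the text):

```python
import math

def arctan_reduced(x, k=10, terms=10):
    # Halve the argument k times using (110): arctan x = 2 arctan(x/(1 + sqrt(1 + x^2)))
    for _ in range(k):
        x = x / (1 + math.sqrt(1 + x * x))
    # Sum the first `terms` terms of the arctangent series (107)
    s, power = 0.0, x
    for n in range(terms):
        s += (-1) ** n * power / (2 * n + 1)
        power *= x * x
    # Undo the k halvings
    return (2 ** k) * s

print(arctan_reduced(1.0))  # close to pi/4 = 0.78539816...
```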
Exercise 13. (a) By integrating
1/(1 + x²) = 1 − x² + x⁴ − x⁶ + . . . , (112)
derive the arctangent series (107).
(b) By expanding (1 − x²)^(−1/2) into a binomial series, cf. Exercise 2, and integrating term by
term, derive the arcsine series (108).
(c) Derive the sine series (78), by inverting (108).
Exercise 14. Perform a roundoff error analysis of the argument reduction (110) applied k
times. Design an algorithm to compute arctan x in the Ptolemy-Briggs style that carries out
the reduction (110) sufficiently many times before invoking arctan x ≈ x.

7. Computation of π
At the risk of deviating a bit from the main subject, as an interesting case study, we include
here a brief historical review of methods to compute the digits of π, up to and including the
point when power series dominated the field.
The first rigorous method to compute π is due to Archimedes of Syracuse, c. 250 BC.
He approximated the circumference of a circle by using inscribed and circumscribed regular
polygons with 6, 12, 24, 48, and finally 96 sides, to conclude
3 10/71 < π < 3 1/7, (113)
where we note that 3 10/71 = 3.1408 . . . and 3 1/7 = 3.1428 . . .. To add a bit more detail, it is easy to
see from Figure 2 that the perimeter of the regular n-gon inscribed in the unit circle is 2n sin(π/n),
and the perimeter of the regular n-gon circumscribed about the unit circle is 2n tan(π/n), yielding
n sin(π/n) < π < n tan(π/n). (114)
[Figure 2. Archimedean bounds on π. (a) pₙ = 2n sin(π/n) and p₂ₙ = pₙ/cos(π/2n). (b) p′ₙ = 2n tan(π/n) = pₙ/cos(π/n).]

From our perspective, this is immediate because sin x < x < tan x for 0 < x < π/2. To compute
sin(π/96), Archimedes used the recurrence formula displayed in Figure 2(a), or equivalently
sin α = sin 2α/(2 cos α) = sin 4α/(2² cos α cos 2α) = . . . = sin 16α/(2⁴ cos α cos 2α cos 4α cos 8α), (115)
with α = π/96, in combination with the angle bisection technique (94) for the cosines, cf. the
values (99). Note that sin 16α = sin(π/6) is available. In the end, we get bounds such as
1 < π/3 < 2/√3 = 1.1547 . . .
1 < (π/3) ⋅ (√(2 + √3)/2) < 2/√(2 + √3) = 1.0352 . . . (116)
1 < (π/3) ⋅ (√(2 + √3)/2) ⋅ (√(2 + √(2 + √3))/2) < 2/√(2 + √(2 + √3)) = 1.0086 . . . .
This algorithm leads to the infinite product expansion
3/π = (√(2 + √3)/2) ⋅ (√(2 + √(2 + √3))/2) ⋅ (√(2 + √(2 + √(2 + √3)))/2)⋯, (117)
although Archimedes would have been extremely reluctant to consider such things.
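The whole computation uses nothing beyond square roots, and is easy to replay in floating point. A sketch (function name ours), starting from the hexagon, where each pass applies the bisection (94) and the recurrence (115):

```python
import math

def archimedes_bounds(doublings):
    # Start with the hexagon: n = 6, cos(pi/6) = sqrt(3)/2, sin(pi/6) = 1/2
    n, c, s = 6, math.sqrt(3) / 2, 0.5
    for _ in range(doublings):
        # Bisection: cos(a/2) = sqrt((1 + cos a)/2), then sin(a/2) = sin a / (2 cos(a/2))
        c_half = math.sqrt((1 + c) / 2)
        s = s / (2 * c_half)
        c = c_half
        n *= 2
    # Lower bound n sin(pi/n), upper bound n tan(pi/n), cf. (114)
    return n * s, n * s / c

low, high = archimedes_bounds(4)   # the 96-gon, as Archimedes used
print(low, high)                   # brackets pi between roughly 3.14103 and 3.14271
```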
Until the invention of calculus, the Archimedes algorithm and its variations were the only
methods available for computing the digits of π. Thus Ptolemy used his accurate trigonometric
table to compute the perimeter of the inscribed 360-gon, and deduced
π ≈ 3 17/120 = 3.14166 . . . . (118)
Around 250, independently of Archimedes, Liu Hui developed an algorithm based on the areas
of inscribed and circumscribed polygons, and replaced (114) by the slightly improved version
n sin(π/n) < π < n sin(π/n) + dₙ, where dₙ = n sin(π/n) − (n/2) sin(2π/n). (119)

Note that
2n sin(π/2n) = n sin(π/n) + d₂ₙ, 4n sin(π/4n) = n sin(π/n) + d₂ₙ + d₄ₙ, (120)
etc. He then observed
d₂ₙ ≈ (1/4)dₙ, (121)
so that the right hand side of
π = n sin(π/n) + d₂ₙ + d₄ₙ + . . . ≈ n sin(π/n) + (1 + 1/4 + (1/4)² + . . .)d₂ₙ = n sin(π/n) + (4/3)d₂ₙ, (122)
would be more accurate than the simple update 2n sin(π/2n) = n sin(π/n) + d₂ₙ. Liu Hui tried his
accelerated method on a 192-gon, which gave the same accuracy as that of the non-accelerated
method on a 3072-gon. This is a precursor to the modern acceleration techniques.
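In modern notation Liu Hui's correction (122) is one extra line on top of the inscribed perimeter computation. A sketch (helper name ours), reproducing the 192-gon experiment:

```python
import math

def inscribed_perimeter(doublings):
    # n sin(pi/n) via square roots only, starting from the hexagon
    n, c, s = 6, math.sqrt(3) / 2, 0.5
    for _ in range(doublings):
        c_new = math.sqrt((1 + c) / 2)
        s, c, n = s / (2 * c_new), c_new, 2 * n
    return n * s

# Liu Hui: pi ~ n sin(pi/n) + (4/3) d_{2n}, cf. (122)
k = 5                                  # 6 * 2^5 = 192-gon for the finer perimeter
p_n  = inscribed_perimeter(k - 1)      # n sin(pi/n), the 96-gon
p_2n = inscribed_perimeter(k)          # 2n sin(pi/2n), the 192-gon
accelerated = p_2n + (p_2n - p_n) / 3  # = n sin(pi/n) + (4/3) d_{2n}
print(accelerated)
```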
In 1579, François Viète derived the formula
2/π = (√2/2) ⋅ (√(2 + √2)/2) ⋅ (√(2 + √(2 + √2))/2) ⋅ (√(2 + √(2 + √(2 + √2)))/2)⋯, (123)
by modifying the Archimedes algorithm to start with a square, instead of a hexagon, cf. (117).
This beautiful formula is considered to be the dawn of modern mathematics, as it was the
first ever explicit occurrence of an infinite process in mathematics.
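Formula (123) is also pleasant to evaluate numerically: each factor is obtained from the previous one by the update t ← √(2 + t). A sketch (function name ours):

```python
import math

def viete_pi(terms):
    # Evaluate the partial products of (123): 2/pi = prod sqrt(2 + ...)/2
    t, product = 0.0, 1.0
    for _ in range(terms):
        t = math.sqrt(2 + t)   # sqrt(2), sqrt(2 + sqrt(2)), ...
        product *= t / 2
    return 2 / product         # product -> 2/pi, so invert

print(viete_pi(30))  # agrees with pi to near machine precision
```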

Table 4. Approximation of π by Archimedean algorithms.

Date Name Number of sides Decimal places
250 BC Archimedes 96 3
150 Ptolemy 360 3-4
250 Liu Hui 6 ⋅ 2⁹ 5
480 Zu Chongzhi 6 ⋅ 2¹²? 7
499 Aryabhata 384? 4
1424 Al-Kashi 6 ⋅ 2²⁷ 14
1579 Viète 6 ⋅ 2¹⁶ 9
1593 van Roomen 2³⁰ 15
1596 van Ceulen 15 ⋅ 2³¹ 20
1615 van Ceulen 2⁶² 33
1621 Snell 2³⁰ 35
1630 Grienberger 2⁴⁰ 38

Table 4 lists some of the notable progress achieved with the help of Archimedean algorithms.
The last and most impressive of the more straightforward computations were done by Ludolph
van Ceulen, when he used a polygon of 2⁶² sides, to derive
π = 3.14159265358979323846264338327950288 . . . . (124)
In order to explain how the final accuracy depends on the number of sides of the polygon
used, we note that
n sin(π/n) = π − π³/(6n²) + O(1/n⁴), n tan(π/n) = π + π³/(3n²) + O(1/n⁴). (125)
Neglecting the higher order terms, we see that doubling n would reduce the error roughly 4
times, i.e., going from n to 2n would add log10 4 ≈ 0.6 correct decimal digits. For instance,
from van Roomen's computation (n = 2³⁰) to van Ceulen's computation (n = 2⁶²), there are
32 doublings of n, which nicely explains why van Ceulen has 18 ≈ 0.6 ⋅ 32 additional correct
decimal digits.

From Table 4, we see that Willebrord Snell had a success comparable to van Ceulen's
by using a polygon with “only” 2³⁰ sides. This is because he observed that the particular
combination (2/3) n sin(π/n) + (1/3) n tan(π/n) of the perimeters of the inscribed and circumscribed polygons
converges faster than either of the perimeters. A mathematical explanation was given by
Christiaan Huygens in 1654. From our perspective, the expansions (125) immediately yield
(2/3) n sin(π/n) + (1/3) n tan(π/n) = π + O(1/n⁴). (126)
Hence Snell's method converges twice as fast, in the sense that doubling n adds approximately
1.2 correct decimal digits. For instance, Snell was able to squeeze out 7 correct decimal digits
from Archimedes' 96 sided polygon. It is also easy to see that Liu Hui's accelerated formula
(122) has the same quality:
n sin(π/n) + (4/3)d₂ₙ = π + O(1/n⁴), because d₂ₙ = π³/(8n²) + O(1/n⁴). (127)
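Snell's weighted combination drops straight into the Archimedean recurrence. A sketch (function name ours), confirming the claim about the 96-gon:

```python
import math

def snell_estimate(doublings):
    # (2/3) n sin(pi/n) + (1/3) n tan(pi/n), cf. (126), via the hexagon recurrence
    n, c, s = 6, math.sqrt(3) / 2, 0.5
    for _ in range(doublings):
        c_new = math.sqrt((1 + c) / 2)
        s, c, n = s / (2 * c_new), c_new, 2 * n
    return (2 * n * s + n * s / c) / 3

print(snell_estimate(4))  # the 96-gon already gives about 7 correct decimals
```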

Table 5. Performance of Archimedean algorithms. Here n designates the
number of doublings, i.e., n = 1 corresponds to a hexagon or a square, and
n = 2 corresponds to a dodecagon or an octagon, etc.

n Archimedes (117) Liu Hui (127) Viète (123) Snell (126)


1 3.0000000000000000 3.0000000000000000 2.8284271247461903 3.0000000000000000
2 3.1058285412302489 3.1411047216403318 3.0614674589207183 3.1423491305446567
3 3.1326286132812382 3.1415619706315674 3.1214451522580529 3.1416390562199918
4 3.1393502030468672 3.1415907329687442 3.1365484905459398 3.1415955404083902
5 3.1410319508905098 3.1415925335050572 3.1403311569547530 3.1415928338087959
6 3.1414524722854620 3.1415926460837791 3.1412772509327729 3.1415926648502488
7 3.1415576079118575 3.1415926531206555 3.1415138011443009 3.1415926542935209
8 3.1415838921483181 3.1415926535604717 3.1415729403670913 3.1415926536337748
9 3.1415904632280500 3.1415926535879608 3.1415877252771600 3.1415926535925420
10 3.1415921059992713 3.1415926535896785 3.1415914215112002 3.1415926535899645
11 3.1415925166921572 3.1415926535897856 3.1415923455701176 3.1415926535898038

Just before the advent of power series, John Wallis discovered the noteworthy formula
2/π = (1/2) ⋅ (3/2) ⋅ (3/4) ⋅ (5/4) ⋅ (5/6) ⋅ (7/6) ⋅ (7/8) ⋅ (9/8)⋯, (128)
which was modified by William Brouncker into the continued fraction
4/π = 1 + 1/(2 + 9/(2 + 25/(2 + 49/(2 + ⋯)))). (129)
Both formulas were published in 1655, and π was computed to a few decimal places. However,
these formulas converge too slowly to have any practical utility.
We now turn to power series. The arctangent series (107) gives the nice looking formula
π/4 = arctan 1 = 1 − 1/3 + 1/5 − 1/7 + . . . , (130)

but its convergence is again incredibly slow. For instance, to get an accuracy of 9 decimal
digits, one needs to sum a billion terms. A better option is to use the relation tan(π/6) = 1/√3,
which yields
π/6 = (1/√3)(1 − (1/3) ⋅ (1/3) + (1/5) ⋅ (1/3²) − (1/7) ⋅ (1/3³) + . . .). (131)
This was in fact used by Madhava to reach 11 decimal digits of accuracy.
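Summing (131) is straightforward; the sketch below (function name ours) reproduces roughly Madhava's 11-digit accuracy with 21 terms:

```python
import math

def madhava_sharp(terms):
    # pi/6 = (1/sqrt(3)) * sum_{n>=0} (-1)^n / ((2n+1) 3^n), cf. (131)
    s, sign, p = 0.0, 1.0, 1.0
    for n in range(terms):
        s += sign / ((2 * n + 1) * p)
        sign, p = -sign, p * 3.0
    return 6 * s / math.sqrt(3)

print(madhava_sharp(21))  # about 11 correct decimals, as in Madhava's computation
```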
In 1665, Isaac Newton used the expansion
π/24 − √3/32 = ∫₀^(1/4) √(x(1 − x)) dx
= 1/12 − 1/(2⁵ ⋅ 5) − 1/(2⁹ ⋅ 7) − 1/(2¹² ⋅ 9) − . . . − (2n − 3)!!/(2^(3n+2) n!(2n + 3)) − . . . , (132)
to compute π to 16 decimals.
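The terms of (132) shrink roughly by a factor of 4 each step, so a few dozen terms saturate double precision. A sketch (function name ours):

```python
import math

def newton_pi(terms):
    # pi/24 - sqrt(3)/32 = 1/12 - sum_{n>=1} (2n-3)!! / (2^(3n+2) n! (2n+3)), cf. (132)
    s = 1.0 / 12.0
    double_fact = 1.0    # (2n-3)!!, with the convention (-1)!! = 1
    fact = 1.0           # n!
    for n in range(1, terms):
        fact *= n
        if n >= 2:
            double_fact *= 2 * n - 3
        s -= double_fact / (2 ** (3 * n + 2) * fact * (2 * n + 3))
    # Solve pi/24 - sqrt(3)/32 = s for pi
    return 24 * (s + math.sqrt(3) / 32)

print(newton_pi(30))  # converges quickly; Newton obtained 16 decimals this way
```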
Then van Ceulen’s record was finally broken by Abraham Sharp in 1699, when he computed
71 decimal digits of π. He used the
√ series (131), as well as its variations based on other known
values of tan x, such as tan π8 = 2 − 1.
The next advance came in 1706, when John Machin used the formula
π/4 = 4 arctan(1/5) − arctan(1/239), (133)
to cross the 100 decimals mark. This formula is remarkable, because the powers of 1/5 are
easily computed in base 10, and the series for arctan(1/239) converges very rapidly. Since then,
a wealth of similar formulas has been discovered. For example, we have
π/4 = 5 arctan(1/7) + 2 arctan(3/79), (134)
due to Jurij Vega and Leonhard Euler (c. 1780), and
π/4 = 12 arctan(1/18) + 8 arctan(1/57) − 5 arctan(1/239), (135)
due to Carl Friedrich Gauss (1863). Table 6 illustrates some of the aforementioned power
series methods for computing π in action.
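Machin-like formulas are well suited to integer (fixed-point) arithmetic, which is essentially how the long hand computations were organized. A sketch of (133) with Python integers (function names and the choice of 10 guard digits are ours):

```python
def arctan_inv(k, scale):
    # scale * arctan(1/k), summing the series (107) in integer arithmetic
    total = 0
    power = scale // k          # scale * (1/k)^(2n+1)
    n = 0
    while power:
        term = power // (2 * n + 1)
        total += term if n % 2 == 0 else -term
        power //= k * k
        n += 1
    return total

def machin_pi(digits):
    # pi = 16 arctan(1/5) - 4 arctan(1/239), cf. (133)
    scale = 10 ** (digits + 10)  # 10 guard digits absorb the truncation errors
    pi_scaled = 16 * arctan_inv(5, scale) - 4 * arctan_inv(239, scale)
    return pi_scaled // 10 ** 10  # integer 314159..., `digits` places after the 3

print(machin_pi(35))  # 314159265358979323846264338327950288, cf. (124)
```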

Table 6. Performance of power series formulas for π. Note that Gauss’ for-
mula saturates the machine arithmetic beyond 6 terms.

n Madhava-Sharp (131) Newton (132) Machin (133) Gauss (135)


1 3.4641016151377544 3.2990381056766580 3.1832635983263602 3.1443881670703955
2 3.0792014356780038 3.1490381056766577 3.1405970293260603 3.1415875736078829
3 3.1561814715699539 3.1423416771052288 3.1416210293250346 3.1415926647657662
4 3.1378528915956800 3.1416906354385628 3.1415917721821773 3.1415926535629728
5 3.1426047456630841 3.1416074056800398 3.1415926824043994 3.1415926535898602
6 3.1413087854628827 3.1415950812734890 3.1415926526153086 3.1415926535897927
7 3.1416743126988367 3.1415930785574249 3.1415926536235550 3.1415926535897927
8 3.1415687159417836 3.1415927314480228 3.1415926535886025 3.1415926535897927
9 3.1415997738115049 3.1415926683631725 3.1415926535898362 3.1415926535897927
10 3.1415905109380793 3.1415926564721790 3.1415926535897922 3.1415926535897927
11 3.1415933045030808 3.1415926541650681 3.1415926535897940 3.1415926535897927

In Table 7, we list some of the historic computations. Perhaps the last great manual com-
putation of π was that of William Shanks. He went back to Machin's formula (133), and
computed 707 decimal digits, but it was later discovered that “only” 527 of them were cor-
rect. In the early days of the electronic computer era, it must have been difficult to resist
the temptation to see how many more digits of π the newly invented machines could produce.
In fact, computation of π became a standard test for new computers. As for the algorithms,
almost all computations up to the late 1970’s were done by employing Machin-like formulas.
For instance, the 100,000 decimals mark was first crossed by Daniel Shanks and John Wrench,
who used the Gauss formula (135), in combination with
π/4 = 6 arctan(1/8) + 2 arctan(1/57) + arctan(1/239), (136)
due to Carl Størmer (1896).

Table 7. Approximation of π by power series.

Date Name Series Number of terms Decimal places


1400 Madhava (131) 21 11
1665 Isaac Newton (132) - 16
1699 Abraham Sharp (131) - 71
1706 John Machin (133) - 100
1789 Jurij Vega (134) - 136
1873 William Shanks (133) 510 707 (527)
1961 Daniel Shanks team (135)-(136) - 100,000
2002 Yasumasa Kanada team (137)-(138) - 1.24 trillion

From the late 1970’s, more sophisticated algorithms, based on ideas such as Ramanujan se-
ries and arithmetic-geometric mean iterations dominated the scene, but power series methods
still remain competitive. This is evidenced by the record computation of Yasumasa Kanada
and his team, performed in 2002 to find 1.24 trillion decimal digits of π. They used the
Machin-like formulas
π/4 = 44 arctan(1/57) + 7 arctan(1/239) − 12 arctan(1/682) + 24 arctan(1/12943), (137)
due to Størmer (1896), and
π/4 = 12 arctan(1/49) + 32 arctan(1/57) − 5 arctan(1/239) + 12 arctan(1/110443), (138)
due to Kikuo Takano (1982).
Exercise 15. Prove Liu Hui's upper bound (119) by showing that
x < 2 sin x − (sin 2x)/2 for 0 < x < π/2. (139)
To compare this with Archimedes' upper bound (114), show also that
2 sin x − (sin 2x)/2 < tan x for 0 < x < π/2. (140)
Exercise 16. For 0 ≤ a ≤ 1, show that
∫₀ᵃ √(x(1 − x)) dx = (1/8)(π/2 − arcsin b − b√(1 − b²)), where b = 1 − 2a. (141)
Then expanding the factor √(1 − x) in the integrand by the binomial theorem (18), termwise
integrating the resulting series, and finally putting a = 1/4, prove Newton's formula (132).
