
First: the integral is defined to be the (net signed) area under the curve.

The definition in terms of Riemann sums is precisely designed to accomplish this. The integral is a limit, a number. There is, a priori, no connection whatsoever with derivatives. (That is one of the things that makes the Fundamental Theorems of Calculus such potentially surprising results.)

Why does the limit of the Riemann sums actually give the area under the graph? The idea of approximating a shape whose
area we don't know both from "above" and from "below" with areas we do know goes all the way back to the Greeks.
Archimedes gave bounds for the value of $\pi$ by figuring out areas of inscribed and circumscribed polygons in a circle,
knowing that the area of the circle would be somewhere between the two; the more sides to the polygons, the closer the
inner and outer polygons are to the circle, the closer the areas are to the area of the circle.

The way Riemann tried to formalize this was with the "upper" and "lower" Riemann sums: assuming the function is relatively "nice", so that on each subinterval it has a maximum and a minimum, the "lower Riemann sum" is obtained by taking the largest "rectangle" that will lie completely under the graph (using the minimum value of the function on the subinterval as the height), and the "upper Riemann sum" is obtained by taking the smallest rectangle under which the graph will lie completely (using the maximum value of the function as the height). Certainly, the exact area under the graph on that subinterval will be somewhere between the two. If we let $\underline{S}(f,P)$ be the lower sum corresponding to some fixed partition $P$ of the interval, and $\overline{S}(f,P)$ be the upper sum, we will have that

$$\underline{S}(f,P) \leq \int_a^b f(x)\,dx \leq \overline{S}(f,P).$$

(Remember that $\int_a^b f(x)\,dx$ is just the symbol we use to denote the exact (net signed) area under the graph of $f(x)$ between $a$ and $b$, whatever that quantity may be.)
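
To make this concrete, here is a small numerical sketch (in Python, with made-up helper names; not part of Riemann's argument) that computes lower and upper sums for $f(x)=x^2$ on $[0,1]$. Since $x^2$ is increasing there, the minimum on each subinterval sits at the left endpoint and the maximum at the right, which keeps the code short.

```python
# Lower and upper Riemann sums for an *increasing* function on [a, b]:
# the minimum on each subinterval is at its left endpoint, the maximum
# at its right endpoint.

def lower_upper_sums(f, a, b, n):
    """Return (lower sum, upper sum) for f on [a, b] using n equal parts."""
    width = (b - a) / n
    lower = sum(f(a + i * width) * width for i in range(n))        # min heights
    upper = sum(f(a + (i + 1) * width) * width for i in range(n))  # max heights
    return lower, upper

for n in [2, 4, 8, 16, 1024]:
    lo, hi = lower_upper_sums(lambda x: x * x, 0.0, 1.0, n)
    print(n, lo, hi)
```

The lower sums creep up and the upper sums creep down, both toward $1/3$, the exact area under $y=x^2$ on $[0,1]$, exactly as the inequality above demands.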
Also, intuitively, the more intervals we take, the closer these two approximations (one from below and one from above) will
be. This does not always work out if all we do is take "more" intervals. But one thing we can show is that if $P'$ is a
refinement of $P$ (it includes all the dividing points that $P$ had, and possibly more points) then

$$\underline{S}(f,P) \leq \underline{S}(f,P') \quad\text{ and }\quad \overline{S}(f,P') \leq \overline{S}(f,P),$$


so at least the approximations are heading in the right direction. To see why this happens, suppose you split one of the subintervals $[t_i,t_{i+1}]$ in two: $[t_i,t]$ and $[t,t_{i+1}]$. The minimum of $f$ on $[t_i,t]$ and the minimum on $[t,t_{i+1}]$ are each greater than or equal to the minimum over the whole of $[t_i,t_{i+1}]$, but it may be that the minimum on one of the two pieces is actually strictly larger than the minimum over $[t_i,t_{i+1}]$. The areas we get after the split can be no smaller than, but may be larger than, the ones we had before the split. Similarly for the upper sums.

So, let's consider one particular sequence of partitions: divide the interval into 2 equal parts; then into 4; then into 8; then into 16; then into 32; and so on; then into $2^n$, etc. If $P_n$ is the partition that divides $[a,b]$ into $2^n$ equal parts, then $P_{n+1}$ is a refinement of $P_n$, and so we have:

$$\underline{S}(f,P_1) \leq \underline{S}(f,P_2) \leq \cdots \leq \underline{S}(f,P_n) \leq \cdots \leq \int_a^b f(x)\,dx \leq \cdots \leq \overline{S}(f,P_n) \leq \cdots \leq \overline{S}(f,P_2) \leq \overline{S}(f,P_1).$$

Now, the sequence of numbers $\underline{S}(f,P_1) \leq \underline{S}(f,P_2) \leq \cdots \leq \underline{S}(f,P_n) \leq \cdots$ is increasing and bounded above (by the area). So the numbers have a supremum; call it $\underline{S}$. This number is no more than $\int_a^b f(x)\,dx$. And the numbers $\overline{S}(f,P_1) \geq \overline{S}(f,P_2) \geq \cdots \geq \overline{S}(f,P_n) \geq \cdots$ are decreasing and bounded below, so they have an infimum; call it $\overline{S}$; again, it is no less than $\int_a^b f(x)\,dx$. So we have:

$$\lim_{n\to\infty}\underline{S}(f,P_n) = \underline{S} \leq \int_a^b f(x)\,dx \leq \overline{S} = \lim_{n\to\infty}\overline{S}(f,P_n).$$

What if we are lucky? What if we actually have $\underline{S} = \overline{S}$? Then it must be the case that this common value is the value of $\int_a^b f(x)\,dx$. It just doesn't have a choice! It's definitely trapped between the two, and if there is no space between them, then it's equal to them.

Riemann proved several things:

1. If $f$ is "nice enough", then you will necessarily get that $\underline{S} = \overline{S}$. In particular, continuous functions happen to be "nice enough", so it will definitely work for them (in fact, continuous functions turn out to be "very nice", not just "nice enough").

2. If $f$ is "nice enough", then you don't have to use the partitions we used above. You can use any sequence of partitions, so long as the "mesh size" (the size of the largest subinterval in the partition) gets smaller and smaller, with limit $0$ as $n\to\infty$; if it works for the "divide-into-$2^n$-equal-intervals" partitions, then it works for any sequence of partitions whose mesh size goes to zero.

So, for example, we can take $P_n$ to be the partition that divides $[a,b]$ into $n$ equal parts, even though $P_{n+1}$ is not a refinement of $P_n$ in this case.

In fact, you don't have to use $\underline{S}(f,P)$ and $\overline{S}(f,P)$. For the partition $P$, just pick any rectangle that has as its height any value of the function in the subinterval (that is, pick an arbitrary $x_i$ in the subinterval $[t_i,t_{i+1}]$, and use $f(x_i)$ as the height). Call the resulting sum $S(f,P,x_1,\ldots,x_n)$. Then you have

$$\underline{S}(f,P) \leq S(f,P,x_1,\ldots,x_n) \leq \overline{S}(f,P),$$

because $\underline{S}(f,P)$ is computed using the smallest possible values of $f$ throughout, and $\overline{S}(f,P)$ is computed using the largest possible values of $f$ throughout. But since we already know, from 1 and 2 above, that $\underline{S}(f,P)$ and $\overline{S}(f,P)$ have the same limit, the sums $S(f,P,x_1,\ldots,x_n)$ also get squeezed and must have that same limit, which equals the integral.
In particular, we can always take the left endpoint (and get a "Left Hand Sum"), or we can always take the right endpoint (and get a "Right Hand Sum"), and we will nevertheless get the same limit.
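
As a quick sanity check of this point (a sketch, assuming a continuous integrand; the helper names are made up), the following compares left-hand, right-hand, and midpoint sums for $\int_0^\pi \sin(x)\,dx = 2$: the choice of sample point changes each individual sum, but not the limit.

```python
# Riemann sums with different choices of the sample point x_i in each
# subinterval; `pick` selects x_i from the subinterval [left, right].
import math

def riemann_sum(f, a, b, n, pick):
    """Sum f(x_i) * width over n equal subintervals of [a, b]."""
    width = (b - a) / n
    return sum(f(pick(a + i * width, a + (i + 1) * width)) * width
               for i in range(n))

for n in [10, 100, 1000]:
    left  = riemann_sum(math.sin, 0, math.pi, n, lambda l, r: l)            # Left Hand Sum
    right = riemann_sum(math.sin, 0, math.pi, n, lambda l, r: r)            # Right Hand Sum
    mid   = riemann_sum(math.sin, 0, math.pi, n, lambda l, r: (l + r) / 2)  # midpoint
    print(n, left, right, mid)  # all three approach 2
```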

So in summary: you can pick any sequence of partitions, whichever happens to be convenient, so long as the mesh size goes to $0$; you can pick any points in the subintervals (say, ones which make the calculations simpler) at each stage; and so long as the function is "nice enough" (for example, if it is continuous), everything will work out, and the limit will be the number which must be the value of the area (because it was trapped between the lower and upper sums, and they both got squeezed together, trapping the limit and the integral between them).

Now, (1) and (2) above are the hardest part of what Riemann did. Don't be surprised if it sounds a bit magical at this point. But I hope you agree that if the lower and upper sums for the special partitions have the same limit, then that limit must be the area that lies under the graph.

Thanks to that work of Riemann, (at least for continuous functions) we can define $\int_a^b f(x)\,dx$ to be the limit of, say, the left hand sums of the partitions we get by dividing $[a,b]$ into $n$ equal parts: these partitions have mesh size going to $0$, we can pick any points we like (say, the left endpoints), and we know the limit is going to be that common value of $\underline{S}$ and $\overline{S}$, which has to be the area. So, under this definition, $\int_a^b f(x)\,dx$ really is the net signed area under the graph of $f(x)$. It just doesn't have a choice but to be that, when $f$ is "nice enough".

Second: the area does not turn into "the" antiderivative. What happens is that it turns out (perhaps somewhat magically) that the area can be computed using an antiderivative. I'll go into more detail below.

As to how Newton figured this out: his teacher, Isaac Barrow, was the one who discovered there was a connection between tangents and areas; some of the basic ideas were his. They came from studying some simple functions and some simple formulas for tangents he had discovered.

For example, the tangents to the parabola $y=x^2$ were interesting (there was general geometric interest in tangents and in "squaring" regions, also known as finding the "quadrature" of a region, that is, finding a way to construct a square or rectangle that has the same area as the region you are considering), and they led to associating the parabola $y=x^2$ with lines of the form $y=2x$. It does not take too much experimentation to realize that if you look at the area under $y=2x$ from $0$ to $a$, you end up with $a^2$, establishing a connection. Barrow did this with infinitesimal arguments (which were a bit fuzzy, and not set on an entirely correct and solid logical foundation until well into the 20th century), which were generally laborious, and only for some curves. When Newton extended Barrow's methods to more general curves and tangents, he also extended the discovery of the connection with areas, and was able to prove what is essentially the Fundamental Theorem of Calculus.
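
The computation behind that observation is just the area of a triangle: the region under $y=2x$ from $0$ to $a$ has base $a$ and height $2a$, so
$$\int_0^a 2x\,dx = \frac{1}{2}\cdot a\cdot 2a = a^2,$$
and $x^2$ is exactly the curve whose tangent at $x=a$ has slope $2a$.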

Now, here is one way to approach the connection. We want to figure out the value of, say,

$$\int_0^a f(x)\,dx$$

for some $a$. This can be done using limits and Riemann sums (Newton and Leibniz had similar methods, though not set up quite as precisely as Riemann sums are). But here is an absolutely crazy suggestion: suppose you can find a "master function" $M$ which, when given any point $b$ between $0$ and $a$, will give you the value of $\int_0^b f(x)\,dx$. If you have such a master function, then you can use it to find the value of the integral you want just by taking $M(a)$!

In fact, this is the approach Barrow had taken: his great insight was that instead of trying to find the quadrature of one particular area, he would try to solve the problem of squaring several different (but related) areas at the same time. So he was looking for, for instance, a "master function" for the region that is like a triangle except that the top is a parabola instead of a line (like the region from $0$ to $a$ under $y=x^2$), and so on.

On its face, this is a completely ludicrous suggestion. It's like telling someone who is trying to figure out how to get from building A to building B that if he only memorizes the map of the entire city first, then he can use that knowledge to figure out how to get from A to B. If we are having trouble finding the integral $\int_0^a f(x)\,dx$, then the "master function" seems to require us to find not just that area, but also all the areas in between! It's like telling someone who is having trouble walking that he should just run very slowly when he wants to walk.

But, again, the interesting thing is that even though we may not be able to say what the "master function" is, we can say how it changes as $b$ changes (remember, $M(b) = \int_0^b f(x)\,dx$ is a number that depends on $b$, so $M$ is a function of $b$). And figuring out how functions change is easier than computing their values (just think about derivatives, and how we can easily figure out the rate of change of $\sin(x)$, even though we have a hard time actually computing specific values of $\sin(x)$ that are not among some very simple ones). (This is also something Barrow already knew, as did Newton.)

For "nice functions" (if ff is continuous on an interval that contains 00 and aa), we can do it using limits and some theorems
about "nice" functions: Using limits, we have:

limh0M(b+h)Mh =limh01h(b+h0f(x)dxb0f(x)dx)=limh01hb+hbf(x)dx.limh0M(b+h)
Mh=limh01h(0b+hf(x)dx0bf(x)dx) =limh01hbb+hf(x)dx.

Since we are assuming that $f$ is continuous on $[0,a]$, it is continuous on the interval with endpoints $b$ and $b+h$ (I say it this way because $h$ could be negative). So it has a maximum and a minimum on that interval (a continuous function on a finite closed interval). Say the maximum is $M(h)$ and the minimum is $m(h)$. Then $m(h) \leq f(x) \leq M(h)$ for all $x$ in the interval, so we know, since the integral is the area, that

$$h\,m(h) \leq \int_b^{b+h} f(x)\,dx \leq h\,M(h) \text{ if } h>0, \quad\text{and}\quad h\,M(h) \leq \int_b^{b+h} f(x)\,dx \leq h\,m(h) \text{ if } h<0.$$
That means that, dividing by $h$ (which reverses the inequalities when $h<0$), we get

$$m(h) \leq \frac{1}{h}\int_b^{b+h} f(x)\,dx \leq M(h)$$

in either case.

As $h\to 0$, the interval gets smaller, and the difference between the minimum and maximum values gets smaller. One can prove that both $M$ and $m$ are continuous functions of $h$, that $m(h)\to f(b)$ as $h\to 0$, and likewise that $M(h)\to f(b)$ as $h\to 0$. So we can use the Squeeze Theorem to conclude that, since $\frac{1}{h}\int_b^{b+h} f(x)\,dx$ is squeezed between two functions that both have the same limit as $h\to 0$, it also has a limit as $h\to 0$, and that limit is in fact that same quantity, namely $f(b)$. That is,

$$\frac{d}{db}M(b) = \lim_{h\to 0}\frac{M(b+h)-M(b)}{h} = \lim_{h\to 0}\frac{1}{h}\int_b^{b+h} f(x)\,dx = f(b).$$

That is: when $f$ is continuous, the "Master function" for areas turns out to have a rate of change equal to $f$. This is not that crazy, if you think about it: how is the area under $y=f(x)$ from $x=0$ to $x=b$ changing? Well, it's changing by whatever $f$ is.
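
This can be seen numerically as well. Here is a hedged sketch (the names are made up, and a fine left-hand sum stands in for the exact area) showing that the difference quotient of the master function for $f(x)=\cos(x)$ approaches $f(b)=\cos(1)\approx 0.5403$ at $b=1$:

```python
# Difference quotient of the "master function" M(b) = area under cos
# from 0 to b, approximated here by a fine left-hand Riemann sum.
import math

def M(b, steps=100_000):
    """Approximate the area under cos from 0 to b."""
    width = b / steps
    return sum(math.cos(i * width) * width for i in range(steps))

b = 1.0
for h in [0.1, 0.01, 0.001]:
    print(h, (M(b + h) - M(b)) / h)  # approaches cos(1) ~ 0.5403
```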

This means that, whatever the "Master function" turns out to be, it will be an antiderivative of $f(x)$.

We also know, because we are very good with derivatives, that if $F(x)$ and $G(x)$ are two functions, and $F'(x) = G'(x)$ for all $x$, then $F$ and $G$ differ by a constant: there exists a constant $k$ such that $F(x) = G(x) + k$ for all $x$.
So, we know that the "Master function" is an antiderivative. If, by some sheer stroke of luck, we happen to find any antiderivative $F(x)$ for $f(x)$, then we know that the only possible difference between $M(b)$ and $F(b)$ is a constant: $M(b) = F(b) + k$. What constant? Well, luckily we know one value of $M(b)$: we know that $M(0) = \int_0^0 f(x)\,dx$ should be $0$. So $M(0) = 0 = F(0) + k$, which means the constant has to be $k = -F(0)$. That is, we must have $M(b) = F(b) - F(0)$ for all $b$.
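
For instance (continuing the parabola example), if $f(x)=x^2$ and we happen to pick the antiderivative $F(x) = \frac{x^3}{3} + 7$, then
$$M(b) = F(b) - F(0) = \left(\frac{b^3}{3} + 7\right) - 7 = \frac{b^3}{3},$$
and the arbitrary constant washes out, as it must.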

So, if we find any antiderivative $F$ of $f$, then $M(b) = F(b) - F(0)$ is in fact the "Master function" we were looking for, the one that gives all the integrals between $0$ and $a$, including $0$ and including $a$. So we have that two very different processes (computing areas using limits of Riemann sums, and derivatives) are connected: if $f(x)$ is continuous, and $F(x)$ is any antiderivative for $f(x)$ on $[0,a]$, then

$$\int_0^a f(x)\,dx = M(a) = F(a) - F(0).$$
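
As a final check (a sketch, choosing $f(x)=\cos x$ with antiderivative $F(x)=\sin x$ purely for illustration), a raw left-hand Riemann sum and the antiderivative shortcut land on the same number:

```python
# Fundamental Theorem check: left-hand Riemann sum for cos on [0, 2]
# versus F(a) - F(0) with the antiderivative F = sin.
import math

a, n = 2.0, 1_000_000
width = a / n
riemann = sum(math.cos(i * width) * width for i in range(n))  # limit definition
print(riemann)                    # ~0.909297
print(math.sin(a) - math.sin(0))  # F(a) - F(0) = sin(2) ~ 0.909297
```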
But the integral did not "magically turn" into an antiderivative. It's that the "Master function" which can be used to keep track of all the integrals of $f(x)$ has rate of change equal to $f$, which gives us a "back door" to computing integrals.

Newton was able to prove this because he had the guide of Barrow's insight that this was happening for the functions he worked with. Barrow's insight was achieved because he had the brilliant idea of trying to come up with a "Master function" instead of trying to square lots of different areas one at a time, and he noticed the connection because he had already worked with tangents/derivatives for those functions. Leibniz likewise had access to Barrow's ideas, so the connection between the two was also known to him.
