
MATH 2010 Supplementary Notes on Multivariable Calculus

Department of Mathematics The Hong Kong University of Science and Technology

January 27, 2012


Contents
1 Vectors and Geometry of Space
    1.1 Three-Dimensional Coordinate Systems
    1.2 Vectors
    1.3 The Dot Product
    1.4 The Cross Product
    1.5 Equations of Lines
    1.6 Equations of Planes
    1.7 Quadric Surfaces

2 Vector-Valued Functions
    2.1 Vector Functions
    2.2 Calculus with Vector Functions
    2.3 Arc Length in Space

3 Partial Derivatives
    3.1 Functions of Several Variables
    3.2 Limits and Continuity
    3.3 Partial Derivatives
    3.4 The Chain Rule
    3.5 Directional Derivatives
    3.6 Applications of Partial Derivatives

4 Multiple Integrals
    4.1 Double Integrals
    4.2 Double Integrals Over Non-rectangular Regions
    4.3 Double Integrals in Polar Coordinates
    4.4 Triple Integrals
    4.5 Triple Integrals in Cylindrical Coordinates
    4.6 Triple Integrals in Spherical Coordinates

Chapter 1

Vectors and Geometry of Space


To apply calculus in many real-world situations as well as in higher mathematics, we need a mathematical description of the three-dimensional space. In this beginning chapter we will introduce three-dimensional coordinate systems and vectors. Building on what we already know about coordinates in the two-dimensional xy-plane, we establish coordinates in space by adding a third axis which measures distance above and below the xy-plane. Vectors are used to study the analytic geometry of space, where they provide simple ways to describe lines, planes, surfaces, and curves in space. We will use these geometric ideas later to study motion in space and the calculus of several variables, with their many important applications in science, engineering, and higher mathematics.

1.1 Three-Dimensional Coordinate Systems

In this section we give a fairly short introduction to the three-dimensional coordinate system and the conventions that we will be using. We will also take a brief look at how different coordinate systems can change the graph of an equation. The 3-dimensional coordinate system is often denoted by $\mathbb{R}^3$. Likewise, the 2-dimensional coordinate system is often denoted by $\mathbb{R}^2$ and the 1-dimensional one by $\mathbb{R}$. To locate a point in $\mathbb{R}^3$, we use three mutually perpendicular coordinate axes, arranged as in the following figure. The axes shown there make a right-handed coordinate frame: when you hold your right hand so that the fingers curl from the positive x-axis toward the positive y-axis, your thumb points along the positive z-axis.

[Figure: right-handed x-, y-, and z-axes with a general point P = (x, y, z) and its projections Q = (x, y, 0), R = (0, y, z), and S = (x, 0, z).]

Let us look at the basic coordinate system. The axes in the figure show only the positive directions. If we need the negative part of an axis for any reason, we will draw it in as needed.


Also note the various points in the figure. The point P with Cartesian coordinates $(x, y, z)$ is a general point sitting out in 3-dimensional space $\mathbb{R}^3$. If we start at P and drop straight down until we reach a z-coordinate of zero, we arrive at the point Q. We say that Q sits in the xy-plane. The xy-plane consists of all the points that have a zero z-coordinate. We can also start at P and move in the other two directions as shown to get points in the xz-plane (this is the point S, with a y-coordinate of zero) and the yz-plane (this is the point R, with an x-coordinate of zero). The xy-, yz-, and xz-planes are often called the coordinate planes. Also, the point Q is often referred to as the projection of P in the xy-plane. Likewise, R is the projection of P in the yz-plane and S is the projection of P in the xz-plane.

Most of the formulae that you are familiar with in $\mathbb{R}^2$ have natural extensions in $\mathbb{R}^3$. For instance, the distance between two points in $\mathbb{R}^2$ is given by
$$\operatorname{dist}(P_1, P_2) = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2},$$
while the distance between two points in $\mathbb{R}^3$ is given by
$$\operatorname{dist}(P_1, P_2) = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}.$$

We can use the distance formula to write equations for spheres in space. A point $P(x, y, z)$ lies on the sphere of radius $a$ centered at $P_0(x_0, y_0, z_0)$ precisely when $\operatorname{dist}(P_0, P) = a$, or
$$(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2 = a^2.$$
It is worth mentioning that we should be careful about simply translating everything we know about $\mathbb{R}^2$ into $\mathbb{R}^3$ and assuming that it will work the same way. Graphing is a good example of this. Consider the following example.
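The distance formula and the sphere condition can be checked numerically. The following is a minimal Python sketch; the function names `dist` and `on_sphere` and the sample points are illustrative choices, not part of the notes.

```python
import math

def dist(p, q):
    """Euclidean distance between two points given as tuples of equal length."""
    return math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))

def on_sphere(p, center, radius, tol=1e-9):
    """True if p lies on the sphere with the given center and radius."""
    return math.isclose(dist(center, p), radius, abs_tol=tol)

# Sample points chosen only for illustration.
print(dist((1, 2), (4, 6)))                  # distance in R^2
print(dist((1, 2, 3), (3, 5, 9)))            # distance in R^3
print(on_sphere((3, 5, 9), (1, 2, 3), 7))    # is (3, 5, 9) on the sphere of radius 7 about (1, 2, 3)?
```

The same `dist` function serves both $\mathbb{R}^2$ and $\mathbb{R}^3$ because the formula only adds one squared difference per coordinate.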

Example 1.1.1 (Graphing in different coordinate systems) Graph and interpret the equation $x = 3$ geometrically in the coordinate systems $\mathbb{R}$, $\mathbb{R}^2$, and $\mathbb{R}^3$.

Solution In $\mathbb{R}$ we have a single coordinate axis, and so $x = 3$ is a point in a one-dimensional coordinate system.

In $\mathbb{R}^2$ the equation $x = 3$ tells us to graph all the points of the form $(3, y)$. This is a vertical line in a two-dimensional coordinate system.

In $\mathbb{R}^3$ the equation $x = 3$ tells us to graph all the points of the form $(3, y, z)$. If you go back and look at the coordinate planes in 3-space, this is very similar to the yz-plane, except that this time we have $x = 3$ instead of $x = 0$. So, in a three-dimensional coordinate system this is a plane parallel to the yz-plane. Here are the graphs of each of these.

[Figure: the graph of $x = 3$ as a point in $\mathbb{R}$, a vertical line in $\mathbb{R}^2$, and a plane in $\mathbb{R}^3$.]


Note that we can now also write down the equations of the coordinate planes using this idea:

$x = 0$: the yz-plane,
$y = 0$: the xz-plane,
$z = 0$: the xy-plane.

Let us take a look at a slightly more general example.

Example 1.1.2 (Graphing in different coordinate systems) Graph and interpret the equation $y = 2x - 1$ geometrically in the coordinate systems $\mathbb{R}^2$ and $\mathbb{R}^3$.

Solution Of course we have to throw out $\mathbb{R}$ for this example, since there are two variables, which means that we cannot be in a one-dimensional space. In $\mathbb{R}^2$ the equation $y = 2x - 1$ is a line with slope 2 and y-intercept $-1$. However, in $\mathbb{R}^3$ this is not a line. Because we have not specified a value of $z$, we must allow $z$ to take any value. This means that at any particular value of $z$ we get a copy of this line. So, the graph of the given equation in 3-space is a vertical plane that lies over the line given by $y = 2x - 1$ in the xy-plane.

[Figure: the line $y = 2x - 1$ in $\mathbb{R}^2$, and the vertical plane in $\mathbb{R}^3$ lying over the line $y = 2x - 1$ in the xy-plane.]

Notice that if we look at where the plane intersects the xy-plane, we get the graph of the line in $\mathbb{R}^2$, as shown in the above graph. Let us take a look at one more example of the difference between graphs in the different coordinate systems.


Example 1.1.3 (Graphing in different coordinate systems) Graph and interpret the equation $x^2 + y^2 = 4$ geometrically in the coordinate systems $\mathbb{R}^2$ and $\mathbb{R}^3$.

Solution As with the previous example, this will not have a one-dimensional graph, since there are two independent variables in the equation. In $\mathbb{R}^2$ this is a circle centered at the origin with radius 2. In $\mathbb{R}^3$, as with the previous example, this is not a circle. Since we have not specified $z$ in any way, we must allow $z$ to take on any value. In other words, at any value of $z$ this equation must be satisfied, and so at any value of $z$ we have a circle of radius 2 centered on the z-axis. This means that we have a cylinder of radius 2 centered on the z-axis. The following are the graphs for this example.

[Figure: the circle $x^2 + y^2 = 4$ in $\mathbb{R}^2$, and the cylinder $x^2 + y^2 = 4$ in $\mathbb{R}^3$.]

Again, if we look at where the cylinder intersects the xy-plane (i.e., $z = 0$), we again get the circle from $\mathbb{R}^2$.

We need to be careful with the last two examples. It would be tempting to conclude from them that we cannot graph lines or circles in $\mathbb{R}^3$, but that does not really make sense. There is no reason why we cannot graph a line or a circle in $\mathbb{R}^3$. To graph a circle in $\mathbb{R}^3$ we would need to do something like $x^2 + y^2 = 4$ at $z = 5$. This would be a circle of radius 2 centered on the z-axis at the level $z = 5$. So, as long as we specify a z-value we get a circle and not a cylinder. We will see an easier way to specify circles in a later section. We could do the same thing with the line from Example 1.1.2. However, we will be looking at lines in more generality in Section 1.5, and so we will see a better way to deal with lines in $\mathbb{R}^3$ there. The point of the examples in this section is to make sure that we are careful when graphing equations and that we always remember which coordinate system we are in. Another quick point to make here is that, as we have seen in the above examples, many graphs of equations in $\mathbb{R}^3$ are surfaces. That does not mean that we cannot graph curves in $\mathbb{R}^3$. We can and will graph curves in $\mathbb{R}^3$, as we will see later in this chapter.
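The difference between the cylinder $x^2 + y^2 = 4$ and the circle obtained by also fixing $z = 5$ can be illustrated with a small membership test. This is a sketch only; the helper names `on_cylinder` and `on_circle` and the sample angle are assumptions made for the illustration.

```python
import math

def on_cylinder(p, radius=2.0, tol=1e-9):
    """True if p = (x, y, z) satisfies x^2 + y^2 = radius^2; z is unrestricted."""
    x, y, _z = p
    return math.isclose(x * x + y * y, radius * radius, abs_tol=tol)

def on_circle(p, radius=2.0, level=5.0, tol=1e-9):
    """True if p lies on the cylinder and also on the plane z = level."""
    return on_cylinder(p, radius, tol) and math.isclose(p[2], level, abs_tol=tol)

# One point of the circle of radius 2 in the xy-plane, lifted to several heights.
theta = math.pi / 3
x, y = 2 * math.cos(theta), 2 * math.sin(theta)
for z in (-1.0, 0.0, 5.0):
    print(z, on_cylinder((x, y, z)), on_circle((x, y, z)))
```

Every height passes the cylinder test, but only the point at height 5 passes the circle test, which is the extra condition that cuts the surface down to a curve.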


As we mentioned at the beginning of this chapter, there is a mathematical tool which can be used to describe lines, curves, planes, and surfaces in space. This tool, the basic building block of geometry, is the vector. In the next section we give a short review of vectors and their basic properties.

1.2 Vectors

In this section we show how to represent things that have both magnitude and direction in the plane or in space. A quantity in physics such as force, displacement, or velocity is called a vector and is represented by a directed line segment. The arrow points in the direction of the action and its length gives the magnitude of the action in terms of a suitably chosen unit. For example, a force vector points in the direction in which the force acts, and its length is a measure of the force's strength; a velocity vector points in the direction of motion, and its length is the speed of the moving object.

[Figure: the directed line segment $\overrightarrow{PQ}$ from its initial point P to its terminal point Q.]

1.2.1 Definition A vector is a directed line segment. The directed line segment $\overrightarrow{PQ}$ has initial point P and terminal point Q; its length (or norm) is denoted by $\|\overrightarrow{PQ}\|$. Two vectors are equal if they have the same length and direction.

The arrows we use when we draw vectors are understood to represent the same vector if they have the same length, are parallel, and point in the same direction, regardless of the initial point. We need a way to represent vectors algebraically so that we can be more precise about the direction of a vector. Let $\mathbf{v} = \overrightarrow{PQ}$. There is one directed line segment equal to $\overrightarrow{PQ}$ whose initial point is the origin. It is the representative of $\mathbf{v}$ in standard position and is the vector we normally use to represent $\mathbf{v}$. We can specify $\mathbf{v}$ by writing the coordinates of its terminal point $(v_1, v_2, v_3)$ when $\mathbf{v}$ is in standard position.

1.2.2 Definition If $\mathbf{v}$ is a two-dimensional vector in the plane equal to the vector with initial point at the origin and terminal point $(v_1, v_2)$, then the component form of $\mathbf{v}$ is $\mathbf{v} = \langle v_1, v_2 \rangle$. If $\mathbf{v}$ is a three-dimensional vector equal to the vector with initial point at the origin and terminal point $(v_1, v_2, v_3)$, then the component form of $\mathbf{v}$ is $\mathbf{v} = \langle v_1, v_2, v_3 \rangle$.


So a two-dimensional vector $\mathbf{v} = \langle v_1, v_2 \rangle$ is an ordered pair of real numbers, and a three-dimensional vector $\mathbf{v} = \langle v_1, v_2, v_3 \rangle$ is an ordered triple of real numbers. The numbers $v_1, v_2, v_3$ are the coordinates of $\mathbf{v}$. In general, an n-component multivariable $\mathbf{x} = \langle x_1, x_2, \dots, x_n \rangle$ is called an n-dimensional vector. The components $x_1, x_2, \dots, x_n$ are called the coordinates of the vector $\mathbf{x}$. The collection $\mathbb{R}^n$ of all n-dimensional vectors is called the n-dimensional Euclidean space. We indicate that $\mathbf{x}$ is an n-dimensional vector by writing $\mathbf{x} \in \mathbb{R}^n$. We may visualize low-dimensional Euclidean spaces as follows.

$\mathbb{R}^0$: a single point;
$\mathbb{R}^1$: a straight line with origin;
$\mathbb{R}^2$: a plane with origin;
$\mathbb{R}^3$: our living world with a choice of reference point as origin.

The origin corresponds to the zero vector $\mathbf{0} = \langle 0, 0, \dots, 0 \rangle$. We may also imagine higher-dimensional Euclidean spaces by analogy. In summary, a position vector is represented either as a point in the Euclidean space or as an arrow from the origin to the point.

[Figure: the same vectors drawn two ways, as points and as arrows from the origin 0.]

In the real world, one often has to deal with arrows not starting from the reference point (i.e., the origin). In practice, we get around the problem by moving the arrows in parallel so that their starting points become the origin. Throughout this course any two arrows will be considered the same vector if one can be moved in parallel onto the other.

[Figure: two parallel arrows of equal length, both representing the same vector v.]

Two principal operations involving vectors are vector addition and scalar multiplication. A scalar is simply a real number; it is so called to distinguish it from vectors. Scalars can be positive, negative, or zero, and are used to scale a vector by multiplication.


Addition and scalar multiplication


Vectors of the same dimension may be added and multiplied by scalars in the natural way:
$$\langle x_1, x_2, \dots, x_n \rangle + \langle y_1, y_2, \dots, y_n \rangle = \langle x_1 + y_1, x_2 + y_2, \dots, x_n + y_n \rangle,$$
$$c \langle x_1, x_2, \dots, x_n \rangle = \langle c x_1, c x_2, \dots, c x_n \rangle.$$

Scalar multiplication is easily visualized as the stretching and shrinking of vectors. Addition is visualized with the help of parallelograms.

[Figure: the sum u + v as the diagonal of the parallelogram spanned by u and v, together with scalar multiples of u.]

The above figure indicates that the operations on vectors have physical meaning in the real world. For example, suppose vectors $\mathbf{x}$ and $\mathbf{y}$ represent two forces applied to the same point of an object. Then the combined effect of the two forces on the object is a force represented by the vector $\mathbf{x} + \mathbf{y}$. By repeatedly making use of the two operations, we obtain linear combinations of vectors.

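Example 1.2.1 (Linear combination of vectors) Let
$$\mathbf{x} = \langle 1, 2, 3, 5 \rangle, \quad \mathbf{y} = \langle 0, -1, 2, 4 \rangle, \quad \mathbf{z} = \langle 1, 3, 1, 1 \rangle.$$
Then
$$\mathbf{x} + 2\mathbf{y} - 3\mathbf{z} = \langle -2, -9, 4, 10 \rangle, \qquad -2\mathbf{x} + \mathbf{y} - 2\mathbf{z} = \langle -4, -11, -6, -8 \rangle.$$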
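Componentwise addition and scalar multiplication translate directly into code. The sketch below is one possible way to compute a linear combination; the function names and the sample vectors `u`, `v`, `w` are hypothetical and chosen only for illustration.

```python
def add(u, v):
    """Componentwise sum of two vectors of the same dimension."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return tuple(a + b for a, b in zip(u, v))

def scale(c, v):
    """Scalar multiple c * v."""
    return tuple(c * a for a in v)

def linear_combination(coeffs, vectors):
    """Return coeffs[0]*vectors[0] + coeffs[1]*vectors[1] + ..."""
    result = (0,) * len(vectors[0])
    for c, v in zip(coeffs, vectors):
        result = add(result, scale(c, v))
    return result

# Sample vectors chosen only for illustration.
u = (1, 0, 2)
v = (3, -1, 4)
w = (0, 5, -2)
print(linear_combination((1, 2, -3), (u, v, w)))   # u + 2v - 3w
```

Building the combination from `add` and `scale` mirrors the definition: a linear combination is nothing more than repeated use of the two basic operations.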

The following figure indicates all the linear combinations of two vectors.

[Figure: grid of linear combinations of u and v, with labelled points including 0, u, 2u, 2v, u + v, 2u + v, u + 2v, and (1/2)u + (1/2)v.]


Unit vectors
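A vector $\mathbf{v}$ of length 1 is called a unit vector. The standard unit vectors are
$$\mathbf{i} = \langle 1, 0, 0 \rangle, \quad \mathbf{j} = \langle 0, 1, 0 \rangle, \quad \mathbf{k} = \langle 0, 0, 1 \rangle.$$
Any vector $\mathbf{v} = \langle v_1, v_2, v_3 \rangle$ can be written as a linear combination of the standard unit vectors as follows:
$$\mathbf{v} = \langle v_1, v_2, v_3 \rangle = \langle v_1, 0, 0 \rangle + \langle 0, v_2, 0 \rangle + \langle 0, 0, v_3 \rangle = v_1 \langle 1, 0, 0 \rangle + v_2 \langle 0, 1, 0 \rangle + v_3 \langle 0, 0, 1 \rangle = v_1 \mathbf{i} + v_2 \mathbf{j} + v_3 \mathbf{k}.$$

Whenever $\mathbf{v} \neq \mathbf{0}$, its length $\|\mathbf{v}\|$ is not zero and
$$\left\| \frac{1}{\|\mathbf{v}\|} \mathbf{v} \right\| = \frac{1}{\|\mathbf{v}\|} \|\mathbf{v}\| = 1.$$

That is, $\mathbf{v} / \|\mathbf{v}\|$ is a unit vector in the direction of $\mathbf{v}$, called the direction of the nonzero vector $\mathbf{v}$. The process of rescaling a nonzero vector into a unit vector is often called the normalization of the vector.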

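Example 1.2.2 (Unit vectors) Find a unit vector $\mathbf{u}$ in the direction of the vector from $P_1(1, 0, 1)$ to $P_2(3, 2, 0)$.

Solution We divide $\overrightarrow{P_1P_2}$ by its length:
$$\overrightarrow{P_1P_2} = (3 - 1)\mathbf{i} + (2 - 0)\mathbf{j} + (0 - 1)\mathbf{k} = 2\mathbf{i} + 2\mathbf{j} - \mathbf{k},$$
$$\|\overrightarrow{P_1P_2}\| = \sqrt{(2)^2 + (2)^2 + (-1)^2} = \sqrt{9} = 3,$$
$$\mathbf{u} = \frac{\overrightarrow{P_1P_2}}{\|\overrightarrow{P_1P_2}\|} = \frac{2\mathbf{i} + 2\mathbf{j} - \mathbf{k}}{3} = \frac{2}{3}\mathbf{i} + \frac{2}{3}\mathbf{j} - \frac{1}{3}\mathbf{k}.$$
The unit vector $\mathbf{u}$ is in the direction of $\overrightarrow{P_1P_2}$.

Example 1.2.3 (Unit vectors) If $\mathbf{v} = 3\mathbf{i} - 4\mathbf{j}$ is a velocity vector, express $\mathbf{v}$ as a product of its speed and a unit vector in the direction of motion.

Solution Speed is the magnitude (length) of $\mathbf{v}$:
$$\|\mathbf{v}\| = \sqrt{(3)^2 + (-4)^2} = \sqrt{25} = 5.$$
The unit vector $\mathbf{v} / \|\mathbf{v}\|$ has the same direction as $\mathbf{v}$:
$$\frac{\mathbf{v}}{\|\mathbf{v}\|} = \frac{3\mathbf{i} - 4\mathbf{j}}{5} = \frac{3}{5}\mathbf{i} - \frac{4}{5}\mathbf{j}.$$
So,
$$\mathbf{v} = 3\mathbf{i} - 4\mathbf{j} = 5 \underbrace{\left( \frac{3}{5}\mathbf{i} - \frac{4}{5}\mathbf{j} \right)}_{\text{a unit vector}}.$$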


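In summary, we can express any nonzero vector $\mathbf{v}$ in terms of its two important features, length and direction, by writing $\mathbf{v} = \|\mathbf{v}\| \dfrac{\mathbf{v}}{\|\mathbf{v}\|}$.

Theorem 1.2.1 (Normalizing a Vector) If $\mathbf{v} \neq \mathbf{0}$, then
(1) $\dfrac{\mathbf{v}}{\|\mathbf{v}\|}$ is a unit vector in the direction of $\mathbf{v}$.
(2) The equation $\mathbf{v} = \|\mathbf{v}\| \dfrac{\mathbf{v}}{\|\mathbf{v}\|}$ expresses $\mathbf{v}$ in terms of its length and direction.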
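Normalization is easy to reproduce numerically. The sketch below reuses the data of Examples 1.2.2 and 1.2.3; the function names `norm` and `normalize` are illustrative assumptions, not notation from the notes.

```python
import math

def norm(v):
    """Length of a vector given as a tuple of components."""
    return math.sqrt(sum(a * a for a in v))

def normalize(v):
    """Unit vector in the direction of a nonzero vector v."""
    length = norm(v)
    if length == 0:
        raise ValueError("the zero vector has no direction")
    return tuple(a / length for a in v)

# Example 1.2.2: direction from P1(1, 0, 1) to P2(3, 2, 0).
p1, p2 = (1, 0, 1), (3, 2, 0)
d = tuple(b - a for a, b in zip(p1, p2))   # components of P1P2
print(norm(d), normalize(d))

# Example 1.2.3: velocity 3i - 4j written as speed times direction.
v = (3, -4)
speed, direction = norm(v), normalize(v)
print(speed, direction)
```

Raising an error for the zero vector reflects the hypothesis $\mathbf{v} \neq \mathbf{0}$ in Theorem 1.2.1.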

1.3 The Dot Product

In this section, we will see how to determine the angle between two vectors directly from their components. A key part of the calculation is an expression called the dot product. Dot products are also called inner or scalar products because the product results in a scalar, not a vector. After introducing the dot product, we will examine a physical application: determining the work done by a constant force acting through a displacement.

When two nonzero vectors $\mathbf{u}$ and $\mathbf{v}$ are placed so that their initial points coincide, they form an angle $\theta$ of measure $0 \le \theta \le \pi$. If the vectors do not lie along the same line, the angle $\theta$ is measured in the plane containing both of them. If they do lie along the same line, the angle between them is $0$ if they point in the same direction, and $\pi$ if they point in opposite directions. The angle $\theta$ is the angle between $\mathbf{u}$ and $\mathbf{v}$. The following theorem gives a formula to determine this angle.

Theorem 1.3.1 (Angle Between Two Vectors) The angle $\theta$ between the two nonzero vectors $\mathbf{u} = \langle u_1, u_2, u_3 \rangle$ and $\mathbf{v} = \langle v_1, v_2, v_3 \rangle$ is given by
$$\mathbf{u} \cdot \mathbf{v} = \|\mathbf{u}\| \|\mathbf{v}\| \cos\theta,$$
or explicitly
$$\theta = \cos^{-1} \left( \frac{u_1 v_1 + u_2 v_2 + u_3 v_3}{\|\mathbf{u}\| \|\mathbf{v}\|} \right).$$


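In Theorem 1.3.1, we pay attention to the expression $u_1 v_1 + u_2 v_2 + u_3 v_3$ in the calculation of $\theta$.

1.3.1 Definition The dot product $\mathbf{u} \cdot \mathbf{v}$ (read "u dot v") of vectors $\mathbf{u} = \langle u_1, u_2, u_3 \rangle$ and $\mathbf{v} = \langle v_1, v_2, v_3 \rangle$ is
$$\mathbf{u} \cdot \mathbf{v} = u_1 v_1 + u_2 v_2 + u_3 v_3.$$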

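Example 1.3.1 (The dot product)
(a) $\langle 1, 2, 3 \rangle \cdot \langle -2, 3, 6 \rangle = (1)(-2) + (2)(3) + (3)(6) = -2 + 6 + 18 = 22.$
(b) $\left( \frac{1}{2}\mathbf{i} - 2\mathbf{j} + 3\mathbf{k} \right) \cdot (4\mathbf{i} - 3\mathbf{j} + 2\mathbf{k}) = \left( \frac{1}{2} \right)(4) + (-2)(-3) + (3)(2) = 14.$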

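Example 1.3.2 (The dot product) Find the angle between $\mathbf{u} = \mathbf{i} - 2\mathbf{j} - 2\mathbf{k}$ and $\mathbf{v} = 6\mathbf{i} + 3\mathbf{j} + 2\mathbf{k}$.

Solution We use the formula in Theorem 1.3.1:
$$\mathbf{u} \cdot \mathbf{v} = (1)(6) + (-2)(3) + (-2)(2) = -4,$$
$$\|\mathbf{u}\| = \sqrt{(1)^2 + (-2)^2 + (-2)^2} = \sqrt{9} = 3,$$
$$\|\mathbf{v}\| = \sqrt{(6)^2 + (3)^2 + (2)^2} = \sqrt{49} = 7,$$
$$\theta = \cos^{-1} \left( \frac{\mathbf{u} \cdot \mathbf{v}}{\|\mathbf{u}\| \|\mathbf{v}\|} \right) = \cos^{-1} \left( \frac{-4}{(3)(7)} \right) \approx 1.76 \text{ (radians)}.$$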
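The computation in Example 1.3.2 can be reproduced with a few lines of Python. This is a minimal sketch; the names `dot` and `angle` are illustrative, and the clamping step is an added safeguard against floating-point rounding rather than part of the theorem.

```python
import math

def dot(u, v):
    """Dot product of two vectors of the same dimension."""
    return sum(a * b for a, b in zip(u, v))

def angle(u, v):
    """Angle in radians between two nonzero vectors, from Theorem 1.3.1."""
    cos_theta = dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)

# Example 1.3.2: u = i - 2j - 2k, v = 6i + 3j + 2k.
u = (1, -2, -2)
v = (6, 3, 2)
print(dot(u, v))               # -4
print(round(angle(u, v), 2))   # 1.76
```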

Work
Let us briefly discuss the physical interpretation of the dot product. If a force $\mathbf{F}$ is applied to a particle moving along a path, we often need to know the magnitude of the force in the direction of motion. If $\mathbf{d}$ is parallel to the tangent line to the path at the point where $\mathbf{F}$ is applied, then we want the magnitude of $\mathbf{F}$ in the direction of $\mathbf{d}$. The following figure shows that the scalar quantity we seek is the length $\|\mathbf{F}\| \cos\theta$, where $\theta$ is the angle between the two vectors $\mathbf{F}$ and $\mathbf{d}$.

[Figure: the force F, the direction d, and the projection of F onto d, of length $\|\mathbf{F}\| \cos\theta$.]


Notice that the magnitude of the force $\mathbf{F}$ in the direction of the vector $\mathbf{d}$ is the length $\|\mathbf{F}\| \cos\theta$ of the projection of $\mathbf{F}$ onto $\mathbf{d}$.

Recall that the work done by a constant force of magnitude $F$ in moving an object through a distance $d$ is $W = Fd$. This formula holds only if the force is directed along the line of motion. If instead a force $\mathbf{F}$ moves an object through a displacement $\mathbf{d}$ that points in a different direction, the work is performed by the component of $\mathbf{F}$ in the direction of $\mathbf{d}$. If $\theta$ is the angle between $\mathbf{F}$ and $\mathbf{d}$, then
$$\text{Work} = (\text{scalar component of } \mathbf{F} \text{ in the direction of } \mathbf{d}) \, (\text{length of } \mathbf{d}) = (\|\mathbf{F}\| \cos\theta) \|\mathbf{d}\| = \mathbf{F} \cdot \mathbf{d}.$$

In other words, the dot product gives the work done by a constant force acting through a displacement. The work done is
$$W = \mathbf{F} \cdot \mathbf{d} = \|\mathbf{F}\| \|\mathbf{d}\| \cos\theta.$$
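As a quick numerical illustration of $W = \mathbf{F} \cdot \mathbf{d}$, the sketch below uses hypothetical values that do not come from the notes: a force of magnitude 40 N acting at 60 degrees to a 3 m displacement along the x-axis. The function name `work` is likewise an assumption.

```python
import math

def dot(u, v):
    """Dot product of two vectors of the same dimension."""
    return sum(a * b for a, b in zip(u, v))

def work(force, displacement):
    """Work done by a constant force acting through a displacement."""
    return dot(force, displacement)

# Hypothetical data: |F| = 40 N at 60 degrees to a 3 m displacement along x.
theta = math.radians(60)
F = (40 * math.cos(theta), 40 * math.sin(theta), 0.0)
d = (3.0, 0.0, 0.0)

print(work(F, d))                 # component form of F . d
print(40 * 3 * math.cos(theta))   # |F| |d| cos(theta), the same value up to rounding
```

Both expressions give approximately 60 J, so only the component of the force along the displacement contributes to the work.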

Perpendicular (orthogonal) vectors


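Two nonzero vectors $\mathbf{u}$ and $\mathbf{v}$ are perpendicular or orthogonal if the angle between them is $\pi/2$. For such vectors, we have $\mathbf{u} \cdot \mathbf{v} = 0$ because $\cos(\pi/2) = 0$. The converse is also true: if $\mathbf{u}$ and $\mathbf{v}$ are nonzero vectors with $\mathbf{u} \cdot \mathbf{v} = \|\mathbf{u}\| \|\mathbf{v}\| \cos\theta = 0$, then $\cos\theta = 0$ and hence $\theta = \cos^{-1} 0 = \pi/2$.

Theorem 1.3.2 Vectors $\mathbf{u}$ and $\mathbf{v}$ are orthogonal (or perpendicular) if and only if $\mathbf{u} \cdot \mathbf{v} = 0$.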

Dot product properties


The dot product obeys many of the laws that hold for ordinary products of real numbers (scalars).

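Theorem 1.3.3 (Properties of the Dot Product) If $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ are any vectors and $c$ is a scalar, then
(1) $\mathbf{u} \cdot \mathbf{v} = \mathbf{v} \cdot \mathbf{u}$.
(2) $(c\mathbf{u}) \cdot \mathbf{v} = \mathbf{u} \cdot (c\mathbf{v}) = c(\mathbf{u} \cdot \mathbf{v})$.
(3) $\mathbf{u} \cdot (\mathbf{v} + \mathbf{w}) = \mathbf{u} \cdot \mathbf{v} + \mathbf{u} \cdot \mathbf{w}$.
(4) $\mathbf{u} \cdot \mathbf{u} = \|\mathbf{u}\|^2$.
(5) $\mathbf{0} \cdot \mathbf{u} = 0$.

The properties are easy to prove using the definition. We skip the proofs here, but students are encouraged to try proving (1) and (3), for instance.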


1.4 The Cross Product

In the 2-dimensional plane, when we want to describe the orientation of a line, we use the notion of slope. In 3-dimensional space, we also need a way to describe how a plane is tilted. We accomplish this by multiplying two vectors in the plane together to get a third vector perpendicular to the plane. The direction of this third vector tells us the inclination of the plane. The product we use to multiply the vectors together is the vector or cross product, the second of the two vector multiplication methods. We study the cross product in this section.

[Figure: the cross product $\mathbf{u} \times \mathbf{v}$ and the unit normal vector n.]

We start with two nonzero vectors $\mathbf{u}$ and $\mathbf{v}$ in space. If $\mathbf{u}$ and $\mathbf{v}$ are not parallel, they uniquely determine a plane. We select a unit vector $\mathbf{n}$ perpendicular to the plane by the right-hand rule. This means that we choose $\mathbf{n}$ to be the unit (normal) vector that points the way your right thumb points when your fingers curl through the angle $\theta$ from $\mathbf{u}$ to $\mathbf{v}$. Then the cross product $\mathbf{u} \times \mathbf{v}$ (read "u cross v") is the vector defined as follows:

1.4.1 Definition
$$\mathbf{u} \times \mathbf{v} = (\|\mathbf{u}\| \|\mathbf{v}\| \sin\theta) \, \mathbf{n}.$$

Unlike the dot product, the cross product is a vector. For this reason it is also called the vector product of $\mathbf{u}$ and $\mathbf{v}$, and it applies only to vectors in space. The vector $\mathbf{u} \times \mathbf{v}$ is orthogonal to both $\mathbf{u}$ and $\mathbf{v}$ because it is a scalar multiple of $\mathbf{n}$. There is a straightforward way to calculate the cross product of two vectors from their components. The method does not require that we know the angle between them (as suggested by the definition), but we postpone that calculation for the moment so that we can focus first on the properties of the cross product. Since the sines of $0$ and $\pi$ are both zero, it makes sense to define the cross product of two parallel nonzero vectors to be $\mathbf{0}$. If one or both of $\mathbf{u}$ and $\mathbf{v}$ are zero, we also define $\mathbf{u} \times \mathbf{v}$ to be zero. This way, the cross product of two vectors $\mathbf{u}$ and $\mathbf{v}$ is zero if and only if $\mathbf{u}$ and $\mathbf{v}$ are parallel or one or both of them are zero.

Theorem 1.4.1 Nonzero vectors $\mathbf{u}$ and $\mathbf{v}$ are parallel if and only if $\mathbf{u} \times \mathbf{v} = \mathbf{0}$.


The cross product obeys the following basic rules.

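Theorem 1.4.2 (Properties of the Cross Product) If $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ are any vectors and $r$, $s$ are scalars, then
(1) $(r\mathbf{u}) \times (s\mathbf{v}) = (rs)(\mathbf{u} \times \mathbf{v})$.
(2) $\mathbf{u} \times (\mathbf{v} + \mathbf{w}) = \mathbf{u} \times \mathbf{v} + \mathbf{u} \times \mathbf{w}$.
(3) $(\mathbf{v} + \mathbf{w}) \times \mathbf{u} = \mathbf{v} \times \mathbf{u} + \mathbf{w} \times \mathbf{u}$.
(4) $\mathbf{v} \times \mathbf{u} = -(\mathbf{u} \times \mathbf{v})$.
(5) $\mathbf{0} \times \mathbf{u} = \mathbf{0}$.

To visualize (4), for example, notice that when the fingers of a right hand curl through the angle $\theta$ from $\mathbf{v}$ to $\mathbf{u}$, the thumb points the opposite way, and the unit vector we choose in forming $\mathbf{v} \times \mathbf{u}$ is the negative of the one we choose in forming $\mathbf{u} \times \mathbf{v}$. Also note that the cross product is not associative, so $(\mathbf{u} \times \mathbf{v}) \times \mathbf{w}$ does not generally equal $\mathbf{u} \times (\mathbf{v} \times \mathbf{w})$. When we apply the definition to calculate the pairwise cross products of $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$, we find
$$\mathbf{i} \times \mathbf{j} = -(\mathbf{j} \times \mathbf{i}) = \mathbf{k}, \quad \mathbf{j} \times \mathbf{k} = -(\mathbf{k} \times \mathbf{j}) = \mathbf{i}, \quad \mathbf{k} \times \mathbf{i} = -(\mathbf{i} \times \mathbf{k}) = \mathbf{j}, \quad \mathbf{i} \times \mathbf{i} = \mathbf{j} \times \mathbf{j} = \mathbf{k} \times \mathbf{k} = \mathbf{0}.$$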

Cross product as an area of a parallelogram


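[Figure: parallelogram determined by u and v, with height $h = \|\mathbf{v}\| \sin\theta$; Area = base × height = $\|\mathbf{u}\| \, \|\mathbf{v}\| \sin\theta$.]

Because $\mathbf{n}$ is a unit vector, the magnitude of $\mathbf{u} \times \mathbf{v}$ is

Theorem 1.4.3
$$\|\mathbf{u} \times \mathbf{v}\| = \|\mathbf{u}\| \|\mathbf{v}\| \, |\sin\theta| \, \|\mathbf{n}\| = \|\mathbf{u}\| \|\mathbf{v}\| \sin\theta.$$

This is the area of the parallelogram determined by $\mathbf{u}$ and $\mathbf{v}$, $\|\mathbf{u}\|$ being the base of the parallelogram and $\|\mathbf{v}\| \sin\theta$ the height. Note that $\sin\theta$ is positive (and hence $|\sin\theta| = \sin\theta$) for $0 < \theta < \pi$.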


Determinant formula for cross product


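We may also calculate the cross product $\mathbf{u} \times \mathbf{v}$ from the components of $\mathbf{u}$ and $\mathbf{v}$ in the Cartesian coordinate system. For ease in calculating the cross product using determinants, we usually write vectors in the form $\mathbf{v} = v_1 \mathbf{i} + v_2 \mathbf{j} + v_3 \mathbf{k}$ rather than as ordered triples $\mathbf{v} = \langle v_1, v_2, v_3 \rangle$. We have the following rule.

Theorem 1.4.4 (Calculating Cross Products Using Determinants) If $\mathbf{u} = u_1 \mathbf{i} + u_2 \mathbf{j} + u_3 \mathbf{k}$ and $\mathbf{v} = v_1 \mathbf{i} + v_2 \mathbf{j} + v_3 \mathbf{k}$, then
$$\mathbf{u} \times \mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix}.$$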

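Example 1.4.1 (Cross product by determinant) Find $\mathbf{u} \times \mathbf{v}$ and $\mathbf{v} \times \mathbf{u}$ if $\mathbf{u} = 2\mathbf{i} + \mathbf{j} + \mathbf{k}$ and $\mathbf{v} = -4\mathbf{i} + 3\mathbf{j} + \mathbf{k}$.

Solution By the formula,
$$\mathbf{u} \times \mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 2 & 1 & 1 \\ -4 & 3 & 1 \end{vmatrix} = \begin{vmatrix} 1 & 1 \\ 3 & 1 \end{vmatrix} \mathbf{i} - \begin{vmatrix} 2 & 1 \\ -4 & 1 \end{vmatrix} \mathbf{j} + \begin{vmatrix} 2 & 1 \\ -4 & 3 \end{vmatrix} \mathbf{k} = -2\mathbf{i} - 6\mathbf{j} + 10\mathbf{k},$$
$$\mathbf{v} \times \mathbf{u} = -(\mathbf{u} \times \mathbf{v}) = 2\mathbf{i} + 6\mathbf{j} - 10\mathbf{k}.$$

Example 1.4.2 (Cross product by determinant) Find the area of the triangle with vertices $P(1, -1, 0)$, $Q(2, 1, -1)$, and $R(-1, 1, 2)$.

Solution The vector $\overrightarrow{PQ} \times \overrightarrow{PR}$ is perpendicular to the plane of the triangle because it is perpendicular to both vectors. In terms of components,
$$\overrightarrow{PQ} = (2 - 1)\mathbf{i} + (1 + 1)\mathbf{j} + (-1 - 0)\mathbf{k} = \mathbf{i} + 2\mathbf{j} - \mathbf{k},$$
$$\overrightarrow{PR} = (-1 - 1)\mathbf{i} + (1 + 1)\mathbf{j} + (2 - 0)\mathbf{k} = -2\mathbf{i} + 2\mathbf{j} + 2\mathbf{k}.$$
Then
$$\overrightarrow{PQ} \times \overrightarrow{PR} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 2 & -1 \\ -2 & 2 & 2 \end{vmatrix} = \begin{vmatrix} 2 & -1 \\ 2 & 2 \end{vmatrix} \mathbf{i} - \begin{vmatrix} 1 & -1 \\ -2 & 2 \end{vmatrix} \mathbf{j} + \begin{vmatrix} 1 & 2 \\ -2 & 2 \end{vmatrix} \mathbf{k} = 6\mathbf{i} + 6\mathbf{k}.$$
The area of the parallelogram determined by $P$, $Q$, and $R$ is
$$\|\overrightarrow{PQ} \times \overrightarrow{PR}\| = \|6\mathbf{i} + 6\mathbf{k}\| = \sqrt{(6)^2 + (6)^2} = \sqrt{2 \cdot 36} = 6\sqrt{2}.$$
The triangle's area is half of this, or $3\sqrt{2}$.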
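The determinant formula expands to three component expressions, which makes it straightforward to check Examples 1.4.1 and 1.4.2 by machine. The following is a sketch under the usual component convention; the function names `cross`, `norm`, and `triangle_area` are illustrative.

```python
import math

def cross(u, v):
    """Cross product of two 3-dimensional vectors (cofactor expansion of the determinant)."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, u3 * v1 - u1 * v3, u1 * v2 - u2 * v1)

def norm(v):
    """Length of a vector."""
    return math.sqrt(sum(a * a for a in v))

def triangle_area(p, q, r):
    """Half the area of the parallelogram spanned by PQ and PR."""
    pq = tuple(b - a for a, b in zip(p, q))
    pr = tuple(b - a for a, b in zip(p, r))
    return 0.5 * norm(cross(pq, pr))

# Example 1.4.1
u, v = (2, 1, 1), (-4, 3, 1)
print(cross(u, v))   # (-2, -6, 10)
print(cross(v, u))   # (2, 6, -10)

# Example 1.4.2: the area should be 3 * sqrt(2)
print(triangle_area((1, -1, 0), (2, 1, -1), (-1, 1, 2)))
```

Swapping the arguments of `cross` flips every sign, which is property (4) of Theorem 1.4.2 in action.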


Triple scalar or box product


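The product $(\mathbf{u} \times \mathbf{v}) \cdot \mathbf{w}$ is called the triple scalar product of $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ (in that order). As you can see from the formula
$$|(\mathbf{u} \times \mathbf{v}) \cdot \mathbf{w}| = \|\mathbf{u} \times \mathbf{v}\| \, \|\mathbf{w}\| \, |\cos\theta|,$$
the absolute value of the product is the volume of the parallelepiped (parallelogram-sided box) determined by $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ (see the following figure). The number $\|\mathbf{u} \times \mathbf{v}\|$ is the area of the base parallelogram. The number $\|\mathbf{w}\| \, |\cos\theta|$ is the parallelepiped's height. Because of this geometry, $(\mathbf{u} \times \mathbf{v}) \cdot \mathbf{w}$ is also called the box product of $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$.

[Figure: parallelepiped determined by u, v, and w. Area of base = $\|\mathbf{u} \times \mathbf{v}\|$; Height = $\|\mathbf{w}\| \, |\cos\theta|$; Volume = area of base × height = $\|\mathbf{u} \times \mathbf{v}\| \, \|\mathbf{w}\| \, |\cos\theta|$.]

By treating the planes of $\mathbf{v}$ and $\mathbf{w}$ and of $\mathbf{w}$ and $\mathbf{u}$ as the base planes of the parallelepiped determined by $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$, we see that
$$(\mathbf{u} \times \mathbf{v}) \cdot \mathbf{w} = (\mathbf{v} \times \mathbf{w}) \cdot \mathbf{u} = (\mathbf{w} \times \mathbf{u}) \cdot \mathbf{v}.$$
Since the dot product is commutative, we also have
$$(\mathbf{u} \times \mathbf{v}) \cdot \mathbf{w} = \mathbf{u} \cdot (\mathbf{v} \times \mathbf{w}).$$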

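The triple scalar product can be evaluated as a determinant:

Theorem 1.4.5 (Calculating the Triple Scalar Product by Determinants) If $\mathbf{u} = u_1 \mathbf{i} + u_2 \mathbf{j} + u_3 \mathbf{k}$, $\mathbf{v} = v_1 \mathbf{i} + v_2 \mathbf{j} + v_3 \mathbf{k}$, and $\mathbf{w} = w_1 \mathbf{i} + w_2 \mathbf{j} + w_3 \mathbf{k}$, then
$$(\mathbf{u} \times \mathbf{v}) \cdot \mathbf{w} = \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix}.$$

Three vectors in 3-space are said to be coplanar if the parallelepiped they span has zero volume; if their tails coincide, three such vectors must lie in the same plane.

Theorem 1.4.6 (Three Coplanar Vectors)
$$\mathbf{u}, \mathbf{v}, \text{ and } \mathbf{w} \text{ are coplanar} \iff (\mathbf{u} \times \mathbf{v}) \cdot \mathbf{w} = 0 \iff \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix} = 0.$$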


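Example 1.4.3 (Triple scalar product) Find the volume of the box (parallelepiped) determined by $\mathbf{u} = \mathbf{i} + 2\mathbf{j} - \mathbf{k}$, $\mathbf{v} = -2\mathbf{i} + 3\mathbf{k}$, and $\mathbf{w} = 7\mathbf{j} - 4\mathbf{k}$.

Solution Using the rule for calculating determinants, we find
$$(\mathbf{u} \times \mathbf{v}) \cdot \mathbf{w} = \begin{vmatrix} 1 & 2 & -1 \\ -2 & 0 & 3 \\ 0 & 7 & -4 \end{vmatrix} = \begin{vmatrix} 0 & 3 \\ 7 & -4 \end{vmatrix} + 2 \begin{vmatrix} 2 & -1 \\ 7 & -4 \end{vmatrix} = -21 + 2(-8 + 7) = -23.$$

The volume is $|(\mathbf{u} \times \mathbf{v}) \cdot \mathbf{w}| = 23$ units cubed.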
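The box product and the coplanarity test of Theorem 1.4.6 can be sketched as follows, using the vectors of Example 1.4.3. The function names are illustrative, and the exact comparison with zero in `coplanar` assumes integer (or otherwise exact) components.

```python
def cross(u, v):
    """Cross product of two 3-dimensional vectors."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, u3 * v1 - u1 * v3, u1 * v2 - u2 * v1)

def dot(u, v):
    """Dot product of two vectors of the same dimension."""
    return sum(a * b for a, b in zip(u, v))

def box_product(u, v, w):
    """Triple scalar product (u x v) . w."""
    return dot(cross(u, v), w)

def coplanar(u, v, w):
    """True if the parallelepiped spanned by u, v, w has zero volume (exact arithmetic assumed)."""
    return box_product(u, v, w) == 0

# Example 1.4.3
u, v, w = (1, 2, -1), (-2, 0, 3), (0, 7, -4)
print(box_product(u, v, w))        # -23
print(abs(box_product(u, v, w)))   # volume: 23
print(box_product(v, w, u))        # cyclic permutation gives the same value
```

For floating-point input, a tolerance-based comparison would be a safer choice in `coplanar` than an exact test against zero.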

1.5 Equations of Lines

In this section we will take a look at the equation of a line in $\mathbb{R}^3$. As we saw in Example 1.1.2, the equation $y = mx + b$ does not describe a line in $\mathbb{R}^3$; instead it describes a surface, specifically a plane. This does not mean, however, that we cannot write down an equation for a line in three-dimensional space. To see how to do this, let us think about what we need to write down the equation of a line in $\mathbb{R}^2$. In two-dimensional space we need the slope (i.e., $m$) and a point on the line in order to write down the equation. In $\mathbb{R}^3$ that is still all that we need, except that in this case the slope will not be a simple number as it was in two dimensions. In this case we need to acknowledge that a line can have a three-dimensional slope. So, we need something that will allow us to describe a direction that is potentially in three dimensions. We already have a quantity that will do this for us. Vectors give directions and are three-dimensional objects.

So, let us start with the following information. Suppose that we know a point that is on the line, $P_0(x_0, y_0, z_0)$, and that $\mathbf{v} = \langle a, b, c \rangle$ is a vector which is parallel to the line. Note that $\mathbf{v}$ need not lie on the line itself. We only need $\mathbf{v}$ to be parallel to the line. Finally, let $P(x, y, z)$ be any point on the line. Now, since our slope is a vector, let us also turn the two points (i.e., $P_0$ and $P$) into vectors as well. Of course, we do not actually turn them into vectors; we instead use position vectors to represent them. So, let $\mathbf{r}_0$ and $\mathbf{r}$ be the position vectors for $P_0$ and $P$, respectively. Also, for no apparent reason, let us define $\mathbf{a}$ to be the vector with representation $\overrightarrow{P_0P}$. We now have the following sketch with all these vectors.

[Figure: the line through $P_0(x_0, y_0, z_0)$ and $P(x, y, z)$, with the position vector $\mathbf{r}_0$ and the vector $\mathbf{a}$ from $P_0$ to $P$.]


Now we can write $\mathbf{r}$ as a sum of two vectors by the parallelogram rule:
$$\mathbf{r} = \mathbf{r}_0 + \mathbf{a}.$$
Also notice that $\mathbf{a}$ and $\mathbf{v}$ are parallel. Therefore, there is a number $t$ such that $\mathbf{a} = t\mathbf{v}$. We then have
$$\mathbf{r} = \mathbf{r}_0 + t\mathbf{v} = \langle x_0, y_0, z_0 \rangle + t \langle a, b, c \rangle.$$
This is called the vector form of the equation of a line. The only unknown in this equation is the parameter $t$. Notice that $t\mathbf{v}$ is a vector that lies along the line, and it tells us how far from the original point we should move. If $t$ is positive we move away from the original point in the direction of $\mathbf{v}$, and if $t$ is negative we move in the opposite direction. As $t$ varies over all possible values we trace out the entire line. There are several other forms of the equation of the line. To get the first alternative form, let us start with the vector form and do a slight rewrite:
$$\mathbf{r} = \langle x_0, y_0, z_0 \rangle + t \langle a, b, c \rangle,$$
$$\langle x, y, z \rangle = \langle x_0 + at, \; y_0 + bt, \; z_0 + ct \rangle.$$

The only way for two vectors to be equal is for their components to be equal. In other words,
$$x = x_0 + at, \quad y = y_0 + bt, \quad z = z_0 + ct.$$

This set of equations is called the parametric form of the equation of a line. To get a point on the line, all we need to do is pick a value of $t$ and plug it into either form of the line. In the vector form of the line we get a position vector for the point, and in the parametric form we get the actual coordinates of the point. There is one more form of the line that we want to look at. If we assume that $a$, $b$, and $c$ are all nonzero numbers, we can solve each of the equations in the parametric form of the line for $t$. We can then set all of them equal to each other, since $t$ will be the same number in each. Doing this gives the following:
$$\frac{x - x_0}{a} = \frac{y - y_0}{b} = \frac{z - z_0}{c} \quad (= t).$$

These are called the symmetric equations of the line. If one of $a$, $b$, $c$ does happen to be zero, we can still write down the symmetric equations. To see this, let us suppose that $b = 0$. In this case $t$ does not appear in the parametric equation for $y$, and so we only solve the parametric equations for $x$ and $z$ for $t$. We can set those equal and acknowledge the parametric equation for $y$ as follows:
$$\frac{x - x_0}{a} = \frac{z - z_0}{c}, \quad y = y_0.$$

Example 1.5.1 (Equations of lines) Write down the equation of the line that passes through the points $(2, -1, 3)$ and $(1, 4, -3)$. Write down all three forms of the equation of the line.


Solution To do this we need a vector $\mathbf{v}$ that is parallel to the line. This can be any vector, as long as it is parallel to the line. In general, $\mathbf{v}$ will not lie on the line itself. However, in this case it will. All we need to do is let $\mathbf{v}$ be the vector that starts at the second point and ends at the first point. Hence,
$$\mathbf{v} = \langle 2, -1, 3 \rangle - \langle 1, 4, -3 \rangle = \langle 1, -5, 6 \rangle.$$
Note that the order of the points was chosen to reduce the number of minus signs in the vector. Once we have $\mathbf{v}$ there really is not anything else to do. To use the vector form we need a point on the line. From the given information we have two points, so we can use either one. Let us use the first point. The following is the vector form of the equation of the line:
$$\mathbf{r} = \langle 2, -1, 3 \rangle + t \langle 1, -5, 6 \rangle = \langle 2 + t, \; -1 - 5t, \; 3 + 6t \rangle.$$
Once we have this equation, the other two forms follow. Here are the parametric equations of the line:
$$x = 2 + t, \quad y = -1 - 5t, \quad z = 3 + 6t.$$
Here is the symmetric form:
$$\frac{x - 2}{1} = \frac{y + 1}{-5} = \frac{z - 3}{6}.$$
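The vector form $\mathbf{r} = \mathbf{r}_0 + t\mathbf{v}$ can be evaluated directly for any value of the parameter. The sketch below rebuilds the line of Example 1.5.1; the helper names `line_through` and `point_at` are assumptions made for the illustration.

```python
def line_through(p, q):
    """Return (r0, v) for the line through p and q, with v = p - q as in Example 1.5.1."""
    v = tuple(a - b for a, b in zip(p, q))
    return p, v

def point_at(r0, v, t):
    """Point r0 + t v on the line, computed componentwise (the parametric form)."""
    return tuple(a + t * b for a, b in zip(r0, v))

r0, v = line_through((2, -1, 3), (1, 4, -3))
print(v)                     # direction vector (1, -5, 6)
print(point_at(r0, v, 0))    # t = 0 gives the first point
print(point_at(r0, v, -1))   # t = -1 gives the second point
print(point_at(r0, v, 2))    # another point on the line
```

Because $\mathbf{v}$ points from the second given point to the first, the second point corresponds to $t = -1$ rather than $t = 1$.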

Example 1.5.2 (Equations of lines) Determine if the line that passes through the point $(0, -3, 8)$ and is parallel to the line given by
\[ x = 10 + 3t, \qquad y = 12t, \qquad z = -3 - t \]
passes through the $xz$-plane. If it does, give the coordinates of the point.

Solution To answer this we first need to write down the equation of the line. We know a point on the line and just need a parallel vector. We know that the new line must be parallel to the line given by the parametric equations in the problem statement. That means that any vector that is parallel to the given line must also be parallel to the new line. Now recall that in the parametric form of the line the numbers multiplied by $t$ are the components of a vector that is parallel to the line. Therefore, the vector $\mathbf{v} = \langle 3, 12, -1 \rangle$ is parallel to the given line and so must also be parallel to the new line. The equation of the new line is then
\[ \mathbf{r} = \langle 0, -3, 8 \rangle + t \langle 3, 12, -1 \rangle = \langle 3t, \, -3 + 12t, \, 8 - t \rangle. \]
If this line passes through the $xz$-plane, then the $y$-coordinate of that point must be zero. So, let us set the $y$-component of the equation equal to zero and see if we can solve for $t$. If we can, this gives the value of $t$ at which the line passes through the $xz$-plane.
\[ -3 + 12t = 0 \quad \Longrightarrow \quad t = \frac{1}{4}. \]
So, the line does pass through the $xz$-plane. To get the complete coordinates of the point, all we need to do is plug $t = 1/4$ into any of the equations. We will use the vector form.
\[ \mathbf{r} = \left\langle 3\left(\tfrac{1}{4}\right), \, -3 + 12\left(\tfrac{1}{4}\right), \, 8 - \tfrac{1}{4} \right\rangle = \left\langle \tfrac{3}{4}, \, 0, \, \tfrac{31}{4} \right\rangle. \]
Recall that this vector is the position vector for the point on the line, and so the coordinates of the point where the line passes through the $xz$-plane are $\left( \tfrac{3}{4}, 0, \tfrac{31}{4} \right)$. □


The distance from a point to a line in space


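To find the distance from a point $S$ to a line that passes through a point $P$ parallel to a vector $\mathbf{v}$, we find the absolute value of the scalar component of $\overrightarrow{PS}$ in the direction of a vector normal to the line (see the following figure). In the notation of the figure, the absolute value of the scalar component is $\|\overrightarrow{PS}\| \sin\theta$, which is $\|\overrightarrow{PS} \times \mathbf{v}\| / \|\mathbf{v}\|$.

[Figure: the line defined by $P$ and $\mathbf{v}$, the point $S$, the angle $\theta$ between $\overrightarrow{PS}$ and $\mathbf{v}$, and the distance $d = \|\overrightarrow{PS}\| \sin\theta$ that we want.]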

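Theorem 1.5.1

The distance from a point $S$ to a line through $P$ parallel to $\mathbf{v}$ is given by
\[ d = \frac{\|\overrightarrow{PS} \times \mathbf{v}\|}{\|\mathbf{v}\|}. \]

To see this, recall from the definition of the cross product in Section 1.4 (page 12) that, with $\mathbf{n}$ a unit vector,
\[ \overrightarrow{PS} \times \mathbf{v} = \left( \|\overrightarrow{PS}\| \, \|\mathbf{v}\| \sin\theta \right) \mathbf{n} \quad \Longrightarrow \quad \|\overrightarrow{PS} \times \mathbf{v}\| = \|\overrightarrow{PS}\| \, \|\mathbf{v}\| \, |\sin\theta|. \]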

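Example 1.5.3 (Distance from a point to a line) Find the distance from the point $S(1, 1, 5)$ to the line
\[ x = 1 + t, \qquad y = 3 - t, \qquad z = 2t. \]

Solution We see from the equations that the line passes through $P(1, 3, 0)$ parallel to $\mathbf{v} = \mathbf{i} - \mathbf{j} + 2\mathbf{k}$. With
\[ \overrightarrow{PS} = (1 - 1)\,\mathbf{i} + (1 - 3)\,\mathbf{j} + (5 - 0)\,\mathbf{k} = -2\mathbf{j} + 5\mathbf{k} \]
and
\[ \overrightarrow{PS} \times \mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 0 & -2 & 5 \\ 1 & -1 & 2 \end{vmatrix} = \mathbf{i} + 5\mathbf{j} + 2\mathbf{k}, \]
we obtain
\[ d = \frac{\|\overrightarrow{PS} \times \mathbf{v}\|}{\|\mathbf{v}\|} = \frac{\sqrt{1 + 25 + 4}}{\sqrt{1 + 1 + 4}} = \frac{\sqrt{30}}{\sqrt{6}} = \sqrt{5}. \]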
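As a numerical check of Theorem 1.5.1, the following Python sketch evaluates the distance formula for the data of Example 1.5.3. It is an illustration only; the function names `cross`, `norm` and `distance_point_to_line` are our own and do not come from the notes.

```python
import math


def cross(u, v):
    """Cross product u x v of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def norm(u):
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in u))


def distance_point_to_line(s, p, v):
    """Distance from S to the line through P parallel to v: |PS x v| / |v|."""
    ps = tuple(a - b for a, b in zip(s, p))
    return norm(cross(ps, v)) / norm(v)


d = distance_point_to_line((1, 1, 5), (1, 3, 0), (1, -1, 2))
print(d, math.sqrt(5))  # the two values should agree up to rounding
```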


1.6 Equations of Planes

In the first section of this chapter we saw some equations of planes. However, none of those equations had three variables in them, and they were basically extensions of graphs in two dimensions. We would like to have a more general equation for planes.

Let us start by assuming that we know a point on the plane, $P_0(x_0, y_0, z_0)$. In addition, we assume that we have a vector that is orthogonal (perpendicular) to the plane, $\mathbf{n} = \langle a, b, c \rangle$. This vector is called the normal vector. For an arbitrary point $P = (x, y, z)$ on this plane, we can find the position vectors $\mathbf{r}_0$ and $\mathbf{r}$ of $P_0$ and $P$, respectively. Here is the sketch of all these vectors.

[Figure: the plane through $P_0(x_0, y_0, z_0)$ and $P(x, y, z)$, with the position vectors $\mathbf{r}_0$ and $\mathbf{r}$, the vector $\mathbf{r} - \mathbf{r}_0$ lying in the plane, and the normal vector $\mathbf{n}$.]

Notice that the vector $\mathbf{r} - \mathbf{r}_0$ lies in the plane. Also notice that we drew the normal vector $\mathbf{n}$ starting on the plane; this need not be the case, and we put it there just for illustration. It is possible that the normal vector does not touch the plane.

Now, because $\mathbf{n}$ is orthogonal to the plane, it is also orthogonal to any vector that lies in the plane. In particular it is orthogonal to $\mathbf{r} - \mathbf{r}_0$. Recall that two orthogonal vectors have a dot product of zero. In other words,
\[ \mathbf{n} \cdot (\mathbf{r} - \mathbf{r}_0) = 0 \quad \Longrightarrow \quad \mathbf{n} \cdot \mathbf{r} = \mathbf{n} \cdot \mathbf{r}_0. \]
This is called the vector equation of the plane.

The vector equation of the plane is not a very useful equation in some ways. Let us get a much more useful form. Let us start with the first form of the vector equation.
\[ \langle a, b, c \rangle \cdot \left( \langle x, y, z \rangle - \langle x_0, y_0, z_0 \rangle \right) = 0, \]
\[ \langle a, b, c \rangle \cdot \langle x - x_0, \, y - y_0, \, z - z_0 \rangle = 0. \]
Now, actually compute the dot product,
\[ a(x - x_0) + b(y - y_0) + c(z - z_0) = 0. \]
This is called the scalar equation of the plane. Often this will be written as
\[ ax + by + cz = d, \]


where $d = ax_0 + by_0 + cz_0$. This second form is often how we are given equations of planes. Notice that if we are given the equation of a plane in this form we can quickly read off a normal vector for the plane. A normal vector is $\mathbf{n} = \langle a, b, c \rangle$.

Let us work a couple of examples.

Example 1.6.1 (Equations of planes) Determine the equation of the plane that contains the points $P = (1, -2, 0)$, $Q = (3, 1, 4)$, and $R = (0, -1, 2)$.

Solution In order to write down the equation of the plane we need a point (we have three here) and a normal vector. We need to find a normal vector. Recall that we saw how to do this in Section 1.4. We can form the following two vectors from the given points.
\[ \overrightarrow{PQ} = \langle 2, 3, 4 \rangle, \qquad \overrightarrow{PR} = \langle -1, 1, 2 \rangle. \]
These two vectors lie completely in the plane, since we formed them from points that are in the plane. Notice that there are many possible vectors to use; we just chose two of the possibilities. Now we know that the cross product of two vectors is orthogonal to both of these vectors. Since both of these are in the plane, any vector that is orthogonal to both of them will also be orthogonal to the plane. Therefore, we can use the cross product as the normal vector.
\[ \mathbf{n} = \overrightarrow{PQ} \times \overrightarrow{PR} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 2 & 3 & 4 \\ -1 & 1 & 2 \end{vmatrix} = 2\mathbf{i} - 8\mathbf{j} + 5\mathbf{k}. \]
The equation of the plane is then
\[ 2(x - 1) - 8(y + 2) + 5(z - 0) = 0, \quad \text{or} \quad 2x - 8y + 5z = 18. \]
We used $P$ for the point, but could have used any of the three points. □
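The recipe used in Example 1.6.1 (two vectors in the plane, their cross product as the normal, then $d = \mathbf{n} \cdot \mathbf{r}_0$) can be sketched in Python as follows. This is an illustrative sketch under our own naming; `plane_through` is a hypothetical helper, not something defined in the notes.

```python
def sub(p, q):
    """Vector from q to p."""
    return tuple(a - b for a, b in zip(p, q))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def plane_through(p, q, r):
    """Return (n, d) such that the plane through p, q, r is n . x = d."""
    n = cross(sub(q, p), sub(r, p))   # normal = PQ x PR
    return n, dot(n, p)               # d = n . r0 with r0 = p


P, Q, R = (1, -2, 0), (3, 1, 4), (0, -1, 2)
n, d = plane_through(P, Q, R)
print(n, d)
```

Since all three points must satisfy $\mathbf{n} \cdot \mathbf{x} = d$, evaluating `dot(n, Q)` and `dot(n, R)` gives a quick sanity check on the result.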

Example 1.6.2 (Equations of planes) Determine if the plane given by $-x + 2z = 10$ and the line given by $\mathbf{r} = \langle 5, \, 2 - t, \, 10 + 4t \rangle$ are orthogonal, parallel or neither.

Solution This problem is not as difficult as it may first appear. We can pick off a vector that is normal to the plane. This is $\mathbf{n} = \langle -1, 0, 2 \rangle$. We can also get a vector that is parallel to the line. This is $\mathbf{v} = \langle 0, -1, 4 \rangle$.

Now, if these two vectors are parallel then the line and the plane will be orthogonal. If you think about it, this makes some sense. If $\mathbf{n}$ and $\mathbf{v}$ are parallel, then $\mathbf{v}$ is orthogonal to the plane, but $\mathbf{v}$ is also parallel to the line. So, if the two vectors are parallel the line and plane will be orthogonal. Let us consider the following.
\[ \mathbf{n} \times \mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ -1 & 0 & 2 \\ 0 & -1 & 4 \end{vmatrix} = 2\mathbf{i} + 4\mathbf{j} + \mathbf{k} \neq \mathbf{0}. \]
So, the vectors are not parallel and so the plane and the line are not orthogonal.

Now, let us check to see if the plane and line are parallel. If the line is parallel to the plane then any vector parallel to the line will be orthogonal to the normal vector of the plane. In other words, if $\mathbf{n}$ and $\mathbf{v}$ are orthogonal then the line and the plane will be parallel. Let us check this.
\[ \mathbf{n} \cdot \mathbf{v} = 0 + 0 + 8 = 8 \neq 0. \]
The two vectors are not orthogonal and so the line and plane are not parallel. So, the line and the plane are neither orthogonal nor parallel. □
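The two tests used in Example 1.6.2 can be written as a small decision procedure. The sketch below is ours, not the notes'; it assumes nonzero vectors with exact (integer) components, so that comparing with zero is safe, and the function name `classify` is hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def classify(n, v):
    """Relation between a plane with normal n and a line with direction v."""
    if cross(n, v) == (0, 0, 0):   # n parallel to v: line orthogonal to plane
        return "orthogonal"
    if dot(n, v) == 0:             # n orthogonal to v: line parallel to plane
        return "parallel"
    return "neither"


print(classify((-1, 0, 2), (0, -1, 4)))  # data of Example 1.6.2
```

With floating-point components one would compare against a small tolerance instead of testing for exact zeros.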

Line of intersection
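Just as lines are parallel if and only if they have the same direction, two planes are parallel if and only if their normals are parallel, that is, $\mathbf{n}_1 = k\mathbf{n}_2$ for some scalar $k$. Two planes that are not parallel intersect in a line.

[Figure: Plane 1 and Plane 2 with normals $\mathbf{n}_1$ and $\mathbf{n}_2$, meeting in the line of intersection, which is parallel to $\mathbf{n}_1 \times \mathbf{n}_2$.]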

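Example 1.6.3 (Line of intersection) Find a vector parallel to the line of intersection of the planes $3x - 6y - 2z = 15$ and $2x + y - 2z = 5$.

Solution The line of intersection of two planes is perpendicular to both planes' normal vectors $\mathbf{n}_1$ and $\mathbf{n}_2$ and therefore parallel to $\mathbf{n}_1 \times \mathbf{n}_2$. Turning this around, $\mathbf{n}_1 \times \mathbf{n}_2$ is a vector parallel to the planes' line of intersection. In our case,
\[ \mathbf{n}_1 \times \mathbf{n}_2 = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 3 & -6 & -2 \\ 2 & 1 & -2 \end{vmatrix} = 14\mathbf{i} + 2\mathbf{j} + 15\mathbf{k}. \]
Any nonzero scalar multiple of $\mathbf{n}_1 \times \mathbf{n}_2$ will do as well. □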

Example 1.6.4 (Line of intersection) Find parametric equations for the line in which the planes $3x - 6y - 2z = 15$ and $2x + y - 2z = 5$ intersect.

Solution We find a vector parallel to the line and a point on the line. Example 1.6.3 identifies $\mathbf{v} = 14\mathbf{i} + 2\mathbf{j} + 15\mathbf{k}$ as a vector parallel to the line. To find a point on the line, we can take any point common to the two planes. Substituting $z = 0$ in the plane equations and solving for $x$ and $y$ simultaneously identifies one of these points as $(3, -1, 0)$. The line is
\[ x = 3 + 14t, \qquad y = -1 + 2t, \qquad z = 15t. \]

The choice $z = 0$ is arbitrary and we could have chosen $z = 1$ or $z = -1$ just as well. Or we could have let $x = 0$ and solved for $y$ and $z$. The different choices would simply give different parametrizations of the same line. □

Example 1.6.5 (Intersection of a line and a plane) Find the point where the line
\[ x = \frac{8}{3} + 2t, \qquad y = -2t, \qquad z = 1 + t \]
intersects the plane $3x + 2y + 6z = 6$.

Solution The point
\[ \left( \frac{8}{3} + 2t, \; -2t, \; 1 + t \right) \]
lies in the plane if its coordinates satisfy the equation of the plane, that is, if
\[
\begin{aligned}
3\left( \frac{8}{3} + 2t \right) + 2(-2t) + 6(1 + t) &= 6, \\
8 + 6t - 4t + 6 + 6t &= 6, \\
8t &= -8, \\
t &= -1.
\end{aligned}
\]
The point of intersection is
\[ \left( \frac{8}{3} + 2(-1), \; -2(-1), \; 1 + (-1) \right) = \left( \frac{2}{3}, \, 2, \, 0 \right). \] □
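The substitution carried out in Example 1.6.5 works for any line $\mathbf{r}_0 + t\mathbf{v}$ and any plane $\mathbf{n} \cdot \mathbf{x} = d$: solving $\mathbf{n} \cdot (\mathbf{r}_0 + t\mathbf{v}) = d$ gives $t = (d - \mathbf{n} \cdot \mathbf{r}_0)/(\mathbf{n} \cdot \mathbf{v})$ whenever $\mathbf{n} \cdot \mathbf{v} \neq 0$. Here is a hedged Python sketch of that step using exact fractions; the name `intersect_line_plane` is our own.

```python
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def intersect_line_plane(r0, v, n, d):
    """Return (t, point) where r0 + t*v meets n . x = d, or None if n . v = 0."""
    denom = dot(n, v)
    if denom == 0:
        return None                      # line is parallel to the plane
    t = Fraction(d - dot(n, r0)) / denom
    return t, tuple(a + t * b for a, b in zip(r0, v))


r0 = (Fraction(8, 3), 0, 1)              # point on the line at t = 0
t, point = intersect_line_plane(r0, (2, -2, 1), (3, 2, 6), 6)
print(t, point)
```

When `None` is returned, the line either misses the plane entirely or lies inside it; the sketch does not distinguish the two cases.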

The distance from a point to a plane


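If $P$ is a point on a plane with normal $\mathbf{n}$, then the distance from any point $S$ to the plane is the length of the vector projection of $\overrightarrow{PS}$ onto $\mathbf{n}$. That is, the distance from $S$ to the plane is
\[ d = \left| \overrightarrow{PS} \cdot \frac{\mathbf{n}}{\|\mathbf{n}\|} \right|, \]
where $\mathbf{n} = A\mathbf{i} + B\mathbf{j} + C\mathbf{k}$ is normal to the plane.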

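Example 1.6.6 (Distance from a point to a plane) Find the distance from $S(1, 1, 3)$ to the plane $3x + 2y + 6z = 6$.

Solution We find a point $P$ in the plane and calculate the length of the vector projection of $\overrightarrow{PS}$ onto a vector $\mathbf{n}$ normal to the plane (see the following figure). The coefficients in the equation $3x + 2y + 6z = 6$ give $\mathbf{n} = 3\mathbf{i} + 2\mathbf{j} + 6\mathbf{k}$.

[Figure: the plane $3x + 2y + 6z = 6$ with intercepts $(2, 0, 0)$, $P(0, 3, 0)$ and $(0, 0, 1)$, the normal $\mathbf{n} = 3\mathbf{i} + 2\mathbf{j} + 6\mathbf{k}$, the point $S(1, 1, 3)$, and the distance from $S$ to the plane.]

The points on the plane easiest to find from the plane's equation are the intercepts. If we take $P$ to be the $y$-intercept $(0, 3, 0)$, then
\[
\begin{aligned}
\overrightarrow{PS} &= (1 - 0)\,\mathbf{i} + (1 - 3)\,\mathbf{j} + (3 - 0)\,\mathbf{k} = \mathbf{i} - 2\mathbf{j} + 3\mathbf{k}, \\
\|\mathbf{n}\| &= \sqrt{(3)^2 + (2)^2 + (6)^2} = \sqrt{49} = 7.
\end{aligned}
\]
The distance from $S$ to the plane is the length of $\operatorname{proj}_{\mathbf{n}} \overrightarrow{PS}$, or
\[
\begin{aligned}
d &= \left| \overrightarrow{PS} \cdot \frac{\mathbf{n}}{\|\mathbf{n}\|} \right| = \left| (\mathbf{i} - 2\mathbf{j} + 3\mathbf{k}) \cdot \left( \tfrac{3}{7}\mathbf{i} + \tfrac{2}{7}\mathbf{j} + \tfrac{6}{7}\mathbf{k} \right) \right| \\
  &= \left| \tfrac{3}{7} - \tfrac{4}{7} + \tfrac{18}{7} \right| = \frac{17}{7}.
\end{aligned}
\] □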
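The distance formula above is easy to evaluate mechanically. The following Python sketch reproduces the numbers of Example 1.6.6; it is an illustration only, and `distance_point_to_plane` is a name we introduce here.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def distance_point_to_plane(s, p, n):
    """|PS . n| / |n| for a point S, a point P in the plane, and a normal n."""
    ps = tuple(a - b for a, b in zip(s, p))
    return abs(dot(ps, n)) / math.sqrt(dot(n, n))


d = distance_point_to_plane((1, 1, 3), (0, 3, 0), (3, 2, 6))
print(d, 17 / 7)  # the two values should agree up to rounding
```

Choosing a different point of the plane for $P$, for instance the intercept $(2, 0, 0)$, gives the same distance, since only the component of $\overrightarrow{PS}$ along $\mathbf{n}$ matters.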

Angles between planes


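The angle between two intersecting planes is defined to be the (acute) angle between their normal vectors.

Example 1.6.7 (Angle between planes) Find the angle between the planes $3x - 6y - 2z = 15$ and $2x + y - 2z = 5$.

Solution The vectors
\[ \mathbf{n}_1 = 3\mathbf{i} - 6\mathbf{j} - 2\mathbf{k}, \qquad \mathbf{n}_2 = 2\mathbf{i} + \mathbf{j} - 2\mathbf{k} \]
are normals to the planes. Here $\mathbf{n}_1 \cdot \mathbf{n}_2 = 6 - 6 + 4 = 4$, $\|\mathbf{n}_1\| = 7$ and $\|\mathbf{n}_2\| = 3$, so the angle between them is
\[ \theta = \cos^{-1}\left( \frac{\mathbf{n}_1 \cdot \mathbf{n}_2}{\|\mathbf{n}_1\| \, \|\mathbf{n}_2\|} \right) = \cos^{-1}\left( \frac{4}{21} \right) \approx 1.38 \text{ radians}. \] □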
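The value of the inverse cosine in Example 1.6.7 can be confirmed numerically. The sketch below is ours; it takes the absolute value of the dot product so that the returned angle is always the acute one, which is an assumption consistent with the definition above. The name `angle_between_planes` is hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def angle_between_planes(n1, n2):
    """Acute angle (in radians) between planes with normals n1 and n2."""
    cosine = abs(dot(n1, n2)) / math.sqrt(dot(n1, n1) * dot(n2, n2))
    return math.acos(cosine)


theta = angle_between_planes((3, -6, -2), (2, 1, -2))
print(theta, math.degrees(theta))
```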

1.7 Quadric Surfaces

In the previous two sections we looked at lines and planes in three dimensions (or $\mathbb{R}^3$). While these are used quite heavily in calculus, there are many other surfaces that are also used fairly regularly, so we need to take a look at those. In this section we look at quadric surfaces. Quadric surfaces are the graphs of any equation that can be put into the general form
\[ Ax^2 + By^2 + Cz^2 + Dxy + Exz + Fyz + Gx + Hy + Iz + J = 0, \]
where $A, B, \ldots, J$ are all constants. There is no way that we can possibly list all of them, but there are some standard equations, so here is a list of some of the more common quadric surfaces.

Ellipsoid
Here is the general equation of an ellipsoid,
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1. \]
The following is the sketch of a typical ellipsoid.

[Figure: a typical ellipsoid.]

If $a = b = c$ then we have a sphere. Notice that we only gave the equation for the ellipsoid that is centered on the origin. Clearly ellipsoids don't have to be centered on the origin. However, in order to make the discussion in this section a little easier we have chosen to concentrate on surfaces that are centered on the origin in one way or another.


Cone
Here is the general equation of a cone,
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = \frac{z^2}{c^2}. \]
The following is the sketch of a typical cone.

[Figure: a typical cone opening along the $z$-axis.]

Note that this is the equation of a cone that opens along the $z$-axis. To get the equation of a cone that opens along one of the other axes, all we need to do is make a slight modification of the equation. This will be the case for the rest of the surfaces that we look at in this section as well. In the case of a cone, the variable that sits by itself on one side of the equal sign determines the axis that the cone opens up along. For instance, a cone that opens up along the $x$-axis has the equation
\[ \frac{y^2}{b^2} + \frac{z^2}{c^2} = \frac{x^2}{a^2}. \]
For most of the following surfaces we will not give the other possible formulae. We will, however, indicate how each formula needs to be changed to get a change of orientation for the surface.

Cylinder
Here is the general equation of a cylinder,
\[ x^2 + y^2 = r^2. \]
The following is the sketch of a typical cylinder.

[Figure: a typical cylinder with axis along the $z$-axis.]

The cylinder will be centered on the axis corresponding to the variable that does not appear in the equation. Be careful not to confuse this with a circle. In two dimensions it is a circle, but in three dimensions it is a cylinder.

Hyperboloid of one sheet


Here is the general equation of a hyperboloid of one sheet,
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1. \]
The following is the sketch of a typical hyperboloid of one sheet.

[Figure: a typical hyperboloid of one sheet centered along the $z$-axis.]

The variable with the negative sign in front of it gives the axis along which the graph is centered.

Hyperboloid of two sheets


Here is the general equation of a hyperboloid of two sheets,
\[ -\frac{x^2}{a^2} - \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1. \]
The following is the sketch of a typical hyperboloid of two sheets.

[Figure: a typical hyperboloid of two sheets centered along the $z$-axis.]

The variable with the positive sign in front of it gives the axis along which the graph is centered. Notice that the only difference between the hyperboloid of one sheet and the hyperboloid of two sheets is the signs in front of the variables. They are exactly the opposite signs.

Elliptic paraboloid
Here is the general equation of an elliptic paraboloid,
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = \frac{z}{c}. \]
The following is the sketch of a typical elliptic paraboloid.

[Figure: a typical elliptic paraboloid opening along the $z$-axis.]

In this case the variable that is not squared determines the axis along which the paraboloid opens. Also, the sign of $c$ determines the direction in which the paraboloid opens. If $c$ is positive then it opens up, and if $c$ is negative then it opens down.

Hyperbolic paraboloid
Here is the general equation of a hyperbolic paraboloid,
\[ \frac{x^2}{a^2} - \frac{y^2}{b^2} = \frac{z}{c}. \]
The following is the sketch of a typical hyperbolic paraboloid.

[Figure: a typical hyperbolic paraboloid, shown for $c$ positive.]

As with the elliptic paraboloid, the sign of $c$ determines the direction in which the surface opens up. The graph above is shown for $c$ positive. With both of the paraboloids the surface can easily be moved up (resp. down) by adding a constant to (resp. subtracting a constant from) the expression for $z$. For instance, $z = -x^2 - y^2 + 6$ is an elliptic paraboloid that opens downward and starts at $z = 6$ instead of $z = 0$. Can you make a sketch of it?


Chapter 2

Vector-Valued Functions
2.1 Vector Functions

We have seen some examples of graphing surfaces in $\mathbb{R}^3$. However, as we saw with lines, not every graph in $\mathbb{R}^3$ needs to be a surface. We can graph curves (sometimes called space curves) that are three-dimensional as well. To do this we may use vector-valued functions, or vector functions for short. Actually, we can also use vector functions to represent surfaces. However, in this section we will focus on curves instead of surfaces.

The vector form of the equation for a line is a good example of a vector function. For instance, $\mathbf{r}(t) = \mathbf{r}_0 + t\mathbf{v}$. Vector functions take real numbers (or scalars) as arguments, $t$ in this case, and return vectors that are the position vectors for points on the curve (or surface). The general form of a three-dimensional vector function for a curve is
\[ \mathbf{r}(t) = \langle f(t), g(t), h(t) \rangle, \quad \text{or} \quad \mathbf{r}(t) = f(t)\,\mathbf{i} + g(t)\,\mathbf{j} + h(t)\,\mathbf{k}, \]
where $f(t)$, $g(t)$, and $h(t)$ are sometimes called the component functions. In the following figure the moving point $P(x, y, z) = (f(t), g(t), h(t))$ (with $\mathbf{r}(t) = \overrightarrow{OP}$ being its position vector) traces out the curve in space. We say that the equations $x = f(t)$, $y = g(t)$, $z = h(t)$ parametrize the curve. Each fixed value of $t$ determines one point on the curve. Here $\mathbf{r}(t)$ is a vector function, whereas the components of $\mathbf{r}$ are scalar functions of $t$.

[Figure: a space curve traced by the point $P(f(t), g(t), h(t))$, with its position vector $\mathbf{r}$ drawn from the origin $O$.]


The domain of a vector function is the set of all $t$ for which all the component functions are defined.

Example 2.1.1 (Vector functions) Determine the domain of the following vector function
\[ \mathbf{r}(t) = \left\langle \cos t, \; \ln(4 - t), \; \sqrt{t + 1} \right\rangle. \]

Solution The first component is defined for all $t$. The second component is only defined for $t < 4$. The third component is only defined for $t \geq -1$. Putting all of these together gives the domain $[-1, 4)$. This is the largest possible interval on which all three component functions are defined. □

We now need to think about how to get the graph of a space curve from a vector function. There are two ways to do this. The first is to think of the graph as the set of points $(x, y, z)$ where
\[ x = f(t), \qquad y = g(t), \qquad z = h(t). \]
Note that these are also the parametric equations for the curve. We have seen parametric equations before; the only difference is that we are now working with them in three dimensions instead of the two dimensions that we used last time.

The second way to interpret the graph is to think of $\mathbf{r}(t) = \langle f(t), g(t), h(t) \rangle$ as the position vector of the point $(f(t), g(t), h(t))$.

Either of these two ways of thinking about the graph works and is useful. We will mostly use the first way of thinking of the graph of a vector function. Let us take a look at a couple of graphs of vector functions.

Example 2.1.2 (Graph of vector functions) Sketch the graph of the following vector function
\[ \mathbf{r}(t) = \langle 2 - 4t, \; 1 + 2t, \; 3 - t \rangle. \]

Solution Notice that the graph of the given vector function is just a line. It might be better if we rewrite it a little,
\[ \mathbf{r}(t) = \langle 2, 1, 3 \rangle + t \langle -4, 2, -1 \rangle. \]
In this form we can see that this is the equation of a line that passes through the point $(2, 1, 3)$ and is parallel to the vector $\mathbf{v} = \langle -4, 2, -1 \rangle$. To graph this line, all that we need to do is plot the point and then sketch in the parallel vector. In order to get the sketch we assume that the vector is on the line and starts at the point on the line. To sketch in the line, all we need to do is extend the parallel vector into a line.

Here is the sketch.

[Figure: the line through the point $(2, 1, 3)$ parallel to the vector $\mathbf{v} = \langle -4, 2, -1 \rangle$.]

Example 2.1.3 (Graph of vector functions) Sketch the graph of the following vector function
\[ \mathbf{r}(t) = \langle 2\cos t, \; 2\sin t, \; 5 \rangle. \]

Solution In order to sketch the graph, let us first get the parametric equations for the curve.
\[ x = 2\cos t, \qquad y = 2\sin t, \qquad z = 5. \]
If we ignore the equation for $z$, we recall that the parametric equations for $x$ and $y$ describe a circle of radius 2 in the plane, which in $\mathbb{R}^3$ gives a circular cylinder of radius 2 with axis along the $z$-axis. Now, the equation $z = 5$ tells us that, no matter what else is going on in the graph, every $z$-coordinate must be 5. So, we get a circle of radius 2 centered on the $z$-axis and at the level $z = 5$. The following is a sketch.

[Figure: the circle of radius 2 centered on the $z$-axis at the level $z = 5$.]

Note that it is very easy to modify the above vector function to get a circle centered on the $x$- or $y$-axis as well. For instance, $\mathbf{r}(t) = \langle 10\sin t, \; 3, \; 10\cos t \rangle$ is a circle of radius 10 centered on the $y$-axis and at $y = 3$. In other words, as long as two of the terms are a sine and a cosine (with the same coefficient) and the other is a fixed number, then we will have a circle that is centered on the axis that is given by the fixed number.

Example 2.1.4 (Graph of vector functions) Sketch the graph of the following vector function
\[ \mathbf{r}(t) = \langle 4\cos t, \; 4\sin t, \; t \rangle. \]

Solution If this one had a constant in the $z$ component we would have another circle. However, in this case we don't have a constant. Instead we have a $t$, and that will change the curve. However, because the $x$ and $y$ component functions are still the parametric equations of a circle, our curve should have a circular nature to it in some way. In fact, the only change is in the $z$ component: as $t$ increases, the $z$-coordinate increases. Also, as $t$ increases the $x$ and $y$ coordinates continue to form a circle centered on the $z$-axis. Putting these two ideas together tells us that, as we increase $t$, the circle that is being traced out in the $x$ and $y$ directions should also be rising. With its spiral-like graph the curve is named a helix.

[Figure: the helix $\mathbf{r}(t) = \langle 4\cos t, 4\sin t, t \rangle$ winding around the $z$-axis.]

As with circles, the component that has the $t$ determines the axis that the helix rotates about. For instance, the vector function $\mathbf{r}(t) = \langle t, \; 6\cos t, \; 6\sin t \rangle$ represents a helix that rotates around the $x$-axis. Also note that if we allow the coefficients on the sine and cosine for both the circle and helix to be different we will get ellipses. For example, $\mathbf{r}(t) = \langle 9\cos t, \; t, \; 2\sin t \rangle$ is a helix that rotates about the $y$-axis and is in the shape of an ellipse.
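To get a feeling for the helix of Example 2.1.4 without a picture, one can tabulate a few points of the curve. The following Python sketch (our own illustration, with the hypothetical name `helix`) samples $\mathbf{r}(t) = \langle 4\cos t, 4\sin t, t \rangle$ over one turn and shows that every sample stays on the cylinder $x^2 + y^2 = 16$ while $z$ rises.

```python
import math


def helix(t):
    """Position vector components of r(t) = <4 cos t, 4 sin t, t>."""
    return (4 * math.cos(t), 4 * math.sin(t), t)


# Sample one full turn, 0 <= t <= 2*pi, in steps of pi/4.
samples = [helix(k * math.pi / 4) for k in range(9)]

for x, y, z in samples:
    # x^2 + y^2 stays (numerically) equal to 16 while z increases with t.
    print(round(x, 3), round(y, 3), round(z, 3), round(x * x + y * y, 6))
```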


Example 2.1.5 (Vector equation for a line segment) Determine the vector equation for the line segment starting at the point $P = (x_1, y_1, z_1)$ and ending at the point $Q = (x_2, y_2, z_2)$.

Solution It is important to note here that we only want to find the equation of the line segment that starts at $P$ and ends at $Q$. We don't want any other portion of the line, and we do want the direction of the line segment preserved as we increase $t$. Let us not worry about that yet and just find the vector equation of the line that passes through the two points. Once we have this we will be able to get what we are after.

So, we need a point on the line. Two points are given by the question and we will use $P$. We need a vector that is parallel to the line, and since we have two points we can find the vector between them. This vector will lie on the line and hence be parallel to the line. Also, remember that we want to preserve the starting and ending points of the line segment, so let us construct the vector using the same orientation.
\[ \mathbf{v} = \langle x_2 - x_1, \; y_2 - y_1, \; z_2 - z_1 \rangle. \]
Using this vector and the point $P$ we get the following vector equation of the line.
\[ \mathbf{r}(t) = \langle x_1, y_1, z_1 \rangle + t \langle x_2 - x_1, \; y_2 - y_1, \; z_2 - z_1 \rangle. \]
While this is the vector equation of the line, let us rewrite the equation slightly.
\[
\begin{aligned}
\mathbf{r}(t) &= \langle x_1, y_1, z_1 \rangle + t \langle x_2, y_2, z_2 \rangle - t \langle x_1, y_1, z_1 \rangle \\
              &= (1 - t) \langle x_1, y_1, z_1 \rangle + t \langle x_2, y_2, z_2 \rangle.
\end{aligned}
\]
This is the equation of the line that contains the points $P$ and $Q$. We of course just want the line segment that starts at $P$ and ends at $Q$. We can get this by simply restricting the values of $t$. Notice that
\[ \mathbf{r}(0) = \langle x_1, y_1, z_1 \rangle, \qquad \mathbf{r}(1) = \langle x_2, y_2, z_2 \rangle. \]
So, if we restrict $t$ to be between zero and one we will cover the line segment, and we will start and end at the corresponding points. So the vector equation of the line segment that starts at $P = (x_1, y_1, z_1)$ and ends at $Q = (x_2, y_2, z_2)$ is
\[ \mathbf{r}(t) = (1 - t) \langle x_1, y_1, z_1 \rangle + t \langle x_2, y_2, z_2 \rangle, \qquad 0 \leq t \leq 1. \] □
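The segment formula $\mathbf{r}(t) = (1 - t)P + tQ$ with $0 \leq t \leq 1$ translates directly into code. Below is a minimal Python sketch; the function name `segment` and the sample points are assumptions chosen only for illustration.

```python
def segment(p, q):
    """Return r(t) = (1 - t) p + t q, defined only for 0 <= t <= 1."""
    def r(t):
        if not 0 <= t <= 1:
            raise ValueError("t must lie in [0, 1] to stay on the segment")
        return tuple((1 - t) * a + t * b for a, b in zip(p, q))
    return r


P, Q = (1, 2, 3), (3, 6, 7)      # hypothetical endpoints
r = segment(P, Q)

print(r(0))     # starting point P
print(r(0.5))   # midpoint of the segment
print(r(1))     # ending point Q
```

Rejecting values of `t` outside $[0, 1]$ is what distinguishes the segment from the full line through $P$ and $Q$.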

As noted at the beginning of this section, we can also use vector functions for surfaces. So, to make sure that we don't forget that, let us work an example with that as well.

2.2 Calculus with Vector Functions

In this section we briefly discuss limits, derivatives and integrals of vector functions. As you will see, these behave in a fairly predictable manner. We will do all of the work in $\mathbb{R}^3$, but the results in this section extend naturally to $\mathbb{R}^n$ (i.e., $n$-dimensional space).

Limits. Let us start with limits. Here is the limit of a vector function.
\[
\begin{aligned}
\lim_{t \to a} \mathbf{r}(t) &= \lim_{t \to a} \langle f(t), g(t), h(t) \rangle \\
                             &= \left\langle \lim_{t \to a} f(t), \; \lim_{t \to a} g(t), \; \lim_{t \to a} h(t) \right\rangle,
\end{aligned}
\]
or
\[ \lim_{t \to a} \mathbf{r}(t) = \lim_{t \to a} f(t)\,\mathbf{i} + \lim_{t \to a} g(t)\,\mathbf{j} + \lim_{t \to a} h(t)\,\mathbf{k}. \]
So all we need to do is take the limit of each of the component functions and leave the result as a vector.


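Example 2.2.1 (Limit of vector functions) Compute $\lim_{t \to 1} \mathbf{r}(t)$ where
\[ \mathbf{r}(t) = \left\langle t^3, \; \frac{\sin(3t - 3)}{t - 1}, \; e^{2t} \right\rangle. \]

Solution Taking the limit of the vector function is equivalent to taking the limits of the three component scalar functions.
\[
\begin{aligned}
\lim_{t \to 1} \mathbf{r}(t) &= \left\langle \lim_{t \to 1} t^3, \; \lim_{t \to 1} \frac{\sin(3t - 3)}{t - 1}, \; \lim_{t \to 1} e^{2t} \right\rangle \\
                             &= \left\langle \lim_{t \to 1} t^3, \; \lim_{t \to 1} \frac{3\cos(3t - 3)}{1}, \; \lim_{t \to 1} e^{2t} \right\rangle = \langle 1, \; 3, \; e^2 \rangle.
\end{aligned}
\]
Notice that we used l'Hôpital's Rule to find the limit of the second component function. □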
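A limit such as the one in Example 2.2.1 can be made plausible numerically by evaluating $\mathbf{r}(t)$ at values of $t$ close to 1. The Python sketch below does this; it is only a sanity check under our own naming, not a proof, and the step sizes are arbitrary choices.

```python
import math


def r(t):
    """r(t) = <t^3, sin(3t - 3)/(t - 1), e^(2t)>; the formula breaks down at t = 1."""
    return (t ** 3, math.sin(3 * t - 3) / (t - 1), math.exp(2 * t))


limit = (1.0, 3.0, math.exp(2))   # value obtained in the example

for h in (1e-2, 1e-4, 1e-6):
    # Approach t = 1 from both sides and compare with the claimed limit.
    print(h, r(1 - h), r(1 + h))
print("claimed limit:", limit)
```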

Continuity. We define continuity for vector functions the same way we define continuity for scalar functions. A vector function $\mathbf{r}(t)$ is continuous at a point $t = t_0$ in its domain if
\[ \lim_{t \to t_0} \mathbf{r}(t) = \mathbf{r}(t_0). \]
The function is continuous if it is continuous at every point in its domain. In particular, $\mathbf{r}(t)$ is continuous at $t = t_0$ if and only if each component function is continuous there.

Derivatives. Now let us take care of derivatives. After seeing how limits work, it should not be surprising that we have the following for derivatives.
\[ \mathbf{r}'(t) = \langle f'(t), g'(t), h'(t) \rangle = f'(t)\,\mathbf{i} + g'(t)\,\mathbf{j} + h'(t)\,\mathbf{k}. \]

Example 2.2.2 (Derivative of vector functions) Compute $\mathbf{r}'(t)$ where $\mathbf{r}(t) = t^6\,\mathbf{i} + \sin 2t\,\mathbf{j} - \ln(t + 1)\,\mathbf{k}$.

Solution The calculation amounts to taking the derivative of each component independently.
\[ \mathbf{r}'(t) = 6t^5\,\mathbf{i} + 2\cos 2t\,\mathbf{j} - \frac{1}{t + 1}\,\mathbf{k}. \] □
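The componentwise derivative of Example 2.2.2 can be compared with a finite-difference approximation. The following Python sketch is a rough numerical check under assumptions of our own (the function names and the step size `h` are illustrative choices).

```python
import math


def r(t):
    """r(t) = <t^6, sin 2t, -ln(t + 1)>, defined for t > -1."""
    return (t ** 6, math.sin(2 * t), -math.log(t + 1))


def r_prime(t):
    """Derivative from the example: <6 t^5, 2 cos 2t, -1/(t + 1)>."""
    return (6 * t ** 5, 2 * math.cos(2 * t), -1 / (t + 1))


def central_difference(f, t, h=1e-5):
    """Componentwise approximation (f(t + h) - f(t - h)) / (2h)."""
    forward, backward = f(t + h), f(t - h)
    return tuple((a - b) / (2 * h) for a, b in zip(forward, backward))


for t in (0.5, 1.2):
    print(t, r_prime(t), central_difference(r, t))
```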

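Most of the basic facts that we know about derivatives still hold. However, just to make it clear, here are some facts about derivatives of vector functions.

1. $\dfrac{d}{dt}(\mathbf{u} + \mathbf{v}) = \mathbf{u}' + \mathbf{v}'$.
2. $\dfrac{d}{dt}(c\,\mathbf{u}) = c\,\mathbf{u}'$.
3. $\dfrac{d}{dt}\big(f(t)\,\mathbf{u}(t)\big) = f'(t)\,\mathbf{u}(t) + f(t)\,\mathbf{u}'(t)$.
4. $\dfrac{d}{dt}(\mathbf{u} \cdot \mathbf{v}) = \mathbf{u}' \cdot \mathbf{v} + \mathbf{u} \cdot \mathbf{v}'$.
5. $\dfrac{d}{dt}(\mathbf{u} \times \mathbf{v}) = \mathbf{u}' \times \mathbf{v} + \mathbf{u} \times \mathbf{v}'$.
6. $\dfrac{d}{dt}\big(\mathbf{u}(f(t))\big) = \mathbf{u}'(f(t))\,f'(t)$.

There is also one quick definition that we should get out of the way so that we can use it when we need to.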


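A smooth curve is any curve for which $\mathbf{r}'(t)$ is continuous and $\mathbf{r}'(t) \neq \mathbf{0}$ for any $t$. A helix is a smooth curve, for example.

Integrations. Finally, we need to discuss integrals of vector functions. Using both limits and derivatives as a guide, it should be natural that we also have the following for indefinite integrals:
\[ \int \mathbf{r}(t)\,dt = \left\langle \int f(t)\,dt, \; \int g(t)\,dt, \; \int h(t)\,dt \right\rangle + \mathbf{c}, \]
\[ \int \mathbf{r}(t)\,dt = \int f(t)\,dt\;\mathbf{i} + \int g(t)\,dt\;\mathbf{j} + \int h(t)\,dt\;\mathbf{k} + \mathbf{c}, \]
and the following for the definite integrals:
\[ \int_a^b \mathbf{r}(t)\,dt = \left\langle \int_a^b f(t)\,dt, \; \int_a^b g(t)\,dt, \; \int_a^b h(t)\,dt \right\rangle, \]
\[ \int_a^b \mathbf{r}(t)\,dt = \int_a^b f(t)\,dt\;\mathbf{i} + \int_a^b g(t)\,dt\;\mathbf{j} + \int_a^b h(t)\,dt\;\mathbf{k}. \]
With the indefinite integrals we write the constant of integration as $\mathbf{c}$ to make it clear that the constant in this case needs to be a vector instead of a regular constant.

Also, for the definite integrals we will sometimes write it as follows,
\[ \int_a^b \mathbf{r}(t)\,dt = \left. \left( \left\langle \int f(t)\,dt, \; \int g(t)\,dt, \; \int h(t)\,dt \right\rangle \right) \right|_a^b, \]
\[ \int_a^b \mathbf{r}(t)\,dt = \left. \left( \int f(t)\,dt\;\mathbf{i} + \int g(t)\,dt\;\mathbf{j} + \int h(t)\,dt\;\mathbf{k} \right) \right|_a^b. \]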

In other words, we will do the indefinite integral and then do the evaluation of the vector as a whole instead of on a component-by-component basis.

Example 2.2.3 (Indefinite integration of vector functions) Compute $\int \mathbf{r}(t)\,dt$ for $\mathbf{r}(t) = \langle \sin t, \; 6, \; 4t \rangle$.

Solution All we need to do is integrate each of the components and we are done.
\[ \int \mathbf{r}(t)\,dt = \langle -\cos t, \; 6t, \; 2t^2 \rangle + \mathbf{c}, \]
where $\mathbf{c}$ is the constant (vector) of integration. □

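Example 2.2.4 (Definite integration of vector functions) Compute $\int_0^1 \mathbf{r}(t)\,dt$ for $\mathbf{r}(t) = \langle \sin t, \; 6, \; 4t \rangle$.

Solution In this case all that we need to do is apply the result from the previous example and then do the evaluation.
\[
\begin{aligned}
\int_0^1 \mathbf{r}(t)\,dt &= \Big( \langle -\cos t, \; 6t, \; 2t^2 \rangle \Big) \Big|_0^1 \\
                           &= \langle -\cos 1, \; 6, \; 2 \rangle - \langle -1, \; 0, \; 0 \rangle \\
                           &= \langle 1 - \cos 1, \; 6, \; 2 \rangle.
\end{aligned}
\] □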
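Since a definite integral of a vector function is computed component by component, any one-dimensional quadrature rule can be used to check Example 2.2.4. The sketch below uses composite Simpson's rule, which is our own choice of method and not one discussed in these notes; the name `simpson` and the number of subintervals are likewise assumptions.

```python
import math


def simpson(f, a, b, n=100):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


# Component functions of r(t) = <sin t, 6, 4t>.
components = (math.sin, lambda t: 6.0, lambda t: 4.0 * t)

numeric = tuple(simpson(f, 0.0, 1.0) for f in components)
exact = (1 - math.cos(1), 6.0, 2.0)   # result of the example

print(numeric)
print(exact)
```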

2.3 Arc Length in Space

In this section we revisit an old formula in the setting of vector functions. We want to determine the length of the curve traced by a vector function
\[ \mathbf{r}(t) = \langle f(t), g(t), h(t) \rangle \]
on the interval $a \leq t \leq b$. We actually know how to do this. Recall that we can write the vector function in the parametric form
\[ x = f(t), \qquad y = g(t), \qquad z = h(t). \]
Also, recall that with two-dimensional parametric curves the arc length is given by
\[ L = \int_a^b \sqrt{[f'(t)]^2 + [g'(t)]^2}\,dt. \]
There is a natural extension of this to three dimensions. So, the length of the curve $\mathbf{r}(t)$ on the interval $a \leq t \leq b$ is
\[ L = \int_a^b \sqrt{[f'(t)]^2 + [g'(t)]^2 + [h'(t)]^2}\,dt. \]
There is a nice simplification that we can make for this. Notice that the integrand (the function we are integrating) is nothing more than the magnitude of the tangent vector, i.e.,
\[ \|\mathbf{r}'(t)\| = \sqrt{[f'(t)]^2 + [g'(t)]^2 + [h'(t)]^2}. \]
Therefore, the arc length of the curve can be written as
\[ L = \int_a^b \|\mathbf{r}'(t)\|\,dt. \]

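Example 2.3.1 (Arc length) Determine the length of the curve $\mathbf{r}(t) = \langle 2t, \; 3\sin 2t, \; 3\cos 2t \rangle$ on the interval $0 \leq t \leq 2\pi$.

Solution We need the tangent vector and its magnitude.
\[
\begin{aligned}
\mathbf{r}'(t) &= \langle 2, \; 6\cos 2t, \; -6\sin 2t \rangle, \\
\|\mathbf{r}'(t)\| &= \sqrt{4 + 36\cos^2 2t + 36\sin^2 2t} = \sqrt{4 + 36} = 2\sqrt{10}.
\end{aligned}
\]
The arc length is then
\[ L = \int_a^b \|\mathbf{r}'(t)\|\,dt = \int_0^{2\pi} 2\sqrt{10}\,dt = 4\pi\sqrt{10}. \]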
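One way to convince oneself of the value $4\pi\sqrt{10}$ is to approximate the curve by many short straight segments and add up their lengths. The Python sketch below does this for the curve of Example 2.3.1; the function name `polyline_length` and the number of segments are our own illustrative choices.

```python
import math


def r(t):
    """r(t) = <2t, 3 sin 2t, 3 cos 2t>."""
    return (2 * t, 3 * math.sin(2 * t), 3 * math.cos(2 * t))


def polyline_length(curve, a, b, n=10000):
    """Length of the polygonal path through n + 1 equally spaced samples of the curve."""
    points = [curve(a + (b - a) * i / n) for i in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


approx = polyline_length(r, 0.0, 2 * math.pi)
exact = 4 * math.pi * math.sqrt(10)

print(approx, exact)  # the polygonal length approaches the exact one as n grows
```

The polygonal length is always slightly smaller than the true arc length, because a chord is never longer than the arc it spans.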


We need to take a quick look at another concept here. We define the arc length function as
\[ s(t) = \int_0^t \|\mathbf{r}'(u)\|\,du. \]
Before we look at why this might be important, let us work a quick example.

Example 2.3.2 (Arc length function) Determine the arc length function for $\mathbf{r}(t) = \langle 2t, \; 3\sin 2t, \; 3\cos 2t \rangle$.

Solution From the previous example we know that $\|\mathbf{r}'(t)\| = 2\sqrt{10}$. The arc length function is then
\[ s(t) = \int_0^t 2\sqrt{10}\,du = \Big[ 2\sqrt{10}\,u \Big]_0^t = 2\sqrt{10}\,t. \] □

Finding arc length parametrizations. Now the question is why would we want to do this? Let us take the result of the example above and solve it for $t$.
\[ t = \frac{s}{2\sqrt{10}}. \]
Plugging this into the original vector function, we can reparametrize the function into the form $\mathbf{r}(t(s))$. For our function this is
\[ \mathbf{r}(t(s)) = \left\langle \frac{s}{\sqrt{10}}, \; 3\sin\frac{s}{\sqrt{10}}, \; 3\cos\frac{s}{\sqrt{10}} \right\rangle. \]
So, why would we want to do this? What is the reason for finding an arc length parametrization of a vector function that is expressed initially in terms of some other parameter $t$? The reason is that with the reparametrization we can now tell where we are on the curve after we have traveled a distance of $s$ along the curve. Note as well that we start the measurement of distance from where we are at $t = 0$.

Example 2.3.3 (Reparametrization of vector function) Where on the curve $\mathbf{r}(t) = \langle 2t, \; 3\sin 2t, \; 3\cos 2t \rangle$ are we after traveling a distance of $\dfrac{\pi\sqrt{10}}{3}$?


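Solution To determine this we need the reparametrization, which we have from above:
\[ \mathbf{r}(t(s)) = \left\langle \frac{s}{\sqrt{10}}, \; 3\sin\frac{s}{\sqrt{10}}, \; 3\cos\frac{s}{\sqrt{10}} \right\rangle. \]
Then, to determine where we are, all that we need to do is plug $s = \dfrac{\pi\sqrt{10}}{3}$ into this and we will get our location.
\[ \mathbf{r}\left( t\left( \frac{\pi\sqrt{10}}{3} \right) \right) = \left\langle \frac{\pi}{3}, \; 3\sin\frac{\pi}{3}, \; 3\cos\frac{\pi}{3} \right\rangle = \left\langle \frac{\pi}{3}, \; \frac{3\sqrt{3}}{2}, \; \frac{3}{2} \right\rangle. \]
So, after traveling a distance of $\dfrac{\pi\sqrt{10}}{3}$ along the curve we are at the point $\left( \dfrac{\pi}{3}, \dfrac{3\sqrt{3}}{2}, \dfrac{3}{2} \right)$ in space. □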
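The arc length parametrization found above can be evaluated directly. The following Python sketch computes the location after traveling a distance $s$ along the curve of Example 2.3.3; the function name `r_by_arc_length` is hypothetical and introduced only for this illustration.

```python
import math


def r_by_arc_length(s):
    """r(t(s)) = <s/sqrt(10), 3 sin(s/sqrt(10)), 3 cos(s/sqrt(10))>."""
    u = s / math.sqrt(10)
    return (u, 3 * math.sin(u), 3 * math.cos(u))


s = math.pi * math.sqrt(10) / 3          # distance traveled in the example
point = r_by_arc_length(s)
expected = (math.pi / 3, 3 * math.sqrt(3) / 2, 3 / 2)

print(point)
print(expected)
```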

Because arc length parameters for a curve are intimately related to the geometric characteristics of the curve, arc length parametrizations have properties that are not enjoyed by other parametrizations. For example, there is a theorem stating that if a smooth curve is represented parametrically using an arc length parameter, then the tangent vectors all have length 1, i.e., $\|d\mathbf{r}/ds\| = 1$.


Chapter 3

Partial Derivatives
The notation $y = f(x)$ is used to indicate that the variable $y$ depends on the single independent variable $x$, that is, that $y$ is a function of $x$. In fact, many functions depend on more than one independent variable. For instance, the volume of a circular cone is a function $V = (1/3)\pi r^2 h$ of its radius and its height, so it is a function $V(r, h)$ of two variables. In this chapter we extend the basic ideas of single variable calculus to functions of several variables.

The calculus of several variables is basically single variable calculus applied to several variables one at a time. When we hold all but one of the independent variables of a function constant and differentiate with respect to that one variable, we get a partial derivative. Section 3.3 (page 46) will show how partial derivatives are defined and interpreted geometrically, and how to calculate them by applying the rules for differentiating functions of a single variable. Although this chapter is about derivatives, we would like to first develop the fundamentals and introduce the basic concepts of limits and continuity of functions of several variables.

3.1 Functions of Several Variables

In this beginning section we first define functions of more than one independent variable and discuss their geometric representations. Real-valued functions of several independent variables are defined similarly to functions in the single variable case. By analogy with the corresponding definition for functions of a single variable, we define a function of $n$ variables as follows.

3.1.1 Definition Suppose $D$ is a set of $n$-tuples of real numbers $(x_1, x_2, \ldots, x_n)$. A real-valued function $f$ on $D$ is a rule that assigns a unique (single) real number $w = f(x_1, x_2, \ldots, x_n)$ to each element in $D$. The set $D$ is the function's domain. The set of $w$-values taken on by $f$ is the function's range. The symbol $w$ is the dependent variable of $f$, and $f$ is said to be a function of the $n$ independent variables $x_1$ to $x_n$. We also call the $x_j$'s the function's input variables and call $w$ the function's output variable.

Most of the examples we consider hereafter will be functions of two or three independent variables. When a function $f$ depends on two variables, we will usually call these independent variables $x$ and $y$, and we will use $z$ to denote the dependent variable that represents the value of the function; that is, $z = f(x, y)$. We will normally use $x$, $y$, and $z$ as the independent variables of a function of three variables and $w$ as the value of the function: $w = f(x, y, z)$. Some definitions will be given, and some theorems will be stated, only for the two-variable case, but extensions to three or more variables will usually be natural and obvious.


Natural domain
As stated in the previous definition, the independent variables of a function of two or more variables may be restricted to lie in some set $D$, which we call the domain of $f$. Sometimes the domain will be determined by physical restrictions on the variables (for instance, the time $t$ must be non-negative). If the function is defined by a formula and if there are no physical restrictions or other restrictions stated explicitly, then it is understood that the domain consists of all points for which the formula yields a real value for the dependent variable. We call this the natural domain of the function.

Example 3.1.1 (Natural domain) Sketch the natural domain of the function $f(x, y) = \ln(x^2 - y)$.

Solution The function $\ln(x^2 - y)$ is defined only when $x^2 - y > 0$, that is, when $y < x^2$. We first sketch the parabola $y = x^2$ as a dashed curve.

[Figure: the parabola $y = x^2$ drawn as a dashed curve in the $xy$-plane.]

The region $y < x^2$ then consists of all points below this curve. Note that the dashed boundary does not belong to the domain.

Example 3.1.2 (Natural domain) Let $f(x, y, z) = \sqrt{1 - x^2 - y^2 - z^2}$. Find $f\left(0, \frac{1}{2}, \frac{1}{2}\right)$ and the natural domain of $f$.

Solution By substitution,

$$f\left(0, \tfrac{1}{2}, \tfrac{1}{2}\right) = \sqrt{1 - (0)^2 - \left(\tfrac{1}{2}\right)^2 - \left(\tfrac{1}{2}\right)^2} = \sqrt{\frac{1}{2}}.$$

Because of the square root sign, we must have $1 - x^2 - y^2 - z^2 \ge 0$ in order to have a real value for $f(x, y, z)$. Rewriting this inequality in the form

$$x^2 + y^2 + z^2 \le 1$$

we see that the natural domain of $f$ consists of all points on or within the sphere $x^2 + y^2 + z^2 = 1$.
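The two natural domains above can be expressed as simple membership tests. This is a minimal sketch assuming the formulas $\ln(x^2 - y)$ and $\sqrt{1 - x^2 - y^2 - z^2}$ from Examples 3.1.1 and 3.1.2; the function names are hypothetical.

```python
import math

def in_domain_log(x, y):
    """True when ln(x^2 - y) is defined, i.e. when x^2 - y > 0."""
    return x * x - y > 0

def in_domain_sqrt(x, y, z):
    """True when sqrt(1 - x^2 - y^2 - z^2) is real (on or inside the unit sphere)."""
    return 1.0 - x * x - y * y - z * z >= 0

def f(x, y, z):
    """The function of Example 3.1.2, valid only inside its natural domain."""
    return math.sqrt(1.0 - x * x - y * y - z * z)

print(in_domain_log(1.0, 0.5))   # point below the parabola y = x^2
print(in_domain_log(1.0, 1.0))   # point on the dashed boundary
print(in_domain_sqrt(0.0, 0.5, 0.5))
print(f(0.0, 0.5, 0.5))
```

A point on the boundary parabola is rejected by the strict inequality, while a point on the sphere is accepted by the non-strict one.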


Graphical representations
The graph of a function $f$ of one variable (i.e., the graph of the equation $y = f(x)$) is the set of points in the $xy$-plane having coordinates $(x, f(x))$, where $x$ is in the domain of $f$. Similarly, the graph of a function of two variables (i.e., the graph of the equation $z = f(x, y)$) is the set of points in 3-space having coordinates $(x, y, f(x, y))$, where $(x, y)$ belongs to the domain of $f$. This graph is a surface in $\mathbb{R}^3$ lying above (if $f(x, y) > 0$) or below (if $f(x, y) < 0$) the domain of $f$ in the $xy$-plane. The graph of a function of three variables is a three-dimensional hypersurface in 4-space, $\mathbb{R}^4$. In general, the graph of a function of $n$ variables is an $n$-dimensional surface in $\mathbb{R}^{n+1}$. However, we will not attempt to draw graphs of functions of more than two variables!

Example 3.1.3 (Surface as a plane) Consider the function

$$f(x, y) = 4\left(1 - \frac{x}{2} - \frac{y}{3}\right), \qquad 0 \le x \le 2, \quad 0 \le y \le 3 - \frac{3x}{2}.$$

The graph of $f$ is the plane triangular surface with vertices at $(2, 0, 0)$, $(0, 3, 0)$, and $(0, 0, 4)$. If the domain of $f$ had not been explicitly stated to be a particular set in the $xy$-plane, the graph would have been the whole plane through these three points.

Example 3.1.4 (Surface as a shell) Consider the function $f(x, y) = \sqrt{9 - x^2 - y^2}$.

The expression inside the square root cannot be negative, so the domain is the disk $x^2 + y^2 \le 9$ in the $xy$-plane. If we square the equation $z = \sqrt{9 - x^2 - y^2}$, we can rewrite the result in the form $x^2 + y^2 + z^2 = 9$. This is a spherical shell of radius 3 centered at the origin. However, the graph of $f$ is only the upper hemisphere where $z \ge 0$.

Quite often it is difficult to sketch the surface $z = f(x, y)$ on two-dimensional paper without considerable artistic talent and training. Nevertheless, you should always try to visualize such a graph and sketch it as best you can. Sometimes it is convenient to sketch only part of a graph, for instance, the part lying in the first octant. It is also helpful to determine (and sketch) the intersections of the graph with various planes, especially the coordinate planes, and planes parallel to the coordinate planes.

Another way to represent the function $f(x, y)$ graphically is to produce a two-dimensional topographic map of the surface $z = f(x, y)$. In the $xy$-plane we sketch the curves $f(x, y) = C$ for various values of the constant $C$. These curves are called level curves of $f$ because they are the vertical projections onto the $xy$-plane of the curves in which the graph $z = f(x, y)$ intersects the horizontal (level) planes $z = C$. For example, the graph of the function $f(x, y) = x^2 + y^2$ is a circular paraboloid in 3-space; the level curves are circles centered at the origin in the $xy$-plane.

Example 3.1.5 (Level curves) The level curves of the function $f(x, y) = 4\left(1 - \frac{x}{2} - \frac{y}{3}\right)$ of Example 3.1.3 are the segments of the straight lines

$$4\left(1 - \frac{x}{2} - \frac{y}{3}\right) = C \quad \text{or} \quad \frac{x}{2} + \frac{y}{3} = 1 - \frac{C}{4}, \qquad 0 \le C \le 4,$$

which lie in the first quadrant. Level curves drawn for equally spaced values of $C$ are themselves equally spaced, and this equal spacing indicates the uniform steepness of the graph of $f$.


Example 3.1.6 (Level curves) The level curves of the function $f(x, y) = \sqrt{9 - x^2 - y^2}$ of Example 3.1.4 are the concentric circles

$$\sqrt{9 - x^2 - y^2} = C \quad \text{or} \quad x^2 + y^2 = 9 - C^2, \qquad 0 \le C \le 3.$$

The level curves (i.e. concentric circles) should be drawn for several equally spaced values of $C$. The circles get closer and closer together as $C$ gets closer and closer to 0. The decreasing spacing indicates the steepness of the hemispherical surface that is the graph of $f$.

Let us summarize this terminology for a function of two variables in the following definition.

3.1.2 Definition The set of points in the plane where a function $f(x, y)$ has a constant value $f(x, y) = C$ is called a level curve of $f$. The set of all points $(x, y, f(x, y))$ in space, for $(x, y)$ in the domain of $f$, is called the graph of $f$. The graph of $f$ is also called the surface $z = f(x, y)$.
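The spacing of the level circles in Example 3.1.6 can be tabulated directly from the radius $\sqrt{9 - C^2}$. The following is a small sketch; the list of levels and the names `level_radius`, `radii` and `gaps` are illustrative choices.

```python
import math

def level_radius(c):
    """Radius of the level curve sqrt(9 - x^2 - y^2) = c, valid for 0 <= c <= 3."""
    return math.sqrt(9.0 - c * c)

levels = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]   # equally spaced values of C
radii = [level_radius(c) for c in levels]
# Distance between neighbouring circles as C increases.
gaps = [radii[i] - radii[i + 1] for i in range(len(radii) - 1)]

for c, rad in zip(levels, radii):
    print(f"C = {c:.1f}, radius = {rad:.4f}")
print("gaps between consecutive circles:", [round(g, 4) for g in gaps])
```

The gaps are smallest for the levels nearest $C = 0$, which is where the hemisphere is steepest.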

3.2 Limits and Continuity

In this section we will take a look at limits involving functions of more than one variable. In fact, we will concentrate on limits of functions of two variables, but the ideas can be extended to functions of more than two variables. Before getting into this let us briefly recall how limits of functions of one variable work. We say that

$$\lim_{x \to a} f(x) = L \quad \text{(exists)}$$

provided that the two one-sided limits exist and have equal value, that is,

$$\lim_{x \to a^-} f(x) = \lim_{x \to a^+} f(x) = L.$$

Also recall that $\lim_{x \to a^-} f(x)$ is the left hand limit and requires us to only look at values of $x$ that are less than $a$. Likewise, $\lim_{x \to a^+} f(x)$ is the right hand limit and requires us to only look at values of $x$ that are greater than $a$.

[Figure: graph of a function of a single variable $y = f(x)$ near the point of interest $x = a$, approached from the left and from the right, with limit value $L$.]

Now, notice that in this case there are only two paths that we can take as we move in towards x = a. We can either move in from the left or we can move in from the right. Then in order for the limit of a function of one variable to exist the function must be approaching the same value as we take each of these paths in towards x = a.


Limits of functions of two variables


With functions of two variables we will have to do something similar, except this time there is possibly going to be a lot more work involved. Let us first address the notation and get a feeling for what we are going to be asking for in these kinds of limits. We will be asking to take the limit of the function $f(x, y)$ as $x$ approaches $a$ and $y$ approaches $b$. This can be written in several ways. Here are a couple of the more standard notations:

$$\lim_{x \to a,\ y \to b} f(x, y), \qquad \lim_{(x,y) \to (a,b)} f(x, y).$$

We will use the second notation in this course. The second notation is also a little more helpful in illustrating what we are really doing when we take a limit. In taking a limit of a function of two variables we are really asking what the value of $f(x, y)$ is doing as we move the point $(x, y)$ closer and closer to the point $(a, b)$ without actually letting it be $(a, b)$. Just as with limits of functions of one variable, in order for this limit to exist, the function must be approaching the same value regardless of the path that we take as we move in towards $(a, b)$. The problem that we are immediately faced with is that there are infinitely many paths that we can take as we move towards the point $(a, b)$.
[Figure: the surface $z = f(x, y)$ of a function of two variables (left) and several paths approaching the point $(a, b)$ in the domain of $f(x, y)$ (right).]

The above figure (right side) gives a few examples of paths that we could take. We put in several straight line paths as well as a couple of stranger paths that are not straight lines. We only included 6 paths here, and as you can see, simply by varying the slope of the straight line paths there are infinitely many of these, and then we would still need to consider paths that are not straight lines. In other words, to show that a limit (of a function of two variables) exists we would technically need to check infinitely many paths and verify that the function approaches the same value regardless of the path we use to approach the point. Of course, practically, this is simply not possible. Fortunately, however, we can use the main ideas from single variable calculus to help us take limits here. See the following definition of continuity of a function of two variables.

Continuity of functions of two variables


As for functions of one variable, continuity of a function $f$ at a point of its domain is defined directly in terms of the limit.


3.2.1 Definition A function $f$ is continuous at the point $(a, b)$ if

1. $f$ is defined at $(a, b)$,
2. $\lim_{(x,y) \to (a,b)} f(x, y)$ exists,
3. $\lim_{(x,y) \to (a,b)} f(x, y) = f(a, b)$.

A function is continuous if it is continuous at every point of its domain.

From a graphical viewpoint this definition means the same thing as it did when we first learnt single variable calculus. A function will be continuous at a point if the graph does not have any holes or breaks at that point. How can this help us take limits? Well, just as in single variable calculus, if you know that a function is continuous at $(a, b)$ then you also know that

$$\lim_{(x,y) \to (a,b)} f(x, y) = f(a, b)$$

must be true. So, if we know that a function is continuous at a point then all we need to do to take the limit of the function at that point is plug the point into the function. All the standard functions that we know to be continuous are still continuous even if we are plugging in more than one variable now. We just need to watch out for division by zero, square roots of negative numbers, logarithms of zero or negative numbers, etc.

Note that the idea about paths, however, is not one that we should forget, since it is a nice way to determine that a limit does not exist. If we can find two paths upon which the function approaches different values as we get near the point, then we will know that the limit does not exist. Let us first take a look at a couple of examples.

Example 3.2.1 (Evaluation of limits) Determine if the following limits exist or not. If they do exist then give the value of the limit.

(a) $\lim_{(x,y) \to (2,3)} \left(2x - y^2\right)$, (b) $\lim_{(x,y) \to (a,b)} x^2 y$, (c) $\lim_{(x,y) \to (2,-1)} \left(3x^2 + xy\cos(\pi x)\right)$.

Solution In this example the three functions are continuous at the respective points and so all we need to do is plug in the values and we are done.

(a) $\lim_{(x,y) \to (2,3)} \left(2x - y^2\right) = 2(2) - (3)^2 = 4 - 9 = -5$,

(b) $\lim_{(x,y) \to (a,b)} x^2 y = a^2 b$,

(c) $\lim_{(x,y) \to (2,-1)} \left(3x^2 + xy\cos(\pi x)\right) = 3(2)^2 + (2)(-1)\cos(2\pi) = 10$.

Recall that any combinations and compositions of continuous functions (e.g. polynomials, rational functions, sine/cosine functions, exponential functions, etc.) are still continuous. For example, the composite functions

$$e^{x-y}, \qquad \sqrt{x^2 + y^2 + 1}, \qquad \cos\frac{xy}{x^2 + 1}, \qquad \ln(1 + x^2 y^2)$$

are continuous at every point $(x, y)$. So, basically, we can calculate the limits of continuous functions of this kind by evaluating the function at $(a, b)$. The only caveat is that rational functions must be defined at $(a, b)$.


Example 3.2.2 (Evaluation of limits) Investigate the limiting behavior of the function $f$ as $(x, y)$ approaches $(5, 1)$:

$$f(x, y) = \frac{xy}{x + y}.$$

Solution In this example the function $f$ is not continuous along the line $y = -x$, since we encounter division by zero there. However, for this problem that is not something we need to worry about, since the point at which we are taking the limit is not on this line. Therefore, all that we need to do is plug in the point as in Example 3.2.1, since the function is continuous at this point:

$$\lim_{(x,y) \to (5,1)} \frac{xy}{x + y} = \frac{5}{6} \quad \text{(exists)}.$$

Example 3.2.3 (Existence of limits) Find the limit if it exists.

$$\lim_{(x,y) \to (0,0)} \frac{x^2 y^2}{x^4 + 3y^4}.$$

Solution In this example the function is not continuous at the point in question and so we cannot just plug in the point. Since the function is not continuous at the point, there is at least a chance that the limit does not exist. If we can find two different paths approaching the point that give two different values for the limit, then we will know that the limit does not exist. Two of the more common paths to check are the $x$-axis and the $y$-axis, so let us try those.

Before actually doing this we need to address what exactly we mean when we say that we are going to approach a point along a path. When we approach a point along a path we do this by either fixing $x$ or $y$, or by relating $x$ and $y$ through some function. In this way we can reduce the limit to a limit involving a single variable, which we know how to compute from elementary calculus.

So, let us see what happens along the $x$-axis. If we are going to approach $(0, 0)$ along the $x$-axis we can take advantage of the fact that along the $x$-axis we know that $y = 0$. This means that, along the $x$-axis, we plug $y = 0$ into the function and then take the limit as $x$ approaches zero.
$$\lim_{(x,y) \to (0,0)} \frac{x^2 y^2}{x^4 + 3y^4} = \lim_{(x,0) \to (0,0)} \frac{x^2 (0)^2}{x^4 + 3(0)^4} = \lim_{x \to 0} 0 = 0.$$

So, along the $x$-axis the function approaches zero as we move in towards the origin. Now, let us try the $y$-axis. Along the $y$-axis we have $x = 0$ and so the limit becomes

$$\lim_{(x,y) \to (0,0)} \frac{x^2 y^2}{x^4 + 3y^4} = \lim_{(0,y) \to (0,0)} \frac{(0)^2 y^2}{(0)^4 + 3y^4} = \lim_{y \to 0} 0 = 0.$$

So, we get the same limit along two paths. Do not misread this. This does NOT say that the limit exists and has the value zero. It only means that the function happens to approach the same value along two special paths. Let us take a look at the limit of the function along a third (fairly common) path. In this case we will move in towards the origin along the path $y = x$. This is what we meant previously about relating $x$ and $y$ through a function. To do this we replace all the $y$'s with $x$'s and then let $x$ approach zero. Let us take a look at this limit.

$$\lim_{(x,y) \to (0,0)} \frac{x^2 y^2}{x^4 + 3y^4} = \lim_{(x,x) \to (0,0)} \frac{x^2 (x)^2}{x^4 + 3(x)^4} = \lim_{x \to 0} \frac{x^4}{4x^4} = \lim_{x \to 0} \frac{1}{4} = \frac{1}{4}.$$

So, we get a different value from the previous two paths, and this means that the limit

$$\lim_{(x,y) \to (0,0)} \frac{x^2 y^2}{x^4 + 3y^4}$$

does not exist.


Note that we can extend this idea of moving in towards the origin along a line to the more general path $y = mx$ if we need to.
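The path test of Example 3.2.3 can be explored numerically along the lines $y = mx$. The sketch below is only an illustration: the helper `along_line` and the sample slopes are assumptions chosen for the demonstration, and a numerical experiment can suggest, but never prove, that a limit fails to exist.

```python
def f(x, y):
    """f(x, y) = x^2 y^2 / (x^4 + 3 y^4), undefined at the origin."""
    return (x ** 2 * y ** 2) / (x ** 4 + 3.0 * y ** 4)

def along_line(m, x):
    """Value of f on the line y = m x at horizontal position x (x != 0)."""
    return f(x, m * x)

for m in (0.0, 1.0, 2.0):
    # Sample points moving in towards the origin along y = m x.
    samples = [along_line(m, 10.0 ** -k) for k in (1, 3, 5)]
    print(f"m = {m}: {samples}")
```

Along $y = mx$ the function simplifies to $m^2 / (1 + 3m^4)$ for every $x \neq 0$, so each slope produces its own constant value, which is consistent with the limit not existing.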

Example 3.2.4 (Existence of limits) Find the limit if it exists.

$$\lim_{(x,y) \to (0,0)} \frac{x^3 y}{x^6 + y^2}.$$

Solution With this final example we still have continuity problems at the origin. So, again let us see if we can find a couple of paths that give different values of the limit. First, we will use the path $y = x$. Along this path we have

$$\lim_{(x,y) \to (0,0)} \frac{x^3 y}{x^6 + y^2} = \lim_{(x,x) \to (0,0)} \frac{x^3 (x)}{x^6 + (x)^2} = \lim_{x \to 0} \frac{x^4}{x^2 (x^4 + 1)} = \lim_{x \to 0} \frac{x^2}{x^4 + 1} = 0.$$

Now, let us try the path $y = x^3$. Along this path the limit becomes

$$\lim_{(x,y) \to (0,0)} \frac{x^3 y}{x^6 + y^2} = \lim_{(x,x^3) \to (0,0)} \frac{x^3 (x^3)}{x^6 + (x^3)^2} = \lim_{x \to 0} \frac{x^6}{2x^6} = \lim_{x \to 0} \frac{1}{2} = \frac{1}{2}.$$

We now have two paths that give different values for the limit and so the limit does not exist. As this limit has shown us, we can use, and often need to use, paths other than straight lines.
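The contrast between the straight path and the cubic path in Example 3.2.4 can also be seen in a short numerical experiment. This is a sketch under the assumption that the function is $x^3 y / (x^6 + y^2)$ as in the example; the name `g` and the sample points are illustrative.

```python
def g(x, y):
    """g(x, y) = x^3 y / (x^6 + y^2), undefined at the origin."""
    return (x ** 3 * y) / (x ** 6 + y ** 2)

xs = [10.0 ** -k for k in (1, 2, 3)]        # points moving in towards x = 0
line_values = [g(x, x) for x in xs]         # path y = x
cubic_values = [g(x, x ** 3) for x in xs]   # path y = x^3

print("along y = x:  ", line_values)
print("along y = x^3:", cubic_values)
```

The values along $y = x$ shrink towards 0 while those along $y = x^3$ stay at about $1/2$, matching the two limits computed by hand.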

3.3 Partial Derivatives

In this section we begin the process of differentiating functions of more than one variable. Before we actually start taking derivatives of functions of more than one variable let us recall an important interpretation of derivatives of functions of one variable. Recall that given a function of one variable, $f(x)$, the derivative, $f'(x)$, represents the rate of change of the function as $x$ changes. This is an important interpretation of derivatives and we do not want to lose it with functions of more than one variable.

The problem with functions of more than one variable is that there is more than one variable. In other words, what do we do if we only want one of the variables to change, or if we want more than one of them to change? In fact, if we allow more than one of the variables to change there are infinitely many ways for them to change. For instance, one variable could be changing faster than the other variable(s) in the function. Notice as well that the function may change differently depending on how we allow one or more of the variables to change. We will need to develop ways, and notations, for dealing with all of these cases.

In this section we concentrate exclusively on changing one of the variables at a time, while the remaining variable(s) are held fixed. We will deal with allowing multiple variables to change in a later section (Section 3.5). Because we only allow one of the variables to change, taking the derivative becomes a fairly simple process. Let us start off with a fairly simple function. Consider the function $f(x, y) = 2x^2 y^3$. Let us determine the rate at which the function is changing at a point $(a, b)$, if we hold $y$ fixed and allow $x$ to vary and if we hold $x$ fixed and allow $y$ to vary. We will start by looking at the case of holding $y$ fixed and allowing $x$ to vary.
Since we are interested in the rate of change of the function at $(a, b)$ and are holding $y$ fixed, we are always going to have $y = b$. Doing this gives us a function involving only $x$'s and we can define a new function as follows:

$$g(x) = f(x, b) = 2x^2 b^3.$$

Now, this is a function of a single variable and at this point all that we are asking is to determine the rate of change of $g(x)$ at $x = a$. In other words, we want to compute $g'(a)$, and since this is a function of a single variable we already know how to do that. Here is the rate of change of the function at $(a, b)$ if we hold $y$ fixed and allow $x$ to vary:

$$g'(a) = 4ab^3.$$

We will call $g'(a)$ the partial derivative of $f(x, y)$ with respect to $x$ at $(a, b)$ and we will denote it in the following way:

$$f_x(a, b) = 4ab^3.$$

Now, let us do it the other way. We will now hold $x$ fixed and allow $y$ to vary. We can do this in a similar way. Since we are holding $x$ fixed it must be fixed at $x = a$, and so we can define a new function of $y$ and then differentiate it as we have always done with functions of one variable. Here is the work for this:

$$h(y) = f(a, y) = 2a^2 y^3 \quad \Longrightarrow \quad h'(b) = 6a^2 b^2.$$

In this case we call $h'(b)$ the partial derivative of $f(x, y)$ with respect to $y$ at $(a, b)$ and we denote it as follows:

$$f_y(a, b) = 6a^2 b^2.$$

Note that these two partial derivatives are sometimes called the first-order partial derivatives. Just as with functions of one variable we can have derivatives of all orders. We will be looking at higher-order (partial) derivatives later in this section.

Note that the notation for partial derivatives is different from that for derivatives of functions of a single variable. With functions of a single variable we could denote the derivative with a single prime. However, with partial derivatives we always need to remember the variable that we are differentiating with respect to, and so we write that variable as a subscript. We will shortly be seeing some other notations for partial derivatives as well.

Note as well that we usually do not use the $(a, b)$ notation for partial derivatives. The more standard notation is to just continue to use $(x, y)$. So, the partial derivatives from above will more commonly be written as

$$f_x(x, y) = 4xy^3 \quad \text{and} \quad f_y(x, y) = 6x^2 y^2.$$

Now, as this quick example has shown, taking derivatives of functions of more than one variable is done in pretty much the same manner as taking derivatives of functions of a single variable. To compute $f_x(x, y)$ all we need to do is treat all the $y$'s as constants (or numbers) and then differentiate the $x$'s as we have done before. Likewise, to compute $f_y(x, y)$ we treat all the $x$'s as constants and then differentiate the $y$'s as we are used to doing.

Before we do a few examples let us get the formal definition of the partial derivative out of the way, as well as some alternate (but equivalent) notations. Since we can think of the two partial derivatives above as derivatives of single variable functions, it should not be too surprising that the definition of each is very similar to the definition of the derivative for single variable functions. For a function of two variables, we make this precise in the following definition.

3.3.1 Definition The first partial derivatives of the function $f(x, y)$ with respect to the variables $x$ and $y$ are the functions $f_x(x, y)$ and $f_y(x, y)$ given by

$$f_x(x, y) = \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h} \quad \text{and} \quad f_y(x, y) = \lim_{h \to 0} \frac{f(x, y + h) - f(x, y)}{h},$$

provided these limits exist.
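The difference quotients in this definition can be evaluated for small $h$ to see them approach the partial derivatives. The sketch below uses the function $f(x, y) = 2x^2 y^3$ from the discussion above; the point $(a, b) = (1.5, 2)$ and the step sizes are arbitrary illustrative choices.

```python
def f(x, y):
    """The function f(x, y) = 2 x^2 y^3 used in the discussion above."""
    return 2.0 * x ** 2 * y ** 3

def fx_quotient(x, y, h):
    """Difference quotient in x from the definition, with y held fixed."""
    return (f(x + h, y) - f(x, y)) / h

def fy_quotient(x, y, h):
    """Difference quotient in y from the definition, with x held fixed."""
    return (f(x, y + h) - f(x, y)) / h

a, b = 1.5, 2.0
exact_fx = 4.0 * a * b ** 3        # f_x(a, b) = 4 a b^3
exact_fy = 6.0 * a ** 2 * b ** 2   # f_y(a, b) = 6 a^2 b^2

for h in (1e-1, 1e-3, 1e-5):
    print(h, fx_quotient(a, b, h), fy_quotient(a, b, h))
print("exact:", exact_fx, exact_fy)
```

As $h$ shrinks, the quotients move towards the exact values obtained from the formulas $4ab^3$ and $6a^2 b^2$.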


Each of the two partial derivatives is the limit of a difference quotient in one of the variables. Observe that $f_x(x, y)$ is just the ordinary first derivative of $f(x, y)$ considered as a function of $x$ only, regarding $y$ as a constant parameter. Similarly, $f_y(x, y)$ is the first derivative of $f(x, y)$ considered as a function of $y$ alone, with $x$ held fixed.

Example 3.3.1 (Partial derivatives) If $f(x, y) = x^2 \sin y$, then $f_x(x, y) = 2x \sin y$ and $f_y(x, y) = x^2 \cos y$.

Note that various notations can be used freely to denote the partial derivatives of $z = f(x, y)$ considered as functions of $x$ and $y$:

$$f_x(x, y) = f_x = \frac{\partial f}{\partial x} = \frac{\partial}{\partial x}\big(f(x, y)\big) = z_x = \frac{\partial z}{\partial x},$$

$$f_y(x, y) = f_y = \frac{\partial f}{\partial y} = \frac{\partial}{\partial y}\big(f(x, y)\big) = z_y = \frac{\partial z}{\partial y}.$$

For the fractional notation for the partial derivative, notice the difference between the partial and the ordinary derivative from single variable calculus:

$$f(x) \ \Longrightarrow \ f'(x) = \frac{df}{dx}, \qquad f(x, y) \ \Longrightarrow \ f_x(x, y) = \frac{\partial f}{\partial x} \ \text{ and } \ f_y(x, y) = \frac{\partial f}{\partial y}.$$

To distinguish partial derivatives from ordinary derivatives we use the symbol $\partial$ rather than the $d$ used in single variable calculus. The symbol $\partial/\partial x$ should be read as "partial with respect to $x$", so $\partial f/\partial x$ is "partial $f$ with respect to $x$".

Let us work some examples. When working these examples always keep in mind that we need to pay close attention to which variable we are differentiating with respect to. This is important because we are going to treat all other variables as constants and then proceed with the derivative as if it were a function of a single variable. Also note that the standard differentiation rules for sums, products, reciprocals, and quotients continue to apply to partial derivatives.

Example 3.3.2 (Partial derivatives) Find all of the first-order partial derivatives for the following functions.

(a) $f(x, y) = x^4 + 6\sqrt{y} - 10$.
(b) $w = x^2 y - 10y^2 z^3 + 44x - 7\tan(4y)$.
(c) $h(s, t) = t^7 \ln(s^2) + \dfrac{9}{t^3} - \sqrt[7]{s^4}$.
(d) $f(x, y) = \cos\left(\dfrac{4}{x}\right) e^{x^2 y - 5y^3}$.
(e) $z = \dfrac{9u}{u^2 + 5v}$.
(f) $g(x, y, z) = \dfrac{x \sin y}{z^2}$.
(g) $z = \sqrt{x^2 + \ln(5x - 3y^2)}$.

Solution


(a) Let us first take the derivative with respect to $x$ and remember that as we do so all the $y$'s will be treated as constants. The partial derivative with respect to $x$ is

$$f_x(x, y) = 4x^3.$$

Notice that the second and the third terms differentiate to zero in this case. It should be clear why the third term differentiates to zero: it is a constant, and constants always differentiate to zero. This is also the reason that the second term differentiates to zero. Since we are differentiating with respect to $x$, we treat all $y$'s as constants. This means that terms involving only $y$'s are treated as constants and hence differentiate to zero.

Now, let us take the derivative with respect to $y$. In this case we treat all $x$'s as constants, so the first term, which involves only $x$'s, differentiates to zero, just as the third term does. The partial derivative with respect to $y$ is

$$f_y(x, y) = \frac{3}{\sqrt{y}}.$$

(b) With this function we have three first-order derivatives to compute. Let us do the partial derivative with respect to $x$ first. Since we are differentiating with respect to $x$ we treat all $y$'s and all $z$'s as constants. This means that the second and fourth terms differentiate to zero since they only involve $y$'s and $z$'s. The first term contains both $x$'s and $y$'s, so when we differentiate with respect to $x$ the $y$ is treated as a multiplicative constant, and the first term is differentiated just as the third term is. Here is the partial derivative with respect to $x$:

$$\frac{\partial w}{\partial x} = 2xy + 44.$$

Let us now differentiate with respect to $y$. In this case all $x$'s and $z$'s are treated as constants. This means the third term differentiates to zero since it contains only $x$'s, while the $x$'s in the first term and the $z$'s in the second term are treated as multiplicative constants. Here is the partial derivative with respect to $y$:

$$\frac{\partial w}{\partial y} = x^2 - 20yz^3 - 28\sec^2(4y).$$

Finally, let us get the derivative with respect to $z$. Since only one of the terms involves $z$'s, this will be the only non-zero term in the derivative. Also, the $y$'s in that term are treated as multiplicative constants. Here is the partial derivative with respect to $z$:

$$\frac{\partial w}{\partial z} = -30y^2 z^2.$$

(c) With this function we will not put in the detail of the first two. Before taking the derivatives let us rewrite the function a little to help us with the differentiation process:

$$h(s, t) = 2t^7 \ln s + 9t^{-3} - s^{4/7}.$$

Now, the fact that we are using $s$ and $t$ here instead of the standard $x$ and $y$ should not be a problem at all. It works the same way. Here are the two partial derivatives for this function:

$$h_s(s, t) = \frac{\partial h}{\partial s} = 2t^7 \left(\frac{1}{s}\right) + 0 - \frac{4}{7} s^{-3/7} = \frac{2t^7}{s} - \frac{4}{7} s^{-3/7},$$

$$h_t(s, t) = \frac{\partial h}{\partial t} = 14t^6 \ln s - 27t^{-4}.$$


(d) Now, we cannot forget the product rule with derivatives. The product rule works the same way here as it does with functions of one variable. We just need to be careful to remember which variable we are differentiating with respect to. Let us start out by differentiating with respect to $x$. In this case both the cosine and the exponential contain $x$'s, so we have a product of two functions involving $x$'s and we need the product rule. Here is the derivative with respect to $x$:

$$f_x(x, y) = -\sin\left(\frac{4}{x}\right)\left(-\frac{4}{x^2}\right) e^{x^2 y - 5y^3} + \cos\left(\frac{4}{x}\right) e^{x^2 y - 5y^3} (2xy)$$

$$\phantom{f_x(x, y)} = \frac{4}{x^2} \sin\left(\frac{4}{x}\right) e^{x^2 y - 5y^3} + 2xy \cos\left(\frac{4}{x}\right) e^{x^2 y - 5y^3}.$$

Do not forget the chain rule for functions of one variable. We will be looking at the chain rule for some more complicated expressions for multivariable functions in a later section (Section 3.4). However, at this point we are treating all the $y$'s as constants and so the chain rule continues to work as it does in single variable calculus. Also, do not forget how to differentiate exponential functions:

$$\frac{d}{dx} e^{f(x)} = f'(x)\, e^{f(x)}.$$

Now, let us differentiate with respect to $y$. In this case we do not have a product rule to worry about since the only place that the $y$ shows up is in the exponential. Therefore, since $x$'s are considered to be constants for this derivative, the cosine in front is also treated as a multiplicative constant. Here is the partial derivative with respect to $y$:

$$f_y(x, y) = (x^2 - 15y^2) \cos\left(\frac{4}{x}\right) e^{x^2 y - 5y^3}.$$

(e) We also cannot forget about the quotient rule. Since there is not much to do with this one, we will simply give the derivatives:

$$z_u = \frac{9(u^2 + 5v) - 9u(2u)}{(u^2 + 5v)^2} = \frac{-9u^2 + 45v}{(u^2 + 5v)^2},$$

$$z_v = \frac{(0)(u^2 + 5v) - 9u(5)}{(u^2 + 5v)^2} = \frac{-45u}{(u^2 + 5v)^2}.$$

In the case of the derivative with respect to $v$, recall that $u$'s are constants and so when we differentiate the numerator we get zero.

(f) Now, we do need to be careful not to use the quotient rule when it is not needed. In this case we do have a quotient; however, since the $x$'s and $y$'s only appear in the numerator and the $z$'s only appear in the denominator, this really is not a quotient rule problem. Let us compute the derivatives with respect to $x$ and $y$ first. In both these cases the $z$'s are constants, so the denominator is a constant and we do not need to worry too much about it. Here are the derivatives for these two cases:

$$g_x(x, y, z) = \frac{\sin y}{z^2} \quad \text{and} \quad g_y(x, y, z) = \frac{x \cos y}{z^2}.$$

Now, in the case of differentiation with respect to $z$ we can avoid the quotient rule with a quick rewrite of the function. Here is the rewrite as well as the derivative with respect to $z$:

$$g(x, y, z) = x \sin y \, z^{-2},$$

$$g_z(x, y, z) = -2x \sin y \, z^{-3} = -\frac{2x \sin y}{z^3}.$$


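(g) In the last part we are going to apply the chain rule. If you have a good knowledge of single variable calculus this should not be too difficult a problem. Here are the two derivatives:

$$z_x = \frac{1}{2}\left(x^2 + \ln(5x - 3y^2)\right)^{-1/2} \frac{\partial}{\partial x}\left(x^2 + \ln(5x - 3y^2)\right)$$

$$\phantom{z_x} = \frac{1}{2}\left(x^2 + \ln(5x - 3y^2)\right)^{-1/2} \left(2x + \frac{5}{5x - 3y^2}\right)$$

$$\phantom{z_x} = \left(x + \frac{5}{2(5x - 3y^2)}\right)\left(x^2 + \ln(5x - 3y^2)\right)^{-1/2},$$

$$z_y = \frac{1}{2}\left(x^2 + \ln(5x - 3y^2)\right)^{-1/2} \frac{\partial}{\partial y}\left(x^2 + \ln(5x - 3y^2)\right)$$

$$\phantom{z_y} = \frac{1}{2}\left(x^2 + \ln(5x - 3y^2)\right)^{-1/2} \left(\frac{-6y}{5x - 3y^2}\right)$$

$$\phantom{z_y} = -\frac{3y}{5x - 3y^2}\left(x^2 + \ln(5x - 3y^2)\right)^{-1/2}.$$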

So, there are some examples of partial derivatives. Hopefully you will agree that as long as we remember to treat the other variables as constants, these work in exactly the same manner as derivatives of functions of one variable. So, if you can compute derivatives in single variable calculus you should not have too much difficulty with basic partial derivatives.
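Hand-computed partial derivatives like those in Example 3.3.2 can be sanity-checked with a central difference in one variable while the others are held fixed. The sketch below does this for part (b); the helper `partial`, the test point and the step size are illustrative assumptions, and the check only confirms agreement at the sampled point.

```python
import math

def partial(func, point, index, h=1e-5):
    """Central-difference estimate of the partial derivative of func with
    respect to the variable at position `index`, others held fixed."""
    up = list(point)
    down = list(point)
    up[index] += h
    down[index] -= h
    return (func(*up) - func(*down)) / (2.0 * h)

def w(x, y, z):
    """Part (b): w = x^2 y - 10 y^2 z^3 + 44 x - 7 tan(4 y)."""
    return x ** 2 * y - 10.0 * y ** 2 * z ** 3 + 44.0 * x - 7.0 * math.tan(4.0 * y)

def w_x(x, y, z):
    return 2.0 * x * y + 44.0

def w_y(x, y, z):
    # sec^2(4y) written as 1 / cos^2(4y)
    return x ** 2 - 20.0 * y * z ** 3 - 28.0 / math.cos(4.0 * y) ** 2

def w_z(x, y, z):
    return -30.0 * y ** 2 * z ** 2

p = (1.0, 0.1, 2.0)
for index, exact in enumerate((w_x, w_y, w_z)):
    print(index, partial(w, p, index), exact(*p))
```

Each numerical estimate should agree with the corresponding formula to several decimal places.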

Implicit dierentiation
There is one important topic that we need to take a quick look at in this section: implicit differentiation. Before getting into implicit differentiation for multivariable functions let us first recall how implicit differentiation works for functions of one variable.

Example 3.3.3 (Implicit differentiation) Find $\dfrac{dy}{dx}$ for $3y^4 + x^7 = 5x$.

Solution Remember that the key to this is to always think of $y$ as a function of $x$, or $y = y(x)$, so whenever we differentiate a term involving $y$'s with respect to $x$ we need to use the chain rule, which means that we add on a $\dfrac{dy}{dx}$ to that term. The first step is to differentiate both sides with respect to $x$:

$$12y^3 \frac{dy}{dx} + 7x^6 = 5.$$

The second step is to solve for $\dfrac{dy}{dx}$:

$$\frac{dy}{dx} = \frac{5 - 7x^6}{12y^3}.$$
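The result of Example 3.3.3 can be compared with a direct numerical derivative, because this particular equation can be solved explicitly for $y$. The sketch below assumes the positive branch $y = \left((5x - x^7)/3\right)^{1/4}$, which requires $5x - x^7 > 0$; the function names and the sample point $x_0 = 1$ are illustrative.

```python
def y_of_x(x):
    """Explicit positive branch of 3 y^4 + x^7 = 5 x (needs 5 x - x^7 > 0)."""
    return ((5.0 * x - x ** 7) / 3.0) ** 0.25

def dy_dx_implicit(x, y):
    """The formula obtained by implicit differentiation."""
    return (5.0 - 7.0 * x ** 6) / (12.0 * y ** 3)

x0 = 1.0
h = 1e-6
# Central difference applied to the explicit branch.
numeric = (y_of_x(x0 + h) - y_of_x(x0 - h)) / (2.0 * h)
formula = dy_dx_implicit(x0, y_of_x(x0))
print(numeric, formula)
```

The two printed numbers should agree closely, which supports the implicit formula at this point without having to differentiate the explicit branch by hand.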

Implicit differentiation works in exactly the same manner with multivariable functions. If we have an equation in terms of three variables $x$, $y$, and $z$ we will assume that $z$ is in fact a function of $x$ and $y$. In other words, $z = z(x, y)$. Then whenever we differentiate $z$'s with respect to $x$ we will use the chain rule and add on a $\dfrac{\partial z}{\partial x}$. Likewise, whenever we differentiate $z$'s with respect to $y$ we will add on a $\dfrac{\partial z}{\partial y}$. Let us take a quick look at a couple of examples of implicit differentiation problems.

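Example 3.3.4 (Implicit differentiation) Find $\dfrac{\partial z}{\partial x}$ and $\dfrac{\partial z}{\partial y}$ for each of the following functions.
(a) $x^3 z^2 - 5xy^5 z = x^2 + y^3$.
(b) $x^2 \sin(2y - 5z) = 1 + y\cos(6xz)$.

Solution
(a) Let us start with finding $\dfrac{\partial z}{\partial x}$. We will differentiate both sides with respect to $x$ and remember to add on a $\dfrac{\partial z}{\partial x}$ whenever we differentiate a $z$.
\[ 3x^2 z^2 + 2x^3 z \frac{\partial z}{\partial x} - 5y^5 z - 5xy^5 \frac{\partial z}{\partial x} = 2x. \]
Remember that since we are assuming $z = z(x, y)$, any product of $x$'s and $z$'s will be a product of functions of $x$ and so we need the product rule. Now, solve for $\dfrac{\partial z}{\partial x}$.
\[
\begin{aligned}
\left(2x^3 z - 5xy^5\right)\frac{\partial z}{\partial x} &= 2x - 3x^2 z^2 + 5y^5 z, \\
\frac{\partial z}{\partial x} &= \frac{2x - 3x^2 z^2 + 5y^5 z}{2x^3 z - 5xy^5}.
\end{aligned}
\]
Now, we will do the same thing for $\dfrac{\partial z}{\partial y}$, except this time we will need to remember to add on a $\dfrac{\partial z}{\partial y}$ whenever we differentiate a $z$.
\[
\begin{aligned}
2x^3 z \frac{\partial z}{\partial y} - 25xy^4 z - 5xy^5 \frac{\partial z}{\partial y} &= 3y^2, \\
\left(2x^3 z - 5xy^5\right)\frac{\partial z}{\partial y} &= 3y^2 + 25xy^4 z, \\
\frac{\partial z}{\partial y} &= \frac{3y^2 + 25xy^4 z}{2x^3 z - 5xy^5}.
\end{aligned}
\]

(b) Basically, we will do the same thing for this function as we did in the previous part. Let us find $\dfrac{\partial z}{\partial x}$ first.
\[ 2x\sin(2y - 5z) + x^2\cos(2y - 5z)\left(-5\frac{\partial z}{\partial x}\right) = -y\sin(6xz)\left(6z + 6x\frac{\partial z}{\partial x}\right). \]
Do not forget to do the chain rule on each of the trigonometric functions, and when we are differentiating the inside function on the cosine we will need to also use the product rule. Now let us solve for $\dfrac{\partial z}{\partial x}$.
\[
\begin{aligned}
2x\sin(2y - 5z) - 5x^2\cos(2y - 5z)\frac{\partial z}{\partial x} &= -6yz\sin(6xz) - 6xy\sin(6xz)\frac{\partial z}{\partial x}, \\
2x\sin(2y - 5z) + 6yz\sin(6xz) &= \left(5x^2\cos(2y - 5z) - 6xy\sin(6xz)\right)\frac{\partial z}{\partial x}, \\
\frac{\partial z}{\partial x} &= \frac{2x\sin(2y - 5z) + 6yz\sin(6xz)}{5x^2\cos(2y - 5z) - 6xy\sin(6xz)}.
\end{aligned}
\]
Next, let us find $\dfrac{\partial z}{\partial y}$. This one will be slightly easier than the first one.
\[
\begin{aligned}
x^2\cos(2y - 5z)\left(2 - 5\frac{\partial z}{\partial y}\right) &= \cos(6xz) - y\sin(6xz)\left(6x\frac{\partial z}{\partial y}\right), \\
2x^2\cos(2y - 5z) - 5x^2\cos(2y - 5z)\frac{\partial z}{\partial y} &= \cos(6xz) - 6xy\sin(6xz)\frac{\partial z}{\partial y}, \\
\left(6xy\sin(6xz) - 5x^2\cos(2y - 5z)\right)\frac{\partial z}{\partial y} &= \cos(6xz) - 2x^2\cos(2y - 5z), \\
\frac{\partial z}{\partial y} &= \frac{\cos(6xz) - 2x^2\cos(2y - 5z)}{6xy\sin(6xz) - 5x^2\cos(2y - 5z)}.
\end{aligned}
\]
□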
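As a numerical sanity check on part (a) of Example 3.3.4, the following minimal sketch solves the defining equation for $z$ with the quadratic formula, differentiates that explicit $z(x, y)$ by central differences, and compares the result with the implicit formulas. The helper names (`z_of`, `dz_dx_formula`, `dz_dy_formula`), the test point $(1, 1)$, the choice of the larger root, and the step size `h` are illustrative assumptions, not part of the notes.

```python
import math

def z_of(x, y):
    """Solve x^3 z^2 - 5 x y^5 z - (x^2 + y^3) = 0 for z (larger root, assumed)."""
    a, b, c = x**3, -5.0 * x * y**5, -(x**2 + y**3)
    return (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)

def dz_dx_formula(x, y, z):
    # Implicit-differentiation result for dz/dx from part (a).
    return (2 * x - 3 * x**2 * z**2 + 5 * y**5 * z) / (2 * x**3 * z - 5 * x * y**5)

def dz_dy_formula(x, y, z):
    # Implicit-differentiation result for dz/dy from part (a).
    return (3 * y**2 + 25 * x * y**4 * z) / (2 * x**3 * z - 5 * x * y**5)

x0, y0, h = 1.0, 1.0, 1e-6   # hypothetical test point and step size
z0 = z_of(x0, y0)

# Central differences of the explicitly solved z(x, y).
fd_x = (z_of(x0 + h, y0) - z_of(x0 - h, y0)) / (2 * h)
fd_y = (z_of(x0, y0 + h) - z_of(x0, y0 - h)) / (2 * h)

print(fd_x, dz_dx_formula(x0, y0, z0))
print(fd_y, dz_dy_formula(x0, y0, z0))
```

The two numbers on each printed line should agree to several decimal places. The same comparison works for the other root, since the implicit formulas hold on either branch.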

Interpretations of partial derivatives


At this point we will show that the two main interpretations of derivatives of functions of a single variable still hold for partial derivatives, with small modifications of course to account for the fact that we now have more than one variable.

Rates of change. The first interpretation we have already seen and is the more important of the two. As with functions of a single variable, partial derivatives represent the rates of change of the function as the variables change. As we saw previously, $f_x(x, y)$ represents the rate of change of the function $f(x, y)$ as we change $x$ and hold $y$ fixed, while $f_y(x, y)$ represents the rate of change of the function $f(x, y)$ as we change $y$ and hold $x$ fixed.

Example 3.3.5 (Rates of change) Determine if $f(x, y) = \dfrac{x^2}{y^3}$ is increasing or decreasing at $(2, 5)$,
(a) if we allow $x$ to vary and hold $y$ fixed.
(b) if we allow $y$ to vary and hold $x$ fixed.

Solution
(a) In this case we first need $f_x(x, y)$ and its value at the point.
\[ f_x(x, y) = \frac{2x}{y^3} \quad\Longrightarrow\quad f_x(2, 5) = \frac{4}{125} > 0. \]
The partial derivative with respect to $x$ is positive and therefore if we hold $y$ fixed the function is increasing at $(2, 5)$ as we vary $x$.

(b) For this part we will need $f_y(x, y)$ and its value at the point.
\[ f_y(x, y) = -\frac{3x^2}{y^4} \quad\Longrightarrow\quad f_y(2, 5) = -\frac{12}{625} < 0. \]
The partial derivative with respect to $y$ is negative and therefore the function is decreasing at $(2, 5)$ as we vary $y$ and hold $x$ fixed. □
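The sign test in Example 3.3.5 can also be checked numerically. The sketch below approximates each partial derivative by a central difference in one variable while the other is held fixed. The function names `partial_x` and `partial_y` and the step size `h` are assumptions made for illustration.

```python
def partial_x(f, x, y, h=1e-6):
    """Central-difference estimate of f_x: vary x, hold y fixed."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def partial_y(f, x, y, h=1e-6):
    """Central-difference estimate of f_y: vary y, hold x fixed."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

def f(x, y):
    return x**2 / y**3

fx = partial_x(f, 2.0, 5.0)   # compare with 4/125
fy = partial_y(f, 2.0, 5.0)   # compare with -12/625

print("increasing in x" if fx > 0 else "decreasing in x", fx)
print("increasing in y" if fy > 0 else "decreasing in y", fy)
```

A positive estimate means the function is increasing in that variable at the point, a negative estimate that it is decreasing, which matches the conclusions of parts (a) and (b).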


Note that it is completely possible for a function to be increasing for a fixed $y$ and decreasing for a fixed $x$ at a point, as the above example has shown. To see a nice example of this take a look at the following graph.

[Figure: graph of a hyperbolic paraboloid.]

This is a graph of a hyperbolic paraboloid, and at the origin we can see that if we move along the positive $x$-axis the graph is increasing and if we move along the positive $y$-axis the graph is decreasing. So it is completely possible to have a graph both increasing and decreasing at a point depending upon the direction in which we move. We should never expect the function to behave in exactly the same way at a point as each variable changes.

Slopes of tangent lines. The next interpretation was one of the standard interpretations in any single variable calculus course. We know from single variable calculus that $f'(a)$ represents the slope of the tangent line to the curve $y = f(x)$ at $x = a$. Here, $f_x(x, y)$ and $f_y(x, y)$ also represent the slopes of tangent lines. The difference is the curves that they are tangent to. Partial derivatives are the slopes of traces. By a trace to $f(x, y)$ at the point $(a, b)$ we mean the intersection curve between the surface defined by $z = f(x, y)$ and the plane $x = a$ (resp. the plane $y = b$). In particular, the partial derivative $f_x(a, b)$ is the slope of the trace to $f(x, y)$ for the plane $y = b$ at the point $(a, b)$. Likewise, the partial derivative $f_y(a, b)$ is the slope of the trace to $f(x, y)$ for the plane $x = a$ at the point $(a, b)$.

Example 3.3.6 (Slopes of tangent lines) Find the slopes of the traces to $z = 10 - 4x^2 - y^2$ at the point $(1, 2)$.

Solution We sketch the graphs of the traces for the planes $x = 1$ and $y = 2$ in the following.

[Figure: sketches of the two traces, with panels titled "Trace for x = 1" and "Trace for y = 2".]

Next we will need the two partial derivatives so we can get the slopes.
\[ f_x(x, y) = -8x, \qquad f_y(x, y) = -2y. \]
To get the slopes all we need to do is evaluate the partial derivatives at the point $(1, 2)$.
\[ f_x(1, 2) = -8, \qquad f_y(1, 2) = -4. \]
So, the tangent line at $(1, 2)$ for the trace to $z = 10 - 4x^2 - y^2$ for the plane $y = 2$ has a slope of $-8$. Also, the tangent line at $(1, 2)$ for the trace to $z = 10 - 4x^2 - y^2$ for the plane $x = 1$ has a slope of $-4$. □

Example 3.3.7 (Slopes of tangent lines) The plane $x = 1$ intersects the paraboloid $z = x^2 + y^2$ in a parabola. Find the slope of the tangent to the parabola at $(1, 2, 5)$.

Solution The slope is the value of the partial derivative $\partial z/\partial y$ at $(1, 2)$.
\[ \left.\frac{\partial z}{\partial y}\right|_{(1,2)} = \left.\frac{\partial}{\partial y}\left(x^2 + y^2\right)\right|_{(1,2)} = 2y\,\Big|_{(1,2)} = 4. \]
As a check, we can treat the parabola as the graph of the single variable function $z = (1)^2 + y^2 = 1 + y^2$ in the plane $x = 1$ and ask for the slope at $y = 2$. The slope, calculated now as an ordinary derivative, is
\[ \left.\frac{dz}{dy}\right|_{y=2} = \left.\frac{d}{dy}\left(1 + y^2\right)\right|_{y=2} = 2y\,\Big|_{y=2} = 4. \]
Could you sketch the graphs of the functions involved in this question? □

Vector equations of tangent line. Finally, let us briefly talk about getting the equations of the tangent lines. Recall that the equation of a line in 3-space is given by a vector equation. To get the equation we need a point on the line and a vector that is parallel to the line. The point is easy. Since we know the $x$-$y$ coordinates of the point, all we need to do is plug them into the function to get the point. So, the point will be $(a, b, f(a, b))$.

The parallel (or tangent) vector is also easy. We can write the equation of the surface as a vector function as follows,
\[ \mathbf{r}(x, y) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k} = x\,\mathbf{i} + y\,\mathbf{j} + f(x, y)\,\mathbf{k}, \]
or in the alternate vector notation
\[ \mathbf{r}(x, y) = \langle x, y, f(x, y) \rangle. \]
We know that if we have a vector function of one variable we can get a tangent vector by differentiating the vector function. The same will still be true here. If we differentiate with respect to $x$ we will get a tangent vector to traces for the plane $y = b$ (i.e. for fixed $y$), and if we differentiate with respect to $y$ we will get a tangent vector to traces for the plane $x = a$ (for fixed $x$). The following is the tangent vector for traces with fixed $y$.
\[ \mathbf{r}_x(x, y) = \langle 1, 0, f_x(x, y) \rangle. \]
We differentiated each component with respect to $x$. Therefore the first component becomes a one and the second becomes a zero, because we are treating $y$ as a constant when we differentiate with respect to $x$. The third component is just the partial derivative of the function with respect to $x$. For traces with fixed $x$ the tangent vector is
\[ \mathbf{r}_y(x, y) = \langle 0, 1, f_y(x, y) \rangle. \]
The equation for the tangent line to traces with fixed $y$ is
\[ \mathbf{r}(t) = \langle a, b, f(a, b) \rangle + t\,\langle 1, 0, f_x(a, b) \rangle, \]
whereas the tangent line to traces with fixed $x$ is
\[ \mathbf{r}(t) = \langle a, b, f(a, b) \rangle + t\,\langle 0, 1, f_y(a, b) \rangle. \]

Example 3.3.8 (Vector equations of tangent line) Write down the vector equations of the tangent lines to the traces to $z = 10 - 4x^2 - y^2$ at the point $(1, 2)$.

Solution There is not much to do other than plugging the values and function into the formulas above. We have already computed the derivatives and their values at $(1, 2)$ in Example 3.3.6, and the point on each trace is $(1, 2, f(1, 2)) = (1, 2, 2)$. The equation of the tangent line to the trace for the plane $y = 2$ is given by
\[ \mathbf{r}(t) = \langle 1, 2, 2 \rangle + t\,\langle 1, 0, -8 \rangle = \langle 1 + t,\ 2,\ 2 - 8t \rangle, \]
and the equation of the tangent line to the trace for the plane $x = 1$ is given by
\[ \mathbf{r}(t) = \langle 1, 2, 2 \rangle + t\,\langle 0, 1, -4 \rangle = \langle 1,\ 2 + t,\ 2 - 4t \rangle. \]
□
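The recipe for the two tangent lines (a point $(a, b, f(a, b))$ plus the direction $\langle 1, 0, f_x \rangle$ or $\langle 0, 1, f_y \rangle$) can be sketched in a few lines. Here the partial derivatives are estimated by central differences rather than computed by hand; the names `tangent_lines` and `line_point` and the step size `h` are hypothetical, chosen only for this illustration.

```python
def f(x, y):
    # Surface from Examples 3.3.6 and 3.3.8.
    return 10 - 4 * x**2 - y**2

def tangent_lines(f, a, b, h=1e-6):
    """Return the point on the surface and the two trace tangent vectors."""
    fx = (f(a + h, b) - f(a - h, b)) / (2 * h)   # slope of trace with y fixed
    fy = (f(a, b + h) - f(a, b - h)) / (2 * h)   # slope of trace with x fixed
    point = (a, b, f(a, b))
    return point, (1.0, 0.0, fx), (0.0, 1.0, fy)

def line_point(point, direction, t):
    """Evaluate r(t) = point + t * direction."""
    return tuple(p + t * d for p, d in zip(point, direction))

point, dir_fixed_y, dir_fixed_x = tangent_lines(f, 1.0, 2.0)
print(point)                                 # point on both tangent lines
print(line_point(point, dir_fixed_y, 0.5))   # a point on the line for y = 2
print(line_point(point, dir_fixed_x, 0.5))   # a point on the line for x = 1
```

The third components of the two direction vectors should be close to the slopes $-8$ and $-4$ found in Example 3.3.6.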


Higher-order partial derivatives


Just as we have higher-order derivatives with functions of one variable we will also have higher-order (partial) derivatives of functions of more than one variable. Consider the case of a function of two variables, $f(x, y)$. Since both of the first-order partial derivatives are also functions of $x$ and $y$ we could in turn differentiate each with respect to $x$ or $y$. This means that for the case of a function of two variables there will be a total of four possible second-order derivatives. Here are the notations that we will use to denote them.
\[
\begin{aligned}
(f_x)_x &= f_{xx} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2}, \\
(f_x)_y &= f_{xy} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y\,\partial x}, \\
(f_y)_x &= f_{yx} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x\,\partial y}, \\
(f_y)_y &= f_{yy} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial y^2}.
\end{aligned}
\]
In the above, the second and third second-order partial derivatives are often called mixed partial derivatives since we are taking derivatives with respect to more than one variable. Note as well that the order in which we take the derivatives is given by the notation for each of these. If we are using the subscript notation, for example $f_{xy}$, then we will differentiate from left to right. In other words, in this case, we will differentiate first with respect to $x$ and then with respect to $y$. With the fractional notation, for example $\dfrac{\partial^2 f}{\partial y\,\partial x}$, it is the opposite. In these cases we differentiate moving along the denominator from right to left. So, again, in this case we differentiate with respect to $x$ first and then $y$. Let us take a quick look at an example.

Example 3.3.9 (Second-order partial derivatives) Find all the second-order derivatives for $f(x, y) = \cos(2x) - x^2 e^{5y} + 3y^2$.

Solution We will need the first-order derivatives so here they are,
\[ f_x(x, y) = -2\sin(2x) - 2xe^{5y}, \qquad f_y(x, y) = -5x^2 e^{5y} + 6y. \]
Now, let us get the second-order derivatives. They are
\[ f_{xx} = -4\cos(2x) - 2e^{5y}, \quad f_{xy} = -10xe^{5y}, \quad f_{yx} = -10xe^{5y}, \quad f_{yy} = -25x^2 e^{5y} + 6. \]
□

Note that we dropped the $(x, y)$ from the derivatives (i.e., writing for example $f_{xx}$ instead of $f_{xx}(x, y)$). This is fairly standard and we will be doing it most of the time from now on. We will also be dropping it for the first-order derivatives in most cases. You may have noticed that the mixed second-order partial derivatives
\[ f_{xy} = \frac{\partial^2 f}{\partial y\,\partial x} \quad\text{and}\quad f_{yx} = \frac{\partial^2 f}{\partial x\,\partial y} \]
in Example 3.3.9 are equal. This is not a coincidence. If the function is nice enough this will always be the case. So, what is actually nice enough? The following theorem tells us the answer.


Theorem 3.3.1 (The Mixed Derivative Theorem) If $f(x, y)$ and its partial derivatives $f_x$, $f_y$, $f_{xy}$, and $f_{yx}$ are defined on a disk containing a point $(a, b)$ and are all continuous at $(a, b)$, then
\[ f_{xy}(a, b) = f_{yx}(a, b). \]

The theorem is also known as Clairaut's Theorem, named after the French mathematician Alexis Clairaut who discovered it. The proof is omitted here. This theorem says that to calculate a mixed second-order derivative, we may differentiate in either order, provided the continuity conditions are satisfied. This can work to our advantage.

Example 3.3.10 (Mixed derivative) Find $\dfrac{\partial^2 w}{\partial x\,\partial y}$ if $w = xy + \dfrac{e^y}{y^2 + 1}$.

Solution The symbol $\dfrac{\partial^2 w}{\partial x\,\partial y}$ tells us to differentiate first with respect to $y$ and then with respect to $x$. If we postpone the differentiation with respect to $y$ and differentiate first with respect to $x$, however, we get the answer more quickly. In two steps,
\[ \frac{\partial w}{\partial x} = y \quad\text{and}\quad \frac{\partial^2 w}{\partial y\,\partial x} = 1. \]
If we differentiate first with respect to $y$, we still obtain $\dfrac{\partial^2 w}{\partial x\,\partial y} = 1$ as well. We can differentiate in either order because the conditions of Theorem 3.3.1 hold for $w$ at all points. □
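The equality of the two mixed partial derivatives in Example 3.3.10 can be illustrated numerically by nesting two difference quotients in either order. This is only a sketch: the helper names `d_dx` and `d_dy`, the step size `h`, and the sample point $(0.5, 1.2)$ are assumptions, and a finite-difference check illustrates the theorem rather than proving it.

```python
import math

def w(x, y):
    # Function from Example 3.3.10.
    return x * y + math.exp(y) / (y**2 + 1)

def d_dx(f, h=1e-4):
    """Return a function approximating the partial derivative of f in x."""
    return lambda x, y: (f(x + h, y) - f(x - h, y)) / (2 * h)

def d_dy(f, h=1e-4):
    """Return a function approximating the partial derivative of f in y."""
    return lambda x, y: (f(x, y + h) - f(x, y - h)) / (2 * h)

w_xy = d_dy(d_dx(w))   # differentiate in x first, then in y
w_yx = d_dx(d_dy(w))   # differentiate in y first, then in x

print(w_xy(0.5, 1.2), w_yx(0.5, 1.2))
```

Both printed values should be close to 1, the value found in the example, regardless of the order of differentiation.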

Although we will deal mostly with first- and second-order partial derivatives, because these appear the most frequently in applications, there is no theoretical limit to how many times we can differentiate a function as long as the derivatives involved exist. There are higher-order derivatives as well, and the following are a couple of the third-order partial derivatives of a function of two variables.
\[
\begin{aligned}
f_{xyx} &= (f_{xy})_x = \frac{\partial}{\partial x}\left(\frac{\partial^2 f}{\partial y\,\partial x}\right) = \frac{\partial^3 f}{\partial x\,\partial y\,\partial x}, \\
f_{yxx} &= (f_{yx})_x = \frac{\partial}{\partial x}\left(\frac{\partial^2 f}{\partial x\,\partial y}\right) = \frac{\partial^3 f}{\partial x^2\,\partial y}.
\end{aligned}
\]
Notice as well that for both of these we differentiate once with respect to $y$ and twice with respect to $x$. There is also another third-order partial derivative in which we can do this, $f_{xxy}$. There is an extension to Clairaut's Theorem that says if furthermore all three of these are continuous then they should all be equal,
\[ f_{xxy} = f_{xyx} = f_{yxx}. \]
To this point we have only looked at functions of two variables, but everything that we have done here will work regardless of the number of variables in the function, and there are natural extensions of Clairaut's Theorem to all of these cases as well. For instance,
\[ f_{xz}(x, y, z) = f_{zx}(x, y, z), \]
provided both of the derivatives are continuous. In general, we can extend Clairaut's Theorem to any function and mixed partial derivatives. The only requirement is that in each derivative we differentiate with respect to each variable the same number of times. In other words, provided we meet the continuity condition, the following will be equal,
\[ f_{ssrtsrr} = f_{rtsrssr}, \]
because in each case we differentiate with respect to $t$ once, $s$ three times and $r$ three times. Let us do a couple of examples with higher-order derivatives and functions of more than two variables.

Example 3.3.11 (Higher-order derivatives) Find the indicated derivative for each of the following functions.
(a) Find $f_{xxyzz}$ for $f(x, y, z) = z^3 y^2 \ln x$.
(b) Find $\dfrac{\partial^3 f}{\partial y\,\partial x^2}$ for $f(x, y) = e^{xy}$.

Solution
(a) In this case remember that we differentiate from left to right. The derivatives are
\[
f_x = \frac{z^3 y^2}{x}, \quad
f_{xx} = -\frac{z^3 y^2}{x^2}, \quad
f_{xxy} = -\frac{2z^3 y}{x^2}, \quad
f_{xxyz} = -\frac{6z^2 y}{x^2}, \quad
f_{xxyzz} = -\frac{12zy}{x^2}.
\]
(b) Here we differentiate from right to left. The derivatives are
\[
\frac{\partial f}{\partial x} = ye^{xy}, \qquad
\frac{\partial^2 f}{\partial x^2} = y^2 e^{xy}, \qquad
\frac{\partial^3 f}{\partial y\,\partial x^2} = 2ye^{xy} + xy^2 e^{xy}.
\]
□

3.4 The Chain Rule

The chain rule for functions of a single variable says that when $y = f(x)$ is a differentiable function of $x$ and $x = g(t)$ is a differentiable function of $t$, $y$ becomes a differentiable function of $t$ and $dy/dt$ can be calculated with the formula
\[ \frac{dy}{dt} = \frac{dy}{dx}\,\frac{dx}{dt}. \]
It is now time to extend the chain rule out to more complicated situations. Notice that in the above the derivative $dy/dt$ really does make sense, since if we plug in for $x$ then $y$ really will be a function of $t$. One way to remember this form of the chain rule is to note that if we think of the two derivatives on the right side as fractions, the $dx$'s will cancel to give the same derivative on both sides.

As with many topics in multivariable calculus, there are in fact many different formulas depending on the number of variables that we are dealing with. So, let us start this discussion off with a function of two variables, $z = f(x, y)$. From this point there are still many different possibilities that we can look at. We will be looking at two distinct cases prior to generalizing the whole idea.

Case 1. $z = f(x, y)$, $x = g(t)$, $y = h(t)$ and compute $\dfrac{dz}{dt}$.

This case is analogous to the standard chain rule from single variable calculus that we looked at above. In this case we are going to compute an ordinary derivative, since $z$ really would be a function of $t$ only if we substitute in for $x$ and $y$. The chain rule for this case is
\[ \frac{dz}{dt} = \frac{\partial f}{\partial x}\,\frac{dx}{dt} + \frac{\partial f}{\partial y}\,\frac{dy}{dt}. \]
So, basically what we have done here is differentiate $f$ with respect to each variable in it and then multiply each of these by the derivative of that variable with respect to $t$. The final step is to add them all up. Let us take a look at a couple of examples.

Example 3.4.1 (Chain rule) Compute $\dfrac{dz}{dt}$ for each of the following.
(a) $z = xe^{xy}$, $x = t^2$, $y = t^{-1}$.
(b) $z = x^2 y^3 + y\cos x$, $x = \ln t$, $y = \sin(4t)$.

Solution
(a) We may directly apply the formula
\[
\begin{aligned}
\frac{dz}{dt} &= \frac{\partial f}{\partial x}\,\frac{dx}{dt} + \frac{\partial f}{\partial y}\,\frac{dy}{dt} \\
&= \left(e^{xy} + xye^{xy}\right)(2t) + x^2 e^{xy}\left(-t^{-2}\right) \\
&= 2t(1 + xy)\,e^{xy} - t^{-2} x^2 e^{xy}.
\end{aligned}
\]
So, technically we have computed the derivative. However, we should probably go ahead and substitute in for $x$ and $y$ as well at this point since we have already got $t$'s in the derivative. Doing this gives
\[ \frac{dz}{dt} = 2t(1 + t)\,e^{t} - t^{-2}\,t^4 e^{t} = \left(2t + t^2\right)e^{t}. \]
Note that in this case it might actually be easier to just substitute in for $x$ and $y$ in the original function and compute the derivative as we normally would. For comparison purposes let us do that.
\[ z = t^2 e^{t} \quad\Longrightarrow\quad \frac{dz}{dt} = 2te^{t} + t^2 e^{t}. \]
The same result for less work. Note however, that often it will actually be more work to do the substitution first.

(b) In this case it would almost definitely be more work to do the substitution first, so we will use the chain rule first and then substitute.
\[
\begin{aligned}
\frac{dz}{dt} &= \left(2xy^3 - y\sin x\right)\left(\frac{1}{t}\right) + \left(3x^2 y^2 + \cos x\right)\left(4\cos(4t)\right) \\
&= \frac{2\sin^3(4t)\ln t - \sin(4t)\sin(\ln t)}{t} + 4\cos(4t)\left(3\sin^2(4t)\,(\ln t)^2 + \cos(\ln t)\right).
\end{aligned}
\]
Note that sometimes, because of the significant mess of the final answer, we will only simplify the first step a little and leave the answer in terms of $x$, $y$, and $t$. This depends on the situation, class and instructor; however, this kind of substitution work is not necessary in the examinations for this class. □
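The two routes to $dz/dt$ in part (a) of Example 3.4.1 (chain rule versus substituting first) can be compared numerically. In this minimal sketch the function names `dz_dt_chain` and `dz_dt_direct` and the sample value of `t` are illustrative assumptions.

```python
import math

def dz_dt_chain(t):
    """Case 1 chain rule: f_x * dx/dt + f_y * dy/dt for z = x e^{xy}."""
    x, y = t**2, 1.0 / t
    dx_dt, dy_dt = 2 * t, -1.0 / t**2
    fx = math.exp(x * y) + x * y * math.exp(x * y)
    fy = x**2 * math.exp(x * y)
    return fx * dx_dt + fy * dy_dt

def dz_dt_direct(t):
    """Derivative of z = t^2 e^t obtained by substituting first."""
    return (2 * t + t**2) * math.exp(t)

t = 1.5  # hypothetical sample value, any t != 0 works
print(dz_dt_chain(t), dz_dt_direct(t))
```

The two printed values should coincide up to rounding, which reflects the "same result for less work" remark in the solution.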

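Now, there is a special case that we should take a quick look at before moving on to the next case. Let us suppose that we have the following situation,
\[ z = f(x, y) \quad\text{and}\quad y = g(x). \]
In this case the chain rule for $\dfrac{dz}{dx}$ becomes
\[ \frac{dz}{dx} = \frac{\partial f}{\partial x}\,\frac{dx}{dx} + \frac{\partial f}{\partial y}\,\frac{dy}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\,\frac{dy}{dx}. \]
In the first term we used the fact that
\[ \frac{dx}{dx} = \frac{d}{dx}(x) = 1. \]
Let us take a quick look at an example.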


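Example 3.4.2 (Chain rule) Compute $\dfrac{dz}{dx}$ for $z = x\ln(xy) + y^3$ and $y = \cos(x^2 + 1)$.

Solution We just plug into the formula.
\[
\begin{aligned}
\frac{dz}{dx} &= \left(\ln(xy) + x\,\frac{y}{xy}\right) + \left(x\,\frac{x}{xy} + 3y^2\right)\left(-2x\sin(x^2 + 1)\right) \\
&= \ln\!\left(x\cos(x^2 + 1)\right) + 1 - 2x\sin(x^2 + 1)\left(\frac{x}{\cos(x^2 + 1)} + 3\cos^2(x^2 + 1)\right) \\
&= \ln\!\left(x\cos(x^2 + 1)\right) + 1 - 2x^2\tan(x^2 + 1) - 6x\sin(x^2 + 1)\cos^2(x^2 + 1).
\end{aligned}
\]
□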

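Let us take a look at the second case.

Case 2. $z = f(x, y)$, $x = g(s, t)$, $y = h(s, t)$ and compute $\dfrac{\partial z}{\partial s}$ and $\dfrac{\partial z}{\partial t}$.

In this case if we substitute in for $x$ and $y$ we find that $z$ is a function of $s$ and $t$, and so it makes sense that we will be computing partial derivatives here and that there will be two of them. Here is the chain rule for both of these partial derivatives.
\[
\frac{\partial z}{\partial s} = \frac{\partial f}{\partial x}\,\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\,\frac{\partial y}{\partial s}
\quad\text{and}\quad
\frac{\partial z}{\partial t} = \frac{\partial f}{\partial x}\,\frac{\partial x}{\partial t} + \frac{\partial f}{\partial y}\,\frac{\partial y}{\partial t}.
\]
So, not surprisingly, these are very similar to the first case that we looked at. Here is a quick example of this kind of chain rule.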

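Example 3.4.3 (Chain rule) Find $\dfrac{\partial z}{\partial s}$ and $\dfrac{\partial z}{\partial t}$ for $z = e^{2r}\sin(3\theta)$, $r = st - t^2$, $\theta = \sqrt{s^2 + t^2}$.

Solution Here is the chain rule for $\dfrac{\partial z}{\partial s}$.
\[
\begin{aligned}
\frac{\partial z}{\partial s} &= \left(2e^{2r}\sin(3\theta)\right)(t) + \left(3e^{2r}\cos(3\theta)\right)\frac{s}{\sqrt{s^2 + t^2}} \\
&= 2t\,e^{2(st - t^2)}\sin\!\left(3\sqrt{s^2 + t^2}\right) + \frac{3s\,e^{2(st - t^2)}\cos\!\left(3\sqrt{s^2 + t^2}\right)}{\sqrt{s^2 + t^2}}.
\end{aligned}
\]
Now the chain rule for $\dfrac{\partial z}{\partial t}$.
\[
\begin{aligned}
\frac{\partial z}{\partial t} &= \left(2e^{2r}\sin(3\theta)\right)(s - 2t) + \left(3e^{2r}\cos(3\theta)\right)\frac{t}{\sqrt{s^2 + t^2}} \\
&= 2(s - 2t)\,e^{2(st - t^2)}\sin\!\left(3\sqrt{s^2 + t^2}\right) + \frac{3t\,e^{2(st - t^2)}\cos\!\left(3\sqrt{s^2 + t^2}\right)}{\sqrt{s^2 + t^2}}.
\end{aligned}
\]
□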
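The Case 2 formulas of Example 3.4.3 can be checked against central differences of the composed function $z(s, t)$. The sketch below assumes the intermediate variables are $r = st - t^2$ and $\theta = \sqrt{s^2 + t^2}$; the sample point `(s0, t0)` and the step size `h` are hypothetical choices.

```python
import math

def z(s, t):
    r = s * t - t**2
    theta = math.sqrt(s**2 + t**2)
    return math.exp(2 * r) * math.sin(3 * theta)

def dz_ds(s, t):
    """Chain rule: z_r * r_s + z_theta * theta_s."""
    r = s * t - t**2
    theta = math.sqrt(s**2 + t**2)
    z_r = 2 * math.exp(2 * r) * math.sin(3 * theta)
    z_theta = 3 * math.exp(2 * r) * math.cos(3 * theta)
    return z_r * t + z_theta * s / theta

def dz_dt(s, t):
    """Chain rule: z_r * r_t + z_theta * theta_t."""
    r = s * t - t**2
    theta = math.sqrt(s**2 + t**2)
    z_r = 2 * math.exp(2 * r) * math.sin(3 * theta)
    z_theta = 3 * math.exp(2 * r) * math.cos(3 * theta)
    return z_r * (s - 2 * t) + z_theta * t / theta

s0, t0, h = 1.0, 0.5, 1e-6
fd_s = (z(s0 + h, t0) - z(s0 - h, t0)) / (2 * h)
fd_t = (z(s0, t0 + h) - z(s0, t0 - h)) / (2 * h)
print(fd_s, dz_ds(s0, t0))
print(fd_t, dz_dt(s0, t0))
```

Each printed pair should agree to several decimal places; a mismatch would usually point to a dropped factor such as $\partial r/\partial t = s - 2t$.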


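We have seen a couple of cases for the chain rule. Let us now see the general version of the chain rule.

Chain Rule Suppose that $z$ is a function of $n$ variables $x_1, x_2, \ldots, x_n$, and that each of these variables is in turn a function of $m$ variables $t_1, t_2, \ldots, t_m$. Then for any variable $t_i$ ($i = 1, 2, \ldots, m$), we have the following,
\[ \frac{\partial z}{\partial t_i} = \frac{\partial z}{\partial x_1}\,\frac{\partial x_1}{\partial t_i} + \frac{\partial z}{\partial x_2}\,\frac{\partial x_2}{\partial t_i} + \cdots + \frac{\partial z}{\partial x_n}\,\frac{\partial x_n}{\partial t_i}. \]
This is a bit troublesome. There is actually an easier way to construct all the chain rules that we have discussed in this section or will look at in later examples. We can build up a tree diagram that will give us the chain rule for any situation. To see how these work let us go back and take a look at the chain rule for $\partial z/\partial s$ given that $z = f(x, y)$, $x = g(s, t)$, $y = h(s, t)$. Of course we already know the answer, but the following tree diagram is used as an illustration. For reference, here is the chain rule for this case,
\[ \frac{\partial z}{\partial s} = \frac{\partial f}{\partial x}\,\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\,\frac{\partial y}{\partial s}. \]
Here is the tree diagram for this case.

[Tree diagram: $z$ at the top, with branches to $x$ and $y$ labelled $\partial z/\partial x$ and $\partial z/\partial y$; $x$ branches to $s$ and $t$ with labels $\partial x/\partial s$ and $\partial x/\partial t$; $y$ branches to $s$ and $t$ with labels $\partial y/\partial s$ and $\partial y/\partial t$.]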
We start at the top with the function itself and then branch out from that point. The first set of branches is for the variables in the function. From each of these endpoints we put down a further set of branches that gives the variables that both $x$ and $y$ are functions of. We connect each letter with a line and each line represents a partial derivative as shown. Note that the letter in the numerator of the partial derivative is the upper node of the tree and the letter in the denominator of the partial derivative is the lower node of the tree. To use this to get the chain rule we start at the bottom, and for each branch that ends with the variable we want to take the derivative with respect to ($s$ in this case) we move up the tree until we hit the top, multiplying the derivatives that we see along that set of branches. Once we have done this for each branch that ends at $s$, we then add the results up to get the chain rule for that given situation. Note that we do not usually put the derivatives in the tree. They are always an assumed part of the tree. Let us write down the chain rules for a couple of examples.

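Example 3.4.4 (Chain rule) Use a tree diagram to write down the chain rule for the given derivatives.
(a) $\dfrac{dw}{dt}$ for $w = f(x, y, z)$, $x = g_1(t)$, $y = g_2(t)$, $z = g_3(t)$.
(b) $\dfrac{\partial w}{\partial r}$ for $w = f(x, y, z)$, $x = g_1(r, s, t)$, $y = g_2(r, s, t)$, $z = g_3(r, s, t)$.

Solution
(a) We first draw the tree diagram.

[Tree diagram: $w$ at the top, branching to $x$, $y$ and $z$, each of which branches to $t$.]

From this tree diagram we know that the chain rule is given by
\[ \frac{dw}{dt} = \frac{\partial f}{\partial x}\,\frac{dx}{dt} + \frac{\partial f}{\partial y}\,\frac{dy}{dt} + \frac{\partial f}{\partial z}\,\frac{dz}{dt}, \]
which is really just a natural extension of the two variable case that we saw before.

(b) Here is the tree diagram for this situation.

[Tree diagram: $w$ at the top, branching to $x$, $y$ and $z$, each of which branches to $r$, $s$ and $t$.]

From this tree diagram we know that the chain rule is given by
\[ \frac{\partial w}{\partial r} = \frac{\partial f}{\partial x}\,\frac{\partial x}{\partial r} + \frac{\partial f}{\partial y}\,\frac{\partial y}{\partial r} + \frac{\partial f}{\partial z}\,\frac{\partial z}{\partial r}. \]
□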

So, provided we can construct the tree diagram, it is not too difficult to write down the chain rule for any setup that we might run across. We have now seen how to take the first-order derivatives in these more complicated situations, but what about higher-order derivatives? How do we do these? It is probably easiest to see how to deal with these with an example.


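Example 3.4.5 (Chain rule) Compute $\dfrac{\partial^2 z}{\partial \theta^2}$ for $z = f(x, y)$ if $x = r\cos\theta$ and $y = r\sin\theta$.

Solution We will need the first-order derivative before we can even think about finding the second-order derivative, so let us get that. This situation falls into the second case that we looked at above so we do not need a new tree diagram. The following is the first-order derivative.
\[
\frac{\partial f}{\partial \theta} = \frac{\partial f}{\partial x}\,\frac{\partial x}{\partial \theta} + \frac{\partial f}{\partial y}\,\frac{\partial y}{\partial \theta}
= -r\sin\theta\,\frac{\partial f}{\partial x} + r\cos\theta\,\frac{\partial f}{\partial y}.
\]
Now, the second-order derivative is given by
\[
\frac{\partial^2 f}{\partial \theta^2} = \frac{\partial}{\partial \theta}\left(\frac{\partial f}{\partial \theta}\right)
= \frac{\partial}{\partial \theta}\left(-r\sin\theta\,\frac{\partial f}{\partial x} + r\cos\theta\,\frac{\partial f}{\partial y}\right).
\]
The issue here is to deal with this derivative correctly. The two first-order derivatives, $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$, are both functions of $x$ and $y$, which are in turn functions of $r$ and $\theta$, so both of these terms are products. Using the product rule gives the following,
\[
\frac{\partial^2 f}{\partial \theta^2} = -r\cos\theta\,\frac{\partial f}{\partial x} - r\sin\theta\,\frac{\partial}{\partial \theta}\left(\frac{\partial f}{\partial x}\right) - r\sin\theta\,\frac{\partial f}{\partial y} + r\cos\theta\,\frac{\partial}{\partial \theta}\left(\frac{\partial f}{\partial y}\right).
\]
We now need to determine what $\dfrac{\partial}{\partial \theta}\left(\dfrac{\partial f}{\partial x}\right)$ and $\dfrac{\partial}{\partial \theta}\left(\dfrac{\partial f}{\partial y}\right)$ will be. These are both chain rule problems again, since both of the derivatives are functions of $x$ and $y$ and we want to take the derivative with respect to $\theta$.
\[
\begin{aligned}
\frac{\partial}{\partial \theta}\left(\frac{\partial f}{\partial x}\right)
&= -r\sin\theta\,\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) + r\cos\theta\,\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right)
= -r\sin\theta\,\frac{\partial^2 f}{\partial x^2} + r\cos\theta\,\frac{\partial^2 f}{\partial y\,\partial x}, \\
\frac{\partial}{\partial \theta}\left(\frac{\partial f}{\partial y}\right)
&= -r\sin\theta\,\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) + r\cos\theta\,\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right)
= -r\sin\theta\,\frac{\partial^2 f}{\partial x\,\partial y} + r\cos\theta\,\frac{\partial^2 f}{\partial y^2}.
\end{aligned}
\]
The final step is to plug these back into the second-order derivative and do some simplifying. In the last line we combine the two mixed partial derivatives, which are equal when the conditions of Theorem 3.3.1 hold.
\[
\begin{aligned}
\frac{\partial^2 f}{\partial \theta^2}
&= -r\cos\theta\,\frac{\partial f}{\partial x} - r\sin\theta\left(-r\sin\theta\,\frac{\partial^2 f}{\partial x^2} + r\cos\theta\,\frac{\partial^2 f}{\partial y\,\partial x}\right) \\
&\quad - r\sin\theta\,\frac{\partial f}{\partial y} + r\cos\theta\left(-r\sin\theta\,\frac{\partial^2 f}{\partial x\,\partial y} + r\cos\theta\,\frac{\partial^2 f}{\partial y^2}\right) \\
&= -r\cos\theta\,\frac{\partial f}{\partial x} + r^2\sin^2\theta\,\frac{\partial^2 f}{\partial x^2} - r^2\sin\theta\cos\theta\,\frac{\partial^2 f}{\partial y\,\partial x} \\
&\quad - r\sin\theta\,\frac{\partial f}{\partial y} - r^2\sin\theta\cos\theta\,\frac{\partial^2 f}{\partial x\,\partial y} + r^2\cos^2\theta\,\frac{\partial^2 f}{\partial y^2} \\
&= -r\cos\theta\,\frac{\partial f}{\partial x} - r\sin\theta\,\frac{\partial f}{\partial y} + r^2\sin^2\theta\,\frac{\partial^2 f}{\partial x^2} - 2r^2\sin\theta\cos\theta\,\frac{\partial^2 f}{\partial y\,\partial x} + r^2\cos^2\theta\,\frac{\partial^2 f}{\partial y^2}.
\end{aligned}
\]
It is long and fairly messy but there it is. □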
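Because the final formula of Example 3.4.5 is easy to get wrong, it can help to test it on a concrete function. The sketch below uses the sample function $f(x, y) = x^2 y$ (an assumption chosen only because its partial derivatives are simple) and compares the formula with a second central difference in $\theta$. The function names and the step size `h` are hypothetical.

```python
import math

def f(x, y):
    # Hypothetical test function with easy partial derivatives.
    return x**2 * y

def second_theta_formula(r, theta):
    """Evaluate the final formula of Example 3.4.5 for f(x, y) = x^2 y."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    fx, fy = 2 * x * y, x**2
    fxx, fxy, fyy = 2 * y, 2 * x, 0.0
    s, c = math.sin(theta), math.cos(theta)
    return (-r * c * fx - r * s * fy
            + r**2 * s**2 * fxx
            - 2 * r**2 * s * c * fxy
            + r**2 * c**2 * fyy)

def second_theta_numeric(r, theta, h=1e-4):
    """Second central difference of theta -> f(r cos(theta), r sin(theta))."""
    g = lambda th: f(r * math.cos(th), r * math.sin(th))
    return (g(theta + h) - 2 * g(theta) + g(theta - h)) / h**2

print(second_theta_formula(2.0, 0.7), second_theta_numeric(2.0, 0.7))
```

For this test function the composed map is $r^3\cos^2\theta\,\sin\theta$, whose second derivative in $\theta$ is $r^3\left(2\sin^3\theta - 7\cos^2\theta\,\sin\theta\right)$, so both printed values can also be compared against that closed form.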

Implicit Differentiation
The final topic in this section is a revisiting of implicit differentiation. With these forms of the chain rule implicit differentiation actually becomes a fairly simple process. We will start with a function in the form $F(x, y) = 0$ (if it is not in this form simply move everything to one side of the equal sign to get it into this form) where $y = y(x)$. In a single variable calculus course we were asked to compute $\dfrac{dy}{dx}$ and this was often a fairly messy process. Using the chain rule from this section however we can get a nice simple formula for doing this. We will start by differentiating both sides with respect to $x$. This will mean using the chain rule on the left side, and the right side will differentiate to zero. Here is the result of that.
\[ F_x + F_y\,\frac{dy}{dx} = 0 \quad\Longrightarrow\quad \frac{dy}{dx} = -\frac{F_x}{F_y}. \]
As shown, all we need to do next is solve for $\dfrac{dy}{dx}$ and we now have a very nice formula to use for implicit differentiation. Note as well that in order to simplify the formula we switched back to using the subscript notation for the derivatives. Let us check out a quick example.

Example 3.4.6 (Implicit differentiation) Find $\dfrac{dy}{dx}$ for $x\cos(3y) + x^3 y^5 = 3x - e^{xy}$.

Solution The first step is to get a zero on one side of the equal sign and that is easy enough to do.
\[ x\cos(3y) + x^3 y^5 - 3x + e^{xy} = 0. \]
Now, the function on the left is $F(x, y)$ in our formula so all we need to do is use the formula to find the derivative.
\[ \frac{dy}{dx} = -\frac{\cos(3y) + 3x^2 y^5 - 3 + ye^{xy}}{-3x\sin(3y) + 5x^3 y^4 + xe^{xy}}. \]
□
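The formula $dy/dx = -F_x/F_y$ lends itself to a small generic helper. The sketch below estimates $F_x$ and $F_y$ by central differences and compares the result with the closed form of Example 3.4.6. The names `implicit_slope`, `slope_formula` and `circle` are assumptions for illustration. Note that the comparison at $(1, 1)$ only checks that the two expressions for $-F_x/F_y$ agree as functions of $(x, y)$; that point is not claimed to lie on the curve. The circle $x^2 + y^2 = 25$ at $(3, 4)$ is added as a familiar case where the point does lie on the curve.

```python
import math

def F(x, y):
    # Left-hand side of Example 3.4.6 after moving everything to one side.
    return x * math.cos(3 * y) + x**3 * y**5 - 3 * x + math.exp(x * y)

def implicit_slope(F, x, y, h=1e-6):
    """Estimate dy/dx = -F_x / F_y with central differences."""
    Fx = (F(x + h, y) - F(x - h, y)) / (2 * h)
    Fy = (F(x, y + h) - F(x, y - h)) / (2 * h)
    return -Fx / Fy

def slope_formula(x, y):
    """Closed form from Example 3.4.6."""
    num = math.cos(3 * y) + 3 * x**2 * y**5 - 3 + y * math.exp(x * y)
    den = -3 * x * math.sin(3 * y) + 5 * x**3 * y**4 + x * math.exp(x * y)
    return -num / den

def circle(x, y):
    # Extra illustration: x^2 + y^2 = 25 written as F(x, y) = 0.
    return x**2 + y**2 - 25

print(implicit_slope(F, 1.0, 1.0), slope_formula(1.0, 1.0))
print(implicit_slope(circle, 3.0, 4.0))   # slope of the circle at (3, 4)
```

For the circle the helper should return a value close to $-3/4$, the slope one gets from differentiating $y = \sqrt{25 - x^2}$ directly.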

We can also do something similar to handle the types of implicit differentiation problems involving partial derivatives like those we saw when we first introduced partial derivatives. In these cases we will start off with a function in the form $F(x, y, z) = 0$, assume that $z = f(x, y)$, and we want to find $\dfrac{\partial z}{\partial x}$ and/or $\dfrac{\partial z}{\partial y}$. Let us start by trying to find $\dfrac{\partial z}{\partial x}$. We will differentiate both sides with respect to $x$ and we will need to remember that we are going to be treating $y$ as a constant. Also, the left side will require the chain rule. Here is the derivative.
\[ \frac{\partial F}{\partial x}\,\frac{\partial x}{\partial x} + \frac{\partial F}{\partial y}\,\frac{\partial y}{\partial x} + \frac{\partial F}{\partial z}\,\frac{\partial z}{\partial x} = 0. \]
Now, we have the following,
\[ \frac{\partial x}{\partial x} = 1 \quad\text{and}\quad \frac{\partial y}{\partial x} = 0. \]
The first is because we are just differentiating $x$ with respect to $x$ and we know that is 1. The second is because we are treating $y$ as a constant and so it will differentiate to zero. Plugging these in and solving for $\dfrac{\partial z}{\partial x}$ gives
\[ \frac{\partial z}{\partial x} = -\frac{F_x}{F_z}. \]


A similar argument can be used to show that
\[ \frac{\partial z}{\partial y} = -\frac{F_y}{F_z}. \]
As with the one variable case we switched to the subscript notation for derivatives to simplify the formulas. Let us take a quick look at an example of this.

Example 3.4.7 (Implicit differentiation) Find $\dfrac{\partial z}{\partial x}$ and $\dfrac{\partial z}{\partial y}$ for $x^2\sin(2y - 5z) = 1 + y\cos(6xz)$.

Solution This is one of the functions discussed in Example 3.3.4. You might go back and compare the two approaches. First let us get everything on one side,
\[ x^2\sin(2y - 5z) - 1 - y\cos(6xz) = 0. \]
Now, the function on the left is $F(x, y, z)$ and so all we need to do is use the formulas developed above to find the derivatives,
\[
\begin{aligned}
\frac{\partial z}{\partial x} &= -\frac{2x\sin(2y - 5z) + 6yz\sin(6xz)}{-5x^2\cos(2y - 5z) + 6xy\sin(6xz)}, \\
\frac{\partial z}{\partial y} &= -\frac{2x^2\cos(2y - 5z) - \cos(6xz)}{-5x^2\cos(2y - 5z) + 6xy\sin(6xz)}.
\end{aligned}
\]
If you go back and compare these answers to those that we found the first time around you will notice that they might appear to be different. However, if you take into account the minus sign that sits in front of our answers here you will see that they are in fact the same. □

3.5 Directional Derivatives

So far we have only looked at the two partial derivatives $f_x(x, y)$ and $f_y(x, y)$. Recall that these derivatives represent the rate of change of $f$ as we vary $x$ (holding $y$ fixed) and as we vary $y$ (holding $x$ fixed) respectively. We now need to discuss how to find the rate of change of $f$ if we allow both $x$ and $y$ to change simultaneously. The problem here is that there are many ways to allow both $x$ and $y$ to change. For instance one could be changing faster than the other, and then there is also the issue of whether each is increasing or decreasing. So, before we get into finding the rate of change we need to get a couple of preliminary ideas out of the way first. The main idea that we need to look at is just how we are going to define the changing of $x$ and/or $y$.

Let us start off by supposing that we want the rate of change of $f$ at a particular point, say $(x_0, y_0)$. Let us also suppose that both $x$ and $y$ are increasing and that, in this case, $x$ is increasing twice as fast as $y$ is increasing. So, as $y$ increases one unit of measure $x$ will increase two units of measure. To help us see how we are going to define this change let us suppose that a particle is sitting at $(x_0, y_0)$ and the particle will move in the direction given by the changing $x$ and $y$. Therefore, the particle will move off in a direction of increasing $x$ and $y$ and the $x$-coordinate of the point will increase twice as fast as the $y$-coordinate. Now that we are thinking of this changing $x$ and $y$ as a direction of movement, we have a way of defining the change. We know that vectors can be used to define a direction and so the particle, at this point, can be said to be moving in the direction
\[ \mathbf{v} = \langle 2, 1 \rangle. \]
Since this vector can be used to define how a particle at a point is changing, we can also use it to describe how $x$ and/or $y$ is changing at a point. For our example we will say that we want the rate of change of $f$ in the direction of $\mathbf{v} = \langle 2, 1 \rangle$. In this way we will know that $x$ is increasing twice as fast as $y$ is. There is still a small problem with this, however. There are many vectors that point in the same direction. For instance, all of the following vectors point in the same direction as $\mathbf{v} = \langle 2, 1 \rangle$,
\[ \mathbf{v} = \left\langle \frac{1}{5}, \frac{1}{10} \right\rangle, \qquad \mathbf{v} = \langle 6, 3 \rangle, \qquad \mathbf{v} = \left\langle \frac{2}{\sqrt{5}}, \frac{1}{\sqrt{5}} \right\rangle. \]
We need a way to consistently find the rate of change of a function in a given direction. We will do this by insisting that the vector that defines the direction of change be a unit vector. Recall that a unit vector is a vector with length, or magnitude, of 1. This means that for the example that we started off thinking about we would want to use
\[ \mathbf{v} = \left\langle \frac{2}{\sqrt{5}}, \frac{1}{\sqrt{5}} \right\rangle, \]
since this is the unit vector that points in the direction of change. For reference purposes recall that the magnitude or length of the vector $\mathbf{v} = \langle a, b, c \rangle$ is given by
\[ \|\mathbf{v}\| = \sqrt{a^2 + b^2 + c^2}. \]
For two-dimensional vectors we drop the $c$ from the formula.

Sometimes we will give the direction of changing $x$ and $y$ as an angle. For instance, we may say that we want the rate of change of $f$ in the direction of $\theta = \pi/3$. The unit vector that points in this direction is given by
\[ \mathbf{u} = \langle \cos\theta, \sin\theta \rangle. \]
Now that we know how to define the direction of changing $x$ and $y$ it is time to start talking about finding the rate of change of $f$ in this direction. Let us first give the formal definition.

3.5.1 Denition The rate of change of f (x, y) in the direction of the unit vector u = a, b is called the directional derivative and is denoted by Du f (x, y). The denition of the directional derivative is Du f (x, y) = lim f (x + ah, y + bh) f (x, y) . h

h0

So, the denition of the directional derivative is very similar to the denition of partial derivatives. However, in practice this can be a very dicult limit to compute so we need an easier way of taking directional derivatives. It is actually fairly simple to derive an equivalent formula for taking directional derivatives. To see how we can do this let us dene a new function of a single variable, g(z) = f (x0 + az, y0 + bz), where x0 , y0 , a, and b are some xed numbers. Note that this really is a function of a single variable now since z is the only letter that is not representing a xed number. Then by the denition of the derivative for functions of a single variable we have g (z) = lim
h0

g(z + h) g(z) h

67

3. Partial Derivatives

and the derivative at z = 0 is given by g (0) = lim If we now substitute in for g(z) we get g (0) = lim g(h) g(0) f (x0 + ah, y0 + bh) f (x0 , y0 ) = lim = Du f (x0 , y0 ). h0 h h (3.1) g(h) g(0) . h

h0

h0

Now let us look at this from another perspective. Let us rewrite g(z) as follows, g(z) = f (x, y), where x = x0 + az and y = y0 + bz.

We can now use the chain rule to compute
\[ g'(z) = \frac{dg}{dz} = \frac{\partial f}{\partial x}\frac{dx}{dz} + \frac{\partial f}{\partial y}\frac{dy}{dz} = f_x(x, y)\, a + f_y(x, y)\, b. \]

So, from the chain rule we get the following relationship
\[ g'(z) = f_x(x, y)\, a + f_y(x, y)\, b. \tag{3.2} \]

If we now take $z = 0$ we get $x = x_0$ and $y = y_0$ (from how we defined $x$ and $y$ above), and plugging these into (3.2) gives
\[ g'(0) = f_x(x_0, y_0)\, a + f_y(x_0, y_0)\, b. \tag{3.3} \]
Now, simply equate (3.1) and (3.3) to get
\[ D_{\mathbf{u}} f(x_0, y_0) = g'(0) = f_x(x_0, y_0)\, a + f_y(x_0, y_0)\, b. \]
If we now go back to allowing $x$ and $y$ to be any numbers we get the following formula for computing directional derivatives.
\[ D_{\mathbf{u}} f(x, y) = f_x(x, y)\, a + f_y(x, y)\, b. \]
This is much simpler than the limit definition. Also note that this derivation assumed that we were working with functions of two variables. Similar formulas can be derived by the same type of argument for functions of more than two variables. For instance, the directional derivative of $f(x, y, z)$ in the direction of the unit vector $\mathbf{u} = \langle a, b, c \rangle$ is given by
\[ D_{\mathbf{u}} f(x, y, z) = f_x(x, y, z)\, a + f_y(x, y, z)\, b + f_z(x, y, z)\, c. \]
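As a quick numerical sanity check of the two-variable formula, the sketch below compares the difference quotient from the limit definition with $f_x a + f_y b$. The function $f(x, y) = x^2 y + \sin y$, the point $(1, 2)$ and the step size are illustrative assumptions, not taken from the notes.

```python
import math

# Illustrative function (an assumption, not from the notes).
def f(x, y):
    return x**2 * y + math.sin(y)

# Its first-order partial derivatives, computed by hand.
def f_x(x, y):
    return 2 * x * y

def f_y(x, y):
    return x**2 + math.cos(y)

def directional_derivative_formula(x, y, a, b):
    # D_u f = f_x * a + f_y * b for the unit vector u = <a, b>
    return f_x(x, y) * a + f_y(x, y) * b

def directional_derivative_limit(x, y, a, b, h=1e-6):
    # Difference quotient from the limit definition, with a small fixed h
    return (f(x + a * h, y + b * h) - f(x, y)) / h

theta = math.pi / 3
a, b = math.cos(theta), math.sin(theta)   # unit vector at angle theta

exact = directional_derivative_formula(1.0, 2.0, a, b)
approx = directional_derivative_limit(1.0, 2.0, a, b)
print(exact, approx)
```

The two printed values should agree to several decimal places; shrinking `h` further eventually makes rounding error dominate, which is one practical reason to prefer the formula.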

Let us work out a couple of examples.

Example 3.5.1 (Directional derivative) Find each of the directional derivatives.
(a) $D_{\mathbf{u}} f(2, 0)$, where $f(x, y) = x e^{xy} + y$ and $\mathbf{u}$ is the unit vector in the direction of $\theta = 2\pi/3$.
(b) $D_{\mathbf{u}} f(x, y, z)$, where $f(x, y, z) = x^2 z + y^3 z^2 - xyz$ and $\mathbf{u}$ is the unit vector in the direction of $\mathbf{v} = \langle -1, 0, 3 \rangle$.

Solution


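(a) We will first find $D_{\mathbf{u}} f(x, y)$ and then use this formula for finding $D_{\mathbf{u}} f(2, 0)$. The unit vector giving the direction is
\[ \mathbf{u} = \left\langle \cos\frac{2\pi}{3}, \sin\frac{2\pi}{3} \right\rangle = \left\langle -\frac{1}{2}, \frac{\sqrt{3}}{2} \right\rangle. \]
So, the directional derivative is
\[ D_{\mathbf{u}} f(x, y) = -\frac{1}{2}\left(e^{xy} + xy e^{xy}\right) + \frac{\sqrt{3}}{2}\left(x^2 e^{xy} + 1\right), \]
\[ D_{\mathbf{u}} f(2, 0) = -\frac{1}{2}(1) + \frac{\sqrt{3}}{2}(5) = \frac{5\sqrt{3} - 1}{2}. \]
(b) In this case let us first check whether the direction vector is a unit vector and, if it is not, convert it into one. To do this all we need to do is compute its magnitude.
\[ \|\mathbf{v}\| = \sqrt{1 + 0 + 9} = \sqrt{10}. \]
So, it is not a unit vector. Recall that we can normalize it into the unit vector
\[ \mathbf{u} = \frac{1}{\sqrt{10}} \langle -1, 0, 3 \rangle = \left\langle -\frac{1}{\sqrt{10}}, 0, \frac{3}{\sqrt{10}} \right\rangle. \]
The directional derivative is then
\[ D_{\mathbf{u}} f(x, y, z) = -\frac{1}{\sqrt{10}}\left(2xz - yz\right) + 0\left(3y^2 z^2 - xz\right) + \frac{3}{\sqrt{10}}\left(x^2 + 2y^3 z - xy\right) = \frac{1}{\sqrt{10}}\left(3x^2 + 6y^3 z - 3xy - 2xz + yz\right). \]
□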
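The answers to Example 3.5.1 can be checked numerically. The sketch below uses a central difference along the normalized direction; the helper name `directional_derivative`, the step size, and the sample point $(1, 2, 3)$ used for part (b) are assumptions made for illustration.

```python
import math

def directional_derivative(f, point, direction, h=1e-5):
    """Central-difference estimate of D_u f at `point`, where u is
    `direction` scaled to unit length (hypothetical helper)."""
    norm = math.sqrt(sum(c * c for c in direction))
    u = [c / norm for c in direction]
    forward = [p + h * c for p, c in zip(point, u)]
    backward = [p - h * c for p, c in zip(point, u)]
    return (f(*forward) - f(*backward)) / (2 * h)

# Part (a): f(x, y) = x e^{xy} + y at (2, 0), direction theta = 2*pi/3.
def f_a(x, y):
    return x * math.exp(x * y) + y

theta = 2 * math.pi / 3
numeric_a = directional_derivative(f_a, (2.0, 0.0), (math.cos(theta), math.sin(theta)))
exact_a = (5 * math.sqrt(3) - 1) / 2

# Part (b): f(x, y, z) = x^2 z + y^3 z^2 - xyz, direction v = <-1, 0, 3>,
# evaluated at the sample point (1, 2, 3).
def f_b(x, y, z):
    return x**2 * z + y**3 * z**2 - x * y * z

x, y, z = 1.0, 2.0, 3.0
numeric_b = directional_derivative(f_b, (x, y, z), (-1.0, 0.0, 3.0))
exact_b = (3 * x**2 + 6 * y**3 * z - 3 * x * y - 2 * x * z + y * z) / math.sqrt(10)

print(numeric_a, exact_a)
print(numeric_b, exact_b)
```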

There is another form of the formula for the directional derivative that is slightly better and somewhat more compact. It is also a much more general formula that will encompass both of the formulas above. Let us start with the second one and notice that we can rewrite it as follows.
\[ D_{\mathbf{u}} f(x, y, z) = f_x(x, y, z)\, a + f_y(x, y, z)\, b + f_z(x, y, z)\, c = \langle f_x, f_y, f_z \rangle \cdot \langle a, b, c \rangle. \]

In other words we can write the directional derivative as a dot product, and the second vector is nothing more than the unit vector $\mathbf{u}$ that gives the direction of change. Also, if we use this version for functions of two variables the third component will not be there, but other than that the formula will be the same. Now let us give a name and notation to the first vector in the dot product since this vector will show up fairly regularly throughout this course. The gradient of $f$ or gradient vector of $f$ is defined to be
\[ \nabla f = \langle f_x, f_y, f_z \rangle \quad \text{or} \quad \nabla f = \langle f_x, f_y \rangle. \]

Or, if we want to use the standard basis vectors the gradient is
\[ \nabla f = f_x\, \mathbf{i} + f_y\, \mathbf{j} + f_z\, \mathbf{k} \quad \text{or} \quad \nabla f = f_x\, \mathbf{i} + f_y\, \mathbf{j}. \]

The definition is only shown for functions of two or three variables; however, there is a natural extension to functions of any number of variables.


With the definition of the gradient we can now say that the directional derivative is given by
\[ D_{\mathbf{u}} f = \nabla f \cdot \mathbf{u}, \]
where we no longer show the variables and use this formula for any number of variables. Note as well that we will sometimes use the notation
\[ D_{\mathbf{u}} f(\mathbf{x}) = \nabla f(\mathbf{x}) \cdot \mathbf{u}, \]
where $\mathbf{x} = \langle x, y, z \rangle$ or $\mathbf{x} = \langle x, y \rangle$ as needed. This notation will be used when we want to note the variables in some way, but don't really want to restrict ourselves to a particular number of variables. In other words, $\mathbf{x}$ will be used to represent as many variables as we need in the formula, and we will most often use this notation when we are already using vectors or vector notation in the problem. Let us work out a couple of examples using this formula for the directional derivative.

Example 3.5.2 (Directional derivative) Find each of the directional derivatives.
(a) $D_{\mathbf{u}} f(2, 0)$ for $f(x, y) = x \cos y$ in the direction of $\mathbf{v} = \langle 2, 1 \rangle$.
(b) $D_{\mathbf{u}} f(x, y, z)$ for $f(x, y, z) = \sin(yz) + \ln x^2$ at $(1, 1, \pi)$ in the direction of $\mathbf{v} = \langle 1, 1, -1 \rangle$.

Solution

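(a) Let us first compute the gradient for this function.
\[ \nabla f = \langle \cos y, -x \sin y \rangle. \]
Also, as we saw earlier in this section the unit vector for the direction of $\mathbf{v}$ is
\[ \mathbf{u} = \left\langle \frac{2}{\sqrt{5}}, \frac{1}{\sqrt{5}} \right\rangle. \]
The directional derivative is then
\[ D_{\mathbf{u}} f(\mathbf{x}) = \langle \cos y, -x \sin y \rangle \cdot \left\langle \frac{2}{\sqrt{5}}, \frac{1}{\sqrt{5}} \right\rangle = \frac{1}{\sqrt{5}}\left(2 \cos y - x \sin y\right), \]
\[ D_{\mathbf{u}} f(2, 0) = \frac{1}{\sqrt{5}}(2 - 0) = \frac{2}{\sqrt{5}}. \]

(b) In this case we are asking for the directional derivative at a particular point. To do this we will first compute the gradient, evaluate it at the point in question and then do the dot product. So, let us get the gradient.
\[ \nabla f(x, y, z) = \left\langle \frac{2}{x}, z \cos(yz), y \cos(yz) \right\rangle, \]
\[ \nabla f(1, 1, \pi) = \left\langle \frac{2}{1}, \pi \cos \pi, \cos \pi \right\rangle = \langle 2, -\pi, -1 \rangle. \]
Next, we need the unit vector for the direction.
\[ \|\mathbf{v}\| = \sqrt{3}, \qquad \mathbf{u} = \left\langle \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, -\frac{1}{\sqrt{3}} \right\rangle. \]
Finally, the directional derivative at the point is
\[ D_{\mathbf{u}} f(1, 1, \pi) = \langle 2, -\pi, -1 \rangle \cdot \left\langle \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, -\frac{1}{\sqrt{3}} \right\rangle = \frac{1}{\sqrt{3}}(2 - \pi + 1) = \frac{3 - \pi}{\sqrt{3}}. \]
□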
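The gradient form $D_{\mathbf{u}} f = \nabla f \cdot \mathbf{u}$ translates directly into a few lines of code. The sketch below re-does Example 3.5.2 with hand-computed gradients; the helper names `dot` and `unit` are illustrative choices.

```python
import math

def dot(u, v):
    # Dot product of two vectors of equal length
    return sum(ui * vi for ui, vi in zip(u, v))

def unit(v):
    # Scale v to length 1
    norm = math.sqrt(dot(v, v))
    return tuple(vi / norm for vi in v)

# Part (a): f(x, y) = x cos y, gradient <cos y, -x sin y>
def grad_a(x, y):
    return (math.cos(y), -x * math.sin(y))

D_a = dot(grad_a(2.0, 0.0), unit((2.0, 1.0)))

# Part (b): f(x, y, z) = sin(yz) + ln(x^2),
# gradient <2/x, z cos(yz), y cos(yz)>
def grad_b(x, y, z):
    return (2.0 / x, z * math.cos(y * z), y * math.cos(y * z))

D_b = dot(grad_b(1.0, 1.0, math.pi), unit((1.0, 1.0, -1.0)))

print(D_a, 2 / math.sqrt(5))
print(D_b, (3 - math.pi) / math.sqrt(3))
```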


Before proceeding let us note that the first-order partial derivatives that we were looking at in the majority of the section can be thought of as special cases of the directional derivatives. For instance, $f_x$ can be thought of as the directional derivative of $f$ in the direction of $\mathbf{u} = \langle 1, 0 \rangle$ or $\mathbf{u} = \langle 1, 0, 0 \rangle$, depending on the number of variables that we are working with. The same can be done for $f_y$ and $f_z$.

Gradient vectors. We will finish this section with a couple of nice facts about the gradient vector. The first tells us how to determine the maximum rate of change of a function at a point and the direction that we need to move in order to achieve that maximum rate of change.

Theorem 3.5.1 The maximum value of $D_{\mathbf{u}} f(\mathbf{x})$ (and hence the maximum rate of change of the function $f(\mathbf{x})$) is given by $\|\nabla f(\mathbf{x})\|$ and will occur in the direction given by $\nabla f(\mathbf{x})$.

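This theorem provides a very useful interpretation of the gradient vector. Let us now look at the reasoning behind it. For any point $\mathbf{x}$ and any unit vector $\mathbf{u}$ we have
\[ D_{\mathbf{u}} f(\mathbf{x}) = \nabla f(\mathbf{x}) \cdot \mathbf{u} = \|\nabla f(\mathbf{x})\| \cos\theta, \]
where $\theta$ is the angle between the vector $\mathbf{u}$ and $\nabla f(\mathbf{x})$ (recall that $\|\mathbf{u}\| = 1$). Since $\cos\theta$ only takes on values between $-1$ and $1$, $D_{\mathbf{u}} f(\mathbf{x})$ only takes on values between $-\|\nabla f(\mathbf{x})\|$ and $\|\nabla f(\mathbf{x})\|$. Moreover,
$D_{\mathbf{u}} f(\mathbf{x}) = -\|\nabla f(\mathbf{x})\|$ if and only if $\mathbf{u}$ points in the opposite direction to $\nabla f(\mathbf{x})$ ($\cos\theta = -1$),
$D_{\mathbf{u}} f(\mathbf{x}) = \|\nabla f(\mathbf{x})\|$ if and only if $\mathbf{u}$ points in the same direction as $\nabla f(\mathbf{x})$ ($\cos\theta = 1$).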

The directional derivative is zero in the direction $\theta = \pi/2$; this is the direction of the (tangent line to the) level curve of $f$ through $\mathbf{x}$. We summarize these properties of the gradient as follows.

Theorem 3.5.2 (Geometric Properties of the Gradient Vector)
(a) At $\mathbf{x}$, $f(\mathbf{x})$ increases most rapidly in the direction of the gradient vector $\nabla f(\mathbf{x})$. The maximum rate of increase is $\|\nabla f(\mathbf{x})\|$.
(b) At $\mathbf{x}$, $f(\mathbf{x})$ decreases most rapidly in the direction of $-\nabla f(\mathbf{x})$. The maximum rate of decrease is $\|\nabla f(\mathbf{x})\|$.
(c) The rate of change of $f(\mathbf{x})$ at $\mathbf{x}$ is zero in directions tangent to the level curve of $f$ that passes through $\mathbf{x}$.

As we remarked before, these properties hold in both two and three dimensions.
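The claim that the directional derivative is largest in the gradient direction can be seen by brute force. The sketch below samples unit vectors around the circle for the illustrative function $f(x, y) = x^2 + 3xy$ at $(1, 2)$ (both are assumptions, not from the notes) and records where $\nabla f \cdot \mathbf{u}$ is largest.

```python
import math

def grad_f(x, y):
    # Gradient of the illustrative function f(x, y) = x**2 + 3*x*y
    return (2 * x + 3 * y, 3 * x)

gx, gy = grad_f(1.0, 2.0)            # gradient at (1, 2) is <8, 3>
grad_norm = math.hypot(gx, gy)       # length of the gradient

n_steps = 3600                       # sample every 0.1 degree
best_angle, best_value = 0.0, -math.inf
for k in range(n_steps):
    theta = 2 * math.pi * k / n_steps
    # D_u f for the unit vector u = <cos(theta), sin(theta)>
    value = gx * math.cos(theta) + gy * math.sin(theta)
    if value > best_value:
        best_angle, best_value = theta, value

print(best_value, grad_norm)                   # nearly equal
print(best_angle, math.atan2(gy, gx))          # nearly equal
```

The largest sampled value matches the length of the gradient, and the angle at which it occurs matches the direction of the gradient, up to the sampling resolution.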

Example 3.5.3 (Gradient vector) Find the directions in which the function
\[ f(x, y) = \frac{x^2}{2} + \frac{y^2}{2} \]


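(a) Increases most rapidly at the point $(1, 1)$.
(b) Decreases most rapidly at $(1, 1)$.
(c) What are the directions of zero change in $f$ at $(1, 1)$?

Solution
(a) The function increases most rapidly in the direction of $\nabla f(\mathbf{x})$ at $(1, 1)$. The gradient there is
\[ \nabla f(1, 1) = \left. \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right\rangle \right|_{(1,1)} = \langle x, y \rangle \big|_{(1,1)} = \langle 1, 1 \rangle. \]
Its direction is
\[ \mathbf{u} = \frac{1}{\sqrt{2}} \langle 1, 1 \rangle = \left\langle \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right\rangle. \]

(b) The function decreases most rapidly in the direction of $-\nabla f(\mathbf{x})$ at $(1, 1)$, which is
\[ -\mathbf{u} = \left\langle -\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}} \right\rangle. \]
(c) The directions of zero change at $(1, 1)$ are the directions orthogonal to $\nabla f$:
\[ \mathbf{n} = \left\langle -\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right\rangle \quad \text{and} \quad -\mathbf{n} = \left\langle \frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}} \right\rangle. \]
□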

Example 3.5.4 (Gradient vector) Suppose that the height of a hill above sea level is given by $z = 1000 - 0.01x^2 - 0.02y^2$. If you are at the point $(60, 100)$, in what direction is the elevation changing fastest? What is the maximum rate of change of the elevation at this point? Is the maximum rate of change of the elevation towards the center of the hill or away from it?

Solution First, you will hopefully recognize that the graph of the function is an elliptic paraboloid that opens downward. So even though most hills are not this symmetrical it will at least be vaguely hill shaped and so the question makes at least some sense. There are a couple of questions to answer here, but using Theorem 3.5.2 makes answering them very simple. We will first need the gradient vector.
\[ \nabla f(\mathbf{x}) = \nabla f(x, y) = \langle -0.02x, -0.04y \rangle. \]
The maximum rate of change of the elevation will then occur in the direction of
\[ \nabla f(60, 100) = \langle -1.2, -4 \rangle. \]
The maximum rate of change of elevation at this point is
\[ \|\nabla f(60, 100)\| = \sqrt{(-1.2)^2 + (-4)^2} = \sqrt{17.44} \approx 4.176. \]
To answer the final part it might be convenient to have a quick sketch of the gradient at this point.

[Figure: sketch of the gradient vector at the point (60, 100); both axes run from 0 to 100.]

We have only shown a portion of the axis system here to make the picture easier to see. The center of the hill is at the origin and that is also the highest point on the hill. If we are standing at the point $(60, 100)$ then the direction with the maximum rate of change of the elevation is given by
\[ \nabla f(60, 100) = \langle -1.2, -4 \rangle. \]
This means that both $x$ and $y$ are decreasing (since both components are negative) and $y$ is decreasing faster than $x$. This is shown by the vector in the above sketch. This also shows that the direction with maximum rate of change of the elevation is generally up the hill (and hence towards the center) rather than down the hill (away from the center).
□
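The numbers in Example 3.5.4 are easy to reproduce. In the sketch below the gradient is entered by hand, and "towards the center" is tested by checking that the gradient has a positive dot product with the vector from $(60, 100)$ to the origin; that test is an illustrative choice rather than something stated in the notes.

```python
import math

def elevation(x, y):
    # Height of the hill from Example 3.5.4
    return 1000 - 0.01 * x**2 - 0.02 * y**2

def grad_elevation(x, y):
    # Hand-computed gradient <-0.02 x, -0.04 y>
    return (-0.02 * x, -0.04 * y)

g = grad_elevation(60, 100)            # approximately (-1.2, -4.0)
max_rate = math.hypot(g[0], g[1])      # sqrt(17.44), about 4.176

# Vector from the point (60, 100) to the center of the hill (the origin)
to_center = (-60, -100)
points_toward_center = g[0] * to_center[0] + g[1] * to_center[1] > 0

print(g, max_rate, points_toward_center)
```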

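The second fact about the gradient vector that we need to give before the end of this section will be very convenient in some later sections. Let us consider the case of two variables for illustration. If a differentiable function $f(x, y)$ has a constant value $c$ along a smooth curve $\mathbf{r} = \langle g(t), h(t) \rangle$ (making the curve a level curve of $f$), then $f(g(t), h(t)) = c$. Differentiating both sides of this equation with respect to $t$ leads to the equations
\[ \frac{d}{dt} f(g(t), h(t)) = \frac{d}{dt}(c), \]
\[ \frac{\partial f}{\partial x}\frac{dg}{dt} + \frac{\partial f}{\partial y}\frac{dh}{dt} = 0, \]
\[ \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right\rangle \cdot \left\langle \frac{dg}{dt}, \frac{dh}{dt} \right\rangle = 0, \]
in which we might denote
\[ \nabla f = \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right\rangle \quad \text{and} \quad \frac{d\mathbf{r}}{dt} = \left\langle \frac{dg}{dt}, \frac{dh}{dt} \right\rangle. \]
The last equation says that $\nabla f$ is normal to the tangent vector $d\mathbf{r}/dt$, so it is normal to the curve. We summarize the second fact about the gradient vector as follows.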

Theorem 3.5.3 (Gradient Vector Normal to Level Curve) The gradient vector $\nabla f(x_0, y_0)$ is orthogonal (or perpendicular) to the level curve $f(x, y) = c$ at the point $(x_0, y_0)$. Likewise, the gradient vector $\nabla f(x_0, y_0, z_0)$ is orthogonal to the level surface $f(x, y, z) = c$ at the point $(x_0, y_0, z_0)$.
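The orthogonality can be checked on a concrete level curve. As a sketch, take the hill function $f(x, y) = 1000 - 0.01x^2 - 0.02y^2$ and its level curve $f = 900$, which is the ellipse $0.01x^2 + 0.02y^2 = 100$. The parametrization below and the choice $c = 900$ are assumptions made for illustration.

```python
import math

def grad_f(x, y):
    # Gradient of f(x, y) = 1000 - 0.01 x^2 - 0.02 y^2
    return (-0.02 * x, -0.04 * y)

c = 900.0
k = 1000.0 - c                 # level curve: 0.01 x^2 + 0.02 y^2 = k
A = math.sqrt(k / 0.01)        # semi-axis along x
B = math.sqrt(k / 0.02)        # semi-axis along y

def r(t):
    # Parametrization of the level curve (an ellipse)
    return (A * math.cos(t), B * math.sin(t))

def r_prime(t):
    # Tangent vector dr/dt
    return (-A * math.sin(t), B * math.cos(t))

max_dot = 0.0
for i in range(100):
    t = 2 * math.pi * i / 100
    g = grad_f(*r(t))
    v = r_prime(t)
    max_dot = max(max_dot, abs(g[0] * v[0] + g[1] * v[1]))

print(max_dot)   # zero up to rounding error
```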


As we are going to see in later sections, we will often want vectors that are orthogonal to a surface or curve. By the theorem above, all we need to do is compute a gradient vector and we will have the orthogonal vector we need. We will see the first application of this in the next section.

3.6 Applications of Partial Derivatives

In this section we will take a look at a couple of applications of partial derivatives. Most of the applications are simply extensions of what we have learnt about ordinary derivatives in single variable calculus. For instance, we will be looking at finding absolute and relative extrema of a function and we will also look at optimization. Both of these subjects were major applications back in single variable calculus. They will, however, be a little more work here because we now have more than one variable.

Tangent planes and linear approximations


Previously we saw (page 54) how the two partial derivatives $f_x$ and $f_y$ can be thought of as the slopes of traces. We want to extend this idea a little bit in this section. The graph of a function $z = f(x, y)$ is a surface in $\mathbb{R}^3$ (three-dimensional space) and so we can now start thinking of the plane that is tangent to the surface at a point. Let us start out with a point $(x_0, y_0)$ and let $C_1$ represent the trace to $f(x, y)$ for the plane $y = y_0$ (i.e., allowing $x$ to vary with $y$ held fixed) and let $C_2$ represent the trace to $f(x, y)$ for the plane $x = x_0$ (i.e., allowing $y$ to vary with $x$ held fixed). Now, we know that $f_x(x_0, y_0)$ is the slope of the tangent line to the trace $C_1$ and $f_y(x_0, y_0)$ is the slope of the tangent line to the trace $C_2$. So, let $L_1$ be the tangent line to the trace $C_1$ and let $L_2$ be the tangent line to the trace $C_2$. The tangent plane will then be the plane that contains the two lines $L_1$ and $L_2$.

Geometrically this plane will serve the same purpose that a tangent line did in single variable calculus. A tangent line to a curve was a line that just touched the curve at that point and was parallel to the curve at the point in question. Tangent planes to a surface are planes that just touch the surface at the point and are parallel to the surface at the point. Note that this gives us a point that is on the plane. Since the tangent plane and the surface touch at $(x_0, y_0)$ the following point will be on both the surface and the plane.
\[ (x_0, y_0, z_0) = (x_0, y_0, f(x_0, y_0)). \]
What we need to do now is determine the equation of the tangent plane. We know that the general equation of a plane is given by
\[ a(x - x_0) + b(y - y_0) + c(z - z_0) = 0, \]
where $(x_0, y_0, z_0)$ is a point that is on the plane, which we know already. Let us rewrite this a little. We will move the $x$ terms and $y$ terms to the other side and divide both sides by $c$. Doing this gives
\[ z - z_0 = -\frac{a}{c}(x - x_0) - \frac{b}{c}(y - y_0). \]
Now, let us rename the constants to simplify the notation a little. Let us rename them as follows.
\[ A = -\frac{a}{c}, \qquad B = -\frac{b}{c}. \]

With this renaming the equation of the tangent plane becomes
\[ z - z_0 = A(x - x_0) + B(y - y_0) \]
and we need to determine values for $A$ and $B$.


Let us first think about what happens if we hold $y$ fixed, i.e., if we assume that $y = y_0$. In this case the equation of the tangent plane becomes
\[ z - z_0 = A(x - x_0). \]
This is the equation of a line and this line must be tangent to the surface at $(x_0, y_0)$ (since it is part of the tangent plane). In addition, this line assumes that $y = y_0$ (i.e., fixed) and $A$ is the slope of this line. But if we think about it, this is exactly what the tangent to $C_1$ is: a line tangent to the surface at $(x_0, y_0)$ assuming that $y = y_0$. In other words, $z - z_0 = A(x - x_0)$ is the equation for $L_1$ and we know that the slope of $L_1$ is given by $f_x(x_0, y_0)$. Therefore we have
\[ A = f_x(x_0, y_0). \]
If we hold $x$ fixed at $x = x_0$ the equation of the tangent plane becomes
\[ z - z_0 = B(y - y_0). \]
By a similar argument to the one above we can see that this is nothing more than the equation for $L_2$ and that its slope is $B$ or $f_y(x_0, y_0)$. So,
\[ B = f_y(x_0, y_0). \]

The equation of the tangent plane to the surface given by $z = f(x, y)$ at $(x_0, y_0)$ is then
\[ z - z_0 = f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0). \]
Also, if we use the fact that $z_0 = f(x_0, y_0)$ we can rewrite the equation of the tangent plane as
\[ z - f(x_0, y_0) = f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0), \]
\[ z = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0). \]

We will see another derivation of this formula (actually a more general formula) later on (page 76). So if you didn't quite follow this argument, hold off until then to see a better derivation.

Example 3.6.1 (Tangent plane) Find the equation of the tangent plane to $z = \ln(2x + y)$ at the point $(-1, 3)$.

Solution There really is not too much to do here other than taking a couple of derivatives and doing some quick evaluations.
\[ f(x, y) = \ln(2x + y), \qquad z_0 = f(-1, 3) = \ln 1 = 0, \]
\[ f_x(x, y) = \frac{2}{2x + y}, \qquad f_x(-1, 3) = 2, \]
\[ f_y(x, y) = \frac{1}{2x + y}, \qquad f_y(-1, 3) = 1. \]
The equation of the plane is then
\[ z - 0 = 2(x + 1) + (1)(y - 3), \]
\[ z = 2x + y - 1. \]
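A small sketch of the tangent plane formula $z = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0)$ applied to Example 3.6.1 follows. The helper name `tangent_plane` is hypothetical, and the partial derivatives are entered by hand.

```python
import math

def f(x, y):
    return math.log(2 * x + y)

def f_x(x, y):
    return 2 / (2 * x + y)

def f_y(x, y):
    return 1 / (2 * x + y)

def tangent_plane(x0, y0):
    """Return the function (x, y) -> z describing the tangent plane
    to z = f(x, y) at (x0, y0)."""
    z0, A, B = f(x0, y0), f_x(x0, y0), f_y(x0, y0)
    return lambda x, y: z0 + A * (x - x0) + B * (y - y0)

plane = tangent_plane(-1.0, 3.0)

# The plane should coincide with z = 2x + y - 1 everywhere ...
print(plane(0.5, 2.0), 2 * 0.5 + 2.0 - 1)
# ... and stay close to the surface near the point of tangency.
print(plane(-0.99, 3.01), f(-0.99, 3.01))
```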


One nice use of tangent planes is that they give us a way to approximate a surface near a point. As long as we are near the point $(x_0, y_0)$, the tangent plane should closely approximate the function there. The tangent plane to the graph of $z = f(x, y)$ at $(x_0, y_0)$ is $z = L(x, y)$, where
\[ L(x, y) = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) \]
is the linear approximation of $f$ at $(x_0, y_0)$. We can use $L(x, y)$ to approximate values of $f(x, y)$ near $(x_0, y_0)$:
\[ f(x, y) \approx L(x, y) = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0). \]

Example 3.6.2 (Linear approximation) Find an approximate value for $f(x, y) = \sqrt{2x^2 + e^{2y}}$ at $(2.2, -0.2)$.

Solution It is convenient to use the linear approximation at $(x_0, y_0) = (2, 0)$, where the values of $f$ and its partial derivatives are easily evaluated:
\[ f(x, y) = \sqrt{2x^2 + e^{2y}}, \qquad f(2, 0) = 3, \]
\[ f_x(x, y) = \frac{2x}{\sqrt{2x^2 + e^{2y}}}, \qquad f_x(2, 0) = \frac{4}{3}, \]
\[ f_y(x, y) = \frac{e^{2y}}{\sqrt{2x^2 + e^{2y}}}, \qquad f_y(2, 0) = \frac{1}{3}. \]
Thus, $L(x, y) = 3 + \frac{4}{3}(x - 2) + \frac{1}{3}(y - 0)$, and
\[ f(2.2, -0.2) \approx L(2.2, -0.2) = 3 + \frac{4}{3}(2.2 - 2) + \frac{1}{3}(-0.2 - 0) = 3.2. \]
(For the sake of comparison, $f(2.2, -0.2) \approx 3.2172$ to 4 decimal places.)
□
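The comparison between the linear approximation and the true value in Example 3.6.2 can be reproduced with a few lines. This is only a sketch with the partial derivatives entered by hand; the name `L` mirrors the notation of the notes.

```python
import math

def f(x, y):
    return math.sqrt(2 * x**2 + math.exp(2 * y))

def f_x(x, y):
    return 2 * x / f(x, y)

def f_y(x, y):
    return math.exp(2 * y) / f(x, y)

x0, y0 = 2.0, 0.0

def L(x, y):
    # Linear approximation of f at (x0, y0)
    return f(x0, y0) + f_x(x0, y0) * (x - x0) + f_y(x0, y0) * (y - y0)

approx = L(2.2, -0.2)
exact = f(2.2, -0.2)
print(approx, exact, abs(exact - approx))
```

The error here is under 0.02; moving the evaluation point further from $(2, 0)$ makes the approximation degrade.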

Gradient vector, tangent planes and normal lines


In this subsection we want to revisit the derivation of tangent planes in the light of the gradient vector. In the process we will also take a look at a normal line to a surface. Let us first recall that the equation of a plane that contains the point $(x_0, y_0, z_0)$ with normal vector $\mathbf{n} = \langle a, b, c \rangle$ is given by
\[ a(x - x_0) + b(y - y_0) + c(z - z_0) = 0. \]

When we introduced the gradient vector in Section 3.5 (page 69) on directional derivatives we gave the following fact in Theorem 3.5.3 (page 73).

Fact 1 The gradient vector $\nabla f(x_0, y_0, z_0)$ is orthogonal to the level surface $f(x, y, z) = c$ at the point $(x_0, y_0, z_0)$.


This says that the gradient vector is always orthogonal, or normal, to the surface at the point. Also recall that the gradient vector is $\nabla f = \langle f_x, f_y, f_z \rangle$. So, the tangent plane to the surface given by $f(x, y, z) = c$ at $(x_0, y_0, z_0)$ has the equation
\[ f_x(x_0, y_0, z_0)(x - x_0) + f_y(x_0, y_0, z_0)(y - y_0) + f_z(x_0, y_0, z_0)(z - z_0) = 0. \]
This is a much more general form of the equation of a tangent plane than the one that was derived previously (page 75). Note however, that we can also get the equation from the previous subsection (page 75) using this more general formula. To see this let us start with the equation $z = f(x, y)$; we want to find the tangent plane to the surface given by $z = f(x, y)$ at the point $(x_0, y_0, z_0)$ where $z_0 = f(x_0, y_0)$. In order to use the formula above we need to have all the variables on one side. This is easy enough to do. All we need to do is subtract $z$ from both sides to get
\[ f(x, y) - z = 0. \]
Now, if we define a new function $F(x, y, z) = f(x, y) - z$, we can see that the surface given by $z = f(x, y)$ is identical to the surface given by $F(x, y, z) = 0$, and this new equivalent equation is in the correct form for the equation of the tangent plane that we derived in this subsection. So, the first thing that we need to do is find the gradient vector for $F$,
\[ \nabla F = \langle F_x, F_y, F_z \rangle = \langle f_x, f_y, -1 \rangle. \]
Notice that
\[ F_x = \frac{\partial}{\partial x}\left(f(x, y) - z\right) = f_x, \]
\[ F_y = \frac{\partial}{\partial y}\left(f(x, y) - z\right) = f_y, \]
\[ F_z = \frac{\partial}{\partial z}\left(f(x, y) - z\right) = -1. \]

The equation of the tangent plane is then
\[ f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) - (z - z_0) = 0. \]
Solving for $z$ (and using $z_0 = f(x_0, y_0)$) gives
\[ z = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0), \]
which is identical to the equation that we derived in the previous subsection (page 75).

We can get another nice piece of information out of the gradient vector as well. We might on occasion want a line that is orthogonal to a surface at a point, sometimes called the normal line. This is easy enough to get if we recall that the equation of a line only requires that we have a point and a parallel vector. Since we want the line through the point $(x_0, y_0, z_0)$, we know that this point must be on the line, and we know that $\nabla f(x_0, y_0, z_0)$ is a vector that is normal to the surface and hence will be parallel to the line. Therefore the equation of the normal line is
\[ \mathbf{r}(t) = \langle x_0, y_0, z_0 \rangle + t\, \nabla f(x_0, y_0, z_0). \]


Example 3.6.3 (Tangent plane, normal line) Find the tangent plane and normal line to $x^2 + y^2 + z^2 = 30$ at the point $(1, -2, 5)$.

Solution For this case the function that we are going to be working with is
\[ F(x, y, z) = x^2 + y^2 + z^2, \]
and note that we don't have to have a zero on one side of the equal sign. All that we need is a constant. To finish this problem we simply need the gradient evaluated at the point.
\[ \nabla F(x, y, z) = \langle 2x, 2y, 2z \rangle, \qquad \nabla F(1, -2, 5) = \langle 2, -4, 10 \rangle. \]
The tangent plane is then
\[ 2(x - 1) - 4(y + 2) + 10(z - 5) = 0. \]
The normal line is
\[ \mathbf{r}(t) = \langle 1, -2, 5 \rangle + t \langle 2, -4, 10 \rangle = \langle 1 + 2t, -2 - 4t, 5 + 10t \rangle. \]
□
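The tangent plane and normal line of Example 3.6.3 can be written as two small functions built from the gradient. In this sketch the names `plane_residual` and `normal_line` are hypothetical; a point lies on the tangent plane exactly when its residual is zero.

```python
def grad_F(x, y, z):
    # Gradient of F(x, y, z) = x^2 + y^2 + z^2
    return (2 * x, 2 * y, 2 * z)

p0 = (1, -2, 5)
n = grad_F(*p0)                      # normal vector (2, -4, 10)

def plane_residual(p):
    # Left-hand side of n . (p - p0) = 0; zero means p is on the tangent plane
    return sum(ni * (pi - p0i) for ni, pi, p0i in zip(n, p, p0))

def normal_line(t):
    # r(t) = p0 + t * n
    return tuple(p0i + t * ni for p0i, ni in zip(p0, n))

print(n)
print(plane_residual(p0))            # 0: the point of tangency is on the plane
print(plane_residual((3, -1, 5)))    # 0: p0 + <2, 1, 0>, and <2, 1, 0> is orthogonal to n
print(normal_line(1))                # (3, -6, 15)
```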

Relative minima and maxima


In this subsection we are going to extend one of the more important ideas from single variable calculus to functions of two variables. We will see how we can find minima and maxima of functions of two variables. Recall that we often use the word extrema to refer to both minima and maxima. This in fact will be the topic of the following two subsections as well (i.e., absolute maxima and Lagrange multipliers). The definition of relative extrema for functions of two variables is identical to that for functions of one variable; we just need to remember that we are now working with functions of two variables. So, for the sake of completeness here is the definition of relative minima and relative maxima for functions of two variables.

3.6.1 Definition (Relative Extrema)
1. A function $f(x, y)$ has a relative minimum at the point $(a, b)$ if $f(x, y) \ge f(a, b)$ for all points $(x, y)$ in some region around $(a, b)$.
2. A function $f(x, y)$ has a relative maximum at the point $(a, b)$ if $f(x, y) \le f(a, b)$ for all points $(x, y)$ in some region around $(a, b)$.

Note that this definition does not say that a relative minimum is the smallest value that the function will ever take. It only says that in some region around the point $(a, b)$ the function will always be at least as large as $f(a, b)$. Outside of that region it is completely possible for the function to be smaller. Likewise, a relative maximum only says that around $(a, b)$ the function will always be no larger than $f(a, b)$. Again, outside of that region it is completely possible that the function will be larger.

Next we need to extend the idea of critical values/points to functions of two variables. Recall that a critical value of the function $f(x)$ is a number $x = c$ such that either $f'(c) = 0$ or $f'(c)$ does not exist. If $x = c$ is a critical value of $f$, then $(c, f(c))$ is said to be a critical point. We have a similar definition for critical points of functions of two variables.


3.6.2 Definition The point $(a, b)$ is a critical point (or a stationary point) of $f(x, y)$ provided one of the following is true.
1. $\nabla f(a, b) = \mathbf{0}$ (this is equivalent to saying that $f_x(a, b) = 0$ and $f_y(a, b) = 0$).
2. $f_x(a, b)$ and/or $f_y(a, b)$ does not exist.

To see the equivalence in the first part let us start with $\nabla f = \mathbf{0}$ and put in the definition of each part.
\[ \nabla f(a, b) = \mathbf{0}, \]
\[ \langle f_x(a, b), f_y(a, b) \rangle = \langle 0, 0 \rangle. \]

The only way that these two vectors can be equal is to have $f_x(a, b) = 0$ and $f_y(a, b) = 0$. In fact, we will use this definition of the critical point more than the gradient definition since it will be easier to find the critical points if we start with the first-order partial derivatives. Note as well that both of the first-order partial derivatives must be zero at $(a, b)$. If only one of the first-order partial derivatives is zero at the point then the point will NOT be a critical point.

We now have the following fact that, at least partially, relates critical points to relative extrema.

Fact 2 If the point (a, b) is a relative extremum of the function f (x, y) then (a, b) is also a critical point of f (x, y).

Note that this does NOT say that all critical points are relative extrema. It only says that relative extrema will be critical points of the function. To see this let us consider the function $f(x, y) = xy$. The two first-order partial derivatives are
\[ f_x(x, y) = y \quad \text{and} \quad f_y(x, y) = x. \]

The only point that will make both of these derivatives zero at the same time is (0, 0) and so (0, 0) is a critical point for the function. Here is the graph of the function.

[Figure: graph of the surface z = xy near the origin.]

Note that the axes are not in the standard orientation here so that we can see more clearly what is happening at the origin, i.e., at $(0, 0)$. If we start at the origin and move into either of the quadrants where $x$ and $y$ have the same sign the function increases. However, if we start at the origin and move into either of the quadrants where $x$ and $y$ have opposite signs then the function decreases. In other words, no matter what region you take about the origin there will be points where the function is larger than $f(0, 0) = 0$ and points where it is smaller than $f(0, 0) = 0$. Therefore, there is no way that $(0, 0)$ can be a relative extremum. Critical points that exhibit this kind of behavior are called saddle points.

While we have to be careful not to misinterpret this fact, it does tell us that if we have all the critical points of a function then we also have every possible relative extremum of the function. The fact tells us that all relative extrema must be critical points, so if the function does have relative extrema then they must be in the collection of all the critical points. Remember however, that it is completely possible that at least one of the critical points won't be a relative extremum. So, once we have all the critical points in hand all we need to do is test these points to see if they are relative extrema or not. To determine if a critical point is a relative extremum (and in fact to determine if it is a minimum or a maximum) we can use the following fact.

Fact 3 (Second Derivative Test) Suppose that $(a, b)$ is a critical point of $f(x, y)$ and that the second-order partial derivatives are continuous in some region that contains $(a, b)$. Next define
\[ D = D(a, b) = f_{xx}(a, b)\, f_{yy}(a, b) - \left[f_{xy}(a, b)\right]^2. \]
We then have the following classifications of the critical point.
1. If $D > 0$ and $f_{xx}(a, b) > 0$, then $(a, b)$ is a relative minimum.
2. If $D > 0$ and $f_{xx}(a, b) < 0$, then $(a, b)$ is a relative maximum.
3. If $D < 0$, then $(a, b)$ is a saddle point.
4. If $D = 0$, then $(a, b)$ may be a relative minimum, relative maximum or a saddle point. Other techniques would need to be used to classify the critical point.


Note that we are not going to be seeing any cases in this class where $D = 0$. We will be able to classify all the critical points that we find. Let us see a couple of examples.

Example 3.6.4 (Critical points) Find and classify all the critical points of $f(x, y) = 4 + x^3 + y^3 - 3xy$.

Solution We first need all the first-order (to find the critical points) and second-order (to classify the critical points) partial derivatives so let us get those.
\[ f_x = 3x^2 - 3y, \qquad f_y = 3y^2 - 3x, \]
\[ f_{xx} = 6x, \qquad f_{yy} = 6y, \qquad f_{xy} = -3. \]

Let us first find the critical points. Critical points will be solutions to the system of equations
\[ \begin{cases} f_x = 3x^2 - 3y = 0, \\ f_y = 3y^2 - 3x = 0. \end{cases} \]

This is a nonlinear system of equations and these can (quite often) be difficult to solve. However, in this case it is not too bad. We can solve the first equation for $y$ as follows
\[ 3x^2 - 3y = 0 \quad \Longrightarrow \quad y = x^2. \]
Plugging this into the second equation gives
\[ 3(x^2)^2 - 3x = 3x(x^3 - 1) = 0. \]
From this we can see that we must have $x = 0$ or $x = 1$. Now use the fact that $y = x^2$ to get the critical points.
\[ x = 0: \quad y = 0^2 = 0 \quad \Longrightarrow \quad (0, 0), \]
\[ x = 1: \quad y = 1^2 = 1 \quad \Longrightarrow \quad (1, 1). \]
So we get two critical points. All we need to do now is classify them. To do this we will need to know the sign of $D$. Here is the general formula for $D$.
\[ D(x, y) = f_{xx}(x, y)\, f_{yy}(x, y) - \left[f_{xy}(x, y)\right]^2 = (6x)(6y) - (-3)^2 = 36xy - 9. \]

To classify the critical points all that we need to do is plug in the critical points and use the fact above to classify them. For the critical point $(0, 0)$:
\[ D = D(0, 0) = -9 < 0. \]
For $(0, 0)$, $D$ is negative and so this must be a saddle point. For the critical point $(1, 1)$:
\[ D = D(1, 1) = 36 - 9 = 27 > 0, \qquad f_{xx}(1, 1) = 6 > 0. \]
For $(1, 1)$, $D$ is positive and $f_{xx}$ is positive and so we must have a relative minimum.
□


Example 3.6.5 (Critical points) Find and classify all the critical points of $f(x, y) = 3x^2 y + y^3 - 3x^2 - 3y^2 + 2$.

Solution We first need all the first-order (to find the critical points) and second-order (to classify the critical points) partial derivatives so let us get those.
\[ f_x = 6xy - 6x, \qquad f_y = 3x^2 + 3y^2 - 6y, \]
\[ f_{xx} = 6y - 6, \qquad f_{yy} = 6y - 6, \qquad f_{xy} = 6x. \]

We will first need the critical points. The equations that we will need to solve this time are
\[ \begin{cases} 6xy - 6x = 0, \\ 3x^2 + 3y^2 - 6y = 0. \end{cases} \]

These equations are a little trickier to solve than the first set, but once you see what to do they really are not too complicated. First, notice that we can factor out a $6x$ from the first equation to get
\[ 6x(y - 1) = 0. \]
So, we can see that the first equation will be satisfied if $x = 0$ or $y = 1$. Be careful not to simply divide both sides by $x$; doing so would lose the case $x = 0$. To find the critical points we can plug these (individually) into the second equation and solve for the remaining variable.
\[ x = 0: \quad 3y^2 - 6y = 3y(y - 2) = 0 \quad \Longrightarrow \quad y = 0,\ y = 2, \]
\[ y = 1: \quad 3x^2 - 3 = 3(x^2 - 1) = 0 \quad \Longrightarrow \quad x = -1,\ x = 1. \]

So, if $x = 0$ we have the critical points $(0, 0)$ and $(0, 2)$, and if $y = 1$ the critical points are $(1, 1)$ and $(-1, 1)$.

Now all we need to do is classify the critical points. To do this we will need the general formula for $D$.
\[ D(x, y) = (6y - 6)(6y - 6) - (6x)^2 = (6y - 6)^2 - 36x^2. \]
To classify the critical points all that we need to do is plug in the critical points and use the fact above to classify them.
For the critical point $(0, 0)$: $D = D(0, 0) = 36 > 0$, $f_{xx}(0, 0) = -6 < 0$. So $(0, 0)$ is a relative maximum.
For the critical point $(0, 2)$: $D = D(0, 2) = 36 > 0$, $f_{xx}(0, 2) = 6 > 0$. So $(0, 2)$ is a relative minimum.
For the critical point $(1, 1)$: $D = D(1, 1) = -36 < 0$. So $(1, 1)$ is a saddle point.
For the critical point $(-1, 1)$: $D = D(-1, 1) = -36 < 0$. So $(-1, 1)$ is a saddle point.
□
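The second derivative test is mechanical enough to write as a small function. The sketch below (the name `classify` and the string labels are illustrative choices) takes the values of $f_{xx}$, $f_{yy}$ and $f_{xy}$ at a critical point and re-checks the classifications in Examples 3.6.4 and 3.6.5.

```python
def classify(fxx, fyy, fxy):
    """Apply the second derivative test to a critical point, given the
    second-order partial derivatives evaluated at that point."""
    D = fxx * fyy - fxy**2
    if D > 0:
        # D > 0 forces fxx != 0, so only the sign of fxx matters here
        return "relative minimum" if fxx > 0 else "relative maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"   # D == 0: the test gives no information

# Example 3.6.4: f_xx = 6x, f_yy = 6y, f_xy = -3
example_364 = {
    (0, 0): classify(0, 0, -3),
    (1, 1): classify(6, 6, -3),
}

# Example 3.6.5: f_xx = f_yy = 6y - 6, f_xy = 6x
def second_partials_365(x, y):
    return (6 * y - 6, 6 * y - 6, 6 * x)

example_365 = {
    p: classify(*second_partials_365(*p))
    for p in [(0, 0), (0, 2), (1, 1), (-1, 1)]
}

print(example_364)
print(example_365)
```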

Let us do one more example that is a little different from the first two.


Example 3.6.6 (Critical points) Determine the point on the plane $4x - 2y + z = 1$ that is closest to the point $(-2, -1, 5)$.

Solution Note that we are NOT asking for the critical points of the plane. In order to do this example we first need to work out the function that we are going to minimize. Let us suppose that $(x, y, z)$ is any point on the plane. The distance between this point and the point in question, $(-2, -1, 5)$, is given by the formula
\[ d = \sqrt{(x + 2)^2 + (y + 1)^2 + (z - 5)^2}. \]

What we are then asking is to find the minimum value of this distance function. The point $(x, y, z)$ that gives the minimum value will be the point on the plane that is closest to $(-2, -1, 5)$. There are a couple of issues with this function. First, it is a function of $x$, $y$ and $z$ and we can only deal with functions of $x$ and $y$ at this point. This is easy to fix however. We can solve the equation of the plane to see that
\[ z = 1 - 4x + 2y. \]
Plugging this into the distance function gives
\[ d = \sqrt{(x + 2)^2 + (y + 1)^2 + (1 - 4x + 2y - 5)^2} = \sqrt{(x + 2)^2 + (y + 1)^2 + (-4 - 4x + 2y)^2}. \]
Now, the next issue is that there is a square root in this formula and we know that we are going to be differentiating this eventually. So, in order to make our work a little easier let us notice that finding the minimum value of $d$ is equivalent to finding the minimum value of $d^2$. So, let us instead find the minimum value of
\[ f(x, y) = d^2 = (x + 2)^2 + (y + 1)^2 + (-4 - 4x + 2y)^2. \]
Now, we need to be a little careful here. We are being asked to find the closest point on the plane to $(-2, -1, 5)$ and that is not really the same thing as what we have been doing in this subsection. In this subsection we have been finding and classifying critical points as relative minima or maxima, and what we are really asking for is the smallest value the function will take, or the absolute minimum. Hopefully, it does make sense from a physical standpoint that there will be a closest point on the plane to $(-2, -1, 5)$. Also, this point should be a relative minimum. So, let us go through the process from the first and second examples and see what we get as far as relative minima go. If we only get a single relative minimum then we will be done since that point will also need to be the absolute minimum of the function and hence the point on the plane that is closest to $(-2, -1, 5)$. We will need the derivatives first.
\[ f_x = 2(x + 2) + 2(-4)(-4 - 4x + 2y) = 36 + 34x - 16y, \]
\[ f_y = 2(y + 1) + 2(2)(-4 - 4x + 2y) = -14 - 16x + 10y, \]
\[ f_{xx} = 34, \qquad f_{yy} = 10, \qquad f_{xy} = -16. \]

Now, before we get into finding the critical point(s), let us compute $D$ quickly.

$$D = (34)(10) - (-16)^2 = 84 > 0.$$


So, in this case $D$ will always be positive, and $f_{xx} = 34 > 0$ is always positive as well, so any critical points that we get are guaranteed to be relative minima. Now, let us find the critical point(s). This means solving the system

$$\begin{cases} 36 + 34x - 16y = 0,\\ -14 - 16x + 10y = 0. \end{cases}$$

To do this we can solve the first equation for $x$.

$$x = \frac{1}{34}(16y - 36) = \frac{1}{17}(8y - 18).$$

Now, plug this into the second equation and solve for $y$.

$$-14 - \frac{16}{17}(8y - 18) + 10y = 0 \quad\Longrightarrow\quad y = -\frac{25}{21}.$$

Substituting this back into the equation for $x$ gives $x = -34/21$. So we get a single critical point

$$\left(-\frac{34}{21},\ -\frac{25}{21}\right).$$

Also, since we know this will be a relative minimum and it is the only critical point, we know that these are also the $x$ and $y$ coordinates of the point on the plane that we are looking for. We can find the $z$ coordinate by plugging into the equation of the plane as follows:

$$z = 1 - 4\left(-\frac{34}{21}\right) + 2\left(-\frac{25}{21}\right) = \frac{107}{21}.$$

So, the point on the plane that is closest to $(-2, -1, 5)$ is

$$\left(-\frac{34}{21},\ -\frac{25}{21},\ \frac{107}{21}\right).$$

□
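The arithmetic in this example can be double-checked with exact fractions. The following is only a minimal sketch: the variable names and the helper `dist_sq` are illustrative choices, not part of the notes. It solves the linear critical-point system by Cramer's rule and confirms that nearby points on the plane are farther from $(-2, -1, 5)$.

```python
from fractions import Fraction

# Critical-point system from the example, written as
#   a11*x + a12*y = b1   (from 36 + 34x - 16y = 0)
#   a21*x + a22*y = b2   (from -14 - 16x + 10y = 0)
a11, a12, b1 = Fraction(34), Fraction(-16), Fraction(-36)
a21, a22, b2 = Fraction(-16), Fraction(10), Fraction(14)

det = a11 * a22 - a12 * a21      # 34*10 - (-16)*(-16)
x = (b1 * a22 - a12 * b2) / det  # Cramer's rule
y = (a11 * b2 - b1 * a21) / det
z = 1 - 4 * x + 2 * y            # back onto the plane z = 1 - 4x + 2y


def dist_sq(px, py):
    """Squared distance from (-2, -1, 5) to the plane point above (px, py)."""
    pz = 1 - 4 * px + 2 * py
    return (px + 2) ** 2 + (py + 1) ** 2 + (pz - 5) ** 2


print("closest point:", x, y, z)
print("squared distance:", dist_sq(x, y))
```

Because the system is linear, its coefficient determinant coincides with the value of $D$ computed above.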

Absolute minima and maxima


Now we are going to extend the work from the previous subsection. In the previous subsection we were asked to find and classify all critical points as relative minima, relative maxima and/or saddle points. In this subsection we want to optimize a function, that is, identify the absolute minimum and/or the absolute maximum of the function, on a given region in $\mathbb{R}^2$. Note that when we say we are going to be working on a region in $\mathbb{R}^2$, we mean that we are going to be looking at some region in the $xy$-plane. In order to optimize a function on a region, we first need to get a couple of definitions out of the way. Here are the definitions.

3.6.3 Definition

1. A region in $\mathbb{R}^2$ is called closed if it includes its boundary. A region is called open if it does not include any of its boundary points.
2. A region in $\mathbb{R}^2$ is called bounded if it can be completely contained in a disk. In other words, a region is bounded if it is finite in extent.


Let us think a little more about the definition of a closed region. We said a region is closed if it includes its boundary. Just what does this mean? Let us think of a rectangle. Below are two definitions of a rectangle, one open and the other closed.

$$\begin{aligned}
\text{Open:} &\quad -5 < x < 3, \quad 1 < y < 6,\\
\text{Closed:} &\quad -5 \le x \le 3, \quad 1 \le y \le 6.
\end{aligned}$$

In the first case we don't allow the ranges to include the endpoints (i.e., we are not including the edges of the rectangle), and so we are not allowing the region to include any points on the edge of the rectangle. In other words, we are not allowing the region to include its boundary, and so it is open. In the second case we are allowing the region to contain points on the edges, so it contains its entire boundary and hence is closed. This is an important idea because of the following fact.

Theorem 3.6.1 (Extreme Value Theorem) If $f(x, y)$ is continuous on some closed, bounded set $D$ in $\mathbb{R}^2$, then there are points $(x_1, y_1)$ and $(x_2, y_2)$ in $D$ such that $f(x_1, y_1)$ is the absolute maximum and $f(x_2, y_2)$ is the absolute minimum of the function in $D$.

Note that this theorem does NOT tell us where the absolute minimum or absolute maximum will occur. It only tells us that they exist. Note as well that the absolute minimum and/or absolute maximum may occur in the interior of the region or on the boundary of the region. The basic process for finding absolute extrema is pretty much identical to the process that we used in single variable calculus when we looked at finding absolute extrema of functions of a single variable. There will, however, be some differences to account for the fact that we are now dealing with functions of two variables. Here is the process.

Theorem 3.6.2 (Finding Absolute Extrema)

1. Find all the critical points of the function that lie in the region $D$ and determine the function value at each of these points.
2. Find all extrema of the function on the boundary. This usually involves the single variable calculus approach.
3. The largest and smallest values found in the first two steps are the absolute maximum and the absolute minimum of the function, respectively.

The main difference between this process and the process that we used in single variable calculus is that the boundary in single variable calculus is just two points, so there is very little to do in the second step. For these problems the majority of the work is often in the second step, as we will often end up doing an absolute extrema problem in single variable calculus one or more times. Let us take a look at a couple of examples.


Example 3.6.7 (Absolute extrema) Find the absolute minimum and absolute maximum of $f(x, y) = x^2 + 4y^2 - 2x^2 y + 4$ on the rectangle given by $-1 \le x \le 1$ and $-1 \le y \le 1$.

Solution

Let us first get a quick picture of the rectangle for reference purposes.

[Figure: the rectangle in the $xy$-plane bounded by the lines $x = -1$, $x = 1$, $y = -1$ and $y = 1$.]

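The boundary of this rectangle is given by the following conditions.

$$\begin{aligned}
\text{right side}: &\quad x = 1, && -1 \le y \le 1,\\
\text{left side}: &\quad x = -1, && -1 \le y \le 1,\\
\text{upper side}: &\quad y = 1, && -1 \le x \le 1,\\
\text{lower side}: &\quad y = -1, && -1 \le x \le 1.
\end{aligned}$$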

These will be important in the second step of our process. We will start this off by finding all the critical points that lie inside the given rectangle. To do this we will need the two first-order derivatives

$$f_x = 2x - 4xy, \qquad f_y = 8y - 2x^2.$$

Note that since we are not going to be classifying the critical points, we don't have to consider the second-order derivatives. To find the critical points we will need to solve the system

$$\begin{cases} 2x - 4xy = 0,\\ 8y - 2x^2 = 0. \end{cases}$$

We can solve the second equation for $y$ to get

$$y = \frac{x^2}{4}.$$

Plugging this into the first equation gives

$$2x - 4x\left(\frac{x^2}{4}\right) = 2x - x^3 = x(2 - x^2) = 0.$$


This implies that we must have

$$x = 0 \quad\text{or}\quad x = \pm\sqrt{2} \approx \pm 1.414.$$

Now, recall that we only want critical points in the region that we are given. This means that we only want critical points for which $-1 \le x \le 1$. The only value of $x$ that satisfies this is the first one, so we can ignore the last two for this problem. Note, however, that a simple change to the boundary would include these two, so don't forget to always check whether the critical points are in the region (or on the boundary, since that can also happen). Putting $x = 0$ into the equation for $y$ gives

$$y = \frac{0^2}{4} = 0.$$

The single critical point in the region (and again, that's important) is $(0, 0)$. We now need to get the value of the function at the critical point.

$$f(0, 0) = 4.$$

Eventually we will compare this to values of the function found in the next step and take the largest and smallest as the absolute extrema of the function in the rectangle. Now we have reached the long part of this problem. We need to find the absolute extrema of the function along the boundary of the rectangle. What this means is that we are going to look at what the function is doing along each of the sides of the rectangle listed above. Let us first take a look at the right side. As noted above, the right side is defined by

$$x = 1, \qquad -1 \le y \le 1.$$

Notice that along the right side we know that $x = 1$. Let us take advantage of this by defining a new function as follows:

$$g(y) = f(1, y) = (1)^2 + 4y^2 - 2(1)^2 y + 4 = 5 + 4y^2 - 2y.$$

Now, finding the absolute extrema of $f(x, y)$ along the right side is equivalent to finding the absolute extrema of $g(y)$ in the range $-1 \le y \le 1$. Recall how to do this from single variable calculus: we find the critical points of $g(y)$ in the range $-1 \le y \le 1$ and then evaluate $g(y)$ at the critical points and at the end points of the range of $y$'s. Let us do that for this problem.

$$g'(y) = 8y - 2 = 0 \quad\Longrightarrow\quad y = \frac{1}{4}.$$

This is in the range, and so we will need the following function evaluations.

$$g(-1) = 11, \qquad g(1) = 7, \qquad g\left(\tfrac{1}{4}\right) = \frac{19}{4} = 4.75.$$

Notice that, using the definition of $g(y)$, these are also function values for $f(x, y)$.

$$\begin{aligned}
g(-1) &= f(1, -1) = 11,\\
g(1) &= f(1, 1) = 7,\\
g\left(\tfrac{1}{4}\right) &= f\left(1, \tfrac{1}{4}\right) = \frac{19}{4} = 4.75.
\end{aligned}$$

We can now do the left side of the rectangle, which is defined by

$$x = -1, \qquad -1 \le y \le 1.$$


Again, we will define a new function (it doesn't matter that we are still using the symbol $g$) as follows:

$$g(y) = f(-1, y) = (-1)^2 + 4y^2 - 2(-1)^2 y + 4 = 5 + 4y^2 - 2y.$$

Notice that for this boundary, this is the same function as we looked at for the right side. This will not always happen, but for this example let us take advantage of the fact that we have already done the work for this function. We know that the critical point is $y = 1/4$, and we know that the function values at the critical point and the end points are

$$g(-1) = 11, \qquad g(1) = 7, \qquad g\left(\tfrac{1}{4}\right) = \frac{19}{4} = 4.75.$$

The only real difference here is that these correspond to values of $f(x, y)$ at different points than for the right side. In this case they correspond to the following function values for $f(x, y)$.

$$\begin{aligned}
g(-1) &= f(-1, -1) = 11,\\
g(1) &= f(-1, 1) = 7,\\
g\left(\tfrac{1}{4}\right) &= f\left(-1, \tfrac{1}{4}\right) = \frac{19}{4} = 4.75.
\end{aligned}$$

We can now look at the upper side, defined by

$$y = 1, \qquad -1 \le x \le 1.$$

We will again define a new function, except this time it will be a function of $x$.

$$h(x) = f(x, 1) = x^2 + 4(1)^2 - 2x^2(1) + 4 = 8 - x^2.$$

We need to find the absolute extrema of $h(x)$ on the range $-1 \le x \le 1$. First find the critical points.

$$h'(x) = -2x = 0 \quad\Longrightarrow\quad x = 0.$$

The values of this function at the critical point and the end points are

$$h(-1) = 7, \qquad h(1) = 7, \qquad h(0) = 8,$$

and the corresponding values for $f(x, y)$ are

$$\begin{aligned}
h(-1) &= f(-1, 1) = 7,\\
h(1) &= f(1, 1) = 7,\\
h(0) &= f(0, 1) = 8.
\end{aligned}$$

Note that there are several repeats here. The first two function values have already been computed when we looked at the right and left sides. This will often happen. Finally, we need to take care of the lower side. This side is defined by

$$y = -1, \qquad -1 \le x \le 1.$$

The new function we will define in this case is

$$h(x) = f(x, -1) = x^2 + 4(-1)^2 - 2x^2(-1) + 4 = 8 + 3x^2.$$

The critical point for this function is

$$h'(x) = 6x = 0 \quad\Longrightarrow\quad x = 0.$$


The function values at the critical point and the end points are

$$h(-1) = 11, \qquad h(1) = 11, \qquad h(0) = 8,$$

and the corresponding values for $f(x, y)$ are

$$\begin{aligned}
h(-1) &= f(-1, -1) = 11,\\
h(1) &= f(1, -1) = 11,\\
h(0) &= f(0, -1) = 8.
\end{aligned}$$

The final step of this long process is to collect all the function values for $f(x, y)$ that we have computed in this problem. Here they are.

$$\begin{aligned}
f(0, 0) &= 4, & f(1, -1) &= 11, & f(1, 1) &= 7,\\
f\left(1, \tfrac{1}{4}\right) &= 4.75, & f(-1, 1) &= 7, & f(-1, -1) &= 11,\\
f\left(-1, \tfrac{1}{4}\right) &= 4.75, & f(0, 1) &= 8, & f(0, -1) &= 8.
\end{aligned}$$

The absolute minimum is at $(0, 0)$, since it gives the smallest function value, and the absolute maximum occurs at $(1, -1)$ and $(-1, -1)$, since these two points give the largest value of all. □
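The comparison step of Example 3.6.7 is easy to reproduce numerically. The sketch below is illustrative only: the names `candidates`, `values` and `grid` are assumptions made here, and the grid search is a rough independent check rather than part of the method in the notes.

```python
def f(x, y):
    return x**2 + 4 * y**2 - 2 * x**2 * y + 4


# Candidates collected in the example: the interior critical point,
# the critical points of the four edge functions, and the four corners.
candidates = [
    (0, 0),
    (1, 0.25), (-1, 0.25),
    (0, 1), (0, -1),
    (1, 1), (-1, 1), (1, -1), (-1, -1),
]
values = {p: f(*p) for p in candidates}
f_min = min(values.values())
f_max = max(values.values())
argmin = [p for p, v in values.items() if v == f_min]
argmax = [p for p, v in values.items() if v == f_max]

# Independent brute-force check on a uniform grid over the closed rectangle.
n = 200
grid = [(-1 + 2 * i / n, -1 + 2 * j / n)
        for i in range(n + 1) for j in range(n + 1)]
grid_min = min(f(x, y) for x, y in grid)
grid_max = max(f(x, y) for x, y in grid)

print("min", f_min, "at", argmin)
print("max", f_max, "at", argmax)
print("grid check:", grid_min, grid_max)
```

The grid values should agree with the smallest and largest candidate values, which is what Theorem 3.6.2 predicts for a continuous function on a closed, bounded region.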

As this example has shown, these can be very long problems. Let us take a look at an easier problem with a different kind of boundary.

Example 3.6.8 (Absolute extrema) Find the absolute minimum and absolute maximum of $f(x, y) = 2x^2 - y^2 + 6y$ on the disk of radius 4, $x^2 + y^2 \le 16$.

Solution First note that a disk of radius 4 is given by the inequality in the problem statement. The "less than" part of the inequality gives the interior of the disk, and the "equal" part gives the boundary. Of course, this also means that the boundary of the disk is a circle of radius 4.

[Figure: the disk in the $xy$-plane with circular boundary $x^2 + y^2 = 16$.]


Let us find the critical points of the function that lie inside the disk. This will require the following two first-order partial derivatives.

$$f_x = 4x, \qquad f_y = -2y + 6.$$

To find the critical points we will need to solve the system

$$\begin{cases} 4x = 0,\\ -2y + 6 = 0. \end{cases}$$

This is a fairly simple system to solve. The first equation tells us that $x = 0$ and the second tells us that $y = 3$. So the only critical point for this function is $(0, 3)$, and this is inside the disk of radius 4. The function value at this critical point is

$$f(0, 3) = 9.$$

Now we need to look at the boundary. This one will be somewhat different from the previous example. In this case we don't have fixed values of $x$ and $y$ on the boundary. Instead we have

$$x^2 + y^2 = 16.$$

We can solve this for $x^2$ and plug this into the $x^2$ in $f(x, y)$ to get a function of $y$ as follows:

$$\begin{aligned}
x^2 &= 16 - y^2,\\
g(y) &= 2(16 - y^2) - y^2 + 6y = 32 - 3y^2 + 6y.
\end{aligned}$$

We will need to find the absolute extrema of this function on the range $-4 \le y \le 4$ (this is the range of $y$'s for the disk). We will first need the critical points of this function.

$$g'(y) = -6y + 6 = 0 \quad\Longrightarrow\quad y = 1.$$

The values of this function at the critical point and the end points are

$$g(-4) = -40, \qquad g(4) = 8, \qquad g(1) = 35.$$

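Unlike the first example, we will still need to find the values of $x$ that correspond to these. We can do this by plugging each value of $y$ into our equation for the circle and solving for $x$.

$$\begin{aligned}
y = -4: &\quad x^2 = 16 - 16 = 0 &&\Longrightarrow\quad x = 0,\\
y = 4: &\quad x^2 = 16 - 16 = 0 &&\Longrightarrow\quad x = 0,\\
y = 1: &\quad x^2 = 16 - 1 = 15 &&\Longrightarrow\quad x = \pm\sqrt{15}.
\end{aligned}$$

The function values for $g(y)$ then correspond to the following function values for $f(x, y)$.

$$\begin{aligned}
g(-4) = -40 &\quad\Longrightarrow\quad f(0, -4) = -40,\\
g(4) = 8 &\quad\Longrightarrow\quad f(0, 4) = 8,\\
g(1) = 35 &\quad\Longrightarrow\quad f(-\sqrt{15}, 1) = 35 \ \text{ and } \ f(\sqrt{15}, 1) = 35.
\end{aligned}$$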

Note that the third one actually corresponds to two different points for $f(x, y)$, since that value of $y$ produces two different values of $x$. So, comparing these values to the value of the function at the critical point of $f(x, y)$ that we found earlier, $f(0, 3) = 9$, we can see that the absolute minimum occurs at $(0, -4)$ while the absolute maximum occurs twice, at $(-\sqrt{15}, 1)$ and $(\sqrt{15}, 1)$. □
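As a rough numerical cross-check of Example 3.6.8, the boundary can also be sampled directly through the parametrization $x = 4\cos t$, $y = 4\sin t$. This parametrization is a swapped-in technique used here only for checking; the example itself eliminates $x^2$ instead. All names in the sketch are illustrative.

```python
import math


def f(x, y):
    return 2 * x**2 - y**2 + 6 * y


# Interior critical point and boundary candidates found in the example.
s15 = math.sqrt(15)
candidates = [(0, 3), (0, -4), (0, 4), (-s15, 1), (s15, 1)]
values = [(f(x, y), (x, y)) for x, y in candidates]
f_min, p_min = min(values)
f_max, p_max = max(values)

# Independent check: sample the boundary circle x = 4 cos t, y = 4 sin t.
n = 10_000
boundary = [
    f(4 * math.cos(2 * math.pi * k / n), 4 * math.sin(2 * math.pi * k / n))
    for k in range(n)
]

print("candidate min:", f_min, "at", p_min)
print("candidate max:", f_max, "at", p_max)
print("sampled boundary range:", min(boundary), max(boundary))
```

Since the interior critical value $f(0, 3) = 9$ lies strictly between the boundary extremes, both absolute extrema sit on the circle in this example.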

In these examples one of the absolute extrema actually occurred at more than one place. Sometimes this will happen and sometimes it won't, so don't read too much into the fact that it happened in both examples given here. Also note that, as we have seen, absolute extrema will often occur on the boundaries of these regions, although they don't have to. There are more complicated examples, with multiple critical points, in which the absolute extrema occur in the interior of the region and not on the boundary.


Lagrange multipliers
In the previous subsection we optimized (i.e., found the absolute extrema of) a function on a region that contains its boundary. Finding potential optimal points in the interior of the region is not too difficult in general: all that we need to do is find the critical points and plug them into the function. However, as we saw in Examples 3.6.7 and 3.6.8, finding potential optimal points on the boundary is often a fairly long and messy process. Now we are going to take a look at another method (Lagrange multipliers) for optimizing a function subject to given constraint(s). The constraint(s) may be equation(s) that describe the boundary of a region, although in this subsection we won't concentrate on those types of problems, since this method just requires a general constraint and doesn't really care where the constraint came from. So, let us get things set up. We want to optimize (find the minimum and maximum of) a function $f(x, y, z)$ subject to the constraint $g(x, y, z) = c$. Again, the constraint may be the equation that describes the boundary of a region, or it may not be. The process is actually fairly simple, although the work can still be a little overwhelming at times.

Theorem 3.6.3 (Method of Lagrange Multipliers)

1. Solve the following system of equations

$$\begin{cases} \nabla f(x, y, z) = \lambda \nabla g(x, y, z),\\ g(x, y, z) = c. \end{cases}$$

2. Plug all solutions $(x, y, z)$ from the first step into $f(x, y, z)$ and identify the minimum and maximum values, provided they exist.

The constant $\lambda$ is called the Lagrange multiplier.

Notice that the system of equations actually has four equations; we just wrote the system in a simpler form. To see this, let us take the first equation and put in the definition of the gradient vector to see what we get.

$$\langle f_x, f_y, f_z \rangle = \lambda \langle g_x, g_y, g_z \rangle = \langle \lambda g_x, \lambda g_y, \lambda g_z \rangle.$$

For these two vectors to be equal, the individual components must also be equal. So, we actually have three equations here.

$$f_x = \lambda g_x, \qquad f_y = \lambda g_y, \qquad f_z = \lambda g_z.$$

These three equations along with the constraint, $g(x, y, z) = c$, give four equations in the four unknowns $x$, $y$, $z$ and $\lambda$. Note as well that if we only have functions of two variables, then we won't have the third component of the gradient, and so we will only have three equations in the three unknowns $x$, $y$ and $\lambda$. Let us work a couple of examples.


Example 3.6.9 (Lagrange multiplier) Find the dimensions of the box with largest volume if the total surface area is 64 cm$^2$.

Solution Before we start the process here, note that we also have a way to solve this kind of problem in single variable calculus, except that in those problems we require a condition that relates one of the sides of the box to the other sides, so that we can get down to volume and surface area functions that only involve two variables. We no longer need this condition for these problems. Now, let us go on to solve the problem. We first need to identify the function that we are going to optimize as well as the constraint. Let us set the length of the box to be $x$, the width of the box to be $y$ and the height of the box to be $z$. We want to find the largest volume, and so the function that we want to optimize is given by

$$f(x, y, z) = xyz.$$

Next, we know that the surface area of the box must be a constant 64. So this is the constraint. The surface area of a box is simply the sum of the areas of each of the sides, so the constraint is given by

$$2xy + 2yz + 2xz = 64 \quad\Longrightarrow\quad xy + yz + xz = 32.$$

Note that we divided the constraint by 2 to simplify the equation a little. Also, we get the function $g(x, y, z)$ from this.

$$g(x, y, z) = xy + yz + xz.$$

Here are the four equations that we need to solve.

\begin{align}
yz &= \lambda (y + z) && (f_x = \lambda g_x), \tag{3.4}\\
xz &= \lambda (x + z) && (f_y = \lambda g_y), \tag{3.5}\\
xy &= \lambda (x + y) && (f_z = \lambda g_z), \tag{3.6}\\
xy + yz + xz &= 32 && (g(x, y, z) = 32). \tag{3.7}
\end{align}

Although the equations are nonlinear, there are many ways to solve this system. We will solve it in the following way. Let us multiply equation (3.4) by $x$, equation (3.5) by $y$ and equation (3.6) by $z$.

\begin{align}
xyz &= \lambda x(y + z), \tag{3.8}\\
xyz &= \lambda y(x + z), \tag{3.9}\\
xyz &= \lambda z(x + y). \tag{3.10}
\end{align}

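Now notice that we can set equations (3.8) and (3.9) equal. Doing this gives

$$\begin{aligned}
\lambda x(y + z) &= \lambda y(x + z),\\
\lambda (xy + xz) - \lambda (xy + yz) &= 0,\\
\lambda (xz - yz) &= 0 \quad\Longrightarrow\quad \lambda = 0 \ \text{ or } \ xz = yz.
\end{aligned}$$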

This implies two possibilities. The first, $\lambda = 0$, is not possible, since in that case equation (3.4) reduces to

$$yz = 0 \quad\Longrightarrow\quad y = 0 \ \text{ or } \ z = 0.$$

Since we are talking about the dimensions of a box, neither of these is possible, so we can discount $\lambda = 0$. This leaves the second possibility,

$$xz = yz.$$

Since we know that $z \ne 0$ (again, since we are talking about the dimensions of a box), we can cancel the $z$ from both sides. This gives

$$x = y. \tag{3.11}$$


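Next, let us set equations (3.9) and (3.10) equal. Doing this gives

$$\begin{aligned}
\lambda y(x + z) &= \lambda z(x + y),\\
\lambda (xy + yz - xz - yz) &= 0,\\
\lambda (xy - xz) &= 0 \quad\Longrightarrow\quad \lambda = 0 \ \text{ or } \ xy = xz.
\end{aligned}$$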

As already discussed, we know that $\lambda = 0$ won't work, and so this gives $xy = xz$. We can also say that $x \ne 0$, since we are dealing with the dimensions of a box, so we have

$$y = z. \tag{3.12}$$

Plugging equations (3.11) and (3.12) into equation (3.7) we get

$$y^2 + y^2 + y^2 = 3y^2 = 32 \quad\Longrightarrow\quad y = \pm\sqrt{\frac{32}{3}} \approx \pm 3.266.$$

However, we know that $y$ must be positive, since we are talking about the dimensions of a box. Therefore the only solution that makes physical sense here is

$$x = y = z \approx 3.266 \text{ cm}.$$

This shows that we have a cube here. We should be a little careful, though. Since we have obtained only one solution, we might be tempted to assume that these are the dimensions that give the largest volume. The method of Lagrange multipliers gives a set of candidate points that will either maximize or minimize a given function subject to the constraint. However, when we get a single solution it may be either a maximum or a minimum. To check that we indeed have a maximum, as we want, we can pick any other point that satisfies the constraint and compare its volume with the volume at the point we found above. If the volume at the point above is larger than at the second point, then the point we found is a maximum rather than a minimum. To get the second point, let us choose $y = z = 2$. Plugging these into the constraint gives

$$2x + 2x + 4 = 32 \quad\Longrightarrow\quad x = 7.$$

Checking the volume at the two points gives

$$\begin{aligned}
f(3.266, 3.266, 3.266) &\approx 34.8376,\\
f(7, 2, 2) &= 28.
\end{aligned}$$

So we did get a maximum value, as expected. □
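The comparison in Example 3.6.9 can be extended from one test point to many. In the sketch below, `box_on_constraint` is a hypothetical helper (not from the notes) that solves the constraint $xy + yz + xz = 32$ for $x$ once $y$ and $z$ are chosen; scanning a grid of $(y, z)$ values gives a rough sanity check that no sampled box beats the cube.

```python
import math


def volume(x, y, z):
    return x * y * z


def surface_area(x, y, z):
    return 2 * (x * y + y * z + x * z)


def box_on_constraint(y, z):
    """Hypothetical helper: solve xy + yz + xz = 32 for x, given y and z."""
    return ((32 - y * z) / (y + z), y, z)


side = math.sqrt(32 / 3)            # x = y = z from 3y^2 = 32
cube = (side, side, side)
other = box_on_constraint(2, 2)     # the comparison point (7, 2, 2)

# Scan boxes on the constraint surface; y*z < 32 keeps x positive.
step = 0.05
best_sampled = max(
    volume(*box_on_constraint(step * i, step * j))
    for i in range(1, 200)
    for j in range(1, 200)
    if (step * i) * (step * j) < 32
)

print("cube:", cube, "volume:", volume(*cube))
print("comparison box:", other, "volume:", volume(*other))
print("best sampled volume:", best_sampled)
```

Note that the exact side length $\sqrt{32/3}$ gives a volume slightly below the figure obtained from the rounded value 3.266.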

Notice that we never actually found values for $\lambda$ in the above example. This is fairly standard for these kinds of problems. The value of $\lambda$ is not really important for determining whether the point is a maximum or a minimum, so often we will not bother with finding a value for it. On occasion we will need its value to help solve the system; let us take a look at the next example for illustration.

Example 3.6.10 (Lagrange multiplier) Find the maximum and minimum of $f(x, y) = 5x - 3y$ subject to the constraint $x^2 + y^2 = 136$.


Solution This one is going to be a little easier than the previous one, since it only has two variables. Here is the system that we need to solve.

$$\begin{aligned}
5 &= 2\lambda x,\\
-3 &= 2\lambda y,\\
x^2 + y^2 &= 136.
\end{aligned}$$

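Notice that, as with the last example, we cannot have $\lambda = 0$, since that would not satisfy the first two equations. So, since we know that $\lambda \ne 0$, we can solve the first two equations for $x$ and $y$, respectively. This gives

$$x = \frac{5}{2\lambda}, \qquad y = -\frac{3}{2\lambda}.$$

Plugging these into the constraint gives

$$\frac{25}{4\lambda^2} + \frac{9}{4\lambda^2} = \frac{17}{2\lambda^2} = 136.$$

We can solve this for $\lambda$.

$$\lambda^2 = \frac{1}{16} \quad\Longrightarrow\quad \lambda = \pm\frac{1}{4}.$$

Now that we know $\lambda$, we can find the points that will be potential maxima and/or minima.

$$\begin{aligned}
\text{If } \lambda = -\tfrac{1}{4}, \text{ we get} &\quad x = -10, \quad y = 6.\\
\text{If } \lambda = \tfrac{1}{4}, \text{ we get} &\quad x = 10, \quad y = -6.
\end{aligned}$$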

To determine whether we have maxima or minima, we just need to plug these into the function.

$$\begin{aligned}
f(-10, 6) &= -68, && \text{minimum at } (-10, 6),\\
f(10, -6) &= 68, && \text{maximum at } (10, -6).
\end{aligned}$$

□
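A short numerical check of Example 3.6.10 is sketched below. It rebuilds the two candidate points from $\lambda = \pm 1/4$ and then samples the constraint circle directly; the sampling step and all variable names are assumptions added here for illustration.

```python
import math


def f(x, y):
    return 5 * x - 3 * y


def g(x, y):
    return x**2 + y**2


# From 5 = 2*lam*x and -3 = 2*lam*y with lam = +1/4 or -1/4.
solutions = []
for lam in (0.25, -0.25):
    x = 5 / (2 * lam)
    y = -3 / (2 * lam)
    solutions.append((x, y))

# Independent check: sample the constraint circle of radius sqrt(136).
r = math.sqrt(136)
n = 10_000
samples = [
    f(r * math.cos(2 * math.pi * k / n), r * math.sin(2 * math.pi * k / n))
    for k in range(n)
]

for x, y in solutions:
    print((x, y), "g =", g(x, y), "f =", f(x, y))
print("sampled range of f on the circle:", min(samples), max(samples))
```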

In the first two examples we have excluded $\lambda = 0$, either for physical reasons or because it would not satisfy one or more of the equations. Do not always expect this to happen. Sometimes we will be able to automatically exclude a value of $\lambda$ and sometimes we won't. Let us take a look at another example.

Example 3.6.11 (Lagrange multiplier) Find the maximum and minimum of $f(x, y, z) = xyz$ subject to the constraint $x + y + z = 1$. Assume that $x, y, z \ge 0$.

Solution Here is the system that we need to solve.

\begin{align}
yz &= \lambda, \tag{3.13}\\
xz &= \lambda, \tag{3.14}\\
xy &= \lambda, \tag{3.15}\\
x + y + z &= 1. \tag{3.16}
\end{align}


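Let us start this solution process off by noticing that, since the first three equations all have $\lambda$ on the right-hand side, their left-hand sides are all equal. So, let us start off by setting equations (3.13) and (3.14) equal.

$$yz = xz \quad\Longrightarrow\quad z(y - x) = 0 \quad\Longrightarrow\quad z = 0 \ \text{ or } \ y = x.$$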

So, we have two possibilities here. Let us consider the first possibility: $z = 0$. With this we can see from either equation (3.13) or (3.14) that we must have $\lambda = 0$. From equation (3.15) we see that this means that $xy = 0$. This in turn means that either $x = 0$ or $y = 0$. So, we have two possible cases to deal with here. In each case two of the variables must be zero. Once we know this, we can plug into the constraint, equation (3.16), to find the remaining value.

$$\begin{aligned}
z = 0 \ \text{ and } \ x = 0 &\quad\Longrightarrow\quad y = 1,\\
z = 0 \ \text{ and } \ y = 0 &\quad\Longrightarrow\quad x = 1.
\end{aligned}$$

So, we get two possible solutions, $(0, 1, 0)$ and $(1, 0, 0)$. Now, let us go back and take a look at the other possibility, $y = x$. We have two possible cases to look at here as well. The first case is $x = y = 0$. In this case we can see from the constraint that we must have $z = 1$, and so we now have a third solution, $(0, 0, 1)$. The second case is $x = y \ne 0$. Let us set equations (3.14) and (3.15) equal.

$$xz = xy \quad\Longrightarrow\quad x(z - y) = 0 \quad\Longrightarrow\quad x = 0 \ \text{ or } \ z = y.$$

Now, we have already assumed that $x \ne 0$, and so the only possibility is that $z = y$. However, this also means that $x = y = z$. Using this in the constraint gives

$$3x = 1 \quad\Longrightarrow\quad x = \frac{1}{3}.$$

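So, the next solution is $\left(\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}\right)$. We got four solutions by setting the first two equations equal. To completely finish this problem we should also set equations (3.13) and (3.15) equal, as well as equations (3.14) and (3.15), to see what we get. Doing this gives

$$\begin{aligned}
yz = xy &\quad\Longrightarrow\quad y(z - x) = 0 \quad\Longrightarrow\quad y = 0 \ \text{ or } \ z = x,\\
xz = xy &\quad\Longrightarrow\quad x(z - y) = 0 \quad\Longrightarrow\quad x = 0 \ \text{ or } \ z = y.
\end{aligned}$$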

Both of these are very similar to the first situation that we looked at, and we will leave it to you to show that in each of these cases we arrive back at the four solutions that we already found. So, we have four solutions that we need to check in the function to see whether we have minima or maxima.

$$\begin{aligned}
f(0, 0, 1) = 0, \quad f(0, 1, 0) = 0, \quad f(1, 0, 0) = 0 &: \ \text{all minima},\\
f\left(\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}\right) = \frac{1}{27} &: \ \text{maximum}.
\end{aligned}$$

So, in this case the maximum occurs only once while the minimum occurs three times. Note as well that we never really used the assumption that $x, y, z \ge 0$ in this problem. This assumption is here mostly to make sure that we really do have a maximum and a minimum of the function. Without this assumption it would not be too difficult to find points that give both larger and smaller values for the function. For example,

$$\begin{aligned}
x = -100,\ y = 100,\ z = 1: &\quad -100 + 100 + 1 = 1, && f(-100, 100, 1) = -10000,\\
x = -50,\ y = -50,\ z = 101: &\quad -50 - 50 + 101 = 1, && f(-50, -50, 101) = 252500.
\end{aligned}$$

With these examples you can clearly see that it is not too hard to find points that give larger and smaller function values. However, all of these examples required negative values of $x$, $y$ and/or $z$ in order to satisfy the constraint. By eliminating these we can now say that we have found the minimum and maximum values of the function. □
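The four candidates of Example 3.6.11, and the role of the assumption $x, y, z \ge 0$, can be checked with exact arithmetic. The following sketch is illustrative only; the grid over the triangle $x + y + z = 1$, $x, y, z \ge 0$ and every name in it are assumptions introduced here.

```python
from fractions import Fraction


def f(x, y, z):
    return x * y * z


third = Fraction(1, 3)
candidates = [(0, 0, 1), (0, 1, 0), (1, 0, 0), (third, third, third)]
values = {p: f(*p) for p in candidates}

# Brute-force check on the triangle x + y + z = 1 with x, y, z >= 0.
# n is a multiple of 3 so that (1/3, 1/3, 1/3) is one of the grid points.
n = 60
grid = [
    (Fraction(i, n), Fraction(j, n), Fraction(n - i - j, n))
    for i in range(n + 1)
    for j in range(n + 1 - i)
]
grid_values = [f(*p) for p in grid]

# Points that satisfy the constraint but break the sign assumption.
outside = [(-100, 100, 1), (-50, -50, 101)]

print("candidate values:", values)
print("grid min and max:", min(grid_values), max(grid_values))
print("without the sign assumption:", [f(*p) for p in outside])
```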

To this point we have only looked at constraints that were equations. We can also have constraints that are inequalities. The process for these types of problems is nearly identical to what we have been doing. The main difference between the two types of problems is that we will also need to find all the critical points that satisfy the inequality in the constraint, and check these in the function when we check the values we found using Lagrange multipliers. We are not going to give any examples of this type here, and this will not appear in our examinations.

The final topic that we need to discuss here is what to do if we have more than one constraint. We will look at two constraints, but we can naturally extend the work here to more than two constraints. We want to optimize $f(x, y, z)$ subject to the constraints $g(x, y, z) = c$ and $h(x, y, z) = k$. The system that we need to solve in this case is

$$\begin{cases}
\nabla f(x, y, z) = \lambda \nabla g(x, y, z) + \mu \nabla h(x, y, z),\\
g(x, y, z) = c,\\
h(x, y, z) = k.
\end{cases}$$

So in this case we get two Lagrange multipliers (i.e., $\lambda$ and $\mu$). Also, note that the first equation really is three equations, as we saw in the previous examples. Let us see an example of this kind of optimization problem.

Example 3.6.12 (Lagrange multiplier) Find the maximum and minimum of $f(x, y, z) = 4y - 2z$ subject to the constraints $2x - y - z = 2$ and $x^2 + y^2 = 1$.

Solution Here is the system that we need to solve.

\begin{align}
0 &= 2\lambda + 2\mu x && (f_x = \lambda g_x + \mu h_x), \tag{3.17}\\
4 &= -\lambda + 2\mu y && (f_y = \lambda g_y + \mu h_y), \tag{3.18}\\
-2 &= -\lambda && (f_z = \lambda g_z + \mu h_z), \tag{3.19}\\
2x - y - z &= 2, \tag{3.20}\\
x^2 + y^2 &= 1. \tag{3.21}
\end{align}


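First, let us notice that from equation (3.19) we get $\lambda = 2$. Plugging this into equation (3.17) and equation (3.18) and solving for $x$ and $y$, respectively, gives

$$\begin{aligned}
0 = 4 + 2\mu x &\quad\Longrightarrow\quad x = -\frac{2}{\mu},\\
4 = -2 + 2\mu y &\quad\Longrightarrow\quad y = \frac{3}{\mu}.
\end{aligned}$$

Now, plug these into equation (3.21).

$$\frac{4}{\mu^2} + \frac{9}{\mu^2} = \frac{13}{\mu^2} = 1 \quad\Longrightarrow\quad \mu = \pm\sqrt{13}.$$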

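So, we have two cases to look at here. First, let us see what we get when $\mu = \sqrt{13}$. In this case we have

$$x = -\frac{2}{\sqrt{13}} \quad\text{and}\quad y = \frac{3}{\sqrt{13}}.$$

Plugging these into equation (3.20) gives

$$-\frac{4}{\sqrt{13}} - \frac{3}{\sqrt{13}} - z = 2 \quad\Longrightarrow\quad z = -2 - \frac{7}{\sqrt{13}}.$$

So, we have obtained one solution. Let us now see what we get if we take $\mu = -\sqrt{13}$. Here we have

$$x = \frac{2}{\sqrt{13}} \quad\text{and}\quad y = -\frac{3}{\sqrt{13}}.$$

Plugging these into equation (3.20) gives

$$\frac{4}{\sqrt{13}} + \frac{3}{\sqrt{13}} - z = 2 \quad\Longrightarrow\quad z = -2 + \frac{7}{\sqrt{13}},$$

and this is the second solution.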

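Now all that we need to do is check the two solutions in the function to see which is the maximum and which is the minimum.

$$\begin{aligned}
f\left(-\frac{2}{\sqrt{13}},\ \frac{3}{\sqrt{13}},\ -2 - \frac{7}{\sqrt{13}}\right) &= 4 + \frac{26}{\sqrt{13}} \approx 11.2111,\\
f\left(\frac{2}{\sqrt{13}},\ -\frac{3}{\sqrt{13}},\ -2 + \frac{7}{\sqrt{13}}\right) &= 4 - \frac{26}{\sqrt{13}} \approx -3.2111.
\end{aligned}$$

So, we have a maximum at

$$\left(-\frac{2}{\sqrt{13}},\ \frac{3}{\sqrt{13}},\ -2 - \frac{7}{\sqrt{13}}\right)$$

and a minimum at

$$\left(\frac{2}{\sqrt{13}},\ -\frac{3}{\sqrt{13}},\ -2 + \frac{7}{\sqrt{13}}\right).$$

□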
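For the two-constraint problem in Example 3.6.12, the feasible set is a closed curve, so it can be sampled directly as a rough check. The sketch below uses the parametrization $x = \cos t$, $y = \sin t$, $z = 2x - y - 2$; this is a swapped-in checking technique and not the Lagrange method itself, and the function name `on_constraints` is hypothetical.

```python
import math


def f(x, y, z):
    return 4 * y - 2 * z


r13 = math.sqrt(13)
p_max = (-2 / r13, 3 / r13, -2 - 7 / r13)   # solution for mu = sqrt(13)
p_min = (2 / r13, -3 / r13, -2 + 7 / r13)   # solution for mu = -sqrt(13)


def on_constraints(t):
    """Point satisfying x^2 + y^2 = 1 and 2x - y - z = 2, indexed by angle t."""
    x, y = math.cos(t), math.sin(t)
    return (x, y, 2 * x - y - 2)


n = 10_000
samples = [f(*on_constraints(2 * math.pi * k / n)) for k in range(n)]

print("f at the Lagrange points:", f(*p_max), f(*p_min))
print("sampled range of f:", min(samples), max(samples))
```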


Chapter 4

Multiple Integrals
4.1 Double Integrals

Before starting on double integrals, let us do a quick review of the definition of the definite integral for functions of a single variable. For these integrals we are integrating over the interval $a \le x \le b$,

$$\int_a^b f(x)\,dx.$$

Now, to make the definition of the definite integral, we first consider it as a classical area problem. What is the area problem? Historically, the idea of limits has been used to find the areas of different shapes. Let us first consider a simple example: how can the area of a circle be found by a limiting process?

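[Figure: a circular disc cut into equal pieces and rearranged into an approximate rectangle, with height about the same as the radius and width approximately half the circumference.]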

We divide the circular disc into pieces of equal size and then rearrange the pieces as in the figure above. In the limit of an increasing number of pieces (i.e., smaller and smaller divisions), the rearrangement gives something closer and closer to a rectangle, whose area can easily be found. Thus the area of the circle is the same as the area of the limiting rectangle, which is height $\times$ width $= r \times \pi r = \pi r^2$.

However, in general, we cannot divide things up like we can with the circle. We need a general method that will work for a wider variety of shapes. For example, given a region in the plane, we need a consistent method that helps to find the enclosed area. Very often this refers to finding the area of the region bounded above by the graph of a given positive function $f(x)$, bounded below by the $x$-axis, bounded to the left by the vertical line $x = a$, and to the right by the vertical line $x = b$. The answer to this problem came through a very nice idea: first divide the region into strips (figure below) and approximate the area of each strip by a small rectangle whose area we know how to find. Of course, the area approximated by the rectangles is not quite the same as the area under the curve, but as the rectangles get thinner (and more numerous at the same time), the total area of the rectangles gets closer to the real area. Again, this is a limiting process similar to the disc area problem.


The following is a figure illustrating the above idea of dividing the region into (infinitely many) strips.

As mentioned before, we may regard integration as the limiting process of an algebraic summation. So it is not hard to imagine that definite integration plays the role of summing up the areas of rectangles in a limiting way. In the following we derive the mathematics of finding areas by definite integration and see how the method works.

Mathematical formulation. Let $f(x)$ be a continuous positive function defined on the closed interval $[a, b]$. We divide the interval $[a, b]$ by any finite set of points, called a partition $P$ of $[a, b]$, i.e.,

$$a = x_0 < x_1 < x_2 < \cdots < x_{n-1} < x_n = b.$$

Hence, $[a, b]$ is divided into $n$ subintervals:

$$[a, x_1],\ [x_1, x_2],\ [x_2, x_3],\ \ldots,\ [x_{i-1}, x_i],\ \ldots,\ [x_{n-1}, b].$$

The length of each subinterval is $\Delta x_i = x_i - x_{i-1}$, $i = 1, 2, \ldots, n$. Denote the longest length of these subintervals by

$$\Delta x = \max_{1 \le i \le n} \{\Delta x_i\},$$

and call it the norm of the partition $P$. On each of the subintervals $[x_{i-1}, x_i]$ ($i = 1, 2, \ldots, n$), we choose an arbitrary point $x_i^*$ such that $x_{i-1} \le x_i^* \le x_i$. The choice of $x_i^*$ defines a particular rectangle, with $f(x_i^*)$ as its height and $\Delta x_i$ as its width, on the subinterval $[x_{i-1}, x_i]$. The following figure summarizes the details of what we have discussed.

[Figure: the graph of $y = f(x)$ with one strip of width $\Delta x_i$ over $[x_{i-1}, x_i]$ magnified on the right; the sample point $x_i^*$ is movable within $x_{i-1} \le x_i^* \le x_i$.]

A Riemann sum of $f$ for the partition $P$ is defined as a finite algebraic sum of the form

$$S_n = \sum_{i=1}^{n} f(x_i^*)\,\Delta x_i, \qquad \text{where } x_i^* \in [x_{i-1}, x_i] \ \text{ for } i = 1, 2, \ldots, n.$$

The sum $S_n$ is nothing but the sum of the areas of the rectangles over $[a, b]$. If the sum has a limit as the number of subintervals increases without bound (while the longest length $\Delta x$ goes to zero), and if the limit is independent of the choices of $x_i^*$, then we call the limit the definite integral of $f(x)$ over $[a, b]$ and write it as $\int_a^b f(x)\,dx$. We summarize this in the following definition.


4.1.1 Definition Let $f$ be a real-valued function defined on $[a, b]$, and let $P$ be a partition of $[a, b]$. If the limit $\lim_{n \to \infty} \sum_{i=1}^{n} f(x_i^*)\,\Delta x_i$ exists independently of the mode of subdivision of $[a, b]$ and of the choice of $x_i^*$, then the limit is called the definite integral of $f(x)$ from $x = a$ to $x = b$ and is written as

$$\lim_{n \to \infty} \sum_{i=1}^{n} f(x_i^*)\,\Delta x_i = \int_a^b f(x)\,dx. \tag{4.1}$$

The function $f(x)$ is called the integrand. The numbers $a$ and $b$ are called the lower and upper limits of integration, respectively, and $[a, b]$ is called the range of integration. The letter $x$ is the variable of integration, which is a dummy variable that can be replaced by any other letter. It can be proved that the definite integral $\int_a^b f(x)\,dx$ exists if $f$ is continuous on $[a, b]$. It also exists if $f$ is piecewise continuous.

Geometrically, $\int_a^b f(x)\,dx$ represents the area bounded by the curve $y = f(x)$, the $x$-axis and the vertical lines $x = a$, $x = b$ only if $f(x) \ge 0$ (i.e., the whole graph of $f$ lies in the upper half plane). Otherwise, the definite integral represents the algebraic sum of the areas above and below the $x$-axis, treating areas above the $x$-axis as positive and areas below the $x$-axis as negative. That is to say, in general, a definite integral gives the net area between the graph of $f$ and the $x$-axis, i.e., the sum of the areas of the regions where $y = f(x)$ is above the $x$-axis minus the sum of the areas of the regions where $y = f(x)$ is below the $x$-axis.
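The Riemann sum in the definition can be tried out numerically. The following is a minimal Python sketch, not part of the original notes: the helper `riemann_sum`, its `pick` parameter and the test integrand $f(x) = x^2$ on $[0, 1]$ are illustrative assumptions, and equal-width subintervals are used for simplicity even though the definition allows any partition.

```python
def riemann_sum(f, a, b, n, pick=0.5):
    """Riemann sum of f on [a, b] with n equal subintervals.

    pick in [0, 1] selects the sample point x_i* inside each subinterval:
    0 = left endpoint, 0.5 = midpoint, 1 = right endpoint.
    """
    dx = (b - a) / n
    return sum(f(a + (i + pick) * dx) * dx for i in range(n))


# Illustrative integrand (an assumption, not from the notes): f(x) = x^2 on [0, 1].
for n in (10, 100, 1000):
    left = riemann_sum(lambda x: x * x, 0.0, 1.0, n, pick=0.0)
    right = riemann_sum(lambda x: x * x, 0.0, 1.0, n, pick=1.0)
    print(n, left, right)  # both columns approach 1/3 as n grows
```

The left and right sums bracket the integral for this increasing integrand, which illustrates why the limit does not depend on the choice of $x_i^*$.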

Double integrals. In this section we want to integrate a function of two variables, $f(x, y)$. For functions of one variable we integrate over an interval (i.e., a one-dimensional space), so it makes sense that when integrating a function of two variables we integrate over a region in $\mathbb{R}^2$ (i.e., a two-dimensional space). We begin by assuming that the region in $\mathbb{R}^2$ is a rectangle, which we denote by
\[ D = [a, b] \times [c, d]. \]
This means that the ranges for $x$ and $y$ are $a \le x \le b$ and $c \le y \le d$. Also we will initially assume that $f(x, y) \ge 0$, although this does not really have to be the case. Let us start with the graph of the surface $S$ given by $z = f(x, y)$ over the rectangle $D$.

[Figure: the surface $S$ above the rectangular region $D = [a, b] \times [c, d]$ in the $xy$-plane.]

4. Multiple Integrals

Now, just as with functions of one variable, let us first ask what the volume of the region under the surface $S$ (and above the $xy$-plane, of course) is. We will first approximate the volume, much as we approximated the area before. We divide $a \le x \le b$ into $n$ subintervals and $c \le y \le d$ into $m$ subintervals. This divides the region $D$ into a collection of smaller rectangles, and from each of these we choose a point $(x_i^*, y_j^*)$. Here is a sketch of this setup.

[Figure: the region $D$ subdivided by $a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b$ and $c = y_0 < \cdots < y_m = d$, with a sample point $(x_i^*, y_j^*)$ in one of the small rectangles.]

Now, over each of these smaller rectangles we construct a box whose height is given by $f(x_i^*, y_j^*)$. Here is a sketch of that.

[Figure: boxes under the surface $S$ standing on the small rectangles of the region $D$ (projection).]

Each of the rectangles has a base area of $\Delta A$ and a height of $f(x_i^*, y_j^*)$, so the volume of each of these boxes is $f(x_i^*, y_j^*)\,\Delta A$. The volume under the surface $S$ is then approximately
\[ \sum_{i=1}^{n} \sum_{j=1}^{m} f(x_i^*, y_j^*)\,\Delta A. \]

We will have a double sum since we will need to add up volumes in both the x and y directions.


To get a better estimate of the volume we let $n$ and $m$ go to infinity. In other words,
\[ V = \lim_{n, m \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} f(x_i^*, y_j^*)\,\Delta A. \]

Now, this looks familiar. It looks like the definition of the integral of a function of a single variable. In fact this is also the definition of a double integral, or more exactly an integral of a function of two variables over a rectangle. Here is the formal definition of a double integral of a function of two variables over a rectangular region $D$, as well as the notation that we will use for it:
\[ \iint_D f(x, y)\,dA = \lim_{n, m \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} f(x_i^*, y_j^*)\,\Delta A. \]

Note the similarities and differences in the notation compared with single integrals. We have two integral signs (i.e., $\iint$) to denote the fact that we are dealing with a two-dimensional region, and we have a differential here as well. Note that the differential is $dA$ instead of the $dx$ and $dy$ that we are used to seeing. Note as well that we do not have limits on the integrals in this notation. Instead we have the $D$ written below the two integral signs to denote the region that we are integrating over. One interpretation of the double integral of $f(x, y)$ over the rectangle $D$ is the volume under the surface $z = f(x, y)$ (and above the $xy$-plane), or
\[ \text{volume} = \iint_D f(x, y)\,dA. \]

We can use the double sum in the definition to estimate the value of a double integral if we need to. We can do this by choosing $(x_i^*, y_j^*)$ to be the midpoint of each rectangle. When we do this we usually denote the point by $(\bar{x}_i, \bar{y}_j)$. This leads to the midpoint rule
\[ \iint_D f(x, y)\,dA \approx \sum_{i=1}^{n} \sum_{j=1}^{m} f(\bar{x}_i, \bar{y}_j)\,\Delta A. \]
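As a rough illustration of the midpoint rule, here is a short Python sketch. It is not from the original notes: the function name `midpoint_double` and the grid size of 100 by 100 are assumptions chosen for illustration. The integrand $6xy^2$ on $[2, 4] \times [1, 2]$ is the one from part (a) of Example 4.1.1, whose exact value is 84.

```python
def midpoint_double(f, a, b, c, d, n, m):
    """Midpoint-rule estimate of the double integral of f over [a, b] x [c, d]."""
    dx = (b - a) / n
    dy = (d - c) / m
    dA = dx * dy
    total = 0.0
    for i in range(n):
        x_mid = a + (i + 0.5) * dx      # midpoint of the i-th x-subinterval
        for j in range(m):
            y_mid = c + (j + 0.5) * dy  # midpoint of the j-th y-subinterval
            total += f(x_mid, y_mid) * dA
    return total


estimate = midpoint_double(lambda x, y: 6 * x * y**2, 2, 4, 1, 2, 100, 100)
print(estimate)  # close to the exact value 84
```

Refining the grid (larger `n` and `m`) should bring the estimate closer to the exact value, in line with the limit in the definition.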

In the next section we start looking at how to actually compute double integrals.

Iterated integrals
We defined the double integral above as a limit of double sums. However, just as with single integrals, the definition is very difficult to use in practice, so we need to look at how we actually compute double integrals. We still assume that we are integrating over the rectangle $D = [a, b] \times [c, d]$. We will look at non-rectangular regions in the next section. The following theorem tells us how to compute a double integral over a rectangle.

Theorem 4.1.1 (Fubini's Theorem) If $f(x, y)$ is continuous on $D = [a, b] \times [c, d]$, then
\[ \iint_D f(x, y)\,dA = \int_a^b \int_c^d f(x, y)\,dy\,dx = \int_c^d \int_a^b f(x, y)\,dx\,dy. \]

These integrals are called iterated integrals.

Note that there are in fact two ways to compute a double integral. Notice also that the inner differential matches up with the limits on the inner integral, and similarly for the outer differential and limits. In other words, if the inner differential is $dy$ then the limits on the inner integral must be $y$ limits of integration, and if the outer differential is $dy$ then the limits on the outer integral must be $y$ limits of integration. Now, on some level this is just notation and does not really tell us how to compute the double integral. Let us take the first possibility above and change the notation a little:
\[ \iint_D f(x, y)\,dA = \int_a^b \left[ \int_c^d f(x, y)\,dy \right] dx. \]
We will compute the double integral by first computing the inner integral
\[ \int_c^d f(x, y)\,dy \]

and we compute this by holding $x$ constant and integrating with respect to $y$ as if this were a single integral. The output of this integral is therefore a function involving only $x$, which we can in turn integrate. We have done a similar process with partial derivatives. To take the derivative of a function with respect to $y$ we treated $x$ as a constant and differentiated with respect to $y$ as if it were a function of a single variable. Double integrals work in the same manner. We think of all the $x$'s as constants and integrate with respect to $y$, or we think of all the $y$'s as constants and integrate with respect to $x$, depending on which of the iterated integrals is considered. Let us take a look at some examples.

Example 4.1.1 (Double integrals) Compute each of the following double integrals over the indicated rectangles.

(a) $\iint_D 6xy^2\,dA$, $D = [2, 4] \times [1, 2]$.

(b) $\iint_D (2x - 4y^3)\,dA$, $D = [-5, 4] \times [0, 3]$.

(c) $\iint_D \left( x^2 y^2 + \cos(\pi x) + \sin(\pi y) \right) dA$, $D = [-2, -1] \times [0, 1]$.

(d) $\iint_D \dfrac{1}{(2x + 3y)^2}\,dA$, $D = [0, 1] \times [1, 2]$.

(e) $\iint_D x e^{xy}\,dA$, $D = [-1, 2] \times [0, 1]$.

Solution (a) It does not matter which variable we integrate with respect to first; we will get the same answer regardless of the order of integration. To justify that, let us work this one in each order to make sure that we do get the same answer.

Solution 1. In this case we integrate with respect to $y$ first. So, the iterated integral that we need to compute is
\[ \iint_D 6xy^2\,dA = \int_2^4 \int_1^2 6xy^2\,dy\,dx. \]
When setting these up, make sure the limits match up with the differentials. Since $dy$ is the inner differential (i.e., we are integrating with respect to $y$ first), the inner integral needs to have $y$ limits. To compute this we do the inner integral first, and we typically keep the outer integral around as follows:
\[ \iint_D 6xy^2\,dA = \int_2^4 \Big[ 2xy^3 \Big]_1^2\,dx = \int_2^4 (16x - 2x)\,dx = \int_2^4 14x\,dx. \]
Remember that we treat $x$ as a constant when doing the first integral, and we do not do any integration with it yet. Now we have a normal single integral, so let us finish by computing it:
\[ \iint_D 6xy^2\,dA = \Big[ 7x^2 \Big]_2^4 = 84. \]

Solution 2. In this case we integrate with respect to $x$ first and then $y$. Here is the computation:
\[ \iint_D 6xy^2\,dA = \int_1^2 \int_2^4 6xy^2\,dx\,dy = \int_1^2 \Big[ 3x^2 y^2 \Big]_2^4\,dy = \int_1^2 36y^2\,dy = \Big[ 12y^3 \Big]_1^2 = 84. \]
Sure enough, this is the same answer as in the first solution. So, remember that we can do the integration in either order.

(b) For this integral we integrate with respect to $y$ first.
\begin{align*}
\iint_D (2x - 4y^3)\,dA &= \int_{-5}^{4} \int_0^3 (2x - 4y^3)\,dy\,dx = \int_{-5}^{4} \Big[ 2xy - y^4 \Big]_0^3\,dx \\
&= \int_{-5}^{4} (6x - 81)\,dx = \Big[ 3x^2 - 81x \Big]_{-5}^{4} = -756.
\end{align*}
Remember that when integrating with respect to $y$ all $x$'s are treated as constants, so as far as the inner integral is concerned the $2x$ is a constant. When we integrate a constant with respect to $y$ we just tack on a $y$, so we get $2xy$ from the first term.

(c) In this case we integrate with respect to $x$ first.
\begin{align*}
\iint_D \left( x^2 y^2 + \cos(\pi x) + \sin(\pi y) \right) dA &= \int_0^1 \int_{-2}^{-1} \left( x^2 y^2 + \cos(\pi x) + \sin(\pi y) \right) dx\,dy \\
&= \int_0^1 \Big[ \tfrac{1}{3} x^3 y^2 + \tfrac{1}{\pi} \sin(\pi x) + x \sin(\pi y) \Big]_{-2}^{-1}\,dy \\
&= \int_0^1 \left( \tfrac{7}{3} y^2 + \sin(\pi y) \right) dy = \Big[ \tfrac{7}{9} y^3 - \tfrac{1}{\pi} \cos(\pi y) \Big]_0^1 \\
&= \frac{7}{9} + \frac{2}{\pi}.
\end{align*}

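(d) In this case, because the limits for $x$ are kind of nice (i.e., they are zero and one, which are often nice for evaluation), let us integrate with respect to $x$ first. We also rewrite the integrand to help with the first integration.
\begin{align*}
\iint_D \frac{1}{(2x + 3y)^2}\,dA &= \int_1^2 \int_0^1 (2x + 3y)^{-2}\,dx\,dy \\
&= \int_1^2 \Big[ -\tfrac{1}{2} (2x + 3y)^{-1} \Big]_0^1\,dy \\
&= -\frac{1}{2} \int_1^2 \left( \frac{1}{2 + 3y} - \frac{1}{3y} \right) dy \\
&= -\frac{1}{2} \Big[ \tfrac{1}{3} \ln|2 + 3y| - \tfrac{1}{3} \ln|y| \Big]_1^2 \\
&= -\frac{1}{6} \left( \ln 8 - \ln 2 - \ln 5 \right).
\end{align*}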

(e) Now, while we can technically integrate with respect to either variable first, sometimes one way is significantly easier than the other. In this case it is significantly easier to integrate with respect to $y$ first, as we will see.
\[ \iint_D x e^{xy}\,dA = \int_{-1}^{2} \int_0^1 x e^{xy}\,dy\,dx. \]
The $y$ integration can be done with the quick substitution $u = xy$, $du = x\,dy$, which gives
\begin{align*}
\iint_D x e^{xy}\,dA &= \int_{-1}^{2} \Big[ e^{xy} \Big]_0^1\,dx = \int_{-1}^{2} (e^x - 1)\,dx = \Big[ e^x - x \Big]_{-1}^{2} \\
&= e^2 - 2 - (e^{-1} + 1) = e^2 - e^{-1} - 3.
\end{align*}
So, this is not too bad an integral, provided you spot the substitution. Now let us see what would happen if we had integrated with respect to $x$ first:
\[ \iint_D x e^{xy}\,dA = \int_0^1 \int_{-1}^{2} x e^{xy}\,dx\,dy. \]
In order to do this we would have to use integration by parts. We are not going to continue here, as the computation is indeed troublesome. (However, students are encouraged to work this out for review purposes, as integration by parts is a very important technique in single variable integral calculus. Do not forget the basic integration techniques.) □

As we saw in the previous set of examples, we can do the integral in either direction. However, sometimes one direction of integration is significantly easier than the other, so make sure that you think about which one you should do first before actually doing the integral. The next topic of this section is a quick fact that can be used to make some iterated integrals somewhat easier to compute on occasion.
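That both orders of integration give the same value can also be checked numerically. The Python sketch below is an illustration only, not part of the original notes: the helper `integrate` (a composite midpoint rule with an assumed 400 subintervals) is a hypothetical name, and the integrand $x e^{xy}$ on $[-1, 2] \times [0, 1]$ is taken from part (e) of Example 4.1.1.

```python
import math


def integrate(g, lo, hi, n=400):
    """Composite midpoint rule for a single integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def f(x, y):
    return x * math.exp(x * y)


a, b, c, d = -1.0, 2.0, 0.0, 1.0

# Inner integral in y (x held fixed), then outer integral in x.
dy_dx = integrate(lambda x: integrate(lambda y: f(x, y), c, d), a, b)
# Inner integral in x (y held fixed), then outer integral in y.
dx_dy = integrate(lambda y: integrate(lambda x: f(x, y), a, b), c, d)

print(dy_dx, dx_dy)  # the two orders agree; compare with e^2 - e^(-1) - 3
```

Numerically neither order is harder than the other; the difference in difficulty discussed above only concerns finding antiderivatives by hand.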

Theorem 4.1.2 If $f(x, y) = g(x)\,h(y)$ and we are integrating over the rectangle $D = [a, b] \times [c, d]$, then
\[ \iint_D f(x, y)\,dA = \iint_D g(x)\,h(y)\,dA = \left( \int_a^b g(x)\,dx \right) \left( \int_c^d h(y)\,dy \right). \]

So if we can break up the function into a function only of $x$ times a function only of $y$, then we can do the two integrals individually and multiply them together. Let us do a quick example using this fact.

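Example 4.1.2 (Integral as a product) Evaluate
\[ \iint_D x \cos^2 y\,dA, \qquad D = [-2, 3] \times \left[ 0, \tfrac{\pi}{2} \right]. \]

Solution Since the integrand is a function of $x$ times a function of $y$, we can use the fact above.
\begin{align*}
\iint_D x \cos^2 y\,dA &= \left( \int_{-2}^{3} x\,dx \right) \left( \int_0^{\pi/2} \cos^2 y\,dy \right) \\
&= \left( \int_{-2}^{3} x\,dx \right) \left( \int_0^{\pi/2} \frac{1 + \cos(2y)}{2}\,dy \right) \\
&= \Big[ \tfrac{1}{2} x^2 \Big]_{-2}^{3} \Big[ \tfrac{1}{2} y + \tfrac{1}{4} \sin(2y) \Big]_0^{\pi/2} = \frac{5}{2} \cdot \frac{\pi}{4} = \frac{5\pi}{8}.
\end{align*}
□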
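The product formula can be sanity-checked numerically. The following Python sketch is illustrative only: `integrate` is a hypothetical midpoint-rule helper with an assumed step count, and the limits $-2 \le x \le 3$, $0 \le y \le \pi/2$ are those of Example 4.1.2 as read here.

```python
import math


def integrate(g, lo, hi, n=2000):
    """Composite midpoint rule for a single integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


# Integral of g(x) = x over [-2, 3].
x_part = integrate(lambda x: x, -2.0, 3.0)
# Integral of h(y) = cos^2(y) over [0, pi/2].
y_part = integrate(lambda y: math.cos(y) ** 2, 0.0, math.pi / 2)

product = x_part * y_part
print(product, 5 * math.pi / 8)  # the two numbers should agree closely
```

Only two single integrals are evaluated here, which is the computational saving the theorem describes.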

We have one more topic to discuss in this section. It does not really have anything to do with iterated integrals, but this is as good a place as any to put it, and there are liable to be some questions about it at this point. What we want to do is discuss single indefinite integrals of a function of two variables. In other words we want to look at integrals like the following:
\[ \int (x \sec^2 y + 4xy)\,dy, \qquad \int (x^3 - e^{-x/y})\,dx. \]
From single variable calculus we know that these integrals ask what function we differentiate to get the integrand. However, in this case we need to pay attention to the differential ($dy$ or $dx$) in the integral, because that will change things a little. In the first integral we are asking what function we differentiate with respect to $y$ to get the integrand, whereas in the second integral we are asking what function we differentiate with respect to $x$ to get the integrand. For the most part answering these questions is not that difficult. The important issue is how we deal with the constant of integration. Here are the integrals:
\begin{align*}
\int (x \sec^2 y + 4xy)\,dy &= x \tan y + 2xy^2 + g(x), \\
\int (x^3 - e^{-x/y})\,dx &= \tfrac{1}{4} x^4 + y e^{-x/y} + h(y).
\end{align*}

Notice that the constants of integration are now functions of the opposite variable. If we differentiate the answer to the first integral with respect to $y$, any function involving only $x$ differentiates to zero; so when integrating with respect to $y$ we need to acknowledge that there may have been a function of $x$ alone in the primitive function, and the constant of integration is in general a function of $x$. Likewise, in the second integral, the constant of integration must be a function of $y$ since we are integrating with respect to $x$. Again, if we differentiate the answer with respect to $x$ then any function of $y$ alone differentiates to zero.

4.2 Double Integrals Over Non-rectangular Regions

In the previous section we looked at double integrals over rectangular regions. The problem with this is that in practice most regions are not rectangular, so we now need to look at the double integral
\[ \iint_R f(x, y)\,dA, \]
where $R$ is a general region. There are two types of regions that we need to look at. Here is a sketch of both of them.

[Figure: Type 1 region $R$ between the curves $y = g_1(x)$ and $y = g_2(x)$ for $a \le x \le b$; Type 2 region $R$ between the curves $x = h_1(y)$ and $x = h_2(y)$ for $c \le y \le d$.]

We will often use set builder notation to describe these regions. Here is the definition of a region of Type 1:
\[ R = \{ (x, y) : a \le x \le b,\ g_1(x) \le y \le g_2(x) \}, \]
and here is the definition of a region of Type 2:
\[ R = \{ (x, y) : h_1(y) \le x \le h_2(y),\ c \le y \le d \}. \]

This notation is really just a fancy way of saying that we use all the points $(x, y)$ whose coordinates satisfy both of the given inequalities. The double integral in each of these cases is defined in terms of iterated integrals as follows. In Type 1, where $R = \{ (x, y) : a \le x \le b,\ g_1(x) \le y \le g_2(x) \}$, the integral is defined to be
\[ \iint_R f(x, y)\,dA = \int_a^b \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy\,dx. \]
In Type 2, where $R = \{ (x, y) : h_1(y) \le x \le h_2(y),\ c \le y \le d \}$, the integral is defined to be
\[ \iint_R f(x, y)\,dA = \int_c^d \int_{h_1(y)}^{h_2(y)} f(x, y)\,dx\,dy. \]
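The two definitions translate directly into nested single integrals. The Python sketch below is an illustration under stated assumptions, not the notes' own method: the helper names `integrate`, `integrate_type1` and `integrate_type2` are hypothetical, a midpoint rule with 300 subintervals stands in for exact integration, and the test region (between $y = x^2$ and $y = x$, with integrand $xy$) is chosen here only because it can be described in both ways.

```python
import math


def integrate(g, lo, hi, n=300):
    """Composite midpoint rule for a single integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def integrate_type1(f, a, b, g1, g2):
    """Integral over {a <= x <= b, g1(x) <= y <= g2(x)}: dy inside, dx outside."""
    return integrate(lambda x: integrate(lambda y: f(x, y), g1(x), g2(x)), a, b)


def integrate_type2(f, c, d, h1, h2):
    """Integral over {c <= y <= d, h1(y) <= x <= h2(y)}: dx inside, dy outside."""
    return integrate(lambda y: integrate(lambda x: f(x, y), h1(y), h2(y)), c, d)


# Illustrative region (an assumption): x^2 <= y <= x for 0 <= x <= 1,
# equivalently y <= x <= sqrt(y) for 0 <= y <= 1. Integrand f(x, y) = x * y.
as_type1 = integrate_type1(lambda x, y: x * y, 0.0, 1.0, lambda x: x * x, lambda x: x)
as_type2 = integrate_type2(lambda x, y: x * y, 0.0, 1.0, lambda y: y, math.sqrt)
print(as_type1, as_type2)  # both approximate 1/24
```

Note that the inner limits are functions of the outer variable, while the outer limits are constants, exactly as in the two formulas above.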


The following are some properties of the double integral that we should go over before we actually do some examples. Note that all three of these properties are really just extensions of properties of single integrals.

Properties.

(1) $\iint_R [f(x, y) + g(x, y)]\,dA = \iint_R f(x, y)\,dA + \iint_R g(x, y)\,dA$.

(2) $\iint_R c f(x, y)\,dA = c \iint_R f(x, y)\,dA$, where $c$ is any constant.

(3) If the region $R$ can be split into two separate regions $R_1$ and $R_2$, then the integral can be written as
\[ \iint_R f(x, y)\,dA = \iint_{R_1} f(x, y)\,dA + \iint_{R_2} f(x, y)\,dA. \]

Let us take a look at some examples of double integrals over non-rectangular regions.

Example 4.2.1 (Double integrals over non-rectangular regions) Evaluate each of the following double integrals over the given region $R$.

(a) $\iint_R e^{x/y}\,dA$, $R = \{ (x, y) : 1 \le y \le 2,\ y \le x \le y^3 \}$.

(b) $\iint_R (4xy - y^3)\,dA$, $R$ is the region bounded by $y = \sqrt{x}$ and $y = x^3$.

(c) $\iint_R (6x^2 - 40y)\,dA$, $R$ is the triangle with vertices $(0, 3)$, $(1, 1)$, and $(5, 3)$.

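Solution (a) Let us do this one by just using the formula.
\begin{align*}
\iint_R e^{x/y}\,dA &= \int_1^2 \int_y^{y^3} e^{x/y}\,dx\,dy = \int_1^2 \Big[ y e^{x/y} \Big]_y^{y^3}\,dy \\
&= \int_1^2 \left( y e^{y^2} - y e^{1} \right) dy = \Big[ \tfrac{1}{2} e^{y^2} - \tfrac{1}{2} y^2 e^{1} \Big]_1^2 = \frac{1}{2} e^4 - 2e.
\end{align*}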

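(b) In this case we need to determine the two inequalities for $x$ and $y$ that we need in order to evaluate the integral. The best way to do this is to first sketch the graphs of the two curves. Here is the sketch.

[Figure: the region $R$ between $y = \sqrt{x}$ (upper curve) and $y = x^3$ (lower curve) for $0 \le x \le 1$.]

So, from the sketch we can see that the inequalities are
\[ 0 \le x \le 1 \quad \text{and} \quad x^3 \le y \le \sqrt{x}. \]
We can now do the integral:
\begin{align*}
\iint_R (4xy - y^3)\,dA &= \int_0^1 \int_{x^3}^{\sqrt{x}} (4xy - y^3)\,dy\,dx \\
&= \int_0^1 \Big[ 2xy^2 - \tfrac{1}{4} y^4 \Big]_{x^3}^{\sqrt{x}}\,dx \\
&= \int_0^1 \left( \tfrac{7}{4} x^2 - 2x^7 + \tfrac{1}{4} x^{12} \right) dx \\
&= \Big[ \tfrac{7}{12} x^3 - \tfrac{1}{4} x^8 + \tfrac{1}{52} x^{13} \Big]_0^1 = \frac{55}{156}.
\end{align*}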
(c) We have even less information about the region in this case. Let us start by sketching the triangle and getting equations for each side of the triangle.

[Figure: the triangle with vertices $(0, 3)$, $(1, 1)$, $(5, 3)$. Left: sides written as functions of $x$, namely $y = 3$, $y = -2x + 3$ and $y = \tfrac{1}{2} x + \tfrac{1}{2}$. Right: sides written as functions of $y$, namely $x = -\tfrac{1}{2} y + \tfrac{3}{2}$ and $x = 2y - 1$.]

Now, there are two ways to describe this region (i.e., as Type 1 or as Type 2). If we use functions of $x$, as shown in the figure (left), we have to break the region up into two different pieces, since the lower function is different depending upon the value of $x$. In this case the region is given by $R = R_1 \cup R_2$, where
\begin{align*}
R_1 &= \{ (x, y) : 0 \le x \le 1,\ -2x + 3 \le y \le 3 \}, \\
R_2 &= \{ (x, y) : 1 \le x \le 5,\ \tfrac{1}{2} x + \tfrac{1}{2} \le y \le 3 \}.
\end{align*}

Note that $\cup$ is the union notation; it just means that $R$ is the region obtained by combining the two regions. If we do this then we need to do two separate integrals, one for each of the regions. To avoid this we could turn things around and solve the two equations for $x$ to get
\[ y = -2x + 3 \implies x = -\tfrac{1}{2} y + \tfrac{3}{2}, \qquad y = \tfrac{1}{2} x + \tfrac{1}{2} \implies x = 2y - 1. \]
If we do this we notice that the same function is always on the right and the same function is always on the left, as shown in the figure above (right). The region is
\[ R = \{ (x, y) : -\tfrac{1}{2} y + \tfrac{3}{2} \le x \le 2y - 1,\ 1 \le y \le 3 \}. \]
Writing the region in this form means doing a single integral instead of two. Either way should give the same answer; let us compare the two ways.

Solution 1.
\begin{align*}
\iint_R (6x^2 - 40y)\,dA &= \iint_{R_1} (6x^2 - 40y)\,dA + \iint_{R_2} (6x^2 - 40y)\,dA \\
&= \int_0^1 \int_{-2x+3}^{3} (6x^2 - 40y)\,dy\,dx + \int_1^5 \int_{\frac{1}{2}x + \frac{1}{2}}^{3} (6x^2 - 40y)\,dy\,dx \\
&= \int_0^1 \Big[ 6x^2 y - 20y^2 \Big]_{-2x+3}^{3}\,dx + \int_1^5 \Big[ 6x^2 y - 20y^2 \Big]_{\frac{1}{2}x + \frac{1}{2}}^{3}\,dx \\
&= \int_0^1 \left( 12x^3 + 80x^2 - 240x \right) dx + \int_1^5 \left( -3x^3 + 20x^2 + 10x - 175 \right) dx \\
&= \Big[ 3x^4 + \tfrac{80}{3} x^3 - 120x^2 \Big]_0^1 + \Big[ -\tfrac{3}{4} x^4 + \tfrac{20}{3} x^3 + 5x^2 - 175x \Big]_1^5 \\
&= 3 + \tfrac{80}{3} - 120 - \tfrac{1875}{4} + \tfrac{2500}{3} + 125 - 875 + \tfrac{3}{4} - \tfrac{20}{3} - 5 + 175 \\
&= -\frac{935}{3}.
\end{align*}
The above is a lot of work. Let us see the second way of evaluating the same integral.

Solution 2. This way is a lot less work since we only do one integral.
\begin{align*}
\iint_R (6x^2 - 40y)\,dA &= \int_1^3 \int_{-\frac{1}{2}y + \frac{3}{2}}^{2y - 1} (6x^2 - 40y)\,dx\,dy \\
&= \int_1^3 \Big[ 2x^3 - 40xy \Big]_{-\frac{1}{2}y + \frac{3}{2}}^{2y - 1}\,dy \\
&= \int_1^3 \left( \tfrac{65}{4} y^3 - \tfrac{505}{4} y^2 + \tfrac{475}{4} y - \tfrac{35}{4} \right) dy \\
&= \Big[ \tfrac{65}{16} y^4 - \tfrac{505}{12} y^3 + \tfrac{475}{8} y^2 - \tfrac{35}{4} y \Big]_1^3 \\
&= -\frac{935}{3}.
\end{align*}

So, the numbers are a little messier, but other than that there is much less work for the same result. □
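The arithmetic in both solutions of part (c) is easy to get wrong by hand, so an exact check is useful. The Python sketch below is only an illustration: the helper `poly_integral` and its dictionary encoding of polynomials (power mapped to coefficient) are assumptions made here, and the coefficients are those of the single-variable integrands obtained above after the inner integration.

```python
from fractions import Fraction as F


def poly_integral(coeffs, lo, hi):
    """Exact integral over [lo, hi] of sum(c * t**k); coeffs maps power k -> coefficient c."""
    def antiderivative(t):
        return sum(c * F(t) ** (k + 1) / (k + 1) for k, c in coeffs.items())
    return antiderivative(hi) - antiderivative(lo)


# Solution 1: the two x-integrands left after the inner y-integration.
piece_1 = poly_integral({3: F(12), 2: F(80), 1: F(-240)}, 0, 1)
piece_2 = poly_integral({3: F(-3), 2: F(20), 1: F(10), 0: F(-175)}, 1, 5)

# Solution 2: the single y-integrand left after the inner x-integration.
single = poly_integral({3: F(65, 4), 2: F(-505, 4), 1: F(475, 4), 0: F(-35, 4)}, 1, 3)

print(piece_1 + piece_2, single)  # both print -935/3
```

Using exact rational arithmetic avoids any rounding, so agreement of the two values confirms that the Type 1 and Type 2 descriptions give the same integral.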

As part (c) of Example 4.2.1 has shown, we can evaluate these integrals in either order (i.e., $x$ followed by $y$ or $y$ followed by $x$), although often one order will be easier than the other. In fact there will be times when it is not even possible to do the integral in one order while it is possible to do it in the other order.

Let us see a couple examples of these kinds of integrals.

Example 4.2.2 (Change the order of a double integral) Evaluate the following integrals by first reversing the order of integration.

(a) $\displaystyle \int_0^3 \int_{x^2}^{9} x^3 e^{y^3}\,dy\,dx$.

(b) $\displaystyle \int_0^8 \int_{\sqrt[3]{y}}^{2} \sqrt{x^4 + 1}\,dx\,dy$.

Solution (a) First, notice that if we try to integrate with respect to $y$ we cannot do the integral, because we would need a $y^2$ in front of the exponential in order to do the $y$ integration. Instead, we hope that if we reverse the order of integration we will get an integral that we can do. Now, when we say that we are going to reverse the order of integration, this means that we want to integrate with respect to $x$ first and then $y$. Note as well that we cannot just interchange the integrals, keeping the original limits, and be done with it. This would not fix our original problem, and in order to integrate with respect to $x$ we cannot have $x$'s in the limits of the integrals. Even if we ignored that, the answer would not be a constant as it should be. So, let us see how we reverse the order of integration. The best way to reverse the order of integration is to first sketch the region given by the original limits of integration. From the integral we see that the inequalities that define this region are
\[ 0 \le x \le 3 \quad \text{and} \quad x^2 \le y \le 9. \]
These inequalities tell us that we want the region with $y = x^2$ as the lower boundary and $y = 9$ as the upper boundary that lies between $x = 0$ and $x = 3$. Here is a sketch of that region (left).

[Figure: the region bounded by $x = 0$, $y = x^2$ and $y = 9$ for $0 \le x \le 3$. Left: described with $y$ between $y = x^2$ and $y = 9$. Right: described with $x$ between $x = 0$ and $x = \sqrt{y}$.]

Since we want to integrate with respect to $x$ first, we need to determine the limits on $x$ (probably in terms of $y$) and then the limits on $y$. Here are the inequalities:
\[ 0 \le y \le 9 \quad \text{and} \quad 0 \le x \le \sqrt{y}. \]
Any horizontal line drawn in this region starts at $x = 0$ and ends at $x = \sqrt{y}$, so these are the limits on $x$, and the range of $y$ for the region is from 0 to 9. The sketch of the region is given above (right). Now the integral, with the order reversed, is given by
\[ \int_0^3 \int_{x^2}^{9} x^3 e^{y^3}\,dy\,dx = \int_0^9 \int_0^{\sqrt{y}} x^3 e^{y^3}\,dx\,dy, \]
and notice that we can do the first integration with this order. We also hope that this gives us a second integral that we can do. Here is the work for this integral.
\begin{align*}
\int_0^3 \int_{x^2}^{9} x^3 e^{y^3}\,dy\,dx &= \int_0^9 \int_0^{\sqrt{y}} x^3 e^{y^3}\,dx\,dy \\
&= \int_0^9 \Big[ \tfrac{1}{4} x^4 e^{y^3} \Big]_0^{\sqrt{y}}\,dy \\
&= \int_0^9 \tfrac{1}{4} y^2 e^{y^3}\,dy \\
&= \Big[ \tfrac{1}{12} e^{y^3} \Big]_0^9 = \frac{1}{12} \left( e^{729} - 1 \right).
\end{align*}
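(b) As with the first integral, we cannot do this integral by integrating with respect to $x$ first, so we hope that reversing the order of integration gives something that we can integrate. Here are the limits for the variables that we get from this integral:
\[ 0 \le y \le 8 \quad \text{and} \quad \sqrt[3]{y} \le x \le 2. \]
Here is a sketch of this region (left).

[Figure: the region bounded by $y = x^3$ (equivalently $x = \sqrt[3]{y}$), $x = 2$ and $y = 0$. Left: described with $x$ between $x = \sqrt[3]{y}$ and $x = 2$. Right: described with $y$ between $y = 0$ and $y = x^3$.]

So, if we reverse the order of integration we get the following limits (sketch above on the right):
\[ 0 \le x \le 2 \quad \text{and} \quad 0 \le y \le x^3. \]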

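The integral is then
\begin{align*}
\int_0^8 \int_{\sqrt[3]{y}}^{2} \sqrt{x^4 + 1}\,dx\,dy &= \int_0^2 \int_0^{x^3} \sqrt{x^4 + 1}\,dy\,dx \\
&= \int_0^2 \Big[ y \sqrt{x^4 + 1} \Big]_0^{x^3}\,dx \\
&= \int_0^2 x^3 \sqrt{x^4 + 1}\,dx \\
&= \frac{1}{4} \int_0^2 (x^4 + 1)^{1/2}\,d(x^4 + 1) \\
&= \frac{1}{4} \left[ \frac{(x^4 + 1)^{3/2}}{3/2} \right]_0^2 = \frac{1}{6} \left( 17^{3/2} - 1 \right).
\end{align*}
□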
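Reversing the order changes which integral is tractable by hand, but not the value. The Python sketch below, offered as an illustration rather than as part of the notes, evaluates part (b) numerically in both orders; the helper `integrate` (a midpoint rule with an assumed 400 subintervals) is a hypothetical name, and the closed form $(17^{3/2} - 1)/6$ is the result derived above.

```python
import math


def integrate(g, lo, hi, n=400):
    """Composite midpoint rule for a single integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def f(x, y):
    return math.sqrt(x**4 + 1)


# Original order: x from cbrt(y) to 2 (inner), y from 0 to 8 (outer).
original = integrate(lambda y: integrate(lambda x: f(x, y), y ** (1 / 3), 2.0), 0.0, 8.0)
# Reversed order: y from 0 to x^3 (inner), x from 0 to 2 (outer).
reversed_order = integrate(lambda x: integrate(lambda y: f(x, y), 0.0, x**3), 0.0, 2.0)

exact = (17 ** 1.5 - 1) / 6
print(original, reversed_order, exact)  # three nearly equal numbers
```

The original order converges a little more slowly here because the lower limit $\sqrt[3]{y}$ has an unbounded slope at $y = 0$; this is a property of the numerical rule, not of the integral itself.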

The final topic of this section is two geometric interpretations of a double integral. The first interpretation is an extension of the idea that we used to develop the double integral in Section 4.1 of this chapter. We did this by looking at the volume of the solid that lies below the surface $z = f(x, y)$ and over the rectangle $D$ in the $xy$-plane. This idea can be extended to more general regions. The volume of the solid that lies below the surface given by $z = f(x, y)$ and above any non-rectangular region $R$ in the $xy$-plane is
\[ \text{volume} = \iint_R f(x, y)\,dA. \]
Let us look at a couple of examples.

Example 4.2.3 (Volume as a double integral) Find the volume of the solid that lies below the surface given by $z = 16xy + 200$ and lies above the region in the $xy$-plane bounded by $y = x^2$ and $y = 8 - x^2$.

Solution The following is the graph of the given region, say $R$, in the $xy$-plane.

[Figure: the region $R$ between the parabolas $y = x^2$ and $y = 8 - x^2$ for $-2 \le x \le 2$.]

By setting the two bounding equations equal we see that they intersect at $x = -2$ and $x = 2$. So, the inequalities that define the region $R$ in the $xy$-plane are
\[ -2 \le x \le 2 \quad \text{and} \quad x^2 \le y \le 8 - x^2. \]
The volume is then given by
\begin{align*}
\iint_R (16xy + 200)\,dA &= \int_{-2}^{2} \int_{x^2}^{8 - x^2} (16xy + 200)\,dy\,dx \\
&= \int_{-2}^{2} \Big[ 8xy^2 + 200y \Big]_{x^2}^{8 - x^2}\,dx \\
&= \int_{-2}^{2} \left( -128x^3 - 400x^2 + 512x + 1600 \right) dx \\
&= \Big[ -32x^4 - \tfrac{400}{3} x^3 + 256x^2 + 1600x \Big]_{-2}^{2} \\
&= 2 \left( -\tfrac{400}{3} (2)^3 + 1600(2) \right) = \frac{12800}{3}.
\end{align*}
□

Example 4.2.4 (Volume as a double integral) Find the volume of the solid enclosed by the planes z + 4x + 2y = 10, y = 3x, z = 0, x = 0.

Solution This example is a little different from the previous one. Here the region $R$ is not explicitly given, so we have to find it. First, notice that the last two planes are really telling us that we do not go past the $xy$-plane and the $yz$-plane when we reach them. The first plane, $z + 4x + 2y = 10$, is the top of the volume, so we are really looking for the volume under $z = 10 - 4x - 2y$ and above the region $R$ in the $xy$-plane. The second plane, $y = 3x$, gives one of the sides of the volume, as shown below. The region $R$ is the region in the $xy$-plane (i.e., $z = 0$) that is bounded by $y = 3x$, $x = 0$, and the line where $z + 4x + 2y = 10$ intersects the $xy$-plane. We can determine where $z + 4x + 2y = 10$ intersects the $xy$-plane by plugging $z = 0$ into it:
\[ 0 + 4x + 2y = 10 \implies 2x + y = 5 \implies y = -2x + 5. \]

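So, the following is a sketch of the solid and of the region in the $xy$-plane. Note that the region $R$ is a little out of scale.

[Figure: left, the solid under the plane $z + 4x + 2y = 10$ with the side $y = 3x$; right, the region $R$ in the $xy$-plane bounded by $x = 0$, $y = 3x$ and $y = -2x + 5$.]

The region $R$ is really where this solid sits on the $xy$-plane, and the region is defined by the inequalities
\[ 0 \le x \le 1 \quad \text{and} \quad 3x \le y \le -2x + 5. \]
The volume is then given by
\begin{align*}
\text{volume} &= \iint_R (10 - 4x - 2y)\,dA \\
&= \int_0^1 \int_{3x}^{-2x + 5} (10 - 4x - 2y)\,dy\,dx \\
&= \int_0^1 \Big[ 10y - 4xy - y^2 \Big]_{3x}^{-2x + 5}\,dx \\
&= \int_0^1 \left( 25x^2 - 50x + 25 \right) dx = \Big[ \tfrac{25}{3} x^3 - 25x^2 + 25x \Big]_0^1 = \frac{25}{3}.
\end{align*}
□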

The second geometric interpretation of a double integral is the following.

Theorem 4.2.1 (Area as a double integral)
\[ \text{Area of region } R = \iint_R 1\,dA, \]
in which the integrand is the constant function 1.

It is easy to see why this is true in general. Let us suppose that we want to find the area of the region shown below.

[Figure: the region $R$ between the curves $y = g_1(x)$ and $y = g_2(x)$ for $a \le x \le b$.]

From single variable calculus we know that this area can be found by the integral
\[ \text{area of the region } R = \int_a^b (\text{upper curve} - \text{lower curve})\,dx = \int_a^b [g_2(x) - g_1(x)]\,dx. \]
Or, in terms of a double integral, we have
\begin{align*}
\text{area of the region } R &= \iint_R 1\,dA = \int_a^b \int_{g_1(x)}^{g_2(x)} 1\,dy\,dx \\
&= \int_a^b \Big[ y \Big]_{g_1(x)}^{g_2(x)}\,dx = \int_a^b [g_2(x) - g_1(x)]\,dx.
\end{align*}

This is exactly the same formula we have in single variable calculus.

4.3 Double Integrals in Polar Coordinates

To this point we have seen quite a few double integrals. However, in every case the region $R$ could be easily described in terms of simple functions in Cartesian coordinates. In this section we want to look at some regions that are much easier to describe in terms of polar coordinates. For instance, we might have a region that is a disk, a ring, or a portion of a disk or ring. In these cases using Cartesian coordinates could be somewhat cumbersome. For instance, let us suppose we have the following integral:
\[ \iint_R f(x, y)\,dA, \qquad R \text{ is the disk of radius 2.} \]
To evaluate this we have to determine a set of inequalities for $x$ and $y$ that describe this region. These would be
\[ -2 \le x \le 2 \quad \text{and} \quad -\sqrt{4 - x^2} \le y \le \sqrt{4 - x^2}, \]
and then the integral would become
\[ \iint_R f(x, y)\,dA = \int_{-2}^{2} \int_{-\sqrt{4 - x^2}}^{\sqrt{4 - x^2}} f(x, y)\,dy\,dx. \]

Due to the limits on the inner integral this is liable to be an unpleasant integral to compute. However, a disk of radius 2 can be described easily in polar coordinates by the inequalities
\[ 0 \le \theta \le 2\pi \quad \text{and} \quad 0 \le r \le 2. \]

These are very simple limits and, in fact, are constant limits of integration, which always makes integrals somewhat easier. So, if we could convert our double integral formula into one involving polar coordinates we would be in pretty good shape. The problem is that we cannot just convert the $dx$ and the $dy$ into a $dr$ and a $d\theta$. In computing double integrals to this point we have been using the fact that $dA = dx\,dy$, and this really does require Cartesian coordinates. In fact it can be shown that, in terms of polar coordinates, $dA$ can be written as
\[ dA = r\,dr\,d\theta. \]
Note the addition of the $r$ in the formula. That is important. Without it the answer will be wrong. We now need to find a formula for the double integral when we use polar coordinates to define the region $R$. First, let us get a sketch of a sample region as well as the area differential (area element).

[Figure: a sample polar region bounded by the rays $\theta = \theta_1$ and $\theta = \theta_2$ and the curves $r = r_1(\theta)$ and $r = r_2(\theta)$, with a small area element $\Delta A$ between radii $r$ and $r + \Delta r$ subtending the angle $\Delta\theta$; area differential $dA \approx \Delta A$.]

So, our general region is defined by the inequalities
\[ \theta_1 \le \theta \le \theta_2 \quad \text{and} \quad r_1(\theta) \le r \le r_2(\theta). \]

Now, if we are going to integrate in polar coordinates we have to make sure that we have also converted all the $x$'s and $y$'s into polar coordinates. To do this we need to remember the following conversion formulas:
\[ x = r \cos\theta, \qquad y = r \sin\theta, \qquad x^2 + y^2 = r^2. \]

Also, as we mentioned, the area differential can be written as $dA = r\,dr\,d\theta$. The element $\Delta A$ is the difference of two circular sectors of angle $\Delta\theta$, with radii $r + \Delta r$ and $r$, which gives
\[ dA \approx \Delta A = \tfrac{1}{2} (r + \Delta r)^2\,\Delta\theta - \tfrac{1}{2} r^2\,\Delta\theta = r\,\Delta r\,\Delta\theta + \tfrac{1}{2} (\Delta r)^2\,\Delta\theta \approx r\,\Delta r\,\Delta\theta \approx r\,dr\,d\theta. \]

We are now ready to write down a formula for the double integral in terms of polar coordinates:
\[ \iint_R f(x, y)\,dA = \int_{\theta_1}^{\theta_2} \int_{r_1(\theta)}^{r_2(\theta)} f(r \cos\theta, r \sin\theta)\,r\,dr\,d\theta. \]

It is important not to forget the added $r$ in the differential, and also do not forget to convert the Cartesian coordinates (i.e., $x$, $y$) in the integrand to polar coordinates. Let us look at a couple of examples of these kinds of integrals.
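The polar formula can be written as a small routine. The following Python sketch is an illustration under stated assumptions, not a definitive implementation: `integrate` and `integrate_polar` are hypothetical helper names, a midpoint rule with 400 subintervals replaces exact integration, and the test case (integrand 1 over the disk of radius 2, whose area is $4\pi$) is chosen here for checking purposes.

```python
import math


def integrate(g, lo, hi, n=400):
    """Composite midpoint rule for a single integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def integrate_polar(f, theta1, theta2, r1, r2):
    """Integral of f(x, y) over theta1 <= theta <= theta2, r1(theta) <= r <= r2(theta)."""
    def inner(theta):
        # Convert (r, theta) to (x, y); the extra factor r comes from dA = r dr dtheta.
        return integrate(
            lambda r: f(r * math.cos(theta), r * math.sin(theta)) * r,
            r1(theta),
            r2(theta),
        )
    return integrate(inner, theta1, theta2)


# Illustrative check (an assumption): f = 1 over the disk of radius 2 gives its area.
area = integrate_polar(lambda x, y: 1.0, 0.0, 2 * math.pi, lambda t: 0.0, lambda t: 2.0)
print(area, 4 * math.pi)
```

Dropping the factor `* r` in `inner` would return $4\pi$ replaced by $2 \cdot 2\pi = 4\pi$ only by coincidence for other radii it would not match the area, which is one way to see that the added $r$ cannot be omitted in general.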

Example 4.3.1 (Double integrals in polar coordinates) Evaluate the following integrals by converting them into polar coordinates.

(a) $\iint_R 2xy\,dA$, $R$ is the portion of the region between the circles of radius 2 and radius 5 centered at the origin that lies in the first quadrant.

(b) $\iint_R e^{x^2 + y^2}\,dA$, $R$ is the unit disk centered at the origin.

Solution (a) First let us get $R$ in terms of polar coordinates. The circle of radius 2 is given by $r = 2$ and the circle of radius 5 is given by $r = 5$. We want the region between them, so we have the following inequality for $r$:
\[ 2 \le r \le 5. \]
Also, since we only want the portion that is in the first quadrant, we get the following range of $\theta$:
\[ 0 \le \theta \le \frac{\pi}{2}. \]
Now we can do the integral:
\[ \iint_R 2xy\,dA = \int_0^{\pi/2} \int_2^5 2 (r \cos\theta)(r \sin\theta)\,r\,dr\,d\theta. \]
Do not forget to do the conversions and to add in the extra $r$. Now, let us simplify and make use of the double angle formula for sine to make the integral a little easier.
\begin{align*}
\iint_R 2xy\,dA &= \int_0^{\pi/2} \int_2^5 r^3 \sin(2\theta)\,dr\,d\theta \\
&= \int_0^{\pi/2} \Big[ \tfrac{1}{4} r^4 \sin(2\theta) \Big]_2^5\,d\theta \\
&= \int_0^{\pi/2} \frac{609}{4} \sin(2\theta)\,d\theta \\
&= \Big[ -\tfrac{609}{8} \cos(2\theta) \Big]_0^{\pi/2} = \frac{609}{4}.
\end{align*}

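(b) In this case we cannot do the integral in terms of Cartesian coordinates. We will, however, be able to do it in polar coordinates. First, the region $R$ is defined by
\[ 0 \le \theta \le 2\pi \quad \text{and} \quad 0 \le r \le 1. \]
In terms of polar coordinates the integral is then
\[ \iint_R e^{x^2 + y^2}\,dA = \int_0^{2\pi} \int_0^1 e^{r^2}\,r\,dr\,d\theta. \]
Notice that the addition of the $r$ gives us an integral that we can evaluate. Here is how to do the integral.
\begin{align*}
\iint_R e^{x^2 + y^2}\,dA &= \int_0^{2\pi} \int_0^1 r e^{r^2}\,dr\,d\theta \\
&= \int_0^{2\pi} \int_0^1 \tfrac{1}{2} e^{r^2}\,d(r^2)\,d\theta \\
&= \int_0^{2\pi} \Big[ \tfrac{1}{2} e^{r^2} \Big]_0^1\,d\theta \\
&= \int_0^{2\pi} \tfrac{1}{2} (e - 1)\,d\theta \\
&= \pi (e - 1).
\end{align*}
□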

Do not forget that we still have the two geometric interpretations for these integrals as well. They include finding enclosed areas and volumes in polar coordinates. We use several examples in the following to illustrate the advantage of using polar coordinates; that is to say, occasionally it is technically easier to evaluate double integrals in polar coordinates.

Example 4.3.2 (Area in polar coordinates) Find the area of the region enclosed by the two circles
\[ x^2 + (y - 1)^2 = 1 \quad \text{and} \quad x^2 + (y - 2)^2 = 4. \]

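Solution For the first circle $x^2 + (y - 1)^2 = 1$, substituting $x = r \cos\theta$ and $y = r \sin\theta$ into the equation and simplifying, we have $r = 2 \sin\theta$. Similarly, the second circle is $r = 4 \sin\theta$. The enclosed region is shown as follows.

[Figure: the region $R$ between the circles $r = 2 \sin\theta$ and $r = 4 \sin\theta$.]

The values of $\theta$ are given by $0 \le \theta \le \pi$. Thus, the area of the region $R$ is
\begin{align*}
\text{area} &= \iint_R 1\,dA = \int_0^{\pi} \int_{2 \sin\theta}^{4 \sin\theta} 1\,r\,dr\,d\theta \\
&= \int_0^{\pi} \left[ \frac{r^2}{2} \right]_{2 \sin\theta}^{4 \sin\theta}\,d\theta \\
&= \int_0^{\pi} 6 \sin^2\theta\,d\theta = \int_0^{\pi} 3 (1 - \cos 2\theta)\,d\theta \\
&= \Big[ 3\theta - \tfrac{3}{2} \sin 2\theta \Big]_0^{\pi} = 3\pi.
\end{align*}
□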

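Example 4.3.3 (Area in polar coordinates) Find the area of the region enclosed by the two circles $x^2 + (y - 1)^2 = 1$ and $(x - 1)^2 + y^2 = 1$.

Solution In polar coordinates, the two circles are $r = 2\sin\theta$ and $r = 2\cos\theta$.

The region can be divided into two equal-sized non-overlapping regions. We just need to evaluate one of them and then multiply it by 2 to obtain the required area. The enclosed region $R$ is shown as follows.

[Figure: the region $R$ common to the circles $r = 2\sin\theta$ and $r = 2\cos\theta$.]

Solving for the intersection points between the two circles gives
\begin{align*}
2\sin\theta &= 2\cos\theta, \\
\tan\theta &= 1, \\
\theta &= \frac{\pi}{4}.
\end{align*}
Now, the area of the region $R$ is
\begin{align*}
\text{area} &= 2 \int_0^{\pi/4} \int_0^{2\sin\theta} r \, dr \, d\theta \\
&= 2 \int_0^{\pi/4} \left[ \frac{r^2}{2} \right]_0^{2\sin\theta} d\theta \\
&= 4 \int_0^{\pi/4} \sin^2\theta \, d\theta \\
&= 2 \int_0^{\pi/4} (1 - \cos 2\theta) \, d\theta \\
&= 2 \left[ \theta - \frac{1}{2} \sin 2\theta \right]_0^{\pi/4} \\
&= \frac{\pi}{2} - 1.
\end{align*}
$\Box$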

Example 4.3.4 (Area in polar coordinates) Determine the area of the region that lies inside $r = 3 + 2\sin\theta$ and outside $r = 2$.

Solution To determine this area we will need to know the value(s) of $\theta$ for which the two curves intersect. We can determine these points by setting the two equations equal and solving. Thus,
\[ 3 + 2\sin\theta = 2, \]
and hence
\[ \sin\theta = -\frac{1}{2} \quad \Longrightarrow \quad \theta = \frac{7\pi}{6}, \ \frac{11\pi}{6}. \]
Here is the sketch of the figure with the rays $\theta = 7\pi/6$ and $\theta = 11\pi/6$ shown.

[Figure: the region $R$ inside $r = 3 + 2\sin\theta$ and outside $r = 2$, with the rays $\theta = 7\pi/6$ and $\theta = 11\pi/6 = -\pi/6$.]

Note as well that we have acknowledged that $-\pi/6$ is another representation for the angle $11\pi/6$. This is important since we need the range of $\theta$ to actually enclose the region as we increase from the lower limit to the upper limit. If we chose to use $11\pi/6$, then as we increase from $7\pi/6$ to $11\pi/6$ we would be tracing out the lower portion of the circle, and that is not the region that we are looking for.


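So, here are the ranges that will define the region.
\[ -\frac{\pi}{6} \le \theta \le \frac{7\pi}{6} \quad \text{and} \quad 2 \le r \le 3 + 2\sin\theta. \]
To get the ranges for $r$, the function that is closest to the origin is the lower bound and the function that is farthest from the origin is the upper bound. The area of the region $R$ is then
\begin{align*}
\text{area} &= \int_{-\pi/6}^{7\pi/6} \int_2^{3 + 2\sin\theta} r \, dr \, d\theta \\
&= \int_{-\pi/6}^{7\pi/6} \left[ \frac{1}{2} r^2 \right]_2^{3 + 2\sin\theta} d\theta \\
&= \int_{-\pi/6}^{7\pi/6} \left( \frac{5}{2} + 6\sin\theta + 2\sin^2\theta \right) d\theta \\
&= \int_{-\pi/6}^{7\pi/6} \left( \frac{7}{2} + 6\sin\theta - \cos(2\theta) \right) d\theta \\
&= \left[ \frac{7}{2}\theta - 6\cos\theta - \frac{1}{2}\sin(2\theta) \right]_{-\pi/6}^{7\pi/6} \\
&= \frac{11\sqrt{3}}{2} + \frac{14\pi}{3} \approx 24.19.
\end{align*}
$\Box$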
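In Examples 4.3.2 to 4.3.4 the inner integral is always $\int r \, dr = r^2/2$, so each area reduces to a single integral in $\theta$ of $\frac{1}{2}(r_{\text{outer}}^2 - r_{\text{inner}}^2)$. Here is a small numerical sketch of that reduction; the function name `polar_area` and the midpoint rule are illustrative assumptions, not something taken from the notes.

```python
import math

def polar_area(r_inner, r_outer, theta_lo, theta_hi, n=20000):
    """Area between r = r_inner(theta) and r = r_outer(theta) for
    theta_lo <= theta <= theta_hi, computed as the midpoint-rule value of
    the integral of (r_outer^2 - r_inner^2) / 2 with respect to theta.
    (Hypothetical helper for illustration.)"""
    h = (theta_hi - theta_lo) / n
    total = 0.0
    for i in range(n):
        t = theta_lo + (i + 0.5) * h
        total += 0.5 * (r_outer(t) ** 2 - r_inner(t) ** 2) * h
    return total

if __name__ == "__main__":
    # Example 4.3.2: between r = 2 sin(theta) and r = 4 sin(theta)
    ex_432 = polar_area(lambda t: 2 * math.sin(t), lambda t: 4 * math.sin(t), 0.0, math.pi)
    # Example 4.3.3: twice the piece bounded by r = 2 sin(theta), 0 <= theta <= pi/4
    ex_433 = 2 * polar_area(lambda t: 0.0, lambda t: 2 * math.sin(t), 0.0, math.pi / 4)
    # Example 4.3.4: inside r = 3 + 2 sin(theta), outside r = 2
    ex_434 = polar_area(lambda t: 2.0, lambda t: 3 + 2 * math.sin(t),
                        -math.pi / 6, 7 * math.pi / 6)
    print(ex_432, "vs", 3 * math.pi)
    print(ex_433, "vs", math.pi / 2 - 1)
    print(ex_434, "vs", 11 * math.sqrt(3) / 2 + 14 * math.pi / 3)
```

Changing the limits in the last call to $7\pi/6$ and $11\pi/6$ makes the integrand negative, since $3 + 2\sin\theta < 2$ there, which matches the remark that this range does not enclose the required region.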

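Example 4.3.5 (Volume in polar coordinates) Find the volume of the sphere $x^2 + y^2 + z^2 = a^2$.

Solution First recall that the volume of the solid enclosed by two surfaces (top and bottom) is given by
\[ \text{volume} = \iint_R (z_{\text{top}} - z_{\text{bottom}}) \, dA. \]
It would be difficult to use rectangular coordinates to find the volume. In terms of polar coordinates,
\[ \text{volume} = \int_0^{2\pi} \int_0^a (z_{\text{top}} - z_{\text{bottom}}) \, r \, dr \, d\theta. \]
Solving the given equation for $z$ gives
\[ z^2 = a^2 - (x^2 + y^2) = a^2 - r^2 \quad \Longrightarrow \quad z = \pm\sqrt{a^2 - r^2}. \]
The volume of the solid is given by
\begin{align*}
\text{volume} &= \int_0^{2\pi} \int_0^a \left[ \sqrt{a^2 - r^2} - \left( -\sqrt{a^2 - r^2} \right) \right] r \, dr \, d\theta \\
&= 2 \int_0^{2\pi} \int_0^a \sqrt{a^2 - r^2} \, r \, dr \, d\theta \\
&= -\int_0^{2\pi} \int_0^a (a^2 - r^2)^{1/2} \, d(a^2 - r^2) \, d\theta \\
&= -\int_0^{2\pi} \left[ \frac{(a^2 - r^2)^{3/2}}{3/2} \right]_0^a d\theta \\
&= \frac{2}{3} \int_0^{2\pi} a^3 \, d\theta = \frac{2}{3} a^3 (2\pi) = \frac{4}{3} \pi a^3.
\end{align*}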


Example 4.3.6 (Volume in polar coordinates) Find the volume of the solid that lies under the sphere $x^2 + y^2 + z^2 = 9$, above the plane $z = 0$ and inside the cylinder $x^2 + y^2 = 5$.

Solution We know that the formula for finding the volume is given by
\[ \text{volume} = \iint_R (z_{\text{top}} - z_{\text{bottom}}) \, dA. \]
In order to make use of this formula we need to determine the integrand and the region $R$ that we are integrating over.

The integrand is not too complicated. The solid is bounded above by the top surface, which is a portion of the sphere (i.e., $z = \sqrt{9 - x^2 - y^2}$), and bounded below by the bottom surface, which is the $xy$-plane (i.e., $z = 0$). Note that we took the positive square root since we want the solid that lies above the $xy$-plane.
\[ z_{\text{top}} = \sqrt{9 - x^2 - y^2} \quad \text{and} \quad z_{\text{bottom}} = 0. \]
The region $R$ is not too difficult in this case either. As we take points $(x, y)$ from the region we need to completely graph the portion of the sphere that we are working with. Since we only want the portion of the sphere that actually lies inside the cylinder given by $x^2 + y^2 = 5$, this is also the region $R$. The region $R$ is the disk $x^2 + y^2 \le 5$ in the $xy$-plane. For reference purposes here is a sketch of the solid that we are trying to find the volume of.

[Figure: the solid, with the region $R$ (bottom of solid) in the $xy$-plane.]

So, the solid that we want the volume for is really a cylinder with a cap that comes from the sphere. We are going to do this integral in terms of polar coordinates, so here are the limits (in polar coordinates) for the region $R$:
\[ 0 \le \theta \le 2\pi \quad \text{and} \quad 0 \le r \le \sqrt{5}. \]
We will also need to convert the function to polar coordinates.
\[ z_{\text{top}} - z_{\text{bottom}} = \sqrt{9 - (x^2 + y^2)} - 0 = \sqrt{9 - r^2}. \]
The volume is therefore
\begin{align*}
\text{volume} &= \int_0^{2\pi} \int_0^{\sqrt{5}} \sqrt{9 - r^2} \, r \, dr \, d\theta \\
&= -\frac{1}{2} \int_0^{2\pi} \int_0^{\sqrt{5}} (9 - r^2)^{1/2} \, d(9 - r^2) \, d\theta \\
&= -\frac{1}{2} \int_0^{2\pi} \left[ \frac{(9 - r^2)^{3/2}}{3/2} \right]_0^{\sqrt{5}} d\theta \\
&= \frac{1}{3} \int_0^{2\pi} \left( 9^{3/2} - 4^{3/2} \right) d\theta \\
&= \frac{1}{3} \left( 3^3 - 2^3 \right) (2\pi) = \frac{38\pi}{3}.
\end{align*}
$\Box$

Example 4.3.7 (Volume in polar coordinates) Find the volume of the solid that lies inside $z = x^2 + y^2$ and below the plane $z = 16$.

Solution Here is a sketch of the solid.

[Figure: the solid between the paraboloid $z = x^2 + y^2$ and the plane $z = 16$, with its projection $R$ onto the $xy$-plane.]

Now, in this case the standard formula is not going to work. The formula
\[ \text{volume} = \iint_R f(x, y) \, dA \]
gives the volume under the function $f(x, y)$, and we are actually looking for the volume above the function. This is not the problem that it might appear to be, however. First, notice that
\[ V_1 = \iint_R 16 \, dA \]
will be the volume under $z = 16$ (of course we will need to determine $R$ eventually), while
\[ V_2 = \iint_R (x^2 + y^2) \, dA \]
will be the volume under $z = x^2 + y^2$, using the same $R$. The volume that we are looking for is really the difference between these two, or
\[ V = V_1 - V_2 = \iint_R \left[ 16 - (x^2 + y^2) \right] dA. \]
Now all that we need to do is determine the region $R$ and then convert everything to polar coordinates.

Determining the region $R$ in this case is not too complicated. If we look straight down the $z$-axis onto the region we will see a circle of radius 4 centered at the origin. This is because the top of the solid, where the elliptic paraboloid intersects the plane, is the widest part of the solid. We know the $z$ coordinate at the intersection, so setting $z = 16$ in the equation of the paraboloid gives
\[ 16 = x^2 + y^2, \]
which is the equation of a circle of radius 4 centered at the origin.

Here are the inequalities for the region $R$ and the function we will be integrating in terms of polar coordinates.
\[ 0 \le \theta \le 2\pi, \qquad 0 \le r \le 4, \qquad z = 16 - r^2. \]
The volume is then
\begin{align*}
\text{volume} &= \iint_R \left[ 16 - (x^2 + y^2) \right] dA \\
&= \int_0^{2\pi} \int_0^4 (16 - r^2) \, r \, dr \, d\theta \\
&= \int_0^{2\pi} \left[ 8r^2 - \frac{1}{4} r^4 \right]_0^4 d\theta \\
&= \int_0^{2\pi} 64 \, d\theta = 128\pi.
\end{align*}
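The three volumes in Examples 4.3.5 to 4.3.7 all have the form $\int_0^{2\pi}\int_0^{r_{\max}} h(r,\theta)\, r \, dr \, d\theta$, where $h = z_{\text{top}} - z_{\text{bottom}}$. The sketch below checks them with a midpoint sum. The name `volume_over_disk` is hypothetical, and the sample radius $a = 1$ for the sphere is an assumption made only so that a number can be printed.

```python
import math

def volume_over_disk(height, radius, n=300):
    """Midpoint-rule estimate of the integral of height(r, theta) * r over
    0 <= r <= radius, 0 <= theta <= 2*pi. (Hypothetical helper.)"""
    d_r = radius / n
    d_theta = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        for j in range(n):
            r = (j + 0.5) * d_r
            # height = z_top - z_bottom; the factor r comes from dA = r dr dtheta
            total += height(r, theta) * r * d_r * d_theta
    return total

if __name__ == "__main__":
    a = 1.0  # sample radius for Example 4.3.5 (an assumption; the notes keep a general)
    ball = volume_over_disk(lambda r, t: 2 * math.sqrt(a * a - r * r), a)
    cap = volume_over_disk(lambda r, t: math.sqrt(9 - r * r), math.sqrt(5))
    bowl = volume_over_disk(lambda r, t: 16 - r * r, 4.0)
    print(ball, "vs", 4 * math.pi * a ** 3 / 3)   # Example 4.3.5
    print(cap, "vs", 38 * math.pi / 3)            # Example 4.3.6
    print(bowl, "vs", 128 * math.pi)              # Example 4.3.7
```

The sphere converges a little more slowly than the other two because $\sqrt{a^2 - r^2}$ has an unbounded derivative at $r = a$; this is a property of the midpoint rule, not of the formula.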

In Examples 4.3.5 to 4.3.7 we would not have been able to easily compute the volume without first converting to polar coordinates. So, as these examples have shown, it is a good idea to always remember polar coordinates.

There is one more type of example that we need to look at before moving on to the next section. Sometimes we are given an iterated integral that is already in terms of $x$ and $y$ and we need to convert it to polar coordinates so that we can actually do the integral. We need to see an example of how to do this kind of conversion.


Example 4.3.8 (Double integral in polar coordinates) Evaluate the following integral by first converting to polar coordinates.
\[ \int_0^1 \int_0^{\sqrt{1 - y^2}} \cos(x^2 + y^2) \, dx \, dy. \]

Solution First, notice that we cannot do this integral in Cartesian coordinates, and so converting to polar coordinates may be the only option we have for actually doing the integral. Notice that the function will convert to polar coordinates nicely and so should not be a problem.

Let us first determine the region that we are integrating over and see if it is a region that can be easily converted into polar coordinates. Here are the inequalities that define the region in terms of Cartesian coordinates.
\[ 0 \le y \le 1, \qquad 0 \le x \le \sqrt{1 - y^2}. \]
Now, the upper limit for the $x$'s is
\[ x = \sqrt{1 - y^2}, \]
and this looks like the right side of the circle of radius 1 centered at the origin. Since the lower limit for the $x$'s is $x = 0$, it looks like we are going to have a portion (or all) of the right side of the disk of radius 1 centered at the origin.

The range for the $y$'s, however, tells us that we are only going to have positive $y$'s. This means that we are only going to have the portion of the disk of radius 1 centered at the origin that is in the first quadrant. So, the inequalities that define this region in terms of polar coordinates are
\[ 0 \le \theta \le \frac{\pi}{2}, \qquad 0 \le r \le 1. \]
Finally, we just need to remember that
\[ dx \, dy = dA = r \, dr \, d\theta, \]
and so the integral becomes
\[ \int_0^1 \int_0^{\sqrt{1 - y^2}} \cos(x^2 + y^2) \, dx \, dy = \int_0^{\pi/2} \int_0^1 \cos(r^2) \, r \, dr \, d\theta. \]
Note that this is an integral that we can do. Here is the rest of the work for this integral.
\begin{align*}
\int_0^1 \int_0^{\sqrt{1 - y^2}} \cos(x^2 + y^2) \, dx \, dy &= \int_0^{\pi/2} \left[ \frac{1}{2} \sin(r^2) \right]_0^1 d\theta \\
&= \int_0^{\pi/2} \frac{1}{2} \sin(1) \, d\theta = \frac{\pi}{4} \sin 1.
\end{align*}
$\Box$
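As a sanity check on the conversion in Example 4.3.8, one can estimate the original Cartesian iterated integral directly and compare it with the polar answer $\frac{\pi}{4}\sin 1$. This is only a rough numerical sketch: the function names are made up for illustration, and the midpoint rule is an arbitrary choice.

```python
import math

def cartesian_iterated(n=400):
    """Midpoint estimate of int_0^1 int_0^sqrt(1-y^2) cos(x^2+y^2) dx dy,
    keeping the original Cartesian limits. (Hypothetical helper.)"""
    dy = 1.0 / n
    total = 0.0
    for i in range(n):
        y = (i + 0.5) * dy
        x_max = math.sqrt(1.0 - y * y)  # upper limit of x depends on y
        dx = x_max / n
        for j in range(n):
            x = (j + 0.5) * dx
            total += math.cos(x * x + y * y) * dx * dy
    return total

def polar_closed_form():
    """Value obtained in the notes after converting to polar coordinates."""
    return math.pi / 4 * math.sin(1.0)

if __name__ == "__main__":
    print(cartesian_iterated(), "vs", polar_closed_form())
```

The two numbers should agree to about three or four decimal places, which supports the claim that both integrals describe the same quarter disk.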


4.4 Triple Integrals

After introducing integrations over a two-dimensional region we need to move on to integrations over a three-dimensional solid. We used a double integral to integrate over a two-dimensional region, and so it should not be too surprising that we will use a triple integral to integrate over a three-dimensional solid. The notation for the general triple integral is
\[ \iiint_G f(x, y, z) \, dV. \]
Let us start with a simple solid by integrating over the box
\[ G = [a, b] \times [c, d] \times [r, s]. \]
Note that when using this notation we list the $x$'s first, the $y$'s second and the $z$'s third. Just as a double integral can be evaluated by two successive single integrations, a triple integral can be evaluated by three successive integrations. The triple integral in this case is
\[ \iiint_G f(x, y, z) \, dV = \int_r^s \int_c^d \int_a^b f(x, y, z) \, dx \, dy \, dz. \]
Note that we integrate with respect to $x$ first, then $y$, and finally $z$ here, but in fact there is no reason to do the integrals in this order. There are 6 different possible orders to do the integral, and which order you choose will depend on the function and the order that you feel will be the easiest. We will get the same answer regardless of the order, however. The six different orders for the iterated integral are
\[ dx\,dy\,dz, \quad dx\,dz\,dy, \quad dy\,dz\,dx, \quad dz\,dy\,dx, \quad dz\,dx\,dy, \quad dy\,dx\,dz. \]

Let us do a quick example of a triple integral over a three-dimensional box.

Example 4.4.1 (Triple integral) Evaluate the following integral
\[ \iiint_G 8xyz \, dV, \]
where $G = [2, 3] \times [1, 2] \times [0, 1]$.

Solution Just to make the point that order does not matter, let us use a different order from that listed above. We will do the integral in the following order.
\begin{align*}
\iiint_G 8xyz \, dV &= \int_1^2 \int_2^3 \int_0^1 8xyz \, dz \, dx \, dy \\
&= \int_1^2 \int_2^3 \left[ 4xyz^2 \right]_0^1 dx \, dy \\
&= \int_1^2 \int_2^3 4xy \, dx \, dy \\
&= \int_1^2 \left[ 2x^2 y \right]_2^3 dy \\
&= \int_1^2 10y \, dy = \left[ 5y^2 \right]_1^2 = 15.
\end{align*}
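For this particular integrand there is a simple way to see why the order cannot matter. Since $8xyz$ is a product of a function of $x$, a function of $y$ and a function of $z$, and the limits are constants, the iterated integral factors into three single integrals whose product does not depend on the order. The sketch below evaluates that product exactly with rational arithmetic; the function names are illustrative assumptions, and the factorization remark is an observation added here rather than a statement from the notes.

```python
from fractions import Fraction

def integral_of_t(lo, hi):
    """Exact value of the integral of t dt from lo to hi, i.e. (hi^2 - lo^2) / 2."""
    lo, hi = Fraction(lo), Fraction(hi)
    return (hi * hi - lo * lo) / 2

def box_integral_8xyz(x_range, y_range, z_range):
    """Integral of 8xyz over the box x_range x y_range x z_range.
    The integrand is a product g(x) h(y) k(z) and the limits are constants,
    so the triple integral is a product of three single integrals."""
    return (8 * integral_of_t(*x_range)
              * integral_of_t(*y_range)
              * integral_of_t(*z_range))

if __name__ == "__main__":
    # G = [2, 3] x [1, 2] x [0, 1] from Example 4.4.1
    print(box_integral_8xyz((2, 3), (1, 2), (0, 1)))
```

For a general integrand over a box the order still does not matter, but the integral no longer splits into a product like this.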


Before moving on to more general solids, let us get a nice geometric interpretation of the triple integral out of the way so we can use it in some of the examples to follow. In the special case where $f(x, y, z) = 1$, the triple integral yields the formula for calculating the volume of a solid.

Theorem 4.4.1 The volume of the three-dimensional solid $G$ is given by the integral
\[ \text{volume of } G = \iiint_G 1 \, dV. \]

Let us now move on to more general three-dimensional solids. We have three different possibilities for a general solid. Here is a sketch of the first possibility.

[Figure: the solid $G$ between the surfaces $z = u_1(x, y)$ and $z = u_2(x, y)$, with its projection $R$ onto the $xy$-plane.]

In this case we define the solid $G$ as follows,
\[ G = \{ (x, y, z) : (x, y) \in R, \ u_1(x, y) \le z \le u_2(x, y) \}, \]
where $(x, y) \in R$ is the notation which means that the point $(x, y)$ lies in the region $R$ from the $xy$-plane. In this case we will evaluate the triple integral as follows
\[ \iiint_G f(x, y, z) \, dV = \iint_R \left[ \int_{u_1(x, y)}^{u_2(x, y)} f(x, y, z) \, dz \right] dA, \]
where the double integral can be evaluated by any of the methods that we saw in the previous couple of sections. In other words, we can integrate first with respect to $x$, we can integrate first with respect to $y$, or we can use polar coordinates as needed.


Example 4.4.2 (Triple integral) Evaluate $\iiint_G 2x \, dV$, where $G$ is the solid under the plane $2x + 3y + z = 6$ that lies in the first octant.

Solution We should first define octant. Just as the two-dimensional coordinate system can be divided into four quadrants, the three-dimensional coordinate system can be divided into eight octants. The first octant is the octant in which all three of the coordinates are positive. Here is a sketch of the plane in the first octant.

[Figure: the plane $2x + 3y + z = 6$ in the first octant, with the region $R$ on the $xy$-plane.]

We now need to determine the region $R$ in the $xy$-plane. We can get a visualization of the region by pretending to look straight down on the object from above. What we see will be the region $R$ in the $xy$-plane. So $R$ will be the triangle with vertices at $(0, 0)$, $(3, 0)$, and $(0, 2)$. Here is a sketch of $R$.

[Figure: the triangular region $R$, bounded by the axes and the line $y = -\frac{2}{3}x + 2$, or equivalently $x = -\frac{3}{2}y + 3$.]

Now we need the limits of integration. Since we are under the plane and in the first octant (so we are above the plane $z = 0$) we have the following limits for $z$.
\[ 0 \le z \le 6 - 2x - 3y. \]


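We can integrate the double integral over $R$ using either of the following two sets of inequalities
\[ 0 \le x \le 3, \qquad 0 \le y \le -\frac{2}{3}x + 2 \]
or
\[ 0 \le x \le -\frac{3}{2}y + 3, \qquad 0 \le y \le 2. \]
Since neither really holds an advantage over the other we will use the first one. The integral is then
\begin{align*}
\iiint_G 2x \, dV &= \iint_R \left[ \int_0^{6 - 2x - 3y} 2x \, dz \right] dA \\
&= \iint_R \Big[ 2xz \Big]_0^{6 - 2x - 3y} dA \\
&= \int_0^3 \int_0^{-\frac{2}{3}x + 2} 2x(6 - 2x - 3y) \, dy \, dx \\
&= \int_0^3 \Big[ 12xy - 4x^2 y - 3xy^2 \Big]_0^{-\frac{2}{3}x + 2} dx \\
&= \int_0^3 \left( \frac{4}{3}x^3 - 8x^2 + 12x \right) dx \\
&= \left[ \frac{1}{3}x^4 - \frac{8}{3}x^3 + 6x^2 \right]_0^3 = 9.
\end{align*}
$\Box$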
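The limits found in Example 4.4.2 translate directly into three nested loops: $x$ on the outside, $y$ between limits that depend on $x$, and $z$ between limits that depend on $(x, y)$. The following sketch uses a midpoint sum to check the value 9. The helper name `integrate_under_surface` is hypothetical, and the second check (the volume 6 of the tetrahedron, obtained from Theorem 4.4.1 with $f = 1$ and the standard formula $\frac{1}{6} \cdot 3 \cdot 2 \cdot 6$) is an addition for illustration, not a result stated in the notes.

```python
def integrate_under_surface(f, x_lo, x_hi, y_lo, y_hi, z_lo, z_hi, n=60):
    """Midpoint estimate of the triple integral of f(x, y, z) over
    x_lo <= x <= x_hi, y_lo(x) <= y <= y_hi(x), z_lo(x, y) <= z <= z_hi(x, y).
    (Hypothetical helper for illustration.)"""
    dx = (x_hi - x_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * dx
        y0, y1 = y_lo(x), y_hi(x)
        dy = (y1 - y0) / n
        for j in range(n):
            y = y0 + (j + 0.5) * dy
            z0, z1 = z_lo(x, y), z_hi(x, y)
            dz = (z1 - z0) / n
            for k in range(n):
                z = z0 + (k + 0.5) * dz
                total += f(x, y, z) * dz * dy * dx
    return total

if __name__ == "__main__":
    limits = dict(x_lo=0.0, x_hi=3.0,
                  y_lo=lambda x: 0.0, y_hi=lambda x: 2.0 - 2.0 * x / 3.0,
                  z_lo=lambda x, y: 0.0, z_hi=lambda x, y: 6.0 - 2.0 * x - 3.0 * y)
    print(integrate_under_surface(lambda x, y, z: 2 * x, **limits), "vs 9")
    print(integrate_under_surface(lambda x, y, z: 1.0, **limits), "vs 6 (volume)")
```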

Let us now move on to the second possible three-dimensional solid we may run into for triple integrals. Here is the sketch of the solid.

[Figure: the solid $G$ between the surfaces $x = u_1(y, z)$ and $x = u_2(y, z)$, with its projection $R$ onto the $yz$-plane.]

For this possibility we define the solid $G$ as follows,
\[ G = \{ (x, y, z) : (y, z) \in R, \ u_1(y, z) \le x \le u_2(y, z) \}. \]


So, the region $R$ will be a region in the $yz$-plane. Here is how we will evaluate these integrals.
\[ \iiint_G f(x, y, z) \, dV = \iint_R \left[ \int_{u_1(y, z)}^{u_2(y, z)} f(x, y, z) \, dx \right] dA. \]
As with the first possibility we will have two options for doing the double integral in the $yz$-plane as well as the option of using polar coordinates if needed.

Of course, there is the third (and final) possible three-dimensional solid we may run into for triple integrals. Here is a sketch of the solid $G$.

[Figure: the solid $G$ between the surfaces $y = u_1(x, z)$ and $y = u_2(x, z)$, with its projection $R$ onto the $xz$-plane.]

In this final case $G$ is defined as
\[ G = \{ (x, y, z) : (x, z) \in R, \ u_1(x, z) \le y \le u_2(x, z) \}, \]
and here the region $R$ will be a region in the $xz$-plane. Here is how we will evaluate these integrals.
\[ \iiint_G f(x, y, z) \, dV = \iint_R \left[ \int_{u_1(x, z)}^{u_2(x, z)} f(x, y, z) \, dy \right] dA, \]
where we can use either of the two possible orders for integrating over $R$ in the $xz$-plane or we can use polar coordinates as needed.

Example 4.4.3 (Triple integral) Evaluate $\iiint_G \sqrt{3x^2 + 3z^2} \, dV$, where $G$ is the solid bounded by $y = 2x^2 + 2z^2$ and the plane $y = 8$.

Solution Here is a sketch of the solid $G$.

[Figure: the solid $G$ between the paraboloid $y = 2x^2 + 2z^2$ and the plane $y = 8$, with the region $R$ in the $xz$-plane.]

The region $R$ in the $xz$-plane can be found by projecting the surface onto the $xz$-plane, and we can see that $R$ will be a disk in the $xz$-plane. This disk will come from the front of the solid, and we can determine the equation of the disk by setting the elliptic paraboloid and the plane equal.
\[ 2x^2 + 2z^2 = 8 \quad \Longrightarrow \quad x^2 + z^2 = 4. \]
This region, as well as the integrand, both seem to suggest that we should use something like polar coordinates. However, we are in the $xz$-plane and we have only seen polar coordinates in the $xy$-plane. This is not a problem. We can always translate them over to the $xz$-plane with the following definition.
\[ x = r\cos\theta, \qquad z = r\sin\theta. \]
Since the region does not involve $y$, we let $z$ take the place of $y$ in all the formulas. Note that the definition also leads to the formula
\[ x^2 + z^2 = r^2. \]
With this in hand we can arrive at the limits of the variables that we will need for this integral.
\[ 2x^2 + 2z^2 \le y \le 8, \qquad 0 \le r \le 2, \qquad 0 \le \theta \le 2\pi. \]
The integral is then
\begin{align*}
\iiint_G \sqrt{3x^2 + 3z^2} \, dV &= \iint_R \left[ \int_{2x^2 + 2z^2}^{8} \sqrt{3x^2 + 3z^2} \, dy \right] dA \\
&= \iint_R \Big[ y \sqrt{3x^2 + 3z^2} \Big]_{2x^2 + 2z^2}^{8} dA \\
&= \iint_R \sqrt{3(x^2 + z^2)} \, \big( 8 - (2x^2 + 2z^2) \big) \, dA.
\end{align*}
Now, since we are going to do the double integral in polar coordinates, let us get everything converted over to polar coordinates. The integrand is
\begin{align*}
\sqrt{3(x^2 + z^2)} \, \big( 8 - (2x^2 + 2z^2) \big) &= \sqrt{3r^2} \, (8 - 2r^2) \\
&= \sqrt{3} \, (8r - 2r^3).
\end{align*}
The integral is then
\begin{align*}
\iiint_G \sqrt{3x^2 + 3z^2} \, dV &= \iint_R \sqrt{3} \, (8r - 2r^3) \, dA \\
&= \sqrt{3} \int_0^{2\pi} \int_0^2 (8r - 2r^3) \, r \, dr \, d\theta \\
&= \sqrt{3} \int_0^{2\pi} \left[ \frac{8}{3} r^3 - \frac{2}{5} r^5 \right]_0^2 d\theta \\
&= \sqrt{3} \int_0^{2\pi} \frac{128}{15} \, d\theta \\
&= \frac{256\sqrt{3}\,\pi}{15}.
\end{align*}

4.5 Triple Integrals in Cylindrical Coordinates

In this section we want to take a look at triple integrals done completely in cylindrical coordinates. Recall that cylindrical coordinates are really nothing more than an extension of polar coordinates into three dimensions. The following are the conversion formulas for cylindrical coordinates.
\[ x = r\cos\theta, \qquad y = r\sin\theta, \qquad z = z. \]
In order to do the integral in cylindrical coordinates we will need to know what $dV$ becomes in terms of cylindrical coordinates. It can be shown that
\[ dV = r \, dz \, dr \, d\theta. \]
The solid $G$ over which we are integrating becomes
\begin{align*}
G &= \{ (x, y, z) : (x, y) \in R, \ u_1(x, y) \le z \le u_2(x, y) \} \\
&= \{ (r, \theta, z) : \theta_1 \le \theta \le \theta_2, \ r_1(\theta) \le r \le r_2(\theta), \ u_1(r, \theta) \le z \le u_2(r, \theta) \}.
\end{align*}
Note that we have only given this for solids $G$ in which $R$ is in the $xy$-plane. We can modify this accordingly if $R$ is in the $yz$-plane or the $xz$-plane as needed.

In terms of cylindrical coordinates a triple integral is
\[ \iiint_G f(x, y, z) \, dV = \int_{\theta_1}^{\theta_2} \int_{r_1(\theta)}^{r_2(\theta)} \int_{u_1(r, \theta)}^{u_2(r, \theta)} f(r\cos\theta, r\sin\theta, z) \, r \, dz \, dr \, d\theta. \]
Do not forget to add in the $r$ and make sure that all the $x$'s and $y$'s also get converted over into cylindrical coordinates.

Practically, to apply the above formula it is best to begin with a three-dimensional sketch of the solid $G$, from which the limits of integration can be obtained as follows:

Determining Limits of Integration: Cylindrical Coordinates

Step 1. Identify the upper surface $z = u_2(r, \theta)$ and the lower surface $z = u_1(r, \theta)$ of the solid. The functions $u_1(r, \theta)$ and $u_2(r, \theta)$ determine the $z$-limits of integration. (If the upper and lower surfaces are given in rectangular coordinates, convert them to cylindrical coordinates.)

Step 2. Make a two-dimensional sketch of the projection $R$ of the solid on the $xy$-plane. From this sketch the $r$- and $\theta$-limits of integration may be obtained exactly as with double integrals in polar coordinates.

Let us see some examples in the following.

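Example 4.5.1 (Cylindrical Coordinates) Evaluate $\iiint_G y \, dV$, where $G$ is the solid that lies below the plane $z = x + 2$, above the $xy$-plane and between the cylinders $x^2 + y^2 = 1$ and $x^2 + y^2 = 4$.

Solution There is not much to do here other than convert to cylindrical coordinates and then evaluate the integral. We will start out by getting the range for $z$ in terms of cylindrical coordinates.
\[ 0 \le z \le x + 2 \quad \Longrightarrow \quad 0 \le z \le r\cos\theta + 2. \]
Remember that we are above the $xy$-plane and so we are above the plane $z = 0$. Next, the region $R$ is the region between the two circles $x^2 + y^2 = 1$ and $x^2 + y^2 = 4$ in the $xy$-plane, and so the ranges for the region are
\[ 0 \le \theta \le 2\pi, \qquad 1 \le r \le 2. \]
Here is the integral.
\begin{align*}
\iiint_G y \, dV &= \int_0^{2\pi} \int_1^2 \int_0^{r\cos\theta + 2} (r\sin\theta) \, r \, dz \, dr \, d\theta \\
&= \int_0^{2\pi} \int_1^2 (r^2 \sin\theta)(r\cos\theta + 2) \, dr \, d\theta \\
&= \int_0^{2\pi} \int_1^2 \left( \frac{1}{2} r^3 \sin 2\theta + 2r^2 \sin\theta \right) dr \, d\theta \\
&= \int_0^{2\pi} \left[ \frac{1}{8} r^4 \sin 2\theta + \frac{2}{3} r^3 \sin\theta \right]_1^2 d\theta \\
&= \int_0^{2\pi} \left( \frac{15}{8} \sin 2\theta + \frac{14}{3} \sin\theta \right) d\theta \\
&= \left[ -\frac{15}{16} \cos 2\theta - \frac{14}{3} \cos\theta \right]_0^{2\pi} \\
&= 0.
\end{align*}
$\Box$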
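The cylindrical formula can be turned into a three-level midpoint sum in the order $dz \, dr \, d\theta$. The sketch below, with the hypothetical helper `cylindrical_integral`, checks the value 0 from Example 4.5.1. Because a zero answer is a weak test on its own, it also computes the volume of the same solid (taking $f = 1$), which works out to $\int_0^{2\pi}\int_1^2 (r\cos\theta + 2)\, r \, dr \, d\theta = 6\pi$; that volume is a side computation added here, not a result from the notes.

```python
import math

def cylindrical_integral(f, theta_lo, theta_hi, r_lo, r_hi, z_lo, z_hi, n=60):
    """Midpoint estimate of the triple integral of f(x, y, z) over
    theta_lo <= theta <= theta_hi, r_lo(theta) <= r <= r_hi(theta),
    z_lo(r, theta) <= z <= z_hi(r, theta), with dV = r dz dr dtheta.
    (Hypothetical helper for illustration.)"""
    d_theta = (theta_hi - theta_lo) / n
    total = 0.0
    for i in range(n):
        theta = theta_lo + (i + 0.5) * d_theta
        r0, r1 = r_lo(theta), r_hi(theta)
        d_r = (r1 - r0) / n
        for j in range(n):
            r = r0 + (j + 0.5) * d_r
            z0, z1 = z_lo(r, theta), z_hi(r, theta)
            d_z = (z1 - z0) / n
            x, y = r * math.cos(theta), r * math.sin(theta)
            for k in range(n):
                z = z0 + (k + 0.5) * d_z
                total += f(x, y, z) * r * d_z * d_r * d_theta
    return total

if __name__ == "__main__":
    limits = dict(theta_lo=0.0, theta_hi=2 * math.pi,
                  r_lo=lambda t: 1.0, r_hi=lambda t: 2.0,
                  z_lo=lambda r, t: 0.0, z_hi=lambda r, t: r * math.cos(t) + 2.0)
    print(cylindrical_integral(lambda x, y, z: y, **limits), "vs 0")
    print(cylindrical_integral(lambda x, y, z: 1.0, **limits), "vs", 6 * math.pi)
```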

Just as we did with double integrals involving polar coordinates, we can start with an iterated integral in terms of $x$, $y$, and $z$ and convert it to cylindrical coordinates.


Example 4.5.2 (Cylindrical Coordinates) Convert
\[ \int_{-1}^{1} \int_0^{\sqrt{1 - y^2}} \int_{x^2 + y^2}^{\sqrt{x^2 + y^2}} xyz \, dz \, dx \, dy \]
into an integral in cylindrical coordinates.

Solution Here are the ranges of the variables from this iterated integral.
\[ -1 \le y \le 1, \qquad 0 \le x \le \sqrt{1 - y^2}, \qquad x^2 + y^2 \le z \le \sqrt{x^2 + y^2}. \]
The first two inequalities define the region $R$, and since the upper and lower bounds for the $x$'s are $x = \sqrt{1 - y^2}$ and $x = 0$, we know that we have at least part of the right half of a disk of radius 1 centered at the origin. Since the range of $y$'s is $-1 \le y \le 1$, we know that we have the complete right half of the disk of radius 1 centered at the origin. So, the ranges for $R$ in cylindrical coordinates are
\[ -\frac{\pi}{2} \le \theta \le \frac{\pi}{2}, \qquad 0 \le r \le 1. \]
The last thing we have to do is to convert the limits of the $z$ range, but that is pretty simple.
\[ r^2 \le z \le r. \]
Here we notice that the lower bound is an elliptic paraboloid and the upper bound is a cone. Therefore $G$ is a portion of the solid between these two surfaces. The integral is then
\begin{align*}
\int_{-1}^{1} \int_0^{\sqrt{1 - y^2}} \int_{x^2 + y^2}^{\sqrt{x^2 + y^2}} xyz \, dz \, dx \, dy &= \int_{-\pi/2}^{\pi/2} \int_0^1 \int_{r^2}^{r} (r\cos\theta)(r\sin\theta) \, z \, r \, dz \, dr \, d\theta \\
&= \int_{-\pi/2}^{\pi/2} \int_0^1 \int_{r^2}^{r} z r^3 \cos\theta \sin\theta \, dz \, dr \, d\theta.
\end{align*}
$\Box$

4.6 Triple Integrals in Spherical Coordinates

We introduce here another common coordinate system that is frequently more convenient than either rectangular or cylindrical coordinates. In particular, some triple integrals that cannot be calculated exactly in either rectangular or cylindrical coordinates can be dealt with easily in spherical coordinates. First, we need to recall just how spherical coordinates are defined. The following sketch shows the relationship between the Cartesian and spherical coordinate systems.

[Figure: the point $P(x, y, z) = (\rho, \theta, \phi)$, the origin $O$, the point $Q(0, 0, z)$ on the $z$-axis and the point $R(x, y, 0)$ in the $xy$-plane.]

Spherical coordinates locate points in space with two angles and one distance, as shown in the above figure. The first coordinate, $\rho = |OP| = \sqrt{x^2 + y^2 + z^2}$, is the point's distance from the origin. Unlike $r$, the distance $\rho$ is never negative. The second coordinate, $\phi$, is the angle $OP$ makes with the positive $z$-axis. It is required to lie in the interval $[0, \pi]$. The third coordinate is the angle $\theta$ as measured in cylindrical coordinates.

If you look closely at the above figure, you can see how to relate rectangular and spherical coordinates. Notice that
\[ x = |OR| \cos\theta = |QP| \cos\theta. \]
Looking at the triangle $OQP$, we find that $|QP| = \rho\sin\phi$, so that
\[ x = \rho \sin\phi \cos\theta. \]
Similarly, we have
\[ y = |OR| \sin\theta = \rho \sin\phi \sin\theta. \]
Finally, focusing again on triangle $OQP$, we have $z = \rho\cos\phi$. Here is the summary of the conversion formulas for spherical coordinates.
\[ x = \rho \sin\phi \cos\theta, \qquad y = \rho \sin\phi \sin\theta, \qquad z = \rho \cos\phi, \qquad x^2 + y^2 + z^2 = \rho^2. \]
As just mentioned, we have the following restrictions on the coordinates.
\[ \rho \ge 0, \qquad 0 \le \phi \le \pi. \]
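The conversion formulas for spherical coordinates can be written as a pair of small functions. This is a sketch only: the function names are made up, and the inverse conversion (using `atan2` for $\theta$, so that $\theta$ lands in $(-\pi, \pi]$ rather than $[0, 2\pi)$) is a convention assumed here, since the notes only give the forward formulas.

```python
import math

def spherical_to_cartesian(rho, theta, phi):
    """Forward conversion: x = rho sin(phi) cos(theta), y = rho sin(phi) sin(theta),
    z = rho cos(phi), with rho >= 0 and 0 <= phi <= pi."""
    x = rho * math.sin(phi) * math.cos(theta)
    y = rho * math.sin(phi) * math.sin(theta)
    z = rho * math.cos(phi)
    return x, y, z

def cartesian_to_spherical(x, y, z):
    """Inverse conversion (assumed convention: theta in (-pi, pi] via atan2)."""
    rho = math.sqrt(x * x + y * y + z * z)
    theta = math.atan2(y, x)
    # phi is the angle from the positive z-axis; clamp guards against rounding.
    phi = math.acos(max(-1.0, min(1.0, z / rho))) if rho > 0 else 0.0
    return rho, theta, phi

if __name__ == "__main__":
    x, y, z = spherical_to_cartesian(2.0, math.pi / 3, math.pi / 4)
    print((x, y, z))
    print(x * x + y * y + z * z)           # should be close to rho^2 = 4
    print(cartesian_to_spherical(x, y, z))  # should recover (2, pi/3, pi/4)
```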


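For our integrals we are going to restrict $G$ to a spherical wedge. This will mean that we are going to take ranges for the variables as follows,
\[ \rho_1 \le \rho \le \rho_2, \qquad \theta_1 \le \theta \le \theta_2, \qquad \phi_1 \le \phi \le \phi_2. \]
Here is a quick sketch of a spherical wedge for reference purposes.

[Figure: a spherical wedge $G$.]

From this sketch we can see that $G$ is really nothing more than the intersection of a sphere and a cone. In fact we can show that
\[ dV = \rho^2 \sin\phi \, d\rho \, d\theta \, d\phi. \]
Therefore the integral will become
\[ \iiint_G f(x, y, z) \, dV = \int_{\phi_1}^{\phi_2} \int_{\theta_1}^{\theta_2} \int_{\rho_1}^{\rho_2} f(\rho\sin\phi\cos\theta, \ \rho\sin\phi\sin\theta, \ \rho\cos\phi) \, \rho^2 \sin\phi \, d\rho \, d\theta \, d\phi. \]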

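This looks a bit complicated, but given that the limits are all constants, the integrals here tend to not be too bad. Let us see a couple of examples of this kind.

Example 4.6.1 (Spherical Coordinates) Evaluate
\[ \iiint_G 16z \, dV, \]
where $G$ is the upper half of the sphere $x^2 + y^2 + z^2 = 1$.

Solution Since we are taking the upper half of the sphere, the limits for the variables are
\[ 0 \le \rho \le 1, \qquad 0 \le \theta \le 2\pi, \qquad 0 \le \phi \le \frac{\pi}{2}. \]
The integral is then
\begin{align*}
\iiint_G 16z \, dV &= \int_0^{\pi/2} \int_0^{2\pi} \int_0^1 (16\rho\cos\phi) \, \rho^2 \sin\phi \, d\rho \, d\theta \, d\phi \\
&= \int_0^{\pi/2} \int_0^{2\pi} \int_0^1 8\rho^3 \sin 2\phi \, d\rho \, d\theta \, d\phi \\
&= \int_0^{\pi/2} \int_0^{2\pi} 2\sin 2\phi \, d\theta \, d\phi \\
&= \int_0^{\pi/2} 4\pi \sin 2\phi \, d\phi \\
&= \Big[ -2\pi \cos 2\phi \Big]_0^{\pi/2} \\
&= 4\pi.
\end{align*}
$\Box$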

Example 4.6.2 (Spherical Coordinates) Convert
\[ \int_0^3 \int_0^{\sqrt{9 - y^2}} \int_{\sqrt{x^2 + y^2}}^{\sqrt{18 - x^2 - y^2}} \left( x^2 + y^2 + z^2 \right) dz \, dx \, dy \]
into spherical coordinates.

Solution Let us first write down the limits for the variables.
\[ 0 \le y \le 3, \qquad 0 \le x \le \sqrt{9 - y^2}, \qquad \sqrt{x^2 + y^2} \le z \le \sqrt{18 - x^2 - y^2}. \]
The range for $x$ tells us that we have a portion of the right half of a disk of radius 3 centered at the origin. Since we are restricting $y$ to positive values, we have a quarter disk in the first quadrant. Therefore, since $R$ is in the first quadrant, the solid $G$ must be in the first octant, and it follows that we have the following range for $\theta$ (since this is the angle around the $z$-axis).
\[ 0 \le \theta \le \frac{\pi}{2}. \]
Now, let us see what the range for $z$ tells us. The lower bound, $z = \sqrt{x^2 + y^2}$, is the upper half of a cone. We do not need this quite yet, but we will later. The upper bound, $z = \sqrt{18 - x^2 - y^2}$, is the upper half of the sphere $x^2 + y^2 + z^2 = 18$, and so from this we have the following range for $\rho$.
\[ 0 \le \rho \le \sqrt{18} = 3\sqrt{2}. \]
Now we need the range for $\phi$. There are two ways to get this. One is from where the cone and the sphere intersect. Plugging the equation for the cone into the sphere gives
\begin{align*}
\left( \sqrt{x^2 + y^2} \right)^2 + z^2 &= 18, \\
z^2 + z^2 &= 18, \\
z^2 &= 9, \\
z &= 3.
\end{align*}
Note that we can assume $z$ is positive here since we have the upper half of the cone and/or sphere. Finally, put this into the conversion for $z$ and take advantage of the fact that we know $\rho = 3\sqrt{2}$, since we are intersecting on the sphere. This gives
\begin{align*}
\rho\cos\phi &= 3, \\
3\sqrt{2}\cos\phi &= 3, \\
\cos\phi &= \frac{1}{\sqrt{2}} = \frac{\sqrt{2}}{2} \quad \Longrightarrow \quad \phi = \frac{\pi}{4}.
\end{align*}
So, we have the following range.
\[ 0 \le \phi \le \frac{\pi}{4}. \]
The other way to get this range is from the cone itself. By first converting the equation into cylindrical coordinates and then into spherical coordinates we get the following:
\begin{align*}
z &= \sqrt{x^2 + y^2} = r, \\
\rho\cos\phi &= \rho\sin\phi, \\
1 &= \tan\phi \quad \Longrightarrow \quad \phi = \frac{\pi}{4}.
\end{align*}
So, recalling that $\rho^2 = x^2 + y^2 + z^2$, the integral is then
\[ \int_0^3 \int_0^{\sqrt{9 - y^2}} \int_{\sqrt{x^2 + y^2}}^{\sqrt{18 - x^2 - y^2}} \left( x^2 + y^2 + z^2 \right) dz \, dx \, dy = \int_0^{\pi/4} \int_0^{\pi/2} \int_0^{3\sqrt{2}} \rho^4 \sin\phi \, d\rho \, d\theta \, d\phi. \]
$\Box$
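Both spherical examples have constant limits, so they can be checked with one midpoint sum over $(\rho, \theta, \phi)$. The sketch below assumes a hypothetical helper `spherical_integral`. For Example 4.6.2 the notes stop at the converted integral; the closed form used for comparison, $\frac{(3\sqrt{2})^5}{5} \cdot \frac{\pi}{2} \cdot \left(1 - \frac{\sqrt{2}}{2}\right) = \frac{486\pi}{5}(\sqrt{2} - 1)$, is worked out here by integrating each variable separately and is not a value stated in the original.

```python
import math

def spherical_integral(f, rho_range, theta_range, phi_range, n=60):
    """Midpoint estimate of the integral of f(x, y, z) over a spherical wedge
    with constant limits, using dV = rho^2 sin(phi) d(rho) d(theta) d(phi).
    (Hypothetical helper for illustration.)"""
    (r0, r1), (t0, t1), (p0, p1) = rho_range, theta_range, phi_range
    d_rho, d_theta, d_phi = (r1 - r0) / n, (t1 - t0) / n, (p1 - p0) / n
    total = 0.0
    for i in range(n):
        phi = p0 + (i + 0.5) * d_phi
        for j in range(n):
            theta = t0 + (j + 0.5) * d_theta
            for k in range(n):
                rho = r0 + (k + 0.5) * d_rho
                x = rho * math.sin(phi) * math.cos(theta)
                y = rho * math.sin(phi) * math.sin(theta)
                z = rho * math.cos(phi)
                total += f(x, y, z) * rho ** 2 * math.sin(phi) * d_rho * d_theta * d_phi
    return total

if __name__ == "__main__":
    # Example 4.6.1: 16z over the upper half of the unit ball
    ex_461 = spherical_integral(lambda x, y, z: 16 * z,
                                (0.0, 1.0), (0.0, 2 * math.pi), (0.0, math.pi / 2))
    # Example 4.6.2: x^2 + y^2 + z^2 over the first-octant wedge
    ex_462 = spherical_integral(lambda x, y, z: x * x + y * y + z * z,
                                (0.0, 3 * math.sqrt(2)), (0.0, math.pi / 2), (0.0, math.pi / 4))
    print(ex_461, "vs", 4 * math.pi)
    print(ex_462, "vs", 486 * math.pi / 5 * (math.sqrt(2) - 1))
```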
