1 Differential Calculus
  1.1 Ordinary and Partial Derivatives
  1.2 Directional Derivatives and the Gradient
    1.2.1 Directional Derivatives
    1.2.2 Meaning of the Gradient
    1.2.3 The Gradient in Three Dimensions
    1.2.4 Physical Applications of the Gradient
  1.3 Summary

2 Single Integrals
  2.1 Fundamental Theorem of Calculus
  2.2 Variable Transformations in Integrals
  2.3 Proof of the Fundamental Theorem
  2.4 Summary

3 Double Integrals
  3.1 Integrals in Cartesian Coordinates
  3.2 Circular Polar Coordinates
    3.2.1 Definitions of Coordinate Systems
    3.2.2 Transformation of the Integration Element
    3.2.3 Double Integrals in Polar Coordinates
  3.3 Summary

4 Triple Integrals
  4.1 Integrals in Cartesian Coordinates
  4.2 Cylindrical Polar Coordinates
    4.2.1 Definition of the Coordinate System
    4.2.2 The Integration Element
    4.2.3 Triple Integrals in Cylindrical Polar Coordinates
  4.3 Spherical Polar Coordinates
    4.3.1 Definition of the Coordinate System
    4.3.2 The Integration Element
    4.3.3 Triple Integrals in Spherical Polar Coordinates
  4.4 Surface Integrals
  4.5 Summary

5 Line Integrals
  5.1 Work in Classical Mechanics
  5.2 Line Integrals over Closed Curves
  5.3 Exact and Inexact Differentials
  5.4 Arc Length
  5.5 Summary

6 The Divergence and the Divergence Theorem
  6.1 Definition of the Divergence
  6.2 The Divergence Theorem
  6.3 Gauss' Law
  6.4 Summary

7 The Curl and Stokes' Theorem
  7.1 The Curl in Two Dimensions
  7.2 Green's Theorem
  7.3 Stokes' Theorem
    7.3.1 Line Integrals in Three Dimensions
    7.3.2 The Curl Vector
    7.3.3 The Curl of Three-Dimensional Vector Fields
  7.4 Summary
$$\frac{df}{dx} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}. \tag{1.1}$$
As the construction in Fig. 1.1 demonstrates, the derivative is the slope of the tangent to $f$ at the point $x$. The derivative of $f$ is often written as $f'(x)$.

Fig. 1.1: The construction of the derivative in Eq. (1.1). The left panel shows a line through the point $(x, f)$ with slope $\Delta f/\Delta x$. The right panel shows the effect of taking the limit $\Delta x \to 0$, which results in a line through $(x, f)$ that is tangent to $f$ at $x$.

Example. Consider the function $f(x) = x^2$. The derivative of this function with respect to $x$ can be calculated from first principles by using the definition in Eq. (1.1) as follows:

$$\begin{aligned}
\frac{d(x^2)}{dx} &= \lim_{\Delta x \to 0}\left[\frac{(x + \Delta x)^2 - x^2}{\Delta x}\right] \\
&= \lim_{\Delta x \to 0}\left[\frac{2x\,\Delta x + (\Delta x)^2}{\Delta x}\right] \\
&= \lim_{\Delta x \to 0}\,(2x + \Delta x) \\
&= 2x.
\end{aligned} \tag{1.2}$$
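The limit in the example above can also be watched numerically. The following is a minimal sketch, not part of the original notes: the helper names `difference_quotient` and `square` are hypothetical, and the sample point $x = 3$ is an arbitrary choice. For $f(x) = x^2$ the quotient equals $2x + \Delta x$, so the printed values should approach $2x = 6$ as $\Delta x$ shrinks.

```python
def difference_quotient(f, x, dx):
    """Slope of the line through (x, f(x)) and (x + dx, f(x + dx)).

    This is the quantity inside the limit of Eq. (1.1), before dx -> 0.
    """
    return (f(x + dx) - f(x)) / dx


def square(x):
    # The function f(x) = x^2 used in the worked example.
    return x * x


x = 3.0  # arbitrary sample point, chosen only for illustration
for dx in (1.0, 0.1, 0.01, 0.001):
    # For f(x) = x^2 the quotient is 2x + dx, so it tends to 2x as dx -> 0.
    print(dx, difference_quotient(square, x, dx))
```

The same helper works for any differentiable function passed as `f`, which makes it a convenient check on the formulae derived from Eq. (1.1).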
Fig. 1.2: (a) A section of a surface $f(x, y)$. (b) The partial derivative of $f$ with respect to $x$, and (c) the partial derivative of $f$ with respect to $y$ at the same point. The constructions in (b) and (c) show that the two partial derivatives of $f$ can be obtained by slicing the surface parallel to the appropriate axis.

The basic definition in Eq. (1.1) can be used to show the following well-known formulae for sums, products, and quotients of functions, and the chain rule for composite functions (i.e. functions of functions):

$$\frac{d}{dx}(a f + b g) = a\,\frac{df}{dx} + b\,\frac{dg}{dx}, \tag{1.3}$$

$$\frac{d(fg)}{dx} = \frac{df}{dx}\,g + f\,\frac{dg}{dx}, \tag{1.4}$$

$$\frac{d}{dx}\left(\frac{f}{g}\right) = \frac{1}{g^2}\left(\frac{df}{dx}\,g - f\,\frac{dg}{dx}\right), \tag{1.5}$$

$$\frac{d f(g(x))}{dx} = \frac{df}{dg}\,\frac{dg}{dx}, \tag{1.6}$$
in which $a$ and $b$ are any constants and $f$ and $g$ are any differentiable functions. Specific derivatives that will be used throughout this course are

$$\frac{d x^n}{dx} = n x^{n-1}, \tag{1.7}$$

$$\frac{d \sin x}{dx} = \cos x, \tag{1.8}$$

$$\frac{d \cos x}{dx} = -\sin x, \tag{1.9}$$

$$\frac{d e^{f(x)}}{dx} = \frac{df}{dx}\,e^{f(x)}, \tag{1.10}$$

$$\frac{d \ln x}{dx} = \frac{1}{x}, \tag{1.11}$$

where $n$ is an integer. All of these results will be derived from the definition in Eq. (1.1) in Classwork 1 and Problem Set 1.

The derivative can be extended to functions of more than one variable. For a function $f$ of two independent variables $x$ and $y$, the partial derivative of $f$ with respect to $x$ is defined as (Fig. 1.2)

$$\frac{\partial f}{\partial x} \equiv \lim_{\Delta x \to 0}\left[\frac{f(x + \Delta x, y) - f(x, y)}{\Delta x}\right], \tag{1.12}$$

with an analogous expression for the partial derivative $\partial f/\partial y$:

$$\frac{\partial f}{\partial y} \equiv \lim_{\Delta y \to 0}\left[\frac{f(x, y + \Delta y) - f(x, y)}{\Delta y}\right]. \tag{1.13}$$
As these definitions indicate, when taking the partial derivative with respect to a particular independent variable, the other independent variables are held fixed. Thus, the usual rules of differentiation apply, with these other variables treated effectively as constants. Partial derivatives are often abbreviated with a subscript to indicate the independent variable used for the derivative. In this notation, the two derivatives in Eqs. (1.12) and (1.13) are written as $f_x$ and $f_y$, respectively. Similarly, the three second-order derivatives are written as $f_{xx}$, $f_{xy}$, and $f_{yy}$. The generalization of partial derivatives to any number of independent variables is straightforward.

Example. Consider the function $f(x, y) = x \sin y$. The two first partial derivatives are

$$\frac{\partial f}{\partial x} = \sin y, \tag{1.14}$$

$$\frac{\partial f}{\partial y} = x \cos y. \tag{1.15}$$

The derivative can also be applied to vectors. Consider the quantity

$$\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k}, \tag{1.16}$$

where $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ are the usual unit vectors along the $x$, $y$, and $z$ directions, respectively. This may be imagined as the position of a particle in space at time $t$. The derivative of $\mathbf{r}$ with respect to $t$, which is the instantaneous velocity of the particle, is given by

$$\mathbf{v}(t) = \frac{d\mathbf{r}}{dt} = \frac{dx}{dt}\,\mathbf{i} + \frac{dy}{dt}\,\mathbf{j} + \frac{dz}{dt}\,\mathbf{k}. \tag{1.17}$$

This vector is tangent to the path traced out by $\mathbf{r}(t)$, with a magnitude that is equal to the speed of the particle.
the second derivative $f''(x_0)$, and so on. These quantities enable a function to be visualized in terms of its steepness and the sharpness of its bends. A function $f$ of two or more independent variables can be similarly characterized in terms of its partial derivatives. But partial derivatives such as $f_x$ and $f_y$ are unnecessarily restrictive in that they represent the rates of change of a function only along the $x$- and $y$-axes (Fig. 1.2). In fact, derivatives can be taken along any direction in the space of the independent variables. The computation of such directional derivatives is the basis for introducing a new type of derivative for functions of two or more variables, called the gradient. This quantity is a vector that provides similar information to contour plots, such as isobars on a weather map or a relief map of a mountain range, but in a much more succinct form. In some respects, the gradient is the natural generalization to higher dimensions of the ordinary derivative in that it determines the tangent plane to the surface of a function (for the case of two independent variables), just as the ordinary derivative is the slope of the tangent line to a function.

The gradient has a wide variety of applications, ranging from the calculation of the flow of physical quantities such as heat, to the solution of certain types of linear equations (the conjugate gradient method), and to image processing, where the gradient is used to extract information about the edges in an image, which is especially important in biological and medical imaging.
1.2.1 Directional Derivatives
We will confine our discussion initially to functions of two independent variables because the various quantities associated with derivatives are easier to visualize than for the case of three independent variables and the results obtained are straightforward to generalize. The task at hand is the calculation of the derivative of a function along a particular direction. This proceeds by specifying the point $\mathbf{r}_0$ and the direction along a unit vector $\mathbf{u}$:

$$\mathbf{r}_0 = x_0\,\mathbf{i} + y_0\,\mathbf{j}, \tag{1.18}$$

$$\mathbf{u} = a\,\mathbf{i} + b\,\mathbf{j}, \tag{1.19}$$

where the stipulation that $\mathbf{u}$ must be a unit vector, $\mathbf{u} \cdot \mathbf{u} = 1$, means that $a^2 + b^2 = 1$. We now form the vector

$$\mathbf{r} = \mathbf{r}_0 + s\,\mathbf{u} = (x_0 + as)\,\mathbf{i} + (y_0 + bs)\,\mathbf{j}, \tag{1.20}$$

where $0 \le s \le 1$ is a parameter, as shown in Fig. 1.3. In terms of Cartesian components, $\mathbf{r} = (x, y)$,

$$x = x_0 + as, \tag{1.21}$$

$$y = y_0 + bs. \tag{1.22}$$

Thus, for $s = 0$, $\mathbf{r} = \mathbf{r}_0$, and for $s = 1$, $\mathbf{r} = \mathbf{r}_0 + \mathbf{u}$ (Fig. 1.3). Now consider a scalar function $f(x, y)$. Along the direction defined by $\mathbf{r}$,

$$f(x, y) = f[x(s), y(s)] = f(x_0 + as,\, y_0 + bs). \tag{1.23}$$
Fig. 1.3: [Figure not reproduced. Labels: $\mathbf{r}_0$, $\mathbf{u}$, and the points $s = 0$ and $s = 1$.]
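Thus, the derivative of $f$ with respect to $s$ is, according to the chain rule,

$$\frac{df}{ds} = \frac{\partial f}{\partial x}\frac{dx}{ds} + \frac{\partial f}{\partial y}\frac{dy}{ds} = \frac{\partial f}{\partial x}\,a + \frac{\partial f}{\partial y}\,b. \tag{1.24}$$

We can rewrite this expression in vector notation as

$$\frac{df}{ds} = \underbrace{\left(\frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j}\right)}_{\nabla f} \cdot \underbrace{(a\,\mathbf{i} + b\,\mathbf{j})}_{\mathbf{u}} = \frac{\partial f}{\partial x}\,a + \frac{\partial f}{\partial y}\,b, \tag{1.25}$$

where we have defined the gradient of $f$, denoted by $\nabla f$, as

$$\nabla f = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j}. \tag{1.26}$$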
The symbol $\nabla$ is referred to as nabla, grad, or del. The gradient is a vector field calculated from the scalar function $f$. The quantity $df/ds$ is the directional derivative of $f$ in the direction of the unit vector $\mathbf{u}$, and is written as

$$\nabla_{\mathbf{u}} f \equiv \nabla f \cdot \mathbf{u}, \tag{1.27}$$

which is a scalar quantity because it is obtained as the dot product of two vectors. This derivative can be written in a form analogous to the ordinary derivative in Sec. 1.1. Using the notation $f(\mathbf{r}) \equiv f(x, y)$, we have that

$$\nabla_{\mathbf{u}} f\,\Big|_{\mathbf{r}_0} = \lim_{s \to 0}\left[\frac{f(\mathbf{r}_0 + s\,\mathbf{u}) - f(\mathbf{r}_0)}{s}\right]. \tag{1.28}$$
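Example. Consider the special cases where $\mathbf{u} = \mathbf{i}$ and $\mathbf{u} = \mathbf{j}$. From Eq. (1.27), we obtain

$$\nabla_{\mathbf{i}} f = \left(\frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j}\right) \cdot \mathbf{i} = \frac{\partial f}{\partial x}, \tag{1.29}$$

$$\nabla_{\mathbf{j}} f = \left(\frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j}\right) \cdot \mathbf{j} = \frac{\partial f}{\partial y}. \tag{1.30}$$

Thus, the directional derivatives along the directions of the coordinate axes reduce to the familiar partial derivatives. Along any other direction, the directional derivative is a linear combination of these two derivatives, weighted by the components $a$ and $b$ of $\mathbf{u}$. This explains why $\mathbf{u}$ must be a unit vector: the role of $\mathbf{u}$ is only to provide the direction in the directional derivative and not to affect its magnitude.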
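As a rough numerical illustration of Eqs. (1.27) and (1.28), the sketch below evaluates the directional derivative of $f(x, y) = x \sin y$ (the function from the example in Sec. 1.1) in two ways: as the dot product of the gradient with a unit vector, and as a difference quotient along that direction. The function names, the sample point, and the use of a symmetric (central) difference in place of the one-sided limit are assumptions made for illustration only.

```python
import math


def f(x, y):
    # Example function f(x, y) = x sin y.
    return x * math.sin(y)


def grad_f(x, y):
    # Analytic partial derivatives: f_x = sin y, f_y = x cos y.
    return (math.sin(y), x * math.cos(y))


def directional_derivative(x, y, a, b):
    """grad f . u, where the direction (a, b) is first normalised to a unit vector."""
    norm = math.hypot(a, b)
    a, b = a / norm, b / norm
    fx, fy = grad_f(x, y)
    return fx * a + fy * b


def directional_difference(x, y, a, b, s=1e-6):
    """Central-difference estimate of the derivative of f along the unit vector u.

    This is a symmetric variant of the one-sided limit in Eq. (1.28).
    """
    norm = math.hypot(a, b)
    a, b = a / norm, b / norm
    return (f(x + s * a, y + s * b) - f(x - s * a, y - s * b)) / (2 * s)


# Hypothetical sample point and direction.
print(directional_derivative(1.0, 0.5, 3.0, 4.0))
print(directional_difference(1.0, 0.5, 3.0, 4.0))
```

Normalising the direction inside the functions reflects the requirement that $\mathbf{u}$ be a unit vector; passing $(1, 0)$ or $(0, 1)$ recovers the two partial derivatives.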
where $\theta$ is the angle between $\nabla f$ and $\mathbf{u}$. The maximum value of the right-hand side is obtained for $\theta = 0$, when $\nabla f$ and $\mathbf{u}$ are parallel. We conclude that

$$\nabla_{\mathbf{u}} f \le |\nabla f|, \tag{1.32}$$

i.e. the magnitude of the gradient of a function is the maximum rate of change of that function.

Direction. We can proceed further and obtain an interpretation of the direction of the gradient. Consider a curve $[x(s), y(s)]$ along which $f(x, y) = \text{constant}$, so that $f[x(s), y(s)] = \text{constant}$ for all $s$. These curves are called contour lines and they represent constant function values, as shown in Fig. 1.4(a). If a contour line is projected onto the $x$-$y$ plane, any point of the projected curve is given by

$$\mathbf{r}(s) = x(s)\,\mathbf{i} + y(s)\,\mathbf{j}, \tag{1.33}$$
Fig. 1.4: (a) Contour lines of constant function value. (b) The position vector $\mathbf{r}(s)$ used to describe a contour line.
as shown in Fig. 1.4(b). So for the contour lines we may write

$$f[x(s), y(s)] = \text{const.}, \tag{1.34}$$

which gives, after differentiation of both sides,

$$\frac{d f[x(s), y(s)]}{ds} = 0 \;\Rightarrow\; \frac{\partial f[x(s), y(s)]}{\partial x}\,\frac{dx(s)}{ds} + \frac{\partial f[x(s), y(s)]}{\partial y}\,\frac{dy(s)}{ds} = 0, \tag{1.35}$$

as follows from the chain rule. The above equation can be written as a scalar product of two vectors:

$$\nabla f \cdot \mathbf{v} = 0, \tag{1.36}$$

with

$$\mathbf{v}(s) = \frac{dx(s)}{ds}\,\mathbf{i} + \frac{dy(s)}{ds}\,\mathbf{j} = \frac{d\mathbf{r}(s)}{ds}. \tag{1.37}$$

Equation (1.37) shows that $\mathbf{v}(s)$ is the derivative of the position vector $\mathbf{r}(s)$ along the curve, i.e. $\mathbf{v}(s)$ is tangent to the contour line. Also, by Eq. (1.36), $\nabla f$ is perpendicular to $\mathbf{v}(s)$, i.e. the gradient of a function is perpendicular to lines of constant function value. The properties of the gradient for functions of two independent variables are illustrated by the following example.

Example. Consider the function

$$f(x, y) = 1 - x^2 - y^2 \tag{1.38}$$
for the ranges of variables $-1 \le x \le 1$ and $-1 \le y \le 1$. This surface, which is an inverted paraboloid, is shown in Fig. 1.5(a). The gradient of $f$ is defined as the two-dimensional vector

$$\nabla f = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j}. \tag{1.39}$$

The partial derivatives of $f$ are straightforward to calculate:

$$\frac{\partial f}{\partial x} = -2x, \qquad \frac{\partial f}{\partial y} = -2y, \tag{1.40}$$

so the gradient is

$$\nabla f = -2x\,\mathbf{i} - 2y\,\mathbf{j}. \tag{1.41}$$

According to the discussion in this section, the gradient represents the maximum rate of change of $f$ and points in the direction normal to surfaces of constant $f$. In this case, the surfaces of constant $f$ are curves in the $x$-$y$ plane. For $f = z_0$, these curves are given by

$$x^2 + y^2 = 1 - z_0, \tag{1.42}$$
Fig. 1.5: (a) The surface $f(x, y) = 1 - x^2 - y^2$ in Eq. (1.38). (b) Contours of constant $f$ in Eq. (1.38) and the gradient field in Eq. (1.41).
which are circles of radius $\sqrt{1 - z_0}$ centered at the origin. As $z_0$ increases from 0 to 1, the radii of the circles decrease from 1 to 0, i.e. the height of $f$ above the $x$-$y$ plane increases toward the origin, which is the maximum of $f$, as shown in Fig. 1.5(a). These contours are shown in Fig. 1.5(b), together with the gradient calculated in Eq. (1.41). The gradient is seen to be a radial vector field that points toward the origin, i.e. along the direction of the maximum rate of change of $f$. Also evident is that the gradient vectors are normal to the circles of constant $f$.
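Having computed the gradient of $f$, we can now determine its directional derivative. For any unit vector $\mathbf{u} = a\,\mathbf{i} + b\,\mathbf{j}$, where $a^2 + b^2 = 1$, we have from Eq. (1.27),

$$\nabla_{\mathbf{u}} f = (-2x\,\mathbf{i} - 2y\,\mathbf{j}) \cdot (a\,\mathbf{i} + b\,\mathbf{j}) = -2ax - 2by. \tag{1.43}$$

The calculation of this derivative requires the specification of a point $(x, y)$ and a direction $(a, b)$. For example, at the point $(\tfrac{1}{2}, \tfrac{1}{2})$,

$$\nabla_{\mathbf{u}} f\,\Big|_{(\frac{1}{2}, \frac{1}{2})} = -a - b. \tag{1.44}$$

The maximum of the directional derivative is obtained by substituting $b = \pm\sqrt{1 - a^2}$ and calculating the maximum of the function

$$\nabla_{\mathbf{u}} f\,\Big|_{(\frac{1}{2}, \frac{1}{2})} = -a \mp \sqrt{1 - a^2} = f(a), \tag{1.45}$$

which gives

$$\frac{d f(a)}{da} = -1 \pm \frac{a}{\sqrt{1 - a^2}} = 0, \tag{1.46}$$

so that $a^2 = \tfrac{1}{2}$, i.e. $a = \pm\sqrt{2}/2$. By choosing the negative sign we have $a = b = -\sqrt{2}/2$, which gives the maximum of the directional derivative,

$$\nabla_{\mathbf{u}} f\,\Big|_{(\frac{1}{2}, \frac{1}{2})} = \frac{\sqrt{2}}{2} + \frac{\sqrt{2}}{2} = \sqrt{2}, \tag{1.47}$$

which is equal to the magnitude of the gradient. The minimum value is obtained for $a = \sqrt{2}/2$ and $b = \sqrt{2}/2$:

$$\nabla_{\mathbf{u}} f\,\Big|_{(\frac{1}{2}, \frac{1}{2})} = -\sqrt{2}, \tag{1.48}$$

whose absolute value is also equal to the magnitude of the gradient. The directional derivative vanishes for $a = \sqrt{2}/2$, $b = -\sqrt{2}/2$ and for $a = -\sqrt{2}/2$, $b = \sqrt{2}/2$, which are the directions tangential to the contours of constant $f$.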
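The extreme values found in this example can be cross-checked by brute force. The sketch below, written under the assumption that a simple scan over directions is accurate enough, parametrises the unit vector as $\mathbf{u} = (\cos\alpha, \sin\alpha)$ and searches for the largest and smallest directional derivative of $f = 1 - x^2 - y^2$ at $(\tfrac{1}{2}, \tfrac{1}{2})$. The function names and the number of sampled directions are hypothetical choices.

```python
import math


def grad_f(x, y):
    # Gradient of f(x, y) = 1 - x^2 - y^2.
    return (-2.0 * x, -2.0 * y)


def directional_derivative(x, y, angle):
    """grad f . u with u = (cos(angle), sin(angle)), a unit vector by construction."""
    gx, gy = grad_f(x, y)
    return gx * math.cos(angle) + gy * math.sin(angle)


def extreme_directions(x, y, steps=3600):
    """Scan `steps` equally spaced directions.

    Returns (max value, angle at max, min value, angle at min).
    """
    samples = []
    for k in range(steps):
        angle = 2.0 * math.pi * k / steps
        samples.append((directional_derivative(x, y, angle), angle))
    best = max(samples)
    worst = min(samples)
    return best[0], best[1], worst[0], worst[1]


max_val, max_angle, min_val, min_angle = extreme_directions(0.5, 0.5)
print(max_val, math.degrees(max_angle))  # maximum and its direction in degrees
print(min_val, math.degrees(min_angle))  # minimum and its direction in degrees
```

The maximum should occur for the direction pointing toward the origin (parallel to the gradient) and the minimum for the opposite direction, with both values equal in magnitude to $|\nabla f| = \sqrt{2}$ at this point.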
Fig. 1.6: The surface $x^2 + y^2 + z^2 = \text{constant}$, shown together with the gradient at several points.
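added dimension. We only quote the main results here and leave the derivations as an exercise. The gradient of a function $f(x, y, z)$ of three independent variables is

$$\nabla f = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k}. \tag{1.49}$$

The directional derivative at a point $(x_0, y_0, z_0)$ in the direction of the unit vector

$$\mathbf{u} = a\,\mathbf{i} + b\,\mathbf{j} + c\,\mathbf{k}, \tag{1.50}$$

where $a^2 + b^2 + c^2 = 1$, is

$$\nabla_{\mathbf{u}} f = \nabla f \cdot \mathbf{u}. \tag{1.51}$$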
The properties of the gradient are the same as for functions of two independent variables, namely, that the magnitude of the gradient of $f$ at a point is the maximum rate of change of $f$ at that point, and the gradient points in the direction of the maximum rate of change of $f$, normal to surfaces of constant $f$.

Example. Consider the function

$$f(x, y, z) = x^2 + y^2 + z^2. \tag{1.52}$$

The surfaces of constant $f$ are concentric spheres, as shown in Fig. 1.6. The partial derivatives of $f$ are

$$\frac{\partial f}{\partial x} = 2x, \qquad \frac{\partial f}{\partial y} = 2y, \qquad \frac{\partial f}{\partial z} = 2z, \tag{1.53}$$

so the gradient of $f$ is

$$\nabla f = 2x\,\mathbf{i} + 2y\,\mathbf{j} + 2z\,\mathbf{k}. \tag{1.54}$$

Several of these vectors are plotted in Fig. 1.6. These vectors are normal to the spherical surface and point away from the origin, because points on spherical surfaces with increasing radii are further away from the origin. The directional derivative can be calculated, for example at $(1, 0, 0)$, from Eqs. (1.54) and (1.50) as

$$\nabla_{\mathbf{u}} f\,\Big|_{(1,0,0)} = 2a. \tag{1.55}$$
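This derivative has a maximum value when $a = 1$, i.e. for $\mathbf{u}$ along the $x$-axis (as shown in Fig. 1.6), and vanishes if $a = 0$. The latter case corresponds to any unit vector in the $y$-$z$ plane, including the four unit vectors pointing along the positive and negative $y$- and $z$-axes, i.e. to directions normal to the gradient.

Example. We may use the results of the preceding example to calculate the tangent plane of a sphere. The equation of a general plane is given by

$$\mathbf{N} \cdot (\mathbf{r} - \mathbf{a}) = 0, \tag{1.56}$$

where $\mathbf{N}$ is a vector normal to the plane, $\mathbf{r} = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$ is a position vector, and $\mathbf{a}$ is a vector pointing to a known point on the plane. Suppose we would like to know the equation of the plane tangent to the sphere

$$x^2 + y^2 + z^2 = 3 \tag{1.57}$$

at $\mathbf{a} = \mathbf{i} + \mathbf{j} + \mathbf{k}$, which lies on this sphere. The normal at $\mathbf{a} = (1, 1, 1)$ is given from Eq. (1.54) by $\mathbf{N} = 2\,\mathbf{i} + 2\,\mathbf{j} + 2\,\mathbf{k}$, and so Eq. (1.56) yields

$$(2\,\mathbf{i} + 2\,\mathbf{j} + 2\,\mathbf{k}) \cdot \left[(x - 1)\,\mathbf{i} + (y - 1)\,\mathbf{j} + (z - 1)\,\mathbf{k}\right] = 0, \tag{1.58}$$

which gives

$$x + y + z = 3, \tag{1.59}$$

which is the equation of the plane.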
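The tangent-plane construction can be expressed compactly in code. The sketch below is illustrative only: the function names are hypothetical, and it treats the tangent plane as belonging to the level surface of $f = x^2 + y^2 + z^2$ that passes through the chosen point. It returns the normal $\mathbf{N} = \nabla f$ and the constant $d = \mathbf{N} \cdot \mathbf{a}$, so that the plane reads $\mathbf{N} \cdot \mathbf{r} = d$.

```python
def grad_f(x, y, z):
    # Gradient of f(x, y, z) = x^2 + y^2 + z^2.
    return (2.0 * x, 2.0 * y, 2.0 * z)


def tangent_plane(point):
    """Return (N, d) such that the tangent plane at `point` is N . r = d.

    N is the gradient at the point; d = N . a follows from N . (r - a) = 0.
    """
    normal = grad_f(*point)
    d = sum(n * p for n, p in zip(normal, point))
    return normal, d


normal, d = tangent_plane((1.0, 1.0, 1.0))
# Here N = (2, 2, 2) and d = 6, i.e. 2x + 2y + 2z = 6, or x + y + z = 3.
print(normal, d)
```

Dividing the resulting equation by 2 gives the plane $x + y + z = 3$ obtained in the example.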
The minus sign indicates that the forces acting on the particle point in the direction of decreasing potential energy. The motion of the particle is obtained by solving Newton's second law of motion:

$$m\,\frac{d^2\mathbf{r}}{dt^2} = -\nabla V, \tag{1.61}$$

where $m$ is the mass of the particle. The existence of a potential associated with a force is discussed in Sec. 5.3.

The electrostatic field $\mathbf{E}$ is conservative, so the work done on a particle depends only on the initial and final positions of the particle, and not on the path followed. As discussed in Sec. 5.3, a potential energy can be associated with each conservative force. For the electrostatic field, the associated potential $\phi$ is calculated from

$$\mathbf{E} = -\nabla \phi. \tag{1.62}$$

In most metals and semiconductors, the relationship between the electrical current density $\mathbf{j}$ and the applied electric field $\mathbf{E}$ is given by Ohm's law:

$$\mathbf{j} = \sigma\,\mathbf{E}, \tag{1.63}$$

where $\sigma$ is the electrical conductivity. By using Eq. (1.62), we can write this relation as

$$\mathbf{j} = -\sigma\,\nabla \phi. \tag{1.64}$$

As the following discussion shows, there are several phenomena that are described by equations of this form. The relationship between the heat flow $\mathbf{q}$ and variations of the temperature $T$ is expressed by Fourier's law:

$$\mathbf{q} = -C\,\nabla T, \tag{1.65}$$

where $C$ is the coefficient of thermal conductivity. The coefficient $C$ may vary with the temperature, and certainly varies from one substance to another, but it is always positive. This makes intuitive sense, at least if the molecular concept of temperature is invoked: heat (kinetic energy at the microscopic scale) tends to flow from regions of high internal energy to regions of low internal energy, which is consistent with the statement above that the heat flow is directed opposite to the gradient of the temperature, i.e. along the direction of its maximum rate of decrease.

Similar concepts apply to particle diffusion. The current of particles $\mathbf{j}$ in the presence of a varying concentration $c$ of the particles is given by Fick's law:

$$\mathbf{j} = -D\,\nabla c, \tag{1.66}$$

where $D$ is the diffusion coefficient. The negative sign indicates that transport of material is from high to low concentrations, so that any variations in the concentrations tend to be smoothed out.
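The laws quoted in this section all have the form "flux equals minus a positive coefficient times a gradient". The sketch below illustrates this for heat flow, with the minus sign of Fourier's law included. The temperature field, the conductivity value, and the function names are invented for illustration and carry no physical significance; the gradient is estimated by central differences rather than computed analytically.

```python
def temperature(x, y):
    # Hypothetical temperature field, chosen only for illustration.
    return 300.0 - 20.0 * x - 5.0 * y * y


def gradient(field, x, y, h=1e-5):
    """Central-difference estimate of the two-dimensional gradient of `field`."""
    dfdx = (field(x + h, y) - field(x - h, y)) / (2.0 * h)
    dfdy = (field(x, y + h) - field(x, y - h)) / (2.0 * h)
    return dfdx, dfdy


def heat_flow(x, y, conductivity):
    """Heat flow q = -C grad T for a positive conductivity C."""
    gx, gy = gradient(temperature, x, y)
    return (-conductivity * gx, -conductivity * gy)


# At (1, 2) the gradient of the sample field is (-20, -20), so the heat
# flow points toward increasing x and y, i.e. toward lower temperature.
q = heat_flow(1.0, 2.0, conductivity=0.5)
print(q)
```

Replacing `temperature` by a concentration or an electrostatic potential, and the conductivity by the corresponding coefficient, gives the analogous sketches for Fick's and Ohm's laws.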
1.3 Summary
This chapter has introduced a derivative operation on scalar functions with several independent variables called the gradient, and denoted by $\nabla$:

$$\nabla f = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k}. \tag{1.67}$$

The gradient is a vector quantity calculated from a scalar function. The gradient is an intrinsic property of the function that has the magnitude of the maximum rate of change of $f$ and points in the direction normal to surfaces of constant $f$. The gradient is used in the following applications:

1. Identify the magnitude and direction of the maximum rate of change of a function at a given point.
2. Calculate the directional derivative at a point in a direction specified by a unit vector.
3. Find the normal to a surface at a specified point.
4. Determine the tangent plane to a surface at a given point.
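$$\int_a^b f(x)\,dx = \lim_{N \to \infty}\left[\sum_{n=1}^{N} f(a + n\,\Delta x_N)\,\Delta x_N\right]. \tag{2.1}$$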
Example. The integral of $f(x) = x$ between $x = a$ and $x = b$ is calculated by first constructing the Riemann sum. For this function, we have that

$$f(a + n\,\Delta x_N) = a + n\,\Delta x_N, \tag{2.2}$$

so the area corresponding to each strip is $(a + n\,\Delta x_N)\,\Delta x_N$. Hence, the definition in Eq. (2.1) reduces to

$$\int_a^b x\,dx = \lim_{N \to \infty}\left[\sum_{n=1}^{N} (a + n\,\Delta x_N)\,\Delta x_N\right]. \tag{2.3}$$

With $\Delta x_N = (b - a)/N$, this becomes

$$\int_a^b x\,dx = \lim_{N \to \infty}\left[\sum_{n=1}^{N}\left(a + n\,\frac{b - a}{N}\right)\frac{b - a}{N}\right] = \lim_{N \to \infty}\left\{\sum_{n=1}^{N}\left[a\left(\frac{b - a}{N}\right) + n\left(\frac{b - a}{N}\right)^2\right]\right\}. \tag{2.4}$$

Fig. 2.1: The approximation by Riemann sums (left panel) of the area between a curve and the $x$-axis (right panel).

We can break up the right-hand side of this equation into two separate sums. The first of these can be easily evaluated because there is no explicit $n$-dependence:

$$\sum_{n=1}^{N} a\left(\frac{b - a}{N}\right) = N a\left(\frac{b - a}{N}\right) = a(b - a). \tag{2.5}$$

The second sum,

$$\sum_{n=1}^{N} n\left(\frac{b - a}{N}\right)^2 = \left(\frac{b - a}{N}\right)^2 \sum_{n=1}^{N} n, \tag{2.6}$$

is evaluated by using the arithmetic series

$$\sum_{n=1}^{N} n = \tfrac{1}{2} N(N + 1). \tag{2.7}$$

Thus,

$$\left(\frac{b - a}{N}\right)^2 \sum_{n=1}^{N} n = \tfrac{1}{2}(b - a)^2\,\frac{N + 1}{N}. \tag{2.8}$$

Combining these summations and taking the limit $N \to \infty$ allows us to evaluate the integral:

$$\int_a^b x\,dx = a(b - a) + \tfrac{1}{2}(b - a)^2 \underbrace{\lim_{N \to \infty}\left(\frac{N + 1}{N}\right)}_{=1} = \tfrac{1}{2}(b^2 - a^2). \tag{2.9}$$
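The Riemann-sum calculation above lends itself to a direct numerical check. The following sketch is an illustration rather than part of the original derivation: the function names and the limits $a = 1$, $b = 3$ are arbitrary assumptions. It compares the explicit sum over strips with the finite-$N$ expression $a(b - a) + \tfrac{1}{2}(b - a)^2 (N + 1)/N$ that appears just before the limit is taken.

```python
def riemann_sum(f, a, b, n_strips):
    """Right-endpoint Riemann sum: sum of f(a + n*dx) * dx for n = 1..N."""
    dx = (b - a) / n_strips
    return sum(f(a + n * dx) * dx for n in range(1, n_strips + 1))


def closed_form(a, b, n_strips):
    """Finite-N value of the sum for f(x) = x, before the limit N -> infinity."""
    return a * (b - a) + 0.5 * (b - a) ** 2 * (n_strips + 1) / n_strips


a, b = 1.0, 3.0  # arbitrary limits chosen for illustration
for n_strips in (10, 100, 1000):
    # Both columns approach (b^2 - a^2)/2 as the number of strips grows.
    print(n_strips, riemann_sum(lambda x: x, a, b, n_strips), closed_form(a, b, n_strips))
```

The two columns agree for every $N$, and the excess over the exact value shrinks in proportion to $1/N$, as the factor $(N + 1)/N$ suggests.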
$$\int_a^b f(x)\,dx = F(b) - F(a), \tag{2.10}$$
The function $F$, whose derivative is equal to $f$, is called the anti-derivative or the primitive function of $f$. Note the structure of the Fundamental Theorem. The integral of $f$ is an expression that involves the values of $f$ at every point within the interval $(a, b)$. But the evaluation of this integral with the primitive function $F$ of $f$ requires the values of $F$ only at the endpoints $a$ and $b$ of this interval. The basic theorems of vector calculus will be seen to have an analogous structure. A proof of the Fundamental Theorem of Calculus is given in the last section of this chapter.

Example. Consider the integral

$$\int_a^b x\,dx, \tag{2.12}$$

which was evaluated in the preceding section using Riemann sums. To use the Fundamental Theorem of Calculus, we first identify the primitive function $F$ of $x$ as

$$F(x) = \tfrac{1}{2}x^2 + A, \tag{2.13}$$

where $A$ is a constant (called a constant of integration). Then, the value of this integral is

$$\int_a^b x\,dx = \left[\tfrac{1}{2}x^2 + A\right]_a^b = \tfrac{1}{2}(b^2 - a^2). \tag{2.14}$$

Note that the constant $A$ makes no contribution to the value of the integral. Thus, for the purposes of evaluating definite integrals, constants of integration can be omitted from the primitive function $F$.
The Fundamental Theorem of Calculus enables a number of important properties of integrals to be obtained. Higher-dimensional versions of this theorem form one of the major themes of this course. The following properties of definite integrals are implied by the Fundamental Theorem:

$$\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx, \tag{2.15}$$

$$\frac{d}{dx}\int_a^x f(s)\,ds = f(x), \tag{2.16}$$

$$\frac{d}{dx}\int_x^b f(s)\,ds = -f(x), \tag{2.17}$$

$$\frac{d}{dx}\int_a^{u(x)} f(s)\,ds = \frac{du}{dx}\,f(u(x)). \tag{2.18}$$
$$\int_0^1 \frac{dx}{\sqrt{1 - x^2}}. \tag{2.19}$$

This is a standard example of an integral whose evaluation benefits from a change of variables, in this case based on trigonometric functions. We define a new variable of integration $\theta$ through

$$x = \sin\theta. \tag{2.20}$$

To transform the integral, we must consider the effect of this transformation on the integrand, the integration element, and the limits of integration. By using the identity $\sin^2\theta + \cos^2\theta = 1$, the integrand becomes

$$\frac{1}{\sqrt{1 - x^2}} = \frac{1}{\sqrt{1 - \sin^2\theta}} = \frac{1}{\cos\theta}. \tag{2.21}$$

The integration element is calculated by applying the chain rule to Eq. (2.20):

$$dx = \cos\theta\,d\theta. \tag{2.22}$$

Lastly, the new limits of integration are determined by identifying the values of $\theta$ for which $\sin\theta$ is 0 for the lower limit and 1 for the upper limit. These are identified as

$$\sin(0) = 0, \qquad \sin(\tfrac{1}{2}\pi) = 1. \tag{2.23}$$

Thus, the original integral is transformed to

$$\int_0^1 \frac{dx}{\sqrt{1 - x^2}} = \int_0^{\frac{1}{2}\pi} d\theta, \tag{2.24}$$

the right-hand side of which is easily evaluated, and we obtain

$$\int_0^1 \frac{dx}{\sqrt{1 - x^2}} = \tfrac{1}{2}\pi. \tag{2.25}$$
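The benefit of the substitution is also visible numerically. In the sketch below, which is an illustration under the assumption that a plain midpoint rule is an acceptable quadrature, the original integrand is singular at $x = 1$ and converges slowly, whereas the transformed integrand is constant on $0 \le \theta < \tfrac{1}{2}\pi$ and is integrated essentially exactly with very few strips. The function names and strip counts are hypothetical choices.

```python
import math


def midpoint_rule(f, a, b, n):
    """Midpoint-rule approximation of the integral of f over [a, b] with n strips."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h


def original_integrand(x):
    # 1 / sqrt(1 - x^2), which diverges as x -> 1.
    return 1.0 / math.sqrt(1.0 - x * x)


def transformed_integrand(theta):
    # f(x(theta)) * dx/dtheta with x = sin(theta): (1 / cos(theta)) * cos(theta) = 1.
    return original_integrand(math.sin(theta)) * math.cos(theta)


direct = midpoint_rule(original_integrand, 0.0, 1.0, 100000)
substituted = midpoint_rule(transformed_integrand, 0.0, math.pi / 2.0, 100)

print(direct, substituted, math.pi / 2.0)
```

Even with a thousand times more strips, the direct evaluation is far less accurate than the transformed one, which mirrors the analytic simplification achieved by the change of variables.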
This example illustrates the power of variable transformations: what seemed a difficult evaluation has been transformed, through a judicious choice of a new integration variable, to a much simpler expression. An appropriate transformation is sometimes apparent from the integral itself, as in this example, but may involve an element of trial and error. Most modern computational mathematics software, e.g. Maple and Mathematica, performs a series of transformations to determine the simplest form of an integral.

We can now formulate in general terms the transformation of an integral

$$\int_a^b f(x)\,dx \tag{2.26}$$

under the change of variables $x \to t(x)$. The integrand becomes

$$f(x) = f(x(t)), \tag{2.27}$$

where $x(t)$ is obtained from the inverse of $t(x)$. In the example above, the change of variables was defined in this form. The integration element is transformed to

$$dx = \frac{dx}{dt}\,dt, \tag{2.28}$$

and the limits of integration are now $t(a)$ and $t(b)$. Hence, the general form of a change of variables in an integral is

$$\int_a^b f(x)\,dx = \int_{t(a)}^{t(b)} f(x(t))\,\frac{dx}{dt}\,dt. \tag{2.29}$$
The choice of transformation is usually dictated by the requirement that the primitive function of the transformed integrand, $f(x(t))\,(dx/dt)$, is easier to determine than that of the original function. The quantity $dx/dt$ represents the change in the density of integration points induced by the change of variables. This is a key quantity that arises whenever integration variables are changed and will be encountered again when we discuss coordinate transformations in two and three dimensions.

For one-dimensional integrals there is an alternative interpretation of the transformation of the integration element. If we regard $x(t)$ as the position of a particle as a function of time, then $dx$ represents the differential change in the position during a time interval $dt$:

$$dx(t) = x(t + dt) - x(t). \tag{2.30}$$

Intuitively, we know that $dx$ is given by the instantaneous velocity $v(t)$ multiplied by the time interval $dt$: $dx(t) = v(t)\,dt$. But $v(t) = dx/dt$, so

$$dx(t) = \frac{dx}{dt}\,dt, \tag{2.31}$$

which is the same as Eq. (2.28). A somewhat more formal argument goes as follows. We may expand the function $f(x + dx)$ in the vicinity of $x$ by using the Taylor series:

$$f(x + dx) = f(x) + dx\,f'(x) + \tfrac{1}{2}\,dx^2 f''(x) + \ldots = f(x) + dx\,f'(x) + O(dx^2), \tag{2.32}$$

where $O(y)$ denotes an error of order $y$. However, $f(x + dx) - f(x)$ is just a small increment of the function $f$, which can be denoted by $df$. Hence, for small but finite $dx$ we have, to first order, $df = f'(x)\,dx$. Note that $df$ and $dx$ are called differential changes of $f$ and $x$, respectively, while $f'(x) = df(x)/dx$ denotes the derivative of the function $f$ with respect to the variable $x$.

The following indefinite integrals, which can be derived from the basic formulae in Eqs. (1.7)-(1.11), will be used throughout this course:

$$\int x^n\,dx = \frac{1}{n + 1}\,x^{n+1} + A, \tag{2.33}$$

$$\int \cos x\,dx = \sin x + A,$$

$$\int e^x\,dx = e^x + A,$$

$$\int \frac{1}{x}\,dx = \ln x + A,$$

$$\int \sin x\,\cos^n x\,dx = -\frac{1}{n + 1}\cos^{n+1} x + A,$$

$$\int x^2 e^{-x}\,dx = -(x^2 + 2x + 2)\,e^{-x} + A,$$

where $n$ is a positive integer and $A$ is a constant of integration that is eliminated once the integrals are evaluated between specific upper and lower limits. As guaranteed by the Fundamental Theorem, the derivative of the right-hand side of each of these expressions yields the integrand on the corresponding left-hand side.
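[Figure not reproduced. It shows a function between $x = a$ and $x = b$, with the values $f(a)$ and $f(b)$ marked, together with a line of slope $[f(b) - f(a)]/(b - a)$.]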
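This is shown by the emboldened line in the figure above. To use the Mean Value Theorem to prove the Fundamental Theorem of Calculus, we define the function $F$ by

$$F(x) = \int_a^x f(t)\,dt$$

for some function $f$ and $a \le x \le b$. We will first show that $F$ is a differentiable function where $f$ is continuous. Using the definition in Eq. (1.1), we write the derivative of $F$ as

$$\begin{aligned}
\frac{dF}{dx} &= \lim_{\Delta x \to 0}\frac{F(x + \Delta x) - F(x)}{\Delta x} \\
&= \lim_{\Delta x \to 0}\frac{1}{\Delta x}\left[\int_a^{x + \Delta x} f(t)\,dt - \int_a^x f(t)\,dt\right] \\
&= \lim_{\Delta x \to 0}\frac{1}{\Delta x}\left[\int_a^x f(t)\,dt + \int_x^{x + \Delta x} f(t)\,dt - \int_a^x f(t)\,dt\right] \\
&= \lim_{\Delta x \to 0}\frac{1}{\Delta x}\int_x^{x + \Delta x} f(t)\,dt.
\end{aligned}$$

The integral in the last line of this equation can be approximated by the area of a strip of height $f(x)$ and width $\Delta x$, with a correction of order $(\Delta x)^2$:

$$\int_x^{x + \Delta x} f(t)\,dt = f(x)\,\Delta x + O(\Delta x^2).$$

Hence, upon substitution of this expression into the definition of the derivative of $F$, we obtain

$$\begin{aligned}
\frac{dF}{dx} &= \lim_{\Delta x \to 0}\frac{1}{\Delta x}\left[f(x)\,\Delta x + O(\Delta x^2)\right] \\
&= \lim_{\Delta x \to 0}\left[f(x) + O(\Delta x)\right] \\
&= f(x),
\end{aligned}$$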
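The last step of this argument, that the strip integral divided by its width tends to $f(x)$, can be illustrated numerically. The sketch below is an assumption-laden example rather than part of the proof: it takes $f = \cos$, evaluates the strip integral with a midpoint rule, and uses hypothetical function names and an arbitrary point $x = 0.7$.

```python
import math


def integral(f, a, b, n=200):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h


def dF_dx(f, x, dx):
    """(1/dx) times the integral of f over [x, x + dx].

    This is the last line of the derivation before the limit dx -> 0 is taken.
    """
    return integral(f, x, x + dx) / dx


x = 0.7  # arbitrary sample point
for dx in (1e-1, 1e-2, 1e-3, 1e-4):
    # The difference from f(x) shrinks roughly in proportion to dx.
    print(dx, dF_dx(math.cos, x, dx), math.cos(x))
```

The printed difference between the two columns decreases linearly with $\Delta x$, in line with the $O(\Delta x)$ correction in the derivation.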
which demonstrates that the derivative of $F$ exists for every point $x$ where $f$ is continuous. In particular, if $f$ is a continuous function on $[a, b]$, then $F$ is differentiable at every point in that interval. Thus, consider the partition of $[a, b]$ into $N$ intervals $x_{n-1} \le x \le x_n$, where $x_0 = a$ and $x_N = b$:

$$x_n = a + n\left(\frac{b - a}{N}\right), \tag{2.41}$$

where $n = 1, 2, \ldots, N$. This is shown schematically in the following diagram:

[Diagram not reproduced.]

We now use the Mean Value Theorem to choose a point $t_n$ within the $n$th interval that satisfies

$$F(x_n) - F(x_{n-1}) = (x_n - x_{n-1})\,F'(t_n) = (x_n - x_{n-1})\,f(t_n).$$

Then

$$F(b) - F(a) = \sum_{n=1}^{N}\left[F(x_n) - F(x_{n-1})\right] = \sum_{n=1}^{N} f(t_n)\,\Delta x_n,$$

where $\Delta x_n = x_n - x_{n-1}$. The right-hand side of this equation is represented by the shaded area in the right panel of the figure above and is seen to be the same basic construction as that used for Riemann sums. Accordingly, if we now take the limit $N \to \infty$, this approaches the area under the curve, and we have

$$F(b) - F(a) = \int_a^b f(t)\,dt,$$

which is the Fundamental Theorem of Calculus.
2.4 Summary
This chapter has reviewed the main results of single integral calculus. The evaluation of integrals in higher dimensions, which are introduced in the following chapters, always reduces to the type of one-dimensional evaluations discussed here. Moreover, the Fundamental Theorem of Calculus will have analogues in higher spatial dimensions.
$$\iint_A f(x, y)\,dx\,dy. \tag{3.1}$$
Fig. 3.1: Geometric representation of a double integral of a function f (x, y), represented as the surface z = f (x, y) (shown shaded). The integral is bounded by integration range A in the x-y plane, the corresponding region mapped onto the surface, and the vertical extensions of the boundaries of A to f . This interpretation is analogous to that in Fig. 2.1 for one-dimensional integrals.
Double integrals have a geometrical interpretation that is analogous to that of one-dimensional integrals. As shown in Fig. 3.1, the function $f(x, y)$ can be represented as the surface $z = f(x, y)$ in three-dimensional space. The region $A$ in the $x$-$y$ plane is mapped to the region $f(A)$. The integral in Eq. (3.1) therefore corresponds to the volume bounded by the $x$-$y$ plane, the surface $f(A)$, and the boundaries of $A$ extended vertically to $f$. There are several important points to note about double integrals:

1. Once the area $A$ is specified, the integral has a unique value.
2. The integrations over $x$ and $y$ can be carried out in any order.
3. For the special case $f(x, y) = 1$, the volume is that of a cylinder of unit height with base area $A$. The value of the integral is, therefore, the area of $A$.

Example. Suppose $f = x^2 y$ and $A$ is the rectangular region shown in Fig. 3.2. The evaluation of the double integral

$$\iint_A x^2 y\,dx\,dy \tag{3.2}$$
Fig. 3.2: The integration region $A$, shown shaded, for the double integral in Eq. (3.2).
proceeds by first determining the ranges of $x$ and $y$ for the coordinates of every point within $A$. Since the boundaries of $A$ are parallel to the $x$- and $y$-axes, these ranges are readily determined as

$$1 \le x \le 3, \qquad 1 \le y \le 2. \tag{3.3}$$

The double integral to be evaluated is

$$\int_1^3 dx \int_1^2 dy\; x^2 y = \int_1^3 x^2\,dx \int_1^2 y\,dy. \tag{3.4}$$

The original integral has thereby been reduced to two one-dimensional integrals. This is a general feature of multiple integrals: their evaluation always reduces to a sequence of one-dimensional integrals. The final step is the evaluation of the integrals in Eq. (3.4):

$$\int_1^3 x^2\,dx = \left[\tfrac{1}{3}x^3\right]_1^3 = 9 - \tfrac{1}{3} = \tfrac{26}{3}, \tag{3.5}$$

$$\int_1^2 y\,dy = \left[\tfrac{1}{2}y^2\right]_1^2 = 2 - \tfrac{1}{2} = \tfrac{3}{2}, \tag{3.6}$$

so the value of the double integral is $\tfrac{26}{3} \times \tfrac{3}{2} = 13$.
Double integrals over regions bounded by lines that are parallel to the coordinate axes are especially straightforward to evaluate because the ranges of x and y are independent of one another. But, as the following example shows, this is not always the case. Example. Suppose that, in the double y integral in Eq. (3.2), the integration region A is the triangle shown in Fig. 3.3. Identi2 fying the ranges of x and y for this region requires a dierent procedure from that in Fig. 3.2. There are two ways that this integral can be done: by carrying out the yintegration rst followed by the x-integration, and by carrying out the x-integration rst followed by the y-integration. x 1 Method I. If x is allowed to range over the interval 0 x 1 then, as shown in Fig. 3.3: The integration region A, Fig. 3.4(a), the values of y corresponding to shown shaded, for the double integral in a particular value of x must lie in the range Eq. (3.2). 0 y 2x because A is bounded from below by the x-axis and from above by the line y = 2x. Thus, the double integral over A is written as R 1 R 2x R1 R 2x dx 0 dy x2 y = 0 x2 dx 0 y dy = 0 R 1 R 2x x2 0 y dy dx . (3.8) 0
Notice that, because the upper limit of the y-integration is a function of $x$, this integral must be performed before the integral over $x$. Such a multiple integral is called an iterated integral. In effect, the double integral over $A$ has been represented as an integral over each vertical strip that runs parallel to the y-axis within $A$, followed by an integral over all of these strips. Accordingly, the integration over $y$ yields
$$\int_0^{2x} y \, dy = \frac{1}{2} y^2 \Big|_0^{2x} = 2x^2. \tag{3.9}$$
The integral over $x$ can now be carried out, and we obtain
$$2 \int_0^1 x^4 \, dx = \frac{2}{5} x^5 \Big|_0^1 = \frac{2}{5}. \tag{3.10}$$
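The strip picture of Method I translates directly into nested one-dimensional integrals, with the inner limit depending on the outer variable. The following is a minimal numerical sketch under that reading; the function names are illustrative choices, and the midpoint rule is just one convenient way of approximating each one-dimensional integral.

```python
def midpoint(g, a, b, n=300):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def f(x, y):
    return x**2 * y

def strip(x):
    # Integral over the vertical strip at fixed x, where 0 <= y <= 2x.
    return midpoint(lambda y: f(x, y), 0.0, 2.0 * x)

# Sum over all strips, 0 <= x <= 1; the result should be close to 2/5.
method_one = midpoint(strip, 0.0, 1.0)
print(method_one)
```

The inner integral has to be evaluated inside the outer one because its upper limit changes from strip to strip.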
Method II. This integral can also be evaluated by performing the integration over $x$ first. Referring to Fig. 3.4(b), the range of $y$ is $0 \le y \le 2$. For a given value of $y$, the corresponding values of $x$ within $A$ are bounded from the left by $y = 2x$ and from the right by $x = 1$. Thus, the corresponding range of $x$ is $\tfrac{1}{2} y \le x \le 1$. The double integral is now written as
$$\int_0^2 dy \int_{y/2}^1 dx \, x^2 y = \int_0^2 y \, dy \int_{y/2}^1 x^2 \, dx = \int_0^2 \left( y \int_{y/2}^1 x^2 \, dx \right) dy. \tag{3.11}$$

Fig. 3.4: Two ways of setting up the ranges of integration for the region shown in Fig. 3.3. (a) The allowed values of y for a given value of x in the range $0 \le x \le 1$. (b) The allowed values of x for a given value of y in the range $0 \le y \le 2$.
The integration over $x$ must now be performed first, with the result
$$\int_{y/2}^1 x^2 \, dx = \frac{1}{3} x^3 \Big|_{y/2}^1 = \frac{1}{3} - \frac{1}{24} y^3. \tag{3.12}$$
The integration over $y$ then yields
$$\frac{1}{3} \int_0^2 y \, dy - \frac{1}{24} \int_0^2 y^4 \, dy = \frac{1}{6} y^2 \Big|_0^2 - \frac{1}{120} y^5 \Big|_0^2 = \frac{2}{3} - \frac{32}{120} = \frac{2}{5},$$
which agrees with the result in Eq. (3.10).

Example. We now evaluate the same function $f(x, y) = x^2 y$ as in the previous example, but now over the unshaded area in Fig. 3.3.

Method I. We allow the independent variable $x$ to vary such that $0 \le x \le 1$. The dependent variable $y$ then needs to vary between $2x$ and 2: $2x \le y \le 2$. Thus the double integral over the new area $A$ is given by
$$\int_0^1 dx \int_{2x}^2 dy \, x^2 y = \int_0^1 x^2 \, dx \int_{2x}^2 y \, dy = \int_0^1 x^2 \left[ \frac{y^2}{2} \right]_{2x}^{2} dx = 2 \int_0^1 x^2 (1 - x^2) \, dx = \frac{4}{15}. \tag{3.13}$$
Method II. We now choose the independent variable to be $y$. Accordingly, as the figure shows, we allow $y$ to vary such that $0 \le y \le 2$. The dependent variable is $x$. Since we are now integrating over the unshaded triangle, we need to allow $x$ to vary between 0 and $y/2$: $0 \le x \le y/2$. Accordingly, the integral now reads
$$\int_0^2 dy \int_0^{y/2} dx \, x^2 y = \int_0^2 y \, dy \int_0^{y/2} x^2 \, dx = \int_0^2 y \left[ \frac{x^3}{3} \right]_0^{y/2} dy = \int_0^2 y \, \frac{y^3}{3 \cdot 2^3} \, dy = \frac{4}{15}. \tag{3.14}$$

Clearly the two methods yield the same result. However, the two examples result in different answers. This is not surprising when we note that what we are calculating is the volume that is determined by the x-y plane from below, the function $x^2 y$ from above, and the three planes whose intersection with the x-y plane is the outline of area $A$. If the function value changes above the area then, of course, so will the resulting volume.

The approach described in the preceding examples can be applied to any region bounded by straight line segments. In some cases, the same methods can be used for regions with circular boundaries. The following example shows how to calculate the area of a semi-circular region.

Example. Consider the integral
$$\iint_A dx \, dy, \tag{3.15}$$
where the area $A$, shown in Fig. 3.5, is bounded from below by the x-axis, and from above by the boundary of the circle $x^2 + y^2 = 1$. Because the integrand is unity, the value of this integral is equal to the area of $A$. We will evaluate this integral by performing the integral over $y$ first. The range of $x$ is $-1 \le x \le 1$. For a given value of $x$, the values of $y$ within $A$ are bounded from below by the x-axis and from above by the circular boundary. Thus, the range of $y$ is $0 \le y \le \sqrt{1 - x^2}$. The double integral is therefore written as
$$\int_{-1}^{1} dx \int_0^{\sqrt{1 - x^2}} dy. \tag{3.16}$$
Carrying out the integral over $y$ leaves
$$\int_{-1}^{1} \sqrt{1 - x^2} \, dx. \tag{3.17}$$

Fig. 3.5: The semi-circular region A, shown shaded, for the double integral in Eq. (3.15).
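This integral can be evaluated by trigonometric substitution. We set $x = \sin\theta$. Then,
$$\sqrt{1 - x^2} = \sqrt{1 - \sin^2\theta} = \cos\theta, \tag{3.18}$$
$$dx = \cos\theta \, d\theta, \tag{3.19}$$
and the limits of integration are transformed as
$$x = -1 \to \theta = -\tfrac{1}{2}\pi, \qquad x = 1 \to \theta = \tfrac{1}{2}\pi. \tag{3.20}$$
The transformed integral is
$$\int_{-1}^{1} \sqrt{1 - x^2} \, dx = \int_{-\pi/2}^{\pi/2} \cos^2\theta \, d\theta = \tfrac{1}{2}\pi, \tag{3.21}$$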
which is the area of the semi-circular region.

With the examples in this section as background, we can summarize the evaluation of double integrals over any region $A$ in the x-y plane by the two approaches illustrated in Fig. 3.6. In Fig. 3.6(a), the range of $x$ is $x_A \le x \le x_B$ and the corresponding range of $y$ at a particular value of $x$ is $u_1(x) \le y \le u_2(x)$, and the double integral is written as
$$\iint_A f(x, y) \, dx \, dy = \int_{x_A}^{x_B} dx \int_{u_1(x)}^{u_2(x)} dy \, f(x, y). \tag{3.22}$$

Fig. 3.6: The two methods of evaluating a double integral over a region in the x-y plane, shown shaded.
In Fig. 3.6(b), the range of $y$ is $y_A \le y \le y_B$ and the corresponding range of $x$ for a particular value of $y$ is $v_1(y) \le x \le v_2(y)$, and the double integral is
$$\iint_A f(x, y) \, dx \, dy = \int_{y_A}^{y_B} dy \int_{v_1(y)}^{v_2(y)} dx \, f(x, y). \tag{3.23}$$
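The two general prescriptions can be sketched as a pair of small routines that take the outer range and the two boundary functions as arguments. The names `integrate_x_outer` and `integrate_y_outer` are hypothetical, and the midpoint rule stands in for whatever one-dimensional method is preferred. As a test case, both orderings are applied to $f = x^2 y$ over the triangle bounded by the x-axis, the line $x = 1$, and the line $y = 2x$.

```python
def midpoint(g, a, b, n=200):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def integrate_x_outer(f, x_a, x_b, u1, u2):
    """Outer integral over x, inner integral over u1(x) <= y <= u2(x)."""
    return midpoint(lambda x: midpoint(lambda y: f(x, y), u1(x), u2(x)), x_a, x_b)

def integrate_y_outer(f, y_a, y_b, v1, v2):
    """Outer integral over y, inner integral over v1(y) <= x <= v2(y)."""
    return midpoint(lambda y: midpoint(lambda x: f(x, y), v1(y), v2(y)), y_a, y_b)

f = lambda x, y: x**2 * y

# Triangle with vertices (0, 0), (1, 0), (1, 2), described in both ways.
by_x = integrate_x_outer(f, 0.0, 1.0, lambda x: 0.0, lambda x: 2.0 * x)
by_y = integrate_y_outer(f, 0.0, 2.0, lambda y: y / 2.0, lambda y: 1.0)

print(by_x, by_y)  # both should be close to 2/5
```

Only the description of the boundary changes between the two calls; the integrand and the region are the same.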
Although these expressions indicate the order in which the integrals over $x$ and $y$ are to be carried out, the actual evaluation of these integrals may prove problematic for certain types of boundaries and integrands. For some common cases there are special methods available. We consider an example.

Example. Consider the integral
$$\iint_A e^{-x^2 - y^2} \, dx \, dy, \tag{3.24}$$
where the region $A$ is shown in Fig. 3.5. Integrals of this type arise in quantum mechanics and in the physics of random processes. We set up the integral using the same steps that lead to Eq. (3.16):
$$\int_{-1}^{1} dx \int_0^{\sqrt{1 - x^2}} dy \, e^{-x^2 - y^2} = \int_{-1}^{1} e^{-x^2} \, dx \int_0^{\sqrt{1 - x^2}} e^{-y^2} \, dy. \tag{3.25}$$
We arrive at an impasse because there is no explicit expression for the primitive function of $e^{-y^2}$. The problem is not the boundary, but the integrand. In fact, the semi-circular boundary provides the basis for an alternative way of writing this integral that enables it to be evaluated in a straightforward manner. This involves the transformation to a new coordinate system and will be discussed in the next section.
Fig. 3.7: Two ways of labelling the same point in the plane: (a) $(x, y)$ in a Cartesian coordinate system and (b) $(r, \phi)$ in a circular polar coordinate system.
include all points in the x-y plane. This coordinate system is conceptually simple and has natural extensions to higher dimensions. But there are other ways of labelling points that may be more suitable in particular circumstances. The basic idea of circular polar coordinates is to specify any point $(x, y)$ in terms of two new variables: (i) a radius $r$ that specifies the distance of the point from the origin, and (ii) an angular variable $\phi$, called the azimuthal angle, that specifies the angle of the radial vector with respect to some axis, by convention taken to be the positive x-axis, with the angle increasing from zero in the counterclockwise direction. These quantities are shown in Fig. 3.7(b). The variable $r$ is an inherently non-negative quantity so its range is
$$0 \le r < \infty. \tag{3.27}$$
The azimuthal angle must account for all orientations with respect to the positive x-axis while maintaining a unique labelling for all points, so its range is
$$0 \le \phi < 2\pi. \tag{3.28}$$
The relationship between the two coordinate systems can be determined from a standard trigonometric analysis:
$$x = r \cos\phi, \qquad y = r \sin\phi, \tag{3.29}$$
or, alternatively,
$$r = \sqrt{x^2 + y^2}, \qquad \phi = \tan^{-1}\left(\frac{y}{x}\right). \tag{3.30}$$
The restriction of the range of $\phi$ in Eq. (3.28) can now be understood as restricting the trigonometric functions in Eq. (3.29) to a single period.

The differences between the Cartesian and circular polar coordinates are best appreciated pictorially by plotting the coordinate curves, i.e. the curves where
one of the coordinates is held constant. The coordinate curves of the Cartesian coordinate system are straight lines parallel to the x and y axes, as shown in Fig. 3.8(a). For the circular polar coordinate system, the curves of constant $r$ are, from Eq. (3.30), concentric circles centered at the origin. The lines of constant azimuthal angle are straight lines through the origin that make an angle $\phi$ with respect to the positive x-axis. These are shown in Fig. 3.8(b).

Fig. 3.8: The coordinate curves of the (a) Cartesian and (b) circular polar coordinate system. Also shown is a circle to indicate how the polar coordinates provide a more natural description of such boundaries than the Cartesian coordinates.
Fig. 3.9: The steps used to calculate the integration element in circular polar coordinates. The area between the circles of radii $r$ and $r + dr$ is shown in the left panel. The fraction of this area contained between the azimuthal angles $\phi$ and $\phi + d\phi$ is shown in the right panel.
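Note the factor of $r$ multiplying $dr \, d\phi$. This results from the fact that, for a fixed angular increment $d\phi$, the arc length between $\phi$ and $\phi + d\phi$ is $r \, d\phi$. An alternative way of deriving the integration element is to first construct the vector $\mathbf{r} = x \, \mathbf{i} + y \, \mathbf{j}$ which, in terms of the variables in Eq. (3.29), is
$$\mathbf{r} = r \cos\phi \, \mathbf{i} + r \sin\phi \, \mathbf{j}. \tag{3.33}$$
We now calculate the vectors $d\mathbf{r}_r$ and $d\mathbf{r}_\phi$ resulting from the differential with respect to $r$ and $\phi$, respectively:
$$d\mathbf{r}_r = dr \cos\phi \, \mathbf{i} + dr \sin\phi \, \mathbf{j}, \tag{3.34}$$
$$d\mathbf{r}_\phi = -r \sin\phi \, d\phi \, \mathbf{i} + r \cos\phi \, d\phi \, \mathbf{j}. \tag{3.35}$$
These vectors are orthogonal,
$$d\mathbf{r}_r \cdot d\mathbf{r}_\phi = (dr \cos\phi \, \mathbf{i} + dr \sin\phi \, \mathbf{j}) \cdot (-r \sin\phi \, d\phi \, \mathbf{i} + r \cos\phi \, d\phi \, \mathbf{j}) = 0, \tag{3.36}$$
so the area defined by these vectors is obtained by multiplying their magnitudes:
$$dA = |d\mathbf{r}_r| \, |d\mathbf{r}_\phi| = dr \cdot r \, d\phi = r \, dr \, d\phi. \tag{3.37}$$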
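This procedure can be generalized to any coordinate transformation given in terms of new variables $u$ and $v$: $x = x(u, v)$ and $y = y(u, v)$. The vector $\mathbf{r}$ is
$$\mathbf{r} = x(u, v) \, \mathbf{i} + y(u, v) \, \mathbf{j}, \tag{3.38}$$
and the differential changes to $u$ and $v$ yield the vectors
$$d\mathbf{r}_u = \frac{\partial x}{\partial u} du \, \mathbf{i} + \frac{\partial y}{\partial u} du \, \mathbf{j}, \tag{3.39}$$
$$d\mathbf{r}_v = \frac{\partial x}{\partial v} dv \, \mathbf{i} + \frac{\partial y}{\partial v} dv \, \mathbf{j}. \tag{3.40}$$
Since these vectors are not necessarily orthogonal, we must calculate the area from their cross product, which we write as
$$dA = |d\mathbf{r}_u \times d\mathbf{r}_v| = \left| \, \begin{vmatrix} \dfrac{\partial x}{\partial u} du & \dfrac{\partial y}{\partial u} du \\[2mm] \dfrac{\partial x}{\partial v} dv & \dfrac{\partial y}{\partial v} dv \end{vmatrix} \, \right| = \left| \, \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial y}{\partial u} \\[2mm] \dfrac{\partial x}{\partial v} & \dfrac{\partial y}{\partial v} \end{vmatrix} \, \right| du \, dv. \tag{3.41}$$
The determinant on the right-hand side of this equation is called the Jacobian and denoted by $J(u, v)$:
$$J(u, v) = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial y}{\partial u} \\[2mm] \dfrac{\partial x}{\partial v} & \dfrac{\partial y}{\partial v} \end{vmatrix}. \tag{3.42}$$
The integration element is then
$$dA = |J(u, v)| \, du \, dv, \tag{3.43}$$
where the absolute value is taken because $dA$ is an inherently positive quantity, while the sign of $J$ can be changed simply by interchanging the two vectors in the cross product. The Jacobian provides the weight of the integration elements as a function of position. In the case of circular polar coordinates, the Jacobian factor indicates that the weight associated with an element $dr \, d\phi$ increases linearly with $r$:
$$J(r, \phi) = \begin{vmatrix} \cos\phi & \sin\phi \\ -r \sin\phi & r \cos\phi \end{vmatrix} = r (\cos^2\phi + \sin^2\phi) = r, \tag{3.44}$$
as in Eq. (3.37).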
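The Jacobian determinant can also be estimated numerically, which is a useful check when a transformation is too awkward to differentiate by hand. The sketch below approximates the four partial derivatives by central differences and evaluates the determinant for circular polar coordinates. The function name `jacobian`, the step size, and the sample point are all illustrative assumptions.

```python
import math

def jacobian(x, y, u, v, h=1e-6):
    """Central-difference estimate of the determinant
    (dx/du)(dy/dv) - (dy/du)(dx/dv) at the point (u, v)."""
    dx_du = (x(u + h, v) - x(u - h, v)) / (2 * h)
    dy_du = (y(u + h, v) - y(u - h, v)) / (2 * h)
    dx_dv = (x(u, v + h) - x(u, v - h)) / (2 * h)
    dy_dv = (y(u, v + h) - y(u, v - h)) / (2 * h)
    return dx_du * dy_dv - dy_du * dx_dv

def x_polar(r, phi):
    return r * math.cos(phi)

def y_polar(r, phi):
    return r * math.sin(phi)

r, phi = 2.0, 0.7  # arbitrary sample point
j_polar = jacobian(x_polar, y_polar, r, phi)
print(j_polar)  # should be close to r
```

Repeating the call at other sample points shows the same behaviour: the estimate tracks $r$ and does not depend on the angle.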
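where the region $A$ is shown in Fig. 3.5. In circular polar coordinates in Eq. (3.29), the integrand becomes
$$e^{-x^2 - y^2} = e^{-r^2}, \tag{3.45}$$
and the integration element becomes $r \, dr \, d\phi$. There remains only the specification of the ranges of $r$ and $\phi$. For the semi-circular region in Fig. 3.5,
$$0 \le r \le 1, \qquad 0 \le \phi \le \pi, \tag{3.46}$$
where the restriction on the range of $\phi$ results from the fact that the integration region is the upper half-circle. The integral to be evaluated is
$$\int_0^{\pi} d\phi \int_0^1 r e^{-r^2} \, dr. \tag{3.47}$$
The integral over $\phi$ gives $\pi$, and the radial integral is elementary because $r \, dr = \tfrac{1}{2} d(r^2)$:
$$\int_0^{\pi} d\phi \int_0^1 r e^{-r^2} \, dr = \pi \left[ -\tfrac{1}{2} e^{-r^2} \right]_0^1 = \tfrac{1}{2}\pi \left( 1 - e^{-1} \right). \tag{3.48}$$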
Thus, the transformation into circular polar coordinates has enabled us to evaluate an integral that was intractable in Cartesian coordinates.

Example. Consider the integral of $f(x, y) = 2x + 4y^2$ between the circles $x^2 + y^2 = 1$ and $x^2 + y^2 = 4$. In circular polar coordinates, Eq. (3.29), $f$ is
$$2x + 4y^2 = 2r \cos\phi + 4r^2 \sin^2\phi. \tag{3.49}$$
The range of $r$ is restricted by the radii of the bounding circles, $1 \le r \le 2$, and the range of $\phi$ is $0 \le \phi < 2\pi$ to account for the entire region between the circles. Hence, the integral to be evaluated is
$$\int_1^2 r \, dr \int_0^{2\pi} d\phi \, (2r \cos\phi + 4r^2 \sin^2\phi) = 2 \int_1^2 r^2 \, dr \underbrace{\int_0^{2\pi} \cos\phi \, d\phi}_{=0} + 4 \int_1^2 r^3 \, dr \underbrace{\int_0^{2\pi} \sin^2\phi \, d\phi}_{=\pi} = 4\pi \int_1^2 r^3 \, dr = \pi r^4 \Big|_1^2 = 15\pi. \tag{3.50}$$
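Integrals in circular polar coordinates can be checked numerically by summing the integrand, weighted by the Jacobian factor $r$, over a grid in $r$ and the azimuthal angle. The routine below is a minimal sketch with a hypothetical name, `polar_integral`, and an arbitrarily chosen grid size. It is applied to $e^{-x^2 - y^2}$ over the upper half of the unit disc and to $2x + 4y^2$ over the annulus between radii 1 and 2.

```python
import math

def polar_integral(f, r_min, r_max, phi_min, phi_max, n_r=200, n_phi=200):
    """Midpoint-rule estimate of the integral of f(x, y) r dr dphi."""
    dr = (r_max - r_min) / n_r
    dphi = (phi_max - phi_min) / n_phi
    total = 0.0
    for i in range(n_r):
        r = r_min + (i + 0.5) * dr
        for j in range(n_phi):
            phi = phi_min + (j + 0.5) * dphi
            # The factor r is the Jacobian of the polar transformation.
            total += f(r * math.cos(phi), r * math.sin(phi)) * r
    return total * dr * dphi

half_disc = polar_integral(lambda x, y: math.exp(-x * x - y * y),
                           0.0, 1.0, 0.0, math.pi)
annulus = polar_integral(lambda x, y: 2 * x + 4 * y * y,
                         1.0, 2.0, 0.0, 2 * math.pi)

print(half_disc, 0.5 * math.pi * (1 - math.exp(-1)))
print(annulus, 15 * math.pi)
```

Dropping the factor `r` from the accumulation line gives visibly different numbers, which is a quick way to see what the Jacobian contributes.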
Fig. 3.10: The most common types of region, shown shaded, used for integrations in circular polar coordinates.
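The most common regions over which integrations are carried out in circular polar coordinates are shown in Fig. 3.10. Figure 3.10(a) represents the interior of a circle of radius $R$. The ranges of $r$ and $\phi$ are
$$0 \le r \le R, \qquad 0 \le \phi < 2\pi. \tag{3.51}$$
For the interior of the wedge-shaped region in Fig. 3.10(b), we have
$$0 \le r \le R, \qquad 0 \le \phi \le \Phi, \tag{3.52}$$
where $\Phi$ is the angle of the wedge. Finally, for the region in Fig. 3.10(c), which is a partial annular region,
$$R_1 \le r \le R_2, \qquad \phi_1 \le \phi \le \phi_2. \tag{3.53}$$
In each case, the double integral is transformed as
$$\iint_A f(x, y) \, dx \, dy = \iint_{A'} f(r \cos\phi, r \sin\phi) \, r \, dr \, d\phi, \tag{3.54}$$
with $A'$ denoting the region $A$ expressed in circular polar coordinates.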
Although regions of the type in Fig. 3.10 are the most natural for circular polar coordinates, the following example shows that integrals over areas with straight boundaries can also be carried out in this coordinate system.

Example. Consider the area shown in Fig. 3.11. The azimuthal angle is seen to range between $\tan^{-1}(1) = \tfrac{1}{4}\pi$ and $\tan^{-1}(-1) = \tfrac{3}{4}\pi$:
$$\tfrac{1}{4}\pi \le \phi \le \tfrac{3}{4}\pi. \tag{3.55}$$
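The lower bound of $r$ is $r = 0$. The upper bound is determined by writing the upper boundary of the triangle, $y = 1$, in circular polar coordinates. Given that $y = r \sin\phi$, this boundary is $r \sin\phi = 1$. The range of $r$ is therefore given by
$$0 \le r \le \frac{1}{\sin\phi}, \tag{3.56}$$
and the area integral is
$$\int_{\pi/4}^{3\pi/4} d\phi \int_0^{1/\sin\phi} r \, dr. \tag{3.57}$$
The radial integral is
$$\int_0^{1/\sin\phi} r \, dr = \frac{1}{2 \sin^2\phi}, \tag{3.58}$$
which leaves
$$\frac{1}{2} \int_{\pi/4}^{3\pi/4} \frac{d\phi}{\sin^2\phi}. \tag{3.59}$$
The integrand is recognized from the derivative of $\cot x$:
$$\frac{d}{dx} \cot x = -\frac{1}{\sin^2 x}. \tag{3.60}$$
Hence,
$$\frac{1}{2} \int_{\pi/4}^{3\pi/4} \frac{d\phi}{\sin^2\phi} = \frac{1}{2} \Big( -\cot x \Big) \Big|_{\pi/4}^{3\pi/4} = \frac{1}{2} (1 + 1) = 1. \tag{3.61}$$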
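The same area can be recovered numerically by nesting the radial integral, whose upper limit $1/\sin\phi$ depends on the angle, inside the angular integral. This is a minimal sketch; the helper names are illustrative, and the midpoint rule is simply a convenient choice.

```python
import math

def midpoint(g, a, b, n=300):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def radial(phi):
    # Inner integral of r dr from 0 to the boundary r = 1 / sin(phi),
    # which is the line y = 1 written in polar coordinates.
    return midpoint(lambda r: r, 0.0, 1.0 / math.sin(phi))

# Outer integral over the azimuthal angle from pi/4 to 3*pi/4.
area = midpoint(radial, math.pi / 4, 3 * math.pi / 4)
print(area)  # should be close to 1
```

The value agrees with elementary geometry: the region is a triangle with base 2 and height 1.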
3.3 Summary
The double integral of a function $f(x, y)$ represents the volume under the surface of that function within a specified region in the x-y plane. This extends to functions of two independent variables the discussion in Sec. 2 of one-dimensional integrals. The change of variables from Cartesian to another system of coordinates, such as circular polar coordinates, introduces a term, called the Jacobian, into the integral. The Jacobian is the higher-dimensional analogue of the term obtained by applying the chain rule to the integration element in one-dimensional integrals. It takes into account the changes of the element of integration area across the x-y plane. The evaluation of double integrals proceeds by the successive evaluation of a sequence of one-dimensional integrals.
Triple Integrals

Following our discussion of double integrals, there are several points to note about triple integrals:

1. Once the volume $V$ has been specified, the integral has a unique value.
2. The integrals over $x$, $y$, and $z$ can be carried out in any order.
3. If $f = 1$, the integral yields the volume of the integration region:
$$\iiint_V dx \, dy \, dz = V. \tag{4.2}$$
The evaluation of triple integrals proceeds in direct analogy to the cases described in Chapter 3 for double integrals. The following examples illustrate the different situations that can arise.

Example. Suppose $f = xyz$, and that $V$ is the volume shown in Fig. 4.1. We must first determine the ranges of the integration variables. The volume $V$ is a cube in the positive octant of space with one corner at the origin. The points $(x, y, z)$ within the cube have coordinates within the ranges
$$0 \le x \le 1, \qquad 0 \le y \le 1, \qquad 0 \le z \le 1. \tag{4.3}$$

Fig. 4.1: The cubic region for the triple integral in Eq. (4.4).

The triple integral of $f = xyz$ within the cube in Fig. 4.1 is therefore calculated as
$$\iiint_V f(x, y, z) \, dx \, dy \, dz = \underbrace{\int_0^1 x \, dx}_{\frac{1}{2}} \, \underbrace{\int_0^1 y \, dy}_{\frac{1}{2}} \, \underbrace{\int_0^1 z \, dz}_{\frac{1}{2}} = \frac{1}{8}. \tag{4.4}$$
This type of region, where the ranges of $x$, $y$, and $z$ are specified independently, is the simplest for triple integrals. The most general volume of this type is a rectangular prism aligned with the coordinate axes, where each side is a rectangle parallel to one of the coordinate planes. The next two examples have volumes which do not satisfy these criteria, with the result that the triple integrals become iterated integrals.

Example. Suppose that $f = xyz$, as in the preceding example, and $V$ is the wedge shown in Fig. 4.2. We first determine the ranges of the integration variables. The wedge is bounded from above by the plane $y - z = 0$, with all other bounding planes lying parallel to coordinate planes. Thus, the range of $x$ is
$$0 \le x \le 1. \tag{4.5}$$

Fig. 4.2: The volume for the triple integral in Eq. (4.9).
The triangular sides of the wedge are parallel to the plane $x = 0$, so the ranges of the $y$ and $z$ coordinates cannot be specified independently. Referring to Fig. 3.4, the ranges of these variables are
$$0 \le y \le 1, \qquad 0 \le z \le y. \tag{4.6}$$
An alternative choice is (cf. Fig. 3.4)
$$z \le y \le 1, \qquad 0 \le z \le 1. \tag{4.7}$$
Using the ranges in Eq. (4.6), the triple integral is
$$\iiint_V f(x, y, z) \, dx \, dy \, dz = \int_0^1 x \, dx \int_0^1 y \, dy \int_0^y z \, dz. \tag{4.8}$$
As was the case for double integrals, this is called an iterated integral because the upper limit of the z-integral is $y$, which necessitates evaluating this integral before the y-integral. The x-integral can be carried out independently of the other two. Thus, carrying out the required integrations,
$$\underbrace{\int_0^1 x \, dx}_{\frac{1}{2}} \int_0^1 y \, dy \int_0^y z \, dz = \frac{1}{2} \int_0^1 y \, dy \, \underbrace{\left( \frac{1}{2} z^2 \Big|_0^y \right)}_{\frac{1}{2} y^2} = \frac{1}{4} \int_0^1 y^3 \, dy = \frac{1}{16}. \tag{4.9}$$
The evaluation of this integral with the ranges in Eq. (4.7) is left as an exercise.

Example. Consider now the integration of $f = xyz$ over the volume in Fig. 4.3. This region is contained in the positive octant, bounded from below by the x-y plane and from above by the plane $x + y + z = 1$. The ranges of the integration variables are obtained by first observing that, in the x-y plane, where $z = 0$, the $(x, y)$ coordinates within $V$ are bounded by the line $x + y = 1$. Hence, the ranges of $x$ and $y$ may be chosen as
$$0 \le x \le 1, \qquad 0 \le y \le 1 - x. \tag{4.10}$$
The lower bound for the range of $z$ for all values of $x$ and $y$ is $z = 0$. The upper bound is obtained from the equation of the plane, solved for $z$: $z = 1 - x - y$. Hence,
$$0 \le z \le 1 - x - y, \tag{4.11}$$
and the triple integral is
$$\int_0^1 x \, dx \int_0^{1 - x} y \, dy \int_0^{1 - x - y} z \, dz. \tag{4.12}$$
This is again an iterated integral in which the z-integration must be evaluated first, then the y-integration, and finally the x-integration. The integral over $z$ is evaluated as
$$\int_0^{1 - x - y} z \, dz = \frac{1}{2} z^2 \Big|_0^{1 - x - y} = \frac{1}{2} (1 - x - y)^2. \tag{4.13}$$
By substituting this result into the y-integral and carrying out an integration by parts, we obtain
$$\frac{1}{2} \int_0^{1 - x} y (1 - x - y)^2 \, dy = \underbrace{-\frac{1}{6} y (1 - x - y)^3 \Big|_0^{1 - x}}_{0} + \frac{1}{6} \int_0^{1 - x} (1 - x - y)^3 \, dy = -\frac{1}{24} (1 - x - y)^4 \Big|_0^{1 - x} = \frac{1}{24} (1 - x)^4. \tag{4.14}$$
Finally, substitution of this expression into the x-integral and again integrating by parts yields
$$\frac{1}{24} \int_0^1 x (1 - x)^4 \, dx = -\frac{1}{120} x (1 - x)^5 \Big|_0^1 + \frac{1}{120} \int_0^1 (1 - x)^5 \, dx = -\frac{1}{720} (1 - x)^6 \Big|_0^1 = \frac{1}{720} \tag{4.15}$$
as the value of the integral in Eq. (4.12).

Example. It is also possible to solve this problem from first principles. Imagine that the volume is divided up into slices parallel to the x-y plane. Each slice has a thickness $\Delta z$ in the z direction, as shown in Fig. 4.4. When the volume is viewed from above (with the z-axis pointing towards the viewer), the lines that bound each slice appear as shown on the right-hand side of Fig. 4.4. The equation of the line delimiting the nth slice is $y = (1 - n \Delta z) - x$.

Fig. 4.4: Figure for the derivation of the volume integral from first principles.
We want to write a volume integral for the entire object, and for this we proceed as follows. First we determine the area of each of the triangular slices. The volume of a given slice is then the area times the height, which is $\Delta z$. Finally, we sum the contributions from all slices and take the limit $\Delta z \to 0$. This solution therefore exemplifies how a volume integral problem can be reduced to an area integration problem. The area of a given triangle is determined by using a double integral. As Fig. 4.4 shows, the limits of the x-integration are $0 \le x \le 1 - n \Delta z$, because the line intersects the x-axis at $1 - n \Delta z$. The corresponding limits for $y$ are $0 \le y \le 1 - n \Delta z - x$. Therefore
$$\text{area} = \int_0^{1 - n \Delta z} dx \int_0^{1 - n \Delta z - x} dy.$$
Because we need to integrate the function $f(x, y, z) = xyz$, we need to weight the differential volume by this function, with $z = n \Delta z$ on the nth slice. Therefore we obtain the value of the integral for a given slice:
$$\int_0^{1 - n \Delta z} x \, dx \int_0^{1 - n \Delta z - x} y \, dy \, (n \Delta z) \, \Delta z.$$
Summing over the $N = 1/\Delta z$ slices and taking the limit gives
$$\lim_{N \to \infty} \sum_{n=0}^{N} \int_0^{1 - n \Delta z} x \, dx \int_0^{1 - n \Delta z - x} y \, dy \, (n \Delta z) \, \Delta z = \int_0^1 z \, dz \int_0^{1 - z} x \, dx \int_0^{1 - z - x} y \, dy.$$
This integral can readily be evaluated in the following way:
$$\int_0^1 z \, dz \int_0^{1 - z} x \, dx \int_0^{1 - z - x} y \, dy = \frac{1}{2} \int_0^1 z \, dz \int_0^{1 - z} x (1 - z - x)^2 \, dx = \frac{1}{24} \int_0^1 z (1 - z)^4 \, dz = \frac{1}{720}.$$

Cartesian coordinates are convenient for evaluating triple integrals within volumes bounded by planes. But there are many situations where other geometries are used, the most common of which are spheres and volumes contained within surfaces of revolution. In the next two sections, we will discuss two coordinate systems that considerably extend the capabilities of triple integrals.
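Both routes to the value $1/720$ for $f = xyz$ over the region below the plane $x + y + z = 1$ can be checked numerically. The sketch below first evaluates the iterated integral with nested midpoint rules, and then forms the slice sum directly. For the slices it uses the closed form $L^4/24$ for the double integral of $xy$ over a right triangle with legs of length $L$, which follows from the same integration by parts used above. All function names and grid sizes are illustrative choices.

```python
def midpoint(g, a, b, n=60):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# Route 1: iterated integral, innermost z, then y, then x.
def over_z(x, y):
    return midpoint(lambda z: z, 0.0, 1.0 - x - y)

def over_y(x):
    return midpoint(lambda y: y * over_z(x, y), 0.0, 1.0 - x)

iterated = midpoint(lambda x: x * over_y(x), 0.0, 1.0)

# Route 2: sum over horizontal slices of thickness dz = 1/N.
# The slice at height z = n*dz is a triangle with legs L = 1 - n*dz,
# and the double integral of x*y over it equals L**4 / 24.
def slice_sum(N):
    dz = 1.0 / N
    return sum((1.0 - n * dz) ** 4 / 24.0 * (n * dz) * dz for n in range(N + 1))

print(iterated, slice_sum(10), slice_sum(1000), 1.0 / 720.0)
```

Increasing `N` moves the slice sum towards the iterated result, which is the limiting process described in the text.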
where
$$0 \le r < \infty, \qquad 0 \le \phi < 2\pi, \qquad -\infty < z < \infty. \tag{4.17}$$
This transformation is depicted in Fig. 4.5. The expressions for $r$ and $\phi$ in terms of $x$ and $y$ are the same as those in Eq. (3.30).
4.2.2 The Integration Element
The integration element of this coordinate system can be obtained in two ways. The simplest way is to observe that the z-coordinate simply adds a thickness $dz$ to the integration element in circular polar coordinates:
$$dV = r \, dr \, d\phi \, dz. \tag{4.18}$$
The other method, described in Problem Set 4, is based on writing any point $(x, y, z)$ as a radius vector $\mathbf{r}$,
$$\mathbf{r} = r \cos\phi \, \mathbf{i} + r \sin\phi \, \mathbf{j} + z \, \mathbf{k}, \tag{4.19}$$
and calculating the integration element from the vector product
$$dV = |d\mathbf{r}_r \cdot d\mathbf{r}_\phi \times d\mathbf{r}_z|, \tag{4.20}$$
where $d\mathbf{r}_r$, $d\mathbf{r}_\phi$, and $d\mathbf{r}_z$ are the differential changes of $\mathbf{r}$ with respect to $r$, $\phi$, and $z$, respectively:
$$d\mathbf{r}_r = dr \cos\phi \, \mathbf{i} + dr \sin\phi \, \mathbf{j}, \tag{4.21}$$
$$d\mathbf{r}_\phi = -r \sin\phi \, d\phi \, \mathbf{i} + r \cos\phi \, d\phi \, \mathbf{j}, \tag{4.22}$$
$$d\mathbf{r}_z = dz \, \mathbf{k}. \tag{4.23}$$

Fig. 4.5: Two illustrations of cylindrical polar coordinates. (a) The definitions of $r$, $\phi$, and $z$. (b) The representation of any point as the intersection of the surface of constant $r$ (the cylinder), constant $\phi$ (the vertical plane), and constant $z$ (the horizontal plane).
Example. Consider the sphere with unit radius in the upper half-space, as shown in Fig. 4.6. The equation of the surface is
$$x^2 + y^2 + z^2 = 1, \tag{4.24}$$
where $z \ge 0$. To calculate this volume as an integral in cylindrical polar coordinates, we must first determine the ranges of the integration variables. The ranges of $r$ and $\phi$ span the interior of the half-sphere:
$$0 \le r \le 1, \qquad 0 \le \phi < 2\pi. \tag{4.25}$$
The upper bound for the range of $z$ is obtained from the equation of the sphere, solved for $z$:
$$z^2 = 1 - x^2 - y^2 = 1 - r^2. \tag{4.26}$$
The half-sphere is bounded from below by the x-y plane, where $z = 0$. Hence, the range of $z$ is $0 \le z \le \sqrt{1 - r^2}$. Thus, the volume integral of the half-sphere is given by
$$V = \int_0^1 r \, dr \underbrace{\int_0^{2\pi} d\phi}_{2\pi} \underbrace{\int_0^{\sqrt{1 - r^2}} dz}_{\sqrt{1 - r^2}} = 2\pi \int_0^1 r \sqrt{1 - r^2} \, dr = 2\pi \left[ -\frac{1}{3} (1 - r^2)^{3/2} \right]_0^1 = \frac{2}{3}\pi, \tag{4.27}$$
which is one-half the volume of the unit sphere.

Example. Consider the cone in Fig. 4.7. The surface is given by
$$x^2 + y^2 = (1 - z)^2, \tag{4.28}$$
for $0 \le z \le 1$. The ranges of $r$ and $\phi$ are
$$0 \le r \le 1, \qquad 0 \le \phi < 2\pi. \tag{4.29}$$

Fig. 4.7: The surface determined by $x^2 + y^2 = (1 - z)^2$, for $0 \le z \le 1$.

The range of $z$ is calculated by following the steps in the preceding example. The cone is bounded from below by the x-y plane, where $z = 0$. The upper bound of $z$ is determined by the surface of the cone which, in cylindrical coordinates, is $r^2 = (1 - z)^2$. Thus, the range of $z$ is
$$0 \le z \le 1 - r. \tag{4.30}$$
The volume integral of the cone is
$$V = \int_0^1 r \, dr \underbrace{\int_0^{2\pi} d\phi}_{2\pi} \underbrace{\int_0^{1 - r} dz}_{1 - r} = 2\pi \int_0^1 (r - r^2) \, dr = 2\pi \left( \frac{r^2}{2} \Big|_0^1 - \frac{r^3}{3} \Big|_0^1 \right) = 2\pi \cdot \frac{1}{6} = \frac{1}{3}\pi. \tag{4.31}$$
The preceding two examples showed how cylindrical polar coordinates are used to calculate the volumes of surfaces of revolution, i.e. surfaces that are obtained by rotating a curve about an axis, in those cases the z-axis. We now consider a more substantial example by calculating the volume of another surface of revolution, the torus.

Example. A torus is a surface of revolution generated by rotating a circle of radius $\rho$ whose center is a distance $R > \rho$ from the origin about an axis, usually taken as the z-axis. The calculation of the volume of a torus does not actually require an expression for the surface. The ranges of $r$, $\phi$, and $z$ can be determined by referring to Fig. 4.8. Consider first the left panel. Suppose we take the range of $z$ as
$$-\rho \le z \le \rho. \tag{4.32}$$
The circle in the x-z plane has the equation
$$(x - R)^2 + z^2 = \rho^2, \tag{4.33}$$
so the range of $r$ is obtained by solving this equation for $x$ and referring to Fig. 4.8(b):
$$R - \sqrt{\rho^2 - z^2} \le r \le R + \sqrt{\rho^2 - z^2}. \tag{4.34}$$
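Fig. 4.8: (a) The circle in the x-z plane that is rotated about the z-axis. (b) The section of the torus in the x-y plane. The emboldened line is the path traced out by the center of the circle.

The range of $\phi$ is
$$0 \le \phi < 2\pi. \tag{4.35}$$
The volume integral is therefore
$$V = \int_{-\rho}^{\rho} dz \int_0^{2\pi} d\phi \int_{R - \sqrt{\rho^2 - z^2}}^{R + \sqrt{\rho^2 - z^2}} r \, dr. \tag{4.36}$$
The radial integral is
$$\int_{R - \sqrt{\rho^2 - z^2}}^{R + \sqrt{\rho^2 - z^2}} r \, dr = \frac{1}{2} r^2 \Big|_{R - \sqrt{\rho^2 - z^2}}^{R + \sqrt{\rho^2 - z^2}} = \frac{1}{2} \left[ \left( R + \sqrt{\rho^2 - z^2} \right)^2 - \left( R - \sqrt{\rho^2 - z^2} \right)^2 \right] = 2R \sqrt{\rho^2 - z^2}. \tag{4.37}$$
The integral over the azimuthal angle in Eq. (4.36) is $2\pi$, so the volume integral reduces to
$$V = 4\pi R \int_{-\rho}^{\rho} \sqrt{\rho^2 - z^2} \, dz. \tag{4.38}$$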
This integral can be evaluated by the trigonometric substitution $z = \rho \sin\theta$, where $-\tfrac{1}{2}\pi \le \theta \le \tfrac{1}{2}\pi$. Carrying out the required changes to the integrand, the integration element, and the limits of integration yields
$$V = 4\pi R \int_{-\rho}^{\rho} \sqrt{\rho^2 - z^2} \, dz = 4\pi R \rho^2 \underbrace{\int_{-\pi/2}^{\pi/2} \cos^2\theta \, d\theta}_{\frac{1}{2}\pi} = 2\pi^2 R \rho^2. \tag{4.39}$$
By writing this result as
$$V = (2\pi R)(\pi \rho^2), \tag{4.40}$$
the volume of a torus can be interpreted as the product of the area of the circle that is rotated about the z-axis to form the torus ($\pi \rho^2$) and the length of the path taken by the center of the circle ($2\pi R$). This is a special case of Pappus' Theorem¹: let $R$ be a planar region that lies entirely on one side of an axis (usually the z-axis) in the plane. If $R$ is rotated about this axis, the volume of the resulting solid is the product of the area $A$ of $R$ and the distance travelled by its centroid.
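The torus volume can be checked numerically by integrating, over height $z$, the area of the horizontal ring swept out between the inner and outer radii at that height, and comparing with the product form given by Pappus' Theorem. The sketch below assumes sample values $R = 3$ and $\rho = 1$ purely for illustration; the function names are likewise hypothetical.

```python
import math

def midpoint(g, a, b, n=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

R, rho = 3.0, 1.0  # sample radii with R > rho (assumed values)

def ring_area(z):
    # Area of the ring at height z: 2*pi times the integral of r dr
    # between r = R - s and r = R + s, with s = sqrt(rho^2 - z^2).
    s = math.sqrt(rho**2 - z**2)
    return 2.0 * math.pi * 0.5 * ((R + s) ** 2 - (R - s) ** 2)

volume = midpoint(ring_area, -rho, rho)
pappus = (2.0 * math.pi * R) * (math.pi * rho**2)

print(volume, pappus)
```

A fine grid is used because the integrand has a square-root behaviour at $z = \pm\rho$, where the midpoint rule converges more slowly.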
(4.41) (4.42)
Pappus of Alexandria, who lived in the 4th century, is considered to be the last of the great Greek geometers.
Fig. 4.9: Two depictions of spherical polar coordinates. (a) The definitions and ranges of $r$, $\phi$, and $\theta$. (b) The representation of any point as the intersection of the surface of constant $r$ (the sphere), constant $\phi$ (the plane), and constant $\theta$ (the cone).
These are the transformations that relate Cartesian coordinates to spherical polar coordinates. The ranges of the radial and azimuthal variables are determined by referring to Fig. 4.9(a). As in circular polar coordinates (Sec. 3.2),
$$0 \le r < \infty, \qquad 0 \le \phi < 2\pi. \tag{4.44}$$
The range of $\theta$ is determined by requiring that the transformation between Cartesian and spherical polar coordinates is single-valued, i.e. that one and only one set of spherical polar coordinates $(r, \phi, \theta)$ corresponds to a particular set of Cartesian coordinates $(x, y, z)$. This necessitates restricting the range of $\theta$ to
$$0 \le \theta \le \pi. \tag{4.45}$$

Fig. 4.10: The orientation of a radial vector with respect to the z-axis.
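To see why, consider the point with radius vector
$$\mathbf{r} = r \cos\phi \sin\theta \, \mathbf{i} + r \sin\phi \sin\theta \, \mathbf{j} + r \cos\theta \, \mathbf{k}. \tag{4.46}$$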
Suppose that we transform this point by rotating the azimuthal angle by $\pi$: $\phi \to \phi + \pi$. The coordinates of the transformed point $\mathbf{r}'$ are obtained by applying standard trigonometric identities:
$$\mathbf{r}' = -r \cos\phi \sin\theta \, \mathbf{i} - r \sin\phi \sin\theta \, \mathbf{j} + r \cos\theta \, \mathbf{k}. \tag{4.47}$$
Now suppose that we rotate the polar angle of $\mathbf{r}$ so that $\theta \to 2\pi - \theta$ $(< 2\pi)$. The coordinates of the transformed point $\mathbf{r}''$ are again determined by applying standard trigonometric identities:
$$\mathbf{r}'' = -r \cos\phi \sin\theta \, \mathbf{i} - r \sin\phi \sin\theta \, \mathbf{j} + r \cos\theta \, \mathbf{k}. \tag{4.48}$$
By comparing these coordinates with those in Eq. (4.47), we conclude that $\mathbf{r}'' = \mathbf{r}'$, i.e. that there are two ways of labelling the same point. To avoid this unacceptable result, the range of $\theta$ is restricted to the range in Eq. (4.45).
4.3.2 The Integration Element
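The integration element in spherical polar coordinates is most easily obtained with the procedure in Problem Set 4. The radius vector associated with a point $(x, y, z)$ is written as
$$\mathbf{r} = r \cos\phi \sin\theta \, \mathbf{i} + r \sin\phi \sin\theta \, \mathbf{j} + r \cos\theta \, \mathbf{k}, \tag{4.49}$$
and the integration element is calculated from the vector product
$$dV = |d\mathbf{r}_r \cdot d\mathbf{r}_\phi \times d\mathbf{r}_\theta|, \tag{4.50}$$
where $d\mathbf{r}_r$, $d\mathbf{r}_\phi$, and $d\mathbf{r}_\theta$ are the differential changes of $\mathbf{r}$ with respect to $r$, $\phi$, and $\theta$, respectively:
$$d\mathbf{r}_r = dr \cos\phi \sin\theta \, \mathbf{i} + dr \sin\phi \sin\theta \, \mathbf{j} + dr \cos\theta \, \mathbf{k}, \tag{4.51}$$
$$d\mathbf{r}_\phi = -r \sin\phi \sin\theta \, d\phi \, \mathbf{i} + r \cos\phi \sin\theta \, d\phi \, \mathbf{j}, \tag{4.52}$$
$$d\mathbf{r}_\theta = r \cos\phi \cos\theta \, d\theta \, \mathbf{i} + r \sin\phi \cos\theta \, d\theta \, \mathbf{j} - r \sin\theta \, d\theta \, \mathbf{k}. \tag{4.53}$$
These vectors are mutually orthogonal so the integration element is obtained from the product of their magnitudes:
$$dV = r^2 \sin\theta \, dr \, d\phi \, d\theta. \tag{4.54}$$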
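The weight $r^2 \sin\theta$ can be confirmed numerically by differentiating the radius vector with respect to each coordinate and forming the scalar triple product of the three derivative vectors. The following sketch uses central differences; the helper names, the step size, and the sample point are illustrative assumptions. The coordinate order is (r, azimuthal angle, polar angle), matching the order used in the text.

```python
import math

def position(r, phi, theta):
    """Cartesian components of the radius vector in spherical polars."""
    return (r * math.cos(phi) * math.sin(theta),
            r * math.sin(phi) * math.sin(theta),
            r * math.cos(theta))

def partial(k, q, h=1e-6):
    """Central-difference derivative of position w.r.t. coordinate k."""
    up, down = list(q), list(q)
    up[k] += h
    down[k] -= h
    return [(a - b) / (2 * h) for a, b in zip(position(*up), position(*down))]

def triple_product(a, b, c):
    """Scalar triple product a . (b x c)."""
    cross = (b[1] * c[2] - b[2] * c[1],
             b[2] * c[0] - b[0] * c[2],
             b[0] * c[1] - b[1] * c[0])
    return sum(ai * ci for ai, ci in zip(a, cross))

q = (2.0, 0.4, 1.1)  # sample point (r, phi, theta)
e_r, e_phi, e_theta = (partial(k, q) for k in range(3))
weight = abs(triple_product(e_r, e_phi, e_theta))
expected = q[0] ** 2 * math.sin(q[2])

print(weight, expected)
```

The absolute value matters here: with this ordering of the coordinates the signed triple product is negative, which reflects only the orientation of the three vectors.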
$$\iiint_V f(x, y, z) \, dx \, dy \, dz = \iiint_{V'} f(r, \phi, \theta) \, r^2 \sin\theta \, dr \, d\phi \, d\theta, \tag{4.55}$$
where $V'$ is the volume $V$ expressed in spherical polar coordinates. There are two important special cases of this integral. If $f$ has no $\phi$-dependence, $f = f(r, \theta)$, then $f$ is said to have azimuthal symmetry. According to the transformations in Eqs. (4.42) and (4.43) and Fig. 4.8, this corresponds to rotational symmetry about the z-axis. Surfaces of revolution have this type of symmetry. A physical situation with this type of symmetry is discussed in Problem Set 4. The integral over $\phi$ can be evaluated immediately and the general expression in Eq. (4.55) becomes
$$2\pi \iint f(r, \theta) \, r^2 \sin\theta \, dr \, d\theta. \tag{4.56}$$
In the second case, where $f$ has neither $\phi$- nor $\theta$-dependence, $f$ is said to be isotropic. This corresponds to spherical symmetry in that $f$ depends only on the radius $r$ and not on any angular orientation. The integrals over $\phi$ and $\theta$ can be evaluated immediately and the general integral in Eq. (4.55) reduces to
$$4\pi \int f(r) \, r^2 \, dr. \tag{4.57}$$
This integral is seen to correspond to the integration over radial shells.

Example. We consider first the calculation of the volume of a sphere of radius $R$. Referring to Eq. (4.55), this corresponds to the case $f = 1$. The ranges of the integration variables are obtained directly from Fig. 4.9(a):
$$0 \le r \le R, \qquad 0 \le \phi < 2\pi, \qquad 0 \le \theta \le \pi, \tag{4.58}$$
so the volume is
$$V = \int_0^R r^2 \, dr \int_0^{2\pi} d\phi \int_0^{\pi} \sin\theta \, d\theta = \frac{R^3}{3} \cdot 2\pi \cdot 2 = \frac{4}{3}\pi R^3. \tag{4.59}$$
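The generalization of this procedure to sections of a sphere between given azimuthal and polar angles and to spherical shells with given inner and outer radii is straightforward.

Example. Consider the integral of $f = e^{-\alpha r}$ over all space. This is an example of a function with spherical symmetry that occurs frequently in quantum mechanics. The ranges of the integration variables are
$$0 \le r < \infty, \qquad 0 \le \phi < 2\pi, \qquad 0 \le \theta \le \pi, \tag{4.60}$$
so the integral of $f$ becomes
$$\int_0^{\infty} r^2 e^{-\alpha r} \, dr \underbrace{\int_0^{2\pi} d\phi}_{2\pi} \underbrace{\int_0^{\pi} \sin\theta \, d\theta}_{2} = 4\pi \int_0^{\infty} r^2 e^{-\alpha r} \, dr. \tag{4.61}$$
The radial integral is evaluated by performing successive integrations by parts:
$$4\pi \int_0^{\infty} \underbrace{r^2}_{u} \, \underbrace{e^{-\alpha r} \, dr}_{dv} = \underbrace{-\frac{4\pi}{\alpha} r^2 e^{-\alpha r} \Big|_0^{\infty}}_{0} + \frac{8\pi}{\alpha} \int_0^{\infty} r e^{-\alpha r} \, dr = \frac{8\pi}{\alpha} \left( \underbrace{-\frac{1}{\alpha} r e^{-\alpha r} \Big|_0^{\infty}}_{0} + \frac{1}{\alpha} \int_0^{\infty} e^{-\alpha r} \, dr \right) = \frac{8\pi}{\alpha^2} \left( -\frac{1}{\alpha} e^{-\alpha r} \Big|_0^{\infty} \right) = \frac{8\pi}{\alpha^3}. \tag{4.62}$$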
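The radial-shell form of the integral lends itself to a simple numerical check. The sketch below assumes that the integrand is a decaying exponential $e^{-\alpha r}$ with a positive constant $\alpha$, takes $\alpha = 2$ as a sample value, and truncates the infinite range at a radius beyond which the integrand is negligible. The cutoff, grid size, and names are illustrative choices.

```python
import math

def midpoint(g, a, b, n=40000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

alpha = 2.0            # sample decay constant (assumed value)
r_max = 40.0 / alpha   # exp(-alpha * r_max) = exp(-40) is negligible

# Integration over radial shells of area 4*pi*r^2.
radial = midpoint(lambda r: r**2 * math.exp(-alpha * r), 0.0, r_max)
total = 4.0 * math.pi * radial
closed_form = 8.0 * math.pi / alpha**3

print(total, closed_form)
```

Truncating the range is justified for the same reason that the boundary terms vanish in the integration by parts: the exponential decays faster than any power of $r$ grows.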
Notice that, in arriving at this result, we have twice used the fact that
$$\lim_{x \to \infty} x^n e^{-\alpha x} = 0. \tag{4.63}$$
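constant. Consider the surface of a sphere of radius $R$. According to Eq. (4.49), the radius vector $\mathbf{r}$ at any point on the sphere is
$$\mathbf{r} = R \cos\phi \sin\theta \, \mathbf{i} + R \sin\phi \sin\theta \, \mathbf{j} + R \cos\theta \, \mathbf{k}. \tag{4.64}$$
The element of area integration is obtained by calculating the differential of this vector for changes in turn of $d\phi$ and $d\theta$:
$$d\mathbf{r}_\phi = -R \sin\phi \sin\theta \, d\phi \, \mathbf{i} + R \cos\phi \sin\theta \, d\phi \, \mathbf{j}, \tag{4.65}$$
$$d\mathbf{r}_\theta = R \cos\phi \cos\theta \, d\theta \, \mathbf{i} + R \sin\phi \cos\theta \, d\theta \, \mathbf{j} - R \sin\theta \, d\theta \, \mathbf{k}. \tag{4.66}$$
These vectors are orthogonal,
$$d\mathbf{r}_\phi \cdot d\mathbf{r}_\theta = 0, \tag{4.67}$$
so the differential area $dA$ corresponding to these differential changes is obtained from the product of the magnitudes of $d\mathbf{r}_\phi$ and $d\mathbf{r}_\theta$:
$$dA = |d\mathbf{r}_\phi| \, |d\mathbf{r}_\theta| = R \sin\theta \, d\phi \cdot R \, d\theta = R^2 \sin\theta \, d\phi \, d\theta. \tag{4.68}$$

Example. The surface area of a sphere of radius $R$ is represented as
$$R^2 \int_0^{2\pi} d\phi \int_0^{\pi} \sin\theta \, d\theta = R^2 \cdot 2\pi \cdot 2 = 4\pi R^2. \tag{4.69}$$
The corresponding expression of the surface area subtended by azimuthal angles $\phi_1$ and $\phi_2$ and polar angles $\theta_1$ and $\theta_2$ is
$$R^2 \int_{\phi_1}^{\phi_2} d\phi \int_{\theta_1}^{\theta_2} \sin\theta \, d\theta = R^2 (\phi_2 - \phi_1)(\cos\theta_1 - \cos\theta_2). \tag{4.70}$$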
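The other type of surface integral we will encounter involves a cylinder of radius $R$. The radius vector is, from Eq. (4.19), given by
$$\mathbf{r} = R \cos\phi \, \mathbf{i} + R \sin\phi \, \mathbf{j} + z \, \mathbf{k}. \tag{4.71}$$
The differentials of $\mathbf{r}$ corresponding to differential changes $d\phi$ and $dz$ are
$$d\mathbf{r}_\phi = -R \sin\phi \, d\phi \, \mathbf{i} + R \cos\phi \, d\phi \, \mathbf{j}, \tag{4.72}$$
$$d\mathbf{r}_z = dz \, \mathbf{k}. \tag{4.73}$$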
These vectors are manifestly orthogonal, $d\mathbf{r}_\phi \cdot d\mathbf{r}_z = 0$, so the differential area $dA$ corresponding to these differential changes is obtained from the product of the magnitudes of $d\mathbf{r}_\phi$ and $d\mathbf{r}_z$:
$$dA = R \, d\phi \, dz. \tag{4.74}$$

Example. The surface area of a cylinder of radius $R$ and height $H$ is calculated as
$$\int_0^{2\pi} \int_0^{H} R \, d\phi \, dz = 2\pi R H. \tag{4.75}$$
The surface area of a cylinder between heights $H_1$ and $H_2$ and azimuthal angles $\phi_1$ and $\phi_2$ is similarly calculated as
$$\int_{\phi_1}^{\phi_2} \int_{H_1}^{H_2} R \, d\phi \, dz = R (\phi_2 - \phi_1)(H_2 - H_1). \tag{4.76}$$
4.5 Summary
The triple integral of a function $f(x, y, z)$, viewed as a density of some physical quantity, is the amount of that quantity within a volume in three-dimensional space. There is considerably more freedom to specify other coordinate systems than in two dimensions, and many applications in physics rely on such transformations to enable calculations to be carried out. From Cartesian coordinates, we transformed triple integrals into cylindrical polar coordinates, which are the natural generalization of circular polar coordinates to three dimensions and are appropriate to situations where there is azimuthal symmetry, and into spherical polar coordinates, for situations that involve spherical symmetry. The Jacobians obtained in each case reflect the position dependence of the magnitude of the differential volume elements.
Line Integrals
(5.1)
Fig. 5.1: A force F acting at an angle along a displacement r.

Suppose now that the force is a function of position. We consider this situation in one dimension first: $F = F(x)$. The calculation of the work between two points $x = a$ and $x = b$ proceeds according to the construction in Fig. 5.2. The interval $(a, b)$ is first divided into $N$ subintervals of length $\Delta x = (b - a)/N$. The force acting within each of these subintervals is taken to be the constant value at the left endpoint of that interval. Thus, we obtain
$$F(a)\,\Delta x + F(a + \Delta x)\,\Delta x + \cdots + F(b - \Delta x)\,\Delta x \tag{5.2}$$
as an approximation of the work done over the interval. As $\Delta x \to 0$, this approximation becomes increasingly accurate and the work done approaches the area of the shaded region in the right panel of the figure. Referring to Sec. 2, the procedure depicted in Fig. 5.2 is the same as that used for the Riemann sum construction of the integral of a function, so we conclude that
$$W = \int_a^b F\,dx . \tag{5.3}$$
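The left-endpoint sum in Eq. (5.2) is easy to try numerically. The sketch below is only an illustration: the force law $F(x) = 3x$ and the interval $0 \le x \le 2$ are hypothetical choices, for which Eq. (5.3) gives the exact work $W = 6$.

```python
def left_riemann_work(force, a, b, subintervals):
    """Left-endpoint sum F(a)dx + F(a+dx)dx + ... + F(b-dx)dx of Eq. (5.2)."""
    dx = (b - a) / subintervals
    return sum(force(a + k * dx) * dx for k in range(subintervals))

def sample_force(x):
    # Hypothetical force law F(x) = 3x, chosen only for illustration.
    return 3.0 * x

coarse = left_riemann_work(sample_force, 0.0, 2.0, 10)
fine = left_riemann_work(sample_force, 0.0, 2.0, 1000)
print(coarse, fine)  # the sums approach the exact work as dx shrinks
```

For this linear force the left-endpoint sum always underestimates the work, and the shortfall shrinks in proportion to $\Delta x$.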
Fig. 5.2: (Left panel) Construction used to calculate the work done from $x = a$ to $x = b$ by a position-dependent force. The shaded area corresponds to the work calculated by regarding the force as constant over each subinterval. (Right panel) The corresponding calculation for infinitesimal subintervals, which is seen to represent the area bounded by $F$, the $x$-axis, and the lines $x = a$ and $x = b$.

Similar considerations apply for paths in two and three dimensions. In this chapter, we will consider the two-dimensional case. A force $\mathbf{F}$ in two dimensions is a vector field:
$$\mathbf{F}(x, y) = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j} , \tag{5.4}$$
where $P$ and $Q$ are functions of $x$ and $y$. This expression indicates that every point $(x, y)$ is assigned a vector $\mathbf{F}$ whose $x$-component is given by $P\,\mathbf{i}$ and whose $y$-component is $Q\,\mathbf{j}$. The path along which $\mathbf{F}$ acts is a curve $\mathcal{P}$ in the $x$-$y$ plane between an initial point $i$ and a final point $f$, as shown in Fig. 5.3. The work done along this path is calculated as in Eq. (5.1) by first considering the incremental work $dW$ done by the force along a displacement $d\mathbf{r}$: $dW = \mathbf{F}\cdot d\mathbf{r}$, where $d\mathbf{r}$ is the incremental displacement along the path.

Fig. 5.3: A path in a vector field between an initial point $i$ and the final point $f$.

Then, with the position vector given by $\mathbf{r} = x\,\mathbf{i} + y\,\mathbf{j}$, we have that the incremental change along the path is
$$d\mathbf{r} = dx\,\mathbf{i} + dy\,\mathbf{j} , \tag{5.5}$$
so the work done along $\mathcal{P}$ is
$$W = \int_{\mathcal{P}} \mathbf{F}\cdot d\mathbf{r} = \int_{\mathcal{P}} (P\,\mathbf{i} + Q\,\mathbf{j})\cdot(dx\,\mathbf{i} + dy\,\mathbf{j}) = \int_{\mathcal{P}} (P\,dx + Q\,dy) . \tag{5.6}$$
This is an example of a line integral. In addition to this example from classical mechanics, line integrals appear in thermodynamics and in electricity and magnetism. In thermodynamics, $\mathcal{P}$ represents a process between initial and final values of thermodynamic variables (e.g. pressure, temperature, volume). The line integral of such variables yields quantities such as heat flow and the work done during the process. In electricity and magnetism, $\mathcal{P}$ is a path in space, and line integrals represent quantities such as the electromotive force. In all of these cases the mathematical form of a line
integral is
$$\int_{\mathcal{P}} \left[ f(x, y)\,dx + g(x, y)\,dy \right] , \tag{5.8}$$
where $f$ and $g$ are any functions that can be integrated and $\mathcal{P}$ is the path connecting the initial and final points. As we stressed in the introduction, specifying the integration path $\mathcal{P}$ is as important as specifying the initial and final points. The path provides a functional relationship between $x$ and $y$ and allows the integrals to be evaluated; otherwise the variable $y$ in the term $f(x, y)\,dx$ and the variable $x$ in the term $g(x, y)\,dy$ appear superfluous. Additionally, the value of the line integral may depend explicitly on the path, so specifying only the initial and final points is not necessarily sufficient to obtain a unique value. The following example illustrates these ideas.

Example. Consider the line integral
$$\int_{\mathcal{P}} xy\,dx , \tag{5.9}$$
which is of the general form in Eq. (5.8) with $f = xy$ and $g = 0$. In the context of the calculation of work, this corresponds to a force $\mathbf{F}(x, y) = xy\,\mathbf{i}$. We will evaluate this integral over the three paths shown in Fig. 5.4, each of which has its initial point at the origin $(0, 0)$ and its final point at $(1, 1)$.

Fig. 5.4: The three paths, labelled $\mathcal{P}_1$, $\mathcal{P}_2$, and $\mathcal{P}_3$, between $(0, 0)$ and $(1, 1)$ used for evaluating the line integral in Eq. (5.9).

We first consider $\mathcal{P}_1$. This path is composed of two straight segments: $(0, 0) \to (1, 0)$ and $(1, 0) \to (1, 1)$. The first segment lies along the $x$-axis, so we have that
$$y = 0 , \qquad 0 \le x \le 1 . \tag{5.10}$$
Hence, since $y = 0$, the integrand vanishes, so the contribution from this segment also vanishes. The second segment is parallel to the $y$-axis, so
$$x = 1 , \qquad dx = 0 , \qquad 0 \le y \le 1 . \tag{5.11}$$
Since $dx = 0$, the contribution along this segment also vanishes. Therefore, the integral along $\mathcal{P}_1$ vanishes:
$$\int_{\mathcal{P}_1} xy\,dx = 0 . \tag{5.12}$$
The path $\mathcal{P}_2$ connects $(0, 0)$ to $(1, 1)$ with the straight line $y = x$. Thus, along this path, the integrand can be expressed entirely as a function of $x$: $xy = x^2$, with $0 \le x \le 1$. The line integral is thereby evaluated as
$$\int_{\mathcal{P}_2} xy\,dx = \int_0^1 x^2\,dx = \tfrac{1}{3} x^3 \Big|_0^1 = \tfrac{1}{3} . \tag{5.13}$$
Finally, the path $\mathcal{P}_3$ connects $(0, 0)$ to $(1, 1)$ with the parabola $y = x^2$. Along this path, the integrand can be written as $xy = x^3$, with $0 \le x \le 1$, and the line integral becomes
$$\int_{\mathcal{P}_3} xy\,dx = \int_0^1 x^3\,dx = \tfrac{1}{4} x^4 \Big|_0^1 = \tfrac{1}{4} . \tag{5.14}$$
We have thus obtained three different values for the line integral in Eq. (5.9) along the three paths shown in Fig. 5.4. This result can be understood by interpreting this integral as the work done by the force $\mathbf{F} = xy\,\mathbf{i}$ over the three paths:
$$\int_{\mathcal{P}_i} \mathbf{F}\cdot d\mathbf{r} = \int_{\mathcal{P}_i} xy\,dx , \tag{5.15}$$
for $i = 1, 2, 3$. This vector field is shown in Fig. 5.5 superimposed on the paths $\mathcal{P}_1$, $\mathcal{P}_2$, and $\mathcal{P}_3$. We can see immediately from this diagram that the work done along $\mathcal{P}_1$ must vanish because $\mathbf{F}$ vanishes along the $x$-axis (the first segment of $\mathcal{P}_1$), and acts in the normal direction to the second segment of this path. In contrast, the line integrals along $\mathcal{P}_2$ and $\mathcal{P}_3$ are both necessarily positive because $\mathbf{F}$ has a positive component along the direction of each path, producing positive work.

Fig. 5.5: The vector field $\mathbf{F} = xy\,\mathbf{i}$ and the paths $\mathcal{P}_1$, $\mathcal{P}_2$, and $\mathcal{P}_3$ shown in Fig. 5.4 used for the evaluation of the line integral in Eq. (5.9).

This example illustrates two fundamental points about line integrals. (i) The value of a line integral may depend on the path over which it is evaluated. There are physical manifestations of this property that have important consequences in mechanics, thermodynamics, and electricity and magnetism. (ii) The path between given initial and final points establishes a relationship between the independent variables. Once this information is incorporated into the line integral, the evaluation reduces to that of an ordinary integral (Sec. 2).
Example. Consider the line integral
$$\int_{\mathcal{P}} (xy^2\,dx + x^2 y\,dy) \tag{5.16}$$
evaluated along the three paths in Fig. 5.4. Along the first segment of $\mathcal{P}_1$, $y = 0$, and therefore $dy = 0$, so there is no contribution from either term in the integral. Along the second segment $x = 1$, $dx = 0$, and $0 \le y \le 1$. Thus, only the second term in the integrand makes a contribution, and we obtain
$$\int_{\mathcal{P}_1} (xy^2\,dx + x^2 y\,dy) = \int_0^1 y\,dy = \tfrac{1}{2} y^2 \Big|_0^1 = \tfrac{1}{2} . \tag{5.17}$$
Along $\mathcal{P}_2$, $y = x$, so $dy = dx$. Thus, both terms in the integrand can be written in terms of either $x$ or $y$ alone:
$$\int_{\mathcal{P}_2} (xy^2\,dx + x^2 y\,dy) = 2\int_0^1 x^3\,dx = 2\cdot\tfrac{1}{4} x^4 \Big|_0^1 = \tfrac{1}{2} , \tag{5.18}$$
which is the same value obtained in Eq. (5.17). Finally, along $\mathcal{P}_3$, $y = x^2$, so $dy = 2x\,dx$. We can express the integrand in terms of $x$ alone to obtain
$$\int_{\mathcal{P}_3} (xy^2\,dx + x^2 y\,dy) = \int_0^1 (x^5\,dx + 2x^5\,dx) = 3\int_0^1 x^5\,dx = 3\cdot\tfrac{1}{6} x^6 \Big|_0^1 = \tfrac{1}{2} , \tag{5.19}$$
which is the same as that obtained for the other two paths. A natural question arises: Is this a coincidence, or does this integral always have the same value when evaluated over different paths between fixed initial and final points? The results we have obtained in this example are certainly suggestive, but to address this question in a mathematically concise framework, we must derive some additional properties of line integrals. This is the subject of the next two sections.
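The two examples of this section can be reproduced numerically. The following Python sketch approximates $\int (f\,dx + g\,dy)$ by summing over short chords of a parametrized path; the function name `line_integral` and the parametrizations of the three paths by $0 \le t \le 1$ are assumptions made for this illustration.

```python
def line_integral(f, g, path, steps=20000):
    """Approximate the integral of f dx + g dy along path(t), 0 <= t <= 1.

    Each short chord contributes f and g evaluated at its midpoint,
    multiplied by the chord's dx and dy.
    """
    total = 0.0
    x_old, y_old = path(0.0)
    for k in range(1, steps + 1):
        x_new, y_new = path(k / steps)
        x_mid, y_mid = 0.5 * (x_old + x_new), 0.5 * (y_old + y_new)
        total += f(x_mid, y_mid) * (x_new - x_old) + g(x_mid, y_mid) * (y_new - y_old)
        x_old, y_old = x_new, y_new
    return total

def path_1(t):
    # (0,0) -> (1,0) for t <= 1/2, then (1,0) -> (1,1)
    return (2.0 * t, 0.0) if t <= 0.5 else (1.0, 2.0 * t - 1.0)

def path_2(t):
    return (t, t)          # straight line y = x

def path_3(t):
    return (t, t * t)      # parabola y = x^2

paths = (path_1, path_2, path_3)

# First example: integrand xy dx (f = xy, g = 0)
first = [line_integral(lambda x, y: x * y, lambda x, y: 0.0, p) for p in paths]

# Second example: integrand xy^2 dx + x^2 y dy
second = [line_integral(lambda x, y: x * y**2, lambda x, y: x**2 * y, p) for p in paths]

print(first)   # compare with Eqs. (5.12)-(5.14)
print(second)  # compare with Eqs. (5.17)-(5.19)
```

The first list shows three different values, while the second shows the same value on all three paths, in line with the hand calculations above.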
In this section, we will re-express the question of the path-dependence of a line integral in terms of the value of that integral around a closed curve. We first determine the effect that reversing the sense of the integration path has on the value of a line integral. Consider a line integral over a path $\mathcal{P}$ between an initial point $i$ and a final point $f$, as shown in Fig. 5.6(a):
$$\int_{\mathcal{P}} (f\,dx + g\,dy) . \tag{5.20}$$
Suppose that this path is reversed, so that the new initial point is $f$ and the new final point is $i$, as shown in Fig. 5.6(b). We signify this path by $-\mathcal{P}$ and write the corresponding line integral as
$$\int_{-\mathcal{P}} (f\,dx + g\,dy) . \tag{5.21}$$
The relationship between the values of these two line integrals is straightforward to understand. As the examples in the preceding section show, the evaluation of a line integral always reduces to an ordinary integral. Thus, reversing the integration path in a line integral has the effect of interchanging the upper and lower limits of integration. According to the Fundamental Theorem of Calculus, this changes the sign of the integral [Eq. (2.15)]. Thus, the line integrals in Eqs. (5.20) and (5.21) have the same absolute value, but opposite signs:
$$\int_{-\mathcal{P}} (f\,dx + g\,dy) = -\int_{\mathcal{P}} (f\,dx + g\,dy) . \tag{5.22}$$

Fig. 5.6: (a) The path $\mathcal{P}$ between points $i$ and $f$, and (b) the reverse path $-\mathcal{P}$.
Consider now a line integral over a closed curve $C$ (Fig. 5.7). Such integrals, often called loop integrals, have a special notation to indicate that the integration path is a closed curve:
$$\oint_C (f\,dx + g\,dy) . \tag{5.23}$$
Choose any two distinct points $A$ and $B$ on $C$ and denote by $\mathcal{P}_1$ the path on $C$ from $A$ to $B$ and by $\mathcal{P}_2$ the path that returns from $B$ to $A$ along $C$. The integral over $C$ can be expressed as the sum of line integrals over $\mathcal{P}_1$ and $\mathcal{P}_2$:
$$\oint_C (f\,dx + g\,dy) = \int_{\mathcal{P}_1} (f\,dx + g\,dy) + \int_{\mathcal{P}_2} (f\,dx + g\,dy) . \tag{5.24}$$
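Suppose that the value of the line integral in Eq. (5.20) is independent of the path $\mathcal{P}$ for any initial and final points. The closed curve $C$ in Fig. 5.7 defines two paths from $A$ to $B$: the path $\mathcal{P}_1$ and the reverse $-\mathcal{P}_2$ of the path $\mathcal{P}_2$. Path-independence requires that the line integrals over $\mathcal{P}_1$ and $-\mathcal{P}_2$ are equal:
$$\int_{\mathcal{P}_1} (f\,dx + g\,dy) = \int_{-\mathcal{P}_2} (f\,dx + g\,dy) . \tag{5.25}$$
Applying Eq. (5.22) to the integral over $-\mathcal{P}_2$ and then using Eq. (5.24), we obtain
$$\int_{\mathcal{P}_1} (f\,dx + g\,dy) - \int_{-\mathcal{P}_2} (f\,dx + g\,dy) = \int_{\mathcal{P}_1} (f\,dx + g\,dy) + \int_{\mathcal{P}_2} (f\,dx + g\,dy) = \oint_C (f\,dx + g\,dy) = 0 . \tag{5.26}$$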
Fig. 5.7: $\mathcal{P}_1$ is a path between any two points $A$ and $B$ on $C$ and $\mathcal{P}_2$ is the path from $B$ to $A$ that completes the loop. The closed curve is the sum of these two paths: $C = \mathcal{P}_1 + \mathcal{P}_2$.

This shows that, if the value of a line integral is independent of the path between any initial and final points, the loop integral vanishes for any closed curve. The converse of this statement is also true. If a loop integral vanishes for any closed curve $C$, then we can choose any two points $A$ and $B$ on $C$ as initial and final points of line integrals along the corresponding paths $\mathcal{P}_1$ and $\mathcal{P}_2$. Then, by reversing the steps leading to Eq. (5.26), we find that
$$\int_{\mathcal{P}_1} (f\,dx + g\,dy) = \int_{-\mathcal{P}_2} (f\,dx + g\,dy) , \tag{5.27}$$
which implies path independence. Thus, we have shown that the path independence of a line integral is both necessary [Eq. (5.27)] and sufficient [Eq. (5.26)] for the loop integral to vanish over any closed curve. In other words, these two properties are equivalent:
A line integral
$$\int_{\mathcal{P}} (f\,dx + g\,dy)$$
is independent of the path $\mathcal{P}$ between any two points $i$ and $f$ if and only if
$$\oint_C (f\,dx + g\,dy) = 0$$
for any closed curve $C$.

This result provides an alternative statement of the fact that line integrals fall into two classes: (i) path-dependent and, therefore, typically non-vanishing values over closed curves, and (ii) path-independent and vanishing values over closed curves. Both types of line integral are important in applications to physics, and understanding the physical circumstances that lead to one type of integral or another is a central theme in several disciplines. We conclude this section with two examples.

Example. Consider the loop integral
$$\oint_C y\,dx , \tag{5.28}$$
where $C$ is a circle of radius $a$ centered at $(1, 1)$, as shown in Fig. 5.8. We represent the circle as follows:
$$x = 1 - a\cos\phi , \qquad y = 1 + a\sin\phi , \tag{5.29}$$
where $0 \le \phi < 2\pi$. This parametrization sweeps through the circle in a clockwise direction beginning at $(1 - a, 1)$. The integral in Eq. (5.28) can be expressed as an integral over $\phi$ by using Eq. (5.29) to transform the integrand, the integration element, and the limits of integration. The integrand $y$ is given by the second of Eqs. (5.29), and an application of the chain rule to $x(\phi)$ yields
$$dx = a\sin\phi\,d\phi . \tag{5.30}$$

Fig. 5.8: The circle of radius $a$ centered at $(1, 1)$, showing the definition of $\phi$ for carrying out a loop integral over this curve.

The loop integral then becomes
$$\oint_C y\,dx = \int_0^{2\pi} (1 + a\sin\phi)\,a\sin\phi\,d\phi = a\underbrace{\int_0^{2\pi} \sin\phi\,d\phi}_{=\,0} + a^2 \underbrace{\int_0^{2\pi} \sin^2\phi\,d\phi}_{=\,\pi} = \pi a^2 . \tag{5.31}$$
This is readily identified as the area of the circle enclosed by $C$. This result can be generalized to any closed curve in the $x$-$y$ plane by following the steps shown in Fig. 5.9. We first identify the points $A = (x_A, y_A)$ and $B = (x_B, y_B)$ that allow $C$ to be written as the sum of upper and lower paths $\mathcal{P}_1$ and $\mathcal{P}_2$, which can be represented as functions $y_1(x)$ and $y_2(x)$, respectively.

Fig. 5.9: The evaluation of the loop integral in Eq. (5.28) around an arbitrary closed curve $C$, showing (a) the separation of $C$ into upper and lower paths $\mathcal{P}_1$ and $\mathcal{P}_2$, (b) and (c) the evaluation of the integral along these paths, and (d) the cumulative effect of the loop integral.

The integral along $\mathcal{P}_1$ is
$$\int_{\mathcal{P}_1} y\,dx = \int_{x_A}^{x_B} y_1(x)\,dx . \tag{5.32}$$
This is an ordinary integral whose value is represented by the area bounded by $y_1(x)$, the $x$-axis, and $x = x_A$ and $x = x_B$, as shown in Fig. 5.9(b). The loop is completed by integrating $y_2(x)$ from $x_B$ to $x_A$. The integral
$$\int_{x_A}^{x_B} y_2(x)\,dx \tag{5.33}$$
is represented by the area shown in Fig. 5.9(c). But the integral we need to complete $C$ has the upper and lower limits interchanged, so its value corresponds to the negative of this quantity. Hence, the loop integral is calculated as
$$\oint_C y\,dx = \int_{x_A}^{x_B} y_1(x)\,dx - \int_{x_A}^{x_B} y_2(x)\,dx , \tag{5.34}$$
which is represented in Fig. 5.9(d). The integral over $y_2(x)$ cancels the contribution from the integral over $y_1(x)$ that represents the area below $\mathcal{P}_2$, leaving only the area enclosed by $C$. We have thereby shown that
$$\oint_C y\,dx = A , \tag{5.35}$$
where $A$ is the area enclosed by $C$. We conclude this section with an example of a loop integral that does vanish.

Example. Consider the integral
$$\oint_C (xy^2\,dx + x^2 y\,dy) , \tag{5.36}$$
where $C$ is the closed curve in Fig. 5.10. The integrand is the same as that in the second example in Sec. 5.1. The closed curve is composed of four straight segments, so we will evaluate the loop integral by considering each segment separately. Beginning at $(-1, -1)$, the segments are characterized as follows:
$$\begin{array}{llll}
(-1, -1) \to (-1, 1): & x = -1 & dx = 0 & -1 \le y \le 1 \\
(-1, 1) \to (1, 1): & y = 1 & dy = 0 & -1 \le x \le 1 \\
(1, 1) \to (1, -1): & x = 1 & dx = 0 & -1 \le y \le 1 \\
(1, -1) \to (-1, -1): & y = -1 & dy = 0 & -1 \le x \le 1
\end{array}$$

Fig. 5.10: The closed contour for the integral in Eq. (5.36), a square with corners at $(-1, -1)$, $(-1, 1)$, $(1, 1)$, and $(1, -1)$.

If $dx = 0$ the first term in Eq. (5.36) makes no contribution, while if $dy = 0$, the second term makes no contribution. The integral can therefore be written as (note the upper and lower limits of each integral!)
$$\int_{-1}^{1} y\,dy + \int_{-1}^{1} x\,dx + \int_{1}^{-1} y\,dy + \int_{1}^{-1} x\,dx = 0 . \tag{5.37}$$
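Both loop-integral examples of this section can be checked numerically. The sketch below sums $f\,dx + g\,dy$ along the straight edges of a closed polygon traversed in the order given; the helper `loop_integral`, the radius $a = 0.5$, and the approximation of the circle by a 2000-sided polygon are assumptions made for illustration.

```python
import math

def loop_integral(f, g, vertices, substeps=200):
    """Approximate the loop integral of f dx + g dy around a closed polygon.

    The polygon is traversed in the order of `vertices`, and each edge is
    split into `substeps` pieces evaluated at their midpoints.
    """
    total = 0.0
    count = len(vertices)
    for k in range(count):
        (x0, y0), (x1, y1) = vertices[k], vertices[(k + 1) % count]
        dx, dy = (x1 - x0) / substeps, (y1 - y0) / substeps
        for m in range(substeps):
            x_mid = x0 + (m + 0.5) * dx
            y_mid = y0 + (m + 0.5) * dy
            total += f(x_mid, y_mid) * dx + g(x_mid, y_mid) * dy
    return total

# Square of Fig. 5.10, starting at (-1, -1) and traversed as in the table
square = [(-1.0, -1.0), (-1.0, 1.0), (1.0, 1.0), (1.0, -1.0)]
exact_loop = loop_integral(lambda x, y: x * y**2, lambda x, y: x**2 * y, square)

# Circle of Eq. (5.29), approximated by a many-sided polygon (clockwise)
a, sides = 0.5, 2000
circle = [(1.0 - a * math.cos(2.0 * math.pi * k / sides),
           1.0 + a * math.sin(2.0 * math.pi * k / sides)) for k in range(sides)]
area_loop = loop_integral(lambda x, y: y, lambda x, y: 0.0, circle, substeps=1)

print(exact_loop)                  # compare with Eq. (5.37)
print(area_loop, math.pi * a**2)   # compare with Eq. (5.31)
```

Reversing the vertex order traverses the loop counterclockwise and, by Eq. (5.22), flips the sign of the second result.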
over a path $\mathcal{P}$ between an initial point $(x_i, y_i)$ and a final point $(x_f, y_f)$. The path establishes a relation between $x$ and $y$ that we represent as $y(x)$. This enables us to write the line integral as an integral over $x$ only by following the procedure in Sec. 2.2. We have
$$\int_{\mathcal{P}} f(x, y)\,dx = \int_{x_i}^{x_f} f[x, y(x)]\,dx , \tag{5.39}$$
$$\int_{\mathcal{P}} g(x, y)\,dy = \int_{x_i}^{x_f} g[x, y(x)]\,\frac{dy}{dx}\,dx . \tag{5.40}$$
Thus,
$$\int_{\mathcal{P}} (f\,dx + g\,dy) = \int_{x_i}^{x_f} \left\{ f[x, y(x)] + g[x, y(x)]\,\frac{dy}{dx} \right\} dx . \tag{5.41}$$
The right-hand side of this equation is an ordinary integral over $x_i \le x \le x_f$. Accordingly, if the line integral is path-independent, we can use the Fundamental Theorem of Calculus to write
$$\int_{x_i}^{x_f} \left\{ f[x, y(x)] + g[x, y(x)]\,\frac{dy}{dx} \right\} dx = F(x_f) - F(x_i) , \tag{5.42}$$
where
$$\frac{dF}{dx} = f[x, y(x)] + g[x, y(x)]\,\frac{dy}{dx} . \tag{5.43}$$
By writing $F$ as $F[x, y(x)]$, we also have
$$\frac{dF}{dx} = \frac{\partial F}{\partial x} + \frac{\partial F}{\partial y}\,\frac{dy}{dx} , \tag{5.44}$$
from which we identify
$$\frac{\partial F}{\partial x} = f , \qquad \frac{\partial F}{\partial y} = g . \tag{5.45}$$
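The quantity $F$ is called the potential. On account of Eq. (5.45) we can write the differential of $F$ as
$$dF = \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy = f\,dx + g\,dy , \tag{5.46}$$
in which case the line integral of the quantity on the right-hand side is independent of the path. Such a quantity is called an exact differential. Otherwise, the quantity $f\,dx + g\,dy$ is called an inexact differential and the corresponding line integral is path-dependent. Hence, a line integral of an exact differential can be represented as
$$\int_{\mathcal{P}} (f\,dx + g\,dy) = \int_{\mathcal{P}} dF \tag{5.47}$$
$$= F(x_f, y_f) - F(x_i, y_i) . \tag{5.48}$$
In terms of our original formulation in Sec. 5.1, this equation states that the work done between an initial point $i$ and a final point $f$ is equal to the change in the potential $F$. Equation (5.45) provides a method of testing for the exactness of a differential. By differentiating the first of these equations with respect to $y$,
$$\frac{\partial}{\partial y}\left(\frac{\partial F}{\partial x}\right) = \frac{\partial^2 F}{\partial y\,\partial x} = \frac{\partial f}{\partial y} , \tag{5.49}$$
the second with respect to $x$,
$$\frac{\partial}{\partial x}\left(\frac{\partial F}{\partial y}\right) = \frac{\partial^2 F}{\partial x\,\partial y} = \frac{\partial g}{\partial x} , \tag{5.50}$$
and equating the mixed second partial derivatives of $F$: $F_{yx} = F_{xy}$, we obtain
$$\frac{\partial f}{\partial y} = \frac{\partial g}{\partial x} . \tag{5.51}$$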
The discussion leading to this equation shows that it is a necessary condition for a differential to be exact. The procedure described in Problem 5, Problem Set 5 shows that this is also a sufficient condition for exactness, thus demonstrating the equivalence between Eq. (5.51) and the exactness of a differential.

Example. Consider the line integral
$$\int_{\mathcal{P}} y\,dx , \tag{5.52}$$
which was discussed in an earlier Example in this section. In the notation of Eq. (5.8), $f = y$ and $g = 0$. Thus,
$$\frac{\partial f}{\partial y} = 1 , \qquad \frac{\partial g}{\partial x} = 0 . \tag{5.53}$$
Since these two partial derivatives are unequal, we conclude from Eq. (5.51) that $y\,dx$ is an inexact differential, so the line integral in Eq. (5.52) is path-dependent. This result is to be expected in view of Eq. (5.35). As a second example, we consider the line integral
$$\int_{\mathcal{P}} (xy^2\,dx + x^2 y\,dy) . \tag{5.54}$$
This integral has been discussed in Secs. 5.1 and 5.2. In the notation of Eq. (5.8), $f = xy^2$ and $g = x^2 y$, and we find
$$\frac{\partial f}{\partial y} = 2xy , \qquad \frac{\partial g}{\partial x} = 2xy . \tag{5.55}$$
The equality of these partial derivatives means that $xy^2\,dx + x^2 y\,dy$ is an exact differential, so the line integral in Eq. (5.54) is path-independent. This conclusion confirms our expectations based on the results already obtained for this line integral.
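The test in Eq. (5.51) can also be tried numerically by comparing central-difference estimates of $\partial f/\partial y$ and $\partial g/\partial x$. This is only a sketch: the helper names, the step size, and the sample points are assumptions, and agreement at a handful of points suggests exactness without proving it.

```python
def partial_derivative(func, x, y, variable, h=1e-5):
    """Central-difference estimate of the partial derivative of func(x, y)."""
    if variable == "x":
        return (func(x + h, y) - func(x - h, y)) / (2.0 * h)
    return (func(x, y + h) - func(x, y - h)) / (2.0 * h)

def passes_exactness_test(f, g, points, tolerance=1e-6):
    """Check df/dy == dg/dx, Eq. (5.51), at each sample point."""
    return all(
        abs(partial_derivative(f, x, y, "y") - partial_derivative(g, x, y, "x")) < tolerance
        for x, y in points
    )

sample_points = [(0.5, 1.5), (-1.0, 2.0), (2.0, -0.3)]  # arbitrary test points

# y dx: f = y, g = 0 (Eq. 5.52)
inexact_case = passes_exactness_test(lambda x, y: y, lambda x, y: 0.0, sample_points)

# xy^2 dx + x^2 y dy: f = xy^2, g = x^2 y (Eq. 5.54)
exact_case = passes_exactness_test(lambda x, y: x * y**2, lambda x, y: x**2 * y, sample_points)

print(inexact_case, exact_case)
```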
The exactness of this differential implies that there is an underlying potential function $F$ such that
$$\frac{\partial F}{\partial x} = xy^2 , \qquad \frac{\partial F}{\partial y} = x^2 y . \tag{5.56}$$
Integrating the first of these with respect to $x$ yields
$$F(x, y) = \tfrac{1}{2} x^2 y^2 + h(y) , \tag{5.57}$$
where $h(y)$ is an arbitrary function of $y$ (analogous to constants of integration obtained when integrating functions of one variable). Differentiating both sides of this equation with respect to $y$,
$$\frac{\partial F}{\partial y} = x^2 y + h'(y) , \tag{5.58}$$
and requiring that this result be consistent with the second of Eqs. (5.56) necessitates setting $h = \text{constant}$, so that $h'(y) = 0$. Thus,
$$F(x, y) = \tfrac{1}{2} x^2 y^2 + \text{constant} . \tag{5.59}$$
The constant term disappears upon integration:
$$\int_{\mathcal{P}} (xy^2\,dx + x^2 y\,dy) = \tfrac{1}{2}\left(x_f^2 y_f^2 - x_i^2 y_i^2\right) . \tag{5.60}$$
Thus, the total distance $S$ travelled by the particle, called the arc length of the path, is
$$S = \int_i^f ds = \int_i^f \sqrt{dx^2 + dy^2} . \tag{5.63}$$
The integrand on the right-hand side can be written in a more physically suggestive form as
$$\sqrt{dx^2 + dy^2} = dt\left[\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2\right]^{1/2} = dt\,\sqrt{v_x^2 + v_y^2} = v\,dt , \tag{5.64}$$
where $v_x$ and $v_y$ are the $x$ and $y$ components of the instantaneous velocity of the particle and $v$ is its speed. The arc length along the trajectory can thereby be represented as
$$S = \int_{t_i}^{t_f} v(t)\,dt . \tag{5.65}$$
More generally, the integral
$$\int_{\mathcal{P}} \sqrt{dx^2 + dy^2} \tag{5.66}$$
is used to represent the distance along any curve $\mathcal{P}$.

Example. We will illustrate the methodology of computing the arc length by considering $y = \cosh x$ between $x = 0$ and $x = a$. For $y = \cosh x$, we have
$$dy = \sinh x\,dx , \tag{5.67}$$
so the integrand in Eq. (5.66) becomes
$$\sqrt{dx^2 + dy^2} = \sqrt{dx^2 + \sinh^2 x\,dx^2} = dx\,\sqrt{1 + \sinh^2 x} = \cosh x\,dx . \tag{5.68}$$
Thus,
$$\int_{\mathcal{P}} \sqrt{dx^2 + dy^2} = \int_0^a \cosh x\,dx = \sinh a . \tag{5.69}$$
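The arc-length integral can be checked by adding up the lengths $\sqrt{\Delta x^2 + \Delta y^2}$ of many short chords along the curve, which is the discrete form of $\sqrt{dx^2 + dy^2}$. In this sketch the helper name `polyline_length` and the endpoint $a = 1.5$ are assumptions made for illustration.

```python
import math

def polyline_length(curve, x_start, x_end, segments=10000):
    """Sum of chord lengths sqrt(dx^2 + dy^2) along y = curve(x)."""
    dx = (x_end - x_start) / segments
    total = 0.0
    for k in range(segments):
        x0 = x_start + k * dx
        x1 = x0 + dx
        total += math.hypot(x1 - x0, curve(x1) - curve(x0))
    return total

a = 1.5  # sample endpoint (assumed value)
numeric_length = polyline_length(math.cosh, 0.0, a)

print(numeric_length, math.sinh(a))  # chord sum vs. the closed form sinh(a)
```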
5.5 Summary
We can summarize the main results we have obtained on line integrals by noting that the following statements are equivalent in that any one implies any other. If any one statement is false, all others are false as well.

1. $f\,dx + g\,dy$ is an exact differential;
2. $\displaystyle\int_{\mathcal{P}} (f\,dx + g\,dy)$ is independent of the path $\mathcal{P}$ between fixed endpoints;
3. $\displaystyle\oint_C (f\,dx + g\,dy) = 0$ for any closed curve $C$;
4. $\dfrac{\partial f}{\partial y} = \dfrac{\partial g}{\partial x}$;
5. There is a potential function $F$ such that $F_x = f$ and $F_y = g$, so $dF = f\,dx + g\,dy$;
6. $\displaystyle\int_{\mathcal{P}} (f\,dx + g\,dy) = F(x_f, y_f) - F(x_i, y_i)$ for any initial point $(x_i, y_i)$ and final point $(x_f, y_f)$.
crosses the boundary of that region. The outward unit normal of each face of this region is
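does not contribute to the flux across the boundary. If we denote this vector field by
$$\mathbf{V} = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j} , \tag{6.1}$$
and the outward unit normal of the surface by $\mathbf{n}$, the total flux across the boundary of the region can be expressed as a line integral of the dot product $\mathbf{V}\cdot\mathbf{n}$ over the boundary:
$$\int \mathbf{V}\cdot\mathbf{n}\,ds . \tag{6.2}$$
Note the sign convention used here: positive for flux out of the region and negative for flux into the region. We can now calculate the contribution to this quantity from each of the four sides of the rectangular region. Beginning with the part of the boundary contained between $(x, y)$ and $(x + \Delta x, y)$ and moving counterclockwise, we obtain the following expressions:
$$(\mathbf{V}\cdot\mathbf{n})\,\Delta x = [P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}]\cdot(-\mathbf{j})\,\Delta x = -Q(x, y)\,\Delta x , \tag{6.3}$$
$$(\mathbf{V}\cdot\mathbf{n})\,\Delta y = [P(x + \Delta x, y)\,\mathbf{i} + Q(x + \Delta x, y)\,\mathbf{j}]\cdot\mathbf{i}\,\Delta y = P(x + \Delta x, y)\,\Delta y , \tag{6.4}$$
$$(\mathbf{V}\cdot\mathbf{n})\,\Delta x = [P(x, y + \Delta y)\,\mathbf{i} + Q(x, y + \Delta y)\,\mathbf{j}]\cdot\mathbf{j}\,\Delta x = Q(x, y + \Delta y)\,\Delta x , \tag{6.5}$$
$$(\mathbf{V}\cdot\mathbf{n})\,\Delta y = [P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}]\cdot(-\mathbf{i})\,\Delta y = -P(x, y)\,\Delta y , \tag{6.6}$$
where the signs are due to the directions of the unit normals of each face, which are $-\mathbf{j}$, $\mathbf{i}$, $\mathbf{j}$ and $-\mathbf{i}$, respectively. Note the arguments of $P$ and $Q$ in each term! The variation along each face has been neglected because the lengths of each side will become infinitesimal later in this calculation. Moreover, the reference points for each side have been chosen so that the leading corrections to $\mathbf{V}(x, y)$ are of order $\Delta x\,\Delta y$, rather than $(\Delta x)^2$ or $(\Delta y)^2$, which would vanish in the infinitesimal limit. Neither of these choices is essential for our calculation, but they make the intermediate step much simpler. Upon summing up the contributions to the flux, we obtain
$$\int \mathbf{V}\cdot\mathbf{n}\,ds = \left[P(x + \Delta x, y) - P(x, y)\right]\Delta y + \left[Q(x, y + \Delta y) - Q(x, y)\right]\Delta x$$
$$= \left[\frac{P(x + \Delta x, y) - P(x, y)}{\Delta x} + \frac{Q(x, y + \Delta y) - Q(x, y)}{\Delta y}\right]\Delta x\,\Delta y . \tag{6.7}$$
The expressions within the square brackets on the right-hand side are discrete approximations to partial derivatives of $P$ and $Q$, as given in Eqs. (1.12) and (1.13). Thus, by dividing both sides of this equation by $\Delta x\,\Delta y$ and taking the limit $\Delta x \to 0$ and $\Delta y \to 0$, we obtain
$$\lim_{\Delta x \to 0,\,\Delta y \to 0}\left[\frac{1}{\Delta x\,\Delta y}\int \mathbf{V}\cdot\mathbf{n}\,ds\right] = \lim_{\Delta x \to 0}\left[\frac{P(x + \Delta x, y) - P(x, y)}{\Delta x}\right] + \lim_{\Delta y \to 0}\left[\frac{Q(x, y + \Delta y) - Q(x, y)}{\Delta y}\right] = \frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} . \tag{6.8}$$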
The quantity on the right-hand side of this equation is called the divergence of $\mathbf{V}$. From the way we have arrived at this quantity, the divergence is defined as the flux density across the boundary of an infinitesimal region. This point merits further discussion. We are familiar with other types of densities, such as mass density and charge density. Each of these refers to an extensive quantity in that the total amount of mass or charge within a region is obtained by integrating the corresponding density at each point within the region. The density at any point can be determined by calculating the amount of the quantity within a region surrounding the point, dividing by the volume of the region, and shrinking the volume of the region to that point. Flux is another such quantity, as Eq. (6.8) demonstrates. This is of fundamental significance for the applications of the divergence to physical theories. We can write the divergence of a vector field in a more compact form that has a natural extension to three dimensions by introducing the del operation
$$\nabla = \mathbf{i}\,\frac{\partial}{\partial x} + \mathbf{j}\,\frac{\partial}{\partial y} + \mathbf{k}\,\frac{\partial}{\partial z} . \tag{6.9}$$
By regarding this operation as a vector, we take the dot product with
$$\mathbf{V} = P(x, y, z)\,\mathbf{i} + Q(x, y, z)\,\mathbf{j} + R(x, y, z)\,\mathbf{k} , \tag{6.10}$$
to obtain
$$\nabla\cdot\mathbf{V} = \left(\mathbf{i}\,\frac{\partial}{\partial x} + \mathbf{j}\,\frac{\partial}{\partial y} + \mathbf{k}\,\frac{\partial}{\partial z}\right)\cdot(P\,\mathbf{i} + Q\,\mathbf{j} + R\,\mathbf{k}) = \frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} + \frac{\partial R}{\partial z} . \tag{6.11}$$
The definition of the divergence in Eq. (6.8) can now be written as
$$\nabla\cdot\mathbf{V} = \lim_{A \to 0}\left(\frac{1}{A}\int \mathbf{V}\cdot\mathbf{n}\,ds\right) , \tag{6.12}$$
where we have written $A = \Delta x\,\Delta y$. This definition has several similarities with the definition of the derivative in Eq. (1.1). We used Eq. (1.1) explicitly in arriving at the expression for the divergence, but the similarities run much deeper. The right-hand sides of both equations are expressed in terms of a function evaluated on the boundary of a region [the endpoints of an interval in the definition in Eq. (1.1)], and the corresponding derivatives of those functions are obtained by shrinking this region to zero. These similarities will be extended further when we derive the Fundamental Theorem associated with the divergence. We first consider some examples of vector fields and their divergences.

Example. As our first example, we consider the vector field
$$\mathbf{V} = x\,\mathbf{i} + y\,\mathbf{j} , \tag{6.13}$$
which is shown in a region around the origin in Fig. 6.2 above. The divergence of $\mathbf{V}$ is calculated from the definition in Eq. (6.11), with $P = x$ and $Q = y$:
$$\nabla\cdot\mathbf{V} = \frac{\partial x}{\partial x} + \frac{\partial y}{\partial y} = 1 + 1 = 2 . \tag{6.14}$$
According to our convention, the positive divergence of this vector field indicates that the flux density is directed outward. This is certainly apparent at the origin, but the fact that the divergence is independent of position means that this interpretation is valid for every point in the $x$-$y$ plane, i.e. the flux density is directed outward from every point. Understanding how this conclusion is drawn is central to the concept of what the divergence means. The details of this interpretation are discussed in Classwork 7 for this and other vector fields.

Example. Consider now the vector field
$$\mathbf{V} = x\,\mathbf{i} - y\,\mathbf{j} , \tag{6.15}$$
which is shown in Fig. 6.3 at right. The divergence of $\mathbf{V}$ is calculated from Eq. (6.11), with $P = x$ and $Q = -y$:
$$\nabla\cdot\mathbf{V} = \frac{\partial x}{\partial x} - \frac{\partial y}{\partial y} = 1 - 1 = 0 . \tag{6.16}$$
This zero divergence indicates that there is no net flux density. This is again apparent at the origin, but the fact that the divergence is independent of position means that this interpretation is valid for any point in the $x$-$y$ plane.

Fig. 6.3: Plot of $\mathbf{V} = x\,\mathbf{i} - y\,\mathbf{j}$.

Example. Our final example is the vector field
$$\mathbf{V} = -y\,\mathbf{i} + x\,\mathbf{j} , \tag{6.17}$$
which is shown in Fig. 6.4 above. The divergence of $\mathbf{V}$ is calculated from Eq. (6.11), with $P = -y$ and $Q = x$:
$$\nabla\cdot\mathbf{V} = -\frac{\partial y}{\partial x} + \frac{\partial x}{\partial y} = 0 + 0 = 0 . \tag{6.18}$$
The vanishing divergence of this vector field indicates that there is no net flux density, but here because the vector field appears as a vortex. As in the preceding examples, this conclusion is valid at every point.
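The flux-density definition of the divergence in Eqs. (6.7) and (6.8) can be evaluated directly for the three example fields. In the sketch below, the rectangle size, the sample point, and the helper name `flux_density` are assumptions; the vortex field is taken as $-y\,\mathbf{i} + x\,\mathbf{j}$.

```python
def flux_density(P, Q, x, y, dx=1e-3, dy=1e-3):
    """Net outward flux through a small rectangle at (x, y), divided by its area.

    This is the right-hand side of Eq. (6.7) divided by dx * dy.
    """
    flux = (P(x + dx, y) - P(x, y)) * dy + (Q(x, y + dy) - Q(x, y)) * dx
    return flux / (dx * dy)

fields = {
    "x i + y j": (lambda x, y: x, lambda x, y: y),
    "x i - y j": (lambda x, y: x, lambda x, y: -y),
    "-y i + x j": (lambda x, y: -y, lambda x, y: x),
}

sample_point = (0.3, -0.7)  # arbitrary point; the results should not depend on it
densities = {name: flux_density(P, Q, *sample_point) for name, (P, Q) in fields.items()}

for name, value in densities.items():
    print(name, value)  # compare with Eqs. (6.14), (6.16), and (6.18)
```

Because these fields are linear in $x$ and $y$, the discrete ratio already equals the limiting value apart from rounding; for a general field it would only approach $\partial P/\partial x + \partial Q/\partial y$ as the rectangle shrinks.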
Fig. 6.5: Adjacent regions showing that, where there is a common boundary, indicated by bold lines, the flux from one region exactly cancels the flux into the adjacent region. Only unshared boundaries contribute to the net flux.
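where $\mathbf{n}_i$ represents the outward normal of the $i$th region. Then, summing over each region yields
$$\sum_i \int \mathbf{V}\cdot\mathbf{n}_i\,ds = \sum_i \left(\frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y}\right)_i \Delta A_i . \tag{6.20}$$

Fig. 6.6: (a) Partition of a region in the $x$-$y$ plane bounded by a curve. (b) Cancellation of integrals over adjacent regions, as shown in Fig. 6.5. (c) As the area of the basic regions becomes smaller ($\Delta x \to 0$ and $\Delta y \to 0$), the partitioning in (a) provides a successively more accurate representation of the region, yielding a more accurate representation of the curve surrounding the region.

On the left-hand side, the integrals over shared boundaries cancel, so in the limit $\Delta x \to 0$ and $\Delta y \to 0$ only the integral over the bounding curve remains:
$$\sum_i \int \mathbf{V}\cdot\mathbf{n}_i\,ds \to \oint_C \mathbf{V}\cdot\mathbf{n}\,ds , \tag{6.21}$$
where $C$ is the curve bounding the region. On the right-hand side, the sum becomes a double integral over the region $A$:
$$\sum_i \left(\frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y}\right)_i \Delta A_i \to \iint_A \left(\frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y}\right) dA . \tag{6.22}$$
Thus,
$$\oint_C \mathbf{V}\cdot\mathbf{n}\,ds = \iint_A \left(\frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y}\right) dA , \tag{6.23}$$
which is the divergence theorem in two dimensions. The same arguments apply in three dimensions, and we can write this in a more compact form by using the notation in Eq. (6.11):
$$\oint_C \mathbf{V}\cdot\mathbf{n}\,ds = \iint_A \nabla\cdot\mathbf{V}\,dA , \tag{6.24}$$
where $C$ is the curve bounding the area $A$ in two dimensions, and
$$\iint_S \mathbf{V}\cdot\mathbf{n}\,dS = \iiint_\Omega \nabla\cdot\mathbf{V}\,d\tau , \tag{6.25}$$
where $S$ is the surface bounding the volume $\Omega$ in three dimensions. Note that these equations have the structure of the Fundamental Theorem of Calculus in Eqs. (2.10) and (2.11), which we combine as
$$F(b) - F(a) = \int_a^b \frac{dF}{dx}\,dx . \tag{6.26}$$
The right-hand sides of all these equations involve the integral of the derivative of a function over the interior of a region, while the left-hand sides involve the evaluation of the function over the boundary of that region.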
Example. We will verify the divergence theorem in Eq. (6.24) for the vector field
$$\mathbf{V} = x\,\mathbf{i} + y\,\mathbf{j} \tag{6.27}$$
over the region of the $x$-$y$ plane given by $0 \le x \le 1$ and $0 \le y \le 1$, as depicted in Fig. 6.7. The divergence of $\mathbf{V}$ was calculated in the Example in the preceding section, $\nabla\cdot\mathbf{V} = 2$, so the right-hand side of the divergence theorem can be evaluated immediately:
$$\iint \nabla\cdot\mathbf{V}\,dA = \int_0^1\!\int_0^1 2\,dx\,dy = 2 . \tag{6.28}$$

Fig. 6.7: Plot of $\mathbf{V} = x\,\mathbf{i} + y\,\mathbf{j}$ in the region $0 \le x \le 1$ and $0 \le y \le 1$ of the $x$-$y$ plane, shown shaded with emboldened boundaries.

The left-hand side is evaluated in an analogous manner to that used to obtain Eq. (6.8). Beginning with the segment along the $x$-axis and proceeding in a counterclockwise direction, we have
$$\mathbf{V}\cdot\mathbf{n} = (x\,\mathbf{i} + y\,\mathbf{j})\cdot(-\mathbf{j}) = -y , \tag{6.29}$$
$$\mathbf{V}\cdot\mathbf{n} = (x\,\mathbf{i} + y\,\mathbf{j})\cdot\mathbf{i} = x , \tag{6.30}$$
$$\mathbf{V}\cdot\mathbf{n} = (x\,\mathbf{i} + y\,\mathbf{j})\cdot\mathbf{j} = y , \tag{6.31}$$
$$\mathbf{V}\cdot\mathbf{n} = (x\,\mathbf{i} + y\,\mathbf{j})\cdot(-\mathbf{i}) = -x . \tag{6.32}$$
Along these four segments, we have, respectively, that $y = 0$, $x = 1$, $y = 1$, and $x = 0$. Thus, only the expressions in Eqs. (6.30) and (6.31) have nonzero contributions. This can be understood from Fig. 6.7 because $\mathbf{V}$ is seen to have only components parallel to the boundaries that lie along the $x$- and $y$-axes. Hence, there is no flux of $\mathbf{V}$ across these boundaries. We thereby obtain
$$\oint \mathbf{V}\cdot\mathbf{n}\,ds = \int_0^1 x\Big|_{x=1}\,dy + \int_0^1 y\Big|_{y=1}\,dx = 1 + 1 = 2 , \tag{6.33}$$
in agreement with Eq. (6.28).
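The verification above can be repeated numerically. The following Python sketch computes the outward flux through the four edges of the unit square and, separately, the integral of the divergence over its interior; the helper names, the grid sizes, and the use of central differences for the partial derivatives are assumptions made for this illustration.

```python
def boundary_flux(P, Q, steps=1000):
    """Outward flux of V = P i + Q j through the boundary of the unit square."""
    h = 1.0 / steps
    flux = 0.0
    for k in range(steps):
        s = (k + 0.5) * h
        flux += -Q(s, 0.0) * h   # bottom edge, n = -j
        flux += P(1.0, s) * h    # right edge,  n = +i
        flux += Q(s, 1.0) * h    # top edge,    n = +j
        flux += -P(0.0, s) * h   # left edge,   n = -i
    return flux

def divergence_integral(P, Q, steps=200, eps=1e-5):
    """Integral of dP/dx + dQ/dy over the unit square (midpoint rule)."""
    h = 1.0 / steps
    total = 0.0
    for a in range(steps):
        x = (a + 0.5) * h
        for b in range(steps):
            y = (b + 0.5) * h
            divergence = ((P(x + eps, y) - P(x - eps, y)) / (2.0 * eps)
                          + (Q(x, y + eps) - Q(x, y - eps)) / (2.0 * eps))
            total += divergence * h * h
    return total

def P(x, y):
    return x  # x-component of V in Eq. (6.27)

def Q(x, y):
    return y  # y-component of V in Eq. (6.27)

lhs = boundary_flux(P, Q)
rhs = divergence_integral(P, Q)
print(lhs, rhs)  # compare with Eqs. (6.33) and (6.28)
```

Swapping in other smooth components for `P` and `Q` gives a quick way to test the theorem on fields whose divergence is not constant.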
Example. We now turn our attention to the divergence theorem in three dimensions in Eq. (6.25). Consider the flux of the vector field
$$\mathbf{V} = y\,\mathbf{i} - x\,\mathbf{j} + z\,\mathbf{k} \tag{6.34}$$
through the surface of the unit sphere, $x^2 + y^2 + z^2 = 1$ (Fig. 6.8). To evaluate the right-hand side of Eq. (6.25), we first calculate the divergence of $\mathbf{V}$. Using Eq. (6.11) with $P = y$, $Q = -x$, and $R = z$, we obtain
$$\nabla\cdot\mathbf{V} = \frac{\partial y}{\partial x} - \frac{\partial x}{\partial y} + \frac{\partial z}{\partial z} = 0 + 0 + 1 = 1 . \tag{6.35}$$
The integral of this quantity over the volume of the sphere can be carried out either by inspection or explicitly in spherical polar coordinates:
$$\iiint \nabla\cdot\mathbf{V}\,d\tau = \int_0^1 r^2\,dr \int_0^{2\pi} d\phi \int_0^{\pi} \sin\theta\,d\theta = \tfrac{1}{3}\cdot 2\pi\cdot 2 = \frac{4\pi}{3} , \tag{6.36}$$
which is just the volume of the unit sphere. The evaluation of the left-hand side of Eq. (6.25) requires determining the dot product $\mathbf{V}\cdot\mathbf{n}$ over the surface of the unit sphere. The outward unit normal is obtained from the gradient
$$\nabla(x^2 + y^2 + z^2) = 2x\,\mathbf{i} + 2y\,\mathbf{j} + 2z\,\mathbf{k} . \tag{6.37}$$
The length of this vector is
$$|\nabla(x^2 + y^2 + z^2)| = \sqrt{4x^2 + 4y^2 + 4z^2} = 2\sqrt{x^2 + y^2 + z^2} = 2 \tag{6.38}$$
on the surface of the unit sphere (where $x^2 + y^2 + z^2 = 1$). Hence,
$$\mathbf{n} = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k} , \tag{6.39}$$
so
$$\mathbf{V}\cdot\mathbf{n} = (y\,\mathbf{i} - x\,\mathbf{j} + z\,\mathbf{k})\cdot(x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}) = xy - xy + z^2 = z^2 . \tag{6.40}$$

Fig. 6.8: The vector field in Eq. (6.34) evaluated at the surface of the unit sphere.

The integral of this quantity over the surface of the unit sphere is carried out in spherical polar coordinates, with $z^2 = \cos^2\theta$:
$$\iint \mathbf{V}\cdot\mathbf{n}\,dS = \underbrace{\int_0^{2\pi} d\phi}_{=\,2\pi}\;\underbrace{\int_0^{\pi} \sin\theta\cos^2\theta\,d\theta}_{=\,-\frac{1}{3}\cos^3\theta\,\big|_0^{\pi}\,=\,\frac{2}{3}} = \frac{4\pi}{3} , \tag{6.41}$$
in agreement with Eq. (6.36).
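The surface integral of $\mathbf{V}\cdot\mathbf{n}$ over the unit sphere can be checked with a midpoint rule in the polar and azimuthal angles. In this sketch the helper name `sphere_flux` and the grid size are assumptions; the outward normal is taken as $\mathbf{n} = (x, y, z)$ and the area element as $\sin\theta\,d\theta\,d\phi$.

```python
import math

def sphere_flux(field, steps=200):
    """Flux of a vector field through the unit sphere.

    Uses n = (x, y, z) and dS = sin(theta) dtheta dphi on the unit sphere.
    """
    d_theta = math.pi / steps
    d_phi = 2.0 * math.pi / steps
    total = 0.0
    for a in range(steps):
        theta = (a + 0.5) * d_theta
        for b in range(steps):
            phi = (b + 0.5) * d_phi
            x = math.sin(theta) * math.cos(phi)
            y = math.sin(theta) * math.sin(phi)
            z = math.cos(theta)
            vx, vy, vz = field(x, y, z)
            total += (vx * x + vy * y + vz * z) * math.sin(theta) * d_theta * d_phi
    return total

# V = y i - x j + z k, as in Eq. (6.34)
flux = sphere_flux(lambda x, y, z: (y, -x, z))

print(flux, 4.0 * math.pi / 3.0)  # numerical flux vs. the volume of the unit sphere
```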
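i.e. $\mathbf{V}$ is the gradient of a scalar function $\Phi(r)$ and $r = (x^2 + y^2 + z^2)^{1/2}$ is the usual radial variable that measures the distance of the point $(x, y, z)$ to the origin. We will evaluate the left-hand side of Eq. (6.25) for a spherical surface of radius $R$, which means that we need to determine $\mathbf{V}$ and $\mathbf{n}$ over this surface. The equation of this surface is $x^2 + y^2 + z^2 = R^2$, so the outward unit normal is obtained by taking the gradient of this expression and normalizing the resulting vector, as in the steps leading to Eq. (6.39):
$$\mathbf{n} = \frac{x}{R}\,\mathbf{i} + \frac{y}{R}\,\mathbf{j} + \frac{z}{R}\,\mathbf{k} . \tag{6.43}$$
To calculate $\mathbf{V}$, we use the chain rule to obtain
$$\nabla\Phi(r) = \frac{\partial\Phi}{\partial x}\,\mathbf{i} + \frac{\partial\Phi}{\partial y}\,\mathbf{j} + \frac{\partial\Phi}{\partial z}\,\mathbf{k} = \frac{d\Phi}{dr}\frac{\partial r}{\partial x}\,\mathbf{i} + \frac{d\Phi}{dr}\frac{\partial r}{\partial y}\,\mathbf{j} + \frac{d\Phi}{dr}\frac{\partial r}{\partial z}\,\mathbf{k} = \frac{d\Phi}{dr}\left(\frac{x}{r}\,\mathbf{i} + \frac{y}{r}\,\mathbf{j} + \frac{z}{r}\,\mathbf{k}\right) . \tag{6.44}$$
On the surface of the sphere, $r = R$ and this expression reduces to
$$\nabla\Phi(r) = \frac{d\Phi}{dr}\bigg|_{r=R}\left(\frac{x}{R}\,\mathbf{i} + \frac{y}{R}\,\mathbf{j} + \frac{z}{R}\,\mathbf{k}\right) , \tag{6.45}$$
so
$$\mathbf{V}\cdot\mathbf{n} = \nabla\Phi(r)\cdot\mathbf{n} = \frac{d\Phi}{dr}\bigg|_{r=R}\left(\frac{x^2 + y^2 + z^2}{R^2}\right) = \frac{d\Phi}{dr}\bigg|_{r=R} , \tag{6.46}$$
which, since both $\mathbf{V}$ and $\mathbf{n}$ are radial vector fields, is a constant on the surface of the sphere. Thus, integrating this quantity over the surface of the sphere of radius $R$ yields
$$\iint \mathbf{V}\cdot\mathbf{n}\,dS = 4\pi R^2\,\frac{d\Phi}{dr}\bigg|_{r=R} . \tag{6.47}$$
We now observe that, if we specialize our choice of $\Phi$ to $\Phi(r) = A/r$, where $A$ is any constant, then $d\Phi/dr = -A/r^2$ and
$$\iint \mathbf{V}\cdot\mathbf{n}\,dS = -4\pi A , \tag{6.48}$$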
i.e. independent of the radius of the sphere! This result has several important consequences. Figure 6.9 shows a two-dimensional depiction that we will use in the following discussion. Figure 6.9(a) shows a sphere of any radius. Since Eq. (6.48) is independent of the radius, the value of this integral is unaffected by any deformation of this sphere as long as the
Fig. 6.9: Schematic depiction in two dimensions of Gauss' law for surfaces that enclose the origin, with (a) a spherical surface, (b) a deformed spherical surface, which leaves the value of Eq. (6.48) unaffected, and (c) a general surface that can be represented by the construction in (b).
Fig. 6.10: Schematic depiction in two dimensions of Gauss' law for surfaces that do not enclose the origin, with (a) a deformed surface with spherical sections, and (b) a general surface that can be represented by the construction in (a).
resulting surface contains the origin. One such deformation is shown in Fig. 6.9(b). There is no flux through the radial planes because the vector field is radial, so only the spherical portions contribute to the flux. Any section within a fixed subtended angle can be moved to any radius with no effect on the flux through that section. Hence, since any surface can be decomposed into such radial and spherical sections, the result in Eq. (6.48) will be obtained for any surface that encloses the origin, as shown in Fig. 6.9(c).

An altogether different result is obtained if the surface does not contain the origin. This situation is depicted in Fig. 6.10. The surface in Fig. 6.10(a) is composed of spherical sections. Since this surface surrounds a region that excludes the origin, the flux into the volume exactly cancels the flux out of the volume because every spherical section which admits flux has a corresponding region that expels
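the same amount of flux. Hence, the flux integral over such a surface vanishes! Figure 6.10(b) shows a smooth surface that can be decomposed into sections as in Fig. 6.10(a). Figures 6.9 and 6.10 summarize the essence of Gauss' law.

One of the most far-reaching applications of Gauss' law is to electrostatics, where the function $\phi(r)$ represents the electrostatic potential of a charge q located at the origin:

$$\phi(r) = \frac{q}{4\pi\epsilon_0 r}, \tag{6.49}$$

where $\epsilon_0$ is the permittivity of free space. The associated electric field E is then given by

$$\mathbf{E} = -\nabla\phi = \frac{q}{4\pi\epsilon_0 r^2}\,\hat{\mathbf{r}}, \tag{6.50}$$

where $\hat{\mathbf{r}} = (x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k})/r$ is the radial unit vector that appears in Eq. (6.44), and Gauss' law reads

$$\iiint_\tau \nabla\cdot\mathbf{E}\,d\tau = \iint_\sigma \mathbf{E}\cdot\mathbf{n}\,d\sigma = \frac{q}{\epsilon_0}, \tag{6.51}$$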
if the surface $\sigma$ encloses the origin. More generally, q on the right-hand side is the total charge enclosed by $\sigma$, i.e. the total charge contained within the volume $\tau$. Gauss' law results from the divergence theorem applied to Coulomb's law and leads to one of the four Maxwell equations, which govern the behavior of all electromagnetic phenomena.
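The surface-independence argument of Figs. 6.9 and 6.10 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes φ(r) = A/r with A = 1, replaces the sphere by an axis-aligned cube (a choice made here for convenience, not taken from the notes), and estimates the outward flux of V = ∇φ with a midpoint rule. The names `field` and `flux_through_cube` are hypothetical.

```python
import math

A = 1.0  # arbitrary constant in phi(r) = A / r (assumed value for illustration)

def field(x, y, z):
    """V = grad(A / r) = -A * (x, y, z) / r**3."""
    r3 = (x * x + y * y + z * z) ** 1.5
    return (-A * x / r3, -A * y / r3, -A * z / r3)

def flux_through_cube(center, half_side=1.0, n=100):
    """Midpoint-rule estimate of the outward flux of V through an axis-aligned cube."""
    h = 2.0 * half_side / n
    total = 0.0
    for axis in range(3):            # faces perpendicular to x, y, z in turn
        for sign in (-1.0, 1.0):     # outward normal is sign * (unit vector along axis)
            for i in range(n):
                u = -half_side + (i + 0.5) * h
                for j in range(n):
                    w = -half_side + (j + 0.5) * h
                    offset = [0.0, 0.0, 0.0]
                    offset[axis] = sign * half_side
                    offset[(axis + 1) % 3] = u
                    offset[(axis + 2) % 3] = w
                    v = field(center[0] + offset[0],
                              center[1] + offset[1],
                              center[2] + offset[2])
                    total += sign * v[axis] * h * h
    return total

flux_enclosing = flux_through_cube((0.0, 0.0, 0.0))  # cube contains the origin
flux_excluding = flux_through_cube((3.0, 0.0, 0.0))  # cube does not contain the origin
print(flux_enclosing, -4.0 * math.pi * A, flux_excluding)
```

The first value should be close to -4πA, as in Eq. (6.48), and the second close to zero, in line with the cancellation argument for surfaces that exclude the origin.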
6.4 Summary
This chapter has introduced the divergence of a vector field V = P(x, y, z) i + Q(x, y, z) j + R(x, y, z) k:

$$\nabla\cdot\mathbf{V} = \frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} + \frac{\partial R}{\partial z}. \tag{6.52}$$

The divergence represents the flux density of the vector field and, because of the derivative operation, has an associated Fundamental Theorem of Calculus called the divergence theorem:

$$\iint_\sigma \mathbf{V}\cdot\mathbf{n}\,d\sigma = \iiint_\tau \nabla\cdot\mathbf{V}\,d\tau, \tag{6.53}$$

for a surface $\sigma$ surrounding a volume $\tau$.
in the counterclockwise direction. The integrand of this line integral represents the projection of V along the integration path, so a positive (resp., negative) value of
Fig. 7.1: The region of area $\Delta x\,\Delta y$, with corners $(x, y)$, $(x+\Delta x, y)$, $(x+\Delta x, y+\Delta y)$ and $(x, y+\Delta y)$, used to calculate the curl of a vector field. The projections of the components of the vector field onto the directions around this region are indicated by arrows.
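the integral implies that V has a positive (resp., negative) circulation. The direction of positive circulation is simply a matter of convention. If the line integral vanishes, V has no circulation in the region. The line integral in Eq. (7.2) is composed of four segments:

$$\oint_{\partial(\Delta A)} (P\,dx + Q\,dy) = \int_x^{x+\Delta x} P(x', y)\,dx' + \int_{x+\Delta x}^{x} P(x', y+\Delta y)\,dx' + \int_y^{y+\Delta y} Q(x+\Delta x, y')\,dy' + \int_{y+\Delta y}^{y} Q(x, y')\,dy'. \tag{7.3}$$

Since

$$\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx, \tag{7.4}$$

we can combine terms on the right-hand side of Eq. (7.3) to obtain

$$\oint_{\partial(\Delta A)} (P\,dx + Q\,dy) = \int_x^{x+\Delta x} \left[P(x', y) - P(x', y+\Delta y)\right] dx' + \int_y^{y+\Delta y} \left[Q(x+\Delta x, y') - Q(x, y')\right] dy'. \tag{7.5}$$

Since the limits $\Delta x \to 0$ and $\Delta y \to 0$ are to be taken, we can regard P and Q as constant over their intervals of integration. Our line integral then becomes

$$\oint_{\partial(\Delta A)} (P\,dx + Q\,dy) = \left[P(x, y) - P(x, y+\Delta y)\right]\Delta x + \left[Q(x+\Delta x, y) - Q(x, y)\right]\Delta y. \tag{7.6}$$

We now divide both sides of this equation by the area $\Delta x\,\Delta y$,

$$\frac{1}{\Delta x\,\Delta y}\oint_{\partial(\Delta A)} (P\,dx + Q\,dy) = \frac{P(x, y) - P(x, y+\Delta y)}{\Delta y} + \frac{Q(x+\Delta x, y) - Q(x, y)}{\Delta x}, \tag{7.7}$$

and take the limits $\Delta x \to 0$ and $\Delta y \to 0$. The right-hand side can be evaluated by using the definitions in Eqs. (1.12) and (1.13):

$$\lim_{\Delta y\to 0}\left[\frac{P(x, y) - P(x, y+\Delta y)}{\Delta y}\right] = -\frac{\partial P}{\partial y}, \tag{7.8}$$

$$\lim_{\Delta x\to 0}\left[\frac{Q(x+\Delta x, y) - Q(x, y)}{\Delta x}\right] = \frac{\partial Q}{\partial x}. \tag{7.9}$$

Thus, we obtain

$$\lim_{\Delta x\to 0,\,\Delta y\to 0}\left[\frac{1}{\Delta x\,\Delta y}\oint_{\partial(\Delta A)} (P\,dx + Q\,dy)\right] = \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}. \tag{7.10}$$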
This is the definition of the curl. It represents the circulation of a vector field around an infinitesimal area at (x, y). Notice that in deriving this quantity, we have used only the components along dr, whereas the corresponding derivation of the divergence in Sec. 6.1 used the normal components. This provides an intuitive basis for understanding why the divergence and curl uniquely specify a vector field.

Example. We calculate the curls of the vector fields in the examples in Sec. 6.1. Consider

$$\mathbf{V} = x\,\mathbf{i} + y\,\mathbf{j}, \tag{7.11}$$

which is shown in Fig. 7.2. This is a radial vector field with a divergence that was calculated as $\nabla\cdot\mathbf{V} = 2$. To calculate the curl of this vector field, we apply the definition in Eq. (7.10) with P = x and Q = y to obtain

$$\frac{\partial y}{\partial x} - \frac{\partial x}{\partial y} = 0 - 0 = 0. \tag{7.12}$$

Thus, V is specified completely by its divergence. These conclusions are valid for any radial vector field (Problem Set 9).
Next, consider

$$\mathbf{V} = x\,\mathbf{i} - y\,\mathbf{j}, \tag{7.13}$$

which is shown in Fig. 7.3. This vector field was found in Sec. 6.1 to have a divergence that vanishes: $\nabla\cdot\mathbf{V} = 0$. With P = x and Q = -y, the curl of V is

$$\frac{\partial(-y)}{\partial x} - \frac{\partial x}{\partial y} = 0 - 0 = 0, \tag{7.14}$$

which is also zero! Thus, both the divergence and curl vanish for this vector field. This example shows that, even if the curl and divergence of a vector field are both zero, the vector field itself need not reduce to a constant everywhere. However, if we further stipulate that a vector field must vanish at infinity, then a vanishing divergence and curl do indeed imply that V = 0. This discussion has important implications for the governing equations of electric and magnetic fields in electromagnetism (Maxwell's equations).
Finally, consider

$$\mathbf{V} = -y\,\mathbf{i} + x\,\mathbf{j}, \tag{7.15}$$

which is shown in Fig. 7.4. With P = -y and Q = x, the curl is calculated as

$$\frac{\partial x}{\partial x} - \frac{\partial(-y)}{\partial y} = 1 + 1 = 2, \tag{7.16}$$
so this vector field has a positive circulation according to our convention. Indeed, the plot of the vector field in Fig. 7.4 is suggestive of a circulation around the origin. But, as in the discussion in Sec. 6.1, the fact that the curls of the vector fields in this example are constants means that any interpretation assigned to them must be valid for every point in the x-y plane. This point is taken up in Problem Set 9.
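The limiting definition in Eq. (7.10) can be tried out directly on the three fields of this example. The following is a rough sketch, assuming a small but finite rectangle in place of the limit and a midpoint rule on each edge; the helper name `circulation_density` and the sample point are illustrative choices.

```python
def circulation_density(P, Q, x, y, dx=1e-3, dy=1e-3, n=50):
    """Counterclockwise line integral of P dx + Q dy around the rectangle with
    corners (x, y) and (x + dx, y + dy), divided by the area dx * dy."""
    hx, hy = dx / n, dy / n
    total = 0.0
    for k in range(n):
        xm = x + (k + 0.5) * hx
        ym = y + (k + 0.5) * hy
        total += P(xm, y) * hx        # bottom edge, traversed left to right
        total += Q(x + dx, ym) * hy   # right edge, traversed upward
        total -= P(xm, y + dy) * hx   # top edge, traversed right to left
        total -= Q(x, ym) * hy        # left edge, traversed downward
    return total / (dx * dy)

fields = {
    "x i + y j": (lambda x, y: x, lambda x, y: y),
    "x i - y j": (lambda x, y: x, lambda x, y: -y),
    "-y i + x j": (lambda x, y: -y, lambda x, y: x),
}

# The sample point is arbitrary: the curls of these fields are constants.
curls = {name: circulation_density(P, Q, 0.3, -0.7) for name, (P, Q) in fields.items()}
for name, value in curls.items():
    print(name, value)
```

For these linear fields the estimates should reproduce Eqs. (7.12), (7.14) and (7.16) up to rounding; for a general field the result depends on the point and only approaches the curl as the rectangle shrinks.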
Fig. 7.5: (a) Partition of a region in the x-y plane bounded by a curve. (b) Cancellation of line integrals over adjacent regions. (c) As the area of the basic regions becomes smaller ($\Delta x \to 0$ and $\Delta y \to 0$), the partitioning in (a) provides a successively more accurate representation of the region, yielding a more accurate representation of the curve surrounding the region.
The left-hand side of this equation is the line integral over the perimeter of the region $\Delta A$ and the right-hand side is the circulation density (i.e. the curl) multiplied by the area $\Delta A$ of the region, which yields the total circulation within the region. Now, any region in the x-y plane can be partitioned into such contiguous elemental rectangular regions, as shown in Fig. 7.5(a). For each such region we can carry out the indicated calculations in Eq. (7.17). Thus, the corresponding quantities calculated for the entire region A are obtained by summing over the elemental regions:

$$\sum_i \oint_{\partial(\Delta A_i)} (P\,dx + Q\,dy) = \sum_i \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right)\Delta A_i, \tag{7.18}$$

where $\Delta A_i$ is the area of the ith elemental region. This is depicted in Fig. 7.5(b). For extended regions $\Delta A_i$, the partitioning does not give an accurate representation of either the interior or the boundary, but improves as $\Delta A_i$ decreases. Thus, by taking $\Delta A_i \to 0$, we can obtain an exact relation between the line integral over the perimeter of the region and the curl within the region.

We consider the left-hand side of Eq. (7.18) first. As Figs. 7.5(b,c) show, the sum of the line integrals over the elemental rectangular regions contains only those segments that have no neighboring elemental regions; the interior line integrals cancel pairwise. Thus,

$$\lim_{\Delta A_i\to 0}\left[\sum_i \oint_{\partial(\Delta A_i)} (P\,dx + Q\,dy)\right] = \oint_{\partial A} (P\,dx + Q\,dy), \tag{7.19}$$

where $\partial A$ is the perimeter of the region A in the x-y plane and the integral is taken in the counterclockwise direction.
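The right-hand side of Eq. (7.18) is a two-dimensional Riemann sum (cf. Eq. 2.1). In particular, as $\Delta A_i \to 0$, the summation becomes an integral over the interior of the region A:

$$\lim_{\Delta A_i\to 0}\left[\sum_i \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right)\Delta A_i\right] = \iint_A \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dA. \tag{7.20}$$

Thus, the results in Eqs. (7.19) and (7.20) show that the limiting form of Eq. (7.18) obtained as $\Delta A_i \to 0$ is

$$\oint_{\partial A} (P\,dx + Q\,dy) = \iint_A \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\,dy, \tag{7.21}$$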
which is known as Green's theorem. This equation has the structure of the Fundamental Theorem of Calculus in that the integral of a derivative of a quantity (the curl) over the interior of a region is equal to that quantity evaluated on the boundary of the region.

Example. Consider the vector field

$$\mathbf{V} = -y\,\mathbf{i} + x\,\mathbf{j}, \tag{7.22}$$

which is shown in Fig. 6.4, and suppose that the area in the x-y plane is a circle of radius R. With P = -y and Q = x, the left-hand side of Eq. (7.21) is

$$\oint_{\partial A} (P\,dx + Q\,dy) = \oint_{\partial A} (x\,dy - y\,dx). \tag{7.23}$$

In circular polar coordinates, $x = R\cos\phi$, $y = R\sin\phi$, where $0 \le \phi < 2\pi$,

$$dx = -R\sin\phi\,d\phi, \qquad dy = R\cos\phi\,d\phi,$$

so the integral becomes

$$\oint_{\partial A} (x\,dy - y\,dx) = R^2\int_0^{2\pi} (\cos^2\phi + \sin^2\phi)\,d\phi = R^2\int_0^{2\pi} d\phi = 2\pi R^2. \tag{7.25}$$

To evaluate the right-hand side of Eq. (7.21), we have that the curl is

$$\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} = 1 - (-1) = 2, \tag{7.26}$$

so

$$\iint_A \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\,dy = 2\int_0^R r\,dr\int_0^{2\pi} d\phi = 2\cdot\frac{1}{2}R^2\cdot 2\pi = 2\pi R^2, \tag{7.27}$$
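which agrees with Eq. (7.25).

Example. Consider the vector field

$$\mathbf{V} = x\,\mathbf{i} + y\,\mathbf{j}, \tag{7.28}$$

which is shown in Fig. 7.2. With P = x and Q = y, the left-hand side of Eq. (7.21) is

$$\oint_{\partial A} (P\,dx + Q\,dy) = \oint_{\partial A} (x\,dx + y\,dy). \tag{7.29}$$

The curl of this vector field is

$$\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} = 0 - 0 = 0, \tag{7.30}$$

so Green's theorem yields

$$\oint_{\partial A} (x\,dx + y\,dy) = 0 \tag{7.31}$$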
for any area A! This ostensibly surprising result is, in fact, to be expected from the discussion in Sec. 5.2, where we showed that the value of a line integral is independent of the path if and only if the integral over any closed curve vanishes. Indeed, we showed in Sec. 5.3 that the condition for the value of a line integral,

$$\oint_{\partial A} (P\,dx + Q\,dy), \tag{7.32}$$

to vanish for every closed curve is $\partial Q/\partial x = \partial P/\partial y$, i.e. the vanishing curl of the vector field V = P i + Q j. The three-dimensional generalization of this result will be derived in the next section.
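Both examples of Green's theorem can be checked numerically on a disc of radius R. The sketch below is illustrative only: it assumes a midpoint rule in polar coordinates and estimates the partial derivatives by central differences, and the function names, the value R = 2 and the step sizes are choices made here rather than taken from the notes.

```python
import math

def boundary_integral(P, Q, R, n=2000):
    """Counterclockwise line integral of P dx + Q dy around the circle of radius R."""
    d_phi = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        phi = (k + 0.5) * d_phi
        x, y = R * math.cos(phi), R * math.sin(phi)
        dx = -R * math.sin(phi) * d_phi
        dy = R * math.cos(phi) * d_phi
        total += P(x, y) * dx + Q(x, y) * dy
    return total

def curl_integral(P, Q, R, n_r=100, n_phi=100, h=1e-5):
    """Double integral of dQ/dx - dP/dy over the disc, with the derivatives
    estimated by central differences of step h."""
    d_r, d_phi = R / n_r, 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * d_r
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            x, y = r * math.cos(phi), r * math.sin(phi)
            dQ_dx = (Q(x + h, y) - Q(x - h, y)) / (2.0 * h)
            dP_dy = (P(x, y + h) - P(x, y - h)) / (2.0 * h)
            total += (dQ_dx - dP_dy) * r * d_r * d_phi  # area element r dr dphi
    return total

R = 2.0
rotational = (lambda x, y: -y, lambda x, y: x)  # V = -y i + x j, Eq. (7.22)
radial = (lambda x, y: x, lambda x, y: y)       # V = x i + y j, Eq. (7.28)

lhs_rot, rhs_rot = boundary_integral(*rotational, R), curl_integral(*rotational, R)
lhs_rad, rhs_rad = boundary_integral(*radial, R), curl_integral(*radial, R)
print(lhs_rot, rhs_rot, 2.0 * math.pi * R ** 2)
print(lhs_rad, rhs_rad)
```

For the rotational field both sides should be close to 2πR², as in Eqs. (7.25) and (7.27), while for the radial field both should vanish up to rounding.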
7.3.1
The work done along a three-dimensional path $\mathcal{P}$ is obtained by calculating the component of V projected along the direction of the path. With the position vector given by r = x i + y j + z k, we have

$$\int_{\mathcal{P}} \mathbf{V}\cdot d\mathbf{r} = \int_{\mathcal{P}} (P\,dx + Q\,dy + R\,dz). \tag{7.35}$$
This is the generalization of the left-hand side of Eq. (7.21) to three dimensions.
7.3.2
The curl derived in Sec. 7.1 is endowed with a sign in that the counterclockwise direction of the circulation is taken as positive by convention. But when this construction is extended to three dimensions, the concept of clockwise versus counterclockwise is not precise enough to identify the direction of circulation. For example, the counterclockwise direction observed by looking down onto the x-y plane from the positive z-axis appears as the clockwise direction when looking up to the x-y plane from the negative z-axis. This ambiguity can be alleviated by using the right-hand rule to assign an orientation to positive (i.e. counterclockwise) circulation: when the fingers of your right hand bend in the counterclockwise direction, your thumb points in the direction of the positive z-axis.

Fig. 7.6: The curl vector according to the right-hand rule.

Thus, we can write the curl of a vector V = P i + Q j as

$$\left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right)\mathbf{k}. \tag{7.36}$$

We can represent this expression in a more suggestive form by utilizing the definition of the del operation,

$$\nabla = \mathbf{i}\,\frac{\partial}{\partial x} + \mathbf{j}\,\frac{\partial}{\partial y} + \mathbf{k}\,\frac{\partial}{\partial z}, \tag{7.37}$$
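to represent the curl of a vector as the cross product between this operation and V. The calculation of this quantity proceeds in direct analogy with the representation of the cross product of two ordinary vectors as a determinant:

$$\nabla\times\mathbf{V} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \partial_x & \partial_y & 0 \\ P & Q & 0 \end{vmatrix} = \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right)\mathbf{k}, \tag{7.38}$$

which is the same as Eq. (7.36). To convert this to a scalar quantity, we take the dot product of this quantity with k:

$$(\nabla\times\mathbf{V})\cdot\mathbf{k} = \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}. \tag{7.39}$$

Thus, combining Eqs. (7.35) and (7.39), Green's theorem can be written as

$$\oint_{\partial A} \mathbf{V}\cdot d\mathbf{r} = \iint_A (\nabla\times\mathbf{V})\cdot\mathbf{k}\,dx\,dy. \tag{7.40}$$

For a three-dimensional vector field V = P i + Q j + R k, the same determinant construction gives the curl as

$$\nabla\times\mathbf{V} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \partial_x & \partial_y & \partial_z \\ P & Q & R \end{vmatrix} = \left(\frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z}\right)\mathbf{i} + \left(\frac{\partial P}{\partial z} - \frac{\partial R}{\partial x}\right)\mathbf{j} + \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right)\mathbf{k}. \tag{7.41}$$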
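The meaning of this vector field follows from that for the two-dimensional curl: it represents the circulation density, with a direction given by the right-hand rule.

Example. The curl of the vector field

$$\mathbf{V} = z\,\mathbf{i} + x\,\mathbf{j} + y\,\mathbf{k} \tag{7.42}$$

is

$$\nabla\times\mathbf{V} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \partial_x & \partial_y & \partial_z \\ z & x & y \end{vmatrix} = \mathbf{i} + \mathbf{j} + \mathbf{k}. \tag{7.43}$$

The vector components of the curl of V each have unit circulation in directions given by the right-hand rule, so the total circulation is along the direction i + j + k.

The right-hand side of Eq. (7.21) can now be expressed in terms of the local unit normal n to the surface $\sigma$ as

$$\iint_A (\nabla\times\mathbf{V})\cdot\mathbf{k}\,dx\,dy \;\to\; \iint_\sigma (\nabla\times\mathbf{V})\cdot\mathbf{n}\,d\sigma. \tag{7.44}$$

With the line integral in Eq. (7.35) taken around a closed curve, Green's theorem then generalizes to Stokes' theorem:

$$\oint_{\partial\sigma} \mathbf{V}\cdot d\mathbf{r} = \iint_\sigma (\nabla\times\mathbf{V})\cdot\mathbf{n}\,d\sigma, \tag{7.45}$$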
where $\partial\sigma$ is the bounding curve of the surface $\sigma$. An important consequence of the structure of this equation is that, given a vector field V, the left-hand side is determined completely by the bounding curve, independent of the surface $\sigma$. To appreciate the significance of this, consider the three surfaces shown in Fig. 7.7. Each surface has the same bounding curve, namely, the unit circle in the x-y plane. The left-hand side of Stokes' theorem and, therefore, the right-hand side, is the same for all three surfaces! This highlights the fact that Stokes' theorem is a fundamental theorem of calculus for the curl in that the evaluation of the right-hand side of Eq. (7.45) is determined entirely by the nature of the boundary $\partial\sigma$.

Example. Consider the surface given by the surface of the upper half-sphere of radius R:

$$\sigma:\; x^2 + y^2 + z^2 = R^2, \quad (z \ge 0). \tag{7.46}$$

The bounding curve is therefore given by the circle of radius R in the x-y plane:

$$\partial\sigma:\; x^2 + y^2 = R^2. \tag{7.47}$$
Fig. 7.7: Three surfaces that have the same bounding curve in the x-y plane, which is shown emboldened: (a) an upper half-sphere, (b) a cylinder, and (c) a cone. For each of these surfaces and for a given vector field, the left-hand side of Stokes' theorem in Eq. (7.45) is identical.
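These quantities are shown in Fig. 7.7(a). We will evaluate both sides of Stokes' theorem in Eq. (7.45) for the vector field

$$\mathbf{V} = -y\,\mathbf{i} + x\,\mathbf{j} + z\,\mathbf{k}. \tag{7.48}$$

We consider the left-hand side of Eq. (7.45) first. We have

$$\mathbf{V}\cdot d\mathbf{r} = (-y\,\mathbf{i} + x\,\mathbf{j} + z\,\mathbf{k})\cdot(dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k}) = -y\,dx + x\,dy + z\,dz. \tag{7.49}$$

On $\partial\sigma$, which lies in the x-y plane, where z = 0, this expression reduces to

$$\mathbf{V}\cdot d\mathbf{r}\,\Big|_{\partial\sigma} = -y\,dx + x\,dy.$$

In circular polar coordinates, $x = R\cos\phi$, $y = R\sin\phi$, where $0 \le \phi < 2\pi$,

$$dx = -R\sin\phi\,d\phi, \qquad dy = R\cos\phi\,d\phi,$$

from which we obtain

$$\mathbf{V}\cdot d\mathbf{r}\,\Big|_{\partial\sigma} = -y\,dx + x\,dy = R^2\sin^2\phi\,d\phi + R^2\cos^2\phi\,d\phi = R^2\,d\phi. \tag{7.52}$$

Hence,

$$\oint_{\partial\sigma} \mathbf{V}\cdot d\mathbf{r} = \int_0^{2\pi} R^2\,d\phi = 2\pi R^2. \tag{7.53}$$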
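To evaluate the right-hand side of Stokes' theorem, we first calculate the curl of V:

$$\nabla\times\mathbf{V} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \partial_x & \partial_y & \partial_z \\ -y & x & z \end{vmatrix} = (1 + 1)\,\mathbf{k} = 2\,\mathbf{k}. \tag{7.54}$$

The unit normal is given in Eq. (6.43):

$$\mathbf{n} = \frac{x}{R}\,\mathbf{i} + \frac{y}{R}\,\mathbf{j} + \frac{z}{R}\,\mathbf{k}. \tag{7.55}$$

Thus,

$$(\nabla\times\mathbf{V})\cdot\mathbf{n} = 2\,\mathbf{k}\cdot\left(\frac{x}{R}\,\mathbf{i} + \frac{y}{R}\,\mathbf{j} + \frac{z}{R}\,\mathbf{k}\right) = \frac{2z}{R}. \tag{7.56}$$

We will evaluate the integral of this quantity over the upper half-sphere in spherical polar coordinates:

$$\iint_\sigma (\nabla\times\mathbf{V})\cdot\mathbf{n}\,d\sigma = R^2\int_0^{2\pi} d\phi\int_0^{\pi/2} \sin\theta\left(\frac{2R\cos\theta}{R}\right) d\theta = 4\pi R^2\underbrace{\int_0^{\pi/2} \sin\theta\cos\theta\,d\theta}_{\left[\frac{1}{2}\sin^2\theta\right]_0^{\pi/2} = \frac{1}{2}} = 2\pi R^2, \tag{7.57}$$

which agrees with Eq. (7.53).

Example. Consider the surface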
$$\sigma:\; x^2 + y^2 = R^2 ,$$

The bounding curve is again the circle of radius R in the x-y plane:

$$\partial\sigma:\; x^2 + y^2 = R^2. \tag{7.59}$$

These quantities are shown in Fig. 7.7(b). We again have the vector field

$$\mathbf{V} = -y\,\mathbf{i} + x\,\mathbf{j} + z\,\mathbf{k}. \tag{7.60}$$
Since the bounding curve and the vector field are the same as in the preceding example, the evaluation of the left-hand side of Stokes' theorem again yields the value $2\pi R^2$. To evaluate the right-hand side of Stokes' theorem, we again have
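that $\nabla\times\mathbf{V} = 2\,\mathbf{k}$. The unit normal to the top of the cylinder is n = k, so for this part of the surface integral we have

$$(\nabla\times\mathbf{V})\cdot\mathbf{n} = 2. \tag{7.61}$$

The integral of this quantity over the top of the cylinder is

$$\iint (\nabla\times\mathbf{V})\cdot\mathbf{n}\,d\sigma = 2\int_0^R r\,dr\int_0^{2\pi} d\phi = 2\pi R^2. \tag{7.62}$$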
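For the integral over the sides of the cylinder, we have that the unit normal is

$$\mathbf{n} = \frac{x}{R}\,\mathbf{i} + \frac{y}{R}\,\mathbf{j}, \tag{7.63}$$

and, therefore, we find that

$$(\nabla\times\mathbf{V})\cdot\mathbf{n} = 2\,\mathbf{k}\cdot\left(\frac{x}{R}\,\mathbf{i} + \frac{y}{R}\,\mathbf{j}\right) = 0. \tag{7.64}$$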
Thus, the integral over the sides of the cylinder vanishes and the total integral over the surface of the cylinder is given by the integral over the top of the cylinder, which is independent of the height of the cylinder, and equal to $2\pi R^2$.
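The two Stokes' theorem examples can be reproduced numerically. The sketch below is a rough illustration under stated assumptions: the curl is estimated by central differences rather than taken from Eq. (7.54), the surface integrals use a midpoint rule, and the radius R = 1.5 and cylinder height H = 0.7 are arbitrary values chosen here. All function names are hypothetical.

```python
import math

def V(x, y, z):
    """The vector field of Eq. (7.48): V = -y i + x j + z k."""
    return (-y, x, z)

def curl(F, x, y, z, h=1e-5):
    """Central-difference estimate of curl F at the point (x, y, z)."""
    def d(component, axis):
        plus, minus = [x, y, z], [x, y, z]
        plus[axis] += h
        minus[axis] -= h
        return (F(*plus)[component] - F(*minus)[component]) / (2.0 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

def circulation(F, R, n=1000):
    """Line integral of F . dr around the circle of radius R in the x-y plane."""
    d_phi = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        phi = (k + 0.5) * d_phi
        fx, fy, _ = F(R * math.cos(phi), R * math.sin(phi), 0.0)
        total += (-fx * R * math.sin(phi) + fy * R * math.cos(phi)) * d_phi
    return total

def flux_hemisphere(F, R, n=100):
    """Flux of curl F through the upper half-sphere of radius R (outward normal)."""
    d_theta, d_phi = 0.5 * math.pi / n, 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        for j in range(n):
            phi = (j + 0.5) * d_phi
            nx = math.sin(theta) * math.cos(phi)
            ny = math.sin(theta) * math.sin(phi)
            nz = math.cos(theta)
            cx, cy, cz = curl(F, R * nx, R * ny, R * nz)
            total += (cx * nx + cy * ny + cz * nz) * R * R * math.sin(theta) * d_theta * d_phi
    return total

def flux_cylinder(F, R, H, n=100):
    """Flux of curl F through the top (normal k) and sides (radial normal) of a cylinder."""
    d_r, d_phi, d_z = R / n, 2.0 * math.pi / n, H / n
    total = 0.0
    for j in range(n):
        phi = (j + 0.5) * d_phi
        c, s = math.cos(phi), math.sin(phi)
        for i in range(n):
            r = (i + 0.5) * d_r
            total += curl(F, r * c, r * s, H)[2] * r * d_r * d_phi  # top disc
            z = (i + 0.5) * d_z
            cx, cy, _ = curl(F, R * c, R * s, z)
            total += (cx * c + cy * s) * R * d_phi * d_z            # side wall
    return total

R, H = 1.5, 0.7
line = circulation(V, R)
hemi = flux_hemisphere(V, R)
cyl = flux_cylinder(V, R, H)
print(line, hemi, cyl, 2.0 * math.pi * R ** 2)
```

All three numbers should be close to 2πR², in line with Eqs. (7.53), (7.57) and (7.62); changing H should leave the cylinder result unchanged, since the side wall contributes nothing for this field.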
7.4 Summary
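This chapter has introduced the curl of a vector field V = P(x, y, z) i + Q(x, y, z) j + R(x, y, z) k:

$$\nabla\times\mathbf{V} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \partial_x & \partial_y & \partial_z \\ P & Q & R \end{vmatrix} = \left(\frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z}\right)\mathbf{i} + \left(\frac{\partial P}{\partial z} - \frac{\partial R}{\partial x}\right)\mathbf{j} + \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right)\mathbf{k}. \tag{7.65}$$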
The curl represents the circulation density of the vector field and, because of the derivative operation, has an associated Fundamental Theorem of Calculus called Stokes' theorem:

$$\oint_{\partial\sigma} \mathbf{V}\cdot d\mathbf{r} = \iint_\sigma (\nabla\times\mathbf{V})\cdot\mathbf{n}\,d\sigma, \tag{7.66}$$
for a surface