
Jim Lambers

MAT 280
Spring Semester 2009-10
Lecture 5 Notes

These notes correspond to Section 11.4 in Stewart and Section 2.3 in Marsden and Tromba.

Tangent Planes, Linear Approximations and Differentiability


Now that we have learned how to compute partial derivatives of functions of several independent
variables, in order to measure their instantaneous rates of change with respect to these variables, we
will discuss another essential application of derivatives: the approximation of functions by linear
functions. Linear functions are the simplest to work with, and for this reason, there are many
instances in which a function is replaced by a linear approximation in the course of solving a
problem, such as solving a differential equation.

Tangent Planes and Linear Approximations


In single-variable calculus, we learned that the graph of a function $f(x)$ can be approximated near
a point $x_0$ by its tangent line, which has the equation

$$y = f(x_0) + f'(x_0)(x - x_0).$$

For this reason, the function $L(x) = f(x_0) + f'(x_0)(x - x_0)$ is also referred to as the linearization,
or linear approximation, of $f(x)$ at $x_0$.
Now, suppose that we have a function of two variables, $f : D \subseteq \mathbb{R}^2 \to \mathbb{R}$, and a point $(x_0, y_0) \in D$.
Furthermore, suppose that the first partial derivatives of $f$, $f_x$ and $f_y$, exist at $(x_0, y_0)$. Because
the graph of this function is a surface, it follows that a linear function that approximates $f$ near
$(x_0, y_0)$ would have a graph that is a plane.

Just as the tangent line of $f(x)$ at $x_0$ passes through the point $(x_0, f(x_0))$, and has a slope that
is equal to $f'(x_0)$, the instantaneous rate of change of $f(x)$ with respect to $x$ at $x_0$, a plane that
best approximates $f(x, y)$ at $(x_0, y_0)$ must pass through the point $(x_0, y_0, f(x_0, y_0))$, and the slopes
of the plane in the $x$- and $y$-directions, respectively, should be equal to the values of $f_x(x_0, y_0)$ and
$f_y(x_0, y_0)$.
Since a general linear function of two variables can be described by the formula

$$L(x, y) = A(x - x_0) + B(y - y_0) + C,$$

so that $L(x_0, y_0) = C$, and a simple differentiation yields

$$\frac{\partial L}{\partial x} = A, \qquad \frac{\partial L}{\partial y} = B,$$

we conclude that the linear function that best approximates $f(x, y)$ near $(x_0, y_0)$ is the linear
approximation

$$L(x, y) = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0).$$

Furthermore, the graph of this function is called the tangent plane of $f(x, y)$ at $(x_0, y_0)$. Its equation
is

$$z - z_0 = f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0),$$

where $z_0 = f(x_0, y_0)$.

Example Let $f(x, y) = 2x^2 y + 3y^2$, and let $(x_0, y_0) = (1, 1)$. Then $f(x_0, y_0) = 5$, and the first
partial derivatives at $(x_0, y_0)$ are

$$f_x(1, 1) = 4xy|_{x=1, y=1} = 4, \qquad f_y(1, 1) = (2x^2 + 6y)|_{x=1, y=1} = 8.$$

It follows that the tangent plane at $(1, 1)$ has the equation

$$z - 5 = 4(x - 1) + 8(y - 1),$$

and the linearization of $f$ at $(1, 1)$ is

$$L(x, y) = 5 + 4(x - 1) + 8(y - 1).$$

Let $(x, y) = (1.1, 1.1)$. Then $f(x, y) = 6.292$, while $L(x, y) = 6.2$, for an error of $6.292 - 6.2 = 0.092$.
However, if $(x, y) = (1.01, 1.01)$, then $f(x, y) = 5.120902$, while $L(x, y) = 5.12$, for an error of
$5.120902 - 5.12 = 0.000902$. That is, moving 10 times as close to $(1, 1)$ decreased the error by a
factor of over 100.
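This error behavior is easy to check numerically. The following is a minimal Python sketch (the function and variable names are our own, not from the notes) that evaluates $f$, its linearization at $(1, 1)$, and the error at both test points:

```python
# Numerical check of the linearization of f(x, y) = 2x^2*y + 3y^2 at (1, 1).

def f(x, y):
    return 2 * x**2 * y + 3 * y**2

def L(x, y):
    # L(x, y) = f(1, 1) + f_x(1, 1)(x - 1) + f_y(1, 1)(y - 1)
    return 5 + 4 * (x - 1) + 8 * (y - 1)

for x, y in [(1.1, 1.1), (1.01, 1.01)]:
    err = f(x, y) - L(x, y)
    print(f"f({x}, {y}) = {f(x, y):.6f}, L = {L(x, y):.6f}, error = {err:.6f}")
# Errors of about 0.092 and 0.000902: shrinking the distance to (1, 1) by a
# factor of 10 shrinks the error by roughly a factor of 100.
```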
Another useful application of a linear approximation is to estimate the error in the value of a
function, given estimates of the error in its inputs. Given a function $z = f(x, y)$, and its linearization
$L(x, y)$ around a point $(x_0, y_0)$, if $x_0$ and $y_0$ are measured values and $dx = x - x_0$ and $dy = y - y_0$
are regarded as errors in $x_0$ and $y_0$, then the error in $z$ can be estimated by computing

$$\begin{aligned}
dz = z - z_0 &= L(x, y) - f(x_0, y_0) \\
&= [f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0)] - f(x_0, y_0) \\
&= f_x(x_0, y_0)\,dx + f_y(x_0, y_0)\,dy.
\end{aligned}$$

The variables $dx$ and $dy$ are called differentials, and $dz$ is called the total differential, as it depends
on the values of both $dx$ and $dy$. The total differential $dz$ is only an estimate of the error in $z$; the actual
error is given by $\Delta z = f(x, y) - f(x_0, y_0)$, when the actual errors in $x$ and $y$, $\Delta x = x - x_0$ and
$\Delta y = y - y_0$, are known. Since this is rarely the case in practice, one instead estimates the error in $z$
from estimates $dx$ and $dy$ of the errors in $x$ and $y$.

Example Recall that the volume of a cylinder with radius $r$ and height $h$ is $V = \pi r^2 h$. Suppose
that $r = 5$ cm and $h = 10$ cm. Then the volume is $V = 250\pi$ cm$^3$. If the measurement error in $r$
and $h$ is at most 0.1 cm, then, to estimate the error in the computed volume, we first compute

$$\frac{\partial V}{\partial r} = 2\pi r h = 100\pi, \qquad \frac{\partial V}{\partial h} = \pi r^2 = 25\pi.$$

It follows that the error in $V$ is approximately

$$dV = \frac{\partial V}{\partial r}\,dr + \frac{\partial V}{\partial h}\,dh = 0.1(100\pi + 25\pi) = 12.5\pi \text{ cm}^3.$$

If we specify $\Delta r = 0.1$ and $\Delta h = 0.1$, and compute the actual volume using radius $r + \Delta r = 5.1$
and height $h + \Delta h = 10.1$, we obtain

$$V + \Delta V = \pi (5.1)^2 (10.1) = 262.701\pi \text{ cm}^3,$$

which yields the actual error

$$\Delta V = 262.701\pi - 250\pi = 12.701\pi \text{ cm}^3.$$

Therefore, the estimate $dV$ of the error is quite accurate.
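The arithmetic in this example is easily reproduced. Here is a minimal Python sketch (variable names are our own) comparing the total differential $dV$ with the actual error $\Delta V$:

```python
import math

# Error estimate for V = pi r^2 h at r = 5, h = 10, with dr = dh = 0.1.

r, h, dr, dh = 5.0, 10.0, 0.1, 0.1

V = math.pi * r**2 * h                                # 250*pi
dV = 2 * math.pi * r * h * dr + math.pi * r**2 * dh   # total differential, 12.5*pi
actual = math.pi * (r + dr)**2 * (h + dh) - V         # actual error, 12.701*pi

print(f"estimated error dV     = {dV:.4f} cm^3")      # about 39.27
print(f"actual error (Delta V) = {actual:.4f} cm^3")  # about 39.90
```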

Functions of More than Two Variables


The concepts of a tangent plane and linear approximation generalize to more than two variables in
a straightforward manner. Specifically, given $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$ and $\mathbf{p}_0 = (x_1^{(0)}, x_2^{(0)}, \ldots, x_n^{(0)}) \in D$,
we define the tangent space of $z = f(x_1, x_2, \ldots, x_n)$ at $\mathbf{p}_0$ to be the $n$-dimensional hyperplane in $\mathbb{R}^{n+1}$
whose points $(x_1, x_2, \ldots, x_n, z)$ satisfy the equation

$$z - z_0 = f_{x_1}(\mathbf{p}_0)(x_1 - x_1^{(0)}) + f_{x_2}(\mathbf{p}_0)(x_2 - x_2^{(0)}) + \cdots + f_{x_n}(\mathbf{p}_0)(x_n - x_n^{(0)}),$$

where $z_0 = f(\mathbf{p}_0)$. Similarly, the linearization of $f$ at $\mathbf{p}_0$ is the function $L(x_1, x_2, \ldots, x_n)$ defined
by

$$L(x_1, x_2, \ldots, x_n) = z_0 + f_{x_1}(\mathbf{p}_0)(x_1 - x_1^{(0)}) + f_{x_2}(\mathbf{p}_0)(x_2 - x_2^{(0)}) + \cdots + f_{x_n}(\mathbf{p}_0)(x_n - x_n^{(0)}).$$
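In practice, a linearization in $n$ variables can be assembled programmatically. The following Python sketch (the helper name `linearize` and the finite-difference step size are our own choices, not part of the notes) approximates each partial derivative by a central difference and returns the resulting linear function:

```python
# A generic sketch: build the linearization L of f : R^n -> R at p0,
# approximating each partial derivative f_{x_i}(p0) by a central difference.

def linearize(f, p0, eps=1e-6):
    n = len(p0)
    partials = []
    for i in range(n):
        up = [x + (eps if j == i else 0.0) for j, x in enumerate(p0)]
        dn = [x - (eps if j == i else 0.0) for j, x in enumerate(p0)]
        partials.append((f(up) - f(dn)) / (2 * eps))
    z0 = f(p0)

    def L(p):
        return z0 + sum(d * (x - x0) for d, x, x0 in zip(partials, p, p0))

    return L

# Check against the earlier two-variable example f(x, y) = 2x^2*y + 3y^2:
f = lambda p: 2 * p[0]**2 * p[1] + 3 * p[1]**2
L = linearize(f, [1.0, 1.0])
print(L([1.1, 1.1]))   # approximately 6.2
```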

The Gradient Vector


It can be seen from the above definitions that writing formulas that involve the partial derivatives
of functions of $n$ variables can be cumbersome. This can be addressed by expressing collections
of partial derivatives of functions of several variables using vectors and matrices, especially for
vector-valued functions of several variables.

By convention, a point $\mathbf{p}_0 = (x_1^{(0)}, x_2^{(0)}, \ldots, x_n^{(0)})$, which can be identified with the position vector
$\mathbf{p}_0 = \langle x_1^{(0)}, x_2^{(0)}, \ldots, x_n^{(0)} \rangle$, is considered to be a column vector

$$\mathbf{p}_0 = \begin{bmatrix} x_1^{(0)} \\ x_2^{(0)} \\ \vdots \\ x_n^{(0)} \end{bmatrix}.$$

Also, by convention, given a function of $n$ variables, $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$, the collection of its partial
derivatives with respect to all of its variables is written as a row vector

$$\nabla f(\mathbf{p}_0) = \begin{bmatrix} \frac{\partial f}{\partial x_1}(\mathbf{p}_0) & \frac{\partial f}{\partial x_2}(\mathbf{p}_0) & \cdots & \frac{\partial f}{\partial x_n}(\mathbf{p}_0) \end{bmatrix}.$$

This vector is called the gradient of $f$ at $\mathbf{p}_0$.


Viewing the partial derivatives of $f$ as a vector allows us to use vector operations to describe,
much more concisely, the linearization of $f$. Specifically, the linearization of $f$ at $\mathbf{p}_0$, evaluated at
a point $\mathbf{p} = (x_1, x_2, \ldots, x_n)$, can be written as

$$\begin{aligned}
L_f(\mathbf{p}) &= f(\mathbf{p}_0) + \frac{\partial f}{\partial x_1}(\mathbf{p}_0)(x_1 - x_1^{(0)}) + \frac{\partial f}{\partial x_2}(\mathbf{p}_0)(x_2 - x_2^{(0)}) + \cdots + \frac{\partial f}{\partial x_n}(\mathbf{p}_0)(x_n - x_n^{(0)}) \\
&= f(\mathbf{p}_0) + \sum_{i=1}^n \frac{\partial f}{\partial x_i}(\mathbf{p}_0)(x_i - x_i^{(0)}) \\
&= f(\mathbf{p}_0) + \nabla f(\mathbf{p}_0) \cdot (\mathbf{p} - \mathbf{p}_0),
\end{aligned}$$

where $\nabla f(\mathbf{p}_0) \cdot (\mathbf{p} - \mathbf{p}_0)$ is the dot product, also known as the inner product, of the vectors $\nabla f(\mathbf{p}_0)$
and $\mathbf{p} - \mathbf{p}_0$. Recall that given two vectors $\mathbf{u} = \langle u_1, u_2, \ldots, u_n \rangle$ and $\mathbf{v} = \langle v_1, v_2, \ldots, v_n \rangle$, the dot
product of $\mathbf{u}$ and $\mathbf{v}$, denoted by $\mathbf{u} \cdot \mathbf{v}$, is defined by

$$\mathbf{u} \cdot \mathbf{v} = \sum_{i=1}^n u_i v_i = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n = \|\mathbf{u}\| \|\mathbf{v}\| \cos\theta,$$

where $\theta$ is the angle between $\mathbf{u}$ and $\mathbf{v}$.


Example Let $f : \mathbb{R}^3 \to \mathbb{R}$ be defined by

$$f(x, y, z) = 3x^2 y^3 z^4.$$

Then

$$\nabla f(x, y, z) = \begin{bmatrix} \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} & \frac{\partial f}{\partial z} \end{bmatrix} = \begin{bmatrix} 6xy^3 z^4 & 9x^2 y^2 z^4 & 12x^2 y^3 z^3 \end{bmatrix}.$$

Let $(x_0, y_0, z_0) = (1, 2, -1)$. Then

$$\nabla f(x_0, y_0, z_0) = \nabla f(1, 2, -1) = \begin{bmatrix} 48 & 36 & -96 \end{bmatrix}.$$

It follows that the linearization of $f$ at $(x_0, y_0, z_0)$ is

$$\begin{aligned}
L_f(x, y, z) &= f(1, 2, -1) + \nabla f(1, 2, -1) \cdot \langle x - 1, y - 2, z + 1 \rangle \\
&= 24 + \langle 48, 36, -96 \rangle \cdot \langle x - 1, y - 2, z + 1 \rangle \\
&= 24 + 48(x - 1) + 36(y - 2) - 96(z + 1) \\
&= 48x + 36y - 96z - 192.
\end{aligned}$$

At the point $(1.1, 1.9, -1.1)$, we have $f(1.1, 1.9, -1.1) \approx 36.5$, while $L_f(1.1, 1.9, -1.1) = 34.8$.
Because $f$ is changing rapidly in all coordinate directions at $(1, 2, -1)$, it is not surprising that the
linearization of $f$ at this point is not highly accurate.
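As a quick numerical cross-check of this example, here is a short sketch using NumPy (assuming NumPy is available; the variable names are our own):

```python
import numpy as np

# Verify the gradient and linearization of f(x, y, z) = 3x^2 y^3 z^4 at (1, 2, -1).

def f(p):
    x, y, z = p
    return 3 * x**2 * y**3 * z**4

def grad_f(p):
    x, y, z = p
    return np.array([6*x*y**3*z**4, 9*x**2*y**2*z**4, 12*x**2*y**3*z**3])

p0 = np.array([1.0, 2.0, -1.0])
p = np.array([1.1, 1.9, -1.1])

L = f(p0) + grad_f(p0) @ (p - p0)   # L_f(p) = f(p0) + grad f(p0) . (p - p0)
print(grad_f(p0))                    # [ 48.  36. -96.]
print(f(p), L)                       # approximately 36.45 and 34.8
```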

The Jacobian Matrix


Now, let $\mathbf{f} : D \subseteq \mathbb{R}^n \to \mathbb{R}^m$ be a vector-valued function of $n$ variables, with component functions

$$\mathbf{f}(\mathbf{p}) = \begin{bmatrix} f_1(\mathbf{p}) \\ f_2(\mathbf{p}) \\ \vdots \\ f_m(\mathbf{p}) \end{bmatrix},$$

where each $f_i : D \to \mathbb{R}$. Combining the two conventions described above, the partial derivatives
of these component functions at a point $\mathbf{p}_0$ are arranged in an $m \times n$ matrix

$$J_{\mathbf{f}}(\mathbf{p}_0) = \begin{bmatrix}
\frac{\partial f_1}{\partial x_1}(\mathbf{p}_0) & \frac{\partial f_1}{\partial x_2}(\mathbf{p}_0) & \cdots & \frac{\partial f_1}{\partial x_n}(\mathbf{p}_0) \\
\frac{\partial f_2}{\partial x_1}(\mathbf{p}_0) & \frac{\partial f_2}{\partial x_2}(\mathbf{p}_0) & \cdots & \frac{\partial f_2}{\partial x_n}(\mathbf{p}_0) \\
\vdots & & & \vdots \\
\frac{\partial f_m}{\partial x_1}(\mathbf{p}_0) & \frac{\partial f_m}{\partial x_2}(\mathbf{p}_0) & \cdots & \frac{\partial f_m}{\partial x_n}(\mathbf{p}_0)
\end{bmatrix}.$$

This matrix is called the Jacobian matrix of $\mathbf{f}$ at $\mathbf{p}_0$. It is also referred to as the derivative of $\mathbf{f}$ at
$\mathbf{p}_0$, since it reduces to the scalar $f'(x_0)$ when $f$ is a scalar-valued function of one variable. Note
that rows of $J_{\mathbf{f}}(\mathbf{p}_0)$ correspond to component functions, and columns correspond to independent
variables. This allows us to view $J_{\mathbf{f}}(\mathbf{p}_0)$ as the following collections of rows or columns:

$$J_{\mathbf{f}}(\mathbf{p}_0) = \begin{bmatrix} \nabla f_1(\mathbf{p}_0) \\ \nabla f_2(\mathbf{p}_0) \\ \vdots \\ \nabla f_m(\mathbf{p}_0) \end{bmatrix} = \begin{bmatrix} \frac{\partial \mathbf{f}}{\partial x_1}(\mathbf{p}_0) & \frac{\partial \mathbf{f}}{\partial x_2}(\mathbf{p}_0) & \cdots & \frac{\partial \mathbf{f}}{\partial x_n}(\mathbf{p}_0) \end{bmatrix}.$$

The Jacobian matrix provides a concise way of describing the linearization of a vector-valued
function, just as the gradient does for a scalar-valued function. The linearization of $\mathbf{f}$ at $\mathbf{p}_0$ is the
function $L_{\mathbf{f}}(\mathbf{p})$, defined by

$$\begin{aligned}
L_{\mathbf{f}}(\mathbf{p}) &= \begin{bmatrix} f_1(\mathbf{p}_0) \\ f_2(\mathbf{p}_0) \\ \vdots \\ f_m(\mathbf{p}_0) \end{bmatrix} + \begin{bmatrix} \frac{\partial f_1}{\partial x_1}(\mathbf{p}_0) \\ \frac{\partial f_2}{\partial x_1}(\mathbf{p}_0) \\ \vdots \\ \frac{\partial f_m}{\partial x_1}(\mathbf{p}_0) \end{bmatrix} (x_1 - x_1^{(0)}) + \cdots + \begin{bmatrix} \frac{\partial f_1}{\partial x_n}(\mathbf{p}_0) \\ \frac{\partial f_2}{\partial x_n}(\mathbf{p}_0) \\ \vdots \\ \frac{\partial f_m}{\partial x_n}(\mathbf{p}_0) \end{bmatrix} (x_n - x_n^{(0)}) \\
&= \mathbf{f}(\mathbf{p}_0) + \sum_{i=1}^n \frac{\partial \mathbf{f}}{\partial x_i}(\mathbf{p}_0)(x_i - x_i^{(0)}) \\
&= \mathbf{f}(\mathbf{p}_0) + J_{\mathbf{f}}(\mathbf{p}_0)(\mathbf{p} - \mathbf{p}_0),
\end{aligned}$$

where the expression $J_{\mathbf{f}}(\mathbf{p}_0)(\mathbf{p} - \mathbf{p}_0)$ involves matrix multiplication of the $m \times n$ matrix $J_{\mathbf{f}}(\mathbf{p}_0)$ and the
$n \times 1$ vector $\mathbf{p} - \mathbf{p}_0$. Note the similarity between this definition and the definition of the linearization of
a function of a single variable.
In general, given an $m \times p$ matrix $A$ (that is, a matrix with $m$ rows and $p$ columns) and a $p \times n$
matrix $B$, the product $AB$ is the $m \times n$ matrix $C$, where the entry in row $i$ and column $j$ of $C$
is obtained by computing the dot product of row $i$ of $A$ and column $j$ of $B$. When computing
the linearization of a vector-valued function $\mathbf{f}$ at the point $\mathbf{p}_0$ in its domain, the $i$th component
function of the linearization is obtained by adding the value of the $i$th component function at $\mathbf{p}_0$,
$f_i(\mathbf{p}_0)$, to the dot product of $\nabla f_i(\mathbf{p}_0)$ and the vector $\mathbf{p} - \mathbf{p}_0$, where $\mathbf{p}$ is the point at which the
linearization is to be evaluated.
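In code, this row-by-row description of the matrix-vector product matches NumPy's `@` operator; a tiny sketch (the example matrix and names are our own) makes the equivalence explicit:

```python
import numpy as np

# The i-th component of J @ v is the dot product of row i of J with v.
J = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])   # a 3 x 2 example matrix
v = np.array([0.5, -1.0])

by_rows = np.array([row @ v for row in J])   # dot product, row by row
print(np.allclose(J @ v, by_rows))           # True
```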
Example Let $\mathbf{f} : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by

$$\mathbf{f}(x, y) = \begin{bmatrix} f_1(x, y) \\ f_2(x, y) \end{bmatrix} = \begin{bmatrix} e^x \cos y \\ e^{-2x} \sin y \end{bmatrix}.$$

Then the Jacobian matrix, or derivative, of $\mathbf{f}$ is the $2 \times 2$ matrix

$$J_{\mathbf{f}}(x, y) = \begin{bmatrix} \nabla f_1(x, y) \\ \nabla f_2(x, y) \end{bmatrix} = \begin{bmatrix} \frac{\partial f_1}{\partial x} & \frac{\partial f_1}{\partial y} \\ \frac{\partial f_2}{\partial x} & \frac{\partial f_2}{\partial y} \end{bmatrix} = \begin{bmatrix} e^x \cos y & -e^x \sin y \\ -2e^{-2x} \sin y & e^{-2x} \cos y \end{bmatrix}.$$

Let $(x_0, y_0) = (0, \pi/4)$. Then we have

$$J_{\mathbf{f}}(x_0, y_0) = \begin{bmatrix} \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \\ -\sqrt{2} & \frac{\sqrt{2}}{2} \end{bmatrix},$$

and the linearization of $\mathbf{f}$ at $(x_0, y_0)$ is

$$\begin{aligned}
L_{\mathbf{f}}(x, y) &= \begin{bmatrix} f_1(x_0, y_0) \\ f_2(x_0, y_0) \end{bmatrix} + J_{\mathbf{f}}(x_0, y_0) \begin{bmatrix} x - 0 \\ y - \pi/4 \end{bmatrix} \\
&= \begin{bmatrix} \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} \end{bmatrix} + \begin{bmatrix} \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \\ -\sqrt{2} & \frac{\sqrt{2}}{2} \end{bmatrix} \begin{bmatrix} x \\ y - \frac{\pi}{4} \end{bmatrix} \\
&= \begin{bmatrix} \frac{\sqrt{2}}{2} + \frac{\sqrt{2}}{2} x - \frac{\sqrt{2}}{2}\left(y - \frac{\pi}{4}\right) \\ \frac{\sqrt{2}}{2} - \sqrt{2}\, x + \frac{\sqrt{2}}{2}\left(y - \frac{\pi}{4}\right) \end{bmatrix}.
\end{aligned}$$

At the point $(x_1, y_1) = (0.1, 0.8)$, we have

$$\mathbf{f}(x_1, y_1) \approx \begin{bmatrix} 0.76998 \\ 0.58732 \end{bmatrix}, \qquad L_{\mathbf{f}}(x_1, y_1) \approx \begin{bmatrix} 0.76749 \\ 0.57601 \end{bmatrix}.$$

Because of the relatively small partial derivatives at $(x_0, y_0)$, the linearization at this point yields
a fairly accurate approximation at $(x_1, y_1)$.
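A sketch of the same computation in NumPy (assuming NumPy is available; names are our own) confirms the numbers above:

```python
import numpy as np

# Linearize f(x, y) = (e^x cos y, e^(-2x) sin y) at (0, pi/4) and compare
# with the true value at (0.1, 0.8).

def f(p):
    x, y = p
    return np.array([np.exp(x) * np.cos(y), np.exp(-2 * x) * np.sin(y)])

def J(p):
    x, y = p
    return np.array([
        [np.exp(x) * np.cos(y),           -np.exp(x) * np.sin(y)],
        [-2 * np.exp(-2 * x) * np.sin(y),  np.exp(-2 * x) * np.cos(y)],
    ])

p0 = np.array([0.0, np.pi / 4])
p1 = np.array([0.1, 0.8])

Lf = f(p0) + J(p0) @ (p1 - p0)   # linearization: f(p0) + J_f(p0)(p - p0)
print(f(p1))   # approximately [0.76998, 0.58732]
print(Lf)      # approximately [0.76749, 0.57601]
```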

Differentiability

Before using a linearization to approximate a function near a point $\mathbf{p}_0$, it is helpful to know whether
this linearization is actually an accurate approximation of the function in the first place. That is, we
need to know if the function is differentiable at $\mathbf{p}_0$, which, informally, means that its instantaneous
rate of change at $\mathbf{p}_0$ is well-defined. In the single-variable case, a function $f(x)$ is differentiable at
$x_0$ if $f'(x_0)$ exists; that is, if the limit

$$f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0}$$

exists. In other words, we must have

$$\lim_{x \to x_0} \frac{f(x) - f(x_0) - f'(x_0)(x - x_0)}{x - x_0} = 0.$$

But $f(x_0) + f'(x_0)(x - x_0)$ is just the linearization $L(x)$ of $f$ at $x_0$, so we can say that $f$ is differentiable
at $x_0$ if and only if

$$\lim_{x \to x_0} \frac{f(x) - L(x)}{x - x_0} = 0.$$

Note that this is a stronger statement than simply requiring that

$$\lim_{x \to x_0} f(x) - L(x) = 0,$$

because as $x$ approaches $x_0$, $1/(x - x_0)$ approaches $\infty$, so the difference $f(x) - L(x)$ must approach
zero particularly rapidly in order for the fraction $[f(x) - L(x)]/(x - x_0)$ to approach zero. That
is, the linearization must be a sufficiently accurate approximation of $f$ near $x_0$ in order for $f$ to be
differentiable at $x_0$.

This notion of differentiability is readily generalized to functions of several variables. Given
$\mathbf{f} : D \subseteq \mathbb{R}^n \to \mathbb{R}^m$, and $\mathbf{p}_0 \in D$, we say that $\mathbf{f}$ is differentiable at $\mathbf{p}_0$ if

$$\lim_{\mathbf{p} \to \mathbf{p}_0} \frac{\|\mathbf{f}(\mathbf{p}) - L_{\mathbf{f}}(\mathbf{p})\|}{\|\mathbf{p} - \mathbf{p}_0\|} = 0,$$

where $L_{\mathbf{f}}(\mathbf{p})$ is the linearization of $\mathbf{f}$ at $\mathbf{p}_0$.
Example Let $f(x, y) = x^2 y$. To verify that this function is differentiable at $(x_0, y_0) = (1, 1)$, we
first compute $f_x = 2xy$ and $f_y = x^2$. It follows that the linearization of $f$ at $(1, 1)$ is

$$L(x, y) = f(1, 1) + f_x(1, 1)(x - 1) + f_y(1, 1)(y - 1) = 1 + 2(x - 1) + (y - 1) = 2x + y - 2.$$

Therefore, $f$ is differentiable at $(1, 1)$ if

$$\lim_{(x,y) \to (1,1)} \frac{x^2 y - (2x + y - 2)}{\|(x, y) - (1, 1)\|} = \lim_{(x,y) \to (1,1)} \frac{x^2 y - (2x + y - 2)}{\sqrt{(x-1)^2 + (y-1)^2}} = 0.$$

By rewriting this expression as

$$\frac{x^2 y - (2x + y - 2)}{\sqrt{(x-1)^2 + (y-1)^2}} = \frac{x - 1}{\sqrt{(x-1)^2 + (y-1)^2}}\,[(x + 1)y - 2],$$

and noting that

$$\lim_{(x,y) \to (1,1)} (x + 1)y - 2 = 0, \qquad 0 \le \frac{|x - 1|}{\sqrt{(x-1)^2 + (y-1)^2}} \le 1,$$

we conclude that the limit actually is zero, and therefore $f$ is differentiable at $(1, 1)$.
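The limit above can also be probed numerically. The sketch below (the approach path and step sizes are our own choices; a check along one path is of course not a proof) evaluates the quotient along the diagonal approach to $(1, 1)$:

```python
import math

# For f(x, y) = x^2 y and its linearization L at (1, 1), the quotient
# |f(p) - L(p)| / ||p - p0|| should tend to 0 as p -> p0.

f = lambda x, y: x**2 * y
L = lambda x, y: 2 * x + y - 2

for t in [1e-1, 1e-2, 1e-3, 1e-4]:
    x, y = 1 + t, 1 + t            # approach (1, 1) along the diagonal
    q = abs(f(x, y) - L(x, y)) / math.hypot(x - 1, y - 1)
    print(f"t = {t:.0e}: quotient = {q:.3e}")
# The quotient decays roughly linearly in t, consistent with the limit being 0.
```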
There are three important conclusions that we can make regarding differentiable functions:

- If all partial derivatives of $\mathbf{f}$ at $\mathbf{p}_0$ exist, and are continuous, then $\mathbf{f}$ is differentiable at $\mathbf{p}_0$.

- Furthermore, if $\mathbf{f}$ is differentiable at $\mathbf{p}_0$, then it is continuous at $\mathbf{p}_0$. Note that the converse
is not true; for example, $f(x) = |x|$ is continuous at $x = 0$, but it is not differentiable there,
because $f'(0)$ does not exist.

- If $\mathbf{f}$ is differentiable at $\mathbf{p}_0$, then its first partial derivatives exist at $\mathbf{p}_0$. This statement
might seem redundant, because the first partial derivatives are used in the definition of the
linearization, but it is important nonetheless, because the converse of this statement is not
true. That is, if a function's first partial derivatives exist at a point, it is not necessarily
differentiable at that point.
The notion of differentiability is related not only to partial derivatives, which only describe how
a function changes as one of its variables changes, but also to the instantaneous rate of change of a
function as its variables change along any direction. If a function is differentiable at a point, then
its rate of change along any direction is well-defined. We will explore this idea further in
Lecture 7.

Practice Problems
1. Compute the equation of the tangent plane of $f(x, y) = x \cos y + y \sin x$ at $(x_0, y_0) =
(1, \pi/2)$. Then, use the linearization of $f$ at this point to approximate the value of $f(1.1, 1.6)$.
How accurate is this approximation?

2. Compute the equation of the tangent space of $f(x, y, z) = \frac{1}{2} \ln\left[(x - 1)^2 + (y + 2)^2 + z^2\right]$ at
$(x_0, y_0, z_0) = (1, 1, 1)$. Then, use the linearization of $f$ at this point to approximate the value
of $f(1.01, 0.99, 1.05)$. How accurate is this approximation?

3. Let $f(x, y) = x^2 + y^2$. Use the definition of differentiability to show that this function is
differentiable at $(x_0, y_0) = (1, 1)$.

4. Suppose that the coordinates of two points $(x_1, y_1) = (2, 3)$ and $(x_2, y_2) = (7, 5)$ are
obtained by measurements, for which the maximum error in each is 0.01. Estimate the
maximum error in the distance between the two points.

Additional Practice Problems


Additional practice problems from the recommended textbooks are:

Stewart: Section 11.4, Exercises 1-5 odd, 15-27 odd

Marsden/Tromba: Section 2.3, Exercises 5, 9, 15
