John Duggan
University of Rochester
June 21, 2010
Contents

1 Opening Remarks
2 Unconstrained Optimization
3 Pareto Optimality
  3.1 Existence of Pareto Optimals
  3.2 Characterization with Concavity
  3.3 Characterization with Differentiability
4 Constrained Optimization
5 Equality Constraints
  5.1 First Order Analysis
  5.2 Examples
  5.3 Second Order Analysis
  5.4 Multiple Equality Constraints
6 Inequality Constraints
  6.1 First Order Analysis
  6.2 Concave Programming
  6.3 Second Order Analysis
8 Mixed Constraints
1 Opening Remarks
These notes were written for PSC 408, the second semester of the formal modeling
sequence in the political science graduate program at the University of Rochester. I
hope they will be a useful reference on optimization and Pareto optimality for political scientists, who otherwise would see very little of these subjects, and economists
wanting deeper coverage than one gets in a typical first-year micro class. I do not
invent any new theory, but I try to draw together results in a systematic way and to
build up gradually from the basic problems of unconstrained optimization and optimization with a single equality constraint. That said, Theorem 8.2 may be a slightly
new way of presenting results on convex optimization, and I've strived for quantity
and quality of figures to aid intuition. As alternatives to these notes, I suggest Simon
and Blume (1994), who cover a greater range of topics, and Sundaram (1996), who
is more thorough and technically rigorous. Unfortunately, my notes are not entirely
self-contained and do presume some sophistication with calculus and a bit of linear
algebra and matrix algebra (not too much), and worse yet, I haven't been entirely
consistent with notation for partial derivatives; I hope the meaning of my notation is
clear from context.
2 Unconstrained Optimization

Given a direction t, the directional derivative of f at x is

  Dt f(x) = lim_{λ→0} [f(x + λt) − f(x)]/λ.

If x is an interior local maximizer, then for all sufficiently small λ > 0 we have f(x + λt) − f(x) ≤ 0, so that

  Dt f(x) ≤ 0,

and applying the same argument to the direction −t yields Dt f(x) = 0.
Solving the second equation for x1, we have x1 = 2x2. Substituting this into the first equation, we have x2 − 64x2³ = 0, which has three solutions: x2 = 0, 1/8, −1/8. Then the first order condition has three solutions,

  (x1, x2) = (0, 0), (1/4, 1/8), (−1/4, −1/8),

but the last of these is not in the domain of f, and the first is on the boundary of the domain. Thus, we have a unique solution in the interior of the domain: (x1, x2) = (1/4, 1/8).
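As a quick numerical cross-check of this computation (a sketch, not part of the original development, assuming the sympy library is available), we can solve the first order condition symbolically and recover the same three critical points:

    import sympy as sp

    x1, x2 = sp.symbols('x1 x2')
    f = x1*x2 - 2*x1**4 - x2**2

    # First order condition: both partial derivatives vanish.
    solutions = sp.solve([sp.diff(f, x1), sp.diff(f, x2)], [x1, x2], dict=True)
    real = [s for s in solutions if all(v.is_real for v in s.values())]
    print(real)  # {x1: 0, x2: 0}, {x1: 1/4, x2: 1/8}, {x1: -1/4, x2: -1/8}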
The usual necessary second order condition from univariate calculus extends as well.

Theorem 2.2 Let X ⊆ Rⁿ, let x ∈ X be interior to X, and let f : X → R be twice differentiable. If x is a local maximizer of f, then for every direction t, we have Dt² f(x) ≤ 0.
Proof Assume x is an interior local maximizer of f, let t be an arbitrary direction, and let ε > 0 be such that B_ε(x) ⊆ X and for all y ∈ B_ε(x), f(x) ≥ f(y). Consider a sequence {λn} such that λn ↓ 0, so for sufficiently high n, we have x + λn t ∈ B_ε(x), and therefore f(x + λn t) ≤ f(x). For each such n, the mean value theorem yields λ̃n ∈ (0, λn) such that

  Dt f(x + λ̃n t) = [f(x + λn t) − f(x)]/λn,

and since f(xn) ≤ f(x) for the points xn = x + λn t, we have

  [f(xn) − f(x)]/‖xn − x‖ ≤ 0.
We assume for simplicity that X is closed, though not necessarily compact. To represent the values of f at the extremes of the domain, if it is unbounded, let

  y∞ = lim_{‖x‖→∞} f(x),

where we assume for simplicity that this limit exists, and let

  ȳ = sup{f(x) : x ∈ bd X}

represent the highest value of f on the boundary of its domain. Assume for simplicity that the function f has at most a finite number of critical points. There are then three possibilities.
1. The first order condition has a unique solution x*.

   (a) If Dt² f(x*) < 0 for every direction t, then x* is the unique maximizer.

   (b) If Dt² f(x*) > 0 for every direction t, then x* is the unique minimizer. An element x is a maximizer if and only if it is a boundary point and f(x) ≥ max{y∞, ȳ}. There may be no maximizer.

   (c) Else, x* is the unique maximizer if and only if f(x*) ≥ max{y∞, ȳ}. If this inequality does not hold, then an element x is a maximizer if and only if it is a boundary point and f(x) ≥ max{y∞, ȳ}. There may be no maximizer.

2. There are multiple solutions, say x1, . . . , xk, to the first order condition. An element x is a maximizer if and only if it is a critical point or boundary point and

  f(x) ≥ max{y∞, ȳ, f(x1), . . . , f(xk)}.

   There may be no maximizers.

3. The first order condition has no solution. An element x is a maximizer if and only if it is a boundary point and f(x) ≥ max{y∞, ȳ}. There may be no maximizer.
If X is compact, then f has a maximizer, simplifying the situation somewhat.
Example Returning one last time to X = R²₊ and f(x) = x1x2 − 2x1⁴ − x2², we have noted that (1/4, 1/8) is the unique interior solution to the first order condition and that the second directional derivatives of f are non-positive at this point. Thus, we are in case 1(c). Note that when x1 = 0 or x2 = 0, we have f(x1, x2) ≤ 0, and f(0) = 0. Thus, ȳ = 0. Furthermore, we claim that y∞ = −∞. To see this, take any value c < 0, and suppose ‖(x1, x2)‖ = k. We argue that when k is sufficiently large, we necessarily have f(x1, x2) ≤ c. Rewriting f(x1, x2) as

  f(x1, x2) = −(x2 − x1)² + x1² − x1x2 − 2x1⁴,

we see that f(x1, x2) ≤ c if x1² − 2x1⁴ ≤ c. This in turn holds if x1 ≥ a = max{1, |c|}. Otherwise, x1 < a, and then

  x2 = √(k² − x1²) > √(k² − a²).
Then we have

  xn = (‖y − xn‖/‖y − x‖) x + (‖xn − x‖/‖y − x‖) y,

and concavity of f implies

  f(xn) ≥ (‖y − xn‖/‖y − x‖) f(x) + (‖xn − x‖/‖y − x‖) f(y),

so that

  [f(xn) − f(x)]/‖xn − x‖ ≥ [f(y) − f(x)]/‖y − x‖ > 0,

a contradiction.
Example Assume X = Rⁿ, and note that f(x) = −‖x − y‖² is strictly concave. The first order condition has the unique solution x = y, and we conclude that this is the unique maximizer of the function. (Of course, we could have verified that directly.)
The next result lays the foundation for comparative statics analysis, in which we consider how local maximizers vary with respect to underlying parameters of the problem. Specifically, we study the effect of letting a parameter, say α, vary in the objective function.
In case the matrix algebra in the preceding theorem is a bit hard to digest, we can state the derivative of ξ in terms of partial derivatives when n = 1: it is

  Dξ(α*) = − D_{xα} f(x*, α*) / D_x² f(x*, α*).
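To illustrate (a minimal sketch under an assumed functional form, using sympy), take the parameterized objective f(x, α) = αx − x², whose interior maximizer is ξ(α) = α/2; differentiating the maximizer directly agrees with the formula above:

    import sympy as sp

    x, alpha = sp.symbols('x alpha')
    f = alpha*x - x**2                      # assumed concave example objective

    xi = sp.solve(sp.diff(f, x), x)[0]      # maximizer: xi(alpha) = alpha/2
    direct = sp.diff(xi, alpha)             # derivative of the maximizer
    formula = -sp.diff(f, x, alpha) / sp.diff(f, x, 2)
    print(direct, sp.simplify(formula))     # both equal 1/2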
3 Pareto Optimality

[Figure 2: ideal points x̃1 and x̃2 of the two individuals, their indifference curves through an alternative x, and the lens of alternatives both strictly prefer to x.]
An alternative y Pareto dominates an alternative x if ui(y) ≥ ui(x) for all i, with strict inequality for at least one individual. An alternative is Pareto optimal if there is no alternative that Pareto dominates it.
Consider the case of two individuals and quadratic utility, i.e., ui(x) = −‖x − x̃i‖², and an alternative x, as in Figure 2. It is clear that any alternative in the shaded lens is strictly preferred to x by both individuals, which implies that x is Pareto dominated and, therefore, not Pareto optimal. In fact, this will be true whenever the individuals' indifference curves through an alternative create a lens shape like this. The only way the individuals' indifference curves won't create such a lens is if they meet at a tangency at the alternative x, and this happens only when x lies directly between the two individuals' ideal points. We conclude that, when there are just two individuals and both have Euclidean preferences, the set of Pareto optimal alternatives is the line segment connecting the two ideal points. See Figure 3 for elliptical indifference curves, in which case the set of Pareto optimal alternatives is a curve connecting the two ideal points. This motivates the standard terminology: when there are just two individuals, we refer to the set of Pareto optimal alternatives as the contract curve.
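The lens logic can be checked numerically. The sketch below (not from the notes; the ideal points and the grid are assumed for illustration, using numpy) searches a grid for alternatives that make both individuals strictly better off, finding none at a point on the segment between the ideal points and at least one at a point off the segment:

    import numpy as np

    ideal1, ideal2 = np.array([0.0, 0.0]), np.array([1.0, 1.0])  # assumed ideal points
    u = lambda x, ideal: -np.sum((x - ideal)**2)                 # quadratic utility

    def strictly_dominated(x, grid):
        # True if some grid point is strictly better for both individuals.
        return any(u(y, ideal1) > u(x, ideal1) and u(y, ideal2) > u(x, ideal2)
                   for y in grid)

    grid = [np.array([a, b]) for a in np.linspace(-0.5, 1.5, 81)
            for b in np.linspace(-0.5, 1.5, 81)]
    print(strictly_dominated(np.array([0.5, 0.5]), grid))  # False: on the segment
    print(strictly_dominated(np.array([0.8, 0.2]), grid))  # True: off the segment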
3.1 Existence of Pareto Optimals

Theorem 3.1 Assume λ1, . . . , λn > 0. If x solves

  max_{y∈A} Σ_{i=1}^n λi ui(y),

then x is Pareto optimal.
Proof Suppose x solves the above maximization problem but there is some alternative y that Pareto dominates it. Since ui(y) ≥ ui(x) for each i, each term λi ui(y) is at least as great as λi ui(x). And since there is some individual, say j, such that uj(y) > uj(x), and since λj > 0, there is at least one y-term that is strictly greater than the corresponding x-term. But then

  Σ_{i=1}^n λi ui(y) > Σ_{i=1}^n λi ui(x),

a contradiction.
From the preceding sufficient condition, we can then deduce the existence of at least
one Pareto optimal alternative very generally.
Theorem 3.2 Assume A ⊆ R^d is compact and each ui is continuous. Then there
exists a Pareto optimal alternative.
Proof Define the function f : A → R by f(x) = Σ_{i=1}^n λi ui(x) for all x, where λ1, . . . , λn > 0. Note that f is continuous, and so it achieves a maximum over the compact set A. Letting x be a maximizer, this alternative is Pareto optimal by Theorem 3.1.
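For a finite set of alternatives, this argument can be run directly: maximize a positively weighted sum of utilities and confirm that the maximizer is undominated. A small sketch (with assumed random data; numpy):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.random((50, 2))                      # 50 alternatives in the plane (assumed)
    ideals = [np.array([0.2, 0.8]), np.array([0.9, 0.1])]
    U = np.column_stack([-np.sum((A - c)**2, axis=1) for c in ideals])

    lam = np.array([0.3, 0.7])                   # strictly positive weights
    star = np.argmax(U @ lam)                    # maximizer of the weighted sum
    dominated = np.any((U >= U[star]).all(axis=1) & (U > U[star]).any(axis=1))
    print(dominated)                             # False: the maximizer is Pareto optimal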
We have shown that if an alternative maximizes the sum of utilities for strictly positive
weights, then it is Pareto optimal. The next result imposes Euclidean structure on
the set of alternatives and individual utilities, namely strict quasi-concavity, and
strengthens the result of Theorem 3.1 by weakening the sufficient condition to allow
some weights to be zero.
Theorem 3.3 Assume A ⊆ R^d is convex and each ui is strictly quasi-concave. Let λ1, . . . , λn ≥ 0 with at least one λi > 0. If x solves

  max_{y∈A} Σ_{i=1}^n λi ui(y),

then x is Pareto optimal.
Our sufficient condition for Pareto optimality for general utilities, Theorem 3.1, relies on all coefficients λi being strictly positive, while Theorem 3.3 weakens this for strictly quasi-concave utilities to at least one positive λi. In general, we cannot state a sufficient condition that allows some coefficients to be zero, even if we replace strict quasi-concavity with concavity.
Example Let there be two individuals, A = [0, 1], u1(x) = x, and u2(x) = 0. These utilities are concave, and x = 0 maximizes λ1u1(x) + λ2u2(x) with weights λ1 = 0 and λ2 = 1, but it is obviously not Pareto optimal.

In the latter example, of course the problem max_{x∈[0,1]} λ1u1(x) + λ2u2(x) (with λ1 = 0 and λ2 = 1) has multiple (in fact, an infinite number of) solutions. Next, we provide a different sort of sufficient condition, relying on uniqueness of solutions to the social welfare problem, for Pareto optimality.
Theorem 3.4 Assume that for weights λ1, . . . , λn ≥ 0 (not all zero), the problem

  max_{y∈A} Σ_{i=1}^n λi ui(y)

has a unique solution. If x solves the above maximization problem, then it is Pareto optimal.
[Figure 4: the convex set U of utility imputations and the open set V; the utility vector (u1(x*), . . . , un(x*)) lies on the boundary of U, with the weight vector λ = (λ1, . . . , λn) normal to a separating hyperplane. Axes: utility for 1, utility for 2.]
3.2 Characterization with Concavity

As yet, we have derived sufficient, but not necessary, conditions for Pareto optimality. To provide a more detailed characterization of the Pareto optimal alternatives under convexity and concavity conditions, we define the set of utility imputations as

  U = {z ∈ Rⁿ : there exists x ∈ A s.t. (u1(x), . . . , un(x)) ≥ z}.
Intuitively, given an alternative x, we may consider the vector (u1 (x), . . . , un (x))
of utilities generated by x. Note that this vector lies in Rⁿ, whose dimension equals the number of individuals. The set of utility imputations consists
of all such utility vectors, as well as any vectors less than or equal to them. See Figure
4 for the n = 2 case.
The next lemma gives some useful technical properties of the set of utility imputations.
In particular, assuming the set of alternatives is convex and utilities are concave, it
establishes that the set U of imputations is convex. See Figure 4.
Lemma 3.5 Assume A ⊆ R^m is convex and each ui is concave. Then U is convex. Furthermore, if each ui is strictly concave, then for all distinct x, y ∈ A and all . . .

. . . and (u1(x*), . . . , un(x*)) ≥ z*.
Next, assuming utilities are concave, we derive a necessary condition for Pareto optimality: if an alternative x is Pareto optimal, then there is a vector of non-negative weights λ = (λ1, . . . , λn) (not all zero) such that x maximizes the sum of individual utilities with those weights. Note that we do not claim that x must maximize the sum of utilities with strictly positive weights.
Theorem 3.6 Assume A ⊆ R^d is convex and each ui is concave. If x* is Pareto optimal, then there exist weights λ1, . . . , λn ≥ 0 (not all zero) such that x* solves

  max_{y∈A} Σ_{i=1}^n λi ui(y).
Proof Define V ⊆ Rⁿ as the set of vectors strictly greater than the utility vector (u1(x*), . . . , un(x*)) in each coordinate. For the remainder of the proof, let z* = (u1(x*), . . . , un(x*)) be the utility vector associated with x*. The set V is nonempty, convex, and open (and so has nonempty interior).
The set U of imputations is nonempty and, by Lemma 3.5, convex. Note that U ∩ V = ∅, for suppose otherwise. Then there exists z ∈ U ∩ V, which implies the existence of x ∈ A such that

  (u1(x), . . . , un(x)) ≥ z > z*.

But then we have x Pi x* for all i ∈ N, contradicting our assumption that x* is Pareto optimal. Therefore, by the separating hyperplane theorem, there is a hyperplane H that separates U and V. Let H be generated by the linear function f at value c, and let λ = (λ1, . . . , λn) ∈ Rⁿ be the non-zero gradient of f. Then we may assume without loss of generality that for all z ∈ U and all w ∈ V, we have f(z) ≤ c ≤ f(w), i.e., λ · z ≤ c ≤ λ · w. We claim that λ · z* = c, and in particular that x* solves the maximization problem in the theorem. Since z* ∈ U, it follows immediately that f evaluated at this vector is less than or equal to c. Suppose it is strictly less, i.e., λ · z* < c. Given ε > 0, define w = z* + ε(1, 1, . . . , 1), and note that w ∈ V, and therefore λ · w ≥ c. But for ε sufficiently small, we in fact have λ · w < c, a contradiction. That x* solves the maximization problem in the theorem then follows immediately: for all x ∈ A, we have (u1(x), . . . , un(x)) ∈ U, and then

  λ · (u1(x), . . . , un(x)) ≤ c = λ · z*,

or equivalently,

  Σ_{i∈N} λi ui(x) ≤ Σ_{i∈N} λi ui(x*),

as claimed. Finally, we claim that λ ∈ Rⁿ₊, i.e., λi ≥ 0 for all i ∈ N. To see this, suppose that λi < 0 for some i. Then we may define the vector w = z* + γei, and for γ high enough, we have

  λ · w = λ · z* + γλi < λ · z*.

For all ε > 0, we have w_ε = w + ε(1, 1, . . . , 1) ∈ V, and therefore λ · w_ε ≥ c. But we may choose ε > 0 sufficiently small that λ · w_ε < λ · z* = c, a contradiction. Thus, λ ∈ Rⁿ₊ \ {0}.
The proof of the previous result uses the separating hyperplane theorem and the following insight. We can think of the social welfare function above as merging two steps: first we apply individual utility functions to an alternative x to get a vector, say z = (z1, . . . , zn), of individual utilities, and then we take the dot product λ · z to get the social welfare from x. Of course, dot products are equivalent to linear functions, so we can view the second step as applying a linear function f : Rⁿ → R to the vector of utilities. Geometrically, when n = 2, we can draw the level sets of the linear function, and if x* maximizes social welfare with weights λ, then the vector of utilities from x*, denoted (u1(x*), . . . , un(x*)), must maximize the linear function over the set U of utility imputations. See Figure 4.
[Figure 5: the set U of utility imputations, the point (u1(1), u2(1)) on its boundary, and a weight vector λ = (λ1, λ2) ≥ 0. Axes: utility for 1, utility for 2.]
Corollary 3.7 Assume A ⊆ R^d is convex and each ui is strictly concave. Then x is Pareto optimal if and only if there exist weights λ1, . . . , λn ≥ 0 (not all zero) such that x solves max_{y∈A} Σ_{i=1}^n λi ui(y).

The condition that the weights are non-negative but not all zero cannot be strengthened to the condition that they are all strictly positive in the necessary condition of Theorem 3.6 and Corollary 3.7.
Example Suppose there are two individuals who must choose an alternative in the unit interval, A = [0, 1], with quadratic utilities: u1(x) = −x² and u2(x) = −(1 − x)². Then x = 1 is Pareto optimal, yet there do not exist strictly positive weights λ1, λ2 > 0 such that x maximizes λ1u1(y) + λ2u2(y). See Figure 5. Given any strictly positive weights, λ1 and λ2, the level set through (u1(1), u2(1)) of the linear function with gradient (λ1, λ2) cuts through the set of utility imputations; thus, (u1(1), u2(1)) does not maximize the linear function over the set of imputations.
The previous corollary uses the assumption of strict concavity to provide a full characterization of Pareto optimality. It is simple to deduce a more general conclusion
that relies instead on the uniqueness condition of Theorem 3.4.
Corollary 3.8 Assume that for all weights λ1, . . . , λn ≥ 0 (not all zero), the problem

  max_{y∈A} Σ_{i=1}^n λi ui(y)

has a unique solution. Then x is Pareto optimal if and only if there exist weights λ1, . . . , λn ≥ 0 (not all zero) such that x solves the above maximization problem.
One direction follows immediately from Theorem 3.6. Under the conditions of the
corollary, suppose x solves the maximization problem for some non-negative weights
(not all zero). Then Theorem 3.4 implies x is Pareto optimal, as required.
With the necessary condition for Pareto optimality established in Theorem 3.6, we can use calculus techniques to calculate contract curves in simple examples with two individuals. Let x* ∈ int X be Pareto optimal, which therefore maximizes λ1u1(x) + λ2u2(x) for some λ1, λ2 ≥ 0 such that λ1 + λ2 > 0. Then the first order necessary condition holds, and for all coordinates j, k = 1, . . . , n, we have

  λ1 Dj u1(x*) + λ2 Dj u2(x*) = 0
  λ1 Dk u1(x*) + λ2 Dk u2(x*) = 0.

Note that when Dk u1(x*) ≠ 0 and Dk u2(x*) ≠ 0, we have

  Dj u1(x*)/Dk u1(x*) = Dj u2(x*)/Dk u2(x*).
That is, the marginal rates of substitution of k for j are equal for the two individuals,
i.e., their indifference curves are tangent, as in Figures 2 and 3. And although the
machinery we have developed thus far requires the utilities u1 and u2 in the preceding
discussion to be concave, we will see that the analysis extends more generally.
Example Suppose A = R^d and each ui is quadratic, i.e., ui(x) = −‖x − x̃i‖². Since quadratic utilities are strictly concave, it follows that x is Pareto optimal if and only if there exist weights λ1, . . . , λn ≥ 0 (not all zero) such that x solves

  max_{y∈A} Σ_{i=1}^n λi ui(y).

Furthermore, since each ui is strictly concave, the function Σ_{i=1}^n λi ui(x) is strictly concave, so x is a solution to the above maximization problem if and only if it solves the first order condition

  0 = D Σ_{i=1}^n λi ui(x) = Σ_{i=1}^n 2λi(x̃i − x),

or

  x* = Σ_{i=1}^n [λi / Σ_{j=1}^n λj] x̃i.

Finally, writing αi = λi / Σ_{j=1}^n λj, we have αi ≥ 0 for all i, Σ_{i=1}^n αi = 1, and

  x* = Σ_{i=1}^n αi x̃i,

so the Pareto optimal alternatives are exactly the convex combinations of the individuals' ideal points.
3.3 Characterization with Differentiability

When utilities are differentiable, we can sharpen the characterization of the previous subsection. We first note that at an interior Pareto optimal alternative, the gradients of the individuals are linearly dependent.

Theorem 3.9 Assume A ⊆ R^d, let x be interior to A, and assume each ui is differentiable at x. If x is Pareto optimal, then there exist λ1, . . . , λn ≥ 0 (not all zero) such that Σ_{i=1}^n λi Dui(x) = 0.
Proof If there do not exist such weights, then 0 ∉ conv{Du1(x), . . . , Dun(x)}. Then by the separating hyperplane theorem, there is a non-zero vector p ∈ R^d such that p · Du1(x) > 0, . . . , p · Dun(x) > 0. Then there exists ε > 0 such that x + εp ∈ A and ui(x + εp) > ui(x) for all i, contradicting Pareto optimality of x.
An easy implication of Theorem 3.9 is a differentiable version of Theorem 3.6. Indeed, if each ui is differentiable and concave and x is Pareto optimal, then there are weights λ1, . . . , λn ≥ 0 (not all zero) such that x satisfies the first order condition for max_{y∈A} Σ_{i=1}^n λi ui(y), and by concavity, x solves the maximization problem.
We can take a geometric perspective by defining the mapping u : X → Rⁿ from alternatives to vectors of utilities, i.e., u(x) = (u1(x), . . . , un(x)). Then the derivative of u at x is the matrix

  Du(x) = [ ∂u1/∂x1(x)  · · ·  ∂u1/∂xd(x) ]
          [     ⋮         ⋱        ⋮      ]
          [ ∂un/∂x1(x)  · · ·  ∂un/∂xd(x) ]

The span of the columns is a linear subspace of Rⁿ called the tangent space of u at x. Theorem 3.9 implies that at a Pareto optimal alternative, the rank of this derivative is n − 1 or less. By Pareto optimality, u(x) belongs to the boundary of u(X). Furthermore, the theorem implies
  (λ1, . . . , λn) [ ∂u1/∂x1(x)  · · ·  ∂u1/∂xd(x) ]
                   [     ⋮         ⋱        ⋮      ]  =  0.
                   [ ∂un/∂x1(x)  · · ·  ∂un/∂xd(x) ]
[Figure 7: Unique weights. The vectors ∂u/∂x1(x), ∂u/∂x2(x), ∂u/∂x3(x) span the tangent space at u(x) on the boundary of u(X), and the normal space at u(x) is one-dimensional. Axes: utility for 1, utility for 2, utility for 3.]
See Figure 7 for the three-individual case. Then the normal space is one-dimensional, and the uniqueness claim follows.
Theorem 3.10 Assume A ⊆ R^d, let x be interior to A, and assume each ui is differentiable at x and that Du(x) has rank n − 1. Then there exist λ1, . . . , λn ≥ 0 (not all zero) such that Σ_{i=1}^n λi Dui(x) = 0, and these weights are unique up to a positive scaling.
The rank condition used in the previous result, while reasonable in some contexts,
is restrictive; it implies, for example, that the set of alternatives has dimension at
least n − 1. Note that the condition that the weights are non-negative and not all
zero implies that the tangent line at u(x) is downward sloping when n = 2, and it
formalizes the idea that the boundary of u(X) at u(x) is downward sloping for any
number of individuals.
4 Constrained Optimization

The general problem with inequality constraints takes the form

  max_{x∈Rⁿ} f(x)
  s.t. g1(x) ≤ c1
       ⋮
       gm(x) ≤ cm,
now defining f on the entire Euclidean space and building any restrictions on the domain into the constraints of the problem. Of course, it may be that gj(x) = −xj and cj = 0, so the jth constraint is just a non-negativity constraint: xj ≥ 0. Note that the problem of equality constraints is a special case of inequality constraints: we can always convert g(x) = c into two inequalities, g(x) ≤ c and −g(x) ≤ −c.
Finally, we consider the general problem with mixed constraints, where we are given equality constraints g1 : Rⁿ → R, . . . , gℓ : Rⁿ → R and inequality constraints h1 : Rⁿ → R, . . . , hm : Rⁿ → R. Then the form of the constraint set is

  C = {x ∈ Rⁿ | g1(x) = c1, . . . , gℓ(x) = cℓ, h1(x) ≤ d1, . . . , hm(x) ≤ dm}.

We consider the hybrid optimization

  max_{x∈Rⁿ} f(x)
  s.t. gj(x) = cj, j = 1, . . . , ℓ
       hj(x) ≤ dj, j = 1, . . . , m,

again defining f on the entire Euclidean space.
5 Equality Constraints

5.1 First Order Analysis
[Figure 8: Constrained local maximizer. Level sets of f, the constraint curve g = c, the gradients Df(x) and Dg(x) at the constrained local maximizer x, and another point y on the constraint curve.]
[Figure 9: A level set of f, the constraint curve g = c, the gradients Df(x) and Dg(x) at a point x, and another point y on the constraint curve.]
[Figure 10: Proof of Lagrange. The constraint curve g = c through x = 0, the gradient Dg(x), the interval I on the x1-axis, and the graph of the implicit function ψ(z).]
Theorem 5.1 (Lagrange) Let X ⊆ Rⁿ, f : X → R, and g : Rⁿ → R. Assume f and g are continuously differentiable in an open neighborhood around x, an interior point of X. Also assume Dg(x) ≠ 0. If x is a constrained local maximizer of f subject to g(x) = c, then there is a unique multiplier λ ∈ R such that

  Df(x) = λDg(x).   (1)
Proof I provide a heuristic argument for the case of two variables. The idea is to transform the constrained problem into an unconstrained one. The theorem assumes that Dg(x) ≠ 0, and (only to simplify notation) we will assume x = 0 and D2g(x) ≠ 0. The implicit function theorem implies that in an open interval I around x1 = 0, we may then view the level set of g at c as the graph of a function ψ : I → R such that for all z ∈ I, g(z, ψ(z)) = c. See Figure 10. Note that 0 = x = (0, ψ(0)). Furthermore, ψ is continuously differentiable with derivative

  Dψ(z) = − D1g(z, ψ(z)) / D2g(z, ψ(z)).   (2)

Because x is interior to X, we can choose the interval I small enough that each (z, ψ(z)) belongs to the domain X of the objective function. Then z = 0 is a local maximizer of the unconstrained problem

  max_{z∈I} f(z, ψ(z)),

and we know the first order condition holds, i.e., differentiating with respect to z and using the chain rule, we have

  D1f(0) + D2f(0)Dψ(0) = 0,

which implies

  D1f(0) − D2f(0) D1g(0)/D2g(0) = 0.

Defining λ = D2f(0)/D2g(0), we have Df(0) = λDg(0), as desired.
Of course, the first order condition from Lagrange's theorem can be written in terms of partial derivatives:

  ∂f/∂xi(x) = λ ∂g/∂xi(x)

for all i = 1, . . . , n. Thus, the theorem gives us n + 1 equations (including the constraint) in n + 1 unknowns (including λ). If we can solve for all of the solutions of this system, then we have an upper bound on the interior constrained local maximizers. Remember: the theorem of Lagrange gives a necessary condition for a constrained local maximizer, not a sufficient one; the solutions to the first order condition may not be local maximizers.

The number λ is the Lagrange multiplier corresponding to the constraint. The condition Dg(x) ≠ 0 is called the constraint qualification. Without it, the result would not be true.
Example Consider X = R, f(x) = (x + 1)², and g(x) = x². Consider the problem of maximizing f subject to g(x) = 0. The maximizer is clearly x = 0. But Dg(0) = 0 and Df(0) = 2, so there can be no λ such that Df(0) = λDg(0).
There is an easy way to remember the conditions in Lagrange's theorem: if x is an interior constrained local maximizer of f subject to g(x) = c, and if Dg(x) ≠ 0, then there exists λ ∈ R such that (x, λ) is a critical point of the function L : X × R → R defined by

  L(x, λ) = f(x) + λ(c − g(x)).
That is, there exists λ ∈ R such that

  ∂L/∂x1(x, λ) = ∂f/∂x1(x) − λ ∂g/∂x1(x) = 0
       ⋮
  ∂L/∂xn(x, λ) = ∂f/∂xn(x) − λ ∂g/∂xn(x) = 0
  ∂L/∂λ(x, λ) = c − g(x) = 0,
which is equivalent to the first order condition (1). The function L is called the Lagrangian function.

Though it's not quite technically correct, it's as though we've converted a constrained maximization problem into an unconstrained one: maximizing the Lagrangian L(x, λ) with respect to x. Imagine allowing x's that violate the constraint; for example, suppose, at a constrained maximizer x*, that we could increase the value of f by moving from x* to a nearby point x with g(x) > c. Since this x violates the constraint, we don't want this to be profitable, so the Lagrangian has to impose a cost of doing so in the amount λ(c − g(x)) (here, λ has to be positive). Then λ is like a price of violating the constraint imposed by the Lagrangian. The reason why this is not technically correct is that given the multiplier λ, a constrained local maximizer need not be a local maximizer of L(·, λ).
Example Consider X = R, f(x) = (x − 1)³ + x, g(x) = x, and

  max_{x∈R} f(x)
  s.t. g(x) = 1.

The unique solution to the constraint, and therefore to the maximization problem, is x = 1. Note that Df(x) = 3(x − 1)² + 1 and Dg(x) = 1, and evaluating at the solution x = 1, we have Df(1) = 1 = Dg(1). Thus, the multiplier for this problem is λ = 1. The Lagrangian is

  L(x, λ) = (x − 1)³ + x + λ(1 − x),

and evaluated at λ = 1, this becomes

  L(x, 1) = (x − 1)³ + 1.

But note that this function is strictly increasing at x = 1, i.e., for arbitrarily small ε > 0, we have L(1 + ε, 1) > L(1, 1), so x = 1 is not a local maximizer of L(·, 1).
Note the following implication of Lagrange's theorem: at a constrained local maximizer x, we have

  [∂f/∂xi(x)] / [∂f/∂xj(x)] = [∂g/∂xi(x)] / [∂g/∂xj(x)]

for all i and j. The lefthand side is the marginal rate of substitution telling us the value of xi in terms of xj. The righthand side tells us the cost of xi in terms of xj. Lagrange tells us that, at an interior local maximizer, those have to be the same.
Recall that Lagrange's theorem only gives us a necessary, not a sufficient, condition for a constrained local maximizer. To see why the first order condition is not generally sufficient, consider the following example.
Theorem 5.2 Let X be open and convex, let f : X → R be quasi-concave and continuously differentiable, and let g : Rⁿ → R be linear. If g(x) = c and there exists λ ∈ R such that the first order condition (1) holds at x, then x is a constrained global maximizer of f subject to g(x) = c provided either of two conditions holds:

1. Df(x) ≠ 0, or

2. f is concave.

The preceding example shows that the first order condition is not sufficient for a local maximizer (and a fortiori, not for a global maximizer). One approach to this problem, taken above, is to add the assumption of non-zero gradient. An alternative is to strengthen the first order condition to the assumption that x is a local maximizer... but this hope is not realized: in the previous example, re-define f to be constant at zero whenever x2 < 0, leaving the definition unchanged whenever x2 ≥ 0; then every vector with x2 < 0 is a local maximizer (right?) but not a global maximizer. We end by strengthening these assumptions even further, and deducing an even stronger condition: if f is quasi-concave and x is a constrained strict local maximizer, then x is the unique global maximizer.
Theorem 5.3 Let X ⊆ Rⁿ be convex, let f : X → R be quasi-concave, and let g : Rⁿ → R be linear. If x ∈ X is a constrained strict local maximizer, then it is the unique constrained global maximizer, i.e., it is the unique maximizer of f subject to g(x) = c.

Proof Assume x ∈ X is a constrained strict local maximizer, and suppose there exists y ∈ X with y ≠ x such that g(y) = c and f(y) ≥ f(x). Let ε > 0 be such that for all z ∈ X ∩ C ∩ B_ε(x) with z ≠ x, we have f(x) > f(z). Given any α with 0 < α < 1, define z(α) = αy + (1 − α)x. Then quasi-concavity implies f(z(α)) ≥ min{f(x), f(y)} = f(x). Furthermore, with g(x) = g(y) = c, linearity of g implies g(z(α)) = c. But for small enough α > 0, we have z(α) ∈ X ∩ C ∩ B_ε(x) and f(z(α)) ≥ f(x), a contradiction.
Of course, if f is strictly quasi-concave and x is a constrained local maximizer, then
it is a constrained strict local maximizer, and the theorem can be applied.
5.2 Examples
Note that we impose the constraint that the consumer must spend all of his income; since we assume monotonicity, this is without loss of generality. The set X ∩ C = R²₊ ∩ {(x1, x2) | p1x1 + p2x2 = I} is compact (since p1, p2 > 0), and u is continuous, so the maximization problem has a solution. We can apply Lagrange's theorem with

  f(x1, x2) = u(x1, x2)
  g(x1, x2) = p1x1 + p2x2
  c = I

to find all the constrained local maximizers (x1, x2) interior to R²₊ (i.e., x1, x2 > 0) satisfying Dg(x1, x2) ≠ 0. In fact, for all (x1, x2) ∈ R²₊,

  Dg(x1, x2) = (p1, p2) ≠ 0,

so the constraint qualification is always met. Letting (x1, x2) be an interior constrained local maximizer, there exists λ ∈ R such that (x1, x2, λ) is a critical point of the Lagrangian:

  L(x1, x2, λ) = u(x1, x2) + λ(I − p1x1 − p2x2).
That is,

  ∂L/∂x1(x1, x2, λ) = ∂u/∂x1(x1, x2) − λp1 = 0
  ∂L/∂x2(x1, x2, λ) = ∂u/∂x2(x1, x2) − λp2 = 0
  ∂L/∂λ(x1, x2, λ) = I − p1x1 − p2x2 = 0.
Solving these equations gives us the critical points of the Lagrangian, and if a maximizer (x1, x2) is interior to R²₊ (x1, x2 > 0), then it will be one of these critical points. Note that

  [∂u/∂x1(x1, x2)] / [∂u/∂x2(x1, x2)] = p1/p2,
i.e., the relative value of x1 in terms of x2 equals the relative price.

Consider the Cobb-Douglas special case u(x1, x2) = x1^α x2^β, where α, β > 0. It's clear that every maximizer must be interior to R²₊. (Right?) The critical points of the Lagrangian satisfy

  α x1^{α−1} x2^β − λp1 = 0
  β x1^α x2^{β−1} − λp2 = 0.

Divide to get

  (α x2)/(β x1) = p1/p2,  or  x2 = (β p1)/(α p2) x1.

Substituting into the budget constraint and solving, we have

  x1 = αI/((α + β)p1)  and  x2 = βI/((α + β)p2).

Since this critical point is unique, it is the unique maximizer, and we call

  x1(p1, p2, I) = αI/((α + β)p1)
  x2(p1, p2, I) = βI/((α + β)p2)

demand functions. They tell us the consumer's consumption for different prices and incomes. Fixing p2 and I, we can graph x1 as a function of p1, which gives us the demand curve for good 1. We can also solve for λ by substituting into α x1^{α−1} x2^β = λp1. This gives us

  λ = (α/p1)^α (β/p2)^β (I/(α + β))^{α+β−1}.
If α + β = 1, then the last term drops out. Note that we can always take a strictly increasing transformation of Cobb-Douglas utilities to obtain α + β = 1 without altering the consumer's demand functions, but such a transformation can affect the Lagrange multiplier.
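The Cobb-Douglas computation can be reproduced symbolically. The sketch below (assuming sympy, with the exponents fixed at α = β = 1 so the system is easy for the solver) solves the three critical-point equations of the Lagrangian:

    import sympy as sp

    x1, x2, lam, p1, p2, I = sp.symbols('x1 x2 lam p1 p2 I', positive=True)
    u = x1 * x2                               # Cobb-Douglas with alpha = beta = 1
    L = u + lam*(I - p1*x1 - p2*x2)           # the Lagrangian

    sol = sp.solve([sp.diff(L, x1), sp.diff(L, x2), sp.diff(L, lam)],
                   [x1, x2, lam], dict=True)[0]
    print(sol[x1], sol[x2], sol[lam])
    # I/(2*p1), I/(2*p2), I/(2*p1*p2): the demand functions and multiplier above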
Example Now consider the distributive model of social choice, where the set of alternatives is the unit simplex,

  X = {x ∈ Rⁿ₊ | Σ_{i=1}^n xi = 1},

and each individual simply wants more of this scarce resource for him- or herself. Formally, assume each i's utility function ui(x1, . . . , xn) is strictly increasing in xi and invariant with respect to reallocations of the resource among other individuals. Consider the welfare maximization problem of a social planner with non-negative weights λ1, . . . , λn (not all zero):

  max_{x∈X} Σ_{i=1}^n λi ui(x)
  s.t. Σ_{i=1}^n xi = 1.
In contrast to unconstrained maximization, where the first order condition means that the marginal impact of changing each choice variable is zero, now an interior allocation can be a local maximizer only if the marginal impacts are equalized across individuals. If a local maximum involves some individuals receiving an allocation of zero, then the logic extends to all individuals who receive a positive amount of the resource. Now consider the special case ui(x) = ln(xi). (Henceforth, we only consider alternatives in which each individual receives a strictly positive amount of the resource, so utilities are well-defined.) These utilities are concave in x but not strictly concave or even strictly quasi-concave. Given the structure of the set of alternatives and utilities, we can write the first order condition as

  ∂/∂xi [Σ_{j=1}^n λj ln(xj)] = λi/xi = μ for all i,  and  Σ_{i=1}^n xi = 1,

where μ is the multiplier on the resource constraint. Note that

  e^{Σ_{i=1}^n λi ln(xi)} = Π_{i=1}^n e^{λi ln(xi)} = Π_{i=1}^n xi^{λi},

which has the form of a Cobb-Douglas utility function with exponent λi on xi; thus, the above problem is isomorphic to the problem of a Cobb-Douglas consumer facing unit prices p1 = · · · = pn = 1 and income I = 1. Because the maximization problem has a unique solution for all such weights, the characterization result of Corollary 3.8 applies, and so we have solved for all Pareto optimal alternatives. In fact, by varying the weights λ1, . . . , λn, we conclude that every alternative is Pareto optimal, a fact that was pretty obvious from the outset (right?).
Example Prior to a national election, suppose a political party must decide how much to spend in a number of electoral districts i = 1, . . . , n. Let xi denote the amount spent in district i, and assume x1 ≥ 0, . . . , xn ≥ 0, Σ_{i=1}^n xi = I. The probability the party wins district i is Pi(xi), where Pi : R₊ → R is a differentiable function. The party seeks to maximize the expected number of districts it wins, i.e.,

  max_{(x1,...,xn)∈Rⁿ₊} Σ_{i=1}^n Pi(xi)
  s.t. x1 + · · · + xn = I.
Again, the first order conditions reduce to the following simple principle: allocate money to districts in a way that equalizes the marginal probability of victory across the districts. Note that the special case Pi(xi) = αi ln(xi) is equivalent to the Cobb-Douglas specification of the consumer's problem. For an alternative parameterization, it could be that

  Pi(xi) = (βi + xi)/(β + xi),

where βi < β and βi may vary across districts. Writing γi = β − βi, the first order condition is

  γ1/(β + x1)² = γ2/(β + x2)² = · · · = γn/(β + xn)².
The solutions to these equations will include all interior maximizers, if any. (Whether there are any will depend on the βi's. If βi is close to β, so the probability of victory is close to one, spending in district i will be low, and for extreme enough parameters spending in some districts will be zero.) The n = 2 case is analytically tractable:

  x1 = (δ1 I + β(δ1 − δ2))/(δ1 + δ2)
  x2 = (δ2 I + β(δ2 − δ1))/(δ1 + δ2),

where δ1 = √γ1 and δ2 = √γ2.
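These closed forms can be checked against a numerical solver. The sketch below (assumed parameter values; scipy's SLSQP handles the budget constraint) maximizes the expected number of districts won and compares the result with the formulas for x1 and x2:

    import numpy as np
    from scipy.optimize import minimize

    beta, I = 1.0, 1.0
    gamma = np.array([1.0, 0.81])            # gamma_i = beta - beta_i (assumed values)
    beta_i = beta - gamma

    # Maximize sum_i P_i(x_i) = sum_i (beta_i + x_i)/(beta + x_i).
    obj = lambda x: -np.sum((beta_i + x) / (beta + x))
    res = minimize(obj, x0=np.array([0.5, 0.5]), method='SLSQP',
                   bounds=[(0, None)]*2,
                   constraints={'type': 'eq', 'fun': lambda x: x.sum() - I})

    d = np.sqrt(gamma)
    x1 = (d[0]*I + beta*(d[0] - d[1])) / d.sum()
    print(res.x, np.array([x1, I - x1]))     # both approximately (0.579, 0.421)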
5.3 Second Order Analysis
Second order necessary and sufficient conditions are more complicated than they were in unconstrained optimization. As in the first order analysis, a condition on second order directional derivatives needs to be satisfied at an interior constrained local maximizer, but only in directions t that are orthogonal to the gradient of g at x, i.e., tangent to the level set of g at c. Once again, we must deal with the complication that we can only move along the level set of g at c; moving from a constrained local maximizer in such a direction t can violate the constraint if g is non-linear. Once again, the insight is to convert the constrained optimization problem into an unconstrained one using the implicit function theorem.
Theorem 5.4 Let f : X → R and g : Rⁿ → R be twice continuously differentiable in an open neighborhood around x, an interior point of X. Assume Dg(x) ≠ 0. Assume x is a constrained local maximizer of f subject to g(x) = c, and let λ ∈ R satisfy the first order condition (1). Then

  t′[D²f(x) − λD²g(x)]t ≤ 0   (3)

for all directions t with Dg(x) · t = 0.
Proof We give a heuristic proof for the two-variable case similar to that of Lagrange's theorem. As above, we have Dg(x) ≠ 0, and we assume for simplicity that x = 0 and D2g(x) ≠ 0. To further simplify matters, assume that the gradient of g at x points straight up, so D1g(x) = 0. (This just amounts to a rotation of axes that doesn't affect the second order analysis.) Again, we have an open interval I around x1 = 0 and a continuously differentiable function ψ : I → R such that for all z ∈ I, we have g(z, ψ(z)) = c. Because we assume Dg(0) = (0, D2g(0)), this means Dg(x) · t = 0 for exactly two directions, i.e., t = (1, 0) and t = (−1, 0). In either case, the necessary second order condition is

  D1²f(0) ≤ λD1²g(0).

To obtain this, we note again that z = 0 is a local maximizer of the unconstrained problem

  max_{z∈I} F(z), where F(z) = f(z, ψ(z)).

Since Dψ(0) = 0, differentiating (2) yields

  D²ψ(0) = − D1²g(0)/D2g(0).

Substituting the latter expression for D²ψ(0) into the inequality D²F(0) ≤ 0, we have

  D1²f(0) − D2f(0) D1²g(0)/D2g(0) ≤ 0.

Finally, recall that the first order condition (1) implies D2f(0) = λD2g(0), so the preceding inequality becomes D1²f(0) ≤ λD1²g(0), as required.
We can write the necessary second order condition in (3) in terms of the Lagrangian. Recall the Lagrangian of an equality constrained maximization problem is defined as

  L(x, λ) = f(x) + λ(c − g(x)).

Suppose the first order condition (1) holds at x with multiplier λ, i.e.,

  D_x L(x, λ) = Df(x) − λDg(x) = 0,

and define the Hessian of the Lagrangian with respect to x as

  D_x²L(x, λ) = [ ∂²L/∂x1²(x, λ)     · · ·  ∂²L/∂x1∂xn(x, λ) ]
                [       ⋮             ⋱           ⋮          ]
                [ ∂²L/∂xn∂x1(x, λ)  · · ·  ∂²L/∂xn²(x, λ)    ]

Then condition (3) is equivalent to

  t′ D_x²L(x, λ) t ≤ 0  for all t with Dg(x) · t = 0.
How do we check whether the Hessian satisfies these inequalities? We can form the bordered Hessian of the Lagrangian,

  [ 0        Dg(x)       ]
  [ Dg(x)′   D_x²L(x, λ) ],

and then check signs of the last n − 1 leading principal minors of the matrix. But this takes us beyond the scope of these notes. See Simon and Blume (1994) for a nice explanation.
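For the Cobb-Douglas consumer with u(x1, x2) = x1x2 (a sketch under that assumption; prices are assumed values), the check is easy because the constraint is linear, so D_x²L reduces to the Hessian of u; both the direct inequality and the bordered Hessian sign test can be evaluated numerically:

    import numpy as np

    p = np.array([2.0, 3.0])                 # assumed prices
    H = np.array([[0.0, 1.0], [1.0, 0.0]])   # Hessian of u(x1,x2) = x1*x2; g linear,
                                             # so D_x^2 L equals D^2 u
    t = np.array([p[1], -p[0]])              # spans {t : Dg(x) . t = 0}
    print(t @ H @ t)                         # -2*p1*p2 < 0: condition (4) holds

    B = np.block([[np.zeros((1, 1)), p[None, :]],
                  [p[:, None], H]])          # bordered Hessian
    print(np.linalg.det(B))                  # 2*p1*p2 > 0: the sign required for a
                                             # maximum when n = 2 and m = 1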
Of course, the latter result provides only a necessary second order condition, not a sufficient one. Strengthening the condition to strict inequality, we have a stronger second order condition that is sufficient for a constrained strict local maximizer. Note that, in contrast to the analysis of necessary conditions, the next result does not rely on the constraint qualification.

Theorem 5.5 Let f : X → R and g : Rⁿ → R be twice continuously differentiable in an open neighborhood around x, an interior point of X. Assume x satisfies the constraint g(x) = c and the first order condition (1) with multiplier λ ∈ R. If

  t′[D²f(x) − λD²g(x)]t < 0   (4)

for all directions t with Dg(x) · t = 0, then x is a constrained strict local maximizer of f subject to g(x) = c.
Again, we can write the sufficient second order condition in terms of the Lagrangian as t′ D_x²L(x, λ) t < 0 for all t with Dg(x) · t = 0.

In fact, with the first order condition (1) from Lagrange's theorem, the second order sufficient condition implies much more. It implies that x is locally isolated, i.e., there is an open set Y ⊆ Rⁿ around x such that x is the unique constrained local maximizer belonging to Y. Furthermore, following the analysis of unconstrained optimization, we can consider the possibility that the objective function f and the constraint function g contain parameters, notationally suppressed until now, and we can study the effect of letting one parameter, say α, vary. Of course, if x is a constrained local maximizer given parameter α, and then the value of the parameter changes a small amount to α′, then x may no longer be a constrained local maximizer, but assuming the second order sufficient condition, the new constrained local maximizer will be close to x, and its location will vary smoothly as we vary the parameter. Note that the constraint qualification is reinstated in the next result.
Theorem 5.6 Let I be an open interval and X ⊆ Rⁿ, and let f : X × I → R and g : Rⁿ × I → R be twice continuously differentiable in an open neighborhood of (x*, α*), an interior point of X × I. Assume D_x g(x*, α*) ≠ 0. Assume x* satisfies the constraint g(x*, α*) = c and the first order condition at α*, i.e.,

  D_x f(x*, α*) − λ* D_x g(x*, α*) = 0,

with multiplier λ* ∈ R. If

  t′[D_x²f(x*, α*) − λ* D_x²g(x*, α*)]t < 0

for all t with D_x g(x*, α*) · t = 0, then there are an open set Y ⊆ Rⁿ with x* ∈ Y, an open interval J ⊆ R with α* ∈ J, and continuously differentiable mappings ξ : J → Y and λ : J → R such that for all α ∈ J, (i) ξ(α) is the unique maximizer of f(·, α) subject to g(x, α) = c belonging to Y, (ii) the unique multiplier for which ξ(α) satisfies the first order necessary condition (1) at α is λ(α), and (iii) ξ(α) satisfies the second order sufficient condition (4) at α with multiplier λ(α).
The preceding result lays the theoretical groundwork necessary for studying the effect of a parameter on the solution to a given optimization problem. This exercise is referred to as comparative statics. For example, under the conditions of the preceding theorem, we can take partial derivatives,

  ∂x1/∂p1, ∂x1/∂p2, ∂x1/∂I, etc.,

that tell us how the consumer's maximizer changes with respect to market parameters.
Recall the Cobb-Douglas demand functions

  x1(p1, p2, I) = αI/((α + β)p1)  and  x2(p1, p2, I) = βI/((α + β)p2).

The conditions of Theorem 5.6 are satisfied here, and indeed we can directly compute partial derivatives of demand for good 1 as

  ∂x1/∂p1(p1, p2, I) = −αI/((α + β)p1²)
  ∂x1/∂p2(p1, p2, I) = 0
  ∂x1/∂I(p1, p2, I) = α/((α + β)p1).

Obviously, partial derivatives for good 2 are similar. Interesting features of Cobb-Douglas demands are that demand curves are downward-sloping and the demand for any good is invariant with respect to price changes in other goods. Indeed, the share of income spent on good 1 is always α/(α + β) and the share spent on good 2 is β/(α + β).
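These comparative statics can be verified by brute force: re-solve the consumer's problem at perturbed prices and take a finite difference. A sketch (assumed parameter values; scipy):

    import numpy as np
    from scipy.optimize import minimize

    a, b = 1.0, 1.0                          # assumed Cobb-Douglas exponents

    def demand1(p1, p2, I):
        # Numerically solve the consumer's problem and return x1.
        obj = lambda x: -(x[0]**a * x[1]**b)
        con = {'type': 'eq', 'fun': lambda x: I - p1*x[0] - p2*x[1]}
        return minimize(obj, x0=[I/(2*p1), I/(2*p2)], method='SLSQP',
                        bounds=[(1e-6, None)]*2, constraints=con).x[0]

    p1, p2, I, h = 2.0, 3.0, 10.0, 1e-3
    fd = (demand1(p1 + h, p2, I) - demand1(p1 - h, p2, I)) / (2*h)
    print(fd, -a*I/((a + b)*p1**2))          # finite difference vs. the formula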
Given the preceding result and the mapping ξ, which specifies a constrained local maximizer as a continuously differentiable function of the parameter, the locally maximized value

  F(α) = f(ξ(α), α)

is itself a continuously differentiable mapping. The next result is an extension of the envelope theorem to equality constrained maximization problems; it provides a simple technique for performing comparative statics on this maximized value function: basically, we can take a simple partial derivative of the parameterized Lagrangian,

  L(x, λ, α) = f(x, α) + λ(c − g(x, α)),

only through the α argument. That is, although the location of the constrained local maximizer may indeed change when we vary α, we can ignore that variation, treating the constrained local maximizer as fixed in taking the derivative.
Theorem 5.7 Let I be an open interval and X ⊆ Rⁿ, and let f : X × I → R and g : Rⁿ × I → R be twice continuously differentiable in an open neighborhood around (x*, α*), an interior point of X × I. Let ξ : I → Rⁿ and λ : I → R be continuously differentiable mappings such that for all α ∈ I, ξ(α) is a constrained local maximizer satisfying the first order condition (1) with multiplier λ(α) at α. Let x* = ξ(α*) and λ* = λ(α*), and define the mapping F : I → R by F(α) = f(ξ(α), α). Then

  DF(α*) = ∂L/∂α(x*, λ*, α*).
Proof Using the chain rule and the first order condition (1), we have

  DF(α*) = Σ_{i=1}^n ∂f/∂xi(x*, α*) dξi/dα(α*) + ∂f/∂α(x*, α*)
         = λ* Σ_{i=1}^n ∂g/∂xi(x*, α*) dξi/dα(α*) + ∂f/∂α(x*, α*)
         = ∂f/∂α(x*, α*) − λ* ∂g/∂α(x*, α*) = ∂L/∂α(x*, λ*, α*).

To verify the third equality, we write G(α) = g(ξ(α), α) and use the chain rule to conclude

  DG(α) = Σ_{i=1}^n ∂g/∂xi(ξ(α), α) dξi/dα(α) + ∂g/∂α(ξ(α), α) = 0,

where the second equality above follows since g(ξ(α), α) takes a constant value of c on I, so its derivative is zero. Then

  ∂g/∂α(x*, α*) = −Σ_{i=1}^n ∂g/∂xi(x*, α*) dξi/dα(α*),

which yields the desired expression.
The previous analysis looks less general than it is, and in fact, it provides an intuitive interpretation of the Lagrange multiplier. Although the parameter does not explicitly enter the value of the constraint, c, we can consider a simple linear specification, g(x, α) = g(x) − α, so the Lagrangian becomes

  L(x, λ, α) = f(x) + λ(c − g(x) + α),

and then

  ∂L/∂α(x*, λ*, α*) = λ*.

That is, the value of the multiplier at a constrained local maximizer tells us the marginal effect of increasing the value of the constraint.
Example In the consumer's problem, given prices and income p1, p2, and I, let x1(p1, p2, I) and x2(p1, p2, I) be demands satisfying the first order condition and second order sufficient condition. Then the consumer's maximum utility is

  U(p1, p2, I) = u(x1(p1, p2, I), x2(p1, p2, I)),

and the function U(·) is called the consumer's indirect utility function. How does this vary with respect to prices and income? Consider I. According to the envelope theorem, we take the partial derivative of the Lagrangian,

  u(x1, x2) + λ(I − p1x1 − p2x2),

with respect to I, where x1 and x2 are fixed at their maximized values and λ is the associated multiplier. That's just λ! Thus, we see that the Lagrange multiplier measures the rate at which the consumer's utility increases with her income, i.e., it is the marginal utility of money. How does the consumer's maximum utility vary with the price p1? It is simply −λx1.
5.4 Multiple Equality Constraints

Theorem 5.8 Let f : X → R and g1 : Rⁿ → R, . . . , gm : Rⁿ → R be continuously differentiable in an open neighborhood around x, an interior point of X. Assume the gradients of the constraints, {Dg1(x), . . . , Dgm(x)}, are linearly independent. If x is a local constrained maximizer of f subject to g1(x) = c1, . . . , gm(x) = cm, then there are unique multipliers λ1, λ2, . . . , λm ∈ R such that

  Df(x) = Σ_{j=1}^m λj Dgj(x).   (5)
Of course, the first order condition can be written in terms of partial derivatives:

  ∂f/∂xi(x) = Σ_{j=1}^m λj ∂gj/∂xi(x).
When there is one constraint (m = 1), the requirement is that Dg1(x) ≠ 0 (the same as before); when there are two constraints (m = 2), the requirement is that the two gradients, Dg1(x) and Dg2(x), are not collinear; and when there are three constraints (m = 3), the requirement is that the gradients of the constraints do not lie on the same plane.
As for the case of a single equality constraint, the constraint qualification is needed
for the result. To provide some geometric insight into the condition, consider the
case of two constraints in Figure 11. Here, x is a constrained local maximizer of f
(in fact, it is the unique element of the constraint set), but Df (x) cannot be written
as a linear combination of Dg1 (x) and Dg2 (x), which are linearly dependent.
Put differently, Lagrange's theorem says that if x is an interior constrained local maximizer, there exist λ1, . . . , λm ∈ R such that (x, λ1, . . . , λm) is a critical point of the Lagrangian function L : X × R^m → R, now defined by

  L(x, λ1, . . . , λm) = f(x) + Σ_{j=1}^m λj(cj − gj(x)).
[Figure 11: Constraint qualification needed. The curves g1 = c1 and g2 = c2 meet at x, where the gradients Dg1(x) and Dg2(x) are collinear and Df(x) lies outside their span.]
The analysis of second order conditions and envelope theorems is very much the same as with a single equality constraint. Indeed, the interpretation of the Lagrange multipliers is the same: λj tells us the rate at which the maximized value of f changes if we increase cj in the jth constraint.
Example Consider the problem of a social planner in an exchange economy. There are n consumers and K commodities. The social endowment (the amount in existence) of good k is Wk. The planner has to decide on an allocation of the goods to consumers, where x^i = (x^i_1, x^i_2, . . . , x^i_K) is the bundle for consumer i and x^1_k + x^2_k + · · · + x^n_k = Wk for each good k. Given continuously differentiable utility functions u1, u2, . . . , un representing the preferences of consumers and non-negative weights λ1, . . . , λn (not all zero), the planner solves

  max_{x^i_k ∈ R₊, i=1,...,n, k=1,...,K} Σ_{i=1}^n λi ui(x^i_1, . . . , x^i_K)
  s.t. Σ_{i=1}^n x^i_k = Wk, k = 1, 2, . . . , K.
This is a maximization problem subject to multiple equality constraints, one for each commodity. The Lagrangian for the problem is

  L(x^1, . . . , x^n, μ) = Σ_{i=1}^n λi ui(x^i_1, . . . , x^i_K) + Σ_{k=1}^K μk (Wk − Σ_{i=1}^n x^i_k).
The first order conditions are

  λi ∂ui/∂x^i_k(x^i_1, . . . , x^i_K) = μk,  i = 1, . . . , n, k = 1, . . . , K,
  Σ_{i=1}^n x^i_k = Wk,  k = 1, . . . , K.

Dividing the conditions for goods k and ℓ, we have

  [∂ui/∂x^i_k(x^i)] / [∂ui/∂x^i_ℓ(x^i)] = μk/μℓ.

That is, if we look at any i's marginal rate of substitution between any goods k and ℓ (measuring the value of k for i in terms of ℓ), it is μk/μℓ. This is independent of i, so the marginal rates of substitution of the consumers are equal. Indeed, recall that μk is the rate at which maximized social welfare increases with an increase in the total amount of good k (and similarly for ℓ). So μk/μℓ is the social value of good k in terms of good ℓ, i.e., an extra unit of good k is worth roughly μk/μℓ units of good ℓ. The first order condition says that the planner must equate the individual values of the goods to the social value. Second, the first order conditions imply
  [λi ∂ui/∂x^i_k(x^i)] / [λj ∂uj/∂x^j_k(x^j)] = 1,

or equivalently,

  [∂ui/∂x^i_k(x^i)] / [∂uj/∂x^j_k(x^j)] = λj/λi.

The lefthand side compares the increased utility consumer i would get from more of good k to the increased utility consumer j would get. If it is high, i's weight in the welfare function must be low compared to j's. Is this the opposite of what you expected? If i's weight weren't relatively low, the planner would go ahead and give i more of good k, and that would raise social welfare, but then the original allocation couldn't have been optimal! To continue the example, recall the definition of a Walrasian equilibrium allocation
couldnt have been optimal! To continue the example, recall the definition of a Walrasian equilibrium allocation (
x1 , . . . , xn ): there exist prices p1 , p2 , . . . , pK such that
1. for all i = 1, . . . , n, xi = (
xi1 , . . . , xiK ) solves
maxxi1 ,...,xiK 0 ui (xi1 , . . . , xiK )
i
s.t. p1 xi1 + + pK xiK p1 w1i + + p1 wK
.
40
2. for all k = 1, . . . , K, Wk =
Pn
i=1
xik .
Letting μ^i denote the multiplier on consumer i's budget constraint, the first order condition for consumer i's problem is

  ∂ui/∂x^i_k(x̄^i_1, . . . , x̄^i_K) = μ^i pk,

or equivalently,

  (1/μ^i) ∂ui/∂x^i_k(x̄^i_1, . . . , x̄^i_K) = pk.

Clearly, then, the Walrasian allocation (x̄^1, . . . , x̄^n) satisfies the first order conditions of the planner's problem with weights λi = 1/μ^i and multipliers μk = pk. Adding concavity of consumer utilities (see Theorem 5.9), we can conclude that the Walrasian allocation is indeed the social optimum given weights 1/μ^i, and then the multiplier pk represents the social value of good k.
Other results from the analysis of a single equality constraint carry over to the case of multiple constraints. First, we note the implications of quasi-concave objective and linear constraints. Again, the result follows from Theorem 6.2.

Theorem 5.9 Let X be open and convex, let f : X → R be quasi-concave and continuously differentiable, and let g1 : Rⁿ → R, . . . , gm : Rⁿ → R be linear. If g1(x) = c1, . . . , gm(x) = cm, and there exist λ1, . . . , λm ∈ R such that the first order condition (5) holds with respect to x, then x is a constrained global maximizer of f subject to g1(x) = c1, . . . , gm(x) = cm provided either of two conditions holds:

1. Df(x) ≠ 0, or

2. f is concave.
Theorem 5.10 Let X Rn be convex, let f : X R be quasi-concave, and let
g1 : Rn R, . . . , gm : Rn R be linear. If x X is a constrained strict local
maximizer, then it is the unique constrained global maximizer, i.e., it is the unique
maximizer of f subject to g1 (x) = c1 , . . . , gm (x) = cm .
Moving to the second order analysis, the necessary condition is again that the second directional derivative of the Lagrangian be non-positive, now in every direction that is orthogonal to the gradient of each constraint function.

Theorem 5.11 Let f : X → R and g1 : Rⁿ → R, . . . , gm : Rⁿ → R be twice continuously differentiable in an open neighborhood around x, an interior point of X. Assume the gradients {Dg1(x), . . . , Dgm(x)} are linearly independent. Assume x is a constrained local maximizer of f subject to g1(x) = c1, . . . , gm(x) = cm, and let λ1, . . . , λm ∈ R satisfy the first order condition (5). Consider any direction t such that Dgj(x) · t = 0 for all j = 1, . . . , m. Then

  t′[D²f(x) − Σ_{j=1}^m λj D²gj(x)]t ≤ 0.
Again, strengthening the weak inequality to strict gives us a second order condition that, in combination with the first order condition, is sufficient for a constrained strict local maximizer. Note that, in contrast to the analysis of necessary conditions, the next result does not rely on the constraint qualification.

Theorem 5.12 Let f : X → R and g1 : Rⁿ → R, . . . , gm : Rⁿ → R be twice differentiable in an open neighborhood around x, an interior point of X. Assume x satisfies the constraints g1(x) = c1, . . . , gm(x) = cm and the first order condition (5) with multipliers λ1, . . . , λm ∈ R. Assume that for all directions t with Dgj(x) · t = 0 for all j = 1, . . . , m, we have

  t′[D²f(x) − Σ_{j=1}^m λj D²gj(x)]t < 0.   (6)

Then x is a constrained strict local maximizer of f subject to g1(x) = c1, . . . , gm(x) = cm.

Parameterized versions of these results also carry over: the first order condition becomes

  D_x f(x*, α*) − Σ_{j=1}^m λj* D_x gj(x*, α*) = 0,

and the envelope theorem again gives DF(α*) = ∂L/∂α(x*, λ*, α*).
Again, we can use the envelope theorem to characterize λj as the marginal effect of increasing the value of the jth constraint.
6 Inequality Constraints

6.1 First Order Analysis
[Figure 12: Kuhn-Tucker conditions. The constraint set C bounded by g1 = c1 and g2 = c2; at an interior point x, Df(x) = 0; at y, where only the first constraint binds, Df(y) is proportional to Dg1(y); at z, where both bind, Df(z) lies in the cone generated by Dg1(z) and Dg2(z).]
Theorem 6.1 (Kuhn-Tucker) Let X ⊆ Rⁿ, f : X → R, and g1 : Rⁿ → R, . . . , gm : Rⁿ → R be continuously differentiable in an open neighborhood around x, an interior point of X, and assume the gradients of the constraints that bind at x are linearly independent. If x is a constrained local maximizer of f subject to g1(x) ≤ c1, . . . , gm(x) ≤ cm, then there are unique multipliers λ1, . . . , λm ∈ R such that

  Df(x) = Σ_{j=1}^m λj Dgj(x)   (7)
  λj(cj − gj(x)) = 0,  j = 1, . . . , m   (8)
  λj ≥ 0,  j = 1, . . . , m.   (9)
Proof Under the conditions of the theorem, suppose x is a constrained local maximizer, and let g1, . . . , gk be the constraints that bind at x. By Gale's (1960) Theorem 2.9, either there exist μ0, μ1, . . . , μk ≥ 0 (not all zero) such that

  μ0 Df(x) + Σ_{j=1}^k μj(−Dgj(x)) = 0,

or there exists a direction t such that Df(x) · t > 0 and −Dgj(x) · t > 0 for all j = 1, . . . , k. In the latter case, however, we can choose ε > 0 sufficiently small so that f(x + εt) > f(x) and gj(x + εt) < gj(x) = cj for all j = 1, . . . , k, but then x is not a constrained local maximizer, a contradiction. In the former case, note that linear independence of {Dg1(x), . . . , Dgk(x)} implies that μ0 ≠ 0, and so we can define λj = μj/μ0 for the binding constraints and λj = 0 for the slack constraints, which yields (7)-(9).
As before, we define the Lagrangian

  L(x, λ1, . . . , λm) = f(x) + Σ_{j=1}^m λj(cj − gj(x)),

and then condition (7) from Theorem 6.1 is the requirement that x is a critical point of the Lagrangian given multipliers λ1, . . . , λm.
An important difference from the case of equality constraints is that the constraint qualification now applies only to the gradients of binding constraints. (With equality constraints, every constraint is binding, but now some may not be.) Another important difference, touched on above, is that the multipliers are non-negative. Indeed, the interpretation of λj is as before: it tells us the rate of change of the maximized value of the objective function as we increase the constrained value cj of the jth constraint. But now only the inequality gj(x) ≤ cj needs to be maintained, so increasing cj can't hurt, so the multipliers are non-negative. Yet another difference is that the equality constraints gj(x) = cj have been replaced in (8) by the conditions λj(cj − gj(x)) = 0, j = 1, . . . , m, which are called the complementary slackness conditions. Put differently, complementary slackness says that

  λj > 0  ⇒  cj − gj(x) = 0
  cj − gj(x) > 0  ⇒  λj = 0.

In words, the multiplier of every slack constraint is zero, and every constraint with a positive multiplier is binding.
Referring back to Figure 12, first consider a constrained local maximizer such as x. Assuming the constraint qualification holds, the Karush-Kuhn-Tucker theorem says that the gradient of f at x can be written as

  Df(x) = λ1 Dg1(x) + λ2 Dg2(x),

and since both constraints are slack, complementary slackness implies λ1 = λ2 = 0, which gives us Df(x) = 0. At y, the second constraint is slack, so λ2 = 0, and we have Df(y) = λ1 Dg1(y) for λ1 ≥ 0, as depicted. At z, the first order condition (7) implies that the gradient of the objective lies in the semi-positive cone generated by the gradients of the binding constraints, as depicted.
To see why the constraint qualification is needed, consider Figure 13, a simple adaptation of Figure 11. Again, x is a constrained local maximizer of f (in fact, it is the unique element of the constraint set), but Df(x) cannot be written as a non-negative linear combination of Dg1(x) and Dg2(x), because they are linearly dependent.
Example The consumer's problem is most accurately formulated in terms of inequality constraints. We can now think of u defined on all of R² and impose non-negativity constraints on the consumer's choice. The problem is

  max_{x∈R²} u(x1, x2)
  s.t. p1x1 + p2x2 ≤ I
       −x1 ≤ 0
       −x2 ≤ 0.

Defining g1(x1, x2) = p1x1 + p2x2, g2(x1, x2) = −x1, and g3(x1, x2) = −x2, this is a maximization problem subject to the inequality constraints (1) g1(x1, x2) ≤ I, (2) g2(x1, x2) ≤ 0, and (3) g3(x1, x2) ≤ 0.
[Figure 13: Constraint qualification needed. The constraint curves g1 ≤ c1 and g2 ≤ c2 meet at x, where Dg1(x) and Dg2(x) are collinear and Df(x) lies outside the cone they generate.]
Note that the constraints cannot all bind simultaneously. First, consider the possibility that only (2) binds, i.e., p1x1 + p2x2 < I, x1 = 0, and x2 > 0. Note that Dg2(x) = (−1, 0) ≠ 0, so the constraint qualification is met. By complementary slackness, it follows that λ1 = λ3 = 0, so the first order condition becomes

  ∂u/∂x1(x1, x2) = λ2 ∂g2/∂x1(x1, x2) = −λ2
  ∂u/∂x2(x1, x2) = λ2 ∂g2/∂x2(x1, x2) = 0
  g2(x1, x2) = 0, λ2 ≥ 0,

but this is incompatible with monotonicity of u, so we discard this case. Similarly for the case in which only (3) binds, the case in which (2) and (3) both bind, and the case in which no constraints bind. Next, consider the case in which (1) and (2) bind, i.e., p1x1 + p2x2 = I, x1 = 0, x2 > 0. Note that Dg1(x) = (p1, p2) and Dg2(x) = (−1, 0) are linearly independent, so the constraint qualification is met. Since x2 > 0, complementary slackness implies λ3 = 0, so the first order conditions are

  ∂u/∂x1(x1, x2) = λ1 ∂g1/∂x1(x1, x2) + λ2 ∂g2/∂x1(x1, x2) = λ1p1 − λ2
  ∂u/∂x2(x1, x2) = λ1 ∂g1/∂x2(x1, x2) + λ2 ∂g2/∂x2(x1, x2) = λ1p2
  g1(x1, x2) = I, g2(x1, x2) = 0, λ1, λ2 ≥ 0.
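The case analysis can be mirrored numerically. In the sketch below (assumed utility u = √x1 + √x2 and parameter values; scipy), the solver finds an optimum with both goods positive and only the budget binding, and the budget multiplier recovered from either first order condition agrees, illustrating complementary slackness:

    import numpy as np
    from scipy.optimize import minimize

    p1, p2, I = 2.0, 3.0, 10.0
    u = lambda x: np.sqrt(x[0]) + np.sqrt(x[1])     # assumed monotone, concave utility

    res = minimize(lambda x: -u(x), x0=[1.0, 1.0], method='SLSQP',
                   bounds=[(0, None), (0, None)],   # non-negativity constraints
                   constraints={'type': 'ineq',     # budget: I - p1*x1 - p2*x2 >= 0
                                'fun': lambda x: I - p1*x[0] - p2*x[1]})
    x = res.x                                       # approximately (3.0, 1.333)
    lam_a = 0.5/np.sqrt(x[0])/p1                    # budget multiplier from the x1 FOC
    lam_b = 0.5/np.sqrt(x[1])/p2                    # ... and from the x2 FOC
    print(x, lam_a, lam_b)                          # multipliers agree; x1, x2 > 0, so
                                                    # lambda2 = lambda3 = 0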
so all such bundles are optimal. If the razor's-edge condition on marginal rates of substitution and relative prices does not hold, then either a/b > p1/p2 or the opposite obtains, and the only possible optimal bundles are the corner solutions. In the former case,

  u(I/p1, 0) = aI/p1 > bI/p2 = u(0, I/p2),

so the consumer optimally spends all of his money on good 1, and in the remaining case he spends everything on good 2.
6.2 Concave Programming
For each binding constraint j, we have

  Dt gj(x) = lim_{λ↓0} [gj(x + λt) − gj(x)]/λ ≤ 0,

and of course, for each slack constraint, we have λj = 0. Combining these observations, we conclude

  Dt f(x) = Σ_{j=1}^m λj Dt gj(x) ≤ 0.
Now suppose, in order to derive a contradiction, that $f(y) > f(x)$. Then by the mean value theorem there exists $\alpha \in (0, 1]$ such that
\[ D_t f(z(\alpha)) = Df(z(\alpha)) \cdot t > 0, \]
and by quasi-concavity of f, we have $f(z(\alpha)) \geq f(x)$. See Figure 14 for a visual depiction. By continuity of the dot product, there exists $\varepsilon > 0$ sufficiently small that $Df(z(\alpha)) \cdot (t - \varepsilon Df(x)) > 0$. Letting $t' = \frac{1}{\|t - \varepsilon Df(x)\|}(t - \varepsilon Df(x))$ point in the direction of the perturbed vector $t - \varepsilon Df(x)$, it follows that the derivative of f at $z(\alpha)$ in this direction is positive, i.e., $D_{t'} f(z(\alpha)) > 0$. This means that for sufficiently small $\delta > 0$, we can define $w = z(\alpha) + \delta t'$ such that $f(w) > f(z(\alpha)) \geq f(x)$. Given $\beta \in (0, 1]$, define
\[ v(\beta) = x + \beta(w - x) = (1 - \beta)x + \beta w, \]
and let $s = \frac{1}{\|w - x\|}(w - x)$ denote the direction from x toward w. Since $f(w) > f(x)$, quasi-concavity of f implies $f(v(\beta)) \geq f(x)$ for all $\beta \in (0, 1]$, and therefore
\[ D_s f(x) = \lim_{\beta \downarrow 0} \frac{f(v(\beta)) - f(x)}{\beta \|w - x\|} \geq 0. \tag{10} \]
But writing this directional derivative as a dot product and decomposing $w - x = \alpha \|y - x\| \, t + \delta t'$, we have
\[ D_s f(x) = \frac{\alpha \|y - x\| \, Df(x) \cdot t}{\|w - x\|} + \frac{\delta \, Df(x) \cdot (t - \varepsilon Df(x))}{\|w - x\| \, \|t - \varepsilon Df(x)\|}, \]
and since $Df(x) \cdot t = D_t f(x) \leq 0$ and $Df(x) \cdot (t - \varepsilon Df(x)) = D_t f(x) - \varepsilon \|Df(x)\|^2 < 0$, this contradicts (10).
[Figure 14: the construction in the proof: the segment from x to y with intermediate point $z(\alpha)$, the directions t and $t'$, the perturbed point w, and a level set of f.]
In words, given $x^*$, $\lambda^*$ minimizes $f(x^*) + \sum_{j=1}^{m} \lambda_j (c_j - g_j(x^*))$ over $\lambda \geq 0$; and given $\lambda^*$, $x^*$ maximizes $f(x) + \sum_{j=1}^{m} \lambda_j^* (c_j - g_j(x))$. Note that the maximization problem over x is unconstrained, but if $(x^*, \lambda^*)$ is a saddlepoint, then $x^*$ will indeed satisfy $g_j(x^*) \leq c_j$ for each j; indeed, if $c_j - g_j(x^*) < 0$, then the term $\lambda_j (c_j - g_j(x^*))$ could be made arbitrarily negative by choice of arbitrarily large $\lambda_j$, so $(x^*, \lambda^*)$ could not be a saddlepoint.
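In symbols, the saddlepoint property being described is the standard one: with the Lagrangian $L(x, \lambda) = f(x) + \sum_{j=1}^{m} \lambda_j (c_j - g_j(x))$, the pair $(x^*, \lambda^*)$ with $\lambda^* \geq 0$ is a saddlepoint if
\[ L(x, \lambda^*) \;\leq\; L(x^*, \lambda^*) \;\leq\; L(x^*, \lambda) \quad \text{for all } x \in \mathbb{R}^n \text{ and all } \lambda \geq 0. \]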
Theorem 6.4 Let $f : \mathbb{R}^n \to \mathbb{R}$ be concave, let $g_1 : \mathbb{R}^n \to \mathbb{R}, \ldots, g_m : \mathbb{R}^n \to \mathbb{R}$ be convex, and let $x^* \in \mathbb{R}^n$. If there exist $\lambda_1^*, \ldots, \lambda_m^* \in \mathbb{R}_+$ such that $(x^*, \lambda^*)$ is a saddlepoint of the Lagrangian, then $x^*$ is a global constrained maximizer of f subject to $g_1(x) \leq c_1, \ldots, g_m(x) \leq c_m$. Conversely, assume there is some $\bar{x} \in \mathbb{R}^n$ such that $g_1(\bar{x}) < c_1, \ldots, g_m(\bar{x}) < c_m$. If $x^*$ is a constrained local maximizer of f subject to $g_1(x) \leq c_1, \ldots, g_m(x) \leq c_m$, then there exist $\lambda_1^*, \ldots, \lambda_m^* \in \mathbb{R}_+$ such that $(x^*, \lambda^*)$ is a saddlepoint of the Lagrangian. Furthermore, if $f, g_1, \ldots, g_m$ are differentiable at $x^*$, then the first order condition (7)-(9) holds.
The condition $g_1(\bar{x}) < c_1, \ldots, g_m(\bar{x}) < c_m$ is called Slater's condition. To gain intuition for the saddlepoint theorem and the need for Slater's condition, consider Figure 15. Here, we consider maximizing a function of any number of variables, but to illustrate the problem in a two-dimensional graph, we assume there is a single inequality constraint, $g(x) \leq c$. On the horizontal axis, we graph values of $f(x)$ as x varies over $\mathbb{R}^n$, and on the vertical axis, we graph $c - g(x)$ as x varies over the Euclidean space. When f is concave and g is convex (so $c - g(x)$ is concave), you can check that the set $\{(f(x), c - g(x)) \mid x \in \mathbb{R}^n\}$, which is shaded in the figure, is convex. The values $(f(x), c - g(x))$ corresponding to vectors x satisfying the constraint $g(x) \leq c$ lie above the horizontal axis, the darker shaded regions in the figure. The ordered pairs $(f(x^*), c - g(x^*))$ corresponding to solutions of the constrained maximization problem are indicated by the black dots.
Consider the problem of minimizing $f(x^*) + \lambda(c - g(x^*))$ with respect to $\lambda \geq 0$, holding $x^*$ fixed. This simply means that at a saddlepoint, (i) if $c - g(x^*) > 0$, then $\lambda = 0$, and (ii) if $c - g(x^*) = 0$, then $\lambda$ can be any non-negative number. Figure 15 depicts the first possibility in Panel (a) and the second possibility in Panels (b) and (c). Now consider the problem of maximizing $f(x) + \lambda(c - g(x))$ with respect to x, holding $\lambda$ fixed. Let's write the objective function as a dot product: $(1, \lambda) \cdot (f(x), c - g(x))$. Viewed this way, we can understand the problem as choosing the ordered pair $(f(x), c - g(x))$ in the shaded region that maximizes the linear function with gradient $(1, \lambda)$. This is depicted in Panels (a) and (b). The difference between the two panels is that in (a), the constraint is not binding at the solution to the optimization problem (so $Df(x^*) = \lambda Dg(x^*) = 0$), while in (b) it is binding (so $\lambda$ may be positive).
The difference between Panels (b) and (c) is that Slater's condition is not satisfied in the latter: there is no x such that $g(x) < c$; graphically, the shaded region does not contain any points above the horizontal axis. The pair $(f(x^*), c - g(x^*))$ corresponding to the solution of the maximization problem is indicated by the black dot; we then must choose $\lambda$ such that $(f(x^*), c - g(x^*))$ maximizes the linear function with gradient $(1, \lambda)$. The difficulty is that for any finite $\lambda$, the pair $(f(x^*), c - g(x^*))$ does not maximize the linear function; instead, the maximizing pair will correspond to a vector x that violates the constraint, i.e., $c - g(x) < 0$. To make $(f(x^*), c - g(x^*))$ the maximizing pair, the gradient of the linear function would have to point straight up, which would correspond to something like an infinite $\lambda$ (whatever that would mean). In other words, if Slater's condition is not satisfied, then there may be no way to choose a multiplier to solve the saddlepoint problem.
Example For a formal example demonstrating the need for Slater's condition, let $n = 1$, $f(x) = x$, $m = 1$, $c_1 = 0$, and $g_1(x) = x^2$. The only point in $\mathbb{R}$ satisfying $g_1(x) \leq 0$ is $x = 0$, so this is trivially the constrained maximizer of f. But $Df(0) = 1$ and $Dg_1(0) = 0$, so there is no $\lambda \geq 0$ such that $Df(0) = \lambda Dg_1(0)$.
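The failure can also be computed directly at the level of the Lagrangian. For this example,
\[ L(x, \lambda) = x + \lambda(0 - x^2) = x - \lambda x^2, \]
and for any $\lambda > 0$ the unconstrained maximum over x is attained at $x = 1/(2\lambda)$ with value $1/(4\lambda) > 0 = L(0, \lambda)$, while for $\lambda = 0$ the Lagrangian is unbounded above. Either way, $x^* = 0$ never maximizes $L(\cdot, \lambda)$, so no saddlepoint exists.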
[Figure 15: saddlepoint geometry. Panel (a): the constraint is slack at the solution, $\lambda = 0$, and the maximizing gradient is $(1, 0)$. Panel (b): the constraint binds, and the gradient $(1, \lambda)$ with $\lambda > 0$ supports $(f(x^*), c - g(x^*))$. Panel (c): Slater's condition fails, and no gradient $(1, \lambda)$ with finite $\lambda$ makes $(f(x^*), c - g(x^*))$ the maximizing pair.]
6.3 Second Order Analysis
The second order analysis parallels that for multiple equality constraints, modified to
accommodate the different first order conditions. Again, the necessary condition is
that the second directional derivative of the Lagrangian be non-positive in a restricted
set of directions. A difference is that now the inequality must hold only for directions
orthogonal to the gradients of binding constraints.
Theorem 6.5 Let $f : \mathbb{R}^n \to \mathbb{R}$, $g_1 : \mathbb{R}^n \to \mathbb{R}, \ldots, g_m : \mathbb{R}^n \to \mathbb{R}$ be twice continuously differentiable in an open neighborhood around x. Suppose the first k constraints are the binding ones at x, and assume the gradients of the binding constraints, $\{Dg_1(x), \ldots, Dg_k(x)\}$, are linearly independent. Assume x is a constrained local maximizer of f subject to $g_1(x) \leq c_1, \ldots, g_m(x) \leq c_m$, and let $\lambda_1, \ldots, \lambda_m \in \mathbb{R}_+$ satisfy the first order conditions (7)-(9). Consider any direction t such that $Dg_j(x) \cdot t = 0$ for all binding constraints $j = 1, \ldots, k$. Then
\[ t \cdot \left[ D^2 f(x) - \sum_{j=1}^{m} \lambda_j D^2 g_j(x) \right] t \leq 0. \]
Note that the range of directions for which the above inequality must hold is the set of directions that are orthogonal to the gradients of the binding constraints. One might think it should hold as well for directions t such that $Dg_j(x) \cdot t \leq 0$ for all $j = 1, \ldots, m$, since any direction with $Dg_j(x) \cdot t < 0$ is feasible for that constraint. In fact, the stronger version of the condition (using the larger range of directions) is not necessary.
Example Let $n = 1$, $f(x) = e^x$, $m = 1$, $g_1(x) = x$, and $c_1 = 0$. Clearly, $x = 0$ maximizes f subject to $g_1(x) \leq 0$, and the first order condition $Df(0) = \lambda_1 Dg_1(0)$ holds with $\lambda_1 = 1$. But $D^2 f(0) - \lambda_1 D^2 g_1(0) = 1 > 0$, so the inequality fails in the feasible direction $t = -1$, which satisfies $Dg_1(0) \cdot t < 0$; the condition of Theorem 6.5 is vacuously satisfied, since no non-zero direction is orthogonal to $Dg_1(0) = 1$. Thus the stronger version of the second order necessary condition fails here even though $x = 0$ is a constrained maximizer.
Again, strengthening the weak inequality to strict gives us a second order condition
that, in combination with the first order condition, is sufficient for a constrained strict
local maximizer. In contrast to the analysis of necessary conditions, the next result
does not rely on the constraint qualification.
Theorem 6.6 Let $f : \mathbb{R}^n \to \mathbb{R}$, $g_1 : \mathbb{R}^n \to \mathbb{R}, \ldots, g_m : \mathbb{R}^n \to \mathbb{R}$ be twice continuously differentiable in an open neighborhood around x. Assume x satisfies the constraints $g_1(x) \leq c_1, \ldots, g_m(x) \leq c_m$ and the first order condition with multipliers $\lambda_1, \ldots, \lambda_m \in \mathbb{R}_+$ satisfying (7)-(9). Assume that for all directions t with $Dg_j(x) \cdot t \leq 0$ for all binding constraints $j = 1, \ldots, k$, we have
\[ t \cdot \left[ D^2 f(x) - \sum_{j=1}^{m} \lambda_j D^2 g_j(x) \right] t < 0. \tag{11} \]
Then x is a strict constrained local maximizer of f subject to $g_1(x) \leq c_1, \ldots, g_m(x) \leq c_m$.
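Condition (11) is straightforward to check numerically. A minimal sketch, for a hypothetical problem of my own choosing (maximize $f(x) = 2x_1 - x_1^2 - x_2^2$ subject to $g_1(x) = x_1 \leq 0$, which has the binding solution $x = (0, 0)$ with $\lambda_1 = 2$):

```python
import numpy as np

# Hypothetical problem: max f(x) = 2*x1 - x1^2 - x2^2  s.t.  g(x) = x1 <= 0.
# FOC at x = (0,0): Df = (2, 0) = lam * Dg = lam * (1, 0), so lam = 2 >= 0.
lam = 2.0
Hf = np.array([[-2.0, 0.0], [0.0, -2.0]])  # D^2 f (constant here)
Hg = np.zeros((2, 2))                      # D^2 g = 0 (g is linear)
HL = Hf - lam * Hg                         # Hessian term in condition (11)
Dg = np.array([1.0, 0.0])

rng = np.random.default_rng(0)
for _ in range(1000):
    t = rng.standard_normal(2)
    t /= np.linalg.norm(t)
    if Dg @ t <= 0:              # direction allowed by the binding constraint
        assert t @ HL @ t < 0    # strict second order condition (11)
print("condition (11) holds on all sampled directions")
```

In this example the quadratic form is negative in every direction, so Theorem 6.6 confirms that $(0, 0)$ is a strict constrained local maximizer.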
Theorem 6.7 Let I be an open interval, let $f : \mathbb{R}^n \times I \to \mathbb{R}$ and $g_1 : \mathbb{R}^n \times I \to \mathbb{R}, \ldots, g_m : \mathbb{R}^n \times I \to \mathbb{R}$ be twice continuously differentiable, and let $\theta^* \in I$. Assume $x^*$ and multipliers $\lambda_1^*, \ldots, \lambda_m^* \in \mathbb{R}_+$ satisfy, with strict complementary slackness,
\[ D_x f(x^*, \theta^*) = \sum_{j=1}^{m} \lambda_j^* D_x g_j(x^*, \theta^*) \]
\[ \lambda_j^* (c_j - g_j(x^*, \theta^*)) = 0, \quad j = 1, \ldots, m \]
\[ \lambda_j^* \geq 0, \quad j = 1, \ldots, m, \]
and assume that for all directions t with $D_x g_j(x^*, \theta^*) \cdot t \leq 0$ for all binding constraints j, we have
\[ t \cdot \left[ D_x^2 f(x^*, \theta^*) - \sum_{j=1}^{m} \lambda_j^* D_x^2 g_j(x^*, \theta^*) \right] t < 0. \]
Then there are an open set $Y \subseteq \mathbb{R}^n$ with $x^* \in Y$, an open interval $J \subseteq \mathbb{R}$ with $\theta^* \in J$, and continuously differentiable mappings $\xi : J \to Y$, $\lambda_1 : J \to \mathbb{R}, \ldots, \lambda_m : J \to \mathbb{R}$ such that for all $\theta \in J$, (i) $\xi(\theta)$ is the unique maximizer of $f(\cdot, \theta)$ subject to $g_1(x, \theta) \leq c_1, \ldots, g_m(x, \theta) \leq c_m$ belonging to Y, (ii) the unique multipliers for which $\xi(\theta)$ satisfies the first order necessary condition (7)-(9) with strict complementary slackness at $\theta$ are $\lambda_1(\theta), \ldots, \lambda_m(\theta)$, and (iii) $\xi(\theta)$ satisfies the second order sufficient condition (11) at $\theta$ with multipliers $\lambda_1(\theta), \ldots, \lambda_m(\theta)$.
Fortunately, the statement of the envelope theorem carries over virtually unchanged.
Theorem 6.8 Let I be an open interval, and let $f : \mathbb{R}^n \times I \to \mathbb{R}$ and $g_1 : \mathbb{R}^n \times I \to \mathbb{R}, \ldots, g_m : \mathbb{R}^n \times I \to \mathbb{R}$ be twice continuously differentiable in an open neighborhood of $(x^*, \theta^*)$. Let $\xi : I \to \mathbb{R}^n$ and $\lambda_1 : I \to \mathbb{R}, \ldots, \lambda_m : I \to \mathbb{R}$ be continuously differentiable mappings such that for all $\theta \in I$, $\xi(\theta)$ is a constrained local maximizer satisfying the first order condition (7)-(9) at $\theta$ with multipliers $\lambda_1(\theta), \ldots, \lambda_m(\theta)$. Let $x^* = \xi(\theta^*)$ and $\lambda_j^* = \lambda_j(\theta^*)$, $j = 1, \ldots, m$, and define the mapping $F : I \to \mathbb{R}$ by
\[ F(\theta) = f(\xi(\theta), \theta) \]
for all $\theta \in I$. Then F is continuously differentiable and
\[ DF(\theta^*) = \frac{\partial L}{\partial \theta}(x^*, \lambda^*, \theta^*). \]
Again, we can use the envelope theorem to characterize $\lambda_j$ as the marginal effect of increasing the value of the jth constraint; with inequality constraints, of course, this cannot diminish the maximized value of the objective, so the multipliers are non-negative.
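This shadow-price interpretation is easy to test numerically: solve the problem at nearby constraint values and compare the finite difference of maximized values with the multiplier. A minimal sketch, with an illustrative quadratic objective of my own choosing:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Illustrative problem: max f(x) = -(x - 2)^2  s.t.  g(x) = x <= c, with c = 1.
f = lambda x: -(x - 2.0) ** 2

def F(c):
    # The constrained maximum; for c < 2 the constraint binds and x* = c.
    res = minimize_scalar(lambda x: -f(x), bounds=(-10.0, c), method="bounded")
    return f(res.x)

c = 1.0
lam = 2.0 * (2.0 - c)              # analytic multiplier: Df(c) = lam * Dg(c)
h = 1e-5
dF_dc = (F(c + h) - F(c - h)) / (2 * h)
print(lam, dF_dc)                  # both approximately 2: relaxing the
                                   # constraint raises the value at rate lam
```

The finite difference matches the multiplier, and since relaxing an inequality constraint enlarges the feasible set, the rate is non-negative, as the text notes.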
We now return to the topic of characterizing Pareto optimal alternatives and explore
an alternative approach using the framework of constrained optimization. First, we
give a general characterization in terms of inequality constrained optimization. Second, we establish a necessary first order condition for Pareto optimality that adds a rank condition on the gradients of individual utilities to the assumptions of Theorem 3.9, deduces strictly positive coefficients, and provides an interpretation of the coefficients in terms of shadow prices of utilities. Finally, we establish that with quasi-concave utilities, the first order condition is actually sufficient for Pareto optimality as well.
This gives us a full characterization that, in comparison with Corollary 3.7, weakens the assumption of concavity to quasi-concavity but adds the rank condition on
gradients.
The next result is structure free, extending our earlier analysis by dropping all convexity, concavity, and differentiability conditions. It gives a full characterization: an alternative is Pareto optimal if and only if it solves n different maximization problems (one for each individual) subject to inequality constraints. The proof follows directly from definitions and is omitted.
Theorem 7.1 Let $x \in A$ be an alternative, and let $\bar{u}_i = u_i(x)$ for all i. Then x is Pareto optimal if and only if it solves
\[ \max_{y \in X} u_i(y) \quad \text{s.t.} \quad u_j(y) \geq \bar{u}_j, \; j = 1, \ldots, i-1, i+1, \ldots, n \]
for all i.
Note that the sufficiency direction of Theorem 7.1 uses the fact that the alternative x solves n constrained optimization problems, one for each individual. Figure 16 demonstrates that this feature is needed for the result: there, x maximizes $u_2(y)$ subject to $u_1(y) \geq \bar{u}_1$, but it is Pareto dominated by $x^*$. Obviously, $x^*$ is Pareto optimal, as it maximizes $u_1(y)$ subject to $u_2(y) \geq \bar{u}_2$ and it maximizes $u_2(y)$ subject to $u_1(y) \geq \bar{u}_1$.
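Theorem 7.1 also suggests a direct computational test for Pareto optimality: for each individual i, maximize $u_i$ while holding every other $u_j$ at least at $u_j(x)$, and declare x Pareto optimal only if none of these problems can improve on $u_i(x)$. A rough sketch under assumed utilities (a two-person division problem chosen purely for illustration):

```python
import numpy as np
from scipy.optimize import minimize

# Assumed example: alternatives are allocations in the simplex
# A = {x >= 0 : x1 + x2 <= 1}, with u1(x) = x1 and u2(x) = x2.
utils = [lambda x: x[0], lambda x: x[1]]
feasible = [{"type": "ineq", "fun": lambda x: 1.0 - x[0] - x[1]}]
bounds = [(0, None), (0, None)]

def is_pareto_optimal(x, tol=1e-6):
    ubar = [u(x) for u in utils]
    for i, ui in enumerate(utils):
        # Hold every j != i at its level ubar[j]; try to improve i.
        cons = feasible + [
            {"type": "ineq", "fun": (lambda u, lvl: lambda y: u(y) - lvl)(u, lvl)}
            for j, (u, lvl) in enumerate(zip(utils, ubar)) if j != i
        ]
        res = minimize(lambda y: -ui(y), x, method="SLSQP",
                       bounds=bounds, constraints=cons)
        if -res.fun > ubar[i] + tol:   # individual i can be made better off
            return False
    return True

print(is_pareto_optimal(np.array([0.5, 0.5])))  # True: on the frontier
print(is_pareto_optimal(np.array([0.3, 0.3])))  # False: dominated by (0.5, 0.5)
```

As in the figure, a point can pass one individual's problem while failing another's; the test requires all n problems to be solved by x.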
Of course, we can use our analysis of maximization subject to multiple inequality constraints to draw implications of Theorem 7.1. Consider a Pareto optimal alternative x: applying the Karush-Kuhn-Tucker theorem to each of the n problems in Theorem 7.1 yields, under a suitable constraint qualification, multipliers $\lambda^i_j \geq 0$, $j \neq i$, for each individual i such that
\[ Du_i(x) = -\sum_{j \neq i} \lambda^i_j Du_j(x); \tag{12} \]
this condition, together with a linear independence assumption on the gradients, is the content of Theorem 7.2.
[Figure 16: the utility possibility set for two individuals (utility for 1 on the horizontal axis, utility for 2 on the vertical axis), showing the dominated point $(u_1(x), u_2(x))$, the Pareto optimal point $(u_1(x^*), u_2(x^*))$, and the constraint level $\bar{u}_2$.]
Recall that the multiplier on a constraint has the interpretation of giving the rate of change of the maximized objective function as we increase the value of the constraint. In this context, the multiplier $\lambda^i_j$ has a special meaning: it is the rate at which we can increase i's utility by taking utility away from individual j. Put differently, it is the rate at which i's utility would decrease if we increase j's utility (holding all other individuals at the constraint). Thus, it is the shadow price of utility for j in terms of utility for i. Geometrically, viewed in $\mathbb{R}^d$, the gradient $Du_i(x)$ of individual i lies in the $(n-1)$-dimensional subspace spanned by the other individuals' gradients.
Now recall the mapping $u : X \to \mathbb{R}^n$ defined by $u(x) = (u_1(x), \ldots, u_n(x))$. Then $u(X)$ is the set of possible utility vectors, and the linear independence assumption in Theorem 7.2 is equivalent to the requirement that the derivative of u at x, which is the matrix
\[ \begin{pmatrix} \frac{\partial u_1}{\partial x_1}(x) & \cdots & \frac{\partial u_1}{\partial x_d}(x) \\ \vdots & \ddots & \vdots \\ \frac{\partial u_n}{\partial x_1}(x) & \cdots & \frac{\partial u_n}{\partial x_d}(x) \end{pmatrix}, \]
has rank $n - 1$. This means that there is a uniquely defined hyperplane that is tangent
to u(X) at the point u(x). When there are just two individuals, this implies there is
a unique tangent line at (u1 (x), u2 (x)), as in Figure 16. See Figure 7 for the case of
three individuals. This hyperplane has a normal vector p that is uniquely defined up
to a non-zero scalar. The first order condition (12) from Theorem 7.2 can be written
in matrix terms as
\[ \begin{pmatrix} \lambda^i_1 & \cdots & \lambda^i_{i-1} & 1 & \lambda^i_{i+1} & \cdots & \lambda^i_n \end{pmatrix} \begin{pmatrix} \frac{\partial u_1}{\partial x_1}(x) & \cdots & \frac{\partial u_1}{\partial x_d}(x) \\ \vdots & \ddots & \vdots \\ \frac{\partial u_n}{\partial x_1}(x) & \cdots & \frac{\partial u_n}{\partial x_d}(x) \end{pmatrix} = 0, \]
and we conclude that p is, up to a non-zero scalar, equal to the vector of multipliers (with a coefficient of one for i) for individual i's problem.
An implication of the above analysis is that the vectors $(\lambda^i_1, \ldots, \lambda^i_{i-1}, 1, \lambda^i_{i+1}, \ldots, \lambda^i_n)$ of multipliers corresponding to individuals $i = 1, \ldots, n$ are collinear. Indeed, they are each normal to the tangent hyperplane at $u(x)$, and the set of normal vectors is one-dimensional, so the claim follows. The claim can also be verified mechanically by multiplying both sides of
\[ Du_i(x) = -\sum_{j \neq i} \lambda^i_j Du_j(x) \]
by $1/\lambda^i_j$ and solving for $Du_j(x)$ to obtain
\[ Du_j(x) = -\frac{1}{\lambda^i_j} Du_i(x) - \sum_{k \neq i,j} \frac{\lambda^i_k}{\lambda^i_j} Du_k(x), \]
so that the multipliers in individual j's problem are $\lambda^j_i = 1/\lambda^i_j$ and $\lambda^j_k = \lambda^i_k / \lambda^i_j$ for $k \neq i, j$, as claimed.
Three interesting conclusions follow from these observations. First, the multipliers from Theorem 7.2 are actually strictly positive. Second, the utility shadow prices for any two individuals are reciprocal: we can transfer utility from j to i at rate $\lambda^i_j$, and we can transfer utility from i to j at rate $\lambda^j_i = 1/\lambda^i_j$. Third, the relative prices of any two individuals are independent of the problem we consider. To see this, consider any two individuals h and i, and let j and k be any two individuals. Then from the analysis in the preceding paragraph, we have
\[ \frac{\lambda^i_j}{\lambda^i_k} = \frac{\lambda^h_j / \lambda^h_i}{\lambda^h_k / \lambda^h_i} = \frac{\lambda^h_j}{\lambda^h_k}. \]
If, for example, it is twice as expensive, in terms of i's utility, to increase j's utility as it is to increase k's utility, then it is also twice as expensive in terms of h's utility.
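A concrete instance, with made-up rates: suppose $\lambda^i_j = 2$ and $\lambda^i_k = 1$, so that for individual i, raising j's utility is twice as expensive as raising k's. Reciprocity then gives $\lambda^j_i = 1/\lambda^i_j = 1/2$, and for any other individual h,
\[ \frac{\lambda^h_j}{\lambda^h_k} = \frac{\lambda^i_j}{\lambda^i_k} = 2, \]
so j's utility is twice as expensive as k's from every individual's point of view.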
A final, and important, geometric insight stems from the sign of the multipliers: they are all non-negative, and at least one is strictly positive. Thus, the tangent hyperplane to $u(X)$ has a normal vector with all non-negative coordinates, at least one positive. When there are just two individuals, this means that the utility frontier is sloping downward at $(u_1(x), u_2(x))$, as in Figure 16, and the idea extends to a general number of individuals, as in Figure 7. We conclude that at a Pareto optimal alternative for which the constraint qualification is satisfied, the boundary of $u(X)$ is sloping downward, in a precise sense.
This is only a necessary condition, as Figure 17 illustrates: the boundary of $u(X)$ is downward sloping at $(u_1(x), u_2(x))$, but x is Pareto dominated by y. Though conceptually possible, the anomaly depicted in the figure is precluded under the typical assumption of quasi-concave utility. Recall that, by Theorem 6.2, the first order condition is sufficient for a maximizer when the objective and constraints are quasi-concave. With Theorem 7.1, this yields the following result.
Theorem 7.3 Assume $A \subseteq \mathbb{R}^d$ is convex, let $x \in A$ be an alternative, and assume each $u_i : \mathbb{R}^d \to \mathbb{R}$ is continuously differentiable and quasi-concave. Suppose that for each i, $Du_i(x) \neq 0$ and there exist multipliers $\lambda^i_j \geq 0$, $j \neq i$, such that
\[ Du_i(x) = -\sum_{j \neq i} \lambda^i_j Du_j(x). \]
Then x is Pareto optimal.
Thus, under quite general conditions, the first order condition (12) is necessary and
sufficient for Pareto optimality.
[Figure 17: a utility possibility set whose boundary is downward sloping at $(u_1(x), u_2(x))$, even though x is Pareto dominated by y.]
Equivalently, the first order condition can be written symmetrically: there exist non-negative multipliers $\mu_1, \ldots, \mu_n$, not all zero, such that
\[ \sum_{i=1}^{n} \mu_i Du_i(x) = 0. \]
8 Mixed Constraints
The goal of this section is simply to draw together results for equality constrained
and inequality constrained maximization into a general framework. Conceptually,
nothing new is added.
Let $f : \mathbb{R}^n \to \mathbb{R}$, $g_1 : \mathbb{R}^n \to \mathbb{R}, \ldots, g_\ell : \mathbb{R}^n \to \mathbb{R}$, and $h_1 : \mathbb{R}^n \to \mathbb{R}, \ldots, h_m : \mathbb{R}^n \to \mathbb{R}$.
\[ Df(x) = \sum_{j=1}^{\ell} \lambda_j Dg_j(x) + \sum_{j=1}^{m} \mu_j Dh_j(x) \tag{13} \]
\[ \mu_j (d_j - h_j(x)) = 0, \quad j = 1, \ldots, m \tag{14} \]
\[ \mu_j \geq 0, \quad j = 1, \ldots, m. \tag{15} \]
The Lagrangian now takes the form
\[ L(x, \lambda, \mu) = f(x) + \sum_{j=1}^{\ell} \lambda_j (c_j - g_j(x)) + \sum_{j=1}^{m} \mu_j (d_j - h_j(x)), \]
and condition (13) from Theorem 8.1 is then the requirement that x is a critical point of the Lagrangian given multipliers $\lambda_1, \ldots, \lambda_\ell, \mu_1, \ldots, \mu_m$.
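As a sanity check on the mixed first order condition (13)-(15), one can solve a small problem numerically and recover the multipliers from the gradients by least squares. A sketch with an assumed objective and constraints of my own choosing:

```python
import numpy as np
from scipy.optimize import minimize

# Assumed problem: max f(x) = -(x1 - 2)^2 - (x2 - 2)^2
#   s.t.  g(x) = x1 + x2 = 2    (equality, multiplier lam)
#         h(x) = x1      <= 0.5 (inequality, multiplier mu)
f = lambda x: -((x[0] - 2) ** 2) - ((x[1] - 2) ** 2)

res = minimize(lambda x: -f(x), np.array([0.0, 0.0]), method="SLSQP",
               constraints=[{"type": "eq",   "fun": lambda x: x[0] + x[1] - 2.0},
                            {"type": "ineq", "fun": lambda x: 0.5 - x[0]}])
x = res.x                                  # approximately (0.5, 1.5)

# Recover (lam, mu) from condition (13): Df(x) = lam*Dg(x) + mu*Dh(x).
Df = np.array([-2 * (x[0] - 2), -2 * (x[1] - 2)])
Dg, Dh = np.array([1.0, 1.0]), np.array([1.0, 0.0])
lam, mu = np.linalg.lstsq(np.column_stack([Dg, Dh]), Df, rcond=None)[0]
print(x, lam, mu)   # mu is approximately 2 >= 0, as (15) requires
```

Here the inequality constraint binds, so (14) holds with $h(x) = 0.5$, and the recovered $\mu$ is non-negative, consistent with (15); the equality multiplier $\lambda$ is unrestricted in sign.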
Our results for quasi-concave objective functions with non-zero gradient go through
in the general setting, now with the assumption that all equality constraints are linear
and all inequality constraints are quasi-convex. Again, we rely on Theorem 6.2 for
the proof.
\[ t \cdot \left[ D^2 f(x) - \sum_{j=1}^{\ell} \lambda_j D^2 g_j(x) - \sum_{j=1}^{m} \mu_j D^2 h_j(x) \right] t \leq 0. \]
Again, strengthening the weak inequality to strict yields the second order sufficient
condition for a local maximizer. See Theorem 4 of Fiacco and McCormick (1968).
\[ t \cdot \left[ D^2 f(x) - \sum_{j=1}^{\ell} \lambda_j D^2 g_j(x) - \sum_{j=1}^{m} \mu_j D^2 h_j(x) \right] t < 0. \tag{16} \]
\[ D_x f(x^*, \theta^*) = \sum_{j=1}^{\ell} \lambda_j^* D_x g_j(x^*, \theta^*) + \sum_{j=1}^{m} \mu_j^* D_x h_j(x^*, \theta^*) \]
\[ \mu_j^* (d_j - h_j(x^*, \theta^*)) = 0, \quad j = 1, \ldots, m \]
\[ \mu_j^* \geq 0, \quad j = 1, \ldots, m, \]
and, for the relevant directions t,
\[ t \cdot \left[ D_x^2 f(x^*, \theta^*) - \sum_{j=1}^{\ell} \lambda_j^* D_x^2 g_j(x^*, \theta^*) - \sum_{j=1}^{m} \mu_j^* D_x^2 h_j(x^*, \theta^*) \right] t < 0. \]
Then there are an open set $Y \subseteq \mathbb{R}^n$ with $x^* \in Y$, an open interval $J \subseteq \mathbb{R}$ with $\theta^* \in J$, and continuously differentiable mappings $\xi : J \to Y$, $\lambda_1 : J \to \mathbb{R}, \ldots, \lambda_\ell : J \to \mathbb{R}$, $\mu_1 : J \to \mathbb{R}, \ldots, \mu_m : J \to \mathbb{R}$ with the properties analogous to those in Theorem 6.7, and the envelope theorem again takes the same form:
\[ DF(\theta^*) = \frac{\partial L}{\partial \theta}(x^*, \lambda^*, \mu^*, \theta^*). \]
Finally, we can again use the envelope theorem to characterize $\lambda_j$ as the marginal effect of increasing the value of the jth equality constraint, and $\mu_j$ as the marginal effect of increasing the value of the jth inequality constraint, which of course cannot reduce the maximized value of the objective.
References
[1] A. Fiacco and Y. Ishizuka (1990) Sensitivity and Stability Analysis for Nonlinear Programming, Annals of Operations Research, 27: 215-236.
[2] A. Fiacco and G. McCormick (1968) Nonlinear Programming: Sequential Unconstrained Minimization Techniques, McLean, VA: Research Analysis Corporation.
[3] D. Gale (1960) The Theory of Linear Economic Models, Chicago, IL: University of Chicago Press.
[4] C. Simon and L. Blume (1994) Mathematics for Economists, New York, NY: Norton.
[5] R. Sundaram (1996) A First Course in Optimization Theory, New York, NY: Cambridge University Press.