Nonlinear Programming Concepts PDF

Nonlinear Programming:
Concepts, Algorithms and

Applications
L. T. Biegler
Chemical Engineering Department
Carnegie Mellon University
Pittsburgh, PA
Nonlinear Programming and Process Optimization

Introduction
Unconstrained Optimization
Algorithms
Newton Methods
Quasi-Newton Methods
Constrained Optimization
Karush Kuhn-Tucker Conditions

Special Classes of Optimization Problems
Reduced Gradient Methods (GRG2, CONOPT, MINOS)
Successive Quadratic Programming (SQP)
Interior Point Methods
Process Optimization
Black Box Optimization

Modular Flowsheet Optimization Infeasible Path
The Role of Exact Derivatives
Large-Scale Nonlinear Programming
Data Reconciliation
Real-time Process Optimization
Further Applications
Sensitivity Analysis for NLP Solutions

Multiperiod Optimization Problems
Summary and Conclusions

2
Introduction
Optimization: given a system or process, find the best solution to
this process within constraints.
Objective Function: indicator of "goodness" of solution, e.g., cost,
yield, profit, etc.
Decision Variables: variables that influence process behavior and
can be adjusted for optimization.
In many cases, this task is done by trial and error (through case
study). Here, we are interested in a systematic approach to this
task - and to make this task as efficient as possible.
Some related areas:

- Math programming
- Operations Research
Currently - Over 30 journals devoted to optimization with roughly
200 papers/month - a fast moving field!
3
Optimization Viewpoints
Mathematician - characterization of theoretical properties
of optimization, convergence, existence, local
convergence rates.
Numerical Analyst - implementation of optimization
method for efficient and "practical" use. Concerned with
ease of computations, numerical stability, performance.
Engineer - applies optimization method to real problems.
Concerned with reliability, robustness, efficiency,
diagnosis, and recovery from failure.
Optimization Literature
Engineering
1. Edgar, T.F., D.M. Himmelblau, and L. S. Lasdon, Optimization of Chemical
Processes, McGraw-Hill, 2001.
2. Papalambros, P. and D. Wilde, Principles of Optimal Design. Cambridge Press,
1988.
3. Reklaitis, G., A. Ravindran, and K. Ragsdell, Engineering Optimization, Wiley, 1983.
4. Biegler, L. T., I. E. Grossmann and A. Westerberg, Systematic Methods of Chemical
Process Design, Prentice Hall, 1997.
Numerical Analysis
1. Dennis, J.E. and R. Schnabel, Numerical Methods of Unconstrained Optimization,
Prentice-Hall, (1983), SIAM (1995)
2. Fletcher, R. Practical Methods of Optimization, Wiley, 1987.
3. Gill, P.E, W. Murray and M. Wright, Practical Optimization, Academic Press,
1981.
4. Nocedal, J. and S. Wright, Numerical Optimization, Springer, 1998
Motivation
Scope of optimization
Provide systematic framework for searching among a specified
space of alternatives to identify an optimal design, i.e., as a
decision-making tool
Premise
Conceptual formulation of optimal product and process design
corresponds to a mathematical programming problem
min f ( x , y )
st h ( x , y ) = 0
g ( x, y ) 0
x R n y {0, 1}ny
6
Optimization in Design, Operations and Control

MILP
MINLP
Global
LP,QP
NLP
SA/GA
HENS
MENS
Separations
x
x
Reactors
Equipment
Design
Flowsheeting
Scheduling
Supply Chain
Real-time
optimization
Linear MPC
Nonlinear
MPC
Hybrid
x
x
x
x
Unconstrained Multivariable Optimization

Problem:
Min
f(x)
(n variables)
Equivalent to: Max -f(x), x Rn

Nonsmooth Functions
- Direct Search Methods
- Statistical/Random Methods
Smooth Functions
- 1st Order Methods
- Newton Type Methods
- Conjugate Gradients
Example: Optimal Vessel Dimensions

What is the optimal L/D ratio for a cylindrical vessel?
Constrained Problem
(1)
D2
Min CT
+ CS DL = cost
2
s.t.
D 2 L
V 4
= 0
Convert to Unconstrained (Eliminate L)
D
Min CT
2
+ CS
4V
= cost
D
d(cost)
4VC s
= CT D = 0
2
dD
D
4V CS
D =
CT
1/ 3
(2)
2/3
4V 1/3 C T
L =
CS
==> L/D = CT/CS

Note:
What if L cannot be eliminated in (1) explicitly? (strange shape)
What if D cannot be extracted from (2)?
(cost correlation implicit)
Two Dimensional Contours of F(x)

Convex Function
Discontinuous
Nonconvex Function
Multimodal, Nonconvex
Nondifferentiable (convex)
10
Local vs. Global Solutions
Find a local minimum point x* for f(x) for feasible region defined by
constraint functions: f(x*) f(x) for all x satisfying the constraints in
some neighborhood around x* (not for all x X)
Finding and verifying global solutions will not be considered here.
Requires a more expensive search (e.g. spatial branch and bound).
A local solution to the NLP is also a global solution under the
following sufficient conditions based on convexity.
f(x) is convex in domain X, if and only if it satisfies:
f( y + (1-) z) f(y) + (1-)f(z)
for any , 0 1, at all points y and z in X.
11
Linear Algebra - Background

Some Definitions
Scalars - Greek letters, , ,
Vectors - Roman Letters, lower case
Matrices - Roman Letters, upper case
Matrix Multiplication:
C = A B if A n x m, B m x p and C n x p, Cij = k Aik Bkj
Transpose - if A n x m,
interchange rows and columns --> AT m x n
Symmetric Matrix - A n x n (square matrix) and A = AT
Identity Matrix - I, square matrix with ones on diagonal
and zeroes elsewhere.
Determinant: "Inverse Volume" measure of a square matrix
det(A) = i (-1)i+j Aij Aij for any j, or
det(A) = j (-1)i+j Aij Aij for any i, where Aij is the determinant
of an order n-1 matrix with row i and column j removed.
det(I) = 1
Singular Matrix: det (A) = 0

12

Gradient Vector - (f(x))
f / x1
f /
x 2
f =
......
f / x n
Hessian Matrix (2f(x) - Symmetric)

2f
2f 2f
x12
x1 x 2
x1 x n
....
....
2 f(x) = ....
2f
2 f 2 f
xn x 1 xn x 2
x2n
Note:
2 f
xi x j
2 f
x j x i
13
Some Identities for Determinant

det(A B) = det(A) det(B);
det (A) = det(AT)
det(A) = n det(A); det(A) = i i(A)
Eigenvalues: det(A- I) = 0, Eigenvector: Av = v

Characteristic values and directions of a matrix.
For nonsymmetric matrices eigenvalues can be complex,
so we often use singular values, = (AT)1/2 0
Vector Norms
|| x ||p = {i |xi|p}1/p
(most common are p = 1, p = 2 (Euclidean) and p = (max norm = maxi|xi|))
Matrix Norms
||A|| = max ||A x||/||x|| over x (for p-norms)
||A||1 - max column sum of A, maxj (i |Aij|)
||A|| - maximum row sum of A, maxi (j |Aij|)
||A||2 = [max()] (spectral radius)
||A||F = [i j (Aij)2]1/2 (Frobenius norm)
() = ||A|| ||A-1|| (condition number) = max/min (using 2-norm)
14
Linear Algebra - Eigenvalues

Find v and where Avi = i vi, i = i,n
Note: Av - v = (A - I) v = 0 or det (A - I) = 0
For this relation is an eigenvalue and v is an eigenvector of A.
If A is symmetric, all i are real
i > 0, i = 1, n; A is positive definite
i < 0, i = 1, n; A is negative definite
i = 0, some i: A is singular
Quadratic Form can be expressed in Canonical Form (Eigenvalue/Eigenvector)
xTAx
AV = V
V - eigenvector matrix (n x n)
- eigenvalue (diagonal) matrix = diag(i)
If A is symmetric, all i are real and V can be chosen orthonormal (V-1 = VT).
Thus, A = V V-1 = V VT
For Quadratic Function: Q(x) = aTx + xTAx
Define:
z = VTx and Q(Vz) = (aTV) z + zT (VTAV)z

= (aTV) z + zT z
Minimum occurs at (if i > 0) x = -A-1a
or
x = Vz = -V(-1VTa)
15
Positive (or Negative) Curvature

Positive (or Negative) Definite Hessian
Both eigenvalues are strictly positive or negative
A is positive definite or negative definite
Stationary points are minima or maxima
16
Zero Curvature
Singular Hessian
One eigenvalue is zero, the other is strictly positive or negative
A is positive semidefinite or negative semidefinite
There is a ridge of stationary points (minima or maxima)
17
Indefinite Curvature
Indefinite Hessian
One eigenvalue is positive, the other is negative
Stationary point is a saddle point
A is indefinite
Note: these can also be viewed as two dimensional projections for higher dimensional problems
18
Eigenvalue Example
1
1 T 2
Min Q(x) = x + x
2 1
1
2
AV = V with A =
1
T
1
x
2
1
1/ 2 1/ 2
1 0
V AV = =
with V =
0 3
-1/ 2 1/ 2
T
All eigenvalues are positive

Minimum occurs at z* = -/-1VTa
(x x ) / 2
2
z =V x = 1
(x1 + x 2 ) / 2
0
z* =
2 /(3 2)
T
(x + x ) / 2
2
x = Vz = 1
(x1 + x 2 ) / 2
1/3
x* =
1/3
19
Comparison of Optimization Methods

1. Convergence Theory
Global Convergence - will it converge to a local optimum
(or stationary point) from a poor starting point?
Local Convergence Rate - how fast will it converge close to
the solution?
2. Benchmarks on Large Class of Test Problems
Representative Problem (Hughes, 1981)
Min f(x1, x2) = exp(-)
u = x1 - 0.8
v = x2 - (a1 + a2 u2 (1- u)1/2 - a3 u)
= -b1 + b2 u2 (1+u)1/2 + b3 u
= c1 v2 (1 - c2 v)/(1+ c3 u2)
a = [ 0.3, 0.6, 0.2]
b = [5, 26, 3]
c = [40, 1, 10]
x* = [0.7395, 0.3144]
f(x*) = 5.0893
20
Three Dimensional Surface and Curvature for Representative Test Problem
Regions where minimum

eigenvalue is less than:
[0, -10, -50, -100, -150, -200]
21
What conditions characterize an optimal solution?

Unconstrained Local Minimum
Necessary Conditions
f (x*) = 0
pT2f (x*) p 0 for pn
(positive semi-definite)
x2
x*

Sufficient Conditions
f (x*) = 0
pT2f (x*) p > 0 for pn
(positive definite)
Contours of f(x)
x1
For smooth functions, why are contours around optimum elliptical?

Taylor Series in n dimensions about x*:
1
f ( x) = f ( x*) + f ( x*)T ( x x*) + ( x x*)T 2 f ( x*)( x x*)
2
Since f(x*) = 0, f(x) is purely quadratic for x close to x*
22
Newtons Method
Taylor Series for f(x) about xk
Take derivative wrt x, set LHS 0
0 f(x) = f(xk) + 2f(xk) (x - xk)
(x - xk) d = - (2f(xk))-1 f(xk)
f(x) is convex (concave) if for all x n, 2f(x) is positive (negative) semidefinite
i.e. minj j 0 (maxj j 0)
Method can fail if:
- x0 far from optimum
- 2f is singular at any point
- f(x) is not smooth
Search direction, d, requires solution of linear equations.
Near solution:
2
k +1 - x * = K
k
x
x - x*
23
Basic Newton Algorithm - Line Search

0. Guess x0, Evaluate f(x0).
1. At xk, evaluate f(xk).
2. Evaluate Bk = 2f(xk) or an approximation.
3. Solve: Bk d = -f(xk)
If convergence error is less than tolerance:
e.g., ||f(xk) || and ||d|| STOP, else go to 4.
4. Find so that 0 < 1 and f(xk + d) < f(xk)
sufficiently (Each trial requires evaluation of f(x))
5. xk+1 = xk + d. Set k = k + 1 Go to 1.
24
Newtons Method - Convergence Path
Starting Points
[0.8, 0.2] needs steepest descent steps w/ line search up to O, takes 7 iterations to ||f(x*)|| 10-6
[0.35, 0.65] converges in four iterations with full steps to ||f(x*)|| 10-6
25
Newtons Method - Notes

Choice of Bk determines method.
- Steepest Descent: Bk = I
- Newton: Bk = 2f(x)
With suitable Bk, performance may be good enough if f(xk + d)
is sufficiently decreased (instead of minimized along line search
direction).
Trust region extensions to Newton's method provide very strong
global convergence properties and very reliable algorithms.
Local rate of convergence depends on choice of Bk.
Newton Quadratic Rate :
limk
Steepest descent Linear Rate : limk

Desired? Superlinear Rate :
limk
x k +1 x *
x x*
k
x k +1 x *
x x*
k
x k +1 x *
x x*
k
=K
<1
=0
26
Motivation:
Need Bk to be positive definite.
Avoid calculation of 2f.
Avoid solution of linear system for d = - (Bk)-1 f(xk)
Strategy: Define matrix updating formulas that give (Bk) symmetric, positive
definite and satisfy:
(Bk+1)(xk+1 - xk) = (fk+1 - fk) (Secant relation)
DFP Formula: (Davidon, Fletcher, Powell, 1958, 1964)
T
y
s
+
y
y
s
y
s
s
y
y
y
(
B )
(
B ) ( B )
k
=B +
T
T
T
ys
ys ys
k
k+1
( )( )
(B )
k+1 -1
k+1
=H
where:
T
k
y
y
ss
H
H
= Hk + T y Hk y
s y
T
s = xk+1- xk
y = f (xk+1) - f (xk)
27
BFGS Formula (Broyden, Fletcher, Goldfarb, Shanno, 1970-71)
k+1
(B )
k+1 1
= Hk+1 = Hk +
T
k
T k
s
yy
B
s
B
= Bk + T k
s y
sB s
(s -
y - H s) y s sT
(
H y)s + s (s - H y)
T
T
T
y s
y s y s
k
( )( )
Notes:
1) Both formulas are derived under similar assumptions and have
symmetry
2) Both have superlinear convergence and terminate in n steps on
quadratic functions. They are identical if is minimized.
3) BFGS is more stable and performs better than DFP, in general.
4) For n 100, these are the best methods for general purpose
problems if second derivatives are not available.
28
Quasi-Newton Method - BFGS

Convergence Path
Starting Point
[0.8, 0.2] starting from B0 = I, converges in 9 iterations to ||f(x*)|| 10-6
29
Sources For Unconstrained Software

Harwell (HSL)
IMSL
NAg - Unconstrained Optimization Codes
Netlib (www.netlib.org)
MINPACK
TOMS Algorithms, etc.
These sources contain various methods
Quasi-Newton
Gauss-Newton
Sparse Newton
Conjugate Gradient
30
Constrained Optimization
(Nonlinear Programming)
Problem:
s.t.
Minx f(x)
g(x) 0
h(x) = 0
where:
f(x) - scalar objective function
x - n vector of variables
g(x) - inequality constraints, m vector
h(x) - meq equality constraints.
Sufficient Condition for Unique Optimum
- f(x) must be convex, and
- feasible region must be convex,
i.e. g(x) are all convex
h(x) are all linear
Except in special cases, ther is no guarantee that a local optimum is global if
sufficient conditions are violated.
31
Example: Minimize Packing Dimensions
What is the smallest box for three round objects?

Variables: A, B, (x1, y1), (x2, y2), (x3, y3)
Fixed Parameters: R1, R2, R3
Objective: Minimize Perimeter = 2(A+B)
Constraints: Circles remain in box, cant overlap
Decisions: Sides of box, centers of circles.
2
3
A
1
B
x1, y1 R1
x 2, y 2 R 2
x 3, y 3 R 3
x1 B - R1, y1 A - R1
x2 B - R 2, y2 A - R 2
x3 B - R 3, y3 A - R 3
in box
x1, x2, x3, y1, y2, y3, A, B 0
(x - x )2 + (y - y )2 (R + R )2
1
2
1
2
1
2
2
2
2
(x1 - x3) + (y1 - y3) (R1 + R 3)
2
2
y 2 - y3) (R2 + R 3)2
(
)
+
x
x
(
2
3
no overlaps
32
Characterization of Constrained Optima

Min
Min
Linear Program
Linear Program
(Alternate Optima)
Min
Min
Min
Convex Objective Functions

Linear Constraints
Min
Min
Nonconvex Region
Multiple Optima
Min
Min
Min
Nonconvex Objective
Multiple Optima
33
What conditions characterize an optimal solution?

Necessary Conditions
f (x*) = 0
pT2f (x*) p 0 for pn
(positive semi-definite)

Sufficient Conditions
f (x*) = 0
pT2f (x*) p > 0 for pn
(positive definite)
34
Optimal solution for inequality constrained problem
Min f(x)
s.t. g(x) 0
Analogy: Ball rolling down valley pinned by fence
Note: Balance of forces (f, g1)
35
Optimal solution for general constrained problem
Problem: Min
s.t.
f(x)
g(x) 0
h(x) = 0
Analogy: Ball rolling on rail pinned by fences
Balance of forces: f, g1, h
36
Optimality conditions for local optimum

Necessary First Order Karush Kuhn - Tucker Conditions
L (x*, u, v) = f(x*) + g(x*) u + h(x*) v = 0

(Balance of Forces)
u 0 (Inequalities act in only one direction)
g (x*) 0, h (x*) = 0 (Feasibility)
uj gj(x*) = 0 (Complementarity: either gj(x*) = 0 or uj = 0)
u, v are "weights" for "forces," known as KKT multipliers, shadow
prices, dual variables
To guarantee that a local NLP solution satisfies KKT conditions, a constraint
qualification is required. E.g., the Linear Independence Constraint Qualification
(LICQ) requires active constraint gradients, [gA(x*) h(x*)], to be linearly
independent. Also, under LICQ, KKT multipliers are uniquely determined.
Necessary (Sufficient) Second Order Conditions

- Positive curvature in "constraint" directions.
- pT 2L (x*) p 0 (pT 2L (x*) p > 0)
where p are the constrained directions: gA(x*)Tp = 0, h(x*)Tp = 0
37
Single Variable Example of KKT Conditions

Min (x)2 s.t. -a x a, a > 0
x* = 0 is seen by inspection
f(x)
Lagrange function :
L(x, u) = x2 + u1(x-a) + u2(-a-x)
First Order KKT conditions:
L(x, u) = 2 x + u1 - u2 = 0
u1 (x-a) = 0
u2 (-a-x) = 0
-a x a
u1, u2 0
Consider three cases:

u1 > 0, u2 = 0
u1 = 0, u2 > 0
u1 = u2 = 0
-a
Upper bound is active, x = a, u1 = -2a, u2 = 0

Lower bound is active, x = -a, u2 = -2a, u1 = 0
Neither bound is active, u1 = 0, u2 = 0, x = 0
Second order conditions (x*, u1, u2 =0)

xxL (x*, u*) = 2
pTxxL (x*, u*) p = 2 (x)2 > 0
38
Single Variable Example

of KKT Conditions - Revisited
Min -(x)2 s.t. -a x a, a > 0
x* = a is seen by inspection
-a
Lagrange function :
L(x, u) = -x2 + u1(x-a) + u2(-a-x)
First Order KKT conditions:
L(x, u) = -2x + u1 - u2 = 0
u1 (x-a) = 0
u2 (-a-x) = 0
-a x a
u1, u2 0
Consider three cases:

u1 > 0, u2 = 0
u1 = 0, u2 > 0
u1 = u2 = 0
f(x)
Upper bound is active, x = a, u1 = 2a, u2 = 0

Lower bound is active, x = -a, u2 = 2a, u1 = 0
Neither bound is active, u1 = 0, u2 = 0, x = 0
Second order conditions (x*, u1, u2 =0)

xxL (x*, u*) = -2
pTxxL (x*, u*) p = -2(x)2 < 0
39
Interpretation of Second Order Conditions

For x = a or x = -a, we require the allowable direction to satisfy the
active constraints exactly. Here, any point along the allowable
direction, x* must remain at its bound.
For this problem, however, there are no nonzero allowable directions
that satisfy this condition. Consequently the solution x* is defined
entirely by the active constraint. The condition:
pT xxL (x*, u*, v*) p > 0
for all allowable directions, is vacuously satisfied - because there are
no allowable directions that satisfy gA(x*)T p = 0. Hence, sufficient
second order conditions are satisfied.
As we will see, sufficient second order conditions are satisfied by linear

programs as well.
40
Role of KKT Multipliers

-a
Also known as:

Shadow Prices
Dual Variables
Lagrange Multipliers
a a + a
f(x)
Suppose a in the constraint is increased to a + a

f(x*) = (a + a)2
and
[f(x*, a + a) - f(x*, a)]/a = 2a + a
df(x*)/da = 2a = u1
41
Special Cases of Nonlinear Programming

Linear Programming:
Min cTx
x2
s.t. Ax b
Cx = d, x 0
Functions are all convex global min.
Because of Linearity, can prove solution will
always lie at vertex of feasible region.
x1
Simplex Method
- Start at vertex
- Move to adjacent vertex that offers most improvement
- Continue until no further improvement
Notes:
1) LP has wide uses in planning, blending and scheduling
2) Canned programs widely available.
42
Linear Programming Example

Simplex Method
Min -2x1 - 3x2
s.t. 2x1 + x2 5
x1, x2 0
Now, define f = -2x1 - 3x2
Set x1, x2 = 0, x3 = 5 and form tableau

x1
x2
x3
f
2
1
1
0
2
3
0
1
Min
-2x1 - 3x2
s.t. 2x1 + x2 + x3 = 5
x1, x2, x3 0
(add slack variable)
f + 2x1 + 3x2 = 0
b
5
0
x1, x2 nonbasic
x3
basic
To decrease f, increase x2. How much? so x3 0

x1
x2
x3
f
b
2
1
1
0
5
-4
0
-3
1
-15
f can no longer be decreased! Optimal
Underlined terms are -(reduced gradients); nonbasic variables (x1, x3), basic variable x2
43
Quadratic Programming
Min
aTx + 1/2 xT B x
Axb
Cx=d
1) Can be solved using LP-like techniques:
(Wolfe, 1959)
Min
j (zj+ + zj-)
s.t.
a + Bx + ATu + CTv = z+ - zAx - b + s = 0
Cx - d = 0
s, z+, z- 0
{uj sj = 0}
with complicating conditions.
Problem:
2) If B is positive definite, QP solution is unique.

If B is pos. semidefinite, optimum value is unique.
3) Other methods for solving QPs (faster)
- Complementary Pivoting (Lemke)
- Range, Null Space methods (Gill, Murray).
44
Portfolio Planning Problem

Definitions:
xi - fraction or amount invested in security i
ri (t) - (1 + rate of return) for investment i in year t.
i - average r(t) over T years, i.e.
T
r (t)
1
i =
T
Max
t=1
i i
s.t.
=1
xi 0, etc.
Note: maximize average return, no accounting for risk.
45
Portfolio Planning Problem

Definition of Risk - fluctuation of ri(t) over investment (or past) time period.
To minimize risk, minimize variance about portfolio mean (risk averse).
Variance/Covariance Matrix, S
{ S} ij = 2ij
1
=
T
t =1
x T Sx
Max
s .t.
(r (t) - )(r (t) - )

T
xi = 1
ixi R
x i 0 , etc .
Example: 3 investments
j
1. IBM
1.3
2. GM
1.2
3. Gold
1.08
1
3
S = 1
2
-0.5 0.4
- 0.5
0.4
1
46
Portfolio Planning Problem - GAMS

SIMPLE PORTFOLIO INVESTMENT PROBLEM (MARKOWITZ)
4
5 OPTION LIMROW=0;
6 OPTION LIMXOL=0;
7
8 VARIABLES IBM, GM, GOLD, OBJQP, OBJLP;
9
10 EQUATIONS E1,E2,QP,LP;
11
12 LP.. OBJLP =E= 1.3*IBM + 1.2*GM + 1.08*GOLD;
13
14 QP.. OBJQP =E= 3*IBM**2 + 2*IBM*GM - IBM*GOLD
15 + 2*GM**2 - 0.8*GM*GOLD + GOLD**2;
16
17 E1..1.3*IBM + 1.2*GM + 1.08*GOLD =G= 1.15;
18
19 E2.. IBM + GM + GOLD =E= 1;
20
21 IBM.LO = 0.;
22 IBM.UP = 0.75;
23 GM.LO = 0.;
24 GM.UP = 0.75;
25 GOLD.LO = 0.;
26 GOLD.UP = 0.75;
27
28 MODEL PORTQP/QP,E1,E2/;
29
30 MODEL PORTLP/LP,E2/;
31
32 SOLVE PORTLP USING LP MAXIMIZING OBJLP;
33
34
SOLVE PORTQP USING NLP MINIMIZING OBJQP;
47

S O L VE S U M M A R Y
**** MODEL STATUS
**** OBJECTIVE VALUE
RESOURCE USAGE, LIMIT
1.270
ITERATION COUNT, LIMIT
1
BDM - LP VERSION 1.01
A. Brooke, A. Drud, and A. Meeraus,
Analytic Support Unit,
Development Research Department,
World Bank,
Washington D.C. 20433, U.S.A.
1 OPTIMAL
1.2750
1000.000
1000
Estimate work space needed

-Work space allocated
EXIT - - OPTIMAL SOLUTION FOUND.
LOWER
- - - - EQU LP
.
.
- - - - EQU E2
1.000
LOWER
- - - - VAR IBM
0.750
- - - - VAR GM
.
- - - - VAR GOLD
.
- - - - VAR OBJLP -INF
**** REPORT SUMMARY :
33 Kb
--
231 Kb
LEVEL
UPPER
MARGINAL
1.000
1.200
.
1.000
1.000
LEVEL
0.750
0.250
..
1.275
UPPER
0.100
0.750
0.750
+INF
MARGINAL
.
-0.120
.
NONOPT
0 INFEASIBLE
0 UNBOUNDED
Model Statistics SOLVE PORTQP USING NLP FROM LINE 34
MODEL STATISTICS
BLOCKS OF EQUATIONS
3
SINGLE EQUATIONS
BLOCKS OF VARIABLES
4
SINGLE VARIABLES
NON ZERO ELEMENTS
10
NON LINEAR N-Z
DERIVITIVE POOL
8
CONSTANT POOL
CODE LENGTH
95
GENERATION TIME
=
2.360 SECONDS
EXECUTION TIME
=
3.510 SECONDS
3
4
3
3
48

S O L VE S U M M A R Y
MODEL
PORTLP
OBJECTIVE
OBJLP
TYPE
LP
DIRECTION
MAXIMIZE
SOLVER
MINOS5
FROM LINE
34
**** SOLVER STATUS
1 NORMAL COMPLETION
**** MODEL STATUS
2 LOCALLY OPTIMAL
**** OBJECTIVE VALUE
0.4210
RESOURCE USAGE, LIMIT
3.129
1000.000
ITERATION COUNT, LIMIT
3
1000
EVALUATION ERRORS
0
0
MINOS
5.3
(Nov. 1990)
Ver: 225-DOS-02
B.A. Murtagh, University of New South Wales
and
P.E. Gill, W. Murray, M.A. Saunders and M.H. Wright
Systems Optimization Laboratory, Stanford University.
EXIT - - OPTIMAL SOLUTION FOUND
MAJOR ITNS, LIMIT
FUNOBJ, FUNCON CALLS
8
SUPERBASICS
INTERPRETER USAGE
NORM RG / NORM PI
3.732E-17
LOWER
- - - - EQU QP
.
- - - - EQU E1
1.150
- - - - EQU E2
1.000
LOWER
- - - - VAR IBM
.
- - - - VAR GM
.
- - - - VAR GOLD
.
- - - - VAR OBJLP -INF
**** REPORT SUMMARY :
1
1
.21
LEVEL
1.150
1.000
LEVEL
0.183
0.248
0.569
1.421
0
NONOPT
0 INFEASIBLE
0 UNBOUNDED
0
ERRORS
Model Statistics SOLVE PORTQP USING NLP FROM LINE 34
EXECUTION TIME
=
1.090 SECONDS
UPPER
. .
+INF
1.000
UPPER
0.750
0.750
0.750
+INF
MARGINAL
1.000
1.216
-0.556
MARGINAL
.
EPS
.
.
49
Algorithms for Constrained Problems

Motivation: Build on unconstrained methods wherever possible.
Classification of Methods:
Reduced Gradient Methods - (with Restoration) GRG2, CONOPT
Reduced Gradient Methods - (without Restoration) MINOS
Successive Quadratic Programming - generic implementations
Penalty Functions - popular in 1970s, but fell into disfavor. Barrier
Methods have been developed recently and are again popular.
Successive Linear Programming - only useful for "mostly linear"
problems
We will concentrate on algorithms for first four classes.
Evaluation: Compare performance on "typical problem," cite experience
on process problems.
50
Representative Constrained Problem

(Hughes, 1981)
g1 0
g2 0
Min f(x1, x2) = exp(-)

g1 = (x2+0.1)2[x12+2(1-x2)(1-2x2)] - 0.16 0
g2 = (x1 - 0.3)2 + (x2 - 0.3)2 - 0.16 0
x* = [0.6335, 0.3465]
f(x*) = -4.8380
51
Reduced Gradient Method with Restoration

(GRG2/CONOPT)
Min f(x)
s.t. g(x) + s = 0 (add slack variable)
c(x) = 0
a x b, s 0
Min
f(z)
s.t. h(z) = 0
azb
Partition variables into:

zB - dependent or basic variables
zN - nonbasic variables, fixed at a bound
zS - independent or superbasic variables
Analogy to linear programming. Superbasics required only if nonlinear problem
Solve unconstrained problem in space of superbasic variables.
Let zT = [zST zBT zNT] optimize wrt zS with h(zS, zB , zN) = 0
Calculate constrained derivative or reduced gradient wrt zS.
Dependent variables are zB Rm
Nonbasic variables zN (temporarily) fixed
52
Definition of Reduced Gradient

df
f dz B f
=
+
dz S z S dz S z B
Because h( z ) = 0, we have :
T
h
h
dh =
dz S +
dz B = 0
z B
z S
1
h h
dz B
1
h
h
=
=
zS
zB
dz S
z
S B
This leads to :
df
f
1 f
=
zS h z B h
dz S z S
z B
By remaining feasible always, h(z) = 0, a z b, one can apply an

unconstrained algorithm (quasi-Newton) using (df/dzS)
Solve problem in reduced space of zS variables
53
Example of Reduced Gradient

Min
x1 2 x 2
s .t .
3 x1 + 4 x 2 = 24
h T = [ 3 4], f
= [ 2 x1 - 2]
Let z S = x1 , z B = x 2
df
f
=
zS h zB h
dz S
z S
f
z B
df
1
= 2 x1 3[4 ] (- 2 ) = 2 x1 + 3 / 2
dx 1
If hT is (m x n); zShT is m x (n-m); zBhT is (m x m)
(df/dzS) is the change in f along constraint direction per unit change in zS

54
Sketch of GRG Algorithm

1. Initialize problem and obtain a feasible point at z0
2. At feasible point zk, partition variables z into zN, zB, zS
3. Calculate reduced gradient, (df/dzS)
4. Evaluate search direction for zS, d = B-1(df/dzS)
5. Perform a line search.
Find (0,1] with zS := zSk + d
Solve for h(zSk + d, zB, zN) = 0
If f(zSk + d, zB, zN) < f(zSk, zB, zN),
set zSk+1 =zSk + d, k:= k+1
6. If ||(df/dzS)||<, Stop. Else, go to 2.
55
GRG Algorithm Properties

1.
2.
3.
4.
5.
GRG2 has been implemented on PCs as GINO and is very reliable and
robust. It is also the optimization solver in MS EXCEL.
CONOPT is implemented in GAMS, AIMMS and AMPL
GRG2 uses Q-N for small problems but can switch to conjugate
gradients if problem gets large. CONOPT uses exact second derivatives.
Convergence of h(zS, zB , zN) = 0 can get very expensive because h is
required
Safeguards can be added so that restoration (step 5.) can be dropped
and efficiency increases.
Representative Constrained Problem Starting Point [0.8, 0.2]

GINO Results - 14 iterations to ||f(x*)|| 10-6
CONOPT Results - 7 iterations to ||f(x*)|| 10-6 from feasible point.
56
Reduced Gradient Method without Restoration

(MINOS/Augmented)
Motivation: Efficient algorithms
are available that solve linearly
constrained optimization
problems (MINOS):
Min f(x)
s.t. Ax b
Cx = d
Extend to nonlinear problems,
through successive linearization
Develop major iterations
(linearizations) and minor
iterations (GRG solutions) .
Strategy: (Robinson, Murtagh & Saunders)

1. Partition variables into basic, nonbasic
variables and superbasic variables..
2. Linearize active constraints at zk
Dkz = ck
3. Let = f (z) + vTh (z) + (hTh)
(Augmented Lagrange),
4. Solve linearly constrained problem:
Min
(z)
s.t.
Dz = c
azb
using reduced gradients to get zk+1
5. Set k=k+1, go to 3.
6. Algorithm terminates when no
movement between steps 3) and 4).
57
MINOS/Augmented Notes
1. MINOS has been implemented very efficiently to take care of
linearity. It becomes LP Simplex method if problem is totally
linear. Also, very efficient matrix routines.
2. No restoration takes place, nonlinear constraints are reflected in
(z) during step 3). MINOS is more efficient than GRG.
3. Major iterations (steps 3) - 4)) converge at a quadratic rate.
4. Reduced gradient methods are complicated, monolithic codes:
hard to integrate efficiently into modeling software.
Representative Constrained Problem Starting Point [0.8, 0.2]
MINOS Results: 4 major iterations, 11 function calls
to ||f(x*)|| 10-6
58
Successive Quadratic Programming (SQP)

Motivation:
Take KKT conditions, expand in Taylor series about current point.
Take Newton step (QP) to determine next point.
Derivation KKT Conditions

xL (x*, u*, v*) = f(x*) + gA(x*) u* + h(x*) v* = 0
h(x*) = 0
gA(x*) = 0,
where gA are the active constraints.
Newton - Step
xx L g
A
g T
0
A
h T
0
x L (x k , u k , v k )
h x
gA (x k )
0 u = -
0 v
h (x k )
Requirements:
xxL must be calculated and should be regular
correct active set gA
good estimates of uk, vk
59
SQP Chronology
1. Wilson (1963)
- active set can be determined by solving QP:
Min
f(xk)Td + 1/2 dT xx L(xk, uk, vk) d
d
s.t.
g(xk) + g(xk)T d 0
h(xk) + h(xk)T d = 0
2. Han (1976), (1977), Powell (1977), (1978)
- approximate xxL using a positive definite quasi-Newton update (BFGS)
- use a line search to converge from poor starting points.
Notes:
- Similar methods were derived using penalty (not Lagrange) functions.
- Method converges quickly; very few function evaluations.
- Not well suited to large problems (full space update used).
For n > 100, say, use reduced space methods (e.g. MINOS).
60
Elements of SQP Hessian Approximation

What about xxL?
need to get second derivatives for f(x), g(x), h(x).
need to estimate multipliers, uk, vk; xxL may not be positive
semidefinite
Approximate xxL (xk, uk, vk) by Bk, a symmetric positive
definite matrix.
T
k
T k
s
yy
B
s
B
k
k+1
=
+
B
B
T
k
s y
sB s
s = xk+1 - xk
BFGS Formula
y = L(xk+1, uk+1, vk+1) - L(xk, uk+1, vk+1)
second derivatives approximated by change in gradients
positive definite Bk ensures unique QP solution
61
Elements of SQP Search Directions

How do we obtain search directions?
Form QP and let QP determine constraint activity
At each iteration, k, solve:

Min
f(xk) Td + 1/2 dT Bkd
d
s.t.
g(xk) + g(xk) T d 0
h(xk) + h(xk) T d = 0
Convergence from poor starting points
As with Newton's method, choose (stepsize) to ensure progress
toward optimum:
xk+1 = xk + d.
is chosen by making sure a merit function is decreased at each
iteration.
Exact Penalty Function
(x) = f(x) + [ max (0, gj(x)) + |hj (x)|]
> maxj {| uj |, | vj |}
Augmented Lagrange Function
(x) = f(x) + uTg(x) + vTh(x)
+ /2 { (hj (x))2 + max (0, gj (x))2}
62
Newton-Like Properties for SQP
Fast Local Convergence

B = xxL
xxL is p.d and B is Q-N
B is Q-N update, xxL not p.d
Quadratic
1 step Superlinear
2 step Superlinear
Enforce Global Convergence

Ensure decrease of merit function by taking 1
Trust region adaptations provide a stronger guarantee of global
convergence - but harder to implement.
63
Basic SQP Algorithm

0. Guess x0, Set B0 = I (Identity). Evaluate f(x0), g(x0) and h(x0).
1. At xk, evaluate f(xk), g(xk), h(xk).
2. If k > 0, update Bk using the BFGS Formula.
3. Solve:
s.t.
Mind f(xk)Td + 1/2 dTBkd

g(xk) + g(xk)Td 0
h(xk) + h(xk)Td = 0
If KKT error less than tolerance: ||L(x*)|| , ||h(x*)|| ,

||g(x*)+|| . STOP, else go to 4.
4. Find so that 0 < 1 and (xk + d) < (xk) sufficiently
(Each trial requires evaluation of f(x), g(x) and h(x)).
5. xk+1 = xk + d. Set k = k + 1 Go to 2.
64
Problems with SQP

Nonsmooth Functions - Reformulate
Ill-conditioning - Proper scaling
Poor Starting Points Trust Regions can help
Inconsistent Constraint Linearizations
- Can lead to infeasible QP's
x2
x1
Min x2
s.t. 1 + x1 - (x2)2 0
1 - x1 - (x2)2 0
x2 -1/2
65
SQP Test Problem

1.2
1.0
0.8
x2
0.6
x*
0.4
0.2
0.0
0.0
0.2
0.4
x1
0.6
0.8
1.0
1.2
Min x2
s.t. -x2 + 2 x12 - x13 0
-x2 + 2 (1-x1)2 - (1-x1)3 0
x* = [0.5, 0.375].
66
SQP Test Problem First Iteration

1.2
1.0
0.8
x2
0.6
0.4
0.2
0.0
0.0
0.2
0.4
x1
0.6
0.8
1.0
1.2
Start from the origin (x0 = [0, 0]T) with B0 = I, form:

Min
s.t.
d2 + 1/2 (d12 + d22)

d2 0
d1 + d2 1
d = [1, 0]T. with 1 = 0 and 2 = 1.
67
SQP Test Problem Second Iteration

1.2
1.0
0.8
x2
0.6
x*
0.4
0.2
0.0
0.0
0.2
0.4
x1
0.6
0.8
1.0
1.2
From x1 = [0.5, 0]T with B1 = I

(no update from BFGS possible), form:
d2 + 1/2 (d12 + d22)
-1.25 d1 - d2 + 0.375 0
1.25 d1 - d2 + 0.375 0
d = [0, 0.375]T with 1 = 0.5 and 2 = 0.5
x* = [0.5, 0.375]T is optimal
Min
s.t.
68
Representative Constrained Problem

SQP Convergence Path
g1 0
g2 0
Starting Point[0.8, 0.2] - starting from B0 = I and staying in bounds

and linearized constraints; converges in 8 iterations to ||f(x*)|| 10-6
69
Barrier Methods for Large-Scale

Nonlinear Programming
minn f ( x)
Can generalize for
x
Original Formulation
s.t c( x) = 0
x0
a xb
Barrier Approach
minn ( x) = f ( x) ln xi
x
i =1
s.t c( x) = 0
As 0,
x*() x*
Fiacco and McCormick (1968)

70
Solution of the Barrier Problem

Newton Directions (KKT System)
f ( x ) + A ( x ) v
XVe e
c(x)
Solve
Reducing the System
= 0
= 0
= 0
dv = X 1e v X 1Vdx
W A I d x
f + A v
AW0+ 0 A
d d=x c
1
= X V
T
+
1
c
V A
0 X 0d v
S
e
IPOPT Code www.coin-or.org
71
Global Convergence of Newton-based

Barrier Solvers
Merit Function
Exact Penalty:
P(x, ) = f(x) + ||c(x)||
Augd Lagrangian: L*(x, , ) = f(x) + Tc(x) + ||c(x)||2

Assess Search Direction (e.g., from IPOPT)
Line Search choose stepsize to give sufficient decrease of merit function
using a step to the boundary rule with ~0.99.
for (0, ], xk +1 = xk + d x
xk + d x (1 ) xk > 0
vk +1 = vk + d v (1 )vk > 0
k +1 = k + (+ k )
How do we balance (x) and c(x) with ?
Is this approach globally convergent? Will it still be fast?
72
Global Convergence Failure

(Wchter and B., 2000)
x2
Min f ( x)
1
s.t. x1 x3 = 0
2
( x1 ) 2 x2 1 = 0
x2 , x3 0
Newton-type line search stalls
even though descent directions
exist
x1
k T
k
A( x ) d x + c( x ) = 0
xk + d x > 0
Remedies:
Composite Step Trust Region
(Byrd et al.)
Filter Line Search Methods
73
Line Search Filter Method

Store (k, k) at allowed iterates
Allow progress if trial point is
acceptable to filter with margin
(x)
If switching condition
[kT d ]a [ k ]b , a > 2b > 2

is satisfied, only an Armijo line
search is required on k
If insufficient progress on stepsize,
evoke restoration phase to reduce .
Global convergence and superlinear
local convergence proved (with
second order correction)
(x) = ||c(x)||
74
Implementation Details
Modify KKT (full space) matrix if nonsingular
Wk + k + 1
T
A
k
Ak
2 I
1 - Correct inertia to guarantee descent direction

2 - Deal with rank deficient Ak
KKT matrix factored by MA27

Feasibility restoration phase
Min || c( x) ||1 + || x xk ||Q
xl xk xu
Apply Exact Penalty Formulation
Exploit same structure/algorithm to reduce infeasibility
75
Details of IPOP T Algorithm

Options
Comparison
Line Search Strategies

- l 2 exact penalty merit function
- augmented Lagrangian
function
- Filter method (adapted from
Fletcher and Leyffer)
34 COPS Problems
(600 - 160 000 variables)
486 CUT E Problems
(2 50 000 variables)
56 MIT T Problems
(12097 99998 variables)
Hessian Calculation
- BFGS (reduced space)
- SR1 (reduced space)
- Exact full Hessian (direct)
- Exact reduced Hes sian (direct)
Performance Measure
- rp, l = (#iterp,l)/ (#iterp, min)
- P() = fraction of problems
with log2(rp, l) <
- Preconditioned CG
76
IPOPT Comparison with KNITRO and

LOQO
CPU time (COPS)
CPU time (CUTE)
0.9
0.9
0.8
0.7
0.7
0.6
0.6
rho
rho
0.8
0.5
0.5
0.4
0.4
0.3
0.3
0.2
0.2
0.1
0.1
4
tau
CPU time (MITT)
4
tau
IPOPT has better performance,

robustness on CUTE, MITT and
COPS problem sets
1
0.9
0.8
0.7
Similar results appear with iteration

counts
IPOPT
KNITRO
LOQO
0.6
rho
IPOPT
KNITRO
LOQO
IPOPT
KNITRO
LOQO
0.5
Can be downloaded from

http://www.coin-or.org
0.4
0.3
0.2
See links for additional information
0.1
0
4
tau
77
Recommendations for Constrained Optimization

1. Best current algorithms
GRG 2/CONOPT
MINOS
SQP
IPOPT
2. GRG 2 (or CONOPT) is generally slower, but is robust. Use with highly
nonlinear functions. Solver in Excel!
3. For small problems (n 100) with nonlinear constraints, use SQP.
4. For large problems (n 100) with mostly linear constraints, use MINOS.
==> Difficulty with many nonlinearities
Fewer Function
Evaluations
SQP
MINOS Tailored Linear

CONOPT
Algebra
Small, Nonlinear Problems - SQP solves QP's, not LCNLP's, fewer function calls.
Large, Mostly Linear Problems - MINOS performs sparse constraint decomposition.
Works efficiently in reduced space if function calls are cheap!
Exploit Both Features IPOPT takes advantages of few function evaluations and largescale linear algebra, but requires exact second derivatives
78
Available Software for Constrained

Optimization
SQP Routines
HSL, NaG and IMSL (NLPQL) Routines
NPSOL Stanford Systems Optimization Lab
SNOPT Stanford Systems Optimization Lab (rSQP discussed later)
IPOPT http://www.coin-or.org
GAMS Programs
CONOPT - Generalized Reduced Gradient method with restoration
MINOS - Generalized Reduced Gradient method without restoration
A student version of GAMS is now available from the CACHE office. The cost for this package
including Process Design Case Students, GAMS: A Users Guide, and GAMS - The Solver Manuals,
and a CD-ROM is $65 per CACHE supporting departments, and $100 per non-CACHE supporting
departments and individuals. To order please complete standard order form and fax or mail to
CACHE Corporation. More information can be found on http://www.che.utexas.edu/cache/gams.html
MS Excel
Solver uses Generalized Reduced Gradient method with restoration
79
Rules for Formulating Nonlinear Programs

1) Avoid overflows and undefined terms, (do not divide, take logs, etc.)
e.g. x + y - ln z = 0 x + y - u = 0
exp u - z = 0
2) If constraints must always be enforced, make sure they are linear or bounds.
e.g. v(xy - z2)1/2 = 3
vu = 3
u- (xy - z2) = 0, u 0
3) Exploit linear constraints as much as possible, e.g. mass balance
xi L + yi V = F zi li + vi = fi
L li = 0
4) Use bounds and constraints to enforce characteristic solutions.
e.g.
a x b, g (x) 0
to isolate correct root of h (x) = 0.
5) Exploit global properties when possibility exists. Convex (linear equations?)
Linear Program? Quadratic Program? Geometric Program?
6) Exploit problem structure when possible.
e.g. Min
[Tx - 3Ty]
s.t. xT + y - T2 y = 5
4x - 5Ty + Tx = 7
0T1
(If T is fixed solve LP) put T in outer optimization loop.
80
Process Optimization
Problem Definition and Formulation
State of Nature and Problem Premises
Restrictions: Physical, Legal

Economic, Political, etc.
Desired Objective: Yield,

Economic, Capacity, etc.
Decisions
Mathematical Modeling and Algorithmic Solution
Process Model Equations
Constraints
Objective Function
Additional Variables
81
Large Scale NLP Algorithms

Motivation: Improvement of Successive Quadratic Programming
as Cornerstone Algorithm
process optimization for design, control and operations
Evolution of NLP Solvers:
SQP
rSQP
IPOPT
rSQP++
19991988-98:
1981-87:
: Simultaneous
Static
Flowsheet
Real-time
dynamic
optimization
optimization
optimization
over
over
over
1 000
100
100
000
000
variables
variables
variables
and
and
and
constraints
constraints
constraints
Current: Tailor structure, architecture and problems
82
Flowsheet Optimization Problems - Introduction

Modular Simulation Mode
Physical Relation to Process
In
Out
- Intuitive to Process Engineer

- Unit equations solved internally
- tailor-made procedures.
Convergence Procedures - for simple flowsheets, often identified

from flowsheet structure
Convergence "mimics" startup.
Debugging flowsheets on "physical" grounds
83
Flowsheet Optimization Problems - Features
Design Specifications
Specify # trays reflux ratio, but would like to specify
overhead comp. ==> Control loop -Solve Iteratively
1
3
Nested Recycles Hard to Handle

Best Convergence Procedure?
Frequent block evaluation can be expensive

Slow algorithms applied to flowsheet loops.
NLP methods are good at breaking looks
84
Chronology in Process Optimization

Sim. Time Equiv.
1. Early Work - Black Box Approaches
Friedman and Pinder (1972)
Gaddy and co-workers (1977)
2. Transition - more accurate gradients
Parker and Hughes (1981)
Biegler and Hughes (1981)
3. Infeasible Path Strategy for Modular Simulators
Biegler and Hughes (1982)
Chen and Stadtherr (1985)
Kaijaluoto et al. (1985)
and many more
4. Equation Based Process Optimization
Westerberg et al. (1983)
Shewchuk (1985)
DMO, NOVA, RTOPT, etc. (1990s)
75-150
300
64
13
<10
<5
2
1-2
Process optimization should be as cheap and easy as process simulation

85
Process Simulators with Optimization

Capabilities (using SQP)
Aspen Custom Modeler (ACM)
Aspen/Plus
gProms
Hysim/Hysys
Massbal
Optisim
Pro/II
ProSim
ROMeo
VTPLAN
86
Simulation and Optimization of Flowsheets

2
h(y) = 0
4
1
w(y)
Min f(x), s.t. g(x) 0

For single degree of freedom:
work in space defined by curve below.
requires repeated (expensive) recycle convergence
f(x, y(x))
87
Expanded Region with Feasible Path
88
"Black Box" Optimization Approach

Vertical steps are expensive (flowsheet convergence)
Generally no connection between x and y.
Can have "noisy" derivatives for gradient optimization.
89
SQP - Infeasible Path Approach

solve and optimize simultaneously in x and y
extended Newton method
90
Optimization Capability for Modular Simulators

(FLOWTRAN, Aspen/Plus, Pro/II, HySys)
Architecture
- Replace convergence with optimization block
- Problem definition needed (in-line FORTRAN)
- Executive, preprocessor, modules intact.
Examples
1. Single Unit and Acyclic Optimization
- Distillation columns & sequences
2. "Conventional" Process Optimization
- Monochlorobenzene process
- NH3 synthesis
3. Complicated Recycles & Control Loops
- Cavett problem
- Variations of above
91
Optimization of Monochlorobenzene Process

S06
U = 100
HC1
Cooling
Water
80o F
A-1
PHYSICAL PROPERTY OPTIONS

Cavett Vapor Pressure
Redlich-Kwong Vapor Fugacity
Corrected Liquid Fugacity
Ideal Solution Activity Coefficient
OPT (SCOPT) OPTIMIZER
Optimal Solution Found After 4 Iterations
Kuhn-Tucker Error
0.29616E-05
Allowable Kuhn-Tucker Error 0.19826E-04
Objective Function
-0.98259
ABSORBER
15 Trays
(3 Theoretical Stages)
32 psia
S11
S04
Feed
o
80 F
37 psia
DISTILLATION
30 Trays
(20 Theoretical Stages)
270 F
HC1
F-1
S01
D-1
S10
S07
Benzene,
0.1 Lb Mole/Hr
of MCB
25
psia
S05
S02
FLASH
S09
Steam
o
360 F
Steam
S03
H-1
U = 100
Feed Flow Rates
LB Moles/Hr
Optimization Variables
HC1
10
32.006 0.38578 200.00 120.00
Benzene 40
MCB
50
Tear Variables
0.10601E-19 13.064 79.229 120.00 50.000
Tear Variable Errors (Calculated Minus Assumed)
-0.10601E-19 0.72209E-06
-0.36563E-04 0.00000E+00
0.00000E+00
-Results of infeasible path optimization
-Simultaneous optimization and convergence of tear streams.
360o F
S08
12,000
2
Btu/hr- ft
T-1
S12
TREATER
90 F
Maximize
Profit
H-2
U = 100
S13
C
P-1
Water
80o F
S15
0
120 F
T
S14
MCB
92
Ammonia Process Optimization

Pr
H2
Tr
N2
N2
H2
CH4
Ar
Hydrogen Feed
5.2%
94.0%
0.79 %
---
Nitrogen Feed
99.8%
--0.02%
0.01%
To
Tf
Tf
Product
Hydrogen and Nitrogen feed are mixed, compressed, and combined

with a recycle stream and heated to reactor temperature. Reaction
occurs in a multibed reactor (modeled here as an equilibrium reactor) to
partially convert the stream to ammonia. The reactor effluent is cooled
and product is separated using two flash tanks with intercooling. Liquid
from the second stage is flashed at low pressure to yield high purity
NH3 product. Vapor from the two stage flash forms the recycle and is
recompressed.
93

Optimization Problem
Performance Characterstics
Max
5 SQP iterations.
2.2 base point simulations.
objective function improves by
$20.66 x 106 to $24.93 x 106.
difficult to converge flowsheet
at starting point
{Total Profit @ 15% over five years}
s.t. 105 tons NH3/yr.

Pressure Balance
No Liquid in Compressors
1.8 H2/N2 3.5
Treact 1000o F
NH3 purged 4.5 lb mol/hr
NH3 Product Purity 99.9 %
Tear Equations
Optimum
Objective Function($106)
24.9286
Starting point
20.659
1. Inlet temp. reactor (oF)
400
400
2. Inlet temp. 1st flash (oF)
65
65
3. Inlet temp. 2nd flash (oF)
35
35
4. Inlet temp. rec. comp. (oF)

5. Purge fraction (%)
80.52
0.0085
107
0.01
6. Reactor Press. (psia)
2163.5
2000
7. Feed 1 (lb mol/hr)
2629.7
2632.0
8. Feed 2 (lb mol/hr)
691.78
691.4
94
How accurate should gradients be for optimization?

Recognizing True Solution
KKT conditions and Reduced Gradients determine true solution
Derivative Errors will lead to wrong solutions!
Performance of Algorithms
Constrained NLP algorithms are gradient based
(SQP, Conopt, GRG2, MINOS, etc.)
Global and Superlinear convergence theory assumes accurate gradients
Worst Case Example (Carter, 1991)
Newtons Method generates an ascent direction and fails for any !
Min f ( x) = x T Ax
+ 1 /
A=
1 /
1/
+ 1 /
x0 = [1 1]T f ( x0 ) = x0
g ( x0 ) = f ( x0 ) + O( )
d = A1 g ( x0 )
95
Implementation of Analytic Derivatives

parameters, p
exit variables, s
Module Equations
c(v, x, s, p, y) = 0
Sensitivity
Equations
dy/dx
ds/dx
dy/dp
ds/dp
Automatic Differentiation Tools

JAKE-F, limited to a subset of FORTRAN (Hillstrom, 1982)
DAPRE, which has been developed for use with the NAG library (Pryce, Davis, 1987)
ADOL-C, implemented using operator overloading features of C++ (Griewank, 1990)
ADIFOR, (Bischof et al, 1992) uses source transformation approach FORTRAN code .
TAPENADE, web-based source transformation for FORTRAN code
Relative effort needed to calculate gradients is not n+1 but about 3 to 5
(Wolfe, Griewank)
96

(9 decisions and 6 tear variables)
Flash Recycle Optimization

(2 decisions + 7 tear variables)
Reactor
S3
S1
Mixer
S2
Flash
S6
Hi P
S5
S4
Ratio
Flash
S7
Lo P
2
3
1/2
2
Max S3(A) *S3(B) - S3(A) - S3(C) + S3(D) - (S3(E))
Flash
8000
200
GRG
SQP
GRG
6000
rSQP
100
1Numerical
Exact
CPU Seconds (VS 3200)
CPU Seconds (VS 3200)
SQP
rSQP
4000
2000
Numerical
1
Exact
97
Large-Scale SQP
Min
s.t.
f(z)
c(z)=0
zL z zU
Min
s.t.
f(zk)T d + 1/2 d T Wk d
c(zk) + (k)T d = 0
zL zk + d zU
Characteristics
Many equations and variables ( 100 000)
Many bounds and inequalities ( 100 000)
Few degrees of freedom (10 - 100)
Steady state flowsheet optimization
Real-time optimization
Parameter estimation
Many degrees of freedom ( 1000)
Dynamic optimization (optimal control, MPC)
State estimation and data reconciliation
98
Few degrees of freedom => reduced space SQP (rSQP)

Take advantage of sparsity of A=c(x)
project W into space of active (or equality constraints)
curvature (second derivative) information only needed in space of degrees of
freedom
second derivatives can be applied or approximated with positive curvature
(e.g., BFGS)
use dual space QP solvers
+ easy to implement with existing sparse solvers, QP methods and line search
techniques
+ exploits 'natural assignment' of dependent and decision variables (some
decomposition steps are 'free')
+ does not require second derivatives
- reduced space matrices are dense
- may be dependent on variable partitioning
- can be very expensive for many degrees of freedom
- can be expensive if many QP bounds
99
Reduced space SQP (rSQP)

Range and Null Space Decomposition
Assume no active bounds, QP problem with n variables and m
constraints becomes:
W k
kT
A
f ( x k )
Ak d
=
k
0 +
c( x )
Define reduced space basis, Zk n x (n-m) with (Ak)TZk = 0

Define basis for remaining space Yk n x m, [Yk Zk]n x n
is nonsingular.
Let d = Yk dY + Zk dZ to rewrite:
Y Z
0
k
k T
0W
k T
I A
k
A Y Z
0 0
k
dY k k
0
Y Z
dZ =
I
0
k
0f (x )
k
I c(x )
100
Reduced space SQP (rSQP)

0
Y k T W k Y k
kT k k
Z W Y
Ak T Y k
0
kT
Y W Y
kT
Z W kZk
0
Ak dY
Y k T f ( x k )

0 d Z = Z k T f ( x k )

c( x k )
0 +
kT
(ATY) dY =-c(xk) is square, dY determined from bottom row.

Cancel YTWY and YTWZ; (unimportant as dZ, dY --> 0)
(YTA) O = -YTf(xk), can be determined by first order estimate
Calculate or approximate w= ZTWY dY, solve ZTWZ dZ =-ZTf(xk)-w
Compute total step: d = Y dY + Z dZ
101
Reduced space SQP (rSQP) Interpretation
dZ
dY
SQP step (d) operates in a higher dimension
Satisfy constraints using range space to get dY
Solve small QP in null space to get dZ
In general, same convergence properties as SQP.
102
Choice of Decomposition Bases

1. Apply QR factorization to A. Leads to dense but well-conditioned Y and Z.
R
R
A = Q = [Y Z ]
0
0
2. Partition variables into decisions u and dependents v. Create
orthogonal Y and Z with embedded identity matrices (ATZ = 0, YTZ=0).
AT = u cT
v cT = [N
C]
N T C T
I
Z = 1 Y =
C
N
I
3. Coordinate Basis - same Z as above, YT = [ 0 I ]
Bases use gradient information already calculated.

Adapt decomposition to QP step
Theoretically same rate of convergence as original SQP.
Coordinate basis can be sensitive to choice of u and v. Orthogonal is not.
Need consistent initial point and nonsingular C; automatic generation
103
rSQP Algorithm
1. Choose starting point x0.
2. At iteration k, evaluate functions f(xk), c(xk) and their gradients.
3. Calculate bases Y and Z.
4. Solve for step dY in Range space from
(ATY) dY =-c(xk)
5. Update projected Hessian Bk ~ ZTWZ (e.g. with BFGS), wk (e.g., zero)
6. Solve small QP for step dZ in Null space.
Min ( Z T f ( x k ) + wk )T d Z + 1 / 2d Z B k d Z
T
s.t.
7.
8.
9.
10.
11.
xL x k + YdY + Zd Z xU
If error is less than tolerance stop. Else

Solve for multipliers using (YTA) = -YTf(xk)
Calculate total step d = Y dY + Z dZ.
Find step size and calculate new point, xk+1 = xk +
Continue from step 2 with k = k+1.
104
rSQP Results: Computational Results for

General Nonlinear Problems
Vasantharajan et al (1990)
105
rSQP Results: Computational Results

for Process Problems
Vasantharajan et al (1990)
106
Comparison of SQP and rSQP

Coupled Distillation Example - 5000 Equations
Decision Variables - boilup rate, reflux ratio
1.
2.
3.
4.
5.
Method
SQP*
rSQP
rSQP
rSQP
rSQP
CPU Time Annual Savings

2 hr
negligible
15 min. $ 42,000
15 min. $ 84,000
15 min. $ 84,000
15 min $107,000
Comments
Base Case
Base Case
Higher Feed Tray Location
Column 2 Overhead to Storage
Cases 3 and 4 together
18
10
1
QVK
QVK
107
Real-time Optimization with rSQP

Sunoco Hydrocracker Fractionation Plant
(Bailey et al, 1993)
Existing process, optimization on-line at regular intervals: 17 hydrocarbon
components, 8 heat exchangers, absorber/stripper (30 trays), debutanizer (20
trays), C3/C4 splitter (20 trays) and deisobutanizer (33 trays).
FUEL GAS
C3
REFORMER
NAPHTHA
RECYCLE
OIL
iC4
DIB
DEBUTANIZER
MIXED LPG
C3/C4
SPLITTER
LIGHT
NAPHTHA
MAIN FRAC.
PREFLASH
ABSORBER/
STRIPPER
REACTOR EFFLUENT FROM

LOW PRESSURE SEPARATOR
nC4
square parameter case to fit the model to operating data.

optimization to determine best operating conditions
108
Optimization Case Study Characteristics

Model consists of 2836 equality constraints and only ten independent variables. It
is also reasonably sparse and contains 24123 nonzero Jacobian elements.
NP
P = ziCi + ziC + zi Ci Pm U
G
iG
E
i
iE
m=1
Cases Considered:
1. Normal Base Case Operation
2. Simulate fouling by reducing the heat exchange coefficients for the debutanizer
3. Simulate fouling by reducing the heat exchange coefficients for splitter
feed/bottoms exchangers
4. Increase price for propane
5. Increase base price for gasoline together with an increase in the octane credit
109
110
Many degrees of freedom => full space IPOPT
W k +
kT
A
( x k )
Ak d
=
k
0 +
c( x )
work in full space of all variables

second derivatives useful for objective and constraints
use specialized large-scale Newton solver
+ W=xxL(x,) and A=c(x) sparse, often structured
+ fast if many degrees of freedom present
+ no variable partitioning required
- second derivatives strongly desired
- W is indefinite, requires complex stabilization
- requires specialized large-scale linear algebra
111
QHV
L
O
H
3LS
*DVROLQH%OHQGLQJ
2,/7$1.6
*DVROLQH%OHQGLQJ+HUH
*$667$7,216
6XSSO\WDQNV
,QWHUPHGLDWHWDQNV
),1$/352'8&7758&.6
)LQDO3URGXFWWDQNV
112
Blending Problem & Model Formulation
q t ,i
f t ,ij
..qt , j
f t , jk
qi
..qt ,k
.. f t ,k
max
( ck ft , k ci ft ,i )
t k
vk
s.t.
ft , jk ft , ij + vt + 1, j
k
=v
t, j
=0
ft , k f
t , jk
j
vt , j
qt , j ft , jk qt , i ft , ij + qt + 1, j vt + 1, j
k
Supply tanks (i)
Intermediate tanks (j)
Final Product tanks (k)
=q v
t, j t, j
q f
=0
q f
t , j t , jk
t, k t, k
j
qt ,k qk
qk
max
min
f & v ------ flowrates and tank volumes

q
------ tank qualities
vt , j v j
vj
max
min
Model Formulation in AMPL

113
Small Multi-day Blending Models

Single Qualities
Haverly, C. 1978 (HM)
F1
Audet & Hansen 1998 (AHM)
B1
F1
B1
P1
P1
F2
F2
F3
B2
P2
B2
F3
B3
114
Honeywell Blending Model Multiple Days

48 Qualities
115
Summary of Results Dolan-Mor plot

Performance profile (iteration count)
1
0.9
0.8
0.7
0.6
0.5
IPOPT
LOQO
0.4
KNITRO
SNOPT
MINOS
0.3
LANCELOT
0.2
0.1
0
1
10
100
1000
10000
100000
1000000
10000000
116
Comparison of NLP Solvers: Data Reconciliation

1000
LANCELOT
Iterations
800
MINOS
600
SNOPT
400
KNITRO
LOQO
200
IPOPT
0
0
200
400
600
Degrees of Freedom
CPU Time (s, norm.)
100
10
LANCELOT
M INOS
SNOPT
1
0
200
400
600
KNITRO
LOQO
0.1
IPOPT
0.01
Degrees of Freedom
117
Sensitivity Analysis for Nonlinear Programming
At nominal conditions, p0
Min f(x, p0)
s.t. c(x, p0) = 0
a(p0) x b(p0)
How is the optimum affected at other conditions, p p0?
Model parameters, prices, costs
Variability in external conditions
Model structure
How sensitive is the optimum to parameteric uncertainties?
Can this be analyzed easily?
118
Calculation of NLP Sensitivity

Take KKT Conditions
L(x*, p, , v) = 0
c(x*, p0) = 0
ETx* - bnd( p0)=0
and differentiate and expand about p0.

pxL(x*, p, , v)T + xxL(x*, p, , v)T px*T + xh(x*, p, , v)T pT + pvT = 0
pc(x*, p0)T + xc(x*, p0)T px*T = 0
ET (px*T - pbndT)=0
Notes:
A key assumption is that under strict complementarity, the active set will not
change for small perturbations of p.
If an element of x* is at a bound then pxi*T = pbndT
Second derivatives are required to calculate sensitivities, px*T
Is there a cheaper way to calculate px*T?
119
Decomposition for NLP Sensitivity

Let L( x*, p, , v) = f ( x*) + T c( x*) + ( E T ( x * bnd ( p )))T v
W
AT
E T
xp L( x*, p, , v)T
A E p x *T
T
T
0 0 p = p c( x*, p )
E T p bnd T
0 0 p vT
Partition variables into basic, nonbasic and superbasic
pxT= Z pxST + YpxBT + TpxNT

Set pxNT =pbndNT, nonbasic variables to rhs,
Substitute for remaining variables
Perform range and null space decomposition
Solve only for pxST andpxBT
120
Decomposition for NLP Sensitivity

Y T WY
T
Z WY
AT Y
T
Y T ( xp L( x*, p, , v)T + W T p xN T )
Y T WY Y T A p xB
T
T
T
T
T
Z WZ
0 p xS = Z ( xp L( x*, p, , v) + W T p x N )
T
T
T
0
0 p T
c
(
x
*,
p
)
+
A
T
x
p
p
N
Solve only for pxBT from bottom row andpxST from middle row
If second derivatives are not available, ZTWZ, ZTWY pxBT and
ZTWT pxNT can be constructed by directional finite differencing
If assumption of strict complementarity is violated, sensitivity can be
calculated using a QP subproblem.
121
Second Order Tests

Reduced Hessian needs to be positive definite
At solution x*: Evaluate eigenvalues of ZTxxL*Z
Strict local minimum if all positive.
Nonstrict local minimum: If nonnegative, find eigenvectors for
zero eigenvalues, regions of nonunique solutions
Saddle point: If any are negative, move along directions of
corresponding eigenvectors and restart optimization.
x2
z1
Saddle
Point
x*
x1
z2
122
Sensitivity for Flash Recycle Optimization

(2 decisions, 7 tear variables)
S3
S1
Mixer
S2
Flash
S6
S5
S4
Ratio
S7
2
3
1/2
2
Max S3(A) *S3(B) - S3(A) - S3(C) + S3(D) - (S3(E))
Second order sufficiency test:

Dimension of reduced Hessian = 1
Positive eigenvalue
Sensitivity to simultaneous change in feed rate and upper bound on purge ratio
Only 2-3 flowsheet perturbations required for second order information
123

(9 decisions, 8 tear variables)
20
Sensitivities vs. Re-optimized Pts

Reactor
19.5
Hi P
Flash
Lo P
Objective Function
Sensitivity
19
QP1
QP2
18.5
Actual
18
17.5
Flash
0.1
0.01
0.001
17
Relative perturbation change

Second order sufficiency test:
Dimension of reduced Hessian = 4
Eigenvalues = [2.8E-4, 8.3E-10, 1.8E-4, 7.7E-5]
Sensitivity to simultaneous change in feed rate and upper bound on reactor conversion
Only 5-6 extra perturbations for second derivatives
124
Multiperiod Optimization
Coordination
Case 1
1.
Case 2
Case 3
Case 4
Case N
Design plant to deal with different operating scenarios (over time or with
uncertainty)
2. Can solve overall problem simultaneously
large and expensive
polynomial increase with number of cases
must be made efficient through specialized decomposition

3. Solve also each case independently as an optimization problem (inner
problem with fixed design)
overall coordination step (outer optimization problem for design)
require sensitivity from each inner optimization case with design

variables as external parameters
125
Multiperiod Flowsheet Example

i
CA
CA
T0
T1
V
i
T2
F2
F1
A
W
i
Tw
Parameters
E (kJ/mol)
k0 (1/h)
F (kmol/h)
Time (h)
Tw
Period 1 Period 2 Period 3 Period 4

555.6
583.3
611.1
527.8
10
11
12
9
45.4
40.8
24.1
32.6
1000
4000
2000
1000
126
Multiperiod Design Model

Min f0(d) + i fi(d, xi)
s.t. hi(xi, d) = 0, i = 1, N
gi(xi, d) 0, i = 1, N
r(d) 0
Variables:
x: state (z) and control (u) variables in each operating period
d: design variables (e. g. equipment parameters) used
i: substitute for d in each period and add i = d
d
hi, gi
Min f0(d) + i fi(d, xi)

s.t. hi(xi, i) = 0, i = 1, N
gi(xi, i) +si = 0, i = 1, N
0 si, d i=0, i = 1, N
r(d) 0
127
Multiperiod Decomposition Strategy

SQP Subproblem
Block diagonal bordered KKT matrix

(arrowhead structure)
Solve each block sequentially (range/null dec.)
to form small QP in space of d variables
Reassemble all other steps from QP solution
128

From decomposition of KKT block in each period, obtain the
following directions that are parametric in d:
Substituting back into the original QP subproblem leads to

a QP only in terms of d.
N
T
minimize = d f 0 + { p f iT (ZBi + YBi ) + (ZAi + Y Ai )T 2p Li (Z Bi + YB i ) } d
i =1
N
2
+ d d L0 + { (ZBi + YBi )T 2p Li (Z Bi + YB i ) } d
i =1
2
subject to
r + dr d 0
Once d is obtained, directions are obtained from the above

equations.
129
Starting Point
Original QP
Decoupling

Active Set Update
pi steps are parametric in d and their

components are created independently
Decomposition linear in number of Calculate
Reduced
Hessian
periods and trivially parallelizable
Choosing the active inequality
constraints can be done through:
-Active set strategy (e.g., bound
Calculate
addition)
Step
-Interior point strategy using
barrier terms in objective
MINOR
Easy to implement in decomposition ITERATION
Null - Range
Decomposition
QP at d
Bound Addition
LOOP
Line search
Optimality
Conditions
NO
YES
Optimal Solution
130
Multiperiod Flowsheet 1
(13+2) variables and (31+4) constraints (1 period)
262 variables and 624 constraints (20 periods)
CA
T1
400
CA
T0
V
i
T2
F2
300
F1
CPU time (s)

SQP (T)
MINOS(T)
MPD/SQP(T)
200
100
A
W
0
0
Tw
Tw
10
Periods
20
30
131
Multiperiod Example 2 Heat Exchanger Network

(12+3) variables and (31+6) constraints (1 period)
243 variables and 626 constraints (20 periods)
H1
T
T
C1
50
40
563 K
30
300 K
Q
T
A
20
SQP (T)
MINOS (T)
MPD/SQP (T)
CPU time (s)
393 K
10
320 K
0
0
350 K
10
Periods
20
30
132
Summary and Conclusions

-Unconstrained Newton and Quasi Newton Methods
-KKT Conditions and Specialized Methods
-Reduced Gradient Methods (GRG2, MINOS)
-Successive Quadratic Programming (SQP)
-Reduced Hessian SQP
-Interior Point NLP (IPOPT)
Process Optimization Applications
-Modular Flowsheet Optimization
-Equation Oriented Models and Optimization
-Realtime Process Optimization
-Blending with many degrees of freedom
Further Applications
-Sensitivity Analysis for NLP Solutions
-Multiperiod Optimization Problems
133
Optimization of DifferentialAlgebraic Equation Systems

L. T. Biegler
Chemical Engineering Department
Carnegie Mellon University
Pittsburgh, PA
DAE Optimization Outline

I
II
III
IV
Introduction
Process Examples
Parametric Optimization
- Gradient Methods
Perturbation
Direct - Sensitivity Equations
Adjoint Equations
Optimal Control Problems
- Optimality Conditions
- Model Algorithms
Sequential Methods
Multiple Shooting
Indirect Methods
Simultaneous Solution Strategies
- Formulation and Properties
- Process Case Studies
- Software Demonstration
2
Dynamic Optimization Problem
min (z(t), y(t), u(t), p, t f )
s.t.
dz(t)
= f (z(t), y(t),u(t),t, p)
dt
g (z (t ), y(t ), u(t ), t , p ) = 0
zo
z
t, time
z, differential variables
y, algebraic variables
z(0 )
d z(t) d z
d y(t) d y
d u(t) d u
d p d p
u
u
tf, final time

u, control variables
p, time independent parameters
3
DAE Models in Process Engineering

Differential Equations
Conservation Laws (Mass, Energy, Momentum)
Algebraic Equations
Constitutive Equations, Equilibrium (physical properties,
hydraulics, rate laws)
Semi-explicit form
Assume to be index one (i.e., algebraic variables can be solved
uniquely by algebraic equations)
If not, DAE can be reformulated to index one (see Ascher and
Petzold)
Characteristics
Large-scale models not easily scaled
Sparse but no regular structure
Direct linear solvers widely used
Coarse-grained decomposition of linear algebra
4
Parameter Estimation
Catalytic Cracking of Gasoil (Tjoa, 1991)
p3
p1
p2
A
Q, Q
S , A
S
a = ( p1 + p3 )a 2
1.0
q = p1a 2 p2 q
0.8
a(0) = 1, q (0) = 0
0.6
YA_data
YQ_data
Yi
YA_estimate
0.4
number of states and ODEs: 2

number of parameters:3
no control profiles
constraints: pL p pU
YQ_estimate
0.2
0.0
0.0
0.2
0.4
0.6
0.8
1.0
Objective Function: Ordinary Least Squares

(p1, p2, p3)0 = (6, 4, 1)
(p1, p2, p3)* = (11.95, 7.99, 2.02)
(p1, p2, p3)true = (12, 8, 2)
5
Batch Distillation Multi-product Operating Policies

5XQEHWZHHQGLVWLOODWLRQEDWFKHV
7UHDWDVERXQGDU\YDOXHRSWLPL]DWLRQSUREOHP
:KHQWRVZLWFKIURP$WRRIIFXW WR%"
+RZPXFKRIIFXW WRUHF\FOH"
5HIOX["
%RLOXS 5DWH"
2SHUDWLQJ7LPH"
$
Nonlinear Model Predictive Control (NMPC)

NMPC Estimation and Control
d : disturbances
u : manipulated
variables
Process
z : differential states
y : algebraic states
Why NMPC?

NMPC Controller
z = F (z , y , u , p, d )
0 = G (z , y , u , p, d )
Track a profile
Severe nonlinear dynamics (e.g,
sign changes in gains)
Operate process over wide range
(e.g., startup and shutdown)
ysp : set points
Model Updater
z = F (z , y , u , p, d )
0 = G (z , y , u , p, d )
NMPC Subproblem
min
u
s.t .
|| y (t ) y
sp
|| Q y + || u(t k ) u(t k 1 ) || Q u
z (t ) = F ( z (t ), y (t ), u (t ), t )
0 = G ( z (t ), y (t ), u(t ), t )
z (t ) = z init
Bound Constraint s
Other Constraint s
7
Batch Process Optimization

Optimization of dynamic batch process operation resulting from reactor and
distillation column
DAE models:
A+BC
z = f(z, y, u, p)
C+BP+E
g(z, y, u, p) = 0
P+CG
number of states and DAEs: nz + ny
0
zi,I
parameters for equipment design
(reactor, column)
nu control profiles for optimal operation
0
zi,II
0
zi,III
0
zi,IV
zi,If
f
zi,II
f
zi,III
f
zi,IV
Bi
uL u(t) uU
zL z(t) zU
yL y(t) yU
pL p pU
Objective Function: amortized economic function at end of cycle time tf
Constraints:
640
25
C o nsta nt
C o nsta nt
630
Dyn a m ic
Dyn a m ic
20
620
610
15
600
10
590
580
5
0
0.25
0.5 0.75
Tim e (h r.)
1.25
optimal reactor temperature policy
0.5
1.5
2.5
optimal column reflux ratio
Reactor Design Example

Plug Flow Reactor Optimization
The cracking furnace is an important example in the olefin production industry, where various
hydrocarbon feedstocks react. Consider a simplified model for ethane cracking (Chen et al.,
1996). The objective is to find an optimal profile for the heat flux along the reactor in order to
maximize the production of ethylene.
C H
2 4
F
exit
Max
s.t. DAE
Texit 1180K
The reaction system includes six molecules, three free radicals, and seven reactions. The
model also includes the heat balance and the pressure drop equation. This gives a total of
eleven differential equations.
CH +C H CH +C H
C H 2CH
3
26
3
H+CH H + C H
26
2 25
C H +C H CH + CH
25 2 4 36
3
H+C H C H
2 4 25
26
4 25
C H C H + H
2 5
24
2CH C H
25
4 10
2500
2000
1500
3
1000
500
1
0
Heat flux kJ/m2s
Flow rate mol/s
Concentration and Heat Addition Profile
0
0
10
Le ngth m
C2H4
C2H6
log(H2)+12
Dynamic Optimization Approaches
Pontryagin(1962)
DAE Optimization Problem
Variational Approach
Inefficient for constrained
problems
Discretize
controls
Apply a NLP solver
Vassiliadis(1994)
Sequential Approach
Efficient for constrained problems
10
Sequential Approaches - Parameter Optimization
Consider a simpler problem without control profiles:

e.g., equipment design with DAE models - reactors, absorbers, heat exchangers
Min
(z(tf))
z = f(z, p), z (0) = z0

g(z(tf)) 0, h(z(tf)) = 0
By treating the ODE model as a "black-box" a sequential algorithm can be constructed that can
be treated as a nonlinear program.
NLP
Solver
P
ODE
Model
,g,h
z (t)
Gradient
Calculation
Task: How are gradients calculated for optimizer?

11
Gradient Calculation
Perturbation
Sensitivity Equations
Adjoint Equations
Perturbation
Calculate approximate gradient by solving ODE model (np + 1) times
Let = , g and h (at t = tf)
d/dpi = { (pi + pi) - (pi)}/ pi
Very simple to set up
Leads to poor performance of optimizer and poor detection of optimum
unless roundoff error (O(1/pi) and truncation error (O(pi)) are small.
Work is proportional to np (expensive)
12
Direct Sensitivity
From ODE model:
{z = f ( z, p, t ), z (0) = z0 ( p)}
p
z (t )
i = 1, ...np
define si (t ) =
pi
d
f f
z (0)
si = ( si ) =
+
si , si (0) =
dt
pi z
pi
T
(nz x np sensitivity equations)

z and si , i = 1,np, an be integrated forward simultaneously.
for implicit ODE solvers, si(t) can be carried forward in time after converging on z
linear sensitivity equations exploited in ODESSA, DASSAC, DASPK, DSL48s and a
number of other DAE solvers
Sensitivity equations are efficient for problems with many more constraints than
parameters (1 + ng + nh > np)
13
Example: Sensitivity Equations

z1 = z12 + z 22
z 2 = z1 z 2 + z1 pb
z1 = 5, z 2 (0) = pa
s (t ) a , j = z (t ) j / pa , s (t ) b , j = z (t ) j / pb , j = 1,2
sa ,1 = 2 z1 sa ,1 + 2 z 2 sa , 2
sa , 2 = z1 sa , 2 + z 2 sa ,1 + sa ,1 pb
sa ,1 = 0, sa , 2 (0) = 1
sb ,1 = 2 z1 sb ,1 + 2 z 2 sb , 2
sb , 2 = z1 + z1 sb , 2 + z 2 sb ,1 + sb ,1 pb
sb ,1 = 0, sb , 2 (0) = 0
14
Adjoint Sensitivity
Adjoint or Dual approach to sensitivity
Adjoin model to objective function or constraint
( = ,g or h)
tf
= (t f ) T ( z f ( z , p, t ))dt
0
tf
= (t f ) + (0) T z 0 ( p) (t f ) T z (t f ) + z T + T F ( z, p, t ))dt
0
((t)) serve as multipliers on ODEs)

Now, integrate by parts
tf
T
( z (t f ))
z0 ( p)
f
f
d =
(t f ) z (t f ) +
(0) dp + + z (t ) + dp dt
z
z (t f )
p
p
0
T
and find d/dp subject to feasibility of ODEs

Now, set all terms not in dp to zero.
15
Adjoint System
( z (t f ))
f
= (t ), (t f ) =
z
z (t f )
tf
d z0 ( p )
=
(0) + (t ) dt
dp
p
p
0
Integrate model equations forward
Integrate adjoint equations backward and evaluate integral and sensitivities.
Notes:
nz (ng + nh + 1) adjoint equations must be solved backward (one for each
objective and constraint function)
for implicit ODE solvers, profiles (and even matrices) can be stored and
carried backward after solving forward for z as in DASPK/Adjoint (Li and
Petzold)
more efficient on problems where: np > 1 + ng + nh
16
Example: Adjoint Equations

z1 = z12 + z22
z1 = z1 z 2 + z1 pb
z1 = 5, z2 (0) = pa
Form T f ( z , p, t ) = 1 ( z12 + z22 ) + 2 ( z1 z 2 + z1 pb )
( z (t f ))
f
(t ), (t f ) =
z
z (t f )
t
f
f
d z0 ( p )
=
(0) + (t ) dt
dp
p
p
then becomes :
1 = 21 z1 2 ( z 2 + pb ), 1 (t f ) =
2 = 21 z 2 2 z1 , 2 (t f ) =
d (t f )
dpa
d (t f )
dpb
(t f )
z1 (t f )
(t f )
z 2 (t f )
= 1 (0)
tf
= 2 (t )z1 (t )dt
0
17
Example: Hot Spot Reactor

L
Min
TP ,TR , L ,TS
s.t.
= L (T (t ) TS / TR )dt
0
dq
= 0.3(1 q (t )) exp[20 20 / T (t )], q (0) = 0
dt
dT
dq
= 1.5(T (t ) TS / TR ) + 2 / 3 , T (0) = 1
dt
dt
+ feed (TR ,110o C) -H product(TP , T(L)) = 0
TP = 120o C, T(L) = 1 + 10 o C/TR
TR
3:1 B/A
383 K
A + 3B --> C + 3D
L
TP
Ts
TP = specified product temperature

TR = reactor inlet, reference temperature
L = reactor length
Ts = steam sink temperature
q(t) = reactor conversion profile
T(t) = normalized reactor temperature profile
Cases considered:
Hot Spot - no state variable constraints
Hot Spot with T(t) 1.45

18
Hot Spot Reactor: Unconstrained Case

Method: SQP (perturbation derivatives)
L(norm) TR(K) TS(K) TP(K)
Initial:
1.0
462.23 425.26 250
Optimal:
1.25
500
470.1 188.4
13 SQP iterations / 2.67 CPU min. (Vax II)
1.6
1.2
1.5
Normalized Temperature
1.0
Conversion, q
0.8
0.6
0.4
0.2
1.3
1.2
1.1
1.0
0.0
0.0
1.4
0.5
Normalized Length
1.0
1.5
0.0
0.5
1.0
1.5
Normalized Length
Constrained Temperature Case: could not be solved with sequential method

19
Tricks to generalize classes of problems

Variable Final Time (Miele, 1980)
Define
t = pn+1 , 0 1, pn+1 = tf
Let dz/dt = (1/ pn+1) dz/d = f(z, p) dz/d = (pn+1) f(z, p)

Converting Path Constraints to Final Time
Define measure of infeasibility as a new variable, znz+1(t) (Sargent & Sullivan, 1977):
tf
z nz +1 (t f ) = max(0, g j ( z (t ), u (t )) 2 dt
j
or znz +1 (t ) = max(0, g j ( z (t ), u (t )) 2 , z nz +1 (0) = 0

j
Enforce z nz +1 (t f ) (however, constraint is degenerate)
20
Profile Optimization - (Optimal Control)
Optimal Feed Strategy (Schedule) in Batch Reactor

Optimal Startup and Shutdown Policy
Optimal Control of Transients and Upsets
Sequential Approach: Approximate control profile as through parameters (piecewise

constant, linear, polynomial, etc.)
Apply NLP to discretization as with parametric optimization
Obtain gradients through adjoints (Hasdorff; Sargent and Sullivan; Goh and Teo) or
sensitivity equations (Vassiliadis, Pantelides and Sargent; Gill, Petzold et al.)
Variational (Indirect) approach: Apply optimality conditions and solve as boundary

value problem
21
Derivation of Variational Conditions

Indirect Approach
Optimality Conditions (Bound constraints on u(t))
Min (z(tf))
s.t. dz/dt = f(z, u), z (0) = z0
g (z(tf)) 0
h (z(tf)) = 0
a u(t) b
Form Lagrange function - adjoin objective function and constraints:
= ( t f ) + g ( z ( t f )) T + h ( z ( t f )) T v
tf
T ( z f ( z , u )) + aT ( a u ( t )) + bT ( u ( t ) b ) dt
Integrate by parts :
= ( t f ) + g ( z ( t f )) T + h ( z ( t f )) T v + T ( 0 ) z ( 0 ) T ( t f ) z ( t f )
+
tf
T z + T f ( z , u ) + aT ( a u ( t )) + bT ( u ( t ) b ) dt
22
Derivation of Variational Conditions

h
g
=
+
+
v z ( t f ) + T ( 0 ) z ( 0 )
z
z
z
tf
f ( z , u )
f ( z , u )
+
+
+
z
(
t
)
b
a u ( t ) dt 0
At optimum, 0. Since u is the control variable, let all other terms vanish.
z(tf):
h
g
(t f ) = + +
z z
z t =t f
z(0): (0) = 0 (if z(0) is not specified)
H
f
z(t):

Define Hamiltonian, H = Tf(z,u)

For u not at bound: f
H
=
=0
u
u
H
= a b
For u at bounds:
u
H
Upper bound, u(t) = b,
= b 0
u
Ta (a u(t))
bT (u(t) b)
ua u(t) ub
a 0, b 0
Lower bound, u(t) = a,
H
= a 0
u
23
Car Problem
Travel a fixed distance (rest-to-rest) in minimum time.
Min x3(t f )
Min t f
s.t. x1 = x2
s.t. x" = u
x2 = u
a u (t ) b
x3 = 1
x(0) = 0, x(t f ) = L
a u (t ) b
x(0) = 0, x(t f ) = 0
x1 (0) = 0, x1 (t f ) = L
x2 (0) = 0, x2 (t f ) = 0
Hamiltonian : H = 1 x2 + 2u + 3
Adjoints : = 0 ==> (t ) = c
1
2 = 1 ==> 2 (t ) = c2 + c1 (t f t )
= 0 ==> (t ) = 1, (t ) = 1
3
t = 0, c1t f + c2 > 0, u = b
H
= 2 = c2 + c1 (t f t )
u
t = t f , c2 > 0, u = a
Crossover (2 = 0) occurs at t = t s
24
Car Problem
Analytic Variational Solution
b
Optimal Profile
u(t)
From state equations:
1 / 2 bt 2 ,t < t s
x1(t) =
2
1 / 2 bt 2s - a (t s - t f ) , t t s
x2 (t) = bt, t < t s
bts + a (t - t s ), t t s
Problem is linear in u(t). Frequently
these problems have "bang-bang"
character.
For nonlinear and larger problems, the
variational conditions can be solved
numerically as boundary value
problems.
ts
tf
Apply boundary conditions at t = tf:

x1(tf) = 1/2 (b ts2 - a (ts - tf)2) = L
x2(tf) = bts + a (tf - ts) = 0
ts = 2L 1/2
b (1- b / a )
1/ 2
tf =
2L
(1 b / a)
b(1 - b / a )
25
Example: Batch reactor - temperature profile

Maximize yield of B after one hours operation by manipulating a transformed
temperature, u(t).
Minimize -zB(1.0)
s.t.
zA = -(u+u2/2) zA
u
zB = u zA
A
zA(0) = 1
2
zB(0) = 0
u /2
C
u(T(t))
0 u(t) 5
Adjoint Equations:
H = -A(u+u2/2) zA + B u zA
H/u = A (1+u) zA + B zA
A = A(u+u2/2) - B u, A(1.0) = 0
B = 0,
B(1.0) = -1
Cases Considered
1. NLP Approach - piecewise constant and linear profiles.
2. Control Vector Iteration
26
Batch Reactor Optimal Temperature Program

Piecewise Constant
Optimal Profile, u(t)
0.
0.2
0.4
Time, h
0.6
0.8
1.0
Results
Piecewise Constant Approximation with Variable Time Elements
Optimum B/A: 0.57105
27

Piecewise Linear
0.
0.2
0.4
Time, h
0.6
0.8
1.0
Results:
Piecewise Linear Approximation with Variable Time Elements
Optimum B/A: 0.5726
Equivalent # of ODE solutions: 32
28

Indirect Approach
0.
0.2
0.4
Time, h
0.6
0.8
1.0
Results:
Control Vector Iteration with Conjugate Gradients
Optimum (B/A): 0.5732
Equivalent # of ODE solutions: 58
29
Dynamic Optimization - Sequential Strategies

Small NLP problem, O(np+nu) (large-scale NLP solver not required)
Use NPSOL, NLPQL, etc.
Second derivatives difficult to get

Repeated solution of DAE model and sensitivity/adjoint equations, scales with
nz and np
Dominant computational cost
May fail at intermediate points

Sequential optimization is not recommended for unstable systems. State
variables blow up at intermediate iterations for control variables and
parameters.
Discretize control profiles to parameters (at what level?)
Path constraints are difficult to handle exactly for NLP approach
30
Instabilities in DAE Models

This example cannot be solved with sequential methods (Bock, 1983):
dy1/dt = y2
dy2/dt = 2 y1 + (2 2) sin ( t)
The characteristic solution to these equations is given by:
y1(t) = sin ( t) + c1 exp(- t) + c2 exp( t)
y2 (t) = cos ( t) - c1 exp(- t) + c2 exp( t)
Both c1 and c2 can be set to zero by either of the following equivalent

conditions:
IVP
y1(0) = 0, y2 (0) =
BVP
y1(0) = 0, y1(1) = 0
31
IVP Solution
If we now add roundoff errors e1 and e2 to the IVP and BVP conditions, we
see significant differences in the sensitivities of the solutions.
For the IVP case, the sensitivity to the analytic solution profile is seen by
large changes in the profiles y1(t) and y2(t) given by:
y1(t) = sin ( t) + (e1 - e2/) exp(- t)/2

+(e1 + e2/) exp( t)/2
y2 (t) = cos ( t) - ( e1 - e2) exp(- t)/2
+ ( e1 + e2) exp( t)/2
Therefore, even if e1 and e2 are at the level of machine precision (< 10-13), a
large value of and t will lead to unbounded solution profiles.
32
BVP Solution
On the other hand, for the boundary value problem, the errors
affect the analytic solution profiles in the following way:
y1(t) = sin ( t) + [e1 exp()- e2] exp(- t)/[exp() - exp(- )]
+ [e1 exp(- ) - e2] exp( t)/[exp() - exp(- )]
y2(t) = cos ( t) [e1 exp()- e2] exp(- t)/[exp() - exp(- )]
+ [e1 exp(-) - e2] exp( t)/[exp() - exp(- )]
Errors in these profiles never exceed t (e1 + e2), and as a result a
solution to the BVP is readily obtained.
33
BVP and IVP Profiles

e1, e2 = 10-9
Linear BVP solves easily
IVP blows up before midpoint
34

Pontryagin(1962)

problems
Apply a NLP solver
Discretize
controls

Discretize some
state variables
Vassiliadis(1994)
Sequential Approach
Can not handle instabilities properly

Small NLP
Multiple Shooting
Handles instabilities
Larger NLP
35
Multiple Shooting for Dynamic Optimization

Divide time domain into separate regions
Integrate DAEs state equations over each region

Evaluate sensitivities in each region as in sequential approach wrt uij, p and zj
Impose matching constraints in NLP for state variables over each region
Variables in NLP are due to control profiles as well as initial conditions in each region
36
Multiple Shooting
Nonlinear Programming Problem
min (z (t f ), y (t f ) )
s.t.
ui , j , p
z ( z j , u i , j , p , t j +1 ) z j +1 = 0
z k z ( z j , u i, j , p, t k ) z k
y k y ( z j , u i, j , p, t k ) y k
u i u i, j u i
l
pl p pu
dz
= f (z, y, ui, j , p), z(tj ) = z j
dt
g (z,y,u i, j , p ) = 0
z 0 = z (0)
minn f ( x)
s.t c( x) = 0
x L x xu
Solved Implicitly
37
BVP Problem Decomposition

IC
B1
A1
B2
A2
B3
A3
B4
A4
BN
AN
FC
Consider: Jacobian of Constraint Matrix for NLP

bound unstable modes with boundary conditions (dichotomy)
can be done implicitly by determining stable pivot sequences in multiple shooting constraints
approach
well-conditioned problem implies dichotomy in BVP problem (deHoog and Mattheij)
Bock Problem (with t = 50)
Sequential approach blows up (starting within 10-9 of optimum)
Multiple Shooting optimization requires 4 SQP iterations
38
Dynamic Optimization Multiple Shooting Strategies

Larger NLP problem O(np+nu+NE nz)
Use SNOPT, MINOS, etc.
Second derivatives difficult to get
Repeated solution of DAE model and sensitivity/adjoint equations, scales with

nz and np
Dominant computational cost
May fail at intermediate points
Multiple shooting can deal with unstable systems with sufficient time
elements.
Discretize control profiles to parameters (at what level?)
Path constraints are difficult to handle exactly for NLP approach
Block elements for each element are dense!
Extensive developments and applications by Bock and coworkers using
MUSCOD code
39

Pontryagin(1962)

problems
Apply a NLP solver
Discretize
controls

Discretize all
state variables
Vassiliadis(1994)
Sequential Approach
Can not handle instabilities properly

Small NLP
Simultaneous Approach
Handles instabilities
Large NLP
40
Nonlinear Programming Formulation
Nonlinear Dynamic
Optimization Problem
Continuous variables
Collocation on
finite Elements
Discretized variables
Nonlinear Programming
Problem (NLP)
41
Discretization of Differential Equations

Orthogonal Collocation
Given: dz/dt = f(z, u, p), z(0)=given
Approximate z and u by Lagrange interpolation polynomials (order
K+1 and K, respectively) with interpolation points, tk
K
k =0
j =0
jk
z K +1 (t ) = z k " k (t ), " k (t ) =
K
k =1
j =1
j k
u K (t ) = uk " k (t ), " k (t ) =
(t t j )
(t k t j )
(t t j )
(t k t j )
==> z N +1 (t k ) = z k
==> u N (t k ) = u k
Substitute zN+1 and uN into ODE and apply equations at tk.

K
r (t k ) = z j " j (t k ) f ( z k , uk ) = 0, k = 1,...K
j =0
42
Collocation Example
K
k =0
j =0
jk
z K +1 (t ) = z k " k (t ), " k (t ) =
(t t j )
(t k t j )
==> z N +1 (t k ) = z k
t 0 = 0 , t 1 = 0 .21132 , t 2 = 0 .78868
" 0 (t) = (t
" (t) = t/ 3 - 1 / 6
0
" 1(t) = - 8 .195 t 2 + 6 .4483 t , " 1(t) = 6 .4483 - 16 .39 t
" 2 (t) = 2 .19625 t 2 - 0 .4641 t , " 2 (t) = 4 .392 t - 0 .46412
2
- t + 1 )/ 6 ,
Solve z = z 2 - 3 z + 2 , z( 0 ) = 0
==> z 0 = 0
2
z 0 " 0 (t 1 ) + z 1 " 1(t 1 ) + z 2 " 2 (t 1 ) = z 1 - 3 z 1 + 2
( 2 .9857 z 1 + 0 .46412 z 2 = z 1 - 3 z 1 + 2 )
2
z 0 " 0 (t 2 ) + z 1 " 1(t 2 ) + z 2 " 2 (t 2 ) = z 2 - 3 z 2 + 2
2
(- 6 .478 z 1 + 3 z 2 = z 2 - 3 z 2 + 2 )
2
z 0 = 0 , z 1 = 0 .291 ( 0 .319 ), z
= 0 .7384 ( 0 .706 )
z(t) = 1.5337 t - 0 .76303 t 2

43
Converted Optimal Control Problem

Using Collocation
to Nonlinear Program
z N+1 (t)
z(t)
State Profile
(z(tf))
z = f(z, u, p), z(0)=z0
g(z(t), u(t), p) 0
h(z(t), u(t), p) = 0
Min
s.t.
Min (z f )
(t k ) f ( zk , uk ) = 0, z0 = z(0)
j =0
g(zk ,uk ) 0
k = 1,...K
h(zk ,uk ) = 0
z "
j
z "
j =0
(1) z f = 0
t1
t2
t3
tf
r(t)
t1
t2
t3
How accurate is approximation

44
Results of Optimal Temperature Program

Batch Reactor (Revisited)
Results - NLP with Orthogonal Collocation

Optimum B/A - 0.5728
# of ODE Solutions - 0.7 (Equivalent)
45
Collocation on Finite Elements

i 1
dz 1 dz
=
dt hi d
tij = hi + hi j , [0,1]
i =1
dz
= hi f ( z , u )
d
ti
to
q=2
u
K
q=1
z(t) = " q(t) ziq
tf
hi
Discontinuous Algebraic and
Control variables
Mesh points
Continuous Differential variables

element i
Collocation points
Finite element, i
Polynomials
y(t) = "q(t) yiq

q=1
u
K
u(t) = "q(t) uiq

q=1
q=0
r (tik ) = ( zij " j ( k )) hi f ( zik , uik , p) = 0, k = 1,..K, i = 1,.. NE

j =0
46
Nonlinear Programming Problem

min (z f
s.t.
( z "
ij
j =0
minn f ( x)
( k )) hi f ( zik , uik , p) = 0
g (zi ,k , yi ,k ,ui ,k , p ) = 0
K
(z
j =0
i 1, j
" j (1)) zi 0 = 0, i = 2,..NE
(z
j =0
NE , j
s.t c( x) = 0
x L x xu
" j (1)) z f = 0, z10 = z (0)
z i, j z i, j z i, j
l
Finite elements, hi, can also be variable to

determine break points for u(t).
y i , j y i, j y i , j
ui, j ui, j ui, j
pl p pu
Add hu hi 0, hi=tf
Can add constraints g(h, z, u) for

approximation error
47
Hot Spot Reactor Revisited

L
Min
TP ,TR , L ,TS
s.t.
= L (T (t ) TS / TR )dt
0
dq
= 0.3(1 q (t )) exp[20 20 / T (t )], q (0) = 0
dt
dT
dq
= 1.5(T (t ) TS / TR ) + 2 / 3 , T (0) = 1
dt
dt
+ feed (TR ,110o C) -H product(TP , T(L)) = 0
TP = 120o C, T(L) = 1 + 10 o C/TR
TR
3:1 B/A
383 K
A + 3B --> C + 3D
L
TP
Ts
TP = specified product temperature

TR = reactor inlet, reference temperature
L = reactor length
Ts = steam sink temperature
q(t) = reactor conversion profile
T(t) = normalized reactor temperature profile
Cases considered:
Hot Spot - no state variable constraints
Hot Spot with T(t) 1.45

48
Base Case Simulation

Method: OCFE at initial point with 6 equally spaced elements
L(norm) TR(K)
TS(K)
TP(K)
Base Case:
1.0
462.23 425.26 250
1.8
integrated profile
collocation
1.6
Temperature
Conversion
integrated profile
collocation
1.4
1.2
0
0.0
1.0
0.2
0.4
0.6
Normalized Length
0.8
1.0
1.2
0.0
0.2
0.4
0.6
0.8
1.0
1.2
Normalized Length
49
Unconstrained Case
Method: OCFE combined formulation with rSQP
identical to integrated profiles at optimum
L(norm) TR(K)
TS(K)
TP(K)
Initial:
1.0
462.23 425.26 250
Optimal:
1.25
500
470.1
188.4
123 CPU s. (Vax II)
= -171.5
1.6
1.0
1.5
Normalized Temperature
1.2
Conversion, q
0.8
0.6
0.4
0.2
0.0
0.0
0.5
1.0
Normalized Length
1.5
1.4
1.3
1.2
1.1
1.0
0.0
0.5
1.0
1.5
Normalized Length
50
Temperature Constrained Case
T(t) 1.45
Method: OCFE combined formulation with rSQP,

identical to integrated profiles at optimum
L(norm) TR(K)
TS(K)
Initial:
1.0
462.23
425.26
Optimal:
1.25
500
450.5
TP(K)
250
232.1
57 CPU s. (Vax II), = -148.5

1.2
1.5
Temperature
1.0
Conversion
0.8
0.6
0.4
1.3
1.2
1.1
0.2
0.0
0.0
1.4
0.5
Normalized Length
1.0
1.5
1.0
0.0
0.5
1.0
1.5
Normalized Length
51
Theoretical Properties of Simultaneous Method

A. Stability and Accuracy of Orthogonal Collocation
Equivalent to performing a fully implicit Runge-Kutta integration of

the DAE models at Gaussian (Radau) points
2K order (2K-1) method which uses K collocation points
Algebraically stable (i.e., possesses A, B, AN and BN stability)
B. Analysis of the Optimality Conditions
An equivalence has been established between the Kuhn-Tucker

conditions of NLP and the variational necessary conditions
Rates of convergence have been established for the NLP method
52
Simultaneous DAE Optimization

Case Studies
Reactor - Based Flowsheets

Fed-Batch Penicillin Fermenter
Temperature Profiles for Batch Reactors
Parameter Estimation of Batch Data
Synthesis of Reactor Networks
Batch Crystallization Temperature Profiles
Grade Transition for LDPE Process
Ramping for Continuous Columns
Reflux Profiles for Batch Distillation and Column Design
Source Detection for Municipal Water Networks
Air Traffic Conflict Resolution
Satellite Trajectories in Astronautics
Batch Process Integration
Optimization of Simulated Moving Beds
53
Production of High Impact Polystyrene (HIPS)

Startup and Transition Policies (Flores et al., 2005a)
Monomer,
Transfer/Term.
agents
Coolant
Catalyst
Polymer
54
Phase Diagram of Steady States

Transitions considered among all steady state pairs
600
650
1: Qi = 0.0015
2: Qi = 0.0025
Upper SteadyState
3: Qi = 0.0040
A5
N3
550
600
System State
550
500
Temperature (K)
Temperature (K)
Medium SteadyState
450
400
A1
1
500
1: Qcw = 10
2: Qcw = 1.0
3: Qcw = 0.1
450
N2
Lower SteadyState
400
A2
N2
350
A3
350
A4
Bifurcation
Parameter
3
B1
N1
300
B2
C1
B3
C2
300
0.5
1.5
2
2.5
3
Cooling water flowrate (L/s)
3.5
4.5
0.02
0.04
0.06
Initiator flowrate (L/s)
0.08
0.1
0.12
55
x 10
0.5
0.5
400
1
Time [h]
1.5
Cooling water Flow. [l/sec]Jacket Temp. [K] Monomer Conc. [mol/l]
Initiator Flow. [l/sec] Reactor Temp. [K] Initiator Conc. [mol/l]
Startup to Unstable Steady State

10
8
6
4
0.5
1
Time [h]
1.5
0.5
1
Time [h]
1.5
0.5
1
Time [h]
1.5
320
310
350
300
300
1.5
3
0
x 10
0.5
1
Time [h]
1.5
290
1
0.5
Feedrate Flow. [l/sec]
0.5
0
20
40
60
Time [h]
80
100
2
1
0
0.5
1
Time [h]
1.5
926 variables
476 constraints
36 iters. / 0.95 CPU s (P4)
56
HIPS Process Plant (Flores et al., 2005b)
Many grade transitions considered with stable/unstable pairs

1-6 CPU min (P4) with IPOPT
Study shows benefit for sequence of grade changes to
achieve wide range of grade transitions.
57
Batch Distillation Optimization Case Study - 1

R(t)
D(t), x
dx i,d
dt
dx i,0
V(t)
xb
y i, N - x i, d
H cond
x i, d - x i,0
dt
S
dS
-V
=
dt R + 1
=
*DXJHHIIHFWRIFROXPQKROGXSV
2YHUDOOSURILWPD[LPL]DWLRQ
0DNH7UD\&RXQW&RQWLQXRXV
58
2SWLPL]DWLRQ&DVH6WXG\
0RGHOLQJ$VVXPSWLRQV
,GHDO7KHUPRG\QDPLFV
1R/LTXLG7UD\+ROGXS
1R9DSRU+ROGXS
)RXUFRPSRQHQWPL[WXUH
6KRUWFXWVWHDG\VWDWHWUD\PRGHO
)HQVNH8QGHUZRRG*LOOLODQG
&DQEHVXEVWLWXWHGE\PRUHGHWDLOHGVWHDG\VWDWHPRGHOV
)UHGHQVOXQG DQG*DOLQGH]$OND\D
2SWLPL]DWLRQ3UREOHPV&RQVLGHUHG
(IIHFWRI&ROXPQ+ROGXS+FRQG
7RWDO3URILW0D[LPL]DWLRQ
59
0D[LPXP'LVWLOODWH3UREOHP
30
Reflux Ratio
20
With Holdup
No Holdup
&RPSDULVRQRIRSWLPDOUHIOX[SURILOHV
ZLWKDQGZLWKRXWKROGXS+FRQG
10
0
0.0
0.2
0.4
0.6
0.8
1.0
Time ( hours )
0.98
Distillate Composition
&RPSDULVRQRIGLVWLOODWH
SURILOHVZLWKDQGZLWKRXW
KROGXS+FRQGDW
RYHUDOOSXULW\
0.97
0.96
0.95
With Holdup
No Holdup
0.94
0.93
0.92
0.91
0.0
0.2
0.4
0.6
Time ( hours )
0.8
1.0
60
%DWFK'LVWLOODWLRQ3URILW0D[LPL]DWLRQ
Max {Net Sales(D, S0)/(tf +Tset) TAC(N, V)}
N = 20 trays, Tsetup = 1 hour
xd = 0.98, xfeed = 0.50, = 2
Cprod/Cfeed = 4.1
V = 120 moles/hr, S = 100 moles.
0
30
Reflux Ratio
20
N = 20
N = 33.7
10
0
0
2
Time ( hours )
4
61
%DWFK'LVWLOODWLRQ 2SWLPL]DWLRQ&DVH6WXG\
R(t)
D(t), x
dx1,N +1
V
=
[y - x , ]
dt
H N+1 1,N 1 N +1
dx1,p
dt
R
x
, p = 1,...,N
y1, p-1 - y1 , p + R+1
x
1,p+1
1,p
Hp
V(t)
xb
dx1,0 V
R
x
=
x1,0 - y1,0 + R+1
x
1,1
1,0
S
dt
dD
V
=
dt R + 1
N+1
N+1
S 0x 0i,0 = S0 - Hp x i,0 + H p x i,p
p=1
p=1
C
xi,p = 1.0
1
yi,p = 1.0
1
Ideal Equilibrium Equations

yi,p = Ki,p xi,p
Binary Column (55/45, Cyclohexane, Toluene)
S0 = 200, V = 120, Hp = 1, N = 10, ~8000 variables,
< 2 CPU hrs. (Vaxstation 3200)
62
0RGHOLQJ$VVXPSWLRQV
,GHDO7KHUPRG\QDPLFV
&RQVWDQW7UD\+ROGXS
1R9DSRU+ROGXS
%LQDU\0L[WXUHWROXHQHF\FORKH[DQH
KRXURSHUDWLRQ
7RWDO5HIOX[,QLWLDO&RQGLWLRQ
&DVHV&RQVLGHUHG
&RQVWDQW&RPSRVLWLRQRYHU7LPH
6SHFLILHG&RPSRVLWLRQDW)LQDO7LPH
%HVW&RQVWDQW5HIOX[3ROLF\
3LHFHZLVH&RQVWDQW5HIOX[3ROLF\
63
0.006
0.005
0.004
0.003
0.002
Reflux
Purity
0.001
0
0.0
Top Plate Purity, (1-x)
Reflux Policy
5HIOX[2SWLPL]DWLRQ&DVHV
&RQVWDQW3XULW\RYHU7LPH
[W
' WI
0.000
0.2
0.4
0.6
0.8
1.0
Time (hrs.)
2YHUDOO'LVWLOODWH3XULW\
6KRUWFXW&RPSDULVRQ
' WI
Purity
0.008
6
0.006
4
0.004
0
0.0
0.002
Distillate Purity, (1-x)
' WI
Reflux
Reflux Policy
[GW95GW'WI
0.010
0.000
0.2
0.4
0.6
0.8
1.0
Time (hrs.)
64
5HIOX[2SWLPL]DWLRQ&DVHV
0.004
0.30
Reflux
Purity
0.26
0.002
0.24
0.001
0.22
0.003
Reflux Policy
0.28
&RQVWDQW5HIOX[RYHU7LPH
[GW95GW'WI
' WI
0.000
0.20
0.0
0.2
0.4
0.6
0.8
1.0
Time (hrs.)
' WI
Reflux
4
0.006
3
0.004
2
0.002
[GW95GW'WI
Purity
0.008
Reflux Policy
3LHFHZLVH&RQVWDQW
5HIOX[RYHU7LPH
0.010
0.000
0.0
0.2
0.4
0.6
0.8
1.0
Time (hrs.)
65
Batch Reactive Distillation Case Study 3

Reversible reaction between acetic acid and ethanol
CH3COOH + CH3CH2OH l CH3COOCH2CH3 + H2O
t = 0, x = 0.25
for all components
R(t)
D(t), x
ttf ==11
max
max 0
Ddt
Ddt
s.t. DAE
Ester 0.4600
xxEster
0.4600
DD
V(t)
xb
Wajde & Reklaitis (1995)66
0RGHOLQJ$VVXPSWLRQV
,GHDO7KHUPRG\QDPLFV
&RQVWDQW7UD\+ROGXS
1R9DSRU+ROGXS
7HUWLDU\0L[WXUH(W2++2$F(7$F+2
&ROG6WDUW,QLWLDO&RQGLWLRQ
&DVHV&RQVLGHUHG
6SHFLILHG&RPSRVLWLRQDW)LQDO7LPH
2SWLPXP5HIOX[3ROLF\
9DULRXV7UD\V&RQVLGHUHG
KRXURSHUDWLRQ
67
Batch Reactive Distillation

Optimal Reflux Profiles
Condenser Composition (8 trays)

0.50
80
0.40
M. fraction
Reflux Ratio
100
60
40
0.30
0.20
0.10
20
0.00
0.0
0
0.0
0.2
0.4
0.6
0.8
1.0
YDULDEOHV
'$(V
GHJUHHVRIIUHHGRP
ILQLWHHOHPHQWV
,3237LWHUDWLRQV
&38PLQXWHV
Ester
0.6
0.8
1.0
Eth.
Wat.
A. A.
25 trays
Distillate Composition
Ethyl Acetate
15 trays
0.4
time(hr)
time (hr)
8 trays
0.2
0.45
0.35
0.25
0.0
0.2
0.4
0.6
0.8
1.0
time (hr)
8 trays
15 trays
25 trays
68
Batch Reactive Distillation

DAEs
Discretized
Variables
Iterations
98
1788
15
168
25
258
CPU (s)
Global
Elemental
14
56.4
37.2
3048
32
245.7
207.5
4678
45
1083.2
659.3
CPU Decomposition Time

21
CPU(s)/iter
Trays
16
11
6
1
90
130
170
DAEs
210
250
69
Nonlinear Model Predictive Control (NMPC)

NMPC Estimation and Control
d : disturbances
u : manipulated
variables
Process
z : differential states
y : algebraic states
Why NMPC?

NMPC Controller
z = F (z , y , u , p, d )
0 = G (z , y , u , p, d )
Track a profile
Severe nonlinear dynamics (e.g,
sign changes in gains)
Operate process over wide range
(e.g., startup and shutdown)
ysp : set points
Model Updater
z = F (z , y , u , p, d )
0 = G (z , y , u , p, d )
NMPC Subproblem
min
u
s.t .
|| y (t ) y
sp
|| Q y + || u(t k ) u(t k 1 ) || Q u
z (t ) = F ( z (t ), y (t ), u (t ), t )
0 = G ( z (t ), y (t ), u(t ), t )
z (t ) = z init
Bound Constraint s
Other Constraint s
70
Dynamic optimization in a
MATLAB Framework
Dynamic Optimization
Problem
Dynamic Process
Process Model
f (x , x, y , u, p, t ) = 0
JOBKENNUNG : C:\Krawal-m odular\IGCC\IGCC_Puertollano_kom plett.gek
g (x , y, u, p, t ) = 0
Discretization
Method
No. of Time
Elements
Collocation
Order
Inequality Constraints
bar
kJ/kg
kg/s
C (X)
Alle Drcke sind absolut
P..Druck..bar
Saturator-System
M ..M assenstrom ..kg/s

PHI..Luft-Feuchte..%
Wrmeschaltplan
Nr. -F Ref T Erlangen, 13.Oct.1999
SIEMENS AG
F Ref T
Initial Conditions
x (t0 ) = x0
Constraints at Final Time
(x (t f ), x(t f ), y (t f ), u (t f ), x 0 , p, t f ) = 0
Objective Function
min P(x(t
Process Model
f (x , y , u , p, t f , x 0 ) = 0
g (x , y , u , p, t f ) = 0
Inequality Constraints
h (x, x, y, u, p, t ) 0
copy ofDesign
T ..T em peratur..C
H..Enthalpie..kJ/kg
In Bearbeitung
NLP Optimization
Problem
Full
Discretization
of State and
Control
Variables
h (x , y , u , p, t f ) 0
Constraints at Final Time
(x N , y N , u N , p, t f ) = 0
t
Objective Function
min P x N t , y N t , u N t , p, t f
), y (t f ), u(t f ), p, x 0 , t f )
u (t),p ,x 0 ,t f
71
Tennessee Eastman Process
Unstable Reactor
11 Controls; Product, Purge streams
Model extended with energy balances
72
Tennessee Eastman Challenge Process
DAE Model
Number of differential equations
NLP Optimization problem

30
Number of algebraic variables
152
Number of algebraic equations
141
Difference (control variables)
11
Number of variables
of which are fixed
10920
0
Number of constraints
10260
Number of lower bounds
780
Number of upper bounds
540
Number of nonzeros in Jacobian
49230
Number of nonzeros in Hessian
14700
Method of Full Discretization of State and Control Variables

Large-scale Sparse block-diagonal NLP
73
Setpoint change studies

Process variable
Type
Magnitude
Step
-15%
Make a step change to the variable(s) used to set
the process production rate so that the product
flow leaving the stripper column base changes
from 14,228 to 12,094 kg h-1
Reactor operating pressure

Step
change
-60 kPa
Make a step change so that the reactor operating
pressure changes from 2805 to 2745 kPa
Purge gas composition of

component B change
+2%
Make a step change so that the composition of
component B in the gas purge changes from
13.82 to 15.82%
Production rate change
Step
Setpoint changes for the base case [Downs & Vogel]
74
Case Study:
Change Reactor pressure by 60 kPa
Control profiles
All profiles return to their
base case values
Same production rate
Same product quality
Same control profile
Lower pressure leads to
larger gas phase (reactor)
volume
Less compressor load
75
TE Case Study Results I

Shift in TE process
Same production rate
More volume for reaction
Same reactor temperature
Initially less cooling water flow
(more evaporation)
76
Case Study- Results II

Shift in TE process
Shift in reactor effluent to more
condensables
Increase cooling water flow
Increase stripper steam to
ensure same purity
Less compressor work
77
Case Study:
Change Reactor Pressure by 60 kPa
Optimization with IPOPT

1000 Optimization Cycles
5-7 CPU seconds
11-14 Iterations
Optimization with SNOPT
Often failed due to poor
conditioning
Could not be solved within
sampling times
> 100 Iterations
78
Optimization as a Framework for Integration

+ Directly handles interactions, multiple conditions
+ Trade-offs unambiguous, quantitative
- Larger problems to solve

- Need to consider a diverse process models
Research Questions
How should diverse models be integrated?
Is further algorithmic development needed?
79
Batch Integration Case Study

Plant Design
Dynamic Processing
R
T
Time
Time
Production Planning
Stage 1
A
Stage 2
B
A
C
B
:KDWDUHWKH,QWHUDFWLRQVEHWZHHQ'HVLJQ
DQG'\QDPLFVDQG3ODQQLQJ"
:KDWDUHWKHGLIIHUHQFHVEHWZHHQ6HTXHQWLDODQG
6LPXOWDQHRXV6WUDWHJLHV"
(VSHFLDOO\,PSRUWDQWLQ%DWFK6\VWHPV
80
6LPXOWDQHRXV'\QDPLF2SWLPL]DWLRQ
Conv.
Dynamic Processing
Same conversion T
in reduced time
Shorter
processing times
Time
Conv.
Time
Higher conversion T
in same time
Fewer
product batches
Time
Time
Best constant
Best transient
Production Planning
Stage 1
Stage 2
B
A
C
B
Shorter Planning
Horizon
GLVFUHWL]H '$(VVWDWHDQGFRQWUROSURILOHV

ODUJHVFDOHRSWLPL]DWLRQSUREOHP
KDQGOHVSURILOHFRQVWUDLQWVGLUHFWO\
LQFRUSRUDWHVHTXLSPHQWYDULDEOHVGLUHFWO\
'$(PRGHOVROYHGRQO\RQFH
FRQYHUJHVIRUXQVWDEOHV\VWHPV
81
Scheduling Formulation
VHTXHQFLQJRIWDVNVSURGXFWVHTXLSPHQW
H[SHQVLYHGLVFUHWHFRPELQDWRULDORSWLPL]DWLRQ
FRQVLGHULGHDOWUDQVIHUSROLFLHV8,6DQG=:
FORVHGIRUPUHODWLRQV%LUHZDU DQG*URVVPDQQ
8QOLPLWHG,QW6WRUDJH8,6
stage 2
6KRUWSURGXFWLRQF\FOH
&\FOHWLPHLQGHSHQGHQWRIVHTXHQFH
stage I
stage I
=HUR:DLW=:
,PPHGLDWHWUDQVIHUUHTXLUHG
6ODFNWLPHVGHSHQGHQWRQSDLU
/RQJHUSURGXFWLRQF\FOHUHTXLUHG
stage 2
82
Case Study Example

4 stages, 3 products of different purity
Dynamic reactor - temperature profile
Dynamic column - reflux profile
A+B C
C +B P+E
P+C G
z i0,I
z i0,II
z i0,III
z i0,IV
z i ,II
z if,III
z i ,I
z if,IV
Bi
Process Optimization Cases

SQ - Sequential Design - Scheduling - Dynamics
SM - Simultaneous Design and Scheduling
Dynamics with endpoints fixed .
SM* - Simultaneous Design, Scheduling and Dynamics
83
6FHQDULRVLQ&DVH6WXG\
&RPSDULVRQRI'\QDPLFYV%HVW&RQVWDQW3URILOHV
5 EHVWFRQVWDQWWHPSHUDWXUHSURILOH
5 RSWLPDOWHPSHUDWXUHSROLF\
& EHVWFRQVWDQWUHIOX[UDWLR
& RSWLPDOUHIOX[UDWLR
640
25
C o nsta nt
C o nsta nt
630
Dyn a m ic
Dyn a m ic
20
620
610
15
600
10
590
580
5
0
0.25
0.5 0.75
Tim e (h r.)
1.25
0.5
1.5
2.5
84
Results for Simultaneous Cases

2 00 00
R0/R1
best constant /
optimal temperature
Pro fit ($)
1 50 00
1 00 00
C0/C1
best constant /
optimal reflux ratio
5 00 0
0
R 0 C 0
R 1 C 0
R 0 C 1
R 1 C 1
C a se s
S e q u e n t ia l
S im u lt a n e o u s
w it h F I X E D s t a t e s
64
S Im u lt a n e o u s
w it h F R E E s t a t e s
I
I I
I I I
60
I V
60
I V
I
I I
I I I
I
I I
I I I
I V
- ZW schedule becomes tighter

- less dependent on product sequences
85
Summary
Sequential Approaches
- Parameter Optimization
Gradients by: Direct and Adjoint Sensitivity Equations
- Optimal Control (Profile Optimization)
Variational Methods
NLP-Based Methods
- Require Repeated Solution of Model
- State Constraints are Difficult to Handle
Simultaneous Approach
- Discretize ODE's using orthogonal collocation on finite elements (solve larger optimization problem)
- Accurate approximation of states, location of control discontinuities through element placement.
- Straightforward addition of state constraints.
- Deals with unstable systems
Simultaneous Strategies are Effective
- Directly enforce constraints
- Solve model only once
- Avoid difficulties at intermediate points
Large-Scale Extensions
- Exploit structure of DAE discretization through decomposition
- Large problems solved efficiently with IPOPT
86
DAE Optimization Resources

References
Bryson, A.E. and Y.C. Ho, Applied Optimal Control, Ginn/Blaisdell, (1968).
Himmelblau, D.M., T.F. Edgar and L. Lasdon, Optimization of Chemical
Processes, McGraw-Hill, (2001).
Ray. W.H., Advanced Process Control, McGraw-Hill, (1981).
Software
Dynamic Optimization Codes

ACM Aspen Custom Modeler
DynoPC - simultaneous optimization code (CMU)
COOPT - sequential optimization code (Petzold)
gOPT - sequential code integrated into gProms (PSE)
MUSCOD - multiple shooting optimization (Bock)
NOVA - SQP and collocation code (DOT Products)
Sensitivity Codes for DAEs

DASOLV - staggered direct method (PSE)
DASPK 3.0 - various methods (Petzold)
SDASAC - staggered direct method (sparse)
DDASAC - staggered direct method (dense)
87
DynoPC Windows Implementation

Model in FORTRAN
Data/Config.
Outputs/Graphic & Texts
Interface of DynoPC
F2c/C++
Redued SQP
ADOL_C
Optimal Solution
Preprocessor
Calc. of independent
variable move
F/DF
Interior Point
For QP
DDASSL
F/DF
Decomposition
COLDAE
Simulator/Discretizer
Line Search
Starting Point
Linearization
FILTER
88
Example: Batch Reactor Temperature
Max
b (t f )
s.t.
E1
da
= k1 exp(
)a
at
RT
E1
E2
db
= k1 exp(
) a k 2 exp(
)b
at
RT
RT
a+b+c =1
89
Example: Car Problem

Min
s.t.
tf
z1 = z2
z2 = u
z2 zmax
-2 u 1
12
10
Velocity
Acceleration
-1
Velocity
Acceleration, u(t)
subroutine model(nz,ny,nu,np,t,z,dmz,y,u,p,f)
double precision t, z(nz),dmz(nz), y(ny),u(nu),p(np)
double precision f(nz+ny)
-2
2
-3
10
20
30
40
f(1) = p(1)*z(2) - dmz(1)

f(2) = p(1)*u(1) - dmz(2)
return
end
Time
90
4.5E-03
60
4.0E-03
3.5E-03
50
3.0E-03
2.5E-03
40
30
2.0E-03
1.5E-03
1.0E-03
5.0E-04
20
0.0E+00
T ja c k e t (C )
C ry s ta l s iz e
Example: Crystallizer Temperature
10
10
15
20
Time (hr)
Crystal Size
T jacket
TT
TC
Maximize crystal size

at final time
Steam
Cooling
water
Control variable = Tjacket = f(t)?
SUBROUTINE model(nz,ny,nu,np,x,z,dmz,y,u,p,f)
implicit double precision (a-h,o-z)
double precision f(nz+ny),z(nz),dmz(nz),Y(ny),yp(4),u(1)
double precision kgr, ln0, ls0, kc, ke, kex, lau, deltT, alpha
dimension a(0:3), b(0:3)
data alpha/1.d-4/,a/-66.4309d0, 2.8604d0, -.022579d0, 6.7117d-5/,
+ b/16.08852d0, -2.708263d0, .0670694d0, -3.5685d-4/, kgr/ 4.18d-3/,
+ en / 1.1d0/, ln0/ 5.d-5/, Bn / 3.85d2/, em / 5.72/, ws0/ 2.d0/,
+ Ls0/ 5.d-4 /, Kc / 35.d0 /, Kex/ 65.d0/, are/ 5.8d0 /,
+ amt/ 60.d0 /, V0 / 1500.d0/, cw0/ 80.d0/,cw1/ 45.d0/,v1 /200.d0/,
+
tm1/ 55.d0/,x6r/0.d0/, tem/ 0.15d0/,clau/ 1580.d0/,lau/1.35d0/,
+ cp/ 0.4d0 /,cbata/ 1.2d0/, calfa/ .2d0 /, cwt/ 10.d0/
ke = kex*area
x7i = cw0*lau/(100.d0-cw0)
v = (1.d0 - cw0/100.d0)*v0
w = lau*v0
yp(1) = (deltT + dsqrt(deltT**2 + alpha**2))*0.5d0
yp(2) = (a(0) + a(1)*yp(4) + a(2)*yp(4)**2 + a(3)*yp(4)**3)
yp(3) = (b(0) + b(1)*yp(4) + b(2)*yp(4)**2 + b(3)*yp(4)**3)
deltT = yp(2) - z(8)
yp(4) = 100.d0*z(7)/(lau+z(7))
f(1) = Kgr*z(1)**0.5*yp(1)**en - dmz(1)
f(2) = Bn*yp(1)**em*1.d-6 - dmz(2)
f(3) = ((z(2)*dmz(1) + dmz(2) * Ln0)*1.d+6*1.d-4) - dmz(3)
f(4) = (2.d0*cbata*z(3)*1.d+4*dmz(1)+dmz(2)*Ln0**2*1.d+6)-dmz(4)
f(5) = (3.d0*calfa*z(4)*dmz(1)+dmz(2)*Ln0**3*1.d+6) - dmz(5)
f(6) = (3.d0*Ws0/(Ls0**3)*z(1)**2*dmz(1)+clau*V*dmz(5))-dmz(6)
f(7) = -dmz(6)/V - dmz(7)
f(8) = (Kc*dmz(6) - Ke*(z(8) - u(1)))/(w*cp) - dmz(8)
f(9) = y(1)+YP(3)- u(1)
return
end
91

Nonlinear Programming Concepts PDF

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Nonlinear Programming Concepts PDF

Uploaded by

Copyright:

Available Formats

Nonlinear Programming:

Concepts, Algorithms and

Nonlinear Programming and Process Optimization

Karush Kuhn-Tucker Conditions

Black Box Optimization

Large-Scale Nonlinear Programming

Sensitivity Analysis for NLP Solutions

Summary and Conclusions

Some related areas:

4. Nocedal, J. and S. Wright, Numerical Optimization, Springer, 1998

Optimization in Design, Operations and Control

Unconstrained Multivariable Optimization

Equivalent to: Max -f(x), x Rn

Example: Optimal Vessel Dimensions

Convert to Unconstrained (Eliminate L)

==> L/D = CT/CS

Two Dimensional Contours of F(x)

Local vs. Global Solutions

f( y + (1-) z) f(y) + (1-)f(z)

for any , 0 1, at all points y and z in X.

Linear Algebra - Background

Singular Matrix: det (A) = 0

Linear Algebra - Background

Hessian Matrix (2f(x) - Symmetric)

Linear Algebra - Background

Some Identities for Determinant

Eigenvalues: det(A- I) = 0, Eigenvector: Av = v

Linear Algebra - Eigenvalues

z = VTx and Q(Vz) = (aTV) z + zT (VTAV)z

Minimum occurs at (if i > 0) x = -A-1a

Positive (or Negative) Curvature

All eigenvalues are positive

Comparison of Optimization Methods

Three Dimensional Surface and Curvature for Representative Test Problem

Regions where minimum

What conditions characterize an optimal solution?

Unconstrained Local Minimum

For smooth functions, why are contours around optimum elliptical?

Basic Newton Algorithm - Line Search

Newtons Method - Convergence Path

Newtons Method - Notes

Steepest descent Linear Rate : limk

Quasi-Newton Method - BFGS

Sources For Unconstrained Software

Example: Minimize Packing Dimensions

What is the smallest box for three round objects?

Characterization of Constrained Optima

Convex Objective Functions

What conditions characterize an optimal solution?

Unconstrained Local Minimum

Unconstrained Local Minimum

Optimal solution for inequality constrained problem

Optimal solution for general constrained problem

Optimality conditions for local optimum

L (x*, u, v) = f(x*) + g(x*) u + h(x*) v = 0

Necessary (Sufficient) Second Order Conditions

Single Variable Example of KKT Conditions

Consider three cases:

Upper bound is active, x = a, u1 = -2a, u2 = 0

Second order conditions (x*, u1, u2 =0)

Single Variable Example

Consider three cases:

L (x, u, v) = f(x) + g(x) u + h(x) v = 0

If KKT error less than tolerance: ||L(x)|| , ||h(x)|| ,