You are on page 1of 224

Nonlinear Programming:

Concepts, Algorithms and


Applications
L. T. Biegler
Chemical Engineering Department
Carnegie Mellon University
Pittsburgh, PA

Nonlinear Programming and Process Optimization


Introduction
Unconstrained Optimization

Algorithms
Newton Methods
Quasi-Newton Methods

Constrained Optimization

Karush Kuhn-Tucker Conditions


Special Classes of Optimization Problems
Reduced Gradient Methods (GRG2, CONOPT, MINOS)
Successive Quadratic Programming (SQP)
Interior Point Methods

Process Optimization

Black Box Optimization


Modular Flowsheet Optimization Infeasible Path
The Role of Exact Derivatives

Large-Scale Nonlinear Programming

Data Reconciliation
Real-time Process Optimization

Further Applications

Sensitivity Analysis for NLP Solutions


Multiperiod Optimization Problems

Summary and Conclusions


2

Introduction
Optimization: given a system or process, find the best solution to
this process within constraints.
Objective Function: indicator of "goodness" of solution, e.g., cost,
yield, profit, etc.
Decision Variables: variables that influence process behavior and
can be adjusted for optimization.

In many cases, this task is done by trial and error (through case
study). Here, we are interested in a systematic approach to this
task - and to make this task as efficient as possible.

Some related areas:


- Math programming
- Operations Research
Currently - Over 30 journals devoted to optimization with roughly
200 papers/month - a fast moving field!
3

Optimization Viewpoints
Mathematician - characterization of theoretical properties
of optimization, convergence, existence, local
convergence rates.
Numerical Analyst - implementation of optimization
method for efficient and "practical" use. Concerned with
ease of computations, numerical stability, performance.
Engineer - applies optimization method to real problems.
Concerned with reliability, robustness, efficiency,
diagnosis, and recovery from failure.

Optimization Literature
Engineering
1. Edgar, T.F., D.M. Himmelblau, and L. S. Lasdon, Optimization of Chemical
Processes, McGraw-Hill, 2001.
2. Papalambros, P. and D. Wilde, Principles of Optimal Design. Cambridge Press,
1988.
3. Reklaitis, G., A. Ravindran, and K. Ragsdell, Engineering Optimization, Wiley, 1983.
4. Biegler, L. T., I. E. Grossmann and A. Westerberg, Systematic Methods of Chemical
Process Design, Prentice Hall, 1997.

Numerical Analysis
1. Dennis, J.E. and R. Schnabel, Numerical Methods of Unconstrained Optimization,
Prentice-Hall, (1983), SIAM (1995)
2. Fletcher, R. Practical Methods of Optimization, Wiley, 1987.
3. Gill, P.E, W. Murray and M. Wright, Practical Optimization, Academic Press,

1981.

4. Nocedal, J. and S. Wright, Numerical Optimization, Springer, 1998

Motivation
Scope of optimization
Provide systematic framework for searching among a specified
space of alternatives to identify an optimal design, i.e., as a
decision-making tool

Premise
Conceptual formulation of optimal product and process design
corresponds to a mathematical programming problem

min f ( x , y )
st h ( x , y ) = 0
g ( x, y ) 0
x R n y {0, 1}ny
6

Optimization in Design, Operations and Control


MILP

MINLP

Global

LP,QP

NLP

SA/GA

HENS

MENS

Separations

x
x

Reactors

Equipment
Design

Flowsheeting

Scheduling

Supply Chain

Real-time
optimization

Linear MPC

Nonlinear
MPC
Hybrid

x
x

x
x

Unconstrained Multivariable Optimization


Problem:

Min

f(x)

(n variables)

Equivalent to: Max -f(x), x Rn


Nonsmooth Functions
- Direct Search Methods
- Statistical/Random Methods
Smooth Functions
- 1st Order Methods
- Newton Type Methods
- Conjugate Gradients

Example: Optimal Vessel Dimensions


What is the optimal L/D ratio for a cylindrical vessel?
Constrained Problem
(1)

D2

Min CT
+ CS DL = cost
2

s.t.

D 2 L
V 4

= 0

Convert to Unconstrained (Eliminate L)

D
Min CT
2

+ CS

4V
= cost
D

d(cost)
4VC s
= CT D = 0
2
dD
D
4V CS
D =
CT

1/ 3

(2)

2/3

4V 1/3 C T

L =
CS

==> L/D = CT/CS


Note:
What if L cannot be eliminated in (1) explicitly? (strange shape)
What if D cannot be extracted from (2)?
(cost correlation implicit)

Two Dimensional Contours of F(x)


Convex Function

Discontinuous

Nonconvex Function

Multimodal, Nonconvex

Nondifferentiable (convex)
10

Local vs. Global Solutions

Find a local minimum point x* for f(x) for feasible region defined by
constraint functions: f(x*) f(x) for all x satisfying the constraints in
some neighborhood around x* (not for all x X)
Finding and verifying global solutions will not be considered here.
Requires a more expensive search (e.g. spatial branch and bound).
A local solution to the NLP is also a global solution under the
following sufficient conditions based on convexity.
f(x) is convex in domain X, if and only if it satisfies:

f( y + (1-) z) f(y) + (1-)f(z)

for any , 0 1, at all points y and z in X.

11

Linear Algebra - Background


Some Definitions
Scalars - Greek letters, , ,
Vectors - Roman Letters, lower case
Matrices - Roman Letters, upper case
Matrix Multiplication:
C = A B if A n x m, B m x p and C n x p, Cij = k Aik Bkj
Transpose - if A n x m,
interchange rows and columns --> AT m x n
Symmetric Matrix - A n x n (square matrix) and A = AT
Identity Matrix - I, square matrix with ones on diagonal
and zeroes elsewhere.
Determinant: "Inverse Volume" measure of a square matrix
det(A) = i (-1)i+j Aij Aij for any j, or
det(A) = j (-1)i+j Aij Aij for any i, where Aij is the determinant
of an order n-1 matrix with row i and column j removed.
det(I) = 1

Singular Matrix: det (A) = 0


12

Linear Algebra - Background


Gradient Vector - (f(x))
f / x1
f /
x 2
f =

......

f / x n

Hessian Matrix (2f(x) - Symmetric)


2f
2f 2f
x12
x1 x 2
x1 x n
....
....
2 f(x) = ....
2f

2 f 2 f

xn x 1 xn x 2
x2n

Note:

2 f
xi x j

2 f
x j x i
13

Linear Algebra - Background

Some Identities for Determinant


det(A B) = det(A) det(B);
det (A) = det(AT)
det(A) = n det(A); det(A) = i i(A)

Eigenvalues: det(A- I) = 0, Eigenvector: Av = v


Characteristic values and directions of a matrix.
For nonsymmetric matrices eigenvalues can be complex,
so we often use singular values, = (AT)1/2 0

Vector Norms
|| x ||p = {i |xi|p}1/p
(most common are p = 1, p = 2 (Euclidean) and p = (max norm = maxi|xi|))
Matrix Norms
||A|| = max ||A x||/||x|| over x (for p-norms)
||A||1 - max column sum of A, maxj (i |Aij|)
||A|| - maximum row sum of A, maxi (j |Aij|)
||A||2 = [max()] (spectral radius)
||A||F = [i j (Aij)2]1/2 (Frobenius norm)
() = ||A|| ||A-1|| (condition number) = max/min (using 2-norm)

14

Linear Algebra - Eigenvalues


Find v and where Avi = i vi, i = i,n
Note: Av - v = (A - I) v = 0 or det (A - I) = 0
For this relation is an eigenvalue and v is an eigenvector of A.
If A is symmetric, all i are real
i > 0, i = 1, n; A is positive definite
i < 0, i = 1, n; A is negative definite
i = 0, some i: A is singular
Quadratic Form can be expressed in Canonical Form (Eigenvalue/Eigenvector)
xTAx
AV = V
V - eigenvector matrix (n x n)
- eigenvalue (diagonal) matrix = diag(i)
If A is symmetric, all i are real and V can be chosen orthonormal (V-1 = VT).
Thus, A = V V-1 = V VT
For Quadratic Function: Q(x) = aTx + xTAx
Define:

z = VTx and Q(Vz) = (aTV) z + zT (VTAV)z


= (aTV) z + zT z

Minimum occurs at (if i > 0) x = -A-1a

or

x = Vz = -V(-1VTa)

15

Positive (or Negative) Curvature


Positive (or Negative) Definite Hessian
Both eigenvalues are strictly positive or negative
A is positive definite or negative definite
Stationary points are minima or maxima

16

Zero Curvature
Singular Hessian
One eigenvalue is zero, the other is strictly positive or negative
A is positive semidefinite or negative semidefinite
There is a ridge of stationary points (minima or maxima)

17

Indefinite Curvature
Indefinite Hessian
One eigenvalue is positive, the other is negative
Stationary point is a saddle point
A is indefinite

Note: these can also be viewed as two dimensional projections for higher dimensional problems
18

Eigenvalue Example
1
1 T 2
Min Q(x) = x + x
2 1
1
2
AV = V with A =
1
T

1
x
2
1

1/ 2 1/ 2
1 0
V AV = =

with V =
0 3
-1/ 2 1/ 2
T

All eigenvalues are positive


Minimum occurs at z* = -/-1VTa
(x x ) / 2
2
z =V x = 1

(x1 + x 2 ) / 2
0
z* =

2 /(3 2)
T

(x + x ) / 2
2
x = Vz = 1

(x1 + x 2 ) / 2
1/3
x* =
1/3

19

Comparison of Optimization Methods


1. Convergence Theory
Global Convergence - will it converge to a local optimum
(or stationary point) from a poor starting point?
Local Convergence Rate - how fast will it converge close to
the solution?
2. Benchmarks on Large Class of Test Problems
Representative Problem (Hughes, 1981)
Min f(x1, x2) = exp(-)
u = x1 - 0.8
v = x2 - (a1 + a2 u2 (1- u)1/2 - a3 u)
= -b1 + b2 u2 (1+u)1/2 + b3 u
= c1 v2 (1 - c2 v)/(1+ c3 u2)
a = [ 0.3, 0.6, 0.2]
b = [5, 26, 3]
c = [40, 1, 10]
x* = [0.7395, 0.3144]

f(x*) = 5.0893
20

Three Dimensional Surface and Curvature for Representative Test Problem

Regions where minimum


eigenvalue is less than:
[0, -10, -50, -100, -150, -200]
21

What conditions characterize an optimal solution?


Unconstrained Local Minimum
Necessary Conditions
f (x*) = 0
pT2f (x*) p 0 for pn
(positive semi-definite)

x2

x*

Unconstrained Local Minimum


Sufficient Conditions
f (x*) = 0
pT2f (x*) p > 0 for pn
(positive definite)

Contours of f(x)

x1

For smooth functions, why are contours around optimum elliptical?


Taylor Series in n dimensions about x*:

1
f ( x) = f ( x*) + f ( x*)T ( x x*) + ( x x*)T 2 f ( x*)( x x*)
2
Since f(x*) = 0, f(x) is purely quadratic for x close to x*
22

Newtons Method
Taylor Series for f(x) about xk
Take derivative wrt x, set LHS 0
0 f(x) = f(xk) + 2f(xk) (x - xk)
(x - xk) d = - (2f(xk))-1 f(xk)
f(x) is convex (concave) if for all x n, 2f(x) is positive (negative) semidefinite
i.e. minj j 0 (maxj j 0)
Method can fail if:
- x0 far from optimum
- 2f is singular at any point
- f(x) is not smooth
Search direction, d, requires solution of linear equations.
Near solution:
2
k +1 - x * = K
k
x
x - x*
23

Basic Newton Algorithm - Line Search


0. Guess x0, Evaluate f(x0).
1. At xk, evaluate f(xk).
2. Evaluate Bk = 2f(xk) or an approximation.
3. Solve: Bk d = -f(xk)
If convergence error is less than tolerance:
e.g., ||f(xk) || and ||d|| STOP, else go to 4.
4. Find so that 0 < 1 and f(xk + d) < f(xk)
sufficiently (Each trial requires evaluation of f(x))
5. xk+1 = xk + d. Set k = k + 1 Go to 1.

24

Newtons Method - Convergence Path

Starting Points
[0.8, 0.2] needs steepest descent steps w/ line search up to O, takes 7 iterations to ||f(x*)|| 10-6
[0.35, 0.65] converges in four iterations with full steps to ||f(x*)|| 10-6
25

Newtons Method - Notes


Choice of Bk determines method.
- Steepest Descent: Bk = I
- Newton: Bk = 2f(x)
With suitable Bk, performance may be good enough if f(xk + d)
is sufficiently decreased (instead of minimized along line search
direction).
Trust region extensions to Newton's method provide very strong
global convergence properties and very reliable algorithms.
Local rate of convergence depends on choice of Bk.
Newton Quadratic Rate :

limk

Steepest descent Linear Rate : limk


Desired? Superlinear Rate :

limk

x k +1 x *
x x*
k

x k +1 x *
x x*
k

x k +1 x *
x x*
k

=K
<1
=0
26

Quasi-Newton Methods
Motivation:
Need Bk to be positive definite.
Avoid calculation of 2f.
Avoid solution of linear system for d = - (Bk)-1 f(xk)
Strategy: Define matrix updating formulas that give (Bk) symmetric, positive
definite and satisfy:
(Bk+1)(xk+1 - xk) = (fk+1 - fk) (Secant relation)
DFP Formula: (Davidon, Fletcher, Powell, 1958, 1964)
T
y
s
+
y
y
s
y
s
s
y
y
y
(
B )
(
B ) ( B )
k
=B +
T
T
T
ys
ys ys
k

k+1

( )( )

(B )

k+1 -1

k+1

=H

where:

T
k
y
y
ss
H
H
= Hk + T y Hk y
s y
T

s = xk+1- xk
y = f (xk+1) - f (xk)
27

Quasi-Newton Methods
BFGS Formula (Broyden, Fletcher, Goldfarb, Shanno, 1970-71)

k+1

(B )

k+1 1

= Hk+1 = Hk +

T
k
T k
s
yy
B
s
B
= Bk + T k
s y
sB s

(s -

y - H s) y s sT
(
H y)s + s (s - H y)
T
T
T
y s
y s y s
k

( )( )

Notes:
1) Both formulas are derived under similar assumptions and have
symmetry
2) Both have superlinear convergence and terminate in n steps on
quadratic functions. They are identical if is minimized.
3) BFGS is more stable and performs better than DFP, in general.
4) For n 100, these are the best methods for general purpose
problems if second derivatives are not available.
28

Quasi-Newton Method - BFGS


Convergence Path

Starting Point
[0.8, 0.2] starting from B0 = I, converges in 9 iterations to ||f(x*)|| 10-6
29

Sources For Unconstrained Software


Harwell (HSL)
IMSL
NAg - Unconstrained Optimization Codes
Netlib (www.netlib.org)
MINPACK
TOMS Algorithms, etc.
These sources contain various methods
Quasi-Newton
Gauss-Newton
Sparse Newton
Conjugate Gradient

30

Constrained Optimization
(Nonlinear Programming)
Problem:
s.t.

Minx f(x)
g(x) 0
h(x) = 0

where:
f(x) - scalar objective function
x - n vector of variables
g(x) - inequality constraints, m vector
h(x) - meq equality constraints.
Sufficient Condition for Unique Optimum
- f(x) must be convex, and
- feasible region must be convex,
i.e. g(x) are all convex
h(x) are all linear
Except in special cases, ther is no guarantee that a local optimum is global if
sufficient conditions are violated.
31

Example: Minimize Packing Dimensions

What is the smallest box for three round objects?


Variables: A, B, (x1, y1), (x2, y2), (x3, y3)
Fixed Parameters: R1, R2, R3
Objective: Minimize Perimeter = 2(A+B)
Constraints: Circles remain in box, cant overlap
Decisions: Sides of box, centers of circles.

2
3

A
1
B

x1, y1 R1

x 2, y 2 R 2

x 3, y 3 R 3

x1 B - R1, y1 A - R1
x2 B - R 2, y2 A - R 2
x3 B - R 3, y3 A - R 3

in box
x1, x2, x3, y1, y2, y3, A, B 0

(x - x )2 + (y - y )2 (R + R )2
1
2
1
2
1
2

2
2
2
(x1 - x3) + (y1 - y3) (R1 + R 3)

2
2
y 2 - y3) (R2 + R 3)2
(
)
+
x
x
(
2
3

no overlaps

32

Characterization of Constrained Optima


Min
Min

Linear Program

Linear Program
(Alternate Optima)

Min

Min
Min

Convex Objective Functions


Linear Constraints
Min

Min

Nonconvex Region
Multiple Optima

Min

Min

Min
Nonconvex Objective
Multiple Optima

33

What conditions characterize an optimal solution?

Unconstrained Local Minimum


Necessary Conditions
f (x*) = 0
pT2f (x*) p 0 for pn
(positive semi-definite)

Unconstrained Local Minimum


Sufficient Conditions
f (x*) = 0
pT2f (x*) p > 0 for pn
(positive definite)
34

Optimal solution for inequality constrained problem

Min f(x)
s.t. g(x) 0
Analogy: Ball rolling down valley pinned by fence
Note: Balance of forces (f, g1)
35

Optimal solution for general constrained problem

Problem: Min
s.t.

f(x)
g(x) 0
h(x) = 0
Analogy: Ball rolling on rail pinned by fences
Balance of forces: f, g1, h
36

Optimality conditions for local optimum


Necessary First Order Karush Kuhn - Tucker Conditions

L (x*, u, v) = f(x*) + g(x*) u + h(x*) v = 0


(Balance of Forces)
u 0 (Inequalities act in only one direction)
g (x*) 0, h (x*) = 0 (Feasibility)
uj gj(x*) = 0 (Complementarity: either gj(x*) = 0 or uj = 0)
u, v are "weights" for "forces," known as KKT multipliers, shadow
prices, dual variables
To guarantee that a local NLP solution satisfies KKT conditions, a constraint
qualification is required. E.g., the Linear Independence Constraint Qualification
(LICQ) requires active constraint gradients, [gA(x*) h(x*)], to be linearly
independent. Also, under LICQ, KKT multipliers are uniquely determined.

Necessary (Sufficient) Second Order Conditions


- Positive curvature in "constraint" directions.
- pT 2L (x*) p 0 (pT 2L (x*) p > 0)
where p are the constrained directions: gA(x*)Tp = 0, h(x*)Tp = 0
37

Single Variable Example of KKT Conditions


Min (x)2 s.t. -a x a, a > 0
x* = 0 is seen by inspection

f(x)

Lagrange function :
L(x, u) = x2 + u1(x-a) + u2(-a-x)
First Order KKT conditions:
L(x, u) = 2 x + u1 - u2 = 0
u1 (x-a) = 0
u2 (-a-x) = 0
-a x a
u1, u2 0

Consider three cases:


u1 > 0, u2 = 0
u1 = 0, u2 > 0
u1 = u2 = 0

-a

Upper bound is active, x = a, u1 = -2a, u2 = 0


Lower bound is active, x = -a, u2 = -2a, u1 = 0
Neither bound is active, u1 = 0, u2 = 0, x = 0

Second order conditions (x*, u1, u2 =0)


xxL (x*, u*) = 2
pTxxL (x*, u*) p = 2 (x)2 > 0

38

Single Variable Example


of KKT Conditions - Revisited
Min -(x)2 s.t. -a x a, a > 0
x* = a is seen by inspection

-a

Lagrange function :
L(x, u) = -x2 + u1(x-a) + u2(-a-x)
First Order KKT conditions:
L(x, u) = -2x + u1 - u2 = 0
u1 (x-a) = 0
u2 (-a-x) = 0
-a x a
u1, u2 0

Consider three cases:


u1 > 0, u2 = 0
u1 = 0, u2 > 0
u1 = u2 = 0

f(x)

Upper bound is active, x = a, u1 = 2a, u2 = 0


Lower bound is active, x = -a, u2 = 2a, u1 = 0
Neither bound is active, u1 = 0, u2 = 0, x = 0

Second order conditions (x*, u1, u2 =0)


xxL (x*, u*) = -2
pTxxL (x*, u*) p = -2(x)2 < 0

39

Interpretation of Second Order Conditions


For x = a or x = -a, we require the allowable direction to satisfy the
active constraints exactly. Here, any point along the allowable
direction, x* must remain at its bound.
For this problem, however, there are no nonzero allowable directions
that satisfy this condition. Consequently the solution x* is defined
entirely by the active constraint. The condition:
pT xxL (x*, u*, v*) p > 0
for all allowable directions, is vacuously satisfied - because there are
no allowable directions that satisfy gA(x*)T p = 0. Hence, sufficient
second order conditions are satisfied.

As we will see, sufficient second order conditions are satisfied by linear


programs as well.

40

Role of KKT Multipliers


-a

Also known as:


Shadow Prices
Dual Variables
Lagrange Multipliers

a a + a

f(x)

Suppose a in the constraint is increased to a + a


f(x*) = (a + a)2
and
[f(x*, a + a) - f(x*, a)]/a = 2a + a
df(x*)/da = 2a = u1
41

Special Cases of Nonlinear Programming


Linear Programming:
Min cTx
x2
s.t. Ax b
Cx = d, x 0
Functions are all convex global min.
Because of Linearity, can prove solution will
always lie at vertex of feasible region.

x1

Simplex Method
- Start at vertex
- Move to adjacent vertex that offers most improvement
- Continue until no further improvement
Notes:
1) LP has wide uses in planning, blending and scheduling
2) Canned programs widely available.
42

Linear Programming Example


Simplex Method
Min -2x1 - 3x2
s.t. 2x1 + x2 5
x1, x2 0

Now, define f = -2x1 - 3x2

Set x1, x2 = 0, x3 = 5 and form tableau


x1
x2
x3
f
2
1
1
0
2
3
0
1

Min
-2x1 - 3x2
s.t. 2x1 + x2 + x3 = 5
x1, x2, x3 0
(add slack variable)
f + 2x1 + 3x2 = 0
b
5
0

x1, x2 nonbasic
x3
basic

To decrease f, increase x2. How much? so x3 0


x1
x2
x3
f
b
2
1
1
0
5
-4
0
-3
1
-15
f can no longer be decreased! Optimal
Underlined terms are -(reduced gradients); nonbasic variables (x1, x3), basic variable x2
43

Quadratic Programming
Min
aTx + 1/2 xT B x
Axb
Cx=d
1) Can be solved using LP-like techniques:
(Wolfe, 1959)

Min
j (zj+ + zj-)
s.t.
a + Bx + ATu + CTv = z+ - zAx - b + s = 0
Cx - d = 0
s, z+, z- 0
{uj sj = 0}
with complicating conditions.
Problem:

2) If B is positive definite, QP solution is unique.


If B is pos. semidefinite, optimum value is unique.
3) Other methods for solving QPs (faster)
- Complementary Pivoting (Lemke)
- Range, Null Space methods (Gill, Murray).

44

Portfolio Planning Problem


Definitions:
xi - fraction or amount invested in security i
ri (t) - (1 + rate of return) for investment i in year t.
i - average r(t) over T years, i.e.
T

r (t)

1
i =
T

Max

t=1

i i

s.t.

=1

xi 0, etc.
Note: maximize average return, no accounting for risk.
45

Portfolio Planning Problem


Definition of Risk - fluctuation of ri(t) over investment (or past) time period.
To minimize risk, minimize variance about portfolio mean (risk averse).
Variance/Covariance Matrix, S

{ S} ij = 2ij

1
=
T

t =1

x T Sx

Max
s .t.

(r (t) - )(r (t) - )


T

xi = 1

ixi R

x i 0 , etc .
Example: 3 investments
j
1. IBM
1.3
2. GM
1.2
3. Gold
1.08

1
3
S = 1
2
-0.5 0.4

- 0.5
0.4
1
46

Portfolio Planning Problem - GAMS


SIMPLE PORTFOLIO INVESTMENT PROBLEM (MARKOWITZ)
4
5 OPTION LIMROW=0;
6 OPTION LIMXOL=0;
7
8 VARIABLES IBM, GM, GOLD, OBJQP, OBJLP;
9
10 EQUATIONS E1,E2,QP,LP;
11
12 LP.. OBJLP =E= 1.3*IBM + 1.2*GM + 1.08*GOLD;
13
14 QP.. OBJQP =E= 3*IBM**2 + 2*IBM*GM - IBM*GOLD
15 + 2*GM**2 - 0.8*GM*GOLD + GOLD**2;
16
17 E1..1.3*IBM + 1.2*GM + 1.08*GOLD =G= 1.15;
18
19 E2.. IBM + GM + GOLD =E= 1;
20
21 IBM.LO = 0.;
22 IBM.UP = 0.75;
23 GM.LO = 0.;
24 GM.UP = 0.75;
25 GOLD.LO = 0.;
26 GOLD.UP = 0.75;
27
28 MODEL PORTQP/QP,E1,E2/;
29
30 MODEL PORTLP/LP,E2/;
31
32 SOLVE PORTLP USING LP MAXIMIZING OBJLP;
33
34
SOLVE PORTQP USING NLP MINIMIZING OBJQP;

47

Portfolio Planning Problem - GAMS


S O L VE S U M M A R Y
**** MODEL STATUS
**** OBJECTIVE VALUE
RESOURCE USAGE, LIMIT
1.270
ITERATION COUNT, LIMIT
1
BDM - LP VERSION 1.01
A. Brooke, A. Drud, and A. Meeraus,
Analytic Support Unit,
Development Research Department,
World Bank,
Washington D.C. 20433, U.S.A.

1 OPTIMAL
1.2750
1000.000
1000

Estimate work space needed


-Work space allocated
EXIT - - OPTIMAL SOLUTION FOUND.
LOWER
- - - - EQU LP
.
.
- - - - EQU E2
1.000
LOWER
- - - - VAR IBM
0.750
- - - - VAR GM
.
- - - - VAR GOLD
.
- - - - VAR OBJLP -INF
**** REPORT SUMMARY :

33 Kb
--

231 Kb

LEVEL

UPPER

MARGINAL
1.000
1.200

.
1.000

1.000

LEVEL
0.750
0.250
..
1.275

UPPER
0.100
0.750
0.750
+INF

MARGINAL
.
-0.120
.

NONOPT

0 INFEASIBLE
0 UNBOUNDED
SIMPLE PORTFOLIO INVESTMENT PROBLEM (MARKOWITZ)
Model Statistics SOLVE PORTQP USING NLP FROM LINE 34
MODEL STATISTICS
BLOCKS OF EQUATIONS
3
SINGLE EQUATIONS
BLOCKS OF VARIABLES
4
SINGLE VARIABLES
NON ZERO ELEMENTS
10
NON LINEAR N-Z
DERIVITIVE POOL
8
CONSTANT POOL
CODE LENGTH
95
GENERATION TIME
=
2.360 SECONDS
EXECUTION TIME
=
3.510 SECONDS

3
4
3
3

48

Portfolio Planning Problem - GAMS


S O L VE S U M M A R Y
MODEL
PORTLP
OBJECTIVE
OBJLP
TYPE
LP
DIRECTION
MAXIMIZE
SOLVER
MINOS5
FROM LINE
34
**** SOLVER STATUS
1 NORMAL COMPLETION
**** MODEL STATUS
2 LOCALLY OPTIMAL
**** OBJECTIVE VALUE
0.4210
RESOURCE USAGE, LIMIT
3.129
1000.000
ITERATION COUNT, LIMIT
3
1000
EVALUATION ERRORS
0
0
MINOS
5.3
(Nov. 1990)
Ver: 225-DOS-02
B.A. Murtagh, University of New South Wales
and
P.E. Gill, W. Murray, M.A. Saunders and M.H. Wright
Systems Optimization Laboratory, Stanford University.
EXIT - - OPTIMAL SOLUTION FOUND
MAJOR ITNS, LIMIT
FUNOBJ, FUNCON CALLS
8
SUPERBASICS
INTERPRETER USAGE
NORM RG / NORM PI
3.732E-17
LOWER
- - - - EQU QP
.
- - - - EQU E1
1.150
- - - - EQU E2
1.000
LOWER
- - - - VAR IBM
.
- - - - VAR GM
.
- - - - VAR GOLD
.
- - - - VAR OBJLP -INF
**** REPORT SUMMARY :

1
1
.21
LEVEL

1.150
1.000
LEVEL
0.183
0.248
0.569
1.421
0
NONOPT
0 INFEASIBLE
0 UNBOUNDED
0
ERRORS
SIMPLE PORTFOLIO INVESTMENT PROBLEM (MARKOWITZ)
Model Statistics SOLVE PORTQP USING NLP FROM LINE 34
EXECUTION TIME
=
1.090 SECONDS

UPPER
. .
+INF
1.000
UPPER
0.750
0.750
0.750
+INF

MARGINAL
1.000
1.216
-0.556
MARGINAL
.
EPS
.
.

49

Algorithms for Constrained Problems


Motivation: Build on unconstrained methods wherever possible.
Classification of Methods:
Reduced Gradient Methods - (with Restoration) GRG2, CONOPT
Reduced Gradient Methods - (without Restoration) MINOS
Successive Quadratic Programming - generic implementations
Penalty Functions - popular in 1970s, but fell into disfavor. Barrier
Methods have been developed recently and are again popular.
Successive Linear Programming - only useful for "mostly linear"
problems
We will concentrate on algorithms for first four classes.
Evaluation: Compare performance on "typical problem," cite experience
on process problems.
50

Representative Constrained Problem


(Hughes, 1981)

g1 0

g2 0

Min f(x1, x2) = exp(-)


g1 = (x2+0.1)2[x12+2(1-x2)(1-2x2)] - 0.16 0
g2 = (x1 - 0.3)2 + (x2 - 0.3)2 - 0.16 0
x* = [0.6335, 0.3465]
f(x*) = -4.8380
51

Reduced Gradient Method with Restoration


(GRG2/CONOPT)
Min f(x)
s.t. g(x) + s = 0 (add slack variable)
c(x) = 0
a x b, s 0

Min
f(z)

s.t. h(z) = 0
azb

Partition variables into:


zB - dependent or basic variables
zN - nonbasic variables, fixed at a bound
zS - independent or superbasic variables
Analogy to linear programming. Superbasics required only if nonlinear problem
Solve unconstrained problem in space of superbasic variables.
Let zT = [zST zBT zNT] optimize wrt zS with h(zS, zB , zN) = 0
Calculate constrained derivative or reduced gradient wrt zS.
Dependent variables are zB Rm
Nonbasic variables zN (temporarily) fixed

52

Definition of Reduced Gradient


df
f dz B f
=
+
dz S z S dz S z B
Because h( z ) = 0, we have :
T

h
h
dh =
dz S +
dz B = 0
z B
z S
1

h h
dz B
1
h
h
=
=

zS
zB

dz S

z
S B
This leads to :
df
f
1 f
=
zS h z B h
dz S z S
z B

By remaining feasible always, h(z) = 0, a z b, one can apply an


unconstrained algorithm (quasi-Newton) using (df/dzS)
Solve problem in reduced space of zS variables
53

Example of Reduced Gradient


Min

x1 2 x 2

s .t .

3 x1 + 4 x 2 = 24

h T = [ 3 4], f

= [ 2 x1 - 2]

Let z S = x1 , z B = x 2

df
f
=
zS h zB h
dz S
z S

f
z B

df
1
= 2 x1 3[4 ] (- 2 ) = 2 x1 + 3 / 2
dx 1
If hT is (m x n); zShT is m x (n-m); zBhT is (m x m)

(df/dzS) is the change in f along constraint direction per unit change in zS


54

Sketch of GRG Algorithm


1. Initialize problem and obtain a feasible point at z0
2. At feasible point zk, partition variables z into zN, zB, zS
3. Calculate reduced gradient, (df/dzS)
4. Evaluate search direction for zS, d = B-1(df/dzS)
5. Perform a line search.
Find (0,1] with zS := zSk + d
Solve for h(zSk + d, zB, zN) = 0
If f(zSk + d, zB, zN) < f(zSk, zB, zN),
set zSk+1 =zSk + d, k:= k+1
6. If ||(df/dzS)||<, Stop. Else, go to 2.
55

GRG Algorithm Properties


1.
2.
3.
4.
5.

GRG2 has been implemented on PCs as GINO and is very reliable and
robust. It is also the optimization solver in MS EXCEL.
CONOPT is implemented in GAMS, AIMMS and AMPL
GRG2 uses Q-N for small problems but can switch to conjugate
gradients if problem gets large. CONOPT uses exact second derivatives.
Convergence of h(zS, zB , zN) = 0 can get very expensive because h is
required
Safeguards can be added so that restoration (step 5.) can be dropped
and efficiency increases.

Representative Constrained Problem Starting Point [0.8, 0.2]


GINO Results - 14 iterations to ||f(x*)|| 10-6
CONOPT Results - 7 iterations to ||f(x*)|| 10-6 from feasible point.

56

Reduced Gradient Method without Restoration


(MINOS/Augmented)
Motivation: Efficient algorithms
are available that solve linearly
constrained optimization
problems (MINOS):
Min f(x)
s.t. Ax b
Cx = d
Extend to nonlinear problems,
through successive linearization
Develop major iterations
(linearizations) and minor
iterations (GRG solutions) .

Strategy: (Robinson, Murtagh & Saunders)


1. Partition variables into basic, nonbasic
variables and superbasic variables..
2. Linearize active constraints at zk
Dkz = ck
3. Let = f (z) + vTh (z) + (hTh)
(Augmented Lagrange),
4. Solve linearly constrained problem:
Min
(z)
s.t.
Dz = c
azb
using reduced gradients to get zk+1
5. Set k=k+1, go to 3.
6. Algorithm terminates when no
movement between steps 3) and 4).
57

MINOS/Augmented Notes
1. MINOS has been implemented very efficiently to take care of
linearity. It becomes LP Simplex method if problem is totally
linear. Also, very efficient matrix routines.
2. No restoration takes place, nonlinear constraints are reflected in
(z) during step 3). MINOS is more efficient than GRG.
3. Major iterations (steps 3) - 4)) converge at a quadratic rate.
4. Reduced gradient methods are complicated, monolithic codes:
hard to integrate efficiently into modeling software.
Representative Constrained Problem Starting Point [0.8, 0.2]
MINOS Results: 4 major iterations, 11 function calls
to ||f(x*)|| 10-6
58

Successive Quadratic Programming (SQP)


Motivation:
Take KKT conditions, expand in Taylor series about current point.
Take Newton step (QP) to determine next point.

Derivation KKT Conditions


xL (x*, u*, v*) = f(x*) + gA(x*) u* + h(x*) v* = 0
h(x*) = 0
gA(x*) = 0,
where gA are the active constraints.
Newton - Step

xx L g
A

g T
0
A

h T
0

x L (x k , u k , v k )
h x

gA (x k )
0 u = -

0 v
h (x k )

Requirements:
xxL must be calculated and should be regular
correct active set gA
good estimates of uk, vk
59

SQP Chronology
1. Wilson (1963)
- active set can be determined by solving QP:
Min
f(xk)Td + 1/2 dT xx L(xk, uk, vk) d
d
s.t.
g(xk) + g(xk)T d 0
h(xk) + h(xk)T d = 0
2. Han (1976), (1977), Powell (1977), (1978)
- approximate xxL using a positive definite quasi-Newton update (BFGS)
- use a line search to converge from poor starting points.
Notes:
- Similar methods were derived using penalty (not Lagrange) functions.
- Method converges quickly; very few function evaluations.
- Not well suited to large problems (full space update used).
For n > 100, say, use reduced space methods (e.g. MINOS).
60

Elements of SQP Hessian Approximation


What about xxL?
need to get second derivatives for f(x), g(x), h(x).
need to estimate multipliers, uk, vk; xxL may not be positive
semidefinite
Approximate xxL (xk, uk, vk) by Bk, a symmetric positive
definite matrix.
T
k
T k
s
yy
B
s
B
k
k+1
=
+
B
B
T
k
s y
sB s
s = xk+1 - xk
BFGS Formula
y = L(xk+1, uk+1, vk+1) - L(xk, uk+1, vk+1)
second derivatives approximated by change in gradients
positive definite Bk ensures unique QP solution

61

Elements of SQP Search Directions


How do we obtain search directions?

Form QP and let QP determine constraint activity

At each iteration, k, solve:


Min
f(xk) Td + 1/2 dT Bkd
d
s.t.
g(xk) + g(xk) T d 0
h(xk) + h(xk) T d = 0
Convergence from poor starting points
As with Newton's method, choose (stepsize) to ensure progress
toward optimum:
xk+1 = xk + d.
is chosen by making sure a merit function is decreased at each
iteration.
Exact Penalty Function
(x) = f(x) + [ max (0, gj(x)) + |hj (x)|]
> maxj {| uj |, | vj |}
Augmented Lagrange Function
(x) = f(x) + uTg(x) + vTh(x)
+ /2 { (hj (x))2 + max (0, gj (x))2}
62

Newton-Like Properties for SQP

Fast Local Convergence


B = xxL
xxL is p.d and B is Q-N
B is Q-N update, xxL not p.d

Quadratic
1 step Superlinear
2 step Superlinear

Enforce Global Convergence


Ensure decrease of merit function by taking 1
Trust region adaptations provide a stronger guarantee of global
convergence - but harder to implement.

63

Basic SQP Algorithm


0. Guess x0, Set B0 = I (Identity). Evaluate f(x0), g(x0) and h(x0).
1. At xk, evaluate f(xk), g(xk), h(xk).
2. If k > 0, update Bk using the BFGS Formula.
3. Solve:
s.t.

Mind f(xk)Td + 1/2 dTBkd


g(xk) + g(xk)Td 0
h(xk) + h(xk)Td = 0

If KKT error less than tolerance: ||L(x*)|| , ||h(x*)|| ,


||g(x*)+|| . STOP, else go to 4.
4. Find so that 0 < 1 and (xk + d) < (xk) sufficiently
(Each trial requires evaluation of f(x), g(x) and h(x)).
5. xk+1 = xk + d. Set k = k + 1 Go to 2.
64

Problems with SQP


Nonsmooth Functions - Reformulate
Ill-conditioning - Proper scaling
Poor Starting Points Trust Regions can help
Inconsistent Constraint Linearizations
- Can lead to infeasible QP's
x2

x1

Min x2
s.t. 1 + x1 - (x2)2 0
1 - x1 - (x2)2 0
x2 -1/2

65

SQP Test Problem


1.2

1.0

0.8
x2
0.6
x*

0.4

0.2

0.0
0.0

0.2

0.4

x1

0.6

0.8

1.0

1.2

Min x2
s.t. -x2 + 2 x12 - x13 0
-x2 + 2 (1-x1)2 - (1-x1)3 0
x* = [0.5, 0.375].
66

SQP Test Problem First Iteration


1.2

1.0

0.8
x2
0.6

0.4

0.2

0.0
0.0

0.2

0.4

x1

0.6

0.8

1.0

1.2

Start from the origin (x0 = [0, 0]T) with B0 = I, form:


Min
s.t.

d2 + 1/2 (d12 + d22)


d2 0
d1 + d2 1
d = [1, 0]T. with 1 = 0 and 2 = 1.
67

SQP Test Problem Second Iteration


1.2

1.0

0.8
x2
0.6
x*

0.4

0.2

0.0
0.0

0.2

0.4

x1

0.6

0.8

1.0

1.2

From x1 = [0.5, 0]T with B1 = I


(no update from BFGS possible), form:
d2 + 1/2 (d12 + d22)
-1.25 d1 - d2 + 0.375 0
1.25 d1 - d2 + 0.375 0
d = [0, 0.375]T with 1 = 0.5 and 2 = 0.5
x* = [0.5, 0.375]T is optimal
Min
s.t.

68

Representative Constrained Problem


SQP Convergence Path

g1 0

g2 0

Starting Point[0.8, 0.2] - starting from B0 = I and staying in bounds


and linearized constraints; converges in 8 iterations to ||f(x*)|| 10-6
69

Barrier Methods for Large-Scale


Nonlinear Programming
minn f ( x)
Can generalize for
x

Original Formulation

s.t c( x) = 0
x0

a xb

Barrier Approach

minn ( x) = f ( x) ln xi
x

i =1

s.t c( x) = 0
As 0,

x*() x*

Fiacco and McCormick (1968)


70

Solution of the Barrier Problem


Newton Directions (KKT System)

f ( x ) + A ( x ) v
XVe e
c(x)
Solve
Reducing the System

= 0
= 0
= 0

dv = X 1e v X 1Vdx

W A I d x
f + A v
AW0+ 0 A
d d=x c
1

= X V
T
+

1
c
V A
0 X 0d v

S
e
IPOPT Code www.coin-or.org
71

Global Convergence of Newton-based


Barrier Solvers
Merit Function
Exact Penalty:

P(x, ) = f(x) + ||c(x)||

Augd Lagrangian: L*(x, , ) = f(x) + Tc(x) + ||c(x)||2


Assess Search Direction (e.g., from IPOPT)
Line Search choose stepsize to give sufficient decrease of merit function
using a step to the boundary rule with ~0.99.

for (0, ], xk +1 = xk + d x
xk + d x (1 ) xk > 0
vk +1 = vk + d v (1 )vk > 0

k +1 = k + (+ k )
How do we balance (x) and c(x) with ?
Is this approach globally convergent? Will it still be fast?
72

Global Convergence Failure


(Wchter and B., 2000)

x2

Min f ( x)
1
s.t. x1 x3 = 0
2
( x1 ) 2 x2 1 = 0
x2 , x3 0
Newton-type line search stalls
even though descent directions
exist
x1
k T
k

A( x ) d x + c( x ) = 0
xk + d x > 0

Remedies:
Composite Step Trust Region
(Byrd et al.)
Filter Line Search Methods

73

Line Search Filter Method


Store (k, k) at allowed iterates
Allow progress if trial point is
acceptable to filter with margin

(x)

If switching condition

[kT d ]a [ k ]b , a > 2b > 2


is satisfied, only an Armijo line
search is required on k
If insufficient progress on stepsize,
evoke restoration phase to reduce .
Global convergence and superlinear
local convergence proved (with
second order correction)

(x) = ||c(x)||

74

Implementation Details
Modify KKT (full space) matrix if nonsingular

Wk + k + 1

T
A
k

Ak
2 I

1 - Correct inertia to guarantee descent direction


2 - Deal with rank deficient Ak

KKT matrix factored by MA27


Feasibility restoration phase

Min || c( x) ||1 + || x xk ||Q

xl xk xu
Apply Exact Penalty Formulation
Exploit same structure/algorithm to reduce infeasibility
75

Details of IPOP T Algorithm


Options

Comparison

Line Search Strategies


- l 2 exact penalty merit function
- augmented Lagrangian
function
- Filter method (adapted from
Fletcher and Leyffer)

34 COPS Problems
(600 - 160 000 variables)
486 CUT E Problems
(2 50 000 variables)
56 MIT T Problems
(12097 99998 variables)

Hessian Calculation
- BFGS (reduced space)
- SR1 (reduced space)
- Exact full Hessian (direct)
- Exact reduced Hes sian (direct)

Performance Measure
- rp, l = (#iterp,l)/ (#iterp, min)
- P() = fraction of problems
with log2(rp, l) <

- Preconditioned CG

76

IPOPT Comparison with KNITRO and


LOQO
CPU time (COPS)

CPU time (CUTE)

0.9

0.9

0.8

0.7

0.7

0.6

0.6
rho

rho

0.8

0.5

0.5

0.4

0.4

0.3

0.3

0.2

0.2

0.1

0.1

4
tau
CPU time (MITT)

4
tau

IPOPT has better performance,


robustness on CUTE, MITT and
COPS problem sets

1
0.9
0.8
0.7

Similar results appear with iteration


counts

IPOPT
KNITRO
LOQO

0.6
rho

IPOPT
KNITRO
LOQO

IPOPT
KNITRO
LOQO

0.5

Can be downloaded from


http://www.coin-or.org

0.4
0.3
0.2

See links for additional information

0.1
0

4
tau

77

Recommendations for Constrained Optimization


1. Best current algorithms

GRG 2/CONOPT

MINOS

SQP

IPOPT
2. GRG 2 (or CONOPT) is generally slower, but is robust. Use with highly
nonlinear functions. Solver in Excel!
3. For small problems (n 100) with nonlinear constraints, use SQP.
4. For large problems (n 100) with mostly linear constraints, use MINOS.
==> Difficulty with many nonlinearities
Fewer Function
Evaluations

SQP

MINOS Tailored Linear


CONOPT
Algebra

Small, Nonlinear Problems - SQP solves QP's, not LCNLP's, fewer function calls.
Large, Mostly Linear Problems - MINOS performs sparse constraint decomposition.
Works efficiently in reduced space if function calls are cheap!
Exploit Both Features IPOPT takes advantages of few function evaluations and largescale linear algebra, but requires exact second derivatives
78

Available Software for Constrained


Optimization
SQP Routines
HSL, NaG and IMSL (NLPQL) Routines
NPSOL Stanford Systems Optimization Lab
SNOPT Stanford Systems Optimization Lab (rSQP discussed later)
IPOPT http://www.coin-or.org
GAMS Programs
CONOPT - Generalized Reduced Gradient method with restoration
MINOS - Generalized Reduced Gradient method without restoration
A student version of GAMS is now available from the CACHE office. The cost for this package
including Process Design Case Students, GAMS: A Users Guide, and GAMS - The Solver Manuals,
and a CD-ROM is $65 per CACHE supporting departments, and $100 per non-CACHE supporting
departments and individuals. To order please complete standard order form and fax or mail to
CACHE Corporation. More information can be found on http://www.che.utexas.edu/cache/gams.html

MS Excel
Solver uses Generalized Reduced Gradient method with restoration
79

Rules for Formulating Nonlinear Programs


1) Avoid overflows and undefined terms, (do not divide, take logs, etc.)
e.g. x + y - ln z = 0 x + y - u = 0
exp u - z = 0
2) If constraints must always be enforced, make sure they are linear or bounds.
e.g. v(xy - z2)1/2 = 3

vu = 3
u- (xy - z2) = 0, u 0
3) Exploit linear constraints as much as possible, e.g. mass balance
xi L + yi V = F zi li + vi = fi
L li = 0
4) Use bounds and constraints to enforce characteristic solutions.
e.g.
a x b, g (x) 0
to isolate correct root of h (x) = 0.
5) Exploit global properties when possibility exists. Convex (linear equations?)
Linear Program? Quadratic Program? Geometric Program?
6) Exploit problem structure when possible.
e.g. Min
[Tx - 3Ty]
s.t. xT + y - T2 y = 5
4x - 5Ty + Tx = 7
0T1
(If T is fixed solve LP) put T in outer optimization loop.
80

Process Optimization
Problem Definition and Formulation
State of Nature and Problem Premises

Restrictions: Physical, Legal


Economic, Political, etc.

Desired Objective: Yield,


Economic, Capacity, etc.

Decisions

Mathematical Modeling and Algorithmic Solution

Process Model Equations

Constraints

Objective Function

Additional Variables
81

Large Scale NLP Algorithms


Motivation: Improvement of Successive Quadratic Programming
as Cornerstone Algorithm
process optimization for design, control and operations
Evolution of NLP Solvers:
SQP

rSQP

IPOPT

rSQP++
19991988-98:
1981-87:
: Simultaneous
Static
Flowsheet
Real-time
dynamic
optimization
optimization
optimization
over
over
over
1 000
100
100
000
000
variables
variables
variables
and
and
and
constraints
constraints
constraints
Current: Tailor structure, architecture and problems
82

Flowsheet Optimization Problems - Introduction


Modular Simulation Mode
Physical Relation to Process

In

Out

- Intuitive to Process Engineer


- Unit equations solved internally
- tailor-made procedures.

Convergence Procedures - for simple flowsheets, often identified


from flowsheet structure
Convergence "mimics" startup.
Debugging flowsheets on "physical" grounds
83

Flowsheet Optimization Problems - Features

Design Specifications
Specify # trays reflux ratio, but would like to specify
overhead comp. ==> Control loop -Solve Iteratively

1
3

Nested Recycles Hard to Handle


Best Convergence Procedure?

Frequent block evaluation can be expensive


Slow algorithms applied to flowsheet loops.
NLP methods are good at breaking looks
84

Chronology in Process Optimization


Sim. Time Equiv.
1. Early Work - Black Box Approaches
Friedman and Pinder (1972)
Gaddy and co-workers (1977)
2. Transition - more accurate gradients
Parker and Hughes (1981)
Biegler and Hughes (1981)
3. Infeasible Path Strategy for Modular Simulators
Biegler and Hughes (1982)
Chen and Stadtherr (1985)
Kaijaluoto et al. (1985)
and many more
4. Equation Based Process Optimization
Westerberg et al. (1983)
Shewchuk (1985)
DMO, NOVA, RTOPT, etc. (1990s)

75-150
300
64
13
<10

<5
2
1-2

Process optimization should be as cheap and easy as process simulation


85

Process Simulators with Optimization


Capabilities (using SQP)
Aspen Custom Modeler (ACM)
Aspen/Plus
gProms
Hysim/Hysys
Massbal
Optisim
Pro/II
ProSim
ROMeo
VTPLAN
86

Simulation and Optimization of Flowsheets


2

h(y) = 0
4

1
w(y)

Min f(x), s.t. g(x) 0


For single degree of freedom:
work in space defined by curve below.
requires repeated (expensive) recycle convergence
f(x, y(x))

87

Expanded Region with Feasible Path

88

"Black Box" Optimization Approach


Vertical steps are expensive (flowsheet convergence)
Generally no connection between x and y.
Can have "noisy" derivatives for gradient optimization.
89

SQP - Infeasible Path Approach


solve and optimize simultaneously in x and y
extended Newton method
90

Optimization Capability for Modular Simulators


(FLOWTRAN, Aspen/Plus, Pro/II, HySys)
Architecture
- Replace convergence with optimization block
- Problem definition needed (in-line FORTRAN)
- Executive, preprocessor, modules intact.
Examples
1. Single Unit and Acyclic Optimization
- Distillation columns & sequences
2. "Conventional" Process Optimization
- Monochlorobenzene process
- NH3 synthesis
3. Complicated Recycles & Control Loops
- Cavett problem
- Variations of above
91

Optimization of Monochlorobenzene Process


S06

U = 100

HC1

Cooling
Water
80o F

A-1

PHYSICAL PROPERTY OPTIONS


Cavett Vapor Pressure
Redlich-Kwong Vapor Fugacity
Corrected Liquid Fugacity
Ideal Solution Activity Coefficient
OPT (SCOPT) OPTIMIZER
Optimal Solution Found After 4 Iterations
Kuhn-Tucker Error
0.29616E-05
Allowable Kuhn-Tucker Error 0.19826E-04
Objective Function
-0.98259

ABSORBER
15 Trays
(3 Theoretical Stages)
32 psia

S11

S04

Feed
o

80 F
37 psia

DISTILLATION
30 Trays
(20 Theoretical Stages)

270 F

HC1

F-1
S01

D-1

S10

S07

Benzene,
0.1 Lb Mole/Hr
of MCB

25
psia

S05

S02

FLASH

S09

Steam
o

360 F

Steam
S03

H-1
U = 100
Feed Flow Rates

LB Moles/Hr
Optimization Variables
HC1
10
32.006 0.38578 200.00 120.00
Benzene 40
MCB
50
Tear Variables
0.10601E-19 13.064 79.229 120.00 50.000
Tear Variable Errors (Calculated Minus Assumed)
-0.10601E-19 0.72209E-06
-0.36563E-04 0.00000E+00
0.00000E+00
-Results of infeasible path optimization
-Simultaneous optimization and convergence of tear streams.

360o F

S08

12,000
2
Btu/hr- ft

T-1
S12

TREATER

90 F

Maximize
Profit

H-2
U = 100
S13
C

P-1

Water
80o F

S15
0

120 F
T

S14
MCB

92

Ammonia Process Optimization


Pr

H2

Tr
N2

N2
H2
CH4
Ar

Hydrogen Feed
5.2%
94.0%
0.79 %
---

Nitrogen Feed
99.8%
--0.02%
0.01%

To
Tf

Tf

Product

Hydrogen and Nitrogen feed are mixed, compressed, and combined


with a recycle stream and heated to reactor temperature. Reaction
occurs in a multibed reactor (modeled here as an equilibrium reactor) to
partially convert the stream to ammonia. The reactor effluent is cooled
and product is separated using two flash tanks with intercooling. Liquid
from the second stage is flashed at low pressure to yield high purity
NH3 product. Vapor from the two stage flash forms the recycle and is
recompressed.
93

Ammonia Process Optimization


Optimization Problem

Performance Characterstics

Max

5 SQP iterations.
2.2 base point simulations.
objective function improves by
$20.66 x 106 to $24.93 x 106.
difficult to converge flowsheet
at starting point

{Total Profit @ 15% over five years}

s.t. 105 tons NH3/yr.


Pressure Balance
No Liquid in Compressors
1.8 H2/N2 3.5
Treact 1000o F
NH3 purged 4.5 lb mol/hr
NH3 Product Purity 99.9 %
Tear Equations

Optimum
Objective Function($106)

24.9286

Starting point
20.659

1. Inlet temp. reactor (oF)

400

400

2. Inlet temp. 1st flash (oF)

65

65

3. Inlet temp. 2nd flash (oF)

35

35

4. Inlet temp. rec. comp. (oF)


5. Purge fraction (%)

80.52
0.0085

107
0.01

6. Reactor Press. (psia)

2163.5

2000

7. Feed 1 (lb mol/hr)

2629.7

2632.0

8. Feed 2 (lb mol/hr)

691.78

691.4

94

How accurate should gradients be for optimization?


Recognizing True Solution
KKT conditions and Reduced Gradients determine true solution
Derivative Errors will lead to wrong solutions!
Performance of Algorithms
Constrained NLP algorithms are gradient based
(SQP, Conopt, GRG2, MINOS, etc.)
Global and Superlinear convergence theory assumes accurate gradients
Worst Case Example (Carter, 1991)
Newtons Method generates an ascent direction and fails for any !

Min f ( x) = x T Ax
+ 1 /
A=
1 /

1/
+ 1 /

x0 = [1 1]T f ( x0 ) = x0

g ( x0 ) = f ( x0 ) + O( )
d = A1 g ( x0 )

95

Implementation of Analytic Derivatives


parameters, p

exit variables, s

Module Equations

c(v, x, s, p, y) = 0

Sensitivity
Equations

dy/dx
ds/dx
dy/dp
ds/dp

Automatic Differentiation Tools


JAKE-F, limited to a subset of FORTRAN (Hillstrom, 1982)
DAPRE, which has been developed for use with the NAG library (Pryce, Davis, 1987)
ADOL-C, implemented using operator overloading features of C++ (Griewank, 1990)
ADIFOR, (Bischof et al, 1992) uses source transformation approach FORTRAN code .
TAPENADE, web-based source transformation for FORTRAN code
Relative effort needed to calculate gradients is not n+1 but about 3 to 5
(Wolfe, Griewank)
96

Ammonia Process Optimization


(9 decisions and 6 tear variables)

Flash Recycle Optimization


(2 decisions + 7 tear variables)

Reactor

S3
S1

Mixer

S2

Flash

S6

Hi P

S5

S4

Ratio

Flash

S7
Lo P

2
3
1/2
2
Max S3(A) *S3(B) - S3(A) - S3(C) + S3(D) - (S3(E))

Flash

8000

200

GRG
SQP

GRG

6000

rSQP

100

1Numerical

Exact

CPU Seconds (VS 3200)

CPU Seconds (VS 3200)

SQP

rSQP

4000

2000

Numerical
1

Exact

97

Large-Scale SQP
Min
s.t.

f(z)
c(z)=0
zL z zU

Min
s.t.

f(zk)T d + 1/2 d T Wk d
c(zk) + (k)T d = 0
zL zk + d zU

Characteristics
Many equations and variables ( 100 000)
Many bounds and inequalities ( 100 000)
Few degrees of freedom (10 - 100)
Steady state flowsheet optimization
Real-time optimization
Parameter estimation
Many degrees of freedom ( 1000)
Dynamic optimization (optimal control, MPC)
State estimation and data reconciliation
98

Few degrees of freedom => reduced space SQP (rSQP)


Take advantage of sparsity of A=c(x)
project W into space of active (or equality constraints)
curvature (second derivative) information only needed in space of degrees of
freedom
second derivatives can be applied or approximated with positive curvature
(e.g., BFGS)
use dual space QP solvers
+ easy to implement with existing sparse solvers, QP methods and line search
techniques
+ exploits 'natural assignment' of dependent and decision variables (some
decomposition steps are 'free')
+ does not require second derivatives
- reduced space matrices are dense
- may be dependent on variable partitioning
- can be very expensive for many degrees of freedom
- can be expensive if many QP bounds
99

Reduced space SQP (rSQP)


Range and Null Space Decomposition
Assume no active bounds, QP problem with n variables and m
constraints becomes:

W k
kT
A

f ( x k )
Ak d

=
k
0 +
c( x )

Define reduced space basis, Zk n x (n-m) with (Ak)TZk = 0


Define basis for remaining space Yk n x m, [Yk Zk]n x n
is nonsingular.
Let d = Yk dY + Zk dZ to rewrite:

Y Z

0
k

k T

0W
k T
I A
k

A Y Z

0 0
k

dY k k
0
Y Z
dZ =
I
0

k
0f (x )

k
I c(x )
100

Reduced space SQP (rSQP)


Range and Null Space Decomposition
0

Y k T W k Y k
kT k k
Z W Y
Ak T Y k

0
kT

Y W Y
kT
Z W kZk
0

Ak dY
Y k T f ( x k )

0 d Z = Z k T f ( x k )

c( x k )
0 +

kT

(ATY) dY =-c(xk) is square, dY determined from bottom row.


Cancel YTWY and YTWZ; (unimportant as dZ, dY --> 0)
(YTA) O = -YTf(xk), can be determined by first order estimate
Calculate or approximate w= ZTWY dY, solve ZTWZ dZ =-ZTf(xk)-w
Compute total step: d = Y dY + Z dZ

101

Reduced space SQP (rSQP) Interpretation

dZ
dY
Range and Null Space Decomposition
SQP step (d) operates in a higher dimension
Satisfy constraints using range space to get dY
Solve small QP in null space to get dZ
In general, same convergence properties as SQP.
102

Choice of Decomposition Bases


1. Apply QR factorization to A. Leads to dense but well-conditioned Y and Z.
R
R
A = Q = [Y Z ]
0
0
2. Partition variables into decisions u and dependents v. Create
orthogonal Y and Z with embedded identity matrices (ATZ = 0, YTZ=0).

AT = u cT

v cT = [N

C]

N T C T
I
Z = 1 Y =

C
N
I

3. Coordinate Basis - same Z as above, YT = [ 0 I ]

Bases use gradient information already calculated.


Adapt decomposition to QP step
Theoretically same rate of convergence as original SQP.
Coordinate basis can be sensitive to choice of u and v. Orthogonal is not.
Need consistent initial point and nonsingular C; automatic generation
103

rSQP Algorithm
1. Choose starting point x0.
2. At iteration k, evaluate functions f(xk), c(xk) and their gradients.
3. Calculate bases Y and Z.
4. Solve for step dY in Range space from
(ATY) dY =-c(xk)
5. Update projected Hessian Bk ~ ZTWZ (e.g. with BFGS), wk (e.g., zero)
6. Solve small QP for step dZ in Null space.

Min ( Z T f ( x k ) + wk )T d Z + 1 / 2d Z B k d Z
T

s.t.
7.
8.
9.
10.
11.

xL x k + YdY + Zd Z xU

If error is less than tolerance stop. Else


Solve for multipliers using (YTA) = -YTf(xk)
Calculate total step d = Y dY + Z dZ.
Find step size and calculate new point, xk+1 = xk +
Continue from step 2 with k = k+1.
104

rSQP Results: Computational Results for


General Nonlinear Problems
Vasantharajan et al (1990)

105

rSQP Results: Computational Results


for Process Problems
Vasantharajan et al (1990)

106

Comparison of SQP and rSQP


Coupled Distillation Example - 5000 Equations
Decision Variables - boilup rate, reflux ratio

1.
2.
3.
4.
5.

Method
SQP*
rSQP
rSQP
rSQP
rSQP

CPU Time Annual Savings


2 hr
negligible
15 min. $ 42,000
15 min. $ 84,000
15 min. $ 84,000
15 min $107,000

Comments
Base Case
Base Case
Higher Feed Tray Location
Column 2 Overhead to Storage
Cases 3 and 4 together

18

10

1
QVK

QVK
107

Real-time Optimization with rSQP


Sunoco Hydrocracker Fractionation Plant
(Bailey et al, 1993)
Existing process, optimization on-line at regular intervals: 17 hydrocarbon
components, 8 heat exchangers, absorber/stripper (30 trays), debutanizer (20
trays), C3/C4 splitter (20 trays) and deisobutanizer (33 trays).
FUEL GAS

C3

REFORMER
NAPHTHA
RECYCLE
OIL

iC4

DIB

DEBUTANIZER

MIXED LPG

C3/C4
SPLITTER

LIGHT
NAPHTHA

MAIN FRAC.

PREFLASH

ABSORBER/
STRIPPER

REACTOR EFFLUENT FROM


LOW PRESSURE SEPARATOR

nC4

square parameter case to fit the model to operating data.


optimization to determine best operating conditions

108

Optimization Case Study Characteristics


Model consists of 2836 equality constraints and only ten independent variables. It
is also reasonably sparse and contains 24123 nonzero Jacobian elements.

NP

P = ziCi + ziC + zi Ci Pm U
G

iG

E
i

iE

m=1

Cases Considered:
1. Normal Base Case Operation
2. Simulate fouling by reducing the heat exchange coefficients for the debutanizer
3. Simulate fouling by reducing the heat exchange coefficients for splitter
feed/bottoms exchangers
4. Increase price for propane
5. Increase base price for gasoline together with an increase in the octane credit

109

110

Many degrees of freedom => full space IPOPT

W k +
kT
A

( x k )
Ak d

=
k
0 +
c( x )

work in full space of all variables


second derivatives useful for objective and constraints
use specialized large-scale Newton solver
+ W=xxL(x,) and A=c(x) sparse, often structured
+ fast if many degrees of freedom present
+ no variable partitioning required
- second derivatives strongly desired
- W is indefinite, requires complex stabilization
- requires specialized large-scale linear algebra
111

QHV
L
O
H
3LS

*DVROLQH%OHQGLQJ
2,/7$1.6

*DVROLQH%OHQGLQJ+HUH

*$667$7,216

6XSSO\WDQNV

,QWHUPHGLDWHWDQNV

),1$/352'8&7758&.6

)LQDO3URGXFWWDQNV
112

Blending Problem & Model Formulation

q t ,i

f t ,ij

..qt , j

f t , jk

qi

..qt ,k

.. f t ,k

max

( ck ft , k ci ft ,i )
t k

vk

s.t.

ft , jk ft , ij + vt + 1, j
k

=v
t, j

=0
ft , k f
t , jk
j

vt , j

qt , j ft , jk qt , i ft , ij + qt + 1, j vt + 1, j
k

Supply tanks (i)

Intermediate tanks (j)

Final Product tanks (k)

=q v
t, j t, j

q f
=0
q f
t , j t , jk
t, k t, k
j
qt ,k qk
qk
max
min

f & v ------ flowrates and tank volumes


q
------ tank qualities

vt , j v j
vj
max
min

Model Formulation in AMPL


113

Small Multi-day Blending Models


Single Qualities
Haverly, C. 1978 (HM)
F1

Audet & Hansen 1998 (AHM)

B1
F1

B1

P1
P1

F2
F2

F3

B2
P2

B2
F3

B3

114

Honeywell Blending Model Multiple Days


48 Qualities

115

Summary of Results Dolan-Mor plot


Performance profile (iteration count)
1
0.9
0.8
0.7

0.6
0.5

IPOPT
LOQO

0.4

KNITRO
SNOPT
MINOS

0.3

LANCELOT

0.2
0.1
0
1

10

100

1000

10000

100000

1000000

10000000

116

Comparison of NLP Solvers: Data Reconciliation


1000
LANCELOT

Iterations

800

MINOS

600

SNOPT

400

KNITRO
LOQO

200

IPOPT

0
0

200

400

600

Degrees of Freedom

CPU Time (s, norm.)

100

10

LANCELOT
M INOS
SNOPT

1
0

200

400

600

KNITRO
LOQO

0.1

IPOPT

0.01

Degrees of Freedom

117

Sensitivity Analysis for Nonlinear Programming

At nominal conditions, p0
Min f(x, p0)
s.t. c(x, p0) = 0
a(p0) x b(p0)
How is the optimum affected at other conditions, p p0?
Model parameters, prices, costs
Variability in external conditions
Model structure
How sensitive is the optimum to parameteric uncertainties?
Can this be analyzed easily?
118

Calculation of NLP Sensitivity


Take KKT Conditions

L(x*, p, , v) = 0
c(x*, p0) = 0
ETx* - bnd( p0)=0

and differentiate and expand about p0.


pxL(x*, p, , v)T + xxL(x*, p, , v)T px*T + xh(x*, p, , v)T pT + pvT = 0
pc(x*, p0)T + xc(x*, p0)T px*T = 0
ET (px*T - pbndT)=0
Notes:
A key assumption is that under strict complementarity, the active set will not
change for small perturbations of p.
If an element of x* is at a bound then pxi*T = pbndT
Second derivatives are required to calculate sensitivities, px*T
Is there a cheaper way to calculate px*T?
119

Decomposition for NLP Sensitivity


Let L( x*, p, , v) = f ( x*) + T c( x*) + ( E T ( x * bnd ( p )))T v
W
AT

E T

xp L( x*, p, , v)T
A E p x *T

T
T
0 0 p = p c( x*, p )

E T p bnd T
0 0 p vT

Partition variables into basic, nonbasic and superbasic

pxT= Z pxST + YpxBT + TpxNT


Set pxNT =pbndNT, nonbasic variables to rhs,
Substitute for remaining variables
Perform range and null space decomposition
Solve only for pxST andpxBT
120

Decomposition for NLP Sensitivity


Y T WY
T
Z WY
AT Y

T
Y T ( xp L( x*, p, , v)T + W T p xN T )
Y T WY Y T A p xB
T

T
T
T
T
Z WZ
0 p xS = Z ( xp L( x*, p, , v) + W T p x N )
T
T
T

0
0 p T

c
(
x
*,
p
)
+
A
T

x
p
p
N

Solve only for pxBT from bottom row andpxST from middle row
If second derivatives are not available, ZTWZ, ZTWY pxBT and
ZTWT pxNT can be constructed by directional finite differencing
If assumption of strict complementarity is violated, sensitivity can be
calculated using a QP subproblem.

121

Second Order Tests


Reduced Hessian needs to be positive definite
At solution x*: Evaluate eigenvalues of ZTxxL*Z
Strict local minimum if all positive.
Nonstrict local minimum: If nonnegative, find eigenvectors for
zero eigenvalues, regions of nonunique solutions
Saddle point: If any are negative, move along directions of
corresponding eigenvectors and restart optimization.
x2
z1
Saddle
Point

x*
x1
z2

122

Sensitivity for Flash Recycle Optimization


(2 decisions, 7 tear variables)
S3
S1

Mixer

S2

Flash

S6
S5

S4

Ratio

S7

2
3
1/2
2
Max S3(A) *S3(B) - S3(A) - S3(C) + S3(D) - (S3(E))

Second order sufficiency test:


Dimension of reduced Hessian = 1
Positive eigenvalue
Sensitivity to simultaneous change in feed rate and upper bound on purge ratio
Only 2-3 flowsheet perturbations required for second order information
123

Ammonia Process Optimization


(9 decisions, 8 tear variables)
20

Sensitivities vs. Re-optimized Pts


Reactor

19.5

Hi P
Flash

Lo P

Objective Function

Sensitivity
19

QP1
QP2

18.5
Actual
18

17.5

Flash

0.1

0.01

0.001

17

Relative perturbation change


Second order sufficiency test:
Dimension of reduced Hessian = 4
Eigenvalues = [2.8E-4, 8.3E-10, 1.8E-4, 7.7E-5]
Sensitivity to simultaneous change in feed rate and upper bound on reactor conversion
Only 5-6 extra perturbations for second derivatives
124

Multiperiod Optimization
Coordination

Case 1

1.

Case 2

Case 3

Case 4

Case N

Design plant to deal with different operating scenarios (over time or with
uncertainty)

2. Can solve overall problem simultaneously

large and expensive

polynomial increase with number of cases

must be made efficient through specialized decomposition


3. Solve also each case independently as an optimization problem (inner
problem with fixed design)

overall coordination step (outer optimization problem for design)

require sensitivity from each inner optimization case with design


variables as external parameters
125

Multiperiod Flowsheet Example


i

CA

CA

T0

T1

V
i

T2

F2

F1

A
W
i

Tw

Parameters
E (kJ/mol)
k0 (1/h)
F (kmol/h)
Time (h)

Tw

Period 1 Period 2 Period 3 Period 4


555.6
583.3
611.1
527.8
10
11
12
9
45.4
40.8
24.1
32.6
1000
4000
2000
1000
126

Multiperiod Design Model


Min f0(d) + i fi(d, xi)
s.t. hi(xi, d) = 0, i = 1, N
gi(xi, d) 0, i = 1, N
r(d) 0
Variables:
x: state (z) and control (u) variables in each operating period
d: design variables (e. g. equipment parameters) used
i: substitute for d in each period and add i = d
d
hi, gi

Min f0(d) + i fi(d, xi)


s.t. hi(xi, i) = 0, i = 1, N
gi(xi, i) +si = 0, i = 1, N
0 si, d i=0, i = 1, N
r(d) 0
127

Multiperiod Decomposition Strategy


SQP Subproblem

Block diagonal bordered KKT matrix


(arrowhead structure)
Solve each block sequentially (range/null dec.)
to form small QP in space of d variables
Reassemble all other steps from QP solution
128

Multiperiod Decomposition Strategy


From decomposition of KKT block in each period, obtain the
following directions that are parametric in d:

Substituting back into the original QP subproblem leads to


a QP only in terms of d.
N

T
minimize = d f 0 + { p f iT (ZBi + YBi ) + (ZAi + Y Ai )T 2p Li (Z Bi + YB i ) } d

i =1
N
2

+ d d L0 + { (ZBi + YBi )T 2p Li (Z Bi + YB i ) } d

i =1
2

subject to

r + dr d 0

Once d is obtained, directions are obtained from the above


equations.
129

Starting Point
Original QP
Decoupling

Multiperiod Decomposition Strategy


Active Set Update

pi steps are parametric in d and their


components are created independently
Decomposition linear in number of Calculate
Reduced
Hessian
periods and trivially parallelizable
Choosing the active inequality
constraints can be done through:
-Active set strategy (e.g., bound
Calculate
addition)
Step
-Interior point strategy using
barrier terms in objective
MINOR
Easy to implement in decomposition ITERATION

Null - Range
Decomposition

QP at d

Bound Addition

LOOP

Line search

Optimality
Conditions

NO

YES
Optimal Solution

130

Multiperiod Flowsheet 1
(13+2) variables and (31+4) constraints (1 period)
262 variables and 624 constraints (20 periods)

CA

T1
400

CA

T0

V
i

T2

F2

300

F1

CPU time (s)


SQP (T)
MINOS(T)
MPD/SQP(T)

200

100

A
W

0
0

Tw

Tw

10

Periods

20

30

131

Multiperiod Example 2 Heat Exchanger Network


(12+3) variables and (31+6) constraints (1 period)
243 variables and 626 constraints (20 periods)

H1
T
T
C1

50

40

563 K

30

300 K
Q

T
A

20

SQP (T)
MINOS (T)
MPD/SQP (T)

CPU time (s)

393 K

10
320 K

0
0
350 K

10

Periods

20

30

132

Summary and Conclusions


-Unconstrained Newton and Quasi Newton Methods
-KKT Conditions and Specialized Methods
-Reduced Gradient Methods (GRG2, MINOS)
-Successive Quadratic Programming (SQP)
-Reduced Hessian SQP
-Interior Point NLP (IPOPT)
Process Optimization Applications
-Modular Flowsheet Optimization
-Equation Oriented Models and Optimization
-Realtime Process Optimization
-Blending with many degrees of freedom
Further Applications
-Sensitivity Analysis for NLP Solutions
-Multiperiod Optimization Problems
133

Optimization of DifferentialAlgebraic Equation Systems


L. T. Biegler
Chemical Engineering Department
Carnegie Mellon University
Pittsburgh, PA

DAE Optimization Outline


I
II

III

IV

Introduction
Process Examples
Parametric Optimization
- Gradient Methods
Perturbation
Direct - Sensitivity Equations
Adjoint Equations
Optimal Control Problems
- Optimality Conditions
- Model Algorithms
Sequential Methods
Multiple Shooting
Indirect Methods
Simultaneous Solution Strategies
- Formulation and Properties
- Process Case Studies
- Software Demonstration
2

Dynamic Optimization Problem

min (z(t), y(t), u(t), p, t f )

s.t.

dz(t)
= f (z(t), y(t),u(t),t, p)
dt

g (z (t ), y(t ), u(t ), t , p ) = 0
zo
z

t, time
z, differential variables
y, algebraic variables

z(0 )
d z(t) d z

d y(t) d y

d u(t) d u

d p d p

u
u

tf, final time


u, control variables
p, time independent parameters
3

DAE Models in Process Engineering


Differential Equations
Conservation Laws (Mass, Energy, Momentum)
Algebraic Equations
Constitutive Equations, Equilibrium (physical properties,
hydraulics, rate laws)
Semi-explicit form
Assume to be index one (i.e., algebraic variables can be solved
uniquely by algebraic equations)
If not, DAE can be reformulated to index one (see Ascher and
Petzold)
Characteristics
Large-scale models not easily scaled
Sparse but no regular structure
Direct linear solvers widely used
Coarse-grained decomposition of linear algebra
4

Parameter Estimation
Catalytic Cracking of Gasoil (Tjoa, 1991)
p3
p1
p2
A
Q, Q
S , A
S

a = ( p1 + p3 )a 2

1.0

q = p1a 2 p2 q

0.8

a(0) = 1, q (0) = 0

0.6

YA_data
YQ_data

Yi

YA_estimate

0.4

number of states and ODEs: 2


number of parameters:3
no control profiles
constraints: pL p pU

YQ_estimate
0.2
0.0
0.0

0.2

0.4

0.6

0.8

1.0

Objective Function: Ordinary Least Squares


(p1, p2, p3)0 = (6, 4, 1)
(p1, p2, p3)* = (11.95, 7.99, 2.02)
(p1, p2, p3)true = (12, 8, 2)
5

Batch Distillation Multi-product Operating Policies


5XQEHWZHHQGLVWLOODWLRQEDWFKHV
7UHDWDVERXQGDU\YDOXHRSWLPL]DWLRQSUREOHP
:KHQWRVZLWFKIURP$WRRIIFXW WR%"
+RZPXFKRIIFXW WRUHF\FOH"
5HIOX["
%RLOXS 5DWH"
2SHUDWLQJ7LPH"
$

Nonlinear Model Predictive Control (NMPC)


NMPC Estimation and Control
d : disturbances

u : manipulated
variables

Process

z : differential states
y : algebraic states

Why NMPC?




NMPC Controller
z = F (z , y , u , p, d )
0 = G (z , y , u , p, d )

Track a profile
Severe nonlinear dynamics (e.g,
sign changes in gains)
Operate process over wide range
(e.g., startup and shutdown)

ysp : set points

Model Updater
z = F (z , y , u , p, d )
0 = G (z , y , u , p, d )

NMPC Subproblem
min
u

s.t .

|| y (t ) y

sp

|| Q y + || u(t k ) u(t k 1 ) || Q u

z (t ) = F ( z (t ), y (t ), u (t ), t )
0 = G ( z (t ), y (t ), u(t ), t )
z (t ) = z init
Bound Constraint s
Other Constraint s
7

Batch Process Optimization


Optimization of dynamic batch process operation resulting from reactor and
distillation column
DAE models:
A+BC
z = f(z, y, u, p)
C+BP+E
g(z, y, u, p) = 0
P+CG
number of states and DAEs: nz + ny
0
zi,I
parameters for equipment design
(reactor, column)
nu control profiles for optimal operation

0
zi,II

0
zi,III

0
zi,IV

zi,If

f
zi,II

f
zi,III

f
zi,IV

Bi

uL u(t) uU
zL z(t) zU
yL y(t) yU
pL p pU
Objective Function: amortized economic function at end of cycle time tf
Constraints:

640

25
C o nsta nt

C o nsta nt

630

Dyn a m ic

Dyn a m ic

20

620
610

15

600
10
590
580

5
0

0.25

0.5 0.75
Tim e (h r.)

1.25

optimal reactor temperature policy

0.5

1.5

2.5

optimal column reflux ratio

Reactor Design Example


Plug Flow Reactor Optimization
The cracking furnace is an important example in the olefin production industry, where various
hydrocarbon feedstocks react. Consider a simplified model for ethane cracking (Chen et al.,
1996). The objective is to find an optimal profile for the heat flux along the reactor in order to
maximize the production of ethylene.
C H
2 4
F
exit
Max
s.t. DAE

Texit 1180K

The reaction system includes six molecules, three free radicals, and seven reactions. The
model also includes the heat balance and the pressure drop equation. This gives a total of
eleven differential equations.
CH +C H CH +C H
C H 2CH
3
26
3
H+CH H + C H
26
2 25
C H +C H CH + CH
25 2 4 36
3
H+C H C H
2 4 25

26
4 25
C H C H + H
2 5
24
2CH C H
25
4 10

2500

2000

1500

3
1000

500

1
0

Heat flux kJ/m2s

Flow rate mol/s

Concentration and Heat Addition Profile

0
0

10

Le ngth m
C2H4

C2H6

log(H2)+12

Dynamic Optimization Approaches

Pontryagin(1962)

DAE Optimization Problem

Variational Approach
Inefficient for constrained
problems

Discretize
controls

Apply a NLP solver

Vassiliadis(1994)

Sequential Approach

Efficient for constrained problems

10

Sequential Approaches - Parameter Optimization

Consider a simpler problem without control profiles:


e.g., equipment design with DAE models - reactors, absorbers, heat exchangers
Min

(z(tf))

z = f(z, p), z (0) = z0


g(z(tf)) 0, h(z(tf)) = 0
By treating the ODE model as a "black-box" a sequential algorithm can be constructed that can
be treated as a nonlinear program.
NLP
Solver
P

ODE
Model

,g,h
z (t)

Gradient
Calculation

Task: How are gradients calculated for optimizer?


11

Gradient Calculation

Perturbation
Sensitivity Equations
Adjoint Equations
Perturbation
Calculate approximate gradient by solving ODE model (np + 1) times
Let = , g and h (at t = tf)
d/dpi = { (pi + pi) - (pi)}/ pi
Very simple to set up
Leads to poor performance of optimizer and poor detection of optimum
unless roundoff error (O(1/pi) and truncation error (O(pi)) are small.
Work is proportional to np (expensive)
12

Direct Sensitivity
From ODE model:

{z = f ( z, p, t ), z (0) = z0 ( p)}
p
z (t )
i = 1, ...np
define si (t ) =
pi
d
f f
z (0)
si = ( si ) =
+
si , si (0) =
dt
pi z
pi
T

(nz x np sensitivity equations)


z and si , i = 1,np, an be integrated forward simultaneously.
for implicit ODE solvers, si(t) can be carried forward in time after converging on z
linear sensitivity equations exploited in ODESSA, DASSAC, DASPK, DSL48s and a
number of other DAE solvers
Sensitivity equations are efficient for problems with many more constraints than
parameters (1 + ng + nh > np)

13

Example: Sensitivity Equations


z1 = z12 + z 22
z 2 = z1 z 2 + z1 pb
z1 = 5, z 2 (0) = pa
s (t ) a , j = z (t ) j / pa , s (t ) b , j = z (t ) j / pb , j = 1,2
sa ,1 = 2 z1 sa ,1 + 2 z 2 sa , 2
sa , 2 = z1 sa , 2 + z 2 sa ,1 + sa ,1 pb
sa ,1 = 0, sa , 2 (0) = 1
sb ,1 = 2 z1 sb ,1 + 2 z 2 sb , 2
sb , 2 = z1 + z1 sb , 2 + z 2 sb ,1 + sb ,1 pb
sb ,1 = 0, sb , 2 (0) = 0
14

Adjoint Sensitivity
Adjoint or Dual approach to sensitivity
Adjoin model to objective function or constraint
( = ,g or h)

tf

= (t f ) T ( z f ( z , p, t ))dt
0

tf

= (t f ) + (0) T z 0 ( p) (t f ) T z (t f ) + z T + T F ( z, p, t ))dt
0

((t)) serve as multipliers on ODEs)


Now, integrate by parts

tf
T
( z (t f ))

z0 ( p)

f
f

d =
(t f ) z (t f ) +
(0) dp + + z (t ) + dp dt
z

z (t f )
p

p
0
T

and find d/dp subject to feasibility of ODEs


Now, set all terms not in dp to zero.
15

Adjoint System
( z (t f ))
f
= (t ), (t f ) =
z
z (t f )
tf

d z0 ( p )
=
(0) + (t ) dt
dp
p
p

0
Integrate model equations forward
Integrate adjoint equations backward and evaluate integral and sensitivities.
Notes:
nz (ng + nh + 1) adjoint equations must be solved backward (one for each
objective and constraint function)
for implicit ODE solvers, profiles (and even matrices) can be stored and
carried backward after solving forward for z as in DASPK/Adjoint (Li and
Petzold)
more efficient on problems where: np > 1 + ng + nh
16

Example: Adjoint Equations


z1 = z12 + z22
z1 = z1 z 2 + z1 pb
z1 = 5, z2 (0) = pa
Form T f ( z , p, t ) = 1 ( z12 + z22 ) + 2 ( z1 z 2 + z1 pb )

( z (t f ))
f
(t ), (t f ) =
z
z (t f )
t

f
f

d z0 ( p )
=
(0) + (t ) dt
dp
p
p

then becomes :

1 = 21 z1 2 ( z 2 + pb ), 1 (t f ) =
2 = 21 z 2 2 z1 , 2 (t f ) =
d (t f )
dpa
d (t f )
dpb

(t f )
z1 (t f )

(t f )
z 2 (t f )

= 1 (0)

tf

= 2 (t )z1 (t )dt
0

17

Example: Hot Spot Reactor


L

Min

TP ,TR , L ,TS

s.t.

= L (T (t ) TS / TR )dt
0

dq
= 0.3(1 q (t )) exp[20 20 / T (t )], q (0) = 0
dt
dT
dq
= 1.5(T (t ) TS / TR ) + 2 / 3 , T (0) = 1
dt
dt
+ feed (TR ,110o C) -H product(TP , T(L)) = 0
TP = 120o C, T(L) = 1 + 10 o C/TR

TR

3:1 B/A
383 K

A + 3B --> C + 3D
L

TP

Ts

TP = specified product temperature


TR = reactor inlet, reference temperature
L = reactor length
Ts = steam sink temperature
q(t) = reactor conversion profile
T(t) = normalized reactor temperature profile

Cases considered:

Hot Spot - no state variable constraints

Hot Spot with T(t) 1.45


18

Hot Spot Reactor: Unconstrained Case


Method: SQP (perturbation derivatives)
L(norm) TR(K) TS(K) TP(K)
Initial:
1.0
462.23 425.26 250
Optimal:
1.25
500
470.1 188.4
13 SQP iterations / 2.67 CPU min. (Vax II)
1.6

1.2

1.5

Normalized Temperature

1.0

Conversion, q

0.8

0.6

0.4

0.2

1.3

1.2

1.1

1.0

0.0
0.0

1.4

0.5

Normalized Length

1.0

1.5

0.0

0.5

1.0

1.5

Normalized Length

Constrained Temperature Case: could not be solved with sequential method


19

Tricks to generalize classes of problems


Variable Final Time (Miele, 1980)
Define

t = pn+1 , 0 1, pn+1 = tf

Let dz/dt = (1/ pn+1) dz/d = f(z, p) dz/d = (pn+1) f(z, p)


Converting Path Constraints to Final Time
Define measure of infeasibility as a new variable, znz+1(t) (Sargent & Sullivan, 1977):
tf

z nz +1 (t f ) = max(0, g j ( z (t ), u (t )) 2 dt
j

or znz +1 (t ) = max(0, g j ( z (t ), u (t )) 2 , z nz +1 (0) = 0


j

Enforce z nz +1 (t f ) (however, constraint is degenerate)

20

Profile Optimization - (Optimal Control)

Optimal Feed Strategy (Schedule) in Batch Reactor


Optimal Startup and Shutdown Policy
Optimal Control of Transients and Upsets

Sequential Approach: Approximate control profile as through parameters (piecewise


constant, linear, polynomial, etc.)
Apply NLP to discretization as with parametric optimization
Obtain gradients through adjoints (Hasdorff; Sargent and Sullivan; Goh and Teo) or
sensitivity equations (Vassiliadis, Pantelides and Sargent; Gill, Petzold et al.)

Variational (Indirect) approach: Apply optimality conditions and solve as boundary


value problem

21

Derivation of Variational Conditions


Indirect Approach
Optimality Conditions (Bound constraints on u(t))
Min (z(tf))
s.t. dz/dt = f(z, u), z (0) = z0
g (z(tf)) 0
h (z(tf)) = 0
a u(t) b
Form Lagrange function - adjoin objective function and constraints:
= ( t f ) + g ( z ( t f )) T + h ( z ( t f )) T v

tf

T ( z f ( z , u )) + aT ( a u ( t )) + bT ( u ( t ) b ) dt
Integrate by parts :

= ( t f ) + g ( z ( t f )) T + h ( z ( t f )) T v + T ( 0 ) z ( 0 ) T ( t f ) z ( t f )
+

tf

T z + T f ( z , u ) + aT ( a u ( t )) + bT ( u ( t ) b ) dt
22

Derivation of Variational Conditions


h
g

=
+
+
v z ( t f ) + T ( 0 ) z ( 0 )
z
z
z

tf

 f ( z , u )
f ( z , u )

+
+
+

z
(
t
)

b
a u ( t ) dt 0

At optimum, 0. Since u is the control variable, let all other terms vanish.
z(tf):
h
g
(t f ) = + +
z z
z t =t f
z(0): (0) = 0 (if z(0) is not specified)
H
f
z(t):


Define Hamiltonian, H = Tf(z,u)


For u not at bound: f
H
=
=0
u
u
H
= a b
For u at bounds:
u
H
Upper bound, u(t) = b,
= b 0
u

Ta (a u(t))
bT (u(t) b)
ua u(t) ub
a 0, b 0

Lower bound, u(t) = a,

H
= a 0
u
23

Car Problem
Travel a fixed distance (rest-to-rest) in minimum time.
Min x3(t f )
Min t f
s.t. x1 = x2
s.t. x" = u
x2 = u
a u (t ) b
x3 = 1
x(0) = 0, x(t f ) = L
a u (t ) b
x(0) = 0, x(t f ) = 0
x1 (0) = 0, x1 (t f ) = L
x2 (0) = 0, x2 (t f ) = 0
Hamiltonian : H = 1 x2 + 2u + 3
Adjoints :  = 0 ==> (t ) = c
1

2 = 1 ==> 2 (t ) = c2 + c1 (t f t )
 = 0 ==> (t ) = 1, (t ) = 1
3

t = 0, c1t f + c2 > 0, u = b
H
= 2 = c2 + c1 (t f t )
u
t = t f , c2 > 0, u = a
Crossover (2 = 0) occurs at t = t s

24

Car Problem
Analytic Variational Solution
b

Optimal Profile
u(t)

From state equations:

1 / 2 bt 2 ,t < t s
x1(t) =
2
1 / 2 bt 2s - a (t s - t f ) , t t s

x2 (t) = bt, t < t s

bts + a (t - t s ), t t s
Problem is linear in u(t). Frequently
these problems have "bang-bang"
character.
For nonlinear and larger problems, the
variational conditions can be solved
numerically as boundary value
problems.

ts

tf

Apply boundary conditions at t = tf:


x1(tf) = 1/2 (b ts2 - a (ts - tf)2) = L
x2(tf) = bts + a (tf - ts) = 0
ts = 2L 1/2
b (1- b / a )

1/ 2

tf =

2L
(1 b / a)
b(1 - b / a )

25

Example: Batch reactor - temperature profile


Maximize yield of B after one hours operation by manipulating a transformed
temperature, u(t).

Minimize -zB(1.0)
s.t.
zA = -(u+u2/2) zA
u
zB = u zA
A
zA(0) = 1
2
zB(0) = 0
u /2
C
u(T(t))
0 u(t) 5
Adjoint Equations:
H = -A(u+u2/2) zA + B u zA
H/u = A (1+u) zA + B zA
A = A(u+u2/2) - B u, A(1.0) = 0
B = 0,
B(1.0) = -1

Cases Considered
1. NLP Approach - piecewise constant and linear profiles.
2. Control Vector Iteration
26

Batch Reactor Optimal Temperature Program


Piecewise Constant

Optimal Profile, u(t)

0.

0.2

0.4
Time, h

0.6

0.8

1.0

Results
Piecewise Constant Approximation with Variable Time Elements
Optimum B/A: 0.57105
27

Batch Reactor Optimal Temperature Program


Piecewise Linear

Optimal Profile, u(t)

0.

0.2

0.4
Time, h

0.6

0.8

1.0

Results:
Piecewise Linear Approximation with Variable Time Elements
Optimum B/A: 0.5726
Equivalent # of ODE solutions: 32
28

Batch Reactor Optimal Temperature Program


Indirect Approach

Optimal Profile, u(t)

0.

0.2

0.4
Time, h

0.6

0.8

1.0

Results:
Control Vector Iteration with Conjugate Gradients
Optimum (B/A): 0.5732
Equivalent # of ODE solutions: 58
29

Dynamic Optimization - Sequential Strategies


Small NLP problem, O(np+nu) (large-scale NLP solver not required)

Use NPSOL, NLPQL, etc.

Second derivatives difficult to get


Repeated solution of DAE model and sensitivity/adjoint equations, scales with
nz and np

Dominant computational cost

May fail at intermediate points


Sequential optimization is not recommended for unstable systems. State
variables blow up at intermediate iterations for control variables and
parameters.
Discretize control profiles to parameters (at what level?)
Path constraints are difficult to handle exactly for NLP approach

30

Instabilities in DAE Models


This example cannot be solved with sequential methods (Bock, 1983):
dy1/dt = y2
dy2/dt = 2 y1 + (2 2) sin ( t)
The characteristic solution to these equations is given by:
y1(t) = sin ( t) + c1 exp(- t) + c2 exp( t)
y2 (t) = cos ( t) - c1 exp(- t) + c2 exp( t)

Both c1 and c2 can be set to zero by either of the following equivalent


conditions:
IVP

y1(0) = 0, y2 (0) =

BVP

y1(0) = 0, y1(1) = 0

31

IVP Solution
If we now add roundoff errors e1 and e2 to the IVP and BVP conditions, we
see significant differences in the sensitivities of the solutions.
For the IVP case, the sensitivity to the analytic solution profile is seen by
large changes in the profiles y1(t) and y2(t) given by:

y1(t) = sin ( t) + (e1 - e2/) exp(- t)/2


+(e1 + e2/) exp( t)/2
y2 (t) = cos ( t) - ( e1 - e2) exp(- t)/2
+ ( e1 + e2) exp( t)/2
Therefore, even if e1 and e2 are at the level of machine precision (< 10-13), a
large value of and t will lead to unbounded solution profiles.
32

BVP Solution
On the other hand, for the boundary value problem, the errors
affect the analytic solution profiles in the following way:
y1(t) = sin ( t) + [e1 exp()- e2] exp(- t)/[exp() - exp(- )]
+ [e1 exp(- ) - e2] exp( t)/[exp() - exp(- )]
y2(t) = cos ( t) [e1 exp()- e2] exp(- t)/[exp() - exp(- )]
+ [e1 exp(-) - e2] exp( t)/[exp() - exp(- )]
Errors in these profiles never exceed t (e1 + e2), and as a result a
solution to the BVP is readily obtained.

33

BVP and IVP Profiles


e1, e2 = 10-9
Linear BVP solves easily
IVP blows up before midpoint

34

Dynamic Optimization Approaches


Pontryagin(1962)

Variational Approach

DAE Optimization Problem

Inefficient for constrained


problems

Apply a NLP solver

Discretize
controls

Efficient for constrained problems


Discretize some
state variables

Vassiliadis(1994)

Sequential Approach

Can not handle instabilities properly


Small NLP

Multiple Shooting
Handles instabilities

Larger NLP

35

Multiple Shooting for Dynamic Optimization


Divide time domain into separate regions

Integrate DAEs state equations over each region


Evaluate sensitivities in each region as in sequential approach wrt uij, p and zj
Impose matching constraints in NLP for state variables over each region
Variables in NLP are due to control profiles as well as initial conditions in each region

36

Multiple Shooting
Nonlinear Programming Problem
min (z (t f ), y (t f ) )
s.t.

ui , j , p

z ( z j , u i , j , p , t j +1 ) z j +1 = 0
z k z ( z j , u i, j , p, t k ) z k

y k y ( z j , u i, j , p, t k ) y k

u i u i, j u i
l

pl p pu

dz
= f (z, y, ui, j , p), z(tj ) = z j
dt
g (z,y,u i, j , p ) = 0
z 0 = z (0)

minn f ( x)
s.t c( x) = 0
x L x xu

Solved Implicitly
37

BVP Problem Decomposition


IC
B1

A1
B2

A2
B3

A3
B4

A4

BN

AN
FC

Consider: Jacobian of Constraint Matrix for NLP


bound unstable modes with boundary conditions (dichotomy)
can be done implicitly by determining stable pivot sequences in multiple shooting constraints
approach
well-conditioned problem implies dichotomy in BVP problem (deHoog and Mattheij)
Bock Problem (with t = 50)
Sequential approach blows up (starting within 10-9 of optimum)
Multiple Shooting optimization requires 4 SQP iterations
38

Dynamic Optimization Multiple Shooting Strategies


Larger NLP problem O(np+nu+NE nz)
Use SNOPT, MINOS, etc.
Second derivatives difficult to get

Repeated solution of DAE model and sensitivity/adjoint equations, scales with


nz and np
Dominant computational cost
May fail at intermediate points

Multiple shooting can deal with unstable systems with sufficient time
elements.
Discretize control profiles to parameters (at what level?)
Path constraints are difficult to handle exactly for NLP approach
Block elements for each element are dense!
Extensive developments and applications by Bock and coworkers using
MUSCOD code
39

Dynamic Optimization Approaches


Pontryagin(1962)

Variational Approach

DAE Optimization Problem

Inefficient for constrained


problems

Apply a NLP solver

Discretize
controls

Efficient for constrained problems


Discretize all
state variables

Vassiliadis(1994)

Sequential Approach

Can not handle instabilities properly


Small NLP

Simultaneous Approach
Handles instabilities

Large NLP

40

Nonlinear Programming Formulation

Nonlinear Dynamic
Optimization Problem

Continuous variables

Collocation on
finite Elements

Discretized variables

Nonlinear Programming
Problem (NLP)

41

Discretization of Differential Equations


Orthogonal Collocation
Given: dz/dt = f(z, u, p), z(0)=given
Approximate z and u by Lagrange interpolation polynomials (order
K+1 and K, respectively) with interpolation points, tk
K

k =0

j =0
jk

z K +1 (t ) = z k " k (t ), " k (t ) =
K

k =1

j =1
j k

u K (t ) = uk " k (t ), " k (t ) =

(t t j )
(t k t j )

(t t j )
(t k t j )

==> z N +1 (t k ) = z k

==> u N (t k ) = u k

Substitute zN+1 and uN into ODE and apply equations at tk.


K

r (t k ) = z j " j (t k ) f ( z k , uk ) = 0, k = 1,...K
j =0

42

Collocation Example
K

k =0

j =0
jk

z K +1 (t ) = z k " k (t ), " k (t ) =

(t t j )
(t k t j )

==> z N +1 (t k ) = z k

t 0 = 0 , t 1 = 0 .21132 , t 2 = 0 .78868
" 0 (t) = (t

" (t) = t/ 3 - 1 / 6
0
" 1(t) = - 8 .195 t 2 + 6 .4483 t , " 1(t) = 6 .4483 - 16 .39 t
" 2 (t) = 2 .19625 t 2 - 0 .4641 t , " 2 (t) = 4 .392 t - 0 .46412
2

- t + 1 )/ 6 ,

Solve z = z 2 - 3 z + 2 , z( 0 ) = 0
==> z 0 = 0
2
z 0 " 0 (t 1 ) + z 1 " 1(t 1 ) + z 2 " 2 (t 1 ) = z 1 - 3 z 1 + 2
( 2 .9857 z 1 + 0 .46412 z 2 = z 1 - 3 z 1 + 2 )
2
z 0 " 0 (t 2 ) + z 1 " 1(t 2 ) + z 2 " 2 (t 2 ) = z 2 - 3 z 2 + 2
2

(- 6 .478 z 1 + 3 z 2 = z 2 - 3 z 2 + 2 )
2

z 0 = 0 , z 1 = 0 .291 ( 0 .319 ), z

= 0 .7384 ( 0 .706 )

z(t) = 1.5337 t - 0 .76303 t 2


43

Converted Optimal Control Problem


Using Collocation

to Nonlinear Program

z N+1 (t)
z(t)

State Profile

(z(tf))
z = f(z, u, p), z(0)=z0
g(z(t), u(t), p) 0
h(z(t), u(t), p) = 0

Min
s.t.

Min (z f )

(t k ) f ( zk , uk ) = 0, z0 = z(0)
j =0

g(zk ,uk ) 0
k = 1,...K

h(zk ,uk ) = 0

z "
j

z "
j =0

(1) z f = 0

t1

t2

t3

tf

r(t)

t1

t2

t3

How accurate is approximation


44

Results of Optimal Temperature Program


Batch Reactor (Revisited)

Results - NLP with Orthogonal Collocation


Optimum B/A - 0.5728
# of ODE Solutions - 0.7 (Equivalent)
45

Collocation on Finite Elements


i 1

dz 1 dz
=
dt hi d

tij = hi + hi j , [0,1]
i =1

dz
= hi f ( z , u )
d

ti

to

q=2

u
K

q=1

z(t) = " q(t) ziq

tf
hi
Discontinuous Algebraic and
Control variables

Mesh points

Continuous Differential variables


element i

Collocation points

Finite element, i

Polynomials

y(t) = "q(t) yiq


q=1

u
K

u(t) = "q(t) uiq


q=1

q=0

r (tik ) = ( zij " j ( k )) hi f ( zik , uik , p) = 0, k = 1,..K, i = 1,.. NE


j =0

46

Nonlinear Programming Problem


min (z f
s.t.

( z "
ij

j =0

minn f ( x)
( k )) hi f ( zik , uik , p) = 0

g (zi ,k , yi ,k ,ui ,k , p ) = 0
K

(z
j =0

i 1, j

" j (1)) zi 0 = 0, i = 2,..NE

(z
j =0

NE , j

s.t c( x) = 0
x L x xu

" j (1)) z f = 0, z10 = z (0)

z i, j z i, j z i, j
l

Finite elements, hi, can also be variable to


determine break points for u(t).

y i , j y i, j y i , j

ui, j ui, j ui, j

pl p pu

Add hu hi 0, hi=tf

Can add constraints g(h, z, u) for


approximation error
47

Hot Spot Reactor Revisited


L

Min

TP ,TR , L ,TS

s.t.

= L (T (t ) TS / TR )dt
0

dq
= 0.3(1 q (t )) exp[20 20 / T (t )], q (0) = 0
dt
dT
dq
= 1.5(T (t ) TS / TR ) + 2 / 3 , T (0) = 1
dt
dt
+ feed (TR ,110o C) -H product(TP , T(L)) = 0
TP = 120o C, T(L) = 1 + 10 o C/TR

TR

3:1 B/A
383 K

A + 3B --> C + 3D
L

TP

Ts

TP = specified product temperature


TR = reactor inlet, reference temperature
L = reactor length
Ts = steam sink temperature
q(t) = reactor conversion profile
T(t) = normalized reactor temperature profile

Cases considered:

Hot Spot - no state variable constraints

Hot Spot with T(t) 1.45


48

Base Case Simulation


Method: OCFE at initial point with 6 equally spaced elements
L(norm) TR(K)
TS(K)
TP(K)
Base Case:
1.0
462.23 425.26 250

1.8

integrated profile
collocation

1.6

Temperature

Conversion

integrated profile
collocation

1.4

1.2

0
0.0

1.0
0.2

0.4

0.6

Normalized Length

0.8

1.0

1.2

0.0

0.2

0.4

0.6

0.8

1.0

1.2

Normalized Length

49

Unconstrained Case
Method: OCFE combined formulation with rSQP
identical to integrated profiles at optimum
L(norm) TR(K)
TS(K)
TP(K)
Initial:
1.0
462.23 425.26 250
Optimal:
1.25
500
470.1
188.4
123 CPU s. (Vax II)
= -171.5
1.6

1.0

1.5

Normalized Temperature

1.2

Conversion, q

0.8

0.6

0.4

0.2

0.0
0.0

0.5

1.0

Normalized Length

1.5

1.4
1.3
1.2
1.1
1.0
0.0

0.5

1.0

1.5

Normalized Length

50

Temperature Constrained Case

T(t) 1.45

Method: OCFE combined formulation with rSQP,


identical to integrated profiles at optimum
L(norm) TR(K)
TS(K)
Initial:
1.0
462.23
425.26
Optimal:
1.25
500
450.5

TP(K)
250
232.1

57 CPU s. (Vax II), = -148.5


1.2

1.5

Temperature

1.0

Conversion

0.8

0.6

0.4

1.3
1.2
1.1

0.2

0.0
0.0

1.4

0.5

Normalized Length

1.0

1.5

1.0
0.0

0.5

1.0

1.5

Normalized Length

51

Theoretical Properties of Simultaneous Method


A. Stability and Accuracy of Orthogonal Collocation

Equivalent to performing a fully implicit Runge-Kutta integration of


the DAE models at Gaussian (Radau) points
2K order (2K-1) method which uses K collocation points
Algebraically stable (i.e., possesses A, B, AN and BN stability)

B. Analysis of the Optimality Conditions

An equivalence has been established between the Kuhn-Tucker


conditions of NLP and the variational necessary conditions
Rates of convergence have been established for the NLP method

52

Simultaneous DAE Optimization


Case Studies

Reactor - Based Flowsheets


Fed-Batch Penicillin Fermenter
Temperature Profiles for Batch Reactors
Parameter Estimation of Batch Data
Synthesis of Reactor Networks
Batch Crystallization Temperature Profiles
Grade Transition for LDPE Process
Ramping for Continuous Columns
Reflux Profiles for Batch Distillation and Column Design
Source Detection for Municipal Water Networks
Air Traffic Conflict Resolution
Satellite Trajectories in Astronautics
Batch Process Integration
Optimization of Simulated Moving Beds

53

Production of High Impact Polystyrene (HIPS)


Startup and Transition Policies (Flores et al., 2005a)

Monomer,
Transfer/Term.
agents

Coolant

Catalyst
Polymer

54

Phase Diagram of Steady States


Transitions considered among all steady state pairs
600
650
1: Qi = 0.0015
2: Qi = 0.0025
Upper SteadyState
3: Qi = 0.0040

A5
N3

550

600

System State

550

500

Temperature (K)

Temperature (K)

Medium SteadyState
450

400

A1

1
500

1: Qcw = 10
2: Qcw = 1.0
3: Qcw = 0.1

450

N2

Lower SteadyState

400

A2

N2

350
A3

350

A4

Bifurcation
Parameter
3
B1

N1

300

B2

C1
B3

C2

300

0.5

1.5

2
2.5
3
Cooling water flowrate (L/s)

3.5

4.5

0.02

0.04

0.06
Initiator flowrate (L/s)

0.08

0.1

0.12
55

x 10

0.5

0.5

400

1
Time [h]

1.5

Cooling water Flow. [l/sec]Jacket Temp. [K] Monomer Conc. [mol/l]

Initiator Flow. [l/sec] Reactor Temp. [K] Initiator Conc. [mol/l]

Startup to Unstable Steady State


10
8
6
4

0.5

1
Time [h]

1.5

0.5

1
Time [h]

1.5

0.5

1
Time [h]

1.5

320
310

350

300

300
1.5

3
0
x 10

0.5

1
Time [h]

1.5

290
1

0.5

Feedrate Flow. [l/sec]

0.5
0

20

40
60
Time [h]

80

100

2
1
0

0.5

1
Time [h]

1.5

926 variables
476 constraints
36 iters. / 0.95 CPU s (P4)
56

HIPS Process Plant (Flores et al., 2005b)

Many grade transitions considered with stable/unstable pairs


1-6 CPU min (P4) with IPOPT
Study shows benefit for sequence of grade changes to
achieve wide range of grade transitions.
57

Batch Distillation Optimization Case Study - 1


R(t)

D(t), x

dx i,d
dt
dx i,0

V(t)
xb

y i, N - x i, d

H cond

x i, d - x i,0

dt
S
dS
-V
=
dt R + 1
=

*DXJHHIIHFWRIFROXPQKROGXSV
2YHUDOOSURILWPD[LPL]DWLRQ
0DNH7UD\&RXQW&RQWLQXRXV

58

2SWLPL]DWLRQ&DVH6WXG\ 
0RGHOLQJ$VVXPSWLRQV
,GHDO7KHUPRG\QDPLFV
1R/LTXLG7UD\+ROGXS
1R9DSRU+ROGXS
)RXUFRPSRQHQWPL[WXUH 
6KRUWFXWVWHDG\VWDWHWUD\PRGHO
)HQVNH8QGHUZRRG*LOOLODQG
&DQEHVXEVWLWXWHGE\PRUHGHWDLOHGVWHDG\VWDWHPRGHOV
)UHGHQVOXQG DQG*DOLQGH]$OND\D
2SWLPL]DWLRQ3UREOHPV&RQVLGHUHG
(IIHFWRI&ROXPQ+ROGXS +FRQG
7RWDO3URILW0D[LPL]DWLRQ
59

0D[LPXP'LVWLOODWH3UREOHP
30

Reflux Ratio

20
With Holdup
No Holdup

&RPSDULVRQRIRSWLPDOUHIOX[SURILOHV
ZLWKDQGZLWKRXWKROGXS +FRQG 

10

0
0.0

0.2

0.4

0.6

0.8

1.0

Time ( hours )

0.98

Distillate Composition

&RPSDULVRQRIGLVWLOODWH
SURILOHVZLWKDQGZLWKRXW
KROGXS +FRQG DW
RYHUDOOSXULW\

0.97
0.96
0.95

With Holdup
No Holdup

0.94
0.93
0.92
0.91
0.0

0.2

0.4

0.6

Time ( hours )

0.8

1.0
60

%DWFK'LVWLOODWLRQ3URILW0D[LPL]DWLRQ
Max {Net Sales(D, S0)/(tf +Tset) TAC(N, V)}
N = 20 trays, Tsetup = 1 hour
xd = 0.98, xfeed = 0.50, = 2
Cprod/Cfeed = 4.1
V = 120 moles/hr, S = 100 moles.
0

30

Reflux Ratio

20
N = 20
N = 33.7
10

0
0

2
Time ( hours )

4
61

%DWFK'LVWLOODWLRQ 2SWLPL]DWLRQ&DVH6WXG\ 
R(t)

D(t), x

dx1,N +1
V
=
[y - x , ]
dt
H N+1 1,N 1 N +1
dx1,p
dt

R
x
, p = 1,...,N
y1, p-1 - y1 , p + R+1
x
1,p+1
1,p

Hp

V(t)
xb

dx1,0 V

R
x

=
x1,0 - y1,0 + R+1
x
1,1
1,0

S
dt
dD
V
=
dt R + 1

N+1
N+1
S 0x 0i,0 = S0 - Hp x i,0 + H p x i,p

p=1
p=1
C

xi,p = 1.0
1

yi,p = 1.0
1

Ideal Equilibrium Equations


yi,p = Ki,p xi,p
Binary Column (55/45, Cyclohexane, Toluene)
S0 = 200, V = 120, Hp = 1, N = 10, ~8000 variables,
< 2 CPU hrs. (Vaxstation 3200)
62

2SWLPL]DWLRQ&DVH6WXG\ 
0RGHOLQJ$VVXPSWLRQV
,GHDO7KHUPRG\QDPLFV
&RQVWDQW7UD\+ROGXS
1R9DSRU+ROGXS
%LQDU\0L[WXUH WROXHQHF\FORKH[DQH
KRXURSHUDWLRQ
7RWDO5HIOX[,QLWLDO&RQGLWLRQ
&DVHV&RQVLGHUHG
&RQVWDQW&RPSRVLWLRQRYHU7LPH
6SHFLILHG&RPSRVLWLRQDW)LQDO7LPH
%HVW&RQVWDQW5HIOX[3ROLF\
3LHFHZLVH&RQVWDQW5HIOX[3ROLF\
63

0.006

0.005

0.004

0.003

0.002
Reflux
Purity
0.001

0
0.0

Top Plate Purity, (1-x)

Reflux Policy

5HIOX[2SWLPL]DWLRQ&DVHV
&RQVWDQW3XULW\RYHU7LPH
[ W  
' WI  

0.000

0.2

0.4

0.6

0.8

1.0

Time (hrs.)

2YHUDOO'LVWLOODWH3XULW\

6KRUWFXW&RPSDULVRQ
' WI  

Purity

0.008

6
0.006

4
0.004

0
0.0

0.002

Distillate Purity, (1-x)

' WI  

Reflux

Reflux Policy

[G W 9 5 GW ' WI  

0.010

0.000

0.2

0.4

0.6

0.8

1.0

Time (hrs.)
64

5HIOX[2SWLPL]DWLRQ&DVHV
0.004

0.30
Reflux
Purity

0.26
0.002

0.24
0.001

0.22

Distillate Purity, (1-x)

0.003

Reflux Policy

0.28

&RQVWDQW5HIOX[RYHU7LPH
[G W 9 5 GW ' WI  

' WI  

0.000

0.20
0.0

0.2

0.4

0.6

0.8

1.0

Time (hrs.)

' WI  

Reflux
4

0.006
3

0.004
2

0.002

Distillate Purity, (1-x)

[G W 9 5 GW ' WI  

Purity

0.008

Reflux Policy

3LHFHZLVH&RQVWDQW
5HIOX[RYHU7LPH

0.010

0.000
0.0

0.2

0.4

0.6

0.8

1.0

Time (hrs.)
65

Batch Reactive Distillation Case Study 3


Reversible reaction between acetic acid and ethanol
CH3COOH + CH3CH2OH l CH3COOCH2CH3 + H2O

t = 0, x = 0.25
for all components

R(t)

D(t), x

ttf ==11

max
max 0

Ddt
Ddt

s.t. DAE
Ester 0.4600
xxEster
0.4600
DD

V(t)
xb

Wajde & Reklaitis (1995)66

2SWLPL]DWLRQ&DVH6WXG\ 
0RGHOLQJ$VVXPSWLRQV
,GHDO7KHUPRG\QDPLFV
&RQVWDQW7UD\+ROGXS
1R9DSRU+ROGXS
7HUWLDU\0L[WXUH (W2++2$F(7$F+2
&ROG6WDUW,QLWLDO&RQGLWLRQ
&DVHV&RQVLGHUHG
6SHFLILHG&RPSRVLWLRQDW)LQDO7LPH
2SWLPXP5HIOX[3ROLF\
9DULRXV7UD\V&RQVLGHUHG 
KRXURSHUDWLRQ
67

Batch Reactive Distillation


Optimal Reflux Profiles

Condenser Composition (8 trays)


0.50

80

0.40

M. fraction

Reflux Ratio

100

60
40

0.30
0.20
0.10

20

0.00
0.0

0
0.0

0.2

0.4

0.6

0.8

1.0

YDULDEOHV
'$(V
GHJUHHVRIIUHHGRP
ILQLWHHOHPHQWV
,3237LWHUDWLRQV
&38PLQXWHV

Ester

0.6

0.8

1.0

Eth.

Wat.

A. A.

25 trays

Distillate Composition
Ethyl Acetate

15 trays

0.4

time(hr)

time (hr)
8 trays

0.2

0.45

0.35

0.25
0.0

0.2

0.4

0.6

0.8

1.0

time (hr)
8 trays

15 trays

25 trays
68

Batch Reactive Distillation


DAEs

Discretized
Variables

Iterations

98

1788

15

168

25

258

CPU (s)
Global

Elemental

14

56.4

37.2

3048

32

245.7

207.5

4678

45

1083.2

659.3

CPU Decomposition Time


21

CPU(s)/iter

Trays

16
11
6
1
90

130

170

DAEs

210

250
69

Nonlinear Model Predictive Control (NMPC)


NMPC Estimation and Control
d : disturbances

u : manipulated
variables

Process

z : differential states
y : algebraic states

Why NMPC?




NMPC Controller
z = F (z , y , u , p, d )
0 = G (z , y , u , p, d )

Track a profile
Severe nonlinear dynamics (e.g,
sign changes in gains)
Operate process over wide range
(e.g., startup and shutdown)

ysp : set points

Model Updater
z = F (z , y , u , p, d )
0 = G (z , y , u , p, d )

NMPC Subproblem
min
u

s.t .

|| y (t ) y

sp

|| Q y + || u(t k ) u(t k 1 ) || Q u

z (t ) = F ( z (t ), y (t ), u (t ), t )
0 = G ( z (t ), y (t ), u(t ), t )
z (t ) = z init
Bound Constraint s
Other Constraint s
70

Dynamic optimization in a
MATLAB Framework
Dynamic Optimization
Problem

Dynamic Process

Process Model
f (x , x, y , u, p, t ) = 0

JOBKENNUNG : C:\Krawal-m odular\IGCC\IGCC_Puertollano_kom plett.gek

g (x , y, u, p, t ) = 0

Discretization
Method
No. of Time
Elements
Collocation
Order

Inequality Constraints

bar
kJ/kg
kg/s
C (X)
Alle Drcke sind absolut
P..Druck..bar

Saturator-System

M ..M assenstrom ..kg/s


PHI..Luft-Feuchte..%

Wrmeschaltplan
Nr. -F Ref T Erlangen, 13.Oct.1999

SIEMENS AG
F Ref T

Initial Conditions

x (t0 ) = x0

Constraints at Final Time

(x (t f ), x(t f ), y (t f ), u (t f ), x 0 , p, t f ) = 0

Objective Function

min P(x(t

Process Model

f (x , y , u , p, t f , x 0 ) = 0

g (x , y , u , p, t f ) = 0
Inequality Constraints

h (x, x, y, u, p, t ) 0

copy ofDesign

T ..T em peratur..C
H..Enthalpie..kJ/kg

In Bearbeitung

NLP Optimization
Problem

Full
Discretization
of State and
Control
Variables

h (x , y , u , p, t f ) 0
Constraints at Final Time

(x N , y N , u N , p, t f ) = 0
t

Objective Function

min P x N t , y N t , u N t , p, t f

), y (t f ), u(t f ), p, x 0 , t f )

u (t),p ,x 0 ,t f

71

Tennessee Eastman Process

Unstable Reactor
11 Controls; Product, Purge streams
Model extended with energy balances
72

Tennessee Eastman Challenge Process

DAE Model
Number of differential equations

NLP Optimization problem


30

Number of algebraic variables

152

Number of algebraic equations

141

Difference (control variables)

11

Number of variables
of which are fixed

10920
0

Number of constraints

10260

Number of lower bounds

780

Number of upper bounds

540

Number of nonzeros in Jacobian

49230

Number of nonzeros in Hessian

14700

Method of Full Discretization of State and Control Variables


Large-scale Sparse block-diagonal NLP

73

Setpoint change studies


Process variable

Type

Magnitude

Step

-15%
Make a step change to the variable(s) used to set
the process production rate so that the product
flow leaving the stripper column base changes
from 14,228 to 12,094 kg h-1

Reactor operating pressure


Step
change

-60 kPa
Make a step change so that the reactor operating
pressure changes from 2805 to 2745 kPa

Purge gas composition of


component B change

+2%
Make a step change so that the composition of
component B in the gas purge changes from
13.82 to 15.82%

Production rate change

Step

Setpoint changes for the base case [Downs & Vogel]

74

Case Study:
Change Reactor pressure by 60 kPa
Control profiles
All profiles return to their
base case values
Same production rate
Same product quality
Same control profile
Lower pressure leads to
larger gas phase (reactor)
volume
Less compressor load

75

TE Case Study Results I


Shift in TE process
Same production rate
More volume for reaction
Same reactor temperature
Initially less cooling water flow
(more evaporation)

76

Case Study- Results II


Shift in TE process
Shift in reactor effluent to more
condensables
Increase cooling water flow
Increase stripper steam to
ensure same purity
Less compressor work

77

Case Study:
Change Reactor Pressure by 60 kPa

Optimization with IPOPT


1000 Optimization Cycles
5-7 CPU seconds
11-14 Iterations
Optimization with SNOPT
Often failed due to poor
conditioning
Could not be solved within
sampling times
> 100 Iterations
78

Optimization as a Framework for Integration


+ Directly handles interactions, multiple conditions
+ Trade-offs unambiguous, quantitative

- Larger problems to solve


- Need to consider a diverse process models
Research Questions
How should diverse models be integrated?
Is further algorithmic development needed?

79

Batch Integration Case Study


Plant Design

Dynamic Processing
R

T
Time

Time

Production Planning
Stage 1

A
Stage 2

B
A

C
B

:KDWDUHWKH,QWHUDFWLRQVEHWZHHQ'HVLJQ
DQG'\QDPLFVDQG3ODQQLQJ"
:KDWDUHWKHGLIIHUHQFHVEHWZHHQ6HTXHQWLDODQG
6LPXOWDQHRXV6WUDWHJLHV"
(VSHFLDOO\,PSRUWDQWLQ%DWFK6\VWHPV

80

6LPXOWDQHRXV'\QDPLF2SWLPL]DWLRQ
Conv.

Dynamic Processing
Same conversion T
in reduced time

Shorter
processing times
Time

Conv.

Time

Higher conversion T
in same time

Fewer
product batches
Time

Time

Best constant

Best transient

Production Planning
Stage 1

Stage 2

B
A

C
B

Shorter Planning
Horizon

GLVFUHWL]H '$(V VWDWHDQGFRQWUROSURILOHV


ODUJHVFDOHRSWLPL]DWLRQSUREOHP
KDQGOHVSURILOHFRQVWUDLQWVGLUHFWO\
LQFRUSRUDWHVHTXLSPHQWYDULDEOHVGLUHFWO\
'$(PRGHOVROYHGRQO\RQFH
FRQYHUJHVIRUXQVWDEOHV\VWHPV
81

Scheduling Formulation
VHTXHQFLQJRIWDVNVSURGXFWVHTXLSPHQW

H[SHQVLYHGLVFUHWHFRPELQDWRULDORSWLPL]DWLRQ
FRQVLGHULGHDOWUDQVIHUSROLFLHV 8,6DQG=:

FORVHGIRUPUHODWLRQV %LUHZDU DQG*URVVPDQQ

8QOLPLWHG,QW6WRUDJH 8,6
stage 2
6KRUWSURGXFWLRQF\FOH
&\FOHWLPHLQGHSHQGHQWRIVHTXHQFH
stage I

stage I

=HUR:DLW =: 
,PPHGLDWHWUDQVIHUUHTXLUHG
6ODFNWLPHVGHSHQGHQWRQSDLU
/RQJHUSURGXFWLRQF\FOHUHTXLUHG

stage 2
82

Case Study Example


4 stages, 3 products of different purity
Dynamic reactor - temperature profile
Dynamic column - reflux profile
A+B C
C +B P+E
P+C G
z i0,I

z i0,II

z i0,III

z i0,IV

z i ,II

z if,III

z i ,I

z if,IV

Bi

Process Optimization Cases


SQ - Sequential Design - Scheduling - Dynamics
SM - Simultaneous Design and Scheduling
Dynamics with endpoints fixed .
SM* - Simultaneous Design, Scheduling and Dynamics
83

6FHQDULRVLQ&DVH6WXG\
&RPSDULVRQRI'\QDPLFYV%HVW&RQVWDQW3URILOHV
5 EHVWFRQVWDQWWHPSHUDWXUHSURILOH
5 RSWLPDOWHPSHUDWXUHSROLF\
& EHVWFRQVWDQWUHIOX[UDWLR
& RSWLPDOUHIOX[UDWLR
640

25
C o nsta nt

C o nsta nt

630

Dyn a m ic

Dyn a m ic

20

620
610

15

600
10
590
580

5
0

0.25

0.5 0.75
Tim e (h r.)

1.25

0.5

1.5

2.5
84

Results for Simultaneous Cases


2 00 00

R0/R1
best constant /
optimal temperature

Pro fit ($)

1 50 00

1 00 00

C0/C1
best constant /
optimal reflux ratio

5 00 0

0
R 0 C 0

R 1 C 0

R 0 C 1

R 1 C 1

C a se s
S e q u e n t ia l

S im u lt a n e o u s
w it h F I X E D s t a t e s

64

S Im u lt a n e o u s
w it h F R E E s t a t e s

I
I I
I I I

60

I V

60

I V

I
I I
I I I
I
I I
I I I
I V

- ZW schedule becomes tighter


- less dependent on product sequences

85

Summary
Sequential Approaches
- Parameter Optimization
Gradients by: Direct and Adjoint Sensitivity Equations
- Optimal Control (Profile Optimization)
Variational Methods
NLP-Based Methods
- Require Repeated Solution of Model
- State Constraints are Difficult to Handle
Simultaneous Approach
- Discretize ODE's using orthogonal collocation on finite elements (solve larger optimization problem)
- Accurate approximation of states, location of control discontinuities through element placement.
- Straightforward addition of state constraints.
- Deals with unstable systems
Simultaneous Strategies are Effective
- Directly enforce constraints
- Solve model only once
- Avoid difficulties at intermediate points
Large-Scale Extensions
- Exploit structure of DAE discretization through decomposition
- Large problems solved efficiently with IPOPT

86

DAE Optimization Resources


References
Bryson, A.E. and Y.C. Ho, Applied Optimal Control, Ginn/Blaisdell, (1968).
Himmelblau, D.M., T.F. Edgar and L. Lasdon, Optimization of Chemical
Processes, McGraw-Hill, (2001).
Ray. W.H., Advanced Process Control, McGraw-Hill, (1981).
Software

Dynamic Optimization Codes


ACM Aspen Custom Modeler
DynoPC - simultaneous optimization code (CMU)
COOPT - sequential optimization code (Petzold)
gOPT - sequential code integrated into gProms (PSE)
MUSCOD - multiple shooting optimization (Bock)
NOVA - SQP and collocation code (DOT Products)

Sensitivity Codes for DAEs


DASOLV - staggered direct method (PSE)
DASPK 3.0 - various methods (Petzold)
SDASAC - staggered direct method (sparse)
DDASAC - staggered direct method (dense)
87

DynoPC Windows Implementation


Model in FORTRAN

Data/Config.

Outputs/Graphic & Texts

Interface of DynoPC
F2c/C++

Redued SQP

ADOL_C

Optimal Solution
Preprocessor
Calc. of independent
variable move

F/DF

Interior Point
For QP

DDASSL

F/DF

Decomposition

COLDAE
Simulator/Discretizer

Line Search
Starting Point

Linearization

FILTER

88

Example: Batch Reactor Temperature

Max

b (t f )

s.t.
E1
da
= k1 exp(
)a
at
RT

E1
E2
db
= k1 exp(
) a k 2 exp(
)b
at
RT
RT

a+b+c =1
89

Example: Car Problem


Min
s.t.

tf
z1 = z2
z2 = u
z2 zmax
-2 u 1

12

10

Velocity
Acceleration

-1

Velocity

Acceleration, u(t)

subroutine model(nz,ny,nu,np,t,z,dmz,y,u,p,f)
double precision t, z(nz),dmz(nz), y(ny),u(nu),p(np)
double precision f(nz+ny)

-2
2

-3

10

20

30

40

f(1) = p(1)*z(2) - dmz(1)


f(2) = p(1)*u(1) - dmz(2)
return
end

Time

90

4.5E-03

60

4.0E-03
3.5E-03

50

3.0E-03
2.5E-03

40
30

2.0E-03
1.5E-03
1.0E-03
5.0E-04

20

0.0E+00

T ja c k e t (C )

C ry s ta l s iz e

Example: Crystallizer Temperature

10

10

15

20

Time (hr)
Crystal Size

T jacket

TT

TC

Maximize crystal size


at final time
Steam
Cooling
water
Control variable = Tjacket = f(t)?

SUBROUTINE model(nz,ny,nu,np,x,z,dmz,y,u,p,f)
implicit double precision (a-h,o-z)
double precision f(nz+ny),z(nz),dmz(nz),Y(ny),yp(4),u(1)
double precision kgr, ln0, ls0, kc, ke, kex, lau, deltT, alpha
dimension a(0:3), b(0:3)
data alpha/1.d-4/,a/-66.4309d0, 2.8604d0, -.022579d0, 6.7117d-5/,
+ b/16.08852d0, -2.708263d0, .0670694d0, -3.5685d-4/, kgr/ 4.18d-3/,
+ en / 1.1d0/, ln0/ 5.d-5/, Bn / 3.85d2/, em / 5.72/, ws0/ 2.d0/,
+ Ls0/ 5.d-4 /, Kc / 35.d0 /, Kex/ 65.d0/, are/ 5.8d0 /,
+ amt/ 60.d0 /, V0 / 1500.d0/, cw0/ 80.d0/,cw1/ 45.d0/,v1 /200.d0/,
+
tm1/ 55.d0/,x6r/0.d0/, tem/ 0.15d0/,clau/ 1580.d0/,lau/1.35d0/,
+ cp/ 0.4d0 /,cbata/ 1.2d0/, calfa/ .2d0 /, cwt/ 10.d0/
ke = kex*area
x7i = cw0*lau/(100.d0-cw0)
v = (1.d0 - cw0/100.d0)*v0
w = lau*v0
yp(1) = (deltT + dsqrt(deltT**2 + alpha**2))*0.5d0
yp(2) = (a(0) + a(1)*yp(4) + a(2)*yp(4)**2 + a(3)*yp(4)**3)
yp(3) = (b(0) + b(1)*yp(4) + b(2)*yp(4)**2 + b(3)*yp(4)**3)
deltT = yp(2) - z(8)
yp(4) = 100.d0*z(7)/(lau+z(7))
f(1) = Kgr*z(1)**0.5*yp(1)**en - dmz(1)
f(2) = Bn*yp(1)**em*1.d-6 - dmz(2)
f(3) = ((z(2)*dmz(1) + dmz(2) * Ln0)*1.d+6*1.d-4) - dmz(3)
f(4) = (2.d0*cbata*z(3)*1.d+4*dmz(1)+dmz(2)*Ln0**2*1.d+6)-dmz(4)
f(5) = (3.d0*calfa*z(4)*dmz(1)+dmz(2)*Ln0**3*1.d+6) - dmz(5)
f(6) = (3.d0*Ws0/(Ls0**3)*z(1)**2*dmz(1)+clau*V*dmz(5))-dmz(6)
f(7) = -dmz(6)/V - dmz(7)
f(8) = (Kc*dmz(6) - Ke*(z(8) - u(1)))/(w*cp) - dmz(8)
f(9) = y(1)+YP(3)- u(1)
return
end
91

You might also like