Interfaces
Publication details, including instructions for authors and subscription information:
http://pubsonline.informs.org
INFORMS is the largest professional society in the world for professionals in the fields of operations research, management
science, and analytics.
For more information on INFORMS, its publications, membership, or meetings visit http://www.informs.org
Downloaded from informs.org by [146.83.129.99] on 02 November 2015, at 04:47 . For personal use only, all rights reserved.
http://dx.doi.org/10.1287/inte.2015.0802
2015 INFORMS
Ming Zhao
Department of Decision and Information Sciences, Bauer College of Business, University of Houston,
Houston, Texas 77204, mzhao@bauer.uh.edu
The Procter & Gamble (P&G) fabric-care business is a multibillion dollar organization that oversees a global
portfolio of products, including household brands such as Tide, Dash, and Gain. Production is impacted by
a steady stream of reformulation modifications, imposed by new-product innovation and constantly changing
material supply conditions. In this paper, we describe the creation and application of a novel analytical framework that has helped P&G determine the ingredient levels and product and process architectures that enable the
company to create some of the world's best laundry products. Modeling cleaning performance and other key
properties, such as density, required P&G to develop innovative quantitative techniques based on visual statistical tools. It used advanced mathematical programming methods to address the challenges imposed by the
manufacturing process, product performance requirements, and physical constraints, which collectively result in a
hard mixed-integer nonlinear (nonconvex) optimization problem. We describe how P&G applied our framework
in its North American market to identify a strategy that improves the performance of its laundry products,
provides targeted consumer benefits, and enables cost savings on the order of millions of dollars.
Keywords: pooling; blending; optimization; response surface; design of experiments.
History: This paper was refereed.
Traditional formulation approaches involve simplifying the problem, hypothesizing a solution, physically creating and testing prototypes, analyzing results,
and iterating until the various objectives are met.
Physical prototyping can be expensive and time
consuming, resulting in slow and costly iteration
cycles; as a result, these traditional approaches no
longer meet today's needs.
P&G's research and development organization is
at the forefront of the development and adoption of
modeling tools that enable the company to make better
decisions on product formulation, processing, and manufacturing. These include empirical, first-principles,
and semi-empirical models that predict chemical reactions during manufacturing, in-use physical properties of the product, technical performance of
the product, and even consumer acceptance rates.
These tools enable researchers to instantly predict a
product's physical properties and performance, integrate models, and balance production trade-offs using
a variety of predictive and prescriptive capabilities.
Until recently, the complexity of laundry-formulation
and manufacturing processes limited us to consider
reformulating only a single product at a time; however, breakthroughs in mathematical optimization
technology have made possible system-wide portfolio
reformulation. This is critically important because it
permits us to model and optimize product differentiation within a portfolio and consider sharing common
materials within the manufacturing process. In this
paper, we present the scope of laundry-portfolio modeling and optimization at P&G, the creation of capabilities we developed to address this scope, and its
application to innovate P&G's North American powder laundry portfolio.
Figure 1: (Color online) The production of laundry detergent mixtures creates a blending process.
Predictive Models
Predictive models are either empirical or semi-empirical in nature. Empirical models are third-order
polynomial functions that capture the two performance qualities of a mixture: stain removal and
whiteness.

Figure 2 (table): the 16 test treatments and the associated coffee-stain SRI responses.

Treatment   Variable 1   Variable 2   Variable 3   Coffee SRI
 1                   0          100          100        88.56
 2                  50           50            0        81.96
 3                 100           50           50        85.84
 4                   0          100            0        79.65
 5                   0            0            0        86.20
 6                 100          100          100        94.02
 7                 100          100            0        83.46
 8                  50          100           50        85.21
 9                 100            0            0        83.48
10                 100            0          100        91.88
11                  50           50           50        84.00
12                  50           50          100        88.99
13                  50            0           50        84.58
14                   0           50           50        84.52
15                   0            0          100        88.84
16                  50           50           50        81.62
Empirical models for stain removal and whiteness
were created using experimental design procedures,
an efficient means of model creation for controlled
experiments (Box et al. 2005, Kutner et al. 2004), using
JMP software for the design and analysis. Figure 2
shows an example of an experimental design for a
three-variable model for a coffee stain. The table in
the figure lists the set of 16 test treatments we ran
and the associated coffee-stain response. The image to the right of the table visualizes the fitted response-surface model.
Figure 2: (Color online) This experimental design for a stain-removal index (SRI) for a coffee stain is
characterized by SRI coefficients (left) and can be visualized as response-surface models (right).
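The kind of response-surface fit described here can be sketched from the 16-treatment design of Figure 2. The snippet below fits a full quadratic surface by ordinary least squares; this is a simplification for illustration, since the paper's actual models are third-order polynomials built and validated in JMP.

```python
import numpy as np

# Design points (Variable 1, Variable 2, Variable 3) and coffee-SRI
# responses from the 16-treatment experiment in Figure 2.
design = [
    (0, 100, 100), (50, 50, 0), (100, 50, 50), (0, 100, 0),
    (0, 0, 0), (100, 100, 100), (100, 100, 0), (50, 100, 50),
    (100, 0, 0), (100, 0, 100), (50, 50, 50), (50, 50, 100),
    (50, 0, 50), (0, 50, 50), (0, 0, 100), (50, 50, 50),
]
sri = [88.56, 81.96, 85.84, 79.65, 86.20, 94.02, 83.46, 85.21,
       83.48, 91.88, 84.00, 88.99, 84.58, 84.52, 88.84, 81.62]

def features(v1, v2, v3):
    """Quadratic response-surface terms: intercept, mains, interactions, squares."""
    return [1.0, v1, v2, v3, v1 * v2, v1 * v3, v2 * v3,
            v1 ** 2, v2 ** 2, v3 ** 2]

X = np.array([features(*row) for row in design], dtype=float)
y = np.array(sri)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares coefficients

pred = X @ coef
r2 = 1.0 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
print(f"R^2 of quadratic surface on the design points: {r2:.3f}")
```

Note that the two replicated center points (treatments 11 and 16) have different responses, so a perfect fit is impossible; the residual they leave behind is exactly the pure-error term that JMP's lack-of-fit diagnostics in Figure 5 assess.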
Figure 3: (Color online) A standard wash protocol is used when testing the
stain-removal effectiveness of a mixture.
Figure 4: (Color online) Stains are scanned before and after a wash experiment; in this example, we use a coffee stain.
Figure 5: (Color online) Fit diagnostics, shown for the SRI response function for a coffee stain, show strong predictive model performance and are
typical of stain results.
Optimization
The predictive models described offer the capability to
determine cleaning and density properties of a given
arbitrary mixture of ingredients without resorting to
expensive and time-consuming physical prototypes.
The next step was to determine ideal mixture ingredient compositions to ultimately bring P&G products to
the market while meeting stringent quality and manufacturing requirements. The optimization problem we
describe determines the most economical composition of mixtures, evaporation levels, and intermediate-to-product assignments. In this section, we describe
the optimization problem at a high level to present a
basic statement of the problem, highlight complicating features, and motivate the solution-methodology
discussion. We include a more detailed mathematical
formulation in the appendix.
Intermediate batches and final products consist of
mixtures of ingredients. Although ingredient proportions determine the properties of these mixtures, the
manufacturing process does not simply blend pure
ingredients. Rather, premixes (mixtures of a small
subset of ingredients) are sourced and mixed to
achieve a final-ingredient mixture. The eligibility of
premixes to be added to an intermediate batch versus
a final product is determined primarily by the nature
of the premix. Premixes with high water content are
typically added to intermediate batches so that the
water can evaporate. Some premixes (e.g., perfumes)
can be unique to one product or available to all products, either in the intermediate batch, postevaporation
stage, or both.
Figure 6 illustrates the analytical representation
of material flowing through various stages of the
manufacturing process for a simple two-intermediate,
three-product example. Individual premixes may be
eligible to go into intermediate mixtures, directly into
the final-product mixtures (as additives), or either. For
example, the figure shows that premix Pre1 can be
assigned only to intermediates (B1 and B2), whereas
premix Pre3 can be assigned to intermediates and
product Prod2. The additive Pre5 can be assigned
only to Prod3 directly.
Percentages by weight of premixes used in total
production of intermediate batches (B1 and B2) and
Figure 6: The layout defines the various decision variables and constraints
of the optimization problem.
The problem can be viewed as partitioning the set of products into subsets, with an intermediate assigned to each subset, where each subset of products must satisfy various nonlinear constraints, such as
performance requirements. This interpretation of the
problem is closely related to the pooling problem
(Bodington and Baker 1990), and can be shown to
be an NP-hard mixed-integer nonlinear (nonconvex)
program.
Previous approaches to solving pooling problems
include relaxation and discretization strategies (Gupte
et al. 2013), Benders decomposition (Floudas and
Aggarwal 1990), Lagrangian relaxation (Visweswaran
and Floudas 1990), branch and cut (Audet et al. 2004),
and mixed integer linear programming (MILP) (Dey
and Gupte 2013). These approaches do not directly
apply to the variant we consider in this paper. Most of
the effort to date has focused on addressing the bilinear terms in the problem; however, the evaporation
process, the requirement that assignments be
binary decisions, and nonconvex performance models
violate basic assumptions of much of that work.
State-of-the-art solution methodologies for the pooling problem that attempt to prove optimality are typically restricted to problems of (in equivalent terms) a
few dozen ingredients, premixes, intermediates, and
products; Dey and Gupte (2013) provide example
instance sizes, where the number of inputs and outputs is less than 100. Several features of the problem we address in this paper make it a significantly
harder generalization of the classical pooling problem, although its size is comparable to instances
addressed by the best-known methods. Furthermore,
the best-known methods often take many hours to
reach a small optimality-bound gap. For our purposes, solutions must be produced on the order of minutes (to allow scenario exploration, which is critical to
P&G formulation practices); therefore, we have taken
a heuristic approach to solve this optimization problem. The appendix provides a detailed explanation of
the optimization solution methodology.
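The coupling that makes pooling hard can be seen in a toy instance: when several products draw from one shared intermediate, a single blend fraction must satisfy every product's quality floor at once. The sketch below (a hypothetical two-premix, two-product instance with made-up costs and quality indices, not P&G's model) uses a brute-force grid heuristic in the spirit of the heuristic approach taken here:

```python
# Toy pooling instance (illustrative only, not P&G's model): one
# intermediate blends two premixes A and B; two products take the
# intermediate unchanged. The shared blend fraction couples the
# products' quality constraints, which is what makes pooling hard.
COST = {"A": 10.0, "B": 15.0}      # $/ton of each premix (hypothetical)
QUALITY = {"A": 1.0, "B": 4.0}     # single quality index (hypothetical)
MIN_QUALITY = {"Prod1": 2.5, "Prod2": 2.0}

def blend_quality(v):
    """Quality of the intermediate when premix A has mass fraction v."""
    return QUALITY["A"] * v + QUALITY["B"] * (1.0 - v)

def blend_cost(v):
    return COST["A"] * v + COST["B"] * (1.0 - v)

def grid_search(steps=100):
    """Brute-force heuristic: scan the blend fraction and keep the
    cheapest blend that satisfies every product's quality floor."""
    best = None
    for i in range(steps + 1):
        v = i / steps
        if all(blend_quality(v) >= q for q in MIN_QUALITY.values()):
            if best is None or blend_cost(v) < best[1]:
                best = (v, blend_cost(v))
    return best

v_star, c_star = grid_search()
print(f"cheapest feasible blend: v={v_star:.2f}, cost={c_star:.2f} $/ton")
```

Here Prod1's tighter quality floor binds the shared intermediate for both products (v can be at most 0.5), so the cheap premix cannot be used more aggressively even though Prod2 alone would allow it. Scaling this coupling up to dozens of premixes, intermediates, and binary assignment decisions yields the hard nonconvex problem described above.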
To deploy this capability, P&G had to implement a new work process. The use of optimization tools requires a set of skills best defined by an
optimization triad (Figure 7) that includes functional,
data, and optimization experts.
Functional experts are typically a small number of
individuals from different R&D functions, including
consumer, formulation, and process. Data experts are
one or two individuals who have access to all the
necessary information, such as material pricing and
material balances, and are typically skilled in visual
analytics tools, such as JMP statistical tools. Optimization experts are staff members who have the necessary programming skills to interact with the optimization models at the SAS code level. Figure 7 defines
the responsibilities of the experts in each category.
We defined a new work process (Figure 8), which
has enabled this multifunctional, multiskilled team
to efficiently use the new capability; we describe the
components as follows.
1. Problem definition: Functional experts frame the
problem to be solved and the research questions to be
addressed. For example, we researched and analyzed
processes that would (1) allow P&G to deliver significantly better product performance at equal or lower
cost over its current processes, and (2) reduce the number of intermediates while maintaining performance.
2. Knowledge development: Functional experts
start building the problems framework by collecting
knowledge and building a common state of understanding across the team. We define knowledge as
existing models, heuristic approaches, and simplifying assumptions, all of which must form a basis
for agreement. For example, if different density models exist, the team must come to a consensus on which model to use.
Functional expert: defines the problem to be solved; creates appropriate models; defines constraints.
Optimization expert: defines the optimization framework; writes code; runs scenarios.
Data expert: gathers appropriate input data into the desired format; analyzes solutions using visual analytics tools.
Figure 7: (Color online) This figure shows the optimization triad that P&G
adopted as part of its standard process.
The work process comprises nine steps: (1) problem definition (functional expert); (2) knowledge development (functional expert); (3) model development (functional expert); (4) portfolio data gathering, e.g., volume, cost, formulation (data expert); (5) optimization problem formulation (optimization expert); (6) write optimization code (optimization expert); (7) iterative testing and mathematical validation (optimization expert); (8) churn and analysis (all); (9) optimized recommendation (functional expert).
Figure 8: (Color online) The optimization work process at P&G follows a sequence of nine interrelated steps.
new research question, constantly verifying assumptions and validating preliminary results with other
experts and stakeholders.
7. Iterative testing and mathematical validation:
Optimization experts participate with other members
of the team in a cycle of iterative testing to mathematically validate models, ensuring that constraints are
correctly interpreted and observed and that optimum
solutions are robust. In this step, parameters (e.g.,
the number of multistart points and stopping criteria)
are tuned, and model infeasibilities, often caused by
products with too-stringent performance constraints,
are commonly found. At this stage, steps must be
taken to ensure feasibility; revisiting the constraint
bounds is an example.
8. Churn and analysis: In this step, the entire team
exercises the optimization engine to optimize multiple
scenarios to answer the research question(s). Scenarios
can include changes in performance requirements, the
number of intermediates allowed, constraint bounds,
materials allowed to be used (and where they can be
used), or a combination of these changes. For highly
complex research questions, P&G implements a churn
event; in such an event, all members of the team are
colocated for two to four days. They focus all their
time on a common set of problems, and their objective is to produce data to inform decision making.
During churn, the team occupies a common room
equipped with physical and digital visualization tools
that aid the work process. Poster-sized printouts display a master list of all scenarios, and key parameters are captured and color coded to differentiate the
scenarios. They analyze completed scenarios on a display that consists of eight 42" high-definition television screens configured to behave as a single monitor,
allowing high-resolution visualizations to be spread
across a large area. The team typically reviews the
results using JMP software, which permits interactive
visualization and analysis.
9. Optimized recommendation: Functional experts
take results from the analysis and formulate a recommendation. Recommendations can be as simple as a
new set of formulations to meet a new requirement,
or as complex as a multistage strategy for a portfolio
of products, which might include the introduction of
new technologies.
In the course of this project, we learned that a mind-set shift is required to solve for the entire portfolio
rather than for one product at a time. Traditionally,
projects have addressed formulation changes for one
or only a small subset of products, while preserving the composition of most products in the portfolio. Once this mind-set shift happened, however, we
were able to identify (and test) portfolio-management
strategies, which led to some unique options that
P&G had not previously uncovered.
Outputs and Results
Because the algorithm we present in this paper is
based on building an approximation of the power set
of products, each optimization can be easily extended
to solve for various numbers of intermediate batches
by solving the selection step for as many independent subproblems as there are products, from one
intermediate for all products to the singleton cases.
We exploited this aspect of the algorithm to provide a range of intermediate batch configurations for
each optimization run for marginal additional run
time, running the set-covering subproblems in parallel. When running in this mode, one must determine where to apply the stopping criterion. Because
P&G's production focus is on four or more intermediates, we apply our
stopping criterion at three intermediates, at an estimated gap of three percent, for all optimization runs.
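The pattern of running one subproblem per intermediate count in parallel and reporting every configuration that meets the stopping criterion can be sketched as follows. The `solve_for` function and its cost and gap formulas are stand-ins invented for illustration; in the real system each call would be an independent set-covering optimization run:

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative sketch: solve one subproblem per allowed number of
# intermediates in parallel, then report every configuration whose
# estimated optimality gap meets the stopping criterion.
def solve_for(n_intermediates):
    """Stand-in solver: returns (n, cost, estimated gap %).
    The formulas below are toy placeholders, not real model output."""
    cost = 100.0 - 2.0 * n_intermediates           # toy: more batches, lower cost
    gap = max(0.0, 18.0 - 3.0 * n_intermediates)   # toy: gap shrinks with n
    return n_intermediates, cost, gap

GAP_TOLERANCE = 3.0  # stop once the estimated gap is at most 3%

with ThreadPoolExecutor() as pool:
    results = list(pool.map(solve_for, range(3, 15)))

accepted = [(n, cost) for n, cost, gap in results if gap <= GAP_TOLERANCE]
for n, cost in accepted:
    print(f"{n} intermediates: cost {cost:.1f}")
```

Because `Executor.map` preserves input order, the accepted configurations come back sorted by intermediate count, which is convenient for the side-by-side comparison of neighboring solutions described below.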
Figure 9 shows solution objective values (total cost
difference for annual production) of a typical optimization, which we represent as the difference between
our solution output objective and the annual cost of
a historical production run. An obvious feature of the
problem we consider here is that as the number of
intermediate batches increases, the globally optimal
objective must not increase, because the model allows
more flexibility in tuning each products assigned
intermediate. The figure reflects this, and we note that
the algorithm's flow was designed to ensure that this
rationality check would never be violated,
despite the presence of nonconvexity.
Reporting such a multiple intermediate solution
proved valuable to management because an immediate comparison of neighboring solutions could show
the benefits of increasing (or decreasing) the number
of intermediate batches in production, a costly but
sometimes beneficial manufacturing investment. For
example, the marginal benefits of increasing the number of intermediate batches from 14 to 17 are minimal;
only at 18 intermediates does a significant change
occur (because of the discrete nature of the problem),
which may justify the added complexity and cost.
Figure 9: (Color online) Optimal cost difference vs. the benchmark, plotted against the number of intermediate batches.
As we mention previously, each optimization instance was referenced against a related historical
benchmark production run (created without the benefits of an analytical approach). Our expectation for
the success of this project was that the solution provided by the framework would always improve on
the benchmark's premix annual costs. This is a reasonable expectation because, for a given number of
intermediate batches, the benchmark is typically a
feasible mixture. Comparing the 14-intermediate solution annual cost with the benchmark value in Figure 9
shows that this requirement was met in this example, and that the solution typically produces savings
on the order of magnitude shown (between $0.5 million and
$6 million).
Few instances currently exist at P&G because statistical and semi-empirical models have only recently
been developed for entire product portfolios. Table 1
lists the two important powder instances we used in
this work, showing the numbers of ingredients, premixes, products, response-surface (RS) models,
and benchmark intermediates that represent typical
problem sizes.
The optimization tool produces significantly better results than the benchmark for both instances in
the table; its run times are very reasonable, given the
needs of the P&G work process. Tables 2 and 3 summarize the results for these instances, listing most
intermediate configurations from three to the number of products in the instance. The tables show cost
differences from the benchmark (in millions) and run
times at which each intermediate configuration satisfied its stopping criterion. The longest run time is the
total run time of the process. Note the cost improvements when compared to the benchmark even when
running at the minimum number of intermediate
batches (three) for both instances. The values in the
Table 1: The two powder instances used in this work, with the numbers of ingredients, premixes, products, response-surface (RS) models, and benchmark intermediates.

Instance name   Ingredients   Premixes   Products   RS   Intermediates
3bii                     54         38         21   50              12
Wenlock                  42         34         25   38              13
Intermediates   Cost difference ($M)   Bounds (%)   Time (s)
 3                        15.8               2.35      458.5
 4                        16.9               1.77      253.0
 5                        18.0               1.19      110.2
 6                        18.8               0.78      110.4
 7                        19.2               0.58      110.5
 8                        19.4               0.45      110.6
 9                        19.6               0.36      110.6
10                        19.7               0.29      110.7
11                        19.8               0.22      110.9
12*                       20.0               0.15      111.0
13                        20.1               0.09      111.1
14                        20.2               0.04      111.2
21                        20.3               0.00       32.6

Table 2: The table shows results for instance 3bii. The starred row
represents the best comparison to the benchmark solution, which has 12
intermediate batches.
Intermediates   Cost difference ($M)   Bounds (%)   Time (s)
 3                        76.4               2.46      256.6
 4                        79.8               1.42      256.7
 5                        81.9               0.79      256.8
 6                        82.6               0.58      257.0
 7                        83.0               0.43      257.2
 8                        83.5               0.28      257.3
 9                        83.7               0.21      257.4
10                        83.9               0.15      257.5
11                        84.0               0.12      257.7
12                        84.1               0.09      257.8
13*                       84.2               0.07      258.0
14                        84.3               0.05      258.1
15                        84.3               0.02      258.3
16                        84.4               0.01      258.4
25                        84.4               0.00       68.5

Table 3: In these results for the Wenlock instance, the starred row represents the best comparison to the benchmark solution, which has 13 intermediate batches.
more than 30 scenarios. Here, we discuss two examples from the study results.
Increasing the number of intermediates allowed
from five to 10 would result in additional formula cost
savings. Compared to a five-intermediate benchmark,
using the optimization procedure would result in
annual cost improvements of $2 million for five
intermediates and $5 million for 10 intermediates.
Although a consequence of this strategy is to increase
complexity at the manufacturing site for handling
a higher number of intermediate batches, this is a
justifiable decision based on this demonstrable cost
reduction.
The introduction of a new active ingredient (currently not in the North American powder formulation, but available in empirical and semi-empirical
models) across the whole portfolio would generate
annual savings in excess of $20 million, while delivering the target performance profile (i.e., cleaning of different stains) for each product. P&G has incorporated
this knowledge into its short- to mid-term strategies
for this part of our business.
We have estimated that without the portfolio optimization tool, twice as many staff members would
have to work for far longer periods of time to run
30 scenarios on 20 formulations using the former
single-product-at-a-time approach and would produce inferior results.
User sponsorship and adoption have been instrumental to the success of this project. In an email to
the team, Christian Becerra, P&G senior researcher
and lead formulator for the North American powder
business, provided a set of additional benefits of our
portfolio optimization framework that go beyond the
savings described earlier (Becerra 2014):
Smart optimization: Identifying a formula and
process strategy that meets our criteria, removing
chemistry that will not deliver the desired performance profile to the consumer. This leads to smart
savings.
Next level of optimization: Going beyond the traditional single-formula to full-portfolio optimization
(i.e., the ability to see the big picture).
Flexibility: Integrating technical features and an
understanding of consumer needs to make more
robust portfolio propositions.
Summary of Benefits
Portfolio optimization is changing the way we do
product development at P&G. Previously, we were
often limited to one of two strategies: we developed
each product individually, resulting in highly complex
portfolios that required high numbers of intermediate
batches, or we imposed simplification strategies that
resulted in higher formulation costs.
Portfolio optimization allows us to test formulation
and simplification strategies against the whole portfolio, giving us a realistic estimate of the potential
impact of these strategies. Fast iteration cycles allow
us to evaluate multiple strategies in a short time, discarding elements that will bring little value and combining elements that provide significant advantages.
The ability to analyze an entire portfolio simultaneously is also changing the way product designers
think about performance, because we can now more
easily differentiate performance among the products.
Thus, we can make smarter decisions about formulation and simplification strategies and respond with
agility when needed.
As lead strategies are identified and the formulation for the full portfolio is generated, some physical testing is required to confirm that the required
performance and physical properties of the products
are indeed met. This is especially true for solutions
that are near the minimum or maximum values of
the input ranges, because the confidence intervals of
the predictions are typically at their widest in these
ranges. Future phases of this project will address this
need, and continuously expand and improve the quality of predictive and optimization models.
As a direct result of using this optimization platform and measuring its value to our business, we
are now making the following enhancements to our
processes:
• Evolving P&G and its work processes to better use this and other operations research capabilities to define and execute the best portfolio strategies.
• Investing in resources to improve our data management system and automate data pull and formatting.
• Building more models into the optimization framework and continuing to do so as new models become available, potentially including consumer models and first-principles chemistry models.
Finally, although the benefits of this tool have been
well demonstrated in our laundry-powder business,
many reapplication opportunities remain at P&G. In
the coming years, we expect to continue to develop
this capability further for our laundry business and
create similar capabilities for other P&G businesses.
Appendix
Semi-empirical Models
Stain-removal performance is predicted by the stain-removal index (SRI) response function, which has the following form:
$$\mathrm{SRI} = C_0 + C_1 v_1 + C_2 v_2 + C_3 v_1 v_2 + \cdots,$$
where the $C_i$ represent coefficients and the $v_i$ represent design variables: wash concentrations (milligrams per liter) of chemical ingredients and wash conditions (e.g., temperature). Whiteness models have a similar form.
Intermediate Batch Density
The bulk density of an intermediate batch is modeled as
$$CD_b(\beta'_b) = C_0\, CI_b(\beta'_b) + C_1 V_3 + C_2 V_4 + \cdots,$$
where
$CI_b$: true density of an intermediate batch, also known as absolute density; the density of only the solid components of the batch post-evaporation, with $CI_b(\beta'_b) = 1 \big/ \sum_{i=1}^{n} \beta'_{ib}/\rho_i$, where $\rho_i$ is the density of ingredient $i$.
$CD_b$: bulk density, or the density accounting for the mass of the solid components and the air entrapped in the particle.
$\beta'_b$: vector of postevaporation mass fractions $\beta'_{ib}$ of ingredient $i$ in intermediate $b$.
$C_0, C_1, C_2, \ldots$: regression coefficients.
$V_j$: variables that define processing and chemistry parameters (e.g., temperature, hardness, $\beta'_b$).
Product Density
$$FD_k(z_k, x_k) = 1 \Big/ \Big( \sum_{i=1}^{n} z_{ik}/f_i + x_k \big/ CD_b(\beta'_b) \Big),$$
the harmonic combination of the densities $f_i$ of the additive premixes and the bulk density of the product's assigned intermediate.
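The true-density relation used here, CI = 1 / Σᵢ(βᵢ/ρᵢ), is the standard mass-fraction-weighted harmonic mean of the ingredient densities, and it is easy to check numerically. The values below are illustrative, not P&G data:

```python
# Numeric check of the true-density relation CI = 1 / sum(beta_i / rho_i):
# a mixture's true density is the mass-fraction-weighted harmonic mean of
# its ingredients' densities. The values below are illustrative.
def true_density(mass_fractions, densities):
    """CI(beta) = 1 / sum_i(beta_i / rho_i) for mass fractions summing to 1."""
    assert abs(sum(mass_fractions) - 1.0) < 1e-9
    return 1.0 / sum(b / r for b, r in zip(mass_fractions, densities))

# 60% of an ingredient at 2.0 g/cm^3 and 40% at 1.0 g/cm^3:
ci = true_density([0.6, 0.4], [2.0, 1.0])
print(f"true density: {ci:.4f} g/cm^3")  # 1 / (0.3 + 0.4) = 1.4286
```

Note that the result is pulled toward the lighter ingredient, as a harmonic mean always is; a simple mass-weighted average of the two densities would overestimate it.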
P: Set of premixes.
I: Set of ingredients (raw materials).
B: Set of intermediate batches.
K: Set of finished products.
Parameters
$n$: number of intermediates required in the solution.
$\bar v_{pb}$: mass percentage upper bound of premix $p$ in production of pre-dried intermediate $b$.
$\bar v'_{pb}$: mass percentage upper bound of premix $p$ in production of postevaporated intermediate $b$.
$\bar z_{pk}$: mass percentage upper bound of premix $p$ in production of finished product $k$.
$\beta_{ip}$: mass percentage of ingredient $i$ in production of premix $p$.
$c_p$: cost of premix $p$, in $ per metric ton.
$q^s_k$ (stat factor): number of doses in a stat unit for product $k$.
$q^v_k$: production target of product $k$: number of stat units sold per year.
$q^w_k$ (wash volume): volume of water in a single wash for product $k$, in liters.
$q^d_k$ (dosage): grams of product $k$ in the wash machine for one wash cycle.
$\bar R_k$: maximum rate-of-evaporation limit for product $k$.
$r_b$: maximum evaporation capability for intermediate batch $b$.
Decision Variables
$v_{pb} \in [0, \bar v_{pb}]$: mass percentage of premix $p$ in production of pre-evaporated intermediate $b$.
$e_b$: evaporation variable for intermediate $b$.
$z_{pk} \in [0, \bar z_{pk}]$: mass percentage of premix $p$ in production of finished product $k$ as an additive (not subject to evaporation).
$x_k \in [0, 1]$: mass percentage of the intermediate component in production of product $k$.
$y_{bk} = 1$ if finished product $k$ is assigned to intermediate batch $b$, and 0 otherwise.
Evaporation links the pre- and postevaporation composition of each intermediate batch: every non-water ingredient is concentrated by the evaporation variable $e_b$, and a water balance closes the system:
$$\beta'_{ib} = e_b\,\beta_{ib}, \quad i \in I\setminus\{w\},\; b \in B, \qquad (2)$$
$$e_b\,(1-\beta_{w,b}) = 1-\beta'_{w,b}, \quad b \in B. \qquad (3)$$
The amount of water evaporated for each product, driven by its production target, must respect the evaporation-rate limits of the product and of its assigned intermediate batch:
$$R_k = q^v_k\, x_k \sum_{b \in B} (\beta_{w,b}-\beta'_{w,b})\,y_{bk} \le \bar R_k, \quad k \in K, \qquad (6)$$
$$\beta_{w,b}-\beta'_{w,b} \le r_b, \quad b \in B. \qquad (7)$$
Empirical and semi-empirical models are based on experimental designs that are valid only within specific ranges, and accuracy can degrade severely if extrapolated beyond these bounds. Furthermore, compositions cannot differ drastically from benchmark mixtures, and the water content of powder products must be strictly controlled. Therefore, ingredient mass percentage values must lie within predefined lower and upper bounds:
$$\underline\beta_{ik} \le \beta_{ik} \le \bar\beta_{ik}, \quad i \in I,\; k \in K, \qquad (8)$$
$$\underline\beta'_{ib} \le \beta'_{ib} \le \bar\beta'_{ib}, \quad i \in I,\; b \in B. \qquad (10)$$
Empirical models impose constraints on product performance. The term $F_k(\beta_k)$ represents, for each product, a vector of all SRI and whiteness functions, each row characterized by third-order polynomial expressions of the product's composition $\beta_{ik}$ and the parameters temperature, hardness, and soil level. Products created by the optimization must achieve a minimum level of performance, which is defined by the vector $f_k$ for each product:
$$F_k(\beta_k) \ge f_k, \quad k \in K. \qquad (11)$$
Singleton Step
To start the process, we solve for the artificial case in
which each product is allowed to have its own dedicated intermediate batch. This is equivalent to specifying K
Figure A.1: The singleton subproblem layout for Prod1 is much simpler than the full-problem layout of Figure 6 and is independent of other
products.
independent problems, each with n = 1. For an example portfolio {P1, P2, P3}, this requires us to solve three independent mixture-optimization problems comprising P1, P2,
and P3. Figure A.1 shows the corresponding Prod1 singleton of the problem illustrated in Figure 6.
We use the SAS/OR interior point nonlinear programming solver for the singleton subproblems, which are independent and can therefore be solved in parallel by enabling
the SAS cofor multithreaded processing capability. Complications exist, however, primarily because of bilinear terms
and nonconvex empirical and semi-empirical functions.
We address these complications by employing the multistart mechanism of the SAS nonlinear programming (NLP)
solver (also threaded), which aims to improve the likelihood
of finding globally optimal solutions. Note that the singleton-step subproblem is a specialization of the configuration-step problem in which A = {k} and y_kk = 1.
Although global optimality is neither provable nor guaranteed, we will describe a method for improving the ultimate quantity that is to be derived from these solutions: the globally optimal cost c_k of each singleton. For now, we begin to build a groupings pool Λ as the union of all singletons. In the previous example, Λ = {{P1}, {P2}, {P3}}.
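The multistart idea can be sketched generically: run many independent local searches from spread-out starting points (in parallel, since they do not interact) and keep the best local optimum found. This toy Python version uses a simple 1-D nonconvex objective and a crude descent step in place of the SAS interior point NLP solver; the objective and all parameters are illustrative:

```python
# Sketch: multistart local search on a nonconvex objective -- run many
# independent local searches and keep the best local optimum found.
# Stand-in for the SAS multistart NLP mechanism; everything here is a toy.
from concurrent.futures import ThreadPoolExecutor

def f(x):
    # Nonconvex toy objective with local minima near x = -1 and x = 2.
    return (x + 1) ** 2 * (x - 2) ** 2 + 0.5 * x

def local_min(x, step=0.01, iters=5000):
    # Crude descent: move toward the lower neighbor until neither helps.
    for _ in range(iters):
        if f(x - step) < f(x):
            x -= step
        elif f(x + step) < f(x):
            x += step
        else:
            break
    return x

def multistart(n_starts=17, lo=-4.0, hi=4.0):
    starts = [lo + (hi - lo) * i / (n_starts - 1) for i in range(n_starts)]
    with ThreadPoolExecutor() as pool:      # starts are independent
        return min(pool.map(local_min, starts), key=f)

best = multistart()   # lands in the deeper basin near x = -1
```

A single local search from an unlucky start would stop in the shallow basin near x = 2; the restarts are what raise the likelihood of reaching the deeper basin, which is exactly the role multistart plays for the singleton NLPs.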
Grouping Step
To approximate this idea, we exploit a physical observation that generates a relatively small number of promising product groups. The observation relies on the premise that products with similar performance requirements can reasonably be expected to benefit from extracting a portion of their mixture from the same intermediate. Singletons are
ideal chemical compositions for products because they benefit from dedicated intermediates. Grouping finished products based on the similarity of their singleton intermediate
batch compositions is known to be beneficial by observation
in practice. For example, products that must meet stringent
SRI requirements for the same stains often share an intermediate to reduce costs; in isolation, they would have very
similar intermediate batches.
To exploit this observation, we define a metric of similarity between the optimal (or best-known) intermediate compositions of singleton solutions. For any two singletons k and l,

δ_kl = Σ_{p∈P} c_p |v_pk − v_pl|. (14)
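In code, the metric can be read as a premix-cost-weighted distance between two singleton intermediate compositions; the following sketch assumes that reading of Equation (14), with invented premix names, costs, and fractions:

```python
# Sketch: pairwise similarity between singleton intermediate compositions,
# taken as a premix-cost-weighted absolute difference (an illustrative
# reading of Equation (14); all names and numbers are hypothetical).

def similarity(v_k, v_l, cost):
    """delta_kl = sum_p cost[p] * |v_k[p] - v_l[p]| (smaller = more similar)."""
    return sum(cost[p] * abs(v_k[p] - v_l[p]) for p in cost)

cost = {"Pre1": 2.0, "Pre2": 1.0, "Pre3": 4.0}     # premix unit costs
v = {
    "P1": {"Pre1": 0.50, "Pre2": 0.30, "Pre3": 0.20},
    "P2": {"Pre1": 0.48, "Pre2": 0.32, "Pre3": 0.20},
    "P3": {"Pre1": 0.10, "Pre2": 0.60, "Pre3": 0.30},
}
delta = {(k, l): similarity(v[k], v[l], cost)
         for k in v for l in v if k != l}
# P1 and P2 are far more alike than either is to P3.
```

Weighting by premix cost makes the metric sensitive to differences that matter economically, not just compositionally.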
Variables
v_l = 1 if the intermediate of product l is used in calculating δ_kl for all k to be grouped with l, 0 otherwise.
u_kl = 1 if product k is assigned to the intermediate of singleton l, 0 otherwise.
λ_A = 1 if group A is selected into the solution, 0 otherwise.
Grouping Problem Formulation

Minimize Σ_k Σ_l δ_kl u_kl (15)
subject to
Σ_l u_kl = 1, ∀k, (16)
Σ_l v_l = n, (17)
u_kl ≤ v_l, ∀k, l, (18)
v_k ≤ 1 − u_kl, ∀k, l, k ≠ l, (19)
u_kl ∈ {0, 1}, ∀k, l, (20)
v_l ∈ {0, 1}, ∀l. (21)
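For a tiny portfolio, the optimum of (15)–(21) can be recovered by enumeration, which makes the formulation's intent easy to check: choose n leader singletons and assign every product to its most similar leader. A brute-force Python sketch (not the solver used in the paper; the δ values are the toy ones from above):

```python
# Sketch: brute-force equivalent of the grouping MILP (15)-(21) for a tiny
# instance -- choose n "leader" singletons (v_l = 1) and assign each product
# to a chosen leader (u_kl = 1), minimizing total dissimilarity.
from itertools import combinations

def solve_grouping(products, delta, n):
    best = None
    for leaders in combinations(products, n):
        # each product goes to its most similar leader (itself costs 0)
        assign = {k: min(leaders, key=lambda l: 0 if l == k else delta[(k, l)])
                  for k in products}
        cost = sum(0 if assign[k] == k else delta[(k, assign[k])]
                   for k in products)
        if best is None or cost < best[0]:
            best = (cost, dict(assign))
    return best

delta = {("P1", "P2"): 0.06, ("P2", "P1"): 0.06,
         ("P1", "P3"): 1.50, ("P3", "P1"): 1.50,
         ("P2", "P3"): 1.40, ("P3", "P2"): 1.40}
cost, assign = solve_grouping(["P1", "P2", "P3"], delta, n=2)
# Expected grouping: P1 and P2 share an intermediate, P3 keeps its own.
```

The enumeration makes plain why the MILP is needed in practice: the number of leader subsets grows combinatorially with the portfolio size.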
u =
| 1 0 0 |
| 0 1 0 |
| 0 0 1 |

has a one-to-one mapping to {{P1}, {P2}, {P3}}. Multiple iterations of the algorithm build out Λ by appending the nonzero columns of each solution u to Λ. For example,

Λ =
| 1 0 0 0 |
| 0 1 0 1 |
| 0 0 1 1 |,

where the appended fourth column corresponds to the group {P2, P3}.
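Maintaining the pool and appending columns is straightforward set bookkeeping; a sketch in which groups are frozensets and a grouping solution is an assignment map (names illustrative):

```python
# Sketch: maintaining the groupings pool (Lambda) as a set of product
# groups, appending the nonzero columns of each grouping solution u.

def columns_to_groups(assign):
    """Turn an assignment {product: leader} into groups (matrix columns)."""
    groups = {}
    for k, l in assign.items():
        groups.setdefault(l, set()).add(k)
    return {frozenset(g) for g in groups.values()}

pool = {frozenset({p}) for p in ("P1", "P2", "P3")}   # singleton step
pool |= columns_to_groups({"P1": "P1", "P2": "P2", "P3": "P2"})
# pool now holds {P1}, {P2}, {P3}, and the appended column {P2, P3}.
```

Using frozensets makes duplicate columns collapse automatically, so regenerated groups never inflate the pool.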
To prevent the grouping problem from regenerating groups already in the pool, we add a cut for each existing group:

Σ_{l∈A} v_l − Σ_{l∉A} v_l ≤ |A| − 1, ∀A ∈ Λ. (23)

[Figure A.2: Premixes Pre1–Pre5 feed a single common intermediate (pre- and post-evaporation) for products Prod2 and Prod3. The configuration subproblem layout shows that only one common intermediate exists, eliminating the need for the binary variable that assigns products to intermediates.]

The cost of the best partition T*_n of the portfolio into n groups is

c*_n = Σ_{A_i ∈ T*_n} c_{A_i}.

Parameters
x_kA: Mass percentage of the intermediate component in the production of product k for the solution of group A.
v_pA: Mass percentage of premix p in the production of the pre-evaporated intermediate for the solution of group A.
c_kA: Cost of producing product k using the group-A solution:

c_kA = Σ_{p∈P} c_p (q_k^s z_pk + q_k^v x_kA e_A v_pA). (24)
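Under the reading of Equation (24) used above (direct premix cost plus evaporation-corrected cost of the premixes routed through the intermediate), the per-product cost c_kA can be sketched as follows; all symbols and numbers are illustrative:

```python
# Sketch: per-product cost under a group's solution -- direct premix cost
# plus intermediate premix cost corrected for evaporation (one plausible
# reading of Equation (24); every value below is hypothetical).

def product_cost(c, z_k, v_A, q_s, q_v, x_kA, e_A):
    direct = q_s * sum(c[p] * z_k.get(p, 0.0) for p in c)
    via_intermediate = q_v * x_kA * e_A * sum(c[p] * v_A.get(p, 0.0) for p in c)
    return direct + via_intermediate

c = {"Pre1": 2.0, "Pre2": 1.0}              # premix unit costs
cost = product_cost(c, z_k={"Pre2": 0.4}, v_A={"Pre1": 0.7, "Pre2": 0.3},
                    q_s=1.0, q_v=1.0, x_kA=0.5, e_A=1.1)
```

The evaporation factor e_A inflates the mass of premix that must be fed to the intermediate per unit that survives drying, which is why it multiplies only the intermediate term.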
Variables
s_A = 1 if group A is used in the solution, 0 otherwise.
w_kA = 1 if the intermediate of product k is derived from group A, 0 otherwise.
Selection Problem Formulation

Minimize Σ_{A∈Λ} Σ_{k∈A} c_kA w_kA (25)
subject to
Σ_{A∈Λ: k∈A} w_kA = 1, k ∈ K, (26)
Σ_{A∈Λ} s_A = n, (27)
w_kA ≤ s_A, A ∈ Λ, k ∈ A, (28)
Σ_{k∈A} w_kA ≥ s_A, A ∈ Λ, (29)
s_A ∈ {0, 1}, A ∈ Λ, (30)
w_kA ∈ {0, 1}, A ∈ Λ, k ∈ A. (31)
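As with the grouping step, a tiny instance of the selection problem (25)–(31) can be solved by enumeration, which clarifies its structure: pick n groups from the pool and give each product its cheapest selected group that contains it. A brute-force Python sketch with invented costs:

```python
# Sketch: brute-force equivalent of the selection problem (25)-(31) --
# pick n groups from the pool and assign each product to its cheapest
# selected group containing it.
from itertools import combinations

def solve_selection(products, pool, c, n):
    best = None
    for chosen in combinations(pool, n):
        # every product must appear in at least one chosen group
        if any(all(k not in A for A in chosen) for k in products):
            continue
        total = sum(min(c[(k, A)] for A in chosen if k in A) for k in products)
        if best is None or total < best[0]:
            best = (total, chosen)
    return best

P1, P2, P3 = "P1", "P2", "P3"
pool = [frozenset({P1}), frozenset({P2}), frozenset({P3}), frozenset({P2, P3})]
c = {(P1, pool[0]): 1.0, (P2, pool[1]): 2.0, (P3, pool[2]): 2.0,
     (P2, pool[3]): 2.2, (P3, pool[3]): 2.1}
total, chosen = solve_selection([P1, P2, P3], pool, c, n=2)
# With n = 2 intermediates, P2 and P3 must share: the only feasible choice
# is {P1} plus {P2, P3}.
```

The toy numbers also show the trade-off the selection step resolves: sharing an intermediate raises the per-product costs of P2 and P3 slightly, but it is what makes a two-intermediate plant feasible.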
to a known solution of A because of the effect of nonconvexity. The process ensures that we provide the best-known
coefficients for the objective function in the selection step by
exploiting the hierarchy in the problem, even if we have not
converged to global optimality for every NLP calculation in
the configuration step.
We further exploit this idea to augment Λ with all partitions of its members, because it is possible to initialize all such partitions with the solutions of their supersets. For example,

Λ = {{P1}, {P2}, {P3}, {P4}, {P5}, {P2, P3}, {P1, P4, P5}} (33)

can be augmented with {{P1, P4}, {P1, P5}, {P4, P5}}, with each new subgroup being initialized with the restricted portion of s_{{P1,P4,P5}} and c_{{P1,P4,P5}}.
We cannot guarantee global optimality for the configuration step because of the nonconvexity of the problem; therefore, c_A can only be regarded as an upper bound on the production cost of a group A. Ideally, for the solution methodology to converge with some confidence of optimality, it would be helpful to identify lower bounds c̲_A for the costs of groups in P(A) for which we have not solved the associated NLPs. We illustrate this using the following example.
Consider a scenario in which we are solving a five-product, two-intermediate problem and have built the pool Λ defined in Equation (33). Such a pool would be constructed by the solution of the singleton step, with one augmentation from the grouping step that produces {{P2, P3}, {P1, P4, P5}}. We solve an NLP for each member of Λ to calculate its corresponding cost. Although we have not solved the NLP for the group {P1, P2, P3}, we can estimate its lower bound as a consequence of Inequality (32):

c̲_{{P1,P2,P3}} := max( c_{{P1}} + c_{{P2}} + c_{{P3}}, c_{{P1}} + c_{{P2,P3}} ).
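The bound logic generalizes: any partition of an unsolved group into members whose costs are already known yields a valid lower bound, and the best such bound is the maximum over those partitions. A recursive Python sketch with toy costs:

```python
# Sketch: lower-bound estimate for an unsolved group, as the maximum over
# partitions of the group into members whose NLP costs are already known
# (cf. the {P1, P2, P3} example above; costs are illustrative).

def lower_bound(group, known):
    group = frozenset(group)
    if group in known:
        return known[group]
    best = None
    for part in (A for A in known if A < group):   # proper known subset
        rest = lower_bound(group - part, known)    # bound the remainder
        if rest is not None:
            cand = known[part] + rest
            if best is None or cand > best:
                best = cand
    return best

known = {frozenset({"P1"}): 1.0, frozenset({"P2"}): 2.0,
         frozenset({"P3"}): 2.0, frozenset({"P2", "P3"}): 4.5}
lb = lower_bound({"P1", "P2", "P3"}, known)
# max(c{P1}+c{P2}+c{P3}, c{P1}+c{P2,P3}) = max(5.0, 5.5) = 5.5
```

Because grouping can only raise cost relative to splitting the same products across more intermediates, every partition's total is a valid floor, and taking the maximum tightens the estimate without solving any new NLP.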
We define Λ̄ = Λ ∪ {{P1, P2, P3}} and the corresponding optimal cost (derived by calculating T̄*_{Λ̄} in the selection step):

c̲*_{Λ̄,n} = Σ_{A_i ∈ T̄*_{Λ̄,n}} c̲_{A_i}.

The relative gap between the two bounds then gives the termination test

(c*_{Λ,n} − c̲*_{Λ̄,n}) / c̲*_{Λ̄,n} ≤ ε_n. (34)
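The termination test of Inequality (34) is a standard relative-gap check; a short sketch with illustrative bounds:

```python
# Sketch: the termination test of Inequality (34) -- stop refining once the
# relative gap between the best upper bound and the lower-bound estimate
# falls within the tolerance for the current number of intermediates n.

def converged(upper, lower, tol):
    return (upper - lower) / lower <= tol

# e.g. an upper bound of 5.60 vs. a lower bound of 5.50 is a ~1.8% gap:
done = converged(5.60, 5.50, tol=0.02)
```
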
Algorithm Summary
Here, we summarize the steps in the algorithm.
1. Singleton step. Let the iteration count m = 0. Initialize Λ with all singletons, solve the singleton NLP configuration problems, and calculate the similarities δ_kl for all pairs of singletons.
2. Grouping step. Let m = m + 1. Solve the grouping problem and append the candidate groups to Λ, including the partitions of each group. Also append to Λ additional groups derived from combinations of members of Λ.
3. Configuration step. In parallel, solve the independent configuration-problem NLPs for the new groups of Λ.
4. Selection step. Solve set-covering problems for the upper bounds c*_{Λ,n} and the lower-bound estimates c̲*_{Λ̄,n}, respectively.
5. Terminate the algorithm if Inequality (34) holds; else
go to step 2.