

Publisher: Institute for Operations Research and the Management Sciences (INFORMS)
INFORMS is located in Maryland, USA

Interfaces
Publication details, including instructions for authors and subscription information:
http://pubsonline.informs.org

Statistical and Optimization Techniques for Laundry Portfolio Optimization at Procter & Gamble
Nats Esquejo, Kevin Miller, Kevin Norwood, Ivan Oliveira, Rob Pratt, Ming Zhao

To cite this article:


Nats Esquejo, Kevin Miller, Kevin Norwood, Ivan Oliveira, Rob Pratt, Ming Zhao (2015) Statistical and Optimization Techniques
for Laundry Portfolio Optimization at Procter & Gamble. Interfaces 45(5):444-461. http://dx.doi.org/10.1287/inte.2015.0802
Full terms and conditions of use: http://pubsonline.informs.org/page/terms-and-conditions
This article may be used only for the purposes of research, teaching, and/or private study. Commercial use
or systematic downloading (by robots or other automatic processes) is prohibited without explicit Publisher
approval, unless otherwise noted. For more information, contact permissions@informs.org.
The Publisher does not warrant or guarantee the article's accuracy, completeness, merchantability, fitness
for a particular purpose, or non-infringement. Descriptions of, or references to, products or publications, or
inclusion of an advertisement in this article, neither constitutes nor implies a guarantee, endorsement, or
support of claims made of that product, publication, or service.
Copyright © 2015, INFORMS

INFORMS is the largest professional society in the world for professionals in the fields of operations research, management
science, and analytics.
For more information on INFORMS, its publications, membership, or meetings visit http://www.informs.org


Vol. 45, No. 5, September–October 2015, pp. 444–461


ISSN 0092-2102 (print), ISSN 1526-551X (online)

http://dx.doi.org/10.1287/inte.2015.0802
© 2015 INFORMS

Statistical and Optimization Techniques for Laundry Portfolio Optimization at Procter & Gamble
Nats Esquejo
Procter & Gamble, Newcastle-Upon-Tyne NE27 0QW, United Kingdom,
esquejo.nl@pg.com

Kevin Miller, Kevin Norwood


Procter & Gamble, Cincinnati, Ohio 45202
{miller.kp@pg.com, norwood.kt@pg.com}

Ivan Oliveira, Rob Pratt


SAS, Cary, North Carolina 27513
{ivan.oliveira@sas.com, rob.pratt@sas.com}

Ming Zhao
Department of Decision and Information Sciences, Bauer College of Business, University of Houston,
Houston, Texas 77204, mzhao@bauer.uh.edu

The Procter & Gamble (P&G) fabric-care business is a multibillion-dollar organization that oversees a global portfolio of products, including household brands such as Tide, Dash, and Gain. Production is impacted by a steady stream of reformulation modifications, imposed by new-product innovation and constantly changing material supply conditions. In this paper, we describe the creation and application of a novel analytical framework that has helped P&G determine the ingredient levels and the product and process architectures that enable the company to create some of the world's best laundry products. Modeling cleaning performance and other key properties, such as density, required P&G to develop innovative quantitative techniques based on visual statistical tools. It used advanced mathematical programming methods to address the challenges imposed by the manufacturing process, product performance requirements, and physical constraints, which collectively result in a hard mixed-integer nonlinear (nonconvex) optimization problem. We describe how P&G applied our framework in its North American market to identify a strategy that improves the performance of its laundry products, provides targeted consumer benefits, and enables cost savings on the order of millions of dollars.
Keywords: pooling; blending; optimization; response surface; design of experiments.
History: This paper was refereed.

Procter & Gamble (P&G) laundry products are global household brands that include Tide, Dash, and Gain, and are offered in several physical product forms, including powders, liquids, pods, tablets, and bars. These products are manufactured in more than 30 sites and sold in more than 150 countries worldwide. The design of laundry-product formulations (i.e., the ingredient composition of chemical mixtures) has become more complex over the years because of challenges such as product-portfolio expansion, rapidly changing ingredient costs and availability, and increasing competitive activity. The pace of change is fast and increasing.

Traditional formulation approaches involve simplifying the problem, hypothesizing a solution, physically creating and testing prototypes, analyzing results, and iterating until various objectives are met. Physical prototyping can be expensive and time consuming, resulting in slow and costly iteration cycles; as a result, these traditional approaches no longer meet today's needs.

P&G's research and development organization is at the forefront of the development and adoption of modeling tools that enable the company to make better decisions on product formulation, processing, and manufacturing. These include empirical, first-principles,

and semi-empirical models that predict chemical reactions during manufacturing, in-use physical properties of the product, technical performance of the product, and even consumer acceptance rates. These tools enable researchers to instantly predict a product's physical properties and performance, integrate models, and balance production trade-offs using a variety of predictive and prescriptive capabilities.

Until recently, the complexity of laundry-formulation and manufacturing processes limited us to considering the reformulation of only a single product at a time; however, breakthroughs in mathematical optimization technology have made system-wide portfolio reformulation possible. This is critically important because it permits us to model and optimize product differentiation within a portfolio and consider sharing common materials within the manufacturing process. In this paper, we present the scope of laundry-portfolio modeling and optimization at P&G, the capabilities we developed to address this scope, and their application to innovate P&G's North American powder laundry portfolio.

Problem Definition and Challenges


The P&G North American laundry detergent business comprises three product forms: powders, liquids,
and pods. Powder detergents, which generate annual
sales of several hundred million U.S. dollars, are a
critical part of P&G's North American business. Even
as we focus on powders as the primary application,
the framework for these tools can (and must) be easily extendable to other forms. Therefore, although we
focus on the powder problem in this paper, we note
that the liquid form is a simplified version of this
problem.
Laundry-product formulation can occur in one or
more manufacturing sites to supply multiple markets.
Identical product formulations are commonly made
in three or four different sites to fulfill the demand
of an entire region, such as Western Europe or North
America. Because each manufacturing site defines its
own set of products, the possibility exists that 80 percent of the products produced in two different manufacturing sites may coincide, whereas the remaining
20 percent are small-volume formulations that only
one site supplies. We typically refer to a portfolio of


products as a group of products with a common set of characteristics. In this paper, we define portfolio as the set of formula-unique powder laundry detergents manufactured in our North American site.
Figure 1 illustrates the mixing architecture and
problem structure of the laundry detergent blending
process. A large portfolio of products is created from
a relatively small number of intermediate batches (i.e.,
1 to 8 in Figure 1); an intermediate batch, also called
an intermediate, is a mixture that is shared by various finished products. Each product is created by
blending a portion of its mixture from exactly one
intermediate batch with as many finishing additives
as required. Intermediates and finished products are
chemical mixtures of one or more ingredients or finishing additives. Ingredients or finishing additives are
sourced in the form of chemical mixtures, which we
refer to as premixes. The ingredient composition of
each premix is given, whereas the proportion of premixes to be combined to produce a desirable mixture
of ingredients must be specified (as a decision variable). Costs are specified at the premix level, whereas
product properties are determined by the ingredient
composition.
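As a small illustration of this relationship (with entirely hypothetical premix recipes and proportions), the ingredient composition of a mixture is a fixed linear map of the chosen premix proportions:

import numpy as np

# Hypothetical premix composition matrix: rows are premixes, columns are
# ingredients; each row holds the fixed mass fractions of one premix.
premix_comp = np.array([
    [0.70, 0.30, 0.00],   # Pre1
    [0.00, 0.50, 0.50],   # Pre2
    [0.20, 0.00, 0.80],   # Pre3
])

# Decision variable: mass fractions of each premix in one mixture (sums to 1).
premix_share = np.array([0.5, 0.3, 0.2])

# The ingredient composition of the mixture follows from the premix recipes.
ingredient_comp = premix_share @ premix_comp
print(ingredient_comp)   # array([0.39, 0.30, 0.31]); note it also sums to 1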
The goal of production is to minimize portfolio annual material spend across a network, which
currently includes about 40 products and up to
40 ingredients; material costs typically account for
approximately 60 percent of the total cost of production. P&G imposes many constraints to ensure that its
targeted levels of quality and manufacturing feasibility are achieved. These include requirements for stain
removal and whiteness performance, material balance and density of intermediate batches and product
mixtures, manufacturing site (i.e., plant) throughput,
water content, and raw material usage. Decisions to
be made include: assignments of products to intermediates, intermediate-proportion contributions to each
product, mixture compositions of intermediates, and
additive proportions in final products. In addition,
for laundry detergent powders, intermediate batches
must conform to unique evaporation rules that make
the problem more complex. Making the intermediate
batch requires mixing ingredient premixes in a slurry,
and then evaporating the excess water to form a free-flowing powder, which is mixed with finishing additives to create the final product.
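The evaporation step can be expressed as a simple water mass balance; the sketch below (illustrative values only) rescales a hypothetical slurry composition to a target post-evaporation water content, consistent with Equations (2) and (3) in the appendix:

def evaporate(pre_frac: dict, target_water: float) -> dict:
    """Rescale a pre-evaporation composition after removing water.

    pre_frac: ingredient mass fractions of the slurry (must sum to 1).
    target_water: desired post-evaporation water mass fraction.
    """
    w = pre_frac["water"]
    # Concentration factor e: non-water ingredients scale by e, and the
    # remaining water fraction satisfies e * (1 - w) = 1 - target_water.
    e = (1.0 - target_water) / (1.0 - w)
    post = {i: e * f for i, f in pre_frac.items() if i != "water"}
    post["water"] = target_water
    return post

# A slurry with 35% water dried down to 2% water:
print(evaporate({"surfactant": 0.40, "builder": 0.25, "water": 0.35}, 0.02))
# {'surfactant': 0.603..., 'builder': 0.377..., 'water': 0.02}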


Figure 1: (Color online) The production of laundry detergent mixtures creates a blending process.

To formalize a solution to this complex problem, we separate the analysis into two categories: predictive models and optimization. Predictive models are used to quantify the various relationships within the system, and optimization incorporates these predictive models into the mathematical formulation to determine the ideal values of the decision variables. Next, we provide details about each component of the problem.

Predictive Models

Predictive models are either empirical or semi-empirical in nature. Empirical models are third-order polynomial functions that capture the two performance qualities of a mixture: stain removal and whiteness.
Empirical models for stain removal and whiteness were created using experimental design procedures, an efficient means of model creation for controlled experiments (Box et al. 2005, Kutner et al. 2004), using JMP software for the design and analysis. Figure 2 shows an example of an experimental design for a three-variable model for a coffee stain. The table in the figure lists the set of 16 test treatments we ran and the associated coffee-stain response. The image to the right of the table is a graphical representation of the treatments, with the shading corresponding to the value of the stain-removal index (SRI) for coffee. In this example, higher SRI values (darker shading) are more desirable.

Treatment   Variable 1   Variable 2   Variable 3   Coffee SRI
 1               0           100          100         88.56
 2              50            50            0         81.96
 3             100            50           50         85.84
 4               0           100            0         79.65
 5               0             0            0         86.20
 6             100           100          100         94.02
 7             100           100            0         83.46
 8              50           100           50         85.21
 9             100             0            0         83.48
10             100             0          100         91.88
11              50            50           50         84.00
12              50            50          100         88.99
13              50             0           50         84.58
14               0            50           50         84.52
15               0             0          100         88.84
16              50            50           50         81.62

Figure 2: (Color online) This experimental design for a stain-removal index (SRI) for a coffee stain is characterized by SRI coefficients (left) and can be visualized as response-surface models (right).

Our experimental designs were based on I-optimal
criteria; such designs minimize average variance of
prediction over the region of experimentation (Goos
and Jones 2011, Johnson et al. 2011). They also included 16 variables with all two-way and selected
three-way interactions, producing third-order designs
with approximately 300 model terms. These variables
included all the key cleaning ingredients and wash
conditions of interest (e.g., surfactants and wash temperature). We used this design procedure for all stain
and whiteness models (approximately 60 responses).
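To make the I-optimality criterion concrete, the sketch below (a simplified illustration, not the authors' JMP workflow) computes the average scaled prediction variance of a small three-factor quadratic design over a grid of candidate points; the paper's actual designs involved 16 variables and roughly 300 third-order terms, and the factor ranges, grid, and design here are hypothetical.

import numpy as np
from itertools import product

def quad_model_matrix(X):
    """Model matrix for a full quadratic model in three factors."""
    x1, x2, x3 = X[:, 0], X[:, 1], X[:, 2]
    return np.column_stack([np.ones(len(X)), x1, x2, x3,
                            x1*x2, x1*x3, x2*x3, x1**2, x2**2, x3**2])

def i_criterion(design, grid):
    """Average scaled prediction variance of `design` over a reference grid;
    I-optimal designs minimize this quantity."""
    F = quad_model_matrix(design)
    M_inv = np.linalg.inv(F.T @ F)
    G = quad_model_matrix(grid)
    # var(yhat(x)) is proportional to g(x)' (F'F)^{-1} g(x); average over grid.
    return float(np.mean(np.sum((G @ M_inv) * G, axis=1)))

levels = [0.0, 50.0, 100.0]
design = np.array(list(product(levels, repeat=3)))          # 27-run 3^3 design
grid = np.array(list(product(np.linspace(0, 100, 6), repeat=3)))
print(i_criterion(design, grid))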
We generated empirical data for the design by
making laboratory-scale formula prototypes, which
we physically tested in standardized wash protocols. The stain removal and whiteness procedures we
used were similar to ASTM method D4265-14 (ASTM
International 2014), which involves creating standard
stain sets and characterizing their color before and
after wash (ΔE) using image analysis. Figure 3 illustrates this procedure, in which measures are assigned to each of several standard technical stains that are washed together in a single experiment. Figure 4 shows an example of a standard coffee stain that
has been processed with a given product test mixture
(before and after wash).
With the experiments in the design completed, our next step was to evaluate and select response models.

$$\text{Stain removal} = \frac{\text{before-wash color} - \text{after-wash color}}{\text{before-wash color}}.$$

Figure 3: (Color online) A standard wash protocol (with temperature $t$, hardness $h$, and soil level $s$) is used when testing the stain-removal effectiveness of a mixture.

Figure 4: (Color online) Stains are scanned before and after a wash experiment; in this example, we use a coffee stain.

We accomplished model selection for each response using three stepwise regression techniques: p-value threshold, minimum corrected Akaike information criterion (AICc), and minimum Bayesian information criterion (BIC); see Burnham and Anderson (2002, 2004) and Miller (1990) for a description of using Akaike and Bayesian information criteria for model selection. We used multivariate regression to quantify the selected models and validation metrics to determine the best model for each response, and we conducted several levels of validation for each model to characterize prediction quality. One of the common techniques involved quantifying standard fitting diagnostics for the data set used for model creation. These metrics include R-square, adjusted R-square, root mean square error, lack-of-fit p-values, and other similar metrics. Adjusted R-square for the models ranged from 0.55 to 0.95, with an average of 0.84. We used models with adjusted R-square below 0.70 only if our technology experts agreed that the trends the models displayed were acceptable for business purposes. All models in this design were deemed acceptable. Figure 5 shows an example of these diagnostics for coffee SRI.
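As a rough illustration of this kind of selection (not the authors' JMP procedure), the following sketch performs greedy forward selection using BIC with statsmodels on synthetic data; the candidate terms, data, and effect sizes are invented for the example.

import numpy as np
import statsmodels.api as sm

def forward_select_bic(X, y, names):
    """Greedy forward selection: add the candidate term that most lowers
    BIC; stop when no addition improves it."""
    chosen, remaining = [], list(range(X.shape[1]))
    best_bic = sm.OLS(y, np.ones((len(y), 1))).fit().bic  # intercept-only fit
    while remaining:
        scores = [(sm.OLS(y, sm.add_constant(X[:, chosen + [j]])).fit().bic, j)
                  for j in remaining]
        bic, j = min(scores)
        if bic >= best_bic:
            break
        best_bic = bic
        chosen.append(j)
        remaining.remove(j)
    return [names[j] for j in chosen]

rng = np.random.default_rng(0)
V = rng.uniform(0, 100, size=(64, 5))                    # five design variables
y = 80 + 0.08 * V[:, 0] + 0.0005 * V[:, 1] * V[:, 2] + rng.normal(0, 1, 64)
terms = np.column_stack([V, V[:, 1] * V[:, 2]])          # add one interaction term
print(forward_select_bic(terms, y, ["v1", "v2", "v3", "v4", "v5", "v2*v3"]))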
Figure 5: (Color online) Fit diagnostics, shown for the SRI response function for a coffee stain, show strong predictive model performance and are typical of stain results.

Semi-empirical models are based, to some extent, on physical relationships. In this paper, we define semi-empirical models as functional forms derived from known equations (typically based on physical laws and theorems), whose coefficients were determined by fitting the equations to a set of experimental data. Finished-product density and intermediate density were the primary semi-empirical models used; both are nonconvex functions of mixture-ingredient proportions. The appendix includes detailed functional forms of these models.

Finally, we used a first-principles model of evaporative load for any given formulation to estimate the impact on evaporative rate, an important manufacturing consideration for detergent powders. This model is based on the material balance of water in the drying process.

Optimization
The predictive models described offer the capability to
determine cleaning and density properties of a given
arbitrary mixture of ingredients without resorting to
expensive and time-consuming physical prototypes.
The next step was to determine ideal mixture ingredient compositions to ultimately bring P&G products to
the market while meeting stringent quality and manufacturing requirements. The optimization problem we
describe determines the most economical composition of mixtures, evaporation levels, and intermediate-to-product assignments. In this section, we describe
the optimization problem at a high level to present a
basic statement of the problem, highlight complicating features, and motivate the solution-methodology
discussion. We include a more detailed mathematical
formulation in the appendix.
Intermediate batches and final products consist of
mixtures of ingredients. Although ingredient proportions determine the properties of these mixtures, the
manufacturing process does not simply blend pure
ingredients. Rather, premixes (mixtures of a small
subset of ingredients) are sourced and mixed to
achieve a final-ingredient mixture. The eligibility of
premixes to be added to an intermediate batch versus
a final product is determined primarily by the nature
of the premix. Premixes with high water content are
typically added to intermediate batches so that the
water can evaporate. Some premixes (e.g., perfumes)
can be unique to one product or available to all products, either in the intermediate batch, postevaporation
stage, or both.
Figure 6 illustrates the analytical representation
of material flowing through various stages of the
manufacturing process for a simple two-intermediate,
three-product example. Individual premixes may be
eligible to go into intermediate mixtures, directly into
the final-product mixtures (as additives), or either. For
example, the figure shows that premix Pre1 can be
assigned only to intermediates (B1 and B2), whereas
premix Pre3 can be assigned to intermediates and
product Prod2. The additive Pre5 can be assigned
only to Prod3 directly.
Figure 6: The layout defines the various decision variables and constraints of the optimization problem.

Percentages by weight of premixes used in the total production of intermediate batches (B1 and B2) and

products (Prod1, Prod2, and Prod3) are determined


as part of the optimization problem. In turn, these
decision variables determine mass percentages of raw
materials in the total production of intermediate and
product mixtures, as determined by the given premix
compositions. Because intermediates must adhere to
stringent water-content requirements, an evaporation
phase that produces a postevaporation mixture is necessary, as the distinction between pre- and postevaporated intermediates in Figure 6 indicates.
The extent of evaporation of intermediate batches
is a control variable of the manufacturing process and
must be prescribed as a decision variable for each
intermediate batch; because it is accomplished using
spray drying technology, it is also subject to maximum evaporation constraints based on plant restrictions. The appendix provides mathematical details
about the role of this variable; in this discussion, we
simply point out that it introduces into the formulation a difficult bilinear term that cannot be eliminated.
Each product can derive a portion of its mixture
from at most one postevaporated intermediate. This
requirement permits a continuous manufacturing process flow, which avoids capital investment for storage and handling of the intermediate batch; however,
it limits the ability to channel various intermediate
batches to each product. The intermediate-to-product
assignment is a binary decision variable.

In addition to these requirements, the following


constraints must hold:
1. Ingredient and premix mass fractions must add
up to 100 percent for all mixtures in the process.
2. Water content in the pre- and postevaporation
intermediate mixtures must lie within a given range
(typically between one percent and three percent mass
content for postevaporation).
3. Each ingredient's mass fractions must lie within predefined bounds to ensure the integrity of the physical models.
4. Finished products must have density values
within predefined bounds.
5. Finished product SRI and whiteness values must
be bounded below by benchmark values.
6. A predefined number of intermediates must be
used.
Items 1–3 are linear constraints and do not impose
undue computational burden on the solution process; however, items 4 and 5 enforce density, SRI, and
whiteness constraints, introducing additional nonlinear, nonconvex constraints to the formulation. Product
density is expressed in terms of a rational function,
and SRI and whiteness are third-order polynomial
functions of the mixture-composition variables. In
combination with the previously described bilinear
term and binary decision variables, these features
make this a difficult mixed-integer nonlinear programming (MINLP) problem.
The optimization objective is to minimize the total
cost of premixes used in the mixing process, weighted
by product dosage per stat unit (i.e., a unit of demand) and product production volume targets. This
objective represents the total cost of production. The
appendix provides a detailed mathematical formulation of the MINLP, including all sets, decision variables, constraints, and objectives.
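To make the structure of this MINLP concrete, the sketch below encodes a toy two-intermediate, three-product instance in Pyomo (our choice for illustration; the authors' implementation uses SAS/OR). Only the mass-balance, assignment, and cost elements are shown; the nonconvex density and performance constraints are omitted, and all premix costs, demand weights, and bounds are invented. Solving it requires a solver that handles binary variables and bilinear terms (e.g., a global MINLP solver such as Couenne).

import pyomo.environ as pyo

P = ["Pre1", "Pre2", "Pre3"]          # premixes
B = ["B1", "B2"]                      # intermediate batches
K = ["Prod1", "Prod2", "Prod3"]       # finished products
cost = {"Pre1": 900.0, "Pre2": 650.0, "Pre3": 1200.0}   # $/metric ton (hypothetical)
demand = {"Prod1": 2.0, "Prod2": 1.5, "Prod3": 1.0}     # stat-weighted volume (hypothetical)

m = pyo.ConcreteModel()
m.v = pyo.Var(P, B, bounds=(0, 1))          # premix share of pre-evaporation intermediate
m.z = pyo.Var(P, K, bounds=(0, 1))          # premix share added directly to product
m.x = pyo.Var(K, bounds=(0, 1))             # intermediate share of each product
m.e = pyo.Var(B, bounds=(1.0, 1.25))        # evaporation concentration factor (assumed range)
m.y = pyo.Var(B, K, domain=pyo.Binary)      # product-to-intermediate assignment

# Mass balance: premix shares of each intermediate and of each product sum to 1.
m.batch_mass = pyo.Constraint(B, rule=lambda m, b: sum(m.v[p, b] for p in P) == 1)
m.prod_mass = pyo.Constraint(K, rule=lambda m, k: sum(m.z[p, k] for p in P) + m.x[k] == 1)
# Each product draws from exactly one intermediate.
m.assign = pyo.Constraint(K, rule=lambda m, k: sum(m.y[b, k] for b in B) == 1)

# Objective in the spirit of Equation (13): demand-weighted premix cost;
# note the nonconvex x*y*e*v products.
m.obj = pyo.Objective(
    expr=sum(demand[k] * sum(cost[p] * (m.z[p, k]
             + m.x[k] * sum(m.y[b, k] * m.e[b] * m.v[p, b] for b in B)) for p in P)
             for k in K),
    sense=pyo.minimize)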

Optimization Solution Methodology


As previously stated, a requirement of the problem
is that a given number of intermediate batches must
be used in the mixing process to produce a set of
products, where each product can take a portion of
its composition from exactly one intermediate. Any
number of products can use an intermediate. We can
interpret this as a set-partitioning problem with nonlinear side constraints; that is, we aim to partition

the set of products into subsets, such that an intermediate is assigned to each subset of products that
must satisfy various nonlinear constraints, such as
performance requirements. This interpretation of the
problem is closely related to the pooling problem
(Bodington and Baker 1990), and can be shown to
be an NP-hard mixed-integer nonlinear (nonconvex)
program.
Previous approaches to solving pooling problems
include relaxation and discretization strategies (Gupte
et al. 2013), Benders decomposition (Floudas and
Aggarwal 1990), Lagrangian relaxation (Visweswaran
and Floudas 1990), branch and cut (Audet et al. 2004),
and mixed integer linear programming (MILP) (Dey
and Gupte 2013). These approaches do not directly
apply to the variant we consider in this paper. Most of
the effort to date has focused on addressing the bilinear terms in the problem; however, the evaporation
process, the requirement that assignments must be
binary decisions, and nonconvex performance models
violate basic assumptions of much of this prior work.
State-of-the-art solution methodologies for the pooling problem that attempt to prove optimality are typically restricted to problems of (in equivalent terms) a
few dozen ingredients, premixes, intermediates, and
products; Dey and Gupte (2013) provide example
instance sizes, where the number of inputs and outputs is less than 100. Several features of the problem we address in this paper make it a significantly
harder generalization of the classical pooling problem, although its size is comparable to instances
addressed by the best-known methods. Furthermore,
the best-known methods often take many hours to
reach a small optimality-bound gap. For our purposes, solutions must be produced on the order of minutes (to allow scenario exploration, which is critical to
P&G formulation practices); therefore, we have taken
a heuristic approach to solve this optimization problem. The appendix provides a detailed explanation of
the optimization solution methodology.

Implementation and Usage


Team and Workflow
To best utilize this new analytical capability, P&G had
to devise a new way of organizing our team and also


implement a new work process. The use of optimization tools requires a set of skills best defined by an
optimization triad (Figure 7) that includes functional,
data, and optimization experts.
Functional experts are typically a small number of
individuals from different R&D functions, including
consumer, formulation, and process. Data experts are
one or two individuals who have access to all the
necessary information, such as material pricing and
material balances, and are typically skilled in visual
analytics tools, such as JMP statistical tools. Optimization experts are staff members who have the necessary programming skills to interact with the optimization models at the SAS code level. Figure 7 defines
the responsibilities of the experts in each category.
We defined a new work process (Figure 8), which
has enabled this multifunctional, multiskilled team
to efficiently use the new capability; we describe the
components as follows.
1. Problem definition: Functional experts frame the
problem to be solved and the research questions to be
addressed. For example, we researched and analyzed
processes that would (1) allow P&G to deliver significantly better product performance at equal or lower
cost over its current processes, and (2) reduce the number of intermediates while maintaining performance.
2. Knowledge development: Functional experts start building the problem's framework by collecting knowledge and building a common state of understanding across the team. We define knowledge as existing models, heuristic approaches, and simplifying assumptions, all of which must form a basis for agreement. For example, if different density models exist, the team must come to a consensus on which density model to use. Additional data or analysis may be required if the team cannot come to an agreement because of missing or conflicting information.

Figure 7: (Color online) This figure shows the optimization triad that P&G adopted as part of its standard process. The functional expert defines the problem to be solved, creates appropriate models, and defines constraints; the optimization expert defines the optimization framework, writes code, and runs scenarios; and the data expert gathers appropriate input data into the desired format and analyzes solutions using visual analytics tools.

Figure 8: (Color online) The optimization work process at P&G follows a sequence of nine interrelated steps.
3. Model development: Functional experts develop
models for new-product parameters or adapt (e.g.,
linearize) current models to make them more suitable
for optimization.
4. Portfolio data gathering: Data experts collect and
format all the information needed for the framework.
This proved to be a challenging step because we
encountered multiple databases with missing relational keys. The process can be quite manual in some
cases.
5. Optimization problem formulation: Optimization experts interpret the research question(s) and,
using the knowledge collected, models, and portfolio
information as context, create or adapt the mathematical framework for optimization by defining or modifying variables, objective functions, and constraints.
Optimization experts start the modeling work as soon
as the research question has been defined, and further
refine the optimization models as additional knowledge is generated and the portfolio data are compiled.
6. Write optimization code: Optimization experts
write or adapt SAS/OR code as needed to address the

new research question, constantly verifying assumptions and validating preliminary results with other
experts and stakeholders.
7. Iterative testing and mathematical validation:
Optimization experts participate with other members
of the team in a cycle of iterative testing to mathematically validate models, ensuring that constraints are
correctly interpreted and observed and that optimum
solutions are robust. In this step, parameters (e.g.,
the number of multistart points and stopping criteria)
are tuned, and model infeasibilities, often caused by
products with too-stringent performance constraints,
are commonly found. At this stage, steps must be
taken to ensure feasibility; revisiting the constraint
bounds is an example.
8. Churn and analysis: In this step, the entire team
exercises the optimization engine to optimize multiple
scenarios to answer the research question(s). Scenarios
can include changes in performance requirements, the
number of intermediates allowed, constraint bounds,
materials allowed to be used (and where they can be
used), or a combination of these scenarios. For highly
complex research questions, P&G implements a churn
event; in such an event, all members of the team are
colocated for two to four days. They focus all their


time on a common set of problems, and their objective is to produce data to inform decision making.
During churn, the team occupies a common room
equipped with physical and digital visualization tools
that aid the work process. Poster-sized printouts display a master list of all scenarios, and key parameters are captured and color coded to differentiate the
scenarios. They analyze completed scenarios on a display that consists of eight 42" high-definition television screens configured to behave as a single monitor,
allowing high-resolution visualizations to be spread
across a large area. The team typically reviews the
results using JMP software, which permits interactive
visualization and analysis.
9. Optimized recommendation: Functional experts
take results from the analysis and formulate a recommendation. Recommendations can be as simple as a
new set of formulations to meet a new requirement,
or as complex as a multistage strategy for a portfolio
of products, which might include the introduction of
new technologies.
In the course of this project, we learned that a mind-set shift is required to solve for the entire portfolio
rather than for one product at a time. Traditionally,
projects have addressed formulation changes for one
or only a small subset of products, while preserving the composition of most products in the portfolio. Once this mind-set shift happened, however, we
were able to identify (and test) portfolio-management
strategies, which led to some unique options that
P&G had not previously uncovered.
Outputs and Results
Because the algorithm we present in this paper is
based on building an approximation of the power set
of products, each optimization can be easily extended
to solve for various numbers of intermediate batches
by solving the selection step for as many independent subproblems as there are products, from one
intermediate for all products to the singleton cases.
We exploited this aspect of the algorithm to provide a range of intermediate batch configurations for
each optimization run for marginal additional run
time, running the set-covering subproblems in parallel. When running in this mode, one must determine where to apply the stopping criterion. Because
P&G's production focus is on numbers of intermediates greater than or equal to four, we apply our


stopping criterion at three intermediates at an estimated gap of three percent for all optimization runs.
Figure 9 shows solution objective values (total cost
difference for annual production) of a typical optimization, which we represent as the difference between
our solution output objective and the annual cost of
a historical production run. An obvious feature of the
problem we consider here is that as the number of
intermediate batches increases, the globally optimal
objective must not increase, because the model allows
more flexibility in tuning each product's assigned intermediate. The figure reflects this, and we note that the algorithm's flow was designed to ensure that this
check to verify rationality would never be violated
despite the presence of nonconvexity.
Reporting such a multiple intermediate solution
proved valuable to management because an immediate comparison of neighboring solutions could show
the benefits of increasing (or decreasing) the number
of intermediate batches in production, a costly, but
sometimes beneficial, manufacturing investment. For
example, the marginal benefits of increasing the number of intermediate batches from 14 to 17 are minimal;
only at 18 intermediates does a significant change
occur (because of the discrete nature of the problem),
which may justify the added complexity and cost.

Figure 9: (Color online) Multiple intermediate-solution annual costs show a potential reduction of $4 million when the number of intermediate batches is held constant for a 14-batch instance. An alternative interpretation suggests that, at similar costs and levels of quality, production can occur with only six intermediate batches.


As we mention previously, each optimization instance was referenced against a related historical
benchmark production run (created without the benefits of an analytical approach). Our expectation for
the success of this project was that the solution provided by the framework would always improve on
the benchmark's premix annual costs. This is a reasonable expectation because, for a given number of
intermediate batches, the benchmark is typically a
feasible mixture. Comparing the 14-intermediate solution annual cost with the benchmark value in Figure 9
shows that this requirement was met in this example, and that the solution typically produces savings
on the order of magnitude shown (between $0.5 and
$6 million).
Few instances currently exist at P&G because statistical and semi-empirical models have only recently
been developed for entire product portfolios. Table 1
lists two important powder instances we used in
this work and also shows the numbers of ingredients, premixes, products, response-surface models,
and benchmark intermediates that represent typical
problem sizes.
The optimization tool produces significantly better results than the benchmark for both instances in the table; its run times are very reasonable, given the needs of the P&G work process. Tables 2 and 3 summarize the results for these instances, listing most intermediate configurations from three to the number of products in the instance. The tables show the cost differences from the benchmark (in millions) and the run times at which each intermediate configuration satisfied its stopping criterion. The longest run time is the total run time of the process. Note the cost improvements when compared to the benchmark even when running at the minimum number of intermediate batches (three) for both instances. The values in the bolded row represent a direct comparison with the benchmark intermediate configuration.

Instance name   Ingredients   Premixes   Products   RS   Intermediates
3bii                 54           38         21      50        12
Wenlock              42           34         25      38        13

Table 1: This table summarizes the size of two representative instances in P&G's portfolio. The term RS refers to the number of response-surface models, and the Intermediates column represents the number of intermediate batches in the benchmark solution.

Intermediates   Cost diff. vs. bench ($ in millions)   Bounds (%)   Run time (sec)
 3                         15.8                           2.35          458.5
 4                         16.9                           1.77          253.0
 5                         18.0                           1.19          110.2
 6                         18.8                           0.78          110.4
 7                         19.2                           0.58          110.5
 8                         19.4                           0.45          110.6
 9                         19.6                           0.36          110.6
10                         19.7                           0.29          110.7
11                         19.8                           0.22          110.9
12                         20.0                           0.15          111.0
13                         20.1                           0.09          111.1
14                         20.2                           0.04          111.2
21                         20.3                           0.00           32.6

Table 2: The table shows results for instance 3bii. The bolded values represent a best comparison to the benchmark solution, which has 12 intermediate batches.
The most direct application of our work has been in P&G's North American dry laundry portfolio, which consists of more than 20 unique formulations. Our objective was to study different strategies for simplification and savings, while delivering the same or better consumer-relevant cleaning performance. The churn team spent two days running and analyzing more than 30 scenarios. Here, we discuss two examples from the study results.

Intermediates   Cost diff. vs. bench ($ in millions)   Bounds (%)   Run time (sec)
 3                         76.4                           2.46          256.6
 4                         79.8                           1.42          256.7
 5                         81.9                           0.79          256.8
 6                         82.6                           0.58          257.0
 7                         83.0                           0.43          257.2
 8                         83.5                           0.28          257.3
 9                         83.7                           0.21          257.4
10                         83.9                           0.15          257.5
11                         84.0                           0.12          257.7
12                         84.1                           0.09          257.8
13                         84.2                           0.07          258.0
14                         84.3                           0.05          258.1
15                         84.3                           0.02          258.3
16                         84.4                           0.01          258.4
25                         84.4                           0.00           68.5

Table 3: In these results for the Wenlock instance, the bolded values represent a best comparison to the benchmark solution, which has 13 intermediate batches.
Increasing the number of intermediates allowed
from five to 10 would result in additional formula cost
savings. Compared to a five-intermediate benchmark,
using the optimization procedure would result in
annual cost improvements of $2 million for five
intermediates and $5 million for 10 intermediates.
Although a consequence of this strategy is to increase
complexity at the manufacturing site for handling
a higher number of intermediate batches, this is a
justifiable decision based on this demonstrable cost
reduction.
The introduction of a new active ingredient (currently not in the North American powder formulation, but available in empirical and semi-empirical
models) across the whole portfolio would generate
annual savings in excess of $20 million, while delivering the target performance profile (i.e., cleaning of different stains) for each product. P&G has incorporated
this knowledge into its short- to mid-term strategies
for this part of our business.
We have estimated that without the portfolio optimization tool, twice as many staff members would
have to work for far longer periods of time to run
30 scenarios on 20 formulations using the former
single-product-at-a-time approach and would produce inferior results.
User sponsorship and adoption have been instrumental to the success of this project. In an email to
the team, Christian Becerra, P&G senior researcher
and lead formulator for the North American powder
business, provided a set of additional benefits of our
portfolio optimization framework that go beyond the
savings described earlier (Becerra 2014):
Smart optimization: Identifying a formula and
process strategy that meets our criteria, removing
chemistry that will not deliver the desired performance profile to the consumer. This leads to smart
savings.
Next level of optimization: Going beyond the traditional single-formula to full-portfolio optimization
(i.e., the ability to see the big picture).
Flexibility: Integrating technical features and an
understanding of consumer needs to make more
robust portfolio propositions.

Esquejo et al.: Laundry Portfolio Optimization at P&G


Interfaces 45(5), pp. 444461, 2015 INFORMS

Redefining the cleaning vision: Differentiating


performance between brands and maintaining brand
equity.
Exploring out-of-the-box concepts: Exploring current production constraints and raw material ingredients, but with the flexibility to integrate deviations
that may lead to better results in performance and
process.
Managing what-if scenarios: Saving time and
resources.
Multifunctional integration: Performing a more
robust proposition versus isolated optimization efforts based on function.

Summary of Benefits
Portfolio optimization is changing the way we do
product development at P&G. Previously, we were
often limited to one of two strategies: we developed
each product individually, resulting in highly complex
portfolios that required high numbers of intermediate
batches, or we imposed simplification strategies that
resulted in higher formulation costs.
Portfolio optimization allows us to test formulation
and simplification strategies against the whole portfolio, giving us a realistic estimate of the potential
impact of these strategies. Fast iteration cycles allow
us to evaluate multiple strategies in a short time, discarding elements that will bring little value and combining elements that provide significant advantages.
The ability to analyze an entire portfolio simultaneously is also changing the way product designers
think about performance, because we can now more
easily differentiate performance among the products.
Thus, we can make smarter decisions about formulation and simplification strategies and respond with
agility when needed.
As lead strategies are identified and the formulation for the full portfolio is generated, some physical testing is required to confirm that the required
performance and physical properties of the products
are indeed met. This is especially true for solutions
that are near the minimum or maximum values of
the input ranges, because the confidence intervals of
the predictions are typically at their widest in these
ranges. Future phases of this project will address this
need, and continuously expand and improve the quality of predictive and optimization models.


As a direct result of using this optimization platform and measuring its value to our business, we
are now making the following enhancements to our
processes:
Evolving P&G and its work processes to better
use this and other operations research capabilities to
define and execute the best portfolio strategies.
Investing in resources to improve our data
management system and automate data pull and
formatting.
Building more models into the optimization framework and continuing to do so as new models become
available, potentially including consumer models and
first-principles chemistry models.
Finally, although the benefits of this tool have been
well demonstrated in our laundry-powder business,
many reapplication opportunities remain at P&G. In
the coming years, we expect to continue to develop
this capability further for our laundry business and
create similar capabilities for other P&G businesses.
Appendix
Semi-empirical Models
Stain-removal performance is predicted by the stain-removal index (SRI) response function, which has the following form:
$$\mathrm{SRI} = C_0 + C_1 v_1 + C_2 v_2 + C_3 v_1 v_2 + \cdots,$$
where the $C_i$ represent coefficients and the $v_i$ represent design variables: wash concentrations (milligrams per liter) of chemical ingredients and wash conditions (e.g., temperature). Whiteness models have a similar form.
Intermediate Batch Density
$$CD_b(\beta'_b) = C_0\, CI_b(\beta'_b) + C_1 V_3 + C_2 V_4 + \cdots, \qquad CI_b(\beta'_b) = \frac{1}{\sum_{i=1}^n \beta'_{ib}/\rho_i}.$$

$CI_b$: True density of an intermediate batch, also known as absolute density; the density of only the solid components of the batch postevaporation.
$CD_b$: Bulk density, or the density accounting for the mass of solid components and air entrapped in the particle.
$\beta'_b$: Vector of postevaporation mass fractions $\beta'_{ib}$ of ingredient $i$ in intermediate $b$.
$C_0, C_1, C_2, \ldots$: Regression coefficients.
$V_j$: Variables that define processing and chemistry parameters (e.g., temperature, hardness, $\beta'_b$).
$\rho_i$: Liquid density of pure component $i$.

Product Density
$$FD_k(z_k, x_k) = \frac{1}{\sum_{i=1}^n z_{ik}/(\rho_i f_i) + x_k/CD_b(\beta'_b)}.$$

$z_k$: Vector of mass fractions $z_{ik}$ of finishing additive $i$ in finished product $k$.
$x_k$: Mass fraction of assigned intermediate in production of finished product $k$.
$\rho_i$: Density of finishing additive $i$.
$f_i$: Packing factor of finishing additive $i$.
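As a quick check of this formula, the following sketch (with entirely hypothetical densities, packing factors, and fractions) evaluates the finished-product density for one product:

def product_density(z, rho, f, x, cd_batch):
    """Finished-product density FD_k from the formula above:
    1 / ( sum_i z_i / (rho_i * f_i) + x / CD_b ).

    z: additive mass fractions; rho: additive densities; f: packing factors;
    x: mass fraction taken from the intermediate; cd_batch: its bulk density.
    """
    return 1.0 / (sum(zi / (ri * fi) for zi, ri, fi in zip(z, rho, f))
                  + x / cd_batch)

# 30% additives (two of them), 70% intermediate of bulk density 0.45 g/mL:
print(product_density(z=[0.2, 0.1], rho=[1.5, 2.1], f=[0.8, 0.9],
                      x=0.7, cd_batch=0.45))   # about 0.56 g/mL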
Optimization Model
Notation
Sets
$P$: Set of premixes.
$I$: Set of ingredients (raw materials).
$B$: Set of intermediate batches.
$K$: Set of finished products.

Parameters
$n$: Number of intermediates required in the solution.
$\bar{v}_{pb}$: Mass percentage upper bound of premix $p$ in production of the pre-dried intermediate $b$.
$\bar{v}'_{pb}$: Mass percentage upper bound of premix $p$ in production of the postevaporated intermediate $b$.
$\bar{z}_{pk}$: Mass percentage upper bound of premix $p$ in production of finished product $k$.
$\phi_{ip}$: Mass percentage of ingredient $i$ in production of premix $p$.
$c_p$: Cost of premix $p$, in dollars per metric ton.
$q_k^s$ (stat factor): Number of doses in a stat unit for product $k$.
$q_k^v$: Production target of product $k$: number of stat units sold per year.
$q_k^w$ (wash volume): Volume of water in a single wash for product $k$, in liters.
$q_k^d$ (dosage): Grams of product $k$ in the wash machine for one wash cycle.
$\bar{R}_k$: Maximum rate-of-evaporation limit for product $k$.
$\bar{r}_b$: Maximum evaporation capability for intermediate batch $b$.

Decision Variables
$v_{pb} \in [0, \bar{v}_{pb}]$: Mass percentage of premix $p$ in production of the pre-evaporation intermediate $b$.
$e_b$: Evaporation variable for intermediate $b$.
$z_{pk} \in [0, \bar{z}_{pk}]$: Mass percentage of premix $p$ in production of finished product $k$ as an additive (not subject to evaporation).
$x_k \in [0, 1]$: Mass percentage of intermediate component in production of product $k$.
$y_{bk}$: 1 if finished product $k$ is assigned to intermediate batch $b$, 0 otherwise.


Because mixing decisions are made at a premix level, variables $v$, $z$, and $y$ take a $p \in P$ index, thus specifying premix composition. The evaporation process is expressed in terms of ingredient compositions, requiring us to calculate ingredient mass percentage quantities pre- and postevaporation for each ingredient $i$ in each intermediate $b$:
$$\beta_{ib} = \sum_{p \in P} \phi_{ip} v_{pb}, \quad i \in I,\ b \in B, \qquad (1)$$
$$\beta'_{ib} = e_b \beta_{ib}, \quad i \in I \setminus \{w\},\ b \in B, \qquad (2)$$
$$e_b (1 - \beta_{wb}) = 1 - \beta'_{wb}, \quad b \in B, \qquad (3)$$
where the subscript $w$ is used to denote water. Similarly, empirical and semi-empirical functions are expressed in terms of ingredient mass percentage at the product level, requiring us to calculate these values for each ingredient $i$ in each product $k$:
$$\delta_{ik} = \sum_{p \in P} \phi_{ip} z_{pk} + x_k \sum_{b \in B} y_{bk} \beta'_{ib}, \quad i \in I,\ k \in K. \qquad (4)$$

Constraints. Fundamental physical requirements that characterize the mixing process are modeled as constraints. Intermediate and final-product mixtures have mass percentage ingredient contributions that must add to 100 percent:
$$\sum_{p \in P} v_{pb} = 1, \quad b \in B, \qquad \text{and} \qquad \sum_{p \in P} z_{pk} + x_k = 1, \quad k \in K. \qquad (5)$$
The rate of evaporation $R_k$ is limited by
$$R_k = q_k^v x_k \sum_{b \in B} (\beta_{wb} - \beta'_{wb})\, y_{bk} \le \bar{R}_k, \quad k \in K. \qquad (6)$$
Additionally, there are physical limitations on the amount of water that can be evaporated in the intermediate batch:
$$\beta_{wb} - \beta'_{wb} \le \bar{r}_b, \quad b \in B. \qquad (7)$$
Empirical and semi-empirical models are based on experimental designs that are valid only within specific values, and accuracy can degrade severely if extrapolated beyond these bounds. Furthermore, compositions cannot differ drastically from benchmark mixtures, and the water content of powder products must be strictly controlled. Therefore, ingredient mass percentage values must lie within predefined lower and upper bounds:
$$\underline{\delta}_{ik} \le \delta_{ik} \le \bar{\delta}_{ik}, \quad i \in I,\ k \in K. \qquad (8)$$
Similarly, bounds must be imposed on ingredient compositions in the intermediate batches:
$$\underline{\beta}_{ib} \le \beta_{ib} \le \bar{\beta}_{ib} \quad \text{and} \quad \underline{\beta}'_{ib} \le \beta'_{ib} \le \bar{\beta}'_{ib}, \quad i \in I,\ b \in B, \qquad (9)$$
which include, most importantly for the role they play in the manufacturing of intermediate batches, constraints on water content.

Empirical models impose constraints on product performance. The term $F_k(\delta_k)$ represents a vector for each product of all SRI and whiteness functions, each row characterized by third-order polynomial expressions of the product's composition $\delta_{ik}$ and the parameters temperature, hardness, and soil level. Products created by the optimization must achieve a minimum level of performance, which is defined by a vector $f_k$ for each product:
$$F_k(\delta_k) \ge f_k, \quad k \in K. \qquad (10)$$
Semi-empirical models impose constraints on product density,
$$\underline{FD}_k \le FD_k(z_k, x_k, y_{bk}) \le \overline{FD}_k, \quad k \in K, \qquad (11)$$
where we recall that $FD_k(z_k, x_k, y_{bk})$ is a nonlinear function representing the density of product $k$, which is dependent on $y_{bk}$ through its corresponding intermediate batch density $CD_b$.
Finally, we recall that a requirement of the process is that each product is assigned to exactly one intermediate batch:
$$\sum_{b \in B} y_{bk} = 1, \quad k \in K. \qquad (12)$$

Objective. The optimization objective is to minimize the total cost of premixes used in the mixing process, weighted by product dosage per stat unit ($q_k^s$, a unit of demand) and product production-volume targets ($q_k^v$), using at most $n$ intermediates:
$$\min \sum_{k \in K} q_k^s q_k^v \sum_{p \in P} c_p \Bigl( z_{pk} + x_k \sum_{b \in B} y_{bk} e_b v_{pb} \Bigr). \qquad (13)$$

A useful quantity in the presentation to follow is the cost of production of a subset of products $A \subseteq K$ for a given set of values $x$, $y$, $e$, $v$, and $z$:
$$c_A = \sum_{k \in A} q_k^s q_k^v \sum_{p \in P} c_p \Bigl( z_{pk} + x_k \sum_{b \in B} y_{bk} e_b v_{pb} \Bigr).$$

Optimization Solution Methodology

Our algorithm is based on a column-generation heuristic, where a set-covering master problem interacts with independent subproblems that prescribe intermediate-to-product groupings and mixture compositions. The algorithm is based on the following sequence of steps:
1. Singleton
2. Grouping
3. Configuration
4. Selection
5. Return to step 2 until convergence.
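The skeleton below (an illustrative Python restatement, not the authors' SAS implementation) shows how these steps interact; the three callables stand in for the singleton/configuration NLPs, the grouping MILP, and the selection MILP detailed in the remainder of this appendix, and their names are hypothetical.

def churn(products, n, solve_config, solve_grouping, solve_selection,
          max_iters=10):
    """Column-generation-style heuristic: grow a pool of candidate product
    groups, price each group with a configuration solve, and select the
    best n-group partition, stopping when the cost no longer improves."""
    pool = {frozenset([k]) for k in products}          # singleton step
    cost = {g: solve_config(g) for g in pool}
    best_cost, best_partition = float("inf"), None
    for _ in range(max_iters):
        for g in solve_grouping(pool, cost, n):        # grouping step
            g = frozenset(g)
            if g not in cost:
                pool.add(g)
                cost[g] = solve_config(g)              # configuration step
        partition, total = solve_selection(pool, cost, n)   # selection step
        if total >= best_cost:
            break                                      # converged
        best_cost, best_partition = total, partition
    return best_partition, best_cost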

Singleton Step
To start the process, we solve for the artificial case in which each product is allowed to have its own dedicated intermediate batch. This is equivalent to specifying $|K|$ independent problems, each with $n = 1$. For an example portfolio $\{P_1, P_2, P_3\}$, this requires us to solve three independent mixture-optimization problems comprising $\{P_1\}$, $\{P_2\}$, and $\{P_3\}$. Figure A.1 shows the corresponding Prod1 singleton of the problem illustrated in Figure 6.

Figure A.1: The singleton subproblem layout for Prod1 is much simpler than the full-problem layout of Figure 6 and is independent of other products.
We use the SAS/OR interior point nonlinear programming solver for the singleton subproblems, which are independent and can therefore be solved in parallel by enabling the SAS COFOR multithreaded processing capability. Complications exist, however, primarily because of bilinear terms and nonconvex empirical and semi-empirical functions. We address these complications by employing the multistart mechanism of the SAS nonlinear programming (NLP) solver (also threaded), which aims to improve the likelihood of finding globally optimal solutions. Note that the singleton-step subproblem is a specialization of the configuration-step problem, where $A = \{k\}$ and $y_{kk} = 1$.
Although global optimality is neither provable nor guaranteed, we will describe a method for improving the ultimate quantity that is to be derived from these solutions: the globally optimal cost $c_k$ of each singleton. For now, we begin to build a groupings pool $\Omega$ as the union of all singletons. In the previous example, $\Omega = \{\{P_1\}, \{P_2\}, \{P_3\}\}$.
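The multistart idea can be illustrated generically: restart a local NLP solver from many random feasible compositions and keep the best local optimum found. The sketch below does this with SciPy's SLSQP on an arbitrary nonconvex stand-in objective (the objective, bounds, and restart count are all invented for illustration; the authors use the SAS NLP solver's built-in multistart).

import numpy as np
from scipy.optimize import minimize

def nonconvex_cost(v):
    """Arbitrary nonconvex stand-in for a singleton mixture-cost objective."""
    return np.sin(5 * v[0]) + (v[1] - 0.3) ** 2 + v[0] * v[2]

mass_balance = {"type": "eq", "fun": lambda v: v.sum() - 1.0}  # fractions sum to 1
rng = np.random.default_rng(7)
best = None
for _ in range(25):                        # 25 random feasible starting points
    v0 = rng.dirichlet(np.ones(3))
    res = minimize(nonconvex_cost, v0, bounds=[(0, 1)] * 3,
                   constraints=[mass_balance], method="SLSQP")
    if res.success and (best is None or res.fun < best.fun):
        best = res
print(best.x, best.fun)                    # best local optimum over all starts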
Grouping Step
We exploit a physical observation to generate a relatively small number of promising product groups to approximate this idea. The observation relies on the premise that products with similar performance requirements can reasonably be expected to benefit from extracting a portion of their mixture from the same intermediate. Singletons are
ideal chemical compositions for products because they benefit from dedicated intermediates. Grouping finished products based on the similarity of their singleton intermediate
batch compositions is known to be beneficial by observation
in practice. For example, products that must meet stringent

SRI requirements for the same stains often share an intermediate to reduce costs; in isolation, they would have very
similar intermediate batches.
To exploit this observation, we define a metric of similarity between optimal (or best-known) intermediate compositions of singleton solutions. For any two singletons $k$ and $l$,
$$\kappa_{kl} = \sum_{p \in P} c_p \bigl| \gamma_{pk} - \gamma_{pl} \bigr| \qquad (14)$$
represents a sum of the absolute-value differences of the mass percentages $\gamma_{pk}$ of premixes in their respective singleton postevaporation intermediates, weighted by the cost of the premix, where we have used the product indices to represent corresponding singleton pairs.
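For instance, the following sketch computes the $\kappa$ matrix for three hypothetical singleton solutions (the compositions and costs are made up):

import numpy as np

# Rows of V: postevaporation premix compositions of each product's singleton
# intermediate; c: premix costs. Equation (14) is a cost-weighted L1 distance.
V = np.array([[0.60, 0.30, 0.10],    # product 1's singleton intermediate
              [0.55, 0.35, 0.10],    # product 2
              [0.10, 0.20, 0.70]])   # product 3
c = np.array([900.0, 650.0, 1200.0])
kappa = np.abs(V[:, None, :] - V[None, :, :]) @ c
print(kappa)   # products 1 and 2 are far more similar than either is to 3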
The goal of the grouping step is to generate candidate groups of products that minimize the sum of $\kappa_{kl}$, such that $n$ intermediate batches are used. We accomplish this by defining a model based on bipartite assignments, where each product is eligible to be paired with each singleton's optimal intermediate. Binary variables determine which intermediates should be used, and arc variables determine prospective intermediate-to-product pairings. These are then interpreted to specify product groups that are appended to $\Omega$ (each composed of a subset of products).
Sets
$\Omega$: Current groupings pool.
$\Lambda = \{(k, l): k, l \in K\}$.

Parameters
$\kappa_{kl}$: Similarity measure; see Equation (14).

Variables
$v_l$: 1 if the intermediate of product $l$ is used in calculating $\kappa_{kl}$ for all $k$ to be grouped with $l$, 0 otherwise.
$u_{kl}$: 1 if product $k$ is assigned to the intermediate of singleton $l$, 0 otherwise.
$\lambda_A$: 1 if group $A \in \Omega$ is selected into the solution, 0 otherwise.
Grouping Problem Formulation
$$\text{Minimize} \quad \sum_{(k, l) \in \Lambda} \kappa_{kl} u_{kl} \qquad (15)$$
subject to
$$\sum_{l} u_{kl} = 1, \quad k \in K, \qquad (16)$$
$$\sum_{l} v_l = n, \qquad (17)$$
$$u_{kl} \le v_l, \quad (k, l) \in \Lambda, \qquad (18)$$
$$v_k \le 1 - u_{kl}, \quad (k, l) \in \Lambda,\ k \ne l, \qquad (19)$$
$$u_{kl} \in \{0, 1\}, \quad (k, l) \in \Lambda, \qquad (20)$$
$$v_l \in \{0, 1\}, \quad l \in K. \qquad (21)$$

Inequality (19) ensures that if a product has been assigned to an intermediate batch of another product's singleton,

its own singleton intermediate batch becomes ineligible for


grouping other products.
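To make the formulation concrete, the following Python sketch expresses (15)-(21) with the open-source PuLP modeler; it stands in for (and is not) the SAS/OR implementation. Here products is the set $\mathcal{K}$, sigma holds the $\sigma_{kl}$ of Equation (14) (with zero diagonal entries), and n is the number of intermediate batches.

from pulp import LpProblem, LpMinimize, LpVariable, LpBinary, lpSum

def grouping_model(products, sigma, n):
    m = LpProblem("grouping", LpMinimize)
    pairs = [(k, l) for k in products for l in products]
    u = LpVariable.dicts("u", pairs, cat=LpBinary)
    v = LpVariable.dicts("v", products, cat=LpBinary)
    # (15): total similarity of products to their assigned intermediates
    m += lpSum(sigma[k, l] * u[k, l] for (k, l) in pairs)
    for k in products:
        # (16): each product is assigned to exactly one intermediate
        m += lpSum(u[k, l] for l in products) == 1
    # (17): exactly n singleton intermediates are used
    m += lpSum(v[l] for l in products) == n
    for (k, l) in pairs:
        # (18): assignments are allowed only to used intermediates
        m += u[k, l] <= v[l]
        if k != l:
            # (19): a product grouped under another's intermediate
            # cannot have its own intermediate host other products
            m += v[k] <= 1 - u[k, l]
    return m, u, v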
Groups implied by $u_{kl}$ are appended to $\mathcal{A}$ for further steps in the algorithm (i.e., configuration and selection). Our definition of $\mathcal{E}$ ensures no symmetry in the problem, allowing us to use $u_{kl}$ to uniquely map each solution to a set of groups. For example,

$$u = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

has a one-to-one mapping to $\{\{P_1, P_2\}, \{P_3\}\}$. Multiple iterations of the algorithm build out $\mathcal{A}$ by appending the nonzero columns of each solution $u$ to $\mathcal{A}$. For example,

$$\mathcal{A} = \{\{P_1\}, \{P_2\}, \{P_3\}, \{P_1, P_2\}\},$$

created by appending the above solution to a groupings pool that has been initialized with singletons, is represented in matrix form as

$$\mathcal{A} = \begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}.$$

Because the purpose of this step is to enrich $\mathcal{A}$ to provide a better approximation of the optimal members of $\mathcal{P}(\mathcal{K})$, each iteration (i.e., each time the grouping problem is called) must produce n groups of which at least one is not currently in $\mathcal{A}$. We accomplish this by adding to the above formulation the variable $\delta_A$ and the constraints

$$\sum_{l \in A} (1 - u_{lk}) + \sum_{l \notin A} u_{lk} \geq 1 - \delta_A \quad \forall A \in \mathcal{A},\ \forall k \in \mathcal{K} \qquad (22)$$

and

$$\sum_{A \in \mathcal{A}} \delta_A \leq n - 1, \qquad (23)$$

where we interpret $A$ as columns of $\mathcal{A}$, $l \in A$ as rows of $\mathcal{A}$ with values of 1, and $l \notin A$ as rows of $\mathcal{A}$ with values of 0. We can interpret the variable as follows. For an existing group $A \in \mathcal{A}$, if $\delta_A = 0$, then Inequality (22) ensures that A is not a group in the solution of the grouping step as specified by u, and if $\delta_A = 1$, the constraint is relaxed. Inequality (23) allows the model to relax this condition up to $n - 1$ times. For example, when solving a five-intermediate problem, Inequalities (22) and (23) together ensure that a new solution is produced that allows for at most four intermediate batches already in $\mathcal{A}$.
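After the grouping model is solved (e.g., with m.solve() in the PuLP sketch above), the candidate groups are read off the nonzero columns of u; a small helper might look as follows.

def groups_from_solution(u, v, products):
    # Each used intermediate l (v[l] = 1) defines one candidate group:
    # the set of products assigned to it.
    groups = []
    for l in products:
        if v[l].value() > 0.5:
            groups.append(frozenset(k for k in products
                                    if u[k, l].value() > 0.5))
    return groups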
Configuration Step
Given a group of products, the production optimization problem becomes a continuous NLP. Figure A.2 illustrates the example of Figure 6 when we impose the group {Prod2, Prod3}. Note that there is a single intermediate, and therefore no need to identify the intermediate batch by index.

Figure A.2: The configuration subproblem layout shows that only one common intermediate exists, eliminating the need for the binary variable that
assigns products to intermediates.

This variant of the problem has no integer variables, allowing us to use standard NLP solvers. Because the singleton problem can be interpreted as a special case of the configuration problem in which only one product exists, we implement the singleton computation by using the configuration model code. In the configuration step, we solve the related NLPs to identify (locally) minimum-cost ways of producing these independent groups of mixtures. These problems are solved in parallel using the threaded capability of the SAS NLP solver.
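The same fan-out pattern can be sketched in Python with a process pool, where solve_group_nlp is a hypothetical callback (e.g., wrapping the multistart routine sketched earlier) that builds and solves the configuration NLP for one group and returns its cost.

from concurrent.futures import ProcessPoolExecutor

def configure_groups(groups, solve_group_nlp, max_workers=4):
    # The configuration NLPs are independent, so they can be solved
    # concurrently; the result maps each group to its best-known cost.
    groups = list(groups)
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        costs = list(pool.map(solve_group_nlp, groups))
    return dict(zip(groups, costs))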
Selection Step
The selection step is simply a cardinality-constrained set covering of $\mathcal{K}$ using the groups in $\mathcal{A}$ as eligible subsets. The goal is to minimize the total optimal cost of production

$$c_{\mathcal{A},n} = \sum_{A_i \in T_{\mathcal{A},n}} c_{A_i}$$

with an optimal partition $T_{\mathcal{A},n}$, where $|T_{\mathcal{A},n}| = n$. Because $\mathcal{A}$ is a relatively small set, we can easily solve this problem with the SAS/OR MILP solver.

Parameters
$x_{kA}$: Mass percentage of intermediate component in production of product k for the solution of group A.
$v_{pA}$: Mass percentage of premix p in production of the pre-evaporated intermediate for the solution of group A.
$c_{kA}$: Cost of producing product k using the group A solution:

$$c_{kA} = q_k^s\, q_k^v \sum_{p} c_p \left( z_{pk} + x_{kA}\, e_A\, v_{pA} \right). \qquad (24)$$


Variables
$s_A$ = 1 if group A is used in the solution, 0 otherwise.
$w_{kA}$ = 1 if the intermediate of product k is derived from group A, 0 otherwise.

Selection Problem Formulation

$$\text{Minimize} \quad \sum_{A \in \mathcal{A}} \sum_{k \in A} c_{kA}\, w_{kA} \qquad (25)$$

subject to

$$\sum_{A \in \mathcal{A}\colon k \in A} w_{kA} = 1 \quad \forall k \in \mathcal{K}, \qquad (26)$$

$$\sum_{A \in \mathcal{A}} s_A = n, \qquad (27)$$

$$w_{kA} \leq s_A \quad \forall A \in \mathcal{A},\ k \in A, \qquad (28)$$

$$w_{kA} \geq s_A \quad \forall A \in \mathcal{A},\ k \in A, \qquad (29)$$

$$s_A \in \{0, 1\} \quad \forall A \in \mathcal{A}, \qquad (30)$$

$$w_{kA} \in \{0, 1\} \quad \forall A \in \mathcal{A},\ k \in A. \qquad (31)$$
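A PuLP sketch of (25)-(31), parallel to the grouping sketch above and again only a stand-in for the SAS/OR MILP solver, follows; pool is a list of groups (frozensets) and c maps (product, group) pairs to the costs of Equation (24).

from pulp import LpProblem, LpMinimize, LpVariable, LpBinary, lpSum

def selection_model(products, pool, c, n):
    m = LpProblem("selection", LpMinimize)
    s = LpVariable.dicts("s", range(len(pool)), cat=LpBinary)
    w = LpVariable.dicts("w", [(k, i) for i, A in enumerate(pool)
                               for k in A], cat=LpBinary)
    # (25): total best-known cost of the chosen product-group pairings
    m += lpSum(c[k, A] * w[k, i] for i, A in enumerate(pool) for k in A)
    for k in products:
        # (26): each product derives its intermediate from one group
        m += lpSum(w[k, i] for i, A in enumerate(pool) if k in A) == 1
    # (27): exactly n groups (intermediate batches) are selected
    m += lpSum(s.values()) == n
    for i, A in enumerate(pool):
        for k in A:
            # (28)-(29): a selected group serves all of its products
            m += w[k, i] <= s[i]
            m += w[k, i] >= s[i]
    return m, s, w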

Algorithm Enhancements (Augmentation and Bounding)

Initially, when we start with singletons and the first round of groupings, $\mathcal{A}$ contains at most $|\mathcal{K}| + n$ subsets. We note that the similarity-based grouping step attempts to heuristically exploit an observation about optimal intermediate compositions but is not guaranteed to lead to optimal intermediate batch groups. Furthermore, because of the nonconvexity of the problem, the configuration step likely produces solutions that are not globally optimal.
To address these issues, we iterate through the grouping, configuration, and selection steps to accomplish two goals: (1) improve our approximation of the optimal region of $\mathcal{P}(\mathcal{K})$ by augmenting $\mathcal{A}$; and (2) improve the global optimality of cost values by evaluating subgroup relationships. Item (1) occurs automatically within the formulation through the inclusion of constraints that require at least one new group not already in $\mathcal{A}$ to be generated, ensuring that at least one next-best product group is appended to $\mathcal{A}$ for subsequent steps. Item (2) requires more explanation, which we illustrate in the following example.
Splitting a group A into a partition $T_A$ allows each subgroup of the partition to have its own intermediate batch (and therefore more flexibility in the choice of mixture composition). Global optimality thus requires that

$$c_A \geq \sum_{A_i \in T_A} c_{A_i}. \qquad (32)$$

When we restrict a solution $s_A$ to the products in $A_i \subset A$ (by eliminating any $k \in A \setminus A_i$), it becomes feasible for any such subgroup. It is therefore advantageous to replace the solution of $A_i$ by the restricted solution of $A$ whenever the condition $c_A < c_{A_i}$ is detected for any subset of A. This check is performed for all groups in the solution pool $\mathcal{A}$ and is motivated by our knowledge that a known locally optimal solution of $A_i$ might be inferior (more costly) to a known solution of $A$ because of the effect of nonconvexity. The process ensures that we provide the best-known coefficients for the objective function in the selection step by exploiting the hierarchy in the problem, even if we have not converged to global optimality for every NLP calculation in the configuration step.
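A minimal Python sketch of this check follows, assuming cost maps each group to its best-known cost $c_A$ and per_product_cost maps (product, group) pairs to the $c_{kA}$ of Equation (24); both names are illustrative.

def tighten_costs(pool, cost, per_product_cost):
    # If a known solution of a superset A, restricted to a subgroup Ai
    # (summing its per-product costs), beats Ai's own best-known
    # solution, adopt the restricted cost for Ai.
    for A in pool:
        for Ai in pool:
            if Ai < A:  # Ai is a proper subset of A
                restricted = sum(per_product_cost[k, A] for k in Ai)
                if restricted < cost[Ai]:
                    cost[Ai] = restricted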
We further exploit this idea to augment $\mathcal{A}$ with all partitions of its members, because initializing all such partitions with the solution of their supersets is possible. For example,

$$\mathcal{A} = \{\{P_1\}, \{P_2\}, \{P_3\}, \{P_4\}, \{P_5\}, \{P_2, P_3\}, \{P_1, P_4, P_5\}\} \qquad (33)$$

can be augmented with $\{\{P_1, P_4\}, \{P_1, P_5\}, \{P_4, P_5\}\}$, with each new subgroup being initialized with the restricted portion of $s_{\{P_1, P_4, P_5\}}$ and $c_{\{P_1, P_4, P_5\}}$.
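A sketch of the corresponding enumeration in Python: the proper subsets of a group with at least two products, which together with the existing singletons cover every partition of the group.

from itertools import combinations

def subset_augmentation(group):
    # All proper subsets of `group` containing two or more products;
    # e.g., {P1, P4, P5} yields {P1, P4}, {P1, P5}, and {P4, P5}.
    g = sorted(group)
    return [frozenset(c) for r in range(2, len(g))
            for c in combinations(g, r)]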
We cannot guarantee global optimality for the configuration step because of the nonconvexity of the problem; therefore, $c_A$ can only be regarded as an upper bound on the production cost of a group A. Ideally, for the solution methodology to converge with some confidence of optimality, it would be helpful to identify lower bounds $\underline{c}_A$ for the costs of groups in $\mathcal{P}(\mathcal{K})$ for which we have not solved the associated NLPs. We illustrate this using the following example.
Consider a scenario in which we are solving a five-product, two-intermediate problem and have built the pool $\mathcal{A}$ defined in Equation (33). Such a pool would be constructed by the solution of the singleton step with one augmentation from the grouping step that produces $\{\{P_2, P_3\}, \{P_1, P_4, P_5\}\}$. We solve an NLP for each member of $\mathcal{A}$ to calculate its corresponding cost. Although we have not solved the NLP for the group $\{P_1, P_2, P_3\}$, we can estimate its lower bound as a consequence of Inequality (32):

$$\underline{c}_{\{P_1, P_2, P_3\}} := \max\bigl( c_{\{P_1\}} + c_{\{P_2\}} + c_{\{P_3\}},\ c_{\{P_1\}} + c_{\{P_2, P_3\}} \bigr).$$
We define $\underline{\mathcal{A}} = \mathcal{A} \cup \{\{P_1, P_2, P_3\}\}$ and the corresponding optimal cost (derived by calculating $T_{\underline{\mathcal{A}},n}$ in the selection step):

$$\underline{c}_{\underline{\mathcal{A}},n} = \sum_{A_i \in T_{\underline{\mathcal{A}},n}} \underline{c}_{A_i}.$$
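More generally, the tightest estimate implied by Inequality (32) is the maximum, over all ways to split a group, of the summed best-known costs of the parts. The following recursive Python sketch computes it, assuming known_cost maps frozensets to best-known costs and always contains all singletons; the enumeration is exponential and is intended only to illustrate the logic.

from itertools import combinations

def lower_bound(group, known_cost, memo=None):
    # Best sum of known costs over any recursive split of `group`,
    # per Inequality (32).
    S = frozenset(group)
    memo = {} if memo is None else memo
    if S in memo:
        return memo[S]
    best = known_cost.get(S, float("-inf"))
    items = sorted(S)
    if len(items) > 1:
        pivot, rest = items[0], items[1:]
        for r in range(len(rest)):  # proper subsets containing the pivot
            for comb in combinations(rest, r):
                U = frozenset((pivot,) + comb)
                best = max(best, lower_bound(U, known_cost, memo)
                           + lower_bound(S - U, known_cost, memo))
    memo[S] = best
    return best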

We could augment $\underline{\mathcal{A}}$ with all members of $\mathcal{P}(\mathcal{K})$ because the lower-bound estimate can always be calculated from singletons. For large problems, however, $\mathcal{P}(\mathcal{K})$ is prohibitively large; therefore, we instead augment based on building supersets of existing groups.
We have been careful to call this an estimate of the lower bound because nonconvexity could prevent us from accurately calculating the costs of each $A_i$; $c_{A_i}$ is itself only an upper bound. Therefore, a true lower bound cannot be guaranteed (although it continues to improve as the algorithm iterates). Given some desired optimality gap $\epsilon_n$, we use

$$(\bar{c}_{\mathcal{A},n} - \underline{c}_{\underline{\mathcal{A}},n}) / \underline{c}_{\underline{\mathcal{A}},n} \leq \epsilon_n \qquad (34)$$

as a heuristic stopping criterion, with the expectation that the algorithm might terminate prior to achieving true global optimality within $\epsilon_n$.


Algorithm Summary
Here, we summarize the steps in the algorithm.
1. Singleton step. Let the iteration count m = 0. Initialize $\mathcal{A}$ with all singletons, solve the singleton NLP configuration problems, and calculate the similarities $\sigma_{kl}$ for all pairs of singletons.
2. Grouping step. Let m = m + 1. Solve the grouping problem and append the candidate groups to $\mathcal{A}$, including partitions of each group. Also, append to $\mathcal{A}$ additional groups derived from combinations of members of $\mathcal{A}$.
3. Configuration step. In parallel, solve the independent configuration problem NLPs for the new groups of $\mathcal{A}$.
4. Selection step. Solve the set-covering problems for the upper bound $\bar{c}_{\mathcal{A},n}$ and the lower-bound estimate $\underline{c}_{\underline{\mathcal{A}},n}$, respectively.
5. Terminate the algorithm if Inequality (34) holds; otherwise, go to step 2.
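For orientation, the five steps can be arranged into a driver loop as in the following Python sketch; the four solve_* callbacks are hypothetical stand-ins for the NLP and MILP subproblems described above, and the stopping test is Inequality (34).

def optimize_portfolio(products, n, eps,
                       solve_singleton, solve_grouping,
                       solve_configuration, solve_selection):
    # Step 1: initialize the pool with all singletons and their costs.
    pool = {frozenset([k]) for k in products}
    cost = {A: solve_singleton(A) for A in pool}
    while True:
        # Step 2: the grouping step proposes candidate groups
        # (including partitions and superset-based augmentations).
        new_groups = solve_grouping(pool, cost, n) - pool
        pool |= new_groups
        # Step 3: configuration NLPs for the new groups (parallelizable).
        cost.update({A: solve_configuration(A) for A in new_groups})
        # Step 4: set-covering solves for upper and lower bounds.
        upper, lower = solve_selection(pool, cost, n)
        # Step 5: heuristic stopping criterion, Inequality (34).
        if (upper - lower) / lower <= eps:
            return pool, cost, upper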


Nats Esquejo received her Bachelor of Science degree in chemical engineering from the University of the Philippines in 1996 and is a section head in R&D at Procter & Gamble. She has extensive experience in both process and formulation design in the fabric and home care business and is working in the modelling and simulation group, focused on integration and optimization of models and application of big data techniques. She also conducts training and consults on design of experiments and process control techniques internally at P&G.
Kevin Miller received his bachelor's degree in chemistry from Xavier University in 1999. He is a principal researcher at Procter & Gamble working in fabric and home care modelling and simulation. He started in laundry product design for North American granules products and continued product design for Central Eastern European and Latin American granules. His current focus in modelling and simulation is on model integration and product optimization.
Kevin Norwood received his PhD in physical chemistry
from Iowa State University in 1990 and is a research fellow
in R&D at Procter & Gamble. He leads technical work to
create and apply modeling approaches to formulate products within the fabric and home care businesses. His current
work is focused on integration of models across disciplines.
He started with P&G in 1991 and has worked in analytical
science, technology, formulation, and modeling, where he
has spent the majority of his career.
Ivan Oliveira manages the Advanced Analytics and Optimization Services (AAOS) group at SAS, where he has
directed projects in operations research (OR) and optimization applications in a variety of industries. AAOS projects
deliver consulting expertise to SAS customers in the field
of OR, inventory optimization, revenue management and
price optimization, and related technologies. A sample of
AAOS projects includes optimal scheduling for ATM cash
replenishment, portfolio optimization in government, revenue management in various industries, retail inventory
replenishment and pricing, chemical mixture portfolio optimization in CPG, optimal assignment of delinquent loan
processing, and simulation for drug discovery. AAOS is
also engaged in internal SAS R&D projects, including optimization for data mining. He earned his BS in mechanical engineering at the University of Virginia and his MS and PhD in mechanical engineering at the Massachusetts Institute of Technology.
Rob Pratt has worked at SAS since 2000 and is a senior
manager in the Operations Research Department within
SAS R&Ds Advanced Analytics Division. He manages a
team of developers responsible for the optimization modeling language, network algorithms, and the decomposition
algorithm. He earned a BS in mathematics (with a second
major in English) from the University of Dayton, and both
an MS in mathematics and a PhD in operations research
from the University of North Carolina at Chapel Hill.
Ming Zhao is an assistant professor in the Department of Decision and Information Sciences at the University of Houston. He served as a senior operations research specialist in the Advanced Analytics and Optimization Services (AAOS) group at SAS. The projects he has worked on include chemical mixture portfolio optimization for P&G, operating room scheduling, renewable energy integration and power system operations, retail inventory replenishment and pricing, and optimization for data mining. He earned his PhD from the University at Buffalo in 2008 and was a postdoctoral researcher at IBM T.J. Watson Research Center, where he worked primarily on the unit commitment problem and supply chain management in the mining industry.
