
Generalized DEA model of fundamental analysis and its application to portfolio optimization


N.C.P. Edirisinghe *, X. Zhang
Department of Statistics, Operations, and Management Science, College of Business Administration,
University of Tennessee, Knoxville, TN 37996, USA
Available online 18 April 2007
Abstract
Fundamental analysis is used in asset selection for equity portfolio management. In this paper, a generalized data envelopment analysis (DEA) model is developed to analyze a firm's financial statements over time in order to determine a relative financial strength indicator (RFSI) that is predictive of firms' stock price returns. RFSI is based on maximizing the correlation between the DEA-based score of financial strength and the stock market performance. This maximization involves a difficult binary nonlinear program that requires iterative re-configuration of parameters of financial statements as inputs and outputs. We utilize a two-step heuristic algorithm that combines random sampling and local search optimization. The proposed approach is tested with 230 firms from various US technology-industries to determine optimized RFSI indicators for stock selection. Then, those selected stocks are used within portfolio optimization models to demonstrate the usefulness of the scheme for portfolio risk management.
© 2007 Elsevier B.V. All rights reserved.
JEL classification: C61; C67; G11
Keywords: Portfolio optimization; Fundamental analysis; Relative financial strength; Data envelopment analysis
0378-4266/$ - see front matter © 2007 Elsevier B.V. All rights reserved.
doi:10.1016/j.jbankfin.2007.04.008
* Corresponding author. Tel.: +1 865 974 1684; fax: +1 865 974 2490.
E-mail address: chanaka@utk.edu (N.C.P. Edirisinghe).
Available online at www.sciencedirect.com
Journal of Banking & Finance 31 (2007) 3311–3335
www.elsevier.com/locate/jbf
1. Introduction
Fundamental analysis (FA) is the process of evaluating a public firm for its investment-worthiness by looking at its business at the basic or fundamental financial level, see for example, Thomsett (1998). It involves examining a firm's financials and operations, especially sales, earnings, growth potential, assets, debt, management, products, and competition. FA may also include analyzing market behavior that stresses the study of underlying factors of supply and demand, see Doyle et al. (2003) and Piotroski (2000). The main goal is to enhance the ability to predict future security price movement and then use such predictions to design equity portfolios. On the other hand, technical analysis (TA) operates on the theory that market prices at any given point in time reflect all known factors affecting supply and demand, as well as a firm's relative financial strength. Thus, TA focuses on analyzing market prices themselves, rather than directly evaluating factors of fundamental strength or factors of supply and demand. Strategies based on TA generally utilize a series of calculations designed to detect when a price change is likely to occur so that an investor can manage market positions in the short-term, such as the case in highly leveraged derivative markets. In contrast, FA takes on a more long-term perspective in determining which firms are most likely to perform well in the future, based on their fundamental business strengths.
The work in this paper complements the approach of fundamental analysis. The objective of our research is to focus only on the publicly-available financial statements of a given firm and to use them to determine a measure of underlying business strength for the firm. In determining the underlying financial health of a company, the raw financial numbers of a firm do not provide the perspective required to differentiate between healthy and unhealthy stocks for investment. In other words, the context provided by a comparison of a given firm to its industry and to the market as a whole is essential. Therefore, the focus in this paper is not to evaluate a firm's business strength in isolation. Instead, a relative strength indicator is computed by comparing a given firm to many other firms which are in a similar business segment of the market, such as the industry to which the firm belongs. The central premise of this research is that market prices have factored in publicly-available information about the firm, but the future expectations of price performance are determined by the perceived business strength of the firm. Thus, this notion is consistent with the efficient market theory, where the price of a stock is assumed to reflect the knowledge and expectations of all investors since everyone has the same information about the stock. The aim of this paper is to provide a measurable (objective) metric of that knowledge that is highly correlated with stock price performance. Then, such a metric can be used as a proxy for gauging a firm's expected financial performance, and hence the firm's future stock price performance. In this sense, a company's financial statements (income statement, balance sheet and cash flow statement) become indispensable resources for investment decision making. It must also be stated that this research is not focused on determining if a stock is undervalued, overvalued, or trading at fair market value, nor does it focus on qualitative market factors that are internal or external to the firm.
Many quantitative models have been proposed in the literature for stock price prediction using financial statements; regression models and artificial neural network (ANN) models have been applied, see for instance, Kanas (2001), Quah and Srinivasan (1999), and Thanassoulis (1993). In Kanas (2001), historical financial data is used as inputs and stock price is used as output in an ANN model. There are also other approaches based on applying regression-based techniques using data from financial statements as explanatory variables to predict future cash flow or stock performance. Ou (1989) used logistic regression to estimate the probability of an earnings increase in a subsequent year. Graham et al. (2002) applied ordinary least squares regression to determine a firm's market value as a linear function of its earnings and book value.
The work in this paper is motivated by the basic approach of Edirisinghe and Zhang (2007), where financial statement data was used in a data envelopment analysis (DEA) model. However, that work assumed that the analyst is able to categorize financial data into separate inputs and outputs in a DEA model specification to compute a financial strength metric. With such an a priori model specification, although measures of high financial strength may result, they could display low correlations with market returns. Then, the analyst runs the risk of reaching the inevitable (false) conclusion that underlying financial strength is not factored into market returns, contrary to the efficient market hypothesis. In this paper, we generalize the basic DEA methodology so that input and output parameters are not fixed a priori; instead, they are determined via an optimization process formulated to maximize correlations between financial strength and market returns. This process results in a relative financial strength indicator (RFSI) that is highly predictive of stock returns.
To the best of our knowledge, a DEA-based financial strength metric for a firm has not been directly incorporated within fundamental analysis for stock investments. Data Envelopment Analysis (DEA) is commonly used to evaluate the relative efficiency of a number of Decision Making Units (DMUs). The basic DEA model in Charnes et al. (1978), called the CCR model, has led to several extensions, most notably the BCC model of Banker et al. (1984) and the additive model of Charnes et al. (1985). DEA models have been extensively used in performance appraisal in a wide range of applications including financial performance as well as non-financial performance measurement. In the financial applications of DEA methodology, one particularly appealing idea is to measure managerial efficiency of a company by using its financial statements. For example, using certain financial ratios as inputs and outputs, DEA is used to evaluate performance of banks (Yeh, 1996), CRAF participants (Bowlin, 2004), defense business segments (Bowlin, 1999), and credit unions (Pille and Paradi, 2002). DEA-based efficiencies and Sharpe ratios are compared to evaluate performance of different hedge funds, see Gregoriou et al. (2005). Alam and Robin (1998) compute relative technical efficiencies for firms in the airline industry and analyze their association with corresponding stock price returns. However, their work is based upon input and output variables that are generally non-financial in nature and they are typically not found in the publicly-available financial statements.
In all of the above approaches, the underlying DEA model is specified with a fixed (exogenous) set of input and output parameters to compute an efficiency score. Our work contrasts with the traditional DEA approach in that an input/output categorization is endogenously determined by a model that seeks the highest correlation between stock returns and the efficiency metric. Consequently, our approach leads to separating the best-performing companies from the poor-performing ones, i.e., stock screening, for consideration in equity portfolio management. Given that distribution parameters of stock returns are often subject to estimation error, screening mechanisms such as the proposed RFSI-based stock selection can yield better risk-reward characteristics in portfolio optimization. This is demonstrated by applying the RFSI-based approach to identify favorable stocks as candidates for stock portfolio optimization. The resulting portfolios are shown in this paper to have superior risk-reward performance.
The remainder of the paper is organized as follows. In Section 2, the mathematical formulation of a standard DEA model is first provided, along with various financial parameters from income statements and balance sheets to be used as inputs and outputs. The generalized DEA approach is then discussed. In Section 3, the RFSI determination problem is formulated as a correlation maximization model, and a two-stage (heuristic) solution scheme is developed for its solution. Portfolio selection criteria based on statistical tests of RFSI are covered in Section 4. Section 5 applies the methodology in a case study involving 230 firms from six US industries. Using actual data from 1996 to 2002, the RFSI approach is applied to identify firms to include within a mean-variance quadratic portfolio optimization model. The paper concludes with some remarks in Section 6.
2. DEA model
Data envelopment analysis (DEA) is a nonparametric method for measuring the relative efficiencies of a set of similar decision making units (DMUs) by relating their outputs to their inputs and categorizing the DMUs into managerially efficient and managerially inefficient. It originated from Farrell's (1957) work, which was later popularized by Charnes et al. (1978). The CCR ratio model in Charnes et al. (1978) seeks to optimize the ratio of a linear combination of outputs to a linear combination of inputs.
To explain the basic premise of a DEA model, let there be $J$ independent DMUs (firms) whose performance (or efficiency) must be evaluated relative to each other. One begins with a given set of input parameters (say, $M$) and a given set of output parameters (say, $N$) which are common to all $J$ firms. The relative (managerial) efficiency then measures how well a given firm (in the group of $J$ firms) converts its $M$ inputs to the $N$ outputs, which is computed as the ratio of a certain aggregated output measure to a certain aggregated input measure. Such aggregated input (and output) measures are computed by taking a non-negative linear combination of the $M$ inputs (and $N$ outputs). Following this idea, the input-oriented relative performance (strength or efficiency) $f_k$ of some firm $k$, $k = 1, \ldots, J$, is then defined as the maximized value of the latter ratio, determined over all possible aggregating multipliers such that no firm in the group will attain a relative performance measure greater than unity. The CCR model is formulated as follows:
$$
\begin{aligned}
f_k := \max_{u,v}\quad & \frac{\sum_{n=1}^{N} (xo)_{nk}\, v_{nk}}{\sum_{m=1}^{M} (xi)_{mk}\, u_{mk}} \\
\text{s.t.}\quad & \frac{\sum_{n=1}^{N} (xo)_{nj}\, v_{nk}}{\sum_{m=1}^{M} (xi)_{mj}\, u_{mk}} \le 1, \quad j = 1, \ldots, J, \\
& u_{mk},\, v_{nk} \ge 0, \quad m = 1, \ldots, M;\; n = 1, \ldots, N.
\end{aligned}
\tag{1}
$$
For firm $j$, the (measured) level of input parameter $m$ is $(xi)_{mj}$, $m = 1, \ldots, M$, while that of output parameter $n$ is $(xo)_{nj}$, $n = 1, \ldots, N$. The input and output non-negative multipliers for firm $k$ are denoted by the variables $u_{mk}$ and $v_{nk}$, respectively. The model in (1) yields the maximum achievable efficiency for firm $k$, denoted $f_k$, provided every other firm is also applying the same aggregating non-negative multipliers in computing their input to output conversion ratios. $f_k$ is termed the DEA efficiency score of firm $k$. An efficiency score of less than one indicates that it may be possible to decrease the level of input for the same level of output, while a score of 1 indicates the firm is DEA-efficient. By applying (1) to each firm independently, the respective (maximum) relative efficiency score for each firm is computed. The equivalent linear programming formulation of model (1) is, see Charnes et al. (1978),
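$$
\begin{aligned}
\hat{f}_k := \max_{u,v}\quad & \sum_{n=1}^{N} (xo)_{nk}\, v_{nk} \\
\text{s.t.}\quad & \sum_{m=1}^{M} (xi)_{mk}\, u_{mk} = 1, \\
& -\sum_{m=1}^{M} (xi)_{mj}\, u_{mk} + \sum_{n=1}^{N} (xo)_{nj}\, v_{nk} \le 0, \quad j = 1, \ldots, J, \\
& u_{mk},\, v_{nk} \ge 0, \quad m = 1, \ldots, M;\; n = 1, \ldots, N.
\end{aligned}
\tag{2}
$$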
It is easy to show that $\hat{f}_k = f_k$ holds under the non-negativity of the observed data. More precisely, if $(xi)_{mk} > 0$ for some $m = 1, \ldots, M$, then $\hat{f}_k = f_k$ holds. Conversely, suppose $(xi)_{mk} \le 0$ for all $m = 1, \ldots, M$. Then, the maximization in (1) is not well-defined, and (2) is infeasible, in which case we assign a performance strength of $\hat{f}_k = 0$. For detailed discussions on DEA models that involve negative inputs/outputs, see Lovell and Pastor (1995) and Portela et al. (2004), for instance. The issue of negative data stems from the fact that in the application in this paper, the input and output data come from financial statements. That is, it is possible that all input parameters for a given firm have non-positive values, depending on how the input parameters are chosen from financial statements. Such is the case if return on assets and return on equity are chosen as the only input parameters and if these two parameters are negative for a firm being evaluated.
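As a small illustration of the ratio model (1) and of the zero-score convention just described, the following sketch covers only the special case of one input and one output (M = N = 1), where the score reduces to a firm's output-to-input ratio divided by the best such ratio in the group. The function name and the data are hypothetical, and the general multi-parameter case would need a linear programming solver for (2), which is not shown here.

```python
def ccr_score_single(inputs, outputs, k):
    """Score firm k by the ratio model (1) when M = N = 1.

    Assumption for this sketch: every firm has a strictly positive output,
    and every firm other than k has a strictly positive input.
    """
    if inputs[k] <= 0:
        # Convention from the text: the evaluated firm has no positive input,
        # so the LP form is infeasible and the score is set to 0.
        return 0.0
    # With one input and one output only the multiplier ratio v/u matters,
    # and the constraints cap it at 1 / (best output-to-input ratio).
    best_ratio = max(o / i for i, o in zip(inputs, outputs))
    return (outputs[k] / inputs[k]) / best_ratio


# Hypothetical data for three firms (not taken from the paper).
inputs = [2.0, 4.0, 5.0]
outputs = [1.0, 3.0, 2.5]
scores = [ccr_score_single(inputs, outputs, k) for k in range(len(inputs))]
print(scores)

# Measuring the input in different units leaves the scores unchanged.
rescaled_inputs = [1000.0 * i for i in inputs]
rescaled = [ccr_score_single(rescaled_inputs, outputs, k) for k in range(len(inputs))]
print(rescaled)
```

The firm with the best ratio receives a score of 1, and the others are scored relative to it.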
The input/output parameters for the DEA framework are identified from financial statements of companies. A total of 18 financial parameters are used, either directly or computed, from the quarterly financial statements of a firm, as presented in Table 1. These parameters examine a firm's fundamental performance through a range of performance perspectives: profitability, asset utilization, liquidity, leverage, valuation, and growth perspectives.
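To make the notion of a computed financial parameter concrete, the sketch below derives a handful of the Table 1 ratios from raw statement figures, following the descriptions given in that table. The class, its field names, and the numbers are hypothetical placeholders; they are not the authors' data layout.

```python
from dataclasses import dataclass


@dataclass
class QuarterlyStatement:
    # Hypothetical field names for one firm-quarter.
    net_income: float
    revenue: float
    total_assets: float
    current_assets: float
    current_liabilities: float
    inventory: float
    shareholders_equity: float


def financial_parameters(s):
    """Compute a subset of the Table 1 parameters from one statement."""
    return {
        "return_on_equity": s.net_income / s.shareholders_equity,
        "return_on_assets": s.net_income / s.total_assets,
        "net_profit_margin": s.net_income / s.revenue,
        "asset_turnover": s.revenue / s.total_assets,
        "current_ratio": s.current_assets / s.current_liabilities,
        "quick_ratio": (s.current_assets - s.inventory) / s.current_liabilities,
        "leverage_ratio": s.total_assets / s.shareholders_equity,
    }


def growth_rate(current, previous):
    """Growth parameters: current quarter divided by previous quarter, minus one."""
    return current / previous - 1.0


# Made-up figures for illustration only.
example = QuarterlyStatement(
    net_income=50.0, revenue=500.0, total_assets=1000.0,
    current_assets=300.0, current_liabilities=150.0,
    inventory=60.0, shareholders_equity=400.0,
)
params = financial_parameters(example)
revenue_growth = growth_rate(500.0, 400.0)
print(params, revenue_growth)
```

Several of these ratios can turn negative when net income is negative, which is the source of the negative-data issue discussed above.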
2.1. The generalized DEA approach
It is important to observe that, in order to apply the DEA model in (2), the $M$ input parameters and $N$ output parameters are required to be explicitly identified a priori. While this may be possible in certain applications (such as production) where input to output conversion mechanisms are well-understood, our case is different. We must select a set of input and output parameters from the universe of 18 financial parameters describing a firm's financial health (see Table 1). The objective of such a selection is that the resulting relative DEA performance score of a firm can be interpreted as providing a measure of its underlying financial strength. Such financial strength measures are required to be strongly correlated with the market price process, under the efficient market hypothesis. If the inputs and outputs for the DEA model are chosen exogenously (a priori), the resulting DEA performance scores for firms may not be representative of the fundamental financial strengths that are rewarded by the financial markets. The generalized DEA approach (GDEA) developed in this paper leaves the selection of inputs and outputs as flexible as possible in the sense that a proper selection of the latter is sought iteratively to maximize the correlation of the DEA-based strength evaluation and the stock market performance. This process is best-explained in Fig. 1.
Table 1
Financial parameters used for fundamental analysis
i | Parameter | Description | Perspective
1 | Return on equity | Net income generated per unit of common shareholders' equity | Profitability
2 | Return on assets | Net income divided by the total assets | Profitability
3 | Net profit margin | Net income a firm makes for every $1 it generates in revenue | Profitability
4 | Receivables turnover | Revenues for the period divided by receivables | Asset utilization
5 | Inventory turnover | Revenues for the period divided by inventories | Asset utilization
6 | Asset turnover | Revenue generated per dollar of assets a firm owns | Asset utilization
7 | Current ratio | Total current assets divided by total current liabilities | Liquidity
8 | Quick ratio | Total current assets minus inventory divided by total current liabilities | Liquidity
9 | Debt to equity ratio | Long-term debt divided by shareholders' equity | Liquidity
10 | Leverage ratio | Total assets divided by shareholders' equity | Leverage
11 | Solvency ratio-I | Total liability divided by total assets | Leverage
12 | Solvency ratio-II | Total liability divided by shareholders' equity | Leverage
13 | Price to earnings (PE) ratio | Stock price divided by net income per share | Valuation
14 | Price to book ratio | Stock price divided by shareholders' equity per common share | Valuation
15 | Earnings per share (EPS) | Net income minus dividends divided by common shares | Profitability
16 | Revenue growth rate | Current quarter's revenue divided by the previous quarter's revenue minus one | Growth
17 | Net income growth rate | Current quarter's net income divided by the previous quarter's net income minus one | Growth
18 | Earnings per share growth rate | Current quarter's EPS divided by the previous quarter's EPS minus one | Growth
To illustrate the GDEA approach in Fig. 1, consider the universe of $I$ (= 18) parameters that are potential inputs and outputs. Suppose a given parameter $i$ may be used as an input and/or output, or not used at all. Furthermore, suppose the level at which a parameter must be specified in the DEA model in (2) is treated as unknown. Consequently, for a parameter $i$ with an observed value $x_{ij}$ for firm $j$, the level at which it enters the model as an input is denoted by $y_i x_{ij}$, where the input scaling variable $y_i \ge 0$. Similarly, the level at which the parameter $i$ enters as an output for firm $j$ is $z_i x_{ij}$ and the output scaling variable $z_i \ge 0$. Collecting the $y_i$ and $z_i$ components for all parameters, we define an input scaling parameter vector by $y \in \mathbb{R}^I$ and an output scaling vector by $z \in \mathbb{R}^I$. An appropriate selection of values for the pair $(y, z) \in \mathbb{R}^{2I}$ is not a firm-specific issue. Rather, it must be chosen as a property of the industry, i.e., group of firms, so that relative performance scores of firms can be compared to each other within the same industry. More importantly, such a performance score must represent the fundamental financial health of a firm that is predictive of (or highly correlated with) the stock price action. Therefore, the vector $(y, z)$ is to be held fixed when computing efficiencies of all $J$ firms in the group. Under the scaling vector parametrization $(y, z)$, the resulting DEA model is
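$$
\begin{aligned}
g_k(y, z) := \max_{u,v}\quad & \frac{\sum_{i=1}^{I} z_i\, x_{ik}\, v_{ik}}{\sum_{i=1}^{I} y_i\, x_{ik}\, u_{ik}} \\
\text{s.t.}\quad & \frac{\sum_{i=1}^{I} z_i\, x_{ij}\, v_{ik}}{\sum_{i=1}^{I} y_i\, x_{ij}\, u_{ik}} \le 1, \quad j = 1, \ldots, J, \\
& u_{ik},\, v_{ik} \ge 0, \quad i = 1, \ldots, I,
\end{aligned}
\tag{3}
$$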
where $y$ is chosen such that $\sum_{i=1}^{I} y_i > 0$. $g_k(y, z)$ is simply referred to as the (financial) performance score of firm $k$ corresponding to the input/output scaling vector pair $(y, z)$. The following equivalent linear programming model can be used to compute $g_k(y, z)$.
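$$
\begin{aligned}
g_k(y, z) := \max_{u,v}\quad & \sum_{i=1}^{I} z_i\, x_{ik}\, v_{ik} \\
\text{s.t.}\quad & \sum_{i=1}^{I} y_i\, x_{ik}\, u_{ik} = 1, \\
& -\sum_{i=1}^{I} y_i\, x_{ij}\, u_{ik} + \sum_{i=1}^{I} z_i\, x_{ij}\, v_{ik} \le 0, \quad j = 1, \ldots, J, \\
& u_{ik},\, v_{ik} \ge 0, \quad i = 1, \ldots, I.
\end{aligned}
\tag{4}
$$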
In DEA, the issue of setting a given parameter in both the input and output sets simultaneously has been addressed in, for instance, Beasley (1995) and Cook et al. (2006). In the case of a CCR model, when a parameter is used both in inputs and outputs, the resulting DEA efficiency is 1 for each firm. That is,
Proposition 2.1. For some parameter $i \in \{1, \ldots, I\}$, let $y_i > 0$ and $z_i > 0$. For a firm $k$ being evaluated, suppose the measured value of parameter $i$ satisfies $x_{ik} > 0$. Then, $g_k(y, z) = 1$ holds.
Fig. 1. Schematic of the generalized DEA approach. [Flowchart: Set inputs & outputs fixed → Run DEA model (for each firm) → Strength metric for each firm → Market correlation with strength → Market correlation maximized? If yes, STOP; if no, re-categorize inputs/outputs and repeat.]
Proof. This can be shown similarly to Lemma 1 and Theorem 1 in Cook et al. (2006). □
Therefore, if $y_i > 0$ and $z_i > 0$ for a parameter $i$ with data $x_{ij} > 0$ for all firms $j$, the DEA-based strength score is 1 for all firms. For example, when the parameter $i$ is the current ratio (see Table 1), the data is always positive for all firms. In such a case, correlation between the computed financial strength score and the stock market performance is zero, and thus, such a choice of $(y, z)$ will not maximize the desired strength-market correlation, see Fig. 1. Consequently, to reduce the search space for $(y, z)$ in correlation maximization, we set $y_i z_i = 0$ for all $i = 1, \ldots, I$. This prohibits a given financial parameter $i$ from being in the inputs and outputs simultaneously.
Definition 2.2. A given vector-pair $(y, z)$ is said to satisfy the complementarity condition if and only if $y_i z_i = 0$ for all $i = 1, \ldots, I$. In this case, such a pair is simply referred to as a complementary pair $(y, z)$.
Therefore, a complementary pair $(y, z)$ allows the categorization of the universe of $I$ parameters as distinct inputs and outputs. In contrast, Cook et al. (2006) introduced the notion of flexible measures whereby a new parameter can be considered in the presence of existing input/output sets. Their model then determines if this new parameter should be an input or an output in order to improve the (maximized) DEA efficiency. In our case, the objective is to have the highest correlation between DEA efficiencies and the stock market returns. Therefore, we take a different approach that allows the complementary vector pairs $(y, z)$ to play the role of flexible measures in a more generalized setting. For this purpose, the domain of $(y, z)$ must be appropriately chosen to force which parameters should never (or must) be in inputs/outputs.
Another important property of the CCR model is its unit-invariance. That is, the DEA-efficiency computed by (1) is independent of the units in which the input and output parameters are measured, see Ray (2004, pp. 106–107) and Lovell and Pastor (1995). Stated in our context,
Proposition 2.3. $g_k(y, z)$ is positively homogeneous of degree 0 in $(y, z)$ jointly and separately.
Proof. Follows directly from the proof of unit invariance of the CCR-DEA model, see, for instance, Ray (2004). □
The main implication of Proposition 2.3 is that it restricts the domain of feasible complementary vector pairs $(y, z)$ to a binary space. Thus, along with the complementarity condition in Definition 2.2, the feasible domain of the scaling vectors $(y, z)$ in (4) must satisfy
$$
y_i z_i = 0, \quad y_i, z_i \in \{0, 1\}, \quad i = 1, \ldots, I. \tag{5}
$$
An equivalent linear transformation of (5), along with the condition that $\sum_i y_i > 0$, yields the following Binary Complementary Domain (BCD), denoted by $X$, for the feasible choices for $(y, z)$.
$$
\text{BCD:}\quad X := \left\{ (y, z) : \sum_{i=1}^{I} y_i \ge 1,\; y_i + z_i \le 1,\; y_i, z_i \in \{0, 1\},\; i = 1, \ldots, I \right\}. \tag{6}
$$
Accordingly, for every firm $k$ in the group (i.e., industry), the corresponding financial performance score $g_k(y, z)$ is determined by the model in (4) for a specified binary complementary vector pair $(y, z) \in X$. The goal is to search for $(y, z) \in X$ such that the performance score so-computed would be a suitable metric of the underlying financial strength of a given firm, relative to all firms in the group.
When the model in (4) is specified using parameters under the BCD condition in (6) that requires choosing $(y, z) \in X$, it is herein referred to as GDEA under unrestricted BCD, or simply, unrestricted GDEA version. On the other hand, we also consider a certain restriction on the binary complementary domain, based on a practical interpretation of the financial perspectives in Table 1, as given next.
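The conditions defining the BCD in (6) can be checked mechanically. The sketch below tests membership and counts the domain's elements; all function names are hypothetical. The closed-form count is an observation added here, not a result quoted from the paper: each parameter is an input, an output, or unused (three choices), and the assignments with no input at all are excluded.

```python
from itertools import product


def in_bcd(y, z):
    """Return True if (y, z) satisfies the BCD conditions in (6)."""
    if len(y) != len(z):
        return False
    binary = all(v in (0, 1) for v in list(y) + list(z))
    complementary = all(yi + zi <= 1 for yi, zi in zip(y, z))
    return binary and complementary and sum(y) >= 1


def bcd_size(num_params):
    """Number of pairs in the BCD: 3 roles per parameter, minus the
    2**num_params assignments that contain no input."""
    return 3 ** num_params - 2 ** num_params


def enumerate_bcd(num_params):
    """Brute-force enumeration; practical only for small num_params."""
    for y in product((0, 1), repeat=num_params):
        for z in product((0, 1), repeat=num_params):
            if in_bcd(y, z):
                yield y, z


small_count = sum(1 for _ in enumerate_bcd(3))
print(small_count, bcd_size(3), bcd_size(18))
```

For I = 18 the count runs to several hundred million pairs, each requiring J linear programs per period, which gives a sense of why exhaustive search over the domain is unattractive.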
2.2. Restricted BCD
The parameters of asset utilization, liquidity, and leverage perspectives can generally be interpreted as inputs because activities that are measured by these parameters depend on the planning and operational strategies of a firm. On the other hand, the parameters of profitability and growth perspectives are generally considered as outputs because revenue/income generation is a major objective criterion for a firm. The valuation parameters measure how well the equity markets perceive success of a firm, and they are generally not concerned with a firm's input strategy. Accordingly, in a restricted GDEA approach, input parameters are only chosen from the perspectives of asset utilization, liquidity, and leverage, while the output parameters are chosen only from the profitability, growth, and valuation perspectives. This leads to the following restricted binary complementary domain,
$$
\text{Restricted BCD:}\quad X^{*} := \left\{ (y, z) \in X : \sum_{i=1}^{3} y_i + \sum_{i=13}^{18} y_i = 0,\; \sum_{i=4}^{12} z_i = 0 \right\}. \tag{7}
$$
Performance of the unrestricted and restricted GDEA versions will be compared within portfolio optimization using the application reported in Section 5. In the sequel, we will also compare results with a fixed exogenous input/output categorization, referred to as a base categorization and denoted by $(y^0, z^0)$,
$$
\begin{aligned}
y^0 &:= \{0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0\}, \\
z^0 &:= \{1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1\}.
\end{aligned}
\tag{8}
$$
Thus, the base categorization is completely determined by the interpretation of financial parameters, rather than their usefulness in determining a financial performance score. Note that $(y^0, z^0) \in X^{*}$ and this base categorization consists of all 18 financial parameters given in Table 1. The interest in $(y^0, z^0)$ is merely for comparison with $(y, z)$ in $X$ (or $X^{*}$) that might better represent the underlying financial strength of a firm.
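The restriction in (7) and the base categorization in (8) can be verified in a few lines. This is only a sketch with hypothetical names; the index ranges follow the numbering of Table 1, shifted to Python's 0-based lists.

```python
# Base categorization from (8), as 0/1 lists over the 18 parameters.
BASE_Y = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
BASE_Z = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]


def in_bcd(y, z):
    """Conditions of (6): binary, complementary, at least one input."""
    binary = all(v in (0, 1) for v in list(y) + list(z))
    complementary = all(yi + zi <= 1 for yi, zi in zip(y, z))
    return len(y) == len(z) and binary and complementary and sum(y) >= 1


def in_restricted_bcd(y, z):
    """Conditions of (7) for the 18 parameters of Table 1."""
    if len(y) != 18 or len(z) != 18 or not in_bcd(y, z):
        return False
    # Parameters 1-3 and 13-18 may not be inputs (list positions 0-2, 12-17).
    forbidden_inputs = sum(y[0:3]) + sum(y[12:18])
    # Parameters 4-12 may not be outputs (list positions 3-11).
    forbidden_outputs = sum(z[3:12])
    return forbidden_inputs == 0 and forbidden_outputs == 0


# A variant that treats return on equity (parameter 1) as an input instead.
variant_y = [1] + BASE_Y[1:]
variant_z = [0] + BASE_Z[1:]

print(in_restricted_bcd(BASE_Y, BASE_Z))
print(in_bcd(variant_y, variant_z), in_restricted_bcd(variant_y, variant_z))
```

The variant remains a valid complementary pair under (6) but falls outside the restricted domain, which is the distinction between the unrestricted and restricted GDEA versions.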
3. Relative financial strength indicator (RFSI)
The process of determining an RFSI requires, first, determining a correlation metric for the DEA-based performance scores and the stock price returns, for the industry as a whole, for a given vector pair $(y, z)$, and second, designing a suitable iterative procedure to choose $(y, z) \in X$ (or $X^{*}$) in an attempt to maximize the latter correlation metric (see Fig. 1).
Let the DEA-based performance score for a firm $k$ in a given industry be determined according to the model in (4) as $g_k(y, z)$, for a specified categorization $(y, z) \in X$. The RFSI is developed here for the unrestricted $X$; for the restricted version of RFSI, $X$ is simply replaced with $X^{*}$. Computing the model in (4) requires the realized values $x_{ij}$ of all financial parameters for all firms. The future value of a parameter $i$ for firm $j$ is a random variable, denoted by $X_{ij}$. The collection of random variables $X_{ij}$ for $i = 1, \ldots, I = 18$ and $j = 1, \ldots, J$ is $X$. Realizations of $X_{ij}$ are observed as $x_{ij}$ in (published) financial statements of a given period (i.e., quarter). For a future period $t$ of uncertain financial performance, the collection of random variables is the vector $X^t := \{X^t_{ij} : \forall i, \forall j\}$. Then, the DEA-based relative financial efficiency for the industry is represented by the collection of random variables $g^t(y, z) := \{g_j(y, z; X^t) : j = 1, \ldots, J\}$. Once the period $t$ financial statements are observed, with $X^t$ realized as $x^t$, the random vector $g^t(y, z)$ is realized as the vector of values $\{g_j(y, z; x^t) : j = 1, \ldots, J\}$.
Let $R^t_j$ denote the stock price rate of return (RoR) random variable (for future period $t$) of firm $j$, and those for all firms are represented by the random $J$-vector $R^t := \{R^t_j : j = 1, \ldots, J\}$. The observed realization of period $t$ RoR is the vector $r^t := \{r^t_j : j = 1, \ldots, J\}$. Consider the pairwise correlations between the two random vectors $g^t(y, z)$ and $R^t$, denoted by the correlation vector $C^t(y, z) \in \mathbb{R}^J$. Its $j$th component, for firm $j$, is given by
$$
C^t_j(y, z) := \mathrm{Corr}\{ g_j(y, z; X^t),\; R^t_j \}, \tag{9}
$$
where $j = 1, \ldots, J$. The correlation vector $C^t(y, z)$ is, therefore, a measure of the predictive power of the DEA-based (financial) efficiency metric on stock price returns for the chosen industry. Indeed, a positive and significant correlation vector $C^t(y, z)$ implies that the DEA-based efficiency score is a valuable proxy of the stock market performance of the industry. Observe that $C^t(y, z)$ for period $t$ depends on the chosen binary complementary vector $(y, z) \in X$. The best industry correlation is thus obtained when one searches for $(y, z) \in X$ such that an appropriate metric of the vector $C^t(y, z)$ is maximized. Vector norms cannot be used as appropriate metrics here because the goal is to seek positive (and large) correlations across all firms in the industry. While more complicated formulae are possible, we use the simple average statistic
$$
\bar{C}^t(y, z) := \frac{1}{J} \sum_{j=1}^{J} C^t_j(y, z), \tag{10}
$$
herein termed the industry correlation metric, to search for the highest positive correlations industry-wide. Note that the correlation vector $C^t(y, z)$ is unknown for the future period $t$, and thus, it must be forecasted. To forecast $C^t(y, z)$, we use the historical (observed) sample $x^s$, $s = 1, \ldots, t - 1$. Using a history length of $t_0$ periods, $C^t_j(y, z)$ is estimated by the sample correlation coefficient, given by
$$
c^t_j(y, z) := \text{Correlation coefficient between } \{ g_j(y, z; x^s) \}_{s=t-t_0}^{t-1} \text{ and } \{ r^s_j \}_{s=t-t_0}^{t-1}. \tag{11}
$$
Then, the industry correlation metric $\bar{C}^t(y, z)$ in (10) is estimated as
$$
\bar{c}^t(y, z) := \frac{1}{J} \sum_{j=1}^{J} c^t_j(y, z). \tag{12}
$$
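The estimators in (11) and (12) amount to a per-firm sample correlation followed by a plain average across firms. A minimal sketch is given below, assuming the DEA scores for the history window have already been computed; the function names and numbers are hypothetical. Returning 0 for a constant series is an assumption of this sketch, chosen to match the earlier remark that a score of 1 for all observations yields zero correlation.

```python
from math import sqrt


def sample_correlation(a, b):
    """Sample (Pearson) correlation coefficient between two series."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Assumption of this sketch: a constant series carries no
        # information, so its correlation is reported as 0.
        return 0.0
    return cov / sqrt(var_a * var_b)


def industry_correlation(scores, returns):
    """Average of per-firm correlations, as in (12).

    scores[j] and returns[j] hold firm j's DEA scores and stock returns
    over the same t0 historical periods.
    """
    per_firm = [sample_correlation(g, r) for g, r in zip(scores, returns)]
    return sum(per_firm) / len(per_firm)


# Made-up histories for two firms over four periods.
scores = [[0.5, 0.6, 0.7, 0.8], [1.0, 1.0, 1.0, 1.0]]
returns = [[0.01, 0.02, 0.03, 0.04], [0.05, -0.02, 0.01, 0.03]]
metric = industry_correlation(scores, returns)
print(metric)
```

Because the metric is a signed average rather than a norm, negative firm-level correlations pull it down, in line with the stated goal of seeking positive correlations across all firms.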
Observe that the statistic $\bar{c}^t(y, z)$ for period $t$ depends on the chosen binary complementary vector $(y, z) \in X$. The best industry-correlation metric is thus obtained when one searches for $(y, z) \in X$ such that $\bar{c}^t(y, z)$ is maximized, i.e., solve the industry-correlation maximization model
$$
\text{CORMAX:}\quad \max_{y,z}\; \bar{c}^t(y, z) \quad \text{s.t.}\quad (y, z) \in X. \tag{13}
$$
Let an optimal binary complementary pair solving the above maximization be denoted by $(y^{*}, z^{*})$. Note that dependence of this pair on the period index $t$ is suppressed. The corresponding industry correlation metric, $\bar{C}^t(y^{*}, z^{*})$ in (10), is required to be statistically significant, for if not, the use of the DEA-based financial strength indicator for the given industry cannot be validated for investment decision making. Statistical tests for this purpose are discussed in Section 4.1. When this industry correlation metric is verified to be statistically significant, the Relative Financial Strength Indicator (RFSI) for a given firm in the industry is defined as follows.
Definition 3.1. Suppose $\bar{C}^t(y^{*}, z^{*})$ is statistically significant for a given industry, where $(y^{*}, z^{*})$ is an optimal solution of (13). Then, the Relative Financial Strength Indicator (RFSI) of firm $j$ for (a future) period $t$, given the observed financial statement data $x^s$ for $t - t_0 \le s \le t - 1$ for the industry, is defined by
$$
\mathrm{RFSI}(t, j) := E\left[ g_j(y^{*}, z^{*}; X^t) \mid g_j(y^{*}, z^{*}; x^{t-t_0}), \ldots, g_j(y^{*}, z^{*}; x^{t-1}) \right], \tag{14}
$$
where $g_j(y^{*}, z^{*}; x^s)$ is computed according to the DEA model in (4) for the input/output categorization $(y^{*}, z^{*})$, and $E[\,\cdot\,]$ denotes the expectation operator.
To simplify the computation of RFSI, the expectation in (14) is estimated by the simple moving average forecast (of $\hat{t}$ periods, $\hat{t} \le t_0$), given as
$$\mathrm{RFSI}(t, j; \hat{t}) = \frac{1}{\hat{t}} \sum_{s=t-\hat{t}}^{t-1} g_j(y^*, z^*; x^s). \tag{15}$$
$\mathrm{RFSI}(t, j; \hat{t})$ is bounded within 0 and 1, where a value of unity indicates the highest possible relative financial strength indicator for firm j, relative to the industry concerned. Also note that a single input/output categorization $(y^*, z^*)$ of the 18 financial parameters in Table 1 is used in computing the RFSI for all firms in the industry, for the future period t. For future periods beyond t, it may be necessary to adapt RFSI to new financial statement observations, by re-solving (13) for a revised optimal input/output categorization.
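The moving-average estimate in (15) is simple enough to sketch directly. The snippet below is a minimal illustration, assuming the per-period DEA scores of one firm are already available as a list ordered from oldest to most recent; the function name and the sample numbers are hypothetical.

```python
def rfsi_moving_average(dea_scores, t_hat):
    """Moving-average forecast in the spirit of (15): mean of the last t_hat scores."""
    if not 1 <= t_hat <= len(dea_scores):
        raise ValueError("t_hat must be between 1 and the history length")
    window = dea_scores[-t_hat:]          # the t_hat most recent periods
    return sum(window) / t_hat

# Hypothetical score history for one firm, oldest period first.
history = [0.40, 0.55, 0.70, 0.65, 0.80, 0.85]
rfsi = rfsi_moving_average(history, t_hat=4)   # averages the last four scores
```

Because each DEA score lies between 0 and 1, the average does as well, which is the boundedness property noted above.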
3.1. Two-step heuristic solution method
The CORMAX model in (13) is a difficult optimization problem because evaluation of the objective function (statistic) $c^t(y, z)$ in (12) requires the solution of a sequence of linear optimization models (4) so that each of the sample correlation coefficients $c^t_j(y, z)$ in (11) can be computed. Therefore, the objective function in (13) cannot be explicitly written in closed form, nor can it be verified to be concave (or pseudo-concave) in the 2I-dimensional decision variable-vector (y, z). Nonconvex optimization is known to be computationally tedious, see for instance, Horst et al. (1995). Moreover, X is a binary solution space, i.e., (13) is a binary nonconvex optimization model. Global optimality conditions for discrete nonconvex optimization have been studied, e.g. see Larsson and Patriksson (2005). However, efficient methods are available only for specially structured problems and/or without integer restrictions, e.g. see Tawarmalani and Sahinidis (2004) and Zhang et al. (1999).
Alternatively, we employ an efficient heuristic solution scheme. The method is a two-step procedure, which is based on, first, sampling a set of initial (y, z) points from the feasible domain X, and then, performing a local search optimization in X for each of those initial sample points. Consider a random sample of (vector) points $x^s := (y^s, z^s) \in X \subset \mathbb{R}^{2I}$, for $s \in S$, where S denotes the index set of the sample points. For each sample point, the objective criterion is calculated and the sample of industry correlation metric values
$$\{c^t(y^s, z^s) : s \in S\} \tag{16}$$
is collected. Then, each sample value $c^t(y^s, z^s)$ is improved to a locally optimal value by employing a non-gradient based local search procedure, starting from the point $x^s \in X$. The corresponding local optimum is denoted by $\tilde{x}^s := (\tilde{y}^s, \tilde{z}^s)$. Then, an approximation for the optimal input/output categorization for the industry is determined by
$$(y^*, z^*) \approx \arg\max_{s \in S} \{c^t(\tilde{y}^s, \tilde{z}^s)\}. \tag{17}$$
The following procedure is applied to generate a (random) sample point $x^s \in X$: randomly draw a set of 2I values from a continuous uniform distribution in $[-1, 1]$. The first I values are collected to form the I-vector $a^s$. The last I values are collected to form the I-vector $b^s$. Then, the sample point $x^s = (y^s, z^s)$ is defined by the solution of the binary linear program
$$(y^s, z^s) := \arg\max_{y,z} \{ (a^s)' y + (b^s)' z : (y, z) \in X \}, \tag{18}$$
where a prime denotes the transposition of a vector. This process is then repeated for each $s \in S$.
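The sampling step can be sketched as follows. The exact definition of the domain X in (6) is not reproduced in this section, so the sketch assumes the simplest complementary form, namely binary vectors with $y_i + z_i \le 1$ for every parameter i; any further requirement on X (for example, a minimum number of inputs or outputs) would need an extra check. Under that assumption the binary linear program (18) separates by coordinate and can be solved by inspection. The function name is hypothetical.

```python
import random

def sample_categorization(num_params, rng):
    """Draw one random point (y, z) by solving (18) coordinate by coordinate.

    Assumption: X = {(y, z) binary : y_i + z_i <= 1 for all i}. The linear
    objective then separates, and for each i the best choice is the larger of
    the two weights if it is positive, or (0, 0) if both are non-positive.
    """
    a = [rng.uniform(-1.0, 1.0) for _ in range(num_params)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(num_params)]
    y, z = [], []
    for a_i, b_i in zip(a, b):
        if a_i <= 0 and b_i <= 0:
            y.append(0)
            z.append(0)
        elif a_i >= b_i:
            y.append(1)
            z.append(0)
        else:
            y.append(0)
            z.append(1)
    return y, z

rng = random.Random(0)                    # fixed seed so the draw is repeatable
y, z = sample_categorization(18, rng)     # 18 financial parameters
```

Drawing the weights from $[-1, 1]$ rather than $[0, 1]$ is what allows a parameter to be left out of both categories in the sampled point.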
3.1.1. Local search
The non-gradient-based local search procedure is a modification of the Hooke and Jeeves (HJ) method, see Bazaraa et al. (1993). Given a current solution $x^p$, at some iteration p of the local search procedure, the original HJ method performs an exploratory search along the coordinate directions. Coordinate directions that improve the objective function are used to define a new iterate. The direction to the new iterate from the starting solution $x^p$ is used to perform a pattern search. This method is adapted to the binary model in (13).
For some candidate $x^p \in X$, the ith coordinate $x^p_i$ is either 0 or 1. If $x^p_i = 0$, then an exploratory move is allowed only in the positive ($x_i$) coordinate direction. If $x^p_i = 1$, then an exploratory move is allowed only in the negative ($x_i$) coordinate direction. Once a new iterate $\hat{x}^p$ is so determined, a pattern (line) search is not necessary in our case since the search point $x^p + \lambda(\hat{x}^p - x^p) \notin X$ for $\lambda \notin \{0, 1\}$. The resulting algorithmic steps are as follows:
Algorithm-LS
Initialization: Given $x^s = (y^s, z^s) \in X$, see (18), determine $c^t(x^s)$.
    Set $p = 1$, $f(p) = c^t(x^s)$, and $x(p) = x^s$.
Step 1: For $i = 1, \ldots, 2I$ and denoting the ith elementary coordinate direction by $e_i$, let
    $x^i := x(p) + e_i$ if $x(p) + e_i \in X$ and $f(p) < c^t(x(p) + e_i)$;
    else, $x^i := x(p) - e_i$ if $x(p) - e_i \in X$ and $f(p) < c^t(x(p) - e_i)$;
    else, $x^i := x(p)$.
    Compute $x^{2I+1}$ by the XOR (exclusive or, or "not equal to") operation:
    $x^{2I+1} := x^1 \text{ xor } x^2 \text{ xor } \ldots \text{ xor } x^{2I}$.
    If $x^{2I+1} \notin X$, set $x^{2I+1} = x(p)$.
Step 2: Let $x(p+1) := \arg\max \{c^t(x^i) : i = 1, \ldots, 2I+1\}$.
    If $c^t(x(p+1)) = c^t(x(p))$: Terminate the local search and set $\tilde{x}^s = x(p)$.
    Else, if $c^t(x(p+1)) > c^t(x(p))$, let $f(p+1) = c^t(x(p+1))$, set $p \leftarrow p + 1$ and go to Step 1.
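The steps of Algorithm-LS can be sketched in Python as below. This is an illustrative reading, not the authors' implementation: the objective is passed in as a generic callable standing in for $c^t$ (which in practice requires solving the DEA models), the feasibility test for X is likewise a user-supplied callable, and the combined candidate is taken to be the current point with every improving coordinate flipped, which is one interpretation of the XOR step. All names are hypothetical.

```python
def local_search(x0, objective, feasible):
    """Binary coordinate-flip local search in the spirit of Algorithm-LS.

    x0        : starting 0/1 list (a sampled point)
    objective : callable mapping a 0/1 list to a number (stand-in for c^t)
    feasible  : callable returning True when a 0/1 list lies in X
    """
    current = list(x0)
    f_current = objective(current)
    while True:
        candidates = []
        improving = []
        for i in range(len(current)):
            trial = list(current)
            trial[i] = 1 - trial[i]       # the only admissible move for a binary coordinate
            if feasible(trial) and objective(trial) > f_current:
                candidates.append(trial)
                improving.append(i)
        # Combined move (assumed reading of the XOR step): flip all improving
        # coordinates at once; it is skipped when the result is infeasible.
        combined = list(current)
        for i in improving:
            combined[i] = 1 - combined[i]
        if feasible(combined):
            candidates.append(combined)
        if not candidates:
            return current, f_current
        best = max(candidates, key=objective)
        f_best = objective(best)
        if f_best <= f_current:           # no strict improvement: local optimum
            return current, f_current
        current, f_current = best, f_best

# Toy objective: negative Hamming distance to a hidden target pattern.
target = [1, 0, 1, 1]
toy_objective = lambda x: -sum(a != b for a, b in zip(x, target))
solution, value = local_search([0, 0, 0, 0], toy_objective, lambda x: True)
```

Since each objective evaluation is expensive in the real problem, a practical version would cache the values of points already visited instead of re-evaluating them as this sketch does.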
4. Portfolio selection using RFSI
Asset allocation is the practice of dividing resources among different categories such as stocks, bonds, real estate, cash equivalents, etc. Often mathematical optimization models are used for asset allocation with an intent to reduce risk exposure, since each asset class has a different correlation to the others. Within a given category, say stocks, the portfolio manager must choose specific industries to invest in and then specific firms must be selected within those industries. Indeed, the asset allocation and the choice of individual securities must be integrated with portfolio risk management. Furthermore, such portfolios must be temporally (say, quarterly) rebalanced to account for economic and market evolutions. For integrated risk control with frequent rebalancing under portfolio optimization, see for instance, Edirisinghe (2007) and the many references therein.
The early work on portfolio optimization dates back to Markowitz (1952), where a trade-off between portfolio expected return and portfolio variance is sought. While many variants of this approach have been proposed over the years, the fact remains that a universe of securities must be pre-selected prior to running a portfolio optimization model to determine optimal portfolio weights. While there is an increased research focus on risk specifications (or lack thereof) to capture the investor's risk attitude, it is imperative that an underlying universe of securities be carefully selected and their stochastic elements be accurately estimated. In fact, the latter two aspects are often the more dominant aspects in the practice of portfolio management. In this section, we focus on the question of selecting a universe of securities for portfolio risk management, by applying the RFSI concept developed in the preceding sections. Then, portfolio optimization is performed to determine optimal weights on individual securities.
Consider a stock portfolio investment problem where a given budget must be allocated among stocks in H different industries. Suppose there are $J_h$ stocks (firms) under consideration in industry h, h = 1, ..., H. The fund allocation problem is considered in two stages: in the first stage, the problem is to choose the industries for investment of the given budget, and then, to choose specific stocks within those industries as potential candidates in a portfolio. In the second stage, specific portfolio weights are assigned to the chosen candidate securities such that the investor's risk-return preferences are satisfied. The stage one problem is an asset selection problem and the stage two is a portfolio optimization problem. We consider applying a selection process based on RFSI to stage one. Note that financial statements of all firms in the H industries (thus, a total of $J_0 = \sum_{h=1}^{H} J_h$ firms) must be used to collect data for the 18 financial parameters in Table 1.
Suppose the current time period is $t - 1$ and investment is desired for period t (i.e., the subsequent quarter). Referring to the method of computing RFSI presented in Section 3, and using the publicly-available financial data $x^s_h$ for periods $s < t$, the model for correlation maximization in (13) is solved to determine an optimal complementary binary pair $(y_h^*, z_h^*)$, for each industry h = 1, ..., H. The industry correlation metric corresponding to this optimized input/output categorization, $C^t_h(y_h^*, z_h^*)$ in (10), must be verified to be statistically significant. Industry selection for portfolio optimization is based upon the significance of this correlation, tested via the sample estimate $c^t_h(y_h^*, z_h^*)$ in (12).
4.1. Statistical tests of correlations
We are concerned with identifying industries that do not provide statistical evidence for RFSI-based predictability of stock returns, i.e., the industry correlation metric $C^t_h(y_h^*, z_h^*)$ is not a significant positive value. Consider the following hypothesis test for a minimum positive correlation ($\rho_0$) for a given industry h,
$$H_0 : C^t_h(y_h^*, z_h^*) \le \rho_0 \qquad H_1 : C^t_h(y_h^*, z_h^*) > \rho_0. \tag{19}$$
The above null hypothesis $H_0$ indicates that DEA-based fundamental financial strength is not consistent with the efficient market theory for industry h. Note that $C^t_h(y_h^*, z_h^*) := \frac{1}{J_h} \sum_{j=1}^{J_h} C^t_{j,h}(y_h^*, z_h^*)$, see (10), where the firm-correlations $C^t_{j,h}(y_h^*, z_h^*)$ are estimated by $c^t_{j,h}(y_h^*, z_h^*)$ in (11). Consider the following arctan hyperbolic transformation of $c^t_{j,h}(y_h^*, z_h^*)$:
$$\psi_{j,h} := \tanh^{-1}\left(c^t_{j,h}(y_h^*, z_h^*)\right) = \frac{1}{2} \log_e \left( \frac{1 + c^t_{j,h}(y_h^*, z_h^*)}{1 - c^t_{j,h}(y_h^*, z_h^*)} \right). \tag{20}$$
Using the results that $E[c^t_{j,h}] \approx C^t_{j,h}$ and $\mathrm{Var}[c^t_{j,h}] \approx 1 - (C^t_{j,h})^2$, R.A. Fisher (1890-1962)
showed that $\psi_{j,h}$ is approximately normally distributed,
$$\psi_{j,h} \approx \mathrm{Normal}\left( \frac{1}{2} \log_e \left( \frac{1 + C^t_{j,h}(y_h^*, z_h^*)}{1 - C^t_{j,h}(y_h^*, z_h^*)} \right),\; \frac{1}{t_0 - 3} \right), \tag{21}$$
see Tamhane and Dunlop (2000), where $t_0$ is the number of periods used in the estimation in (11). Next, define the industry-average statistic by
$$\bar{\psi}_h := \frac{1}{J_h} \sum_{j=1}^{J_h} \psi_{j,h}, \tag{22}$$
which is the (scaled) sum of $J_h$ normal random variables, and thus, $\bar{\psi}_h$ is normally distributed:
$$\bar{\psi}_h \approx \mathrm{Normal}\left( \frac{1}{2 J_h} \sum_{j=1}^{J_h} \log_e \left( \frac{1 + C^t_{j,h}(y_h^*, z_h^*)}{1 - C^t_{j,h}(y_h^*, z_h^*)} \right),\; \hat{\sigma}^2 \right), \tag{23}$$
where the variance of $\bar{\psi}_h$ is given by
$$\hat{\sigma}^2 := \frac{1}{J_h (t_0 - 3)} + \frac{2}{J_h^2} \sum_{j<k} \mathrm{Cov}(\psi_{j,h}, \psi_{k,h}). \tag{24}$$
$\mathrm{Cov}(\psi_{j,h}, \psi_{k,h})$ in (24) depends on the covariance between $c^t_{j,h}(y_h^*, z_h^*)$ and $c^t_{k,h}(y_h^*, z_h^*)$. The latter two correlations are the point estimates of the firm-correlations $C^t_{j,h}(y_h^*, z_h^*)$ and $C^t_{k,h}(y_h^*, z_h^*)$, for firms j and k. Firm-correlation measures the degree of association between a firm's financial strength and its stock price return, a process that may be expected to be fairly consistent across all firms in the industry. Therefore, when firms j and k operate independently of each other, the point estimates $c^t_{j,h}(y_h^*, z_h^*)$ and $c^t_{k,h}(y_h^*, z_h^*)$ of the firm-correlations $C^t_{j,h}(y_h^*, z_h^*)$ and $C^t_{k,h}(y_h^*, z_h^*)$, respectively, can be expected to be independent of each other as well. This independence assumption results in $\mathrm{Cov}(\psi_{j,h}, \psi_{k,h}) = 0$, and thus, the parameters of the distribution of $\bar{\psi}_h$ are (approximately) known once the values of $C^t_{j,h}(y_h^*, z_h^*)$ are known.
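The transformation in (20), the industry average in (22), and the variance in (24) under the independence assumption can be sketched together. The snippet is a minimal illustration with made-up firm correlations; the function name is hypothetical.

```python
import math

def fisher_statistic(firm_correlations, t0):
    """Industry-average Fisher statistic and its variance under independence.

    firm_correlations : sample correlations of the firms in one industry
    t0                : number of periods used to estimate each correlation
    """
    transformed = [math.atanh(c) for c in firm_correlations]   # transformation (20)
    average = sum(transformed) / len(transformed)              # average (22)
    variance = 1.0 / (len(transformed) * (t0 - 3))             # (24) with zero covariances
    return average, variance

# Two hypothetical firms with sample correlations 0.0 and 0.5 over 27 quarters.
stat, var = fisher_statistic([0.0, 0.5], t0=27)
```

Note that `math.atanh(c)` equals $\frac{1}{2}\log_e\frac{1+c}{1-c}$, so no separate logarithm formula is needed.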
Observe that under the equality sign in the null hypothesis in (19), one has reference only to the industry correlation metric, i.e., $C^t_h(y_h^*, z_h^*) = \rho_0$; however, we need knowledge of the individual firm-correlations $C^t_{j,h}(y_h^*, z_h^*)$. Let the latter correlations be given by
$$C^t_{j,h}(y_h^*, z_h^*) = \rho_0 \theta_{j,h} \quad \text{for } j = 1, \ldots, J_h, \tag{25}$$
where $\theta_{j,h}$ must satisfy the requirements:
$$\sum_{j=1}^{J_h} \theta_{j,h} = J_h \quad \text{and} \quad -\frac{1}{\rho_0} \le \theta_{j,h} \le \frac{1}{\rho_0} \quad \forall j = 1, \ldots, J_h. \tag{26}$$
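The constraints in (26) follow from the fact that the industry-correlation metric is $\rho_0$ (specified as a positive value) and that firm-correlations are bounded within $-1$ and $+1$. Then, denoting
$$\psi_0(\theta) := \frac{1}{2 J_h} \sum_{j=1}^{J_h} \log_e \left( \frac{1 + \rho_0 \theta_{j,h}}{1 - \rho_0 \theta_{j,h}} \right) \quad \text{and} \quad \sigma^2 := \frac{1}{J_h (t_0 - 3)}, \tag{27}$$
it follows that $\bar{\psi}_h \approx \mathrm{Normal}(\psi_0(\theta), \sigma^2)$. For finiteness of the mean, $\psi_0(\theta)$, the inequalities in (26) must be satisfied as strict inequalities, i.e., $-\frac{1}{\rho_0} < \theta_{j,h} < \frac{1}{\rho_0}$, $j = 1, \ldots, J_h$. Then, for $\alpha$-significance level and for the one-sided test, $H_0$ is accepted if
$$\sqrt{J_h (t_0 - 3)} \left( \bar{\psi}_h - \psi_0(\theta) \right) \le Z^{-1}(1 - \alpha), \tag{28}$$
or,
$$\bar{\psi}_h \le \psi_0(\theta) + \frac{Z^{-1}(1 - \alpha)}{\sqrt{J_h (t_0 - 3)}}, \tag{29}$$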
where $Z^{-1}$ is the inverse c.d.f. of a standard normal random variable. To test $H_0$ to conclude that the DEA-based relative strength does not provide sufficient explanatory power for stock price returns in industry h, therefore, specific $\theta$ values are required. Such information is not available, nor can it be estimated. However, if $H_0$ is accepted for the smallest (threshold) value of the right hand side in (29) over all possible $\theta$, then, indeed $H_0$ is accepted for the industry h. Define,
$$\psi_0^{\min} := \inf_{\theta} \left\{ \psi_0(\theta) : \sum_{j=1}^{J_h} \theta_{j,h} = J_h,\; -\frac{1}{\rho_0} < \theta_{j,h} < \frac{1}{\rho_0},\; j = 1, \ldots, J_h \right\}. \tag{30}$$
Hence, if $\bar{\psi}_h \le \psi_0^{\min} + \frac{Z^{-1}(1 - \alpha)}{\sqrt{J_h (t_0 - 3)}}$, then $H_0$ is accepted for the industry h.
Proposition 4.1. For $0 < \rho_0 < 1$,
$$\psi_0^{\min} = \frac{1}{2} \log_e \left( \frac{1 + \rho_0}{1 - \rho_0} \right). \tag{31}$$
Proof. Note that $\psi_0(\theta)$ is nonconvex: it is convex in the positive orthant and concave in the negative orthant. Consider the (relaxed) minimization problem:
$$Z^* := \min_{\theta} \left\{ \psi_0(\theta) : \sum_{j=1}^{J_h} \theta_{j,h} = J_h \right\} \tag{32}$$
and thus, $Z^* \le \psi_0^{\min}$. Since the constraints of (32) are linear, Constraint Qualification (CQ) is satisfied at all feasible solutions, thus implying that every optimal solution of (32) must be a Karush-Kuhn-Tucker (KKT) point, see Bazaraa et al. (1993). Denoting the Lagrange multiplier associated with the equality constraint by $\lambda$, the KKT conditions yield,
$$\frac{2 \rho_0}{1 - (\rho_0 \theta_{j,h})^2} - \lambda = 0 \quad \forall j = 1, \ldots, J_h. \tag{33}$$
Therefore, $\theta_{j,h} = \nu_j a$ must hold for all $j = 1, \ldots, J_h$, where $\nu_j$ is $+1$ or $-1$ and a is a positive constant. The equality constraint thus implies that $\sum_{j=1}^{J_h} \nu_j = J_h / a > 0$, which is the net count of positive values in $\theta_{j,h}$ for $j = 1, \ldots, J_h$, and thus, $\sum_{j=1}^{J_h} \nu_j \in \{1, 2, \ldots, J_h\}$. That is, a can take on the set of possible discrete values $\left\{ 1, \frac{J_h}{J_h - 1}, \frac{J_h}{J_h - 2}, \ldots, \frac{J_h}{2}, J_h \right\}$. Each of these values for a defines a distinct KKT point provided the objective function is well-defined at those points, which is given by
$$\frac{1}{2 J_h} \left[ \sum_{j : \nu_j = +1} \log_e \left( \frac{1 + a \rho_0}{1 - a \rho_0} \right) + \sum_{j : \nu_j = -1} \log_e \left( \frac{1 - a \rho_0}{1 + a \rho_0} \right) \right] = \frac{1}{2 J_h} \left( \sum_{j=1}^{J_h} \nu_j \right) \log_e \left( \frac{1 + a \rho_0}{1 - a \rho_0} \right)$$
since the following identity holds:
$$\log_e \left( \frac{1 + \rho_0 a}{1 - \rho_0 a} \right) + \log_e \left( \frac{1 - \rho_0 a}{1 + \rho_0 a} \right) = 0.$$
Therefore, the objective value associated with each KKT-point is given by $\frac{1}{2 J_h} \frac{J_h}{a} \log_e \left( \frac{1 + a \rho_0}{1 - a \rho_0} \right)$ provided $a \le \frac{1}{\rho_0}$. Then, the minimum in (32) is obtained by
$$Z^* = \min \left\{ \frac{1}{2a} \log_e \left( \frac{1 + a \rho_0}{1 - a \rho_0} \right) : a = 1, \frac{J_h}{J_h - 1}, \frac{J_h}{J_h - 2}, \ldots, \min\{J_h, 1/\rho_0\} \right\}. \tag{34}$$
Next, noting that the function $f(x) = \frac{1}{x} \log_e \left( \frac{1 + x \rho_0}{1 - x \rho_0} \right)$ is monotonically nondecreasing in x for $x \in [1, 1/\rho_0]$, it follows that $a = 1$ is indeed the optimal solution in (34), which thus implies that each $\nu_j = +1$ at the optimum. That is, $\theta_{j,h} = 1$, $\forall j$, solves the relaxed problem in (32), which yields $Z^* = \frac{1}{2} \log_e \left( \frac{1 + \rho_0}{1 - \rho_0} \right)$. But, for $0 < \rho_0 < 1$, we have $-\frac{1}{\rho_0} < \theta_{j,h} = 1 < \frac{1}{\rho_0}$ for all j. This leads to the feasible-point upper bound on the infimum in (30) as $\psi_0^{\min} \le Z^*$. Combining with $Z^* \le \psi_0^{\min}$, the proof is completed. $\square$
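A quick numerical check can make Proposition 4.1 more concrete. The sketch below evaluates the mean in (27) at the all-ones vector and at a few other vectors that satisfy the sum constraint in (26), and compares them with the closed form in (31). It is only a spot check on hand-picked points, not a proof; the function names and the test vectors are hypothetical.

```python
import math

def mean_transformed(thetas, rho0):
    """Mean in (27): average of atanh(rho0 * theta_j) over the firms."""
    return sum(math.atanh(rho0 * th) for th in thetas) / len(thetas)

def closed_form_minimum(rho0):
    """Closed form (31) for the smallest mean under the null hypothesis."""
    return 0.5 * math.log((1 + rho0) / (1 - rho0))

rho0 = 0.10
lower = closed_form_minimum(rho0)
at_ones = mean_transformed([1, 1, 1, 1], rho0)          # the claimed minimizer
others = [
    mean_transformed([2, 0, 1, 1], rho0),               # unequal, all non-negative
    mean_transformed([3, -1, 1, 1], rho0),              # one negative component
    mean_transformed([2.5, 0.5, 0.5, 0.5], rho0),
]
```

Each alternative vector sums to 4, as required by (26), and each gives a mean slightly above the all-ones value, which agrees with the proposition for these points.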
4.2. Selection criteria for portfolio optimization
The foregoing statistical analysis can be used to determine an industry partition for investment. Given a set of industries h = 1, ..., H for consideration in period t, an investment-worthy (screened) set $\mathcal{H}$ is determined under the Industry Selection Criterion given by
$$\text{ISC}: \quad \mathcal{H} := \left\{ h : \bar{\psi}_h > \frac{\kappa}{2} \log_e \left( \frac{1 + \rho_0}{1 - \rho_0} \right) + \frac{Z^{-1}(1 - \alpha)}{\sqrt{J_h (t_0 - 3)}},\; h = 1, \ldots, H \right\}, \tag{35}$$
where $\kappa \ge 1$ is a user-specified (safety) factor. For each industry $h \in \mathcal{H}$, individual stocks j are chosen from the given firms $j = 1, \ldots, J_h$ by using RFSI as a selection discriminator. Under Definition 3.1 and referring to the computation of RFSI in (15), for a given industry $h \in \mathcal{H}$, evaluate the moving average forecast of $\hat{t}$ periods,
$$\mathrm{RFSI}(t, j; \hat{t}) = \frac{1}{\hat{t}} \sum_{s=t-\hat{t}}^{t-1} g_j(y_h^*, z_h^*; x^s). \tag{36}$$
The subset (of firms) $\mathcal{J}_h$ from industry $h \in \mathcal{H}$ is chosen for portfolio analysis under the Stock Selection Criterion given by
$$\text{SSC}: \quad \mathcal{J}_h := \{ j : \mathrm{RFSI}(t, j; \hat{t}) \ge R^*,\; j = 1, \ldots, J_h \}, \tag{37}$$
where $R^*$ is a prespecified threshold, where $0 < R^* \le 1$. The stocks in the subset $\mathcal{J}_h$, for $h \in \mathcal{H}$, are then expected to perform well in the stock market with high confidence. The universe of securities for portfolio analysis is thus given by
$$\mathcal{N} := \bigcup_{h \in \mathcal{H}} \mathcal{J}_h, \tag{38}$$
which is a subset of the original universe of stocks, i.e., $|\mathcal{N}| \le J_0 := \sum_{h=1}^{H} J_h$. The investment weight to be attached to each stock $j \in \mathcal{N}$ is then determined by a portfolio optimization model. There are several models in the literature for this purpose, and the choice of a model is primarily guided by risk/return considerations. Risk specifications are multi-pronged and portfolio optimization models are multi-faceted. For instance, when there are transactions and slippage costs of trading, portfolio drawdown characteristics are a major concern of risk. Also, when market evolutionary dynamics are nonstationary, multiperiod sequential stochastic decision optimization is shown to yield superior performance compared to static one period models, see Edirisinghe (2007) for details.
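The two screening rules can be sketched in a few lines of Python. The snippet below computes the right-hand side of the Industry Selection Criterion (35) and applies the Stock Selection Criterion (37) to a dictionary of RFSI values. The function names, the dictionary layout, and the firm labels are assumptions for this example; the normal quantile comes from the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def isc_critical_value(num_firms, t0, rho0, alpha, kappa):
    """Right-hand side of the Industry Selection Criterion (35)."""
    base = 0.5 * math.log((1 + rho0) / (1 - rho0))     # closed form (31)
    z = NormalDist().inv_cdf(1 - alpha)                # one-sided normal quantile
    return kappa * base + z / math.sqrt(num_firms * (t0 - 3))

def select_stocks(rfsi_by_firm, threshold):
    """Stock Selection Criterion (37): keep firms whose RFSI meets the threshold."""
    return [firm for firm, value in rfsi_by_firm.items() if value >= threshold]

# Settings reported in Section 5: minimum correlation 0.10, 5% significance,
# safety factor 1.2, 27-quarter window; 52 and 14 firms are two of the industries.
critical_software = isc_critical_value(52, 27, 0.10, 0.05, 1.2)
critical_hardware = isc_critical_value(14, 27, 0.10, 0.05, 1.2)

# Hypothetical RFSI values for three firms, screened at the 0.60 threshold.
chosen = select_stocks({"A": 0.75, "B": 0.59, "C": 0.60}, threshold=0.60)
```

With these settings the two critical values come out at roughly 0.167 and 0.210, in line with the ISC critical values listed for the software and hardware industries in Table 2. An industry passes the screen when its test statistic exceeds its critical value.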
The focus here is to demonstrate the usefulness of the preceding selection criteria, (ISC) and (SSC), over the unscreened set of $J_0$ stocks. The specific advantage of our stock screening rules is that stochastic parameter estimation is performed in a reduced dimension using stocks that are likely to have strong performance. Reduced dimensions typically lead to smaller statistical estimation errors in RoR parameters. Consequently, a given portfolio optimization model is expected to provide better risk/return performance characteristics. To this end, we utilize the standard mean-variance portfolio optimization framework in a static one period setting. We compare the two models, the RFSI-based model (RMV) that uses the universe $\mathcal{N}$ and the standard mean-variance model (SMV) using all $J_0$ stocks. Portfolio weights carry the designation $w_j$ for stock j, with a superscript R or S indicating the type of model used to calculate the weights.
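$$\text{RMV}: \quad \max_{w^R} \; \sum_{j \in \mathcal{N}} \mu_j w^R_j - \lambda \sum_{j,k \in \mathcal{N}} \sigma_{jk} w^R_j w^R_k \tag{39}$$
$$\text{s.t.} \quad \sum_{j \in \mathcal{N}} w^R_j \le 1, \qquad w^R_j \ge 0, \; j \in \mathcal{N}.$$
$$\text{SMV}: \quad \max_{w^S} \; \sum_{j=1}^{J_0} \mu_j w^S_j - \lambda \sum_{j,k=1}^{J_0} \sigma_{jk} w^S_j w^S_k \tag{40}$$
$$\text{s.t.} \quad \sum_{j=1}^{J_0} w^S_j \le 1, \qquad w^S_j \ge 0, \; j = 1, \ldots, J_0.$$
In (39) and (40), $\lambda$ is a risk tolerance parameter that trades off portfolio mean return with portfolio variance. Thus, a higher $\lambda$ is indicative of risk-averse portfolio weighting that strives for better diversification. Performance comparison of these two models is presented in the next section.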
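To make the structure of (39) and (40) concrete, the sketch below solves a small instance by projected gradient ascent. This is a swapped-in technique chosen only to keep the example free of external dependencies; any quadratic programming solver would serve, and nothing here is claimed to be the solver used for the reported results. The names `mu` (mean RoR), `sigma` (covariance matrix), and `lam` (the trade-off parameter) are assumptions for this example, as are the numbers.

```python
def project_capped_simplex(v):
    """Euclidean projection onto {w >= 0, sum(w) <= 1}."""
    clipped = [max(x, 0.0) for x in v]
    if sum(clipped) <= 1.0:
        return clipped
    # Budget is binding: project onto {w >= 0, sum(w) = 1} (sort-based method).
    ordered = sorted(v, reverse=True)
    cumulative, shift = 0.0, 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        candidate = (cumulative - 1.0) / k
        if value - candidate > 0:
            shift = candidate
    return [max(x - shift, 0.0) for x in v]

def mean_variance_weights(mu, sigma, lam, iterations=2000):
    """Maximize mu'w - lam * w'Sigma w over {w >= 0, sum(w) <= 1}."""
    n = len(mu)
    # Step size 1 / (2 * lam * trace(Sigma)) is a safe bound for a PSD covariance.
    step = 1.0 / (2.0 * lam * sum(sigma[i][i] for i in range(n)))
    w = [0.0] * n
    for _ in range(iterations):
        grad = [mu[i] - 2.0 * lam * sum(sigma[i][k] * w[k] for k in range(n))
                for i in range(n)]
        w = project_capped_simplex([w[i] + step * grad[i] for i in range(n)])
    return w

mu = [0.10, 0.05]
sigma = [[0.04, 0.0], [0.0, 0.01]]
w_cautious = mean_variance_weights(mu, sigma, lam=5.0)    # budget not fully used
w_bold = mean_variance_weights(mu, sigma, lam=0.5)        # budget constraint binds
```

In this toy instance the larger value of `lam` leaves part of the budget uninvested and spreads the rest over both assets, while the smaller value concentrates the whole budget in the higher-return asset.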
5. Application of RFSI in the technology sector
The DEA-based relative financial strength indicator, RFSI, is applied in portfolio optimization where several (publicly-traded) US companies in various industries are considered. The objective is to validate the use of RFSI-based stock selection as a means of improving risk/return performance of optimized portfolios. Quarterly financial statements of firms during the period 1996-2002 are used. Of the 28 consecutive quarters, the first quarter is set aside for the initial calculations of RoR, growth rates etc. Reported results pertain to a time window of $t_0 = 27$ quarters for industry-correlation maximization in (13). Only the technology sector is used for the experimentation, of which six industries are chosen: software & programming (h = 1), communications equipment (h = 2), computer hardware (h = 3), computer networks (h = 4), semiconductors (h = 5), and computer services (h = 6). Quarterly data for all firms in these 6 industries are electronically obtained from the WRDS (Wharton Research Data Services) database. The financial statement data, as well as quarterly stock price information, are checked for completeness and only those firms with complete data are chosen within each industry. Thus, the usable number of firms ($J_h$) in each industry is limited, and they are $J_1 = 52$, $J_2 = 51$, $J_3 = 14$, $J_4 = 17$, $J_5 = 75$, and $J_6 = 21$. Thus, the total number of firms is $J_0 = 230$.
In order to determine the optimal input/output categorization of the 18 financial parameters in Table 1, the two-step heuristic solution method in Section 3.1 is applied, where an initial sample $(y^s, z^s)$, $s \in S$, is determined using a sample of size $|S| = 20$. The corresponding sample of objective correlations in (16) is then computed. For each sample point, the objective industry-correlation value $c^t_h(y, z)$, being forecasted for period t = 29 (i.e., the first quarter of 2003), is improved via the local search procedure in Algorithm-LS, see Section 3.1.1. The optimal vector pairs $(y_h^*, z_h^*)$ corresponding to the largest correlation are then obtained, see Table 2. The notation 'in', 'out', or '-' indicates that a given financial statement parameter i is an input, an output, or not considered, respectively, in an industry h. These results pertain to the unrestricted version that uses X in (6). Those for the restricted version $X^*$ in (7) are in parentheses in Table 2. The local search trajectory corresponding to the sample point that leads to the reported optimal input/output categorization is plotted, for the unrestricted domain X and the restricted domain $X^*$, in Figs. 2 and 3, respectively, for each industry.
For the industry-optimal $(y_h^*, z_h^*)$ categorization, for both cases of unrestricted and restricted domains, the industry-correlation metric $c^t_h(y_h^*, z_h^*)$ is estimated according to (11), and each industry is tested for statistical significance using the hypothesis test in (19).
Table 2
Optimal input/output categorization $(y_h^*, z_h^*)$ for RFSI in each industry (restricted-domain results in parentheses; '-' denotes a parameter that is not considered)

Financial                 Industry (h)
parameter (i)             Software    Communic.   Hardware    Networks    Semicond.   Services
1                         in (out)    - (-)       - (out)     - (-)       - (-)       - (out)
2                         - (-)       - (-)       - (-)       out (out)   - (-)       - (-)
3                         - (out)     - (-)       - (out)     out (out)   in (-)      out (-)
4                         out (-)     - (in)      out (in)    in (in)     - (-)       out (in)
5                         out (in)    - (in)      out (-)     - (-)       - (-)       - (-)
6                         - (-)       - (-)       out (-)     - (-)       out (-)     - (-)
7                         in (in)     - (in)      out (in)    - (-)       - (-)       out (in)
8                         - (-)       in (-)      in (in)     - (-)       - (in)      in (in)
9                         in (in)     - (-)       in (-)      in (in)     - (in)      - (-)
10                        - (in)      in (-)      - (-)       in (in)     out (-)     - (-)
11                        out (-)     - (-)       - (-)       - (-)       in (in)     in (-)
12                        in (in)     out (-)     - (-)       - (-)       in (in)     in (-)
13                        out (-)     - (-)       out (out)   - (-)       in (-)      - (out)
14                        - (out)     in (out)    - (out)     out (out)   - (out)     - (out)
15                        - (out)     out (out)   - (out)     - (-)       - (out)     - (-)
16                        - (out)     out (out)   - (-)       - (-)       out (out)   - (out)
17                        in (-)      - (-)       out (out)   - (-)       - (-)       - (out)
18                        out (out)   - (-)       in (out)    - (-)       - (-)       out (-)

Max. corr. $c^t_h(y_h^*, z_h^*)$
  Unrestricted domain     0.2360      0.1947      0.2424      0.2992      0.2220      0.1856
  Restricted domain       (0.1848)    (0.1594)    (0.0995)    (0.2992)    (0.1595)    (0.1588)
Test statistic $\bar{\psi}_h$
  Unrestricted domain     0.2537      0.2035      0.2551      0.3185      0.2352      0.1951
  Restricted domain       (0.1927)    (0.1683)    (0.1010)    (0.3185)    (0.1704)    (0.1642)
Base categorization
  Corr. $c^t_h(y^0, z^0)$ 0.1318      0.0504      0.0328      0.1488      0.0668      0.0416
  Test statistic          0.1392      0.0521      0.0459      0.1539      0.0677      0.0373
ISC critical value        0.1670      0.1674      0.2101      0.2018      0.1592      0.1937
The test statistic $\bar{\psi}_h$ in (22) is computed and reported in Table 2. The minimum positive correlation is set to $\rho_0 = +0.10$, which yields $\psi_0^{\min} = 0.1003$, see (31). Setting the level of significance $\alpha = 5\%$ and the safety factor $\kappa = 1.2$, the resulting critical values for the ISC criterion in (35) are reported in the same table. Observe that for the optimized industry-correlation metric under the unrestricted parameter domain X, all six industries are chosen by the ISC criterion, while that under the restricted domain $X^*$ leads to rejecting the two industries, hardware (h = 3) and services (h = 6).
[Figure 2: industry-correlation metric (vertical axis, -0.1 to 0.35) against local search iteration # (horizontal axis, 0 to 9), one series each for Software, Communications, Hardware, Networks, SemiConductors, and Services.]
Fig. 2. Trajectories of maximizing industry-correlation: unrestricted case.
[Figure 3: industry-correlation metric (vertical axis, -0.1 to 0.35) against local search iteration # (horizontal axis, 0 to 9), one series each for Software, Communications, Hardware, Networks, SemiConductors, and Services.]
Fig. 3. Trajectories of maximizing industry-correlation: restricted case.
Correlations corresponding to the exogenous base categorization in (8) are also reported in Table 2. Note that the maximized correlations (for quarter 1 of 2003) are strictly better than those resulting from the base categorization of the 18 financial parameters. Also note that the base categorization fails to pick a single industry for investment, based on DEA-based predictability.
For the industries chosen as above, the RFSI indicator is computed according to (36), with $\hat{t} = 4$. That is, the most recent four quarter moving average is computed for predicting RFSI for quarter 1 of 2003. These RFSI predictions are plotted in Figs. 4 and 5, respectively, for unrestricted and restricted cases. Specifying the threshold $R^* = 0.60$ for the stock selection criterion in (37), stocks are chosen for portfolio optimization. As evident from Figs. 4 and 5, only a small fraction of the universe of 230 securities is chosen by the SSC; for the unrestricted case, 85 securities are selected ($|\mathcal{N}| = 85$) and for the restricted case, only 49 securities are selected ($|\mathcal{N}| = 49$).
5.1. Portfolio optimization
The model RMV in (39) is executed for several risk tolerance levels using the screened
subsets of stocks, under the stock selection criterion. Let RMV(u) indicate the model for
the unrestricted case, specied with 85 stocks, and let RMV(r) denote the restricted case,
having 49 stocks. In contrast, the model SMV in (40) is specied with the original universe
of 230 stocks. In this section, the performance of the model SMV is compared with
0
1 . 0
2 . 0
3 . 0
4 . 0
5 . 0
6 . 0
7 . 0
8 . 0
9 . 0
1
1 . 1
0 0 2 0 5 1 0 0 1 0 5 0
y t i r u c e S
R
F
S
I

f
o
r

2
0
0
3
Q
1
e r a w t f o S s n o i t a c i n u m m o C e r a w d r a H s k r o w t e N s r o t c u d n o C i m e S s e c i v r e S
Fig. 4. RFSI predictions for chosen industries Unrestricted case.
N.C.P. Edirisinghe, X. Zhang / Journal of Banking & Finance 31 (2007) 33113335 3331
RMV(u) and RMV(r). Portfolio allocations (i.e., weights) are determined by solving the
appropriate quadratic programming models, specied with a model-time period of
3-months covering from January 01 to March 31, 2003, herein referred to as the invest-
ment horizon. We consider a buy-and-hold strategy, wherein the model-determined
0
1 . 0
2 . 0
3 . 0
4 . 0
5 . 0
6 . 0
7 . 0
8 . 0
9 . 0
1
1 . 1
0 0 2 0 5 1 0 0 1 0 5 0
y t i r u c e S
R
F
S
I

f
o
r

2
0
0
3
Q
1
e r a w t f o S s n o i t a c i n u m m o C s k r o w t e N s r o t c u d n o C i m e S
Fig. 5. RFSI predictions for chosen industries Restricted case.
% 0
% 0 5
% 0 0 1
% 0 5 1
% 0 0 2
% 0 5 2
% 0 0 3
% 0 4 % 5 3 % 0 3 % 5 2 % 0 2 % 5 1 % 0 1 % 5 % 0
n o i t a i v e D d r a d n a t S d e z i l a u n n A
A
n
n
u
a
l
i
z
e
d

P
o
r
t
f
o
l
i
o

R
a
t
e

o
f

R
e
t
u
r
n
) u ( V M R : d e t c i r t s e r n U ) r ( V M R : d e t c i r t s e R V M S : d r a d n a t S
3 0 0 2 , 1 3 r a M - 1 0 n a J
Fig. 6. Portfolio (actual) ecient frontiers under RFSI-based and standard models.
3332 N.C.P. Edirisinghe, X. Zhang / Journal of Banking & Finance 31 (2007) 33113335
optimal stock allocations are held unchanged during the entire investment horizon. Con-
sequently, the resulting portfolios are evaluated (i.e., out-of-sample simulated) on a daily
basis by using the actual price realizations from the investment horizon. That is, portfolio
allocations determined at the end of 2002 by the model are simulated using prices from
January 01 to March 31, 2003 to determine the actual portfolio performance
characteristics.
Performance characteristics are compared across the three dierent models RMV(u),
RMV(r), and SMV. By varying the value of the risk tolerance parameter k in each model,
ecient frontiers are traced and plotted in Fig. 6. It is evident that the RFSI-based models
outperform the standard mean-variance trade-o optimization, with the unrestricted RFSI
model showing impressive portfolio gains over the restricted version.
During the same investment horizon, the market barometer index, S&P-500 index, dis-
plays an annualized standard deviation of 22.4%. The three model versions are set such
that each model will provide a portfolio with an annualized standard deviation of (approx-
imately) 22.4%. Portfolio evolutions corresponding to this case are depicted in Fig. 7,
where the performance of S&P-500 index is also indicated. The daily cumulative returns
are signicantly improved in the case of the RFSI-based model using the unrestricted
selection of input/output parameters.
6. Concluding remarks
Capturing fundamental nancial strength of publicly traded rms is of paramount
interest to equity portfolio managers. To this end, this article developed a new quantitative
metric, termed the Relative Financial Strength Indicator (RFSI), which is designed to have
high correlation with stock price returns. The underlying methodology is based on using a
% 0 1 -
% 5 -
% 0
% 5
% 0 1
% 5 1
% 0 2
% 5 2
1 6 6 5 1 5 6 4 1 4 6 3 1 3 6 2 1 2 6 1 1 1 6 1
# y a D
D
a
i
l
y

C
u
m
u
l
a
t
i
v
e

R
o
R
) u ( V M R : d e t c i r t s e r n U ) r ( V M R : d e t c i r t s e R V M S : d r a d n a t S x e d n i 0 0 5 P & S
3 0 0 2 , 1 3 r a M - 1 0 n a J % 4 . 2 2 = v e D d t S . n n A , s e i r e s e m i t l l a r o F
Fig. 7. Portfolio RoR under RFSI-based and standard models.
N.C.P. Edirisinghe, X. Zhang / Journal of Banking & Finance 31 (2007) 33113335 3333
generalized version of data envelopment analysis, coupled with selecting inputs and out-
puts from nancial statements via a well-dened optimization process. The resulting mea-
sure of relative fundamental strength is then veried to yield high degree of correlation via
statistical analysis. Consequently, RFSI is a useful tool for selecting investment-worthy
industries and securities for portfolio analysis. With the proposed selection rules, it is
shown that portfolios so-optimized have signicantly improved risk-return characteristics.
To our knowledge, this is the rst instance that portfolio optimization is integrated with
optimization-based tools for fundamental analysis of companies.
While significant effort has been expended in the literature toward understanding the
risk-return characteristics of portfolios constructed from a given universe of securities, methods
for selecting such a universe have not been addressed to any appreciable extent. With
the RFSI methodology proposed in this paper, portfolio managers can objectively validate
the inclusion of a given industry or a given security for possible investment, based on the
industry's or firm's fundamental financial strength. The question of diversification of
investments can then be addressed within those chosen securities. With limited computational
experiments in this paper, the advantages of such an approach are demonstrated
using portfolio variance as the risk measure. In our future work, we will pursue specific
choices for parameters of RFSI computations using data sets that cover a broader market,
as well as the effects of using alternative risk measures for portfolio optimization.
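The two-stage idea summarized above (screen the universe by a strength score, then diversify within the chosen securities using variance as the risk measure) can be sketched as follows. This is only an illustration under stated assumptions: the scores, the threshold, and the return data are synthetic, the function names are hypothetical, and the closed-form fully invested minimum-variance portfolio (short sales allowed) stands in for the portfolio models used in the paper.

```python
import numpy as np

def min_variance_weights(cov):
    """Fully invested minimum-variance weights, short sales allowed.

    Closed form: w = inv(Sigma) 1 / (1' inv(Sigma) 1).
    """
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)  # inv(Sigma) @ 1 without forming the inverse
    return raw / raw.sum()

def select_then_optimize(scores, returns, threshold):
    """Keep assets with score >= threshold, then minimize variance among them.

    scores    : (n_assets,) hypothetical strength scores (assumption)
    returns   : (n_days, n_assets) historical returns
    threshold : hypothetical cut-off for inclusion (assumption)
    """
    keep = np.flatnonzero(scores >= threshold)
    # atleast_2d guards the case where only one asset passes the screen
    cov = np.atleast_2d(np.cov(returns[:, keep], rowvar=False))
    weights = np.zeros(scores.shape[0])
    weights[keep] = min_variance_weights(cov)
    return weights

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
returns = rng.normal(0.0, 0.01, size=(250, 5))
scores = np.array([0.9, 0.4, 0.8, 0.3, 0.7])
weights = select_then_optimize(scores, returns, threshold=0.6)
print(weights)
```

Assets that fail the screen receive zero weight, and the remaining weights sum to one. A no-short-sale or target-return constraint would require a quadratic programming solver rather than this closed form.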
Acknowledgements
The authors wish to thank the editor and the anonymous referees for many useful comments
that improved the presentation of this paper.
References
Alam, I.M.S., Robin, C.S., 1998. The relationship between stock market returns and technical efficiency
innovations: Evidence from the US airline industry. Journal of Productivity Analysis 9, 35–51.
Banker, R.D., Charnes, A., Cooper, W.W., 1984. Some models for estimating technical and scale inefficiencies in
data envelopment analysis. Management Science 30, 1078–1092.
Bazaraa, M.S., Sherali, H.D., Shetty, C.M., 1993. Nonlinear Programming. John Wiley.
Beasley, J., 1995. Determining teaching and research efficiencies. Journal of the Operational Research Society 46,
441–452.
Bowlin, W.F., 1999. An analysis of the financial performance of defense business segments using data
envelopment analysis. Journal of Accounting and Public Policy 18, 287–310.
Bowlin, W.F., 2004. Financial analysis of civil reserve air fleet participants using data envelopment analysis.
European Journal of Operational Research 154, 691–709.
Charnes, A., Cooper, W.W., Rhodes, E., 1978. Measuring the efficiency of decision-making units. European
Journal of Operational Research 2, 429–444.
Charnes, A., Cooper, W.W., Golany, B., Seiford, L.M., Stutz, J., 1985. Foundations of data envelopment
analysis for Pareto–Koopmans efficient empirical production functions. Journal of Economics 30, 91–107.
Cook, W.D., Green, R.H., Zhu, J., 2006. Dual-role factors in data envelopment analysis. IIE Transactions 38,
105–115.
Doyle, J.T., Lundholm, R.J., Soliman, M.T., 2003. The predictive value of expenses excluded from Pro Forma
earnings. Review of Accounting Studies 8, 145–174.
Edirisinghe, N.C.P., 2007. Integrated risk control using stochastic programming ALM models for money
management. In: Zenios, S.A., Ziemba, W.T. (Eds.), Handbook of Asset and Liability Management, vol.
2. Elsevier Science BV.
Edirisinghe, N.C.P., Zhang, X., 2007. Portfolio selection under DEA-based relative financial strength indicators:
Case of US industries. Journal of the Operational Research Society, in press.
Farrell, M.J., 1957. The measurement of efficiency of production. Journal of the Royal Statistical Society (Series
A) 120, 251–281.
Graham, C.M., Cannice, M.V., Sayre, T.L., 2002. The value-relevance of financial and non-financial information
for Internet companies. Thunderbird International Business Review 44, 47–70.
Gregoriou, G.N., Sedzro, K., Zhu, J., 2005. Hedge fund performance appraisal using data envelopment analysis.
European Journal of Operational Research 164, 555–571.
Horst, R., Pardalos, P.M., Thoai, N.V., 1995. Introduction to global optimization. In: Series in: Nonconvex
Optimization and its Applications, vol. 3. Kluwer Academic Publishers.
Kanas, A., 2001. Neural network linear forecasts for stock returns. International Journal of Financial Economics
6, 245–254.
Larsson, T., Patriksson, M., 2005. Global optimality conditions for discrete and nonconvex optimization: With
applications to Lagrangian heuristics and column generation. Working Paper, Linköping University, SE-581
83 Linköping, Sweden.
Lovell, C.A.K., Pastor, J.T., 1995. Units invariant and translation invariant DEA models. Operational Research
Letters 18, 147–151.
Markowitz, H.M., 1952. Portfolio selection. Journal of Finance 7, 77–91.
Ou, J.A., 1989. Financial statement analysis and the prediction of stock returns. Journal of Accounting and
Economics 11, 295–329.
Pille, P., Paradi, J.C., 2002. Financial performance analysis of Ontario (Canada) Credit Unions: An application
of DEA in the regulatory environment. European Journal of Operational Research 139, 339–350.
Piotroski, J.D., 2000. Value investing: The use of historical financial statement information to separate winners
from losers. Journal of Accounting Research 38, 1–41.
Portela, M.C.A.S., Thanassoulis, E., Simpson, G., 2004. Negative data in DEA: A directional distance approach
applied to bank branches. Journal of the Operational Research Society 55, 1111–1121.
Quah, T.S., Srinivasan, B., 1999. Improving returns on stock investment through neural network selection.
Expert Systems Applications 17, 295–301.
Ray, S.C., 2004. Data Envelopment Analysis: Theory and Techniques for Economics and Operations Research.
Cambridge University Press.
Tamhane, A.C., Dunlop, D.D., 2000. Statistics and Data Analysis from Elementary to Intermediate. Prentice-
Hall, New Jersey.
Tawarmalani, M., Sahinidis, N.V., 2004. Global optimization of mixed-integer nonlinear programs: A theoretical
and computational study. Mathematical Programming 99, 563–591.
Thanassoulis, E., 1993. A comparison of regression analysis and data envelopment analysis as alternative
methods for performance assessments. Journal of the Operational Research Society 44, 1129–1144.
Thomsett, M.C., 1998. Mastering Fundamental Analysis. Dearborn, Chicago.
Yeh, Q.J., 1996. The application of data envelopment analysis in conjunction with financial ratios for bank
performance evaluation. Journal of the Operational Research Society 47, 980–988.
Zhang, L.-S., Gao, F., Zhu, W.-X., 1999. Nonlinear integer programming and global optimization. Journal of
Computational Mathematics 17, 179–190.
