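$$
\begin{aligned}
f_k := \max_{u,v}\ & \frac{\sum_{n=1}^{N} (x^o)_{nk}\, v_{nk}}{\sum_{m=1}^{M} (x^i)_{mk}\, u_{mk}} \\
\text{s.t.}\ & \frac{\sum_{n=1}^{N} (x^o)_{nj}\, v_{nk}}{\sum_{m=1}^{M} (x^i)_{mj}\, u_{mk}} \le 1, \quad j = 1, \ldots, J, \\
& u_{mk},\ v_{nk} \ge 0, \quad m = 1, \ldots, M;\ n = 1, \ldots, N.
\end{aligned} \tag{1}
$$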
For firm $j$, the (measured) level of input parameter $m$ is $(x^i)_{mj}$, $m = 1, \ldots, M$, while that of output parameter $n$ is $(x^o)_{nj}$, $n = 1, \ldots, N$. The non-negative input and output multipliers for firm $k$ are denoted by the variables $u_{mk}$ and $v_{nk}$, respectively. The model in (1) yields the maximum achievable efficiency for firm $k$, denoted $f_k$, provided every other firm also applies the same aggregating non-negative multipliers in computing its input-to-output conversion ratio. $f_k$ is termed the DEA efficiency score of firm $k$. An efficiency score of less
3314 N.C.P. Edirisinghe, X. Zhang / Journal of Banking & Finance 31 (2007) 33113335
than one indicates that it may be possible to decrease the level of input for the same level of output, while a score of 1 indicates that the firm is DEA-efficient. Applying (1) to each firm independently yields the respective (maximum) relative efficiency score for each firm. The equivalent linear programming formulation of model (1) is, see Charnes et al. (1978),
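$$
\begin{aligned}
\hat{f}_k := \max_{u,v}\ & \sum_{n=1}^{N} (x^o)_{nk}\, v_{nk} \\
\text{s.t.}\ & \sum_{m=1}^{M} (x^i)_{mk}\, u_{mk} = 1, \\
& -\sum_{m=1}^{M} (x^i)_{mj}\, u_{mk} + \sum_{n=1}^{N} (x^o)_{nj}\, v_{nk} \le 0, \quad j = 1, \ldots, J, \\
& u_{mk},\ v_{nk} \ge 0, \quad m = 1, \ldots, M;\ n = 1, \ldots, N.
\end{aligned} \tag{2}
$$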
It is easy to show that $\hat{f}_k = f_k$ holds under the non-negativity of the observed data. More precisely, if $(x^i)_{mk} > 0$ for some $m = 1, \ldots, M$, then $\hat{f}_k = f_k$ holds. Conversely, suppose $(x^i)_{mk} \le 0$ for all $m = 1, \ldots, M$. Then the maximization in (1) is not well-defined and (2) is infeasible, in which case we assign a performance strength of $\hat{f}_k = 0$. For detailed discussions on DEA models that involve negative inputs/outputs, see Lovell and Pastor (1995) and Portela et al. (2004), for instance. The issue of negative data arises because, in the application in this paper, the input and output data come from financial statements. That is, it is possible that all input parameters for a given firm have non-positive values, depending on how the input parameters are chosen from financial statements. Such is the case if return on assets and return on equity are chosen as the only input parameters and these two parameters are negative for the firm being evaluated.
The input/output parameters for the DEA framework are identified from financial statements of companies. A total of 18 financial parameters are used, either taken directly or computed from the quarterly financial statements of a firm, as presented in Table 1. These parameters examine a firm's fundamental performance from a range of perspectives: profitability, asset utilization, liquidity, leverage, valuation, and growth.
2.1. The generalized DEA approach
It is important to observe that, in order to apply the DEA model in (2), the $M$ input parameters and $N$ output parameters are required to be explicitly identified a priori. While this may be possible in certain applications (such as production) where input-to-output conversion mechanisms are well understood, our case is different. We must select a set of input and output parameters from the universe of 18 financial parameters describing a firm's financial health (see Table 1). The objective of such a selection is that the resulting relative DEA performance score of a firm can be interpreted as a measure of its underlying financial strength. Such financial strength measures are required to be strongly correlated with the market price process, under the efficient market hypothesis. If the inputs and outputs for the DEA model are chosen exogenously (a priori), the resulting DEA performance scores for firms may not be representative of the fundamental financial
strengths that are rewarded by the financial markets. The generalized DEA approach (GDEA) developed in this paper leaves the selection of inputs and outputs as flexible as possible, in the sense that a proper selection of the latter is sought iteratively to maximize the correlation between the DEA-based strength evaluation and the stock market performance. This process is best explained in Fig. 1.
Table 1
Financial parameters used for fundamental analysis

| i | Parameter | Description | Perspective |
|---|---|---|---|
| 1 | Return on equity | Net income generated per unit of common shareholders' equity | Profitability |
| 2 | Return on assets | Net income divided by the total assets | Profitability |
| 3 | Net profit margin | Net income a firm makes for every $1 it generates in revenue | Profitability |
| 4 | Receivables turnover | Revenues for the period divided by receivables | Asset utilization |
| 5 | Inventory turnover | Revenues for the period divided by inventories | Asset utilization |
| 6 | Asset turnover | Revenue generated per dollar of assets a firm owns | Asset utilization |
| 7 | Current ratio | Total current assets divided by total current liabilities | Liquidity |
| 8 | Quick ratio | Total current assets minus inventory divided by total current liabilities | Liquidity |
| 9 | Debt to equity ratio | Long-term debt divided by shareholders' equity | Liquidity |
| 10 | Leverage ratio | Total assets divided by shareholders' equity | Leverage |
| 11 | Solvency ratio-I | Total liability divided by total assets | Leverage |
| 12 | Solvency ratio-II | Total liability divided by shareholders' equity | Leverage |
| 13 | Price to earnings (PE) ratio | Stock price divided by net income per share | Valuation |
| 14 | Price to book ratio | Stock price divided by shareholders' equity per common share | Valuation |
| 15 | Earnings per share (EPS) | Net income minus dividends divided by common shares | Profitability |
| 16 | Revenue growth rate | Current quarter's revenue divided by the previous quarter's revenue minus one | Growth |
| 17 | Net income growth rate | Current quarter's net income divided by the previous quarter's net income minus one | Growth |
| 18 | Earnings per share growth rate | Current quarter's EPS divided by the previous quarter's EPS minus one | Growth |

To illustrate the GDEA approach in Fig. 1, consider the universe of $I$ (= 18) parameters that are potential inputs and outputs. Suppose a given parameter $i$ may be used as an input and/or output, or not used at all. Furthermore, suppose the level at which a parameter must be specified in the DEA model in (2) is treated as unknown. Consequently, for a parameter $i$ with an observed value $x_{ij}$ for firm $j$, the level at which it enters the model as an input is denoted by $y_i x_{ij}$, where the input scaling variable $y_i \ge 0$. Similarly, the level at which the parameter $i$ enters as an output for firm $j$ is $z_i x_{ij}$, with the output scaling variable $z_i \ge 0$. Collecting the $y_i$ and $z_i$ components for all parameters, we define an input scaling vector $y \in \mathbb{R}^I$ and an output scaling vector $z \in \mathbb{R}^I$. An appropriate selection of values for the pair $(y, z) \in \mathbb{R}^{2I}$ is not a firm-specific issue. Rather, it must be chosen as a property of the industry, i.e., the group of firms, so that the relative performance scores of firms within the same industry can be compared with each other. More importantly, such a performance score must represent the fundamental financial health of a firm that is predictive of (or highly correlated with) the stock price action. Therefore, the vector $(y, z)$ is to be held fixed when computing efficiencies of all $J$ firms in the group. Under the
scaling vector parametrization (y, z), the resulting DEA model is
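$$
\begin{aligned}
g_k(y, z) := \max_{u,v}\ & \frac{\sum_{i=1}^{I} z_i x_{ik}\, v_{ik}}{\sum_{i=1}^{I} y_i x_{ik}\, u_{ik}} \\
\text{s.t.}\ & \frac{\sum_{i=1}^{I} z_i x_{ij}\, v_{ik}}{\sum_{i=1}^{I} y_i x_{ij}\, u_{ik}} \le 1, \quad j = 1, \ldots, J, \\
& u_{ik},\ v_{ik} \ge 0, \quad i = 1, \ldots, I,
\end{aligned} \tag{3}
$$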
where $y$ is chosen such that $\sum_{i=1}^{I} y_i > 0$. $g_k(y, z)$ is simply referred to as the (financial) performance score of firm $k$ corresponding to the input/output scaling vector pair $(y, z)$. The following equivalent linear programming model can be used to compute $g_k(y, z)$.
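$$
\begin{aligned}
g_k(y, z) := \max_{u,v}\ & \sum_{i=1}^{I} z_i x_{ik}\, v_{ik} \\
\text{s.t.}\ & \sum_{i=1}^{I} y_i x_{ik}\, u_{ik} = 1, \\
& -\sum_{i=1}^{I} y_i x_{ij}\, u_{ik} + \sum_{i=1}^{I} z_i x_{ij}\, v_{ik} \le 0, \quad j = 1, \ldots, J, \\
& u_{ik},\ v_{ik} \ge 0, \quad i = 1, \ldots, I.
\end{aligned} \tag{4}
$$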
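The linear program in (4) can be handed to any LP solver. The following is a minimal sketch, assuming NumPy and SciPy are available; the function name `gdea_score`, the `(I, J)` data layout, and the choice of returning 0 for an infeasible program (mirroring the convention the text gives for model (2)) are illustrative assumptions, not part of the paper.

```python
import numpy as np
from scipy.optimize import linprog


def gdea_score(x, y, z, k):
    """Sketch of LP (4) for firm k.

    x : (I, J) array, x[i, j] = value of parameter i for firm j
    y, z : length-I 0/1 input and output scaling vectors
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    I, J = x.shape

    # Decision vector is [u_1..u_I, v_1..v_I]. linprog minimizes,
    # so the output-side objective is negated.
    c = np.concatenate([np.zeros(I), -(z * x[:, k])])

    # Normalization constraint: sum_i y_i x_ik u_i = 1.
    A_eq = np.concatenate([y * x[:, k], np.zeros(I)])[None, :]
    b_eq = [1.0]

    # One row per firm j: -sum_i y_i x_ij u_i + sum_i z_i x_ij v_i <= 0.
    A_ub = np.hstack([-(y[:, None] * x).T, (z[:, None] * x).T])
    b_ub = np.zeros(J)

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (2 * I), method="highs")

    # Assumption: an infeasible LP (all selected inputs non-positive)
    # is scored 0, as the text prescribes for model (2).
    return -res.fun if res.status == 0 else 0.0
```

For example, with one input and one output whose output/input ratios across three firms are 0.5, 1 and 0.5, the sketch scores the middle firm as efficient (1.0) and the other two at 0.5. Using the same parameter as both input and output returns a score of 1, which is the behaviour described in Proposition 2.1.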
In DEA, the issue of setting a given parameter in both the input and output sets simultaneously has been addressed in, for instance, Beasley (1995) and Cook et al. (2006). In the case of a CCR model, when a parameter is used both as an input and as an output, the resulting DEA efficiency is 1 for each firm. That is,

Proposition 2.1. For some parameter $i \in \{1, \ldots, I\}$, let $y_i > 0$ and $z_i > 0$. For a firm $k$ being evaluated, suppose the measured value of parameter $i$ satisfies $x_{ik} > 0$. Then $g_k(y, z) = 1$ holds.
[Figure 1: flowchart. Set inputs & outputs fixed; run DEA model (for each firm); strength metric for each firm; market correlation with strength; decision "Market correlation maximized?": if no, re-categorize inputs/outputs and repeat; if yes, STOP.]
Fig. 1. Schematic of the generalized DEA approach.
Proof. This can be shown in a manner similar to Lemma 1 and Theorem 1 in Cook et al. (2006). □

Therefore, if $y_i > 0$ and $z_i > 0$ for a parameter $i$ with data $x_{ij} > 0$ for all firms $j$, the DEA-based strength score is 1 for all firms. For example, when the parameter $i$ is the current ratio (see Table 1), the data is always positive for all firms. In such a case, the correlation between the computed financial strength score and the stock market performance is zero, and thus such a choice of $(y, z)$ will not maximize the desired strength-market correlation, see Fig. 1. Consequently, to reduce the search space for $(y, z)$ in correlation maximization, we set $y_i z_i = 0$ for all $i = 1, \ldots, I$. This prohibits a given financial parameter $i$ from being in the inputs and outputs simultaneously.
Definition 2.2. A given vector pair $(y, z)$ is said to satisfy the complementarity condition if and only if $y_i z_i = 0$ for all $i = 1, \ldots, I$. In this case, such a pair is simply referred to as a complementary pair $(y, z)$.
Therefore, a complementary pair $(y, z)$ allows the categorization of the universe of $I$ parameters as distinct inputs and outputs. In contrast, Cook et al. (2006) introduced the notion of flexible measures, whereby a new parameter can be considered in the presence of existing input/output sets. Their model then determines whether this new parameter should be an input or an output in order to improve the (maximized) DEA efficiency. In our case, the objective is to have the highest correlation between DEA efficiencies and the stock market returns. Therefore, we take a different approach that allows the complementary vector pairs $(y, z)$ to play the role of flexible measures in a more generalized setting. For this purpose, the domain of $(y, z)$ must be appropriately chosen to enforce which parameters should never (or must) be in the inputs/outputs.
Another important property of the CCR model is its unit-invariance. That is, the DEA efficiency computed by (1) is independent of the units in which the input and output parameters are measured, see Ray (2004, pp. 106–107) and Lovell and Pastor (1995). Stated in our context,

Proposition 2.3. $g_k(y, z)$ is positively homogeneous of degree 0 in $(y, z)$, jointly and separately.

Proof. Follows directly from the proof of unit invariance of the CCR-DEA model, see, for instance, Ray (2004). □
The main implication of Proposition 2.3 is that it restricts the domain of feasible complementary vector pairs $(y, z)$ to a binary space. Thus, along with the complementarity condition in Definition 2.2, the feasible domain of the scaling vectors $(y, z)$ in (4) must satisfy

$$
y_i z_i = 0; \quad y_i, z_i \in \{0, 1\}, \quad i = 1, \ldots, I. \tag{5}
$$

An equivalent linear transformation of (5), along with the condition that $\sum_i y_i > 0$, yields the following Binary Complementary Domain (BCD), denoted by $\Omega$, for the feasible choices for $(y, z)$:

$$
\text{BCD:}\quad \Omega := \Big\{ (y, z) : \sum_{i=1}^{I} y_i \ge 1;\ y_i + z_i \le 1;\ y_i, z_i \in \{0, 1\},\ i = 1, \ldots, I \Big\}. \tag{6}
$$
Accordingly, for every firm $k$ in the group (i.e., industry), the corresponding financial performance score $g_k(y, z)$ is determined by the model in (4) for a specified binary complementary vector pair $(y, z) \in \Omega$. The goal is to search for $(y, z) \in \Omega$ such that the performance score so computed would be a suitable metric of the underlying financial strength of a given firm, relative to all firms in the group.
When the model in (4) is specified using parameters under the BCD condition in (6), which requires choosing $(y, z) \in \Omega$, it is herein referred to as GDEA under unrestricted BCD, or simply the unrestricted GDEA version. On the other hand, we also consider a certain restriction on the binary complementary domain, based on a practical interpretation of the financial perspectives in Table 1, as given next.
2.2. Restricted BCD
The parameters of the asset utilization, liquidity, and leverage perspectives can generally be interpreted as inputs because the activities that are measured by these parameters depend on the planning and operational strategies of a firm. On the other hand, the parameters of the profitability and growth perspectives are generally considered as outputs because revenue/income generation is a major objective criterion for a firm. The valuation parameters measure how well the equity markets perceive the success of a firm, and they are generally not concerned with a firm's input strategy. Accordingly, in a restricted GDEA approach, input parameters are chosen only from the perspectives of asset utilization, liquidity, and leverage, while the output parameters are chosen only from the profitability, growth, and valuation perspectives. This leads to the following restricted binary complementary domain,
$$
\text{Restricted BCD:}\quad \Omega^* := \Big\{ (y, z) \in \Omega : \sum_{i=1}^{3} y_i + \sum_{i=13}^{18} y_i = 0;\ \sum_{i=4}^{12} z_i = 0 \Big\}. \tag{7}
$$
Performance of the unrestricted and restricted GDEA versions will be compared within portfolio optimization using the application reported in Section 5. In the sequel, we will also compare results with a fixed exogenous input/output categorization, referred to as a base categorization and denoted by $(y^0, z^0)$,

$$
\begin{aligned}
y^0 &:= \{0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0\}, \\
z^0 &:= \{1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1\}.
\end{aligned} \tag{8}
$$

Thus, the base categorization is completely determined by the interpretation of financial parameters, rather than by their usefulness in determining a financial performance score. Note that $(y^0, z^0) \in \Omega^*$ and that this base categorization consists of all 18 financial parameters given in Table 1. The interest in $(y^0, z^0)$ is merely for comparison with $(y, z)$ in $\Omega$ (or $\Omega^*$) that might better represent the underlying financial strength of a firm.
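The domain conditions in (6) and (7) and the base categorization in (8) are easy to check mechanically. The sketch below is illustrative only: the function names `in_bcd` and `in_restricted_bcd` are hypothetical, and the 0-based slices are an assumed mapping of the 1-based parameter indices of Table 1.

```python
I = 18  # number of financial parameters in Table 1


def in_bcd(y, z):
    """Membership in the BCD (6): binary entries, at least one input,
    and no parameter used as both input and output."""
    return (len(y) == I and len(z) == I
            and all(v in (0, 1) for v in list(y) + list(z))
            and sum(y) >= 1
            and all(yi + zi <= 1 for yi, zi in zip(y, z)))


def in_restricted_bcd(y, z):
    """Membership in the restricted BCD (7)."""
    # Parameters 1-3 and 13-18 are never inputs (0-based slices 0:3, 12:18).
    no_forbidden_inputs = sum(y[0:3]) + sum(y[12:18]) == 0
    # Parameters 4-12 are never outputs (0-based slice 3:12).
    no_forbidden_outputs = sum(z[3:12]) == 0
    return in_bcd(y, z) and no_forbidden_inputs and no_forbidden_outputs


# Base categorization (8).
y0 = [0] * 3 + [1] * 9 + [0] * 6
z0 = [1] * 3 + [0] * 9 + [1] * 6

print(in_bcd(y0, z0), in_restricted_bcd(y0, z0))
```

A pair that uses return on equity (parameter 1) as an input can still lie in the unrestricted domain but fails the restricted check.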
3. Relative financial strength indicator (RFSI)
The process of determining an RFSI requires, first, determining a correlation metric for the DEA-based performance scores and the stock price returns, for the industry as a whole, for a given vector pair $(y, z)$, and second, designing a suitable iterative procedure to choose $(y, z) \in \Omega$ (or $\Omega^*$) in an attempt to maximize the latter correlation metric (see Fig. 1).
Let the DEA-based performance score for a firm $k$ in a given industry be determined according to the model in (4) as $g_k(y, z)$, for a specified categorization $(y, z) \in \Omega$. The RFSI is developed here for the unrestricted $\Omega$; for the restricted version of RFSI, $\Omega$ is simply replaced with $\Omega^*$. Computing the model in (4) requires the realized values $x_{ij}$ of all financial parameters for all firms. The future value of a parameter $i$ for firm $j$ is a random variable, denoted by $X_{ij}$. The collection of random variables $X_{ij}$ for $i = 1, \ldots, I$ (= 18) and $j = 1, \ldots, J$ is $X$. Realizations of $X_{ij}$ are observed as $x_{ij}$ in (published) financial statements of a given period (i.e., quarter). For a future period $t$ of uncertain financial performance, the collection of random variables is the vector $X^t := \{X^t_{ij} : \forall i, \forall j\}$. Then, the DEA-based relative financial efficiency for the industry is represented by the collection of random variables $g^t(y, z) := \{g_j(y, z; X^t) : j = 1, \ldots, J\}$. Once the period $t$ financial statements are observed, with $X^t$ realized as $x^t$, the random vector $g^t(y, z)$ is realized as the vector of values $\{g_j(y, z; x^t) : j = 1, \ldots, J\}$.
Let $R^t_j$ denote the stock price rate of return (RoR) random variable (for future period $t$) of firm $j$, and let those for all firms be represented by the random $J$-vector $R^t := \{R^t_j : j = 1, \ldots, J\}$. The observed realization of the period $t$ RoR is the vector $r^t := \{r^t_j : j = 1, \ldots, J\}$. Consider the pairwise correlations between the two random vectors $g^t(y, z)$ and $R^t$, denoted by the correlation vector $C^t(y, z) \in \mathbb{R}^J$. Its $j$th component, for firm $j$, is given by

$$
C^t_j(y, z) := \mathrm{Corr}\{ g_j(y, z; X^t),\ R^t_j \}, \tag{9}
$$
where $j = 1, \ldots, J$. The correlation vector $C^t(y, z)$ is, therefore, a measure of the predictive power of the DEA-based (financial) efficiency metric on stock price returns for the chosen industry. Indeed, a positive and significant correlation vector $C^t(y, z)$ implies that the DEA-based efficiency score is a valuable proxy of the stock market performance of the industry. Observe that $C^t(y, z)$ for period $t$ depends on the chosen binary complementary vector $(y, z) \in \Omega$. The best industry correlation is thus obtained when one searches for $(y, z) \in \Omega$ such that an appropriate metric of the vector $C^t(y, z)$ is maximized. Vector norms cannot be used as appropriate metrics here because the goal is to seek positive (and large) correlations across all firms in the industry. While more complicated formulae are possible, we use the simple average statistic

$$
\bar{C}^t(y, z) := \frac{1}{J} \sum_{j=1}^{J} C^t_j(y, z), \tag{10}
$$
herein termed the industry correlation metric, to search for the highest positive correlations industry-wide. Note that the correlation vector $C^t(y, z)$ is unknown for the future period $t$, and thus it must be forecasted. To forecast $C^t(y, z)$, we use the historical (observed) sample $x^s$, $s = 1, \ldots, t - 1$. Using a history length of $t_0$ periods, $C^t_j(y, z)$ is estimated by the sample correlation coefficient, given by

$$
c^t_j(y, z) := \text{correlation coefficient between } \{ g_j(y, z; x^s) \}_{s=t-t_0}^{t-1} \text{ and } \{ r^s_j \}_{s=t-t_0}^{t-1}. \tag{11}
$$

Then, the industry correlation metric $\bar{C}^t(y, z)$ in (10) is estimated as

$$
\bar{c}^t(y, z) := \frac{1}{J} \sum_{j=1}^{J} c^t_j(y, z). \tag{12}
$$
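The estimates in (11) and (12) amount to a per-firm Pearson correlation followed by a simple average. A minimal sketch in plain Python follows; the names `pearson` and `industry_correlation` are hypothetical, the inputs are assumed to be already-computed score and return histories of equal length $t_0$, and treating a constant series as having zero correlation is an assumption made here for illustration (it matches the text's remark that identical scores of 1 give zero correlation).

```python
from math import sqrt


def pearson(a, b):
    """Sample correlation coefficient between two equal-length series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    s_aa = sum((p - mean_a) ** 2 for p in a)
    s_bb = sum((q - mean_b) ** 2 for q in b)
    if s_aa == 0 or s_bb == 0:
        return 0.0  # assumption: a constant series carries no correlation
    return s_ab / sqrt(s_aa * s_bb)


def industry_correlation(scores, returns):
    """scores[j] and returns[j] hold firm j's DEA scores and stock returns
    over the same t0 historical periods. Returns the average in (12)
    together with the firm-level estimates in (11)."""
    firm_corrs = [pearson(s, r) for s, r in zip(scores, returns)]
    return sum(firm_corrs) / len(firm_corrs), firm_corrs
```

In the full method, each entry of `scores[j]` would itself come from solving model (4) for one historical period, which is why evaluating the metric for a single $(y, z)$ is costly.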
Observe that the statistic $\bar{c}^t(y, z)$ for period $t$ depends on the chosen binary complementary vector $(y, z) \in \Omega$. The best industry-correlation metric is thus obtained when one searches for $(y, z) \in \Omega$ such that $\bar{c}^t(y, z)$ is maximized, i.e., by solving the industry-correlation maximization model

$$
\text{CORMAX:}\quad \max_{y,z}\ \bar{c}^t(y, z) \quad \text{s.t.}\quad (y, z) \in \Omega. \tag{13}
$$
Let an optimal binary complementary pair solving the above maximization be denoted by $(y^*, z^*)$. Note that the dependence of this pair on the period index $t$ is suppressed. The corresponding industry correlation metric is $\bar{C}^t(y^*, z^*)$. The RFSI of firm $j$ for period $t$ is then based on the conditional expectation

$$
E\big[\, g_j(y^*, z^*; X^t) \mid g_j(y^*, z^*; x^{t-t_0}), \ldots, g_j(y^*, z^*; x^{t-1}) \,\big], \tag{14}
$$

where $g_j(y^*, z^*; x^s)$ is computed according to the DEA model in (4) for the input/output categorization $(y^*, z^*)$, and $E[\,\cdot\,]$ denotes the expectation operator.
To simplify the computation of RFSI, the expectation in (14) is estimated by the simple moving average forecast (of $\hat{t}$ periods, $\hat{t} \le t_0$), given as

$$
\mathrm{RFSI}(t, j, \hat{t}) = \frac{1}{\hat{t}} \sum_{s=t-\hat{t}}^{t-1} g_j(y^*, z^*; x^s). \tag{15}
$$

$\mathrm{RFSI}(t, j, \hat{t})$ is bounded within 0 and 1, where a value of unity indicates the highest possible relative financial strength indicator for firm $j$, relative to the industry concerned. Also note that a single input/output categorization $(y^*, z^*)$ of the 18 financial parameters in Table 1 is used in computing the RFSI for all firms in the industry, for the future period $t$. For future periods beyond $t$, it may be necessary to adapt RFSI to new financial statement observations, by re-solving (13) for a revised optimal input/output categorization.
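The moving-average forecast in (15) can be sketched in a few lines. The function name `rfsi` and the convention that the last list element is the period $t-1$ score are assumptions for illustration only.

```python
def rfsi(score_history, t_hat):
    """Moving-average estimate (15).

    score_history : firm j's scores g_j(y*, z*; x^s) in chronological
                    order, with the last entry being period t-1
    t_hat         : number of most recent periods to average
    """
    if not 1 <= t_hat <= len(score_history):
        raise ValueError("t_hat must be between 1 and the history length")
    window = score_history[-t_hat:]
    return sum(window) / t_hat


# Average of the two most recent scores of a hypothetical firm.
print(rfsi([0.2, 0.4, 0.6, 0.8], 2))
```

Because each score lies in [0, 1], the average does too, consistent with the bound stated above.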
3.1. Two-step heuristic solution method
The CORMAX model in (13) is a difficult optimization problem because evaluation of the objective function (statistic) $\bar{c}^t(y, z)$ in (12) requires the solution of a sequence of linear optimization models (4) so that each of the sample correlation coefficients $c^t_j(y, z)$ in (11) can be computed. Therefore, the objective function in (13) cannot be written explicitly in closed form, nor can it be verified to be concave (or pseudo-concave) in the $2I$-dimensional decision variable vector $(y, z)$. Nonconvex optimization is known to be computationally tedious, see, for instance, Horst et al. (1995). Moreover, $\Omega$ is a binary solution
space, i.e., (13) is a binary nonconvex optimization model. Global optimality conditions for discrete nonconvex optimization have been studied, e.g., see Larsson and Patriksson (2005). However, efficient methods are available only for specially structured problems and/or without integer restrictions, e.g., see Tawarmalani and Sahinidis (2004) and Zhang et al. (1999).
Alternatively, we employ an efficient heuristic solution scheme. The method is a two-step procedure, which is based on, first, sampling a set of initial $(y, z)$ points from the feasible domain $\Omega$, and then, performing a local search optimization in $\Omega$ for each of those initial sample points. Consider a random sample of (vector) points $\omega^s := (y^s, z^s) \in \Omega \subset \mathbb{R}^{2I}$, for $s \in S$, where $S$ denotes the index set of the sample points. For each sample point, the objective criterion is calculated and the sample of industry correlation metric values

$$
\{ \bar{c}^t(y^s, z^s) : s \in S \} \tag{16}
$$

is collected. Then, each sample value $\bar{c}^t(y^s, z^s)$ is improved to a locally optimal value by employing a non-gradient-based local search procedure, starting from the point $\omega^s \in \Omega$. The corresponding local optimum is denoted by $\tilde{\omega}^s := (\tilde{y}^s, \tilde{z}^s)$. Then, an approximation for the optimal input/output categorization for the industry is determined by

$$
(y^*, z^*) \approx \arg\max_{s \in S} \{ \bar{c}^t(\tilde{y}^s, \tilde{z}^s) \}. \tag{17}
$$
The following procedure is applied to generate a (random) sample point $\omega^s \in \Omega$: randomly draw a set of $2I$ values from a continuous uniform distribution in $[-1, 1]$. The first $I$ values are collected to form the $I$-vector $a^s$. The last $I$ values are collected to form the $I$-vector $b^s$. Then, the sample point $\omega^s = (y^s, z^s)$ is defined by the solution of the binary linear program

$$
(y^s, z^s) := \arg\max_{y,z} \{ (a^s)'y + (b^s)'z : (y, z) \in \Omega \}, \tag{18}
$$

where a prime denotes the transposition of a vector. This process is then repeated for each $s \in S$.
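Because the objective in (18) is separable and the only coupling constraint is $\sum_i y_i \ge 1$, the binary program can be solved without a general integer solver. The sketch below is one way to do this, not necessarily the method used in the paper; the names `solve_binary_lp` and `sample_point` are hypothetical.

```python
import random


def solve_binary_lp(a, b):
    """Maximize a'y + b'z over the BCD (6): binary y, z with
    y_i + z_i <= 1 for every i and sum(y) >= 1."""
    n = len(a)
    y, z = [0] * n, [0] * n
    # Without the coupling constraint, each parameter independently takes
    # the best of: input (gain a_i), output (gain b_i), or unused (gain 0).
    for i in range(n):
        if a[i] > 0 and a[i] >= b[i]:
            y[i] = 1
        elif b[i] > 0:
            z[i] = 1
    if sum(y) == 0:
        # At least one input is required: switch the parameter whose
        # reassignment to "input" loses the least objective value.
        i = min(range(n), key=lambda i: max(a[i], b[i], 0.0) - a[i])
        y[i], z[i] = 1, 0
    return y, z


def sample_point(n, rng):
    """Draw a^s and b^s uniformly from [-1, 1] and solve (18)."""
    a = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return solve_binary_lp(a, b)


y_s, z_s = sample_point(18, random.Random(0))
```

Passing an explicit `random.Random` instance keeps the sample reproducible; a restricted domain such as (7) would need additional handling that this sketch does not include.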
3.1.1. Local search
The non-gradient-based local search procedure is a modification of the Hooke and Jeeves (HJ) method, see Bazaraa et al. (1993). Given a current solution $\omega^p$ at some iteration $p$ of the local search procedure, the original HJ method performs an exploratory search along the coordinate directions. Coordinate directions that improve the objective function are used to define a new iterate. The direction to the new iterate from the starting solution $\omega^p$ is used to perform a pattern search. This method is adapted to the binary model in (13).
For some candidate $\omega^p \in \Omega$, the $i$th coordinate $\omega^p_i$ is either 0 or 1. If $\omega^p_i = 0$, then an exploratory move is allowed only in the positive ($+\omega_i$) coordinate direction. If $\omega^p_i = 1$, then an exploratory move is allowed only in the negative ($-\omega_i$) coordinate direction. Once a new iterate $\hat{\omega}^p$ is so determined, a pattern (line) search is not necessary in our case since the search point $\omega^p + \lambda(\hat{\omega}^p - \omega^p) \notin \Omega$ for $\lambda \notin \{0, 1\}$. The resulting algorithmic steps are as follows:
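Algorithm-LS
Initialization: Given $\omega^s = (y^s, z^s) \in \Omega$, see (18), determine $\bar{c}^t(\omega^s)$.
Set $p = 1$, $f(p) = \bar{c}^t(\omega^s)$, and $\omega(p) = \omega^s$.
Step 1: For $i = 1, \ldots, 2I$, and denoting the $i$th elementary coordinate direction by $e_i$, let
$\omega_i := \omega(p) + e_i$ if $\omega(p) + e_i \in \Omega$ and $f(p) < \bar{c}^t(\omega(p) + e_i)$;
else, $\omega_i := \omega(p) - e_i$ if $\omega(p) - e_i \in \Omega$ and $f(p) < \bar{c}^t(\omega(p) - e_i)$;
else, $\omega_i := \omega(p)$.
Compute $\omega_{2I+1}$ by the XOR ("exclusive or", or "not equal to") operation:
$\omega_{2I+1} := \omega_1 \text{ xor } \omega_2 \text{ xor } \ldots \text{ xor } \omega_{2I}$.
If $\omega_{2I+1} \notin \Omega$, set $\omega_{2I+1} = \omega(p)$.
Step 2: Let $\omega(p+1) := \arg\max \{ \bar{c}^t(\omega_i) : i = 1, \ldots, 2I+1 \}$.
If $\bar{c}^t(\omega(p+1)) = \bar{c}^t(\omega(p))$: terminate the local search and set $\tilde{\omega}^s = \omega(p)$.
Else, if $\bar{c}^t(\omega(p+1)) > \bar{c}^t(\omega(p))$, let $f(p+1) = \bar{c}^t(\omega(p+1))$, set $p \leftarrow p + 1$ and go to Step 1.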
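The steps of Algorithm-LS can be sketched as a coordinate-flip local search over a binary vector. This is a hedged illustration rather than the authors' implementation: the function name `local_search` and its callback arguments are hypothetical, and the combined candidate is interpreted here as applying every improving single-coordinate flip at once, which is an assumption about the intent of the XOR step.

```python
def local_search(omega0, objective, in_domain):
    """Binary coordinate search in the spirit of Algorithm-LS.

    omega0    : starting 0/1 vector, e.g. y followed by z (length 2I)
    objective : callable returning the value to maximize
    in_domain : callable returning True if a vector is feasible
    """
    current = tuple(omega0)
    f_cur = objective(current)
    while True:
        candidates = []
        combined = list(current)
        for i in range(len(current)):
            trial = list(current)
            trial[i] = 1 - trial[i]  # the only binary move along coordinate i
            trial = tuple(trial)
            if in_domain(trial) and objective(trial) > f_cur:
                candidates.append(trial)
                combined[i] = trial[i]
        # Assumed reading of the XOR step: apply all improving flips together.
        combined = tuple(combined)
        if combined != current and in_domain(combined):
            candidates.append(combined)
        if not candidates:
            return current, f_cur  # no improving move: local optimum
        best = max(candidates, key=objective)
        current, f_cur = best, objective(best)
```

Each pass accepts only a strictly better point, so the loop terminates on a finite binary domain. In the actual method, `objective` would be the industry correlation estimate in (12), which is expensive, so caching its values per vector would be a natural refinement.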
4. Portfolio selection using RFSI
Asset allocation is the practice of dividing resources among different categories such as stocks, bonds, real estate, cash equivalents, etc. Mathematical optimization models are often used for asset allocation with the intent of reducing risk exposure, since each asset class has a different correlation to the others. Within a given category, say stocks, the portfolio manager must choose specific industries to invest in, and then specific firms must be selected within those industries. Indeed, the asset allocation and the choice of individual securities must be integrated with portfolio risk management. Furthermore, such portfolios must be rebalanced periodically (say, quarterly) to account for economic and market evolution. For integrated risk control with frequent rebalancing under portfolio optimization, see, for instance, Edirisinghe (2007) and the many references therein.
The early work on portfolio optimization dates back to Markowitz (1952), where a trade-off between portfolio expected return and portfolio variance is sought. While many variants of this approach have been proposed over the years, the fact remains that a universe of securities must be pre-selected prior to running a portfolio optimization model to determine optimal portfolio weights. While there is an increased research focus on risk specifications (or lack thereof) to capture the investor's risk attitude, it is imperative that an underlying universe of securities be carefully selected and their stochastic elements be accurately estimated. In fact, the latter two aspects are often the more dominant aspects in the practice of portfolio management. In this section, we focus on the question of selecting a universe of securities for portfolio risk management, by applying the RFSI concept developed in the preceding sections. Then, portfolio optimization is performed to determine optimal weights on individual securities.
Consider a stock portfolio investment problem where a given budget must be allocated among stocks in $H$ different industries. Suppose there are $J_h$ stocks (firms) under
consideration in industry $h$, $h = 1, \ldots, H$. The fund allocation problem is considered in two stages: in the first stage, the problem is to choose the industries for investment of the given budget, and then to choose specific stocks within those industries as potential candidates in a portfolio. In the second stage, specific portfolio weights are assigned to the chosen candidate securities such that the investor's risk-return preferences are satisfied. The stage one problem is an asset selection problem and the stage two problem is a portfolio optimization problem. We consider applying a selection process based on RFSI to stage one. Note that financial statements of all firms in the $H$ industries (thus, a total of $J_0 = \sum_{h=1}^{H} J_h$ firms) must be used to collect data for the 18 financial parameters in Table 1.
Suppose the current time period is $t - 1$ and investment is desired for period $t$ (i.e., the subsequent quarter). Referring to the method of computing RFSI presented in Section 3, and using the publicly available financial data $x^s_h$ for periods $s < t$, the model for correlation maximization in (13) is solved to determine an optimal complementary binary pair $(y^*_h, z^*_h)$, for each industry $h = 1, \ldots, H$. The industry correlation metric corresponding to this optimized input/output categorization, $\bar{C}^t_h(y^*_h, z^*_h)$ in (10), must be verified to be statistically significant. Industry selection for portfolio optimization is based upon the significance of this correlation, tested via the sample estimate $\bar{c}^t_h(y^*_h, z^*_h)$ in (12).
4.1. Statistical tests of correlations
We are concerned with identifying industries that do not provide statistical evidence for RFSI-based predictability of stock returns, i.e., industries for which the industry correlation metric $\bar{C}^t_h(y^*_h, z^*_h)$ is not a significant positive value. Consider the following hypothesis test for a minimum positive correlation ($q_0$) for a given industry $h$,

$$
\begin{aligned}
H_0 &: \bar{C}^t_h(y^*_h, z^*_h) \le q_0, \\
H_1 &: \bar{C}^t_h(y^*_h, z^*_h) > q_0.
\end{aligned} \tag{19}
$$
The above null hypothesis $H_0$ indicates that DEA-based fundamental financial strength is not consistent with the efficient market theory for industry $h$. Note that $\bar{C}^t_h(y^*_h, z^*_h) := \frac{1}{J_h} \sum_{j=1}^{J_h} C^t_{j,h}(y^*_h, z^*_h)$, see (10), and that the firm-correlations $C^t_{j,h}(y^*_h, z^*_h)$ are estimated by $c^t_{j,h}(y^*_h, z^*_h)$ in (11). Consider the following inverse hyperbolic tangent transformation of $c^t_{j,h}(y^*_h, z^*_h)$:

$$
w_{j,h} := \tanh^{-1}\big( c^t_{j,h}(y^*_h, z^*_h) \big) = \frac{1}{2} \log_e \left( \frac{1 + c^t_{j,h}(y^*_h, z^*_h)}{1 - c^t_{j,h}(y^*_h, z^*_h)} \right). \tag{20}
$$
Using the results that $E[c^t_{j,h}] \approx C^t_{j,h}$ and $\mathrm{Var}[c^t_{j,h}] \approx 1 - (C^t_{j,h})^2$, R.A. Fisher (1890–1962) showed that $w_{j,h}$ is approximately normally distributed,

$$
w_{j,h} \approx \mathrm{Normal}\left( \frac{1}{2} \log_e \left( \frac{1 + C^t_{j,h}(y^*_h, z^*_h)}{1 - C^t_{j,h}(y^*_h, z^*_h)} \right),\ \frac{1}{t_0 - 3} \right), \tag{21}
$$

see Tamhane and Dunlop (2000), where $t_0$ is the number of periods used in the estimation in (11). Next, define the industry-average statistic by

$$
\bar{w}_h := \frac{1}{J_h} \sum_{j=1}^{J_h} w_{j,h}, \tag{22}
$$
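which is a scaled sum of $J_h$ normal random variables, and thus $\bar{w}_h$ is normally distributed:

$$
\bar{w}_h \approx \mathrm{Normal}\left( \frac{1}{2 J_h} \sum_{j=1}^{J_h} \log_e \left( \frac{1 + C^t_{j,h}(y^*_h, z^*_h)}{1 - C^t_{j,h}(y^*_h, z^*_h)} \right),\ \hat{\sigma}^2 \right), \tag{23}
$$

where the variance of $\bar{w}_h$ is given by

$$
\hat{\sigma}^2 := \frac{1}{J_h (t_0 - 3)} + \frac{2}{J_h^2} \sum_{j<k} \mathrm{Cov}(w_{j,h}, w_{k,h}). \tag{24}
$$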
$\mathrm{Cov}(w_{j,h}, w_{k,h})$ in (24) depends on the covariance between $c^t_{j,h}(y^*_h, z^*_h)$ and $c^t_{k,h}(y^*_h, z^*_h)$. The latter two correlations are the point estimates of the firm-correlations $C^t_{j,h}(y^*_h, z^*_h)$ and $C^t_{k,h}(y^*_h, z^*_h)$, for firms $j$ and $k$. A firm-correlation measures the degree of association between a firm's financial strength and its stock price return, a process that may be expected to be fairly consistent across all firms in the industry. Therefore, when firms $j$ and $k$ operate independently of each other, the point estimates $c^t_{j,h}(y^*_h, z^*_h)$ and $c^t_{k,h}(y^*_h, z^*_h)$ of the firm-correlations $C^t_{j,h}(y^*_h, z^*_h)$ and $C^t_{k,h}(y^*_h, z^*_h)$, respectively, can be expected to be independent of each other as well. This independence assumption results in $\mathrm{Cov}(w_{j,h}, w_{k,h}) = 0$, and thus the parameters of the distribution of $\bar{w}_h$ are (approximately) known once the values of $C^t_{j,h}(y^*_h, z^*_h)$ are known.
Observe that under the equality sign in the null hypothesis in (19), one has reference only to the industry correlation metric, i.e., $\bar{C}^t_h(y^*_h, z^*_h) = q_0$; however, we need knowledge of the individual firm-correlations $C^t_{j,h}(y^*_h, z^*_h)$. Let the latter correlations be given by

$$
C^t_{j,h}(y^*_h, z^*_h) = q_0\, \theta_{j,h} \quad \text{for } j = 1, \ldots, J_h, \tag{25}
$$

where $\theta_{j,h}$ must satisfy the requirements:

$$
\sum_{j=1}^{J_h} \theta_{j,h} = J_h \quad \text{and} \quad -\frac{1}{q_0} \le \theta_{j,h} \le \frac{1}{q_0} \quad \forall j = 1, \ldots, J_h. \tag{26}
$$
The constraints in (26) follow from the fact that the industry-correlation metric is $q_0$ (specified as a positive value) and that firm-correlations are bounded within $-1$ and $+1$. Then, denoting

$$
w_0(\theta) := \frac{1}{2 J_h} \sum_{j=1}^{J_h} \log_e \left( \frac{1 + q_0 \theta_{j,h}}{1 - q_0 \theta_{j,h}} \right) \quad \text{and} \quad \sigma^2 := \frac{1}{J_h (t_0 - 3)}, \tag{27}
$$

it follows that $\bar{w}_h \approx \mathrm{Normal}(w_0(\theta), \sigma^2)$. For finiteness of the mean $w_0(\theta)$, the inequalities in (26) must be satisfied as strict inequalities, i.e., $-\frac{1}{q_0} < \theta_{j,h} < \frac{1}{q_0}$, $j = 1, \ldots, J_h$. Then, for an $\alpha$-significance level and for the one-sided test, $H_0$ is accepted if

$$
\sqrt{J_h (t_0 - 3)}\ \big( \bar{w}_h - w_0(\theta) \big) \le Z^{-1}(1 - \alpha), \tag{28}
$$

or,

$$
\bar{w}_h \le w_0(\theta) + \frac{Z^{-1}(1 - \alpha)}{\sqrt{J_h (t_0 - 3)}}, \tag{29}
$$

where $Z^{-1}$ is the inverse c.d.f. of a standard normal random variable. Therefore, to test $H_0$ and conclude that the DEA-based relative strength does not provide sufficient explanatory power for stock price returns in industry $h$, specific $\theta$ values are required. Such information is not available, nor can it be estimated. However, if $H_0$ is accepted for the smallest
(threshold) value of the right hand side in (29) over all possible h, then, indeed H
0
is ac-
cepted for the industry h. Dene,
w
min
0
: inf
h
w
0
h :
J
h
j1
h
j;h
J
h
;
1
q
0
< h
j;h
<
1
q
0
; j 1; . . . ; J
h
_ _
: 30
Hence, if
w
h
6 w
min
0
Z
1
1a
J
h
t
0
3
p
, then H
0
is accepted for the industry h.
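The acceptance rule in (29) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that each $w_{j,h}$ is the Fisher transform of the firm-correlation estimate and that $\bar{w}_h$ is their simple average (both are defined earlier in the paper and are not restated here). The function names, the argument `w0` (the mean under the null) and the sample numbers are hypothetical.

```python
import math
from statistics import NormalDist


def fisher_z(c):
    """Fisher transform 0.5 * log((1 + c) / (1 - c)) of a correlation c in (-1, 1)."""
    return 0.5 * math.log((1 + c) / (1 - c))


def h0_accepted(firm_corrs, w0, t0, alpha):
    """One-sided rule in the form of (29).

    Assumption: w_bar is the average Fisher-transformed firm correlation.
    Returns (accepted, w_bar, threshold).
    """
    J_h = len(firm_corrs)
    w_bar = sum(fisher_z(c) for c in firm_corrs) / J_h
    z = NormalDist().inv_cdf(1 - alpha)          # Z^{-1}(1 - alpha)
    threshold = w0 + z / math.sqrt(J_h * (t0 - 3))
    return w_bar <= threshold, w_bar, threshold


# Hypothetical firm-correlation estimates for one industry, t0 = 27 quarters.
accepted, w_bar, threshold = h0_accepted(
    [0.10, 0.25, -0.05, 0.30], w0=fisher_z(0.2), t0=27, alpha=0.05
)
```

Passing `w0` explicitly keeps the sketch neutral about how the mean under the null is obtained.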
Proposition 4.1. For $0 < \rho_0 < 1$,

$$w_0^{\min} = \frac{1}{2} \log_e\!\left(\frac{1 + \rho_0}{1 - \rho_0}\right). \tag{31}$$
Proof. Note that $w_0(\theta)$ is nonconvex: it is convex in the positive orthant and concave in the negative orthant. Consider the (relaxed) minimization problem:

$$Z := \min_{\theta} \left\{ w_0(\theta) : \sum_{j=1}^{J_h} \theta_{j,h} = J_h \right\} \tag{32}$$

and thus, $Z \le w_0^{\min}$. Since the constraints of (32) are linear, Constraint Qualification (CQ) is satisfied at all feasible solutions, thus implying that every optimal solution of (32) must be a Karush–Kuhn–Tucker (KKT) point, see Bazaraa et al. (1993). Denoting the Lagrange multiplier associated with the equality constraint by $\lambda$, the KKT conditions yield,

$$\frac{2\rho_0}{1 - (\rho_0\theta_{j,h})^2} + \lambda = 0 \quad \forall j = 1, \ldots, J_h. \tag{33}$$
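As a check on (33), differentiating $w_0(\theta)$ in (27) with respect to a single $\theta_{j,h}$ gives the expression below. This is a short worked step added for clarity; it assumes the constant factor $1/(2J_h)$ is absorbed into the multiplier.

```latex
\frac{\partial w_0}{\partial \theta_{j,h}}
  = \frac{1}{2J_h}\left(\frac{\rho_0}{1+\rho_0\theta_{j,h}}
      + \frac{\rho_0}{1-\rho_0\theta_{j,h}}\right)
  = \frac{1}{2J_h}\cdot\frac{2\rho_0}{1-(\rho_0\theta_{j,h})^2}.
```

Because the single equality constraint contributes the same multiplier for every $j$, this derivative must take a common value across firms, so $(\rho_0\theta_{j,h})^2$, and hence $|\theta_{j,h}|$, is the same for all $j$.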
Therefore, $\theta_{j,h} = m_j a$ must hold for all $j = 1, \ldots, J_h$, where $m_j$ is $+1$ or $-1$ and $a$ is a positive constant. The equality constraint thus implies that $\sum_{j=1}^{J_h} m_j = J_h/a > 0$, which is the net count of positive values in $\theta_{j,h}$ for $j = 1, \ldots, J_h$, and thus, $\sum_{j=1}^{J_h} m_j \in \{1, 2, \ldots, J_h\}$. That is, $a$ can take on the set of possible discrete values $\left\{1, \frac{J_h}{J_h - 1}, \frac{J_h}{J_h - 2}, \ldots, \frac{J_h}{2}, J_h\right\}$. Each of these values for $a$ defines a distinct KKT point provided the objective function is well-defined at those points, which is given by

$$\frac{1}{2J_h}\left[\sum_{j : m_j = +1} \log_e\!\left(\frac{1 + a\rho_0}{1 - a\rho_0}\right) + \sum_{j : m_j = -1} \log_e\!\left(\frac{1 - a\rho_0}{1 + a\rho_0}\right)\right] = \frac{1}{2J_h}\left(\sum_{j=1}^{J_h} m_j\right) \log_e\!\left(\frac{1 + a\rho_0}{1 - a\rho_0}\right)$$

since the following identity holds:

$$\log_e\!\left(\frac{1 + \rho_0 a}{1 - \rho_0 a}\right) + \log_e\!\left(\frac{1 - \rho_0 a}{1 + \rho_0 a}\right) = 0.$$

Therefore, the objective value associated with each KKT-point is given by $\frac{1}{2J_h}\,\frac{J_h}{a}\,\log_e\!\left(\frac{1 + a\rho_0}{1 - a\rho_0}\right)$ provided $a \le \frac{1}{\rho_0}$. Then, the minimum in (32) is obtained by
$$Z = \min\left\{ \frac{1}{2a} \log_e\!\left(\frac{1 + a\rho_0}{1 - a\rho_0}\right) : a = 1, \frac{J_h}{J_h - 1}, \frac{J_h}{J_h - 2}, \ldots, \min\{J_h, 1/\rho_0\} \right\}. \tag{34}$$

Next, noting that the function $f(x) = \frac{1}{x} \log_e\!\left(\frac{1 + x\rho_0}{1 - x\rho_0}\right)$ is monotonically nondecreasing in $x$ for $x \in [1, 1/\rho_0]$, it follows that $a = 1$ is indeed the optimal solution in (34), which thus implies that each $m_j = +1$ at the optimum. That is, $\theta_{j,h} = 1$, $\forall j$, solves the relaxed problem in (32), which yields $Z = \frac{1}{2} \log_e\!\left(\frac{1 + \rho_0}{1 - \rho_0}\right)$. But, for $0 < \rho_0 < 1$, we have $-\frac{1}{\rho_0} < \theta_{j,h} = 1 < \frac{1}{\rho_0}$ for all $j$. This leads to the feasible point upper bound on the infimum in (30) as $w_0^{\min} \le Z$. Combining with $Z \le w_0^{\min}$, the proof is completed. $\square$
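The enumeration step in (34) can be reproduced numerically. The sketch below only evaluates the candidate values of $a$ listed in (34) and compares the smallest objective with the closed form in (31); it does not search the feasible set of (30) directly. The function names and the sample values $J_h = 14$, $\rho_0 = 0.3$ are illustrative assumptions.

```python
import math


def kkt_candidates(J_h, rho0):
    """Return (a, objective) pairs for the candidate values of a in (34)."""
    pairs = []
    for net in range(J_h, 0, -1):       # net = sum of m_j, from J_h down to 1
        a = J_h / net                   # a = 1 when every m_j = +1
        if a * rho0 >= 1.0:             # logarithm undefined once a * rho0 reaches 1
            break
        value = math.log((1 + a * rho0) / (1 - a * rho0)) / (2 * a)
        pairs.append((a, value))
    return pairs


def w0_min_closed_form(rho0):
    """Closed form in (31)."""
    return 0.5 * math.log((1 + rho0) / (1 - rho0))


candidates = kkt_candidates(J_h=14, rho0=0.3)
best_a, best_value = min(candidates, key=lambda pair: pair[1])
```

Under these inputs the smallest candidate objective occurs at `best_a == 1.0`, in line with the monotonicity of $f(x)$ used in the proof.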
4.2. Selection criteria for portfolio optimization
The foregoing statistical analysis can be used to determine an industry partition for investment. Given a set of industries $h = 1, \ldots, H$ for consideration in period $t$, an investment-worthy (screened) set $\mathcal{H}$ is determined under the Industry Selection Criterion given by

$$\text{ISC:} \quad \mathcal{H} := \left\{ h : \bar{w}_h > \frac{\kappa}{2} \log_e\!\left(\frac{1 + \rho_0}{1 - \rho_0}\right) + \frac{Z^{-1}(1 - \alpha)}{\sqrt{J_h(t_0 - 3)}}, \; h = 1, \ldots, H \right\}, \tag{35}$$
where $\kappa \ge 1$ is a user-specified (safety) factor. For each industry $h \in \mathcal{H}$, individual stocks $j$ are chosen from the given firms $j = 1, \ldots, J_h$ by using RFSI as a selection discriminator. Under Definition 3.1 and referring to the computation of RFSI in (15), for a given industry $h \in \mathcal{H}$, evaluate the moving average forecast over $\hat{t}$ periods,

$$\mathrm{RFSI}(t, j, \hat{t}) = \frac{1}{\hat{t}} \sum_{s = t - \hat{t}}^{t - 1} g_j(y_h, z_h, x_s). \tag{36}$$
The subset (of firms) $\mathcal{J}_h$ from industry $h \in \mathcal{H}$ is chosen for portfolio analysis under the Stock Selection Criterion given by

$$\text{SSC:} \quad \mathcal{J}_h := \left\{ j : \mathrm{RFSI}(t, j, \hat{t}) \ge R^*, \; j = 1, \ldots, J_h \right\}, \tag{37}$$

where $R^*$ is a prespecified threshold, with $0 < R^* \le 1$. The stocks in the subset $\mathcal{J}_h$, for $h \in \mathcal{H}$, are then expected to perform well in the stock market with high confidence. The universe of securities for portfolio analysis is thus given by

$$\mathcal{N} := \bigcup_{h \in \mathcal{H}} \mathcal{J}_h, \tag{38}$$
which is a subset of the original universe of stocks, i.e., $|\mathcal{N}| \le J_0 := \sum_{h=1}^{H} J_h$. The investment weight to be attached to each stock $j \in \mathcal{N}$ is then determined by a portfolio optimization model. There are several models in the literature for this purpose, and the choice of a model is primarily guided by risk/return considerations. Risk specifications are multi-pronged and portfolio optimization models are multi-faceted. For instance, when there are transactions and slippage costs of trading, portfolio drawdown characteristics are a major concern of risk. Also, when market evolutionary dynamics are nonstationary, multiperiod sequential stochastic decision optimization is shown to yield superior performance compared to static one period models, see Edirisinghe (2007) for details.
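The two screening steps, ISC in (35) and SSC in (37), and the resulting universe in (38) can be sketched as follows. This is an illustrative outline under stated assumptions: the per-period strength scores entering (36) and the industry statistic $\bar{w}_h$ are taken as given inputs, and the data layout, function names and toy numbers are hypothetical rather than taken from the paper.

```python
import math
from statistics import NormalDist


def isc_threshold(J_h, rho0, t0, alpha, kappa=1.0):
    """Right-hand side of (35) for an industry with J_h firms."""
    w0_min = 0.5 * math.log((1 + rho0) / (1 - rho0))
    z = NormalDist().inv_cdf(1 - alpha)
    return kappa * w0_min + z / math.sqrt(J_h * (t0 - 3))


def rfsi_forecast(scores, t_hat):
    """Moving average of the last t_hat per-period scores, as in (36)."""
    if len(scores) < t_hat:
        raise ValueError("need at least t_hat periods of scores")
    return sum(scores[-t_hat:]) / t_hat


def screen(industries, rho0, t0, alpha, kappa, t_hat, r_star):
    """Apply ISC, then SSC; return the universe as (industry, firm) pairs."""
    universe = []
    for name, data in industries.items():
        J_h = len(data["scores"])
        if data["w_bar"] <= isc_threshold(J_h, rho0, t0, alpha, kappa):
            continue                      # industry fails ISC
        for firm, scores in data["scores"].items():
            if rfsi_forecast(scores, t_hat) >= r_star:
                universe.append((name, firm))
    return universe


# Toy inputs (hypothetical): two industries, scores listed up to period t - 1.
toy = {
    "A": {"w_bar": 0.9, "scores": {"a1": [0.5, 0.9, 1.0, 0.95],
                                   "a2": [0.4, 0.5, 0.6, 0.3]}},
    "B": {"w_bar": 0.1, "scores": {"b1": [1.0, 1.0, 1.0, 1.0]}},
}
selected = screen(toy, rho0=0.2, t0=27, alpha=0.05, kappa=1.0, t_hat=3, r_star=0.9)
```

In this toy run industry B fails ISC regardless of its firm scores, which mirrors the ordering of the two criteria: stocks are examined only inside industries that pass the industry-level test.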
The focus here is to demonstrate the usefulness of the preceding selection criteria, (ISC) and (SSC), over the unscreened set of $J_0$ stocks. The specific advantage of our stock screening rules will be that stochastic parameter estimation will be performed in a reduced dimension using stocks that are likely to have strong performance. Reduced dimensions typically lead to smaller statistical estimation errors in RoR parameters. Consequently, a given portfolio optimization model is expected to provide better risk/return performance characteristics. To this end, we utilize the standard mean-variance portfolio optimization framework in a static one period setting. We compare the two models, the RFSI-based model (RMV) that uses the universe $\mathcal{N}$ and the standard mean-variance model (SMV) using all $J_0$ stocks. Portfolio weights carry the designation $w_j$ for stock $j$, with a superscript $R$ or $S$ indicating the type of model used to calculate the weights.
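$$\begin{aligned}
\text{RMV:} \quad \max_{w^R} \quad & \sum_{j \in \mathcal{N}} \mu_j w_j^R - \lambda \sum_{j,k \in \mathcal{N}} \sigma_{jk} w_j^R w_k^R \\
\text{s.t.} \quad & \sum_{j \in \mathcal{N}} w_j^R \le 1, \\
& w_j^R \ge 0, \quad j \in \mathcal{N}.
\end{aligned} \tag{39}$$

$$\begin{aligned}
\text{SMV:} \quad \max_{w^S} \quad & \sum_{j=1}^{J_0} \mu_j w_j^S - \lambda \sum_{j,k=1}^{J_0} \sigma_{jk} w_j^S w_k^S \\
\text{s.t.} \quad & \sum_{j=1}^{J_0} w_j^S \le 1, \\
& w_j^S \ge 0, \quad j = 1, \ldots, J_0.
\end{aligned} \tag{40}$$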
In (39) and (40), $\lambda$ is a risk tolerance parameter that trades off portfolio mean return with portfolio variance. Thus, a higher $\lambda$ is indicative of risk-averse portfolio weighting that strives for better diversification. Performance comparison of these two models is presented in the next section.
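Both (39) and (40) are the same quadratic program applied to different universes, so one routine can serve for RMV and SMV. The sketch below uses projected gradient ascent purely to stay dependency-free; the paper does not say which solver was used, and the function names, step size, iteration count and two-asset data are illustrative assumptions.

```python
def project(w):
    """Euclidean projection onto {w >= 0, sum(w) <= 1}."""
    clipped = [max(x, 0.0) for x in w]
    if sum(clipped) <= 1.0:
        return clipped
    # Budget is violated: project onto the simplex {w >= 0, sum(w) = 1}
    # using the standard sort-based rule.
    u = sorted(w, reverse=True)
    running, shift = 0.0, 0.0
    for i, ui in enumerate(u, start=1):
        running += ui
        t = (running - 1.0) / i
        if ui - t > 0:
            shift = t
    return [max(x - shift, 0.0) for x in w]


def mean_variance_weights(mu, cov, lam, steps=2000, lr=1.0):
    """Maximize mu'w - lam * w'cov w subject to sum(w) <= 1, w >= 0.

    lr and steps are untuned illustrative defaults; lr must be small
    relative to 1 / (lam * scale of cov) for the iteration to converge.
    """
    n = len(mu)
    w = [1.0 / n] * n
    for _ in range(steps):
        # Gradient of the objective for a symmetric covariance matrix.
        grad = [mu[j] - 2.0 * lam * sum(cov[j][k] * w[k] for k in range(n))
                for j in range(n)]
        w = project([w[j] + lr * grad[j] for j in range(n)])
    return w


# Hypothetical two-stock universe with uncorrelated returns.
weights = mean_variance_weights(mu=[0.10, 0.05],
                                cov=[[0.04, 0.0], [0.0, 0.01]],
                                lam=1.0)
```

Calling the routine with the means and covariances of the screened universe $\mathcal{N}$ corresponds to RMV, and calling it with all $J_0$ stocks corresponds to SMV. For this toy input the budget constraint binds and the optimality conditions give weights of 0.7 and 0.3.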
5. Application of RFSI in the technology sector
The DEA-based relative financial strength indicator, RFSI, is applied in portfolio optimization where several (publicly-traded) US companies in various industries are considered. The objective is to validate the use of RFSI-based stock selection as a means of improving risk/return performance of optimized portfolios. Quarterly financial statements of firms during the period 1996–2002 are used. Of the 28 consecutive quarters, the first quarter is set aside for the initial calculations of RoR, growth rates etc. Reported results pertain to a time window of $t_0 = 27$ quarters for industry-correlation maximization in (13). Only the technology sector is used for the experimentation, of which six industries are chosen: software & programming ($h = 1$), communications equipment ($h = 2$), computer hardware ($h = 3$), computer networks ($h = 4$), semiconductors ($h = 5$), and computer services ($h = 6$). Quarterly data for all firms in these 6 industries are electronically obtained from the WRDS (Wharton Research Data Services) database. The financial statement data, as well as quarterly stock price information, are checked for completeness and only those firms with complete data are chosen within each industry. Thus, the usable number of firms ($J_h$) in each industry is limited, and they are $J_1 = 52$, $J_2 = 51$, $J_3 = 14$, $J_4 = 17$, $J_5 = 75$, and $J_6 = 21$. Thus, the total number of firms is $J_0 = 230$.
In order to determine the optimal input/output categorization of the 18 financial parameters in Table 1, the two-step heuristic solution method in Section 3.1 is applied, where an initial sample $(y_s, z_s)$, $s \in S$, is determined using a sample of size $|S| = 20$. The corresponding sample of objective correlations in (16) is then computed. For each sample point, the objective industry-correlation value $c^t_h(y, z)$, being forecasted for period $t = 29$ (i.e., the first quarter of 2003), is improved via the local search procedure in Algorithm-LS, see Section 3.1.1. The optimal vector pairs $(y_h^*, z_h^*)$ corresponding to the largest correlation are then obtained, see Table 2. The notation "in", "out", or "–" indicates that a given financial statement parameter $i$ is an input, an output, or is not considered, respectively, in an industry $h$. These results pertain to the unrestricted version that uses $X$ in (6). Those for the restricted version $X^*$ in (7) are in parentheses in Table 2. The local search trajectory corresponding to the sample point that leads to the reported optimal input/output categorization is plotted, for the unrestricted domain $X$ and the restricted domain $X^*$, in Figs. 2 and 3, respectively, for each industry.
For industry-optimal $(y_h^*, z_h^*)$ categorization, for both cases of unrestricted and restricted domains, the industry-correlation metric $c^t_h(y_h, z_h)$ is estimated according to (11), and each industry is tested for statistical significance using the hypothesis test in (19).
Table 2
Optimal input/output categorization $(y_h^*, z_h^*)$ for RFSI in each industry (restricted-version entries in parentheses; "–" denotes a parameter that is not considered)

Financial       Industry (h)
parameter (i)   Software    Communic.   Hardware    Networks    Semicond.   Services
1               in (out)    – (–)       – (out)     – (–)       – (–)       – (out)
2               – (–)       – (–)       – (–)       out (out)   – (–)       – (–)
3               – (out)     – (–)       – (out)     out (out)   in (–)      out (–)
4               out (–)     – (in)      out (in)    in (in)     – (–)       out (in)
5               out (in)    – (in)      out (–)     – (–)       – (–)       – (–)
6               – (–)       – (–)       out (–)     – (–)       out (–)     – (–)
7               in (in)     – (in)      out (in)    – (–)       – (–)       out (in)
8               – (–)       in (–)      in (in)     – (–)       – (in)      in (in)
9               in (in)     – (–)       in (–)      in (in)     – (in)      – (–)
10              – (in)      in (–)      – (–)       in (in)     out (–)     – (–)
11              out (–)     – (–)       – (–)       – (–)       in (in)     in (–)
12              in (in)     out (–)     – (–)       – (–)       in (in)     in (–)
13              out (–)     – (–)       out (out)   – (–)       in (–)      – (out)
14              – (out)     in (out)    – (out)     out (out)   – (out)     – (out)
15              – (out)     out (out)   – (out)     – (–)       – (out)     – (–)
16              – (out)     out (out)   – (–)       – (–)       out (out)   – (out)
17              in (–)      – (–)       out (out)   – (–)       – (–)       – (out)
18              out (out)   – (–)       in (out)    – (–)       – (–)       out (–)
Max. Corr. $c^t_h(y_h, z_h)$