
Three meanings of intergenerational mobility: a follow-up
Gaston Yalonetzky∗
University of Leeds, OPHI
October 10, 2012

Abstract
Van de Gaer et al. (2001) identified three meanings of intergenerational mobility
(movement, equality of opportunity, equality of life chances) and, in the absence of
indices capturing the latter two meanings, made measurement proposals to fill that gap.
Their focus was on quantile transition matrices for discretized continuous variables.
This paper discusses the three meanings of intergenerational mobility when variables
are discrete. A family of indices measuring mobility as equality of opportunity for
discrete variables is proposed; followed by a family of indices measuring mobility as
equality of life chances. The paper also discusses situations in which the meanings
coincide.
JEL Classification: J62, O15.
Keywords: Intergenerational mobility; transition matrices.

Introduction
The seminal contribution by Van de Gaer et al. (2001) identified, and clarified the dif-
ferences between, three meanings of intergenerational mobility in applications to quantile
transition matrices, i.e. transition matrices for discretized continuous variables. The first of
these meanings is mobility as movement, i.e. the reduction of the likelihood that the off-
spring replicates the wellbeing achievements of parents (e.g. education, income). The second
meaning, mobility as equality of opportunity, means more proximity between the conditional
cumulative distributions of the offspring wellbeing variables, where the conditioning is by the
parental values of the variables. This signals a lower intensity of the first-order stochastic
dominance relationship, which in the inequality-of-opportunity literature is also known as
(first-order) opportunity dominance (Fleurbaey, 2008). Finally, mobility as equalization in
life chances is similar to the previous meaning, but the categories of the variable do not
have a natural ordering based on relative desirability (i.e. they are discrete but not ordinal).
Hence mobility in this context means higher similarity between the conditional probability

G.Yalonetzky@leeds.ac.uk.

distributions. The authors’ contribution also laid out theorems showing the incompatibility
between several mobility axioms, involving properties that represent mobility meanings.
Now the nature of the well-being variable determines the range of appropriate statistical
tools available for the analysis of intergenerational mobility. For instance, there is a broad
literature of mobility indices suitable for continuous variables, including the work of Cowell
(1985), Fields and Ok (1996, 1999), and Schluter and van de Gaer (2011). There is also a
second strand of the literature that proposed indices based on quantile transition matrices
for continuous variables, which required discretizing the latter (e.g. Van de Gaer et al., 2001).
In general, these contributions are not suitable for discrete (ordinal or categorical) variables,
either because it is not sensible to use indices that map from an arbitrary scale (in the case
of the first strand), or because it is not easy to partition the population into equal parts
according to a discrete variable (which is necessary for the construction of quantile matrices,
as in the second strand of the literature), or for both reasons.
However, many well-being variables are either ordinal or categorical, e.g. self-reported
health, life satisfaction, educational levels, or occupational categories. Indices based on
size transition matrices, i.e. matrices whose rows and columns are determined exogenously
(Formby et al., 2004), are good candidates for intergenerational mobility measurement with
discrete variables, since the latter’s categories provide the exogenous boundaries of the ma-
trix. Several such indices have been around for decades, but there has not been a discussion
as to the meanings of mobility that they measure. Hence this paper, firstly, reformulates,
and re-interprets, the three meanings of mobility introduced and discussed in the context of
quantile matrices by Van de Gaer et al. (2001). 1 Then the paper shows that, among existing
indices, only some capture mobility as movement, and some capture mobility as equalization
in life chances, also in the case of size matrices. However the latter fulfill a benchmark of
perfect mobility that is unnecessarily stringent for size matrices.
Considering this gap in the availability of indices for the measurement of other meanings
of intergenerational mobility, the paper’s first main contribution is the proposal of a new
family of mobility indices for size matrices that only measure mobility as equality of oppor-
tunity. Then the second main contribution is the proposal of a new family of mobility indices
for size matrices that only measure mobility as equalization of life chances. The first family
is based on the work of Silber and Yalonetzky (2011), who propose new indices of inequality
of opportunity for ordinal variables, while the second family is inspired by the segregation
measurement literature, in particular the work of Reardon and Firebaugh (2002). Finally,
motivated by a suggestion in Van de Gaer et al. (2001, p. 525-6), the paper discusses the
coincidences among the three mobility meanings that emerge when the admissible domain
is restricted to that of monotone transition matrices.
The rest of the paper proceeds with the introduction of the notation, followed by a re-
statement and re-interpretation, for size matrices, of the axioms, meanings of mobility, and
incompatibilities, proposed and discussed by Van de Gaer et al. (2001). Then a brief section
reviews some existing indices showing that they either capture mobility as movement, or as
equality in life chances, or no mobility meaning at all. The limitations of existing indices
measuring mobility as equalization in life chances are pointed out. Thus the section highlights
1
Van de Gaer et al. (2001, Footnote 1) do mention that their ideas can also be applied to stochastic
matrices, but they did not pursue this route further.

the lack of mobility indices for size matrices that measure mobility as equality of opportunity,
and mobility as equalization of life chances. Then the new family of mobility indices that
capture mobility as equality of opportunity is introduced and its axiom fulfillment behaviour
is analyzed. Afterwards the new indices of mobility for the meaning of equalization of life
chances are introduced and their axiom fulfillment behaviour is likewise analyzed. The last
methodological section discusses the peculiar situation of monotone size transition matrices
in relation to the three meanings of intergenerational mobility. The paper ends with some
concluding remarks.

Meanings of intergenerational mobility and properties
in the context of discrete variables
Notation
The wellbeing attribute is measured with a discrete variable that can be ordinal or categorical.
For presentation purposes, it is best to render it ordinal. The variable is $X \in [1, X_{top}]$,
with $[1, X_{top}] \subset \mathbb{N}_{+}$. The transition probability that the offspring attains
$X(O) = i$, conditional on the parent having attained $X(P) = j$, is:

$$p_{i|j} \equiv \Pr\left[X(O) = i \mid X(P) = j\right] = \frac{N_{ij}}{N_{.j}} \tag{1}$$

where $N_{ij}$ is the number of parent-offspring pairs in the population for whom $X(O) = i$
and $X(P) = j$, and $N_{.j} \equiv \sum_{i=1}^{X_{top}} N_{ij}$ is the number of parents in the
population for whom $X(P) = j$. The size transition matrix $M$ is defined as follows:

$$M \equiv \begin{pmatrix} p_{1|1} & \dots & p_{1|X_{top}} \\ \vdots & p_{i|j} & \vdots \\ p_{X_{top}|1} & \dots & p_{X_{top}|X_{top}} \end{pmatrix}, \tag{2}$$

and $\forall j \in [1, X_{top}]: \sum_{i=1}^{X_{top}} p_{i|j} = 1$. Some mobility indices (e.g. the
ones introduced here) map from cumulative probability distributions. Hence it is worth introducing
the cumulative probability: $F_{i|j} \equiv \sum_{s=1}^{i} p_{s|j}$.
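As a rough illustration of equations (1) and (2), the following Python sketch builds a size transition matrix from a table of parent-offspring counts and accumulates each column into the cumulative probabilities $F_{i|j}$. The counts are invented for illustration, and the function names `transition_matrix` and `cumulative` are hypothetical helpers, not part of the paper.

```python
from fractions import Fraction

def transition_matrix(counts):
    """Column-normalize counts[i][j] (offspring category i, parent category j)."""
    n = len(counts)
    col_totals = [sum(counts[i][j] for i in range(n)) for j in range(n)]  # N_.j
    return [[Fraction(counts[i][j], col_totals[j]) for j in range(n)]
            for i in range(n)]

def cumulative(p):
    """Return F with F[i][j] = p[0][j] + ... + p[i][j], i.e. F_{i|j}."""
    n = len(p)
    F = [[Fraction(0)] * n for _ in range(n)]
    for j in range(n):
        running = Fraction(0)
        for i in range(n):
            running += p[i][j]
            F[i][j] = running
    return F

# Hypothetical counts N_ij (rows: offspring category, columns: parental category).
counts = [[30, 10, 5],
          [15, 20, 10],
          [5, 10, 35]]
M = transition_matrix(counts)
F = cumulative(M)
print([str(x) for x in M[0]])   # first row of conditional probabilities
print([str(x) for x in F[-1]])  # last cumulative row, one entry per column
```

Exact fractions are used only so that the column sums can be checked without rounding error; floating-point numbers would serve equally well.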
The analysis of intergenerational mobility based on discrete variables uses mobility indices
based on transition matrices. By contrast, with discrete variables, mobility indices that are
sensitive to the distances between parental and offspring values are not appropriate. This
section is based, largely, on Van de Gaer et al. (2001), who introduced and discussed axioms
and meanings of intergenerational mobility for quantile matrices. This paper’s focus is on size
matrices, whose grids are not combinations of interval categories stemming from discretized
continuous variables. Instead, in a size matrix, the grids are combinations of the actual
categories from the discrete variables (Formby et al., 2004). Another important difference is
that, while quantile matrices are bi-stochastic, size matrices are only stochastic.
The purpose of this section is to transplant, adapt and re-state the axiomatic analysis of
Van de Gaer et al. (2001) to the environment of size matrices. The section starts with an
introduction of the traditional mobility axioms, with adjustments suitable for size matrices.

Axioms of meaning, axioms of permutation, and axioms of maximum and minimum mobility
are considered. Then I mention the logical relationships between the axioms, derived by
Van de Gaer et al. (2001). These relationships explain how, and why, trends generated by
different mobility indices could disagree a priori.
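Before starting with the axioms of meaning, it is important to introduce further relevant
notation. Firstly, let’s introduce the idea of a mobility index mapping from a transition matrix
to the real line: $\mathcal{M} : \mathbb{M} \to \mathbb{R}$, where $\mathbb{M}$ denotes the set of
transition matrices. Usually a normalized index mapping toward the interval $[0, 1]$ is preferred.
Another important concept is that of diagonalizing transformations. A diagonalizing transformation
of matrix $M$ generates a new matrix $\widetilde{M}$, i.e. $T^{k,l;q,r}_{\varepsilon}[M] \equiv \widetilde{M}$,
such that:

$$\begin{aligned}
\tilde{p}_{k|q} &= p_{k|q} - \varepsilon \\
\tilde{p}_{k|r} &= p_{k|r} + \varepsilon \\
\tilde{p}_{l|q} &= p_{l|q} + \varepsilon \\
\tilde{p}_{l|r} &= p_{l|r} - \varepsilon \\
\tilde{p}_{i|j} &= p_{i|j} \quad \forall (i, j) \notin \{k, l\} \times \{q, r\}
\end{aligned}$$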
If the variable is ordinal, then it is also assumed, in terms of preferences over wellbeing
categories that: q < r and k < l. Therefore, in size matrices applied to ordinal variables,
diagonalizing transformations tend to reduce positive association between parental and off-
spring outcomes, while preserving the arithmetic average of the matrix’s column probabilities
(i.e. the conditional probability distributions of the offspring). By contrast, in quantile ma-
trices, diagonalizing transformations tend to reduce positive association between parental
and offspring outcomes, while preserving the bi-stochastic nature of the matrix, i.e. the sum
of probabilities belonging to the same row, or the same column, must be equal to one, and
the marginal distributions of parental and offspring outcomes must be uniform.2
Finally, matrix $M^{C} \equiv \Xi^{C}[M]$ and matrix $M^{R} \equiv \Xi^{R}[M]$ are defined. $M^{C}$ stems from
permuting two columns of $M$ using the column permutation operator, $\Xi^{C}$. Likewise, $M^{R}$
ensues from permuting rows of $M$ using the row permutation operator, $\Xi^{R}$.
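The diagonalizing transformation and the two permutation operators can be sketched in a few lines of Python. This is only an illustration under stated assumptions: indices are 1-based as in the text, rows index offspring categories and columns index parental categories, and the function names `diagonalize`, `swap_columns` and `swap_rows` are hypothetical.

```python
from fractions import Fraction

def diagonalize(p, k, l, q, r, eps):
    """Apply T_eps^{k,l;q,r}: move eps between rows k and l within columns q and r."""
    new = [row[:] for row in p]
    new[k - 1][q - 1] -= eps
    new[k - 1][r - 1] += eps
    new[l - 1][q - 1] += eps
    new[l - 1][r - 1] -= eps
    if any(x < 0 or x > 1 for row in new for x in row):
        raise ValueError("eps is too large: an entry left the unit interval")
    return new

def swap_columns(p, a, b):
    """Column permutation: exchange the offspring distributions of parental groups a and b."""
    new = [row[:] for row in p]
    for row in new:
        row[a - 1], row[b - 1] = row[b - 1], row[a - 1]
    return new

def swap_rows(p, a, b):
    """Row permutation: exchange outcome categories a and b."""
    new = [row[:] for row in p]
    new[a - 1], new[b - 1] = new[b - 1], new[a - 1]
    return new

identity = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]
moved = diagonalize(identity, 1, 2, 1, 2, Fraction(1, 4))
col_sums = [sum(moved[i][j] for i in range(3)) for j in range(3)]
row_sums = [sum(row) for row in moved]
```

In this example every column of `moved` still sums to one, and the row sums (which are $X_{top}$ times the arithmetic average of the column probabilities) are the same as those of the identity matrix, in line with the description above.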

Axioms of meaning
Van de Gaer et al. (2001) explain, and distinguish between, three meanings of intergenera-
tional mobility: 1) mobility as movement, i.e. as a reduction in the probability that offspring
2
An alternative diagonalizing transformation for stochastic matrices, mentioned in Van de Gaer et al.
(2001, Footnote 1), requires adding and subtracting $\varepsilon / w_q$ and $\varepsilon / w_r$, where
$w_q \equiv N_{.q} / \sum_{j=1}^{X_{top}} N_{.j}$ is the proportion of
parents with wellbeing values of q. This transformation preserves not the arithmetic average, but the actual
marginal distribution of offspring wellbeing, i.e. the weighted average of the matrix’s column probabilities,
where the weights are given by the marginal distribution of parental wellbeing (the conditioning variable).
This paper’s results and indices can be readily adjusted to operate with this alternative diagonalizing trans-
formation. However, using this alternative transformation requires bringing in additional information to
mobility comparisons based on transition matrices. For instance, two identical transition matrices could be
ranked differently in terms of mobility if they have different marginal distributions of parental wellbeing and
the index is sensitive to the latter. Hence this paper’s focus on diagonalizing transformations that do not
require any extra information besides the transition matrices themselves.

reproduce the wellbeing of their parents; 2) mobility as equality of opportunity, in which
higher mobility means closer proximity among the cumulative distributions of the wellbe-
ing variable conditioned by parental attributes, i.e. lower intensity of first-order stochastic
dominance relationships; and 3) mobility as equalization in life chances. Unlike the second
meaning, in the third one, the different categories of the variable do not have a relative ap-
peal vis-à-vis each other, i.e. the variable is discrete but not ordinal. Consequently, higher
mobility as equalization in life chances is deemed to occur when the discrete probability
distributions, conditioned by parental attributes, resemble each other more closely. The three
axioms of meaning can be adapted to size matrices without further adjustments. They are the
following:

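Axiom 1 Movement (MOV): $\mathcal{M}[T^{q,r;q,r}_{\varepsilon}[M]] > \mathcal{M}[M]$.

Axiom 2 Equality of Opportunity (EOP): if $k < l$, $q < r$, and $\tilde{F}_{i|q} \geq \tilde{F}_{i|r}$ $\forall i \in [1, X_{top}]$;
then $\mathcal{M}[T^{k,l;q,r}_{\varepsilon}[M]] > \mathcal{M}[M]$.3

Axiom 3 Equalization in Life Chances (ELC): if $\tilde{p}_{k|q} \geq \tilde{p}_{k|r}$ and $\tilde{p}_{l|q} \leq \tilde{p}_{l|r}$;
then $\mathcal{M}[T^{k,l;q,r}_{\varepsilon}[M]] > \mathcal{M}[M]$.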

Axioms of permutation
Van de Gaer et al. (2001) introduced two axioms of permutation. These are also easily
adaptable to the environment of size matrices. The first axiom, anonymity, says that the
mobility index should not change when the columns of the transition matrix, i.e. its offspring
distributions conditioned by parental values, are permuted. The second axiom, called
focus on probabilities, says that the mobility index should not change when the rows of
the transition matrix are permuted. This is appropriate when the categories of the discrete
variable do not have relative desirability vis-à-vis each other. Formally, the following are the
two axioms of permutation:
 
Axiom 4 Anonymity (AN): $\mathcal{M}[M] = \mathcal{M}[M^{C}]$.

Axiom 5 Focus on Probabilities (FP): $\mathcal{M}[M] = \mathcal{M}[M^{R}]$.

Axioms of maximum and minimum mobility


Shorrocks (1978) proposed a mobility axiom by which the mobility index must declare maximum
immobility only in the case of a transition matrix in the form of an identity matrix,
I. An alternative version of immobility, proposed by Van de Gaer et al. (2001), which is
more relevant for the conception of mobility as equalization in life chances, is the axiom of
perfect predictability, whereby a mobility index should declare maximum immobility in cases
3
The meaning of mobility as lower inequality of opportunity is closely related to the concept of opportu-
nity dominance in the inequality-of-opportunity literature. However, in the context of transition matrices,
equality between the conditional cumulative distributions is only a necessary, but insufficient, condition for
full equality of opportunity. The latter insufficiency is due to the so-called partial-circumstance problem. See
Fleurbaey (2008, chapter 9) for a thorough discussion.

of transition matrices that ensue from any column permutation of I (including, of course,
the identity matrix itself). This weak form of maximum immobility is not conceptually
compatible with the notion of mobility as movement. The two axioms of minimum mobility
are also reasonable for size matrices, without any further adjustment. They are:

Axiom 6 Immobility (IM): $\mathcal{M}[M] \geq \mathcal{M}[I]$.

Axiom 7 Perfect Predictability (PP): $\mathcal{M}[I^{C}] = \mathcal{M}[I]$.

Finally, Shorrocks proposed two axioms of maximum (or perfect) mobility. The weak
axiom of mobility requires a mobility index to take a particular value when the transition
matrix features identical columns, i.e. $p_{i|1} = p_{i|2} = \dots = p_{i|X_{top}}$ $\forall i \in [1, X_{top}]$. The
strong axiom of maximum mobility stipulates that a mobility index should take its maximum
value when the transition matrix exhibits identical columns. In order to express these axioms,
firstly it is worth introducing $\mathbf{1}_{X_{top}}$, which is an $X_{top}$-dimensional column vector of ones. Then
the size matrix of identical columns is defined: $M^{M} \equiv (p_1, p_2, \dots, p_{X_{top}})' \, \mathbf{1}'_{X_{top}}$.4 Following
Van de Gaer et al. (2001), this paper focuses on the strong axiom of maximum mobility, or
perfect mobility, formally:

Axiom 8 Perfect Mobility (PM): $\mathcal{M}[M^{M}] > \mathcal{M}[M]$.

Now Van de Gaer et al. (2001, p. 524-5) show incompatibilities between some of the
axioms. These incompatibilities also hold in the realm of size transition matrices. They
are important because they imply that, generally, one single index can capture only one of
the three meanings of intergenerational mobility at a time. Hence, a priori, the mobility
trends generated by different indices, each capturing only one meaning of mobility, could
diverge. Following the authors’ order of theorems and corollaries, the inconsistencies are the
following:

Theorem 1 MOV and PM are incompatible. That is, higher mobility as movement can
be obtained, beyond the situation of perfect mobility (represented by the matrix of identical
columns), by subtracting even more probability mass from the diagonal.

Proof. Same as in Van de Gaer et al. (2001, p. 524-5).

Theorem 2 MOV and AN are incompatible. According to AN a permutation of columns of
the identity matrix should not affect the value of the mobility index. However, according to
MOV, such permutation should yield a higher value since it subtracts probability mass from
the diagonal.

Proof. Same as in Van de Gaer et al. (2001, p. 524-5).
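The tension described in Theorem 2 can be seen numerically with Shorrocks' trace index, $ST = (X_{top} - \sum_i p_{i|i}) / (X_{top} - 1)$, which the paper lists among the indices satisfying MOV. The sketch below is only an illustration; the function name `trace_index` and the three-category example are assumptions made here.

```python
from fractions import Fraction

def trace_index(p):
    """Shorrocks' trace index: (X_top - trace) / (X_top - 1)."""
    n = len(p)
    return (n - sum(p[i][i] for i in range(n))) / Fraction(n - 1)

identity = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]

# Column permutation of the identity: parental groups 1 and 2 exchange distributions.
permuted = [[Fraction(0), Fraction(1), Fraction(0)],
            [Fraction(1), Fraction(0), Fraction(0)],
            [Fraction(0), Fraction(0), Fraction(1)]]

st_identity = trace_index(identity)
st_permuted = trace_index(permuted)
print(st_identity, st_permuted)
```

The permutation removes probability mass from the diagonal, so a movement-sensitive index such as the trace index rises, whereas AN would require the two values to coincide.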

Corollary 1 MOV and PP are incompatible; because AN implies PP.


4
Note that, by contrast, a bi-stochastic matrix with identical columns is characterized by $p_{i|j} = \frac{1}{X_{top}}$ $\forall i, j$.

Theorem 3 MOV and FP are incompatible. The reason is similar to the one explaining the
incompatibility between MOV and AN. In the case of MOV and FP, consider a permutation
of the rows of the identity matrix.

Proof. Same as in Van de Gaer et al. (2001, p. 524-5).

Theorem 4 EOP and FP are incompatible. That is, a permutation of rows could either
reduce or increase the intensity of a first-order stochastic dominance relationship between
two columns of a transition matrix, which would translate into a change in the value of any
mobility index satisfying EOP. However, according to FP, the index should not react to a
permutation of rows.

Proof. Same as in Van de Gaer et al. (2001, p. 524-5).

From these it is possible to conclude that one single index cannot measure more than one
meaning of mobility at a time. The reason is the following: Theorems 3 and 4 state
that FP is incompatible with both MOV and EOP. However, FP should not be incompatible
with ELC, because in ELC the relative desirability of the outcome categories is irrelevant,
i.e. the variable is categorical. Hence a permutation of the matrix’s rows, i.e. of the lottery
outcomes, should not change the value of an index fulfilling ELC. Put differently,
an index satisfying ELC should also satisfy FP. But then, by Theorems 3 and 4,
it would satisfy neither MOV nor EOP. What about an index satisfying MOV and EOP
at the same time? A permutation of columns can change the direction of the dominance relationship, i.e.
a pairwise permutation could flip the roles of dominating and dominated offspring
group, but it does not change the intensity of that (first-order) dominance relationship.
EOP, in turn, is sensitive only to that intensity. Hence an index satisfying EOP should also
satisfy AN. But, then, Theorem 2 states that MOV and AN are incompatible. Therefore
one single index cannot measure more than one of the three meanings of intergenerational
mobility.
Now, in the case of size matrices, several existing indices can measure mobility as move-
ment, while others do not measure any of the three meanings. A few satisfy ELC, but are
characterized by benchmarks of perfect mobility that are unnecessarily stringent for size ma-
trices. The next section provides examples of these existing indices, showing their property
fulfillment. The subsequent sections propose indices that are sensitive to EOP and ELC,
respectively.

Measurement of mobility by existing indices


Van de Gaer et al. (2001, Table 2, p. 526) checked the axiom fulfillment of eight mobility
indices. They found that none of them fulfilled either EOP or ELC, while only five satisfied
MOV. That was, then, their main motivation to propose indices sensitive to EOP and ELC
(respectively) for discretized continuous variables. Now, it is straightforward to show that,
in the context of size transition matrices, the same eight indices do not satisfy EOP or ELC,
and the same five satisfy MOV. Moreover, as explained below, the proposals (of indices
satisfying EOP and ELC, respectively) by Van de Gaer et al. (2001) are not suitable for

ordinal variables and size matrices. Hence there is a case to propose mobility indices that
satisfy EOP and ELC (respectively) for size matrices.
Table 1 features three prominent examples, from the eight indices studied by Van de
Gaer et al. (2001). The fourth index, PR, was proposed by Parker and Rougier (2001).
Unlike the previous indices, it was not analyzed by Van de Gaer et al. (2001).5 The fifth formula
is the family of indices proposed by Van de Gaer et al. (2001) in order to measure mobility
as equalization of life chances using quantile transition matrices.
The first index, ST, is Shorrocks’ trace index. It satisfies MOV. Hence, due to Theorems
1 and 2, and Corollary 1, it does not satisfy PM, AN or PP. By Theorem 3 it does not satisfy
FP either, but it fulfils IM. Likewise, being insensitive to changes outside the diagonal, it does not
satisfy EOP or ELC.

5
Due to publication timing, Van de Gaer et al. (2001) may have been unaware of the indices by Parker
and Rougier (2001).

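Table 1: Examples of mobility indices based on transition matrices

| Index | Axioms fulfilled | Source |
|---|---|---|
| $ST = \dfrac{X_{top} - \sum_{i=1}^{X_{top}} p_{i \mid i}}{X_{top} - 1}$ | MOV, IM | Shorrocks (1978) |
| $E2 = 1 - \lvert \lambda_2 \rvert$ | IM, PM | Sommers and Conlisk (1979) |
| $B2 = \dfrac{1}{X_{top}(X_{top} - 1)} \sum_{i=1}^{X_{top}} \sum_{j=1}^{X_{top}} p_{i \mid j} \lvert i - j \rvert$ | MOV, IM | Bartholomew (1982) (adapted) |
| $PR = 1 - \dfrac{1}{X_{top} - 1} \left[ \sum_{i=1}^{X_{top}} \sum_{j=1}^{X_{top}} p_{i \mid j}^{2} - 1 \right]$ | PP, AN, FP, IM, ELC | Parker and Rougier (2001) |
| $I_{\varphi}^{ELC} = \dfrac{1}{\left[ 1 - \frac{1}{X_{top}^{\varphi}} \right]} \left[ \dfrac{1}{X_{top}^{2}} \sum_{i=1}^{X_{top}} \sum_{j=1}^{X_{top}} \left[ (X_{top}) p_{i \mid j} \right]^{1 - \varphi} - \dfrac{1}{X_{top}^{\varphi}} \right]$ | PP, AN, FP, IM, ELC | Van de Gaer et al. (2001) |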

The second index, E2, is the second-eigenvalue index proposed by Sommers and Conlisk
(1979). In the formula on the second row of Table 1 the second eigenvalue is denoted by $\lambda_2$.
The second eigenvalue provides a measure of the speed with which the initial distribution of
values converges toward the ergodic distribution of the transition matrix; i.e. it is a measure
of persistence in the distribution, which depends on the matrix’s transition regime. As shown
by Van de Gaer et al. (2001) for the case of quantile matrices, this index satisfies none of
the three axioms of meaning and none of the permutation axioms. However, it
does satisfy the axioms of maximum and minimum mobility (except for PP).
The third index, B2, is a variation of one of the indices proposed by Bartholomew (1982).
In the original index, Bartholomew weights the expression
$\sum_{i=1}^{X_{top}} \sum_{j=1}^{X_{top}} p_{i|j} |i - j|$ using $p_i$,
i.e. the ergodic probability of attaining value $i$. The problem with weighting in this way is
that the index then does not satisfy MOV (Shorrocks, 1978). However, when weighting with $\frac{1}{X_{top}}$,
as in Table 1, B2 does satisfy MOV. The index also satisfies IM, but not the
other axioms.
The fourth index, PR, was proposed as a measure of mobility as unpredictability.6 The
idea is to consider a matrix as more mobile if it is harder to predict the offspring status
from parental categories. At one extreme, perfect predictability is considered the state
of maximum immobility, which includes the identity matrix as a particular, yet not unique,
case. This notion is identical to the one used in the PP axiom above. At the other extreme,
perfect unpredictability is a situation in which $p_{i|j} = \frac{1}{X_{top}}$. PR fulfills ELC. It also satisfies
the permutation axioms and the axioms of minimum mobility. Hence it could be a good
candidate for an index capturing mobility as equality in life chances. However, its benchmark
of perfect unpredictability is problematic for the measurement of equality of opportunity
and equality of life chances, because perfect unpredictability implies complete equality of
opportunity and of life chances, but the reverse is not true. That is, PR does not fulfill PM.
By contrast, the indices proposed below satisfy PM and either EOP or ELC.
Finally, $I_{\varphi}^{ELC}$ is a family of indices (for different values of $\varphi \in [0, \infty]$) proposed by Van de Gaer
et al. (2001) in order to measure mobility as equality of life chances for quantile transition
matrices.7 Even in the context of size matrices, it is easy to show that the family $I_{\varphi}^{ELC}$
satisfies ELC, as well as AN, FP, IM and PP. However, with size matrices and categorical
variables, $I_{\varphi}^{ELC}$ does not fulfill PM because the index attains its maximum if and only if
there is perfect unpredictability, which is not implied by complete equality of life chances.8
By contrast, the indices proposed below satisfy PM and either EOP or ELC.
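The behaviour of three of the indices in Table 1 can be checked on small matrices. The Python sketch below implements ST, B2 and PR as they appear in the table (E2 is left out because it needs an eigenvalue routine); the function names and the three example matrices are assumptions made for illustration.

```python
from fractions import Fraction

def st_index(p):
    """Shorrocks' trace index."""
    n = len(p)
    return (n - sum(p[i][i] for i in range(n))) / Fraction(n - 1)

def b2_index(p):
    """Adapted Bartholomew index: average absolute jump |i - j|, unweighted across columns."""
    n = len(p)
    total = sum(p[i][j] * abs(i - j) for i in range(n) for j in range(n))
    return total / Fraction(n * (n - 1))

def pr_index(p):
    """Parker-Rougier index in the normalized form of Table 1."""
    n = len(p)
    squares = sum(p[i][j] ** 2 for i in range(n) for j in range(n))
    return 1 - (squares - 1) / Fraction(n - 1)

identity = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]
uniform = [[Fraction(1, 3)] * 3 for _ in range(3)]
# Identical but non-uniform columns: complete equality of life chances.
skewed = [[Fraction(1, 2)] * 3, [Fraction(1, 4)] * 3, [Fraction(1, 4)] * 3]

for name, m in [("identity", identity), ("uniform", uniform), ("skewed", skewed)]:
    print(name, st_index(m), b2_index(m), pr_index(m))
```

In this example PR reaches its maximum of one only for the uniform matrix; for the matrix with identical but non-uniform columns it stays below one, which is the sense in which its benchmark of perfect mobility is more stringent than PM.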
6
The original formulation by Parker and Rougier (2001) is:
$PR = \frac{X_{top}}{X_{top} - 1} \left[ \sum_{i=1}^{X_{top}} \sum_{j=1}^{X_{top}} p_{i|j}^{2} - 1 \right]$. The
formula in the table is different so that $0 \leq PR \leq 1$ and a higher value of the index represents higher
mobility, as with the other indices.
7
The original formulation by Van de Gaer et al. (2001) is:
$I_{\varphi}^{ELC} = 1 - \frac{1}{X_{top}^{2}} \sum_{i=1}^{X_{top}} \sum_{j=1}^{X_{top}} \left[ (X_{top}) p_{i|j} \right]^{1 - \varphi}$.
The formula in the table is different so that $0 \leq I_{\varphi}^{ELC} \leq 1$ and a higher value of the index represents higher
mobility, as with the other indices.
8
Whereas, in the case of quantile matrices, perfect unpredictability occurs if and only if there is complete
equality of life chances.

A family of indices capturing mobility as equality of
opportunity for size matrices and ordinal variables
Van de Gaer et al. (2001) proposed a family of indices meant to capture mobility as equality of
opportunity for discretized continuous variables. Their indices also required information from
the values of the wellbeing variable. The reason was that their indices compared inequality
across offspring groups (determined by parental variable values) in a distributional standard,
e.g. a mean, that depended both on the conditional probabilities (i.e. column probabilities)
of obtaining a given wellbeing value and on that value itself. Hence the spread in the offspring
marginal distribution also needed to be taken into account when looking at the differences
in opportunity.9 However, in the case of ordinal variables cardinal scales are arbitrary.
Therefore the indices proposed by Van de Gaer et al. (2001) for the measurement of mobility
as opportunity are not suitable. Is it possible, then, to measure mobility as opportunity
with indices that satisfy EOP, but without incorporating information on the values of the
outcomes, i.e. in the context of ordinal variables? The following proposal, based on Silber
and Yalonetzky (2011) gives a positive answer.
Silber and Yalonetzky (2011) proposed a family of indices useful for the assessment of
inequality of opportunity when outcome variables are ordinal. The indices compare each
conditional cumulative distribution, Fi|j , against the cumulative distribution of a population
average (which may be arithmetic, or weighted by the population subgroup composition).
However, those indices need to be re-normalized to render them suitable for transition ma-
trices since their number of columns (population groups) can be different from the number
of rows (the categories of the outcome variable). The following proposal, based on the
arithmetic, non-weighted average cumulative distribution,
$F_{i}^{prom} \equiv \frac{1}{X_{top}} \sum_{j=1}^{X_{top}} F_{i|j}$, measures
mobility as opportunity in transition matrices:

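$$O^{\alpha} = 1 - \frac{2^{\alpha}}{X_{top}(X_{top} - 1)} \sum_{j=1}^{X_{top}} \sum_{i=1}^{X_{top}} \left| F_{i|j} - F_{i}^{prom} \right|^{\alpha}, \quad \text{if } X_{top} \text{ is even}$$

$$O^{\alpha} = 1 - \frac{2 (2 X_{top})^{\alpha} \sum_{j=1}^{X_{top}} \sum_{i=1}^{X_{top}} \left| F_{i|j} - F_{i}^{prom} \right|^{\alpha}}{(X_{top} - 1)^{2} (X_{top} + 1) \left[ (X_{top} - 1)^{\alpha - 1} + (X_{top} + 1)^{\alpha - 1} \right]}, \quad \text{if } X_{top} \text{ is odd} \tag{3}$$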
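A direct transcription of family (3) into Python may help in checking its normalization. The sketch below is an illustration only: the function names `cumulative` and `o_alpha` are hypothetical, floating-point arithmetic is used, and the three four-category matrices are invented test cases.

```python
def cumulative(p):
    """F[i][j] = sum of p[s][j] for s <= i."""
    n = len(p)
    F = [[0.0] * n for _ in range(n)]
    for j in range(n):
        running = 0.0
        for i in range(n):
            running += p[i][j]
            F[i][j] = running
    return F

def o_alpha(p, alpha):
    """Equality-of-opportunity index of family (3), with separate even and odd normalizations."""
    n = len(p)
    F = cumulative(p)
    total = 0.0
    for i in range(n):
        f_prom = sum(F[i]) / n  # unweighted average cumulative distribution
        total += sum(abs(F[i][j] - f_prom) ** alpha for j in range(n))
    if n % 2 == 0:
        scale = 2 ** alpha / (n * (n - 1))
    else:
        scale = (2 * (2 * n) ** alpha
                 / ((n - 1) ** 2 * (n + 1)
                    * ((n - 1) ** (alpha - 1) + (n + 1) ** (alpha - 1))))
    return 1 - scale * total

identical = [[0.25] * 4 for _ in range(4)]
identity = [[float(i == j) for j in range(4)] for i in range(4)]
# Half of the columns concentrated on the lowest category, half on the highest.
polar = [[1.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 1.0]]

for name, m in [("identical", identical), ("identity", identity), ("polar", polar)]:
    print(name, o_alpha(m, 2))
```

With four categories the index equals one for identical columns and falls to zero for the polarized matrix, while the identity matrix yields a strictly positive value.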

The following proposition states what axioms are fulfilled by $O^{\alpha}$:

Proposition 1 $O^{\alpha}$ $\forall \alpha > 1$ satisfies EOP, AN, PP and PM. It does not satisfy MOV, ELC,
FP, and IM.

Proof. See Appendix.


For practical purposes, two interesting members of the family (3) are $O^{1}$ and $O^{2}$. Note, however,
that, unlike the indices with $\alpha > 1$, $O^{1}$ fulfills only a weaker version of EOP (as is easy to show
following the derivations in the Appendix), whereby if $k < l$, $q < r$, and $\tilde{F}_{i|q} \geq \tilde{F}_{i|r}$
$\forall i \in [1, X_{top}]$; then $\mathcal{M}[T^{k,l;q,r}_{\varepsilon}[M]] \geq \mathcal{M}[M]$.
9
Fleurbaey (2008, chapter 9) also discusses this point.

A potential limitation of the family (3) is that it does not satisfy IM when $X_{top} > 2$. The
reason is that $O^{\alpha} = 0$ if and only if $p_{1|j} = 1$ for half of the $X_{top}$ columns and $p_{X_{top}|j} = 1$ for
the other half, when $X_{top}$ is even; and $O^{\alpha} = 0$ if and only if $p_{1|j} = 1$ for
$\frac{X_{top} - 1}{2 X_{top}}$, or $\frac{X_{top} + 1}{2 X_{top}}$, of
the $X_{top}$ columns and $p_{X_{top}|j} = 1$ for the other group
($\frac{X_{top} + 1}{2 X_{top}}$, or $\frac{X_{top} - 1}{2 X_{top}}$, respectively), when
$X_{top}$ is odd. In those cases, there are bound to be empty rows in the transition matrices,
whereas any matrix resulting from a column permutation of an identity matrix (including
the identity matrix itself) never has empty rows.
However, if the domain of admissible size transition matrices is restricted to include only
matrices with non-empty rows, then family (3) also satisfies IM and requires re-normalization
as follows:

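$$O^{\alpha} = 1 - \frac{X_{top}^{\alpha}}{\sum_{k=1}^{X_{top}} \left[ k (X_{top} - k)^{\alpha} + k^{\alpha} (X_{top} - k) \right]} \sum_{j=1}^{X_{top}} \sum_{i=1}^{X_{top}} \left| F_{i|j} - F_{i}^{prom} \right|^{\alpha} \quad \forall \alpha \geq 1. \tag{4}$$

In the case of the family (4) $O^{1}$ is:

$$O^{1} = 1 - \frac{3}{X_{top}^{2} - 1} \sum_{j=1}^{X_{top}} \sum_{i=1}^{X_{top}} \left| F_{i|j} - F_{i}^{prom} \right| \tag{5}$$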
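The re-normalized family (4) can be sketched in the same way. The code below is an illustration under the same assumptions as before (hypothetical function names, floating-point arithmetic); it also checks numerically that, for $\alpha = 1$, the normalizing factor reduces to the coefficient $3 / (X_{top}^{2} - 1)$ of equation (5).

```python
def cumulative(p):
    """F[i][j] = sum of p[s][j] for s <= i."""
    n = len(p)
    F = [[0.0] * n for _ in range(n)]
    for j in range(n):
        running = 0.0
        for i in range(n):
            running += p[i][j]
            F[i][j] = running
    return F

def normalizer(n, alpha):
    """X_top^alpha divided by sum_k [k (X_top - k)^alpha + k^alpha (X_top - k)]."""
    denom = sum(k * (n - k) ** alpha + k ** alpha * (n - k) for k in range(1, n + 1))
    return n ** alpha / denom

def o_alpha_renormalized(p, alpha):
    """Family (4): scaled so that the identity matrix yields zero."""
    n = len(p)
    F = cumulative(p)
    total = 0.0
    for i in range(n):
        f_prom = sum(F[i]) / n
        total += sum(abs(F[i][j] - f_prom) ** alpha for j in range(n))
    return 1 - normalizer(n, alpha) * total

identity5 = [[float(i == j) for j in range(5)] for i in range(5)]
identical5 = [[0.2] * 5 for _ in range(5)]

print(o_alpha_renormalized(identity5, 1), o_alpha_renormalized(identity5, 2))
print(o_alpha_renormalized(identical5, 2))
print(normalizer(5, 1), 3 / (5 ** 2 - 1))
```

Under this normalization the identity matrix attains the minimum within the restricted domain, while matrices with empty rows (excluded by assumption) could push the expression below zero.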

A family of indices capturing mobility as equality of life
chances for size matrices and categorical variables
The proposal in this section is inspired by the segregation measurement literature (e.g. see
Reardon and Firebaugh, 2002). These indices compare, for instance, the distributions of
ethnicity, conditioned by a school district, against each other. Many of the indices can
be amended to be applicable to transition matrices by: 1) changing their normalization to
account for the fact that the number of categories/states of the conditioning variable has
to be the same as that of the categories of the outcome variable (i.e. in order to have
a square matrix); and 2) removing their dependence on the marginal distribution of the
conditioning variable (e.g. proportions of the population in each school district; see Reardon
and Firebaugh (2002)).
The following general family is proposed:

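$$C^{\beta} = 1 - \frac{\sum_{i=1}^{X_{top}} \sum_{j=1}^{X_{top}} \left| p_{i|j} - p_{i}^{prom} \right|^{\beta}}{X_{top}^{1 - \beta} (X_{top} - 1) \left[ (X_{top} - 1)^{\beta - 1} + 1 \right]} \tag{6}$$

where $p_{i}^{prom} \equiv \frac{1}{X_{top}} \sum_{j=1}^{X_{top}} p_{i|j}$. For practical purposes, two interesting members of family
(6) are:

$$C^{1} = 1 - \frac{\sum_{i=1}^{X_{top}} \sum_{j=1}^{X_{top}} \left| p_{i|j} - p_{i}^{prom} \right|}{2 (X_{top} - 1)} \tag{7}$$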

$$C^{2} = 1 - \frac{\sum_{i=1}^{X_{top}} \sum_{j=1}^{X_{top}} \left| p_{i|j} - p_{i}^{prom} \right|^{2}}{X_{top} - 1} \tag{8}$$
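Family (6) is straightforward to compute from the conditional probabilities alone. The following Python sketch is an illustration with a hypothetical function name `c_beta` and invented three-category matrices; it checks the two benchmarks (identical columns and permutations of the identity matrix) numerically.

```python
def c_beta(p, beta):
    """Equality-of-life-chances index of family (6)."""
    n = len(p)
    total = 0.0
    for i in range(n):
        p_prom = sum(p[i]) / n  # unweighted average probability of outcome i
        total += sum(abs(p[i][j] - p_prom) ** beta for j in range(n))
    denom = n ** (1 - beta) * (n - 1) * ((n - 1) ** (beta - 1) + 1)
    return 1 - total / denom

identity = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]
# A row permutation of the identity (outcome categories 1 and 3 exchanged).
row_permuted = [[0.0, 0.0, 1.0],
                [0.0, 1.0, 0.0],
                [1.0, 0.0, 0.0]]
# Identical, non-uniform columns.
identical = [[0.5, 0.5, 0.5],
             [0.25, 0.25, 0.25],
             [0.25, 0.25, 0.25]]

for name, m in [("identity", identity), ("row_permuted", row_permuted),
                ("identical", identical)]:
    print(name, c_beta(m, 1), c_beta(m, 2))
```

For $\beta = 1$ and $\beta = 2$ the denominator computed in `c_beta` reduces to $2(X_{top} - 1)$ and $X_{top} - 1$, matching equations (7) and (8).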

$C^{1}$ and $C^{2}$ are, respectively, similar to the Dissimilarity index and the Relative Diversity
index, from the segregation literature (see Reardon and Firebaugh, 2002, Table 2). However,
the two segregation indices are sensitive to the marginal distribution of the conditioning
variable, and are applicable to non-square matrices.
The following proposition states what axioms are fulfilled by $C^{\beta}$:

Proposition 2 $C^{\beta}$ $\forall \beta > 1$ satisfies ELC, AN, FP, PP, IM and PM. It does not satisfy
MOV and EOP.

Proof. See Appendix.


Note that, like $O^{1}$, $C^{1}$ satisfies only a weak version of ELC.

Is it possible to reconcile the measurement of different
meanings of mobility?: The case of monotone matrices
Shorrocks (1978) was the first to recognize incompatibilities between some axioms. He
focused on the incompatibility between MOV and PM (Theorem 1) and, among other proposals,
suggested restricting the admissible domain of transition matrices. He proposed the
concept of matrices with a quasi-maximal diagonal. A matrix with a quasi-maximal diagonal
is one for which there are positive weights $\mu_j, \mu_i > 0$ such that
$\mu_j p_{j|j} \ge \mu_i p_{i|j}$ $\forall j, i$. Among other consequences, matrices having
quasi-maximal diagonals always have probability mass in the main diagonal.
Shorrocks showed that when the permissible domain is restricted to matrices with
quasi-maximal diagonals, MOV becomes compatible with PM, because the trace of the
matrices is then bounded between 1 and $X_{top}$, and it takes the value 1
if and only if all columns are identical.
An alternative proposal, briefly considered by Van de Gaer et al. (2001), is to restrict the
permissible domain of transition matrices to monotone matrices. This section fully explores
the implications of restricting the domain to monotone size matrices. These are characterized
by the (weak) first-order stochastic dominance of offspring distributions conditioned on
higher parental values over offspring distributions conditioned on lower parental values.
Formally, $M$ is a monotone matrix if and only if: $F_{i|j} \ge F_{i|j+1}$ $\forall i \in [1, X_{top}]$, $j \in [1, X_{top}-1]$.¹⁰
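The monotonicity condition can be checked mechanically. The sketch below is only an illustration under an assumed layout (rows are offspring states, columns are parental states, each column sums to one); the helper names `cumulative` and `is_monotone` are hypothetical.

```python
import numpy as np

def cumulative(P):
    """Column-wise cumulative distributions: F[i, j] = sum of p_{h|j} for h <= i."""
    return np.cumsum(np.asarray(P, dtype=float), axis=0)

def is_monotone(P, tol=1e-12):
    """True if F_{i|j} >= F_{i|j+1} for every row i and every pair of adjacent columns."""
    F = cumulative(P)
    return bool(np.all(F[:, :-1] >= F[:, 1:] - tol))

identity = np.eye(3)
swapped = identity[:, [1, 0, 2]]   # identity with its first two columns exchanged
print(is_monotone(identity), is_monotone(swapped))
```

The identity matrix passes the check, whereas exchanging two of its columns breaks the dominance ordering, which is why column permutations are generally not admissible within the monotone domain.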
In the realm of monotone matrices several of the above theorems become irrelevant,
because the imposition of monotonicity forbids several permutations of columns and rows as
contemplated in axioms AN and FP, respectively. Hence, for instance, maximum immobility
and perfect predictability become conflated and represented uniquely by the identity matrix.
At the other extreme, since monotone matrices have quasi-maximal diagonals, maximum
mobility according to the three meanings is attained if and only if the (monotone) matrix
has identical columns.
¹⁰ For an exhaustive treatment of the properties of monotone transition matrices see Dardanoni (1995).

But, beyond these extreme situations, how far is it possible to reconcile the measurement
of the three meanings of mobility within the set of monotone matrices? A good
starting point for this question is the following proposition:

Proposition 3 If matrices $M$ and $T^{k,l;q,r}_{\varepsilon}[M]$ are monotone, and $\mathcal{M}[T^{k,l;q,r}_{\varepsilon}[M]] > \mathcal{M}[M]$, then $\mathcal{M}$ measures mobility as equality of opportunity.

Proof. Because both matrices are monotone, $\widetilde{F}_{i|q} \ge \widetilde{F}_{i|r}$ $\forall i \in [1, X_{top}]$. Therefore EOP holds.
Proposition 3 implies that any mobility index that responds positively to any diagonalizing
transformation within the restricted set of monotone matrices measures mobility as
opportunity, as it satisfies EOP. Now, an index responding positively to a transformation
$T^{q,r;q,r}_{\varepsilon}[M]$, i.e. one applied over the main diagonal, would also be measuring mobility as
movement, satisfying MOV. Therefore, when matrices are monotone, indices satisfying EOP
also satisfy MOV. The reverse is not true because not all diagonalizing transformations that
improve mobility as equality of opportunity remove probability mass from the main diagonal.
What about ELC? If an index reacts positively to a diagonalizing transformation that
fulfills the conditions of Axiom 3, then, according to that same axiom, it measures mobility as
equalization of life chances. If, on top of that, the matrices involved are monotone, then, due
to Proposition 3, the index is also measuring mobility as equalization of opportunity, i.e. the
intensity of the dominance relationships decreases. The reverse, however, is not true; that is,
not every transformation that occurs in the environment described by Proposition 3 produces
an increase in a mobility index that satisfies ELC. In other words, not all transformations
that generate mobility as opportunity also generate mobility as equalization in life chances,
even among monotone matrices. The following example illustrates one such case:
   
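\[
A = \begin{pmatrix} 0.6 & 0.3 & 0 \\ 0.3 & 0.4 & 0 \\ 0.1 & 0.3 & 1 \end{pmatrix}, \quad
A_F = \begin{pmatrix} 0.6 & 0.3 & 0 \\ 0.9 & 0.7 & 0 \\ 1 & 1 & 1 \end{pmatrix} \qquad (9)
\]

\[
T^{2,3,1,2}_{0.2}[A] = \begin{pmatrix} 0.6 & 0.3 & 0 \\ 0.2 & 0.5 & 0 \\ 0.2 & 0.2 & 1 \end{pmatrix}, \quad
T^{2,3,1,2}_{0.2}[A_F] = \begin{pmatrix} 0.6 & 0.3 & 0 \\ 0.8 & 0.8 & 0 \\ 1 & 1 & 1 \end{pmatrix} \qquad (10)
\]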
In the example, a diagonalizing transformation involving probability mass of 0.2 took
place between the four cells in the intersection between the first two columns and the second
and third rows, in (monotone) matrix $A$ ($A_F$ is the corresponding matrix of cumulative
distribution functions). The ensuing matrix, $T^{2,3,1,2}_{0.2}[A]$, is also monotone. Therefore, by
Proposition 3, any index that reacts positively to that transformation is measuring mobility
as opportunity. However, the transformation does not satisfy the conditions of Axiom 3
($\widetilde{p}_{k|q} \ge \widetilde{p}_{k|r}$ and $\widetilde{p}_{l|q} \le \widetilde{p}_{l|r}$). Hence an index that satisfies ELC should not be expected to
react positively to the example's diagonalizing transformation.
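The claims made about this example can be verified numerically. The following sketch takes the two matrices exactly as printed in (9) and (10), rebuilds the cumulative matrices, checks monotonicity, and evaluates the two Axiom 3 inequalities for rows 2 and 3 and columns 1 and 2. The helper names are hypothetical, and because it is an assumption which matrix the inequalities refer to, both the pre- and post-transformation entries are tested.

```python
import numpy as np

A = np.array([[0.6, 0.3, 0.0],
              [0.3, 0.4, 0.0],
              [0.1, 0.3, 1.0]])
A_after = np.array([[0.6, 0.3, 0.0],
                    [0.2, 0.5, 0.0],
                    [0.2, 0.2, 1.0]])

def cumulative(P):
    """Column-wise cumulative distributions."""
    return np.cumsum(np.asarray(P, dtype=float), axis=0)

def is_monotone(P, tol=1e-12):
    """F_{i|j} >= F_{i|j+1} for all rows and adjacent columns."""
    F = cumulative(P)
    return bool(np.all(F[:, :-1] >= F[:, 1:] - tol))

def axiom3_condition(P, k, l, q, r):
    """p_{k|q} >= p_{k|r} and p_{l|q} <= p_{l|r} (zero-based indices)."""
    return bool(P[k, q] >= P[k, r] and P[l, q] <= P[l, r])

k, l, q, r = 1, 2, 0, 1   # rows 2 and 3, columns 1 and 2 in the paper's numbering
print(is_monotone(A), is_monotone(A_after))
print(axiom3_condition(A, k, l, q, r), axiom3_condition(A_after, k, l, q, r))
print(cumulative(A_after))
```

Both matrices are monotone, yet the first inequality fails in either matrix (0.3 < 0.4 before and 0.2 < 0.5 after), so the transformation falls outside the conditions under which ELC requires an increase.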
Finally, how prevalent are monotone matrices in practice? It is not easy to produce a
"census" of all size transition matrices ever computed, or computable, from datasets. Van de
Gaer et al. (2001) mention several examples of monotonicity violation in quantile matrices.
By contrast, in the case of size matrices, it seems to be easier to find several examples of
monotonicity fulfillment. For instance, in applications to educational mobility, monotone
matrices have been found for Colombia and Brazil (Behrman et al., 2001, one matrix for
each country); Italy (Checchi et al., 1999, two matrices); the US (Checchi et al. (1999,
two matrices); Johnson (2002, one matrix)); Germany (Heineck and Riphan, 2007, two
matrices); and India (Azam and Bhatt, 2012, six matrices). On the other hand, Yalonetzky
(2012) computed eight transition matrices of educational levels in Mexico (four cohorts of
adult offspring and by gender), of which six are monotone and two (25%) are not.
At the other extreme, none of the six class mobility matrices computed by Wydick (1999) for
Guatemala is monotone (although it is an intra-generational analysis). If this pattern is common
across datasets, then even if most size transition matrices of intergenerational mobility were
monotone, a minority would not be. Therefore, both the potential
differences in mobility trends generated by indices that capture different meanings, and the
incompatibilities shown by Van de Gaer et al. (2001), deserve attention in practice.

Concluding remarks
This paper’s contributions are meant as a follow-up to the seminal work by Van de Gaer
et al. (2001), which clarified the three meanings of intergenerational mobility when the
latter is measured with discretized continuous variables. In this paper, the meanings are
reinterpreted and applied in the context of discrete measures of well-being, both ordinal and
categorical. Considering the paucity of mobility indices capturing the meaning of mobility
as opportunity, the paper proposes a family of mobility indices measuring this concept for
size transition matrices. The indices are based on the work of Silber and Yalonetzky (2011).
These indices are suitable for ordinal variables and some of them have desirable normalization
properties under reasonable restrictions in the domain of admissible matrices (e.g. non-empty
rows): they fulfill IM and PM, and take specific values when the situations of maximum and
minimum mobility are attained.
Likewise, considering the scarcity of mobility indices capturing the meaning of equality of
life chances in the context of size matrices and categorical variables, the paper also proposes
a family of mobility indices capturing this meaning, based on the segregation measurement
literature (e.g. Reardon and Firebaugh, 2002). Compared to the few other indices in the
literature that satisfy ELC (e.g. Parker and Rougier (2001), Van de Gaer et al. (2001)),
those of the family proposed in this paper exhibit more desirable normalization properties,
as they fulfill both IM and PM, whereas the other indices do not fulfill PM.
The paper also shows that, without any significant restrictions in the domain of admissible
size transition matrices, mobility indices can only, at most, satisfy one of the three axioms
of meaning. By contrast, different results obtain when the domain is restricted to admit
only monotone matrices: (1) Indices reacting positively to diagonalizing transformations
anywhere in the matrix measure mobility as opportunity; (2) indices satisfying EOP also
satisfy MOV, but the reverse is not true; and (3) indices satisfying EOP also satisfy ELC,
but the reverse is not true.

Appendix
Proof of Proposition 1
Proving the satisfaction of AN and PP is straightforward as permutations of columns do
not change the sum $\sum_{j=1}^{X_{top}} \sum_{i=1}^{X_{top}} \left| F_{i|j} - F_i^{prom} \right|^{\alpha}$. Satisfaction of PM is also easy, as $O^{\alpha} = 1$
if and only if $F_{i|j} = F_i^{prom}$ $\forall i, j$, which means that all columns are identical. Satisfaction of
AN implies violation of MOV since the two axioms are incompatible according to Theorem
2. Violation of FP is also easy to show because row permutations can alter the cumulative
distribution functions, in turn altering the value of $O^{\alpha}$, whereas, by contrast, FP requires no
change in the mobility index when rows are permuted.

Violation of ELC is also easy to show with examples using $X_{top} > 2$.¹¹ Consider matrix
$A$ and its respective matrix of cumulative probability functions $A_F$, in (11). Now a diagonalizing
transformation involving the first and third rows of $A$, and probability mass of $\varepsilon = 0.1$,
yields the matrix, together with its respective cumulative matrix, in (12). Computing $O^{\alpha}$
for the two matrices (before and after the transformation) yields no change. However, according
to ELC the index's value should have increased in order to reflect higher mobility
(as equalization of life chances). Hence $O^{\alpha}$ violates ELC.

   
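\[
A = \begin{pmatrix} 0.4 & 0.2 & 0 & 0 \\ 0 & 0.4 & 0 & 0 \\ 0.2 & 0.4 & 0 & 0 \\ 0.4 & 0 & 1 & 1 \end{pmatrix}, \quad
A_F = \begin{pmatrix} 0.4 & 0.2 & 0 & 0 \\ 0.4 & 0.6 & 0 & 0 \\ 0.6 & 1 & 0 & 0 \\ 1 & 1 & 1 & 1 \end{pmatrix} \qquad (11)
\]

\[
T^{1,3,1,2}_{0.1}[A] = \begin{pmatrix} 0.3 & 0.3 & 0 & 0 \\ 0 & 0.4 & 0 & 0 \\ 0.3 & 0.3 & 0 & 0 \\ 0.4 & 0 & 1 & 1 \end{pmatrix}, \quad
T^{1,3,1,2}_{0.1}[A_F] = \begin{pmatrix} 0.3 & 0.3 & 0 & 0 \\ 0.3 & 0.7 & 0 & 0 \\ 0.6 & 1 & 0 & 0 \\ 1 & 1 & 1 & 1 \end{pmatrix} \qquad (12)
\]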
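The example in (11) and (12) can be reproduced with a short script. This sketch is limited to $\alpha = 1$ and to the double sum $\sum_j \sum_i |F_{i|j} - F_i^{prom}|^{\alpha}$ that appears in this proof; it assumes that $F_i^{prom}$ is the average of row $i$ of the cumulative matrix (by analogy with $p_i^{prom}$) and that the rest of $O^{\alpha}$ depends on the matrix only through $X_{top}$. The function names are hypothetical.

```python
import numpy as np

A = np.array([[0.4, 0.2, 0.0, 0.0],
              [0.0, 0.4, 0.0, 0.0],
              [0.2, 0.4, 0.0, 0.0],
              [0.4, 0.0, 1.0, 1.0]])
A_after = np.array([[0.3, 0.3, 0.0, 0.0],
                    [0.0, 0.4, 0.0, 0.0],
                    [0.3, 0.3, 0.0, 0.0],
                    [0.4, 0.0, 1.0, 1.0]])

def cumulative(P):
    """Column-wise cumulative distributions."""
    return np.cumsum(np.asarray(P, dtype=float), axis=0)

def deviation_sum(P, alpha):
    """Double sum of |F_{i|j} - F_i^prom|^alpha over all cells.

    Assumption: F_i^prom is the row average of the cumulative matrix.
    """
    F = cumulative(P)
    F_prom = F.mean(axis=1, keepdims=True)
    return (np.abs(F - F_prom) ** alpha).sum()

before = deviation_sum(A, 1.0)
after = deviation_sum(A_after, 1.0)
print(before, after)
```

At $\alpha = 1$ the two sums coincide, so under the stated assumptions the index does not register the transformation even though ELC calls for an increase.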
The proof that $O^{\alpha}$ satisfies EOP requires more elaboration but, in essence, is similar to
the proof used by Van de Gaer et al. (2001, Section III). Consider the following general
function $g(|F_{i|j} - F_i^{prom}|)$ with $g' > 0$. Now consider the index $G = -\sum_{i=1}^{X_{top}} \sum_{j=1}^{X_{top}} g(|F_{i|j} - F_i^{prom}|)$.
A small diagonalizing transformation under the conditions stipulated by the EOP
axiom has the following total effect on $G$ (approximated by the total derivative):

\[
dG \cong \sum_{i=k}^{l-1} d\varepsilon \left[ g'(|F_{i|q} - F_i^{prom}|)(-1)^{I(F_{i|q} < F_i^{prom})} - g'(|F_{i|r} - F_i^{prom}|)(-1)^{I(F_{i|r} < F_i^{prom})} \right] \qquad (13)
\]

Bearing in mind that an implication of the condition of Axiom 2 is that $F_{i|r} \le F_{i|q}$ $\forall i$ and
$\exists t \mid F_{t|r} < F_{t|q}$, there are three situations, for each row $i$ between rows $k$ and $l-1$, that need to
be assessed in order to ascertain the sign of $dG$: I. $F_{i|r} \le F_{i|q} < F_i^{prom}$; II. $F_{i|r} \le F_i^{prom} \le F_{i|q}$;
and III. $F_i^{prom} < F_{i|r} \le F_{i|q}$. In case II it is easy to show that $dG > 0$, as long as $g' > 0$. In
cases I and III, $dG > 0$ if and only if $g'' > 0$, i.e. if $g$ is strictly convex.
Now $|F_{i|j} - F_i^{prom}|^{\alpha}$ is strictly convex if and only if $\alpha > 1$. Therefore $O^{\alpha}$ satisfies EOP $\forall \alpha > 1$.

¹¹ When $X_{top} = 2$, EOP and ELC are identical.

Proof of Proposition 2
Proving the satisfaction of AN, FP and PP is straightforward as permutations of columns, and
rows, do not change the sum $\sum_{j=1}^{X_{top}} \sum_{i=1}^{X_{top}} \left| p_{i|j} - p_i^{prom} \right|^{\beta}$. Satisfaction of PM is also easy, as
$C^{\beta} = 1$ if and only if $p_{i|j} = p_i^{prom}$ $\forall i, j$, which means that all columns are identical. Satisfaction
of AN implies violation of MOV since the two axioms are incompatible according to Theorem
2.
Violation of EOP is also easy to show with examples using $X_{top} > 2$, as in the previous
sub-section. Consider, for instance, the pairs of matrices in (9) and (10). Computing $C^{\beta}$ for
the two matrices (before and after the transformation) yields no increase. However, according
to EOP the index's value should have increased in order to reflect higher mobility (as
equalization of opportunities). Hence $C^{\beta}$ violates EOP.

The proof that $C^{\beta}$ satisfies ELC is similar to the one worked out above in order to
show that $O^{\alpha}$ fulfills EOP. Consider, again, the general function $g(|p_{i|j} - p_i^{prom}|)$ with
$g' > 0$. Now consider the index $G = -\sum_{i=1}^{X_{top}} \sum_{j=1}^{X_{top}} g(|p_{i|j} - p_i^{prom}|)$. A small diagonalizing
transformation under the conditions stipulated by ELC has the following total effect on $G$
(approximated by the total derivative):

\[
\begin{aligned}
dG \cong d\varepsilon \Big[ & g'(|p_{k|q} - p_k^{prom}|)(-1)^{I(p_{k|q} < p_k^{prom})} - g'(|p_{k|r} - p_k^{prom}|)(-1)^{I(p_{k|r} < p_k^{prom})} \\
& + g'(|p_{l|r} - p_l^{prom}|)(-1)^{I(p_{l|r} < p_l^{prom})} - g'(|p_{l|q} - p_l^{prom}|)(-1)^{I(p_{l|q} < p_l^{prom})} \Big] \qquad (14)
\end{aligned}
\]

Bearing in mind that an implication of the condition of Axiom 3 is that $p_{k|q} > p_{k|r}$ and
$p_{l|q} < p_{l|r}$, there are three situations, for $k$, that need to be assessed in order to ascertain
the sign of $dG$: I. $p_{k|r} < p_{k|q} < p_k^{prom}$; II. $p_{k|r} \le p_k^{prom} < p_{k|q}$; and III. $p_k^{prom} < p_{k|r} < p_{k|q}$.
Likewise, for $l$ there are three analogous situations: I. $p_{l|q} < p_{l|r} < p_l^{prom}$; II. $p_{l|q} \le p_l^{prom} < p_{l|r}$;
and III. $p_l^{prom} < p_{l|q} < p_{l|r}$.
When the two cases II hold simultaneously, it is easy to show that $dG > 0$, as long as
$g' > 0$. Otherwise, $dG > 0$ if and only if $g'' > 0$, i.e. if $g$ is strictly convex.
Now $|p_{i|j} - p_i^{prom}|^{\beta}$ is strictly convex if and only if $\beta > 1$. Therefore $C^{\beta}$ satisfies ELC $\forall \beta > 1$.
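The total derivative in (14) can be checked against a finite difference. The sketch below is an illustration only: it assumes that the diagonalizing transformation moves $\varepsilon$ from cell $(k,q)$ to $(l,q)$ and from $(l,r)$ to $(k,r)$ (the reading consistent with the signs in (14)), it uses $g(x) = x^2$ as one convenient strictly convex choice, and the matrix `M` and all function names are hypothetical.

```python
import numpy as np

def G(P, g):
    """G = -sum_i sum_j g(|p_{i|j} - p_i^prom|)."""
    P = np.asarray(P, dtype=float)
    p_prom = P.mean(axis=1, keepdims=True)
    return -g(np.abs(P - p_prom)).sum()

def transform(P, k, l, q, r, eps):
    """Assumed transformation: eps leaves (k,q) and (l,r), and enters (l,q) and (k,r)."""
    T = np.array(P, dtype=float)
    T[k, q] -= eps
    T[l, q] += eps
    T[l, r] -= eps
    T[k, r] += eps
    return T

def bracket_14(P, g_prime, k, l, q, r):
    """Bracketed term of (14), i.e. dG / d(eps)."""
    P = np.asarray(P, dtype=float)
    p_prom = P.mean(axis=1)
    def term(i, j):
        sign = -1.0 if P[i, j] < p_prom[i] else 1.0   # (-1)^I(p < p^prom)
        return g_prime(abs(P[i, j] - p_prom[i])) * sign
    return term(k, q) - term(k, r) + term(l, r) - term(l, q)

M = np.array([[0.6, 0.2, 0.2],     # hypothetical matrix, columns sum to one
              [0.1, 0.5, 0.3],
              [0.3, 0.3, 0.5]])
k, l, q, r = 0, 1, 0, 1            # zero-based rows and columns
eps = 1e-6
g = lambda x: x ** 2
g_prime = lambda x: 2.0 * x

numeric = (G(transform(M, k, l, q, r, eps), g) - G(M, g)) / eps
analytic = bracket_14(M, g_prime, k, l, q, r)
print(numeric, analytic)
```

In this matrix both rows fall under case II, so the derivative is positive for any increasing $g$; the finite difference and the bracket of (14) agree up to a term of order $\varepsilon$.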

Acknowledgments
I would like to thank Dirk Van de Gaer for detailed comments on an earlier draft, and seminar
participants at the World Bank Conference "Inequality of what? Outcomes, opportunities
and fairness", for very helpful comments.

References
Azam, M. and V. Bhatt (2012). Like father, like son? Intergenerational education mobility
in India. Mimeo.

Bartholomew, D. (1982). Stochastic models for social processes. Wiley.

Behrman, J., A. Gaviria, and M. Szekely (2001). Intergenerational mobility in Latin America.
Economia 2 (1), 1–44.

Checchi, D., A. Ichino, and A. Rustichini (1999). More equal but less mobile? Education
financing and intergenerational mobility in Italy and the U.S. Journal of Public Economics 74, 351–93.

Cowell, F. (1985). Measures of distributional change: an axiomatic approach. Review of
Economic Studies LII, 135–51.

Dardanoni, V. (1995). Income distribution dynamics: monotone Markov chains make light
work. Social Choice and Welfare 12, 181–92.

Fields, G. and E. Ok (1996). The meaning and measurement of income mobility. Journal of
Economic Theory 71, 349–77.

Fields, G. and E. Ok (1999). Measuring movement of incomes. Economica 66 (264), 455–71.

Fleurbaey, M. (2008). Fairness, Responsibility and Welfare. Oxford University Press.

Formby, J., J. Smith, and B. Zheng (2004). Mobility measurement, transition matrices and
statistical inference. Journal of Econometrics 120, 181–205.

Heineck, G. and R. Riphan (2007). Intergenerational transmission of educational attainment
in Germany: The last five decades. IZA Discussion Paper Series 2985.

Johnson, P. (2002). Intergenerational dependence in education and income. Applied Economics
Letters 9, 159–62.

Parker, S. and J. Rougier (2001). Measuring social mobility as unpredictability. Economica 68, 63–76.

Reardon, S. and G. Firebaugh (2002). Measures of multigroup segregation. Sociological
Methodology 32, 33–67.

Schluter, C. and D. van de Gaer (2011). Upward structural mobility, exchange mobility,
and subgroup consistent mobility measurement: U.S.-German mobility rankings revisited.
Review of Income and Wealth 57 (1), 1–22.

Shorrocks, A. (1978). The measurement of mobility. Econometrica 46 (5), 1013–24.

Silber, J. and G. Yalonetzky (2011). Measuring inequality in life chances with ordinal vari-
ables. In J. Bishop (Ed.), Research on Economic Inequality, Volume 19, Chapter 4, pp.
77–98. Emerald.

Sommers, P. and J. Conlisk (1979). Eigenvalue immobility measures for Markov chains.
Journal of Mathematical Sociology 6, 253–76.

Van de Gaer, D., E. Schokkaert, and M. Martinez (2001). Three meanings of intergenerational
mobility. Economica 68 (272), 519–37.

Wydick, B. (1999). Credit access, human capital and class structure mobility. Journal of
Development Studies 35 (6), 131–52.

Yalonetzky, G. (2012). Intergenerational mobility of education in Mexico: an analysis by
cohorts and gender. Mimeo.
