
Preferences, Beliefs, and Early Investments in Children

Flávio Cunha

University of Pennsylvania

Abstract
PRELIMINARY AND INCOMPLETE, PLEASE DO NOT CIRCULATE OR CITE. In general, economic models of
human development assume that mothers have rational expectations about the technology of skill formation, that
is, about the process that determines how their offspring develop skills in cognitive domains. This assumption implies that all women, regardless of their race, education, or socio-economic status, know the objective distribution
function of the marginal returns to investments in the development of their children. When such models are structurally estimated, the variation in observed investments across families is attributed to shocks or to heterogeneity in the
characteristics of the children or families, but not to heterogeneity in the knowledge base of the mother. In this
paper, I formulate an economic model of human development that allows heterogeneity in maternal beliefs
to play a role in the maternal choices of investments. Beliefs matter because mothers may have biased expectations
about their child's full potential (reference points) and/or because they may have biased beliefs about the returns
to investments (subjective beliefs about the technology of skill formation). To empirically determine the importance of these issues, I analyze data from the Maternal Knowledge of Infant Development Survey (MKIDS), a unique
dataset that has information about both subjective beliefs about the technology of skill formation and reference
points. I then pool the CNLSY/79 and MKIDS datasets to estimate a structural model and simulate the importance of
beliefs and reference points in explaining the heterogeneity in parental investments.

I thank James Heckman and Robert Dugger for their unwavering encouragement and support of my research on the economics of human development. Dalton Banks, Michelle Giffords, Debbie Jaffe, Snejana Nihtienova, Ben Sapp, and Cheryl Tocci
provided excellent research assistance. This paper has benefited from my collaborations with Anton Badev, Jere Behrman, Jennifer Culhane, and Irma Elo. I am thankful to Limor Golan, James Heckman, Maximilian Kasy, Charles Manski, Pedro Mira,
Andy Postlewaite, Matt Wiswall, and Ken Wolpin for their comments. I also thank seminar participants at UCLA and Stanford, as
well as participants at the Early Childhood Development and Human Capital Accumulation conference at UCL, the Institute for
Research on Poverty's Summer Workshop at the University of Wisconsin - Madison, the Children's Human Capital Development workshop at Aarhus University in Denmark, and the CES-ifo conference on the Economics of Education. Funding was
generously provided by a grant from the Institute for New Economic Thinking.

1 Introduction

Inequalities in skills are fundamentally linked to economic and social inequalities. A substantial fraction
of the residual variance in log wages is due to the stock of human capital that accumulates long before
individuals begin to work (Keane and Wolpin, 1997; Cunha, Heckman, and Navarro, 2005). Gaps in
college enrollment among young men classified by the household income of their respective parents are
substantively explained by gaps in skills as measured during the teenage years of the subjects' lives (e.g.,
Cameron and Heckman, 1998; Carneiro and Heckman, 2003; Cunha, Heckman, Lochner, and Masterov,
2006). The same is true for the notable black-white gap in wages earned by men (Neal and Johnson, 1996).
Low investments in young children in part explain the emergence of gaps in skills (Cunha and Heckman, 2007). The technology of skill formation is such that large investments in later stages of the lifecycle
are necessary to compensate for early neglect (Cunha and Heckman, 2008; Cunha, Heckman, and Schennach, 2010). This partially explains why it is more difficult to reduce inequalities with interventions made
during the adult stages of the lifecycle (see literature review in Carneiro and Heckman, 2003). The evidence from experimental as well as non-experimental data shows that increasing early investments for
children from disadvantaged backgrounds has significant economic benefits with respect both to labor
market outcomes (e.g., higher labor income, stronger attachment to the labor force, less dependence on
welfare) and with respect to indicators of high performance in other dimensions of life, such as higher
educational attainment for females and lower probability of participation in criminal activities for males
(Karoly, Kilburn, and Cannon, 2005; Ludwig and Miller, 2007; Campbell et al., 2008; Hoddinott, Maluccio,
Behrman, Flores, and Martorell, 2008; Maluccio, Hoddinott, Behrman, Quisumbing, Martorell, and Stein,
2008; Reynolds and Temple, 2008; Behrman, Calderon, Preston, Hoddinott, Martorell, and Stein, 2009;
Heckman, Moon, Pinto, Savelyev, and Yavitz, 2010).
Low investments in children are more common in low socio-economic status (SES) families. A vast
literature on language development has shown that the quantity (e.g., number of words) and quality (e.g.,
variety of words and greater syntactic complexity) of verbal interaction between parents and children increase with SES (Brody, 1968; Dunn, Wooding, and Herman, 1977; Field and Pawlby, 1980; Hoff-Ginsberg,
1991; Hess and Shipman, 1965; Ninio, 1980; Tulkin and Kagan, 1972). Parents with high SES engage in
more explicit teaching (e.g., teaching about object labels or teaching about causality) than do parents with
low SES (Brophy, 1970; Hammer and Weiss, 1999; Lawrence and Shipley, 1996). The analysis of time-use
data by Bianchi, Robinson, and Milkie (2007) demonstrates that higher-SES mothers spend more time on
teaching activities with their children than do lower-SES mothers. Kalil, Ryan, and Corey (2009) show that
higher-SES mothers not only spend more time with their children, but are also more likely to dedicate time to activities
that best suit their children's developmental needs. There is evidence that investments, measured either
directly by specific maternal actions or indirectly by Home Observation for Measurement of the Environment (HOME) scores, increase with one important determinant of family SES: maternal schooling. Currie
and Moretti (2002) explore exogenous variations in college-attendance costs to show that maternal schooling raises investments in the health of children in the form of more frequent prenatal care and reduced
smoking. Carneiro, Meghir, and Parey (forthcoming) use a similar strategy and conclude that as maternal schooling increases, so does the mother's provision of appropriate play materials and daily
stimulation for her child.
There are at least four non-mutually exclusive explanations for the pattern between investments and
family SES. First, suppose that all children are identical and parents are the same in every aspect (including
their characteristics and preferences) except for permanent income. If parents cannot borrow against the
future permanent income of their children, then the amount that parents invest is determined by their
permanent income, which is lower in low-SES families (Becker and Tomes, 1986; Dahl and Lochner, 2008;
Duncan, Ziol-Guest, and Kalil, 2010; Brown, Scholz, and Seshadri, forthcoming).
Second, differences in maternal preferences about two (or more) distinct dimensions of human capital
can also explain the observed pattern of investments. For example, Lynd and Lynd (1929, 1937) reported
that working-class mothers ranked "strict obedience" as their most important childrearing goal more frequently than higher-SES mothers did. Harwood (1992) found that, when asked to describe how they
would like their toddlers to behave if left with a stranger in a doctor's waiting room, lower-SES mothers
rated proper demeanor as more important than did higher-SES mothers. This finding has been confirmed
in other contexts as well (e.g., Alwin, 1984; Luster, Rhoades, and Haas, 1989; Pearlin and Kohn, 1966;
Tudge, Hogan, Snezhkova, Kulakova, and Etz, 2000; Wright and Wright, 1976). The data on language
interaction partially support this view. They show that high-SES mothers devote relatively more time to
the formation of their children's cognitive skills while low-SES mothers devote more time to teaching their
children to become obedient (see references about language interaction above). The sociology literature argues
that the stronger preferences towards socio-emotional skills by lower-SES mothers reflect those mothers'
forecasts for their children choosing occupations in which obedience and conformity have relatively higher
returns (Kohn, 1963).
Third, marginal returns to investments and, consequently, levels of investments, may be affected by
the characteristics of the parents or those of the child. For example, mothers' skills may affect how much
schooling they obtain and the productivity of their investments (Behrman and Rosenzweig, 2002). An
example is self-efficacy (Bandura, 1977): if efficacy in the context of schooling is positively correlated
with efficacy in the parenting context, then highly efficacious individuals would attain higher levels of
schooling and invest more in their children. A similar argument holds for the characteristics of the child.
In fact, differences in ability across children may affect investments across or within households (Becker,
1967; Behrman, Rosenzweig and Taubman, 1982; Behrman, Rosenzweig and Taubman, 1994; Behrman,
Rosenzweig and Taubman, 1995; Ashenfelter and Rouse, 1998; Aizer and Cunha, 2011).
Fourth, maternal subjective beliefs about the technology of skill formation may be correlated with SES.
These beliefs partially determine maternal expectations about returns for investments, which, in turn,
determine investment choices. If markets are complete and if low-SES mothers' beliefs generate low expectations for returns to investments, then low-SES mothers will invest too little in their children. Such
choices may have seriously detrimental consequences for their children's development. While a number
of studies have considered the first three explanations, I have not been able to find any work that has
measured maternal beliefs about the technology of skill formation. This is the focus of the current study.
That low subjective beliefs about returns may affect investments has been recognized in developmental
psychology for at least 50 years (Hunt, 1961; Vygotsky, 1978). The issue I explore in this paper is related
to, but different from, that field's large body of literature focused on measuring maternal and paternal
knowledge about child development. These studies show that the lower the parents' SES, the lower their
expectation about cognitive development, but not necessarily about the development of other domains of
human capital (e.g., Epstein, 1979; Hess et al., 1980; Ninio, 1988; Ninio and Rinott, 1988; Mansbach and
Greenbaum, 1999). This literature also finds that fathers tend to have lower expectations than mothers,
but the difference is smaller the higher the schooling attainment of the father and the higher the paternal
involvement in the care of the child.
Ceteris paribus, the gaps in beliefs about when children will master certain skills may arise because of
different reasons. First, the gaps may be a product of the differences in investments that arise even if parents had the same beliefs about the technology of skill formation. These differences in investments would
be associated with differences in preferences, resources, or parental/offspring characteristics as discussed
above. Second, the gaps may be a product of the differences in beliefs holding investments fixed. Obviously, it is also possible that part of the differences in investments arise because of differences in beliefs
as well as differences in preferences, resources, and parental/offspring characteristics. Unfortunately, the
data collected so far in the different fields of the social sciences do not allow us to isolate the importance
of these different factors.
Even though no studies quantify maternal beliefs about the technology of skill formation, a few studies do suggest that maternal beliefs play a significant role in the determination of investments. More
importantly, these studies also suggest that beliefs can be changed by public policy. Aizer and Stroud
(2010), for example, track the smoking habits of educated and non-educated pregnant women before and
after the release of the 1964 Surgeon General Report on Smoking and Health. Before the release of the
report, educated and non-educated pregnant women smoked at roughly the same rates. After the report,
smoking among educated women decreased immediately, and a ten-percentage-point gap in smoking suddenly
opened between educated and non-educated pregnant women. Glied and Lleras-Muney (2008) find that inequalities in mortality across schooling groups increase for diseases that engage
more technological innovations in treatment. An interpretation of this finding is that information changes
some individuals' beliefs about the returns (in life-expectancy years) of treatments and, as a result, these
individuals choose to have the procedures more often than those whose beliefs were not affected (see also
Price and Simon, 2009).
Another piece of evidence is offered by Roy (2009), who investigates how maternal investments, in the
context of a developing country, are affected by maternal knowledge that a particular nutritional input,
iodized salt, prevents early onset brain damage. As she documents in her data, mothers often elect not
to use iodized salt, despite it being inexpensive, because they are simply unaware of its impact on child
development. In the Philippines data analyzed by Roy, children's achievement increased by 14% of a
standard deviation one year after parents became aware of the value of this input.
Further evidence that maternal knowledge matters and can be affected by policy interventions comes
from the Nurse-Family Partnership Study (NFP, Olds et al., 2002). The study's target group was first-time pregnant women in the U.S. who either qualified for Medicaid or had no health insurance. Women
who agreed to participate in the study were randomly assigned to a control or treatment group. The
women in the treatment group received home visitations by nurses during pregnancy and infancy. A goal
of the home visitations was to improve the health and development of the child by helping the parents
to provide more competent care. As reported in Olds et al (2002), the women in the treatment group
had lower cotinine levels during pregnancy and were less likely to have a subsequent pregnancy in the
24 months following delivery. Moreover, these mothers provided more enriching environments to their
infants after delivery and, in turn, their children exhibited superior cognitive and emotional development
at the age of 21 months.
Interestingly, other early childhood programs that have been shown to produce long-term impacts (e.g.,
Perry Preschool and the Chicago Child-Parent Centers) provide parents with some information about
child development (see program descriptions in Cunha, Heckman, Lochner, and Masterov, 2006). The picture that arises from the research described above is that maternal investments may be strongly influenced
by maternal beliefs about the technology of skill formation. In spite of this body of literature, economic
theories of human capital accumulation assume that all women regardless of their race, education, or
socio-economic status know the true values of the parameters of the technology of skill formation as well
as the distribution functions of shocks affecting the trajectories of skills. When such models are structurally
estimated, the variation in observed investments across families is attributed to shocks, heterogeneity in
the characteristics of the children or families and not to the heterogeneity in the knowledge base of the
mothers.
The evidence suggests that beliefs matter and can be manipulated with information with significant
consequences for inequality in cognitive, social-emotional, and health development. Information is a public good and, as such, it is known that private incentives do not necessarily generate a socially optimal
allocation. In spite of the importance of such issues, current research on human capital formation has no
data or modeling framework to study them.
This paper is organized in the following way. In Section 2, I present the model. In Section 3, I describe
the data and the assumptions I impose to estimate the technology of skill formation. In Section 4, I describe
the elicitation of reference points and subjective beliefs about the technology of skill formation. Section 5
describes the next steps.

2 Model

I consider a household that has only one child, and I treat the problem as static. This simple framework allows me
to focus on the identification issues that arise when I introduce reference points and maternal beliefs
about the technology of skill formation. A fully dynamic treatment of the problem requires modeling the
dynamics of beliefs as well as those of the skills. This is done by Badev and Cunha (2012) who formulate
a model in which parents are learning about the technology of skill formation.

2.1 Budget Constraint

Consider a mother who has income $y_i$ that can be allocated between consumption $c_i$ and child investment
$x_i$. In every period, the mother faces the following budget constraint:
$$c_i + x_i = y_i, \tag{1}$$
where the price of investment in terms of consumption, $p$, is normalized to one. Because this is essentially a static
problem, I forgo issues about credit constraints, which may be important in the determination of investments (e.g., Becker and Tomes, 1986; Cunha and Heckman, 2007; Caucutt and Lochner, 2012).

2.2 Preferences

Let $c_i$ and $q_{i,1}$ denote, respectively, household consumption and the quality of the child at the end of the
period. The maternal preferences are given by:
$$u(c_i, q_{i,1}, q_i^R) = \ln(c_i) + \alpha_1 \ln(q_{i,1}) + \alpha_2\, \mathbf{1}\left[ q_{i,1} \le q_i^R \right] \left[ \ln(q_{i,1}) - \ln(q_i^R) \right]. \tag{2}$$

The utility function is separable in consumption and child quality. The marginal utility of quality
depends on the maternal reference point of child development, $q_i^R$. For $\alpha_2 > 0$, the utility function (2)
captures the situation in which the parent has aversion to under-development and insensitivity to over-development. That is, ceteris paribus, one extra unit of child quality increases parental utility by more
if $q_{i,1} \le q_i^R$ than when the opposite happens. In particular, a mother's reference point $q_i^R$ could be so
unrealistically high (e.g., knowledge of basic algebra by age 2 years) that even if $q_{i,1} \le q_i^R$,
the child could still be considered "over-developed" by a trained developmental psychologist in spite of
the mother's disappointment that the goal $q_i^R$ was not achieved. An analogous reasoning applies to the
situation in which the reference point $q_i^R$ is extremely low.
In my model, the mother faces uncertainty about the reference point. That is, from the mother's point
of view, $\ln q_i^R$ is a random variable that is normally distributed with mean $\mu_{R,i}$ and variance $\sigma^2_{R,i}$.

2.3 The Technology of Skill Formation

The relationship between investments and skills is governed by the technology of skill formation:
$$q_{i,1} = f(q_{i,0}, x_i, \eta_i, \varepsilon_{i,1}),$$
where $q_{i,1}$ is the amount of skill that was produced during the period and $f$ is the technology of skill
formation. The inputs in the production of skills are the stock of skills at the beginning of the period, $q_{i,0}$,
the parental investments $x_i$, the maternal fixed-effect $\eta_i$, and the shock $\varepsilon_{i,1}$, which is observed by the mother
after she makes investment choices.

Following the empirical findings of Cunha, Heckman, and Schennach (2010), I assume that the production function of skills $f$ is Cobb-Douglas:$^1$
$$f(q_{i,0}, x_i, \eta_i, \varepsilon_{i,1}) = A\, q_{i,0}\, x_i^{\gamma}\, e^{\eta_i + \varepsilon_{i,1}}. \tag{3}$$
The intercept $A$ is the parameter that governs self-productivity of skills and I assume it is known by the
parent. The parameter $\gamma$ is the component of the production function that may or may not be known by
the mother. This parameter is important in the determination of investments because the higher $\gamma$, the
higher the marginal returns to investments.

2.4 Maternal Beliefs about the Technology of Skill Formation

From the point of view of a mother who is about to choose investments in the child, the parameter $\gamma$ and
the term $\varepsilon_{i,1}$ are random variables. More specifically, I assume that for mother $i$, $\gamma$ is normally distributed
with mean $\mu_{\gamma,i}$ and variance $\sigma^2_{\gamma,i}$, and $\varepsilon_{i,1}$ is also normally distributed with mean zero and variance $\sigma^2_{\varepsilon,i}$.
It is helpful to define mother $i$'s information set, $\Omega_i$. At the time that the mother makes investment
choices, she observes income $y_i$, the child's initial condition $q_{i,0}$, the maternal fixed-effect $\eta_i$, the mean and
variance of the distribution of reference points, $(\mu_{R,i}, \sigma^2_{R,i})$, the mean and variance of the maternal beliefs
about $\gamma$, $(\mu_{\gamma,i}, \sigma^2_{\gamma,i})$, and the variance of shocks $\varepsilon_{i,1}$, $\sigma^2_{\varepsilon,i}$. Thus:
$$\Omega_i = \left\{ y_i, q_{i,0}, \eta_i, \mu_{R,i}, \sigma^2_{R,i}, \mu_{\gamma,i}, \sigma^2_{\gamma,i}, \sigma^2_{\varepsilon,i} \right\}.$$
Now, consider a mother who chooses to invest $x_i$ in her child. Then, conditional on the maternal
information set $\Omega_i$ and investments $x_i$, the expected natural logarithm of value added is:
$$E\left[ \ln q_{i,1} \mid \Omega_i, x_i \right] = \ln A + \ln q_{i,0} + \mu_{\gamma,i} \ln x_i + \eta_i. \tag{4}$$
Clearly, the lower the maternal expectation about $\gamma$, the lower the maternal expected returns to investments. The variance of the natural logarithm of value added is:
$$\mathrm{Var}\left[ \ln q_{i,1} \mid \Omega_i, x_i \right] = \sigma^2_{\varepsilon,i} + \sigma^2_{\gamma,i} \left[ \ln x_i \right]^2. \tag{5}$$
The higher the subjective variance of $\gamma$, the more uncertain the mother is about the returns to investments.
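The two subjective moments in (4) and (5) can be sketched in a few lines of Python. This is only an illustration: the function names, the argument names (`mu_gamma` and `var_gamma` for the mean and variance of the mother's belief about the investment share, `var_eps` for the shock variance), and the numerical values are assumptions introduced here, not part of the paper.

```python
import math

def expected_log_quality(A, q0, x, eta, mu_gamma):
    """Subjective mean of ln q_{i,1} given investment x, following equation (4)."""
    return math.log(A) + math.log(q0) + mu_gamma * math.log(x) + eta

def variance_log_quality(x, var_gamma, var_eps):
    """Subjective variance of ln q_{i,1} given investment x, following equation (5)."""
    return var_eps + var_gamma * math.log(x) ** 2

# Hypothetical values, chosen only for illustration.
A, q0, eta = 1.0, 2.0, 0.0
for mu_gamma in (0.1, 0.3):
    gain = (expected_log_quality(A, q0, 4.0, eta, mu_gamma)
            - expected_log_quality(A, q0, 2.0, eta, mu_gamma))
    print(f"mu_gamma = {mu_gamma}: expected log gain from doubling x = {gain:.4f}")
```

The loop shows the point made in the text: the expected gain from doubling investment equals the mean belief times ln 2, so a mother with a lower mean belief expects a smaller return from the same increase in investment.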
1 Cunha, Heckman, and Schennach (2010) test and cannot reject the Cobb-Douglas formulation when they estimate the
process using only cognitive skills (see the results in the Online Appendix of their paper).

2.5 The Problem of the Mother

I can write the problem of the mother as:
$$V(\Omega_i) = \max_{x_i} E\left[ u(c_i, q_{i,1}, q_i^R) \mid \Omega_i, x_i \right]$$
subject to the budget constraint (1), the technology of skill formation (3), and the information set $\Omega_i$, which
contains, in particular, the maternal information about the distribution of reference points and the maternal subjective beliefs about the technology of skill formation.
Note that the technology of skill formation (3) drives the dynamics of accumulation of skills, but because this is not necessarily known by the mother, it does not influence the maternal decisions of investments. Instead, the maternal choice of investments is driven by the expected marginal returns (as
perceived by the mother) as well as the variance of the marginal returns. These quantities are idiosyncratic
and can vary from one mother to the next.
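It is possible to obtain a closed-form expression for the problem of the mother:
$$V(\Omega_i) = \max_{x_i} \Big\{ \ln(y_i - x_i) + (\alpha_1 + \alpha_2) \left[ \ln A + \ln q_{i,0} + \mu_{\gamma,i} \ln x_i + \eta_i \right] - \alpha_2 \Big[ \mu_{R,i} + \sigma_i \big( \phi(a_i) - a_i \left[ 1 - \Phi(a_i) \right] \big) \Big] \Big\},$$
where $\phi$ and $\Phi$ denote the standard normal density and distribution functions,
$$a_i = \frac{\mu_{R,i} - \ln A - \ln q_{i,0} - \mu_{\gamma,i} \ln x_i - \eta_i}{\sigma_i},$$
and
$$\sigma_i^2 = \sigma^2_{R,i} + \sigma^2_{\varepsilon,i} + \sigma^2_{\gamma,i} \left[ \ln x_i \right]^2.$$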
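The mother's problem can also be evaluated numerically. The sketch below is not the paper's estimation code. It assumes the utility function (2) with weights `a1` and `a2` on child quality and on the reference-dependent term, normal and mutually independent beliefs about the reference point, the investment share, and the shock, and a price of investment equal to one. It uses the standard identity E[max(Z, 0)] = m Φ(m/s) + s φ(m/s) for Z ~ N(m, s²), and it replaces an analytical first-order condition with a simple grid search. All names and parameter values are hypothetical.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_shortfall(gap_mean, gap_sd):
    """E[max(Z, 0)] for Z ~ N(gap_mean, gap_sd**2); requires gap_sd > 0."""
    a = gap_mean / gap_sd
    return gap_sd * (a * norm_cdf(a) + norm_pdf(a))

def expected_utility(x, y, A, q0, eta, a1, a2, mu_R, var_R, mu_g, var_g, var_e):
    """Mother's expected utility from investing x, for 0 < x < y."""
    m = math.log(A) + math.log(q0) + mu_g * math.log(x) + eta   # mean of ln q_1, eq. (4)
    sd = math.sqrt(var_R + var_e + var_g * math.log(x) ** 2)    # s.d. of ln q^R - ln q_1
    return math.log(y - x) + a1 * m - a2 * expected_shortfall(mu_R - m, sd)

def optimal_investment(y, grid_size=2000, **params):
    """Grid search for the investment level in (0, y) with the highest expected utility."""
    grid = [y * k / (grid_size + 1) for k in range(1, grid_size + 1)]
    return max(grid, key=lambda x: expected_utility(x, y, **params))

# Hypothetical parameter values, for illustration only.
params = dict(A=1.0, q0=1.0, eta=0.0, a1=1.0, a2=1.0,
              mu_R=0.0, var_R=1.0, mu_g=0.5, var_g=0.0, var_e=1.0)
low = optimal_investment(10.0, **params)
high = optimal_investment(10.0, **{**params, "mu_R": 5.0})
print(f"low reference point: x* = {low:.3f}; high reference point: x* = {high:.3f}")
```

Writing the reference-dependent term as an expected shortfall keeps the code short. A grid search is used because it needs no derivatives and makes no claim about how the author solves the problem.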

I next describe qualitatively how the predictions of the model above differ from those of a model that
does not feature either reference points or subjective beliefs about the technology of skill formation. To
build the intuition, I assume that $\sigma^2_{\gamma,i} = 0$, so that mothers believe that the share of investments in the
Cobb-Douglas production function is $\mu_{\gamma,i}$.$^2$
Figure 1 displays the optimal choices of child development (horizontal axis) and household consumption (vertical axis) when the mother knows or does not know the parameter $\gamma$ in the technology of skill
formation (3). The constraint depicted by the blue curve is the constraint that mothers would face if they
knew the true value of $\gamma$. The constraint captured by the red curve is the constraint that mothers face and
is determined by their mean beliefs $\mu_{\gamma,i}$, which in this case are such that $\mu_{\gamma,i} < \gamma$.$^3$ For the type of preferences depicted in Figure 1, the fact that $\mu_{\gamma,i} < \gamma$ induces investments to be lower than they would be if the
mothers knew the true value of $\gamma$. In other words, a parental education program that raised maternal
beliefs from $\mu_{\gamma,i}$ to $\gamma$ would generate significant increases in parental investments and child development.

2 When $\sigma^2_{\gamma,i} = 0$, there are two interpretations that are essentially identical. The first is that the mother has deterministic
beliefs (which may or may not be biased). The second is that the mother adopts a technology in which the share of investments is $\mu_{\gamma,i} \neq \gamma$, but she knows it is so. Note, however, that the empirical evidence that I describe below supports the
situation in which $\sigma^2_{\gamma,i} > 0$, which implies that mothers do face substantial uncertainty about the impacts of investments on
child development.
3 To see how one can arrive at this constraint, note that it is possible to invert the production function
of human capital and write:
$$x_i = \left( \frac{q_{i,1}}{A\, q_{i,0}\, e^{\eta_i + \varepsilon_{i,1}}} \right)^{1/\mu_{\gamma,i}}.$$
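The direction of the effect described for Figure 1 can be checked in a special case. As a sketch only, assume that reference points play no role ($\alpha_2 = 0$, writing $\alpha_1$ and $\alpha_2$ for the weights on child quality and on the reference-dependent term in (2)), that the price of investment is one as in (1), and let $\mu_{\gamma,i}$ denote the mother's mean belief about the investment share $\gamma$. The mother then solves

$$\max_{x_i} \; \ln(y_i - x_i) + \alpha_1 \left[ \ln A + \ln q_{i,0} + \mu_{\gamma,i} \ln x_i + \eta_i \right],$$

and the first-order condition gives

$$-\frac{1}{y_i - x_i} + \frac{\alpha_1 \mu_{\gamma,i}}{x_i} = 0 \quad \Longrightarrow \quad x_i^{*} = \frac{\alpha_1 \mu_{\gamma,i}}{1 + \alpha_1 \mu_{\gamma,i}}\, y_i .$$

For $\alpha_1 > 0$, $x_i^{*}$ is increasing in $\mu_{\gamma,i}$, so a mother with $\mu_{\gamma,i} < \gamma$ invests less than she would if she knew $\gamma$. This special case is not the preference specification drawn in Figure 1; it only illustrates the mechanism.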
To understand the importance of reference points in the determination of early investments and child
development, suppose that mothers know the true value of the parameter $\gamma$. However, mothers are heterogeneous with respect to the reference points. In particular, Figure 2 shows one mother who has high
reference points and another mother who has low reference points. Clearly, these mothers make different
choices even though they are identical in all of the state variables, with the exception of $\mu_{R,i}$.
The discussion above shows that heterogeneity in mean beliefs or in the mean of reference points can
generate substantial heterogeneity in investments. It is also possible that differences in investments arise
because of the heterogeneity in the maternal fixed-effect. Obviously, once one allows for heterogeneity
in the variance of beliefs about the technology of skill formation or uncertainty about reference points, then
identification problems will certainly be insurmountable if all of these state variables are unobserved. For
this reason, I will propose and carry out a procedure to elicit information about the means and variances
of beliefs and reference points. However, before I explain how I do so, I will start by considering the
estimation of the technology of skill formation.

3 Estimation of the Technology of Skill Formation

One of the goals of the paper is to compare the objective estimates of the technology of skill formation
(3) with the maternal subjective beliefs, described by (4) and (5). As described by Cunha, Heckman, and
Schennach (2010), there are three problems that arise in the estimation of (3). First, it is important to
recognize that skills and investments are measured with error. I follow Cunha, Heckman, and Schennach
(2010) and adopt a factor-model approach to address this problem.
Second, measures of skills and investments have no metric. This is particularly problematic for the
context of this paper, because unless one fixes the scale of investments and skills, it is possible to exploit
their lack of cardinality to find any result that is desired. In order to solve this problem, I follow the
psychometric literature and transform the scores into a metric of mental development. To set the scale
and location of investments, I use the information from the PSID time-use of children.
Third, investments are endogenous and may be correlated with the maternal fixed effect. I solve this
problem by looking at variation of investments across children within the same household.
In the next subsections, I describe the data from the Children of the National Longitudinal Survey of
Youth 1979 (CNLSY/79), which I use in the estimation of the technology of skill formation (3).

3.1 Human Capital at Birth

The human capital of the child at birth is measured by the length of the gestation period, the weight and
body length of the baby at birth, and the number of days that the child stayed at the hospital after birth.
These measures of the child's human capital at birth have been used elsewhere (e.g., Cunha, Heckman, and
Schennach, 2010). For each child $i$, let $M_{i,j}$ denote the measure $j$ of child development, for $j = 1, \dots, 4$.
Without loss of generality, let $M_{i,1}$ denote the length of the gestation period, which is measured in months.
I model:
$$\begin{aligned} M_{i,1} &= h_{i,0} + \varepsilon^h_{i,1}, \\ M_{i,j} &= \beta^h_{0,j} + \beta^h_{1,j}\, h_{i,0} + \varepsilon^h_{i,j}, \quad j = 2, 3, 4. \end{aligned} \tag{6}$$
The factor model (6) implies that $h_{i,0}$ inherits the location and the scale from gestation length, which is
measured in months. This choice is convenient because it means that the metric of human capital at birth
is the same as that of human capital at age 24 months, which I describe next.

3 (continued) I replace this expression into the budget constraint (1) and I obtain one constraint that is written as a function of $c_i$ and $q_{i,1}$.
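A short worked derivation shows why anchoring the factor model (6) on gestation length fixes the location and scale of $h_{i,0}$. It is a sketch under assumptions that the text does not state explicitly: the measurement errors have mean zero, are uncorrelated with $h_{i,0}$, and are uncorrelated across measures. Here $\beta^h_{1,j}$ is used as a label for the loading of measure $j$ on $h_{i,0}$, with the loading of $M_{i,1}$ set to one and its intercept set to zero. Under these assumptions,

$$E[M_{i,1}] = E[h_{i,0}], \qquad \mathrm{Cov}(M_{i,1}, M_{i,k}) = \beta^h_{1,k}\, \mathrm{Var}(h_{i,0}), \qquad \mathrm{Cov}(M_{i,j}, M_{i,k}) = \beta^h_{1,j}\, \beta^h_{1,k}\, \mathrm{Var}(h_{i,0}),$$

so that, for $j \neq k$ in $\{2, 3, 4\}$ and $\mathrm{Cov}(M_{i,1}, M_{i,k}) \neq 0$,

$$\beta^h_{1,j} = \frac{\mathrm{Cov}(M_{i,j}, M_{i,k})}{\mathrm{Cov}(M_{i,1}, M_{i,k})}.$$

The mean of $h_{i,0}$ is therefore expressed in months of gestation, and each remaining loading is measured relative to that unit.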

3.2 Human Capital around Age 24 Months

To measure cognitive skills of the child around age 24 months, I use the Motor and Social Development
Scale (MSD). This scale was developed by the National Center for Health Statistics to measure dimensions
of motor, social, and cognitive development of young children from birth through the age of 47 months.
The items were derived from standard measures of child development (Bayley Scales of Infant Development, the Gesell Scale, Denver Developmental Screening Test), which have high reliability and validity as
shown by Poe (1986). The scale has been used in the National Health Interview Survey (a large national
health survey that included 2714 children up to age 4) and in the third National Health and Nutrition
Examination Survey (NHANES, 1988-1994). The scale has been used successfully by the CNLSY/79 since
1986. These items have been used with minority children with no apparent difficulty.
Based on the child's age, NLSY79 mothers answer fifteen age-appropriate items out of 48 motor and social
development items. These items are divided into eight components (parts A through H) that a mother completes contingent on the child's age. Part A is appropriate for infants during the first four months of life
(i.e., zero through three months) and the most advanced section, Part H, is addressed to children between
the ages of 22 and 47 months. All of the items are dichotomous (scored either zero or one) and the total
raw score for children of a particular age is obtained by a simple summation (with a range 0 to 15) of the
affirmative responses in the age-appropriate section. Associated with each raw score is a series of norms:
(1) an overall percentile and standard score and (2) same-gender by age percentile and normed scores.
That is, boys were scored using the male national norms and girls were assigned female national norms,
and both genders received combined gender norm scores. All these normed scores were constructed by
CHRR using data from the nationally representative sample in the 1981 Child Supplement to the National
Health Interview Survey (National Center for Health Statistics 1984).
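The raw-score rule described above is simple enough to write down directly. The function below is a minimal sketch with a hypothetical name. It covers only the summation of the fifteen dichotomous items in the age-appropriate section and does not reproduce the conversion to percentile or standard scores, which relies on the norming tables.

```python
def msd_raw_score(responses):
    """Raw MSD score: the number of affirmative answers among the 15 age-appropriate items."""
    if len(responses) != 15:
        raise ValueError("expected responses to the 15 age-appropriate items")
    if any(r not in (0, 1) for r in responses):
        raise ValueError("items are dichotomous and must be scored 0 or 1")
    return sum(responses)

# Hypothetical response pattern: ten affirmative answers out of fifteen.
print(msd_raw_score([1] * 10 + [0] * 5))
```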

3.2.1 Item Response Theory Analysis of the Motor and Social Development Instrument

Several problems arise in the use of the MSD scale as a measure of early human capital. First, Cunha,
Heckman, and Schennach (2010) find evidence that the MSD is contaminated with measurement error. One
reason that may explain why the MSD has measurement error is that it has relatively few items. For
example, the Bayley Scale of Infant Development has a total of 64 items: 32 items measure mental development and another 32 items measure motor development. Another reason is that the MSD is based on
maternal reports, while the Bayley Scale of Infant Development involves direct observation of the child by
a trained expert in child development.
To mitigate the problem of measurement error, I conduct Item Response Theory (IRT) analysis and
treat the mother's responses as repeated binary indicators of her child's ability. To understand how IRT
can help mitigate measurement error, note that the child's raw score in the MSD is simply the number of
"yes" answers that the mother reports for the child. If all items were of identical difficulty, then this simple
unweighted sum is the best that one could achieve to estimate a child's ability. To the extent that the items differ
in their difficulty level, a weighted average can provide a more precise estimate of the child's ability
by assigning higher weight to more difficult items. IRT analysis provides the way to obtain these weights
and makes it possible to estimate the child's ability as precisely as possible.
Another advantage of conducting IRT analysis is that it provides a way to transform the MSD scores
into a metric in "age equivalents". This type of information will be important in comparing the objective estimates of the technology of skill formation (3) with subjective mean beliefs (4) and in quantifying
findings with respect to reference points.
Let $d^{*}_{i,j}$ denote the latent variable that is determined according to:

$$d^{*}_{i,j} = \beta_j + \alpha_j q_i + \varepsilon_{i,j}. \tag{7}$$

The variable $q_i$ is child $i$'s latent ability and it is independent of the error term $\varepsilon_{i,j}$, which is i.i.d. across children $i$ and items $j$. The variable $d^{*}_{i,j}$ is not observed. Instead, I observe $d_{i,j} = 1$ if, and only if, $d^{*}_{i,j} \geq 0$, and $d_{i,j} = 0$ otherwise. Thus:

$$\Pr(d_{i,j} = 1 \mid q_i) = 1 - \Pr(\varepsilon_{i,j} \leq -\beta_j - \alpha_j q_i \mid q_i) = 1 - F(-\beta_j - \alpha_j q_i), \tag{8}$$

where $F$ is the distribution of $\varepsilon_{i,j}$. The parameter $\beta_j$ represents the location of item $j$. In the case of cognitive testing, the lower the value of $\beta_j$, the higher the difficulty of the item. The parameter $\alpha_j$ represents the discrimination of the item: that is, the degree to which the item discriminates between persons in different regions of the latent continuum. When $\alpha_j$ is high, persons with low ability have a much smaller chance of responding correctly than persons of higher ability.
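As an illustration, the response probability in (8) can be sketched in a few lines of Python. The sketch assumes that F is the standard normal distribution (the assumption the empirical application adopts) and uses the hypothetical names `location` and `discrimination` for the two item parameters. The item values are made up for the example and are not estimates from the paper.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # F in equation (8), assumed standard normal here


def prob_yes(q, location, discrimination):
    """Pr(d = 1 | q) = 1 - F(-location - discrimination * q)."""
    return 1.0 - STD_NORMAL.cdf(-location - discrimination * q)


# Two hypothetical items with the same location but different discrimination.
flat_item = {"location": 0.0, "discrimination": 0.5}
sharp_item = {"location": 0.0, "discrimination": 2.0}

for q in (-1.0, 0.0, 1.0):
    # The sharper item separates low-ability and high-ability children more strongly.
    print(q, round(prob_yes(q, **flat_item), 3), round(prob_yes(q, **sharp_item), 3))
```

Lowering `location` while holding everything else fixed lowers the probability of a "yes" at every ability level, which is the sense in which a lower location means a more difficult item.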
Estimation of the parameters $\beta_j$ and $\alpha_j$ for $j = 1, ..., J$ is done via maximum likelihood. Assume that $q_i$ is i.i.d. across children and let $f_q$ denote its density function. Then, the likelihood function can be written as:

$$L(\beta, \alpha \mid d) = \prod_{i=1}^{N} \int \prod_{j=1}^{J} F(-\beta_j - \alpha_j q_i)^{1-d_{i,j}} \left[1 - F(-\beta_j - \alpha_j q_i)\right]^{d_{i,j}} f_q(q_i)\, dq_i \tag{9}$$

In principle, it would be possible to use the CNLSY/79 data to estimate the parameters $\beta$, $\alpha$, and the density function $f_q$ in (9). However, although the CNLSY/79 sample is representative of the children of the women born between 1957 and 1964, it is not a representative sample of U.S. children in general. For this reason, I use the data from the National Health and Nutrition Examination Survey 1988-1994 (NHANES). In my empirical application, I assume that $\varepsilon_{i,j} \sim N(0,1)$ and $f_q(q) = \sum_{k=1}^{K} \pi_k\, \phi(q; \mu_k, \sigma_k^2)$, where $\phi(q; \mu_k, \sigma_k^2)$ is the density of a normal random variable with mean $\mu_k$ and variance $\sigma_k^2$. The term $\pi_k$ is the weight of element $k$ and satisfies $\sum_{k=1}^{K} \pi_k = 1$. To fix the location of $q$, I impose the restriction $\sum_{k=1}^{K} \pi_k \mu_k = 0$. To fix the scale of $q$, I set $\alpha_{36} = 1$. Table 1 presents the estimated parameters for the items of the MSD scale. Note that there is substantial heterogeneity across items in terms of difficulty ($\beta_j$) and discrimination power ($\alpha_j$).
I next exploit this information to obtain an estimate of a child's ability that is more precise than the one obtained from the simple summation of the $d_{i,j}$ reports. Let the vector $d_i = (d_{i,j_a}, ..., d_{i,j_b})$ denote the maternal reports for child $i$ for items $j = j_a, ..., j_b$, where $j_a < j_b$. An estimate of the child's ability is the posterior mean $E[q_i \mid d_i]$. This moment can be estimated with the following algorithm. For each child $i$ and element $k$, let $\pi_{i,k,0} = \pi_k$, $\mu_{i,k,0} = \mu_k$, and $\sigma^2_{i,k,0} = \sigma^2_k$ for $k = 1, ..., K$. Next, define the parameters that determine the conditional distribution of $q_i$ within element $k$: $\mu_{i,k,j} = E[q_i \mid d_{i,j_a}, ..., d_{i,j}, k]$, $\sigma^2_{i,k,j} = \mathrm{Var}[q_i \mid d_{i,j_a}, ..., d_{i,j}, k]$, and $\pi_{i,k,j} = \Pr[\text{element } k \mid d_{i,j_a}, ..., d_{i,j}]$. Let $\xi_i \sim N(0,1)$ and note that, conditional on element $k$ of the mixture, $q_i = \mu_{i,k,0} + \sigma_{i,k,0}\, \xi_i$. I can then use well-known results from the truncated normal distribution to derive the following recursive rule for the mean:

$$
\mu_{i,k,j+1} =
\begin{cases}
\mu_{i,k,j} - \dfrac{\alpha_{j+1}\,\sigma^2_{i,k,j}}{\sqrt{1+\alpha_{j+1}^2\sigma^2_{i,k,j}}}\;\dfrac{\phi(c_{i,k,j})}{\Phi(c_{i,k,j})} & \text{if } d_{i,j+1} = 0, \\[2ex]
\mu_{i,k,j} + \dfrac{\alpha_{j+1}\,\sigma^2_{i,k,j}}{\sqrt{1+\alpha_{j+1}^2\sigma^2_{i,k,j}}}\;\dfrac{\phi(c_{i,k,j})}{1-\Phi(c_{i,k,j})} & \text{if } d_{i,j+1} = 1,
\end{cases}
$$

where $\phi$ and $\Phi$ denote the standard normal density and distribution functions and

$$
c_{i,k,j} = \frac{-\beta_{j+1} - \alpha_{j+1}\,\mu_{i,k,j}}{\sqrt{1+\alpha_{j+1}^2\sigma^2_{i,k,j}}}.
$$

Note that the formula above uses the variance $\sigma^2_{i,k,j}$. I can obtain this information by applying the following recursive rule for the update of the variance:

$$
\sigma^2_{i,k,j+1} =
\begin{cases}
\sigma^2_{i,k,j} - \dfrac{\alpha_{j+1}^2\,\sigma^4_{i,k,j}}{1+\alpha_{j+1}^2\sigma^2_{i,k,j}}\;\dfrac{\phi(c_{i,k,j})}{\Phi(c_{i,k,j})}\left[\dfrac{\phi(c_{i,k,j})}{\Phi(c_{i,k,j})} + c_{i,k,j}\right] & \text{if } d_{i,j+1} = 0, \\[2ex]
\sigma^2_{i,k,j} - \dfrac{\alpha_{j+1}^2\,\sigma^4_{i,k,j}}{1+\alpha_{j+1}^2\sigma^2_{i,k,j}}\;\dfrac{\phi(c_{i,k,j})}{1-\Phi(c_{i,k,j})}\left[\dfrac{\phi(c_{i,k,j})}{1-\Phi(c_{i,k,j})} - c_{i,k,j}\right] & \text{if } d_{i,j+1} = 1.
\end{cases}
$$

Finally, the formulas above are conditional on an element $k$ of the mixture. To be able to obtain the unconditional mean, it is necessary to know the updated weight of element $k$. Again, this can be obtained from the following recursive rule for the update of the weight of element $k$:

$$
\pi_{i,k,j+1} =
\begin{cases}
\dfrac{\pi_{i,k,j}\,\Phi(c_{i,k,j})}{\sum_{l=1}^{K}\pi_{i,l,j}\,\Phi(c_{i,l,j})} & \text{if } d_{i,j+1} = 0, \\[2ex]
\dfrac{\pi_{i,k,j}\left[1-\Phi(c_{i,k,j})\right]}{\sum_{l=1}^{K}\pi_{i,l,j}\left[1-\Phi(c_{i,l,j})\right]} & \text{if } d_{i,j+1} = 1.
\end{cases}
$$

Given this information, I can construct the following estimate of the child's ability:

$$\hat{q}_i = \sum_{k=1}^{K} \pi_{i,k,j_b}\, \mu_{i,k,j_b}.$$
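The recursion described above can be sketched in Python under stated assumptions: F is standard normal, each item is a hypothetical `(location, discrimination)` pair, and the distribution within each mixture element is treated as normal after every update so that the truncated-normal moments can be applied item by item. After the first item the within-element posterior is no longer exactly normal, so from the second item onward the sketch matches the first two moments only. All function names and inputs are illustrative and are not taken from the paper.

```python
from math import sqrt
from statistics import NormalDist

N01 = NormalDist()  # standard normal, assumed distribution of the item error


def update_element(mu, var, location, discrimination, d):
    """Update (mean, variance) of ability within one mixture element after answer d.

    Returns (new_mean, new_variance, probability_of_the_observed_answer).
    Extreme inputs can drive the normal cdf to 0 or 1 and are not guarded here.
    """
    s = sqrt(1.0 + discrimination ** 2 * var)
    c = (-location - discrimination * mu) / s
    pdf, cdf = N01.pdf(c), N01.cdf(c)
    shrink = discrimination ** 2 * var ** 2 / s ** 2
    if d == 0:
        lam = pdf / cdf
        return (mu - discrimination * var / s * lam,
                var - shrink * lam * (lam + c),
                cdf)
    lam = pdf / (1.0 - cdf)
    return (mu + discrimination * var / s * lam,
            var - shrink * lam * (lam - c),
            1.0 - cdf)


def posterior_mean(answers, items, weights, means, variances):
    """Posterior mean of ability given 0/1 answers and a normal-mixture prior."""
    w, m, v = list(weights), list(means), list(variances)
    for d, (location, discrimination) in zip(answers, items):
        updated = [update_element(m[k], v[k], location, discrimination, d)
                   for k in range(len(w))]
        total = sum(w[k] * updated[k][2] for k in range(len(w)))
        w = [w[k] * updated[k][2] / total for k in range(len(w))]  # weight update
        m = [u[0] for u in updated]
        v = [u[1] for u in updated]
    return sum(wk * mk for wk, mk in zip(w, m))


# Hypothetical child: two "yes" answers, two-element prior centered at zero.
estimate = posterior_mean([1, 1], [(0.0, 1.0), (0.5, 2.0)],
                          [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
```

With a single normal element and a single item, the update is exact and can be checked against the closed-form moments of a normal variable truncated by a probit response.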

Clearly, $\hat{q}_i$ addresses measurement error. It does so by optimally weighting each item of the MSD instrument according to its difficulty and discrimination power. However, the IRT analysis per se does not produce a natural metric of ability. To obtain one, I exploit the fact that children were of different ages (in months) at the time that their skills were assessed by NHANES. In fact, the age range of children was from 2 months through 47 months. At each age (in months), I estimate the median of the scores $\hat{q}_i$, which I denote by $\bar{q}_a$. That is, $\bar{q}_a$ is the median of $\hat{q}_i$ among children who are $a$ months old at the time of the interview. I plot the relationship between $a$ and $\bar{q}_a$ with dots in Figure 3. Interestingly, the curve in Figure 3 that fits the points so closely is generated by the following function:

$$\bar{q}_a = \delta_0 + \delta_1 \ln a,$$

where $\delta_0$ and $\delta_1$ are estimated via OLS. Clearly, there is a close relationship between median scores, $\bar{q}_a$, and the natural logarithm of age (in months), $\ln a$.
To transform the estimate of ability that has no metric, $\hat{q}_i$, into one that has a metric, I simply invert this relationship:

$$\ln q_i = \frac{\hat{q}_i - \delta_0}{\delta_1}. \tag{10}$$

The interpretation of $\ln q_i$ is straightforward: a child with skill level $\ln q_i$ has the mental development of the median child who is $q_i$ months old.
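A minimal sketch of this age-equivalent transformation, assuming the median scores by age are already available: fit a line of median score on log age by OLS, then invert it as in (10). The ages and scores below are hypothetical and lie exactly on a log-age line by construction, and the names `d0` and `d1` are placeholders for the OLS intercept and slope.

```python
from math import exp, log


def fit_line(xs, ys):
    """Simple OLS of y on x with an intercept; returns (intercept, slope)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar, slope


# Hypothetical median IRT scores by age in months (illustration only).
ages = [2, 6, 12, 24, 36, 47]
median_scores = [-2.0 + 0.9 * log(a) for a in ages]

d0, d1 = fit_line([log(a) for a in ages], median_scores)


def age_equivalent_months(q_hat):
    """Invert q = d0 + d1 * ln(a): the age in months of the median child with score q_hat."""
    return exp((q_hat - d0) / d1)


def log_age_equivalent(q_hat):
    """The quantity on the left-hand side of (10)."""
    return (q_hat - d0) / d1
```

For instance, a child whose score equals the median score of 18-month-olds is assigned an age equivalent of 18 months, whatever the child's actual age.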

3.3 Investments

I measure parental investments that describe the quality of the child's home environment by items in the CNLSY/79 HOME Short Form (HOME-SF). They are a subset of the measures used to construct the HOME scale, which is designed to assess the emotional support and cognitive stimulation children receive through their home environments, planned events, and family surroundings. I include measures of the following parental investments: how often the child leaves the house, the number of books the child has, how often the mother reads to the child, the number of soft/role play toys, the number of push/pull toys, how often the child eats with his/her mother/father, and how often the mother talks to her child while she is doing other forms of household work (e.g., cooking dinner). I supplement these items with information about the duration of breastfeeding and how frequently the mother took the child for well-baby visits. I factor analyze these measures and extract the first factor as the measure of parental investments that I use in the estimation of the technology of skill formation.
As discussed above, the investment factor produced by this approach does not have a location or a scale. Because skills are measured in time, I change the location and the scale of the investment factor to the same metric as skills (i.e., time). The information about the time investment in the human capital of children comes from the Child Supplement of the Panel Study of Income Dynamics (PSID-CS), which contains a time-use survey of children aged 0 to 12 in 1997. As measures of investments, I consider any time that the child spends with the mother in which she is primarily engaged with the child. A few examples of such activities (by no means an exhaustive list) are (a) taking the child to museums, theater, movies, or other events; (b) reading or talking to the child; (c) playing with the child indoors or outdoors; (d) feeding or having meals with the child.
More formally, let $T_{i,1}$ and $T_{i,2}$ denote the total amount of time (hours per day) that the mother invests in the child during a weekday and a weekend day, respectively. Let

$$T_i = 260\, T_{i,1} + 105\, T_{i,2}$$

denote the total number of hours per year that the mother invests in the child. If we assume that a month has around 720 hours, then $t_i = T_i / 720$ is the amount of investment, in one year, measured in months. Let $\mu_t$ and $\sigma_t^2$ denote the mean and variance of $t_i$, respectively.
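As a worked example of this conversion, the short Python sketch below turns daily hours into annual hours and then into months. The mother in the example (three hours on a weekday, four hours on a weekend day) is hypothetical.

```python
WEEKDAYS_PER_YEAR = 260
WEEKEND_DAYS_PER_YEAR = 105
HOURS_PER_MONTH = 720  # 30 days times 24 hours


def annual_hours(weekday_hours, weekend_hours):
    """Total hours per year: 260 weekdays plus 105 weekend days."""
    return WEEKDAYS_PER_YEAR * weekday_hours + WEEKEND_DAYS_PER_YEAR * weekend_hours


def investment_in_months(weekday_hours, weekend_hours):
    """Annual investment expressed in months of time."""
    return annual_hours(weekday_hours, weekend_hours) / HOURS_PER_MONTH


hours = annual_hours(3, 4)            # 260 * 3 + 105 * 4 = 1200 hours per year
months = investment_in_months(3, 4)   # 1200 / 720, about 1.67 months per year
```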
Let $L_{i,j}$ denote measure $j$ from the components of the HOME score for child $i$. I assume the following relationship between $L_{i,j}$ and parental investment:

$$L_{i,j} = \lambda^{x}_{0,j} + \lambda^{x}_{1,j}\, x_i + \varepsilon^{x}_{i,j}.$$

The investment factor $x_i$ follows a mixture of normals with mean $\mu_t$ and variance $\sigma_t^2$. Note that because the mean and the variance of the factor $x_i$ are already set, it is not necessary to normalize any of the factor loadings $\lambda^{x}_{1,j}$.

3.4 Estimation

To estimate the parameters of the production function, I exploit the fact that the CNLSY/79 tracks multiple children from the same mother. By differencing out the maternal fixed effect, I can estimate the parameters of the technology of skill formation by examining how differences in investments and initial conditions affect the stock of human capital around age 24 months.
Table 2 shows the estimation results for samples that differ in the ranges of ages of the children. In all the regressions I add controls for the age of the mother at the birth of the child as well as the child's gender and birth order. These variables have little effect on the estimates of the technology parameters. An important source of variation in human capital around age 24 months arises simply because children's human capital is not measured exactly in the month in which children are 24 months old. There are several ways of addressing this problem. First, it is possible to add dummies for the age (in months) of the child at the date of the assessment. Second, given the relationship between the median MSD score and the natural log of the age of the child, it is possible to produce a simpler model by controlling for the natural log of the age of the child. Third, it is possible to focus on children whose assessments are closer and closer to the target age. For this reason, Table 2 displays estimates of the technology of skill formation (3) for these three different approaches. Note that different approaches produce different estimates of $\gamma$. More specifically, when I consider all children between ages 2 and 36 months, I find that $\gamma = 0.102$ (s.e. 0.0201) if I add dummies for age at assessment date and $\gamma = 0.121$ (s.e. 0.0108) if I add the natural logarithm of age at assessment date. The small difference in the results suggests that the natural logarithm of age at assessment date is an accurate way of addressing this heterogeneity in age. In fact, Figure 4 plots the coefficients on age dummies against a linear function of log age, and it is clear that the two sets of controls are very close. For this reason, I adopt the more parsimonious specification in the remaining regressions.
As discussed above, it is possible to address this problem by considering children whose ages at assessment date are close to the target age. Obviously, the cost of doing so is that more and more children are dropped from the analysis as we focus on narrower and narrower bands around the target age. Columns 3 through 7 of Table 2 show that the estimate of $\gamma$ tends to become larger the more the sample is constrained to be around the target age of 24 months. In the most constrained sample, I find that $\gamma = 0.159$ (s.e. 0.054).

4 Data on Reference Points and Subjective Beliefs about the Technology of Skill Formation

4.1 Elicitation of Beliefs about the Technology of Skill Formation

A mother's decisions regarding how much to invest in her child depend not on the actual technology in place but rather on the beliefs that she holds about the technology, in particular her beliefs about the parameter $\gamma$. The current literature in economics assumes that mothers know these parameters and investigates how other constraints (e.g., credit constraints) affect the choice of investments. The approach that I develop in this paper shows that the policy tools that are analyzed with current models of investments are only a subset of the possible policy tools that arise if informational issues about child development and the technology of skill formation are quantitatively important.
One of the reasons why the current literature in economics maintains the assumption that the technology of skill formation is known by the agents is that it is not possible to separately identify preferences from beliefs without strong assumptions about one or the other (Manski, 2004). Since the early 1990s, economists have started to collect data that elicit beliefs about different objects of interest. In a series of papers, Dominitz and Manski investigate students' subjective expectations about the returns to schooling, an individual's perception of job insecurity, the subjective expectations about future income, and expected future social security benefits.4 Such data have been successfully incorporated into choice models to distinguish between preferences and beliefs. For example, Lochner (2007) tackles the issue of individuals' perception of the criminal justice system, and Delavande (2008) studies how beliefs about contraception methods help explain choices made by sexually active women. Van der Klaauw and Wolpin (2008) use the HRS data on survival expectations and retirement expectations to help estimate a stochastic dynamic model of retirement behavior. Zafar (2008) shows that gaps across genders in major choice are due to differences in preferences as well as in beliefs about coursework, and not to differences in expectations about academic ability. In the literature on human capital accumulation, Attanasio and Kauffman (2009) show that the "subjective" distribution of returns to schooling affects the decision to invest in schooling in the context of a developing country. More specifically, these authors show that not only the expected return but also the return risk affects the decision to attend high school. Kauffman (2012) explores the heterogeneity in returns between individuals from poor and rich backgrounds and finds that the former require higher expected returns to be induced to attend college than the latter. Interestingly, poor individuals with high expected returns are particularly responsive to changes in direct costs, which is consistent with them being credit constrained.

4 Dominitz and Manski (1996) quantify students' expectations about returns to education. Dominitz and Manski (1997a) study the subjective expectations about future income, while Dominitz and Manski (1997b) focus on an individual's perception of one's own job insecurity. Dominitz, Manski, and Heinz (2003) investigate expectations about future benefits from social security.
A remarkable feature of the data collected in these studies is the fact that there exists a tight connection between the belief data that are elicited and the economic model that is formulated to study the topic of interest. In this paper, I would obviously like to collect data on $\gamma$ directly, but this is impossible because mothers do not think about child development in such an abstract framework. I argue, however, that women do have implicit knowledge about the impacts of their actions, and the research design I describe next allows the analyst to translate this implicit knowledge into beliefs regarding the parameters of the technology.
The first step is to create hypothetical scenarios of the "initial condition" of the child as well as parental investments. The survey instrument describes to the expectant mother four different scenarios of investments and the health condition of the baby at birth. In the first scenario, the child is healthy at birth ($\bar{q}_0$) and the mother chooses a high level of investment ($\bar{x}$). In the second scenario, the baby is also healthy, but the mother chooses a low level of investment ($\underline{x}$). In the third and fourth scenarios, the baby is unhealthy at birth ($\underline{q}_0$); investment is high in the third scenario and low in the fourth. It is important to emphasize that these two inputs in the technology of skill formation are invariant across subjects in the survey. As I make clearer below, the variability in the beliefs about $\gamma$ arises because of the heterogeneity in the reports of child development.
The respondents watch a video that explains in detail the differences between a healthy and an unhealthy baby. The healthy baby is the one whose gestation lasts 39 weeks, weighs seven pounds at birth, and goes home after at most three days at the hospital. The unhealthy baby is born after 35 weeks of gestation, weighs only five pounds, and stays at the hospital for seven days after birth.
The video also shows examples of activities that mothers do with the child. With the exception of breastfeeding, all of the activities are part of the HOME instrument: (a) soothing the baby when he/she is upset; (b) moving the baby's arms and legs around playfully; (c) talking to the baby; (d) playing peek-a-boo with the baby; (e) singing songs with the baby; (f) telling stories to the baby; (g) reading books to the baby; (h) taking the baby outside to play in the yard, park, or playground. The activities are the same for the high and low levels of investment. The difference is in the amount of time: in the high level, mothers spend four hours a day in these types of activities, while in the low level they spend only three hours a day. These figures correspond, respectively, to roughly the sixth and fourth decile of investments when measured in the metric of time.5
The second step is to create an instrument that can be used to elicit maternal subjective expectations of child development for each one of the four scenarios described above. The approach I take in this paper is to adapt the survey instrument used for the MSD score in the CNLSY/79. The main advantage of adapting the instrument is that it maintains comparability: the items used to elicit expected child development are the same ones that measure actual child development and that were used for the estimation of the technology of skill formation (3). As I now explain, although the questions are similar, they differ in one important detail. In the MSD instrument, a mother provides yes/no answers to questions about child development. For example, a mother of a child who is 24 months old is asked the question: "Does your child speak a sentence of three words or more?" In the instrument used to elicit maternal beliefs about the technology of skill formation, the question is: "What do you think is the youngest age and the oldest age a child learns to speak a partial sentence of three words or more?" The respondent uses a sliding scale to indicate the age range in which she believes a child will develop these skills.

5 Culhane, Cunha, and Elo (2012) investigate how sensitive the elicited beliefs are with respect to the parameters that describe a healthy or unhealthy child and the number of hours that determine high or low investments. They find evidence that the elicited beliefs are robust to variations in parameter values.
This is done for each one of the four scenarios of initial condition and parental investment, $(q_0, x) \in \{(\bar{q}_0, \bar{x}), (\bar{q}_0, \underline{x}), (\underline{q}_0, \bar{x}), (\underline{q}_0, \underline{x})\}$, as shown in Figure ??.
I now explain how to construct $E[\ln(q_{i,1}) \mid q_0, x]$ as well as $\mathrm{Var}[\ln(q_{i,1}) \mid q_0, x]$ from the answers provided by the respondents. I assume that the probability that the child will develop a certain skill is uniform within the age range provided by the respondent. For example, assume that the age range provided by the respondent to the question about speaking partial sentences of three words or more is 16 to 28 months for the first scenario, $(\bar{q}_0, \bar{x})$, and 18 to 30 months for the fourth scenario, $(\underline{q}_0, \underline{x})$. Under the uniformity assumption, these figures imply that, for the first scenario, the probability that the child will have learned the skill before age 16 months is zero, the probability after age 28 months is one, and the probability at age 24 months, the object of interest, is $(24 - 16)/(28 - 16) \approx 0.67$. The same calculation for the fourth scenario implies that the probability that the child will know how to speak a partial sentence of three words or more at age 24 months is 0.5.
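The mapping from a reported age range to a probability can be sketched as follows. The sketch assumes that the uniformity assumption means the probability of having acquired the skill rises linearly from zero at the youngest reported age to one at the oldest. The function name is hypothetical. With the ranges used in the example, the linear rule gives (24 - 16)/(28 - 16) = 2/3 for the range 16 to 28 months and 0.5 for the range 18 to 30 months.

```python
def prob_by_age(age, youngest, oldest):
    """Probability that the skill has been acquired by `age`, uniform on [youngest, oldest]."""
    if oldest <= youngest:
        raise ValueError("oldest age must exceed youngest age")
    share = (age - youngest) / (oldest - youngest)
    return min(1.0, max(0.0, share))  # zero below the range, one above it


p_first = prob_by_age(24, 16, 28)    # range reported for the first scenario in the example
p_fourth = prob_by_age(24, 18, 30)   # range reported for the fourth scenario in the example
```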
In summary, the maternal answers on the age range, when combined with the uniformity assumption, produce the subjective probability that a child will be able to do a certain task of the MSD at age 24 months. This is done for each scenario of initial condition and maternal investments. Let $p_{i,j,k}$ denote this probability for scenario $k$. According to the choice probability equation of the IRT model (8), there is a monotonic relationship between the probability $p_{i,j,k}$ and latent skill $q_{i,j,k}$:

$$p_{i,j,k} = 1 - F(-\beta_j - \alpha_j q_{i,j,k}).$$

The parameters $\beta_j$ and $\alpha_j$ are already estimated using the NHANES dataset. I can then invert this expression to obtain one estimate of subjective expected child development, $q_{i,j,k}$:

$$q_{i,j,k} = -\frac{F^{-1}(1 - p_{i,j,k}) + \beta_j}{\alpha_j}.$$
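A short Python sketch of this inversion, assuming F is standard normal and using the hypothetical names `location` and `discrimination` for the two item parameters. The inverse does not exist when the probability is exactly zero or one (for example, when 24 months falls outside the reported age range). How such answers are treated is not specified here, so the sketch simply raises an error.

```python
from statistics import NormalDist

N01 = NormalDist()  # F, assumed standard normal


def implied_skill(p, location, discrimination):
    """Solve p = 1 - F(-location - discrimination * q) for the latent skill q."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return -(N01.inv_cdf(1.0 - p) + location) / discrimination


# Round trip with hypothetical item parameters: skill -> probability -> skill.
q_true, loc, disc = 0.7, -0.3, 1.5
p = 1.0 - N01.cdf(-loc - disc * q_true)
q_back = implied_skill(p, loc, disc)
```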

Note that it is possible to estimate subjective expected child development with only one item from the MSD scale. There are two problems with $q_{i,j,k}$. First, $q_{i,j,k}$ has no metric, but I can apply the transformation (10) and work with the measure of child development that has the natural log of mental age as metric, $\ln q_{i,j,k}$. Second, it is natural to expect that there is measurement error in $\ln q_{i,j,k}$. To tackle this problem, I average across the items of the MSD. As a result, I define:

$$E[\ln(q_{i,k}) \mid \Psi_i, q_0, x] = \frac{1}{J} \sum_{j=1}^{J} \ln q_{i,j,k}$$

as the estimated subjective expected child development from respondent $i$ and scenario $k$.
Knowledge of this quantity allows us to estimate the expected beliefs about the technology of skill formation. Assume that $E[\nu_{i,1} \mid \Psi_i, q_0, x] = 0$. This assumption implies that the hypothetical scenarios that I provide to the respondent carry no information about the future shocks that will be realized after the child is born. Under this assumption, note that the following relationship holds for scenarios 1 and 2:

$$E[\ln q_{i,1} \mid \Psi_i, \bar{q}_0, \bar{x}] = \ln A + \rho \ln(\bar{q}_0) + \mu_{\gamma,i} \ln(\bar{x}) + \eta_i,$$
$$E[\ln q_{i,1} \mid \Psi_i, \bar{q}_0, \underline{x}] = \ln A + \rho \ln(\bar{q}_0) + \mu_{\gamma,i} \ln(\underline{x}) + \eta_i.$$

Clearly, I can estimate $\mu_{\gamma,i}$ for every mother $i$ from:

$$\mu_{\gamma,i} = \frac{E[\ln q_{i,1} \mid \Psi_i, \bar{q}_0, \bar{x}] - E[\ln q_{i,1} \mid \Psi_i, \bar{q}_0, \underline{x}]}{\ln(\bar{x}) - \ln(\underline{x})}. \tag{11}$$

Expression (11) states that the mean belief $\mu_{\gamma,i}$ is the elasticity of child development with respect to parental investment. Note that, so far, I have used only two of the four scenarios. It is also possible to construct a second estimate of beliefs from scenarios three and four:

$$\mu^{0}_{\gamma,i} = \frac{E[\ln q_{i,1} \mid \Psi_i, \underline{q}_0, \bar{x}] - E[\ln q_{i,1} \mid \Psi_i, \underline{q}_0, \underline{x}]}{\ln(\bar{x}) - \ln(\underline{x})}. \tag{12}$$
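The calculation in (11) and (12) is a ratio of differences and can be sketched directly. The inputs below are hypothetical: a mother whose answers imply an expected mental age of 24 months under high investment and 23 months under low investment, with investment levels of four and three hours a day as in the scenarios. The function name is illustrative.

```python
from math import log


def mean_belief(e_high, e_low, x_high, x_low):
    """Difference in expected log development divided by the difference in log investment."""
    return (e_high - e_low) / (log(x_high) - log(x_low))


# Hypothetical respondent, healthy-baby scenarios (illustration only).
belief_healthy = mean_belief(log(24.0), log(23.0), 4.0, 3.0)
```

Because only the difference in log investment enters, the result does not depend on whether investment is measured in hours per day or in months per year, as long as both scenarios use the same unit.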

Next, I describe how one can estimate the maternal variance beliefs, $\sigma^2_{\nu,i}$ and $\sigma^2_{\gamma,i}$. The idea is similar, except that instead of recovering expected child development, I estimate the variance of child development:

$$\mathrm{Var}[\ln(q_i) \mid \Psi_i, q_0, x] = \frac{1}{J-1} \sum_{j=1}^{J} \left\{ \ln q_{i,j,k} - E[\ln(q_i) \mid \Psi_i, q_0, x] \right\}^2.$$

To recover the maternal beliefs, note that the first and second scenarios produce a system of two equations and two unknowns:

$$\mathrm{Var}[\ln(q_i) \mid \Psi_i, \bar{q}_0, \bar{x}] = \sigma^2_{\nu,i} + \sigma^2_{\gamma,i} [\ln(\bar{x})]^2,$$
$$\mathrm{Var}[\ln(q_i) \mid \Psi_i, \bar{q}_0, \underline{x}] = \sigma^2_{\nu,i} + \sigma^2_{\gamma,i} [\ln(\underline{x})]^2.$$

Obviously, it is possible to extract the information on beliefs from the third and fourth scenarios as well.
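The two-equation system can be solved in closed form. The sketch below assumes that the two unknowns are an intercept variance and a variance that multiplies the squared log of investment. The names `var_shock` and `var_gamma` are labels chosen for the illustration, and the input values are hypothetical.

```python
from math import log


def variance_beliefs(var_high, var_low, x_high, x_low):
    """Solve var = var_shock + var_gamma * ln(x)**2 for the two scenarios."""
    a_high, a_low = log(x_high) ** 2, log(x_low) ** 2
    if a_high == a_low:
        raise ValueError("the two investment levels do not identify the system")
    var_gamma = (var_high - var_low) / (a_high - a_low)
    var_shock = var_high - var_gamma * a_high
    return var_shock, var_gamma


# Hypothetical variances built from known components, then recovered.
v_high = 0.02 + 0.05 * log(4.0) ** 2
v_low = 0.02 + 0.05 * log(3.0) ** 2
var_shock, var_gamma = variance_beliefs(v_high, v_low, 4.0, 3.0)
```

Two caveats apply to this sketch. Unlike the mean belief, the solution depends on the unit in which investment is measured, because the log of investment enters squared. In addition, nothing in the algebra guarantees that both solutions are positive when the inputs are noisy survey answers.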

4.2 Results

The Maternal Knowledge of Infant Development Study (MKIDS) uses the approach described above to obtain data on mean and variance beliefs of pregnant mothers. The study focuses on patients at an OB-GYN clinic affiliated with Drexel University's College of Medicine in Philadelphia, PA. The clinic serves primarily Medicaid-eligible patients. As such, the patients are primarily African-American (80%), have low educational attainment (85% have at most a high-school diploma and around 60% have dropped out of high school), and have low household income (median household monthly income is $1200). Around 72% of the sample are neither married nor have a partner.
Table 3 displays the mean and variance beliefs from this population. If I average the mean beliefs across healthy and unhealthy scenarios, I find that the median subject believes that γ is around 0.076. In contrast, the objective estimate of the technology of skill formation is around γ = 0.121, which is around 60% larger than the mean beliefs of the median respondent. Note that there is wide variability in mean beliefs: the respondent at the 25th percentile has mean beliefs close to zero, while the one at the 75th percentile has mean beliefs around 0.3. This large heterogeneity in mean beliefs could potentially explain the large variability in investments that I find in the CNLSY/79 data.
The mean beliefs about $\gamma$ are higher for the healthy scenario. However, even in that case, the median respondent has beliefs about $\gamma$ equal to 0.089, while the objective estimate of $\gamma$ is about 35% higher. More importantly, I find that the median beliefs are extremely low for the unhealthy scenario. According to the beliefs of the median subject, $\gamma$ is around 0.01 when the child is born unhealthy. This represents a major underestimation of the returns to investments in children. From a policy point of view, this result is important because the sample of women interviewed by MKIDS is more likely to have children born prematurely or with low birth weight than the population that is not Medicaid-eligible.
Table 3 also shows variance beliefs. Again, there is evidence of important variability in how precise pregnant women's beliefs are about the returns to investments in children. More importantly, note again that mothers have lower variance beliefs (that is, more precise beliefs) when the children are born unhealthy. Combined with the results for mean beliefs, this means that MKIDS subjects' beliefs about returns are lower and exhibit less variance when children are born unhealthy. In the context of a dynamic model in which parents are learning about the technology of skill formation, this combination of low mean beliefs and low variance beliefs leads to little investment for two reasons. When mean beliefs about $\gamma$ are low, maternal expectations about the returns are also low, and this leads mothers to invest little. In addition, when parental variance beliefs are low, there is little incentive to invest in children in order to learn and update beliefs about the parameter of the technology of skill formation. In other words, low mean beliefs and low variance beliefs reinforce each other and lead mothers to invest even less in their children.

4.3 Measuring Reference Points

In order to measure maternal reference points in a way that is consistent with the CNLSY/79 data, I adapt the MSD instrument in a very simple way. Instead of asking respondents yes/no questions (which is not possible for a sample of women who have not had their children yet), I ask them to report the age at which they think their child will be able to do a task that is part of the MSD scale. If the answer is 24 months or less, I replace their answer with a "yes" in the corresponding MSD instrument. Otherwise, I set the answer to "no". I can then use the IRT analysis to estimate the mean and variance of reference points by applying the recursive rules derived in the IRT analysis.
Table 4 displays the results for all the MSD tasks that I use in the MKIDS survey. Again, there is large variability in mean age across respondents, even across respondents who are otherwise very similar in race, education, income, and marital status. For example, the median woman believes that children will be able to wave good-bye around age nine months. In fact, my analysis of the NHANES dataset shows that the median age is actually six months, which is close to what the mother at the 25th percentile reports. A result of the overestimation of the age at which children can perform certain tasks is the underestimation of reference points. For example, the median mother has as reference point, for a child who is 24 months old, the mental development of a child who is around 18 months old. This implies a six-month developmental delay. As shown in the analytical discussion of the model, low reference points will lead parents to invest little in their children.

5 Conclusion

This paper develops a model of investments in the early human capital of children in which mothers face uncertainty about the technology of skill formation and have preferences that depend on reference points. As I show above, the model can generate important heterogeneity in investments in children in ways that have not been considered before in the literature. These informational failures may be overcome by policies that inform parents about the marginal returns to investments as well as appropriate targets for children.
To investigate the importance of uncertainty about the technology of skill formation, I proceed in two ways. First, I objectively estimate the parameters from the CNLSY/79 data. To do that, I account for measurement error in skills and investments, the lack of a metric for skills and investments, and the endogeneity of investments. Following Dahl and Lochner (2012), I am currently carrying out an estimation of the technology that uses changes in the Earned Income Tax Credit as a source of exogenous variation in investments. A future version of the paper will also contain the estimates that exploit this identification strategy.
Second, I adapt the CNLSY/79 instrument that is used to measure the human capital of young children so that it can be used to estimate maternal beliefs about the parameter of the technology of skill formation that drives the marginal returns to investments. I collect data on Medicaid-eligible women from Philadelphia and show that there is wide variability in mean and variance beliefs. Importantly, I find that the typical woman in the data set tends to have low mean beliefs about the parameter of the technology of skill formation. The mean beliefs are even lower for children who are born unhealthy. This finding is concerning because Medicaid-eligible women are more likely to bear children who are born unhealthy.

Figure 2
Transforming age range into probability
[Plot omitted. Horizontal axis: Child Age (in months), 12 to 48. Vertical axis: Probability. Legend entries: Logistic prediction, high; High investment.]

Figure 3
Expected development for two levels of investments (x)
[Plots omitted. Panels: Age range to probability (Speak partial sentence - MKIDS); Probability to expected development (Speak partial sentence - NHANES). Horizontal axis: Child Age (in months), 8 to 48. Vertical axis: Probability. Legend entries: Logistic; Age range; Data; Predicted.]

Figure 4
Comparing answers to different MSD items
[Plots omitted. Panels: Probability into expected development; Age range into probability. Horizontal axis: Child Age (in months), 8 to 48. Series: Speak partial sentence; Know own age and sex.]

Figure
Intervention on income
[Plots omitted. Panel titles: Baseline Investments; Elasticity; Reference. Plotted: kernel density of investments x (kdensity xstar1, kdensity xstar7); E_delta7_px by 10 quantiles of xstar1; E_delta7_pgam by 10 quantiles of mu_gamma; E_delta7_pqr by 10 quantiles of mu_qr.]

Figure 7
Intervention on parameters of reference point distribution
[Plots omitted. Panel titles: Baseline Reference; Baseline Investments. Plotted: kernel density of investments x; E_delta2_px by 10 quantiles of xstar1; E_delta2_pqr by 10 quantiles of mu_qr.]

Figure
[Plots omitted. Panels: Histogram of alpha_1 (horizontal axis: alpha1); Histogram of alpha_2 (horizontal axis: alpha2). Vertical axis: Density.]

Figure
[Plots omitted. Panels: Histogram of mean beliefs about gamma (horizontal axis: mu_gamma_un28); Histogram of mean beliefs about theta (horizontal axis: mu_theta_un28); Histogram of mean beliefs about reference point (horizontal axis: Elnh1). Vertical axis: Density.]

Table 1
Subjective expectations about the technology of skill formation

Accounting for measurement error by averaging across items

                                                            Mean    Median  25th Percentile  75th Percentile  Std Deviation
Over all items and scenarios¹                               8.8%    4.5%    4.5%             23.0%            32.9%

"Good" versus "poor" health at birth
Over all items, only scenarios with "good" health at birth  12.7%   7.2%    0.0%             29.1%            36.5%
Over all items, only scenarios with "poor" health at birth  4.9%    2.9%    7.2%             21.1%            33.6%

Primarily motor vs. primarily cognitive items
Cognitive items, all scenarios                              7.6%    0.0%    0.9%             20.8%            33.4%
Motor items, all scenarios                                  9.9%    4.6%    8.0%             23.7%            38.7%

Primarily motor vs. primarily cognitive items by health at birth
Cognitive items, "good" health at birth                     10.3%   0.0%    0.0%             26.1%            35.6%
Cognitive items, "poor" health at birth                     4.9%    0.0%    1.8%             17.0%            34.5%
Motor items, "good" health at birth                         14.8%   9.0%    0.0%             35.7%            43.8%
Motor items, "poor" health at birth                         4.9%    0.0%    6.1%             22.3%            41.2%

Accounting for measurement error by estimating factor model

                                                            Mean    Median  25th Percentile  75th Percentile  Std Deviation
Over all items and scenarios¹                               7.4%    3.9%    3.4%             21.0%            32.9%

"Good" versus "poor" health at birth
Over all items, only scenarios with "good" health at birth  11.8%   6.7%    1.3%             29.4%            36.1%
Over all items, only scenarios with "poor" health at birth  3.0%    0.2%    8.2%             15.8%            32.8%

Primarily motor vs. primarily cognitive items
Cognitive items, all scenarios                              5.1%    2.1%    4.0%             16.3%            36.4%
Motor items, all scenarios                                  6.9%    3.1%    5.0%             19.0%            35.4%

Primarily motor vs. primarily cognitive items by health at birth
Cognitive items, "good" health at birth                     10.2%   4.2%    0.4%             24.7%            39.2%
Cognitive items, "poor" health at birth                     0.0%    4.4%    7.4%             13.3%            36.9%
Motor items, "good" health at birth                         16.6%   10.4%   3.4%             36.1%            39.9%
Motor items, "poor" health at birth                         2.8%    4.1%    13.9%            7.3%             36.6%

¹ Health at birth is "good" if the baby's weight at birth is 8 pounds, the baby's length at birth is 20 inches, and the gestational age
is 9 months. Health at birth is "poor" if the baby's weight at birth is 5 pounds, the baby's length at birth is 18 inches, and the
gestational age is 7 months. When investment is "high" the mother spends 6 hours/day interacting with the baby. In contrast,
when investment is "low" the mother spends only 2 hours/day interacting with the baby.
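
The "high" and "low" investment scenarios in the note above (6 versus 2 hours/day) suggest one way to turn a pair of expected-development answers into an implied elasticity. The sketch below assumes a log-linear relation between skills and investment, so that the elasticity is the difference in log expected development divided by the difference in log hours. This functional form, the function name, and the example values of 24 and 22 months of "mental" age are assumptions made for illustration; they are not figures from the survey.

```python
import math

HIGH_HOURS = 6.0  # "high" investment scenario, hours/day (from the table note)
LOW_HOURS = 2.0   # "low" investment scenario, hours/day (from the table note)


def implied_elasticity(dev_high, dev_low, x_high=HIGH_HOURS, x_low=LOW_HOURS):
    """Implied elasticity of development with respect to investment.

    Assumes ln(development) is linear in ln(investment), so the elasticity is
    the log difference in development over the log difference in investment.
    """
    if dev_high <= 0 or dev_low <= 0:
        raise ValueError("expected development must be positive")
    return (math.log(dev_high) - math.log(dev_low)) / (math.log(x_high) - math.log(x_low))


# Hypothetical respondent: expects 24 months of development under high
# investment and 22 months under low investment.
example = implied_elasticity(24.0, 22.0)
print(f"implied elasticity: {example:.1%}")
```

Under this assumption, a respondent who expects the same development in both scenarios has an implied elasticity of zero, and one who expects lower development under high investment has a negative implied elasticity.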

Table 3
Subjective beliefs about the technology of skill formation
Checking sensitivity of the logit assumption
Not accounting for measurement error
Over all items and scenarios

                            Mean    Median  25th Percentile  75th Percentile  Std Dev

Logistic
Target age is 24 months     8.8%    4.5%    3.4%             21.0%            32.9%
Target age is 28 months     12.6%   8.7%    3.0%             28.9%            32.2%
Target age is 32 months     14.6%   10.7%   3.0%             30.1%            29.0%
Target age is 36 months     12.8%   7.6%    2.1%             27.8%            24.8%

Uniform
Target age is 24 months     7.4%    3.9%    3.4%             21.0%            32.9%
Target age is 28 months     22.8%   17.2%   4.9%             36.7%            23.0%
Target age is 32 months     21.7%   19.2%   4.2%             33.0%            21.6%
Target age is 36 months     21.0%   16.7%   5.7%             32.8%            19.3%

Lower Triangular
Target age is 24 months     15.8%   7.2%    3.3%             22.3%            22.9%
Target age is 28 months     20.8%   15.7%   4.3%             32.1%            22.7%
Target age is 32 months     21.7%   17.8%   4.2%             33.7%            22.1%
Target age is 36 months     21.7%   17.6%   3.9%             32.8%            22.3%

Upper Triangular
Target age is 24 months     15.2%   7.1%    2.1%             24.3%            23.5%
Target age is 28 months     24.1%   18.3%   4.8%             39.9%            25.3%
Target age is 32 months     22.9%   19.1%   4.1%             40.0%            23.7%
Target age is 36 months     21.1%   15.2%   4.4%             35.7%            21.2%

Direct Elicitation of Probability
Target age is 24 months     26.1%   17.5%   4.8%             40.9%            35.3%
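
Table 3 compares several rules for turning a reported age range into a probability at a target age. The sketch below shows what such rules could look like for a respondent who reports a youngest and an oldest age for reaching a milestone. It is an interpretation, not the paper's definition: the function names are hypothetical, and reading "lower triangular" as a density that peaks at the youngest age and "upper triangular" as one that peaks at the oldest age is an assumption.

```python
def _position(age, lo, hi):
    """Position of age within [lo, hi], clipped to the unit interval."""
    if hi <= lo:
        raise ValueError("age range must satisfy lo < hi")
    return min(1.0, max(0.0, (age - lo) / (hi - lo)))


def uniform_cdf(age, lo, hi):
    """P(milestone reached by age) if timing is uniform on [lo, hi]."""
    return _position(age, lo, hi)


def lower_triangular_cdf(age, lo, hi):
    """Assumed shape: density is highest at lo and falls linearly to zero at hi."""
    s = _position(age, lo, hi)
    return 1.0 - (1.0 - s) ** 2


def upper_triangular_cdf(age, lo, hi):
    """Assumed shape: density is zero at lo and rises linearly to its peak at hi."""
    s = _position(age, lo, hi)
    return s ** 2


# Hypothetical respondent: milestone expected between 18 and 30 months,
# evaluated at a target age of 24 months.
lo_age, hi_age, target = 18.0, 30.0, 24.0
p_uniform = uniform_cdf(target, lo_age, hi_age)
p_lower = lower_triangular_cdf(target, lo_age, hi_age)
p_upper = upper_triangular_cdf(target, lo_age, hi_age)
print(p_uniform, p_lower, p_upper)
```

With these shapes the same reported range yields different probabilities at the midpoint of the range, which is the kind of sensitivity to the interpolation rule that the table is checking.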

Table 5
Objective estimation of the technology of skill formation
Estimates for full sample and selected subsamples
Dependent variable: Natural log of skills around age 24 months¹

                                                            13 to 35 Months     16 to 32 Months     19 to 29 Months

Full sample                                                 18.0%*** (1.99%)    19.9%*** (2.70%)    25.7%*** (3.84%)

Analysis by race/ethnicity
Hispanic subsample only                                     18.3%*** (4.21%)    18.9%*** (5.45%)    34.4%*** (7.64%)
Black subsample only                                        18.7%*** (3.17%)    21.7%*** (4.42%)    20.1%*** (6.33%)
Non-Hispanic, non-black subsample only                      16.9%*** (3.80%)    19.7%*** (5.45%)    31.1%*** (8.82%)

Analysis by maternal education at first birth
Mother is high-school dropout at birth of the first child   18.1%*** (3.16%)    20.2%*** (4.64%)    20.4%*** (5.80%)

Mother is at least high-school graduate at birth of the first child

18.6%***
(3.69%)

29.3%***
(5.30%)

18.9%***
(4.23%)

16.5%***
(4.99%)

20.9%***
(3.99%)

29.4%***
(5.79%)

Analysis by maternal score on Rotter's locus of control scale


Maternal score on Rotter's locus of control is in top quartile²    19.6%*** (2.44%)    20.0%*** (3.28%)    23.5%*** (4.71%)

Maternal AFQT is in bottom quartile

17.8%***
(2.71%)

Analysis by maternal cognitive skills


16.3%***
(3.19%)

Maternal AFQT is in 2nd quartile or higher

Maternal score on Rotter's locus of control is in 3rd quartile or lower²

20.0%***
(2.84%)

13.9%***
(4.14%)

20.4%***
(5.67%)

Analysis by maternal score on Rosenberg's self esteem scale


Maternal score on Rosenberg's self esteem scale is in bottom quartile³
13.9%***
21.4%***

Maternal score on Rosenberg's self esteem scale is in 2nd quartile or higher³

17.6%**
(6.94%)

(4.16%)

(6.29%)

18.2%*
(9.94%)

18.6%***
(2.37%)

19.1%***
(3.02%)

27.2%***
(4.33%)

Robust standard errors in parentheses. All regressions have dummy variables for: (i) the child's gender, (ii) birth order, (iii) age at
the time of measurement of the dependent variable, (iv) year of birth and (v) maternal age at the time of the child's birth.
*** p<0.01, ** p<0.05, * p<0.1
¹ Skills are measured by the Motor-Social Development Scale and are scaled in "mental" age of development.
² The Rotter locus of control scale measures the extent to which individuals believe that they can control events that affect them. In
the NLSY/79, it takes on values between 4 and 16. Low values indicate that individuals tend to believe that they can control the
events, while high values suggest that individuals believe that events are beyond their control.
³ The Rosenberg self esteem scale measures an individual's self esteem. In the NLSY/79, it takes on values between 9 and 30. Low
values indicate lack of self esteem.
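
The star convention in the notes above can be checked against a reported coefficient and standard error. The sketch below computes a two-sided p-value from the ratio of the two and maps it to stars. It uses a standard normal approximation, which is an assumption: the paper does not state which reference distribution underlies its p-values, and the function names are hypothetical.

```python
import math


def two_sided_p(coef, se):
    """Two-sided p-value for coef/se under a standard normal approximation."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    t_stat = abs(coef / se)
    # For a standard normal Z, P(|Z| > t) = erfc(t / sqrt(2)).
    return math.erfc(t_stat / math.sqrt(2.0))


def stars(coef, se):
    """Map a p-value to the convention *** p<0.01, ** p<0.05, * p<0.1."""
    p = two_sided_p(coef, se)
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.1:
        return "*"
    return ""


# Coefficient and standard error pairs as reported in Table 5 (in percent).
for coef, se in [(18.0, 1.99), (17.6, 6.94), (18.2, 9.94)]:
    print(f"{coef}% ({se}%) -> {stars(coef, se)}")
```

Because only the ratio of coefficient to standard error matters, the percent units cancel and the values can be entered as printed.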

Table 6
Moving median expectations close to objective estimates

Interpolation method            Median expectation  Target elasticity  Change in investments  Change in child development at age 24 months

Lower triangular at 32 months   17.8%               19.9%              10.6%                  2.2%
Upper triangular at 32 months   19.1%               19.9%              3.6%                   0.7%
Lower triangular at 28 months   15.7%               19.9%              24.3%                  5.0%
Upper triangular at 28 months   18.3%               19.9%              7.9%                   1.6%

Lower triangular at 32 months   17.8%               25.7%              40.4%                  9.5%
Upper triangular at 32 months   19.1%               25.7%              31.4%                  7.4%
Lower triangular at 28 months   15.7%               25.7%              58.1%                  13.7%
Upper triangular at 28 months   18.3%               25.7%              36.9%                  8.7%

Lower triangular at 32 months   17.8%               28.3%              53.8%                  19.5%
Upper triangular at 32 months   19.1%               28.3%              43.8%                  15.9%
Lower triangular at 28 months   15.7%               28.3%              73.3%                  26.6%
Upper triangular at 28 months   18.3%               28.3%              50.0%                  18.1%
