Identifying Growth Patterns of the High-Tech Manufacturing Industry across the Seoul Metropolitan Area Using Latent Class Analysis
Li Wan, Ph.D. 1; and Youngsoo An, Ph.D. 2

Downloaded from ascelibrary.org by University of Seoul on 05/30/18. Copyright ASCE. For personal use only; all rights reserved.

Abstract: Latent class analysis (LCA) is a well-established research method in social science for explaining observed correlations or variance
by identifying latent classes, but it has rarely been applied in urban studies. This paper provides an empirical case of using LCA to
identify the generic growth patterns of the high-tech manufacturing industry across locations in the Seoul metropolitan area (SMA). The
presented model uses standardized high-tech industry growth data processed from firm registration data in SMA (2009–2014) as outcome
variables, and incorporates the initial high-tech firm density in 2009 as a covariate, aiming to explore the possible link between identified
growth patterns and initial high-tech firm density. The authors found that during 2009–2014 continuous growth of the high-tech industry was
more likely to occur in locations with relatively low initial high-tech firm density. As firm density rose, fewer locations could sustain growth,
and locations increasingly relied on certain triggers to achieve further growth. The probability of falling into industrial decline also increased as
high-tech firm density grew. In addition, locations with relatively low high-tech firm density were more likely to experience big fluctuations.
The methodology and the findings presented in this paper are expected to contribute to industrial location choices and development studies in
similar metropolitan areas. DOI: 10.1061/(ASCE)UP.1943-5444.0000397. © 2017 American Society of Civil Engineers.

1 Research Associate, Dept. of Architecture, Univ. of Cambridge, Trumpington St., Cambridge CB2 1PX, U.K. (corresponding author). ORCID: https://orcid.org/0000-0002-7863-0729. E-mail: lw423@cam.ac.uk
2 Research Professor, Dept. of Urban Design and Planning, Univ. of Seoul, Seoul 02504, Korea. E-mail: ysan@uos.ac.kr
Note. This manuscript was submitted on December 8, 2016; approved on April 5, 2017; published online on June 30, 2017. Discussion period open until November 30, 2017; separate discussions must be submitted for individual papers. This paper is part of the Journal of Urban Planning and Development, © ASCE, ISSN 0733-9488.

© ASCE 04017011-1 J. Urban Plann. Dev.

J. Urban Plann. Dev., 2017, 143(3): 04017011

Introduction

The growth of the manufacturing industry has been a major topic in regional and urban development studies for its prominent role in economic and employment growth. In the context of the Republic of Korea, the value added of the manufacturing sector has been growing since 2004 (Fig. 1), with a noticeable retardation around 2009 due to the global financial crisis. The percentage share of the manufacturing industry in the national gross value added has also been increasing during the same period (dashed line in Fig. 1). By 2014, the value added of the manufacturing industry accounted for more than 30% of the national gross value added. The growth of the manufacturing industry plays a pivotal role in the prosperity of the economy. In addition, within the manufacturing industry, the heavy, light, and high-tech sectors present different growth patterns (Fig. 2). The high-tech manufacturing sector (electronic, electrical equipment, and precision instruments) has been growing at a faster rate than other manufacturing sectors, which highlights the ongoing transition to a knowledge-based economy in Korea since the 1990s. In 2014, the high-tech manufacturing sector accounted for about one-third of the total value added of the manufacturing industry as a whole.

To articulate industrial growth, more information and dimensionalities need to be taken into consideration apart from the aggregate growth curve as in Fig. 1. First, the industrial growth and the associated locational mechanism are likely to vary across manufacturing sectors. For example, Arauzo-Carod (2005) found that the labor-intensive sector (e.g., heavy industry) and the research and development intensive sector (e.g., high-tech industry) follow different mechanisms in industrial location choices. Second, to understand the changes in local industry over time, an explicit temporal dimension is required, rather than merely the start and end status. For example, some locations may see a linear growth pattern, whereas some others may experience leap-frog growth or fluctuations. These growth trajectories per se are informative in the sense that they may represent different mechanisms of growth. Such growth patterns are well captured by the geometric curve of the growth trajectory (Gold 1964), which provides a useful perspective to examine growth over time and compare growth patterns among locations.

The purpose of this paper is to identify the growth patterns of the high-tech manufacturing industry across locations in the Seoul metropolitan area (SMA), using longitudinal firm registration data from 2009 to 2014. The SMA was chosen as the study area because of its dominant role in the national economy: the value added of SMA manufacturing industry accounted for 49.4% of the national gross in 2014 (KOSIS 2016). In this paper the authors focus on the high-tech manufacturing sector, rather than taking the manufacturing sectors as a whole. This is to acknowledge the fact that sectors tend to follow different mechanisms of growth. To identify growth patterns, latent class growth models were developed following a data-driven, explorative approach. Latent class growth analysis is a particular type of latent class analysis (LCA) that deals with longitudinal data. It uses outcome variables measured at multiple time points to define a latent class model in which the latent classes correspond to different growth curve shapes for the outcome variables (Muthén and Muthén 2000). Whereas LCA is a well-established research method in social and behavioral science (Clogg 1995), it has rarely been applied in urban development studies.

This paper is organized as follows. The research on industrial growth patterns and the methodology of latent variable analysis are reviewed in the next section. This is followed by the introduction of
the empirical data with descriptive statistics. In the “Method” section, the research problems and the analytical procedure are presented. Model estimation and selection are discussed next. The research findings are drawn in the “Discussion” section. The “Conclusion” section summarizes the paper; limitations are also pointed out in this section.

Fig. 1. Value added of manufacturing industry in Republic of Korea (data from KOSIS 2010)

Fig. 2. Value added by manufacturing sector in Republic of Korea (data from KOSIS 2010)

Literature Review

In this section, the literature on industrial growth patterns and the methodological background of latent variable analysis are reviewed in turn.

Industrial Growth Patterns

The search for industrial growth patterns stems from the early work of Burns (1934), who studied the varying rates of growth of several American industries, aiming to find uniformity in growth pattern. Burns proposed a basic generalization with respect to industrial growth that “an industry tends to grow at a declining rate, its rise being eventually followed by a decline,” based on a study of longitudinal industrial growth data (1870–1885) in the United States. This seminal research topic was revisited by Gold (1964) 30 years later, who reviewed Burns’ work with updated empirical evidence and reiterated the significance of investigating industrial growth patterns. Gold (1964) explored the geometric features of industrial growth patterns from 1929 to the 1950s in the United States and found that progressive retardation may be far less pervasive and undiminished rates of growth may be more frequent than had been suggested by Burns. He also proposed that the uniformity of growth patterns is unlikely to exist across industries. This research strand differs from the econometric approach in that it directly examines industrial growth patterns in a geometrical manner. However, existing research often uses aggregate data at the national or regional level, at which spatial heterogeneity is not a key point of interest.

Another line of research exploring industrial growth is through modeling firm location choices. This research line incorporates the explicit spatial dimension in the sense that the study area is often divided into zones as units of analysis. Arauzo-Carod et al. (2010) provide a thorough overview of the methodology and determinants of industrial location. They found that agglomeration economies are generally accepted as an important determinant for industrial location. The term agglomeration economies denotes that the spatial proximity of firms would lead to positive externalities that may boost productivity and promote further concentration (Fujita et al. 2001; Glaeser 1999; Marshall 1890). When agglomeration reaches a certain threshold, diseconomies may also arise that deter further concentration. In firm location models, the level of agglomeration is usually measured by the density of firms within a given geographic area. The research on industrial locations, though often using longitudinal data to enlarge the sample size, tends to focus only on the start and end status of the study period (i.e., the net changes between two cross sections). Growth patterns and interrelations over time are often neglected.

Latent Variable Analysis

In statistics, latent variables, as opposed to observed variables, are variables that are not directly observed but can be inferred from observed variables (indicators). Various methods have been developed under the framework of latent variable analysis, with respect to the type of latent variables and indicators (categorical or continuous), the temporal characteristics of indicators (cross sectional or longitudinal), and the model assumptions for classification. An extensive overview on the methodological variations of latent variable analysis is provided in Muthén (2001). In this section, the difference between latent class analysis and factor analysis is discussed first. This is followed by a brief introduction of the standard latent class model and its extensions. Considerations for model estimation and selection are discussed afterward.

Latent class analysis was introduced by Lazarsfeld and Henry (1968), and it was extended by Goodman (1974) and Clogg (1995). The standard LCA is intended to identify unobserved class membership among sample objects. There is a similarity between LCA and the nonhierarchical factor analysis using continuous latent variables, because the two methods both find classes (clusters) of samples that are similar in terms of means, variances, and covariances. However, the key difference lies in the fact that in factor analysis the cluster-membership probability is directly estimated through the maximum log-likelihood method, which typically maximizes cross-cluster variation or minimizes within-cluster variation, whereas in LCA the posterior class-membership probability is derived with estimated class-specific parameters. In other words, the LCA can classify observed samples in a probabilistic fashion, and thus it can be used to derive other conditional probabilities (Vermunt and Magidson 2002).

Latent class growth analysis (LCGA) is an extension of LCA that deals with longitudinal data for identifying distinctive groups of individual growth trajectories (Nagin 1999; Nagin and Land 1993). For a linear-type LCGA, a set of intercepts and slopes, namely, growth factors, are devised to measure growth trajectories, in which the intercepts represent the time-invariant variance (e.g., initial status) and the slopes depict the time-varying growth pattern (e.g., the amount of change over time). In LCGA, it is generally assumed that no variation across individuals is allowed within classes, though growth factors can change over classes,
representing different growth patterns among classes. The LCGA is also called growth curve analysis when continuous indicators are applied. This section follows the variable definitions in Muthén (2001) and introduces the generic formulation of LCGA. The observed variables are $x$ and $y$, where $x$ denotes a vector of covariates and $y$ denotes a vector of continuous outcome variables (indicators). In this paper, all observed variables are continuous; thus, no categorical indicators are considered. The latent variables are $\eta$, denoting a vector of continuous latent variables, and $c$, representing a latent categorical variable with $K$ classes, $c = (c_1, c_2, \ldots, c_K)$. Finally, $t = 0$ to $T$ denotes the time of measurement for observed variables. The generic formulation is thus given by

$$y_{ti} = \sum_{c=1}^{K} \left[ P_{ic} \left( \eta_{0ic} A_{t0ic} + \eta_{1ic} A_{t1ic} + e_{tic} \right) \right]$$

where $P_{ic}$ = probability that the sample $i$ belongs to class $c$ with $\sum_{c} P_{ic} = 1$; $\eta_{0ic}$ and $\eta_{1ic}$ = continuous latent variables to be estimated that are time-invariant; $A_{t0ic}$ and $A_{t1ic}$ = intercepts and the slopes, respectively, that together measure the growth pattern over time; and $e_{tic} \sim N(0, \Psi_k)$ = time-variant residual term, assumed to be uncorrelated with other variables.

To interpret the model parameters and variables relating to the geometric pattern of growth curves, the authors follow Ram and Grimm (2009). For the growth factors, $A_{t0ic}$ is usually assumed to be constant ($A_{t0ic} = 1$) to represent the influence of initial status over the growth trajectory. The term $A_{t1ic}$ captures the pattern of changes over time, which can be specified a priori (e.g., linear, quadratic) or estimated from the data together with the latent variables. The free elements of $A_{t1ic}$, when not predetermined, represent a distinctive, class-specific pattern of growth that best fits the data.

As a further generalization of LCGA, growth mixture modeling (GMM) allows within-class variation in determining the trajectory classes. Nonetheless, when building a GMM model, it is useful to start from the standard LCGA, because the zero constraints on class variance could significantly reduce computing time and result in faster model convergence (Kreuter and Muthén 2007). The extended flexibility of GMM makes it a useful tool to explore data of various kinds when a priori knowledge on the latent trajectory classes is not well established. In this paper, the GMM method is adopted to identify the generic growth patterns of the high-tech manufacturing industry in the Seoul metropolitan area.

Another merit of latent class analysis is its capability of incorporating covariates (concomitant variables). A formal proposal of LCA with covariates was presented in Clogg (1981), although he used the term external variables instead of covariate. The key idea of incorporating covariates in LCA is to distinguish endogenous variables that serve as indicators from external variables that may be used to predict class membership. In this paper, the incorporation of covariates means that not only generic industrial growth patterns can be identified, but also that the possible connections between the identified patterns and the selected covariate can be explored. The introduction of covariates thus facilitates the interpretation of model results and suggests possible causalities for further investigation. The technical descriptions of the LCA with covariates can be found in Vermunt and Magidson (2002) and Muthén and Asparouhov (2011).

Model Estimation and Selection

The latent class models can be estimated through maximum likelihood using an expectation maximization (EM) algorithm. The quality of model fitting can be evaluated based on the precision of classification and the entropy measure. In terms of model convergence, local optima and multiple maxima (Titterington et al. 1985) are often encountered in the estimation, particularly in models with a large degree of across-class variations such as in GMM applications. One general method to tackle local solutions is to estimate the model with random starting values to find the solution with the highest log likelihood value. Such a searching algorithm has been automated in most software packages, such as Mplus. For GMM particularly, Ram and Grimm (2009) use a four-step method to demonstrate that the GMM building should start from linear, parameter-restricted models such as standard LCGA, and then gradually relax the variance constraints and increase the number of classes to establish a stable GMM application. Other useful instructions for applying GMM can be found in Jung and Wickrama (2008) and Wang and Hanges (2011).

The model selection involves two important considerations: (1) determination of the number of classes and (2) comparison of models of different specifications given the number of classes. An overview of this topic can be found in Celeux et al. (1997). It should be noted that the conventional chi-squared tests can only be used to compare two latent class models that have an identical number of classes, but not models with different class numbers. Instead, the Akaike information criterion (AIC) (Akaike 1973) and the Bayesian information criterion (BIC) (Schwarz 1978) have been put forward for performance-testing purposes; both include a penalty term for the number of parameters in the model (the penalty term is larger in BIC than in AIC). A lower AIC/BIC value indicates a better-fitting model. Another index for model selection is the Lo-Mendell-Rubin (LMR) likelihood ratio test (Lo et al. 2001), in which the likelihood ratio statistic for a k-class model is related to the neighboring k − 1 class model. However, Jeffries (2003) found that there is a flaw in the mathematical proof of the LMR test for normal outcomes. The LMR test results thus need to be interpreted with caution. Alternatively, McLachlan and Peel (2000) proposed a parametric bootstrap likelihood ratio test (BLRT), which does not require the a priori distribution type for the likelihood difference between the k − 1 and k class model, and uses bootstrap samples to estimate the distribution. Nylund et al. (2007) empirically compared several model fit indices (BLRT, BIC, LMR, etc.), and found that the BLRT outperforms the others, followed by the BIC and the adjusted BIC, albeit with the extra computation burden incurred by the bootstrapping procedure.

Given the aforementioned fit statistics for model selection, as a summary note, Muthén (2003) pointed out that latent class models should be amenable to both statistical and substantive checking. Latent class models should not only be statistically sound, but also conform to relevant theory and be useful for practical purposes.

Data

Geography

The spatial scope of this study is the Seoul metropolitan area of the Republic of Korea. The SMA had a population of about 24 million in 2010, covering a land area of 12,056 km². The SMA includes three major regions: Seoul, Incheon, and Gyeonggi-do. Seoul-Incheon is one of the largest urban metropolitan areas in the world (Cox 2016). In terms of firms and employment, SMA is home to 47.1% of the total firms and 51.7% of the total employment in Korea (KOSIS 2016).

Fig. 3. Study area: Seoul metropolitan area (SMA) in Republic of Korea

To define the spatial unit for this analysis, the SMA was divided into 79 zones, which are in line with administrative divisions. The descriptive statistics (the minimum, maximum and standard deviation) of the zoning system are provided in Fig. 3, which shows that the SMA zones vary considerably in terms of geographic area.

Descriptive Statistics

The firm data for this research came from the annual Business Yearbook published by the Korean Chamber of Commerce and Industry (KCCI). The purchased data include firm-level registration records (such as firm name, industrial type, address, employment size, etc.) from 2009 to 2014. Specifically, this paper focuses on high-tech manufacturing firms. The firm-level count data were aggregated according to the zoning system. The outcome was a longitudinal data set of the zonal number of high-tech manufacturing firms from 2009 to 2014. The descriptive statistics of the high-tech firm count data in SMA are presented in Table 1.

The total number of high-tech manufacturing firms in SMA has grown continuously since 2009. A particularly fast period of growth was seen during 2011–2012, which might signify a recovering bounce from the global financial crisis in 2008. In addition, the standard deviation (SD) of firm numbers also increased from 2009 to 2012 and has remained stable since 2013. This suggests that the development gap among locations widened during 2009–2012.

The potential bias caused by the uneven size of zones should be addressed before conducting the latent class analysis. This is because large zones tend to have more firms as well as more growth. To correct such bias, firm density (number of firms per square kilometer) is used instead of absolute counts for the following analyses. The density distribution of high-tech firms in SMA is presented in Fig. 4. The zones were reordered according to their density ranking in 2009. It shows that the relative ranking of firm density changes over time, but the general distribution pattern remains. Locations of high firm density tend to have higher absolute growth, whereas the middle range shows more dynamic growth patterns.

Method

Problem Definition

Given the empirical nature of this paper, defining the research problems was a key step in developing an appropriate analytical plan. The authors defined the research problems as (1) how many latent groups of growth patterns are expected that are amenable to both statistical and substantive checking, (2) how these groups differ from each other, and (3) how to explore the link between the predicted group membership and the initial high-tech firm density (covariate).

Table 1. Descriptive Statistics of High-Tech Firm Count Data in SMA

Variable                  2009    2010    2011    2012    2013    2014
Minimum                      0       0       0       0       1       1
Maximum                  1,668   1,678   2,014   2,737   2,614   2,533
Mean                       234     243     259     290     301     308
Standard deviation       303.7   310.6   342.3   413.3   413.4   410.5
Sum                     18,490  19,176  20,441  22,890  23,783  24,336
Percentage growth^a          -     3.7     6.6    12.0     3.9     2.3

^a Percentage growth based on the previous year.
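
The derived rows of Table 1 can be recomputed from the Sum row. The sketch below is illustrative only: it assumes the 79 zones described for the SMA zoning system, and the variable names are hypothetical rather than taken from the authors' workflow.

```python
# Yearly totals of high-tech firms in SMA, copied from the Sum row of Table 1.
totals = {2009: 18490, 2010: 19176, 2011: 20441, 2012: 22890, 2013: 23783, 2014: 24336}
N_ZONES = 79  # number of zones in the SMA zoning system

years = sorted(totals)

# Percentage growth relative to the previous year, rounded to one decimal.
growth = {
    year: round(100.0 * (totals[year] - totals[prev]) / totals[prev], 1)
    for prev, year in zip(years, years[1:])
}

# Mean firm count per zone, rounded to the nearest integer.
mean_per_zone = {year: round(totals[year] / N_ZONES) for year in years}

print(growth)
print(mean_per_zone)
```

Both recomputed rows agree with the published Percentage growth and Mean rows, with the 12.0% jump falling between 2011 and 2012.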

Fig. 4. Distribution of high-tech firm density in SMA (2009–2014)

Data-Guided Approach

To design the procedure of latent class analysis, a data-guided approach was adopted by investigating the data and postulating the underlying structure. The growth trajectories of high-tech firms in SMA are plotted in Fig. 5(a), in which each line depicts the change of high-tech firm density at a certain location during 2009–2014. A considerable level of variance is shown across locations. Specifically, the amount of change over time was relatively small for locations with low initial high-tech firm density in 2009, depicted by the flat lines at the bottom of the figure. By contrast, relatively large changes were witnessed for locations with higher initial firm density. One location, Geumcheon District in Seoul [see the top line in Fig. 5(a)], presented a distinct growth trajectory: it remained the most concentrated location of high-tech firms in SMA throughout the study period. Along the trajectory, noticeable growth was witnessed during 2011–2012. The authors investigated the possible cause for this quick growth and found that it probably relates to a 2011 industrial regeneration scheme in Seoul that aimed to transform obsolete factories into modern industry parks.

To focus on growth patterns, the density growth trajectories have been standardized [Fig. 5(b)]. The standardization procedure gives every location a zero-mean growth trajectory, which rules out the influence of initial firm density. The geometric pattern thus becomes the key point of interest. In Fig. 5(b), an upward growing trend can be visually identified, in which a faster growing stage is seen between 2011 and 2012. This trend is in line with the overall growth of high-tech firms in SMA (Table 1). However, significant variations were also revealed across locations, which call for further investigation using latent class models.

Based on the empirical investigations, two major factors were identified that contribute to the compound variance observed in the industrial growth data: (1) the initial high-tech firm density (cross sectional) and (2) the pattern of growth trajectory (longitudinal). The authors’ postulation was that latent classes exist for each factor, and such latent classes are correlated in the sense that the initial firm-density measure may serve as the covariate for determining the class membership of growth patterns. The analytical procedures are summarized in Table 2.

The authors started from a baseline single-group growth model, in which an average growth pattern was estimated for all locations. The purpose of this step was to examine the data using model tools and obtain benchmark fit statistics that could be used to evaluate

Fig. 5. (a) High-tech firm density growth trajectories in SMA 2009–2014; (b) high-tech firm density growth trajectories in SMA 2009–2014 by
standard deviation unit

Table 2. Procedure of Latent Class Analysis

Step 1: Establish a baseline single-group growth model
    Data input: Standardized density growth data (longitudinal)
Step 2: Establish growth mixture models of varying number of classes to identify growth trajectory patterns
    Data input: Standardized density growth data (longitudinal)
Step 3: Establish latent class model to identify classes based on firm density measure
    Data input: Firm density measure (cross sectional)
Step 4: Expand selected growth mixture model from Step 2 by incorporating the density measure as a covariate
    Data input: Standardized density growth data (longitudinal) and firm density measure (cross sectional)
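
Steps 1, 2, and 4 take standardized density trajectories as input. The paper does not state the formula, so the sketch below assumes a per-location z-score (each zone's yearly densities minus that zone's own mean, divided by that zone's own standard deviation), which is consistent with the description of zero-mean trajectories expressed in standard deviation units. The function name and the density values are hypothetical and serve only as an illustration.

```python
import statistics

def standardize(trajectory):
    """Return the z-scores of one location's yearly firm densities."""
    mean = statistics.fmean(trajectory)
    sd = statistics.pstdev(trajectory)  # assumption: population SD over the years
    if sd == 0:
        # A perfectly flat trajectory has no shape to standardize.
        return [0.0 for _ in trajectory]
    return [(value - mean) / sd for value in trajectory]

# Hypothetical densities (firms per square kilometer), 2009-2014.
low_density_zone = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5]
high_density_zone = [50.0, 55.0, 60.0, 65.0, 70.0, 75.0]

z_low = standardize(low_density_zone)
z_high = standardize(high_density_zone)
print(z_low)
print(z_high)
```

Both hypothetical zones grow linearly, so they map to the same standardized trajectory although their density levels differ by a factor of 50. Only the shape of growth is left for the mixture model to classify, which is why the initial density has to re-enter as a covariate in Step 4.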

Table 3. Fit Statistics of Latent Class Models for High-Tech Firm Growth Patterns in SMA

                                          Number of classes
Index                             1        2        3        4        5        6        7
Log likelihood (LL)         −460.42  −374.22  −337.61  −311.84  −293.55  −274.53  −261.08
Akaike (AIC)                 944.84   786.44   727.21   689.68   667.11   643.07   630.17
Bayesian (BIC)               973.27   831.46   788.82   767.87   761.88   754.43   758.12
Sample-size adjusted BIC     935.43   771.55   706.84   663.82   635.76   606.24   587.85
Bootstrapped likelihood ratio test
  Approximate P-value             -   0.0000   0.0000   0.0000   0.0000   0.0128   0.1923
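
The information criteria in Table 3 can be reproduced from the log likelihoods. The sketch below uses the textbook definitions AIC = −2LL + 2p and BIC = −2LL + p ln(n), with n = 79 zones. The number of free parameters p is not reported in the paper, so it is inferred here from the published AIC values and should be read as an assumption.

```python
import math

N = 79  # sample size: number of zones

# Log likelihood, AIC, and BIC per number of classes, copied from Table 3.
log_lik = {1: -460.42, 2: -374.22, 3: -337.61, 4: -311.84,
           5: -293.55, 6: -274.53, 7: -261.08}
aic_reported = {1: 944.84, 2: 786.44, 3: 727.21, 4: 689.68,
                5: 667.11, 6: 643.07, 7: 630.17}
bic_reported = {1: 973.27, 2: 831.46, 3: 788.82, 4: 767.87,
                5: 761.88, 6: 754.43, 7: 758.12}

# Infer the parameter count p from AIC = -2*LL + 2*p
# (an assumption; the paper does not report p).
n_params = {k: round((aic_reported[k] + 2 * log_lik[k]) / 2) for k in log_lik}

# BIC = -2*LL + p*ln(n)
bic = {k: -2 * log_lik[k] + n_params[k] * math.log(N) for k in log_lik}

# The preferred model under BIC is the one with the smallest value.
best_k = min(bic, key=bic.get)

print(n_params)
print({k: round(v, 2) for k, v in bic.items()})
print("lowest BIC at", best_k, "classes")
```

Under these assumptions the inferred parameter count rises by seven with each added class, and the recomputed BIC values match the table to within rounding. BIC reaches its minimum at six classes, whereas AIC keeps falling through seven classes, which illustrates the heavier BIC penalty.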

subsequent models. Step 2 aimed to identify the generic growth Table 4. Class Count of Latent Class Growth Models
patterns of high-tech firms in SMA during the study period, using Number of classes in latent model
standardized firm-density trajectories. Step 3 sought to divide the Member count
locations into groups according to their high-tech firm density in in each class 1 2 3 4 5 6 7
2009 and 2014. As opposed to the density trajectories in Step 2, Class A 79 73 67 35 35 32 32
actual firm-density measures were used in this step. Step 3 thus Class B — 6 6 6 5 5 5
complemented Step 2 and provided a cross-sectional analysis of Class C — — 6 4 4 5 4
firm-density distribution among locations. Finally, Step 4 inte- Class D — — — 34 34 33 32
Class E — — — — 1 1 1
grated the findings of Steps 2 and 3, using firm density in 2009
Class F — — — — — 3 3
as the covariate to predict the membership probability of different Class G — — — — — — 2
growth trajectory classes. Firm density was chosen as the covariate Total 79 79 79 79 79 79 79
because it explicitly represents the level of industrial agglomera-
tion. Other locational factors such as labor education level and
transport accessibility are also plausible options for the covariate
in further studies. Overall, this data-guided approach was devised represented the same growth pattern across models. Class B is first
from the authors’ empirical investigation and complies with the revealed by the 2-class model, and the class count remains stable
common practice of establishing growth mixture models. (5–6 locations) as the class number increases. Class C is then iden-
tified by the 3-class model as a separated set from Class A, and
the class count remains stable (4–6 locations). Class D is extracted
Model Estimation and Selection from Class A by the 4-class model, which includes about 32 loca-
tions. As the number of classes keeps increasing, Classes E, F,
Models in this paper were developed using the Mplus software. and G are identified but the class count is generally small (fewer
Mplus uses maximum-likelihood with the EM algorithm, estimat- than three locations), implying that the new classes are essentially
ing the posterior probability of class membership with a given num- regression outliers extracted from existing classes.
ber of classes. More technical details regarding the EM algorithm in To understand the classification of growth patterns, the seven
Mplus can be found in Muthén and Shedden (1999). In this section, identified growth patterns are plotted in Fig. 6. Class A represents
the results of model estimation are presented first; the selection a continuous growth pattern from 2009 to 2014, with increased
process is then discussed. growth speed (slope) during 2010–2011. Locations in this class
The model fit statistics for the single-class model and a series of exhibit strong growth-sustaining power that may well stem from
multiclass models with an incremental number of classes are pro- agglomeration economies. Class B represents a declining pattern.
vided in Table 3. Compared with the single-group model, signifi- Significant loss of high-tech firms was seen in these locations dur-
cant improvement in model performance was witnessed for the ing 2009–2010, followed by another steep decline between 2011
2-class model. The change of AIC, BIC, and log-likelihood (LL) and 2012. Locations in Class C feature faster growth during
all imply better model classification as the number of classes in- 2009–2010, fluctuations until 2013 and a declining trend by
creased from 2 to 6. The P-value from the BLRT also shows that 2014. Class D shows a different type of growth trajectory, in which
the model improved continuously from 2 classes to 6 classes, and fast growth is triggered in 2011 after initial sluggish growth, and
rejects the 7-class model against the 6-class model. growth continues until the end of the study period. The quick
Given the aforementioned fit statistics, the authors checked the growth during 2011–2012 is likely attributed to the 2011 industrial
member count of each class for the multiclass models (Table 4). For regeneration scheme implemented in Seoul. The local high-tech in-
comparison purposes, the classes were ordered such that each class dustry in Class E experienced a dramatic V-shape shock during

Fig. 6. Identified growth patterns of the 7-class model

2009–2011. A large fluctuation was also witnessed in Class F, but Class F differs from Class E in that locations in Class F ended up with a firm-density level higher than that of 2009. Class G revealed in the 7-class model also shows a constantly fluctuating pattern. Nonetheless, if an average trajectory of Class G is derived through linear interpolation, the interpolated trajectory shows a linear growth trend that is similar to that of Class A.
Given the aforementioned statistical and substantive checks, the 6-class model was selected for the following reasons. First, the 6-class model has the lowest BIC value and a low BLRT P-value (0.0128) that rejects the 5-class model. Second, the 7-class model is rejected by the high BLRT P-value (0.1923). Third, in latent class analysis, classes with small counts usually need to be approached with caution. However, given the distinctive pattern of Classes E and F, the authors deemed that these locations should be extracted as new classes. These minor classes reflect the heterogeneity of high-tech firm growth in SMA.
The next step (Step 3) was to estimate the latent class model using high-tech firm density as indicators. A similar model selection method was applied as discussed previously. The selected model divides the locations in SMA into three classes according to their high-tech firm density in 2009 and 2014 (two cross sections). Results are presented through a scatter plot in Fig. 7. From the scatter plot, a good linear correlation was found between high-tech firm density in 2009 and 2014. It implies that the aggregate density distribution holds between the two static years. Class 1 has only one member, Geumcheon-gu, which had an outstanding high-tech firm density in 2009 (averaging 74.7 firms per square kilometer) and remained the top in 2016. Class 2 represents 71 locations that have relatively low firm density (averaging 3.4 firms per square kilometer in 2009). Class 3 includes seven locations, representing a medium level of high-tech firm density in SMA (averaging 27.4 firms per square kilometer in 2009).
The final step (Step 4) of the latent class analysis was to extend the 6-class model in Step 2 by introducing the 2009 high-tech firm density as a covariate. The final growth mixture model with the covariate links the posterior probability of class membership to the initial firm density in 2009. The growth pattern classification is presented in Fig. 8. The six growth patterns are in line with the model estimations in Step 2, implying stability of the model.
The authors further present the probability and count statistics of model estimation in Table 5. In terms of class count, one location previously in Class A moved to Class D, and the rest of the counts remained unchanged. The posterior probability of class membership shows that the model predictions for Classes A–E are all higher than the 95% level of accuracy. The prediction for Class F is 95.5%, with another 4.4% probability of its members belonging to Class A. Overall, the authors deemed that the extended model with the covariate retains the features of the original model without the covariate.
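As an illustration of how a classification table such as Table 5 can be assembled, the following minimal Python sketch assigns each location to its most likely class and then averages the posterior probabilities within each assigned class. The function name, the two class labels, and the four posterior vectors are hypothetical; they are not taken from the Mplus output used in this paper.

```python
from collections import defaultdict


def classification_table(posteriors, labels):
    """Average posterior probabilities (in %) over the members of each modal class.

    posteriors: list of per-location probability lists, one value per class.
    labels: class names in the same order as the probability columns.
    Returns {modal_class: (row_of_percentages, member_count)}.
    """
    members = defaultdict(list)
    for probs in posteriors:
        # Assign each location to its most likely (modal) class.
        modal = max(range(len(probs)), key=lambda k: probs[k])
        members[labels[modal]].append(probs)

    table = {}
    for name, rows in members.items():
        n = len(rows)
        means = [100.0 * sum(r[k] for r in rows) / n for k in range(len(labels))]
        table[name] = ([round(m, 1) for m in means], n)
    return table


# Hypothetical posterior probabilities for four locations and two classes.
posteriors = [
    [0.98, 0.02],
    [0.94, 0.06],
    [0.10, 0.90],
    [0.00, 1.00],
]
table = classification_table(posteriors, ["A", "D"])
for name, (row, count) in sorted(table.items()):
    print(name, row, count)
```

Each row of the resulting table sums to roughly 100%, and a diagonal entry close to 100% indicates that the members of that class are assigned with little ambiguity.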

Discussion

For the final model with the covariate, Fig. 9 further illustrates the
probability of class membership as a function of the covariate. For
any given firm-density value, the sum of membership probabilities
for all classes equals one. The authors also made use of the latent
class model developed in Step 3 and categorized the locations
according to their initial high-tech firm density in 2009 (see the three
vertical lines in Fig. 9). The three lines represent the average firm
density of the low (3.4/km²), middle (27.4/km²), and high
(74.7/km²) firm-density classes, respectively, which were derived
in Step 3. Here four findings are summarized.
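The dependence of class membership on the covariate can be sketched as follows, assuming the multinomial logit link that is commonly used when a covariate predicts latent class membership. The intercepts and slopes below are hypothetical values chosen for illustration only, not estimates from the final model; the three density values are the class averages from Step 3.

```python
import math


def membership_probabilities(density, intercepts, slopes):
    """Multinomial-logit class probabilities for one covariate value.

    A class whose intercept and slope are both 0 acts as the reference class.
    """
    logits = [a + b * density for a, b in zip(intercepts, slopes)]
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical coefficients for three classes (not estimates from the paper).
intercepts = [1.0, 0.2, 0.0]
slopes = [-0.10, 0.03, 0.0]

for density in (3.4, 27.4, 74.7):  # average densities of the three density classes
    probs = membership_probabilities(density, intercepts, slopes)
    print(density, [round(p, 3) for p in probs], round(sum(probs), 6))
```

Under this form the probabilities sum to one at every density value by construction, and a negative slope makes a class less likely as the initial density rises.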
First, for locations with low high-tech firm density in 2009, the probability of following either the Class A or D type of growth is about 79%, with a higher probability of being Class A (47%) than Class D (32%). Recall that Class A features strong self-sustaining

Fig. 7. Scatter plot of high-tech firm density in SMA by class membership

Fig. 8. Identified growth patterns of the 6-class model with covariate

Table 5. Classification Probability of Latent Class Membership by Latent Class and Class Count
Class    Class A (%)  Class B (%)  Class C (%)  Class D (%)  Class E (%)  Class F (%)  Sum (%)  Count
Class A         98.0          0.0          0.1          1.9          0.0          0.0    100.0     31
Class B          0.0        100.0          0.0          0.0          0.0          0.0    100.0      5
Class C          0.5          0.0         99.5          0.0          0.0          0.0    100.0      5
Class D          3.3          0.0          0.0         96.7          0.0          0.0    100.0     34
Class E          0.0          0.0          0.0          0.0        100.0          0.0    100.0      1
Class F          4.4          0.0          0.0          0.1          0.0         95.5    100.0      3

growth in which no significant decline is detected over the study period, whereas Class D represents a growth pattern that requires certain triggers to achieve further growth (Fig. 8). This finding suggests that for locations with low high-tech firm density, 47% of them tend to see continuous growth for a period up to the length of the study period, subject to the macro economy. However, this probability decreases quickly as the initial density rises, such that locations with middle and high firm density are unlikely to follow the self-sustaining growth of Class A. This may imply the turning point from agglomeration economies to diseconomies. In fact, when initial firm density approaches around 30/km² in 2009, the probability of being Class A drops to less than 5% (1% when firm density increases to 44/km²).
Second, as the initial firm density rises to the middle range, the probability of being Classes D and B increases. As the initial density approaches the middle-range threshold (27.4/km²), the probability of being Class D rises to more than 70%, suggesting that for most of the middle-range locations, further growth of the high-tech industry relies on certain triggers, most likely policy triggers in the context of SMA. Without the incentives, the self-sustaining growth of the local high-tech industry can hardly go beyond the middle range.
Third, as the initial high-tech firm density keeps increasing, the
probability of being Class D encounters a tipping point and turns to
decline. It implies that for locations with upper-middle to high firm
density in 2009, the probability of actually achieving the triggered
growth is dropping. Meanwhile, the probability of being Class B
(industrial decline) continues to grow. In fact, the monotonic
probability curve of Class B suggests that locations with higher
initial firm density are more likely to suffer industrial regression.
Last, Classes E and F feature significant fluctuations (in relative
terms) over time. This study shows that such growth patterns only
happen in locations with very low initial firm density. For locations
with initial firm density greater than 10 firms per square kilometer
in 2009, the Class E and F type of industrial growth is much less
likely to occur. In addition, the probability of being Class C (stag-
nation after big early growth) remains relatively stable across the
initial density range, albeit with the lower-middle locations having
a slightly higher probability.
To understand the spatial heterogeneity of high-tech industry
growth, the authors examined the geographic distribution of the
identified growth patterns in SMA, together with the mapping
of high-tech firm density in 2009 [Figs. 10(a and b)]. It showed
that the Class D locations were mostly located in Seoul and the
west coast, which were built-up areas with notable high-tech industrial establishments in 2009. By contrast, the Class A locations were generally outside the Seoul city, featuring relatively low

Fig. 9. Probability of class membership as a function of covariate

Fig. 10. (a) The geographic distribution of 6 identified classes in SMA; (b) high-tech firm density in 2009 (covariate) in SMA

high-tech firm density. The locations of other classes were generally dispersed with no significant spatial autocorrelation detected.
As a final caveat, it should be noted that the latent class models estimated in this paper are essentially period specific. This is because the growth trajectory data, though standardized, are period specific by nature. The covariate vector (high-tech firm density in 2009) is also a static measure that will change over time. In addition, the length of the study period for pattern recognition is a key boundary condition. For example, if the study period were shortened to two years (2009–2011), Class D in the final model would represent a stagnated pattern, rather than the triggered-growth pattern. In the extreme case of investigating growth on a yearly basis, the growth trajectory simply becomes unavailable. The authors thus argue that for pattern recognition purposes, the growth trajectories should be examined over a relatively long period, subject to data availability. Given the overall incremental growth of high-tech industry in SMA, the authors deem that the five-year study period (2009–2014) represents a reasonable length for generalizing the aggregate growth patterns of high-tech industry in SMA.

Conclusion

This paper makes use of the firm registration data in the Seoul metropolitan area to identify the generic growth patterns of the high-tech manufacturing industry through latent class analysis. The research also investigates the possible link between identified growth patterns and initial firm density. Latent class analysis is a well-established research method in the field of behavioral science, but it has rarely been applied in urban studies. This paper provides an empirical example of using latent class analysis to identify industrial growth patterns among locations with longitudinal firm count data. The application of the latent class analysis in the urban field may provide new insights into industrial growth within a cluster of interconnected locations and the possible role of policies at different stages of industry development. The authors expect that the methodology and the findings of this paper will contribute to industrial location choices and policy studies in similar metropolitan areas.
The analysis follows a data-driven explorative approach. To focus on the growth pattern, the standardized firm-density data between 2009 and 2014 were used to conduct the latent class analysis. A series of multiclass latent models were estimated with incremental numbers of classes. The model selection involved both statistical and substantive checks, as recommended in existing literature. For firm-density growth patterns, six distinctive classes of industry growth trajectories were identified. The authors also developed a 3-class model that divides all SMA locations into three groups according to their high-tech firm density in 2009 and 2014. The separation of the two latent models (for growth patterns and for firm density) is supported by the authors’ investigation of the data. The final model incorporates the initial high-tech firm density in 2009 as a continuous covariate.
The research found that locations with relatively low firm density tended to have continuous growth (Class A) for the local high-tech industry in SMA during 2009–2014. As the initial firm density rose, the probability of being Class A dropped significantly; instead, locations were more likely to follow the triggered-growth pattern (Class D) or encounter industrial regression (Class B). As the initial density approached the middle-range threshold (27.4/km² in 2009), the probability of being Class D rose to more than 70%, suggesting that for most of the middle-range locations, further growth of the high-tech industry would increasingly rely on certain triggers. The research also found that the probability of being Class D encounters a tipping point and turns to decline. It implies that for locations with upper-middle to high firm density in 2009, the probability of actually achieving the triggered growth is dropping; meanwhile, the probability of being Class B (industrial decline) is increasing. As a final caveat, the model estimations presented are essentially period specific. The authors argue that
for pattern recognition purposes, the growth trajectories should be examined over a relatively long period subject to data availability. The long study period would improve the temporal generality of the identified growth patterns.
In terms of the limitations of the paper and the directions for further study, first, the temporal generality of the identified growth patterns is not empirically verified. Extra data input is required to enable the test. If verified, the possible transitions between and the combinations of generic patterns can be explored. Second, given the identified growth patterns, the underlying causality for each pattern has not been broached. Such causality can be investigated by linking the growth pattern with observed location-specific attributes and policy variables. Third, the agglomeration thresholds (measured by firm density in this paper) for enabling self-sustaining growth promise to be an interesting topic that may inform policy making. Whether such agglomeration thresholds vary across world cities is another empirical question to be investigated.

Acknowledgments

This work was supported by the National Research Foundation of Korea Grant (NRF-2015R1A2A2A04005886, 2017R1A2B4003949).

References

Akaike, H. (1973). “Maximum likelihood identification of Gaussian autoregressive moving average models.” Biometrika, 60(2), 255–265.
Arauzo-Carod, J., Liviano-Solis, D., and Manjón-Antolín, M. (2010). “Empirical studies in industrial location: An assessment of their methods and results.” J. Reg. Sci., 50(3), 685–711.
Arauzo-Carod, J. M. (2005). “Determinants of industrial location: An application for Catalan municipalities.” Papers Reg. Sci., 84(1), 105–120.
Burns, A. F. (1934). Production trends in the United States, Bureau of Economic Research, New York.
Celeux, G., Biernacki, C., and Govaert, G. (1997). “Choosing models in model-based clustering and discriminant analysis.” J. Stat. Mech: Theory Exp., 64(1), 49–71.
Clogg, C. C. (1981). “New developments in latent structure analysis.” Factor analysis and measurement in sociological research, D. J. Jackson and E. F. Borgatta, eds., Sage, Beverly Hills, CA, 215–246.
Clogg, C. C. (1995). “Latent class models.” Handbook of statistical modeling for the social and behavioral sciences, Springer, Berlin, 311–359.
Cox, W. (2016). “Demographia world urban areas.” 〈http://www.demographia.com/db-worldua.pdf〉 (Dec. 2016).
Fujita, M., Krugman, P. R., and Venables, A. J. (2001). The spatial economy: Cities, regions, and international trade, MIT Press, Cambridge, MA.
Glaeser, E. L. (1999). “Learning in cities.” J. Urban Econ., 46(2), 254–277.
Gold, B. (1964). “Industry growth patterns: Theory and empirical results.” J. Ind. Econ., 13(1), 53–73.
Goodman, L. A. (1974). “Exploratory latent structure analysis using both identifiable and unidentifiable models.” Biometrika, 61(2), 215–231.
Jeffries, N. O. (2003). “A note on ‘Testing the number of components in a normal mixture’.” Biometrika, 90(4), 991–994.
Jung, T., and Wickrama, K. A. (2008). “An introduction to latent class growth analysis and growth mixture modeling.” Social Personality Psychol. Compass, 2(1), 302–317.
KOSIS (Korean Statistical Information Service). (2010). “Korean statistical information service: Statistical database.” 〈http://kosis.kr/〉 (Dec. 2010).
KOSIS (Korean Statistical Information Service). (2016). “Korean statistical information service.” 〈http://kosis.kr/〉 (Dec. 2016).
Kreuter, F., and Muthén, B. (2007). “Longitudinal modeling of population heterogeneity: Methodological challenges to the analysis of empirically derived criminal trajectory profiles.” Advances in latent variable mixture models, G. R. Hancock and K. M. Samuelsen, eds., Information Age Publishing, Charlotte, NC, 53–75.
Lazarsfeld, P. F., and Henry, N. W. (1968). Latent structure analysis, Houghton Mifflin, Boston.
Lo, Y., Mendell, N. R., and Rubin, D. B. (2001). “Testing the number of components in a normal mixture.” Biometrika, 88(3), 767–778.
Marshall, A. (1890). Principles of political economy, Macmillan, New York.
McLachlan, G., and Peel, D. (2000). Finite mixture models, Wiley, New York.
Mplus [Computer software]. Muthén & Muthén, Los Angeles.
Muthén, B. (2001). “Latent variable mixture modeling.” New developments and techniques in structural equation modeling, G. A. Marcoulides and R. E. Schumacker, eds., Lawrence Erlbaum Associates, Mahwah, NJ, 1–33.
Muthén, B. (2003). “Statistical and substantive checking in growth mixture modeling: Comment on Bauer and Curran (2003).” Psychol. Methods, 8(3), 369–377.
Muthén, B., and Asparouhov, T. (2011). “LTA in Mplus: Transition probabilities influenced by covariates.” Mplus Web Notes, 13, 1–30.
Muthén, B., and Muthén, L. K. (2000). “Integrating person-centered and variable-centered analyses: Growth mixture modeling with latent trajectory classes.” Alcohol.: Clin. Exp. Res., 24(6), 882–891.
Muthén, B., and Shedden, K. (1999). “Finite mixture modeling with mixture outcomes using the EM algorithm.” Biometrics, 55(2), 463–469.
Nagin, D. S. (1999). “Analyzing developmental trajectories: A semiparametric, group-based approach.” Psychol. Methods, 4(2), 139–157.
Nagin, D. S., and Land, K. C. (1993). “Age, criminal careers, and population heterogeneity: Specification and estimation of a nonparametric, mixed Poisson model.” Criminology, 31(3), 327–362.
Nylund, K. L., Asparouhov, T., and Muthén, B. O. (2007). “Deciding on the number of classes in latent class analysis and growth mixture modeling: A Monte Carlo simulation study.” Struct. Equ. Model., 14(4), 535–569.
Ram, N., and Grimm, K. J. (2009). “Methods and measures: Growth mixture modeling: A method for identifying differences in longitudinal change among unobserved groups.” Int. J. Behav. Dev., 33(6), 565–576.
Schwarz, G. (1978). “Estimating the dimension of a model.” Ann. Stat., 6(2), 461–464.
Titterington, D. M., Smith, A. F. M., and Makov, U. E. (1985). Statistical analysis of finite mixture distributions, Wiley, New York.
Vermunt, J. K., and Magidson, J. (2002). Appl. Latent Class Anal., J. A. Hagenaars and A. L. McCutcheon, eds., Cambridge University Press, Cambridge, U.K.
Wang, M., and Hanges, P. J. (2011). “Latent class procedures: Applications to organizational research.” Organ. Res. Methods, 14(1), 24–31.
