
Comparative assessment of CORINE2000 and GLC2000:

Spatial analysis of land cover data for Europe


K. Neumann a,*, M. Herold b, A. Hartley c, C. Schmullius d
a Wageningen University, P.O. Box 47, 6700 AA Wageningen, The Netherlands
b Friedrich-Schiller-University Jena, Loebdergraben 32, 07743 Jena, Germany
c Joint Research Centre of the European Commission, Via Enrico Fermi, 1, Ispra, Italy
d Friedrich-Schiller-University Jena, Loebdergraben 32, 07743 Jena, Germany
Received 11 November 2005; accepted 19 February 2007
Abstract
Given the current lack of interoperability between global and regional land cover products, efforts are underway to link the new
European global land cover map (GLOBCOVER) with the existing global land cover 2000 map (GLC2000) and the European CORINE
mapping initiative. Since the two datasets apply different mapping standards, a key to successful implementation is a thorough
understanding of the heterogeneities between them. Thus, this paper provides an assessment of compatibilities and
differences between the CORINE2000 and GLC2000 datasets. The comparative assessment considers inconsistencies between the
thematic legends (using the UN land cover classification system, LCCS), class-specific accuracies, and the spatial resolution and
heterogeneity of the datasets. The results are summarized with implications for the development of the new GLOBCOVER datasets.
© 2007 Elsevier B.V. All rights reserved.
Keywords: Land cover; Interoperability; Harmonisation; GLC2000; CORINE2000; GLOBCOVER
1. Introduction
The increasing need for comprehensive and reliable
information on land cover and land cover dynamics has
led to the development of several global land cover
datasets derived from satellite Earth observation. Their
development was driven by different national or
international initiatives and programmes, and the
variety of mapping standards reflects the wide range
of interests, requirements and methodologies of the
originating programmes (Herold et al., 2006a).
Available data products include the global land cover
for the year 2000 (GLC2000, Bartholome and Belward,
2005) and the European dataset CORINE2000 (JRC,
2004; EEA, 2005a). A new global land cover dataset
will be GLOBCOVER derived from ENVISAT-MERIS
data for the year 2005.
So far, there is only limited compatibility and
comparability between different land cover maps and
their thematic legends; they rather exist as independent
datasets. In general, heterogeneity in land cover maps
results from different underlying methods and standards
and has multiple facets. They include syntactic issues
(e.g. logical data models: vector/raster), schematic
heterogeneity (e.g. database models, spatial reference
systems, cartographic standards including variable
minimum mapping units and mixed units) and semantic
aspects (Bishr, 1998). Different mapping methodologies
make it difficult to separate actual land changes from
changes that result from a different methodology used
to create the map. Semantic inconsistencies may be a
problem for time series analysis of land cover or land
use to monitor environmental change and for initiatives
that aim to react to environmental change (Comber et
al., 2004). The problem of semantic discrepancies has
been addressed by several authors. While some of them
seek harmonization guidelines to facilitate future
mapping efforts (Bennett, 2001; Jansen, 2005; Herold
et al., 2006a), others show how to tackle inconsistencies
between already existing maps to make them comparable
(Comber et al., 2005; Hagen, 2003; Visser, 2004).
www.elsevier.com/locate/jag
International Journal of Applied Earth Observation
and Geoinformation 9 (2007) 425–437
* Corresponding author. Tel.: +31 317 482430;
fax: +31 317 482419.
E-mail addresses: kathleen.neumann@wur.nl (K. Neumann),
m.h@uni-jena.de (M. Herold), andrew.hartley@jrc.it (A. Hartley),
c.schmullius@uni-jena.de (C. Schmullius).
0303-2434/$ – see front matter © 2007 Elsevier B.V. All rights reserved.
doi:10.1016/j.jag.2007.02.004
Since the 1980s, international efforts have been
made aiming at harmonization of existing and future
land cover datasets to support operational Earth
observation of land with the goal to overcome current
limitations of land cover dataset compatibility and
comparability. One of the key drivers in the land cover
harmonization process is the global observation of
forest cover and land dynamics (GOFC-GOLD) as a
platform for international communication and coordination.
GOFC-GOLD, in cooperation with the global
terrestrial observing system (GTOS), develops and
provides methodological and organizational resources
for joint international progress in this arena (Herold and
Schmullius, 2004).
A major objective is to assist ongoing and upcoming
mapping initiatives to foster consistent and comparable
ways for creating land cover maps (Herold et al.,
2006a). GLOBCOVER is such a new land cover
initiative, expected to open a new era of consistent
global land cover assessment. Keys to GLOBCOVER's
success are the improved spatial resolution of
300 m (compared to 1 km in existing maps), the premise
of developing the map product based on a thorough
understanding of existing datasets, a harmonized legend
based on common land cover classifiers,
and the consideration of known challenges in coarse
scale land cover mapping. This will improve the quality
of the GLOBCOVER products and extend the potential
field of applications.
From a European perspective, GLOBCOVER is
expected to complement and extend the two major
coarse scale land cover mapping efforts: CORINE and
GLC2000. Both initiatives have followed rigorous
internal mapping standards. For example, CORINE
is a regional programme and its legend reflects
thematic definitions and detail of land categorization
covering both land cover and land use characteristics. In
contrast, the GLC2000 map covers the whole globe and
uses the UN land cover classification system (LCCS)
developed for describing land cover characteristics.
The aim of making GLOBCOVER, as far as possible,
interoperable with both existing map products is ambitious
and has to start from a thorough understanding of the
characteristics and inconsistencies of related land cover
information. It has to be understood what causes
the heterogeneities among existing datasets and which
factors are responsible. This study focuses on a comparison
of the CORINE2000 and GLC2000 datasets. We
describe their agreement and assess the factors
driving the disagreement. The investigations consider
several aspects: land cover heterogeneity and spatial
resolution effects, semantic land cover definitions and
thematic similarities, and known map accuracy measurements.
These are assessed in relation to the
measured disagreement between GLC2000 and CORINE2000
to gain understanding and to develop
strategies for GLOBCOVER.
2. Data and methods
2.1. CORINE2000 dataset
The CORINE programme (Coordination of Information
on the Environment) was established in 1985 by the
European Commission. One of the major tasks has been
the establishment of the CORINE Land Cover Project.
The two main objectives are (1) to provide quantitative
data on land cover, consistent and comparable across
Europe for all interested in the European environmental
policy, and (2) to prepare one comprehensive digital
land cover database covering the 25 EU member states
and other European and North African countries. The
mapping is based on the CORINE nomenclature and
interpretation methods at an original scale of 1:100,000.
The nomenclature comprises 44 land cover classes on
three levels at a minimum mapping unit of 25 ha. The
datasets have been created under the responsibility of
each EU member state by on-screen interpretation and
digitizing of Landsat images in a GIS environment. The
European wide product was produced by merging the
consistent national products to one dataset (EEA,
2005b). Fig. 1 provides a detailed view of the
CORINE2000 dataset for the Berlin region and shows
the full legend for the third level.
The first CORINE1990 land cover dataset was
released in 1999; an updated and extended version
followed in 2000. The new CORINE2000 dataset,
representing land cover for the period from 1999 to
2001 as well as land cover changes, became available
in 2005. The dataset, with a spatial resolution of 250 m,
can be downloaded free of charge from
http://dataservice.eea.eu.int/dataservice/metadetails.asp?id=678.
2.2. Global land cover of the year 2000 (GLC2000)
The global land cover for the year 2000 (GLC2000)
project, coordinated by the European joint research
centre (JRC), provides consistent global land cover
information for the year 2000. In contrast to former
global mapping initiatives, the GLC2000 map development
followed a bottom-up approach. Eighteen regional
land cover map products were derived by regional
experts and merged to a global map.
To ensure consistency of all regional land cover
classifications, each regional product was derived by
classifying the SPOT-4 VEGETATION dataset. Furthermore,
each regional partner applied the land cover
classification system (LCCS) produced by the United
Nations (UN). The global GLC2000 product was
produced by harmonizing and merging the individual
regional products into one global product with a generalized
legend (Bartholome and Belward, 2005). Different
parts of Europe were covered by five different regional
map products, usually with more detailed or regionally
specific legends than the global one. Fig. 1 provides a
detailed view of the GLC2000 dataset for the Berlin
region together with the full legend (JRC, 2004). The
regional and global GLC2000 land cover datasets are
available free of charge from
http://www-gvm.jrc.it/glc2000/ProductGLC2000.htm.
Since GLC2000 provides global coverage, some
classes rarely appear in Europe. Seven of the 22 global
classes in GLC2000 each represent 0.2% or less of the
area covered by CORINE2000 (Table 1). They are
considered to be of minor importance when global scale
data are studied in the context of Europe and are thus
disregarded in this analysis. Furthermore, the class water
bodies is excluded. The dominance of the GLC2000
class 'Cultivated and managed areas' throughout the
European continent is striking. Forests are limited
to broadleaf deciduous, evergreen needle-leaved and
mixed wooded areas. Europe also shows a significantly
higher degree of urbanization than the rest of the world.
Fig. 1. Detail of the CORINE2000 and GLC2000 datasets for the Berlin region with complete legends.
Table 1
GLC2000 classes with less than 0.2% spatial coverage of the CORINE
mapping area
Code  Area (%)  GLC2000 class
1     0.03      Tree cover, broadleaved, evergreen
5     0.00      Tree cover, needle-leaved, deciduous
7     0.01      Tree cover, regularly flooded, fresh water
8     0.00      Tree cover, regularly flooded, saline water
9     0.15      Mosaic: tree cover/other natural vegetation
10    0.00      Burnt area
21    0.04      Snow and ice (natural and artificial)
2.3. Land cover classification system (LCCS)
The land cover classification system (LCCS),
developed by the United Nations, provides an appropriate
framework for land cover legend development
and translation using a flexible but standardized set of
classifiers and thresholds. LCCS is an a priori
classification system that is based on independent
and universally valid land cover diagnostic criteria,
rather than on predefined specific land cover classes. It
can be used to describe land cover features all over the
world at any scale or level of detail with an absolute
level of standardization of class definitions between
different users (Di Gregorio, 2005). LCCS is currently
evolving as an internationally agreed standard for land
cover characterization (Herold et al., 2006b).
As part of its harmonization activities, the ESA
GOFC-GOLD land cover project office, in cooperation
with the JRC, used LCCS as the key tool for the comparative
assessment of GLC2000 and CORINE2000. LCCS was
used to reveal and explain thematic similarities and
inconsistencies between the class definitions of both
datasets. The CORINE legend was translated to LCCS,
which re-described the classes from a land cover
perspective. Based on this translation, the GLC2000 and
CORINE class definitions could be directly compared
in terms of their land cover descriptions. While some
classes translate easily, others have complex land cover
definitions with a variety of thematic and
cartographic mixtures (Herold and Schmullius, 2004).
This again emphasizes the CORINE mapping approach
of visual satellite data interpretation (30 m × 30 m
pixels and 25 ha minimum mapping unit) and of delineating
many categories with a thematic focus on land use.
Land use categories usually aggregate a mix of
different land cover types. Furthermore, several
CORINE classes contain specific extensions that add
to the complexity of the LCCS definitions and
complicate the translation/comparison process.
A key component in using LCCS is a set of common
land cover classifiers. Since it has been challenging to
find general agreement on defining land cover classes,
the approach is to agree on the terminology and
common classifiers rather than categories. The common
classifiers defined by LCCS and used in this study were:
(1) Vegetation/non-vegetation.
(2) Terrestrial/aquatic and regularly flooded.
(3) Natural and semi-natural/managed and artificial.
(4) Life form/surface type (trees, shrubs, herbaceous,
bare, water, snow/ice).
(5) Leaf type (broadleaf, needleleaf, mixed, none).
(6) Vegetation density (0–15%, 15–40%, 40–65%,
65–100%, none).
(7) Land use types (urban, agriculture, none).
An additional classifier, phenology or leaf
longevity, was not used. It was excluded since CORINE did not
consider this information. By translating both
legends into LCCS, each classifier can be determined
for each class. This information is used to assess the
thematic similarity between the different categories
(Section 2.4.2).
2.4. Comparing GLC2000 and CORINE2000
The comparative assessment of GLC2000 and
CORINE2000 focused on highlighting and explaining
thematic and spatial differences between both datasets.
The starting point was the determination of areas of spatial
agreement between GLC2000 and the respective CORINE
classes. Given the LCCS legend translation, a crosswalk
between the CORINE and GLC2000 classes was
developed (Table 2). This comparison does not imply
a direct correspondence between the classes. Rather, it
considers common thematic agreement and similarity
among the categories, i.e. it provides the obvious choice
if a GLC2000 category is to be related to a specific
CORINE class. In other words, the correspondence of
each GLC2000 class to a certain CORINE2000 class is
based solely on the class definitions. Table 2 considers
both class agreement and similarity, with the latter
referring to significant thematic overlap between two
classes.
For a more detailed comparison, three different
measures are introduced in this study: correspondence
based on definition, thematic similarities between all
land cover classes, and a confusion matrix.
An important issue is understanding the
differences between the two datasets. Both GLC2000 and
CORINE2000 are based on satellite images. These
images, however, were acquired by different sensors
(CORINE2000: Landsat ETM+; GLC2000: SPOT
VEGETATION), which were calibrated differently and
which detect the spectral characteristics of the Earth
surface using different wavelengths. The two satellite data
sources have different spatial resolutions, which naturally
influences the representation of spatial detail, especially
in areas with heterogeneous land cover types such as
CORINE classes 324 (Transitional woodland-scrub),
241 (Annual crops associated with permanent crops),
242 (Complex agricultural pattern), and 243 (Land
principally occupied by agriculture, with significant
areas of natural vegetation).
The spatial resolution of the two satellite image
sources differs significantly. SPOT VEGETATION provides
data with a resolution of 1 km, whereas the 30 m
resolution of Landsat allows for the detection of more
spatial detail (EEA, 2005a; JRC, 2004). Different
interpretation methods, such as manual interpretation of
the Landsat data versus a (semi-)automatic classification
of the SPOT data, as well as the use of different
classification systems, lead to further differences
between the GLC2000 and CORINE2000 maps.
Because of these crucial differences, the classes of the
two databases cannot fully match. The following
sections explore in more detail the reasons for
the differences between both datasets.
Table 2
Summary of correspondence of GLC2000 classes to CORINE classes
GLC2000 global class
    CORINE class (classes in italics show similarity, not agreement), indented
    beneath the GLC2000 class or group of classes it corresponds to
1: Tree cover, broadleaved evergreen, closed to open (>15%)
2: Tree cover, broadleaved deciduous, closed (>40%)
3: Tree cover, broadleaved deciduous, open (15–40%)
    311 broad-leaved forests
4: Tree cover, needleleaved evergreen, closed to open (>15%)
5: Tree cover, needleleaved deciduous, closed to open (>15%)
    312 coniferous forests
6: Tree cover, mixed leaf type, closed to open (>15%)
    313 mixed forests
7: Tree cover, closed to open (>15%), regularly flooded, fresh or brackish: swamp forests
    31 forests
    411 inland marshes
8: Tree cover, closed to open (>15%), regularly flooded, saline water: mangrove forests
    31 forests
9: Mosaic of tree cover and other natural vegetation (incl. crop component)
    324 transitional woodland-scrub
    31 forests
10: Tree cover, burnt (boreal forests)
    334 burnt areas
11: Shrub cover, closed to open (>15%), evergreen (broadleaved or needleleaved)
    322 moors and heathland
    323 sclerophyllous vegetation
    324 transitional woodland-scrub
12: Shrub cover, closed to open (>15%), deciduous (broadleaved)
    322 moors and heathland
    324 transitional woodland-scrub
13: Herbaceous cover, closed to open (>15%)
    231 pastures
    321 natural grasslands
14: Sparse herbaceous or shrub cover (0–15%)
    333 sparsely vegetated areas
    322 moors and heathland
    332 bare rocks
15: Regularly flooded (>2 month) shrub and/or herbaceous cover, closed to open
    411 inland marshes
    412 peat bogs
    421 salt marshes
16: Cultivated and managed areas
    21 arable land
    22 permanent crops
    241 annual crops associated with permanent crops
    242 complex agricultural pattern
    244 agro-forestry areas
    231 pastures
17: Mosaic of cropland/tree cover/other natural vegetation
    243 land principally occupied by agriculture, with significant areas of natural vegetation
    231 pastures
18: Mosaic of cropland/shrub cover/herbaceous cover
    243 land principally occupied by agriculture, with significant areas of natural vegetation
    231 pastures
19: Bare areas
    331 beaches, dunes, and sand plains
    332 bare rocks
    333 sparsely vegetated areas
20: Water bodies (natural and artificial)
    5 water bodies
    423 intertidal flats
21: Snow and ice (natural and artificial)
    335 glaciers and perpetual snow
22: Artificial surfaces and associated areas
    1 artificial surfaces
    422 salines
Corresponding CORINE classes in standard letters indicate an agreement, classes marked in italics indicate a similarity (see also Table 3).
2.4.1. Spatial agreement between GLC2000 and
CORINE2000
The spatial agreement between GLC2000 and
CORINE2000 classes was determined through a spatial
overlay at 250 m pixel resolution, resulting in a
confusion matrix in which class combinations are labelled
as agreement, similarity or disagreement (Tables 2 and 3). Class
labelling was performed subjectively, based on a
comparison of the descriptors of the two legends. The
matrix reflects the area percentage of each GLC2000
class covered by the individual CORINE categories. For
example, the GLC2000 class 'Artificial surfaces' can be
assigned to the CORINE class group 'Artificial
surfaces', consisting of eleven subclasses. The agreement
between these GLC2000/CORINE classes is
70.5% (Table 3). Thus, more than two-thirds of the
GLC2000 class 'Artificial surfaces' overlaps with the
respective CORINE class 'Built up area'. The remaining
29.5% of the GLC2000 class 'Artificial surfaces' covers
other CORINE classes, for example 'Non-irrigated
arable land' (Table 3). The total agreement between
GLC2000 and CORINE2000 is 57%.
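The overlay described above can be sketched as a simple cross-tabulation. The following Python fragment is a minimal illustration only, assuming that both maps have already been resampled to a common 250 m grid and are held as NumPy arrays of class codes. The function names, the toy rasters and the three-class crosswalk are hypothetical and are not part of the original processing chain.

```python
import numpy as np

def confusion_matrix_percent(glc, corine, glc_codes, corine_codes):
    """Cross-tabulate two co-registered class rasters.

    Returns the pixel counts and a matrix of row percentages: each row is a
    GLC2000 class and holds the share of its pixels per CORINE class.
    """
    glc = np.asarray(glc).ravel()
    corine = np.asarray(corine).ravel()
    counts = np.zeros((len(glc_codes), len(corine_codes)), dtype=float)
    for i, g in enumerate(glc_codes):
        in_class = glc == g
        for j, c in enumerate(corine_codes):
            counts[i, j] = np.count_nonzero(in_class & (corine == c))
    row_totals = counts.sum(axis=1, keepdims=True)
    # Leave rows of classes absent from the overlay at zero.
    percent = np.divide(100.0 * counts, row_totals,
                        out=np.zeros_like(counts), where=row_totals > 0)
    return counts, percent

def total_agreement(counts, glc_codes, corine_codes, crosswalk):
    """Share (%) of all pixels whose class pair is listed in the crosswalk."""
    agree = 0.0
    for i, g in enumerate(glc_codes):
        for j, c in enumerate(corine_codes):
            if c in crosswalk.get(g, ()):
                agree += counts[i, j]
    return 100.0 * agree / counts.sum()

# Toy 3 x 4 rasters (hypothetical values, not taken from either dataset).
glc = np.array([[16, 16, 22, 22],
                [16, 16, 22, 16],
                [4, 4, 4, 16]])
corine = np.array([[211, 211, 112, 211],
                   [211, 231, 112, 211],
                   [312, 312, 311, 211]])
glc_codes = [4, 16, 22]
corine_codes = [112, 211, 231, 311, 312]
# Illustrative subset of a crosswalk in the spirit of Table 2.
crosswalk = {4: {312}, 16: {211, 231}, 22: {112}}

counts, percent = confusion_matrix_percent(glc, corine, glc_codes, corine_codes)
overall = total_agreement(counts, glc_codes, corine_codes, crosswalk)
```

With these toy rasters, ten of the twelve pixels fall into class pairs listed in the crosswalk, so `overall` is about 83.3%, and each row of `percent` sums to 100.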
2.4.2. Thematic similarities between GLC2000 and
CORINE classes
A quantitative approach to defining affinities between
both legends can be based on the LCCS
translations. The thematic similarities were defined
for each GLC2000/CORINE land cover class combination
considering the seven LCCS classifiers
given above. Thematic similarities were defined in addition to
correspondence (Table 2) since corresponding
GLC2000/CORINE2000 classes do not necessarily have a
high thematic similarity, and vice versa.
A similarity matrix is derived for
each of the seven common LCCS classifiers
used in this study (see Section 2.3). Each classifier
matrix contains three possible values: zero for no agreement, 0.5
for partial agreement, and 1 for full agreement. For the
classifier leaf type, for example, if both classes strictly
have the same leaf type (e.g. broadleaf), the score is one;
a mixed leaf type would get a score of 0.5, and a needle-leaf
category and all others that do not have broadleaf
character would receive a score of zero. The overall
thematic similarity is the sum of the matrices for all
classifiers (vegetation/non-vegetation, terrestrial/aquatic
and regularly flooded, natural and semi-natural/
managed and artificial, life form/surface type, leaf type,
vegetation density, land use types) divided by the
number of classifiers (seven). Thus, thematic similarity
can range between 0 (= absolute disagreement) and 1
(= complete similarity). The resulting similarity matrix is
shown in Table 4.
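The scoring scheme can be illustrated with a short Python sketch. It assumes, for illustration, that each class is encoded as the set of admissible values per classifier and that a score of 1 is given for identical sets, 0.5 for overlapping sets and 0 for disjoint sets. This rule reproduces the leaf type example above but is an assumption, as are the classifier keys and the two example encodings, which are hypothetical and not the actual LCCS translations.

```python
CLASSIFIERS = (
    "vegetation", "terrestrial_aquatic", "natural_managed",
    "life_form", "leaf_type", "density", "land_use",
)

def classifier_score(a, b):
    """Score one classifier: 1 = same values, 0.5 = partial overlap, 0 = none."""
    a, b = frozenset(a), frozenset(b)
    if a == b:
        return 1.0
    if a & b:
        return 0.5
    return 0.0

def thematic_similarity(class_a, class_b):
    """Mean of the seven per-classifier scores (0 = disagreement, 1 = identical)."""
    scores = [classifier_score(class_a[k], class_b[k]) for k in CLASSIFIERS]
    return sum(scores) / len(CLASSIFIERS)

# Hypothetical encodings of a closed broadleaved forest class in each legend.
glc_like = {
    "vegetation": {"vegetated"},
    "terrestrial_aquatic": {"terrestrial"},
    "natural_managed": {"natural"},
    "life_form": {"trees"},
    "leaf_type": {"broadleaf"},
    "density": {"40-65", "65-100"},
    "land_use": {"none"},
}
corine_like = dict(
    glc_like,
    natural_managed={"natural", "managed"},   # partial overlap -> 0.5
    density={"15-40", "40-65", "65-100"},     # partial overlap -> 0.5
)

similarity = thematic_similarity(glc_like, corine_like)
```

In this example five classifiers agree fully and two agree partially, so `similarity` equals 6/7, or about 0.86.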
Table 4 shows that, except for
some of the urban classes, there is no complete
agreement between the classes of both datasets. This
emphasizes the difference in thematic definitions between
the two maps. Classes with a similarity score of zero do
not agree in any of the studied classifiers, such as
GLC2000 bare areas versus CORINE rice fields.
As expected, there is a general thematic affinity
between the agreeing classes (Tables 2 and 3) and the
thematic similarity scores (Table 4), i.e. a CORINE
class and a corresponding GLC2000 class usually
share the largest amount of thematic similarity. The
most prominent exception is the CORINE class green
urban areas which, in terms of thematic agreement,
corresponds to at most 64% with a number of
GLC2000 classes, although the class corresponds to
the GLC2000 class 'Artificial surfaces and associated
areas'.
Table 3
Confusion matrix between GLC2000 and CORINE2000 categories considered in this study
The individual values are in area percent for each GLC2000 class. Fields with green background show class agreement, orange background shows
class similarity according to Table 2. A pink background highlights class combinations with no agreement but a large amount of confusion, i.e.
spatial agreement with a land class not comparable in terms of their thematic class similarities (10%).
Table 4
Thematic similarity matrix between GLC2000 and CORINE2000 categories considered in this study
The individual values are calculated from an aggregate value of seven common land cover classifiers determined in LCCS. Fields with bold border
show agreement classes according to Table 3.
Further investigations used an aggregate measure of
similarity for each GLC2000 category. In general, it
is important to study the amount of thematic
agreement among corresponding classes (Table 4). For
a better understanding of the spatial disagreement
between the two datasets, we also investigated the thematic
similarity with classes that do not agree. The median
thematic similarity of all CORINE classes that
do not correspond to a specific GLC2000 class was
calculated to represent the amount of thematic
confusion. For example, GLC2000 class 2 (tree cover,
broadleaved, deciduous, dense) corresponds to one
specific CORINE class (broadleaf forest) with a
thematic similarity of 0.96. The median of the similarity
values of all other (non-corresponding) CORINE
classes is 0.54 and is considered an aggregate measure
of the confusion of this class from a semantic
perspective. The semantic confusion for GLC2000
class 22 (artificial surfaces and associated areas) is
about half as much as for the GLC2000 classes 11
(shrub cover, closed-open, evergreen) and 13
(herbaceous cover, closed-open). While the spatial agreement
between GLC2000 and CORINE2000 considers spatial
aspects, the thematic similarities do not consider the
spatial distribution of corresponding classes. The values
for each class are presented in Table 5.
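The aggregate measure of semantic confusion described in this section can be sketched in a few lines of Python. The function name and the example row are hypothetical: only the value 0.96 for the corresponding class is taken from the text, and the remaining similarity values are invented for illustration.

```python
from statistics import median

def semantic_confusion(similarity_row, corresponding):
    """Median thematic similarity over all non-corresponding CORINE classes.

    similarity_row maps CORINE class codes to the thematic similarity with
    one GLC2000 class; corresponding is the set of codes the crosswalk
    assigns to that GLC2000 class.
    """
    others = [s for code, s in similarity_row.items()
              if code not in corresponding]
    return median(others)

# Hypothetical row for one GLC2000 class (only 0.96 is quoted in the text).
row = {311: 0.96, 312: 0.75, 211: 0.5, 112: 0.25}
confusion = semantic_confusion(row, corresponding={311})
```

Here the corresponding class 311 is left out and the median of the three remaining values is 0.5.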
2.4.3. Spatial homogeneity
Local neighbourhood analyses are a valuable tool for
determining the heterogeneity of a dataset by
analysing pixel variability with respect to the
neighbourhood. Such analyses were performed on the
GLC2000 dataset to show which land cover classes are
spatially more homogeneously structured than others.
The spatial heterogeneity or local class diversity was
calculated using a local 3 × 3 kernel. The heterogeneity
value can range between one, i.e. none of the eight
neighbour pixels differs from the core pixel (= maximum
local homogeneity), and eight, i.e. all neighbour
pixels differ from the core pixel (= maximum local
heterogeneity). This information is used to show
whether the spatial agreement between GLC2000 and
CORINE2000 depends on the heterogeneity of the land
cover classes. For this, the percentage of the class pixels
located in homogeneous areas (local homogeneity
value = 1) was calculated (Table 5). Classes representing
a homogeneous pattern are bare areas, sparsely
vegetated areas, and cultivated and managed areas.
Most heterogeneous are the mixed agricultural classes
and open broadleaf deciduous forests.
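A neighbourhood analysis of this kind can be sketched with NumPy. The fragment below assumes that the local class diversity is the number of distinct class labels within the 3 × 3 window, so that a value of 1 marks a fully homogeneous neighbourhood; the exact definition used for Table 5 may differ. The function names, the edge handling (border pixels are replicated) and the toy raster are assumptions made for illustration.

```python
import numpy as np

def local_class_diversity(classes):
    """Number of distinct class labels in each pixel's 3 x 3 neighbourhood."""
    classes = np.asarray(classes)
    padded = np.pad(classes, 1, mode="edge")  # replicate border pixels
    rows, cols = classes.shape
    # Stack the nine shifted views of the raster: shape (9, rows, cols).
    windows = np.stack([padded[r:r + rows, c:c + cols]
                        for r in range(3) for c in range(3)])
    ordered = np.sort(windows, axis=0)
    # Distinct labels = number of label changes in the sorted stack + 1.
    return 1 + np.count_nonzero(np.diff(ordered, axis=0), axis=0)

def homogeneous_share(classes, diversity, code):
    """Percentage of the pixels of one class with local diversity equal to 1."""
    in_class = np.asarray(classes) == code
    homogeneous = in_class & (diversity == 1)
    return 100.0 * np.count_nonzero(homogeneous) / np.count_nonzero(in_class)

# Toy raster (hypothetical): three columns of class 16, one column of class 19.
raster = np.array([[16, 16, 16, 19]] * 4)
diversity = local_class_diversity(raster)
share_16 = homogeneous_share(raster, diversity, 16)
share_19 = homogeneous_share(raster, diversity, 19)
```

In the toy raster, the two left-hand columns see only class 16, so eight of the twelve class 16 pixels (about 66.7%) count as homogeneous, while every class 19 pixel has a class 16 neighbour and `share_19` is 0.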
2.4.4. Classification accuracy of GLC2000
The validation of GLC2000 has provided statistically
robust accuracy information for the global
classes. Validation results show an overall accuracy
of 68.6%, with a very low accuracy for the mosaic
classes (Mayaux et al., 2006). The GLC2000 confusion
matrix was analysed in terms of the producer's accuracies of
each GLC2000 land cover class. The producer's
accuracy was derived by dividing the total number
of correct sample units in a land cover class by the total
number of sample units of this class in the reference
data. The producer's accuracy therefore indicates the
probability of a reference sample unit being
correctly classified (Congalton, 2003), and provides a
good indicator of general mapping performance for
each class from the map producer's perspective. Both the
producer's and user's accuracies are reported in Table 5.
For three classes, no robust class-specific
accuracies are reported (Mayaux et al., 2006).
It should be noted that the GLC2000 accuracy estimates
were derived for the global dataset. The
specific mapping errors may differ if only Europe is
considered. However, these are the only statistically robust
validation data available.
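The accuracy measures defined above follow directly from a confusion matrix. The Python sketch below assumes the common convention that rows hold the mapped classes and columns the reference classes; the matrix values are hypothetical and do not come from the GLC2000 validation, and any sample weighting used there is not reproduced.

```python
import numpy as np

def accuracy_measures(confusion):
    """Producer's, user's and overall accuracy from a confusion matrix.

    Rows are mapped classes, columns are reference classes.
    """
    confusion = np.asarray(confusion, dtype=float)
    correct = np.diag(confusion)
    producers = correct / confusion.sum(axis=0)  # correct / reference totals
    users = correct / confusion.sum(axis=1)      # correct / mapped totals
    overall = correct.sum() / confusion.sum()
    return producers, users, overall

# Hypothetical three-class validation sample (150 sample units).
confusion = [[45, 5, 0],
             [10, 30, 10],
             [5, 5, 40]]
producers, users, overall = accuracy_measures(confusion)
```

For this sample the producer's accuracies are 0.75, 0.75 and 0.80, the user's accuracies are 0.90, 0.60 and 0.80, and the overall accuracy is 115/150, or about 0.77.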
Table 5
Characteristics for each GLC2000 class considered in this study
Agree. = spatial agreement; Area = class area; Them. sim. = thematic similarity; Homog. = spatial homogeneity;
Sem. conf. = semantic confusion (median); Prod./User = producer's/user's accuracy; a dash marks classes without reported accuracies.
Code  Agree. (%)  Area (%)  Them. sim.  Homog. (%)  Sem. conf.  Prod. (%)  User (%)  GLC2000 class
2     39.2        9.35      0.50        32.8        0.54        96.7       35.4      Tree cover, broadleaved, deciduous, dense
3     26.1        0.45      0.50        2.7         0.50        15.2       19.8      Tree cover, broadleaved, deciduous, open
4     53.8        20.01     0.43        43.1        0.43        92.8       47.0      Tree cover, needle-leaved, evergreen
6     17.8        9.80      0.50        14.4        0.50        37.1       94.0      Tree cover, mixed leaf type
11    32.6        0.52      0.46        31.4        0.57        -          -         Shrub cover, closed-open, evergreen
12    25.8        5.20      0.48        38.3        0.50        25.4       47.1      Shrub cover, closed-open, deciduous
13    49.4        7.95      0.57        27.9        0.57        45.9       33.4      Herbaceous cover, closed-open
14    46.6        1.77      0.46        46.3        0.46        62.0       50.5      Sparse herbaceous or sparse shrub cover
15    23.0        1.34      0.43        10.3        0.5         35.9       77.1      Regularly flooded shrub and/or herbaceous cover
16    76.9        38.20     0.21        58.3        0.46        76.5       73.0      Cultivated and managed areas
17    11.0        1.65      0.54        0.5         0.50        38.8       81.5      Mosaic: cropland/tree cover/other natural vegetation
18    47.6        1.70      0.50        11.3        0.50        -          -         Mosaic: cropland/shrub and/or grass cover
19    60.4        0.35      0.43        80.4        0.43        95.2       93.8      Bare areas
22    70.5        1.49      0.29        27.4        0.29        -          -         Artificial surfaces and associated areas
2.4.5. Statistical analysis
Regression analyses were performed to assess the
contribution of each factor (spatial heterogeneity,
thematic similarity, mapping accuracy) to the amount
of spatial agreement between GLC2000 and CORINE2000.
Bivariate linear regression helped to investigate
the strength of the relationship between the
amount of agreement and each individual factor shown
in Table 5. Before applying multivariate regression
models, the joint principal components of four factors
(thematic similarity, spatial heterogeneity, producer's
and user's accuracy) were derived to avoid correlation
between the regression predictor variables. Principal
components one and two explained more than 88% of
the variance of the variables and were used in the
multiple linear regression analysis.
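The sequence of bivariate regression, principal component analysis and multiple regression can be sketched as follows. The sketch uses the Table 5 values of the eleven classes for which both accuracies are reported; this selection of classes, the standardization of the predictors and the use of ordinary least squares via NumPy are assumptions. Since the exact inputs and software settings of the analysis are not given, the sketch is not expected to reproduce the figures reported in this paper.

```python
import numpy as np

# Rows: GLC2000 classes 2, 3, 4, 6, 12, 13, 14, 15, 16, 17, 19 (Table 5).
# Classes 11, 18 and 22 are left out because no accuracies are reported.
# Columns: spatial agreement (%), thematic similarity,
#          spatial homogeneity (%), producer's and user's accuracy (%).
table5 = np.array([
    [39.2, 0.50, 32.8, 96.7, 35.4],
    [26.1, 0.50, 2.7, 15.2, 19.8],
    [53.8, 0.43, 43.1, 92.8, 47.0],
    [17.8, 0.50, 14.4, 37.1, 94.0],
    [25.8, 0.48, 38.3, 25.4, 47.1],
    [49.4, 0.57, 27.9, 45.9, 33.4],
    [46.6, 0.46, 46.3, 62.0, 50.5],
    [23.0, 0.43, 10.3, 35.9, 77.1],
    [76.9, 0.21, 58.3, 76.5, 73.0],
    [11.0, 0.54, 0.5, 38.8, 81.5],
    [60.4, 0.43, 80.4, 95.2, 93.8],
])
agreement = table5[:, 0]
predictors = table5[:, 1:]
names = ["thematic_similarity", "spatial_homogeneity",
         "producers_accuracy", "users_accuracy"]

def linear_fit(x, y):
    """Least-squares fit with intercept; returns coefficients and R squared."""
    design = np.column_stack([np.ones(len(y)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    r_squared = 1.0 - (residuals @ residuals) / np.sum((y - y.mean()) ** 2)
    return coef, r_squared

# Bivariate regressions: spatial agreement against each factor in turn.
bivariate_r2 = {name: linear_fit(predictors[:, k], agreement)[1]
                for k, name in enumerate(names)}

# Principal components of the standardized predictors via SVD.
standardized = (predictors - predictors.mean(axis=0)) / predictors.std(axis=0, ddof=1)
_, singular_values, components = np.linalg.svd(standardized, full_matrices=False)
explained = singular_values ** 2 / np.sum(singular_values ** 2)
scores = standardized @ components.T  # uncorrelated component scores

# Multiple regression on the first two principal components.
coef_pc, r2_pc = linear_fit(scores[:, :2], agreement)
```

Regressing on the component scores rather than on the original factors avoids correlated predictors, because the scores are uncorrelated by construction; `explained` gives the share of variance carried by each component.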
3. Results
3.1. Spatial agreement between GLC2000 and
CORINE2000
Considering the amount of spatial agreement
(Table 3), all areas defined as urban areas in CORINE
can be aggregated to the GLC2000 class artificial
surfaces. Half of this category corresponds to CORINE
class 112 (discontinuous urban fabric). This CORINE
category is also confused with other, non-urban
GLC2000 classes, showing that a fair amount of
CORINE urban land is not represented as such in
GLC2000. Perhaps smaller settlements or the rural/
urban interface zones considered in CORINE are not
reflected in the coarser scale GLC2000 map. There is
also some amount of CORINE agriculture and forest
classes that is committed to the GLC2000 artificial
surfaces. In general, the mapping of urban areas at
coarse scales is known to be a challenging task due to
their small fragmented spatial extent, the spectral
heterogeneity of the urban environment, and the
challenges in discriminating urban and rural land.
The detailed CORINE agriculture land use categories
basically aggregate to one major agriculture
GLC2000 category (class 16). CORINE non-irrigated
arable land (class 211), despite some agreement with
the corresponding GLC2000 class, shows significant
confusion with several GLC2000 categories, i.e.
herbaceous cover, sparse vegetation, and the agriculture
mosaic categories. The latter classes may indicate
spatial resolution effects, where agricultural areas may
appear as mixed units in coarser resolution datasets.
Both mosaic classes generally correspond to the CORINE
class 'Land principally occupied by agriculture, with
significant areas of natural vegetation'. However, this
category is confused with a whole range of different
GLC2000 categories and the overall amount of
agreement is very low for most of the mixed agricultural
classes. This also includes the CORINE class pasture.
This class allows up to 50% tree cover (wooded
meadows) and thus mixes with GLC2000 forests and
agriculture mosaics. Furthermore, GLC2000 does not
consider any land use/management practices within
pastures or grasslands and therefore no clear discrimination
between these two classes is possible in the
context of CORINE.
The CORINE classes with a strong woody vegetation
component (311–324) are heavily confused with different
GLC2000 classes. This may reflect the different crown
cover density thresholds for forest definitions (GLC2000:
15% and CORINE: 30%) and the fact that CORINE does
not contain a distinct shrubland category.
A prominent disagreement among such classes is between
GLC2000 class 3 (open broadleaf deciduous forest)
and CORINE class 322 (moors and heathland). Also,
parts of the CORINE woody vegetation classes (311–324)
are assigned to the GLC2000 herbaceous categories
(classes 13–15) and "Cultivated and managed areas".
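The effect of the two crown cover thresholds can be illustrated with a minimal sketch. Only the 15% and 30% values come from the text; the function name, the sample cover values and the treatment of the thresholds as inclusive lower bounds are assumptions made for illustration.

```python
# Crown cover thresholds (percent) quoted in the text for the forest definitions.
GLC2000_FOREST_MIN = 15
CORINE_FOREST_MIN = 30


def forest_labels(crown_cover_percent):
    """Return whether each legend would call a site 'forest'.

    Assumption: both thresholds are treated as inclusive lower bounds.
    """
    return {
        "GLC2000": crown_cover_percent >= GLC2000_FOREST_MIN,
        "CORINE": crown_cover_percent >= CORINE_FOREST_MIN,
    }


# Hypothetical crown cover values (percent) for five sites.
sample_cover = [10, 20, 25, 35, 60]

# Sites where the two definitions disagree (forest in one legend only).
disagreeing = [
    cover for cover in sample_cover
    if forest_labels(cover)["GLC2000"] != forest_labels(cover)["CORINE"]
]
print(disagreeing)
```

Under these assumptions, any site with crown cover between the two thresholds counts as forest in GLC2000 but not in CORINE, which is one plausible source of the confusion described above.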
There is limited overlap between the wetland areas
indicated in CORINE and the respective wetland areas
of GLC2000. In particular, peat bogs appear to be
mixed with a variety of GLC2000 categories. Bare areas
and sparsely vegetated areas are highly confused with
each other: many sparsely vegetated areas in CORINE
appear as bare areas in GLC2000 and vice versa.
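One simple way to quantify class-wise confusion of this kind is to translate each CORINE label to its expected GLC2000 class and count how often the co-located GLC2000 label matches. The sketch below is not the procedure used in this study. The translation table covers only two classes named in the text, and the pixel values and function name are invented for illustration.

```python
from collections import Counter

# Partial, illustrative legend translation (CORINE code -> GLC2000 class name).
CORINE_TO_GLC = {
    112: "artificial surfaces",
    211: "cultivated and managed areas",
}


def class_agreement(corine_pixels, glc_pixels):
    """Share of pixels per expected GLC2000 class where both maps agree.

    Assumption: both inputs are co-registered, equally long label sequences.
    """
    totals, hits = Counter(), Counter()
    for corine_code, glc_label in zip(corine_pixels, glc_pixels):
        expected = CORINE_TO_GLC[corine_code]
        totals[expected] += 1
        if glc_label == expected:
            hits[expected] += 1
    return {name: hits[name] / totals[name] for name in totals}


# Hypothetical co-located labels for six pixels.
corine = [112, 112, 211, 211, 211, 211]
glc = [
    "artificial surfaces",
    "cultivated and managed areas",
    "cultivated and managed areas",
    "cultivated and managed areas",
    "herbaceous cover",
    "cultivated and managed areas",
]
agreement = class_agreement(corine, glc)
print(agreement)
```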
3.2. Key drivers for the spatial (dis)agreement
Given the limited amount of spatial agreement
between both datasets, the question arises which
factors drive the (dis)agreement. This study
considered several of these factors. Thematic similarity,
spatial heterogeneity, and classification accuracy of
GLC2000 were analysed with respect to their influence
on the spatial agreement between GLC2000 and
CORINE2000 (Tables 5 and 6).
All of the considered factors except the user's
accuracy explain some amount of the agreement between
both datasets. The spatial heterogeneity affects the
joint dataset disagreement, with more homogeneous
classes showing more agreement. This factor strongly
reflects the issue of spatial resolution, which differs
between the two datasets. Mapping heterogeneous landscapes is
strongly dependent on the minimum mapping unit
(Smith et al., 2003), which thus drives the heterogeneity
between GLC2000 and CORINE. The thematic
similarity measure describes the affinity of each
GLC2000 class with the non-agreeing CORINE classes
(Table 4), i.e. the lower the similarity with dissimilar
categories, the higher the amount of agreement. Urban
areas and cultivated and managed areas show the lowest
thematic similarity values and the largest amount of
agreement. Both GLC2000 classes comprise a
number of aggregated CORINE classes and are
thematically rather distinct. The other GLC2000
categories are thematically more similar to other
CORINE categories. One noticeable exception is
GLC2000 class 13 (herbaceous cover, closed-open).
This class has the highest thematic similarity
value yet a rather high amount of agreement compared
to the general trend. The third factor (GLC2000
producer's accuracy) shows a direct relationship
with the GLC2000/CORINE agreement: the more
accurately a class was mapped from a producer's
perspective, the higher the amount of agreement with
CORINE. No linear trend is found between the
GLC2000 user's accuracy and the class agreement
values.
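The single-factor relationships described here correspond to simple linear regressions of class agreement on one factor at a time. A minimal sketch of such a fit follows. The per-class values are invented and are not taken from Table 5, and the function name is an assumption.

```python
import math


def simple_regression(x, y):
    """Ordinary least squares fit y = intercept + slope * x.

    Returns slope, intercept, r squared and the t statistic of the slope.
    Note: a perfect fit (r squared of exactly 1) would divide by zero.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy ** 2 / (sxx * syy)
    # t statistic of the slope (n - 2 degrees of freedom), signed like the slope.
    t_stat = math.copysign(math.sqrt(r2 * (n - 2) / (1.0 - r2)), slope)
    return slope, intercept, r2, t_stat


# Hypothetical per-class values: spatial heterogeneity and spatial agreement.
heterogeneity = [0.2, 0.5, 0.7, 0.4, 0.9, 0.3, 0.6, 0.8]
agreement = [0.8, 0.5, 0.3, 0.45, 0.15, 0.7, 0.4, 0.2]

slope, intercept, r2, t_stat = simple_regression(heterogeneity, agreement)
print(f"slope={slope:.2f} r2={r2:.2f} t={t_stat:.2f}")
```

With these invented values the slope is negative, mirroring the statement that more homogeneous classes show more agreement.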
The R² values for the significant regressions are
rather similar, at around 50% of explained variance. This
suggests that all factors contribute to some extent
to the disagreement and that no single factor
alone accounts for it. To
test whether the interaction of multiple
factors may be responsible for the disagreement, a
principal component analysis and a multiple linear
regression were performed. Principal components one
and two explained more than 88% of the variance of
these variables and were used as predictors in the
regression. Principal component one reflects
the information from the producer's accuracy
(R = 0.75), spatial heterogeneity (R = 0.60), and the
thematic similarity (R = 0.30). The loadings of the
first principal component suggest that the spatial
heterogeneity and producer's accuracy may be more
important than the thematic similarity, or at least carry
somewhat similar though negatively correlated
information. The variance of the thematic similarity
strongly depends on two classes (urban and agriculture)
and may be of less importance for other categories. The
second principal component represents the user's accuracy
(R = 0.97).
The regression results for the joint contribution of all
factors, using the first two principal components of all
four factors, are shown in Table 6. This multiple
regression model using both principal components
explains the majority of the agreement between both
datasets (R² = 0.81, R = 0.90). Both principal components
make significant contributions to this regression
model, with the first component being significant at the 99%
confidence level and the second component at the 95%
confidence level. Thus, the considered factors are
responsible for most of the disagreement between
GLC2000 and CORINE2000.
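The two-step analysis (principal components of the four factors, then a multiple regression of agreement on the first two components) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the per-class values are invented, all function names are hypothetical, and the components are obtained by power iteration with deflation on the correlation matrix, a technique chosen here only to keep the example free of external libraries.

```python
import math


def centre(col):
    mean = sum(col) / len(col)
    return [v - mean for v in col]


def standardise(col):
    c = centre(col)
    sd = math.sqrt(sum(v * v for v in c) / (len(c) - 1))
    return [v / sd for v in c]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def leading_eigenpair(mat, iters=1000):
    """Power iteration: largest eigenvalue and unit eigenvector (symmetric matrix)."""
    vec = [float(i + 1) for i in range(len(mat))]
    for _ in range(iters):
        new = [dot(row, vec) for row in mat]
        norm = math.sqrt(dot(new, new))
        vec = [v / norm for v in new]
    return dot(vec, [dot(row, vec) for row in mat]), vec


def first_two_components(columns):
    """Explained-variance ratios and score vectors of PC1 and PC2."""
    z = [standardise(c) for c in columns]
    n, k = len(z[0]), len(z)
    corr = [[dot(a, b) / (n - 1) for b in z] for a in z]
    ratios, scores = [], []
    for _ in range(2):
        value, vec = leading_eigenpair(corr)
        ratios.append(value / k)  # the trace of a correlation matrix equals k
        scores.append([sum(vec[j] * z[j][i] for j in range(k)) for i in range(n)])
        # Deflation: remove the component just found before the next search.
        corr = [[corr[i][j] - value * vec[i] * vec[j] for j in range(k)]
                for i in range(k)]
    return ratios, scores


def r2_two_predictors(y, x1, x2):
    """R squared of an OLS fit y ~ x1 + x2 with intercept (2x2 normal equations)."""
    y, x1, x2 = centre(y), centre(x1), centre(x2)
    s11, s22, s12 = dot(x1, x1), dot(x2, x2), dot(x1, x2)
    s1y, s2y = dot(x1, y), dot(x2, y)
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    residual = [yi - b1 * a - b2 * b for yi, a, b in zip(y, x1, x2)]
    return 1.0 - dot(residual, residual) / dot(y, y)


# Hypothetical per-class factor values (not from Table 5).
heterogeneity = [0.2, 0.5, 0.7, 0.4, 0.9, 0.3, 0.6, 0.8]
similarity = [0.1, 0.4, 0.5, 0.6, 0.7, 0.2, 0.5, 0.6]
producer_acc = [0.9, 0.6, 0.4, 0.7, 0.3, 0.8, 0.5, 0.35]
user_acc = [0.7, 0.8, 0.5, 0.6, 0.65, 0.9, 0.55, 0.75]
agreement = [0.8, 0.5, 0.3, 0.45, 0.15, 0.7, 0.4, 0.2]

ratios, scores = first_two_components(
    [heterogeneity, similarity, producer_acc, user_acc])
r2_both = r2_two_predictors(agreement, scores[0], scores[1])
print(f"variance explained by PC1+PC2: {sum(ratios):.2f}, model R2: {r2_both:.2f}")
```

Because the component scores are mutually uncorrelated, regressing on them avoids the collinearity among the original factors. As a side check on the reported figures, R is the square root of R², so R² = 0.81 corresponds to R = 0.90.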
3.3. Implications for GLOBCOVER
The comparative assessment of GLC2000 and
CORINE2000 highlights several reasons for the
differences between both datasets. The landscape
heterogeneity, which strongly relates to the spatial
resolution, significantly influences the spatial agreement
between both datasets. Since GLOBCOVER is
based on 300 m MERIS data, this source of disagreement
should be reduced if the goal is integration
with CORINE data. However, it has to be considered
that a finer spatial resolution has a direct impact on the
class definitions, in particular for mixed unit classes.
One of the main issues remaining for GLOBCOVER
development is mapping accuracy. This study has
emphasized that mapping accuracy is one of the main
drivers of the disagreement between coarse scale land
Table 6
Regression results explaining the relationships between the agreement of GLC2000/CORINE2000 and different factors that affect the agreement
presented in Table 5

Independent variable             Correlation with amount of     T-test     Pr(>|t|)
to predict agreement             spatial agreement (r²)
Spatial heterogeneity            0.46                           3.19**     0.0078
Thematic similarity              0.50                           3.49**     0.0044
Producer's accuracy (GLC2000)    0.50                           3.00**     0.0149
User's accuracy (GLC2000)
Multiple regression
  PC1                            0.81                           5.32**     0.0007
  PC2                                                           2.44*      0.0404

Note: ** shows a dependence at 99% confidence interval and * shows a dependence at 95% confidence interval.
cover datasets. This has been noted before in the context
of land change assessment (Townshend and Justice,
2002), and points to the need for the most accurate
mapping approaches, accompanied by comprehensive
and comparative validation efforts (Herold et al.,
2006a). GLOBCOVER has to consider known challenges
in deriving specific land cover classes at global
scales. Given the GLC2000 experience, mixed unit and
mosaic classes, shrublands, herbaceous covers and
wetland classes are among those usually derived with
rather low mapping quality. GLOBCOVER should not
start from scratch but should take advantage of the comparison
of existing global land cover products. Where there is general
agreement between different products, it is certain that
these areas represent known land cover characteristics.
Mapping error may be reduced if such an approach is
used.
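The idea of using agreement among existing products as prior knowledge can be sketched as a per-pixel consensus test. The example assumes the maps have already been co-registered to a common grid and translated into a common legend (for instance via LCCS classifiers). The function name and the class labels are invented for illustration.

```python
def consensus_mask(*maps):
    """Mark pixels where every harmonised map reports the same class."""
    return [len(set(labels)) == 1 for labels in zip(*maps)]


# Hypothetical harmonised labels from three products for four pixels.
product_a = ["forest", "cropland", "urban", "wetland"]
product_b = ["forest", "cropland", "cropland", "wetland"]
product_c = ["forest", "shrubland", "urban", "wetland"]

mask = consensus_mask(product_a, product_b, product_c)
print(mask)
```

Pixels flagged as consensus could serve as comparatively reliable reference areas, while the remaining pixels indicate where additional mapping effort is most needed.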
In terms of thematic definitions, the results indicate
a clear need for harmonizing existing definitions. So far,
all CORINE categories have been translated into the
LCCS language used by GLC2000. The use of LCCS
and a set of common land cover classifiers has to be
adopted for the development and interoperability of the
GLOBCOVER product. Given the thematic translation,
CORINE contains detailed land use categorizations
for artificial surfaces and agricultural areas in several
categories. They aggregate to land cover types in
GLC2000: cultivated and managed areas (all types of
crop agriculture), and artificial surfaces and associated
areas (most types of urban land uses). Both classes
showed a distinct thematic similarity character and
should be treated as such in the mapping process. One
problematic class is pasture, which is an agricultural
category in CORINE but refers to a land cover class in
GLC2000.
A further problematic issue for thematic definitions
is the different crown cover densities used to
discriminate forests from other vegetation types
(GLC2000: 15% and CORINE: 30%). This may be
one of the reasons for the significant confusion between
different GLC2000 and CORINE woody vegetation
categories. The integration of additional coarse scale
mapping information (e.g. forest continuous fields)
might be useful in this context. Also, CORINE has no
specific class for shrubs, but some CORINE categories
might correspond to the two GLC2000 Shrub Cover
classes; this issue needs to be resolved. Furthermore, the
vegetation density threshold separating bare and
sparsely vegetated areas differs between the two datasets
and thus resulted in significant disagreements. An
adjustment of density thresholds is essential if CORINE
and GLOBCOVER data are to become comparable.
4. Conclusions
The study has emphasized the heterogeneity
between the GLC2000 and CORINE2000 datasets
and the driving factors of disagreement. In the
presented study, thematic similarities, the spatial
heterogeneity and the classification accuracy of
GLC2000 were analysed to assess the comparability
of both datasets. The statistical analysis showed that not
a single factor but rather the joint contribution of all of them
is responsible for the observed disagreements
between both land cover maps. Consequently, to make
a robust map comparison all these drivers need to be
considered. Taking into account that land cover and
land use maps are developed for a specific purpose,
based on different (remote sensing) data and classification
methodologies, it would be a hindrance to define a
rigid set of parameters just to allow a map comparison
by default. Flexible standardisation, which allows on
the one hand free map generation according to an
individual mapping request and on the other hand full
comparability with other land cover datasets, is
increasingly in demand. LCCS is a promising tool to
facilitate this development. Finally, all mapping efforts
have to be accompanied by robust and comparative
accuracy assessment with existing datasets to improve
their inter-comparison.
Linking the new GLOBCOVER with GLC2000 and
CORINE2000 seems challenging and has to consider
their specific characteristics and definitions. Nevertheless,
there is potential for interoperable development
of GLOBCOVER. The finer spatial resolution of 300 m
is expected to reduce some of the disagreement. The
strong need for harmonized land cover definitions and
their accurate mapping points to generic land cover
definitions (common LCCS classifiers) as common
ground for flexible map product generation.
GLOBCOVER should consider known land cover
characteristics available worldwide from harmonizing
existing land cover datasets.
References
Bartholomé, E., Belward, A.S., 2005. GLC2000: a new approach to
global land cover mapping from Earth observation data. Int. J.
Remote Sens. 26 (9), 1959–1977.
Bennett, B., 2001. What is a Forest? On the vagueness of certain
geographic concepts. Topoi 20, 189–201.
Bishr, Y., 1998. Overcoming the semantic and other barriers to GIS
interoperability. Int. J. Geograph. Inform. Sci. 12 (4), 299–314.
Comber, A., Fisher, P., Wadsworth, R., 2004. Integrating land-cover
data with different ontologies: identifying change from inconsistency.
Int. J. Geograph. Inform. Sci. 18 (7), 691–708.
Comber, A., Fisher, P., Wadsworth, R., 2005. What is land cover?
Environ. Plan. B: Plan. Design 32, 199–209.
Congalton, R.G., 2003. Putting the map back in map accuracy
assessment. In: Lunetta, R.S., Lyon, J.G. (Eds.), Remote Sensing
and GIS Accuracy Assessment. CRC Press, Boca Raton, FL, pp.
1–11.
Di Gregorio, A., 2005. Land Cover Classification System (LCCS):
Classification Concepts and User Manual. FAO, Italy.
EEA, 2005a. IMAGE2000 and CLC2000. Products and Methods.
CORINE Land Cover Updating for the Year 2000. Ispra, Italy.
EEA, 2005b. EEA data service, Corine land cover 2000 vector by country
(CLC2000). http://dataservice.eea.eu.int/dataservice/metadetails.asp?id=667.
Hagen, A., 2003. Fuzzy set approach to assessing similarities of
categorical maps. Int. J. Geograph. Inform. Sci. 17 (3), 235–249.
Herold, M., Schmullius, C., 2004. Report on Harmonization of Global
and Regional Land Cover Products, Workshop report at FAO,
Rome, Italy, 14–16 July 2004, GOFC-GOLD report series 20.
URL: http://www.fao.org/gtos/gofc-gold/series.html.
Herold, M., Woodcock, C., Di Gregorio, A., Mayaux, P., Belward, A.,
Latham, J., Schmullius, C.C., 2006a. A joint initiative for
harmonization and validation of land cover datasets. IEEE Trans. Geosci.
Remote Sens. 44 (7), 1719–1727.
Herold, M., Latham, J.S., Di Gregorio, A., Schmullius, C.C., 2006b.
Evolving standards on land cover characterization. J. Land Use
Sci. 1 (2–4), 157–168.
Jansen, L.J.M., 2005. Harmonisation of land-use class sets to facilitate
compatibility and comparability of data across space and time. In:
Proceedings of the 12th CEReS International Symposium, Chiba,
Japan.
JRC, 2004. GLC2000 homepage.
http://www-gvm.jrc.it/glc2000/defaultGLC2000.htm.
Mayaux, P., Strahler, A., Eva, H., Herold, M., Shefali, A., Naumov, S.,
Dorado, A., Di Bella, C., Johansson, D., Ordoyne, C., Kopin, I.,
Boschetti, L., Belward, A., 2006. Validation of the global land
cover 2000 map. IEEE Trans. Geosci. Remote Sens. 44 (7),
1728–1739.
Smith, J.H., Wickham, J.D., Stehman, S.V., Yang, L., 2003. Effects of
landscape characteristics on land-cover class accuracy. Remote
Sens. Environ. 84 (3), 342–349.
Townshend, J.R.G., Justice, C.O., 2002. Towards operational
monitoring of terrestrial systems by moderate-resolution remote
sensing. Remote Sens. Environ. 83 (1–2), 351–359.
Visser, H. (Ed.), 2004. The Map Comparison Kit: methods,
software and applications. RIVM report 550002005/2004, Bilthoven,
The Netherlands.