
Managing Error, Accuracy, and Precision in GIS

Importance of Understanding Error
*Until recently, most people involved with GIS
paid little attention to error
*That situation has now changed dramatically
*Error management is vital to the proper
functioning of a GIS database, and accounts for
a large percentage of the work in most GIS shops
Importance of Understanding Error
*The key point is that awareness, scrutiny,
and careful planning can minimize these
errors and their associated effects on
management and decision-making
Definitions for Understanding Error
*Accuracy: the degree to which information on
a map or in a digital database matches the true
or accepted values
-can vary greatly amongst datasets
-very high accuracy can be expensive

*Precision: refers to the level of measurement
and “exactness” of description in a GIS
-again, precision requirements vary greatly
depending on the dataset
-highly precise data can be much more
expensive to create
Definitions for Understanding Error
Accuracy vs. Precision . . .
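A small sketch may help separate the two ideas as they are defined above: precision as how finely a value is recorded, accuracy as how close it is to the accepted value. The function name, the readings, and the accepted value are all made up for illustration.

```python
from decimal import Decimal

def describe(recorded: str, accepted: float):
    """Return (resolution, error) for a value exactly as it was written down."""
    value = Decimal(recorded)
    # Smallest unit implied by the recorded digits, e.g. "503.127" -> 0.001
    resolution = Decimal(1).scaleb(value.as_tuple().exponent)
    # Distance from the accepted ("true") value
    error = abs(float(value) - accepted)
    return float(resolution), error

accepted_easting = 500.2  # hypothetical accepted value, in metres

# Recorded to the millimetre, yet almost 3 m off: precise but not accurate
fine_res, fine_err = describe("503.127", accepted_easting)

# Recorded to the nearest metre, yet close to the truth: accurate but not precise
coarse_res, coarse_err = describe("500", accepted_easting)

print(fine_res, fine_err)
print(coarse_res, coarse_err)
```

Extra decimal places say nothing about whether a value is right, which is why the two terms need to be tracked separately.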
Types of Error
Positional Accuracy and Precision
*Refers to both horizontal and vertical positions
*Don’t use/compute locational information at a
level of detail beyond that for which the data
were intended

Accuracy Standards for US NTS Maps

Scale        Accuracy
1:1,200      ± 3.33 feet
1:2,400      ± 6.67 feet
1:4,800      ± 13.33 feet
1:10,000     ± 27.78 feet
1:12,000     ± 33.33 feet
1:24,000     ± 40.00 feet
1:63,360     ± 105.60 feet
1:100,000    ± 166.67 feet
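The tolerances listed above are consistent with the rule in the US National Map Accuracy Standards: 1/30 inch at map scale for maps larger than 1:20,000, and 1/50 inch for maps at 1:20,000 or smaller. The sketch below applies that rule; the function name is ours, not part of any standard or library.

```python
def nmas_horizontal_tolerance_ft(scale_denominator: int) -> float:
    """Ground tolerance in feet for a map at 1:scale_denominator."""
    # 1/30 inch on the map for scales larger than 1:20,000, otherwise 1/50 inch
    map_inches = 1 / 30 if scale_denominator < 20_000 else 1 / 50
    ground_inches = map_inches * scale_denominator
    return ground_inches / 12  # inches to feet

for d in (1_200, 2_400, 4_800, 10_000, 12_000, 24_000, 63_360, 100_000):
    print(f"1:{d:,}  ± {nmas_horizontal_tolerance_ft(d):.2f} feet")
```

The tolerance grows in step with the scale denominator, so a position read from a 1:100,000 map cannot be trusted to the same level as one read from a 1:1,200 map.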
Types of Error
Attribute Accuracy and Precision
*Attribute (non-spatial) information can also be
erroneous
*Some layers can be more precise than others

Conceptual Accuracy and Precision
*Use of inappropriate categories, or
misclassification
*Ex.-not classifying voltage in your power lines
layer would limit your ability to manage
electrical utilities infrastructure
Sources of Error
*Sources of error can be divided into three
groups:
-obvious sources of error
-errors resulting from natural variations or
from original measurements
-errors arising through processing


Obvious Sources of Error
*Age of Data
-some data sources may be too old to be
useful
-past collection standards may no longer be
acceptable
-the features recorded in the database could
have changed dramatically over time
(erosion/deposition, harvest, fire)
-updating a database is by far the most
common form of error management work
Obvious Sources of Error
*Areal Cover
-some datasets contain only part of the
required information (veg., soils are common)
-ex. FRI often contains no land cover
information for wetland areas
-some remote sensing data may be difficult to
acquire in consistently cloudy regions
Obvious Sources of Error
*Map Scale
-always remember the implications of scale!!!!

*Density of Observations
-an insufficient number of observations may
not provide the required level of resolution
-ex. If you have a 40’ contour interval, you
had better not be reporting on or making
decisions about features that differ by only
a few feet in elevation
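The contour example can be turned into a rough check. The sketch below assumes the common rule of thumb, also used in the US National Map Accuracy Standards, that elevations taken from contours are only good to about half the contour interval. The function name and the threshold choice are assumptions for illustration.

```python
def resolvable_from_contours(height_difference_ft: float,
                             contour_interval_ft: float) -> bool:
    """True if a height difference exceeds the assumed half-interval uncertainty."""
    return abs(height_difference_ft) > contour_interval_ft / 2

# With 40-foot contours, a 3-foot difference is well inside the uncertainty
print(resolvable_from_contours(3, 40))
# A 25-foot difference is large enough to be worth reporting
print(resolvable_from_contours(25, 40))
```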
Obvious Sources of Error
*Relevance
-surrogate data may be used to indirectly
describe/classify/quantify features
-Ex. We can create a forest polygon layer from
classification of remotely sensed data.
However, we are not classifying a “tree” as a
tree. Rather, we are classifying the imagery
based on spectral signatures, and those
signatures can be related to tree species.
Obvious Sources of Error
*Format
-methods of formatting data can introduce
errors
-conversion of scale, projection, or datum,
vectorization/rasterization, and pixel
resolution are possible areas of format error
-international mapping standards have not
been established
Obvious Sources of Error
*Accessibility
-try getting a highway map of the former
USSR in the Cold War days . . . Good Luck!

*Cost
-highly accurate, precise data is expensive!!!
Errors from Natural Variation
or from Original Measurements
*Positional Accuracy
-many natural features do not exhibit “hard”
boundaries like roads or boundary lines
-examples include:
-soils
-vegetation communities
-climate variables
-drainage
-biomes, etc.
Errors from Natural Variation
or from Original Measurements
*Accuracy of Content
-qualitative accuracy refers to correct
labelling/classification (Ex.-pine forest vs.
spruce forest)
-quantitative inaccuracies often occur from
faulty equipment or poor readings
-what forestry equipment could give you bad
data? And how?
Errors Arising Through Processing
*Numerical Errors
-by far, the hardest errors to detect!!!
-different (faulty) computer chips can compute
differently, generating a different output
(response)

*Topological Errors
-overlaying, or deriving/creating new variables
based on other data can cause slivers,
overshoots, and dangles
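Numerical error does not require a faulty chip. A more everyday mechanism, swapped in here for illustration, is storing coordinates at too low a numeric precision. The sketch below rounds a hypothetical UTM northing to a 32-bit float, where representable values at this magnitude are spaced 0.5 m apart.

```python
import struct

def to_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

northing = 4649776.22           # hypothetical UTM northing, in metres
stored = to_float32(northing)   # value actually kept in a 32-bit field
error_m = abs(stored - northing)

print(stored, error_m)
```

Nothing crashes and no warning is raised, which is why errors of this kind are so hard to detect.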
Errors Arising Through Processing
*Classification/Generalization Errors
-classification inaccuracies/class merging
-grouping data in different ways can lead to
dramatically different results (Ex.-a study of
cause of death amongst males would probably
give quite different results with (amongst
others) an 18-25 age group than with an 18-50
group)
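The effect of class boundaries can be seen with a few lines of code. The ages below are made up, and the helper function is a hypothetical illustration, not a method from the presentation.

```python
from collections import Counter

def group_counts(ages, breaks):
    """Count ages falling into inclusive (lower, upper) classes."""
    counts = Counter()
    for age in ages:
        for lower, upper in breaks:
            if lower <= age <= upper:
                counts[(lower, upper)] += 1
                break
    return dict(counts)

ages = [19, 22, 24, 31, 38, 44, 47, 49]  # made-up sample

narrow = group_counts(ages, [(18, 25), (26, 50)])
wide = group_counts(ages, [(18, 50)])

print(narrow)
print(wide)
```

The wide class hides the split between younger and older records that the narrow classes expose, so any statistic computed per class will differ as well.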
Errors Arising Through Processing
*Geocoding/Digitizing Errors
-what can cause digitizing errors?
-rasterizing will cause positional error
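The positional error from rasterizing can be bounded. As a minimal sketch, assume a square grid and assume each point is represented by the centre of the cell that contains it; the point then moves by at most half the cell diagonal. The function names, grid origin, and coordinates are hypothetical.

```python
import math

def snap_to_cell_centre(x, y, cell_size, origin_x=0.0, origin_y=0.0):
    """Return the centre of the grid cell containing (x, y)."""
    col = math.floor((x - origin_x) / cell_size)
    row = math.floor((y - origin_y) / cell_size)
    cx = origin_x + (col + 0.5) * cell_size
    cy = origin_y + (row + 0.5) * cell_size
    return cx, cy

def max_rasterizing_shift(cell_size):
    """Largest possible shift: half the diagonal of a square cell."""
    return cell_size * math.sqrt(2) / 2

x, y = 1029.0, 2031.0                      # hypothetical point, in metres
cx, cy = snap_to_cell_centre(x, y, 30.0)   # 30 m cells
shift = math.hypot(cx - x, cy - y)

print((cx, cy), shift, max_rasterizing_shift(30.0))
```

Coarser cells raise the bound in direct proportion, which ties this error back to pixel resolution as a format choice.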
Error, Error, Everywhere . . .
How can we manage error?
1. Be aware of where error can be
generated (everything discussed in this
presentation)
2. Metadata, metadata, metadata . . . Fully
understand all data compiled for your GIS,
make notes of all work done with the data,
and pass that information on to future users
and along with all GIS-generated output.
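One lightweight way to keep such notes is a small record that travels with each layer. The sketch below is only an illustration: the class, its fields, and the sample values are hypothetical and do not follow any formal metadata standard.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class LayerMetadata:
    """Minimal, hypothetical metadata record for one GIS layer."""
    name: str
    source: str
    source_scale: int   # scale denominator, e.g. 24000 for 1:24,000
    collected: date
    processing_notes: list = field(default_factory=list)

    def note(self, text: str) -> None:
        """Append a dated note describing work done on the layer."""
        self.processing_notes.append(f"{date.today().isoformat()}: {text}")

# Made-up example layer
roads = LayerMetadata("roads", "digitized from paper map", 24000, date(1998, 6, 1))
roads.note("reprojected; datum conversion applied")

print(roads)
```

Recording source scale and collection date alongside the processing history covers several of the obvious sources of error listed earlier (age of data, map scale, format).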
