
COURSE NOTES

CHEMICAL MASS BALANCE APPROACH TO
QUANTITATIVE SOURCE APPORTIONMENT OF
ENVIRONMENTAL POLLUTANTS
Presented
September 12, 2011
EPA, North Carolina

By John A. Cooper
Cooper Environmental Services LLC
10180 SW Nimbus Ave., Ste J6
Portland OR 97223




[Cover figure: chart of ambient particulate source contributions (auto exhaust, industry, fly ash, sulfate, carbon, soil dust), in µg/m³]

TABLE OF CONTENTS

1.0 INTRODUCTION
1.1 Background
1.2 The Subject
1.3 The Objective
1.4 The Law and the Scientific Method
1.5 How Much, Not Who
1.6 Source Apportionment Alternatives
1.7 Evolution of the Receptor Oriented Approach
1.8 Receptor Modeling Methodology
2.0 STATISTICS, MODELS, METHODS, ASSUMPTIONS AND RESOLUTION
2.1 Statistics and Empirical Models
2.2 Perceived Physical and Mathematical Models
2.3 Models and Methods
2.4 Model Assumptions
2.5 Receptor Model Limitations
2.6 Spectroscopy, Resolution, and Signal to Noise Ratios
3.0 CHARACTERISTICS OF AMBIENT AND SOURCE AEROSOLS
3.1 Chemical-Physical Model of an Airshed
3.2 Source Emission Characteristics
3.3 Characteristics of Ambient Suspended Particulate Matter
4.0 AN OVERVIEW OF STATISTICAL AND EMPIRICAL MODELING
4.1 Assumptions
4.2 The Signal: Variability
4.3a Two Statistical Approaches
4.3b Bivariate Linear Regression
4.4 Multiple Linear Regression
4.5 Factor Analysis
4.6 Data Requirements
4.7 Experimental Design Considerations
4.8 Advantages and Disadvantages
5.0 A GRAPHICAL APPROACH TO RECEPTOR MODELING
5.1 Models and Modeling
5.2 The Subject and Data Set
5.3 Histogram and Vector Representations
5.4 Multicollinearity in Source Apportionment
5.5 Ambient Pattern Fitting
5.6 Enrichment Factor and Mn/V Ratio
6.0 CHEMICAL MASS BALANCE MODELING
6.1 Fundamental Principles
6.2 Solutions to the Receptor Model Equation
6.3 Applying the CMB
6.4 Extending the CMB Result
7.0 PARTICLE TYPE MASS BALANCE (PTMB)
7.1 Summary
7.2 Comparison of Microscopic Approaches
7.3 Mass Balance Calculation
8.0 MULTIVARIATE ANALYSIS
8.1 Introduction
8.2 Important Features of Factor Analysis
8.3 Simulated Analysis Example
8.4 Applications
8.5 Summary
9.0 AVERAGES, UNCERTAINTIES, AND SOURCE RESOLUTION
9.1 Mathematical Problems Associated with the Calculation of Mean Values
9.2 Uncertainties
9.3 Source Resolution in Relation to Sample Time Distribution Assumed and Chemical Data Combination Method
10.0 SOURCE MATRIX DEVELOPMENT
10.1 Introduction
10.2 Use of Literature and Source Library Data
10.3 Multivariate Analysis Methods
10.4 Wind Trajectory Analysis
10.5 Direct Source Sampling
10.6 Composite Transportation Source Matrix
10.7 Source Characterization Protocol
11.0 AMBIENT SAMPLING FOR MAXIMUM SOURCE RESOLUTION
11.1 Introduction
11.2 Meteorological Regime Categorization
11.3 Sampling Location, Frequency, and Duration
11.4 Samplers and Filters
11.5 Filter Handling
12.0 ANALYTICAL METHODS FOR CHEMICAL MASS BALANCE MODELING
12.1 Introduction and Overview
12.2 Selection of Elements
12.3 Alternate Analytical Approaches
12.4 XRF Methods
13.0 RUNNING EPA CMB 8.2 MODEL
13.1 Software Installation
13.2 Model Operations
13.3 Input and Output Data Files
13.4 Performance Measures
14.0 EVALUATION OF RECEPTOR MODELING RESULTS
14.1 Introduction
14.2 Internal Verification Techniques
14.3 Source Activity and Transport Methods
14.4 Comparisons to Other Techniques
14.5 Analysis of Anomalous Results
15.0 EXAMPLES OF CHEMICAL MASS BALANCE APPLICATIONS
15.1 Airshed Management Applications
15.2 Visibility Source Apportionment
15.3 Chemical Mass Balance Case Studies
15.4 Dispersion Modeling
16.0 REFERENCES AND ADDITIONAL READING



1.0 INTRODUCTION
1.1 Background
The complexity of issues facing those responsible for managing our air resources has changed
markedly within the past ten years. Those who worked in the air quality programs of the 1950's will
recall that "first round" regulatory programs were directed at eliminating obvious emission sources,
largely by adoption of Ringelmann and opacity rules, particle fallout and nuisance regulations. These
programs were successful in improving community air quality by eliminating emissions from stacks
belching black smoke and controlling obvious process losses that had captured the public's attention.
Adoption of the Clean Air Act of 1970 and the development of "second round" control
strategies brought further air quality improvements through promulgation of control strategies based
on proportional rollback models. Yet a "third round" of strategies designed to attain and maintain air
quality standards was required of state and local agencies in the late 1970's, this time based on
approved EPA dispersion models. Unfortunately, the effectiveness of particulate control programs
was often limited by failure of (a) the area source inventories to include fugitive dust emissions (a
source that we now know to be of critical importance in many TSP nonattainment areas) and (b)
inability of the models to simulate formation of secondary aerosols. Thus, in spite of the fact that
cumulative air pollution abatement costs are expected to exceed $300 billion over the decade ending
in 1987, as many as 395 counties remain in noncompliance with National Ambient Air Quality
standards (NAAQS) for Total Suspended Particulate alone.
These facts, and the increasing pressure on air quality managers to adopt cost-effective
strategies, require development of convincing evidence of the relative importance of each source to
the overall air quality problem. Unless an effective case demonstrating the magnitude and identity of
source impacts can be made, considerable doubt as to the likely success of new control programs can
exist.
Although this course will focus primarily on the application of the receptor approach to air
quality issues around which this methodology has evolved, it is clear that the general approach is
applicable to a broad range of environmental problems, many of which will be addressed in this
course.
1.2 The Subject
The topic of this course is the receptor approach to source apportionment of environmental
pollutants, mainly air pollutants. Yet the central subject behind this course is man: his total
environment and the substances to which he is exposed through the air, water, and food he ingests.
As such we are primarily interested in the origin, transport, and fate of pollutants in man's
environment, ecosystem, biosphere and/or various pathways to man. The receptor oriented
methodology developed for apportioning air pollutants is just as applicable to all portions of man's
environment.
Our primary concern for air pollution, for example, has been its impact on the health of man.
This has been one of the driving forces behind recent attempts in the U.S. to define inhalable and/or
respirable particulate standards. The impact of suspended particulate material on man depends not
only on the potency of the material, but also on its efficiency for exposing sensitive tissues deep in the
respiratory tract (Figure 1.1). Large particles (>10 µm), for example, are efficiently removed from
the air stream in the nasal-pharynx region of the upper respiratory tract, where they are either expelled
or passed through the gastrointestinal tract. Particles smaller than 10 µm have low removal efficiency
in the nasal-pharynx and as such pass into the deeper portions of the respiratory tract, where they are
removed by impaction and diffusion. The deposition efficiency is highest for particles less than 0.1 µm
in both the tracheobronchial and alveolar regions of the pulmonary system; for all particles less than a
few microns, deposition efficiency is highest in the alveolar region (Figure 1.2). Particles deposited in
the tracheobronchial and pulmonary regions of the respiratory tract are either expelled or pass through
the gastrointestinal tract.
Figure 1.1: The Respiratory System

Figure 1.2: Pollutant Exposure Depends on Particle Size and
Respiratory Deposition Efficiency
Figure 1.3 shows a comparison of a 0.1 µm particle and a red blood cell, which has a diameter
of about 7 µm.

Figure 1.3: Comparison of a 0.1 µm particle and a red blood cell (diameter of 7 µm)

1.3 The Objective
This concern for man's health has led to the establishment of national ambient air quality
standards for air particulate mass and other criteria pollutants. Current particulate standards are
based on 24-hour and annual TSP and quarterly lead levels. Attainment of these standards and/or
other standards is the primary objective of regulatory action and as such will have a substantial
impact on the design of receptor oriented source apportionment studies.
Although attainment of these standards is the primary objective, reaching this objective
requires action on the part of individuals, the general public, industry, and/or the government.
Successful implementation of cost-effective action, however, requires the establishment of a specific
level of confidence in those required to take action; confidence that the action will be successful in
providing a cost-effective reduction in pollution levels. The level of confidence required to attain a
specified action is highly variable and will depend on the degree of inconvenience created by the
proposed action and the potential consequences caused by either no action or inappropriate action.
The degree of confidence developed in a proposed control strategy depends to a great extent
on the accuracy with which the contributions of major sources have been identified. This is the
quantitative portion of the confidence interval which can be established by the normal propagation of
uncertainties. A less tangible or qualitative portion of the confidence level is established by internal
consistency of the available data. Both components are important and the highest level of confidence
in a proposed control strategy will be one based on a highly accurate and internally consistent data
set.

1.4 The Law and the Scientific Method
Substantial similarities exist between the task of establishing confidence in the results of a
source apportionment study and the task of establishing an adequate level of confidence in an
attorney's hypothesis relating an accused criminal to a crime. Both tasks require that a bridge of
circumstantial evidence be built relating the source (criminal) to a quantitative impact (crime). No
one sees, for example, a source contribute 2 µg/m³ of particulate material to a receptor. In the
absence of this kind of direct evidence relating a source to its impact, circumstantial evidence must
be used to establish a strong enough bridge to withstand extensive criticism. That is, source
contributions must be based on establishing other circumstances which afford reasonable inference of
a source's contribution. Source apportionment results from different approaches, meteorological,
geographical, operational, physical, and chemical data can be thought of as building blocks for this
bridge, while the internal consistency of the data acts as the cement which forms a strong
circumstantial bridge of information relating a source to its impact.
There are a large number of tools (Figure 1.4) that can be used to develop data to build a
bridge relating a source to its impact. Each has its strengths and limitations which must be considered
to develop a cost-effective source apportionment study plan. One of the objectives of this course is to
provide a sufficient level of understanding of the various receptor oriented source apportionment
tools so as to select those most appropriate for each specific airshed and/ or problem.
Figure 1.4: Source Apportionment Tools. The "tool box" includes chemical mass balance, physical
feature size, time and spatial variations, multivariate methods, regression analysis, emission
inventory, meteorology, dispersion modeling, optical microscopy, and scanning electron microscopy.

1.5 How Much, Not Who
The question most often asked is: who caused the air pollution? This, however, is not the key
question, since it generally is well known who causes the air pollution; i.e., all potential sources
contribute to pollutant levels. What isn't known in most cases, however, is: how much does each
source contribute to pollutant levels? (Figure 1.5). The first question (who?) is qualitative, while the
second question (how much?) is quantitative in nature. Road dust, for example, contributed 39 ± 2%
of the TSP in Portland, Oregon, while vegetative burning contributed only 14.5 ± 5% (Figure 1.5).
With a quantitative source impact data base such as this, a cost-effective control strategy can be
developed which addresses the most significant sources of the problem.






Figure 1.5: Source Contributions to Air Particulates in Downtown Portland. Soil and Road Dust,
39.0%; Vegetative Burn, 14.6%; Auto Exhaust, 9.7%; Volatilized Carbon, 8.1%; Unidentified (NH4,
H2O, etc.), 8.0%; Primary Industrial, 4.9% (Carbide Furnace, Ca, 2.0%; Aluminum Production,
1.35%; Steel Production, 0.94%; Hog Fuel Boilers, 0.22%; Ferromanganese Production, 0.21%;
Sulfite Process, 0.18%); Sulfate, 4.6%; Nitrate, 4.5%; Marine, 3.8%; Nonvolatilized Carbon, 2.2%;
Residual Oil, 0.8%.

1.6 Source Apportionment Alternatives
The basis for an effective air management program largely rests in a secure knowledge of
source contributions. Until quite recently, analysts had little choice but to apply dispersion modeling
methods. Yet as useful as they are, dispersion models have not met several critical air management
requirements. Principal among these are the inability of source oriented models to quantify source
impacts during short episodes, assess impacts in complex terrain, or apportion particulate mass in
complex urban airsheds. The inability of dispersion models to quantify source impacts to the
"background" aerosol being transported into the airshed, an aerosol that typically accounts for about
30-50% of the urban TSP mass, is another serious limitation.
Many of these constraints are linked to the inherent inability of emission inventories to
accurately reflect hourly and day-to-day emission variations typical of most urban settings.
Improvements to currently available dispersion models will require additional research to develop
urban source-oriented models able to cope with the complexity, and often random nature, of
atmospheric dispersion, particle deposition physics, emission variability, and secondary aerosol
formation pathways.
Dispersion models, however, represent only one family of source apportionment tools available
to the air quality manager. Microinventory and receptor modeling approaches have also been
applied.

Unlike dispersion models, receptor models "decode" ambient pollutant chemistry and
variability information to identify source impacts. Emission inventory and meteorological data are
not required. Figure 1.6 illustrates the relationship between source and receptor modeling, two highly
complementary techniques that collectively offer opportunities for a major advance in air quality
management practices. Each approach has its place in air resource management.
Figure 1.6: Relationship between Source and Receptor Modeling. In the diagram, an emission
inventory plus a dispersion model yields the impact at the receptor, while filter analysis plus a
receptor model yields the source contribution.
Dispersion modeling is the key through which the air quality benefits of alternative emission
control programs can be evaluated, potential new source impacts assessed and future predictions
prepared. Receptor models, on the other hand, offer distinct advantages in complex terrain, fugitive
dust studies or other situations in which dispersion models may either be inappropriate or not cost-
effective. Table 1.1 lists several impact assessment scenarios commonly encountered by regulatory
agencies within which receptor-modeling applications may be appropriate.

Table 1.1: Impact Assessment Scenarios - Source and Receptor Model Applications

Dispersion Models:
- Predictions of future air quality
- Analysis of alternative strategies
- Impact predictions for proposed new sources
- Analysis of impacts associated with changes in stack height, temperature, or volume

Receptor Models:
- Fugitive emission impacts
- Analysis of actual, worst case 24-hour impacts
- Identification of new, uninventoried sources
- Complex terrain/meteorology unsuitable for dispersion modeling
- Identification of impacts from one specific source among a group of sources with similar emission characteristics
- Impacts associated with sources having unique morphology, chemistry, or variability

1.7 Evolution of the Receptor Oriented Approach
Although receptor methods are generally considered to be new, they have actually been
developing over the past 15 years (Table 1.2) and their beginnings go back as far as the conservation
of mass principle on which the model is based. The mathematical foundation for receptor model
variability analysis was first established by Spearman in 1927 and later generalized by Thurstone in
1935 and 1947 for multiple factor analysis (Table 1.2). The first environmental application, however,
wasn't until 1956 when Lorenz applied empirical orthogonal functions to statistical weather
predictions. Blifford and Meeker applied factor analysis to a large scale air pollution data set and as
such were the first to apply a receptor model approach to aerosol source apportionment. The
following year Prinz and Stratman (Table 1.3) applied factor analysis to source apportionment of
suspended particulate mass and in 1969 Winchester and Nifong applied emission chemistry logic to
interpret ambient aerosol chemistry. Although Hidy and Friedlander first used elemental tracers for
source impact assessment in 1970, it wasn't until 1972 that Miller, Friedlander, and Hidy established
the formalism on which the current chemical mass balance (CMB) approach is now based. The
period from 1972 to 1978 was an exploratory period when different interpretive approaches were
suggested and a large number of aerosol studies conducted. Only one of the studies during this
period, the Portland Aerosol Characterization Study (PACS), was designed to generate the necessary
source and ambient field data required by a receptor modeling program (CMB). Watson helped to
establish many of the receptor modeling systematics in 1978 and a workshop on receptor model
approaches was held at the Quail Roost Conference Center in 1980 which clearly established
receptor modeling as a distinct discipline (Table 1.4). Another major receptor modeling milestone
initiated in 1980 was the use of the CMB method for dispersion model validation and as a routine
tool for airshed management by the Oregon State Department of Environmental Quality. The second
Quail Roost Conference which was convened in February of 1982 focused on receptor modeling
validation. Several reviews have been written in the past few years which can provide the reader with
more detailed historical perspective. The current extent of receptor model applications is reflected in
Table 1.5.


Table 1.2: Pre-1967 Activities

1927  Spearman, Spearman-Thurstone Approach to Factor Analysis
1935  Thurstone
1947  Thurstone, Multiple Factor Analysis
1956  Lorenz, Empirical Orthogonal Functions and Statistical Weather Predictions

Table 1.3: 1967-1978

1967  Blifford and Meeker, A Factor Analysis Model for Large Scale Pollution; First Application of Receptor Model, Multivariate
1968  Prinz and Stratman, Factor Analysis
1969  Winchester and Nifong, Chemical Emissions
1970  Hidy and Friedlander, Chemical Tracers
1972  Miller, Friedlander and Hidy, CMB Formalism Established
1972-1978  Large Urban Aerosol Characterization and Source Identification Studies:
  ACHEX (Southern California)
  RAPS (St. Louis, Missouri)
  NYSAS (New York, New York)
  Washington, D.C.
  Tucson, Arizona
  PACS (Portland, Oregon)

Table 1.4: 1978 and Beyond

1978  Watson, Receptor Modeling Systematics Established
1980  Quail Roost Workshop, Receptor Modeling Recognized as a Distinct Discipline
1980  Oregon State, CMB Methods Used as Routine Tool for Airshed Management
1981  APCA, First APCA Sessions on Receptor Models
Beyond  Development of Source Fingerprint Library and Catalog; Development of Unified
  Approach to Source Apportionment; Development of More Cost-Effective Methods;
  National Technical Documents

Table 1.5: Receptor Model Applications Inventory Summary* (1975-1981)

Receptor Model                            Number of Cities Studied   Studies in Support of SIP Development
Optical (Microscopy)                      136                        108
Chemical Mass Balance                      45                         27
Factor Analysis                            21                         12
Microinventory-Regression Analysis         41                          9
Target Transformation Factor Analysis       7                          0
Image Analysis                              3                          3

*Compiled by survey of U.S. EPA Regional Officers. Estimates represent a lower limit of the number
of studies actually conducted. Although the survey results are known to be incomplete, they
demonstrate the widespread use of these techniques.

1.8 Receptor Modeling Methodology
A receptor model is either a description of the physical world from the perspective of the
receptor or an equation describing the mathematical relationship between sources and features
measured at the receptor, as will be discussed in more detail in the following section. The process of
data interpretation used to either validate or modify current models or to establish new models is only
one part of an entire methodology (Figure 1.7). This methodology consists of such major components
as program design, sample collection and analysis, as well as data interpretation. All parts of this
methodology must be optimized to attain adequate source resolution and quantification as well as to
effectively use available resources. Receptor modeling is the act of implementing the receptor
oriented source apportionment methodology to either validate and/or modify existing models or to
establish new models.
Figure 1.7: Environmental Regulation Methodology

2.0 STATISTICS, MODELS, METHODS,
ASSUMPTIONS, AND RESOLUTION
2.1 Statistics and Empirical Models
Statistics is the branch of mathematics dealing with the analysis and interpretation of large
numerical data sets. Multivariate statistical methods search these large data sets for common
variability patterns between measurables which might be used to identify a common cause of
observed variability. The relationship between a statistically detected effect or pattern (common
variability) and a potential cause must be supplied, however, by the air pollution scientist based on an
understanding of the chemistry and physics of the airshed of interest.
Large data sets of appropriate features are required for the results of statistical analysis to be
valid. Until recently, the development of such data sets was considered to be impractical. Recent
advances in analytical techniques, however, have made it possible to characterize a large number of
chemical features associated with particulate material collected at a receptor. In the past decade,
numerous large data sets have been developed and analyzed by a variety of multivariate statistical
analysis tools such as linear regression, multiple regression, factor analysis, pattern recognition,
principal component analysis, target transformation factor analysis, cluster analysis, ridge regression
analysis, etc.
The statistical approaches mentioned above are often incorrectly referred to as empirical
models. It is important at the outset of this discussion to distinguish between a model and a tool
used to develop a model, as well as between different types of models.
Definition of a model: In physics, a model can take on two basic definitions. First a model can
be a physical description of the perceived real world such as a physical model of the atom or the solar
system. A model may also be mathematical in that it describes the interdependence of variables.
There are two types of mathematical models: those that relate variables based on a perceived physical
model and those that relate variables based on measurement experience (empirical).

Figure 2.1: Relationship of Different Types of Models Used in Air Pollution. Models divide into
physical and mathematical; mathematical models divide into empirical and perceived.
Thus, for the following discussion, mathematical models are divided into two categories:
perceived and empirical (Figure 2.1). A perceived mathematical model consists of a mathematical
expression relating variables in a conceptual model of a perceived physical world. An empirical
model consists of a mathematical description of the relationships between empirically measured
variables which is based on previous measurement experience. These models do not always evolve at
the same rate. At times a study of the systematics of empirical data can guide the development of a
conceptual model of the physical world, while at other times a conceptualized physical model can
guide the development of empirical data sets on which to build an empirical model and/or test a
perceived physical model.
Empirical: Relying or based solely on experiments or experience.
Empirical Techniques: A number of methods for calculating concentrations have been
devised based upon detailed analyses of actual measured concentrations. Thus, based on previous
empirical measurements of variables and detailed analyses of their interdependence, values of
specified dependent variables (acidity, source contributions, etc.) can be predicted by measuring
specified independent predictor variables.
The purest form of the empirical model approach is a statistical analysis of randomly collected
data without preconceived concepts of the interdependence of measured variables. This approach
would be extremely inefficient and is rarely taken. Instead, most studies consist of an approach that
falls somewhere in between a purely empirical approach and an approach defined by a generally
accepted physical model.
This more commonly taken approach might be called a directed, statistical analysis of
empirical data. A sufficient level of understanding of air pollution sources in specific airsheds is
usually available, for example, such that perceived physical models and impact hypotheses can be
readily generated. With this level of understanding, selection of the parameters to measure and
statistical approaches can readily be made. Depending on one's objectives, application of these
statistical analysis tools to large ambient air quality data sets may be used to either validate a
perceived physical concept of possible major sources in an airshed or they may be used to reveal new
insights into potential sources contributing to particulate levels.
2.2 Perceived Physical and Mathematical Models
Preconceived concepts of air pollution sources almost always exist for every airshed prior to
any study. This may consist of a generalized concept of material being emitted at a variable rate from
various sources and transported to a receptor under the influence of such factors as dispersion,
transformations (chemical and physical) and deposition; or it may be a more specific concept such as
source A is contributing 90% of the arsenic at receptor B.
An example of a perceived physical model is illustrated in Figure 2.2 which shows a concep-
tual model of an aerosolizable dust layer, its sources and sinks, as well as the forces acting on it. It is
also generally accepted that combustion and atmospheric reaction particulates are submicron in size
while mechanically produced particles, pollens, etc. are generally greater than one micron. In
addition, a great deal is known about the chemistry and variability of potential sources. Automobile
emissions from the combustion of leaded gasoline, for example, are known to contain lead and
bromine and to have a characteristic diurnal and weekly variability pattern dependent on typical
traffic volume. These and other pieces of generally accepted knowledge concerning air pollution in a
specific airshed make up what might be considered a perceived physical model.

Figure 2.2: Schematic Diagram of the Sources and Sinks of Aerosolizable Dust. Sources include
exhaust, track out, erosion, and other inputs; sinks include dissolution, settling, and wash off; the
aerosolizable dust layer exchanges material with an accumulation layer and the dynamic layer
through mechanical mixing, redistribution, and suspension forces.
This information is not often consolidated into a formal unified model. This, however, should
be the first step in receptor modeling of air pollution. The existing knowledge of the system under
study must first be defined, model hypotheses developed, and specific questions stated. Until a
perceived physical model is defined, a logical selection of the most cost-effective set of chemical
features to measure, sampling and analysis strategies, and data interpretation tools cannot be made.
The basic mathematical assumption of receptor analysis techniques is that the features used in
the data analysis are linearly additive as they are sampled at a receptor. This perceived receptor
model is stated (Figure 2.2b) mathematically with the following equation:
C_i = \sum_{j=1}^{p} S_{ij}     (2.1)

where C_i is the concentration of the ith feature as measured at the receptor and S_ij is the source
contribution of the jth source to the ith feature at the receptor. The total concentration of the ith
feature measured at the receptor is assumed to be a linear sum of the contributions of p sources. The
total concentration of lead at a receptor, for example, might be expressed mathematically by the
following equation:

C_{Pb} = S_{Pb,Auto} + S_{Pb,Smelter} + S_{Pb,RoadDust}

[Pb]^r = [Pb]^r_{Auto} + [Pb]^r_{Smelter} + [Pb]^r_{RoadDust}     (2.2)

5 µg/m³ = 3 µg/m³ + 1 µg/m³ + 1 µg/m³

where the superscript r is used to emphasize that these concentrations are all referenced to the
receptor. Equation 2.2 represents a specific mathematical model which describes the relationship
between the concentration of lead as measured at a receptor and the contribution of three sources to
lead levels.
There are hundreds of features that can be used to characterize an aerosol. These can range
from elemental or chemical composition to color and may include size, morphology, particle type,
diffraction characteristics, etc. It is reasonable to assume, for example, that the concentration of
elements such as Si, Ti, Mn, Fe, Pb, etc. as measured at the receptor, C_i, will be equal to the linear
combination of each of the p sources' contributions to that element at the receptor, S_ij. That is, their
mass is conserved as they are collected on a filter at a receptor. On the other hand, color is not
linearly additive, and some chemicals are reactive at the receptor and would not satisfy this linearity
assumption.
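
To make the linear additivity assumption of Equation 2.1 concrete, here is a minimal numerical
sketch in Python (using NumPy). The Pb row repeats the illustrative numbers of Equation 2.2; the
Br and Si rows are assumed values added for illustration only:

    import numpy as np

    # Hypothetical S_ij matrix: rows = features (Pb, Br, Si),
    # columns = sources (auto exhaust, smelter, road dust), in ug/m3.
    S = np.array([
        [3.0, 1.0,  1.0 ],   # Pb, matching Equation 2.2
        [0.9, 0.01, 0.02],   # Br (assumed)
        [0.1, 0.2,  6.0 ],   # Si (assumed)
    ])

    # Equation 2.1: C_i = sum over j of S_ij. Because mass is conserved
    # on the filter, each measured concentration is a linear sum.
    C = S.sum(axis=1)
    print(C)   # [5.0, 0.93, 6.3]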

2.3 Models and Methods
A model is generally considered to be a mathematical description of a perceived physical
world, or regularities in experimental data. There is currently, however, a great deal of confusion in
discussions and in the published literature as to what constitutes a receptor model. Such terms as
chemical mass balance, factor analysis, multivariate statistical methods, principal component
analysis, X-ray diffraction, automated particle analysis, regression analysis, etc. are casually used to
describe different receptor models. Little distinction is made between a chemical or physical feature
that is measured, the method used to measure the feature and the interpretive approach as listed in
Table 2.1. There is, in fact, only one basic receptor model which is in turn a subset of a more
fundamental model on which both dispersion and receptor models are based, i.e., the linearly additive
character of mass arriving at the receptor. This model is a mathematical description of a perceived
physical and chemical world.





Table 2.1: Measurement and Interpretation of Aerosol Features for Quantitative Source
Apportionment

Features and Analytical Methods:
- Elemental Concentration: Atomic Absorption Spectrometry; Inductively Coupled Argon Plasma;
  X-Ray Fluorescence; Neutron Activation Analysis
- Chemical Concentration: Liquid Chromatography; Ion Chromatography; X-Ray Diffraction;
  Wet Chemical Methods
- Particle Concentration: Polarized Optical Microscopy; Scanning Electron Microscopy;
  Automated Scanning Electron Microscopy

Interpretive Approaches (applicable across features):
- Variability Analysis (Multivariate Methods): Factor Analysis; Principal Component Analysis;
  Pattern Recognition
- Mass Analysis (Regression Analysis): Total Mass Tracer Regression Analysis; Tracer Mass
  Balance; Chemical Mass Balance
- Hybrid Methods: Target Transformation Factor Analysis

Receptor models can be subdivided, as indicated in Table 2.1, according to the feature
measured (chemical, particle type, etc.), the measurement method (microscopy, X-ray diffraction,
etc.) or by the interpretive approach. The latter, more fundamental approach as illustrated in Figure
2.3 will be used in this course. In this case, there are two fundamental approaches to the receptor
model; either a statistical analysis of the variability or a regression analysis of the mass. Variability
analysis methods such as factor analysis, cluster analysis, principal component analysis, etc., provide
valuable qualitative and semiquantitative insight into the nature of the sources and their contribution
to the variability, but are not able to provide quantitative information on source contributions to the
particulate mass. The regression analysis approach, on the other hand, is quantitative, both over the
total mass on a series of filters and over the chemical distribution on a single filter as with the
chemical mass balance method. Hybrid methods such as target transformation factor analysis are
combinations of the two basic approaches which assist with the quantification of the mass. There is
another category which includes other miscellaneous qualitative approaches such as enrichment
factors, trajectory analysis and procedures similar to the noncrustal Mn/V ratio used by Rahn.

Figure 2.3: Block Diagram Showing the Fundamental Interpretive Approaches to the Receptor
Model. The central relation is m_{ik} = \sum_{j=1}^{p} F_{ij} M_{jk}, where M_jk is the mass
contribution from the jth source, m_ik is the mass of the ith feature, p is the number of sources, and
n is the number of measured features. The approaches branch into variability analysis (time series,
spatial series, multivariate analysis), regression analysis on mass (total mass, chemical mass), hybrid
methods, and other miscellaneous approaches.
2.4 Model Assumptions
The approach taken in this section is similar to that presented by Watson at the Danvers
conference. It consists of a logical development of assumptions which are required to explain in more
detail the deviations associated with more general models. The approach taken is as follows:
1. Basic model assumptions are stated. They are labeled A1, A2, etc.
2. A conceptual model which these assumptions imply is stated mathematically and/or verbally.
These models are labeled CM1, CM2, etc.
3. Measurement models which are dictated by the conceptual model, or which have been
developed because they are practical, are specified with their basic assumptions. These
models are labeled MM1, MM2, etc.
4. Problems due to the violation of assumptions of the conceptual or measurement model are
recognized. These are labeled P1, P2, etc.
5. The conceptual or measurement model is modified and given a new label.
Though the receptor model evolution described here has taken place within the context of
suspended particulate matter studies on an urban scale, the conceptual model applies to all ambient
pollution. The first basic assumptions are:
A1: Compositions of source emissions are constant.
A2: Components do not react with each other, i.e., they add linearly.
A3: p identified sources contribute to the receptor.

CM1:  C_{ik} = \sum_{j=1}^{p} F_{ij} D_{jk} E_{jk}

The concentration of component i in the kth sample (C_ik) equals the sum of the products of
the fractional amount of component i in the emissions from source j (F_ij), the atmospheric
dispersion of emissions from source j for the kth sample (D_jk), and the total emission rate of
all components from source j which corresponds to sample k (E_jk).

For a source model, the values of F_ij, E_jk and D_jk are obtained through measurements and the
C_ik are calculated. For the receptor model, the values of the C_ik and F_ij are measured and the
D_jk E_jk products are calculated. The receptor approach results in:

CM2:  C_{ik} = \sum_{j=1}^{p} F_{ij} S_{jk}     for i = 1, n

The concentration of component i equals the sum of the products of the composition
fraction (F_ij) and the total contribution of source j (S_jk) to the receptor. n is the total number
of components measured.
The set of equations comprising CM2 is the chemical mass balance receptor model. It is
probably more appropriate to call it a mass balance model, since any property, e.g., crystalline
structure, optical appearance, or isotope ratios, can be used as a component if it can be associated with a
mass fractional composition of the source emissions. The mass balance model requires two
additional assumptions:
A4: The number of sources, p, is less than or equal to the number of components, n.
A5: The compositions of all p sources (the set of F_ij for each S_j) are linearly independent of each
other.
These assumptions provide a set of equations that can be solved by a least squares fit for the
various source contributions (S_jk) if the source compositions (F_ij) and the receptor concentrations
(C_ik) are known. The solution method is part of CM2. The measurement model required to provide
these data is:
MM1: Measure p source compositions and m receptor concentrations for n components with
routine measurement methods.
Now we find that several problems arise:
P1: CM1 needs exact values of F_ij and C_ik. Measurements are not exact; they are really
intervals.
By modifying CM2, this problem can be handled fairly easily:
CM3: Weight variables in the least squares fit so that less precise measurements have less
influence than more precise measurements. Propagate errors through the least squares
solution with the following additional assumption.
A6: Measurement errors are random, uncorrelated and normally distributed.
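
To make the least squares solution of CM2 with the CM3 weighting and A6 error propagation
concrete, the following is a minimal sketch in Python with NumPy. All source profiles, species,
ambient concentrations, and uncertainties below are hypothetical values chosen for illustration; this
is a sketch of the general technique, not the effective-variance algorithm used in EPA's CMB
software:

    import numpy as np

    # Hypothetical source composition matrix F (n components x p sources).
    # Rows: Pb, Br, Si, V; columns: auto exhaust, smelter, road dust.
    # Entries are mass fractions of each component in each source's emissions.
    F = np.array([
        [0.100, 0.200, 0.0010],   # Pb
        [0.040, 0.001, 0.0001],   # Br
        [0.010, 0.005, 0.3000],   # Si
        [0.001, 0.002, 0.0005],   # V
    ])

    # Hypothetical ambient concentrations C_ik (ug/m3) and 1-sigma uncertainties.
    C = np.array([0.50, 0.12, 3.10, 0.02])
    sigma = np.array([0.05, 0.02, 0.30, 0.01])

    # CM3: weight each equation by 1/sigma_i so that less precise
    # measurements have less influence on the fit.
    W = np.diag(1.0 / sigma)
    S, _, _, _ = np.linalg.lstsq(W @ F, W @ C, rcond=None)

    # Propagate measurement errors through the least squares solution
    # (A6: random, uncorrelated, normally distributed errors).
    cov_S = np.linalg.inv(F.T @ W.T @ W @ F)
    for name, s, var in zip(["Auto", "Smelter", "RoadDust"], S, np.diag(cov_S)):
        print(f"{name:9s} {s:7.2f} +/- {var ** 0.5:.2f} ug/m3")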
Two more problems present themselves:
P2: There are more sources than components. p>n (A4 is untrue)
P3: Several sources have the same compositions. (A5 is untrue)
Two solutions exist. One is to modify the measurement model.

MM2: Measure more components and make sure they have different concentrations in different
source emissions.
This model requires that new, non-routine measurements be developed and standardized. In
the absence of the capability to make these new measurements, CM3 must be further modified.
CM4: Group sources with similar compositions into source types. p then refers to the number of
source types.
This gives up a lot, for now the model cannot distinguish between individual contributors.
There are still other problems.
P4: Source compositions do not remain constant.
Friedlander discusses the case of a linear reduction of F_ij with time and incorporates it into:
CM5: The source composition at the receptor equals F_{ij}(1 - k_i \tau_i), where k_i is the decay rate
and \tau_i is the residence time of component i in the atmosphere.
This model imposes two additional assumptions.
A7: Emissions reach a uniform concentration throughout the airshed immediately (i.e., a short
time compared to the sample duration) upon introduction into the atmosphere.
A8: The decay rate is linear with time.
CM5 also modifies the measurement model.
MM3: Decay rates and residence times are measured in addition to the MM1 measurements.
Routine methods for MM3 do not yet exist.
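As a worked illustration of CM5 (all numbers assumed): for a component with decay rate
k_i = 0.1 h^-1 and atmospheric residence time \tau_i = 2 h, the composition reaching the receptor is
F_{ij}(1 - 0.1 x 2) = 0.8 F_{ij}, i.e., a 20% depletion relative to the fresh emission.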
In fact, the inability to successfully carry out MM1 has resulted in several additional
problems.
P5: It is not possible or too costly to measure the source compositions.
This has led to the development of models which make use of the variation of the ambient
component concentrations (C_ik) over time (the number of consecutive samples, k = 1, m). A linear
regression model results from CM2 with A1, A2, A3, A6, and:
A9: Each source type j has one component t_j, or "tracer", in its composition such that F_{t_j j} is
zero for all sources other than j.
CM6:

=
k t k t jk
j j
C F S ) / 1 (
The total pollution level (S
k
) equals the sum of the quotients of the ambient concentrations
of tracer components

(C
t
j
k
) and the fractional composition of the tracer for each source.
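
For a single source type, CM6 reduces to dividing the ambient tracer concentration by the tracer's
fractional composition in that source's emissions. A minimal Python sketch (the Br fraction and
ambient value are assumed for illustration, not measured values):

    # CM6 tracer sketch: automotive contribution via a Br "tracer".
    F_Br_auto = 0.04       # assumed mass fraction of Br in auto exhaust
    C_Br_ambient = 0.12    # assumed ambient fine-particle Br, ug/m3
    S_auto = C_Br_ambient / F_Br_auto
    print(f"Automotive contribution: {S_auto:.1f} ug/m3")   # 3.0 ug/m3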
P6: The number of source types, p, is unknown.
P7: Unique tracer components aren't available for all source types.

These are addressed by a factor analysis variational model.
CM7:  C_{ik} = \sum_{j=1}^{p} a_{ij} F_{jk}

where the a_ij and F_jk are related to, but are not the same as, the F_ij and S_jk of CM2.
This model is intended to determine p and the a_ij from the intercorrelations of concentrations.
It implicitly includes the effects of measurement errors. The intimate relationship between the mass
balance and variational models established here is important. While the variational models, under
certain conditions, can specify the number of source types and their probable compositions, these
results must be combined with the mass balance model to obtain source contributions.
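
As an illustration of how the variational model can suggest the number of source types p, the sketch
below (Python with NumPy; the data are synthetic, generated from two hypothetical sources)
examines the eigenvalues of the correlation matrix of the component concentrations. Roughly two
eigenvalues well above the rest point to p = 2:

    import numpy as np

    rng = np.random.default_rng(0)
    m, n = 200, 6                                # m samples, n components

    # Synthetic ambient data driven by two hypothetical sources,
    # C = scores @ loadings, plus measurement noise.
    scores = rng.lognormal(size=(m, 2))          # source strengths per sample
    loadings = np.abs(rng.normal(size=(2, n)))   # component pattern per source
    C = scores @ loadings + 0.05 * rng.normal(size=(m, n))

    # Intercorrelations of the n components over the m samples.
    R = np.corrcoef(C, rowvar=False)
    eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]
    print(np.round(eigvals, 2))   # expect ~2 dominant eigenvalues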

2.5 Receptor Model Limitations
The interaction between assumptions, conceptual models, measurement models and
challenges to assumptions may seem confusing, yet it is essential to the model development process.
In fact, the previous presentation has been substantially simplified by the failure to state many less
fundamental, but equally important assumptions. For example, the variational models make certain
assumptions about the statistical distributions from which measurements are drawn and minimum
sample sizes which may or may not be true in typical applications. The problem is complicated, but it
must be dealt with.
The limitations of a conceptual/measurement model combination depend upon which
assumptions are true under a given situation. Exceptions can be found to every one: 1) source
compositions are not constant, they vary with changes in process inputs, loads, and cycles; 2)
components do react with each other and systems are nonlinear; 3) one rarely knows exactly how
many sources are contributing to a receptor; 4) there are many more sources than components which
can be practically measured; 5) many sources have very similar compositions; 6) measurement errors
are not necessarily random, uncorrelated, or normally distributed; 7) emissions definitely do not
instantaneously reach a uniform concentration throughout the atmosphere upon leaving a source; 8)
nor do they exhibit decay constants which are linear with time; and 9) very few sources have their
own unique tracer components.
It would seem that we have just refuted the very foundation upon which receptor models are
based. Yet, for certain purposes they do work. The fulfillment of an assumption is a question of
degree rather than of yes or no. In many cases the deviations from basic assumptions can be tolerated
by the goals of the application. It is important to quantify those deviations, however, and to consider
them with the results of the models.
Consider, for example, the use of Br as an indicator for the contribution of tailpipe emissions
from the combustion of leaded gasoline to lead levels in an airshed dominated by a lead smelter.
Although Br may not be a "unique" tracer of the tailpipe contribution, over 95% of the ambient fine
particle Br is often contributed by this source. By using the tracer assumption, the worst that could
happen is that the automotive contribution to ambient lead levels might be overestimated by about
5%, if it weren't for the fact that Br is unstable and is rapidly converted from particulate to gaseous
species. Because of this added deviation from the assumption of constant relative composition, a
substantial uncertainty is introduced which may result in a final automotive source contribution
uncertainty of as much as 50% (relative). This may not be a serious problem, however, for a
receptor near the smelter where the automotive source may contribute less than a few percent of the
ambient lead. That is, relative uncertainty of 50% for a source which contributes a small fraction of
the total is rarely a limiting factor in developing an effective control strategy. This uncertainty,
however, may represent a more substantial problem at sites further from the smelter where the
automotive exhaust may represent a larger fraction of the total lead. In this case, other species and
methods can be used to improve the level of confidence in a specific source contribution.
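To see how these percentages combine (a sketch with assumed numbers): near the smelter, if the
automotive source contributes 2% of the ambient lead, a 50% relative uncertainty on that
contribution amounts to only 1% of the total lead, which is negligible for control strategy purposes.
At a distant site where the automotive source contributes, say, 40% of the lead, the same 50%
relative uncertainty amounts to 20% of the total, which may well matter.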


Note: Deviations from basic assumptions are usually a matter of degree. The utility of specific
assumptions is usually determined by the level of uncertainty introduced by deviations from
the assumptions and the accuracy required to make effective decisions and to implement
cost-effective action.

2.6 Spectroscopy, Resolution, and Signal to Noise Ratios
Urban aerosols are complex mixtures of chemicals contributed by many different sources.
One objective of an aerosol source apportionment study is to separate or resolve the contribution of
specific sources or source types from the collection of all possible sources, and quantify their
contributions. There are many similarities between this task of source resolution using receptor
models, and analysis of complex mixtures of elements or radionuclides using ultraviolet, X-ray, or
gamma-ray spectrometers. Many of the concepts such as signal-to-noise ratio and resolution, which
generally refer to the spectrometer, and preconcentration and preseparation, which refer to sample
preparation, have direct analogues in "source apportionment spectrometry". Examination of these
analogous concepts can yield new insight into the fundamentals of receptor model methods and
improve source resolution and study design.
Terms and Definitions
- Spectroscopy: the study of spectra through the use of the spectroscope.
- Spectrometer: an instrument used for measuring spectral wavelengths.
- Spectrometry: science of using the spectrometer.
- Resolve: to break up into separate parts.
Examples of varying degrees and types of resolution:
Spatial
- Unaided eye
- Magnifying glass
- Microscope
- Scanning Electron Microscope
Spectral
- Prism
- Diffraction Grating
- NaI(Tl) gamma- and X-ray spectrometer
- Ge(Li) and Si(Li) gamma- and X-ray spectrometers
Aerosol Source
- Emission inventory
- Dispersion modeling
- Receptor modeling
o Monthly average
o Weekly average
o Hourly average
Resolution is a term used in both spectroscopy and microscopy. It is a measure of a system's
ability to separate spectral lines or distinguish morphological features. The system with the
best resolution is able to observe the greatest amount of detail or fine structure; i.e., it
provides the investigator with more information.
Just as a spectrometer is able to resolve electromagnetic radiations emitted from discrete
components of a complex mixture of atoms or radionuclides, so is a receptor modeling study able to
resolve and quantify the influence of various sources in an airshed. In this latter case, the ambient
aerosol is equivalent to the sample and the chemical pattern or distribution is equivalent to the
electromagnetic spectrum. An aerosol spectrometer will have many components similar to
collimators, prisms, gratings, etc. commonly associated with electromagnetic spectrometers. The
equivalent components in an aerosol spectrometer include samplers, filters, sampling frequency,
chemical analysis procedures, etc. The statistical analysis of ambient chemical data with multivariate
or regression techniques is similar to spectral analysis programs which identify and quantify atoms or
nuclides in an unknown mixture. In the case of analytical spectroscopies, source resolution and
quantification can be improved by maximizing the signal to noise ratio.
The gross signal, for example, in the case of a multivariate analysis is variability in chemical
composition. The net source signal is that portion of the variability due to source variability while all
other sources of variability such as meteorology are noise. The signal (source variability) is reduced
substantially as the sampling and/or data averaging times are increased.
The major components of a source apportionment study can be separated into five general
steps analogous to an analytical spectroscopy procedure:
- Sample collection
- Sample preparation
- Recording of chemical spectra
- Qualitative analysis of spectral data
- Quantitative analysis by comparing resolved ambient spectral components with standard
source spectra.
The ability of any analytical spectrometer to separate and resolve the effects of different
elements or sources and to quantify their contributions depends strongly on the optimization of each
of these five steps. Each source, for example, will emit a unique spectrum of chemicals which will
combine in the atmosphere with emissions from other sources to form a complex mixture of
chemicals. It is the task of the "aerosol source spectroscopist" to collect a representative sample of the
ambient aerosol, measure its chemical spectrum (pattern or profile), separate and identify the source
influences and quantify their contributions.
The sample collection and preparation step as it pertains to source apportionment consists of
concentrating the aerosol through filtration and preseparating the aerosol into broad source categories
by sorting it into fine and coarse fractions. Preseparation with particle sizing such as with a
dichotomous sampler is the first step in source resolution. It effectively separates the influence of
sources into two broad categories, reduces the chemical interference in both the fine and coarse
fractions, substantially improves the procedure's source resolving capability, and increases the level
of confidence in the results.
The analogue to recording an electromagnetic spectrum is measurement of the chemical
spectrum. It is essential that the analysis methods and procedures be designed to meet the specific
source apportionment needs. The primary objectives of the analysis schemes are to: 1) characterize
as much of the mass as possible and thus measure the major chemical components, and 2) measure
key indicating elements. It would be difficult to resolve and quantify the contribution of residual oil
and automotive exhaust, for example, if the analytical sensitivity were insufficient to characterize V
and Ni, as well as Br and Pb. In general, the most cost-effective analytical tool for this purpose is X-
ray fluorescence. The optimum set of analytical tools for a specific airshed, however, will depend on
the aerosol sources and their expected chemistry. Selection of chemical species to measure requires a
thorough knowledge of the chemical characteristics of each potential source.
The signal for a source may be enhanced by increasing the uniqueness of a source's chemical
spectrum, improving the sensitivity for specific chemical species, and/or defining the variability of
the ambient aerosol in greater detail, i.e., maximizing variability. The uniqueness of a source's
chemical spectrum may be increased by measuring additional chemical features, increasing the
precision with which the spectrum is measured, and collecting samples when the impact from specific
sources of interest is maximum. The sensitivity for specific chemical species, of course, is increased
by maximizing the analytical sensitivity. The identity of a source and its contribution to variability
may be defined more precisely by defining the variability in the ambient aerosol features in more
detail through more frequent sampling.
Once a source has been qualitatively identified as a substantial contributor to the ambient
aerosol, it must be quantified by selecting either a specific element as an analyte element (tracer) or a
group of elements to compare with a source. Since the individual elements in each source are not
generally unique to that source, more than one element is usually required to define a unique pattern
which is fit through least squares methods to the ambient aerosol spectrum. The most precise source
apportionment should be obtained by fitting with the largest number of chemical features.
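To make this fitting step concrete, the following minimal sketch solves a two-source chemical mass
balance by ordinary least squares. The species list, source profiles, and ambient concentrations are
all invented for illustration; a real study would use measured, uncertainty-weighted profiles.

    # Minimal least-squares source apportionment sketch (all numbers hypothetical).
    import numpy as np

    # Columns are source profiles: the fraction of each measured species in the
    # particulate emissions of each source (auto exhaust, residual oil).
    F = np.array([[0.20, 0.001],   # Pb
                  [0.07, 0.000],   # Br
                  [0.00, 0.010],   # V
                  [0.00, 0.005]])  # Ni
    c = np.array([2.1, 0.7, 0.05, 0.025])  # ambient concentrations, ug/m3

    # Solve c = F s in the least-squares sense for the source contributions s.
    s, *_ = np.linalg.lstsq(F, c, rcond=None)
    for name, contrib in zip(["auto exhaust", "residual oil"], s):
        print(f"{name}: {contrib:.1f} ug/m3 of particulate mass")

Fitting all four species together, rather than one tracer at a time, makes the apportionment
overdetermined and allows the fit residuals to flag an inconsistent source profile.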
3.0 CHARACTERISTICS OF AMBIENT AND SOURCE
AEROSOLS

3.1 Chemical-Physical Model of an Airshed
A number of natural and anthropogenic sources all contribute to a complex mixture which
constitutes the ambient aerosol. This aerosol is composed of solids, liquids, and gases having
organic, inorganic, and radioactive components. This material originates from a number of area, line,
and point sources having different time and spatial dependencies.
A number of these aerosol features, such as size, chemistry, morphology, spatial, and time
variability, etc. may be measured and used to resolve this mixture into its component parts using
receptor oriented methods. One of the first steps in a receptor modeling study is to define and
characterize the subject to be investigated. The objective of this section is to review some of the
major features used to characterize an aerosol.
Aerosol: Particles (solids or liquids) suspended in a gaseous medium.
Particle size: The physical size of a particle is defined by its projected area diameter (the
diameter of a circle with the same projected area as that of an irregularly shaped particle), its Feret's
diameter (the particle's maximum dimension) or its Martin's diameter (the length of a line bisecting
the particle into two equal areas). Particle size may also be defined on the basis of its aerodynamic
characteristics. The aerodynamic diameter is defined as the diameter of a unit density sphere having
the same terminal settling velocity as the particle in question.
D_a = aerodynamic diameter
D_a = (18ηV_s/gC)^1/2
V_s = settling velocity (m/sec)
η = fluid viscosity (kg/m-sec)
g = acceleration due to gravity (9.81 m/sec²)
C = slip correction factor

The aerodynamic diameter can be estimated from the physical diameter as measured by
microscopy with the following equation:
D_a = D_p ρ^1/2
D_p = physical diameter
ρ = particle density
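As a worked example of this conversion, the short sketch below applies D_a = D_p ρ^1/2 to an
assumed dust particle; the diameter and density values are illustrative only.

    # Aerodynamic diameter from physical diameter and particle density,
    # using D_a = D_p * sqrt(rho) with rho in g/cm3 (relative to unit density).
    import math

    def aerodynamic_diameter(d_physical_um, density_g_cm3):
        return d_physical_um * math.sqrt(density_g_cm3)

    # A 2.0 um mineral particle of density 2.65 g/cm3 (assumed) settles like a
    # unit-density sphere about 3.3 um in diameter.
    print(aerodynamic_diameter(2.0, 2.65))  # ~3.26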
Chemical Characteristics include elemental composition, chemical state, molecular structure,
compounds, etc.
Physical Features include radioactive and stable isotopic composition, particle size, magnetic
properties, light scattering characteristics, etc.



3.2 Source Emission Characteristics
Accurate size-resolved chemical characterization of emissions from potential sources in each
airshed is an essential component in receptor modeling studies. Lack of this source information is
currently one of the major, if not the most significant, factors limiting further improvements in the
accuracy and precision of quantitative source apportionment using this methodology. Recall that this
approach to identifying and quantifying a source's contribution relies on a comparison of the
chemical and physical features of the ambient aerosol measured at the receptor with the features of
an aerosol emitted from a potential source. This comparison requires a balance of information if the
"bridge of evidence" relating the source to its impact is to be strong enough to support decisive
action.
Particle size is an important aerosol feature which is often used as the first step in resolving
aerosol sources. Particles generated by chemical (combustion, reaction, etc.) or physical means
(condensation, etc.) are formed predominantly in the less than 1.0 µm size range while particles
generated by mechanical methods (breaking, sanding, wear, etc.) as well as biological products such
as spores, pollens, etc. are generally greater than a few µm. The particle size distributions emitted
from two combustion sources, residential wood combustion (RWC) and lead from automotive
exhaust (AE), are compared in Figures 3.1 and 3.2. Almost all of the RWC particle mass is less than
1.0 µm while only about 75% of the AE lead particles are less than 1.0 µm.
The chemical character of a source can take the form of a list of percent chemical
compositions such as those determined in the PACS, or a normalized relative composition as
measured in Washington, D.C. by Kowalczyk, et al. A collection of compositions for several sources
forms a table which is called the source matrix. Each set of compositions representing a source may
also be presented as a histogram which may form a unique source pattern for a particular airshed.
This kind of presentation of source information is useful because it clearly forms a pattern or
"fingerprint" the shape of which is independent of the level of contribution.
A fundamental assumption of the receptor model is that the chemical composition listed in the
table or illustrated as a histogram or vector is representative of the chemical composition of that class
of source in the specific airshed of interest during the period that ambient filters are collected.
Although this at first glance may seem highly unlikely, it has been an extremely useful first
approximation.
Source emission information required for receptor modeling is substantially different from
that required by dispersion models. Dispersion models require knowledge of the absolute emission
rates as a function of time which can easily vary by orders of magnitude from one day to the next.
Receptor models, on the other hand, require knowledge of the relative chemical composition which is
usually independent of the amount of material emitted. Whereas parameters such as emission and
dispersion factors on which a dispersion model calculation depends may vary by as much as an order
of magnitude, those on which the receptor model calculation depends typically vary between 5% and
50%. The chemical compositions of the major components in the fine and coarse fractions of road dust,
for example, have been shown to vary by only about 5% within a city such as Portland, Oregon
(PACS).


Figure 3.1: Size Distribution of Woodsmoke Particles
Obtained with Electrical Aerosol Analyzer

Figure 3.2: Particle Size Distribution of Lead from Ambient and Vehicle Aerosols
For a general airshed study, all major sources of ambient aerosol mass and key indicating
elements should be identified and characterized both for the same size range as collected by the
ambient samplers and for the same elements as measured on the ambient filters. The exception to this
general rule is the case where only the contribution from a single source category having a relatively
unique chemical fingerprint is of interest. In this latter case, only those sources having similar
chemical features need be characterized.
Criteria for an ideal chemical fingerprint include the following:
- High concentration of key chemical species in the source's emissions.
- Low concentration of these same chemicals in other major sources.
- Low concentration of these key species in the natural background.
- High analytical sensitivity for key species.
- Minimal atmospheric modification.
Chemical species (elements, ions, organic compounds, etc.) that meet these criteria form a
special class called "tracer", "indicator", or "marker" compounds or elements which can be used by
themselves to determine quantitatively a specific source's contribution to air particulate mass. For
other species not meeting these criteria, the most accurate results will be obtained by fitting all of the
conserving components in the chemical distribution. Examples of key indicating elements and
compounds that have served as tracers for specific sources in previous studies are summarized in
Table 3.1.
Although these indicator species have proved useful in previous studies, they may not in other
airsheds since each new airshed may contain other sources of the same species. Radiocarbon
measurements in the fine fraction of the winter aerosol in Portland, Oregon, for example, were a
useful indicator of residential wood combustion while the same measurement in Medford, Oregon,
could not be used because of substantial interferences from other fine particle radiocarbon sources
such as veneer dryers, hogged fuel boilers, and charcoal manufacturing.
Table 3.1: Qualitative Source Impact Indicators

Source                       Indicator                Size Fraction           Analysis Method*   Preferred Sampling Method
Motor Vehicles               Pb, Br                   Fine, Coarse or Total   1                  Dichotomous
Soil and Road Dust           Si, Al                   Coarse                  1, 2, 3, 5         Dichotomous
Residual Oil Combustion      Ni, V                    Fine                    1                  Dichotomous
Tire Rubber                  Styrene-butadiene        Coarse                  3, 4               Dichotomous or Hi-Vol
                             rubber (SBR)
Vegetative Burning and       12C/14C analysis         Fine                    5                  Hi-Volume Impactor
  Fossil Fuel Combustion
Aluminum Production          F or F-                  Fine, Coarse or Total   1, 6               Dichotomous
Vegetative Burning           K/Fe > 10                Fine                    1                  Dichotomous
Limestone, Cement or Lime    Ca                       Coarse                  1                  Dichotomous
Paint Pigments               Ti/Fe > 0.3              Coarse                  1                  Dichotomous
Galvanizing and Refuse       Zn                       Fine                    1                  Dichotomous (often used with
  Incineration                                                                                   Sb, Cd, In for refuse combustion)
Diesel Emissions             Ba, or Pb with diesel    Fine                    1                  Dichotomous
                             or auto emissions
Marine Aerosol, Snow         Na, Cl                   Coarse                  1, 6               Dichotomous or Hi-Vol
  Control
Coal, Fly Ash, Non-          Se, As                   Fine                    1                  Dichotomous
  ferrous Smelters
Fertilizer Production        P                        Coarse or Total         1                  Dichotomous
Lead Smelter                 High Pb, Pb/Br < 0.1     Fine and Coarse         1                  Dichotomous

*Analysis Method Key:
1. Elemental analysis by XRF, INAA or AA
2. X-ray diffraction
3. Optical microscopy
4. Gas chromatography
5. SEM-XRF
6. Ion chromatography or colorimetric methods
Although literature values for source composition profiles can be quite useful, they must be
used with care and caution. The chemical composition of a source's emissions can vary with location
and may be a function of process cycle, combustion efficiency, pollution control equipment and their
efficiency, variations in the chemical composition of input materials and fuels, particle residence
times, emission release temperatures, volatility of species and many other factors. Unfortunately,
most of the chemical data available on emissions was not developed with receptor modeling needs in
mind. As a result, these data sets typically do not contain information on all the elements required,
were based on EPA Method 5 measurements which separate particles from the condensables, are not
size resolved, are reported in units not applicable to the receptor model needs, lack uncertainty
estimates, etc.
Source Composition Summary (Table 3.2)
a. Geological - soil dust, agricultural tilling, rock crushers, asphalt concrete, and coal fly ash all have
similar chemical fingerprints and have not been successfully resolved from each other. Mostly large
particles. Trace metals may be useful in resolving individual components of this category. The Al/Si
ratio is enhanced in the fine (<2.5 µm) fraction.
Table 3.2: Examples of Key Indicating Elements and Chemical Species
Crustal: Na, Mg, Al, Si, K, Ca, Ti, Sc, Mn, Fe, Ga, Rb, Sr, and Zr
Coal Combustion: Crustal elements plus fine particle enrichment elements such as Ge, As, Se, Sb, Ba, W, U, Hg and B
Oil Combustion: V, Ni, and Mo (fine particles)
Petroleum Refinery Catalytic Crackers: La, Ce, Nd, and other elements specific to the process used
Automotive: Br and Pb (fine particles)
Copper, Nickel and Lead Smelters: Cu, As, Cd, Pb, In, Sn, Sb
Marine Aerosol: Na, Cl
Vegetative Burning: OC, EC, K, Cl, Zn
Iron and Steel Industry: Fine Fe, Co, Cr, Ni, Mg
Short term time variability, spatial relations, and emission inventories can be used to resolve the
individual components of this precisely defined source category. Size, as well as measurement of Se,
As, W, Sb, Ge, and other species enriched in fine particle fraction may be used to resolve coal fly ash
from soil and road dust.
b. Automotive and Truck Exhaust
1. Fuels: leaded gasoline, unleaded gasoline, and diesel fuel.
2. Pb has been reported to vary from 10 to 70% in emissions from leaded fuel. Best estimate is
25%.
3. Tunnel studies have provided best estimate. A Pb value of 25% for leaded auto exhaust has
been suggested.
4. Value for airshed can be estimated from average Pb content for all gasoline in area and other
parameters. (See Appendix 3).
5. Substantial diesel impact indicated by high elemental carbon (soot) content.
6. Diesel exhaust may be calculated by emission inventory scaling to the automotive exhaust
contribution or may be included in a composite transportation source.
7. Mostly fine particles and some loss of halide elements with aging.
c. Stationary Fossil Fuel Combustion
1. Coal: Fingerprint looks similar to geological sources and may be included in this category.
Trace metals such as As, Se, W, Sb, etc., as well as size may be used to resolve this source
from crustal sources.
2. Residual Oil: Well characterized with Ni and V. Variable concentrations but easily
established for airshed based on fuel content and direct measurements. High primary sulfate
emissions. Primarily fine particles.
3. Distillate Oil: Low elemental content, poorly defined elemental pattern. Can be determined
by emission inventory scaling to residual oil.
4. Natural Gas: Low emissions, low elemental content, emissions may be dominated by
application. Difficult to quantify but usually quite low contribution.
d. Primary Emissions from Industrial Point Sources
e. Residential Solid Fuels (Wood and Coal)
Wood and coal will be used increasingly as residential fuels. Very high levels of pollutants with
current technology. CMB results are expected to have a high level of uncertainty because of the low
concentration of fingerprint elements and high variability. Carbon-14 can be used to quantify
wood combustion emissions (a minimal calculation sketch follows this list).
f. Secondary Aerosols
- Sulfate
- Nitrate
- Hydrocarbons
g. Natural Sources (pollen, spores, leaf fragments, biomass emissions).
h. Miscellaneous Sources (galvanizing, boiler cleaners, construction, etc.)
i. Background Aerosol
1. Material entering an airshed with the prevalent air mass which is not subject to control.
2. Marine
3. Continental
4. Anthropogenic
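As promised under item (e), wood smoke carbon can be apportioned with a two-member 14C mixing
calculation: fossil carbon is radiocarbon-dead while contemporary (wood) carbon carries the modern
14C level. A minimal sketch, with an invented measurement value:

    # Two-source radiocarbon mixing sketch (hypothetical activity values).
    def wood_carbon_fraction(sample_percent_modern, reference_percent_modern=100.0):
        # Fossil carbon is radiocarbon-dead, so the sample's 14C level relative
        # to a contemporary biomass reference gives the wood carbon fraction.
        return sample_percent_modern / reference_percent_modern

    # A fine-fraction sample at 60% of the contemporary 14C level implies that
    # roughly 60% of its carbon came from vegetative combustion.
    print(wood_carbon_fraction(60.0))  # 0.6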

3.3 Characteristics of Ambient Suspended Particulate Matter

3.3.1 Classification and Properties
The atmospheric aerosol is a dynamic system of solid and liquid particles suspended in the air.
The nature of suspended particulate matter (solid or liquid particles) can vary rapidly in size, shape,
concentration, and composition due to:
- Source variations
- Atmospheric reactions
- Mixing and dilution properties of the air
- History of the air
- Scavenging processes
They can vary in size from a few thousandths of a micron to several hundred microns. They can be
classified by their size, shape, composition, origins, effects or in terms of properties measured by a
particular technique. Special names and classifications have arisen from various disciplines, some of
which are tabulated in Table 3.3 along with their fundamental physical and chemical features.
There are two basic particle categories based on source: anthropogenic and natural. Each may
be further divided into primary particles, which are emitted directly as particles and secondary
particles, which are formed in the atmosphere by gas-to-particle conversions. Fly ash, soot, and dust
are examples of primary particles while sulfates and nitrates formed from their respective gaseous
oxides and organic particles formed from hydrocarbon vapors are examples of secondary particles.
Manmade particles formed as vaporous emissions which cool in the atmosphere are called
condensables.
Particles may also be categorized according to historical control measures. Traditional or
conventional sources, for example, are those historically included in emission inventories, such as
ducted exhausts from industrial or mobile sources. Among the nontraditional sources are fugitive
process sources, i.e., nonducted emissions escaping from buildings or industrial sites; and fugitive
dust sources, i.e., dust suspended by traffic, construction, agricultural activities, etc., or by wind
blowing over storage piles or exposed surfaces.
Other classification methods are based on measurement methods as noted in Table 3.3. The
total suspended particulate (TSP) mass, for example, is the particulate mass collected with an EPA
standard high-volume sampler (hi-vol) and usually represents the mass of particles less than about
100 µm in diameter. Inhalable, respirable, and fine suspended particulate mass are similarly defined
as indicated in Table 3.3.
Aerosol particles are formed by either condensation of gases and/or vapors, or by mechanical
and/or comminutive processes. They may be transformed by coagulation and condensation while
they are transported by air movement and dilution.
The settling velocities of aerosol particles vary by many orders of magnitude depending on
size, shape, and densities as indicated in Table 3.3. Their actual residence time will vary from a few
minutes for either 0.005 or 100 µm diameter particles in the troposphere to years for 0.5 µm diameter
particles in the stratosphere. The effect of these different residence times is that the impact on air
quality from small particles may be 10 to 100 times greater than for an equivalent mass of large
particles.
Many of the particles in the air have complex shapes. They can include "smoke aggregates,"
typical of incomplete combustion of hydrocarbon fuels, consisting of small spherical particles of
carbon having diameters of the order of 0.05 µm, as well as small flakes, fibers, crystals, or fragments. From
the point of view of their behavior during sampling or inhalation, such particles are classified in
terms of their equivalent aerodynamic diameters, i.e., the diameters of unit density spheres having the
same settling velocities.
Table 3.3: Names and Characteristics of Aerosol Particles

Name                     Unique Physical Characteristics    Effects                         Origin                    Predominant Size Range (µm)
Coarse Particle          ---                                ---                             Mechanical process        > 2
Fine Particle            ---                                ---                             Condensation process      < 2
Respirable Suspended     ---                                ---                             ---                       < 2
  Particle (RSP)
Inhalable Suspended      ---                                ---                             ---                       < 15
  Particle
Total Suspended          ---                                ---                             ---                       < 100
  Particle (TSP)
Dust                     Solid                              Nuisance, ice nuclei            Mechanical dispersion     > 1
Smoke                    Solid or liquid                    Health and visibility           Condensation              < 1
Fume                     Solid                              Health and visibility effects   Condensation              < 1
Fog                      Water droplets                     Visibility reduction            Condensation              2-30
Mist                     Water droplets                     Visibility reduction;           Condensation or           5-1,000
                                                            cleanse air                     atomization
Haze                     Exists at lower RH than fog;       Visibility reduction            ---                       < 1
                         hygroscopic
Aitken or Condensation   Nuclei for condensation at         ---                             Combustion, atmospheric   < 0.1
  Nuclei (CN)            supersaturation > 300%                                             chemistry
Ice Nuclei (IN)          Very special crystal structure     Cause freezing of               Natural dusts             > 1
                                                            supercooled water droplets
Small Ions               Stable particle with an            Carry atmospheric               All sources               > 0.0015
                         electric charge                    electricity

The results of size distribution studies are commonly plotted as cumulative frequency curves.
Consider, for example, the mass distribution of particles from London air sampled close to traffic. The median
mass diameter is approximately 1 µm, i.e., half the mass of material collected is contained in
particles having effective (aerodynamic) diameters under 1 µm. These specific results indicate a log-
normal distribution of particle size, but Willeke & Whitby (1975) have shown that distributions are
often multimodal.
Another example is the particle size distribution of the total average aerosol measured in Los
Angeles in 1969, plotted normalized to number, surface, and volume. The apparent area under the
curves is proportional to the number, surface, and volume in a given size range. These curves were
constructed assuming uniformly dense, spherical particles. The following important observations can
be made from this fairly typical urban aerosol distribution.
- Most particles are approximately 0.01 µm in diameter.
- Most of the surface area is provided by particles averaging 0.2 µm diameter.
- The volume (or mass) distribution is bimodal; one mode is about 0.3 µm diameter, the other
is about 10.0 µm diameter.
- The mass of fine particles (<2 µm) is almost equal to the mass of coarse particles (>2 µm).
- The number of particles decreases sharply with increasing size.
The atmospheric aerosol size distributions actually consist of three separate modes:
- nuclei (< 0.1 µm)
- accumulation (> 0.1 µm but < 2.0 µm)
- coarse particle (> 2.0 µm)
Although three distinct maxima may be observed in the volume or mass distribution, the bimodal
distribution is more typical.
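These modes are commonly idealized as lognormal distributions; the sketch below sums two
hypothetical lognormal modes (accumulation near 0.3 µm, coarse near 10 µm) into a bimodal volume
distribution. All mode parameters are invented for illustration.

    # Hypothetical bimodal lognormal volume distribution, dV/dlogD.
    import numpy as np

    d = np.logspace(-2, 2, 400)  # particle diameters, 0.01 to 100 um

    def lognormal_mode(d, volume, median, sigma_g):
        # dV/dlogD of one lognormal mode with geometric standard deviation sigma_g
        return (volume / (np.sqrt(2 * np.pi) * np.log10(sigma_g))
                * np.exp(-np.log10(d / median) ** 2 / (2 * np.log10(sigma_g) ** 2)))

    dv_dlogd = (lognormal_mode(d, volume=30.0, median=0.3, sigma_g=2.0)      # accumulation
                + lognormal_mode(d, volume=30.0, median=10.0, sigma_g=2.2))  # coarse

    # The volume distribution peaks near 0.3 and 10 um even though, by number,
    # particles near 0.01 um would vastly outnumber both modes.
    print(d[np.argmax(dv_dlogd)])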
The origin of each mode can be associated with various aerosol sources, and formation and
removal mechanisms as schematically illustrated in Figure 3.3. Particles in the nuclei mode, from
about 0.005 to 0.1 µm in diameter, account for most of the particles by number, but rarely account
for more than a few percent of the total mass. These particles are formed by condensation of hot
vapors during combustion and gas-to-particle conversion in the atmosphere.
Brownian motion of particles smaller than a few tenths of a micrometer in diameter, resulting
from gas-molecule impact, causes the particles to diffuse. If the number concentration is high
enough, this diffusion results in collision and coagulation to larger sizes. This process limits the
maximum nuclei concentration to less than a few million per cm³ at distances further than a few
hundred feet from sources. It also tends to cause particles smaller than a few hundredths of a
micrometer in diameter to coagulate rapidly with particles a few tenths of a micrometer in diameter.
A substantial nuclei mode in the size distribution is likely to indicate the presence of fresh
combustion aerosol. This is particularly apparent in aerosols sampled near heavily traveled roadways.
Whitby, et al., for example, found over 25 µg/m³ of aerosol in the nuclei mode in regions bordering
the Harbor Freeway in Los Angeles.
The accumulation mode, from 0.1 to about 1 or 2 µm, usually accounts for most of the aerosol
surface area and a substantial part of the aerosol mass. This mode originates from coagulation of
particles in the nuclei mode and from heterogeneous nucleation (condensation of one material on
another) of secondary aerosols. The sharp decrease in number of particles greater than 0.3 µm in
diameter limits the mass transferred from the accumulation mode to the coarse particle mode.
The coarse mode, from approximately 1 or 2 µm and up, is usually formed by mechanical
processes such as grinding, windblown dust, sea spray, volcanoes, plant particles, etc.
The origin, behavior and removal processes for nuclei and accumulation modes are distinctly
different from those for the coarse mode. This difference can be used effectively in source
apportionment studies to separate particles from some source categories through size-selective
sampling.


Figure 3.3: Idealized Schematic of the Distribution of Particle Surface Area of an
Atmospheric Aerosol (particle diameters 0.002 to 100 µm; nuclei, accumulation, and
coarse modes with their formation and removal mechanisms)

The relative abundance of the fine and coarse mode aerosol can change rapidly in terms of
chemical composition, physical properties and mass, as illustrated in Figures 3.4 and 3.5, depending
on the nature of the contributing sources. Generally, the fine particle fraction is composed of
anthropogenic source aerosols which are more hazardous to public health, reduce visibility, and contain the
majority of the secondary aerosols. Coarse particles tend to be quite different in origin and chemical
composition. The chemistry of the coarse particle mode, for example, is dominated by crustal
elements such as Na, Mg, Al, Si, K, Ca, Ti, Mn, Fe, and Sr while the fine fraction is enriched in Pb,
Br, C, sulfate, nitrate, ammonia and other anthropogenic elements. Fugitive dust sources from natural
wind entrainment of soils, dust entrained by vehicle movement and emissions from industrial
material handling will typically dominate the coarse particle mode, although emissions from
incomplete combustion (fly ash), plant tissues and grinding operations may contribute substantially
to this mode.

3.3.2 Typical Ambient Aerosol Chemical Composition
Although it is possible to investigate the composition of individual particles, most chemical
composition data are usually derived from larger samples collected for the determination of total
suspended particulate mass. Among the principal components are carbon, tarry material
(hydrocarbons, soluble in organic solvents such as benzene), water soluble material (such as
ammonium sulfate), and insoluble ash (containing small amounts of iron, lead, and a wide variety of
other elements).
The relative proportions of these constituents vary widely, depending on the types of sources
in a specific airshed. This wide variability in chemical composition is illustrated with the data from
London collected prior to the 1956 Clean Air Act, and with typical fine and coarse chemical
concentrations in selected U.S. cities, which list ambient aerosol chemical composition for several
cities during short-term sampling periods (each chemical distribution can be reasonably explained in
terms of known likely sources).
Typical U.S. urban aerosol trace element ranges and typical values show, for some elements,
as much as three orders of magnitude variation in concentration. Lead can vary by as much as five
orders of magnitude when rural, remote, and urban airshed concentrations are compared. Figure 3.6 illustrates the
typical urban aerosol composition (1973) in the form of a pie chart while Figure 3.7 compares this
same typical urban data in the form of a histogram with other ambient data collected in Portland
during the PACS. This same data could also be represented by a single data point in multi-
dimensional elemental space.
The chemical data on suspended particulate matter developed to date is the result of both
routine monitoring and specific research efforts. The largest volume of data over the longest period
has been accumulated by the routine monitoring operation of the National Air Sampling Network
(NASN). The analysis performed on these samples was limited to the determination of a few ionic
and atomic species and to benzene extractions to indicate the presence of organic material. These
measurements, for example, did not include silica, silicates, oxygen, high molecular weight organic
materials or elemental carbon. The more recent inhalable particulate sampling network will provide
more extensive elemental data on both fine and coarse particles.


Figure 3.4: Incursion of aged smog aerosol from Los Angeles at the Goldstone tracking station in
the Mojave Desert in California. Note the buildup in the accumulation mode. Sverdrup et al.
Obtained from AIRBORNE PARTICLES, National Research Council, National Technical
Information Service, PB-276 723, page 76

Figure 3.5: Sudden growth of the coarse particle mode due to local dust sources measured at the
Hunter-Liggett Military Reservation in California. This shows the independence of the
accumulation and coarse particle modes. Whitby and Sverdrup, unpublished. Obtained from
AIRBORNE PARTICLES, National Research Council, National Technical Information Service,
PB-276 723, page 77

Figure 3.6: Chemical Composition of Typical Urban Aerosol
(Carbon 34.5%, Oxygen* 26.5%, Silicon 8.6%, Sulfur 4.3%, Calcium 4.3%, Hydrogen 4.3%,
Aluminum 3.4%, Iron 3.4%, Nitrogen 2.6%, Miscellaneous Elements 2.5% (Ti, Cl, V, Cr, Mn, Ni,
Cu, Zn, B), Sodium 1.7%, Magnesium 1.7%, Potassium 1.7%, Lead 1.0%)
*Percent oxygen estimated from oxides of measured elements

Figure 3.7: Histograms of Chemical Composition of Typical Urban Aerosol
Before recent studies of composition versus particle size and prior to the understanding of the
bimodal volume distribution, it was common to assume a chemically well-mixed system. Current
particle control strategies, for example, are based implicitly on this assumption. Close examination of
more recent dichotomous data shows that the fine fraction is largely composed of sulfate, nitrate,
lead, bromine, and carbon. The coarse particles consist largely of mechanically produced substances
such as soil or rock dust, road and tire dust, fly ash and sea salt, and are chemically dominated by Al, Si,
Ca, Fe, Ti, and K. The fact that most combustion, condensation, secondary aerosols, and post-control
industrial process emissions are found in the fine particle fraction provides strong incentive to use
size segregating dichotomous samplers to isolate these sources from fugitive dust impacts associated
with coarse mode particulates.
The hygroscopicity and deliquescence of some atmospheric particles can affect substantially
the amount of an aerosol deposit explained by typical chemical analysis and apportioned to specific
sources. Deliquescent salts undergo a sudden phase transition from a dry crystal to a solution
droplet when the relative humidity (RH) exceeds that of the saturated solution of the highest hydrate
of the salt. Most hygroscopic and deliquescent particles appear in the accumulation mode (particles <
1 µm in diameter) with the notable exception of sea salt particles, for which a substantial fraction may
be larger than 2 µm in diameter. Deliquescence points of several typical atmospheric substances are:
ammonium sulfate, 80%; sodium sulfate, 86%; sodium chloride, 75%; and ammonium nitrate, 62%.
Hygroscopic compounds, on the other hand, absorb water until they become solutions at equilibrium
with the ambient humidity. They exhibit monotonic growth in size as RH is increased. Examples of
hygroscopic atmospheric substances are sulfuric acid, glycols, sugars, organic acids and alcohols.
Measurements of impure atmospheric particles by optical, mass, or microwave techniques
indicate that most urban fine particulate material is often hygroscopic, occasionally deliquescent and
rarely hydrophobic. The effect of relative humidity on the aerosol mass attributed to water was
demonstrated in a California study. This study showed that about half of the fine particulate mass
was water at 80% RH and about 5 to 10% of the mass was water at an RH of 50%. Thus, unless
water is measured, a substantial portion of the fine particle mass is likely to be unexplained.
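A quick arithmetic check using the fractions quoted above shows the size of this potential
mass-closure gap (the fine mass loading itself is assumed):

    # Unexplained water mass at two relative humidities, using the fractions
    # quoted above for the California study (fine mass loading is assumed).
    fine_mass = 30.0         # ug/m3, assumed gravimetric fine particle mass
    print(0.50 * fine_mass)  # ~15 ug/m3 of the deposit is water at 80% RH
    print(0.075 * fine_mass) # ~2 ug/m3 at 50% RH (midpoint of the 5-10% range)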
Control strategies based on the TSP air quality standard have succeeded in reducing emissions
from traditional sources in the United States from 22 million tons in 1970 to 12 million tons in 1977.
Data obtained at urban sites indicate a substantial reduction in ambient TSP levels from 80 µg/m³ in
1974. More recent data suggest an 8% TSP reduction from 1972 to 1977 but a relatively unchanged
TSP between 1975 and 1977.
Although the available data on TSP show an encouraging downward trend, concentrations of
fine particles have been increasing. In addition, even though emissions and ambient levels of TSP
have decreased substantially since 1970, many locations are not in compliance with either the
primary or secondary standards. Typically, violations occur in areas that vary greatly in terms of
terrain, meteorological conditions, and emissions sources. They occur in states where coal is a major
fuel (for example, Illinois, Ohio, Michigan and Pennsylvania), in states with petrochemical
processing plants (Texas and Louisiana), in states with smelters (Washington, Montana, Arizona and
New Mexico) and in agricultural states (Colorado, Kansas and Missouri).



3.3.3 Ambient Carbonaceous Aerosol
Carbonaceous material present in the atmospheric aerosol represents one of the most abundant
categories of chemical species and is comprised of a number of organic compounds, elemental
carbon, and inorganic carbon (carbonate salts). Approximately 10-30% of the particulate mass in a
typical urban airshed is carbonaceous. Organic carbon forms the largest fraction, constituting 50-70% of the
carbon, with the rest as elemental carbon. Less than 2% of the total carbon is in the form of carbonate;
therefore, the term carbonaceous aerosol will be used to address only organic and elemental carbon
in the rest of the discussion.
The importance of carbonaceous aerosol is three-fold. Carbonaceous aerosol makes up a large
fraction of the total aerosol loading, it ranks high in terms of its potential hazards to health, and it may
play an important role in various atmospheric processes. The health effects of carbonaceous aerosol
are further emphasized by its size distribution. The Task Group on Lung Dynamics (1966), among
others, has pointed out the importance of aerosol size distribution in determining its deposition in the
respiratory tract. Investigators have found, for example, that PAH aerosol concentrations are not only
highly dependent upon particle size but also reach their maximum in the respirable size range.
Carbonaceous aerosol may also be relevant to several atmospheric processes. It has been
pointed out that surface active organic compounds have an effect on uptake or evaporation of water,
thus influencing the process of cloud formation. Light-absorbing aerosols such as elemental carbon
are important in the radiation balance determining the temperature of the earth. Carbonaceous
aerosols from vehicle exhaust and other combustion processes have been shown to have large surface
areas (up to 100 m²/g or more), suggesting a porous structure and making them efficient adsorbers
of other materials. Scanning Electron Micrographs (SEM) of soot material confirm the agglomerated
sphere structure of elemental carbon.
Organic aerosols are the result of direct emissions into the atmosphere (primary component)
and the atmospheric reactions (secondary component) usually involving photochemistry. Organic
aerosol found in the atmosphere is comprised of hundreds of compounds whose analysis is laborious,
requiring sophisticated equipment and laboratory techniques. The classes of identified primary
organic compounds include linear and branched alkanes and alkenes, substituted benzenes and
styrenes, phenols, cresols, phthalates, fatty acids, carbonyl compounds, and some pesticide
compounds.
The formation of organic aerosol as a result of photochemical reactions of hydrocarbons,
ozone and oxides of nitrogen has been observed in urban and rural atmospheres. Classes of
secondary organic aerosol compounds include aliphatic organic nitrates, carboxylic acids (e.g.,
adipic and glutaric acid), terpene-derived oxygenates, and nitrate esters.
Elemental carbon (EC; also referred to as graphitic carbon, particulate elemental carbon, soot,
free carbon or carbon black) is produced as the result of incomplete combustion. Sources of
elemental carbon include industrial emissions from combustion processes, vehicle exhaust, and space
heating. Elemental carbon is primary in origin.
Typical urban and rural concentrations of organic and elemental carbon are compared in
Table 3.4.

Table 3.4: Annual Average Carbon Concentration and Standard Deviation for Urban and
Rural Locations in µg/m³

                   OC          EC          TC           SUL          TSP
Urban (46 sites)   6.6 ± 2.5   3.8 ± 1.3   10.3 ± 2.4   10.0 ± 4.1   79 ± 26
Rural (20 sites)   2.4 ± 0.8   1.3 ± 0.5   3.7 ± 1.2    7.4 ± 3.8    31 ± 13
4.0 AN OVERVIEW OF STATISTICAL
AND EMPIRICAL MODELING
4.1 Assumptions
The basic assumption of receptor analysis techniques is that the features used in the statistical
analysis be linearly additive as they are sampled at a receptor. This model is stated mathematically by
the following equation:

    C_i = Σ (j=1 to p) S_ij                              (4.1)

where C_i is the concentration of the ith feature as measured at the receptor and S_ij is the source
contribution of the jth source to the ith feature at the receptor. The total concentration of the ith
feature measured at the receptor is assumed to be a linear sum of the contribution of p sources. The
total concentration of lead at a receptor, for example, might be expressed mathematically by the
following equation:
    C_Pb = S_Pb,Auto + S_Pb,Smelter + S_Pb,RoadDust

    [Pb]^r = [Pb]^r_Auto + [Pb]^r_Smelter + [Pb]^r_RoadDust

    5 µg/m³ = 3 µg/m³ + 1 µg/m³ + 1 µg/m³
where the superscript r is used to emphasize that these concentrations are all referenced to the
receptor.
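In matrix form, Equation 4.1 simply says that each measured feature is the row-sum of per-source
contributions. The toy check below reuses the lead numbers from the example above and adds an
invented Br row:

    # Equation 4.1 as a row-sum: C_i = sum over j of S_ij.
    import numpy as np

    # Rows are features (Pb, Br); columns are sources (auto, smelter, road dust).
    # The Pb row repeats the worked example above; the Br row is invented.
    S = np.array([[3.0, 1.0, 1.0],   # Pb contributions at the receptor, ug/m3
                  [0.9, 0.1, 0.0]])  # Br contributions at the receptor, ug/m3
    C = S.sum(axis=1)                # total concentrations measured at the receptor
    print(C)                         # [5.  1.]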
There are hundreds of features that can be used to characterize an aerosol. These can range
from elemental or chemical composition to color and may include size, morphology, particle type,
diffraction characteristics, etc. It is reasonable to assume, for example, that the concentration of
elements such as Si, Ti, Mn, Fe, Pb, etc. as measured at the receptor, C_i, will be equal to the linear
combination of each of the p sources' contributions to that element at the receptor, S_ij. That is, their
mass is conserved as they are collected on a filter at a receptor. On the other hand, color is not
linearly additive and some chemicals are reactive at the receptor and would not satisfy this linearity
assumption.
Two additional conditions must be satisfied to a reasonable degree if these tools are to be
useful to the air pollution scientist. The first condition is that the characteristic features selected for
analysis at the receptor must be known for each potential source. Assume, for example, elements A
and S are shown to be highly correlated through a statistical analysis of the receptor data. This
information, however, will be useless unless the ratio of these elements is also known for each of the
potential sources. This information is also necessary to determine which of the hundreds of potential
features to measure.
The second condition states that the relative composition of the conserving species must be
conserved during transport from the source to the receptor. This condition must be met to confidently
relate a controllable source name such as automotive exhaust, blast furnace, etc. to statistical
observations. If the ratio of elements A and S measured in the ambient aerosol is substantially
different from what is known or expected from a potential source, then the level of confidence in
associating the observed statistical feature pattern to an assumed pattern will be low.

4.2 The Signal: Variability
Multivariate statistical analysis techniques interpret data or feature variability. These methods
assume that chemical features emitted from a source will vary in a common manner. The method's
objective is then to identify features as measured at the receptor which have similar patterns of
variability as might be expected from what is known about possible sources. It is thus essential to
separate the variability caused by source specific characteristics (signal) from variations caused by
other factors such as meteorology, sampling or analysis artifacts, etc. Feature variability is thus the
subject of interest, source variability is the signal of interest and all other causes of variability
represent noise. Since source variability is the primary subject of interest, it is essential to maximize
the signal to noise ratio; that is, to obtain data measurements that are based on a time scale
sufficiently short to measure variations in the data caused by the variability in source-specific
characteristics. Two hour sampling intervals will clearly define the variability in traffic patterns
while much of the variability signal will be lost with a 24 hour sampling period. Most of the common
influence of meteorological parameters on variability such as wind speed, wind direction, inversion
height, etc. can be minimized by using data which has been normalized to total aerosol mass.
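The point about sampling interval can be demonstrated with a synthetic diurnal tracer. In the sketch
below, a traffic-like signal with two daily peaks retains most of its variance under 2-hour averaging
but loses nearly all of it under 24-hour averaging; every number is synthetic.

    # Effect of averaging time on the source-variability signal (synthetic data).
    import numpy as np

    rng = np.random.default_rng(0)
    hours = np.arange(24 * 30)  # 30 days of hourly observations
    # Traffic-like tracer: 12-hour periodicity (two peaks per day) plus noise.
    x = 2.0 + np.sin(2 * np.pi * hours / 12.0) + 0.3 * rng.standard_normal(hours.size)

    def block_average(x, n):
        return x[: x.size // n * n].reshape(-1, n).mean(axis=1)

    print(np.var(block_average(x, 2)))   # 2-hour averages: diurnal signal retained
    print(np.var(block_average(x, 24)))  # 24-hour averages: signal averaged away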

4.3a Two Statistical Approaches
There are two general approaches to multivariate analysis. The first, regression analysis,
generally yields a quantitative relationship between a dependent variable such as TSP and
independent variables such as elemental composition; while the other general approach, correlation
analysis, is based on interrelationships between independent variables such as elemental composition
and generally provides only qualitative information about source contributions.
Regression techniques will answer the following kind of questions:
1. How does TSP (dependent variable) relate to changes in the concentration of Pb, Br, V, Ni,
Fe, Si, Al, Mn, carbon, sulfate, etc.?
2. Can an equation be developed that will enable us to predict values of TSP as a linear function
of the concentration of specific features in a data set?
3. How strong is the overall relationship between TSP and these features?
4. Is the relationship statistically significant?
5. What is the relative influence of each feature (Al, carbon, sulfate, etc.) on the variation in
TSP and are these separate influences statistically significant?
The questions which correlation methods such as factor analysis can answer are as follows:
1. Are the features measured really independent or can they be represented by a smaller number
of independent variables? The nine elements Al, Si, Ti, V, Mn, Fe, Ni, Br and Pb are often
measured in aerosol studies and represent possible independent variables. They, however, are
not often independent since their ambient concentrations are generally dominated by three
sources: residual oil, road dust and automotive exhaust. Thus, the variability in the data set
composed of the above nine elements might be explained by the variability of the three
sources.
2. What is the association of these elements?
3. How much of the data variability is accounted for by the independent factors?
Linear, multiple and ridge regression are examples of the first approach. Pattern recognition,
principal component, factor and cluster analysis are all examples of the correlation approach (Figure
4.1). Target transformation factor analysis is also part of this latter category in principle, but is
usually considered to be a hybrid of the two approaches based on the way it is used in practice. The
data table generated by the analysis of particulate material collected at a receptor is the heart of any
multivariate analysis. Because these data tables may contain hundreds of thousands of numbers, it is
necessary to use summary terms to characterize specific features of the data set (Figure 4.2). These
include means, variances, elemental ratios, correlation coefficients, regression coefficients, factor
scores and loadings, etc. which result from various approaches to statistical analysis of the data. The
objective of multivariate analysis is to use these statistical data summaries to infer a cause and effect
relationship between source emissions and particulate impacts at a receptor.
Figure 4.1: Basic Multivariate Statistical Analysis Tools
    Regression: linear, multiple, ridge
    Correlation: factor analysis, pattern recognition, principal component analysis, cluster analysis
    Hybrid: target transformation factor analysis

Figure 4.2: Data Summary Terms Used in Multivariate Analysis
    Univariate analysis: mean, standard deviation
    Bivariate analysis: regression coefficient (slope), intercept, correlation coefficient
    Multivariate analysis: partial regression coefficients, intercept, factor loadings, factor scores, etc.


4.3b Bivariate Linear Regression
Bivariate linear regression analysis is the simplest and most common form of statistical
analysis. Yet it can provide much of the same insight as might be obtained from the more
sophisticated and complicated multivariate analysis approaches. Because this approach can yield so
much easily interpreted information with minimal effort, it is highly recommended
as the first step of any receptor analysis approach to source apportionment.
The basic objective of a bivariate regression analysis is to determine if a linear relation exists
between two variables, and if it does, define the regression parameters, the intercept and regression
coefficient (slope). The general form of the regression equation is given in Equation 4.2:
Y = a + bX (4.2)
where Y is considered the dependent variable, (a) the intercept, (b) the regression coefficient and X
the independent variable. The degree to which the data fits a linear relationship is defined by the
correlation coefficient which is a measure of the scatter of the data about the linear least squares fit
of the data. If two variables are highly correlated as, for example, Br and Pb often are, it indicates
that a single common source is probably responsible for their variability.
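A minimal sketch of this first step, regressing Br on Pb for a handful of samples (synthetic values
constructed around a Br/Pb slope of about 0.3):

    # Bivariate linear regression of Br on Pb (synthetic ambient data).
    import numpy as np
    from scipy import stats

    pb = np.array([0.5, 1.1, 1.8, 2.4, 3.0, 3.7])  # ug/m3
    br = 0.30 * pb + np.array([0.02, -0.03, 0.01, 0.04, -0.02, 0.00])

    fit = stats.linregress(pb, br)
    # A slope near the Br/Pb ratio of automotive exhaust together with r near 1
    # points to a single common source for both elements.
    print(f"slope={fit.slope:.2f} intercept={fit.intercept:.3f} r={fit.rvalue:.3f}")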
Hypothetical examples of regression plots illustrating various types of curves are shown in
Figure 4.3. The correlation coefficient for the regression curve in (a) is 1.00 and suggests a common
source of variability for the two features X and Y. The correlation coefficient for (b) is -1. Negative
regression coefficients might be obtained, for example, if one feature is associated primarily with
large particles suspended by wind while the other is associated mainly with fine particles from stacks
which are readily removed on windy days. The data in (c) has a correlation coefficient of 0.86 while
the data in (d) has a correlation coefficient of -0.34. Curve (d) suggests that very little of the variance
of X and Y is due to a common source.
Figure 4.3: Illustration of Various Hypothetical Examples of Regression Plots
As already noted, meteorological influences represent a major source of common variability.
The influence of this common source of variability should be minimized prior to regression analysis
to avoid drawing false conclusions. An example of the influence meteorology can have on the
regression analysis is illustrated in Figure 4.4. Coarse particle (> 2.5 µm) regression curves for Si and
Ti are compared in Figures 4.4a and 4.4b. A substantially higher correlation coefficient (0.923) was
obtained from the data plotted as µg/m³ than when the same data was plotted as percent of total mass
(r = 0.713). A more dramatic effect is illustrated in Figures 4.4c and 4.4d. In this case, not only is the
correlation between the Si and Pb reduced, but the slope (regression coefficient) is reversed in sign.
An example of how simple bivariate regression analysis can be used to answer key control
strategy questions is illustrated in Figure 4.5. The two primary questions of concern were:
1. What was the major source of lead? (It was already clear that the smelter was the primary
source, but the question refers to which process source within the smelter was the most
significant).
2. Is historically contaminated soil a major source of suspended lead particles?
Figure 4.5a shows a high degree of correlation between Al and Si (r = 0.821) with a regression
coefficient (slope) of 0.25 which is close to that measured in soil samples. On the other hand, a
negative regression coefficient is observed in the Pb regression plot on Si. The first plot suggests that
the Al and Si originate primarily from a single source, crustal dust sources such as road and wind
blown dust. Thus, the negative regression coefficient observed between Pb and Si strongly suggests
that crustal dust is not a significant source of Pb. The other two regression plots in Figure 4.5 show a
strong correlation between both fine and coarse particle Pb and Cd (r = 0.947 and r = 0.906). The
only source with Cd to Pb ratios close to those measured in the ambient aerosol was the blast furnace.
Although these regression plots were not the only data used to support the above conclusions, they do
illustrate how, in some airsheds, simple bivariate plots can provide valuable insight as to source
contributions.

Figure 4.4: Comparison of Bivariate Correlation Plots of 24-Hour Coarse Particle
Samples Collected in the Vicinity of a Lead Smelter

Figure 4.5: Correlation Plots for Selected Elements Measured During an Intensive
Sampling Study in an Airshed with a Lead Smelter
Strong bivariate correlations were also observed between Si and Ti, Mn, and Fe, which further
suggested that crustal dust sources were the primary source of these elements. Under these
conditions, that is, where bivariate regression plots suggest that the variability of specific features is
dominated by a single source, the features may then be used as tracers for the suggested source. In
the above example, Si could be used as a tracer for crustal dust and this source's contribution
quantified by dividing the ambient Si concentration by the fractional concentration of Si measured
directly from the analysis of crustal dust samples. Note, however, the following steps that were involved in
the quantitative apportionment of crustal dust:
Step 1. Qualitative Analysis
a. High correlation of expected crustal elements based on our general knowledge of the
chemistry of crustal dust sources and the chemistry of other possible sources in this specific
airshed led to the conclusion that a single source was responsible for the variability of these
elements. This was a directed statistical analysis of empirical data.
b. The regression coefficients (slope) provided information on the average inter-elemental
ratios.
c. Comparison of the elemental ratios with those previously known for both crustal dust sources
and other possible sources left only one possible primary source of the above crustal dust
(road and windblown dust).
Step 2. Quantitative Analysis
Now that the primary source of these elements had been determined, an upper limit to the
contribution of this source could be estimated by using any one of the elements shown to be
associated with this source as a tracer. The element of choice is usually the one which is
expected to have the highest concentration in the source and the highest correlation. If the
concentration of Si in crustal dust is 25%, for example, then an upper limit to the crustal dust
contribution to the ambient aerosol could be calculated by dividing the ambient Si
concentration by 0.25. The extent to which this upper limit approaches the actual source
contribution would be determined by the degree to which Si is influenced by other sources. A
high correlation with other crustal elements such as Al would suggest that only one source is
responsible for the variability of these elements and that the tracer assumption is reasonably
valid.
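The Step 2 calculation itself is a single division; a sketch using the 25% Si fraction from the text
and an assumed ambient Si concentration:

    # Upper-limit source contribution from a single tracer element.
    def tracer_upper_limit(ambient_tracer, fraction_in_source):
        # The source can contribute no more mass than ambient tracer / fraction.
        return ambient_tracer / fraction_in_source

    # 5 ug/m3 of ambient Si (assumed) with Si as 25% of crustal dust mass
    # bounds the crustal dust contribution at 20 ug/m3.
    print(tracer_upper_limit(5.0, 0.25))  # 20.0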
Table 4.1: Examples of Some Key Bivariate Elemental Correlations
Elements   Sources                     Ratios
Al/Si      Soil, Road Dust             0.25-0.35
V/Ni       Residual Oil Combustion     Local fuels
Br/Pb      Automotive Exhaust          0.25-0.35
Cd/Pb      Lead Smelters               0.1
Cr/Fe      Iron and Steel Industry     Variable
Some of the elements commonly used in bivariate regression analysis are listed in Table 4.1.
Although only Al and Si are listed for soil and road dust, correlations with other major crustal
elements may also provide additional insight. Similarly, other sources may have additional elements
that might be useful in source identification.
As illustrated above, bivariate linear regression analysis can determine:
1. Whether the variability of two or more elements is dominated by a single source,
2. The ratio of the elements in that source,
3. The identity of the responsible source, by comparing elemental ratios with those known for
potential sources in the specific airshed of interest, and
4. A quantitative upper limit on that source's contribution, if the concentration of one of the
highly correlated elements is known for the identified source.
This method is relatively easy to conduct and requires few interpretive resources and little advanced
training, yet it can provide valuable source impact information on some major sources in relatively
simple airsheds.
Its major limitation is related to its ability to develop source information in more complex
airsheds where there may be more than one source responsible for the variability of some key
indicating elements.

4.4 Multiple Linear Regression
The bivariate regression analysis equation can be generalized by adding additional independent
variables (X_i) and partial regression coefficients (b_i) as illustrated in Figure 4.6. The intercept
represents the value of Y when all X_i are equal to zero. The partial regression coefficient, b_i, is a
measure of the change in Y per unit change in X_i when all other independent variables are held
constant. This type of statistical analysis is used to develop an equation relating a dependent variable
such as TSP to independent variables such as the concentration of chemical species. This approach
does not yield information on the relationship between independent variables (X_i) but only defines
the relationship between the dependent variable (Y) and each independent variable (X_i).







Figure 4.6: General Multiple Linear Regression Analysis Equation

    Y = a + b_1 X_1 + b_2 X_2 + ...

    a   = intercept
    b_i = partial regression coefficient

Although, in general, there are no restrictions on what species can be used in a multiple
regression analysis, a highly directed approach is usually taken where specific features are selected to
represent specific sources. This is usually the case because of the large number of combinations and
permutations that could otherwise be involved. Whatever species are used, it is essential that they be
independent.
The multiple regression analysis approach applied to particulate source apportionment
assumes that a tracer feature such as Pb for automotive exhaust can be uniquely associated with each
major source. A multiple regression analysis is then applied to measured TSP and chemical feature
data to determine the partial regression coefficients (b_i), which are equal to the reciprocal of the
fractional composition of the tracer element in the source, as illustrated in Figure 4.7. Once the
regression coefficients are determined, the contribution of each source category to TSP can be
calculated by multiplying the measured tracer element concentration by the determined regression
coefficient.









Figure 4.7: Basic Multiple Linear Regression Equation as Typically Applied to Source Apportionment of TSP

$$TSP = a + \frac{1}{f_{11}} C_1 + \frac{1}{f_{22}} C_2 + \cdots + \frac{1}{f_{pp}} C_p$$

where f_ii = fraction of tracer element i in the ith source and C_i = measured ambient concentration of the ith chemical species.

The choice of tracer elements or properties is often based on knowledge of emission sources
and their compositions. This knowledge can be supplemented by using factor analysis to determine
source properties that are highly correlated with specific sources and also statistically independent of
other tracers. Examples of reciprocal regression coefficients or fractional composition of tracer
elements associated with sources in New York City are listed in Table 4.2.
Multiple linear regression analysis is a more powerful tool than bivariate regression for quantitative apportionment of such variables as TSP because it provides a more complete, simultaneous fitting of the data. It does not, however, provide information on inter-species relationships; it requires a greater degree of understanding and direction prior to the regression analysis; and it requires more training and substantial computer resources to implement.
Table 4.2: Reciprocal Regression Coefficients Obtained in New York

Source          Element   f_ij
Incinerators    Cu        0.02
Residual Oil    V         0.01
Auto Exhaust    Pb        0.08
SO2 Oxidation   SO4       0.60
Soil            Mn        0.002

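To make the arithmetic concrete, the hypothetical sketch below (not part of the original notes) applies the reciprocal coefficients of Table 4.2 to invented ambient concentrations:

```python
# Hypothetical sketch of tracer-based TSP apportionment using the
# reciprocal coefficients of Table 4.2. Ambient values are invented.

# f_ii: fraction of the tracer element in each source's emissions
tracer_fraction = {
    "Incinerators": ("Cu", 0.02),
    "Residual Oil": ("V", 0.01),
    "Auto Exhaust": ("Pb", 0.08),
    "SO2 Oxidation": ("SO4", 0.60),
    "Soil": ("Mn", 0.002),
}

# Hypothetical measured ambient tracer concentrations (ug/m3)
ambient = {"Cu": 0.05, "V": 0.03, "Pb": 1.2, "SO4": 9.0, "Mn": 0.04}

# Source contribution to TSP = C_i times the reciprocal coefficient 1/f_ii
for source, (element, f) in tracer_fraction.items():
    print(f"{source:14s} {ambient[element] / f:7.1f} ug/m3")
```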
4.5 Factor Analysis

4.5.1 Introduction
Factor analysis is one of the newer statistical techniques. Although the mathematical
foundation was established in 1927, the technique wasn't generalized until 1947. The first
environmental application was in 1956 when it was used for statistical weather prediction. The first
particulate source apportionment application was in 1967 but its use in this field didn't become
widespread until the late 1970's. Although it was suggested in the mid-1970's that it could quantify
source contributions without source information, more recent studies have shown this not to be the
case and it is now being applied to the tasks for which it was originally intended:
1. Determine the number of major sources responsible for the data variability,
2. Estimate the relative composition of source emissions,
3. Apportion the observed data variability, and
4. Establish a minimum number of independent variables responsible for the variability.
Factor analysis is a general tool for explaining observed correlation relationships between
large numbers of variables in terms of fewer variables. The new variables are linear combinations of
the original ones, but independent of one another.
There are two factor analysis procedures described in the literature: the classical or common
factor analysis and the principal components model. The heart of both models is the interspecies
correlation matrix.
Figure 4.8: Comparison of Inter-element Correlation Coefficients Based on Absolute (lower left) and Fractional (upper right) Concentrations

         Mass     Al      Si      Br      Pb
Mass      -       -       -       -       -
Al      0.964     -     0.895  -0.360  -0.417
Si      0.980   0.974     -    -0.504  -0.536
Br      0.691   0.788   0.710     -     0.893
Pb      0.653   0.907   0.676   0.922     -

(Entries below the diagonal are correlations of absolute concentrations; entries above the diagonal are correlations of fractional concentrations.)
4.5.2 Correlations
The starting point for a factor analysis is a table of correlation coefficients for each pair of chemical species. The major cause of variability in the data is often meteorology, not variations in source emissions. As a consequence, correlations between absolute concentrations of elements, as shown in the lower left of Figure 4.8, are often high. Most of the common influence of meteorology on variability and correlation can be minimized by using data that have been normalized to total aerosol mass. With the variability due to meteorology minimized, only two correlations are apparent in the data set shown in Figure 4.8: one between Al and Si and one between Br and Pb.
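A brief sketch of this normalization effect, using invented data in the spirit of Figure 4.8, is shown below; the source strengths, dilution factor, and element fractions are all assumptions:

```python
import numpy as np

# Illustrative sketch (invented data): correlations of absolute element
# concentrations are inflated by meteorology (a shared dilution), while
# correlations of mass-normalized (fractional) concentrations reveal the
# source-related pairings.
rng = np.random.default_rng(0)
n = 50
dust = rng.lognormal(0, 0.5, n)      # hypothetical soil/road dust mass
auto = rng.lognormal(0, 0.5, n)      # hypothetical auto exhaust mass
dilution = rng.lognormal(0, 1.0, n)  # common meteorological dilution

# Absolute concentrations: source contribution times the shared dilution
al, si = 0.07 * dust * dilution, 0.30 * dust * dilution
br, pb = 0.08 * auto * dilution, 0.22 * auto * dilution
mass = (dust + auto) * dilution

abs_corr = np.corrcoef([al, si, br, pb])
frac_corr = np.corrcoef([al / mass, si / mass, br / mass, pb / mass])
print("absolute Al-Br:  ", round(abs_corr[0, 2], 2))   # inflated
print("fractional Al-Si:", round(frac_corr[0, 1], 2))  # source pairing
print("fractional Al-Br:", round(frac_corr[0, 2], 2))  # suppressed
```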
There are several types of interspecies correlations that can be used. One of these involves the
correlation between species concentrations as measured at different sites. For example, a correlation
matrix can be generated for Al or TSP as a function of location. Another type of correlation involves
the correlation between species measured as a function of time at only one location. For instance, the
correlation between TSP and Pb or between Pb and Br can be examined at one sampling site.
An important limitation of interspecies correlations is that the correlation coefficients indicate how well two variables vary together, not whether there is a causal relationship between the two variables. A good correlation between TSP and another species implies that a large
part of the variations in measured TSP can be explained by the variation in the emissions from the
source of that species. For example, a high correlation between TSP and Al at one site would indicate
that the variation in TSP could be explained by the variability of Al.
When this technique yields good correlations between a species measured at different
locations, the results indicate a common source or type of source for the species. For instance, a high
correlation between Pb at two different locations would indicate that the Pb measured at the sites has
a common source.
The resulting correlations between species associated with a specific type of source serve to
identify general impacting source categories. However, no quantitative assessment of source impact
is possible. For example, a good correlation between Pb and Br at a particular site would suggest an
impact from vehicle exhaust, but it would not indicate how much of an impact (in terms of TSP
concentrations) that source category had.
One aspect of these correlations should be noted. A low correlation between species suggests
that the species may not be related to each other. However, a low correlation does not preclude the
possibility that there is a relationship between species. It is possible that the true relationship is
hidden by sampling errors, the range of variation among the variables, or other interferences.

4.5.3 Classical Factor Analysis
Classical factor analysis assumes that observed correlations between variables are the result of
regularity in the data. Each variable is expressed as a linear combination of factors common to all
variables and a unique factor not shared with the other variables. The values of the variables can be
expressed as a set of n linear equations:

$$C_{ik} = \sum_{j=1}^{p} a_{ij} S_{jk} + d_i U_i \qquad (4.1)$$

where
C_ik = concentration of the ith species in the kth sample
a_ij = factor loading (the correlation between a variable and the factor of which it is a part), which is related to the source composition
S_jk = factor score, which is related to the source contributions
U_i = unique factor for variable i
d_i = standard regression coefficient of variable i on unique factor U_i
p = number of factors
The factor scores or common factors are chosen so that the correlations among the species can
be reproduced as well as possible from only a few factors. The unique factor loadings and scores are
often left out. The resulting equation represents principal component analysis.
The factor analysis is begun by calculating the linear correlation coefficients for each of the
possible pairs of variables. Next, the common factors must be found. The approach used in the
literature is called the principal factor method. In this procedure, the matrix of correlation
coefficients is modified so that the elements in the principal diagonal are the estimated values of the
variances of each variable that result from the common factors. There are several methods available
to estimate these terms. The usual methods involve the highest correlation coefficient for each
variable or the squared multiple correlation.
The next steps in factor analysis are to diagonalize the resulting matrix, determine the
eigenvalues, and determine the corresponding orthogonal eigenvectors. The factor loadings can then
be calculated from these results.
The factor loadings represent the correlations of a variable with the factor of which it is a part.
In the factor matrix, each row of the matrix corresponds to one variable and each column represents
one factor. There are as many factors as there are variables. However, normally only those factors
with eigenvalues greater than or equal to 1.0 are used.
Selection of the number of independent factors to retain is not, in practice, as simple as keeping all those factors with eigenvalues greater than 1.0. Several methods should be used to decide how many factors to retain. It is often helpful to plot the eigenvalues as a function of eigenvalue number and look for sharp breaks in the slope of the line, which often indicate the point of separation. Other techniques useful for indicating the number of factors to retain are the root-mean-square percent error and the Exner function.
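As an illustrative sketch (invented data, not from the notes), the correlation matrix and its eigenvalue screening can be computed directly:

```python
import numpy as np

# Illustrative sketch (invented data): the eigenvalue screening step of a
# principal factor / principal components analysis. Variables are columns;
# the common rule retains factors with eigenvalues >= 1.0.
rng = np.random.default_rng(1)
n = 60
crustal = rng.normal(size=n)  # hypothetical crustal factor
auto = rng.normal(size=n)     # hypothetical automotive factor
data = np.column_stack([
    crustal + 0.1 * rng.normal(size=n),  # "Al"
    crustal + 0.1 * rng.normal(size=n),  # "Si"
    auto + 0.1 * rng.normal(size=n),     # "Br"
    auto + 0.1 * rng.normal(size=n),     # "Pb"
])

corr = np.corrcoef(data, rowvar=False)    # interspecies correlation matrix
eigvals = np.linalg.eigvalsh(corr)[::-1]  # eigenvalues, descending
print("eigenvalues:", np.round(eigvals, 2))
print("factors retained (>= 1.0):", int((eigvals >= 1.0).sum()))
```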
Using matrix algebra, factors are rotated to point a factor at a cluster of variables so the factor
may be easier to interpret. The object of rotation is to produce a factor matrix with high loadings in
some rows and near zero values in other rows. A number of methods of rotation are described in the
literature, including varimax, quartimax, and equimax. The factor matrix is examined to determine
which variables reveal high loadings (0.50 or greater).
An important aspect of factor analysis is that the solution is not unique. It is possible to
produce equally valid sets of transformations through rotation from the same input data. The high
loadings cannot be defined objectively. They require that other information be used to select
appropriate descriptors for the factors.
The analyst does not preselect these factors. Rather, they are combinations of the original
variables that explain the observed variance in the data. The first factor explains more of the variance
than any other factor, the second more than any other except the first, and so on.
The principal component model mentioned earlier in this section ignores the unique factor of a
variable. It does not require variance estimates, since it is interested in total variation among all
variables instead of the common variance. The calculations of the component solution and the
subsequent rotation are essentially the same as in the classical factor model.
The final step in factor analysis is the extraction of factor scores (the measure of each variable
on each factor). These factor scores can be used as independent variables in regression analysis or as
a new data set for further analysis.
The application of this technique to the quantified species present in TSP samples leads to a
characterization of the aerosol. The loadings on the factors give an indication of which species are
linearly related. For example, one factor may contain high loadings on Pb and Br, indicating the
species are related in the aerosol.
The technique does not lead to a quantification of the impact of a specific source of emissions,
nor does it have any predictive capabilities. It cannot be used to trace changes in emissions over time.
The factor loadings obtained for a particular factor need to be related to some aspect of the
physical world. Cluster analysis, enrichment factor, microscopy, and interspecies correlation
techniques can be used to aid in the interpretation of the results of factor analysis. In addition,
knowledge of the existing emission inventory is necessary to attach meaning to the results.
Meteorological variables, such as wind speed, precipitation, and wind direction, can be included as
variables in a factor analysis.
The mathematical formulation of factor analysis is most simply expressed in matrix notation.
Therefore, because of the extensive computations necessary to reduce large matrices, factor analysis
has become practical only with the ready availability of large digital computers. Programs from the
Statistical Package for the Social Sciences (SPSS) or the BMDP Biomedical Computer Programs can be used to carry out the factor analysis computations, but they were not designed with source apportionment in mind. Thus, caution should be the rule when using these standard programs to draw conclusions. A basic knowledge of the chemistry, physics, and source characteristics of the airshed of interest will go a long way toward preventing inappropriate conclusions from being drawn from standard factor analysis programs.

4.5.4 Target Transformation Factor Analysis
Hybrid forms of factor analysis have been recently suggested. The most common hybrid
approach is target transformation factor analysis which combines factor analysis with regression and
chemical mass balance analysis.
One operational difference between this approach and ordinary factor analysis is the use of correlations about the origin rather than about the means of the elemental species. This permits the development of the source profiles from the data matrix through the target transformation rotation as suggested by
existing knowledge of source characteristics.
In order to calculate source contributions, it is necessary to rescale the source vectors determined by target transformation factor analysis so that they are comparable with actual source composition vectors. The necessary scaling factors are determined by multiple regression of total mass against the (unscaled) factor loadings from each sample period. Finally, ordinary weighted least-squares CMB is used to calculate source contributions from the rescaled source vectors.

4.6 Data Requirements
To obtain accurate, statistically valid results by multivariate analysis, it is essential that there
be an adequate number of samples in the data set. It is generally recommended that at least 30 and
preferably 50 to 100 samples be included in the analysis. A general equation that can be used as a
guide is shown below:
$$N > 30 + \frac{V + 3}{2}$$

where N is the number of samples required and V is the number of features included in the analysis.
The typical number of species measured is 20. Thus, the minimum number of samples to
include in the analysis would be 42.
In addition, it is essential that the features included in the analysis be linearly additive and consist of values greater than detection limits.
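A minimal sketch of this sample-size guide (the function name is ours, not from the notes):

```python
import math

def min_samples(n_features: int) -> int:
    """Smallest integer N satisfying N > 30 + (V + 3) / 2."""
    return math.floor(30 + (n_features + 3) / 2) + 1

print(min_samples(20))  # prints 42, matching the worked example above
```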

4.7 Experimental Design Considerations
Although multivariate statistical analysis study design could be developed independently of a
conceptual model of the physical and chemical features of a specific airshed, a directed approach
based on our current understanding of the physical, chemical, and source characteristics of the
airshed is much more cost-effective than a random selection of sampling characteristics and
observables. Thus, a clear understanding of the system (airshed, sources, chemistry, physics,
meteorology, etc.) and current models is essential in designing a cost-effective experiment to develop
an empirical data set which will provide maximum source resolution and quantification.
Key experimental considerations include:
- Selection of particle size fraction to sample.
- Selection of chemical features to measure.
- Sampling frequency.
- Duration of sample collection.
- Selection of analytical measurement techniques.
Selection of the particle size fraction to sample is usually determined by the source identification objective and potential interferences. In general, dichotomous samplers with a fine/coarse particle cut point of 2.5 µm are recommended because of their ability to separate the influence of sources generating particles by mechanical means from those in which particles are formed by combustion, condensation, and chemical reactions.
Selection of chemical features to measure is based on current knowledge of potential source
chemistry. It would be very difficult to resolve the influence of such sources as residual oil and
automotive exhaust, for example, if such key species as V, Ni and Pb were not measured. Examples
of some key indicating elements are listed in Table 4.3.
Sampling frequency and duration are dictated by:
1. The emission cycle of possible sources.
2. Analytical sensitivities, and
3. Available resources.
If substantial diurnal variation is characteristic of some potential sources, then it would be
desirable to define this variability by collecting samples frequently and for short enough periods to
define this source variability pattern. Since the signal in a multivariate analysis is data variability, it
is important to maximize the signal (variability). A single 24 hour sample, for example, would lose
all of the diurnal variability information in the averaging process.
Analytical sensitivities and available resources are the primary factors limiting how short a
sample duration can be selected. There is a point, for example, where insufficient material is
collected to characterize the key species by the analytical techniques selected. At this point,
substantial portions of the data set may not be included in the analysis because of species
concentrations being reported below detection limits. Thus, selection of sampling frequency, duration
and analytical techniques is an optimization based on available resources.

Table 4.3: Examples of Key Indicating Elements and Chemical Species

Crustal                                 Na, Mg, Al, Si, K, Ca, Ti, Sc, Mn, Fe, Ga, Rb, Sr, and Zr
Coal Combustion                         Crustal elements plus fine-particle enrichment elements such as Ge, As, Se, Sb, Ba, W, U, Hg, and B
Oil Combustion                          V, Ni, and Mo (fine particles)
Petroleum Refinery Catalytic Crackers   La, Ce, Nd, and other elements specific to the process used
Automotive                              Br and Pb (fine particles)
Copper, Nickel, and Lead Smelters       Cu, As, Cd, Pb, In, Sn, Sb
Marine Aerosol                          Na, Cl
Vegetative Burning                      OC, EC, K, Cl, Zn
Iron and Steel Industry                 Fe, Co, Cr, Ni, Mg (fine particles)

4.8 Advantages and Disadvantages
One of the primary advantages of this methodology is the ability to include non-chemical measurements, such as light scattering, gaseous pollutant concentrations, and meteorology in the data set. Thus, primary particles may be associated with secondary species. Another strength of multivariate methods is their ability to identify source impacts at the receptor with very limited knowledge of the airshed: the number of major sources responsible for the data variability, their likely emission compositions, and their source loadings can be inferred directly from the data. These models, however, require large data sets and are therefore not useful for modeling single days. Finally, some knowledge of source compositions and of the sources likely to be impacting the receptor is required to interpret the model results.

5.0 A GRAPHICAL APPROACH TO
RECEPTOR MODELING

5.1 Models and Modeling
A model is either a
- Mathematical expression relating variables in a perceived physical model or
- A mathematical expression derived from an analysis of measured variables.
Modeling is the process by which a model is developed. In the case of receptor modeling, the process includes planning, sampling, analysis, and data interpretation. Before effective planning can begin, however, a perceived physical-chemical model based on existing information should be constructed (a hypothesis defined), key variables listed, and order-of-magnitude values defined for each variable. Questions, hypotheses, and data gaps are defined in the planning phase, as well as the experimental procedure for answering the questions and developing the data needed to build new models or to modify or validate old models and hypotheses. Tables are formed with new experimental data, from which matrices are formed. At this point the world of mathematics is entered, and all manipulations of the data, now in the form of a matrix, are independent of physical reality and governed only by the laws of mathematics. Mathematics alone, however, cannot identify or quantify a source based on just ambient data.
In the case of the CMB approach, source combinations are proposed for each ambient filter
chemical data set and the most probable contributions calculated by least squares procedures. The
resulting regression coefficients define a linear model (an equation defining the relationship between
source contributions and the total concentration of each chemical species measured at the receptor)
for the period of time during which the sample was collected. In the case of multiple linear regression
analysis of TSP, sources are selected and the most probable mathematical model explaining the data
is derived from a regression analysis. The resulting regression coefficients define a mathematical
model for the average data set used in the regression. Factor analysis is significantly different in that
no source data is input at the beginning of the analysis. As a result, an infinite number of
mathematical solutions (all valid mathematically) are derived from the analysis. Solutions are
selected that best match preconceived understanding of physical reality based on the investigator's
understanding of source characteristics. In either case, the potential mathematical solutions (models)
are controlled within the bounds of reality by introducing the constraints of reality (sources and their
characteristics) either before the regression analysis or after the factor analysis.
The modeling procedure thus consists of first defining appropriate questions and formulating a
physical model. Next, data is collected and solutions (models valid for the time period defined by the
data set) developed within the realm of mathematics. Solutions fitting physical reality are then
selected on the basis of current understanding of source chemistry.

5.2 The Subject and Data Set
The fundamental basis for the receptor model and the importance of source resolution can best
be understood through graphical means. The first step, however, is to review the characteristics of the
subject of interest which is the volume of air that enters the receptor (Figure 5.1). It is a dynamic
system of gaseous, liquid, and solid material in which physical changes and chemical reactions can
transfer material from one phase to the next. It is, thus, important to understand the chemistry and
physics of this system if the assumptions of linearly additive features and conservation of relative
composition are to be used with appropriately selected chemical species. The receptor model cannot
be applied directly to species such as volatile organic liquids or to reactive species such as SO2. It can be applied indirectly, however, to problems dealing with reactive species through the use of coefficients of fractionation or transport coefficients.
Figure 5.1: Subject of Study
This volume of air arriving at the receptor can be characterized by time, location, and
measurable features such as:
- Chemical concentration Particle type concentration
- Elemental concentration Color
- Particle size distribution Light scattering/absorption properties
The main features of use in receptor modeling are linearly additive features. Thus, such
features as color are of little value.
Data collected over a period of time and at different locations can be displayed as a data cube
as illustrated in Figure 5.2.


Volume
of
Air

Solid
Gas
Liquid
Chemical
Reactions
Receptor
56


Figure 5.2: Three Dimensional Data Cube Used as Input to the
Fundamental Receptor Model Equation.
The data cube represents the information base available to interpret source contributions.
Current methods and previous studies have extracted only a small fraction of the potential
information contained in data cubes generated in large studies such as the Portland Aerosol
Characterization Study (PACS). The objective of the investigator, thus, is to extract the maximum
source information from this data base by using the most cost-effective set of approaches to develop
the highest level of confidence in the results.
A single datum cell from the larger data cube represents a value for a feature i, such as the concentration of particulate lead, measured during a time interval k and at a location l. The basic receptor model equation relating these three parameters is shown at the bottom of Figure 5.2. Similar equations can be written for each size range of particles.
The data cube is constructed of aerosol feature measurements made at different times and at
different locations. Features measured can include elemental, chemical, isotopic, particle type
features or gaseous species as indicated in Table 5.1 and in Figure 5.3. A rich abundance of informa-
tion about the nature of the aerosol at a receptor can be collected with state-of-the-art sampling and
analysis technology. The challenge is to design a cost-effective study to maximize the source
resolving capabilities of the particular receptor model approach and to extract as much independent
information as possible from this data cube.
Figure 5.3: Illustration of Three Possible Data Slices through the Data Cube

Table 5.1: Possible Receptor Model Features

- Elemental concentrations
- Organic, elemental, and carbonate carbon concentrations
- Specific carbon compounds
- Sulfate, sulfur dioxide
- Nitrogen oxides
- Mineral and chemical species
- Particle types
- Isotopes, stable and radioactive
- Other source-resolving information
Information in the data cube can be analyzed from several perspectives, as illustrated in Figure 5.3, which shows three two-dimensional slices through the data cube. One plane, having dimensions of features (Al, Si, V, etc.) and time (samples collected at different times), is called the time series slice. Another is the spatial series slice, having dimensions of features and locations, while the third is the cross-sectional slice, having dimensions of space and time.
Receptor modeling based on the analysis of variability, such as factor analysis, statistically
analyzes the data in a planar slice through the data cube while a regression analysis such as the CMB
approach fits source feature data to the features represented by a single line of data as measured on a
single filter. In the case of factor analysis, one can analyze the variability of features over filters or
the variability of filters over features. The planar slice and its orientation used for factor analysis have been referred to as O, P, Q, R, S, or T factor analysis, but these terms are often used inconsistently in the literature and are irrelevant. Additionally, normal factor analysis can analyze only one slice of data at a time. This is due to the requirement that a homogeneous set of influences be responsible for the variability. Thus, mixing time series data collected at several sites into a common factor-analyzed data set is technically incorrect. Analysis of mixed time and location data will generally reduce the ability of the analyst to resolve individual sources because the variability will, in such an incorrectly formed data set, be due both to source and to location variability. However, mixing time and location data may provide improved source resolution when the individual data sets and the variability due to location are both small. Analysis of mixed time and space data is possible but requires a special approach.

5.3 Histogram and Vector Representations
The way in which these different interpretive approaches treat the information contained in a
data cube is best illustrated with graphical representations. The data can be represented either as
HISTOGRAMS (feature variability) or VECTORS (feature vector in filter space or filter vector in
feature space). Figure 5.4 illustrates the two histogram means of graphically displaying part of the
data in the data cube. The concentration of a feature can be plotted, for example, as a function of
features such as Al, Si, SO
4
, etc. as shown. This pattern will represent a unique geometrically
conserving pattern if the concentration is plotted on a logarithmic scale. This type of pattern
represents the elemental concentrations measured for a single filter along the feature axis in Figure
5.3. Although only a few species are illustrated, any grouping can be used. Inclusion of a specific
chemical or other feature will depend on the feature's ability to contribute to the source resolving
power, the ease of measurement, background levels, etc.
Data can also be plotted for a single feature as it varies along the time or location axis. This
type of feature histogram is illustrated in the bottom half of Figure 5.4. The plot as illustrated on the
bottom right is what might be expected from the diurnal variation of Al in an airshed dominated by
road dust.
An element like Pb, which is abundant in the road dust pattern but low in the others, might be used to separate the road dust impact; ambient Pb levels, however, are often dominated by automotive exhaust, which reduces the resolving power of Pb for other sources such as road dust.
The receptor model's ability to resolve these sources, however, could be improved
dramatically by
- Physically separating their impact with fine and coarse particle sampling (coal fly ash is
predominantly fine particles while the others are mostly coarse particles).
- Adding other chemical species and elements such as As, Se, Ge, carbonate, etc. to the distribution, which would help to further separate the influence of coal fly ash by chemical means.


- Using short-term sequential sampling and multivariate analysis in conjunction with CMB analysis, which would help separate the influence of road dust, whose impact depends on general meteorological conditions.

Figure 5.4: Histograms Where either the Concentration of Various Features is Plotted as a Function of Feature or the Concentration of a Single Feature is Plotted as a Function of Time or Space
Ambient aerosol elemental distributions clearly show the influence of source fingerprints. The dominating impacts of residual oil, ferromanganese, kraft recovery boiler, and galvanizing operations are obvious when contrasted with the typical urban aerosol.
It is interesting to note that a high (as high as 40%) kraft recovery boiler impact was attributed
to the fine particle mass at an industrial site on both August 17th and October 17th. These results
were reported to the Oregon Dept. of Environmental Quality (DEQ) who later informed us that the
nearest kraft recovery plant was about 20 miles from the sampling site. After a review of the analytical data, the same results were again reported to the DEQ with the qualification that either a kraft recovery plant was impacting the site or another source with a chemical fingerprint similar to that of a kraft recovery boiler was doing so. The DEQ went looking for such a source and found a small (too small to be included in the emission inventory) industrial chemical plant that trucks kraft liquor into the industrial site for power and feedstock materials. This source, located just a few blocks from the sampling site, apparently had a substantial impact on the receptor on the days sampled and was a substantial contributor to the annual average. Numerous similar experiences, in which unexpected source impacts have been reported and verified, have provided essential feedback that greatly increases a receptor modeler's confidence in the results.

The information in the data cube may also be represented by vectors in filter or feature space
as shown in Figure 5.5. In the top portion of the figure, a feature vector such as the percent
concentration of Al as measured on two filters is plotted in filter space as defined by the filter
number 1 and filter number 2 axis. The vector representing this data is drawn from the origin to the
point representing the coordinates. These two filters could represent samples collected at different
times or samples collected at different locations, but, as noted, they should not be mixed. Assume, for example, that the Al is dominated by road dust. The variability of Al will then be determined primarily by the amount of traffic. If Al data are included from another site, at which the Al is dominated by a source different from road dust or by road dust with a different Al concentration, then the data variability would not only be a function of traffic in the area near the first site but would also be due to the relative amounts of the different sources influencing the other site. Since the filter can represent either time or space, this method can be used to represent either time or spatial series data.
Figure 5.5: Illustration of the Two Vector Modes of Displaying the Data Cube Information
[Top panel, "Feature Vector in Filter Space": a feature such as Al, with a value of 3% on each of two filters, plots as the vector (3, 3) in the space defined by the Filter 1 (%) and Filter 2 (%) axes. Bottom panel, "Filter Vector in Feature Space": Filter 1 at (3, 1) and Filter 2 at (1, 3) plot as vectors in the space defined by the Si (%) and Al (%) axes.]

Both data slices can be examined either by plotting feature vectors in filter space as illustrated
in the top of the figure or as filter vectors in feature space as illustrated on the bottom. In the latter
example, the percentage of Al and Si are the coordinates for filter 1 and filter 2. Again, since the
filters can represent either time or location, both slices can be graphically represented in the same
manner. Since the correlation coefficient is the cosine of the angle between two vectors, one can
study the correlation between filters in feature space or the correlation between features in filter
space. The difference in source resolving capabilities of these two approaches has not been clearly
established.
One example is a plot of V and Ni percent compositions as measured in Portland, Oregon,
during the winter at four urban sites. This plot, in which the impact of meteorology has been
minimized by normalizing to the total mass (i.e., by the calculation of weight fraction or percent)
shows a strong correlation in the data points. That is, if vectors were drawn from the origin to each
point, the angle between the vectors would approach zero and their cosine would approach one.
The objective of variability analysis approaches such as cluster analysis, factor analysis, etc., is to find correlated data clusters like those found in Portland, Oregon, but in multidimensional space. In this particular case, the two current axes (V, Ni) could be rotated until one axis was aligned such that the maximum data variability could be explained by a new single axis drawn through the data points. It is interesting to note that the slope of this line (0.64) is the same as the average V to Ni ratio measured in residual oil emissions in Portland. Although a factor analysis of this simple two-dimensional data would find this same factor, it would not be able to place a common name on it without knowledge of the source chemistry, and it would not be able to quantify the source contribution without a regression analysis on the mass or knowledge of the fractional composition of one of the components. The analysis of the variability of this Ni and V data reveals that these elements:
1. Are highly correlated and possibly came from a common source, and
2. Have a V/Ni ratio of 0.64.
Although this variability analysis yields primarily qualitative information useful in identifying
sources, the connection between the source and its impact cannot be made without information on the
composition of potential sources, and the knowledge that other sources do not exhibit a similar strong
correlation. The impact on each filter could be quantified if the V or Ni were assumed to have
originated entirely from residual oil and were emitted in a known concentration. Sources such as coal
fired power plant emissions might be resolved in multi-elemental space, if other key elements or
chemicals are used.
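A brief sketch of this two-element case, with invented V and Ni percent compositions standing in for the Portland data:

```python
import numpy as np

# Illustrative sketch (invented data): V and Ni weight percents at several
# sites, dominated by a single source with a V/Ni ratio near 0.64 as in the
# Portland residual oil example. A line through the origin recovers the
# source ratio, and the cosine of the angle between the data vectors
# approaches one when the points cluster along a single axis.
rng = np.random.default_rng(2)
ni = rng.uniform(0.05, 0.5, 20)          # hypothetical Ni percent
v = 0.64 * ni + rng.normal(0, 0.01, 20)  # V tracks Ni

slope = (v @ ni) / (ni @ ni)             # best-fit line through the origin
cosine = (v @ ni) / (np.linalg.norm(v) * np.linalg.norm(ni))
print("estimated V/Ni ratio:", round(slope, 2))
print("cosine of angle between V and Ni vectors:", round(cosine, 3))
```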
A three-dimensional data base in which two sources contribute to the variability simplifies to a two-dimensional plane. In this case, hypothetical data from automotive exhaust and road dust contribute the variability in the ambient concentrations of Br, Pb, and Si. This system can be simplified to a two-dimensional planar system having road dust and automotive exhaust axes.
Another way to graphically visualize source resolution with the receptor model is shown in
Figure 5.6. The vectorial representation of three source elemental fingerprints is shown at the top
while the elemental pattern is shown at the bottom of the figure. The angle between these source
vectors is larger than observed for the geological source category, i.e., they are easily resolved
because of their distinctly different chemistry. It should be noted that another road dust sample
having similar chemistry would be represented by a point near the current road dust vector, would be
highly correlated with the first vector and difficult to resolve because it would be approximately
linearly dependent on the first vector.

Figure 5.6: Comparison of the Vector and Histogram Forms of Displaying Chemical Data for Three Sources
[In elemental space (Pb, V, Si), the three source vectors (in percent) are Auto Exhaust (22, 0.01, 0.8), Residual Oil (0.11, 2.5, 1.0), and Street Dust (0.3, 0.02, 22); the accompanying histograms show each source's elemental pattern (percent of Na, Al, Si, S, V, Fe, Br, Ni, Pb on a logarithmic scale).]

The amount of a source contribution is adjusted by multiplying the entire distribution by a
constant which in effect stretches or contracts the length of the vector as illustrated with the dashed
line for the residual oil in Figure 5.6.
Figure 5.7 illustrates chemical data from ambient filter analysis plotted in three-dimensional filter space. In this hypothetical case, three factors could be resolved: one with high Al and Si loadings, one with high Br and Pb loadings, and one with high Ni and V loadings. The variability in this data could be explained with three new axes which we could label road dust, automotive exhaust, and residual oil. In this hypothetical case, all three sources were easily resolved.
With this in mind, the process of obtaining a chemical mass balance fit to ambient data can be visualized as a multiple regression analysis as illustrated in Figure 5.8. The ambient data vector (X) is plotted in elemental space (Si, Ni, Pb) with coordinates x_1, x_2, and x_3. Three hypothesized sources (D = dust, A = auto exhaust, R = residual oil) are mapped into the same space. A linear combination of these three source vectors is sought which minimizes the difference between the experimental and predicted data, i.e., a linear least-squares multiple regression analysis is performed as illustrated at the bottom of the figure. Other combinations of sources can be tried until the best fit is obtained as determined by specified criteria such as a minimum chi-square, a high percentage of mass explained, or a ratio of predicted to measured elemental concentrations which is close to unity.

Figure 5.7: Plot of Elemental Data for Si, Al, Br, Pb, Ni and V as Measured on Three Hypothetical Filters. The Elemental Correlations Observed Clearly Suggest Contributions from the Three Sources Indicated
[Feature vectors in the space of the three filters: Al (2, 1, 3), Si (6, 3, 9), Br (1, 3, 3), Pb (3, 9, 9), Ni (5, 3.5, 1), and V (10, 7, 1), clustering along road dust, automotive exhaust, and residual oil axes.]

Figure 5.8: Schematic Illustration of the Major Steps in a CMB Fitting Procedure
[The ambient vector X = (x_1, x_2, x_3) in elemental space (Si, Pb, V) is approximated by a linear combination of the source vectors D = (d_1, d_2, d_3) for dust, A = (a_1, a_2, a_3) for auto exhaust, and R = (r_1, r_2, r_3) for residual oil.]

The measured ambient concentration data (X) is first plotted in the three-dimensional elemental space. Three hypothesized source vectors are next plotted in the same elemental space, and a linear combination of these three basic source vectors is sought such that their sum approximates the unknown ambient vector. The most probable linear combination is found from the regression analysis illustrated at the bottom of the figure. X is the actual ambient vector; the linear combination of source vectors yields the predicted vector; and the difference between the two is the error between actual and predicted values. The cosine of the angle between the actual and predicted vectors is directly proportional to the degree of correlation.
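As an illustrative sketch, the least-squares step of Figure 5.8 can be carried out with the Figure 5.6 source compositions; the ambient vector below is invented:

```python
import numpy as np

# Illustrative sketch (invented ambient values): the least-squares step of
# Figure 5.8, using the source compositions (percent Pb, V, Si) given in
# Figure 5.6 for auto exhaust, residual oil, and street dust.

# Columns: Auto Exhaust, Residual Oil, Street Dust; rows: Pb, V, Si (%)
F = np.array([
    [22.0, 0.11, 0.3],   # Pb
    [0.01, 2.5, 0.02],   # V
    [0.8, 1.0, 22.0],    # Si
]) / 100.0               # convert percent to mass fraction

x = np.array([2.3, 0.26, 4.8])  # hypothetical ambient Pb, V, Si (ug/m3)

# Most probable source mass contributions M_j (ug/m3)
M, *_ = np.linalg.lstsq(F, x, rcond=None)
for name, m in zip(["Auto Exhaust", "Residual Oil", "Street Dust"], M):
    print(f"{name:13s} {m:6.1f} ug/m3")

# With three species and three sources the system is square and the fit is
# exact; with more species than sources, the ratio of predicted to measured
# concentrations (ideally near unity) tests the quality of the fit.
print("predicted/measured:", np.round((F @ M) / x, 3))
```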

5.4 Multicollinearity in Source Apportionment
Large errors can result if source apportionment by conventional regression or weighted
regression analysis is attempted when two or more sources with very similar signatures are included.
It is not uncommon for negative aerosol contributions with large magnitudes to be estimated for
some sources under these conditions. Another symptom is estimated uncertainties that are larger than
the calculated source contributions themselves.
In statistical terms, the problem of similar source signatures is one of "multicollinearity."
More generally, multicollinearity exists when one source signature is nearly a linear combination of
any subset of the other signatures.
Multicollinearities can be identified on the basis of chemical and physical judgment, manual
examination of the source signatures, and statistical measures. The first two approaches adequately
determine many multicollinearities (for example, sources within the crustal category that have similar
signatures). Identifying subtle multicollinearities involving several sources, however, can be more
difficult. Thus, the use of statistical measures is also helpful. The variance inflation factor (VIF) is a
statistical measure of multicollinearity. The VIF quantifies the isolated effect of multicollinearity in
terms of the final errors in the source apportionment. Specifically, the VIF can be interpreted as the
increase in the error variance of the estimated aerosol contribution of a specific source due to
multicollinearity alone. Also, principal component analysis can be used to explicitly uncover
multicollinearities in the source composition matrix.
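A minimal sketch of the VIF calculation, assuming the textbook definition VIF_j = 1/(1 - R_j^2); the source composition matrix below is invented:

```python
import numpy as np

# Hypothetical sketch: variance inflation factors (VIFs) for the columns of
# a source composition matrix. VIF_j = 1 / (1 - R_j^2), where R_j^2 comes
# from regressing signature j on the remaining signatures.

def vif(F):
    n, p = F.shape
    factors = []
    for j in range(p):
        y = F[:, j]
        X = np.column_stack([np.ones(n), np.delete(F, j, axis=1)])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ coef
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
        factors.append(1.0 / (1.0 - r2))
    return factors

# Invented signatures: two nearly identical crustal-type columns and one
# distinct column; the collinear pair shows strongly inflated VIFs.
F = np.array([[0.25, 0.26, 0.01],
              [0.08, 0.07, 0.60],
              [0.30, 0.31, 0.02],
              [0.05, 0.05, 0.10],
              [0.12, 0.11, 0.03],
              [0.02, 0.02, 0.15]])
print(np.round(vif(F), 1))
```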
One approach to the multicollinearity problem is to examine the source signatures and
determine groups of sources whose aerosols are chemically similar. Multivariate analysis of source
profiles may be used to define independent source categories. One such group, for example, could
include sources in the crustal category, such as soil and road dust. Then a signature could be selected
that represents the entire group. Subsequently, an analysis could be made to apportion the aerosols
among the selected groups of sources. This approach (source selection) reduces the multicollinearity
problem and the potentially large errors that often are caused by it. However, grouping sources
reduces the resolution of the source apportionment.
Source selection may offer a solution or a partial solution, depending on the situation. It may
be possible to eliminate some sources on the basis of engineering judgment. A source should be
eliminated on this basis only if there is strong physical evidence (such as wind direction data for the
sampling period in question) that it does not contribute significantly to the monitored aerosol
concentrations. Deleting sources may eliminate some of the multicollinearity problems. Source
selection can be made on the basis of emissions inventories, source-receptor geometry, wind
information, additional aerosol characterization, etc.
One fundamental problem with eliminating multicollinearity by source selection involves determining realistic error estimates for those sources that remain in the least-squares solution. Their uncertainties should reflect the possibility that the wrong sources were selected; currently, this is not the case. A further difficulty arises in assigning uncertainties to source contribution averages. If a source is not selected on one day, it is usually averaged in as 0, with an uncertainty of 0. The error of the average can then be determined by propagation of the errors on individual source contributions. In this case, the uncertainty on the average is given by

$$N^{-1}\left(\sum_i \sigma_i^2\right)^{1/2}$$

where N is the number of sampling periods and σ_i² is the error variance associated with the CMB source contribution determination for each period. Another approach that has been applied is to take the uncertainty of the average source contribution to be N^{-1/2} σ, where N is as before and σ is the standard deviation of the N source contributions. These two approaches are not equivalent, and further work is needed to better treat the question of uncertainties of average source contributions.
Another approach to the problem is the use of statistical techniques specifically designed to
handle multicollinearity (e.g., ridge regression and singular value decomposition). An additional
approach is regression on principal components. Depending on the situation, a combination of the
approaches discussed here may be needed. Without additional information, such as wind direction,
no CMB or multivariate receptor model can apportion an aerosol between two sources whose
signatures are indistinguishable.

5.5 Ambient Pattern Fitting
The least squares fitting of chemical source patterns to ambient chemical patterns can be performed and illustrated graphically. A time series of elemental patterns, for example, is illustrated in Figure 5.9 for the fine particle fraction (< 2.5 µm) collected on January 27, 1978 in Portland, Oregon. The first sample, collected from midnight to 4:00 a.m., shows an unusually high concentration of Mn, K, and Na relative to other species such as Fe, Al, Si, and Ca. The Mn/Fe ratio, which is normally about 0.02 in a typical urban aerosol, is about 4.0 in this sample. The only source in the Portland airshed exhibiting an enrichment of Mn relative to Fe was the ferromanganese source, which also was enriched in K and Na. Comparison of the elemental patterns for the samples collected from 4:00 a.m. to 8:00 a.m. and from 8:00 a.m. to noon with the one collected from midnight to 4:00 a.m. shows a steady decrease in the concentrations of Na, K, and Mn with time, while Al, Si, Br, and Pb show a steady increase in concentration, as might be expected from an increase in road dust and automotive exhaust due to increasing traffic levels.
Note: Overlays will be used by the instructors to graphically illustrate how source elemental
profiles are fit to ambient profiles.

Figure 5.9: Time Series of Elemental Patterns for Fine Particle Fractions

5.6 Enrichment Factor and Mn/V Ratio
A noncrustal Mn/V ratio method was recently used to distinguish the impact of regional air
pollution sources over long distances. The relative contribution of the Eurasian and northeastern
North American emissions to the arctic aerosol, for example, was suggested by this method.
Although this noncrustal Mn/V ratio method is a very primitive form of receptor modeling, it
is informative to review this specific application to appreciate the full potential of state-of-the-art
receptor model methods and because it was recently suggested as a means of distinguishing the
midwestern contribution to the eastern acid deposition problem.
The method is a modification of an enrichment factor (EF) approach. The EF method is a
receptor oriented approach that is used to estimate the degree to which a potential anthropogenic
pollutant is enriched relative to crustal contributions. The EF is given by the equation:

$$EF_i = \frac{C_i^a / C_r^a}{C_i^c / C_r^c} \qquad (5.1)$$

where
EF_i = enrichment factor for the ith element
C_i^a = aerosol concentration of the ith element
C_r^a = aerosol concentration of a crustal reference element such as Al, Si, or Fe
C_i^c = crustal concentration of the ith element
C_r^c = crustal concentration of the reference element
A major problem with this method is that one cannot expect the EF to be constant when the
aerosol is transported over long distances because the crustal reference element is associated with
large particles, while many anthropogenic pollutants, such as V, are found in the fine particle mode.
The noncrustal Mn/V ratio method minimizes this effect by comparing the ratio of the noncrustal Mn
and noncrustal V. This ratio is calculated as indicated below:
$$\left(\frac{Mn}{V}\right)_n = \frac{Mn_t^a - (Mn/Al)^c \, Al_t^a}{V_t^a - (V/Al)^c \, Al_t^a} \qquad (5.2)$$

where the superscripts "a" and "c" refer to aerosol and crustal material, respectively, and the subscripts "n" and "t" refer to the noncrustal component and the total aerosol. The second term in both the numerator and the denominator represents the crustal contribution to the Mn and V. In essence, the crustal Mn and V are normalized to the total Al in the aerosol (Al is assumed to be a tracer for the crustal contribution to the aerosol) and subtracted from the total aerosol concentration for the specific element.
This process, as defined by Equation 5.2, can be visualized in terms of the receptor model as a three-element (Al, V, and Mn), two-source (crustal and noncrustal) chemical mass balance (CMB) source stripping procedure. The crustal source fingerprint is fit (normalized) to the aerosol Al concentration, and the noncrustal fingerprint is determined by subtraction of the normalized crustal Mn and V.
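A brief sketch of the Equation 5.2 arithmetic; the crustal ratios and aerosol concentrations below are invented for illustration:

```python
# Hypothetical sketch of the noncrustal Mn/V calculation in Equation 5.2.
# The crustal ratios and ambient concentrations are invented; Al is taken
# as the crustal tracer.

mn_al_crust = 0.012   # (Mn/Al) in crustal material, assumed
v_al_crust = 0.0017   # (V/Al) in crustal material, assumed

al_total = 0.50   # total aerosol Al, ug/m3 (hypothetical)
mn_total = 0.030  # total aerosol Mn, ug/m3 (hypothetical)
v_total = 0.012   # total aerosol V, ug/m3 (hypothetical)

# Subtract the Al-normalized crustal contribution from each element
mn_noncrustal = mn_total - mn_al_crust * al_total
v_noncrustal = v_total - v_al_crust * al_total
print("noncrustal Mn/V =", round(mn_noncrustal / v_noncrustal, 2))
```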
This approach assumes that all of the Al is crustal and is still somewhat dependent on aging
effects because of differences between Mn and V noncrustal aerosol size distributions. This
remaining size dependence has been estimated to be much less than the normal enrichment factor and
therefore not significant in this calculation.
The noncrustal Mn/V ratio (three element noncrustal "source" profile with Al equal to zero)
has been determined for both Eurasia and northeastern North America. Examples of the
noncrustal fingerprints for these two regional sources are compared to a representative arctic aerosol
fingerprint. The key to this comparison is the large difference in the noncrustal Mn/V ratios for the
two regional sources. The observed arctic Mn/V ratio is most consistent with a Eurasian aerosol with
an enrichment of Mn from a source in central USSR.
The Eurasian mean noncrustal Mn/V ratio (2.0 ± 0.8) was 5 ± 1 fold greater than the northeastern U.S. mean noncrustal ratio (0.41 ± 0.09). These uncertainties, however, are based on distributions of values measured in each area and do not take into account uncertainties in crustal component estimates. The effect, however, has been estimated to be small. This can be seen from Equation 5.2 and the magnitude of the crustal component, which is relatively small in most cases.
Thus, even a relatively large uncertainty in the crustal component has a small impact on the
noncrustal ratio. The noncrustal Mn/V ratio method was the first receptor approach, although
primitive, used to gain insight into the sources of pollutants transported over long distances.
The same procedure was also used in an attempt to separate the influences of midwestern and
northeastern aerosol sources on the east coast's air quality. The difference in the noncrustal Mn/V
ratios in this case is eightfold as compared to the fivefold difference observed with the Eurasian
continent. Results, based on this noncrustal Mn/V ratio method suggested that east coast aerosols
during high sulfate and acid episodes had more of the chemical characteristics of a typical east coast
aerosol than those of a Midwest aerosol. This primitive receptor approach provided for the first time
an indication of a lower contribution from midwestern sources than had been previously considered.

6.0 CHEMICAL MASS BALANCE MODELING

6.1 Fundamental Principles

6.1.1 Background
The impact a source has on air quality at a receptor can be calculated by using either a
dispersion model approach in which fundamental physical principles are used to calculate the impact
(a deterministic approach), or a receptor model approach which uses a statistical analysis of aerosol
features measured at a receptor to calculate the most probable source contributions (probabilistic
approach).
The source oriented approach (SOA) expresses the impact at a receptor from source j, M_j, as the product of an emission term, E_j, and a transport coefficient, A_j:

$$M_j = E_j A_j \qquad (6.1)$$
The transport coefficient is a function of diffusion (D), transformation (T) and deposition (d)
terms. The transformation and deposition terms can be thought of as attenuation coefficients that
reduce the impact that would have occurred if the emissions obeyed the law of conservation of mass
during transport. The effect of each term must be estimated with complex submodels because few of
the variables for these main elements are directly observed. For example, surface winds may be known for only a few minutes of each hour and are usually not available for more than a few widely dispersed stations. In addition, direct measurements of atmospheric stability and turbulence are limited. Submodels must therefore be used to estimate, from the sparse direct measurements available, the detailed data sets of input variables required.
The most fundamental limitations of this SOA are its dependence on absolute values and a
detailed history of each air parcel. Although our understanding of the chemistry and physics involved
in the transport of reactive species will improve, our knowledge of the history of an air mass arriving
at a receptor (i.e. the environmental conditions to which it has been subjected) will have a high
degree of uncertainty and will substantially limit the accuracy of the SOA.

6.1.2 Basic Receptor Oriented Approach
The physical model on which the receptor modeling approach is based is similar to that of the source oriented approach, as schematically illustrated in Figure 6.1. Material emitted by the jth source is acted upon by meteorological and atmospheric influences which dilute and modify the characteristics of the emitted material. The primary difference between the two models is their starting point and their focus of attention as far as measurables. Whereas the source-oriented deterministic modeling approach starts from the source and focuses on measurables characteristic of the source and transport influences, the receptor-oriented probabilistic modeling approach starts with the receptor and focuses on measurables characteristic of the material collected at the receptor. In the receptor modeling approach, the total mass, or the mass of the ith feature, measured at the receptor is the dependent variable, and the masses contributed by each source impacting the receptor become the independent variables:
$$m = \sum_{j=1}^{p} M_j \qquad (6.2)$$

where
m = total mass measured at the receptor
p = number of sources impacting the receptor

$$m_i = \sum_{j=1}^{p} M_{ij} \qquad (6.3)$$

where
m_i = total mass of the ith feature measured at the receptor
M_ij = mass of the ith feature arriving at the receptor from the jth source

Implicit in both of these equations is an assumption that the masses are linearly additive and nonreactive as the material arrives at the receptor.
Assumption No. 1: Linearly additive masses.
Assumption No. 2: Nonreactive at the receptor.
These assumptions are also implicit in other receptor and source oriented modeling approaches.
The mass contribution of the jth source to the ith feature may be expressed as the product of the fractional mass composition of the ith feature in the material from the jth source as it arrives at the receptor (F'_ij) and the total mass contribution (M_j):

$$M_{ij} = F'_{ij} M_j \qquad (6.4)$$

Equation 6.4 is valid so long as the above assumptions are valid. The problem with Equation 6.4, however, is that it contains two unknowns, F'_ij and M_j. Although m_i can be measured at the receptor, neither F'_ij nor M_j can be measured directly. To use Equation 6.4, an additional assumption is required.
Assumption No. 3: Conservation of relative composition.
$$F'_{ij} = F_{ij} \qquad (6.5)$$

where F_ij = fractional composition of the ith feature as measured at the jth source.
Although some compounds, such as sulfur and nitrogen oxides, are reactive and would not
meet this assumption, hundreds of other potential compounds could be measured which would satisfy
Assumption No. 3, particularly if appropriate source sampling were conducted to minimize
differences between the measured source profile and the characteristics of the material arriving at the
receptor (this will be discussed further in Section 12, Development of a Source Matrix).
Although the above discussion is an oversimplification of very complex transport
characteristics, it has been extremely useful in many applications.
With this assumption (Equation 6.5), Equation 6.4 becomes

$m_i = \sum_{j=1}^{p} F_{ij} M_j \qquad (6.4a)$


Figure 6.1: Schematic Representation of Chemical Mass Balance Principles

[Figure 6.1 (original graphic not reproduced): emission variables (process, fuel, control methods,
raw materials, time/season, etc.) determine the emissions of source j, characterized by F_ij.
Emissions from source j and from other sources undergo atmospheric modifications (condensation,
volatilization, chemical reactions, sedimentation, etc.) and arrive at the receptor (air filter) as M_j,
where m = Σ_j M_j and m_i = Σ_j M_ij = Σ_j F'_ij M_j. Assuming F_ij = F'_ij,
C_i = m_i/m = Σ_j F_ij M_j/m = Σ_j F_ij S_j/100, where S_j is the percent contribution of source j.]

By dividing both sides of Equation 6.4a by the total aerosol mass measured at the receptor, the
following equations result:

$\frac{m_i}{m} = \sum_{j=1}^{p} F_{ij} \frac{M_j}{m} \qquad (6.4b)$

$C_i = \sum_{j=1}^{p} F_{ij} S_j \qquad (6.4c)$

C_i = fractional composition of the ith feature
S_j = fraction of measured aerosol mass contributed by the jth source.
The C_i are measured at the receptor and the F_ij are measured at the source. A series of
simultaneous equations results in which the only unknowns are the source contributions. These can be
solved by several computational methods, as discussed in Section 6.2.

6.1.3 Application to Reactive Species
The receptor oriented approach (ROA) divides the reactive species problem into two parts: a
source apportionment of nonreactive species step and a step requiring the calculation of an
attenuation term which is a function of transformation and deposition effects.
Accomplishment of the first step requires the assumption that mass is conserved within a
defined set of features and that the relative composition of these features is conserved within this
subset. This assumption can be expressed mathematically on the basis of the SOA equation:
$\frac{M_{ij}}{M_{kj}} = \frac{E_{ij} A_{ij}}{E_{kj} A_{kj}} \qquad (6.6)$
where the ith and kth subscripts refer to the ith and kth features or chemical species in the emissions
from the jth source. The basic assumption required in this first step is that the transport coefficient for
the ith species is approximately equal to that for the kth species, i.e.:

$A_{ij} \approx A_{kj} \qquad (6.7)$
Under these conditions, Equation 6.6 becomes
$\frac{M_{ij}}{M_{kj}} = \frac{E_{ij}}{E_{kj}} \qquad (6.8)$
i.e., the ratio of the ith and kth chemical species as measured at the source is equal to the ratio of their
impacts at the receptor. The total mass of those species meeting this requirement, M_aj, is defined as
follows:

$M_{aj} = \sum_{q=1}^{s} M_{qj} \qquad (6.9)$
where M_qj is a species whose relative composition is conserved, summed over the s conserving
species. The fractional composition of the ith species relative to this subset of conserving species,
F_ij, is defined by the following equation:

$F_{ij} = E_{ij} / E_{aj} \qquad (6.10)$

where E_aj is the emission factor for all of the conserving species.
Although this assumption may not be valid for all gaseous and particulate emissions, it can be
a useful assumption if it is restricted to relatively stable species in specified particulate size ranges.
The average chemical composition of the fine particle fraction (particles with aerodynamic diameters
less than 1 or 2 µm) from a coal-fired power plant, for example, should remain relatively constant
during transport if such obviously nonconserving species as S are not included in the definition of
M_aj. In addition, the effects of condensation and evaporation as well as differential deposition can be
minimized by characterizing emissions with a size-segregating dilution sampler. This, plus the
realization that particles from such sources are complex mixtures of chemicals having, on the average,
the composition of the bulk material, provides additional confidence in the use of this assumption.
The fundamental receptor model equation

$m_i = \sum_{j=1}^{p} F_{ij} M_{aj} \qquad (6.11)$
is based on the assumptions of conservation of relative mass and linearly additive and nonreactive
features. It states that the mass of the ith species, as measured at the receptor, is equal to the sum of
the contributions from p sources, where the jth source contribution is defined by the product of the
fractional composition term, F_ij, and the mass contribution from the jth source, M_aj. Solution of
Equation 6.11 completes the first step, i.e., determination of the contribution of the jth source
category to the nonreactive (conserving) species at the receptor.
The ROA to source apportionment of reactive species was first established in 1972 by Miller
et al.:

$m_r = \sum_{j=1}^{p} \alpha_{rj} F_{rj} M_{aj} \qquad (6.12)$
where
m_r = mass of reactive species measured at the receptor
α_rj = Miller's coefficient of fractionation
F_rj = E_rj/E_aj, i.e., the ratio of reactive species to the sum of conserving species as measured at
the source with a size-segregating dilution sampler.
The nature of Miller's coefficient of fractionation can be derived from the SOA equations. Rewriting
Equation 6.6 in terms of a reactive species,

$\frac{M_{rj}}{M_{aj}} = \frac{E_{rj} A_{rj}}{E_{aj} A_{aj}} \qquad (6.13)$
and solving for the mass contribution of the reactive species, M_rj,

$M_{rj} = \left( \frac{A_{rj}}{A_{aj}} \right) \left( \frac{E_{rj}}{E_{aj}} \right) M_{aj} \qquad (6.14)$
and summing over p sources, yields

$m_r = \sum_{j=1}^{p} M_{rj} = \sum_{j=1}^{p} \left( \frac{A_{rj}}{A_{aj}} \right) \left( \frac{E_{rj}}{E_{aj}} \right) M_{aj} \qquad (6.15)$
After substituting F_rj for the emission ratio term and comparing the resulting equation

$m_r = \sum_{j=1}^{p} \left( \frac{A_{rj}}{A_{aj}} \right) F_{rj} M_{aj} \qquad (6.16)$
with Equation 6.12, it is clear that

$\alpha_{rj} = A_{rj} / A_{aj} \qquad (6.17)$
That is, Miller's coefficient of fractionation, α_rj, is equal to the ratio of the reactive species
transport coefficient to the transport coefficient for the nonreactive species.
The reactive species impact at the receptor has thus been expressed as a function of the
nonreactive species impact (M_aj), the ratio of reactive species to nonreactive species (F_rj), and a
relative transport term (α_rj, Miller's coefficient of fractionation). The product, F_rj M_aj, represents the
maximum impact of the reactive species at the receptor, i.e., the unattenuated reactive species impact.
The relative transport term can be thought of as an attenuation factor or a transmission or
fractionation coefficient which would be less than one for such species as SO2.
Equation 6.12 for a reactive species such as SO2, for example, becomes

$m_{SO_2} = \sum_{j=1}^{p} \alpha_{SO_2,j} F_{SO_2,j} M_{aj} \qquad (6.18)$

6.2 Solutions to the Receptor Model Equation

6.2.1 Introduction
If one measures n features of both sources and receptor aerosols, n equations of the form of
Equation 6.4c exist, i.e.,
$C_{ik} = \sum_{j=1}^{p} F_{ij} S_{jk}, \qquad i = 1, 2, \ldots, n \qquad (6.4c)$

where the kth subscript refers to the kth filter.
If the number of source types contributing these features is less than or equal to the number of
equations, i.e., p ≤ n, then the source contributions, the S_jk, can be calculated.

The chemical mass balance solution to the receptor model equation can take on five levels of
complexity: tracer, tracer plus stripping, ordinary least squares fitting, weighted least squares fitting
and effective variance least squares fitting.

6.2.2 Tracer Method
The tracer property method is the simplest approach. It assumes that each aerosol source type
possesses a unique property which is common to no other source type. If this is valid, Equation 6.4c
will reduce to a simple proportionality, i.e.,
$S_{jk} = \frac{C_{ik}}{F_{ij}}$

where the impact of source j can be determined by dividing the concentration of tracer element i (C_ik)
by the fraction of tracer element i in the source emissions (F_ij). The result will be an upper limit for
the contribution of source j since all of element i has been attributed to this source. Assume, for
example, that
1. Fine particle Pb is considered a unique tracer for leaded auto exhaust.
2. Automotive exhaust contains 20% Pb.
3. 1.0 µg/m³ of Pb is measured at a receptor.
Then the total impact of leaded auto exhaust on the ambient aerosol as determined by the tracer
method is 5 µg/m³ (1.0 µg/m³ / 0.20). This approach can be extended to other sources so long as the
tracers meet the following requirements:
tracers meet the following requirements:
- The fraction of the tracer in the source emission is invariant with time.
- The concentration of the tracer element can be accurately and precisely measured in the
ambient sample.
- The tracer element is emitted by only one source category.
These conditions are difficult to meet, are highly susceptible to analytical and sampling errors
in both the ambient and source particles, and tend to ignore other useful information contained in the
ambient aerosol "fingerprint".
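As an illustration of the arithmetic, a minimal Python sketch of the tracer calculation is given below. The function name and values are taken from the leaded auto exhaust example above, not from any standard CMB software.

```python
# Minimal sketch of the tracer method: the upper-limit contribution of
# source j is the ambient tracer concentration divided by the tracer's
# mass fraction in the source emissions (S_jk = C_ik / F_ij).

def tracer_upper_limit(c_ik, f_ij):
    """c_ik: ambient tracer concentration (ug/m3); f_ij: mass fraction (0-1)."""
    return c_ik / f_ij

# 1.0 ug/m3 of Pb at the receptor, 20% Pb in auto exhaust:
print(tracer_upper_limit(1.0, 0.20))  # -> 5.0 ug/m3 of leaded auto exhaust
```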

6.2.3 Tracer Plus Stripping
The next level of sophistication is the tracer plus stripping method. In this case two or more
sources are assumed to contribute to an element's concentration. The influence of each source is
"stripped" or subtracted one at a time. Consider, for example, that the ambient Pb concentration, C_Pb,
is the sum of contributions from both automotive exhaust and road dust:
$C_{Pb} = F_{Pb,Auto} S_{Auto} + F_{Pb,RD} S_{RD}$
It then follows that this equation can be solved for S_Auto:

$S_{Auto} = \frac{1}{F_{Pb,Auto}} \left[ C_{Pb} - \left( \frac{F_{Pb,RD}}{F_{Si,RD}} \right) C_{Si} \right]$
where Si is used as a tracer for road dust and the second term in the brackets is the Pb associated
with road dust. This equation can then be solved if the F factors are known. A more complex
example is illustrated below for an airshed containing several contributing sources.

SOURCE CALCULATIONS
1. Road Dust (RD)
   [{[Si µg/m³] / 0.281} / Deposit Mass µg/m³] x 100% = %RD
2. Auto Exhaust (AE)
   [{[Pb - (Si x 0.013)] / 0.21} / DM] x 100% = %AE
3. Residual Oil (RO)
   [{[Ni - (Si x 0.00014)] / 0.054} / DM] x 100% = %RO
4. Hog Fuel Boiler (HFB)
   [{[K - (Si x 0.036)] / 0.2} / DM] x 100% = %HFB
   (the non-soil potassium, K_n = K - Si x 0.036, is carried into the next step)
5. Kraft Recovery Boiler (KRB): Maximum Na Source Contribution
   [{[Na - (Si x 0.064 + K_n x 0.21)] / 0.1281} / DM] x 100% = %KRB
6. K-Na/S Ratio
   {[Na - (Si x 0.064 + K_n x 0.21)] / S} = K-Na/S
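The stripping arithmetic for the auto exhaust case can be sketched in Python as follows. The factors 0.013 (Pb/Si ratio in road dust) and 0.21 (Pb fraction in auto exhaust) are taken from the worked example above; the ambient concentrations are hypothetical.

```python
# Sketch of tracer plus stripping: subtract the road-dust Pb (scaled from
# the Si tracer) before converting the remaining Pb to an auto exhaust mass.

def strip_auto_exhaust(c_pb, c_si, f_pb_auto=0.21, pb_per_si_rd=0.013):
    """S_Auto = [C_Pb - (F_Pb,RD / F_Si,RD) * C_Si] / F_Pb,Auto."""
    return (c_pb - pb_per_si_rd * c_si) / f_pb_auto

# Hypothetical ambient concentrations (ug/m3):
print(strip_auto_exhaust(c_pb=0.8, c_si=5.0))  # auto exhaust mass, ug/m3
```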

6.2.4 Least Squares
The basic set of equations is most often solved for the S_j values by a number of more complex
mathematical procedures, all of which use standard matrix manipulation techniques. The basic
approach consists of an ordinary linear least squares solution to the set of simultaneous equations.
The method attempts to find a combination of S_j's that minimizes the sum of the squares of the
deviations between the measured and predicted concentrations of the species designated as fitting
species. The least squares solution requires that there be more fitting elements than S_j's to be
determined, i.e., more equations than terms in the basic equation set. Once the S_j's are determined,
however, all of the species contributions from each source and the total predicted concentration of
each species can be computed and expressed as a ratio to the measured species concentration as a
measure of the quality of fit. Estimated concentrations of species not designated as "fitting"
components (referred to as "floating" elements) can be used as independent measures of the adequacy
of the "fit". Although the ratio of the calculated/measured species concentrations should ideally be
1.0, researchers usually consider ratio values within a range of 0.5 to 2.0 to be acceptable.
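As a concrete sketch, the ordinary least squares fit can be written in a few lines of NumPy. The composition matrix below reuses the two-source, three-element values of the simulated example in Section 8.3, and the ambient vector is one sample from that data set; neither is a real source profile.

```python
# Ordinary least-squares CMB sketch: solve C = F S for the source
# contributions S_j by minimizing the sum of squared deviations.
import numpy as np

F = np.array([[0.15, 0.01],    # Pb fraction in auto exhaust, road dust
              [0.01, 0.15],    # Si
              [0.01, 0.05]])   # Al
c = np.array([0.23, 1.21, 0.41])   # ambient concentrations (ug/m3)

s, residuals, rank, sv = np.linalg.lstsq(F, c, rcond=None)
print(s)           # estimated S_j (here ~1 and ~8 ug/m3)
print(F @ s / c)   # calculated/measured ratios; ideally near 1.0
```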

6.2.5 Ordinary Weighted Least Squares (OWLS)
Standard least squares fitting procedures are usually unsatisfactory, since they tend to ignore
important species, e.g., As and Mn, that are often found in concentrations near the minimum detection
limit of the analytical methods used. To properly account for these less abundant species in the fitting
process, a weighted least-squares solution based on the inverse of the absolute standard deviation (of a
single determination) of the measured atmospheric concentration has been used. This procedure
produced good fits to concentrations of fitting elements in studies of Washington, D.C. aerosol.

In the ordinary weighted least-squares (OWLS) method, the most probable values of S_j when
n > p are obtained by minimizing χ², the sum of squares of the differences between the measured
values of C_i and those calculated from Equation 6.4c, weighted by σ_Ci, the analytical uncertainty of
the C_i measurement:

$\chi^2 = \sum_{i=1}^{n} \frac{\left( C_i - \sum_{j=1}^{p} F_{ij} S_j \right)^2}{\sigma_{C_i}^2} \qquad (6.19)$
Written in matrix terms, the CMB equation (Equation 6.4c) is
C = FS, (6.20)
where C is the n x 1 vector of observed concentrations, F is the n x p source composition matrix, and
S is the p x 1 vector of source contributions. The OWLS solution to Equation 6.20 is
$S = \left( F^t W F \right)^{-1} F^t W C \qquad (6.21)$
where W is a diagonal matrix with σ_Ci⁻² on the diagonal. Superscript "t" denotes the matrix transpose
and "-1" the matrix inverse. This solution provides the added benefit of being able to propagate the
measurement uncertainty of C_i through the calculations in order to develop a confidence interval
around the calculated S_j. The error estimates of the source contributions calculated from ordinary
weighted least-squares (assuming errors only in the observed concentrations and no errors in the
source composition matrix) are given by the diagonal elements of the matrix
$\left( F^t W F \right)^{-1} \qquad (6.22)$
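A minimal NumPy sketch of Equations 6.21 and 6.22 is given below; the inputs are hypothetical placeholders, not measured profiles.

```python
# Ordinary weighted least-squares (OWLS): S = (F^t W F)^-1 F^t W C, with
# W diagonal and equal to the inverse variances of the ambient data.
import numpy as np

def owls(F, c, sigma_c):
    W = np.diag(sigma_c ** -2.0)
    cov = np.linalg.inv(F.T @ W @ F)    # Equation 6.22: covariance of S
    s = cov @ F.T @ W @ c               # Equation 6.21
    return s, np.sqrt(np.diag(cov))     # contributions and their std. devs.

F = np.array([[0.15, 0.01], [0.01, 0.15], [0.01, 0.05]])
c = np.array([0.23, 1.21, 0.41])
sigma_c = np.array([0.02, 0.10, 0.05])  # hypothetical analytical uncertainties
print(owls(F, c, sigma_c))
```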

6.2.7 Effective Variance Least-Squares Method (EVLS)
The ordinary weighted least-squares solution is incomplete because the ambient aerosol
chemical properties C_i are not the only measured observables. The source aerosol chemical properties
F_ij are also measured observables, but the errors associated with those measurements are not
included in the ordinary weighted least-squares fit.
The maximum likelihood solution when both errors are included minimizes the function

$\chi^2 = \sum_{i=1}^{n} \frac{\left( C_i - \sum_{j=1}^{p} F_{ij} S_j \right)^2}{\sigma_{C_i}^2 + \sum_{j=1}^{p} \sigma_{F_{ij}}^2 S_j^2} \qquad (6.23)$
where σ_Fij is the uncertainty associated with the F_ij measurement.
The least-squares solution using uncertainties in the F matrix is identical in form to the
ordinary weighted least-squares solution in Equation 6.21 except that the weights σ_Ci⁻² are replaced
by

$V_{eff,i}^{-1} = \left( \sigma_{C_i}^2 + \sum_{j=1}^{p} \sigma_{F_{ij}}^2 S_j^2 \right)^{-1} \qquad (6.24)$
Since the effective variance solution depends on the source contributions, which are unknown,
an iterative procedure is followed. At each iteration, the previously estimated S_j are used to compute
new effective variance weights, which are used, in turn, to compute new S_j. The process terminates
when the S_j do not change much from step to step. The error estimates for the S_j are obtained from
the diagonal elements of the matrix
$\left( F^t V_{eff}^{-1} F \right)^{-1} \qquad (6.25)$

where $V_{eff}^{-1}$ is a diagonal matrix with terms from Equation 6.24 on the diagonal.
This solution provides two benefits. First, it propagates a confidence interval around the
calculated S
j
which reflects the cumulative uncertainty of all the input observables. The more precise
the ambient source property concentration measurements are, the better the source contribution
estimate will be (up to a point). The second benefit provided by this effective variance weighting is
to give those chemical properties with larger measurement uncertainties, or chemical properties that
are not as unique to a source type, less weight in the fitting procedure than those properties that are
more precisely measured or that are truly unique to a source. In addition, this method allows the
inclusion of all data for which uncertainties are known. The maximum accuracy in source
apportionment is thus possible because the maximum amount of input data can be used.
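The iteration described above can be sketched as follows, building on the owls() example of Section 6.2.5. Here sigma_F is the n x p matrix of source profile uncertainties, and the tolerance and iteration cap are arbitrary choices, not values prescribed by the CMB literature.

```python
# Effective variance least-squares (EVLS) sketch, Equations 6.23-6.25:
# the weights depend on the unknown S_j, so iterate until S_j stabilizes.
import numpy as np

def evls(F, c, sigma_c, sigma_F, tol=1e-6, max_iter=50):
    s = np.zeros(F.shape[1])          # starting guess: zero contributions
    cov = None
    for _ in range(max_iter):
        v_eff = sigma_c**2 + (sigma_F**2) @ (s**2)   # Equation 6.24
        W = np.diag(1.0 / v_eff)
        cov = np.linalg.inv(F.T @ W @ F)             # Equation 6.25
        s_new = cov @ F.T @ W @ c
        if np.max(np.abs(s_new - s)) < tol:          # converged?
            s = s_new
            break
        s = s_new
    return s, np.sqrt(np.diag(cov))
```

On the first pass (all S_j = 0) the effective variances reduce to the ambient variances, so the iteration effectively starts from the ordinary weighted solution.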

6.2.8 Ridge Regression
The CMB equation (Equation 6.20) can be discussed in statistical terminology as a linear regression
problem. In that framework, the observed concentrations C are the dependent variables, the columns
of the source composition matrix F are the values of the independent variables, and the source
contributions are the regression coefficients to be estimated by least-squares techniques. The
following discussion of ridge regression uses this statistical jargon.
An important difficulty is encountered when estimating the separate particulate contributions
of sources whose aerosols have similar chemical compositions. This problem, "multicollinearity",
is a potentially serious one for CMB receptor models. Often two sources have nearly identical
compositions, or one source composition is almost identical to a linear combination of other source
compositions. In this case, the least-squares solution is mathematically unstable, i.e., small errors in
the measurements will be magnified into large errors in the calculated source contributions. Box et al.
(1978) present a relatively simple example of the problems caused by instability. Multicollinearity is
discussed in more detail later in these notes.
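A sketch of the ridge estimator is given below: a small constant k is added to the diagonal before inversion to stabilize the solution. The choice of k is a tuning decision, and the value shown is arbitrary.

```python
# Ridge regression sketch for collinear source profiles:
# S = (F^t F + k I)^-1 F^t C trades a small bias for numerical stability.
import numpy as np

def ridge_cmb(F, c, k=0.01):
    p = F.shape[1]
    return np.linalg.solve(F.T @ F + k * np.eye(p), F.T @ c)
```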

6.3 Applying the CMB
This discussion will focus only on the EVLS solution to the CMB since it is the most widely
used and represents the most realistic fitting and propagation of uncertainties.


6.3.1 Input Data Requirements
The input data requirements for the CMB (EVLS) modeling approach include the
- Chemical composition of potential sources
- Chemical composition of ambient aerosol at the receptor
- Uncertainties in each of the above
- Identification of fitting and floating species

6.3.2 Obtaining the Most Probable Source Contributions
a. Include all source profiles and elements in the initial CMB fit.
b. Remove sources with negative predicted intensities one at a time, discarding first the source
where σ_j/S_j is largest, and repeat the CMB calculation after each removal.
c. When all remaining source intensities are positive, remove any additional sources one at a
time, discarding first the source where σ_j/S_j is largest and greater than 1, and repeat the CMB
calculation after each removal.
d. Experiment with adding or substituting discarded sources to improve agreement between the
measured and predicted elemental concentrations, to reduce the percentage error of predicted
total mass, and to minimize the reduced chi-square value.
This procedure can be applied to the individual observation periods separately and to the
average of the periods. Although much more computing is required, the former was considered
preferable because of the more satisfactory handling of multicollinearity and improved resolving
capabilities. Even though a particular source is discarded in the analysis of one observation period, it
will likely be retained in another, and average intensities for all sources are then obtained from an
average of results from all periods.
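A minimal sketch of the elimination loop in steps b and c is given below, built on the owls() example of Section 6.2.5. The stopping rules follow the steps above; step d (experimenting with substitutions) is left to the analyst, and all names are hypothetical.

```python
# Source elimination sketch: refit after dropping, one at a time, first any
# source with a negative contribution, then any source with sigma_j/S_j > 1.
import numpy as np

def fit_with_elimination(F, c, sigma_c):
    keep = list(range(F.shape[1]))          # indices of retained sources
    while keep:
        s, s_err = owls(F[:, keep], c, sigma_c)
        rel = np.abs(s_err / s)             # sigma_j / S_j for each source
        if (s < 0).any():
            worst = int(np.argmax(rel * (s < 0)))   # worst negative source
        elif (rel > 1).any():
            worst = int(np.argmax(rel))             # worst unstable source
        else:
            return keep, s, s_err           # all sources positive and stable
        keep.pop(worst)                     # drop the worst source and refit
    return [], None, None
```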

6.3.3 Evaluating the Goodness of Fit
The last step involved in applying the CMB requires that the operator select the best fit based
on an evaluation of goodness-of-fit parameters. These goodness-of-fit parameters include:
- Reduced Chi square (minimum)
- Degrees of freedom (maximum)
- Total percent of mass explained (100%)
- Ratio of calculated to measured elemental mass. (1.0)
As already noted, the CMB results are calculated by minimization of the chi-square function
(χ²) as defined by Equation 6.23. The reduced chi-square, χ²_{n-p}, is a goodness-of-fit parameter
reported at the top of the computer output along with the source apportionment results and is defined
as:

$\chi^2_{n-p} = \frac{\chi^2}{n - p}$
Unless the source compositions and ambient species concentrations are measured exactly, χ²_{n-p}
will rarely be equal to zero. Fortunately, the probability distribution of χ²_{n-p} when only random errors
are present is well characterized. Using this distribution, and assuming that only random error is
present in the source and ambient data, a statement can be made about obtaining a certain value of
χ²_{n-p}. To make this type of statement it is common practice to refer to the 1% χ²_{n-p} (also known
as the critical value). The 1% χ²_{n-p} is the value that χ²_{n-p} would exceed 1% of the time if it were
calculated for many data sets and the only cause of deviations was random error. The critical value is
dependent on n - p, the number of degrees of freedom. Critical values of about 2.0 are not uncommon.
Consequently, in fits where χ²_{n-p} exceeds 2.0, there is probably something incorrect included in
the chi-square equation: either the uncertainties σ_Fij or σ_Ci were underestimated, or the source set
included in the fit is not the real one. On the other hand, a reported χ²_{n-p} of less than 2.0 is in the
range of high probability, assuming only random error is present.
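In practice these diagnostics are computed directly from the fitted quantities. The following is a minimal sketch; all names are hypothetical, and v_eff denotes the effective variances of Equation 6.24.

```python
# Goodness-of-fit sketch: reduced chi-square, degrees of freedom, percent
# of total mass explained, and calculated/measured species ratios.
import numpy as np

def goodness_of_fit(F, c, s, v_eff, total_mass):
    n, p = F.shape
    resid = c - F @ s
    chi2_reduced = np.sum(resid**2 / v_eff) / (n - p)   # chi^2 / (n - p)
    ratios = (F @ s) / c           # ideally near 1.0, within 0.5-2.0
    pct_mass = 100.0 * s.sum() / total_mass
    return chi2_reduced, n - p, pct_mass, ratios
```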
The χ²_{n-p} as developed is not used to set confidence limits but instead as a parameter useful
for alerting us to probable misspecifications in the fitting parameters. From experience with CMBs it
is apparent that the best fit for a specific ambient filter is usually achieved with the set of sources that
produces the lowest χ²_{n-p}. This observation cannot be used to set confidence limits, but it is an
additional piece of circumstantial evidence which builds confidence in the results.
This is not always the case, however, since the source and ambient uncertainties appear in the
denominator and may have a strong influence in producing low chi-square values when sources
and/or species having very large uncertainties are used. This is one reason why other goodness-of-fit
parameters are used.
The degrees of freedom listed (7) is the difference between the number of fitting elements
(those elements with asterisks next to them) and the number of sources, i.e. n-p. A great deal of
weight is not given to this parameter, but if other goodness-of-fit parameters are about equal, then the
fit with the highest degrees of freedom would be selected.
The percent mass explained (70 ± 10%) and the ratio of calculated to measured mass are the
other two goodness-of-fit parameters that must be considered when selecting the best fit. Although
only 70% of the fine particle mass was explained in this example, this is typical of fine particle
samples when only those species listed are measured. This list, for example, does not include water,
secondary organic compounds, unaccounted-for oxides, ammonium ion, etc.
The elemental ratios for the fitting elements are quite good. In every case, the ratios for the
fitting elements are within one standard deviation of 1.0 except for Fe. Cr, Cu and Zn are
substantially underfit as indicated by their low values and uncertainties. Since Cr, Cu and Zn were
not included as fitting elements, they did not influence the calculated source contributions. The Fe,
on the other hand, would tend to increase the contribution of other sources rich in this element since
an attempt would have been made to attribute more of the Fe mass to these sources. The use of
multiple compounds (15) in the fitting process, however, tends to reduce the influence of any
one species on the final fit.
The underfitting of the Cr, Fe, Cu and Zn suggests that one or more sources have been
omitted from the fit. Although all sources contributing to the aerosol should be included in the fit,
this requirement is a matter of degree and depends on such factors as which elements are used in
the fit. Even though a Zn source such as galvanizing was not included, it has not substantially
affected the quantitation of the other sources included in the fit because the Zn was not a fitting
element.
The sources included in the EVLS fit represent but one possible solution. This solution,
however, was determined to be best by trying other combinations of sources and comparing the
reduced chi-square, the percent mass explained, and the ratio of calculated to measured values, as
well as the degrees of freedom. Although only six primary sources were fit (the nitrate and sulfate
are the excess of each not attributed to primary sources and represent an upper limit to the
secondary component), the other 20 or more sources considered but not fit can be assumed to have
contributed less than their minimum detectable limit, which was about one percent or less.
The fact that a source cannot be used in a fit may be as important as if it were fit, since this
information can be used to establish that the source is not a substantial contributor. This upper limit
for a source contribution may also be established in some cases with the tracer method when a key
indicating element for a particular source has a very low concentration in the ambient sample.
Consider, for example, the objective of establishing an upper limit for the contribution of a glass
manufacturing furnace to particulate mass. Although the most abundant species observed in the
glass furnace emissions sampled as part of the PACS were Na and S, these same elements are
relatively abundant in the ambient atmosphere and in other likely sources. Se, on the other hand, is
also reasonably abundant in the glass furnace emissions, about 1%, and relatively low in the
atmosphere and other sources. Thus, Se can be used to establish a useful upper limit to the
contribution of the glass furnace by assuming that all of the Se in the ambient aerosol is due to the
glass furnace. If the ambient Se concentration is less than 5 ng/m³, then the maximum glass furnace
contribution would be 0.5 µg/m³ (5 ng/m³ / 0.01). Even if there is considerable variability in the
source concentration, a factor of two uncertainty would only extend the upper limit to 1 µg/m³,
hardly sufficient to include as a major component in any control strategy. This information is of
particular importance to the glass manufacturer, who will not have to invest in additional control
equipment.

6.4 Extending the CMB Results
It needs to be emphasized that the above fitting procedure (determination of the most probable
source contributions) is completely independent of all data except for receptor data. Because of
multicollinearities, some sources cannot be resolved, such as road dust and soil, and must be included
in a common source category (crustal), or a source may not have a sufficiently unique source profile
to separate its influence from other sources. In these cases, other data besides receptor data may be
used to provide further insight into sources and their contributions. Emission inventory scaling, for
example, may be used to estimate the contribution of diesel exhaust based on the fuel or mileage
ratios and the measured source contribution of leaded gasoline combustion. Another example would
be the use of emission inventory scaling to subdivide a larger, general category such as crustal, into
components like soil, road dust, rock crusher and asphalt batching. Use of wind sector analysis may
also be helpful, particularly for point sources. Use of event activities such as strikes and permitted
activities, as well as time variability analysis, may also provide additional insight into how to resolve
the influence of specific sources.

7.0 PARTICLE TYPE MASS BALANCE (PTMB)

7.1 Summary
Microscopy is a valuable tool for the study of settled and suspended atmospheric particles. Its
use in conjunction with an air sampling program allows the acquisition of information beyond the
measurement of the mass of particles collected in a given time. Particles may be sized, identified, and
quantitated in a thorough examination. Where time and costs are limited, less precise estimates of
quantity may be justified. However, it should be realized that such crude estimates are subject to
significant error when the sample contains many components.

Capabilities useful in receptor modeling.
Light Microscopy
- Particle by particle analysis (identification, size)
- Phase identified
- Applicable to all types of particles
- Qualitative and Quantitative results
- Qualitative analysis independent of sampling size
- Possible direct source identification and assignment
Electron Microscopy*
- Particle by particle analysis (size, shape, elemental composition - direct elemental signature)
- Automation possible (statistically significant numbers of particles can be studied without
operator dependence)
* Includes computer-controlled SEM (CCSEM)
Limitations
Light Microscopy
- Operator dependence (training and experience)
- Size limitation (generally applicable to particles >1-2 µm; in special cases smaller particles
are distinguishable)
- Specific source assignment dependent upon adequate reference standards
- Quantitative analysis dependent upon statistics
- Mounting medium required
- Sorbed and liquid organics not observed
Electron Microscopy
- Requires a vacuum
- Elemental measurement varies with size and shape
- Unreliable for organics; sorbed and liquid organics not observed
- Standard and quantitative requirements similar to those of optical microscopy
Suitability of various sample substrates for microscopy*
Light Optical
- Cellulose Acetate (membrane) filter

- Glass fiber
- Others
Electron Optical
- Nuclepore
- Millipore
- Teflon
- Glass
*Ranked in order of desirability.

7.2 Comparison of Microscopic Approaches
SEM and OLM are very similar in many respects. Each involves two basic steps:
- Qualitative identification of particle type and
- Quantification of particle mass.
The major strength of microscopic methods is their qualitative ability to sort particles into a
large number of particle type classifications based on a large number of particle features. These
features include morphology, light refractivity, elemental composition, etc. All of these features,
however, simply contribute to a better qualitative analysis, i.e., particle type identification. The
extension of SEM to include elemental analysis simply added another feature that could be used in
particle type classification.
In general, OLM appears to be a better qualitative analysis tool than SEM in that it can divide
the particles into more subcategories.
When it comes to quantifying the amount of a specific particle type on a filter, both are
identical in their requirements, i.e., each requires the following information:
- Volume of each particle
- Density of each particle
- The number and mass of particles analyzed must be a statistically valid representation of the
particles deposited on the original filter.
- No overlapping particles.
- Particles generally must be removed from fibrous filters for "quantitative" analysis.
Whereas the strength of microscopic methods is their ability to resolve a large number of
particle types, their weakness is their inability to provide a precise quantitative estimate of the mass
associated with each particle, for the following reasons:
- Volume determination is made by providing an estimate of the projected area and an assumed
shape factor. Of course, particles may be cenospheres, fractured cenospheres, spores,
crystalline material, minerals, amorphous particles, etc. In some cases, only the "tip of the
iceberg" may be showing. Although there are always problems associated with any analytical
method, they can often be eliminated and/or minimized by specific procedures such as
sample preparation that redistributes particles such that a minimum number of particles are
overlapping. Combining the results from a large number of relatively imprecise and
inaccurate single-particle analyses yields an average which has a much higher relative
accuracy and precision. (But, a large number of particles must be analyzed.)

- The density of each particle type must be assumed. Although this often is not a major
problem for TSP analysis because of the dominance of large-particle crustal material, which has
a relatively small range in density, it may be a more substantial problem with the
apportionment of fine particles (<2.5 µm), lead particles, etc.
- The degree to which the particles analyzed is representative of the entire distribution is
determined by the number of particles analyzed. For example, if 100 particles were analyzed
by OLM and divided into 35 categories, there would be an average of about three particles
per category if they were of equal abundance. It would still be only 5 particles per category if
they were divided into 19 categories as with CCSEM. Since the particle abundances can vary
by orders of magnitude, a very large number of particles (> 1000) must be analyzed before
the less abundant particles can be quantified. In addition, numerous fields of view from
representative filter locations should be included in the analysis.
One aspect of the problem being addressed with the microscopic technique which minimizes
the uncertainty in quantifying major sources of TSP is that much of the TSP mass consists of large
crustal type particles. Since the particle mass varies as r³, the measurement of a few large particles
goes a long way towards explaining much of the TSP. This, as well as sample preparation techniques,
has a tendency to significantly bias the results towards those sources contributing primarily large
particles.
- Sample preparation represents another major limitation for this method since the particles
must be removed from fiber filters and redeposited on another filter prior to actual
microscopic analysis. The procedures commonly in use have been known to introduce bias
towards large particles because of low removal efficiencies for fine particles, dissolved
particles, etc.

7.3 Mass Balance Calculation
Many different optical and chemical properties of single aerosol particles can be measured,
enough to distinguish those originating in one source type from those originating in another. The
microscopic analysis receptor modeling approach takes the form of the chemical mass balance
equation presented in Equation 6.4c.

$C_i = \sum_{j=1}^{p} F_{ij} S_j \qquad (7.1)$
where in this case
C_i = fractional mass of particle type i in the receptor sample
F_ij = fractional mass of particle type i in source type j
S_j = fraction of total mass measured at the receptor due to source j.
The solutions to the set of Equations 7.1 are the same as those described for the CMB method.
Only the tracer method, however, (each receptor particle type is assumed unique to one source type)
has been used in the past.
The microscopic receptor model can potentially include more aerosol type categories than
those used to date in the chemical mass balance and multivariate models. It has not taken advantage
of the mathematical framework developed in the other two models to deal with particle types which
are not unique to a given source type or the ambient and source measurement uncertainties.
The data inputs required for this model are the ambient particle type concentrations, the C_i,
and the source particle type concentrations, the F_ij. To estimate the confidence interval of the
calculated S_j, the uncertainties are also required. Microscopists generally agree that a list of likely
source contributors, their location with respect to the receptor, and windflow during sampling are
helpful in confirming their source assignments.
The major limitation of microscopic receptor models is that the analytical method, the
classification of particles possessing a defined set of properties, has not been separated from the
source apportionment of those particles. Equation 7.1 has never been used in this application
although a simplified form was used in El Paso, Texas. The particle type identification takes place on
recognition of the particle by the microscopist. The particle properties he uses for this identification
are his alone (or his laboratory's) and are not subject to interpretation by another microscopist.
Source contributions assigned to the same aerosol sample have varied greatly in
intercomparison studies, but without the intermediate particle property classifications, it is impossible to
ascribe the differences to the analytical portion or to the source assignment portion of the process.
Many times the microscopist relies on his past knowledge of the properties of aerosol sources
without the examination of local source material. For example, a coarse particle sample taken in the
vicinity of a coal fired power plant may show a 20% contribution due to flyash and an 80%
contribution due to minerals. It has been pointed out, however, that the 20% flyash may not have
come from the power plant during the sampling period. An examination of nearby soils could show a
20% flyash concentration resulting from long term deposition. Using one of the least squares or
linear programming approaches to the solution of Equation 7.1 with local source compositions could
alleviate this problem.
Though the aerosol properties mass balance suggested by the microscopic models shows
promise, it is limited by the lack of a standardized and reproducible analytical method.

8.0 MULTIVARIATE ANALYSIS

8.1 Introduction
The receptor model equation is
C = FS (8.1)
The CMB (regression) approach estimates S from measurements of C and F. An alternate
approach to Equation 8.1 is factor analysis, a generic name for a variety of techniques which attempt
to estimate F and S given only C. The combination of past applications and validation studies has
established factor analysis as a useful tool for developing source related information. In general,
factor analysis has proved to be much more successful at indicating the nature of F, the source matrix
than S, the source contribution matrix. In addition, a hybrid method that combines the factor analysis
and regression approaches can be used to calculate reasonable estimates of both the source
composition and source contributions for sources with large impacts.

8.2 Important Features of Factor Analysis
1. It is a multivariate statistical technique. Since it is a multivariate technique, it requires a large
data set consisting of many measurements of many parameters. For example, a data set made
up of the concentrations of 15 chemical species measured on 24 hour air filters collected for
100 days at one site would be a good candidate for factor analysis. As with any technique,
factor analysis requires reliable input data to produce reliable results. A useful rule of thumb
is that to perform a factor analysis, the data set must contain 3 times as many samples as the
number of measured species.
2. Assuming that the chemical compositions of emissions from a single source are constant,
then the concentrations of species emitted from a source as measured at a receptor will tend
to rise and fall in tandem due to temporal variations in emission intensity and transport
parameters (e.g., wind speed, mixing heights, etc.). Factor analysis takes advantage of the
manner in which concentrations of species from a source vary together in time. In fact, the
starting point of most factor analyses is a correlation matrix. The members of the matrix are
the pair-wise correlations between the species or the samples.
3. The method of factor analysis is in many ways conceptually similar to the process of
interpreting a bivariate (x vs y) scatter plot. The two main differences are that the plot is two-
dimensional while factor analysis is multidimensional, and the plot is a graphical procedure
while factor analysis is mathematical. In spite of these differences, the concepts involved in
the two procedures are very similar. In essence, factor analysis allows us to explore for
patterns in multidimensional data in the same manner that bivariate plots often reveal patterns
in two-dimensional data.




8.3 Simulated Analysis Example
In order to illustrate the method of factor analysis, it is useful to work through an example
with a simulated data set. The ambient data matrix is formed by multiplying together the simulated
source contribution and source composition matrices shown in Figure 8.1. The simulated ambient
data set manufactured by this process is shown in Figure 8.2. The goal of factor analysis is to reverse
the process used to form the ambient data set. In other words, factor analysis attempts to factor the
ambient data set in Figure 8.2 into its two precursor matrices shown in Figure 8.1.
Figure 8.3 is a three-dimensional plot of the ambient data set. Each sample in the ambient data
set is represented by a vector in the space defined by the three elemental (Pb, Al, Si) coordinate axes.
At first glance, there may not appear to be a distinct pattern to the data vectors. The heavy-lined box
added to the plot in Figure 8.4 is an attempt to make it more obvious that the data vectors lie in a
plane. Since the data lie in a plane, the system is two-dimensional; meaning any two vectors in this
plane completely defines the space containing the data vectors. The dimensionality of the plane is
indicative of the number of contributing sources.
The first objective of factor analysis is to define the dimensionality and location of the
hyperplane containing the data vectors. This determination is not done directly from the ambient data
matrix. Instead the correlation matrix containing the pair-wise correlations between the measured
species is first formed. The correlation matrix is then operated on with a procedure known as matrix
"diagonalization". The result of the diagonalization step is the calculation of the eigenvalues and
eigenvectors of the correlation matrix. The eigenvalues indicate the dimensionality of the plane or
hyperplane containing the data vectors. The eigenvectors locate the position of the plane as a
subspace within the space defined by the measured species coordinate axes.
Table 8.1 contains the eigenvalues of the simulated data set. The magnitude of the eigenvalue
is related to the significance of the corresponding eigenvector. A large eigenvalue is indicative of an
important eigenvector. The large difference in magnitude between values 2 and 3 strongly suggests
that the data set is two-dimensional and that there are only two sources contributing to the variation
in concentration of the measured species. With a real data set, the cut-off for significant eigenvalues
is never this clear. Generally, the eigenvalue table has enough structure to restrict the estimate of the
number of contributing sources to a narrow range.
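The eigenvalue screen is easy to reproduce with the simulated data of Figure 8.2. One caution in the sketch below: the values in Table 8.1 appear to be eigenvalues of the unnormalized cross-product matrix XᵗX rather than of the correlation matrix, so that is what is diagonalized here.

```python
# Eigenvalue screen on the simulated ambient data (columns: Pb, Si, Al).
import numpy as np

X = np.array([[ 23, 121, 41], [ 76,  20, 10], [ 31,  17,  7],
              [139,  69, 29], [ 54, 138, 48], [107,  37, 17],
              [112, 112, 42], [ 24, 136, 46], [144, 144, 54],
              [139,  69, 29]], dtype=float)

eigvals, eigvecs = np.linalg.eigh(X.T @ X)   # diagonalization step
print(eigvals[::-1])   # ~1.79e5, ~2.45e4, ~0: two significant eigenvalues,
                       # i.e., two contributing sources
```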
Table 8.2 contains the coordinates in the original elemental space of the eigenvectors (also
known as principal factors or principal components) corresponding to the two largest eigenvalues.
The eigenvectors are plotted in Figure 8.5 and are labeled C1 and C2, respectively. The eigenvectors
define the plane which contains the data points. The eigenvectors are not physically interpretable in
terms of the sources which influence the data. In fact, the eigenvectors are just one pair of an infinite
number of pairs of vectors which define the data plane equally well. The source composition
vectors (the rows of the source composition matrix in Figure 8.1) are one pair out of the infinite
number of pairs of vectors which lie in this plane. Now that the data plane has been located, the next
step is to estimate the position of the source vectors within this plane.
Numerous mathematical and graphical schemes have been developed to estimate the position
of the "true" source vectors within the plane defined by the eigenvectors. The three main objectives
of these procedures may be summarized as:

1. To provide reference axes which are recognizable in terms of the chemical profiles of
suspected sources;
2. To provide a stable factor solution; and
3. To test the hypothesis that a suspected source is an actual source of the data.
The procedure for estimating the source vectors that will be considered here is known as target
transformation. The input required for this procedure is the chemical composition of suspected
sources. The suspected source profiles are then tested one at a time to determine how close they are
to the data plane. Close proximity to the plane is taken as an indication that the suspected source is a
true source.
The first two profiles tried in applying the target transformation procedure to 4 suspected
source profiles are the true sources and therefore result in a null difference vector. The third
suspected source profile is obviously not a true source vector. The fourth suspected source vector
leads to somewhat puzzling results and points out one of the disadvantages of target transformation.
Because source 4 is a multiple of the auto source profile, it also produces a null difference vector.
This illustrates that the transformation procedure only tests for relative source profiles and can be
very misleading in terms of absolute magnitudes.
A second disadvantage is that the target transformation procedure is sensitive to gaps in the
investigator's a priori knowledge of the airshed. Specifically, if a real source of the aerosol is not
suspected of influencing the airshed, its profile will not be tested and consequently it will not emerge
as an identified source.
A self-modeling target transformation procedure has been developed to overcome this
disadvantage. The source profiles developed by this method are not directly related to sources since,
unlike the standard target transformation technique, source-related information is not built into the
system. Instead, common source names are assigned to the axes by a comparison with the chemical
composition of previously characterized sources. The confidence placed in the assignments must be
determined by their reasonableness and consistency with respect to supporting data.
To this point, it has been shown that factor analysis is a useful tool for indicating the number
and nature of sources which significantly impact an airshed. In order to develop quantitative source
contribution information, the estimated source composition matrix must be appropriately scaled,
thereby enabling the calculation of the source contribution matrix, S, by equation 8.1. The methods
for calculating the scaling factors are (1) sampling and analysis of identified sources, (2) literature
values of similar sources, and (3) multiple regression of the total sample mass for each of the filters
against the unscaled source contribution matrix. It is important to note that source contribution
information developed by factor analysis has only been shown to be valid for the 2 or 3 sources with
the largest impact on an airshed.
In a real world situation, the source profiles are never estimated as accurately as in this simple
example. Random error in the sampling and analysis procedures was not taken into account in this
example and would have the effect of widening the plane into an ellipsoidal shape.


Simulated Ambient Data Set (µg/m³ x 100)
Pb Si Al
1 23 121 41
2 76 20 10
3 31 17 7
4 139 69 29
5 54 138 48
6 107 37 17
7 112 112 42
8 24 136 46
9 144 144 54
10 139 69 29

Figure 8.2: Simulated Ambient Data Set, Calculated by Multiplying the Source Composition
Matrix by the Source Contribution Matrix

















Figure 8.1: Source Contribution Matrix [S] and Source Composition Matrix [F]

Source Contribution Matrix [S] (µg/m³)
Auto Road Dust
1 1 8
2 5 1
3 2 1
4 9 4
5 3 9
6 7 2
7 7 7
8 1 9
9 9 9
10 9 4
Source Composition Matrix [F] (Percent x 100)
Pb Si Al
15 1 1 Auto
1 15 5 Road Dust

Figure 8.3: Ambient Data Vectors Plotted in Elemental Space
Figure 8.4: Ambient Data Vectors Plotted in Elemental Space. A Plane has been Added to the
Graph to Indicate that the Data Vectors All Lie in a Plane


Table 8.1: Eigenvalues of the Simulated Data Set

Eigenvalue 1 = 1.79 x 10^5
Eigenvalue 2 = 2.45 x 10^4
Eigenvalue 3 = -6.74 x 10^-3

Table 8.2: Coordinates of 1st and 2nd Eigenvectors

            Pb     Si     Al
1st Axis   0.67   0.96   0.26
2nd Axis   0.74  -0.65  -0.18

Figure 8.5: Eigenvectors (C1 and C2) and Ambient Data Points Plotted in Elemental Space

8.4 Applications
Typical environmental data matrices contain chemical concentration and meteorological
parameter measurements made at different sampling stations or sampling times. The objectives are to
identify and ultimately develop a detailed model for the physical and chemical processes responsible
for pollutant generation, dispersion, and removal. Several representative applications are discussed
below.
Factor analysis methodology, very similar to that outlined in the previous section, has been
applied to urban aerosol data from St. Louis, Boston, Portland, Oregon and Kellogg, Idaho. These
studies have resulted in the development of extensive qualitative and semi-quantitative information
concerning the sources of aerosols in the respective airsheds. The source contribution information
developed in these studies should be questioned since validation studies done with simulated data
sets have indicated that the present factor analysis methodology can only provide reasonable
estimates of the 2 or 3 sources with the largest impact on an airshed.
A factor analysis study was performed on data consisting of the chemical and physical
properties of Boston roadway dust. Results of the analysis indicated that the sources of the roadway
dust were soil, cement, auto exhaust, rust, tire wear, and salt.
An innovative factor analysis approach proposed by Henry and Hidy shows great promise as a
tool for gaining insight into the causes of elevated sulfate concentrations in both urban and rural
areas. Briefly, the approach is one of exploratory data analysis involving the Principal Component
Analysis of meteorological and air pollutant data followed by a regression of the derived components
against observed sulfate concentrations. The method has been applied to data from New York City,
Los Angeles, Salt Lake City, St. Louis, Lewisburg, West Virginia and Brookville, Indiana. Some of
the chemical and physical factors that were found to be associated with sulfate levels in these cities
included photochemical activity, air mass character, atmospheric dispersion, and SO2 emissions.
A recently proposed approach to receptor modeling employs the factor analytical technique
known as Singular Value Decomposition (SVD). Application of SVD enables the calculation of the
limits of the ability of the CMB to accurately estimate the contribution of specific sources. In
addition, it is possible to determine if two sources with very similar chemical fingerprints can be
accurately resolved by a CMB.
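A minimal sketch of such a collinearity check is shown below: the ratio of the largest to smallest singular value of the source matrix (its condition number) flags profile sets that a CMB cannot reliably separate. The matrix, including the near-duplicate third profile, is hypothetical.

```python
# SVD-based collinearity screen for a source composition matrix.
import numpy as np

F = np.array([[0.15, 0.01, 0.14],    # third profile is nearly parallel
              [0.01, 0.15, 0.02],    # to the first one
              [0.01, 0.05, 0.01]])
sv = np.linalg.svd(F, compute_uv=False)
print(sv.max() / sv.min())   # large condition number -> unstable solution
```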
The above examples illustrate a few of the reported applications that have established factor
analysis methodology as a valuable tool for understanding atmospheric phenomena.

8.5 Summary
Table 8.3 compares some of the characteristics of the CMB and Target Transformation Factor
Analysis approaches to receptor modeling. It is important to understand that the two approaches are
complementary and not mutually exclusive. The effort involved in applying both methods to a data
set is not that much greater than applying either one alone. Presently, factor analysis can be
considered an established method for developing useful qualitative source-related information.
However, for most situations, the CMB approach is better suited for providing quantitative source
apportionments.

Table 8.3: Comparison of CMB and Target Transformation Factor
Analysis Approaches to Receptor Modeling

Chemical Mass Balance:
1. Calculated source contributions agree well with true contributions for simulated data sets.
2. Able to resolve and measure the contribution of all major and most minor sources that are
suspected contributors and have chemical profiles available.
3. The impact of an unsuspected source can bias the contributions calculated for other sources.
4. CMBs on individual samples can measure sources which significantly contribute on only a
few occasions.
5. CMB on average data can be used to calculate contributions of major sources.
6. Errors in the data are propagated throughout the procedure.

TTFA with Regression Scaling:
1. Calculated source contributions inconsistent with true contributions for simulated data sets.
2. Only able to resolve major sources.
3. Produces chemical profiles for major sources and is therefore very useful for identifying
unsuspected sources and verifying the impact of suspected sources.
4. Not very useful for identifying sources which only contribute occasionally.
5. A large data set is necessary for factor analysis.
6. Errors are not propagated.

9.0 AVERAGES, UNCERTAINTIES,
AND SOURCE RESOLUTION

9.1 Mathematical Problems Associated with the Calculation of Mean Values
In a CMB study, it is generally desirable to calculate mean values associated with the
contribution of a given source or the sum of a group of sources. For example, if the contribution of
road dust was determined on twenty-four hour samples taken on every sixth day, the mean annual
road dust contribution based on the sixty samples collected during the year would be valuable in the
development of a SIP or simply for illustrative purposes. Similarly, if the contributions of several
industrial sources such as a basic oxygen furnace, a coking plant and a blast furnace are calculated on
daily samples, an annual average contribution of a composite source category which might be called
iron and steel manufacturing would be useful. A potential problem, however, is inherent in the
calculation of mean values, particularly if percent contributions or geometric mean concentrations are
the parameters being averaged. Geometric means are often calculated in CMB studies so that the
results are compatible with TSP standards and with data generated in attainment/non-attainment
programs.
Table 9.1: Hypothetical Filter Data (µg/m³)
Source Filter #1 Filter #2 Filter #3
X 0.9 0.1 75
Y 0.01 90 50
Unexplained Mass 0.09 9.9 125
Total Mass 1 100 250

The problem associated with calculating percent contributions or geometric means can be
demonstrated with the hypothetical data tabulated in Table 9.1. The mean percent contributions and
geometric means of two sources (X and Y) can be calculated from the three sets of data by two
methods. The percent of the average mass which is explained by a source, and/or the average percent
contribution made by a source may be calculated, but they are not equivalent. Mathematically, the
average percent contribution made by a given source can be expressed by the equation:
$\frac{1}{n} \sum_{i=1}^{n} \left( \frac{X_i}{M_i} \right) \times 100\% \qquad (9.1)$
where n is the number of filters,
X_i is the mass contribution of source X on filter i, and
M_i is the total mass on filter i.
The percent of the average mass explained by a given source is given by the equation:
$\frac{\frac{1}{n} \sum_{i=1}^{n} X_i}{\frac{1}{n} \sum_{i=1}^{n} M_i} \times 100\% \qquad (9.2)$

Equations 9.1 and 9.2 are not equivalent. Using the data contained in Table 9.1, Equation 9.1 yields a
value of 40.0% for the average percent contribution made by source X. Equation 9.2 yields a percent
of average mass contributed by source X of 21.6%.
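The two numbers are easy to verify from Table 9.1, as the following sketch shows.

```python
# Check of Equations 9.1 and 9.2 using the Table 9.1 data for source X.
import numpy as np

x = np.array([0.9, 0.1, 75.0])      # source X mass on each filter (ug/m3)
m = np.array([1.0, 100.0, 250.0])   # total mass on each filter (ug/m3)

print(100 * np.mean(x / m))           # Equation 9.1 -> 40.0%
print(100 * np.mean(x) / np.mean(m))  # Equation 9.2 -> about 21.6%
```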
In an analogous manner to percent contributions, geometric means could conceivably be
calculated by two methods:

$\frac{1}{n} \sum_{i=1}^{n} \log X_i \qquad \text{or} \qquad (9.3)$

$\log \left( \frac{1}{n} \sum_{i=1}^{n} X_i \right) \qquad (9.4)$
where log denotes the logarithm to the base 10. Equation 9.3 is the technique used in regulatory work
with hi-volume TSP data and hence would be the most reasonable technique to use with CMB source
data. The dilemma encountered with geometric mean data arises when the sum of the contributions of
more than one source is of interest. If, for example, the sum of the geometric mean contributions of
sources X and Y listed in Table 9.1 is calculated by Equation 9.5, the value obtained is 1.24 (10^1.24 =
17.4 µg/m³). If it is calculated by Equation 9.6, the value obtained is 1.34 (10^1.34 = 21.9 µg/m³).

$\frac{1}{n} \sum_{i=1}^{n} \log X_i + \frac{1}{n} \sum_{i=1}^{n} \log Y_i \qquad (9.5)$

$\frac{1}{n} \sum_{i=1}^{n} \log (X_i + Y_i) \qquad (9.6)$
Intuitively, it is difficult to judge which of the two techniques is the better choice. In addition
to the problem encountered when the sum of sources is determined, percentage values associated
with geometric means suffer the same problem as has been discussed for the calculation of
percentage values based on arithmetic means.

9.2 Uncertainties
Three different uncertainties may be calculated for the mean source contribution determined
by averaging CMB analysis results for individual samples:
- Average source contribution uncertainty
- Standard deviation of the distribution
- Standard deviation of the mean (standard error)
The average source contribution uncertainty is given by:
$\sigma_{\bar{S}_j} = \frac{1}{n} \left( \sum_{k=1}^{n} \sigma_{S_j(k)}^2 \right)^{1/2} \qquad (9.7)$
If a source was not included in one of the CMB calculations for a particular filter, the
uncertainty in its zero contribution is assumed also to be zero. This assumption is probably not completely
valid, but a means to estimate the "detection limit" for sources excluded from the CMB has not yet
been developed.
The standard deviation of the distribution is given by the following equation:
    σ(S_j) = [ Σ_{k=1}^{n} (S_jk − S̄_j)² / (n − 1) ]^{1/2}        (9.8)
where S̄_j is the arithmetic mean of the S_jk,
while the standard deviation of the mean (standard error) is given by
    σ(S̄_j) = [ Σ_{k=1}^{n} (S_jk − S̄_j)² / (n(n − 1)) ]^{1/2}        (9.9)
This last uncertainty (Equation 9.9) has been shown to provide the most realistic estimate of
the uncertainty in a source's contribution. Analyses of synthetic data sets, where the true source
contributions were known, have shown that Equation 9.7 yields uncertainties which rarely overlap
the true value, while Equation 9.9 yields uncertainties which overlap the true value about two thirds
of the time. Thus, the standard deviation of the mean (Equation 9.9) should be used when calculating
the source contribution uncertainty. It should be noted, however, that this uncertainty does not
include systematic uncertainties due to assumptions, etc.
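A small Python helper (illustrative; the function and variable names are hypothetical) computes all
three estimates from the per-filter results:

import numpy as np

def cmb_mean_uncertainties(S_jk, sigma_jk):
    """Three uncertainty estimates for the mean contribution of one source,
    following Equations 9.7-9.9. S_jk holds the per-filter contributions;
    sigma_jk holds the per-filter uncertainties assigned by the CMB fit."""
    S_jk = np.asarray(S_jk, dtype=float)
    sigma_jk = np.asarray(sigma_jk, dtype=float)
    n = S_jk.size
    mean = S_jk.mean()
    # Eq 9.7: propagated average source contribution uncertainty
    sig_avg = np.sqrt(np.sum(sigma_jk**2)) / n
    # Eq 9.8: standard deviation of the distribution
    sig_dist = np.sqrt(np.sum((S_jk - mean)**2) / (n - 1))
    # Eq 9.9: standard deviation of the mean (standard error) -- preferred
    sig_mean = sig_dist / np.sqrt(n)
    return mean, sig_avg, sig_dist, sig_mean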
9.3 Source Resolution in Relation to Sample Time Distribution Assumed and
Chemical Data Combination Method
A chemical mass balance calculation provides information about the amount of mass
contributed by each source considered. In studies such as the PACS, CMB calculations were made on
a large number of filters each of which was collected for a relatively short period of time. In that
study, as in many others, the primary interest is to learn the mean contribution made by a source over
a long period of time. This raises questions such as: 1) What distribution should be assumed for the
source mass data? 2) How should the data be combined to find the distribution statistics? In addition
to these scientific questions there are others suggested by economics for there is considerable
expense involved in obtaining and analyzing many, short term samples. The following questions
could be asked: 1) Can the data from several, short term samples be combined and a CMB
calculation performed on the combined data? 2) How should chemical data from several short term
samples be combined? and 3) Is it feasible to collect samples for 24-hours or even much longer
periods of time? Analyses in this section will attempt to clarify these questions and their answers.
The problems may be brought into focus by examination of Figure 9.1. The top center box
represents a number of short-term samples collected either sequentially, that is one right after the
other, or on a less-frequent, random basis. The box at the bottom represents a final statement
concerning the contribution of a source over a long period of time. The statement could be the mean
and standard deviation, for example. The right and left routes represent two general ways to arrive at
the final statement. If CMB calculations are made on each sample collected, the route on the left is
followed and the left box represents data concerning the contribution of each source for short periods
of time. Then a distribution must be assumed for the source mass data, S_jk, and the data combined
to give the final conclusion. If it is assumed that a single CMB calculation can be made on combined
data, then the route on the right is followed. In this case, a distribution must be assumed for the
chemical (or physical) data, C_ik, and the data combined to give mean values with errors assigned
that are
appropriate for the CMB calculation. Then, the single CMB calculation is made, leading to the final
conclusion.
Taking a single, long term sample and performing a CMB on that sample is equivalent to
following the right hand route with the assumption that the data are normally distributed and that
the arithmetic average of the chemical (or physical) data, C_ik, is calculated for several, sequential,
short term samples. It is important to realize that a long term sample assumes a normal distribution
for the data.
Another important point implied by Figure 9.1 is that the distribution assumed for the
source-mass data in the lower left path is the distribution that must also be assumed for combining
chemical data in the upper right path if the conclusions from the two paths are to be compared. This
conclusion can be supported by rigorous proof in the case of a normal distribution.

Figure 9.1: Diagram to Illustrate Major Steps in the Process of Estimating the Long Term
Contribution of Sources to Air Particulate Mass when the CMB Method is Used
[Diagram: short term, random or sequential SAMPLES feed two routes leading to the long term
contribution of SOURCES. Left route: a CMB calculation for each sample gives the short term
contribution of SOURCES; a distribution is then assumed and each source combined. Right route: a
distribution is assumed and each species combined to give the long term contribution of CHEMICAL
SPECIES; one CMB calculation then follows.]

Let it be assumed
that the air particulate concentrations for a source, S_jk, belong to a normally-distributed,
independent random variable. Then the chemical (or physical) characteristic represented by F_ij S_jk
is also a normally-distributed, independent random variable, because multiplying a normal random
variable by a constant yields another normal random variable. Furthermore, the chemical species
concentration, C_ik, is also normally distributed because it is the sum of normal random variables.
The combination of S_jk is that indicated in the lower left of Figure 9.1, while the combination of
C_ik is that indicated in the upper right of Figure 9.1, and C_ik must be assumed normal if S_jk is
assumed normal.
If S_jk is assumed to be lognormal, one cannot prove in a straightforward manner that C_ik
must be lognormal, but the previous proof confirms that C_ik cannot be assumed normal in this case.
At the end of the discussion, evidence will be presented supporting the idea that C_ik must be
assumed lognormal if that is the distribution assumed for S_jk.
Calculations will be made following both paths indicated in Figure 9.1 and the resulting
conclusions will be compared in order to shed light on the questions raised at the beginning of this
section.
9.3.1 24 Hour CMB Calculations
At site 3, the Portland city center site, samples were taken every four hours for 24 hours on the
days sampled. Using these data, two types of comparisons were made: 1) six filters were combined to
form 24-hour data, and 2) six days or 36 filters were combined to form a six-day set of data.
The 24-hour data will be examined first. Data were collected at site 3 on eight days in March
and April, 1978. As part of the PACS study, chemical data, C_ik, from the six filters collected for a
24-hour period were averaged to give an arithmetic mean, C̄_Ni, for each chemical species. The
uncertainty assigned to each mean value was the standard deviation of the mean (or standard error).
The resulting set of mean values for chemical species was then treated with the CMB calculation
and the source contributions for the day, S_Nj, were predicted. As noted above, this is also equivalent
to collecting a 24-hour sample for CMB calculation. In the present study, the source-mass data, S_jk,
from the six filters collected in one day were combined assuming the data for each source were
normally distributed. The arithmetic mean, S̄_Nj, and standard deviation about the mean were
calculated. This procedure was followed for each of the eight days referred to above. The values of
S_Nj and S̄_Nj were compared to see if both methods predicted contributions from the same sources
and the same contribution for a given source.
While coarse fraction urban dust sources and fine fraction residual oil sources make mass
contributions that differ by two orders of magnitude, the two methods of source-mass prediction are
in good agreement for both cases. Both correlations have essentially zero intercept, a slope of one,
and a correlation coefficient of 0.99. The method of averaging results from six CMB calculations to
give S̄_Nj resulted in two zero values, a very small value, and five other values. The 24-hour CMB
values, S_Nj, are consistent with S̄_Nj in all but two instances. This pattern, seen in several sets of
data, suggests that some weak sources resolved on a four-hour sample will not be resolved on a
24-hour sample. Another significant result is the observation that while the 24-hour prediction may
fail to resolve a source part of the time, if the source is predicted it is predicted correctly. This
pattern was observed in most of the data sets, for when the inconsistent pairs with S_Nj of zero are
dropped from the correlation, the line parameters are those expected for a slope of one.
These two cases can be considered the only cases of random prediction by the 24-hour CMB.
There are 26 cases when the 24-hour CMB could not resolve sources for which there were S̄_Nj
values given by six-filter averages.
9.3.2 6-Day CMB Calculations
Six days of data or 36 filters were combined in another aspect of this study to see if a single,
multiday CMB calculation would be feasible and to see if the trends observed in the 24-hour CMB
calculations would be continued. Three assumptions were made about the type of distribution that
best represented the mass data for each source. Calculations, each following the pattern suggested by
Figure 9.1, were made using each of the three assumptions.
First assumption: the normal distribution. In this first case it was assumed that the set of mass
data for each source, S_jk, was normally distributed. Chemical data, C_ik, from the 36 filters were
combined by calculating the arithmetic mean, C̄_Ni, and the standard deviation of the mean (or
standard error) for each of the chemical species. These mean values were used for a single, 6-day
CMB calculation giving the mass contribution for each source, S_Nj. The source-mass data from the
36 single-filter CMB calculations, S_jk, were combined by calculating the arithmetic mean, S̄_Nj,
and standard deviation of the mean to estimate the mass contribution of each source. The ratio of
S_Nj divided by S̄_Nj for each source is plotted for the fine fraction and for the coarse fraction. The
error given for the ratio represents propagation of errors from S_Nj and S̄_Nj.
A further note on error calculation needs to be made. The error of C̄_Ni was also calculated by
propagating the errors given for each C_ik. The propagated error was smaller than the standard
deviation of the mean and, therefore, the latter was used. Similarly, the error of S̄_Nj was calculated
by propagating the errors of S_jk which were assigned by the CMB calculation on each filter. Here
also the standard deviation of S̄_Nj was the larger of the two error estimates and was the error used.
Similar conclusions concerning error calculations were reached in the following two cases as well.
Second assumption: the bimodal distribution. In this second case, it was assumed that the
distributions were bimodal, with all zeros belonging to one set and all positive values belonging to a
lognormal distribution. Chemical data, C_ik, were combined by calculating the geometric mean,
C̄_Bi, and the geometric standard deviation of the mean (or standard error). The standard deviation
calculated in this way is not symmetrical about the geometric mean. The CMB program, however,
requires the error on the chemical data to be symmetric about the mean. This was accommodated by
calculating the error as follows, using the values representing one standard deviation below the mean
and one standard deviation above the mean:

(high value − low value)/2 = error

Zero values for chemical data were dropped before calculation of the distribution parameters.
The mean value was scaled down in proportion to the number of zeros, as discussed below for the
CMB mean. There were very few zeros in the chemical data. Chemical data combined in this way
were then used for the single, 6-day CMB calculation of source mass contributions, S_Bj.
The source-mass data from the 36 single-filter CMB calculations, S_jk, were combined by
dropping all zero values and calculating the geometric mean and geometric standard deviation of the
mean. The geometric mean was then scaled down by multiplication with the following ratio:

number of positive points / total number of data points

The total number of data points includes the total number of zero points plus positive points.
The resulting, scaled mean value was designated as S̄_Bj. The procedure is equivalent to multiplying
all positive points in the distribution by the same factor. The geometric standard deviation was
calculated to be consistent with this procedure. The method should be viewed as follows: the
geometric mean is calculated from the positive data values; then this mean and zero, the mean of the
set of zeros, are combined to yield a weighted average. The ratios of S_Bj to S̄_Bj are plotted. In
order to calculate an error of the ratio, the geometric errors on S̄_Bj were treated as discussed for the
chemical data.
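A minimal Python sketch of this zero-weighted geometric mean (the helper name and example data
are hypothetical):

import numpy as np

def bimodal_geometric_mean(values):
    """Geometric mean of a zero-inflated data set, following the bimodal
    procedure above: compute the geometric mean of the positive points,
    then scale it by the fraction of points that are positive."""
    values = np.asarray(values, dtype=float)
    positive = values[values > 0]
    if positive.size == 0:
        return 0.0
    gm_positive = 10 ** np.mean(np.log10(positive))
    return gm_positive * positive.size / values.size

# Example: five single-filter CMB results for one source (ug/m3)
print(bimodal_geometric_mean([0.0, 2.1, 3.8, 0.0, 5.5]))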
Third assumption: the lognormal distribution. In the third case the distributions were assumed
to be lognormal, with all zero values belonging to the distribution. Zero values in the chemical data,
C_ik, were set to 0.001 and the geometric mean, C̄_Li, and standard deviation of the mean were
calculated. Error values assigned to the chemical mean were calculated as discussed for the second
case. Using these data, a 6-day CMB calculation was made giving the source-mass contributions,
S_Lj. Source-mass data from the 36 single-filter CMB calculations, S_jk, were assumed to be
lognormal and the zero values were set to 0.01 before the geometric mean, S̄_Lj, and standard
deviation of the mean were calculated. The ratios of S_Lj to S̄_Lj are plotted.
Some significant observations can be made about the data plotted. First, note that for the
strong or well characterized sources the ratios are very near one and the errors reasonable. Note also
the ratios plotted at zero value; these represent the cases where the 6-day CMB was not able to
resolve a source which was observed in the CMB calculations on the individual filters.
The next important trend to note is that the assumption of a lognormal distribution with the
zeros set to a small value gives ratios that often deviate from one and which have large uncertainties.
This observation casts considerable doubt on the procedure of setting zeros to small, positive values
and strengthens the assumption that the zeros do not belong to the same distribution as the positive
values in the data. When the assumptions of normal data or bimodal data are considered, most ratios
appear near one or zero. The two vegetative burn sources have large ratios and large errors, which
can be justified from the fact that the source-matrix data have large uncertainties.
The conclusions suggested by these data for the 6-day CMB calculations appear to be the
same as for the 24-hour CMB calculations. The strong sources are predicted with confidence by the
long term CMB, but weak sources are often not resolved. When the long term CMB does resolve a
source, it appears to make a good prediction for it.
The mass contributions for each source, as determined using the three assumed distributions,
have been compared. Ratios are given for the mass determined by the bimodal distribution, S̄_Bj,
divided by that determined by the normal distribution, S̄_Nj, and a similar ratio is given to compare
the lognormal, S̄_Lj, and normal assumptions. The ratios of the bimodal to normal predictions are,
with few exceptions, less than one. Further, even when the uncertainty in the ratios is taken into
account, the majority of the ratios must be considered to be less than one. These results compare
well with those reported by Watson (1979), who earlier compared geometric means with arithmetic
means for the PACS study. Many ratios fall below the 0.90 to 0.85 range found common when
comparing total mass (Watson, 1981). Watson (1981) also showed that the ratio of geometric mean
to arithmetic mean decreases as the width of the lognormal distribution increases, which suggests that
the distributions for source-mass data in this study are relatively broad. These observations suggest
that it is important to seriously consider the distribution assumed for data combination, because the
models do not predict the same mean values.
The comparison of values predicted by the lognormal and normal assumptions gives many
ratios much less than one. This appears to arise from setting the zero values to small numbers, which
distorts the distributions toward low values and significantly reduces the mean values. These
observations strengthen the case for the bimodal distribution.
A comparison of the total-mass mean values as determined from experimental data and from
CMB calculations of source contributions was made. The measured mass was calculated from the
measured mass values on each filter by using the methods described above for combination of C_ik
data. The predicted mass was calculated by summing the contributions of all sources, that is, by
summing each data set S̄_Nj, S̄_Bj, and S̄_Lj over all j. The lognormal assumption has been shown
above to have shortcomings and will not be discussed further. The mean values as calculated by the
bimodal distribution are lower than those calculated by the normal distribution, as is seen in the four
examples shown. The ratios of means from the CMB predictions to means from the measured masses
are less than one for the fine particle size and essentially one for the coarse particle size. Both the
normal and bimodal methods seem to predict about half of the fine particle mass and nearly all of the
coarse particle mass.
The data support the contention that the distribution assumed for source-mass contributions,
S_jk (lower left of Figure 9.1), must also be assumed for the chemical species, C_ik (upper right of
Figure 9.1). This was proven earlier for normal distributions. There are no zeros in the source-mass
data for auto exhaust, and therefore the bimodal and lognormal distributions are the same. These
sources are among those with the most consistent and reliable results. The main point to note is the
significant differences in the 36-filter mean values within each source as calculated by the assumed
distributions. In each case the 6-day CMB value compares favorably with the mean calculated using
the same assumption.
The following conclusions relate to sets of data containing the mass contributions, S_jk, for
one source as determined by the CMB method.
1. The distribution of source-mass data seems to best be described as bimodal with all zeros
belonging to one set and all positive values belonging to a lognormal distribution. Some
distributions have no or few zeros and, therefore, have only one mode.
2. The procedure of setting zero values to some small, positive number when assuming a
distribution is lognormal has been shown to give erroneous results for source-mass
distributions.
3. When source-mass contributions from many short time filter samples are considered, the
mean value, uncertainty of the mean, and the distribution about the mean can all be
calculated. When a long time filter sample is used for calculating the source-mass
contribution, only the mean value and uncertainty of the mean can be calculated.
4. Chemical data were combined from 6 filters which covered a 24-hour period and from 36
filters which covered a 6-day period. CMB calculations were performed on these data sets.
The following results were observed for both of these types of calculations.
a. Strong or well characterized sources were resolved and accurate predictions of mass
contributions were made.
b. Weak or poorly characterized sources were often not resolved. This was evident in the 24-
hour calculations and did not appear to become more serious in the 6-day calculations.
c. When a source was resolved, its mass contribution was predicted correctly. This means
that spurious results in the mass predictions appeared to be very rare events.
d. The mean, source-mass contribution predicted by the assumption of a bimodal distribution
was consistently smaller than the predictions made with the assumption of a normal
distribution.
e. Both the normal and the bimodal assumptions predicted about 50% of the fine particle
mass and about 90% of the coarse mass.
f. Procedures were suggested for combining data when the data are assumed to have a
bimodal distribution.
It is worth noting two important implications of these conclusions. The implications relate to
the objectives of a study involving CMB calculations. If the main objectives of a study require an
accurate description of source mass frequency distributions, the bimodal model discussed in
conclusion (1) should be used.
If, however, the main objectives of a study involve reliable determination of major source
contributions to air-particulate mass, it may be acceptable to assume a normal distribution for source
mass. In this case, air-particulate samples may be collected over a long period of time and a small
number of CMB calculations made on the few, long term samples. The work summarized above
indicates that CMB calculations on long term samples will resolve all but the weakest or most
uncertain sources. This suggested procedure could considerably reduce the costs of applying the
CMB method.
10.0 SOURCE MATRIX DEVELOPMENT
10.1 Introduction
All receptor modeling methods require some knowledge of source emission characteristics for
both qualitative identification and quantitative source apportionment. One of the largest current
sources of uncertainties in receptor modeling is the limited knowledge of source characteristics and
their variability with time, size, fuel, location, etc.
Source sampling objectives for receptor modeling applications are substantially different from
classical source sampling for regulatory and dispersion model requirements. Whereas absolute
emission rates are required for the latter applications, only representative samples of the emissions'
relative chemistry are required for receptor modeling. In addition, samples collected for receptor
modeling must be representative of the material emitted as it reaches the receptor. This latter
requirement has a strong influence on the design of source sampling strategies for receptor modeling.
An ideal chemical fingerprint has the following characteristics:
- High concentration of key indicating species
- Low concentration of these same species in other common sources
- Low natural background levels of these same species
- High analytical sensitivity
- Low cost per analysis
Source characterization objectives include
- Minimize deviations from conservation of relative composition
- Maximize source resolution
There are four basic approaches that can be used to provide input data for the source matrix.
- Literature
- Multivariate Analysis
- Wind Trajectory Analysis
- Direct Source Sampling
10.2 Use of Literature and Source Library Data
Literature values for source profiles may be used. They can provide a means for a low cost
first pass at receptor modeling. Although substantial variability has been noted in the chemical
composition of some sources, this first pass using literature values can often establish the relative
magnitude of the source impact. If the source contribution is identified as representing less than
about 1% of the mass, the factor of two uncertainties in the source profile are not going to be critical.
If, on the other hand, this first pass using the literature source profile yields a source contribution
greater than 20%, then factor of two uncertainties are going to be very important and there may be
sufficient justification to make direct source measurements to reduce the source attribution
uncertainty.
Care should be taken, however, when adopting literature source profiles to a new airshed.
Values should not be taken directly from tables without reading the descriptive text. The literature
values for lead in leaded gasoline combustion emissions, for example, range from about 15% to over
70%. Values used in previous receptor modeling studies range from about 16 to 40%. Much of this
variability is due to the year in which the study was conducted, the automotive fleet characteristics,
the definition of the source (leaded auto exhaust or a composite transportation source), average lead
in gasoline, etc.
Representative source class literature review considerations are listed in Table 10.1.
10.3 Multivariate Analysis Methods
This procedure is valuable if the only available source profiles are from source tests conducted
in other geographic areas, or if it is reasonably certain that the source profile has changed or is highly
variable. TTFA can be used to adjust an assumed source profile to maximize that profile's
consistency with observed intercorrelations. Thus, TTFA can improve the source profile before
ordinary CMB analysis is attempted. If sufficient data are available (30 or more degrees of freedom
per variable), TTFA can provide a useful check on assumed source compositions. If no source
composition data are available, TTFA can provide quantitative estimates of source profiles. For
instance, if a previously unknown source is deduced, its approximate composition can be determined.
Certain dangers are associated with the use of TTFA to obtain source signatures, especially
the "smearing" of signatures between sources whose emissions are highly correlated in time or space.
Signatures obtained by TTFA for marine aerosol and incinerator emissions, for example, contained
admixtures of the signature of oil-fired power plants (indicated by high V and Ni concentrations).
This smearing of signatures, which resulted from correlations among those source emissions, reflects
the close proximity of an oil-fired plant to an incinerator in Washington and the fact that large oil-
fired plants are located to the south and east of Washington, so that air parcels coming from those
directions, bearing marine aerosol, also contain particles from oil combustion. A comparison of
signatures of sources in St. Louis obtained by TTFA and WTA indicates a similar smearing of
signatures by TTFA among sources that are located in the same general direction with respect to the
receptor. In cases of large ambient data sets that include extensive meteorological data (especially
wind direction), the smearing problem may be avoided by selectively applying TTFA to only the
samples collected under specified wind conditions.
Table 10.1: Chemical Mass Balance Input Source Composition Considerations

Motor Vehicles:
- Leaded/Unleaded/Diesel Vehicle Mix
- Br/Pb Ratio (Fresh or Aged Aerosol)
- Smoke Suppressant Use in Diesel Fuels
- Transportation Source Mix Near the Receptor

Soils:
- Study Area Soil Chemical Enrichment by Local Emission Deposition
- Local Soil Composition Consistency with Literature Values
- Importance of Limestone to Local Soils

Residual Oil Combustion:
- Similarity in the Fuel Elemental Composition
- Use of Fuel Additives
- Combustion System Similarity

Coal Combustion:
- Similarity in Chemical Composition of Input Fuel
- Similarity in Combustion Process and Control System Type and Efficiency

Vegetative Burning:
- Similarity in Type of Fuel Burned
- Point(s) within the Combustion Cycle where Sampling Occurred

Metals Industry:
- Specific Process Sampled, Control Equipment Used and Chemical Composition of the Alloy
- Point(s) within the Process Cycle Sampled
10.4 Wind Trajectory Analysis
A wind trajectory analysis (WTA) can be used to obtain compositions of particles from
specific sources by using large ambient data sets that contain extensive wind direction data.
In this procedure, the following steps are performed for each species in each size fraction:
a. Determine average concentration and standard deviation at each station.
b. Search the data from each station for sampling periods that meet two conditions: a
concentration at least 3σ above the average for the station, and fluctuations of surface wind
direction, σ_θ, of less than 20°.
c. For sampling periods that meet these criteria, construct a histogram of mean wind direction.
If there are dominant sources of the species, the histogram will display clusters about angles
pointing toward the sources.
d. Using the mean angles from the clusters, construct a map of back trajectories observed at all
stations in the network. A convergence of trajectories from two or more stations often
indicates the presence of a dominant source of the species near the convergence.
e. For samples identified as heavily influenced by a particular source, construct linear
regressions of concentrations of all measured species versus those of one or more species
strongly associated with particles from the source. The correlation coefficients indicate the
likelihood that the various species are present on particles from the source, and the regression
coefficients yield relative concentrations. If mass contributions from the source are large
enough to give good correlations with the species in question, absolute concentrations of
those species in the particulate matter can be obtained.
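A minimal Python sketch of steps (a) through (c) (the data frame column names, thresholds, and
helper name are hypothetical, not part of any standard package):

import pandas as pd

def screen_wta_periods(df, species):
    """Flag sampling periods dominated by a nearby source of `species`,
    following steps (a)-(b): concentration well above the station average
    and steady surface wind direction.
    Assumed columns: `species` (ug/m3), 'wd_mean' (deg), 'wd_sigma' (deg)."""
    mean = df[species].mean()
    sigma = df[species].std()
    hits = df[(df[species] > mean + 3 * sigma) & (df["wd_sigma"] < 20.0)]
    # Step (c): a crude histogram of mean wind direction (10-degree bins)
    # for the flagged periods; clusters point toward dominant sources.
    return hits["wd_mean"].round(-1).value_counts().sort_index()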
This method has been used to identify samples influenced by the following point sources in St.
Louis: an iron and steel complex, an iron works, a zinc smelter, two lead smelters, two copper plants,
an incinerator, a pigment plant, and a fertilizer plant. Surprisingly, they also found trajectories --
apparently to areas of high traffic density -- that were associated with motor vehicles. For most
sources, concentrations of 6 to 10 elements relative to that of a prominent element could be
determined.
WTA has several advantages relative to traditional methods:
1. It avoids the cost of source measurements.
2. The source signatures obtained are characteristic of particles as received at the receptors. For
example, volatile elements will have condensed (if they are going to), and very large particles
will have fallen out.
3. Problems that could arise by use of different sampling equipment or analytical methods at the
source versus the receptors, or by use of measurements widely separated in time, are avoided.
4. Fugitive emissions, which are difficult to measure by conventional methods, can often be
observed.
5. Cooperation of personnel at the source is not required.
10.5 Direct Source Sampling
Figure 10.1 shows a block diagram illustrating a direct source sampling scheme for a variety
of source types.
Grab samples of passive fugitive dust sources may be collected in the field and returned to the
laboratory for resuspension using a resuspension chamber similar to the one shown in Figure 10.2.
This type of source sample collection provides a filter sample that has collected the source material
in an identical manner to that used to collect ambient aerosol samples. This minimizes modifications
in relative composition due to differential deposition during transport to the receptor.
Collection of a source sample from a stack that is representative of the material arriving at the
receptor is particularly difficult because of the effects of condensation, volatilization and deposition
during transport. Material collected in control devices is not necessarily representative of the material
emitted.
Frequently, available literature data for point sources have been obtained from baghouse dust,
in-stack samplers or EPA Method 5 type equipment. Samples collected in such a manner are not
representative of the size distribution or of the chemical composition of stack particles after they mix
and cool in the atmosphere. Condensation, vaporization, agglomeration and secondary chemical
reactions can significantly alter the size distribution and chemical composition of stack aerosols
when ambient dilution and cooling occurs. It should be emphasized that conventional gravimetric
emission rate determinations (grains/DSCF) or relative emission inventory tallies are of little value
for source apportionment with receptor modeling.
The desire to obtain samples of particulate material in the form which it will have after it has
been emitted into the atmosphere has stimulated the development of dilution/cooling systems (e.g.,
see Figure 10.3). The fundamental principles inherent in the design of a dilution/cooling system are:
(1) the isokinetic withdrawal of an aerosol stream from an industrial stack source, (2) the mixing and
Figure 10.1: Block Diagram of a Direct Source Sampling Scheme for a Variety of Source Types
[Diagram summary: Particle Sources divide into Point Sources and Fugitive Sources. Constant and
intermittent point sources are sampled by dilution sampling; process fugitive sources by stationary
samplers (e.g., dichotomous and lo-vol TSP samplers: short duration and custom siting, or repetitive
sample collection); passive fugitive sources by grab (bulk) samples or road dust vacuuming, followed
by sieving (38 μm), bulk scan and compositing, and resuspension into 2.5 μm, 2.5-15 μm and TSP
fractions. All samples proceed to the analyses: XRF, NAA, microscopy, IC, XRD, carbon.]
A. Resuspension chamber
B. Dichotomous sampler
C. Intake air filter
D. Lo-vol sampler filter assembly
E. Lo-vol sampler pump
F. Dichotomous sampler control unit
G. Resuspension air pump (40 liters/minute)
H. Inline air filter
I. Watch glass support assembly
J. Resuspension platform
K. Resuspension chamber gasket
Figure 10.2: Schematic of Resuspension Chamber System
Figure 10.3: An Experimental Dilution Sampling System. The system is equipped with a
cyclone train with a backup filter for sample collection. It can be completely disassembled due
to its construction with threaded pipe unions. The control electronics are contained in a
separate transport box.
cooling of the source aerosol with excess filtered ambient air and (3) collection of particulate samples
from the dilution chamber in a manner compatible with analytical procedures. As with any stack
sampling system, anisokinetic conditions must be avoided to ensure that a representative particle size
distribution is obtained. With a dilution sampling system, both the withdrawal of stack aerosol for
dilution and the withdrawal of an aliquot of the dilution chamber flow for particle sample collection
must be done isokinetically. The ratio of dilution air to stack aerosol must be adjustable due to the
range of stack parameters which can be encountered. Enough dilution air is required to reduce the
dilution temperature to near ambient and to prevent the condensation of water vapor as condensed
water is deleterious to the collection of particle samples on a filter medium. On the other hand, the
dilution ratio must stay low enough so that there is sufficient particulate concentration in the diluted
aerosol to collect adequate sample mass for analysis within a reasonable time frame. Workable com-
promises between dilution ratios and sampling duration have been achieved for many stacks. The
condensation of water vapor is not problematic with a dilution ratio of 20:1 for most stacks (Figure
10.4) although stacks with very high water vapor content such as lime kilns which can have in excess
of 50% water vapor require higher dilution ratios for that reason. The collection of particulate
samples from the dilution chamber for CMB source matrix formation is accomplished by passing a
portion of the diluted aerosol through size classifying equipment such as a dichotomous sampler or a
cyclone train followed by a backup filter. The collection of size classified samples onto chemically
clean Teflon filters permits the same analytical protocol which is used on ambient samples to be used
on the source samples. Identical analytical protocols and size fractions reduce the possibility of
determinate errors when statistically comparing the source and ambient data sets in CMB modeling
and hence the quality of the source apportionment is improved.
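As a rough illustration of the condensation constraint, the following Python sketch estimates the
diluted-sample state as the dilution ratio varies. It is a simplified linear mixing estimate, not a
design calculation: the 150°C stack temperature and the ambient humidity are assumed values, while
the 18.4% stack water vapor content matches the Figure 10.4 example.

import math

def diluted_state(dilution_ratio, t_stack=150.0, w_stack=0.184,
                  t_dil=27.0, w_dil=0.003):
    """Approximate temperature (C), water mole fraction, and relative
    humidity (%) of stack gas diluted with filtered ambient air. Simple
    linear mixing; equal heat capacities and ideal-gas behavior assumed.
    t_stack is an assumed value; w_stack matches the 18.4% example."""
    d = dilution_ratio
    t_mix = (t_stack + d * t_dil) / (1.0 + d)
    w_mix = (w_stack + d * w_dil) / (1.0 + d)
    # Magnus approximation for saturation vapor pressure (hPa) over water
    e_sat = 6.112 * math.exp(17.62 * t_mix / (243.12 + t_mix))
    rh = 100.0 * (w_mix * 1013.25) / e_sat
    return t_mix, w_mix, rh

for d in (5, 10, 20, 50):
    print(d, diluted_state(d))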
Figure 10.4: Temperature and Relative Humidity of Diluted Sample vs. Dilution Ratio for
Dilution Air at 27°C and Relative Humidity less than 10%. Initial water vapor content of the stack
gas was 18.4% by volume. From: Heinsohn, R.J., Davis, J.W. and Anderson, G.W., 1976, "The
Design and Performance of...", 69th Annual Meeting of the Air Pollution Control Association,
Portland, Oregon.
Dilution sampling to adequately reproduce atmospheric mixing requires a relatively large
mixing volume and sample residence time. Unfortunately, spatial restrictions imposed by the
approach structures to sampling platforms, as well as the small size of sampling platforms themselves
often preclude the utilization of large equipment. Similarly, weight is critical as equipment must
frequently be hand carried up vertical ladders and stairways as well as through areas which are
physically difficult to negotiate. Consequently, a dilution sampling system must be constructed in
such a manner as to facilitate its disassembly and reassembly and should be constructed of
lightweight materials (e.g., PVC or Teflon) where possible. In addition, the use of a turbulent mixing
design (versus a linear buoyant plume design) permits the system to be relatively compact while still
maintaining a long dilution chamber path length (3 meters in the system shown in Figure 10.3) and a
long particle residence time due to the ninety degree bends in the dilution chamber structure.
10.6 Composite Transportation Source Matrix
There are two basic approaches that can be used to quantify the contribution of transportation
sources to suspended aerosol levels (The transportation source category as used here does not include
road dust generated by traffic). Figure 10.5 shows a calculation flow diagram illustrating the
relationship of these two approaches. The composite transportation source contribution, M_t, can be
calculated on the basis of a leaded gasoline profile, as indicated on the right side. In this case, the
leaded automotive exhaust contribution, M_l, is calculated on the basis of a leaded automotive
exhaust profile, and the transportation source contribution is determined by emission inventory
scaling of the leaded automotive exhaust contribution. This emission inventory scaling, or relative
emissions weighting factor, R_l, is the ratio of leaded gasoline automotive emissions to total
transportation emissions.
The other basic approach to calculating the composite source contribution is illustrated on the
left side of Figure 10.5. In this case, each of the individual source profiles is used to generate a
composite transportation source profile which then is used in chemical mass balance (CMB)
calculations to determine the transportation source contribution.
There are a variety of ways in which this relative emission factor can be estimated. One
approach is illustrated in Figure 10.6. This approach is based on VMT in the area closest to the
sampling site (usually within 2 to 5 km), fuel use, vehicle age distributions, etc. Relative
transportation source contributions based on these variables are clearly just rough approximations.
They are very useful, however, since they apportion the transportation contribution on the basis of a
relatively precise determination of the contribution from vehicles burning leaded gasoline.
Both approaches should lead to the same transportation source contribution, M_t. The right
hand approach using the leaded automotive exhaust profile, however, may provide a somewhat more
accurate and precise transportation source contribution than the left hand approach, because the
emission weighting step is performed after the CMB fitting. It should be clear that the relative
emission weighting ratios are estimates at best, depending not only on the factors mentioned above
but also on such factors as temperature, altitude, average vehicle speed, fleet characteristics nearest
the receptor, etc. Thus, weighting the individual source profiles with these relative emission
weighting factors will greatly increase the uncertainties in the individual species and/or may decrease
the fitting power provided by lead during a CMB calculation.
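The two routes of Figure 10.5 can be sketched in a few lines of Python (all profile values and
weights below are hypothetical placeholders, and M_l stands in for a CMB-fitted contribution):

import numpy as np

# Hypothetical relative-emission weights R_j (sum to 1) and chemical
# profiles F_ij (fraction of each species in each source's emissions).
R = {"leaded": 0.55, "unleaded": 0.25, "diesel": 0.15, "wear": 0.05}
F = {"leaded":   np.array([0.20, 0.08, 0.01]),    # e.g., Pb, Br, Fe
     "unleaded": np.array([0.001, 0.001, 0.02]),
     "diesel":   np.array([0.0005, 0.0005, 0.01]),
     "wear":     np.array([0.005, 0.001, 0.30])}

# Left route: composite profile F_it = sum_j F_ij R_j, then one CMB fit.
F_composite = sum(R[j] * F[j] for j in R)

# Right route: fit the leaded exhaust profile alone to get M_l, then
# scale: M_t = M_l / R_l.
M_l = 4.2               # ug/m3, stand-in for a CMB-fitted contribution
M_t = M_l / R["leaded"]
print(F_composite, M_t)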
Figure 10.5: Calculation Flow Diagram for Composite Transportation Source
[Diagram summary. Left route: the leaded auto exhaust (F_il), unleaded auto exhaust (F_iu), wear
(F_iw) and diesel exhaust (F_id) chemical profiles are combined with relative emissions weightings
(R_l, R_u, R_w, R_d) to form the composite transportation source matrix, F_it = Σ_j F_ij R_j, which
is used in chemical mass balance calculations to give the composite transportation source
contribution, M_t. Right route: the leaded auto exhaust chemical profile, F_il, is used in chemical
mass balance calculations to give the leaded auto exhaust source contribution, M_l, which is scaled
by the relative emissions weighting, R_l, to give M_t = M_l / R_l.]
Figure 10.6: Calculation Flow Diagram for Transportation Weighting Ratio
[Diagram summary: Vehicle miles traveled, V_t, is split by percent diesel, P_d (from VMT or fuel
use), into diesel-powered VMT, V_d, and gasoline-powered VMT, V_{l+u}; the gasoline VMT is
split by percent unleaded, P_u (from age distribution and fuel use), into unleaded VMT, V_u, and
leaded VMT, V_l. Emission factors give diesel emissions E_d = V_d G_d, unleaded gasoline
emissions E_u = V_u G_u, leaded gasoline emissions E_l = V_l G_l, and wear emissions
E_w = V_w G_w. Total transportation emissions (TPY) are E_t = Σ_j E_j, and the transportation
weighting ratio is R_j = E_j / E_t.]
From a statistical point of view, it is preferable to first determine accurately the contribution
of leaded automotive exhaust (M_l) before applying relatively uncertain emission factors to calculate
the transportation contribution. On the other hand, it is easier to communicate the
results by using the approach illustrated on the left side of Figure 10.5. It needs to be emphasized,
however, that both approaches have been used in previous studies but never compared in the same
study. Thus, the difference in the final result produced by these two approaches has not been
established.
Appendix 8 describes this process in detail, lists the chemical profiles for major transportation
sources and provides a sample calculation of a composite source profile.
10.7 Source Characterization Protocol
Resources are an important consideration in developing a receptor modeling program for any
airshed. The following outline of three resource levels can be used to achieve various degrees of
sophistication in a receptor modeling application. Each level depends upon completion of the
previous level. Each level requires more resources than the preceding one, but provides receptor
model results of greater reliability. An initial application of the receptor model using source
composition data in each level may direct the actions and minimize the costs in subsequent levels.
Level I. Minimal Resources. The first step towards source characterization is to compile a list
of the point and area sources most likely to affect pollutant concentrations at the sampling sites.
Properties of their emissions can be estimated from available emissions inventories, site surveys,
local knowledge, and the results of previous tests of similar types of sources, not necessarily in the
same locale.
For point sources, the National Emissions Data System (NEDS) and the local agency
emissions inventories, from which NEDS is derived, are the best starting points. For each source
emitting > 1 t/yr of SO2, NOx, CO, or total emitted particulate matter (TEP), the NEDS condensed
point source inventories contain source classification codes (SCCs) that identify the fuels used and
processes involved, universal transverse Mercator (UTM) coordinates of the point of emissions, and
calculated and allowable emission rates of SO2, NOx, CO, TEP, and hydrocarbons. These records are
obtainable in computer-compatible formats that can be sorted by SCC and, subsequently, by TEP
emission rates within each SCC. Important point sources of a given type should be identified on the
basis of SCC and estimated TEP emission rate. The lower cutoff of emission rates will depend on the
resources available and the source's location relative to the receptor. Past experience shows that most
inventoried sources with TEP rates of < 50 t/yr can be ignored unless they release fugitive emissions
with low plume rise. The TEP rate is often inaccurate and should be treated as an order-of-magnitude
estimate.
Area sources most likely to contribute to a receptor are those in its immediate environs. These
sources are best identified from a survey of the area around the site. This survey, based on the
National Air Monitoring System hardcopy site survey, should include:
1. A street-and-block map for the 2 km radius around the sampling site that shows typical traffic
counts, curbing, lights, and general dirtiness of roadway sections; block classifications by
residential, commercial, industrial, and agricultural categories; and locations and dimensions
of vacant lots (with coverings), storage piles, and parking lots.
2. A description of the sampling station, including the sampler layout; elevation of sampling
probes above sea level, ground level, and platform level; UTM coordinates; and a summary
of nearby structures that might affect micrometeorology.
3. Photographs in eight cardinal directions from the sampling platform.
The point source summaries and site surveys should be reviewed by a knowledgeable person
from the area. Local air pollution specialists are aware that emission rates often change more rapidly
than indicated in emission inventories and dated site surveys.
Source compositions for chemical components and particle size fractions under conditions
compatible with those prevailing at the receptor should be compiled from existing data for the source
types in the area. This compilation would be assisted by a documented source characterization
library. In such a library, data for each source should include its type, specific identifier, and the
time, place, and circumstances of the source test; the characteristics of source operation and control
devices used during the test; the sampling methods used for each test; the chemical species and size
ranges measured and the analytical methods used; and concentrations and uncertainties therein for
designated chemical species by size range.
The source composition library should contain sufficient information to allow selection of the
source signatures that are most similar to those in the local area. Such literature includes previous
receptor model studies in which the components are listed, published articles or reports of source
studies, and soil analyses from the area based on national compilations or those of state geological
surveys. As discussed below, however, the airborne soil probably has a composition different from
that of bulk soil. It may be preferable to construct an approximate component for emissions from
coal-fired plants rather than to determine the composition directly from studies on plants from
another area. Such a component could be compiled on the basis of the composition of coal consumed
in the local area as modified by enrichment factors determined for elements between coal and emitted
particles.
Initial receptor modeling using these Level I source characteristics and receptor measurements
can guide the efforts of Level II source characterization by evaluating their adequacy for the models
being applied.
Level II. Moderate resources. The accuracy and precision of estimates of source contributions
can be improved if actual compositions of local source emissions are determined (instead of using
measurements of similar sources in other locations). This section outlines a general method to be
followed.
First, identify source categories and specific emitters that previous modeling or past
experience have shown to be potential contributors. Major sources of mass include wind-raised or
vehicle-entrained dust from agricultural soils, roadways, and parking lots; active storage piles (i.e.,
piles where materials are frequently added or removed); transfer processes, such as loading and
unloading of vehicles; and sources such as power plants, smelters, steel plants, cement plants, and
refuse incinerators.
Second, from each selected source obtain "grab" samples that will best represent the average
composition. Here "grab" sampling means the collection of bulk material from a pollution control
device or from piles of material that may produce fugitive emissions, or the simple in-stack
collection of suspended particles. Sampling methods used at each source should consider temporal
variations caused by the process schedule and fuel switching.
Third, try to isolate the portion of the bulk sample that is likely to remain airborne long
enough to reach a receptor. Ordinarily, it is inappropriate to analyze a bulk sample. A convenient and
necessary (but not always sufficient) procedure is to sample in a suspension chamber after drying the
bulk sample and sieving it through a standard Tyler 400 mesh (38 μm) screen. The resuspended
sample should be collected on a filter of high purity (cellulose, polycarbonate, or Teflon) with a
loading appropriate for the material to be measured and the intended analytical method. For example,
a quartz filter cannot be used if silicon is to be measured.
If possible, the same size intervals used for ambient sampling should be used for source
sampling. If sources are to be controlled to meet federal standards, the size intervals may be
established by those standards. If not, the best scientific judgment should be followed, using
available and convenient sampling methods. One useful approach is to employ virtual impactors
(dichotomous samplers) to collect source and ambient samples in two size intervals (<2.5 μm and
2.5 to 10 μm).
For grab sampling of in-stack suspended particles, a variation of U.S. Environmental
Protection Agency (EPA) Method 5 is used in which the filter of Method 5 is replaced by a particle-
sizing device, such as a cascade impactor or in-stack virtual impactor. In classical source emissions
tests, a pitot tube must be used to determine the velocity of stack gases at many points along two
perpendicular diameters across the stack. Then the mass loading must be measured at several points.
However, for receptor modeling, collecting samples at one point somewhat away from the inner wall
is usually sufficient. A pitot tube is used to determine stack gas velocity at that point, and the
pumping speed through the sampler is adjusted to ensure isokinetic sampling.
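For example, the pump flow required for isokinetic withdrawal follows directly from the stack gas
velocity at the sampling point and the nozzle cross-section (a back-of-the-envelope sketch with
hypothetical numbers):

import math

def isokinetic_flow_lpm(stack_velocity_m_s, nozzle_diameter_mm):
    """Sampler flow (liters/minute) that matches the sample stream velocity
    in the nozzle to the stack gas velocity at the sampling point."""
    area_m2 = math.pi * (nozzle_diameter_mm / 1000.0) ** 2 / 4.0
    return stack_velocity_m_s * area_m2 * 1000.0 * 60.0

# e.g., 12 m/s stack gas through a 6 mm nozzle -> about 20 L/min
print(isokinetic_flow_lpm(12.0, 6.0))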
Finally, the samples should be analyzed for species of interest. At this second level of
resources, consideration should be given to additional analyses that could improve the overall results,
even though these additional species have not yet been measured at the receptors. Analyses should be
considered that would provide data compatible with ambient data on the following additional source
characteristics: (1) total mass; (2) total mass within the "fine" and "coarse" size fractions; (3)
individual elements and ions, especially major contributors to total mass or tracers for major sources;
(4) inorganic and organic compounds, again emphasizing tracers for major sources; (5) visual
characteristics, using microscopy of various types; and (6) crystal structures as observed by X-ray
diffraction (XRD).
This second level provides source samples similar to those taken at receptors at approximately
the same cost per sample. Sources of some types may not be amenable to grab sampling, or the grab
sample may not be representative of the emissions that reach the receptor. More complex and costly
procedures are required to characterize these sources. Sampling only in stacks and pollution control
devices will miss fugitive emissions, which often contribute more suspended particulate matter,
especially in the coarse fraction, than do ducted emissions.
Level III. Maximum resources. Level III testing should be confined to those sources for which
it is absolutely necessary, such as major contributors that are not amenable to grab sampling. In all
cases, time variations in composition and mass should be determined for each source studied. The
three types of Level III tests are:
1. Ground-based sampling of plumes. This sampling should be completed for a source whose
ground-level contribution is greatest. This type of test is applicable to sources such as motor
vehicle emissions and field burns. The collection devices should be the same as those used at
receptor sites. A variation of this method applicable to point sources -- wind trajectory
analysis -- is discussed below.
2. Dilution stack sampling. This technique should be used for vented emission sources and have
appropriate constraints for assuring the collection of condensable species. Samples should be
collected in size fractions similar to those collected at the receptor sites. Presently, 2.5 μm
and 10 μm diameter size cuts appear appropriate for the anticipated particulate standard. In
the case of high-temperature stacks, provision should be made for diluting stack gases to
ambient temperature and relative humidity to allow condensation of moderately volatile
species. Standardization of the dilution sampler also is necessary, especially to determine the
sample size required for the receptor application. If dilution and cooling are not possible, it is
desirable to collect moderately volatile species (e.g., Hg, As, Se, halogens, organics) in
appropriate columns or traps downstream from filters (e.g., activated charcoal) in order to
determine an upper limit on the amount that might condense in the plume at ambient tem-
peratures. As these methods are still under study and improvement, no "standard" procedure
can be described yet.
11.0 AMBIENT SAMPLING FOR MAXIMUM
SOURCE RESOLUTION
11.1 Introduction
No single protocol for sampling and analysis is applicable for all receptor modeling studies. A
large variety of ambient sampling instruments, procedures, and considerations of increasing
sophistication and cost are presented in this section. If resources are limited, however, a decision
must be made regarding the optimal division of those resources between source measurements, and
interpretations. The division of effort among these categories to achieve a given level of reliability of
final results (e.g., accuracy of the estimates of total suspended particulate contributions in two size
fractions from various source classes) will vary for different areas. If most important sources in an
area are common to many areas (e.g., coal- and oil-fired power plants, motor vehicles, marine
aerosol, soil, and regional sulfate), little additional study of sources may be needed (However, the
EPA 1980 Houston study demonstrated the need for at least some analyses of sieved, resuspended
soil, whose composition may vary considerably from one area to another). In such cases, most
resources would be best devoted to ambient measurements.
Although ambient measurements are the focus of this section, ambient and source sampling
should be considered together. Insofar as practical, similar collection and analysis methods should be
used for both types of sampling (e.g., inlet cut point, type of sampler and filter media, analytical
techniques).
The ambient sampling program will depend strongly on the specific source apportionment
objectives. If the objective is to identify the sources causing exceedance of the 24 hour TSP standard,
then high volume TSP samplers will be required (or at least the results must be relatable to this
sampler), and the sampling duration cannot exceed 24 hours. If, on the other hand, the interest is in
determining the sources of annual TSP standard exceedances, then longer duration samples might be
collected. If maximum source resolution is the objective, then two to four hour samples using a
dichotomous sampler would be more appropriate. If the impact of a specific source, such as
automotive exhaust, is of interest, then a fine particle sampler using a glass fiber filter might be adequate. Are
there sources of interest having similar chemical compositions but different particle size distributions
such that their impacts might be resolved by selecting an appropriate particle-size sampling
instrument?
Thus, the airshed's characteristics and the study's objectives need to be defined clearly before
attempting to design an ambient sampling program. Existing data should be used to obtain an
understanding of the area under study. These should include such data as historical high volume TSP
and gas concentrations, meteorological data, emission inventories, chemical and optical analyses, and
the results of dispersion modeling.

11.2 Meteorological Regime Categorization
The determination of annual mean TSP levels and three month mean lead levels for
compliance with federal air quality standards requires an average to be calculated from a number of
short term measurements. Similarly, the annual or seasonal contribution assigned to specific sources
in an airshed by CMB source apportionment also requires an average value to be calculated. Ideally,
twenty-four hour samples should be collected each day of the year from which a true mean value
could be determined. Unfortunately in practice, budgetary and manpower constraints generally
preclude daily sample collection and filter analysis. The dependence of particulate concentrations on
weather conditions has been well documented in many airsheds. For weather (and season) dependent
atmospheric particulate concentrations an accurate long term average can only be obtained if each
important meteorological category is represented by samples proportional in number to the frequency
of occurrence of that meteorological category. If a weather type with characteristically high or low
particulate levels is over or under represented, a biased mean will be obtained.
Two approaches can be taken to obtain an accurate mean value: (1) Random collection of a
very large number of samples, or (2) meteorological regime categorization. The greater the number of
randomly chosen sample days, the greater is the probability that each weather category is represented
in proportion to its frequency of occurrence, and the calculated average becomes closer to the true
mean. The collection of samples at three day intervals is an example of this approach (random with
respect to weather conditions). If, as is often the case, a more limited sampling program is conducted,
meteorological regime categorization can be used to insure that the mean value obtained is
representative of the true mean for that study period. It should be noted that regime categorization
can also be used to improve data developed from frequent sample collection such as the every third
day approach.
The determination of a regime categorized geometric mean (and its standard deviation), in
which the TSP levels characteristic of each weather category are properly weighted, can be expressed
by the equations:
$$\log \overline{\mathrm{TSP}}_c = \sum_{i=1}^{m} \frac{N_i}{N}\,\log \overline{\mathrm{TSP}}_i$$
and,
$$\left(\log \sigma_{\mathrm{TSP},c}\right)^2 = \sum_{i=1}^{m} \left(\frac{N_i}{N}\right)^2 \frac{N_i - n_i}{n_i\,(N_i - 1)} \left(\log \sigma_{\mathrm{TSP},i}\right)^2,$$
where
N_i is the total number of daily occurrences of regime i (of m regimes total) during the
experimental period,
N is the total number of days in the experimental period (if an annual mean is being
calculated, N = 365 days),
n_i is the number of sample days selected from regime i during the experimental period,
n is the total number of sample days chosen during the experimental period,
m is the number of regime categories,
TSP_c (with overbar) is the categorized geometric mean,
TSP_i (with overbar) is the geometric mean of TSP in regime i,
σ_TSP,c is the geometric standard deviation of the categorized mean, and
σ_TSP,i is the geometric standard deviation of TSP in regime i.
*The mathematical description presented here is for the calculation of geometric mean TSP values.
Analogous equations can be developed for other parameters (viz., Pb concentration or average annual
source contributions). In addition, arithmetic means can be calculated, if desired, by removing the
log functions.
If the derivative of the equation for the standard deviation is taken with respect to n_i
(holding the total number of sample days fixed) and set equal to zero, the relationship

$$\frac{n_i}{n} = \frac{N_i}{N}$$

is obtained. This implies that the minimum standard deviation is achieved when the ratio of the
number of days selected per regime to the total number of days chosen is the same as the frequency of
occurrence of that regime on an annual (or seasonal) basis. Hence, to optimize the results, an attempt
should be made to select days in such a manner that the ratio n_i/n is as near as possible to the ratio
N_i/N for each regime.
Selection of those meteorological conditions to form the basis of groupings in a regime
categorization scheme is dependent on two principal factors: (1) Only meteorological parameters
which are routinely measured and recorded in a study area can be used, and (2) only parameters
which have a demonstrable relationship with particulate concentrations should be chosen. Review of historical TSP and
meteorological records is most helpful in developing the relationship between weather and TSP
levels. Wind-speed and direction, inversion conditions, precipitation and soil conditions are all
important factors (Table 11.1). Actual meteorological groupings will, of course, vary from airshed to
airshed depending on the local climate and the location of major emission sources with respect to the
study sites.
Successful TSP regime categorization studies carried out in Portland, Oregon and Lewiston,
Idaho demonstrate two different approaches to regime categorization (Table 11.2). Lewiston is in a
semi-arid agricultural region and approximately 60% of its TSP burden is due to geological material,
hence soil (and road) moisture conditions, episodic precipitation events and high wind speed were
emphasized in category formation. Portland, in contrast, receives nearly 40 inches of precipitation
annually, is a relatively large metropolitan area and is situated at the confluence of two broad river
valleys. Examination of U.S. Weather Service records and TSP data over a four year period revealed
that the relationship between wind velocity (vector averages) and TSP levels was sufficient to permit
regime categories to be formed. Besides wind speed, a proportional number of rainy days was
incorporated into the sample day selection. Regimes III and VII had the highest TSP associated with
them and regimes IV and V the lowest. The low wind speeds of regimes III and VIII are associated
with poor dispersion in the Willamette and Columbia River valleys. Surface flow diagrams indicate
that the winds characterized by regimes IV and V originate in the heavily vegetated region to the
southwest of Portland; in addition, some of the highest wind speed periods fall into regime IV. Consequently,
good dispersion conditions and little entrained dust characterize these regime categories. Socioeco-
nomic activity changes on weekends necessitated including the correct number of weekend days in
both the Lewiston and Portland studies. Similarly, seasonal averages were calculated in both studies
prior to determining annual mean values due to the seasonal nature of some particulate sources, e.g.,
home heating, agricultural practices, road sanding and salting, and soil surface conditions.
Table 11.1: Example Meteorological Regimes

Regime I. Criteria: dry soil, high wind speed (>14 MPH). Comments: resuspended dust, good dispersion. Particle size: large. TSP level: very high (>260 µg/m³).
Regime II. Criteria: air stagnation (inversion). Comments: no resuspension, poor dispersion. Particle size: small. TSP level: high (200-260 µg/m³).
Regime III. Criteria: low to intermediate wind speed. Comments: combination of two conditions: (1) dry soil, <14 MPH, and (2) wet/frozen/snow-covered soil, <5 MPH. Particle size: mixed. TSP level: intermediate (100-200 µg/m³).
Regime IV. Criteria: wet/frozen/snow-covered soil, intermediate to high wind speed (>5 MPH). Comments: little resuspension, good dispersion. Particle size: mixed. TSP level: low (10-100 µg/m³).
Regime V. Criteria: day of heavy precipitation (>0.1 in). Comments: washout. Particle size: mixed. TSP level: very low (<10 µg/m³).




Table 11.2: Meteorological Regime Categories

Portland, Oregon (wind speed and direction):
I. NE-NW, 8 MPH
II. NE-NW, 5-7 MPH
III. NE-NW, 4 MPH
IV. SE-SW, 8 MPH
V. SE-SW, 6-8 MPH
VI. S-SW, 6 MPH
VII. E-SE, 8 MPH
VIII. E-SE, 8 MPH

Lewiston, Idaho:
I. Wet Soil: days within 3 days after ≥0.01 inches of precipitation
II. Dry Soil: days not within 3 days after ≥0.01 inches of precipitation
III. Snow Covered Soil: ≥1 inch snow on ground
IV. Rain Washout: ≥0.1 inch rain in 24 hours
V. Snow Scavenging: ≥0.1 inch snow in 24 hours
VI. Very High Wind Speed: wind speed ≥14 MPH


The two important advantages which can be obtained by meteorological regime
categorization can be summarized as follows: (1) an accurate annual mean can be obtained with
fewer samples than would otherwise be required, and (2) a more representative mean can be obtained
with any number of samples. The former point is well illustrated with data generated during the
Lewiston study. The categorized annual (1979) geometric mean TSP value was 102 g/m
3
based on
37 samples while the uncategorized mean value was 104 g/m
3
based on 111 samples. The two
values are within 2% of each other and are statistically indistinguishable. The second point is
illustrated with data from the Portland study where categorized and uncategorized means were
calculated from the same number of individual samples. Of particular interest is the 1973 data. The
categorized mean is below the primary standard of 75 g/m
3
whereas the uncategorized mean is
above that value, making the difference between attainment and non-attainment status.

11.3 Sampling Location, Frequency, and Duration

11.3.1 Sampling Location
Siting of monitoring locations should take advantage of existing sites with consideration to
land use categories and transportation source influences. Source oriented sites are often useful, but
upwind, downwind, and background sites are also needed. Under ideal resource conditions, most
urban area studies will require at least six sampling stations. Simple source oriented modeling can
provide helpful guidance for the location of sites, e.g., location of maximum impact.

11.3.2 Frequency and Duration
The frequency and duration of sampling are important factors and will have a substantial
influence on source resolution, meeting analytical objectives, resources, etc. Source contributions at
a fixed sampling station are often short term, and measuring the source impact variability requires
sampling times of the same order as that variation. The impact of automotive exhaust on air quality is
strongly dependent on such highly variable diurnal influences as traffic density and meteorological
patterns. A 24 hour sample, which represents all of the day's activities with a single average data
point, will lose the source information contained in the diurnal variability pattern. This is particularly
important for interpretive methods such as TTFA, which rely on variability for their source signal. It
will also have an impact on the resolvability of CMB methods, since the source impact signal, when
averaged over 24 hours, becomes submerged in a higher level of background interference.
Although very short sampling times might be desirable from a source signal to noise standpoint,
they can greatly increase program costs and will require lower detection limits in the subsequent
analytical procedures to attain elemental information equivalent to that attainable with samples
collected over longer sampling times.
In general, two to four hour sampling might be considered optimal from the point of view of
source resolution and compatibility with analytical procedure sensitivities. Such sampling periods,
however, are usually not required and are probably not cost effective relative to most study
objectives. Eight to 12 hour sample durations are close to optimal because they still provide for some
resolution of the diurnal pattern and may only increase the sampling and analysis costs by a factor of
two or three. Twenty-four hour sampling would be considered minimal if the primary objective is to
address a twenty-four hour standard. However, as discussed in Section 9, longer sampling times can
be cost effective if quarterly or annual averages are of interest. Although some source resolution may
be lost, the quantification of major sources or sources with high signal to noise ratios will not be
affected by the longer sampling times. Meteorological regime stratification can provide valuable
input into making sampling frequency and duration decisions.
It is also important to obtain information on both intense pollution periods (episodes) and
relatively clean conditions. Thus, more samples than needed should be taken with a subset chosen for
detailed analysis based on meteorological regimes or other factors. It is also useful to collect samples
every day over selected monthly or seasonal periods, rather than on an every third or sixth day
schedule, so as to capture the progression of multiday events.
Although meteorological data is not required by receptor oriented methods, it provides
valuable input to the composite analysis and reconciliation of receptor modeling results. Hourly
observations of instantaneous wind direction from the nearest airport are insufficient to determine the
directions of important sources with respect to samplers. At least one sampling location should be
equipped with a meteorological tower to obtain wind direction and speed with sufficient frequency to
calculate hourly average values and standard deviations. If feasible, especially in a large study area,
more than one station should be so equipped.
To relate receptor modeling results to dispersion models, one needs information on mixing
heights and stability class. Classical methods such as balloons and radiosondes are helpful, but
usually too expensive to use more than twice a day. Devices that yield more frequent readings should
be considered: for example, a meteorological tower for short term wind fluctuations or vertical
temperature gradients (ΔT/Δz) to calculate stability class, or an acoustic sounder to obtain mixing
height. As development and testing increase their reliability and reduce their costs, various remote
sensing devices should be considered for these applications.
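As an illustration of deriving stability class from tower data, the sketch below maps a measured vertical temperature gradient to a Pasquill stability class. The threshold values are assumptions patterned on commonly cited delta-T criteria (in °C per 100 m) and should be replaced with whatever scheme a given study adopts.

```python
# Hedged sketch: assign a Pasquill stability class from a vertical temperature
# gradient measured on a meteorological tower. Thresholds (deg C per 100 m)
# are illustrative assumptions, not a prescribed standard.
def stability_class(dT_dz_per_100m: float) -> str:
    bounds = [(-1.9, "A"), (-1.7, "B"), (-1.5, "C"),
              (-0.5, "D"), (1.5, "E"), (4.0, "F")]
    for upper, cls in bounds:
        if dT_dz_per_100m < upper:
            return cls
    return "G"  # strongly stable

for grad in (-2.5, -1.0, 0.5, 2.0, 5.0):
    print(f"dT/dz = {grad:+.1f} C/100 m -> class {stability_class(grad)}")
```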

11.4 Samplers and Filters
The two basic objectives of aerosol sampling are collection of a representative aerosol sample
relevant to a specific objective and preconcentration. A variety of procedures can be used to collect
an appropriate sample that is also compatible with analytical procedures as illustrated in Figure 11.1.
Parameters which should be considered when selecting samplers and filters are listed at the
bottom of the figure. The importance of each parameter will depend in part on the type of aerosol to
be sampled. Blank impurity content is less important, for example, when sampling an urban aerosol
than when sampling an aerosol in a remote area while just the reverse is the case for volume sampled
since the remote area aerosol has a lower concentration than the urban aerosol.
Aerosol sampling devices must be chosen which can obtain an adequate amount of particulate
material within the sampling period for analysis; the device must also separate the aerosol into the
desired size ranges and meet performance specifications for efficiency, cut-point, wind dependence, etc. If
the variation of aerosol concentration with time is required, it is advantageous if the sampler is
able to switch filters sequentially without operator attention.
Because of re-entrainment and bounce-off problems associated with cascade impactors which
cause large particles to penetrate to small particle stages, virtual impactors (Figure 11.2) and cyclone
separators are the most commonly used size separating devices in current aerosol characterization studies.
Figure 11.1: Atmospheric Aerosol Sampling

[Flow diagram: atmospheric aerosol (gases at ppb levels; particles near 100 µg/m³) is sampled from stack, source, urban, rural, or remote settings and concentrated by settling, impaction, diffusion, interception, or electrostatic precipitation. Particle sizing considerations: stage constants, flow rates, bounce-off, break-up, particle shape and density, and last stage cutoff. Filter considerations: volume sampled, efficiency, blank content, reproducibility of tare weight, analytical compatibility, flow rate, and pressure drop.]

Figure 11.2: Cross-Sectional Drawing of a Dichotomous Virtual Impactor.
Filters A and B are Membrane-type Filters

In view of the availability and detailed characterization of the dichotomous sampler of the
virtual impactor type, these devices should be considered the basic sampler for receptor model
applications and should be used at each sampling site. Particles of > 2.5 µm in diameter are
chemically quite different from those of < 2.5 µm. These groups should be kept separate to avoid the
loss of information that occurs if they are mixed. Cascade impactors can separate particles into six or
more fractions, but the additional fractionation often does not add much useful information and
increases the costs substantially. If, as appears probable, EPA adopts a standard for particles of < 10
µm in diameter, the upper cut point should probably be adjusted to that size. The 2.5 µm cut point,
however, should be retained for identification purposes, whether or not a standard is established for
the fine particles.
A 24-hour high volume sample should also accompany samples collected on other samplers
both for regulatory issues and comparison with historical data sets.
Selected characteristics of a few samplers and filters are summarized in Tables 11.3 and 11.4.
Carbon and/or carbonaceous species account for a large portion of the ambient aerosol and can
provide considerable insight into source contributions. But since elemental and carbon analysis
cannot be done on the same filter medium, both quartz fiber and Teflon filters are required,
which doubles the number of samplers needed.

11.5 Filter Handling
A filter handling protocol will be discussed during the course.

Table 11.3: Summary of Aerosol Sampler Characteristics

TSP (Hi-Vol). Flow rate: 500-1000 l/min; cut point: none (collects TSP). Standard based on this sampler. Convenient; low cost filter substrate suitable for elemental analysis not available.
SSI (Hi-Vol). Flow rate: 500-1000 l/min; cut point: 10 µm. Standard based on this sampler. Convenient; low cost filter substrate suitable for elemental analysis not available.
TSP (Lo-Vol). Flow rate: 50-100 l/min. Can be used with filters suitable for elemental analysis, which are conveniently handled.
Dichotomous (virtual impactor). Flow rate: 17 l/min; cut points: 2.5 and 10 µm. Best source apportionment sampler. Expensive, and generates two filters per sampling period.
Dichotomous (stacked filter units). Low cost means of collecting a dichotomous sample for trace element analysis. Cut points are not as sharp as for the virtual impactor, and bounce problems may be more severe. Adequate for some problems, and cost effective.
Multi Stage. Flow rate: 1-10 l/min; cut points: variable. Potential bounce problems. More than two particle size fractions are not typically required. Inconvenient, and generates more samples than can usually be analyzed.

Table 11.4: Summary of Filter Characteristics

Teflon*. Cost: high; elemental blank: low; artifacts: no. Filter of choice for Lo-vol samplers. Difficult to use with Hi-vol samplers and expensive. Moderate loading capacity.
Nuclepore*. Cost: low; elemental blank: low; artifacts: no. Required for the first filter in stacked filter units. Low loading capacity, high pressure drop. Humidity insensitive.
Cellulose. Cost: low; elemental blank: low; artifacts: minimal. Lowest particle collection efficiency, not necessarily a problem. Humidity sensitive. Fibrous structure presents problems for light element analysis by XRF.
Cellulose Membrane. Cost: low; elemental blank: moderate; artifacts: minimal. Fragile, but has been successfully used in large programs.
Teflon Coated Quartz. Cost: moderate; elemental blank: Si high, otherwise moderate; artifacts: no. Not suitable for either XRF or carbon analysis. Low sulfate/nitrate artifacts.
Quartz Fiber. Cost: low; elemental blank: Si high; artifacts: yes. Potential organic carbon and sulfur and nitrogen oxide artifacts; low elemental background, but fibrous nature limits its utility for XRF analysis. Fibers easily lost.
Glass Fiber. Cost: low; elemental blank: high; artifacts: yes. Stable fibrous structure, but high element blank.

* These filters may be coated with a thin layer of high purity oil which minimizes particle loss in transport. Mainly
a problem for the coarse particle fraction.



12.0 ANALYTICAL METHODS FOR CHEMICAL MASS
BALANCE MODELING

12.1 Introduction and Overview
A large number of chemical compounds can be found in a typical urban aerosol, and some
subset must be selected for quantification. This subset of species is selected on the basis of meeting
two basic receptor modeling objectives:
- Measure those species that account for most of the mass, preferably more than 90%.
- Measure key indicating species.
To meet the first objective requires that all major species be measured. On the basis of the
typical urban aerosol, this means that the following elements should be measured: H, C, N, O, Na,
Mg, Al, Si, S, Cl, K, Ca, Ti, Mn, and Fe. Because of analytical difficulties associated with the direct
measurement of H, N, and O, they are usually accounted for as part of compounds such as H₂O,
NH₄⁺, NO₃⁻, SO₄²⁻, and carbonaceous species.
The fraction of the fine particle mass explained by the measured species is usually less than
that of the coarse particle mass, owing to the fine fraction's more complex chemistry and the
difficulty of quantifying species such as H₂O. The percent mass explained will typically range from
about 25%, if only the elements from Na upward are measured, to about 90% if ionic and carbonaceous
species are also measured and metal oxides are assumed. A serious effort to explain 100% of the
mass would not only be difficult but would also be cost-ineffective and unnecessary in most cases.
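A minimal sketch of this "percent mass explained" bookkeeping, assuming common oxide forms for the crustal elements: all concentrations below are illustrative assumptions, not values from any particular study, and the oxide conversion factors are simply molecular-to-elemental mass ratios.

```python
# Illustrative mass reconstruction (hypothetical data): convert measured
# elements to assumed oxides, add ions and carbon, and compare with the
# gravimetric deposit mass.
gravimetric_mass = 48.0  # ug/m3, measured deposit mass

elements = {"Al": 1.2, "Si": 3.1, "Ca": 0.9, "Fe": 1.1, "K": 0.5}  # ug/m3
oxide_factor = {"Al": 1.89, "Si": 2.14, "Ca": 1.40, "Fe": 1.43, "K": 1.20}
ions = {"SO4": 9.5, "NO3": 3.2, "NH4": 3.8}                        # ug/m3
carbon = {"OC": 8.0 * 1.4, "EC": 2.5}  # OC scaled up for H and O in organics

explained = (sum(elements[e] * oxide_factor[e] for e in elements)
             + sum(ions.values()) + sum(carbon.values()))
print(f"mass explained: {explained:.1f} ug/m3 "
      f"({100 * explained / gravimetric_mass:.0f}% of gravimetric mass)")
```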
The second objective (measure key indicating species) will likely present a different set of
analytical requirements for each airshed since the potential sources can vary from one airshed to the
next. Key indicating species were noted in Section 3.4 and will be discussed again in Section 12.2,
Selection of Elements.
Most studies will require measurement of basic species, including ions, elements, and organic and
elemental carbon.
Key indicating species can be determined for each airshed by developing a size-resolved
chemical species emission inventory as described in Section 14.0.
It should be noted, however, that many source apportionment objectives can be attained
without meeting the first analytical objective; i.e., characterize 100% of the mass. For example, a
reasonably accurate estimate of road dust, residual oil, and automotive exhaust can be made with a
high degree of confidence just on the basis of XRF data. The addition of carbonaceous and ionic data
would not substantially improve the quantitation of the contribution from these sources.
A quality assurance and quality control plan is a must for any receptor modeling analysis
program and must be an integral part of the study plan. Whether the analysis is to be done by an "in-
house" laboratory or by an outside service contractor, the individual responsible for the overall
program quality and/or data interpretation should develop a quality assurance plan that will provide
him with the necessary confidence in the data prior to the data interpretation step. This plan should
be separate from the QA/QC plan used by the laboratory. QA/QC components that have been useful
in the past include:
1. Blind replicate analysis. Selection and relabeling of samples for repeat analysis. This is a
measure of precision.
2. Submittal of blind standard reference materials (SRMs). This is difficult because of the lack of
SRMs similar to typical samples.
3. Statistical analysis of data for expected trends based on assumed sources; e.g., Br/Pb, Al/Si,
V/Ni.
4. Statistical analysis of intermethod comparison data (e.g., SO₄²⁻, Cl⁻, and Br⁻ from IC vs. S, Cl,
and Br from XRF); a sketch of this check follows the list.
5. Request that laboratories supply SOP's, QA plan, and validation evidence such as
intermethod and/or laboratory comparisons.
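As a sketch of QA/QC component 4, the following compares total S from XRF against sulfate-S from IC; if most fine sulfur is present as sulfate, the ratio should cluster near 1. The paired filter values are hypothetical, and the 0.8-1.2 acceptance window is an assumption for illustration.

```python
# Intermethod consistency check (hypothetical data): XRF sulfur vs. IC sulfate.
paired = [  # (SO4 from IC, S from XRF), ug/m3, one pair per filter
    (6.0, 2.1), (9.3, 3.2), (12.6, 4.3), (4.2, 1.5), (15.0, 5.2),
]
S_IN_SO4 = 32.06 / 96.06  # mass fraction of S in the sulfate ion, ~0.33

for so4_ic, s_xrf in paired:
    ratio = s_xrf / (so4_ic * S_IN_SO4)
    flag = "" if 0.8 <= ratio <= 1.2 else "  <-- investigate"
    print(f"XRF-S / IC-sulfate-S = {ratio:.2f}{flag}")
```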
Because unlimited resources are never available, analyses selected to characterize aerosol
samples must be based on their cost-effectiveness in meeting the source apportionment objectives;
i.e., characterize most of the mass and measure key indicating species.
X-ray fluorescence is generally considered to be the most cost-effective receptor modeling
tool for most airshed studies. It is capable of measuring all elements from Na to U and can thus
provide a useful indicator of unusual elemental abundances despite the fact that only about 15 to 30
elements are measured on a single filter. This set of elements, however, includes most of the more
abundant inorganic species such as Al, Si, S, K, Ca, and Fe in addition to most of the key indicating
species such as Si, S, Cl, Ca, Fe, Zn, Br, Pb as well as many other species useful in the general fitting
process. This coupled to its relatively low cost and nondestructive character has made this analysis
the one of choice for most of the filters selected for analysis.
An analysis protocol should be developed which processes filters in the most cost-effective
manner. Schematically, such a protocol might look like an inverted pyramid where most of the
samples would be analyzed by the most cost-effective procedures and fewer samples would be
analyzed by the less cost-effective methods. Most of the filters collected, for example, are usually
analyzed for deposit mass (although mass isn't necessary for receptor modeling, it is recommended to
provide the added confidence of knowing the percent mass explained by both chemical species
measured and sources quantified). Based on the net masses determined from this first analysis step,
meteorological regime classifications, episode days, and/or other criteria, a subset of filters is
selected for the next level of analysis, XRF. The results of these first analyses and other study data
can then be used to select a subset of the above subset to analyze by the next most cost-effective
analysis procedure which might be OC/EC, NAA or IC analysis depending on objectives. A similar
selection process is used for subsequent analyses.
Although analysis techniques such as NAA are quite expensive, they may be the most cost-
effective analysis tool for some applications such as acid deposition studies where very low
concentrations of As, Se and other elements must be quantified which are often below the range of
detection for other methods.
A brief review of the analytical techniques most commonly used in receptor modeling studies
is presented in the following sections. These analytical procedures generally fit within one of the five
categories listed below:
- Elemental Analysis
o X-ray fluorescence analysis (XRF)
o Neutron Activation Analysis (NAA)
o Inductively Coupled Argon Plasma (ICAP)
o Atomic Absorption Spectrophotometry (AA)
- Chemical Analysis
o Ion Chromatography (IC)
o X-ray Diffraction (XRD)
o Gas Chromatography (GC)
o Gas Chromatography/Mass Spectrometry (GC/MS)
o Other Analysis methods
- Particle Type Analysis (Microscopy)
o Polarized Light Microscopy (PLM)
o Scanning Electron Microscopy (SEM)
o Automated Scanning Electron Microscopy (ASEM)
- Carbon Analysis
- Radiocarbon Analysis

12.2 Selection of Elements
The selection of elements or chemical species to measure is done on the basis of the following
objectives:
1. Measure key fitting features for source apportionment purposes.
2. Characterize most of the mass for the assurance that major species and sources are not over-
looked and for QA purposes.
3. Measure specific species responsible for a significant portion of observed visibility
degradation.
A key fitting feature is an element or chemical species which is reasonably abundant in a
source of interest but has a low concentration in the natural background and other potential sources.
An ideal key fitting feature (more commonly referred to as a tracer or marker species) should also
have a high analytical sensitivity and a low cost per determination. Lead is often used as an example
of a tracer element for automotive exhaust since its concentration is relatively high in the emissions,
low in the natural background and in most other common sources, and is readily measured at typical
concentrations at a relatively low cost. Lead, however, would not be a tracer in an airshed with a lead
smelter. In this case, other physical and chemical features, such as Br and Cl for automotive exhaust
and Cd, Zn, As, and Sb for the lead smelter, must be used to separate the influence of these two
sources. Thus, the list of key indicating elements may be different for each site depending on
potential sources.
It is important to emphasize the fact that there are very few ideal tracers and that more
commonly a group of key elements will form a unique pattern which will allow a source category to
be distinguished from other possible sources. If an effective variance CMB analysis is used, the
fitting pressure applied by each source will be related to the uncertainty in each element. In this
broader sense, key fitting elements are those elements in a source fingerprint that are expected to
provide most of the fitting pressure or resolving power for that particular source. For most crustal
sources such as soil and road dust, Al, Si, K, Ca, Ti, Mn, Fe, Rb, and Sr are expected to be key fitting
elements to varying degrees with possibly Si, Ca, Ti, and Mn accounting for most of the fitting
pressure. Other elements such as Na, Mg, P, S, Cl, Sc, V, Cr, Ni, Cu, Zn, Ga, etc. which are
sometimes measured provide less fitting pressure because of either their lower abundances, greater
variability or presence in other common sources. Measurement of elements in this second group is
helpful in fitting this and other sources but their measurement is not as cost-effective in helping to
resolve this particular source's contribution to the aerosol.
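A minimal sketch of an effective variance solution of the kind described above (an illustration, not the EPA CMB code): the weight given each species shrinks as its ambient and source-profile uncertainties grow, so well-measured, source-specific species exert the most fitting pressure. The two-source, three-species profiles, uncertainties, and ambient values are all hypothetical.

```python
# Effective-variance CMB sketch (hypothetical data). F[k, j] is the fraction
# of species k in source j's emissions; c[k] is the ambient concentration.
import numpy as np

F = np.array([[0.20, 0.01],   # Si: abundant in soil, low in auto exhaust
              [0.01, 0.10],   # Pb: low in soil, high in leaded auto exhaust
              [0.04, 0.02]])  # K
sigma_F = 0.10 * F            # assumed 10% source profile uncertainty
c = np.array([4.2, 1.1, 1.0])           # ambient concentrations, ug/m3
sigma_c = np.array([0.2, 0.05, 0.05])   # ambient measurement uncertainty

S = np.zeros(F.shape[1])      # source contributions, ug/m3
for _ in range(20):           # iterate: the weights depend on the solution
    V = sigma_c**2 + (sigma_F**2) @ (S**2)   # effective variance per species
    W = np.diag(1.0 / V)
    S_new = np.linalg.solve(F.T @ W @ F, F.T @ W @ c)  # weighted least squares
    if np.allclose(S_new, S, rtol=1e-6):
        break
    S = S_new

print("estimated source contributions (ug/m3):", np.round(S, 2))
```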
As noted above, the list of key indicating elements is defined in part by potential sources that
might impact an airshed. A site at the Olympic National Park, for example, is likely to have
substantial contributions from such sources as the global background aerosol, sea spray, slash
burning, residential wood combustion, hogged-fuel boilers, unpaved roads, natural foliage emissions,
and urban plumes from Vancouver, B.C., Seattle, Tacoma and other smaller cities in the area. The
aerosol in the Glacier National Park, on the other hand, may be influenced substantially by an
aluminum plant, a forest products industry, unpaved roads, and a number of gas, oil, and coal based
energy developments. An urban airshed such as Salt Lake City would be influenced by yet another
set of sources, such as a copper smelter and mining operation, a petroleum refinery, an iron and steel
plant, a variety of coal fired power plants, transportation sources, etc.
Table 12.1 lists some of the more common potential sources and examples of their key indica-
ting elements in the absence of interfering sources. This list includes the following 31 elements: Na,
Al, Si, S, Cl, K, Ca, Sc, Ti, V, Cr, Mn, Fe, Ni, Cu, Zn, Ga, Ge, As, Se, Br, Rb, Sr, Zr, Mo, Cd, Sn,
La, Ce, Nd, and Pb. Additional elements might be added to the list if other sources were listed or if
other analytical techniques were more commonly used to characterize source emissions. For
example, La, Ce, and Nd would not have been on this list 6 months ago because emissions from a
petroleum refining catalytic cracker had not yet been adequately characterized. Emissions from a
catalytic cracker were sampled with a size-segregating dilution sampler in the summer of 1982; La,
Ce, and Nd were measured at several tenths of a percent of the fine particle elemental pattern, which
should be useful in resolving the influence of petroleum refining complexes. Thus, these elements must now be
added to our list of key indicating elements which should be measured whenever a petroleum
refinery is expected to be a substantial contributor to visibility degradation.
Additional species could be added to this list to meet the second objective listed, i.e., to
characterize most of the deposit mass. Additionally, it is important to characterize most of the mass
for QA reasons and to assure reviewers that major species and sources are not being overlooked. Of
course, carbon and oxygen account for much of the mass. Graphitic carbon or elemental carbon (EC)
is important because it can have a significant impact on visibility. Organic carbon (OC) is important
for resolving vegetative burning sources having OC/EC ratios of 5:1 from diesel exhaust sources
having OC/EC ratios of 1:5. Oxygen, on the other hand, like H and N measurements, provides little
insight into possible sources that can't be obtained more readily by measuring compounds containing
these elements such as oxides of N and S, carbonaceous compounds and water. Al, Si, S, K, Ca, and
Fe are also important, particularly in the coarse fraction where they account for a major portion of the
mass.
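The OC/EC ratio argument above can be made concrete with a two-source carbon balance: assuming the 5:1 and 1:5 ratios quoted for vegetative burning and diesel exhaust, measured ambient OC and EC determine the carbon attributable to each source. Ambient values here are hypothetical.

```python
# Two-source carbon apportionment sketch from OC/EC ratios (hypothetical data).
import numpy as np

# columns = (vegetative burning, diesel); rows = (OC, EC) fractions of the
# total carbon emitted by each source (OC:EC of 5:1 and 1:5, respectively)
A = np.array([[5/6, 1/6],
              [1/6, 5/6]])
measured = np.array([10.0, 4.0])  # ambient OC, EC in ug/m3

contrib = np.linalg.solve(A, measured)  # total carbon from each source
print("vegetative burning carbon:", round(contrib[0], 1), "ug/m3")
print("diesel exhaust carbon    :", round(contrib[1], 1), "ug/m3")
```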





Table 12.1: Examples of Potential Sources and Key Indicating Elements

1. Road and Soil Dust (plus asphalt and rock crushing): Al, Si, K, Ca, Ce, Ti, Mn, Fe, Ga, Rb, Sr, Zr
2. Leaded Automotive Exhaust: Cl, Br, Pb, OC, EC
3. Vegetative Burning: OC, EC, Cl, K, Ca
4. Diesel Exhaust: OC, EC
5. Hogged Fuel Boilers: OC, EC, Cl, K, Ca
6. Lead Smelters: Pb, Cd, Zn, S, As, Sb
7. Copper Smelters: Cu, Zn, As, Pb
8. Coal Fired Power Plant: Al, Si, K, Ca, Ti, Cr, Mn, Fe, Zn, Ge, As, Se, Sr
9. Oil Refinery Catalytic Cracker: La, Ce, Nd
10. Aluminum Processing: Al, F, Cl
11. Sulfite Recovery Boiler: SO₄, K
12. Kraft Recovery Boiler: SO₄, Na
13. Steel Electric Arc Furnace: Si, Cl, Ca, Cr, Mn, Fe, Zn
14. Ferromanganese Furnace: Na, K, Mn
15. Carborundum: Si, S, Cl, K, OC, EC
16. Glass Furnace: Na, S, Se
17. Carbide Furnace: Ca
18. Incinerator: Cl, K, Zn, Cd, Sn, Pb
19. Residual Oil Combustion: Cl, S, Ca, V, Ni, Fe, Zn, Mo
20. Galvanizing: NH₄, Cl, Zn

12.3 Alternate Analytical Approaches
There are a hundred or more compounds present in most aerosols which could be measured by
one or more analytical tools that are available. Selection of the most appropriate tool or set of tools
will depend on a program's objectives and the following boundary conditions:
- airshed characteristics,
- desire to characterize most of the mass,
- need to measure key indicating features,
- compatibility with sampling substrate,
- quantitative accuracy,
- available resources and,
- compatibility with interpretive approach.
The airshed characteristics which define the typical concentration and range of elements and
compounds in the aerosol are probably the most important. Once the elements of interest and the
airshed characteristics have been defined, each potential analytical approach can be evaluated relative
to the analytical problem. That is, will the particular analytical approach be capable of measuring
most of the elements or compounds of interest at typical, low, and high concentrations in the presence
of typical interferences? If not, how important is that element, or how important are the conditions
under which it cannot be measured? Can alternative analytical approaches be used?
The objective of this section is to review some of the main features of various analytical
approaches, which will be helpful in evaluating options that might be available if key elements or
compounds are not measurable at typical concentrations with x-ray methods.
Deposit mass determinations are nearly a universal measurement and are generally
recommended on all filter samples. Determination of deposit mass is not necessary to determine the
µg/m³ contribution of a particular source, since this can be determined from the µg/m³ of the
measured features. Omission of this analysis, however, results in minimal cost savings in most
studies, and, if mass is measured, it will provide information on the percent source contribution and the quality
assurance of knowing the percent mass explained.
X-ray fluorescence (XRF) analysis is generally accepted as the most cost effective source
apportionment analysis tool (Quail Roost). It is capable of measuring most of the more abundant
inorganic species such as Al, Si, K, Ca, Fe, Pb, etc. and common key indicating elements such as Pb,
Br, Si, V, Ni, Zn, etc. It is low cost, and it is precise and accurate if appropriately validated, which is essential
for all elements and particularly for low atomic number elements like Al and Si. About 30 to 40
elements are usually analyzed, and about 20 to 30 are typically reported above detection limits in
urban aerosols. The relationship of the different x-ray fluorescence methods is illustrated in Figure 12.1.
PIXE analysis, which is sometimes used, consists of energy dispersive x-ray spectrometry with proton
excitation. It is generally less desirable because of possible destruction of the more volatile
compounds and less competitive sensitivities for the higher atomic number elements than can be
obtained with monochromatic photon excitation as proposed for this study. X-ray fluorescence is
applicable to all types of filters but the maximum amount of information can be obtained with
membrane type filters. Elements with atomic numbers above 20 (Ca) can be determined
quantitatively on quartz fiber hi-vol filters. X-ray fluorescence analysis requires substantial light
element correction for particle size, absorption and inter-element effects which must be validated. X-
ray analysis procedures will be discussed in more detail later in this section.
Neutron activation analysis (NAA) is another multielement technique which has superior
sensitivities to XRF for many elements such as Na, Mg, Al, V, Mn, In, Cd, Sb, Sn, rare earths, etc. It
is independent of filter absorption effects and can be applied to high purity quartz fiber filters. Forty
to fifty elements can be measured with 30 to 40 typically measured in urban aerosols. Costs of NAA
are high, it cannot measure key elements such as Si, S, and Pb, and it is usually recommended for
only a few source and ambient samples unless key indicating elements such as La, Ce, and Nd are
considered critical for a specific location because of possible petroleum refinery impacts.
Ion chromatography is frequently applied to urban aerosols to measure SO₄²⁻, NO₃⁻, Cl⁻, Br⁻,
NH₄⁺, Na⁺, and K⁺. The anion analysis is relatively routine, while cation analysis requires
considerably more care. If IC is used in conjunction with XRF, only the SO₄²⁻, NO₃⁻, and NH₄⁺ are
added uniquely by IC. The NH₄⁺, although accurately quantified once in solution, is difficult to
interpret in terms of ambient concentrations because of artifacts and potential losses. Thus, cation
analysis is often not cost effective. The NO₃⁻ and SO₄²⁻ are also difficult to interpret when samples
are collected on glass fiber filters because of chemical artifacts. In addition, S is measured with XRF,
and most of the fine S is often in the SO₄²⁻ form.
Atomic absorption spectrophotometry and ICAP are excellent techniques for the analysis of
solutions but have substantial limitations when applied to aerosol samples. Inadequate detection
limits for some key elements, difficulties in solubilizing the sample, high costs relative to XRF, and
their destructive nature are but a few of their limitations. The method generally provides results for
V, Mn, and Pb with normal glass fiber hi-vol filters.

Figure 12.1: Block Diagram Illustrating the Relationship between the Different Types of X-Ray Fluorescence Analysis
[Diagram: XRF divides by spectral analysis into wavelength dispersive (WDXRF) and energy dispersive (EDXRF) methods, and by excitation into photon excitation (bremsstrahlung or monoenergetic) and particle excitation (proton, alpha, or electron).]

Organic (O), elemental (E), and carbonate (CO₃) carbon (C) analysis with pyrolysis
correction, as developed by Huntzicker at the Oregon Graduate Center, does an excellent job of
separating the three major carbon components.
Gas chromatography/mass spectrometry (GC/MS) is capable of characterizing a large number
of the more volatile organic compounds, but it is costly, and interpretation of the results is difficult
because of likely deviations from the conservation of mass due to compound reactivity and
partitioning between the gaseous and particulate phases. In addition, it usually addresses a small
portion of the total organic aerosol.
Liquid chromatography (LC) is less expensive and is applicable to higher molecular weight
compounds likely to be more stable in transport from source to receptor. Although both of these
chromatography techniques are available and have been used by Professor Cooper in his research at
the Oregon Graduate Center, they are not expected to be needed in this study because of the strength
of other techniques, the relatively high cost of LC and the difficulties associated with their
interpretation and validation.

X-ray diffraction (XRD), on the other hand, is applicable to a wide variety of relatively stable
compounds. Chief among its assets are its selectivity and sensitivity for determination of geological
or crustal compounds. This analysis technique has been successfully applied to source
apportionment. The analyses are expensive, however, and should be applied only to selected samples
to maximize its cost effectiveness. It is applicable to high volume glass fiber filters in addition to
selected membrane filters but generally requires deposits of more than 100 µg/cm².
Optical microscopy (OM) is one of the oldest receptor oriented source identification tools, but
in general it is considered to be only semiquantitative in routine applications. Its quantitative abilities
for particles greater than 2 µm have been amply demonstrated by a few skilled specialists when the
appropriate techniques and time are applied. It can distinguish pollens, spores, paper fibers, etc., and
many inorganic particles. It complements SEM techniques, which are strong in the less than 2 µm
range, but provides limited elemental information for the very large particles where surface effects
become important. Optical microscopy also has the potential of observing weathering effects which
may be important in distinguishing secondary contributions from road dust. Microscopic methods are
applicable to hi-vol glass fiber filters, but the particles must be removed before analysis. Although
state-of-the-art techniques such as ultrasonic methods are highly reproducible, their ability to
quantitatively remove all aerosol deposits unaltered has not been validated by any laboratory, and a
large portion of the fine particle mass remains on the filter. The most direct approach to microscopic
analysis is on unaltered samples deposited on membrane type filters. Quantitative OM is expensive
and usually applied to only a few highly selected samples to maximize its cost effectiveness. These
methods have not been able to quantify the contribution of residential wood combustion (RWC) and
generally have problems quantifying sources of fine particles, particularly in the presence of high
levels of sulfate.
Scanning electron microscopy (SEM) provides both morphological information and elemental
information about individual particles. It extends the range of microscopy to the few hundredths of a
micron region, but has limited applicability for characterizing particles less than a few microns in
regions with high sulfate levels because of particle coating problems. The elemental data
complements similar qualitative data such as color developed by OM. Quantitative analysis, like
OM, requires the summation of mass for specific particle classes which depends on an estimate of the
particle volume and density for a representative number of particles, generally in the range of 1000 or
more particles (More particles are required for representative fine particle analysis). Automated SEM
or automated particle identification techniques collect the same type of information as with normal
SEM analysis but are able to analyze many more particles to obtain improved statistical
representation of the sample. It is applicable to hi-vol filters, but the particles must be removed by
vacuum or ultrasonic methods, the same as for OM. SEM methods, like optical microscopy, are
expensive and are recommended only for highly selected samples. Like OM, they have not been able
to quantify the contribution of RWC and generally have problems quantifying sources of fine
particles, particularly in the presence of high levels of sulfate.
Radiocarbon (C-14) analysis is an excellent tool for distinguishing between fossil carbon
sources such as diesel and distillate oil emissions and modern carbon sources such as residential
wood combustion emissions. To distinguish between these sources, which emit primarily fine
particles, requires that samples be collected with a size selective hi-vol sampler to eliminate the
contribution of large particle modern carbon sources such as pollens, spores, wood fibers, etc.
Normal hi-vol samples could be used if microscopic analyses can confirm minimal impact from
large, modern carbon particles. The method is expensive when applied to small carbon samples (~10
mg carbon) which are typically collected with 24-hour hi-vol samplers. Costs can be minimized if
filters can be composited to give a seasonal average.

12.4 XRF Methods
There are four fundamental steps that are involved in any x-ray fluorescence analysis method:
1. Excitation
2. Spectral Analysis
3. X-ray Intensity Determination, and
4. Quantitative computations
Most authors separate the different approaches based on differences in the first two steps as
indicated in Figure 12.1. Although excitation is the first step, the first separation of approaches has
been on the basis of spectral analysis, mostly because of historical reasons.
Wavelength dispersive methods (WDXRF) use crystal diffraction to separate
and resolve characteristic sample x-rays. They have excellent resolving power for low energy x-rays (less
than 5 keV) but require high x-ray fluxes to excite an adequate number of sample x-rays for
measurement by the relatively inefficient WDXRF spectrometers. These high x-ray fluxes can be destructive
and could cause loss of some elements. The WDXRF approach has never been used on a routine
basis with excitations other than broad band bremsstrahlung radiation from high powered x-ray
tubes. Its main advantage comes from its high resolving power, which is most useful when analyzing
samples collected from source emissions, in which unusual elemental ratios and interferences can
limit the use of energy dispersive methods (EDXRF).
This resolution advantage, however, is of little importance in the analysis of typical ambient
aerosols because the analyses of elements of interest are usually not limited by the spectrometer's
ability to resolve interferences.
The energy dispersive spectral analysis approach (EDXRF) is the method most commonly
used for ambient aerosols. The characteristic sample x-rays, in this case, are resolved by a silicon
diode which converts the deposited x-ray energy directly into electrical signals, the amplitude of
which is proportional to the x-ray energy deposited. This high efficiency spectrometer has allowed
the development of analytical techniques using low power (less destructive) x-ray tubes.
The EDXRF spectrometer has superior resolving power to WDXRF spectrometers for x-ray
energies that exceed about 7 or 8 keV. Its lower resolving power for low energy x-rays is usually not
a problem for low atomic number elements in typical ambient aerosols. This type of spectrometer is
used with both photon-induced and particle-induced (PIXE) x-ray fluorescence analysis methods.
As indicated in Figure 12.1, there are a variety of other categories into which XRF methods
can be subdivided based on excitation methods. These excitation methods can be grouped into two
broad categories, photons (continuous or monochromatic) or particles (electrons, protons or alphas).
The continuous mode of photon excitation is most useful for broad band excitation or light element
analysis (Na to K). The main limitation for bremsstrahlung radiation analysis of low atomic number
elements is the low transmission efficiency of standard x-ray tube windows for low energy exciting
x-rays. This problem has recently been reduced with the development of thin Be window x-ray tubes,
which provide much higher x-ray fluxes near the absorption edges of the low atomic number elements (Na to Al).
Monochromatic excitation provides the best sensitivity for higher atomic number elements.
Radioisotopic sources offer the purest form of monochromatic radiation but provide poor energy
selectivity, low intensity, and low excitation efficiency for low atomic number elements. Regenerative
monochromatic, transmission tube, and secondary fluorescence methods all produce
monochromatic x-ray excitation at a variety of selectable energies and intensities.
Particle excitation methods use high energy protons or alpha particles to create electron-hole
vacancies. Proton excitation has been found to provide the best signal to noise ratios and is most
commonly used for aerosol analysis. Its excitation efficiency is highest for low atomic number
elements and is relatively poor for producing K-shell vacancies in elements
with atomic numbers greater than about 40. To obtain its highest sensitivities, high beam currents are
required which can cause sample damage.
The method used to determine the characteristic x-ray intensities once the EDXRF spectrum
has been recorded has very little impact on the analysis. It has been shown that any one of many
different spectral analysis approaches can yield relative analytical accuracies in the range of a few
tenths of a percent.
The final step of quantitative computation usually consists of a comparison between thin film
standards and thin aerosol deposits. This is usually adequate for elements with atomic numbers
greater than about 20 but particle size and matrix corrections are required to increasing degrees for
lower atomic number elements. This is a particular problem for elements from Na to Si because of
the potential magnitude of the corrections and their dependence on unknown particle size
distributions which can be different in each airshed and can vary as a function of time. Theoretical
corrections, particle standards and empirical methods have all been used to mitigate this problem.
These approaches will be limited, though, when the particle mass distribution is substantially
different from expectations. Without knowledge of the particle size distributions and chemistry, x-ray
measurements for the very low atomic number elements, such as Na, will be estimates at best. This
problem, although not totally independent of the method of creating electron-hole vacancies, is
primarily dependent on the characteristic x-ray energy and the composition of the host matrix which
is the same for XRF and PIXE analysis.
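To illustrate the particle size correction problem, the sketch below evaluates a simple self-absorption factor of the form f = (1 − exp(−µρd))/(µρd) for a homogeneous particle of diameter d, where µ is the mass attenuation coefficient at the analyte's characteristic x-ray energy. The µ and density values are rough illustrative assumptions, chosen only to show how quickly the correction grows for a soft x-ray line such as Al Kα.

```python
# Particle-size self-absorption sketch (illustrative mu and rho values).
import math

def attenuation_factor(mu_cm2_per_g: float, rho_g_cm3: float, d_um: float) -> float:
    x = mu_cm2_per_g * rho_g_cm3 * d_um * 1e-4  # diameter converted to cm
    return (1.0 - math.exp(-x)) / x if x > 0 else 1.0

# Soft x-rays (e.g., Al K-alpha) are strongly self-absorbed in coarse particles
for d in (0.5, 2.5, 10.0):  # particle diameter, um
    f = attenuation_factor(mu_cm2_per_g=4000.0, rho_g_cm3=2.5, d_um=d)
    print(f"d = {d:4.1f} um: measured/true = {f:.2f}, correction = {1/f:.2f}")
```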
Although the particular method used to excite x-rays and measure the resulting spectrum are
both of some importance, the key evaluation criteria should be performance factors such as:
1. Elements measured
2. Minimum detection limits for these elements
3. Minimum detection limits for key indicating elements
4. Validated quality
5. Cost and
6. Turn-around time
Of particular importance are the detection limits since they reflect the amount of information
that is potentially available with a proposed approach. Although the detection limits required are not
particularly restrictive, selection of a procedure that will provide substantially better detection limits
for key elements that are expected to be typically at low concentrations will provide additional
valuable information. One potential problem associated with a comparison of detection limits is
created by inconsistent use of detection limit definitions. This has been largely resolved by
two publications by Currie, who has established a theoretical basis for a standard definition. The
value suggested by Currie for the concentration at the detection limit (C
D
) for x-ray analysis is given
by the following equation:
( ) | |
b b D
S t R C o 65 . 4 / 65 . 4 = =
where σ_b is the blank standard deviation (in counts), R_b is the blank counting rate, t is the time of
the analysis, and S is a sensitivity function in terms of counts per second per µg/cm².
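Evaluated numerically with hypothetical blank rate, counting time, and sensitivity values, the expression gives detection limits of the general magnitude encountered in aerosol XRF:

```python
# Currie detection-limit sketch (hypothetical instrument parameters).
import math

R_b = 25.0   # blank counting rate, counts/s
t = 300.0    # analysis time, s
S = 150.0    # sensitivity, counts/s per ug/cm2

sigma_b = math.sqrt(R_b * t)    # blank standard deviation, counts
C_D = 4.65 * sigma_b / (S * t)  # detection limit, ug/cm2
print(f"C_D = {C_D * 1000:.1f} ng/cm2")
```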
















13.0 RUNNING EPA CMB 8.2 MODEL



14.0 EVALUATION OF RECEPTOR MODELING RESULTS

14.1 Introduction
Although this chapter focuses on verification of receptor modeling results, an equal degree of
care should be taken in evaluating the quality of the input data. Evaluations of laboratory analytical
precision within analytical methods, correlation analysis of fine particulate mass and light scattering
coefficient (B_scat) data, and comparisons between data collected at different sampling sites may be helpful.
Evaluation of the quality of the input data and correction of deficiencies is most important to the
successful use of the receptor modeling method within the context of airshed studies.
The process of evaluating modeling results is of central importance in establishing credibility
in the study's findings. Four criteria may be used to assess the validity of the apportionment analysis;
each of which is discussed in further detail in the following sections:
- Model results must be internally consistent (e.g., predicted/observed species and mass
agreement, correlations among elements from a common source, reproducibility of results).
- Estimated source impacts should agree with source activities, transport and dispersion
conditions within the airshed.
- Source apportionment results should be consistent with impact estimates derived by other
techniques such as dispersion modeling, multivariate statistical methods, microscopy, or
other receptor model approaches.
- Unexpected results should be investigated and explained.
Unfortunately, there are no clear standard or "right" answers against which estimated source
contributions can be compared to determine their validity. Results can, however, be evaluated in
terms of results from other methods, the relative merits of each approach identified and judgments
made relative to the credibility of the results.

14.2 Internal Verification Techniques
In the absence of "hard" standards for comparison, the credibility of receptor modeling results
can best be judged by applying as many self-consistency tests as possible. Three consistency criteria
have been suggested to evaluate the quality of CMB fits, for example:
A. The concentrations of all species and each size fraction's mass should be predicted accurately.
B. The observed size distributions of the sources determined by CMB should be consistent with
the size distributions measured at the receptor.
C. Concentrations of chemical species should be highly correlated with other species thought to
come from the same source.
Criterion A is best evaluated by examining the average CMB predicted/observed species and
mass data derived from a large number of individual samples. Predicted/observed (pred/obser)
species ratios within the range of 0.5 to 2.0 are generally considered as acceptable. Ratios that fall
outside of this range may indicate an error in accounting for an important source (as is common in
coarse mode Ca, for example), errors in the source composition, element volatility or the presence of
a previously unaccounted for source (e.g., sampler vacuum-pump motor particulate indicated by a
low Cu pred/obser ratio).
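A sketch of this screening step: compute the predicted/observed ratio for each fitted species and flag values outside 0.5-2.0. The species values are hypothetical, chosen so that Ca and Cu fail in the ways described above.

```python
# Criterion A screen (hypothetical data): flag species whose CMB
# predicted/observed ratio falls outside the 0.5-2.0 acceptance range.
predicted = {"Si": 3.9, "S": 4.6, "Fe": 1.0, "Ca": 0.3, "Cu": 0.02}
observed  = {"Si": 4.2, "S": 4.4, "Fe": 1.1, "Ca": 0.9, "Cu": 0.08}

for species in predicted:
    r = predicted[species] / observed[species]
    status = "ok" if 0.5 <= r <= 2.0 else "investigate (missing source?)"
    print(f"{species:>2}: pred/obs = {r:.2f}  {status}")
```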
Mass pred/obser ratios within a range of 0.8-1.1 are difficult to achieve unless nearly all of the
aerosol mass has been accounted for chemically. In those cases where only elemental data are
available, such as data from the EPA inhalable particulate network, mass ratios much less than 0.8
are to be expected because of the absence of sulfate, nitrate, and ammonium ion data and carbon
components in the analysis. Supplemental analysis for these components would be necessary in
airsheds where sulfate, nitrate, ammonium ion and carbon are important constituents of the aerosol.
Criterion B requires that the percentage of each source's impact (at the receptor) estimated for
the fine mode (< 2.5 µm) should approximate the percentage of emissions (at the source) found in the
same size fraction. If, for example, 90% of the emissions measured in the boiler stack are found in
the fine mode, one would expect to find most of the boiler's impact measured at the receptor in the
fine mode. An example of this type of analysis is provided by results from the Portland Aerosol
Characterization Study. The comparison shows, for most sources, good agreement between the
fraction of mass in the source emissions and that estimated by CMB. Vegetative burning impact
estimates often do not agree, however, due to the large uncertainty in the source's composition or the
chemical similarity between coarse mode vegetative burning particles and plant tissue. Disagreement
between estimates for minor source components is to be expected because of relatively large
analytical and fitting errors near the limit of the method.
Because of preferential deposition of particles emitted in the coarse mode (> 2.5 µm),
application of this criterion to the coarse mass provides less certain conclusions. As an upper limit, one
would not expect to find a greater percentage of a major coarse particle source (e.g., urban dust) in
the source emission than in the receptor samples.
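The following sketch illustrates one way the criterion B comparison might be automated; the tolerance and the boiler numbers are hypothetical, and impacts too small to judge are left undecided:

    # Criterion B consistency check: the share of a source's CMB impact found
    # in the fine mode at the receptor should roughly match the fine fraction
    # measured in the source's emissions. All numbers are illustrative.
    def fine_fraction_consistent(src_fine_pct, fine_impact, coarse_impact,
                                 tolerance=20.0):
        total = fine_impact + coarse_impact
        if total <= 0:
            return None  # impact near the limit of the method; no judgment
        receptor_fine_pct = 100.0 * fine_impact / total
        return abs(receptor_fine_pct - src_fine_pct) <= tolerance

    # A boiler emitting 90% fine-mode mass whose CMB impact is mostly coarse
    # fails the check and warrants a look at the source profile.
    print(fine_fraction_consistent(90.0, fine_impact=1.2, coarse_impact=4.8))  # False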
Criterion C suggests that binary correlations between species measured at each sampling site
be examined to see if the results are consistent with CMB findings. By stratifying data sets into
impact groups, the investigator could examine "impact" versus "no impact" correlation sets for
significant elements common to the source class being examined. For example, one would expect a
higher correlation between fine particle Fe and Mn during periods of steel mill impact, as identified
by the receptor model, than during periods of no impact.
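A minimal sketch of this stratified correlation test, assuming Python with numpy and hypothetical daily fine-particle Fe and Mn data, might look like:

    import numpy as np

    # Correlate Fe and Mn separately for CMB "impact" days and all other days.
    def stratified_correlation(fe, mn, impact_days):
        fe, mn = np.asarray(fe, float), np.asarray(mn, float)
        impact = np.asarray(impact_days, bool)
        r_impact = np.corrcoef(fe[impact], mn[impact])[0, 1]
        r_other = np.corrcoef(fe[~impact], mn[~impact])[0, 1]
        return r_impact, r_other

    fe = [120, 300, 90, 410, 80, 95, 350, 100]   # ng/m3, hypothetical
    mn = [6, 18, 3, 25, 4, 3, 21, 5]
    impact = [False, True, False, True, False, False, True, False]
    r_imp, r_other = stratified_correlation(fe, mn, impact)
    # For a steel mill source one would expect r_imp to exceed r_other.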

14.3 Source Activity and Transport Methods
An analysis of anticipated source strengths, preferably conducted independently from those
involved in the CMB analysis, can provide important clues regarding the validity of the receptor
model analysis. This requires, however, the prior assumption that the source activity or transport
analysis represents the "truth" against which the CMB results can be judged. Such an assumption
may be misleading.
The ability to prepare graphical comparisons of temporal changes in source activity versus
CMB derived source strength reflects one of the strong points of the CMB approach compared to
multivariate techniques, which require large data sets.
Spatial variations in source class emission density can be used to evaluate receptor modeling
source strength estimates. Concurrent receptor modeling source strength estimates for each source
class obtained from a number of sampling sites can be used to develop isopleth maps which can be
superimposed on emission density maps. Less resource intensive evaluations based on
microinventory results can also be used.

Atmospheric transport information can provide strong support for attempts to resolve impacts
from specific sources within a general source class. If, for example, an airshed is impacted by two
steel mills having chemically similar emissions, the individual impacts can be identified by (a)
sampling during periods of persistent wind direction at a point downwind of the source of interest
yet upwind of the second mill, (b) stratifying results for wind direction and stability cases likely
to favor impact from a specific source, or (c) using source strength data from two or more sampling
sites, in association with wind direction data, to identify sources by "triangulation."
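As a rough illustration of approach (b), the sketch below stratifies hypothetical daily CMB steel-source impacts by resultant wind direction; the sector bounds and impact values are illustrative only:

    import numpy as np

    # Mean CMB impact for samples whose resultant wind falls in `sector`
    # (lo, hi) degrees; wrap-around sectors such as (330, 30) are handled.
    def stratify_by_wind(impacts, wind_dir_deg, sector):
        imp = np.asarray(impacts, float)
        wd = np.asarray(wind_dir_deg, float) % 360.0
        lo, hi = sector
        mask = (wd >= lo) & (wd < hi) if lo < hi else (wd >= lo) | (wd < hi)
        return imp[mask].mean() if mask.any() else float("nan")

    impacts = [4.1, 0.6, 5.3, 0.4, 3.8]   # ug/m3, steel source class
    winds = [250, 90, 245, 110, 260]      # resultant wind direction, degrees
    print(stratify_by_wind(impacts, winds, (225, 275)))  # mill to the WSW: 4.4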

14.4 Comparisons to Other Techniques
Coarse mode soil dust impacts obtained by CMB analysis can, for example, be compared to x-
ray diffraction or optical microscopy results if care is taken to protect and preserve a portion of the
sample. Radiocarbon, enrichment factor and CMB results for vegetative burning impact in the fine
particle mode have also been compared. On larger data sets, results from multivariate statistical
techniques can be evaluated in relation to CMB results averaged over the same time period. In each
case, the investigators must use their knowledge of the strengths and weaknesses of the methods to
determine the true source strength mix.

14.5 Analysis of Anomalous Results
The evaluation of unexpected results or contradictions between CMB results and other source
strength indicators can provide extremely valuable insights into the nature of the airshed's source mix
and the sources' relative impacts. Investigations into unusually high concentrations of Al, K, Sr, Sb, and Ba
in St. Louis led to the identification of July 4, 1976 fireworks displays as a strong (although short-
lived) particulate source. Evaluation of unaccounted-for fine mass led to the "discovery" of vegetative burning
(woodstoves, fireplaces, domestic burning, etc.) as a major source in Portland, Oregon, later estimated
to be 6,500 tons/year. The key point is that major benefits in our understanding of the airshed and
its sources can be obtained through a careful evaluation of the evidence, recognizing that no single
approach is inherently correct.


15.0 EXAMPLES OF CHEMICAL MASS
BALANCE APPLICATIONS

15.1 Airshed Management Applications
Receptor modeling techniques have a wide range of potential applications in apportioning
source impacts related to TSP, inhalable, fine or coarse mode particulate, visibility impairment,
pollutants throughout the biosphere, and such specific industrial applications as asphalt
batching operations. Applications to control strategy development and dispersion model verification
have been mentioned, but other areas such as national policy development, NAAQS assessment,
stationary source enforcement, identification of sources of toxic pollutants, acid rain, and gaseous
hydrocarbon source reconciliation are also possible uses of the technology.
A few of the potential applications are listed below as examples of how these methods might
be used by regulatory agencies. Since receptor methods cannot identify the specific origin of all
particles, the actual application of the techniques to the applications noted may have limited success
depending on the specific problem at hand.
- Dispersion Model Validation
The comparison of source impacts derived from dispersion models to those obtained using
receptor techniques represents a new approach to dispersion model validation. Specific source (or
source class) impacts estimated by both techniques can be directly compared and evaluated to
identify errors in emission inventory data, modeling assumptions, and meteorology that would not be
apparent using traditional model validation approaches based on particulate mass.
- Control Strategy Tracking
The effectiveness of control strategies can be tracked using receptor modeling source impact
trend analysis methods. By analyzing the ambient aerosol source mix using receptor techniques, the
effectiveness of control strategies can be assessed and compared to those originally predicted.
Ineffective strategies can be re-evaluated with minimum impact on the community, while those
regulations that are effective can be more easily identified.
- Demonstration Control Strategy Evaluations
Receptor techniques can be used to identify changes in source impacts resulting from trial
control strategy applications. The State of Oregon, with EPA support, has conducted studies of the
effectiveness of street cleaning strategies using CMB methods to monitor changes in soil and road
dust impacts.
- Growth Management
Emission inventory projections based on estimates of future community growth can be tracked
over time, and their accuracy checked against receptor method results based on actual source
impacts. By establishing and routinely operating monitoring networks designed to collect aerosol
samples in a manner suitable for future receptor model analysis, the analyst gains access to a
historical record of trends in source impacts with which more accurate growth management
decisions can be made.
- PSD Increment Tracking and Analysis
Receptor method results obtained within maximum impact areas projected by preconstruction
PSD dispersion modeling can potentially assess the true source impact better than techniques which
rely solely on changes in particulate mass concentrations. Receptor results can then be used to
further restrict or expand the emission growth in the area studied.
- Emergency Action Plan Development
Receptor source apportionment techniques are especially well suited to the identification of
source impacts during air pollution episodes. It is under these acute conditions of poor dispersion and
high particulate concentrations that receptor methods can more appropriately be applied than source
models. Dispersion models often cannot be used because of their sensitivity to low wind speeds and
difficulty in developing an adequate emission inventory. Episode monitoring data and receptor
technique analysis can provide important information needed to develop and track the effectiveness
of emergency action plans.

15.2 Visibility Source Apportionment
The identification of source impacts on visibility reduction has been a subject of considerable
interest to regulatory agencies for many years as the public's perception of a community's air quality
is often based solely on this criterion. The apportionment of source impacts on visibility reduction
may be of central importance to the objectives of particulate control strategies designed to achieve
NAAQS. Researchers have identified fine particulate in the light scattering range (0.1 to 1.0 µm) as
the most important component associated with visibility reduction, although NO2 provides an
additional potential for discoloration caused by absorption of blue light.
As a first step, the relative importance (regression coefficient) of each species in the aerosol to total
light extinction (the sum of the scattering and absorption coefficients) is determined through regression analysis.
Source contributions to the fine particulate mass are then identified by receptor modeling. The
receptor modeling source impact information is then used to estimate, for example, the portion of
elemental carbon associated with auto exhaust, diesel emissions, vegetative burning and other
sources. By weighting the assigned species with the regression coefficient, estimates can be made as
to the likely portion of the light extinction associated with specific source categories.
- In the Denver Haze Study, linear regression analysis in which the particle scattering
coefficient (B_scat) was regressed against concentrations of fine particulate species was used to
determine if any of the components were especially important. Results concluded that while
no single chemical species was uniquely efficient at scattering light (on an equivalent
concentration basis), elemental carbon's ability to absorb light (B_abs) provided the strongest
single correlation with light extinction.
- Results from CMB analysis of fine particulate mass in the Southwest were combined with
concurrent light scattering (B_scat) measurements to determine a visibility "budget". Results
indicated that sulfates accounted for nearly one-half of the scattering on a typical, clear day.
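The two-step weighting procedure described above can be sketched as follows; the species set, efficiencies, and CMB assignments are hypothetical, and the regression is an ordinary least squares fit without an intercept for simplicity:

    import numpy as np

    # Step 1: per-species extinction efficiencies by regressing b_scat on
    # fine-particle species concentrations (columns: sulfate, nitrate, EC).
    X = np.array([[8.0, 3.0, 1.5],
                  [5.0, 2.0, 1.0],
                  [12.0, 4.0, 2.5],
                  [6.0, 2.5, 1.2],
                  [9.0, 3.5, 2.0]])              # ug/m3
    b_scat = X @ np.array([0.30, 0.20, 0.45])    # synthetic "measured" b_scat
    coefs, *_ = np.linalg.lstsq(X, b_scat, rcond=None)

    # Step 2: weight each source's CMB-assigned species mass by the
    # regression coefficients to apportion light extinction.
    source_species = {                # ug/m3 of each species assigned by CMB
        "auto_exhaust":       np.array([0.5, 0.8, 0.9]),
        "vegetative_burning": np.array([0.3, 0.2, 1.1]),
    }
    for src, masses in source_species.items():
        print(src, round(float(masses @ coefs), 3))  # extinction contribution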

The chemical composition of the light scattering aerosols provides a valuable clue regarding
their potential source. Unfortunately, the ability to resolve sources of the fine particulate mass using
receptor modeling is limited when the aerosol is dominated by secondary sulfate, nitrate, ammonium
ion, and organic and elemental carbon species. The present inadequacy of source composition data,
especially for carbonaceous components, and the absence of known "tracers" limit current
applications to sources with more distinct fingerprints. Light extinction associated with many
industrial sources may, however, potentially be identified using these techniques.

15.3 Chemical Mass Balance Case Studies
In the past 10 years, variations of the basic CMB approach to particulate source impact
apportionment have been applied in at least 15 studies within the United States, Canada, and Brazil.
Although many of the early applications were designed to provide a general knowledge of aerosol
composition and likely source contributions, results from the most recent studies have had direct
input to control strategy development in primary and secondary particulate standard non-attainment
areas.
Brief descriptions of a few representative studies are included in this section to illustrate how
the technology is used and to show examples of the points raised in the course.

15.3.1 Particulate Source Apportionment Studies
The following studies were research efforts which have used CMB methods to better
understand source strength, develop receptor model techniques and provide basic data on the nature
of the aerosol problems.
Pasadena, California
In one of the first applications of the CMB concept, total particulate data from Pasadena was
used to resolve contributions from sea salt, soils, auto exhaust, fly ash, cement, and tire dust.
Eighteen elements were measured by NAA and AA methods. The tire dust component was
determined indirectly by assuming that the mass ratio of tire rubber to exhaust was 1:10. Altogether,
about 70% of the total particulate mass was accounted for within the source classes studied.
Chicago, Illinois
TSP samples from 20 sites were analyzed for 20 elements, and elemental tracers were used for
source apportionment. Six source classes were identified as important contributors to the aerosol
mass: coal combustion, coke production, fuel oil combustion, auto exhaust, iron and steel
manufacturing and cement dust. Fifty-two percent of the total particulate mass (including sulfate and
nitrate ion) was identified but carbon emissions, non-carbonate carbon components and secondary
aerosols were not accounted for, nor were all potential sources included in the study. Elemental
concentrations predicted in this study were within a factor of 2 for Br, Co, Cr, Cu, Fe, and K, but As,
Cd, Cl, Hg, La, Mg, Na, Se, and Zn were substantially underestimated.
Los Angeles Basin, California
The California Aerosol Characterization Study (ACHEX) was conceived as one of the most
comprehensive aerosol field studies of the early 1970's. Primary source contributions were
estimated using the CMB method in its "tracer" mode supplemented with (a) emission inventory

scaling techniques for those sources for which no tracer data were available, and (b) estimates of gas
to particle conversion of sulfate, nitrate, organics, and ammonium ion. The CMB model was used to
calculate source impacts for Pasadena, Riverside, San Jose, Fresno, and Pomona using NAA and
XRF data from hi-vol and Lundgren impactor samples. Sea salt, soils, auto exhaust, cement and fly
ash components were resolved by CMB. Impacts from diesels, industry and aircraft emissions were
estimated by inventory ratioing techniques. Carbon balance methods were added to identify source
impacts based on estimates of the percent carbon in source emissions. About 35% of the measured
mass was accounted for with good CMB predicted/observed ratios for Al, Na, Ca, V, Pb, Mg, and K.
St. Louis, Missouri
Size resolved particulate data from ten sites operated in St. Louis between 1975 and 1977
were used to identify contributions from crustal sources (soils, road and quarry dust and fly ash),
limestone, ammonium sulfate, motor vehicles, paint pigments and steel manufacturing. Ambient
dichotomous impactor samples were analyzed by XRF for elements of atomic numbers 13 to 38 plus
Cd, Sn, Sb, Ba, and Pb.
Source composition data consisted of data from tunnel studies while crustal, paint pigment
and steel component values were from the literature. An average of 78, 87 and 96% of fine, total
particulate and coarse fractions were accounted for. Elements not predicted within a factor of 2
included V, Cr, Ni, Cu, Zn, Se, Cd, Sn, Sb and Ba.
Washington, D.C.
CMB methods were applied to results from the NAA of hi-vol total suspended particulate
samples (using a 0.4 µm Nuclepore substrate). Twenty-seven elements were measured. The aerosol
mass was apportioned into six components: coal fly ash, oil and refuse combustion, marine aerosols,
soils and motor vehicles. Eight fitting elements (Na, V, Zn, Pb, Al, Fe, Mn, As) were used with the
least squares fitting approach. Only stable, nonvolatile elements were selected as fitting species,
which serve to identify aerosol from a specific source (e.g., Pb for auto exhaust, Al for soils, etc.). Source
composition data was obtained from local source tests (coal, oil and refuse combustion and soils)
while marine and auto exhaust compositions were taken from literature values.
Source composition data was later improved through numerous tests of local power plants and
other sources. Forty elements measured by NAA yielded overall CMB predicted-to-observed
ratios between 0.7 and 1.5 for most elements. Underestimated elements such as K, Mg,
Cr, Cu, and Ni suggest that either (a) several important sources have not been identified, or (b) errors
are present in the source composition data. About 48% of the TSP was accounted for by primary
aerosol emissions, the remainder consisting of secondary aerosols (SO4^2-, NO3^-, NH4^+) and condensible
organic emissions.
Buffalo, New York
The State of New York, working in association with EPA, conducted a study of particulate
source contributions in the Buffalo-Lackawanna area between January, 1978 and July, 1979. Five
hundred and fifty size resolved samples were analyzed by XRF for 12 elements and 10 ions. CMB
was used to identify iron and steel manufacturing, soils, lime, oil and refuse combustion and auto
exhaust. Fine particle mass (< 4 µm) was underpredicted by the CMB model by 14-23% and the
coarse fraction (4-15 µm) overestimated by 32-90%.

Source composition data for total particulate fraction was obtained from the Washington, D.C.
studies. The fitting element approach of Gordon was used: Si for soils, Fe for steel production, V for
fuel oil combustion, Zn as a tracer for refuse incineration, Pb for auto exhaust and Ca as an indicator
of concrete abrasion, cement and slag dust. CMB analysis was based on average aerosol composition
data rather than 24-hour concentrations.
Denver, Colorado
Chemical mass balance was applied in the Denver airshed during the winter 1978 studies of
visibility impairment. Size resolved samples taken on Teflon and quartz filters were analyzed by
NAA and XRF for elements and two carbon classes (organic and elemental) to resolve auto exhaust,
coal combustion, residual oil combustion and soils. Diesel emissions, natural gas combustion and
unleaded auto exhaust were identified by emission inventory scaling. Secondary ammonium sulfate
and nitrate were also measured. Most of the aerosol mass was apportioned by including oxygen, hydrogen and water
in the estimates. Underpredicted elements included Al, Cl, K, Cr, Mn, Ni, Cu, As and Se.
Source characterization for leaded auto exhaust was developed from source tests of 1970
model year cars. Coal combustion data came from literature values while residual oil combustion
data taken in Portland was adjusted to the elemental composition of the crude oil used in Colorado.
Soils data from the Portland Studies were used.
Denver aerosol was also analyzed by another group using dichotomous sampler data
developed by XRF methods. Ion chromatography was used to measure sulfate and nitrate ion
concentrations following elemental analysis. Total carbon analysis was conducted using collocated
samplers equipped with quartz filters. The study attempted to describe the Denver aerosol in terms of
six components: sulfate (and related cations), motor vehicle exhaust, shale, limestone, salt and
refuse incineration using source composition data obtained from the literature. Results provided good
agreement between observed and predicted values for total mass and nine elements used in the fit,
but K, Cr, Ni, Ca and carbon were underpredicted. About 64% of the fine mass was found to be
carbon, but could not be successfully resolved into the sources included in the analysis.

15.3.2 Control Strategy Development Applications
The following programs represent the logical extension of earlier research efforts: application
of the chemical mass balance technique to evaluating which particulate emission sources should be
controlled to insure that air quality standards are attained and maintained.
Portland Aerosol Characterization Program (PACS)
Portland, Oregon lies at the north end of the Willamette Valley in an area almost entirely
surrounded by mountains and hills. Temperature inversions frequently trap particulate emissions
within the valley, reducing visibility and minimizing the dispersive capacity of the airshed to the
point where violations of the primary and secondary National Ambient Air Quality Standards (NAAQS) occur.
In 1972, the State of Oregon Department of Environmental Quality's (DEQ) Implementation Plan was
adopted, setting standards for particulate emissions from industrial sources which were believed
necessary to insure attainment of the primary and secondary standards. Following completion of
the control strategy in 1974 and a 60,000 ton per year reduction in emissions, the improvement in
particulate air quality was found to be insufficient to attain the standards at key sampling locations.

Faced with continuing violations of the secondary standard, numerous new source applications
and growing pressures on airshed resources, the Oregon Environmental Quality Commission adopted
a strict "emission ceiling" limiting new source growth in the airshed in spite of the potential
consequence of slowing the region's economic growth. In recognition of the inadequacies in the
data base upon which the regulations and policy were founded, the Commission directed the DEQ to
design the data base improvement program needed to quantitatively assess the impact of sources in
the Portland airshed.
Following several months of study and consultation with aerosol study researchers, the staff
presented a $650,000, 3 year, integrated plan to upgrade the emission inventory, meteorological and
air monitoring data bases as well as an initial design for an aerosol source apportionment program.
Program funding was obtained from the State, EPA, Portland Chamber of Commerce and an
association of local industries. In-kind support from EPA OAQPS, North Carolina, was obtained,
technical and policy advisory committees were formed and field sampling began in the spring of
1977.
The primary objectives of PACS were to (a) identify the sources of at least 90% of the aerosol
contributing to NAAQS violations on worst-case days and on an annual average at nonattainment
monitoring sites, (b) determine the source contributions to the fine (< 2 µm) particle fraction, and (c)
identify sources and chemical species responsible for visibility reduction in Portland. Department
staff responsibilities included project management, source testing and ambient field monitoring tasks
while trace metal, ion, carbon analysis, data interpretation and CMB were completed by staff of the
Oregon Graduate Center for Study and Research, Beaverton, Oregon.
Twenty-seven elements and compounds were measured on over 2,000 size resolved samples,
and 1,700 CMB calculations were performed. Thirty-seven major source classes representing 95% of the emission
inventory were source-sampled to obtain local fine and coarse aerosol composition data. Ambient
aerosol mass was apportioned among soil dust, motor vehicle exhaust, aluminum production, hog
fuel boiler, calcium sources, steel and ferromanganese production, kraft manufacturing, marine
aerosol, vegetative burning, residual oil combustion and sulfate and nitrate ion accounting for 90% of
the TSP mass and 80% of the fine particulate mass. Improvements in the CMB technique included
incorporation of an effective variance least squares calculation method and an error propagation
scheme.
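Although the PACS production code is not reproduced here, the effective variance iteration can be sketched as below, assuming Python with numpy (not the tools used in PACS). Ambient concentrations C, a species-by-source profile matrix F, and one-sigma uncertainties for both are taken as inputs; each pass recomputes the per-species effective variance from the current source contribution estimates and re-solves the weighted least squares problem:

    import numpy as np

    def effective_variance_cmb(C, sig_C, F, sig_F, iters=20):
        S = np.zeros(F.shape[1])                 # initial source contributions
        for _ in range(iters):
            V = sig_C**2 + (sig_F**2) @ (S**2)   # effective variance per species
            W = 1.0 / V                          # diagonal weights
            A = F.T @ (W[:, None] * F)           # weighted normal equations
            S = np.linalg.solve(A, F.T @ (W * C))
        return S

    # Two-source, three-species toy problem; the solver recovers S = [2, 3].
    F = np.array([[0.5, 0.1],
                  [0.2, 0.4],
                  [0.1, 0.3]])
    C = F @ np.array([2.0, 3.0])
    print(np.round(effective_variance_cmb(C, 0.05 * C + 0.01, F, 0.1 * F), 2))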
Meteorological conditions were stratified into a set of nine "regime" conditions, each of which
consists of wind speed and direction vectors in each of the 5,000 2 km by 2 km grid cells into which the
study area was divided. Airshed topography, area and point source emissions and mixing height were
also developed as input to an advection dispersion simulation model to be used for strategy
development. Emission inventories for the PACS sampling period were developed, dispersion model
simulations were prepared for each meteorological regime by source class (motor vehicles, road dust,
etc.) and the results were compared to the regime stratified CMB analysis. Following evaluation of
the CMB-dispersion model results, emission inventory deficiencies were identified and corrected to
insure that the assigned source impacts predicted by the dispersion model were consistent with the
PACS CMB findings.
Results from the PACS program were presented before the Portland AQMA Citizen Advisory
Committee in the fall of 1979, control strategy alternatives were simulated with the upgraded model

and a staff analysis of the alternatives was developed. In December 1980, a SIP revision for total
suspended particulates was adopted by the State following considerable discussion by public,
industry, land use planning and local government representatives.
Although the PACS program and the DEQ's approach to dispersion model validation were
widely regarded as a major step in assuring community confidence in the effectiveness of the strategy,
the program did not provide the level of source impact specificity anticipated:
- Soil and road dust, which collectively account for 39% of the characterized TSP mass, could
not be chemically differentiated from one another. Since there are few unpaved streets in
Portland, most of these particles have been assumed to originate directly or indirectly (through
trackout) from paved roads.
- Secondary particulate matter and unaccounted for carbon (31.7% of the fine mass) could not
be traced to its origin by CMB. Secondary sulfate and nitrate, apparently generated by urban
sources, were apportioned in relation to the percent of local SO2 (or NO2) emitted by each
source class.
- Approximately 21% of the fine mass and 8% of the characterized TSP mass was unidentified
but assumed to be composed of water, NH4+ and analysis uncertainty.
- The specific sources of vegetative burning impact (20% of the fine and 15% of the TSP
mass) could not be chemically determined. Emission inventory and source activity data have,
however, been used to identify the source class most likely to have been responsible.
- Specific impacts from diesel emissions, distillate home heating, tire rubber, pollen, paint
fragments, paper and home refuse combustion, and a host of other potential (and likely)
sources were not chemically identified.
Figure 15.1 illustrates the annual average impact of aerosol sources in Portland as determined
from the PACS study. The limitations of the study results reflect the current state-of-the-art in CMB
analysis and serve to emphasize the importance of other, independent information in identifying
specific source impacts.

15.3.3 Willamette Valley Aerosol Studies
Oregon's Willamette Valley accounts for more than 90% of total U.S. production of pasture
and turf grasses. The industry contributes about $108 million to the State's economy each year. As
important as it is to the State, however, there is considerable controversy regarding the industry's
place in the valley's future because of the smoke generated by the practice of open burning of the
grass stubble. About 150,000 acres are burned annually during the July-September period generating
smoke plumes that have created a great deal of public concern about health effects, visibility loss and
the burning's impact on the Valley's nonattainment status.
Recent State legislation mandated the phasedown of field burning by 1975 contingent upon
the development of alternatives to the burning. When no economically acceptable alternatives were
found, the industry pressed for major increases in burning activity (and emissions) in conflict with
provisions included in the State's SIP. To resolve the question of the burning's impact on particulate
standard attainment and maintenance plans, the State Legislature required the Department to study
the impact of the burning on NAAQS, visibility reduction and the magnitude of population exposure
to the smoke.

Figure 15.1: Aerosol Source Impacts in Downtown Portland, Annual Average

TOTAL:
- Soil and Road Dust (39.0%)
- Vegetative Burn (14.6%)
- Auto Exhaust (9.7%)
- Sulfate (4.6%)
- Nitrate (4.5%)
- Volatilized Carbon (8.1%)
- Nonvolatilized Carbon (2.2%)
- Unidentified (8.0%) (NH4+, H2O, etc.)
- Primary Industrial (4.9%): Carbide Furnace, Ca (2.0%); Aluminum Production (1.35%); Steel Production (0.94%); Hog Fuel Boilers (0.22%); Ferromanganese Production (0.21%); Sulfite Process (0.18%)
- Marine (3.8%)
- Residual Oil (0.8%)

FINE:
- Soil and Road Dust (4.3%)
- Vegetative Burn (20.2%)
- Auto Exhaust (15.2%)
- Sulfate (8.2%)
- Nitrate (5.8%)
- Volatilized Carbon (13.7%)
- Nonvolatilized Carbon (4.0%)
- Unidentified (21.3%) (NH4+, H2O, etc.)
- Primary Industrial (3.0%): Steel Production (1.0%); Aluminum Production (0.72%); Hog Fuel Boilers (0.48%); Sulfite Process (0.39%); Carbide Furnace, Ca (0.6%)
- Marine (3.2%)
- Residual Oil (1.4%)

In response to the legislature's mandate, the Oregon Department of Environmental Quality
developed the project study design in cooperation with the EPA, the grass seed industry, Lane
Regional Air Pollution Authority and local governments concerned about the burning's impact on
their cities. The objectives of the study were to (a) identify the particulate impact of field burning and
as many other sources as possible in relation to NAAQS, (b) provide data on fine particle source
impacts to support health effect studies, and (c) determine the burning's impact on visibility
reduction. The study design focused on identification of source impacts using CMB methods
developed during PACS and dispersion modeling approaches to source reconciliation and strategy
development.
A two year, $611,000 coordinated program of air quality, meteorological, source
characterization and dispersion model improvement was developed which included both ground and
aircraft monitoring programs. The ground monitoring program consisted of 10 sites operated on a
daily basis during the summer of 1978, resulting in 2,000 size-resolved samples, which were
analyzed for 18 elements, 8 ions and carbon. Chemical mass balance was conducted on about 100
24-hour samples selected on the basis of burning activity and observed impact. Source composition
data developed during PACS was supplemented with local soil data, updated vegetative burning
composition information and three new industrial source classes. The aerosol mass was resolved into
soil dust, motor vehicle exhaust, kraft recovery furnace emissions, vegetative burning, residual oil
combustion, marine aerosol, sulfate, nitrate and residual carbon typically accounting for 90% of the
fine particle mass and overestimating the coarse mass by about 20%. The impact of field burning,
soil, marine air and motor vehicles were also estimated by enrichment factor analysis, "tracer mode"
CMB methods and statistical techniques based on burning activity.
The overall success of the Willamette Valley CMB analysis was not as good as that obtained
from the PACS work because of (a) the greater uncertainties in the PIXE (proton induced x-ray
emission) elemental data for several key elements as compared to the XRF/INAA methods used in
PACS, (b) the lack of good fitting species and the high variability of vegetative burning's
composition, and (c) the inability of the model to clearly apportion marine aerosol in the presence of
kraft process emissions. Experience gained from this work emphasizes the importance of the highest
possible data quality and good fitting species for major sources.
Results, released to all involved parties, the State legislature and the EPA, clearly
demonstrated the importance of soil and road dust as a major source contributing to standard
violations. Smoke from the burning of grass fields and timber slash was found to have short term,
acute impacts. Annual impacts, however, were estimated to be less than 2 µg/m3. The findings of the
study provided the first technical assessment of the impacts of grass field burning in the 20 year
history of the controversy and have channeled legislative and regulatory energy into more
constructive, less emotional areas. Dispersion model results have been evaluated in relation to CMB
results for soil and road dust and emission inventory improvements have been made. SIP revisions
based on the Willamette Valley study results and dispersion modeling have been adopted by the
State. A seven station monitoring program used to collect ambient data for input to the CMB model
and to track the impact of the burning in key communities has remained active as an important smoke
management information source.


Medford Aerosol Characterization Study (MACS)
The Medford Air Quality Maintenance Area was designated as a primary particulate
nonattainment area because of continued violations of the 24-hour and annual primary NAAQS.
Located in the Bear Creek Valley of Southwest Oregon, the AQMA suffers from frequent periods of
air stagnation, low wind speeds, and annual geometric mean TSP levels approaching 100 µg/m3.
Control of the predominant industrial point sources has a long history which has resulted in improved
air quality. Particulate air quality standards are not, however, likely to be attained solely as a result of the
control of traditional sources. As in Portland and the Willamette Valley, community and industry
plans to expand the economic growth of the area were being restricted by the capacity of the airshed to
assimilate the associated increase in emissions. This prompted legislative requests to identify the
relative source contributions through a comprehensive source apportionment program, develop the
data base needed to support appropriate control actions and prepare SIP revisions.
Building on experience gained through the PACS and Willamette Valley programs, the
Oregon Department of Environmental Quality developed a third study design costing $215,000 over
a two-year period. Again, emission inventory, meteorological and air monitoring programs were
improved to insure that the CMB results for Medford could be compared to those derived from
dispersion modeling. Project management, source testing, ambient sampling and some laboratory
analysis functions were assumed by Department staff, while trace metal (XRF, INAA) and carbon
analysis, CMB computation, data analysis, and reporting tasks were performed for the DEQ under
contract. During the field sampling period of April 1979 to April 1980, dichotomous samplers at
four sites collected aerosol samples for analysis of carbon (organic and elemental), SO4^2-, NO3^-,
Cl^-, and NH4^+, and 25 elements.
The fine and coarse mode source composition data used in the study was derived from
sampling local soils, home fireplaces and wood stoves, road dust, particle board dryers, hogged fuel
boilers, charcoal manufacturing and orchard heating emissions. Motor vehicle emissions for diesels,
leaded and unleaded gasoline were obtained from the literature and weighted to reflect the Medford
traffic mix.
The fine and coarse particulate mass was apportioned into the eight source classes noted
above, plus sulfate, nitrate, and residual carbon. Unlike the PACS and Willamette Valley study results, the
results of the MACS program showed that many of the acute TSP episodes were related to wood
burning emissions rather than soil and road dusts. Conclusions from the study have directed agency
resources toward studies to minimize emissions from residential wood heating, completion of
telephone surveys needed to improve wood burning emission inventories and improvements to
dispersion model results.
Missoula, Montana
Studies of winter aerosol composition and source impacts using CMB, optical microscopy and
x-ray diffraction were conducted in the Missoula nonattainment area. Sponsored by EPA Region
VIII, the State of Montana and the Missoula City/County Health Department, the CMB study
apportioned the fine particle fraction into residential wood combustion, soil, auto exhaust, hog fuel
boiler and other sources. Results are being used in control strategy development by State and local
regulatory agencies.


Kellogg, Idaho
Chemical mass balance and other receptor models have been applied to source resolution
within the Kellogg, Idaho lead nonattainment area to identify the source impacts of lead and zinc
smelters, auto exhaust and other sources.
Lewiston, Idaho
Studies of the impact of the pulp and paper manufacturing industry in the Lewiston particulate
nonattainment area have been used to assess the need for further control of process emissions.
Results from the industry sponsored study have been presented to the State of Idaho Department of
Health and Welfare during a contested case hearing on the need for new control systems.
Juneau, Alaska
Receptor modeling was used to apportion major air pollution sources. The primary focus of
this study was on the contribution of sources such as residential wood combustion, which can contribute
up to several hundred µg/m3 of fine particulates.
Whitehorse, Yukon Territory, Canada
The contributions of road dust, residential wood combustion, automotive exhaust and distillate
oil were investigated by receptor model source apportionment methods. Residential wood
combustion was the largest single source of winter particulates.
Seattle/Tacoma, Washington
The objective of this study was to quantitatively identify the major sources causing violation
of the 24 hour particulate standard. Both ambient and source samples were collected, the latter using
a size-segregating dilution sampler. Samples were analyzed by XRF, IC, NAA, carbon combustion
methods, etc. Source contributions were determined by chemical mass balance methods.
East Helena, Montana
The objective of this study was to determine the contribution of lead sources to TSP levels in
East Helena, Montana where there is a lead smelter. This included ambient sampling with a low-
volume TSP sampler and dichotomous samplers, in addition to source stack sampling using a dilution
sampler, and fugitive emissions sampling. X-ray fluorescence and neutron activation analysis
methods were used to develop ambient and source matrices for chemical mass balance calculations
and multivariate analysis.
Butte, Montana
The contributions of wood smoke, mining activities, diesel truck exhaust, road and soil dust
and a phosphorus plant to the ambient TSP concentrations were determined by chemical mass
balance source apportionment. Butte, Montana is a non-attainment area for TSP.
Iowa Aerosol Study
Chemical mass balance and x-ray diffraction analyses of TSP and IP particulate samples
collected in eight Iowa cities were completed to identify sources contributing to non-attainment
problems within the state. XRF, carbon, carbonate and ion analyses were conducted on 175 glass fiber
filters as input to the CMB model. Identified fugitive dust impacts were interpreted in relation to
microinventory, transport and wind speed to apportion crustal source impacts to specific source
classes.


South Bend, Indiana
This project was done in two parts. Dichotomous filter data was studied during the first part
and hi-vol filter data was studied during the second part. Chemical mass balance source
apportionment modeling was performed to identify relative impacts from road dust, transportation
sources and organic combustion sources.
Hamilton, Ontario, Canada
This study was done in two parts. The first phase was completed in February, 1981. The
second phase, using more detailed industrial source fingerprints, was completed in January, 1982. The
primary objective of this study was to determine the impact of road dust on ambient TSP levels. The
contribution of road dust was assigned using ambient data and by measuring the chemical
composition of resuspended road dust samples.
Street Cleaning Effectiveness Study
The Portland Street Cleaning Effectiveness study was funded as a demonstration control
strategy program to assess the effectiveness of street cleaning as a tool to reduce TSP levels within
Portland's non-attainment areas. Chemical mass balance (CMB) receptor model techniques were
used, for the first time, to assess the changes in road dust impacts by direct means. The CMB road
dust impact estimates were used in concert with dispersion modeling to provide a precise
determination of the effectiveness of the control programs being evaluated.
Residential Wood Energy Environmental Project
The Residential Wood Energy Environmental Project was commissioned to assess the impact
of residential wood burning on ambient and indoor air quality in several Northwest cities. Airshed
study results based on chemical mass balance analysis were used to validate local dispersion models.
Carbon-14 methods were also used, as were a number of other elemental, carbon and PNA
techniques. The results are being used by Region X and State/local agencies in assessing regulatory
options to be included in their SIP's.
Impact of Iron and Steel Manufacturing on TSP Levels
Chemical mass balance source apportionment analysis was conducted on selected samples to
assess the impact of the iron and steel industry on ambient particulate levels.

15.4 Dispersion Modeling
Chemical mass balance source strength estimates have been compared to dispersion model
estimates in only a few studies. In Washington, D.C., refuse incineration impacts derived by CMB
were compared to box model estimates with reasonable agreement (0.83 µg/m3 versus 0.58 µg/m3,
respectively). The State of Oregon Department of Environmental Quality, Air Quality Control
Division staff has devoted considerable resources to comparing, analyzing and improving dispersion
model performance based on CMB results obtained during the Portland, Medford and Willamette
Valley aerosol characterization studies. This work concluded that the application of CMB-derived
source strength (for selected source classes) to dispersion model simulations can provide
considerable insight into deficiencies in modeling assumptions, emission inventory and source
operating schedule errors. Initial comparisons were made between paved and unpaved road dust
impacts derived from CMB analysis and model predicted values at five sampling sites. The consistent underprediction

of the model was expected, considering the difficulty in accurately inventorying fugitive dust
sources. Following improvements to the inventory, utilizing a land use specific paved road dust
emission factor, dispersion model predictions of road dust impact improved dramatically. Overall
dispersion model annual prediction results also compared more favorably with observed TSP data.
In two other Oregon nonattainment areas, similar programs were undertaken to improve and
validate dispersion modeling results. In both cases, major improvements in model performance were
achieved through programs designed to upgrade emission inventory source categories found to be
deficient through CMB-dispersion model comparisons. Road dust emission inventory improvements
resulted in a 600% increase in the Portland road dust emissions inventory. A 2,000 ton per year
reduction in Eugene, Oregon's inventory occurred following collection of new traffic data on
unpaved roads.
It is important to note that only CMB results for carefully selected source classes should be
used for dispersion model validation. To date, source classes used in this manner have been limited
to motor vehicle exhaust, residual oil combustion, urban dust, residential wood burning, and refuse
incineration, although any source whose emission composition contains two or more unique species
ratios (fingerprint) could be used.
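A minimal sketch of this kind of comparison, with hypothetical site values, computes the per-site ratio of modeled to CMB-derived road dust impact; consistently low ratios point to an under-inventoried source class:

    # (CMB impact, dispersion model impact) in ug/m3 per site; hypothetical.
    sites = {"A": (14.0, 3.5), "B": (11.0, 2.4), "C": (9.0, 2.1)}
    for site, (cmb, model) in sites.items():
        print(site, round(model / cmb, 2))  # ~0.22-0.25: inventory deficiency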
Studies will be selected from the above noted applications and discussed in the course to
illustrate various aspects of receptor modeling methods. Other applications to be discussed include
potential use in apportioning sources of acid rain and their precursors, and applications in aquatic
environments and biological systems.


ADDITIONAL READING

Air Pollution Engineering Manual. Davis, Wayne T. Second Edition. New York: Wiley-Interscience
Publication, 2000.
Receptor Modeling in Environmental Chemistry. Hopke, Philip K. New York: Wiley-Interscience
Publication, 1985.
Atmospheric Aerosol: Source/Air Quality Relationships. Macias, Edward S. and Hopke, Philip K.
Washington, D.C.: American Chemical Society, 1981.
The New York Summer Aerosol Study, 1976. Kneip, Theo J. and Lippmann, Morton. New York:
New York Academy of Sciences, 1979.
Aerosols: Anthropogenic and Natural, Sources and Transport. Kneip, Theo J. and Lioy, Paul J. New
York: New York Academy of Sciences, 1980.
Chemical Element Balance Receptor Model Methodology for Assessing the Sources for Fine and
Total Suspended Particulate Matter in Portland, Oregon. Watson, John G. Portland, Oregon: Oregon
Graduate Center, 1979.
Receptor Modeling for Air Quality Management. Hopke, Philip K. New York: Elsevier, 1991.
Receptor Methods for Source Apportionment. Pace, Thompson G. Pittsburgh: Air Pollution Control
Association, 1986.
Receptor Models in Air Resources Management. Watson, John G. San Francisco: Air and Waste
Management Association, 1988.
Receptor Models Applied to Contemporary Pollution Problems. The Air Pollution Control
Association. Danvers, Massachusetts: Air Pollution Control Association, 1982.
