To cite this article: Ernesto A. Bustamante (2014) From basic to applied research: theory and
application of the a–b signal detection theory model, Theoretical Issues in Ergonomics Science,
15:4, 318-337, DOI: 10.1080/1464536X.2011.584581
Downloaded by [University of West Florida] at 02:44 01 January 2015
Theoretical Issues in Ergonomics Science, 2014
Vol. 15, No. 4, 318–337, http://dx.doi.org/10.1080/1464536X.2011.584581
From basic to applied research: theory and application of the a–b signal
detection theory model
Ernesto A. Bustamante*
The goal of this research was three-fold: (1) address the basic versus applied
research debate within the domain-specific signal detection theoretical frame-
work, which scientists and practitioners have used extensively along the full
spectrum of this continuum; (2) provide empirical support to validate the
adequacy of the model; and (3) show the potential generalisability of the model to
different domain-specific areas within applied settings. The results from the basic
empirical study suggest that the a–b signal detection theory (SDT) model provides
a more accurate theoretical framework for examining the underlying processes
involved in signal detection. The findings from the domain-specific studies
showed the potential applicability of the a–b SDT model for approaching applied
problems. From a theoretical and applied point of view, this research suggests
that the traditional SDT contention that the detection and response processes are
independent from each other does not hold true for either basic signal detection
tasks or domain-specific areas.
Keywords: basic versus applied research; signal detection; decision making;
human-automation interaction; cognitive engineering
1. Introduction
The debate between basic versus applied research falls within a continuum with seemingly
different, yet fundamentally similar goals. Basic research is necessary to establish
theoretical frameworks upon which to conduct applied research. Likewise, applied research
addresses practical problems, which can lead to the further development of basic
theoretical frameworks. The ultimate goal of research, whether it is considered basic or
applied, is to generate knowledge for the purpose of deriving theoretical implications as
well as practical applications.
As basic research approaches the applied end of this continuum through iterative
development and refinement, applied research is essential for understanding how basic
knowledge serves the needs and the goals of humans given a context in functional settings.
Furthermore, research can also begin on the applied side of this continuum with the
discovery of practical problems, which require reliance on basic research to serve as the
underlying theoretical frameworks to most adequately approach such problems.
The present manuscript addresses the basic versus applied research debate within the
*Email: ernestob@uidaho.edu
recognition is the process of identifying whether the psychological experience was caused
by noise only or the signal plus noise. Conversely, the decision process depends on the
amount or extent of the psychological experience required by the detector to make an
affirmative response. Swets (1973) distinguished between a process of covert discrimina-
tion and a process of overt response. He argued that these two processes have a complex
relationship, influenced by a number of factors, including, but not limited to, probability,
expectations and motivation. However, the main proposed contribution of the traditional
SDT model to the study of the performance of humans and automated systems is its
presumed capability to separate the discrimination process from the response process by
distinguishing between independent measures of sensitivity and threshold or criterion
setting.
proportion of hits and Φ⁻¹[p(FA)] the z score corresponding to the point below which the
area under the standard normal distribution equals the proportion of false alarms.
The threshold measure β is theoretically defined as the odds ratio between the ordinate
values of the signal-plus-noise and the noise distributions where an observer sets the
threshold for making affirmative responses (Figure 1), and it is operationally defined as:
$$\beta = e^{\,0.5\{\Phi^{-1}[p(FA)]\}^{2} - 0.5\{\Phi^{-1}[p(HI)]\}^{2}} \quad (2)$$
The criterion-setting measure c is theoretically defined as the point along the
continuum above which an observer makes affirmative responses (Figure 1), and it is
operationally defined as:
$$c = (-1)\{0.5\,\Phi^{-1}[p(HI)] + 0.5\,\Phi^{-1}[p(FA)]\} \quad (3)$$
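The operational definitions above can be sketched in a few lines of Python. This is an illustrative sketch rather than the author's implementation: the function name `traditional_sdt` is hypothetical, and d′ is computed with the standard form Φ⁻¹[p(HI)] − Φ⁻¹[p(FA)], which is assumed here to correspond to Formula 1.

```python
from math import exp
from statistics import NormalDist


def traditional_sdt(p_hit: float, p_fa: float) -> dict:
    """Return d', beta and c for hit and false-alarm rates strictly between 0 and 1."""
    z = NormalDist().inv_cdf  # inverse standard normal CDF (the Phi^-1 function)
    z_hit, z_fa = z(p_hit), z(p_fa)
    return {
        "d_prime": z_hit - z_fa,                          # assumed standard form of Formula 1
        "beta": exp(0.5 * z_fa ** 2 - 0.5 * z_hit ** 2),  # Formula 2
        "c": -(0.5 * z_hit + 0.5 * z_fa),                 # Formula 3
    }


# Hypothetical observer with symmetric hit and false-alarm rates.
measures = traditional_sdt(0.8, 0.2)
```

For symmetric rates such as p(HI) = 0.8 and p(FA) = 0.2, c is 0 and β is 1, which is the pattern expected of an unbiased observer under the traditional model.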
cognitive heuristics (Kahneman et al. 1982), fatigue (Krueger 1989), effort (Wogalter et al.
1989), expertise (Klein et al. 1993), perceived urgency (Haas and Casali 1995), perceived
risk (Ayres et al. 1998), emergency of the situation (Bliss and Gilson 1998), dynamic
quality characteristics (Liu and Hwang 2000), trust in decision-support tools (DSTs)
(Lee and See 2004, Madhavan and Wiegmann 2007a), task complexity (Bailey and Scerbo
2007) and preconceived expectations of automation (Madhavan and Wiegmann
2007b). With regard to automated systems decision making, designers typically use
highly complex algorithms that are also nonlinear, such as decision trees, Monte Carlo
simulations and neural networks (Yang and Kuchar 1997, Thomas et al. 2003, Canton
et al. 2005).
Second, an important criterion for assessing the adequacy of measures is the capability
to assign scores even when observers do not commit any errors, which is common in
applied settings (Craig 1979). However, research suggests that d′, β and c have
questionable properties when based on extreme responding (Craig 1979). The reason for
this is that the Φ⁻¹ function is undefined for values of 0 and 1. Therefore, if an observer
has a perfect hit rate of 1 or a false-alarm rate of 0, the original hit and false-alarm rates
need to be transformed. The problem is that transformations may lead to biased estimates
(Hautus 1995). This limitation is particularly important for designers and practitioners
across a wide range of real-world settings (Snodgrass and Corwin 1988), in which extreme
responding is common. Some examples of applied domains include, but are not limited to:
traffic–collision warning systems (Parasuraman et al. 1997), medical diagnosis (Li et al.
2004), monitoring complex cockpit displays (Bailey and Scerbo 2005) and luggage
screening (Drury et al. 2006).
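The extreme-responding problem can be illustrated with a short sketch. The counts below are hypothetical, and the correction shown (adding 0.5 to the count and 1 to the number of trials, often called the log-linear rule) is only one of several transformations discussed in the literature; it is used here as an assumption, not as the procedure of any particular study.

```python
from statistics import NormalDist, StatisticsError


def loglinear_rate(count: int, n_trials: int) -> float:
    """Log-linear correction: add 0.5 to the count and 1 to the number of trials."""
    return (count + 0.5) / (n_trials + 1)


z = NormalDist().inv_cdf  # inverse standard normal CDF

# A perfect hit rate (hypothetically 50 hits in 50 signal trials) has no z score.
try:
    z(50 / 50)
    raw_defined = True
except StatisticsError:
    raw_defined = False

corrected = loglinear_rate(50, 50)  # 50.5 / 51
z_corrected = z(corrected)          # finite, but no longer based on the observed rate
```

The corrected rate yields a finite z score, but only because the observed proportion has been replaced by a different one, which is the source of the potential bias noted above.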
Most importantly, the adequacy of the traditional SDT model is questionable to the
extent that it may not accurately capture the potential dependency between the
discrimination and response processes with its most commonly used measures
(Bustamante 2008a). This limitation may have critical consequences for the proper
design and implementation of technology to solve applied problems.
2.1. Advantages of the a–b SDT model over the traditional SDT model
The a–b SDT model has several advantages over the traditional SDT model. First, given
the lack of assumptions in comparison to traditional SDT, it is evident that the a–b SDT
model is more parsimonious. One of the main reasons for this is that the a–b SDT model
makes no reference to an underlying decision continuum. Swets (1961) argued that the
exact nature of the sensory excitation produced by either the noise or the stimulus could be
quantified in terms of a single continuous variable (i.e. the decision variable). However,
this argument does not apply well to domains where individuals and automated systems
make decisions based on multiple sources of information and different decision-making
algorithms. This lack of reliance on the assumption of an underlying decision continuum is
one of the strengths of the a–b SDT model because, as previously mentioned, in most
applied settings, humans and automated systems do not make decisions based on a single
underlying continuum.
Second, the a–b SDT model does not require transformations of original hit and false-
alarm rates for extreme responses. Because the a–b SDT model is not based on an underlying
continuum, there is no need to assume the existence of probability density functions
associated with the different noise and signal-plus-noise trials. The a–b SDT model simply
describes the decision outcome matrix using measures of accuracy and response bias that
characterise the underlying covert detection and overt response processes. In contrast,
traditional measures of performance, such as hit rate, false-alarm rate, overall percentage
of correct decisions and the ratio of hit rate to false-alarm rate, are all inadequate measures
of accuracy and response bias because they are confounded by factors that spuriously affect
them, such as the probability of the target stimulus (Stanislaw and Todorov 1999).
A third advantage of the a–b SDT model is that the alternative a and b measures may
be interpreted more intuitively. With regard to a, a score of 0 indicates the complete lack of
ability to make accurate decisions. A score of 0.5 indicates performance at chance level,
and a score of 1 indicates perfect decision-making accuracy. With regard to b, a score of 0
indicates a lack of affirmative responsiveness. A score of 0.5 indicates an unbiased level of
responsiveness, and a score of 1 indicates a complete response bias towards affirmative
responses. These metrics may be more appealing to human factors and ergonomics
researchers and practitioners for applying research findings to the design and evaluation of
human–automation interaction because of their intuitive interpretative nature.
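The anchor points described above can be checked with a small sketch. Formulas 4 and 5 are not reproduced in this part of the text, so the definitions used below, a = [p(HI) + 1 − p(FA)] / 2 and b = [p(HI) + p(FA)] / 2, are an assumption chosen because they reproduce the stated interpretation of the 0, 0.5 and 1 scores. The function name and the example rates are hypothetical.

```python
def a_b_measures(p_hit: float, p_fa: float) -> tuple:
    """Accuracy (a) and response bias (b) under ASSUMED forms of Formulas 4 and 5."""
    a = (p_hit + 1.0 - p_fa) / 2.0  # assumed: mean of hit rate and correct-rejection rate
    b = (p_hit + p_fa) / 2.0        # assumed: mean of hit rate and false-alarm rate
    return a, b


perfect = a_b_measures(1.0, 0.0)     # every signal detected, no false alarms
chance = a_b_measures(0.5, 0.5)      # responses unrelated to the signal
always_yes = a_b_measures(1.0, 1.0)  # affirmative response on every trial
never_yes = a_b_measures(0.0, 0.0)   # no affirmative responses at all
```

Under these assumed definitions both scores remain defined at hit and false-alarm rates of 0 and 1, so no transformation of the observed rates is needed.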
The most important advantage of the a–b SDT model is that, as the results from the
subsequent empirical studies will suggest, the a–b SDT model provides accuracy and
response bias metrics that more adequately measure the underlying discrimination and
response processes as distinct yet dependent processes. As previously mentioned, the main
contribution of SDT is that it supposedly allows researchers to examine the covert
detection process independently of the overt response process. A fundamental requirement
to satisfy this claim is to have a model from which to derive independent measures of
accuracy or sensitivity and response bias, threshold or criterion setting, respectively.
Theoretically, two random variables (e.g. X and Y) are statistically independent if and
only if the conditional probability of a value of X given a value of Y equals the marginal
probability of the value of X and vice versa (Papoulis and Pillai 2002), or
$$p(X = x \mid Y = y) = p(X = x) \quad (6)$$
and
$$p(Y = y \mid X = x) = p(Y = y) \quad (7)$$
Given the intuitive nature of the calculation and interpretation of a and b (see
Formulas 4 and 5), it is quite simple to demonstrate that these measures are not
independent. Based on Formulas 4 and 5, it is clear that in cases of an extreme accuracy
score (either 0 or 1), the probability of b being equal to 0.5 is 1 and 0 for any other value of
b. Likewise, given an extreme value of b (either 0 or 1), the probability of a being equal to 0.5
is 1 and 0 for any other value of a. It is of course an empirical question whether the two
processes these metrics intend to measure are in fact dependent or independent of each
other. However, due to the complex nature of the computations of the traditional SDT
measures (Formulas 1–3), assessing whether or not d′, β and c are independent measures is
not as simple.
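The independence condition in Formulas 6 and 7 can be checked numerically for a and b. The sketch below is illustrative only: it assumes a = [p(HI) + 1 − p(FA)] / 2 and b = [p(HI) + p(FA)] / 2 as the forms of Formulas 4 and 5, and it assumes a hypothetical grid of equally likely hit and false-alarm rates. It speaks to the dependence of the computed measures, not to the underlying processes, which remains an empirical question.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Hypothetical grid of equally likely hit and false-alarm rates (0, 1/4, ..., 1).
rates = [Fraction(k, 4) for k in range(5)]


def a_b(p_hit, p_fa):
    # ASSUMED forms of Formulas 4 and 5 (not reproduced in this section).
    return (p_hit + 1 - p_fa) / 2, (p_hit + p_fa) / 2


pairs = [a_b(h, f) for h, f in product(rates, rates)]


def distribution(values):
    """Relative frequency of each value, as exact fractions."""
    counts = Counter(values)
    return {v: Fraction(n, len(values)) for v, n in counts.items()}


marginal_b = distribution([b for _, b in pairs])
b_given_a1 = distribution([b for a, b in pairs if a == 1])

# Formula 7 requires p(b | a) to equal p(b); here the two distributions differ.
independent = marginal_b == b_given_a1
```

Conditioning on perfect accuracy collapses the distribution of b onto 0.5, whereas the marginal distribution of b spreads over the whole range, so the condition for independence fails on this grid.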
Therefore, the purpose of the subsequent experiments was to provide empirical
validation of the a–b SDT model and show support for its use in applied settings. The
main goal of Experiment 1 was to compare both models to examine the extent of
independent measures in a basic research setting. The purpose of Experiments 2 and 3 was
to show the potential generalisability of the a–b SDT model to different domain-specific
areas.
3.1. Experiment 1
Consistent with the second goal of this research, Experiment 1 aimed to provide an
empirical validation of the a–b SDT model within the context of basic research. The goal
of this study was to gather empirical data from a traditional SDT study to assess the
adequacy of the a–b SDT model. The basic premise of SDT is that in a traditional SDT
task, two distinct and independent processes take place: a covert discrimination process
and an overt response process (Swets 1973). Furthermore, according to the traditional
SDT model, these two processes are independent of each other and are affected by
different factors (Green and Swets 1966). However, according to the underlying premises
of the a–b SDT model, these two processes are not independent. Thus, to test the adequacy
of the a–b SDT model, two factors that should affect each of these processes independently
were manipulated within a traditional SDT task.
One of these factors was the probability of occurrence of the target stimulus. Prior
research suggests that changes in the probability of occurrence of the target stimulus affect
people’s response bias (Bliss et al. 1995). This effect is commonly known as probability
matching. Based on this principle, it follows that as the probability of the target stimulus
increases, people’s tendency to make affirmative responses also increases. Within the
context of the a–b SDT model, increasing the probability of the target stimulus should
increase people’s response bias. Similarly, given the theoretical and operational definitions
of traditional SDT measures, this implies that increasing the probability of the target
stimulus should decrease people’s thresholds or criterion settings.
The second factor was based on a derivation of Weber’s Law, which is used to predict
people’s abilities to detect just-noticeable differences between two stimuli. According to
Weber’s Law, the ratio of the difference between a target stimulus and a baseline stimulus
to the intensity of the baseline stimulus equals a constant K, or
$$K = \frac{\Delta I}{I} \quad (8)$$
where K is the Weber fraction, ΔI is the difference between the baseline and target stimuli,
and I is the intensity of the baseline stimulus.
Based on Weber’s Law, it follows that as the difference between the baseline and target
stimuli increases, people’s abilities to discriminate between the two stimuli should also
increase. Within the context of SDT, this implies that increasing the difference between the
baseline and target stimuli should increase people’s sensitivity or accuracy.
3.1.1. Hypothesis
Consistent with the fundamental premise of dependent processes underlying the a–b SDT
model, as well as the literature on probability matching and Weber’s law of just noticeable
differences, it was hypothesised that increasing the frequency difference between the
baseline and target stimuli would increase participants’ accuracy to the point where the
probability of occurrence of the target stimulus would cease to affect participants’
response bias.
3.1.2. Method
3.1.2.1. Experimental design. A 3 × 3 repeated-measures design was used for this study.
The probability of occurrence of the target stimulus was manipulated at three levels (0.10,
0.50 and 0.90). The frequency difference between the baseline and target stimuli was also
manipulated at three levels (5, 10 and 15 Hz).
3.1.2.3. Materials and apparatus. This study took place in a sound-attenuated room with
an average ambient noise level of 45 dB(A). Participants performed a traditional yes–no
SDT task, which consisted of discrimination between baseline and target auditory stimuli
that varied in their fundamental frequency. All stimuli were generated using the NCH
Tone Generator software and lasted 100 ms. The baseline stimulus consisted of a simple
sine wave of 500 Hz. Depending on the experimental condition, the target stimulus
consisted of a simple sine wave of 505, 510 or 515 Hz. Stimuli were presented to
participants through a set of sound-attenuated stereo headphones at 55 dB(A) using a
fixed inter-stimulus interval of 2.5 s. A Microsoft Visual Basic program was developed
and loaded on a Dell Inspiron 600m laptop computer to run the study and collect
participants’ data.
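As a worked illustration of Formula 8 applied to this design, the sketch below enumerates the nine cells of the 3 × 3 design and computes the fraction ΔI / I for each target. Treating the 500 Hz baseline frequency as I and the frequency difference as ΔI is an interpretation made here for illustration; the constant and function names are hypothetical.

```python
from itertools import product

BASELINE_HZ = 500                  # baseline stimulus, treated as I (assumption)
FREQ_DIFFS_HZ = (5, 10, 15)        # frequency differences, treated as delta I
TARGET_PROBS = (0.10, 0.50, 0.90)  # probability of occurrence of the target stimulus


def weber_fraction(delta_i: float, i: float) -> float:
    """K = delta I / I (Formula 8)."""
    return delta_i / i


# One entry per cell of the 3 x 3 repeated-measures design.
conditions = [
    {
        "p_target": p,
        "target_hz": BASELINE_HZ + fd,
        "K": weber_fraction(fd, BASELINE_HZ),
    }
    for p, fd in product(TARGET_PROBS, FREQ_DIFFS_HZ)
]
```

Under this reading, the three target frequencies correspond to fractions of 0.01, 0.02 and 0.03 relative to the baseline, so discriminability is expected to rise across the three frequency-difference levels.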
3.1.2.4. Procedure. As part of this experiment, participants performed nine 1-min practice
sessions and nine 5-min experimental sessions, which varied according to the probability of
occurrence of the target stimulus (i.e. 0.10, 0.50, 0.90) and the frequency difference
between the baseline and target stimuli (i.e. 5, 10, 15 Hz). Each practice session preceded its
corresponding experimental session. Practice sessions consisted of 20 trials, whereas
experimental sessions consisted of 100 trials. All sessions were fully counterbalanced
according to the ascending or descending nature of each factor (i.e. the probability of
occurrence of the target stimulus and the frequency difference between the baseline and
target stimuli). Furthermore, to avoid a potential vigilance decrement, the experimenter
3.1.3. Results
Three 3 × 3 repeated-measures ANOVAs were conducted to assess the effects of the
probability of occurrence of the target stimulus (p) and the frequency difference (FD)
between the baseline and target stimuli on the b, β and c measures. Due to the use of
multiple univariate statistical tests, statistical significance for all inferential analyses was
set a priori at p < 0.01.
3.1.3.1. Response bias (b). Results showed a statistically significant interaction effect
between p and FD on b. Given that Mauchly’s test indicated that the sphericity
assumption was violated, the Greenhouse–Geisser correction was used to adjust the
degrees of freedom, F(2.01, 38.20) = 20.38, p < 0.01, partial η² = 0.52. As shown in
Figure 2, the effect of the probability of occurrence of the target stimulus on participants’
response bias was significantly diminished by increases in the frequency difference between
the baseline and target stimuli.
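The reported effect size can be cross-checked from the F ratio and its degrees of freedom, since partial η² equals the effect sum of squares divided by the sum of the effect and error sums of squares. The sketch below is only a consistency check on the reported values; the function name is hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: float, df_error: float) -> float:
    """partial eta^2 = (F * df_effect) / (F * df_effect + df_error)."""
    scaled_effect = f_value * df_effect  # proportional to SS_effect / MS_error
    return scaled_effect / (scaled_effect + df_error)


# Interaction of p and FD on b, using the values reported above.
eta_b = partial_eta_squared(20.38, 2.01, 38.20)
```

Rounded to two decimals, the result agrees with the reported partial η² for the interaction on b.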
3.1.3.2. Threshold (β). To be able to estimate β, the observed hit and false-alarm rates
were first transformed using the log-linear conversion, as recommended by Snodgrass and
Corwin (1988). Furthermore, β was transformed using the natural log function as
recommended by Stanislaw and Todorov (1999). Contrary to the results on b, there was no
statistically significant interaction effect between p and FD on β, F(4, 76) = 2.71, n.s.,
partial η² = 0.13. As shown in Figure 3(a), the effect of p on β did not significantly vary as
a function of FD.
Figure 2. Participants’ response bias as a function of the interaction between the probability of
occurrence of the target stimulus and the frequency difference between the baseline and target
stimuli.
Figure 3. (a) Participants’ threshold and (b) criterion setting as a function of the interaction between
the probability of occurrence of the target stimulus and the frequency difference between the baseline
and target stimuli.
3.1.3.3. Criterion setting (c). Because of the same limitations for computing β, the
transformed hit and false-alarm rates were used to calculate c. Results did show a
statistically significant interaction effect between p and FD on c. Given that Mauchly’s test
indicated that the sphericity assumption was violated, the Greenhouse–Geisser correction
was used to adjust the degrees of freedom, F(2.24, 42.56) = 9.73, p < 0.01, partial η² = 0.34.
Similarly, Figures 2 and 3(b) show that the effect of the probability of occurrence of the
target stimulus on participants’ criterion setting was significantly diminished by increases
in the frequency difference between the baseline and target stimuli. It is important to note
that the direction of the effect is seemingly opposite to that on b only because of the
operational definitions of the measures, not because of the nature of the interaction.
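The fractional degrees of freedom reported for b and c follow from the Greenhouse–Geisser procedure, which multiplies both degrees of freedom by a sphericity estimate ε. The sketch below is a plausibility check under two assumptions: that all three ANOVAs share the uncorrected degrees of freedom reported for β, namely (4, 76), and that ε can be inferred from the corrected effect degrees of freedom, since ε itself is not reported. The function name is hypothetical.

```python
def greenhouse_geisser_df(df_effect: float, df_error: float, epsilon: float) -> tuple:
    """Scale both degrees of freedom by the sphericity estimate epsilon."""
    return df_effect * epsilon, df_error * epsilon


UNCORRECTED = (4, 76)  # interaction df reported for the uncorrected test on beta

# Epsilon is inferred from the corrected effect df; it is not reported in the text.
eps_c = 2.24 / UNCORRECTED[0]  # criterion setting c
eps_b = 2.01 / UNCORRECTED[0]  # response bias b

df_c = greenhouse_geisser_df(*UNCORRECTED, eps_c)
df_b = greenhouse_geisser_df(*UNCORRECTED, eps_b)
```

The inferred values of ε reproduce the reported error degrees of freedom to within rounding, which suggests that the reported corrections are internally consistent.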
3.1.4. Discussion
As expected, results from this study showed partial support for the underlying premise of a
dependency between the discrimination and response processes, which is consistent with the
fundamental basis of the a–b SDT model. Furthermore, the discrepancy in the findings
regarding β and c brings into question the adequacy of the traditional SDT model. The fact
that there was an interaction effect between the probability of occurrence of the target
stimulus and the frequency difference between the baseline and target stimuli on both
response bias and criterion setting suggests that the covert discrimination process and the
overt response process are not independent of each other. As Figures 2 and 3(b) show, as the
frequency difference between the baseline and target stimuli increased, and consequently,
participants’ accuracy and sensitivity increased, the effect of the probability of occurrence of
the target stimulus on participants’ response bias or criterion setting significantly decreased.
3.2.1. Hypothesis
Consistent with the fundamental premise of dependent processes underlying the a–b SDT
model, it was hypothesised that using a MP system would increase participants’ response
bias to the point where the advantage of using LAT would cease to improve their decision-
making accuracy. More importantly though, it was hypothesised that this interaction
effect would be better captured by the a–b SDT model measure of accuracy (a) than the
traditional SDT measure of sensitivity (d′).
3.2.2. Method
3.2.2.1. Experimental design. A 2 × 2 between-groups design was used for each experiment.
Type of automation (FP, MP) and technology (BAT, LAT) were manipulated
between groups. The dependent measures of interest were decision-making accuracy (a)
and sensitivity (d′).
3.2.2.3. Materials and apparatus. Two computer workstations equipped with two 47.5-cm
3.2.2.3.1. Primary flight tasks. The primary flight tasks were simulated using the Multi-
Attribute Task Battery (MATB) designed by Comstock and Arnegard (1992) and
consisted of a compensatory-tracking task and a resource-management task (Figure 4).
3.2.2.3.2. Compensatory-tracking task. The main purpose of this task was to simulate the
key function that pilots and UAV operators need to perform to fly an airplane or a UAV,
which is to maintain level flight.
3.2.2.3.3. Resource-management task. The main purpose of this task was to simulate
another important function that pilots and UAV operators need to perform as they fly an
airplane or a UAV, which is to make sure that they have an adequate level of fuel.
3.2.2.3.4. Secondary tasks. In addition to performing the primary flight tasks, partici-
pants performed a secondary task. In Experiment 2, participants performed an engine-
monitoring task (Stanton et al. 2009). The main purpose of this task was to simulate a
crucial secondary function that pilots need to perform to maintain flight safety, which is to
ensure that they have at least one fully functioning engine at all times. Participants
performed this task with the aid of a simulated Engine Indication and Crew Alerting
System (EICAS) display (Figure 5).
The simulated EICAS display equipped with BAT provided one of two types of
advisories (OK, ALARM) regarding the potential status of the engines. The simulated
EICAS display equipped with LAT provided one of three types of advisories (OK,
WARNING, ALARM). After receiving the EICAS advisory, participants could ignore it
and continue performing their primary flight tasks, or they could acknowledge it and
search for system-status information. If participants decided to acknowledge an alarm,
they had to evaluate the temperature and pressure levels of two engines and determine
whether they needed to make a corrective response to repair the engines if both of them
were malfunctioning (Figure 6).
In Experiment 3, participants performed a secondary weapon-deployment task (Clark
et al. 2009). The main purpose of this task was to simulate one of the functions that UAV
operators may need to perform in a combat situation, which is to deploy a weapon on a
military target. To accomplish this task, participants interacting with the BAT DST received
one of two types of advisories (OK, ALARM), and participants interacting with the LAT
DST received one of three types of advisories (OK, WARNING, ALARM) regarding
the potential presence of an enemy target. After receiving the advisory, participants could
ignore it and continue performing their primary flight tasks, or they could acknowledge it
and view an aerial image to find the enemy target. If participants considered that an enemy
target was present, they had to deploy a weapon to destroy it (Figure 7).
The type of automation (FP versus MP) was manipulated in a similar manner as Dixon
et al. (2007). The FP system had a 100% hit rate and an 80% false-alarm rate. The MP
system, on the other hand, had a 20% hit rate and a 0% false-alarm rate. As previously
mentioned, the BAT system provided participants with one of two types of advisories. One
of such advisories was composed of a green rectangle with the signal word ‘OK’ embedded
in it, accompanied by a 500-Hz simple sine wave sound. The other advisory was composed
of a red rectangle with the signal word ‘ALARM’ embedded in it, accompanied by a
3000-Hz simple sine wave sound. The LAT system provided participants with three types
of advisories depending on the probability of engine malfunctions in Experiment 2 and the
presence of the enemy target in Experiment 3. Two of the three advisories had the same
physical characteristics as the stimuli used for the BAT system. The LAT system, however,
3.3.1. Results
Two 2 × 2 between-groups ANOVAs were conducted in each experiment to assess the
effects of the type of automation (FP, MP) and the type of alarm technology (BAT, LAT)
Figure 8. Participants’ accuracy as a function of the interaction between the type of automation and
alarm technology for (a) Experiment 2 and (b) Experiment 3.
Figure 9. Participants’ sensitivity as a function of the interaction between the type of automation
and alarm technology for (a) Experiment 2 and (b) Experiment 3.
3.3.1.2. Sensitivity (d′). Because of the same limitations for computing β and c, the
transformed hit and false-alarm rates were used to calculate d′. Contrary to the findings in
Experiment 2 regarding a, there was no statistically significant interaction effect between
the type of automation and alarm technology on d′, F(1, 96) = 0.00, n.s., partial η² = 0.00
(Figure 9a). However, results from Experiment 3 did show a statistically significant
interaction effect between the type of automation and alarm technology on d′,
F(1, 96) = 18.59, p < 0.01, partial η² = 0.16. As shown in Figure 9(b), LAT significantly
increased d′ for the FP system only.
3.3.2. Discussion
As expected, results from Experiments 2 and 3 provided further support for the underlying
premise of the a–b SDT model of a dependency between the discrimination and response
processes. Furthermore, due to the discrepancy in the findings from Experiments 2 and 3
regarding d 0 , these results provided further support for the superiority of the a–b SDT
model over the traditional SDT model. The a–b SDT model consistently captured the
positive interactive effect of FP automation equipped with LAT on accuracy. These
findings could provide a useful insight to designers of DSTs for the adequate
implementation of LAT in complex systems to enhance human signal detection and
decision-making accuracy.
4. General discussion
4.1. Theoretical implications and practical applications
A fundamental theoretical contention of traditional SDT is that covert detection and overt
response are independent processes that can be quantified by independent measures of
sensitivity and threshold or criterion setting, respectively. According to Green and Swets
(1966), non-sensory factors, such as the probability of the target stimulus, should have no
effect on sensitivity. Likewise, sensory factors, such as the frequency difference between
the baseline and target stimuli, should have no effect on criterion setting. However, the
results from the basic empirical findings bring into question the viability of their argument.
The results from Experiment 1 suggested that sensory factors can alter the size of the
effect of non-sensory factors on criterion setting or response bias. These are crucial
basic research findings because they suggest that factors that affect the covert detection
process can also affect the overt response process, thereby establishing a dependency
between these two processes.
Furthermore, results from Experiments 2 and 3 not only provided additional support
for the findings of Experiment 1, but also suggested that in applied settings, where
operators interact with automated DSTs, such as alarm systems, the a–b SDT model more
adequately captures the importance of appropriately combining the type of automation with
the type of alarm technology to enhance performance. Consequently, the results from this
research suggest that, contrary to Green and Swets’s (1966) argument, the detection and
response processes are dependent, and the strength of their dependency increases as
decision-making accuracy and response bias increase. Furthermore, the findings from this
research suggest that the a–b SDT model is a more adequate framework for detecting the
dependency of the covert detection and the overt response processes.
An applied example of a domain in which the a–b SDT model may be a more adequate
framework to analyse human performance is pilots’ decision-making during potential
weather threats. Research shows that in general, commercial aviation pilots have a
tendency to deviate from their predetermined flight paths due to potential weather threats
(Bliss et al. 2005). Two of the main reasons for this tendency to deviate from potential
weather threats are passengers’ safety and comfort. The problem is that making
unnecessary flight path deviations can have negative effects, such as increased fuel
consumption and flight delays.
Within the context of SDT, passengers’ safety and comfort constitute non-sensory
factors that increase pilots’ biases towards deviating from their predetermined flight paths.
Researchers and designers can use the a–b SDT model to examine how sensory factors,
such as the characteristics of weather displays, can mitigate the effect of non-sensory
factors. The purpose of this approach would be to increase pilots’ decision-making
accuracy to the point where they would not be biased by such non-sensory factors and
would avoid making unnecessary flight path deviations.
nature of signal detection and decision making proposed within SDT. In laboratory
settings, signals are often defined by experimenters a priori. Imposing a binary
decision-making matrix is acceptable if the state of the world truly exists
in a present–absent dynamic. However, Zadeh (1965) has proposed that binary logic may
not be the most adequate method of describing human response tendencies in the real
world. Zadeh (1965) has instead proposed the use of fuzzy logic, which allows
states of the world to exist at and between extremes, rather than simply at
extremes. Dichotomising the world into categories of non-overlapping extremes may result
in the loss of useful information if events do not truly exist at these extreme levels, which
may result in a less sensitive analysis (Kozko 1993). If events can exist between extremes of
unequivocal presence and unequivocal absence, it is reasonable to attempt to
capture this grey area information.
The a–b SDT model, as it currently stands, does not capture non-binary response
information; doing so would require some modifications to the theoretical foundation and
calculation of its measures. Real-world signals are likely to depend on
domain-sensitive contextual and temporal qualities that should also be considered when analysing
response tendencies (Parasuraman et al. 2000). It is also worth noting that imposing a binary
decision state onto a signal detection paradigm results in a representation of two ends of a
single continuum and that uncertainty should be lower at these extremes and higher towards
the centre or midpoint (Hancock et al. 2000). Future research should focus on integrating
the basic tenets of the a–b SDT model with Fuzzy SDT to enhance the applicability of
these two theoretical frameworks to a wide range of domain-specific settings.
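To make the contrast between binary and graded outcomes concrete, the following is a minimal sketch of the outcome mapping commonly attributed to Fuzzy SDT (Parasuraman et al. 2000), in which each event has a signal membership s and a response membership r between 0 and 1. The function name `fuzzy_sdt_rates` and the example values are hypothetical illustrations and are not taken from the experiments reported here. The sketch computes only fuzzy hit, miss, false alarm and correct rejection rates; it does not compute the a–b SDT measures.

```python
from typing import Dict, Sequence


def fuzzy_sdt_rates(signal: Sequence[float], response: Sequence[float]) -> Dict[str, float]:
    """Fuzzy outcome rates from graded signal (s) and response (r) memberships.

    Assumed mapping (as commonly stated for Fuzzy SDT):
      hit = min(s, r), miss = max(s - r, 0),
      false alarm = max(r - s, 0), correct rejection = min(1 - s, 1 - r).
    """
    if len(signal) != len(response):
        raise ValueError("signal and response must have the same length")

    hit = miss = false_alarm = correct_rejection = 0.0
    for s, r in zip(signal, response):
        if not (0.0 <= s <= 1.0 and 0.0 <= r <= 1.0):
            raise ValueError("memberships must lie in [0, 1]")
        hit += min(s, r)
        miss += max(s - r, 0.0)
        false_alarm += max(r - s, 0.0)
        correct_rejection += min(1.0 - s, 1.0 - r)

    # Rates are normalised by total signal and total non-signal membership.
    total_signal = float(sum(signal))
    total_noise = len(signal) - total_signal
    if total_signal == 0 or total_noise == 0:
        raise ValueError("need non-zero signal and non-signal membership")

    return {
        "hit_rate": hit / total_signal,
        "miss_rate": miss / total_signal,
        "false_alarm_rate": false_alarm / total_noise,
        "correct_rejection_rate": correct_rejection / total_noise,
    }


# Hypothetical graded events: a fairly clear signal and a weak one,
# each met with an intermediate response.
graded = fuzzy_sdt_rates([0.75, 0.25], [0.5, 0.5])
print(graded)
```

When every membership is exactly 0 or 1, this mapping reduces to the conventional binary outcome matrix, so the graded version can be read as a generalisation rather than a replacement.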
5. Conclusion
The combined results from this research provide important theoretical and practical
contributions to researchers and practitioners. Given its lack of reliance on the assumption
of a single underlying continuum, the a–b SDT model is more generalisable and applicable
than traditional SDT. This greater generalisability and applicability is particularly
important for researchers and practitioners who are interested in examining the
performance of humans and automated systems in complex domains, in which neither
humans nor automated systems make decisions based on a single underlying continuum.
From a theoretical point of view, the findings from this research suggest that the
traditional SDT contention that the detection and response processes are independent
from each other does not hold true for either basic signal detection tasks or complex
human–automation interaction. From a practical point of view, the results of this research
are particularly crucial for researchers and practitioners who are interested in examining
not only the detection and decision-making accuracy of humans and automated systems,
but also their response biases. More specifically, researchers interested in studying human–
automation interaction factors, such as compliance, reliance and trust, may benefit from
using the a–b SDT model to assess how sensory and perceptual factors affect humans’
response biases while interacting with automated systems in a variety of different domains,
including, but not limited to, air-traffic control, uninhabited vehicle operations, nuclear
power control, medical diagnosis, advanced search engines, navigational systems and
luggage screening.
Finally, the findings from this research provided evidence supporting the superiority of
the a–b SDT model over traditional SDT. Aside from being more parsimonious and
generalisable, the a–b SDT model provides measures of accuracy and response bias that more
adequately capture the dependency between the covert detection and overt response
processes. As such, researchers and practitioners could use the integration of the a–b SDT
model with Fuzzy SDT to more adequately examine how sensory system characteristics
that can improve operators’ accuracy interact with non-sensory factors to affect operators’
decision-making while interacting with automated systems in applied settings.
Acknowledgement
The author thanks Dr Stephen Rice for providing the aerial images used in Experiment 3.
References
Ayres, T.J., et al., 1998. Risk perception and behavioural choice. International Journal of Cognitive
Ergonomics, 2, 35–52.
Bailey, N.R. and Scerbo, M.W., 2005. Concurrent monitoring for multiple critical signals in a
complex display: a vigilance perspective. In: Proceedings of the 49th annual meeting of the
Human Factors and Ergonomics Society, 26–30 September 2005. Santa Monica, CA: Human
Factors and Ergonomics Society, 1518–1522.
Bailey, N.R. and Scerbo, M.W., 2007. Automation-induced complacency for monitoring highly
reliable systems: the role of task complexity, system experience, and operator trust. Theoretical
Issues in Ergonomics Science, 8, 321–348.
Bisseret, A., 1981. Application of signal detection theory to decision making in supervisory control:
the effect of the operator’s experience. Ergonomics, 24, 81–94.
Bliss, J.P., et al., 2005. Reactions of air transport flight crews to displays of weather during simulated
flight. (Final Report, ODU Contract 130971). Langley, VA: NASA Langley Research Center.
Bliss, J.P. and Gilson, R.D., 1998. Emergency signal failure: implications and recommendations.
Ergonomics, 41, 57–72.
Bliss, J.P., Gilson, R.D., and Deaton, J.E., 1995. Human probability matching behaviour in
response to alarms of varying reliability. Ergonomics, 38, 2300–2312.
Broadbent, D., 1978. The current state of noise research: reply to Poulton. Psychological Bulletin, 85,
1052–1067.
Bustamante, E.A., 2005. A signal detection analysis of the effects of workload, task-critical and
likelihood information on human alarm response. In: Proceedings of the 49th annual meeting of
the Human Factors and Ergonomics Society, 26–30 September 2005. Santa Monica, CA:
Human Factors and Ergonomics Society, 1513–1517.
Bustamante, E.A., 2007. Using likelihood alert technology in cockpit displays of traffic information
to support free flight. In: Proceedings of the 14th international symposium on aviation
psychology, 23–26 April, Dayton, OH, 100–102.
Bustamante, E.A., 2008a. The a-b signal detection theory model. Dissertation Abstracts
International: Section B: The Sciences and Engineering, 68, 6371.
Bustamante, E.A., 2008b. Implementing likelihood alarm technology in integrated aviation displays
for enhancing decision making: a two-stage signal detection modelling approach. International
Journal of Applied Aviation Studies, 8, 241–262.
Canton, R., et al., 2005. Development and integration of human-centred conflict detection and
resolution tools for airborne autonomous operations. In: Proceedings of the 13th International
Symposium on Aviation Psychology, 18–21 April 2005. Oklahoma City, OK: Wright State
University, 92–97.
Clark, R.M., Peyton, G.G., and Bustamante, E.A., 2009. Differential effects of likelihood alarm
technology and false-alarm vs. miss prone automation on decision making. In: Proceedings of
the Human Factors and Ergonomics Society 53rd annual meeting, 19–23 October 2009. Santa
Monica, CA: Human Factors and Ergonomics Society, 349–353.
Cohen, J., 1988. Statistical power analysis for the behavioural sciences. 2nd ed. Hillsdale, NJ:
Erlbaum.
Comstock, J.R. and Arnegard, R.J., 1992. The multi-attribute task battery for human operator
workload and strategic behaviour research. Technical Memorandum 104174, Hampton, VA:
National Aeronautics and Space Administration, Langley Research Centre.
Craig, A., 1979. Nonparametric measures of sensory efficiency for sustained monitoring tasks.
Human Factors, 21, 69–78.
Dixon, S.R., Wickens, C.D., and McCarley, J.S., 2007. On the independence of compliance and
reliance: are automation false alarms worse than misses? Human Factors, 49, 564–572.
Donaldson, W., 1992. Measuring recognition memory. Journal of Experimental Psychology: General,
121, 275–277.
Drury, C.G., Ghylin, K.M., and Holness, K., 2006. Error analysis and threat magnitude for carry-on
bag inspection. In: Proceedings of the Human Factors and Ergonomics Society 50th annual
meeting, 16–20 October 2006. Santa Monica, CA: Human Factors and Ergonomics Society,
1189–1193.
Green, D.M. and Swets, J.A., 1966. Signal detection theory and psychophysics. New York: Wiley.
Grier, J.B., 1971. Nonparametric indexes for sensitivity and bias: computing formulas. Psychological
Bulletin, 75, 424–429.
Haas, E.C. and Casali, J.G., 1995. Perceived urgency of and response time to multi-tone and
frequency-modulated warning signals in broadband noise. Ergonomics, 38, 2313–2326.
Hancock, P.A., Masalonis, A.J., and Parasuraman, R., 2000. On the theory of fuzzy signal detection:
theoretical and practical considerations. Theoretical Issues in Ergonomics Science, 1, 207–230.
Hautus, M., 1995. Corrections for extreme proportions and their biasing effects on estimated values
of d′. Behaviour Research Methods, Instruments, and Computers, 27, 46–51.
Hodos, W., 1970. Nonparametric index of response bias for use in detection and recognition
experiments. Psychological Bulletin, 74, 351–354.
Kahneman, D., Slovic, P., and Tversky, A., 1982. Judgment under uncertainty: heuristics and biases.
New York: Cambridge University Press.
Klein, G.A., et al., 1993. Decision making in action: models and methods. Norwood, NJ: Ablex
Publishing.
Kozko, B., 1993. Fuzzy thinking: theory and applications. New York, NY: Prentice Hall.
Krueger, G.P., 1989. Sustained work, fatigue, sleep loss and performance: a review of the issues.
Work and Stress, 3, 129–141.
Lee, J.D. and See, K.A., 2004. Trust in automation: designing for appropriate reliance. Human
Factors, 46, 50–80.
Li, C.R., Lin, W., and Chang, H., 2004. A psychophysical measure of attention deficit in children
with attention-deficit/hyperactivity disorder. Journal of Abnormal Psychology, 113, 228–236.
Liu, C.L. and Hwang, S.L., 2000. A performance measuring model for dynamic quality
characteristics of human decision-making in automation. Theoretical Issues in Ergonomics
Science, 1, 231–247.
Long, G.M. and Waag, W.L., 1981. Limitations on the practical applicability of d′ and β measures.
Human Factors, 23, 285–290.
Madhavan, P. and Wiegmann, D.A., 2007a. Similarities and differences between human-human and
human-automation trust: an integrative review. Theoretical Issues in Ergonomics Science, 8,
277–301.
Madhavan, P. and Wiegmann, D.A., 2007b. Effects of information source, pedigree, and reliability
on operator interaction with decision support systems. Human Factors, 49, 773–785.
McCornack, R.L., 1961. Inspector accuracy: a study of the literature. Albuquerque: Sandia
Corporation.
Papoulis, A. and Pillai, S.U., 2002. Probability, random variables, and stochastic processes.
New York: McGraw-Hill.
Parasuraman, R., Hancock, P.A., and Olofinboba, O., 1997. Alarm effectiveness in driver-centred
collision-warning systems. Ergonomics, 40, 390–399.
Parasuraman, R., Masalonis, A.J., and Hancock, P.A., 2000. Fuzzy signal detection theory: basic
postulates and formulas for analyzing human and machine performance. Human Factors, 42,
636–659.
Parasuraman, R. and Riley, V., 1997. Humans and automation: use, misuse, disuse, abuse. Human
Factors, 39, 230–253.
Pollack, I. and Norman, D.A., 1964. A non-parametric analysis of recognition experiments.
Psychonomic Science, 1, 125–126.
Snodgrass, J.G. and Corwin, J., 1988. Pragmatics of measuring recognition memory: applications to
dementia and amnesia. Journal of Experimental Psychology: General, 117, 34–50.
Stanislaw, H. and Todorov, N., 1999. Calculation of signal detection measures. Behaviour Research
Methods, Instruments, and Computers, 31, 137–149.
Stanton, N.S., Ragsdale, S.A., and Bustamante, E.A., 2009. The effects of system technology and
probability type on trust, compliance, and reliance. In: Proceedings of the Human Factors and
Ergonomics Society 53rd annual meeting, 19–23 October 2009. Santa Monica, CA: Human
Factors and Ergonomics Society, 1368–1372.
Swets, J.A., 1961. Is there a sensory threshold? Science, 134, 168–177.
Swets, J.A., 1973. The relative operating characteristic in psychology. Science, 182, 990–1000.
Swets, J.A., 1996. Signal detection theory and ROC analysis in psychology and diagnostics: collected
papers. Mahwah, NJ: Erlbaum.
Swets, J.A. and Pickett, R.M., 1982. Evaluation of diagnostic systems: methods from signal detection
theory. New York, NY: Academic Press.
Thomas, L.C., Wickens, C.D., and Rantanen, E.M., 2003. Imperfect automation in aviation traffic
alerts: a review of conflict detection algorithms and their implications for human factors
research. In: Proceedings of the Human Factors and Ergonomics Society 47th annual meeting,
13–17 October 2003. Santa Monica, CA: Human Factors and Ergonomics Society, 344–347.
Wickens, C.D. and Dixon, S.R., 2007. The benefit of imperfect diagnostic information: a synthesis of
the literature. Theoretical Issues in Ergonomics Science, 8, 201–212.
Wogalter, M.S., Allison, S.T., and McKenna, N.A., 1989. Effects of cost and social influence on
warning compliance. Human Factors, 31, 133–140.
Yang, L.C. and Kuchar, J.K., 1997. Prototype conflict alerting system for free flight. Journal of
Guidance, Control, and Dynamics, 20, 768–773.
Zadeh, L.A., 1965. Fuzzy sets. Information and Control, 8, 338–353.