Aviation Safety and Security Management

Unit 13
Reliability Fundamental Theories

Objectives
After reading this unit, you will be able to:
• Understand the basic concepts of reliability theory
• Know the reliability theory of aging and longevity
• Define safety engineering
• Discuss the safety engineering process
• Understand the application of the Bow-Tie diagram

Reliability Theory
Reliability theory is the probabilistic and statistical
foundation of reliability engineering, a branch of engineering
practice that has become increasingly important as the
complexity and required precision of engineering artefacts
have increased.
To see how things have developed over the years, take the case
of the aeroplane. The aeroplanes of earlier years (say, 1915)
were single- or two-seaters built of wood and canvas, with a
simple rotary internal combustion engine, simple wire-operated
manual control surfaces, a fixed undercarriage, and no brakes.
The aeroplanes of 1940 were multi-engined, high-speed and built
of metal. They had complex propeller-driving engines and still
had simple, manual, wire-operated controls, but had a
retractable undercarriage with brakes, often hydraulically
operated, and could carry a number of passengers. The
aeroplanes of 1960 (e.g. DC-8, B707) were pressurized, flew at
very high altitude at jet speeds, were built of riveted metal
sheet, were powered by axial-flow jet engines, had
power-assisted controls that were still linked
by rods to the control column, had a hydraulically operated
undercarriage, and carried radar, radios, electronic navigation
equipment, etc.

On the other hand, modern aeroplanes (e.g. the Airbus A320) are
sophisticated structures of milled metal and composite
components, with power-operated controls, computer-controlled
stability and manoeuvring laws, many hydraulically and
electrically operated subsystems, radars, radios and electronic
navigation systems (fly-by-wire technology, GPS and satellite
assistance), etc.

Keeping the aeroplane of 1915 fit to fly was a straightforward
task, comparable to keeping a car on the road. By contrast,
keeping a modern aeroplane fit to fly is a complex task which,
in the worst cases, could make the ratio of operational time to
maintenance time too small to be economic or practical.
Reliability and maintainability therefore have to be primary
design criteria for such aircraft, alongside the other
operational requirements and parameters.

These stimuli led to the development of reliability and
maintainability engineering as a distinct engineering
specialization. Although the stimulus came from particular
application areas, the better understanding of reliability has
filtered into almost every branch of engineering. The modern
motor car has benefited enormously from improvements in
reliability developed in other arenas. In the 1950s a typical
car required servicing at 3,000-mile intervals (including
greasing of many chassis parts), needed a major engine overhaul
(removing the engine from the car, dismantling it and
refurbishing many parts) at 60,000-mile intervals, used a
mechanical ignition timing device (the “distributor”), and was
often beyond economic repair because of chassis rust at about
6-8 years old. Developments in reliability have extended
service intervals to 12,000 or 20,000 miles, the typical life
of the engine to 200,000 miles or more, and the life of the
main structure to 12-20 years, and have made it practical to
incorporate power assistance for steering, brakes, window
operation, soft-top operation, etc. Other everyday engineering
artefacts (e.g. washing machines, dishwashers, freezers) have
benefited similarly.

What is Reliability theory
Reliability theory is a general theory of systems failure. It
allows researchers to predict the age-related failure kinetics
of a system from its architecture (reliability structure) and
the reliability of its components. Applied to biological
systems, it suggests that they start their adult life with a
high load of initial damage. Reliability theory predicts that
even systems composed entirely of non-aging elements (elements
with a constant failure rate) will deteriorate (fail more
often) with age if they are redundant in irreplaceable
elements. Aging, therefore, is a direct consequence of systems
redundancy.
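
The claim that redundancy alone produces aging can be checked numerically. The following is a minimal sketch, not taken from the text, which assumes the simplest redundant structure: a block of n identical, independent, irreplaceable components in parallel, each with the same constant failure rate. The function name `parallel_hazard` and the value `lam = 0.1` are illustrative assumptions.

```python
import math


def parallel_hazard(t, n, lam):
    """Failure rate at age t of a block of n parallel components.

    Assumptions (illustrative only): components are independent, cannot be
    replaced, and each has the constant failure rate lam. The block fails
    when its last working component fails.
    """
    q = 1.0 - math.exp(-lam * t)   # probability that one component has failed by t
    survival = 1.0 - q ** n        # block survives while any component still works
    density = n * lam * math.exp(-lam * t) * q ** (n - 1)
    return density / survival      # hazard = density / survival


lam = 0.1  # hypothetical component failure rate (per year)
for n in (1, 3, 5):
    rates = [parallel_hazard(t, n, lam) for t in (1, 10, 50, 200)]
    print(n, [round(r, 4) for r in rates])
```

Under these assumptions, a block with n = 1 keeps the failure rate lam at every age, while a block with n > 1 starts with a failure rate near zero that rises with age and then levels off towards lam once the redundancy is used up. This is consistent with the late-life levelling-off that the theory predicts.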

Reliability theory also predicts late-life mortality
deceleration with subsequent levelling-off, as well as
late-life mortality plateaus, as an inevitable consequence of
redundancy exhaustion at extreme old ages. By taking into
account the initial flaws (defects) in newly formed systems,
the theory explains why mortality rates increase exponentially
with age (the Gompertz law) in many species. It also explains
why organisms “prefer” to die according to the Gompertz** law,
while technical devices usually fail according to the
Weibull*** (power) law. Reliability theory specifies the
conditions under which organisms would die according to the
Weibull law: they would have to be relatively free of initial
flaws and defects. The theory makes it possible to find a
general failure law applicable to all adult and extreme old
ages, of which the Gompertz and Weibull laws are just special
cases. The theory also explains why relative differences in
mortality rates between compared populations (within a given
species) vanish with age (the compensation law of mortality):
mortality convergence is observed because the initial
differences in redundancy levels are exhausted.
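
For reference, the two failure laws named above are commonly written as follows. This is a sketch in one conventional notation; the symbols R, \alpha, a and b are assumed here and do not come from the text. Here \mu(x) is the failure (mortality) rate at age x.

```latex
% Gompertz law: the failure rate grows exponentially with age,
% so its logarithm is linear in age.
\[
  \mu_{G}(x) = R\, e^{\alpha x}
  \quad\Longleftrightarrow\quad
  \ln \mu_{G}(x) = \ln R + \alpha x
\]

% Weibull (power) law: the failure rate grows as a power of age,
% so its logarithm is linear in the logarithm of age.
\[
  \mu_{W}(x) = a\, x^{b}
  \quad\Longleftrightarrow\quad
  \ln \mu_{W}(x) = \ln a + b \ln x
\]
```

In practice the two laws can be told apart by plotting the logarithm of the failure rate against age (a straight line for Gompertz) or against the logarithm of age (a straight line for Weibull).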

[**Benjamin Gompertz (March 5, 1779 - July 14, 1865, England)
was a self-educated mathematician and a Fellow of the Royal
Society. He is today best known for Gompertz's law of
mortality, a demographic model in which the death rate
increases exponentially with age. This model was used by
insurance companies to calculate the cost of life insurance.]

[***Waloddi Weibull (born June 18, 1887) was a Swedish engineer
and mathematician whose family originally came from Denmark.
Weibull's power law states that the logarithm of the failure
rate increases linearly with the logarithm of age.]

Reliability theory developed apart from the mainstream of
probability and statistics. It was originally a tool to help
nineteenth-century maritime and life insurance companies
compute profitable rates to charge their customers. Even today,
the terms “failure rate” and “hazard rate” are often used
interchangeably.

The failure of mechanical devices such as ships, trains, and
cars is similar in many ways to the life or death of biological
organisms. Statistical models appropriate for any of these
topics are generically called “time-to-event” models. Death or
failure is called an “event”, and the goal is to project or
forecast the rate of events for a given population or the
probability of an event for an individual.

When reliability is considered from the perspective of the
consumer of a technology or service, measured reliability may
differ dramatically from perceived reliability. One bad
experience can be magnified in the mind of the customer,
inflating the perceived unreliability of the product. One plane
crash in which hundreds of passengers die will immediately
instil fear in a large percentage of the flying public,
regardless of actual data about the safety of air travel.

Reliability theory of aging and longevity
The reliability theory of aging and longevity is a scientific
approach that aims to gain theoretical insight into the
mechanisms of biological aging and species survival patterns by
applying the general theory of systems failure described above.

Overview
Reliability theory allows researchers to predict the
age-related failure kinetics for a system of given architecture
(reliability structure) and given reliability of its
components. Applying the reliability-theory approach to the
problem of biological aging and species longevity leads to the
following conclusions:

1. Redundancy is a key notion for understanding aging, and the
systemic nature of aging in particular. Systems that are
redundant in numbers of irreplaceable elements do deteriorate
(i.e., age) over time, even if they are built of non-aging
elements.

2. Paradoxically, the apparent aging rate or expression of
aging (measured as relative differences in failure rates
between compared age groups) is higher for systems with higher
redundancy levels.

3. Redundancy exhaustion over the life course explains the
observed ‘compensation law of mortality’ (mortality convergence
in later life, when death rates become relatively similar at
advanced ages for different populations of the same biological
species), as well as the observed late-life mortality
deceleration, levelling-off, and mortality plateaus.

4. Living organisms seem to be formed with a high initial
damage load (the HIDL hypothesis), and therefore their lifespan
and aging patterns may be sensitive to early-life conditions
that determine this initial damage load during early
development. The idea of early-life programming of aging and
longevity may have important practical implications for
developing early-life interventions that promote health and
longevity.

5. Reliability theory explains why mortality rates increase
exponentially with age (the Gompertz law) in many species, by
taking into account the initial flaws (defects) in newly formed
systems. It also explains why organisms “prefer” to die
according to the Gompertz law, while technical devices usually
fail according to the Weibull (power) law. The theory specifies
the conditions under which organisms would die according to the
Weibull law: they would have to be relatively free of initial
flaws and defects. The theory makes it possible to find a
general failure law
applicable to all adult and extreme old ages, of which the
Gompertz and Weibull laws are just special cases.

6. Reliability theory helps evolutionary theories to explain
how the age of onset of deleterious mutations could be
postponed during evolution: this could easily be achieved by a
simple increase in initial redundancy levels. From the
reliability perspective, increasing initial redundancy is the
simplest way to improve survival, particularly at early
reproductive ages (with gains fading at older ages). This
matches the higher fitness priority of early reproductive ages
emphasized by evolutionary theories. Evolutionary and
reliability ideas also help in understanding why organisms seem
to “choose” a simple but short-term solution to the survival
problem, enhancing the systems' redundancy, instead of a more
permanent but complicated solution based on rigorous repair
(with the potential of achieving negligible senescence). Thus
there are promising opportunities for merging the reliability
and evolutionary theories of aging.

Overall, reliability theory provides a parsimonious explanation
for many important aging-related phenomena and suggests a
number of interesting, testable predictions. It therefore seems
a promising approach for developing a comprehensive theory of
aging and longevity that integrates mathematical methods with
specific biological knowledge and evolutionary ideas.

The reliability theory of aging provides an optimistic
perspective on the opportunities for healthy life extension.
According to reliability theory, human lifespan is not fixed,
and it could be increased further in the future through better
body maintenance, repair, and replacement of failed body parts.

Failure rate
Failure rate is the frequency with which an engineered
system or component fails, expressed for example in failures
per hour. It is often denoted by the Greek letter λ (lambda)
and is important in reliability theory. In practice, its
reciprocal, the mean time between failures (MTBF), is more
commonly quoted and used for high-quality components or
systems.

Failure rate is usually time dependent, and an intuitive
corollary is that both the failure rate and the MTBF change
over the expected life cycle of a system. For example, as an
automobile grows older, the failure rate in its fifth year of
service may be many times greater than in its first year: one
simply does not expect to replace an exhaust pipe, overhaul the
brakes, or have major power plant or transmission problems in a
new vehicle. Only in the special case in which the likelihood
of failure remains constant with respect to time (for example,
for a product such as a brick or a protected steel beam) is the
failure rate simply the inverse of the MTBF, which is expressed
for example in hours per failure.

MTBF is an important specification parameter in all
high-importance engineering design, such as naval architecture,
aerospace engineering and automotive design; in short, in any
task where failure of a key part or of the whole system needs
to be minimized, particularly where lives might be lost if such
factors are not taken into account. These factors account for
many safety and maintenance practices in engineering and
industry and for government regulations, such as how often
certain inspections and overhauls are required on an aircraft.
A similar measure used in the transport industries, especially
railways and trucking, is the ‘Mean Distance Between Failures’,
a variation that relates reliability needs and practices to
actual loaded distances rather than to time. Failure rates and
their projections are important factors in insurance, business,
and regulation, as well as being fundamental to the design of
safe systems throughout a national or international economy.
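
The relationship between failure rate and MTBF in the constant-rate special case can be sketched as follows. This is an illustration only: it assumes the exponential (constant failure rate) model, and the function names and the 5,000-hour MTBF are hypothetical values chosen for the example.

```python
import math


def failure_rate_from_mtbf(mtbf_hours):
    """Constant failure rate (failures per hour) implied by an MTBF in hours."""
    return 1.0 / mtbf_hours


def reliability(t_hours, mtbf_hours):
    """Probability of running t_hours without failure.

    Valid only under the assumption of a constant failure rate
    (exponential time-to-failure model).
    """
    return math.exp(-t_hours / mtbf_hours)


mtbf = 5000.0  # hypothetical MTBF in hours
print("failure rate per hour:", failure_rate_from_mtbf(mtbf))
for t in (100.0, 1000.0, 5000.0):
    print(t, "hours ->", round(reliability(t, mtbf), 3))
```

One point the sketch makes visible: under this model only about 37% of units (e^-1) are expected to run for a full MTBF without failing, so an MTBF should not be read as a guaranteed life.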

Safety engineering
Safety engineering is an applied science closely related to
systems engineering and its subset, System Safety Engineering.
Safety engineering assures that a life-critical system behaves
as needed even when parts of it fail.

In the real world, the term “safety engineering” refers to any
act of accident prevention by a person qualified in the field.
Safety engineering is often reactive to adverse events, also
described as “incidents”, as reflected in accident statistics.
This arises largely because of the complexity and difficulty of
collecting and analyzing data on “near misses”.

Increasingly, the safety review is being recognized as an
important risk management tool. Failure to identify risks to
safety, and the consequent inability to address or “control”
these risks, can result in massive costs, both human and
economic. The multidisciplinary nature of safety engineering
means that a very broad array of professionals is actively
involved in accident prevention or safety engineering.

Safety engineers distinguish different extents of defective
operation. A “failure” is “the inability of a system or
component to perform its required functions within specified
performance requirements”, while a “fault” is “a defect in a
device or component, for example: a short circuit or a broken
wire”. System-level failures are caused by lower-level faults,
which are ultimately caused by basic component faults. The
unexpected failure of a device that was operating within its
design limits is a “primary failure”, while the expected
failure of a component stressed beyond its design limits is a
“secondary failure”. A device that appears to malfunction
because it has responded as designed to a bad input is
suffering from a “command fault”. A “critical” fault endangers
one or a few people. A “catastrophic” fault endangers, harms or
kills a significant number of people.

Safety engineers also identify different modes of safe
operation. A “probabilistically safe” system has no single
point of failure, and has enough redundant sensors, computers
and effectors that it is very unlikely to cause harm (“very
unlikely” usually means, on average, less than one human life
lost in a billion hours of operation). An “inherently safe”
system is a clever mechanical arrangement that cannot be made
to cause harm; this is obviously the best arrangement, but it
is not always possible. A “fail-safe” system is one that cannot
cause harm when it fails. A “fault-tolerant” system can
continue to
operate with faults, though its operation may be degraded in
some fashion.

These terms combine to describe the safety needed by systems.
For example, most biomedical equipment is only “critical”, and
often another identical piece of equipment is nearby, so it can
be merely “probabilistically fail-safe”. Train signals can
cause “catastrophic” accidents (imagine chemical releases from
tank cars) and are usually “inherently safe”. Aircraft
“failures” are “catastrophic”, so aircraft are usually
“probabilistically fault-tolerant”. Without any safety
features, nuclear reactors might have “catastrophic failures”,
so real nuclear reactors are required to be at least
“probabilistically fail-safe”.

The process
Ideally, safety engineers take an early design of a system,
analyze it to find what faults can occur, and then propose
safety requirements in the design specifications up front,
along with changes to existing systems, to make the system
safer. At an early design stage, a fail-safe system can often
be made acceptably safe with a few sensors and some software to
read them. Probabilistic fault-tolerant systems can often be
made by using more, but smaller and less expensive, pieces of
equipment.

Far too often, rather than actually influencing the design,
safety engineers are assigned to prove that an existing,
completed design is safe. If a safety engineer then discovers
significant safety problems late in the design process,
correcting them can be very expensive. This type of error has
the potential to waste large sums of money.

The exception to this conventional approach is the way some
large government agencies approach safety engineering from a
more proactive and proven process perspective. This is known as
System Safety. The System Safety philosophy, supported by the
System Safety Society, is to be applied to complex and critical
systems, such as commercial airliners, military aircraft,
munitions and complex weapon systems, spacecraft and space
systems, rail and transportation systems, air traffic control
systems, and other complex and
safety-critical industrial systems. The proven System Safety
methods and techniques aim to prevent, eliminate and control
hazards and risks through design influence exerted by a
collaboration of key engineering disciplines and product teams.
Software safety is a fast-growing field, since the
functionality of modern systems is increasingly being put under
the control of software. The whole concept of system safety and
software safety, as a subset of systems engineering, is to
influence safety-critical system designs by conducting several
types of hazard analyses to identify risks and to specify
design safety features and procedures that strategically
mitigate risk to acceptable levels before the system is
certified.

Additionally, failure mitigation can go beyond design
recommendations, particularly in the area of maintenance. There
is an entire realm of safety and reliability engineering known
as “Reliability Centered Maintenance” (RCM), a discipline that
results directly from analyzing potential failures within a
system and determining maintenance actions that can mitigate
the risk of failure. This methodology is used extensively on
aircraft and involves understanding the failure modes of the
serviceable, replaceable assemblies, in addition to the means
of detecting or predicting an impending failure. Every
automobile owner is familiar with this concept when they take
their car in to have the oil changed or the brakes checked.
Even filling up one's car with gas is a simple example of a
failure mode (failure due to fuel starvation), a means of
detection (the fuel gauge), and a maintenance action
(refuelling).

For large-scale complex systems, hundreds if not thousands of
maintenance actions can result from the failure analysis. These
maintenance actions are based on conditions (e.g., a gauge
reading or a leaky valve) or on hard conditions (e.g., a
component is known to fail after 100 hrs of operation with 95%
certainty), or they require inspection to determine the
maintenance action (e.g., metal fatigue). The Reliability
Centered Maintenance concept then analyzes each individual
maintenance item for its risk contribution to safety, mission,
operational readiness, or cost to repair if a failure does
occur. Then all the maintenance actions are bundled into
maintenance intervals so that maintenance is not
occurring around the clock but, rather, at regular intervals.
This bundling process introduces further complexity, as it
might stretch some maintenance cycles, thereby increasing risk,
but shorten others, thereby potentially reducing risk. The end
result is a comprehensive maintenance schedule, purpose-built
to reduce operational risk and ensure acceptable levels of
operational readiness and availability.
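
The bundling step can be illustrated with a small sketch. It is not an RCM standard or any published procedure: it simply assumes that each task has an ideal interval from the failure analysis and assigns it to the nearest of a few standard check intervals. The function name, task names and interval values are all hypothetical.

```python
def bundle_tasks(tasks, check_intervals):
    """Assign each maintenance task to the closest standard check interval.

    tasks: dict mapping task name -> ideal interval in hours (assumed to
           come from the failure analysis)
    check_intervals: list of standard check intervals in hours
    Returns a dict: check interval -> list of (task, ideal interval, effect).
    """
    schedule = {c: [] for c in sorted(check_intervals)}
    for name, ideal in tasks.items():
        chosen = min(check_intervals, key=lambda c: abs(c - ideal))
        if chosen > ideal:
            effect = "stretched (risk may rise)"
        elif chosen < ideal:
            effect = "shortened (risk may fall)"
        else:
            effect = "unchanged"
        schedule[chosen].append((name, ideal, effect))
    return schedule


# Hypothetical tasks and check intervals, for illustration only.
tasks = {
    "oil filter inspection": 120,
    "brake wear check": 450,
    "fatigue inspection": 1100,
}
for interval, items in bundle_tasks(tasks, [100, 500, 1000]).items():
    print(interval, items)
```

A real schedule would weigh each stretched task by its risk contribution rather than by distance to the nearest interval alone; the sketch only shows why bundling moves some tasks later and others earlier.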

Data Analysis
The data can then be studied and analysed. The results indicate
the steps to be taken to control the risk.

Safety certification
Usually a failure in a safety-certified system is considered
acceptable if, on average, less than one life per 10^9 hours of
continuous operation is lost to failure. Most Western nuclear
reactors, medical equipment, and commercial aircraft are
certified to this level. The FAA has considered the trade-off
between cost and loss of life appropriate at this level for
aircraft under the Federal Aviation Regulations.
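
To give a feel for what a target of this order implies for redundancy, the sketch below makes a strong simplifying assumption: a function is lost only when all of its redundant channels fail in the same hour, and the channels fail independently. Real certification analyses must also account for common-cause failures, which this ignores. The function name and the per-channel probabilities are hypothetical.

```python
def channels_needed(p_channel, target=1e-9):
    """Smallest number of independent redundant channels needed so that the
    probability of all failing in the same hour does not exceed target.

    Assumes independent channel failures, which is optimistic in practice.
    """
    if not 0.0 < p_channel < 1.0:
        raise ValueError("p_channel must be strictly between 0 and 1")
    if target <= 0.0:
        raise ValueError("target must be positive")
    n = 1
    while p_channel ** n > target:
        n += 1
    return n


# Hypothetical per-hour failure probabilities for a single channel.
for p in (1e-2, 1e-4, 1e-6):
    print(p, "->", channels_needed(p), "channels")
```

Under these assumptions a channel that fails once in 10,000 hours needs triple redundancy to reach the 10^-9 level, which is one reason why systems certified to this level carry several redundant channels.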

Other countries follow broadly similar procedures.

The Bow-Tie Diagram


In 2004, the US Federal Aviation Administration (FAA) mandated
that its regulated entities employ a technique known as the
‘Bow-Tie Diagram’ as the main mechanism for “safety analyses”
(FAST 2004). This technique is also recommended by other bodies
responsible for safety in air traffic control (EuroControl
2004) and for safety management in hazardous industries. A
Bow-Tie diagram has the following elements:

• Causes: potential causes of an undesirable Incident;
• Proactive Controls: actions taken to reduce the likelihood
of an undesirable Incident occurring;
• Incident: an event that can cause undesirable Outcomes;
• Reactive Controls: actions taken to reduce the impact of an
undesirable Incident; and
• Outcomes: potential results of an undesirable Incident.

The left-hand side of the diagram is often called a ‘Fault
Tree’, which is a detailed analysis of the combinations of
causes (‘faults’) that can give rise to an undesirable
Incident, while the right-hand side is often called an ‘Event
Tree’, which is a detailed analysis of the Outcomes or
Consequences of an undesirable Incident.
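
The five elements of the diagram map naturally onto a small data structure. The sketch below is only one possible representation: the class name `BowTie`, its field names, and every entry in the example are invented for illustration and are not taken from the FAA or EuroControl material.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class BowTie:
    """A Bow-Tie diagram: causes and proactive controls on the left (the
    Fault Tree side), reactive controls and outcomes on the right (the
    Event Tree side), with the incident as the knot in the middle."""
    incident: str
    causes: List[str] = field(default_factory=list)
    proactive_controls: List[str] = field(default_factory=list)
    reactive_controls: List[str] = field(default_factory=list)
    outcomes: List[str] = field(default_factory=list)

    def sequence(self) -> List[Tuple[str, List[str]]]:
        """Return the elements in left-to-right reading order."""
        return [
            ("Causes", self.causes),
            ("Proactive Controls", self.proactive_controls),
            ("Incident", [self.incident]),
            ("Reactive Controls", self.reactive_controls),
            ("Outcomes", self.outcomes),
        ]


# Hypothetical entries, for illustration only.
example = BowTie(
    incident="flat tyre during take-off",
    causes=["debris on runway", "worn tyre"],
    proactive_controls=["runway inspection", "tyre wear checks"],
    reactive_controls=["rejected take-off procedure"],
    outcomes=["delayed departure", "runway excursion"],
)
for name, items in example.sequence():
    print(f"{name}: {', '.join(items)}")
```

Keeping the left and right sides as separate lists mirrors the separation between the Fault Tree and Event Tree analyses, which can then be filled in by different groups of experts.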

(The Bow-Tie sequence is also termed:
Hazard → Preventative Controls → Incident → Mitigating
Controls → Consequences in some Safety Management areas.)

In essence, the diagram attempts to answer the two ‘fundamental
questions’: “what is the potential frequency of a particular
scenario occurring [i.e. left side/Fault Tree] and secondly,
what is its potential loss severity [i.e. right side/Event
Tree]?”

In industrial applications, Bow-Tie analyses are most often
employed to identify and assess the potentially disastrous
impact of the failure of mechanical components, such as
chemical containment vessels or airframe components.

[Figure: example Bow-Tie diagram]

In this relatively simple example, the potentially disastrous
Incident is a flat tyre occurring during aeroplane take-off.
The Causes are identified on the left and, on the right, the
conditions that give rise to the various Outcomes, some much
worse than others. In practice, of course, a diagram would be
much more complex than this one.

Advantages of using the Bow-Tie assessment are often identified
as (e.g. EuroControl 2004):

• It provides a ‘common language’ for communication between
independent risk managers and operational experts;
• The full range of Causes (i.e. ‘inherent risks’) and
Proactive Controls (i.e. ‘residual risks’) can be shown and
discussed;
• The combination and interaction of Causes and Proactive
Controls can be clearly illustrated; and
• Likewise, the full range of Outcomes (i.e. Losses in Basel
terminology) and Reactive Controls can be illustrated and
discussed.

In summary, the complex linkages between possible Causes and
potential Outcomes can be made explicit, which helps to draw a
clear picture of the precise drivers that generate losses.
Furthermore, if each stage of analysis, e.g.
moving from left to right, is carried out by experts and then
brought together into a coherent whole by independent risk
analysts/moderators, then such a process should qualify as
“robust and methodical” for Basel purposes.

Weaknesses
Of course the Bow-Tie technique is not a panacea; it is merely
a way of making risk management assumptions, analyses and
conclusions explicit.

It has known weaknesses, including:

• The quality of the final analysis depends entirely on the
quality of the analysis process and of the analysts and experts
taking part: garbage in, garbage out;

• The technique does not help in uncovering underlying causes,
merely in making their consequences explicit; an earlier
analysis step (i.e. Risk Identification) is therefore required;
• It is a ‘semi-quantitative’ methodology and hence requires an
additional step of estimating the impact of each outcome
numerically, as required by Basel II; and
• It can be ‘gamed’ by staff members who may have a different
agenda, so it requires additional supporting information to be
captured, such as external data or other documented factors
that can serve as evidence.

A methodical approach to estimating risk in any Scenario
Analysis exercise is extremely important. Research in ‘risk
perception’ shows, for example, that people tend to
overestimate the likelihood of an event with which they have
some familiarity relative to a completely alien one, and that
they extrapolate from known situations to estimate an unknown
one without making a large enough adjustment (i.e. they
underestimate the risk). Furthermore, researchers have found
that ‘experts’ are overconfident in their ability to estimate
accurately from small data samples. Nor does using a number of
experts rather than one necessarily lead to a better estimate,
as the well-known phenomenon of ‘groupthink’ can lead groups to
reach completely wrong, but agreed, conclusions.

The use of a Bow-Tie approach does not, of course, eliminate
these problems; it merely reduces the likelihood of error by
segregating risk analysis into smaller, discrete, independent
components and reducing cross-contamination between them. It
should be recognized that, especially for low-probability
events, small errors in one part may be amplified in others, a
problem with all subjective techniques. A good taxonomy is
therefore required for homogeneous loss data collection, one
that can show when correlation factors are present for broad
impacts that cross over from one risk classification into
another.

Application of the Bow-Tie Diagram in Scenario Analysis
A Bow-Tie diagram is a graphical representation of a
Scenario.

Having identified a ‘Scenario’, such as a flat tyre in the FAA
example, the situation can be analysed in a methodical
manner, by experts, as follows:

• Identify potential Causes: using operational/business
experts, risk managers and, if appropriate, external
experts;

• Assess the effectiveness of Proactive and Reactive
Controls: using independent internal/external auditors
and risk managers;

• Identify and assess possible Outcomes: using
operational/business experts, risk managers and, where
possible, internal and external experience;

• Build a Bow-Tie model of the Scenario (i.e. Causes,
Controls and Outcomes): using business and
independent assessments and, where available,
historical data, and evaluate the range/distribution of
potential Outcomes and their sensitivity to assumptions
of the key parameters; and

• Refine the Model: based on business/risk management
feedback and any additional analyses required.
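
The steps above can be sketched in code. What follows is a minimal illustration, not a method prescribed by the text: the class names, the assumption that controls act independently, and every figure for the flat-tyre Scenario are hypothetical. It builds a Bow-Tie model from Causes, Controls and Outcomes, evaluates an expected annual loss, and re-evaluates it with one control weakened to show sensitivity to a key parameter.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Cause:
    name: str
    annual_frequency: float  # assumed expected occurrences per year


@dataclass(frozen=True)
class Control:
    name: str
    effectiveness: float  # assumed fraction of events stopped (0.0 to 1.0)


@dataclass(frozen=True)
class Outcome:
    name: str
    probability: float  # assumed share of events that end in this outcome
    impact: float       # assumed loss per event, arbitrary currency units


@dataclass(frozen=True)
class BowTie:
    scenario: str
    causes: List[Cause]
    proactive: List[Control]  # sit between the Causes and the event
    reactive: List[Control]   # sit between the event and the Outcomes
    outcomes: List[Outcome]

    @staticmethod
    def _pass_through(controls: List[Control]) -> float:
        """Fraction that gets past every control, assuming independence."""
        fraction = 1.0
        for control in controls:
            fraction *= 1.0 - control.effectiveness
        return fraction

    def event_frequency(self) -> float:
        """Yearly frequency of the event after the Proactive Controls."""
        total = sum(cause.annual_frequency for cause in self.causes)
        return total * self._pass_through(self.proactive)

    def expected_annual_loss(self) -> float:
        """Event frequency times mean impact left after Reactive Controls."""
        mean_impact = sum(o.probability * o.impact for o in self.outcomes)
        return (self.event_frequency() * mean_impact
                * self._pass_through(self.reactive))


def loss_if_control_changes(model: BowTie, index: int,
                            effectiveness: float) -> float:
    """Sensitivity check: re-evaluate with one Proactive Control altered."""
    controls = list(model.proactive)
    controls[index] = replace(controls[index], effectiveness=effectiveness)
    return replace(model, proactive=controls).expected_annual_loss()


# Purely illustrative figures for the flat-tyre Scenario.
model = BowTie(
    scenario="Flat tyre",
    causes=[Cause("Debris on runway", 4.0), Cause("Under-inflation", 6.0)],
    proactive=[Control("Runway inspection", 0.5),
               Control("Pre-flight check", 0.8)],
    reactive=[Control("Crew emergency procedure", 0.5)],
    outcomes=[Outcome("Delay", 0.9, 1_000.0),
              Outcome("Runway excursion", 0.1, 50_000.0)],
)

baseline = model.expected_annual_loss()
weaker_inspection = loss_if_control_changes(model, 0, 0.4)
print(f"baseline: {baseline:.0f}, weaker inspection: {weaker_inspection:.0f}")
```

In this sketch, lowering the assumed effectiveness of a single Proactive Control raises the expected loss in proportion, which is the kind of parameter sensitivity the "Build" step asks the analyst to examine. A fuller model would replace the point estimates with distributions.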

In order to satisfy the requirements, such a process would
have to be judged:
• Methodical: with each component step performed to
agreed procedures with well-defined separation of
responsibilities;

• Robust: able to be replicated by different analysts and
experts, producing results that are not too dissimilar;

• Comprehensive and Consistent: used in the same way
across all business units;

• Well-documented: in a consistent fashion with
sufficient detail, to permit

• Independent Review and Validation: by external and
independent experts.
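
The "Robust" criterion can be given a rough numerical form. The sketch below is only one possible reading: it treats estimates of the same Scenario from different analysts as "not too dissimilar" when their coefficient of variation stays under a tolerance. The function names and the 25% tolerance are assumptions made for illustration, not values taken from the text.

```python
from statistics import mean, pstdev
from typing import Sequence


def relative_spread(estimates: Sequence[float]) -> float:
    """Coefficient of variation: population std deviation over the mean."""
    average = mean(estimates)
    if average == 0:
        raise ValueError("mean estimate is zero; relative spread undefined")
    return pstdev(estimates) / average


def is_replicable(estimates: Sequence[float], tolerance: float = 0.25) -> bool:
    """True when independent estimates agree within the assumed tolerance."""
    return relative_spread(estimates) <= tolerance


# Hypothetical loss estimates for one Scenario from three analysts.
close_estimates = [100.0, 110.0, 90.0]
divergent_estimates = [100.0, 300.0, 50.0]

print(is_replicable(close_estimates), is_replicable(divergent_estimates))
```

Agreement of this kind shows only that the process is repeatable. As the earlier discussion of groupthink notes, analysts can agree and still be wrong, so the check complements independent review rather than replacing it.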

It is therefore desirable that a firm should build a “database
of scenario-based events” that can be reviewed periodically
and modified as business conditions change. The consistent
use of a Bow-Tie technique should aid the development of
such a database. It also allows rational discussion between risk
analysts and business managers when new initiatives are
considered, which is a major benefit of the approach and
overcomes a major hurdle in subjective assessment.

Since financial firms are subject to similar risks (although
their individual control environments and consequent ranges
of potential outcomes may vary significantly), there is the
potential for developing a database of scenarios that are
applicable across the industry. One example is the loss of a
shared industry service such as an Exchange or Clearing
house. Such a ‘scenario’ is the same for all participating firms,
but the impact may vary widely depending on, for example,
transaction volumes, customer impact and the quality of each
firm’s BCP (Business Continuity Planning).
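
A small sketch can show how one shared scenario yields different impacts for different firms. Everything here is hypothetical: the record layouts, the field names, the simple loss formula and the figures are assumptions chosen for illustration, and customer impact is left out for brevity.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class SharedScenario:
    scenario_id: str
    description: str
    outage_days: float  # assumed length of the service outage


@dataclass(frozen=True)
class FirmProfile:
    name: str
    daily_transactions: int
    loss_per_failed_transaction: float  # arbitrary currency units
    bcp_coverage: float  # assumed fraction of volume the BCP still handles


def firm_impact(scenario: SharedScenario, firm: FirmProfile) -> float:
    """Loss from transactions the firm cannot process during the outage."""
    unprocessed = (firm.daily_transactions * scenario.outage_days
                   * (1.0 - firm.bcp_coverage))
    return unprocessed * firm.loss_per_failed_transaction


# Industry-wide database: the scenario definition is common to all firms.
scenario_db: Dict[str, SharedScenario] = {
    "CLR-01": SharedScenario("CLR-01", "Loss of clearing house", 2.0),
}

firm_a = FirmProfile("Firm A", 10_000, 5.0, bcp_coverage=0.5)
firm_b = FirmProfile("Firm B", 1_000, 5.0, bcp_coverage=0.0)

for firm in (firm_a, firm_b):
    impact = firm_impact(scenario_db["CLR-01"], firm)
    print(f"{firm.name}: {impact:.0f}")
```

The scenario record is stored once, while each firm supplies its own volumes and BCP quality, so only the impact calculation differs between participants.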

References
1. Radatz, Jane (Sep 28, 1990). IEEE Standard Glossary of
Software Engineering Terminology (PDF), New York,
NY, USA: The Institute of Electrical and Electronics
Engineers, 84 pages. ISBN 1-55937-067-X.

2. Vesely, W. E.; F. F. Goldberg, N. H. Roberts, D. F. Haasl
(Jan, 1981). Fault Tree Handbook (PDF), Washington,
DC, USA: U.S. Nuclear Regulatory Commission, page
V-3. NUREG-0492. Retrieved on 2006-08-31.

3. Gompertz, B. (1825). On the Nature of the Function
Expressive of the Law of Human Mortality, and on a New
Mode of Determining the Value of Life Contingencies.
Philosophical Transactions of the Royal Society of
London, Vol. 115 (1825).

4. Weibull, W. (1951). “A statistical distribution function
of wide applicability”. J. Appl. Mech.-Trans. ASME 18(3),
293-297.

5. Patrick McConnell and Martin Davies (April 2006).
Safety First – Scenario Analysis under Basel II.

6. EUROCONTROL (2004). “Review of Techniques to
Support the EATMP Safety Assessment Methodology,
Volume 4”. European Organization for the Safety of Air
Navigation; http://www.eurocontrol.int

7. FAST (2004). “Toolsets / System Safety Management
Program – Section 4”. Federal Aviation Administration
Acquisition System Toolset; http://fast.faa.gov

Questions
General Questions
1. What do you mean by reliability theory?

2. What do you mean by failure rates?

3. How are ‘Failure rate’ and ‘Mean Time Between Failures
(MTBF)’ mathematically related to each other?

4. What is a bow-tie diagram? How can the Bow-Tie Diagram
be applied in Scenario Analysis?

Objective Type Questions
a. Reliability theory is developed apart from the
mainstream of ______.

Answers to Objective Type Questions
a. Probability and statistics.
