Professional Documents
Culture Documents
Todays Content
1. Administrivia: Future Alumni Training
2. A Definition of Dependability (6.1)
A.
B.
C.
Basic Definitions
Achieving, Measuring, and Validating Dependability
Fault Assumptions
25.
a)
b)
c)
d)
24.
a)
b)
c)
d)
26.
a)
b)
c)
d)
27.
a)
b)
c)
d)
this statement has not been evaluated by the US Food and Drug Administration
Todays Content
1. Administrivia
2. A Definition of Dependability (6.1)
A.
B.
C.
Basic Definitions
Achieving, Measuring, and Validating Dependability
Fault Assumptions
a comprehensive specification
Need to specify not only functionality but
assumed environmental conditions
Need to clarify what high means (contextdependent)
This is .
Fault
This is .
Failure
Component1
11101111
Error
Fault
8
Fault Types
Several axes/viewpoints by which to classify faults
Phenomenological origin
Physical:
HW causes
Design: introduced in the design phase
Interaction: occuring at interfaces between components
Nature
Accidental
Intentional/malicious
More on Faults
Independent faults: attributed to different
causes
Related faults: attributed to a common
cause
Related faults usually cause common-mode
failures
Single
10
Intentional
Faults
Physical
Faults
Humanmade
Faults
PERSISTENCE
System Boundaries
Internal
Faults
External
Faults
Phase of Creation
Design
Faults
Usual
Labelling
Operational
Faults
Permanent
Faults
X
X
Temporary
Faults
X
X
X
X
X
X
X
Physical Faults
Transient Faults
Intermittent
Faults
Design Faults
X
X
X
X
Interaction Faults
Malicious
Logic
Intrusions
X
11
Todays Content
1. Administrivia
2. A Definition of Dependability (6.1)
A.
B.
C.
Basic Definitions
Achieving, Measuring, and Validating Dependability
Fault Assumptions
12
2 Choices:
Error
15
MTBF/(MTBF+MTTR)
Interval
Performability: combined
performance+dependability analysis
Quantifies
Availability Examples
Availability
9s
Downtime/year
Example Component
90%
>1 month
Unattended PC
99%
~4 days
Maintained PC
99.9%
~9 hours
Cluster
99.99%
~1 hour
Multicomputer
99.999%
~5 minutes
99.9999%
~30 seconds
99.99997%
~3 seconds
Todays Content
1. Administrivia
2. A Definition of Dependability (6.1)
A.
B.
C.
Basic Definitions
Achieving, Measuring, and Validating Dependability
Fault Assumptions
19
Fault Assumptions
Cant design to tolerate an arbitrary number and kind
of faults! (IMHO; YMMV.)
stops
wrong meaning
bug
Deliberate action by intruder
Caveat:
Caveat:its
itsaaByzantine
Byzantine(and
(andMachievellian)
Machievellian)world
world
out
outthere.
there.
You've
You'vegot
gotto
toask
askyourself
yourselfone
onequestion.
question.Do
Doyou
you
feel
feellucky?
lucky?Well,
Well,do
doyou...
you...Punk
Punk
24
Coverage
To build a FT system you had to assume a fault
model
But
Q: which is better
A
25
Causes of Failures
Jim Gray (RIP) survey at Tandem (1986)
Still
relevant today
Todays Content
1. Administrivia
2. A Definition of Dependability (6.1)
A.
B.
C.
Basic Definitions
Achieving, Measuring, and Validating Dependability
Fault Assumptions
27
a span of mechanisms
Error Processing
Facets of error processing
Error
29
more replicas
Hardening fragile ones (fault prevention)
Making more resilient to severe faults (fault tolerance)
precise results