Professional Documents
Culture Documents
Manchester , UK
21 - 24 November 2011
Often the activities prior to test execution are delayed. This means
testing has to be done under severe pressure. It would be unthinkable
to quit the job, to delay delivery or to test badly. The real answer is a
differentiated test approach in order to do the best possible job with
limited resources.
Practical Risk-Based Testing - Product Risk Management: The PRISMA Method
Contents
1. Introduction 2
7.1 Planning 12
7.2 Kick-off 14
8. Practical Experiences 19
References 22
Biography 22
w w w. e u ro s t a rc o n f e re n c e s . c o m
Practical Risk-Based Testing - Product Risk Management: The PRISMA Method
TWEETABLE
industries. The method supports good enough to ship?” When the time
the test manager in doing risk- comes to make the big decision, how could
based testing. you answer that question? If you say, “Well,
it’s just not ready,” the project manager just
thinks, “Testers always say that, they’re
never happy” and you are dismissed as a
TWEETABLE
pessimist. Suppose you say, “Well, it looks
“Good enough” is a reaction to ready to me,” will your project manager put
the formalism in testing theory. a piece of paper under your nose, asking
It’s not reasonable to aim at zero- you to sign it? If you sign, are you taking
defects (at least in software), so someone else’s responsibility? So, what is
“Good enough?” and how does it help with
why do you pretend to yourself
the risk based testing?
and pretend to the users and
customers that you’re aiming at “Good enough” is a reaction to the formalism
perfection? in testing theory. It’s not reasonable to aim
at zero-defects (at least in software), so why
do you pretend to yourself and pretend to
the users and customers that you’re aiming
at perfection? The zero-defect attitude just
doesn’t help. Your customers and users live in
the real world, why don’t we? Compromise is
inevitable, you always know it’s coming, and
the challenge ahead is to make a decision
based on imperfect information. As a tester,
Practical Risk-Based Testing - Product Risk Management: The PRISMA Method
don’t get upset if your estimates are cut down provided sufficient evidence to say these
or your test execution phase is squeezed. risks are addressed and those benefits
Guilt and fear should not be inevitable just are available for release? In essence these
because a project is constrained for budget questions are all about balancing; spending
and resources and has more than zero the resources on testing to deliver good
defects remaining. enough quality and acceptable level of risk.
This definition means that there is already Figure 1: Balancing testing with quality & risks
enough of this product working (this system,
4
increment or enhancement) for us to take it
Who Decides?
PAGE
Firstly, have sufficient benefits been In our simple analogy, we will disregard
delivered? The tests that we execute must at the lawyer’s role. We will assume that the
least demonstrate that the features providing prosecution and defence are equally good
the benefits are delivered completely, so we at extracting evidence from witnesses and
must have evidence of this. Secondly, are challenging “facts” and arguments. We will
there any critical problems? The incident focus on the expert witness role; these are
reports that record failures in software provide people who are brought into a court of law
the evidence of at least the critical problems to present and explain complex evidence in
(and as many other problems as possible). a form for laymen (the jury) to understand.
There should be no critical problems for it to The expert witness must be objective and
be good enough. Thirdly, is our testing good detached. If asked whether the evidence
enough to support this decision? Have we points to guilt or innocence, the expert
explains what inferences could be made
Practical Risk-Based Testing - Product Risk Management: The PRISMA Method
based on the evidence, but refuses to judge. believe that by taking this principled position
In the same way, the tester might simply as an expert witness (and taking it as early in
state that based on evidence “these features the project as possible) you might raise your
work, these features do not work, these risks credibility with management. Management
have been addressed, these risks remain”. It might then give you the right responsibilities
is for others to judge whether this makes a on future projects.
system acceptable.
The Classic Squeeze on Testing
The tester is there to provide information
for the stakeholders to make a decision. The well-known squeeze on testing occurs
After all, testers do not create software or when the developers deliver late into test
software faults; testers do not take the risks but the go-live deadline remains fixed. The
of accepting a system into production. exit criteria might be used to determine
Testers present to their management and what happens next, but all too often it is
peers an informed and independent point obvious that these criteria cannot be met
of view. When asked to judge whether a in time. The pressure to release might be
product is good enough, the tester might so great that the exit criteria are set aside.
say that on the evidence obtained, these There may be some attempt to downgrade
benefits are available, but these risks still the severe bugs perhaps. Exit criteria are
exist. However, if as a tester you are actually an uncomfortable reminder of the idealistic
asked to make the decision, what should attitude in the early stages of the project,
you do? The answer is that you must help but do not make the decision any easier to
5 the stakeholders make the decision, but not make. The risks that were apparent at the
make it for them. The risks, those problems start of testing have been visible throughout.
PAGE
that we thought say 6 months ago could When testing is squeezed, some risks may
occur, and which in your opinion would have been addressed, some benefits may
make the system unacceptable, might be available, testers may have revealed new
still exist. If those were agreed with the risks to be addressed, but the outstanding
stakeholders at that time, the system cannot risk is apparent to all involved. If the decision
now be acceptable, unless they relax their to release is actually made, the stakeholders
perceptions of the risk. have explicitly chosen to accept a product
before they have evidence that all risks are
The judgement on outstanding risks must addressed, and that all benefits are available.
be as follows: This should not simply be considered as a
tester’s problem. The stakeholders have
- There is enough test evidence now judged that they had enough information to
to judge that certain risks have been make that decision.
addressed.
- There is evidence that some features do In fact, information for making the release
not work (the risk has materialized). decision becomes available from the
- Some risks (doubts) remain because of first day of test execution onwards; it’s
lack of evidence (tests have not been just that the balance of testing evidence
run, or no tests are planned). versus outstanding risks weighs heavily
against release. A positive (though perhaps
This might seem less than ideal as a surprising) principle for risk-based testers
judgement, but is preferable to the unrealistic, is therefore that the time available for test
ideal-world acceptance criteria discussed execution has no bearing on your ability to
earlier. You may still be forced to give an do ‘good testing.’
opinion on the readiness of a system but we
Practical Risk-Based Testing - Product Risk Management: The PRISMA Method
there may be other factors playing a role, but functionality, but rather makes the
the factors given here have been valuable in product less appealing to the user or
many projects. Important areas can either customer.
be functions or functional groups, or quality
attributes such as performance, reliability, Of course damage will mean very different
security etc. In this paper we will use the things depending on the product, for some
generic term ‘test items’ for this. products it is related to (human) safety and
for some ‘only’ to financial damage. Another
Major Factors to Look for When way of looking at user importance is to take
Determining the Importance of Test Items the view of marketing. What are the (unique)
Include: selling points of this new product for our
customers?
Critical Areas (Cost and Consequences
of Failure) Visible Areas
You have to analyse the use of the software The visible areas are areas where many
within its overall environment, and analyse users will experience a failure, if something
the ways the software may fail. Find the goes wrong. Users do not only include the
possible consequences of such failure operators sitting at a terminal, but also final
modes, or at least the worst ones. Take users looking at reports, invoices, or the like,
into account redundancy, backup facilities or being dependent on the service delivered
and possible manual checks of output by by the product which includes the software.
7 users, operators or analysts. A product that A factor to take into account under this
is directly coupled to a process it controls is heading is also the tolerance of the users to
PAGE
more critical than a product whose output is such problems. It relates to the importance
manually reviewed before use. If a product of different functions or quality attributes,
controls a process, this process itself should see above. Software intended for untrained
be analysed. or naive users, especially software intended
for use by the general public, needs careful
A possible hierarchy is the following: attention to the user interface. Robustness
will also be a major concern. Software
- A failure would be catastrophic which directly interacts with hardware,
The problem would cause the system industrial processes, networks etc. will be
to stop, and maybe even take down vulnerable to external effects like hardware
things in the environment (stop the entire failure, noisy data, timing problems etc.
workflow or business or product). Such These kinds of products need thorough
failures may deal with large financial validation, verification and re-testing in
losses or even damage to human life. case of environment changes. Regarding
- A failure would be damaging visibility often a distinction is made between
The program may not stop, but data may external visibility (outside the organisational
be lost or corrupted, or functionality may boundaries) and internal visibility whereby
be lost until the program or computer is ‘only’ our own users experience the
restarted. problem.
- A failure would be hindering
The user is forced to work around, and Most Used Areas
to execute more difficult actions for
reaching the same results. Some functions may be used every day, other
- A failure would be annoying functions only a few times. Some functions
The problem does not affect may be used by many, some by only a few
Practical Risk-Based Testing - Product Risk Management: The PRISMA Method
changes by functional area or otherwise and involved in a task, the larger is the overhead
find the areas which have had exceptional for communication, and the greater the
amount of changes. These may either chance that things will go wrong. A small
have been badly designed from the start, group of highly skilled staff is much more
or have become badly designed after the productive than a large group with average
original design has been destroyed by the qualifications. In the COCOMO (Boehm,
many changes. Many changes are also a 1981) software cost model, this is the largest
symptom of badly done analysis. Thus, factor after software size. Much of its impact
heavily changed areas may not correspond can be explained from effort going into
to user expectations. detecting and fixing defects. Areas where
relatively many and less qualified people
New Technology and Methods have been employed may be identified for
better testing. It is important in this context
Programmers using new tools, methods to define what qualified means, e.g. is
and technology experience a learning curve. it related to the programming language,
In the beginning, they may generate many domain knowledge, development process,
more faults than later. Tools include CASE working experience in general, etc.
(Computer Aided Software Engineering)
tools, which may be new in the company, Care should be taken in that analysis: Some
or new in the market and unstable. Another companies (Jørgensen, 1984) employ their
issue is the programming language, which best people in more complex areas, and
may be new to the programmers. Any new less qualified people in easy areas. Then,
9
tool or technique may give trouble. defect density may not reflect the number of
PAGE
Most software cost models include Time pressure may also lead to overtime at
factors accommodating the experience of work. It is well known, however, that people
programmers with the methods, tools and lose concentration after prolonged periods
technology. This is as important in test of work. Together with short cuts in applying
planning, as it is in cost estimation. reviews and inspections, this may lead to
extreme levels of defect density. Data about
People Involved time pressure during development can
best be found by studying time lists, or by
The idea here is the thousand monkey’s interviewing management or programmers.
syndrome. The more people that are
Practical Risk-Based Testing - Product Risk Management: The PRISMA Method
To support the PRISMA process a supporting documents: the risk items. If assumptions are
freeware software tool has been developed. made (e.g. related to ambiguousness in input
Output examples of this tool are used documents), these should be documented.
throughout the remainder of this paper (e.g. The most likely scenario can be determined
figures 5, 6 and 8). by interviewing the document owner and/or
stakeholders. As a rule of thumb there should
TWEETABLE
not be more than approximately 30 – 35 risk
Finding the worst areas in items to keep the process workable. This
the product soon and testing often means the items (e.g. requirements)
them more will give you more as stated in the input document need to be
defects. If you find many serious grouped into logical units. The identified risk
items will be structured following a hierarchy.
problems, management will
It is most often not useful to consider each
often be motivated to give you and every elementary requirement as a
more time and resources. separate risk item.
the input documents (often the same as the The identified risk item list should be
test basis) are determined and collected. Of uniquely identified and understandable to
course the relevant input documents depend the participants. The reference identification
largely on the test level on which the risk can be a number, but it may also contain a
analysis is performed. One needs to check code or abbreviation that is meaningful in
that the input documents are at the required the project. Per risk item, if possible, a link
quality level and that they contain the items should be provided to the relevant parts of
(referred to as test items) that can be used the input documentation. A description of
in this process. The inputs don’t have to be the risk item can be added as a one-liner.
‘final’ or ‘accepted,’ but sufficiently stable For the participants, this description should
to use them for product risk analysis. Gaps give a clear idea which risk item they have
(e.g. in requirements) should be identified to assess.
and reported to the document owner. If the
input documents are not available at the Determine Impact and Likelihood
correct level, one needs to consider how and Factors
if to continue with the product risk analysis
process. Ideally the documents to be used For product risk analysis two ingredients are
are a requirements document (for testing at relevant: the likelihood that a failure occurs
system level), or an architectural document and the impact if this happens. In sections 5
(for development testing). and 6 these two ingredients are discussed
and several factors that influence the impact
Identifying Risk Items and/or likelihood are recommended to
be used are presented. The test manager
The items that are used for the risk determines the factors that the project will use
assessment are identified based on the input to assess the risk items in terms of likelihood
Practical Risk-Based Testing - Product Risk Management: The PRISMA Method
factor should already have been determined method prescribes not to use weight factors
at a higher level than at project level. for stakeholder, i.e. each stakeholder is
considered equally important. It is also
Define a Weight for Each Factor highly recommended to assign each factor
to at least two stakeholders.
It is also possible to use weights whereby
one factor is considered more important The test manager who is responsible for
than another factor, e.g. one factor could assigning the roles has to make a good
have 2 times the ‘worth’ of another factor. balance in:
Again weights for the factors are preferably
determined in the test strategy, but can be - choosing the appropriate persons for the
tailored for project specific purposes. The roles depending on the test level.
general method is to assign weights, and to - choosing “technical” roles to fill in the
calculate a weighted sum for every area of likelihood and “business” roles to fill in
the system. Focus your testing on the areas impact.
where the result is highest! For every factor - involving sufficient knowledge areas,
chosen, assign a relative weight. You can both for impact and likelihood parts.
do this in very elaborate ways, but this will
take a lot of time. Most often, three weights Scoring Rules
will suffice. Values may be 1, 2, and 3 (1 for
“factor is not very important,” 2 for “factor Finally the rules are set that apply to the
has normal influence” and 3 for “factor has scoring process. One of the common pitfalls
strong influence”). Once historical project of the risk analysis is that the results tend to
data is gathered one can start fine tuning cluster, i.e. the result is a two-dimensional
the weights. matrix with all risk items close to each
Practical Risk-Based Testing - Product Risk Management: The PRISMA Method
other. To prevent this in an efficient way, the are clear. A clear explanation and common
stakeholders should be enforced to score view will contribute to a better deployment
the factors on a full range of the values. and motivation. Discussion about the
Rules will support this process. Examples usefulness of the process and expected
are ‘the full range of values must be used,’ benefits should take place here, not later
‘all factors shall be scored,’ ‘no blanks during the process. Items to discuss here
allowed’ and ‘homogeneous distribution of are: e.g. how to perform the individual
values assigned to a factor.’ preparation, explanation of the tools to be
used, and the applicable deadlines. The test
TWEETABLE
manager also provides an overview of the
During the latter part of test remainder of the process. It should be made
execution, one should focus on clear to the stakeholders what to expect at
areas where defects have been the consensus meeting organized at a later
found, and one should generate stage and what will happen at the end of the
process. The test manager explains to the
more tests aimed at the type of
stakeholders what their role in this process
defect detected before. is and how their contribution influences the
test approach for the project.
TWEETABLE
The risk items and factors are explained in
One needs to check that the
detail, as the stakeholders will be requested
input documents are at the to score them. The exact meaning of the
14 required quality level and that risk items, the factors, the value set(s) and
PAGE
they contain the items that can assumptions must be made very clear to
be used in this process. The them. To obtain reliable results there must be
inputs don’t have to be ‘final’ a common understanding of all properties of
or ‘accepted,’ but sufficiently the risk assessment. At the end of the kick-off
meeting, commitment from the stakeholders
stable to use them for product
is needed, to ensure they will participate in a
risk analysis. meaningful manner.
7.2 Kick-Off
7.3 Individual Preparation
Optionally, a kick-off meeting can be
organized in which the test manager explains During the individual preparation values
to all stakeholders their role in the process. are assigned to the factors per risk item
Although optionally, the kick-off meeting by the participants. The participants score
is highly recommended. After the kick-off by selecting (the description of) the value
meeting the process and expected actions that fits best with the perceived risk for the
should be clear to all participants. The kick- corresponding factor regarding a risk item.
off meeting can also be used to explain the This step can be done manually, but is often
list of risk items, factors and to make clear to supported by providing the participants with
which factors they have to assign a value. Excel sheets that even support automatic
checking against the scoring rules.
The kick-off phase of the risk analysis is
to ensure that not only the process, the
concept risk based testing and the risk
matrix, but also the purpose of the activities
Practical Risk-Based Testing - Product Risk Management: The PRISMA Method
The values filled in by the stakeholder are the scoring process, clearer testing priorities
based on perceived risks: the stakeholder can be set at a later stage thanks to a more
15
expects that something could go wrong differentiated scoring.
PAGE
meeting. 6.
When the deadline is approaching, the test The threshold value and the other rules are
manager should remind the stakeholders to determined in the project rule set by the test
submit their score and return them in time. manager during the planning phase.
Stakeholders that didn’t return their scores on
time shall be approached individually. Before TWEETABLE
the next phase of the process is entered, A product that is directly coupled
there should be at least a representative to a process it controls is more
response from each group of stakeholders critical than a product whose
or roles in the project.
output is manually reviewed
Processing Individual Scores before use. If a product controls
a process, this process itself
The test manager now processes and should be analysed.
analyses the individual scores by calculating
the average value. He also prepares a list of TWEETABLE
issues to be discussed in the consensus
Risk tracking should be done
meeting. For each risk item the likelihood
and impact is determined. Per risk item the throughout the project. This
scores of the factors determining likelihood can be done by periodically
are added up and separately the scores of repeating (part of) the risk
the impact factors are added up. Each risk analysis and validating the initial
item can now be positioned in the so-called risk analysis.
risk matrix.
Practical Risk-Based Testing - Product Risk Management: The PRISMA Method
7.6 Define a differentiated test (or test cases) can be reviewed with
approach stakeholders or other testers.
- Re-testing
Based on the test items’ location in the With re-testing, also called confirmation
risk matrix, all test items are prioritised. testing, one can decide to re-run the full
The result is ordering of all test items, the test procedure or just the step that failed
most important item first. In addition to and re-test the defect solved in isolation.
prioritization a differentiated test approach - Regression Testing
for the test items needs to be defined based Of course the outcome of the risk
on their position in the risk matrix. The test assessment can also drive the regression
approach usually has two major aspects: test, whereby high-risk areas should be
the test depth used, and the priorities for most intensively covered in the
testing. Test depth can be varied by using regression test set.
different test design techniques, e.g. using - Level of Independence
the decision table technique on high-risk It is possible to have one tester define
test items and using ‘only’ equivalence the test cases and test procedures,
partitioning for low risk test items. The and another tester to execute the test
problem with varying test depth based on procedure. The independent test
the application of test design techniques is executor tends to be more critical
that not all (test) projects are mature enough towards the test cases and the way
to use a set of test design techniques. Many they are executed and as a result
test projects still just write test cases based will likely find more defects. Also for
18
on requirements and do not practice test component testing one can make pairs
PAGE
or other measured indicators like DDP detecting defects. The rationale behind this
(Defect Detection Percentage), (changed) goal is that test managers tend to use or
assumptions, updated requirements or not use a method according to the extent to
architecture. Changes in the project’s scope, which they believe it will help them perform
context or requirements will often require their job better. This determinant is referred to
updates of steps in the risk assessment as the perceived usefulness of the PRISMA
process. The risk assessment process method. However, even if a test manager
therefore becomes iterative. believes that a given technique is useful,
they may, at the same time, believe that it is
TWEETABLE too difficult to use and that the performance
Some functions may be used benefits are outweighed by the effort of
every day, other functions only using a more systematic approach. Hence,
a few times. Some functions in addition to usefulness, perceived ease-
of-use is a second important determinant to
may be used by many, some by
take into account.
only a few users. Give priority
to the functions used often and
heavily. To define these concepts in more detail, the
following definitions are introduced:
TWEETABLE
on these findings, we can conclude that test and people learn throughout the project
managers tend to consider PRISMA easy that risk assessment also should be
to use provided that they have received the treated this way, and (partly) repeated at
necessary initial practical support. Hence, several instances, e.g. at milestones.
there is a large probability that test managers
adopt the method in their test management Finally, one company released defect
practices. Also beware that within the numbers recently after measuring DDP for
organisation for stakeholders implementing several years at alpha level. In addition to a
the method is a real change process. The shorter test execution lead time their DDP
process is simple but not easy. has improved by approximately 10% after
the introduction of PRISMA in 2006, as
can be observed by looking at the graph
hereafter.
References Biography
• Bach, J. (1997), Good Enough Quality: Drs. Erik van Veenendaal
Beyond the Buzzword, in: IEEE (www. erikvanveenendaal)
Computer, August 1997, pp. 96-98 nl) is a leading international
• Bach, J. (1998), A framework for consultant and trainer, and a
good enough testing, in: IEEE Computer widely recognized expert in
Magazine, October 1998 the area of software testing
and quality management with over 20 years
• Boehm, B.W. (1979), Software
of practical testing experiences. He is the
engineering economics, Prentice-Hall,
founder of Improve Quality Services BV
Englewood Cliffs, NJ
(www.improveqs.nl). At EuroSTAR 1999,
• Jørgensen, M. (1994), Empirical studies
2002 and 2005, he was awarded the best
of software maintenance, Thesis for the
tutorial presentation. In 2007 he received the
Dr.Sceintific degree, Research Report
European Testing Excellence Award for his
188, University of Oslo
contribution to the testing profession over
• Gerard, P., and N. Thompson, Risk- the years. He has written numerous papers
Based E-Business Testing, Artech House and a number of books, including “The
Publishers, ISBN 1-58053-314-0 Testing Practitioner,” “ISTQB Foundations
• Karlsson, J. and K. Ryan (1997), A of Software Testing” and “Testing according
Cost-Value Approach for Prioritizing to TMap.” Erik is also a former part-time
22 Requirements, in: IEEE Software, senior lecturer at the Eindhoven University
September 1997 of Technology, vice-president of the
PAGE
w w w. e u r o s t a r c o n f e r e n c e s . c o m