
computers & security 62 (2016) 296–316

Available online at www.sciencedirect.com

journal homepage: www.elsevier.com/locate/cose

The simulated security assessment ecosystem: Does penetration testing need standardisation?

William Knowles a, Alistair Baron a,*, Tim McGarr b

a Security Lancaster, School of Computing and Communications, Lancaster University, Lancaster LA1 4WA, UK
b British Standards Institution, 389 Chiswick High Road, London W4 4AL, UK

ARTICLE INFO

Article history: Received 11 November 2015; Received in revised form 30 May 2016; Accepted 7 August 2016; Available online 11 August 2016

Keywords: Penetration testing; Security; Evaluation; Standards; Assessment

ABSTRACT

Simulated security assessments (a collective term used here for penetration testing, vulnerability assessment, and related nomenclature) may need standardisation, but not in the commonly assumed manner of practical assessment methodologies. Instead, this study highlights market failures within the providing industry at the beginning and ending of engagements, which has left clients receiving ambiguous and inconsistent services. It is here, at the prior and subsequent phases of practical assessments, that standardisation may serve the continuing professionalisation of the industry, and provide benefits not only to clients but also to the practitioners involved in the provision of these services. These findings are based on the results of 54 stakeholder interviews with providers of services, clients, and coordinating bodies within the industry. The paper culminates with a framework for future advancement of the ecosystem, which includes three recommendations for standardisation.

© 2016 Elsevier Ltd. All rights reserved.

1. Introduction

In the presence of the seemingly inexorable increase in cyber attacks, how should organisations best pursue self-examination to accurately determine their resilience to such threats? One approach has been through the increase in services sold to these organisations that intend to replicate the methodologies and techniques (technical and social) of both internal and external malicious attackers, which are branded using a complex, and often confusing, set of terminologies – something that we collectively describe here as simulated security assessments. This paper seeks to explore the context in which these services are delivered, in order to determine best practices, and opportunities for further advancement.

The collective terminology of simulated security assessments uses the notion of simulation as it is established by Such et al. (2016) in their definition of information assurance techniques. Simulation here is the practical imitation of threat actors within real-world environments, as opposed to the virtual alternative. Although the concept bears a strong relationship to non-contractual vulnerability research and its formally crowdsourced and contractual counterpart (e.g., bug bounty programs), the work presented within this paper is primarily concerned with contractual services procured from third party organisations.

Examples of such simulated security assessments include red team exercises, penetration tests, social engineering, and vulnerability scans. In practice, each of these services that can constitute a simulated security assessment has subtle

* Corresponding author.
E-mail address: a.baron@lancaster.ac.uk (A. Baron).
http://dx.doi.org/10.1016/j.cose.2016.08.002
0167-4048/© 2016 Elsevier Ltd. All rights reserved.

differences, which are defined here for further context within this paper. The core feature of a vulnerability scan is its use of automatic tools; however, the use of automatic tools should arguably be followed up at minimum with a cursory manual review (e.g., false positive and negative verification), but at what level of analysis does manual review become a vulnerability assessment? Furthermore, how much manual review is required before an engagement can be considered a penetration test, or is such a label defined by the use of exploitation? This distinction is perhaps most notable and has been the most controversial within the security community. Labellings such as "IT Health Check" add a further dimension to this conundrum, as they exemplify the use of domain-specific labelling of types of simulated security assessment; in this case a form of penetration test. A long-term penetration test, possibly with greater scope of allowed activities (e.g., social engineering using approaches such as phishing or physical access), is often called a red team exercise, which attempts to test an organisation's cyber security capabilities against real-world simulations of persistent threats. Despite this wide variation in service offerings, the central motivation for their procurement is typically the same – to generate evidence (e.g., of the efficacy of security controls) that contributes as part of wider security risk management programs. Such evidence can then be converted into organisational-specific risks. The effectiveness of simulated security assessments in generating this evidence for security risk management is exemplified in how their usage is no longer restricted to organisations pursuing the "extra mile", but in how it is rapidly becoming a mandatory requirement as part of many organisational standards.

The market for simulated security assessments is in a relative adolescence, and the complexity and ambiguity of service models (both in terms of service definitions and what is delivered against them) may be a consequence of this. This dynamic, and the rapidly evolving nature of the market, establishes the need to assess the current state of affairs, and to establish if there is a requirement for standardisation.

The research presented in this paper has been conducted in partnership with the British Standards Institution (BSI), the UK national standards body, which is responsible for originating many of the world's most commonly used management system standards, such as ISO 9001 and ISO/IEC 27001. The overall aim of the research was to assess the need for standardisation in the area of simulated security assessments, and provide guidance on what any proposed standards should include. We do not, however, restrict ourselves to recommending only formal standards, and utilise the broader definition of standardisation offered by De Vries (1999):

"activity of establishing and recording a limited set of solutions to actual or potential matching problems, directed at benefits for the party or parties involved, balancing their needs and intending and expecting that these solutions will be repeatedly or continuously used, during a certain period, by a substantial number of the parties for whom they are meant."

The benefits of standardisation have been widely discussed, with links to increases in productivity, globalisation, exports, and general economic growth (Blind et al., 2010; Swann, 2000, 2010), as well as the facilitation of innovation (Blind, 2013). Standards aid public procurement in decision making and risk management, but individuals involved with public procurement need to provide a greater input into the development of standards from the early stages (Blind, 2013). Any recommendations for standardisation must be grounded in the realities of industry practices and client experiences, i.e. to establish the actual or potential problems to address. It is also important to understand links to existing standards, what level of self-organisation of the industry has occurred, and if this requires explicit standardisation. The research questions that this study will address are thus:

1. What standards currently exist for simulated security assessments?
2. What coordinating bodies exist within the industry for organisations and individuals?
3. How are current offerings perceived and what are the prevailing issues surrounding simulated security assessments?
4. Is there a need for additional standards, or to modify existing standards?

A requirement for action was identified which led to the publication of a preliminary white paper (Knowles et al., 2015). This paper provides an expanded and academically-focused extension to this work and makes the following contributions:

1. A review of the simulated security assessment ecosystem, detailing standards requiring assessments, standards for providing assessments, and individual qualifications for those that do.
2. An analysis of the performance of simulated security assessment providers over three phases of an engagement: pre-engagement, practical assessment, and post-engagement. Findings are based on the experiences gathered through 54 stakeholder interviews with providers, clients, and coordinators.
3. A framework for future advancement in the simulated security assessment ecosystem, which includes both standardisation and industry-led work activities.

This study predominantly addresses the UK landscape, as this is where the stakeholder interviews were conducted. However, as will be evidenced in the remainder of the article, it can be argued that the UK is leading developments in this area. We will highlight examples of best practice in the UK that could be adopted internationally, as well as discuss international elements where appropriate. The recommendations for the advancement of the UK ecosystem should be broadly applicable in other countries.

The remainder of this paper is structured as follows. In Section 2 the academic literature concerning simulated security assessments is discussed. The methodology is outlined in Section 3. Section 4 then provides a review of coordinating bodies, standards, and qualifications within the simulated security assessment ecosystem. The simulated security assessment engagement process is then broken into phases in Section 5, which includes a discussion on stakeholder practices and experiences. Recommendations for future standardisation activities, along with industry-led improvement,

are described in Section 6. The paper is then concluded in Section 7.

2. Related literature

Despite the significant body of academic literature that exists on specific techniques within technical security assessments, there has been limited focus on the wider ecosystem of simulated security assessment services. The literature that has attempted to address this domain falls into four broad areas.

2.1. Software development life cycle

The first has emphasised the role of simulated security assessments within the Software or System Development Life Cycle (SDLC), which has largely arisen through the U.S. Department of Homeland Security project "Build Security In". Software security best practices were discussed by McGraw (2004), emphasising the importance of integrating security into the SDLC, in particular the use of penetration testing. A similar emphasis was placed by van Wyk and McGraw (2005), who outline various touchpoints (activities) in the SDLC to achieve this, including code review during implementation (a derivative assurance technique (Such et al., 2015) of penetration testing) and penetration testing during configuration and deployment. A further study by Arkin et al. (2005) focused specifically on software penetration testing, which emphasises the use of tools during the SDLC (e.g., static and dynamic analysis tools), along with the importance of contextualising assessments according to perceived risk posture.

2.2. Procurement

A second area of research has centred around the procurement of simulated security assessments. A high-level introduction to the motivations for doing so is provided by Hardy (1997), while Bishop (2007) provides discussion around the thought processes involved in scoping a meaningful penetration test, and addresses topics such as goal setting, attacker knowledge, resources, and ethics. Geer and Harthorne (2002) highlight the contradictory drivers for penetration tests, which arise through clients desiring advertisable (in the context of demonstrating security to stakeholders) yet meaningful findings, and providers of services wanting to primarily succeed in the engagement's objectives (i.e., to discover flaws). The importance of penetration testing being only one part of a wider security risk management programme was stressed by Midian (2002), while also introducing common issues found during assessments. The use of counterfactuals (i.e., what-if scenarios) in the process of security reasoning is explored by Herley and Pieters (2015), who conclude that despite their challenges they are a "necessary evil". Penetration testing as a form of impersonation of such a counterfactual is discussed. Two further papers focus on procurement from the client's perspective, including establishing requirements for organisations providing such services. Tang (2014) emphasises the importance of organisational standards, and proposes that procurers should look for companies certified by the Communications-Electronics Security Group (CESG¹), IT Health Check Service (CHECK), the Council of Registered Ethical Security Testers (CREST), or ISO 17025 (for testing laboratories), and also have ISO/IEC 27001. The requirement for CREST certification is also proposed by Yeo (2013). Both CHECK and CREST are UK organisational certifications to provide independent assurance of simulated security assessment providers, and also offer individual qualifications. A high-level introduction to each, along with other standards and individual qualifications in the UK, is provided by Xynos et al. (2010). This topic is examined in greater depth and supplemented with perceptions of stakeholders in this paper in Section 4.2.

2.3. Methodologies

The third area concerns research around the methodological characteristics of simulated security assessments. In most cases, methodologies have been established at a high level of abstraction. An early paper by Pfleeger et al. (1989) outlined such a methodology, while proposing breaking systems into objects that undergo transactions. A four-element methodology (planning, discovery, exploit, reporting) was proposed by Bechtsoudis and Sklavos (2012), who further describe a case study to identify common security issues and their implications. Goel and Mehtre (2015) provide a basic introduction to the importance of vulnerability assessments and penetration testing, along with a linear methodology. Yeo (2013) provides a more detailed methodology for the reconnaissance and attack phases of a penetration test, which is largely linear, with a cyclical element where compromise leads to the requirement for further enumeration. The modelling of penetration tests using Petri nets has been explored by McDermott (2000). Thompson (2005) provides discussion on penetration testing across three areas: building a test plan (e.g., in relation to a threat model), executing this plan (e.g., in terms of the types of testing, such as dependency testing), and the output of this process (e.g., having clear, detailed reproducible exploit scenarios). Geer and Harthorne (2002) also discussed five aspects of application penetration testing: why (e.g., motivations for testing), who (e.g., assessor characteristics), what (e.g., the testing methodology), where (e.g., in terms of application subsystems), and when (e.g., at which point in the SDLC). Limited research has examined the application of such methodologies to niche scenarios. One exception to this can be found in the work on social engineering by Dimkov et al. (2010), which proposes two methodologies: one which measures the environment surrounding a target asset, and does not involve social engineering the asset owner, and a further methodology which is more general, where the asset owner is in scope and unaware of the assessment.

2.4. Education, training and ethics

The fourth area revolves around ensuring the competencies of individuals performing simulated security assessments. A

¹ CESG is the information security arm of the UK Government Communications Headquarters (GCHQ).

study by Guard et al. (2015) assessed the characteristics of students most suited to conducting penetration tests within testbed environments. Skillset requirements and career development of penetration testers were discussed at a high level by Caldwell (2011). Qualifications were explicitly referenced, which include three of the UK bodies, namely CHECK, CREST, and the Tigerscheme, along with the more internationally focused EC-Council Certified Ethical Hacker (CEH) and ISC2 Certified Information Systems Security Professional (CISSP). The challenges of designing information security courses that have potentially damaging consequences were examined by Logan and Clarkson (2005), who identified a need for greater emphasis on the integration of ethics into courses. This need was also proposed by Saleem (2006) and Pashel (2006), both of whom provided a short analysis on teaching penetration testing, where the main emphasis was placed on preventing misuse of taught skillsets. At a high level, such ethical issues were discussed by Jamil and Khan (2011) and Smith et al. (2002), with the latter highlighting the ethical dilemma created by a community which releases security assessment tools, and then also protects clients against individuals that use them. A similar theme can be found in Matwyshyn et al. (2010), who explored the ethics of security vulnerability research. Only three further papers provided a degree of detailed ethical analysis. The first was produced by Pierce et al. (2006), who established a conceptual model for ethics within penetration testing which contains five ethical themes, of which integrity is at the core. The second was by Ashraf and Habaebi (2013), who reviewed penetration testing ethics within the context of the Islamic faith. The third was by Mouton et al. (2015), who examined the requirement for ethics specifically within the context of social engineering. Additional work has examined the ethical dimensions of conducting the most common type of social engineering attack, phishing. Finn and Jakobsson (2007) present three experiments to measure susceptibility to phishing attacks, discussing how to make them realistic, and ethical. Although not specific to penetration testing, Reece and Stahl (2015) have also conducted a UK-based stakeholder-led study to examine perceptions around the professionalisation of information security, which found mixed levels of support.

2.5. Summary

Four areas of academic research have been highlighted for simulated security assessments; however, two main criticisms can be applied to existing research. Firstly, while the benefits of such assessments are widely espoused, there has yet to be a detailed review of the ecosystem for both the organisations providing such services and the qualifications for individuals that work inside such organisations. Such standards and qualifications have been highlighted here where mentioned; however, in each case, references are largely cursory and a lack of analytical depth is evident. It does, however, highlight the dominance of UK standards and qualifications, which provides some validation of this paper's UK-centric focus. Secondly, there is an absence of any empirical review of the effectiveness of penetration testing in practice. Research has developed knowledge or established the need for knowledge in dynamic areas such as methodologies and ethics; however, we argue that future research requires a greater understanding of what occurs within real-world engagements.

3. Methodology

This paper presents the first comprehensive analysis of the simulated security assessment industry. The analysis focuses on UK-based services with reflections on the wider international implications. The study's methodology is visualised in Fig. 1. The project can be seen to span three phases, which encompass the research questions outlined in Section 1. Within Fig. 1 the arrowed lines are used to denote contributing relationships; namely, between the sequential phases, the activities within these phases, and where the data sources originate that are used in their findings.

The first phase frames the contemporary ecosystem for simulated security assessments. This phase resulted in the comprehensive review of individual qualifications, organisational standards for those providing simulated security assessments, organisational standards that mandate an assessment,

[Fig. 1 – Methodology. The figure shows three sequential phases and their data sources: Ecosystem (Phase 1: enforcement bodies, organisational standards, individual qualifications), Performance (Phase 2: pre-engagement, practical assessment, post-engagement), and Progression (Phase 3: requirement for standardisation, opportunities for community development). Data collection comprised desk research and interviews with providers, clients, and coordinators.]

and the bodies that enforce them. Only through an understanding of what exists can the understanding of its performance be contextualised. This performance was the focus of phase two. To understand the realities of real-world engagements, in-depth interviews were conducted with 54 stakeholders about their experiences across three phases of an engagement: pre-engagement (e.g., scoping), practical assessment, and post-engagement (e.g., reporting). The simulated security assessment ecosystem is not a static entity; it is under continual evolution. Phase three is concerned with ensuring such progression occurs in a manner that benefits all industry stakeholders, with the central objective of improved client security. The role of standardisation is considered here, along with opportunities for improvement by the industry itself. This article has been structured around the three phases: phase one (ecosystem) can be found in Section 4; phase two (performance) in Section 5; and phase three (progression) in Section 6.

Two data sources were used within this paper: desk research and interviews. Each is described further within Section 3.1 and Section 3.2.

3.1. Desk research

The predominant mode of data collection used within phase one was desk research, which established the foundations for later research within phases two and three. Data sources for desk research fall within four categories.

1. Academic literature was consulted to frame the research from the perspective of academia. A scarcity of existing literature was identified (see Section 2), which resulted in the majority of data being sourced from the remaining three sources.
2. Formal standards (e.g., those from ISO, IEC and BSI) were reviewed; in total, over 40, after a preliminary analysis to identify those that had relation to the ecosystem.
3. Consortia standards, which unlike formal standards that are published through international and national standards bodies, required data to be collated through public sources. This included the analysis of publicised information on websites (e.g., of trade associations) and specification documents (e.g., policies for membership requirements).
4. Community-led literature was also reviewed, which encompasses what has been produced by both individuals and collectives (e.g., technical reports by organisations and community standards).

3.2. Interviews

Stakeholder interviews were conducted to gather the perceptions and experiences of the challenges and opportunities present within real-world engagements and the wider ecosystem. Such interviews were primarily used within the analysis of phase two, but also informed phase one (e.g., for where information was not publicly available) and phase three (e.g., for perceptions of future industry direction). In total, 54 stakeholders were interviewed across 46 separate interviews. Stakeholders can be divided into three categories: those that provide assessments (providers), those that receive assessments (clients), and those bodies enforcing requirements on the other two stakeholder types (coordinators). The composition of each category is described below and visualised in Fig. 2. The duration of provider interviews was between 40 minutes and 2 hours, for clients between 25 minutes and 1 hour 15 minutes, and for coordinators between 20 minutes and 1 hour 30 minutes.

[Fig. 2 – Stakeholder composition. The chart reports, for each category, the number of stakeholders, interviews, and organisations: Providers (32, 27, 22), Clients (15, 12, 12), Coordinators (9, 9, 7).]

Providers: There are 32 stakeholders across 27 separate interviews and 22 provider organisations. This includes providers of simulated security assessments in its various forms. All but 2 stakeholders were based in the UK. Out of 32 providers, 10 (across 8 organisations) were from CHECK accredited organisations, and 18 providers (across 13 organisations) were from CREST member companies. Insufficient information was collected to calculate the number of CHECK or CREST qualified individuals that were interviewed. Such individuals do not need to work for CHECK or CREST organisations, nor does one have to be CHECK or CREST qualified to work for one.

Clients: There are 15 stakeholders across 12 separate interviews with 12 client organisations. To achieve a broad representation of client experiences, the organisational size of clients interviewed was highly varied. Client stakeholders ranged from micro enterprises (i.e., <10 employees) to large enterprises (i.e., >250 employees, with 3 stakeholders in organisations of >1000 employees, which included a financial institution). Furthermore, 5 representatives from UK local government were interviewed. Included within the total client count were 2 stakeholders who worked in consultancy roles (e.g. in one case, as a CESG Listed Advisor Scheme (CLAS) consultant) to procure penetration tests and identify remediation strategies for third parties.

Coordinators: There are 9 stakeholders across 9 separate interviews with 7 coordinators. Included within this count were 2 stakeholders from provider organisations who also spoke about their roles within a coordinating body. This count of 9 stakeholders does not include the 3 providers who spoke about

their work on the community standard, the Penetration Testing Execution Standard. Stakeholder organisations were: CESG; CREST; the British Standards Institution (BSI); the UK Department for Business, Innovation and Skills (BIS); Tigerscheme; Information Assurance for Small and Medium Enterprises (IASME); and Quality Guild (QG) Management Standards.

4. The simulated security assessment ecosystem

This section describes two perspectives on the simulated security assessment ecosystem. A client-focused perspective is taken in Section 4.1, which reviews standards that mandate particular forms of assessment. Section 4.2 then addresses this from the provider perspective in terms of qualifications, consortia/private standards, formal standards, community standards and methodologies for those delivering such assessments. Preliminary findings from the interviews for each section are discussed, with a more expansive analysis of their implications in practice given in Section 5.

4.1. Standards requiring simulated security assessments

There are a multitude of reasons why an organisation would procure or conduct a simulated security assessment. One notable driver arises through recommended or mandatory assessment requirements as part of a wider formal or private/consortia standard. The standards which make explicit reference to a requirement for some variation of a simulated security assessment are discussed below.

Cyber Essentials (Department for Business Innovation and Skills, 2014) is an entry-level organisational standard that provides basic assurance that an organisation is meeting minimum cyber security control requirements. It is targeted at private, not-for-profit and public organisations of all sizes, although it has particular relevance for small and medium enterprises (SMEs). It outlines two levels of certification: basic (no formal label) and Plus. The Cyber Essentials standard requires the completion of a self-assessment form for basic certification, and self-assessment plus a third-party security assessment for Plus (including an external and internal vulnerability scan).

PCI DSS (the Payment Card Industry Data Security Standard) (PCI Security Standards Council, 2016) enforces a business requirement for the information security of organisations handling payment transactions, including those by credit and debit card. Compliance with PCI DSS is not a legal requirement (with certain geographical exceptions), but instead a requirement enforced through business terms (e.g., non-compliance can result in penalty fines). Requirement 11.2 mandates quarterly vulnerability scans, while requirement 11.3 mandates penetration tests at least once per year and with any significant infrastructure or application modification.

ISO/IEC 27001 – an Information Security Management System (ISMS) standard – can be contributed to with simulated security assessments as audit evidence. In this case, such assessments are encapsulated as a security control under the umbrella of a "technical compliance review". Security controls within ISO/IEC 27001 are not mandatory (organisations opt in or opt out of security controls based upon a risk assessment), and therefore those under audit to pursue such certification are under no obligation to possess audit evidence of the results of a simulated security assessment. ISO/IEC 27001 is, however, widely used as the basis for other assurance schemes, and in some cases the technical compliance review becomes a mandatory requirement (potentially along with other security controls). One such example is the CESG Assured Service (Telecoms) – CAS(T)² – for telecommunication environments.

IT Health Checks (i.e., CHECK assessments) are another government standard requiring some form of simulated security assessment. IT Health Checks are for UK public sector bodies (including local governments) who wish to participate within the network that interconnects them: the Public Services Network (PSN).

ISO/IEC 15408 – more commonly known as the Common Criteria – outlines the requirements for the secure functionality of IT products and for assurance measures applied to these IT products during a security evaluation (British Standards Institution, 2014). Penetration testing is frequently cited within the vulnerability assessment requirements of ISO/IEC 15408-3:2008 (British Standards Institution, 2009).

Interview findings of standards for clients

Two important survey findings were made regarding the use of simulated security assessments for compliance where security controls are established.

Firstly, stakeholders from all three categories felt the link between simulated security assessments and ISO/IEC 27001 was currently disparate and poorly documented. Two interrelated approaches for establishing a link emerged from early interviews, and subsequent stakeholder views were widely positive for both. The first approach was to establish a clear link between the activities within a simulated security assessment and ISO/IEC 27002 security controls, and the second was to establish greater auditor guidance for using assessment findings as audit evidence, within the larger ISMS audit. Arguably, the former must happen to enable the latter. Criticism was expressed by one stakeholder, whose views are notable due to their proximity to the standardisation process. This stakeholder felt the approach was at odds with the ISO/IEC 27001 model, which was not about security in itself, but knowing insecurity and planning for continuous improvement. This stakeholder added: "Why favour a particular method over another? Why is penetration testing better than auditing security records?" Risk must be identified to be managed, however, and for other stakeholders, the enthusiasm was focused on the ability of simulated security assessments to assess controls in demonstrable terms.

Secondly, whilst the motivations behind Cyber Essentials were widely applauded, it was criticised for its lack of target market, while provoking explicit and widespread confusion about its implementation. Such confusion arose primarily from the heterogeneous approaches of the accreditation bodies. Frequent remarks concerned the integration of companion standards within Cyber Essentials, where vulnerability

² https://www.cesg.gov.uk/articles/policy-and-guidance-documentation-suite-cast.
Table 1 – Penetration testing qualifications.

Level          CESG CHECK    CESG CCP                     CREST                    Tigerscheme   Cyber Scheme
Entry          N/A           SFIA Responsibility Level 3  Practitioner (CPSA)      AST           CSA
Intermediate   Team Member   Level 4                      Registered (CRT)         QSTM          CSTM
Advanced       Team Leader   Level 5, Level 6             Certified (CCT)          SST           CSTL
Red Team       N/A           N/A                          STAR (CCSAM and CCSAS)   N/A           N/A
assessments were required (or where they were not), and the separation of accreditation and certification status. Some providers further questioned whether consistency could be achieved due to the ambiguity in the testing guidelines, and the subjectivity required to implement them.

4.2. Standards and qualifications for providing simulated security assessments

Competence requirements can be established at both the organisational and individual levels. This section provides a discussion of the current state of the market for both, along with the views of stakeholders on the requirement for modified or new standards in these areas.

4.2.1. For individuals
Budding and established professionals are now faced with a multitude of choices for qualifications across a range of skill levels and topic scopes. UK qualifications for simulated security assessments primarily arise from four providers: CESG, CREST, Tigerscheme3 and the Cyber Scheme4.

3 http://www.tigerscheme.org.
4 http://www.thecyberscheme.com.

CESG has established a qualification scheme for the IT Health Check Service (CHECK), which has been in operation for over a decade. The two levels of CHECK qualification are the CHECK Team Member and the CHECK Team Leader; the latter is split into two qualifications for infrastructure and web applications. The current format requires candidates to have obtained a certain type and level of industry qualification and Security Clearance (SC) to allow them to handle sensitive data, amongst other publicly undisclosed factors. The three remaining qualification bodies, CREST, Tigerscheme and the Cyber Scheme, provide the industry qualification. Their content, level and equivalence to CHECK qualifications are shown in Table 1.

CREST has since emerged as the predominant industry-led professional qualification body within the UK, and its qualifications can be seen to span four tiers. In order of required proficiency, they are CREST Practitioner (requiring an estimated 2500 hours of experience), Registered Tester (6000 hours; CRT), Certified Tester (10,000 hours; CCT) and Simulated Targeted Attack and Response (STAR). It is at the Certified tier that specialism occurs in the areas of infrastructure or web application security. STAR is a framework created to provide intelligence-driven red team penetration tests for the critical infrastructure sectors. Currently the main implementation of STAR is CBEST,5 which specifically targets the financial sector; STAR qualifications ensure that competency requirements are met to perform such engagements. Two forms of STAR qualifications exist: those for managers (i.e. those who lead STAR teams) and those for technical specialists. The Tigerscheme and Cyber Scheme qualifications follow a similar structure to the lower three CREST qualifications, each with a beginner, intermediate and advanced qualification. An equivalent to CREST's STAR qualifications is not available from other qualification bodies.

5 http://www.bankofengland.co.uk/financialstability/fsc/Pages/cbest.aspx.

CESG has also launched a separate scheme, which forms a competence framework that is described as a certification rather than a qualification: the CESG Certified Professional (CCP). The CCP is a framework that defines seven roles, one of which is "Penetration Tester". Each role has different levels of competence, which are aligned with the responsibility levels defined by the Skills Framework for the Information Age (SFIA)6 and the skill levels defined by the Institute of Information Security Professionals (IISP). Four levels are defined for the Penetration Tester role: SFIA Responsibility Levels 3, 4, 5 and 6. CCP, while listed in Table 1, does not currently contribute to a CHECK qualification assessment.

6 http://www.sfia-online.org.

The list of qualifications described is not exhaustive and is UK focused, which meant many non-UK training courses and qualifications have been omitted; this is a consequence of interview findings, which found a greater emphasis on UK qualifications for recruitment. Despite this, there are two qualifications worthy of note, both of which are from US-based providers. Firstly, the International Council of Electronic Commerce Consultants (EC-Council) Certified Ethical Hacker (CEH). This qualification can be positioned at the lower spectrum of the entry-level classification. Despite such positioning, it was identified to be frequently cited within job advertisements (typically supplementary to those of Table 1) and it has seen integration into some UK academic courses. CEH is not assessed through virtual lab examination, however, and uses only multiple-choice examination. Secondly, the Offensive Security Certified Professional (OSCP). In contrast to CEH, the interviewees perceived OSCP to be increasingly popular within the UK market for recruitment due to its rigorous technical virtual lab examination, and it is the predominant requirement for US-based organisations. Since the conclusion of this interview process, CREST has partnered with Offensive Security
to establish equivalency between OSCP and CRT.7 This allows holders of OSCP to obtain CRT subject to certain stipulations (i.e., a fee and a multiple-choice and long-form examination within six months). CRT obtained in this manner, however, cannot be used as part of the CHECK application process.

7 http://www.crest-approved.org/uk/examinations/oscp-and-crt-equivalency/.

Interview findings for individual qualifications
The consensus amongst stakeholders was a strong opposition to any form of new standard for individuals. Opposition was twofold. Firstly, the techniques and skills used evolve at a rapid pace, which would be infeasible to capture and keep current within a "standards" type document. Secondly, whilst the current system is not without fault, existing consortia providers within the UK have done an exemplary job of raising, setting and assessing the competence of individuals that conduct simulated security assessments, and furthermore, the UK is ahead of the rest of the world in this regard.

Many stakeholders, however, did feel that there was a growing need for an independent body, modelled in the same vein as medicine or law, in order to continue the professionalisation of simulated security assessments. Taking medicine as an example, key indicators of its professionalisation are the internationally recognised standards for its practices, and regulation bodies for individuals with powers such as being able to revoke the right to practise. Some providers felt that standards bodies, such as BSI, could facilitate the internationalisation process by working with technical assessors such as CREST and Tigerscheme. However, not all stakeholders were positive about such an endeavour. It was noticeable that those supportive were predominantly in positions of management, whose natural proclivity is one of control. Those engaging in the practical elements of security assessments, the practitioners, had fears about the potential future exploitation of such a scheme to regulate those who wish to conduct cyber security research. It was felt that such a situation would negatively impact the industry as a whole, and lead to the loss of the UK's competitive advantage.

4.2.2. For organisations
Confidence in a provider's process readiness to deliver simulated security assessments can be created through organisational standards. Such standards fall broadly into two areas: those from consortia and those from formal standards organisations (e.g. BSI directly or ISO/IEC). Private/consortia standards are those used most predominantly; such standards can be visualised using a tiered model, in which the hierarchy is formed by the level of intended rigour of the simulated security assessments that they provide. The reader should note that this is a rough categorisation, and there remains potential for offerings to move between tiers based upon client requirements. This tiered model is presented in Fig. 3.

[Fig. 3 – A tiered model of provider standards. Tiers from lowest to highest expertise: One – Cyber Essentials (vulnerability scan); Two – Cyber Essentials Plus, PCI DSS ASV; Three – CREST, PCI DSS penetration test; Four – CHECK (specialised penetration testing); Five – CBEST/STAR (red team exercise).]

Tier One is concerned with external vulnerability assessments. The Cyber Essentials scheme (Department for Business Innovation and Skills, 2014) was introduced in Section 4.1, along with its two types of certification: Basic and Plus. Provider organisations can be accredited (i.e., to become certification bodies) to deliver one or both levels by one of the four accreditation bodies.8 Only one accreditation body, CREST, has implemented their version of the Cyber Essentials scheme to deliver simulated security assessments at this tier. CREST mandates that external vulnerability assessments must be conducted, on top of the core requirements of the scheme (i.e., a self-assessment form), in order to ensure border controls are properly implemented (CREST, 2014).

8 This can be seen through the example of provider organisations accredited by CREST: http://www.cyberessentials.org/certifying-bodies/.

Tier Two expands the engagement scope of Tier One vulnerability assessments. Two main standards exist. The first is Cyber Essentials Plus. All accreditation bodies deliver assessments that meet the criteria set forth by the Common Test Specification (CESG, 2014), which outlines the mandatory internal and external assessments, along with success criteria. Assessments include vulnerability scans and configuration reviews (e.g., assessing ingress malware filtering at the boundary, server, and workstation levels). The second fulfils requirement 11.2 of PCI DSS for periodic vulnerability scans. The PCI Security Standards Council mandates that such vulnerability scans must be conducted by an Approved Scanning Vendor (ASV). Certification to become an ASV involves a simulated assessment on PCI Security Standards Council infrastructure to evaluate technical competence, and organisations must recertify annually.

Tier Three sees a shift towards increasingly adversarial engagements, and what are widely considered penetration tests. The most well-established industry-led body providing certification here is CREST. The application process covers four domains of organisational capability: (a) information security; (b) operating procedures and standards; (c) methodology; and (d) personnel security, training and development. Both the
ISO/IEC 27001 and ISO 9001 management system standards are referenced in CREST's guidance for applicants, but not mandated. However, CREST does require evidence of operational commitment to the implementation of an information security management system (ISMS) and a quality management system (QMS). Furthermore, CREST requires a clear and documented complaints procedure for Member Companies, with an escalation path that makes direct reference to the CREST complaints process for independent arbitration. PCI DSS requirement 11.3 for periodic penetration tests does not mandate any certification requirements for providing organisations. Instead, a de facto standardisation is enforced (represented by the red area within Fig. 3) through the certification process, when an appointed individual and/or organisation assesses whether requirements have been met to an appropriate level. To facilitate this process, PCI DSS has released supplementary penetration testing guidance (PCI Security Standards Council, 2015), which includes recommended provider competencies (e.g. qualifications), methodologies (notably including emphasis on the importance of exploitation), and reporting. Within competency guidelines, it is noteworthy that the only organisational certifications promoted were the UK's CHECK and CREST.

Tier Four bears a close resemblance to Tier Three with respect to assessment expectations and required competencies; however, it differs in its establishment of non-standard requirements (e.g., for providers to have achieved security clearance). The primary standard here is CHECK, a governmental initiative operated by CESG, which enables approved organisations to perform penetration tests for government departments and public sector bodies (including Her Majesty's Government). Organisational approval requires evidence submission in two areas: (a) capability (e.g. testing methodology and examples of previous reports) and (b) the composition of a CHECK team (at least one CHECK Team Leader and one CHECK Team Member). There are a multitude of specialised engagement types that would fall into this category but have no formally defined offering. An example would be penetration tests involving safety-critical infrastructures (e.g., Industrial Control Systems). The requirements present within this level may also apply to some CREST engagements.

Tier Five engagements are those that require a similar or higher level of expertise as Tier Four, but differ predominantly in the length of the engagement and list of permissible activities. Tier Five may therefore be considered a form of red team engagement. Although many providers have red team capabilities, there currently only exists one organisational standard to provide oversight within this market. This falls under what CREST refer to as the STAR framework, which offers threat intelligence-led red team exercises to the critical infrastructure sectors. Currently the main implementation of STAR is CBEST (Bank of England, 2016a, 2016b), which establishes mandatory testing requirements in the financial sector. CBEST engagements may utilise government-issued intelligence provided by the UK Financial Authorities, which may not be available for other STAR engagements.

Formal Standards: The discussion has focused on the private/consortia standards that dominate this domain. However, formal (technical) standards also exist that describe activities relating to simulated security assessments, or mandate their use. One widely used standard for security evaluations is ISO/IEC 15408 (Common Criteria). The methodology for such evaluations is outlined in ISO/IEC 18045. Various challenges prevent the widespread application of ISO/IEC 15408 to simulated security assessments, such as high information requirements about target environments, and the challenges of applying it to a dynamic and live system. However, attempts to mitigate this have been provided in supplemental standards; for example, PD ISO/IEC TR 19791 extends ISO/IEC 15408 to cover operational environments, while ISO/IEC TR 20004 uses the Common Weakness Enumeration (CWE) and the Common Attack Pattern Enumeration and Classification (CAPEC) frameworks to support ISO/IEC 18045 vulnerability assessments.

Interview findings of standards for providers
Isolated criticisms were raised against consortia/private standards (e.g. a lack of independence from industry in their governance); however, the predominant voice amongst stakeholders was that they have done much to raise the standard of operational readiness and professionalism of providers within the UK. Indeed, approval for both CREST and CHECK was frequently cited as a motivation for the adoption of management system standards by providers, predominantly ISO/IEC 27001 and ISO 9001.

Although the benefits of some formal standards were espoused (e.g. the thoroughness of Common Criteria), the consensus was that it would prove difficult to implement these standards for the types of services and timescales for testing that clients were demanding. A small number of providers suggested standardising a methodology; however, this was mostly only seen as an option if standardisation was forced. Other stakeholders questioned the benefit of such a high-level standard, feeling there was already a significant quantity of information in the public domain on this topic. Efforts to create such a standard have been considered (and continue to be) by the subcommittee that developed the ISO/IEC 27000 series (ISO/IEC JTC 1/SC 27), although there is not yet any standard published or in development on this topic.

The performance of these standards and certifications for providers in practice (including methodologies used) will be discussed in Section 5.

4.3. Community standards and methodologies

Many of the guidelines and standards for conducting assessments have not come from formal standards institutions, but instead have been generated by the security community and other interested stakeholders. This section details those activities self-described as standards, and then finally those that can be broadly defined as guidelines and methodologies.

The Penetration Testing Execution Standard (PTES)9 is a community standard that provides guidelines for the full scope of penetration testing activities over seven stages: pre-engagement, intelligence gathering, threat modelling, vulnerability analysis, exploitation, post-exploitation, and reporting. Although the majority of these stages are technical evaluations, PTES itself does not specify technical guidelines on how to conduct a penetration test engagement, instead describing the process at a conceptual level.

9 http://www.pentest-standard.org.

However, PTES has produced a set of technical
guidelines to accompany the standard,10 which includes the specification of particular tools and instructions on their use.

10 http://www.pentest-standard.org/index.php/PTES_Technical_Guidelines.

A further community standard was produced by the Open Web Application Security Project (OWASP), whose aims are to improve the state of web application security through the provision of guidelines, reports, and tools. OWASP has created the Application Security Verification Standard (ASVS) v2.0, which outlines a methodology for assessing web application security controls.

Other publications are available that do not purport to be standards, but instead act as guidelines and methodologies. Some are generically focused, including the National Institute of Standards and Technology (NIST) Special Publication 800-115 (Scarfone et al., 2008) (which is considered best practice in PCI DSS; PCI Security Standards Council, 2016) and the Open Source Security Testing Methodology Manual (OSSTMM) (Herzog, 2010). Other guidelines and methodologies are more technology-specific, such as OWASP's Testing Guide for assessing web applications. Furthermore, there have been industry-specific publications that address the challenges of assessing systems within environments that require non-standard approaches during a simulated security assessment. For example, the United States Department of Homeland Security has produced high-level guidelines on the use and challenges of penetration tests for assessing Industrial Control System (ICS) security (U.S. Department of Homeland Security, 2011). A more thorough methodology is presented by the National Electric Sector Cybersecurity Organization Resource (NESCOR) (Searle, 2012), which addresses penetration tests for assessing electric utilities in particular. NESCOR includes within its scope guidelines for assessing embedded components, which is largely unaddressed within other methodologies in favour of network and web application security. An extension of this can be seen in the Advanced Metering Infrastructure (i.e., a subsystem within the smart grid) Attack Methodology (InGuardians, 2009). This publication outlines a methodology at the technical level for penetration testing embedded devices that exist outside of a utility's security perimeter (e.g., on customer premises).

Interview findings for community standards and methodologies
Due to the pragmatic nature of the publications described within this section, their impact will not be discussed here, but rather in Section 5, in order to frame this discussion within the context of experiences from real-world engagements.

4.4. Summary of key findings

Several key findings on the existing ecosystem and how it is viewed by stakeholders are worth highlighting:

• The link between simulated security assessments and ISO/IEC 27001 should be strengthened.
• The UK leads the rest of the world in raising, setting and assessing the competence of professionals conducting simulated security assessments.
• There is support for an independent body, similar to medicine or law, to continue the professionalisation of the field.
• Existing consortia/private standards have done much to raise the level of operational readiness and professionalism of providers within the UK, including the adoption by providers of management system standards (e.g. ISO/IEC 27001 and ISO 9001).

5. The engagement process

Stakeholders were also questioned on their practices and experiences around simulated security assessments. To contextualise the findings, the terminology for an engagement's subprocesses has been defined through a reference model in Fig. 4. This reference model was derived from stakeholder responses. The model splits an engagement into three broad phases: pre-engagement, practical and post-engagement.

[Fig. 4 – Phases of a penetration testing engagement. Pre-Engagement: Initial Interaction → Scoping Proposal → Project Sign-Off; Practical Assessment: Information Gathering → Threat Analysis → Vulnerability Analysis → Exploitation; Post-Engagement: Deliverable (Report) → Client Follow-Up.]

The pre-engagement phase is concerned with establishing the parameters of allowed activity for the practical assessment. There will be some form of initial interaction, which may be initiated by the provider or the client. A chosen methodology (e.g. questionnaires or interviews) will be used by the provider to generate a scoping proposal. This proposal may go through multiple rounds of negotiation. The client will then sign off on the proposal before the practical assessment begins.

The practical assessment phase involves the exposure of a client system or component to a simulated attack. The phase begins with information gathering. This may uncover systems that necessitate further discussions around the engagement scope (e.g. if a client uses systems owned or operated by third parties and there are questions of testing authorisation). A provider may conduct a threat analysis or move straight to its subsequent stage, a vulnerability analysis. Exploitation of identified vulnerabilities may occur in order to attempt penetration of the system, and to gain access to additional resources (e.g. sensitive data or higher system privileges). The subprocesses
of the practical assessment stage may go through multiple repetitions (e.g. a compromised system may be connected to another internal network which, if under scope, can also be attacked).

The post-engagement phase is concerned with the delivery of findings to the client, usually in the form of a written report. The majority of providers will supplement this with additional forms of client interaction (e.g. final meetings) in order to educate them about the findings and the remedial actions that need to be undertaken.

Comments resulting from the stakeholder interviews concerning these three stages of the penetration test engagement will now be discussed in turn.

5.1. Pre-engagement

"The quality of the marketing collateral of penetration testing companies leaves a lot to be desired. I think it's a marketplace that's shrouded in mystery and myth. It's very difficult as a person wishing to purchase penetration testing and IT Health Check services to assess the marketplace and find out whether or not your potential vendors will satisfy what you require, other than them being able to say that they're CREST or CHECK registered … it almost feels like you need to be an expert yourself to buy an expert to come in and help you … Being able to come up with a framework with which you can engage these suppliers, and understand the nature of the different tests that they will do, and how they will treat that information in terms of reporting it back, and there being some consistency across the marketplace … I think that would be a very welcome development."
– A client of penetration tests

5.1.1. Terminology
There was a notable sense of confusion and frustration amongst stakeholders about the ambiguity in what constitutes a penetration testing service. Such ambiguity was evident from the varied service definitions of providers, in particular around the level of exploitation that occurs during engagements. A number of providers stated that vulnerabilities within engagements were not exploited by default, with additional value provided through theorised exploitation and/or false negative and positive verification. This caused a commonly cited issue during the tender process, which was found to be increasingly common for the procurement of simulated security assessments. Clients were often found to be unable to differentiate between providers, even amongst some of those that had approved CHECK or CREST status, while providers argued that clients often failed to understand their requirements, provided limited opportunities for consultation, and made procurement decisions based predominantly on economic factors. Providers argued that this could lead to clients failing to procure a level of testing rigour appropriate to the requirements of their environment. Some providers felt this was in part because clients are not concerned with the quality of the test: clients are "just looking for a tick in the box and resent any issues found". The issue from client and provider perspectives is related and can arguably be reduced to issues with terminology for defining services.

The definition of consistent terminology was widely supported amongst providers (18) and clients (5).11 Another two providers expressed support at a conceptual level, but argued practical definitions were difficult to determine; the subjective nature of exploitation was cited as an issue by one provider. Questions around terminology arose from an early interview with a provider who suggested that the market would benefit from BSI working with industry partners to define testing types. This provider argued that "it might not be right; it might get slaughtered, but it's a stake in the ground", where clients can say they want a peer-defined simulated security evaluation (e.g. vulnerability assessment or penetration test) and have a clearer understanding of the service that they desire and what will be delivered.

11 These figures were reached without all stakeholders being questioned on the topic due to time restrictions.

As the support for terminology definitions suggests, the industry is acutely aware of the issues caused by the lack of precise terminology. One initiative with potential industry impact is the community standard, PTES. The upcoming version of the PTES is stated to have levels of testing. Supportive providers felt levels would empower clients and facilitate the process of procuring a certain type or level of test. If a provider was then to fail to deliver the requirements of that level, they would be in breach of contract. One non-PTES provider was supportive of a level approach; however, they urged caution with definitions, as part of the process of adding value is having the power to deviate. Such issues could easily be addressed through clarifying testing requirements in pre-engagement negotiations. If a standard is too specific, however, it could cause issues if clients do not need an aspect of that test, and do not understand why they do not need it. Providers would then need to deliver unnecessary services to meet that level.

An alternative solution proposed by one coordinator used a measure of the client's risk appetite to map onto industry services; however, it received strong opposition due to the difficulty in computing risk appetite (even amongst those versed in its specialism) and the lack of potential for internationalisation, where many providers wished to focus their efforts.

5.1.2. Scoping
The scoping procedures of providers were found to have formed a de facto standard, with strong commonalities in the basic stages (see Fig. 4), methodologies to derive client requirements (predominantly questionnaires), the structure of scoping proposals, and types of testing proposed (almost wholly white box). Client views on scoping were polemic. Providers were largely seen as providing adequate assistance; however, in some cases they were criticised for their excessive use of questionnaires and lack of face-to-face meetings, even by larger clients. One CESG Listed Advisors Scheme (CLAS) consultant argued that "you often can't interact with" penetration testing providers. For some providers, especially the micro enterprises, face-to-face meetings were pursued at all opportunities to differentiate themselves against the rise of "faceless" providers. Larger enterprises were considered to be the only clients who understood their requirements, and this often manifested in engagements with strict guidelines, goals and deliverables. Small providers were deemed to require greater
levels of assistance, but have grown increasingly knowledgeable over the past three to five years through their periodic audits. A common area of contention amongst stakeholders was found in industries that mandate simulated security assessments; notably in the CHECK scheme for Public Services Network (PSN) compliance and PCI DSS. Multiple CHECK-requiring clients expressed the opinion that their peers were intentionally narrowing the scope of engagements to minimise risk for any issues found due to the punitive nature of the scheme. Two CHECK providers stated that they had heard of such issues themselves. One client insisted that they were interested in having an expansive scope for the CHECK scheme. Such an approach provides financial benefits over having one approach for aspects critical to PSN compliance, and another non-mandatory test for other services. However, the punitive nature discourages such an approach, with the additional complexity that any issues found lead to poor reflection of security capabilities compared to peer organisations. The Cabinet Office does provide a four-page document (two pages of content) on scoping CHECK; however, for the clients the call was clear: greater guidance is needed to ensure consistency within the scheme.

5.1.3. Authorisation
Questions of authorisation arose when engagements involved third party services. All providers stated confirmation was sought before engagements began; however, the methods used varied. The preferred method involved the client signing a document to state their legal authority to test systems within scope, with the provider requiring no further proof. One provider stated this was because it was "too time-consuming to check it all". A minority of providers required explicit confir-

effort within this domain. "They have to understand that process and advise if they're wanting to push an export strategy for cyber security services."

5.2. Practical assessment

Known hostilities towards standardisation of the technical aspects of simulated security assessments led to a strategic choice in study design to focus on other aspects of engagements; however, stakeholder interviews identified three areas of note.

5.2.1. Exploitation
The first bears relation to the question of terminology, and concerns the extent of provider exploitation, which can be seen as a key differentiator for service definitions. For many providers, there was an aversion to exploitation. For some providers, their default policy was to not exploit, beyond basic false positive and false negative testing (e.g., for SQL injection, using a single apostrophe to raise a database error). In one case the provider expanded upon their approach, describing the creation of scenarios to determine if an attack is feasible on the balance of probability. Furthermore, one provider stated that while exploitation occurred on some of their engagements, most of their clients "do not like it", and so it is not conducted. Where exploitation was described as being used, the general consensus amongst providers was to only use exploits they were "sure about", to talk to clients if there is a risk to live services, and to liaise with the client and seek approval before exploitation. One provider described how sometimes clients will not want the provider to exploit, but would want them to
mation from the third party. Cloud services were an exception, continue on as if exploitation has worked. In such a scenario,
with providers often demanding email or written confirma- the provider would be given SSH access and continue
tion from the cloud service. Such authorisation was found to from there. For some providers this serves as a welcome
be obtained with relative ease, except for smaller or non- professionalisation of the industry, and a significant improve-
Western cloud services. Providers stated that undisclosed third ment from the time (described as only 510 years ago in one
party systems were often uncovered during the initial recon- case) when exploits were frequently untargeted (e.g., to the spe-
naissance stage of the practical assessment, notably with mail cific operating system and service patch), launched en-
servers, and that the lack of third party authorisation was a masse, and used without full knowledge of their contents. Other
common reason for delayed testing. providers were more critical of this shift. Although it was in-
frequently discussed in stakeholder interviews, for some
5.1.4. International engagements providers the short length of engagements meant that they felt
Providers were questioned on their understanding of the le- there was no longer time to do rigorous testing of proof-of-
gality for conducting engagements outside UK borders. Providers concept exploits before their use in engagements, which could
were largely unaware, with the bigger providers stating that also be a contributor to the move towards scenario-based, no
such engagements would be cleared by their legal depart- exploitation testing where no penetration occurs.
ment. The general approach was to offset risk onto the client
on the assumption they would have greater knowledge of local 5.2.2. Methodologies
laws, and to ensure to never stray from the scope set out for The second concerns methodologies and their limited use by
the engagement. Legal cover would then be provided by au- providers, which may reflect on the diversity of service models.
thorisation from the client. One provider felt there was not As one might expect, no provider followed one particular meth-
enough legal guidance around the Computer Misuse Act (1990), odology, and instead considered their methodology to be a
even within the UK. Where does the Misuse Act stop and a synthesis of community standards. Out of the 32 providers, 10
new law begin? This proved to be a bigger issue for smaller mentioned Open Source Security Testing Methodology Manual
provider organisations. One such provider stated that they have (OSSTMM) and 16 mentioned Open Web Application Security
had multiple enquiries about work from the USA; however, they Project (OWASP) in general terms, with three providers more
have not taken on business because they do not understand specifically mentioning the OWASP Testing Guide. Other meth-
how protected they are. This provider felt that UK govern- odologies in use during the interviews were the PTES by six
ment Trade & Investment department should be making more providers and those adopted by the National Institute of
Standards and Technology (NIST) (again in general terms, without reference to a specific standard) by three providers. One provider noted that NIST 800-115 "has gained prominence in the past year, due to the new PCI DSS version 3 standard", but described it as "old and invalid". Providers stated that no public methodologies were followed for high assurance and/or safety-critical environments requiring specialist approaches, such as industrial control systems, as in their experience none had been created (the ecosystem review of Section 4.3, however, showed a small number exist). It is also worth noting that both the CHECK and CREST schemes require organisations to have defined a methodology that is to be reviewed before acceptance onto the schemes. Although providers mentioned methodologies in this study, in some cases responses seemed more of an attempt to demonstrate that their methodologies received influences from external sources. A selection of elicited responses can be seen below.

"We're aligned to OWASP"

"Whatever's available"

"A combination of everybody's"

"Our internal methodology is based on all that stuff"

– Providers on methodologies

Provider responses did, however, demonstrate that there has yet to be a methodology (for the mainstream target environments: network, web and mobile) that has received widespread adoption within the industry. The lack of a consistent peer reviewed methodology does not necessarily indicate that providers are doing an inadequate job in terms of coverage. Providers frequently stated that they do their best to identify as many vulnerabilities as possible within the duration of the engagement. It does, however, suggest a reluctance to have an external methodology forced upon them. In the opinion of the providers, this allows them the flexibility to tailor their offerings to their clients. It might also be a reflection of the difficulties in defining a methodology in a fast changing, highly complex environment. Despite the lack of utilisation of peer reviewed methodologies, there was the consensus that methodologies play a key role in emerging markets, by improving client education and establishing a rough de facto standardisation of assessment activities. Such methodologies, however, were perceived to best manifest through industry and community efforts.

Standards that specify requirements for the practical assessment can be seen to enforce a form of methodological requirement, and therefore provider perceptions about this form of standardisation were understandably negative. Perceptions on this topic are best illustrated in light of OWASP ASVS, which, as a community developed standard and a document open to peer review, presents an ideal opportunity to discuss stakeholders' views on this form of standardisation. These views can largely be summarised as ignorance or indifference. With respect to the lack of knowledge about OWASP ASVS, multiple providers suggested a lack of awareness of many OWASP projects beyond the Top 10 and Testing Guide. As one provider summarised, "OWASP is interesting. For some reason outside the Top 10 their work is not picked up by the industry."

A small number of providers had heard of the OWASP ASVS, and two providers had done tests to it, but only on client requests. For some, ASVS was met with criticism due to its lack of differentiation between white and black box testing, and because it did not account for the limited willingness of clients to release source code. A counter to this criticism could be made, however, in that it is only the higher levels of verification within the standard that require source code review. A further provider felt that it would be impossible to meet the requirements of the standard in the short time frames clients are willing to procure for testing, even for their more experienced testers. Other providers were more positive, but never so far as having the desire to use it widely. One provider did state that although they did not use OWASP ASVS, because it was a third party document, it was good "as a sales tool". Another provider felt ASVS's failure to penetrate the market was not so much a failing with the standard itself, but due to a lack of demand in the buying community. "If the buying community is not asking for it, providers won't be willing to spend the time implementing it. If the buying community wanted it, the industry would be all over it."

5.2.3. Social engineering
Approaches to social engineering engagements were also discussed with providers. Services described fell into two categories: firstly, scenario-focused, in the manner that social engineering is traditionally understood (e.g., with a specific end goal); secondly, audit-based social engineering (e.g., to determine the awareness level of a department). Where human exploitation occurs within social engineering, providers offering this service described a robust sign-off process, often with multiple stages. A typical provider-described process involved scoping and the determination of Open Source Intelligence (OSINT) sources (e.g., social media), the use of OSINT to discover information about an organisation and its employees, the creation of attack scenarios, and the proposal of those scenarios to the client, who would then decide whether they wished the provider to proceed or not. In multiple cases the provider stated that the client would return a list of alternative targets for the scenarios to the ones proposed (e.g., OSINT might reveal information on a C-Suite member, but the client may want to target those in lower positions). Providers were generally adamant about the discussion of any social engineering attack before its use, otherwise "it's easy to get into hot water" and there is a potentially higher risk for things to go wrong. Most providers described their services as being scenario-focused, with a smaller number being audit-focused, although in some cases they offered both.

Client sign-off opens a separate issue: the ethical and legal aspects of social engineering. Two providers described situations where it was the client asking the provider for testing with potentially ethically dubious motives (e.g., perform a phishing attack on a department, crack the passwords, and provide the names of the 10 individuals with the worst passwords). These providers stated that it is often them as testers that have to inform the client that such behaviour would not be ethical. Despite this, the majority of providers did state that there were complex negotiations before social engineering occurs, which often involve a client organisation's Human Resources (HR) department, or, at the minimum, the provider suggests the client contact their HR department before testing.
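The multi-stage sign-off process described for social engineering engagements (scoping, OSINT gathering, scenario creation, and explicit client approval before any attack is used) can be sketched as a minimal state model. This is an illustrative sketch under our own assumptions: the stage names, transition rules, and the re-proposal behaviour when a client substitutes targets are hypothetical, not drawn from any provider's actual process.

```python
from dataclasses import dataclass
from enum import Enum, auto

class Stage(Enum):
    SCOPED = auto()      # targets and OSINT sources agreed with the client
    PROPOSED = auto()    # attack scenarios sent to the client for review
    APPROVED = auto()    # explicit client sign-off received
    REJECTED = auto()    # client declined the scenario

@dataclass
class Scenario:
    description: str
    targets: list
    stage: Stage = Stage.SCOPED

    def propose(self) -> None:
        self.stage = Stage.PROPOSED

    def client_decision(self, approved: bool, alternative_targets=None) -> None:
        # Clients may return alternative targets (e.g., junior staff rather
        # than a C-Suite member); the scenario then needs re-proposal.
        if alternative_targets:
            self.targets = alternative_targets
            self.stage = Stage.PROPOSED
        else:
            self.stage = Stage.APPROVED if approved else Stage.REJECTED

    @property
    def may_execute(self) -> bool:
        # No social engineering attack is used without explicit approval.
        return self.stage is Stage.APPROVED
```

The point of the sketch is the gating property: a scenario whose targets were substituted by the client drops back to the proposal stage, so `may_execute` only becomes true after a fresh sign-off.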
Anonymisation of victims of social engineering received strong responses from both perspectives. Of the providers that gave a clear answer, five said no (across two organisations) and seven said yes (across six organisations) to anonymising results. One provider that said no stated: "It would be wrong to anonymise anything. The client commissioned the test, and the client owns the results when you give it to them." The argument for anonymisation was largely that it is a training and policy failing and not a finger pointing exercise. In one case, the provider stated that they request HR be at debrief meetings to emphasise this point. Sometimes anonymisation is inadequate, as it is easy to determine the target (e.g., the receptionist at the front desk on a particular day and time). In these situations some of the "yes" providers stated that they try to obfuscate results. One provider argued that, due to the ease with which social engineering attacks succeed, they do agree that one person within the client organisation can know the names, to allow for technical remediation (e.g., resetting passwords). A further provider stated that they get the client to sign a document stating that they will not take action against employees found to be targets of the test. Other providers could not give a hard yes or no answer to anonymisation, but their general practice is that they tend not to give names. Two of the providers did state they have felt pressure from client organisations to disclose names. In one case they stated that sometimes the client asks informally and they "may tell them, may not".

With respect to the legal and ethical aspects when conducting social engineering assessments, one provider summarised this as "a minefield". This is notably the case in scenario-driven tests that involve any form of physical penetration. There was a strong consensus amongst stakeholders that the provider community would benefit from a synthesis of ethical and legal material on this topic into a common source, such as a set of guidelines or a code of ethics. Providers' views included "it might be useful", "if there was one it would be really useful", "some framework at the level of ethics would be useful" and "it needs to be done... I can imagine the arguments though". One provider did state they had previously searched for guidelines, but without success. Another argued that a code of ethics in the UK would likely see some success and approval due to the requirement for being "whiter than white" as a penetration tester, compared to other countries in the world where a criminal history is sometimes encouraged or overlooked. Clients also expressed a desire to see greater ethical and legal guidelines. One client argued that "at that level they're not engaging with customers but its employees on an interpersonal level", with another client adding, "there should be guidelines. For the protection of those doing the testing as much as anything else." CREST was the body of choice for any guidance for two providers, with the suggestion that it integrate with their complaints process: "[It] should definitely be from CREST." Some providers, including those who were pro-guidelines, did express that such an endeavour must be cautious not to constrict the industry and the value of testing. During one client interview, the interviewee could not comment on whether social engineering had been conducted in tests on their organisation, although they could discuss other technical tests (e.g., network and web application penetration testing). One could argue that this is perhaps indicative of how people perceive social engineering and the sensitivity of human-focused testing, which strengthens the need for clarification on ethical and legal guidelines.

5.3. Post-engagement

"I've never seen any wow reports, but a lot of bad ones"

"Shocking"

"Generally very hit and miss"

"Appalling"

– Providers on the reports of other providers

Underwhelming was the overarching theme in the perceptions of reporting from providers and clients. Providers expressed satisfaction with the quality of their own reports, but had largely disapproving opinions of the reports produced by other providers. A small number of providers felt there was some consistency between their direct competitors, with one large provider arguing that a level of consistency had been achieved through the movements of individuals between provider organisations.

"The quality varies immensely... the quality can be atrocious"

"Often basically a Nessus output in PDF format"

"Very impressed"

"...great deal of variability"

"Some are atrocious; others well thought out"

"The quality of the document was high"

"No significant quality variation"

"Some are so shocking, it's hilarious"

– Clients on reporting quality

Client interviews highlighted a significant perceived variability in the quality of reports from providers. The above quotes were extracted from the views of eight clients in eight organisations. One interesting finding was that the smaller clients had the best opinions on the quality of reporting. Generally, the larger the client (typically, therefore, with a greater in-house IT capability), the greater the perceived variability.

Two widely cited issues that will not be discussed in detail here are the mis-marketing of vulnerability assessment services as penetration testing services, and the quality of report content. The former is a systemic issue that stems from pre-engagement negotiations. The latter is about individual capability, which stakeholders strongly felt should be the responsibility of technical bodies.

5.3.1. Reporting structure
All providers were found to follow a similar high-level reporting structure. At its most basic, all reports were described to
have managerial and technical sections. Managerial sections typically contained the executive summary and engagement details (e.g. scope). Clients were moderately satisfied with provider efforts, but many felt managerial sections were still too technical, and often needed rewriting for internal communications. The technical section broadly contained the lists of discovered vulnerabilities and recommendations. The providers' implementations for both were varied. Some best practices for reporting structure that were noted include the use of document histories, information on providers involved in testing (e.g. qualifications), attack narratives (e.g., a descriptive explanation of what route was taken by the assessor in attacking the target), root cause recommendations and appendices of test data (e.g. logs of tool outputs and systems touched during testing).

5.3.2. Vulnerability scoring
The first major issue highlighted was the diverse use of default metrics for scoring vulnerabilities. The Common Vulnerability Scoring System (CVSS) version 2.0 was mentioned frequently and was often mandated by some clients; however, providers were critical of it in its current form, arguing that it was only suitable for certain technologies, that its scores often did not reflect real-world risks, and that it failed to account for vulnerability chains (e.g. multiple medium risk vulnerabilities creating one of high risk) or the presence of mitigating security controls. Instead, providers frequently described the use of alternative metrics, such as qualitative scores (usually high, medium and low); impact to Confidentiality, Integrity or Availability (CIA); ease of exploitation; proprietary CVSS derivatives; or a combination of multiple metrics in a matrix. For clients, the variety of scoring mechanisms was found to be problematic for tracking performance over time and comparing results between providers. Furthermore, issues were felt to be compounded by the subjectivity in arriving at a particular score, such as when providers tried to adapt CVSS to account for its aforementioned limitations, or addressed one or more aspects using their own metric system. The survey highlighted a strong opposition to the potential for mandating a specific metrics system, as this is where providers felt their value was generated; however, some providers did feel that clients mandating the inclusion of unmodified CVSS scores, regardless of the use of another main metric system, would provide a quick win for consistency within the industry. Version 3.0 of CVSS is currently in development, and some providers expressed the hope that this would lead to a natural resolution and improvement of this issue.

5.3.3. Recommendations
Another area of concern was the quality and content of recommendations. Smaller clients were the most satisfied, with larger clients having more qualms. Frequent criticisms included the lack of prioritisation (beyond the implicit prioritisation based on the vulnerability score), categorisation, and root cause analysis. Root cause analysis featured heavily in client demands, but providers were largely seen to be failing to deliver on this. One client was particularly critical of CHECK reports for their lack of root cause analysis, stating that it rarely happens in their CHECK reports and that there is "no interpretation of results". Only seven providers (six organisations) stated that they included any root cause analysis in any form of penetration test report, although more did state that recommendations were prioritised. One of the largest clients in the project went further to argue that providers need to include scenarios in their root cause analysis to enable a greater understanding of vulnerability chains and their impact. Interestingly, a criticism of clients that arose at multiple points within the study was the claim that they often only spot fix, rather than address root causes, and that issues continue to appear in subsequent engagements (e.g. their yearly audit). While the ultimate responsibility to address systemic issues lies with the client, based on the findings in this study it would be difficult to claim that many providers are going to great lengths to facilitate this.

5.3.4. Validation of vulnerabilities
The final issue concerns validation: once a client has implemented remediation for vulnerabilities, they then have two options for validating its efficacy. They could either obtain a retest of the vulnerabilities (usually at an additional cost) or test for the vulnerabilities themselves. Most clients of penetration testing engagements do not have the skills or training to understand and recreate the vulnerabilities themselves, which therefore means they must be empowered to do so. Only nine providers (seven organisations) stated that proof-of-concepts were included within their reports (e.g. a single command or script that can be queried or executed to demonstrate the issue), with clients describing their presence as rare. The majority of providers offered retests instead, although some providers stated that some information was provided, such as what tools were used. The majority of clients expressed an interest in proof-of-concepts being made available, with some clients stating that they would also like to see attack narratives. One client stated that this was because remediation is often undone, and attack narratives would facilitate better understanding of cyber threats and empower them to implement more effective mitigating security controls. A provider argued that not including narratives or proof-of-concepts in reports was a business decision by provider organisations, with another suggesting the same, arguing that this information will typically be produced to enable them to conduct a retest anyway. The provision of such information aids in educating the client to improve their security, but doing so would not be financially beneficial for providers.

5.3.5. Improving reporting
The difficulty in achieving greater quality of reporting is balancing the need for consistency with the resistance to standardisation and the providers' desire to maintain flexibility in the reporting process. Auditing and setting guidelines were two methods suggested by providers that could help to achieve this.

CHECK reports are reviewed by CESG for quality and metrics. Any issues found are raised with the customer and/or the CHECK company. This is supplemented with an annual audit where CHECK companies are requested to send two examples of work that best demonstrate their technical ability. CREST reports are not audited. Two providers (one CREST organisation) argued that they should be doing them. However, without regulatory or other external support (e.g. as with the
CHECK scheme) such an approach could see opposition from providers; in part because a shift in the governance framework may require adaptations to business models, but also due to the practical challenges of implementing this in the private sector (e.g. handling client confidentiality).

Authoritative guidelines on reporting best practices were suggested in the belief that if clients had access to such guidelines, their expectations would raise the reporting standards by providers. PTES does contain reporting standards; however, PTES has failed so far to achieve widespread awareness amongst the buying community. The provider community is aware of issues around reporting, and some providers are taking steps to address this. One example that the authors were made aware of during the study was a community project that aims to create a baseline, minimum standard for reporting. The output will be a series of guidelines outlining best practices, and an example report that will be made available to the public (i.e. both providers and clients). The example report will be produced based on the findings of a real engagement undertaken by the project's group. This project involves some providers within the PTES group; however, this project will be independent of PTES.

5.4. Summary of key findings

Several key findings on the different stages of the engagement process warrant highlighting:

Pre-engagement:
• The ambiguity of what constitutes a penetration test was a cause of confusion and frustration amongst stakeholders.
• Smaller providers utilise face-to-face meetings to distinguish themselves against the "faceless" larger providers.
• Undisclosed third party systems were a common cause of delayed testing.
• More effort needs to be made to provide guidance for international engagements.

Practical assessment:
• Many providers and clients have an aversion to exploitation of vulnerabilities found, with the short length of engagements a contributing factor.
• There was a diverse, and limited, use of existing methodologies by providers, and a reluctance to have an external methodology forced upon them.
• Both scenario-focused and audit-based social engineering were used in engagements, with scenario-focused being most prevalent.
• The ethics around the use of social engineering was a grey area, with a range of opinions on anonymisation, and the providers often having to inform the clients that a request would amount to unethical behaviour. There was a strong call for more guidance in this area.

Post-engagement:
• Most reports followed a similar high-level structure.
• There was a diverse use of different metrics for scoring vulnerabilities, and a level of subjectivity, which made it difficult for clients to track performance over time and compare results between providers.
• There was a strong objection by providers to mandating a specific metrics system.
• There was a lack of prioritisation, categorisation, and root cause analysis of recommendations, which made it difficult for clients to understand the impact, and address systemic issues.
• Clients want to see proof-of-concepts of vulnerabilities and attack narratives, but these are rarely provided in reports.
• Auditing and setting guidelines were suggested as routes to improving the quality of reports.

6. Opportunities for ecosystem advancement

This analysis has identified a multitude of opportunities for further discussion, analysis and improvement within the simulated security assessment ecosystem, most notably at the beginning and end of engagements. Based on these findings, seven development areas are proposed: three which involve standardisation, and four which the industry (and wider security community) itself is in the most suitable position to evoke change. Given the importance (and rapid growth) of simulated security assessments, resolving these needs for best practice quickly would aid providers and buyers. The seven development areas can be seen in Fig. 5 and are detailed within the following sections. Each development area is numbered according to the relevant engagement phase: pre-engagement (P1), practical assessment (P2), and post-engagement (P3). Certain development areas provide contributions towards addressing the requirements of other areas; contributing relationships and their directions are also illustrated within Fig. 5 using arrowed lines.

6.1. Pre-engagement

Providers offer a diverse mix of qualities and depth of simulated security assessments. The buying market, in the words of one provider, is in need of "something to compare like-for-like". Two development areas are proposed to address this requirement.

Standardisation of terminology (P1.1) is recommended to enable clients to make more informed procurement decisions. The current heterogeneity of service offerings creates uncertainty for clients in what will be delivered, which establishes high requirements for pre-procurement due diligence, which is not always feasible. The form of standardisation proposed here is not to attempt to establish wholesale consistency between providers, which would be both infeasible to achieve and may lead to a form of commoditisation on its own. Instead, a form of standardisation is proposed to establish and hold providers to a minimum service quality for the different service definitions (e.g., vulnerability assessment, penetration test, and red team exercises), while allowing the flexibility to customise services to meet client requirements. The PTES is one notable community project, which is working towards a similar goal of terminology standardisation. Standards bodies should look towards developing relationships with community efforts to achieve similar terminology models. Such an approach leverages existing work by subject matter experts. Furthermore, in the opinions of providers, the reputation of standards bodies can aid in bringing these concepts to the mass market.
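One way to picture what "something to compare like-for-like" could mean in practice is to express minimum service definitions as machine-readable attributes. The sketch below is purely illustrative: the attribute names, levels, and deliverable descriptions are our own hypothetical assumptions, not drawn from PTES, CHECK, CREST, or any published standard.

```python
# Hypothetical, minimal encoding of service definitions so that a buyer
# can diff two offerings attribute-by-attribute. All names and values
# here are illustrative assumptions, not an established taxonomy.
SERVICE_DEFINITIONS = {
    "vulnerability assessment": {
        "exploitation": "none",         # identify and verify, do not exploit
        "manual_testing": "optional",
        "deliverable": "prioritised vulnerability list",
    },
    "penetration test": {
        "exploitation": "approved",     # exploit only with client sign-off
        "manual_testing": "required",
        "deliverable": "report with narrative and recommendations",
    },
    "red team exercise": {
        "exploitation": "goal-driven",  # pursue agreed objectives end-to-end
        "manual_testing": "required",
        "deliverable": "scenario report against defined goals",
    },
}

def compare(service_a: str, service_b: str) -> dict:
    """Return the attributes on which two service definitions differ."""
    a, b = SERVICE_DEFINITIONS[service_a], SERVICE_DEFINITIONS[service_b]
    return {k: (a[k], b[k]) for k in a if a[k] != b[k]}
```

Under such a model, the mis-marketing of a vulnerability assessment as a penetration test (Section 5.3) would surface immediately as a mismatch on the exploitation and manual testing attributes, while providers would remain free to customise anything above the minimum.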
Fig. 5 – Opportunities for improvement within the simulated security assessment ecosystem. [The figure maps development areas to engagement phases and contributors: pre-engagement – Terminology (P1.1), with sub-areas Internationalisation (P1.1.1) and Enforcement (P1.1.2), and Education (P1.2); practical assessment – Domain-Specific Methodologies (P2.1) and Ethics (P2.2); post-engagement – Reporting Guidelines (P3.1), Metrics Guidelines (P3.2) and Auditing (P3.3). Each area is marked as either a standardisation activity or an industry and community-led activity, with arrowed lines showing contributing relationships.]
Internationalisation (P1.1.1): Stakeholders argued that a focus on standardisation to delineate service definitions would aid in addressing the commoditisation of simulated security assessments, whilst not being tied to a specific region, and thus open to internationalisation. From the perspective of providers, the role of terminology standardisation in the internationalisation of simulated security assessment services was significant. Although such a standard would be intended to have a positive impact within the UK market, it would have wider impact within the fledgling international markets that providers are increasingly looking to for growth.

Enforcement (P1.1.2): Standards are ineffective without enforcement. The manner in which such standardisation is enforced requires further research and discussion. Stakeholder opinions within this study followed two tracks. Firstly, existing technical bodies could place greater emphasis on policing service quality. However, two challenges would need to be resolved. The first is the challenge of practically conducting such policing, some aspects of which were discussed in the context of reporting in Section 5.3. The second concerns internationalisation, and how these standards are tied to largely region-specific certifications (e.g., CHECK is UK only, while the two implementations of CREST, CREST UK and CREST Australia, operate independently, with expansion also in progress to Singapore). Secondly, the potential for an international standard in the traditional sense, which is enforced by national accreditation bodies through certification bodies, requires further research (in this scenario, existing technical bodies would be certification bodies who then certify provider companies).

Client education (P1.2): Standardisation of terminology alone, however, is not a panacea to pre-engagement woes. Client engagement is paramount. A number of development areas contribute to this aim, but here the call from stakeholders was for providers to place greater emphasis on not only educating clients on the importance of security but also empowering them through education to improve their security posture. Such empowerment occurs throughout the engagement life cycle: it begins with education on effective and appropriate service models, and ends with education on remediation. For pre-engagement activities, such education is largely self-completing. Service understanding develops naturally through exposure to services (e.g., annual assessments). However, such development can be facilitated by providers through greater transparency within their service models.

6.2. Practical assessment

A strong opposition to any form of standardisation in relation to the practical assessment phase has been discussed, along with the success, and continued improvement, of the technical bodies (e.g., CHECK and CREST) in this domain. Despite this, two areas for potential improvement are noteworthy.

Domain-specific methodologies (P2.1): Providers scarcely use peer-reviewed methodologies as described in Section 5.2.2, instead preferring a synthesis of approaches for their internal methodology. This synthesis is significant: despite their lack of strict usage, public methodologies do guide the design of internal methodologies. In most scenarios, such methodologies are well established (e.g., infrastructure, web applications, and, increasingly, mobile devices). For engagements involving niche and novel environments, however, this is atypical. Furthermore, in such engagements there was a notable malleability in the service definitions used by some providers, which, it can be argued, may arise through the lack of peer-reviewed knowledge on what such assessments should entail; this also provides further evidence of the requirement for standardisation of terminology (P1.1). The development of domain-specific methodologies is also recommended to aid in
addressing this knowledge gap. One notable case in which this occurs is with Industrial Control Systems. Since the conclusion of this paper's interview process, CREST has begun undertaking a project to this effect. This project is examining the need for new and updated public methodologies, along with exploring the possibility and form of services being expanded into this domain (e.g., the STAR scheme as a service and as individual qualifications).

Ethics (P2.2): One success of the industry within the UK (notably by CREST) has been the formalised structure for dealing with complaints at the technical body level. Included within this structure is the facility to handle complaints concerning ethics. However, for this system to function effectively, providers must have a clear understanding of the ethical framework within which to work. In broad terms, providers felt the industry has proven capable of handling this responsibility in the majority of cases. However, a need for further research around ethics (P2.2) was identified in multiple areas; most notably where simulated security assessments involve human subjects (e.g., where social engineering is used). In this scenario, research would also be intended to improve client education (P1.2) on their responsibilities to their employees (e.g., with respect to anonymisation, and remediation being training-led rather than punitive). The topic of ethics has been addressed somewhat within the academic literature (see Section 2.4), e.g. for phishing experiments. Further research is needed, and the researched methodologies need to be taken up by industry.

6.3. Post-engagement

Post-engagement comes in a variety of forms. It is both the immediate aftermath of an engagement and the implications of that engagement. Here three development areas are proposed that span both categories.

Although the high-level structure of reports was found to be relatively standardised between providers, at a lower level, stakeholders were found to be dissatisfied with various characteristics of reports. For providers, it was predominantly their experiences of seeing competitors offering vulnerability assessments that had been mis-marketed and sold as penetration tests. The most effective resolution to this arguably comes not from a focus on the report itself, but from the standardisation of terminology (P1.1). For clients, it was the inconsistency between providers, and a lack of depth in the provision of metrics and recommendations (e.g. root cause analysis) to empower the client to understand more about the security issues within their environment.

Reporting guidelines (P3.1): A standard in the form of a guideline, rather than a specification, for reporting is recommended. Rigid standardisation would likely see significant opposition; providers see their reports as a means to differentiate themselves and add value to their offering to the client. However, guidelines describing best practices could be produced, and have the potential to provide an effective alternative. Through standards bodies, such guidelines would likely gain significant exposure within the buying community, which may be otherwise difficult to achieve. Exposure facilitates client education (P1.2), which empowers clients to make informed decisions when interacting with providers, and gives them a clear conception of what they should expect. Such guidelines could address all of the aforementioned issues within this study, while describing best practices around the processes that support report production (e.g. quality assurance). As with the recommendations on terminology, standards bodies should look to leverage existing efforts within the community to raise reporting standards. This should include working with technical bodies in the UK, such as CREST, while remaining aware that guidelines should not be region-specific.

Metrics (P3.2): Of the characteristics of reports that elicited dissatisfaction, one of the most prominent was that of security metrics. To some extent the issues surrounding metrics can be mitigated through reporting guidelines (P3.1) and the wider educational process for clients (P1.2). If clients are educated and empowered to mandate metrics and related reporting requirements, as many larger enterprises have done, a degree of consistency can be achieved. Providers, however, often perceived such an approach negatively (e.g., as these metrics differ highly between clients), and furthermore, the general consensus was that providers see unique metric approaches as a market differentiator. It is through the metrics, after all, that security is measured and understood. For clients, however, the call was for that understanding to be facilitated by consistent measurements between providers. Two factors require further research and discussion. The first is establishing consensus and backing for a consistent measurement approach. A large part of the debate arises through providers perceiving that there is no good metric. The notion of mandatory CVSS 2.0 (supplementary to any other approach), and hope for CVSS 3.0, was touted by a number of providers. The second is to examine how this can be achieved while minimising the subjectivity involved in making such judgements, which is a primary source of inconsistency.

Auditing guidelines (P3.3): Looking beyond the scope of an engagement, simulated security assessments contribute to ISO/IEC 27001 audits under an isolated ISO/IEC 27002 security control, "technical compliance review". Auditing guidelines have been produced previously in ISO/IEC 27008; however, they are being revised. Standards bodies should look to provide auditing guidelines that establish a clearer link between the scope of an engagement and its findings, and ISO/IEC 27002 security controls beyond the narrow categorisation of a technical compliance review. Some stakeholders mentioned the perception of ISO/IEC 27001 being a "check box" exercise, and where simulated security assessments are of relevance in the audit process, the audit is merely a confirmation that an assessment has occurred, rather than a detailed analysis of its findings to determine whether the security controls that have been implemented are consistent with the objectives of the ISMS.

Auditing guidelines provide the opportunity to link the socio-technical security controls of ISO/IEC 27002 and the socio-technical nature of simulated security assessments. Such assessments may only be one group of assessment methodologies, but they continue to rise in popularity and are increasingly seen as a regulatory requirement. Furthermore, penetration tests and red team exercises are arguably the most realistic methodologies currently available for simulating cyber threats. As part of this process, it is recommended that standards bodies examine the integration of simulated security assessments with other standards, such as ISO 31000. A diverse array of metrics can be used as part of an engagement, but what
is its meaning for risk management, and how does this impact the risk that is to be managed as part of ISO/IEC 27001? Furthermore, auditing guidelines maintain a close relationship with the requirement for terminology standardisation (P1.1). If a standard or other assurance scheme mandates a particular variety of simulated security assessment, how can auditors ensure that it has been appropriately delivered without consensus establishing what such an assessment should look like?

7. Conclusion

The CHECK and CREST schemes, along with technical bodies such as the Tigerscheme, have successfully defined the technical capabilities of individuals who perform simulated security assessments, and can be seen to be making great efforts to encourage evolution within the industry. In addition, both CHECK and CREST have laid the foundations for the assessment of organisational processes that support engagements. The professionalisation of simulated security assessments that such schemes have enabled is primarily concerned with the UK market. The findings of this study suggest that on an international scale, the level of professionalisation is less formalised, with respect to both individuals and provider organisations (e.g., in terms of how evidence of competency can be provided to employers and clients), although there are isolated exceptions that were highly regarded, such as is the case with some individual qualifications within the United States (specifically those from Offensive Security). It can therefore be argued that the international market can learn many lessons from the path to professionalisation that has been paved by the UK market. There is much evidence to suggest that a shift towards a UK-style professionalisation is not only possible, but desired, as can be seen through the recommendation of UK-originating schemes in non-UK and international standards (e.g., PCI DSS; PCI Security Standards Council, 2015), and as is evidenced by the expansion of UK-originating schemes to other countries (e.g., CREST Australia).

Despite the professionalisation, this study has identified that there remain a number of issues at the start and end of the engagement process that the industry has currently failed to address. This is not through a lack of awareness of these issues; this study has highlighted that both providers and clients are dissatisfied by the lack of transparency and consistency in industry offerings. It is based on these findings that the authors make the proposal: standardisation is needed.

Standards must be well formed to avoid the potential to suffocate and hinder rapidly evolving industries, such as the one we find with simulated security assessments. Self-regulation in such an environment is an ideal solution (e.g., through trade organisations auditing services delivered, as opposed to static document reviews of methodologies); however, as one provider stated, when it comes to current industry offerings, it can be a "Wild West". This is not due to a lack of technical capability within the industry, as the technological bar has been set and maintained by the technical bodies. One provider argued that the UK security industry can provide "anything the market asks for"; the problem is that "it [the market] does not ask the right questions", which promotes ambiguity when interpreting service model requirements and results in a lack of demand for robust governance structures. The framework for future improvement proposed here is intended to work towards remediating industry issues, using a collaborative approach of standardisation and industry-led development.

The form of standardisation proposed within these recommendations must be shaped through continued discussion with all stakeholders. As such, the findings of this paper have led to a preliminary workshop in July 2015, hosted by BSI, to determine the consensus between stakeholders in the UK. Despite the vocal findings on the need for standards from many stakeholders, the feedback from the workshop was that the two key stakeholder groups for the UK (CESG and CREST) would need to be explicitly on board for this to proceed as a national standard. The potential for ISO/IEC standardisation is also being explored.

Acknowledgements

The authors would like to thank the EPSRC for its financial support through Lancaster University's Impact Acceleration Account (Grant Reference EP/K50421X/1).

REFERENCES

Arkin B, Stender S, McGraw G. Software penetration testing. IEEE Secur Priv 2005;3(1):84–7. doi:10.1109/MSP.2005.23.
Ashraf QM, Habaebi MH. Towards Islamic ethics in professional penetration testing. Revel Sci 2013;3(2):30–8.
Bank of England. CBEST intelligence-led testing: CBEST implementation guide V2.0, Tech. rep. <http://www.bankofengland.co.uk/financialstability/fsc/Documents/cbestimplementationguide.pdf>; 2016a.
Bank of England. CBEST intelligence-led testing: an introduction to cyber threat modelling V2.0, Tech. rep. <http://www.bankofengland.co.uk/financialstability/fsc/Documents/anintroductiontocbest.pdf>; 2016b.
Bechtsoudis A, Sklavos N. Aiming at higher network security through extensive penetration tests. IEEE Lat Am Trans 2012;10(3):1752–6. doi:10.1109/TLA.2012.6222581.
Bishop M. About penetration testing. IEEE Secur Priv 2007;5(6):84–7. doi:10.1109/MSP.2007.159.
Blind K. The impact of standardization and standards on innovation. NESTA Work. Pap. Ser. 2013;13/15.
Blind K, Gauch S, Hawkins R. How stakeholders view the impacts of international ICT standards. Telecomm Policy 2010;34(3):162–74.
British Standards Institution. BS ISO/IEC 15408-3:2008. Information technology – security techniques – evaluation criteria for IT security. Part 3: security assurance components; 2009.
British Standards Institution. BS ISO/IEC 15408-1:2009. Information technology – security techniques – evaluation criteria for IT security; 2014.
Caldwell T. Ethical hackers: putting on the white hat. Netw Secur 2011;2011(7):10–13. doi:10.1016/S1353-4858(11)70075-7.
CESG. Cyber Essentials PLUS: common test specification V1.2, Tech. rep. <https://www.cesg.gov.uk/documents/cyber-essentials-plus-common-test-specification>; 2014.
CREST. A guide to the Cyber Essentials scheme, Tech. rep. <https://www.crest-approved.org/wp-content/uploads/2014/10/Crest-Cyber-Essentials-Guide-final.pdf>; 2014.
De Vries HJ. Standardization: a business approach to the role of national standardization organizations. Springer Science & Business Media; 1999.
Department for Business Innovation and Skills. Cyber Essentials scheme: summary, Tech. rep. <https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/317480/Cyber_Essentials_Summary.pdf>; 2014.
Dimkov T, van Cleeff A, Pieters W, Hartel P. Two methodologies for physical penetration testing using social engineering. In: Proceedings of the 26th annual computer security applications conference, ACSAC '10. New York, NY, USA: ACM; 2010. p. 399–408.
Finn P, Jakobsson M. Designing ethical phishing experiments. IEEE Technol Soc Mag 2007;26(1):46–58.
Geer D, Harthorne J. Penetration testing: a duet. In: 18th annual computer security applications conference, 2002. Proceedings. IEEE Comput. Soc; 2002. p. 185–95. doi:10.1109/CSAC.2002.1176290.
Goel JN, Mehtre B. Vulnerability assessment & penetration testing as a cyber defence technology. Procedia Comput Sci 2015;57:710–15. doi:10.1016/j.procs.2015.07.458.
Guard L, Crossland M, Paprzycki M, Thomas J. Developing an empirical study of how qualified subjects might be selected for IT system security penetration testing. Ann UMCS Inf AI 2015;2(1):414–24.
Hardy G. The relevance of penetration testing to corporate network security. Inf Secur Tech Rep 1997;2(3):80–6. doi:10.1016/S1363-4127(97)89713-0.
Herley C, Pieters W. If you were attacked, you'd be sorry: counterfactuals as security arguments. In: Proceedings of the 2015 new security paradigms workshop, NSPW '15. New York, NY, USA: ACM; 2015. p. 112–23. doi:10.1145/2841113.2841122.
Herzog P. OSSTMM 3. The open source security testing methodology manual: contemporary security testing and analysis. Tech. rep. ISECOM; 2010 <http://www.isecom.org/mirror/OSSTMM.3.pdf>.
InGuardians. Advanced metering infrastructure attack methodology, Tech. rep. <http://inguardians.com/pubs/AMI_Attack_Methodology.pdf>; 2009.
Jamil D, Khan MNA. Is ethical hacking ethical? Int J Eng Sci Technol 2011;3(5):3758–63.
Knowles W, Baron A, McGarr T. Analysis and recommendations for standardization in penetration testing and vulnerability assessment: penetration testing market survey. Tech. rep. British Standards Institution (BSI); 2015 <http://shop.bsigroup.com/forms/PenTestStandardsReport/>.
Logan PY, Clarkson A. Teaching students to hack: curriculum issues in information security. In: Proceedings of the 36th SIGCSE technical symposium on Computer science education, SIGCSE '05. New York, NY, USA: ACM Press; 2005. p. 157–61. doi:10.1145/1047344.1047405.
Matwyshyn A, Keromytis A, Stolfo S. Ethics in security vulnerability research. IEEE Secur Priv 2010;8(2):67–72. doi:10.1109/MSP.2010.67.
McDermott JP. Attack net penetration testing. In: Proceedings of the 2000 workshop on new security paradigms, NSPW '00. New York, NY, USA: ACM Press; 2000. p. 15–21. doi:10.1145/366173.366183.
McGraw G. Software security. IEEE Secur Priv 2004;2(2):80–3. doi:10.1109/MSECP.2004.1281254.
Midian P. Perspectives on penetration testing. Comput Fraud Secur 2002;2002(6):15–17. doi:10.1016/S1361-3723(02)00612-7.
Mouton F, Malan MM, Kimppa KK, Venter H. Necessity for ethics in social engineering research. Comput Secur 2015;55:114–27. doi:10.1016/j.cose.2015.09.001.
Pashel BA. Teaching students to hack: ethical implications in teaching students to hack at the university level. In: Proceedings of the 3rd annual conference on Information security curriculum development, InfoSecCD '06. New York, NY, USA: ACM Press; 2006. p. 197–200. doi:10.1145/1231047.1231088.
PCI Security Standards Council. Payment Card Industry (PCI) Data Security Standard: requirements and security assessment procedures V3.2, <https://www.pcisecuritystandards.org/documents/PCI_DSS_v3-2.pdf>; 2016.
PCI Security Standards Council, Penetration Test Guidance Special Interest Group. Information supplement: penetration testing guidance, Tech. rep. <https://www.pcisecuritystandards.org/documents/Penetration_Testing_Guidance_March_2015.pdf>; 2015.
Pfleeger CP, Pfleeger SL, Theofanos MF. A methodology for penetration testing. Comput Secur 1989;8(7):613–20. doi:10.1016/0167-4048(89)90054-0.
Pierce J, Jones A, Warren M. Penetration testing professional ethics: a conceptual model and taxonomy. Aust J Inf Syst 2006;13(2):193–200. doi:10.3127/ajis.v13i2.52.
Reece R, Stahl B. The professionalisation of information security: perspectives of UK practitioners. Comput Secur 2015;48:182–95. doi:10.1016/j.cose.2014.10.007.
Saleem SA. Ethical hacking as a risk management technique. In: Proceedings of the 3rd annual conference on Information security curriculum development, InfoSecCD '06. New York, NY, USA: ACM Press; 2006. p. 201–3. doi:10.1145/1231047.1231089.
Scarfone K, Souppaya M, Cody A, Orebaugh A. Technical guide to information security testing and assessment, NIST Special Publication 800-115. Gaithersburg, MD: National Institute of Standards and Technology; 2008.
Searle J. NESCOR guide to penetration testing for electric utilities, Tech. rep. <http://smartgrid.epri.com/doc/NESCORGuidetoPenetrationTestingforElectricUtilities-v3-Final.pdf>; 2012.
Smith B, Yurcik W, Doss D. Ethical hacking: the security justification redux. In: IEEE 2002 international symposium on technology and society (ISTAS'02). IEEE; 2002. p. 374–9. doi:10.1109/ISTAS.2002.1013840.
Such J, Gouglidis A, Knowles W, Misra G, Rashid A. The economics of assurance activities. Tech. rep. SCC-2015-03, Security Lancaster, Lancaster University; 2015.
Such J, Gouglidis A, Knowles W, Misra G, Rashid A. Information assurance techniques: perceived cost effectiveness. Comput Secur 2016;60:117–33. doi:10.1016/j.cose.2016.03.009.
Swann GMP. The economics of standardisation. Department of Trade and Industry; 2000.
Swann GMP. The economics of standardization: an update. UK Department of Business Innovation and Skills (BIS); 2010.
Tang A. A guide to penetration testing. Netw Secur 2014;2014(8):8–11. doi:10.1016/S1353-4858(14)70079-0.
Thompson H. Application penetration testing. IEEE Secur Priv 2005;3(1):66–9. doi:10.1109/MSP.2005.3.
U.S. Department of Homeland Security. Cyber security assessments of industrial control systems, Tech. rep.; 2011.
van Wyk KR, McGraw G. Bridging the gap between software development and information security. IEEE Secur Priv 2005;3(5):75–9. doi:10.1109/MSP.2005.118.
Xynos K, Sutherland I, Read H, Everitt E, Blyth AJC. Penetration testing and vulnerability assessments: a professional approach. In: International cyber resilience conference; 2010. p. 126–32. <http://ro.ecu.edu.au/icr/16>.
Yeo J. Using penetration testing to enhance your company's security. Comput Fraud Secur 2013;2013(4):17–20. doi:10.1016/S1361-3723(13)70039-3.
William Knowles is undertaking an EPSRC Industrial Case Ph.D. that is supported by the Airbus Group (formerly EADS), where he researches Industrial Control System security metrics. This Ph.D. is being undertaken at Security Lancaster, an EPSRC and GCHQ recognised Academic Centre of Excellence in Cyber Security Research. He is also a qualified Tigerscheme Qualified Security Team Member (QSTM) and ISO/IEC 27001:2013 Lead Auditor.

Alistair Baron is a Security Lancaster Research Fellow in the School of Computing and Communications at Lancaster University, UK. His primary research involves applying natural language processing techniques to cyber security challenges, including social engineering, extremism and other serious online crimes. Alistair also teaches penetration testing and digital forensics modules on the GCHQ certified M.Sc. in Cyber Security at Lancaster. He has a B.Sc. (Hons) and Ph.D. in Computer Science from Lancaster University.

Tim McGarr is the Market Development Manager for the Information Technology area within Standards Development in BSI. Tim has specific responsibility for the direction and development of newer standards areas. Tim has been working at BSI since 2009. Prior to BSI, he spent 5 years working at the legal publisher LexisNexis in the strategy department. Before this, he worked as a management consultant for CGI and an internal consultant for BT. Tim has an MBA from HEC Paris, France.