G00257721

Reality Check on Big Data Analytics for Cybersecurity and Fraud
Published: 16 January 2014

Analyst(s): Avivah Litan

Fast access to big data is fueling sharp analytics that find malicious security
events. We delineate areas where enterprises are seeing results, and note
the market suppliers that enable them.

Key Findings

Criminals and other bad actors are rapidly evolving their hacking techniques, and are attacking
quickly, making timely security and fraud analytics more critical than ever.

Big data analytics give enterprises faster access to their own and relevant external information.
The value of information needed to uncover security events deteriorates with time; in some
cases, more than a second is too long.

Enterprises can achieve significant savings in time and money when using big data analytics to
stop crime and security infractions, by stopping losses and by increasing productivity.

Big data analytics is ahead of most organizations' ability to successfully adopt it, and
most vendors have barely begun to prove their software's effectiveness, so it's still early days
for this market.

Recommendations

Use big data analytics to take the noise and high false-positive rates out of security monitoring
systems, allowing security staff to focus on the most important events.

Start small and pick a project where you can see results, e.g., making one monitoring system
(such as data loss prevention) smarter and less noisy by infusing it with contextual data and
analytics.

Plan to eventually broaden adoption of big data analytics across multiple applications and use
cases in your organization. This will help maximize return on investment.

When evaluating vendor solutions, first determine if your enterprise wants canned analytics, has
the expertise and resources to develop its own analytics, and/or wants to rely heavily on
professional or outsourced services.

Use fraud, security and cyberthreat intelligence from vendors that use their own big data
analytics to make actionable information available for your enterprise.

Table of Contents
Strategic Planning Assumption
Analysis
An Example of Hard Savings
An Example of Timeliness
Issues Raised and Related Requirements
Big Data Analytics Architecture
Three Domains of Big Data Analytics Vendors for Security and Fraud
Domain 1: Enhance Existing Security Systems With Canned Analytics
Domain 2: Combine Data and Correlate Activity Using Custom or Ad Hoc Analytics
Domain 3: External Cyberthreat and Fraud Intelligence
Conclusion
Gartner Recommended Reading

List of Figures
Figure 1. Big Data Analytics Architecture
Figure 2. The Three Domains of Big Data Analytics Vendors for Security and Fraud

Strategic Planning Assumption


By 2016, 25% of large global companies will have adopted big data analytics for at least one
security or fraud detection use case, up from 8% today, and will achieve a positive return on
investment within the first six months of implementation.

Analysis
(This document was revised on 6 February 2014. The document you are viewing is the
corrected version. For more information, see the Corrections page on gartner.com.)
Big data (see Note 1) analytics gives enterprises faster access to their own data than ever before.
That's the mantra of users we spoke with who have implemented big data analytics to successfully
address fast-changing enterprise security and fraud issues. Big data analytics enables enterprises
to combine and correlate external and internal information to see a bigger picture of threats against
their enterprises. It is applicable in many security and fraud use cases such as detection of
advanced threats, insider threats and account takeover (see "Use Big Data Analytics to Solve Fraud
and Security Problems").
Information needed to uncover security events loses value over time, and timely intelligent data
analysis is critical as criminals and bad actors move much more quickly to commit their crimes. For
example, a year or two ago, hackers would look around, conduct extensive cyberespionage on their
targets, and then go in for the theft, whether it was for money or information. Now, according to
Gartner clients, hackers, aware of more-effective security and fraud prevention measures erected
by their target enterprises, simply go directly to the theft without a drawn-out
reconnaissance phase. Additionally, there are many more hackers and bad actors attacking more
enterprises than ever before, according to Gartner clients (see Evidence 1).

To address these issues in the past, enterprises relied on various siloed monitoring or detection
systems that were optimized for various use cases, such as data loss, financial fraud, or privileged
user monitoring. Now, with big data analytics, enterprises can:

Cut down on the noise and false alerts in existing monitoring systems by enriching them with
contextual data and applying smarter analytics. This is especially important as the number of
security events increases substantially year over year.

Correlate the resulting high-priority alerts across monitoring systems to detect patterns of
abuse and fraud, and to get the big picture on the security state of the enterprise.

Pool their internal data and relevant external data into one logical place, and look for known
patterns of security violations or fraud.

Profile accounts, users or other entities, and look for anomalous transactions against those
profiles.

Remain agile, and stay ahead of malicious actors and activities, via faster tuning of rules and
models tested against data streaming in near real time.
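
The profiling step in the list above can be illustrated with a minimal sketch. It assumes a hypothetical per-account history of transaction amounts and flags a new transaction whose amount lies more than a chosen number of standard deviations from that account's mean. The sample data, the `is_anomalous` name and the threshold of 3.0 are illustrative assumptions, not a method reported by the users Gartner interviewed.

```python
from statistics import mean, stdev


def is_anomalous(history, amount, threshold=3.0):
    """Return True if `amount` deviates from the profile built on `history`.

    The profile is simply the mean and sample standard deviation of past
    amounts (an assumption made for illustration).
    """
    if len(history) < 2:
        return False  # too little data to build a profile
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return amount != mu  # any change from a constant history stands out
    return abs(amount - mu) / sigma > threshold


# Hypothetical account profile: past transaction amounts.
profiles = {"acct-1001": [120.0, 95.0, 130.0, 110.0, 105.0]}

print(is_anomalous(profiles["acct-1001"], 115.0))   # typical amount
print(is_anomalous(profiles["acct-1001"], 9500.0))  # far outside the profile
```

In practice a profile would cover more than one attribute (for example location or time of access), and the threshold would be tuned against incoming data.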

An Example of Hard Savings


Big data analytics has yielded hard savings in time and money (see Evidence 2). At one large U.S.
retailer, security staff measured the gains from applying big data analytics to just its Web
application firewall. Using Splunk as a base vendor, it ingested and combined contextual data such
as system resource utilization, behavioral profiles from RSA Silvertail Systems (by user and IP
address), cyberthreat intelligence from Fox IT, and security and fraud alerts from other security
and fraud monitoring systems (such as Accertify) into the company's big data analytics system. Once
implemented, the security staff reduced the time it took to research and investigate one alert from
12 worker-hours using 10 resources to just 0.2 worker-hours using two resources.
The company's alert research also became more timely: the elapsed time from an alert to its
investigation fell from an average of 90 minutes to less than 10 minutes. These savings in time and
money were critical because, at the same time, the number of security events at this enterprise
increased elevenfold over the previous four years, from half a billion events in 2010 to 5.5 billion
in 2013. This retailer spent $243,000 on staffing for its big data analytics project.
For 550GB indexed per day, the perpetual Splunk list price was approximately $800,000. For
750GB indexed per day, the perpetual Splunk list price was approximately $1 million.
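
The figures in this example can be restated as percentages and ratios with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the `percent_reduction` helper is a hypothetical name, the 10-minute figure is treated as an upper bound because the text says "less than 10 minutes", and the list prices are approximate, so the results are indicative only.

```python
def percent_reduction(before, after):
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100.0


# Investigation effort per alert: 12 worker-hours down to 0.2 worker-hours.
effort_cut = percent_reduction(12.0, 0.2)

# Alert-to-investigation delay: 90 minutes down to (at most) 10 minutes.
delay_cut = percent_reduction(90.0, 10.0)

# Security event volume: 0.5 billion (2010) to 5.5 billion (2013).
event_growth = 5.5e9 / 0.5e9

# Approximate perpetual list price per GB indexed per day at two volumes.
price_per_gb_550 = 800_000 / 550
price_per_gb_750 = 1_000_000 / 750

print(f"Effort per alert reduced by {effort_cut:.1f}%")
print(f"Alert-to-investigation delay reduced by at least {delay_cut:.1f}%")
print(f"Event volume grew {event_growth:.0f}x")
print(f"List price per GB/day: ${price_per_gb_550:,.0f} at 550GB, "
      f"${price_per_gb_750:,.0f} at 750GB")
```

Read this way, investigation effort per alert fell by roughly 98% while event volume grew elevenfold, which is why the retailer regarded the savings as critical.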

An Example of Timeliness
Most on-premises big data analytics vendors for fraud and security make data available within a
minute or tens of seconds. More recently, new players in this market, such as Sumo Logic, promise
to deliver fresh data within milliseconds, stepping up to enterprise demand for real-time solutions as
criminals also conduct their crimes within milliseconds.
Fast results from big data analytics can put pressure on existing operational systems, which are
typically slower and can't always keep up with those results. For example, at one company Gartner
spoke with, an abnormal Web page access flagged by its Splunk big data analytics system
highlighted a wire transfer fraud about to take place that needed to be blocked in its operational
system. The security staff just had to sit and wait for the operational system to get to the point
where it could shut down the criminal wire transfer that the big data analytics application had
alerted them to.

Issues Raised and Related Requirements


Big data analytics is very powerful because enterprises have faster-than-ever intelligent access to
their own data, combined with relevant external information. Operational issues, however, will arise
unless organizational processes and systems can keep up with the analytical results. Gartner has
seen these implementations raise several new system requirements that need to be addressed:

The big data analytics application must interface with operational systems to leverage results.
APIs must be utilized to ensure delays are minimal and to integrate legacy solutions into the big
data analytics application.

A common alert management system should combine and correlate alerts from previously
siloed applications, now enriched with contextual data and leveraging big data analytics. This
may require custom development, because alert management systems are typically vendor
specific and don't necessarily interface with multiple applications.

A common dashboard must give users quick visual access to the information generated by the
big data analytics systems across all of an enterprise's information sources. This also may
require custom development, since most dashboards are built into specific vendor products
that don't integrate with others.

The dashboard should enable queries. Two examples are "what are the top 10 IP addresses
involved in security events?" and "what are the top 10 vulnerabilities my enterprise faces
today?"
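
A dashboard query such as "what are the top 10 IP addresses involved in security events?" reduces to counting events by a field and ranking the counts. The sketch below shows this over a small in-memory list; the event records, the `src_ip` field name and the `top_n` helper are hypothetical, and a real deployment would run the equivalent query in its analytics platform rather than in application code.

```python
from collections import Counter


def top_n(events, field, n=10):
    """Rank the values of `field` by how many events they appear in."""
    counts = Counter(event[field] for event in events if field in event)
    return counts.most_common(n)


# Hypothetical security events pooled from several monitoring systems.
# The addresses come from ranges reserved for documentation.
events = [
    {"src_ip": "203.0.113.7", "type": "waf_block"},
    {"src_ip": "198.51.100.23", "type": "dlp_alert"},
    {"src_ip": "203.0.113.7", "type": "login_failure"},
    {"src_ip": "192.0.2.44", "type": "waf_block"},
    {"src_ip": "203.0.113.7", "type": "dlp_alert"},
    {"src_ip": "198.51.100.23", "type": "waf_block"},
]

for ip, count in top_n(events, "src_ip"):
    print(ip, count)
```

The same helper answers other ranking questions (for example, top event types) by changing the `field` argument.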

Big Data Analytics Architecture


In Figure 1, we outline the layers of a big data analytics architecture. Organizations can choose to
piece these technologies together themselves by, for example, relying wholly on open-source code
(such as Hadoop for a data repository and Hive for analytics), or they can use proprietary vendor
software to speed up development time, or a combination of both. Not all solutions have all these
layers integrated; for example, some big data analytics applications output logic that is absorbed
by a business application, rather than having a user interact manually with the data via a user
interface.
Figure 1. Big Data Analytics Architecture

Layers, in the order shown in the figure:

User Interface: Pattern Matching, Network Visualization, Linking and Discovery, Dashboard, Alert Management

Administration: Secure Collaboration, Workflow, Object and Document Tagging, Rules, Model and Analytics Editing

Analytics: Link Analysis, Social Network Analysis, Geospatial, Temporal Analysis, Clustering, Profiling, Models, Rules

Data Repository: HDFS/Hadoop, Database, Data Warehouse, In Memory, Files, Logs

Information Connectors: Web Services, News Feeds, External Information, Cyberthreat Intelligence, APIs, Fraud Exchanges

Source: Gartner (January 2014)

Three Domains of Big Data Analytics Vendors for Security and Fraud
We group big data analytics solutions for security and fraud into three domains:

Domain 1: Enhance existing security systems with canned analytics

Domain 2: Combine data and correlate activity using custom or ad hoc analytics

Domain 3: External cyberthreat and fraud intelligence

Here, we note the vendors that have demonstrated results in these categories. The vendors we cover
provide each of the layers of the big data analytics architecture noted in Figure 1, although some
are stronger in certain layers (e.g., analytics or administration) than others.
Figure 2 illustrates where these solution domains sit in an enterprise architecture that we originally
laid out in "Innovation Insight: Innovation Drives Seven Dimensions of Context-Aware Enterprise
Security." This Innovation Insight research also elaborated on the types of contextual data that can
be used to feed various use cases in each of these domains (see Note 2). Enterprises should use
these domains to evaluate vendor solutions, and to assess how they come together and fit into an
overall enterprise plan.

Figure 2. The Three Domains of Big Data Analytics Vendors for Security and Fraud

[Diagram not reproduced. Labels recoverable from the figure:]

Enterprise components: Authentication Manager; Alert and Case Mgmt. and Investigations; Accept, Block or Verify; Policy Manager; IAM, Directory, HR Systems
Monitored layers: Endpoint; Servers; Databases; Applications; Channels (Online, Mobile, In-person)
Monitoring systems (each shown with a "Results" label): SIEM*; DLP; Fraud Prevention; DAP; Advanced Threat Detect
Other elements: Big Data Analytics/Visualization; External Data
Legend: Current; Desirable
*SIEM can integrate DLP and DAP

DAP = database audit and protection; DLP = data loss prevention; IAM = identity and access management; SIEM = security information and event management

Source: Gartner (January 2014)

Domain 1: Enhance Existing Security Systems With Canned Analytics


Here, vendor software uses canned analytics to make existing systems more intelligent and less
noisy so that the most egregious events are highlighted and prioritized in queues, while alert volume
is reduced. For example, one financial institution Gartner spoke with employed 35 staff to monitor
135,000 DLP alerts a day prior to installing Bay Dynamics. It has reduced that to a handful of staff
monitoring several thousand higher-priority alerts per day.
The big data aspect of this software domain comes in a more advanced phase of deployment,
where data and alerts from separate systems, e.g., DLP, SIEM (see Note 3), IAM or endpoint
protection platform (EPP), are enriched with contextual information, then combined and correlated
using canned analytics. This gives an enterprise a more intelligent and holistic view of the
security events in its organization.
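
The enrich, combine and correlate sequence described above can be sketched in a few lines. The example assumes hypothetical alert records from DLP, SIEM and EPP systems, a per-user context table supplying an asset weight, and a simple rule that escalates users who trigger alerts in at least two systems. The field names, weights and the `correlate` function are illustrative assumptions and do not represent any vendor's canned analytics.

```python
from collections import defaultdict


def correlate(alerts, context, min_systems=2):
    """Group alerts by user, weight them by context, and keep users seen
    in at least `min_systems` distinct monitoring systems."""
    by_user = defaultdict(lambda: {"systems": set(), "score": 0.0})
    for alert in alerts:
        # Enrichment: look up contextual data for the user (default weight 1.0).
        weight = context.get(alert["user"], {}).get("asset_weight", 1.0)
        entry = by_user[alert["user"]]
        entry["systems"].add(alert["system"])
        entry["score"] += alert["severity"] * weight
    flagged = [
        (user, entry["score"], sorted(entry["systems"]))
        for user, entry in by_user.items()
        if len(entry["systems"]) >= min_systems
    ]
    # Highest combined score first, so the most egregious cases lead the queue.
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical alerts from previously siloed systems.
alerts = [
    {"system": "DLP", "user": "alice", "severity": 3},
    {"system": "SIEM", "user": "alice", "severity": 2},
    {"system": "EPP", "user": "bob", "severity": 5},
    {"system": "DLP", "user": "carol", "severity": 1},
    {"system": "EPP", "user": "carol", "severity": 1},
]
# Hypothetical context: alice has access to a high-value asset.
context = {"alice": {"asset_weight": 2.0}}

for user, score, systems in correlate(alerts, context):
    print(user, score, systems)
```

Here the single-system alert for bob drops out of the queue even though its severity is high; whether that trade-off is acceptable is a tuning decision for the security team.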
Representative Vendors

Bay Dynamics' software leverages DLP systems from vendors such as Symantec, and
baselines the DLP-related behavior of employees and various workgroups in an organizational
hierarchy. It analyzes incoming DLP events against the baseline profiles to detect anomalous
transactions.
Bay Dynamics' cross-system (e.g., SIEM, DLP, EPP) Risk Fabric application digests,
summarizes and highlights important security events of the day in a news format.

Securonix is a security analytics platform with solutions for various domains such as insider
threats, IAM, DLP and fraud. It performs real-time identity correlation, behavior and peer group
anomaly detection, and risk scoring to identify the highest-risk activities and access of users
and other entities. To spur adoption, Securonix is packaging its solutions into free
downloadable applications, starting with its IAM intelligence solution, Access Scanner, so
that customers and prospects can make use of its canned analytics and anomaly detection.

Domain 2: Combine Data and Correlate Activity Using Custom or Ad Hoc Analytics
In this category, enterprises use vendor software or services to integrate internal and external data,
structured as well as unstructured, and apply their own customized or ad hoc analytics against
these big data sets to find security or fraud events. There are many different vendors that are
positioned to support this type of activity. The sample below represents those vendors for which
Gartner was able to find customer references to validate implementations that targeted security or
fraud use cases.
Representative Vendors

Splunk: Well-known in this category, Splunk makes it very easy to bring machine-readable
data together within seconds, so that organizations can quickly search their enterprise and
infrastructure data in "Google-like" fashion. This enables them to use big data analytics to
detect security infractions and fraud that would otherwise go undetected.
The results of any search can be automated and operationalized in Splunk to enable proactive
monitoring and detection. All data must be indexed, however, which makes it relatively costly to
scale to hundreds of terabytes of data. A new product, Hunk: Splunk Analytics for Hadoop,
interfaces with Hadoop and should address this issue because data that does not need to be
indexed will reside in Hadoop and be accessible through Splunk analytics. Splunk software
enables analytics through its search processing language and a pivot interface, but customers
can use more-sophisticated third-party or homegrown statistical models and data mining
techniques on top of Splunk.

Palantir: Well-known for its big data analytics capabilities and work across sectors, Palantir
also targets enterprises that want to solve cybersecurity and fraud problems. Proven as a
forensic and investigation tool, Palantir makes it easy to integrate all kinds of structured and
unstructured information (including video files) so that human investigators can get their jobs
done quickly and relatively easily, investigating alerts and finding bigger patterns of cybercrime
and fraud. Palantir has advanced security, workflow and secure collaboration built into its
product, features it originally developed for U.S. military and intelligence agencies. Successful
Palantir implementations are dependent on Palantir professional services, which the firm always
bundles into relatively pricey licensing arrangements.

SAS Institute: The company is best known for advanced analytics that are applied in
customized big data analytics projects. Proven in fraud use cases, the firm is targeting
cybersecurity use cases, although SAS could not identify any customers yet in this area. SAS
comes with its own data cleansing and data integration tools, and integrates with SAS datasets,
Hadoop and data streams, enabling it to address large amounts of information in memory. The
company is currently organizing to address the cybersecurity market with a revised and
dedicated solution with embedded analytics.

BAE Systems Applied Intelligence (formerly BAE Detica): Its NetReveal Social Network
Analytics software was one of the first in the market to solve various fraud problems such as
insurance, tax and banking fraud. The company has a new cybersecurity offering, but it is not yet
proven in cyber and enterprise security use cases: Gartner could not find a customer to validate
its functionality, and the firm did not offer up any customers Gartner could speak with regarding
these use cases. In 2013, the firm released standard fraud models that
should reduce prior customer dependence on BAE for developing and implementing fraud
applications. Once custom models are developed for the enterprise, they can be updated with
new data frequently.

Seculert: An advanced threat protection vendor specializing in finding malware-based and
advanced threats against an enterprise, Seculert analyzes an enterprise's HTTP traffic logs, and
correlates them with malware profiles and external threat intelligence that it gathers. Enterprises
can feed Seculert their logs on an ongoing basis. The firm also has an API so that existing
on-premises security devices can continually send traffic logs to the Seculert service. Customers
report excellent results in finding security infractions and advanced threats against them that
were heretofore undiscovered.

Domain 3: External Cyberthreat and Fraud Intelligence


In this domain, vendors apply big data analytics to external data on threats and bad actors, and, in
some cases, combine the external data with relevant data that their customers export to them. Most
of these vendors are also creating and supporting communities of interest where threat intelligence
and analytics are shared across customers.
These services fall into two broad categories:
1. Cyberthreat intelligence services, where vendors like Fox IT; BrandProtect; Internet Identity;
Imperva; ThreatRadar; RSA, The Security Division of EMC; Seculert; Neustar; Mandiant; Opera
Solutions; BAE Systems; and others (see "How to Select a Security Threat Intelligence Service")
scour the dark and public Web to find malicious activity and threats against enterprises,
prospects and customers. Often, they turn this information into actionable data, e.g., by
providing lists of IP addresses of known criminal servers or signatures of malware attacks
used to perpetrate crime.

2. Fraud exchanges, where vendors like ID Experts (for healthcare fraud), Early Warning Services,
41st Parameter, CyberSource, Guardian Analytics, NuData, Retail Decisions, RSA, iovation, and
ThreatMetrix combine information from their customers and other third parties to identify
entities, such as devices, email addresses or phone numbers, associated with known fraud. This
data can then be used on a customer's system blacklist to block activities with known bad
entities. The fraud exchanges typically also collect information on fraud attacks to then share
with their customers, typically in free-text format, such as email.
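
Using exchange data as a blacklist amounts to a set lookup on each entity attached to a transaction. The sketch below assumes a hypothetical blacklist of (entity type, value) pairs and a transaction represented as a dictionary; the field names, sample values and the `decide` function are illustrative and do not reflect any fraud exchange's actual data format.

```python
# Entity types checked against the blacklist (an assumption for this sketch).
ENTITY_FIELDS = ("device_id", "email", "phone")


def decide(transaction, blacklist):
    """Return "block" if any entity on the transaction is blacklisted,
    otherwise "accept". Blacklist values are expected in lowercase."""
    for kind in ENTITY_FIELDS:
        value = transaction.get(kind)
        if value is not None and (kind, value.lower()) in blacklist:
            return "block"
    return "accept"


# Hypothetical entries received from a fraud exchange.
blacklist = {
    ("email", "mule@example.com"),
    ("device_id", "dev-9f3a"),
}

print(decide({"email": "Mule@Example.com", "device_id": "dev-0001"}, blacklist))
print(decide({"email": "ok@example.com", "phone": "+1-555-0100"}, blacklist))
```

A production policy would usually add a third outcome that sends borderline cases for verification instead of blocking them outright.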

Conclusion
Big data analytics enables organizations to get faster intelligent access to their own data. This
means they can unearth and stop security and fraud infractions much more easily and quickly than
they have been able to before. However, it's still early days for these use cases. A Gartner survey
of 720 enterprises in 3Q13 found that only 8% of the respondents had actually deployed a big
data project.
The technology is generally a couple of years ahead of most organizations' ability to adopt it, but
there are some promising applications with canned analytics that should give enterprises a
jump-start in the use cases they support. Still, by 2016, 25% of large global companies will have
adopted big data analytics for at least one security or fraud detection use case, and will achieve
a positive return on investment within the first six months of implementation.
Determine which solutions among the three domains make the most sense for your organization to
take advantage of. Start small, but think big, and develop a road map that encompasses multiple
use cases and applications across your organization. The ROI on big data analytics is typically too
big to ignore.

Gartner Recommended Reading


Some documents may not be available as part of your current Gartner subscription.
"How to Select a Security Threat Intelligence Service"
"Innovation Insight: Innovation Drives Seven Dimensions of Context-Aware Enterprise Security Systems"
"Mitigate Breaches With Real-Time Discovery"
"Use Big Data Analytics to Solve Fraud and Security Problems"
"Conduct Digital Surveillance Ethically and Legally: 2012 Update"
"Information Security Is Becoming a Big Data Analytics Problem"
"Emerging Technology Analysis: Cloud-Based Reputation Services"

Evidence
1. During October and November 2013, Gartner interviewed multiple organizations that were using
the big data analytics capabilities offered by the vendors researched in this report. The real-world
examples and case studies in this research are based on these interviews.

2. These big data analytics project results were achieved at a leading U.S. retailer in 2012, by a team
led by its former chief information security officer, Demetrios Lazarikos (Laz), now an IT security
strategist at Blue Lava Consulting.

Note 1 "Big Data" Defined
Gartner defines "big data" as high-volume, high-velocity and high-variety information assets that
demand cost-effective, innovative forms of information processing for enhanced insight and
decision making (see "The Importance of 'Big Data': A Definition").
Note 2 Contextual Data Examples
Here are examples of contextual data used to feed and enhance security and fraud analytics:

Location

Network velocity

Asset context

Time of access

Vulnerability

Application behavior

Note 3 SIEM Vendors


Gartner clients have evaluated several SIEM vendors for Domain 1 and Domain 2, and found that
most SIEM technology is not able to ingest nontraditional security data, such as behavioral profiles
and system resource utilization, without special customized efforts. This is problematic, because of
time delays inherent to customized solutions and because it does not scale well as new data needs
arise.

GARTNER HEADQUARTERS
Corporate Headquarters
56 Top Gallant Road
Stamford, CT 06902-7700
USA
+1 203 964 0096
Regional Headquarters
AUSTRALIA
BRAZIL
JAPAN
UNITED KINGDOM

For a complete list of worldwide locations, visit http://www.gartner.com/technology/about.jsp

© 2014 Gartner, Inc. and/or its affiliates. All rights reserved. Gartner is a registered trademark of Gartner, Inc. or its affiliates. This
publication may not be reproduced or distributed in any form without Gartner's prior written permission. If you are authorized to access
this publication, your use of it is subject to the Usage Guidelines for Gartner Services posted on gartner.com. The information contained
in this publication has been obtained from sources believed to be reliable. Gartner disclaims all warranties as to the accuracy,
completeness or adequacy of such information and shall have no liability for errors, omissions or inadequacies in such information. This
publication consists of the opinions of Gartner's research organization and should not be construed as statements of fact. The opinions
expressed herein are subject to change without notice. Although Gartner research may include a discussion of related legal issues,
Gartner does not provide legal advice or services and its research should not be construed or used as such. Gartner is a public company,
and its shareholders may include firms and funds that have financial interests in entities covered in Gartner research. Gartner's Board of
Directors may include senior managers of these firms or funds. Gartner research is produced independently by its research organization
without input or influence from these firms, funds or their managers. For further information on the independence and integrity of Gartner
research, see "Guiding Principles on Independence and Objectivity."
