
PATENT

Attorney Docket No.: 098708-1012879-002100USP

PROVISIONAL
PATENT APPLICATION
WEB BASED DECEPTION

Inventors:
Abhishek Singh, Morgan Hill, CA

Assignee: Acalvio Technologies, Inc.


11674 Seven Springs Dr.
Cupertino, CA 95014 United States of America

Entity: Small

KILPATRICK TOWNSEND & STOCKTON LLP



WEB BASED DECEPTION

BRIEF SUMMARY
[0001] Provided are devices, computer-program products, and methods for analyzing
web-based attacks. In some implementations, a device, computer-program product, and method for
generating an indicator to describe a result of a web-based attack are provided. For example, a
method can include receiving network traffic directed to a production web server on a network.
In some examples, the network traffic can be configured to request a response from the
production web server. In some examples, the network traffic can be intercepted prior to being
received by the production web server.

[0002] The method can further include identifying an attack pattern included in the network
traffic and sending the attack pattern to an emulated device. In some examples, the emulated
device can be configured to send the attack pattern to a testing web server. The method can
further include identifying a response associated with the attack pattern and generating an
indicator describing the response.

[0003] In some implementations, the network traffic can be intercepted by a device on the
network that analyzes incoming traffic to protect the enterprise network from intrusion. In some
implementations, the testing web server can be the production web server. In some
implementations, the testing web server can be an emulated web server in the emulated network.
In such implementations, the testing web server can duplicate one or more web-based services of
the production web server.

[0004] In some implementations, the response is a web page. In such implementations, the
network traffic can cause a script to be inserted into the web page. In such implementations, the
script can be executed when the web page is read by a web browser.

[0005] In some implementations, the script can be configured to access a remote script in a
remote location. In such implementations, the remote script can be executed when the web page
is read by the web browser. In some implementations, the network traffic can include a database
query. In such implementations, the response includes information from a database.
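
As one illustrative sketch of the script-insertion case described above, the following Python fragment scans a response web page for script elements and separates scripts embedded in the page from scripts loaded from a remote location. The names `ScriptFinder` and `find_scripts` are hypothetical and are not part of the described system; the use of Python's standard `html.parser` module is an assumption made only for illustration.

```python
from html.parser import HTMLParser


class ScriptFinder(HTMLParser):
    """Collects script elements found in a web page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.remote_sources = []   # src URLs of remotely hosted scripts
        self.inline_scripts = []   # bodies of scripts embedded in the page
        self._in_inline = False

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        src = dict(attrs).get("src")
        if src:
            self.remote_sources.append(src)
        else:
            # Start collecting the body of an inline script.
            self._in_inline = True
            self.inline_scripts.append("")

    def handle_data(self, data):
        if self._in_inline:
            self.inline_scripts[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_inline = False


def find_scripts(page):
    """Return (remote script URLs, inline script bodies) for a page."""
    finder = ScriptFinder()
    finder.feed(page)
    finder.close()
    return finder.remote_sources, finder.inline_scripts
```

In such a sketch, a script that appears in the response but not in the web page as authored could be one result of the attack pattern, and a remote source URL could be recorded as part of an indicator.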

[0006] The terms and expressions that have been employed are used as terms of description
and not of limitation, and there is no intention in the use of such terms and expressions of
excluding any equivalents of the features shown and described or portions thereof. It is
recognized, however, that various modifications are possible within the scope of the systems and
methods claimed. Thus, it should be understood that, although the present system and methods
have been specifically disclosed by examples and optional features, modification and variation of
the concepts herein disclosed can be resorted to by those skilled in the art, and that such
modifications and variations are considered to be within the scope of the systems and methods as
defined by the appended claims.

[0007] This summary is not intended to identify key or essential features of the claimed subject
matter, nor is it intended to be used in isolation to determine the scope of the claimed subject
matter. The subject matter should be understood by reference to appropriate portions of the
entire specification of this patent, any or all drawings, and each claim.

[0008] The foregoing, together with other features and examples, will be described in more
detail below in the following specification, claims, and accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS
[0009] Illustrative examples are described in detail below with reference to the following
figures:

[0010] FIG. 1 illustrates an example of a customer network that includes a threat intelligence
engine.

[0011] FIGS. 2A-2D illustrate examples of configurations of a high-interaction network.

[0012] FIG. 3 illustrates an example where a uniform resource locator (URL) 312 is sent to a
web server 320 from a user workstation 310 over an external network.

[0013] FIG. 4 is a flowchart illustrating an embodiment of a process for analyzing web-based
attacks using a high-interaction network.

[0014] FIG. 5 illustrates examples of ways in which the threat intelligence engine 508 can use
indicators generated by its analytic engine 518.

DETAILED DESCRIPTION
[0015] A network at a site such as a business or a private home typically includes at least basic
network traffic monitoring and filtering to protect the network from harmful activity. For
example, a site's network typically includes a firewall attached to or incorporated into a gateway
device that connects the site's network to outside networks. A firewall generally applies rules to
network traffic, and controls what network traffic can come into a network. The firewall also
typically controls network traffic that can go out of the network. Some sites rely on more than
just a firewall, and have multi-layer, sophisticated security perimeters with multiple network
security tools, such as anti-virus software, intrusion prevention systems (IPS), intrusion detection
systems (IDS), email filters, and others. In some examples, the firewall and the multiple security
tools can be referred to as network information security systems.

[0016] Network security tools generally protect a site's network by identifying legitimate
network packets and questionable network packets. Legitimate network traffic can be forwarded
to the site's network. Suspect network traffic can be logged and/or can trigger alerts, and can
then be discarded. In some cases, the suspect network traffic corresponds to a known threat, such
as previously identified malware, or a denial of service (DoS) attack from a known Internet
Protocol (IP) address. When suspect network traffic corresponds to a known threat, in many
cases, the nature and effect of the threat is understood, and further analysis of the network traffic
might not be necessary. When the exact threat posed by suspect network traffic is not known,
however, further analysis of the associated packets, rather than discarding them, can be
beneficial. For example, analysis of suspect network traffic can provide information about an
effect the associated packets can have on a network. This information can be useful for
determining whether a site's network has already been infiltrated and harmed. This information
can also be used to strengthen existing network defenses. This information can also confirm
whether suspect network traffic is truly harmful, or whether the suspect network traffic is
actually innocent.

[0017] In various implementations, a threat intelligence engine can be added to a site's
network to analyze suspect network traffic. The threat intelligence engine can receive network
traffic marked as suspect by other network security tools. In some implementations, when the
suspect network traffic appears to correspond to a known network threat, the threat intelligence
engine can log the threat and can take no further action. When the suspect network traffic does
not correspond to a known threat, the threat intelligence engine can analyze the suspect network
traffic using a high-interaction network.

[0018] The high-interaction network is a closely monitored, isolated network that provides an
environment in which the contents of suspect network traffic can be interacted with just as in a
real network. The threat intelligence engine can use the high-interaction network to conduct
static analysis of suspect network traffic (e.g., opening files, decompressing archives, etc.),
dynamic analysis (e.g., unpacking the contents of packets in the suspect network traffic, and
interacting with the contents as would a network user), and network analysis (e.g., tracing
network activity initiated by interacting with the contents of the suspect network traffic).

[0019] The high-interaction network can further record the results of these analyses, as well as
information about the suspect network traffic. The threat intelligence engine can configure the
high-interaction network to record data over the course of an incident. An incident can be an
attack or a suspected attack on a network. The threat intelligence engine can record data for the
incident from the time a suspected attack is detected until the suspected attack is terminated.

[0020] Once the threat intelligence engine has collected data for an incident, the threat
intelligence engine can analyze data associated with the incident, using an analytic engine. The
analytic engine can have one or more analysis engines, each configured to analyze incident data
of a particular type. The analytic engine can further include a correlation engine, configured to
correlate the results from the various analysis engines and reconstruct the events that led up to
any damage caused by the incident.
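
One possible arrangement of the analytic engine can be sketched as follows. In this minimal Python sketch, each analysis engine is modeled as a function registered for one type of incident data, and the correlation engine is modeled simply as chronological ordering of the events the analysis engines report. The function name `run_analytic_engine`, the `(timestamp, description)` event format, and the ordering-by-timestamp correlation are assumptions made for illustration only.

```python
def run_analytic_engine(incident_data, analysis_engines):
    """Analyze incident data and return a reconstructed event timeline.

    incident_data: list of (data_type, payload) pairs recorded for an incident.
    analysis_engines: dict mapping a data type to a function that takes a
        payload and returns a list of (timestamp, description) events.
    """
    events = []
    for data_type, payload in incident_data:
        engine = analysis_engines.get(data_type)
        if engine is None:
            continue  # no analysis engine is configured for this data type
        events.extend(engine(payload))
    # Correlation step (assumed): order events in time so that the sequence
    # leading up to any damage can be read from first to last.
    return sorted(events, key=lambda event: event[0])
```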

[0021] From this correlation, the threat intelligence engine can generate indicators that
describe the suspect network traffic. These indicators can include network indicators, file
indicators, and static indicators. The indicators can also describe the harm (if any) the suspect
network traffic caused. The threat intelligence engine can use these indicators to verify whether a
site's network has been previously infiltrated and compromised by the threat posed by the
suspect network traffic. In some cases, the threat intelligence engine can also send the indicators
to a central collector, for sharing with networks at other sites. The central collector can also
provide indicators to the threat intelligence engine that were generated by other networks.
Sharing indicators between networks at different sites can allow each of these sites to have even
stronger defenses.

[0022] Determining information associated with an attack can be useful in better defending a
network. However, many would-be attacks can be stopped by network security infrastructure
before an attack can commence. In such situations, information about the would-be attacks can
be lost.

[0023] For example, an attack to insert a script into a web page (which would be executed
when a browser parses the web page) can be identified by the network security infrastructure
before a web server generates the web page. In such an example, any information regarding a
result of executing the script can be lost because the script is never included in a web page.

[0024] In other examples, a query attack (that is, an attack to query an underlying database
associated with a web server to exploit the underlying database) can be identified by the
network security infrastructure. The query attack can be intended to request data from the
underlying database. However, because the network security infrastructure stops the query
attack, the result of the query attack can be unknown. Nevertheless, knowing what the attacker was
querying can help to defend the network from future attacks that the network security
infrastructure might not catch.
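
As a hedged illustration of identifying a query attack in intercepted network traffic, the following Python sketch extracts the query parameters of a requested URL and flags values that resemble database query injection. The pattern list `SQL_PATTERNS` and the function `suspicious_query_params` are hypothetical; the patterns are a deliberately small set of heuristics chosen for illustration and are not a complete or authoritative detection method.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Hypothetical, deliberately small set of query-injection heuristics.
SQL_PATTERNS = [
    re.compile(r"\bunion\b.+\bselect\b", re.IGNORECASE),
    re.compile(r"\bor\b\s+\d+\s*=\s*\d+", re.IGNORECASE),
    re.compile(r";\s*drop\s+table\b", re.IGNORECASE),
]


def suspicious_query_params(url):
    """Return (name, value) pairs whose decoded value matches a heuristic."""
    hits = []
    query = urlsplit(url).query
    for name, value in parse_qsl(query, keep_blank_values=True):
        if any(pattern.search(value) for pattern in SQL_PATTERNS):
            hits.append((name, value))
    return hits
```

In such a sketch, the flagged parameter values show what the attacker was attempting to query, which is the information that would otherwise be lost when the network security infrastructure stops the request.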

[0025] In various implementations, a threat intelligence engine can receive network traffic sent
to a web server on a network. The network traffic can be intercepted by network security
infrastructure of the network. The threat intelligence engine can parse the network traffic to
identify an attack pattern. Parsing can include locating an executable file in the network traffic.
For example, in the case of office files (such as XLS, DOCX, PPSX, PPTX), the office files can
include a macro with malicious code. In such an example, the macros can be identified and
extracted. Other examples of file types with malicious code can include a PDF (with JavaScript
embedded in the PDF) and image files (with hidden executables). In some examples, a determination
of whether a macro is malicious can be made on the file as it is. In other examples, the determination
can be made on the extracted format. The attack pattern can then be repackaged and sent to, or
replayed on, a high-interaction network to identify a target or result of the attack pattern. In some
examples, the high-interaction network can emulate at least a portion of the network such that an
attack pattern can properly be executed on the high-interaction network, just as if the attack
pattern were executed on the network. In some examples, the repackaged network traffic can be
sent to an emulated device on the high-interaction network. The emulated device can send the
repackaged network traffic to either a web server on a site network or an emulated web server on
the high-interaction network. Either web server can respond to the emulated network traffic, and
the threat intelligence engine can identify one or more indicators associated with the attack
pattern. The one or more indicators can describe the attack pattern such that other attack patterns
on the network can be identified by comparing the one or more indicators.
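
The flow described above (identify an attack pattern, replay the traffic through an emulated device to a testing web server, and generate an indicator from the response) can be sketched as follows. This is a minimal Python sketch under stated assumptions: the `Indicator` fields, the function `replay_attack`, and the two caller-supplied callables `find_pattern` and `testing_server` are hypothetical stand-ins for the parsing step and for the production or emulated web server, respectively.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Indicator:
    """Hypothetical record describing an attack pattern and its result."""
    attack_pattern: str
    response_summary: str


def replay_attack(request: str,
                  find_pattern: Callable[[str], Optional[str]],
                  testing_server: Callable[[str], str]) -> Optional[Indicator]:
    """Replay intercepted traffic against a testing web server.

    find_pattern returns a label for the attack pattern, or None if the
    request contains no recognized pattern. testing_server stands in for
    either the production web server or an emulated web server.
    """
    pattern = find_pattern(request)
    if pattern is None:
        return None  # nothing to replay
    # The emulated device forwards the repackaged request and captures
    # whatever the testing web server sends back.
    response = testing_server(request)
    return Indicator(attack_pattern=pattern, response_summary=response[:80])
```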

[0026] FIG. 1 illustrates an example of a customer network 102 that includes a threat
intelligence engine 108. A customer network is a network that can be found at a large or small
business, at a school campus, in a government building, or in a private home. A customer
network can be described as a local area network (LAN) or a group of LANs. A customer
network can include network infrastructure devices, such as routers, switches, hubs, repeaters,
and controllers, among others. A customer network can also include various computing systems,
such as servers, desktop computers, laptop computers, tablet computers, personal digital
assistants, and smart phones, among others. A customer network can also include other
electronic devices with network interfaces, such as televisions, entertainment systems,
thermostats, and refrigerators, among others.

[0027] In this example, the customer network 102 includes a gateway device 162 that connects
the customer network 102 to other networks, such as the Internet 150. The gateway device 162
can be, for example, a modem used to connect to telephone, cable, digital subscriber line (DSL),
satellite, optical fiber lines, or any combination thereof. In some cases, a gateway device 162 can
include integrated router functionality. The gateway device 162 can include a firewall 164, or
can be connected to a firewall 164 device. Generally, all network traffic coming into or going out
of the customer network 102 passes through the gateway device 162 and the firewall 164. Some
customer networks can have multiple gateways to outside networks, where each gateway
functions as a point of entry for outside network traffic to enter the customer network 102. Each
of these gateways typically includes a firewall.

[0028] The customer network 102 of this example also includes a network security
infrastructure 106. The network security infrastructure 106 adds additional monitoring and
filtering for network traffic that survives filtering by the firewall. The network security
infrastructure 106 can include network security tools 130, 132, such as, for example, anti-virus
tools, intrusion prevention systems (IPS), intrusion detection systems (IDS), email filters, spam
detectors, and file transfer protocol (FTP) filters, among others. Some network security tools
130, 132 can be multi-layered, such that network packets that survive analysis by a first security
tool 130 are then analyzed by a second security tool 132. For example, email traffic can first be
filtered for viruses, and then be filtered for spam.
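
The multi-layered arrangement described above can be illustrated with a short Python sketch in which each security tool is modeled as a function that returns True when a message survives its analysis. The function `classify` and the two example filters used with it are hypothetical and greatly simplified; actual security tools apply far richer analysis.

```python
def classify(message, security_tools):
    """Pass a message through each security tool in order.

    A message is treated as apparently legitimate only if it survives
    every tool; the first tool that rejects it marks it as suspect.
    """
    for tool in security_tools:
        if not tool(message):
            return "suspect"
    return "legitimate"
```

For example, an email message could be passed to `classify` with a virus filter listed first and a spam filter listed second, so that only messages surviving the virus filter are examined for spam.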

[0029] The network security infrastructure 106 identifies network traffic 134 that appears to be
legitimate and safe, and forwards this apparently legitimate network traffic 134 to the
customer's site network 104.

[0030] The site network 104 is where the hardware, software, and internal users of the
customer network 102 can be found, and where the operations of the customer network 102
occur. In this example, the site network 104 includes several routers 166 that connect a switch
174, multiple servers 168, 170, and several subnets 172 together. The site network 104 can
receive apparently legitimate network traffic 134 through one of the routers 166. The switch 174
further connects user workstations 176 to the site network 104. The customer network's 102
users can access the site network 104 using the user workstations 176, and/or other wired or
wireless devices.

[0031] The servers in this example include a group of file servers 168. The file servers 168 can
provide storage for files used by the customer network's 102 users and/or for data stored and/or
operated on by the customer network 102. For example, the file servers 168 can store product
and customer data when the customer network 102 belongs to an online merchant, or can store
financial data when the customer network 102 belongs to a financial institution. The servers in
this example also include a group of compute servers 170. The compute servers 170 can provide
processing resources for software used by the customer network's 102 users and/or for the
operation of the customer network 102. For example, the compute servers 170 can provide
hosting for the customer network's 102 website or websites, and/or can provide databases for
volumes of data stored and/or operated on by the customer network 102, and/or can provide
distributed computing resources when the customer network 102 is part of an engineering firm.

[0032] The site network 104 can further include subnets 172. A subnet or subnetwork can be
a separate part of a network. Generally, a subnet is logically or physically distinct from other
parts of a network. A subnet can include additional routers, switches, user workstations, and/or
servers.

[0033] The site network 104 described here is provided as an example. A customer site's
network can be less complex or more complex than is illustrated by this example, and can
include network infrastructure not described here.

[0034] As noted above, the network security infrastructure 106 can separate apparently
legitimate network traffic 134 from suspect network traffic 136. Suspect network traffic 136,
which ordinarily can be discarded by the network security infrastructure 106, can be forwarded
to the threat intelligence engine 108. In some cases, some network packets can be flagged for
inspection but otherwise look legitimate. In these cases, the network traffic can be both
forwarded to the site network 104 and also forwarded to the threat intelligence engine 108. The
threat intelligence engine 108 can attempt to determine what harm, if any, the suspect network
traffic 136 can cause to the site network 104. The threat intelligence engine 108 can subsequently
produce indicators that identify and/or describe any harm caused by the suspect network traffic
136. In various implementations, the threat intelligence engine 108 can include a prioritization
engine 110, a high-interaction network 116, and an analytic engine 118.

[0035] The prioritization engine 110 can analyze the suspect network traffic 136 and attempt
to identify whether the suspect network traffic 136 represents a known threat. Known threats
include, for example, previously identified malware, packets from IP addresses known to send
malicious network traffic, and authentication requests previously associated with unauthorized
users, among many others. Because these threats were previously identified, in most cases the
network security infrastructure 106 likely has already been configured to identify and block
network traffic associated with these threats. Alternatively or additionally, the prioritization
engine 110 can determine that the threat posed by the suspect network traffic 136 is one that the
site network 104 is not vulnerable to. For example, the suspect network traffic 136 can include a
virus designed to exploit a vulnerability in version 1.0 of a standard operating system, while none
of the computers in the site network 104 are running that particular operating system or that
version of the operating system. Because the site network 104 is not vulnerable to this attack, in
some cases suspect network traffic 136 associated with the attack need not be analyzed.

[0036] The prioritization engine 110 can include a database of indicators describing network
threats that the threat intelligence engine 108 has previously identified. In some implementations,
the prioritization engine's 110 database can also include indicators received from a central
database 154 located outside of the customer network 102. When the suspect network traffic 136
is a known threat, and/or is a threat the site network 104 is not vulnerable to, then the
prioritization engine 110 can take note of the identity of the threat and then discard the suspect
network traffic 136.

[0037] When the prioritization engine 110 determines that the suspect network traffic 136 is
associated with an unknown threat, then the threat intelligence engine 108 can direct the suspect
network traffic 136 to the high-interaction network 116 for detailed analysis.
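
The decision made by the prioritization engine can be sketched in Python as follows. In this sketch, the database of previously identified threats is modeled as a set of SHA-256 digests, and the site network's exposure is modeled as a set of operating system identifiers. The function `prioritize`, its parameters, and its return labels are hypothetical and are used only to illustrate the three outcomes described above.

```python
import hashlib
from typing import Optional, Set


def prioritize(payload: bytes,
               known_threat_hashes: Set[str],
               targeted_os: Optional[str],
               site_os_inventory: Set[str]) -> str:
    """Decide what to do with the payload of suspect network traffic.

    targeted_os is the operating system version the payload is believed
    to exploit, or None when that is not known.
    """
    digest = hashlib.sha256(payload).hexdigest()
    if digest in known_threat_hashes:
        return "discard-known"            # previously identified threat
    if targeted_os is not None and targeted_os not in site_os_inventory:
        return "discard-not-vulnerable"   # no matching system at the site
    return "analyze"                      # send to high-interaction network
```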

[0038] The high-interaction network 116 can be a self-contained, closely monitored network
that can be quickly reconfigured, repaired, brought up, or taken down. The high-interaction
network 116 is not a part of the site network 104, and exists within a physically and/or virtually
isolated, contained space. The high-interaction network 116, however, can appear and behave
just as does a real network, including having a connection to the Internet 150. Additionally, the
high-interaction network 116 can be configurable, so that it can resemble the site network 104 or
only a part of the site network 104, as explained in further detail below. The high-interaction
network 116 can be configured to resemble another network entirely, should the need arise. In
most cases, however, the threat intelligence engine 108 is configured to detect threats to the site
network 104, and thus the high-interaction network 116 will more often emulate the site network 104.

[0039] The high-interaction network 116 can be built using a number of testing devices, such
as physical routers, switches, and servers. Alternatively or additionally, the high-interaction
network 116 can exist as a fully emulated network residing on one or more servers. In a fully
emulated network, the testing devices can be software processes configured to resemble routers
and servers. Alternatively or additionally, the high-interaction network 116 can be constructed
using a combination of physical devices and emulated devices. In some implementations, the
high-interaction network 116 can reside at a cloud service provider, and thus be located outside
of the customer network 102.

[0040] The high-interaction network 116 can provide a controlled space for conducting static,
dynamic, and network analysis of the suspect network traffic 136. In the high-interaction
network 116, the suspect network traffic 136 can be free to engage in whatever activity it is
capable of, including doing harm. Doing harm is specifically allowed so that how the suspect
network traffic 136 caused the harm, and the nature of the harm, can be captured. Additionally,
the high-interaction network 116 can include automated processes that respond to activity
initiated by the suspect network traffic 136; for example, automated processes can respond just
as would a human network user. Any activity initiated by the suspect network traffic 136 within
the high-interaction network 116 can be closely monitored and recorded.

[0041] The threat intelligence engine 108 can send activity logs, memory snapshots, and any
other information generated by analyzing the suspect network traffic 136 in the high-interaction
network 116 to the analytic engine 118. As discussed in further detail below, the analytic engine
118 can process data collected in the high-interaction network 116 to determine whether the
suspect network traffic 136 was truly malicious or was, in fact, harmless. In either case, the
analytic engine 118 can produce indicators that describe the suspect network traffic 136. As
described in further detail below, the indicators can include characteristics that uniquely identify
the suspect network traffic 136, any effect that resulted from interacting with the contents of the
suspect network traffic 136, and/or any activity triggered by the suspect network traffic 136
within the high-interaction network 116.

[0042] In some implementations, the indicators generated by the analytic engine 118 can be
used to verify 140 whether the site network 104 has already suffered the attack identified by the
indicators. For example, the threat intelligence engine 108 can identify an email that contained a
virus. The email might have been flagged as suspect because it was addressed to a user that does
not exist within the customer network 102. The threat intelligence engine 108 can, using the
high-interaction network 116, allow the virus to affect a simulated user workstation, and see
what effect the virus has on the simulated workstation. For example, the virus can modify
operating system settings in the simulated workstation to make the simulated workstation more
vulnerable to attack. The analytic engine 118 can subsequently generate indicators that identify
the malicious email and describe the effect of the virus. The threat intelligence engine 108 can
then use these indicators to verify 140 whether any user workstations 176 in the site network 104
have already received the malicious email and been infected by this virus.
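
The verification described in this example can be sketched in Python as a comparison between an indicator and recorded workstation state. The dictionary layout of the indicator and of each workstation record, and the function name `find_compromised`, are assumptions made only for illustration; the specification does not define a particular indicator format.

```python
def find_compromised(workstations, indicator):
    """Return names of workstations that match an email-virus indicator.

    workstations: dict mapping a workstation name to a record with
        "received_email_hashes" (a set) and "settings" (a dict).
    indicator: dict with "email_sha256" and "modified_settings", the
        settings values the virus was observed to leave behind.
    """
    hits = []
    for name, state in workstations.items():
        got_email = indicator["email_sha256"] in state["received_email_hashes"]
        settings_changed = all(
            state["settings"].get(key) == value
            for key, value in indicator["modified_settings"].items()
        )
        if got_email and settings_changed:
            hits.append(name)
    return sorted(hits)
```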

[0043] In some implementations, the threat intelligence engine 108 can also use the indicators to
update 142 the security infrastructure 106. For example, the threat intelligence engine 108 can
identify new malware that should be blocked by an anti-virus tool, new external IP addresses that
should be blocked by the firewall, or user accounts that have been compromised, among others.

[0044] In some implementations, the threat intelligence engine 108 can also analyze suspect
network traffic 136 associated with a known threat. In these implementations, rather than
discarding this suspect network traffic 136, the prioritization engine 110 can be configured to
send this suspect network traffic 136 to the high-interaction network 116. The high-interaction
network 116 can then, for example, be used to see how susceptible the site network 104 can be to
the threat posed by the suspect network traffic 136. The analytic engine 118 can produce
indicators that describe how the high-interaction network 116 reacted to the threat. These
indicators can then be used to improve the network security infrastructure 106.

[0045] In some implementations, the threat intelligence engine 108 can also send indicators
generated by the analytic engine 118 to a site database 120. The customer network 102 can have a
site database 120 when the customer network 102 has additional site networks 124. For example,
a business occupying a campus with multiple buildings can have a separate network in each
building. These separate networks might or might not be able to communicate with each other, but
share a common owner and have common control. Each of these separate networks (which can
be described as subnets) can be considered a site network 104, 124. Each additional site
network 124 can have its own threat intelligence engine. Each threat intelligence engine can
send indicators that it generates to the site database 120. Each threat intelligence engine can
also receive indicators generated by the additional site networks 124 from the site database 120.
By distributing threat indicators across the customer network 102, the customer network 102 as a
whole can be made more secure.

[0046] In some implementations, the threat intelligence engine 108 can also send indicators to
a central database 154 located outside the customer network 102. In some implementations, the
threat intelligence engine 108 can send its indicators directly to the central database 154. In
implementations that include a site database 120, the site database 120 can send indicators for all
the site networks 104, 124 to the central database 154. The central database 154 can also receive
indicators from other networks 122. These other networks 122 can also include their own threat
intelligence engines for analyzing suspect network traffic and generating indicators describing
suspect network traffic. The central database 154 can also share indicators between the other
networks 122 and the illustrated customer network 102. That is, the customer network 102 can
receive indicators generated at the other networks 122. By sharing indicators across networks
102, 122, all the networks 102, 122 can be made more secure.
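
The sharing of indicators through a site database or a central database can be sketched in Python as a store that accepts indicators from each contributing network and returns to each network the indicators generated elsewhere. The class `IndicatorStore` and its methods are hypothetical; an actual site database 120 or central database 154 could be organized quite differently.

```python
class IndicatorStore:
    """Hypothetical stand-in for a site database or a central database."""

    def __init__(self):
        self._by_source = {}  # source network name -> set of indicators

    def publish(self, source, indicators):
        """Record indicators generated by the named source network."""
        self._by_source.setdefault(source, set()).update(indicators)

    def fetch_for(self, source):
        """Return indicators generated by every network except source."""
        shared = set()
        for other, items in self._by_source.items():
            if other != source:
                shared |= items
        return shared
```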

[0047] In some implementations, a device in the high-interaction network 116 can
communicate with a server or system in the site network 104. For example, the device in the
high-interaction network 116 can communicate 126 with the compute servers 170. In some
examples, the device in the high-interaction network 116 can communicate with a web server of the
compute servers 170. The communication 126 can include accessing the compute servers 170
from the high-interaction network 116. Responses from the compute servers 170 can be returned
to the device on the high-interaction network 116.

[0048] As noted above, the threat intelligence engine's high-interaction network can be
configured to emulate all or part of a customer's site network. FIG. 2A illustrates one example of
the configuration of a high-interaction network 216. In this example, the high-interaction
network 216 has been configured to emulate all of a site network. Emulating all or nearly all of a
site network can be useful when, for example, suspect network traffic has a potentially broad
effect, or when the behavior of suspect network traffic is particularly unpredictable, or when the
behavior of the suspect network traffic depends on the traffic being tricked into believing that it
has infiltrated the site's real network.

[0049] In this example, the high-interaction network 216 has been configured to emulate the
site network 104 illustrated in FIG. 1. As such, the high-interaction network 216 of FIG. 2A
includes test devices configured as routers 266, a switch 274, user workstations 276, multiple
servers 268, 270, and several subnets 272. These user workstations 276 can be configured just as
the user workstations in the site network 104, and can further include automated processes that
emulate the activity of the site network's 104 users. The servers include a group of file servers
268 that emulate the files stored by the file servers in the site network 104. The servers also
include a group of compute servers 270 that provide the same processing resources provided by
the compute servers in the site network 104. For example, a compute server can be a web server
that can emulate a web site. The high-interaction network 216 can further include subnets 272
that emulate the subnets found in the site network 104. The high-interaction network 216 can
further include a gateway 262 that connects the high-interaction network 216 to the Internet 250,
just as the site network 104 has a gateway that connects the site network 104 to the Internet. The
gateway 262 is attached to a firewall 264, or can have an integrated firewall 264, just as does the
site network 104.

[0050] In this example, the high-interaction network 216 does not include the network security
infrastructure that protects the site network 104. The high-interaction network 216 is being used
to analyze the effect of suspect network traffic within the site network 104. In other words, the
suspect network traffic is being released into what appears to be the site network 104 as if it was
not caught by any network security tools. Since the suspect network traffic has already been
25 filtered by the network security infrastructure, the network security infrastructure is not needed
in this instance. In other cases, the high-interaction network 216 can include the network security
infrastructure, for example, when analyzing the suspect network traffic's effect on the network
security infrastructure and the site network 104.

[0051] Absence of the network security infrastructure also can make the high-interaction
network 216 more vulnerable to an attack. When suspect network traffic that constitutes a real
attack is received at the site network 104, it is desirable to stop the attack as soon as possible, and
mitigate or repair any damage it caused. But when an actual attack is stopped right away, it
might not be possible to learn what the intent of the attack was and what harm might have
resulted. Having this information can be useful for, for example, gaining a better understanding
5 of network vulnerabilities, finding new or existing vulnerabilities in the site network 104, and
possibly tracking down attackers, among other things. Thus, making the high-interaction network
216 more vulnerable to attack can encourage an attack, and by encouraging an attack more can
be learned about it.

[0052] Processes in the high-interaction network 216 can analyze suspect network traffic in
10 several ways, including conducting static, dynamic, and network analysis. Static analysis
involves extracting the contents of the suspect network traffic and applying various tools to the
content to attempt to identify the content, determine what the content does (if anything), and/or
determine whether the content is harmless or malicious. The content of the suspect network
traffic can include, for example, webpages, email, and files such as formatted documents (e.g.,
Microsoft Word, Excel, or PowerPoint documents or Portable Document Format (PDF)
documents), text files, images (e.g., Joint Photographic Experts Group (JPEG) files or
Graphics Interchange Format (GIF) files), audio, video, archives (e.g., zip, tape archive (tar),
Java archive (jar) files, etc.), or executable files, among others.

[0053] Static analysis of the content of suspect network traffic can include, for example,
applying virus scanning to the content, extracting components from the content such as macros
or scripts and then scanning those components, and/or opening the content using an appropriate
application. Opening an executable file can trigger execution of the file, which can be conducted
in a contained, emulated environment. In some cases, static analysis can alternatively or
additionally include deconstructing the content, including decompressing, decrypting, decoding,
25 decompiling, and/or converting the content into another format, as appropriate. Subsequent to
being deconstructed, the content can be further analyzed to attempt to discover any hidden
purpose behind the content. Malicious intent can be indicated, for example, by instructions to
access password files, instructions to connect to input devices such as a keyboard or a screen, or
code that attempts to exploit a vulnerability in a software application, among others. The result
of the static analysis can be provided to the analytic engine 218. The analytic engine 218 can
generate indicators describing the content, which can be referred to as static indicators. Static
indicators can include, for example, the content's type (e.g., webpages, email, documents, or
programs), a description of anything questionable found in the content, and/or identification
information that uniquely identifies the content. In some implementations, the identification
5 information can be a digital signature, generated, for example, by applying the MD5 algorithm,
Secure Hash Algorithm 1 (SHA-1), or SHA-2 to the content. The static analysis results can also
be used to drive dynamic analysis.
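
As an illustration of the identification information described above, the following minimal sketch computes digests of extracted content using Python's standard hashlib module. The function names static_identifiers and static_indicator, and the layout of the returned dictionary, are assumptions made for this illustration and are not taken from the described implementations. SHA-256 is used here as one member of the SHA-2 family.

```python
import hashlib


def static_identifiers(content: bytes) -> dict:
    """Return hex digests that can serve to identify the content."""
    return {
        "md5": hashlib.md5(content).hexdigest(),
        "sha1": hashlib.sha1(content).hexdigest(),
        "sha256": hashlib.sha256(content).hexdigest(),  # a SHA-2 variant
    }


def static_indicator(content: bytes, content_type: str, findings: list) -> dict:
    """Bundle the content type, any questionable findings, and the digests.

    The dictionary layout is hypothetical; it only mirrors the three kinds
    of static indicator information listed in the text.
    """
    return {
        "type": content_type,
        "findings": list(findings),
        "ids": static_identifiers(content),
    }
```

A caller could, for example, pass the bytes of an extracted document together with a list of questionable items found during scanning, and forward the resulting dictionary to an analytic engine.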

[0054] Dynamic analysis of the suspect network traffic can involve interacting with content
extracted from the suspect network traffic and monitoring and recording any activity that results
10 from interacting with the content. For example, in some implementations, the high-interaction
network 216 can launch a virtual machine that emulates a user workstation 276. This emulated
user workstation 276 can hereafter be referred to as the release point 280, because it serves as the
point from which the content is released. At the release point, the content can be downloaded,
opened, and/or executed, as appropriate for the specific content. For example, when the content
includes webpages, the webpages may be downloaded, including downloading any graphic or
executable files included in the webpages. Automated processes can then interact with the
webpages, including selecting links and causing additional webpages, graphics, and/or
executable files to be downloaded. Any executable files, if not automatically launched, can be
launched by an automated process.

20 [0055] In some cases, depending on the nature of the content found in the suspect network
traffic, the high-interaction network 216 can release the content elsewhere, such as at a compute
server of the compute servers 270, a file server of the file servers 268, or the firewall 264. For
example, suspect network traffic that is attempting to open ports at the firewall 264 can be more
effectively released at the emulated firewall 264. As another example, an attack on a service
25 provided by the file server 268 (e.g., a database attack) can be analyzed more effectively if
released on the file server 268.

[0056] Monitoring tools can track any calls made by programs launched by executing files
found in the suspect network traffic, including calls made to an emulated operating system
and/or to emulated hardware. In some cases, these calls can be harmless, while in other cases the
calls can be malicious. For example, the high-interaction network 216 can see questionable file
activity. Questionable file activity can include uploading 282 of files from the high-interaction
network 216 to the Internet 250. Files can be uploaded 282 from the release point 280 by a
process triggered by interacting with the content of the suspect network traffic. Questionable file
activity can also include downloading of files 284 from the Internet 250. For example, the
5 content can trigger downloading 284 of malware, key logging or screen capture tools, or some
other program intended to infiltrate or attack the high-interaction network 216. Questionable file
activity can also include creating, copying, modifying, deleting, moving, decrypting, encrypting,
decompressing, and/or compressing files at any device in the high-interaction network 216.
Questionable file activity can also include making registry changes, accessing passwords, or
10 making lateral movement of files.

[0057] Any activity triggered by interacting with the content of suspect network traffic is
recorded and delivered to the analytic engine 218. The analytic engine 218 can use one or
more of a heuristic, a probabilistic technique, and an emulator to (1) extract content of a file found in
suspect network traffic, (2) determine an intent of or classify the file using source code or pseudo
code of the file, and (3) produce one or more indicators to uniquely label and classify the file. In
some examples, the classification of the file can include a determination whether the file is a
threat or not. The classification can further include additional threat intelligence about the file. For example,
the additional threat intelligence can include a specific actor to which the file is attributed. In some examples, the
one or more indicators can be referred to as file indicators. File indicators can include, for
20 example, a list of modified files and/or directories, a list of content uploaded 282 to or
downloaded 284 from the Internet, and/or a digital signature identifying the content from the
suspect network traffic.

[0058] The high-interaction network 216 can also conduct network analysis of the suspect
network traffic. Network analysis can include analyzing and/or interacting with network
25 protocol-related packets in suspect network traffic, and attempting to ascertain what effect the
suspect network traffic is trying to achieve. For example, the suspect network traffic can include
packets attacking 294 the firewall 264 by attempting to use a closed port at the firewall 264. The
high-interaction network 216 can open the closed port to allow the packets into the high-
interaction network 216, and analyze these packets as suspect network traffic. As another
example, the suspect network traffic can include domain name system (DNS) packets attacking
290 one of the subnets by attempting to ascertain IP addresses of the subnets 272. The
high-interaction network 216 can provide IP addresses of the subnet 272, and see if any suspect
network traffic is received at those IP addresses. As another example, the user workstations 276
can be attacked 292 by packets making repeated login attempts. The high-interaction network
5 216 can allow the login attempts to succeed.

[0059] Network analysis can occur in conjunction with dynamic analysis of the contents of
suspect network traffic. For example, the contents can include tools for attacking 292 the user
workstations 276 to steal credentials. Automated processes can provide credentials, and then
watch for login attempts that use those credentials. Attacks 290, 292, 294 can be encouraged so
10 that as much information as possible can be learned about, for example, how the attack is
initiated, what entity is behind the attack, and/or what effect each attack has, among other things.
To encourage the attacks 290, 292, 294, the high-interaction network 216 can lower security
barriers, and/or can deliberately provide information for infiltrating the high-interaction network
216.

15 [0060] Network analysis also looks for lateral movement that can result from suspect network
traffic. Lateral movement occurs when an attack on the high-interaction network 216 moves
from one device in the network to another. Lateral movement can involve malware designed to
spread between network devices, and/or infiltration of the network by an outside entity. For
example, an attack 292 on the user workstations 276 can result in user credentials being stolen
20 and uploaded 282 to an outside entity on the Internet 250. The attack 292 can also inform the
outside entity about files available on the file servers 268 and services provided by the compute
servers 270. The high-interaction network 216 can subsequently see an attack 286 on the file
servers 268 that uses the stolen credentials to gain access and ransom the files. The high-
interaction network 216 can also see an attack 288 on the compute servers 270, using the stolen
25 credentials, to take the compute servers 270 offline. Each of these attacks 286, 288 can be
considered lateral movement of an attack 292 that started at the user workstations 276. The
lateral movement can be captured and traced, for example, through log files generated by the
user workstations 276, the gateway 262 and firewall, and the servers 268, 270, and/or memory
snapshots of any of these devices.
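
The log-based tracing described above can be sketched as follows. This is a minimal illustration only: the LogEvent record, its field names, and the action labels "credential_stolen" and "login" are hypothetical, and real log files would first need to be parsed into such a form. The sketch reports a hop whenever a credential stolen on one device is later used to log in on a different device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEvent:
    time: int        # seconds since the content was released (assumed unit)
    device: str      # device whose log contained the entry
    action: str      # hypothetical labels: "credential_stolen" or "login"
    credential: str  # credential involved in the event


def trace_lateral_movement(events):
    """Return (origin, target, credential) hops in time order.

    A hop is recorded when a credential first stolen on one device is
    subsequently used for a login on a different device.
    """
    stolen = {}  # credential -> device where it was first stolen
    hops = []
    for ev in sorted(events, key=lambda e: e.time):
        if ev.action == "credential_stolen":
            stolen.setdefault(ev.credential, ev.device)
        elif ev.action == "login" and ev.credential in stolen:
            origin = stolen[ev.credential]
            if ev.device != origin:
                hops.append((origin, ev.device, ev.credential))
    return hops
```

Logins that occur before the theft are ignored, so ordinary use of the credential prior to the attack is not reported as lateral movement.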

[0061] The results of the various network analysis methods are provided to the analytic engine
218. The analytic engine 218 can produce indicators, which can be referred to as network
indicators. Network indicators can include, for example, network protocols used by the suspect
network traffic and/or a trace of the network activity caused by the suspect network traffic. The
5 network indicators can alternatively or additionally uniquely identify the suspect network traffic.
The identification can include, for example, a source of the suspect network traffic, particularly
when the source is distinctive (e.g., the source is not a proxy that was used to obfuscate the true
source of the suspect network traffic). The identification can also include a destination within the
high-interaction network that received the suspect network traffic. The source information can be
10 used to track down the sender of the suspect network traffic. The destination information can be
used to locate machines in the real network that may have been affected by the suspect network
traffic. The network indicators can also describe any effect caused by the suspect network traffic,
such as stolen credentials, files held for ransom, or servers being taken offline.
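
One possible way to represent the network indicators listed above is sketched below. The NetworkIndicator class and all of its field names are assumptions made for illustration; the text does not specify a data format. The helper method reflects the point that a source is most useful for identification when it is distinctive, that is, not a proxy.

```python
from dataclasses import dataclass, field


@dataclass
class NetworkIndicator:
    protocols: list            # network protocols used by the suspect traffic
    source: str                # apparent source of the traffic
    source_is_proxy: bool      # True when the source only obfuscates the sender
    destination: str           # device in the emulated network that received it
    effects: list = field(default_factory=list)  # e.g. "stolen credentials"

    def trackable_source(self):
        """Return the source only when it is distinctive (not a proxy)."""
        return None if self.source_is_proxy else self.source
```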

[0062] In some cases, suspect network traffic can be innocent. For example, the suspect
network traffic can include an email with an attached image file that was poorly named (e.g., a
file named pleaseopenthis with no extension that is, in fact, a harmless photograph). Static
analysis can identify the attachment as an image file, where opening the file shows that the
image file is, in fact, only an image file, and not hidden malware. Dynamic analysis of the email
and the attached file can result in nothing happening. Network analysis of the email can result in
20 determining that the email was from an innocent sender. The information generated from the
static, dynamic, and network analysis can also be sent to the analytic engine 218, so that the
innocent network traffic can be identified as such.

[0063] FIG. 2B illustrates another example of a possible configuration of the high-interaction
network 216. In this example, the high-interaction network 216 has been configured with only a
part of the site network 104. This example also illustrates that the high-interaction network 216
can be used to emulate multiple parts of the site network 104 at the same time.

[0064] In the illustrated example, the high-interaction network 216 has been configured with
test devices emulating the file servers 268 and the compute servers 270. Test devices are also
emulating a gateway 262a, firewall 264a, and one router 266a, so that the file servers 268 and
compute servers 270 are accessible to the Internet 250. The high-interaction network 216 can
have been configured with only the file servers 268 and/or compute servers 270 because suspect
network traffic appears to be a direct attack 288 on the servers 268, 270. For example, the
suspect network traffic can include an attack 288 in the form of an exceptionally large volume of
database queries to a database hosted by the compute servers 270, accompanied by database data
5 being uploaded 282 to the Internet. Since the suspect network traffic in this example constitutes
database queries, the release point 280 for this suspect network traffic is an appropriate compute
server 270. Furthermore, since the attack 288 in this example is not likely to transition to other
parts of the site network 104, such as the user workstations, the other parts of the site network
104 have not been emulated.

10 [0065] In this example, the high-interaction network 216 is also emulating a subnet 272, along
with separate routers 266b and a separate firewall 264b and gateway 262b to provide the subnet
272 with access to the Internet 250. The subnet 272 and its routers 266b, firewall 264b, and
gateway 262b are, in this example, not connected to the emulated hardware for the file 268 and
compute 270 servers. The subnet 272 and its accompanying infrastructure can be emulated
15 separately so that suspect network traffic directed specifically at the subnet 272 can be analyzed
separate from suspect network traffic directed at the file 268 and compute 270 servers. Suspect
network traffic directed to the subnet 272 can constitute an attack 290 that is unrelated to suspect
network traffic directed to the file 268 and compute 270 servers. Hence, separate analysis can be
more efficient. Separate analysis can also provide a more precise description of each stream of
20 suspect network traffic.

[0066] Separate analysis can also lead to more efficient use of available resources. When only
part of the site network 104 is emulated, the high-interaction network 216 can have idle
resources, such as unused test devices and/or computing power. By using these resources to
emulate another part of the site network, the high-interaction network 216 can analyze more
suspect network traffic at the same time. The results of the analysis of each individually
emulated network part are provided to the analytic engine 218 for analysis.

[0067] FIG. 2C illustrates another example of a possible configuration for the high-interaction
network 216. In this example, the high-interaction network 216 has been configured to emulate
the part of the site network 104 that is accessible, or relevant, to a specific user. A user of the site
network 104 can have authorization to access only specific parts of the site network 104. In other
examples, a particular attack can be known to only require specific parts of the site network 104.
Thus in this example, the high-interaction network 216 has been configured with test devices
emulating the specific user's workstation 276, as well as the switch 274, router 266, firewall 264,
and gateway 262 that connect the user's workstation 276 to the Internet 250. The
high-interaction network 216 can further be configured with test devices emulating the one file server
268 and one compute server 270 that are either relevant to the user of this example or that the user
of this example is authorized to use.

[0068] Emulating only a part of the site network 104 can be useful when suspect network
traffic is directed at a specific user, or takes advantage of one user. For example, the user can be
10 the target of a spoofing attack 292. A spoofing attack 292 can take the form of the user receiving
email that appears to be from a person that the user knows, but that is, in fact, malicious email
that is spoofing, or pretending, to be from a known person. The spoof email can further have a
malicious attachment, such as a key logger. The user's workstation 276 is treated as the release
point 280 for the spoof email: an automated process, acting as would the user, opens the email
and causes the key logger to be downloaded 284. The automated process can subsequently enter
key strokes, including the user's credentials, for capture by the key logger. The key logger can
then upload 282 the user's credentials to a malicious actor on the Internet 250. Now armed with
one user's credentials, an outside actor can attack 288 the compute server 270 or attack 286 files
on the file server 268, using the user's stolen credentials. All of this activity, including
downloading 284 of the key logger, uploading 282 of the user's credentials, and lateral
movement of the attack to the file 268 and compute 270 server can be captured and sent to the
analytic engine 218 for analysis.

[0069] Web-based attacks are another example of attacks that can target a specific user.
Web-based attacks are network attacks that take advantage of security flaws of web servers, such as
web servers that may be located in the site network 104. One example of a web-based attack
occurs when a script (i.e., a program, typically written in plain text, that causes a system to
execute an operation) is inserted into a web page that is returned to the user. In some examples,
the script is part of what may be referred to as an attack pattern. In some examples, the script can
be inserted into the web page through a link that the user clicks on to take the user to the web
page. In such examples, the format of the link or the manner in which the link operates causes
the script to be inserted into the text and/or code of the web page that the link refers to. The web
server would then return the web page to the user with the script inserted into the contents of the
web page. When the user's browser loads the web page, the script may be executed by the
browser.

5 [0070] While a few types of web-based attacks will be discussed below, a person of ordinary
skill will recognize that any attack capable of being intercepted by network security
infrastructure, as discussed above, can be analyzed by techniques described herein.

[0071] One example of a web-based attack is a cross-site scripting (XSS) attack 294. The XSS
attack 294 can involve code, such as JavaScript code, being injected into a web page, such that a
10 browser executes the code when the web page is loaded by the browser. For example, the user
can be served the following hypertext markup language (HTML) page from the web server:

<html>

<h1>Most recent comment</h1>

<script>doSomething();</script>

</html>

[0072] When the user's browser loads the HTML page, the browser can execute the
script, which causes the browser to call the function doSomething(). In
some instances, the user is unaware that the code was executed.
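
To illustrate how a page like the one above could come to contain the script, the following minimal sketch models a server that copies a field of the requested URL into the page. The field name comment, the example.com URL, and the functions render_unsafe and render_escaped are all hypothetical and chosen only for this illustration. The second function shows, for contrast, that escaping the field with Python's standard html.escape leaves the script as inert text.

```python
import html
from urllib.parse import urlsplit, parse_qs

# Page template modeled on the example HTML page in the text.
PAGE = "<html><h1>Most recent comment</h1>{comment}</html>"


def _comment_field(url: str) -> str:
    """Extract and percent-decode the hypothetical 'comment' query field."""
    return parse_qs(urlsplit(url).query).get("comment", [""])[0]


def render_unsafe(url: str) -> str:
    """Vulnerable: the field is inserted into the page verbatim."""
    return PAGE.format(comment=_comment_field(url))


def render_escaped(url: str) -> str:
    """The field is HTML-escaped, so a browser would display it as text."""
    return PAGE.format(comment=html.escape(_comment_field(url)))


# A link whose query string carries a percent-encoded script element.
LINK = "http://example.com/page?comment=%3Cscript%3EdoSomething()%3B%3C%2Fscript%3E"
```

Calling render_unsafe(LINK) yields a page containing the script element, whereas render_escaped(LINK) yields a page in which the angle brackets have been replaced by character references.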

[0073] The code can have access to data and information that can be accessed by the web
page, such as cookies and values input by a user into the web page. The code can also read and
make modifications to the browser's Document Object Model (DOM) within the page to which
the code was added. The code can also send content to remote destinations using, for example,
XMLHttpRequest to send Hypertext Transfer Protocol (HTTP) requests with the content. In
some implementations, the code can even access a user's geolocation, webcam, microphone,
and/or files from the user's file system.

[0074] In some examples, the code can reference external code by altering the <script> line
above to, for example, <script src=http://example.com/xss.js>. In such examples, the injected
code is located at the domain example.com and the name of the program is xss.js. A person
of ordinary skill in the art will recognize that the XSS attack can be inserted into a web page in
other ways, including, but not limited to, using a <body> tag, <img> tag, <iframe> tag, <input>
tag, <link> tag, <table> tag, <div> tag, or <object> tag. Other similar web-based attacks include
programmatic server-side include (SSI) injection attacks and Extensible Markup Language
(XML) injection attacks.

[0075] Another example of a web-based attack is a database injection attack. For example, a
Structured Query Language (SQL) injection attack 296 can occur when an attacker sends a series
of SQL queries to a SQL database located on a web server. Typically, the SQL queries are issued
by a script, and can cause the SQL database to return, add, delete, or modify information in the
SQL database. In some instances, the one or more SQL statements can cause the SQL database to
return more information than was intended by the operator of the web server. A person of
ordinary skill in the art will recognize that other types of databases can be subject to a database
injection attack. In addition, other injection attacks can include command injection and
Lightweight Directory Access Protocol (LDAP) injection.
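
The way an injected SQL statement can return more information than intended can be demonstrated with a small self-contained sketch using Python's standard sqlite3 module and an in-memory database. The table, its contents, and the function names are hypothetical. The first lookup builds the query by string concatenation and is vulnerable; the second passes the input as a bound parameter and is shown only for contrast.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a small hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def lookup_unsafe(conn, name):
    """Vulnerable: the input becomes part of the SQL text itself."""
    query = "SELECT name, secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def lookup_parameterized(conn, name):
    """The input is bound as data and cannot change the query's logic."""
    return conn.execute(
        "SELECT name, secret FROM users WHERE name = ?", (name,)
    ).fetchall()


# Input that turns the WHERE clause into a condition that is always true.
INJECTED = "x' OR '1'='1"
```

With the injected input, lookup_unsafe returns every row in the table, while lookup_parameterized returns no rows because no user has that literal name.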

[0076] Another example of a web-based attack is an Extensible Markup Language (XML)
attack. An XML attack can manipulate logic of an XML application or service. In some
examples, injection of XML content using a network can alter intended logic of an application.

[0077] Web-based attacks, such as injection attacks described herein, can typically be detected
by network security infrastructure devices (e.g., intrusion prevention system, intrusion detection
20 system, or deep packet inspection devices) that are securing the network where a web server is
located. For example, a web page request that would cause a script to be inserted into the
contents may include the name of the script in a field within the request, where the field is one
that is converted by the web server into contents for the web page. When a web-based attack is
detected, typically the network security infrastructure devices prevent the web page request from
25 reaching the web server, resulting in the request not being completed. The user, who (probably
unknowingly) initiated the malicious request, may find her browser indicating that the requested
web page could not be found.
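
The kind of field inspection described above can be sketched as follows. This is a simplified illustration, not the detection logic of any particular intrusion prevention or detection product: the signature list SUSPICIOUS is a small hypothetical set of regular expressions, and the function name find_attack_fields is an assumption. The sketch decodes each query field of a requested URL and reports the fields whose values match a signature.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Hypothetical signatures; a real device would use a far larger rule set.
SUSPICIOUS = [
    re.compile(r"<\s*script\b", re.IGNORECASE),  # script elements
    re.compile(r"\bon\w+\s*=", re.IGNORECASE),   # inline handlers, e.g. onerror=
    re.compile(r"javascript:", re.IGNORECASE),   # javascript: URLs
]


def find_attack_fields(url: str):
    """Return (field, decoded value) pairs that match a signature."""
    hits = []
    for name, value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        if any(pattern.search(value) for pattern in SUSPICIOUS):
            hits.append((name, value))
    return hits
```

A request for which this function returns a non-empty list could be withheld from the web server, and the matching fields could serve as a starting point for describing the attack pattern.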

[0078] In various implementations, requests identified as being related to a web-based attack
are diverted to the threat intelligence engine. The threat intelligence engine may analyze the
requests, and determine an attack pattern. The attack pattern describes the mechanism (e.g.,
insertion of a script into a web page) being used by the web-based attack. The threat intelligence
engine can subsequently replay the attack pattern using the high-interaction network to analyze
the attack. For example, the high-interaction network 216 can cause a web server being emulated
by the compute servers 270 to receive a request that has been identified as malicious. The high-
5 interaction network 216 can further have a web page generated as a result of the malicious
request sent to the user workstation 276. The user workstation 276 can be configured with a web
browser, which can receive and parse the web page. Parsing the web page may cause a malicious
script to be executed. The effects of the malicious script can be captured by the high-interaction
network.

10 [0079] FIG. 2D illustrates another example of a possible configuration for the high-interaction
network 216. In this example, the high-interaction network 216 has been configured to emulate a
part of the site network 104 that is associated with the user workstation 276. Thus in this
example, the high-interaction network 216 has been configured with test devices emulating the
specific user's workstation 276, the switch 274, the router 266, the firewall 264, and the gateway
262 that connect the user's workstation 276 to the Internet 250.

[0080] In some implementations, the emulated user workstation 276 can be configured to
communicate with production compute servers 170 in the site network 104. The compute servers
170 may include one or more web servers hosted by the site network 104. In these
implementations, the high-interaction network 216 can analyze attacks on the site network's web
servers without needing to emulate the web servers. In some cases, a site network's website may
be extensive and may be changing continuously. In these cases, emulating all or most of the
website may require a large amount of computing resources. Analysis of an attack on the site
network's web servers may thus be conducted more efficiently by using the physical web
servers, rather than by emulating the websites hosted by the web servers. While some traffic may
be generated to the web servers as a result, negative effects of a web-based attack typically occur
at the device (e.g., the emulated user workstation 276) from which web pages are requested.
Thus, in this example, though the site network 104 may be involved in analyzing a web-based
attack, the attack remains contained within the high-interaction network.

[0081] FIG. 3 illustrates an example where a uniform resource locator (URL) 312 is sent to a
web server 320 from a user workstation 310 over an external network 330 (e.g., the Internet). In
some examples, the user workstation 310 can receive the URL 312, which is addressed to the web
server 320. For example, the user workstation 310 can receive the URL 312
as a link in an email. Should the URL 312 be received at the web server 320, a portion of the
URL 312 can cause the web server 320 to insert a script into a web page in response to the URL
5 312.

[0082] In some examples, the URL 312 can be sent to the web server 320 from the user
workstation 310 using the external network 330. The URL 312 can be intercepted by a network
security infrastructure 340 before the URL 312 arrives at the web server 320. The network
security infrastructure 340 can identify a security risk in the URL 312, and prevent the URL 312
from reaching the web server 320. Instead, the network security infrastructure 340 can send the
URL 312 to the prioritization engine 350 for parsing, and for identifying an attack pattern related
to the URL 312.

[0083] Once parsed, the prioritization engine 350 can determine whether to discover more
information regarding the attack pattern. If the prioritization engine 350 determines to discover
15 more information, the prioritization engine 350 can send the attack pattern to a high-interaction
network 116 for execution. In some examples, the attack pattern can be packaged into a duplicate
URL by the prioritization engine 350, where the duplicate URL can be sent to the high-
interaction network. The duplicate URL can be similar to the URL 312. For example, the
duplicate URL can cause the script to be inserted into a web page when sent to a web server.
20 In some examples, the duplicate URL can be sent to an emulated workstation 360 on the high-
interaction network. The emulated workstation 360 can emulate a computer that the attack
pattern is meant to affect.

[0084] Depending on a configuration of the high-interaction network, the emulated
workstation 360 can send the duplicate URL to either the web server 320 located on the site
network or an emulated web server 370 on the high-interaction network. The web server 320, or
the emulated web server 370, can generate a web page using the duplicate URL. In examples
using the web server 320, the web server 320 can include a whitelist that allows the duplicate
URL to generate a web page, even though the network security infrastructure 340 would
typically block/intercept the duplicate URL.
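
One way a duplicate URL might be formed for either choice of web server is sketched below, under the assumption that the attack pattern resides in the path, query, or fragment of the intercepted URL. The function name make_duplicate_url and the host names used with it are hypothetical. The sketch keeps the parts of the URL that carry the attack pattern and replaces only the host, using Python's standard urllib.parse module.

```python
from urllib.parse import urlsplit, urlunsplit


def make_duplicate_url(original_url: str, test_host: str) -> str:
    """Retarget a URL at a testing web server.

    The scheme, path, query, and fragment are preserved unchanged so that
    the attack pattern they carry is replayed as it was received.
    """
    parts = urlsplit(original_url)
    return urlunsplit(
        (parts.scheme, test_host, parts.path, parts.query, parts.fragment)
    )
```

Passing the host name of an emulated web server retargets the request, while passing the production web server's own host name reproduces the original URL.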

[0085] After the web page is generated, the web page can be sent to the emulated workstation
360 for execution. The emulated workstation 360 can include a browser capable of receiving the
web page. In some examples, upon receiving the web page, the web browser may present the
web page for viewing. In some examples, during the presentation process, the web browser can
5 cause a script (e.g., the attack pattern) on the web page to activate/execute. An analytic engine
(e.g., analytic engine 218) can record a result of execution of the attack pattern, as discussed
above. In some examples, the high-interaction network 116 can stop the emulated workstation
360 from sending traffic outside of the high-interaction network, preventing the attack pattern
from sending information outside of the high-interaction network. In other examples, the
10 emulated workstation 360 is allowed to exchange traffic with the external network 330. In these
examples, the emulated workstation 360 can obtain information about entities located in the
external network 330.
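
The two containment options described above can be illustrated with a minimal egress check. The address range 10.20.0.0/16 assigned to the emulated network, the function name egress_allowed, and the allow_external flag are assumptions made for this sketch; the text does not specify how the restriction is enforced.

```python
import ipaddress

# Hypothetical address range of the emulated (high-interaction) network.
EMULATED_NETS = [ipaddress.ip_network("10.20.0.0/16")]


def egress_allowed(dest_ip: str, allow_external: bool = False) -> bool:
    """Decide whether the emulated workstation may send traffic to dest_ip.

    Traffic that stays inside the emulated network is always allowed.
    Traffic to any other address is allowed only when allow_external is set.
    """
    addr = ipaddress.ip_address(dest_ip)
    inside = any(addr in net for net in EMULATED_NETS)
    return inside or allow_external
```

With allow_external left at its default, an attack pattern that tries to send captured data to an outside address would be refused, while traffic between emulated devices would still flow.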

[0086] FIG. 4 is a flowchart illustrating an example of a process 400 for analyzing web-based
attacks using a high-interaction network. In some aspects, the process 400 can be performed by a
15 threat intelligence engine.

[0087] Process 400 is illustrated as a logical flow diagram, the operations of which represent a
sequence of operations that can be implemented in hardware, computer instructions, or a
combination thereof. In the context of computer instructions, the operations represent computer-
executable instructions stored on one or more computer-readable storage media that, when
20 executed by one or more processors, perform the recited operations. Generally, computer-
executable instructions include routines, programs, objects, components, data structures, and the
like that perform particular functions or implement particular data types. The order in which the
operations are described is not intended to be construed as a limitation, and any number of the
described operations can be combined in any order and/or in parallel to implement the processes.

25 [0088] Additionally, the process 400 can be performed under the control of one or more
computer systems configured with executable instructions and can be implemented as code (e.g.,
executable instructions, one or more computer programs, or one or more applications) executing
collectively on one or more processors, by hardware, or combinations thereof. As noted above,
the code can be stored on a machine-readable storage medium, for example, in the form of a
computer program comprising a plurality of instructions executable by one or more processors.
The machine-readable storage medium can be non-transitory.

[0089] At step 410, the process 400 includes receiving network traffic directed to a production
web server on the network. For example, the network traffic can include a uniform resource
locator (URL). The URL can be addressed to the production web server. In such examples, the
network traffic can be configured to request a web page that can be presented, executed, or
viewed on a web browser. In other examples, the network traffic can include a query to a
database to retrieve data stored in the database. In some examples, the network traffic can be
intercepted prior to being received by the production web server. Network security infrastructure
can intercept the network traffic. In some examples, the network traffic can be directed to the
production web server by a user workstation outside of the network.

[0090] At step 420, the process 400 includes identifying an attack pattern included in the
network traffic. In some examples, the attack pattern can be a script in the URL that is
configured to be inserted into a web page by the web server when the web server receives the
URL. In other examples, the attack pattern can be the query for the database. In other examples,
the attack pattern can be a malicious XML script that is to be inserted.
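
The identification in step 420 can be sketched as follows. This is a minimal illustration only, assuming that the attack pattern arrives as a URL query parameter; the function name identify_attack_patterns and the three regular expressions are hypothetical and are not taken from this disclosure. A practical implementation would likely rely on a maintained rule set rather than these simplified signatures.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Hypothetical, deliberately simple signatures (assumptions for illustration).
SCRIPT_RE = re.compile(r"<\s*script\b", re.IGNORECASE)
SQL_RE = re.compile(
    r"('\s*or\s+'?\d+'?\s*=\s*'?\d+)|(\bunion\s+select\b)", re.IGNORECASE
)
XML_RE = re.compile(r"<!\s*(DOCTYPE|ENTITY)\b", re.IGNORECASE)


def identify_attack_patterns(url: str) -> list[tuple[str, str, str]]:
    """Return (kind, parameter, value) for each suspicious query parameter."""
    findings = []
    query = urlsplit(url).query
    # parse_qsl percent-decodes each value before it is inspected.
    for name, value in parse_qsl(query, keep_blank_values=True):
        if SCRIPT_RE.search(value):
            findings.append(("script", name, value))
        elif SQL_RE.search(value):
            findings.append(("database_query", name, value))
        elif XML_RE.search(value):
            findings.append(("xml", name, value))
    return findings
```

Each finding carries the parameter name and decoded value so that a later step can replay the same value against a testing web server.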

[0091] At step 430, the process 400 includes sending the attack pattern to an emulated device.
In some examples, the emulated device can send the attack pattern to a testing web server, as if
the emulated device were the user workstation that originally sent the network traffic to the
production web server. In some examples, the testing web server can be the production web
server. In other examples, the testing web server can be an emulation of the production web
server.

[0092] At step 440, the process 400 includes identifying a response associated with the attack
pattern. In some examples, the response can include a result of the emulated device executing the
25 script inserted into the web page received from the testing web server. In other examples, the
response can include the data received from the query to the database.

[0093] At step 450, the process 400 includes generating an indicator describing the response.
The indicator can include a network indicator, a file indicator, a static indicator, or any
combination thereof. The indicator can also describe the harm (if any) the attack pattern caused.
A threat intelligence engine can use the indicator to verify whether the network has been
previously infiltrated and compromised by the attack pattern.

[0094] In addition to being provided to a network administrator, the indicators generated for an
incident can be added to an indicators database. A threat intelligence engine can use the
indicators in the indicators database in various ways. FIG. 5 illustrates examples of ways in
which the threat intelligence engine 508 can use indicators generated by its analytic engine 518.
FIG. 5 illustrates an example of a customer network 502 that includes a threat intelligence engine
508. The customer network 502 in this example includes a gateway 562 for communicating with
other networks, such as the Internet 550. The gateway 562 can include an integrated firewall 564,
or can be attached to a firewall device 564. Generally, all network traffic coming into or
going out of the customer network 502 passes through the gateway 562 and firewall 564.

[0095] The firewall 564 generally controls what network traffic can come into and go out of
the customer network 502. The customer network 502 in this example includes additional
network security tools 530, 532, such as anti-virus scanners, intrusion prevention systems (IPS),
intrusion detection systems (IDS), and others. The network
security tools 530, 532 can examine network traffic coming into the customer network 502, and
allow network traffic that appears to be legitimate 534 to continue to the site network. The
network security tools 530, 532 can direct suspect network traffic 536 to the threat intelligence
engine 508.

[0096] The site network is where the hardware, software, and internal users of the customer
20 network 502 can be found, and where the operations of the customer network 502 occur. In this
example, the site network includes several routers 566 that connect together a switch 574, a
group of file servers 568, a group of compute servers 570, and several subnets 572. The switch
574 further connects several user workstations 576 to the site network.

[0097] As discussed above, the threat intelligence engine 508 examines suspect network traffic
25 and attempts to determine whether the suspect network traffic may, in fact, be malicious. The
threat intelligence engine 508 in this example includes a prioritization engine 510, a high-
interaction network 516, and an analytic engine 518. The prioritization engine 510 analyzes
suspect network traffic 536 and attempts to determine whether the suspect network traffic 536
represents a known threat. When the suspect network traffic 536 is associated with a known
threat, then the threat intelligence engine 508 can log the occurrence of the suspect network
traffic 536, and do nothing more. In some implementations, the threat intelligence engine 508
can be configured to provide suspect network traffic 536 associated with a known threat to the
high-interaction network 516 for analysis. Doing so can be useful, for example, to see how well
the customer network 502 can handle the known threat.

5 [0098] Suspect network traffic 536 that is not associated with a known threat can be provided
to the high-interaction network 516 to attempt to determine if the suspect network traffic 536
constitutes a threat, and if so, what the nature of the threat is. Within the high-interaction
network 516, the suspect network traffic 536 can be allowed to do whatever harm it was
designed to do. The suspect network traffic 536, or an entity that is driving the suspect network
10 traffic 536, can further be encouraged to act, for example by lowering security barriers within the
high-interaction network 516 and/or surreptitiously leaking credentials to the entity.
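
The routing decision made by the prioritization engine 510 can be sketched as follows. This is a simplified illustration that assumes suspect traffic has already been reduced to a signature string; the names Route and prioritize, and the replay_known flag, are hypothetical.

```python
from enum import Enum


class Route(Enum):
    LOG_ONLY = "log_only"
    HIGH_INTERACTION = "high_interaction"


def prioritize(signature: str, known_threats: set[str], replay_known: bool = False) -> Route:
    """Decide where suspect traffic goes.

    Known threats are only logged unless replay_known is set, in which case
    they are still sent to the high-interaction network (for example, to see
    how well the network handles the known threat). Unknown traffic always
    goes to the high-interaction network for analysis.
    """
    if signature in known_threats and not replay_known:
        return Route.LOG_ONLY
    return Route.HIGH_INTERACTION
```

The replay_known flag models the optional configuration in which traffic associated with a known threat is analyzed anyway.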

[0099] Any activity triggered by the suspect network traffic 536 inside the high-interaction
network 516 can be recorded and provided to the analytic engine 518. The analytic engine 518
can analyze the recorded activity and generate indicators to describe and/or identify the suspect
15 network traffic 536, as described above.

[0100] The threat intelligence engine 508 can use the indicators in several ways. For example,
in some implementations, the threat intelligence engine 508 can use the indicators to verify 540
whether the site network has already been compromised. The site network may already be
compromised if it has previously received suspect network traffic 536 that has been analyzed by
the threat intelligence engine 508. For example, the threat intelligence engine 508 can find that a
virus 592 has been downloaded to the user workstations 576. Indicators can inform the threat
intelligence engine 508 which workstations 576 to check, and where to find the virus. The indicators
can further show that the virus was downloaded through interactions by users of the workstations
576, for example, with a malicious website.

[0101] As another example, the threat intelligence engine 508 can find that ports at the firewall
564 have been opened 594. The threat intelligence engine 508 can further find that the
configuration of a router 566 has been changed 596, making the site network accessible to an outside actor.
Indicators can inform the threat intelligence engine 508 to check the firewall 564 and router 566
for these changes.
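
The verification described above can be sketched as a lookup of indicators against an inventory of observed facts per host. This is a minimal illustration; the indicator types ("file_hash", "open_port"), the inventory layout, and the host names are assumptions made for the example and are not defined by this disclosure.

```python
def verify_compromise(indicators: list[dict], inventory: dict[str, dict]) -> list[tuple]:
    """Return sorted (host, indicator_type, value) hits found in the inventory.

    indicators: dicts with "type" and "value" keys.
    inventory:  host name -> {indicator_type: set of observed values}.
    """
    hits = []
    for host, facts in inventory.items():
        for ind in indicators:
            # A host with no facts of this type simply produces no hit.
            observed = facts.get(ind["type"], set())
            if ind["value"] in observed:
                hits.append((host, ind["type"], ind["value"]))
    return sorted(hits)
```

Each hit names the host to inspect and the evidence to look for, which mirrors how indicators can point the threat intelligence engine at particular workstations, firewalls, or routers.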

[0102] As another example, the threat intelligence engine 508 can use indicators to
trace lateral movement that was captured in the high-interaction network 516. For example, the
threat intelligence engine 508 may, based on theft of credentials at a user workstation 576, look
for unauthorized access 588 to resources provided by the compute servers 570. The threat
intelligence engine 508 can also look for unauthorized access to the file servers 568, and
unauthorized downloading 586 of files from the file servers 568. The threat intelligence engine
508 can further look for unauthorized logins 590 into a subnet 572.

[0103] Another way in which the threat intelligence engine 508 can use the indicators is to
update 542 the network security tools 530, 532. For example, the threat intelligence engine 508
10 can identify malware that is not known to an anti-virus tool, can find malicious IP addresses or
websites that should be blocked by the firewall, or can identify attached files that should be
removed from incoming network traffic.
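
One way such an update could be prepared is sketched below: IP-address indicators are validated and collected into a block list that a firewall could then consume. The function name build_blocklist and the indicator layout are hypothetical; how a particular firewall ingests the list is outside the scope of this sketch.

```python
import ipaddress


def build_blocklist(indicators: list[dict]) -> list[str]:
    """Collect valid, de-duplicated IP addresses from "ip" indicators."""
    blocked = set()
    for ind in indicators:
        if ind.get("type") != "ip":
            continue
        try:
            # ip_address raises ValueError for anything that is not an IP address.
            blocked.add(str(ipaddress.ip_address(ind["value"])))
        except ValueError:
            continue
    return sorted(blocked)
```

Malformed values are skipped rather than passed on, so that a bad indicator does not produce an invalid firewall rule.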

[0104] In some implementations, the threat intelligence engine 508 can also send its indicators
to a site database 520. The customer network 502 can have a site database 520 when the
15 customer network 502 has multiple additional site networks 524. Each of these site networks 524
can be provided with its own threat intelligence engine. The individual threat intelligence
engines can also provide indicators to the site database 520. Indicators from different site
networks 524 can be shared between the site networks 524. Each site network can thereby be
defended against attacks that it has not yet experienced.

[0105] In some implementations, the threat intelligence engine 508 can also send its indicators
to a central database 554 located on the Internet 550. In implementations that include a site
database 520, the site database 520 can send indicators for all of the customer network 502 to the
central database 554. The central database 554 can also receive indicators from other networks
522. The central database 554 can share the indicators from the other networks 522 with the
threat intelligence engine 508 of the customer network 502. By sharing indicators between the other
networks 522 and the customer network 502, all of the networks 502, 522 can be made more
secure.
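
The sharing of indicators through a site database or central database can be sketched as a merge that drops duplicates. This is an illustration under assumptions: indicators are modeled as (kind, value) pairs, values are compared case-insensitively, and the name merge_indicators is hypothetical.

```python
def merge_indicators(*sources: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Merge indicator lists, dropping duplicates while keeping first-seen order."""
    seen = set()
    merged = []
    for source in sources:
        for kind, value in source:
            key = (kind, value.lower())  # case-insensitive comparison is an assumption
            if key not in seen:
                seen.add(key)
                merged.append((kind, value))
    return merged
```

A network that receives the merged list gains indicators for attacks that it has not yet experienced itself.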

[0106] In the preceding description, for the purposes of explanation, specific details are set
forth in order to provide a thorough understanding of examples of the invention. However, it will
be apparent that various examples can be practiced without these specific details. The figures and
description are not intended to be restrictive.

[0107] The preceding description provides exemplary examples only, and is not intended to
limit the scope, applicability, or configuration of the disclosure. Rather, the ensuing description
5 of the exemplary examples will provide those skilled in the art with an enabling description for
implementing an exemplary example. It should be understood that various changes can be made
in the function and arrangement of elements without departing from the spirit and scope of the
disclosed implementations as set forth in the appended claims.

[0108] Specific details are given in the following description to provide a thorough
10 understanding of the examples. However, it will be understood by one of ordinary skill in the art
that the examples can be practiced without these specific details. For example, circuits, systems,
networks, processes, and other components can be shown as components in block diagram form
in order not to obscure the examples in unnecessary detail. In other instances, well-known
circuits, processes, algorithms, structures, and techniques can be shown without unnecessary
15 detail in order to avoid obscuring the examples.

[0109] Also, it is noted that individual examples can be described as a process which is
depicted as a flowchart, a flow diagram, a data flow diagram, a structure diagram, or a block
diagram. Although a flowchart can describe the operations as a sequential process, many of the
operations can be performed in parallel or concurrently. In addition, the order of the operations
20 can be re-arranged. A process is terminated when its operations are completed, but could have
additional steps not included in a figure. A process can correspond to a method, a function, a
procedure, a subroutine, a subprogram, etc. When a process corresponds to a function, its
termination can correspond to a return of the function to the calling function or the main
function.

[0110] The term machine-readable storage medium or computer-readable storage medium
includes, but is not limited to, portable or non-portable storage devices, optical storage devices,
and various other mediums capable of storing, containing, or carrying instruction(s) and/or data.
A machine-readable storage medium or computer-readable storage medium can include a non-
transitory medium in which data can be stored and that does not include carrier waves and/or
transitory electronic signals propagating wirelessly or over wired connections. Examples of a
non-transitory medium can include, but are not limited to, a magnetic disk or tape, optical
storage media such as compact disk (CD) or digital versatile disk (DVD), flash memory, memory
or memory devices. A computer-program product can include code and/or machine-executable
instructions that can represent a procedure, a function, a subprogram, a program, a routine, a
5 subroutine, a module, a software package, a class, or any combination of instructions, data
structures, or program statements. A code segment can be coupled to another code segment or a
hardware circuit by passing and/or receiving information, data, arguments, parameters, or
memory contents. Information, arguments, parameters, data, etc. can be passed, forwarded, or
transmitted via any suitable means including memory sharing, message passing, token passing,
10 network transmission, etc.

[0111] Furthermore, examples can be implemented by hardware, software, firmware,
middleware, microcode, hardware description languages, or any combination thereof. When
implemented in software, firmware, middleware or microcode, the program code or code
segments to perform the necessary tasks (e.g., a computer-program product) can be stored in a
machine-readable medium. One or more processors can perform the necessary tasks.

[0112] Systems depicted in some of the figures can be provided in various configurations. In
some examples, the systems can be configured as a distributed system where one or more
components of the system are distributed across one or more networks in a cloud computing
system.

20 [0113] Where components are described as being configured to perform certain operations,
such configuration can be accomplished, for example, by designing electronic circuits or other
hardware to perform the operation, by programming programmable electronic circuits (e.g.,
microprocessors, or other suitable electronic circuits) to perform the operation, or any
combination thereof.

[0114] In the foregoing specification, aspects of various example implementations are
described with reference to specific examples thereof, but those skilled in the art will recognize
that implementations are not limited thereto. Various features and aspects of the above-described
implementations can be used individually or jointly. Further, examples can be utilized in any
number of environments and applications beyond those described herein without departing from
the broader spirit and scope of the specification. The specification and drawings are, accordingly,
to be regarded as illustrative rather than restrictive.

[0115] In the foregoing description, for the purposes of illustration, methods were described in
a particular order. It should be appreciated that in alternate examples, the methods can be
performed in a different order than that described. It should also be appreciated that the methods
described above can be performed by hardware components or can be embodied in sequences of
machine-executable instructions, which can be used to cause a machine, such as a general-
purpose or special-purpose processor or logic circuits programmed with the instructions, to
perform the methods. These machine-executable instructions can be stored on one or more
machine-readable mediums, such as CD-ROMs or other types of optical disks, floppy diskettes,
ROMs, RAMs, EPROMs, EEPROMs, magnetic or optical cards, flash memory, or other types of
machine-readable mediums suitable for storing electronic instructions. Alternatively, the
methods can be performed by a combination of hardware and software.

[0116] Where components are described as being configured to perform certain operations,
15 such configuration can be accomplished, for example, by designing electronic circuits or other
hardware to perform the operation, by programming programmable electronic circuits (e.g.,
microprocessors, or other suitable electronic circuits) to perform the operation, or any
combination thereof.

[0117] While illustrative examples of the application have been described in detail herein, it is
20 to be understood that the inventive concepts can be otherwise variously embodied and employed,
and that the appended claims are intended to be construed to include such variations, except as
limited by the prior art.

WHAT IS CLAIMED IS:

1. A computer-implemented method comprising:
    receiving, by a threat intelligence engine on a network, network traffic directed to
a production web server on the network, wherein the network traffic is configured to request a
response from the production web server, and wherein the network traffic is intercepted prior to
being received by the production web server;
    identifying an attack pattern included in the network traffic;
    sending the attack pattern to an emulated device, wherein the emulated device is
configured to send the attack pattern to a testing web server;
    identifying a response associated with the attack pattern; and
    generating an indicator describing the response.

2. The computer-implemented method of claim 1, wherein the network
traffic is intercepted by a device on the network that analyzes incoming network traffic to protect
the enterprise network from intrusion.

3. The computer-implemented method of claim 1, wherein the testing web
server is the production web server.

4. The computer-implemented method of claim 1, wherein the testing web
server is an emulated web server in the emulated network, and wherein the testing web server
duplicates one or more web-based services of the production web server.

5. The computer-implemented method of claim 1, wherein the response is a
web page, wherein the network traffic causes a script to be inserted into the web page, and
wherein the script is executed when the web page is read by a web browser.

6. The computer-implemented method of claim 5, wherein the script is
configured to access a remote script in a remote location, wherein the remote script is executed
when the web page is read by the web browser.

7. The computer-implemented method of claim 1, wherein the network
traffic includes a database query, and wherein the response includes information from a database.

8. A system comprising:
    one or more processors; and
    a non-transitory computer-readable medium containing instructions that, when
executed by the one or more processors, cause the one or more processors to perform operations
including:
        receiving, by a threat intelligence engine on a network, network traffic
    directed to a production web server on the network, wherein the network traffic is
    configured to request a response from the production web server, and wherein the
    network traffic is intercepted prior to being received by the production web server;
        identifying an attack pattern included in the network traffic;
        sending the attack pattern to an emulated device, wherein the emulated device
    is configured to send the attack pattern to a testing web server;
        identifying a response associated with the attack pattern; and
        generating an indicator describing the response.

9. The system of claim 8, wherein the network traffic is intercepted by a
device on the network that analyzes incoming network traffic to protect the enterprise network
from intrusion.

10. The system of claim 8, wherein the testing web server is the production
web server.

11. The system of claim 8, wherein the testing web server is an emulated web
server in the emulated network, and wherein the testing web server duplicates one or more web-
based services of the production web server.

12. The system of claim 8, wherein the response is a web page, wherein the
network traffic causes a script to be inserted into the web page, and wherein the script is
executed when the web page is read by a web browser.

13. The system of claim 12, wherein the script is configured to access a
remote script in a remote location, wherein the remote script is executed when the web page is
read by the web browser.

14. The system of claim 8, wherein the network traffic includes a database
query, and wherein the response includes information from a database.

15. A computer-program product tangibly embodied in a non-transitory
machine-readable storage medium, including instructions that, when executed by one or more
processors, cause the one or more processors to:
    receive, by a threat intelligence engine on a network, network traffic directed to a
production web server on the network, wherein the network traffic is configured to request a
response from the production web server, and wherein the network traffic is intercepted prior to
being received by the production web server;
    identify an attack pattern included in the network traffic;
    send the attack pattern to an emulated device, wherein the emulated device is
configured to send the attack pattern to a testing web server;
    identify a response associated with the attack pattern; and
    generate an indicator describing the response.

16. The computer-program product of claim 15, wherein the network traffic is
intercepted by a device on the network that analyzes incoming network traffic to protect the
enterprise network from intrusion.

17. The computer-program product of claim 15, wherein the testing web
server is an emulated web server in the emulated network, and wherein the testing web server
duplicates one or more web-based services of the production web server.

18. The computer-program product of claim 15, wherein the response is a web
page, wherein the network traffic causes a script to be inserted into the web page, and wherein
the script is executed when the web page is read by a web browser.

19. The computer-program product of claim 18, wherein the script is
configured to access a remote script in a remote location, wherein the remote script is executed
when the web page is read by the web browser.

20. The computer-program product of claim 15, wherein the network traffic
includes a database query, and wherein the response includes information from a database.

ABSTRACT OF THE DISCLOSURE
Provided are devices, computer-program products, and methods for analyzing web-based attacks.
In some implementations, a device, computer-program product, and method for generating an
indicator to describe a result of a web-based attack are provided. For example, a method can
include receiving network traffic directed to a production web server on a network. In some
examples, the network traffic can be configured to request a response from the production web
server. In some examples, the network traffic can be intercepted prior to being received by the
production web server. The method can further include identifying an attack pattern included in
the network traffic and sending the attack pattern to an emulated device. In some examples, the
emulated device can be configured to send the attack pattern to a testing web server. The method
can further include identifying a response associated with the attack pattern and generating an
indicator describing the response.

KILPATRICK TOWNSEND 68537329 1
