
SQL Injection Prevention Using
Runtime Query Modeling and Keyword
Randomization
by
Subodh Raikar
A Project Report Submitted
in
Partial Fulfillment of the
Requirements for the Degree of
Master of Science
in
Computer Science

Supervised by
Dr. Rajendra K. Raj
Department of Computer Science
B. Thomas Golisano College of Computing and Information Sciences
Rochester Institute of Technology
Rochester, New York
November 2013


The project SQL Injection Prevention using Runtime Query Modeling and Keyword
Randomization by Subodh Raikar has been examined and approved by the following Examination Committee:

Dr. Rajendra K. Raj


Professor
Project Committee Chair

Dr. Stanisław Radziszowski


Professor

Dr. Xumin Liu


Assistant Professor


Dedication

To my mother, for always being my strength and support without whom I would never
have been where I am today.


Acknowledgments

I am grateful to Dr. Rajendra K. Raj for being a very considerate and supportive advisor.
Without his flexibility and guidance, this project would not have been successful. I would
also like to thank Dr. Stanisław Radziszowski for his feedback on the project
proposal and the report. I would like to acknowledge my mother, Mrs. Supriya Raikar, for
being a continuous source of inspiration and strength. Additionally, I would like to
thank my friend Mr. Steven Leitao for his constant support and encouragement
throughout the course of my project.

Abstract
SQL Injection Prevention Using Runtime Query Modeling and
Keyword Randomization
Subodh Raikar
Supervising Professor: Dr. Rajendra K. Raj

Most modern computer applications are on the Internet or are related to the web in some
form. These applications use databases to store data. Because they are database-driven,
they perform a large number of queries that insert, update, delete or retrieve data.
User input is a crucial component of such applications, because it specifies the filtering
criteria for the query performed on the database. The intent of this query can be altered
drastically by including SQL syntax in the user input. If the application accepts and
processes malicious input provided by an attacker, severe damage can be caused to the data
stored in the database: corrupt data can be added, and existing data can be modified or
deleted. Securing web applications from this type of attack is therefore essential.

The main objective of this project is to design an effective mechanism to prevent SQL
injection in such applications. This project proposes the combination of keyword
randomization with runtime query modeling to prevent SQL injection attacks. It also
investigates the variations of this attack and the current state of research efforts to
mitigate them. Our implementation also helps in analyzing the performance of existing
techniques, highlights their loopholes and suggests improvements to resolve performance
and security concerns.


Contents
Dedication
Acknowledgments
Abstract

1 Introduction
    1.1 Introduction
    1.2 Background
        1.2.1 Tautology
        1.2.2 Union-based Attack
        1.2.3 Logically Incorrect Queries
        1.2.4 Alternate Encoding
        1.2.5 Inference
        1.2.6 Piggy-backed Queries
        1.2.7 Stored Procedure Attacks
    1.3 Related Work
        1.3.1 Dynamic Query Matching
        1.3.2 Analysis Framework for Web Application Security
        1.3.3 Instruction Set Randomization
        1.3.4 Frameworks for SQL Retrieval on Web Application Security
        1.3.5 AMNESIA
        1.3.6 CANDID
        1.3.7 Automated Fix Generation to Secure SQL Statements
    1.4 Problem Statement
    1.5 Hypothesis
        1.5.1 Security
        1.5.2 Performance
    1.6 Roadmap

2 Design
    2.1 Three Tier Architecture
    2.2 System Architecture
        2.2.1 SQLRand Architecture
        2.2.2 Dynamic Query Matching Architecture
        2.2.3 RandXML Architecture
    2.3 Logical Flow
    2.4 Modules
        2.4.1 Key Generation Module
        2.4.2 XML Parsing Module
        2.4.3 Decision Module
        2.4.4 Attack Reporting Module

3 Implementation
    3.1 Key Generation Module
    3.2 XML Parsing Module
    3.3 Decision Module
    3.4 Attack Reporting Module

4 Analysis
    4.1 Test Environment
    4.2 Types of Attacks
        4.2.1 Tautology Attack
        4.2.2 Union Attack
        4.2.3 Piggy-backed Attack
        4.2.4 Logically Incorrect Queries
    4.3 Summary
    4.4 Hypothesis Evaluation

5 Conclusions
    5.1 Current Status
    5.2 Future Work
        5.2.1 Parallel XML Node Comparison
        5.2.2 Database Server
        5.2.3 Hotspot Detection in Application Code
        5.2.4 Test Application Setup
        5.2.5 Automatic Prepared Statement Generation
    5.3 Lessons Learned

Bibliography

A UML Diagrams
    A.1 Class Diagram
    A.2 Sequence Diagram

B Code Listing
    B.1 Create ATTACK_DATASET Table
    B.2 Create INJECTION_ATTACK_LOG Table
    B.3 Source Code

C User Manual
    C.1 Introduction
    C.2 Installation
    C.3 User Documentation

D SQL Injection examples
    D.1 Honda Parts Website
        D.1.1 Query Criteria
        D.1.2 Query Results
    D.2 Epicor 9

List of Tables
4.1 Tautology Attack Overhead
4.2 Union Attack Overhead
4.3 Piggy-backed Attack Overhead
4.4 Logically Incorrect Attack Overhead
4.5 Execution Overhead Comparison

List of Figures
2.1 Three Tier Architecture
2.2 SQLRand Architecture
2.3 Dynamic Query Matching Architecture
2.4 RandXML Architecture
2.5 RandXML Flowchart
2.6 DTD for XML Equivalent of SQL
2.7 Attack Log Table Structure
3.1 Secure Random Key Generator
3.2 XML Parsing Module
3.3 Decision Module
4.1 Tautology Attack Detection Overhead
4.2 Union-based Attack Detection Overhead
4.3 Piggy-Backed Attack Detection Overhead
4.4 Logically Incorrect Queries Attack Detection Overhead
5.1 Hotspot Detection Using GREP
A.1 Class Diagram
A.2 Sequence Diagram
D.1 Honda Parts Query Criteria
D.2 Honda Parts Query Results
D.3 Epicor ERP Support Web Page

Chapter 1
Introduction
1.1 Introduction

SQL injection is one of the major issues that belongs to the category of code injection
problems. Many database-driven applications are designed to accept input from the user
and perform queries on the database using this input. This query can be written in the code
in the form of a string that is built dynamically, on the fly, with the help of user input.
Therefore, it is vulnerable to modifications and hence this type of attack can be performed
by manipulating the input provided from the user interface. When this dynamically built
query string is sent to the database for execution, the database engine cannot differentiate
between the actual query and the user input. Using this attack, the attacker can create, update, retrieve or delete any data, depending upon the access specification allowed through
the application code. In addition, the attack can have severe consequences like executing
administration operations on the database [13] or recovering the content present on the
DBMS.

SQL injection can be prevented by proper validation of user input. Some web
frameworks handle this issue by distinguishing user input from the SQL query. For example,
Microsoft's .NET framework provides a mechanism called the parameterized query, which
accepts user inputs as parameters. This separates the user input from the
developer-intended query, so the database engine treats the input as data rather than as
part of the SQL statement, thus preventing SQL injection. However, some legacy applications
were designed to build the query dynamically in the form of a string. The main reason behind
this was the lack of security awareness among developers. Since 2002, SQL injection
has been involved in more than 10% of cyber vulnerabilities [16]. In 2003, two companies,
Guess and Petco, were affected by this attack [3], which exposed 500,000 credit cards. In
2008, it was one of the most predominant types of web application vulnerabilities [15],
having increased by 134 percent from 2007. Initially, attackers performed attacks manually
by injecting malicious code through the user interface. Currently, attackers also use
scripts and bots to automate the query injection process. These tools require the URL of
the vulnerable web page along with the malicious input values, which are specified
through the query string that the web application accepts from the browser's address bar. To
secure such vulnerable applications, developers would need to modify the legacy code
and rewrite the query statements to use parameterized queries. An alternative
solution would be to write code that validates the dynamically constructed query.
However, this increases the cost of development, and assigning this responsibility to
application developers may leave potential security loopholes. The main
motivation behind this project is to secure such applications by preventing SQL
injection attacks on them.
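
To make the idea of a parameterized query concrete, the following is a minimal sketch in Python using the standard `sqlite3` module rather than .NET. The `users` table, its columns and the sample row are hypothetical and exist only for illustration.

```python
import sqlite3

def make_demo_db():
    """Create an in-memory database with one hypothetical user row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (userName TEXT, password TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice', 'secret')")
    return conn

def login_parameterized(conn, user_name, password):
    """Return matching rows. Inputs are bound as parameters and are
    never spliced into the SQL text."""
    query = "SELECT * FROM users WHERE userName = ? AND password = ?"
    return conn.execute(query, (user_name, password)).fetchall()

conn = make_demo_db()
legit = login_parameterized(conn, "alice", "secret")
attack = login_parameterized(conn, "' OR 1=1; --", "")
print(legit)
print(attack)
```

Because the injected text is compared as a plain string value, the second call finds no user with that literal name and returns no rows.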

1.2 Background

SQL injection belongs to the category of code injection attacks. This attack can be classified into different types, based on the manner in which the malicious content is added to
the original query.

1.2.1 Tautology

A logical condition that always holds true is called a tautology. This attack is performed by
injecting a tautology into the query, typically by using the OR keyword in SQL.
Consider an application that accepts a username and password from the user interface and
queries the database to validate whether the user is authentic. The query
can be written as:

"Select * from USER where userName='" + txtUserName.Text + "' AND password='"
+ txtPassword.Text + "'";

If the attacker enters "' OR 1=1; --" as the username and does not enter a password, the
resulting query would be:

Select * from USER where userName='' OR 1=1; --' AND password='';

The where clause of the resulting query always evaluates to true, since the condition 1=1
transforms it into a tautology [7]. The string "--" starts a comment in SQL, so
the remaining part of the query is not executed by the database engine. On executing
this query, the database returns all rows from the USER table. Since the query returns
a non-empty result, the attacker is authenticated by the application. In some databases,
the first row in the user table represents the administrator of the database. If the attacker
enters admin as the username along with a tautology expression, the application will grant
the attacker administrator privileges. Using the administrator's access
rights, the attacker can cause severe damage to the database.
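
The effect of a tautology can be reproduced with a small sketch. It assumes an in-memory SQLite database with a hypothetical `users` table, and it uses a variant of the input above without the semicolon, since the comment marker alone is enough to cut off the rest of the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (userName TEXT, password TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("admin", "a1"), ("bob", "b2")])

def build_login_query(user_name, password):
    """Vulnerable on purpose: concatenates raw input into the SQL string."""
    return ("SELECT * FROM users WHERE userName='" + user_name +
            "' AND password='" + password + "'")

# A wrong password for a real user matches nothing.
failed_login = conn.execute(build_login_query("bob", "wrong")).fetchall()

# The injected OR 1=1 makes the where clause true for every row,
# and -- comments out the password check.
injected = build_login_query("' OR 1=1 --", "")
rows = conn.execute(injected).fetchall()
print(injected)
print(len(rows))
```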

1.2.2 Union-based Attack

In this attack, the attacker injects an entire query using the UNION keyword in SQL. If the
attacker enters the input "' UNION Select * from USER where userName='ABC' --" in
the userName field and leaves the password field blank, the query that is dynamically
constructed by the application will look like:

Select * from USER where userName='' UNION Select * from USER where
userName='ABC' --' AND password='';

The resulting query retrieves all the details of the user whose username is ABC.
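
A union-based injection can be sketched in the same way. The table name `users`, the sample rows and the helper `build_login_query` are assumptions made for this illustration, not part of the original application.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (userName TEXT, password TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("admin", "a1"), ("bob", "b2")])

def build_login_query(user_name, password):
    """Vulnerable on purpose: concatenates raw input into the SQL string."""
    return ("SELECT * FROM users WHERE userName='" + user_name +
            "' AND password='" + password + "'")

# The first SELECT matches nothing; the injected second SELECT returns
# bob's row, including the password the attacker never supplied.
payload = "' UNION SELECT * FROM users WHERE userName='bob' --"
rows = conn.execute(build_login_query(payload, "")).fetchall()
print(rows)
```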

1.2.3 Logically Incorrect Queries

This is a preliminary type of SQL injection attack. It may not change the structure or
contents of the database, but it is used to gather useful information from the
database. Specifically, part of the database metadata can be obtained through this type of
attack. The attacker enters malicious input in such a manner that the database server
responds with error messages, which can be sufficient for the attacker to gain valuable
information about the database schema. One way of performing this attack is by using
single quotes in the user input. The attacker can enter an input like "Steven O'Brien" as the
username and any random text as the password. This will issue the following query to the
database:

Select * from USER where userName='Steven O'Brien' AND password='<password>';

When this query is sent to the database for execution, the database engine fails to match
the single quotes in the query and generates an error message. This error message may
reveal which part of the code raised the error and which tables were involved in
the operation. This might be sufficient for the attacker to gain insight into the type of
database in use and its schema, which makes it easier for her to perform further attacks.
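
The following sketch shows how an unmatched single quote turns into a database error that an application might leak to the user. It assumes SQLite and a hypothetical `users` table; the exact wording of the error message depends on the database engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (userName TEXT, password TEXT)")

def login_error(user_name, password):
    """Run the vulnerable query and return the database error text, if any."""
    query = ("SELECT * FROM users WHERE userName='" + user_name +
             "' AND password='" + password + "'")
    try:
        conn.execute(query).fetchall()
        return None
    except sqlite3.OperationalError as exc:
        # Echoing this text back to the user is what leaks information.
        return str(exc)

leaked = login_error("Steven O'Brien", "anything")
clean = login_error("Steven", "anything")
print(leaked)
```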

1.2.4 Alternate Encoding

In this type of attack, the attacker uses alternate mechanisms to encode and add malicious characters to the query. Various encoding schemes like ASCII, Unicode or hexadecimal [14] can be used for this purpose. For example, char(39) represents the ASCII character
for single quote, which can be used to inject the query instead of directly using the single
quote character.
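
As an illustration of why character-based filtering is fragile, the sketch below uses SQLite's `char()` function to build string values from character codes. The filter `naive_filter_allows`, the `users` table and the numeric `id` field are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, userName TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(1, "admin"), (2, "bob")])

def naive_filter_allows(user_input):
    """Hypothetical blacklist filter: accept input with no literal quote."""
    return "'" not in user_input

def lookup_by_id(raw_id):
    """Vulnerable on purpose: the numeric field is concatenated as text."""
    return conn.execute(
        "SELECT id, userName FROM users WHERE id = " + raw_id).fetchall()

# char(39) yields the single-quote character without typing it.
quote = conn.execute("SELECT char(39)").fetchone()[0]

# 'admin' spelled as character codes, so the payload contains no quote.
payload = "0 OR userName = char(97,100,109,105,110)"
passed_filter = naive_filter_allows(payload)
rows = lookup_by_id(payload)
print(passed_filter, rows)
```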

1.2.5 Inference

This type of attack helps the attacker gain insight into the current state of the
database. The attacker can inject Select-Case style statements into the query and, by
performing timing-based delay attacks, learn the status of the underlying database.
1.2.6 Piggy-backed Queries

The attacker can inject an entirely new query by piggy-backing it onto the intended query.
Unlike the union-based attack, the attacker does not modify the original query. The new
query is attached to the intended query by using a semicolon (;), which separates the two
SQL queries. If the attacker leaves the username field blank and enters
"'; Drop table USER; --" in the password field, the resulting query will be:

Select * from USER where userName='' AND password=''; Drop table USER; --';

The second query is piggy-backed onto the first query, and both queries are executed. After
these queries finish execution, the USER table is dropped from the database. The attacker's
success in this type of attack also depends on the underlying database system and
its configuration. For example, Oracle does not allow execution of multiple queries in one
transaction. SQL Server allows this, but it can be configured to disallow multiple query
execution.
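
The dependence on the database interface can be sketched with Python's `sqlite3` module, where `execute` refuses more than one statement while `executescript` runs them all. This only illustrates the general point with SQLite; it does not model Oracle or SQL Server, and the `users` table is hypothetical.

```python
import sqlite3

def fresh_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (userName TEXT, password TEXT)")
    return conn

def table_exists(conn):
    row = conn.execute("SELECT name FROM sqlite_master "
                       "WHERE type='table' AND name='users'").fetchone()
    return row is not None

def build_login_query(user_name, password):
    """Vulnerable on purpose: concatenates raw input into the SQL string."""
    return ("SELECT * FROM users WHERE userName='" + user_name +
            "' AND password='" + password + "'")

injected = build_login_query("", "'; DROP TABLE users; --")

# Single-statement interface: the piggy-backed statement is rejected.
strict = fresh_db()
try:
    strict.execute(injected)
    blocked = False
except (sqlite3.Warning, sqlite3.ProgrammingError):
    # The exception type differs between Python versions.
    blocked = True

# Multi-statement interface: both statements run and the table is dropped.
permissive = fresh_db()
permissive.executescript(injected)

print(blocked, table_exists(strict), table_exists(permissive))
```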

1.2.7 Stored Procedure Attacks

Many web application developers believe that stored procedures are a foolproof mechanism to
prevent SQL injection attacks. However, stored procedures can also be used to attack the
database with the help of metadata obtained by executing logically incorrect queries. This
metadata can be combined with tautology and piggy-backed queries to perform an
attack on stored procedures. Using this attack, the attacker can execute both user-defined
and system stored procedures. An attack of this type can also be used to shut down
the database server.

1.3 Related Work

Several approaches have been designed in the past to mitigate the effects of SQL injection
attacks.

1.3.1 Dynamic Query Matching

The dynamic query matching approach [5] proposes the generation of a SQL master file,
which consists of all valid queries that can be dynamically generated by the application.
When the application executes, all queries generated at runtime are converted to XML
and verified against the existing structures in the master file. With this approach,
an injected query is easy to detect, since it would not be present in the
master file and the query matching component would raise an alarm.
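
A rough sketch of the matching idea follows. It is not the implementation from [5]: the tokenizer, the flat XML layout and the element names `query`, `token` and `literal` are assumptions chosen to keep the example short.

```python
import re
import xml.etree.ElementTree as ET

# Quoted strings, comment markers, numbers, words, then any other symbol.
TOKEN_RE = re.compile(r"'(?:[^']|'')*'|--|\d+|\w+|[^\s\w]")

def query_to_xml(query):
    """Map a query to a flat XML skeleton in which literal values are
    replaced by empty <literal/> elements."""
    root = ET.Element("query")
    for token in TOKEN_RE.findall(query):
        is_string = len(token) >= 2 and token[0] == "'" and token[-1] == "'"
        if is_string or token.isdigit():
            ET.SubElement(root, "literal")
        else:
            ET.SubElement(root, "token").text = token.upper()
    return ET.tostring(root, encoding="unicode")

# Hypothetical master file holding the one query the application can issue.
MASTER = {query_to_xml(
    "SELECT * FROM users WHERE userName='x' AND password='y'")}

def is_known(query):
    return query_to_xml(query) in MASTER

benign = "SELECT * FROM users WHERE userName='alice' AND password='secret'"
injected = "SELECT * FROM users WHERE userName='' OR 1=1 --' AND password=''"
print(is_known(benign), is_known(injected))
```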

1.3.2 Analysis Framework for Web Application Security

The approach presented in [17] models the dynamically generated SQL query
as a finite state automaton. This automaton helps in defining the expected value
of the input for any given query. The method is based on input validation and depends on how
the user input is filtered by the application. The proposed solution also has some
drawbacks. It validates the query based on its syntax and does not consider semantic
validation. Also, the solution is not yet designed to handle some specific SQL operators,
such as the like operator. The solution is only theoretical, and its effectiveness cannot be
assessed until it is demonstrated by experiments.

1.3.3 Instruction Set Randomization

This approach [2] proposes that a key generated by a randomization algorithm be appended
to the SQL keywords. An injection attempt fails because the query constructed by the attack
does not match the query that contains the randomly generated key, since the attacker has
no knowledge of the key. The keywords in the two queries differ, thus preventing the SQL
injection attack. Since the algorithm is not executed on the web server or the database
server, the security of these servers is not compromised in case of an attack on the
proposed method. However, implementing a proxy server for randomization and
de-randomization adds to the performance overhead.
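
Keyword randomization can be sketched as follows. This is a simplified illustration and not the SQLRand implementation: the keyword list, the six-digit key format and the regular-expression check are assumptions, and a real system would parse the query so that keywords inside string literals are not misread.

```python
import re
import secrets

KEYWORD_RE = re.compile(
    r"\b(SELECT|FROM|WHERE|AND|OR|UNION|DROP|TABLE)(\d*)\b", re.IGNORECASE)

def new_key():
    """Generate a random 6-digit key (the key format is an assumption)."""
    return str(secrets.randbelow(900000) + 100000)

def derandomize(query, key):
    """Strip the key from every keyword; reject keywords without it."""
    def strip_key(match):
        if match.group(2) != key:
            raise ValueError("keyword without valid key: " + match.group(0))
        return match.group(1)
    return KEYWORD_RE.sub(strip_key, query)

def build_randomized_query(key, user_name, password):
    """Developer-written keywords carry the key; user input does not."""
    return (f"SELECT{key} * FROM{key} users WHERE{key} userName='" +
            user_name + f"' AND{key} password='" + password + "'")

key = new_key()
benign = derandomize(build_randomized_query(key, "alice", "secret"), key)

try:
    derandomize(build_randomized_query(key, "' OR 1=1 --", ""), key)
    injection_blocked = False
except ValueError:
    injection_blocked = True

print(benign)
print(injection_blocked)
```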

1.3.4 Frameworks for SQL Retrieval on Web Application Security

This approach [9] divides the injection detection process into two modules: the Pattern
Creation Module (PCM) and the Attack Detection Module (ADM). PCM creates a model of attacks
based on the patterns observed from previous attacks, while ADM checks whether the query
issued by the application matches an existing pattern. It is not a foolproof approach
because it is signature-based. If the attacker performs a new type of attack that does not
match an existing pattern, the attack will be successful, and this mechanism will fail.
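
The weakness of signature-based detection can be illustrated with a small sketch. The two regular expressions stand in for patterns learned from earlier attacks; they are hypothetical and are not taken from [9].

```python
import re

# Hypothetical signatures derived from previously observed attacks.
KNOWN_ATTACK_PATTERNS = [
    re.compile(r"\bOR\s+1\s*=\s*1\b", re.IGNORECASE),
    re.compile(r"\bUNION\s+SELECT\b", re.IGNORECASE),
]

def matches_known_attack(query):
    """Return True if the query matches any stored attack signature."""
    return any(p.search(query) for p in KNOWN_ATTACK_PATTERNS)

seen_before = "SELECT * FROM users WHERE userName='' OR 1=1 --'"
# Same tautology, written in a form that no signature covers.
new_variant = "SELECT * FROM users WHERE userName='' OR 'a'='a' --'"

print(matches_known_attack(seen_before))
print(matches_known_attack(new_variant))
```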

1.3.5 AMNESIA

In this approach [8], a combination of static analysis and run-time monitoring is used for
SQL injection prevention. In the static phase, the AMNESIA tool builds a model of all
the queries that are generated by the application. For this purpose, the tool needs access to
the entire source code. In the dynamic phase, the query built during run-time is validated
against the model built during the static phase.

1.3.6 CANDID

CANDID [1] stands for Candidate Evaluation for Discovering Intent Dynamically. This
approach dynamically mines the programmer-intended query structure and compares it with
the actual query. It proposes to run the application on candidate inputs that are benign.
However, it is not a practical approach because the problem of finding such inputs is
undecidable.

1.3.7 Automated Fix Generation to Secure SQL Statements

This approach [16] gathers information about known vulnerable SQL statements, generates
a fix and then replaces this vulnerability with the generated code. This method, however, is
based on an assumption that the language of development, database connector and database
system support prepared statements. It also assumes that the vulnerable code has equivalent
datatypes as the corresponding field in the database. In case of mismatching datatypes, it
assumes that the run-time will handle type conversions.

1.4 Problem Statement

SQL injection is a serious threat to database-driven applications. These applications are
designed to accept input from the user and query the database using this input. The query
can be written in the code in the form of a string that is built dynamically, on the fly,
with the help of the user's input. It is therefore vulnerable to modification, and this type
of attack can be performed by manipulating the input provided through the fields that a
user populates in the user interface. This changes the purpose of the
intended query and issues a malicious query that can damage the database schema or
even the data stored in the database. When this string is sent to the database for execution,
the database engine cannot differentiate between the actual query and the user input.

Many frameworks try to mitigate this problem by using different techniques behind the
scenes. However, this affects overall performance in terms of query execution
time, validation overhead, and memory and CPU usage. For instance, Object-Relational
Mapping (ORM) frameworks like Entity Framework prevent SQL injection, but the
object-relational mapping leads to the creation of objects, which adds to the existing
performance overhead caused by parameterized queries. Application security has a higher
priority than performance, so the performance overhead can be accepted if it is
reasonable. However, a framework that keeps the application secure at
the cost of extremely poor performance and resource usage cannot be considered a good
framework.

1.5 Hypothesis

This project aims to prevent SQL injection attacks by combining keyword randomization
and dynamic query matching. In the first step, the algorithm generates a random
key and appends it to the SQL keywords in the query. When the attacker injects malicious
code, she is not aware of this key, so the SQL keywords in the injected code carry either
no key or an invalid key. However, if
the key is compromised, or the attacker designs innovative attack techniques and obtains
access to the key, the second step of the validation process blocks the query by sending
it to the XML parsing component, which uses dynamic query matching to
compare the parse trees of the intended query and the actual query. If the query fails this
validation, it is considered an injection attack and is not sent to the
database server for execution. The program also records information about this query
in an error log that helps the administrator identify and learn more about
the attacks.
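
The two-step validation described above can be sketched as a single decision function. This is an illustration under stated assumptions, not the project's implementation: the fixed demo key, the token-based `skeleton` comparison (used here in place of XML parse trees) and the in-memory `attack_log` list are all hypothetical.

```python
import re

KEYWORD_RE = re.compile(
    r"\b(SELECT|FROM|WHERE|AND|OR|UNION|DROP|TABLE)(\d*)\b", re.IGNORECASE)
TOKEN_RE = re.compile(r"'(?:[^']|'')*'|--|\d+|\w+|[^\s\w]")

def keys_valid(query, key):
    """Step 1: every SQL keyword must carry the secret key."""
    return all(m.group(2) == key for m in KEYWORD_RE.finditer(query))

def skeleton(query):
    """Step 2 helper: token list with literal values replaced by '?'."""
    result = []
    for t in TOKEN_RE.findall(query):
        is_string = len(t) >= 2 and t.startswith("'") and t.endswith("'")
        result.append("?" if is_string or t.isdigit() else t.upper())
    return result

def decide(intended, actual, key, attack_log):
    if not keys_valid(actual, key):
        attack_log.append(("key_mismatch", actual))
        return "blocked"
    if skeleton(intended) != skeleton(actual):
        attack_log.append(("structure_mismatch", actual))
        return "blocked"
    return "forwarded"

KEY = "4821"  # fixed here for the demo; a real key would be random

def template(user_name, password):
    return (f"SELECT{KEY} * FROM{KEY} users WHERE{KEY} userName='" +
            user_name + f"' AND{KEY} password='" + password + "'")

intended = template("x", "x")
log = []
results = [
    decide(intended, template("alice", "secret"), KEY, log),
    decide(intended, template("' OR 1=1 --", ""), KEY, log),      # no key
    decide(intended, template("' OR4821 1=1 --", ""), KEY, log),  # stolen key
]
print(results)
print([reason for reason, _ in log])
```

The third case shows why the second step matters: an attacker who has learned the key passes the keyword check but still changes the query structure.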

Our approach is based on a combination of SQLRand [2] and dynamic query matching [5].
However, SQLRand uses a separate randomization and de-randomization proxy, which
adds overhead in terms of communicating the query between the application server
and the proxy servers. In our approach, randomization and de-randomization are
performed at the application server layer. Also, the dynamic query matching approach
involves a partial matching component, which can be eliminated in our approach, since it
already has a strong blocking mechanism in the form of keyword randomization. This
eliminates the performance overhead as well as the risk of false negatives caused by
partial matching. The primary target of this project was to prevent tautology, union-based
and piggy-backed attacks.

1.5.1 Security

Some existing approaches that prevent SQL injection propose a scan through the entire
source code of the application to find hotspots in the code that may be vulnerable to SQL
injection attack. They also build a model of all possible queries in the application. However,
our approach in this project recommends building a model of the query only during runtime, when it is fired at the application server. Random key generation and query validation
by using randomization of keywords will be performed at the application server, rather than
at the proxy server [2]. Also, the inclusion of keyword randomization will eliminate the
use of partial matching from the runtime modeling component. Therefore, this approach is
more efficient in terms of performance as compared to the existing technique.

1.5.2 Performance

Our approach performs well because the entire validation process is done at
the application server before the query is sent to the database server. Since the application
server typically has substantial computing power, query validation at this layer ensures
that the attack is blocked at the application server, thus avoiding access to the database
server. One of the existing approaches to mitigate this attack builds a model of all
possible intended queries from the application, and keeping this model up to date can be
very performance-intensive. In our approach, the model is built only
for the query currently in execution. This provides the program with an up-to-date model
of the query, while ensuring that the process is faster and more efficient. XML parse
trees of the intended query and the actual query are created in the application server's
memory, rather than being stored as files on the server's secondary storage.
The comparison of the parse trees was implemented using stacks.
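
A stack-based comparison of two in-memory XML trees might look like the sketch below. The element names (`select`, `where`, `and`, `or`, `eq`) are hypothetical, since the report's DTD is not reproduced in this section, and the comparison checks structure only, ignoring literal values stored as element text.

```python
import xml.etree.ElementTree as ET

def same_tree(a, b):
    """Compare two XML trees iteratively using an explicit stack of
    node pairs; element text (literal values) is ignored."""
    stack = [(a, b)]
    while stack:
        x, y = stack.pop()
        if x.tag != y.tag or len(x) != len(y):
            return False
        stack.extend(zip(x, y))  # push child pairs in document order
    return True

# Trees are parsed from strings, so nothing is written to disk.
intended = ET.fromstring(
    "<select><columns/><from/><where><and><eq/><eq/></and></where></select>")
benign = ET.fromstring(
    "<select><columns/><from/><where><and><eq>alice</eq><eq>secret</eq>"
    "</and></where></select>")
injected = ET.fromstring(
    "<select><columns/><from/><where><or><eq/><eq/></or></where></select>")

print(same_tree(intended, benign))
print(same_tree(intended, injected))
```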

1.6 Roadmap

The remaining chapters of this report describe the design, implementation, analysis and
conclusions derived from this project. Chapter 2 discusses the design of the proposed
approach, while Chapter 3 elaborates on the implementation. A detailed analysis of the
proposed method is presented in Chapter 4, along with the results of the experiments
performed to support the proposed hypothesis. Chapter 5 presents the conclusions of
the project and describes the current status, future scope and lessons learned during
the course of the project.

Chapter 2
Design
The approach implemented in this project was designed as a class library, so that it can
be used by any database-oriented application to prevent SQL injection. This library was
imported by the test applications, thus integrating it at the application server layer. The
following section gives a brief description of the three-tier architecture, and further explains
how our approach was designed.

2.1 Three Tier Architecture

The application used to employ our approach was designed using the three-tier architecture as shown in Figure 2.1. In this architecture, the user interface, application server and
database server form the three different layers. The user interface layer deals with the presentation of the web application. The application server layer contains the business logic
and performs computing operations. It acts as a communication layer between the client
(user interface layer) and the database server layer. It sends the request initiated by the
client to the database server and carries the response from the database server back to the
client. The database server layer deals with the physical storage, access and manipulation
of data.

Figure 2.1: Three Tier Architecture


The proposed algorithm was deployed on the application server. During testing, malicious input
was entered through the web browser and sent to the application server as part of an HTTP
request. The application server contained the logic of the proposed algorithm: the query was
received and parsed and, depending on the decision made by the algorithm, was either blocked
or forwarded to the database server for execution.

2.2 System Architecture

2.2.1 SQLRand Architecture

Figure 2.2: SQLRand Architecture


SQLRand [2] is designed using a proxy server. This proxy server could be located
either on the database server as shown in Figure 2.2, or as an external server between the
web server and the database server. In both cases, it introduces performance overhead
and security concerns. If the proxy makes an error and passes a malicious query to the
database server, the database server will reply with an error message that could expose
information about the underlying tables and the database, thus encouraging the attacker to
try different types of injection attacks. If the proxy lies between the web server and the
database server, performance overhead will be introduced in the form of additional network
traffic and the extra time required to send the query to the proxy and to randomize and
de-randomize it.

2.2.2 Dynamic Query Matching Architecture

Figure 2.3: Dynamic Query Matching Architecture


Figure 2.3 shows the architecture of the Dynamic Query Matching approach [5]. This
approach depends on a user-specified threshold. A master file is created that contains
a list of all possible queries that can be generated by the web application. This file must
be updated whenever new queries are added to the application code, so that it stays in sync
with the application. The dynamic query generated by the application during runtime
is converted to XML and compared with the master file. If this dynamic query matches
an existing query in the master file, it is allowed to pass through. However, if no
exact match is found, a partial match is performed by calculating the distance between the
dynamic query and the query from the master file that most closely matches it. If
this distance is below the threshold specified by the user (programmer), the query is allowed
to pass through to the database for execution; otherwise it is stopped from reaching the
database server. Relying on the user to specify a valid threshold is a major drawback of this
approach because the threshold is subjective. If the user chooses an incorrect threshold, it
will compromise the security of the application.
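
The risk of a subjective threshold can be illustrated with a small sketch. It is written in Python purely for illustration and is not the implementation from [5]: the distance function (one minus a token-level similarity ratio), the function names and the sample queries are all assumptions, since the report does not state which distance measure dynamic query matching uses.

```python
from difflib import SequenceMatcher


def distance(a: str, b: str) -> float:
    """Hypothetical distance: 1 minus the token-level similarity ratio."""
    return 1.0 - SequenceMatcher(None, a.split(), b.split()).ratio()


def allowed(dynamic: str, master: list, threshold: float) -> bool:
    """Allow the query if its closest master query is within the threshold."""
    best = min(distance(dynamic, m) for m in master)
    return best < threshold


# Assumed master file with a single known query.
master = ["select balance from users where user = 'abc' and password = 'ab123'"]

# Piggy-backed attack: the known query followed by a destructive statement.
attack = master[0] + " ; drop table users"

print(round(distance(master[0], attack), 3))
print(allowed(attack, master, threshold=0.2))  # lax threshold
print(allowed(attack, master, threshold=0.1))  # strict threshold
```

With these assumed values, the piggy-backed query differs from the master query by only four tokens out of sixteen, so a lax threshold lets it through while a stricter one blocks it. The outcome depends entirely on a number chosen by the programmer.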

2.2.3 RandXML Architecture

Figure 2.4: RandXML Architecture


This architecture was based on a combination of the SQLRand and dynamic query matching
architectures. It attempted to eliminate their design flaws and to mitigate
SQL injection attacks by combining the strengths of both approaches. This approach
was designed to perform its operations on the application server as shown in Figure 2.4.
Since the dynamic query was built during runtime at the application server, this approach
was more efficient when embedded in the same layer. Unlike SQLRand, this eliminated the
network overhead and the time required to send the query from the application server
to a proxy server. This approach also eliminated the need for a
static master file that contained all possible queries from the application beforehand. The
intended query structure was obtained on the fly. This removed the need to update the
master file and ensured that the program used the most current structure for an intended
query.

2.3 Logical Flow

The logical flow of our approach is shown in Figure 2.5. When the application server
received input from the user, it dynamically generated the query based on the input. Keyword
randomization was then applied to this query and to the developer-intended query: the
randomly generated key was appended to the SQL keywords in both queries. These
queries were then forwarded to an XML parsing component, which converted both queries
into XML trees. These XML parse trees were compared, and based on the result of the
comparison, the algorithm determined whether the dynamically built query was an attack
or not. If the query was non-malicious, it was allowed to pass to the database server
for execution. However, if the algorithm identified the query as an attack, it was blocked
at the application server and was not sent to the database server for execution. The attack
queries were added to an error log so that system administrators could review them.

Figure 2.5: RandXML Flowchart

2.4 Modules

Based on the sub-tasks performed, our approach can be divided into different modules.

2.4.1 Key Generation Module

This module was responsible for secure random key generation. A list of SQL keywords
was used to identify SQL keywords in both the intended and the actual queries. A secure random
key was generated and appended to the SQL keywords in both queries.

2.4.2 XML Parsing Module

This module accepted the intended and actual queries appended with the key generated by
the key generation module. Since the dynamic query was not constant, both queries were
converted to XML. Two stacks were created: one for the intended query, and the other for
the actual query. Each token in both queries was converted to an XML node and
pushed onto the corresponding stack.

Figure 2.6: DTD for XML Equivalent of SQL

2.4.3 Decision Module

This module performed a comparison of the XML nodes added to the stack by the XML
parsing module. This comparison was performed till the program either found a mismatch
or both stacks were empty. If a mismatch was found, it implied an injection attack since
the structure of the intended query and the actual query did not match. If both stacks were
empty and no mismatch was found, the program determined the actual query as benign and
allowed it to pass further to the database server for execution.

2.4.4 Attack Reporting Module

Based on the decision provided by the decision module, the attack reporting module was
executed only when mismatching tokens were found while comparing the two stacks. In
this case, the malicious query was blocked from being sent to the database server, and was
recorded in the database as an attack. Figure 2.7 shows the structure of the table used for
attack reporting.

Figure 2.7: Attack Log Table Structure


This log stored the entire query string that was identified as an attack, along with
the date and time of the attack. The primary purpose of this design was to let
administrators review the queries identified as attacks, so that developers could
secure their code, if needed.


Chapter 3
Implementation
This project was implemented using ASP.NET with C# for test application development
and SQL Server 2008 as the database server. The developed application made use of
Microsoft .NET Framework 4.0, with Visual Studio 2010 as the Integrated Development
Environment. Our approach was implemented as a class library that was used by the test
application code on the web application server. Windows Server 2008 was used along with
Internet Information Services (IIS) 7.0 as an application server, which hosted
the ASP.NET web application. A second version of the same application was built without the
injection blocking mechanism, leaving it vulnerable to attacks. The overhead was
measured in the application that was protected using RandXML.

3.1 Key Generation Module

Since the primary area of research of this project was not cryptography, this module was
implemented using the RNGCryptoServiceProvider class provided by the .NET framework.
It provides a secure random number generator using the implementation provided by the
cryptographic service provider [12]. This generates a cryptographically strong random
number, in contrast to the pseudorandom numbers generated by the System.Random class.
Such a number is unpredictable and very unlikely to repeat, and thus provides an effective
mechanism to generate the key required by the keyword randomization process.


Figure 3.1: Secure Random Key Generator


Even though using this type of random number adds to the computation overhead of
the approach, the overhead introduced is minimal compared to the security and randomness
of the key it provides. System.Random produces predictable pseudorandom numbers.
In our approach, if a predictable key is used, the attacker can guess it with some trial
and error, and can inject SQL code by appending that key to the injected SQL keywords.
Using a cryptographically strong random key minimizes this possibility. When this module
finishes execution, it returns the randomly generated key, and
using the list of SQL keywords provided to the approach, it appends this random key to the
SQL keywords in both the intended and the actual queries.
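
The key generation and keyword randomization steps can be sketched as follows. This is an illustrative sketch in Python and not the C# code of the class library: Python's secrets module stands in for RNGCryptoServiceProvider, the keyword list is a small assumed subset, and the use of a query template with named placeholders is an assumption about how developer-written keywords are kept apart from user input.

```python
import re
import secrets

# Assumed subset of SQL keywords; the list used by the library is not shown in this report.
SQL_KEYWORDS = {"select", "from", "where", "and", "or", "union",
                "insert", "update", "delete", "drop"}


def generate_key(num_bytes: int = 8) -> str:
    """Return an unpredictable hex key drawn from the OS cryptographic RNG."""
    return secrets.token_hex(num_bytes)


def randomize_keywords(template: str, key: str) -> str:
    """Append the key to every SQL keyword that appears in the template."""
    def tag(match: re.Match) -> str:
        word = match.group(0)
        return word + key if word.lower() in SQL_KEYWORDS else word
    return re.sub(r"[A-Za-z_]+", tag, template)


key = generate_key()
# Hypothetical developer-written template; {user} and {pwd} mark the user input.
template = "select balance from users where user='{user}' and password='{pwd}'"
randomized = randomize_keywords(template, key)

# User input is substituted after randomization, so injected keywords stay bare.
actual = randomized.format(user="abc", pwd="x' or '1'='1")
print(actual)
```

In this sketch the injected "or" never receives the key, which is the property the later comparison relies on to tell developer-written keywords from injected ones.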

3.2 XML Parsing Module

Figure 3.2: XML Parsing Module


The output of the key generation module acted as the input for this module. Both the
intended query and the actual query were parsed as XML because the structure of the
dynamically generated query was not constant. Figure 3.2 shows the XML equivalent of the
following query:

Select balance from users where user=abc and password=ab123

These XML nodes were then pushed onto two different stacks, which were compared by
the next module.
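
A minimal sketch of this step is shown below, in Python for illustration only. The element names "keyword" and "token" are hypothetical, since the DTD of Figure 2.6 is not reproduced in the text, and splitting on whitespace is a simplification of real SQL tokenization.

```python
import xml.etree.ElementTree as ET


def query_to_stack(query: str, key: str) -> list:
    """Convert each whitespace-separated token to an XML node and push it on a stack."""
    stack = []
    for token in query.split():
        # A token that ends with the key is a randomized, developer-written keyword.
        tag = "keyword" if token.endswith(key) else "token"
        node = ET.Element(tag)
        node.text = token
        stack.append(node)  # list.append acts as the push operation
    return stack


key = "k9f2"  # fixed here for readability; a real key would be random
query = f"Select{key} balance from{key} users where{key} user=abc and{key} password=ab123"
stack = query_to_stack(query, key)

for node in stack:
    print(ET.tostring(node, encoding="unicode"))
```

The same function would be applied to the intended query and to the actual query, giving the two stacks that are compared next.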

3.3 Decision Module

This module was responsible for comparing the stacks for the intended query and the actual
query. This process was performed by popping one node at a time from both stacks
and comparing them, until either a mismatch was found, or both stacks were empty. If a
mismatch was found, it implied that both queries were structurally different, indicating that
it was an injection attack. However, if both stacks were empty and no mismatch was found,
the program determined the query to be benign and allowed it to pass through to the
database server. Thus, based on the result of this module, the query was either sent to
the database server for execution or was reported in the error log as a malicious query.
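
The comparison loop described above can be sketched as follows. This is a Python illustration under stated assumptions, not the library's IsAttack() code: nodes are modeled as (kind, text) pairs rather than XML nodes, and the text of non-keyword tokens is ignored so that only the query structure is compared.

```python
def to_nodes(query: str, key: str) -> list:
    """Turn a query into a stack of (kind, text) nodes.

    Tokens ending in the key are randomized keywords and keep their text.
    Every other token is treated as a literal or identifier whose text is
    ignored (an assumption of this sketch).
    """
    return [("keyword", t.lower()) if t.endswith(key) else ("token", None)
            for t in query.split()]


def is_attack(intended: list, actual: list) -> bool:
    """Pop one node at a time from both stacks until a mismatch or both are empty."""
    intended, actual = list(intended), list(actual)  # leave the caller's stacks intact
    while intended and actual:
        if intended.pop() != actual.pop():
            return True  # structural mismatch
    return bool(intended or actual)  # leftover nodes also count as a mismatch


key = "k9f2"  # fixed for readability; the real key is random
intended = f"select{key} balance from{key} users where{key} user='abc' and{key} password='pw'"
benign = f"select{key} balance from{key} users where{key} user='abc' and{key} password='ab123'"
attack = f"select{key} balance from{key} users where{key} user='abc' and{key} password='x' or '1'='1'"

print(is_attack(to_nodes(intended, key), to_nodes(benign, key)))
print(is_attack(to_nodes(intended, key), to_nodes(attack, key)))
```

The benign query has the same structure as the intended query and is not flagged. The tautology input adds tokens, including an "or" without the key, so the second comparison reports an attack.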

Figure 3.3: Decision Module

3.4 Attack Reporting Module

This module ran only when the decision module determined the user's query to be malicious.
It was responsible for reporting a bad query to the
INJECTION_ATTACK_LOG table.
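
A hedged sketch of the reporting step is given below. It uses Python with an in-memory SQLite database as a stand-in for SQL Server; the column names follow the table definition in Appendix B, while the function name report_attack and the sample overhead value are assumptions made for illustration.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")  # stand-in for the SQL Server database
conn.execute(
    "CREATE TABLE INJECTION_ATTACK_LOG ("
    " AttackID INTEGER PRIMARY KEY AUTOINCREMENT,"
    " AttackQuery TEXT, AttackDate TEXT, AttackDetectionOverhead REAL)"
)


def report_attack(conn, query: str, overhead_ms: float) -> None:
    """Record a blocked query; bound parameters keep the logged text from executing."""
    conn.execute(
        "INSERT INTO INJECTION_ATTACK_LOG"
        " (AttackQuery, AttackDate, AttackDetectionOverhead) VALUES (?, ?, ?)",
        (query, datetime.now().isoformat(timespec="seconds"), overhead_ms),
    )
    conn.commit()


# Illustrative values only; 4.0 is not a measured overhead.
report_attack(conn, "select balance from users where user='' or '1'='1'", 4.0)
rows = conn.execute("SELECT AttackID, AttackQuery FROM INJECTION_ATTACK_LOG").fetchall()
print(rows)
```

Because the logged value is itself an attack string, the insert in this sketch passes it as a bound parameter rather than concatenating it into the statement.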


Chapter 4
Analysis
One of the key points of the hypothesis was that our approach would be faster and more
effective than the existing methods on which it was built. The average execution time
overhead introduced by this approach was 4.6 milliseconds, as compared to 6.5 milliseconds
for SQLRand. This was because our approach omits the communication component of
SQLRand, which sends the dynamically built query to the proxy server to perform
randomization and de-randomization. Since this process was implemented at the application server
itself, the overall execution time overhead was reduced by eliminating the
round-trip time of the query between the application server and the proxy server.

Also, in terms of accuracy of attack detection, this approach proved to be more effective
than dynamic query matching. Dynamic query matching requires the programmer
to specify a threshold, and based on this threshold, partial matching of the intended query
and the actual query is performed. Because this threshold is chosen by the programmer,
its subjectivity leaves room for error in determining an accurate value. In addition,
partial matching can be dangerous for
some attacks. For instance, if the attacker piggybacks a delete or a drop query, and
the query falls within the specified threshold, the partial matching component will treat
the query as benign and send it to the database server for
execution. Executing such a query can cause either deletion of data or dropping of an entire
table from the database. Our approach, however, eliminated the need for partial matching
through the use of keyword randomization. This provided more accurate results.

4.1 Test Environment

Our approach was implemented as a class library. This class library was imported by a test
application built using ASP.NET and C#. The queries in this test application were written
in the form of strings, so that their structure could be easily altered by the user input entered
during testing. The method implemented in the class library that parses both the intended
and actual queries was called in the test application code in order to detect whether the current
input entered in the application was a SQL injection attack. Thus, the
vulnerable test application was protected by the proposed method in order to determine whether
this approach blocks injection attack attempts.

4.2 Types of Attacks

The primary purpose of this test system was to find out if RandXML was able to protect the
vulnerable application from SQL injection attempts. Tests were performed using different
variations of this attack to ensure that it detects and blocks as many variations as possible.
The following section highlights different types of attacks performed on the application,
and demonstrates its behavior in terms of performance and effectiveness. Each attack was
performed with 10,000 input samples. The tables in the following section list each observed
detection overhead in milliseconds and the number of samples that incurred it.

4.2.1 Tautology Attack

Description
This attack was performed by injecting expressions in the query that always evaluated to
true. These expressions were built easily by using the OR keyword in SQL and following it
with an expression of the form value_1 = value_2, where value_1 and value_2 were equal.
This ensured that the OR part of the query always evaluated to true, regardless of what
the remaining query contained. As a result, the query condition was true for every row and
the query returned all rows from the targeted table.
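
As an illustration, the following Python sketch builds a query by string concatenation, in the style described for the test application, and shows how a tautology input changes it. The function name and the input values are hypothetical; the query shape follows the example used in Section 3.2.

```python
def build_query(user: str, password: str) -> str:
    """Vulnerable query construction by plain string concatenation."""
    return ("select balance from users where user='" + user +
            "' and password='" + password + "'")


benign = build_query("abc", "ab123")
attack = build_query("abc", "x' or '1'='1")

print(benign)
print(attack)
print(len(benign.split()), len(attack.split()))
```

The injected input closes the password literal early and adds an OR condition that is always true, so the attack query contains more tokens than the developer intended.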

Results
This attack was successfully blocked, since the random key was not appended to
the injected OR keyword, and this led to a difference in structure between the intended query
and the query generated during runtime. The execution overhead introduced during detection
of this attack is shown in Table 4.1 and Figure 4.1.

Table 4.1: Tautology Attack Overhead

Detection Overhead (milliseconds)    Number of samples
3                                    4174
4                                    5767
5                                    59

4.2.2 Union Attack

Description
Another query was injected in this attack along with the intended query. This injected query
was preceded by the UNION keyword in SQL. Both queries were executed independently
and a UNION operation was performed on their results.


Figure 4.1: Tautology Attack Detection Overhead


Results
In this attack, the attacker injected an entire query along with the UNION keyword in SQL.
Since the attacker was not aware of the secure random key generated by
RandXML during run-time, the injected UNION keyword caused a mismatch
in the decision module. Even if the attacker had somehow managed to guess the secure key,
the injected query would not have matched the structure of the intended query,
and the input would still have been detected as malicious. The execution overhead is shown
in Table 4.2 and Figure 4.2.


Table 4.2: Union Attack Overhead

Detection Overhead (milliseconds)    Number of samples
4                                    4092
5                                    5796
6                                    110
9                                    1
12                                   1

4.2.3 Piggy-backed Attack

Description
This attack was performed by appending a completely different query to the intended
query, separated by a delimiter that depends on the database system used by the application.
This query was intended to be executed along with the query dynamically built by the
application. The piggy-backed query was created in two forms: as a custom query and
as a call to a system stored procedure. This type of attack could lead to severe consequences like
deleting data, dropping the table or even shutting down the database server.

Results
Since an entirely different query was piggy-backed in this attack, the dynamic query and
the intended query were structurally different. Because of this mismatch, the query was
detected as an attack. Table 4.3 and Figure 4.3 show the detection overhead.

Table 4.3: Piggy-backed Attack Overhead

Detection Overhead (milliseconds)    Number of samples
4                                    4054
5                                    5827
6                                    116
7                                    3


Figure 4.2: Union-based Attack Detection Overhead

4.2.4 Logically Incorrect Queries

Description
This attack was performed by injecting illegal characters in the query that returned an exception from the database server on execution. The input was corrupted by using the single
quote character. If such an input is not validated, it causes an exception, and returns the
error message provided by the underlying database. This message may expose considerable information about the database, like the type of the database, its version, and in some
cases, it may also reveal the underlying table name.

Results
This attack was successfully blocked by our approach. Injecting illegal characters in the
input caused extra query tokens in the dynamically generated query. This led to a structural
difference between the intended query and the actual query, and the attempt was identified as
an attack.

Figure 4.3: Piggy-Backed Attack Detection Overhead

Table 4.4: Logically Incorrect Attack Overhead

Detection Overhead (milliseconds)    Number of samples
3                                    4270
4                                    5671
5                                    51
6                                    8


Figure 4.4: Logically Incorrect Queries Attack Detection Overhead


Table 4.5: Execution Overhead Comparison

Type of Attack         Average Overhead (milliseconds)
Tautology Attack       3.58
Union-based Attack     4.60
Logical Attack         3.43
Piggy-backed Attack    4.58
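
For reference, an average overhead can be estimated from the binned counts in the per-attack tables as a weighted mean. The Python sketch below does this for Tables 4.1 and 4.2; the function name is hypothetical. Such an estimate is only approximate, because the tables report whole milliseconds while the averages in Table 4.5 were presumably computed from the unrounded timings, so the two need not agree exactly.

```python
def binned_mean(histogram: dict) -> float:
    """Weighted mean of a {overhead_ms: sample_count} histogram."""
    total = sum(histogram.values())
    return sum(ms * count for ms, count in histogram.items()) / total


tautology = {3: 4174, 4: 5767, 5: 59}              # counts from Table 4.1
union = {4: 4092, 5: 5796, 6: 110, 9: 1, 12: 1}    # counts from Table 4.2

print(sum(tautology.values()), sum(union.values()))
print(binned_mean(tautology))
print(binned_mean(union))
```

Both histograms sum to the 10,000 samples used per attack, and the estimates land close to the tautology and union-based averages reported in Table 4.5.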

4.3 Summary

4.4 Hypothesis Evaluation

The results presented in this chapter support the hypothesis that RandXML was
a better and more efficient approach to detect and prevent SQL injection attacks. It was
able to prevent different variations of the attack, like tautology attack, piggy-backed
attack, union-based attack and attack performed by injecting logically incorrect queries in
the application. Also, the performance overhead introduced by this approach was negligible.
However, the tests performed were restricted to an application which had only a few
queries. It would have been very interesting to observe the behavior of this approach in
case of complicated queries or queries which contained lots of tokens in the form of joins
and select-list attributes. Also, the list of SQL keywords is currently a static list. This list
needs to be updated periodically with any new keywords that are introduced in SQL to help
the application using the approach to identify SQL keywords in the query.

Even though our approach prevents SQL injection from an application that uses the
class library, the attacks are currently prevented by calling the methods from this library
that perform the necessary tasks to block the attack. However, variations of this attack will
continue to evolve, and this approach may only be able to block a few of them. So, despite
the success of this approach in the test scenario, it cannot be implemented at the industry
level. It is only useful to prevent the variations of this attack mentioned before.


Chapter 5
Conclusions
5.1 Current Status

Currently, our approach is able to prevent union-based attack, piggy-backed attack,
tautology attack and attack performed by injecting logically incorrect queries. The attack
simulation for testing the effectiveness and efficiency of the approach was performed using
a program that was written to generate a test dataset that represented both malicious
inputs and benign inputs. Instead of a customized attack dataset, a real-time simulation
would have been more effective, and would have provided a larger variety of inputs. SQLRand
does not detect and prevent logically incorrect queries [10]. However, the RandXML
approach is able to prevent them.

5.2 Future Work

There are several areas related to this project which can be considered for further exploration. Some of the potential opportunities for future work are presented below.

5.2.1 Parallel XML Node Comparison

Currently, the XML node comparison is performed sequentially. This can also be done in
parallel by splitting the two stacks in chunks, and comparing the corresponding nodes in
these sub-stacks by using multiple threads since each token is independent of the other.
This will make the decision module faster, thus speeding up the entire comparison process.
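
One possible shape of this idea is sketched below in Python. It is not part of the implemented library: the chunk size, the function names and the use of a thread pool are assumptions, and in CPython threads would not by themselves speed up such a comparison, so the sketch only illustrates how the stacks could be partitioned.

```python
from concurrent.futures import ThreadPoolExecutor


def chunks_match(pair) -> bool:
    """Compare one pair of corresponding sub-stacks."""
    left, right = pair
    return left == right


def parallel_equal(intended: list, actual: list, chunk_size: int = 4) -> bool:
    """Split both stacks into chunks and compare corresponding chunks concurrently."""
    if len(intended) != len(actual):
        return False  # different lengths already imply a structural mismatch
    pairs = [(intended[i:i + chunk_size], actual[i:i + chunk_size])
             for i in range(0, len(intended), chunk_size)]
    with ThreadPoolExecutor() as pool:
        return all(pool.map(chunks_match, pairs))


same = parallel_equal(list("abcdefghij"), list("abcdefghij"))
different = parallel_equal(list("abcdefghij"), list("abcdefghiX"))
print(same, different)
```

Each chunk can be checked independently because a node is compared only with the node at the same position in the other stack.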

5.2.2 Database Server

The current analysis was performed using Microsoft SQL Server 2008. It would be interesting to see the results obtained by using MySQL or Oracle as the back-end server,
especially because of different configurations these systems offer. For instance, Oracle disallows batch query execution. So, it would be interesting to see how a piggy-backed attack
can be performed on a system that uses Oracle.

5.2.3 Hotspot Detection in Application Code

For convenience of implementation and analysis, our approach was implemented as a class
library using C#. This library contained the logic of the approach and the test application
had to create objects of the class from this library and call the IsAttack() method to
determine if the query was an attack or not. However, to apply this approach in real-life
applications, the algorithm would need to access the source code of the application that needs
to be protected from injection attacks. By accessing the source code, it can find hotspots,
that is, statements in the code that generate SQL queries. This will eliminate the need to manually
add function calls to the application code and will protect the vulnerable application from
SQL injection attack. Hotspot detection can either be performed manually by using the grep
command as shown in Figure 5.1, or it can be automated using a suitable tool, depending on
the language in which the source code is written. For example, CAT.NET [11] is
a binary code analysis tool that can be used to automate vulnerability detection in C#, J#
or VB.NET code. It helps to identify vulnerabilities in code that are responsible for SQL
injection attack, Cross site scripting attack and XPath injection attack [4].
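
A grep-like scan for hotspots could look like the following Python sketch. The regular expression is a hypothetical heuristic (a string literal that starts with a SQL verb and is followed by concatenation), and the sample source lines are invented for illustration; it is neither the command shown in Figure 5.1 nor the analysis performed by CAT.NET.

```python
import re

# Hypothetical pattern: a string literal starting with a SQL verb, followed by "+".
HOTSPOT = re.compile(r'"\s*(select|insert|update|delete)\b[^"]*"\s*\+', re.IGNORECASE)


def find_hotspots(source: str) -> list:
    """Return (line_number, line) pairs that look like dynamic SQL construction."""
    return [(number, line.strip())
            for number, line in enumerate(source.splitlines(), start=1)
            if HOTSPOT.search(line)]


# Invented C#-style source lines used only to exercise the pattern.
sample = '''
string name = txtUser.Text;
string q = "select balance from users where user='" + name + "'";
string msg = "Please select an option";
'''

hits = find_hotspots(sample)
for number, line in hits:
    print(number, line)
```

Only the line that concatenates user input into a query string is reported; the message string that merely contains the word "select" is not. A heuristic of this kind can still miss queries built in other ways, so it is a starting point rather than a complete detector.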

5.2.4 Test Application Setup

The test web application was hosted on a web server. Since our approach is embedded in
the application layer, it will be interesting to learn the performance of this approach when
the application is hosted on a web server in the cloud. Also, the attack generation can be made
real-time to simulate concurrent attacks on this system, and the behavior of this approach in
terms of memory consumption and CPU utilization can be observed. Further experiments can be
performed by increasing the complexity of the query by adding more joins or
attributes to the query. Since this approach works with query tokens, increasing the query
complexity might increase the detection overhead.

Figure 5.1: Hotspot Detection Using GREP

5.2.5 Automatic Prepared Statement Generation

Prepared statements help the database server distinguish the developer-intended query
from the user input, which it cannot do when the entire query is sent as a single string. If the
database server receives the query in the form of a prepared statement, it handles the
injection problem. The current approach is embedded in the application layer, where it parses
the dynamically generated query string at run time and determines if it is an attack or not.
If no attack is detected, the query is currently de-randomized and sent
to the database server in the form of a string for execution. Potentially, this string can be
converted to a parameterized query or a prepared statement, where the input is converted
to parameters and added to the query.

5.3 Lessons Learned

During the course of this project, several interesting facts were learned. Even though the fixes
that mitigate the SQL injection problem are not complex, this is one of the most neglected
vulnerabilities in developing database-backed applications. Most developers seem
to ignore or overlook this vulnerability in code, thus exposing the application to attackers
and potentially leading to an inconsistent database or loss of data. Even though this
problem has persisted for many years, web programmers tend to ignore it even
today. According to the OWASP Top 10 list for 2013, SQL injection still tops the list [6], just
like it did in 2010 [13].

Validation of user input by the application code is not a foolproof solution to this problem. Several approaches have been designed to mitigate this attack. Using parameterized
queries or prepared statements is a very effective solution to prevent SQL injection attack.
This separates the user input from system-generated code, and thus the database server can
distinguish between them, unlike the low level string manipulation operations, which send
an entire string consisting of code and user input together as one query. This separation
helps the database server to avoid misinterpretation of user input as SQL code, and thus,
that part is not executed by it.
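
The separation of code and input can be illustrated with a parameterized query. The sketch below uses Python with an in-memory SQLite database as a stand-in for the SQL Server setup used in this project; the table layout, column names and sample row are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the project's SQL Server database
conn.execute("CREATE TABLE users (username TEXT, password TEXT, balance REAL)")
conn.execute("INSERT INTO users VALUES ('abc', 'ab123', 100.0)")


def get_balance(username: str, password: str):
    """Look up a balance with placeholders, so the inputs are sent as data only."""
    row = conn.execute(
        "SELECT balance FROM users WHERE username = ? AND password = ?",
        (username, password),
    ).fetchone()
    return row[0] if row else None


print(get_balance("abc", "ab123"))
print(get_balance("abc", "x' or '1'='1"))
```

The tautology input is compared as a literal password value, so it matches no row instead of altering the query condition.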

Another solution to reduce the intensity of this attack is to use a database user account
with very minimal privileges in the database connectivity code. This will not prevent the
malicious query from reaching the database server. However, it will certainly disallow
deletion or modification of data at the database server, if that database user account has
read-only privileges.


Bibliography
[1] Sruthi Bandhakavi, Prithvi Bisht, P. Madhusudan, and V. N. Venkatakrishnan. CANDID:
Preventing SQL Injection Attacks using Dynamic Candidate Evaluations. In
Proceedings of the 14th ACM Conference on Computer and Communications Security,
CCS '07, pages 12-24, New York, NY, USA, 2007. ACM.
[2] Stephen W. Boyd and Angelos D. Keromytis. SQLRand: Preventing SQL Injection
Attacks. In Proceedings of the 2nd Applied Cryptography and Network Security
(ACNS) Conference, pages 292-302, 2004.
[3] Gregory T. Buehrer, Bruce W. Weide, and Paolo A. G. Sivilotti. Using Parse Tree
Validation to Prevent SQL Injection Attacks. In Proceedings of the International
Workshop on Software Engineering and Middleware (SEM) at Joint FSE and ESEC,
pages 106-113, 2005.
[4] Justin Clarke. SQL Injection Attacks and Defense. Syngress Publishing, 1st edition,
2009.
[5] Debasish Das, Utpal Sharma, and D.K. Bhattacharyya. An Approach to Detection
of SQL Injection Attack Based on Dynamic Query Matching. International Journal
of Computer Applications, 1(25):28-34, February 2010. Published By Foundation of
Computer Science.
[6] OWASP Foundation. Top 10 2013-A1-Injection, June 2013.
[7] William G. J. Halfond, Jeremy Viegas, and Alessandro Orso. Classification of SQL Injection
Attacks and Countermeasures.
[8] William G.J. Halfond and Alessandro Orso. Preventing SQL Injection Attacks Using
AMNESIA. In Proceedings of the International Conference on Software Engineering
Formal Demo, May 2006.
[9] Haeng Kon Kim. Frameworks for SQL Retrieval on Web Application Security. In
Proceedings of the International Multiconference of Engineers and Computer Scientists,
volume 1, page 5, Hong Kong, 2010. IMECS, International Association of Engineers.
[10] Diallo Abdoulaye Kindy and Al-Sakib Khan Pathan. A Detailed Survey on Various
Aspects of SQL Injection: Vulnerabilities, Innovative Attacks, and Remedies. CoRR,
abs/1203.3324, 2012.
[11] Microsoft. Microsoft Code Analysis Tool .NET (CAT.NET) v1 CTP - 32 bit.
http://www.microsoft.com/en-us/download/details.aspx?id=19968, 2013.
[12] Microsoft. RNGCryptoServiceProvider class.
http://msdn.microsoft.com/en-us/library/system.security.cryptography.rngcryptoserviceprovider.aspx, 2013.
[13] Open Web Application Security Project (OWASP). Projects/OWASP Secure Web
Application Framework Manifesto/Releases/Current/Manifesto, November 2010.
[14] Ashok Singh Sairam Sangita Roy, Avinash Kumar Singh. Analyzing SQL Meta Characters
and Preventing SQL Injection Attacks Using Meta Filter. In International Conference on
Information and Electronics Engineering, ICIEE 2011, IACSIT Press, volume 6, page 4,
Singapore, 2011. Indian Institute of Technology, Kalinga Institute of
Industrial Technology, IACSIT Press.
[15] IBM Global Technology Services. IBM Internet Security Systems X-Force 2008 Trend and
Risk Report. Website, January 2009.
http://www-935.ibm.com/services/us/iss/xforce/trendreports/xforce-2008-annual-report.pdf.
[16] Stephen Thomas and Laurie Williams. Using Automated Fix Generation to Secure
SQL Statements. In Proceedings of the Third International Workshop on Software
Engineering for Secure Systems, SESS '07, pages 9-15, Washington, DC, USA, 2007.
IEEE Computer Society.
[17] Gary Wassermann and Zhendong Su. An Analysis Framework for Security in Web
Applications. In Proceedings of the FSE Workshop on Specification and Verification
of Component-Based Systems (SAVCBS) 2004, pages 70-78, 2004.


Appendix A
UML Diagrams
A.1 Class Diagram

Figure A.1: Class Diagram

A.2 Sequence Diagram

Figure A.2: Sequence Diagram


Appendix B
Code Listing
The complete code listing is available on the attached disc. Following is the SQL code for
creation of the tables required for testing the application.

B.1 Create ATTACK_DATASET Table

USE [SQLInjectionTest]
GO
/****** Object: Table [dbo].[ATTACK_DATASET] ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[ATTACK_DATASET](
[AttackID] [int] IDENTITY(1,1) NOT NULL,
[Input1] [varchar](2000) NULL,
[Input2] [varchar](2000) NULL,
[AttackFlag] [varchar](1) NULL,
CONSTRAINT [PK_ATTACK_DATASET] PRIMARY KEY CLUSTERED
(
[AttackID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO

B.2 Create INJECTION_ATTACK_LOG Table

USE [SQLInjectionTest]
GO
/****** Object: Table [dbo].[INJECTION_ATTACK_LOG] ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[INJECTION_ATTACK_LOG](
[AttackID] [int] IDENTITY(1,1) NOT NULL,
[AttackQuery] [varchar](1000) NULL,
[AttackDate] [datetime] NULL,
[AttackDetectionOverhead] [numeric](18, 3) NULL,
CONSTRAINT [PK_INJECTION_ATTACK_LOG] PRIMARY KEY CLUSTERED
(
[AttackID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO

B.3 Source Code

Source Code for the class library and the test application is on the attached disc.


Appendix C
User Manual
C.1 Introduction

The proposed approach is designed as a class library using C#. It is compiled as a DLL
that can be used by programmers in their applications to detect and prevent SQL injection
attacks. Building it as a class library provides the flexibility of using it in Java applications,
too. Currently, this approach is able to block tautology attack, union-based attack,
piggy-backed attack and logically incorrect query attack. The project is packaged as a setup
project in the form of an executable file (RandXML.exe).

C.2 Installation

RandXML.exe allows the user to install the RandXML.dll file in the C:/Program Files/RandXML/
location on the application server. The installation also contains a configuration file which
should be modified by the programmer to include the connection string to the database
where the Injection Attack Log will be stored. The installation package also comes with
a SQL script file named Injection_Attack_Log.sql, which creates the table used for recording
attack occurrences in the application that the user (programmer) wants to protect from
injection attacks.

C.3 User Documentation

This package also contains a ReadMe.txt file, which contains step-by-step instructions for
using this DLL file. All the methods and properties from the class library are documented in
the code to help the user understand their purpose while using them in the application.
The RandXML API is documented in an MSDN-style reference manual generated using the
Sandcastle tool.


Appendix D
SQL Injection examples
D.1 Honda Parts Website

D.1.1 Query Criteria

This page allows the users to specify the search criteria for the part they want to purchase,
based on the make, model and other specifications of their vehicle.

Figure D.1: Honda Parts Query Criteria

D.1.2 Query Results

This page displays the results of the search performed by using the filter criteria entered
by the user. Typically, it should display the part that the user is looking for. However, this
page displays the SQL query that is built dynamically after entering the filter criteria, and
thus exposes the name of the table in the database.

Figure D.2: Honda Parts Query Results

D.2 Epicor 9

The following image shows the result of searching for the term "updatable dashboard" on
Epicor's online forum. The expected result was a collection of forum posts that
contained the keywords entered in the search filter. However, the search returned an error
message implying that it detected a SQL injection attack. This was because the injection
detection code maintains a list of forbidden keywords, and when it finds any SQL keyword
in the query, the program treats it as an injection attack. This is true for any search
that contains a SQL keyword.

Figure D.3: Epicor ERP Support Web Page
