
DATA LEAKAGE DETECTION Department of CSE

Chapter-1
INTRODUCTION
1.1 Project Description
In the course of doing business, sensitive data must sometimes be handed over to supposedly
trusted third parties. For example, a company may have partnerships with other companies
that require sharing customer data. Another enterprise may outsource its data processing, so
data must be given to various other companies. We call the owner of the data the distributor
and the supposedly trusted third parties the agents. Our goal is to detect when the distributor’s
sensitive data have been leaked by agents, and if possible to identify the agent that leaked the
data.

Perturbation is a very useful technique where the data are modified and made “less
sensitive” before being handed to agents. For example, one can replace exact values by
ranges, or one can add random noise to certain attributes. Traditionally, leakage detection is
handled by watermarking: a unique code is embedded in each distributed copy, and if that copy
is later discovered in the hands of an unauthorized party, the leaker can be identified. We
enunciate the need for watermarking database relations to deter their piracy, identify the
characteristics of relational data that pose new challenges for watermarking, and describe
desirable properties of a watermarking system for relational data. A watermark can be applied
to any database relation whose attributes tolerate changes in a few of their values without
affecting the applications that use them. However, watermarking modifies the original data,
and watermarks can sometimes be destroyed if the data recipient is malicious.

In this project, we study unobtrusive techniques for detecting leakage of a set of objects or
records. Specifically, we study the following scenario: after giving a set of objects to agents,
the distributor discovers some of those same objects in an unauthorized place. At this point,
the distributor can assess the likelihood that the leaked data came from one or more agents, as
opposed to having been independently gathered by other means. Using an analogy with cookies
stolen from a cookie jar, if we catch Freddie with a single cookie, he can argue that a friend
gave him the cookie. But if we catch Freddie with five cookies, it will be much harder for him
to argue that his hands were not in the cookie jar. If the distributor sees “enough evidence”
that an agent leaked data, he may stop doing business with him, or may initiate legal
proceedings. We develop a model for assessing the “guilt” of agents. We also present
algorithms for distributing objects to agents in a way that improves our chances of identifying
a leaker. Finally, we consider the option of adding “fake” objects to the distributed set.

MALLAREDDY ENGINEERING COLLEGE FOR WOMEN Page 1



1.2 Problem Statement:

The problem is to identify data leakage and to improve the probability of identifying the
agent responsible. Our goal is to detect when the distributor’s sensitive data have been leaked
by agents, and if possible to identify the agent that leaked the data.

1.3 Objectives:

• A data distributor has given sensitive data to a set of supposedly trusted agents (third
parties).
• Some of the data is leaked and found in an unauthorized place (e.g., on the web or
somebody’s laptop).
• The distributor must assess the likelihood that the leaked data came from one or more
agents, as opposed to having been independently gathered by other means.
• We propose data allocation strategies (across the agents) that improve the probability
of identifying leakages.
• These methods do not rely on alterations of the released data (e.g., watermarks). In
some cases we can also inject “realistic but fake” data records to further improve our
chances of detecting leakage and identifying the guilty party.
• Our goal is to detect when the distributor’s sensitive data has been leaked by agents,
and if possible to identify the agent that leaked the data.


Chapter-2
2.1 Literature Survey
2.1.1 Existing System:
In the existing system, we consider applications where the original sensitive data cannot be
perturbed. Perturbation is a very useful technique where the data is modified and made “less
sensitive” before being handed to agents. However, in some cases it is important not to alter
the distributor’s original data. Traditionally, leakage detection is handled by watermarking,
e.g., a unique code is embedded in each distributed copy. If that copy is later discovered in
the hands of an unauthorized party, the leaker can be identified. Watermarks can be very
useful in some cases, but they too involve some modification of the original data. Furthermore,
watermarks can sometimes be destroyed if the data recipient is malicious.

2.1.2 Proposed System:


In the proposed system, after giving a set of objects to agents, the distributor discovers some
of those same objects in an unauthorized place. At this point the distributor can assess the
likelihood that the leaked data came from one or more agents, as opposed to having been
independently gathered by other means. If the distributor sees “enough evidence” that an agent
leaked data, he may stop doing business with him, or may initiate legal proceedings. In this
project we develop a model for assessing the guilt of agents. We also present algorithms for
distributing objects to agents in a way that improves our chances of identifying a leaker.
Finally, we consider the option of adding fake objects to the distributed set. Such objects
do not correspond to real entities but appear realistic to the agents. If it turns out that an
agent was given one or more fake objects that were leaked, then the distributor can be more
confident that the agent was guilty.

2.2 Project Identification


2.2.1 About Data Leakage Detection:

A data distributor has given sensitive data to a set of supposedly trusted agents (third
parties). Some of the data is leaked and found in an unauthorized place (e.g., on the web or
somebody’s laptop). The distributor must assess the likelihood that the leaked data came from
one or more agents, as opposed to having been independently gathered by other means. We
propose data allocation strategies (across the agents) that improve the probability of
identifying leakages. These methods do not rely on alterations of the released data (e.g.,
watermarks). In some cases we can also inject realistic but fake data records to further
improve our chances of detecting leakage and identifying the guilty party.

2.2.2 Inference:

In a perfect world there would be no need to hand over sensitive data to agents that may
unknowingly or maliciously leak it. And even if we had to hand over sensitive data, in a
perfect world we could watermark each object so that we could trace its origins with absolute
certainty. However, in many cases we must indeed work with agents that may not be 100%
trusted, and we may not be certain if a leaked object came from an agent or from some other
source, since certain data cannot admit watermarks. In spite of these difficulties, we have
shown it is possible to assess the likelihood that an agent is responsible for a leak, based on
the overlap of his data with the leaked data and the data of other agents, and based on the
probability that objects can be guessed by other means. Our model is relatively simple, but
we believe it captures the essential trade-offs. The algorithms we have presented implement a
variety of data distribution strategies that can improve the distributor’s chances of identifying
a leaker. We have shown that distributing objects judiciously can make a significant
difference in identifying guilty agents, especially in cases where there is large overlap in the
data that agents must receive.


Chapter-3
METHODOLOGY
Problem Setup and Notation:
A distributor owns a set T = {t1, …, tm} of valuable data objects. The distributor wants to share
some of the objects with a set of agents U1, U2, …, Un, but does not wish the objects to be leaked
to other third parties. The objects in T could be of any type and size, e.g., they could be tuples
in a relation, or relations in a database. An agent Ui receives a subset of objects Ri of T,
determined either by a sample request or an explicit request:

1. Sample request: Ri = SAMPLE(T, mi). Any subset of mi objects from T can be given to Ui.

2. Explicit request: Ri = EXPLICIT(T, condi). Agent Ui receives all the objects in T that
satisfy the condition condi.
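
The two request types can be illustrated with a small Java sketch. The class name RequestSketch, the method names, and the sample records are hypothetical and chosen only for illustration; a sample request is served here by a random draw, which is one of several valid ways to pick the objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.function.Predicate;

class RequestSketch {
    // Sample request: any m objects from T are acceptable, so draw m of them at random.
    static <E> List<E> sample(List<E> t, int m, Random rnd) {
        if (m < 0 || m > t.size()) {
            throw new IllegalArgumentException("m must be between 0 and |T|");
        }
        List<E> copy = new ArrayList<>(t);
        Collections.shuffle(copy, rnd);
        return new ArrayList<>(copy.subList(0, m));
    }

    // Explicit request: the agent receives every object in T that satisfies the condition.
    static <E> List<E> explicit(List<E> t, Predicate<E> cond) {
        List<E> result = new ArrayList<>();
        for (E obj : t) {
            if (cond.test(obj)) {
                result.add(obj);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> t = Arrays.asList("CA-alice", "CA-bob", "NY-carol", "TX-dave");
        System.out.println(sample(t, 2, new Random()));            // two arbitrary objects
        System.out.println(explicit(t, s -> s.startsWith("CA")));  // [CA-alice, CA-bob]
    }
}
```

With a sample request the distributor is free to choose which objects to hand out, which is what leaves room for the allocation strategies discussed later; with an explicit request the set is fixed by the condition.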

Guilt Model Analysis:


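To check whether the parameters of our model interact in a way that matches our intuition, in
this section we study two simple scenarios: the impact of the probability p, and the impact of
the overlap between Ri and S. Here S is the set of leaked objects found at the target, Ri is the
set of objects given to agent Ui, and p is the probability that the target can guess an object
without the help of any agent. In each scenario we have a target that has obtained all the
distributor’s objects, i.e., T = S.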
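
This report does not state a guilt formula. As a sketch only, assume the estimate usually given for this kind of model: every agent that holds a leaked object is equally likely to be its source, objects leak independently, and the target guesses an object on its own with probability p. Under those assumptions, Pr{Gi | S} = 1 - product over t in (S ∩ Ri) of (1 - (1 - p) / |Vt|), where Vt is the set of agents that received t. The class and method names below are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class GuiltModelSketch {
    // Estimates the probability that `agent` is guilty given the leaked set S.
    // allocations maps each agent to the set Ri it received; p is the guessing probability.
    static double guiltProbability(Set<String> leaked,
                                   Map<String, Set<String>> allocations,
                                   String agent,
                                   double p) {
        Set<String> ri = allocations.getOrDefault(agent, Collections.<String>emptySet());
        double notGuilty = 1.0;
        for (String t : leaked) {
            if (!ri.contains(t)) {
                continue; // objects the agent never received say nothing about it
            }
            int holders = 0; // |Vt|: number of agents that received t (at least this agent)
            for (Set<String> r : allocations.values()) {
                if (r.contains(t)) {
                    holders++;
                }
            }
            notGuilty *= 1.0 - (1.0 - p) / holders;
        }
        return 1.0 - notGuilty;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> alloc = new HashMap<>();
        alloc.put("U1", new HashSet<>(Arrays.asList("t1", "t2")));
        alloc.put("U2", new HashSet<>(Arrays.asList("t1")));
        Set<String> leaked = new HashSet<>(Arrays.asList("t1", "t2"));
        System.out.println(guiltProbability(leaked, alloc, "U1", 0.5)); // 0.625
        System.out.println(guiltProbability(leaked, alloc, "U2", 0.5)); // 0.25
    }
}
```

In this example U1 holds both leaked objects, one of them exclusively, so its estimate (0.625) is higher than that of U2 (0.25), who shares its only object with U1. Raising p lowers both values, because more of the leak can be explained by guessing.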

Algorithms:

1. Evaluation of Explicit Data Request Algorithms


In the first place, the goal of these experiments was to see whether fake objects in the
distributed data sets yield significant improvement in our chances of detecting a guilty agent.
In the second place, we wanted to evaluate our e-optimal algorithm relative to a random
allocation.
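
The report does not spell out the e-optimal algorithm, so the following is only a sketch of how a random choice and a greedy choice of the agent that receives the next fake object might differ. The greedy rule used here is an assumption: it scores agent i by (1/|Ri| - 1/(|Ri| + 1)) multiplied by the total overlap of Ri with the other agents' sets, and picks the highest score. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

class FakeAllocationSketch {
    // Sum over all other agents j of |Ri ∩ Rj|.
    static int overlapWithOthers(List<Set<String>> r, int i) {
        int total = 0;
        for (int j = 0; j < r.size(); j++) {
            if (j == i) {
                continue;
            }
            Set<String> common = new HashSet<>(r.get(i));
            common.retainAll(r.get(j));
            total += common.size();
        }
        return total;
    }

    // Baseline: give the next fake object to a uniformly random agent.
    static int pickRandom(List<Set<String>> r, Random rnd) {
        return rnd.nextInt(r.size());
    }

    // Greedy rule (assumed): favour small sets that overlap heavily with other agents.
    // Returns -1 when every set is empty.
    static int pickGreedy(List<Set<String>> r) {
        int best = -1;
        double bestGain = -1.0;
        for (int i = 0; i < r.size(); i++) {
            int size = r.get(i).size();
            if (size == 0) {
                continue;
            }
            double gain = (1.0 / size - 1.0 / (size + 1)) * overlapWithOthers(r, i);
            if (gain > bestGain) {
                bestGain = gain;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Set<String>> r = new ArrayList<>();
        r.add(new HashSet<>(Arrays.asList("t1", "t2"))); // agent 0
        r.add(new HashSet<>(Arrays.asList("t1")));       // agent 1, fully shared with agent 0
        System.out.println(pickGreedy(r));               // 1
    }
}
```

Agent 1 is chosen because everything it holds is also held by agent 0, so a fake object unique to agent 1 helps the most in telling the two apart.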

2. Evaluation of Sample Data Request Algorithms


With sample data requests, agents are not interested in particular objects. Hence, object
sharing is not explicitly defined by their requests. The distributor is “forced” to allocate
certain objects to multiple agents only if the number of requested objects exceeds the number
of objects in set T. The more data objects the agents request in total, the more recipients an
object has on average; and the more objects are shared among different agents, the more
difficult it is to detect a guilty agent.
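
A minimal sketch of this effect, assuming a simple greedy heuristic that is not taken from the report: each agent in turn receives the objects that currently have the fewest recipients. Sharing then appears only once the total number of requested objects exceeds |T|. The class and method names are hypothetical, and objects are identified by their index in T.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SampleAllocationSketch {
    // requests[i] is mi, the number of objects agent i asks for.
    // Returns, for each agent, the indices of the objects it receives.
    static List<List<Integer>> allocate(int numObjects, int[] requests) {
        int[] recipients = new int[numObjects]; // how many agents hold each object so far
        List<List<Integer>> result = new ArrayList<>();
        for (int m : requests) {
            if (m < 0 || m > numObjects) {
                throw new IllegalArgumentException("each request must be between 0 and |T|");
            }
            List<Integer> order = new ArrayList<>();
            for (int k = 0; k < numObjects; k++) {
                order.add(k);
            }
            // Stable sort: least-shared objects first, ties kept in index order.
            order.sort(Comparator.comparingInt(k -> recipients[k]));
            List<Integer> given = new ArrayList<>(order.subList(0, m));
            for (int k : given) {
                recipients[k]++;
            }
            result.add(given);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(allocate(4, new int[] {2, 2}));    // [[0, 1], [2, 3]]
        System.out.println(allocate(4, new int[] {2, 2, 2})); // [[0, 1], [2, 3], [0, 1]]
    }
}
```

In the first call the four objects cover both requests without overlap; in the second call six objects are requested from a set of four, so the third agent necessarily shares objects with the first.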


Chapter-4
IMPLEMENTATION
4.1 SYSTEM ANALYSIS

4.1.1 SOFTWARE REQUIREMENT SPECIFICATION


The Software Requirement Specification (SRS) is the starting point of the software development
activity. As systems grew more complex, it became evident that the goal of the entire system
could not be easily comprehended. Hence the need for the requirement phase arose. A
software project is initiated by the client's needs. The SRS is the means of translating the
ideas in the minds of the clients (the input) into a formal document (the output of the
requirement phase).

The SRS phase consists of two basic activities:

1) Problem/Requirement Analysis:

This is the harder and more nebulous of the two activities. It deals with understanding the
problem, the goals, and the constraints.

2) Requirement Specification:

Here, the focus is on specifying what was found during analysis. Issues such as representation,
specification languages and tools, and checking of the specifications are addressed during this
activity. The requirement phase terminates with the production of the validated SRS
document. Producing the SRS document is the basic goal of this phase.

ROLE OF SRS:

The purpose of the software requirement specification is to reduce the communication gap
between the clients and the developers. The software requirement specification is the medium
through which the client and user needs are accurately specified. It forms the basis of
software development. A good SRS should satisfy all the parties involved in the system.


SCOPE

This document is the only one that describes the requirements of the system. It is
meant for use by the developers, and will also be the basis for validating the final
delivered system. Any changes made to the requirements in the future will have to go through
a formal change approval process. The developer is responsible for asking for clarifications
where necessary, and will not make any alterations without the permission of the client.

4.1.2 System Specification


System Requirements:

Hardware Requirements:
• System : Pentium IV 2.4 GHz
• Hard Disk : 40 GB
• Monitor : 15" VGA Colour
• Mouse : Logitech
• RAM : 512 MB (max)
Software Requirements:
• Operating System : Windows XP
• Coding Language : Java (Servlets, JSP)
• Database : MySQL

4.2 SOFTWARE DESIGN


In designing the software, the following principles are followed:

1. Modularity and partitioning: the software is designed so that each system consists of a
hierarchy of modules, each of which serves a separate function.

2. Coupling: modules should have little dependence on other modules of the system.

3. Cohesion: each module should carry out a single processing function.

4. Shared use: avoid duplication by allowing a single module to be called by any other module
that needs the function it provides.
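
As a rough illustration of the shared-use principle, the connection details that each servlet in Section 5.2 repeats could live in one helper that the servlets call. The class Db, its methods, and the environment variable names DB_USER and DB_PASSWORD are hypothetical and not part of the project code; only the host, port, and schema name are taken from the servlets.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical shared helper; class and method names are illustrative only.
final class Db {
    private static final String HOST = "localhost";
    private static final int PORT = 3306;
    private static final String SCHEMA = "data_base";

    private Db() {
        // utility class, not meant to be instantiated
    }

    // Builds the JDBC URL once instead of repeating the literal in every servlet.
    static String url() {
        return "jdbc:mysql://" + HOST + ":" + PORT + "/" + SCHEMA;
    }

    // Opens a connection. DB_USER and DB_PASSWORD are assumed environment variables.
    static Connection open() throws SQLException {
        String user = System.getenv().getOrDefault("DB_USER", "root");
        String password = System.getenv().getOrDefault("DB_PASSWORD", "");
        return DriverManager.getConnection(url(), user, password);
    }
}
```

A servlet would then write `try (Connection conn = Db.open()) { ... }`, so a change of host or credentials touches one module only, which also keeps coupling between the servlets low.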


Proposed Modules:

1. Data Allocation Module
2. Fake Object Module
3. Optimization Module
4. Data Distributor

4.3 MODULES:

1. Data Allocation Module:

The main focus of our project is the data allocation problem: how can the distributor
“intelligently” give data to agents in order to improve the chances of detecting a guilty agent?

2. Fake Object Module:

Fake objects are objects generated by the distributor in order to increase the chances of
detecting agents that leak data. The distributor may be able to add fake objects to the
distributed data in order to improve his effectiveness in detecting guilty agents. Our use of
fake objects is inspired by the use of “trace” records in mailing lists.
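
The idea of trace records can be sketched as follows: the distributor remembers which agent received which fake object, and any fake object that later shows up in a leaked set points back to its recipient. The class FakeObjectSketch, its methods, and the placeholder e-mail format are hypothetical; a real system would generate records that look realistic for its own data.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class FakeObjectSketch {
    private final Map<String, String> fakeToAgent = new HashMap<>();
    private int counter = 0;

    // Creates a fake record for one agent and remembers who received it.
    String createFakeFor(String agentId) {
        counter++;
        String fake = "trace" + counter + "@example.com"; // placeholder format
        fakeToAgent.put(fake, agentId);
        return fake;
    }

    // Returns the agents whose fake objects appear in the leaked set.
    Set<String> agentsImplicatedBy(Collection<String> leaked) {
        Set<String> agents = new TreeSet<>();
        for (String obj : leaked) {
            String agent = fakeToAgent.get(obj);
            if (agent != null) {
                agents.add(agent);
            }
        }
        return agents;
    }

    public static void main(String[] args) {
        FakeObjectSketch tracker = new FakeObjectSketch();
        tracker.createFakeFor("U1");
        String fakeForU2 = tracker.createFakeFor("U2");
        System.out.println(tracker.agentsImplicatedBy(
                Arrays.asList("real@example.org", fakeForU2))); // [U2]
    }
}
```

Because a fake object is given to exactly one agent and cannot be gathered independently, finding it in the leaked set is much stronger evidence than finding a real object that several agents share.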

3. Optimization Module:

In the Optimization Module, the distributor’s data allocation to agents has one constraint and
one objective. The distributor’s constraint is to satisfy agents’ requests, by providing them
with the number of objects they request or with all available objects that satisfy their
conditions. His objective is to be able to detect an agent who leaks any portion of his data.

4. Data Distributor:

A data distributor has given sensitive data to a set of supposedly trusted agents (third
parties). Some of the data is leaked and found in an unauthorized place (e.g., on the web or
somebody’s laptop). The distributor must assess the likelihood that the leaked data came from
one or more agents, as opposed to having been independently gathered by other means.


Chapter-5
SYSTEM DESIGN
5.1 UML Diagrams
Data Flow Diagram / Use Case Diagram / Flow Diagram
The DFD is also called a bubble chart. It is a simple graphical formalism that can be
used to represent a system in terms of the input data to the system, the various processing
carried out on these data, and the output data generated by the system.

5.1.1 Data Flow Diagram:

[Flow chart, reduced to its node labels. Start: Login, as Admin or Agent. Check: does the
account exist? (no: Create Account; yes: continue). Admin steps: Select Agent; View and update
agent details; Upload File to Agent; File details; File maintenance and secret key; File lock
with secret key. Download steps: File download with secret key; File locked / File unlocked;
if the secret key exists, the original file is sent, otherwise a duplicate file is sent (Data
Leaker). End.]
Fig: Flow chart on Data leakage detection
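
The download decision in the flow chart (original file when the secret key matches, duplicate file otherwise) could be sketched as follows. This is an illustration under assumptions, not the project's implementation: the class DownloadSketch, its method names, the 16-byte key length, and the file paths are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;

class DownloadSketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Generates a random secret key as 32 hexadecimal characters (16 random bytes).
    static String generateKey() {
        byte[] bytes = new byte[16];
        RANDOM.nextBytes(bytes);
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Returns the original file when the supplied key matches the stored one,
    // and the duplicate file otherwise.
    static String fileToSend(String storedKey, String suppliedKey,
                             String originalPath, String duplicatePath) {
        boolean matches = suppliedKey != null
                && MessageDigest.isEqual(
                        storedKey.getBytes(StandardCharsets.UTF_8),
                        suppliedKey.getBytes(StandardCharsets.UTF_8));
        return matches ? originalPath : duplicatePath;
    }

    public static void main(String[] args) {
        String key = generateKey();
        System.out.println(fileToSend(key, key, "original.txt", "duplicate.txt"));     // original.txt
        System.out.println(fileToSend(key, "wrong", "original.txt", "duplicate.txt")); // duplicate.txt
    }
}
```

MessageDigest.isEqual is used instead of String.equals so that the comparison time does not reveal how many leading characters of a guessed key were correct.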


5.1.2 Use Case Diagram:

[Use case diagram, reduced to its labels. Actors: Admin, Agent, Data Leaker. Use cases: Create
an Account; Login; Upload files to Agent; Generate Secret Key; Download Files; Lock/Unlock.]

Fig: Use case diagram on Data leakage detection


5.1.3 Class Diagram:

Upload Files
Attributes: FileID, FileName, AgentID, FileType, Filepath, UploadDate
Operations: SenttoAgent(), ViewFileDetails()

Agent Account
Attributes: AgentName, AgentID, AgentPassword, EmailID
Operations: CreateAccount(), GenerateKey()

Lock/UnLock
Attributes: FileID, FilePassword, ReTypePassword, SecretKey
Operations: Lock(), UnLock()

Edit Account
Attributes: AgentName, EmailID, OldPassword, NewPassword, ReType NewPassword
Operations: Update()

Fig: Class diagram on Data leakage detection


5.1.4 Sequence Diagram:

[Sequence diagram, reduced to its labels. Participants: Agent, Admin, DataBase, Data Leaker.
Messages: Create an Account; Upload Files; Store Files; Lock/UnLock Files; View Agent Account;
View Files; Download Files; Send Required File (if the secret key matches); Send Duplicate File
(if the secret key does not match).]

Fig: Sequence diagram for data leakage detection


5.1.5 Activity Diagram:

[Activity diagram, reduced to its labels. Login; Check whether the account exists (no: Create
Account; yes: continue); Upload files; File Maintenance; Lock/Unlock Files; File download;
Check the secret key (exists: Download original File; not exists: Receive Duplicate files, Data
leaker).]

Fig: Activity diagram on Data leakage detection


5.2 Project Code


Login

import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;

// Checks the submitted name and password against the registration table.
public class Login extends HttpServlet {
    private static final long serialVersionUID = 1L;

    protected void doPost(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        String ss1 = request.getParameter("Name");
        String ss2 = request.getParameter("Password");
        HttpSession hs = request.getSession(true);
        String s = "select * from registration where Name=? and Password=?";
        try {
            Class.forName("com.mysql.jdbc.Driver");
            // try-with-resources closes the connection, statement and result set
            try (Connection conn = DriverManager.getConnection(
                         "jdbc:mysql://localhost:3306/data_base", "root", "sony2603");
                 PreparedStatement ps = conn.prepareStatement(s)) {
                ps.setString(1, ss1);
                ps.setString(2, ss2);
                try (ResultSet rs = ps.executeQuery()) {
                    if (rs.next()) {
                        // Keep the name and password in the session; Change reads them later
                        hs.setAttribute("ss2", ss1);
                        hs.setAttribute("s0", ss2);
                        System.out.println("Login successful");
                        response.sendRedirect("welcome.jsp");
                    } else {
                        System.out.println("data not entered");
                    }
                }
            }
        } catch (Exception e) {
            System.out.println("database error");
            e.printStackTrace();
        }
    }
}

Register

import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Inserts a new row into the registration table, binding ID, Name, Password
// and Email in that order.
@WebServlet("/Register")
public class Register extends HttpServlet {
    private static final long serialVersionUID = 1L;

    protected void doPost(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        String ss1 = request.getParameter("ID");
        String ss2 = request.getParameter("Name");
        String ss3 = request.getParameter("Password");
        String ss4 = request.getParameter("Email");
        String s = "insert into registration values(?,?,?,?)";
        try {
            Class.forName("com.mysql.jdbc.Driver");
            try (Connection conn = DriverManager.getConnection(
                         "jdbc:mysql://localhost:3306/data_base", "root", "sony2603");
                 PreparedStatement ps = conn.prepareStatement(s)) {
                ps.setString(1, ss1);
                ps.setString(2, ss2);
                ps.setString(3, ss3);
                ps.setString(4, ss4);
                int i = ps.executeUpdate();
                if (i > 0) {
                    // Registration succeeded, send the user to the login page
                    response.sendRedirect("login.html");
                } else {
                    System.out.println("data not entered successfully");
                }
            }
        } catch (Exception e) {
            System.out.println("Database error");
            e.printStackTrace();
        }
    }
}

Change

import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;

// Changes the password of the logged-in user after checking the old password
// and the confirmation of the new one.
@WebServlet("/Change")
public class Change extends HttpServlet {
    private static final long serialVersionUID = 1L;

    protected void doPost(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        HttpSession hs = request.getSession(true);
        String s1 = request.getParameter("o1");      // old password
        String s2 = request.getParameter("n1");      // new password
        String s3 = request.getParameter("c1");      // confirmation of the new password
        String s4 = (String) hs.getAttribute("ss2"); // name stored by Login
        String s5 = (String) hs.getAttribute("s0");  // password stored by Login

        // Null checks avoid a NullPointerException when a form field is missing
        if (s1 != null && s1.equals(s5) && s2 != null && s2.equals(s3)) {
            String q = "update registration set Password=? where Name=?";
            try {
                Class.forName("com.mysql.jdbc.Driver");
                try (Connection conn = DriverManager.getConnection(
                             "jdbc:mysql://localhost:3306/data_base", "root", "sony2603");
                     PreparedStatement ps = conn.prepareStatement(q)) {
                    ps.setString(1, s2);
                    ps.setString(2, s4);
                    int i = ps.executeUpdate();
                    if (i > 0) {
                        // Keep the session in step with the new password
                        hs.setAttribute("s0", s2);
                        System.out.println("change password");
                    } else {
                        System.out.println("not change password");
                    }
                }
            } catch (Exception e) {
                // The original swallowed the exception; report it instead
                e.printStackTrace();
            }
        }
    }
}


Delete

import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Deletes the registration row whose Name matches the "abc" request parameter.
@WebServlet("/Delete")
public class Delete extends HttpServlet {
    private static final long serialVersionUID = 1L;

    protected void doPost(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        String b = request.getParameter("abc");
        String s = "Delete from registration where (Name=?)";
        try {
            Class.forName("com.mysql.jdbc.Driver");
            try (Connection conn = DriverManager.getConnection(
                         "jdbc:mysql://localhost:3306/data_base", "root", "sony2603");
                 PreparedStatement ps = conn.prepareStatement(s)) {
                ps.setString(1, b);
                int i = ps.executeUpdate();
                if (i > 0) {
                    System.out.println("delete successful");
                } else {
                    System.out.println("delete not success");
                }
            }
        } catch (Exception e) {
            System.out.println("Database error");
            e.printStackTrace();
        }
    }
}

Update

import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Updates the ID and Email of the registration row with the given Name.
@WebServlet("/Update")
public class Update extends HttpServlet {
    private static final long serialVersionUID = 1L;

    protected void doPost(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        String v1 = request.getParameter("id");
        String v2 = request.getParameter("eid");
        String v3 = request.getParameter("name");
        String q = "update registration set ID=?,Email=? where Name=?";
        try {
            Class.forName("com.mysql.jdbc.Driver");
            try (Connection conn = DriverManager.getConnection(
                         "jdbc:mysql://localhost:3306/data_base", "root", "sony2603");
                 PreparedStatement ps = conn.prepareStatement(q)) {
                ps.setString(1, v1);
                ps.setString(2, v2);
                ps.setString(3, v3);
                int i = ps.executeUpdate();
                if (i > 0) {
                    System.out.println("update success");
                } else {
                    System.out.println("not updated");
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}


5.3 Testing:
Sl No. | Test Scenario | Test Steps | Test Data | Expected Results | Status
1 | Check admin login with valid data | 1. Go to login page 2. Enter user id 3. Enter password | Userid=abc, Password=abc | Logged in successfully | Pass
2 | Check admin login with valid data | Same steps as test 1 | Userid=navya, Password=navy | Please enter valid details | Fail
  |  |  | Userid=navya, Password=navya | Correct details | Pass
3 | Check distributor login with valid data | 1. Go to admin login 2. Enter admin name 3. Enter admin password | Admin name=admin, Password=admin | Logged in successfully | Pass
4 | Check distributor login with valid data | Same steps as test 3 | Admin name=admin, Password=Administrator | Please enter valid details | Pass
5 | Check issue | 1. Go to login 2. After successful login go to admin details | Userid=checha, check correct data | Go to file sent to distributor | Pass
6 | Check issue | Same steps as test 5 | Userid=checha, go to details | Required key is applied | Pass
7 | Check credentials | 1. Go to distributor 2. Enter key | Data leaked files | Valid Date format | Pass


Chapter-6
OUTPUT SCREENSHOTS
[Screenshots are not reproduced here; only their captions survive.]

Distributor Login
Distributor Home Page
Distributor Send file
View Sent files
View Leak Files
Agent Home
View Files Sent By Distributor
View Key
Files Sent By Agent
Send File To Agent
Edit Account Details
User Registration

Chapter-7
CONCLUSION
In a perfect world there would be no need to hand over sensitive data to agents that
may unknowingly or maliciously leak it. And even if we had to hand over sensitive data, in a
perfect world we could watermark each object so that we could trace its origins with absolute
certainty. However, in many cases we must indeed work with agents that may not be 100%
trusted, and we may not be certain if a leaked object came from an agent or from some other
source, since certain data cannot admit watermarks. In spite of these difficulties, we have
shown it is possible to assess the likelihood that an agent is responsible for a leak, based on
the overlap of his data with the leaked data and the data of other agents, and based on the
probability that objects can be “guessed” by other means. Our model is relatively simple, but
we believe it captures the essential trade-offs. The algorithms we have presented implement a
variety of data distribution strategies that can improve the distributor’s chances of identifying
a leaker. We have shown that distributing objects judiciously can make a significant
difference in identifying guilty agents, especially in cases where there is large overlap in the
data that agents must receive. Our future work includes the investigation of agent guilt
models that capture leakage scenarios that are not studied in this paper. For example, what is
the appropriate model for cases where agents can collude and identify fake tuples? A
preliminary discussion of such a model is available in Another open problem is the extension
of our allocation strategies so that they can handle agent requests in an online fashion (the
presented strategies assume that there is a fixed set of agents with requests known in
advance).


Chapter-8
BIBLIOGRAPHY

Good teachers are worth more than a thousand books; we have them in our Department.

References:

1. User Interfaces in C#: Windows Forms and Custom Controls by Matthew MacDonald.
2. Applied Microsoft® .NET Framework Programming (Pro-Developer) by Jeffrey Richter.
3. Practical .Net2 and C#2: Harness the Platform, the Language, and the Framework by
Patrick Smacchia.
4. Data Communications and Networking, by Behrouz A Forouzan.
5. Computer Networking: A Top-Down Approach, by James F. Kurose.
6. Operating System Concepts, by Abraham Silberschatz.
7. R. Agrawal and J. Kiernan. Watermarking relational databases. In VLDB ’02: Proceedings
of the 28th international conference on Very Large Data Bases, pages 155–166. VLDB
Endowment, 2002.
8. P. Bonatti, S. D. C. di Vimercati, and P. Samarati. An algebra for composing access control
policies. ACM Trans. Inf. Syst. Secur., 5(1):1–35, 2002.
9. P. Buneman, S. Khanna, and W. C. Tan. Why and where: A characterization of data
provenance. In J. V. den Bussche and V. Vianu, editors, Database Theory - ICDT 2001, 8th
International Conference, London, UK, January 4-6, 2001, Proceedings, volume 1973 of
Lecture Notes in Computer Science, pages 316–330. Springer, 2001
10. P. Buneman and W.-C. Tan. Provenance in databases. In SIGMOD ’07: Proceedings of the
2007 ACM SIGMOD international conference on Management of data, pages 1171–1173,
New York, NY, USA, 2007. ACM.
11. Y. Cui and J. Widom. Lineage tracing for general data warehouse transformations. In The
VLDB Journal, pages 471–480, 2001.
12. S. Czerwinski, R. Fromm, and T. Hodes. Digital music distribution and audio
watermarking.
13. F. Guo, J. Wang, Z. Zhang, X. Ye, and D. Li. An Improved Algorithm to Watermark
Numeric Relational Data. In Information Security Applications, pages 138–149. Springer,
Berlin / Heidelberg, 2006.


Sites Referred:

http://www.sourcefordgde.com

http://www.networkcomputing.com/

http://www.ieee.org

http://www.almaden.ibm.com/software/quest/Resources/

http://www.computer.org/publications/dlib

http://www.ceur-ws.org/Vol-90/

http://www.microsoft.com/isapi/redir.dll?prd=ie&pver=6&ar=msnhome
