Journal of Network and Computer Applications 75 (2016) 259-278
journal homepage: www.elsevier.com/locate/jnca
Copy-move forgery detection: Survey, challenges and future directions

Nor Bakiah Abd Warif a, Ainuddin Wahid Abdul Wahab a, Mohd Yamani Idna Idris a,
Roziana Ramli a, Rosli Salleh a, Shahaboddin Shamshirband a,
Kim-Kwang Raymond Choo b,c,*
a Department of Computer System and Technology, Faculty of Computer Science & Information Technology, University of Malaya, 50603 Kuala Lumpur, Malaysia
b Department of Information Systems and Cyber Security, University of Texas at San Antonio, San Antonio, TX 78249-0631, USA
c School of Information Technology & Mathematical Sciences, University of South Australia, Adelaide, SA 5001, Australia
Article info
Article history:
Received 8 March 2016
Received in revised form 14 June 2016
Accepted 13 September 2016
Available online 13 September 2016

Keywords:
Copy-move forgery
Image forgery
Blind detection
Copied region
Image forensics

Abstract
The authenticity and reliability of digital images are increasingly important due to the ease of modifying such images. Thus, the capability to identify image manipulation is a current research focus, and a key domain in digital image authentication is copy-move forgery detection (CMFD). Copy-move forgery is the process of copying and pasting from one region to another location within the same image. In this paper, we survey the recent developments in CMFD, and describe the entire CMFD process involved. Specifically, we characterize the common CMFD workflow of feature extraction and matching process using block-based or keypoint-based approaches. In addition to listing the datasets and validations used in the literature, we also categorize the types of copied regions. Finally, we outline a number of future research directions.
© 2016 Elsevier Ltd. All rights reserved.
Contents
1. Introduction
2. Image forgery detection
3. Copy-move forgery detection
3.1. Workflow
4. Block-based approach
4.1. Block-based feature extraction techniques
4.1.1. Frequency transform
4.1.2. Texture and intensity
4.1.3. Moments invariant
4.1.4. Log polar transform
4.1.5. Dimension reduction
4.1.6. Others
4.2. Block-based matching techniques
4.2.1. Sorting
4.2.2. Hash
4.2.3. Correlation
4.2.4. Euclidean distance
4.2.5. Others
5. Keypoint-based approach
5.1. Keypoint-based feature extraction techniques
5.1.1. SIFT
5.1.2. Harris corner detector
5.1.3. SURF
5.2. Keypoint-based matching techniques
5.2.1. Nearest neighbor
5.2.2. Clustering
6. Publicly available datasets
7. Types of copied regions
7.1. Background
7.2. Object
7.3. Creature
7.4. Letter
8. Discussion
8.1. Data inconsistencies and high scalability
8.2. Limitations in existing computer architecture
8.3. Potential of big data
9. Conclusion
Acknowledgments
References

* Corresponding author at: Department of Information Systems and Cyber Security, University of Texas at San Antonio, San Antonio, TX 78249-0631, USA.
E-mail addresses: nurbaqiyah@siswa.um.edu.my (N.B.A. Warif), ainuddin@um.edu.my (A.W.A. Wahab), yamani@um.edu.my (M.Y.I. Idris), roziana.ramli@gmail.com (R. Ramli), rosli_salleh@um.edu.my (R. Salleh), shamshirband@um.edu.my (S. Shamshirband), raymond.choo@fulbrightmail.org (K.-K. Choo).
http://dx.doi.org/10.1016/j.jnca.2016.09.008
1084-8045/© 2016 Elsevier Ltd. All rights reserved.
1. Introduction

In recent years, digital image tampering has become easier due to the availability of photo editing software, free or paid. For example, such software has made it easier to duplicate and manipulate an image's content without (significantly) degrading its quality or leaving any clues visible to an untrained eye (depending on the skills of the user, the software used, etc.). In addition, images that are widely shared over social media on the internet can be easily altered to misrepresent their meaning with malicious intent.
Digital image tampering or manipulation has also been detected in academic papers. For example, in the survey conducted by Tijdink et al. (2014), 15% of the respondents admitted to having engaged in scientific misconduct such as fabricating, falsifying, plagiarizing, or manipulating data in the past three years. Another study reported that approximately 20% of accepted manuscripts in the Journal of Cell Biology contain inappropriate figure manipulations and at least 1% of them have fraudulent manipulations (Farid, 2006). Consequently, the credibility of the research outcomes can be challenged and, in some cases, result in allegations of scientific misconduct. For example, a professor at Missouri University retracted his publication entitled "CDX2 gene expression and trophectoderm lineage specification in mouse embryos", published in the Feb. 17, 2006, issue of Science. A subsequent investigation revealed that one of the images was manipulated, and the researcher was found guilty of intentionally manipulating the image of the embryo. These issues have resulted in a renewed interest in image forensics research to authenticate images and identify image manipulation.
Of the image manipulation techniques in the literature, copy-move forgery and copy-move forgery detection (CMFD) are the most widely studied. Copy-move forgery is the manipulation of an image's content by copying and pasting from one region to another location within the same image. We located a total of 84 scientific papers on the topic of CMFD indexed by Web of Science published between 2007 and 2014 (see Fig. 1).
Currently, there are four published surveys on CMFD techniques (see Table 1). The review of Christlein et al. (2012) discussed the performance of popular feature extraction techniques in CMFD. The performance of the feature extraction techniques was then evaluated using their own dataset. Lin et al. (2013) categorized the matching techniques in CMFD into brute force and block-based. Meanwhile, Al-Qershi and Khoo (2013) categorized existing feature extraction techniques and discussed their advantages and limitations.

Each of these review articles discussed a single aspect of CMFD, either feature extraction or matching techniques, and therefore covered only one part of the CMFD process. Furthermore, there is no discussion specific to matching techniques. Therefore, in this paper, the entire CMFD process is reviewed, detailing both feature extraction techniques and their related matching techniques. We also list the CMFD datasets and validations, categorize the copied regions, and analyze the possible domains related to CMFD.
The remainder of the paper is structured as follows. In Section 2, we give an overview of image forgery detection and introduce copy-move forgery. Section 3 explains CMFD and the common workflow of CMFD techniques. These techniques are further divided into block-based and keypoint-based approaches in Sections 4 and 5, respectively. Next, Section 6 presents the datasets and validations involved in CMFD, and the types of copied regions are explained in Section 7. Discussion and future directions are presented in Section 8. Finally, Section 9 concludes the paper.
2. Image forgery detection

Image forgery detection techniques can be broadly categorized into active and passive approaches, according to the presence of additional information. The active approach is based on additional information embedded in the digital image for tampering detection, such as digital watermarks and digital signatures. Such information can be used to assess the originality of an image. However, the active approach requires additional information to be embedded in the image during the capturing process or at a later stage by authorized personnel. If information about the original image is unknown (e.g. images on the internet), then the active approach is impossible or ineffective.

Fig. 1. Scientific papers located by searching for "copy-move forgery detection" on Web of Science (number of journal and conference publications indexed in WOS per year, 2007 to 2014).

Table 1
List of review articles on CMFD techniques.

Author(s): Christlein et al. (2012)
  Contributions:
    Evaluated performance of popular feature extraction techniques in CMFD for various post-processing scenarios.
  Findings:
    Keypoint-based features (SIFT and SURF) can be computed very efficiently with low computational load. However, they are sensitive to low-contrast regions and repetitive image content.
    Five features (DCT, DWT, KPCA, PCA and Zernike) outperformed keypoint-based features with high performance. Of these techniques, the authors recommended Zernike due to its relatively small memory footprint.

Author(s): Lin et al. (2013)
  Contributions:
    Categorized the matching techniques in CMFD into brute force and block-based.
    The block-based category was further classified into spatial domain, transform domain, and post-processing invariant methods.
  Findings:
    The DCT and PCA block-based techniques exhibit a high computational complexity.
    The DCT is inapplicable when considering highly textured and small tampered regions.
    Generally, most of the techniques are not robust to geometric transformations, such as rotation and scaling.

Author(s): Birajdar and Mankar (2013)
  Contributions:
    Reviewed various image forgery detection methods with an emphasis on passive techniques, and developed a generalized structure for them.
  Findings:
    CMFD techniques were found to be computationally expensive and to have a high false positive rate.

Author(s): Al-Qershi and Khoo (2013)
  Contributions:
    Categorized the feature extraction techniques in CMFD into eight groups (DCT, Log-Polar Transform, Texture & Intensity, Invariant keypoint, Invariant moment based, PCA, SVD and Others).
  Findings:
    The complexity and execution time of CMFD could be reduced when smaller feature vectors are employed.
    The robustness of CMFD increased by adopting feature extraction techniques that are invariant to a wider range of attacks such as scaling and rotation.
    Most of the existing CMFD techniques are time consuming.
On the other hand, the passive approach is capable of detecting image manipulation without additional information. The passive approach detects the manipulation by extracting intrinsic features within the image, and is based on tampering detection and source device identification. Tampering detection can be further categorized into dependent and independent forgery. The former (i.e. dependent forgery) is the action of copying and pasting image regions either within the same image (copy-move) or from another image (splicing). Other digital manipulation or general tampering, such as compression, resampling and inconsistencies, is categorized as independent forgery. In contrast, source device identification is a process to determine the origin device of the digital image based on optical and sensor regularities. An overview of the image forgery detection categories is depicted in Fig. 2.
3. Copy-move forgery detection

Copy-move forgery is a form of tampering addressed by passive forgery detection, wherein one or more regions have been copied and pasted within the same image. Typical motivations of such forgery include hiding an element in the image (e.g. steganography) or emphasizing a particular object (e.g. a crowd of demonstrators). Copy-move forgery is easy to perform and can be relatively effective in image manipulation, particularly because both source and target regions are from the same image, so properties such as color temperature, illumination conditions and noise will generally be well-matched between the tampered region and the rest of the image.
Fig. 2. Existing image forgery detection techniques. Active: digital watermark, digital signature. Passive: tampering, comprising dependent forgery (copy-move, splicing) and independent forgery (compression, re-sampling, inconsistencies); and source device identification (optical regularities, sensor regularities).
Fig. 3. An example of copy-move forgery (a) original image (b) forged image. The grass is used to manipulate the image with the intention of hiding the house.
Therefore, it can be undetectable to the naked eye. In copy-move forgery, the commonly manipulated areas in the image are found to be grass, foliage or fabric (Fridrich et al., 2003). These areas are easy to blend with the background due to similarities in texture and color. An example of copy-move forgery using grass as the manipulated area is shown in Fig. 3.
Combining copy-move forgery with attacks can further reduce the chance of the manipulated regions being detected. The attack operations can be divided into intermediate (also known as geometric transform) and post-processing attacks (see Table 2). In Section 3.1, the workflow of the CMFD techniques is explained.
Fig. 4. Common workflow of CMFD techniques.
3.1. Workflow
In CMFD, the common workflow consists of four stages, namely: pre-processing, feature extraction, matching and visualization (see Fig. 4). Each stage is discussed as follows.
The first stage of the CMFD process is typically pre-processing, which is optional. In pre-processing, one seeks to improve the image data by suppressing undesired distortions or enhancing the image features (Miljković, 2009). The conversion of RGB (Red, Green, and Blue) color channels to grayscale appears to be the most frequently used pre-processing method (see Christlein et al., 2010; Amerini et al., 2011; Ardizzone et al., 2010, 2009; Cao et al., 2012; Huang et al., 2008, 2011; Li et al., 2012; Li and Yu, 2010; Lynch et al., 2013; Muhammad et al., 2012; Myna et al., 2008; Peng et al., 2011; Ryu et al., 2013; Wang et al., 2012; Yang et al., 2013; Yang and Huang, 2009; Zhang et al., 2008; Zhao and Guo, 2013). In the conversion, the RGB channels are merged using I = 0.299R + 0.587G + 0.114B to represent the grayscale component. Alternatively, RGB channels can be converted to the YCbCr color system to operate either on the luma (Y) information or on the chrominance components (Cb and Cr) (see Hussain et al., 2014, 2013a, 2013b, 2012; Muhammad et al., 2013; Wu et al., 2010).
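The two color conversions described above can be sketched as follows. This is a minimal illustration, assuming the standard ITU-R BT.601 luma weights (0.299, 0.587, 0.114) and a full-range, JPEG-style YCbCr definition; the function names and the choice of YCbCr variant are assumptions made for this sketch and are not taken from the surveyed papers.

```python
import numpy as np

# BT.601 luma weights (assumed; the weights sum to 1).
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])


def rgb_to_gray(rgb):
    """Weighted sum of the R, G and B channels of an H x W x 3 array."""
    return rgb.astype(np.float64) @ LUMA_WEIGHTS


def rgb_to_ycbcr(rgb):
    """Full-range YCbCr for 8-bit RGB input (JPEG-style definition, assumed)."""
    rgb = rgb.astype(np.float64)
    y = rgb @ LUMA_WEIGHTS
    # Chroma channels are scaled colour differences centred on 128.
    cb = 128.0 + 0.5 / (1.0 - 0.114) * (rgb[..., 2] - y)
    cr = 128.0 + 0.5 / (1.0 - 0.299) * (rgb[..., 0] - y)
    return np.stack([y, cb, cr], axis=-1)
```

A CMFD pipeline would then run feature extraction on the single grayscale (or Y) plane, or on the Cb and Cr planes, rather than on all three RGB channels.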
The color conversions are performed to reduce the dimensionality of the data and increase the distinctive visual features in an image. Indirectly, the complexity of processing can be reduced and the speed of processing increased. Aside from the color conversion method, block division has been used as part of pre-processing in CMFD. Block division is a method that divides the image into a number of blocks using either an overlapping or a non-overlapping approach. Compared to exhaustive search, block division can reduce the computational time of the matching process that finds similar feature vectors in an image.

Table 2
Types of attack that have been classified.

Attacks: Intermediate/geometric transform
  Example operations: Rotating, scaling, mirror reflection, translation
  Descriptions: Provide a spatial synchronization and homogeneity between the copied region and its neighbors

Attacks: Post-processing
  Example operations: JPEG compression, blurring, Gaussian noise
  Descriptions: Eliminate any visible hints of the copy-move operation such as sharp edges
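As an illustration of block division, the sketch below cuts a grayscale image into square blocks with NumPy. The function name, the default block size of 8 and the `step` parameter are hypothetical choices for this example: `step=1` gives fully overlapping blocks, while `step=block_size` gives non-overlapping ones.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def divide_into_blocks(gray, block_size=8, step=1):
    """Split a 2-D image into square blocks.

    Returns (blocks, positions): blocks has shape (N, block_size, block_size)
    and positions holds the (row, col) of each block's top-left corner.
    """
    # All block_size x block_size windows, then keep every `step`-th one.
    windows = sliding_window_view(gray, (block_size, block_size))
    windows = windows[::step, ::step]
    rows, cols = windows.shape[:2]
    ys, xs = np.meshgrid(np.arange(rows) * step,
                         np.arange(cols) * step, indexing="ij")
    positions = np.stack([ys.ravel(), xs.ravel()], axis=1)
    blocks = windows.reshape(rows * cols, block_size, block_size)
    return blocks, positions
```

For an M x N image, the overlapping variant yields (M - b + 1)(N - b + 1) blocks of size b, which is why the size of the per-block feature vector matters so much for the cost of the later matching stage.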
After pre-processing, feature extraction allows one to select relevant information that represents the characteristics of interest in the image (Choraś, 2007). Common methods of feature extraction reported in the literature are Discrete Cosine Transform (DCT), Discrete Wavelet Transform (DWT), log polar transform, invariant keypoint, and texture and intensity.
Feature extraction is followed by the matching stage, which seeks out similarities between two or more features in the image. In this stage, manipulations of copy-move forgery in the image are determined. Matching is mainly performed in a block-based or keypoint-based manner, depending on the extracted features. For example, DCT features are matched by blocks, while invariant keypoint features are matched by the distance to the nearest neighbor among all points in the feature space.
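The nearest-neighbor idea can be sketched with a brute-force search over feature vectors. This is only an illustration under stated assumptions: the function name, the two thresholds (`max_feature_dist` and `min_offset`) and the rule of discarding spatially adjacent pairs are hypothetical choices for this example, not the procedure of any specific surveyed method.

```python
import numpy as np


def nearest_neighbor_matches(features, positions,
                             max_feature_dist=0.1, min_offset=8.0):
    """Pair each feature with its nearest neighbour in feature space.

    features:  (N, D) array of descriptors (block or keypoint features).
    positions: (N, 2) array of image coordinates for each descriptor.
    Returns a sorted list of index pairs (i, j) with i < j.
    """
    features = np.asarray(features, dtype=np.float64)
    positions = np.asarray(positions, dtype=np.float64)
    # Pairwise Euclidean distances between all feature vectors.
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Exclude self-matches and pairs that lie too close together in the image.
    offset = np.linalg.norm(positions[:, None, :] - positions[None, :, :],
                            axis=-1)
    dist[offset < min_offset] = np.inf
    nearest = dist.argmin(axis=1)
    pairs = set()
    for i, j in enumerate(nearest):
        if dist[i, j] <= max_feature_dist:
            pairs.add((min(i, int(j)), max(i, int(j))))
    return sorted(pairs)
```

The full distance matrix costs O(N^2) memory and time, so this form is only practical for a modest number of features; it is meant to show the decision rule, not an efficient search.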
Finally, the result of CMFD can be visualized to display and localize the tampered regions in the forged image. In the block-based approach, the visualization is usually presented by coloring or mapping the regions of the matching blocks. In the keypoint-based approach, on the other hand, it is commonly displayed by drawing lines between each pair of matching points. Both visualizations can be further refined by morphological operations using the shape properties of the features, such as contours, skeletons and convex hulls (see Amerini et al., 2013; Cao et al., 2012; Jaberi et al., 2013a, 2013b; Li et al., 2012; Pan and Lyu, 2010; Peng et al., 2011; Yang and Huang, 2009; Zhang et al., 2008; Zhao and Guo, 2013).
In the next two sections, CMFD techniques are organized into
two approaches, namely: block-based and keypoint-based.
4. Block-based approach
The block-based approach splits an image into square or circular blocks for analysis during the pre-processing stage. These blocks can either overlap or not overlap with each other. Then, features are extracted from these blocks and compared against each other to determine the similarity between blocks within the image. Once matched blocks are detected, they represent the copy-move forgery performed in the image, as illustrated in Fig. 5.
In CMFD, the block-based approach is the most popular approach adopted by researchers in recent years, perhaps due to its compatibility with various feature extraction techniques and increased matching performance. For example, as depicted in Fig. 6(a), 2013 recorded the highest number of scientific papers indexed in WOS that adopted this approach, with 14 papers.
4.1. Block-based feature extraction techniques
Generally, the feature extraction techniques for the block-based approach are in the form of frequency transform, texture and intensity, moments invariant, log polar transform, dimension reduction and others (see Fig. 6(b) and Table 3). The details of the feature extraction techniques are discussed as follows.
4.1.1. Frequency transform

Frequency transform is the most popular feature extraction technique for the block-based approach, perhaps due to its robustness to noise and the separability of the rotational and translational components (Lucchese and Cortelazzo, 2000). Several enhancements, based on Discrete Cosine Transform (DCT), Fourier Transform, Fast Walsh-Hadamard Transform (FWHT), Discrete Wavelet Transform (DWT), Dyadic Wavelet Transform (DyWT) and Wiener Filter Wavelet, have been proposed to further improve the performance (see Table 4).
Of the transform functions, DCT is one of the most widely used in CMFD. DCT is known for its robustness against noise addition and JPEG compression. Overall, the enhancements of frequency transform functions focus on reducing the feature dimensions, which leads to low computational complexity in later analysis. Their CMFD performance is robust for images with post-processing operations but ineffective for images with intermediate operations. However, the studies conducted by Muhammad et al. (2012) and Shao et al. (2012) show a different outcome, as their features are invariant to rotation.
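To make the idea of a reduced-dimension DCT feature concrete, the sketch below computes the 2-D DCT of a square block and keeps only its low-frequency corner. It is a hedged illustration: the orthonormal DCT-II is built directly with NumPy, and the `keep` parameter and the square truncation pattern are assumptions for this example rather than the scheme of any particular surveyed paper (several of which use other truncation or quantization rules).

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]   # frequency index
    x = np.arange(n)[None, :]   # sample index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)  # DC row has its own normalisation
    return c


def dct_block_feature(block, keep=4):
    """2-D DCT of a square block, truncated to keep x keep low frequencies."""
    c = dct_matrix(block.shape[0])
    coeffs = c @ block.astype(np.float64) @ c.T
    return coeffs[:keep, :keep].ravel()
```

With 8 x 8 blocks and `keep=4`, each block is described by 16 coefficients instead of 64 pixel values, which shortens every later comparison between blocks.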
4.1.2. Texture and intensity
Texture and intensity exist in natural scenes such as grass, cloud, tree, and ground, and image properties such as smoothness, coarseness and regularity represent the texture contents. Therefore, texture and intensity can be utilized as features to locate the similarities in the forged image. In CMFD, texture and intensity are measured and characterized through intensity, pattern or color information as shown in Fig. 7.

Fig. 5. The CMFD process in block-based approach.

Fig. 6. Publications indexed by WOS for block-based approach between 2007 and 2014: (a) literature of block-based approach by year; (b) breakdown of the block-based approach into six different types of feature extraction techniques (frequency transform, texture and intensity, moments invariant, log polar transform, dimension reduction, others); (c) breakdown of the block-based approach into five different types of matching techniques (sorting, hash, Euclidean distance, correlation, others).

Table 3
Publications of block-based approach according to feature extraction techniques.

Feature extraction technique: Frequency transform
  Author(s): Cao et al. (2012), Deng et al. (2012), Huang et al. (2011), Ketenci and Ulutas (2013), Kumar et al. (2013), Li et al. (2008), Li et al. (2012), Muhammad et al. (2012), Murali et al. (2012), Myna et al. (2008), Peng et al. (2011), Shao et al. (2012), Shin (2013), Yang et al. (2013), Zhang et al. (2008) and Zhao and Guo (2013)

Feature extraction technique: Texture and intensity
  Author(s): Ardizzone et al. (2009), Bravo-Solorio and Nandi (2011), Davarzani et al. (2013), Gan and Zhong (2014), Hsu and Wang (2012), Kuznetsov Andrey Vladimirovich (2014), Lin et al. (2009), Lynch et al. (2013), Singh and Raman (2012), Ulutas and Ulutas (2013) and Ulutaş et al. (2013)

Feature extraction technique: Moments invariant
  Author(s): Bilgehan and Ulutaş (2013), Kashyap and Joshi (2013), Le and Xu (2013), Mahdian and Saic (2007) and Ryu et al. (2013, 2010)

Feature extraction technique: Log polar transform
  Author(s): Bayram et al. (2009), Li et al. (2014, 2012), Li and Yu (2010), Li (2013) and Wu et al. (2010)

Feature extraction technique: Dimension reduction
  Author(s): Ting and Rang-Ding (2009), Yang and Huang (2009) and Zhao (2010)

Feature extraction technique: Others
  Author(s): Liu et al. (2014) and Wang et al. (2012)
The average intensity information can be described in nine dimensions representing the value, ratio and their differences for each block (Lin et al., 2009). The analysis using average intensity information on a GPU is reportedly 12 times faster than its optimized CPU variant (Singh and Raman, 2012). Furthermore, it is sufficiently robust to fixed angle rotation, JPEG compression and noise. Another feature that represents texture and intensity in CMFD is pattern information extracted from Gabor features (Hsu and Wang, 2012) and Multi-resolution Local Binary Pattern (MLBP) (Davarzani et al., 2013). This pattern information is known for its robustness to geometric distortion. Additionally, MLBP presents an extra advantage of robustness to illumination variations.
In CMFD, color information is widely utilized to characterize texture and intensity features. Typically, RGB, illumination, spatial color and gray values are the basic components representing the color information. These components are extracted through color space, color quantification and similarity measurement. The color information is invariant with respect to scaling, translation and rotation (Bravo-Solorio and Nandi, 2011; Kodituwakku and Selvarajah, 2004). In addition, a combination of the average gray value and Tamura texture brings further benefits, such as robustness to Gaussian noise and JPEG compression with low time complexity (Gan and Zhong, 2014).
The average gray value of all pixels in the block can also be used as a dominant feature for the matching process (Lynch et al., 2013). Using this feature improves time performance and robustness against Gaussian blurring and JPEG compression. Similarly, the Color Coherence Vector of the grayscale image in each block, which describes the spatial color information, has been shown to be robust to Gaussian blurring (Ulutas and Ulutas, 2013). Meanwhile, Ardizzone et al. (2009) introduced bit plane analysis to classify grayscale texture in the image content. However, bit plane analysis is weak at detecting forgery in JPEG images, because the modification of intensity values by JPEG compression is not consistent.
4.1.3. Moments invariant
Moments invariant is a set of features that are invariant to translation, rotation and scale. These features can be used to classify shapes and recognize objects in a binary image. Since their first introduction to the pattern recognition community by Hu (1962), various improvements have been proposed based on sequences of orthogonal polynomials and probability distributions. Improvements such as the central moment, Krawtchouk's moment, Zernike moment, and exponential moment have been proposed to overcome various problems associated with the regular moments. The regular moments are known to be computationally expensive due to information redundancy, to be location dependent, and to represent global rather than local features.
Moments invariant was initially employed in copy-move forgery detection by Mahdian and Saic (2007) using the blur invariant moment. The blur moment, which is represented as a function of central moments, is resilient to blur degradation, additive noise and arbitrary contrast changes. However, extracting this feature from a large image will increase the computational complexity. This complexity can be reduced with a combination of blur moment and DWT (Kashyap and Joshi, 2013). Similarly, Krawtchouk's moment is robust to post-processing operations, particularly the Gaussian blurring operation (Bilgehan and Ulutaş, 2013), with the additional capability to detect forgery of regular or irregular shaped regions.
Zernike moment is more robust to rotation than
the moment introduced by Hu (1962); however, it is weak
against scaling and other tampering based on affine transformation (Ryu et al., 2013, 2010). Furthermore, the exponential moment
improves on the performance of Zernike moments due to its simple
function and its invariance to noise and smooth distortion
(Hu et al., 2014). Alternatively, according to Le and Xu (2013), the
exponential moments can be combined with the histogram-invariant
moment (derived from central moments), resulting in increased
robustness against translation, scaling, rotation, brightness and
contrast change, with improved processing time. Despite
these benefits, mixed moments struggle to detect small
tampered regions, as a single threshold setting does not generalize
across different images.
In summary, moments invariant are global features and inherently location dependent. Therefore, they are not suited for
recognizing objects, and some means must be adopted to ensure
location invariance.
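As a minimal sketch of why moment-based features tolerate translation and rotation, the snippet below computes the first Hu invariant, phi1 = eta20 + eta02, from normalized central moments of a toy binary shape. The function names (`hu_phi1`, `rotate90`, `shift`) and the test shape are illustrative assumptions and are not taken from any of the surveyed methods.

```python
def hu_phi1(img):
    """First Hu invariant (eta20 + eta02) of a 2-D list of intensities."""
    m00 = sum(sum(row) for row in img)
    cx = sum(x * v for row in img for x, v in enumerate(row)) / m00
    cy = sum(y * v for y, row in enumerate(img) for v in row) / m00
    mu20 = sum((x - cx) ** 2 * v for row in img for x, v in enumerate(row))
    mu02 = sum((y - cy) ** 2 * v for y, row in enumerate(img) for v in row)
    # eta_pq = mu_pq / m00 ** (1 + (p + q) / 2); for p + q = 2 the exponent is 2.
    return (mu20 + mu02) / m00 ** 2


def rotate90(img):
    """Rotate a 2-D list by 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def shift(img, dy, dx):
    """Translate the content by (dy, dx), padding with zeros (no wrap-around)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if 0 <= y + dy < h and 0 <= x + dx < w:
                out[y + dy][x + dx] = img[y][x]
    return out


# Toy binary "L" shape on an 8 x 8 canvas (hypothetical test data).
shape = [[0] * 8 for _ in range(8)]
for y in range(1, 5):
    shape[y][1] = 1
shape[4][2] = shape[4][3] = 1

base = hu_phi1(shape)
moved = hu_phi1(shift(shape, 2, 3))
turned = hu_phi1(rotate90(shape))
```

Under these assumptions, `base`, `moved` and `turned` agree up to floating-point error, which is the property that makes such moments usable as block features.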
4.1.4. Log polar transform
Log polar transform is a feature extraction technique that is
invariant to rotation, scaling and translation. The technique works
by mapping points on the Cartesian plane
(x, y) to points in the log-polar plane (ξ, η). One of the early log polar
transformations implemented in CMFD is the Fourier Mellin Transform (FMT) (Bayram et al., 2009). FMT resamples the Fourier
transform magnitude into a log polar mapping. However, the technique is limited to rotations of up to 10°. This limitation can be mitigated by combining the FMT with a vector erosion filter (Li and
Yu, 2010) or Log Polar Fourier Transform (LPFT) (Wu et al., 2010).
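The log-polar mapping can be illustrated with a short sketch. It assumes the usual definition, in which the first coordinate is the logarithm of the radius and the second is the angle; under that assumption, scaling and rotation about the origin become simple shifts of the two coordinates. The helper names `to_log_polar` and `transform` are hypothetical.

```python
import math


def to_log_polar(x, y):
    """Map Cartesian (x, y) to log-polar coordinates (log r, angle)."""
    return math.log(math.hypot(x, y)), math.atan2(y, x)


def transform(x, y, scale, angle):
    """Scale and rotate a point about the origin."""
    c, s = math.cos(angle), math.sin(angle)
    return scale * (c * x - s * y), scale * (s * x + c * y)


xi0, eta0 = to_log_polar(3.0, 4.0)
xi1, eta1 = to_log_polar(*transform(3.0, 4.0, scale=2.0, angle=math.radians(10)))

# Scaling by 2 shifts the first coordinate by log(2);
# rotating by 10 degrees shifts the second coordinate by the same angle.
d_xi, d_eta = xi1 - xi0, eta1 - eta0
```

Because both distortions reduce to translations in the log-polar plane, a translation-invariant representation such as the Fourier magnitude can then absorb them.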
Further improvement of log polar transform has been proposed
using the Polar Harmonic Transform (PHT) properties. Uniquely,
the PHT technique analyzes the blocks in circular shape instead of
the conventional square blocks in other CMFD techniques (Li et al.,
2012). Other transforms that are harmonic in nature are Polar
Cosine Transform (PCT) and Polar Sine Transform (PST). Generally,
these techniques are robust to post-processing operations such as
AWGN, JPEG compression and Gaussian blurring. Specifically, the
PCT is robust against noise with low computational time compared
to the Zernike moment (Li, 2013), and PST has the best invariance
Table 4
Six variations of the frequency transform in CMFD and their enhancements.
Author (s)
Details
Advantages
Limitations
DCT
Li et al. (2008)
Proposed mismatch information using DCT grid and block artifacts

grid as a clue of copy- move forgery.
Improved the DCT coefficients by truncating the higher frequency
of the coefficients. The truncation is performed by reserving a part
of the vector components after the DCT coefficients have been reshaped
to a row vector in zigzag order.
Represents the DCT with a circle block instead of square block.
High computational complexity.

Only tested with JPEG compression.
High sensitivity in detecting copy-move tampering when the
duplicated regions are not too small.
Huang et al. (2011)
Cao et al. (2012)
Zhao and Guo

(2013)
Applied Singular Value Decomposition (SVD) to the blocks after

the DCT quantization process.
Fourier Transform Shao et al. (2012)
The Fourier transform of the polar expansion are calculated on the

overlapping windows pair. This followed by an adaptive band
limitation procedure to obtain a correlation matrix in which the
peak is effectively enhanced.
Utilized fast Walsh-Hadamard Transform (FWHT) due to simpler
features that used addition and subtraction operations compared
to DCT.
Used low frequency sub bands from DWT.
Robust to JPEG compression.

Effective for copy-move and image splicing.
Reduced features dimension.
Robust to JPEG compression, blurring and
additive white Gaussian noise (AWGN)
distortion.
Detect multiple copy-move forgery in an
image.
Robust to blurring and noise addition
Low computational complexity.
Each block represented by a singular value
(low dimension).
Detect multiple copy-move forgery in an
image.
Robust to Gaussian blurring, AWGN, JPEG
compression and their mixed operations.
Only tested with the post-processing operation.
Only tested with the post-processing operation.
High sensitivity to texture features.

Efficiently estimates the rotation angle.
Unable to detect copy-move with scaling.
High accuracy and increase speed in CMFD.
Weak performance if the image has undergone the attack of

transforming.
Low computational complexity.
Speed relies on the location of copy-move. If the copy-move is

located between two blocks, detection process must be repeated
into smaller blocks to localize the copy-move region.
Only tested with JPEG compression and rotation.
FWHT
Yang et al. (2013)
DWT
Zhang et al. (2008)
DyWT
Muhammad et al.
(2012)
Performed a comparison between the approximate (LL) sub band
and detail (HH) sub band from the DyWT techniques.
Robust to rotation and JPEG compression.
DyWT is shift invariant, unlike DWT.
Wiener Filter
Peng et al. (2011)
Implemented the Wiener Filter in wavelet-based image de-noising to extract four features as follows:
variance of the pattern noise
signal noise ratio between the de-noised image and the pattern noise
information entropy
average energy gradient of the original grayscale image
Robust to JPEG compression, scaling, rotation, adding noise, and blurring.
Incapable of self-adaptively adjusting the threshold.
Technique
Fig. 7. Texture and intensity characterized by intensity, pattern and color.
to geometric distortions in the PHT group (Li et al., 2014).
4.1.5. Dimension reduction
Dimension reduction techniques are commonly used with domain features to reduce the dimensionality of the image features and lower the computational complexity. These techniques include Singular Value Decomposition (SVD) and Locally Linear Embedding (LLE). SVD is
generally stable and achieves scaling and rotation invariance in both its
algebraic and geometric properties. SVD reduces computational
complexity and is robust to various operations, particularly rotation, scaling, Gaussian noise and filtering (Ting and Rang-Ding,
2009). However, SVD loses image details, which results in
low performance under JPEG compression.
Alternatively, LLE can be implemented to reduce dimensionality
in high-dimensional datasets (Zhao, 2010). LLE finds the topological
relationship among nonlinear data and maps high-dimensional
data to low-dimensional data without changing the relative locations. In comparison with Principal Component Analysis (PCA)
(Popescu and Farid, 2004), LLE has the capability to find the fused
edge that hides the traces in a forged image, but PCA recorded a
faster processing time. Of the two techniques (SVD and LLE), SVD has a
higher overall performance in terms of robustness to various operations
and computational complexity.
4.1.6. Others
Other feature extraction techniques found in the
literature include Multi Scale Auto-convolution (MSA) (Wang et al.,
2012). MSA is determined by a vector sequence and is invariant to
affine transforms. This technique is robust against rotation, Gaussian noise and JPEG compression, but not against the scaling
operation. Another technique uses features generated from JPEG
block artificial grids and local noise discrepancies (Liu et al., 2014).
These features are combined with the image quality score as a
coefficient. This technique is effective in detecting copy-move and
splicing forgery, regardless of the JPEG compression ratio of the
input image.
4.2. Block-based matching techniques
Matching is the process of finding similarities between
two or more features in the image. It is performed after
each feature in the image is measured and extracted, in order to define the
manipulated area. From the literature, the matching techniques for
block-based approaches can be divided into sorting, hash, correlation, Euclidean distance, and others, as summarized in Table 5. Meanwhile,
Fig. 6(c) presents the breakdown of each type of matching
technique published between 2007 and 2014.
4.2.1. Sorting
Sorting is a technique that orders the features in a certain arrangement. It is commonly employed in the matching
process of block-based approaches. It reduces the computational
cost of searching for identical values in a large
image. Hence, an efficient sorting technique is important to
quickly find the duplicated area by improving the search and
merge algorithms.
The sorting techniques used in the matching process for block-based features include Lexicographical, KD-Tree, and Radix (see
Table 6). Among these, lexicographical sorting is the most
widely employed in block-based approaches. The lexicographical
technique commonly detects a potentially tampered region through
adjacent identical pairs of blocks. However, the implementation of lexicographical sorting varies between authors, for example in the calculation of the distance between adjacent blocks and the number of
thresholds used to define the tampered area.
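The idea of lexicographical matching can be sketched as follows. This is a simplified illustration, not any surveyed author's implementation: it uses raw pixel values of overlapping blocks as features and only reports exact duplicates, whereas practical methods sort robust features and apply distance thresholds. The function name `duplicate_block_pairs`, the block size and the `min_shift` parameter are assumptions made for the example.

```python
import random


def duplicate_block_pairs(img, block=4, min_shift=4):
    """Return position pairs of identical blocks found via lexicographic sorting."""
    h, w = len(img), len(img[0])
    rows = []
    for y in range(h - block + 1):
        for x in range(w - block + 1):
            feat = tuple(img[y + j][x + i]
                         for j in range(block) for i in range(block))
            rows.append((feat, (y, x)))
    rows.sort()  # Python tuples compare lexicographically
    pairs = []
    # After sorting, identical feature vectors are adjacent.
    for (f1, p1), (f2, p2) in zip(rows, rows[1:]):
        dy, dx = p2[0] - p1[0], p2[1] - p1[1]
        if f1 == f2 and dy * dy + dx * dx >= min_shift ** 2:
            pairs.append((p1, p2))
    return pairs


# Synthetic 24 x 24 image with an 8 x 8 region copied from (2, 1) to (14, 12).
rng = random.Random(0)
img = [[rng.randrange(256) for _ in range(24)] for _ in range(24)]
for j in range(8):
    for i in range(8):
        img[14 + j][12 + i] = img[2 + j][1 + i]

pairs = duplicate_block_pairs(img)
shifts = {(p2[0] - p1[0], p2[1] - p1[1]) for p1, p2 in pairs}
```

In this toy case every matched pair shares the same shift vector, which is the cue that lexicographical methods typically accumulate to decide whether a region has been duplicated.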
The accuracy of lexicographical techniques can also be improved using a kd-tree (Christlein et al., 2010). The latter is a nearest
neighbor searching technique that sorts an array of blocks. First,
the technique recursively splits the array into two parts along different dimensions. When the size of the array is smaller than or equal
to the neighborhood search size, the iteration terminates. Finally, the neighborhood is analyzed and compared with
a threshold to define the possible duplicated area. This technique
supports efficient range queries in multi-dimensional
data for the analysis of block similarity (Mahdian and Saic, 2007).
In contrast to the lexicographical technique, radix sorts the values by their integer digits. It is faster and has better complexity than lexicographical sorting (Lin et al., 2009).

Table 5
Summary of matching techniques for block-based approach.
Matching technique | Author(s)
Sorting: Lexicographical | Ardizzone et al. (2009), Bilgehan and Uluta (2013), Bravo-Solorio and Nandi (2011), Cao et al. (2012), Davarzani et al. (2013), Gan and Zhong (2014), Huang et al. (2011), Ketenci and Ulutas (2013), Kumar et al. (2013), Le and Xu (2013), Li et al. (2012, 2014), Ryu et al. (2010), Ulutas and Ulutas (2013), Uluta et al. (2013), Wang et al. (2012), Yang et al. (2013) and Zhao and Guo (2013)
Sorting: KD Tree | Mahdian and Saic (2007), Ting and Rang-Ding (2009) and Christlein et al. (2010)
Sorting: Radix | Lin et al. (2009) and Singh and Raman (2012)
Sorting: Others | Lynch et al. (2013)
Hash | Bayram et al. (2009), Kuznetsov Andrey Vladimirovich (2014), Li and Yu (2010), Li (2013) and Ryu et al. (2013)
Correlation | Myna et al. (2008), Peng et al. (2011), Shao et al. (2012) and Zhang et al. (2008)
Euclidean distance | Kashyap and Joshi (2013) and Muhammad et al. (2012)
Others | Akbarpour Sekeh et al. (2013), Hsu and Wang (2012), Li et al. (2008), Li et al. (2012), Liu et al. (2014), Murali et al. (2012), Shin (2013), Wu et al. (2010) and Zhao (2010)

Table 6
Definition of sorting techniques.
Sorting technique | Definition
Lexicographical | A generalization based on the alphabetical order of the feature values.
KD-Tree | A data structure technique that performs nearest neighbor searching by using the properties of a tree. The objective is to eliminate large portions of the search space in a short time.
Radix | A non-comparative sorting technique for integers. The data with integer keys are sorted by the individual digits that share the same significant position and value.
Other than the abovementioned sorting techniques, Lynch et al.
(2013) proposed block sorting based on dominant features, which
allows direct comparison between two blocks. This technique
differs from the kd-tree technique, which only allows indirect
block comparison based on the blocks' features.
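For comparison with the lexicographical approach, a least-significant-digit radix sort over integer block features can be sketched as below. This is a generic textbook variant offered as an illustration; the surveyed papers may use a different formulation, and `block_means` is hypothetical data standing in for quantized block features.

```python
def radix_sort(keys, base=10):
    """LSD radix sort for non-negative integers (non-comparative)."""
    keys = list(keys)
    if not keys:
        return keys
    place = 1
    while max(keys) // place > 0:
        buckets = [[] for _ in range(base)]
        for k in keys:
            # Group keys by the digit at the current significant position.
            buckets[(k // place) % base].append(k)
        # Concatenating the buckets in order keeps the pass stable.
        keys = [k for bucket in buckets for k in bucket]
        place *= base
    return keys


block_means = [170, 45, 75, 90, 802, 24, 2, 66, 45]
sorted_means = radix_sort(block_means)
```

Each pass is linear in the number of keys, so the total cost depends on the number of digits rather than on pairwise comparisons.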
4.2.2. Hash
Hash is commonly utilized to ensure that any modification to
the data can be detected. Counting Bloom Filters (CBF) is a probabilistic data structure that applies hash functions to a set of
elements and stores the resulting counts in an array. Identical features have the same hash
value and therefore increment the same array element, whereas features with different hash values increment different elements. Any
element with a value of two or higher is expected to correspond to duplicated
pairs in CMFD (Bayram et al., 2009). However, the number of
features is restricted by the size of memory. Hence, CBF is modified
by assigning the same hash value for every different feature.
Consequently, the memory requirement is reduced and
the ability to process large images is increased (Li and Yu, 2010).
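A counting Bloom filter can be sketched in a few lines. The class below is a generic illustration rather than the structure used by Bayram et al. (2009) or Li and Yu (2010); the array size, the number of hash functions and the use of salted SHA-256 are all assumptions made for the example.

```python
import hashlib


class CountingBloomFilter:
    """Minimal counting Bloom filter over byte-string features."""

    def __init__(self, size=1024, hashes=3):
        self.size = size
        self.hashes = hashes
        self.counts = [0] * size

    def _slots(self, item):
        # Derive several array positions from salted SHA-256 digests.
        for seed in range(self.hashes):
            digest = hashlib.sha256(bytes([seed]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for slot in self._slots(item):
            self.counts[slot] += 1

    def count(self, item):
        # The minimum counter is an upper bound on the true insertion count.
        return min(self.counts[slot] for slot in self._slots(item))


# Hypothetical block features; the first and third blocks are identical.
blocks = [bytes([1, 2, 3, 4]), bytes([9, 9, 9, 9]),
          bytes([1, 2, 3, 4]), bytes([5, 6, 7, 8])]
cbf = CountingBloomFilter()
for b in blocks:
    cbf.add(b)

suspects = {b for b in blocks if cbf.count(b) >= 2}
```

The filter never misses a true duplicate, but hash collisions can produce false positives, so candidates flagged this way would normally be verified by a direct block comparison.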
Locality-Sensitive Hashing (LSH) applies hash functions for
duplication detection. It searches for the approximate nearest neighbor by hashing the feature vectors and selecting those with identical
hash values. Since approximate nearest neighbor search is
faster than exact nearest neighbor search, LSH is used to improve the
processing time in CMFD. For a large image, the size of the hash
values should be small to reduce the search time over all blocks
(Ryu et al., 2013). LSH is more robust to post-processing operations
than lexicographical sorting (Li, 2013).
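The bucketing step of LSH can be illustrated with a random-hyperplane hash, one common LSH family. Whether the surveyed CMFD papers use this particular family is not stated in the text, so the scheme, the function name `make_hasher` and the toy feature vectors below are assumptions.

```python
import random
from collections import defaultdict


def make_hasher(dim, bits=8, seed=0):
    """Build a sign-of-random-projection hash for vectors of length dim."""
    rng = random.Random(seed)
    planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(bits)]

    def key(vec):
        # One bit per hyperplane: which side of the plane the vector lies on.
        return tuple(sum(p * v for p, v in zip(plane, vec)) >= 0
                     for plane in planes)

    return key


# Hypothetical block features; "A_copy" duplicates "A".
features = {
    "A": [0.9, 0.1, 0.3, 0.7],
    "B": [0.2, 0.8, 0.5, 0.1],
    "A_copy": [0.9, 0.1, 0.3, 0.7],
}

key = make_hasher(dim=4)
buckets = defaultdict(list)
for name, vec in features.items():
    buckets[key(vec)].append(name)

# Only features that share a bucket need to be compared in detail.
candidates = [names for names in buckets.values() if len(names) > 1]
```

Identical vectors always land in the same bucket, while similar vectors do so with high probability; using fewer bits gives larger buckets and more candidates to verify.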
4.2.3. Correlation
Correlation is a statistical measure of how two or more variables change together. The correlation coefficient is
usually used to define the duplicated regions after sorting is executed (Gan and Zhong, 2014; Peng et al., 2011; Wang et al., 2012).
However, correlation can also be performed independently, without
sorting, to find the similarity criterion in the image. The most
commonly deployed correlation technique in CMFD is phase correlation. Normally, phase correlation is used for template
matching between two similar images. The similarity is represented by a
significant peak whose value ranges between 0 and 1. Phase
correlation was later adopted to find matches within a single image
(Shao et al., 2012). A region is identified as potentially tampered
if the value of the correlation peak exceeds a predefined
threshold during scanning of the image.
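Phase correlation can be demonstrated on a one-dimensional signal with a naive discrete Fourier transform. The sketch below is a simplified illustration under the assumption of a circular shift; practical CMFD methods work on two-dimensional windows with a fast Fourier transform. The function names and the sample signal are hypothetical.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (or its inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
               for t in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out


def phase_correlation(a, b):
    """Return (shift, peak) such that b is approximately a shifted by `shift`."""
    fa, fb = dft(a), dft(b)
    cross = [vb * va.conjugate() for va, vb in zip(fa, fb)]
    # Keep only the phase of the cross-power spectrum.
    norm = [c / abs(c) if abs(c) > 1e-12 else 0 for c in cross]
    corr = [v.real for v in dft(norm, inverse=True)]
    peak = max(corr)
    return corr.index(peak), peak


signal = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3]
shifted = signal[-5:] + signal[:-5]  # circular shift to the right by 5 samples
shift, peak = phase_correlation(signal, shifted)
```

For an exact circular shift the correlation surface is close to a single impulse, so the peak height approaches 1 and its position gives the displacement; a detector would compare `peak` against a predefined threshold.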
4.2.4. Euclidean distance
Euclidean distance is a measurement of the distance between two
vectors in Euclidean space. Similar to correlation, Euclidean distance is often used to finalize the manipulated area after the sorting
process (Le and Xu, 2013; Li et al., 2014, 2012; Ryu et al., 2010). It
calculates the distance between similar blocks identified by the
sorting technique to detect the duplication in an image. Muhammad et al. (2012) calculate the distance
between identical blocks and eliminate the sorting process. An
image is suspected of having been tampered with if two blocks are near
to each other with a similar neighborhood (Kashyap and Joshi,
2013).
4.2.5. Others
Other matching techniques include DCT coefficients and clustering. In CMFD, DCT coefficients are commonly utilized in feature
extraction. However, the sum of differences between DCT coefficients can also be used as a matching criterion to localize the tampered
area (Shin, 2013). A block is considered tampered if the difference value is equal to 0.0. As a result, the technique significantly
reduces the computational complexity and feature dimension.
Additionally, a coarse-to-fine approach is applied by using sequential block clustering to enhance the duplicated region detection model (Akbarpour Sekeh et al., 2013). The search space in
block matching is minimized through the clustering technique. In
short, both techniques (DCT coefficients and clustering) significantly improve the time complexity because they eliminate the block-comparing operations.
5. Keypoint-based approach
Keypoint-based approaches are not block-based, as the block
division is eliminated in pre-processing (see Fig. 8). Keypoint techniques extract distinctive local features such as corners,
blobs, and edges from the image. Each feature is represented by a
set of descriptors produced within a region around the feature.
The descriptors help to increase the robustness of the features to
affine transformation. Then, both features and descriptors in
the image are classified and matched to each other to find the
duplicated regions in the copy-move forgery.
Fig. 8. CMFD process in keypoint-based approach.
We located 16 scientific papers on keypoint-based approaches
published and indexed in WOS, eight of which were
published in 2013 (see Fig. 9(a)). These studies propose various
improvements relating to feature points, descriptors and matching
techniques. Based on the publication trend, studies of the keypoint-based approach are expected to increase in the coming years.
5.1. Keypoint-based feature extraction techniques
The feature extraction techniques of the keypoint-based approach
can be divided into three types, namely: Scale Invariant Feature
Transform (SIFT), Harris Corner Detector and Speeded Up Robust
Features (SURF). The breakdown of the types by year is shown in
Fig. 9(b) and related work is listed in Table 7.
5.1.1. SIFT
The most popular keypoint feature technique in CMFD is the SIFT-based
technique. SIFT was first introduced to the object recognition
community by Lowe (1999) and designed to be robust against
scale and rotation. SIFT detects salient points at different scales
from the Difference of Gaussian (DoG) pyramid in scale-space representation. The DoG is used to improve the computational speed
during the extraction process in an image (Juan and Gwun, 2009).
Subsequently, the SIFT descriptor is built from the gradient orientation histogram at each SIFT point to be rotation invariant.
The SIFT technique has been adopted in CMFD due to its high
stability under both intermediate and post-processing operations
(Ardizzone et al., 2010). Nevertheless, four limitations of SIFT in
CMFD are identified and presented in Table 8.
Firstly, SIFT has a high computational complexity due to the
high number of feature vectors obtained from the image. Consequently, the matching procedure becomes computationally expensive, if not infeasible, particularly for a high resolution image.
Therefore, PCA is applied to the feature vectors to reduce computational complexity during matching (He et al., 2013).
Secondly, SIFT is unable to detect duplicated regions in flat
areas or areas with little visual structure due to the lack of reliable
points. This limitation can be minimized by combining SIFT with
Zernike moments (Mohamadian and Pouyan, 2013). However, this
combination increases the processing time, as both techniques
need to be applied to the image.
Thirdly, SIFT features are unable to define a shape or a single
patch due to their non-uniform distribution. Amerini et al.
(2013) attempted to improve their earlier technique
(Amerini et al., 2011) by adapting the J-Linkage algorithm after the
matching process. The majority of the SIFT-based techniques use
mathematical morphology operations to connect the boundaries
of features in the final stage of detection (see Amerini et al., 2013;
Jaberi et al., 2013a, 2013b; Pan and Lyu, 2010).
Fig. 9. Publications on keypoint-based approach located on WOS, by (a) year,
(b) feature extraction techniques (SIFT, SURF, Harris), and (c) matching techniques (Nearest Neighbor - Best Bin First, Nearest Neighbor - 2NN, Nearest Neighbor - g2NN, other nearest neighbor, clustering).

Finally, SIFT is incapable of differentiating between regions that
are intentionally inserted and those that are naturally similar. Therefore, the image
is segmented into semantically independent patches by assuming
that points located close to each other are naturally similar (Li
et al., 2014).

Table 7
Publications on feature extraction techniques for keypoint-based approach.
Feature extraction technique | Author(s)
SIFT | Amerini et al. (2013, 2011), Anand et al. (2014), Ardizzone et al. (2010), Farukh et al. (2014), Huang et al. (2008), Jaberi et al. (2013a, 2013b), Li et al. (2014), Mohamadian and Pouyan (2013) and Shen et al. (2013)
Harris corner detector | Chen et al. (2013), Guo et al. (2013), Kakar and Sudha (2012), Yu et al. (2014), Zhao and Zhao (2013) and Zheng and Chang (2014)
SURF | Bo et al. (2010) and Mishra et al. (2013)

Table 8
List of SIFT drawbacks.
No. | Drawback
1. | High computational complexity.
2. | Cannot detect duplicated regions in flat areas.
3. | Difficult to identify a shape region.
4. | Cannot differentiate an intentionally inserted copied region from a naturally similar region.
5.1.2. Harris corner detector
Keypoint techniques were first introduced with the Harris Corner
Detector (Harris and Stephens, 1988), which was later followed by the SIFT technique.
The detector extracts corners and edges from the regions based on
the local auto-correlation function. Harris
features have been shown to be consistent in natural imagery.
In CMFD, the Harris detector has been studied and explored to
improve SIFT-based techniques (see Fig. 9(b)). As the Harris detector only produces feature points, compatible descriptor techniques are combined with the features. Moreover, the
Harris features are enhanced to increase the points' reliability in
detecting the forgery. A summary of studies focusing on Harris
features is listed in Table 9. Such techniques are generally found to
be robust to rotation, scale, JPEG compression, noise and blurring.
5.1.3. SURF
SURF was initially proposed by Bay and Ess (2008) to
improve the performance of SIFT. The SURF features reduce the
processing time and also the feature dimension. A SURF-based technique in CMFD was presented by Bo et al. (2010), who extended
the dimension of Bay's technique to 128. They demonstrated
that SURF can reduce false matches, especially for high resolution
images, while remaining robust to certain transformations and post-processing operations. However, this technique is unable to detect a
small copied region in the image. It was later shown that the
SURF-based technique reduces accuracy, although it improves
the processing time in copy-move detection (Mishra et al., 2013).
5.2. Keypoint-based matching techniques
Similarities among the feature points in an image can also be
measured using nearest neighbor search. However, due to its high
computational complexity, it is challenging to detect the forgery in
an image. Therefore, nearest neighbor techniques have been
the subject of active research.
In this section, the nearest neighbor techniques for the keypoint-based approach are divided into four types, namely: Best Bin First,
2NN, g2NN, and others. Another matching technique for the keypoint-based approach introduced in CMFD is clustering. The breakdown
of these techniques by year is presented in Fig. 9(c) and the
respective literature is listed in Table 10.

5.2.1. Nearest neighbor
Nearest neighbor examines the similarity between points by
calculating the distance between points in vector space. The points
are considered similar if the distances satisfy the designated
threshold. There are four types of improvement to the nearest
neighbor technique, and they can be combined with one another to
improve performance, as listed in Table 11.
Keypoint features are commonly indexed using Best Bin First
(Chen et al., 2013; Pan and Lyu, 2010; Zhao and Zhao, 2013), and
the distance between each pair of points is compared to a predefined
threshold to remove false matches (Jaberi et al., 2013b; Guo et al.,
2013), although we observe that totally avoiding false matches is
not possible. Hence, identical points are searched outside a window centered at the keypoint to prevent matching within its close
spatial adjacency (Pan and Lyu, 2010). Amerini et al. (2011) introduced the g2NN procedure, designed to produce the highest number of
matches, especially for multiple copy-move forgeries in an image.
Dissimilarities between points can also be clustered using a hierarchical agglomerative clustering (HAC) algorithm (Amerini et al.,
2013, 2011; Kakar and Sudha, 2012; Mishra et al., 2013) to create
the point's region.
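The 2NN ratio test combined with a spatial exclusion window can be sketched as follows. This is a brute-force illustration of the two ideas summarized in Table 11, not a reproduction of any cited implementation: a practical system would index descriptors with a k-d tree or Best Bin First search, and the threshold values, the function name `two_nn_matches` and the toy keypoints are assumptions.

```python
import math


def two_nn_matches(keypoints, ratio=0.6, window=10.0):
    """keypoints: list of ((x, y), descriptor). Returns index pairs (i, j)."""
    matches = []
    for i, (pos_i, desc_i) in enumerate(keypoints):
        candidates = []
        for j, (pos_j, desc_j) in enumerate(keypoints):
            # Skip the keypoint itself and anything inside its spatial window.
            if i == j or math.dist(pos_i, pos_j) <= window:
                continue
            candidates.append((math.dist(desc_i, desc_j), j))
        candidates.sort()
        if len(candidates) >= 2:
            (d1, j), (d2, _) = candidates[0], candidates[1]
            # Accept only if the closest neighbor is clearly better than the second.
            if d2 > 0 and d1 / d2 < ratio:
                matches.append((i, j))
    return matches


# Hypothetical keypoints: index 1 is a near-copy of index 0,
# index 3 is a spatial neighbor of index 0 with a similar descriptor.
keypoints = [
    ((10, 10), [0.90, 0.10, 0.40]),
    ((80, 60), [0.90, 0.10, 0.41]),
    ((40, 90), [0.10, 0.80, 0.20]),
    ((12, 11), [0.88, 0.12, 0.40]),
    ((70, 20), [0.50, 0.50, 0.90]),
]
matches = two_nn_matches(keypoints)
```

The window prevents a keypoint from matching its own immediate surroundings, while the ratio test discards keypoints whose best match is not distinctive.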
5.2.2. Clustering
Clustering groups a set of objects that are similar to
each other, and a common clustering technique in CMFD is HAC.
However, the linkage method used in the clustering calculation
varies among authors. For example, Ardizzone et al. (2010)
introduced object matching by clustering rather than point
matching. Each vector is clustered using Weighted Center of Mass
Distance (WPGMC) linkage to obtain the object's region, which is then compared to a threshold. Thus, objects with similar
shape and texture can be considered as the real copy.
6. Publicly available datasets

Since the field of image forensic research is constantly
advancing,1 there are few publicly available benchmarking datasets for research. Such datasets should ideally offer a collection of
natural images with realistic copy-move tamper operations. This
will provide a common platform to benchmark the performance of
different techniques. Existing copy-move datasets are listed in
Table 12.
The publicly available datasets may not be enough to fulfill the
criteria needed by researchers. However, a few databases consisting of original or non-forged images are available on the
internet. The databases are maintained to support research in
image processing, image analysis and machine vision. These images can be used to create copy-move forgeries using a
powerful digital image editor such as Adobe Photoshop.
Once an image has been analyzed for forgery, the performance
of the detection technique must be evaluated. We
observe that there are two categories of evaluation techniques,
namely: accuracy per image and accuracy per pixel. The accuracy
per image is dependent on the number of images in the dataset,
and a higher number of images will produce more precise results.
Thus, the total number of forged images and original images
should be balanced in a dataset. However, the original images that
have been added to satisfy the quota might not relate to the forged
images, since one original image may have one or more
forged images. Moreover, the accuracy results might not be guaranteed because certain pixels in the image may be falsely detected,
even when the image has been identified as forged. Therefore, a
number of researchers improve the evaluation by validating the
detection per pixel in one image to form the percentage of the
detection in an image. In this case, a set of ground truth images has
to be created in the dataset, particularly to compare with the detected pixels.
Fig. 10 shows examples of results for a correctly detected image
and a falsely detected image, as compared to the ground truth image. The final results of the accuracy per pixel will be determined
by obtaining the average percentage over all images in the
dataset. Unfortunately, the processing time of this evaluation will
increase, although the results are more detailed compared to approaches based on accuracy per image.
1 We observe that such a trend is also seen in other digital forensic research
areas such as cloud and Internet-of-Things (IoT) forensics (see Do et al., 2015a,
2015b, 2016; Quick et al., 2013) and forensic authorship (see Peng et al., 2016a,
2016b).

Table 9
Summary of studies on Harris features-based techniques in CMFD.
Author(s) | Feature point | Feature descriptor | Outcome
Kakar and Sudha (2012) | Technique: combine the features extracted from Laplacian of Gaussian (LoG) with the Harris filter. Objective: to obtain features from gradient changes and corners in the image. | Technique: a circular region around each feature point is scaled to a radius of 32 pixels, and the Radon transform is generalized over straight lines in the circular region. Objective: to increase robustness to scale and rotation. | Improves the MPEG-7 image signature tools, which were developed for content-based image retrieval, to detect copy-move forgeries.
Zhao and Zhao (2013) | Technique: employ dense Harris feature points. Objective: to get a sufficient number of feature points with approximately uniform distribution. | Technique: a circle patch around each feature point is extracted using local binary pattern operators. Objective: to be rotation invariant. | Resilient to forgery in flat areas and areas with little visual structure.
Guo et al. (2013) | Technique: apply the adaptive non-maximal suppression (ANMS) initially proposed by Brown et al. (2005). Objective: to increase the distribution of points throughout the entire image. | Technique: apply the Daisy descriptor proposed by Tola et al. (2010), enhanced for rotation invariance. Objective: to enhance SIFT descriptor performance. | More resistant than SIFT to diverse types of operations, such as rotation, scaling, JPEG compression, and Gaussian noise addition.
Chen et al. (2013) | Technique: a threshold is adjusted for every single image to control the number of Harris points. Objective: to reduce the matching time by adjusting the threshold for every single image. | Technique: use step sector statistics as a descriptor to represent the small circle image region around each Harris point. Objective: to improve rotation invariance. | Effectively detects region duplication forged images with several geometrical transformations (including rotation, scaling and flipping) and image degradations (including JPEG compression and Gaussian noise) with a high accuracy.
Zheng and Chang (2014) | Technique: measure the values of the corner response for each pixel and compose them in a matrix. Objective: to improve on SIFT points by extracting more keypoints, especially in highly uniform texture areas. | Technique: generate a SURF descriptor around each Harris point. Objective: to improve the computation speed. | Significantly improves on the Chen et al. (2013) technique and remains robust even if the image is subjected to strong geometric transform and degradation.
Yu et al. (2014) | Technique: perform a non-maximal suppression (NMS) technique and obtain roughly evenly distributed points. Objective: to get a specified number of local maximum points rather than a desired feature point density. | Technique: extract the Multi-support Region Order-based Gradient Histogram (MROGH) as a descriptor for each point. Objective: to improve matching performance in texture areas. | Increases the running time compared with SIFT and SURF-based techniques.

Table 10
Publications for keypoint-based approach by matching technique.
Matching technique | Author(s)
Nearest Neighbor: Best bin first | Chen et al. (2013), Huang et al. (2008), Jaberi et al. (2013b), Kakar and Sudha (2012), Mishra et al. (2013) and Zhao and Zhao (2013)
Nearest Neighbor: 2NN | Farukh et al. (2014), Guo et al. (2013), Jaberi et al. (2013b), Kakar and Sudha (2012) and Mishra et al. (2013)
Nearest Neighbor: g2NN | Amerini et al. (2013, 2011), Mohamadian and Pouyan (2013)
Nearest Neighbor: Others | Anand et al. (2014), Jaberi et al. (2013a), Li et al. (2014) and Shen et al. (2013)
Clustering | Ardizzone et al. (2010)
In order to calculate the accuracy, the commonly used metrics in
CMFD are True Positive Ratio (TPR) and False Positive Ratio (FPR).
These metrics are commonly used in the accuracy per image category, where a good detection technique should maintain a high
TPR while keeping the FPR at a minimum level. The calculations for both
TPR and FPR are presented in Eq. (1).
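\[
\mathrm{TPR} = \frac{\#\,\text{Images detected as forged being forged}}{\#\,\text{Forged images}}, \qquad
\mathrm{FPR} = \frac{\#\,\text{Images detected as forged being original}}{\#\,\text{Original images}} \tag{1}
\]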
Otherwise, precision-recall (PR) curves are typically employed
for the accuracy per pixel category. Eq. (2) shows the precision and
recall rate calculations.
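$$\mathrm{Precision} = \frac{|\text{Forged region} \cap \text{Detected region}|}{|\text{Detected region}|}, \qquad \mathrm{Recall} = \frac{|\text{Forged region} \cap \text{Detected region}|}{|\text{Forged region}|} \tag{2}$$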
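Eq. (2) can be sketched for a single image as follows, assuming that the detection result and the ground truth are available as binary masks of the same size. The nested-list mask format and the name `precision_recall` are assumptions of this example.

```python
def precision_recall(detected_mask, forged_mask):
    """Return (precision, recall) per pixel, following Eq. (2).

    Both masks are lists of rows of 0/1 values (hypothetical format):
    1 marks a pixel that is detected (or truly forged), 0 otherwise.
    """
    if len(detected_mask) != len(forged_mask):
        raise ValueError("masks must have the same number of rows")

    overlap = n_detected = n_forged = 0
    for det_row, gt_row in zip(detected_mask, forged_mask):
        if len(det_row) != len(gt_row):
            raise ValueError("mask rows must have the same length")
        for det, gt in zip(det_row, gt_row):
            n_detected += 1 if det else 0
            n_forged += 1 if gt else 0
            # Pixel belongs to both the forged and the detected region.
            overlap += 1 if (det and gt) else 0

    precision = overlap / n_detected if n_detected else 0.0
    recall = overlap / n_forged if n_forged else 0.0
    return precision, recall
```

Averaging these values over all images in a dataset gives the accuracy per pixel described above.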
Meanwhile, CMFD techniques are known for their high computational time
due to the searching and matching of regions within an image.
Therefore, the processing time of each technique must
be included as one of the evaluation metrics. Ultimately, CMFD techniques that are fast and maintain good accuracy are desirable.
Table 11
Summary of nearest neighbor techniques in CMFD.

| Technique | How it works | Objective |
| Best Bin First | Based on a variant of the k-d tree search, which finds the nearest neighbor for a large fraction of queries and otherwise returns a very close neighbor. | To limit the amount of computation in high-dimensional space. |
| 2NN | Accept the point as a match if the ratio of the closest to the second-closest neighbor distance, d1/d2, is less than a threshold. | To eliminate false match points in high-dimensional feature space. |
| g2NN | Iterate the 2NN procedure on di/di+1 until the ratio is greater than the threshold. If k is the value at which the procedure stops, each keypoint corresponding to a distance in (d1, ..., dk), where 1 ≤ k < n, is considered a match for the inspected keypoint. | To detect multiple copy-move forgeries in one image. |
| Others: Window | Perform the search process outside a fixed-size pixel window centered at the keypoint. Only those with distinct similarities are kept as matching points. | To avoid searching for nearest neighbors of a keypoint within the same region. |
| Others: KNN | Construct a k-d tree over the whole image, then perform a k-nearest neighbor (KNN) search in each region for each keypoint to find a possible correspondence. | To simplify the implementation and increase the robustness of the matching process. |
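The g2NN row of Table 11 can be illustrated with a short sketch. It follows one reading of the description, in which a neighbor is accepted for as long as its ratio test d_i/d_(i+1) passes and the iteration stops at the first ratio that is not below the threshold. The function name `g2nn_matches`, the tuple-based descriptor format and the default threshold of 0.5 are assumptions of this example; real systems operate on 128-dimensional SIFT descriptors and usually rely on an approximate nearest neighbor index rather than a full sort.

```python
import math


def g2nn_matches(descriptors, query_index, threshold=0.5):
    """Return indices of descriptors accepted as matches for one keypoint.

    descriptors: list of equal-length numeric tuples (hypothetical format).
    query_index: position of the inspected keypoint in that list.
    """
    query = descriptors[query_index]
    # Distances to every other keypoint in ascending order: d1 <= d2 <= ...
    ranked = sorted(
        (math.dist(query, desc), idx)
        for idx, desc in enumerate(descriptors)
        if idx != query_index
    )

    matches = []
    # Iterate the 2NN ratio test on consecutive distances d_i / d_(i+1).
    for (d_i, idx), (d_next, _) in zip(ranked, ranked[1:]):
        if d_next == 0 or d_i / d_next >= threshold:
            break  # ratio no longer below the threshold: stop iterating
        matches.append(idx)
    return matches
```

With the plain 2NN test only the first neighbor could ever be returned, whereas this loop can return several neighbors, which is why g2NN suits images in which a region has been copied more than once.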
7. Types of copied regions

Although many well-known image tampering datasets
for copy-move forgery detection are available online, most of them
categorize the images by the operations involved in the tampering,
such as rotation, compression and scaling. MICC-F220 is a widely
used dataset that contains 110 copy-move images with multiple
operations. This dataset has greatly benefited the CMFD research
community in terms of finding operation-invariant techniques. In contrast, the CASIA datasets only categorize the original images,
which are then forged arbitrarily. Therefore, none of the datasets
categorizes images based on the copied regions.
Fig. 11 shows the types of copied regions, and each type
is explained in this section. An example of the forged
image for each type is presented in Fig. 12.
7.1. Background
A background image is defined by the dissimilarity between the
objects and their surroundings. This type represents scenes with
variations in luminance and geometry settings instead of objects
(Piccardi, 2004). The background can be scenery, nature, texture
or color. Normally, homogeneous backgrounds are used to
hide an object appearing in the image. Thus, texture analysis,
including intensity, patterns and color, is required.
Table 12
Existing publicly available copy-move forgery datasets.

| Name | Image size | Total images | Descriptions | URL |
| Columbia University Ng and Chang (2004) | 128 × 128 | 1845 | Copy-move forgery and image splicing; original and forged images divided into two categories (smooth vs. textured and arbitrary object boundary vs. straight boundary) | http://www.ee.columbia.edu/ln/dvmm/downloads/AuthSplicedDataSet/AuthSplicedDataSet.htm |
| Image forensics Muhammad (n.d.) | 200 × 200 | 10 | The forged images are in JPEG format with a Q (quality) factor of 100 | http://faculty.ksu.edu.sa/ghulam/Pages/ImageForensics.aspx |
| CASIA v1.0 Jing and Wei (2011) | 374 × 256 | 1725 | JPEG format; divided into several categories (scene, animal, architecture, character, plant, article, nature and texture) | http://forensics.idealtest.org:8080/index_v1.html |
| CASIA v2.0 Jing and Wei (2011) | 240 × 160 to 900 × 600 | 12614 | Uncompressed and JPEG-compressed images | http://forensics.idealtest.org:8080/index_v2.html |
| Image manipulation Christlein et al. (2012) | 420 × 300 to 3888 × 2592 | 48 | Copy-move forgery; applied with JPEG compression, rotation and scaling operations | https://www5.cs.fau.de/research/data/image-manipulation/ |
| MICC-F220 Amerini et al. (2011) | 722 × 480 to 800 × 600 | 220 | Applied with translation, rotation, scale (symmetric/asymmetric), or a combination of them | http://www.micc.uni.it/downloads/MICC-F220.zip |
| MICC-F600 Amerini et al. (2013) | 800 × 533 to 3888 × 2592 | 600 | Randomly taken from the MICC-F2000 and SATS130 datasets | http://www.micc.uni.it/downloads/MICC-F600.zip |
| MICC-F2000 Amerini et al. (2011) | 2048 × 1536 | 2000 | Applied with translation, rotation, scale (symmetric/asymmetric), or a combination of them | http://www.micc.uni.it/downloads/MICCF2000.zip |
| CoMoFoD Tralic et al. (2013) | 512 × 512 and 3000 × 2000 | 260 | Applied with translation, rotation, scale, distortion or a combination of them | http://www.vcl.fer.hr/comofod/download.html |
| CMFD_db Cozzolino et al. (2014) | 768 × 1024 | 160 | PNG format | http://www.grip.unina.it/download/prog/CMFD/CMFDdb_grip.zip |
Fig. 10. Examples of detection results for (a) correctly detected image, (b) falsely detected image, in comparison to (c) ground truth image.
Fig. 11. Types of copied regions in copy-move forgery.
7.2. Object
An object is any physical form that is real and recognizable. Objects in an image include architecture (e.g. buildings),
art, shapes, plants and lines. An object is generally copied within the image
to change the number of items shown or to hide
unwanted items. In addition, copying an object can change the
form in which the object is represented. Object recognition is a field of image
studies that is being actively
researched at the time of this writing. Object recognition in
real-world settings requires local image features that can be differentiated from each other. Thus, the features of copied objects
are easily identified.
7.3. Creature
Although both humans and animals could be considered objects, the
creature type refers to figures that can move and exhibit
different behaviors. Creatures are regularly copied to simulate a
crowd. The copied region does not necessarily consist of a full figure, as it can be a
part of the figure (e.g. face, eye, and hands). Several recognition
techniques relating to face and behavior detection might be necessary to identify the manipulated areas.
7.4. Letter
The last type of copied region is the letter. Here, a letter is a symbol
of an alphabet used to represent a word or text. There are variations of
the same letter within an alphabet. For instance, digital text comes in
different fonts, while handwriting takes diverse forms. In copy-move forgery, a letter is copied to change the meaning of the
word or text. Because text is a medium of communication,
altering its meaning in an image can have
real consequences.
8. Discussion
Image forgery detection is a rapidly growing research area,
especially with regard to passive authentication techniques. We surveyed the
digital image forgery detection literature, focusing on copy-move
forgery. A total of 84 scientific papers on CMFD indexed by WOS
were reviewed, and a common CMFD workflow was presented
based on the materials located. The feature extraction and
matching processes were each categorized into two approaches, namely
block-based and keypoint-based.
The block-based approach is the most popular approach in the
CMFD literature, due to its suitability for various feature extraction techniques and its capability to achieve a high matching
performance. Common techniques in the block-based approach
are frequency transforms for feature extraction and lexicographical sorting for
matching. Frequency transforms are robust to post-processing operations such as noise,
blurring, and JPEG compression. However, they have limited capability
in dealing with geometrical transformations such as rotation and
scaling. Therefore, researchers have attempted to apply different
feature extraction and matching techniques in order to increase
the robustness to such operations in the block-based category. They
studied moments and the log-polar transform specifically to
handle transformation operations. Texture features have also been
used to identify hidden objects in an image. For example,
color, intensity and pattern are analyzed to find the most similar
features among the uniform textures in the image. In addition, dimension reduction techniques such as PCA, DWT and SVD are used to
reduce the feature dimension and increase the processing speed. Other matching techniques such as radix sort, KD-Tree, and LSH have also been introduced for these reasons. Although
block-based techniques continue to improve, the keypoint-based approach appears to achieve better performance.
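The lexicographical sorting step used by block-based techniques can be sketched as follows. For brevity, this example uses the raw pixel values of each overlapping block as its feature vector and accepts exact matches only, whereas the surveyed methods use features such as frequency-transform coefficients or moments and tolerate small differences. The name `find_duplicate_blocks` and the parameters `block` and `min_shift` are assumptions of this example.

```python
def find_duplicate_blocks(image, block=2, min_shift=2):
    """Return pairs of top-left positions whose blocks are identical.

    image: grayscale image as a list of rows of integers (hypothetical format).
    block: side length of the square overlapping blocks.
    min_shift: minimum row or column offset between two matched blocks, so
               that a block is not matched with its immediate neighbors.
    """
    rows, cols = len(image), len(image[0])
    features = []
    for r in range(rows - block + 1):
        for c in range(cols - block + 1):
            # Feature vector of the block: its pixels in row-major order.
            feat = tuple(
                image[r + dr][c + dc]
                for dr in range(block)
                for dc in range(block)
            )
            features.append((feat, (r, c)))

    # Lexicographical sort: identical feature vectors become adjacent.
    features.sort()

    pairs = []
    for (f1, p1), (f2, p2) in zip(features, features[1:]):
        shift = max(abs(p1[0] - p2[0]), abs(p1[1] - p2[1]))
        if f1 == f2 and shift >= min_shift:
            pairs.append((p1, p2))
    return pairs
```

Sorting replaces the comparison of every block with every other block by a single pass over adjacent entries, which is the main reason this matching step is popular despite the large number of overlapping blocks.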
Fig. 12. Example images obtained from CASIA v2.0 Dataset (Jing and Wei, 2011). Columns: original image and forged image; rows (types): background, object, creature and letter.
Keypoint-based approaches have also started to become
popular, and SIFT is the most popular and reliable technique for detecting copy-move forgery due to its good performance
under geometrical transformation operations. The difference of Gaussians procedure makes the features robust to scaling, while the
gradient orientation procedure makes them rotation invariant. Moreover, the
g2NN procedure, combined with searching outside a predefined window,
gives the highest matching performance, mainly in multiple copy-move forgery detection. However, SIFT lacks the capability to
identify manipulated areas in texture, whether in flat and
smooth regions or in regions with highly similar features. For this reason,
SIFT yields a high false positive ratio, specifically when handling
manipulated textured areas. Therefore, research on the keypoint-based approach has focused on improving its accuracy while
maintaining robustness to geometrical transformation operations. It
is also known that keypoint-based approaches suffer from a high
time complexity due to the need to match large numbers of
similar points in an image. Thus, studying salient feature selection and improving the matching techniques to reduce the
complexity while maintaining the accuracy is another topic of
interest. For example, a limited number of keypoint features with a
set of efficient descriptor techniques (to withstand geometrical
transformation, discover manipulated textured areas and
delineate the shape of a region) can be implemented with different
nearest neighbor techniques, and their outcomes in
CMFD can then be evaluated.
In recent years, researchers have introduced a hybrid approach that
combines block-based and keypoint-based techniques. They
employed keypoint and pixel features to improve the results. As
keypoint features are unable to define the shape of the region between
points, the points are segmented and matched by patch, resulting in consistently accurate results (Li et al., 2014). In addition,
replacing the feature points with small superpixels and merging them
with neighboring local features before the matching process can
reduce the computational complexity (Pun et al., 2015). In another approach, the results of the keypoint matching are decomposed and
analyzed using a multi-scale, per-pixel voting process, specifically
to separate naturally homogeneous areas from intentionally tampered areas (Silva et al., 2015). Meanwhile, feature points are divided into triangles and matched by color and vertex to
improve on the block-based category in terms of robustness to geometrical transformation operations and processing time (Ardizzone
et al., 2015).
In summary, while maintaining robustness to various operations (e.g. rotation, noise addition, scaling, lossy compression, and
blurring), existing techniques are generally less effective in
homogeneous and smooth regions. Covering such regions
increases the computational complexity, which results in
slower processing and a higher computational cost.
Additionally, the threshold values used to determine the manipulated
area vary with image size and content, which
affects the accuracy of the technique even when training has
been performed.
CMFD techniques in the literature have attempted to address the object and background types of copied regions. As features used in object recognition, such as SIFT, texture, and moments, are employed,
such techniques perform well for objects even when multiple
operations are involved in the copying. However, these
techniques are limited in distinguishing homogeneous areas for the
background type. Moreover, creature movements could be manipulated with some operations before the region is pasted into another location. Unfortunately, existing techniques have not explored creature behavior detection. These techniques also cannot be used
to identify the letter type of copied areas, as a word can legitimately contain
similar letters. Consequently, a research challenge is how to effectively detect all of these types of forgery within one image.
Big data is another popular research trend, and one implication of big data is the inability to process and analyze large
datasets. Quick and Choo (2014b) surveyed material published
between 1999 and 2014 relating to the impacts of the increasing volume of digital forensic data, and they concluded that there remains a need for further research with a focus on the real-world applicability of a method or methods to address the digital forensic
data volume challenge. This is not surprising due to the significant
increases in information shared online (e.g. uploading and sharing
of images on social media sites such as Facebook, Instagram, Flickr
and WhatsApp), which exceed the processing and analytical
capabilities of current tools. Thus, verifying the authenticity of
such images remains an operational challenge. For example, it was
shown that analyzing a huge dataset requires different approaches
from analyzing a small dataset (Mahrt and Scharkow, 2013). While
existing techniques are reliable on a small dataset, they may not
perform well on a larger dataset. Moreover, the complexity of
the data in an image itself will also result in complications during
image forensics (Smith et al., 2012).
Therefore, we identify two open issues regarding big data
and image forensics, namely data inconsistencies and high scalability, and limitations in existing computer architectures.
8.1. Data inconsistencies and high scalability
Images on social media come from heterogeneous sources and are unstructured, which results in data
inconsistencies and noise. Data inconsistency in a large sample,
combined with high scalability requirements, results in higher computational cost and algorithmic instability. Unfortunately, existing
CMFD techniques have long processing times, so they would not
scale to analyzing large batches of images. These
conditions cause noise accumulation, spurious correlations,
and incidental homogeneity concerns; hence the need to develop
more adaptive and robust procedures (Fan et al., 2014).
In addition to the data volume challenges associated with image forensics, we need to consider the high dimensionality of image features.
Processing high-dimensional image data is a current
research challenge. As a step preceding analysis, image data
must be structured and organized, and some data
preprocessing techniques (e.g. data cleaning, data reduction and
data transformation) may be applied to remove the noise
and reduce the inconsistencies. A number of researchers have
proposed mapping the high-dimensional data space into a lower-dimensional
space while minimizing information loss.
Apache Hadoop is one of the established platforms that support
data-intensive distributed applications, and MapReduce is a programming model implemented on the platform to
process large volumes of data. It works by partitioning the workload and distributing it across different machines before aggregating the results.
Hadoop is designed for batch processing; thus, it is not a real-time,
high-performance engine. Therefore, coordination between separate processing units and data units on a cluster is
essential to improve the scalability, efficiency
and fault tolerance of big data systems (Philip Chen and Zhang,
2014).
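The partition, map and aggregate pattern described above can be sketched in plain Python. This is a single-process illustration of the programming model only and does not use the Hadoop API; the function names and the `detector` callback (any function that returns True for a forged image) are hypothetical.

```python
from collections import defaultdict


def map_phase(partition, detector):
    """Map step: emit one (label, 1) pair per image in a partition."""
    return [("forged" if detector(image) else "original", 1)
            for image in partition]


def shuffle(mapped_partitions):
    """Shuffle step: group the emitted values by key across partitions."""
    grouped = defaultdict(list)
    for pairs in mapped_partitions:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: aggregate the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def run_job(images, detector, n_partitions=2):
    """Split the images into partitions and run map, shuffle and reduce."""
    partitions = [images[i::n_partitions] for i in range(n_partitions)]
    # On a cluster, each partition would be mapped on a different machine.
    mapped = [map_phase(part, detector) for part in partitions]
    return reduce_phase(shuffle(mapped))
```

Because each partition is mapped independently, the detection step is the part that a cluster could execute in parallel, while the reduce step only aggregates small per-partition results.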
8.2. Limitations in existing computer architecture
Due to the characteristics of big data, there are a number of
challenges associated with allocating computational resources for
image processing. The main challenge is the limitation of the CPU
itself. The imbalance between the processing speed of the CPU and
the amount of data it must handle has limited big data exploration.
Although information processing methods
in CPU architectures are continually advancing, we may be unable to effectively manage the amount of data that is continually being
created. For that reason, distributed algorithms and software are
studied to accelerate the analysis process. Alternatively, multiple
cores can be used simultaneously through parallel computing to
solve the computational problem. Multiple platform options,
including customizable circuits (e.g. FPGA), custom processing
units (e.g. GPU), high performance computing (HPC), clusters
connected by fast local networks, and data center-scale virtual
clusters, may be employed so that machine learning
algorithms can detect image forgery efficiently.
Acknowledgments
This work is fully funded by Bright Sparks Unit, University of
Malaya, Malaysia, and partially funded by Ministry of Education,
Malaysia under the University of Malaya High Impact Research
Grant UM.C/625/1/HIR/MoE/FCSIT/17.
8.3. Potential of big data
Advances in network technology (see Behringer et al., 2016;
Rostirolla et al., 2016; Xu et al., 2016) could be beneficial for image
forgery detection. For instance, Pooranian et al. (2011) introduced
a hybrid algorithm that takes advantage of both the Genetic and the Gravitational Emulation Local Search (GELS) algorithms to schedule
a job that requires large-scale computation over a distributed
system. In CMFD, GELS could be employed
to match identical features in an image (local search), while
the Genetic algorithm is used to match local features of large-scale data (global
search). Thus, the hybrid could help to solve
grid computation problems by reducing the processing time and
minimizing the risk of missed tasks, in comparison with traditional
methods. More recently, Shojafar et al. (2016) proposed an adaptive implementation of the scheduler to deal with time fluctuations of the input traffic and the state of the mobile connection in
cloud computing. Such an approach could help to improve the handling of input
traffic, output traffic and resource reconfiguration, particularly for
image data and analysis in the cloud, and consequently achieve
energy savings.
Another related work is the research of Tavoli et al. (2013) on
document image retrieval. The authors proposed matching the
query image against a database using feature weighting. In image forensics, the method could be adopted by classifying the feature
weighting of the original image, the forged image and the source device. Every uploaded image would then be compared with the
weighting in the database, and suspicious images would undergo
further evaluation. Similarly, the Attack-Resistant Trust Management Scheme, proposed for Vehicular Ad Hoc Networks (Li and Song, 2015), may also be applied to
image forensics. Using the scheme, the image data, including
malicious attacks, would be analyzed as evidence to detect the
misbehavior of an image and evaluate its trustworthiness.
Existing techniques and tools may not be fit-for-purpose to
solve real-world big data problems, a view echoed by Quick and
Choo (2016, 2014a, 2014b, 2014c) and others (Hu et al., 2016;
Nepal et al., 2015; Xu et al., 2016; Z. Xu et al., 2016; Zhao et al.,
2016). Big data advances in storage and I/O techniques, computer architectures, data-intensive techniques, etc. could help influence the development of hardware and software that can be
applied in CMFD.
9. Conclusion
We surveyed publications on CMFD between 2007 and 2014,
and determined that copy-move forgery manipulation has been a popular
line of research inquiry in recent years (e.g. as evidenced by the
number of publications on the topic). In this survey, we provided a
comprehensive overview of existing CMFD techniques for the
entire process. Specifically, we discussed the importance of
CMFD techniques, and outlined the common processes involved in
the CMFD workflow. The key processes are grouped into two
categories, namely block-based and keypoint-based. We described
the major classes of techniques in both categories, and listed the
associated activities related to CMFD, including datasets and
validations. Furthermore, we classified the copied regions to determine their relevancy in existing CMFD techniques. We also
discussed how advances in big data solutions could influence
and/or solve CMFD challenges.
References
Akbarpour Sekeh, M., Maarof, M.A., Rohani, M.F., Mahdian, B., 2013. Efficient image
duplicated region detection model using sequential block clustering. Digit. Investig. 10, 73–84. http://dx.doi.org/10.1016/j.diin.2013.02.007.
Al-Qershi, O.M., Khoo, B.E., 2013. Passive detection of copy-move forgery in digital
images: state-of-the-art. Forensic Sci. Int. 231, 284–295. http://dx.doi.org/10.1016/j.forsciint.2013.05.027.
Amerini, I., Ballan, L., Caldelli, R., Del Bimbo, A., Serra, G., 2011. A SIFT-based forensic method for copy move attack detection and transformation recovery.
IEEE Trans. Inf. Forensics Secur. 6, 1099–1110.
Amerini, I., Ballan, L., Caldelli, R., Del Bimbo, A., Del Tongo, L., Serra, G., 2013. Copy-move forgery detection and localization by means of robust clustering with
J-linkage. Signal Process. Image Commun. 28, 659–669. http://dx.doi.org/10.1016/j.image.2013.03.006.
Anand, V., Hashmi, M.F., Keskar, A.G., 2014. A Copy Move Forgery
Detection to Overcome Sustained Attacks Using Dyadic Wavelet Transform and
SIFT Methods, in: 6th Asian Conference on Intelligent Information and Database
Systems (ACIIDS). pp. 530–542.
Ardizzone, E., Bruno, A., Mazzola, G., 2015. Copy move forgery detection by
matching triangles of keypoints. IEEE Trans. Inf. Forensics Secur. 10, 2084–2094.
Ardizzone, E., Bruno, A., Mazzola, G., 2010. Detecting Multiple Copies in Tampered
Images, in: 17th International Conference on Image Processing. pp. 2117–2120.
Ardizzone, E., Mazzola, G., 2009. Detection of Duplicated Regions in Tampered Digital Images by Bit-Plane Analysis, in: 15th International Conference Vietri Sul Mare, Italy. pp. 893–901.
Bay, H., Ess, A., 2008. Speeded-Up Robust Features (SURF). Comput. Vis. Image
Underst. 110, 346–359. http://dx.doi.org/10.1016/j.cviu.2007.09.014.
Bayram, S., Sencar, H.T., Memon, N., 2009. An Efficient And Robust Method For
Detecting Copy-Move Forgery, in: IEEE International Conference on Acoustics,
Speech and Signal Processing (ICASSP). pp. 1053–1056.
Behringer, R., Ramachandran, M., Chang, V., 2016. A Low-Cost Intelligent Car Break-in Alert System Using Smartphone Accelerometers for Detecting Vehicle Break-Ins, in: The First International Conference on Internet of Things and Big Data.
Bilgehan, M., Ulutaş, M., 2013. Detection of Copy-Move Forgery Using Krawtchouk
Moment, in: 8th International Conference on Electrical and Electronics Engineering (ELECO), pp. 311–314.
Birajdar, G.K., Mankar, V.H., 2013. Digital image forgery detection using passive
techniques: a survey. Digit. Investig. 10, 226–245. http://dx.doi.org/10.1016/j.diin.2013.04.007.
Bo, X., Junwen, W., Guangjie, L., Yuewei, D., 2010. Image Copy-Move Forgery Detection Based On SURF, in: International Conference on Multimedia Information
Networking and Security. IEEE, pp. 889–892. http://dx.doi.org/10.1109/MINES.2010.189.
Bravo-Solorio, S., Nandi, A., 2011. Exposing duplicated regions affected by reflection,
rotation and scaling, in: IEEE International Conference on Acoustics, Speech and
Signal Processing (ICASSP). pp. 1880–1883. http://dx.doi.org/10.1016/j.sigpro.2011.01.022.
Brown, M., Szeliski, R., Winder, S., 2005. Multi-Image Matching Using Multi-Scale
Oriented Patches, in: IEEE Computer Society Conference on Computer Vision
and Pattern Recognition (CVPR'05). IEEE, pp. 510–517. http://dx.doi.org/10.1109/CVPR.2005.235.
Cao, Y., Gao, T., Fan, L., Yang, Q., 2012. A robust detection algorithm for copy-move
forgery in digital images. Forensic Sci. Int. 214, 33–43. http://dx.doi.org/10.1016/j.forsciint.2011.07.015.
Chen, L., Lu, W., Ni, J., Sun, W., Huang, J., 2013. Region duplication detection based
on Harris corner points and step sector statistics. J. Vis. Commun. Image Represent. 24, 244–254. http://dx.doi.org/10.1016/j.jvcir.2013.01.008.
Choraś, R.S., 2007. Image Feature Extraction Techniques and Their Applications for
CBIR and Biometrics Systems. Int. J. Biol. Biomed. Eng. 1.
Christlein, V., Riess, C., Angelopoulou, E., 2010. A Study on Features for the Detection of Copy-Move Forgeries. Sicherheit 2010, Gesellschaft für Inform. e. V.,
105–116.
Christlein, V., Riess, C., Jordan, J., Riess, C., Angelopoulou, E., 2012. An Evaluation of
Popular Copy-Move Forgery Detection Approaches. IEEE Trans. Inf. Forensics
Secur. 7, 1841–1854.
Cozzolino, D., Poggi, G., Verdoliva, L., 2014. Copy-Move Forgery Detection Based On
Patchmatch, in: IEEE International Conference on Image Processing. pp. 5247–5251.
Davarzani, R., Yaghmaie, K., Mozaffari, S., Tapak, M., 2013. Copy-move forgery detection using multiresolution local binary patterns. Forensic Sci. Int. 231, 61–72.
http://dx.doi.org/10.1016/j.forsciint.2013.04.023.
Deng, Y., Wu, Y., Zhou, L., 2012. Detection of copy-rotate-move forgery using Dual
Tree Complex Wavelet Transform. Adv. Sci. Lett. 16, 32–38. http://dx.doi.org/10.1166/asl.2012.3289.
Do, Q., Martini, B., Choo, K.K.R., 2015a. A Forensically Sound Adversary Model for
Mobile Devices. PLoS One 10 (9), e0138449. http://dx.doi.org/10.1371/journal.pone.0138449.
Do, Q., Martini, B., Choo, K.K.R., 2015b. A cloud-focused mobile forensics methodology. IEEE Cloud Comput. 2 (4), 60–65. http://dx.doi.org/10.1109/MCC.2015.71.
Do, Q., Martini, B., Choo, K.K.R., 2016. Is the data on your wearable device secure?
An Android Wear smartwatch case study. Softw.: Pract. Exp. http://dx.doi.org/10.1002/spe.2414.
Fan, J., Han, F., Liu, H., 2014. Challenges of Big Data analysis. Natl. Sci. Rev., 1–38.
http://dx.doi.org/10.1093/nsr/nwt032.
Farid, H., 2006. Exposing digital forgeries in scientific images, in: Proceedings of the
8th Workshop on Multimedia and Security - MM&Sec '06. ACM Press, New
York, New York, USA, p. 29. http://dx.doi.org/10.1145/1161366.1161374.
Farukh, M., Anand, V., Keskar, A.G., 2014. Copy-move Image Forgery Detection
Using an Efficient and Robust Method Combining Un-decimated Wavelet
Transform and Scale Invariant Feature Transform. AASRI Procedia 9, 84–91.
http://dx.doi.org/10.1016/j.aasri.2014.09.015.
Fridrich, J., Soukal, D., Lukáš, J., 2003. Detection of Copy-Move Forgery in Digital
Images. Int. J. Comput. Sci. Issues 3, 652–663. http://dx.doi.org/10.1109/PACIIA.2008.240.
Gan, Y., Zhong, J., 2014. Image copy-move tamper blind detection algorithm based
on integrated feature vectors. J. Chem. Pharm. Res. 6, 1584–1590.
Guo, J.-M., Liu, Y.-F., Wu, Z.-J., 2013. Duplication Forgery Detection Using Improved
DAISY Descriptor. Expert Syst. Appl. 40, 707–714. http://dx.doi.org/10.1016/j.eswa.2012.08.002.
Harris, C., Stephens, M., 1988. A Combined Corner and Edge Detector, in: Proceedings
of the Alvey Vision Conference 1988. Alvey Vision Club, pp. 23.1–23.6.
doi:10.5244/C.2.23.
He, H., Huang, X., Kuang, J., 2013. Exposing copy move forgeries based on a dimension reduced SIFT method. Inf. Technol. J. 12, 2975–2979.
Hsu, H.C., Wang, M.S., 2012. Detection of copy-move forgery image using Gabor
descriptor. Proc. Int. Conf. Anti-Counterfeiting, Secur. Identification, ASID, pp. 1–4.
doi:10.1109/ICASID.2012.6325319.
Hu, H., Zhang, Y., Shao, C., Ju, Q., 2014. Orthogonal moments based on exponent
functions: exponent-Fourier moments. Pattern Recognit. 47, 2596–2606.
Hu, M.-K., 1962. Visual Pattern Recognition by Moment Invariants. IRE Trans. Inf.
Theory 2, 179–187.
Hu, Y., Yan, J., Choo, K.-K.R., 2016. PEDAL: A Dynamic Analysis Tool for Efficient
Concurrency Bug Reproduction in Big Data Environment. Cluster Comput.
Huang, Y., Lu, W., Sun, W., Long, D., 2011. Improved DCT-based detection of copy-move forgery in images. Forensic Sci. Int. 206, 178–184. http://dx.doi.org/10.1016/j.forsciint.2010.08.001.
Huang, H., Guo, W., Zhang, Y., 2008. Detection Of Copy-Move Forgery in Digital
Images Using SIFT Algorithm, in: IEEE Pacific-Asia Workshop on Computational
Intelligence and Industrial Application. IEEE, pp. 272–276. http://dx.doi.org/10.1109/PACIIA.2008.240.
Hussain, M., Muhammad, G., Saleh, S.Q., Mirza, A.M., Bebis, G., 2013a. Image forgery
detection using multi-resolution weber local descriptors. EuroCon, 1570–1577.
Hussain, M., Muhammad, G., Saleh, S.Q., Mirza, A.M., Bebis, G., 2013b. Evaluation of
image forgery detection using multi-scale weber local descriptors. IEEE Eur.
2013, 1570–1577. http://dx.doi.org/10.1109/EUROCON.2013.6625186.
Hussain, M., Muhammad, G., Saleh, S.Q., Mirza, A.M., Bebis, G., 2012. Copy-move
image forgery detection using multi-resolution Weber descriptors. 8th Int. Conf.
Signal Image Technol. Internet Based Syst. SITIS 2012, pp. 395–401. http://dx.doi.org/10.1109/SITIS.2012.64.
Hussain, M., Saleh, S.Q., Aboalsamh, H., Muhammad, G., Bebis, G., 2014. Comparison
between WLD and LBP descriptors for non-intrusive image forgery detection,
in: IEEE International Symposium on Innovations in Intelligent Systems and
Applications (INISTA) Proceedings. IEEE, pp. 197–204. http://dx.doi.org/10.1109/INISTA.2014.6873618.
Jaberi, M., Bebis, G., Hussain, M., Muhammad, G., 2013b. Accurate and robust localization of duplicated region in copy-move image forgery. Mach. Vis. Appl. 25,
451–475. http://dx.doi.org/10.1007/s00138-013-0522-0.
Jaberi, M., Bebis, G., Hussain, M., Muhammad, G., 2013a. Improving The Detection
And Localization Of Duplicated Regions In Copy-Move Image Forgery, in: 18th
International Conference on Digital Signal Processing (DSP). IEEE, pp. 1–6.
http://dx.doi.org/10.1109/ICDSP.2013.6622700.
Jing, D., Wei, W., 2011. CASIA Tampered Image Detection Evaluation (TIDE) Database
[WWW Document]. URL http://forensics.idealtest.org/casiav2/ (accessed
04.28.15).
Kakar, P., Sudha, N., 2012. Exposing Postprocessed Copy-Paste Forgeries through
Transform-Invariant Features. IEEE Trans. Inf. Forensics Secur. 7, 1018–1028.
Kashyap, A., Joshi, S.D., 2013. Detection of Copy-Move Forgery Using Wavelet Decomposition, in: International Conference on Signal Processing and Communication (ICSC). pp. 1–3.
Ketenci, S., Ulutas, G., 2013. Copy-move forgery detection in images via 2D-Fourier
Transform. 36th Int. Conf. Telecommun. Signal Process. 813–816. doi:10.1109/TSP.2013.6614051.
Kodituwakku, S., Selvarajah, S., 2004. Comparison of color features for image retrieval. Indian J. Comput. Sci. 1, 207–211.
Kumar, S., Desai, J., Mukherjee, S., 2013. A Fast DCT Based Method for Copy Move
Forgery Detection, in: IEEE Second International Conference on Image Information Processing (ICIIP-2013). IEEE, pp. 649–654. http://dx.doi.org/10.1109/ICIIP.2013.6707675.
Kuznetsov Andrey Vladimirovich, M.V.V., 2014. A Fast Plain Copy-Move Detection
Algorithm Based on Structural Pattern and 2D Rabin-Karp Rolling Hash. 11th
Int. Conf. ICIAR, pp. 461–468.
Le, Z., Xu, W., 2013. A robust image copy-move forgery detection based on mixed
moments. Proc. IEEE Int. Conf. Softw. Eng. Serv. Sci. ICSESS, pp. 381–384. http://dx.doi.org/10.1109/ICSESS.2013.6615329.
Li, J., Li, X., Yang, B., Sun, X., 2014. Segmentation-based Image Copy-move Forgery
Detection Scheme. IEEE Trans. Inf. Forensics Secur. 6013, 1–12. http://dx.doi.org/10.1109/TIFS.2014.2381872.
Li, L., Li, S., Zhu, H., Wu, X., 2014. Detecting copy-move forgery under afne
transforms for image forensics. Comput. Electr. Eng. 40, 19511962. http://dx.
doi.org/10.1016/j.compeleceng.2013.11.034.
Li, W., Song, H., 2015. ART: an attack-resistant trust management scheme for securing vehicular ad hoc networks. IEEE Trans. Intell. Transp. Syst. 1, 110. http:
//dx.doi.org/10.1109/TITS.2015.2494017.
Li, X., Zhao, Y., Liao, M., Shih, F.Y., Shi, Y.Q., 2012. Passive detection of copy-paste
forgery between JPEG images. J. Cent. South Univ. 19, 28392851. http://dx.doi.
org/10.1007/s11771-012-1350-5.
Li, Y., 2013. Image copy-move forgery detection based on polar cosine transform
and approximate nearest neighbor searching. Forensic Sci. Int. 224, 5967. http:
//dx.doi.org/10.1016/j.forsciint.2012.10.031.
Li, L., Li, S., Wang, J., 2012. Copy-move forgery detection based on PHT. Proceeding
2012 World Congr. Inf. Commun. Technol. WICT, pp. 10611065. http://dx.doi.
org/10.1109/WICT.2012.6409232.
Li, W., Yu, N., 2010. Rotation robust detection of copy-move forgery, in: Proceedings
- International Conference on Image Processing, ICIP. pp. 21132116.
doi:10.1109/ICIP.2010.5652519.
Li, W., Yu, N., Yuan, Y., 2008. Doctored JPEG image detection. IEEE Int. Conf. Multimed. Expo 253256. http://dx.doi.org/10.1109/ICME.2008.4607419.
Lin, W., Khan, S.U., Yow, K.C., Qazi, T., Madani, S.A., Xu, C.-Z., Kołodziej, J., Khan, I.A., Li, H., Hayat, K., 2013. Survey on blind image forgery detection. IET Image Process. 7, 660–670. http://dx.doi.org/10.1049/iet-ipr.2012.0388.
Lin, H., Wang, C., Kao, Y., 2009. Fast Copy-Move Forgery Detection. WSEAS Trans. Signal Process. 5, 188–197.
Liu, B., Pun, C.M., Yuan, X.C., 2014. Digital image forgery detection using JPEG features and local noise discrepancies. Sci. World J., 2014. http://dx.doi.org/10.1155/2014/230425.
Lucchese, L., Cortelazzo, G.M., 2000. A noise-robust frequency domain technique for estimating planar roto-translations. IEEE Trans. Signal Process. 48, 1769–1786. http://dx.doi.org/10.1109/78.845934.
Lynch, G., Shih, F.Y., Liao, H.-Y.M., 2013. An efficient expanding block algorithm for image copy-move forgery detection. Inf. Sci. (Ny.) 239, 253–265. http://dx.doi.org/10.1016/j.ins.2013.03.028.
Mahdian, B., Saic, S., 2007. Detection of copy-move forgery using a method based on blur moment invariants. Forensic Sci. Int. 171, 180–189. http://dx.doi.org/10.1016/j.forsciint.2006.11.002.
Mahrt, M., Scharkow, M., 2013. The value of big data in digital media research. J. Broadcast. Electron. Media 57, 20–33. http://dx.doi.org/10.1080/08838151.2012.761700.
Miljković, O., 2009. Image Pre-Processing Tool. Kragujev. J. Math. 32, 97–107.
Mishra, P., Mishra, N., Sharma, S., Patel, R., 2013. Region duplication forgery detection technique based on SURF And HAC. Sci. World J., 2013.
Mohamadian, Z., Pouyan, A.A., 2013. Detection Of Duplication Forgery In Digital Images In Uniform And Non-Uniform Regions, in: 5th International Conference on Computer Modelling and Simulation. IEEE, pp. 455–460. http://dx.doi.org/10.1109/UKSim.2013.94.
Muhammad, G., Hussain, M., Bebis, G., 2012. Passive Copy Move Image Forgery Detection Using Undecimated Dyadic Wavelet Transform. Digit. Investig. 9, 49–57. http://dx.doi.org/10.1016/j.diin.2012.04.004.
Muhammad, G., Al-hammadi, M.H., Hussain, M., Mirza, A.M., Bebis, G., 2013. Copy move image forgery detection method using steerable pyramid transform and texture descriptor. EuroCon, 1586–1592.
Muhammad, G., n.d. Image Forensics [WWW Document]. URL http://faculty.ksu.edu.sa/ghulam/Pages/ImageForensics.aspx (accessed 04.28.15).
Murali, S., Anami, B.S., Chittapur, G.B., 2012. Detection of Digital Photo Image Forgery, in: IEEE International Conference on Advanced Communication Control
and Computing Technologies. p. 9166.
Myna, A.N., Venkateshmurthy, M.G., Patil, C.G., 2008. Detection of region duplication forgery in digital images using wavelets and log-polar mapping, in: Proceedings - International Conference on Computational Intelligence and Multimedia Applications, ICCIMA 2007. pp. 371–377. http://dx.doi.org/10.1109/ICCIMA.2007.161.
Nepal, S., Ranjan, R., Choo, K.-K.R., 2015. Trustworthy processing of healthcare big data in hybrid clouds. IEEE Cloud Comput. 2, 78–84. http://dx.doi.org/10.1109/MCC.2015.36.
Ng, T., Chang, S., 2004. A Data Set of Authentic and Spliced Image Blocks.
Pan, X., Lyu, S., 2010. Region duplication detection using image feature matching. IEEE Trans. Inf. Forensics Secur. 5, 857–867.
Peng, F., Nie, Y.Y., Long, M., 2011. A complete passive blind image copy-move forensics scheme based on compound statistics features. Forensic Sci. Int. 212, e21–e25. http://dx.doi.org/10.1016/j.forsciint.2011.06.011.
Peng, J., Choo, K.K.R., Ashman, H., 2016b. Bit-level N-gram based forensic authorship analysis on social media: identifying individuals from linguistic profiles. J. Netw. Comput. Appl. http://dx.doi.org/10.1016/j.jnca.2016.04.001.
Peng, J., Choo, K.K.R., Ashman, H., 2016a. Astroturfing detection in social media: Using binary n-gram analysis for authorship attribution. In Proceedings of 15th IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom 2016), IEEE Computer Society Press. http://dx.doi.org/10.1109/TrustCom/BigDataSE/ISPA.2016.53.
Philip Chen, C.L., Zhang, C.Y., 2014. Data-intensive applications, challenges, techniques and technologies: a survey on Big Data. Inf. Sci. (Ny.) 275, 314–347. http://dx.doi.org/10.1016/j.ins.2014.01.015.
Piccardi, M., 2004. Background subtraction techniques: a review. 2004 IEEE Int. Conf. Syst. Man Cybern. (IEEE Cat. No.04CH37583) 4, pp. 3099–3104. doi:10.1109/ICSMC.2004.1400815.
Pooranian, Z., Harounabadi, A., Shojafar, M., Hedayat, N., 2011. New hybrid algorithm for task scheduling in grid computing to decrease missed task. J. World Acad. Sci. Eng. Technol. 5, 786–790.
Popescu, A.C., Farid, H., 2004. Exposing Digital Forgeries By Detecting Duplicated
Image Regions.
Pun, C., Yuan, X., Bi, X., 2015. Oversegmentation and Feature Point Matching. IEEE Trans. Inf. Forensics Secur. 10, 1705–1716.
Quick, D., Choo, K.K.R., 2014a. Google drive: forensic analysis of data remnants. J. Netw. Comput. Appl. 40, 179–193. http://dx.doi.org/10.1016/j.jnca.2013.09.016.
Quick, D., Choo, K.K.R., 2014b. Impacts of increasing volume of digital forensic data: a survey and future research challenges. Digit. Investig. 11, 273–294. http://dx.doi.org/10.1016/j.diin.2014.09.002.
Quick, D., Choo, K.K.R., 2014c. Data reduction and data mining framework for digital forensic evidence: Storage, intelligence, review and archive. Trends Issues Crime Crim. Justice 480, 1–6. http://aic.gov.au/media_library/publications/tandi_pdf/tandi480.pdf.
Quick, D., Choo, K.K.R., 2016. Big forensic data reduction: digital forensic images and electronic evidence. Clust. Comput. 19 (2), 723–740. http://dx.doi.org/10.1007/s10586-016-0553-1.
Quick, D., Martini, B., Choo, K.K.R., 2013. Cloud storage forensics. Syngress, an Imprint of Elsevier. http://www.sciencedirect.com/science/book/9780124199705.
Rostirolla, G., da Rosa Righi, R., dos Reis, E.S., Fischer, G., Chang, V., Ramachandran, M., 2016. IDAC: A Sensor-Based Model for Presence Control and Idleness Detection in Brazilian Companies, in: The First International Conference on Internet of Things and Big Data, Special Session, Recent Advancement in Internet of Things, Big Data and Security (RAI).
Ryu, S.J., Kirchner, M., Lee, M.J., Lee, H.K., 2013. Rotation invariant localization of duplicated image regions based on zernike moments. IEEE Trans. Inf. Forensics Secur. 8, 1355–1370. http://dx.doi.org/10.1109/TIFS.2013.2272377.
Ryu, S.J., Lee, M.J., Lee, H.K., 2010. Detection of copy-rotate-move forgery using zernike moments, in: 12th International Conference. pp. 51–65. http://dx.doi.org/10.1007/978-3-642-16435-4_5.
Shao, H., Yu, T., Xu, M., Cui, W., 2012. Image region duplication detection based on circular window expansion and phase correlation. Forensic Sci. Int. 222, 71–82. http://dx.doi.org/10.1016/j.forsciint.2012.05.002.
Shen, X., Zhu, Y., Lv, Y., Chen, H., 2013. Image Copy-Move Forgery Detection Based on SIFT and Gray Level, in: International Conference on Information Technology and Management Innovation (ICITMI2012). pp. 3021–3024. http://dx.doi.org/10.4028/www.scientific.net/AMM.263-266.3021.
Shin, Y., 2013. Fast Detection of Copy-Move Forgery Image using DCT. J. Korea Multimed. Soc. 16, 411–417.
Shojafar, M., Cordeschi, N., Baccarelli, E., 2016. Energy-efficient Adaptive Resource Management for Real-time vehicle Cloud Services. IEEE Trans. Cloud Comput., 1–14. http://dx.doi.org/10.1109/TCC.2016.2551747.
Silva, E., Carvalho, T., Ferreira, A., Rocha, A., 2015. Going deeper into copy-move forgery detection: Exploring image telltales via multi-scale analysis and voting processes. J. Vis. Commun. Image Represent. 29, 16–32. http://dx.doi.org/10.1016/j.jvcir.2015.01.016.
Singh, J., Raman, B., 2012. A high performance copy-move image forgery detection scheme on GPU. Adv. Intell. Soft Comput. 131 AISC, 239–246. http://dx.doi.org/10.1007/978-81-322-0491-6_23.
Smith, M., Szongott, C., Henne, B., Voigt, G. Von, 2012. Big Data Privacy Issues in Public Social Media, in: 2012 6th IEEE International Conference on IEEE Digital Ecosystems Technologies (DEST), pp. 1–6. doi:10.1109/DEST.2012.6227909.
Tavoli, R., Kozegar, E., Shojafar, M., Soleimani, H., Pooranian, Z., 2013. Weighted PCA for improving Document Image Retrieval System based on keyword spotting accuracy. 2013 36th Int. Conf. Telecommun. Signal Process. TSP, 2013, pp. 773–777. http://dx.doi.org/10.1109/TSP.2013.6614043.
Tijdink, J.K., Verbeke, R., Smulders, Y.M., 2014. Publication Pressure and Scientific Misconduct in Medical Scientists. J. Empir. Res 9, 64–71. http://dx.doi.org/10.1177/1556264614552421.
Ting, Z., Rang-Ding, W., 2009. Copy-move forgery detection based on SVD in digital image, in: 2nd International Congress on Image and Signal Processing, CISP'09. pp. 0–4. http://dx.doi.org/10.1109/CISP.2009.5301325.
Tola, E., Lepetit, V., Fua, P., 2010. DAISY: An Efficient Dense Descriptor Applied To Wide-Baseline Stereo. IEEE Trans. Pattern Anal. Mach. Intell. 32, 815–830. http://dx.doi.org/10.1109/TPAMI.2009.77.
Tralic, D., Zupancic, I., Grgic, S., Grgic, M., 2013. CoMoFoD - New Database for Copy-Move Forgery Detection, in: Proceedings of 55th International Symposium ELMAR-2013. pp. 25–27.
Ulutas, G., Ulutas, M., 2013. Image forgery detection using Color Coherence Vector. 2013 Int. Conf. Electron. Comput. Comput. ICECCO 2013 107–110. http://dx.doi.org/10.1109/ICECCO.2013.6718240.
Ulutaş, G., Ulutaş, M., Nabiyev, V.V., 2013. Copy Move Forgery Detection based on LBP, in: 21st Signal Processing and Communications Applications Conference (SIU).
Vincent Christlein, C.R. and E.A.P., 2010. On Rotation Invariance In Copy-Move
Forgery Detection, in: IEEE International Workshop on Information Forensics
and Security.
Wang, T., Tang, J., Zhao, W., Xu, Q., Luo, B., 2012. Blind detection of copy-move forgery based on multi-scale autoconvolution invariants. Commun. Comput. Inf. Sci., 438–446. http://dx.doi.org/10.1007/978-3-642-33506-8_54.
Wu, Q., Wang, S., Zhang, X., 2010. Detection of image region-duplication with rotation and scaling tolerance, in: Second International Conference, ICCCI. pp. 100–108. http://dx.doi.org/10.1007/978-3-642-16693-8_11.
Xu, D., Ren, P., Sun, L., Song, H., 2016. Precoder-and-receiver design scheme for multi-user coordinated multi-point in LTE-A and fifth generation systems. IET Commun. 10, 292–299. http://dx.doi.org/10.1049/iet-com.2015.0229.
Xu, Z., Zhang, H., Sugumaran, V., Choo, K.-K.R., Mei, L., Zhu, Y., 2016. Participatory sensing-based semantic and spatial analysis of urban emergency events using mobile social media. EURASIP J. Wirel. Commun. Netw. 2016, 44. http://dx.doi.org/10.1186/s13638-016-0553-0.
Yang, B., Sun, X., Chen, X., Zhang, J., Li, X., 2013. An efficient forensic method for copy move forgery detection based on DWT-FWHT. Radio Eng. 22, 1098–1105.
Yang, Q.-C., Huang, C.-L., 2009. Copy-move forgery detection in digital image, in: 10th Pacific Rim Conference on Multimedia. pp. 816–825.
Yu, L., Han, Q., Niu, X., 2014. Feature point-based copy-move forgery detection: covering the non-textured areas. Multimed. Tools Appl. http://dx.doi.org/10.1007/s11042-014-2362-y.
Zhang, J., Feng, Z., Su, Y., 2008. A new approach for detecting copy-move forgery in digital images, in: 11th IEEE Singapore International Conference on Communication Systems, ICCS 2008. pp. 362–366. http://dx.doi.org/10.1109/ICCS.2008.4737205.
Zhao, J., 2010. Detection of copy-move forgery based on one improved LLE method. 2nd IEEE Int. Conf. Adv. Comput. Control 4, 547–550. http://dx.doi.org/10.1109/ICACC.2010.5486861.
Zhao, J., Zhao, W., 2013. Passive forensics for region duplication image forgery based on harris feature points and local binary patterns. Math. Probl. Eng., 2013. http://dx.doi.org/10.1155/2013/619564.
Zhao, J., Guo, J., 2013. Passive forensics for copy-move image forgery using a method based on DCT and SVD. Forensic Sci. Int. 233, 158–166. http://dx.doi.org/10.1016/j.forsciint.2013.09.013.
Zhao, L., Chen, L., Ranjan, R., Choo, K-K R., He, J., 2016. Geographical information
system parallelization for spatial big data processing: a review. Cluster Comput.
Zheng, J., Chang, L., 2014. Detection Technology of Tampering Image Based on Harris Corner Points. J. Comput. Inf. Syst. 10, 1481–1488. http://dx.doi.org/10.12733/jcis9302.
Nor Bakiah Abd Warif is a PhD student at the Faculty of Computer Science and Information Technology, University of Malaya. She received her Bachelor's degree in Information Technology from the National University of Malaysia. Her research interests are in the areas of image processing and image forensics, especially copy-move forgery detection.
Ainuddin Wahid Abdul Wahab received a PhD from Surrey University, United Kingdom. He is currently a Senior Lecturer at the Department of Computer System and Technology, Faculty of Computer Science and Information Technology, University of Malaya. His research interests include security services, steganography, network security, digital forensics and information hiding.
Mohd. Yamani Idna Idris is a Senior Lecturer at the Department of Computer System and Technology, Faculty of Computer Science and Information Technology, University of Malaya. He holds a PhD, also from the University of Malaya, and his research interests include image processing and computer vision, digital signal processing, embedded systems, hardware-based computer security, and sensor networks.
Roziana Ramli is a PhD student at the Faculty of Computer Science and Information Technology, University of Malaya. She received her Bachelor's and Master's degrees in Engineering from University of Malaya. Her research interests include medical image processing and analysis.
Rosli Salleh is currently an Associate Professor at the Faculty of Computer Science and Information Technology, University of Malaya. He has a PhD from Salford University, United Kingdom. His major research interest includes wireless communication and technologies.
Shahaboddin Shamshirband received a PhD from University of Malaya, Malaysia. He is currently a Senior Lecturer at the Department of Computer System and Technology, Faculty of Computer Science and Information Technology, University of Malaya. His primary research area lies within computational intelligence, multi-agent systems, and machine learning in engineering applications of artificial intelligence. In addition, he is working on a High Impact Research grant funded by University of Malaya. Currently, he is a Co-PI of the NFR joint project by Hanyang University (South Korea), and the UMRG Program by University of Malaya. He has published more than 200 ISI-cited articles and numerous conference proceedings. He is a member of IEEE, and also an editorial board member and reviewer for many journals.
Kim-Kwang Raymond Choo received the PhD in Information Security from Queensland University of Technology, Australia. He currently holds the Cloud Technology Endowed Professorship at The University of Texas at San Antonio, and is an associate professor at The University of South Australia. He was named one of 10 Emerging Leaders in the Innovation category of The Weekend Australian Magazine / Microsoft's Next 100 series in 2009, and is the recipient of various awards including ESORICS 2015 Best Research Paper Award, Highly Commended Award from Australia New Zealand Policing Advisory Agency, British Computer Society's Wilkes Award, Fulbright Scholarship, and 2008 Australia Day Achievement Medallion. He is a Fellow of the Australian Computer Society, and a Senior Member of IEEE.
