of the stego-algorithm {F, F⁻¹} is the same as the security of the Universal Stego System, and the two are represented as μ{F, F⁻¹} and μ(Ω) respectively. Therefore μ{F, F⁻¹} and μ(Ω) are one and the same:

μ{F, F⁻¹} = μ(Ω)   (2)
The security of any stego-system Ψ = {F, F⁻¹, C, S, I} is given as μ(Ψ), and Ψ is ε-secure (that is, μ(Ψ) = ε) if H(Pc || Ps) ≤ ε. But this has a very narrow connotation, as the stego-algorithm {F, F⁻¹} has to operate not just on one C, one S and one I but on every C ∈ 𝒞, every S ∈ 𝒮 and every I ∈ ℐ. Still, the concept of the security of a single stego-system Ψ = {F, F⁻¹, C, S, I} forms the basic building block of the concept of the security of the stego-algorithm {F, F⁻¹} in the Universal Stego System Ω = {F, F⁻¹, 𝒞, 𝒮, ℐ}, where 𝒞, 𝒮 and ℐ are the sets of all cover images, stego images and information respectively. This is because if a stego-algorithm {F, F⁻¹} is ε-secure (i.e. μ{F, F⁻¹} = μ(Ω) = ε), then ε is the maximum value of the security μ(Ψ) over the stego-systems Ψ built from it, attained for some Ψ. Mathematically this can be written as:

μ(Ω) = MAX over Ψ ∈ Ω of μ(Ψ)   (3)
Thus the security of the stego-algorithm {F, F⁻¹}, or of the Universal Stego System Ω, is defined in terms of the stego-systems Ψi. According to Cachin, a stego-system is perfectly secure if H(Pc || Ps) = 0, which is
Mathematical Modeling Of Image Steganographic System
www.iosrjournals.org 3 | Page
possible only when Pc = Ps; in such cases the receiver is unable to distinguish between C and S, as their probability distributions are the same. This represents Shannon's notion of perfect secrecy for cryptosystems [4]. However, Chandramouli et al., in "Steganography Capacity: A Steganalysis Perspective" [4], have pointed out that this definition of the security of a stego-system is purely theoretical in nature, because it assumes the cover-object C to be perfectly random. In reality an image is not random, and in some cases it is possible to steganalyse the image even if the probability distributions of C and S are the same. Hence, in addition to the parameter ε, some more parameters of the security of any Universal Stego System are devised.
2.2 Preliminaries and Definitions
Using Cachin's information-theoretic model [3] and Chandramouli's mathematical formulation of a steganalytic problem [6], and extending both to image-based stego-systems, a method is devised for representing such a system mathematically. Based on this mathematical model, a technique is devised for steganalysis of the stego image.
Before we proceed to the mathematical model of an image-based stego-system, we must mathematically define the preliminary concepts used in this model.
Definition 1 (Image)
Every digital image is a collection of discrete picture elements, or pixels. Let M be any digital image with N pixels. Any particular pixel of image M is represented as M(z), where z can be any value from 1 to N. M(z) can be the gray-level intensity of the pixel in a gray-scale image, or the RGB or YCbCr value of the pixel in a color image. Thus M(z) can be the set {R(z), G(z), B(z)} or its equivalent gray-scale representation (R(z)+G(z)+B(z))/3. It is, however, always better to consider the R, G and B components individually, because the averaging effect causes loss of vital steganographic information. Further, <{M}, m> is the multiset of image M such that M(z) ∈ {M} for every z = 1 to N, and m is the vector of the occurrence counts of every element M(z) in {M}.
Mathematically, an image M with N pixels is:

M = <{M}, m>, where {M} = {M(z) : 1 ≤ z ≤ N}   (4)
Definition 2 (Identical Images)
Two images M and L with N pixels are said to be identical (represented as M ≡ L) if they have a pixel-to-pixel match. This means the two images are absolutely the same, and their difference image D = M − L is a pure black image, corresponding to the zero matrix:

M ≡ L ⇔ M(z) = L(z) ∀ z: 1 ≤ z ≤ N ⇔ D = M − L = 0   (5)
Definition 3 (Probability distribution of Image)
The probability distribution, or probability mass function, represented as P(M) for an image M = <{M}, m>, is the multiset <{M}, m'> where m' = m / n(<{M}, m>), and n(<{M}, m>) is the cardinality (number of elements) of the multiset of the image M, or simply the total number of pixels in M. The same is expressed mathematically in (6):

P(M) = <{M}, m/N>   (6)
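Definition 3 can be sketched directly from the multiset of Definition 1. Below is a minimal illustration; the function name `pmf` and the flat pixel list are conveniences of this sketch, not notation from the paper:

```python
from collections import Counter

def pmf(pixels):
    """Probability mass function of an image given as a flat list of pixel
    values (Definition 3): occurrence counts normalized by the pixel count."""
    counts = Counter(pixels)          # the multiset <{M}, m>
    n = len(pixels)                   # n(<{M}, m>): total number of pixels
    return {value: c / n for value, c in counts.items()}

# A toy 2x2 gray-scale "image", flattened to 4 pixels.
M = [10, 10, 20, 30]
print(pmf(M))  # {10: 0.5, 20: 0.25, 30: 0.25}
```

For a real image, the flat list would come from flattening the 2-D pixel array (per channel in the RGB case).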
Definition 4 (Macro statistically Same Images)
Two images M and L with N pixels are said to be macro-statistically the same (represented as M ≅ L) if they have equal entropy, energy, contrast ratio and brightness, and the same histograms. However, this does not mean that they have a pixel-to-pixel match; they may not be identical. It simply means that the probability distributions of their pixels are equal. Thus if M ≅ L then <{M}, m> = <{L}, l>, or, in terms of probability distributions, P(M) = P(L). In other words, images M and L have the same number of occurrences of any given pixel intensity, but it is not necessary that M(z) = L(z) for any particular z from 1 to N. Thus:

M ≅ L ⇔ P(M) = P(L)   (7)
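The contrast between Definitions 2 and 4 can be checked directly on small pixel lists; the helper names `identical` and `macro_same` are illustrative:

```python
from collections import Counter

def identical(M, L):
    """Definition 2: pixel-to-pixel match."""
    return M == L

def macro_same(M, L):
    """Definition 4: equal pixel-value distributions (same histogram),
    regardless of where in the image the values sit."""
    return Counter(M) == Counter(L)

M = [10, 20, 20, 30]
L = [20, 10, 30, 20]     # same multiset of values, permuted positions
print(identical(M, L))   # False
print(macro_same(M, L))  # True
```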
Definition 5 (Neighborhood or Locality of Pixel)
Let η(M(z)) be the set of neighboring pixels of any pixel M(z) in image M. Then any ni ∈ η(M(z)) is such that d(ni, M(z)) ≤ ρ, where d is a function which calculates the distance (Euclidean, city-block, chessboard, or any other type, depending upon the steganographic algorithm) between its inputs (i.e. ni and M(z)), and ρ is the measurement of the degree of neighborhood. ρ should be minimal (generally equal to 1 pixel) but also depends upon the steganographic algorithm used by the stego-system. Mathematically this can be represented as:

η(M(z)) = {ni : d(ni, M(z)) ≤ ρ}   (8)
In Fig. 1 an arbitrary pixel Y is shown with its neighbors P, Q, R, S, T, U, V and W:

P Q R
S Y T
U V W

The set of the absolute differences of the adjacent neighbors of M(z) among themselves is given as Δ(η(M(z))); the mean of the values of Δ(η(M(z))) is given as mean(Δ(η(M(z)))), and the standard deviation of the values of Δ(η(M(z))) is given as σ(Δ(η(M(z)))). Since M(z) is itself an immediate neighbor of every element of η(M(z)), for the pixel Y above:

Δ(η(Y)) = {|P−Q|, |Q−R|, |R−T|, |T−W|, |W−V|, |V−U|, |U−S|, |S−P|}   (9)

The aberration of pixel Y with respect to its neighborhood η(Y) is given as ζ(Y, η(Y)): the deviation of Y from its neighborhood, measured in units of σ(Δ(η(Y))). Mathematically:

ζ(M(z), η(M(z))) = (M(z) − mean(η(M(z)))) / σ(Δ(η(M(z))))   (10)
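These neighborhood quantities can be made concrete with a small sketch. It assumes that the aberration of a pixel is its deviation from the neighborhood mean, expressed in units of the standard deviation of the adjacent-neighbor absolute differences; both this exact form and the function names are illustrative assumptions of the sketch:

```python
import statistics

def aberration(center, ring):
    """Pixel aberration of a pixel against its 8-neighborhood, with the
    ring listed in adjacency order (e.g. P, Q, R, T, W, V, U, S)."""
    # Absolute differences of adjacent neighbors among themselves:
    # |P-Q|, |Q-R|, ..., wrapping around the ring back to |S-P|.
    diffs = [abs(ring[i] - ring[(i + 1) % len(ring)]) for i in range(len(ring))]
    sigma = statistics.pstdev(diffs)
    if sigma == 0:
        return 0.0  # perfectly uniform neighborhood: no measurable aberration
    return (center - statistics.mean(ring)) / sigma

ring = [10, 11, 12, 11, 10, 12, 11, 10]
print(aberration(12, ring))   # a smooth patch: Y barely deviates
print(aberration(200, ring))  # a suspicious patch: Y far above its neighbors
```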
Definition 8 (Pixel Aberration of Image)
In any image M with N pixels, the pixel aberration of the image is given as ζ̄(M). It is the weighted mean of the moduli of the pixel aberrations of all the pixels of the image M. Since for any image M the value ζ(M(z), η(M(z))) measures the deviation of M(z) from its neighborhood η(M(z)) in terms of standard deviations, the majority of pixels have this value located close to zero, and approximately more than 68% of the pixels have pixel aberration within 1σ (as per the 3-sigma, or 68-95-99.7, rule of statistics). Hence the simple mean of ζ(M(z), η(M(z))) is very close to zero and is insignificantly small for all images. Since pixel-aberration analysis has to identify those images which have large pixel aberrations, as a remedy very small weights are assigned to the less deviated values (the majority of pixels, which have low pixel-aberration values) and large weights are assigned to the more deviated values (the few pixels with large pixel aberrations). Thus the value of ζ̄(M) for an image M with N pixels is given as:

ζ̄(M) = Σ(z=1..N) W(z)·|ζ(M(z), η(M(z)))| / Σ(z=1..N) W(z)   (11)
The weight W(z) for the pixel M(z) is much smaller for small values of ζ(M(z), η(M(z))) and quite large for big values of ζ(M(z), η(M(z))). Thus W(z) is large for pixels having greater pixel aberration and very small for pixels having lesser pixel aberration. Such weights can be computed by taking the cube of the pixel aberration expressed in units of standard deviations. In other words, the weight W(z) for any pixel M(z) in image M is given as:

W(z) = | [ζ(M(z), η(M(z))) − MEAN(z=1..N)(ζ(M(z), η(M(z))))] / STD(z=1..N)(ζ(M(z), η(M(z)))) |^3   (12)
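The weighted mean of (11) with the cube weights of (12) can be sketched as follows; the aberration values are fed in as a precomputed list, and the names are illustrative:

```python
import statistics

def weighted_mean_aberration(zeta):
    """Weighted mean pixel aberration of an image, per (11)-(12): weights
    are the cubed standardized aberrations, so the few large deviations
    dominate the otherwise near-zero mean."""
    mu = statistics.mean(zeta)
    sigma = statistics.pstdev(zeta)
    w = [abs((z - mu) / sigma) ** 3 for z in zeta]               # W(z), Eq. (12)
    return sum(wz * abs(z) for wz, z in zip(w, zeta)) / sum(w)   # Eq. (11)

# Mostly tiny aberrations plus two strong outliers:
zeta = [0.1, -0.2, 0.05, 0.0, -0.1, 3.5, -2.8, 0.15]
print(statistics.mean(abs(z) for z in zeta))  # simple mean: diluted by small values
print(weighted_mean_aberration(zeta))         # weighted mean: pulled toward outliers
```

This illustrates why the weighting matters: the simple mean hides the two anomalous pixels, while the weighted mean exposes them.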
Although we may avoid taking a weighted mean and use a simple mean instead, in that case we must consider, for determining the mean, only those values of ζ(M(z), η(M(z))) which are above or below a certain threshold, filtering out the rest. This threshold is generally given in terms of the standard deviation of ζ(M(z), η(M(z))) for z = 1 to N, and is represented as λ. Thus the mean pixel aberration of image M at threshold λ is represented as ζ̄(M, λ) and mathematically defined as:

ζ̄(M, λ) = mean of { |ζ(M(z), η(M(z)))| : |ζ(M(z), η(M(z)))| > λ, 1 ≤ z ≤ N }   (13)
The appropriate value of λ depends on the smoothness of the cover image and on the type of aberration we are interested in. In unsmooth cover images the differences between pixels and their neighbors are quite large (for example, in an image of a forest or a valley), and hence ζ̄(M, λ) at a larger λ represents the mean of only those deviations which are larger than λ. For smooth cover images, like a clear blue sky, the aberration is already very low, and hence a smaller value of λ produces good results.
Definition 9 (Range of Pixel Aberration in the Image)
In any image M with N pixels, the range of pixel aberration of the image is given as Rζ(M). It is the difference between the maximum pixel aberration ζmax(M) and the minimum pixel aberration ζmin(M) in the image M. Mathematically, the same can be expressed as:

Rζ(M) = ζmax(M) − ζmin(M)   (14)
Definition 10 (Maximum Deviation in the Pixel Aberration of the Image)
In any image M with N pixels, the maximum pixel aberration in M, given as ζmax(M), is the maximum pixel aberration in absolute terms in the image M. The threshold corresponding to ζmax(M) is represented as λmax. Thus:

ζmax(M) = MAX(z=1..N) |ζ(M(z), η(M(z)))|   (15)
λmax = the value of λ corresponding to ζmax(M)   (16)
2.3 Detailed Mathematical Model of any Image based Stego Algorithm
In Equation (2) it has been very clearly shown that the security of any stego-system Ψ = {F, F⁻¹, C, S, I} is the basic building block of the security of the stego-algorithm {F, F⁻¹}. So, for the sake of simplicity, it is better to operate on a stego-system only. Let Ψ = {F, F⁻¹, C, S, I} be any image steganographic system, with F, F⁻¹, C, S and I having the same meanings as in the previous section. Thus S = F(C, I) and I = F⁻¹(S) also hold. Now let us assume that the cover image C consists of N discrete pixels, represented by C(1), C(2), ..., C(N). Although the cover image C is meant for storing the information I, any arbitrary pixel C(z) of C can at most store only a limited part of I. Let this small part of I stored in C(z) be represented as I(z). Thus our information I can be broken into K parts, represented by I(1), I(2), ..., I(K), K ≤ N, such that I(z) is the information stored in the particular pixel C(z) for any z ≤ N. If the information I is smaller than the cover image C, i.e. if K < N, then the remaining I(z) for z = K+1 to N can be taken to be empty, or the null set, given as I(z) = { } for z = K+1 to N. Thus the cardinalities of both I and C (given as n(I) and n(C) respectively) are made equal, i.e. N. Since S = F(C, I), corresponding to every C(z) in C we have a unique S(z) in S. Using the notation of set theory, the same is written mathematically in (17):

S = { S(z) : S(z) = F(C(z), I(z)), 1 ≤ z ≤ N }   (17)
The stego-function F : (C, I) → S can be redefined at pixel level as S(z) = θ(z)·[C(z) ⊗ I(z)], where ⊗ is the operator used by the stego-function, acting over C and I to produce S, and θ(z) ≠ 0 is a factor which strengthens F for z = 1 to N. Thus θ(z) ∀ z: 1 ≤ z ≤ N is the strengthening factor of the stego-system and helps it achieve ε-security (i.e. θ(z) for z = 1 to N is the factor which helps in achieving μ(Ψ) = ε).
The inverse stego-function F⁻¹ : S → I can be redefined at pixel level as I(z) = ⊘(S(z)), where ⊘ is a unary operator used by F⁻¹, acting on S to produce I and hence, indirectly, C as well. Algorithmically, the unary operator ⊘ is the inverse of the operator ⊗.
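As one concrete, minimal instance of this pixel-level operator pair, the sketch below uses 1-bit LSB substitution with the strengthening factor fixed to 1. This is an illustrative choice for exposition, not the paper's algorithm:

```python
def embed_pixel(c, i_bit):
    """A concrete instance of S(z) = theta(z)*[C(z) (x) I(z)]: 1-bit LSB
    substitution, with theta(z) = 1 and the embedding operator replacing
    the lowest bit of the cover pixel by the information bit."""
    return (c & ~1) | i_bit

def extract_pixel(s):
    """The matching unary inverse operator: read the lowest bit back."""
    return s & 1

cover = [100, 101, 102, 103]
bits = [1, 0, 1, 1]
stego = [embed_pixel(c, b) for c, b in zip(cover, bits)]
print(stego)                               # [101, 100, 103, 103]
print([extract_pixel(s) for s in stego])   # [1, 0, 1, 1]
```

Note how extraction recovers I without needing C, mirroring I = F⁻¹(S).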
2.3.1 Parameters for Measuring Strength of Stego Algorithm
The strengthening factor θ(z) ∀ z: 1 ≤ z ≤ N keeps S(z) least susceptible to steganalysis attacks by making S perfectly resemble an innocent image, i.e. one without any distortions. Therefore θ(z) has to meet four main requirements, which are explained next.
Requirement 1
Using operator ⊗, θ(z) should map C(z) and I(z) to S(z) in such a way that the relative entropy of the cover and stego image, given as H(P(C) || P(S)), is the minimum possible. Here P(C) is the probability distribution of C, P(S) is the probability distribution (probability mass function) of S, and H(P(C) || P(S)) is the relative entropy of P(C) with respect to P(S). This requirement is derived from Equation (1) mentioned in Section 2.2. It simply means that the macro-statistical parameters of the cover image C and the stego image S should be almost the same or, in terms of relative entropy, that α should be the minimum possible. This requirement is an extension of Cachin's information-theoretic model in terms of θ. Mathematically, θ(z) should be such that:

α = H(P(C) || P(S)) is the minimum possible   (18)
where P(C) and P(S) are the probability distributions of C(z) and S(z) ∀ z: 1 ≤ z ≤ N, and such a stego-system is said to be α-secure.
In order to achieve this requirement, the stego-function F : (C, I) → S will macro-statistically redistribute the pixels of C in such a way that, even though corresponding pixels C(z) and S(z) may not be the same, the probability distributions of the pixels C(z) in C and S(z) in S for z = 1 to N remain the same; that is, C ≅ S is achieved. Thus, by fulfilling this requirement (assuming α = 0), the cover image and the stego image will have the same histogram, brightness, entropy, energy, contrast ratio and all other macro-statistical parameters, even if C ≢ S, i.e. even if C(z) ≠ S(z) for some z: 1 ≤ z ≤ N.
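Requirement 1 can be checked numerically by computing the relative entropy of the two pixel-value histograms. A minimal sketch follows, using the usual convention that a cover value absent from the stego histogram makes the divergence infinite:

```python
import math
from collections import Counter

def relative_entropy(cover, stego):
    """H(P(C) || P(S)) over pixel-value histograms (Requirement 1)."""
    n_c, n_s = len(cover), len(stego)
    p = {v: c / n_c for v, c in Counter(cover).items()}
    q = {v: c / n_s for v, c in Counter(stego).items()}
    total = 0.0
    for v, pv in p.items():
        if v not in q:
            return math.inf   # support mismatch: unbounded divergence
        total += pv * math.log2(pv / q[v])
    return total

C = [0, 0, 1, 1, 2, 2, 2, 3]
S_good = [1, 0, 0, 1, 2, 2, 3, 2]   # same histogram, permuted -> alpha = 0
S_bad = [0, 0, 0, 1, 2, 2, 2, 3]    # histogram disturbed -> alpha > 0
print(relative_entropy(C, S_good))  # 0.0
print(relative_entropy(C, S_bad))
```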
Requirement 2
If only Requirement 1 is met, we may have a situation where, even though the cover image looks macro-statistically the same as the stego image (in terms of histogram, brightness, entropy, energy, contrast ratio, etc.), the two still have significantly different pixel-to-pixel correspondence; i.e. a particular pixel S(z) of S may be considerably different from C(z) of C, revealing distortions in S(z) and making S susceptible to steganalysis. Thus, in addition to the macro-statistical redistribution of the pixels of the cover image (as mentioned in Requirement 1), the stego-algorithm must redistribute the pixels of the neighborhood of every pixel C(z) in C (i.e. ∀ z: 1 ≤ z ≤ N) in such a way that the two corresponding pixels C(z) and S(z) have the same probability distribution over their neighborhoods. Thus θ(z) should meet another requirement:
Using operator ⊗, θ(z) should map C(z) and I(z) to S(z) in such a way that the relative entropy between the neighborhoods of C(z) and S(z) (the local relative entropy) is the least possible ∀ z: 1 ≤ z ≤ N. Thus any image-based stego-system is said to be β-secure if the mean of the relative entropies of the neighborhoods of C(z) and S(z), over all C(z) in C and S(z) in S (that is, ∀ z: 1 ≤ z ≤ N), is β. Thus θ(z) should be such that β is minimum, where β is given as:

β = (1/N) Σ(z=1..N) H(P(η(C(z))) || P(η(S(z))))   (19)

Here P(η(C(z))) is the probability distribution of the pixels in the neighbourhood of pixel C(z), and P(η(S(z))) is the probability distribution of the pixels in the neighbourhood of pixel S(z).
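The mean local relative entropy of (19) can be sketched with each pixel's neighborhood supplied as a small list of values; the helper names and the toy neighborhoods are assumptions of the sketch:

```python
import math
from collections import Counter

def kl(p_counts, q_counts):
    """Relative entropy between two neighborhoods' value counts."""
    n_p, n_q = sum(p_counts.values()), sum(q_counts.values())
    total = 0.0
    for v, c in p_counts.items():
        if v not in q_counts:
            return math.inf
        total += (c / n_p) * math.log2((c / n_p) / (q_counts[v] / n_q))
    return total

def beta(cover_neigh, stego_neigh):
    """Mean of the local relative entropies over all pixels z, Eq. (19)."""
    return sum(kl(Counter(c), Counter(s))
               for c, s in zip(cover_neigh, stego_neigh)) / len(cover_neigh)

# Neighborhoods (8 pixels each) of two pixels, in cover and stego images:
cov = [[10, 10, 11, 11, 12, 12, 10, 11], [20, 20, 21, 21, 22, 22, 20, 21]]
stg = [[10, 11, 10, 11, 12, 12, 11, 10], [20, 21, 20, 21, 22, 22, 21, 20]]
print(beta(cov, stg))  # 0.0 here: each neighborhood keeps its distribution
```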
Requirement 3
Most spatial-domain stego-algorithms distribute the entire information over a large number of pixels. As a result, the changes in the pixel values are very small and unnoticeable, but in the process a large number of pixels in the image change, and hence the relative entropy of the stego image and cover image increases due to the considerable change in the probability distribution of pixels in the image. The security of such algorithms can be defined by Requirement 1 and Requirement 2, that is, by α and β.
But there are certain image stego-algorithms which concentrate the information in very few pixels. As a result, the change in the values of those few pixels is very large, and hence quite perceptible, even though the probability distribution of pixels is not much disturbed. For such algorithms, even if α and β are very small, the stego image may have a few grains in the last few rows (the grains are due to large and perceptible changes in those few pixels; changes in the bottom-most pixels usually go unnoticed due to psycho-visual weaknesses of the human eye) and is susceptible to steganalysis. In any natural image a pixel is almost the same as its neighbors; therefore, on average, C(z) will not be very different from η(C(z)) for most values of z = 1 to N. Thus θ(z) should meet another requirement:
Using operator ⊗, θ(z) should map C(z) and I(z) to S(z) in such a way that no particular pixel changes much. Thus the difference between the weighted mean pixel aberrations of the stego image S and the cover image C (Definition 8) should be the minimum possible. For a color image, the weighted mean pixel aberration can be calculated either by taking the maximum of the red, green and blue values or by taking the average of the red, green and blue values. Hence, mathematically, the difference between the weighted mean pixel aberrations of stego image S and cover image C is represented as Δ_MAX or Δ_MEAN and given as:

Δ_MAX = MAX_RGB(ζ̄(S)) − MAX_RGB(ζ̄(C))
or
Δ_MEAN = MEAN_RGB(ζ̄(S)) − MEAN_RGB(ζ̄(C))   (20)
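The two variants in (20) can be sketched as follows; the per-channel weighted mean aberrations are passed in as precomputed values, and the sample numbers are purely illustrative:

```python
def delta_max_mean(zbar_cover, zbar_stego):
    """Eq. (20): difference of weighted mean pixel aberrations, combined
    either by MAX or by MEAN over the RGB channels. Inputs are dicts of
    per-channel zeta-bar values."""
    channels = ("R", "G", "B")
    d_max = (max(zbar_stego[ch] for ch in channels)
             - max(zbar_cover[ch] for ch in channels))
    d_mean = (sum(zbar_stego[ch] for ch in channels) / 3
              - sum(zbar_cover[ch] for ch in channels) / 3)
    return d_max, d_mean

# Hypothetical per-channel weighted mean aberrations:
cover = {"R": 2.14, "G": 1.49, "B": 1.30}
stego = {"R": 2.90, "G": 2.10, "B": 1.80}
print(delta_max_mean(cover, stego))
```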
The same can alternatively be represented by finding the difference between the mean pixel aberrations of cover image C and stego image S, considering only those values of the pixel aberrations (of ζ(C(z), η(C(z))) and ζ(S(z), η(S(z))) for z = 1 to N) in the entire image which are above a certain threshold, given as ζ̄(C, λ) and ζ̄(S, λ). Thus θ(z) should be such that the difference between the pixel aberrations of the stego image and the cover image at threshold λ (in terms of standard deviation, it corresponds to a pixel-aberration value of λσ) is the minimum possible; this difference is given as γ(λ).
In unsmooth cover images the aberration is already very high, and the addition of information brings further aberrations (although some efficient stego-algorithms may reduce the aberrations), so if the value of λ is kept large then γ(λ) measures the differences in only those large aberrations. In smooth cover images the aberration is quite low, and hence a lower value of λ is advisable. In some cases γ(λ) may come out negative, which indicates that at threshold λ the stego image has less aberration than the cover image.
Certain steganographic algorithms hide the data very efficiently, so that only a handful of pixels have aberration beyond the prescribed limit. In such cases, weaknesses in these algorithms determined using only a fixed γ(λ) go unnoticed, due to the averaging effect of the large number of pixels having much lower pixel aberration. Moreover, γ(λ) has a different value at every λ. Thus a better estimate than γ(λ) is γ̄, the mean of γ(λ) for continuously increasing λ, from 0 to the value λmax which corresponds to the modulus of the maximum pixel aberration (Definition 10) in the stego image; that is, for λ = 0 to λmax, where:

γ(λ) = ζ̄(S, λ) − ζ̄(C, λ)   (21)
Since calculating the value of γ̄ = (1/λmax) ∫(λ=0..λmax) γ(λ) dλ is practically very expensive in terms of time and computational power, a more practical way to estimate γ̄ is to take the mean of γ(λ) at chosen discrete values of λ, for example λ = 0, λmax/8, 2λmax/8, 3λmax/8, ..., λmax.
Thus, as an indicator of Requirement 3, either Δ (as in (20)) or γ̄ can be considered. Generally the difference of the weighted means of the pixel aberrations of cover image and stego image, as given in (20), is preferable, although this may vary from algorithm to algorithm and situation to situation. Whichever value we consider for obtaining this difference, it is represented by γ in the holistic representation of the requirements of the steganographic system; thus γ is either Δ or γ̄.
Requirement 4
Another very good indicator of the presence of an anomaly in the pixels of an image is the range of pixel aberration in the image (Definition 9). A large value of Rζ despite low values of γ(λ) indicates that only a very few pixels have aberration far beyond the prescribed limit, and hence that the given image could be a potential stego image. Thus, using operator ⊗, θ(z) should map C(z) and I(z) to S(z) in such a way that the range of pixel aberration in the cover image is not very different from the range of pixel aberration in the stego image. Thus the difference of the ranges of pixel aberration of cover and stego image should be the minimum possible, and is given as δ:

δ = (Rζ(S) − Rζ(C)) / Rζ(C)   (22)

Thus δ is the indicator of the percentage change in the range of pixel aberration of the cover image after embedding the data in it.
In a colored image the δ value is different for the red, green and blue components of the image. But we cannot take the average of the three, as δ represents a change in the range of pixel aberration; hence, for an RGB image, δ is given as:

δ = MAX_RGB( (Rζ(S) − Rζ(C)) / Rζ(C) )   (23)

It is also better to mention which color component has the maximum δ in the RGB image.
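The per-channel range comparison behind (22) and (23) can be sketched as follows; the fractional-change form and the sample numbers are assumptions of the sketch:

```python
def delta(range_cover, range_stego):
    """Per-channel fractional change in the range of pixel aberration after
    embedding, reported as the largest change together with the channel
    responsible (Eqs. (22)-(23))."""
    changes = {ch: (range_stego[ch] - range_cover[ch]) / range_cover[ch]
               for ch in range_cover}
    worst = max(changes, key=lambda ch: abs(changes[ch]))
    return changes[worst], worst

# Hypothetical ranges of pixel aberration per channel:
cover = {"R": 5.87, "G": 5.82, "B": 4.55}
stego = {"R": 6.10, "G": 7.90, "B": 4.60}
print(delta(cover, stego))  # the green channel stands out here
```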
2.4 Holistic Representation of Stego System and Universal Stego System Mathematically
Based on these four requirements on θ with regard to the strength of any steganographic system Ψ, we may define the security of Ψ by the four-tuple <α, β, γ, δ> and say that Ψ is μ(Ψ) = <α, β, γ, δ> secure.
Thus an image-based Universal Stego System Ω = {F, F⁻¹, 𝒞, 𝒮, ℐ}, with any stego-system Ψ = {C, S, I, F, F⁻¹} such that Ψ ∈ Ω, can be more elaborately defined at pixel level as:
(24)
Here the stego-algorithm of Ω is Ω(Algorithm) = <⊗, ⊘>, where ⊗ is the operator of F and ⊘ is the operator of F⁻¹, and the strength of Ω is given as μ(Ω) = <α, β, γ, δ>.
Since handling four different values of μ(Ω) is quite difficult, the four values μ(Ω) = <α, β, γ, δ> can be reduced to one value, represented as <μ(Ω)>, by taking the weighted mean of their moduli:

<μ(Ω)> = (w1·|α| + w2·|β| + w3·|γ| + w4·|δ|) / (w1 + w2 + w3 + w4)   (25)

The values of the four weights w1, w2, w3, w4 depend upon the alertness and sensitivity of the steganalysis algorithm with respect to the four strength parameters α, β, γ, δ of the steganographic algorithm. In the most general case we assume that the steganalyst is capable of exploiting any of these four vulnerabilities; the four conditions therefore have equal importance, so w1 = w2 = w3 = w4, and the value of <μ(Ω)> becomes the simple mean of the moduli of <α, β, γ, δ>, given as <μ(Ω)> = (|α| + |β| + |γ| + |δ|)/4.
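The reduction of the four parameters to a single strength value can be sketched directly; with equal weights it reproduces the simple mean of the moduli:

```python
def overall_strength(alpha, beta, gamma, delta, weights=(1, 1, 1, 1)):
    """Eq. (25): weighted mean of the moduli of the four strength
    parameters; with equal weights this is the simple mean."""
    vals = [abs(alpha), abs(beta), abs(gamma), abs(delta)]
    return sum(w * v for w, v in zip(weights, vals)) / sum(weights)

# Values for one stego-system (unsmooth cover, Algorithm II) from Table 4:
print(overall_strength(0.0313, 3.8054, -0.004, -0.0120))  # matches 0.963175
```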
A smaller value of <μ(Ω)> indicates that the algorithm is stronger. Thus an image-based Universal Stego System Ω = {F, F⁻¹, 𝒞, 𝒮, ℐ}, with any stego-system Ψ = {C, S, I, F, F⁻¹} such that Ψ ∈ Ω, can also be defined as:
(26)
2.5 Steganalysis is Always Possible
In this section a theorem is given which proves that every stego system is susceptible to steganalysis.
Theorem: No image-based stego-algorithm (Universal Stego System) is foolproof.
Assumption
Let there be a foolproof Universal Stego System Ω = {𝒞, 𝒮, ℐ, Ω(Algorithm), μ(Ω)} such that μ(Ω) = <α, β, γ, δ> = <0, 0, 0, 0> and Ω(Algorithm) = {F, F⁻¹}, capable of exchanging Y distinct and authentic pieces of information I1, I2, I3, ..., IY.
Thus mathematically this assumption can be written as:
(27)
Proof:
Suppose some information Ik is exchanged through the above assumed Universal Stego System using a cover image C of size N. As no Ix is empty ∀ x: 1 ≤ x ≤ Y, we have Ik(z) ≠ { } for at least one z from 1 to N.
As S(z) = θ(z)·[C(z) ⊗ Ik(z)] and μ(Ω) = <α, β, γ, δ> = <0, 0, 0, 0>, S(z) will be such that S(z) = C(z), and hence the stego image S and the cover image C are identical, i.e. S ≡ C.
Now a different information Im is exchanged through the same Universal Stego System with the same cover image C. Again, since S(z) = θ(z)·[C(z) ⊗ Im(z)] and μ(Ω) = <0, 0, 0, 0>, we get S(z) = C(z). Therefore, again, S and C are identical, i.e. S ≡ C.
Thus for any information Ix the Universal Stego System is such that S and C are identical and the same. But we know that S = F(C, I) and I = F⁻¹(S), so for every stego image S there exists a unique information I.
In the given case, however, the same stego image S corresponds to the distinct pieces of information I1, I2, I3, ..., IY. Hence we conclude that all the pieces of information are the same, i.e. I1 = I2 = I3 = ... = Ix = ... = I(Y−1) = IY.
But this contradicts our assumption that the elements of {I1, I2, I3, ..., IY} are distinct, i.e. I1 ≠ I2 ≠ I3 ≠ ... ≠ IY. Thus our assumption is wrong; hence μ(Ω) = <α, β, γ, δ> ≠ <0, 0, 0, 0>, and <μ(Ω)> is more than 0.
III. Application of the Mathematical Model in a Real Scenario for the Evaluation of Stego Algorithms
Based on the mathematical model developed in Section 2, three different spatial-domain steganographic algorithms are evaluated for susceptibility to steganalysis. These three algorithms are named Algorithm I, Algorithm II and Algorithm III, and are represented mathematically as the Universal Stego Systems Ω1, Ω2 and Ω3 respectively. These three steganographic algorithms were also used in [1], where they are referred to in Section 5 as the algorithm designed in Section 4, the QuickStego software and the Eureka Steganographer respectively. Thus Ω1(Algorithm) is Algorithm I, Ω2(Algorithm) is Algorithm II and Ω3(Algorithm) is Algorithm III. The features of these three algorithms are summarized in Table 1.
For the sake of uniformity (which is required for evaluation) we use the same set of two different cover images for the evaluation of Ω1, Ω2 and Ω3. One of them is smooth (has low pixel aberration) and the other is relatively unsmooth with high pixel aberration; they are therefore named Smooth and Unsmooth, and mathematically represented as smooth and unsmooth respectively. Thus the set of cover images is 𝒞 = {smooth, unsmooth} for Ω1, Ω2 and Ω3, with ζ̄(smooth) < ζ̄(unsmooth). The two cover images smooth and unsmooth are shown in Figure 1. Based on the various image parameters defined in Section 2.2, these two images are summarized in Table 2. The parameters were calculated using the MATLAB Image Processing Toolbox.
Fig 1 Cover Images = {smooth, unsmooth}
In order to maintain uniformity in the evaluation of Ω1, Ω2 and Ω3, we embed the same information I using all three algorithms. This information I is a 900-character string, "abcdef...z1234" repeated 30 times. Thus I = "abcdef...z1234" (30 times).
Thus mathematically the three Universal Stego Systems are summarized as:
(28)
Using the two cover images 𝒞 = {smooth, unsmooth} and the three Universal Stego Systems Ω1, Ω2 and Ω3, we obtain six stego-systems, given as Ψ1S, Ψ1U, Ψ2S, Ψ2U, Ψ3S and Ψ3U. These six stego-systems are mathematically given as:
(29-A)
(29-B)
(29-C)
Here S1S, S2S and S3S are the three stego images generated by using the image smooth as cover image through the three stego-algorithms Ω1, Ω2 and Ω3 respectively (i.e. for k = 1 to 3), and S1U, S2U and S3U are the three stego images generated by using the image unsmooth as cover image through the three stego-algorithms Ω1, Ω2 and Ω3 respectively.
The security of Ω1, Ω2 and Ω3, i.e. μ(Ω1), μ(Ω2) and μ(Ω3), is to be determined. It is obtained by calculating the security (the α, β, γ and δ values) of all six stego-systems, i.e. μ(Ψ1S), μ(Ψ1U), μ(Ψ2S), μ(Ψ2U), μ(Ψ3S) and μ(Ψ3U), and applying (3) to them.
| Feature | Algorithm I or Ω1(Algorithm) | Algorithm II or Ω2(Algorithm) | Algorithm III or Ω3(Algorithm) |
| Number of pixels changed if N characters are hidden in the cover image | N+1 | 0.3353N + 1.8096 | 1.534N + 39.5963 |
| Range of change in pixel values | -3 to +3 | -1 to +1 | Variable, ranging from -253 to +246 |
| Data insertion technique | 2-bit LSB insertion | 1-bit LSB insertion | Around 6 to 7 bits are used for data insertion |
| Distribution of data in the pixels | Inserts data continuously, row by row, in every pixel from the first row onwards; the data is thus continuously distributed over every pixel | Enters data in such a way that cover image and stego image remain more or less the same, with equal numbers of +1 and -1 changes so that the net change in pixel value remains close to zero | Makes very large changes in the bottom-most pixels (changes in the bottom-most pixels usually go unnoticed due to psycho-visual weaknesses of the human eye) |
| Concentration of information in a pixel | Low | Very low | Very high |
| Degree of difference between cover image and stego image (expressed on a scale of 1, measured as the mean absolute difference in intensity levels of cover and stego image) | 0.1186 | 0.0671 | 1.00000 |
| Degree of changes in the pixels neighboring a changed pixel | Always very high, because it inserts data row by row | High to low, depending on the size of the cover image | Low |
| Source of algorithm | Designed in Section 4 of [1] | http://quickcrypto.com/free-steganography-software.html | http://www.brothersoft.com/eureka-steganographer-v2-266233.html |

Table 1 Three different steganographic algorithms used for evaluation of susceptibility to steganalysis
3.1 Results
The values of μ(Ψ1S), μ(Ψ1U), μ(Ψ2S), μ(Ψ2U), μ(Ψ3S) and μ(Ψ3U) are calculated using programs written with the MATLAB Image Processing Toolbox. The first step in calculating these values is to determine the corresponding value of λmax for each cover image (see Table 2).
| Parameter (based on Section 2.2) | smooth: Pixel | Red | Green | Blue | unsmooth: Pixel | Red | Green | Blue |
| Weighted mean pixel aberration ζ̄(M) | 1.6419 | 2.1401 | 1.4854 | 1.3002 | 2.7562 | 2.3393 | 2.6980 | 3.2312 |
| Max pixel aberration ζmax(M) | 2.2946 | 4.6536 | 3.3466 | 3.0648 | 3.8271 | 5.6875 | 5.4896 | 6.2048 |
| Min pixel aberration ζmin(M) | -1.3379 | -1.2151 | -2.4749 | -1.4882 | -1.0272 | -1.5275 | -1.8235 | -1.6370 |
| Range of pixel aberration Rζ(M) | 3.6325 | 5.8688 | 5.8215 | 4.5530 | 4.8542 | 7.2150 | 7.3130 | 7.8418 |
| Maximum deviation ζmax(M) and corresponding λmax | 2.2946, λmax 7.9171 | 4.6536, λmax 12.4698 | 3.3466, λmax 9.5674 | 3.0648, λmax 8.7922 | 3.8271, λmax 11.0393 | 5.6875, λmax 14.1729 | 5.4896, λmax 13.2347 | 6.2048, λmax 15.5979 |
| Standard deviation of pixel aberrations in M | 0.2660 | 0.3585 | 0.3294 | 0.3272 | 0.3283 | 0.3869 | 0.3991 | 0.3853 |
| ζ̄(M, λ=2) (as modulus of +ve and -ve) | 1.0675 | 1.5418 | 1.3035 | 1.2741 | 1.0269 | 1.1720 | 1.2365 | 1.1949 |
| ζ̄(M, λ=4.5) (as modulus of +ve and -ve) | 1.7062 | 2.3750 | 2.1082 | 1.9913 | 2.1402 | 2.7026 | 2.6782 | 2.9491 |
| ζ̄(M, λ=6) (as modulus of +ve and -ve) | 2.0516 | 2.9045 | 2.5827 | 2.4757 | 2.6701 | 3.2488 | 3.5855 | 3.7377 |
| ζ̄(M, λ=7.9) (as modulus of +ve and -ve) | 2.2946 | 3.8532 | 2.8856 | 3.0648 | 3.2868 | 4.4300 | 5.2156 | 5.3151 |

Table 2 Parameters (based on Section 2.2) of the two test images smooth and unsmooth
The value of γ̄ is determined by taking the mean of γ(λ) for λ = 0, 2, 4.5, 6 and 7.9. All these values of γ(λ), and the corresponding γ̄ as well as Δ, are given in Table 3.a (for the smooth image) and Table 3.b (for the unsmooth image). Using these values of γ(λ) for different λ, their average γ̄ and Δ (as calculated in Tables 3.a and 3.b), the strength parameters of all six stego-systems Ψ1S, Ψ1U, Ψ2S, Ψ2U, Ψ3S and Ψ3U, and their overall strengths, given as <μ(Ψ1S)>, <μ(Ψ1U)>, <μ(Ψ2S)>, <μ(Ψ2U)>, <μ(Ψ3S)> and <μ(Ψ3U)>, are calculated and shown in Table 4.
To better understand the values of β, the relative entropy of the neighborhood of every pixel (given as H(P(η(C(z))) || P(η(S(z)))) in Section 2.3.1, Requirement 2) is plotted for all three stego-algorithms in Fig 3.a and Fig 3.b. In Fig 3.a the cover image is C = smooth and the stego images are S1S, S2S and S3S, whereas in Fig 3.b the cover image is C = unsmooth and the stego images are S1U, S2U and S3U.
By applying (3) to these values we can conclude that:
(30)
So:
<μ(Ω1)> = MAX(0.732089, 0.524669) = 0.732089
<μ(Ω2)> = MAX(0.830721, 0.963175) = 0.963175
<μ(Ω3)> = MAX(5.018686, 2.560202) = 5.018686
So Algorithm I is the most secure of the three stego-algorithms and Algorithm III is the least secure.
| Algorithm (smooth image) | Colour | γ(0) | γ(2) | γ(4.5) | γ(6) | γ(7.9) | γ̄ | Δ_MAX | Δ_MEAN |
| Ψ1S(μ) (Algo) | Pixel_mean | -0.0040 | -0.1738 | 0.0806 | 0.1171 | 0.2254 | 0.225255 | 2.1607 | 1.2032 |
| | Red | -0.0080 | -0.1739 | 0.3364 | 0.9711 | 0.8129 | | | |
| | Green | -0.0032 | -0.2271 | -0.0021 | 0.0407 | 1.2632e-005 | | | |
| | Blue | 0.0049 | -0.1690 | 0.2956 | 0.6400 | 1.7415 | | | |
| Ψ2S(μ) (QS) | Pixel_mean | 0.0181 | -0.1999 | 0.3187 | 0.5918 | 4.6908 | 0.792884 | 3.6670 | 1.2006 |
| | Red | 0.0491 | -0.1141 | 0.5655 | 1.2580 | 5.7625 | | | |
| | Green | 0.0386 | -0.1624 | 0.1714 | 0.0854 | empty | | | |
| | Blue | 0.0498 | -0.0447 | 0.4032 | 0.7473 | 0.8243 | | | |
| Ψ3S(μ) (Eureka) | Pixel_mean | 0.0303 | 2.1060 | 5.6023 | 6.4453 | 7.3028 | 7.794545 | 44.8191 | 38.1743 |
| | Red | 0.0351 | 3.1310 | 6.9109 | 8.8190 | 11.7963 | | | |
| | Green | 0.0525 | 5.2347 | 9.7561 | 12.9615 | 17.9956 | | | |
| | Blue | 0.0352 | 5.8749 | 12.8564 | 18.0777 | 20.8673 | | | |

Table 3.a Values of γ(λ), γ̄ and Δ (either Δ_MAX or Δ_MEAN) for the smooth image
| Algorithm (unsmooth image) | Colour | γ(0) | γ(2) | γ(4.5) | γ(6) | γ(7.9) | γ̄ | Δ_MAX | Δ_MEAN |
| Ψ1U(μ) (Algo) | Pixel_mean | 1.1812e-004 | 0.0042 | 0.0783 | 0.0256 | 0.1180 | 0.108875 | -0.2372 | 0.1943 |
| | Red | 0.0012 | 0.0159 | 0.1562 | 0.2558 | -0.1627 | | | |
| | Green | 0.0029 | 0.0294 | 0.3133 | 0.1436 | -0.1053 | | | |
| | Blue | 0.0084 | 0.0965 | 0.5294 | 0.4082 | 0.2586 | | | |
| Ψ2U(μ) (QS) | Pixel_mean | -0.0014 | -7.582e-004 | 0.0480 | 0.0992 | -5.4725e-007 | -0.004 | 0.7045 | -0.1435 |
| | Red | 0.0033 | 0.0493 | 0.0623 | 0.0494 | -1.5004e-005 | | | |
| | Green | 0.0030 | 0.0217 | 0.1678 | -0.1845 | -0.2026 | | | |
| | Blue | 0.0055 | 0.0151 | -0.0995 | 0.0395 | -0.0898 | | | |
| Ψ3U(μ) (Eureka) | Pixel_mean | 0.0233 | 1.1202 | 1.8310 | 2.3773 | 3.2307 | 3.268105 | 22.1064 | 18.0095 |
| | Red | 0.0470 | 2.6542 | 4.8776 | 5.4672 | 6.0756 | | | |
| | Green | 0.0539 | 3.2605 | 5.4122 | 6.8352 | 7.7026 | | | |
| | Blue | 0.0439 | 2.6896 | 3.4502 | 4.1124 | 4.0975 | | | |

Table 3.b Values of γ(λ), γ̄ and Δ (either Δ_MAX or Δ_MEAN) for the unsmooth image
Table 4 Values of δ1S, δ1U, δ2S, δ2U, δ3S and δ3U

δ1U  0.0425  1.8252  0.108875  -0.1221 (R)   <δ1U> = 0.524669
δ2U  0.0313  3.8054  -0.004    -0.0120 (B)   <δ2U> = 0.963175
δ3U  0.0086  0.9851  3.268105  3.4274 (G)    <δ3U> = 2.560202
Fig 2.a Pixel Aberration plotted for Cover Image smooth and associated Stego Images S1S, S2S and S3S
Fig 2.b Pixel Aberration plotted for Cover Image unsmooth and associated Stego Images S1U, S2U and S3U
Fig 3.a Plot of Relative Entropy of the neighborhood of Every Pixel in Cover Image smooth and associated Stego Images S1S, S2S and S3S
Fig 3.b Plot of Relative Entropy of the neighborhood of Every Pixel in Cover Image unsmooth and associated Stego Images S1U, S2U and S3U
3.1.1 Observations:
In Table 4 we notice that Algorithm 3 is the least secure among all three and Algorithm 1 is the most secure. Further, it is interesting to note that Algorithm 2 performs better when the image is smooth, whereas Algorithm 1 and Algorithm 3 perform better when the image is unsmooth. In Table 3.a and Table 3.b certain values are negative for certain specific columns (especially negative in column (2) for 1S and 2S in Table 3.a).
This indicates that when the pixel aberrations of 2 (pixels which are more than 95% deviated from the neighborhood) are considered, the cover image has more aberrations than the stego-image. In Figure 2.a we notice that although Algorithm 2 has the minimum pixel aberration among all the three, the very high pixel aberration it produces in one particular pixel (an aberration of more than 10 at S2S(1000), i.e. the 1000th pixel of stego image S2S) makes it quite susceptible to Steganalysis. Algorithm 1 performs better because it produces the stego image by inserting data row by row in every pixel of the cover image; thus the entire neighborhood of each pixel changes, rendering steganalysis based on analysis of pixel aberration ineffective. Algorithm 3 has the highest pixel aberrations among all the three algorithms (clearly seen in Table 2.a and 2.b and Figure 2.a and 2.b) because it concentrates the entire information in very few pixels of the bottom-most row of the image. Since very few pixels are changed by Algorithm 3, it has the minimum Relative Entropy among all the three, and this is clearly conspicuous in Figure 3.a and 3.b. The graphs in Fig 3.a and 3.b are shifted right for Algorithm 3 because it changes only the last few pixels of the cover image. From Figure 3.a and 3.b we can also conclude that Relative Entropy is highest for Algorithm 2. This is because Algorithm 2 distributes the entire information over a large number of pixels; as a result the probability distribution of a large number of pixels changes in the stego-image (almost every pixel shows some value of relative entropy). For Algorithm 1 the graph of relative entropy (Figure 3.a and 3.b) is shifted left, and this indicates that it changes only the first few pixels (exactly 900 pixels, one pixel for each character of I).
IV. Conclusion
Based on the mathematical model designed in Section 2, three different stego-algorithms were represented mathematically. Their relative strengths and weaknesses could be easily expressed using the mathematical parameters and requirements defined in Section 2. Based on these mathematical parameters we can also identify an innocent-looking image as a stego image if those parameters are significantly different. Above all, this model can be used for further research in Image Steganography and for representing any image-based steganographic algorithm mathematically.
V. Acknowledgement
I wish to sincerely thank Mr Kinjal Choudhary (Software Professional at Sapient Nitro), Shri D Praveen Kumar (DRDO Scientist at RCI Hyderabad), Lieutenant Mani Kumar, Sub Lieutenant S.S Niranjan (Engineering Officers of the Indian Navy) and Ms Anjala Sharma (Senior Engineer at Alstom Power) for keenly reviewing my work and for providing the feedback necessary to improve this technical paper. I am also thankful to the entire staff of NCE in general, and to Shri DL Sapra (Senior Scientist, DRDO and Principal of NCE), Shri R V Kalmekar (Senior Scientist, DRDO and Vice Principal of NCE) and Commander Mohit Kaura (Senior Engineering Officer of the Indian Navy and Training Commander of NCE) in particular, for providing the necessary support and encouragement.
References
[1] Kaustubh Choudhary, "Image Steganography and Global Terrorism", IOSR Journal of Computer Engineering, Volume 1, Issue 2, pp. 34-48.
[2] C. Cachin, "An information-theoretic model for steganography", Proc. 2nd International Workshop on Information Hiding, LNCS 1525, pp. 306-318, 1998.
[3] J. Zollner, H. Federrath, H. Klimant, A. Pfitzmann, R. Piotraschke, A. Westfeld, G. Wicke, and G. Wolf, "Modeling the security of steganographic systems", Proc. 2nd Information Hiding Workshop, pp. 345-355, April 1998.
[4] C. E. Shannon, "Communication theory of secrecy systems", Bell System Technical Journal, vol. 28, pp. 656-715, Oct. 1949.
[5] R. Chandramouli and N. D. Memon, "Steganography Capacity: A Steganalysis Perspective".
[6] R. Chandramouli, "A Mathematical Approach to Steganalysis", Multimedia Systems, Networking and Communications (MSyNC) Lab, Department of Electrical and Computer Engineering, Stevens Institute of Technology.
BIOGRAPHY OF AUTHOR
Kaustubh Choudhary, Scientist, Defence Research and Development Organisation (DRDO), Ministry of Defence, Govt of India
Current attachment: Attached with the Indian Navy at the Naval College of Engineering, Indian Naval Ship Shivaji, Lonavla - 410402, Maharashtra, India
IOSR Journal of Computer Engineering (IOSRJCE)
ISSN: 2278-0661 Volume 2, Issue 5 (July-Aug. 2012), PP 16-28
Novel Approach to Image Steganalysis (A Step against Cyber
Terrorism)
Kaustubh Choudhary
Scientist, Defence Research and Development Organisation (DRDO),Naval College of Engineering, Indian
Naval Ship Shivaji,Lonavla, Maharashtra, India
Abstract: Steganography is a technique of hiding secret messages in the image in such a way that no one apart
from the sender and intended recipient suspects the existence of the message. Image Steganography is frequently used by Terrorist Networks for securely broadcasting, dead-dropping and communicating secret information over the internet by hiding it in Images. As a result it becomes the most preferred tool of Terrorists and criminal organizations for achieving a secure CIA (Confidentiality, Integrity and Availability) compliant communication network capable of penetrating deep inside the civilian population. Steganalysis of Images (Identification of Images containing Hidden Information) is a challenging
population. Steganalysis of Image (Identification of Images containing Hidden Information) is a challenging
task due to lack of Efficient Algorithms, High rates of False Alarms and above all the High Computation Costs
of Analyzing the Images. In this paper a Novel Technique of Image Steganalysis is devised which is not only
Fast and Efficient but also Foolproof. The results shown in the paper are obtained using programs written in
MATLAB Image Processing Tool Box.
Key Words: Bit Plane Slicing, Cyber Crime, Global Terrorism, Image Steganalysis, LSB Insertion,
Pixel Aberration, SDT based Image Steganography.
I. Introduction
Image based steganography is a technique of hiding secret messages in the image in such a way that no
one apart from the sender and intended recipient suspects the existence of the message. It is based on invisible
communication and this technique strives to hide the very presence of the message itself from the observer. As a
result it has been used more frequently by various criminal and terrorist organizations than anybody else [1][2][3]. Various agencies even claim that the 9/11 attacks were masterminded and planned using image based steganography [4]. Image Steganography offers numerous advantages to the terrorists like Anonymity,
Electronic Dead Dropping, Secure Broadcasting and above all very high Secrecy and Security [5]. Thus an innocent looking digital image on any Web Portal, Online Auction Site or even a Social Networking Site may well be hiding a malicious and deadly terrorist plan or other significant criminal information. Steganalysis is the process of identifying such malicious Stego-Images (the original image used for hiding data is called the Cover-Image, whereas the image obtained after inserting the Secret Information into it is called the Stego-Image) from the bulk of innocent images. The next step of steganalysis involves either extracting the hidden information, destroying it by adding visually imperceptible noise to the image, or even embedding counter-information in the Stego-Image. Considering the voluminous bulk of images flowing every day through the Internet and the time and Computation Cost required for analyzing each Image, the very first step of identifying an innocent looking Image as a Stego Image becomes the most challenging part of any Steganalysis procedure. This is because we do not have any foolproof method for crisply identifying a steganographic signature in the innocent looking stego-image.
In this paper a technique has been devised for identification of any such stego-image if the data is hidden in it using Spatial Domain Steganography or the LSB Insertion technique. But before moving further it becomes necessary to explain Spatial Domain Image Steganography in brief (elaborately explained in Section 2 and Section 3 of [5]). A digital image consists of numerous discrete pixels. The color of any pixel depends upon the RGB values of the pixel. For example, in a 24 bit BMP image the RGB value consists of three 8-bit components, one each for R, G and B, and thus a pixel is a combination of 256 different shades (ranging from intensity level 0 to 255) of red, green and blue intensity levels, resulting in 256 x 256 x 256 or more than 16 million colors. Thus if the least significant bits of the R, G and B values are changed, the pixel suffers a minimal degradation of 2/256 or 0.78125%. This minor degradation is psycho-visually imperceptible to the human eye due to limitations of the Human Visual System (HVS). But at the cost of this negligible degradation 3 bits (1 bit each from red, green and blue) are extracted out of every pixel for transmitting the secret information.
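The 24-bit LSB insertion just described, replacing the least significant bit of each of the R, G and B components to carry 3 message bits per pixel, can be sketched as follows (a minimal illustration, not any particular tool's implementation):

```python
def embed_bits(pixel, bits):
    """Replace the LSB of each of R, G and B with one message bit.
    pixel: (r, g, b) with 0-255 components; bits: three 0/1 values."""
    return tuple((c & ~1) | b for c, b in zip(pixel, bits))

def extract_bits(pixel):
    """Recover the 3 embedded bits from a stego pixel."""
    return tuple(c & 1 for c in pixel)

# Each channel changes by at most 1 intensity level out of 256:
stego = embed_bits((200, 117, 54), (1, 0, 1))
```

Decoding with `extract_bits(stego)` returns the embedded bits `(1, 0, 1)`; the pixel itself changes only from (200, 117, 54) to (201, 116, 55), which is imperceptible to the HVS.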
Most of the Spatial Domain Image steganographic techniques use LSB Insertion for hiding data in the image. But some Spatial Domain techniques, instead of using the LSB, use around five to seven bits of the pixels and thus make large changes in the RGB values of the pixel. As a result the entire information gets concentrated in a few pixels. Although the changes in these pixels are quite large, since pixels are very small in size and also
because very few pixels change, these changes go unnoticed by the human eye. Moreover most of these Concentrating Algorithms change the pixels of only the bottom few rows. This is because the human brain concentrates more on the top, center and other important features of the image than on the bottom rows. There are many variants of Spatial Domain Steganography but all of them can be broadly classified into these two types only, as further elaborated in Section 2 of this paper. There are other techniques as well for hiding data in the image. For example, Transformation Domain Steganography may use Discrete Cosine Transforms or Discrete Wavelet Transforms for embedding data, and some other steganographic algorithms may use a different color space altogether (for example RGB may be converted to YCbCr and then various steganographic techniques can be applied). But the scope of this paper is limited to foolproof Steganalysis of only Spatial Domain Steganography. For Steganalysis of Spatial Domain Stego-Images various methods like Brute Force Attacks, Pattern Matching, Statistical Attacks (Histogram, Kurtosis), Mathematical Approaches (Stir Mark Attack) and Transform Domain Attacks are available. Most of them are highly unreliable because they either produce frequent false alarms or fail to identify the stego-image at all, whereas other advanced steganalysis techniques based on the Stir Mark Attack [6] and Gabor Filtering [7] are very expensive in terms of computational power and time. The weaknesses of these steganalysis algorithms are elaborated in Section 5.2.3 of [5]. Thus most spatial domain steganalysis algorithms suffer from one of two weaknesses: either they are unreliable and ineffective, or else they are computationally very expensive and slow. To overcome these limitations two different approaches to steganalysis have been proposed in this paper. Both these approaches are computationally very fast and complement each other, and together they form an Efficient, Effective and Reliable technique of Spatial Domain Steganalysis of Images. Both these techniques not only identify the Stego image but also report the locations of the pixels carrying information and provide the information in binary form.
II. Steganalysis Technique Proposed
As mentioned in Section 1, most spatial domain image steganographic algorithms can be broadly classified into two types: either they concentrate the secret information in a few pixels or they distribute the information over a large number of pixels. Those algorithms which concentrate the information bring large and noticeable changes in very few pixels (sometimes the Information appears as grains in the bottom-most row of the image) by using around five to seven bits of the pixel. They use the bottom-most row due to psycho-visual weaknesses of the human brain, as illustrated by Figure 1. Such steganographic algorithms will here onwards be referred to as Concentrating Steganographic Algorithms.
Fig 1 Large and Perceptible changes in the pixels (Grains) go unnoticed in the Last Row of the Image
On the other hand, those algorithms which distribute the information over a large number of pixels make very small and imperceptible changes in the pixels by using one or two Least Significant Bits for storing information in the pixels. Such steganographic algorithms will here onwards be referred to in this paper as Distributing Steganographic Algorithms.
2.1 Steganalysis of Concentrating Steganographic Algorithms
In any natural image the pixels do not change abruptly and the color of any pixel depends on the color of the neighboring pixels; thus the pixels are auto-correlated. Hence if any pixel is substantially different from its neighboring pixels, it indicates that the given innocent looking image is a stego-image. It is equally true that two neighboring pixels are not necessarily the same, or else the image would not be formed; but two pixels which are neighbors will not be very different either. Moreover, the average difference between the concerned pixel and its neighbors will be within the range of the differences among the neighbors themselves. The difference between the pixels here means the individual difference of each of the R, G and B components of the pixels and not the mean of R, G and B, because the averaging effect would lead to loss of vital information.
In Figure 2 any arbitrary pixel P(i,k) (the pixel located at the i-th row and k-th column of the image) is shown along with its 8 neighbors.
The differences among the Adjacent Neighbors of the pixel P(i,k) are given as P(i-1,k-1) - P(i-1,k) = A1, P(i-1,k) - P(i-1,k+1) = A2, P(i-1,k+1) - P(i,k+1) = A3, P(i,k+1) - P(i+1,k+1) = A4, P(i+1,k+1) - P(i+1,k) = A5, P(i+1,k) - P(i+1,k-1) = A6, P(i+1,k-1) - P(i,k-1) = A7 and P(i,k-1) - P(i-1,k-1) = A8.
The differences of the pixel P(i,k) from its neighbors are given as P(i-1,k-1) - P(i,k) = D1, P(i-1,k) - P(i,k) = D2, P(i-1,k+1) - P(i,k) = D3, P(i,k+1) - P(i,k) = D4, P(i+1,k+1) - P(i,k) = D5, P(i+1,k) - P(i,k) = D6, P(i+1,k-1) - P(i,k) = D7 and P(i,k-1) - P(i,k) = D8.
Since the adjacent neighbors are not necessarily the same, A1 = A2 = A3 = ... = A8 need not hold. But in any natural image the average of the differences between the Pixel concerned, i.e. P(i,k), and its neighbors, given by the mean of D1, D2, D3, ..., D8 and represented as D-bar, is as in (1).

D-bar = (D1 + D2 + ... + D8) / 8        (1)
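The aberration test that follows from (1), comparing the pixel's mean deviation from its 8 neighbors with the deviations among the neighbors themselves, can be sketched for one color channel as follows (the helper names and the use of absolute differences are assumptions of this sketch):

```python
def aberration(img, i, k):
    """Mean absolute difference between pixel (i, k) and its 8
    neighbors, together with the largest absolute difference among
    the adjacent neighbors themselves (A1..A8, taken cyclically).
    img is a 2-D list holding one color channel."""
    nb = [img[i-1][k-1], img[i-1][k], img[i-1][k+1], img[i][k+1],
          img[i+1][k+1], img[i+1][k], img[i+1][k-1], img[i][k-1]]
    d_mean = sum(abs(v - img[i][k]) for v in nb) / 8.0   # mean of |D1..D8|
    a_max = max(abs(nb[j] - nb[(j + 1) % 8]) for j in range(8))
    return d_mean, a_max

# A pixel far outside its neighborhood's own variation is suspect:
img = [[10, 11, 10],
       [11, 90, 11],
       [10, 11, 10]]
d_mean, a_max = aberration(img, 1, 1)
suspect = d_mean > a_max
```

Here the center pixel deviates from its neighborhood (D-bar of 79.5) far more than the neighbors deviate from one another (at most 1), so it is flagged as a possible information-carrying pixel.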
Assume that we know in advance the number of clusters that the algorithm should produce. The best known approach based on partitioning is k-means clustering, a simple and efficient algorithm used by statisticians for decades. The idea is to represent each cluster by the centroid of the documents that belong to it (the centroid of cluster C is defined as the mean of the document vectors in C). Cluster membership is determined by finding the most similar cluster centroid for each document. After clustering is done, similar pages are assigned to the same cluster, which can be used in the recommendation process.
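The centroid-based partitioning just described can be sketched as follows (documents are assumed to be already reduced to numeric feature vectors; this is an illustrative k-means, not the authors' implementation):

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Partition points by representing each cluster with the centroid
    of its members and assigning every point to the nearest centroid."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centroids[c])))
            clusters[j].append(p)
        for j, members in enumerate(clusters):
            if members:   # centroid = mean of the member vectors
                centroids[j] = tuple(sum(xs) / len(members)
                                     for xs in zip(*members))
    return centroids, clusters

centroids, clusters = kmeans([(0, 0), (0, 1), (10, 10), (10, 11)], k=2)
```

On these four points the two well-separated pairs end up in separate clusters regardless of the random initialization.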
3.4 Page Ranking
Finally, by employing the HITS algorithm on the structure data, the system generates ranked pages. In the HITS concept, Kleinberg identifies two kinds of pages from the Web hyperlink structure: authorities (pages with good sources of content) and hubs (pages with good sources of links). For a given query, HITS will find authorities and hubs. According to Kleinberg, hubs and authorities exhibit what could be called a mutually reinforcing relationship: a good hub is a page that points to many good authorities; a good authority is a page that is pointed to by many good hubs.
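Kleinberg's mutually reinforcing update rules can be sketched as a few normalized iterations (the adjacency representation and the function name are illustrative):

```python
def hits(links, iters=50):
    """links: dict mapping each page to the pages it points to.
    Returns (authority, hub) score dicts, each normalized to sum 1."""
    pages = set(links) | {q for targets in links.values() for q in targets}
    auth = {p: 1.0 for p in pages}
    hub = {p: 1.0 for p in pages}
    for _ in range(iters):
        # a good authority is pointed to by many good hubs
        auth = {p: sum(hub[q] for q in links if p in links.get(q, []))
                for p in pages}
        # a good hub points to many good authorities
        hub = {q: sum(auth[p] for p in links.get(q, [])) for q in pages}
        for scores in (auth, hub):          # normalize each round
            total = sum(scores.values()) or 1.0
            for p in scores:
                scores[p] /= total
    return auth, hub

auth, hub = hits({"h1": ["a"], "h2": ["a"], "a": []})
```

For this tiny graph the sole target page collects all the authority weight, while h1 and h2 share the hub weight.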
IV. GENERATE RECOMMENDATIONS
The recommendation engine is the online component of a personalization system. In order to determine which items are to be recommended, a recommendation score is computed for each page p as in (8). Two factors are used in determining this recommendation score: the overall matching score of the active session (S) to the weighted rules as a whole, and the weighted confidence of the rule.

Recommendation Score(p) = Similarity(S, r) x Confidence(r -> p)        (8)

We choose the page with the highest recommendation score as the recommendation for the active session.
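The scoring rule in (8) can be sketched as follows; the cosine similarity and the rule representation `(antecedent, confidence, page)` are assumptions of this illustration:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def recommendation_score(session, rules):
    """rules: list of (antecedent_vector, confidence, page) triples.
    Score(p) = Similarity(S, antecedent) x Confidence(rule -> p)."""
    scores = {}
    for antecedent, confidence, page in rules:
        s = cosine(session, antecedent) * confidence
        scores[page] = max(s, scores.get(page, 0.0))
    return max(scores, key=scores.get)       # highest-scoring page

best = recommendation_score([1, 1, 0],
                            [([1, 1, 0], 0.90, "p1"),
                             ([0, 0, 1], 0.99, "p2")])
```

A rule whose antecedent matches the active session beats a higher-confidence rule that does not match at all.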
HYBRID WEB MINING FRAMEWORK
V. RECOMMENDATION TO END USER
Based on evaluation and comparison, the recommended pages are displayed to the end user.
VI. CONCLUSIONS
In this paper, a new web page recommendation framework is proposed. First, users' navigational patterns are extracted from web usage data; simultaneously, web content data and web structure data are also taken after pre-processing, pattern discovery is performed on these data, and based on the pattern discovery the recommendations are generated. The proposed framework combines three mining techniques, so it is advantageous compared to previous hybrid recommendation frameworks.
Authors' Biography:
Prof. (Mrs) Manisha R. Patil, Professor in the Computer Engineering Department of Smt. Kashibai Navale College of Engineering, Vadgaon (Bk), Pune, with 14 years of teaching experience. Her area of interest is Data Mining.
Mrs. Madhuri D. Patil, student of the Computer Engineering Department at Smt. Kashibai Navale College of Engineering, Vadgaon (Bk), Pune, pursuing M.E. in Computer Engineering; her area of interest is Web Mining.
IOSR Journal of Computer Engineering (IOSRJCE)
ISSN: 2278-0661 Volume 2, Issue 5 (July-Aug. 2012), PP 38-42
A Novel Approach To Topological Skeletonization Of English
Alphabets And Characters
Chinmay Chinara, Nishant Nath, Subhajeet Mishra
(Dept. of ECE, SOA University, India)
Abstract: In this paper we put forward a modified approach towards skeletonization of English alphabets and characters. This algorithm has been designed to find the skeleton of all the typefaces of Modern English present in the Microsoft database. The algorithm has been kept simple and optimized for efficient skeletonization. Finally, the performance of the algorithm after testing has been aptly demonstrated.
Keywords: Digital Library, Microsoft Visual C++, Skeletonization, Structuring Element, Thresholding, Typeface
I. INTRODUCTION
Character recognition, also known as Optical Character Recognition (OCR), is one of the major subsets of pattern recognition. Over the years, offline character recognition has been in great demand due to evolution in the fields of digital libraries and banking. Skeletonization and thinning are among the major pre-processing steps for the correct working of OCR systems. Skeletonization is a process for reducing foreground regions in a binary image to a skeletal remnant that largely preserves the extent and connectivity of the original region while throwing away most of the original foreground pixels [8][5]. In other words, it is like the loci of intersecting waves emanating from different points of a branch. Thinning is done to reduce data storage by converting the binary image into a skeleton or a line drawing [5]. The main objective, however, is to retain the topographical properties of the alphabets and characters.
The complexity in the skeletonization of Modern English Alphabets and Characters lies in the vastness of its typefaces. This paper proposes an effective skeletonization algorithm that is compatible with the entire typeface set present in the Microsoft database.
II. BACKGROUND: MODERN ENGLISH
The modern English alphabet is a sub-group of the 26 Latin letters standardized by the International Organization for Standardization (ISO). The shape of these alphabets depends on the typeface, commonly known as the font, of the alphabet. The above 26 alphabets are represented in two forms, one being the Majuscule form (also called Uppercase or Capital letters) and the other being the Minuscule form (also called Lowercase or Small letters). These are again subdivided into two parts, namely Vowels and Consonants. The characters constitute a vast database and vary from typeface to typeface. The few alphabets and characters over which skeletonization was implemented are shown in Fig-(1-3). They are based on the Arial font of size 22.
III. LITERATURE SURVEY
Many skeletonization algorithms have been proposed in the literature. To date, all the skeletonization algorithms work very well with English alphabets and letters. A properly skeletonized and thinned image aids segmentation and feature extraction, which are crucial for character recognition. Skeletonization algorithms are mainly classified into two groups: one on the basis of distance transforms and the other on the basis of thinning approaches.
The paper by G. Sanniti di Baja [2] used the distance transform for the purpose of skeletonization. Multiple erosions with a suitable structuring element were implemented until all the foreground regions of the image had been eroded away.
Fig-1. Majuscule form of alphabets
Fig-2. Minuscule form of alphabets
Fig-3. Few special characters
The paper by Gisela Klette [3] presented a detailed study of the various distance transform techniques and thinning approaches used to attain skeletonization. This study indicated that iterative thinning algorithms are the most efficient ones.
The paper by Aarti Desai et al. [1] describes a thinning algorithm in which comparison with different structuring elements and elimination of unnecessary black pixels is the key. This, when modified, gives a very good skeletonized output, thus making it the gateway to our proposed algorithm.
The paper [4] showed an efficient use of iterative algorithms. Comprehensive work by [6] and [7] also helped us understand the morphological operations in greater detail.
IV. PROPOSED ALGORITHM
The proposed algorithm is a modification of the work done on thinning by Aarti Desai et al. [1] and is implemented for the Topological Skeletonization of English Alphabets and Characters. It involves four stages: binarization, addition of dummy white pixels, creation of structuring elements and their application to the image, and removal of noise to obtain the final skeletonized image.
4.1 Binarization
This process involves the conversion of the input bitmap image to a black and white image by setting a particular threshold level. The image is then converted to binary by representing the background (white pixel) by 0 and the foreground (black pixel) by 1. This is shown in Fig-4. and Fig-5.
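The thresholding and binarization step can be sketched as follows (the authors implemented the algorithm in Visual C++; this is an equivalent Python illustration, and the threshold value is an assumption):

```python
def binarize(gray, threshold=128):
    """Map a 2-D grayscale image to binary: dark (foreground) pixels
    become 1, light (background) pixels become 0."""
    return [[1 if v < threshold else 0 for v in row] for row in gray]

# a thin dark stroke on a light background becomes a column of 1s
binary = binarize([[250, 30, 250],
                   [250, 25, 250]])
```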
4.2 Addition of Dummy Pixels
When the structuring element is applied to the binary image directly, then for the pixels present on the extreme left, right, top and bottom sides of the image we get erroneous, non-preexisting pixel values. To eliminate this, we add an extra layer of white pixels (a dummy layer) to each of the above mentioned sides, as shown in Fig-6. No operation is done over these pixels; they merely assist the process of comparison with the structuring elements. This padded binary image is the base image that is to be skeletonized.
Fig-4. The letter H before thresholding and binarisation
Fig-5. The letter H after thresholding and binarisation
4.3 Creation of structuring elements and application to the image
Eight 3x3 structuring elements of the form shown in Fig-(7-14) are applied to the base image. The application involves traversing the base image pixel by pixel, where (m, n) represents the current position of the pixel under consideration; m and n are the row and column positions of the pixel in the base image. The comparison is done only if the (m, n) pixel is black; an (m, n) white pixel is left unchanged.
The 0s in the figures represent the background (white) and the 1s represent the foreground (black). The X pixel is not considered for comparison. The encircled region represents the central element that has to be placed over the (m, n) black pixel of the base image for comparison. The comparison is done between the adjacent pixels of the pixel under consideration in the base image and the adjacent pixels of the central pixel of the structuring element. If all the respective pixels match, then the (m, n) pixel of the base image is converted to white. This comparison and replacement procedure is repeated for each and every structuring element over the base image till no further changes are possible. The skeletonized image obtained, however, contains noise, as shown in Fig-15, which has to be removed to get the desired skeleton.
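One pass of this compare-and-replace procedure can be sketched as follows, with a single illustrative structuring element (1 = foreground, 0 = background, and `None` standing in for the don't-care X pixel); the paper itself uses eight such elements:

```python
def pad(img):
    """Add the dummy layer of white (0) pixels on all four sides."""
    w = len(img[0]) + 2
    return [[0] * w] + [[0] + row + [0] for row in img] + [[0] * w]

def matches(img, m, n, se):
    """Compare the neighborhood of black pixel (m, n) with a 3x3
    structuring element; None entries (the X pixel) are skipped."""
    for dm in (-1, 0, 1):
        for dn in (-1, 0, 1):
            want = se[dm + 1][dn + 1]
            if (dm, dn) != (0, 0) and want is not None \
                    and img[m + dm][n + dn] != want:
                return False
    return True

def thin_once(img, ses):
    """One pass over the base image: every black pixel whose
    neighborhood matches any structuring element turns white."""
    changed = False
    for m in range(1, len(img) - 1):
        for n in range(1, len(img[0]) - 1):
            if img[m][n] == 1 and any(matches(img, m, n, se) for se in ses):
                img[m][n] = 0
                changed = True
    return changed

# One illustrative structuring element: deletes a black pixel whose
# upper row is white and whose lower row is black.
se_top = [[0, 0, 0],
          [None, 1, None],
          [1, 1, 1]]
base = pad([[1, 1, 1],
            [1, 1, 1],
            [1, 1, 1]])
while thin_once(base, [se_top]):   # repeat till no further changes
    pass
```

Repeating the pass until no change occurs peels matching pixels off the top edge while the rest of the stroke survives.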
Fig-7. 1st S.E.   Fig-8. 2nd S.E.   Fig-9. 3rd S.E.   Fig-10. 4th S.E.
Fig-11. 5th S.E.  Fig-12. 6th S.E.  Fig-13. 7th S.E.  Fig-14. 8th S.E.
Fig-6. The letter H after adding dummy pixels (given by the area under the red concentric region)
Fig-15. The skeletonized letter H after comparing with structuring elements (contains noise given by the red-boxed regions)
4.4 Removal of noise
(m-1), (n-1) (m-1), n (m-1), (n+1)
m, (n-1) m, n m, (n+1)
(m+1), (n-1) (m+1), n (m+1), (n+1)
Removal of noise depends solely on whether a black pixel can safely be converted into a white pixel.
The white pixels are kept unchanged, as the noise arises from extra black pixels only. To decide, we examine
the surroundings of the black pixel at position (m, n), i.e. we count the number of black pixels and white pixels
around the (m, n) black pixel. If the number of black pixels is less than or equal to 2, the (m, n) black pixel is a
safe one and is left unchanged. If the number is greater than 2, we count the number of white-to-black colour
transitions around the (m, n) black pixel. If this count is not equal to 1, we leave the (m, n) black pixel
unchanged; if it is equal to 1, that particular (m, n) black pixel is converted to white if and only if both of the
following conditions are satisfied:
i. (m, n+1) or (m+1, n) or both (m-1, n) and (m, n-1) are white.
ii. (m-1, n) or (m, n-1) or both (m, n+1) and (m+1, n) are white.
After applying the above noise removal technique we get the final desired skeletonized image, as shown in Fig-17.
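The noise removal rules can be sketched as below. This is an illustrative reconstruction under the assumption that the "[white (minus) black] colour combinations" of the original text mean white-to-black transitions around the 8-neighbour ring; it is not the authors' exact code.

```python
import numpy as np

def remove_noise(img):
    """Whiten noisy black pixels of a binary skeleton (0 = black, 1 = white).

    A black pixel is whitened when it has more than 2 black neighbours,
    exactly one white-to-black transition around it, and the positional
    conditions (i) and (ii) on its 4-neighbours both hold.
    """
    out = img.copy()
    rows, cols = img.shape
    for m in range(1, rows - 1):
        for n in range(1, cols - 1):
            if img[m, n] != 0:          # only black pixels are candidates
                continue
            # 8-neighbourhood listed in clockwise order
            ring = [img[m-1, n-1], img[m-1, n], img[m-1, n+1],
                    img[m, n+1], img[m+1, n+1], img[m+1, n],
                    img[m+1, n-1], img[m, n-1]]
            if ring.count(0) <= 2:      # safe pixel, keep it
                continue
            # count white -> black transitions around the circular ring
            trans = sum(1 for a, b in zip(ring, ring[1:] + ring[:1])
                        if a == 1 and b == 0)
            if trans != 1:
                continue
            cond1 = img[m, n+1] == 1 or img[m+1, n] == 1 or \
                    (img[m-1, n] == 1 and img[m, n-1] == 1)
            cond2 = img[m-1, n] == 1 or img[m, n-1] == 1 or \
                    (img[m, n+1] == 1 and img[m+1, n] == 1)
            if cond1 and cond2:
                out[m, n] = 1           # convert the noisy black pixel to white
    return out
```

Decisions are taken on the original image while writes go to a copy, so the pass does not erode a line pixel just because a neighbouring spur was removed.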
V. DEVELOPMENT PLATFORM
The algorithm was implemented using the Microsoft Visual C++ (ver. 2010) development platform.
Due to its quick, enhanced and interactive environment, we were able to study, implement and analyze all the
previous algorithms, which resulted in our proposed algorithm.
VI. CONCLUSION
The modified algorithm applied to skeletonization of English characters works well for all types of
standard fonts present in the Microsoft database. The output contains very little noise and maintains connectivity
and proper branching, making it a good choice for English OCR systems. The running time of the algorithm is
also negligible, as shown in Table-1, and is an improvement over the algorithm suggested by [1]. The algorithm
was tested on a system with an Intel Dual Core processor at 2.0 GHz, 4 GB RAM and a 1 GB graphics card.
The algorithm can also be applied to other scripts such as Latin and Greek, as well as Indian scripts like
Devnagari.
Table-1. Time taken by each algorithm
Algorithm                 Time taken (in ms)
Aarti Desai et al. [1]    17
Proposed algorithm        15
Fig-16. The 3X3 window frame for the considered (m, n) pixel and its surrounding eight pixels
Fig-17. The final skeletonized letter H (after noise removal)
VII. ACKNOWLEDGEMENTS
We acknowledge the research facilities and development platform provided to us at the Video-Data
Analysis Systems Lab, Electro-Optical Tracking Division, Integrated Test Range, DRDO, Chandipur, Balasore,
India for the successful completion of our research.
REFERENCES
[1] Aarti Desai, Latesh Malik and Rashmi Welekar, A New Methodology for Devnagari Character Recognisation, JM International
Journal on Information Technology, Volume-1 Issue-1, ISSN: Print 2229-6115, January 2011, pp. 56-60
[2] G. Sanniti di Baja, Well-shaped, stable and reversible skeletons from the (3, 4)-distance transform, J. Visual Comm. Image
Representation, 1994, pp. 107-115
[3] Gisela Klette, Skeletons in Digital Image Processing, Centre for Image Technology and Robotics, Tamaki, CITR-TR-112, July
2002
[4] Khalid Saeed, Marek Tabedzki, Mariusz Rybnik and Marcin Adamski, K3M: A Universal Algorithm for Image Skeletonization
and a Review of Thinning Techniques, International Journal of Applied Mathematics and Computer Science, Volume-20, No. 2,
2010, pp. 317-335
[5] Rafael C. Gonzalez and Richard E. Woods, Digital Image Processing, Second Edition, Prentice-Hall, 2002, pp. 541-545
[6] D. Ballard and C. Brown, Computer Vision, Prentice-Hall, 1982, Chapter-8
[7] E. Davies, Machine Vision: Theory, Algorithms and Practicalities, Academic Press, 1990, pp. 149-161
[8] R. Haralick and L. Shapiro, Computer and Robot Vision, Vol. 1, Addison-Wesley Publishing Company, 1992, Chapter-5
IOSR Journal of Computer Engineering (IOSRJCE)
ISSN: 2278-0661 Volume 2, Issue 5 (July-Aug. 2012), PP 43-48
An Efficient Approach for Detection of Exudates in Diabetic
Retinopathy Images Using Clustering Algorithm
G.S.Annie Grace Vimala 1, Dr.S.Kaja Mohideen 2
1 (ECE, St.Josephs Institute of Technology, India)
2 (ECE, B.S.Abdur Rahman University, India)
Abstract: Diabetic retinopathy is a disorder that occurs due to a high blood sugar level. This disorder
affects the retina in many ways: blood vessels in the retina get altered, exudates are secreted, hemorrhages
occur, and swellings appear in the retina. Diabetic Retinopathy (DR) is a major cause of blindness. Automatic
recognition of DR lesions such as exudates in digital fundus images can contribute to the diagnosis and
screening of this disease. In this approach, an automatic and efficient method to detect exudates is proposed.
The real-time retinal images are obtained from a nearby hospital. The retinal images are pre-processed via
Contrast Limited Adaptive Histogram Equalization (CLAHE). The preprocessed colour retinal images are
segmented using the K-Means clustering technique. The segmented images establish a dataset of regions. To
classify these segmented regions into Exudates and Non-Exudates, a set of features based on colour and texture
is extracted. Classification is done using a Support Vector Machine (SVM). This method appears promising as
it can detect very small areas of exudates.
Keywords: Diabetic Retinopathy, Exudates, fundus image, K-means clustering, SVM
I. INTRODUCTION
The retina is the inner and most important layer of the eye. It is composed of several important
anatomical structures which can indicate various diseases. Cardiovascular diseases such as stroke and
myocardial infarction can be identified from the retinal blood vessels. Diabetic Retinopathy is the common
retinal complication associated with diabetes and a major cause of blindness in both middle and advanced age
groups. The International Diabetes Federation reports that over 50 million people in India have this disease and
that the number is growing rapidly (IDF 2009a) [2]. The estimated prevalence of diabetes for all age groups
worldwide was 2.8% in 2000 and is projected to be 4.4% in 2030, meaning that the total number of diabetes
patients is forecast to rise from 171 million in 2000 to 366 million in 2030 [3]. Therefore, regular screening is
the most efficient way of reducing vision loss.
Diabetic Retinopathy is mainly caused by changes in the blood vessels of the retina due to an increased
blood glucose level. Exudates are one of the primary signs of Diabetic Retinopathy [5]. They are yellow-white
lesions with relatively distinct margins, consisting of lipids and proteins that deposit and leak from the damaged
blood vessels within the retina. Detection of exudates by ophthalmologists is a laborious process, as they have
to spend a great deal of time in manual analysis and diagnosis. Moreover, manual detection requires the use of a
chemical dilation material, which takes time and has negative side effects on patients. Hence automatic
screening techniques for exudates are preferred.
1.1 Overview of State of Art
Alireza Osareh et al. [2] proposed a method for automatic identification of exudates based on a
computational intelligence technique. The colour retinal images were segmented using fuzzy c-means
clustering. Feature vectors were extracted and classified using a multilayer neural network classifier. Akara
Sopharak et al. [6] reported the results of automated detection of exudates from low-contrast digital images of
retinopathy patients with non-dilated pupils by Fuzzy C-Means (FCM) clustering. Four features, namely
intensity, standard deviation of intensity, hue and the number of edge pixels, were extracted and applied as
input to coarse segmentation using the FCM clustering method.
Niemeijer et al. [7] distinguished bright lesions such as exudates, cotton wool spots and drusen in
colour retinal images. In the first step, pixels were classified, resulting in a probability map that included the
probability of each pixel to be part of a bright lesion.
Walter et al. [4] identified exudates from the green channel of the retinal images according to their gray
level variation. The exudate contours were determined using mathematical morphology techniques. However,
the authors ignored some types of errors on the border of the segmented exudates in their reported performances
and did not discriminate exudates from cotton wool spots.
II. Imaging And Image Acquisition
To evaluate the performance of this method, the digital retinal images were acquired using a VISUCAM
from CARL ZEISS MEDITEC at a nearby hospital. These retinal images were acquired through a highly
sensitive colour fundus camera with the illumination, resolution, field of view, magnification and dilation
procedures kept constant.
Figure 2.1 Block Diagram
2.1 Image Pre-Processing And Processing
Colour fundus images often exhibit significant lighting variation, poor contrast and noise. In order to
reduce these imperfections [11] and generate images more suitable for extracting the pixel features in the
classification process, a preprocessing stage comprising the following steps is applied: 1) RGB to HSI
conversion, 2) Median Filtering, 3) Contrast Limited Adaptive Histogram Equalization (CLAHE).
RGB to HSI Conversion:
The input retinal images in the RGB colour space are converted to the HSI colour space. The noise in
the images is due to the uneven distribution of the intensity (I) component.
Median Filtering:
In order to distribute the intensity uniformly throughout the image, the I component of the HSI colour
space is extracted and filtered with a 3X3 median filter.
Contrast Limited Adaptive Histogram Equalization (CLAHE):
Contrast limited adaptive histogram equalization is applied to the filtered I component of the image
[12]. The histogram-equalized I component is combined with the H and S components and transformed back to
the original RGB colour space.
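The intensity-channel part of this chain can be sketched as follows. The sketch is dependency-light and makes two stated simplifications: the I component is computed directly as (R+G+B)/3, and plain global histogram equalization stands in for CLAHE (a real pipeline would use a CLAHE implementation such as OpenCV's cv2.createCLAHE).

```python
import numpy as np

def median3x3(img):
    """3x3 median filter via stacked neighbourhood shifts (borders kept)."""
    h, w = img.shape
    stack = np.stack([img[r:h - 2 + r, c:w - 2 + c]
                      for r in range(3) for c in range(3)])
    out = img.copy()
    out[1:-1, 1:-1] = np.median(stack, axis=0)
    return out

def preprocess_intensity(rgb):
    """Pre-process the intensity channel: I component, median filter,
    then histogram equalization (a simplified stand-in for CLAHE)."""
    i = rgb.astype(np.float64).mean(axis=2)          # I = (R + G + B) / 3
    i = np.clip(median3x3(i), 0, 255).astype(np.uint8)
    hist = np.bincount(i.ravel(), minlength=256)     # grey-level histogram
    cdf = hist.cumsum().astype(np.float64)
    cdf = (cdf - cdf.min()) / max(cdf.max() - cdf.min(), 1) * 255.0
    return cdf[i].astype(np.uint8)                   # equalized I channel
```

The equalized channel would then be recombined with H and S and converted back to RGB, as described above.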
Figure 2.2 Histogram of Green Component Figure 2.3 CLAHE - Green Component
2.2 Image Segmentation Based On K-Means:
Image segmentation is the process of partitioning an image into meaningful regions based on
homogeneity or heterogeneity criteria. Image segmentation can be pixel oriented, contour oriented, region
oriented, model oriented, colour oriented or hybrid. In this approach, we present a novel image segmentation
based on colour features of the images. The work is divided into two stages: first, colour separation is enhanced
by extracting the a*b* components from the L*a*b* colour space of the preprocessed image; then, the regions
are grouped into a set of five clusters using the K-means clustering algorithm. By this two-step process, we
reduce the computational cost by avoiding feature calculation for every pixel in the image [13].
The entire process can be summarized in the following steps:
Step 1: Read the image. Fig 2.4 shows an example input retinal image with exudates.
Step 2: Convert the image from the RGB colour space to the L*a*b* colour space (Figure 2.5). The L*a*b*
colour space helps us to quantify colour differences. It is derived from the CIE XYZ tristimulus values and
consists of a luminosity layer L*, a chromaticity layer a* indicating where the colour falls along the red-green
axis, and a chromaticity layer b* indicating where the colour falls along the blue-yellow axis. All of the colour
information is in the a* and b* layers. The difference between two colours can be measured using the Euclidean
distance.
Figure 2.4 Input Color retinal image. Figure2.5 CIE L*a*b colour space conversion
Step 3: Segment the colours in a*b* space using K-means clustering. Clustering is a way to separate groups of
objects. K-means clustering treats each object as having a location in space. It finds a partition such that objects
within each cluster are as close to each other as possible and as far from objects in other clusters as possible.
The algorithm requires that we specify the number of clusters to be partitioned and a distance metric to quantify
how close two objects are to each other. Since the colour information exists in the a*b* space, our objects are
pixels with a* and b* values. We use K-means to cluster the objects into five clusters using the Euclidean
distance metric.
Step 4: Label every pixel in the image using the results from K-means. For each object in the input, K-means
returns an index corresponding to a cluster. Label every pixel in the image with its cluster index.
Step 5: Create images that segment the original image by colour.
Step 6: Since the Optic Disc and Exudates are homogeneous in their colour properties, the cluster containing
the Optic Disc is localized for further processing.
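Steps 3 and 4 can be sketched with a minimal K-means implementation. This is illustrative only: `points` is an (N, 2) array of per-pixel (a*, b*) values, the RGB-to-L*a*b* conversion is assumed to have been done beforehand (e.g. with skimage.color.rgb2lab), and centres are seeded deterministically from evenly spaced points for reproducibility.

```python
import numpy as np

def kmeans_ab(points, k=5, iters=50):
    """Minimal K-means with the Euclidean metric; returns one cluster
    index per pixel plus the final cluster centres."""
    idx = np.linspace(0, len(points) - 1, k).astype(int)
    centers = points[idx].astype(float)
    for _ in range(iters):
        # assign each pixel to its nearest centre (Euclidean distance)
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # move each centre to the mean of its assigned pixels
        new = np.array([points[labels == j].mean(axis=0)
                        if np.any(labels == j) else centers[j]
                        for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers
```

Reshaping `labels` back to the image grid gives the per-pixel cluster index of Step 4, from which the per-cluster images of Step 5 follow.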
III. Feature Extraction:
To classify the localized segmented image into Exudates and Non-Exudates, a number of features based
on colour and texture are extracted using the Gray Level Co-occurrence Matrix (GLCM). The GLCM is a
tabulation of how often different combinations of pixel brightness values occur in pixel pairs in an image. Each
element (i, j) of the GLCM specifies the number of times that a pixel with value i occurred horizontally adjacent
to a pixel with value j. The resulting matrix is analyzed and, based on the extracted information, the feature
vectors are formed [14].
HOMOGENEITY:
Based on colour, the feature vector is computed directly from the RGB colour space [15] and is given by:
FV = [ FE(R), FE(RG), FE(RB), FE(G), FE(GB), FE(GR), FE(B), FE(BG), FE(BR) ]
where FE denotes creating a GLCM and computing the homogeneity of this matrix.
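The FE(.) operator can be sketched as below for a single channel. This is an illustrative reconstruction: it builds a horizontal GLCM and uses one common homogeneity definition, sum(P(i, j) / (1 + |i - j|)); a production pipeline would typically use skimage.feature.graycomatrix and graycoprops instead (whose homogeneity uses 1 + (i - j)^2 in the denominator).

```python
import numpy as np

def glcm_homogeneity(img, levels=256):
    """Homogeneity of the horizontal GLCM of an integer-valued channel.

    GLCM[i, j] counts how often gray level i occurs immediately to the
    left of gray level j; the matrix is normalised to probabilities
    before computing homogeneity.
    """
    img = np.asarray(img)
    glcm = np.zeros((levels, levels), dtype=np.float64)
    left, right = img[:, :-1].ravel(), img[:, 1:].ravel()
    np.add.at(glcm, (left, right), 1)       # accumulate horizontal pixel pairs
    glcm /= glcm.sum()                      # normalise to probabilities
    i, j = np.indices(glcm.shape)
    return float((glcm / (1.0 + np.abs(i - j))).sum())
```

Applying this to each channel and channel pair named in FV yields the nine-element feature vector above.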
3.1 CLASSIFICATION USING SUPPORT VECTOR MACHINE (SVM):
The standard SVM is a binary classifier that has found widespread use in pattern recognition
problems such as image and audio recognition, handwriting recognition, medicine, science and finance.
The support vector machine (SVM) framework is currently the most popular approach for "off-the-shelf"
supervised learning.
There are three properties that make SVMs attractive:
1. SVMs construct a maximum-margin separator: a decision boundary with the largest possible distance to
the example points. This helps them generalize well.
2. SVMs create a linear separating hyperplane, but they have the ability to embed the data into a higher-
dimensional space using the so-called kernel trick. Often, data that are not linearly separable in the original
input space are easily separable in the higher-dimensional space. The high-dimensional linear separator is
actually nonlinear in the original space. This means the hypothesis space is greatly expanded over methods
that use strictly linear representations.
3. SVMs are a nonparametric method: they retain training examples and potentially need to store them all.
In practice, however, they often end up retaining only a small fraction of the examples, sometimes as few
as a small constant times the number of dimensions. Thus SVMs combine the advantages of nonparametric
and parametric models: they have the flexibility to represent complex functions, but they are resistant to
overfitting.
The input points are mapped to a high-dimensional feature space where a separating hyperplane can be found.
The algorithm is chosen in such a way as to maximize the distance from the closest patterns, a quantity which is
called the margin. SVMs are learning systems designed to automatically trade off accuracy and complexity by
minimizing an upper bound on the generalization error. In a variety of classification problems, SVMs have
shown a performance which reduces training and testing errors, thereby obtaining a higher recognition
accuracy. SVMs can be applied to very high-dimensional data without changing their formulation.
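As a concrete illustration of the maximum-margin idea, the sketch below trains a tiny linear SVM by subgradient descent on the hinge loss. It is not the paper's implementation, which would normally rely on a library SVM (e.g. with an RBF kernel via the kernel trick); labels must be -1/+1.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Primal hinge-loss SVM: minimise lam*||w||^2/2 + mean(hinge)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        mask = margins < 1                 # points inside or beyond the margin
        if mask.any():
            grad_w = lam * w - (y[mask][:, None] * X[mask]).mean(axis=0)
            grad_b = -y[mask].mean()
        else:
            grad_w, grad_b = lam * w, 0.0  # only the regulariser remains
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(w, b, X):
    """Classify regions: +1 = exudate, -1 = non-exudate."""
    return np.where(X @ w + b >= 0, 1, -1)
```

For feature vectors that are linearly separable, the learned (w, b) is a separating hyperplane whose margin the hinge loss pushes toward the maximum.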
IV. Results and Discussion
Figure 4.1 Abnormal Image
Fig 4.1 shows the classification result using the Support Vector Machine (SVM) classifier for an abnormal
image. The preprocessed retinal image is converted to the L*a*b* colour space and the colour components alone
are extracted. They are then fed to the K-means clustering algorithm, which yields five clusters. Since the Optic
Disc and Exudates are homogeneous in colour, the cluster containing the Optic Disc is selected for feature
extraction. Based on the extracted features, the SVM is trained on normal and abnormal images. Finally, each
image is classified as exudate or non-exudate using the SVM.
Figure 4.2 Normal Image
Figure 4.2 shows the classification result using the Support Vector Machine (SVM) classifier for a normal image.
Table I shows the features (mean, standard deviation, energy and dissimilarity) extracted from five
samples of the publicly available DRIVE database. Table II shows the same features extracted from five
samples of real-time images.

Table I. Feature Extraction from DRIVE Images
Image     Mean   SD     Energy  Dissimilarity
Sample 1  25.55  26.98  0.0002  0.0231
Sample 2  75.10  56.52  0.0010  0.0852
Sample 3  43.04  36.96  0.0004  0.0326
Sample 4  39.07  31.19  0.0003  0.0459
Sample 5  36.73  31.49  0.0003  0.0372

Table II. Feature Extraction from Real-Time Images
Image     Mean    SD      Energy  Dissimilarity
Sample 1  83.361  55.696  0.0305  0.0388
Sample 2  83.041  53.928  0.0297  0.04
Sample 3  86.734  55.374  0.0321  0.0418
Sample 4  57.007  35.337  0.0136  0.0277
Sample 5  93.885  62.352  0.0385  0.045

Table III. Feature Extraction of Normal Image Samples (N1-N5) and Abnormal Image Samples (E1-E10)
Images  Correlation  Contrast  Energy  Homogeneity
E1      0.1437       0.0076    0.9801  0.9961
E2      0.1766       0.0145    0.9701  0.9927
E3      0.1381       0.0231    0.9533  0.9884
E4      0.3059       0.0145    0.9705  0.9927
E5      0.3751       0.0209    0.9573  0.9895
E6      0.1637       0.0209    0.9768  0.9946
E7      0.2153       0.0092    0.9807  0.9953
E8      0.2368       0.0077    0.9807  0.9961
E9      0.1780       0.0255    0.9838  0.9872
E10     0.1593       0.0223    0.9463  0.9888
N1      0.1037       0.0252    0.9527  0.9873
N2      0.1381       0.0323    0.9484  0.9838
N3      0.0981       0.0199    0.9324  0.9900
N4      0.1201       0.0231    0.9614  0.9909
N5      0.1161       0.0089    0.9112  0.9211

Table IV. Feature Extraction of Normal and Abnormal Images with a Single Reference Image
Table III shows the feature extraction of 10 abnormal images (E1 to E10) and 5 normal images (N1 to N5).
The correlation, contrast, energy and homogeneity have been compared for abnormal and normal images.
Table IV shows different parameter values for a single normal and a single abnormal image.
V. CONCLUSION
Exudates are one of the earlier signs of diabetic retinopathy. The low-contrast digital image is enhanced
using Contrast Limited Adaptive Histogram Equalization (CLAHE). The contrast-enhanced colour image is
segmented using K-means clustering, one of the simplest unsupervised learning algorithms for image
segmentation. The diabetic retinopathy images were collected from a nearby hospital and the extracted
features have been compared with the publicly available STARE and DRIVE databases. K-means clustering
takes less computational time than FCM and provides more colour information, which improves the
classification result. To classify the segmented images into Exudates and Non-Exudates, a set of features based
on texture and colour is extracted using the Gray Level Co-occurrence Matrix (GLCM). The selected features
are classified into Exudates and Non-Exudates using Support Vector Machine (SVM) classifiers. Using this
approach, the exudates are detected with a 96% success rate. The detection of microaneurysms, one of the
earlier symptoms of Diabetic Retinopathy, can be predicted and its performance compared in future work.
Using the same method, detection of maculopathy can also be done in future and the relevant features
extracted.
REFERENCES
Journal Papers:
[1] Gwenole Quellec, Stephen R. Russell, and Michael D. Abramoff, Optimal Filter Framework for Automated, Instantaneous
Detection of Lesions in Retinal Images, IEEE Transactions on medical imaging, vol. 30, no. 2, pp. 523-533, February 2011.
[2] Alireza Osareh, Bita Shadgar, and Richard Markham A Computational-Intelligence-Based Approach for Detection of Exudates in
Diabetic Retinopathy Images IEEE Transactions on Information Technology in Biomedicine, vol. 13, no. 4,pp.535-545,July 2009.
[3] Carla Agurto, Eduardo Barriga, Sergio Murillo, Marios Pattichis, Herbert Davis, Stephen Russell, Michael Abramoff, and Peter
Soliz Multiscale AM-FM Methods for Diabetic Retinopathy Lesion Detection. IEEE Trans. Medical. Imaging Vol. 29, No.
2,pp.502-512, February 2010.
[4] T. Walter, J. Klein, P. Massin, and A. Erginary,. A contribution of image processing to the diagnosis of diabetic retinopathy,
detection of exudates in colour fundus images of the human retina. IEEE Trans. Medical. Imaging,Vol. 21, No. 10, pp.12361243,
October. 2002.
[5] Akara Sopharak , Bunyarit Uyyanonvara and Sarah Barman Automatic Exudate Detection from Non-dilated Diabetic Retinopathy
Retinal Images Using Fuzzy C-means Clustering.Journal of sensors/2009. ISSN 1424-8220.www.mdpi.com/journal/sensors.
[6] Niemeijer. M, Van Ginneken. B, Russell. S.R and Abramoff. M.D, Automated detection and differentiation of drusen, exudates and
Cotton wool spots in digital color fundus photographs for diabetic retinopathy diagnosis , Invest. Ophthalmol Vis. Sci.,
Vol.48,pp. 2260-2267, 2007.
[7] Niemeijer.M, Abramoff.M.D, Van Ginneken.B, Information fusion for Diabetic Retinopathy CAD in Digital color fundus
photographs, IEEE Transactions on medical imaging, vol. 26, no. 10, pp. 1357-1365, October 2007.
[8] Ricci.E, Perfetti.R, Retinal Blood vessel segmentation using Line operators and Support Vector Classification IEEE Transactions
on medical imaging, vol. 28, no. 5,pp. 775-785, March 2009.
[9] Goatman.K.A, Fleming. A.D, Philip. S, William. G.T, Detection of New vessels on the Optic Disc using Retinal photographs
IEEE Transactions on medical imaging, vol. 30, no. 4,pp. 972-979, April 2011.
[10] Deepak.K.S, Sivaswamy. J, Automatic assessment of macular edema from color retinal images IEEE Transactions on medical
imaging, vol. 31, no. 3,pp. 766-776, March 2012.
[11] Huiqili, Chutatape. O, Automated feature extraction in color retinal images by a model based approach IEEE Transactions on
Bio-Medical Engineering, vol. 51, no. 2,pp. 246-254, February 2004..
[12] Aquine. A, Gegundez, Aries. M.E, Marin.D, Detecting the Optic Disc boundary in digital fundus images using morphological,
edge detection and feature extraction technique IEEE Transactions on medical imaging, vol. 29, no. 11,pp. 1860-1869, November
2011.
[13] Tobin.K.N, Chaum.E, Govindasamy.V.P, Detection of anatomic structures in human retinal imagery IEEE Transactions on
medical imaging, vol. 26, no. 12,pp. 1729-1739, December 2007.
[14] Akara Sopharak, Mathew N. Dailey, Bunyarit Uyyanonvara, Sarah Barman, Tom Williamson,Yin Aye Moe, Machine Learning
approach to automatic Exudates detection in retinal images from diabetic patients, Journal of Modern optics,2009.
[15] Fleming. AD, Philips. S, Goatman. KA, Williams. GJ, Olson. JA, sharp. PF, Automated detection of exudates for Diabetic
Retinopathy Screening, Journal on Phys Med.and Bio., vol. 52, no. 24, pp. 7385-7396, 2007.
Proceedings Papers:
[16] Doaa Youssef, Nahed Solouma, Amr El-dib, Mai Mabrouk, New Feature-Based Detection of Blood Vessels and Exudates in Color
Fundus ImagesIEEE conference on Image Processing Theory, Tools and Applications,2010,vol.16,pp.294-299.
[17] Sanchez. C.I, Mayo.A, Garcia. M, Lopez.M.I, Hornero. R, Automatic Image processing Algorithm to detect hard exudates based
on Mixture models IEEE conference on Engineering in medicine and Biology society, pp. 4453-4456, September 2006.
[18] Pradeep Kumar. A. V, Prashanth. C, Kavitha.G, Segmentation and grading of Diabetic retinopathic exudates using error boost
feature selection method World Congress on Information and Communication Technologies, pp. 518-523, December 2011.
[19] C. Sinthanayothin, Image analysis for automatic diagnosis of Diabetic Retinopathy, World Congress on Information and
Communication Technologies, pp. 522-532, December 2000.
IOSR Journal of Computer Engineering (IOSRJCE)
ISSN: 2278-0661 Volume 2, Issue 5 (July-Aug. 2012), PP 49-53
Non-Intrusive Speech Quality with Different Time Scale
Mr. Mohan Singh 1, Mr. Rajesh Kumar Dubey 2
1 (Dept. of Electronics and Communication, Satya College of Engineering and Technology, Palwal, India)
2 (Dept. of Electronics and Communication, Jaypee Institute of Information Technology, Noida, India)
Abstract: Speech quality evaluation is an extremely important problem in modern communication networks.
Service providers always strive to achieve a certain Quality of Service (QoS) in order to ensure customer
satisfaction, so modeling speech quality has become an urgent issue. In this work a computable model for
speech quality evaluation at different time scales, called the E-model, is developed. The results indicate that
subjects can monitor speech quality variations very accurately with a delay of approximately 1 second. Non-
intrusive speech quality is measured at the receiver from a degraded signal using G.107 (the E-model), which is
a parameter-based model, and MOS values are calculated from the quality rating factor. The quality rating
factor is calculated from network impairments (loudness rating) of the speech. The output from the model
described here is a scalar quality rating value, R, which varies directly with the overall conversational quality.
The key contribution of this paper is to explore the use of G.107 (E-model) based features for non-intrusive
speech quality evaluation at different time scales, using the time-varying loudness of speech for long stimuli.
Sectional speech quality is obtained by the E-model; this is called the instantaneous quality of the section and is
constant within each section. The overall perceived quality can be calculated as the average of the instantaneous
speech quality.
Keywords: Critical bands, E-model, loudness, MOS, non-intrusive speech quality
I. Introduction
The need to measure speech quality is a fundamental requirement in modern communications systems
for technical, legal and commercial reasons. Speech quality measurement can be carried out using either
subjective or objective methods. The Mean Opinion Score (MOS) [1] is the most widely used subjective
measure of voice quality and is recommended by the ITU [2]. A MOS value is normally obtained as an average
opinion of quality based on asking people to grade the quality of speech signals on a five-point scale (Excellent,
Good, Fair, Poor and Bad) under controlled conditions as set out in the ITU standard [2].
In voice communication systems, MOS is the internationally accepted metric as it provides a direct link
to voice quality as perceived by the end user [3]. The inherent problem with subjective MOS measurement is that
it is time consuming and expensive, lacks repeatability, and cannot be used for long-term or large-scale voice
quality monitoring in an operational network infrastructure. This has made objective methods very attractive to
estimate the subjective quality for meeting the demand for voice quality measurement in communication
networks. Objective measurement of voice quality in modern communication networks can be intrusive or non-
intrusive. Intrusive methods are more accurate, but normally are unsuitable for monitoring live traffic because of
the need for a reference data and to utilize the network. A typical intrusive method is based on the latest ITU
standard, P.862, Perceptual Evaluation of Speech Quality (PESQ) Measurement Algorithm [4]. This involves
comparison of the reference and the degraded speech signals to obtain a predicted listening-only one-way MOS
score. Since the quality of a speech signal does not exist independently of a subject, it is a subjective measure.
The most straightforward manner to estimate speech quality is to play a speech sample to a group of listeners,
who are asked to rate its quality. Since subjective quality assessment is costly and time consuming, computer
algorithms are often used to determine an objective quality measure that approximates the subjective rating.
Speech quality has many perceptual dimensions. Commonly used dimensions are intelligibility, naturalness,
loudness, listening effort, etc., while less commonly used dimensions include nasality, graveness, etc. However,
the use of a multidimensional metric for quality assessment is less common than the use of a single metric,
mainly as a result of cost and complexity. A single metric, such as the mean opinion score scale, gives an
integral (overall) perception of an auditory event and is therefore sufficient to predict the end-user opinion of a
speech communication system.
However, a single metric does not in general provide sufficient detail for system designers. The true
speech quality is often referred to as conversational quality. Conversational tests usually involve communication
between two people, who are questioned later about the quality aspects of the conversation; the most frequently
measured quantity is listening quality. In the listening context, the speech quality is mainly affected by speech
distortion due to speech codecs, background noise, and packet loss. One can also distinguish talking quality,
which is mainly affected by echo associated with delay and sidetone distortion. The distorted (processed) signal
or its parametric representation is always required in an assessment of speech quality. However, based on the
availability of the original (unprocessed) signal, two test situations are possible: reference based and non-
reference based. This classification is common to both the subjective and objective evaluation of speech
quality. The absolute category rating (ACR) procedure, popular in subjective tests, does not require the original
signal, while in the degradation category rating (DCR) approach the original signal is needed. In objective
speech quality assessment, the historically accepted terms are intrusive (with original) and non-intrusive
(without original).
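The G.107 E-model referred to in the abstract condenses the network impairments into a scalar rating factor R; the standard mapping from R to MOS defined in ITU-T Recommendation G.107 can be sketched as:

```python
def r_to_mos(r):
    """ITU-T G.107 (E-model) mapping from rating factor R to MOS.

    MOS = 1 for R <= 0, MOS = 4.5 for R >= 100, and in between
    MOS = 1 + 0.035*R + R*(R - 60)*(100 - R)*7e-6.
    """
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6
```

Applying this mapping per section gives the instantaneous (sectional) quality described in the abstract, whose average yields the overall perceived quality.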
II. Critical Bands
The concept of critical bands is introduced in this section, methods for determining their characteristics
are explained, and the scale of critical-band rate is developed. The definitions of critical-band level and
excitation level are given, and the three-dimensional excitation level versus critical-band rate versus time pattern
is illustrated. The concept of critical bands was proposed by Fletcher. He assumed that the part of a noise that is
effective in masking a test tone is the part of its spectrum lying near the tone. In order to obtain not only relative
values but also absolute values, the following additional assumption was made: masking is achieved when the
power of the tone and the power of that part of the noise spectrum lying near the tone and producing the
masking effects are the same; parts of the noise outside the spectrum near the test tone do not contribute to
masking. Characteristic frequency bands defined in this way have a bandwidth that produces the same acoustic
power in the tone and in the noise spectrum within that band when the tone is just masked. Fletcher's
assumptions may be used to estimate the width of characteristic bands, and we shall see later on how these
values compare with the critical bandwidths determined by other measurements.
However, the assumption that the criterion used by our hearing system to produce masked threshold is
independent of the frequency of the tone is incorrect. As will be discussed later, the power of the tone at masked
threshold is only about half to a quarter of that of the noise falling into the band in question. Using this
additional information, the width of the bands in question, the critical bands, can be estimated quite closely. At
low frequencies, critical bands show a constant width of about 100 Hz, while at frequencies above 500 Hz
critical bands show a bandwidth which is about 20% of centre frequency, i.e., in this range critical bandwidth
increases in proportion to frequency. In contrast with the estimation of the width of the critical band using the
assumption described above, there exist several direct methods for measuring the critical band.
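The frequency dependence just described (roughly constant 100 Hz bandwidth at low frequencies, roughly proportional to centre frequency above about 500 Hz) is commonly captured by Zwicker and Terhardt's analytic approximation, sketched below. The formula is not from this paper but reproduces the stated behaviour.

```python
def critical_bandwidth(f_hz):
    """Zwicker & Terhardt's approximation of critical bandwidth in Hz:
    CB(f) = 25 + 75 * (1 + 1.4 * (f/1000)^2)^0.69.

    Yields about 100 Hz at low frequencies and grows roughly in
    proportion to centre frequency above 500 Hz.
    """
    return 25.0 + 75.0 * (1.0 + 1.4 * (f_hz / 1000.0) ** 2) ** 0.69
```

Summing bandwidths along the frequency axis is what produces the critical-band rate (Bark) scale used later for the loudness computation.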
III. LOUDNESS
Loudness belongs to the category of intensity sensations. The stimulus sensation relation cannot be
constructed from the just-noticeable intensity variations directly, but has to be obtained from results of other
types of measurement such as magnitude estimation. In addition to loudness, loudness level is also important.
This is not only a sensation value but belongs somewhere between sensation and physical values. Besides
loudness in quiet, we often hear the loudness of partially masked sounds. This loudness occurs when a masking
sound is heard in addition to the sound in question. The remaining loudness ranges between a loudness of
zero, which corresponds to the masked threshold, and the loudness of the partially masked sound is mostly
much smaller than the loudness range available for unmasked sound. Partial masking can appear not only with
simultaneously presented maskers but also with temporary shifted maskers. Thus the eects of partially masked
loudness are both spectral and temporal.
Loudness comparisons can lead to more precise results than magnitude estimations. For this reason the
loudness level measure was created to characterize the loudness sensation of any sound. It was introduced in the
twenties by Barkhausen, the researcher whose name was shortened to create a unit for critical-band rate, the
Bark. The loudness level of a sound is the sound pressure level of a 1-kHz tone, in a plane wave with frontal incidence, that is as loud as the sound; its unit is the phon. Loudness level can be measured for any sound, but best known are the loudness levels for different frequencies of pure tones. Lines which connect points of equal loudness in
the hearing area are often called equal-loudness contours. They have been measured in several laboratories and
hold for durations longer than 500 ms. Because of the definition, all curves have to go through the sound
pressure level at 1 kHz that has the same value in dB as the parameter of the curve in phon: the equal-loudness
contour for 40 phon has to go through 40 dB at 1 kHz. Threshold in quiet, where the limit of loudness sensation
is reached, is also an equal-loudness contour. Because threshold in quiet corresponds to 3 dB at 1 kHz and not to 0 dB, this equal-loudness contour is indicated by 3 phon. Equal-loudness contours are normally drawn for a frontally-incident plane sound field. However, in many cases the sound field is not a plane sound field but similar to what is known as a diffuse sound field, in which the sound comes from all directions.
IV. Approach
For calculating speech quality on different time scales, we need to calculate the loudness of the speech signal and apply that loudness to the ITU-T G.107 E-model. The whole process is shown in Fig. 1.
Non-Intrusive Speech Quality With Different Time Scale
www.iosrjournals.org 51 | Page
Fig.1. Calculating Objective MOS
Objective MOS is calculated with the above approach. First take a speech file (.wav), calculate its power spectrum, and convert it to dB. This power spectrum is then passed through a 1/3-octave filter bank. An octave filter is a filter whose highest frequency is double its lowest frequency; it is used to split the speech spectrum into small sections known as bands (critical bands), and each octave is further divided into three 1/3-octave bands. Now for each critical band, the specific loudness is calculated. Total loudness is the sum of all specific loudnesses, and this loudness is applied to the E-model. The output of the E-model is the sectional quality rating factor R, and objective MOS is calculated using the formulae given in [1]:
For R < 0:       MOS = 1
For 0 < R < 100: MOS = 1 + 0.035R + R(R - 60)(100 - R) * 7 * 10^-6
For R > 100:     MOS = 4.5
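The R-to-MOS mapping above is the standard conversion from ITU-T G.107 and can be written directly as a small function:

```python
def r_to_mos(r):
    """Convert the E-model rating factor R to objective MOS (ITU-T G.107).

    R below 0 maps to MOS 1, R above 100 to MOS 4.5, and values in
    between follow the cubic formula quoted in the text."""
    if r < 0:
        return 1.0
    if r > 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6
```

The mapping rises from MOS 1.0 at R = 0 to MOS 4.5 at R = 100; for example r_to_mos(50) gives 2.575.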
V. Instantaneous Quality
The Time-Varying Speech Quality (TVSQ) method is motivated by the fact that the speech or voice quality of new networks varies, even during a single conversation, due to specific impairments like packet loss and handover in mobile networks [5]. During a communication, the speech quality can vary due to the special
technical characteristics of the different networks such as mobile or IP networks. The communication on these
networks can be impaired by different factors, e.g. distortion due to packet loss or bit rate, side tone, echo, compression algorithm, etc., which are very common. For the assessment of speech quality, ITU-T recommended
test methods are available but these methods use short speeches of ~8 seconds or ~16 seconds length. These
methods are standardized and well suited for conditions when the tested speech quality is expected to be constant. If
one wants to evaluate long speech sequence, he has to divide long speech sequences into short speech sequences
of ~8 seconds or ~16 seconds length. However, if one evaluates short speech sequences, then one cannot take into consideration any quality variations in time within the tested speech sequence. A test was conducted by France Telecom in July 1999. The purpose of the test was continuous assessment of speech quality, capturing quality variation in time and its relationship with overall subjective quality. In the subjective listening test conducted by France
Telecom, subjects were supposed to perform two tasks while listening to the speech sequences: instantaneous
quality judgments and overall quality judgments. Instantaneous quality is the quality perceived by the subjects at
any instant during the play out of the sequence whereas, overall quality is the single (scalar) judgment which
subjects give after listening to the whole speech sequence. After getting these two types of subjective judgments, it
was analyzed whether instantaneous judgment can be used to predict overall quality. Previous work from
AT&T has shown that overall quality might be predicted by a linear contribution of the MOS (Mean Opinion Scores) of short (8 sec) sound sequences assessed independently. Although the model proposed by AT&T is
useful, it does not take into consideration real instantaneous judgments and hypothesizes some integration of
perceived quality over 8 s. Moreover, the model was derived from stimuli that did not contain real degradations representing those found in wireless or landline packet networks [6].
[Fig. 1 blocks: .WAV file -> signal power spectrum -> convert to dB -> 1/3-octave spectrum -> specific loudness -> total loudness -> E-model -> objective MOS]
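The 1/3-octave band split used in the pipeline of Fig. 1 can be sketched as follows; the function name and the default band limits are illustrative. Since an octave filter doubles the frequency from its lower to its upper edge, each 1/3-octave band scales the frequency by 2^(1/3):

```python
def third_octave_edges(f_low=100.0, f_high=8000.0):
    """Band edges for a 1/3-octave filter bank: successive edges differ
    by a factor of 2**(1/3), so three bands span one octave (a doubling
    of frequency, as described in the text)."""
    ratio = 2.0 ** (1.0 / 3.0)
    edges = [f_low]
    while edges[-1] * ratio <= f_high:
        edges.append(edges[-1] * ratio)
    return edges
```

Three steps up the edge list give exactly one octave: with the defaults, the fourth edge is 200 Hz, double the first edge of 100 Hz.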
VI. Results
The time-domain speech signal of 8 sec. is given in Fig. 2; in this figure there is no voice between 2 and 6.5 sec. Voice is present for 0-2 sec. and 6.5-8 sec.
Fig.2: Time domain plot for 8 sec speech.
The instantaneous quality of the 8 sec. speech is given in Fig. 3, in which the first part covers the first 4 sec. and the second part the other 4 seconds; similarly, the time-domain plots and instantaneous quality for the 30 seconds and 60 seconds speech are given in Fig. 4, Fig. 5, Fig. 6 and Fig. 7 respectively.
Fig.3: Instantaneous quality for 8 sec
Fig.4: Time domain plot for 30 seconds speech.
Fig.5: Instantaneous quality for 30 sec speech.
Fig.6: Time domain plot for 60 sec speech
[Plot data for Figs. 2-6: time-domain plots (x-axis: time in seconds, y-axis: amplitude) and instantaneous quality variation plots (x-axis: time in seconds, y-axis: quality on the 1-5 MOS scale).]
Fig.7: Instantaneous speech quality for 60 sec. speech file.
VII. Discussion
Speech quality evaluation using the E-model (ITU-T G.107) for different time scales is calculated. Speech files of 8 seconds, 30 seconds and 60 seconds respectively are taken for sectional speech quality. This quality is obtained from the loudness of the speech file via the E-model, which is known as non-intrusive speech quality with different time scales. The output is the sectional quality for a speech file, and the sectional quality is constant within each section. This sectional quality differs between sections due to the different loudness of each section. The loudness of speech is thus very important and helpful for calculating speech quality. Objective MOS and average objective MOS are calculated for each section of the 30 sec and 60 sec speech respectively in Table 1.
Table 1: Sectional and average objective MOS for 30 sec and 60 sec speech
VIII. Conclusion
Speech quality of a speech signal is obtained by the conversational E-model ITU-T G.107, which gives an overall speech quality; the E-model is not normally used to find the instantaneous quality of a speech signal. In this paper, non-intrusive speech quality on different time scales is calculated by the E-model using the time-varying loudness of the speech signal. Time-varying speech quality is calculated for 8 sec., 30 sec. and 60 sec. speech signals. Instantaneous quality for the 8 sec. speech file is calculated per section and is constant within each section, the first section covering 4 sec. and the second section the remaining 4 sec.; finally, the average speech quality over 8 seconds is calculated for 24 speech files. Hence time-varying speech quality depends upon the time-varying loudness of the speech signal, and the overall perceived quality can be calculated using the average of the instantaneous speech quality.
References
[1] ITU-T Rec. G.107, The E-Model, a Computational Model for Use in Transmission Planning, 2003.
[2] ITU-T Rec. P.800, Methods for Subjective Determination of Transmission Quality, Int. Telecomm. Union, Aug. 1996.
[3] W. C. Hardy, QoS: Measurement and Evaluation of Telecommunications Quality of Service, John Wiley & Sons, ISBN 0-471-49957-9, 2001.
[4] ITU-T Rec. P.862, Perceptual Evaluation of Speech Quality (PESQ): An Objective Method for End-to-End Speech Quality
Assessment of Narrow-Band Telephone Networks and Speech Codecs, Geneva, Switzerland, Feb. 2001.
[5] ITU-T Rec. P.880, Methods for objective and subjective assessment of quality, August 2004.
[6] Contribution ITU-T [COM 12-94], Continuous assessment of time-varying subjective vocal quality and its relationship with overall
subjective quality, 1999.
Sections     Sec-1 OMOS  Sec-2 OMOS  Sec-3 OMOS  Sec-4 OMOS  Sec-5 OMOS  Sec-6 OMOS  Average OMOS
For 30 sec   3.3169      3.7856      3.1459      2.6123      1.9200      2.7918      2.9287
For 60 sec   3.3169      3.7856      2.9702      3.4817      3.4817      3.1459      3.3637
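The average objective MOS in Table 1 is simply the mean of the six sectional values, which can be checked directly (sectional values copied from Table 1):

```python
def average_omos(sectional_omos):
    """Average objective MOS over the per-section values, as in Table 1."""
    return sum(sectional_omos) / len(sectional_omos)

# Sectional OMOS values from Table 1
omos_30s = [3.3169, 3.7856, 3.1459, 2.6123, 1.9200, 2.7918]
omos_60s = [3.3169, 3.7856, 2.9702, 3.4817, 3.4817, 3.1459]
```

These means reproduce the Average OMOS column of Table 1 (about 2.9287 for the 30 sec file and 3.3637 for the 60 sec file).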