Wang 2010

EMD AND PSYCHOACOUSTIC MODEL BASED WATERMARKING FOR AUDIO
Liang Wang, Sabu Emmanuel, Mohan S. Kankanhalli

School of Computer Engineering, Nanyang Technological University, Singapore

School of Computing, National University of Singapore, Singapore
Email: {wangl, asemmanuel}@ntu.edu.sg, mohan@comp.nus.edu.sg
ABSTRACT

The audio watermarking method proposed in this paper offers copyright protection to an audio signal without using the original signal for watermark detection. The analysis filterbank decomposition, the psychoacoustic model and the empirical mode decomposition (EMD) are the three key techniques used in the novel audio watermarking method. Unlike traditional audio watermarking algorithms, where the watermark bits are embedded directly in the signal by either time domain or transform domain processing, the novel blind audio watermarking algorithm proposed in this paper embeds the watermark bits in the final residue of the subbands in the transform domain. Four watermark messages are embedded into the proposed audio watermarking system. The inaudibility, capacity and robustness of the audio watermarking system are evaluated in order to optimize the system performance. The experimental results show that the proposed blind watermarking scheme is robust against MP3 compression and additive Gaussian noise attacks.

Keywords: Blind audio watermarking, empirical mode decomposition (EMD), psychoacoustic model, analysis filterbank decomposition

We thank the Agency for Science, Technology and Research (A*STAR), Singapore for supporting this work under the project Digital Rights Violation Detection for Digital Asset Management (Project No: 0721010022).
978-1-4244-7493-6/10/$26.00 © 2010 IEEE, ICME 2010

1. INTRODUCTION

Over the past decades, significant effort has been focused on the copyright protection of digital media (audio, image, and video). A promising solution to this problem is the addition of a watermark to the digitized media, where the special information (the watermark message) is hidden in the original data in an imperceptible manner [1].

Compared to embedding watermarks into images, audio watermarking is a more challenging task because the human auditory system (HAS) is more sensitive to distortions than the human visual system (HVS) [2], and because inaudibility is much more difficult to achieve than invisibility for images [3]. Also, compared to visual signals, audio signals are represented by far fewer samples per time interval, which limits the watermark capacity for audio signals.

Several techniques in audio watermarking have been proposed to address these challenges, including echo coding [4], phase coding [5], patchwork coding [4], low-bit coding [1] and spread spectrum [6]. In all of these audio watermarking techniques, the watermark bits can be embedded either in the transform domain or in the time domain. In this research work, the transform domain watermark is studied because it has been demonstrated to be more robust against various attacks [2].

During the past few years, many of the developed audio watermarking algorithms [1][2][5] took advantage of the perceptual properties of the HAS in order to increase the robustness of the watermark message by maximizing its strength while embedding it in a perceptually transparent manner. Therefore, the psychoacoustic model is adopted to guarantee the imperceptibility of the watermark message. In order to use the psychoacoustic model efficiently on the transform domain audio data, a polyphase filterbank is used for the time to frequency mapping of the original input audio data.

Traditional multimedia watermarking in either the transform domain or the temporal domain tends to embed the watermark bits directly into the coefficients. The empirical mode decomposition (EMD) [7] has proved useful for processing nonlinear and non-stationary time series [8][9]. By using the EMD method, any multi-component signal is decomposed into a set of intrinsic mode functions (IMFs) and the final residual. This residual has been shown to be highly robust under Gaussian noise attack and MP3 compression [9]. Thus it is possible to embed the watermark bits robustly into the residual rather than the subband audio signal itself.

The rest of the paper is organized as follows. Section 2 gives an overview of the empirical mode decomposition. The novel audio watermark embedding and extracting procedure is proposed in Section 3. Section 4 presents the experimental results for the quality evaluation. The experimental results for the watermarking capacity and robustness, as well as the performance against signal processing attacks, are given in Section 5. Finally, the paper concludes in Section 6.

2. EMPIRICAL MODE DECOMPOSITION

A detailed, mathematically formulated introduction to the empirical mode decomposition (EMD) can be found in [7][8][9]. By applying the EMD, any multi-component signal is decomposed into a set of intrinsic mode functions (IMFs). The IMF can be defined as a hidden oscillation mode that is embedded in
the data series, and it is allowed to be non-stationary and either amplitude or frequency modulated. Here we only briefly introduce the extraction procedure of all the IMFs and the final residue.

For an arbitrary signal $P(t)$, EMD is performed and the signal is expressed as a sum of IMFs and a final residue as follows:

$$P(t) = \sum_{n=1}^{N} c_n(t) + r_m(t) \qquad (1)$$

where $c_n(t)$ is the $n$-th IMF of the signal and $r_m(t)$ is the final residue. The completeness and orthogonality of IMFs are shown by Huang [7]. It should be noted that, as the order of the mode increases, the time scale increases while the mean frequency of the mode decreases.

The final residue is a monotonic function and is the coarsest component of the signal. It has been shown that the final residue is stable under Gaussian noise and MPEG compression attacks [9]. Thus we choose to embed watermark bits into the final residue obtained from the EMD process of the audio signal.

An example of the empirical mode decomposition of an audio signal is shown in Fig. 1.

Fig. 1. The original PCM audio signal, its imf components (imf 1 - imf 8) and the final residue.

In our experiment, the watermark message is embedded into a Waveform Audio File Format (WAV) audio signal where the bit stream is encoded in the Pulse Code Modulation (PCM) format. For audio and speech processing, the PCM samples are stored and processed as floating point numbers that have zero mean (or a mean value that is sufficiently small compared with the amplitude of the signal) and vary in the interval [-1.0, 1.0]. Thus, compared with the original audio signal, the amplitude of its final residue can be regarded as sufficiently small (as shown in Fig. 1), which makes it possible to embed the watermarks in the final residue of the audio signal while keeping the watermark messages perceptually inaudible.

3. PROPOSED ALGORITHM

A novel blind audio watermarking embedding scheme is described in this section. The proposed embedding scheme (shown in Fig. 2) makes use of the polyphase filterbank, psychoacoustic models and the empirical mode decomposition.

3.1. Analysis Filterbank Decomposition

The subband analysis and synthesis filters defined in [10] are used as the polyphase filterbank in the experiment.

The analysis filterbank maps the PCM audio $X(t)$ into $M$ subbands. Here we let $M = 32$ to be consistent with the MPEG Audio Coding Standard defined in ISO/IEC 11172-3:1993 [10]. Thus, in each band, the subband audio signal can be calculated as:

$$S_i(t) = \sum_{k=0}^{63} \sum_{j=0}^{7} T(i)(k) \, \bigl( C(k + 64j) \, x(k + 64j) \bigr) \qquad (2)$$

where $i$ is the subband index and ranges from 0 to $M-1$ ($M = 32$ in this case); $S_i(t)$ is the filter output for subband $i$ at time $t$; $C(n)$ are the predefined analysis window coefficients [10]; $x(n)$ is the audio input sample; and the analysis matrix coefficients $T(i)(k)$ are also defined in [10].

In order to increase the watermark capacity, the transform domain audio samples in each subband are further divided into $N_S$ segments of $J$ samples each. Let us denote by $S_{i,j}(t)$ the data samples belonging to the $j$-th segment. Then we have

$$S_{i,j}(t) = S_i(j \cdot J + t) \qquad (3)$$

where $t = 0, 1, \ldots, J-1$ and $j = 0, 1, \ldots, N_S-1$.

3.2. Watermark Embedding Domain Control

It is often required to embed the watermark into a certain part of the audio, or to avoid embedding the watermark into a specific region, such as a silent region. Thus the watermark embedding domain control module is proposed, which makes it possible to embed the watermark bits in a more flexible manner. If we denote by $\tilde{i}$ the appropriate bands and by $\tilde{j}$ the appropriate segments for the watermarking process, then $S_{i,j}(t) := S_{i,j}(t)\big|_{i \in \tilde{i},\, j \in \tilde{j}}$ is defined as the segmented, band-pass filtered audio stream that is suitable for the watermark embedding.

3.3. Empirical Mode Decomposition

The EMD is applied to each of the segmented subband streams $S_{i,j}(t)$. Thus we have

$$S_{i,j}(t) = \sum_{n=1}^{N_{i,j}} c_{i,j,n}(t) + r_{m,i,j}(t) \qquad (4)$$

where $c_{i,j,n}(t)$ are the IMFs of the segmented stream $S_{i,j}(t)$, $N_{i,j}$ is the number of IMFs for the segmented stream $S_{i,j}(t)$ and $r_{m,i,j}(t)$ is the final residue of stream $S_{i,j}(t)$.

It is worth noting that the length of the segmented stream $S_{i,j}(t)$ may affect the watermarking system performance at a
noticeable level. A longer segment $S_{i,j}(t)$ provides more samples on which to perform the EMD, so better performance can be expected. However, in the proposed audio watermarking system, the watermarking capacity is then reduced, as more audio samples are used to carry 1 bit.

Fig. 2. Block diagram of the watermark embedding procedure. [Blocks, in order: PCM audio input X(t); analysis filterbank (M-band); segmentation of bands 0 to M-1; EMD for each segment j in each band, giving the IMFs and the mean trend; watermark bits w(j) from the watermark generator embedded into the mean trends, with watermark strength adjustment driven by FFT-based masking thresholds and by the watermark embedding domain control; synthesis filterbank; watermarked PCM audio output Xw(t).]

3.4. Watermark Embedding

In order to increase the watermarking capacity, the subband audio stream $S_i(t)$ is embedded with the watermark sequence $W_i(j) = \{w_{i,j}\}$, $w_{i,j} \in \{-1, +1\}$ and $0 \le j \le N_S-1$. Since the watermarking robustness generally increases with the amplitude of the host audio signal, a signal-dependent watermark should be embedded in the host audio.

Each watermark bit $w_{i,j}$ is embedded into the $j$-th segment in the $i$-th subband of the original audio signal by modifying its final residue $r_{m,i,j}(t)$. The segmented stream $S_{i,j}(t)$ after embedding the watermark bit stream $W_i(j)$ is given by the following equation:

$$S^{w}_{i,j}(t) = S_{i,j}(t) - r_{m,i,j}(t) + \alpha_{i,j} W_i(j) = \sum_{n=1}^{N_{i,j}} c_{i,j,n}(t) + \alpha_{i,j} W_i(j) \qquad (5)$$

if $i \in \tilde{i}$ and $j \in \tilde{j}$, that is, for the bands and segments selected by the embedding domain control of Section 3.2. The $\alpha_{i,j}$ in the above equation is the weight for thresholding the amplitude of the watermark strength. The calculation of $\alpha_{i,j}$ is discussed in the following subsection. For the segmented stream $S_{i,j}(t)$ where $i \notin \tilde{i}$ or $j \notin \tilde{j}$, the audio signal is not watermarked.

3.5. Watermark Strength Adjustment

In order to embed the watermark bits robustly while remaining imperceptible, the psychoacoustic model [10] is used to determine the maximum possible power of the watermark message. By calculating the signal-to-mask ratio (SMR) for each segment in each subband, the total maximum possible watermark strength can also be obtained.

If we denote the signal-to-mask ratio for segment $j$ in subband $i$ as $SMR_{i,j}$, then we should have

$$\sum_{t=0}^{J-1} \bigl| r_{m,i,j}(t) - \alpha_{i,j} W_i(j) \bigr| \le \sum_{i=0}^{31} SMR_{i,j}, \qquad i \in \tilde{i},\ j \in \tilde{j} \qquad (6)$$

where $\alpha_{i,j}$ is the weight for thresholding the amplitude of the watermark strength, and $\alpha_{i,j}$ should be proportional to the signal strength $S_{i,j}(t)$ since the masking threshold is also proportional to the signal strength.

3.6. Watermarked Audio

For any $0 \le i \le M-1$, the watermarked streams $S^{w}_{i,j}(t)$ ($0 \le j \le N_S-1$, $j \in \tilde{j}$) and the non-watermarked streams $S^{w}_{i,j}(t)$ ($0 \le j \le N_S-1$, $j \notin \tilde{j}$) are combined into the subband signals $S^{w}_{i}(t)$, and the synthesis filterbank [10] is used for the reconstruction. The watermarked PCM audio signal is denoted as $X^{w}(t)$.

3.7. Analysis Filterbank Decomposition

With a watermarked PCM audio signal in hand, one can extract the watermark bits by following the procedure shown in Fig. 3.

Apply the same analysis filterbank used in the watermark embedding procedure to decompose the watermarked PCM audio $X^{w}(t)$, after which $M$ subband signals $S^{w}_{i}(t)$, $0 \le i \le M-1$, are obtained.
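As an illustration of the segmentation of Eq. (3) and the embedding rule of Eq. (5), the following minimal Python sketch embeds one bit per segment of a single subband signal. It is not the implementation used in the experiments: the EMD final residue is replaced here by the segment mean, which is a simplifying assumption, and the function names and the fixed `strength` argument (standing in for the psychoacoustically derived weight of Section 3.5) are hypothetical.

```python
from statistics import fmean


def segment(subband, seg_len):
    """Split a subband signal S_i(t) into segments of seg_len samples (Eq. 3)."""
    n_seg = len(subband) // seg_len
    return [subband[j * seg_len:(j + 1) * seg_len] for j in range(n_seg)]


def mean_trend(seg):
    """ASSUMPTION: approximate the EMD final residue by the segment mean.

    A real system would run EMD and keep the residue left after all IMFs.
    """
    m = fmean(seg)
    return [m] * len(seg)


def embed_bit(seg, bit, strength):
    """Eq. (5): replace the residue by strength * bit, with bit in {-1, +1}."""
    if bit not in (-1, 1):
        raise ValueError("watermark bits must be -1 or +1")
    residue = mean_trend(seg)
    return [s - r + strength * bit for s, r in zip(seg, residue)]


def embed(subband, bits, seg_len, strength):
    """Embed one bit per segment; segments without a bit stay unchanged."""
    segs = segment(subband, seg_len)
    out = []
    for j, seg in enumerate(segs):
        if j < len(bits):
            out.extend(embed_bit(seg, bits[j], strength))
        else:
            out.extend(seg)
    out.extend(subband[len(segs) * seg_len:])  # leftover tail, not watermarked
    return out
```

With this stand-in residue, each watermarked segment has a mean of `strength * bit`, so the sign of the segment sum carries the embedded bit.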
Fig. 3. Block diagram of the watermark extraction procedure. [Blocks, in order: watermarked PCM audio Xw(t); analysis filterbank (M-band); segmentation of bands 0 to M-1 under the watermark embedding domain control; EMD for each segment j in each band, giving the mean trend; sum of the mean trend over t = 0, ..., J-1; watermark bits calculation; extracted watermarks.]

3.8. Watermark Embedding Domain Control

The same watermark embedding domain defined in the previous section is used for the watermark extraction process. The watermark message should be extracted from $S^{w}_{i,j}(t)$, where $i \in \tilde{i}$ and $j \in \tilde{j}$ are the bands and segments selected for watermarking.

3.9. Empirical Mode Decomposition

Applying EMD to each of the watermarked segments $S^{w}_{i,j}(t)$, one obtains

$$S^{w}_{i,j}(t) = \sum_{n=1}^{N^{w}_{i,j}} c^{w}_{i,j,n}(t) + r^{w}_{m,i,j}(t) \qquad (7)$$

where $c^{w}_{i,j,n}(t)$ are the IMFs of the watermarked signal $S^{w}_{i,j}(t)$ and $r^{w}_{m,i,j}(t)$ is the final residue of the watermarked signal.

3.10. Watermark Bits Calculation and Watermark Message Extraction

The final residue value $r^{w}_{m,i,j}(t)$ is used to determine the embedded watermark bits. The watermark bit $w'_{i,j}$ can be calculated by using the following formulas:

$$w'_{i,j} = 1, \quad \text{if } \sum_{t=0}^{J-1} r^{w}_{m,i,j}(t) \ge 0 \qquad (8)$$

$$w'_{i,j} = -1, \quad \text{if } \sum_{t=0}^{J-1} r^{w}_{m,i,j}(t) < 0 \qquad (9)$$

Thus, for each of the watermarked subbands $i$, where $i \in \tilde{i}$, the corresponding watermark message can be extracted as $W'_i(j) = \{w'_{i,j}\}$, $w'_{i,j} \in \{-1, +1\}$ and $0 \le j \le N_S-1$.

4. QUALITY EVALUATION

Subjective quality evaluation of the watermarking scheme was conducted by listening tests. A total of 10 test audio clips were used for the evaluation. All audio files are 16-bit mono audio sampled at 44.1 kHz (CD quality), ranging from 1 to 5 min in length. The types of music include classical, rock, jazz and electronic music. Altogether 20 listeners participated in the listening test. None of the participants was trained for the listening test; all of them were ordinary music listeners. All participants were given the instructions for the listening test just before the test began, and they all used their own headsets.

In the first part of the quality evaluation, participants were given the non-watermarked and watermarked audio files in random order, and they had to identify the watermarked ones blindly. For each audio file, the listeners could choose one of three options: 1) non-watermarked, 2) watermarked, or 3) cannot tell the difference. Most of the listeners could not tell the difference, which indicates that the non-watermarked and watermarked audio cannot be discriminated.

In the second part of the evaluation, with prior knowledge of which audio files were non-watermarked and which were watermarked (with one watermark message only), the listeners were asked to report the dissimilarities between the two signals, using the so-called Subjective Difference Grades (SDG) [11] as described in Table 1.

Table 1. The grading scale used in the listening test
Impairment description         | Grade | SDG = Grade_watermarked - Grade_reference
Imperceptible                  | 5.0   |  0.0
Perceptible, but not annoying  | 4.0   | -1.0
Slightly annoying              | 3.0   | -2.0
Annoying                       | 2.0   | -3.0
Very annoying                  | 1.0   | -4.0

The audio quality of a watermarking system can be linked to the perceived difference (impairment) between the watermarked audio signal and the original audio signal. To facilitate data analysis, the subjective difference grade (SDG) is calculated as the difference between the grade of the watermarked signal and the grade of the original signal.
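The SDG computation is simple arithmetic, and a minimal Python sketch is given here for concreteness. The grade pairs are invented for illustration only and are not the listening-test data; the function names are hypothetical.

```python
from statistics import fmean


def sdg(grade_watermarked, grade_reference):
    """Subjective difference grade: watermarked grade minus reference grade."""
    return grade_watermarked - grade_reference


def mean_sdg(pairs):
    """Average SDG over (grade_watermarked, grade_reference) pairs."""
    return fmean(sdg(w, r) for w, r in pairs)


# Illustrative grades on the 1.0 to 5.0 scale of Table 1 (made up, not measured).
example_pairs = [(5.0, 5.0), (4.0, 5.0), (5.0, 5.0), (5.0, 5.0)]
print(mean_sdg(example_pairs))  # -0.25
```

A mean SDG close to 0.0 corresponds to the "Imperceptible" end of the scale, while values near -4.0 correspond to "Very annoying".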
In the subjective listening test, the average SDG score was -0.15. This indicates that the proposed EMD based watermarking scheme causes almost no perceptible distortion to the watermarked signal.

5. WATERMARKING CAPACITY AND ROBUSTNESS EXPERIMENTAL RESULTS

The inaudibility of the proposed watermarking system was discussed in the previous section. This section evaluates the watermarking capacity and robustness of the EMD and filterbank analysis based watermarking system.

Fig. 4. The average misdetected watermark bits per second versus the segment length (watermarks embedded into 4 subbands). [Plot: x-axis, the length of segment for EMD process (0 to 1200); y-axis, misdetected watermark bits/sec (0 to 2).]

Given a host audio signal to be watermarked, the maximum number of watermark bits that can be embedded in each subband $i$ ($0 \le i \le M-1$) is determined by the length of the segment ($J$) used for the EMD process. For 1 min of mono audio with a sampling rate of 44.1 kHz, the maximum number of watermark bits that can be embedded in each subband can be calculated as:

$$\frac{44100 \times 60}{32 \times J} = \frac{82687.5}{J}.$$
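As a quick check of this formula, the Python sketch below evaluates the per-subband capacity for a given segment length. The function name and default arguments are assumptions for illustration; the division by 32 follows the formula above, in which each of the 32 subbands carries 1/32 of the input samples.

```python
def bits_per_subband(duration_s, seg_len, sample_rate=44100, n_subbands=32):
    """Maximum watermark bits per subband: one bit per segment of seg_len samples."""
    subband_samples = sample_rate * duration_s / n_subbands
    return subband_samples / seg_len


per_minute = bits_per_subband(60, 32)  # 82687.5 / 32
per_second = per_minute / 60
print(round(per_minute, 2))  # 2583.98
print(round(per_second, 2))  # 43.07
```

For a segment length of 32 samples this gives roughly 43 bits per second in each subband.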
In this experiment, we fix the subbands used for watermarking and then vary the length of the segment in order to optimize the performance of the watermarking system. In order to avoid the overlapping effect of adjacent filter bands, subband 3, subband 7, subband 11 and subband 15 are used to embed watermark messages. We evaluate the watermarking system performance by varying the segment length from 32 samples to 1,024 samples in increments of 2.

We have discussed previously that the mean value for a segment of the PCM audio signal may no longer be zero, so the final residue of the EMD for that segment would be increased and the watermark strength would be increased as well. In the watermark embedding procedure, since the total modification of the final residue is bounded by the masking threshold calculated from the psychoacoustic model, the accuracy of the watermark extraction decreases when the required watermark strength increases. If 4 subbands are chosen to be embedded with watermark messages, the maximum number of watermark bits is achieved with the shortest segment length. In our experiment, the maximum number of watermark bits per second is 172 when the segment length is 32. Since 4 subbands are embedded with the watermark messages, 172/4 = 43 bits/sec are embedded into each subband. For a typical audio file with 5 min duration, the watermarking capacity for each subband would be 12,900 bits.

Fig. 4 shows the relationship between the misdetected watermark bits and the segment length. As expected, when the segment length is reduced, the number of misdetected watermark bits increases. When the EMD process is applied to segments with a length of 32 samples, the average number of misdetected watermark bits in one second is 1.80. The minimum number of misdetected watermark bits in one second that can be achieved is 0.0167 bits. When the segment length is longer than 600, the misdetected watermark bits in one second are fewer than 0.1 bits.

It should be noted that Fig. 4 only shows the average misdetected watermark bits in one second. The actual number of misdetected watermark bits varies between audio files, since the corresponding masking thresholds differ as well.

Fig. 5. The bit error rate versus the segment length (watermarks embedded into 4 subbands). [Plot: x-axis, the length of segment for EMD process (0 to 1200); y-axis, bit error rate (0 to 0.016).]

Fig. 5 shows the relationship between the average bit error rate and the length of the segment for the EMD process when 4 watermark messages are embedded into subband 3, subband 7, subband 11 and subband 15.

The minimum bit error rate achieved is 0 when 4 watermark messages are embedded, while the maximum bit error rate is 0.0153 when 4 watermark messages are embedded. For segment lengths for the EMD process varied from 32 to 1,024 in increments of 2, the average bit error rate is 0.0153.

From Fig. 5, no direct conclusion can be made. This is because in the proposed audio watermarking system, the system performance (in terms of the bit error rate) is closely related to the watermark strength, which depends on the masking threshold for the particular audio segment. However, the bit error rate varies in a limited region and shows no significant change within that region. Considering the cryptographic security of the proposed watermark system, for which a higher watermarking capacity is preferred, one can choose the smallest segment length. In our experiment, we choose a segment length of 32 for the EMD process.

Thus we choose to embed 4 watermark messages into subbands 3, 7, 11 and 15, with one watermark message embedded into each subband, and to use segments of 32 samples for the EMD process. Since each subband has a watermarking capacity of 43 bits/sec, the total watermarking capacity is 172 bits/sec (43 × 4). With this configuration, the bit error rate is 1.04e-2, which is very small. We can apply an error correction technique to the watermark information bits before embedding, which would help to restore the watermark bits even if misdetection exists.

Table 2. The bit error rate (BER) of the proposed watermark system under signal processing attacks
Attack | MP3 compression (128 kbps) | Adding Gaussian noise
BER    | 1.43e-02                   | 1.15e-02

It can be seen from Table 2 that the proposed audio watermarking scheme is robust against MP3 compression and Gaussian noise attacks. In particular, under the Gaussian noise attack the bit error rate is still very low. This is because, as the EMD decomposition proceeds, the time scale increases while the mean frequency of the modes decreases. Thus the IMFs are extracted from the signal starting with the finest scale, and the remaining final residue is the coarsest component of the signal [7]. Therefore the zero mean Gaussian noise is sifted into the lower order IMFs, and the final residue (the mean trend) remains unaffected.

Since no other implementation of the watermarking system or test data is readily available at our site, no direct comparison can be made in this paper. However, it should be noted that Cvejic et al. [12] proposed a spread spectrum based audio watermarking scheme in the temporal domain and claimed a watermarking capacity of 14.7 bits per second for the mono audio signal. In their later research [13], a spread spectrum based audio watermarking scheme in the spectral domain is proposed, with the watermarking capacity increased to 27.1 bits per second. Bassia et al. [14] proposed an audio watermarking method that embedded the watermarks in the temporal domain segmented audio signal. Each watermark bit is embedded in each of the audio samples. In order to obtain a higher performance, the watermark bits are repeatedly embedded in the audio with the embedding length of 217 samples. However, when embedded with multiple watermarks, the subjective quality evaluation could not provide promising results (the resulting watermarking system would produce a noticeable distortion when 4 watermark messages are embedded).

6. CONCLUSION

In this paper, a novel blind audio watermarking system based on the psychoacoustic model, the polyphase filterbank analysis and the empirical mode decomposition is proposed. In our approach, the analysis filterbank is used to decompose the host audio signal into multiple subbands, and each of the subbands can be embedded with a unique watermark message. Within each subband, the signal is first segmented and the empirical mode decomposition (EMD) is applied to each of the segments. The watermark bits are embedded into the final residue extracted by the EMD process. The inaudibility of the watermarks is ensured by the use of the psychoacoustic model. The watermark extraction procedure does not use the original audio signal. The proposed blind audio watermarking scheme is shown to be robust against MP3 compression and additive Gaussian noise attacks. However, this method may not be robust to some other attacks such as band-pass filtering and cropping. Ongoing research focuses on increasing the robustness of our method against such attacks.

7. REFERENCES

[1] N. Cvejic, "Algorithms for Audio Watermarking and Steganography," Ph.D. diss., Department of Electrical and Information Engineering, University of Oulu, 2004.
[2] I.J. Cox and M.L. Miller, Digital Watermarking, Morgan Kaufmann, 2002.
[3] F. Hartung and M. Kutter, "Multimedia watermarking techniques," Proc. of IEEE, vol. 87, no. 7, Jul. 1999.
[4] W. Bender, D. Gruhl, N. Morimoto, and A. Lu, "Techniques for data hiding," IBM Systems Journal, 35(3/4), pp. 313-336, 1996.
[5] X. He, A. Iliev, and M. Scordilis, "A high capacity watermarking technique for stereo audio," in IEEE International Conference on Acoustic, Speech, and Signal Processing, vol. 5, pp. 393-396, 2004.
[6] D. Kirovski and H.S. Malvar, "Spread-spectrum watermarking of audio signals," IEEE Transactions on Signal Processing, Special Issue on Data Hiding, 51(4), pp. 1020-1033, 2003.
[7] N.E. Huang, Z. Shen, S.R. Long, M.C. Wu, H.H. Shih, Q. Zheng, N.C. Yen, C.C. Tung, and H.H. Liu, "The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis," Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, vol. 454, no. 1971, pp. 903-995, Mar. 1998.
[8] H. Liang, S.L. Bressler, R. Desimone, and P. Fries, "Empirical mode decomposition: A method for analyzing neural data," Computational Neuroscience: Trends in Research, vol. 65-66, pp. 801-807, Jun. 2005.
[9] N. Bi, Q. Sun, D. Huang, Z. Yang, and J. Huang, "Robust image watermarking based on multiband wavelets and empirical mode decomposition," IEEE Transactions on Image Processing, vol. 16, no. 8, pp. 1956-1966, Aug. 2007.
[10] ISO/IEC Int'l Standard IS 11172-3:1993, Information Technology - Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to about 1.5 Mbits/s - Part 3: Audio.
[11] C. Neubauer and J. Herre, "Digital watermarking and its influence on audio quality," 105th Convention of the Audio Engineering Society, pp. 225-233, 1998.
[12] N. Cvejic, A. Keskinarkaus, and T. Seppanen, "Audio watermarking using m-sequences and temporal masking," IEEE Workshop on the Applications of Signal Processing on Audio and Acoustics, pp. 227-230, 2001.
[13] N. Cvejic and T. Seppanen, "Spread spectrum audio watermarking using frequency hopping and attack characterization," Signal Processing, vol. 84, no. 1, pp. 207-213, 2004.
[14] P. Bassia, I. Pitas, and N. Nikolaidis, "Robust audio watermarking in the time domain," IEEE Transactions on Multimedia, vol. 3, no. 2, pp. 232-241, 2001.