1 Introduction
Signal denoising is a task where we estimate the original signal x ∈ R^n from its noisy observation z ∈ R^n, represented as

z = x + n,    (1)

where n represents Additive White Gaussian Noise (AWGN) with standard deviation σ. Sparsity-based solutions have a clear advantage over other solutions because of their low space and time complexity. A successful method of convex denoising using tight frame regularization was derived in [1]. This paper aims to provide a pedagogical explanation of the theory proposed in [1]. In this work, a denoising method
using Non-Convex Penalty (NCP) functions and the decimated wavelet transform matrix W [2] is proposed. The wavelet transform matrix W contains sub-band wavelet matrices corresponding to different levels of transformation. The problem formulation is derived using W because it is less complicated and easy to manipulate. The problem formulation for the signal denoising problem using W is given as

argmin_x F(x) := (1/2)||z − x||₂² + λ_1 Σ_{i_1} φ([W_1 x]_{i_1}; c_1) + ... + λ_{L+1} Σ_{i_{L+1}} φ([W_{L+1} x]_{i_{L+1}}; c_{L+1})    (2)

Shivkaran Singh · Sachin Kumar S · Soman K P
Center for Computational Engineering and Networking (CEN), Amrita School of Engineering, Coimbatore, Amrita Vishwa Vidyapeetham, Amrita University, India 641112
e-mail: shvkrn.s@email.address
2 Methodology
2.1 Sparsity Inducing Functions
The convex proxy for sparsity, i.e., the ℓ1 norm, has a special importance in sparse signal processing. Nevertheless, sparsity-inducing NCP functions provide better estimates in several signal estimation problems. In the formulation (2), the parameter c ensures that the addition of a non-smooth penalty function does not alter the overall convexity of the cost function F. The parameter c can be chosen to make the NCP function maximally non-convex [4]. In order to maintain the convexity of the cost function F, the range derived in [3] is utilized, which is given by

0 < c_j < 1/λ_j    (3)

where j = 0, 1, 2, ..., L corresponds to the different levels of transformation. If the parameter c fails to comply with (3), then a globally optimal solution (minimum) cannot be ensured, meaning the geometry of the NCP function affects the solution of the overall cost function. We will later see, in Section 4, that c_j = 1/λ_j provided competitive results. Along with the aforementioned condition, any non-convex penalty function φ(·; c): R → R must adhere to the following conditions [1] to provide a global minimum:
1. φ is a continuous function on R
2. φ is twice differentiable on R \ {0}
3. The slope of φ(x) is unity at the immediate right of the origin, and negative unity at the immediate left:
   φ′(0+) = 1 and φ′(0−) = −1
4. For x > 0, φ′(x) decreases from 1 to 0, and for x < 0, φ′(x) increases from −1 to 0 (refer to Figure 1)
5. φ(x) is symmetric, i.e., the function is even in x:
   φ(−x; c) = φ(x; c)
6. The ℓ1 norm is retrieved as a special case of φ(x; c), with c = 0
7. The greatest lower bound of φ″(x) is −c
Examples of several NCP functions used in the experimentation are listed in Table 1. An NCP function is not differentiable at the origin because its derivative φ′(x) is discontinuous at x = 0, as illustrated in Fig. 1, where the variation of the function with different values of the parameter c can also be observed. It turns out that, mathematically, we can generalize the derivative of a non-differentiable function using the sub-gradient. In convex analysis, sub-gradients are used when a convex function is not differentiable [6].
Table 1: NCP functions

Penalty      | Expression
Rational     | φ(x) = |x| / (1 + c|x|/2)
Logarithmic  | φ(x) = (1/c) log(1 + c|x|)
Arctangent   | φ(x) = (2/(c√3)) (tan⁻¹((1 + 2c|x|)/√3) − π/6)
2.2 Majorization-Minimization

The M-M algorithm constructs, at each iteration, a surrogate function that majorizes the cost function and then minimizes this surrogate for minimizing the cost function; hence it reads majorize-minimize. Further explanation will consider the case of minimizing the cost function. The initial part of the M-M algorithm's implementation is to define the majorizer G_m(x), m = 0, 1, 2, ..., where m denotes the current majorizer. The idea behind using a majorizer is that it is easier to minimize than F(x) (G_m(x) must be a convex function). A function G_m(x) is called a majorizer of another function F(x) at x = x_m iff

F(x) ≤ G_m(x) for all x    (4)
F(x) = G_m(x) at x = x_m    (5)
In other words, (4) signifies that the surface of G_m(x) always lies above the surface of F(x), and (5) signifies that G_m(x) is tangent to F(x) at the point x = x_m. The basic intuition behind equations (4) and (5) can be developed from Fig. 2. In the M-M algorithm, for each iteration we must first find a majorizer that is an upper bound and then minimize it; hence the name Majorization-Minimization. If x_{m+1} is the minimum of the current proxy function G_m(x), the M-M algorithm drives the cost function F(x) downwards. It is demonstrated in [5] that the numerical stability of the M-M approach depends on decreasing the proxy function rather than fully minimizing it. The M-M approach works in a similar way for multi-dimensional functions, where it is even more effective.

As a general practice, quadratic functions are preferred as majorizers because their derivative provides a linear system. A polynomial of higher order could also be used; however, it would make the solution difficult (a non-linear system). To minimize F(x) using the M-M approach, we can majorize either the data term (1/2)||z − x||² [9] or the NCP function [10], or both. In our experimentation, we majorized the NCP function.
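The majorize-minimize idea can be illustrated with a minimal scalar sketch (the function names, parameter values, and the closed-form quadratic minimization are our illustrative choices) for F(x) = ½(z − x)² + λφ(x; a) with the arctangent penalty: each iteration replaces φ by a quadratic tangent at the current point and minimizes the result, and the recorded cost values decrease monotonically:

```python
import math

def phi(x, a):
    # Arctangent NCP function from Table 1
    s3 = math.sqrt(3.0)
    return (2.0 / (a * s3)) * (math.atan((1.0 + 2.0 * a * abs(x)) / s3) - math.pi / 6.0)

def dphi(x, a):
    # Derivative of the arctangent penalty for x != 0
    return math.copysign(1.0, x) / (1.0 + a * abs(x) + (a * abs(x)) ** 2)

def mm_scalar_denoise(z, lam, a, iters=30):
    # Minimize F(x) = 0.5*(z - x)^2 + lam*phi(x; a) by majorize-minimize:
    # replace phi by a quadratic majorizer with curvature phi'(x_m)/x_m.
    x = z  # initialization x_0 = z
    costs = []
    for _ in range(iters):
        costs.append(0.5 * (z - x) ** 2 + lam * phi(x, a))
        if x == 0.0:
            break  # x = 0 is a fixed point (infinite curvature of the majorizer)
        w = dphi(x, a) / x
        # minimize 0.5*(z - x)^2 + (lam*w/2)*x^2  ->  x = z / (1 + lam*w)
        x = z / (1.0 + lam * w)
    return x, costs
```

Because each quadratic lies above φ and touches it at x_m, every step satisfies F(x_{m+1}) ≤ G_m(x_{m+1}) ≤ G_m(x_m) = F(x_m), which is the monotone-descent property discussed above.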
3 Problem Formulation
In this paper, the importance of NCP functions for 1-D signal denoising is addressed, as NCP functions promote sparsity. The NCP function should be chosen so as to ensure the convexity of the overall cost function. A benefit of this approach is that we can arrive at the solution using convex optimization methods.
3.1 Notations
The signal x to be estimated is represented by an N-point vector

x = [x_1, x_2, x_3, ..., x_N]

The NCP function φ(x; c) used in our experimentation, parametrized by c, is the arctangent penalty [4]

φ(x) = (2/(c√3)) (tan⁻¹((1 + 2c|x|)/√3) − π/6)
3.2 Wavelet Transform Matrix

The wavelet transform matrix is represented by a square matrix W [2]. The Matlab syntax to generate the wavelet matrix is W = wavmat(N, F1, F2, L), where N denotes the length of the signal, F1 and F2 are the decomposition/reconstruction filters, which can be obtained for different wavelets by the Matlab function wfilters('wavelet_name'), and L denotes the transformation level (we used L = 4).
W = [W_1; W_2; ...; W_{L+1}]

where W_1, W_2, ..., W_{L+1} are the sub-band matrices of W, stacked row-wise, with dimensions corresponding to the level of transformation used. For example, let the signal length be N = 64 and the level of transformation be 4 (L = 4); then the structure of the wavelet transform matrix will be

W = [ [W_1]_{4×64} ; [W_2]_{4×64} ; [W_3]_{8×64} ; [W_4]_{16×64} ; [W_5]_{32×64} ]_{64×64}

where [W_1]_{4×64}, [W_2]_{4×64}, ... are the sub-band matrices corresponding to the different levels of transformation.
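The wavmat routine above is Matlab; as an illustrative stand-in (not that routine), the same stacked sub-band structure can be built in Python for the orthonormal Haar wavelet, here for N = 8 and L = 2, giving sub-band row counts 2 + 2 + 4 (the function names are ours):

```python
import numpy as np

def haar_analysis_pair(n):
    # One level of the orthonormal Haar transform on a length-n signal:
    # lowpass rows average adjacent pairs, highpass rows difference them.
    L = np.zeros((n // 2, n))
    H = np.zeros((n // 2, n))
    for i in range(n // 2):
        L[i, 2 * i] = L[i, 2 * i + 1] = 1.0 / np.sqrt(2.0)
        H[i, 2 * i] = 1.0 / np.sqrt(2.0)
        H[i, 2 * i + 1] = -1.0 / np.sqrt(2.0)
    return L, H

def haar_wavelet_matrix(n, levels):
    # Stack sub-band matrices [W1; W2; ...; W_{L+1}]: coarsest approximation
    # band first, then detail bands from coarse to fine.
    approx = np.eye(n)       # operator mapping the signal to the current approximation
    details = []
    for _ in range(levels):
        L, H = haar_analysis_pair(approx.shape[0])
        details.append(H @ approx)   # detail sub-band at this level
        approx = L @ approx          # go one level coarser
    return np.vstack([approx] + details[::-1])

W = haar_wavelet_matrix(8, 2)
# Sub-band row counts for N = 8, L = 2: approximation (2) + detail bands (2, 4).
```

Since each level is an orthogonal transform, the stacked matrix satisfies W Wᵀ = Wᵀ W = I, which is the property exploited later when inverting (I + WᵀΛW).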
3.3 Algorithm
Consider the cost function in (2). The M-M algorithm generates a sequence of simpler optimization problems as

x_{m+1} = argmin_x G_m(x)    (6)

i.e., in each iteration we solve a smaller convex optimization problem. The expression in (6) is used to update x_m in each iteration, with x_0 = z as the initialization. Each iteration of the M-M algorithm has a different majorizer, which must be an upper bound for the NCP function, i.e., it should satisfy (4) & (5). As mentioned in Section 2, we used a second-order polynomial as the majorizer to obtain a simpler solution to our optimization problem.
As mentioned in [11], the scalar-case majorizer for the NCP function can be given by

g(x; s) = (φ′(s)/(2s)) x² + φ(s) − (s/2) φ′(s)    (7)

It can be written as

g(x; s) = (φ′(s)/(2s)) x² + α    (8)

where α is

α = φ(s) − (s/2) φ′(s)    (9)

The term α in (7) is merely a constant, which can be ignored when solving the optimization problem.
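A quick numerical check of (7) (a sketch in our notation, using the arctangent penalty, whose derivative for x ≠ 0 works out to φ′(x) = sign(x)/(1 + a|x| + a²x²)): the quadratic g(·; s) lies above φ everywhere and touches it at x = ±s, exactly the majorizer conditions (4) and (5):

```python
import math

def phi(x, a):
    # Arctangent NCP function from Table 1
    s3 = math.sqrt(3.0)
    return (2.0 / (a * s3)) * (math.atan((1.0 + 2.0 * a * abs(x)) / s3) - math.pi / 6.0)

def dphi(x, a):
    # Derivative of the arctangent penalty for x != 0
    return math.copysign(1.0, x) / (1.0 + a * abs(x) + (a * abs(x)) ** 2)

def g(x, s, a):
    # Quadratic majorizer (7): g(x; s) = phi'(s)/(2s) x^2 + phi(s) - (s/2) phi'(s)
    return dphi(s, a) / (2.0 * s) * x ** 2 + phi(s, a) - (s / 2.0) * dphi(s, a)

a, s = 0.5, 1.5
xs = [i / 10.0 for i in range(-50, 51)]
assert all(g(x, s, a) >= phi(x, a) - 1e-12 for x in xs)   # g lies above phi
assert abs(g(s, s, a) - phi(s, a)) < 1e-12                # tangent at x = s
```

The majorizer property holds because φ is concave on x > 0 with φ′(s) > 0, so the quadratic bound on |x| combines with the concavity bound on φ.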
Equivalently, the corresponding vector case can be derived as

g(Wx; Ws) = (φ′([Ws]_1)/(2[Ws]_1)) [Wx]_1² + ... + (φ′([Ws]_n)/(2[Ws]_n)) [Wx]_n² + C    (10)

where W is the wavelet matrix, [Wx]_n is the nth component of the vector Wx, and [Ws]_n is the nth component of the vector Ws. The above equation can also be written as

g(Wx; Ws) = Σ_{i=1}^{n} (φ′([Ws]_i)/(2[Ws]_i)) [Wx]_i² + C = (1/2)(Wx)^T Λ (Wx) + C    (11)

where Λ is the diagonal matrix

Λ = diag( φ′([Ws]_1)/[Ws]_1 , ... , φ′([Ws]_n)/[Ws]_n )    (12)

Hence g(Wx; Ws) is a majorizer of φ(Wx). Therefore, using (11) & (12), we can directly give a majorizer for (2) by

G(x, s) = (1/2)||z − x||² + (1/2)(Wx)^T Λ (Wx) + C    (13)
To avoid any further confusion, Λ will absorb all the different λ_j. Finally, it will appear as

Λ = diag( λ_1 φ′([Ws]_1)/[Ws]_1 , ... , λ_{L+1} φ′([Ws]_n)/[Ws]_n )    (14)

With this Λ, the update (6) becomes

x_{m+1} = argmin_x (1/2)||z − x||² + (1/2)(Wx)^T Λ (Wx) + C    (17)

x_{m+1} = argmin_x (1/2)||z − x||² + (1/2) x^T W^T Λ W x + C    (18)

Setting the gradient of (18) to zero gives the update equation

x_{m+1} = (I + W^T Λ W)^{-1} z    (19)

The only problem with the update equation arises when a term [Ws]_n goes to zero: the corresponding entry of Λ becomes infinite, and therefore expression (19) cannot be evaluated directly. To avoid this unstable state, the Woodbury matrix identity (more commonly called the matrix inversion lemma) [12] can be used, which is given in the form

(A + XBY)^{-1} = A^{-1} − A^{-1} X (B^{-1} + Y A^{-1} X)^{-1} Y A^{-1}    (20)

Applying (20) to (19) with A = I, X = W^T, B = Λ and Y = W gives

x_{m+1} = z − W^T (Λ^{-1} + W W^T)^{-1} W z    (21)

where the diagonal entries of Λ^{-1}, namely [Ws]_i / (λ φ′([Ws]_i)) with λ the weight of the corresponding sub-band, tend to zero as [Ws]_i → 0, so the update remains well defined.
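The complete update (21) can be sketched in numpy. This is a simplified illustration, not the paper's implementation: the helper names are ours, a single shared λ replaces the per-band λ_j, and a level-1 Haar matrix stands in for wavmat. For the arctangent penalty, [Ws]_i/φ′([Ws]_i) = |[Ws]_i|(1 + a|[Ws]_i| + a²[Ws]_i²), so the entries of Λ⁻¹ can be computed without any division and are simply zero when [Ws]_i = 0:

```python
import numpy as np

def haar1(n):
    # Level-1 orthonormal Haar wavelet matrix (illustrative stand-in for wavmat)
    W = np.zeros((n, n))
    for i in range(n // 2):
        W[i, 2 * i] = W[i, 2 * i + 1] = 1.0 / np.sqrt(2.0)   # lowpass rows
        W[n // 2 + i, 2 * i] = 1.0 / np.sqrt(2.0)            # highpass rows
        W[n // 2 + i, 2 * i + 1] = -1.0 / np.sqrt(2.0)
    return W

def mm_denoise(z, W, lam, a, iters=20):
    # M-M iterations for 0.5*||z - x||^2 + lam * sum_i phi([Wx]_i; a), using the
    # Woodbury form (21): x_{m+1} = z - W^T (Lambda^{-1} + W W^T)^{-1} W z, with
    # Lambda^{-1}_{ii} = |[Ws]_i| (1 + a|[Ws]_i| + a^2 [Ws]_i^2) / lam,
    # which stays finite (zero) even when [Ws]_i = 0.
    WWt = W @ W.T
    x = z.copy()                  # initialization x_0 = z
    for _ in range(iters):
        u = np.abs(W @ x)         # |[Ws]_i| at the current iterate
        inv_lam = u * (1.0 + a * u + (a * u) ** 2) / lam
        x = z - W.T @ np.linalg.solve(np.diag(inv_lam) + WWt, W @ z)
    return x

# Usage: denoise a small piecewise-constant signal (values are illustrative).
rng = np.random.default_rng(0)
clean = np.concatenate([np.zeros(4), 5.0 * np.ones(4)])
z = clean + 0.5 * rng.standard_normal(8)
x_hat = mm_denoise(z, haar1(8), lam=1.0, a=0.5)
```

Note that a constant input, whose detail coefficients [Ws]_i are exactly zero, is handled without any blow-up, which is precisely the instability (21) was introduced to avoid.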
4 Example
During experimentation, a 1-D signal denoising problem was considered. The synthetic noisy signal used for experimentation was generated using the MakeSignal() function from the Wavelab tool, with Additive White Gaussian Noise of σ = 4. The Wavelab tool is available at: http://statweb.stanford.edu/wavelab/
The value of the parameter c which maintains the convexity is calculated using (3). The maximally sparse solution was noted with c = 1/λ. To compute the values associated with λ_j, the expression given by [1] was further modified as

λ_j = 2^{j/2} √( N / 2^{(log(N/2) − j)} ),  1 ≤ j ≤ L    (22)

where N denotes the length of the signal and L denotes the transformation level. Note that we used λ_0 = 1 for the first sub-band matrix W_1. Further, a different λ_j was used for each band of the wavelet transform matrix. As mentioned in Section 3, the arctangent NCP function is used. By trial and error, a 4-level transformation in the wavelet matrix (i.e., L = 4) gave the best results. Further, the reconstruction and decomposition high-pass filters of the reverse biorthogonal wavelet (rbio2.2) were employed, which gave the lowest error among the wavelets tested:
Wavelet                        | Error
Biorthogonal (bior1.3)         | 1.6881
Coiflets (coif1)               | 1.5902
Daubechies (db2)               | 1.4844
Biorthogonal (bior2.2)         | 1.4811
Reverse Biorthogonal (rbio2.2) | 1.4565
5 Conclusion
The attempt in this paper is to provide a pedagogical approach to the ingenious methodology proposed by Ankit Parekh et al. [1]. An approach for 1-D signal denoising using the decimated wavelet transform is proposed. The sub-band nature of the wavelet transform matrix W is exploited to obtain an easier and better understanding of, and solution to, the given denoising problem. The problem formulation comprised a smooth and a non-smooth term, with a parameter c which controls the overall convexity. The solution to this formulation is obtained using the M-M iterative algorithm. The proposed approach offered better experimental results than non-convex regularization [1] and ℓ1 regularization. The same functional procedure could be extended for denoising a noisy image.
References
1. Ankit Parekh and Ivan W. Selesnick. Convex denoising using non-convex tight frame regularization. IEEE Signal Processing Letters, 22(10):1786–1790, 2015.
2. Jie Yan. Wavelet matrix. Dept. of Electrical and Computer Engineering, University of Victoria, Victoria, BC, Canada, 2009.
3. Ivan Selesnick. Penalty and shrinkage functions for sparse signal processing. Connexions, 2012.
4. Ivan W. Selesnick and Ilker Bayram. Sparse signal estimation by maximally sparse convex optimization. IEEE Transactions on Signal Processing, 62(5):1078–1092, 2014.
5. Kenneth Lange. Numerical Analysis for Statisticians. Springer Science & Business Media, 2010.
6. Jan van Tiel. Convex Analysis. John Wiley, 1984.
7. David R. Hunter and Kenneth Lange. A tutorial on MM algorithms. The American Statistician, 58(1):30–37, 2004.
8. Geoffrey McLachlan and Thriyambakam Krishnan. The EM Algorithm and Extensions, volume 382. John Wiley & Sons, 2007.
9. Ingrid Daubechies, Michel Defrise, and Christine De Mol. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Communications on Pure and Applied Mathematics, 57(11):1413–1457, 2004.
10. Ivan Selesnick. Total variation denoising (an MM algorithm). NYU Polytechnic School of Engineering Lecture Notes, 2012.
11. Ivan W. Selesnick, Ankit Parekh, and Ilker Bayram. Convex 1-D total variation denoising with non-convex regularization. IEEE Signal Processing Letters, 22(2):141–144, 2015.
12. Mario A. T. Figueiredo, J. Bioucas Dias, Joao P. Oliveira, and Robert D. Nowak. On total variation denoising: A new majorization-minimization algorithm and an experimental comparison with wavelet denoising. In IEEE International Conference on Image Processing, pages 2633–2636, 2006.