Microphone Array
Lalan Kumar
WISSAP 2015
Presentation Outline
I Why Source Localization?
I My Research Journey : Uniform Linear Array (ULA) to Spherical Microphone
Array (SMA)
. Spherical Coordinate System
. Uniform Linear Array and Uniform Circular Array (UCA)
. Data Model in Spatial Domain
. MUltiple SIgnal Classification (MUSIC) and MUSIC-Group delay (MGD) Spectrum
I Near-field Source Localization in Spherical Harmonics (SH) Domain
. Data Model in SH Domain
. SH-MUSIC, SH-MGD, SH-MVDR
. Cramér-Rao Bound Analysis
. Experiments on Source Localization
I Conclusion
Speech Source Localization over Spherical Microphone Array
Pranjal Agrawal, Aseem Kushwah, Lalan Kumar, and Rajesh M. Hegde, "On the Rapid Prototyping of a Portable
Multi Media Acquisition System for Intelligent Meeting Capture." Journal of Signal Processing Systems 75, no. 3 (2014):
233-243.
$$ v(\Psi_l) = \left[ e^{-j k_l^T r_1},\; e^{-j k_l^T r_2},\; \ldots,\; e^{-j k_l^T r_I} \right]^T \tag{3} $$
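As a minimal sketch of the steering vector in (3), the snippet below evaluates entries of the form exp(-j k_l^T r_i) for a small linear array. The function name `steering_vector`, the sign convention chosen for the wave vector, the 5 cm spacing, the 1 kHz frequency and the speed of sound are illustrative assumptions and are not taken from the slides.

```python
import cmath
import math

def steering_vector(mic_positions, azimuth_deg, elevation_deg, freq_hz, c=343.0):
    """Far-field steering vector [exp(-j k^T r_i)] for microphones at r_i (metres).

    Assumed convention: k = -(2*pi*f/c) * (unit vector pointing towards the source),
    with elevation measured from the z-axis.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    k_mag = 2.0 * math.pi * freq_hz / c
    k_vec = (-k_mag * math.sin(el) * math.cos(az),
             -k_mag * math.sin(el) * math.sin(az),
             -k_mag * math.cos(el))
    return [cmath.exp(-1j * sum(kc * rc for kc, rc in zip(k_vec, r)))
            for r in mic_positions]

# Hypothetical 4-element ULA along the x-axis with 5 cm spacing.
ula = [(i * 0.05, 0.0, 0.0) for i in range(4)]
v = steering_vector(ula, azimuth_deg=50.0, elevation_deg=90.0, freq_hz=1000.0)
```

Every entry has unit magnitude; only the phase carries the direction information, which is why subspace methods such as MUSIC can scan this vector over candidate directions.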
$$ P_{MUSIC}(\Psi) = \frac{1}{v^H(\Psi)\, R_p^{ns} [R_p^{ns}]^H\, v(\Psi)} \tag{4} $$
where $R_p^{ns}$ is the noise subspace obtained from the eigenvalue decomposition of the autocorrelation matrix $R_p = E[p(k)p(k)^H]$.
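I The MUSIC-Group delay spectrum is given by
$$ P_{MGD}(\Psi) = \left( \sum_{u=1}^{U} \left| \nabla \arg\left( v^H(\Psi)\, q_u \right) \right|^2 \right) \cdot P_{MUSIC}(\Psi) \tag{5} $$
where $q_u$ is the uth noise-subspace eigenvector and $\nabla$ denotes the gradient with respect to $\Psi$.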
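The MUSIC and MUSIC-Group delay spectra can be sketched numerically as follows. This is an illustrative example under stated assumptions, not the authors' implementation: it uses a hypothetical 8-element half-wavelength ULA, two simulated narrowband sources at 50 and 60 degrees, a sample autocorrelation matrix, and the names `ula_steering` and `music_and_mgd`, none of which come from the slides. The group delay is approximated by a finite-difference gradient of the unwrapped phase along the scan grid.

```python
import numpy as np

def ula_steering(angles_deg, n_mics, spacing_wavelengths=0.5):
    """Columns are far-field ULA steering vectors, one per angle (assumed geometry)."""
    phi = np.deg2rad(np.atleast_1d(angles_deg))
    idx = np.arange(n_mics)[:, None]
    return np.exp(1j * 2 * np.pi * spacing_wavelengths * idx * np.cos(phi)[None, :])

def music_and_mgd(snapshots, n_sources, grid_deg):
    """Return (P_MUSIC, P_MGD) on grid_deg from an I x K snapshot matrix."""
    n_mics, n_snap = snapshots.shape
    r_p = snapshots @ snapshots.conj().T / n_snap       # sample autocorrelation matrix
    _, eigvecs = np.linalg.eigh(r_p)                     # eigenvalues in ascending order
    q_noise = eigvecs[:, : n_mics - n_sources]           # noise-subspace eigenvectors q_u
    v = ula_steering(grid_deg, n_mics)
    proj = v.conj().T @ q_noise                          # entry (g, u) = v^H(grid_g) q_u
    p_music = 1.0 / np.sum(np.abs(proj) ** 2, axis=1)
    phase = np.unwrap(np.angle(proj), axis=0)            # unwrapped phase along the grid
    group_delay = np.gradient(phase, np.deg2rad(grid_deg), axis=0)
    p_mgd = np.sum(np.abs(group_delay) ** 2, axis=1) * p_music
    return p_music, p_mgd

# Simulated data: two uncorrelated sources plus weak sensor noise (all values assumed).
rng = np.random.default_rng(0)
n_mics, n_snap = 8, 500
v_true = ula_steering([50.0, 60.0], n_mics)
s = (rng.standard_normal((2, n_snap)) + 1j * rng.standard_normal((2, n_snap))) / np.sqrt(2)
noise = 0.03 * (rng.standard_normal((n_mics, n_snap))
                + 1j * rng.standard_normal((n_mics, n_snap)))
p = v_true @ s + noise

grid = np.arange(0.0, 180.5, 0.5)
p_music, p_mgd = music_and_mgd(p, n_sources=2, grid_deg=grid)
```

The denominator of the MUSIC spectrum is computed as the sum over u of |v^H q_u|^2, which equals v^H Q Q^H v when the columns of Q are the noise eigenvectors.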
Figure: (a) Spectral magnitude of MUSIC for UCA (top) and ULA (bottom). (b) Spectral phase of MUSIC for UCA (top) and ULA (bottom). Sources at (15°, 50°) and (20°, 60°) for UCA. Sources at 50° and 60° for ULA. Axes: azimuth (°) and elevation (°).
Figure: (a) Standard group delay spectrum of MUSIC for UCA (top) and ULA (bottom). (b) MUSIC-Group delay spectrum for UCA (top) and ULA (bottom). Axes: azimuth (°) and elevation (°).
Kumar, L.; Tripathy, A.; Hegde, R.M., "Robust Multi-Source Localization Over Planar Arrays Using MUSIC-Group Delay Spectrum," Signal Processing, IEEE Transactions on, vol. 62, no. 17, pp. 4627-4636, Sept. 1, 2014. doi: 10.1109/TSP.2014.2337271
[Block diagram labels: Estimate DOA, Compute TDOA, FSB, Train, DSR]

         Methods    CTM    S1 (40,19)                     S2 (30,15)
                           T60 (150 ms)   T60 (250 ms)    T60 (150 ms)   T60 (250 ms)
         MGD               12.98          23.96           11.99          23.58
MONC     MUSIC      9.2    14.21          26.01           13.78          25.56
         BS-MUSIC          15.02          27.99           15.22          27.32
Figure: (a) Spherical microphone array: Eigenmike system. (b) Near-field and far-field regions around the spherical microphone array. The ith microphone is positioned at r_i and the lth source at r_l.
Kumar, L.; Singhal, K.; Hegde, R.M., "Robust source localization and tracking using MUSIC-Group delay spectrum over spherical arrays," CAMSAP 2013, pp. 304-307, 15-18 Dec. 2013
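I The contribution of the lth source at the ith microphone is
$$ \frac{s_l(t - \tau_i(l))}{|r_i - r_l|}, \quad \text{with } \tau_i(l) = \frac{|r_i - r_l|}{c}, $$
where c is the speed of sound, so that the pressure at the ith microphone is
$$ p_i(t) = \sum_{l=1}^{L} \frac{s_l(t - \tau_i(l))}{|r_i - r_l|} + n_i(t). \tag{6} $$
I In the frequency domain, at frequency $f_q$,
$$ p_i(f_q) = \sum_{l=1}^{L} \frac{e^{-j 2\pi f_q \tau_i(l)}}{|r_i - r_l|}\, s_l(f_q) + n_i(f_q). \tag{7} $$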
Kumar, L.; Singhal, K.; Hegde, R.M., "Near-field source localization using spherical microphone array," HSCMA 2014, pp. 82-86, 12-14 May 2014
$$ p_i(k) = \sum_{l=1}^{L} \frac{e^{-jk|r_i - r_l|}}{|r_i - r_l|}\, s_l(k) + n_i(k). \tag{8} $$
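As a small illustration of the near-field model in (8), the sketch below computes the spherical-wave transfer terms exp(-j k d_i) / d_i from one source to each microphone, with d_i = |r_i - r_l|. The function name `nearfield_steering_vector`, the six microphone positions on a 4.2 cm sphere, the source position and the 1 kHz frequency are hypothetical values chosen for the example.

```python
import cmath
import math

def nearfield_steering_vector(mic_positions, source_position, freq_hz, c=343.0):
    """Entries exp(-j k d_i) / d_i with d_i = |r_i - r_l| (spherical-wave model)."""
    k = 2.0 * math.pi * freq_hz / c          # wavenumber
    entries = []
    for r_i in mic_positions:
        d = math.dist(r_i, source_position)  # Euclidean distance |r_i - r_l|
        entries.append(cmath.exp(-1j * k * d) / d)
    return entries

# Hypothetical microphones on the coordinate axes of a sphere of radius 4.2 cm.
r_a = 0.042
mics = [(r_a, 0, 0), (-r_a, 0, 0), (0, r_a, 0), (0, -r_a, 0), (0, 0, r_a), (0, 0, -r_a)]
source = (0.2, 0.1, 0.0)
v_near = nearfield_steering_vector(mics, source, freq_hz=1000.0)
```

Unlike the far-field case, the entries differ in magnitude as well as phase, and this 1/d attenuation is what makes the source range observable.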
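I In matrix form, the final near-field data model in the spatial domain can be written as
$$ p(k) = V(\Psi)\, s(k) + n(k) \tag{9} $$
I The steering matrix $V(\Psi)$ is
$$ V(\Psi) = [v_1, v_2, \ldots, v_L] \tag{10} $$
$$ v_l = \left[ \frac{e^{-jk|r_1 - r_l|}}{|r_1 - r_l|},\; \frac{e^{-jk|r_2 - r_l|}}{|r_2 - r_l|},\; \ldots,\; \frac{e^{-jk|r_I - r_l|}}{|r_I - r_l|} \right]^T \tag{11} $$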
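I Each entry of the steering matrix, $\frac{e^{-jk|r_i - r_l|}}{|r_i - r_l|}$, can be expanded in spherical harmonics as
$$ \frac{e^{-jk|r_i - r_l|}}{|r_i - r_l|} = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} b_n(k, r_a, r_l)\, [Y_n^m(\Psi_l)]^*\, Y_n^m(\Phi_i) \tag{12} $$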
Figure: Far-field and near-field mode strength, magnitude (dB), for the Eigenmike system. The near-field source is at r_l = 1 m and the order is varied from n = 0 (top) to n = 4 (bottom).
I The near-field criterion for a spherical array is based on the similarity of the near-field mode strength $|b_n(k, r_a, r_l)|$ and the far-field mode strength $|b_n(k, r_a)|$.
I The two functions start behaving similarly at $k r_l \approx N$ for an array of order N, as shown in the figure.
I Hence, the near-field condition for a spherical array turns out to be $r_a \le r_l \le r_{NF}$, with $r_{NF} \approx N/k$ [2].
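The near-field criterion above can be worked through numerically. The sketch below assumes the relation r_NF ≈ N/k with k = 2πf/c; the helper names, the 4.2 cm array radius, the order N = 4, the 500 Hz frequency and the speed of sound are example values and are not stated on the slide.

```python
import math

def nearfield_radius(order, freq_hz, c=343.0):
    """Approximate near-field extent r_NF = N / k, with k = 2*pi*f/c."""
    k = 2.0 * math.pi * freq_hz / c
    return order / k

def is_nearfield(r_source, r_array, order, freq_hz, c=343.0):
    """True when r_a <= r_l <= N / k."""
    return r_array <= r_source <= nearfield_radius(order, freq_hz, c)

# Example values (assumed): order-4 array of radius 4.2 cm at 500 Hz.
r_nf = nearfield_radius(order=4, freq_hz=500.0)
close_source = is_nearfield(0.30, 0.042, order=4, freq_hz=500.0)
distant_source = is_nearfield(1.00, 0.042, order=4, freq_hz=500.0)
```

With these numbers r_NF is roughly 0.44 m, so a source at 0.30 m satisfies the criterion while a source at 1 m does not. Because r_NF scales as 1/f, the near-field region shrinks as frequency increases.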
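I The spherical harmonic of order n and degree m is
$$ Y_n^m(\theta, \phi) = \sqrt{\frac{(2n+1)\,(n-m)!}{4\pi\,(n+m)!}}\; P_n^m(\cos\theta)\, e^{jm\phi}, \quad 0 \le n \le N,\; -n \le m \le n. $$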
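A minimal, standard-library sketch of the spherical harmonic definition is given below. It assumes that θ is the elevation measured from the z-axis, that φ is the azimuth, and that the associated Legendre function includes the Condon-Shortley phase; the slides do not state these conventions, and the function names are illustrative.

```python
import cmath
import math

def assoc_legendre(n, m, x):
    """Associated Legendre function P_n^m(x) for 0 <= m <= n (Condon-Shortley phase)."""
    pmm = 1.0
    if m > 0:
        somx2 = math.sqrt((1.0 - x) * (1.0 + x))
        fact = 1.0
        for _ in range(m):                   # builds P_m^m = (-1)^m (2m-1)!! (1-x^2)^(m/2)
            pmm *= -fact * somx2
            fact += 2.0
    if n == m:
        return pmm
    pmmp1 = x * (2 * m + 1) * pmm            # P_{m+1}^m
    if n == m + 1:
        return pmmp1
    for ell in range(m + 2, n + 1):          # upward recurrence in the order
        pll = ((2 * ell - 1) * x * pmmp1 - (ell + m - 1) * pmm) / (ell - m)
        pmm, pmmp1 = pmmp1, pll
    return pmmp1

def spherical_harmonic(n, m, theta, phi):
    """Y_n^m(theta, phi) with theta = elevation from z-axis, phi = azimuth (radians)."""
    if abs(m) > n:
        raise ValueError("need |m| <= n")
    ma = abs(m)
    norm = math.sqrt((2 * n + 1) / (4.0 * math.pi)
                     * math.factorial(n - ma) / math.factorial(n + ma))
    y = norm * assoc_legendre(n, ma, math.cos(theta)) * cmath.exp(1j * ma * phi)
    if m < 0:                                # Y_n^{-m} = (-1)^m conj(Y_n^m)
        y = (-1) ** ma * y.conjugate()
    return y

y00 = spherical_harmonic(0, 0, 0.3, 1.2)
y11 = spherical_harmonic(1, 1, math.pi / 2, 0.0)
```

Negative degrees are obtained from the symmetry relation rather than from the recurrence, which keeps the Legendre routine restricted to m ≥ 0.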
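$$ p_{nm}(k, r) = \int_0^{2\pi} \int_0^{\pi} p(k, r, \Phi)\, [Y_n^m(\Phi)]^*\, \sin\theta \, d\theta \, d\phi \tag{16} $$
$$ p_{nm}(k, r) \cong \sum_{i=1}^{I} a_i\, p_i(k, r, \Phi_i)\, [Y_n^m(\Phi_i)]^* \tag{17} $$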
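The discrete spherical harmonic transform, a weighted sum over microphone positions, can be sketched as follows. This example is an assumption-laden illustration: it uses a Gauss-Legendre grid in cos θ with uniform azimuth samples as the sampling scheme (the slides do not specify one), closed-form harmonics for orders n ≤ 1 only, and hypothetical function names.

```python
import numpy as np

def low_order_harmonics(theta, phi):
    """Closed-form Y_n^m for n <= 1, keyed by (n, m); theta = elevation, phi = azimuth."""
    c0 = np.sqrt(1.0 / (4.0 * np.pi))
    c1 = np.sqrt(3.0 / (4.0 * np.pi))
    c11 = np.sqrt(3.0 / (8.0 * np.pi))
    return {
        (0, 0): c0 * np.ones_like(theta, dtype=complex),
        (1, -1): c11 * np.sin(theta) * np.exp(-1j * phi),
        (1, 0): c1 * np.cos(theta) * np.ones_like(phi, dtype=complex),
        (1, 1): -c11 * np.sin(theta) * np.exp(1j * phi),
    }

def gauss_grid(n_theta=4, n_phi=8):
    """Sampling points and weights a_i (assumed Gauss-Legendre x uniform-azimuth grid)."""
    x, w = np.polynomial.legendre.leggauss(n_theta)   # nodes in cos(theta) and weights
    phi = 2.0 * np.pi * np.arange(n_phi) / n_phi
    theta_g, phi_g = np.meshgrid(np.arccos(x), phi, indexing="ij")
    a = np.repeat(w, n_phi) * (2.0 * np.pi / n_phi)   # weights absorb sin(theta) dtheta dphi
    return theta_g.ravel(), phi_g.ravel(), a

def sh_transform(p_samples, theta, phi, a):
    """p_nm = sum_i a_i p_i conj(Y_n^m(Phi_i)) for every (n, m) with n <= 1."""
    harmonics = low_order_harmonics(theta, phi)
    return {nm: np.sum(a * p_samples * np.conj(y_nm)) for nm, y_nm in harmonics.items()}

theta, phi, a = gauss_grid()
basis = low_order_harmonics(theta, phi)
pressure = 2.0 * basis[(1, 0)] + (1.0 - 1.0j) * basis[(1, 1)]   # synthetic sound field
coeffs = sh_transform(pressure, theta, phi, a)
```

Because the synthetic field is built from two harmonics with known amplitudes, the transform returns those amplitudes for (1, 0) and (1, 1) and values near zero elsewhere, which is a convenient sanity check for any sampling scheme.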
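$$ P_{SH\text{-}MGD} = \left( \sum_{u=1}^{U} \left| \nabla \arg\left( v_{nm}^H\, q_u \right) \right|^2 \right) \cdot P_{SH\text{-}MUSIC} \tag{26} $$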
[Figure, panels (a)-(f): SH-MUSIC, SH-MGD and SH-MVDR spectra plotted over range (m) and elevation (°), and over elevation (°) and azimuth (°). Data-cursor labels mark peaks at elevation 60° with range 0.06 m and elevation 55° with range 0.08 m, and at (elevation, azimuth) = (60°, 30°) and (55°, 40°).]
[Figure: 3-D plot over azimuth (°), elevation (°) and range (m), with a data cursor at elevation 60°, azimuth 30°, range 0.06 m.]
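$$ F = 2\,\mathrm{Re}\left\{ \left( R_s V_{nm}^H R_p^{-1} V_{nm} R_s \right)^T \odot \left( \dot{V}_{nm}^H R_p^{-1} \dot{V}_{nm} \right) + \left( R_s V_{nm}^H R_p^{-1} \dot{V}_{nm} \right)^T \odot \left( R_s V_{nm}^H R_p^{-1} \dot{V}_{nm} \right) \right\} \tag{28} $$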
Kumar, L.; Hegde, R.M., "Stochastic Cramér-Rao Bound Analysis for DOA Estimation in Spherical Harmonics Domain," Signal Processing Letters, IEEE, vol. 22, no. 8, pp. 1030-1034, Aug. 2015. doi: 10.1109/LSP.2014.2381361
[Figure: CRB(r) and the CRBs of the two direction angles versus SNR (dB).]
WISSAP 2015
SNR (dB) S
S1
-10
S2
S1
-5
S2
S1
0
S2
SH-MGD
(0.001,0.4)
(0,0.4243)
(4.47e-04,0)
(4.9e-04,0)
(0,0)
(0,0)
SH-MUSIC
(0.001,0.2449)
(0.001,0.2)
(2.0e-04,0)
(0,0.1414)
(0,0)
(0,0)
25 / 28
SH-MVDR
(0.013,2.97)
(0.007,2.05)
(0.0028,1.0)
(0.0018,0.7071)
(0.001,0)
(4.0e-04,0)
WISSAP 2015
Conclusion
I MUSIC-Group delay based source localization has been presented for ULA, UCA and SMA.
I Near-field source localization, with simultaneous estimation of range and bearing, has been utilized for the first time.
I Source localization experiments are reported in terms of RMSE.
I Near-field array processing using sparse recovery techniques in the SH domain will be dealt with in future work.
References
[1] E. G. Williams, Fourier Acoustics: Sound Radiation and Nearfield Acoustical Holography. Access Online via Elsevier, 1999.
[2] E. Fisher and B. Rafaely, "Near-field spherical microphone array processing with radial filtering," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 19, no. 2, pp. 256-265, 2011.
[3] J. R. Driscoll and D. M. Healy, "Computing Fourier transforms and convolutions on the 2-sphere," Advances in Applied Mathematics, vol. 15, no. 2, pp. 202-250, 1994.
[4] B. Rafaely, "Analysis and design of spherical microphone arrays," Speech and Audio Processing, IEEE Transactions on, vol. 13, no. 1, pp. 135-143, 2005.
Thank You
Lalan Kumar
lalank@iitk.ac.in
http://home.iitk.ac.in/~lalank/