Problem Statement: Most real-world audio signals are mixtures of several audio sources, e.g. the cocktail party problem. In blind audio source separation (BASS) we have only the mixed audio signal, without any a priori information about how the sources were mixed. From this mixed signal we aim to extract each source signal separately.
Applications of BASS:
1. Separating a specific musical signal or a specific instrument from an orchestra or band.
2. Enhancing a speech signal within a mixture for hearing aids.
BASS : Mathematical Representation
S1 = source signal 1
S2 = source signal 2
S = [S1 ; S2]
A = mixing matrix
X = mixture signal, X = A*S
W = estimated unmixing matrix (it acts as an approximate inverse of A)
Y = output after unmixing (should approximate the original source signal S), Y = W*X
The above BASS algorithm is given in the research paper by Smaragdis, Paris. "Blind separation of convolved mixtures in the frequency domain." Neurocomputing 22.1 (1998): 21-34. Here the sources are assumed to be mixed instantaneously. We then further referred to the paper by Amari, Shun-ichi, Andrzej Cichocki, and Howard Hua Yang. "A new learning algorithm for blind signal separation." Advances
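The model above can be checked on a toy case. The following is a minimal sketch, not the authors' code: the two sources, the sample count, and the fixed mixing matrix A are hypothetical values chosen for illustration, and W is computed directly from A, which is only possible here because A is known.

```matlab
% Two hypothetical sources sampled at n+1 points (values chosen for illustration)
n  = 999;
t  = (0:n) / 1000;
S1 = sin(2*pi*5*t);            % source 1: sine wave
S2 = sign(sin(2*pi*13*t));     % source 2: square wave
S  = [S1; S2];                 % 2 x (n+1) source matrix

A = [0.8 0.3; -0.4 0.9];       % assumed mixing matrix (unknown in a real BASS problem)
X = A * S;                     % observed mixtures, X = A*S

W = inv(A);                    % ideal unmixing matrix, available only because A is known
Y = W * X;                     % recovered sources, Y = W*X

err = max(abs(Y(:) - S(:)));   % largest deviation between recovered and original sources
```

In the blind setting W has to be estimated from X alone, and the sources can in general be recovered only up to a reordering and rescaling, so W*A is expected to be close to a scaled permutation matrix rather than the identity.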
BASS : Algorithm
Step-1: As in the simulation example given in the paper, we use synthetic source signals and a random mixing matrix.
Step-2: Original Mixing Matrix: A is the mixing matrix. Its elements are drawn at random from the interval [-1,1].
Step-3: Mixture: X = A*S is the mixed signal. It is the only known quantity in our equation.
Step-4: Estimated Unmixing Matrix: W is an estimated matrix that acts as an inverse of A. We have no a priori information about A, so we aim to find values of W that closely estimate the inverse of A. The elements of W are initially drawn at random from [-1,1].
Step-5: Approximate Source Matrix: Y is the computed approximation of the actual source matrix, Y = W*X.
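Steps 1 to 5 can be sketched as follows. This is an illustration under stated assumptions: the two synthetic sources, the sample count n, and the name G for the product W*A are hypothetical and do not come from the original slides. It only sets up the matrices; no learning is performed yet.

```matlab
n = 1000;                                  % assumed number of samples
k = 0:n-1;
S = [sin(0.05*k); sign(sin(0.11*k))];      % Step-1: synthetic sources (hypothetical choice)

a = -1;  b = 1;                            % interval [a,b] for the random matrices
A = (b - a) * rand(2, 2) + a;              % Step-2: random mixing matrix in [-1,1]
X = A * S;                                 % Step-3: mixture, the only observed quantity
W = (b - a) * rand(2, 2) + a;              % Step-4: random initial unmixing matrix
Y = W * X;                                 % Step-5: initial source estimate

G = W * A;                                 % combined mixing/unmixing matrix, Y = G*S
```

Before any training G is just a random matrix, so each row of Y is still a blend of both sources. The learning rule then adjusts W so that each row of G is dominated by a single entry.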
Dirty Implementation - Code
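n = 1000;                                   % last sample index (assumed; not given in the original)
a = -1; b = 1;                              % interval [a,b] for the random matrices
I = eye(3);                                 % identity matrix used in the update rule
signal1 = 5*sin((0:n)*40);                              % source signal 1
signal2 = 0.1*sin((0:n)*400).*cos(30*(0:n));            % source signal 2
signal3 = 0.01*sign(sin((0:n)*500) + 9*cos(40*(0:n)));  % source signal 3
s = [signal1; signal2; signal3];            % source matrix
A = (b-a)*rand(3,3) + a;                    % random mixing matrix in [a,b]
x = A*s;                                    % mixture signal
W = (b-a)*rand(3,3) + a;                    % random initial unmixing matrix in [a,b]
y = zeros(3, n+1);                          % preallocate the output
for t = 0:n
    for i = 1:10
        y(:,t+1) = W*x(:,t+1);              % current estimate for sample t
        % calculating delta W (step size kept from the original; it decays to ~0 after a few samples)
        dW = (250*exp(-5*t)).*(I - (tanh(y(:,t+1))*y(:,t+1)'))*W;
        W = W + dW;                         % reassigning W
    end
end
y = W*x;                                    % output signal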
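For comparison, the same update rule, dW = eta*(I - tanh(y)*y')*W, can be written in a batch form that is easier to keep numerically stable. The sketch below is not the authors' implementation. It assumes a small constant step size eta, averages the update over all samples, starts from W = I, and uses Laplacian test sources because the tanh nonlinearity is usually paired with super-Gaussian signals. All of these choices, and the fixed matrix A, are illustrative assumptions.

```matlab
N = 5000;                                % assumed number of samples
U = rand(2, N) - 0.5;                    % uniform samples in (-0.5, 0.5)
S = -sign(U) .* log(1 - 2*abs(U));       % Laplacian (super-Gaussian) sources via inverse CDF

A = [0.9 0.4; -0.3 0.7];                 % assumed mixing matrix (hidden from the algorithm)
X = A * S;                               % observed mixtures

W   = eye(2);                            % initial unmixing matrix (assumption: identity)
I   = eye(2);
eta = 0.05;                              % assumed constant step size
for it = 1:500
    Y  = W * X;                          % current source estimates
    dW = eta * (I - (tanh(Y) * Y') / N) * W;   % batch-averaged natural-gradient update
    W  = W + dW;
end

Y = W * X;                               % separated outputs
G = W * A;                               % close to a scaled permutation if separation worked
```

Inspecting G after the loop shows how well the sources were separated: the closer each row is to having a single dominant entry, the less the outputs are contaminated by the other source.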
Dirty Implementation Output Plots
Improved Implementation Algorithm
1. Two sources were taken: one human voice and one instrument.
2. The mixing matrix (A) was randomly constructed.
3. We computed the unmixing matrix (W) by training it on 10 mixtures. Training on 10 different mixtures made the system more effective at recovering varied mixtures.
4. At the output we are able to recover both sources separately in most cases. In certain cases we recover the original sources only partially.
Mixture Signal-1: Mix_1
Mixture Signal-2: Mix_2
Recovered Signal-1: Unmixed_1
Recovered Signal-2: Unmixed_2