
Er. Abhishek Thakur* et al. / (IJAEST) INTERNATIONAL JOURNAL OF ADVANCED ENGINEERING SCIENCES AND TECHNOLOGIES Vol No. 8, Issue No. 1, 100 - 106

Design of Matlab-Based Automatic Speaker Recognition and Control System


Er. Abhishek Thakur
ECE Department, Student, RIEIT Railmajra, Punjab, India, abhithakur25@gmail.com

Assistant Prof. Neeru Singla
ECE Department, Faculty, RIEIT Railmajra, Punjab, India, neerusingla99@gmail.com

I. INTRODUCTION

Development of speaker identification systems began as early as the 1960s with exploration into voiceprint analysis, in which the characteristics of an individual's voice were thought to identify that individual as uniquely as a fingerprint. The early systems had many flaws, and research followed to derive a more reliable method of predicting the correlation between two sets of speech utterances. Speaker identification research continues today within the field of digital signal processing, where many advances have taken place in recent years [1].

The performance of a speech recognition system is given in terms of a word error rate (%), measured for a specified technology, a given task with a specified task syntax, a specified mode, and a specified word vocabulary. Robust speech recognition can be applied to high-accuracy connected-digit recognition, which finds application in the recognition of personal identification numbers, credit card numbers, and telephone numbers. Continuous speech recognition systems find application in voice repertory dialers, where eyes-free, hands-free dialing of numbers is possible [2]. Vocal communication between people and computers includes the synthesis of speech from text, automatic speech recognition, and speaker recognition. Speaker recognition covers speaker identification, which outputs the identity of the person most likely to have spoken from among a given population, and speaker verification, which checks from a given speech input whether a person is who he or she claims to be [3].

ISSN: 2230-7818

@ 2011 http://www.ijaest.iserp.org. All rights Reserved.

Keywords- Automatic Speech Recognition (ASR); Matlab; Microcontroller (89C52)

A. Scope

The main objective of this paper is to design and implement an English key word speech recognition and control system using Matlab that is capable of recognizing and responding to key word speech inputs. This English key word speech recognizer would be applicable and useful for various key word-based applications, such as automation of office or business, monitoring of manufacturing processes, automation of telephone or telecommunication services, editing of specialized medical reports, and development of aids for the handicapped [4]-[6]. In this research, we used a rule-based method to recognize English language key words.

B. System Architecture and Algorithms

Figure-1 Block diagram of control system

Abstract- This project presents the design of a control system and speaker recognition code using Matlab. Matlab's straightforward programming interface makes it an ideal tool for speech analysis projects. For the current project, experience was gained in general Matlab programming and in the design of the control system. A basic speaker recognition algorithm has been written to sort through a rule base in Matlab and choose the most likely match based on the predefined time frame of the speech utterance.

In the current design project, a basic speaker recognition algorithm has been written to sort through a rule base in Matlab and choose the most likely match based on the predefined time frame of the speech utterance as well as the location of the formants in the frequency and time domain representation.

II. ASR AND CONTROL SYSTEM


When the user speaks a key word, the microphone converts the sound into an electrical signal, and after attenuation by the attenuator the signal is passed to the voice processor. In the voice processor, the voice file is processed in Matlab and the ASCII code of the recognized word is transferred to the microcontroller through the RS232 standard communication protocol. This key word code triggers the particular task assigned to it in the microcontroller program. We use an LM35 temperature sensor to sense temperatures below 30 degrees and above 50 degrees. A relay card is used to control the temperature and the speed of the fan. We use a transmitter and receiver for the wireless robot, and a motor driver controls the direction of the robot.
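As a rough illustration of the temperature sensing step, the sketch below converts an ADC reading of the LM35 output into degrees Celsius and evaluates the two threshold conditions. The LM35 scale factor of 10 mV per degree Celsius is a property of the sensor; the 8-bit resolution, the 5 V reference, the sample readings and the helper name counts_to_degc are assumptions made only for this example and are not taken from the hardware described here.

```matlab
vref  = 5.0;                          % ADC reference voltage in volts (assumption)
nbits = 8;                            % ADC resolution in bits (assumption)

% Hypothetical helper: ADC counts -> volts -> degrees Celsius (LM35: 10 mV per degree C)
counts_to_degc = @(c) (c * vref / (2^nbits - 1)) / 0.010;

counts   = [10 20 30];                % hypothetical ADC readings
temp_c   = counts_to_degc(counts);    % temperatures in degrees Celsius
below_30 = temp_c < 30;               % condition tied to the "temperature 30 degree" command
above_50 = temp_c > 50;               % condition tied to the "temperature 50 degree set" command
disp([temp_c; below_30; above_50]);   % one column per reading
```

With other ADC parameters only vref and nbits need to change; the threshold comparisons stay the same.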

III. APPROACH

This multifaceted design project can be divided into two sections: a software section and a hardware section.

Step First: As shown in Table I, record the key words in the database as .wav files, with a different time frame for each key word, using these commands.
1) file_reverse=wavrecord(duration1,fs); This command records the command word with the parameters duration1=40000 samples and sampling frequency fs=8000 Hz.
2) [x_reverse y_reverse]=find(file_reverse>.1); This command keeps the speech samples with magnitude above 0.1 and discards those below.
3) diff_reverse=max(x_reverse)-min(x_reverse); This command finds the difference between the maximum and minimum index of the retained speech samples and stores it in a variable.
4) wavwrite(file_reverse,'c:\voice\reverse.wav'); This command stores the voice sample in a file on the computer.
Then plot the graph with time and magnitude axes. The code for this process can be found in Appendix A.
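The thresholding and difference steps can be tried without a microphone. The following is a minimal sketch that replaces the recording with a synthetic signal; the burst position and amplitude are hypothetical, and the guard for an empty result mirrors the check used in the recognition listing.

```matlab
fs = 8000;                          % sampling frequency in Hz, as in the paper
n  = 40000;                         % number of samples, same value as duration1
x  = zeros(n, 1);                   % synthetic stand-in for a recording: silence
x(5001:7000) = 0.5;                 % hypothetical burst of speech, 2000 samples long

idx = find(x > 0.1);                % indices of samples above the 0.1 threshold
if isempty(idx)
    span = 0;                       % nothing above the threshold
else
    span = max(idx) - min(idx);     % distance between last and first loud sample
end
fprintf('span = %d samples\n', span);
```

Because the span only looks at the first and last sample above the threshold, a single noise spike far from the utterance would stretch it; this is a limitation of the duration-based rule itself.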

TABLE I. TIME FRAME FOR COMMAND WORDS

S.N.  Command Words               Time Frame for Key Words   Code to 89C52
1     Go                          1000 -TO- 2300             01
2     Reverse                     2300 -TO- 6000             02
3     Robo Stop                   6000 -TO- 10000            03
4     Temperature 30 Degree       10000 -TO- 13000           04
5     Temperature 50 Degree Set   13000 -TO- 16000           05
6     Fan Low Set                 16000 -TO- 18500           06
7     Fan Medium Set              18500 -TO- 21000           07
8     Fan Stop Right Now          21000 -TO- 25000           08

In this section, the first step is to define the time frame for recording command words, with duration1=40000 samples (5 seconds at the sampling frequency fs=8000 Hz). The next step is to record a key word sample using the wavrecord command, keep the samples with magnitude above 0.1, calculate the difference, and store the file using the wavwrite command. The procedure for storing the samples of the other key words is the same. In the second step, read the file and keep the samples with magnitude above 0.1 for the current voice sample. Calculate the difference and store it in a variable, which is then compared with the predefined time frames; if it matches one, give the corresponding output. The time frames for speaking and storing the key words are shown in Table I.

Table I shows that the go key word has a time frame of 1000 to 2300 samples (0.125 to 0.2875 seconds at fs=8000 Hz) for the voice sample that is spoken and stored in memory. If the voice sample falls within this frame, ASCII code 01 is generated.
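Since the difference is computed from sample indices, the time-frame limits are sample counts, and dividing by the sampling frequency gives the corresponding durations. The short sketch below performs that conversion for the eight frames; the variable names are chosen for this example only.

```matlab
fs = 8000;                                    % sampling frequency in Hz, from the paper
frames = [ 1000  2300;  2300  6000;  6000 10000; 10000 13000; ...
          13000 16000; 16000 18500; 18500 21000; 21000 25000];   % frame limits in samples
frames_sec = frames / fs;                     % the same limits in seconds
for k = 1:size(frames, 1)
    fprintf('%5d to %5d samples = %.4f to %.4f s\n', ...
            frames(k, 1), frames(k, 2), frames_sec(k, 1), frames_sec(k, 2));
end
```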


A. Software section

Figure-2 Recorded voice key words

Step Second: In this step we compare the recorded sample with the real-time speech. The comparison is based on the time duration of the recorded sample, as shown in Table I. If the real-time speech matches the recorded speech, the graph is labeled as voice matched, otherwise as no match for that key word. On a match, the ASCII code is sent to the microcontroller over serial communication, and the microcontroller performs the operation defined in its program for that key word. This process is very simple for key word recognition. The code for this process can be found in Appendix B.

B. Hardware section

TABLE II. MICROCONTROLLER PORT CONNECTION

S.N.  Ports of 89C52     Hardware Devices Control
1     P1.0, P1.1, P1.2   ADC
2     P1.4               Temperature 30 degree
3     P1.5               Temperature 50 degree set
4     P1.6               Go
5     P1.7               Reverse
6     P2.2               Fan low set
7     P2.3               Fan medium set

As shown in Table II, all peripherals are connected to the corresponding port pins of the microcontroller (89C52). Port pins 1.0, 1.1, 1.2 and 1.3 are connected to the analog-to-digital converter. Port pins 1.4 and 1.5 are connected to the temperature sensor. Port pins 1.6 and 1.7 are connected to the robot control section. Port pins 2.2 and 2.3 are connected to the relay circuit. These peripherals work according to the program stored in the microcontroller. When the command word given by the user through the microphone is recognized in Matlab, an ASCII code is generated. This code is passed to the 89C52 microcontroller, which performs the operation related to that key word. The code for this process can be found in Appendix C.

Figure-3 Implementation of control system

IV. RESULTS

A. Software Results

To recognize a speech key word we use these commands.
1) if (diff_rec >1000 && diff_rec< 2300) If the difference for the real-time signal lies between 1000 and 2300 samples (0.125 to 0.2875 seconds at fs=8000 Hz), give the output.
2) fwrite(s,01); This command performs the serial communication; it sends code 01 to the microcontroller to perform the related operation.
3) xlabel('go matched'); This command labels the graph with the recognized word.
4) end

Figure-4 Go key word matched.

Eight key words are used in this project to control the hardware. The user says the go key word, and it is recognized by comparison with the recorded go key word, as shown in Figure-4. To show the results for these key words we use the logic shown below.
1) if (size_xrec(1,1)==0) diff_rec=0; If no sample exceeds the threshold, the variable holds the value zero and the figure is labeled no matched voice.
2) else diff_rec=max(x_rec)-min(x_rec); diff_rec1=num2str(diff_rec); figure(2),plot(t,file_rec); title('current voice') ylabel(diff_rec1) end Otherwise the difference is calculated, to be compared with the ranges specified in Table I, and stored in a variable. The code then plots the figure for the current voice and the matched command word, with time on the x axis and the calculated numerical value of the time frame as the y label.

The speech recognition procedure for the other key words is similar. The voice recognition figures for the other key words are shown in Figure-5.
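The chain of range checks can also be written as one lookup table. The sketch below is an alternative formulation, not the listing used in this work: the limits and codes are copied from the if-statements in Appendix B, the helper name classify is hypothetical, and the serial write is left out so the example runs without hardware.

```matlab
% Lower limit, upper limit (in samples) and code sent to the 89C52,
% copied from the if-statements in the Appendix B listing.
rules = [ 1000  2300 1;    % go
          2300  6000 2;    % reverse
          6000 10000 3;    % robo stop
         10000 13000 6;    % fan low set
         13000 16000 7;    % fan medium set
         16000 18500 4;    % temperature 30 degree
         18500 21000 5;    % temperature 50 degree set
         21000 25000 8];   % fan stop right now

% Hypothetical helper: returns the matching code, or an empty result if none matches.
classify = @(d) rules(d > rules(:, 1) & d < rules(:, 2), 3);

code = classify(1999);     % example difference that falls in the go frame
if isempty(code)
    disp('no matched voice');
else
    fprintf('code to send: %02d\n', code);
end
```

The comparisons are strict on both sides, as in the listing, so a difference that lands exactly on a boundary such as 2300 matches no key word.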

TABLE III. PORT OPERATION FOR COMMAND WORDS

SN  Command                     Task Executed
1   Go                          Port 1.7=1 and P1.6=0
2   Reverse                     Port 1.6=1 and P1.7=0
3   Robo Stop                   Port 1.6 & P1.7=1
4   Fan Low Set                 Port 2.2=1 and P2.3=0
5   Fan Medium Set              Port 2.3=1 and P2.2=0
6   Temperature 30 Degree       Port 1.4=1 else 0
7   Temperature 50 Degree Set   Port 1.5=1 else 0
8   Fan Stop Right Now          Port 2.2 & 2.3=0
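To make the port operations of Table III easier to follow, the sketch below simulates how the robot and fan pins change for a sequence of received codes. It is a simulation only, not the microcontroller program: the codes follow the Appendix B and Appendix C listings, the starting pin levels and the command sequence are hypothetical, and the two temperature commands are left out because their outputs depend on the sensor reading.

```matlab
% Columns: code, P1.6, P1.7, P2.2, P2.3. NaN marks a pin the command leaves unchanged.
ports = [1   0   1 NaN NaN;    % go
         2   1   0 NaN NaN;    % reverse
         3   1   1 NaN NaN;    % robo stop
         6 NaN NaN   1   0;    % fan low set
         7 NaN NaN   0   1;    % fan medium set
         8 NaN NaN   0   0];   % fan stop right now

state = [0 0 0 0];             % hypothetical starting levels of P1.6, P1.7, P2.2, P2.3
for code = [1 6 3]             % example sequence: go, fan low set, robo stop
    row = ports(ports(:, 1) == code, 2:5);
    change = ~isnan(row);      % pins this command actually drives
    state(change) = row(change);
end
disp(state);                   % levels of P1.6, P1.7, P2.2, P2.3 after the sequence
```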

Figure-5. Recognized voice key words.

B. Hardware Results

As shown in Table III, if the go key word is recognized, Port 1.7 goes to logic one and Port 1.6 goes to logic zero, which means that the robot moves in the forward direction. The logic one and logic zero positions of the ports are shown in Table III for each key word.

V. CONCLUSION AND FURTHER WORK

The implemented algorithm and control system control the fan speed, the heater temperature and the robot direction using voice key words. The system demonstrates reliability and ease of future development. The experimental results show that the proposed algorithm is functional and can be used for voice key word control of home appliances and industrial robots. The percentage of correctly recognized commands is high enough for this purpose.

The main contribution of this study is that it presents the idea of a key word recognition and control system. The experiments also show that the approach works well for key word recognition. The proposed ASR and control system was completely implemented as shown in Figure-3. Our future effort will be directed toward developing a more appropriate and convenient method.

REFERENCES

[1] E. Darren Ellis, Design of a Speaker Recognition Code using Matlab, Department of Computer and Electrical Engineering, University of Tennessee, Knoxville, Tennessee 37996, 09 May 2001.
[2] Revathi, R. Ganapathy and Y. Venkataramani, Text Independent Speaker Recognition and Speaker Independent Speech Recognition Using Iterative Clustering Approach, IJCSIT, Vol 1, No 2, November 2009.
[3] Claudia Moisa, Helga Silaghi, Andrei Silaghi, Speech and Speaker Recognition for the Command of an Industrial Robot, Mathematical Methods and Computational Techniques in Electrical Engineering, ISBN: 978-960-473-238-7.
[4] Ahmad A. M. Abushariah, Teddy S. Gunawan and Mohammad A. M. Abushariah, English Digits Speech Recognition System Based on Hidden Markov Models, (ICCCE 2010), 11-13 May 2010, Kuala Lumpur, Malaysia.
[5] Zhao Lishuang, Han Zhiyan, Speech Recognition System Based on Integrating feature and HMM, International Conference on Measuring Technology and Mechatronics Automation 2010.
[6] Md. Rashidul Hasan, Mustafa Jamil, Md. Golam Rabbani, Md. Saifur Rahman, Speaker Identification Using Mel Frequency Cepstral Coefficients, 3rd International Conference on Electrical and Computer Engineering (ICECE 2004), 28-30 December 2004, Dhaka, Bangladesh.


Appendix A
%---------- recording pre defined keywords ----------
duration=16000;      % samples for the "ready" prompt
duration1=40000;     % samples for each key word
fs=8000;             % sampling frequency in Hz
%---------- saving files ----------
file_ready=wavrecord(duration,fs);
wavwrite(file_ready,'c:\voice\ready.wav');

file_go=wavrecord(duration1,fs);
[x_go y_go]=find(file_go>.1);
diff_go=max(x_go)-min(x_go);
wavwrite(file_go,'c:\voice\go.wav');

file_reverse=wavrecord(duration1,fs);
[x_reverse y_reverse]=find(file_reverse>.1);
diff_reverse=max(x_reverse)-min(x_reverse);
wavwrite(file_reverse,'c:\voice\reverse.wav');

file_robostop=wavrecord(duration1,fs);
[x_robostop y_robostop]=find(file_robostop>.1);
diff_robostop=max(x_robostop)-min(x_robostop);
wavwrite(file_robostop,'c:\voice\robostop.wav');

file_fanlowset=wavrecord(duration1,fs);
[x_fanlowset y_fanlowset]=find(file_fanlowset>.1);
diff_fanlowset=max(x_fanlowset)-min(x_fanlowset);
wavwrite(file_fanlowset,'c:\voice\fanlowset.wav');

file_fanmediumset=wavrecord(duration1,fs);
[x_fanmediumset y_fanmediumset]=find(file_fanmediumset>.1);
diff_fanmediumset=max(x_fanmediumset)-min(x_fanmediumset);
wavwrite(file_fanmediumset,'c:\voice\fanmediumset.wav');

file_temprature30degree=wavrecord(duration1,fs);
[x_temprature30degree y_temprature30degree]=find(file_temprature30degree>.1);
diff_temprature30degree=max(x_temprature30degree)-min(x_temprature30degree);
wavwrite(file_temprature30degree,'c:\voice\temprature30degree.wav');

file_temprature50degreeset=wavrecord(duration1,fs);
[x_temprature50degreeset y_temprature50degreeset]=find(file_temprature50degreeset>.1);
diff_temprature50degreeset=max(x_temprature50degreeset)-min(x_temprature50degreeset);
wavwrite(file_temprature50degreeset,'c:\voice\temprature50degreeset.wav');

file_fanstoprightnow=wavrecord(duration1,fs);
[x_fanstoprightnow y_fanstoprightnow]=find(file_fanstoprightnow>.1);
diff_fanstoprightnow=max(x_fanstoprightnow)-min(x_fanstoprightnow);
wavwrite(file_fanstoprightnow,'c:\voice\fanstoprightnow.wav');

Appendix B
%---------- serial port to the microcontroller ----------
s = serial('COM1');
set(s,'BaudRate',4800,'DataBits',8,'Parity','none','StopBits',1,'FlowControl','none');
fopen(s);
%---------- recording parameters ----------
duration=16000;
duration1=40000;
fs=8000;
%---------- reading files ----------
file_ready=wavread('c:\voice\ready.wav');

file_go=wavread('c:\voice\go.wav');
[x_go y_go]=find(file_go>.1);
diff_go=max(x_go)-min(x_go);

file_reverse=wavread('c:\voice\reverse.wav');
[x_reverse y_reverse]=find(file_reverse>.1);
diff_reverse=max(x_reverse)-min(x_reverse);

file_robostop=wavread('c:\voice\robostop.wav');
[x_robostop y_robostop]=find(file_robostop>.1);
diff_robostop=max(x_robostop)-min(x_robostop);

file_fanlowset=wavread('c:\voice\fanlowset.wav');
[x_fanlowset y_fanlowset]=find(file_fanlowset>.1);
diff_fanlowset=max(x_fanlowset)-min(x_fanlowset);

file_fanmediumset=wavread('c:\voice\fanmediumset.wav');
[x_fanmediumset y_fanmediumset]=find(file_fanmediumset>.1);
diff_fanmediumset=max(x_fanmediumset)-min(x_fanmediumset);

file_fanstoprightnow=wavread('c:\voice\fanstoprightnow.wav');
[x_fanstoprightnow y_fanstoprightnow]=find(file_fanstoprightnow>.1);
diff_fanstoprightnow=max(x_fanstoprightnow)-min(x_fanstoprightnow);

file_temprature30degree=wavread('c:\voice\temprature30degree.wav');
[x_temprature30degree y_temprature30degree]=find(file_temprature30degree>.1);
diff_temprature30degree=max(x_temprature30degree)-min(x_temprature30degree);

% the file must be read before it is thresholded
file_temprature50degreeset=wavread('c:\voice\temprature50degreeset.wav');
[x_temprature50degreeset y_temprature50degreeset]=find(file_temprature50degreeset>.1);
diff_temprature50degree=max(x_temprature50degreeset)-min(x_temprature50degreeset);
%---------- main loop ----------
for i=1:40000
    t(i,1)=i;
end
for i=1:10
    sound(file_ready,fs);
    pause(.3);
    file_rec=wavrecord(duration1,fs);
    figure(1),subplot(4,1,1); plot(file_go); title('GO training file')
    figure(1),subplot(4,1,2); plot(file_reverse); title('REVERSE training file')
    figure(1),subplot(4,1,3); plot(file_robostop); title('ROBO STOP training file')
    figure(1),subplot(4,1,4); plot(file_temprature30degree); title('TEMPRATURE 30 DEGREE training file')
    figure(3),subplot(4,1,1); plot(file_temprature50degreeset); title('TEMPRATURE 50 DEGREE SET training file')
    figure(3),subplot(4,1,2); plot(file_fanstoprightnow); title('FAN STOP RIGHT NOW training file')
    figure(3),subplot(4,1,3); plot(file_fanlowset); title('FAN LOW SET training file')
    figure(3),subplot(4,1,4); plot(file_fanmediumset); title('FAN MEDIUM SET training file')
    [x_rec y_rec]=find(file_rec>.1);
    size_xrec=size(x_rec);
    if (size_xrec(1,1)==0)
        diff_rec=0;
    else
        diff_rec=max(x_rec)-min(x_rec);
        diff_rec1=num2str(diff_rec);
        figure(2),plot(t,file_rec); title('current voice')
        ylabel(diff_rec1)
    end
    if (diff_rec >1000 && diff_rec< 2300)
        fwrite(s,01); xlabel('go matched');
    end
    if (diff_rec >2300 && diff_rec< 6000)
        fwrite(s,02); xlabel('reverse matched');
    end
    if (diff_rec >6000 && diff_rec< 10000)
        fwrite(s,03); xlabel('robo stop matched');
    end
    if (diff_rec >10000 && diff_rec< 13000)
        fwrite(s,06); xlabel('fan low set matched');
    end
    if (diff_rec >13000 && diff_rec< 16000)
        fwrite(s,07); xlabel('fan medium set matched');
    end
    if (diff_rec >16000 && diff_rec< 18500)
        fwrite(s,04); xlabel('temp 30 degree matched');
    end
    if (diff_rec >18500 && diff_rec< 21000)
        fwrite(s,05); xlabel('temp 50 degree set matched');
    end
    if (diff_rec >21000 && diff_rec< 25000)
        fwrite(s,08); xlabel('fan stop right now');
    end
    if (diff_rec==0)
        xlabel('no matched voice');
    end
    pause(3)
end

Appendix C
;----------------------------------------------------------------------
$include(mod51)
flags       equ 20h
temp30      bit 0


temp50      bit 1
            org 0000h
main:       clr p2.0
            clr p2.1
            clr p2.2
            clr p2.3
            clr temp30
            clr temp50
            lcall delay
            lcall intlcd
            lcall lcdwelcome
            lcall starttimer
wait:       jnb temp30,checkfor50
            jnb p1.4,startdevice
            clr p2.1
            ljmp recint
checkfor50: jnb temp50,recint
            jnb p1.5,startdevice
            clr p2.1
            ljmp recint
startdevice: setb p2.1
recint:     jnb ri,wait             ; no byte received yet
            clr ri
            mov a,sbuf              ; code sent from Matlab
            cjne a,#01h,next2       ; 01: go
            clr p1.6
            setb p1.7
            ljmp wait
next2:      cjne a,#02h,next3       ; 02: reverse
            setb p1.6
            clr p1.7
            ljmp wait
next3:      cjne a,#03h,next4       ; 03: robo stop
            setb p1.6
            setb p1.7
            ljmp wait
next4:      cjne a,#04h,next5       ; 04: temperature 30 degree
            setb temp30
            clr temp50
            ljmp wait
next5:      cjne a,#05h,next6       ; 05: temperature 50 degree set
            clr temp30
            setb temp50
            ljmp wait
next6:      cjne a,#06h,next7       ; 06: fan low set
            setb p2.2
            clr p2.3
            ljmp wait
next7:      cjne a,#07h,next8       ; 07: fan medium set
            clr p2.2
            setb p2.3
            ljmp wait
next8:      cjne a,#08h,wait        ; 08: fan stop right now
            clr p2.2
            clr p2.3
            ljmp wait
;----------------------------------------------------------------------
intlcd:     mov a,#38h
            lcall commandsend
            mov a,#0eh
            lcall commandsend
            mov a,#01h
            lcall commandsend
            mov a,#06h
            lcall commandsend
            mov a,#80h
            lcall commandsend
            ret
;----------------------------------------------------------------------
commandsend: clr p3.6
            clr p3.7
            mov p0,a
            setb p3.7
            nop
            nop
            nop
            clr p3.7
            lcall delay
            ret
datasend:   setb p3.6
            clr p3.7
            mov p0,a
            setb p3.7
            nop
            nop
            nop
            clr p3.7
            lcall delay
            ret
;----------------------------------------------------------------------
delay:      mov r2,#02h
l31:        mov r0,#0ffh
l2:         mov r1,#0ffh
l1:         djnz r1,l1
            djnz r0,l2
            djnz r2,l31
            ret
keytokeydelay: mov r2,#03h
l5:         mov r0,#0ffh
l4:         mov r1,#0ffh
l3:         djnz r1,l3
            djnz r0,l4
            djnz r2,l5
            ret
;----------------------------------------------------------------------
starttimer: mov tmod,#20h           ; timer 1, mode 2 (8-bit auto-reload)
            mov th1,#0fah           ; reload value for the baud rate
            mov scon,#50h           ; serial mode 1, receiver enabled
            setb tr1
            ret
intports:   ret
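As a consistency check between the two listings, the baud rate produced by the starttimer routine can be computed from the standard 8051 formula for timer 1 in mode 2 with serial mode 1. The crystal frequency is not stated in this paper; the sketch below assumes the common 11.0592 MHz crystal and assumes SMOD is left at its reset value of 0.

```matlab
f_osc = 11.0592e6;                 % crystal frequency in Hz (assumption, not stated in the paper)
th1   = hex2dec('FA');             % reload value written to TH1 in starttimer
smod  = 0;                         % SMOD bit of PCON (assumed to keep its reset value)

% Timer 1 overflow rate divided by 32 (or by 16 when SMOD = 1)
baud = (2^smod / 32) * f_osc / (12 * (256 - th1));
fprintf('baud rate = %g\n', baud);
```

Under these assumptions the result equals the 4800 baud configured for the serial object in Appendix B; a different crystal would require a different reload value.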
