
Presented by: Nitin Rawat, TT-ET (0909533046)

INDEX
Objectives
Introduction
Difference between speech and voice recognition
The logic
Study of components used
Methodology
Challenges faced and their solutions
Advantages and disadvantages

OBJECTIVES
To develop a speech recognition system capable of recognizing the words spoken by the user.
To make the system speaker independent.
To remove external noise using filters.
To retain data in RAM.

INTRODUCTION
Speech is a natural mode of communication by which people express their views and messages as voice utterances.
The electronic approach to speech recognition is to convert the speech into an electronic signal.
(Figure: analog representation of speech.)

DIFFERENCE BETWEEN SPEECH AND VOICE RECOGNITION


Speech recognition
It is a speaker-independent system.
The main function of this system is to recognize the words spoken by the speaker.
It is concerned with what is being spoken, not with who the speaker is.

Voice recognition
It is a speaker-dependent system.
The main function of this system is to recognize the speaker first and then the words.
It is concerned with both who is speaking and what is being spoken.

THE LOGIC

STUDY OF COMPONENTS USED


The programming board

IC HM2007
Single-chip CMOS voice recognition IC.
Speaker dependent.
External RAM support.
Maximum of 40 words recognized (0.96 seconds each).
Maximum word length of 1.92 seconds (20 words).
Microphone support.
Manual and CPU modes available.
Response time less than 300 milliseconds.
5 V power supply.
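The two word-length figures above can be checked with a little arithmetic. The sketch below uses only the numbers quoted in the feature list. The mode names "short" and "long" are labels made up for this example. It shows that both settings add up to the same total speech time, which would be consistent with one shared memory budget, although the slides do not state this.

```python
# Word-length settings quoted in the feature list above.
# The keys "short" and "long" are illustrative labels, not chip terminology.
MODES = {
    "short": {"max_words": 40, "word_length_s": 0.96},
    "long": {"max_words": 20, "word_length_s": 1.92},
}


def total_speech_seconds(mode: str) -> float:
    """Total speech time covered by all word slots in the given mode."""
    settings = MODES[mode]
    return settings["max_words"] * settings["word_length_s"]


for name in MODES:
    print(name, round(total_speech_seconds(name), 2), "seconds in total")
```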

INTERNAL FUNCTIONS OF HM2007


The chip provides the following error codes:
55 = word too long
66 = word too short
77 = no match
Pressing 99 and then Clear erases all the data inside the RAM.
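A short sketch of how a host program or a reader of the display might interpret the two-digit result. The error codes are the ones listed above. The function name `describe_result` and the assumption that trained words are numbered from 1 up to the maximum word count are illustrative, not taken from the slides.

```python
# Error codes as listed on the slide.
ERROR_CODES = {55: "word too long", 66: "word too short", 77: "no match"}


def describe_result(code: int, max_words: int = 40) -> str:
    """Turn a two-digit result code into a readable message.

    Assumes (hypothetically) that trained words occupy numbers 1..max_words.
    """
    if code in ERROR_CODES:
        return f"error: {ERROR_CODES[code]}"
    if 1 <= code <= max_words:
        return f"recognized word #{code:02d}"
    return "unknown code"


for code in (7, 55, 66, 77):
    print(code, "->", describe_result(code))
```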

8k x 8 SRAM
TTL-compatible inputs and outputs.
13 address pins.
8 I/O pins.
Automatic power-down when deselected.
Static RAM organized as 8192 words by 8 bits.
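The pin counts and the memory organization above agree with each other, as this small check shows: 13 address lines select 2^13 = 8192 locations, and each location is 8 bits wide. The function name is made up for the example.

```python
def sram_capacity(address_pins: int, data_pins: int) -> tuple[int, int]:
    """Return (number of addressable words, total bits) for a static RAM."""
    words = 2 ** address_pins      # each address pin doubles the address space
    return words, words * data_pins


words, bits = sram_capacity(address_pins=13, data_pins=8)
print(words, "words,", bits, "bits,", bits // 8 // 1024, "KiB")
```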

IC 74LS373
Consists of eight latches with 3-state outputs for bus-organized system applications.
D1-D8 are the data inputs.
Q1-Q8 are the output pins.
Used for controlling the two 7-segment drivers.

IC 7448 (7 segment driver)


Converts BCD data into control signals for the 7-segment display.
A, B, C and D are the BCD inputs.
a-g are the outputs for the 7-segment display.
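The BCD-to-segment conversion can be sketched as a lookup table. This is a simplified model, not the datasheet truth table: it uses the common textbook segment patterns with active-high outputs, and it blanks the display for inputs above 9, whereas the real chip may draw some digits differently and shows special symbols for those inputs. The names `DIGIT_PATTERNS` and `bcd_to_segments` are made up for the example.

```python
SEGMENTS = "abcdefg"

# Segments lit for each decimal digit (common textbook patterns).
DIGIT_PATTERNS = {
    0: "abcdef",
    1: "bc",
    2: "abdeg",
    3: "abcdg",
    4: "bcfg",
    5: "acdfg",
    6: "acdefg",
    7: "abc",
    8: "abcdefg",
    9: "abcdfg",
}


def bcd_to_segments(d: int, c: int, b: int, a: int) -> dict[str, int]:
    """Map BCD inputs D (most significant) .. A (least significant)
    to segment levels a-g, where 1 means the segment is lit."""
    value = (d << 3) | (c << 2) | (b << 1) | a
    lit = DIGIT_PATTERNS.get(value, "")   # simplification: blank above 9
    return {segment: int(segment in lit) for segment in SEGMENTS}


print(bcd_to_segments(0, 1, 0, 1))   # segment levels for the digit 5
```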

7 SEGMENT DISPLAY
Segments a-g are connected to the outputs of IC 7448.
The display itself can show all hexadecimal digits, 0-9 and A-F.

METHODOLOGY
Trained words and codes fed in RAM
Data from RAM to latch

Speech input from mic

To 7-segment display or any other processing unit or interface.

The IC HM2007 is the heart of this speech recognition circuit. The IC provides an analog front end, voice analysis, speech recognition and system control. It takes the analog signal from the microphone and converts it into digital codes. The user assigns each of these codes a specific number, and the result is stored in the RAM; this process is called training. From then on, whenever the IC receives the same speech input, it outputs the number that was assigned during training.
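The train-then-recognize flow described above can be modelled in a few lines. This is a toy illustration only: the slides do not describe the HM2007's internal matching method, so the nearest-template comparison, the feature vectors, the `max_distance` threshold and the class name `ToyRecognizer` are all assumptions made for the example. The 77 "no match" code is the one from the error-code list.

```python
import math

NO_MATCH = 77  # "no match" code from the error-code list


class ToyRecognizer:
    """Toy model of training and recognition (not the chip's algorithm)."""

    def __init__(self, max_words: int = 40, max_distance: float = 1.0):
        self.max_words = max_words
        self.max_distance = max_distance   # hypothetical match threshold
        self.templates = {}                # word number -> stored features

    def train(self, word_number: int, features) -> None:
        """Store the features of a spoken word under the user's number."""
        if not 1 <= word_number <= self.max_words:
            raise ValueError("word number out of range")
        self.templates[word_number] = tuple(features)

    def recognize(self, features) -> int:
        """Return the number of the closest trained word, or NO_MATCH."""
        best_number, best_distance = NO_MATCH, math.inf
        for number, template in self.templates.items():
            distance = math.dist(template, features)
            if distance < best_distance:
                best_number, best_distance = number, distance
        return best_number if best_distance <= self.max_distance else NO_MATCH

    def clear(self) -> None:
        """Forget all trained words."""
        self.templates.clear()


recognizer = ToyRecognizer()
recognizer.train(1, [0.1, 0.9])     # made-up feature values
recognizer.train(2, [0.8, 0.2])
code = recognizer.recognize([0.15, 0.85])
tens, units = divmod(code, 10)      # one digit per 7-segment display
print(code, tens, units)
```

Splitting the result with `divmod` mirrors the hardware path, where the two digits of the code go to the two 7-segment drivers.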

CHALLENGES FACED AND THEIR SOLUTIONS


External noise reduction
External noise has been reduced by using a band-pass filter with a pass band of 300 Hz to 3.1 kHz.
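To see what a 300 Hz to 3.1 kHz pass band does, the sketch below computes the gain of an idealized band-pass filter at a few frequencies. The slides do not say how the filter was built, so the model here, a first-order high-pass stage followed by a first-order low-pass stage, is an assumption chosen for simplicity; only the two corner frequencies come from the source.

```python
import math

F_LOW = 300.0     # lower corner frequency in Hz (from the slide)
F_HIGH = 3100.0   # upper corner frequency in Hz (from the slide)


def gain(f_hz: float) -> float:
    """Magnitude response of a first-order high-pass stage cascaded
    with a first-order low-pass stage (assumed topology)."""
    ratio_low = f_hz / F_LOW
    high_pass = ratio_low / math.sqrt(1 + ratio_low ** 2)
    low_pass = 1 / math.sqrt(1 + (f_hz / F_HIGH) ** 2)
    return high_pass * low_pass


def gain_db(f_hz: float) -> float:
    """Gain in decibels; f_hz must be greater than zero."""
    return 20 * math.log10(gain(f_hz))


# Mains hum, the two corners, mid-band speech, and high-frequency noise.
for f in (50, 300, 1000, 3100, 20000):
    print(f"{f:>6} Hz: {gain_db(f):6.1f} dB")
```

Frequencies in the middle of the band pass almost unchanged, while low-frequency hum and high-frequency hiss are attenuated. A practical design would likely use higher-order stages for sharper cut-off.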

Homonyms and similar-sounding words
Words that sound alike can confuse the speech recognition circuit. Homonyms sound identical, and rhyming words such as cat, bat, sat and fat differ only in their first sound, so the circuit may mistake one for another.

The Voice with Stress & Excitement


Stress and excitement alter one's voice, which affects the accuracy of the circuit's recognition. To achieve higher word-recognition accuracy, one needs to mimic the same excitement in one's voice when programming (training) the circuit. These factors should be kept in mind to achieve the highest accuracy possible from the circuit. This becomes increasingly important when the speech recognition circuit is taken out of the lab and put to work in the outside world.

ADVANTAGES AND DISADVANTAGES


Advantages
This technology is a great boon for blind and physically disabled people, who can use voice recognition technology for their work.
Because the system needs only the user's voice and matches it against what was recorded during training, it works irrespective of the language spoken and can therefore be used in any language.

Disadvantages
If the system has to work in a noisy environment, background noise may corrupt the original data and lead to misinterpretation.
Words that are pronounced alike, for example "their" and "there", are difficult for this technology to distinguish.

FUTURE SCOPE
No typing on a keyboard would be required, as your voice will act as the interface between you and your computer.

Speech can be used for controlling devices such as car stereos, GPS units, home appliances and various other devices.

Sub-vocal speech recognition
Normal speech recognition can be extended to sub-vocal speech recognition. This technology enables recognition of sub-vocal utterances, that is, speech that is not spoken aloud. It is a boon to people who cannot speak or who face problems in speaking.
