Certificate
NAME: Priyanka Tomer
ROLL NO: 0822932027
CLASS: E.I.E.
SEM: 6th
This is certified to be the bona fide work of the student on the topic SPEECH RECOGNITION for the seminar during the academic year 2010-2011.
Acknowledgement
I am very thankful to everyone who supported me, for I have completed my work effectively and, moreover, on time. I am equally grateful to my teacher Mr. R. K. Nagar. He gave me moral support and guided me in different matters regarding the topic. He was very kind and patient while suggesting the outline of my topic and correcting my doubts. I thank him for his overall support. Last but not the least, I would like to thank my parents, who helped me a lot in gathering information, collecting data, and guiding me from time to time on this topic; despite their busy schedules, they gave me different ideas that made my work unique.

Thank you
Priyanka Tomer
E.I.E. (6th Sem)
0822932027
Contents
Introduction
History
Voice Recognition Software
Enrolment
Detection and Correction
Classification of Speech Recognition Systems
How to Create a Voice Recognition System
Framework for Authentication
Applications
Advantages and Disadvantages
Conclusion
History
While AT&T Bell Laboratories developed a primitive device that could recognize speech in the 1940s, researchers knew that the widespread use of speech recognition would depend on the ability to accurately and consistently perceive subtle and complex verbal input. Thus, in the 1960s, researchers turned their focus towards a series of smaller goals that would aid in developing the larger speech recognition system. As a first step, developers created a device that used discrete speech: verbal stimuli punctuated by small pauses. In the 1970s, work on continuous speech recognition, which does not require the user to pause between words, began. This technology became functional during the 1980s and is still being developed and refined today.

Speech recognition systems have become so advanced and mainstream that business and health care professionals are turning to speech recognition solutions for everything from providing telephone support to writing medical reports. Technological advances have made speech recognition software and devices more functional and user friendly, with most contemporary products performing tasks with over 90 percent accuracy. By simplifying customer interaction, increasing efficiency, and reducing operating costs, speech recognition satisfies the needs of consumers and businesses and is used in a wide range of applications. According to figures provided by the industry analyst Allied Business Intelligence (ABI), the increased popularity of speech recognition will push revenues from $677 million in 2002 to an estimated $5.3 billion by 2008.

Indeed, recent advances in speech recognition software are creating a dynamic environment, since this technology appeals to anyone who needs or wants a hands-free approach to computing tasks. As the merger of large vocabularies and continuous recognition continues, expect more and more companies to move toward speech recognition, and watch the industry take its place as a leader in the technology sector.
Introduction
Speech recognition is a technique used in speech processing in which human speech is detected and interpreted by a machine.
Enrolment
Everybody sounds slightly different, so the first step in using a voice recognition system is reading an article displayed on the screen. This process, called enrolment, takes less than 10 minutes and results in a set of files that tell the software how you speak. Many of the newer voice recognition programs say this is not required; however, it is always worth doing to get the best results. Enrolment only has to be done once, after which the software can be started as needed.
Isolated voice recognition system: This type of input requires the user to pause between words so that the computer can distinguish the beginning and end of each word. Although your speech has to be modified slightly, slowing your regular dictation, you can achieve well over 80 WPM, the speed of an advanced typist. Some users have even reported speeds of up to 125 WPM.
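The pause-based segmentation described above can be sketched in a few lines. This is an illustrative example only, not part of the report: it treats the signal as a list of amplitude values and splits it wherever a run of low-energy samples (a pause) appears; the function name and thresholds are made up for the sketch.

```python
# Illustrative sketch: segmenting isolated-word speech by the pauses
# between words, using a simple energy threshold.

def segment_on_pauses(samples, threshold=0.1, min_pause=3):
    """Split a list of amplitude values into word segments.

    A run of at least `min_pause` consecutive low-energy samples
    (|amplitude| < threshold) is treated as the pause separating words.
    """
    words, current, silence_run = [], [], 0
    for s in samples:
        if abs(s) < threshold:
            silence_run += 1
            if silence_run >= min_pause and current:
                words.append(current)   # a pause ends the current word
                current = []
        else:
            silence_run = 0
            current.append(s)
    if current:
        words.append(current)
    return words

# Two bursts of speech separated by a long pause -> two word segments.
signal = [0.5, 0.6, 0.4] + [0.0] * 5 + [0.7, 0.8]
print(len(segment_on_pauses(signal)))  # -> 2
```

A real system would work on windowed energy of sampled audio rather than raw amplitudes, but the boundary-finding idea is the same.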
Continuous voice recognition system: This system does not require brief pauses between the spoken words. The technology is currently available from very few vendors and only for very small vocabularies (about 2,000 words) and numbers. This speech input requires the user to say only words that are known to the system, and you are limited by the expandability of the libraries. The technology is currently not useful for general dictation, but it is very useful for specific functions or programs, e.g. data-entry systems.
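The restriction to known words can be pictured as a vocabulary filter. The sketch below is a hypothetical illustration (the vocabulary and function are invented for this example): words in the system's library are accepted, everything else is rejected.

```python
# Illustrative sketch: a small-vocabulary recognizer accepts only words
# already in its library; anything else is rejected. The vocabulary here
# is a hypothetical example.

VOCABULARY = {"on", "off", "channel", "nine", "volume", "up", "down"}

def filter_known(words):
    """Split recognized words into known (in-vocabulary) and rejected."""
    known = [w for w in words if w.lower() in VOCABULARY]
    rejected = [w for w in words if w.lower() not in VOCABULARY]
    return known, rejected

print(filter_known(["channel", "nine", "please"]))
# -> (['channel', 'nine'], ['please'])
```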
Speech analysis: This is the second step in creating a voice recognition system. The first important task in speech analysis is to separate each word from the ambient noise; if the noise is not separated properly, errors result. Each spoken word is then compared with the built-in acoustic model or dictionary that was created during the training session. This dictionary is created by the programmer. The above step is done with the help of an efficient speech detection algorithm.
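The comparison of a spoken word against the stored dictionary can be sketched as classical template matching. A common technique for this (not named in the report, so it is an assumption here) is dynamic time warping (DTW), which tolerates differences in speaking rate. The feature sequences and templates below are toy 1-D examples.

```python
# Illustrative sketch: matching a spoken word's feature sequence against
# templates stored during the training session, using dynamic time
# warping (DTW) so that slower or faster utterances still match.

def dtw_distance(a, b):
    """DTW distance between two 1-D feature sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Best of: insertion, deletion, or match of adjacent frames.
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

def recognize(features, dictionary):
    """Return the dictionary word whose template is closest to the input."""
    return min(dictionary, key=lambda w: dtw_distance(features, dictionary[w]))

# Hypothetical templates built during the training (enrolment) session.
templates = {"on": [1.0, 2.0, 1.0], "off": [3.0, 3.0, 0.5]}
print(recognize([1.1, 1.9, 2.1, 1.0], templates))  # -> 'on'
```

Real systems compare sequences of spectral features (e.g. MFCCs) rather than single numbers, and modern recognizers use statistical models instead of raw templates, but the compare-against-dictionary step is the same in spirit.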
For example, the speech recognition system of Windows 7 is quite compact, as shown in the diagram.
[Block diagram: speaker inputs S1, S2, ..., SK, ..., SN feed a Speaker Recognition block followed by a Speech Recognition block.]
This is the block diagram of a speech recognition system. The first block is speaker recognition, in which the system identifies who is speaking. Next comes the speech recognition block. The last block performs parsing and arbitration.
[Block diagram: speakers S1, S2, ..., SK, ..., SN pass through the Speaker Recognition and Speech Recognition blocks for Authentication.]
When a speaker speaks, the first block recognizes who is speaking; in this block diagram, Annie, David, and Cathy are the speakers.
[Block diagram: the recognized command "Switch to channel nine" from speakers S1, S2, ..., SK, ..., SN is parsed; arbitration maps keywords to devices: Channel -> TV, Dim -> Lamp, On -> TV and Lamp.]
The last block performs parsing and arbitration, matching the given instruction against the current state. The task is then performed according to the instruction.
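The parsing-and-arbitration step can be sketched as a keyword-to-device lookup, following the Channel -> TV, Dim -> Lamp, On -> TV/Lamp mapping shown in the diagram. The function and table below are illustrative, not an actual implementation from the report.

```python
# Illustrative sketch of the parsing-and-arbitration block: keywords in a
# recognized command are mapped to the devices they control, following
# the diagram's mapping (Channel -> TV, Dim -> Lamp, On -> TV and Lamp).

KEYWORD_TO_DEVICES = {
    "channel": ["TV"],
    "dim": ["Lamp"],
    "on": ["TV", "Lamp"],
}

def arbitrate(command):
    """Parse a recognized command and decide which devices it addresses."""
    targets = []
    for word in command.lower().split():
        for device in KEYWORD_TO_DEVICES.get(word, []):
            if device not in targets:   # arbitration: each device once
                targets.append(device)
    return targets

print(arbitrate("Switch to channel nine"))  # -> ['TV']
```

So the spoken command "Switch to channel nine" is routed to the TV, while words the table does not know ("switch", "to", "nine") are simply ignored by this sketch.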
Applications
Health care
In the health care domain, even in the wake of improving speech recognition technologies, medical transcriptionists (MTs) have not yet become obsolete. The services provided may be redistributed rather than replaced. Speech recognition can be implemented in the front end or the back end of the medical documentation process. Front-end SR is where the provider dictates into a speech-recognition engine, the recognized words are displayed right after they are spoken, and the dictator is responsible for editing and signing off on the document; it never goes through an MT/editor. Back-end SR, or deferred SR, is where the provider dictates into a digital dictation system, the voice is routed through a speech-recognition machine, and the recognized draft document is routed along with the original voice file to the MT/editor, who edits the draft and finalizes the report. Deferred SR is widely used in the industry currently. Many Electronic Medical Records (EMR) applications can be more effective and may be operated more easily when deployed in conjunction with a speech-recognition engine. Searches, queries, and form filling may all be faster to perform by voice than by using a keyboard. Healthcare solutions are usually very state-specific; however, some companies adjust their solutions to the needs of concrete markets (e.g., Speech Technology Center in Russia has a Finnish partner Vitim OY with a "Terve Elama" project).
Fighter Aircraft
The F-35 is the first U.S. fighter aircraft with a voice recognition system able to act on pilot spoken commands to manage various aircraft subsystems, such as communication and navigation.
Helicopters
The problems of achieving high recognition accuracy under stress and noise pertain strongly to the helicopter environment as well as to the jet fighter environment. The acoustic noise problem is actually more severe in the helicopter environment, not only because of the high noise levels but also because the helicopter pilot generally does not wear a facemask, which would reduce acoustic noise in the microphone. Substantial test and evaluation programs have been carried out in the past decade on speech recognition system applications in helicopters, notably by the U.S. Army Avionics Research and Development Activity (AVRADA) and by the Royal Aerospace Establishment (RAE) in the UK. Work in France has included speech recognition in the Puma helicopter, and there has also been much useful work in Canada. Results have been encouraging, and voice applications have included control of communication radios, setting of navigation systems, and control of an automated target handover system. As in fighter applications, the overriding issue for voice in helicopters is the impact on pilot effectiveness. Encouraging results are reported for the AVRADA tests, although these represent only a feasibility demonstration in a test environment. Much remains to be done, both in speech recognition and in overall speech technology, in order to consistently achieve performance improvements in operational settings.
Battle Management
Battle management command centres generally require rapid access to and control of large, rapidly changing information databases. Commanders and system operators need to query these databases as conveniently as possible, in an eyes-busy environment where much of the information is presented in a display format. Human-machine interaction by voice has the potential to be very useful in these environments. A number of efforts have been undertaken to interface commercially available isolated-word recognizers into battle management environments. In one feasibility study, speech recognition equipment was tested in conjunction with an integrated information display for naval battle management applications. Users were very optimistic about the potential of the system, although capabilities were limited. Speech understanding programs sponsored by the Defense Advanced Research Projects Agency (DARPA) in the U.S. have focused on this problem of natural speech interface. Speech recognition efforts have focused on a large-vocabulary continuous speech recognition (CSR) database designed to be representative of the naval resource management task. Significant advances in the state of the art in CSR have been achieved, and current efforts are focused on integrating speech recognition and natural language processing to allow spoken language interaction with a naval resource management system.
Further applications
Automatic translation
Automotive speech recognition (e.g., Ford Sync)
Telematics (e.g., vehicle navigation systems)
Court reporting (real-time voice writing)
Hands-free computing: voice command recognition computer user interface
Home automation
Interactive voice response
Mobile telephony, including mobile email
Multimodal interaction
Pronunciation evaluation in computer-aided language learning applications
Robotics
Video games, with Tom Clancy's EndWar and Lifeline as working examples
Transcription (digital speech-to-text)
Speech-to-text (transcription of speech into mobile text messages)
Advantages
Provides better accuracy than keyboard input
Audio feedback improves data application accuracy
Serves hands-free, eyes-free, and real-time input needs very well
High reliability and flexibility
Time-saving data input
Eliminates spelling mistakes
Disadvantages
For some applications, the system tends to have a high false-reject rate due to background noise and other variables
Low signal-to-noise ratio
Overlapping speech
Difficulty differentiating between homonyms
Intensive use of computer power
Conclusion
Human performance figures suggest that we still have enormous room for improvement. To get good efficiency and to remove the flaws and weaknesses of a VRS, use a high-quality microphone and a good sound card, train the system properly, and, if possible, work in a quiet environment. At present, several new algorithms are being developed to implement voice recognition systems.
References
www.abilitynet.org.uk
www.tech.purdue.edu
en.wikipedia.org