
CASSI Speech Recognition: Adding Speech Recognition to Embedded Devices

by
G.V.S. GIREESH (05981A0444), B.Tech Final Year, Electronics & Communications Engineering, RAGHU ENGINEERING COLLEGE

INTRODUCTION
What is CASSI?
CASSI: Conversay Advanced Symbolic Speech Interface

CASSI can be used in a variety of embedded systems. It runs on either single- or dual-processor hardware designs. Conversay developers and customers write application code that uses the CASSI API to integrate speech recognition and text-to-speech (TTS) capability into embedded products.

CASSI provides continuous, speaker-independent speech recognition.

What is TTS?
Text-To-Speech (TTS): CASSI contains two modules for performing TTS: Rosetta and a TTS synthesis module. Rosetta, the text-to-phonetics unit, accepts arbitrary written text as input and outputs a string of phonemes for CASSI to synthesize.
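To make the idea of a text-to-phonetics step concrete, here is a minimal sketch of turning written text into a string of phonemes. It is not Rosetta and does not use any CASSI API: the function name, the tiny lexicon and the letter-by-letter fallback are all assumptions made for illustration, with ARPAbet-style symbols.

```python
import re

# Hypothetical mini-lexicon (ARPAbet-style symbols); not CASSI or Rosetta data.
LEXICON = {
    "yes": ["Y", "EH", "S"],
    "no": ["N", "OW"],
    "stop": ["S", "T", "AA", "P"],
}

# Crude one-sound-per-letter fallback so that arbitrary text still yields output.
LETTER_SOUNDS = {
    "a": "AE", "b": "B", "c": "K", "d": "D", "e": "EH", "f": "F", "g": "G",
    "h": "HH", "i": "IH", "j": "JH", "k": "K", "l": "L", "m": "M", "n": "N",
    "o": "AA", "p": "P", "q": "K", "r": "R", "s": "S", "t": "T", "u": "AH",
    "v": "V", "w": "W", "x": "K", "y": "Y", "z": "Z",
}

def text_to_phonemes(text):
    """Return a flat list of phoneme symbols for the given text."""
    phonemes = []
    # Keep only runs of letters; punctuation and digits are ignored here.
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in LEXICON:
            phonemes.extend(LEXICON[word])
        else:
            phonemes.extend(LETTER_SOUNDS[ch] for ch in word)
    return phonemes
```

A real text-to-phonetics unit would need a far larger dictionary and proper letter-to-sound rules; the fallback here only shows why unknown words do not have to be rejected.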

The Process of Incorporating Speech Technology

1. Definition of capabilities
2. Analysis of hardware resources
3. User interface design
4. Development

HARDWARE ENVIRONMENT:
Modular nature.
Suitable for a variety of systems.

Can be used with single-processor designs, where one processor handles all component execution.
In dual-processor designs, feature extraction and TTS synthesis may be separated onto their own DSP (or other front-end signal processor).

Front-End Block:
The front-end block handles the signal-processing side of recognition and TTS: feature extraction and TTS synthesis.

Processor Block (Back-End):
The processor block performs all other code functions, including topic management and search.

AUTOMATIC SPEECH RECOGNITION


What do speaker-dependent, speaker-adaptive and speaker-independent mean?

What do continuous speech and isolated-word mean?


A continuous speech system operates on speech in which words are connected together, i.e. not separated by pauses. Continuous speech is more difficult to handle because word boundaries are not marked by pauses and neighbouring sounds influence each other (coarticulation).

An isolated-word system operates on single words at a time, requiring a pause between words. This is the simplest form of recognition.
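The reliance of an isolated-word system on pauses can be sketched as follows. This is an illustrative energy-based splitter, not CASSI code; the function name, frame length, threshold and minimum pause length are assumed values chosen for the example.

```python
def split_on_pauses(samples, frame_len=4, threshold=0.01, min_pause_frames=2):
    """Return (start, end) sample indices of words separated by pauses."""
    # Mean energy of each fixed-length frame.
    energies = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / len(frame))

    words = []
    word_start = None   # frame index where the current word began
    silent_run = 0      # consecutive quiet frames seen inside a word
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if word_start is None:
                word_start = i
            silent_run = 0
        elif word_start is not None:
            silent_run += 1
            if silent_run >= min_pause_frames:
                # The pause is long enough: close the word before the pause.
                end_frame = i - silent_run + 1
                words.append((word_start * frame_len, end_frame * frame_len))
                word_start = None
                silent_run = 0
    if word_start is not None:
        # The signal ended while a word was still open.
        end_frame = len(energies) - silent_run
        words.append((word_start * frame_len,
                      min(end_frame * frame_len, len(samples))))
    return words
```

With a pause between two bursts of sound the function returns two segments; with the same sound run together it returns one, which is why a continuous speech system cannot rely on this kind of splitting.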

The Process of Speech Recognition


Acoustic-Phonetic
Pattern Recognition
Artificial Intelligence

INTERFACE

The Experiment

"Yes" spoken by the first person

"Yes" spoken by the second person

The Basic Steps

Divide the sound wave into evenly spaced blocks.

Process each block for important characteristics.

Attempt to associate each block with a phone, which is the most basic unit of speech, producing a string of phones.

Find the word whose model is the most likely match.
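The four steps above can be sketched in a few lines. Everything in this example is an assumption made for illustration: the two features (mean energy and zero-crossing rate), the three broad sound classes standing in for phones, their template values, and the two word models. Matching by edit distance is a simple stand-in for "most likely match".

```python
# Hypothetical prototypes: (mean energy, zero-crossing rate) per sound class.
SOUND_TEMPLATES = {
    "silence": (0.0, 0.0),
    "vowel": (0.25, 0.0),
    "fricative": (0.04, 1.0),
}

# Hypothetical word models: each word is a sequence of sound classes.
WORD_MODELS = {
    "yes": ["vowel", "fricative"],
    "so": ["fricative", "vowel"],
}

def block_features(samples, block_len=4):
    """Steps 1 and 2: cut the wave into equal blocks and describe each one."""
    feats = []
    for start in range(0, len(samples) - block_len + 1, block_len):
        block = samples[start:start + block_len]
        energy = sum(x * x for x in block) / block_len
        crossings = sum((a < 0) != (b < 0) for a, b in zip(block, block[1:]))
        feats.append((energy, crossings / (block_len - 1)))
    return feats

def nearest_sound(feature):
    """Step 3: pick the template closest to the block's features."""
    def dist(name):
        return sum((f - t) ** 2 for f, t in zip(feature, SOUND_TEMPLATES[name]))
    return min(SOUND_TEMPLATES, key=dist)

def to_phone_string(samples, block_len=4):
    """Label every block, skipping silence and merging repeated labels."""
    phones = []
    for feature in block_features(samples, block_len):
        sound = nearest_sound(feature)
        if sound != "silence" and (not phones or phones[-1] != sound):
            phones.append(sound)
    return phones

def edit_distance(a, b):
    """Number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]

def recognize(samples):
    """Step 4: return the word whose model is closest to the phone string."""
    phones = to_phone_string(samples)
    return min(WORD_MODELS, key=lambda w: edit_distance(phones, WORD_MODELS[w]))
```

A steady, high-energy stretch followed by a rapidly alternating, low-energy stretch is labelled vowel then fricative, which matches the toy model for "yes".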

Speech recognition systems use a basic three-stage architecture:

1. Feature detection, in which the raw acoustic waveform is represented in a more useful space.
2. Probabilistic classification of the feature vectors, in which the frames are scored by how likely they are to be versions of particular speech sounds.
3. Search for the best word-sequence hypothesis, in which a word sequence is found that is consistent with the constraints of the lexicon and grammar.
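The search stage can be illustrated with a small Viterbi alignment. In this sketch the first two stages are replaced by hand-written per-frame log-probabilities, and the lexicon, the phone symbols and the single-word "grammar" are assumptions made for the example, not part of CASSI.

```python
import math

def viterbi_word_score(frame_logprobs, phones):
    """Best log-score of aligning the frames, in order, to the phone sequence.

    frame_logprobs: one dict per frame mapping phone -> log-probability.
    Each phone must cover at least one frame and phones cannot be skipped.
    """
    n = len(phones)
    # score[s] = best score of any path that is in phone s at the current frame.
    score = [-math.inf] * n
    score[0] = frame_logprobs[0][phones[0]]
    for frame in frame_logprobs[1:]:
        new_score = [-math.inf] * n
        for s in range(n):
            stay = score[s]                                 # remain in phone s
            advance = score[s - 1] if s > 0 else -math.inf  # move on from s-1
            new_score[s] = max(stay, advance) + frame[phones[s]]
        score = new_score
    return score[-1]  # the path must finish in the last phone

def decode(frame_logprobs, lexicon):
    """Return the lexicon word whose phone sequence best explains the frames."""
    return max(lexicon, key=lambda w: viterbi_word_score(frame_logprobs, lexicon[w]))

# Hypothetical lexicon (ARPAbet-style symbols).
LEXICON = {"yes": ["Y", "EH", "S"], "no": ["N", "OW"]}

def make_frame(likely_phone, all_phones=("Y", "EH", "S", "N", "OW")):
    """Build one frame of log-probabilities that favours a single phone."""
    return {p: math.log(0.6 if p == likely_phone else 0.1) for p in all_phones}
```

For example, `decode([make_frame(p) for p in ["Y", "EH", "EH", "S"]], LEXICON)` selects "yes", because only that word's phone sequence can follow the favoured phone in every frame. Restricting the search to words in the lexicon is the simplest possible form of the lexicon and grammar constraint.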

ADVANTAGES OF SPEECH RECOGNITION

Easy searching and indexing of recorded audio and video data.

Speech recognition is also useful as a form of input:
It allows people working in active environments, such as hospitals, to use computers.
It allows people with disabilities to use computers.

CONCLUSION !!!

Visual cues to help computers decipher speech sounds that are obscured by environmental noise.

Speech-to-speech translation project for spontaneous speech

Multi-engine Spanish-to-English machine translation system

Building synthetic voices

Thank You
Under the esteemed guidance of Mr. K. PAVAN KUMAR (Asst. Professor), Electronics & Communication Engineering Department, RAGHU ENGINEERING COLLEGE
