
GEORGIA INSTITUTE OF TECHNOLOGY
School of Electrical Engineering
EE6255 Project No. 1
Introduction to the Speech Signal in MATLAB

Date Assigned: January 13, 2003
Date Due: January 22, 2003

The purpose of this project is to introduce you to the speech signal and to help you begin to see how MATLAB can be used to do interesting things to speech signals.

Download Data and M-Files

Go to WebCT under Resources and click on ECE6255 M-files to download the zip file ECE6255_M-files.zip. This contains some M-files that you may find useful. For example, freekz( ) is a substitute for freqz( ) in the MATLAB Signal Processing Toolbox. Likewise, striplot( ) is a multiline waveform plotter that stands in for MATLAB's strips( ), and spectgr( ) is a replacement for MATLAB's specgram( ). I will be augmenting these M-files as the semester progresses. Expand the zip file and put the resulting folder on your MATLAB path.

Go to WebCT under Speech Data and download the file 8KHzdata.zip. This contains six sentences in MATLAB .mat format that can be loaded into MATLAB with the load command. This data is sampled at 8 kHz. Expand the zip file and put the resulting folder on your MATLAB path. You will need one of these sentences in the work below.

Also go to WebCT under Speech Data and download the file 16KHzdata.zip. This data is sampled at 16 kHz, and it is in .wav format. You can read it into MATLAB with the command wavread( ). Expand the zip file and put the resulting folder on your MATLAB path. You will need one of these sentences in the work below.
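The setup steps above might look roughly like the following in MATLAB. This is only a sketch under stated assumptions: the folder names are guesses at what the zip files expand to, the variable names stored inside s5.mat are not given in this handout (so the sketch lists them rather than assuming one), and audioread( ) is used because recent MATLAB releases no longer include wavread( ).

```matlab
% Folder names are assumptions; use the folders your zip files expanded to.
dataDirs = {'ECE6255_M-files', '8KHzdata', '16KHzdata'};
for k = 1:numel(dataDirs)
    if exist(dataDirs{k}, 'dir')
        addpath(genpath(dataDirs{k}));   % add folder and subfolders to the path
    end
end

% 8 kHz data: list what the .mat file contains before relying on a name.
if exist('s5.mat', 'file')
    whos('-file', 's5.mat')              % shows variable names and sizes
    S = load('s5.mat');                  % S is a struct, one field per variable
    disp(fieldnames(S));
end

% 16 kHz data: read samples and sampling rate from the .wav file.
if exist('msa1.wav', 'file')
    [x16, fs16] = audioread('msa1.wav'); % wavread( ) in older releases
    fprintf('%d samples at %d Hz\n', size(x16, 1), fs16);
end
```

Loading into a struct keeps the workspace clean and avoids guessing what the saved vector is called.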

Phonetic Representation of Text

Write out a phonetic representation of the sentence "Oak is strong and also gives shade." using the ARPABET symbols. For this you may find the link under Resources to be helpful.

Segmentation Using Plotting and Listening

A fundamental problem in speech processing is the segmentation and phonetic labeling of speech waveforms. In general, this is very difficult to do automatically, and even computer-aided segmentation by humans requires a great deal of skill on the part of the analyst. While MATLAB is far from an ideal tool for this purpose, its plotting functions such as plot( ), subplot( ), stem( ), and strips( ) can be used for plotting speech waveforms. Also, sound( ) or soundsc( ) can be used to listen to the speech signal (or parts of it).

The problem in this section is to segment and label the waveform in the file s5.mat. This file is a sampled (sampling rate 8 kHz) waveform of an utterance of the sentence "Oak is strong and also gives shade." Use any useful features of MATLAB to examine the waveform in the file s5.mat, and make decisions on where each phoneme of the utterance begins and ends. You may want to write a MATLAB function or script that will facilitate plotting and listening to short pieces of the signal vector. Be alert for phonemes that are missing or barely realized in the waveform. There may be a period of silence or noise at the beginning and end of the file. Be sure to mark the beginning and end of these intervals too. Make a table showing the phonemes and the starting and ending samples for each.
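One possible shape for such a plotting and listening script is sketched below. It is not a prescribed method. The signal is a synthetic stand-in so that the code runs on its own, and the names x, n1, n2, and labels, as well as every sample index shown, are placeholders rather than values taken from s5.mat.

```matlab
fs = 8000;                           % sampling rate of the 8 kHz data

% Stand-in signal (1 s of a decaying 200 Hz tone); replace x with the
% vector loaded from s5.mat.
t = (0:fs-1)' / fs;
x = exp(-3*t) .* sin(2*pi*200*t);

% Sample range to inspect (hypothetical values; adjust while exploring).
n1 = max(1, 1201);
n2 = min(numel(x), 2000);
seg = x(n1:n2);

% Whole utterance on top, selected piece below, both against sample
% index so that boundaries can be read directly off the axis.
subplot(2,1,1);
plot(1:numel(x), x);
yl = ylim;
hold on;
plot([n1 n1], yl, 'r', [n2 n2], yl, 'r');   % mark the selection
hold off;
xlabel('sample index'); ylabel('amplitude');

subplot(2,1,2);
plot(n1:n2, seg);
xlabel('sample index'); ylabel('amplitude');
title(sprintf('samples %d to %d (%.1f ms)', n1, n2, 1000*(n2-n1+1)/fs));

% Listen to the selected piece (uncomment when audio output is available).
% soundsc(seg, fs);

% Running table of decisions: one entry per phoneme or silence interval.
% These entries are placeholders, not measurements.
labels = struct('phoneme', {'sil', 'OW'}, ...
                'first',   {1,     1201}, ...
                'last',    {1200,  2000});
for k = 1:numel(labels)
    fprintf('%-4s %6d %6d\n', labels(k).phoneme, labels(k).first, labels(k).last);
end
```

Plotting against sample index rather than time makes it easier to copy boundaries straight into the table.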

Remove the Consonants and Vowels

In this section, your task is to use the information gathered in Section 3 (Segmentation Using Plotting and Listening) to locate the beginning and end of each part of the waveform corresponding to the consonants, and then use MATLAB to zero out the parts of the waveform corresponding to the consonant sounds. Listen to the resulting waveform. Without telling them the sentence, play it to someone else and see if they can understand it.

Now use the information gathered in Section 3 to locate the beginning and end of each part of the waveform corresponding to the vowels, and then use MATLAB to zero out the parts of the waveform corresponding to the vowel sounds. Listen to the resulting waveform. Play it to someone else and see if they can understand it. Which of the two modified waveforms was more intelligible?
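Zeroing out a set of segments can be sketched as follows, assuming the boundaries are kept as rows of [first last] sample indices. The signal is a random stand-in and the ranges in consonantRanges are hypothetical; substitute the vector from s5.mat and the boundaries from your own table.

```matlab
fs = 8000;

% Stand-in signal; replace x with the vector loaded from s5.mat.
x = randn(fs, 1);

% Each row is [first last] sample of one segment to silence.
% These ranges are hypothetical, not measured boundaries.
consonantRanges = [ 801 1500;
                   3001 3600];

keep = true(size(x));                % logical mask, true = keep the sample
for k = 1:size(consonantRanges, 1)
    keep(consonantRanges(k,1):consonantRanges(k,2)) = false;
end

xNoConsonants = x .* keep;           % masked samples become zero, length unchanged

% soundsc(xNoConsonants, fs);        % uncomment to listen
```

Multiplying by a mask, instead of deleting samples, preserves the timing of the remaining sounds. The vowel case follows the same pattern with a second list of ranges.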

Pre-Emphasis of the Speech Signal

As we will see when we discuss the speech model, the spectrum of voiced speech falls off at high frequencies due to the glottal pulse spectrum, which combines with the vocal tract frequency response. In many situations, it is desirable to compensate for this high-frequency falloff by linear filtering. This is often called pre-emphasis. A simple way to impose emphasis of the high frequencies is to use a first difference filter defined as

$y[n] = x[n] - x[n-1]$    (1)
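As a small illustration of equation (1), the first difference can be applied with MATLAB's filter( ) function. The input values below are made up for the example, and the sketch assumes a zero initial state, that is, the sample before the first one is taken to be zero.

```matlab
% Short made-up input; replace with a speech vector for real use.
x = [1; 3; 6; 10; 15];

b = [1 -1];                 % coefficients of y[n] = x[n] - x[n-1]
a = 1;                      % no feedback terms (FIR filter)
y = filter(b, a, x);        % zero initial state: x[n] = 0 for n < 1

% Cross-check: apart from the first sample, y equals diff(x).
yCheck = [x(1); diff(x)];
disp([x y yCheck]);
```

Using filter( ) rather than diff( ) keeps the output the same length as the input, which is convenient for back-to-back playback and side-by-side plots.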

(a) Determine an equation for the frequency response $H(e^{j\omega})$ of the first difference filter.

(b) Use either MATLAB's freqz( ) or the M-file freekz( ) provided in the ECE6255_M-files.zip download to plot $|H(e^{j2\pi FT})|$ as a function of analog frequency $F$ over the range $0 \le F \le F_s/2$, where $F_s$ is the sampling frequency.

(c) Use wavread( ) to read in the file msa1.wav. Filter this speech signal with a first difference filter. Use sound([x;y],fs) to play the input and the output of the first difference filter back-to-back. Can you hear the effect of the filter?

(d) Use subplot( ) to show the wideband spectrogram of the input together with the spectrogram of the output. To keep your memory usage down, you may want to begin with an FFT length of 256 and perhaps use only the first 20000 samples of each signal. Can you see the effect of the pre-emphasis? What details are more pronounced in the pre-emphasized speech?
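Parts (b) through (d) could be organized along the following lines. This is a sketch under several assumptions, not the required solution: it uses the toolbox functions freqz( ), hamming( ), and spectrogram( ) (the current replacement for specgram( )) instead of the course M-files, it uses audioread( ) in place of wavread( ), it falls back to white noise when msa1.wav is not on the path, and the window of about 5 ms is one common choice for a wideband spectrogram.

```matlab
% Requires the Signal Processing Toolbox (freqz, hamming, spectrogram).
if exist('msa1.wav', 'file')
    [x, fs] = audioread('msa1.wav');    % wavread( ) in older MATLAB releases
    x = x(:, 1);                        % first channel only, if stereo
else
    % Stand-in so the sketch runs without the data: 1 s of white noise.
    fs = 16000;
    x = 0.1 * randn(fs, 1);
end
x = x(1:min(20000, numel(x)));          % limit length to save memory

b = [1 -1];                             % first difference filter
y = filter(b, 1, x);

% (b) magnitude response versus analog frequency F in Hz
[H, F] = freqz(b, 1, 512, fs);          % 512 points from 0 up to fs/2
figure;
plot(F, abs(H));
xlabel('F (Hz)'); ylabel('|H|');

% (d) wideband spectrograms: short window, FFT length 256
win = hamming(round(0.005 * fs));       % about 5 ms (assumed choice)
nov = round(0.75 * numel(win));         % 75 percent overlap (assumed choice)
figure;
subplot(2,1,1);
spectrogram(x, win, nov, 256, fs, 'yaxis');
title('input');
subplot(2,1,2);
spectrogram(y, win, nov, 256, fs, 'yaxis');
title('first difference output');

% (c) back-to-back playback (uncomment when audio output is available)
% sound([x; y], fs);
```

Passing fs to freqz( ) and spectrogram( ) labels the frequency axes in Hz, which matches the analog frequency axis requested in part (b).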

Report

Submit a typewritten report including appropriate plots and images to illustrate your work. Learn to include graphics in your report either with LaTeX or MS Word, or whatever you use for this sort of thing. You should structure your report along the lines of the sections of this project assignment. Be sure to answer all the specific questions asked above.
