
Sound and Music in Squeak

CS 345: Programming Language Paradigms

Introduction

Squeak contains a rich set of classes for recording and manipulating sound and music, including samples, MIDI data, and text-to-speech (TTS). This document will provide an overview of some of those capabilities.

Sampling

First, a digression into digital audio and how it works. Sound is an analog phenomenon. Some source produces a vibration, which travels through the air and moves your eardrum; we perceive the resulting change in pressure as sound. A sound can be pictured as a sine wave or, more commonly, as a composition of sine waves. A sound has two components of interest: the amplitude of the wave, which determines the volume, and the frequency (how quickly a cycle occurs), which determines the pitch. The A above middle C has a frequency of 440 Hz (cycles per second). The higher the frequency, the higher the pitch. The human ear can hear up to around 20,000 Hz.

Since a sound is continuous and a computer is by nature discrete, we cannot store the whole waveform. Also, since most waveforms are highly irregular and therefore not easily describable mathematically, we can't use a formula to represent them. Instead, we sample bits of the waveform, thereby discretizing it. The hope is that, as with movies, the discrete bits are close enough together and occur quickly enough that the result will appear continuous to a human listener.

The choices to be made in sampling are the rate at which samples are taken and the precision with which we discretize the amplitude. Since amplitude is a continuous-valued quantity, we discretize it by mapping each continuous value onto one of a set of discrete values. For example, if we use 8 bits to represent a sample, we have 2^8 = 256 distinct discrete amplitudes. More bits provide higher precision and a truer approximation of the continuous signal, at the cost of larger data files.

The sampling rate determines how often samples are taken. This influences both the quality of the digital conversion (how smooth the sound is) and the maximum frequency that can be encoded. A sampling rate of N can represent a frequency of at most N/2. (A higher-frequency sound oscillates too quickly for the samples to capture it.) Since the human ear can hear sounds up to around 20 kHz, CD-quality audio has a sampling rate of 44.1 kHz.
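
The sampling arithmetic can be tried out in a Squeak Workspace. What follows is a minimal sketch that uses only basic number and collection messages, not the sound classes. The variable names are illustrative, and scaling the samples to a signed 8-bit range is an assumption made for the example.

```smalltalk
| bits levels rate nyquist freq samples |
bits := 8.
levels := 2 raisedTo: bits.       "number of distinct amplitudes: 256"
rate := 44100.                    "CD-quality sampling rate, in Hz"
nyquist := rate / 2.              "highest representable frequency: 22050"
freq := 440.                      "the A above middle C"

"First eight samples of a 440 Hz sine wave, scaled to -127..127 and rounded"
samples := (0 to: 7) collect: [:n |
    ((2 * Float pi * freq * n / rate) sin * ((levels / 2) - 1)) rounded].

Transcript show: levels printString; cr;
    show: nyquist printString; cr;
    show: samples printString; cr.
```

Select the text and choose "do it" with a Transcript window open. Raising bits to 16 shows how quickly the number of amplitude levels grows, and lowering rate below 880 puts the 440 Hz tone above the N/2 limit.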

This is just a very brief description of digital audio. For more information, one useful online source is: http://eamusic.dartmouth.edu/ book/.

User-level tools in Squeak

Most of the user-level tools for dealing with sampled sound in Squeak focus on recording an analog sound and playing it back as a sample. The primary interface for doing this is the RecordingToolsMorph. To open it, do the following in the Workspace:

    RecordingToolsMorph new openInWorld

This opens a control panel that allows you to record, play, and pause sounds. You can edit a waveform by choosing show. This brings up an object called the WaveEditor. (It can also be created directly from the Squeak workspace.) The WaveEditor contains a PianoKeyboardMorph object, which allows you to play back a recorded sample at different pitches.

The main thing you typically do with the WaveEditor is trim a sample to, for example, remove silence at the beginning or end. You do this by choosing the endpoint of the sample with the cursor, clicking Set loop end, telling the WaveEditor the approximate frequency (pitch) of the sample, then clicking one cycle, then placing the cursor at the beginning of the sample and clicking Set Loop Start. You can then click on the button to get more operations, including trimming the sample.

You may also want to look at the harmonics of a sample. (If you don't know what a harmonic is, check out the book above under Frequency Domain.) Squeak contains a SpectrumAnalyzerMorph that can be used to display the relative intensity of the different frequencies in a sound, thereby telling you about its tone. Transforming a sound from the waveform display (amplitude vs. time) to the spectral display (frequency vs. amplitude) requires a mathematical operation called a Fourier transform, so the WaveEditor menu contains an option labeled show FFT (for Fast Fourier Transform) that opens the SpectrumAnalyzer.
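
To get a feel for what the Fourier transform computes, here is a small Workspace sketch. It is a naive discrete Fourier transform, not the fast algorithm and not the code that SpectrumAnalyzerMorph uses. The test signal, its length, and the sampling rate are all assumptions chosen so that the numbers come out evenly.

```smalltalk
| n rate signal magnitudes peak |
n := 64.                "number of samples"
rate := 6400.           "samples per second, so each frequency bin spans 100 Hz"

"Test signal: a pure 400 Hz sine wave"
signal := (0 to: n - 1) collect: [:t | (2 * Float pi * 400 * t / rate) sin].

"Magnitude of each frequency bin k, for k = 0 to n/2 - 1"
magnitudes := (0 to: n // 2 - 1) collect: [:k | | re im |
    re := 0.
    im := 0.
    signal doWithIndex: [:x :i | | angle |
        angle := 2 * Float pi * k * (i - 1) / n.
        re := re + (x * angle cos).
        im := im - (x * angle sin)].
    ((re * re) + (im * im)) sqrt].

"Index of the strongest bin, converted from 1-based to 0-based"
peak := (magnitudes indexOf: magnitudes max) - 1.
Transcript show: (peak * rate / n) printString; cr.    "prints 400"
```

The strongest bin lands on 400 Hz, the frequency of the test tone. A real recording would show several peaks, one for each harmonic, which is what the spectral display plots.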

Programming with Sampled Sounds

It's very easy to manipulate sampled sounds within a Squeak program; this section will point out a few ways. I encourage you to play around on your own with these classes. Squeak uses a nice object-oriented design for its sound classes. The basic functionality is captured within an abstract class called AbstractSound. This includes playing, pausing, and playing a sequence of notes. The most commonly used subclass of AbstractSound is FMSound. (FM stands for frequency modulation; it's a particular way of creating complex waveforms.) To generate a particular musical sound (called a voice), do:

    FMSound bass1 play

There are lots of voices; look under the class methods for FMSound. You can also set the pitch, volume, and duration of a note using the instance methods. A more interesting thing to do is to play a sequence of notes. An example of this is:

    (AbstractSound noteSequenceOn: (FMSound soundNamed: 'brass1') from: #((c4 1.0 500) (d4 0.5 500) (e4 1.0 500) (c5 1.0 500) (d5 0.5 500) (e3 0.5 500)))

This uses AbstractSound's noteSequenceOn:from: method with a particular voice (the brass voice) to build the 6 notes given in the from: argument. Each note is represented by a triplet indicating the pitch (such as c4, or middle C on a piano), the duration (1.0 is 1 beat), and the volume (500). The result is an instance of SequentialSound, which responds to play.

There are lots of other tools and classes available for manipulating samples. They are all in the Sound categories, and most of them have some examples. The Squeak Swiki also has some documentation about sampling sound. The EnvelopeEditor is also a useful tool for manipulating the attack and decay of a sample.

One last thing that folks often want to do is play a compressed sound, in particular an MP3. Squeak has support for decoding (but not encoding) MP3s and writing the output to a stream. An example is:

    (StreamingMP3Sound onFileNamed: 'foo.mp3') play

This creates an instance of StreamingMP3Sound, which can then be played or sent to another stream.
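
Note lists do not have to be typed out by hand. The sketch below builds one in code and also sets the pitch, duration, and volume of a single note. It assumes that your image's AbstractSound has an instance method setPitch:dur:loudness: that takes a pitch in Hz and a loudness between 0.0 and 1.0; check the class browser before relying on it. The arpeggio itself is only an illustration.

```smalltalk
| notes |

"A single note. The selector setPitch:dur:loudness: is an assumption to verify in your image."
(FMSound brass1 setPitch: 440.0 dur: 1.0 loudness: 0.5) play.

"Build the note list in code: each entry is a pitch name, a duration, and a volume"
notes := #(c4 e4 g4 c5) collect: [:pitch |
    Array with: pitch with: 0.5 with: 500].

(AbstractSound noteSequenceOn: FMSound brass1 from: notes) play.
```

Evaluate the two play expressions one at a time; otherwise the single note and the arpeggio will probably sound on top of each other.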

MIDI in Squeak

MIDI is a symbolic language for sending musical data between machines. Rather than storing sampled waveforms, a MIDI message consists of a tuple indicating the voice to be used, the pitch, the duration, and the volume of the note to be played. The device receiving the message is responsible for actually generating a sound from this. MIDI has the advantage of being concise and device-independent, but the disadvantage of leaving out a lot of potentially useful information, such as accents, note ties, tempo, and so on.

Squeak has very rich support for MIDI. Many of the applications, including the ScorePlayerMorph, can be seen in the Squeak Music demo, which also explains how to use these tools.
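
As an illustration of what "generating a sound from this" involves, consider pitch. MIDI conventionally transmits pitch as a note number from 0 to 127, with 69 standing for the 440 Hz A. The sketch below shows how a receiving device might convert a note number into a frequency, assuming equal temperament. The block name midiToHz is made up for this example and is not part of Squeak's MIDI classes.

```smalltalk
| midiToHz |

"Each semitone multiplies the frequency by the twelfth root of 2"
midiToHz := [:note | 440 * (2 raisedTo: (note - 69) / 12)].

Transcript show: (midiToHz value: 69) printString; cr.    "the A above middle C: 440"
Transcript show: (midiToHz value: 60) printString; cr.    "middle C: about 261.6"
Transcript show: (midiToHz value: 81) printString; cr.    "one octave above A: 880"
```

Because only the note number travels over the wire, the same message can sound quite different from one device to another, which is the device-independence trade-off described above.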

Text-to-Speech

Squeak also contains support for generating speech from written text. The details of how this works are well beyond the scope of this class, but to poke around at the tools, try:

    Speaker bigMan say: 'I am the child? No. I am the big man speaking.'

The Speaker class has lots of different voices that you can play with. The speech code in Squeak was part of a project called Klatt, which eventually evolved into a commercial product developed by DEC. You can use this class (called DECTalkReader) as follows:

    DECTalkReader daisy playOn: KlattVoice new delayed: 10000

One nice thing about this application (and all the Squeak code, for that matter) is that all of the source code is included, so you can page through it and see how the speech is generated. Granted, in some cases the lack of comments can make this a long process, but it is possible.
