
Digital Audio Effect System-on-a-Chip Based on Embedded DSP Core

Kyungjin Byun, Young-Su Kwon, Seongmo Park, and Nak-Woong Eum

This paper describes the implementation of a digital audio effect system-on-a-chip (SoC), which integrates an embedded digital signal processor (DSP) core, audio codec intellectual property, a number of peripheral blocks, and various audio effect algorithms. The audio effect SoC is developed using a software and hardware co-design method. In the design of the SoC, the embedded DSP and some dedicated hardware blocks are developed as a hardware design, while the audio effect algorithms are realized using a software centric method. Most of the audio effect algorithms are implemented using a C code with primitive functions that run on the embedded DSP, while the equalization effect, which requires a large amount of computation, is implemented using a dedicated hardware block with high flexibility. For the optimized implementation of audio effects, we exploit the primitive functions of the embedded DSP compiler, which is a very efficient way to reduce the code size and computation. The audio effect SoC was fabricated using a 0.18 μm CMOS process and evaluated successfully on a real-time test board.

Keywords: System-on-a-chip, SoC, DSP, audio effect, digital signal processor.

Manuscript received May 13, 2009; revised Sept. 19, 2009; accepted Oct. 5, 2009.
This work was supported by the IT R&D program of MKE/IITA, Rep. of Korea [2006-S-048-01, Embedded DSP platform for audio and video signal processing].
Kyungjin Byun (phone: +82 42 860 5831, email: kjbyun@etri.re.kr), Young-Su Kwon (email: yskwon@etri.re.kr), Seongmo Park (email: smpark@etri.re.kr), and Nak-Woong Eum (email: nweum@etri.re.kr) are with Convergence Components & Materials Research Laboratory, ETRI, Daejeon, Rep. of Korea.
doi:10.4218/etrij.09.1209.0029

I. Introduction

Over the last few decades, digital audio effects have been developed for various audio and music applications, such as composition, recording, mixing, and real-time sound processing. A number of implementation techniques for audio effects are used, such as filters, delay lines, time-segments, and time-frequency representations [1]. Although audio effect algorithms and their realization by software have been extensively studied [2]-[6], few studies have been devoted to the complete system implementation of audio effects [7], [8].

Using the hardware/software co-design approach described in [7] provides a practical alternative to software centric systems. In [7], dedicated hardware units are used to implement real-time audio effects while software units are used for parameterization of the effects, which controls and manages the hardware effect units. The software units are implemented on an embedded soft-core processor supplied by the field programmable gate array (FPGA) vendor. The advantages of the implementation of audio effects using a dedicated hardware are very high throughput and low latency. However, implementing audio effects in hardware is more difficult and time-consuming than in software. Because the hardware and software units are implemented together on an FPGA operating at 48 MHz in [7], the performance of the implemented units is relatively lower than that of a fabricated system-on-a-chip (SoC).

Implementation of a professional audio effect system by using a commercial digital signal processor (DSP) is described in [8]. The authors realized a total of nine audio effects in real-time in a software centric manner. They insisted that using DSPs is the most efficient implementation solution in signal-based applications involving digital processing because DSPs feature high performance and flexibility at acceptable costs.

732 Kyungjin Byun et al. 2009 ETRI Journal, Volume 31, Number 6, December 2009
However, because their implementation is not an SoC, it requires other components, such as an audio codec, in order to implement a complete audio effect system.

In this paper, we present the implementation of a digital audio effect SoC which integrates an embedded DSP (eDSP) core, audio codec IP, a number of peripheral blocks, and various audio effect algorithms. Generally, digital audio effects are realized by software in almost all cases. However, in this study, we developed the audio effect SoC using a software and hardware co-design method. The embedded DSP and some dedicated hardware blocks are developed as a hardware design, while the audio effect algorithms are realized in a software centric manner.

The design of the eDSP and its development environments are also described in this paper, which were developed by Electronics and Telecommunications Research Institute (ETRI) for various audio and speech applications, such as an audio codec [9] and a video codec [10]. One of the useful features of the eDSP compiler is its use of primitive functions. If a user modifies application programs written in C language by using the primitive functions of the eDSP compiler, efficiently optimized compilation results can be obtained. As a result, employing the primitive functions makes it possible to obtain an optimized assembly code because our compiler is fully customized to the eDSP.

In the implementation of audio effects, we employ the primitive functions of the eDSP compiler, which provide an efficient way to implement the audio effect algorithms. Most of the algorithms are realized by C language with the primitive functions that run on the embedded DSP in real-time, while the equalizer, which requires a large amount of computing power, is implemented by the dedicated hardware block with high flexibility.

Section II describes the design of the embedded DSP core, its development environments, and the primitive functions of the compiler. In section III, we present the digital audio effect algorithms and an efficient implementation method for them. Then, the design of the audio effect SoC and its implementation results are presented in section IV. Section V gives the conclusion of this paper.

II. Design of Embedded DSP Core

1. Embedded DSP Core

The eDSP is a 16-bit fixed-point DSP, which was developed by ETRI for use in an SoC suitable for audio and speech applications. The eDSP can consist of a single, dual, or quad core data processing engine (DPE) because it has a scalable architecture. Each core of the eDSP efficiently shares the memory module via an access arbiter [11]. This architecture is very efficient for developing SoCs for various applications that need a different level of computation and resources. The main features of the eDSP core shown in Fig. 1 are the following:

- Five functional units: I (instruction sequencing unit), A (data address generation unit), X (execution unit), M (memory access unit), and P (peripheral unit)
- One program bus and three 16-bit data buses (two for data read and one for data write)
- Unified memory architecture (both the program and data share the same memory space)
- Hardware-based pipeline conflict resolving technique
- Seven stage pipeline: prefetch, fetch, decode, access, read, execute, and write
- Arithmetic instructions with 32-bit operands

Fig. 1. Architecture of embedded DSP core. (Block diagram of the data processing engine (DPE) with its I, A, and X units, the P, C, D, and W buses, the access arbiter, and even/odd memory modules.)

A DPE consisting of I, A, and X units is the central instruction processing core of the eDSP. The DPE performs most operations of the eDSP, such as instruction fetch and decoding, address generation, and arithmetic operations. In
Fig. 1, the I unit arranges instructional words to handle various types of instructions. The A unit fetches operands from the memory or internal DPE registers. The X unit executes the operation and saves the results into the memory or the DPE registers. Each unit is a clearly separated module written in hardware description language (HDL) code.

The M unit is a parallel memory system that arbitrates the data requests between DPEs and memory modules. The M unit is composed of an access arbiter and the memory modules. The access arbiter arranges the memory requests from a DPE or set of DPEs. The arbiter detects conflicts among memory requests, arranges the requests, and services accesses in the order of highest priority. Conflict-generating memory requests are serviced sequentially while conflict-free requests are completed immediately.

2. Primitive Functions of eDSP Compiler

The compiler of the eDSP provides some useful primitive functions which can be used in the C application layer, although they are not instructions of the eDSP. Application programmers can use the primitive functions as subfunctions in their programs written in C language to optimize their code. Therefore, if these primitive functions are used in implementing an algorithm in C language, the compilation yields well-optimized assembly code. We also used these primitives in the implementation of the audio effect algorithms, which will be discussed in section III. The compiler of the eDSP provides about 20 primitive functions, such as extract_high, to_high, subc, and fract_mpy. If a programmer modifies application programs written in plain C code by using the primitive functions, very compact compilation results, that is, very compact assembly code, can be obtained. Table 1 shows some primitive functions and their operation, which are used in Figs. 2 and 7.

Table 1. Some primitive functions and their operation.
Function        Description and operation
accum_t         Type definition of 40-bit accumulator
extract_low     Extract 16-bit low word of accumulator
extract_high    Extract 16-bit high word of accumulator
to_high         Load 16-bit word to high word of accumulator
subc            Divide step for integer division, which is repeated 16 times for 16-bit division
fract_mpy       Fractional multiplication

Therefore, we exploit these primitive functions to develop mathematical functions such as logarithm, exponential, and division functions, which are used for implementing audio effects. The logarithm function is used in distortion effects, and the division function is used in auto-wah and phaser effects, which will be discussed in section III. Figure 2 shows the division function written in C language with and without primitives.

Division (plain C code):

    division (var1, var2)
    {
        Word16 iteration, var_out = 0;
        Word32 L_num, L_denom;
        L_num = L_deposit_l(var1);
        L_denom = L_deposit_l(var2);
        for(iteration=0;iteration<15;iteration++) {
            var_out <<=1;
            L_num <<= 1;
            if (L_num >= L_denom) {
                L_num = L_sub(L_num,L_denom);
                var_out = add( var_out,(Word16)1 );
            }
        }
        return(var_out);
    }

Division with primitive function:

    division(var1, var2)
    {
        accum_t aa = var1;
        accum_t bb = var2;

        aa = aa << 15;
        bb = bb << 15;

        for (int i = 0; i < 16; i++) {
            aa = subc(aa, bb);
        }
        return extract_low(aa);
    }

Fig. 2. Modification of the division function by using the subc primitive function.

Mathematical functions written in C with primitives show highly optimized compilation results. For instance, a logarithmic function implemented using the primitives requires less than 10% of the computation of the version written without primitives. In particular, the division function implemented using the subc primitive needs only 4% of the computation of the plain C code.

III. Digital Audio Effects

Digital audio effects are used in various audio applications, such as multi-effectors, karaoke systems, and mobile audio devices. The main goal of digital audio effects is the modification of the sound characteristic of the input audio signal. There are many audio effect algorithms such as reverb, distortion, auto-wah, and pitch shifting. These effects can be configured as shown in Fig. 3.

1. Audio Effect Algorithms

Many publications have dealt with the design of digital audio effect algorithms and their realization [5], [6]. Therefore, we only briefly describe the algorithms that are referred to again in the following sections on implementation issues.

A. Reverb Effect

Reverberation is a very common phenomenon in our lives. A reverb effect is the result of many reflections of sound

that occur in a room and on the surrounding walls in a concert hall. From any sound source, there are direct and indirect paths. The sound through an indirect path is reflected, delayed, and attenuated. These reflected waves can again bounce off another wall before arriving at our ears, and so on. The reverb effect is the realization of a series of delayed and attenuated sound waves [1].

In our implementation, the reverb effect is realized by combining a parallel chain of comb filters (CF) and a serial chain of all-pass filters (APF), as depicted in Fig. 4. The outputs of the comb filters are summed together through an all-pass filter to produce a reverb effect.

Fig. 3. Configuration of digital audio effects. (Blocks shown between audio input and audio output: noise gate, compress, distortion, auto-wah, pitch shift, tremolo, phaser, reverb, chorus, flanger, multiband equalizer, and AMP speaker.)

Fig. 4. Simplified block diagram of a reverb effect. (Input x(t), all-pass filters (APF), comb filters (CF), output y(t).)

B. Pitch Shift

A pitch shift changes the pitch of an audio signal without affecting its speed. Various techniques for pitch shifts have been proposed such as a phase vocoder and synchronous overlap-add (SOLA). Both are block-based algorithms. A phase vocoder is based on the short time Fourier transform, and the SOLA approach is a time domain block-based algorithm. These block-based algorithms have a latency problem, which is a major issue in live systems in which an audio signal or voice is pitch-shifted in real-time [6]. To avoid the latency problem, we adopted an interpolated pitch shifting method in our implementation, which is a sample-by-sample processing algorithm.

C. Distortion Effect

Distortion is the modification of a sound using nonlinear signal processing. Some musical instruments such as electronic guitars take advantage of the distortion effect to enlarge and vary their timbre. This modifies the sound color by introducing nonlinear distortion products of the input signal [1]. In our implementation, we adopted logarithm and Bezier curves for nonlinear processing to generate a distortion curve as shown in Fig. 5. A distortion effect is obtained by performing the logarithmic calculation followed by the computation of the Bezier curve. The distortion effect needs a preprocessing such as a noise gate because it amplifies the input signal very much.

Fig. 5. Nonlinear processing for distortion effects. (Labels: Input, Output, Logarithm, Bezier curve, Distortion.)

D. Equalizer

Equalization is an effect that allows the frequency response of an output signal to be controlled. An equalizer produces an equalization effect that boosts or cuts certain frequency bands to adjust the output audio sound [8]. In this paper, we implement an equalizer using a filter bank composed of infinite impulse response (IIR)-type filters. Because digital filter processing requires repetitive and intensive computation, these IIR filters are implemented by a dedicated hardware block. The multiband equalizer can be realized using these filters. The filter coefficients for each IIR filter are supplied by the eDSP and are calculated in the eDSP in advance. The audio input signal is also fed into each filter by the eDSP, and the filtered output is taken by the eDSP.

2. Implementation of Audio Effects

In the implementation of audio effects, we first developed the effect algorithms in C language. The implemented effects are compiled by an eDSP compiler and verified on an eDSP simulator. To ascertain the performance of our eDSP and its compiler, we compared the simulation results with those of a TI TMS320C54x DSP and CCS (TI DSP development environment), which is similar to eDSP in terms of architecture and performance. A comparison between the C54x DSP and eDSP shows that the results from the eDSP are sufficiently competitive, as shown in Table 2. In this table, the unit of the program and data memory is a word, and one word is 16 bits. After verifying the implemented audio effect algorithms, we

optimize these C codes using the primitive functions supported by the eDSP compiler. Employing the primitive functions makes it possible to obtain an optimized assembly code because our compiler is fully customized to the eDSP.

Table 2. Comparison of compilation results between C54x DSP and eDSP (MIPS: million instructions per second).
Effects        Resource          C54x      eDSP
Reverb         Program memory    1,642     1,465
Reverb         Data memory       44,384    44,382
Reverb         MIPS              40.6      39.6
Pitch shift    Program memory    1,094     1,365
Pitch shift    Data memory       10,171    10,231
Pitch shift    MIPS              5.2       8.5

Figures 6 and 7 show examples of the optimization using the intrinsic functions of C54x DSP and using the primitive functions of the eDSP, respectively. The C code used in Figs. 6 and 7 is a part of the C code for the reverb algorithm. In Figs. 6 and 7, QN is 14 and MAF0_FB is a constant in Q14, which means the lower 14 bits of a 16-bit word are a fractional value. TI C54x DSP also provides a number of intrinsic functions for C programmers in order to optimize C codes. Although it provides many kinds of intrinsic functions, most of them are instruction-level ones [12]. Therefore, programmers who use those intrinsic functions may have to rewrite their C code into what is almost assembly-level code, as shown in Fig. 6: even though the intrinsic functions can be called like C functions, they are instruction-level functions, that is, one intrinsic function is compiled to one DSP instruction. These kinds of intrinsic functions are efficient for optimizing C codes such as ETSI standard speech codecs, but not for plain C codes. In the C code with intrinsic in Fig. 6, the _smac intrinsic function of C54x DSP, which corresponds to the MAC instruction of C54x DSP, is used to optimize the C code.

Fig. 6. Optimization of using intrinsic function of TI C54x.

C code, without intrinsic:

    maf0_zn[0] = maf0_zn[maf0_idx]
        + ((maf0_zn[0] * MAF0_FB) >> QN);

C code, with intrinsic:

    maf0_zn[0] = _smac(maf0_zn[maf0_idx], maf0_zn[0], MAF0_FB);

Compiled assembly code (C54x), without intrinsic:

    62F8  MPY *(908h),#1478h,A
    F7B8  SSBX SXM
    7211  MVDM 902h,AR1
    F468  SFTA A,8,A
    11E1  LD *AR1(2312),B
    F478  SFTA A,-8,A
    F512  ADD A,-14,B
    81F8  STL B,*(908h)

Compiled assembly code (C54x), with intrinsic:

    F120  LD #1478h,0,B
    30F8  LD *(908h),T
    10E1  LD *AR1(2312),A
    28F8  MAC *(B),A
    80F8  STL A,*(908h)

On the other hand, the primitive functions of eDSP can be used like operators in a plain C code as shown in Fig. 7 because the primitive functions of eDSP are more primitive than the intrinsic functions of C54x DSP. This means the primitive functions have higher flexibility. In other words, for eDSP, programmers can use some primitive functions in order to describe MAC operation in the C code, while for C54x DSP, programmers should use the _smac intrinsic function for the MAC operation. Therefore, when optimizing the C code it is not necessary for programmers to modify their code much to use the primitive functions of eDSP. Although there are differences between the primitive functions of eDSP and the intrinsic functions of C54x DSP, the results of optimization using primitive and intrinsic functions are competitive.

In the example in Fig. 7, we convert the original C code in the upper left into the different one shown in the upper right by using primitive functions, such as extract_high, to_high, and fract_mpy. Using these three primitives makes the compiler generate the optimized compilation results shown in the lower right in Fig. 7, in which a multiply and accumulate (MAC) instruction is employed by the compiler of eDSP, while the compilation results without primitives demonstrate an inefficient code as shown in the lower left of Fig. 7.

Fig. 7. Effectiveness of using primitive functions of eDSP.

C code, without primitive:

    maf0_zn[0] = maf0_zn[maf0_idx]
        + ((maf0_zn[0] * MAF0_FB) >> QN);

C code, with primitive:

    maf0_zn[0] =
        extract_high(to_high(maf0_zn[maf0_idx]) + fract_mpy(maf0_zn[0], MAF0_FB));

Compiled assembly code (eDSP), without primitive:

    770e 1478  // STM #0x1478, T
    7214 107a  // MVDM *(0x107a), AR4
    20f8 1080  // MPY *(0x1080), A
    4e04       // DST A, *SP(0x4)
    10e4 1080  // LD *AR4(0x1080), A
    5704       // DLD *SP(0x4), B
    F612       // ADD B, 0x12, A
    80f8 1080  // STL A, *(0x1080)

Compiled assembly code (eDSP), with primitive:

    7212 0fe9       // MVDM *(0xfe9), AR2
    44e2 0fef       // LD *AR2(0xfef), 16, A
    64f8 0fef 28f0  // MAC *(0xfef), #0x28f0, A, A
    82f8 0fef       // STH A, *(0xfef)

The MAC instruction is part of the eDSP instruction set, and it is efficient because it performs multiplication and addition simultaneously. Therefore, letting the compiler employ a MAC instruction, as shown in the lower right of Fig. 7, is an efficient way to reduce the code size and computation. As shown in Fig. 7, the compiled assembly code of the C code with primitive functions is more compact and less cycle-consuming than that without primitives. Figure 8 shows the procedure in which each primitive function used in Fig. 7 makes the compiler generate the optimized compilation results in which a MAC instruction is employed by the compiler.

First, as shown in Fig. 8, the to_high primitive moves the data in memory into acc_high, the high 16-bit part of the 40-bit accumulator. At this time, the low 16-bit acc_low is filled with zeros. Second, the fract_mpy primitive performs multiplication and shifts left by 2 bits. Last, the extract_high primitive moves

the high 16-bit part of the accumulator into the memory. As a result, combining these primitive functions generates more optimized compilation results employing a MAC instruction.

Fig. 8. Procedure generating the optimized results in Fig. 7 by using the primitive functions. (Diagram: to_high() loads maf0_zn[maf0_idx] from memory into acc_high with acc_low set to zeros; fract_mpy() multiplies maf0_zn[0] by MAF0_FB, shifts left by 2 bits, and accumulates (MAC); extract_high() stores acc_high to maf0_zn[0] in memory.)

Table 3 shows the improvement of the implementation of the reverb effect by using the primitive functions. The computation of the reverb with primitives is reduced by 15% compared to that without primitives. The rightmost results are from using assembly coding, which is the most optimized version; however, this method requires a long development time. We can choose the implementation method according to our needs. For instance, the reverb effect, which requires a large computation and memory, is implemented by assembly coding, and other algorithms are implemented using cross-compilation with the primitive functions for development within a short time.

Table 3. Improvements for the implementation of reverb by using the primitive functions.
Reverb            Without primitive    With primitive    Assembly coding
Program memory    1,465                1,099             851
Data memory       44,382               44,351            44,401
MIPS              39.6                 33.6              24.0

IV. Audio Effect SoC

1. Implementation of Audio Effect SoC

Conventional audio effects are generally implemented on a microprocessor or a DSP by software. Therefore, they need some components, such as an audio codec, a phase locked loop (PLL), an analog-to-digital converter (ADC), and interface logics in order to develop audio effect systems, such as audio mixers and electric guitar effectors. We designed and implemented an audio effect SoC which integrates almost all of the following components which are needed for developing an audio effect system:

- Embedded DSP as a processor to perform various signal processing of audio effect algorithms
- Stereo audio codec IP for converting an analog audio input signal into a digital signal
- ADCs for the acquisition of the analog input for adjusting parameters of various audio effects
- External memory interface for an audio effect which requires large memory
- Boot-loading interface for downloading or uploading the firmware of audio effects from or to the external program memory space
- PLL for proper clock frequency

The audio effect SoC integrates a dual eDSP core, stereo audio codec, multiband equalizer, and many peripherals, such as a PLL, ADCs, host interface, and external memory interface as shown in Fig. 9. Because the audio effect SoC is based on the eDSP core, most of the peripheral blocks and hard-wired blocks are connected to the eDSP core through its external user (EU) bus.

We can change the parameters and configuration of the audio effects by using the host interface, because the host interface has an 8-bit bus and a memory of 256 words. It is also used to download or upload the firmware in the eDSP. The eDSP firmware is also downloaded from an external EEPROM using the boot-loader interface for the standalone mode.

Fig. 9. Block diagram of audio effect SoC. (Blocks shown: crystal and PLL (clock); dual-core embedded DSP with two DPEs (I, A, and X units), access arbiter, and memory modules (M); boot I/F (serial clock/data); host I/F (MPU read/write data); ADC (analog input); external memory I/F (SDRAM control); codec I/F (external codec control); audio codec IP (audio signal I/O); multiband equalizer; peripherals and hard-wired IP connected through the EU bus.)
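The primitive sequence of Figs. 7 and 8 (section III) can be cross-checked in portable C. The following is only a minimal sketch under stated assumptions, not the eDSP compiler's implementation: accum_t is modeled with a 64-bit integer instead of the 40-bit hardware accumulator, the three primitives are emulated from their descriptions in Table 1 and Fig. 8 (including the 2-bit left shift of fract_mpy), saturation and overflow are not modeled, and the helper names floor_shift, plain_update, and primitive_update are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a 64-bit integer stands in for the 40-bit eDSP accumulator. */
typedef int64_t accum_t;

#define QN 14  /* Q14 format, as stated for Figs. 6 and 7 */

/* Hypothetical helper: floor(v / 2^n), avoiding shifts of negative values. */
static int64_t floor_shift(int64_t v, int n)
{
    assert(n >= 0 && n < 62);
    int64_t d = (int64_t)1 << n;
    int64_t q = v / d;              /* C division truncates toward zero */
    if (v % d != 0 && v < 0) {
        q -= 1;                     /* adjust to round toward -infinity */
    }
    return q;
}

/* Emulated to_high: place a 16-bit word in bits 31..16, low word zero. */
static accum_t to_high(int16_t w)
{
    return (accum_t)w * 65536;
}

/* Emulated fract_mpy: multiply, then shift left by 2 bits (Fig. 8). */
static accum_t fract_mpy(int16_t a, int16_t b)
{
    return (accum_t)a * (accum_t)b * 4;
}

/* Emulated extract_high: return bits 31..16 of the accumulator. */
static int16_t extract_high(accum_t acc)
{
    return (int16_t)floor_shift(acc, 16);
}

/* Plain C form of the reverb update shown in Fig. 7 (without primitive). */
static int16_t plain_update(int16_t delayed, int16_t z0, int16_t fb)
{
    return (int16_t)(delayed + floor_shift((int64_t)z0 * fb, QN));
}

/* Primitive form of the same update shown in Fig. 7 (with primitive). */
static int16_t primitive_update(int16_t delayed, int16_t z0, int16_t fb)
{
    return extract_high(to_high(delayed) + fract_mpy(z0, fb));
}
```

With a hypothetical feedback coefficient of 8192 (0.5 in Q14), both forms map a delayed sample of 1000 and a current sample of 2000 to 2000. The two forms agree because shifting the Q14 product left by 2 bits aligns it with the high word, so taking bits 31..16 of the sum is equivalent to the 14-bit right shift in the plain expression, as long as the result fits in 16 bits.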

The SoC includes an 8-channel ADC and ADC interface, which has an 8-bit resolution. The ADC is for controlling the parameters of the effect algorithms, such as reverb depth, pitch value of the pitch shifting, and noise threshold. Its maximum conversion rate is 2 MHz, which is enough to control the parameters of the audio effects. Although the ADC has one physical channel, it can be considered an 8-channel converter because it is multiplexed by an 8-channel multiplexer.

The audio codec IP is an 18-bit 48 kHz stereo sigma-delta audio codec for digital audio applications. Users can initialize the audio codec by using the serial peripheral interface (SPI). The codec interface makes it possible to use a user-specific external audio codec. When using the internal audio codec, audio data bypasses the codec interface. The external memory interface is for the external SDRAM. The SoC includes an internal memory of 128 kB, which is not enough for effects such as a delay-type algorithm. External memory can be extended to 2 MB. The PLL is an analog programmable frequency synthesizer for an on-chip application. Its output range and operating frequency is from 5 MHz to 320 MHz, and the loop characteristics of the PLL are fully programmable.

In the SoC, most effect algorithms are implemented by software as described in section III, except for the equalizer. The multiband equalizer consists of several filter blocks, which require a huge amount of computation. For instance, the computation of the equalizer is about 6 MIPS per band, resulting in 60 MIPS for a 10-band equalizer. Therefore, we realize the equalizer by employing dedicated hardware blocks that consist of universal filters and can be reconfigured by changing the filter coefficients and some parameters. Moreover, we use these filter blocks for implementing other effect algorithms, such as a distortion effect, which uses the IIR filter.

The design of the audio effect SoC is verified using the FPGA development board as shown in Fig. 12, which includes an Altera Stratix EP1S80B956C6 FPGA, an audio codec, an ADC, and a microcontroller for debugging our SoC design. On this board, we also verified the implemented audio effect algorithms in real-time before fabricating the SoC.

The audio effect SoC can be used in many applications, such as electric guitar effectors and audio mixers. To implement an electric guitar effector, only the audio effect SoC and an EEPROM are needed because the SoC includes all components required to perform audio effects. An example of an audio effect system is shown in Fig. 10. This example shows how the SoC can be used in an application system such as an electric guitar effector.

Fig. 10. Example of an audio effect system using the SoC. (Blocks shown: EEPROM, audio effect SoC with boot I/F, DSP core, DSP memory, ADC, audio codec, and internal bus; analog input (V/R), audio input, amplifier, and audio output.)

In Fig. 10, the audio effect firmware, which is implemented in C and compiled by the eDSP compiler, is stored in an EEPROM. The electric guitar is used as an audio input device, and the reverb effect is used as an audio effect in this example. Electric guitarists usually use many kinds of effectors when they perform. The electric guitar effector operates as follows.

After the system power is turned on, the firmware of the audio effects in the EEPROM is downloaded to the internal memory of the SoC through the boot-loading interface. The downloaded audio effect program running on the DSP waits for the input audio signal. The analog guitar signal is converted to a digital signal by the audio codec at a rate of 44.1 kHz. The audio effect program receives a digital signal from the audio codec as an input signal. The audio effect program performs the reverb algorithm to reverberate the input signal. The reverb program must process 44,100 input samples per second for real-time processing. The output signal reverberated by the reverb program is converted back to an analog signal by the audio codec and is amplified for the speakers.

2. Summary of Audio Effect SoC

Table 4. Summary of audio effect SoC.
Item                 Description
Dual core eDSP       Max. 120 MIPS per core
Gate count           26,000 gates per one eDSP core
Memory               64k words for eDSP, 512 words for interface blocks
Hard-wired block     4 filter blocks (equalizer)
Audio codec IP       Stereo in/out with 18 bit resolution
Peripherals          PLL, ADC, external memory I/F, host I/F, boot I/F, ADC I/F, external codec I/F
Fabrication          0.18 μm CMOS process, 144 TQFP package, die size: 5.5 mm × 5.5 mm
Power consumption    Typically 168 mW (3.3 V / 1.8 V)

Table 4 provides a summary of the implemented audio effect SoC. The data memory size is 64k 16-bit words, which can be

extended to 256k for the realization of other audio effects such as an echo, which requires a great deal of memory. The fabricated wafer is packaged by a 144-pin thin quad flat package (TQFP), and the implemented SoC operates well at 120 MHz on a real-time test board.

The implementation results of the audio effects are summarized in Table 5. We initially implemented eight basic audio effects in our SoC, but we will add other effects or change the configuration of the effect algorithms according to the specific application because implementing the effect algorithms on the SoC is based on a programmable eDSP. As mentioned in the previous section, the SoC includes a dual core eDSP that has a computing power of 120 MIPS per core, and the implemented audio effects in Table 5 require a total computation of 81 MIPS. Therefore, there is enough room to add other complicated effect algorithms. Figure 11 shows a die photograph of the audio effect SoC, and Fig. 12 shows the SoC evaluation and FPGA development boards.

Table 5. Implementation results of audio effects.
Effects        Prog. mem (words)    Data mem (words)    MIPS
Reverb         1,099                44,351              33.6
Chorus         386                  2,136               5.8
Flanger        366                  4,152               4.9
Phaser         507                  78                  8.1
Tremolo        279                  452                 3.0
Auto-wah       523                  45                  9.1
Pitch shift    573                  3,180               7.0
Distortion     465                  1,046               9.3
Total          4,198                55,440              80.8

Fig. 11. Photograph of the audio effect SoC. (Labeled regions: DSP core, SRAM, SRAM 1024x16, memory module, ADC, PLL, audio codec IP.)

Fig. 12. Audio effect SoC evaluation and FPGA boards.

V. Conclusion

In this paper, we presented the implementation of a digital audio SoC including an embedded DSP design and the efficient realization of a number of audio effect algorithms using the primitive functions of the eDSP compiler. Besides the implemented basic effects in the SoC, we will add variations and other effects into the SoC to expand its application areas in the future. The quality of the audio effect SoC will be finally evaluated by musicians, because audio effects are usually employed by such users. The implemented SoC is suitable for application in various audio applications, such as audio mixers, guitar-effectors, karaoke systems, and mobile audio devices.

References

[1] U. Zölzer, Ed., DAFX: Digital Audio Effects, New York: John Wiley & Sons, 2002.
[2] Z. Smekal, J. Schimmel, and P. Krkavec, "Optimizing Digital Musical Effect Implementation for Harvard DSP Architecture," Int. Conf. Digital Audio Effects (DAFx-01), Hamburg, Germany, Dec. 2001, pp. 33-38.
[3] T. Choi, Y. Park, and D. Youn, "Design of Time-Varying Reverberators for Low Memory Applications," IEICE Trans. Inf. & Syst., vol. E91-D, no. 2, Feb. 2008, pp. 379-382.
[4] F.P. Ling, F.K. Khuen, and D. Radhakrishnan, "An Audio Processor Card for Special Sound Effects," IEEE Midwest Symp. Circuits and Systems, vol. 2, Aug. 2000, pp. 730-733.
[5] J. Dattorro, "Effect Design: Part 1: Reverberator and Other Filters," J. Audio Eng. Soc., vol. 45, no. 9, 1997, pp. 660-684.
[6] N. Juillerat, S. Schubiger-Banz, and S.M. Arisona, "Low Latency Audio Pitch Shifting in the Time Domain," ICALIP, 2008, pp. 29-35.
[7] M. Pfaff et al., "Implementing Digital Audio Effects Using a Hardware/Software Co-design Approach," Int. Conf. Digital Audio Effects (DAFx-07), Bordeaux, France, Sept. 2007, pp. 125-132.
[8] M. Micea et al., "Implementing Professional Audio Effects with DSPs," Trans. Automatic Control and Computer Science, vol. 46, no. 60, 2001, pp. 55-61.

[9] G.H. Jeong, Y.G. Ahn, and I.S. Lee, "Complexity Reduction
Method for BSAC Decoder," ETRI Journal, vol. 31, no. 3, June
2009, pp. 336-338.
[10] D.H. Yeo and H.C. Shin, "High Throughput Parallel Decoding
Method for H.264/AVC CAVLC," ETRI Journal, vol. 31, no. 5,
Oct. 2009, pp. 510-517.
[11] Application SoC Development Team, Embedded DSP Manual,
Electronics and Telecommunications Research Institute (ETRI),
May 2008.
[12] Texas Instruments, TMS320C54x Optimizing C/C++ Compiler
User's Guide, Oct. 2002.

Kyungjin Byun received the BS degree in electronics engineering
from Kookmin University, Seoul, Korea, in 1987, the MS and
PhD degrees in electronics engineering from Information and
Communications University (ICU), Daejeon, Korea, in 2000 and 2006,
respectively. He was a visiting scholar at Purdue University,
W. Lafayette, USA in 2007. Since 1987, he has been with
Electronics and Telecommunications Research Institute (ETRI) as a
principal member of research engineering staff, and has participated in
various projects including the development of digital communication
systems, an embedded DSP core, and speech codecs. His current
research interests include speech and audio signal processing, as well
as digital signal processors and their applications.

Young-Su Kwon received the BS, MS, and PhD degrees from Korea
Advanced Institute of Science and Technology (KAIST), Republic of
Korea, in 1997, 1999, and 2004, respectively. He was with
Microsystems Technology Laboratory, Massachusetts Institute of
Technology as a postdoctoral associate from 2004 to 2005. He has
been with the System-on-Chip Research Department, Electronics and
Telecommunications Research Institute, Republic of Korea, since 2005.
He has authored 30 internal journal and conference papers with
special interest in multi-core processor design, VLSI CAD, and
algorithmic optimizations of circuits and systems.

Seongmo Park received the BS, MS, and PhD degrees in electronics
engineering from Kyungpook National University, Daegu, Korea,
in 1985, 1987, and 2006, respectively. From 1987 to 1992, he was
with the LG semiconductor company, Gumi, Korea, where he worked
on ASIC design. In 1992, he joined Electronics and
Telecommunications Research Institute (ETRI), Daejeon, Korea,
where he has worked on the development of SoC design. He is
currently engaged in research on SoC design, image compression
algorithms, and SoC architecture design. He is now a team leader of
the Multimedia Processor Design Team. He is also a professor of the
University of Science and Technology. He has published over 30
technical papers in international journals and conference
proceedings. His main research interests are video coding, image
compression, multi-core processor design, DSP design, and
low-power SoC architecture design.

Nak-Woong Eum received the BS degree in electronic engineering
from Kyungpook National University, Daegu, Korea, in 1984, and
the MS and PhD degrees in electrical engineering from KAIST,
Korea, in 1987 and 2001, respectively. Since 1987, he has been
working with ETRI, Daejeon, Korea, as a principal member of
engineering staff, where he is currently the director of the
System-on-Chip Research Department. His current research
interests include embedded processor technology, mobile
multimedia systems, and system-on-chip design methodology.
