
Introduction

SYSC5603 (ELG6163) Digital Signal Processing


Microprocessors, Software and Applications

Miodrag Bolic

1
Objective
• FFT Introduction
• Some FFT algorithms
• FFT on PDSP
• FFT floating to fixed-point conversion
• Hardware implementation of FFT

2
FFT for TMS320x67 with 2 buffers

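[Figure: ping-pong buffering. The EDMA moves samples from the serial port (source address) into Buffer (ping) at destination address 1 or Buffer (pong) at destination address 2, while FFT processing runs on the other buffer. The EDMA event is internal timer 1. The destination address switches at the completion of a count transfer.]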
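The double-buffering scheme in the figure can be sketched in software as follows. This is a minimal host-side simulation, not TMS320C67x code: the names `PingPong`, `dma_write`, and `run` are hypothetical, a plain loop stands in for the EDMA, and `sum` stands in for the FFT.

```python
class PingPong:
    """Two buffers: one is filled by the 'DMA' while the other is processed."""

    def __init__(self, count):
        self.count = count          # samples per transfer
        self.buffers = [[], []]     # [ping, pong]
        self.fill = 0               # index of the buffer currently being filled

    def dma_write(self, sample):
        """Store one sample; return the full buffer when the count completes."""
        buf = self.buffers[self.fill]
        buf.append(sample)
        if len(buf) < self.count:
            return None
        # Count complete: switch the destination address to the other buffer.
        self.fill ^= 1
        self.buffers[self.fill] = []
        return buf


def run(samples, count, process):
    """Feed samples through the ping-pong buffers, processing each full block."""
    pp = PingPong(count)
    results = []
    for s in samples:
        full = pp.dma_write(s)
        if full is not None:
            results.append(process(full))   # stands in for the FFT
    return results


blocks = run(range(8), count=4, process=sum)
print(blocks)
```

On real hardware the transfer and the processing overlap in time; the sequential loop here only illustrates the address switching.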

3
FFT Fixed point - Xilinx
• Performing the calculations with no scaling and carrying all integer bit growth to the end of the computation
– The growth in fractional bits created by each multiplication is truncated after the multiplication.
– The width of the output will be (input width + log2(N) + 1), where log2(N) is the number of radix-2 stages.
– For example, a 1024-point transform with a 16-bit input consisting of 1 integer bit and 15 fractional bits will have a 27-bit output with 12 integer bits and 15 fractional bits.
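The width rule above can be checked with a few lines of arithmetic. This is a sketch; the function name `unscaled_output_format` is hypothetical and the transform length is assumed to be a power of two.

```python
def unscaled_output_format(n_points, input_width, frac_bits):
    """Return (output width, integer bits, fractional bits) with no scaling."""
    stages = n_points.bit_length() - 1          # log2(N) for a power-of-two N
    output_width = input_width + stages + 1
    # Fractional bits are truncated after each multiply, so they do not grow.
    int_bits = output_width - frac_bits
    return output_width, int_bits, frac_bits


fmt = unscaled_output_format(1024, input_width=16, frac_bits=15)
print(fmt)
```

For the 1024-point example this reproduces the 27-bit output with 12 integer and 15 fractional bits.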

• Scaling at each stage using a fixed-scaling schedule


• Scaling automatically using block-floating point

4
[Xilinx05]
Block-floating point
– The computation is fixed-point
– After every addition there is an overflow test
– If an overflow is detected, the whole array is divided by 2
– The number of divisions is counted to determine the scale factor
– SNR depends on how many overflows occur
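The overflow test and shared scale factor can be sketched as below. This is an illustration under assumptions: the function name `bfp_rescale` is hypothetical, samples are plain Python integers, and the test is applied to a whole block after an operation rather than after each individual addition.

```python
def bfp_rescale(block, word_bits=16):
    """Halve the whole block until every value fits in a signed word.

    Returns the rescaled block and the number of divisions by 2,
    which is the block exponent (scale factor = 2 ** shifts).
    """
    limit = 1 << (word_bits - 1)            # signed range is [-limit, limit - 1]
    shifts = 0
    while any(v >= limit or v < -limit for v in block):
        block = [v >> 1 for v in block]     # divide the entire array by 2
        shifts += 1
    return block, shifts


scaled, exponent = bfp_rescale([40000, -100, 8], word_bits=16)
print(scaled, exponent)
```

Each division discards one low-order bit of every sample, which is why the SNR depends on how many overflows occur.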

5
Butterfly computation for Decimation in Time
• Linear noise model

6
[Oppenheim98]
7
[Oppenheim98]
Butterfly with Scaling multipliers

8
[Oppenheim98]
Sequential FFT-Xilinx core

9
[Xilinx05]
Pipelined FFT-Xilinx core

10
[Xilinx05]
Pipelined FFT architecture
• Radix-2 multipath delay commutator (R2MDC)
• Radix-2 single-path delay feedback (R2SDF)
• Radix-4 multipath delay commutator (R4MDC)
• Radix-4 single-path delay commutator (R4SDC)
• Radix-4 single-path delay feedback (R4SDF)
• Radix-2² single-path delay commutator (R2²SDC)

11
[Li03]
Radix-2 multipath delay commutator
• The total number of delay elements is 4 + 2 + 2 + 1 + 1 =
10 for the 8-point FFT.
• The utilization of the butterfly and the multiplier is 50%
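The 8-point count can be generalized with a short sketch. It assumes the same pattern continues for larger power-of-two lengths: one delay of N/2 at the input, then two delays of each length N/4, N/8, ..., 1 around the commutators. The function name `r2mdc_delay_elements` is hypothetical.

```python
def r2mdc_delay_elements(n_points):
    """Total delay elements in an R2MDC pipeline (power-of-two length assumed)."""
    total = n_points // 2          # input delay that splits the stream in two
    d = n_points // 4
    while d >= 1:
        total += 2 * d             # two delays of length d per commutator
        d //= 2
    return total


print(r2mdc_delay_elements(8))
```

Under that assumption the total is 3N/2 − 2, which gives 10 for N = 8.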

12
[Li03]
Radix-2 single-path delay feedback

• The total number of delay elements is N/2 + N/4 + ... + 1 = N – 1
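The total follows from a geometric series, assuming N is a power of two so that there are log2(N) stages and stage k has a feedback delay of N/2^k:

```latex
\sum_{k=1}^{\log_2 N} \frac{N}{2^{k}}
  = N\left(1 - \frac{1}{2^{\log_2 N}}\right)
  = N\left(1 - \frac{1}{N}\right)
  = N - 1
```

For N = 8 this is 4 + 2 + 1 = 7.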

13
[Li03]
FFT processor
• Datapath
– memories,
– butterflies and
– complex multipliers.
• Control unit

14
[Li03]
Requirements

• Requirement
– Transform length is 1024
– Transform time is less than 40 µs (continuously)
– Continuous I/O
– 25.6 Msamples/sec. throughput
– Complex 24 bits I/O data

• Steps in designing
– Architecture selection
– Partitioning
– Scheduling
– Word length selection
– RTL model generation
– Validation of models

15
[Li03]
Resource analysis
• Computation time for the 1024-point FFT: 1024 / (25.6 Msamples/sec) = 40 µs

• The number of butterfly operations for Radix-2: (N/2) log2(N) = 512 × 10 = 5120

• Assume 1 clock cycle per Butterfly


• The minimum number of Butterflies is

• This is optimal only under the assumption that ALL data are available to ALL stages, which is impossible for continuous data streams. Each butterfly has to be idle 50% of the time in order to reorder the incoming data.
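The arithmetic behind this analysis can be sketched as follows. The requirements give N = 1024 and 25.6 Msamples/sec; the clock frequency is not stated on the slide, so the sketch ASSUMES one clock cycle per input sample (25.6 MHz). The function name `min_butterflies` is hypothetical.

```python
def min_butterflies(n_points, sample_rate_hz, clock_hz, utilization_percent=100):
    """Return (butterfly operations, butterflies needed) for a radix-2 FFT.

    Assumes 1 clock cycle per butterfly and a power-of-two transform length.
    """
    stages = n_points.bit_length() - 1
    butterfly_ops = (n_points // 2) * stages             # (N/2) * log2(N)
    # Usable clock cycles per transform, kept as an exact fraction num/den.
    num = n_points * clock_hz * utilization_percent
    den = sample_rate_hz * 100
    needed = -(-butterfly_ops * den // num)              # integer ceiling
    return butterfly_ops, needed


N = 1024
RATE = 25_600_000        # samples per second, from the requirements
CLOCK = 25_600_000       # ASSUMPTION: clock frequency equals the sample rate

transform_time_us = N * 1_000_000 / RATE
ops, ideal = min_butterflies(N, RATE, CLOCK)
_, half_idle = min_butterflies(N, RATE, CLOCK, utilization_percent=50)
print(transform_time_us, ops, ideal, half_idle)
```

Under that clock assumption the ideal count is 5 butterflies, and 50% utilization doubles it to 10, one butterfly per radix-2 stage.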

16
[Li03]
Resource analysis
• The solution: the number of butterflies is 10
• The number of complex multipliers is 9
• Memory length for Radix-2 single-path delay feedback is
N-1
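These counts are consistent with one butterfly per radix-2 stage and one complex multiplier between each pair of adjacent stages. The sketch below tabulates them under that assumption; the function name `r2sdf_resources` is hypothetical.

```python
def r2sdf_resources(n_points):
    """Resource totals for a radix-2 SDF pipeline (power-of-two length assumed)."""
    stages = n_points.bit_length() - 1       # log2(N)
    return {
        "butterflies": stages,               # one per stage
        "complex_multipliers": stages - 1,   # one between adjacent stages
        "memory_words": n_points - 1,        # N/2 + N/4 + ... + 1
    }


resources = r2sdf_resources(1024)
print(resources)
```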

17
[Li03]
RAM Based Commutator
• A dual-port memory is required since a read and a write must both be performed in the same clock cycle.
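Why both ports are needed can be seen in a small behavioral model of a RAM used as a delay line. This is a sketch, not the design from [Li03]: the class name `RamDelayLine` is hypothetical and a Python list stands in for the memory.

```python
class RamDelayLine:
    """Delay line of fixed depth built on a circular RAM buffer."""

    def __init__(self, depth):
        self.ram = [0] * depth
        self.addr = 0

    def clock(self, sample):
        """One clock cycle: read the oldest sample and write the newest."""
        out = self.ram[self.addr]        # read port
        self.ram[self.addr] = sample     # write port, same cycle and address
        self.addr = (self.addr + 1) % len(self.ram)
        return out


line = RamDelayLine(depth=4)
outputs = [line.clock(x) for x in range(1, 9)]
print(outputs)
```

Every call to `clock` performs one read and one write, so a single-port memory would need two cycles per sample.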

18
[Li03]
Complex multiplier

19
[Li03]
Radix - 4

20
Radix 4

21
Altera radix-4 butterfly

22
[Oppenheim98]
References
[Altera05] Altera, FFT MegaCore Function User Guide, DSP Literature, 2005.
[Li03] W. Li, Studies on implementation of low power FFT processors, Thesis, Linköpings University, 2003.
[Oppenheim98] A. V. Oppenheim, R. W. Schafer, Discrete-time signal processing, 2nd edition, Prentice Hall, 1998.
[Xilinx05] Xilinx, “Fast Fourier Transform v3.2”, DS260, August 31, 2005.

23
