
CONFIDENTIAL

Fast Fourier Transform v2.1


Introduction
The Fast Fourier Transform (FFT) IP core computes the Discrete Fourier Transform (DFT) of a real or complex input data set using the Cooley-Tukey algorithm. It is designed specifically for the eASIC Nextreme-2 and Nextreme devices, with a focus on the throughput required for radar, imaging, OFDM modulation/demodulation, and other applications requiring an FFT.
Implementation Summary

Families Supported | Design File Formats | Certification | Implementation Details | Support
Nextreme-2, Nextreme | Verilog | Level 1* | See Performance and Resource Section | eASIC

* Note: Level 1 denotes that the core has been implemented in actual silicon.

Features
- Device support for Nextreme-2 and Nextreme devices
- FFT point sizes from 64 to 16K points in powers of 2 (e.g. 256, 512, 1024)*
- Fixed-point, bit-accurate C model available for system modeling
- Support for both FFT and iFFT, run-time configurable
- Optional run-time configurable transform length
- Support for unscaled and scaled versions
- Input data bit width: 10 to 24 bits, twos complement
- Phase factor bit width: 10 to 24 bits, twos complement
- Rounding: convergent rounding
- Scaling: unscaled, fixed, and block floating point
- Bit-reversed or natural order input
- Complete Verilog RTL code
- Testbench for simulation
- Architectures available:
  1) Loop engine (Radix-2 and Radix-4)
  2) High Performance Pipelined Streaming FFT

* The Radix-4 loop engine only supports point sizes N that are powers of 4.

Release Information
This release of the eASIC FFT IP core contains the following files and documents:
1) RTL files in Verilog
2) Bit-accurate C model
3) Testbench for RTL simulation, with test vectors covering the FFT features
4) Documentation
5) Scripts for running the testbench

Rev: FFT_v2_1_ds001

www.eASIC.com


Performance and Resource Utilization


The following tables list the maximum clock performance, the corresponding transform time, and the resource usage for a selected set of parameters. The eCell usage changes significantly with the timing constraints used in the target design: the tighter the constraints, the larger the eCell count. The latency is measured from the assertion of the START input to the done signal coming out of the core for that frame. The tables below cover the following device families:
- Nextreme
- Nextreme-2

For the determination of maximum frequency, the core was generated with double registers on each input and output. The registers directly connected to the core run on the core clock, whereas the outer registers run off a separate clock. This ensures that all paths in the core are included in the timing constraint without artificially distorting the design to fit the chip. The device voltage library used for the implementation is specified at the top of each table.

Nextreme-2
Note: All implementations use 16-bit data and phase factors and the 1.1 V library.

Table 1: Performance and Resource Usage for Nextreme-2

FFT Architecture | Point Size | Output Order | Run Time N | Scaling | Performance (MHz) | Latency (Cycles) | Latency (µs) | Reg files | eCells | eDFF | bRAM
R-2 Loop Engine | 1024 | Natural | Yes | Fixed | 312.01 | 6321 | 20.26 | 0 | 3,164 | 1,932 | 3
R-2 Loop Engine | 2048 | Natural | Yes | Fixed | 312.01 | 13506 | 43.29 | 0 | 3,179 | 1,981 | 3
R-4 Loop Engine | 1024 | Natural | Yes | Fixed | 315.46 | 2421 | 7.67 | 0 | 7,268 | 4,080 | 7
R-4 Loop Engine | 4096 | Natural | Yes | Fixed | 315.46 | 10379 | 32.90 | 0 | 7,269 | 4,187 | 7
R-2 Loop Engine | 1024 | Reverse | No | Fixed | 312.01 | 6321 | 20.26 | 0 | 2,973 | 1,932 | 3
R-4 Loop Engine | 16,384 | Natural | No | Fixed | 315.46 | 45214 | 143.33 | 0 | 7,952 | 4,295 | 28
Pipelined FFT | 1024 | Natural | Yes | Fixed | 312.7 | 2103 | 6.73 | 7 | 12,483 | 11,208 | 11
Pipelined FFT | 2048 | Natural | Yes | Fixed | 312.7 | 4162 | 13.31 | 7 | 12,833 | 12,454 | 12
Pipelined FFT | 1024 | Natural | Yes | Unscaled | 307.7 | 2103 | 6.83 | 7 | 13,611 | 11,694 | 15
Pipelined FFT | 16384 | Natural | No | Unscaled | 303.85 | 32848 | 108.11 | 7 | 21,120 | 14,427 | 75
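The two latency columns of Table 1 are related by latency = cycles / clock frequency; with the clock in MHz the result is in microseconds. The short Python sketch below reproduces a few table rows under that assumption. The function name latency_us is illustrative and is not part of the core deliverables.

```python
def latency_us(cycles: int, fmax_mhz: float) -> float:
    """Transform latency in microseconds for a cycle count at a clock in MHz."""
    return cycles / fmax_mhz


# Rows taken from Table 1: (architecture, point size, latency cycles, MHz)
rows = [
    ("R-2 Loop Engine", 1024, 6321, 312.01),
    ("R-4 Loop Engine", 1024, 2421, 315.46),
    ("Pipelined FFT", 1024, 2103, 312.7),
]

for arch, n_point, cycles, mhz in rows:
    print(f"{arch:16s} N={n_point:5d}  {latency_us(cycles, mhz):7.2f} us")
```

The same relation can be used to estimate the transform time at a lower, design-specific clock rate once the cycle count is known.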


Nextreme
Note: All implementations use 16-bit data and phase factors and the 1.2 V library.

Table 2: Performance and Resource Usage for Nextreme

FFT Architecture | Point Size | Output Order | Run Time N | Scaled / Unscaled | Performance (MHz) | Latency (Cycles) | Latency (µs) | eCells | bRAM
R-2 Loop Engine | 512 | Natural | Yes | Fixed | 210 | 3,479 | 19.51 | 3450 | 5
R-2 Loop Engine | 1024 | Natural | Yes | Fixed | 206 | 7,335 | 40.52 | 3509 | 5
R-2 Loop Engine | 2048 | Natural | Yes | Fixed | 196 | 15,543 | 87.07 | 3520 | 5
R-2 Loop Engine | 4096 | Natural | Yes | Fixed | 202 | 32,967 | 184.68 | 3591 | 10
R-4 Loop Engine | 1024 | Natural | Yes | Fixed | 199 | 3,440 | 19.43 | 9437 | 11
R-4 Loop Engine | 4096 | Natural | Yes | Fixed | 197 | 14,469 | 82.18 | 9465 | 11
R-2 Loop Engine | 1024 | Reverse | Yes | Fixed | 206 | 7,335 | 40.52 | 3509 | 5
R-2 Loop Engine | 1024 | Reverse | No | Fixed | 203 | 7,335 | 40.63 | 3482 | 5
R-2 Loop Engine | 2048 | Reverse | Yes | Fixed | 196 | 15,543 | 87.07 | 3520 | 5
R-2 Loop Engine | 2048 | Reverse | No | Fixed | 196 | 15,543 | 85.87 | 3550 | 5
R-4 Loop Engine | 1024 | Reverse | Yes | Fixed | 199 | 3,440 | 19.43 | 9437 | 11
R-4 Loop Engine | 1024 | Reverse | No | Fixed | 198 | 7,335 | 40.52 | 9379 | 11
R-2 Loop Engine | 16,384 | Natural | No | Fixed | 198 | 147,687 | 820.48 | 3696 | 40
R-4 Loop Engine | 16,384 | Natural | No | Fixed | 191 | 61,594 | 353.58 | 9644 | 44


General Description
The DFT and its inverse are evaluated as follows.

Forward DFT:

$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi n k / N}$$

where k ranges from 0 to N-1.

Inverse DFT:

$$x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k)\, e^{j 2\pi n k / N}$$

where n ranges from 0 to N-1.

Apart from the 1/N factor, the inverse DFT differs from the forward DFT only in that the phase factor is conjugated. The Fast Fourier Transform is an efficient algorithm for finding the DFT of a given block of input data. It uses a divide-and-conquer approach that reduces the computation to a smaller, repetitive structure called the butterfly. The butterfly can be implemented so that it takes 2 inputs at a time (Radix-2) or 4 inputs at a time (Radix-4). The iFFT is calculated by conjugating the phase factors of the corresponding forward FFT. When using the iFFT, the user is responsible for dividing the result by N.

The eASIC FFT cores have 3 architectures:
1) Loop engine - Radix-2: separate stages for loading, processing, and unloading.
2) Loop engine - Radix-4: separate stages for loading, processing, and unloading.
3) High Performance Pipelined Streaming FFT: continuous output data after an initial latency.
Figure 1 illustrates the throughput and area difference between the three architectures.
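The relationship between the forward and inverse transforms can be illustrated with a direct O(N^2) evaluation of the DFT sums. The Python sketch below is a floating-point reference model only, not the core's fixed-point algorithm; the function name dft and its inverse flag are illustrative assumptions. The inverse path only conjugates the phase factors and leaves the division by N to the caller, mirroring the iFFT behaviour described above.

```python
import cmath


def dft(x, inverse=False):
    """Direct O(N^2) DFT of a list of complex samples.

    With inverse=True the phase factors are conjugated and the 1/N
    division is deliberately left to the caller.
    """
    n_pts = len(x)
    sign = 1 if inverse else -1
    return [
        sum(x[n] * cmath.exp(sign * 2j * cmath.pi * n * k / n_pts)
            for n in range(n_pts))
        for k in range(n_pts)
    ]


x = [1 + 0j, 2 - 1j, 0 + 3j, -1 + 0.5j]
X = dft(x)

# The caller divides by N to recover the original samples.
x_back = [v / len(x) for v in dft(X, inverse=True)]
```

A reference of this kind is convenient for spot-checking small frames; the bit-accurate C model shipped with the core remains the reference for fixed-point results.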


Figure 1: Resource Utilization vs Throughput of the FFT architectures for Nextreme

Run Time Configurable Point Size


This FFT core can change the point size on a frame-by-frame basis. When this option is selected, an input port is provided to set the desired point size. Selecting this option increases the core size slightly. This capability is often required in wireless communication applications such as OFDM systems (WiMAX, LTE), where the point size routinely changes over short time intervals.

Natural or Bit-Reversed Order Input / Output


This architecture provides the option of natural or bit-reversed order for the output data. In natural order the data points are output in the same order as the input data points, i.e., 0, 1, 2, 3, and so on. The bit-reversed order is simple to calculate: take the index of the data point, written in binary, and reverse the order of the bits. Hence, 0000, 0001, 0010, 0011, 0100,... (0, 1, 2, 3, 4,...) becomes 0000, 1000, 0100, 1100, 0010,... (0, 8, 4, 12, 2,...). This parameter is not run-time configurable.
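The bit-reversed index calculation described above can be sketched in a few lines of Python. The helper name bit_reverse is illustrative; width is the number of index bits, i.e. log2 of the point size.

```python
def bit_reverse(index: int, width: int) -> int:
    """Reverse the lowest `width` bits of `index`."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (index & 1)  # move the current LSB to the result
        index >>= 1
    return result


# 16-point example (4 index bits), matching the sequence given in the text.
order = [bit_reverse(i, 4) for i in range(16)]
print(order[:5])  # [0, 8, 4, 12, 2]
```

Applying the function twice returns the original index, so the same routine converts in either direction.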

Scaling
The FFT processes a frame of data by making successive passes over the input data frame. In each stage the data is subjected to additions, which increase the data width from stage to stage. For R22SDF (Streaming) and Radix-4 (Loop engine), each stage contains 2 sets of additions, so the bit growth per stage is 2 bits for each of the real and imaginary parts. The numbers generated by the computation are therefore potentially larger than the numbers read from memory, and a strategy must be employed to accommodate this dynamic range expansion. Bit growth is handled by a fixed scaling schedule for both the streaming and loop engine architectures, and by block floating point scaling for the loop engine only.

1. Scaling at each stage using a fixed scaling schedule
With fixed scaling, a scaling schedule scales the data by a factor of 1, 2, 4, or 8 in each stage. If the scaling is insufficient, a butterfly output may grow beyond the dynamic range and cause an overflow. Because of the scaling applied in the FFT implementation, the transform computed is a scaled transform.
For the pipelined, streaming I/O architecture, the scaling schedule is specified with two bits for every R22SDF stage, starting at the two MSBs. For example, a scaling schedule for N=256 could be [1 2 3 2], so the scaling value for each stage is Stage1 1; Stage2 2; Stage3 3; Stage4 2.
The loop engine is available in Radix-2 and Radix-4 architectures; Radix-2 produces a 1-bit increase in each stage and Radix-4 produces a 2-bit increase in each stage. In the loop engine examples, the rightmost entry of the schedule corresponds to stage 1. For N=64 Radix-2 the scaling schedule could be given as [1 2 2 1 1 1], so the scaling values are stage1 1; stage2 1; stage3 1; stage4 2; stage5 2; stage6 1. For N=64 Radix-4 the scaling schedule could be given as [3 2 1], so the scaling values are stage1 1; stage2 2; stage3 3.

2. Block Floating Point
This mode is available only for the loop engine. The core itself determines the scaling factor required in each stage. The scaling value is found by a replicated Radix-2 or Radix-4 computation placed just before the data is stored into RAM. If an overflow is detected, the scaling value is calculated and passed on to the next stage, where the scaling takes place.
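The two schedule orderings can be made concrete by packing the per-stage values into a single word. The Python sketch below assumes that scale_sch_i is simply the concatenation of the 2-bit stage values, with stage 1 in the two MSBs for the streaming core and in the two LSBs for the loop engine; the function name pack_scale_schedule is hypothetical and the exact bit layout should be confirmed against the RTL and testbench.

```python
def pack_scale_schedule(stage_values, first_stage_at_msb):
    """Pack 2-bit per-stage scaling values into one integer word.

    stage_values[0] is the value for stage 1.
    first_stage_at_msb=True  -> stage 1 occupies the two MSBs (streaming).
    first_stage_at_msb=False -> stage 1 occupies the two LSBs (loop engine).
    """
    if any(not 0 <= v <= 3 for v in stage_values):
        raise ValueError("each stage value must fit in two bits")
    ordered = stage_values if first_stage_at_msb else list(reversed(stage_values))
    word = 0
    for v in ordered:  # the first element ends up in the most significant bits
        word = (word << 2) | v
    return word


# Streaming, N=256: Stage1 1; Stage2 2; Stage3 3; Stage4 2
streaming = pack_scale_schedule([1, 2, 3, 2], first_stage_at_msb=True)
# Radix-4 loop engine, N=64: stage1 1; stage2 2; stage3 3
loop_r4 = pack_scale_schedule([1, 2, 3], first_stage_at_msb=False)
print(f"{streaming:08b} {loop_r4:06b}")
```

Under these assumptions both words read, from MSB to LSB, exactly as the bracketed schedules are written in the text.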


Loop Engine Architectures


An input frame is computed in a loop engine structure, and the computation takes place in 3 stages:
1) Loading stage (takes input data into the core)
2) Computation stage (computes the DFT)
3) Unloading stage (outputs the data after computation)
The user can select between 2 options for this FFT core:
1) Radix-2 Loop Engine
2) Radix-4 Loop Engine
Figure 1 shows the conceptual block diagram of the loop engine FFT structure (the Radix-2 Loop Engine core is taken for illustration).

Figure 1 Conceptual block diagram of the FFT core (blocks: Address Control Generation, Data Memory, Radix Computation, Data Reorder Block, Twiddle factor Memory; signals: Start, Done, Ctrl_sigs, W_addr, R_addr, Tw_addr, Data valid, Input data, Output data)

Address Generation Control

This block is the main control block for the entire FFT operation and implements the following key functions:
a) Controls the entire core
b) Generates the read and write addresses for the data memory
c) Generates the read address for fetching the phase factors
d) Indicates when the loading, computation, and unloading stages occur
e) Generates the data valid and done signals

Data Memory
This block implements the following key functions:
a) Stores the input data given by the user
b) Stores the intermediate data after the radix computation
c) Takes complex data as input
d) Uses block memory with a registered output

Radix Computation
This block implements the following key functions:
a) Contains the basic butterfly structure (Radix-2 or Radix-4)
b) Accepts complex data as input and gives out complex data

Data Reorder Block

This block implements the following key functions:
a) Transposes the result of the butterfly structure before it is written back to memory. This is required because the computation is done in place.
b) In the loading and unloading stages, routes the input data from the input pins to memory and the output data from memory to the output pins.

Phase factor memory

a) This block stores the phase factors for the given N-point value.
b) At start-up the core is in the initialization stage, in which the phase factors are computed and stored in the block RAM.
c) The user should not supply data during the initialization stage. Data should be fed only after the config_o port signal goes low.

Block Floating Point

a) This block consists of an overflow detection block and a scaling block.
b) The overflow detection block detects overflow and passes the scaling value to the scaling block in the next stage.
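The overflow detection and scaling steps can be sketched in software as follows. This is a simplified behavioural model under stated assumptions, not the core's implementation: it works on plain integers (real parts only), applies the shift to the same data rather than in the next stage, and truncates instead of using convergent rounding. The names required_shift and bfp_stage are hypothetical.

```python
def required_shift(values, width):
    """Smallest right shift that makes every value fit in `width`-bit
    twos complement."""
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    shift = 0
    while any(not lo <= (v >> shift) <= hi for v in values):
        shift += 1
    return shift


def bfp_stage(values, width, blk_exp):
    """Scale one stage's results and update the running block exponent."""
    shift = required_shift(values, width)
    # Python's >> on negative integers is an arithmetic (sign-preserving) shift.
    return [v >> shift for v in values], blk_exp + shift


# Stage results that no longer fit in 10 bits (range -512..511).
stage_out = [120, -300, 511, -512, 700]
scaled, exp = bfp_stage(stage_out, width=10, blk_exp=0)
print(scaled, exp)
```

The accumulated exponent plays the role of the value reported on blk_exp_o: the consumer multiplies the output frame by 2^exp to recover the unscaled magnitude.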
Radix-2 Loop Engine

This architecture uses a Radix-2 butterfly structure for the FFT computation. It is the smallest-area implementation of the available FFT architectures. Figure 2 shows the Radix-2 computation block.

Figure 2 Conceptual block diagram of Radix-2 Loop Engine (blocks: RAM0, RAM1, Twiddle ROM, Twos Complement, Radix-2 Engine, Scaling, Radix-2 Data Reorder Block; signals: input_i, ip_sel_i, en_ram0_i, en_ram1_i, wr_addr_i, rd_addr_i, fd_inv_i, sel_0_i, sel_1_i, scale_value_i, op_sel_i, output_o)


Radix-4 Loop Engine


This architecture uses Radix-4 Butterfly structure for the computation of the FFT. This architecture is faster compared to the radix-2 loop engine, as 4 complex inputs are processed every clock cycle. However, the faster throughput requires more resources. A block diagram of radix-4 computation is shown in Figure 3. This core supports scaled fixed point arithmetic.
Figure 3 Conceptual block diagram of the Radix-4 Engine (blocks: RAM0, RAM1, RAM2, RAM3, Twiddle ROM with 3 factors, Twos Complement, Radix-4 Engine with outputs R4(0) to R4(3), Scaling, Radix-4 Data Reorder Block with outputs R4_reord(0) to R4_reord(3), Output Mux; signals: input_i, ip_sel_i, en_ram0_i, en_ram1_i, en_ram2_i, en_ram3_i, wr_addr_i, rd_addr_i, fd_inv_i, scale_value_i, sel_0_o, sel_1_o, op_sel_i(1:0), output)
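For reference, the arithmetic of a single Radix-4 butterfly (a 4-point DFT) can be sketched as below. This is a floating-point illustration of the textbook butterfly, not a description of the core's internal datapath, and it omits the twiddle multiplications and scaling shown in Figure 3. The function name radix4_butterfly is illustrative.

```python
import cmath


def radix4_butterfly(a, b, c, d):
    """4-point DFT of (a, b, c, d) using only additions and a -j rotation."""
    t0, t1 = a + c, a - c
    t2, t3 = b + d, b - d
    # Multiplying by -j swaps real and imaginary parts: (x + jy)(-j) = y - jx.
    mj_t3 = complex(t3.imag, -t3.real)
    return (t0 + t2, t1 + mj_t3, t0 - t2, t1 - mj_t3)


ins = (1 + 2j, -3 + 0.5j, 0.25 - 1j, 2 + 2j)
out = radix4_butterfly(*ins)

# Cross-check against the DFT definition for N = 4.
ref = [sum(ins[n] * cmath.exp(-2j * cmath.pi * n * k / 4) for n in range(4))
       for k in range(4)]
```

The butterfly itself needs no general multiplier, which is why the Radix-4 engine spends its multipliers only on the 3 twiddle factors per butterfly.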


Pipelined Streaming FFT Architecture


The advantage of Radix-4 over Radix-2 is that the computation time for a given frame is shorter. The disadvantage of Radix-4 is that it is more complex and larger than Radix-2. If the performance requirement is even higher than that of the Radix-4 loop engine, a high performance pipelined FFT architecture is available that supports input rates equal to the clock rate. This architecture uses an algorithm called Radix-2^2 Single Delay Feedback (R22SDF), which is specifically used for streaming I/O FFTs. In the R22SDF algorithm the complexity of Radix-4 is brought down to that of Radix-2, and each stage has its own R22SDF block for that stage's computation, so that frames can be fed continuously. The computation of an input data frame takes place in the following 3 steps:
1) Loading stage (takes input data into the core)
2) Computation stage (computes the DFT)
3) Unloading stage (outputs the data continuously after computation)
These 3 steps take place simultaneously.

R22SDF Main Structure

The diagram below shows the structure of the R22SDF architecture for FFT computation (for N=16 points). This is only a conceptual block diagram.

Figure 4: Conceptual block diagram of R22SDF (N=16 points) (feedback delay depths 8, 4, 2, 1; first stage BF1 and BF2 with control signals s1 and t1, followed by twiddle multiplication W1(n); second stage BF1 and BF2 with control signals s2 and t2)

Each stage of computation has two structures, BF1 and BF2. The diagram below shows the structure of butterfly-1 and butterfly-2. The only difference between BF1 and BF2 is that BF2 has an additional multiplexer at its input, which multiplies the input by (-j) and selects the corresponding input for the butterfly computation.

Figure 5: Internal structure of butterfly structures 1 and 2 (BF1 and BF2)

The number of multipliers required is the same as for the Radix-4 computation stages, and the number of adders required is the same as for the Radix-2 computation block. This structure is simple compared to the Radix-4 structure. The number of R22SDF stages is log4(N_point). Because the number of twiddle multipliers is log4(N_point), the resource usage is lower than for Radix-2, where the number of multipliers is log2(N_point); the memory required to store the twiddle factors is also smaller.

FFT R22SDF Overall Structure

Figure 6: Overall block diagram of R22SDF architecture (data path: input, Stage-1 (BF1-BF2), Stage-2 (BF1-BF2), Last stage (BF1-BF2), Output Shuffling, output; Address control Generation driven by start, producing Sel_line, addr, and Ctrl_sig; twiddle memories Twi 1 and Twi 2)

Figure 6 shows the overall block diagram of the R22SDF architecture. The main components are:
1) Address control generation
2) Stage computation block
3) Data memory
4) Twiddle memory
5) Output shuffling

Address control generation


This is the main control block, which controls the whole FFT computation. Its key functions are:
a) Controls the entire core
b) Generates the write enable and read enable for the data FIFO
c) Generates the select line for the computation block
d) Generates the twiddle address for each stage
e) Generates the data valid and done signals
f) Controls the output shuffling block

Stage computation block

This block has the following functions:
a) Contains the R22SDF structure
b) Takes complex input data and gives the R22SDF-computed complex data as output
c) Contains the BF1 and BF2 structures explained above
d) Contains a complex multiplier

Data memory
This block has the following key functions:
a) Stores the input data and intermediate data
b) Provides the shift registers this architecture requires, implemented using FIFOs
c) Takes complex data as input
d) Uses block RAM or distributed RAM, depending on the threshold and depth

Twiddle memory
Following are the key functions:
a) Stores the twiddle factors generated through Euler's formula
b) At start-up the core is in the initialization stage, in which the phase factors are computed and stored in the block RAM
c) The user should not supply data during the initialization stage. Data should be fed only after the config_o port signal goes low
d) Uses block RAM or distributed RAM, depending on the threshold and depth

Output Shuffling

Following are the key functions:
a) Converts the bit-reversed order output data to natural order output data
b) Includes an address generation block and a dual-port memory
c) Is active only when the user selects natural order output data


FFT Core Symbol


Figure 7 shows the core symbol of the FFT core.

Figure 7: FFT core symbol (FFT_v2_1; inputs: clk_i, rst_ni, rom_clk_i, rom_rst_ni, ce_i, start_i, xn_re_i [B], xn_im_i [B], nfft_i, nfft_we_i, fwd_inv_i, fwd_inv_we_i, scale_sch_i, scale_sch_we_i; outputs: xk_re_o [Bxk], xk_im_o [Bxk], xk_index_o, rfd_o, busy_o, dv_o, done_o, ovflo_o, config_o, xn_index_o, blk_exp_o [6])


Port Interface
Table 3: Input Port Descriptions

Port Name | Direction | Width | Description
clk_i | Input | 1 | Rising-edge clock
rst_ni | Input | 1 | Master asynchronous reset (Active High)
rom_clk_i | Input | 1 | Slower clock for the ViaROM block
rom_rst_ni | Input | 1 | Reset signal for the ViaROM block
ce_i | Input | 1 | Clock enable (Active High)
xn_re_i | Input | B | Input data bus: Real component (B = 10 - 24) in twos complement format
xn_im_i | Input | B | Input data bus: Imaginary component (B = 10 - 24) in twos complement format
start_i | Input | 1 | FFT start signal (Active High): START is asserted to begin the data loading and transform calculation (for Burst I/O architectures). For Streaming I/O, START begins data loading, which proceeds directly to transform calculation and then data unloading.
nfft_i | Input | 5 | Point size of the transform: This port specifies the N value that the user configures the core with. The point size is 2^nfft_i. If this port is zero, the smallest point size supported by the architecture is selected.
nfft_we_i | Input | 1 | Write enable for NFFT (Active High): Write enable for the nfft_i port
fwd_inv_i | Input | 1 | Control signal that indicates whether a forward FFT or an inverse FFT is performed. When FWD_INV=1, a forward transform is computed. If FWD_INV=0, an inverse transform is computed.
fwd_inv_we_i | Input | 1 | Write enable for FWD_INV (Active High).
scale_sch_i | Input | 2 x ceil(number_of_stages/2) for Streaming I/O and Radix-4 Loop engine; 2 x ceil(number_of_stages) for Radix-2 Loop engine | For the Streaming I/O architecture, the scaling schedule is specified with two bits for each R22SDF stage, starting at the two MSBs. Ex: Scaling schedule for N=256 could be [1 2 3 2], so the scaling value for each stage is Stage1 1; Stage2 2; Stage3 3; Stage4 2. For the Loop engine, the scaling schedule is specified with two bits for each stage, starting at the two LSBs. Ex: For N=64 Radix-4 the scaling schedule could be given as [3 2 1], so the scaling values are stage1 1; stage2 2; stage3 3.
scale_sch_we_i | Input | 1 | Write enable for SCALE_SCH (Active High): This port is available only with scaled arithmetic and not with full precision.

**Note: The 1-bit unload_i port is present when the user chooses the loop engine architecture. When this port is high the output is in natural order; when it is low the output is in reverse order.

Output Port Descriptions

Port Name | Direction | Width | Description
xk_re_o | Output | bxk | Output data bus: Real component in twos complement format. (For scaled arithmetic, bxk = B. For unscaled arithmetic, bxk = B + log2(maximum point size) + 1)
xk_im_o | Output | bxk | Output data bus: Imaginary component in twos complement format. (For scaled arithmetic, bxk = B. For unscaled arithmetic, bxk = B + log2(maximum point size) + 1)
xn_index_o | Output | log2(maximum point size) | Index of input data.
xk_index_o | Output | log2(maximum point size) | Index of output data.
rfd_o | Output | 1 | Ready for data (Active High): RFD is High during the load operation.
busy_o | Output | 1 | Core activity indicator (Active High): This signal goes high while the core is computing the transform.
dv_o | Output | 1 | Data valid (Active High): This signal is high when valid data is presented at the output.
done_o | Output | 1 | FFT complete strobe (Active High): DONE transitions High for one clock cycle when the transform calculation has completed.
ovflo_o | Output | 1 | Arithmetic overflow indicator (Active High): ovflo_o is High during result unloading if any value in the data frame overflowed. The ovflo_o signal is reset at the beginning of a new frame of data. This port is optional and only available with scaled arithmetic.
config_o | Output | 1 | Indicates that the core is still in the configuration stage (that is, the core is still evaluating the phase factors). No input shall be fed into the core until this signal goes low.
blk_exp_o** | Output | 6 | This output signal indicates how many bits are scaled in each stage for the given frame.

**Note: blk_exp_o is present only when the user chooses the loop engine architecture.
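The output bus width rule from the port table (bxk = B for scaled arithmetic, bxk = B + log2(maximum point size) + 1 for unscaled arithmetic) and the index width can be computed as in the Python sketch below. The helper names are illustrative and are not part of the core deliverables.

```python
def log2_exact(n: int) -> int:
    """log2 of a power-of-2 point size, using integer arithmetic only."""
    if n < 1 or n & (n - 1):
        raise ValueError("point size must be a power of 2")
    return n.bit_length() - 1


def output_width(b: int, max_point_size: int, scaled: bool) -> int:
    """bxk: width of xk_re_o / xk_im_o for input data width b."""
    return b if scaled else b + log2_exact(max_point_size) + 1


def index_width(max_point_size: int) -> int:
    """Width of xn_index_o / xk_index_o."""
    return log2_exact(max_point_size)


# 16-bit input data, 1024-point unscaled transform.
print(output_width(16, 1024, scaled=False), index_width(1024))
```

For a 16-bit, 1024-point unscaled core this gives 27-bit output buses and 10-bit index buses; with scaling enabled the output buses stay at 16 bits.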


I/O Data Flow Architectures


The I/O data flow depends on the FFT architecture selected. The loop engine architectures use a buffered I/O data flow; however, users can turn this into a buffered streaming flow by adding an external memory buffer, as long as the data rate is slow enough for the FFT to finish processing the current frame before the next frame is complete. For the High Performance Pipelined FFT only a streaming interface is available.

Buffered Data Flow Input Data Flow Waveform


Figure 8 shows the signals to observe when feeding data into a loop engine.
1) config_o should be low before feeding the data (i.e., before the start pulse is asserted).
2) The run-time configuration signals should be asserted before the start pulse.
3) After the start pulse is asserted, the rfd_o signal goes high after one clock pulse.
4) The user should feed data on the next positive clock edge after getting the index.

Figure 8: Input data flow (waveform of nfft_i, nfft_we_i, fwd_inv_i, fwd_inv_we_i, scale_sch_i, scale_sch_we_i, start_i, rfd_o, xn_index_o, and x(n) for indices 0 to n-1)

Buffered Data Flow Output Data Flow Waveform


Figure 9 shows the signals to observe when taking the data out of the FFT.
1) When the busy_o signal is deasserted, done_o goes high for one pulse.
2) After 3 clock pulses, the data valid signal (dv_o) goes high.
3) The index and the data are presented in the same clock pulse.

Figure 9: Output data flow (waveform of busy_o, done_o, dv_o, xk_index_o, and X(k) for indices 0 to n-1)

Buffered Data Flow Top level Timing diagram

Figure 10: Top level timing diagram (waveform of nfft_i, nfft_we_i, fwd_inv_i, fwd_inv_we_i, scale_sch_i, scale_sch_we_i, start_i, rfd_o, xn_index_o, x(n), busy_o, done_o, dv_o, xk_index_o, and X(k))

Streaming I/O Input Data Flow Waveform

Figure 11 shows the signals to observe when feeding data into the FFT through a streaming interface.
1) config_o should be low before feeding the data (i.e., before the start pulse is asserted).
2) The run-time configuration signals should be asserted before the start pulse.
3) After the start pulse is asserted, the rfd_o signal goes high after one clock pulse.
4) The user should feed data on the next positive clock edge after getting the index.

Figure 11: Input data for non-continuous frames (waveform of nfft_i, nfft_we_i, fwd_inv_i, fwd_inv_we_i, scale_sch_i, scale_sch_we_i, start_i, rfd_o, xn_index_o, and x(n))

Figure 12: Input data for continuous frames (same signals; index n-1 of one frame is followed directly by index 0 of the next)

Output Data Flow Waveform

Figure 13 shows the signals to observe when taking the data out of the FFT.
1) When the busy_o signal is deasserted, done_o goes high for one pulse.
2) After 1 clock pulse, the data valid signal (dv_o) goes high.
3) The index and the data are presented in the same clock pulse.

Figure 13: Output data for non-continuous frames (waveform of busy_o, done_o, dv_o, xk_index_o, and X(k))

Figure 14: Output data for continuous frames (same signals; index n-1 of one frame is followed directly by index 0 of the next)

Top level Timing diagram

Figure 15: Top level timing diagram for non-continuous frames (waveform of nfft_i, nfft_we_i, fwd_inv_i, fwd_inv_we_i, scale_sch_i, scale_sch_we_i, start_i, rfd_o, xn_index_o, x(n), busy_o, done_o, dv_o, xk_index_o, and X(k))

Figure 16: Overall timing diagram for continuous frames (waveform of nfft_i, nfft_we_i, fwd_inv_i, fwd_inv_we_i, scale_sch_i, scale_sch_we_i, start_i, rfd_o, xn_index_o, x(n), busy_o, done_o, dv_o, xk_index_o, and X(k))


FFT User Parameters


The table below details the user parameters used to configure the core.

Table 4: User Parameters for the FFT

Parameter Name | Values | Description
data_width | 10 to 24* | Real = data_width, Imag = data_width
phase_width | 10 to 24* | Real = phase_width, Imag = phase_width
n_point | 16 to 16K | Specifies the Fourier transform length in powers of 2 (16, 32, ...)
STATIC_IFFT | Define | Configures the core to compute only the IFFT.
OUTPUT_ORDER | Define | Define when the output data is wanted in natural order. If NOT defined, the output order is bit-reversed.
SCALING** | Define | If defined, the core is configured for the scaled version. If NOT defined, the core is the unscaled version.
DYN_FFT_IFFT | Define | Makes the core run-time configurable for FFT/iFFT computation.
NX | Define | If defined, the core works for the Nextreme device. If NOT defined, the core works for the N2X device (Nextreme-2 device).
RUN_TIME_N_CONFIG | Define | Makes the FFT transform length run-time configurable.
BLK_FLT_POINT*** | Define | Configures the core for Block Floating Point.
THRESHOLD_MEM_FF | User discretion | Specifies the limit for choosing flip-flops (shift registers) or memory elements: if (depth <= Threshold) flip-flops are chosen, else a memory element.
THRESHOLD_BRAM_REGFILE | User discretion | Specifies the limit for choosing BRAM or register files: if (depth <= Threshold) register files are chosen, else BRAM.

* Note: For the loop engine the data width and phase width range from 8 to 18.
** Note: The SCALING define is only present in the streaming FFT.
*** Note: Present in the loop engine architecture only.
# Note: Present in the streaming architecture only.
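The effect of the two threshold parameters on the streaming core can be illustrated with a small Python sketch. It assumes that the feedback delay depths of an N-point R22SDF pipeline are N/2, N/4, ..., 1 (as in the N=16 example of Figure 4) and that the two thresholds are applied in sequence, first flip-flops versus memory and then register file versus BRAM. The function names and the exact selection order are assumptions, not taken from the RTL.

```python
def r22sdf_delay_depths(n_point: int):
    """Feedback delay depths N/2, N/4, ..., 1 for an N-point pipeline."""
    depths, d = [], n_point // 2
    while d >= 1:
        depths.append(d)
        d //= 2
    return depths


def delay_line_kind(depth, threshold_mem_ff, threshold_bram_regfile):
    """Storage type chosen for one delay line under the assumed rule."""
    if depth <= threshold_mem_ff:
        return "flip-flops"
    if depth <= threshold_bram_regfile:
        return "register file"
    return "BRAM"


# THRESHOLD_MEM_FF = 2 and THRESHOLD_BRAM_REGFILE = 16, N = 16.
kinds = [delay_line_kind(d, 2, 16) for d in r22sdf_delay_depths(16)]
print(list(zip(r22sdf_delay_depths(16), kinds)))
```

Raising THRESHOLD_MEM_FF trades memory blocks for eCells and flip-flops; raising THRESHOLD_BRAM_REGFILE moves more of the mid-sized delay lines from BRAM into register files.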


FFT Configuration File


//-----------------------------------------------------------------//
// USER CONFIGURABLE PARAMETERS/DEFINES
//-----------------------------------------------------------------//
`define DATA_WIDTH 16               //Real width = DATA_WIDTH
                                    //Imag width = DATA_WIDTH
`define PHASE_WIDTH 16              //Real width = PHASE_WIDTH
                                    //Imag width = PHASE_WIDTH
`define N_POINT 1024                //Specifies the Fourier Transform
                                    //Length in steps of power of 2
//`define STATIC_IFFT               //Define: Configure the core to compute
                                    //only IFFT
//`define DYN_FFT_IFFT              //Define: To make the core configurable for
                                    //the run time FFT/iFFT computation
`define NX                          //Define: If defined then the core works
                                    //for Nextreme device. If NOT
                                    //defined then the core works for
                                    //N2X device (Nextreme-2 Device).
//`define RUN_TIME_N_CONFIG         //Define: To make run time configurable
                                    //transform length
`define OUTPUT_ORDER                //Define: When user wants the output data in
                                    //natural order. If NOT defined
                                    //then the output order will be bit reverse order
`define SCALING                     //Define: If defined the core would be
                                    //configured for scaled version. If NOT defined then
                                    //core would be unscaled version
`define BLK_FLT_POINT               //Define: To make the core configured to
                                    //Block Floating Point, only for Loop engine
`define THRESHOLD_MEM_FF 2          //Specifies the limit to choose
                                    //Flip flop (Shift regs) or Memory elements
                                    //if (depth <= Threshold) FF are chosen
                                    //else Memory element
`define THRESHOLD_BRAM_REGFILE 16   //Specifies the limit to choose BRAM or Reg files
                                    //if (depth <= Threshold) Reg files are chosen
                                    //else BRAM is chosen

The FFT IP core has a configuration file named fftv2_config.vh (for the streaming architecture) or fft_config.vh (for the loop engine architecture). The user may configure the USER CONFIGURABLE PARAMETERS/DEFINES in this file and should not touch the other parameters, which are test case specific. A file name with the fftv2_ prefix is specific to the streaming FFT core; a file name with the fft_ prefix is for the loop engine. During synthesis, include the configuration file (for example fftv2_config.vh) and set the parameters to the user-specific values. The allowable ranges are given in the comments of the configuration file and in the parameter table. The parameters THRESHOLD_MEM_FF and THRESHOLD_BRAM_REGFILE are specific to the streaming FFT core; they decide whether memory is used and which type of memory is used for each stage. The BLK_FLT_POINT parameter is specific to the loop engine and configures the core for dynamic scaling according to the input values.


List of Configurations of FFT Core


There are different configurations present in FFT Version 2.1. These are listed below.
1) Loop engine
   a. Radix-2
   b. Radix-4
2) Streaming
   a. Static Even Point
   b. Static Odd Point
   c. Run time Even Point
   d. Run time Odd Point
3) Multi Core Loop engine
   a. Radix-2
   b. Radix-4
4) Multi Core Streaming
   a. Static Even Point
   b. Static Odd Point
   c. Run time Even Point
   d. Run time Odd Point


Component Instantiation for Loop Engine


Verilog Module Declaration
module fft_r2_top_viarom_rtl (
    clk_i , rst_ni ,
    rom_clk_i , rom_rst_ni ,
    ce_i ,
    xn_re_i , xn_im_i ,
    start_i , unload_i ,
    nfft_i , nfft_we_i ,
    fwd_inv_i , fwd_inv_we_i ,
    scale_sch_i , scale_sch_we_i ,
    config_o ,
    xk_re_o , xk_im_o ,
    xn_index_o , xk_index_o ,
    rfd_o , busy_o , dv_o , done_o ,
    blk_exp_o , ovflo_o
);

This instance is for the single core loop engine Radix-2.
Note: For the Radix-4 instance the top level module name changes to fft_r4_top_viarom_rtl; the port names and widths remain the same. The same holds for the VHDL component: the Radix-4 component name changes to fft_r4_top_viarom_rtl.


VHDL Component Declaration


component fft_r2_top_viarom_rtl is
generic(
    data_width  : natural := 16;
    phase_width : natural := 16;
    n_point     : natural := 1024
);
port(
    clk_i          : in  std_logic;
    rst_ni         : in  std_logic;
    rom_clk_i      : in  std_logic;
    rom_rst_ni     : in  std_logic;
    ce_i           : in  std_logic;
    xn_re_i        : in  std_logic_vector (data_width - 1 downto 0);
    xn_im_i        : in  std_logic_vector (data_width - 1 downto 0);
    start_i        : in  std_logic;
    unload_i       : in  std_logic;
    nfft_i         : in  std_logic_vector (4 downto 0);
    nfft_we_i      : in  std_logic;
    fwd_inv_i      : in  std_logic;
    fwd_inv_we_i   : in  std_logic;
    scale_sch_i    : in  std_logic_vector (width_scale - 1 downto 0);
    scale_sch_we_i : in  std_logic;
    config_o       : out std_logic;
    xk_re_o        : out std_logic_vector (data_width - 1 downto 0);
    xk_im_o        : out std_logic_vector (data_width - 1 downto 0);
    xn_index_o     : out std_logic_vector (width_inx - 1 downto 0);
    xk_index_o     : out std_logic_vector (width_inx - 1 downto 0);
    rfd_o          : out std_logic;
    busy_o         : out std_logic;
    dv_o           : out std_logic;
    done_o         : out std_logic;
    blk_exp_o      : out std_logic_vector (5 downto 0);
    ovflo_o        : out std_logic
);
end component fft_r2_top_viarom_rtl;

For Radix-2
    num_of_stg  = ceil_log2(n_point);
    width_scale = 2*num_of_stg;
    width_inx   = ceil_log2(n_point);
For Radix-4
    num_of_stg  = ceil_log4(n_point);
    width_scale = 2*num_of_stg;
    width_inx   = ceil_log2(n_point);
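The derived widths for the Loop engine can be checked numerically before the component is instantiated. The sketch below is a plain restatement of the formulas above; the helper names ceil_log and loop_engine_widths are hypothetical and are not part of the delivered package.

```python
def ceil_log(n, base):
    """Smallest integer k with base**k >= n (integer arithmetic only)."""
    k, value = 0, 1
    while value < n:
        value *= base
        k += 1
    return k


def loop_engine_widths(n_point, radix):
    """Derived widths for the Loop engine ports, per the Radix-2/Radix-4 rules."""
    if radix not in (2, 4):
        raise ValueError("radix must be 2 or 4")
    num_of_stg = ceil_log(n_point, radix)
    return {
        "num_of_stg": num_of_stg,
        "width_scale": 2 * num_of_stg,       # width of scale_sch_i
        "width_inx": ceil_log(n_point, 2),   # width of xn_index_o / xk_index_o
    }


if __name__ == "__main__":
    for radix in (2, 4):
        print(radix, loop_engine_widths(1024, radix))
```

For the default n_point of 1024, the Radix-2 engine has 10 stages and a 20-bit scale_sch_i, while the Radix-4 engine has 5 stages and a 10-bit scale_sch_i; the index width is 10 bits in both cases.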


Component Instantiation for streaming


Verilog Module Declaration
module fftv2_top_rtl (
    clk_i , rst_ni ,
    rom_clk_i , rom_rst_ni ,
    ce_i ,
    xn_re_i , xn_im_i ,
    start_i ,
    nfft_i , nfft_we_i ,
    fwd_inv_i , fwd_inv_we_i ,
    scale_sch_i , scale_sch_we_i ,
    xk_re_o , xk_im_o ,
    xn_index_o , xk_index_o ,
    rfd_o , busy_o , dv_o , done_o ,
    ovflo_o , config_o
);


VHDL Component Declaration


component fftv2_top_rtl is
generic(
    p_data_width  : natural := 16;
    p_phase_width : natural := 16;
    p_n_point     : natural := 1024
);
port(
    clk_i          : in  std_logic;
    rst_ni         : in  std_logic;
    rom_clk_i      : in  std_logic;
    rom_rst_ni     : in  std_logic;
    ce_i           : in  std_logic;
    xn_re_i        : in  std_logic_vector (p_data_width - 1 downto 0);
    xn_im_i        : in  std_logic_vector (p_data_width - 1 downto 0);
    start_i        : in  std_logic;
    nfft_i         : in  std_logic_vector (4 downto 0);
    nfft_we_i      : in  std_logic;
    fwd_inv_i      : in  std_logic;
    fwd_inv_we_i   : in  std_logic;
    scale_sch_i    : in  std_logic_vector (p_scale_width - 1 downto 0);
    scale_sch_we_i : in  std_logic;
    config_o       : out std_logic;
    xk_re_o        : out std_logic_vector (p_data_out_width - 1 downto 0);
    xk_im_o        : out std_logic_vector (p_data_out_width - 1 downto 0);
    xn_index_o     : out std_logic_vector (p_cnt_width - 1 downto 0);
    xk_index_o     : out std_logic_vector (p_cnt_width - 1 downto 0);
    rfd_o          : out std_logic;
    busy_o         : out std_logic;
    dv_o           : out std_logic;
    done_o         : out std_logic;
    ovflo_o        : out std_logic
);
end component fftv2_top_rtl;

The parameters below are to be declared in a package, and that package must be included where this component is declared.
If scaling is not present
    p_data_out_width = p_data_width + ceil_log2(p_n_point);
else
    p_data_out_width = p_data_width;
p_cnt_width   = ceil_log2(p_n_point);
p_no_of_stgs  = ceil_log4(p_n_point);
p_scale_width = 2*p_no_of_stgs;
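The package parameters for the streaming component follow directly from p_data_width, p_n_point and the scaling option, so they can be tabulated ahead of time. The sketch below mirrors the rules above; streaming_widths and its scaling argument are hypothetical names used only for this illustration.

```python
def _ceil_log(n, base):
    """Smallest integer k with base**k >= n."""
    k, value = 0, 1
    while value < n:
        value *= base
        k += 1
    return k


def streaming_widths(p_data_width, p_n_point, scaling):
    """Package parameters for fftv2_top_rtl, following the rules in the text."""
    p_cnt_width = _ceil_log(p_n_point, 2)
    p_no_of_stgs = _ceil_log(p_n_point, 4)
    if scaling:
        p_data_out_width = p_data_width                 # scaled: no bit growth
    else:
        p_data_out_width = p_data_width + p_cnt_width   # unscaled: full bit growth
    return {
        "p_data_out_width": p_data_out_width,
        "p_cnt_width": p_cnt_width,
        "p_no_of_stgs": p_no_of_stgs,
        "p_scale_width": 2 * p_no_of_stgs,
    }


if __name__ == "__main__":
    print(streaming_widths(16, 1024, scaling=False))
    print(streaming_widths(16, 2048, scaling=True))
```

With 16-bit data and a 1024-point unscaled transform the output ports grow to 26 bits. For a point size that is not a power of 4, such as 2048, ceil_log4 rounds up, giving 6 stages and a 12-bit scale_sch_i.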


Multi FFT Core Instance


The multi FFT core involves the following changes. The single Streaming engine is used, with the modification that the ViaROM and twiddle support are placed outside the core. The top wrapper file contains the following blocks:
- FFT core without the ViaROM wrapper and twi_support block
- ViaROM and Twiddle Support block, which sit outside the single Streaming engine core

The following diagram shows how multiple instance of FFT core is done.
[Block diagram. Labelled blocks: ViaROM Wrapper, Twiddle Support Block, and four instances of "FFT core modified". Labelled signals: clk_i, rst_ni, phase_w, rd_addr_viarom_w, twi_wr_addr_w, twi_wr_en_w, config_o. Each FFT core instance n (n = 0 to 3) has its own xn<n>_re_i, xn<n>_im_i, xk<n>_re_i, xk<n>_im_i, fwd_inv_<n>i, scale_sch_<n>i and Nfft<n>_i signals.]

Figure 17 : Multiple instance of FFT core

Following pin out would change for Multi Core FFT.

Single Core: clk_i, rst_ni, rom_clk_i, rom_rst_ni, ce_i, xn_re_i, xn_im_i, start_i, unload_i, nfft_i, nfft_we_i, fwd_inv_i, fwd_inv_we_i, scale_sch_i, scale_sch_we_i, config_o, xk_re_o, xk_im_o, xn_index_o, xk_index_o, rfd_o, busy_o, dv_o, done_o, blk_exp_o, ovflo_o

Multi Core: the same port names as the single core. Ports that are replicated for each core instance carry an additional [No of Inst] dimension, for example scale_sch_we_i [No of Inst].

The top module name changes when multiple instances are used:
Radix-2 Loop engine : fft_multi_core_r2_top_rtl
Radix-4 Loop engine : fft_multi_core_r4_top_rtl
Streaming Even      : fftv2_even_top_multi_core_viarom_rtl
Streaming Odd       : fftv2_odd_top_multi_core_viarom_rtl
Streaming Run Even  : fftv2_run_even_top_multi_core_viarom_rtl
Streaming Run Odd   : fftv2_run_odd_top_multi_core_viarom_rtl

NOTE: In the config file, the parameter NO_OF_CHANNELS defines the number of FFT core instances present in the module.


Bit-Accurate C Model
The C Model is designed for bit-accurate modelling of the FFT core. The model produces exactly the same result as the Verilog implementation of the FFT core. It is important to note that the C-model is not cycle accurate and does not model interface or clock latency. The files provided with the C-Model are
1. fftv2_r22sdf_cmodel.c - The complete C-Model
2. fftv2_inter_parameter.h - Internal parameters required for the IP
3. fftv2_user_defines.h - User parameters for the FFT

System Requirements
GCC or Microsoft Visual C++ 8.0 or greater is required to use the C-Model. The C-model is tested using GCC in the Linux environment and using Microsoft Visual Studio 8.0.
NOTE: The FFT V2.1 C-model code is tested in a 32-bit environment (Win XP Operating System). The model's functionality has been tested for configurations in which the intermediate results do not exceed 32 bits (the chosen configurations use data widths of 8 bits, 9 bits and 10 bits). In MS-VC++, the functionality of the model has not been tested for cases where the internal or final result is wider than 32 bits.


User Defines
Table 5: User Parameters for the C Model

Parameter Name     Description
N_POINT            The point size of a transform
DATA_WIDTH         FFT data width, only real data width
PHASE_WIDTH        FFT phase width, only real data width
NO_OF_FRAMES       Number of frames
F1_N_POINT         First frame N_POINT value
F2_N_POINT         Second frame N_POINT value. Depending on the NO_OF_FRAMES value, that many F*_N_POINT defines are needed
F1_FWD_INV         First frame transform value
F2_FWD_INV         Second frame transform value. Forward transform value = 1, reverse transform value = 0
F1_SCA_VAL         First frame scaling value
F2_SCA_VAL         Second frame scaling value. This value indicates the scaling after each stage
SCALING_EN         Whether to include scaling or not. If it is 1 then scaling is enabled; if it is 0 then scaling is disabled and default scaling is applied
BLK_FLT_POINT      To enable the block floating point scaling feature
STATIC_IFFT        When STATIC_IFFT = 0, negates the imaginary values of the phase factor to calculate the IFFT in a multi frame transform
DYN_FFT_IFFT       When DYN_FFT_IFFT is defined and the transform == 0, the twiddle factors are negated. This is to calculate the forward and inverse transformation at run time
input              Input file name
output             Output file name
EN_STAGE_RESULT    This is to print intermediate stage results
PRINT_TWIDDLE      This is to print phase factor values
F_N_POINT          Make the array of F*_N_POINT
F_FWD_INV          Make the array of F*_FWD_INV
F_SCA_VAL          Make the array of F*_SCA_VAL

Input Data File Format

In "fft_user_defines.h" the character array "input" specifies the input file name. This file should contain the input data to be transformed. The data should be in decimal format, with the real and imaginary values of each sample on one line, separated by a space. An example is shown below:
+19783 +47534
+61825 +16308
+118822 +43074
+96314 +117995
The left column is the real part and the right column is the imaginary part.
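An input file in this layout can be produced with a few lines of scripting. The sketch below is an illustration only: the function names are hypothetical, and writing negative values with a leading minus sign is an assumption, since the example shows only positive values.

```python
def format_input(samples):
    """Render (real, imag) integer pairs as signed decimal text, one sample per line."""
    return "".join(f"{re:+d} {im:+d}\n" for re, im in samples)


def parse_input(text):
    """Read the same layout back into a list of (real, imag) integer pairs."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # tolerate blank lines
        if len(fields) != 2:
            raise ValueError(f"expected 'real imag', got: {line!r}")
        samples.append((int(fields[0]), int(fields[1])))
    return samples


if __name__ == "__main__":
    text = format_input([(19783, 47534), (61825, 16308), (-12, 0)])
    print(text, end="")
    print(parse_input(text))
```

The text returned by format_input can be written to the file named by the "input" character array before the model is run.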

Output Data File Format


In "fft_user_defines.h" the character array "output" specifies the output file name. It contains the transformed values of the input data. The data is in decimal format and each line contains the real part, the imaginary part and the overflow indication bit for that frame. An example is shown below:
+19783 +47534 1
+61825 +16308 1
+118822 +43074 1
+96314 +117995 1
The left column is the real part of the output data. The next column is the imaginary part of the data. The last column is the overflow indication bit.
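When post-processing the model output, each line can be split into its three fields. The sketch below assumes exactly three whitespace-separated fields per line, as described above; the function names are hypothetical.

```python
def parse_output_line(line):
    """Split one output line into (real, imag, overflow_bit)."""
    fields = line.split()
    if len(fields) != 3:
        raise ValueError(f"expected 'real imag overflow', got: {line!r}")
    re, im, ovf = (int(f) for f in fields)
    if ovf not in (0, 1):
        raise ValueError(f"overflow flag must be 0 or 1, got {ovf}")
    return re, im, ovf


def any_overflow(lines):
    """True if any non-blank line carries a set overflow bit."""
    return any(parse_output_line(l)[2] == 1 for l in lines if l.strip())


if __name__ == "__main__":
    rows = ["+19783 +47534 1", "+61825 +16308 1"]
    print([parse_output_line(r) for r in rows])
    print(any_overflow(rows))
```

A check such as any_overflow is a quick way to decide whether a scaling schedule needs to be revisited before comparing the values against a reference.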


Steps to run FFT C- Model in Linux environment:


1. Provide the correct parameters required for your purpose in the fft_user_defines.h file. The user may change only this file.
2. Place the input file in the same folder where you are running the model.
3. A 64-bit C compiler is needed to compile the C-model.
4. All the commands shown below are for Linux OS.
5. Type: Prompt> gcc fft_bitacc_cmodel.c -lm
6. This produces an executable a.out.
7. Type: Prompt> ./a.out
8. The output file is created in the same folder.

Steps to run FFT C- Model in Microsoft Visual C++


1. Create the VC++ project

Figure 18 : Creating the VC++ Project



2. Add the source file "fftv2_r22sdf_cmodel.c" to the project

Figure 19: Add the Source files

3. Select the project in the Solution Explorer, click on the Properties option and set the "Working Directory" to the folder where the "input.txt" file is present.


Figure 20: Select the project

Figure 21: Project Properties


Figure 22: Select the working directory

4. Build the Project


Figure 23: Build the project

5. Execute the project

Figure 24: Execute the project

The "output.txt" file will be generated in the "Working Directory" specified above.

Dynamic Range Performance


The C-Model for the FFT v2.1 can be used to evaluate the dynamic range of the IP. The plots below show the difference between a floating point FFT implementation and the eASIC IP for the following configurations:
1) Unscaled Version Streaming
2) Scaled Version Streaming
3) Scaled Version R-2
4) Scaled Version R-4
5) BFP R-2
6) BFP R-4
Note: Dark Blue = eASIC FFT Core result; Pink = Floating point FFT (using Octave)

Figure 25: Slot Noise comparison between eASIC FFT and Floating Pt streaming FFT for 16 bit 1024 pt FFT unscaled


Figure 26: Slot Noise comparison between eASIC FFT and Floating Pt FFT for 16 bit 1024 pt FFT Scaled by 32


Figure 27: Slot Noise comparison between eASIC FFT and Floating Pt FFT for 16 bit 1024 pt FFT Radix-2 scaled version
The scaling value used was [1 1 1 1 ]. Hence the stages were alternately scaled.


Figure 28: Slot Noise comparison between eASIC FFT and Floating Pt FFT for 16 bit 1024 pt FFT Radix-2 block floating point version


Figure 29: Slot Noise comparison between eASIC FFT and Floating Pt FFT for 16 bit 1024 pt FFT Radix4 scaled version


Figure 30: Slot Noise comparison between eASIC FFT and Floating Pt FFT for 16 bit 1024 pt FFT Radix4 block floating point version
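A comparison between the bit-accurate model output and a floating point reference can also be condensed into a single figure. The sketch below computes a generic signal-to-error ratio in dB between two equal-length sequences of complex values. It is not the slot noise measurement used for the plots above, and the function name error_db is hypothetical.

```python
import math


def error_db(reference, measured):
    """Ratio of reference power to error power, in dB.

    `reference` would hold floating point FFT results and `measured` the
    fixed point results, after both have been brought to the same scale.
    Returns math.inf when the two sequences are identical.
    """
    if len(reference) != len(measured):
        raise ValueError("sequences must have the same length")
    signal = sum(abs(r) ** 2 for r in reference)
    noise = sum(abs(r - m) ** 2 for r, m in zip(reference, measured))
    if noise == 0:
        return math.inf
    return 10 * math.log10(signal / noise)


if __name__ == "__main__":
    ref = [complex(3.2, -4.1), complex(-1.7, 2.6)]
    fixed = [complex(round(v.real), round(v.imag)) for v in ref]
    print(error_db(ref, fixed))
```

Any fixed scaling applied by the core has to be undone, or applied to the reference as well, before the two sequences are passed to such a function.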


Directory Structure
Figure 27 shows the directory structure after unpacking the release package. Make sure the directory structure is correct before using the core:

Figure 27 : Top level directory

Figure 28: Interface folder containing testcases

Figure 29 : Simulation folder

Figure 30 : Testcase folder arrangement


Compile & Simulate the Design (Streaming viarom)


The following steps are required for running the provided testcases (for Streaming ViaROM simulation):
1. Before you start simulation, ensure that the Modelsim present working directory is set to the \data\dsp_cores\fft_core\SIMULATION folder.
2. The scripts for the N2X device are in the folder \data\dsp_cores\fft_core\SIMULATION\Streaming\scripts\functional_NX2.
3. The procedure is explained here for the N2X device. Open the file \data\dsp_cores\fft_core\SIMULATION\Streaming\scripts\functional_NX2\run_all_viarom.do and set the variable ELIB_NX2 in that script to the path where the N2X libraries are available. These files are available in \data\dsp_cores\fft_core\SIMULATION\Streaming\scripts\functional_NX2\NX2
   Note: Library files should be in the same folder. These are the files required for the simulation of the Nextreme-2 memory models.
4. Once the directory is set, go to the Modelsim command/transcript window and type
   do ./Streaming/scripts/functional_NX2/run_all_viarom.do
   Modelsim will call the macro run_all_viarom.do and execute its commands.
5. Open the simulation script run_all_viarom.do, which is located in the folder \data\dsp_cores\fft_core\SIMULATION\Streaming\scripts\functional_NX2\, in any text editor. This file has commands to run each of the test cases. Test case names are assigned to the TESTCASE variable in the script. The configuration used for a particular test case is available in the fftv2_config.vh file under the folder \data\dsp_cores\fft_core\INTERFACE\Streaming\NX2\<testcase_name>.
6. Once the simulation for a test case is finished, the message **** Simulation End **** is displayed in the Modelsim command/transcript window.
7. The verdict is generated by a script that is included in the run_all_viarom.do script itself.
8. The final PASS/FAIL status of the test cases is written to the verdict report verdict_viarom.rpt in the folder \data\dsp_cores\fft_core\TEST\Streaming\functional_NX2\. This verdict indicates whether any value mismatch has occurred. The report does not contain the reason for the failure or a dump of the DUT.
9. A detailed report of each test case is written to \data\dsp_cores\fft_core\TESTBENCH\Streaming\simdata\<testcase>\report\<testcase>_nx2_1.rpt for the N2X device.

NOTE:
1. Before compilation of any testcase, set the variable ELIB_NX2 to the correct path where the files for the simulation of the dual port RAM are available.
2. For running one testcase the following changes are required:
   a. Go to the ./scripts/functional_NX2/run_all_viarom.do file.
   b. Look for quietly set TESTCASE {<testcase_name>}. This holds the list of the testcases to be run.
   c. Set TESTCASE to the particular testcase that needs to be run.
   d. After modification, follow the steps mentioned above for compilation and simulation of the core.

Script Descriptions
The scripts are available in the folder data\dsp_cores\fft_core\SIMULATION\Streaming\scripts\functional_NX2 to run a testcase. The following files are used for streaming viarom (EASIC) configurations:
1) compile_all_viarom.do //Compilation of R22SDF Architecture source & TB files
2) compile_lib_NX2.do //Compilation of NX2 device libraries source files
3) run_all_viarom.do //Loading the design & running the simulation of R22SDF architecture files
4) fft_gen_verdict_rpt_viarom.tcl //Generation of final verdict
Also, the memory models are present under the NX2 folder.

compile_all_viarom.do
This script compiles the source files and test bench files required for the R22SDF streaming architecture. The test bench files are under the directory \data\dsp_cores\fft_core\TESTBENCH\Streaming. RTL and RTL header files are present under the directory \data\dsp_cores\fft_core\RTL\Streaming_viarom

compile_lib_NX2.do
This script compiles the library files for the required memory model. Memory related files for NX2 are available under the folder \data\dsp_cores\fft_core\SIMULATION\Streaming\scripts\functional_NX2\NX2

run_all_viarom.do
This script creates the work directory when the user runs it for the 1st time. It calls the compile_lib_NX2.do script to compile the library files related to memory, and the compile_all_viarom.do script to compile the source and TB files. To run a regression or a particular testcase, follow the procedure given in the section above. This script is for simulation of the streaming architecture.
The script internally calls the fft_gen_verdict_rpt_viarom.tcl script, which generates the verdict report giving the status of each testcase. The report also provides the time stamp of each of the test cases.
Note: There is a variable called FIRST_RUN; confirm that it is set to 1 when running for the first time. This ensures that the work directory is created and the libraries are compiled before compilation of the RTL files.

NOTE: The streaming configuration is also available for the Altera version. The script files used are listed below and are present in the same folder as before. The following files are used for streaming viarom Altera configurations:
1) compile_all_altera.do //Compilation of R22SDF Architecture source & TB files
2) compile_lib_NX2.do //Compilation of NX2 device libraries source files
3) run_all_altera.do //Loading the design & running the simulation of R22SDF architecture files
4) fft_gen_verdict_rpt_viarom.tcl //Generation of final verdict

An Example Test case


Select a testcase to be run.

Let us consider that tc_fftv2_64 is to be run. As per the testcase register .xls, tc_fftv2_64 has the following configuration:
1) Run time configurable
2) N point = 1024
3) Dynamic FFT/iFFT
4) Data width = 16 (Real = 16 & Imag = 16)
5) Phase width = 16 (Real = 16 & Imag = 16)
6) Output order: Natural
7) Scaled version
Hence the fftv2_config.vh file under the directory \data\dsp_cores\fft_core\INTERFACE\Streaming\NX2\tc_fftv2_64\, which is the configuration file for the core, should have the following definitions:
`define DATA_WIDTH              16
`define PHASE_WIDTH             16
`define N_POINT                 1024
//`define STATIC_IFFT
`define DYN_FFT_IFFT
//`define NX
`define RUN_TIME_N_CONFIG
`define OUTPUT_ORDER
`define SCALING
`define THRESHOLD_MEM_FF        2
`define THRESHOLD_BRAM_REGFILE  16

In addition to this, the user can also define the clock period and duty cycle. This completes the FFT core configuration.
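Before launching a simulation it can be useful to confirm which defines in a configuration file are active and which are commented out. The sketch below scans the text of a configuration file for uncommented `define lines. The function name active_defines is hypothetical, and the parsing assumes one define per line with an optional value and an optional trailing // comment.

```python
import re

# An active define starts the line (after optional whitespace) with a backtick.
# A line such as "//`define NX" does not match and is treated as disabled.
_DEFINE = re.compile(r"^\s*`define\s+(\w+)(?:\s+(\S+))?")


def active_defines(text):
    """Map each active define name to its value (None for a bare flag)."""
    result = {}
    for line in text.splitlines():
        match = _DEFINE.match(line)
        if not match:
            continue
        value = match.group(2)
        if value is not None and value.startswith("//"):
            value = None  # the token after the name was a trailing comment
        result[match.group(1)] = value
    return result


if __name__ == "__main__":
    sample = (
        "`define DATA_WIDTH 16\n"
        "`define N_POINT 1024 //transform length\n"
        "//`define STATIC_IFFT\n"
        "`define DYN_FFT_IFFT\n"
    )
    print(active_defines(sample))
```

For the example testcase, DYN_FFT_IFFT, RUN_TIME_N_CONFIG, OUTPUT_ORDER and SCALING would be expected among the returned names, and STATIC_IFFT and NX would be absent.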
Next, change the testcase name in run_all_viarom.do and run the simulation. In the run_all_viarom.do file, which is under the folder \data\dsp_cores\fft_core\SIMULATION\Streaming\scripts\functional_NX2, set the test case name to be run as follows and save the file.
quietly set TESTCASE {"tc_fftv2_64"}
Run the script using the command
do ./Streaming/scripts/functional_NX2/run_all_viarom.do
After the simulation is complete, the output report file can be viewed for final results. The report is available in the file tc_fftv2_64_nx2_1.rpt under the folder \data\dsp_cores\fftv2.0\TESTBENCH\Streaming\simdata\tc_fftv2_64\report\
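The edit to the TESTCASE line can be scripted when many testcases have to be run one after another. The sketch below rewrites that line in the text of a .do file. The function name set_testcase is hypothetical, and the code assumes the script contains exactly one line of the form quietly set TESTCASE {...}; the real script may be laid out differently.

```python
import re

_TESTCASE_LINE = re.compile(
    r"^([ \t]*quietly[ \t]+set[ \t]+TESTCASE[ \t]+)\{.*\}[ \t]*$",
    re.MULTILINE,
)


def set_testcase(script_text, testcase):
    """Return script_text with the TESTCASE list replaced by a single testcase."""
    new_text, count = _TESTCASE_LINE.subn(
        lambda m: f'{m.group(1)}{{"{testcase}"}}', script_text
    )
    if count != 1:
        raise ValueError(f"expected exactly one TESTCASE line, found {count}")
    return new_text


if __name__ == "__main__":
    script = 'vlib work\nquietly set TESTCASE {"tc_a" "tc_b"}\nrun -all\n'
    print(set_testcase(script, "tc_fftv2_64"), end="")
```

Raising an error when the line is missing or duplicated keeps the helper from silently producing a script that runs the wrong testcase.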

Compile & Simulate the Design (Loop Engine Radix-2)


The following steps are required for running the provided testcases (for Loop Engine Radix 2):
1. Before you start simulation, ensure that the Modelsim present working directory is set to the \data\dsp_cores\fft_core\SIMULATION folder.
2. The scripts for the N2X device are in the folder \data\dsp_cores\fft_core\SIMULATION\Loop_engine\Radix2\scripts\functional_NX2.
3. The procedure is explained here for the N2X device. Open the file \data\dsp_cores\fft_core\SIMULATION\Loop_engine\Radix2\scripts\functional_NX2\run_all_viarom.do and set the variable ELIB_NX2 in that script to the path where the library files are available. These files are available in \data\dsp_cores\fft_core\SIMULATION\Loop_engine\Radix2\scripts\functional_NX2\NX2
4. Once the directory is set, go to the Modelsim command/transcript window and type
   do ./Loop_engine/Radix2/scripts/functional_NX2/run_all_viarom.do
   Modelsim will call the macro run_all_viarom.do and execute its commands.
5. Open the simulation script run_all_viarom.do, which is located in the folder \data\dsp_cores\fft_core\SIMULATION\Loop_engine\Radix2\scripts\functional_NX2\, in any text editor. This file has commands to run each of the test cases. Test case names are assigned to the TESTCASE variable in the script. The configuration used for a particular test case is available in the fft_config.vh file under the folder \data\dsp_cores\fft_core\INTERFACE\Loop_engine\NX2\<testcase_name>.
6. Once the simulation for a test case is finished, the message **** Simulation End **** is displayed in the Modelsim command/transcript window.
7. The verdict is generated by a script that is included in the run_all_viarom.do script itself.
8. The final PASS/FAIL status of the test cases is written to the verdict report verdict_viarom.rpt in the folder \data\dsp_cores\fft_core\TEST\Loop_engine\Radix2\functional_NX2\. This verdict indicates whether any value mismatch has occurred. The report does not contain the reason for the failure or a dump of the DUT.
9. A detailed report of each test case is written to \data\dsp_cores\fft_core\TESTBENCH\Loop_engine\simdata\<testcase>\report\<testcase>_nx2.rpt for the N2X device.

Script Descriptions
The scripts are available in the folder data\dsp_cores\fft_core\SIMULATION\Loop_engine\Radix2\scripts\functional_NX2 to run a testcase. The following files are used for looping radix 2 (EASIC) configurations:
1) compile_all_viarom.do //Compilation of loop engine Radix2 Architecture source & TB files
2) compile_lib_NX2.do //Compilation of NX2 device libraries source files
3) run_all_viarom.do //Loading the design & running the simulation of loop engine Radix2 architecture files
4) fft_gen_verdict_rpt_viarom.tcl //Generation of final verdict
NOTE: The script files for running the Altera version of streaming are also available in the folder data\dsp_cores\fft_core\SIMULATION\Loop_engine\Radix2\scripts\functional_NX2. Running Altera test cases is similar to running streaming test cases.

compile_all_viarom.do
This script compiles the source files and test bench files required for the loop engine Radix2 architecture. The test bench files are under the directory \data\dsp_cores\fft_core\TESTBENCH\Loop_engine. RTL and RTL header files are present under the directory \data\dsp_cores\fft_core\RTL\Loop_engine_viarom\r2_rtl for radix 2 and \data\dsp_cores\fft_core\RTL\Loop_engine_viarom\r4_rtl for radix 4.

compile_lib_NX2.do
This script compiles the library files for the required memory model. Memory related files for NX2 are available under the folder \data\dsp_cores\fft_core\SIMULATION\Loop_engine\Radix2\scripts\functional_NX2\NX2

run_all_viarom.do
This script creates the work directory when the user runs it for the 1st time. It calls the compile_lib_NX2.do script to compile the library files related to memory, and the compile_all_viarom.do script to compile the source and TB files. To run a regression or a particular testcase, follow the procedure given in the section above. This script is for simulation of the loop engine Radix2 architecture.
The script internally calls the fft_gen_verdict_rpt_viarom.tcl script, which generates the verdict report giving the status of each testcase. The report also provides the time stamp of each of the test cases.
Note: There is a variable called FIRST_RUN; confirm that it is set to 1 when running for the first time. This ensures that the work directory is created and the libraries are compiled before compilation of the RTL files.

NOTE: The looping configuration is also available for the Altera version. The script files used are listed below and are present in the same folder as before. The following files are used for looping radix2 Altera configurations:
1) compile_all_altera.do //Compilation of loop engine Architecture source & TB files
2) compile_lib_NX2.do //Compilation of NX2 device libraries source files
3) run_all_altera.do //Loading the design & running the simulation of loop engine architecture files
4) fft_gen_verdict_rpt_viarom.tcl //Generation of final verdict

An Example Test case


Select a testcase to be run.

Let us consider that TV_FFT_34 is to be run. As per the testcase register .xls, TV_FFT_34 has the following configuration:
1) Run time configurable
2) N point = 8192
3) Static FFT
4) Data width = 16 (Real = 16 & Imag = 16)
5) Phase width = 16 (Real = 16 & Imag = 16)
6) Input order: Natural
7) Scaled version
Hence the fft_config.vh file under the directory \data\dsp_cores\fft_core\INTERFACE\Loop_engine\NX2\TV_FFT_34\, which is the configuration file for the core, should have the following definitions (5 frames):
`define DATA_WIDTH   16
`define PHASE_WIDTH  16
`define N_POINT      8192
//`define STATIC_IFFT
//`define DYN_FFT_IFFT
//`define NX
//`define RUN_TIME_N_CONFIG
`define INPUT_ORDER

In addition to this, the user can also define the clock period and duty cycle. This completes the FFT core configuration.

Next, change the testcase name in run_all_viarom.do and run the simulation. In the run_all_viarom.do file, which is under the folder \data\dsp_cores\fft_core\SIMULATION\Loop_engine\Radix2\scripts\functional_NX2, set the test case name to be run as follows and save the file.
quietly set TESTCASE {"TV_FFT_34"}
Run the script using the command
do ./Loop_engine/Radix2/scripts/functional_NX2/run_all_viarom.do
After the simulation is complete, the output report file can be viewed for final results. The report is available in the file TV_FFT_34_nx2.rpt under the folder \data\dsp_cores\fftv2.0\TESTBENCH\Loop_engine\simdata\TV_FFT_34\report\

Procedure for Radix-4 Loop Engine Test cases


The procedure is similar to running radix 2 testcases.
The script files are located at \data\dsp_cores\fft_core\SIMULATION\Loop_engine\Radix4\scripts\functional_NX2\run_all_viarom.do
The script file names are similar to those of radix 2, and the simulation is done by running the script file run_all_viarom.do:
do ./Loop_engine/Radix4/scripts/functional_NX2/run_all_viarom.do
The test case selected for simulation is TV_FFT_R4_34a.
The configuration file for the particular test case is located at \data\dsp_cores\fft_core\INTERFACE\Loop_engine\NX2\<testcase_name>
The final verdict report is written to the file verdict_viarom.rpt in the location \data\dsp_cores\fft_core\TEST\Loop_engine\Radix4\functional_NX2\.
A detailed report of each test case is written to \data\dsp_cores\fft_core\TESTBENCH\Loop_engine\simdata\<testcase>\report\<testcase>_nx2.rpt for the N2X device.
Details about the RTL files and test bench files are provided in the release notes.

Compile & Simulate the Design (Multi Core Streaming viarom)


The following steps are required for running the provided testcases (for Multi Core Streaming ViaROM simulation):
1. Before you start simulation, ensure that the Modelsim present working directory is set to the \data\dsp_cores\fft_core\SIMULATION folder.
2. The scripts for the N2X device are in the folder \data\dsp_cores\fft_core\SIMULATION\Streaming\scripts\functional_NX2.
3. The procedure is explained here for the N2X device. Open the file \data\dsp_cores\fft_core\SIMULATION\Streaming\scripts\functional_NX2\run_all_viarom_multi_core.do and set the variable ELIB_NX2 in that script to the path where the library files are available. These files are available in \data\dsp_cores\fft_core\SIMULATION\Streaming\scripts\functional_NX2\NX2
4. Once the directory is set, go to the Modelsim command/transcript window and type
   do ./Streaming/scripts/functional_NX2/run_all_viarom_multi_core.do
   Modelsim will call the macro run_all_viarom_multi_core.do and execute its commands.
5. Open the simulation script run_all_viarom_multi_core.do, which is located in the folder \data\dsp_cores\fft_core\SIMULATION\Streaming\scripts\functional_NX2\, in any text editor. This file has commands to run each of the test cases. Test case names are assigned to the TESTCASE variable in the script. The configuration used for a particular test case is available in the fftv2_config.vh file under the folder \data\dsp_cores\fft_core\INTERFACE\multi_core_Streaming\NX2\<testcase_name>
6. Once the simulation for a test case is finished, the message **** Simulation End **** is displayed in the Modelsim command/transcript window.
7. The verdict is generated by a script that is included in the run_all_viarom_multi_core.do script itself.
8. The final PASS/FAIL status of the test cases is written to the verdict report verdict.rpt in the folder \data\dsp_cores\fft_core\TEST\multi_core_streaming\functional_NX2. This verdict indicates whether any value mismatch has occurred. The report does not contain the reason for the failure or a dump of the DUT.
9. A detailed report of each test case is written to \data\dsp_cores\fft_core\TESTBENCH\multi_core_streaming\simdata\<testcase>\report\<testcase>_nx2_XXX.rpt for the N2X device.

Script Descriptions
The scripts are available in the folder data\dsp_cores\fft_core\SIMULATION\Streaming\scripts\functional_NX2 to run a testcase. Following are the files which are used for streaming viarom (EASIC) configurations:
1) compile_all_viarom_multi_core.do // Compilation of R22SDF Architecture source & TB files
2) compile_lib_NX2.do // Compilation of NX 2 device libraries source files
3) run_all_viarom_multi_core.do // Loading the design & running the simulation of R22SDF architecture files
4) fft_gen_verdict_rpt_viarom_multicore.tcl // Generation of final verdict


Note: All these scripts are similar to the scripts used for streaming test cases and they are explained above.

An Example Testcase
Select a testcase to be run. Let us consider that tc_fftv2_64 is to be run. As per the testcase register .xls, tc_fftv2_64 has the following configuration.
1) Multichannel test case (10 channels)
2) Run time configurable
3) N point = 1024
4) Dynamic FFT/IFFT
5) Data width = 16 (Real = 16 & Imag = 16)
6) Phase width = 16 (Real = 16 & Imag = 16)
7) Output order Natural
8) Scaled version
Hence, the fftv2_config.vh file under the directory \data\dsp_cores\fft_core\INTERFACE\multi_core_streaming\NX2\tc_fftv2_64\, which is the configuration file for the core, should have the following definitions (5 frames):
`define DATA_WIDTH 16
`define PHASE_WIDTH 16
`define N_POINT 1024
//`define STATIC_IFFT
`define DYN_FFT_IFFT
//`define NX
`define RUN_TIME_N_CONFIG
`define OUTPUT_ORDER
`define SCALING
In addition to this, the user can also define the clock period & duty cycle. This completes the FFT core configuration.
Next, change the testcase name in run_all_viarom.do for simulation and run it. In the run_all_viarom_multi_core.do file, which is under the folder \data\dsp_cores\fft_core\SIMULATION\Streaming\scripts\functional_NX2\NX2, set the test case name to be run as follows and save the file.
quietly set TESTCASE {" tc_fftv2_64"}
Run the script using the command
do ./Streaming/scripts/functional_NX2/run_all_viarom_multi_core.do
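Before starting a long simulation, it can help to confirm that the configuration header really carries the settings the test case expects. The sketch below, in Python, reads the active `define lines from the header text and compares DATA_WIDTH, PHASE_WIDTH and N_POINT with the values listed for tc_fftv2_64 (16, 16 and 1024). The helper name parse_defines is hypothetical, and the sample text stands in for the contents of fftv2_config.vh.

```python
import re

# Matches an active `define line: macro name plus an optional value.
DEFINE_RE = re.compile(r"^\s*`define\s+(\w+)(?:\s+(.*?))?\s*$")


def parse_defines(header_text):
    """Map each active macro name to its value ('' for a bare flag)."""
    defines = {}
    for line in header_text.splitlines():
        # Drop // comments; this also skips commented-out defines.
        code = line.split("//", 1)[0]
        match = DEFINE_RE.match(code)
        if match:
            defines[match.group(1)] = (match.group(2) or "").strip()
    return defines


# Stand-in for the contents of fftv2_config.vh for tc_fftv2_64.
sample = """`define DATA_WIDTH 16
`define PHASE_WIDTH 16
`define N_POINT 1024
//`define STATIC_IFFT
`define DYN_FFT_IFFT
"""
cfg = parse_defines(sample)
expected = {"DATA_WIDTH": "16", "PHASE_WIDTH": "16", "N_POINT": "1024"}
mismatches = {k: (v, cfg.get(k)) for k, v in expected.items() if cfg.get(k) != v}
print(mismatches or "configuration matches")
```

Values are kept as strings so that flags without a value, such as DYN_FFT_IFFT, can be represented in the same dictionary.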

After the simulation is complete, the output report file can be viewed for final results. The file verdict.rpt will be available in the folder \data\dsp_cores\fft_core\TEST\multi_core_streaming\functional_NX2. The verdict indicates whether any value mismatch has occurred. The report does not contain the reason for the failure or a dump of the DUT.
Details regarding the RTL files can be found in the release document fft_release_notes_2.1.doc
The RTL files can be found in the folder \data\dsp_cores\fft_core\RTL\Multi_core_streaming
The test bench files are in the folder \data\dsp_cores\fft_core\TESTBENCH\multi_core_streaming

NOTE: The multicore streaming configuration is also available for the Altera version. The script files used are listed below and they are present in the folder \data\dsp_cores\fft_core\SIMULATION\Streaming\scripts\functional_NX2\NX2
The RTL files are present in the folder \data\dsp_cores\fft_core\RTL\Altera_Multi_core_streaming
Following are the files which are used for multi core streaming viarom (EASIC) configurations:
1. compile_all_altera_multi_core.do // Compilation of R22SDF Architecture source & TB files
2. compile_lib_NX2.do // Compilation of NX 2 device libraries source files
3. run_all_altera_multi_core.do // Loading the design & running the simulation of R22SDF architecture files
4. fft_gen_verdict_rpt_viarom_multicore.tcl // Generation of final verdict

Compile & Simulate the Design (Multi Core Loop engine radix 2)
The following steps are required for running the provided testcases (for Multi Core Loop engine radix 2 simulation):
1. Before you start simulation, ensure that the Modelsim present working directory is set to the \data\dsp_cores\fft_core\SIMULATION folder.
2. The scripts for the N2X device are in the folder \data\dsp_cores\fft_core\SIMULATION\Loop_engine\Radix2\scripts\functional_NX2
3. Here we will explain the procedure for the N2X device. Open the file \data\dsp_cores\fft_core\SIMULATION\Loop_engine\Radix2\scripts\functional_NX2\run_all_viarom_multi_core.do and set the variable ELIB_NX2 in that script to the path of the folder that holds the required files. These files are available in \data\dsp_cores\fft_core\SIMULATION\Loop_engine\Radix2\scripts\functional_NX2\NX2
4. Once the directory is set, go to the Modelsim command/transcript window and type
do ./Loop_engine/Radix2/scripts/functional_NX2/run_all_multicore_viarom.do
Modelsim calls the macro run_all_multicore_viarom.do and executes its commands.
5. Open the simulation script run_all_multicore_viarom.do, which is located in the folder \data\dsp_cores\fft_core\SIMULATION\Loop_engine\Radix2\scripts\functional_NX2\, in any text editor. This file has commands to run each of the test cases. Test case names are assigned to the TESTCASE variable in the script. The configuration used for a particular test case is available in the fft_config.vh file under the folder \data\dsp_cores\fft_core\INTERFACE\multi_core_Loop_engine\NX2\<testcase_name>.
6. Once the simulation for a test case is finished, the message **** Simulation End **** is displayed in the Modelsim command/transcript window.
7. The verdict is generated by a script that is included in the run_all_multicore_viarom.do script itself.
8. The final report of the test case status (PASS/FAIL) is present in the verdict report multi_core_Loop_engine_verdict.rpt in the folder \data\dsp_cores\fft_core\TEST\multi_core_Loop_engine\Radix2\functional_NX2. The verdict indicates whether any value mismatch has occurred. The report does not contain the reason for the failure or a dump of the DUT.
9. A detailed report of each test case is present in \data\dsp_cores\fft_core\TESTBENCH\multi_core_Loop_engine\simdata\<testcase>\report\<testcase>XX_nx2.rpt for the N2X device.

Script Descriptions
The scripts are available in the folder data\dsp_cores\fft_core\SIMULATION\Loop_engine\Radix2\scripts\functional_NX2 to run a testcase. Following are the files which are used for multicore loop engine (EASIC) configuration:
1) compile_all_multicore_viarom.do // Compilation of Loop engine radix 2 Architecture source & TB
2) compile_lib_NX2.do // Compilation of NX 2 device libraries source files
3) run_all_multicore_viarom.do // Loading the design & running the simulation of Loop engine radix 2 architecture files
4) fft_gen_verdict_rpt_multicore_viarom.tcl // Generation of final verdict

All these scripts are similar to the scripts used for loop engine test cases and they are explained above.

NOTE - The multicore looping configuration is also available for the Altera version. The script files used are listed below and they are present in the same folder as before. Following are the files which are used for multicore loop engine (altera) configuration:
1) compile_all_viarom_multicore_altera.do // Compilation of Loop engine radix 2 Architecture source & TB
2) compile_lib_NX2.do // Compilation of NX 2 device libraries source files
3) run_all_viarom_multicore_altera.do // Loading the design & running the simulation of Loop engine radix 2 architecture files
4) fft_gen_verdict_rpt_multicore_viarom.tcl // Generation of final verdict

An Example Testcase
Select a testcase to be run. Let us consider that TV_FFT_36 is to be run. As per the testcase register .xls, TV_FFT_36 has the following configuration.
1) Multichannel test case (2 channels)
2) Run time configurable
3) N point = 8192
4) Dynamic FFT/IFFT
5) Data width = 16 (Real = 16 & Imag = 16)
6) Phase width = 16 (Real = 16 & Imag = 16)
7) Input order Natural
8) Block floating point
Hence, the fft_config.vh file under the directory \data\dsp_cores\fft_core\INTERFACE\multi_core_Loop_engine\NX2\TV_FFT_36\, which is the configuration file for the core, should have the following definitions:
`define DATA_WIDTH 16
`define PHASE_WIDTH 16
`define N_POINT 8192
//`define STATIC_IFFT
`define DYN_FFT_IFFT
//`define NX
`define RUN_TIME_N_CONFIG
`define INPUT_ORDER
`define BLK_FLT_POINT
In addition to this, the user can also define the clock period & duty cycle. This completes the FFT core configuration.
Next, change the testcase name in run_all_multicore_viarom.do for simulation and run it. In the run_all_viarom_multi_core.do file, which is under the folder \data\dsp_cores\fft_core\SIMULATION\Loop_engine\Radix2\scripts\functional_NX2\, set the test case name to be run as follows and save the file.
quietly set TESTCASE {" TV_FFT_36"}

Run the script using the command
do ./Loop_engine/Radix2/scripts/functional_NX2/run_all_multicore_viarom.do
After the simulation is complete, the output report file can be viewed for final results. The file verdict.rpt will be available in the folder \data\dsp_cores\fft_core\TEST\multi_core_Loop_engine\Radix2\functional_NX2. The verdict indicates whether any value mismatch has occurred. The report does not contain the reason for the failure or a dump of the DUT.
NOTE - Details regarding the RTL files can be found in the release document fft_release_notes_2.1.doc
The RTL files can be found in the folders
\data\dsp_cores\fft_core\RTL\Multi_core_Loop_engine\r2_rtl for radix 2
\data\dsp_cores\fft_core\RTL\Multi_core_Loop_engine\r2_rtl for radix 4
The test bench files are in the folder \data\dsp_cores\fft_core\TESTBENCH\multi_core_Loop_engine
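Selecting a test case means editing one line of the .do script by hand, which is easy to script when several test cases are run in turn. The following Python sketch replaces the name inside the TESTCASE assignment and leaves the rest of the script text untouched, including any whitespace inside the quotes. It assumes the script contains exactly one assignment of the form shown in this document; the function name set_testcase is hypothetical.

```python
import re

# One assignment line of the form: quietly set TESTCASE {"<name>"}
TESTCASE_RE = re.compile(
    r'^([ \t]*quietly[ \t]+set[ \t]+TESTCASE[ \t]+\{"[ \t]*)[^"]*("\}[ \t]*)$',
    re.MULTILINE,
)


def set_testcase(script_text, testcase):
    """Return script_text with the TESTCASE name replaced by testcase.

    Assumption: the script holds exactly one TESTCASE assignment.
    """
    new_text, count = TESTCASE_RE.subn(
        lambda m: m.group(1) + testcase + m.group(2), script_text
    )
    if count != 1:
        raise ValueError("expected one TESTCASE assignment, found %d" % count)
    return new_text


# The assignment line as it appears in the streaming example.
sample = 'quietly set TESTCASE {" tc_fftv2_64"}\n'
print(set_testcase(sample, "TV_FFT_36"), end="")
```

Raising an error when the line is missing or duplicated is a deliberate choice here: silently writing an unchanged script would start a simulation of the wrong test case.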

Procedure for Radix-4 Loop Engine Test Cases (Multi-Core)


The procedure is similar to running radix 2 testcases.
The script files are located in \data\dsp_cores\fft_core\SIMULATION\Loop_engine\Radix4\scripts\functional_NX2
The script file names are similar to those of radix 2, and the simulation is done by running the script file run_all_multicore_viarom.do
do ./Loop_engine/Radix4/scripts/functional_NX2/run_all_multicore_viarom.do
The test case selected for simulation is TV_FFT_R4_40
The configuration file for the particular test case is given in the location \data\dsp_cores\fft_core\INTERFACE\multi_core_Loop_engine\NX2\<testcase_name>
The final verdict report will be obtained in the file multi_core_Loop_engine_verdict.rpt in the location \data\dsp_cores\fft_core\TEST\multi_core_Loop_engine\Radix4\functional_NX2.
A detailed report of each test case will be present in \data\dsp_cores\fft_core\TESTBENCH\Loop_engine\simdata\<testcase>\report\<testcase>_nx2.rpt for the N2X device.
Details about the RTL files and Test Bench files are provided in the release notes.
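As stated in the Features list, the Radix-4 Loop engine only supports N points which are powers of 4, so not every transform length used with the radix 2 test cases can be reused here. The Python sketch below checks a candidate N and lists the radix-4 compatible lengths within the core's 64 to 16K range. The function name is_power_of_four is hypothetical.

```python
def is_power_of_four(n):
    """Return True when n equals 4**k for some integer k >= 0."""
    # A power of two has exactly one set bit. For a power of four that bit
    # sits at an even position, which makes the bit length odd.
    return n > 0 and (n & (n - 1)) == 0 and n.bit_length() % 2 == 1


# Transform lengths from 64 to 16K in powers of 2, filtered for radix 4.
radix4_sizes = [n for n in (2 ** k for k in range(6, 15)) if is_power_of_four(n)]
print(radix4_sizes)  # [64, 256, 1024, 4096, 16384]

# N = 8192, used by the radix 2 example TV_FFT_36, is not a power of 4.
print(is_power_of_four(8192))  # False
```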


Revision History
Date          Version       Summary of Changes
04/24/2010    v2.1 ds001    Initial version release
