
White Paper

FPGA Design Methods for Fast Turn Around
“If only I could get my FPGA design done sooner”
March, 2010

Angela Sutton
Staff Product Marketing Manager, FPGA Implementation, Synopsys

Today’s FPGAs are doubling in capacity every 2 years and have already surpassed the 5 million equivalent
ASIC gate mark. With designs of this magnitude, the need for fast flows has never been greater. At the
same time, designers are seeking rapid feedback on their ASIC or FPGA designs by implementing
quick prototypes or initial designs on FPGA-based boards. These prototypes or designs allow designers
to start development, verification and debug of the design (in the context of system software and
hardware) and also to fine-tune algorithms in the design architecture. Quick and intuitive debug iterations
to incorporate fixes are of great value. The ability to perform design updates that don’t completely
uproot all parts of the design that have already been verified is also a bonus! Whether the goal is
aggressive performance or to get a working initial design or prototype on the board as quickly as possible,
this paper provides information on traditional and new techniques that accelerate design and debug
iterations.

Introduction
FPGAs have been used for years to create functioning prototypes of ASIC and System on a Chip (SoC)
designs, and the popularity of this verification technique continues to increase. Indeed, well over 90% of
designs are prototyped using FPGAs. Such verification typically involves squeezing the evolving design,
eventually destined for an ASIC, into the largest, most capable FPGAs available on the market and then
debugging the design along with system software and drivers on the board.

Shorter iteration times for system debug … results stability from one run to the next … team and/or
parallel development flows … ways to quickly make small changes … the holy grail

In addition to prototyping, there is increasing demand for FPGAs in production systems. Growth in
capacity, functionality, and performance, accompanied by a decrease in price per gate of FPGAs fuels
this trend. Low-cost FPGA families such as Cyclone-IV from Altera and Spartan-6 from Xilinx offer
million+ ASIC gate equivalent capacity and come equipped with embedded RAM, microprocessors,
dedicated DSP blocks and Gigabit serial transceivers. Part prices in volume have become very attractive,
ranging from sub-$10 to the low hundreds of dollars. These production designs are typically both large
and challenging, driving a demand for ASIC-style iterative flows and fast design debug cycles. The cost
of a hardware design mistake or update may be lower in an FPGA than in an ASIC (simply repartition
and reprogram the chip with the corrected design), but time is still money when it comes to completing a
design project.

Different solutions for different design objectives?
Designers using FPGAs for production may have aggressive QoR goals, whereas designers using FPGAs
for prototypes may not. This paper details approaches available to FPGA production users with tight timing
constraints (QoR focused designs) as well as an expanded set available to ASIC prototypers who have lower
QoR expectations and slack timing constraints to meet.

The bigger the better!


Iteration times for RTL to FPGA implementation (bitfile) can take several days, depending on how aggressive
your design performance needs are and the computer platform chosen to run the tools.

[Figure 1 graphic: device capacity plotted against equivalent ASIC gates (1M, 2M, 5M), with RTL to bitfile
time growing from minutes to hours to days. Devices shown: Virtex-II Pro (100K LUT), Virtex-4 (200K LUT),
Virtex-5 (330K LUT), Virtex-6 XC6VLX760 (760K LUT), Stratix 4 EP4SE820 (820K LUT). Annotation: 2x the
capacity of the prior generation family.]

Figure 1: FPGA sizes are starting to double with each new generation of silicon, meaning
a design iteration can now take days. The example shown is for Xilinx FPGA families
The shorter the design iteration time the better; the more stable the results from one run to the next the
better, since this simplifies the re-verification and debug process.

Good news! Design schedules can be shortened.

In this paper we look at a variety of techniques to bring design schedules under control for large FPGAs.

Speeding Up Your Design Project


The menu of items available to accelerate your design project will depend upon whether (1) your priority is
Quality of Results (performance QoR), (2) your goal is to create an ASIC prototype, or (3) you have the means to
perform parallel development. Table 1 shows a variety of traditional and new techniques to consider for these 3
scenarios, including “fast synthesis” and “continue synthesis upon error” (a.k.a. complete what you can).



Priority | Best QoR (tight timing constraints) | Quick ASIC Prototype with loose constraints | Fast incremental design updates and design tuning (pipe clean RTL/constraints, team design)
Primary goal | Faster implementation without sacrificing QoR | Fastest time to the board. Ease of use | Results stability and parallel development
Multiprocessing | | Yes | Yes
Server Farm (“hunt for the best performance in parallel”) | Yes | Yes | Yes
Floorplanning (“partition the design”) | | | Yes
Incremental Static Timing Analysis | Yes | Yes | Yes
Fast Synthesis-only iterations with tight constraints | Yes (during initial design tuning) | | Yes (during initial design tuning)
Continue Synthesis Upon Error | Yes | Yes | Yes
Use Tools to Find, Locate, Fix Errors | Yes | Yes | Yes
Fast Synthesis, then P&R with loose constraints | | Yes | Yes
Incremental P&R (e.g. Guided Flow) | | Yes | Yes
Fast P&R Modes (if P&R output is legal) | | Yes | Yes
Block Based (Partition) flows | | | Yes (for parallel development; may not save total runtime)
ASIC Design Import | | Yes |

Table 1: Traditional and new techniques to help speed things up!

Some Traditional Approaches


First, let’s take a look at some of the tried and trusted approaches used today, including server farms,
block based flows, and floorplanning.

When using your server farm, you might apply slight variations in constraints, RTL or settings (such as
Xilinx route settings, or Place and Route effort levels or seeds) and run the variations in parallel on
multiple machines. Then you would compare, contrast and choose the best result, or just learn from the results
that you see. This can be somewhat of a trial-and-error process that helps you hunt for the best performance
and area results but may not always yield the results that you need. See Table 2.

QoR ; Quick Prototype; Incremental Updates/Team Design


Server Farm Hunt for the best way to attain design goals by applying different constraints and RTL scenarios
in parallel on adjacent machines and then comparing the outcome
Useful when… …design runs cleanly though the flow and you want to experiment or hunt for ways to meet
performance/area criteria
Advantage Can deliver good QoR if used in top-down flow.
Easy to automate via scripts.
Disadvantage A trial and error approach for discovering the best performance/area but does not necessarily
help to improve results
Requires access to a lot of machine hardware.
More versions of the design to manage.

Table 2: Server Farms deliver more horsepower to let you try different scenarios in parallel
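
The server farm approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `run_flow` is a hypothetical stand-in that simulates a result rather than launching real synthesis and P&R tools, and the setting names (`effort`, `seed`) and fmax figures are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product
import random


def run_flow(effort: str, seed: int) -> dict:
    """Hypothetical stand-in for one synthesis + P&R run.

    A real version would dispatch the vendor tools to a farm machine and
    parse the timing report. Here we only simulate an achieved fmax.
    """
    rng = random.Random(f"{effort}-{seed}")  # deterministic per variation
    base = {"low": 180.0, "high": 200.0}[effort]  # made-up MHz figures
    return {"effort": effort, "seed": seed, "fmax_mhz": base + rng.uniform(-10, 10)}


def sweep(efforts, seeds, max_workers=4):
    """Run every (effort, seed) variation in parallel and return all results."""
    variations = list(product(efforts, seeds))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda v: run_flow(*v), variations))


results = sweep(["low", "high"], seeds=[1, 2, 3])
best = max(results, key=lambda r: r["fmax_mhz"])
print(f"best: effort={best['effort']} seed={best['seed']} fmax={best['fmax_mhz']:.1f} MHz")
```

Keeping every result, not just the best one, matches the "learn from the results that you see" part of the approach: the spread across seeds tells you how much of a gain is noise.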



Many users are familiar with block based flows that allow you to partition the design into blocks upfront
and then refine the blocks in parallel. These blocks are sometimes used in conjunction with floorplanning,
where space on the chip is allocated upfront to each partition. Examples of block based flows in use today
include the Altera LogicLock flow and the new Xilinx design preservation flow. The Synopsys-Xilinx design
preservation flow involves defining blocks known as compile points upfront at the RTL level in the Synopsys
Synplify Pro or Synplify Premier synthesis tool. Synthesis and place and route are then run. In subsequent runs,
unchanged partitions of the design are preserved all the way from RTL through netlist, placement and routing.
During synthesis runs, only the compile points (partitions) that change are re-synthesized, and then only
these are re-placed and routed. A complete script and GUI based integration exists. Partition information is
shared between Synplify Pro/Premier and the Xilinx ISE Design Suite, which includes the Xilinx place and route
tools. Synplify Pro and Synplify Premier optionally integrate with the ISE PlanAhead floorplanning flow, generating
EDIF to populate each pre-floorplanned block. See Table 3 for a list of advantages and disadvantages of
floorplanning and block based flows.

Synplify Premier itself includes a Design Planner feature which can optionally be used to partition RTL,
working in conjunction with physical synthesis or logic synthesis runs.

Incremental Updates/Team Design

Floorplanning and block based flows: Divide and conquer! Partition your design upfront into blocks that can be
worked on individually. You may update each block separately. If floorplanning, allocate partitions to
defined physical regions on the chip. It is wise to place logic likely to change in partitions
separate from logic that is unlikely to change.

Useful when:
- You want to preserve working blocks that have already been verified.
- You have a design team (each member can work on a given set of blocks in parallel).
- You use floorplanning to meet performance targets or lock down performance.

Advantage:
- Improves results stability, which saves verification time since you don’t have to re-verify parts of
  the design that already work.
- Allows team members to work in parallel.

Disadvantage:
- May cost QoR by imposing block boundaries and physical placement limitations that prevent
  the tools from performing the optimizations required to meet timing.
- With some synthesis tools, may not be useful if your ASIC design contains gated clocks.
- May cost you runtime because the aforementioned boundary and physical constraints limit
  your optimization possibilities, making the tool work harder to meet performance/area goals.
- Constraints setup can be time consuming: you may need to do time budgeting. If
  floorplanning, you will have to allocate resources between partitions to ensure resources are
  available for clock, DSP element and memory resources. Floorplanning and partitioning is
  complex: you may need to keep clock domains within a single partition, keep constants in
  the same partition, reduce cross-partition latency and minimize boundary I/Os, and you will
  need to set physical constraints correctly and optimally. If floorplanning, you will have the
  additional baggage of many physical constraints to manage.

Table 3: Floorplanning/block based flows preserve design results but can cost QoR
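
The idea of preserving unchanged partitions can be illustrated with a small Python sketch. It assumes a content hash per partition as the change test; that mechanism is an assumption made for illustration, since this paper does not describe how the tools actually detect a changed compile point. All names are hypothetical.

```python
import hashlib


def fingerprint(rtl_text: str) -> str:
    """Content hash of one partition's RTL source."""
    return hashlib.sha256(rtl_text.encode("utf-8")).hexdigest()


def partitions_to_rebuild(previous: dict, current_sources: dict) -> list:
    """Return names of partitions whose RTL differs from the previous run.

    previous        -- {partition_name: fingerprint} saved after the last run
    current_sources -- {partition_name: rtl_text} for this run
    """
    return sorted(
        name
        for name, text in current_sources.items()
        if previous.get(name) != fingerprint(text)
    )


# First run: everything is new, so every partition is built.
run1 = {"cpu": "module cpu; endmodule", "dma": "module dma; endmodule"}
saved = {name: fingerprint(text) for name, text in run1.items()}

# Second run: only the dma block was edited.
run2 = dict(run1, dma="module dma; /* fixed */ endmodule")
print(partitions_to_rebuild(saved, run2))  # ['dma']
```

Only the partitions returned here would need to be re-synthesized and re-placed; the rest keep their previous netlist, placement and routing.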

Speeding Up Place and Route (P&R)


P&R typically consumes well over half of the overall design iteration time so it’s vital to speed this design step
up too. FPGA vendors provide “backend” place and route tools customized to their FPGA families and have
sought to improve turnaround times using techniques such as “multiprocessing”, “incremental P&R” and
“fast P&R”.

For multi-million gate designs, however, place and route can take a whole work day to complete. This is
problematic if all you want is a quick iteration to test a small design change or when you just want a quick
initial implementation of your prototype on the board (see Figure 2). To chip away at the P&R runtime, some
place and route tools may be run in fast or lower effort modes (Table 4), sacrificing some QoR. For example,
Altera users may use fast P&R modes in the Quartus backend tools that run the fitter extremely fast and Xilinx
users may choose to apply lower effort levels to shorten P&R runtime.



[Figure 2 graphic: flow diagram. Inputs: RTL, constraints, DesignWare library. Steps shown: Synplify Premier
Analyze, Define CPs (optional), Fast synthesis, Continue synthesis (new; on error with CP flow), netlist or
netlist with black box, global and detailed placement, ISE route, generate bitfile. Feedback loop: tune/fix
RTL and constraints.]

Figure 2: An example of a quick initial implementation or ASIC prototype flow

Quick Prototype Incremental Updates/Team Design


FPGA vendor Fast Synplify Premier (or the P&R tool) minimize the efforts to produce good QoR in the interest of
P&R Modes saving time during routing
Useful when… … tuning the design or when you have non-aggressive timing/area goals
Advantage Saves overall runtime—RTL to the Board
Disadvantage May sacrifice QoR. In some FPGA vendor fast P&R modes, you may get an illegal netlist
(you will be fully aware that this is an issue)

Table 4: Fast (or low effort) Placement and Routing modes speed up the P&R design step
but may cost some QoR

Additionally, users of Synplify Premier and Xilinx ISE P&R tools can perform fast incremental P&R (see Table
5) using the Xilinx “Guided Flow”. This flow, which emphasizes results stability, is useful when you make minor
changes to the design that are not on the critical path. How does it work? The Xilinx ISE P&R tools determine
“what’s changed” by comparing the netlist of the current run with that of the prior run. The key to the success
of this flow is the ability of the Synplify Premier synthesis tool to synthesize reproducible and deterministic
netlists and instance names from one run to the next, for every iteration. In 2007, Synplify Premier introduced
“path group” technology that localizes changes in a synthesized netlist to only those parts of the design where
the RTL or constraints actually changed. Similar RTL and constraints produce similar results: in other words,
a reproducible netlist.



Incremental Updates/Team Design

Incremental P&R: P&R is incremental in the 2nd and subsequent iterations of your design flow, and endeavors to
re-place only those portions of the design where the netlist changed, so long as timing can still be met.
Useful when: Design changes are small and do not reside on a timing critical path. In other words, it must
still be possible to re-place and re-route the changed part of the design without ripping up other
unchanged parts of the design.
Advantage: Can save up to 50% in P&R runtime.
Easy to use since it requires no change to your frontend (synthesis) methodology, nor to your
backend (P&R) methodology other than to designate that you want to run the guided flow in the
backend.
Up to 10 iterations can typically be done before there is a need to re-run the entire P&R from
scratch.
Disadvantage: Use is limited. If your RTL/constraints design change is on the critical path, chances are you
won’t save much P&R runtime because P&R will have to rip up large portions of the design. If
your chip is very full (“highly utilized”), it will be hard to integrate the change without ripping up
other portions of the design.

Table 5: Incremental P&R is useful for minor changes not on the critical path
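
To see why stable instance names matter for the netlist comparison, consider this minimal Python sketch. The data model (a dictionary from instance name to cell type and connected nets) and all instance names are assumptions for illustration; the actual comparison done by the P&R tools is not described in this paper.

```python
def netlist_diff(prior: dict, current: dict) -> dict:
    """Compare two netlists keyed by instance name.

    Each value is (cell_type, connected_nets). Stable instance names from
    run to run are what make this comparison meaningful.
    """
    added = sorted(set(current) - set(prior))
    removed = sorted(set(prior) - set(current))
    changed = sorted(
        name for name in set(prior) & set(current) if prior[name] != current[name]
    )
    return {"added": added, "removed": removed, "changed": changed}


prior = {
    "u_add": ("LUT4", ("a", "b", "sum")),
    "u_reg": ("FDRE", ("sum", "clk", "q")),
}
current = {
    "u_add": ("LUT4", ("a", "b", "sum")),         # untouched: placement can be kept
    "u_reg": ("FDRE", ("sum_fix", "clk", "q")),   # rewired by the RTL change
    "u_fix": ("LUT2", ("sum", "en", "sum_fix")),  # new logic
}
diff = netlist_diff(prior, current)
print(diff)
```

If synthesis renamed `u_add` on every run, it would show up as one removed and one added instance, and its placement could not be reused even though the logic is identical.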

Like Synplify Premier Synthesis, Xilinx and Altera P&R tools have multiprocessing capabilities to reduce
runtime at the cost of some QoR (see Table 6).

Quick Prototype; Incremental Updates/Team Design


Multiprocessing during P&R tool runs the design on 2 or more processors in parallel
P&R
Useful when… When you are tuning your design or have non-aggressive timing/area goals
Advantage Saves P&R and timing analysis runtime by 10% or more

Disadvantage May reduce QoR

Table 6: Multiprocessing during P&R help reduce runtime, but at the cost of QoR

Faster Synthesis
We’ve discussed fast Place and Route (netlist to bitfile). Now let’s look at ways to speed up design synthesis
(RTL to netlist). A faster synthesis iteration that incorporates and gives you feedback on an RTL or constraint
change in 1 hour instead of 3 hours is very valuable. Synthesis time can indeed be cut using Synplify Premier’s
new FAST synthesis mode (see Figure 3), which improves runtimes by 2x to 3x for a small reduction in
overall Quality of Results (area and fmax).
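
Runtime improvements are quoted in this paper both as speedup factors ("2x to 3x") and as percentage savings (the 62% typical synthesis runtime saving in Figure 3). The short Python sketch below converts between the two forms. It is plain arithmetic, not a statement about tool behavior, and the function names are made up for the example.

```python
def speedup_from_saving(saving: float) -> float:
    """Convert a fractional runtime saving into a speedup factor."""
    if not 0.0 <= saving < 1.0:
        raise ValueError("saving must be in [0, 1)")
    return 1.0 / (1.0 - saving)


def saving_from_speedup(speedup: float) -> float:
    """Convert a speedup factor into a fractional runtime saving."""
    if speedup < 1.0:
        raise ValueError("speedup must be >= 1")
    return 1.0 - 1.0 / speedup


print(f"3 h -> 1 h is a {speedup_from_saving(2 / 3):.1f}x speedup")
print(f"2x = {saving_from_speedup(2):.0%} saving, 3x = {saving_from_speedup(3):.0%} saving")
print(f"62% saving = {speedup_from_saving(0.62):.2f}x")
```

A 62% saving works out to roughly 2.6x, which sits inside the quoted 2x to 3x range.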



[Figure 3 graphic: "FAST mode: fastest synthesis results", for FPGA implementers seeking fast initial results.
Flow: RTL, constraints, then Synplify Premier (Analyze, Synthesize), then netlist; FPGA P&R is not run in this
flow. Feedback loop: tune RTL/constraints with tight timing constraints. Callout: 62% typical synthesis runtime
savings (Virtex-5, C2009.03, out of box geomean results, FAST mode ON vs. OFF, same tight timing constraints).]

Figure 3: Fastest synthesis results flow (iterate through synthesis only to tune your RTL/constraints)

When using fast synthesis mode, consider your intent. If it is to tune your RTL and constraints, use this
capability for synthesis-only iterations with your normal tight constraints. If it is a fast iteration from
RTL to bitfile, it is recommended that you use loose timing constraints (lower QoR).

If your intent is fast synthesis-only iterations, use normal constraints with Synthesis FAST mode (see Table 7).

QoR ; Incremental Updates/Team Design


FAST SYNTHESIS for Run synthesis only (not P&R afterwards) with your normal planned timing constraints.
Fastest Synthesis Results Synthesis minimizes efforts to produce good results in the interest of saving time
and synthesis-only
iterations
Useful when… …when you are pipe-cleaning your flow or RTL and just intend to run synthesis—
…when you want to know whether your RTL will synthesize
…when you want to know the approximate results you can get out of the box
Advantage Saves up to 50% synthesis runtime, allowing you to get rapid feedback so you can fix your
RTL and constraints
Disadvantage Sacrifices QoR
Results are useful for netlist results analysis, not for use in subsequent Place and Route

Table 7: Fast Synthesis improves synthesis runtimes by 2x to 3x

Since the Fast Synthesis flow does sacrifice some QoR, it is specifically recommended that you NOT run P&R
on the synthesized netlist when using tight constraints; that netlist does reflect sub-optimal area and timing
results after all. If you ran P&R on this netlist, the runtime benefit may be lost in an increased P&R runtime
because P&R would have to work harder to make up for the QoR lost during synthesis. If your intent is faster
iterations from RTL to bitfile, use loose constraints with Synthesis FAST mode (see Figure 4).



The very same Synplify Premier fast synthesis mode can be used with P&R and loose timing constraints for
lower performance designs such as FPGA prototypes (see Table 8).

[Figure 4 graphic: "FAST mode: fastest board implementation", for ASIC prototypes with slack timing
constraints. Flow: RTL, constraints, then Synplify Premier (Synthesize), then netlist, then FPGA P&R, then
bitfile. Feedback loop: debug design on the board, with easy to meet timing constraints. Callouts: 24% typical
runtime saving (RTL to bitfile); 44% typical runtime saving (RTL to netlist). Conditions: Virtex-5, C2009.03,
ISE 10.1sp3, out of box geomean results, FAST mode ON vs. OFF, 1 MHz global clk timing constraints for
synthesis and P&R.]

Figure 4: Fast Synthesis to produce fastest implementation on the board
(iterate through synthesis and P&R) with loose timing constraints

Quick Prototype; Incremental Updates/Team Design


FAST Synthesis for Run synthesis and then P&R with loose timing constraints. Synthesis tool tries less hard to
Fastest Implementation produce good results in the interest of saving time and saves synthesis runtime. P&R is run to
on the Board generate the bitfile to program the FPGA on the board
Useful when… You want to debug a system on the board and are not going to run at high speed.
You want to implement prototype designs more quickly ready for debug on the board and
incorporate RTL and constraint design changes
Advantage Reduces (RTL  bitfile ) runtime to about ¾ of what it would have been
Disadvantage Useful only if you have low QoR expectations

Table 8: Fast synthesis reduces runtime but sacrifices QoR
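
How much a synthesis saving helps the whole RTL to bitfile iteration depends on how the runtime is split between synthesis and P&R. The Python sketch below shows that relationship. The splits used are hypothetical, since this paper does not state the split for the measured designs; only the 44% synthesis saving is taken from Figure 4.

```python
def overall_saving(synth_fraction: float, synth_saving: float, pnr_saving: float = 0.0) -> float:
    """Fractional RTL-to-bitfile saving from per-step savings.

    synth_fraction -- share of the original runtime spent in synthesis (0..1)
    synth_saving   -- fractional saving within synthesis
    pnr_saving     -- fractional saving within P&R (0 means unchanged)
    """
    pnr_fraction = 1.0 - synth_fraction
    return synth_fraction * synth_saving + pnr_fraction * pnr_saving


# Hypothetical splits between synthesis and P&R; the 44% figure is from Figure 4.
for synth_fraction in (0.3, 0.4, 0.5):
    total = overall_saving(synth_fraction, synth_saving=0.44)
    print(f"synthesis = {synth_fraction:.0%} of runtime -> overall saving {total:.1%}")
```

The smaller the share of runtime spent in synthesis, the less a faster synthesis step can shorten the full iteration, which is why the P&R techniques described earlier matter as well.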

In the Synplify Premier tool, you can use your machine’s multiprocessing capability to synthesize designated
design blocks in parallel on separate processors, reducing your runtimes by up to 30% (see Table 9). You can
specify the maximum number of processors to be used.



Quick Prototype; Incremental Updates/Team Design
Multiprocessing during Synthesis tool runs the design using 2 or more processors in parallel
Synthesis
Useful when… When you are first tuning your design or have non-aggressive timing/area goals
Advantage Saves synthesis runtime by up to 30% allowing you to get rapid feedback and tune your RTL
and constraints

Disadvantage Generally reduces QoR. May frequently be used with block based flows which can further limit
QoR

Table 9: Multiprocessing during synthesis allows rapid feedback

Traditionally, if a small number of errors are encountered during synthesis, the synthesis tool will promptly
abort the run. This can result in huge design delays if there are a lot of errors because each error will have to
be detected and fixed piecemeal. Suppose that your design synthesis run encounters 100 errors of just 5
different types, and that your flow aborts and errors out whenever a cumulative total of 3 errors have been
encountered. You fix the first 3 errors and then re-start your synthesis run. Then the next 3 errors surface; they
are similar to the previous 3. You fix them and you have to start synthesis again. Wouldn’t it be better to know
about all 100 errors after 1 synthesis run rather than having to flush the errors 3 at a time? This is possible
with the Synplify Premier Synthesis product thanks to a new “continue synthesis on error” feature (see
Table 10). When possible, the synthesis tool will black box the erroneous portion of the design and continue
to synthesize the remainder of the design. Under the hood, Synplify Premier is automatically partitioning the
design for parallel synthesis. Good, error-free partitions complete while those with errors are black boxed.

QoR ; Quick Prototype; Incremental Updates/Team Design


Continue Synthesis upon During Synthesis, complete what you can in the presence of coding errors, and then fix your
Error project files in aggregate
Useful when… You are pipe cleaning your design project files—and your files still have hundreds of errors in
them. You would rather save time by finding all the errors in one synthesis run and fix them in
aggregate, than find and fix each error, one at a time
Advantage Saves time—Fix all the errors in one go rather than run synthesis—find an error—fix the error—
rerun synthesis from the beginning—find the next error—rerun synthesis from the beginning

Table 10: Synplify Premier Synthesis finds all errors in a design during a single synthesis run
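
The 100-error example above can be put into numbers with a small Python sketch. It also includes a toy model of continuing past errors by black boxing the affected modules. Both functions are illustrative assumptions, not the tool’s implementation, and the module names and error message are made up.

```python
import math


def runs_to_surface_all(total_errors: int, abort_threshold: int) -> int:
    """Synthesis runs needed to see every error when each run aborts
    after `abort_threshold` errors and all reported errors get fixed."""
    return math.ceil(total_errors / abort_threshold)


def synthesize_continue_on_error(modules: dict) -> tuple:
    """Toy continue-on-error pass.

    modules -- {module_name: list of error messages (empty if clean)}
    Returns (synthesized, black_boxed, all_errors) after a single pass.
    """
    synthesized, black_boxed, all_errors = [], [], []
    for name, errors in modules.items():
        if errors:
            black_boxed.append(name)  # keep going, leave a black box
            all_errors.extend(f"{name}: {e}" for e in errors)
        else:
            synthesized.append(name)
    return synthesized, black_boxed, all_errors


print(runs_to_surface_all(100, 3))  # 34 runs with abort-after-3
design = {"alu": [], "uart": ["undeclared signal tx_en"], "fifo": []}
print(synthesize_continue_on_error(design))
```

With an abort threshold of 3, it takes 34 runs just to see all 100 errors; with continue-on-error, one pass reports them all and still produces a netlist for the clean modules.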

When errors do occur in your project files, figuring out how to fix them can also be time-consuming. Synplify
Premier hyperlinks your error/warning report to useful documentation that helps you to identify a fix.
You can filter these errors/warnings by type so that you can work only on those errors or warnings that are of
interest (see Table 11).

QoR ; Quick Prototype; Incremental Updates/Team Design


Locate and fix errors During Synthesis, errors and warning are placed in a report that is automatically hyperlinked to
quickly during synthesis documentation to help you understand what to do to fix the problem. You may also use FIND or
FIND-IN-FILE features to locate and fix the problems in your source code
Useful when… You are initially running a design or making changes to the design and incur design errors. FIND
is useful when you have a very large design and need to locate those parts that you wish to
improve or change quickly
Advantage Easier and faster to identify and fix issues within the design… Quick references from error/
warning to the documentation helps you determine the cause of the problem

Table 11: An error and warning report is generated allowing you to quickly identify and fix errors in
the design in aggregate



Synplify Premier also includes TCL/FIND features that allow you to locate instances in the code, for
example those with negative slack, and improve and debug that part of the design. There is also a “find in
file” feature that allows you to search for strings quickly across specified projects or file types, allowing you to
locate the RTL source that you need to fix.
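
As a rough illustration of the kind of query described above, the Python sketch below picks out the instances with negative slack from a timing report. The record layout and instance names are hypothetical, and this is not the syntax of the Synplify Premier TCL/FIND commands.

```python
def find_negative_slack(paths: list, limit: int = 10) -> list:
    """Return the worst failing paths (slack < 0), most negative first."""
    failing = [p for p in paths if p["slack_ns"] < 0]
    return sorted(failing, key=lambda p: p["slack_ns"])[:limit]


timing_report = [
    {"instance": "u_core/mult_reg", "slack_ns": -0.42},
    {"instance": "u_io/rx_sync", "slack_ns": 1.10},
    {"instance": "u_core/acc_reg", "slack_ns": -1.05},
]
for path in find_negative_slack(timing_report):
    print(f"{path['instance']}: {path['slack_ns']:+.2f} ns")
```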

QoR ; Quick Prototype; Incremental Updates/Team Design


Incremental Static Timing Update exception constraints such as multi-cycle paths or false paths and see results reflected
Analysis during synthesis in revised timing reports without re-running synthesis. You can also generate a new incremental
netlist/ constraints file to forward annotate to P&R without re-running synthesis.
Useful when… …it becomes apparent after synthesis that you did not completely specify multi-cycle and false
paths
Advantage Saves a synthesis iteration—You can continue to run P&R using the updated constraints
Disadvantage You may get better quality of results (better area and possibly timing) by rerunning synthesis
if the exception occurred along what synthesis believed to be a critical path. For example,
synthesis may have compromised area by previously optimizing a path that it thought was
critical and that was in fact a false or multi-cycle path

Table 12: Incremental Static Timing Analysis allows you to change exception constraints and see the results
reflected immediately in the timing report, without the need to run synthesis
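
Why can exception constraints be re-evaluated without re-running synthesis? Because they change the time allowed for a path, not the path delay itself. The Python sketch below shows this with a deliberately simplified setup-slack model that ignores setup time, clock skew and uncertainty. Path names and delays are made up for the example.

```python
def path_slack(delay_ns: float, period_ns: float, exception: str = None, cycles: int = 1):
    """Setup slack of one register-to-register path.

    Simplified model: ignores setup time, clock skew and uncertainty.
    exception -- None, "multicycle" or "false"
    Returns None for a false path, because it is not timed at all.
    """
    if exception == "false":
        return None
    allowed = period_ns * (cycles if exception == "multicycle" else 1)
    return allowed - delay_ns


def retime(paths: dict, period_ns: float, exceptions: dict) -> dict:
    """Recompute slacks from stored path delays; the netlist is untouched."""
    return {
        name: path_slack(delay, period_ns, *exceptions.get(name, (None, 1)))
        for name, delay in paths.items()
    }


delays = {"cfg_to_core": 14.0, "dbg_to_pad": 30.0, "alu_to_acc": 9.0}  # ns, made up
before = retime(delays, period_ns=10.0, exceptions={})
after = retime(delays, period_ns=10.0, exceptions={
    "cfg_to_core": ("multicycle", 2),
    "dbg_to_pad": ("false", 1),
})
print(before)
print(after)
```

The stored delays are reused as-is, which is also why rerunning synthesis can still pay off: a path that synthesis worked hard on before it was declared false or multi-cycle may have cost area unnecessarily.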

Dealing with the “Moving Target” or “Pieces Missing” Design


“My design source files are a moving target …I need fast respins!!”
“Part of the Source code is not available!!”

Getting the design through your FPGA flow can be a challenge, especially when various pieces of the design
are still evolving. If you are creating ASIC prototypes, the team generating the ASIC source RTL could well be
changing that source underneath you every week. Your challenge then would be to respin the ASIC prototype
and provide feedback to the ASIC team on the source files faster than they are changing the source!! And,
pieces of the design may be unavailable or incomplete. Chances are that you will need:

- Fast turnaround time using some of the flows previously described in this paper
- Reproducible and stable results from one run to the next, e.g. if the ASIC team makes a small RTL change
  to a file, it will only trigger a small change in the resulting FPGA netlist. As previously described, Synplify
  Premier applies “path group” technology to localize small changes in the RTL to small changes in the
  resulting netlist
- The ability to synthesize in the absence of some source files. Modules of your design may be
  incomplete or unavailable since the source files are still being worked on, but you may want to get a head
  start and synthesize the modules that are already complete. You can do this using a compile-point block based
  flow by designating the part of the design that is incomplete as a black box
- The flexibility to swap out changed files and manage hundreds of design files. Some ways to do this are
  described below
- If prototyping, fast ASIC design import: an FPGA flow that accepts your (ASIC) files easily without manual
  modification. Considerations and solutions are outlined in the section on faster ASIC design import below



Managing, visualizing and swapping in and out hundreds of design files
When there are hundreds of files to manage, Synplify Premier allows you to perform hierarchical design
project management. You can organize design files into subdirectories. This makes it easier to swap out
changed files and manage hundreds of design files (see Figure 5).

Figure 5: The ability to organize design source files hierarchically is important

Updating select portions of the design (e.g. internal IP or DesignWare IP)


Synplify Premier also allows you to specify, package and integrate IP in the industry standard IP-XACT
format. You can assemble and connect the IP at the system level using its SystemDesigner system-level assembly
feature. DesignWare Cores can be directly imported. Simply configure your core in the DesignWare coreTools
and generate project source files ready for import into Synplify Premier and the VCS simulator (see Figure 6).

Figure 6: Synopsys coreConsultant configures DesignWare cores and creates a Synplify Premier or
Synplify Pro-ready project file (scripts and source file)



Faster ASIC Design Import: An FPGA Flow that Accepts Your ASIC Design Files
A key question to ask yourself when using an FPGA-based prototype is: Will your ASIC source files
work in an FPGA flow? RTL architected and tuned for your ASICs may not be automatically accepted or
comprehended by an FPGA synthesis tool. The table below lists the most frequent issues encountered
and how the FPGA synthesis tool that reads these ASIC designs must address each issue. In addition, your
FPGA synthesis tool will generate a design using FPGA building block primitives such as LUTs, registers,
DSP elements and dedicated memories. These building blocks, as well as the FPGA’s clocking schemes
and resources, are fundamentally very different from what you see in an ASIC fabric, so you need an FPGA
synthesis tool that understands how to use the FPGA resources, how to deal with resource restrictions and
design rules, and how to apply the resources to serve the same function as in the ASIC.

ASIC design contains: Gated clocks (used in ASICs to reduce ASIC power consumption)
The FPGA compatibility issue is: FPGAs have no true equivalent of a gated clock. Also, when gated clocks are
used in a partitioned block based flow, clock management, allocation and correct implementation across block
boundaries is a challenge, so many FPGA tools don’t support gated clocks in block based flows.
The FPGA synthesis tool must: Convert gated clocks to the logical equivalent without changing the intended
functionality; automatically convert a gated clock to the FPGA equivalent (a register with a clock enable).
Support gated clocks even when a clock exists in multiple blocks of a partitioned chip.

ASIC design contains: DesignWare IP
The FPGA compatibility issue is: The tool must understand the meaning of and implement any DesignWare Building
Block instantiation or function when it encounters one in the RTL. It must read any configuration of digital IP
core, even if it is encrypted.
The FPGA synthesis tool must: Accept RTL that includes instantiations of DesignWare IP building blocks (and
synthesize them with reasonable performance results). Accept designs generated and configured by Synopsys
coreTools.

ASIC design contains: Your own or 3rd party IP
The FPGA compatibility issue is: May be encrypted in a way that the synthesis tool cannot read. May have been
highly optimized for ASIC and thus lack performance in an FPGA.
The FPGA synthesis tool must: Preserve boundaries for the IP if requested. Time through IP. In some cases,
internally decrypt but protect the IP.

ASIC design contains: Embedded memory functions
The FPGA compatibility issue is: The FPGA tool may not recognize something in your RTL as a memory. The user
has to write specific memory models that work for FPGAs. Some FPGA-vendor specific memory cores are encrypted,
making it impossible to simulate the synthesized design (because the memories are black boxed).
The FPGA synthesis tool must: Implement behavioral synthesis capabilities (e.g. Synplify Premier’s SynCore) that
generate RTL for memory in a way that the FPGA tool recognizes and can implement optimally.

ASIC design contains: Extensive language support (VHDL, Verilog, SystemVerilog, VHDL 2008)
The FPGA compatibility issue is: Language support may lag the ASIC tool’s support, in particular with respect to
SystemVerilog support.
The FPGA synthesis tool must: Ensure compatibility with the most commonly used synthesizable ASIC RTL.
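
The gated clock conversion mentioned in the table can be checked with a small behavioral model. The Python sketch below simulates both styles on half-cycle samples and confirms they produce the same register output. It assumes the enable and data change only while the clock is low (glitch-free gating); the waveforms and function names are made up for illustration, and this is not how a synthesis tool performs the conversion.

```python
def rising_edges(wave):
    """Indices where a 0/1 waveform goes from 0 to 1."""
    return [i for i in range(1, len(wave)) if wave[i - 1] == 0 and wave[i] == 1]


def simulate_gated_clock(clk, en, d, q0=0):
    """ASIC style: flop clocked by (clk AND en)."""
    gclk = [c & e for c, e in zip(clk, en)]
    edges = set(rising_edges(gclk))
    q, out = q0, []
    for i in range(len(clk)):
        if i in edges:
            q = d[i]
        out.append(q)
    return out


def simulate_clock_enable(clk, en, d, q0=0):
    """FPGA style: flop on the free-running clk with a clock-enable pin."""
    edges = set(rising_edges(clk))
    q, out = q0, []
    for i in range(len(clk)):
        if i in edges and en[i]:
            q = d[i]
        out.append(q)
    return out


# Half-cycle samples. en and d change only while clk is low (glitch-free gating).
clk = [0, 1, 0, 1, 0, 1, 0, 1]
en  = [1, 1, 0, 0, 1, 1, 0, 0]
d   = [1, 1, 0, 0, 0, 0, 1, 1]
assert simulate_gated_clock(clk, en, d) == simulate_clock_enable(clk, en, d)
print(simulate_clock_enable(clk, en, d))
```

The clock-enable version keeps the register on the free-running clock, which is what lets the design use the FPGA’s dedicated clock resources.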



Putting it all together
When Quality of Results (QoR) is your priority, you can deploy high horsepower techniques such as
multiprocessing, and use “FAST synthesis” and “continue synthesis upon error” techniques only during initial
design tuning to reduce synthesis iteration times. Multiprocessing may also be applied during initial design
tuning. Your final run will use the slower normal synthesis to achieve the best QoR (see Figure 7).

[Figure 7 graphic: flow diagram with these stages. Import: read design. Configure: server farm,
multiprocessing. Run synthesis: constraints check, fast synthesis (normal constraints), continue synthesis on
error. Debug RTL/netlist/constraints: error report, links to docs, fix black boxed erroneous modules, fix
RTL/constraints (TCL/find etc). Re-run synthesis: fast synthesis (normal constraints). Run P&R: normal P&R.
Debug RTL/constraints on the board. Re-run synthesis: normal synthesis. Re-run P&R: normal P&R.]

Figure 7: Fast QoR flow

When turnaround time from RTL to bitfile implementation on the board is the priority, you may continue to
use fast synthesis for subsequent iterations, together with fast, incremental or normal Place and Route modes.
Block based flows and multiprocessing are also useful tools in your tool chest. See Figure 8.



[Figure 8 graphic: flow diagram with these stages. Import: ASIC design project import, DesignWare cores
configs and DesignWare building blocks. Configure: parallel development/partition design or floorplan, server
farm, multiprocessing, instrument RTL. Run synthesis: constraints check, fast synthesis (loose constraints),
continue synthesis on error. Debug RTL/netlist/constraints: error report, links to docs, fix black boxed
erroneous modules, fix RTL/constraints (TCL/find etc). Re-run synthesis: fast synthesis. Run P&R: fast P&R or
normal P&R. Debug RTL/constraints on the board. Re-run synthesis: fast synthesis or incremental static timing
analysis. Re-run P&R: incremental or normal P&R.]

Figure 8: Example flow for a Quick ASIC Prototype, where the priority is fast board implementation and fast respins

These techniques for QoR and Quick Prototype design were summarized in Table 1.

Conclusion
Users of large FPGAs can get their products out the door much faster when design turnaround time is
reduced by using some or all of the methods described in this paper. Additionally, it is very valuable to have
results stability from one design run to the next when incorporating changes and to have the ability to quickly
integrate these changes and see the results. As FPGAs get larger, the engineering teams developing for them are
also growing, which requires new parallel design methodologies to be adopted.

At the same time, users generally don’t desire disruptive changes to the design methodology and, when
prototyping, hope that the methodology will not require significant changes to the ASIC project files for
them to be accepted by the FPGA flow. Synplify Premier delivers a menu of technologies including “fast
synthesis” and “continue synthesis upon error” technology, block based flows, incremental flows, and ASIC
compatibility for prototypers. These capabilities help large designs to be delivered on schedule.

Synopsys, Inc., 700 East Middlefield Road, Mountain View, CA 94043, www.synopsys.com

©2010 Synopsys, Inc. All rights reserved. Synopsys is a trademark of Synopsys, Inc. in the United States and other countries. A list of Synopsys trademarks is
available at http://www.synopsys.com/copyright.html. All other names mentioned herein are trademarks or registered trademarks of their respective owners.
03/10.MH.10-18253.
