
01. The Xilinx ISE design environment: design entry modules


2.2. Design Entry Modules
Design entry is the first step in the design flow. During this step, source
files are created in order to represent the design. The top-level design
source file can have one of several formats, including an HDL file, a
schematic file, or an EDIF or NGC/NGO netlist.
For designs with an HDL or schematic file as the top-level source file,
lower-level source files can have several formats, including HDL files,
schematic files, IP cores, and netlists. For designs with an EDIF or NGC/NGO
netlist as the top-level source file, EDIF or NGC/NGO files are the only
source file types allowed in the project.
2.2.1. HDL Editor
The HDL editor allows creating a file in a hardware description language,
such as VHDL or Verilog, which describes the behavior and structure of the
design. Using HDLs offers the following advantages:
- Synthesis decreases design time by eliminating the need to define every
  gate. In addition, the synthesis tool can automate the process, using
  several encoding styles for state machines or performing automatic I/O
  buffer insertion during optimization, resulting in greater efficiency.
- The functionality of the design can be verified early by simulating the HDL
  description. Design simulation at the gate-level before implementation
  allows evaluating the architectural and design decisions.


2.2.2. Schematic Editor
The schematic editor allows creating a visual representation of the designed
system.
The editor can be used for the top-level design file, the lower-level design
files, or both.
A schematic can represent a top-level design, and the lower-level modules can
be created using any of the following source types: HDL files, CORE Generator
IP cores, Architecture Wizard IP cores, or schematic files.
Schematics can also be used to define the lower-level modules of the design.
If the top-level design file is a schematic file, then a schematic symbol
must be created from each lower-level module and then instantiated in the
top-level schematic file. If the top-level design file is an HDL file, an
HDL instantiation template must be created from the schematic, and then the
template must be instantiated in the top-level HDL file.
All schematic files are ultimately converted to either VHDL or Verilog
structural netlists before being passed on to the synthesis tool during the
synthesis step.
2.2.3. CORE Generator Software
The CORE Generator software reduces design time by providing access to
parameterized IP (Intellectual Property) cores for Xilinx FPGA devices. The
software provides a catalog of architecture-specific, domain-specific
(embedded, connectivity, DSP), and product-specific (automotive, consumer
electronics, military equipment, communications) IP cores.
These user-customizable IP cores range in complexity from commonly used
modules, such as memories and FIFO memories, to system-level building blocks.
Using these IP cores can significantly reduce the design time.
The CORE Generator software includes the following types of IP cores:
accumulator, multiplier, complex multiplier, etc.);
Logic Analyzer, Virtual Input/Output;
-X;



02. The Xilinx ISE design environment: synthesis module
After design entry and optional simulation, the next step in the design flow
is synthesis. The Xilinx ISE Design Suite environment includes the
Xilinx Synthesis Technology (XST) software, which synthesizes VHDL or Verilog
designs to create Xilinx-specific netlist files, known as NGC files. Unlike
outputs from other tools, which consist of an EDIF file with an associated
NCF constraints file, NGC files contain both logical design data and
constraints. The NGC file is then accepted as input to the translate step of
the design implementation process.

In addition to NGC files, the XST software also generates the following
outputs: a synthesis report, an NGR file with the RTL (Register Transfer
Level) schematic, and a technology schematic. The synthesis report contains
the results from the synthesis run, including an area and timing estimation.
The RTL schematic is a representation of the design before optimization,
using generic symbols such as adders, multiplexers, counters, and gates. This
schematic is generated after the HDL synthesis phase of the synthesis
process. The technology schematic is a representation of an NGC file using
logic elements optimized to the target architecture or technology. This
schematic is generated after the optimization phase of the synthesis process.

First, the HDL code is parsed: XST checks whether the HDL code is correct
and reports syntax errors. If there are no syntax
errors, the HDL synthesis is performed. XST analyzes the HDL code and
attempts to infer specific design building blocks or macros (such as
multiplexers, memories, adders, subtractors) for which it can create
efficient technology implementations.
To reduce the number of inferred macros, XST performs a resource sharing
check that leads to a reduction of the area and an increase in the clock
frequency. At this step, the XST software recognizes the Finite State
Machines (FSMs) independent of the modeling style used. To create the most
efficient implementation, XST uses the optimization criterion that has been
specified (area or speed) to determine the FSM encoding algorithm that will
be used.
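As an illustration only, the following minimal sketch shows one common way of modeling a state machine, with the state held in an enumerated type so that no encoding is fixed in the source. The entity fsm_sketch, its ports, and the state names are hypothetical. How a particular XST version extracts and encodes this machine is not stated in the text and is not assumed here.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical two-state machine used only to illustrate the modeling style.
entity fsm_sketch is
  port (clk, rst, start, done: in std_logic;
        busy: out std_logic);
end fsm_sketch;

architecture behavioral of fsm_sketch is
  -- Enumerated state type: the source does not fix a binary encoding.
  type state_type is (idle, run);
  signal state: state_type;
begin
  -- State register with synchronous reset and next-state logic.
  process (clk)
  begin
    if clk'event and clk = '1' then
      if rst = '1' then
        state <= idle;
      else
        case state is
          when idle =>
            if start = '1' then
              state <= run;
            end if;
          when run =>
            if done = '1' then
              state <= idle;
            end if;
        end case;
      end if;
    end if;
  end process;

  -- Output depends only on the current state.
  busy <= '1' when state = run else '0';
end behavioral;
```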
The last step of the synthesis is the low-level optimization. XST transforms
inferred macros and general logic into a technology-specific implementation.
03. The Xilinx ISE design environment: implementation modules
1.1. Implementation Modules
The implementation process consists of three stages:
- translation;
- mapping;
- placement and routing.
1.1.1. Translation Module
The NGDBuild software is the Xilinx tool that is used to perform the
translation step.
The NGDBuild tool generates a logical description of the design and a
description of the original hierarchy based on the input netlist file.

The format of the input netlist file is either EDIF or NGC. An NGC file,
which is created by the Xilinx XST software, is a binary file containing both
logical design data and constraints. NGDBuild also accepts as inputs NMC
library files, containing the definition of physical macros that can be
instantiated into a design. Other inputs to the NGDBuild software are UCF and
NCF constraints files. A UCF file (User Constraints File) is an ASCII
file containing timing and placement constraints that affect how the logical
design is implemented in the target device. An NCF file (Netlist Constraints
File) is created by a certain vendor toolset and contains constraints
specified within that toolset.
The NGDBuild software generates an intermediary NGO file containing a logical
description of the design in terms of its original components and their
hierarchy. The outputs of the NGDBuild software are an NGD (Native Generic
Database) file and a build report BLD file. The NGD file is a binary file
that contains:
- a logical description of the design in terms of its original components,
  their hierarchy, and the primitives (gates, look-up tables, flip-flops,
  RAMs) to which the design is reduced;
- the definition of physical macros from the NMC files.
1.1.2. Mapping Module
The MAP software performs the mapping of the logical design to the available
logical resources of the target FPGA device.

The main input file to the MAP software is an NGD file, which is generated
during the translate process by the NGDBuild tool. MAP also accepts as inputs
NMC files containing the definition of physical macros.
The MAP software first performs a design rule check on the design in the
input NGD file. The next step is mapping the design logic to the components
of the FPGA device. The mapping is based on the mapping directives.
The main output of the MAP software is an NCD (Native Circuit Description)
file, which is a physical representation of the design mapped to the
components in the target FPGA device, such as configurable logic blocks
(CLBs) and I/O blocks (IOBs). The mapped NCD file can then be placed and
routed using the PAR software. Optionally, the mapped NCD file can be used
as a guide file and applied as an input to guide a later run of the MAP
software.
MAP also generates a PCF file (Physical Constraints File), which is an ASCII
text file containing constraints specified during design entry, expressed in
terms of timing restrictions, physical elements and other attributes placed
in a UCF or NCF file. Other outputs are an NGM file, which is used for back-
annotation and contains logical and physical information about the mapping
process, and a MAP report (MRP) file.

1.1.3. Placement and Routing Module
After the mapping module generates the NCD file, the design can be placed and
routed using the PAR software. PAR accepts a mapped NCD file and a PCF file
as inputs, places and routes the design, and then generates a placed and
routed NCD file to be used by the configuration bitstream generator module.
The place and route process is structured in two phases:
- Placement. PAR writes the NCD file after all the placer phases are
  complete. The components are placed into locations based on factors such
  as constraints specified in the PCF file, length of connections, and
  available routing resources.
- Routing. The design is routed in several phases of the router module. This
  module performs a converging procedure for a solution that fully routes
  the design and meets timing constraints. After the design is fully routed,
  PAR generates an NCD file that can be used for timing analysis. The PAR
  software writes a new NCD file as the routing improves throughout the
  router phases.
The placement and routing operations can be performed in two modes: timing-
driven and cost-based.
Timing-driven placement and routing is based on the Xilinx timing analysis
software, an integrated static timing analysis tool. Placement and routing
are executed according to timing constraints specified in the beginning of
the design process. The timing analysis tool interacts with the PAR software
to ensure that the timing constraints imposed on the design are met.
Cost-based placement and routing is performed using various cost tables that
assign weighted values to relevant factors such as constraints, length of
connections, and available routing resources. Cost-based placement and
routing is used if no timing constraints are present in the design.

The PAR software may accept as input an optional guide file, which is a
placed and routed NCD file that can be used as a guide for placing and
routing the design. The PAR file illustrated in the figure is a report file
that contains summary information of all placement and routing iterations.
04. Structure and execution of a process
A process is a sequence of statements that are executed in the specified
order. The process declaration delimits a sequential domain of the
architecture in which the declaration appears. Processes are used for
behavioral descriptions.
1.1.1. Structure and Execution of a Process
A process may appear anywhere in an architecture body (the part starting
after the begin keyword). The basic structure of a process declaration is the
following:
    [name:] process [(sensitivity_list)]
      [type_declarations]
      [constant_declarations]
      [variable_declarations]
      [subprogram_declarations]
    begin
      sequential_statements
    end process [name];
The process declaration is contained between the keywords process and end
process. A process may be assigned an optional name for simpler
identification of the process in the source code. The name is an identifier
and must be followed by the ':' character.
This name is also useful for simulation, for example, to set a breakpoint in
the simulation execution. The name may be repeated at the end of the
declaration, after the keywords end process.
The optional sensitivity list is the list of signals to which the process is
sensitive. Any event on any of the signals specified in the sensitivity list
causes the sequential instructions in the process to be executed, similar to
the instructions in a usual program. As opposed to a programming language, in
VHDL the end process clause does not specify the end of process execution.
The process will be executed in an infinite loop. When a sensitivity list is
specified, the process will only suspend after the last statement, until a
new event is produced on the signals in the sensitivity list. Note that an
event only occurs when a signal changes value. Therefore, the assignment of
the same value to a signal does not represent an event.
When the sensitivity list is missing, the process will be run continuously.
In this case, the process must contain a wait statement to suspend the
process and to activate it when an event occurs or a condition becomes true.
When the sensitivity list is present, the process cannot contain wait
statements.
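The relation between the two forms can be sketched as follows. The entity name and_gate and the process labels are hypothetical, chosen only for illustration. Both processes are meant to describe the same behavior: a process with a sensitivity list behaves like a process without one that ends with a wait on statement naming the same signals.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: two outputs driven by two equivalent process forms.
entity and_gate is
  port (a, b: in std_logic;
        x, y: out std_logic);
end and_gate;

architecture behavioral of and_gate is
begin
  -- Form 1: sensitivity list; no wait statement may appear inside.
  with_list: process (a, b)
  begin
    x <= a and b;
  end process with_list;

  -- Form 2: no sensitivity list; the process suspends at the wait
  -- statement until an event occurs on a or b, then starts again.
  with_wait: process
  begin
    y <= a and b;
    wait on a, b;
  end process with_wait;
end behavioral;
```

In simulation, both processes run once during initialization and then suspend until a or b changes.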
The declarative part of the process is contained between the process and
begin keywords. This part may contain declarations of types, constants,
variables, and subprograms (procedures and functions) that are local to the
process. Thus, the declared items can only be used inside the process.
Note: Signals cannot be declared inside a process; of the data objects, only
constants and variables may be declared.
The statement part of the process starts after the begin keyword. This part
contains the statements that will be executed on each activation of the
process. Concurrent statements are not allowed inside a process.
We present the declaration of a simple process composed of a single
sequential signal assignment.
    proc1: process (a, b, c)
    begin
      x <= a and b and c;
    end process proc1;
05. Processes in the VHDL language
2.1. Processes with Incomplete Sensitivity Lists
Some synthesis tools may not check for sensitivity lists of processes. These
tools may assume that all signals on the right-hand side of sequential signal
assignments are in the sensitivity list. Thus these synthesis tools will
interpret the two processes in the example to be identical.
Example
    proc6: process (a, b, c)
    begin
      x <= a and b and c;
    end process proc6;

    proc7: process (a, b)
    begin
      x <= a and b and c;
    end process proc7;
All synthesis tools will interpret process proc6 as a 3-input AND gate. Some
synthesis tools will also interpret process proc7 as a 3-input AND gate, even
though when this code is simulated, it will not behave as such. While
simulating, a change in value of signal a or b will cause the process to
execute, and the value of logical AND of signals a, b, and c will be assigned
to signal x. However, if signal c changes value, the process is not executed,
and signal x is not updated.
Because it is not clear how a synthesis tool should generate a circuit for
which a transition of signal c does not cause a change of signal x, but for
which a change of signals a or b causes signal x to be updated with the
logical AND of signals a, b, and c, there are the following alternatives for
the synthesis tools:
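- to interpret process proc7 as identical to proc6 (with a sensitivity list
  that includes all signals on the right-hand side of any signal assignment
  statement within the process);
- to issue an error stating that the process cannot be synthesized without a
  complete sensitivity list.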
The second variant is preferable, because the designer will have to modify
the source code so that the functionality of the generated circuit will match
the functional simulation of the source code.
Although it is syntactically legal to declare processes without a sensitivity
list or a wait statement, such processes never suspend. Therefore, if such a
process were simulated, the simulation time would never advance because
the initialization phase, in which all processes are executed until
suspended, would never complete.
2.2. Combinational and Sequential Processes
Both combinational and sequential processes are interpreted in the same way
at synthesis, the only difference being that for sequential processes the
output signals are stored into registers. A simple combinational process is
described in the example.
Example
    proc8: process
    begin
      wait on a, b;
      z <= a and b;
    end process proc8;
This process will be implemented by synthesis as a two-input AND gate.
For a process to model combinational logic, it must contain in the
sensitivity list all the signals that are inputs of the process. In other
words, the process must be reevaluated every time one of the inputs to the
circuit it models changes. In this way combinational logic is correctly
modeled.
If a process is not sensitive to all its inputs and it is not a sequential
process, then it cannot be synthesized, since there is no hardware equivalent
of such a process. Not all synthesis tools enforce such a rule, so great care
should be taken in the design of combinational processes in order not to
introduce errors in the design. Such errors will cause subtle differences
between the simulated model and the circuit obtained by synthesis, because a
noncombinational process is interpreted by the synthesizer as a combinational
circuit.
If a process contains a wait until statement or an if signal'event
statement, the process will be interpreted as a sequential process. Hence the
process in the example will be interpreted as a sequential process.
Example
    proc9: process
    begin
      wait until clk = '1';
      z <= a and b;
    end process proc9;
By synthesizing this process, the circuit in Figure 1 will result, where a
flip-flop is added on the output.
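The same registered AND function can also be written with a sensitivity list and the if signal'event form mentioned above. The following is a minimal sketch; the entity name and_reg and the label proc9_list are hypothetical. It is expected to describe the same flip-flop on the output as proc9, since the assignment happens only on an event on clk that leaves clk at '1'.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity wrapping the clocked process.
entity and_reg is
  port (clk, a, b: in std_logic;
        z: out std_logic);
end and_reg;

architecture behavioral of and_reg is
begin
  -- Sensitive to clk only: a and b are sampled on the rising clock edge.
  proc9_list: process (clk)
  begin
    if clk'event and clk = '1' then
      z <= a and b;
    end if;
  end process proc9_list;
end behavioral;
```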

06. Synthesis of if statements
2.4. Synthesis of if Statements
An if statement may be implemented by a multiplexer. First consider the if
statement without any elsif branch. The example presents the use of such a
statement to describe a comparator.

Example
    library ieee;
    use ieee.std_logic_1164.all;
    entity comp is
      port (a, b: in std_logic_vector (7 downto 0);
            equal: out std_logic);
    end comp;
    architecture functional of comp is
    begin
      process (a, b)
      begin
        if a = b then
          equal <= '1';
        else
          equal <= '0';
        end if;
      end process;
    end functional;
The previous example tests the equality of two signals of type
std_logic_vector, representing two 8-bit vectors, and gives a result of type
std_logic. The resulting circuit is shown in the figure below. Note that, as
with other examples, in practice the synthesis tool will remove the
inefficiencies in the circuit (in this case, the constant inputs to the
multiplexer) to give a minimal solution.

A multi-branch if statement, in which at least one elsif clause appears, is
implemented by a multi-stage multiplexer. Consider the if statement in this
example.
Example
    process (a, b, c, s0, s1)
    begin
      if s0 = '1' then
        z <= a;
      elsif s1 = '1' then
        z <= b;
      else
        z <= c;
      end if;
    end process;
The result of implementing the if statement from the previous example is
presented in the figure below. This circuit is equivalent to the one that
results from implementing a conditional signal assignment, which is a
concurrent statement.
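For comparison, the conditional signal assignment that corresponds to the if statement above can be sketched as follows. The entity name mux_priority and the architecture name dataflow are hypothetical. The order of the when clauses carries the same priority as the order of the if and elsif branches.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity for the priority selection of a, b, or c.
entity mux_priority is
  port (a, b, c, s0, s1: in std_logic;
        z: out std_logic);
end mux_priority;

architecture dataflow of mux_priority is
begin
  -- Concurrent statement: s0 is tested first, then s1, then the default.
  z <= a when s0 = '1' else
       b when s1 = '1' else
       c;
end dataflow;
```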

The conditions in the successive branches of an if statement are evaluated
independently. In the previous example, the conditions involve the two
signals s0 and s1. There can be any number of conditions, and each of them is
independent of the others. The structure of the if statement ensures that the
earlier conditions are tested first. In this example, signal s0 has been
tested before signal s1. This priority is reflected in the generated circuit,
where the multiplexer controlled by signal s0 is nearer to the output than
the multiplexer controlled by signal s1.
It is important to remember the existence of this priority for condition
testing, so that redundant tests can be eliminated. Consider the if statement
in this example, which is equivalent to the if statement in the previous one.
Example
    process (a, b, c, s0, s1)
    begin
      if s0 = '1' then
        z <= a;
      elsif s0 = '0' and s1 = '1' then
        z <= b;
      else
        z <= c;
      end if;
    end process;
The additional condition s0 = '0' is redundant since it will be tested only
if the first condition of the if statement is false. It is recommended to
avoid such redundancies, because there is no guarantee that they will be
detected and removed by the synthesis tool.
For multi-branch if statements, normally each condition will be dependent on
different signals and variables. If every branch is dependent on the same
signal, then it is more advantageous to use a case statement.
07. Incomplete if statements
2.5. Incomplete if Statements
In the examples presented so far, all the if statements have been complete.
In other words, the target signal has been assigned a value under all
possible conditions. However, there are two situations when a signal does not
receive a value: when the else clause of the if statement is missing, and
when the signal is not assigned to a value in some branches of the if
statement. In both cases the interpretation is the same. In the situations
when a signal does not receive a value, its previous value is preserved.
The problem is what the previous value of the signal is. If there is a
previous assignment statement in which the signal appears as target, then the
previous value comes from that assignment statement. If not, the value comes
from the previous execution of the process, leading to feedback in the
circuit.
The first case is illustrated in this example.
Example
    process (a, b, en)
    begin
      z <= a;
      if en = '1' then
        z <= b;
      end if;
    end process;
In this case, the if statement is incomplete because the else clause is
missing. In the if statement, the signal z gets a value if the condition en =
'1' is true, but remains unassigned if the condition is false. The previous
value comes from the unconditional assignment before the if statement.
The if statement of the previous example is equivalent to the if statement of
this example.
Example
    process (a, b, en)
    begin
      if en = '1' then
        z <= b;
      else
        z <= a;
      end if;
    end process;
When the if statement is incomplete and there is no previous assignment,
then a latch will be inserted at the output of the circuit and feedback
will exist from the output to the input. This is because the value of the
signal from the previous execution of the process is preserved and it becomes
the value in the current execution of the process.
This form of the if statement is used to describe a flip-flop or a register
with an enable input, as in this example.
Example
    process (clk)
    begin
      if (clk'event and clk = '1') then
        if en = '1' then
          q <= d;
        end if;
      end if;
    end process;
Signal q is updated with the new value of signal d when the condition is
true, but is not updated when the condition is false. In this case, the
previous value of signal q is preserved by sequential feedback of q. The
resulting circuit is presented in this figure.

The if statement of the previous example is equivalent to the complete if
statement of this example.
Example
    process (clk)
    begin
      if (clk'event and clk = '1') then
        if en = '1' then
          q <= d;
        else
          q <= q;
        end if;
      end if;
    end process;
When the condition is false, the signal q is assigned to itself, which is
equivalent to preserving its previous value.
One of the most common errors encountered in VHDL descriptions targeted for
synthesis is the unintended introduction of feedback in the circuit due to an
incomplete if statement. This will insert latches in the synthesized design,
which can be problematic for FPGA devices because timing for paths containing
latches is difficult to analyze. Synthesis tools usually report when a
latch is inserted.
In order to avoid the insertion of latches, the designer must ensure that
every signal assigned to in an if statement within a combinational process
(which is therefore an output signal of the process) receives a value under
every possible combination of conditions. In practice, there are two ways of
doing this: assign a value to the output signals in every branch of the if
statement, including the else clause, or initialize the signals with an
unconditional assignment before the if statement.
In the following example, although the if statement looks complete, different
signals are being assigned a value in each branch of the if statement. Thus
both signals z and y will have asynchronous feedback.
Example
    process (a, b, c)
    begin
      if c = '1' then
        z <= a;
      else
        y <= b;
      end if;
    end process;
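One possible way to remove the feedback from the process above is to give both signals an unconditional default before the if statement. The sketch below assumes that '0' is an acceptable value for the output that is not selected; the text does not say what that value should be, so this default and the entity name no_latch are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity wrapping the corrected combinational process.
entity no_latch is
  port (a, b, c: in std_logic;
        y, z: out std_logic);
end no_latch;

architecture behavioral of no_latch is
begin
  process (a, b, c)
  begin
    -- Default assignments (assumed value '0'): every output receives a
    -- value on every execution, so no previous value has to be kept.
    z <= '0';
    y <= '0';
    if c = '1' then
      z <= a;
    else
      y <= b;
    end if;
  end process;
end behavioral;
```

With the defaults in place, z and y depend only on the current values of a, b, and c, so the process describes purely combinational logic.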
Another example is where there is a redundant test for a condition which must
be true.
Example
    process (a, b, c)
    begin
      if c = '1' then
        z <= a;
      elsif c = '0' then
        z <= b;
      end if;
    end process;
In this case, although the if statement looks complete (assuming that signal
c is of type bit), each of the conditions in the if statement is synthesized
independently. The synthesis tool may therefore not detect that this second
condition is redundant. In this case, the if statement is synthesized as a
three-way multiplexer, the third input being the missing else condition which
is the feedback of the previous value. The circuit synthesized for this
example is shown below.

2.6. If Statements with Variables
So far, in the if statements only signals were used. The same rules apply
when using variables, with a single difference. Like a signal, if a variable
is assigned to only in some branches of the if statement, then the previous
value is preserved by feedback. Unlike the case when a signal is used, the
reading and writing of a variable in the same process will result in feedback
only if the read occurs before the write. In this case, the value read is the
previous value of the variable. In the case when a signal is used, a read and
a write in the same process will always result in feedback.
This observation may be used to create registers or counters using variables.
Remember that a sequential process is interpreted by synthesis by placing a
flip-flop or register on every signal assigned to in the process. This means
that normally variables are not written to flip-flops or registers. However,
if there is feedback of a previous variable value, then this feedback is
implemented via a flip-flop or register to make the process synchronous.
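As a contrast to the feedback case, the following minimal sketch shows a variable that is always written before it is read inside a sequential process. The entity name var_no_feedback and the variable name tmp are hypothetical. Under the rule described above, tmp needs no previous value, so it is expected to act only as a name for intermediate combinational logic, while the signal q is registered.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: one registered output, one intermediate variable.
entity var_no_feedback is
  port (clk, a, b, c: in std_logic;
        q: out std_logic);
end var_no_feedback;

architecture behavioral of var_no_feedback is
begin
  process (clk)
    variable tmp: std_logic;
  begin
    if clk'event and clk = '1' then
      tmp := a and b;   -- the variable is written first
      q <= tmp or c;    -- and read afterwards: no previous value is used
    end if;
  end process;
end behavioral;
```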
The example below describes a counter using the unsigned type (as declared,
for example, in the ieee.numeric_std package). When a value of type unsigned
is incremented, if the value is the highest value of the range, then the
lowest value of the range is obtained.
Example
    process (clk)
      variable count: unsigned (7 downto 0);
    begin
      if (clk'event and clk = '1') then
        if rst = '1' then
          count := "00000000";
        else
          count := count + 1;
        end if;
        result <= count;
      end if;
    end process;
In this example, in the else branch of the if statement the previous value
of the count variable is being read to calculate the next value. This results
in a feedback.
Note that in this example actually two registers are created. According to
the feedback rules, variable count will be registered. Signal result will
also be registered, because all signals assigned to in a sequential process
will be registered. This extra register will always contain the same value as
the register for variable count. The synthesis tool will normally eliminate
this redundant register.
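For reference, the counter process can be placed in a complete design unit as sketched below. The entity name counter8 and the port declarations are assumptions, since the original shows only the process; the unsigned type is taken here from ieee.numeric_std.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity around the counter process from the example.
entity counter8 is
  port (clk, rst: in std_logic;
        result: out unsigned (7 downto 0));
end counter8;

architecture behavioral of counter8 is
begin
  process (clk)
    variable count: unsigned (7 downto 0);
  begin
    if clk'event and clk = '1' then
      if rst = '1' then
        count := (others => '0');  -- synchronous reset
      else
        count := count + 1;        -- read before write: feedback of count
      end if;
      result <= count;             -- signal assigned in a clocked process
    end if;
  end process;
end behavioral;
```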
08. Examples of combinational circuits
2.1. Multiplexers
Multiplexers may be described using several methods. The example below
describes the 4:1 multiplexer for 4-bit buses of this figure using a selected
signal assignment.

Example
    library ieee;
    use ieee.std_logic_1164.all;
    entity mux is
      port (a, b, c, d: in std_logic_vector (3 downto 0);
            s: in std_logic_vector (1 downto 0);
            x: out std_logic_vector (3 downto 0));
    end mux;
    architecture arch_mux of mux is
    begin
      with s select
        x <= a when "00",
             b when "01",
             c when "10",
             d when "11",
             d when others;
    end arch_mux;
The others keyword is used because the selection signal s is of type
std_logic_vector, and each element of this type can take nine possible
values. All the possible values of the selection signal must be covered. If
the others option were not used, only four of the 81 (9 x 9) values would be
covered by the set of options. Other possible values of signal s are, for
example, "1X", "UX", "Z0", "U-". For synthesis, only the values "00", "01",
"10", and "11" are meaningful, but for simulation there are 77 other values
that signal s may have. The metalogical value '-' may also be used in the
others option to assign a don't care value to signal x (for example, "----"
when others).

The 4:1 multiplexer can be described with an if statement as shown below.
Example
    architecture arch_mux of mux is
    begin
      mux4_1: process (a, b, c, d, s)
      begin
        if s = "00" then
          x <= a;
        elsif s = "01" then
          x <= b;
        elsif s = "10" then
          x <= c;
        else
          x <= d;
        end if;
      end process mux4_1;
    end arch_mux;
Since the conditions test mutually exclusive values of signal s, by
synthesizing this description the same circuit is generated as when a
selected signal assignment statement is used. However, because the conditions
carry a priority, the if statement is not advantageous when the conditions
involve multiple signals that are mutually exclusive. Using an if statement in
these cases may generate additional logic to ensure that the preceding
conditions are not true. Instead of an if statement, it is more advantageous
to use a Boolean equation or a case statement.
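As an illustration of the case statement alternative, the 4:1 multiplexer can be sketched as follows. The entity name mux_case and the architecture name arch_mux_case are hypothetical; the ports mirror those of the mux entity above. The choices of a case statement are mutually exclusive by construction, so no priority is implied between them.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity with the same ports as the mux entity above.
entity mux_case is
  port (a, b, c, d: in std_logic_vector (3 downto 0);
        s: in std_logic_vector (1 downto 0);
        x: out std_logic_vector (3 downto 0));
end mux_case;

architecture arch_mux_case of mux_case is
begin
  process (a, b, c, d, s)
  begin
    case s is
      when "00"   => x <= a;
      when "01"   => x <= b;
      when "10"   => x <= c;
      when others => x <= d;  -- covers "11" and all metalogical values
    end case;
  end process;
end arch_mux_case;
```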
2.2. Decoders
A decoder is a combinational circuit that identifies an input code by
asserting a single output line, corresponding to the input code. A decoder
with n input lines has, in general, 2^n output lines and is denoted by DCD
n:2^n.
Below is described a 3:8 (1-of-8) decoder with active-high outputs. For the
description a conditional signal assignment is used.
Example
    library ieee;
    use ieee.std_logic_1164.all;
    entity decoder_1_8 is
      port (a: in std_logic_vector (2 downto 0);
            y: out std_logic_vector (7 downto 0));
    end decoder_1_8;
    architecture decod of decoder_1_8 is
    begin
      y <= "00000001" when a = "000" else
           "00000010" when a = "001" else
           "00000100" when a = "010" else
           "00001000" when a = "011" else
           "00010000" when a = "100" else
           "00100000" when a = "101" else
           "01000000" when a = "110" else
           "10000000";
    end decod;
When the XST synthesis program is used, in order to infer a decoder from the
HDL description all combinations of the inputs must be specified and all
outputs must be used (for instance, values of 'X' for the output lines should
not be specified).
2.3. Priority Encoders
An example of a priority encoder is shown below.

This priority encoder may be described concisely with a conditional signal
assignment statement, as below.
Example
    library ieee;
    use ieee.std_logic_1164.all;
    entity priority_encoder is
      port (a, b, c, d: in std_logic;
            w, x, y, z: in std_logic;
            j: out std_logic);
    end priority_encoder;
    architecture priority of priority_encoder is
    begin
      j <= w when a = '1' else
           x when b = '1' else
           y when c = '1' else
           z when d = '1' else
           '0';
    end priority;

The when-else statement in the previous example indicates that signal j is
assigned the value of signal w when a is '1', even if b, c, or d are '1'.
Signal b holds priority over signals c and d, and signal c holds priority
over signal d. If signals a, b, c, and d are mutually exclusive (that is, if
it is known that only one will be asserted at a time), then the description
below is more appropriate.
Example
    library ieee;
    use ieee.std_logic_1164.all;
    entity no_priority is
      port (a, b, c, d: in std_logic;
            w, x, y, z: in std_logic;
            j: out std_logic);
    end no_priority;
    architecture no_priority of no_priority is
    begin
      j <= (a and w) or (b and x) or (c and y) or (d and z);
    end no_priority;
The logic generated by synthesizing the description of the previous example
requires AND gates with only two inputs. Although using AND gates with more
inputs in
a CPLD device does not usually require additional resources, these gates
could require additional logic cells and logic levels in an FPGA device. The
descriptions of the previous two examples are not functionally equivalent,
however. This equivalence only exists if signals a, b, c, and d are
known to be mutually exclusive. In this case, the description of the previous
example generates an equivalent logic with fewer resources.
2.4. Combinational Shifters
A combinational shifter performs a logical or arithmetic shift operation on
the input data. The inputs of the shifter are the data to be shifted and the
selector whose binary value specifies the shift distance. The output of the
shifter is the result of the shift operation.
When the XST synthesis program is used, the following restrictions apply in
order to infer a combinational shifter from the HDL description:
- Only logical shift (sll, srl), arithmetic shift (sla, sra), rotate (rol,
ror), and concatenation (&) operators can be used. Shift operations that fill
vacated positions with values from another signal are not recognized.
- The value that specifies the shift distance in the shift operation must be
positive and must be incremented or decremented only by 1 for each consecutive
binary value of the selector.

The example below describes a combinational shifter for 8-bit vectors that
can be shifted left by one, two, or three positions (or left unchanged when
the selector is "00"). A selected signal assignment is used to describe the
shifter.
Example
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity shift_left is
port (din: in unsigned (7 downto 0);
sel: in unsigned (1 downto 0);
dout: out unsigned (7 downto 0));
end shift_left;
architecture arch_shift of shift_left is
begin
with sel select
dout <= din when "00",
din sll 1 when "01",
din sll 2 when "10",
din sll 3 when others;
end arch_shift;
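Since the concatenation operator is also accepted for describing shifters, it may help to see that a left shift written with sll and one written with & give the same result. The following testbench is a minimal sketch; the entity name tb_shift_equiv and the test value are assumptions chosen for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity tb_shift_equiv is
end tb_shift_equiv;

architecture sim of tb_shift_equiv is
begin
  check: process
    variable din : unsigned (7 downto 0);
  begin
    din := "10110011";
    -- Shift left by 1: drop the MSB, append one '0'
    assert (din sll 1) = (din (6 downto 0) & '0')
      report "sll 1 differs from concatenation" severity failure;
    -- Shift left by 3: drop three MSBs, append "000"
    assert (din sll 3) = (din (4 downto 0) & "000")
      report "sll 3 differs from concatenation" severity failure;
    -- Explicit expected value for a shift by 2
    assert (din sll 2) = "11001100"
      report "unexpected value for sll 2" severity failure;
    wait;
  end process check;
end sim;
```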
09. Examples of sequential circuits
3.1. Synchronous and Asynchronous Sequential Circuits
Sequential circuits represent a category of logic circuits that include
storage elements.
These circuits contain feedback loops from the output to the input. The
signals generated at the outputs of a sequential circuit depend on both the
input signals and on the state of the circuit.
The present state of a sequential circuit depends on a previous state and on
the values of input signals. With synchronous sequential circuits, the change
of state is controlled by a clock signal. With asynchronous circuits, the
change of state may be caused by a change of an input signal at an arbitrary
time. The behavior of an asynchronous circuit is less predictable, since the
state evolution is also influenced by the delays of the circuit's components.
The transition between two stable states may pass through a succession of
unstable, random states.
Synchronous sequential circuits are more reliable and have a predictable
behavior. All storage elements of a synchronous circuit change their state
simultaneously, which eliminates intermediate unstable states. By testing the
input signals at well-defined times, the influence of delays and noise is
reduced.
There are two techniques for designing sequential circuits: Mealy and
Moore. For Mealy sequential circuits, the output signals depend on both the
current state and the present inputs. For Moore sequential circuits, the
outputs depend only on the current state, and they do not depend directly on
the inputs. The Mealy method allows a circuit to be implemented with a minimal
number of storage elements (flip-flops), but possible uncontrolled
variations of the input signals may be transmitted to the output signals. A
design using the Moore method requires more storage elements for the same
behavior, but the circuit operation is more reliable.
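The trade-off can be illustrated with a rising-edge detector described in both styles. This is only a sketch under stated assumptions: the entity edge_detect, its ports, and the state names are hypothetical and do not come from the text. The Mealy output needs a single flip-flop (x_prev) but follows the input x combinationally; the Moore output needs three states (two flip-flops) but changes only on clock edges.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity edge_detect is
  port (clk, rst : in std_logic;
        x        : in std_logic;
        y_moore  : out std_logic;
        y_mealy  : out std_logic);
end edge_detect;

architecture sketch of edge_detect is
  type state_type is (s_idle, s_pulse, s_high);  -- Moore machine states
  signal state  : state_type;
  signal x_prev : std_logic;                     -- Mealy machine state
begin
  process (clk)
  begin
    if rising_edge (clk) then
      if rst = '1' then
        state  <= s_idle;
        x_prev <= '0';
      else
        x_prev <= x;
        case state is
          when s_idle =>
            if x = '1' then state <= s_pulse; end if;
          when s_pulse =>
            if x = '1' then state <= s_high; else state <= s_idle; end if;
          when s_high =>
            if x = '0' then state <= s_idle; end if;
        end case;
      end if;
    end if;
  end process;

  -- Moore: output depends on the state only
  y_moore <= '1' when state = s_pulse else '0';
  -- Mealy: output depends on the state and on the present input
  y_mealy <= x and not x_prev;
end sketch;
```

Any glitch on x while x_prev is '0' appears directly on y_mealy, whereas y_moore can change only after a clock edge.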
3.2. Flip-Flops
Example 14 describes a synchronous D-type flip-flop triggered on the rising
edge of the clock signal.

Example

library ieee;
use ieee.std_logic_1164.all;
entity dff is
port (clk: in std_logic;
d: in std_logic;
q: out std_logic);
end dff;
architecture example of dff is
begin
process (clk)
begin
if (clk'event and clk = '1') then
q <= d;
end if;
end process;
end example;
The process used to describe the flip-flop is sensitive only to changes of
the clk clock signal. A transition of the input signal d does not cause the
execution of this process. The clk'event expression and the sensitivity list
are redundant, because both detect changes of the clock signal. Some
synthesis tools, however, will ignore the process sensitivity list, and thus
the clk'event expression should be included to describe events triggered on
the edge of the clock signal.
To describe a level-sensitive latch (below), the clk'event condition is
removed and the data input d is inserted in the process sensitivity list.

Example

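library ieee;
use ieee.std_logic_1164.all;
entity d_latch is
port (clk: in std_logic;
d: in std_logic;
q: out std_logic);
end d_latch;
architecture example of d_latch is
begin
process (clk, d)
begin
if (clk = '1') then
q <= d;
end if;
end process;
end example;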
Neither the flip-flop description nor the latch description contains an else
branch. Without this branch, an implied memory element is specified (one that
keeps the value of signal q). In other words, the following fragment:
if (clk'event and clk = '1') then
q <= d;
end if;
has the same meaning for simulation as the fragment:
if (clk'event and clk = '1') then
q <= d;
else
q <= q;
end if;
This is consistent with the operation of a D-type flip-flop. Most synthesis
tools do not allow an else branch after an if (clk'event and clk = '1')
condition, because it may describe logic whose implementation is ambiguous.
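The package std_logic_1164 also provides the function rising_edge, which is used in some later examples of this text. For std_logic signals it is slightly stricter than the expression clk'event and clk = '1': it also requires the previous value of the clock to be '0'. The testbench below is a sketch of this difference in simulation; the entity name tb_edge and the signal names are assumptions for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity tb_edge is
end tb_edge;

architecture sim of tb_edge is
  signal clk : std_logic;                      -- no initial value: starts at 'U'
  signal d   : std_logic := '1';
  signal q_event, q_edge : std_logic := '0';
begin
  p_event: process (clk)
  begin
    if (clk'event and clk = '1') then
      q_event <= d;
    end if;
  end process p_event;

  p_edge: process (clk)
  begin
    if rising_edge (clk) then
      q_edge <= d;
    end if;
  end process p_edge;

  stimulus: process
  begin
    wait for 10 ns;
    clk <= '1';                                -- transition 'U' -> '1'
    wait for 10 ns;
    -- The 'event form captures d; rising_edge ignores 'U' -> '1'
    assert q_event = '1' and q_edge = '0'
      report "unexpected values after 'U' -> '1'" severity failure;
    clk <= '0';
    wait for 10 ns;
    clk <= '1';                                -- true '0' -> '1' edge
    wait for 10 ns;
    assert q_edge = '1'
      report "rising_edge did not capture d" severity failure;
    wait;
  end process stimulus;
end sim;
```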
3.3. Registers
The following example describes an 8-bit register with a process similar to
that of the D-type flip-flop, the difference being that d and q are
vectors. In addition, this register has a clock enable signal (ce).
Example

library ieee;
use ieee.std_logic_1164.all;
entity reg8 is
port (clk: in std_logic;
ce: in std_logic;
d: in std_logic_vector (7 downto 0);
q: out std_logic_vector (7 downto 0));
end reg8;
architecture ex_reg of reg8 is
begin
process (clk)
begin
if (clk'event and clk = '1') then
if (ce = '1') then
q <= d;
end if;
end if;
end process;
end ex_reg;
3.4. Shift Registers
A shift register is a sequential circuit that shifts the contents of the
register left or right by one position in each clock cycle. Usually, the
inputs of a shift register are represented by the clock signal, a serial
input data, a synchronous or asynchronous set/reset signal, and a clock
enable signal. In addition, a shift register may have data and control
signals for synchronous or asynchronous parallel load. The output data of a
shift register can be accessed either serially, when only the contents of the
last flip-flop are accessible for the rest of the circuit, or in parallel,
when the contents of several flip-flops are accessible.
Xilinx FPGA devices contain dedicated resources (the SRL16 and SRL32
primitives) that allow an efficient implementation of shift registers without
using additional flip-flops.
However, these resources only support left shift operations, and have a
limited number of input/output signals: clock, clock enable, serial data
input, and serial data output. Synchronous and asynchronous set/reset signals
are not available in the SRL primitives. Therefore, if any set, reset, or
parallel load logic is used in the description, the XST synthesis tool may
not be able to take advantage of the dedicated primitives for an efficient
implementation.
There are several ways to describe shift registers in the VHDL
language, for example with the concatenation operator:
reg <= reg (6 downto 0) & si;
or with a for loop construct.
The following example describes an 8-bit shift-left register with clock
enable, serial input, and serial output signals. A for loop construct is used
to describe the shift register.
Example
library ieee;
use ieee.std_logic_1164.all;
entity shift_reg8 is
port (clk: in std_logic;
ce: in std_logic;
si: in std_logic;
so: out std_logic);
end shift_reg8;

architecture shift_reg of shift_reg8 is
signal tmp: std_logic_vector (7 downto 0);
begin
process (clk)
begin
if (clk'event and clk = '1') then
if (ce = '1') then
for i in 0 to 6 loop
tmp(i+1) <= tmp(i);
end loop;
tmp(0) <= si;
end if;
end if;
end process;
so <= tmp(7);
end shift_reg;
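For comparison, the same register can be sketched with the concatenation operator instead of the for loop. The entity name shift_reg8_concat and the architecture name are assumptions; the ports are the same as in the example above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity shift_reg8_concat is
  port (clk : in std_logic;
        ce  : in std_logic;
        si  : in std_logic;
        so  : out std_logic);
end shift_reg8_concat;

architecture sketch of shift_reg8_concat is
  signal reg : std_logic_vector (7 downto 0);
begin
  process (clk)
  begin
    if (clk'event and clk = '1') then
      if (ce = '1') then
        -- Drop the MSB, shift the rest left, insert si at bit 0
        reg <= reg (6 downto 0) & si;
      end if;
    end if;
  end process;
  so <= reg (7);  -- serial output: the last flip-flop
end sketch;
```

Both descriptions have only clock, clock enable, serial input, and serial output signals, with no set, reset, or parallel load logic.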
3.5. Counters
We will describe a 3-bit counter.
Example
library ieee;
use ieee.std_logic_1164.all;
entity count3 is
port (clk: in std_logic;
count: out integer range 0 to 7);
end count3;
architecture count3_integer of count3 is
signal tmp: integer range 0 to 7;
begin
cnt: process (clk)
begin
if (clk'event and clk = '1') then
tmp <= (tmp + 1) mod 8; -- wrap explicitly: tmp + 1 alone leaves the range 0 to 7 in simulation
end if;
end process cnt;
count <= tmp;
end count3_integer;
In the previous example, the addition operator is used for the signal tmp,
which is of type integer. Most synthesis tools allow this use, converting
the type integer to bit_vector or std_logic_vector. Nonetheless, using the
type integer for ports poses some problems:
1) In order to use the value of count in another portion of a design for
which the interface has ports of type std_logic, a type conversion must be
performed.
2) The vectors applied during simulation of the source code cannot be used to
simulate the model generated by synthesis. For the source code, the vectors
should be integer values. The synthesized model will require vectors of type
std_logic.
Because the native VHDL + operator is not predefined for the types bit or
std_logic, this operator must be overloaded before it may be used to add
operands of these types. The IEEE 1076.3 standard defines functions to
overload the + operator for the following operand pairs: (unsigned,
unsigned), (unsigned, integer), (signed, signed), and (signed, integer).
These functions are defined in the numeric_std package of the 1076.3
standard.
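The type conversions mentioned above, and the overloaded + operator, can be tried in a short self-checking process. This is a minimal sketch: the entity name tb_conv and the values are assumptions; to_unsigned, to_integer, and the + operator for (unsigned, integer) operands are provided by numeric_std.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity tb_conv is
end tb_conv;

architecture sim of tb_conv is
begin
  check: process
    variable n : integer range 0 to 7 := 5;
    variable u : unsigned (2 downto 0);
    variable v : std_logic_vector (2 downto 0);
  begin
    u := to_unsigned (n, 3);          -- integer -> unsigned (3 bits)
    v := std_logic_vector (u);        -- unsigned -> std_logic_vector
    assert v = "101"
      report "conversion of 5 failed" severity failure;

    u := u + 1;                       -- overloaded + for (unsigned, integer)
    assert to_integer (u) = 6
      report "5 + 1 should be 6" severity failure;

    u := u + 2;                       -- result is truncated to 3 bits: wraps to 0
    assert to_integer (u) = 0
      report "6 + 2 should wrap to 0" severity failure;
    wait;
  end process check;
end sim;
```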
The following example is a modified version of the previous one that uses
the type unsigned for the counter's output.
Example
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity count3 is
port (clk: in std_logic;
count: out unsigned (2 downto 0));
end count3;
architecture count3_unsigned of count3 is
signal tmp: unsigned (2 downto 0);
begin
cnt: process (clk)
begin
if (clk'event and clk = '1') then
tmp <= tmp + 1;
end if;
end process cnt;
count <= tmp;
end count3_unsigned;
Usually, synthesis tools supply additional packages to overload operators for
the type std_logic_vector. Although not standard packages, these are often
used by designers because they allow arithmetic and relational operations
directly on std_logic_vector, and from this point of view some designers find
them more convenient than the numeric_std package. With these packages there
is no need for the two additional types (signed, unsigned) besides
std_logic_vector, nor for the functions that convert between these types. When
using one of these packages for arithmetic operations, a synthesis tool will
use an unsigned or signed (two's complement) representation for the type
std_logic_vector, and will generate the appropriate arithmetic components as
well.
The next example presents the counter from the previous examples, modified
to use the std_logic_unsigned package and the type std_logic_vector for the
counter's output.
Example
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;
entity count3 is
port (clk: in std_logic;
count: out std_logic_vector (2 downto 0));
end count3;
architecture count3_std_logic of count3 is
signal tmp: std_logic_vector (2 downto 0);
begin
cnt: process (clk)
begin
if (clk'event and clk = '1') then
tmp <= tmp + 1;
end if;
end process cnt;
count <= tmp;
end count3_std_logic;
3.6. Three-State Buffers and Bidirectional Signals
Most programmable-logic devices have three-state outputs or bidirectional I/O
signals. Additionally, some devices have internal three-state buffers. The
values that a three-state signal may have are '0', '1', and 'Z', all of which
are supported by the type std_logic.
The example below presents the modified description for the counter of
Example 23 to use three-state outputs. This counter does not have an
asynchronous preset signal.
Example
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;
entity count8 is
port (clk, rst: in std_logic;
en, load: in std_logic;
oe: in std_logic;
data: in std_logic_vector (7 downto 0);
count: out std_logic_vector (7 downto 0));
end count8;
architecture arch_count8 of count8 is
signal tmp: std_logic_vector (7 downto 0);
begin
cnt: process (rst, clk)
begin
if (rst = '1') then
tmp <= (others => '0');
elsif rising_edge (clk) then
if (load = '1') then
tmp <= data;
elsif (en = '1') then
tmp <= tmp + 1;
end if;
end if;
end process cnt;
oep: process (oe, tmp)
begin
if (oe = '0') then
count <= (others => 'Z');
else
count <= tmp;
end if;
end process oep;
end arch_count8;
Compared to the description of Example 23, in this description an additional
signal oe is used to control the three-state outputs. The process labeled oep
describes the three-state outputs for the counter. If signal oe is not
asserted, the outputs are placed in the high-impedance state. The oep process
description is consistent with the behavior of a three-state buffer (below).

The counter of the preceding examples may be modified to use bidirectional
signals for its outputs. In this case, the counter may be loaded with the
current value of its outputs, which means that the value loaded when the load
signal is asserted will be the previous value of the counter or an external
value, depending on the state of the oe signal.
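A possible form of such a counter is sketched below. It is an assumption-laden illustration, not the description from the text: the entity name count8_bidir is hypothetical, the separate data input is dropped in favor of loading from the bidirectional count port, and numeric_std is used instead of std_logic_unsigned.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity count8_bidir is
  port (clk, rst : in std_logic;
        en, load : in std_logic;
        oe       : in std_logic;
        count    : inout std_logic_vector (7 downto 0));
end count8_bidir;

architecture sketch of count8_bidir is
  signal tmp : unsigned (7 downto 0);
begin
  cnt: process (rst, clk)
  begin
    if (rst = '1') then
      tmp <= (others => '0');
    elsif rising_edge (clk) then
      if (load = '1') then
        -- With oe = '1' this reads back the counter's own value;
        -- with oe = '0' it reads the value driven externally on the pins.
        tmp <= unsigned (count);
      elsif (en = '1') then
        tmp <= tmp + 1;
      end if;
    end if;
  end process cnt;

  -- Three-state output buffer on the bidirectional port
  count <= std_logic_vector (tmp) when oe = '1' else (others => 'Z');
end sketch;
```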
Below, the output enable of a three-state buffer is defined implicitly.
Example
mux: process (row_addr, col_addr, present_state)
begin
if (present_state = row or present_state = RAS) then
dram <= row_addr;
elsif (present_state = col or present_state = CAS) then
dram <= col_addr;
else
dram <= (others => 'Z');
end if;
end process mux;
The three-state buffers of the dram signal are enabled if the value of the
present_state signal is row, RAS, col, or CAS. For any other values of
this signal, the output buffers are not enabled.
In the preceding examples, behavioral descriptions were used for three-state
buffers. To generate these buffers, structural descriptions may be used as
well, such as the for generate construct. This construct will be described in
the laboratory work dedicated to structural design.
10. Inertial delay and transport delay
In the VHDL language there are two types of delays that can be used to model
real systems. These are the inertial delay and the transport delay. These
delays cannot be used for logic synthesis.
The inertial delay is the default delay and it is used when the type of delay
is not specified. The after clause assumes by default the inertial delay. In
a model with inertial delay, two consecutive changes of an input signal value
will not change an output signal value if the time between these changes is
shorter than the specified delay. This delay represents the inertia of the
real circuit. If, for example, certain pulses of short periods of the input
signals occur, the output signals will remain unchanged.
The figure below illustrates the inertial delay with a simple buffer. The
buffer with a delay of 20 ns has an input A and an output B. Signal A changes
from '0' to '1' at 10 ns and from '1' to '0' at 20 ns. The input signal pulse
has a duration of 10 ns, which is shorter than the delay introduced by the
buffer. As a result, the output signal B remains '0'.
The buffer in the figure can be modeled by the following assignment statement:
b <= a after 20 ns;

The transport delay must be specified explicitly with the transport keyword.
This represents the delay of an interconnection, in which the effect of a
pulse in an input signal is propagated to the output with the specified
delay, regardless of the duration of that pulse. The transport delay is
especially useful for modeling transmission lines and interconnections
between components.
Considering the same buffer of Figure 1 and the input signal A with the same
waveform, if the inertial delay is replaced with the transport delay, the
output signal B will have the form shown in the figure below. The pulse on the input
signal is propagated unchanged to the output with a delay of 20 ns.
If the transport delay is used, the buffer of Figure 2 may be modeled by the
following assignment statement:
b <= transport a after 20 ns;
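The two delay models can be compared in simulation with the same 10 ns pulse and 20 ns delay described above. The testbench below is a sketch; the entity name tb_delay and the signal names b_inertial and b_transport are assumptions, and type bit is used so that all signals start at '0'.

```vhdl
entity tb_delay is
end tb_delay;

architecture sim of tb_delay is
  signal a : bit := '0';
  signal b_inertial, b_transport : bit := '0';
begin
  b_inertial  <= a after 20 ns;             -- inertial delay (default)
  b_transport <= transport a after 20 ns;   -- transport delay

  stimulus: process
  begin
    wait for 10 ns;
    a <= '1';                               -- pulse starts at 10 ns
    wait for 10 ns;
    a <= '0';                               -- pulse ends at 20 ns
    wait for 15 ns;                         -- simulation time is now 35 ns
    -- Transport: the pulse appears on the output from 30 ns to 40 ns
    assert b_transport = '1'
      report "transport output should be '1' at 35 ns" severity failure;
    -- Inertial: the 10 ns pulse is shorter than the 20 ns delay and is rejected
    assert b_inertial = '0'
      report "inertial output should stay '0'" severity failure;
    wait for 10 ns;                         -- simulation time is now 45 ns
    assert b_transport = '0' and b_inertial = '0'
      report "both outputs should be '0' at 45 ns" severity failure;
    wait;
  end process stimulus;
end sim;
```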


11. Event-driven simulation
4.1. Event-Driven Simulation
All VHDL simulators are event-driven simulators. An event is a change of a
signal state. There are three basic concepts of event-driven simulation.
These are simulation time, event processing, and delta delay.
During simulation, the simulator keeps track of the simulation time, which is
the circuit time that has been modeled by the simulator, not the time needed
for the simulation. This time is usually measured as an integral multiple of
a basic unit of time known as the resolution limit. The simulator cannot
measure time intervals less than the resolution limit. For gate-level or RTL
simulation, the resolution limit may be, for example, 1 ps.
When a change of a signal value appears, an event is placed in an event queue
for the simulation time at which this change occurs. When the simulator
processes that event, it reevaluates any statement whose input is the signal
that determined the event (that is, the statements that are sensitive to
that signal). This results in changes of other signals and therefore other
events are generated.
Consider the process below, which contains a single assignment statement.

Example
proc1: process (a, b, c)
begin
x <= a and b and c;
end process proc1;
When a change in value of one of the signals a, b, or c occurs, the
assignment statement is reevaluated, and a new value will result for signal
x. Since an after clause is not used in this statement to specify a delay, an
event is scheduled for signal x for the current simulation time; this event
consists of changing the value of the signal. This could create potential
problems when signal x has to be updated at the same time as one of the
signals from which it is generated. To solve this problem, VHDL introduces
the concept of delta delay.
A delta delay may be considered as an infinitesimally small delay that
implies a delta cycle (the delta cycle is explained in Section 4.3).
Therefore, the semantics of the previous assignment statement is that the
value of the right-hand side expression at the current simulation time, Tc,
is scheduled for assignment to signal x one delta delay after the current
simulation time. When the scheduled events are processed,
new events will result. Some of them may be scheduled for the current
simulation time; these
events are processed again, new events result, and so on, until there are no
other events scheduled for the current simulation time. Only then is the
simulation time incremented.
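The effect of a delta delay can be observed with wait for 0 ns, which suspends a process until the next delta cycle without advancing the simulation time. The following testbench is a minimal sketch; the entity name tb_delta and the signal names are assumptions.

```vhdl
entity tb_delta is
end tb_delta;

architecture sim of tb_delta is
  signal a, x : bit := '0';
begin
  x <= a;                 -- no after clause: x follows a one delta delay later

  check: process
  begin
    a <= '1';             -- scheduled for the next delta cycle
    wait for 0 ns;        -- first delta cycle: a is updated, x is not yet
    assert a = '1' and x = '0'
      report "x should lag a by one delta delay" severity failure;
    wait for 0 ns;        -- second delta cycle: x is updated
    assert x = '1'
      report "x should be updated after the second delta" severity failure;
    -- The simulation time has not advanced during the delta cycles
    assert now = 0 ns
      report "simulation time should still be 0 ns" severity failure;
    wait;
  end process check;
end sim;
```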
12. Signal drivers
Consider again the process of Example 3. As previously explained, if one of
the signals a, b, or c changes value, then the value of the logical AND of
signals a, b, and c at the current simulation time, Tc, is scheduled for
assignment to signal x one delta delay later. In other
words, an event is scheduled for the signal driver of x. A signal driver is
represented by a projected output waveform. Each time a signal assignment is
performed, that signal's projected output waveform is updated.
A projected output waveform is a set of transactions that specify new values
for a signal and the times at which the signal will be updated. When
simulating models written for synthesis, there are essentially two
transactions that need to be maintained for any given signal: the current
transaction that specifies the current value and time, and the next
transaction, if it exists, that specifies the new value of the signal at the
next delta delay. However, this does not mean that a maximum of two delta
cycles will be required before the next simulation time. This will be
illustrated when we present the simulation of a model with several concurrent
statements.
Note
The concepts of delta delays, transactions, signal drivers, and projected
output waveforms are presented for conceptual purposes only. Various VHDL
simulators may implement these concepts differently than presented here. VHDL
simulators need only comply with the operational specifications defined in
the reference manual of the language (IEEE standard 1076), but the
implementation is specific for each CAD system.
As an example, executing process proc1 results in a signal driver for x as
shown below:


If signals b and c are logical '1' and signal a changes from '0' to '1' at 5
ns, as illustrated in Figure 4, when the simulation time is 5 ns process
proc1 will execute. At this time, the current value of signal x is '0', and a
transaction ('1', 5 ns + δ) is added to the signal driver for x, where δ
is an infinitesimally small delay. When the current simulation time, Tc,
passes 5 ns, the only transaction left on the driver for signal x is ('1',
Tc).

Consider the process of Example 4, which contains two assignments to the same
signal.
Example 4
proc2: process (a, b, c)
begin
x <= '0';
if (a = b or c = '1') then
x <= '1';
end if;
end process;
If signals a, b, and c have the forms shown in Figure 5, when the current
simulation time is 5 ns the transition of signal b causes the process to
execute. The first sequential signal assignment results in the transaction
('0', 5 ns + δ). Because this value is the same as the current value of x,
this transaction is not added to the driver. Next, the condition for the if
statement is evaluated. Because the expression is false, no further
statements are executed and the process suspends.
When the current simulation time is between 5 and 10 ns, the driver for
signal x has only one transaction, ('0', Tc). At 10 ns, a transition of
signal b occurs, and the process is executed again. As previously, the first
sequential statement results in no transaction being added to the driver
(that is, the projected output waveform is not updated). When the condition
for the if statement is evaluated, this time the expression is true. The
signal assignment x <= '1' is executed, causing a new transaction, ('1', 10
ns + δ), to be added to the driver.

When the simulation time reaches 15 ns, a transition of signal c occurs,
causing the process to execute. The first statement causes the transaction
('0', 15 ns + δ) to be scheduled. Because the condition of the
if statement evaluates true, the next signal assignment overrides the current
transaction, replacing it with ('1', 15 ns + δ). Because this value is the
same as the current value, this transaction need not be kept on the driver,
and the transaction is deleted.
When the simulation time reaches 20 ns, signals b and c change, and the
process is executed once again. The first statement causes the transaction
('0', 20 ns + δ) to be added to the driver. The condition
for the if statement evaluates false, and the process suspends.
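A simple interpretation of signal assignments in a process, assuming that the
process does not contain after clauses, is the following:
- All expressions are evaluated using the current values of the signals on
the right-hand side of the <= symbol;
- A signal will be updated with the value in the last assignment;
- All signals are updated only after the process suspends.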
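This interpretation can be verified with a short process that assigns the same signal twice and reads it before suspending. The sketch below uses integer signals for readability; the entity name tb_last_assignment and the signal names s and t are assumptions.

```vhdl
entity tb_last_assignment is
end tb_last_assignment;

architecture sim of tb_last_assignment is
  signal s, t : integer := 0;
begin
  check: process
  begin
    s <= 1;          -- first assignment to s
    s <= 2;          -- last assignment: replaces the previous transaction
    t <= s;          -- uses the current value of s, which is still 0
    assert s = 0
      report "s changed before the process suspended" severity failure;
    wait for 0 ns;   -- the process suspends; signals are updated
    assert s = 2 and t = 0
      report "unexpected values after the update" severity failure;
    wait;
  end process check;
end sim;
```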
As opposed to sequential signal assignment statements inside processes, a
concurrent signal assignment statement has an implicit sensitivity list that
includes all signals on the right-hand side of the <= symbol. Concurrent
statements execute any time a transition of a signal in the implicit
sensitivity list occurs. When there are multiple concurrent statements, they
do not execute sequentially. In the next section we will present how multiple
concurrent statements execute when the evaluating expression contains signals
that are being updated by another statement.
13. Simulation cycle
When a VHDL model is simulated, first an initialization phase is executed,
and then repeated simulation cycles. The initialization phase starts with the
current simulation time set to 0 ns. In general, if an explicit initial value
is not specified for a signal, then the signal will have the initial value
'0' if it is of type bit, or 'U' if it is of type std_logic. Next, each
process is executed until it suspends. Concurrent statements are considered
processes in this context and they will also be executed.
After the initialization phase, simulation cycles are run, each cycle
consisting of the following steps:
1) Signals are updated. Updating the signals may cause
events to occur for these signals.
2) Each process that is sensitive to a signal for which an event has
occurred in the current simulation cycle is executed.
3) The simulation time for the next cycle, Tn, is determined as
either the next time at which a signal changes value, based on its projected
output waveform, or the time at which a process resumes (for models written
for simulation, not synthesis), whichever is earlier. If the simulation time
for the next cycle is a delta delay or multiple delta delays from the current
simulation time, then the current simulation time, Tc, remains the same and a
delta cycle consisting of the same steps as above is executed. (Hence, a
delta delay is actually a zero delay, and it is only conceptually convenient
to consider it as being a very small delay.) Otherwise, the current
simulation time is set to the next simulation cycle time (Tc = Tn).
Consider the description below to illustrate the simulation cycle.
Implementation of the simulation cycle is simulator-specific, but the
simulation results (for example, the signal waveforms) will be equivalent.
Example
entity delta is -- 1
port (a, b, c, d: in bit; -- 2
u, v, w, x, y, z: buffer bit); -- 3
end delta; -- 4
architecture delta of delta is -- 5
begin -- 6
z <= not y; -- 7
y <= w or x; -- 8
x <= u or v; -- 9
w <= u and v; -- 10
v <= c or d; -- 11
u <= a and b; -- 12
end delta; -- 13
In the initialization phase, all signals are set to '0'. Then, each
concurrent statement is executed. The order of concurrent statement writing
and execution is not important, so we will illustrate this by executing the
statements from the last to the first. The signal drivers are updated
according to the projected output waveform below, one delta cycle being
required to update the signals before the current simulation time can
advance.


Suppose that the input signals transition as shown in Figure 7. When the
current simulation time reaches 100 ns, a simulation cycle begins and the
signals are updated. Signal a transitions from '0' to '1', and this causes
the assignment statement for signal u (line 12) to be executed. The value of
signal u does not change, so the simulation cycle is complete and the
simulation time may advance. When the current simulation time reaches 200 ns,
a new simulation cycle begins. Signal b transitions from '0' to '1' and the
assignment statement in line 12 is executed. A new transaction ('1', 200 ns +
δ) is added to the driver for signal u, so a delta cycle is required and the
simulation time does not advance. During the delta cycle, signal u is updated
with its new value. This causes the statements in lines 10 and 9 to execute (in
either order, but during the same delta cycle). A new transaction is not
added for signal w, because its value remains '0'. However, a new transaction
is added to the driver for signal x, ('1', 200 ns + 2δ), so a second delta
cycle is required. During this cycle, signal x is updated with its new value,
which causes the statement in line 8 to execute. A new transaction is added to
the driver for signal y, ('1', 200 ns + 3δ), and a third delta cycle is
required. During this cycle, signal y is updated, which causes the statement
in line 7 to execute and a new transaction, ('0', 200 ns + 4δ), to be added to
the driver for signal z. A fourth delta cycle is required to update signal z,
after which the current simulation time may advance.
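The sequence of delta cycles can be reproduced with a testbench that steps through them with wait for 0 ns. This is a sketch under stated assumptions: the entity name tb_delta_chain is hypothetical, the ports of entity delta are replaced by internal signals, and c and d are assumed to stay '0'. Because signal b is driven here by a process, its own update takes one additional delta cycle, so z settles after five delta cycles instead of four.

```vhdl
entity tb_delta_chain is
end tb_delta_chain;

architecture sim of tb_delta_chain is
  signal a, b, c, d : bit := '0';
  signal u, v, w, x, y, z : bit := '0';
begin
  z <= not y;
  y <= w or x;
  x <= u or v;
  w <= u and v;
  v <= c or d;
  u <= a and b;

  stimulus: process
  begin
    wait for 100 ns;
    a <= '1';                        -- u stays '0': no further events
    wait for 100 ns;
    b <= '1';                        -- issued at 200 ns
    wait for 0 ns;                   -- delta 1: b is updated
    assert b = '1' and u = '0' report "delta 1" severity failure;
    wait for 0 ns;                   -- delta 2: u is updated
    assert u = '1' and x = '0' report "delta 2" severity failure;
    wait for 0 ns;                   -- delta 3: x is updated, w remains '0'
    assert x = '1' and w = '0' and y = '0' report "delta 3" severity failure;
    wait for 0 ns;                   -- delta 4: y is updated
    assert y = '1' and z = '1' report "delta 4" severity failure;
    wait for 0 ns;                   -- delta 5: z is updated
    assert z = '0' report "delta 5" severity failure;
    assert now = 200 ns report "time advanced" severity failure;
    wait;
  end process stimulus;
end sim;
```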

14. Structural design: elements of a structural design
A structural description consists of components interconnected by signals. A
component may be defined in an architecture by a component declaration, or it
may be represented by a separate system specified as an entity and an
architecture. In order to use a component declared earlier, it must be
instantiated within the structural description. Component instantiations
represent the basic statements in a structural architecture. These
instantiations are concurrent with each other. In a component instantiation
the port mapping is specified, which indicates the signals connected to the
component's ports. These signals may be specified as ports or internal
signals of the system. In the latter case, they must be declared in the
declarative part of the architecture.
2.1. Example of Structural Description
The elements of a structural description will be illustrated first with a
complete example. The components of the structural description will then be
examined separately in the next sections. The example consists of two D-type
flip-flops connected in series as a pipeline.
The circuit structure is illustrated below.

We assume that the D-type flip-flop is already defined in a library and has
the entity and architecture definition presented below.
Example
library ieee;
use ieee.std_logic_1164.all;
entity dff is
port (d, clk: in std_logic;
q, qn: out std_logic);
end dff;
architecture arch_dff of dff is
signal tmp: std_logic;
begin
process (clk)
begin
if rising_edge (clk) then
tmp <= d;
end if;
end process;
q <= tmp;
qn <= not tmp;
end arch_dff;
There are several ways to describe this circuit using components. A possible
description is presented below.
Example
library ieee;
use ieee.std_logic_1164.all;
entity delay2 is
port (din, clock: in std_logic;
qout: out std_logic);
end delay2;
architecture structural of delay2 is
signal intern: std_logic;
-- Component declaration
component dff is
port (d, clk: in std_logic;
q, qn: out std_logic);
end component dff;
-- Configuration specification
for all: dff use entity work.dff (arch_dff);
begin
-- Component instantiation
d1: dff port map
(d => din, clk => clock, q => intern, qn => open);
d2: dff port map
(d => intern, clk => clock, q => qout, qn => open);
end structural;
The architecture contains three parts related to the use of components. These
have been labeled with comments and are the following: component declaration,
configuration specification, and component instantiation. The three parts are
described in the next sections.
2.2. Component Declaration
A component declaration defines the interface with a design entity which
describes that component. The component declared in this way may be used
later in component instantiation statements. However, the component
declaration does not specify the entity-architecture pair that describes the
component or the binding of its ports to those of the entity; this
information is contained either
in the configuration specification or in the configuration declaration.
The simplified syntax for a component declaration is the following:
component component_name [is]
generic (generic_list);
port (port_list);
end component [component_name];
The syntax for a component declaration is similar to the entity declaration.
The generic clause specifies the generics of the component, and the port
clause specifies its ports.
In practice, the name of the component, the name of its generics and ports,
as well as their order, are identical to the elements that appear in the
entity declaration corresponding to the component.
A component may be declared in an architecture, a block, an entity, or in a
package.
If the component is declared in an architecture, it must be declared in the
declarative part of the architecture, before the begin keyword. In such a
case, the component may be used (instantiated) in the architecture only. If
the component is declared in a package, it will be visible in any
architecture that uses this package.
The component dff above has been declared as:
component dff is
port (d, clk: in std_logic;
q, qn: out std_logic);
end component dff;
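When the component is declared in a package, the declaration can be shared by several architectures. The sketch below shows one possible arrangement; the names dff_pkg and delay2_pkg are assumptions, and the dff entity is repeated from the earlier example only to keep the block self-contained.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
entity dff is
  port (d, clk : in std_logic;
        q, qn  : out std_logic);
end dff;
architecture arch_dff of dff is
  signal tmp : std_logic;
begin
  process (clk)
  begin
    if rising_edge (clk) then
      tmp <= d;
    end if;
  end process;
  q  <= tmp;
  qn <= not tmp;
end arch_dff;

-- Package holding the component declaration (hypothetical name)
library ieee;
use ieee.std_logic_1164.all;
package dff_pkg is
  component dff is
    port (d, clk : in std_logic;
          q, qn  : out std_logic);
  end component dff;
end dff_pkg;

library ieee;
use ieee.std_logic_1164.all;
use work.dff_pkg.all;           -- makes the component dff visible
entity delay2_pkg is
  port (din, clock : in std_logic;
        qout       : out std_logic);
end delay2_pkg;
architecture structural of delay2_pkg is
  signal intern : std_logic;
  -- No component declaration is needed here; it comes from dff_pkg
  for all: dff use entity work.dff (arch_dff);
begin
  d1: dff port map (d => din, clk => clock, q => intern, qn => open);
  d2: dff port map (d => intern, clk => clock, q => qout, qn => open);
end structural;
```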
2.3. Component Instantiation
A component instantiation associates signals or values with the ports of a
component and associates values with the generics of that component. The
simplified syntax for a component instantiation statement is the following:
label: [component] component_name
[generic map (generic_association_list)]
port map (port_association_list);
Component instantiation introduces a relationship to a unit declared earlier
as a component. The name of the instantiated component must match the name of
the declared component. For the instantiated component the generics and ports
are specified, which represent the actual parameters of the declared
component. The association list can be either named or positional.
Named association allows the generics and ports to be listed in an order that
is different from the one specified in the component declaration. In this
case, each generic or port is explicitly associated with a value or signal.
The generic or port name is followed by the => symbol, and then by the value
assigned to the generic or the signal to which the port is to be connected.
Ports of a component may be left unconnected by using the keyword open.
In the example above, named association has been used for ports. The
component instantiation from this example is reproduced below:
d1: dff port map
(d => din, clk => clock, q => intern, qn => open);
d2: dff port map
(d => intern, clk => clock, q => qout, qn => open);
In a positional association list, the actual parameters (generics and ports)
are specified in the same order in which they appear in the component
declaration. In this case, the generic or port names and the => symbol are
omitted. The component instantiations may be rewritten using positional
association as follows:
d1: dff port map (din, clock, intern, open);
d2: dff port map (intern, clock, qout, open);
In the example, there are two instantiations of the dff component, which are
labeled d1 and d2. These labels are mandatory and must be unique. Each
instantiation creates a subcircuit containing the dff component and the
interconnections with this component.
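The instantiations above associate only ports. As a minimal sketch of how
generics are associated in the same way, the code below assumes a
hypothetical parameterized register reg_n with a generic named width; neither
reg_n nor the enclosing entity pipe2 is part of the original examples.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical parameterized register (illustration only).
entity reg_n is
  generic (width: positive := 8);
  port (d:   in  std_logic_vector(width - 1 downto 0);
        clk: in  std_logic;
        q:   out std_logic_vector(width - 1 downto 0));
end reg_n;

architecture arch_reg_n of reg_n is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      q <= d;
    end if;
  end process;
end arch_reg_n;

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical two-stage pipeline built from two reg_n instances.
entity pipe2 is
  port (din:   in  std_logic_vector(15 downto 0);
        clock: in  std_logic;
        qout:  out std_logic_vector(15 downto 0));
end pipe2;

architecture structural of pipe2 is
  signal intern: std_logic_vector(15 downto 0);
  component reg_n is
    generic (width: positive := 8);
    port (d:   in  std_logic_vector(width - 1 downto 0);
          clk: in  std_logic;
          q:   out std_logic_vector(width - 1 downto 0));
  end component reg_n;
begin
  -- Named association for both the generic and the ports.
  r1: reg_n
    generic map (width => 16)
    port map (d => din, clk => clock, q => intern);
  -- Positional association: same order as in the component declaration.
  r2: reg_n
    generic map (16)
    port map (intern, clock, qout);
end structural;
```

Both instances override the default width of 8 with 16; if the generic map
were omitted, the default value from the component declaration would apply
and the 16-bit signals would no longer match the port widths.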
Notes

A component instantiation represents the instantiation of the component
declaration and not the entity declaration. The relationship between the
component declaration and the entity that describes the component is
controlled by the configuration specification.
2.4. Direct Entity Instantiation
It is not always necessary to declare a component in order to instantiate it,
because the VHDL-93 version of the language allows direct instantiation of an
entity. This is the simplest way to specify a structural system. The syntax
of the direct entity instantiation is the following:
label: entity library_name.entity_name
[(architecture_name)]
[generic map (generic_association_list)]
port map (port_association_list);
The entity instantiation statement specifies the design entity and,
optionally, the name of the architecture to be used for this entity. The
entity may later be used as a component. The entity is specified with the
name of the library into which the entity is compiled and with the entity
name. All entities specified by the user are compiled by default into the
library work, so that usually this library is specified in the entity
instantiation statement. The architecture name needs to be specified only
when more than one architecture is defined for a single entity. If the
architecture name is not specified and there is more than one architecture
for the directly instantiated entity, the last compiled architecture
associated with the entity will be used.
Assuming that the entity and architecture for the D-type flip-flop of Example
1 are compiled into the library work, the circuit in Figure 1 may be
described without declaring a component, by using direct entity
instantiations, as shown below.
Example 3
library ieee;
use ieee.std_logic_1164.all;
entity delay2 is
port (din, clock: in std_logic;
qout: out std_logic);
end delay2;
architecture structural of delay2 is
signal intern: std_logic;
begin
d1: entity work.dff (arch_dff)
port map (din, clock, intern, open);
d2: entity work.dff (arch_dff)
port map (intern, clock, qout, open);
end structural;
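Direct entity instantiation is also convenient in testbenches, where the
unit under test is instantiated only once. The following is a minimal sketch,
assuming that delay2 and dff are compiled into the library work and that the
dff of Example 1 captures d on the rising edge of clk with a delay of less
than 4 ns. The names delay2_tb, dut, and done are introduced here for
illustration only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity delay2_tb is
end delay2_tb;

architecture test of delay2_tb is
  signal din:   std_logic := '0';
  signal clock: std_logic := '0';
  signal qout:  std_logic;
  signal done:  boolean := false;
begin
  -- Unit under test, instantiated directly without a component declaration.
  dut: entity work.delay2 (structural)
    port map (din => din, clock => clock, qout => qout);

  -- 10 ns clock that stops toggling once the test has finished.
  clock <= not clock after 5 ns when not done else '0';

  stimulus: process
  begin
    din <= '1';
    wait until rising_edge(clock);  -- first flip-flop captures '1'
    wait until rising_edge(clock);  -- second flip-flop captures '1'
    wait for 4 ns;
    -- Assumption: rising-edge-triggered dff, as noted in the text above.
    assert qout = '1'
      report "qout did not follow din after two clock edges"
      severity error;
    done <= true;
    wait;
  end process;
end test;
```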
2.5. Configuration Specification and Declaration
When direct entity instantiations are not used, component declarations and
their instantiations are not enough for a complete specification of a
structural architecture, because the description of the component
implementation is not specified. In this case a configuration specification
may be used. A configuration is a construct that defines how component
instances are associated with design entities and their architectures.
The reason for separating the entity and its components is to allow the
association (called binding) between entity and component to be made as late
as possible in the simulation process. This association is carried out only
at the start of simulation, in the elaboration phase. This way, the source
modules of a hierarchical design may be compiled in any order.
The syntax of a configuration specification is the following:
for instance_label: component_name
use entity library_name.entity_name
[(architecture_name)]
[generic map (generic_association_list)]
[port map (port_association_list)];
Several configuration specifications for components may be included in a
configuration declaration, which may represent a separate design unit, and
therefore may be placed in a separate file. The syntax for a configuration
declaration is the following:
configuration configuration_name of entity_name is
for architecture_name
-- configuration specifications
end for;
-- other for clauses
end [configuration] [configuration_name];
The syntax of a configuration specification is similar to that of a direct
entity instantiation. However, a configuration specification represents a
more flexible method when a different implementation must be used for the
same component. If some changes have to be made, they will be introduced only
in the configuration file, while the structural architecture will remain
unchanged. Using direct entity instantiation would require all changes to be
introduced in the architecture.
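As a sketch of this approach, the structural architecture below contains no
configuration specification, and the binding is given in a separate
configuration declaration. It assumes that the dff entity with the
architecture arch_dff is already compiled into the library work; the
configuration name delay2_cfg is chosen here for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity delay2 is
  port (din, clock: in std_logic;
        qout: out std_logic);
end delay2;

architecture structural of delay2 is
  signal intern: std_logic;
  component dff is
    port (d, clk: in std_logic;
          q, qn: out std_logic);
  end component dff;
  -- No configuration specification here: the binding is deferred
  -- to the configuration declaration below.
begin
  d1: dff port map (din, clock, intern, open);
  d2: dff port map (intern, clock, qout, open);
end structural;

-- Separate design unit; it may be placed in its own file.
configuration delay2_cfg of delay2 is
  for structural
    for all: dff
      use entity work.dff (arch_dff);
    end for;
  end for;
end configuration delay2_cfg;
```

Selecting a different implementation of dff then requires editing only
delay2_cfg, and the simulator is asked to elaborate the configuration
delay2_cfg instead of the entity delay2.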
A configuration specification has three parts. The first part specifies the
components to which the configuration applies. Each component is indicated by
the label of the statement in which that component is instantiated. It is
possible to use the keyword all to select all components with the specified
name. This keyword was used in an earlier example; the configuration
specification from that example is reproduced below:
for all: dff use entity work.dff (arch_dff);
Rather than specifying the configuration for all the components with the name
dff, it would have been possible to have separate configuration
specifications for each instantiated component:
for d1: dff ...
for d2: dff ...
The second part of a configuration specification selects the entity to be
used for a component or for all components with the specified name, as well
as the library in which the corresponding entity resides. This part may also
specify the architecture to be used for the selected entity, when there is
more than one architecture.
The third part of the specification is optional. This part may explicitly
specify how the generics and ports of an instantiated component are
associated with the generics and ports of the entity (the port bindings). The
generic map and port map clauses may be used for this purpose and the
association may be positional or named. Explicit association is only needed
if the names of generics and ports in a component declaration are different
from the names of generics and ports in the entity declaration used for that
component. In practice, however, it is recommended to match these names.
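The following minimal sketch shows such an explicit association. It assumes a
hypothetical entity dflop whose port names (data, ck, q_out, q_bar) differ
from those of the dff component; dflop, its architecture rtl, and the entity
delay2_remap are not part of the original examples. In the port map of the
configuration specification, the formal on the left of => is a port of the
entity, and the actual on the right is a port of the component.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical flip-flop with port names that differ from the component.
entity dflop is
  port (data, ck: in std_logic;
        q_out, q_bar: out std_logic);
end dflop;

architecture rtl of dflop is
begin
  process (ck)
  begin
    if rising_edge(ck) then
      q_out <= data;
      q_bar <= not data;
    end if;
  end process;
end rtl;

library ieee;
use ieee.std_logic_1164.all;

entity delay2_remap is
  port (din, clock: in std_logic;
        qout: out std_logic);
end delay2_remap;

architecture structural of delay2_remap is
  signal intern: std_logic;
  component dff is
    port (d, clk: in std_logic;
          q, qn: out std_logic);
  end component dff;
  -- Entity port => component port.
  for all: dff use entity work.dflop (rtl)
    port map (data => d, ck => clk, q_out => q, q_bar => qn);
begin
  d1: dff port map (din, clock, intern, open);
  d2: dff port map (intern, clock, qout, open);
end structural;
```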
If the configuration specification is missing completely for a component, a
default association (default binding) will be performed. This means that an
entity with the same name from the current library will be selected for the
component, the most recently compiled architecture will be used, and the
generics and ports are associated with the identically named generics and
ports of the entity.
Most of the time, the default association is also the desired one, so that
in these cases it is not necessary to specify a configuration. However, the
default association is not sufficient when the component is to be associated
with an entity in a different library. One way to achieve this association is
to use the library and use clauses to make all the entities in that library
visible. For example, if the entity dff were compiled into the library named
basic, then the library and use clauses could be added to the architecture,
as illustrated below. In this case, the configuration specification is not
necessary.
Example
library ieee;
use ieee.std_logic_1164.all;
library basic;
use basic.all;
entity delay2 is
port (din, clock: in std_logic;
qout: out std_logic);
end delay2;
architecture structural of delay2 is
signal intern: std_logic;
component dff is
port (d, clk: in std_logic;
q, qn: out std_logic);
end component dff;
begin
d1: dff port map (din, clock, intern, open);
d2: dff port map (intern, clock, qout, open);
end structural;
The problem with this method is that all the entities in that library become
visible, regardless of whether they are going to be used or not. For this
reason, conflicts may result between the names of entities in the library
and other names in the design units in which the library is visible. A better
solution is to specify a configuration to associate the component with the
entity, and use the default names for generics and ports. This solution is
illustrated below.
Example
library ieee;
use ieee.std_logic_1164.all;
library basic;
entity delay2 is
port (din, clock: in std_logic;
qout: out std_logic);
end delay2;
architecture structural of delay2 is
signal intern: std_logic;
component dff is
port (d, clk: in std_logic;
q, qn: out std_logic);
end component dff;
for all: dff use entity basic.dff;
begin
d1: dff port map (din, clock, intern, open);
d2: dff port map (intern, clock, qout, open);
end structural;
Notes
required to ensure that entity and component names, generics and ports match.
configuration must be declared in the same library.
15. The ChipScope logic analyzer: ICON core
All of the ChipScope cores use the JTAG Boundary Scan port to communicate
with the host computer via a JTAG downloading cable (either a parallel or a
USB cable). The Integrated CONtroller (ICON) core provides a communications
path between the JTAG Boundary Scan port of the FPGA device and the other
ChipScope cores (ILA, VIO).
The ICON core can communicate with up to 15 ILA and/or VIO cores at any given
time. However, individual cores cannot share their control ports with any
other core. Therefore, the ICON core needs a distinct control port for every
ILA and VIO core.

The Boundary Scan primitive component is used to communicate with the JTAG
Boundary Scan logic of the FPGA device. This component extends the JTAG Test
Access Port (TAP) interface of the FPGA device so that up to four internal
scan chains can be created, depending on the device family. The ChipScope
Analyzer tool communicates with the ChipScope cores by using one of the
internal scan chains provided by the Boundary Scan component. For instance,
the Boundary Scan component of Spartan-3 and Spartan-3E devices provides two
internal scan chains, USER1 and USER2.
Since the ChipScope cores use a single internal scan chain of the
Boundary Scan component, it is possible to share the Boundary Scan component
with other elements of the user's design. One of the following two methods
can be used for this sharing:
including the unused Boundary Scan chain signals as ports on the ICON core
interface. The Boundary Scan component is instantiated inside the ICON core
by default.
attaching the USER1 or USER2 scan chain signals to the corresponding ports of
the ICON core interface.
When generating the ICON core, it is possible to enable the Include Boundary
Scan Ports option to provide access to the unused scan chain interfaces.
However, the Boundary Scan ports should be included only if the design needs
them. If the ports are included and not used, the synthesis tools may not
connect the ICON core properly, causing errors during the synthesis and
implementation stages of the design.
Below is illustrated the communication between the ICON, ILA, and VIO cores.

16. The ChipScope logic analyzer: ILA core
The Integrated Logic Analyzer (ILA) core is a customizable logic analyzer
core that can be used to monitor any internal signal of the design. Since the
ILA core is synchronous to the design being monitored, all the clock signal
constraints that are applied to that design are also applied to the
components inside the ILA core. The ILA core consists of the following main
components: trigger input and output logic, data capture logic, control and
status logic.
2.2.1. Trigger Input Logic
The trigger input logic allows detecting complex trigger events. Each ILA
core can have up to 16 trigger ports, and each port can be 1 to 256 bits
wide. The ability to provide multiple trigger ports is necessary in complex
systems where different types of signals or buses need to be monitored.
To detect events on a trigger port, up to 16 comparators can be connected to
that port. An individual comparator is called a match unit. This feature
enables multiple comparisons to be performed on the trigger port signals. The
results of one or more match units are combined together to form the overall
trigger condition event that is used to control data capture. Selecting a
single match unit conserves resources while still allowing some flexibility
in detecting trigger events. Selecting two or more match units allows a more
flexible trigger condition equation. However, increasing the number of match
units connected to a trigger port increases the usage of logic resources
accordingly.
The match units connected to the trigger ports can be one of the following
types:

high-to-low and low-to-high transitions.

comparisons.
detects high-to-low and low-to-high transitions.
and not in range comparisons.
Range comparator w/edges: Similar to the range comparator, but also detects
high-to-low and low-to-high transitions.
All the match units of a trigger port can be configured with an event
counter, with a selectable size of 1 to 32 bits. This counter can be
configured at run time to count events in the following ways:
or nonconsecutive events occur.
asserted once n consecutive or non-consecutive events occur.
consecutive events occur, and remains asserted until the match function is
not satisfied.
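One of the counting behaviors described above, in which the output is
asserted once n events have occurred (consecutive or not), can be modeled
with the behavioral sketch below. This is only an illustration and not the
implementation of the ILA core: the entity match_counter, its ports, and the
plain equality comparison are assumptions made for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical match unit (equality comparison) with an event counter.
entity match_counter is
  generic (width:    positive := 8;
           cnt_bits: positive := 4);
  port (clk:   in  std_logic;
        arm:   in  std_logic;  -- synchronous clear of the event counter
        trig:  in  std_logic_vector(width - 1 downto 0);
        value: in  std_logic_vector(width - 1 downto 0);
        n:     in  unsigned(cnt_bits - 1 downto 0);
        match: out std_logic);
end match_counter;

architecture behavioral of match_counter is
  signal count: unsigned(cnt_bits - 1 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if arm = '1' then
        count <= (others => '0');
      elsif trig = value and count < n then
        -- One more event seen; the count saturates at n.
        count <= count + 1;
      end if;
    end if;
  end process;

  -- Asserted from the clock cycle after the n-th event until re-armed.
  match <= '1' when count >= n and n /= 0 else '0';
end behavioral;
```

A wider counter (larger cnt_bits) allows larger values of n at the cost of
more flip-flops, which mirrors the resource trade-off mentioned for the
match units.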
The internal trigger condition of the ILA core can be accessed using the
optional trigger output port. The signal on this port can be used as a
trigger signal for external test equipment by attaching the signal to an
output pin. This signal can also be used by internal logic as an interrupt, a
trigger, or to cascade multiple ILA cores together. The shape (level or
pulse) and sense (active-High or active-Low) of the trigger output can be
controlled at runtime.
In order to monitor different kinds of signals and buses in a design,
multiple trigger ports can be used. For example, if the design includes an
internal system bus that consists of control, address, and data signals, then
a separate trigger port can be assigned to monitor each signal group, as
shown below. If all of these different signals and buses were connected
to a single trigger port, it would not be possible to monitor for individual
bit transitions on the CE, WE, and OE signals while looking for the Address
bus to be in a specified range.

A trigger condition is a Boolean or sequential combination of events that is
detected by match unit comparators attached to the trigger ports of the ILA
core. The trigger condition is used to specify the initial point in the data
capture window and can be located at the beginning, the end, or anywhere
within the data capture window.
A storage qualification condition is also a Boolean combination of events
that is detected by the match unit comparators. However, the storage
qualification condition differs from the trigger condition in that it
evaluates trigger port match unit events to decide whether or not to capture
and store each individual data sample. The trigger and storage qualification
conditions can be used together to define when to start the capture process
and what data to capture.
In the ILA core example shown above, suppose that the following operations
are required:
to Address = FF0000h.
from Address = 23AACCh and Data values between 00000000h and 1000FFFFh.
To implement these conditions, the TRIG0 and TRIG1 trigger ports should each
have two match units attached to them, one for the trigger condition and one
for the storage qualification condition. Table 1 summarizes the setup of the
trigger condition and storage qualification equations and of each individual
match unit, in order to satisfy the conditions stated initially ('R' means
rising edge).
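The interplay between the trigger condition and the storage qualification
condition can be sketched behaviorally as follows. This is a simplified
model, not the logic of the ILA core: it assumes that the trigger marks the
beginning of the capture window, and the entity capture_sketch with all its
ports is hypothetical. The inputs trigger and qualify stand for the results
of the two condition equations.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity capture_sketch is
  generic (depth: positive := 16);
  port (clk:     in  std_logic;
        arm:     in  std_logic;  -- restart the capture
        trigger: in  std_logic;  -- result of the trigger condition
        qualify: in  std_logic;  -- result of the storage qualification
        data:    in  std_logic_vector(7 downto 0);
        rd_addr: in  natural range 0 to depth - 1;
        rd_data: out std_logic_vector(7 downto 0);
        full:    out std_logic);
end capture_sketch;

architecture behavioral of capture_sketch is
  type mem_t is array (0 to depth - 1) of std_logic_vector(7 downto 0);
  signal mem:       mem_t;
  signal wr_ptr:    natural range 0 to depth := 0;
  signal triggered: std_logic := '0';
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if arm = '1' then
        triggered <= '0';
        wr_ptr    <= 0;
      else
        -- The trigger decides when the capture starts.
        if trigger = '1' then
          triggered <= '1';
        end if;
        -- The qualifier decides which samples are stored.
        if (triggered = '1' or trigger = '1') and qualify = '1'
            and wr_ptr < depth then
          mem(wr_ptr) <= data;
          wr_ptr      <= wr_ptr + 1;
        end if;
      end if;
    end if;
  end process;

  rd_data <= mem(rd_addr);
  full    <= '1' when wr_ptr = depth else '0';
end behavioral;
```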

2.2.2. Trigger Output Logic
The ILA core implements a trigger output port called TRIG_OUT. The signal on
this port is the output of the trigger condition that is set up at run-time
using the analyzer. The shape (level or pulse) and sense (active-High or
active-Low) of the trigger output can also be controlled at run-time.
The TRIG_OUT port can be connected to a device pin in order to trigger
external test equipment such as oscilloscopes and logic analyzers. The
TRIG_OUT port of one core can also be connected to a trigger input port of
another core in order to expand the trigger and data capture capabilities of
the design.

2.2.3. Data Capture Logic
Each ILA core can capture data using on-chip Block RAM resources
independently from all other cores in the design. Each ILA core can capture
data using one of two capture modes: Window and N samples.
In Window capture mode, the sample buffer can be divided into one or more
equal-sized sample windows. This mode uses a Boolean combination of the
individual trigger match unit events in order to collect enough data to fill
a sample window.
The N samples capture mode is useful for capturing the exact number of
samples needed per trigger without wasting capture storage resources. This
mode is similar to the Window capture mode except for two major differences:
buffer size minus 1;

2.2.4. Control and Status Logic
The ILA core contains control and status logic that is used to manage the
operation of the core. All logic necessary to properly communicate with the
ILA core is implemented by this control and status logic.
17. The ChipScope logic analyzer: VIO core
The Virtual Input/Output (VIO) core is a customizable core that can monitor
and drive internal FPGA signals in real time. Unlike the ILA core, no
storage resources are required. There are four types of signals available in
a VIO core:
provided by the JTAG cable. The input values are read back periodically and
displayed in the graphical interface of the analyzer.
signal. The input values are read back periodically and displayed in the
graphical interface of the analyzer.
graphical interface of the analyzer and driven out of the core to the
surrounding logic. A logical 1 or 0 value can be defined for individual
asynchronous outputs.
interface of the analyzer, synchronized to the design clock signal, and
driven out of the core to the surrounding logic. A logical 1 or 0 can be
defined for individual synchronous outputs.
Every VIO core input has additional cells to capture the presence of
transitions on the input. Since the design clock will most likely be much
faster than the sample period of the analyzer, it is possible for the signal
being monitored to transition many times between successive samples. The
activity detectors capture this behavior and the results are displayed along
with the signal value in the graphical interface of the analyzer. In the
case of a synchronous input, activity cells capable of monitoring for
asynchronous and synchronous events are used. This feature can be used to
detect glitches as well as synchronous transitions on the synchronous input
signal.
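The idea of an activity detector can be illustrated with the behavioral
sketch below, which latches rising and falling transitions of a monitored
signal until the flags are cleared. Only transitions visible at clock edges
are modeled, so glitches between edges are not detected. The entity
activity_sketch and its ports are assumptions made for this example and do
not describe the cells used inside the VIO core.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity activity_sketch is
  port (clk:   in  std_logic;
        clear: in  std_logic;  -- pulsed after the flags have been read
        sig:   in  std_logic;  -- monitored signal
        rose:  out std_logic;  -- a low-to-high transition was seen
        fell:  out std_logic); -- a high-to-low transition was seen
end activity_sketch;

architecture behavioral of activity_sketch is
  signal prev:   std_logic := '0';
  signal rose_i: std_logic := '0';
  signal fell_i: std_logic := '0';
begin
  process (clk)
  begin
    if rising_edge(clk) then
      prev <= sig;
      if clear = '1' then
        rose_i <= '0';
        fell_i <= '0';
      else
        -- Sticky flags: set on a transition, held until cleared.
        if prev = '0' and sig = '1' then
          rose_i <= '1';
        end if;
        if prev = '1' and sig = '0' then
          fell_i <= '1';
        end if;
      end if;
    end if;
  end process;

  rose <= rose_i;
  fell <= fell_i;
end behavioral;
```

Because the flags are sticky, a slow reader still learns that the signal
changed between two successive samples, even if the sampled value itself is
the same in both.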
Every VIO core synchronous output has the ability to output a static 1, a
static 0, or a pulse train of successive values. A pulse train is a 16-clock
cycle sequence of 1 and 0 bits that are driven out of the core on successive
design clock cycles. The pulse train sequence is defined in the analyzer and
is executed only one time after it is loaded into the core.
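A pulse train of this kind can be modeled with a 16-bit shift register that
is loaded once and then shifted out on successive clock cycles. The sketch
below is a behavioral illustration only; the entity pulse_train_sketch, the
bit order (most significant bit first), and the idle value of 0 after the
sequence has finished are assumptions, not properties of the VIO core.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity pulse_train_sketch is
  port (clk:     in  std_logic;
        load:    in  std_logic;  -- loads a new 16-bit pattern
        pattern: in  std_logic_vector(15 downto 0);
        q:       out std_logic);
end pulse_train_sketch;

architecture behavioral of pulse_train_sketch is
  signal sr:        std_logic_vector(15 downto 0) := (others => '0');
  signal remaining: natural range 0 to 16 := 0;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if load = '1' then
        sr        <= pattern;
        remaining <= 16;
      elsif remaining > 0 then
        -- Shift the next bit into the output position.
        sr        <= sr(14 downto 0) & '0';
        remaining <= remaining - 1;
      end if;
    end if;
  end process;

  -- Each pattern bit is driven for one clock cycle, then the output idles.
  q <= sr(15) when remaining > 0 else '0';
end behavioral;
```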
18. The Xilinx Embedded Development Kit: overview
The Xilinx Embedded Development Kit (EDK) is a collection of tools and IP
(Intellectual Property) cores that allows designing embedded processor
systems for implementation in a Xilinx FPGA device. This kit enables the
design of both the hardware and software components of an embedded system
within a single design environment. The EDK uses the Xilinx synthesis and
implementation tools to generate the hardware components of the embedded
system (one or two microprocessors and various peripherals), and GNU software
tools to generate the software components of the embedded system (the machine
code executed by each microprocessor).
There are two types of processors that can be used in an embedded system
designed with the EDK package. The first is the Xilinx MicroBlaze soft
processor core, which is synthesized using the available resources in the
target FPGA device. The second is the IBM PowerPC hardware processor core,
which is integrated into certain versions of Virtex FPGA devices and
therefore does not require additional resources of the device.
One of the main components of the EDK package is the Xilinx Platform Studio
(XPS). This is a graphical Integrated Development Environment (IDE) that
includes all the tools required to create the hardware and software
components of the embedded system. The main tools included in the XPS
development environment are the following:
The Base System Builder (BSB) wizard makes it easy to create a working system
targeting a supported board. Later on, the user can modify the system by
changing parameters of the existing components or extend it by adding other
components. Standard input and output devices can be specified in the BSB
wizard, and software applications can be created for memory test and for
peripheral self-test.
The Create and Import Peripheral (CIP) wizard enables users to create their
own peripherals and import them into the XPS project.
The Hardware Platform Generator tool (PlatGen) generates the netlist for the
hardware platform of the embedded system.
The Library Generator tool (LibGen) generates and configures the software
libraries, device drivers, file systems, and interrupt handlers for the
embedded system.
The Simulation Model Generator tool (SimGen) generates simulation models of
the embedded hardware system based on either the original hardware design
(behavioral) or on the finished FPGA implementation.
The Debug Configuration wizard can be used to insert ChipScope cores into the
system in order to perform hardware debugging of the system.
The Bitstream Initializer (BitInit) tool updates the configuration bitstream
of the FPGA device with the executable code of the software application. This
tool calls the Data2MEM utility provided in the Xilinx ISE design environment
to initialize the Block RAM (BRAM) memory of the FPGA device with the
executable code.
The Xilinx Software Development Kit (SDK) is an integrated development
environment, complementary to the XPS environment, which can be used to
develop and debug C/C++ software applications for the embedded system. The
Xilinx SDK environment is based on the Eclipse tool suite.
The EDK package also includes other components, such as:
IP cores for a large number of peripherals;
Device drivers and libraries required to develop software applications;
GNU compiler, linker, and debugger for developing C/C++ software applications
targeting the MicroBlaze and PowerPC processors;
Sample projects.
19. The Xilinx Embedded Development Kit: generating the hardware platform
The hardware platform of an embedded system created with the EDK package
includes one or two processors, along with various peripherals and memory
blocks. Each processor and peripheral core can be customized. Implementation
parameters control optional features of the cores and define the addresses
assigned to each peripheral. These addresses are mapped in the memory address
space.
Communication between a processor and on-chip BRAM memories is performed via
a Local Memory Bus (LMB). This bus provides single-cycle access to dual-port
BRAM memories and is split into instruction LMB and data LMB. Peripherals can
connect either directly to the Processor Local Bus (PLB) or to the On-Chip
Peripheral Bus (OPB). The PLB and OPB buses are connected via a bus bridge.
The hardware platform description is maintained by the XPS environment in a
file known as the Microprocessor Hardware Specification (MHS) file, with the
.mhs extension.
This is an editable text file and represents the main source file that
describes the hardware part of the embedded system. The MHS file contains the
instantiations of the processor cores and peripheral cores along with their
parameters. The file defines the configuration of the embedded system and
includes information on the buses, processors, peripherals, interconnections,
interrupt request priorities, and address space. For each peripheral, there
is a Microprocessor Peripheral Definition (MPD) file, with the .mpd
extension, which defines the configurable parameters along with their default
values, as well as the available ports of the peripheral.
There are two main steps for generating the hardware platform: netlist
generation (synthesis) and bitstream generation (implementation). These steps
are described next.
3.1. Netlist Generation
For generating the hardware platform, XPS first invokes the PlatGen tool to
generate a system netlist. The PlatGen tool performs the following:
-Reads the hardware platform description from the MHS file.
-Generates a representation in a hardware description language (e.g., VHDL)
of the MHS file into a system.vhd file along with a system_stub.vhd file.
The system.vhd file is used when the embedded system is developed entirely in
the XPS environment. The system_stub.vhd file is an instantiation template
file of the embedded system and is used when this system is a sub-module in a
larger design.
-Generates a Block RAM Memory Map (BMM) file, with the .bmm extension. This
is a text file that describes how individual BRAM modules constitute a
contiguous logical data space.
-Extracts the peripheral netlists from the EDK install directory.
-Calls the Xilinx Synthesis Technology (XST) tool to synthesize the design.
-Generates a system netlist file and the peripheral netlist files (NGC files
with .ngc extension).
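When the embedded system is used as a sub-module, the larger design
instantiates it as an ordinary component. The sketch below shows the idea.
The entity board_top and the port list of the system component are
hypothetical; in a real project the component declaration is taken from the
generated system_stub.vhd file, whose ports depend on the external ports
defined in the MHS file.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical top level of a larger design.
entity board_top is
  port (clk_in:  in  std_logic;
        rst_in:  in  std_logic;
        uart_rx: in  std_logic;
        uart_tx: out std_logic);
end board_top;

architecture structural of board_top is
  -- Assumed port list; the real one comes from system_stub.vhd.
  component system is
    port (sys_clk_pin: in  std_logic;
          sys_rst_pin: in  std_logic;
          uart_rx_pin: in  std_logic;
          uart_tx_pin: out std_logic);
  end component system;
begin
  -- The embedded processor system as a sub-module.
  embedded: system
    port map (sys_clk_pin => clk_in,
              sys_rst_pin => rst_in,
              uart_rx_pin => uart_rx,
              uart_tx_pin => uart_tx);

  -- User logic of the larger design would be added here.
end structural;
```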
The netlist generation process is illustrated below.

3.2. Bitstream Generation
For generating the bitstream configuration file, the Xilinx ISE
implementation tools are used. It is possible to run the implementation tools
from the ISE Project Navigator graphical user interface. However, these tools
are usually invoked in batch mode via the XFlow command-line program. This
program has a simple, flexible, and user customizable interface to the Xilinx
ISE implementation tools.
The XFlow program reads the netlist files generated by the PlatGen tool along
with a user constraints file (UCF) and invokes the Xilinx ISE implementation
tools (NGDBuild, MAP, PAR). Then the XFlow program calls the BitGen tool to
generate the system.bit configuration file for the FPGA device. The generated
configuration file does not include the executable code of any software
application. The BitGen tool also generates a BMM file (system_bd.bmm) that
contains the physical locations of the BRAM memories.

20. The Xilinx Embedded Development Kit: generating the software platform and executable code
A software platform is a collection of libraries and device drivers that are
used by any software application. Before creating a software application, a
software platform has to be generated with the LibGen tool. This tool uses a
description of the software system maintained by the XPS development
environment. This description is stored in a file known as the Microprocessor
Software Specification (MSS) file, with the .mss extension, which is
analogous to the MHS file that contains the description of the hardware
platform. The MSS file and the software applications are the main source
files that describe the software part of the embedded system.
LibGen configures libraries and device drivers using the information stored
in the MSS file and generates the following archives of object files:
-libc.a: Standard C library;
-libXil.a: Xilinx library;
-libm.a: Math functions library.
In addition to these object files, the LibGen tool also generates a Board
Support Package (BSP) for each processor. This package is a collection of
files containing software drivers associated with peripherals, selected
libraries, standard I/O devices, and interrupt handler routines. Therefore,
it is recommended to generate the BSPs after the hardware components are
defined and the address map is specified.
After generating the libraries and BSPs, the next step consists in compiling
the source files of the software applications and generating the executable
file for each processor. This is performed by invoking the following GNU
tools:
-Pre-processor: Replaces all macros with their definitions in the .c or .h
files.
-Processor-specific compiler: Compiles C/C++ code and generates assembly-
language code.
-Processor-specific assembler: Converts assembly-language code to object
code.
-Linker: Combines all the object and library files into a single executable
file using either a default linker script or a user-defined linker script.
The linker script describes how the various sections of the same type (e.g.,
.text, .data) in all the object files are combined into the corresponding
sections in the output executable file.
The output executable file is an ELF (Executable and Linkable Format) file.
This is a binary file named executable.elf that contains machine code.
The process of generating the executable code:
