The Xilinx ISE design environment: design entry modules
2.2. Design Entry Modules
Design entry is the first step in the design flow. During this step, source files are created to represent the design. The top-level design source file can have one of the following formats:
- HDL (VHDL or Verilog)
- schematic
- EDIF netlist
- NGC/NGO netlist

For designs with an HDL or schematic file as the top-level source file, lower-level source files can have several formats, including HDL files, schematic files, IP cores, and netlists. For designs with an EDIF or NGC/NGO netlist as the top-level source file, EDIF or NGC/NGO files are the only source file types allowed in the project.

2.2.1. HDL Editor
The HDL editor allows creating a file in a hardware description language, such as VHDL or Verilog, which describes the behavior and structure of the design. Using HDLs offers the following advantages:
- Synthesis decreases design time by eliminating the need to define every gate. In addition, the synthesis tool can automate the process, using several encoding styles for state machines or performing automatic I/O buffer insertion during optimization, resulting in greater efficiency.
- The functionality of the design can be verified early by simulating the HDL description. Design simulation at the gate level before implementation allows evaluating the architectural and design decisions.
2.2.2. Schematic Editor
The schematic editor allows creating a visual representation of the designed system. The editor can be used for the top-level design file, the lower-level design files, or both. A schematic can represent a top-level design, and the lower-level modules can be created using any of the following source types: HDL files, CORE Generator IP cores, Architecture Wizard IP cores, or schematic files.

Schematics can also be used to define the lower-level modules of the design. If the top-level design file is a schematic file, then a schematic symbol must be created from each lower-level module and instantiated in the top-level schematic file. If the top-level design file is an HDL file, an HDL instantiation template must be created from the schematic, and then the template must be instantiated in the top-level HDL file. All schematic files are ultimately converted to either VHDL or Verilog structural netlists before being passed on to the synthesis tool during the synthesis step.

2.2.3. CORE Generator Software
The CORE Generator software reduces design time by providing access to parameterized IP (Intellectual Property) cores for Xilinx FPGA devices. The software provides a catalog of architecture-specific, domain-specific (embedded, connectivity, DSP), and product-specific (automotive, consumer electronics, military equipment, communications) IP cores. These user-customizable IP cores range in complexity from commonly used modules, such as memories and FIFO memories, to system-level building blocks. Using these IP cores can significantly reduce the design time. The CORE Generator software includes, among others, the following IP cores: accumulator, multiplier, complex multiplier, Logic Analyzer, and Virtual Input/Output.
02. The Xilinx ISE design environment: synthesis module
After design entry and optional simulation, the next step in the design flow is synthesis. The Xilinx ISE Design Suite environment includes the Xilinx Synthesis Technology (XST) software, which synthesizes VHDL or Verilog designs to create Xilinx-specific netlist files, known as NGC files. Unlike outputs from other tools, which consist of an EDIF file with an associated NCF constraints file, NGC files contain both logical design data and constraints. The NGC file is then accepted as input to the translate step of the design implementation process.
In addition to NGC files, the XST software also generates the following outputs: a synthesis report, an NGR file with the RTL (Register Transfer Level) schematic, and a technology schematic. The synthesis report contains the results from the synthesis run, including an area and timing estimation. The RTL schematic is a representation of the design before the optimization using generic symbols, such as adders, multiplexers, counters, gates. This schematic is generated after the HDL synthesis phase of the synthesis process. The technology schematic is a representation of an NGC file using logic elements optimized to the target architecture or technology. This schematic is generated after the optimization phase of the synthesis process.
First, the HDL code is parsed: XST checks whether the HDL code is correct and reports syntax errors. If there are no syntax errors, the HDL synthesis is performed. XST analyzes the HDL code and attempts to infer specific design building blocks or macros (such as multiplexers, memories, adders, subtractors) for which it can create efficient technology implementations. To reduce the number of inferred macros, XST performs a resource sharing check that leads to a reduction of the area and an increase in the clock frequency. At this step, the XST software recognizes the Finite State Machines (FSMs) independent of the modeling style used. To create the most efficient implementation, XST uses the optimization criterion that has been specified (area or speed) to determine the FSM encoding algorithm that will be used. The last step of the synthesis is the low-level optimization. XST transforms inferred macros and general logic into a technology-specific implementation.

03. The Xilinx ISE design environment: implementation modules
1.1. Implementation Modules
The implementation process consists of three stages:
- translation
- mapping
- placement and routing
1.1.1. Translation Module
The NGDBuild software is the Xilinx tool that is used to perform the translation step. The NGDBuild tool generates a logical description of the design and a description of the original hierarchy based on the input netlist file.
The format of the input netlist file is either EDIF or NGC. An NGC file, which is created by the Xilinx XST software, is a binary file containing both logical design data and constraints. NGDBuild also accepts as inputs NMC library files, containing the definition of physical macros that can be instantiated into a design. Other inputs to the NGDBuild software are UCF and NCF constraints files. A UCF file (User Constraints File) is an ASCII file containing timing and placement constraints that affect how the logical design is implemented in the target device. An NCF file (Netlist Constraints File) is created by a vendor toolset and contains constraints specified within that toolset.

The NGDBuild software generates an intermediary NGO file containing a logical description of the design in terms of its original components and their hierarchy. The outputs of the NGDBuild software are an NGD (Native Generic Database) file and a build report BLD file. The NGD file is a binary file that contains:
- a logical description of the design in terms of its original components, their hierarchy, and the primitives (gates, look-up tables, flip-flops, RAMs) to which the design is reduced;
- the definition of physical macros from the NMC files.

1.1.2. Mapping Module
The MAP software performs the mapping of the logical design to the available logical resources of the target FPGA device.
The main input file to the MAP software is an NGD file, which is generated during the translate process by the NGDBuild tool. MAP also accepts as inputs NMC files containing the definition of physical macros. The MAP software first performs a design rule check on the design in the input NGD file. The next step is mapping the design logic to the components of the FPGA device. The mapping is based on the mapping directives.

The main output of the MAP software is an NCD (Native Circuit Description) file, which is a physical representation of the design mapped to the components in the target FPGA device, such as configurable logic blocks (CLBs) and I/O blocks (IOBs). The mapped NCD file can then be placed and routed using the PAR software. Optionally, the mapped NCD file can be used as a guide file and applied as an input to guide a later run of the MAP software. MAP also generates a PCF file (Physical Constraints File), which is an ASCII text file containing constraints specified during design entry, expressed in terms of timing restrictions, physical elements and other attributes placed in a UCF or NCF file. Other outputs are an NGM file, which is used for back-annotation and contains logical and physical information about the mapping process, and a MAP report (MRP) file.
1.1.3. Placement and Routing Module
After the mapping module generates the NCD file, the design can be placed and routed using the PAR software. PAR accepts a mapped NCD file and a PCF file as inputs, places and routes the design, and then generates a placed and routed NCD file to be used by the configuration bitstream generator module. The place and route process is structured in two phases:
- Placement: PAR executes the phases of the placer module. The components are placed into locations based on factors such as constraints specified in the PCF file, length of connections, and available routing resources. PAR writes the NCD file after all the placer phases are complete.
- Routing: PAR executes the phases of the router module. This module performs a converging procedure for a solution that fully routes the design and meets timing constraints. After the design is fully routed, PAR generates an NCD file that can be used for timing analysis. The PAR software writes a new NCD file as the routing improves throughout the router phases.

The placement and routing operations can be performed in two modes: timing-driven and cost-based. Timing-driven placement and routing is based on the Xilinx timing analysis software, an integrated static timing analysis tool. Placement and routing are executed according to timing constraints specified at the beginning of the design process. The timing analysis tool interacts with the PAR software to ensure that the timing constraints imposed on the design are met. Cost-based placement and routing is performed using various cost tables that assign weighted values to relevant factors such as constraints, length of connections, and available routing resources. Cost-based placement and routing is used if no timing constraints are present in the design.
The PAR software may accept as input an optional guide file, which is a placed and routed NCD file that can be used as a guide for placing and routing the design. The PAR file illustrated in the figure is a report file that contains summary information of all placement and routing iterations.

04. Structure and execution of a process
A process is a sequence of statements that are executed in the specified order. The process declaration delimits a sequential domain of the architecture in which the declaration appears. Processes are used for behavioral descriptions.

1.1.1. Structure and Execution of a Process
A process may appear anywhere in an architecture body (the part starting after the begin keyword). The basic structure of a process declaration is the following:

    [name:] process [(sensitivity_list)]
        [type_declarations]
        [constant_declarations]
        [variable_declarations]
        [subprogram_declarations]
    begin
        sequential_statements
    end process [name];

The process declaration is contained between the keywords process and end process. A process may be assigned an optional name for simpler identification of the process in the source code. The name is an identifier and must be followed by the ':' character. This name is also useful for simulation, for example, to set a breakpoint in the simulation execution. The name may be repeated at the end of the declaration, after the keywords end process.

The optional sensitivity list is the list of signals to which the process is sensitive. Any event on any of the signals specified in the sensitivity list causes the sequential statements in the process to be executed, similar to the instructions in a usual program. As opposed to a programming language, in VHDL the end process clause does not specify the end of process execution. The process will be executed in an infinite loop.
When a sensitivity list is specified, the process will only suspend after the last statement, until a new event is produced on the signals in the sensitivity list. Note that an event only occurs when a signal changes value. Therefore, the assignment of the same value to a signal does not represent an event. When the sensitivity list is missing, the process will be run continuously. In this case, the process must contain a wait statement to suspend the process and to activate it when an event occurs or a condition becomes true. When the sensitivity list is present, the process cannot contain wait statements.

The declarative part of the process is contained between the process and begin keywords. This part may contain declarations of types, constants, variables, and subprograms (procedures and functions) that are local to the process. Thus, the declared items can only be used inside the process. Note that signals cannot be declared inside a process; only constants and variables may be declared.

The statement part of the process starts after the begin keyword. This part contains the statements that will be executed on each activation of the process. It is not allowed to use concurrent statements inside a process. We present the declaration of a simple process composed of a single sequential signal assignment.

    proc1: process (a, b, c)
    begin
        x <= a and b and c;
    end process proc1;

05. Processes in the VHDL language
2.1. Processes with Incomplete Sensitivity Lists
Some synthesis tools may not check the sensitivity lists of processes. These tools may assume that all signals on the right-hand side of sequential signal assignments are in the sensitivity list. Thus these synthesis tools will interpret the two processes in the example as identical.

Example
    proc6: process (a, b, c)
    begin
        x <= a and b and c;
    end process proc6;

    proc7: process (a, b)
    begin
        x <= a and b and c;
    end process proc7;

All synthesis tools will interpret process proc6 as a 3-input AND gate.
Some synthesis tools will also interpret process proc7 as a 3-input AND gate, even though when this code is simulated, it will not behave as such. While simulating, a change in value of signal a or b will cause the process to execute, and the value of the logical AND of signals a, b, and c will be assigned to signal x. However, if signal c changes value, the process is not executed, and signal x is not updated. Because it is not clear how a synthesis tool should generate a circuit for which a transition of signal c does not cause a change of signal x, but for which a change of signals a or b causes signal x to be updated with the logical AND of signals a, b, and c, there are the following alternatives for the synthesis tools:
- to interpret the process as if it had a complete sensitivity list (one that includes all signals on the right-hand side of any signal assignment statement within the process);
- to report an error for a process specified without a complete sensitivity list.

The second variant is preferable, because the designer will have to modify the source code so that the functionality of the generated circuit will match the functional simulation of the source code. Although it is syntactically legal to declare processes without a sensitivity list or a wait statement, such processes never suspend. Therefore, if such a process were simulated, the simulation time would never advance because the initialization phase, in which all processes are executed until suspended, would never complete.

2.2. Combinational and Sequential Processes
Both combinational and sequential processes are interpreted in the same way at synthesis, the only difference being that for sequential processes the output signals are stored into registers. A simple combinational process is described in the example.

Example
    proc8: process
    begin
        wait on a, b;
        z <= a and b;
    end process proc8;

This process will be implemented by synthesis as a two-input AND gate. For a process to model combinational logic, it must contain in the sensitivity list all the signals that are inputs of the process.
In other words, the process must be reevaluated every time one of the inputs to the circuit it models changes. In this way combinational logic is correctly modeled. If a process is not sensitive to all its inputs and it is not a sequential process, then it cannot be synthesized, since there is no hardware equivalent of such a process. Not all synthesis tools enforce such a rule, so great care should be taken in the design of combinational processes in order not to introduce errors in the design. Such errors will cause subtle differences between the simulated model and the circuit obtained by synthesis, because a noncombinational process is interpreted by the synthesizer as a combinational circuit.

If a process contains a wait until statement or an if signal'event statement, the process will be interpreted as a sequential process. Hence the process in the example will be interpreted as a sequential process.

Example
    proc9: process
    begin
        wait until clk = '1';
        z <= a and b;
    end process proc9;

By synthesizing this process, the circuit in Figure 1 will result, where a flip-flop is added on the output.
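The second way of marking a process as sequential mentioned above, the if signal'event test, can be sketched as follows. The entity name and_reg, the architecture name rtl, and the std_logic port types are assumptions made for illustration; only the assignment itself mirrors proc9.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical wrapper entity; names and port types are assumptions.
entity and_reg is
    port (clk  : in  std_logic;
          a, b : in  std_logic;
          z    : out std_logic);
end and_reg;

architecture rtl of and_reg is
begin
    -- Sequential process written with a sensitivity list instead of
    -- a wait until statement. Only clk is listed, because a and b
    -- are sampled on the clock event and are not asynchronous inputs.
    proc9_sl: process (clk)
    begin
        if clk'event and clk = '1' then
            z <= a and b;   -- registered AND of a and b
        end if;
    end process proc9_sl;
end rtl;
```

Both forms describe an AND gate followed by a flip-flop; which one to use is largely a matter of coding style and of what a given synthesis tool accepts.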
06. Synthesis of if statements
2.4. Synthesis of if Statements
An if statement may be implemented by a multiplexer. First consider the if statement without any elsif branch. The example presents the use of such a statement to describe a comparator.
Example
    library ieee;
    use ieee.std_logic_1164.all;

    entity comp is
        port (a, b: in std_logic_vector (7 downto 0);
              equal: out std_logic);
    end comp;

    architecture functional of comp is
    begin
        process (a, b)
        begin
            if a = b then
                equal <= '1';
            else
                equal <= '0';
            end if;
        end process;
    end functional;

The previous example tests the equality of two signals of type std_logic_vector, representing two 8-bit vectors, and gives a result of type std_logic. The resulting circuit is shown in the figure below. Note that, as with other examples, in practice the synthesis tool will remove the inefficiencies in the circuit (in this case, the constant inputs to the multiplexer) to give a minimal solution.
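For comparison, the same comparator can be written as a single concurrent conditional signal assignment and exercised with a small self-checking testbench. This is a minimal sketch: the names comp_cc and comp_cc_tb, the test vectors, and the 10 ns delays are assumptions and do not come from the original example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Concurrent version of the comparator (hypothetical name comp_cc).
entity comp_cc is
    port (a, b  : in  std_logic_vector (7 downto 0);
          equal : out std_logic);
end comp_cc;

architecture dataflow of comp_cc is
begin
    equal <= '1' when a = b else '0';
end dataflow;

library ieee;
use ieee.std_logic_1164.all;

-- Testbench: applies one equal and one unequal pair of vectors.
entity comp_cc_tb is
end comp_cc_tb;

architecture sim of comp_cc_tb is
    signal a, b  : std_logic_vector (7 downto 0) := (others => '0');
    signal equal : std_logic;
begin
    uut: entity work.comp_cc
        port map (a => a, b => b, equal => equal);

    stimulus: process
    begin
        a <= "10100101"; b <= "10100101";
        wait for 10 ns;
        assert equal = '1' report "expected equal = '1'" severity error;

        b <= "10100100";   -- differs from a in the last bit
        wait for 10 ns;
        assert equal = '0' report "expected equal = '0'" severity error;

        wait;              -- suspend forever, ending the stimulus
    end process stimulus;
end sim;
```

The testbench is intended for simulation only; the final wait statement without a condition keeps the stimulus process from restarting.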
A multi-branch if statement, in which at least one elsif clause appears, is implemented by a multi-stage multiplexer. Consider the if statement in this example.

Example
    process (a, b, c, s0, s1)
    begin
        if s0 = '1' then
            z <= a;
        elsif s1 = '1' then
            z <= b;
        else
            z <= c;
        end if;
    end process;

The result of implementing the if statement from the previous example is presented in the figure below. This circuit is equivalent to the one that results from implementing a conditional signal assignment, which is a concurrent statement.
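The equivalent conditional signal assignment mentioned above could look like the following sketch. The entity name prio_mux and the std_logic port types are assumptions, since the original process does not show its enclosing entity.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity wrapping the concurrent form of the if statement.
entity prio_mux is
    port (a, b, c : in  std_logic;
          s0, s1  : in  std_logic;
          z       : out std_logic);
end prio_mux;

architecture dataflow of prio_mux is
begin
    -- Conditions are tested in the order written, so s0 has
    -- priority over s1, exactly as in the if/elsif/else chain.
    z <= a when s0 = '1' else
         b when s1 = '1' else
         c;
end dataflow;
```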
The conditions in the successive branches of an if statement are evaluated independently. In the previous example, the conditions involve the two signals s0 and s1. There can be any number of conditions, and each of them is independent of the others. The structure of the if statement ensures that the earlier conditions are tested first. In this example, signal s0 has been tested before signal s1. This priority is reflected in the generated circuit, where the multiplexer controlled by signal s0 is nearer to the output than the multiplexer controlled by signal s1.

It is important to remember the existence of this priority for condition testing, so that redundant tests can be eliminated. Consider the if statement in this example, which is equivalent to the if statement in the previous one.

Example
    process (a, b, c, s0, s1)
    begin
        if s0 = '1' then
            z <= a;
        elsif s0 = '0' and s1 = '1' then
            z <= b;
        else
            z <= c;
        end if;
    end process;

The additional condition s0 = '0' is redundant since it will be tested only if the first condition of the if statement is false. It is recommended to avoid such redundancies, because there is no guarantee that they will be detected and removed by the synthesis tool. For multi-branch if statements, normally each condition will be dependent on different signals and variables. If every branch is dependent on the same signal, then it is more advantageous to use a case statement.

07. Incomplete if statements
2.5. Incomplete if Statements
In the examples presented so far, all the if statements have been complete. In other words, the target signal has been assigned a value under all possible conditions. However, there are two situations when a signal does not receive a value: when the else clause of the if statement is missing, and when the signal is not assigned a value in some branches of the if statement. In both cases the interpretation is the same. In the situations when a signal does not receive a value, its previous value is preserved.
The question is what the previous value of the signal is. If there is a previous assignment statement in which the signal appears as target, then the previous value comes from that assignment statement. If not, the value comes from the previous execution of the process, leading to feedback in the circuit. The first case is illustrated in this example.

Example
    process (a, b, en)
    begin
        z <= a;
        if en = '1' then
            z <= b;
        end if;
    end process;

In this case, the if statement is incomplete because the else clause is missing. In the if statement, the signal z gets a value if the condition en = '1' is true, but remains unassigned if the condition is false. The previous value comes from the unconditional assignment before the if statement. The if statement of the previous example is equivalent to the if statement of this example.

Example
    process (a, b, en)
    begin
        if en = '1' then
            z <= b;
        else
            z <= a;
        end if;
    end process;

When the if statement is incomplete and there is no previous assignment, a latch will be inserted at the output of the circuit and a feedback will exist from the output to the input. This is because the value of the signal from the previous execution of the process is preserved and it becomes the value in the current execution of the process. This form of the if statement is used to describe a flip-flop or a register with an enable input, as in this example.

Example
    process (clk)
    begin
        if (clk'event and clk = '1') then
            if en = '1' then
                q <= d;
            end if;
        end if;
    end process;

Signal q is updated with the new value of signal d when the condition is true, but is not updated when the condition is false. In this case, the previous value of signal q is preserved by sequential feedback of q. The resulting circuit is presented in this figure.
The if statement of the previous example is equivalent to the complete if statement of this example.

Example
    process (clk)
    begin
        if (clk'event and clk = '1') then
            if en = '1' then
                q <= d;
            else
                q <= q;
            end if;
        end if;
    end process;

When the condition is false, the signal q is assigned to itself, which is equivalent to preserving its previous value.

One of the most common errors encountered in VHDL descriptions targeted for synthesis is the unintended introduction of feedback in the circuit due to an incomplete if statement. This will insert latches in the synthesized design, which can be problematic for FPGA devices because timing for paths containing latches is difficult to analyze. Synthesis tools usually report when a latch is inserted. In order to avoid the insertion of latches, the designer must ensure that every signal assigned to in an if statement within a combinational process (which is therefore an output signal of the process) receives a value under every possible combination of conditions. In practice, there are two ways of doing this: assigning a value to the output signals in every branch of the if statement, including the else clause, or initializing the signals with an unconditional assignment before the if statement.

In the following example, although the if statement looks complete, different signals are being assigned a value in each branch of the if statement. Thus both signals z and y will have asynchronous feedback.

Example
    process (a, b, c)
    begin
        if c = '1' then
            z <= a;
        else
            y <= b;
        end if;
    end process;

Another example is where there is a redundant test for a condition which must be true.

Example
    process (a, b, c)
    begin
        if c = '1' then
            z <= a;
        elsif c = '0' then
            z <= b;
        end if;
    end process;

In this case, although the if statement looks complete (assuming that signal c is of type bit), each of the conditions in the if statement is synthesized independently. The synthesis tool may therefore not detect that this second condition is redundant.
In this case, the if statement is synthesized as a three-way multiplexer, the third input being the missing else condition which is the feedback of the previous value. The circuit synthesized for this example is shown below.
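The two latch-inferring processes above can be rewritten so that every output is assigned on every path through the process. The sketch below assumes std_logic ports, an entity named no_latch, a separate output z2 for the second process, and default values of '0'; all of these are illustrative choices, and the right defaults depend on the intended function of the circuit.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity; names, port types and defaults are assumptions.
entity no_latch is
    port (a, b, c  : in  std_logic;
          y, z, z2 : out std_logic);
end no_latch;

architecture rtl of no_latch is
begin
    -- Fix 1: unconditional default assignments before the if statement,
    -- so both y and z receive a value whichever branch is taken.
    fix_defaults: process (a, b, c)
    begin
        z <= '0';
        y <= '0';
        if c = '1' then
            z <= a;
        else
            y <= b;
        end if;
    end process fix_defaults;

    -- Fix 2: a final else replaces the redundant test c = '0',
    -- so no branch is left in which z2 keeps its previous value.
    fix_else: process (a, b, c)
    begin
        if c = '1' then
            z2 <= a;
        else
            z2 <= b;
        end if;
    end process fix_else;
end rtl;
```

With either fix the processes describe purely combinational logic, so no feedback path is needed to hold a previous value.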
2.6. If Statements with Variables
So far, only signals were used in the if statements. The same rules apply when using variables, with a single difference. Like a signal, if a variable is assigned to only in some branches of the if statement, then the previous value is preserved by feedback. Unlike the case when a signal is used, the reading and writing of a variable in the same process will result in feedback only if the read occurs before the write. In this case, the value read is the previous value of the variable. In the case when a signal is used, a read and a write in the same process will always result in feedback.

This observation may be used to create registers or counters using variables. Remember that a sequential process is interpreted by synthesis by placing a flip-flop or register on every signal assigned to in the process. This means that normally variables are not written to flip-flops or registers. However, if there is feedback of a previous variable value, then this feedback is implemented via a flip-flop or register to make the process synchronous.

The example below describes a counter using the unsigned type. When a value of type unsigned is incremented, if the value is the highest value of the range, then the lowest value of the range is obtained.

Example
    process (clk)
        variable count: unsigned (7 downto 0);
    begin
        if (clk'event and clk = '1') then
            if rst = '1' then
                count := "00000000";
            else
                count := count + 1;
            end if;
            result <= count;
        end if;
    end process;

In this example, in the else branch of the if statement the previous value of the count variable is being read to calculate the next value. This results in feedback. Note that in this example two registers are actually created. According to the feedback rules, variable count will be registered. Signal result will also be registered, because all signals assigned to in a sequential process will be registered.
This extra register will always contain the same value as the register for variable count. The synthesis tool will normally eliminate this redundant register.

08. Examples of combinational circuits
2.1. Multiplexers
Multiplexers may be described using several methods. The example below describes the 4:1 multiplexer for 4-bit buses of this figure using a selected signal assignment.
Example
    library ieee;
    use ieee.std_logic_1164.all;

    entity mux is
        port (a, b, c, d: in std_logic_vector (3 downto 0);
              s: in std_logic_vector (1 downto 0);
              x: out std_logic_vector (3 downto 0));
    end mux;

    architecture arch_mux of mux is
    begin
        with s select
            x <= a when "00",
                 b when "01",
                 c when "10",
                 d when "11",
                 d when others;
    end arch_mux;

The reason for using the others keyword is that the selection signal s is of type std_logic_vector, and there are nine possible values for each element of a data object of this type. All the possible values of the selection signal must be covered. If the others option were not used, only four of the 81 values would be covered by the set of options. Other possible values of signal s are, for example, "1X", "UX", "Z0", "U-". For synthesis, "11" is the only meaningful value, but for simulation there are 77 other values that signal s may have. The metalogical value '-' may also be used to assign a don't care value to signal x.
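As a sketch of the last remark, the others choice can assign don't care values instead of repeating input d. The entity and architecture names mux_dc and arch_mux_dc are assumptions; how the '-' value is exploited depends on the synthesis tool, and in simulation x simply takes the value "----" whenever s holds a metalogical value.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical variant of the mux entity with a don't care default.
entity mux_dc is
    port (a, b, c, d : in  std_logic_vector (3 downto 0);
          s          : in  std_logic_vector (1 downto 0);
          x          : out std_logic_vector (3 downto 0));
end mux_dc;

architecture arch_mux_dc of mux_dc is
begin
    with s select
        x <= a      when "00",
             b      when "01",
             c      when "10",
             d      when "11",
             "----" when others;  -- one '-' per bit of the 4-bit output
end arch_mux_dc;
```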
The 4:1 multiplexer can be described with an if statement as shown below.

Example
    architecture arch_mux of mux is
    begin
        mux4_1: process (a, b, c, d, s)
        begin
            if s = "00" then
                x <= a;
            elsif s = "01" then
                x <= b;
            elsif s = "10" then
                x <= c;
            else
                x <= d;
            end if;
        end process mux4_1;
    end arch_mux;

Since the conditions imply mutually exclusive values of signal s, by synthesizing this description the same circuit is generated as when a selected signal assignment statement is used. However, because the conditions contain a priority, the if statement is not advantageous when the conditions imply multiple signals that are mutually exclusive. Using an if statement in these cases may generate additional logic to ensure that the preceding conditions are not true. Instead of an if statement, it is more advantageous to use a Boolean equation or a case statement.

2.2. Decoders
A decoder is a combinational circuit that identifies an input code by asserting a single output line, corresponding to the input code. A decoder with n input lines has, in general, 2^n output lines and is denoted by DCD n:2^n. Below is described a 1-of-8 decoder (DCD 3:8) with active-high outputs. For the description a conditional signal assignment is used.

Example
    library ieee;
    use ieee.std_logic_1164.all;

    entity decoder_1_8 is
        port (a: in std_logic_vector (2 downto 0);
              y: out std_logic_vector (7 downto 0));
    end decoder_1_8;

    architecture decod of decoder_1_8 is
    begin
        y <= "00000001" when a = "000" else
             "00000010" when a = "001" else
             "00000100" when a = "010" else
             "00001000" when a = "011" else
             "00010000" when a = "100" else
             "00100000" when a = "101" else
             "01000000" when a = "110" else
             "10000000";
    end decod;

When the XST synthesis program is used, in order to infer a decoder from the HDL description all combinations of the inputs must be specified and all outputs must be used (for instance, values of 'X' for the output lines should not be specified).

2.3. Priority Encoders
An example of a priority encoder is shown below.
This priority encoder may be described concisely with a conditional signal assignment statement, as below.

Example
    library ieee;
    use ieee.std_logic_1164.all;

    entity priority_encoder is
        port (a, b, c, d: in std_logic;
              w, x, y, z: in std_logic;
              j: out std_logic);
    end priority_encoder;

    architecture priority of priority_encoder is
    begin
        j <= w when a = '1' else
             x when b = '1' else
             y when c = '1' else
             z when d = '1' else
             '0';
    end priority;
The when-else statement in the previous example indicates that signal j is assigned the value of signal w when a is '1', even if b, c, or d are '1'. Signal b holds priority over signals c and d, and signal c holds priority over signal d. If signals a, b, c, and d are mutually exclusive (that is, if it is known that only one will be asserted at a time), then the description below is more appropriate.

Example
    library ieee;
    use ieee.std_logic_1164.all;

    entity no_priority is
        port (a, b, c, d: in std_logic;
              w, x, y, z: in std_logic;
              j: out std_logic);
    end no_priority;

    architecture no_priority of no_priority is
    begin
        j <= (a and w) or (b and x) or (c and y) or (d and z);
    end no_priority;

The logic generated by synthesizing this description requires AND gates with only two inputs. Although using AND gates with more inputs in a CPLD device does not usually require additional resources, these gates could require additional logic cells and logic levels in an FPGA device. The descriptions of the previous two examples are not functionally equivalent, however. This equivalence only exists if signals a, b, c, and d are known to be mutually exclusive. In this case, the description of the previous example generates an equivalent logic with fewer resources.

2.4. Combinational Shifters
A combinational shifter performs a logical or arithmetic shift operation on the input data. The inputs of the shifter are the data to be shifted and the selector whose binary value specifies the shift distance. The output of the shifter is the result of the shift operation. When the XST synthesis program is used, the following restrictions apply in order to infer a combinational shifter from the HDL description:
- Only logical shift (sll, srl), arithmetic shift (sla, sra), rotate (rol, ror), and concatenation (&) operators can be used. Shift operations that fill vacated positions with values from another signal are not recognized.
- The value that specifies the shift distance in the shift operation must be positive and must be incremented or decremented only by 1 for each consecutive binary value of the selector.
The example below describes a combinational shifter for 8-bit vectors that can be shifted left by one, two, or three positions. A selected signal assignment is used to describe the shifter.

Example

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity shift_left is
    port (din: in unsigned (7 downto 0);
          sel: in unsigned (1 downto 0);
          dout: out unsigned (7 downto 0));
end shift_left;

architecture arch_shift of shift_left is
begin
    with sel select
        dout <= din when "00",
                din sll 1 when "01",
                din sll 2 when "10",
                din sll 3 when others;
end arch_shift;

9. Examples of sequential circuits

3.1. Synchronous and Asynchronous Sequential Circuits

Sequential circuits represent a category of logic circuits that include storage elements. These circuits contain feedback loops from the output to the input. The signals generated at the outputs of a sequential circuit depend on both the input signals and the state of the circuit. The present state of a sequential circuit depends on a previous state and on the values of the input signals.

With synchronous sequential circuits, the change of state is controlled by a clock signal. With asynchronous circuits, the change of state may be caused by a change of an input signal at an arbitrary time. The behavior of an asynchronous circuit is less predictable, since the state evolution is also influenced by the delays of the circuit's components. The transition between two stable states may pass through a succession of unstable, random states. Synchronous sequential circuits are more reliable and have a predictable behavior. All storage elements of a synchronous circuit change their state simultaneously, which eliminates intermediate unstable states. By sampling the input signals at well-defined times, the influence of delays and noise is reduced.

There are two techniques for designing sequential circuits: Mealy and Moore. For Mealy sequential circuits, the output signals depend on both the current state and the present inputs. For Moore sequential circuits, the outputs depend only on the current state, and they do not depend directly on the inputs. The Mealy method allows a circuit to be implemented with a minimal number of storage elements (flip-flops), but possible uncontrolled variations of the input signals may be transmitted to the output signals. A design using the Moore method requires more storage elements for the same behavior, but the circuit operation is more reliable.

3.2. Flip-Flops

The example below describes a synchronous D-type flip-flop triggered on the rising edge of the clock signal.
Example

library ieee;
use ieee.std_logic_1164.all;

entity dff is
    port (clk: in std_logic;
          d: in std_logic;
          q: out std_logic);
end dff;

architecture example of dff is
begin
    process (clk)
    begin
        if (clk'event and clk = '1') then
            q <= d;
        end if;
    end process;
end example;

The process used to describe the flip-flop is sensitive only to changes of the clk clock signal. A transition of the input signal d does not cause the execution of this process. The clk'event expression and the sensitivity list are redundant, because both detect changes of the clock signal. Some synthesis tools, however, will ignore the process sensitivity list, and thus the clk'event expression should be included to describe events triggered on the edge of the clock signal. To describe a level-sensitive latch, as in the example below, the clk'event condition is removed and the data input d is inserted in the process sensitivity list.
Example

library ieee;
use ieee.std_logic_1164.all;

-- Entity declaration added for completeness; its ports mirror those of dff
entity d_latch is
    port (clk: in std_logic;
          d: in std_logic;
          q: out std_logic);
end d_latch;

architecture example of d_latch is
begin
    process (clk, d)
    begin
        if (clk = '1') then
            q <= d;
        end if;
    end process;
end example;

In the previous example and in the following ones there is no else branch. Without this branch, an implied memory element is specified (one that will keep the value of signal q). In other words, the following fragment:

    if (clk'event and clk = '1') then q <= d; end if;

has the same meaning for simulation as the fragment:

    if (clk'event and clk = '1') then q <= d; else q <= q; end if;

This is consistent with the operation of a D-type flip-flop. Most synthesis tools do not allow an else branch after an if (clk'event and clk = '1') condition, because it may describe logic for which the implementation is ambiguous.

3.3. Registers

The following example describes an 8-bit register by a process similar to that of the D-type flip-flop example, the difference being that d and q are vectors. In addition, this register has a clock enable signal (ce).

Example
library ieee; use ieee.std_logic_1164.all; entity reg8 is port (clk: in std_logic; ce: in std_logic; d: in std_logic_vector (7 downto 0); q: out std_logic_vector (7 downto 0)); end reg8; architecture ex_reg of reg8 is begin process (clk) begin if (clk'event and clk = '1') then if (ce = '1') then q <= d; end if; end if; end process; end ex_reg; 3.4. Shift Registers A shift register is a sequential circuit that shifts left or right the contents of the register with one position in each clock cycle. Usually, the inputs of a shift register are represented by the clock signal, a serial input data, a synchronous or asynchronous set/reset signal, and a clock enable signal. In addition, a shift register may have data and control signals for synchronous or asynchronous parallel load. The output data of a shift register can be accessed either serially, when only the contents of the last flip-flop are accessible for the rest of the circuit, or in parallel, when the contents of several flip-flops are accessible. Xilinx FPGA devices contain dedicated resources (the SRL16 and SRL32 primitives) that allow an efficient implementation of shift registers without using additional flip-flops. However, these resources only support left shift operations, and have a limited number of input/output signals: clock, clock enable, serial data input, and serial data output. Synchronous and asynchronous set/reset signals are not available in the SRL primitives. Therefore, if any set, reset, or parallel load logic is used in the description, the XST synthesis tool may not be able to take advantage of the dedicated primitives for an efficient implementation. There are several possibilities to describe shift registers in the VHDL language:
For example, the concatenation operator may be used: reg <= reg (6 downto 0) & si;
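As a minimal sketch, the concatenation fragment above can be placed in a complete design unit. The entity name shift_reg8_concat and the architecture name concat are hypothetical, and the port set (clock, clock enable, serial input, serial output) is an assumption chosen to match the signals that the SRL primitives support.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: 8-bit shift-left register with clock enable
entity shift_reg8_concat is
    port (clk: in std_logic;
          ce: in std_logic;
          si: in std_logic;
          so: out std_logic);
end shift_reg8_concat;

architecture concat of shift_reg8_concat is
    signal reg: std_logic_vector (7 downto 0);
begin
    process (clk)
    begin
        if (clk'event and clk = '1') then
            if (ce = '1') then
                -- Drop the most significant bit and append the serial input
                reg <= reg (6 downto 0) & si;
            end if;
        end if;
    end process;

    -- Serial output: only the last flip-flop is visible outside
    so <= reg (7);
end concat;
```

No set, reset, or parallel load logic is described here, so this sketch stays within the signals that the text lists for the dedicated shift register resources.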
The following example describes an 8-bit shift-left register with clock enable, serial input, and serial output signals. A for loop construct is used to describe the shift register. Example library ieee; use ieee.std_logic_1164.all; entity shift_reg8 is port (clk: in std_logic; ce: in std_logic; si: in std_logic; so: out std_logic); end shift_reg8;
architecture shift_reg of shift_reg8 is signal tmp: std_logic_vector (7 downto 0); begin process (clk) begin if (clk'event and clk = '1') then if (ce = '1') then for i in 0 to 6 loop tmp(i+1) <= tmp(i); end loop; tmp(0) <= si; end if; end if; end process; so <= tmp(7); end shift_reg; 3.5. Counters We will describe a 3-bit counter. Example library ieee; use ieee.std_logic_1164.all; entity count3 is port (clk: in std_logic; count: out integer range 0 to 7); end count3; architecture count3_integer of count3 is signal tmp: integer range 0 to 7; begin cnt: process (clk) begin if (clk'event and clk = '1') then tmp <= tmp + 1; end if; end process cnt; count <= tmp; end count3_integer; In the previous example, the addition operator is used for the count signal, which is of type integer. Most of synthesis tools allow this use, converting the type integer to bit_vector or std_logic_vector. Nonetheless, using the type integer for ports poses some problems: 1) In order to use the value of count in another portion of a design for which the interface has ports of type std_logic, a type conversion must be performed. 2) The vectors applied during simulation of the source code cannot be used to simulate the model generated by synthesis. For the source code, the vectors should be integer values. The synthesized model will require vectors of type std_logic. Because the native VHDL + operator is not predefined for the types bit or std_logic, this operator must be overloaded before it may be used to add operands of these types. The IEEE 1076.3 standard defines functions to overload the + operator for the following operand pairs: (unsigned, unsigned), (unsigned, integer), (signed, signed), and (signed, integer). These functions are defined in the numeric_std package of the 1076.3 standard. The following example is the modified version of the previous one, in order to use the type unsigned for the counters output. 
Example library ieee; use ieee.std_logic_1164.all; use ieee.numeric_std.all; entity count3 is port (clk: in std_logic; count: out unsigned (2 downto 0)); end count3; architecture count3_unsigned of count3 is signal tmp: unsigned (2 downto 0); begin cnt: process (clk) begin if (clk'event and clk = '1') then tmp <= tmp + 1; end if; end process cnt; count <= tmp; end count3_unsigned; Usually, synthesis tools supply additional packages to overload operators for the type std_logic. Although not standard packages, these are often used by designers because they allow arithmetic and relational operations on the type std_logic, and from this point of view they are even more useful than the numeric_std package. These packages do not require to use two additional types (signed, unsigned) in addition to std_logic_vector, as well as the functions to convert between these types. When using one of these packages for arithmetic operations, a synthesis tool will use an unsigned or signed (twos complement) representation for the type std_logic_vector, and will generate the appropriate arithmetic components as well. The next example presents the modified description of the counter from the previous examples to use the std_logic_unsigned package and the type std_logic_vector for the counters output. Example library ieee; use ieee.std_logic_1164.all; use ieee.std_logic_unsigned.all; entity count3 is port (clk: in std_logic; count: out std_logic_vector (2 downto 0)); end count3; architecture count3_std_logic of count3 is signal tmp: std_logic_vector (2 downto 0); begin cnt: process (clk) begin if (clk'event and clk = '1') then tmp <= tmp + 1; end if; end process cnt; count <= tmp; end count3_std_logic; 3.6. Three-State Buffers and Bidirectional Signals Most programmable-logic devices have three-state outputs or bidirectional I/O signals. Additionally, some devices have internal three-state buffers. 
The values that a three-state signal may have are '0', '1', and 'Z', all of which are supported by the type std_logic. The example below presents the modified description for the counter of Example 23 to use three-state outputs. This counter does not have an asynchronous preset signal. Example library ieee; use ieee.std_logic_1164.all; use ieee.std_logic_unsigned.all; entity count8 is port (clk, rst: in std_logic; en, load: in std_logic; oe: in std_logic; data: in std_logic_vector (7 downto 0); count: out std_logic_vector (7 downto 0)); end count8; architecture arch_count8 of count8 is signal tmp: std_logic_vector (7 downto 0); begin cnt: process (rst, clk) begin if (rst = '1') then tmp <= (others => '0'); elsif rising_edge (clk) then if (load = '1') then tmp <= data; elsif (en = '1') then tmp <= tmp + 1; end if; end if; end process cnt; oep: process (oe, tmp) begin if (oe = '0') then count <= (others => 'Z'); else count <= tmp; end if; end process oep; end arch_count8; Compared to the description of Example 23, in this description an additional signal oe is used to control the three-state outputs. The process labeled oep describes the three-state outputs for the counter. If signal oe is not asserted, the outputs are placed in the highimpedance state. The oep process description is consistent with the behavior of a three-state buffer (below).
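The behavior of the oep process can also be sketched as a stand-alone three-state buffer written with a conditional signal assignment. This is only an illustration: the entity name tri_buf8 and the port names din and dout are hypothetical and do not appear in the counter description.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical stand-alone 8-bit three-state buffer
entity tri_buf8 is
    port (oe: in std_logic;
          din: in std_logic_vector (7 downto 0);
          dout: out std_logic_vector (7 downto 0));
end tri_buf8;

architecture arch_tri_buf8 of tri_buf8 is
begin
    -- Same decision as the oep process: high impedance when oe is '0',
    -- otherwise the input value is driven onto the output
    dout <= (others => 'Z') when oe = '0' else din;
end arch_tri_buf8;
```

The condition is written in the same order as in the oep process, so both forms drive the data for every value of oe other than '0'.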
The counter of the preceding examples may be modified to use bidirectional signals for its outputs. In this case, the counter may be loaded with the current value of its outputs, which means that the value loaded when the load signal is asserted will be the previous value of the counter or an external value, depending on the state of the oe signal. Below, the output enable of a three-state buffer is defined implicitly. Example mux: process (row_addr, col_addr, present_state) begin if (present_state = row or present_state = RAS) then dram <= row_addr; elsif (present_state = col or present_state = CAS) then dram <= col_addr; else dram <= (others => 'Z'); end if; end process mux; The three-state buffers of the dram signal are enabled if the value of the present_state signal is row, RAS, col, or CAS. For any other values of this signal, the output buffers are not enabled. In the preceding examples, behavioral descriptions were used for three-state buffers. To generate these buffers, structural descriptions may be used as well, such as the for generate construct. This construct will be described in the laboratory work dedicated to structural design. 10. Inertial delay and transport delay In the VHDL language there are two types of delays that can be used to model real systems. These are the inertial delay and the transport delay. These delays cannot be used for logic synthesis. The inertial delay is the default delay and it is used when the type of delay is not specified. The after clause assumes by default the inertial delay. In a model with inertial delay, two consecutive changes of an input signal value will not change an output signal value if the time between these changes is shorter than the specified delay. This delay represents the inertia of the real circuit. If, for example, certain pulses of short periods of the input signals occur, the output signals will remain unchanged. The figure below illustrates the inertial delay with a simple buffer. 
The buffer with a delay of 20 ns has an input A and an output B. Signal A changes from '0' to '1' at 10 ns and from '1' to '0' at 20 ns. The input signal pulse has a duration of 10 ns, which is shorter than the delay introduced by the buffer. As a result, the output signal B remains '0'. The buffer in Figure 1 can be modeled by the following assignment statement: b <= a after 20 ns;
The transport delay must be specified explicitly with the transport keyword. This represents the delay of an interconnection, in which the effect of a pulse in an input signal is propagated to the output with the specified delay, regardless of the duration of that pulse. The transport delay is especially useful for modeling transmission lines and interconnections between components. Considering the same buffer of Figure 1 and the input signal A with the same waveform, if the inertial delay is replaced with the transport delay, the output signal B will have the form shown in Figure 2. The pulse on the input signal is propagated unchanged to the output with a delay of 20 ns. If the transport delay is used, the buffer of Figure 2 may be modeled by the following assignment statement: b <= transport a after 20 ns;
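The difference between the two delay models can be checked with a small simulation-only sketch that applies the 10 ns pulse described above to both assignments. The entity name delay_demo and the signal names b_inertial and b_transport are hypothetical; the signals are given an explicit initial value of '0' as an assumption so that the outputs start from a known state.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity delay_demo is
end delay_demo;

architecture sim of delay_demo is
    signal a: std_logic := '0';
    signal b_inertial: std_logic := '0';
    signal b_transport: std_logic := '0';
begin
    -- Inertial delay (default): pulses shorter than 20 ns are rejected
    b_inertial <= a after 20 ns;
    -- Transport delay: every change is propagated after 20 ns
    b_transport <= transport a after 20 ns;

    stimulus: process
    begin
        wait for 10 ns;
        a <= '1';                -- rising edge at 10 ns
        wait for 10 ns;
        a <= '0';                -- falling edge at 20 ns
        wait for 15 ns;          -- simulation time is now 35 ns
        assert b_inertial = '0'
            report "inertial output should have rejected the pulse"
            severity error;
        assert b_transport = '1'
            report "transport output should be '1' between 30 ns and 40 ns"
            severity error;
        wait for 10 ns;          -- simulation time is now 45 ns
        assert b_inertial = '0' and b_transport = '0'
            report "both outputs should be '0' after 40 ns"
            severity error;
        wait;                    -- suspend forever
    end process stimulus;
end sim;
```

Because these delays cannot be used for logic synthesis, a sketch of this kind is meant for a simulator only.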
11. Event-driven simulation

4.1. Event-Driven Simulation

All VHDL simulators are event-driven simulators. An event is a change of a signal state. There are three basic concepts of event-driven simulation: simulation time, event processing, and delta delay.

During simulation, the simulator keeps track of the simulation time, which is the circuit time that has been modeled by the simulator, not the time needed for the simulation. This time is usually measured as an integral multiple of a basic unit of time known as the resolution limit. The simulator cannot measure time intervals less than the resolution limit. For gate-level or RTL simulation, the resolution limit may be, for example, 1 ps.

When a change of a signal value occurs, an event is placed in an event queue for the simulation time at which this change occurs. When the simulator processes that event, it reevaluates any statement whose input is the signal that determined the event (that is, the statements that are sensitive to that signal). This results in changes of other signals and therefore other events are generated. Consider the process below, which contains a single assignment statement.
Example

proc1: process (a, b, c)
begin
    x <= a and b and c;
end process proc1;

When a change in value of one of the signals a, b, or c occurs, the assignment statement is reevaluated, and a new value will result for signal x. Since an after clause is not used in this statement to specify a delay, an event is scheduled for signal x for the current simulation time; this event consists of changing the value of the signal. This could create potential problems when signal x has to be updated at the same time as one of the signals from which it is generated. To solve this problem, VHDL introduces the concept of delta delay. A delta delay may be considered as an infinitesimally small delay that implies a delta cycle (the delta cycle is explained in Section 4.3). Therefore, the semantics of the previous assignment statement is that the value of the right-hand side expression at the current simulation time, Tc, is scheduled for assignment to signal x one delta delay after the current simulation time.
As the events scheduled for the current simulation time are processed, new events will result. Some of them may be scheduled for the current simulation time; these events are processed again, new events result, and so on, until there are no other events scheduled for the current simulation time. Only then is the simulation time incremented.

12. Signal drivers

Consider again the process of Example 3. As previously explained, if one of the signals a, b, or c changes value, then the value of the logical AND of signals a, b, and c at the current simulation time, Tc, is scheduled for assignment to signal x one delta delay later. In other words, an event is scheduled for the signal driver of x. A signal driver is represented by a projected output waveform. Each time a signal assignment is performed, that signal's projected output waveform is updated. A projected output waveform is a set of transactions that specify new values for a signal and the times at which the signal will be updated. When simulating models written for synthesis, there are essentially two transactions that need to be maintained for any given signal: the current transaction that specifies the current value and time, and the next transaction, if it exists, that specifies the new value of the signal at the next delta delay. However, this does not mean that a maximum of two delta cycles will be required before the next simulation time. This will be illustrated when we present the simulation of a model with several concurrent statements.

Note: Delta delays, transactions, signal drivers, and projected output waveforms are presented for conceptual purposes only. Various VHDL simulators may implement these concepts differently than presented here. VHDL simulators need only comply with the operational specifications defined in the reference manual of the language (IEEE standard 1076), but the implementation is specific for each CAD system.

As an example, executing process proc1 results in a signal driver for x as shown below:
If signals b and c are logical '1' and signal a changes from '0' to '1' at 5 ns, as illustrated in Figure 4, process proc1 will execute when the simulation time is 5 ns. At this time, the current value of signal x is '0', and a transaction ('1', 5 ns + δ) is added to the signal driver for x, where δ is an infinitesimally small delay. When the current simulation time, Tc, passes 5 ns, the only transaction left on the driver for signal x is ('1', Tc).
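The delta delay on the driver of x can be observed in a simulator with a minimal sketch like the one below. The entity name delta_delay_demo and the check process are hypothetical; the sketch relies on the fact that wait for 0 ns suspends a process until the next delta cycle. Note that, unlike in the scenario above, signal a is driven from inside the sketch, so its own update also takes one delta delay.

```vhdl
entity delta_delay_demo is
end delta_delay_demo;

architecture sim of delta_delay_demo is
    signal a: bit := '0';
    signal b, c: bit := '1';
    signal x: bit;               -- default initial value '0'
begin
    proc1: process (a, b, c)
    begin
        x <= a and b and c;
    end process proc1;

    check: process
    begin
        wait for 5 ns;
        a <= '1';                -- a is updated one delta delay later
        wait for 0 ns;           -- first delta cycle at 5 ns: a is now '1'
        -- proc1 runs in this same delta cycle, but x is not yet updated
        assert a = '1' and x = '0'
            report "x should still hold its old value"
            severity error;
        wait for 0 ns;           -- next delta cycle: x takes the new value
        assert x = '1'
            report "x should be '1' one delta delay after a changed"
            severity error;
        assert now = 5 ns
            report "delta cycles should not advance simulation time"
            severity error;
        wait;
    end process check;
end sim;
```

All three checks happen at a simulation time of 5 ns, which illustrates that a delta delay orders the updates without advancing the time.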
Consider the process of Example 4, which contains two assignments to the same signal.

Example 4

proc2: process (a, b, c)
begin
    x <= '0';
    if (a = b or c = '1') then
        x <= '1';
    end if;
end process;

If signals a, b, and c have the forms shown in Figure 5, when the current simulation time is 5 ns the transition of signal b causes the process to execute. The first sequential signal assignment results in the transaction ('0', 5 ns + δ), where δ is one delta delay; because this value is the same as the current value of x, this transaction is not added to the driver. Next, the condition for the if statement is evaluated. Because the expression is false, no further statements are executed and the process suspends. When the current simulation time is between 5 and 10 ns, the driver for signal x has only one transaction, ('0', Tc). At 10 ns, a transition of signal b occurs, and the process is executed again. As previously, the first sequential statement results in no transaction being added to the driver (that is, the projected output waveform is not updated). When the condition for the if statement is evaluated, this time the expression is true. The signal assignment x <= '1' is executed, causing a new transaction, ('1', 10 ns + δ), to be added to the driver.
When the simulation time reaches 15 ns, a transition of signal c occurs, causing the process to execute. The first statement causes the transaction ('0', 15 ns + δ) to be scheduled, where δ is one delta delay. Because the condition for the if statement evaluates true, the next signal assignment overrides the current transaction, replacing it with ('1', 15 ns + δ). Because this value is the same as the current value, this transaction need not replace the last transaction and is deleted. When the simulation time reaches 20 ns, signals b and c change, and the process is executed once again. The first statement causes the transaction ('0', 20 ns + δ) to be added to the driver. The condition for the if statement evaluates false, and the process suspends.

A simple interpretation of signal assignments in a process, assuming that the process does not contain after clauses, is the following:
- expressions are evaluated using the current values of the signals on the right-hand side of the <= symbol;
- a signal will be updated with the value in the last assignment;
- signals are updated only after the process suspends.

As opposed to sequential signal assignment statements inside processes, a concurrent signal assignment statement has an implicit sensitivity list that includes all signals on the right-hand side of the <= symbol. Concurrent statements execute any time a transition of a signal in the implicit sensitivity list occurs. When there are multiple concurrent statements, they do not execute sequentially. In the next section we will present how multiple concurrent statements execute when the evaluating expression contains signals that are being updated by another statement.

13. Simulation cycle

When a VHDL model is simulated, first an initialization phase is executed, and then repeated simulation cycles. The initialization phase starts with the current simulation time set to 0 ns. In general, if an explicit initial value is not specified for a signal, then the signal will have the initial value '0' if it is of type bit, or 'U' if it is of type std_logic. Next, each process is executed until it suspends. Concurrent statements are considered processes in this context and they will also be executed.
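The default initial values mentioned above can be confirmed with a short simulation-only sketch. The entity name init_demo and the signal names are hypothetical; the process runs once during the initialization phase and then suspends.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity init_demo is
end init_demo;

architecture sim of init_demo is
    signal s_bit: bit;                   -- no explicit initial value
    signal s_std: std_logic;             -- no explicit initial value
    signal s_init: std_logic := '0';     -- explicit initial value
begin
    process
    begin
        -- Executed at 0 ns, before any signal assignment has taken effect
        assert s_bit = '0'
            report "a bit signal should start at '0'"
            severity error;
        assert s_std = 'U'
            report "a std_logic signal should start at 'U'"
            severity error;
        assert s_init = '0'
            report "an explicit initial value should be used"
            severity error;
        wait;
    end process;
end sim;
```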
After the initialization phase, simulation cycles are run, each cycle consisting of the following steps:
1) Signals are updated, which causes events to occur for these signals.
2) Each process that is sensitive to a signal on which an event has occurred in the current simulation cycle is executed.
3) The simulation time for the next cycle, Tn, is determined as either the next time at which a signal changes value, based on its projected output waveform, or the time at which a process resumes (for models written for simulation, not synthesis), whichever is earlier.

If the simulation time for the next cycle is a delta delay or multiple delta delays from the current simulation time, then the current simulation time, Tc, remains the same and a delta cycle consisting of the same steps as above is executed. (Hence, a delta delay is actually a zero delay, and it is only conceptually convenient to consider it as being a very small delay.) Otherwise, the current simulation time is set to the next simulation cycle time (Tc = Tn).

Consider the description below to illustrate the simulation cycle. Implementation of the simulation cycle is simulator-specific, but the simulation results (for example, the signal waveforms) will be equivalent.

Example

entity delta is                         -- 1
    port (a, b, c, d: in bit;           -- 2
          u, v, w, x, y, z: buffer bit); -- 3
end delta;                              -- 4
architecture delta of delta is          -- 5
begin                                   -- 6
    z <= not y;                         -- 7
    y <= w or x;                        -- 8
    x <= u or v;                        -- 9
    w <= u and v;                       -- 10
    v <= c or d;                        -- 11
    u <= a and b;                       -- 12
end delta;                              -- 13

In the initialization phase, all signals are set to '0'. Then, each concurrent statement is executed. The order in which concurrent statements are written and executed is not important, so we will illustrate this by executing the statements from the last to the first. The signal drivers are updated according to the projected output waveform below, one delta cycle being required to update the signals before the current simulation time can advance.
Suppose that the input signals transition as shown in Figure 7. When the current simulation time reaches 100 ns, a simulation cycle begins and the signals are updated. Signal a transitions from '0' to '1', and this causes the assignment statement for signal u (line 12) to be executed. The value of signal u does not change, so the simulation cycle is complete and the simulation time may advance. When the current simulation time reaches 200 ns, a new simulation cycle begins. Signal b transitions from '0' to '1' and the assignment statement in line 12 is executed. A new transaction, ('1', 200 ns + δ), where δ denotes one delta delay, is added to the driver for signal u, so a delta cycle is required and the simulation time does not advance. During the delta cycle, signal u is updated with its new value. This causes the statements in lines 10 and 9 to execute (in either order, but during the same delta cycle). A new transaction is not added for signal w, because its value remains '0'. However, a new transaction is added to the driver for signal x, ('1', 200 ns + 2δ), so a second delta cycle is required. During this cycle, signal x is updated with its new value, which causes the statement in line 8 to execute. A new transaction is added to the driver for signal y, and a third delta cycle is required. During the third delta cycle, signal y is updated, which causes the statement in line 7 to execute and a new transaction to be added to signal z. A fourth delta cycle is required to update signal z, after which the current simulation time may advance.
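The succession of delta cycles at 200 ns can be traced in a simulator with a sketch such as the following. It is an assumption-laden illustration: the entity name delta_cycles_demo and the stimulus process are hypothetical, and the six concurrent statements are copied into a test architecture with internal signals instead of instantiating the entity delta. Because signal b is driven by the stimulus process, its own update consumes one extra delta cycle compared with the description above.

```vhdl
entity delta_cycles_demo is
end delta_cycles_demo;

architecture sim of delta_cycles_demo is
    signal a, b, c, d: bit;
    signal u, v, w, x, y, z: bit;
begin
    z <= not y;
    y <= w or x;
    x <= u or v;
    w <= u and v;
    v <= c or d;
    u <= a and b;

    stimulus: process
    begin
        wait for 100 ns;
        a <= '1';                -- u stays '0', no further activity
        wait for 100 ns;         -- simulation time is now 200 ns
        b <= '1';
        wait for 0 ns;           -- b is updated
        wait for 0 ns;           -- u is updated
        assert u = '1' and x = '0'
            report "u should change before x" severity error;
        wait for 0 ns;           -- x is updated, w keeps the value '0'
        assert x = '1' and w = '0' and y = '0'
            report "x should change before y" severity error;
        wait for 0 ns;           -- y is updated
        assert y = '1' and z = '1'
            report "y should change before z" severity error;
        wait for 0 ns;           -- z is updated
        assert z = '0'
            report "z should be '0' after the last delta cycle"
            severity error;
        assert now = 200 ns
            report "simulation time should not have advanced"
            severity error;
        wait;
    end process stimulus;
end sim;
```

Signal z is '1' before the transition of b because the initialization phase executes z <= not y while y is '0'.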
14. Structural design: elements of a structural design A structural description consists of components interconnected by signals. A component may be defined in an architecture by a component declaration, or it may be represented by a separate system specified as an entity and an architecture. In order to use a component declared earlier, it must be instantiated within the structural description. Component instantiations represent the basic statements in a structural architecture. These instantiations are concurrent with each other. In a component instantiation the port mapping is specified, which indicates the signals connected to the components ports. These signals may be specified as ports or internal signals of the system. In the latter case, they must be declared in the declarative part of the architecture. 2.1. Example of Structural Description The elements of a structural description will be illustrated first with a complete example. The components of the structural description will be examined then separately in the next sections. The example consists of two D- type flip-flops connected in series as a pipeline. The circuit structure is illustrated below.
We assume that the D-type flip-flop is already defined in a library and has the entity and architecture definition presented below. Example library ieee; use ieee.std_logic_1164.all; entity dff is port (d, clk: in std_logic; q, qn: out std_logic); end dff; architecture arch_dff of dff is signal tmp: std_logic; begin process (clk) begin if rising_edge (clk) then tmp <= d; end if; end process; q <= tmp; qn <= not tmp; end arch_dff; There are several ways to describe this circuit using components. A possible description is presented below. Example library ieee; use ieee.std_logic_1164.all; entity delay2 is port (din, clock: in std_logic; qout: out std_logic); end delay2; architecture structural of delay2 is signal intern: std_logic; -- Component declaration component dff is port (d, clk: in std_logic;
q, qn: out std_logic); end component dff; -- Configuration specification for all: dff use entity work.dff (arch_dff); begin -- Component instantiation d1: dff port map (d => din, clk => clock, q => intern, qn => open); d2: dff port map (d => intern, clk => clock, q => qout, qn => open); end structural; The architecture contains three parts related to the use of components. These have been labeled with comments and are the following: component declaration, configuration specification, and component instantiation. The three parts are described in the next sections. 2.2. Component Declaration A component declaration defines the interface with a design entity which describes that component. The component declared in this way may be used later in component instantiation statements. However, the component declaration does not specify the entityarchitecture pair that describes the component or the ports of the component; this information is contained either in the configuration specification or in the configuration declaration. The simplified syntax for a component declaration is the following: component component_name [is] generic (generic_list); port (port_list); end component [component_name]; The syntax for a component declaration is similar to the entity declaration. The generic clause specifies the generics of the component, and the port clause specifies its ports. In practice, the name of the component, the name of its generics and ports, as well as their order, are identical to the elements that appear in the entity declaration corresponding to the component. A component may be declared in an architecture, a block, an entity, or in a package. If the component is declared in an architecture, it must be declared in the declarative part of the architecture, before the begin keyword. In such a case, the component may be used (instantiated) in the architecture only. 
If the component is declared in a package, it will be visible in any architecture that uses this package. The component dff above has been declared as:

component dff is
    port (d, clk: in std_logic;
          q, qn: out std_logic);
end component dff;

2.3. Component Instantiation

A component instantiation associates signals or values with the ports of a component and associates values with the generics of that component. The simplified syntax for a component instantiation statement is the following:

label: [component] component_name
    [generic map (generic_association_list)]
    port map (port_association_list);

Component instantiation introduces a relationship to a unit declared earlier as a component. The name of the instantiated component must match the name of the declared component. For the instantiated component the generics and ports are specified, which represent the actual parameters of the declared component. The association list can be either named or positional. Named association allows the generics and ports to be listed in an order that is different from the one specified in the component declaration. In this case, each generic or port is explicitly associated with a value or signal. The generic or port name is followed by the => symbol, and then by the value assigned to the generic or the signal to which the port is to be connected. Ports of a component may be left unconnected by using the keyword open. In the example above, named association has been used for ports. The component instantiations from this example are reproduced below:

d1: dff port map (d => din, clk => clock, q => intern, qn => open);
d2: dff port map (d => intern, clk => clock, q => qout, qn => open);

In a positional association list, the actual parameters (generics and ports) are specified in the same order in which they appear in the component declaration. In this case, the generic or port names and the => symbol are omitted.
The component instantiations may be rewritten using positional association as follows:

d1: dff port map (din, clock, intern, open);
d2: dff port map (intern, clock, qout, open);

In the example, there are two instantiations of the dff component, which are labeled d1 and d2. These labels are mandatory and must be unique. Each instantiation creates a subcircuit containing the dff component and the interconnections with this component.

Notes
- A component instantiation statement represents the instantiation of the component declaration and not of the entity declaration. The relationship between the component declaration and the entity that describes the component is controlled by the configuration specification.

2.4. Direct Entity Instantiation

It is not always necessary to define a component to instantiate it, because the VHDL 93 version of the language allows direct instantiation of an entity. This instantiation represents the simplest form to specify a structural system. The syntax of the direct entity instantiation is the following:

label: entity library_name.entity_name [(architecture_name)]
    [generic map (generic_association_list)]
    port map (port_association_list);

The entity instantiation statement specifies the design entity and, optionally, the name of the architecture to be used for this entity. The entity may later be used as a component. The entity is specified with the name of the library into which the entity is compiled and with the entity name. All entities specified by the user are compiled by default into the library work, so that usually this library is specified in the entity instantiation statement. The architecture name must be specified only when there is more than one architecture defined for a single entity. If the architecture name is not specified and there is more than one architecture for the directly instantiated entity, the last compiled architecture associated with the entity will be used.

Assuming that the entity and architecture for the D-type flip-flop of Example 1 are compiled into the library work, the circuit in Figure 1 may be described without declaring a component, by using direct entity instantiations, as shown below.
Example 3

    library ieee;
    use ieee.std_logic_1164.all;

    entity delay2 is
        port (din, clock: in std_logic;
              qout: out std_logic);
    end delay2;

    architecture structural of delay2 is
        signal intern: std_logic;
    begin
        d1: entity work.dff (arch_dff) port map (din, clock, intern, open);
        d2: entity work.dff (arch_dff) port map (intern, clock, qout, open);
    end structural;

2.5. Configuration Specification and Declaration

When direct entity instantiations are not used, component declarations and their instantiations are not enough for a complete specification of a structural architecture, because the description of component implementation is not specified. In this case a configuration specification may be used. A configuration is a construct that defines how component instances are associated with design entities and their architectures. The reason for separating the entity and its components is to allow the association (called binding) between entity and component to be made as late as possible in the simulation process. This association is carried out only at the start of simulation, in the elaboration phase. This way, the source modules of a hierarchical design may be compiled in any order.

The syntax of a configuration specification is the following:

    for instance_label: component_name use entity library_name.entity_name [(architecture_name)]
        [generic map (generic_association_list)]
        [port map (port_association_list)];

Several configuration specifications for components may be included in a configuration declaration, which may represent a separate design unit, and therefore may be placed in a separate file. The syntax for a configuration declaration is the following:

    configuration configuration_name of entity_name is
        for architecture_name
            -- configuration specifications
        end for;
        -- other for clauses
    end [configuration configuration_name];

The syntax of a configuration specification is similar to that of a direct entity instantiation.
However, a configuration specification represents a more flexible method when a different implementation must be used for the same component. If some changes have to be made, they will be introduced only in the configuration file, while the structural architecture will remain unchanged. Using direct entity instantiation would require all changes to be introduced in the architecture.

A configuration specification has three parts. The first part specifies the components to which the configuration applies. Each component is indicated by the label of the statement in which that component is instantiated. It is possible to use the keyword all to select all components with the specified name. This keyword was used in an earlier example; the configuration specification from that example is reproduced below:

    for all: dff use entity work.dff (arch_dff);

Rather than specifying the configuration for all the components with the name dff, it would have been possible to have separate configuration specifications for each instantiated component:

    for d1: dff ...
    for d2: dff ...

The second part of a configuration specification selects the entity to be used for a component or for all components with the specified name, as well as the library in which the corresponding entity resides. This part may also specify the architecture to be used for the selected entity, when there is more than one architecture.

The third part of the specification is optional. This part may explicitly specify how the generics and ports of an instantiated component are associated with the generics and ports of the entity (the port bindings). The generic map and port map clauses may be used for this purpose and the association may be positional or named. Explicit association is only needed if the names of generics and ports in a component declaration are different from the names of generics and ports in the entity declaration used for that component.
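As an illustration of the optional third part, the following sketch binds the dff component to an entity whose port names differ from those of the component, so that an explicit port map is needed. The entity dff_alt, its architecture arch_dff_alt, its port names, and the configuration name cfg_delay2 are hypothetical and are introduced only for this sketch. The binding is written as a configuration declaration in a separate design unit; note that in this form each component binding is closed by its own end for.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical flip-flop entity; its port names differ from those of
-- the dff component declared in delay2.
entity dff_alt is
  port (data_in, clock_in: in std_logic;
        data_out, data_out_n: out std_logic);
end dff_alt;

architecture arch_dff_alt of dff_alt is
begin
  process (clock_in)
  begin
    if rising_edge(clock_in) then
      data_out   <= data_in;
      data_out_n <= not data_in;
    end if;
  end process;
end arch_dff_alt;

library ieee;
use ieee.std_logic_1164.all;

entity delay2 is
  port (din, clock: in std_logic;
        qout: out std_logic);
end delay2;

-- The architecture contains no configuration specification;
-- the binding is supplied by the configuration declaration below.
architecture structural of delay2 is
  signal intern: std_logic;
  component dff is
    port (d, clk: in std_logic;
          q, qn: out std_logic);
  end component dff;
begin
  d1: dff port map (din, clock, intern, open);
  d2: dff port map (intern, clock, qout, open);
end structural;

-- Configuration declaration (hypothetical name cfg_delay2).
-- In the port map, the formal on the left is a port of the entity and
-- the actual on the right is a port of the component.
configuration cfg_delay2 of delay2 is
  for structural
    for all: dff use entity work.dff_alt (arch_dff_alt)
      port map (data_in => d, clock_in => clk,
                data_out => q, data_out_n => qn);
    end for;
  end for;
end configuration cfg_delay2;
```

With this arrangement, selecting a different implementation for dff only requires editing or replacing cfg_delay2; the structural architecture stays unchanged.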
In practice, however, it is recommended to match these names.

If the configuration specification is missing completely for a component, a default association (default binding) will be performed. This means that an entity with the same name from the current library will be selected for the component, the most recently compiled architecture will be used, and the generics and ports are associated to the generics and ports with the same names within the entity. Most of the time, the default association is also the desired one, so that in these cases it is not necessary to specify a configuration.

However, there is one case when a configuration specification is necessary: when the component is to be associated with an entity in a different library. One way to achieve this association is to use the library and use clauses to make all the entities in that library visible. For example, if the entity dff were compiled into the library named basic, then the library and use clauses could be added to the architecture, as illustrated below. In this case, the configuration specification is not necessary.

Example

    library ieee;
    use ieee.std_logic_1164.all;
    library basic;
    use basic.all;

    entity delay2 is
        port (din, clock: in std_logic;
              qout: out std_logic);
    end delay2;

    architecture structural of delay2 is
        signal intern: std_logic;
        component dff is
            port (d, clk: in std_logic;
                  q, qn: out std_logic);
        end component dff;
    begin
        d1: dff port map (din, clock, intern, open);
        d2: dff port map (intern, clock, qout, open);
    end structural;

The problem with this method is that all the entities in that library become visible, regardless of whether they are going to be used or not. For this reason, conflicts may result between the names of entities in the library and other names in the design units in which the library is visible. A better solution is to specify a configuration to associate the component with the entity, and use the default names for generics and ports.
This solution is illustrated below.

Example

    library ieee;
    use ieee.std_logic_1164.all;
    library basic;

    entity delay2 is
        port (din, clock: in std_logic;
              qout: out std_logic);
    end delay2;

    architecture structural of delay2 is
        signal intern: std_logic;
        component dff is
            port (d, clk: in std_logic;
                  q, qn: out std_logic);
        end component dff;
        for all: dff use entity basic.dff;
    begin
        d1: dff port map (din, clock, intern, open);
        d2: dff port map (intern, clock, qout, open);
    end structural;

Notes
required to ensure that entity and component names, generics and ports match.
configuration must be declared in the same library.

15. The ChipScope logic analyzer: ICON core

All of the ChipScope cores use the JTAG Boundary Scan port to communicate with the host computer via a JTAG downloading cable (either a parallel or a USB cable). The Integrated CONtroller (ICON) core provides a communications path between the JTAG Boundary Scan port of the FPGA device and the other ChipScope cores (ILA, VIO). The ICON core can communicate with up to 15 ILA and/or VIO cores at any given time. However, individual cores cannot share their control ports with any other core. Therefore, the ICON core needs a distinct control port for every ILA and VIO core.
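The one-control-port-per-core rule can be sketched as a structural VHDL architecture. This is only an illustration under stated assumptions: the component names (icon, ila, vio), their port names, the 8-bit widths, and the 36-bit width of the control buses are assumed here and are not taken from this text; in a real project the component declarations come from the instantiation templates generated together with the cores.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical design that attaches one ILA core and one VIO core
-- to the same ICON core.
entity debug_top is
  port (clk: in std_logic;
        din: in std_logic_vector(7 downto 0));
end debug_top;

architecture structural of debug_top is
  -- Assumed declaration of an ICON core generated with two control ports.
  component icon is
    port (control0: inout std_logic_vector(35 downto 0);
          control1: inout std_logic_vector(35 downto 0));
  end component icon;

  -- Assumed declaration of an ILA core with a single 8-bit trigger port.
  component ila is
    port (control: inout std_logic_vector(35 downto 0);
          clk: in std_logic;
          trig0: in std_logic_vector(7 downto 0));
  end component ila;

  -- Assumed declaration of a VIO core with 8 asynchronous inputs.
  component vio is
    port (control: inout std_logic_vector(35 downto 0);
          async_in: in std_logic_vector(7 downto 0));
  end component vio;

  -- One dedicated control bus per attached core; the buses are never shared.
  signal control0, control1: std_logic_vector(35 downto 0);
begin
  i_icon: icon port map (control0 => control0, control1 => control1);
  i_ila: ila port map (control => control0, clk => clk, trig0 => din);
  i_vio: vio port map (control => control1, async_in => din);
end structural;
```

Adding a third core would require regenerating the ICON core with an additional control port and declaring one more control signal.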
The Boundary Scan primitive component is used to communicate with the JTAG Boundary Scan logic of the FPGA device. This component extends the JTAG Test Access Port (TAP) interface of the FPGA device so that up to four internal scan chains can be created, depending on the device family. The ChipScope Analyzer tool communicates with the ChipScope cores by using one of the internal scan chains provided by the Boundary Scan component. For instance, the Boundary Scan component of Spartan-3 and Spartan-3E devices provides two internal scan chains, USER1 and USER2.

Since the ChipScope cores use a single internal scan chain of the Boundary Scan component, it is possible to share the Boundary Scan component with other elements of the user's design. One of the following two methods can be used for this sharing:
including the unused Boundary Scan chain signals as ports on the ICON core interface. The Boundary Scan component is instantiated inside the ICON core by default.
attaching the USER1 or USER2 scan chain signals to the corresponding ports of the ICON core interface.

When generating the ICON core, it is possible to enable the Include Boundary Scan Ports option to provide access to the unused scan chain interfaces. However, the Boundary Scan ports should be included only if the design needs them. If the ports are included and not used, the synthesis tools may not connect the ICON core properly, causing errors during the synthesis and implementation stages of the design. The communication between the ICON, ILA, and VIO cores is illustrated below.
16. The ChipScope logic analyzer: ILA core

The Integrated Logic Analyzer (ILA) core is a customizable logic analyzer core that can be used to monitor any internal signal of the design. Since the ILA core is synchronous to the design being monitored, all the clock signal constraints that are applied to that design are also applied to the components inside the ILA core. The ILA core consists of the following main components: trigger input and output logic, data capture logic, control and status logic.

2.2.1. Trigger Input Logic

The trigger input logic allows detecting complex trigger events. Each ILA core can have up to 16 trigger ports, and each port can be 1 to 256 bits wide. The ability to provide multiple trigger ports is necessary in complex systems where different types of signals or buses need to be monitored.

To detect events on a trigger port, up to 16 comparators can be connected to that port. An individual comparator is called a match unit. This feature enables multiple comparisons to be performed on the trigger port signals. The results of one or more match units are combined to form the overall trigger condition event that is used to control data capture. Selecting a single match unit conserves resources while still allowing some flexibility in detecting trigger events. Selecting two or more match units allows a more flexible trigger condition equation. However, increasing the number of match units connected to a trigger port increases the usage of logic resources accordingly. The match units connected to the trigger ports can be one of the following types:
high-to-low and low-to-high transitions.
comparisons.
detects high-to-low and low-to-high transitions.
and not in range comparisons.
Range comparator w/edges: Similar to the range comparator, but also detects high-to-low and low-to-high transitions.

All the match units of a trigger port can be configured with an event counter, with a selectable size of 1 to 32 bits. This counter can be configured at run time to count events in the following ways:
or non-consecutive events occur.
asserted once n consecutive or non-consecutive events occur.
consecutive events occur, and remains asserted until the match function is not satisfied.

The internal trigger condition of the ILA core can be accessed using the optional trigger output port. The signal on this port can be used as a trigger signal for external test equipment by attaching the signal to an output pin. This signal can also be used by internal logic as an interrupt, a trigger, or to cascade multiple ILA cores together. The shape (level or pulse) and sense (active-High or active-Low) of the trigger output can be controlled at run time.

In order to monitor different kinds of signals and buses in a design, multiple trigger ports can be used. For example, if the design includes an internal system bus that consists of control, address, and data signals, then a separate trigger port can be assigned to monitor each signal group, as shown below. If all of these different signals and buses were connected to a single trigger port, it would not be possible to monitor individual bit transitions on the CE, WE, and OE signals while looking for the Address bus to be in a specified range.
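The assignment of one trigger port per signal group can also be sketched in VHDL. The sketch below is an assumption-laden illustration, not generated code: the component name ila_bus, its port names, the 36-bit control bus, the mapping of groups to TRIG0, TRIG1, and TRIG2, and the 24-bit address and 32-bit data widths are all chosen for this example only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical wrapper that monitors a system bus with one ILA core.
entity bus_monitor is
  port (clk: in std_logic;
        ce, we, oe: in std_logic;
        address: in std_logic_vector(23 downto 0);
        data: in std_logic_vector(31 downto 0);
        control: inout std_logic_vector(35 downto 0));
end bus_monitor;

architecture structural of bus_monitor is
  -- Assumed declaration of an ILA core generated with three trigger ports.
  component ila_bus is
    port (control: inout std_logic_vector(35 downto 0);
          clk: in std_logic;
          trig0: in std_logic_vector(2 downto 0);
          trig1: in std_logic_vector(23 downto 0);
          trig2: in std_logic_vector(31 downto 0));
  end component ila_bus;

  -- The three control signals are grouped so that edge-detecting match
  -- units can watch them independently of the address and data buses.
  signal ctrl_group: std_logic_vector(2 downto 0);
begin
  ctrl_group <= ce & we & oe;

  i_ila: ila_bus port map (control => control,
                           clk => clk,
                           trig0 => ctrl_group,  -- control signals
                           trig1 => address,     -- address bus
                           trig2 => data);       -- data bus
end structural;
```

Keeping the groups on separate ports lets each port use the match unit type that suits it, for example an edge-detecting comparator on the control group and a range comparator on the address bus.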
A trigger condition is a Boolean or sequential combination of events that is detected by match unit comparators attached to the trigger ports of the ILA core. The trigger condition is used to specify the initial point in the data capture window and can be located at the beginning, the end, or anywhere within the data capture window.

A storage qualification condition is also a Boolean combination of events that is detected by the match unit comparators. However, the storage qualification condition differs from the trigger condition in that it evaluates trigger port match unit events to decide whether or not to capture and store each individual data sample. The trigger and storage qualification conditions can be used together to define when to start the capture process and what data to capture.

In the ILA core example shown above, suppose that the following operations are required:
to Address = FF0000h.
from Address = 23AACCh and Data values between 00000000h and 1000FFFFh.

To implement these conditions, the TRIG0 and TRIG1 trigger ports should each have two match units attached to them, one for the trigger condition and one for the storage qualification condition. Table 1 summarizes the setup of the trigger condition and storage qualification equations and of each individual match unit, in order to satisfy the conditions stated initially ('R' means rising edge).
2.2.2. Trigger Output Logic The ILA core implements a trigger output port called TRIG_OUT. The signal on this port is the output of the trigger condition that is set up at run-time using the analyzer. The shape (level or pulse) and sense (active-High or active-Low) of the trigger output can also be controlled at run-time. The TRIG_OUT port can be connected to a device pin in order to trigger external test equipment such as oscilloscopes and logic analyzers. The TRIG_OUT port of one core can also be connected to a trigger input port of another core in order to expand the trigger and data capture capabilities of the design.
2.2.3. Data Capture Logic

Each ILA core can capture data using on-chip Block RAM resources independently from all other cores in the design. Each ILA core can capture data using one of two capture modes: Window and N samples.

In Window capture mode, the sample buffer can be divided into one or more equal-sized sample windows. This mode uses a Boolean combination of the individual trigger match unit events in order to collect enough data to fill a sample window.

The N samples capture mode is useful for capturing the exact number of samples needed per trigger without wasting capture storage resources. This mode is similar to the Window capture mode except for two major differences:
buffer size minus 1;
2.2.4. Control and Status Logic

The ILA core contains control and status logic that is used to manage the operation of the core. All logic necessary to properly communicate with the ILA core is implemented by this control and status logic.

17. The ChipScope logic analyzer: VIO core

The Virtual Input/Output (VIO) core is a customizable core that can monitor and drive internal FPGA signals in real time. Unlike the ILA core, no storage resources are required. There are four types of signals available in a VIO core:
provided by the JTAG cable. The input values are read back periodically and displayed in the graphical interface of the analyzer.
signal. The input values are read back periodically and displayed in the graphical interface of the analyzer.
graphical interface of the analyzer and driven out of the core to the surrounding logic. A logical 1 or 0 value can be defined for individual asynchronous outputs.
interface of the analyzer, synchronized to the design clock signal, and driven out of the core to the surrounding logic. A logical 1 or 0 can be defined for individual synchronous outputs.

Every VIO core input has additional cells to capture the presence of transitions on the input. Since the design clock will most likely be much faster than the sample period of the analyzer, it is possible for the signal being monitored to transition many times between successive samples. The activity detectors capture this behavior and the results are displayed along with the signal value in the graphical interface of the analyzer. In the case of a synchronous input, activity cells capable of monitoring for asynchronous and synchronous events are used. This feature can be used to detect glitches as well as synchronous transitions on the synchronous input signal.

Every VIO core synchronous output has the ability to output a static 1, a static 0, or a pulse train of successive values.
A pulse train is a 16-clock-cycle sequence of 1 and 0 bits that are driven out of the core on successive design clock cycles. The pulse train sequence is defined in the analyzer and is executed only one time after it is loaded into the core.

18. The Xilinx Embedded Development Kit: overview

The Xilinx Embedded Development Kit (EDK) is a collection of tools and IP (Intellectual Property) cores that allows designing embedded processor systems for implementation in a Xilinx FPGA device. This kit enables the design of both the hardware and software components of an embedded system within a single design environment. The EDK uses the Xilinx synthesis and implementation tools to generate the hardware components of the embedded system (one or two microprocessors and various peripherals), and GNU software tools to generate the software components of the embedded system (the machine code executed by each microprocessor).

There are two types of processors that can be used in an embedded system designed with the EDK package. The first is the Xilinx MicroBlaze soft processor core, which is synthesized using the available resources in the target FPGA device. The second is the IBM PowerPC hardware processor core, which is integrated into certain versions of Virtex FPGA devices and therefore does not require additional resources of the device.

One of the main components of the EDK package is the Xilinx Platform Studio (XPS). This is a graphical Integrated Development Environment (IDE) that includes all the tools required to create the hardware and software components of the embedded system. The main tools included in the XPS development environment are the following:

The Base System Builder (BSB) wizard makes it easy to create a working system targeting a supported board. Later on, the user can modify the system by changing parameters of the existing components or extend the system by adding other components.
Standard input and output devices can be specified in the BSB wizard, and software applications can be created for memory test and for peripheral self-test.

The Create and Import Peripheral (CIP) wizard enables users to create their own peripherals and import them into the XPS project.

The Hardware Platform Generator tool (PlatGen) generates the netlist for the hardware platform of the embedded system.

The Library Generator tool (LibGen) generates and configures the software libraries, device drivers, file systems, and interrupt handlers for the embedded system.

The Simulation Model Generator tool (SimGen) generates simulation models of the embedded hardware system based on either the original hardware design (behavioral) or on the finished FPGA implementation.

The Debug Configuration wizard can be used to insert ChipScope cores into the system in order to perform hardware debugging of the system.

The Bitstream Initializer (BitInit) tool updates the configuration bitstream of the FPGA device with the executable code of the software application. This tool calls the Data2MEM utility provided in the Xilinx ISE design environment to initialize the Block RAM (BRAM) memory of the FPGA device with the executable code.

The Xilinx Software Development Kit (SDK) is an integrated development environment, complementary to the XPS environment, which can be used to develop and debug C/C++ software applications for the embedded system. The Xilinx SDK environment is based on the Eclipse tool suite.

The EDK package also includes other components, such as: IP cores for a large number of peripherals; device drivers and libraries required to develop software applications; GNU compiler, linker, and debugger for developing C/C++ software applications targeting the MicroBlaze and PowerPC processors; sample projects.

19.
The Xilinx Embedded Development Kit: generating the hardware platform The hardware platform of an embedded system created with the EDK package includes one or two processors, along with various peripherals and memory blocks. Each processor and peripheral core can be customized. Implementation parameters control optional features of the cores and define the addresses assigned to each peripheral. These addresses are mapped in the memory address space. Communication between a processor and on-chip BRAM memories is performed via a Local Memory Bus (LMB). This bus provides single-cycle access to dual-port BRAM memories and is split into instruction LMB and data LMB. Peripherals can either connect directly to the Processor Local Bus (PLB) or to the On-Chip Peripheral Bus (OPB). The PLB and OPB buses are connected via a bus bridge. The hardware platform description is maintained by the XPS environment in a file known as the Microprocessor Hardware Specification (MHS) file, with the .mhs extension. This is an editable text file and represents the main source file that describes the hardware part of the embedded system. The MHS file contains the instantiations of the processor cores and peripheral cores along with their parameters. The file defines the configuration of the embedded system and includes information on the buses, processors, peripherals, interconnections, interrupt request priorities, and address space. For each peripheral, there is a Microprocessor Peripheral Definition (MPD) file, with the .mpd extension, which defines the configurable parameters along with their default values, as well as the available ports of the peripheral. There are two main steps for generating the hardware platform: netlist generation (synthesis) and bitstream generation (implementation). These steps are described next. 3.1. Netlist Generation For generating the hardware platform, XPS first invokes the PlatGen tool to generate a system netlist. 
The PlatGen tool performs the following:
-Reads the hardware platform description from the MHS file.
-Generates a representation in a hardware description language (e.g., VHDL) of the MHS file into a system.vhd file along with a system_stub.vhd file. The system.vhd file is used when the embedded system is developed entirely in the XPS environment. The system_stub.vhd file is an instantiation template file of the embedded system and is used when this system is a sub-module in a larger design.
-Generates a Block RAM Memory Map (BMM) file, with the .bmm extension. This is a text file that describes how individual BRAM modules constitute a contiguous logical data space.
-Extracts the peripheral netlists from the EDK install directory.
-Calls the Xilinx Synthesis Technology (XST) tool to synthesize the design.
-Generates a system netlist file and the peripheral netlist files (NGC files with .ngc extension).

The netlist generation process is illustrated below.
3.2. Bitstream Generation For generating the bitstream configuration file, the Xilinx ISE implementation tools are used. It is possible to run the implementation tools from the ISE Project Navigator graphical user interface. However, these tools are usually invoked in batch mode via the XFlow command-line program. This program has a simple, flexible, and user customizable interface to the Xilinx ISE implementation tools. The XFlow program reads the netlist files generated by the PlatGen tool along with a user constraints file (UCF) and invokes the Xilinx ISE implementation tools (NGDBuild, MAP, PAR). Then the XFlow program calls the BitGen tool to generate the system.bit configuration file for the FPGA device. The generated configuration file does not include the executable code of any software application. The BitGen tool also generates a BMM file (system_bd.bmm) that contains the physical locations of the BRAM memories.
20. The Xilinx Embedded Development Kit: generating the software platform and executable code

A software platform is a collection of libraries and device drivers that are used by any software application. Before creating a software application, a software platform has to be generated with the LibGen tool. This tool uses a description of the software system maintained by the XPS development environment. This description is stored in a file known as the Microprocessor Software Specification (MSS) file, with the .mss extension, which is analogous to the MHS file that contains the description of the hardware platform. The MSS file, together with the software applications, is the main source that describes the software part of the embedded system.

LibGen configures libraries and device drivers using the information stored in the MSS file and generates the following archives of object files:
-libc.a: Standard C library;
-libXil.a: Xilinx library;
-libm.a: Math functions library.

In addition to these object files, the LibGen tool also generates a Board Support Package (BSP) for each processor. This package is a collection of files containing software drivers associated with peripherals, selected libraries, standard I/O devices, and interrupt handler routines. Therefore, it is recommended to generate the BSPs after the hardware components are defined and the address map is specified.

After generating the libraries and BSPs, the next step consists in compiling the source files of the software applications and generating the executable file for each processor. This is performed by invoking the following GNU tools:
-Pre-processor: Replaces all macros with their definitions in the .c or .h files.
-Processor-specific compiler: Compiles C/C++ code and generates assembly-language code.
-Processor-specific assembler: Converts assembly-language code to object code.
-Linker: Combines all the object and library files into a single executable file using either a default linker script or a user-defined linker script. The linker script describes how the various sections of the same type (e.g., .text, .data) in all the object files are combined into the corresponding sections in the output executable file. The output executable file is an ELF (Executable and Linkable Format) file. This is a binary file named executable.elf that contains machine code. The process of generating the executable code: