Clock Distribution Using VHDL

ABSTRACT
Very-large-scale integration (VLSI) is the process of creating integrated circuits by combining thousands of transistor-based circuits into a single chip. VLSI began in the 1970s when complex semiconductor and communication technologies were being developed. The microprocessor is a VLSI device. The term is no longer as common as it once was, as chips have increased in complexity into the hundreds of millions of transistors.
VHDL stands for VHSIC Hardware Description Language. VHSIC is an abbreviation for Very High Speed Integrated Circuit, a project sponsored by the US Government and Air Force, begun in 1980, to advance techniques for designing VLSI silicon chips. VHDL is an IEEE standard. The quality of the clock signals is the most important factor for ensuring a chip's successful operation. In a design netlist, there are hundreds of thousands or millions of cells. Those cells can be classified into two types: combinational cells and sequential cells (including memories). The sequential cells are used for storing information and they must operate on clocks. After the placement stage of the design implementation process, all of the cells, including the sequential cells, are spread around the entire chip. The task of clock distribution is to distribute the clock signals to all of these sequential cells. This work is commonly called clock tree synthesis. This project presents the principal idea of how a clock tree is constructed and deals with how the clock is distributed.
DEPT OF ECE
Page 1 of 66
AIET
CHAPTER 1 INTRODUCTION
In the past few decades, integrated circuit technology has advanced rapidly. In synchronous integrated circuits, the clock is used to synchronize data transfer, and the performance and functionality of the entire system depend on the clock characteristics. Of late, clock distribution has become a demanding task for VLSI designers, as it consumes a growing fraction of resources such as wiring, design time and power. Uncertainties in clock network delays degrade performance and, in the worst case, cause functional errors. It is therefore imperative to model clock distribution networks accurately. Among these concerns, skew is the key parameter of interest. Clock skew is the difference in clock arrival time between different components across a chip. Because of this difference, the outputs of different circuits become valid at different times, which results in timing irregularity in the digital system [1]. In this paper, we introduce a synchronous counter circuit in which four D flip-flops are driven by the same clock, and we estimate the clock skew in this circuit using VHDL (Very High Speed Integrated Circuit Hardware Description Language). This paper is organized as follows. Section II describes the circuit design. Section III presents the implementation of the circuit in VHDL. Section IV shows the analysis and simulation results of the circuit. Section V presents the conclusion of our analysis. Clock signals are typically loaded with the greatest fan-out and operate at the highest speeds of any signal, either control or data, within the entire synchronous system.
Since the data signals are provided with a temporal reference by the clock signals, the clock waveforms must be particularly clean and sharp. Furthermore, these clock signals are particularly affected by technology scaling (Moore's law), in that long global interconnect lines become significantly more resistive as line dimensions are decreased. This increased line resistance is one of the primary reasons for the increasing significance of clock distribution on synchronous performance. Finally, any differences and uncertainty in the arrival times of the clock signals can severely limit the maximum performance of the entire system and create catastrophic race conditions in which an incorrect data signal may latch within a register.
The clock distribution network often takes a significant fraction of the power consumed by a chip. Furthermore, significant power can be wasted in transitions within blocks, even when their output is not needed. These observations have led to a power saving technique called clock gating, which involves adding logic gates to the clock distribution tree so that portions of the tree can be turned off when not needed. When a clock can be safely gated may be determined either through automatic analysis of the circuit or specified by the designer. The exact savings are very design dependent, but around 20-30% is often achievable.
Most synchronous digital systems consist of cascaded banks of sequential registers with combinational logic between each set of registers. The functional requirements of the digital system are satisfied by the logic stages. Each logic stage introduces delay that affects timing performance, and the timing performance of the digital design can be evaluated relative to the timing requirements by a timing analysis. Often special consideration must be made to meet the timing requirements. For example, the global performance and local timing requirements may be satisfied by the careful insertion of pipeline registers into equally spaced time windows to satisfy critical worst-case timing constraints. The proper design of the clock distribution network helps ensure that critical timing requirements are satisfied and that no race conditions caused by clock skew exist.
The delay components that make up a general synchronous system are composed of the following three individual subsystems: the memory storage elements, the logic elements, and the clocking circuitry and distribution network. Interrelationships among these three subsystems of a synchronous digital system are critical to achieving maximum levels of performance and reliability.
Clock gating is a popular technique used in many synchronous circuits for reducing dynamic power dissipation. Clock gating saves power by adding more logic to a circuit to prune the clock tree. Pruning the clock disables portions of the circuitry so that the flip-flops in them do not have to switch states. Switching states consumes power. When the flip-flops are not being switched, their switching power consumption goes to zero and only leakage currents are incurred. Clock gating works by taking the enable conditions attached to registers and using them to gate the clocks. A design must therefore contain these enable conditions in order to use and benefit from clock gating. Clock gating can also save significant die area as well as power, since it removes large numbers of multiplexers (muxes) and replaces them with clock gating logic. This clock gating logic is generally in the form of integrated clock gating (ICG) cells. Note, however, that the clock gating logic changes the clock tree structure, since it sits in the clock tree.
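The behaviour of a clock gating cell can be illustrated with a minimal Python sketch. The function names, the sampled waveforms and the structure assumed here (a latch that is transparent while the clock is low, followed by an AND gate) are illustrative assumptions, not a description of any particular library cell.

```python
def icg_gated_clock(clk, enable):
    """Behavioural model of a latch-based clock gating cell (assumed structure).

    clk and enable are equal-length sequences of 0/1 samples. The enable is
    captured by a latch that is transparent while clk is low, and the gated
    clock is clk AND latched_enable. Enable changes during the high phase
    therefore cannot truncate or create a clock pulse.
    """
    latched = 0
    gated = []
    for c, en in zip(clk, enable):
        if c == 0:              # latch is transparent during the low phase
            latched = en
        gated.append(c & latched)
    return gated


def rising_edges(samples):
    """Count 0 -> 1 transitions in a sampled waveform."""
    return sum(1 for a, b in zip(samples, samples[1:]) if a == 0 and b == 1)


# Hypothetical waveforms: the enable drops for part of the run.
clk    = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
enable = [1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1]
gated = icg_gated_clock(clk, enable)

print("free-running edges:", rising_edges(clk))
print("gated edges:       ", rising_edges(gated))
```

In this run the flip-flops behind the gate see four clock edges instead of six, which is the source of the switching power saving described above.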
1.1 Motivation
Clock distribution networks synchronize the data transfer between the data paths. In integrated circuits, proper design of a clock distribution network is necessary as it has a direct impact on system performance. Significant research has been done in both the industrial and academic communities on the design and optimization of clock distribution networks. To the best of our knowledge, there are no previous works on modeling clock distribution networks using hardware description languages. The need for accurate modeling of the general characteristics of clock distribution networks has motivated this research. Accurate models that identify any uncertainty in the clock signal arrival times at different significant points in the clock distribution network will be a great aid to circuit designers. The models developed as part of this research will be used to build the model library in the Distributed Processing Laboratory (DPL) at the University of Cincinnati.
1.2 Objective
This thesis deals with the following research questions.
1. Is it possible to accurately model clock distributions using VHDL? If so, what do the models look like?
2. Is VHDL versatile enough to model different clock distribution networks?
3. What are the characteristics of clock distribution that can be modeled by VHDL?
1.3 Approach
The approach taken in this research to model any clock distribution network is as follows. First, a library of the essential components of any clock distribution network is built. The components considered in this research are interconnects, buffers and the phase locked loop. In addition to these components, simple models for the oscillator and for the load of the clock distribution network are built. Using the library of components, a generic model for a clock distribution network can be generated. This satisfies the first research goal. To validate the second research goal, two case studies have been considered in this research. An H-tree clock distribution network and a regular pattern clock distribution network have been chosen for their versatility. By modeling these two different types of clock distribution networks, the versatility of VHDL-AMS is demonstrated. The important characteristic of a clock distribution network considered in this research is the skew of the clock signal.
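The appeal of the H-tree as a case study is that every sink is, by construction, the same wire distance from the root. The following Python sketch illustrates this under assumed, idealized geometry: each level runs a horizontal and then a vertical segment of equal length to the centres of four quadrants, and the segment length halves at every level. The function name and the dimensions are hypothetical.

```python
def h_tree_sinks(cx, cy, half, levels, length=0.0):
    """Return (x, y, wire_length) for each sink of an idealized H-tree.

    (cx, cy) is the current branching point, `half` is the segment length
    used at this level, and `length` is the wire length accumulated from
    the root. The geometry is an assumption made for illustration.
    """
    if levels == 0:
        return [(cx, cy, length)]
    sinks = []
    for dx in (-half, half):
        for dy in (-half, half):
            # One horizontal run of `half` followed by one vertical run of `half`.
            sinks += h_tree_sinks(cx + dx, cy + dy, half / 2,
                                  levels - 1, length + 2 * half)
    return sinks


sinks = h_tree_sinks(0.0, 0.0, 4.0, levels=3)
wire_lengths = {length for _, _, length in sinks}
print(len(sinks), "sinks, distinct root-to-sink wire lengths:", wire_lengths)
```

With three levels the sketch produces 64 sinks that all share a single root-to-sink wire length, so any skew in a real H-tree comes from load, buffer and process mismatch rather than from the nominal routing.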
1.4 Overview of Results

Experiments have been set up to validate the goals of the research. It is shown how the models developed for this research can be used to model clock distribution networks. The simulation and CPU times of the various models are reported to assess the simulation speed of the VHDL models. Two case studies were considered to demonstrate the versatility of the language in modeling clock distribution networks. Skew is an important performance-limiting factor in any clock distribution. The skew variation with varying levels of the H-tree, interconnect lengths, load capacitance and the number of stages in a regular pattern clock distribution network is analyzed.
CHAPTER 2 BACKGROUND
This chapter discusses the need for modeling clock distribution networks in Section 2.1. A brief description of the related work is provided in Section 2.2. An overview of the VHDL language used in this research is also given, covering the language constructs and the modeling techniques used to model a mixed-signal system. Background information on clock distribution networks is presented in Section 2.3, and the components of clock distribution networks (interconnects, buffers and phase locked loops) are discussed in Section 2.4. Finally, the characteristics of clock distribution considered in this research are discussed in Section 2.5.
2.1 Need for Modeling Clock Distribution Networks

Clock distribution over an entire chip is a very complex problem and is one of the main challenges in the design of today's high-performance processors. Clock distribution has a significant impact on the performance of the entire system and contributes heavily to the total power dissipation of the chip. Any inaccuracy in clock timing may be critical to circuit operation and result in functional errors. An accurate model of the clock distribution network for any VLSI circuit is helpful for accurate performance evaluation [1]. It will be of great help to circuit designers to model uncertainties in the clock signal arrival times between key points in a clock distribution network.
2.2 Related Work

In the literature, some works model the impact of process variations on clock skew [4] and the effect of technology scaling on clock skew and interconnect delay [5]. Research has also been done to model the effects of systematic within-die interconnect parameter variations, such as metal thickness, metal line width and inter-layer dielectric thickness variations, on circuit performance [6]. Buffer insertion and the effect of process variations on buffer sizing have also been studied [7]. Clock skew analysis has also been reported in many research papers [8]. However, most of the above works use statistical analysis and Monte Carlo simulations to find the effects of different variations on skew. To the best of our knowledge, no work has been reported that models a clock distribution network using a hardware description language (HDL) while taking parameters like skew and process variations into account. Research efforts have been made to model the phase locked loop and its components for mixed-mode simulation [9, 10, 11], but there are no existing works which link the phase locked loop with the clock distribution network and model its effect on the performance of the clock distribution. The novel feature of this research is that a clock distribution network is modeled using an HDL by modeling its components (interconnects, buffers, phase locked loop, source and load) and their impact on clock skew and on the rise and fall times of the clock. Use of an HDL combines the flexibility of the language in modeling with the accuracy and speed of simulation. The HDL used for this research is VHDL.
2.3 Clock Distribution Networks

Designing a clock distribution network is a very complex task for circuit designers. The difficulties in clock distribution design are compounded by device technology improvements that lead to smaller feature sizes, larger chip areas and increased component density, since they result in higher interconnect resistance and higher clock loads [14]. Some of the clock distribution networks used in certain microprocessors are discussed below. The clock distribution network used in the Intel IA-64 microprocessor is shown in Figure 2.1. The significant segments are as follows.
The global clock distribution connects the PLL clock generator to the de-skewing buffers.
The regional clock distribution connects the clock from the de-skewing buffers to the local clock regions using clock grids.
The local clock distribution connects the clock from the regional clock grid to the clocked elements using local clock buffers and local interconnections.
Figure 2.1: Clock distribution network in Intel IA-64 microprocessor [8]
The clock distribution network of the 600MHz Alpha microprocessor is depicted in Figure 2.2. The clock is generated on-chip using a PLL which multiplies an 80-200MHz reference clock to generate a frequency of 600MHz. The feedback loop of the PLL includes the clock distribution network up to and including the global clock (GCLK) to control phase alignment. A high-gain buffer network is used to route GCLK to a central point on the die. From there the clock is distributed by buffered X and H trees, as shown in Figure 2.3, to the GCLK drivers located in a windowpane pattern across the chip.
Figure 1: Clock distribution network in 600MHz Alpha Microprocessor
Figure 2: Global Clock Network [15]
To summarize, any clock distribution network will usually contain the following components: a phase-locked loop for on-chip clock generation, clock buffers to drive the large capacitive load in the network, and local interconnects to connect the clock to the clock driving points. Other components of clock distribution networks include de-skewing buffers, delay locked loops, etc. In this research, interconnects, buffers and PLLs will be studied in detail.
2.4 Components of a Clock Distribution Network
2.4.1 Buffers

Another important component of any clock distribution network is the buffer. Typically the clock is connected to a large number of components in the circuit, resulting in large loads. The characteristic load consists of the clock distribution wires and the end points which drive the logic blocks in the circuit. Buffers are inserted between the clock source and the load to ensure that the clock signals at the different end points have small rise and fall times. Clock buffer design involves the following steps: deciding on the buffer delay and the output slopes, and choosing the buffer sizes based on the load capacitance to be driven. The clock buffer delays affect the clock skew of the network and hence are chosen such that the clock skew budget is met. The effect of clock buffer delays on skew is shown in Figure 2.5.
Figure 3: Clock skew variation with buffer delay
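Choosing buffer sizes from the load capacitance can be sketched with the familiar tapered buffer chain, in which each stage is larger than the previous one by a constant ratio. This is a generic sizing rule used here for illustration, not the method of this work; the function name, the target stage ratio of 4 and the capacitance values are assumptions.

```python
import math


def size_buffer_chain(c_in, c_load, target_ratio=4.0):
    """Size a tapered buffer chain driving c_load from an input cap of c_in.

    Returns the input capacitance of every stage and the stage ratio
    actually used. target_ratio is an assumed design preference.
    """
    total = c_load / c_in
    stages = max(1, round(math.log(total) / math.log(target_ratio)))
    ratio = total ** (1.0 / stages)        # equal ratio per stage
    caps = [c_in * ratio ** i for i in range(stages)]
    return caps, ratio


# Hypothetical values: 10 fF first stage driving a 2.56 pF clock load.
stage_caps, ratio = size_buffer_chain(c_in=10e-15, c_load=2560e-15)
for i, cap in enumerate(stage_caps, start=1):
    print(f"stage {i}: input capacitance {cap * 1e15:.1f} fF")
print(f"stage ratio {ratio:.2f}")
```

Because every stage sees the same ratio between its load and its own input capacitance, the stage delays and output slopes are roughly equal, which makes the total buffer delay easier to budget against the skew target.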
2.4.2 Phase Locked Loop

A Phase Locked Loop (PLL) is used as a frequency synthesizer to multiply a reference frequency generated by a crystal oscillator to the much higher frequencies needed for today's higher-end microprocessors. In principle, a PLL synchronizes the frequency of the output signal generated by an oscillator with the frequency of a reference signal and eliminates any phase misalignment. Some of the other applications of PLLs in communications are carrier and clock recovery, frequency and phase modulation and demodulation [17]. Thus, the main purpose of a PLL in integrated circuits is to generate the clock and to obtain accurate phase synchronization between the off-chip reference clock and the internal clock signals. A rudimentary PLL (shown in Figure 2.6) typically has the following three basic blocks in a feedback loop: Phase Detector (PD), Low Pass Filter (LPF) and Voltage Controlled Oscillator (VCO).
Figure 4: Block diagram of a basic PLL (reference signal, PD, LPF, VCO)
The phase detector may be a simple analog multiplier. Based on the requirements of the application, more complicated phase detectors are used in practice. The loop filters are optional and are used to increase the bandwidth of the PLL and to reduce the phase noise. The VCO is an oscillator producing an output frequency proportional to its control voltage [18]. The main drawback of a rudimentary PLL using an analog multiplier as a phase detector is that it has a finite phase error that is a function of input frequency. The PLLs in contemporary use are charge-pump PLLs, which can track the phase accurately, resulting in practically zero nominal phase error regardless of the input frequency. A charge-pump PLL, shown in Figure 2.7, has the following blocks: Phase Detector (PD), Charge Pump (CP), Low Pass Filter (LPF) and Voltage Controlled Oscillator (VCO).
Figure 5: Block diagram of charge-pump PLL (reference signal, PD, CP, LPF, VCO, frequency divider, output)
A brief discussion of the different blocks of a PLL is given below.
2.4.3 Phase Detector

A phase detector compares the reference signal with the output signal of the PLL and outputs the phase error. There is no output from the PD if the two signals have the same phase and frequency. Otherwise, the phase error is used to generate a control voltage for the voltage controlled oscillator such that the phase error is driven to zero.
2.4.4 The charge pump

The charge pump converts the phase error detected by the PD into a current or voltage to control the frequency generated by the VCO. The charge pump can be used to set the phase detector gain, KD. The charge pump either charges or discharges the filter's capacitors based on the output of the PD. If the reference signal leads the VCO output, the output of the PD signals the charge pump to pump more charge into the capacitors. If it lags, an equivalent amount of charge is discharged from the filter's capacitors.
2.4.5 Low Pass Filter

The main purpose of the low pass filter is to modify the bandwidth of the PLL and to reduce the phase error. The filter converts the charge from the charge pump into a voltage that controls the frequency of the VCO. A passive RC filter can be used as a simple low pass filter. The higher the order of the filter, the better the noise rejection in the PLL. The filter can be a passive RC filter or an active filter using op-amps. If the control voltage of the VCO is less than the voltage generated by the charge pump, a passive RC filter will suffice; if not, an active filter is used.
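For the simple passive RC case, the filter bandwidth follows from the standard first-order relation f_c = 1 / (2πRC). The Python sketch below evaluates the cutoff and the gain of a single-pole RC filter; the function names and the component values are assumptions chosen only to illustrate the arithmetic.

```python
import math


def rc_cutoff_hz(r_ohm, c_farad):
    """-3 dB cutoff frequency of a first-order passive RC low pass filter."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)


def rc_gain(f_hz, r_ohm, c_farad):
    """Magnitude of the filter transfer function at frequency f_hz."""
    return 1.0 / math.sqrt(1.0 + (f_hz / rc_cutoff_hz(r_ohm, c_farad)) ** 2)


# Hypothetical loop filter values: 10 kOhm and 1 nF.
R, C = 10e3, 1e-9
f_c = rc_cutoff_hz(R, C)
print(f"cutoff: {f_c / 1e3:.2f} kHz")
print(f"gain at 10 x cutoff: {rc_gain(10 * f_c, R, C):.3f}")
```

Components of the phase detector output well above the cutoff are attenuated before they reach the VCO control input, which is how the filter reduces ripple on the control voltage.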
2.4.6 Voltage Controlled Oscillator

The Voltage Controlled Oscillator is the heart of the PLL, as it dominates the phase noise performance of the entire PLL. It produces the required frequency for the PLL. The frequency of the VCO can be controlled by a control voltage. The output of the VCO is fed back to the phase detector, and the phase difference between the reference signal and the output of the VCO is changed into a DC output voltage. This DC voltage controls the frequency of the VCO in such a way that the output matches the reference signal closely. This process is called acquiring lock.
2.4.7 Frequency Divider

Typically the output frequency of a PLL is a multiple of the reference signal frequency. Hence, a frequency divider is used as part of the feedback loop to divide the frequency generated by the VCO down to a value that can be compared with the reference signal. Of late, frequency dividers have become indispensable in any PLL circuit, as the output clock frequencies of today's microprocessors are much higher than their input clock frequencies.
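How the feedback divider makes the loop settle at a multiple of the reference can be seen in a small phase-domain simulation. The Python sketch below is a linearized first-order model, not a model of a charge-pump PLL: the phase detector output is simply proportional to the accumulated phase error. The function name, gains, free-running frequency and time step are all assumed values.

```python
def simulate_pll(f_ref, n_div, f_free, k_vco, k_pd, dt, steps):
    """Linearized first-order PLL with a divide-by-N feedback path.

    phase_err is the accumulated phase error (in cycles) between the
    reference and the divided VCO output. k_pd is in volts per cycle and
    k_vco in hertz per volt. Returns (final VCO frequency, final phase error).
    """
    phase_err = 0.0
    f_vco = f_free
    for _ in range(steps):
        v_ctrl = k_pd * phase_err            # phase detector and ideal filter
        f_vco = f_free + k_vco * v_ctrl      # VCO frequency follows the control voltage
        phase_err += (f_ref - f_vco / n_div) * dt
    return f_vco, phase_err


# Hypothetical loop: 100 MHz reference, divide-by-6, VCO free-running at 590 MHz.
f_out, err = simulate_pll(f_ref=100e6, n_div=6, f_free=590e6,
                          k_vco=100e6, k_pd=1.0, dt=1e-9, steps=2000)
print(f"locked output: {f_out / 1e6:.1f} MHz, residual phase error: {err:.3f} cycles")
```

The loop settles where the divided VCO frequency equals the reference, so the output is N times the reference. The non-zero residual phase error in this proportional model is the finite phase error attributed above to a rudimentary PLL.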
2.5 Characteristics of Clock Distribution Network

Clock distribution is crucial for any digital system. Ideally, clock signals should have zero skew, zero jitter, negligible rise and fall times and specified duty cycles. In reality, clock signals have non-zero skew, non-zero jitter, considerable rise and fall times and varying duty cycles. Power consumption is another important performance metric of any clock distribution network, as it may take a large portion of the total power consumption of the entire chip. As the clock frequency increases, clock inaccuracy occupies a considerable percentage of the clock period. In present-day microprocessors, clock skew takes up as much as 10% of the available clock cycle time [22]. Thus, clock distribution networks can be modeled for various characteristics like clock skew, clock jitter, rise and fall times and variations in duty cycle at different driving points in the distribution network. The characteristic of clock distribution considered in this research is clock skew.
2.5.1 Clock Skew

An ideal clock is defined as a signal which arrives at different register inputs at the same time. But due to static mismatches in the clock paths and clock load variations, clocks are not ideal. The absolute delay of any clock path is less important than the relative arrival of the clock at different points in the circuit. Clock skew is defined as the spatial variation in arrival time of the clock at different clock terminals in the circuit. Clock skew results in a phase shift between these terminals and is one of the important performance-limiting factors of the system. Usually, circuit designers have a clock skew budget to meet, beyond which the system will not function correctly at the desired frequency. Clock distributions usually target zero or minimum skew for efficient performance. Zero skew is obtained when the phase delays from the clock source to all the clock terminals, calculated with a delay model such as the Elmore delay model, are equal under ideal process conditions. Clock skew results mostly from the different delays associated with the clock buffers present on chip. The common design technique used to reduce skew is to equalize the capacitive load of the clock signal as seen by each clock buffer. Skew can be caused either by a systematic effect, which is predictable, or by a random effect, which is not. Some of the causes of skew are variations in the effective channel lengths of devices, inter-layer dielectric (ILD) thickness variations, process variations, threshold voltage variations, power supply voltage variations, temperature variations across the die, design errors and capacitive coupling in the circuit.
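The Elmore delay model mentioned above can be sketched in Python for a single buffered wire: the delay to the far end is the sum, over every capacitance, of that capacitance times the resistance it shares with the path from the driver. The wire is approximated as a ladder of identical RC segments. The function name and all electrical values are assumptions chosen for illustration.

```python
def elmore_delay_line(r_per_mm, c_per_mm, length_mm, r_driver, c_load,
                      segments=1000):
    """Elmore delay (seconds) from a driver to the end of a uniform RC wire.

    The wire is split into `segments` lumped RC sections; each section's
    capacitance is charged through all the resistance upstream of it.
    """
    r_seg = r_per_mm * length_mm / segments
    c_seg = c_per_mm * length_mm / segments
    delay = 0.0
    r_upstream = r_driver
    for _ in range(segments):
        r_upstream += r_seg
        delay += r_upstream * c_seg
    delay += r_upstream * c_load           # sink capacitance at the far end
    return delay


# Hypothetical wire: 100 ohm/mm, 0.2 pF/mm, 50 ohm driver, 50 fF sink load.
params = dict(r_per_mm=100.0, c_per_mm=0.2e-12, r_driver=50.0, c_load=50e-15)
short_branch = elmore_delay_line(length_mm=2.0, **params)
long_branch = elmore_delay_line(length_mm=3.0, **params)
skew = long_branch - short_branch
print(f"2 mm branch: {short_branch * 1e12:.1f} ps")
print(f"3 mm branch: {long_branch * 1e12:.1f} ps")
print(f"skew:        {skew * 1e12:.1f} ps")
```

Two sinks fed through wires of unequal length see different Elmore delays even with identical drivers and loads, which is the static path mismatch that a zero-skew design tries to remove.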
CHAPTER 3
3.1 CLOCK DISTRIBUTION

The quality of the clock signals is the most important factor for ensuring a chip's successful operation. In a design netlist, there are hundreds of thousands or millions of cells. Those cells can be classified into two types: combinational cells and sequential cells (including memories). The sequential cells are used for storing information and they must operate on clocks. After the placement stage of the design implementation process, all of the cells, including the sequential cells, are spread around the entire chip. The task of clock distribution is to distribute the clock signals to all of these sequential cells. This work is commonly called clock tree synthesis. Figure 4.36 shows the principal idea of how a clock tree is constructed. As depicted, a clock network may be constructed in tree fashion. Starting from the clock source, the first level of clock buffers is laid out, then the second level, then the third level, and so on. In most designs, there are many clock domains, and each domain has hundreds or thousands of sequential cells attached to it. This many cells cannot be driven by a single buffer from the clock source, even with the strongest buffer in the library. A tree structure deals with this problem by letting each buffer drive only the number of loads that it is allowed to drive. As a result, the quality of the clock signal, in terms of slew rate (the rise and fall times of the clock edges), is not significantly degraded when it reaches the leaf sequential cells. Figure 4.37 shows the commonly used clock tree structures in clock distribution networks: trunk, branch-tree, mesh, X-tree and H-tree. Figure 4.38 is an example of how a real clock tree looks in a design block. In this simple example, there is one level of clock buffers between the clock root and the leaves.
Another type of clock distribution network is the clock grid. In this approach, a grid of metal structure, which covers the entire chip, is dedicated to the distribution of clock signals, as graphically shown in Figure 3.1.
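The way a fan-out limit determines the depth of a clock tree can be sketched in a few lines of Python. The function works upward from the sinks, adding a level of buffers until a single root buffer remains. The function name, the sink count and the fan-out limit are assumed values, and real clock tree synthesis also accounts for placement and wire load, which this sketch ignores.

```python
def clock_tree_levels(n_sinks, max_fanout):
    """Return the number of buffers per level, root first.

    Each buffer drives at most max_fanout loads (sinks or lower-level
    buffers). Placement and wire capacitance are ignored in this sketch.
    """
    if max_fanout < 2:
        raise ValueError("max_fanout must be at least 2")
    levels = []
    loads = n_sinks
    while loads > 1:
        buffers = -(-loads // max_fanout)   # ceiling division
        levels.append(buffers)
        loads = buffers
    return levels[::-1]


# Hypothetical clock domain: 5000 sequential cells, at most 20 loads per buffer.
levels = clock_tree_levels(5000, 20)
print("buffers per level (root first):", levels)
```

For these assumed numbers three levels of buffers are enough, and no buffer in the tree drives more than its allowed number of loads.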
Figure 6 A basic clock tree.
Figure 7 Commonly used tree structures in clock distribution networks.
Figure 8 An example of a clock tree in chip design.
A tree structure usually consumes less wiring and thus less capacitance and less routing resources, which results in lower power and less latency. However, a tree structure must be carefully tuned and it is very load (placement) dependent. In contrast, a grid structure uses significantly more routing resources and thus has large capacitance and large latency, but it tends to be less load dependent as any leaf cell can always find a nearby tapping point to connect to directly. As a result, a grid structure clock distribution network is typically used only for high-end applications, such as microprocessors, whereas a tree structure is widely used for ASIC-based designs. The clock distribution network consumes more than 10% of the total power used by the chip in large designs. During each clock cycle, the capacitance associated with the entire clock structure must be charged to the supply voltage and
subsequently dumped to ground, with the stored energy lost as heat. To ease this problem, resonant clock distribution has been actively studied by some groups. In this method, the traditional tree- or grid-driven clock structure is augmented with on-chip inductors that resonate with the clock capacitance at the clock's fundamental frequency. The energy at the fundamental frequency resonates back and forth between its electric and magnetic forms rather than being dissipated as heat. The clock driver is then needed only to replace the energy lost during operation. This idea is depicted in Figure 3.3.
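The power argument can be made concrete with two standard relations: the power spent charging and discharging a capacitance C every cycle is P = C·V²·f, and an inductance L resonates with C at f = 1 / (2π·√(L·C)). The Python sketch below evaluates both. The function names and the capacitance, supply voltage and frequency are assumed values, not figures from this work.

```python
import math


def clock_switching_power(c_farad, v_supply, f_hz):
    """Power dissipated when c_farad is fully charged and discharged each cycle."""
    return c_farad * v_supply ** 2 * f_hz


def resonant_inductance(c_farad, f_hz):
    """Inductance that resonates with c_farad at frequency f_hz."""
    return 1.0 / ((2.0 * math.pi * f_hz) ** 2 * c_farad)


# Hypothetical clock network: 1 nF total capacitance, 1.0 V supply, 1 GHz clock.
C_CLK, VDD, F_CLK = 1e-9, 1.0, 1e9
power = clock_switching_power(C_CLK, VDD, F_CLK)
inductance = resonant_inductance(C_CLK, F_CLK)
print(f"conventional clock power: {power:.2f} W")
print(f"inductance for resonance: {inductance * 1e12:.1f} pH")
```

In the resonant scheme the driver no longer supplies this full switching power each cycle, only the resistive losses of the resonant tank.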
3.2 The Key Requirements For Constructing A Clock Tree

The key requirements for constructing a clock tree are clock skew and insertion delay. Clock skew is the maximum difference among the arrival times at the leaf cells in a clock domain. In Figure 4.41, the result of a SPICE analysis of a clock tree is demonstrated. A clock pulse is injected into the clock tree at time 0 ns with a rise time of 1 ns. After traveling inside the tree, the clock signal arrives at the leaves (also called clock sinks) at approximately 3.4 ns. However, the arrival times at the leaves are not the same because of the different physical locations of the leaf cells. They spread over a range of approximately 1 ns, which is defined as the clock skew. In other words, the existence of skew means that not all of the sequential cells in a particular clock domain receive their clock signals at exactly the same moment, as desired. Clock skew is significant because it eats up the time budget assigned for logic operations. If skew exceeds the desired budget, the chip might not function correctly at its designed speed (a setup violation), or might not function at all (a hold violation).
Clock tree insertion delay is the time difference between the clock signal starting at the source and the clock signal being received at the leaf cells. The concept of insertion delay is also depicted in Figure 4.41. Insertion delay is important because the designer might need to balance clock tree delays between different clock domains for cross-domain information exchange. Insertion delay also impacts I/O timing constraints. These scenarios are graphically demonstrated in Figure 4.42, where the insertion delays of CLK1_TREE and CLK2_TREE must be balanced for the proper exchange of data between the logic cells of the two domains. For the CLK2 domain, the value of the insertion delay must be known so that the communication between I/O cells (DATA_IN, DATA_OUT) and logic cells can be carried out safely.
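The two metrics, and the way skew produces setup and hold violations, can be sketched in Python. Insertion delay is taken here as the latest sink arrival time, which is one common convention and an assumption of this sketch. The slack expressions are the standard single-cycle setup and hold checks. All names and timing values are hypothetical, apart from the arrival times, which were chosen to echo the 3.4 ns arrival and 1 ns spread quoted above.

```python
def clock_tree_metrics(arrival_ns):
    """Skew and insertion delay from a mapping of sink name to arrival time (ns)."""
    latest = max(arrival_ns.values())
    earliest = min(arrival_ns.values())
    return {"insertion_delay_ns": latest, "skew_ns": latest - earliest}


def timing_slack(period, t_launch, t_capture, t_cq,
                 t_logic_max, t_logic_min, t_setup, t_hold):
    """Setup and hold slack (ns) for one launch/capture register pair."""
    skew = t_capture - t_launch            # positive when the capture clock is later
    setup_slack = period + skew - (t_cq + t_logic_max + t_setup)
    hold_slack = (t_cq + t_logic_min) - (t_hold + skew)
    return setup_slack, hold_slack


arrivals = {"ff_a": 2.9, "ff_b": 3.4, "ff_c": 3.9}
metrics = clock_tree_metrics(arrivals)
print(metrics)

common = dict(period=5.0, t_cq=0.2, t_logic_max=4.0, t_logic_min=0.3,
              t_setup=0.1, t_hold=0.05)
# Capture clock later than launch clock: setup is relaxed, hold is at risk.
late_capture = timing_slack(t_launch=2.9, t_capture=3.9, **common)
# Capture clock earlier than launch clock: hold is relaxed, setup is at risk.
early_capture = timing_slack(t_launch=3.9, t_capture=2.9, **common)
print("late capture  (setup, hold):", late_capture)
print("early capture (setup, hold):", early_capture)
```

A negative setup slack can be recovered by lowering the clock frequency, while a negative hold slack cannot, which is why the text treats the hold violation as the case in which the chip might not function at all.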
3.3 Difference Between Time Skew and Length Skew in a Clock Tree
Clock tree synthesis is a crucial step in a chips physical design. The quality of the clock tree has a great impact on the status of timing closure. One of the critical metrics in measuring the clock tree quality is the time skew, which is the maximum arrive time difference among the clock sinks. Physically, the time skew is caused by the different locations of the clock sinks on chip. Figure 3.5 is an abstract view of the physical locations of a clock trees leaf cells. Figure 3.6 presents the same information in a real layout .As seen, the clock sinks are spread within a certain region. From the clock source to various clock sinks, the physical distances are different. Hence, when connections are completed by metal routing, the wire lengths are not the same. The maximum wire length difference is referred to as length skew .Physically, the clock tree is composed of clock buffers and routing wires. Therefore, the time delay from the clock source to any clock sink is affected by two factors: the gates delay and the wires delay. Since these two types of delay scale differently among different process, temperature, and voltage (PTV) conditions, a time-balanced clock tree in one PTV corner might experience significant time skew in another PTV corner if the clock tree is constructed with a considerable amount of length skew. This scenario is worsened when the process geometry becomes smaller because wire delay carries more weight in the total delay equation. Ideally, among different branches of a clock tree, it is desired to match gate delay with gate delay and wire delay with wire delay. In other words, time skew should be minimized by using the approach of minimizing the length skew such that the amount of time skew is preserved over different PTV conditions. This is especially helpful for the on-chip variation (OCV) optimization .Figure 3.7 depicts the relationship between time skew and length skew for the same clock tree in Figure 3.5 . 
As shown in this space-time plot, this clock tree has six levels. Any vertical line in this plot represents a gate delay, since a gate has time delay but no length skew. Wire delays are expressed by nearly horizontal lines, which have a large length difference but a small time difference. At Level 4 and Level 5, the clock tree starts to grow different branches. Consequently, length skew is seen at these levels. The time skew for this tree is ~30 ps, whereas the length skew is approximately 250 µm. Figure 3.7 is the same space-time relationship in a three-dimensional (3D) world. Figure 3.8 is the 3D plot of a very large clock tree with 23,942 sinks.
The time skew discussed above is called global skew, which is usually pessimistic. A more specific term, local skew, is defined as the time difference for the clock signals to reach the sinks that have data exchange activities among them. Local skew is more precise and useful for circuit analysis, but extracting the information needed to compute it is beyond the capability of current tools.
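The notion of time skew between two sinks can be illustrated with a small simulation-only sketch. The entity name skew_demo and the two branch delays (120 ps and 150 ps) are assumptions chosen for illustration, not values from this design; by construction they differ by the ~30 ps quoted above. The simulator resolution is assumed to be 1 ps or finer.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical testbench: one clock source feeding two sinks through
-- branches with different (assumed) insertion delays.
entity skew_demo is
end entity skew_demo;

architecture sim of skew_demo is
  constant DELAY_A : time := 120 ps;  -- assumed delay of branch A
  constant DELAY_B : time := 150 ps;  -- assumed delay of branch B
  signal clk_src : std_logic := '0';
  signal clk_a   : std_logic := '0';
  signal clk_b   : std_logic := '0';
begin
  -- A few clock cycles, then stop so the simulation ends on its own.
  stimulus : process
  begin
    for i in 1 to 4 loop
      clk_src <= '1';
      wait for 5 ns;
      clk_src <= '0';
      wait for 5 ns;
    end loop;
    wait;
  end process stimulus;

  -- Transport delay models the total gate plus wire delay of each branch.
  clk_a <= transport clk_src after DELAY_A;
  clk_b <= transport clk_src after DELAY_B;

  -- Record the arrival time of the first rising edge at each sink and
  -- report the difference (the time skew between the two sinks).
  measure : process
    variable t_a, t_b : time;
  begin
    wait until rising_edge(clk_a);
    t_a := now;
    wait until rising_edge(clk_b);
    t_b := now;
    report "time skew between sinks: " & time'image(t_b - t_a);
    wait;
  end process measure;
end architecture sim;
```

The measurement assumes branch A is the faster one; a more general sketch would record both edges independently and take the absolute difference.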
Figure 9 Cell based ASIC design methodology.
Figure 10 Abstract view of the physical distribution of a clock sink.
Figure 11 Layout view of the physical distribution of a clock sink.
Figure 12 The clock tree in a three-dimensional spacetime plot.
Figure 13 A large clock tree of 23,942 sinks.
CHAPTER 4
4.1 VLSI
4.1.1 INTRODUCTION
Very-large-scale integration (VLSI) is the process of creating integrated circuits by combining thousands of transistor-based circuits into a single chip. VLSI began in the 1970s when complex semiconductor and communication technologies were being developed. The microprocessor is a VLSI device. The term is no longer as common as it once was, as chips have increased in complexity into the hundreds of millions of transistors.
4.1.2 Overview
The first semiconductor chips held one transistor each. Subsequent advances added more and more transistors, and, as a consequence, more individual functions or systems were integrated over time. The first integrated circuits held only a few devices, perhaps as many as ten diodes, transistors, resistors and capacitors, making it possible to fabricate one or more logic gates on a single device. This level is now known retrospectively as "small-scale integration" (SSI). Improvements in technique led to devices with hundreds of logic gates, known as medium-scale integration (MSI), and then to large-scale integration (LSI), i.e. systems with at least a thousand logic gates. Current technology has moved far past this mark, and today's microprocessors have many millions of gates and hundreds of millions of individual transistors. At one time, there was an effort to name and calibrate various levels of large-scale integration above VLSI, and terms like ultra-large-scale integration (ULSI) were used. But the huge number of gates and transistors available on common devices has rendered such fine distinctions moot. Terms suggesting greater-than-VLSI levels of integration are no longer in widespread use. Even VLSI is now somewhat quaint, given the common assumption that all microprocessors are VLSI or better. As of early 2008, billion-transistor processors are commercially available, an example of which is Intel's Montecito Itanium chip. This is expected to become more commonplace as semiconductor fabrication moves from the current generation of 65 nm processes to the next 45 nm generation (while experiencing new challenges such as increased variation across process corners). Another notable example is NVIDIA's 280 series GPU.
This chip is unique in that its 1.4 billion transistors, capable of a teraflop of performance, are almost entirely dedicated to logic (Itanium's transistor count is largely due to its 24 MB L3 cache). Current designs, as opposed to the earliest devices, use extensive design automation and automated logic synthesis to lay out the transistors, enabling higher levels of complexity in the resulting logic functionality. Certain high-performance logic blocks like the SRAM cell, however, are still designed by hand to ensure the highest efficiency (sometimes by bending or breaking established design rules to obtain the last bit of performance by trading stability).
4.1.3 What is VLSI?

VLSI stands for "Very Large Scale Integration". This is the field that involves packing more and more logic devices into smaller and smaller areas. Put simply, an integrated circuit is many transistors on one chip. VLSI covers the design and manufacturing of extremely small, complex circuitry using modified semiconductor material. An integrated circuit (IC) may contain millions of transistors, each a few micrometres or less in size. Applications are wide-ranging and include most electronic logic devices.
History of Scale Integration
late 1940s: Transistor invented at Bell Labs
late 1950s: First IC (Jack Kilby at TI)
early 1960s: Small Scale Integration (SSI), 10s of transistors on a chip
late 1960s: Medium Scale Integration (MSI), 100s of transistors on a chip
early 1970s: Large Scale Integration (LSI), 1000s of transistors on a chip
early 1980s: VLSI, 10,000s of transistors on a chip (later 100,000s and now 1,000,000s)
Ultra LSI is sometimes used for 1,000,000s.
SSI - Small-Scale Integration (0 to 10^2)
MSI - Medium-Scale Integration (10^2 to 10^3)
LSI - Large-Scale Integration (10^3 to 10^5)
VLSI - Very Large-Scale Integration (10^5 to 10^7)
ULSI - Ultra Large-Scale Integration (>= 10^7)
Advantages of ICs over discrete components

While we will concentrate on integrated circuits, the properties of integrated circuits (what we can and cannot efficiently put in an integrated circuit) largely determine the architecture of the entire system. Integrated circuits improve system characteristics in several critical ways. ICs have three key advantages over digital circuits built from discrete components:
Size. Integrated circuits are much smaller: both transistors and wires are shrunk to micrometer sizes, compared to the millimetre or centimetre scales of discrete components. Small size leads to advantages in speed and power consumption, since smaller components have smaller parasitic resistances, capacitances, and inductances.
Speed. Signals can be switched between logic 0 and logic 1 much more quickly within a chip than they can between chips. Communication within a chip can occur hundreds of times faster than communication between chips on a printed circuit board. The high speed of circuits on-chip is due to their small size: smaller components and wires have smaller parasitic capacitances to slow down the signal.
Power consumption. Logic operations within a chip also take much less power. Once again, lower power consumption is largely due to the small size of circuits on the chip: smaller parasitic capacitances and resistances require less power to drive them.
VLSI and systems
These advantages of integrated circuits translate into advantages at the system level:
Smaller physical size. Smallness is often an advantage in itself; consider portable televisions or handheld cellular telephones.
Lower power consumption. Replacing a handful of standard parts with a single chip reduces total power consumption. Reducing power consumption has a ripple effect on the rest of the system: a smaller, cheaper power supply can be used; since less power consumption means less heat, a fan may no longer be necessary; a simpler cabinet with less electromagnetic shielding may be feasible, too.
Reduced cost. Reducing the number of components, the power supply requirements, cabinet costs, and so on, will inevitably reduce system cost. The ripple effect of integration is such that the cost of a system built from custom ICs can be less, even though the individual ICs cost more than the standard parts they replace.
Understanding why integrated circuit technology has such profound influence on the design of digital systems requires understanding both the technology of IC manufacturing and the economics of ICs and digital systems.
Applications
Electronic systems in cars.
Digital electronics that control VCRs.
Transaction processing systems, ATMs.
Personal computers and workstations.
Medical electronic systems, etc.
Applications of VLSI
Electronic systems now perform a wide variety of tasks in daily life. Electronic systems in some cases have replaced mechanisms that operated mechanically, hydraulically, or by other means; electronics are usually smaller, more flexible, and easier to service. In
other cases electronic systems have created totally new applications. Electronic systems perform a variety of tasks, some of them visible, some more hidden:
Personal entertainment systems such as portable MP3 players and DVD players perform sophisticated algorithms with remarkably little energy.
Electronic systems in cars operate stereo systems and displays; they also control fuel injection systems, adjust suspensions to varying terrain, and perform the control functions required for anti-lock braking (ABS) systems.
Digital electronics compress and decompress video, even at high-definition data rates, on-the-fly in consumer electronics.
Low-cost terminals for Web browsing still require sophisticated electronics, despite their dedicated function.
Personal computers and workstations provide word-processing, financial analysis, and games. Computers include both central processing units (CPUs) and special-purpose hardware for disk access, faster screen display, etc.
Medical electronic systems measure bodily functions and perform complex processing algorithms to warn about unusual conditions.
The availability of these complex systems, far from overwhelming consumers, only creates demand for even more complex systems. The growing sophistication of applications continually pushes the design and manufacturing of integrated circuits and electronic systems to new levels of complexity. Perhaps the most amazing characteristic of this collection of systems is its variety: as systems become more complex, we build not a few general-purpose computers but an ever wider range of special-purpose systems. Our ability to do so is a testament to our growing mastery of both integrated circuit manufacturing and design, but the increasing demands of customers continue to test the limits of design and manufacturing.
4.2 VHDL
4.2.1 Introduction
VHDL is an acronym for Very High Speed Integrated Circuit Hardware Description Language. The language can be used to model a digital system at many levels of abstraction, ranging from the algorithmic level to the gate level. The complexity of the digital system being modeled could vary from that of a simple gate to a complete digital electronic system. The VHDL language can be regarded as an integrated amalgamation of sequential, concurrent, netlist and waveform generation languages and timing specifications.
4.2.2 History of VHDL

VHDL stands for VHSIC (Very High Speed Integrated Circuit) Hardware Description Language. It was developed in the 1980s as a spin-off of a high-speed integrated circuit research project funded by the US Department of Defense. During the VHSIC program, researchers were confronted with the daunting task of describing circuits of enormous scale (for their time) and of managing very large circuit design problems that involved multiple teams of engineers. With only gate-level tools available, it soon became clear that more structured design methods and tools would be needed. To meet this challenge, teams of engineers from three companies (IBM, Texas Instruments and Intermetrics) were contracted by the Department of Defense to complete the specification and implementation of a new language-based design description method. The first publicly available version of VHDL, version 7.2, was released in 1985. In 1986, the IEEE was presented with a proposal to standardize the language, which it did in 1987 after enhancements and modifications were made by a team of commercial, government and academic representatives. The resulting standard, IEEE 1076-1987, is the basis for virtually every simulation and synthesis product sold today. An enhanced and updated version of the language, IEEE 1076-1993, was released in 1994, and VHDL tool vendors have been responding by adding these new language features to their products.
Although IEEE standard 1076 defines the complete VHDL language, there are aspects of the language that make it difficult to write completely portable design descriptions (descriptions that can be simulated identically using different vendors' tools). The problem stems from the fact that VHDL supports many abstract data types, but it does not address the simple problem of characterizing different signal strengths or commonly used simulation conditions such as unknowns and high impedances. Soon after IEEE 1076-1987 [3] was adopted, simulator companies began enhancing VHDL with new non-standard types to allow their customers to accurately simulate complex electronic circuits. This caused problems because design descriptions entered into one simulator were often incompatible with other simulation environments. VHDL was quickly becoming a non-standard.
To get around the problem of non-standard data types, an IEEE committee adopted another standard. This standard, numbered 1164, defines a standard package (a VHDL feature that allows commonly used declarations to be collected into an external library) containing definitions for a standard nine-value data type. This standard data type is called standard logic, and the IEEE 1164 package is often referred to as the standard logic package.
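As a minimal sketch of why the nine-value type matters, the following hypothetical tri-state buffer (the entity name tristate_buf and its ports are illustrative) uses the std_logic type from the IEEE 1164 package. The nine values of std_logic are 'U', 'X', '0', '1', 'Z', 'W', 'L', 'H' and '-', which cover the unknown and high-impedance conditions mentioned above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical tri-state buffer: the output is released to high
-- impedance ('Z') whenever the enable input is not asserted.
entity tristate_buf is
  port (
    d  : in  std_logic;  -- data input
    en : in  std_logic;  -- output enable, active high
    q  : out std_logic   -- tri-state output
  );
end entity tristate_buf;

architecture dataflow of tristate_buf is
begin
  -- 'Z' is one of the nine std_logic values; it cannot be expressed
  -- with the built-in two-valued bit type.
  q <= d when en = '1' else 'Z';
end architecture dataflow;
```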
The IEEE 1076-1987 and IEEE 1164 standards together form the complete VHDL standard in widest use today (IEEE 1076-1993 is slowly working its way into the VHDL mainstream, but it does not add a significant number of features for synthesis users).
In the search for a standard design and documentation tool for the Very High Speed Integrated Circuits (VHSIC) program, the United States Department of Defense (DoD) sponsored a workshop on HDLs at Woods Hole, Massachusetts, in the summer of 1981. The conclusion of the workshop was the need for a standard language, along with the features that might be required by such a standard. In 1983, the DoD established requirements for a standard VHSIC hardware description language (VHDL), based on the recommendations of the Woods Hole workshop. A contract for the development of the VHDL language, its environment, and its software was awarded to IBM, Texas Instruments and Intermetrics. VHDL 2.0 was released only six months after the project began. The language was significantly improved thereafter, and other shortcomings were corrected, leading to the release of VHDL 6.0. In 1985 these significant developments led to the release of the VHDL 7.2 language reference manual. This was later developed into the IEEE 1076/A VHDL language reference manual.
Efforts to define the new version of VHDL started in 1990 with a team of volunteers working under the IEEE DASC (Design Automation Standards Committee). In October of 1992, a new VHDL'93 was completed and released for review. After minor modifications, this new version was approved by the VHDL balloting group members and became the new VHDL language standard. The present VHDL standard is formally referred to as VHDL 1076-1993.
4.2.3 Levels of abstraction (Styles)

VHDL supports many possible styles of design description. These styles differ primarily in how closely they relate to the underlying hardware. When we speak of the different styles of VHDL, then, we are really talking about the differing levels of abstraction possible using the language. To give an example, it is possible to describe a counter circuit in a number of ways. At the lowest level of abstraction, you could use VHDL's hierarchy features to connect a sequence of predefined logic gates and flip-flops to form a counter circuit.
Figure 14 Levels of abstraction
In a behavioural description, the concept of time may be expressed precisely, with actual delays between related events, or may simply be an ordering of operations that are expressed sequentially. When you are writing VHDL for input to synthesis tools, you may use
behavioural statements in VHDL to imply that there are registers in your circuit. It is unlikely, however, that your synthesis tool will be capable of creating precisely the same behaviour in actual circuitry as you have defined in the language.
The highest level of abstraction supported in VHDL is called the behavioural level of abstraction. When creating a behavioural description of a circuit, you will describe your circuit in terms of its operation over time. The concept of time is the critical distinction between behavioural descriptions of circuits and lower-level descriptions.
If you are familiar with event-driven software programming languages then writing behaviour level VHDL will not seem like anything new. Just like a programming language, you will be writing one or more small programs that operate sequentially and communicate with one another through their interfaces. The only difference between behaviour-level VHDL and a software programming language such as Visual Basic is the underlying execution platform: in the case of Visual Basic, it is the Windows operating system; in the case of VHDL, it is a simulator.
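A behavioural description of the counter mentioned earlier might look like the following sketch. The entity name counter4, the port names and the synchronous reset are assumptions made for illustration. The process body reads like a small sequential program, which is the point made above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 4-bit synchronous counter described behaviourally.
entity counter4 is
  port (
    clk   : in  std_logic;
    reset : in  std_logic;                     -- synchronous, active high
    count : out std_logic_vector(3 downto 0)
  );
end entity counter4;

architecture behavioural of counter4 is
begin
  -- The statements inside the process execute sequentially each time
  -- the clock changes; the registers are implied by rising_edge.
  process (clk)
    variable value : unsigned(3 downto 0) := (others => '0');
  begin
    if rising_edge(clk) then
      if reset = '1' then
        value := (others => '0');
      else
        value := value + 1;                    -- wraps from 15 back to 0
      end if;
      count <= std_logic_vector(value);
    end if;
  end process;
end architecture behavioural;
```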
An alternate design method, in which a circuit design problem is segmented into registers and combinational input logic, is what is often called the dataflow level of abstraction. Dataflow is an intermediate level of abstraction that allows the drudgery of combinational logic to be hidden while the more important parts of the circuit, the registers, are more completely specified.
There are some drawbacks to using a purely dataflow method of design in VHDL. First, there are no built-in registers in VHDL; the language was designed to be general-purpose, and VHDL's designers placed the emphasis on its behavioural aspects. If you are going to write VHDL at the dataflow level of abstraction, then you must first create behavioural descriptions of the register elements that you will be using in your design. These elements must be provided in the form of components or in the form of subprograms.
But for hardware designers, for whom it can be difficult to relate the sequential descriptions and operation of behavioural VHDL with the hardware that is being described, using the dataflow level of abstraction can make quite a lot of sense. Using dataflow, it can be easier to relate a design description to actual hardware devices.
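The dataflow approach described above can be sketched as follows. Because VHDL has no built-in register, the sketch first describes a D flip-flop behaviourally and then uses it from a dataflow architecture in which the combinational input logic is a single concurrent assignment. The names dff and toggle_stage are hypothetical, and the initial value on the flip-flop output is a simulation convenience, not something a synthesis tool is guaranteed to honour.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Register element, described behaviourally so that it can be reused.
entity dff is
  port (
    clk : in  std_logic;
    d   : in  std_logic;
    q   : out std_logic := '0'   -- initial value for simulation only
  );
end entity dff;

architecture behavioural of dff is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      q <= d;
    end if;
  end process;
end architecture behavioural;

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical toggle stage: a register plus its combinational input logic.
entity toggle_stage is
  port (
    clk    : in  std_logic;
    enable : in  std_logic;
    q      : out std_logic
  );
end entity toggle_stage;

architecture dataflow of toggle_stage is
  signal d_next : std_logic;
  signal q_int  : std_logic;
begin
  -- Combinational input logic: toggle the stored bit when enabled.
  d_next <= q_int xor enable;

  -- The register itself is fully specified by the dff element.
  reg : entity work.dff
    port map (clk => clk, d => d_next, q => q_int);

  q <= q_int;
end architecture dataflow;
```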
The dataflow and behaviour levels of abstraction are used to describe circuits in terms of their logical function. There is a third style of VHDL that is used to combine such descriptions together into a larger, hierarchical circuit description. Structural VHDL allows you to encapsulate one part of a design description as a reusable component. Structural VHDL can be thought of as being analogous to a textual schematic, or as a textual block diagram for higher-level design.
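As a sketch of the structural style, the following example wires two small gate entities into a half adder, much as a schematic would. All names (and2, xor2, half_adder) are illustrative; the components are bound to the entities of the same name in the work library by default binding.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and2 is
  port (a, b : in std_logic; y : out std_logic);
end entity and2;

architecture dataflow of and2 is
begin
  y <= a and b;
end architecture dataflow;

library ieee;
use ieee.std_logic_1164.all;

entity xor2 is
  port (a, b : in std_logic; y : out std_logic);
end entity xor2;

architecture dataflow of xor2 is
begin
  y <= a xor b;
end architecture dataflow;

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical half adder built purely by connecting components.
entity half_adder is
  port (
    a, b  : in  std_logic;
    sum   : out std_logic;
    carry : out std_logic
  );
end entity half_adder;

architecture structural of half_adder is
  component and2
    port (a, b : in std_logic; y : out std_logic);
  end component;
  component xor2
    port (a, b : in std_logic; y : out std_logic);
  end component;
begin
  -- Each instance corresponds to one symbol on a schematic.
  u_sum   : xor2 port map (a => a, b => b, y => sum);
  u_carry : and2 port map (a => a, b => b, y => carry);
end architecture structural;
```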
4.2.4 Need for VHDL

The complex and laborious manual procedures for hardware design have paved the way for the development of languages for high-level description of digital systems. This high-level description can serve as documentation for the part as well as an entry point into the design process. The high-level description can be processed by synthesis tools and targeted to various boards or gate arrays. A hardware description language is such a language. VHDL was designed as a solution that provides an integrated design and documentation language to communicate design data between various levels of abstraction.
4.2.5 Advantages of VHDL

VHDL allows quick description and synthesis of circuits of 5, 10 or 20 thousand gates. The following are the major advantages of VHDL over other hardware description languages:
Power and flexibility: VHDL has powerful language constructs that allow complex control logic to be described in code.
Device-independent design: VHDL creates designs that fit into many device architectures, and it also permits multiple styles of design description.
Portability: VHDL's portability permits the design description to be used on different simulators and synthesis tools. Thus VHDL design descriptions can be used in multiple projects.
ASIC migration: The efficiency of VHDL allows a design to be synthesized on a CPLD or an FPGA. Sometimes the same code can be used with an ASIC.
Quick time to market and low cost: VHDL and programmable logic together facilitate a speedy design process. VHDL permits designs to be described quickly. Programmable logic eliminates expenses and facilitates quick design iterations.
The language can be used as a communication medium between different Computer Aided Design (CAD) and Computer Aided Engineering (CAE) tools.
The language supports hierarchy, i.e., a digital system can be modelled as a set of interconnected components; each component, in turn, can be modelled as a set of interconnected subcomponents.
The language supports flexible design methodologies: top-down, bottom-up, or mixed.
The language is technology independent, and hence the same behaviour model can be synthesized into different vendor libraries.
Various digital modelling techniques such as finite-state machine descriptions, algorithmic descriptions and Boolean equations can be modelled using the language.
It supports both synchronous and asynchronous timing models.
It is an IEEE and ANSI standard, and therefore models described using this language are portable.
There are no limitations imposed by the language on the size of the design.
The language has elements that make large-scale design modelling easier, e.g. components, functions, procedures and packages.
Test benches can be written using the same language to test other VHDL models.
Nominal propagation delays, min-max delays, setup and hold timing, timing constraints, and spike detection can all be described very naturally in this language.
Behavioural models that conform to a certain synthesis description style are capable of being synthesized to gate-level descriptions.
The capability of defining new data types provides the power to describe and simulate a new design technique at a very high level of abstraction without any concern about implementation details.
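The timing capabilities in the list above can be illustrated with a simulation-only sketch of a flip-flop that models a clock-to-output delay and checks its setup time. The entity name dff_timed and the two timing values are assumptions chosen for illustration, not figures from any library.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical flip-flop model with a propagation delay and a setup check.
entity dff_timed is
  generic (
    T_SETUP : time := 200 ps;   -- assumed setup time
    T_CQ    : time := 150 ps    -- assumed clock-to-output delay
  );
  port (
    clk : in  std_logic;
    d   : in  std_logic;
    q   : out std_logic
  );
end entity dff_timed;

architecture behavioural of dff_timed is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- d'last_event is the time elapsed since d last changed; if it is
      -- shorter than the setup time, the data arrived too late.
      assert d'last_event >= T_SETUP
        report "setup time violation on d"
        severity warning;

      -- The after clause models the nominal propagation delay.
      q <= d after T_CQ;
    end if;
  end process;
end architecture behavioural;
```

Synthesis tools generally ignore the after clause and the assertion; both are meaningful only in simulation.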
4.2.6 Design methodology using VHDL

There are three design methodologies, namely bottom-up, top-down and flat. The bottom-up approach involves defining and designing the individual components, then bringing them together to form the overall design. In a flat design, the functional components are defined at the same level as the interconnection of those functional components. A top-down design process involves a divide-and-conquer approach to implementing the design of a large system. Top-down design is referred to as recursive partitioning of a system into its sub-components until all sub-components become manageable design parts. The design of a component is manageable if the component is available as part of a library, if it can be implemented by modifying an already available part, or if it can be described for a synthesis program or an automatic hardware generator.
4.2.7 Elements of VHDL
Constructs of the VHDL language are designed for describing hardware components, packaging parts and utilities, using libraries, and specifying design libraries and parameters. In its simplest form, the description of a component in VHDL consists of an interface specification and an architectural specification. The interface description begins with the entity keyword and contains the input-output ports of the component. An architectural specification begins with the architecture keyword and describes the functionality of a component.
This functionality depends on input-output signals and other parameters that are specified in the interface description. Several architectural specifications with different identifiers can exist for one component with a given interface description. VHDL allows an architecture to be configured for a specific technology environment. In a hardware design environment it becomes necessary to group components or utilities used for the description of components. Components and such utilities can be grouped by use of packages. A package declaration contains components and utilities that become visible to entities and architectures. VHDL allows the use of libraries and the binding of subcomponents of a design to elements of various libraries. Constructs for such applications include the library statement and configurations.
4.2.8 VHDL language features

The various building blocks and constructs in VHDL which have been used are:
4.2.8.1 Entity
Every VHDL design description consists of at least one entity. In VHDL, an entity declaration describes the circuit as it appears from the "outside", from the perspective of its input and output interfaces. An entity declaration in VHDL provides the complete interface for a circuit. Using the information provided in an entity declaration (the port names and the data type and direction of each port), you have all the information you need to connect that portion of a circuit into other, higher-level circuits. The entity declaration includes a name (for example, compare for a comparator) and a port statement defining all the inputs and outputs of the entity. Each of the ports is given a direction (typically in, out or inout).
Formal Definition
It is the hardware abstraction of a digital system. An entity declaration describes the external view of the entity to the outside world.
Simplified syntax:
entity entity-name is
  [generic (generic-list);]
  port (port-list);
end entity-name;
Description
All designs are expressed in terms of entities. An entity is the most basic building block in a design. The uppermost level of the design is the top-level entity. If the design is hierarchical, then the top-level description will have lower-level descriptions contained in it. These lower-level descriptions will be lower-level entities contained in the top-level entity description.
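A minimal sketch of an entity declaration for a comparator is shown below. The generic WIDTH and the port names a, b and eq are assumptions made for illustration. Note that the generic clause, when present, precedes the port clause.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical comparator interface: two input buses and one output flag.
entity compare is
  generic (
    WIDTH : positive := 8                            -- bus width in bits
  );
  port (
    a  : in  std_logic_vector(WIDTH - 1 downto 0);
    b  : in  std_logic_vector(WIDTH - 1 downto 0);
    eq : out std_logic                               -- '1' when a equals b
  );
end entity compare;
```

The declaration says nothing about how the comparison is carried out; that is left to an architecture bound to this entity.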
4.2.8.2 Architecture
Every entity in a VHDL design description must be bound to a corresponding architecture. The architecture describes the actual function of the entity to which it is bound. Using the schematic as a metaphor, you can think of the architecture as being roughly analogous to a lower-level schematic pointed to by the higher-level functional block symbol. The second part of a minimal VHDL source file is the architecture declaration. Every entity declaration you write must be accompanied by at least one corresponding architecture. The architecture declaration begins with a unique name, followed by the name of the entity to which the architecture is bound. Within the architecture declaration is found the actual functional description of the circuit, for example a comparator. There are many ways to describe combinational logic functions in VHDL.
Formal Definition
A body associated with an entity declaration to describe the internal organization or operation of a design entity. An architecture body is used to describe the behavior, data flow or structure of a design entity.
Simplified syntax:
architecture architecture-name of entity-name is
  architecture-declarations
begin
  concurrent-statements
end [architecture] [architecture-name];
Description
An architecture assigned to an entity describes the internal relationship between the input and output ports of the entity. It consists of two parts: declarations and concurrent statements. The first (declarative) part of an architecture may contain declarations of types, signals, constants, subprograms (functions and procedures), components and groups. Concurrent statements in the architecture body define the relationship between inputs and outputs. This relationship can be specified using different types of statements: concurrent signal assignment, process statement, component instantiation, concurrent procedure call, generate statement, concurrent assertion statement, and block statement. An architecture can be written in different styles: structural, dataflow, behavioral (functional) or mixed.
The description of a structural body is based on component instantiation and generate statements. It allows creating hierarchical projects, from simple gates to very complex components describing entire subsystems. The connections among components are realized through ports.
The dataflow description is built with concurrent signal assignment statements. Each of the statements can be activated when any of its input signals changes its value.
In a behavioral description, the architecture body describes only the expected functionality (behavior) of the circuit, without any direct indication as to the hardware implementation. Such a description consists only of one or more processes, each of which contains sequential statements.
The architecture body may contain statements that define both the behavior and the structure of the circuit at the same time. Such an architecture description is called mixed.
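To illustrate that one entity may have several architectures written in different styles, the sketch below gives a hypothetical comparator entity (names and widths are assumptions) a dataflow architecture and a behavioral architecture that describe the same function.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical comparator interface, repeated so the sketch is complete.
entity compare is
  generic (WIDTH : positive := 8);
  port (
    a, b : in  std_logic_vector(WIDTH - 1 downto 0);
    eq   : out std_logic
  );
end entity compare;

-- Dataflow style: a single concurrent signal assignment that is
-- re-evaluated whenever a or b changes.
architecture dataflow of compare is
begin
  eq <= '1' when a = b else '0';
end architecture dataflow;

-- Behavioral style: a process containing sequential statements.
architecture behavioral of compare is
begin
  process (a, b)
  begin
    if a = b then
      eq <= '1';
    else
      eq <= '0';
    end if;
  end process;
end architecture behavioral;
```

Which of the two architectures is used for a given instance can be selected with a configuration.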
4.2.8.3 Component declaration

Formal Definition
A component declaration declares a virtual design entity interface that may be used in a component instantiation statement.
Simplified syntax:
component component-name
  [generic (generic-list);]
  port (port-list);
end component [component-name];
Component instantiation
Formal Definition
A component instantiation statement defines a subcomponent of the design entity in which it appears, associates signals or values with the ports of that subcomponent, and associates values with generics of that subcomponent.
Simplified syntax:
label : [component] component-name
  generic map (generic-association-list)
  port map (port-association-list);
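Component declaration and instantiation can be sketched together with a small clock-branch example. The entity names clk_buf and clk_branch, the generic T_DELAY and the delay values are hypothetical. The generic map has no terminating semicolon of its own; the statement ends after the port map.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical clock buffer with a configurable (simulation-only) delay.
entity clk_buf is
  generic (T_DELAY : time := 100 ps);
  port (a : in std_logic; y : out std_logic);
end entity clk_buf;

architecture dataflow of clk_buf is
begin
  y <= a after T_DELAY;
end architecture dataflow;

library ieee;
use ieee.std_logic_1164.all;

-- Two buffers in series form one branch of a clock tree.
entity clk_branch is
  port (clk_in : in std_logic; clk_out : out std_logic);
end entity clk_branch;

architecture structural of clk_branch is
  -- Component declaration: the virtual interface used by the instances.
  component clk_buf
    generic (T_DELAY : time := 100 ps);
    port (a : in std_logic; y : out std_logic);
  end component;

  signal mid : std_logic;
begin
  -- Component instantiations with generic and port associations.
  stage1 : clk_buf
    generic map (T_DELAY => 120 ps)
    port map (a => clk_in, y => mid);

  stage2 : clk_buf
    generic map (T_DELAY => 80 ps)
    port map (a => mid, y => clk_out);
end architecture structural;
```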
4.2.8.4 Configuration declaration

Formal Definition: A configuration is a construct that defines how component instances in a given block are bound to design entities in order to describe how design entities are put together to form a complete design.
Simplified syntax:
configuration configuration-name of entity-name is
  [configuration-declarations]
  for architecture-name
    for instance-label: component-name
      use entity library-name.entity-name(arch-name);
    end for;
  end for;
end [configuration] configuration-name;
Configuration instantiation
Formal Definition: A component instantiation statement defines a subcomponent of the design entity in which it appears, associates signals or values with the ports of that subcomponent, and associates values with generics of that subcomponent. When the instantiated unit is a configuration, the binding specified by that configuration is applied to the instance.
Simplified syntax:
label: configuration configuration-name
  [generic map (generic-association-list)]
  port map (port-association-list);
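As an illustration of how a configuration selects between architectures, the sketch below binds one component instance to a chosen architecture of an entity. All names (inv, fast, slow, top, u_inv, top_slow) and the 2 ns delay are hypothetical and serve only this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical inverter with two alternative architectures.
entity inv is
  port (a : in std_logic; y : out std_logic);
end entity inv;

architecture fast of inv is
begin
  y <= not a;                -- zero-delay model
end architecture fast;

architecture slow of inv is
begin
  y <= not a after 2 ns;     -- model with an assumed 2 ns delay
end architecture slow;

library ieee;
use ieee.std_logic_1164.all;

entity top is
  port (din : in std_logic; dout : out std_logic);
end entity top;

architecture structural of top is
  component inv is
    port (a : in std_logic; y : out std_logic);
  end component inv;
begin
  u_inv : inv port map (a => din, y => dout);
end architecture structural;

-- Configuration declaration: bind instance u_inv to architecture "slow".
configuration top_slow of top is
  for structural
    for u_inv : inv
      use entity work.inv(slow);
    end for;
  end for;
end configuration top_slow;
```

A parent design could then instantiate the configured unit directly, for example with a statement of the form `u_top : configuration work.top_slow port map (din => s_in, dout => s_out);`, where s_in and s_out are signals of the parent.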
4.2.8.5 Package
Formal Definition: A package declaration defines the interface to a package.
Simplified syntax:
package package-name is
  package-declarations
end [package] package-name;
Package body
Formal Definition: A package body defines the bodies of subprograms and the values of deferred constants declared in the interface to the package.
Simplified syntax:
package body package-name is
  package-body-declarations
  subprogram-bodies
end [package body] package-name;
4.2.8.6 Attributes
Attributes are of two types: user defined and predefined.
User defined
Formal Definition: A value, function, type, range, signal, or constant that may be associated with one or more named entities in a description.
Simplified syntax:
attribute attribute-name: type; -- attribute declaration
attribute attribute-name of item: item-class is expression; -- attribute specification
Description: Attributes allow retrieving information about named entities: types, objects, subprograms etc. Users can define new attributes and then assign them to named entities by specifying the entity and the attribute value for it.
Predefined
Formal Definition: A value, function, type, range, signal, or constant that may be associated with one or more named entities in a description.
Simplified syntax:
object'attribute-name
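The following sketch shows both kinds of attribute in one place. The entity attr_demo, the signal msb_reg and the user-defined attribute design_note are hypothetical names; 'event, 'high and 'length are predefined attributes of the language.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity attr_demo is
  port (
    clk  : in  std_logic;
    data : in  std_logic_vector(7 downto 0);
    msb  : out std_logic
  );
end entity attr_demo;

architecture rtl of attr_demo is
  signal msb_reg : std_logic := '0';

  -- User-defined attribute: declaration, then specification.
  -- The name design_note is an assumption made for this example.
  attribute design_note : string;
  attribute design_note of msb_reg : signal is "registered copy of data MSB";
begin
  process (clk) is
  begin
    -- Predefined signal attribute 'event detects a change on clk.
    if clk'event and clk = '1' then
      -- Predefined array attribute 'high gives the highest index (7 here).
      msb_reg <= data(data'high);
    end if;
  end process;

  msb <= msb_reg;

  -- Predefined array attribute 'length gives the number of elements.
  assert data'length = 8
    report "unexpected data width" severity failure;
end architecture rtl;
```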
4.2.8.7 Process statement
Formal Definition: A process statement defines an independent sequential process representing the behaviour of some portion of the design.
Simplified syntax:
[process-label:] process [(sensitivity-list)] [is]
  process-declarations
begin
  sequential-statements
end process [process-label];
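A common use of a process is to describe a clocked storage element. The sketch below models a D flip-flop with an asynchronous active-low reset; the entity name dff, the label ff and the port names are hypothetical and not taken from the design in this report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity dff is
  port (
    clk    : in  std_logic;
    nreset : in  std_logic;  -- asynchronous reset, active low
    d      : in  std_logic;
    q      : out std_logic
  );
end entity dff;

architecture behavioral of dff is
begin
  -- The statements between begin and end process execute sequentially
  -- each time clk or nreset changes.
  ff : process (clk, nreset) is
  begin
    if nreset = '0' then
      q <= '0';                 -- reset has priority over the clock
    elsif rising_edge(clk) then
      q <= d;                   -- capture d on the rising clock edge
    end if;
  end process ff;
end architecture behavioral;
```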
4.2.8.8 Function
Formal Definition: A function is a subprogram that is invoked as part of an expression and returns a value.
Simplified syntax:
function function-name (parameters) return type; -- function declaration
function function-name (parameters) return type is -- function definition
  [declarations]
begin
  sequential-statements
end [function] function-name;
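As a small illustration of the function syntax, the sketch below defines a function in the declarative part of an architecture and calls it from a concurrent assignment. The names parity_gen, xor_reduce, data and odd are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity parity_gen is
  port (
    data : in  std_logic_vector(7 downto 0);
    odd  : out std_logic   -- '1' when data holds an odd number of '1' bits
  );
end entity parity_gen;

architecture rtl of parity_gen is
  -- Function definition: XORs all elements of the vector together.
  function xor_reduce (v : std_logic_vector) return std_logic is
    variable result : std_logic := '0';
  begin
    for i in v'range loop
      result := result xor v(i);
    end loop;
    return result;
  end function xor_reduce;
begin
  -- Function call used as an expression.
  odd <= xor_reduce(data);
end architecture rtl;
```

Because the parameter v is an unconstrained std_logic_vector, the same function can be called with vectors of any width.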
4.2.8.9 Port
Formal Definition: A channel for dynamic communication between a block and its environment.
Simplified syntax:
port (port-declaration; port-declaration; ...);
-- port declarations:
port-signal-name : in port-signal-type [:= initial-value]
port-signal-name : out port-signal-type [:= initial-value]
port-signal-name : inout port-signal-type [:= initial-value]
port-signal-name : buffer port-signal-type [:= initial-value]
port-signal-name : linkage port-signal-type [:= initial-value]
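The port modes listed above can be seen together in the following sketch. The entity port_demo and all of its port names are hypothetical; the design has no purpose beyond showing how each mode is declared and used.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity port_demo is
  port (
    clk    : in     std_logic;
    enable : in     std_logic := '1';  -- default value if left unconnected
    count0 : buffer std_logic := '0';  -- driven here and also read back
    bus_io : inout  std_logic;         -- bidirectional connection
    done   : out    std_logic
  );
end entity port_demo;

architecture rtl of port_demo is
begin
  process (clk) is
  begin
    if rising_edge(clk) then
      if enable = '1' then
        count0 <= not count0;  -- a buffer port may be read inside the entity
      end if;
    end if;
  end process;

  -- Drive the bidirectional port only while enabled; release it otherwise.
  bus_io <= count0 when enable = '1' else 'Z';

  -- An out port is only driven here, never read.
  done <= not enable;
end architecture rtl;
```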
4.2.8.10 Sensitivity list

Formal Definition: A list of signals a process is sensitive to.
Simplified syntax:
(signal-name, signal-name, ...)
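For a combinational process, every signal that the process reads should appear in its sensitivity list, otherwise the simulated output can lag behind the inputs. The sketch below shows a two-input multiplexer; the entity name mux2 and its ports are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity mux2 is
  port (
    sel  : in  std_logic;
    a, b : in  std_logic;
    y    : out std_logic
  );
end entity mux2;

architecture behavioral of mux2 is
begin
  -- sel, a and b are all read inside the process,
  -- so all three appear in the sensitivity list.
  process (sel, a, b) is
  begin
    if sel = '1' then
      y <= b;
    else
      y <= a;
    end if;
  end process;
end architecture behavioral;
```

A process with a sensitivity list behaves like a process without one that ends with the statement `wait on sel, a, b;`.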
4.2.8.11 Standard logic

Formal Definition: A nine-value resolved logic type. std_logic is not a part of the VHDL standard itself; it is defined in IEEE Std 1164.
Simplified syntax:
type std_ulogic is ( 'U', -- Uninitialized
                     'X', -- Forcing Unknown
                     '0', -- Forcing 0
                     '1', -- Forcing 1
                     'Z', -- High Impedance
                     'W', -- Weak Unknown
                     'L', -- Weak 0
                     'H', -- Weak 1
                     '-'  -- Don't Care
                   );
type std_ulogic_vector is array (natural range <>) of std_ulogic;
function resolved (s : std_ulogic_vector) return std_ulogic;
subtype std_logic is resolved std_ulogic;
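Because std_logic is a resolved subtype, one signal may have several drivers, and the resolution function decides the resulting value. The sketch below models two tri-state drivers sharing a net; the entity tri_bus and its port and signal names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity tri_bus is
  port (
    en_a, en_b : in  std_logic;  -- output enables of the two drivers
    a, b       : in  std_logic;
    shared     : out std_logic
  );
end entity tri_bus;

architecture rtl of tri_bus is
  signal net : std_logic;  -- resolved type: more than one driver is legal
begin
  net <= a when en_a = '1' else 'Z';  -- driver 1
  net <= b when en_b = '1' else 'Z';  -- driver 2

  -- With one driver at 'Z', net takes the value of the other driver.
  -- If both drivers are enabled with opposite values, net resolves to 'X'.
  shared <= net;
end architecture rtl;
```

The same two assignments to a signal of the unresolved type std_ulogic would be rejected, since that type allows only a single driver.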
4.2.9 Data Types

There are many data types available in VHDL. VHDL allows data to be represented in terms of high-level data types. These data types can represent individual wires in a circuit, or can represent collections of wires using a concept called an array. The preceding description of the comparator circuit used the data types bit and bit_vector for its inputs and outputs. The bit data type has only two possible values, '1' and '0'; bit_vector is simply an array of bits. Every data type in VHDL has a defined set of values and a defined set of valid operations. Type checking is strict, so it is not possible, for example, to directly assign the value of an integer data type to a bit_vector data type. (There are ways to get around this restriction, using what are called type conversion functions.) VHDL is a rich language with many different data types.
The most common data types are listed below:
Bit: a 1-bit value representing a wire. (Note: IEEE standard 1164 defines a 9-valued replacement for bit called std_logic.)
Bit vector: an array of bits. (Replaced by std_logic_vector in IEEE 1164.)
Boolean: a True/False value.
Integer: a signed integer value, typically implemented as a 32-bit data type.
Real: a floating-point value.
Enumerated: used to create custom data types.
Record: used to group multiple data types as a collection.
Array: can be used to create single or multiple dimension arrays.
Access: similar to pointers in C or Pascal.
File: used to read and write disk files. Useful for simulation.
Physical: used to represent values such as time, voltage, etc. using symbolic units of measure (such as 'ns' or 'ma').
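Several of the types listed above, and the strict type checking mentioned earlier, can be sketched as follows. All names (types_demo, state_t, word_array_t, packet_t and so on) are hypothetical. The conversion relies on the ieee.numeric_std package, which this report does not otherwise mention and is assumed here to be available.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;  -- assumed available; provides to_unsigned

entity types_demo is
  port (
    count_in : in  integer range 0 to 15;
    vec_out  : out std_logic_vector(3 downto 0)
  );
end entity types_demo;

architecture rtl of types_demo is
  -- Enumerated type
  type state_t is (idle, run, done);

  -- Array type: four 8-bit words
  type word_array_t is array (0 to 3) of std_logic_vector(7 downto 0);

  -- Record type: groups fields of different types
  type packet_t is record
    valid : boolean;
    data  : std_logic_vector(7 downto 0);
  end record;

  -- Physical type value with a unit of measure
  constant clk_period : time := 10 ns;

  signal state : state_t := idle;
  signal words : word_array_t := (others => (others => '0'));
begin
  -- Strict typing: an integer cannot be assigned to a vector directly,
  -- so it is converted to unsigned and then to std_logic_vector.
  vec_out <= std_logic_vector(to_unsigned(count_in, 4));
end architecture rtl;
```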
4.2.10 Packages and Package Bodies.
A VHDL package declaration is identified by the package keyword, and is used to collect commonly used declarations for use globally among different design units. You can think of a package as being a common storage area, one used to store such things as type declarations, constants, and global subprograms.
A package can consist of two basic parts: a package declaration and an optional package body. Package declarations can contain the following types of statements:
Type and subtype declarations
Constant declarations
Global signal declarations
Function and procedure declarations
Attribute specifications
File declarations
Component declarations
Alias declarations
Disconnect specifications
Use clauses
Items appearing within a package declaration can be made visible to other design units through the use of a use statement.
If the package contains declarations of subprograms (functions or procedures) or defines one or more deferred constants (constants whose value is not given), then a package body is required in addition to the package declaration. A package body must have the same name as its corresponding package declaration, but can be located anywhere in the design. The relationship between a package and package body is somewhat akin to the relationship between an entity and its corresponding architecture. While the package declaration provides the information needed to use the items defined within it (the parameter list for a global procedure, or the name of a defined type or subtype), the actual behavior of such things as procedures and functions must be specified within package bodies.
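The relationship between a package declaration, its body and a design unit that uses it can be sketched as follows. The package clk_pkg, the deferred constant DIV_STAGES, the function all_low and the entity reset_check are hypothetical names introduced only for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Package declaration: the interface visible to other design units.
package clk_pkg is
  constant DIV_STAGES : positive;  -- deferred constant, value given in the body
  function all_low (v : std_logic_vector) return boolean;
end package clk_pkg;

-- Package body: same name as the declaration; holds the actual definitions.
package body clk_pkg is
  constant DIV_STAGES : positive := 15;

  -- Returns true when every element of v is '0'.
  function all_low (v : std_logic_vector) return boolean is
  begin
    for i in v'range loop
      if v(i) /= '0' then
        return false;
      end if;
    end loop;
    return true;
  end function all_low;
end package body clk_pkg;

library ieee;
use ieee.std_logic_1164.all;
use work.clk_pkg.all;  -- use clause makes the package items visible

entity reset_check is
  port (
    nreset   : in  std_logic_vector(3 downto 0);
    in_reset : out std_logic
  );
end entity reset_check;

architecture rtl of reset_check is
begin
  in_reset <= '1' when all_low(nreset) else '0';
end architecture rtl;
```

Only the package declaration needs to be visible to reset_check; the body can be changed and recompiled without editing the units that use the package.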
4.3 SOFTWARE USED
4.3.1. Xilinx

Xilinx software is used by VHDL designers to perform synthesis. Code that has been simulated and is written with synthesizable constructs can be synthesized and configured on an FPGA. Synthesis is the transformation of VHDL code into a gate level netlist. It is an integral part of current design flows.
4.3.2. Algorithm
1. Start the ISE software by clicking the Xilinx ISE icon.
2. Create a New Project and review the project properties that are displayed.
3. Create a VHDL source, specifying all inputs, outputs and buffers if required. This opens a window in which the VHDL code to be synthesized is written.
4. Check Syntax after editing the VHDL source to find any errors.
5. Run the design simulation after compilation.
6. Start synthesis by creating timing constraints.
7. Implement the design and verify the constraints.
8. Assign pin location constraints according to the requirements of the FPGA board.
9. Generate the .bit file and download the design to the Spartan FPGA board by clicking Configure Device; the message Program Succeeded confirms the download.
4.4 VERILOG HDL

Verilog HDL is a hardware description language that can be used to model a digital system at many levels of abstraction, ranging from the algorithmic level to the gate level to the switch level. The complexity of the digital system being modeled could vary from that of a simple gate to a complete electronic digital system, or anything in between. The digital system can be described hierarchically and timing can be explicitly modeled within the same description. The Verilog HDL language includes capabilities to describe the behavioral nature of a design, the dataflow nature of a design, a design's structural composition, delays and a waveform generation mechanism, including aspects of response monitoring and verification, all modeled using one single language. In addition, the language provides a programming language interface through which the internals of a design can be accessed during simulation, including the control of a simulation run. The language not only defines the syntax but also defines very clear simulation semantics for each language construct. Therefore, models written in this language can be verified using a Verilog simulator. The language inherits many of its operator symbols and constructs from the C programming language. Verilog HDL provides an extensive range of modeling capabilities, some of which are quite difficult to comprehend initially. However, a core subset of the language is quite easy to learn and use. This is sufficient to model most applications.
4.4.1 History
The Verilog HDL language was first developed by Gateway Design Automation in 1983 as a hardware modeling language for their simulator product. At that time, it was a proprietary language. Because of the popularity of the simulator product, Verilog HDL gained acceptance as a usable and practical language by a number of designers. In an effort to increase the popularity of the language, it was placed in the public domain in 1990. Open Verilog International (OVI) was formed to promote Verilog. In 1992, OVI decided to pursue standardization of Verilog HDL as an IEEE standard. This effort was successful and the language became an IEEE standard in 1995. The complete standard is described in the Verilog Hardware Description Language Reference Manual. The standard is called IEEE Std 1364-1995.
4.4.2 Major Capabilities
Listed below are the major capabilities of the Verilog hardware description language:
Primitive logic gates, such as and, or and nand, are built into the language.
Flexibility of creating a user-defined primitive (UDP). Such a primitive could either be a combinational logic primitive or a sequential logic primitive.
Switch-level modeling primitive gates, such as pmos and nmos, are also built into the language.
Explicit language constructs are provided for specifying pin-to-pin delays, path delays and timing checks of a design.
A design can be modeled in three different styles or in a mixed style. These styles are: behavioral style, modeled using procedural constructs; dataflow style, modeled using continuous assignments; and structural style, modeled using gate and module instantiations.
There are two data types in Verilog HDL: the net data type and the register data type. The net type represents a physical connection between structural elements, while a register type represents an abstract data storage element.
Figure 15 shows the mixed-level modeling capability of Verilog HDL; that is, in one design, each module may be modeled at a different level.
Figure 15: Mixed-level modelling
Verilog HDL also has built-in logic functions such as & (bitwise-and) and | (bitwise-or).
High-level programming language constructs such as conditionals, case statements, and loops are available in the language.
The notion of concurrency and time can be explicitly modeled.
Powerful file read and write capabilities are provided.
The language is non-deterministic under certain situations; that is, a model may produce different results on different simulators. For example, the ordering of events on an event queue is not defined by the standard.
4.4.3 SYNTHESIS
Synthesis is the process of constructing a gate level netlist from a register-transfer level model of a circuit described in Verilog HDL. Figure 16 shows such a process. A synthesis system may, as an intermediate step, generate a netlist that is comprised of register-transfer level blocks such as flip-flops, arithmetic-logic-units, and multiplexers, interconnected by wires. In such a case, a second program called the RTL module builder is necessary. The purpose of this builder is to build, or acquire from a library of predefined components, each of the required RTL blocks in the user-specified target technology.
Figure 16: Synthesis process
Having produced a gate level netlist, a logic optimizer reads in the netlist and optimizes the circuit for the user-specified area and timing constraints. These area and timing constraints may also be used by the module builder for appropriate selection or generation of RTL blocks. In this book, we assume that the target netlist is at the gate level. The logic gates used in the synthesized netlists are described in Appendix B. The module building and logic optimization phases are not described in this book. The above figure shows the basic elements of Verilog HDL and the elements used in hardware. A mapping mechanism or a construction mechanism has to be provided that translates the Verilog HDL elements into their corresponding hardware elements, as shown in Fig. 2-3.
Fig. 2-3: Typical design process
CHAPTER 5 SIMULATION MODEL
5.1 PROGRAM

library ieee;
use ieee.std_logic_1164.all;

entity clk_div is
  port (
    nreset    : in  std_logic_vector(3 downto 0); -- Reset
    clk_in    : in  std_logic;                    -- Clock Input
    clk_out1  : out std_logic;                    -- Clock Output1
    clk_out2  : out std_logic;                    -- Clock Output2
    clk_out3  : out std_logic;                    -- Clock Output3
    clk_out4  : out std_logic;                    -- Clock Output4
    clk_out5  : out std_logic;                    -- Clock Output5
    clk_out6  : out std_logic;                    -- Clock Output6
    clk_out7  : out std_logic;                    -- Clock Output7
    clk_out8  : out std_logic;                    -- Clock Output8
    clk_out9  : out std_logic;                    -- Clock Output9
    clk_out10 : out std_logic;                    -- Clock Output10
    clk_out11 : out std_logic;                    -- Clock Output11
    clk_out12 : out std_logic;                    -- Clock Output12
    clk_out13 : out std_logic;                    -- Clock Output13
    clk_out14 : out std_logic;                    -- Clock Output14
    clk_out15 : out std_logic                     -- Clock Output15
  );
end entity clk_div;

architecture clk_div of clk_div is
  signal div_2     : std_logic; -- Divide By 2^1
  signal div_4     : std_logic; -- Divide By 2^2
  signal div_8     : std_logic; -- Divide By 2^3
  signal div_16    : std_logic; -- Divide By 2^4
  signal div_32    : std_logic; -- Divide By 2^5
  signal div_64    : std_logic; -- Divide By 2^6
  signal div_128   : std_logic; -- Divide By 2^7
  signal div_256   : std_logic; -- Divide By 2^8
  signal div_512   : std_logic; -- Divide By 2^9
  signal div_1024  : std_logic; -- Divide By 2^10
  signal div_2048  : std_logic; -- Divide By 2^11
  signal div_4096  : std_logic; -- Divide By 2^12
  signal div_8192  : std_logic; -- Divide By 2^13
  signal div_16384 : std_logic; -- Divide By 2^14
  signal div_32768 : std_logic; -- Divide By 2^15
begin

  process (nreset, clk_in, div_2) is -- Divide by 2^1
  begin
    if (nreset = "0000") then
      div_2 <= '0';
    elsif (clk_in = '1' and clk_in'event) then
      div_2 <= not div_2;
    end if;
  end process;

  process (div_2, div_4, nreset) is -- Divide by 2^2
  begin
    if (nreset = "0000") then
      div_4 <= '0';
    elsif (div_2 = '1' and div_2'event) then
      div_4 <= not div_4;
    end if;
  end process;

  process (div_4, div_8, nreset) is -- Divide by 2^3
  begin
    if (nreset = "0000") then
      div_8 <= '0';
    elsif (div_4 = '1' and div_4'event) then
      div_8 <= not div_8;
    end if;
  end process;

  process (div_8, div_16, nreset) is -- Divide by 2^4
  begin
    if (nreset = "0000") then
      div_16 <= '0';
    elsif (div_8 = '1' and div_8'event) then
      div_16 <= not div_16;
    end if;
  end process;

  process (div_16, div_32, nreset) is -- Divide by 2^5
  begin
    if (nreset = "0000") then
      div_32 <= '0';
    elsif (div_16 = '1' and div_16'event) then
      div_32 <= not div_32;
    end if;
  end process;

  process (div_32, div_64, nreset) is -- Divide by 2^6
  begin
    if (nreset = "0000") then
      div_64 <= '0';
    elsif (div_32 = '1' and div_32'event) then
      div_64 <= not div_64;
    end if;
  end process;

  process (div_64, div_128, nreset) is -- Divide by 2^7
  begin
    if (nreset = "0000") then
      div_128 <= '0';
    elsif (div_64 = '1' and div_64'event) then
      div_128 <= not div_128;
    end if;
  end process;

  process (div_128, div_256, nreset) is -- Divide by 2^8
  begin
    if (nreset = "0000") then
      div_256 <= '0';
    elsif (div_128 = '1' and div_128'event) then
      div_256 <= not div_256;
    end if;
  end process;

  process (div_256, div_512, nreset) is -- Divide by 2^9
  begin
    if (nreset = "0000") then
      div_512 <= '0';
    elsif (div_256 = '1' and div_256'event) then
      div_512 <= not div_512;
    end if;
  end process;

  process (div_512, div_1024, nreset) is -- Divide by 2^10
  begin
    if (nreset = "0000") then
      div_1024 <= '0';
    elsif (div_512 = '1' and div_512'event) then
      div_1024 <= not div_1024;
    end if;
  end process;

  process (div_1024, div_2048, nreset) is -- Divide by 2^11
  begin
    if (nreset = "0000") then
      div_2048 <= '0';
    elsif (div_1024 = '1' and div_1024'event) then
      div_2048 <= not div_2048;
    end if;
  end process;

  process (div_2048, div_4096, nreset) is -- Divide by 2^12
  begin
    if (nreset = "0000") then
      div_4096 <= '0';
    elsif (div_2048 = '1' and div_2048'event) then
      div_4096 <= not div_4096;
    end if;
  end process;

  process (div_4096, div_8192, nreset) is -- Divide by 2^13
  begin
    if (nreset = "0000") then
      div_8192 <= '0';
    elsif (div_4096 = '1' and div_4096'event) then
      div_8192 <= not div_8192;
    end if;
  end process;

  process (div_8192, div_16384, nreset) is -- Divide by 2^14
  begin
    if (nreset = "0000") then
      div_16384 <= '0';
    elsif (div_8192 = '1' and div_8192'event) then
      div_16384 <= not div_16384;
    end if;
  end process;

  process (div_16384, div_32768, nreset) is -- Divide by 2^15
  begin
    if (nreset = "0000") then
      div_32768 <= '0';
    elsif (div_16384 = '1' and div_16384'event) then
      div_32768 <= not div_32768;
    end if;
  end process;

  clk_out1  <= div_2;
  clk_out2  <= div_4;
  clk_out3  <= div_8;
  clk_out4  <= div_16;
  clk_out5  <= div_32;
  clk_out6  <= div_64;
  clk_out7  <= div_128;
  clk_out8  <= div_256;
  clk_out9  <= div_512;
  clk_out10 <= div_1024;
  clk_out11 <= div_2048;
  clk_out12 <= div_4096;
  clk_out13 <= div_8192;
  clk_out14 <= div_16384;
  clk_out15 <= div_32768;

end architecture clk_div;
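The program above can be exercised in simulation with a small testbench. The following is only a sketch under stated assumptions: the testbench name clk_div_tb, the 10 ns clock period, the reset duration and the choice to observe only clk_out1 and clk_out2 are not taken from this report, and clk_div is assumed to be compiled into library work. Because every stage toggles on the rising edge of the stage before it, all outputs go high together on the first clock edge after reset is released, and each output then toggles at half the rate of the previous one.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity clk_div_tb is
end entity clk_div_tb;

architecture sim of clk_div_tb is
  constant CLK_PERIOD : time := 10 ns;  -- assumed input clock period

  signal nreset   : std_logic_vector(3 downto 0) := "0000";  -- start in reset
  signal clk_in   : std_logic := '0';
  signal clk_out1 : std_logic;
  signal clk_out2 : std_logic;
  signal stop     : boolean := false;
begin
  -- Device under test; outputs clk_out3 to clk_out15 are left unconnected.
  dut : entity work.clk_div
    port map (
      nreset   => nreset,
      clk_in   => clk_in,
      clk_out1 => clk_out1,
      clk_out2 => clk_out2
    );

  -- Free-running clock with rising edges at 5 ns, 15 ns, 25 ns, ...
  clock_gen : process is
  begin
    while not stop loop
      clk_in <= '0';
      wait for CLK_PERIOD / 2;
      clk_in <= '1';
      wait for CLK_PERIOD / 2;
    end loop;
    wait;
  end process clock_gen;

  stimulus : process is
  begin
    -- Hold reset for two clock periods; the divider must stay at '0'.
    wait for 2 * CLK_PERIOD;
    assert clk_out1 = '0' and clk_out2 = '0'
      report "outputs not low during reset" severity error;

    nreset <= "1111";  -- release reset

    -- First edge after reset: every stage toggles from '0' to '1'.
    wait until rising_edge(clk_in);
    wait for 1 ns;
    assert clk_out1 = '1' and clk_out2 = '1'
      report "unexpected state after first edge" severity error;

    -- Second edge: clk_out1 falls, so clk_out2 is not clocked.
    wait until rising_edge(clk_in);
    wait for 1 ns;
    assert clk_out1 = '0' and clk_out2 = '1'
      report "unexpected state after second edge" severity error;

    -- Third edge: clk_out1 rises again and toggles clk_out2.
    wait until rising_edge(clk_in);
    wait for 1 ns;
    assert clk_out1 = '1' and clk_out2 = '0'
      report "unexpected state after third edge" severity error;

    stop <= true;  -- end the simulation by stopping the clock
    wait;
  end process stimulus;
end architecture sim;
```

Since each output is clocked by the previous stage rather than by clk_in, the later outputs change slightly after the earlier ones in a real device; this stage-to-stage delay is a form of the clock skew discussed in this report.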
5.2 RESULTANT WAVEFORM
Figure 18: Output waveform of the clock distribution
CHAPTER 6 CONCLUSIONS AND FUTURE WORK
6.1 Conclusions

In this thesis we presented a novel approach to modeling a clock distribution network using VHDL-AMS. For this purpose, a set of models was developed for the clock distribution network, including components such as interconnects, buffers, the phase locked loop and the source oscillator. The models were simulated using the Cadence LDV 5.1 AMS simulator and were checked for functionality. Modeling of a generic clock distribution network was demonstrated using the VHDL-AMS models. This satisfied the first objective of this project.
Two case studies were considered in this research to demonstrate the versatility of VHDL-AMS in modeling a clock distribution network. In the first case, a balanced H-Tree based clock distribution network was modeled, and in the second case a regular pattern clock distribution network was modeled. This addressed the second objective of this project.
The characteristic of the clock studied in this research was clock skew. Its variation with the number of levels of the H-Tree, the interconnect lengths, the load capacitance and the number of stages in a regular pattern clock distribution network was studied. This satisfied the third objective of this research (Section 1.2). Compared to equivalent SPICE models, the VHDL-AMS models developed in this research had an average error ranging between 3.60% and 8.21%. This validated the accuracy of the VHDL-AMS models developed in this research.
6.2 Future Work

The suggested future work for this research is listed as follows:
A model generator can be developed to automatically output a VHDL-AMS model based on the user requirements for the clock distribution network.
In this research, only behavioral and structural levels of abstraction were considered for the selected components. To achieve high fidelity, the component level of abstraction can be considered.
The buffer models can be made more exhaustive by including process variation effects.
For high fidelity, transmission models can be generated for interconnects, which involves frequency domain modeling.
Higher order filters can be implemented in PLLs to make the model more accurate.
Effects of jitter can be included to improve the accuracy of a clock distribution network model.
Bibliography
1. FRIEDMAN, E. G., AND POWELL, S. Design and Analysis of a Hierarchical Clock Distribution System for Synchronous Standard Cell/Macro Cell. IEEE Journal of Solid-State Circuits (April 1986), Vol. SC-21, No. 2.
2. RESTLE, P. J., AND DEUTSCH, A. Designing the Best Clock Distribution Network. 1998 Symposium on VLSI Circuits Digest of Technical Papers.
3. FRIEDMAN, E. G. Clock Distribution Design in VLSI Circuits: an Overview. Proceedings of IEEE International Symposium on Circuits and Systems (May 1993), pp. 1475-1478.
4. ZANELLA, S., NARDI, A., NEVIANI, A., QUARANTELLI, M., SAXENA, S., AND GUARDIANI, C. Analysis of the Impact of Process Variations on Clock Skew. IEEE Transactions on Semiconductor Manufacturing (Nov 2000), Vol. 13, No. 4.
5. MEHROTRA, V., AND BONING, D. Technology Scaling Impact of Variation on Clock Skew and Interconnect Delay. International Interconnect Technology Conference (IITC) (June 2001), San Francisco, CA.
6. MEHROTRA, V., SAM, S. L., BONING, D., CHANDRAKASAN, A., VALLISHAYEE, R., AND NASSIF, S. A Methodology for Modeling the Effects of Systematic Within-Die Interconnect and Device Variation on Circuit Performance. 37th Conference on Design Automation (DAC 2000), pp. 172-175.
7. XI, J. G., AND DAI, W. Buffer Insertion and Sizing Under Process Variations for Low Power Clock Distribution. Proceedings of the 32nd ACM/IEEE Conference on Design Automation (1995), pp. 491-496.
8. WOLAVER, D. H. Phase-Locked Loop Circuit Design. Prentice Hall, 1991.
9. STENSBY, J. L. Phase-Locked Loops: Theory and Applications. CRC Press, 1997.
10. http://lsiwww.epfl.ch/LSI2001/teaching/webcourse/ch05/ch05.html
11. http://www-ensps.u-strasbg.fr/coursen/Option3A/ams_part1.html
12. http://hyperphysics.phy-astr.gsu.edu/hbase/electric/restmp.html
13. http://www.mosis.org/cgi-bin/cgiwrap/umosis/swp/params/ami-c5/t54gparams.txt
14. http://www.ece.cmu.edu/~ee762/hspice-ocs/html/hspice_and_qrg/hspice_2001_2-72.html
