
ELE863 VLSI Systems

On-Chip Power and Clock Distribution

Fei Yuan, PhD, PEng.
Department of Electrical & Computer Engineering
Ryerson University
Toronto, Ontario, Canada
Copyright (c) Fei Yuan, 2012

Copyright (c) F. Yuan


Preface
This chapter covers the fundamentals of on-chip power distribution and of on-chip clock generation and distribution. The material in this tutorial is drawn from various published texts and research papers. Students are strongly advised to read the cited references for further information on these subjects.


OUTLINE
Power Distribution
Clock Generation and Distribution
References


Power Distribution
Introduction
Power Distribution Guidelines
Power and Ground Distribution Trees
Dual Power and Ground Pads and Trees
Dual Power and Ground Distribution Rings


Introduction
Background
Power and ground distribution is the number one issue in chip floorplanning.
Power and ground must be distributed using metal interconnects because of their low resistance.
The top metal layer has the largest thickness (lower resistance per unit width), whereas the lower metal layers usually have a smaller but identical thickness.
The minimum width of metal interconnects for power and ground distribution is governed by (1) electromigration, which sets the maximum allowable average current, and (2) the instantaneous voltage drop along the metal interconnects, which sets the peak current.


Introduction (cont'd)
Basic Concepts
Peak current density:

$$J_{peak} = \frac{I_{peak}}{A}, \tag{1}$$

where $A$ = cross-sectional area of the interconnect.
Average current density:

$$J_{avg} = \frac{1}{T}\int_0^T j(t)\,dt, \tag{2}$$

where $T$ = period of the waveform of the current flowing through the interconnect.
RMS current density:

$$J_{rms} = \sqrt{\frac{1}{T}\int_0^T j^2(t)\,dt}. \tag{3}$$

Typical 0.35 µm CMOS parameters: at 110 °C, $J_{peak}$ = 1 mA/µm, $J_{peak,contact}$ = 0.95 mA/µm, $J_{peak,via}$ = 0.6 mA/µm. The average interconnect current can be derived from $J_{peak}$ by including the waveform of the current.
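The three current-density definitions can be evaluated numerically for a sampled waveform. The following is a minimal sketch, assuming the current is sampled uniformly over exactly one period; the pulse waveform and the cross-section area are hypothetical values chosen only for illustration.

```python
import math

def current_densities(current_samples, area):
    """Return (J_peak, J_avg, J_rms) for a current sampled uniformly over one period.

    current_samples: currents in amperes; area: cross-section area in m^2.
    """
    j = [i / area for i in current_samples]           # j(t) = i(t) / A
    n = len(j)
    j_peak = max(abs(x) for x in j)                   # peak density
    j_avg = sum(j) / n                                # (1/T) * integral of j(t)
    j_rms = math.sqrt(sum(x * x for x in j) / n)      # sqrt((1/T) * integral of j^2(t))
    return j_peak, j_avg, j_rms

# Hypothetical unidirectional pulse train: 1 mA for 25% of the period, 0 otherwise.
samples = [1e-3] * 25 + [0.0] * 75
area = 1e-12  # m^2, assumed 1 um x 1 um cross-section
j_peak, j_avg, j_rms = current_densities(samples, area)
print(j_peak, j_avg, j_rms)
```

For a rectangular pulse with duty ratio r, the sketch reproduces J_avg = r J_peak and J_rms = sqrt(r) J_peak, which is why the waveform must be known before an average current is derived from the peak value.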


Introduction (cont'd)
Electromigration
When a current flows through an interconnect, an electron wind is set up opposite to the direction of the current. These electrons, upon colliding with the metal ions, impart sufficient energy to displace the metal ions from their lattice sites, creating vacancies. These vacancies condense to form voids that increase the local resistance of the interconnect and eventually create open-circuit conditions.
The electromigration lifetime of interconnects is determined by $J_{avg}$ [10].
For signal lines, where the currents are usually bidirectional, less electromigration is observed.
For power lines, where the currents are usually unidirectional, severe electromigration exists.


Introduction (cont'd)
Self-Heating
Challenges in the thermal management of interconnects:
Increased number of interconnect layers: 3 metal layers for 0.35 µm, 6 metal layers for 0.18 µm, 8 metal layers for 0.13 µm, 9 metal layers for 50 nm. The top metal layer is far away from the substrate and is isolated by field oxide (SiO2), which has a very low thermal conductivity.
Reduced contact and via dimensions lead to increased current density at contacts and vias.
Low dielectric constant materials are being introduced as alternative insulators to reduce the cross-talk among interconnects and the parasitic capacitance between them. These materials have reduced thermal conductivity.


Power Distribution Guidelines


Power and ground interconnects must be sized such that there is a safe margin between the normal operating currents and the maximum allowable currents of the interconnects.
Power and ground interconnects must be sized such that the voltage drops over the power and ground wires are sufficiently small.
The number of contacts and vias in power and ground paths must be chosen so that their overall current-carrying ability is sufficient and the voltage drops over them are sufficiently small to ensure reliable operation.


Power Distribution Guidelines (cont'd)

Use separate power and ground interconnects for ESD protection circuitry and core circuitry, because ESD protection circuitry carries very large currents in the case of an ESD strike.
Use separate power and ground interconnects for analog and digital circuits. The details of this criterion were presented in the chapter on simultaneous switching noise (SSN); refer to the SSN chapter for the details.
Use separate power and ground interconnects for analog blocks and their bias circuitry if possible, because the operation of the biasing circuitry should be absolutely stable.

[Figure 1: Separate power and ground for analog blocks and their bias circuitry. Labels: Bias VDD, Main VDD, bias circuits, main analog circuits, Main VSS, Bias VSS.]


Power Distribution Guidelines (cont'd)

Use separate power and ground interconnects for analog blocks and their guard rings.

[Figure 2: Single and double guard rings. Top panel: a single guard ring tied to VSS between the digital portion and the analog portion, effective at collecting holes from the digital portion only. Bottom panel: double guard rings tied to VDD and VSS, effective at collecting electrons and holes from the digital portion.]


Power and Ground Distribution Trees


Power is distributed through trees, with the power supply at the root and the logic gates connected to the twigs.
Each branch must be wide enough to carry the current of all of its sub-branches.

[Figure 3: Power and ground distribution trees. Labels: VDD, VSS.]
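The sizing rule for tree branches (each branch carries the current of all of its sub-branches) can be sketched as a simple recursion. This is only an illustration: the tree, the load currents, and the 1 mA per µm width limit are assumed values, and the dictionary layout is a hypothetical data structure.

```python
def branch_current(node):
    """Current (A) a branch must carry: its own load plus that of every sub-branch."""
    own = node.get("load", 0.0)
    return own + sum(branch_current(child) for child in node.get("children", []))

def min_width_um(node, limit_ma_per_um=1.0):
    """Minimum branch width (um) for an assumed current limit in mA per um of width."""
    return branch_current(node) * 1e3 / limit_ma_per_um

# Hypothetical VDD tree: the root feeds one gate cluster and one sub-branch.
tree = {
    "children": [
        {"load": 2e-3},                                    # 2 mA cluster
        {"children": [{"load": 1e-3}, {"load": 3e-3}]},    # sub-branch: 1 mA + 3 mA
    ]
}
root_current = branch_current(tree)
root_width = min_width_um(tree)
sub_width = min_width_um(tree["children"][1])
print(root_current, root_width, sub_width)
```

The root must be the widest segment, and the width can shrink toward the twigs as fewer gates are fed by each branch.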


Power and Ground Distribution Trees (cont'd)

Figure 4: Power and ground distribution trees [6]


Dual Power and Ground Pads and Trees


Power and ground pads are placed on both sides of the chip to (1) reduce the length of the power and ground interconnects and (2) reduce the voltage drops along them.

[Figure 5: Dual power and ground pads and trees [?]. Labels: VDD and VSS pads on both sides of the circuits.]


Dual Power and Ground Pads and Trees

Figure 6: Dual power and ground pads and trees

As an example, let the sheet resistance be $R_\square = 0.05\ \Omega/\square$. If $I$ = 10 mA, $L$ = 1000 µm, and $W$ = 1 µm, the interconnect resistance is $R = R_\square \frac{L}{W} = 50\ \Omega$, and the voltage drop across the interconnect is $V = RI = 0.5$ V. More VDD and VSS pads should therefore be used to reduce the voltage drop along the VDD and VSS interconnects.
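The arithmetic of this example can be checked with a short sketch. The function names are hypothetical, and the halved-length case is only an assumed illustration of what a second pad on the opposite side of the chip might achieve when the wire current is unchanged.

```python
def wire_resistance(r_sheet, length_um, width_um):
    """Resistance (ohm) of a wire: sheet resistance times the number of squares L/W."""
    return r_sheet * length_um / width_um

def ir_drop(r_sheet, length_um, width_um, current_a):
    """Voltage drop (V) across the wire for a given current (A)."""
    return wire_resistance(r_sheet, length_um, width_um) * current_a

# Values from the example: 0.05 ohm/square, L = 1000 um, W = 1 um, I = 10 mA.
r = wire_resistance(0.05, 1000.0, 1.0)
v_single = ir_drop(0.05, 1000.0, 1.0, 10e-3)

# Assumed illustration: pads on both sides halve the worst-case wire length.
v_dual = ir_drop(0.05, 500.0, 1.0, 10e-3)
print(r, v_single, v_dual)
```

A 0.5 V drop is a large fraction of a typical supply voltage, which is why either the wire must be widened or the distance to the nearest pad shortened.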


Dual Power and Ground Distribution Rings


[Figure 7: Double power and ground distribution rings. Labels: VDD (RING) and VSS (RING) pads feeding the outer rings, VDD (CORE) and VSS (CORE) pads feeding the inner rings, signal pad, core circuits.]

The outer power and ground rings provide VDD and VSS for the ESD protection circuitry.
The inner power and ground rings provide VDD and VSS for the core circuitry.
Separate pads are used for the ring and core circuitry.


Clock Generation and Distribution


Oscillators
Clock Generators
Clock Signal Direction
Clock Skew
Clock Distribution


Oscillators
Crystal oscillators
Ring oscillators
LC tank oscillators


Crystal Oscillators

[Figure 8: Crystal oscillator. Labels: crystal, C1, C2.]

Crystal oscillators are mechanical oscillators.
Inverter 1 provides the needed voltage difference.
The crystal can be represented by an RLC equivalent circuit.
Their oscillation is highly stable, so they are used in most microprocessors.
Their oscillation frequency is low. When clocks of high frequencies are needed, frequency synthesizers and clock generators are required.


Ring Oscillators
Static CMOS Inverter Ring Oscillators
Fully Differential Ring Oscillators
Voltage-Controlled Ring Oscillators


Static CMOS Inverter Ring Oscillators


[Figure 9: Ring oscillators. Outputs: CLK and its complement.]

The number of inverters in the ring must be odd.
Oscillation starts by amplifying the noise residing in the circuit. Note that a static CMOS inverter has a very large voltage gain in its transition region, the region where both the nMOS and pMOS transistors are in saturation. The small noise is amplified fully, so that the inverters move from small-signal to full-swing operation.
Oscillation period $T = 2N\tau$, where $\tau$ = average propagation delay of the inverter and $N$ = number of inverters in the ring. The factor of 2 arises because a transition must travel around the ring twice, once rising and once falling, to complete one period.
Buffers (an inverter chain with gradually increased size) are needed to drive the load (CLK and its complement).
The delay is strongly affected by the fluctuation of VDD and VSS.
The oscillation frequency depends on process, temperature, and power supply, and is therefore unstable.
This type of ring oscillator is used for low-end processors and for applications where absolute clock accuracy is not critical.
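As a quick numerical illustration of how the ring length and the inverter delay set the clock frequency, the sketch below assumes the standard single-ended relation T = 2Nτ (each node makes one rising and one falling transition per period). The 5-stage, 100 ps values are hypothetical.

```python
def ring_oscillator_frequency(n_stages, tau):
    """Frequency (Hz) of an N-stage inverter ring with average stage delay tau (s).

    Assumes T = 2 * N * tau and requires an odd, positive number of inverters.
    """
    if n_stages < 1 or n_stages % 2 == 0:
        raise ValueError("a single-ended inverter ring needs an odd number of stages")
    period = 2 * n_stages * tau
    return 1.0 / period

# Hypothetical example: 5 inverters, 100 ps average propagation delay each.
f_nominal = ring_oscillator_frequency(5, 100e-12)

# A 10% increase in delay (for example from a supply droop) lowers the frequency.
f_slow = ring_oscillator_frequency(5, 110e-12)
print(f_nominal, f_slow)
```

The second call shows the sensitivity noted above: any fluctuation in the stage delay translates directly into a frequency change.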


Fully Differential Ring Oscillators

[Figure 10: Differential-pair ring oscillator. Labels: control voltage Vc, bias voltage Vb, outputs V+ and V-.]

The inverters are implemented using differential-mode logic circuits (current-mode logic (CML) and current-steering logic).
Differential-mode logic circuits have the advantages of (1) high speed, due to the reduced voltage/current swing, and (2) minimum switching noise, due to the constant tail current.
Oscillation period $T = 2N\tau$, where $\tau$ = average propagation delay of the inverter and $N$ = number of inverters in the ring.
The oscillation frequency is (ideally) independent of power and ground variations.
Attractive for high-speed applications such as RF and optical communication systems, as well as mixed-mode systems.
High phase noise due to the up-conversion of the 1/f noise of the tail current source [9].


Cross-Coupled VCOs
[Figure 11: Cross-coupled VCO. Labels: control voltage Vc, bias voltage Vb, outputs V+ and V-.]

The pMOS transistors are biased in the deep triode region and behave approximately as linear resistors [11, 12].
Positive feedback is used to speed up the state transition.
The transition region is the most sensitive to VDD and VSS fluctuations. Noise injected in this region contributes most to the timing jitter [9].
The tail current source is removed to eliminate the phase noise arising from the up-conversion of the 1/f noise of the tail current source.
Very attractive for low-noise applications.


LC Tank VCOs
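Parallel RLC networks

$$z(j\omega) = \frac{j\omega L_p}{\left[1 - \left(\frac{\omega}{\omega_o}\right)^2\right] + j\omega\frac{L_p}{R_p}}. \tag{4}$$

At $\omega_o = \frac{1}{\sqrt{L_p C_p}}$, $z(j\omega_o) = R_p$: the network becomes purely resistive.
At $\omega_o$, $z_L = R_p$ and $v_o = -g_m R_p v_{in}$ (-180 degree phase shift).

[Figure 12: RLC parallel network. Panels: |Z| versus frequency, peaking at R_p at ω_o; phase of Z versus frequency, from +90° (inductive) through 0° at ω_o (resistive) to -90° (capacitive).]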
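The behaviour of the parallel RLC network around resonance can be checked numerically. The sketch below evaluates the impedance of a parallel R_p, L_p, C_p network in the form jωL_p / ([1 - (ω/ω_o)^2] + jωL_p/R_p); the component values are hypothetical.

```python
import cmath
import math

def z_parallel_rlc(omega, r_p, l_p, c_p):
    """Impedance of R_p, L_p and C_p connected in parallel at angular frequency omega."""
    omega_o = 1.0 / math.sqrt(l_p * c_p)              # resonant frequency
    numerator = 1j * omega * l_p
    denominator = (1.0 - (omega / omega_o) ** 2) + 1j * omega * l_p / r_p
    return numerator / denominator

# Hypothetical tank: 1 nH, 1 pF, 500 ohm equivalent parallel resistance.
r_p, l_p, c_p = 500.0, 1e-9, 1e-12
omega_o = 1.0 / math.sqrt(l_p * c_p)

z_res = z_parallel_rlc(omega_o, r_p, l_p, c_p)         # purely resistive, equal to R_p
z_low = z_parallel_rlc(0.5 * omega_o, r_p, l_p, c_p)   # inductive below resonance
z_high = z_parallel_rlc(2.0 * omega_o, r_p, l_p, c_p)  # capacitive above resonance

for name, z in (("resonance", z_res), ("below", z_low), ("above", z_high)):
    print(name, abs(z), math.degrees(cmath.phase(z)))
```

The magnitude is largest at resonance, where the phase crosses zero; this is the operating point at which a transconductance stage loaded by the tank provides a gain of g_m R_p.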


LC Tank VCOs (cont'd)

[Figure 13: LC tank VCO. Labels: L_p (spiral inductors or active inductors), R_p, C_p (varactor, a voltage-controlled capacitor), outputs Vo+ and Vo-.]

At the resonant frequency $\omega_o$, $z_L = R_p$, $A_{v1} = -g_{m1}R_p$, and $A_{v2} = -g_{m2}R_p$. The two stages together give a combined -360 degree phase shift.
Once the loop gain satisfies $A_{v1}A_{v2} = g_{m1}g_{m2}R_p^2 \ge 1$, oscillation will start.
The oscillation frequency is controlled by adjusting the voltage of the varactors (voltage-controlled capacitors).


LC Tank VCOs (cont'd)

[Figure 14: Top - grounded diodes (varactors); bottom - floating diodes. Labels: terminals 1 and 2, p+ and n+ diffusions, n-well, p-substrate, substrate resistance R_sub, pn-junction.]

Varactors (voltage-controlled capacitors): the junction capacitance is a nonlinear function of the reverse biasing voltage,

$$C_J = \frac{C_{Jo}}{\sqrt{1 + \frac{V_r}{\phi_o}}}, \tag{5}$$

where $C_{Jo}$ = junction capacitance at zero biasing voltage, $\phi_o$ = built-in potential of the pn-junction, and $V_r$ = reverse biasing voltage of the pn-junction.
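To illustrate how a varactor tunes an LC tank, the sketch below combines a junction-capacitance model with the tank resonance f = 1/(2π sqrt(L_p C)). It assumes the square-root (abrupt-junction) dependence of the capacitance on reverse bias; the inductance, zero-bias capacitance, and built-in potential are hypothetical values.

```python
import math

def junction_capacitance(c_jo, v_r, phi_o):
    """Junction capacitance for reverse bias v_r, assuming an abrupt junction."""
    return c_jo / math.sqrt(1.0 + v_r / phi_o)

def resonant_frequency(l_p, c_p):
    """Resonant frequency (Hz) of an LC tank."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_p * c_p))

# Hypothetical values: 2 nH inductor, 1 pF zero-bias capacitance, 0.7 V built-in potential.
l_p, c_jo, phi_o = 2e-9, 1e-12, 0.7

c_zero = junction_capacitance(c_jo, 0.0, phi_o)   # equals C_Jo at zero bias
c_rev = junction_capacitance(c_jo, 2.1, phi_o)    # V_r = 3 * phi_o halves the capacitance
f_zero = resonant_frequency(l_p, c_zero)
f_rev = resonant_frequency(l_p, c_rev)
print(c_zero, c_rev, f_zero, f_rev)
```

Increasing the reverse bias lowers the capacitance and therefore raises the oscillation frequency, but only with the square root of the capacitance ratio, so the tuning range of such a VCO is limited.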


Clock Generators
[Figure 15: Clock generator. A single-phase master clock φ enters the clock generator, which outputs the multi-phase clocks φ1, φ2, ..., φn.]

Convert a single-phase master clock from a crystal oscillator into a set of multi-phase slave clocks.
Improve the driving ability of clock signals: use large buffers to provide large charging and discharging currents.
Improve the waveforms of clock signals: use a positive feedback mechanism to restore the waveforms.


Clock Generators (cont'd)

RS-Flip-Flop Clock Generators

[Figure 16: Waveforms of the RS-flip-flop clock generator (neglecting the delay of the inverter). Signals: CLK, CLK-1, CLK-2 versus time.]


Clock Generators (cont'd)

Buffered RS-Flip-Flop Clock Generators
Buffers are added to improve the driving ability of clock generators.

[Figure 17: Buffered RS-flip-flop clock generators. Labels: inverter with low Vth; signals CLK, CLK-1, CLK-2 versus time.]

The delay of the inverter must be small.
The inverter should have a low Vth so that it is activated before the NOR gates.
Positive feedback yields output waveforms with sharp edges even when the master clock does not have sharp edges.
Buffers are often large. When clocked, they dissipate a large amount of dynamic power.


Clock Generators (cont'd)

Differential Clock Generators
Motivation: reduce the voltage swing and increase the clock speed.
Circuit configuration:

[Figure 18: Differential clock buffer. Labels: Vdd, inputs Master clk and its complement, outputs Clk and its complement.]

Only a small swing of the master clock is needed, giving low power consumption and high speed.
Less noise is injected into the substrate because the bias current is constant.
Disadvantage: both the master clock and its complement are needed.
Note that CLK and its complement are single-ended signals.


Clock Generators (cont'd)

D-Latch and D Flip-Flop
D-latch: the output Q is transparent to the input D as long as the control signal C is present.
D flip-flop: the output Q is evaluated only at the transition edges of the control signal C and remains unchanged elsewhere.

[Figure 19: D-latch and D flip-flop. Panels: waveform of the D-latch; waveform of the D flip-flop.]


Clock Generators (cont'd)

50% Duty-Cycle Clock Generator

[Figure 20: 50% duty-cycle clock generator and its waveforms (neglecting the delay of the inverter). Two cascaded D-latches (D1, Q1 and D2, Q2) clocked by φ and its complement produce φ1; waveforms of φ, D1, Q1, D2, and Q2 versus time.]

The output of a D-latch remains unchanged if $\phi = 0$.
The frequency of $\phi_1$ is half of that of $\phi$.
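The divide-by-two behaviour can be illustrated with a small behavioural model. This is a simplified sketch, not the latch-level circuit of the figure: it assumes an ideal rising-edge-triggered D flip-flop whose D input is fed by its own inverted output, and the 25% duty-cycle input waveform is hypothetical.

```python
def divide_by_two(clk):
    """Toggle an output on every rising edge of clk (ideal D flip-flop with D = not Q)."""
    q = 0
    prev = 0
    out = []
    for c in clk:
        if prev == 0 and c == 1:   # rising edge: Q takes the value of not Q
            q = 1 - q
        out.append(q)
        prev = c
    return out

def rising_edges(signal):
    """Count the 0-to-1 transitions in a sampled binary signal."""
    return sum(1 for a, b in zip(signal, signal[1:]) if a == 0 and b == 1)

# Hypothetical master clock with a 25% duty cycle, 8 periods of 4 samples each.
clk = [0, 0, 0, 1] * 8
phi1 = divide_by_two(clk)

duty_in = sum(clk) / len(clk)
duty_out = sum(phi1) / len(phi1)
print(rising_edges(clk), rising_edges(phi1), duty_in, duty_out)
```

Because the output only changes on one edge of the input, its high and low intervals each last exactly one input period, so the output has a 50% duty cycle at half the input frequency regardless of the duty cycle of the input.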


Clock Generators (cont'd)

25% Duty-Cycle 25% Separation Clock Generator

[Figure 21: 25% duty-cycle 25% separation clock generator. Cascaded D flip-flops clocked by CLK; waveforms of CLK, D1, Q1, D2, and Q2, with the delay of the D flip-flop indicated.]


Clock Signal Direction


The output of Latch n responds to its inputs when $\phi_n = 1$ and $\overline{\phi}_n = 0$.
Ideally, the delay of Logic circuit n equals the delay of Delay cell n.
Logic circuit n+1 will respond to the output of Latch n as soon as it is available.
If the propagation delay of Logic circuit n+1 is less than that of Delay unit n+1, then the stored content will be affected by Logic circuit n when $\phi_n$ goes HIGH.

[Figure 22: Clock propagates in the same direction as the data. Data path: Logic circuit n, Latch n, Logic circuit n+1, Latch n+1. Clock path: φ_{n-1}, Delay n, φ_n, Delay n+1, φ_{n+1}, each with its complement.]


Clock Signal Direction (cont'd)

Latch n+1 goes into the latch mode before the output of Latch n begins to move.
Clock signals should therefore propagate in the direction opposite to the data.

[Figure 23: Clock propagates in the direction opposite to the data. Data path: Logic circuit n, Latch n, Logic circuit n+1, Latch n+1. Clock path: φ_{n+1}, Delay n+1, φ_n, Delay n, φ_{n-1}, each with its complement.]


Clock Skew
Clock Skew
Positive and Negative Clock Skew
System-Level Clock Skew
Phase-Lock Loops


Clock Skew
[Figure 24: Clock skew. Data passes through Logic circuit n, Latch n (clocked by φ_n), Logic circuit n+1, and Latch n+1 (clocked by φ_{n+1}).]

Ideally, $\phi_n$ and $\phi_{n+1}$ are synchronized: zero clock skew.
Clock skew is the difference in clock signal arrival time between two sequentially adjacent registers (blocks):

$$\delta_{skew} = \delta_n - \delta_{n+1}, \tag{6}$$

where $\delta_n$ = delay from the clock source to stage n.
Clock skew is due to unbalanced paths (i.e. different propagation delays). Even when the layouts of two paths are identical, different neighboring devices will result in different propagation delays and, subsequently, clock skew.
The clock period between two registers must be at least the sum of the propagation delay and the clock skew, so the minimum clock period is

$$T_{min} = PD + \delta_{skew}, \tag{7}$$

where $PD$ = propagation delay between stages n and n+1, and $\delta_{skew}$ = clock skew.
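The two relations above (skew as the difference of the clock delays to stages n and n+1, and the minimum period as propagation delay plus skew) can be illustrated with a short calculation. The function names and all delay values are hypothetical.

```python
def clock_skew(delay_n, delay_n_plus_1):
    """Skew between adjacent stages: delay to stage n minus delay to stage n+1."""
    return delay_n - delay_n_plus_1

def min_clock_period(propagation_delay, skew):
    """Minimum clock period: propagation delay between the stages plus the skew."""
    return propagation_delay + skew

# Hypothetical delays in nanoseconds.
pd = 2.0
skew_pos = clock_skew(0.30, 0.20)    # positive skew
skew_neg = clock_skew(0.20, 0.30)    # negative skew

t_zero = min_clock_period(pd, 0.0)
t_pos = min_clock_period(pd, skew_pos)
t_neg = min_clock_period(pd, skew_neg)
f_max_pos_mhz = 1e3 / t_pos          # 1 / (period in ns) expressed in MHz
print(t_zero, t_pos, t_neg, f_max_pos_mhz)
```

With these numbers a 0.1 ns positive skew lengthens the minimum period from 2.0 ns to 2.1 ns, while the same amount of negative skew shortens it to 1.9 ns.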

Positive and Negative Clock Skew

[Figure 25: Positive and negative clock skew. Panels: waveforms of φ_n and φ_{n+1} versus time with skew > 0 (positive clock skew) and with skew < 0 (negative clock skew).]

Positive clock skew increases the minimum clock period and subsequently reduces the maximum operating frequency.
Positive clock skew does not create race conditions because the inputs of the combinational circuits (stage n+1) are not available yet.
Negative clock skew reduces the minimum clock period and subsequently increases the maximum operating frequency.
Negative clock skew may create race conditions. In negative clock skew cases, the magnitude of the skew must be LESS THAN the time required for the data to leave Latch n, propagate through the interconnects and combinational logic circuits (n+1) in between, and allow Latch n+1 to latch.
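The race condition for negative skew can be expressed as a simple check. This is a minimal sketch under stated assumptions: the three timing terms (time to leave Latch n, time through the interconnect and logic, and time for Latch n+1 to latch) are hypothetical parameter names, and all values are in nanoseconds.

```python
def is_race_free(skew, t_leave_latch, t_logic, t_latch):
    """Return True if the given clock skew cannot cause a race.

    Positive (or zero) skew is treated as race free. For negative skew, its
    magnitude must be less than the total time the data needs to leave
    Latch n, cross the logic in between, and be latched by Latch n+1.
    """
    if skew >= 0:
        return True
    return abs(skew) < t_leave_latch + t_logic + t_latch

# Hypothetical path: 0.10 ns to leave the latch, 0.30 ns of logic, 0.05 ns to latch.
safe = is_race_free(-0.20, 0.10, 0.30, 0.05)     # 0.20 < 0.45
unsafe = is_race_free(-0.50, 0.10, 0.30, 0.05)   # 0.50 > 0.45
positive = is_race_free(0.30, 0.10, 0.30, 0.05)
print(safe, unsafe, positive)
```

Short logic paths are therefore the critical ones for negative skew: the less delay there is between two latches, the less negative skew can be tolerated.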


System-Level Clock Skew

[Figure 26: Clock skew at system levels. A master clock φ feeds the clock generators of CHIP-1 and CHIP-2, which produce φ1 and φ2 for their logic circuits.]

Ideally, $\phi_1$ and $\phi_2$ are synchronized.
In reality, however, $\phi_1$ and $\phi_2$ are not synchronized due to clock skew, so the outputs generated by chips 1 and 2 are not synchronized.
Solution: use a phase-lock loop (PLL) to synchronize $\phi_1$ and $\phi_2$.


Phase-Lock Loops
[Figure 27: Phase lock loop. Blocks: phase-frequency detector, charge pump, loop filter, voltage-controlled oscillator, and a divide-by-N (1/N) block in the feedback path.]

The phase-frequency detector detects the frequency and phase difference between the incoming master clock and the local clock generated by the VCO. It generates binary UP and DN signals.
The charge pump converts the binary UP and DN signals into an analog signal.
The loop filter is a low-pass filter that removes the high-frequency components of the analog signal from the charge pump. The resulting low-frequency signal is then used to control the frequency of the VCO.
A PLL is a typical mixed analog-digital circuit.
In practice, all function blocks of PLLs must be differentially configured.


Clock Distribution
Buffered Clock Distribution Tree

[Figure 28: Buffered clock distribution tree. Labels: master clock, buffer.]

Buffers amplify clock signals degraded by the distributed interconnect impedance.
Buffers isolate local clock networks from the up-stream load impedance.
Buffers provide sufficient current to drive the network capacitance and maintain high-quality clock waveforms; the output impedance of the buffers must be much larger than the impedance of the interconnect sections being driven.
Due to the variation of the active device characteristics, buffers are a primary source of clock skew in a well-balanced clock tree.


Clock Distribution (cont'd)

Clock Distribution Tree with Parameterized Buffers

[Figure 29: Clock distribution tree with parameterized buffers. Labels: master clock, buffer, parameterized buffers.]

Parameterized buffers are used to compensate for the variation of the clock delay.
The sizes of the parameterized buffers differ.


Clock Distribution (cont'd)

Symmetric H-Tree Clock Distribution Networks

[Figure 30: Symmetric H-tree clock distribution networks. Tapered H-tree clock distribution network: node n+1 is fed by an interconnect of impedance Z_{n+1} and drives two nodes n through interconnects of impedance Z_n.]

The length of the interconnects from the source node n+1 to the two destination nodes n is identical.
The primary delay difference among the clock signal paths is due to the variations of process parameters affecting (i) the interconnect impedance and (ii) the characteristics of the buffers.
The interconnect width is decreased progressively to minimize the reflection of high-speed clock signals. The impedance of the interconnects leaving node n+1, denoted by $Z_n$, must be TWICE the impedance of the interconnect providing the signal to node n+1.
The interconnect capacitance is much larger than that of the standard clock tree due to the longer wire length.
Difficult to route in practice.
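The tapering rule of the H-tree (each branch leaving a node has twice the impedance of the line feeding that node) can be tabulated level by level. The sketch below is illustrative only; the 25 Ω trunk impedance and the number of levels are hypothetical.

```python
def h_tree_impedances(z_trunk, levels):
    """Impedance of the interconnect at each level, doubling at every branching node."""
    return [z_trunk * 2 ** k for k in range(levels)]

def parallel(z_a, z_b):
    """Impedance of two lines seen in parallel at a branching node."""
    return z_a * z_b / (z_a + z_b)

# Hypothetical tree: 25 ohm trunk from the clock source, four levels of wiring.
z_levels = h_tree_impedances(25.0, 4)
print(z_levels)

# At every node, the two outgoing branches in parallel present the impedance of
# the incoming line, so the node is matched and reflections are minimized.
matched = all(
    parallel(z_levels[k + 1], z_levels[k + 1]) == z_levels[k]
    for k in range(len(z_levels) - 1)
)
print(matched)
```

The doubling is what drives the progressive narrowing of the wires toward the leaves, since a narrower line has a higher characteristic impedance.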

References
[1] Jan M. Rabaey, Digital Integrated Circuits: A Design Perspective, Upper Saddle River, New Jersey: Prentice-Hall, 1996.
[2] K. Martin, Digital Integrated Circuit Design, Oxford University Press, 2000.
[3] Wayne Wolf, Modern VLSI Design: Systems on Silicon, 2nd edition, Prentice Hall, Upper Saddle River, NJ 07458, 1998.
[4] B. Razavi, Design of Analog CMOS Integrated Circuits, McGraw-Hill, 2001.
[5] A. Bellaouar and M. I. Elmasry, Low-Power Digital VLSI Design: Circuits and Systems, Boston: Kluwer Academic, 1995.
[6] D. Clein, CMOS IC Layout - Concepts, Methodologies, and Tools, Newnes, Boston, 1999.
[7] E. G. Friedman, Introduction: clock distribution networks in VLSI circuits and systems, pp. 1-35.
[8] Analog Design Flow, Canadian Microelectronics Corporation, 2001.
[9] A. Hajimiri, S. Limotyrakis, and T. Lee, Jitter and phase noise in ring oscillators, IEEE J. Solid-State Circuits, vol. 34, no. 6, pp. 790-804, Jun. 1999.
[10] B. Liew, N. Cheung, and C. Hu, Projecting interconnect electromigration lifetime for arbitrary current waveforms, IEEE Trans. on Electron Devices, vol. 37, pp. 1343-1350, 1990.
[11] J. Lee and B. Kim, A low-noise fast-lock phase-locked loop with adaptive bandwidth control, IEEE J. Solid-State Circuits, vol. 35, no. 8, pp. 1137-1145, Aug. 2000.
[12] J. Kim, S. Lee, T. Jung, C. Kim, S. Cho, and B. Kim, A low-jitter mixed-mode DLL for high-speed DRAM applications, IEEE J. Solid-State Circuits, vol. 35, no. 10, pp. 1430-1436, Oct. 2000.
[13] D. Jeong, S. Chai, W. Song, and G. Cho, CMOS current-controlled oscillators using multiple-feedback-loop ring architectures, in Proc. Int'l Solid-State Circuit Conf., pp. 386-387, 1997.
[14] C. Park and B. Kim, A low-noise, 900-MHz VCO in 0.6 µm CMOS, IEEE J. Solid-State Circuits, vol. 34, no. 5, pp. 586-591, May 1999.
[15] Y. Eken and J. Uyemura, A 5.9-GHz voltage-controlled ring oscillator in 0.18 µm CMOS, IEEE J. Solid-State Circuits, vol. 39, no. 1, pp. 230-233, Jan. 2004.
