1 SYSU-CMU Shunde International Joint Research Institute, Sun Yat-sen University, Shunde 528300, China
2 School of Information Science and Technology, Sun Yat-sen University, Guangzhou 510006, China
3 School of Computer Science and Technology, Hunan University of Arts and Science, Changde 415000, China
* Email: luchong@mail2.sysu.edu.cn
Abstract
In this paper, we propose a fast clock synchronizer for low-voltage clock
distribution networks that reduces power consumption and suppresses phase
error. The proposed circuit aligns the clock signals at the leaf nodes with
the root source within at most 4 clock cycles and reduces the number of
buffers in the original clock driver chains. A coarse tuning component (CTC)
and a fine tuning component (FTC) perform coarse and fine tuning in 2 and 3
clock cycles, respectively, with one cycle shared between them, and
low-voltage phase detectors are used to meet the supply-voltage requirement.
Interleaved delay units improve the precision of coarse tuning, and a binary
search scheme shortens the fine tuning period. The proposed circuit is
designed in the TSMC 65 nm GP process with a minimum supply voltage of 0.6 V.
It is compared with an H-tree clock network synthesized for a single core of
the OpenSPARC T2. Experimental results show that the clock is synchronized
within at most 4 cycles, with the phase error kept below 48 ps and a power
saving of up to 42%.
1. Introduction
The clock signal is essential to synchronous digital circuit blocks and
memory devices, and distributing it to millions of registers and latches in
very-large-scale microprocessors is a considerable challenge. Normally, the
distribution network is divided into global and local parts, each with its
own methodology. Binary trees are widely adopted for global clock
distribution because of their simplicity and ease of integration into the
digital back-end flow with the aid of automatic clock synthesis algorithms
[1], [2]. A large number of inverters and buffers are inserted and sized to
balance the propagation paths and to drive the local clock networks or
meshes. The power consumption of the global distribution tree is enormous,
and the simultaneous rising transitions of the clock signals cause notable
swings on the power supply network [3]. Recent academic research on reducing
the power consumption of clock distribution networks suggests possible
solutions [4-7]. Reducing the supply voltage is a promising technique;
however, the performance of the clock driving buffers degrades, the phase
error increases considerably [8], and the sensitivity to temperature
variations must also be taken into account [9].
In this paper, a novel low-power clock distribution network with a fast
synchronization circuit is proposed. The supply voltage is reduced to 0.6 V
in the TSMC 65 nm GP process, which lowers the power consumption. The
resulting impact on phase error and jitter is compensated by the
synchronization circuit. The proposed circuit is composed of an improved
synchronous mirror delay (SMD) circuit with a closed-loop structure and a
dynamic compensation circuit for coarse and fine tuning. The loss of
precision is minor, and alignment takes at most 4 and at least 2 clock
cycles.
The synchronization circuit consists of several functional blocks: a coarse
tuning component (CTC), a fine tuning component (FTC), an input buffer (IB)
and a feedback buffer (FB), as shown in Figure 1. Unlike in a conventional
SMD, the clock drivers (CD) are separated from the circuit and work as part
of the global clock distribution network, and their sizes are determined by
the effective load of the local networks or meshes.
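The fine tuning stage is described here only as a binary search that completes within 3 clock cycles. As a rough illustration of that idea, the sketch below models a 3-bit fine delay code that is resolved one bit per cycle, most significant bit first. The function name `fine_tune`, the step size, and the ideal phase-detector comparison are all assumptions made for this example and are not taken from the circuit itself.

```python
def fine_tune(phase_error_ps, step_ps, n_cycles=3):
    """Binary-search a fine delay code, one comparison per clock cycle.

    phase_error_ps: residual lag left after coarse tuning (assumed >= 0).
    step_ps: delay of one fine code unit (hypothetical value).
    Returns (code, remaining_error_ps).
    """
    code = 0
    for bit in reversed(range(n_cycles)):
        trial = code | (1 << bit)
        # Idealized phase-detector decision: keep the bit only if the
        # trial delay does not overshoot the residual error.
        if trial * step_ps <= phase_error_ps:
            code = trial
    return code, phase_error_ps - code * step_ps


# Hypothetical numbers: 40 ps residual error, 7.5 ps fine step.
code, remaining = fine_tune(40.0, 7.5)  # code == 5, remaining == 2.5
```

With n comparison cycles the search resolves one of 2^n codes, which is why a binary scheme needs fewer cycles than stepping through the codes one at a time.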
pensated with Tv. However, the precision of the CTC is limited by the resolution of the measurement or compensation units. Three reference clocks are generated in the IB:
the normal reference clock RCLK with delay d1, and the
other two inverted clocks with phase difference , NCLK
and PCLK. Moreover, PCLK arrives earlier than NCLK
with a phase shifting .
Since the supply voltage is much lower than in the normal case, the delay of
a measurement unit is nearly double that at the normal supply voltage.
Another challenge comes from the transition time of the reference clocks,
which is even worse in the tri-state inverters or AND gates that serve as the
fundamental delay units of conventional clock synchronization circuits. In
the CTC, the delay units of the interleaved measurement delay line (IMDL) are
simplified to balanced inverters with delay R and no duty-cycle distortion.
The CTC consists of the IMDL, a control circuit (CC) and a dual control delay
line (DCDL), as shown in Figure 2.
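$$T_v = T_{ck} - d_1 - d_2, \qquad T_{vc} = K \cdot R \tag{1}$$

[Equation (1), continued: a relation between T_{vc} and T_v; operator not recoverable]

[Equation (2): relations among Q_k, Q_{k-1}, P_k, P_{k-1}, I_k, P_{N-1} and Q_{N-1}; operators not recoverable]

[Equation (3): bounds involving K \times R and T_v, and terms associated with I_o and I_e, each over an interval with endpoints R and 0; operators not recoverable]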
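To make the coarse tuning quantities concrete, the following sketch evaluates Eq. (1) numerically. It reads Eq. (1) as Tv = Tck - d1 - d2 and Tvc = K * R, and assumes that K is the largest integer for which Tvc does not exceed Tv. The `interleaved` option assumes, purely for illustration, that the interleaved delay line halves the effective step. The function name and all numeric values are hypothetical.

```python
import math


def coarse_tune(t_ck_ps, d1_ps, d2_ps, unit_ps, interleaved=False):
    """Return (K, T_vc, residual) for an idealized coarse tuning stage."""
    # Delay to be compensated, reading Eq. (1) as Tv = Tck - d1 - d2.
    t_v = t_ck_ps - d1_ps - d2_ps
    # Assumption: interleaving halves the effective measurement step.
    step = unit_ps / 2 if interleaved else unit_ps
    # Assumption: K counts whole steps that fit into Tv (round down).
    k = math.floor(t_v / step)
    t_vc = k * step
    return k, t_vc, t_v - t_vc


# Hypothetical numbers: 1 ns clock period, d1 = 120 ps, d2 = 180 ps, R = 60 ps.
print(coarse_tune(1000, 120, 180, 60))                    # (11, 660, 40)
print(coarse_tune(1000, 120, 180, 60, interleaved=True))  # (23, 690.0, 10.0)
```

Under these assumptions the residual left for fine tuning is always smaller than one step, so a finer step directly reduces the range the fine tuning stage has to cover.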
5. Summary
In this paper, we propose a novel clock synchronization circuit for
low-voltage clock distribution. The proposed circuit performs clock alignment
in at least 2 and at most 4 cycles with a phase error below 48 ps. The active
area is 460 µm × 9.6 µm and the power consumption is 1.4 mW. With this
circuit, the power saving in the clock distribution network reaches up to
42%.
References
[1] H. Qian, P. Restle, J. Kozhaya, and C. Gunion, "Subtractive router for tree-driven-grid clocks," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 31(6), pp. 868-877 (2012).
[2] C. Deng, Y. Cai, and Q. Zhou, "A register clustering algorithm for low power clock tree synthesis," 2014 IEEE International Symposium on Circuits and Systems (ISCAS), (2014).
[3] A. Kahng, S. Kang, and H. Lee, "Smart non-default routing for clock power reduction," Design Automation Conference (DAC), pp. 1-7 (2013).
[4] H.-T. Lin, Y.-L. Chuang, Z.-H. Yang, and T.-Y. Ho, "Pulsed-latch utilization for clock-tree power optimization," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, pp. 721-733 (2014).
[5] F. Haj Ali Asgari and M. Sachdev, "A low-power reduced swing global clocking methodology," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 12(5), pp. 538-545 (2004).
[6] N. Kancharapu, M. Dave, V. Masimukkula, M. Baghini, and D. Sharma, "A low-power low-skew current-mode clock distribution network in 90nm CMOS technology," IEEE Computer Society Annual Symposium on VLSI, pp. 132-137 (2011).
[7] A. Kulkarni and P. Khandekar, "Design and implementation of low power clock distribution network," in International Conference on Advances in Engineering, Science and Management, pp. 761-765 (2012).
[8] J. Pangjun and S. Sapatnekar, "Low-power clock distribution using multiple voltages and reduced swings," IEEE Transactions on Very Large Scale Integration Systems, 10(3), pp. 309-318 (2002).
[9] S. Tawfik and V. Kursun, "Low-power low-voltage hot-spot tolerant clocking with suppressed skew," in IEEE International Symposium on Circuits and Systems, pp. 645-648 (2007).
[10] M.-V. Krishna, M.-A. Do, K.-S. Yeo, C.-C. Boon, and W.-M. Lim, "Design and Analysis of Ultra Low Power True Single Phase Clock CMOS 2/3 Prescaler," IEEE Trans. Circuits Syst. I, Reg. Papers, 57(1), pp. 72-82 (2010).
[11] www.opensparc.org