
IEEE 2006 Custom Integrated Circuits Conference (CICC)

Comparison and Impact of Substrate Noise Generated by Clocked and Clockless Digital Circuitry

Jim Le, Christopher Hanken, Martin Held, Mike Hagedorn∗, Kartikeya Mayaram, and Terri S. Fiez
School of EECS, Oregon State University, Corvallis, OR 97331
∗ Theseus Logic, Inc., Orlando, FL 32826

Abstract: A pseudo-random number generator implemented in asynchronous logic generates one-fifth the RMS substrate noise of the equivalent design in synchronous logic. An asynchronous 8051 processor generates one-third the RMS substrate noise of the equivalent synchronous design. The SNR of a second-order delta-sigma modulator (DSM) is not affected by substrate noise from an asynchronous processor, while it degrades by up to 15 dB when the synchronous 8051 processor is clocked near integer multiples of the DSM sampling frequency.

Keywords: substrate noise, synchronous circuit, asynchronous circuit, null conventional logic, delta-sigma modulator.

I. INTRODUCTION

The trend toward integrated Systems-on-a-Chip (SoC) has resulted in combining analog and digital components on a single chip. Due to this integration, switching noise generated by the digital circuitry is coupled to the chip substrate through transistor junction, interconnect, and bond-pad capacitances [1]. This generates noise currents that may degrade analog performance by changing the transistor body potential and by altering the power and ground voltage levels [2], [3].

In recent years, many techniques have been developed to suppress the substrate noise coupling to the analog block. Most of these methods attempt to reduce noise coupling by either blocking or actively canceling the noise in the substrate [4]. An alternative approach is to reduce the amount of noise that is injected into the substrate. For a typical clocked Boolean logic (CBL) design, the main sources of noise injection are the clock tree and synchronous switching. The clock tree, used to distribute the clock across the chip, represents a large capacitive load in terms of both power and noise generation. Synchronous switching noise is the result of thousands of digital gates switching relatively close in time such that their effects tend to accumulate. Both of these problems can be mitigated with an asynchronous design approach such as Null Conventional Logic (NCL). With this clockless logic, data is assessed and propagated independently by each gate. Thus, switching is localized and, for the most part, independent of activity elsewhere on the chip [5].

In this paper, the substrate noise generated by a simple synchronous circuit and a simple asynchronous circuit is compared and analyzed. Next, the analysis is expanded to examine the noise from a typical large digital block such as a synchronous CBL 8051 processor and an asynchronous NCL 8051 processor. In order to gauge the practical impact of the substrate noise on an analog block, the performance degradation of a delta-sigma modulator (DSM) is evaluated in the presence of the substrate noise from each processor. These measurements provide insight into noise-tolerant analog/RF circuit design techniques.

II. CBL VERSUS NCL

As mentioned above, one of the largest sources of noise generation for CBL is the clock tree. An NCL design circumvents this problem by implementing a building block called a threshold gate, whose output has a DATA state and a NULL state. A threshold gate starting with its output in the NULL state will remain in the NULL state until the specified number of inputs is placed in the DATA state. Once the gate reaches the DATA state, it remains in this state until all of the inputs return to the NULL state. A combination of clockless threshold gates can be used to build any conventional Boolean gate.

Since threshold gates need to hold state information in a latch in addition to performing their logic function, they are typically larger than their traditional Boolean logic counterparts that perform the same function. They do, however, hold several distinct advantages over synchronous circuits, especially with respect to noise generation: they avoid clock-correlated switching noise, peak currents on the power rails and the associated supply noise, and the extra power consumption due to unnecessary clock-induced switching.

III. SIMULATION METHODOLOGY

Simulation was used to validate the comparison between synchronous and asynchronous circuits. For very large digital blocks, transistor-level simulation of substrate noise coupling is not practical. An efficient methodology presented in [6] uses a gate-level VHDL description of the digital system to generate transition information. This information is then combined with a noise signature library for each gate-level block to determine cell noise currents. Finally, the cell noise currents can be used in the transistor-level simulation of the analog block in order to determine the noise coupling effects. The complete design and simulation flow is illustrated in Fig. 1.

For synchronous and asynchronous blocks, each gate is characterized by a noise signature library and an equivalent rail parasitic library. The latter is used to simulate the parasitic effects of the gate transitions on the power rails when performing the final simulation with the cell noise currents.

1-4244-0076-7/06/$20.00 ©2006 IEEE 6-7-1 105

Authorized licensed use limited to: University of Central Florida. Downloaded on November 8, 2008 at 21:39 from IEEE Xplore. Restrictions apply.
Fig. 1. Design and simulation flow incorporating substrate noise analysis. [Flow diagram. Blocks: behavioral VHDL/Verilog and test vectors; synthesis and timing analysis/verification; VHDL event simulation (timing info); analog design/layout; layout place and route; noise signature library; noise vector generation; timing extraction and verification; cell noise currents; final layout; generate substrate parasitic network; final transient simulation; output waveforms.]

Fig. 2. Linear congruential random number generator. [Block diagram: constant, multiplier, adder, 8 LSB, register, output.]

Fig. 3. Die photograph of the synchronous (CBL) and asynchronous (NCL) pseudo-random number generators (PRNG). The effective die areas are 0.32 mm² and 0.6 mm², respectively.

Fig. 4. Measured substrate noise for the synchronous (left) and asynchronous (right) PRNGs in the time domain. [Plots: magnitude (V), -0.04 to 0.04, versus time (s), 0 to 6 × 10^-7.]

Fig. 5. Measured substrate noise for the synchronous (left) and asynchronous (right) PRNGs in the frequency domain. [Plots: magnitude (dB), -160 to 0, versus frequency (MHz), 25 to 275.]

All final cell noise currents and parasitics were run for each implementation of the processor along with an extracted netlist of the appropriate analog block.

An equivalent resistor network was used to simulate the substrate. A 3-dimensional Green's function solver was used to calculate the resistance values [7]. The connections from the circuit to substrate networks were determined with the use of Silencer! [8]. All equivalent package, bondwire, and PCB parasitics were also included in the final simulation.
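
To illustrate how an equivalent resistor network turns an injected noise current into a voltage at the analog block, the following sketch solves a two-node network by nodal analysis. It is a toy model only: the topology and every resistance value are assumptions made for illustration, whereas the network used in the paper is extracted with the Green's function solver of [7] and contains many more nodes.

```python
# Sketch: a two-node resistive substrate model solved by nodal analysis.
# Node 1 sits under the digital block (noise injector); node 2 sits under
# the analog sensor. Every resistance here is a made-up placeholder.


def substrate_node_voltages(i_inject, r_inj_gnd, r_cross, r_sense_gnd):
    """Return (v_injector, v_sensor) for a current injected at node 1."""
    g11 = 1.0 / r_inj_gnd + 1.0 / r_cross      # self-conductance of node 1
    g22 = 1.0 / r_sense_gnd + 1.0 / r_cross    # self-conductance of node 2
    g12 = -1.0 / r_cross                       # mutual conductance
    det = g11 * g22 - g12 * g12
    # Solve [[g11, g12], [g12, g22]] [v1, v2]^T = [i_inject, 0]^T
    # with Cramer's rule.
    v1 = g22 * i_inject / det
    v2 = -g12 * i_inject / det
    return v1, v2


# 1 mA injected; resistances in ohms (hypothetical values).
v_inj, v_sense = substrate_node_voltages(
    i_inject=1e-3, r_inj_gnd=10.0, r_cross=100.0, r_sense_gnd=10.0
)
```

In this toy case the sensor voltage equals the current-divider result, i_inject × r_inj_gnd × r_sense_gnd / (r_inj_gnd + r_cross + r_sense_gnd), which gives a quick hand check on the solver.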

IV. PSEUDO-RANDOM NUMBER GENERATION BLOCKS

In order to compare and contrast the substrate noise induced by synchronous and asynchronous circuits, CBL and NCL versions of an 8-bit linear congruential pseudo-random number generator (PRNG) were implemented in a heavily-doped 0.25 µm process. A block diagram of this circuit is shown in Fig. 2. This easily scalable circuit consists of a multiplier, an adder, and a register to generate a sequence of 256 unique data values in a continuous loop.

The die photograph of the chip with the PRNGs is shown in Fig. 3. A total of 36 blocks of the CBL PRNG and 36 blocks of the NCL PRNG were placed on the die to emulate the switching noise from a large digital block. Rows of the synchronous and asynchronous PRNGs were inter-digitated on the die to provide equivalent distance to the sensing circuits. The die itself is mounted directly on the PCB as a chip-on-board to eliminate package parasitics. Separate power rails were used for the NCL and CBL circuits to allow measurement of the noise of one block while the other is inactive.

Fig. 6. Die photograph of the synchronous (CBL) and asynchronous (NCL) 8051s. The die areas of the cores are 0.5 mm² and 0.62 mm², respectively. [Labels in the photograph: 8051 NCL, 8051 CBL, memory, SA.]

Fig. 7. Measured substrate noise for the synchronous (left) and asynchronous (right) 8051s in the time domain. [Plots: magnitude (V), -0.03 to 0.04, versus time (s), 0 to 6 × 10^-7.]

Fig. 8. Measured substrate noise for the synchronous (left) and asynchronous (right) 8051s in the frequency domain. [Plots: magnitude (dB), -160 to 0, versus frequency (MHz), 0 to 100.]

The substrate noise is measured with on-chip probing at the output of a wideband amplifier with a unity-gain bandwidth of approximately 1 GHz [2]. Figs. 4 and 5 show the measured substrate noise in the time and frequency domains, respectively. For these measurements, the CBL PRNG is clocked at the equivalent operating speed of the NCL PRNG (approximately 50 MHz). The time domain plot shows the obvious differences between the noise generated by the synchronous and asynchronous logic. The RMS noise voltage of the asynchronous circuit is found to be 14 dB lower than that of the synchronous circuit. The frequency domain plot for the synchronous PRNG shows large clock tones at the operating frequency of 50 MHz and other smaller tones corresponding to the synchronous switching. Conversely, the frequency domain plot for the asynchronous PRNG shows that the noise is spread across the spectrum. There are noticeable tones at the equivalent operating frequency and its harmonics. The skirting seen at these frequencies is caused by the NCL logic switching at different times. Also notable is the size of the second harmonic at 100 MHz. The size of this tone can be explained by the return-to-zero behavior of the NCL outputs, which doubles the frequency of the noise from some gates.
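
The multiplier, adder, and register structure of the linear congruential generator in Fig. 2 can be sketched as follows. The paper does not state the constants, so the multiplier and increment below are assumptions, chosen only because they satisfy the Hull-Dobell full-period conditions for a modulus of 256 (odd increment, multiplier minus one divisible by 4) and therefore reproduce the stated loop of 256 unique values.

```python
# Sketch of an 8-bit linear congruential generator with the multiplier,
# adder, and register structure of Fig. 2. The constants are assumptions;
# the paper does not give them.

MULTIPLIER = 5   # hypothetical constant feeding the multiplier
INCREMENT = 1    # hypothetical constant feeding the adder
MASK = 0xFF      # keep the 8 least significant bits


def lcg_step(state):
    """One register update: multiply, add, keep the 8 LSBs."""
    return (MULTIPLIER * state + INCREMENT) & MASK


def lcg_sequence(seed, length):
    """Return `length` successive register values starting after `seed`."""
    values = []
    state = seed & MASK
    for _ in range(length):
        state = lcg_step(state)
        values.append(state)
    return values


sequence = lcg_sequence(seed=0, length=256)
print(sequence[:8])          # [1, 6, 31, 156, 13, 66, 75, 120]
print(len(set(sequence)))    # 256: every 8-bit value appears once per loop
```

Because the state space is only 8 bits wide, the full period is easy to verify exhaustively, which is also what makes the circuit a convenient repeatable noise source.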

V. 8051 MICROPROCESSOR CORES

Although the PRNG blocks are useful for analyzing the differences in the substrate noise injected by a synchronous and an asynchronous circuit, these blocks are in general not a good measure of the substrate noise that would be present in a typical mixed-signal chip. A more realistic comparison in terms of substrate noise can be found with a microprocessor. Microprocessors are commonly incorporated onto large mixed-signal chips as application-specific functional blocks. In order to extend this analysis, another test chip was fabricated with a synchronous and an asynchronous version of a generic 8051 microprocessor.

The die photo of the test chip is shown in Fig. 6. The chip was fabricated in a heavily-doped 0.25 µm process and packaged in a CPGA132 package. Apart from their separate microprocessor cores, the two designs share a 256-byte program memory, a 256-byte data memory, and an external memory interface for reading from and writing to an off-chip source. The common components of the design are physically placed between the two cores to maintain layout symmetry for substrate noise comparisons between the CBL and NCL designs. I/O pins and peripherals have also been kept to a bare minimum to maintain the integrity of the substrate noise analysis. Similar to the PRNG case, the substrate noise is measured with on-chip probing at the output of a wideband amplifier with a unity-gain bandwidth of approximately 1 GHz.

Time domain plots of the substrate noise generated by the synchronous and asynchronous 8051s are shown in Fig. 7. For these measurements, the 8051 processors are loaded with an equivalent software implementation of the pseudo-random number algorithm used in the PRNGs. The synchronous 8051 is clocked at 33 MHz to obtain the equivalent operating speed of the asynchronous 8051. Measurements show that the RMS substrate noise generated by the asynchronous design was 9.5 dB less than that generated by the synchronous design.

Further analysis of the measured substrate noise reveals two main contributors to the observed waveform: the noise due to the architectural implementation of the processor and the noise resulting from the software that is loaded in the processor. These individual contributions can readily be seen in the frequency spectrum of the noise, shown in Fig. 8. In the synchronous implementation, the architectural contributors to the injected substrate noise are the clock and the instruction cycle. As expected, the clock is the dominant source of noise in the synchronous 8051. This can be seen in the form of large noise tones in the frequency spectrum at 33 MHz and its harmonics. The instruction cycle in this implementation of the 8051 is four clock cycles long and can be seen in the frequency spectrum as slightly smaller tones at one quarter of the clock frequency, or 8.25 MHz (and its harmonics). The contribution to the measured substrate noise due to software can be found from repetitive structures in the program. In the pseudo-random number program for the synchronous 8051, the loop that generates the random number sequence is 12 instruction cycles long. The resulting effect of this loop can be seen in the frequency spectrum as tones at 0.68 MHz and its harmonics.

In the implementation of the asynchronous 8051, it can be seen that frequency components due to the clock are indeed absent from the spectrum. The dominant noise source for this processor is the memory accesses in the RAM. In the spectrum, tones similar to those in the synchronous spectrum can be seen at 0.68 MHz and its harmonics. This result is to be expected since the asynchronous 8051 is running the same pseudo-random number program. Note that the noise is spread out across the spectrum, similar to the asynchronous PRNG.

VI. SUBSTRATE NOISE EFFECTS ON A DELTA-SIGMA MODULATOR

A fully differential second-order DSM was implemented on the chip to examine substrate noise effects on the performance of a typical analog block. For a sampling frequency of 4 MHz, an OSR of 128, and an input signal at 10 kHz, the nominal SNR of the DSM is 83 dB.

Fig. 9. Nominal DSM spectrum (left) and DSM spectrum with substrate noise injection from the synchronous 8051 (right). [Plots: magnitude (dB), -140 to 0, versus frequency (kHz), 0 to 16.]

Fig. 10. DSM SNR with substrate noise from the synchronous 8051 clocked close to the DSM sampling frequency. [Plot: SNR (dB), 65 to 90, versus offset frequency (kHz), -10 to 10; the point at zero offset is annotated "Exact integer multiple of the clock frequency".]

Measurement of the DSM with the synchronous 8051 active shows that the performance is most sensitive to noise frequencies around integer multiples of the DSM sampling clock. At these frequencies, the difference between the 8051 clock frequency and the nearest multiple of the sampling frequency is aliased down into the passband of the DSM. Fig. 9 shows the nominal DSM spectrum and the spectrum with noise from the synchronous 8051 clocked at 200 Hz below the DSM sampling frequency. As seen in this spectrum, tones 200 Hz apart show up in the passband due to aliasing. Sweeping the synchronous 8051 clock over frequencies close to the sampling clock, as shown in Fig. 10, reveals that the SNR degradation can be up to 15 dB. Simulation shows that substrate noise coupling at the input dominates the performance degradation. When the 8051 clock is at an exact integer multiple of the sampling clock frequency, the aliasing results in a dc offset, which causes little SNR degradation. This is consistent with previously published work [9]. Measurement of the DSM with the asynchronous 8051 active shows no noticeable effect on the SNR performance.
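
The aliasing described in this section can be reproduced with a short calculation. The sampling frequency, OSR, and 200 Hz offset are taken from the text; treating the coupled noise as ideal harmonics of the 8051 clock is a simplifying assumption, and the function name is illustrative.

```python
# Sketch: where harmonics of the 8051 clock land after sampling by the DSM.

FS_HZ = 4_000_000                      # DSM sampling frequency from the text
OSR = 128                              # oversampling ratio from the text
SIGNAL_BAND_HZ = FS_HZ / (2 * OSR)     # passband edge: 15625.0 Hz


def alias_frequency(f_hz, fs_hz=FS_HZ):
    """Fold a tone at f_hz into the first Nyquist zone [0, fs/2]."""
    return abs(f_hz - fs_hz * round(f_hz / fs_hz))


clock_hz = FS_HZ - 200                 # 8051 clocked 200 Hz below fs
aliases = [alias_frequency(k * clock_hz) for k in range(1, 6)]
print(aliases)                         # [200, 400, 600, 800, 1000]
print(alias_frequency(3 * FS_HZ))      # 0: an exact multiple folds to dc
```

Each harmonic k of the offset clock folds to k × 200 Hz, which matches the tones 200 Hz apart seen in the passband, while a clock at an exact multiple of the sampling frequency folds to dc.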

VII. DISCUSSION AND CONCLUSION

Generalizing the results from the sampled-data DSM, intermodulation terms generated by the substrate noise tones and the clock frequency that are above the circuit noise floor and in the bandwidth of interest may degrade the analog circuit performance. Additionally, as measurements revealed, providing a phase offset for the digital clock does not alter these effects. Generally, continuous-time analog circuits have relatively high linearity, and thus only the in-band substrate noise that exceeds the specified noise floor degrades the performance.

This information can be applied to RF circuits and in particular to an RF LNA [10]. There are both harmonic and intermodulation (IM) tones at the output of an RF LNA. The intermodulation terms are from the harmonics of the clock mixing with the RF carrier in the active devices, whereas the harmonics are coupled through passive circuitry directly into the amplifier output [10]. Based on this observation, it is clear that the digital clock harmonics should be reduced, which is possible with the use of clockless digital circuitry.

VIII. ACKNOWLEDGEMENTS

This research was supported in part by grants under the DARPA TEAM and CLASS programs. The authors would also like to thank Triet Le for the DSM design, Husni Habal for his work on the PRNG, and James Ayers for the chip-on-board packaging.

REFERENCES

[1] N. Verghese, T. Schmerbeck, and D. Allstot, Simulation Techniques and Solutions for Mixed-Signal Coupling in Integrated Circuits. Kluwer Academic Publishers, 1995.
[2] M. van Heijningen, J. Compiet, P. Wambacq, S. Donnay, M. Engels, and I. Bolsens, "Analysis and experimental verification of digital substrate noise generation for epi-type substrates," IEEE J. Solid-State Circuits, pp. 1002-1008, Jul. 2000.
[3] M. Nagata, K. Hijikata, J. Nagai, T. Morie, and A. Iwata, "Reduced substrate noise digital design for improving embedded analog performance," in IEEE International Solid-State Circuits Conference, pp. 224-225, Feb. 2000.
[4] M. Peng and H. Lee, "Study of substrate noise and techniques for minimization," IEEE J. Solid-State Circuits, vol. 39, pp. 2080-2086, Nov. 2004.
[5] K. Fant and S. Brandt, "NULL conventional logic: A complete and consistent logic for asynchronous digital circuit systems," in International Conference on Application-specific Systems, Architectures, and Processors, pp. 261-273, 1996.
[6] H. Habal, T. Fiez, and K. Mayaram, "Accurate and efficient simulation of synchronous digital switching noise in systems on a chip," IEEE Trans. VLSI, vol. 13, pp. 330-338, March 2005.
[7] C. Xu, EPIC: A Program for Extraction of the Resistance and Capacitance of Substrate With the Green's Function Method, ECE Dept., Oregon State Univ., 2002.
[8] P. Birrer, T. Fiez, and K. Mayaram, "Silencer!: a tool for substrate noise coupling analysis," IEEE International SOC Conference, pp. 105-108, Sept. 2004.
[9] T. Blalack and B. Wooley, "The effects of switching noise on an oversampling A/D converter," in IEEE International Solid-State Circuits Conference, pp. 200-201, Feb. 1995.
[10] S. Hazenboom, T. Fiez, and K. Mayaram, "Digital noise coupling mechanisms in a 2.4GHz LNA for heavily and lightly doped CMOS substrates," Proc. Custom Integrated Circuits Conference 2004, pp. 367-370, Oct. 2004.
