Implementations of Delay-Insensitive Circuits∗
Joel Reyes Noche

Department of Mathematics and Natural Sciences
Ateneo de Naga University
Naga City, Camarines Sur, Philippines
February 20, 2009
Abstract
A delay-insensitive (DI) circuit is a digital logic circuit that operates correctly regardless of any delays in
the logic gates (modules) or in the wires (interconnection lines). In practice, DI circuits are constructed by
putting together DI primitive modules. These primitives have implementations in CMOS (complementary
metal oxide semiconductor) technology, in RSFQ (rapid single flux quantum) technology, in SET (single
electron tunneling) technology, and on asynchronous cellular automata (ACA).
1 Introduction
An asynchronous circuit is a digital logic circuit that does not use a global clock signal. Delay-insensitive (DI)
circuits are asynchronous circuits that make the least restrictive timing assumptions. DI circuits are made by
connecting DI building blocks (called primitive modules or primitives). As long as the internal timing assumptions
of these blocks are satisfied, the behaviors of circuits composed of these primitives are not affected by the speed
of operation of the modules or of the delays in the wires connecting them.
Although there have been many delay-insensitive circuit building blocks proposed in the past (for example,
[3, 6]), we only consider the blocks created by Patra and Fussell [9, 10]. They present sets of blocks that are
universal (can be used to make any DI circuit) and minimal (no proper subset of them is sufficient for making all
such circuits). Software models of these are described in [13]; the models aid in the visualization of the primitives’
operation.
A description of a DI primitive consists of the accepted behavior of the module in response to its environment
and also the accepted behavior of the environment in response to the module. We simplify the discussion here
and describe a primitive by giving examples of accepted and unaccepted behaviors.
We focus on one DI primitive, the Merge.1 A Merge has two input ports a? and b? and one output port c!.
A signal transition on one input port, say, a?, once assimilated by the module, leads to a signal transition on the
output port c!. The Merge is serial, that is, every signal on one of its input ports must be followed by exactly
one signal on an output port of the module before the next input signal can be assimilated by the module [6,
p. 3]. For example, b?c!a?c! and a?c!a?c! are accepted behaviors for the Merge, while a?b?, a?c!c!, and c! are
unaccepted behaviors.
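The serial behavior described above can be checked mechanically. The following is a minimal sketch, not taken from the cited papers: the names `parse` and `merge_accepts` are hypothetical, and treating a trace that ends with an output still pending as an accepted prefix is an assumption made here for simplicity.

```python
def parse(trace_text):
    """Split a string such as "b?c!a?c!" into two-character port events."""
    return [trace_text[i:i + 2] for i in range(0, len(trace_text), 2)]


def merge_accepts(trace):
    """Return True if the list of port events is an accepted Merge behavior."""
    pending = False  # True while an assimilated input still awaits its c! output
    for event in trace:
        if event in ("a?", "b?"):
            if pending:
                return False  # a second input arrived before the output
            pending = True
        elif event == "c!":
            if not pending:
                return False  # an output with no preceding input
            pending = False
        else:
            raise ValueError(f"unknown port event: {event}")
    return True


if __name__ == "__main__":
    for text in ("b?c!a?c!", "a?c!a?c!", "a?b?", "a?c!c!", "c!"):
        print(text, merge_accepts(parse(text)))
```

Under these assumptions the two accepted examples from the text pass the check and the three unaccepted ones are rejected.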
For the remainder of the paper, we show how a Merge is implemented in different technologies.
2 Complementary Metal Oxide Semiconductor Technology

In a metal-oxide-semiconductor field-effect transistor (MOSFET), a conducting metallic (e.g., aluminum) gate
is electrically isolated from a semiconductor (e.g., silicon) channel by an insulating oxide layer (e.g., silicon diox-
ide). Complementary MOS (CMOS) digital logic uses p-channel and n-channel (enhancement type) MOSFETs
(PMOS and NMOS) in complementary networks (when one conducts, the other does not). When used as a
switch, a MOSFET has three terminals: a drain, a gate, and a source.2 When the gate voltage is high, a PMOS
does not conduct current from its source to its drain (it is off), and an NMOS conducts current from its drain
to its source (it is on). When the gate voltage is low, a PMOS is on, and an NMOS is off. (For details, see, for
example, [1, §5.5, 5.6, 5.7].)
∗ Presented at the 2009 Engineering Bicol Research Conference held at the Ateneo de Naga University
1 For detailed descriptions of the other primitives, see [9, 10] and also [6, §2].
Figure 1: Two-input CMOS xor gate (figure taken from [8, p. 162] and modified)
A Merge can be implemented in CMOS as an xor gate [16, p. 77] (see Figure 1). Note that here a signal
transition is taken to be a change in voltage (either from high to low, or from low to high).
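To see why an xor gate behaves as a Merge under transition signalling, the following sketch models only the logic levels of the gate. It is an idealized illustration with no delays or electrical detail, and the class name `XorMerge` and its method `toggle` are hypothetical.

```python
class XorMerge:
    """Two-input xor gate driven by signal transitions (levels are 0 or 1)."""

    def __init__(self):
        self.a = 0  # current level on input a
        self.b = 0  # current level on input b

    @property
    def c(self):
        """Current output level, c = a xor b."""
        return self.a ^ self.b

    def toggle(self, port):
        """Apply one transition to input 'a' or 'b'; return True if c changed."""
        before = self.c
        if port == "a":
            self.a ^= 1
        elif port == "b":
            self.b ^= 1
        else:
            raise ValueError(f"unknown input port: {port}")
        return self.c != before


if __name__ == "__main__":
    gate = XorMerge()
    for port in "baab":
        changed = gate.toggle(port)
        print(f"{port}? -> c transition: {changed}, c level: {gate.c}")
```

Each single input transition, on either port and in either direction, flips exactly one xor input and therefore produces exactly one output transition.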
CMOS implementations of some of Patra and Fussell’s other building blocks are in [9, 10, 16].
CMOS implementations of DI circuits are complex [11, p. 42], making them inefficient and impractical. “[I]n
CMOS technology asynchronous designs actually perform worse [than synchronous designs] with respect to power
consumption, wiring requirements, and speed” [6, p. 1034].
3 Rapid Single Flux Quantum Technology

Rapid single flux quantum (RSFQ) logic is based on low temperature superconductors using Josephson junc-
tions (JJ) as the basic switching elements.
In current technology, a JJ consists of a pair of niobium superconductor electrodes separated by
aluminum oxide as a thin tunnel barrier. Below a certain critical current Ic, such junctions carry
superconducting current with no need for a bias voltage across the junction, thus not dissipating
any energy. When the induced current exceeds Ic , the junction becomes resistive and undergoes a
Josephson 2π phase leap which produces a non-zero voltage drop across the junction. This effect in
two-terminal Josephson junctions is the basis for the switching action needed to build logic devices.
[...] In RSFQ logic circuits, [...] a bit of information is carried by the propagation of a single magnetic
flux quantum Φ0 [...]. These quanta can equivalently be thought of as very short voltage pulses V(t)
of quantized area, where Φ0 = ∫ V(t) dt ≈ 2.07 mV·ps. Pulse propagation takes place by biasing the
JJs in the circuit so that an arriving flux quantum will cause a JJ to exceed its critical current Ic ,
thereby switching and emitting a new flux quantum. A full flux quantum is generated whenever a
JJ’s Ic is exceeded, regardless of any degradation that may have occurred to the triggering quantum,
thus providing for power gain in RSFQ circuits. [11, pp. 44–45]
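The quoted value of the flux quantum can be reproduced from physical constants. The sketch below uses the standard relation Φ0 = h/(2e), which is not stated in the quoted passage, together with the exact SI values of h and e. The function `pulse_width_ps` assumes an idealized rectangular pulse and is included only to give a feel for the time scale; both function names are hypothetical.

```python
PLANCK_H = 6.62607015e-34            # Planck constant in J*s (exact SI value)
ELEMENTARY_CHARGE = 1.602176634e-19  # elementary charge in C (exact SI value)


def flux_quantum_mV_ps():
    """Magnetic flux quantum h/(2e), expressed in mV*ps."""
    phi0_weber = PLANCK_H / (2 * ELEMENTARY_CHARGE)  # 1 Wb = 1 V*s
    return phi0_weber / 1e-15                        # 1 mV*ps = 1e-15 V*s


def pulse_width_ps(amplitude_mV):
    """Width of an idealized rectangular pulse whose area is one flux quantum."""
    return flux_quantum_mV_ps() / amplitude_mV


if __name__ == "__main__":
    print(f"Phi0 = {flux_quantum_mV_ps():.4f} mV*ps")
    print(f"width of a 1 mV rectangular pulse = {pulse_width_ps(1.0):.4f} ps")
```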
An RSFQ implementation3 of a Merge is shown in Figure 2.
2 A fourth terminal, the substrate, is connected to the highest voltage of the circuit (for PMOS) or to the lowest voltage of the
circuit (for NMOS).

3 Patra, Polonsky, and Fussell [11] do not specify the values of the circuit elements for this implementation of a Merge, although
for the other DI RSFQ primitives they presented, biasing currents were a little less than a tenth of a milliampere, JJ critical currents
were a few tenths of a milliampere, and inductances were a few picohenrys.
Figure 2: RSFQ Merge (figure taken from [11, p. 46])
[T]he two inputs are a and b, and the output is c. Ib1 is the bias current, which is split between
two arms feeding junctions J3, J1 and J4, J2. The critical current thresholds are arranged so
that Ic3 < Ic1 and likewise Ic4 < Ic2 . When a pulse on a arrives, additional current flows through
J1, exceeding its critical current, whereupon J1 goes resistive. J3 is not triggered since the current
induced by the input pulse is in the opposite direction from the bias current. The SFQ pulse developed
consequently across J1 is transferred through J3, across which the potential is still zero, to J5. This
causes J5 to trip and emit an output pulse at c. The pulse generated by J1 also trips J4, whose
critical current is less than that of J2. When J4 becomes resistive, it prevents J2 from tripping. Since
there is no voltage drop across J2, no pulse is emitted back through input b. Inputs on b operate
symmetrically. Thus the junctions J3 and J4 serve to isolate the inputs from each other and provide
signal directionality (from inputs to output). [11, p. 46]
RSFQ implementations of some of Patra and Fussell’s other building blocks are in [11], and some have been
fabricated and tested at low frequencies. These DI RSFQ primitives have been used in designs of self-timed
pipelined parallel adders [2].
RSFQ technology has sub-picosecond junction switching speed (allowing operation at several hundred gigahertz)
and very low power dissipation (below one microwatt per JJ even in its resistive state). However, because
low temperature superconductors are used, the circuits must be cooled using liquid helium. There are also “limitations
on RSFQ memory density due to the large physical size of a flux quantum and the difficulty of amplifying output
signals to off-chip power levels at speeds comparable to those attainable on the chip.” [11, p. 45]
4 Single Electron Tunneling Technology

Single electron tunneling (SET) technology is based on the quantum tunneling effect wherein an electron has
a non-zero probability of passing through a potential barrier. In logic circuits using this technology, the switching
elements are quantum tunnel junctions. (See [14, §2.1] for more details.) Tunneling through a junction becomes
possible when the voltage applied to a junction exceeds the junction’s critical voltage. Since electron tunneling is
stochastic in nature, the switching delay is a random variable [15, p. 704]. For temperatures above zero kelvin it
is possible that an electron will tunnel through a junction even though the critical voltage condition is not met.
To ensure that thermal effects do not dominate, the temperature must be kept low enough so that the charging
energy is much greater than the thermal energy. [14, p. 8]
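The temperature condition above can be given rough numbers. The sketch below assumes the common single-island estimate E_C = e²/(2C) for the charging energy, which is not a formula given in the cited works, and the capacitance of 10 aF used in the example is a hypothetical value chosen only for illustration. The function names are likewise hypothetical.

```python
ELEMENTARY_CHARGE = 1.602176634e-19  # elementary charge in C (exact SI value)
BOLTZMANN_K = 1.380649e-23           # Boltzmann constant in J/K (exact SI value)


def charging_energy(capacitance_farad):
    """Charging energy e^2 / (2C) in joules (assumed single-island estimate)."""
    return ELEMENTARY_CHARGE ** 2 / (2 * capacitance_farad)


def energy_ratio(capacitance_farad, temperature_kelvin):
    """Ratio of charging energy to thermal energy k_B * T."""
    return charging_energy(capacitance_farad) / (BOLTZMANN_K * temperature_kelvin)


if __name__ == "__main__":
    capacitance = 10e-18  # hypothetical total capacitance of 10 aF
    for temperature in (1.0, 4.2, 77.0, 300.0):
        ratio = energy_ratio(capacitance, temperature)
        print(f"T = {temperature:6.1f} K: E_C / (k_B T) = {ratio:8.2f}")
```

With these assumed values the ratio is large near 1 K and falls below one at room temperature, which illustrates why low operating temperatures or very small capacitances are needed.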
A SET implementation4 of a Merge is shown in Figure 3. The inverted signals Vâ and Vb̂ are produced
using SET static inverting buffers (Figure 4).
Figure 3: SET Merge (figure taken from [15, p. 705] and modified)
4 Safiruddin and Cotofana [15] use the values Ca = Cb = Ct = 0.5 aF, Cs1 = 9.5 aF, Cs2 = 10.5 aF, Cg = 10 aF, C1 = C2 =
C3 = 0.1 aF, Vs = 16 mV, and the resistance of each junction is 25.8 kΩ.
When both inputs are low the inverted input signals are high causing an electron to tunnel through
J2 leaving a positive charge on n2. This in turn causes an electron to tunnel through J3 leaving
a positive charge on n3. This is inverted and so the output becomes low, as it should be. If one
of the input signals undergoes a transition then the voltage of J2 (and J1) becomes lower than the
critical voltage causing the electron to tunnel back leaving no charge on n2 (and n1). This reduces
the voltage over J3 to under the critical voltage and the electron tunnels back leaving no charge on
n3. This value is complemented afterwards by the output inverter thus the output becomes high, as
it should. If the second signal also undergoes a transition then the voltage over J1 becomes higher
than the critical voltage causing an electron to tunnel. This cause[s] an electron to tunnel through
J3 leaving a positive charge on n3 corresponding to a low output, as it should. If subsequently one
of the signals transitions again, going low this time, the circuit goes into the previous state with no
charges on n1, n2, and n3. If after that the other signal also transitions the circuit returns to the
original charge neutral state. [15, p. 705]
Figure 4: SET static inverting buffer (figure taken from [15, p. 705] and modified)
It must be noted that these designs were simulated under ideal conditions (zero kelvin temperature, no
co-tunneling or background charge effects) [15] and have not yet been fabricated.
5 Asynchronous Cellular Automata

A cellular automaton is an array of cells arranged and connected uniformly. “Each cell is connected to a
neighborhood of a finite number of cells, and having a state from a finite state set. Each cell undergoes state
transitions according to a transition function, which determines the cell’s state based on the states of cells in its
neighborhood.” [5, §2]
Asynchronous cellular automata (ACA) “allow any cell to undergo state transitions at arbitrary times inde-
pendent of the timings of the other cells’ transitions. Due to the asynchronicity, however, computation in ACA
may be nondeterministic, i.e., more than one global configuration may evolve from a certain configuration.” [5,
§2]
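The nondeterminism mentioned in the quotation can be illustrated with a toy model. The sketch below assumes a neighborhood made of a cell and its four orthogonal neighbors, and a hypothetical two-state rule (`toy_rule`) invented for this example. It is not the transition function of any automaton in [4, 5], and all function names are assumptions.

```python
def neighborhood(grid, r, c, boundary=0):
    """States of (center, north, east, south, west); off-grid cells read as boundary."""
    rows, cols = len(grid), len(grid[0])

    def at(i, j):
        return grid[i][j] if 0 <= i < rows and 0 <= j < cols else boundary

    return (at(r, c), at(r - 1, c), at(r, c + 1), at(r + 1, c), at(r, c - 1))


def toy_rule(center, north, east, south, west):
    """Hypothetical rule: a cell becomes 1 once it or any neighbor is 1."""
    return 1 if center == 1 or 1 in (north, east, south, west) else 0


def update_cell(grid, r, c, rule=toy_rule):
    """Update a single cell in place; all other cells keep their states."""
    grid[r][c] = rule(*neighborhood(grid, r, c))


def run(grid, order, rule=toy_rule):
    """Apply single-cell updates in the given order to a copy of grid."""
    result = [row[:] for row in grid]
    for r, c in order:
        update_cell(result, r, c, rule)
    return result


if __name__ == "__main__":
    start = [[1, 0, 0]]
    print(run(start, [(0, 1), (0, 2)]))  # middle cell updates first
    print(run(start, [(0, 2), (0, 1)]))  # right cell updates first
```

The two update orders start from the same configuration and apply the same rule to the same cells, yet they end in different global configurations, because the right cell sees a different neighborhood depending on whether the middle cell has already been updated.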
Lee, et al. [5, 4] consider two-dimensional ACA in which each cell has a neighborhood composed of its four
orthogonal adjacent cells along with itself (a von Neumann neighborhood). They were able to embed a universal
set of DI building blocks5 [6] in a 5-state ACA, achieving computational universality [5].
Later, they embedded some of Patra and Fussell’s primitives in a 4-state ACA (see Figure 5). From these
they constructed a universal logic element (a Rotary Element [7]), showing that their model has computational
universality.
Figure 5: Transition rules in A4 (figure taken from [4, p. 207] and modified)
A Merge Core is shown in Figure 6. If we connect two Entrances (Figure 7), a Turn Core (Figure 8), and
an Exit (Figure 9), we get the Merge module shown in Figure 10.
Figure 6: (a) ACA Merge Core; (b) Merge Core operating on a signal on its right internal path (figure taken
from [4, p. 212] and modified)
Figure 7: (a) ACA Entrance; (b) Entrance operating on an input signal (figure taken from [4, p. 208] and
modified)
Figure 8: (a) ACA Turn Core; (b) Turn Core operating on a signal arriving on its lower internal path (figure
taken from [4, p. 210] and modified)
Figure 9: (a) ACA Exit; (b) Exit operating on a signal (figure taken from [4, p. 209])
Figure 10: ACA Merge module (figure taken from [4, p. 213] and modified)
Although no physical implementations of ACA models like this have yet been created, some possible candidates
are discussed in [12].
5 Their primitives differ from Patra and Fussell’s [9, 10] by allowing input and output lines of modules to be bi-directional and
able to buffer signals.
References
[1] Jose Araneta. A First Course in Semiconductor Devices and Circuits. National Book Store, Mandaluyong
City, Philippines, 2007.
[2] Y. Kameda, S. Polonsky, M. Maezawa, and T. Nanya. Self-timed parallel adders based on DI RSFQ primi-
tives. IEEE Transactions on Applied Superconductivity, 9(2):4040–4045, June 1999.
[3] Robert Keller. Towards a theory of universal speed-independent modules. IEEE Transactions on Computers,
23(1):21–33, 1974.
[4] Jia Lee, Susumu Adachi, Ferdinand Peper, and Shinro Mashiko. Delay-insensitive computation in asyn-
chronous cellular automata. Journal of Computer and System Sciences, 70:201–220, 2005.
[5] Jia Lee, Susumu Adachi, Ferdinand Peper, and Kenichi Morita. Embedding universal delay-insensitive
circuits in asynchronous cellular spaces. Fundamenta Informaticae, XX:1–24, 2003.
[6] Jia Lee, Ferdinand Peper, Susumu Adachi, and Kenichi Morita. Universal delay-insensitive circuits with
bi-directional and buffering lines. IEEE Transactions on Computers, 53(8):1034–1046, August 2004.
[7] Kenichi Morita. A simple universal logic element and cellular automata for reversible computing. In Maurice
Margenstern and Yurii Rogozhin, editors, MCU, volume 2055 of Lecture Notes in Computer Science, pages
102–113. Springer, 2001.
[8] Joel Noche. An asynchronous single-precision floating-point arithmetic unit. Master’s thesis, University of
the Philippines, Diliman, College of Engineering, 2003.
[9] Priyadarsan Patra and Donald Fussell. Building-blocks for designing DI circuits. Technical Report TR93-23,
Department of Computer Sciences, University of Texas at Austin, 1993.
[10] Priyadarsan Patra and Donald Fussell. Efficient building blocks for delay insensitive circuits. In Proceedings
of the International Symposium on Advanced Research in Asynchronous Circuits and Systems, pages 196–
205. IEEE Computer Society, November 1994.
[11] Priyadarsan Patra, Stanislav Polonsky, and Donald Fussell. Delay insensitive logic for RSFQ superconductor
technology. In Proceedings of the International Symposium on Advanced Research in Asynchronous Circuits
and Systems, pages 42–53. IEEE Computer Society, April 1997.
[12] Ferdinand Peper, Jia Lee, Susumu Adachi, and Shinro Mashiko. Laying out circuits on asynchronous cellular
arrays: A step towards feasible nanocomputers? Nanotechnology, 14(4):469–485, 2003.
[13] Jesse Sacayanan and Joel Noche. Modeling of delay-insensitive circuit building-blocks using the Hamburg
design system. Philippine Engineering Journal, XXIII(2):11–18, December 2002.
[14] Saleh Safiruddin. Single electron tunneling based building blocks for delay insensitive circuits. Master’s thesis,
Delft University of Technology, Faculty of Electrical Engineering, Mathematics and Computer Science, 2008.
[15] Saleh Safiruddin and Sorin Cotofana. Building blocks for delay-insensitive circuits using single electron
tunneling devices. In Proceedings of the IEEE International Conference on Nanotechnology, pages 704–708.
IEEE, August 2007.
[16] Philip Shirvani, Subhasish Mitra, Jo Ebergen, and Marly Roncken. DUDES: A fault abstraction and col-
lapsing framework for asynchronous circuits. In Proceedings of the International Symposium on Advanced
Research in Asynchronous Circuits and Systems, pages 73–82. IEEE Computer Society, April 2000.
