
Northeastern University

Electrical and Computer Engineering Master's Theses January 01, 2011 Department of Electrical and Computer Engineering

Analysis and design of high performance 128-bit parallel prefix end-around-carry adder
Ogun Turkyilmaz
Northeastern University

Recommended Citation
Turkyilmaz, Ogun, "Analysis and design of high performance 128-bit parallel prefix end-around-carry adder" (2011). Electrical and Computer Engineering Master's Theses. Paper 59. http://hdl.handle.net/2047/d20001096

This work is available open access, hosted by Northeastern University.

Analysis and Design of High Performance 128-bit Parallel Prefix End-Around-Carry Adder

A Thesis Presented by

Ogun Turkyilmaz

to

The Department of Electrical and Computer Engineering

in partial fulfillment of the requirements for the degree of

Master of Science

in

Electrical and Computer Engineering

Northeastern University Boston, Massachusetts

August 2011

Abstract

Addition is a timing-critical operation in today's floating-point units. In order to develop faster processing, an end-around carry (EAC) adder was proposed as a part of a fused multiply-add unit, which performs multiplication followed by addition [5]. The proposed EAC adder was also investigated with other prefix adders in FPGA technology as a complete adder [6]. In this thesis, we propose a 128-bit standalone adder with parallel prefix end-around-carry logic and conditional sum blocks to improve the critical path delay and provide flexibility to design with different adder architectures. In previous works, CLA logic was used for the EAC logic. Using a modified structure of a parallel prefix 2^n − 1 adder provides flexibility to the design and decreases the length of the carry path. After the architecture is tested and verified, the critical path is analyzed using the FreePDK45 (45nm) library. Full custom design techniques are applied carefully during critical path optimization. Critical path analysis provides a fast comparison of the total delay among different architectures without designing the whole circuit, and a simpler approach to size the transistors for the lowest delay possible. As a final step, the datapath is designed as a recurring bitslice for fast layout entry. The results show that the proposed adder achieves 142ps delay, 2.42mW average power dissipation, and 3,132 square micron area, assuming there is not much routing area overhead in the estimated area.

Acknowledgements
I would like to express my foremost appreciation to my advisor, Prof. Yong-Bin Kim, for giving me the opportunity to conduct research at Northeastern University. I am grateful for his technical guidance and constant support in my graduate career. Without his valuable suggestions and assistance, this thesis would not have been accomplished. I would like to thank the committee members, Prof. Fabrizio Lombardi and Prof. Gunar Schirner, for reading my thesis and offering valuable suggestions and contributions. I am especially grateful to Prof. Schirner for the long discussions about graduate study. He has been a mentor and a teacher to me, who generously shares his knowledge and experiences with tremendous enthusiasm and never-ending encouragement. It has been an honor and a great pleasure to study as a Fulbright Scholar. I would like to express my appreciation to the Fulbright Commission for giving me the chance to pursue further academic study and connect with many accomplished scholars. I would also like to thank Faith Crisley, Graduate Coordinator at the ECE Department, for her support and valuable suggestions. She has always been helpful with her comforting manner, even in the most stressful moments. Last but not least, I would like to express my sincere appreciation to my beloved parents, Nuket and Nevzat Turkyilmaz, and my sister, Pinar Turkyilmaz, who encouraged me continuously in every step I take, supported me constantly through every hardship I faced, and loved me without boundaries. Without them, I could not even imagine being where I am today.

Ogun Turkyilmaz August 2011

Contents
1 Introduction
  1.1 Fused Multiply-Add Operation
  1.2 Adders
      1.2.1 Ripple Carry Adders
      1.2.2 Carry lookahead adders
      1.2.3 Parallel Prefix Adders
  1.3 Tree Adders
  1.4 Recurrence Algorithms
      1.4.1 Weinberger Recurrence
      1.4.2 Ling Recurrence
  1.5 Conclusion

2 Modulo Adders
  2.1 Introduction
  2.2 Addition in Modulo 2^n − 1 Adder
  2.3 Analysis of Previous End-Around-Carry Adders
  2.4 Carry-lookahead EAC Logic Unit
  2.5 Conclusion

3 Modified Parallel Prefix EAC Adder
  3.1 Introduction
  3.2 Proposed Adder
      3.2.1 The 16-bit blocks in EAC adder
      3.2.2 Parallel Prefix 2^n − 1 EAC Block
  3.3 Implementation and Validation
  3.4 Conclusion

4 Critical Path Analysis
  4.1 Path Identification
  4.2 Path Design
  4.3 Transistor Level Design and Sizing
      4.3.1 Logic Level Minimization
      4.3.2 Late arriving signal exploitation
      4.3.3 Logical Effort
      4.3.4 Design with Helpers
  4.4 Transistor Sizing
  4.5 Simulation Results
  4.6 Conclusion

5 Datapath Library
  5.1 Introduction
  5.2 Concepts in Full Custom Design
  5.3 Datapath Design
  5.4 Layout Design
  5.5 Results
  5.6 Conclusion

6 Conclusion and Future Works

A Verilog Code of the Proposed Adder

B HSPICE Simulation Files
  B.1 Cells
  B.2 Simulation Code
  B.3 Condition of Transistors

List of Figures
1.1 Ripple Carry adder [1].
1.2 Carry lookahead adder [2].
1.3 Group PG cells [3].
1.4 Taxonomy of prefix networks [3].
1.5 Kogge-Stone adder [3].
1.6 Sklansky adder [3].
1.7 Brent-Kung adder [3].
1.8 Han-Carlson adder [3].
1.9 Knowles [2,1,1,1] adder [3].
1.10 Ladner-Fischer adder [3].
2.1 Prefix graph with fast end-around carry [4].
2.2 General block diagram of modulo 2^n − 1 adder [4].
2.3 Block diagram of the 128-bit binary adder [5].
2.4 Architecture of the EAC adder [6].
2.5 Architecture of the EAC adder [6].
3.1 Architecture of the modified EAC adder.
3.2 16-bit conditional sum blocks.
3.3 cin merge with fast carry link [7].
3.4 cin merge with extra bit [7].
3.5 Modified 8-bit Kogge-Stone EAC block.
4.1 Critical path of the modified EAC adder.
4.2 Gate level design of critical path.
4.3 Reduced first stage in Weinberger recursion adder [8].
4.4 AOI without late arriving exploitation.
4.5 AOI with late arriving exploitation.
4.6 EAC logic with helper.
4.7 Spreadsheet for Logical Effort Calculation.
4.8 Transistor level schematic.
4.9 Transistor level schematic.
4.10 Delay vs. Vdd at 25°C.
4.11 Power vs. Vdd at 25°C.
4.12 Delay vs. Vdd at 100°C.
4.13 Power vs. Vdd at 100°C.
5.1 Global floorplan of a datapath [9].
5.2 Regularity placement and routing datapath circuit [10].
5.3 Schematic layout of datapath and detailed view of bitcell [11].
5.4 Representation of a datapath cell [9].
5.5 Designed basic cells.
5.6 Bit slices of the blocks in the adder.
5.7 Bit slice of 16-bit Kogge-Stone adder.
5.8 Layout of the blocks.
5.9 Wide layout.
5.10 Stacked compact layout.

List of Tables
1.1 Trade-off between different adder topologies.
4.1 Delay and Power dissipation values in correspondence to VDD and Temperature.
5.1 Results comparison of proposed adder with the previous work.

Chapter 1

Introduction
The fused multiply-add unit plays an important role in modern microprocessors. It performs a floating-point multiplication followed by an addition of the product with a third floating-point operand. In 2007, a seven-cycle fused multiply-add pipeline unit was proposed as a part of the floating-point unit in IBM's POWER6 microprocessor [5]. In this fused multiply-add data flow, the product should be aligned before it is added with the addend. Because the magnitude of the product is unknown in the early stages prior to the combination with the addend, it is difficult to determine a priori which operand is bigger. Even if it were determined early that the product is bigger, there would be a problem in conditionally complementing two intermediate operands, the carry and sum outputs of the counter tree. Thus, an adder needs to be designed that always outputs a positive-magnitude result and preferably needs to complement only one operand. In [6], the adder in POWER6 was taken as a reference for design space exploration in FPGA technology. The authors designed a complete adder independent of the FMA block and showed that Kogge-Stone does not provide the best performance in FPGA technology. Zhang et al. [12] recently proposed a 108-bit adder for an FMA unit. All these adders took the adder in POWER6 as a reference. We believe using a CLA block in this adder limits the possibility to fully exploit the benefits of parallel prefix adders. We designed an adder with a parallel prefix 2^n − 1 block. Although the carry increment topology is still employed, the number of carry merge terms is decreased, as well as the length of the end-around carry path. Another area of improvement lies in the design of the first level addition blocks. Using a carry-select scheme provides the benefit of choosing the sum at the end of computation. However, the carry path should not include the calculation of group propagate and generate (PG) terms according to the input carry. In our conditional sum blocks, the carry path only includes PG terms with cin = 0, and the sum is calculated in the non-critical path for both conditions, cin = 0 and cin = 1. At the final stage, the real sum is selected according to the output of the EAC block.

The thesis is organized as follows: in Chapter 1, general information about adders is provided to show the importance of the design space. In Chapter 2, a detailed analysis of modulo 2^n − 1 adders and EAC adders is given. The modified adder is described in Chapter 3 and compared with the previous architectures. The critical path analysis methodology is described in detail in Chapter 4, and simulation results are provided. Finally, in Chapter 5, the datapath library design methodology is described.

1.1 Fused Multiply-Add Operation

A fused multiply-add (FMA) unit performs the multiplication A × B followed immediately by an addition of the product and a third operand C, so that the result T is calculated as Eqn. 1.1 in a single indivisible step [2]. Such a unit is capable of performing multiply only by setting C = 0, and add (or subtract) only by setting, for example, B = 1.

$T = A \times B + C = M + C$    (1.1)

An advantage of a fused multiply-add unit, compared to a separate multiplier and adder, arises when executing floating-point operations, since rounding is performed only once for the result of T = A × B + C rather than twice (for the multiply and then for the add) [13]. Since rounding may introduce computation errors, reducing the number of rounding operations positively affects the overall error. The inputs of the operands are calculated at the CSA (Carry-Save Adder) multiplier tree, and the magnitude of the operands is not known prior to addition to determine which operand has the greater value. Since floating point is a sign-magnitude representation, the result of the adder should be in two's complement form [14]. Therefore, an adder is needed to produce two separate results

for the following cases:

Case 1: If operand $M > C$, then $|M - C| = M - C = M + \overline{C} + 1$.

Case 2: If operand $M < C$, then $|M - C| = C - M = -(M - C) = -(M + \overline{C} + 1) = -(M + \overline{C}) - 1 = \overline{M + \overline{C}}$.

During the subtraction M − C, the final carry out Cout is 1 when M > C, and 0 when C > M. Consequently, an end-around-carry adder produces two different results, and Cout determines whether case 1 or case 2 happens [12].

1.2 Adders

1.2.1 Ripple Carry Adders

The addition of two operands is the most frequent operation in almost any arithmetic unit. A two-operand adder is used not only when performing additions and subtractions, but is also often employed when executing more complex operations like multiplication and division. Consequently, a fast two-operand adder is essential [3].

$s_i = a_i \oplus b_i \oplus c_i$    (1.2)
$c_{i+1} = a_i b_i + (a_i + b_i) c_i$    (1.3)

Figure 1.1: Ripple Carry adder [1].


The most straightforward implementation of a parallel adder for two operands x and y is through the use of n basic units called full adders. A full adder (FA) is a logical circuit that accepts two operand bits, say $x_i$ and $y_i$, and an incoming carry bit $c_i$, producing a sum bit $s_i$ and an outgoing carry bit $c_{i+1}$. As the notation suggests, the outgoing carry $c_{i+1}$ is also the incoming carry for the subsequent FA, which has $x_{i+1}$ and $y_{i+1}$ as its input bits. The FA is a combinational digital circuit implementing the binary addition of three bits through the Boolean equations 1.2 and 1.3. The ripple carry adder is shown in Figure 1.1.
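As an illustration of the equations above, a minimal Verilog sketch of a ripple carry adder built from full adders follows (module and signal names are illustrative and not taken from the thesis or Appendix A):

// Full adder implementing Eqns. 1.2 and 1.3:
// s = a ^ b ^ cin,  cout = a&b | (a|b)&cin
module full_adder (input a, b, cin, output s, cout);
  assign s    = a ^ b ^ cin;
  assign cout = (a & b) | ((a | b) & cin);
endmodule

// n-bit ripple carry adder: the carry out of each FA feeds the next FA.
module ripple_carry_adder #(parameter N = 8) (
  input  [N-1:0] x, y,
  input          cin,
  output [N-1:0] s,
  output         cout
);
  wire [N:0] c;
  assign c[0] = cin;
  genvar i;
  generate
    for (i = 0; i < N; i = i + 1) begin : fa_chain
      full_adder fa (.a(x[i]), .b(y[i]), .cin(c[i]), .s(s[i]), .cout(c[i+1]));
    end
  endgenerate
  assign cout = c[N];
endmodule

The delay of this structure grows linearly with N because the carry ripples through every stage, which motivates the lookahead schemes described next.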

1.2.2 Carry lookahead adders

The most commonly used scheme for accelerating carry propagation is the carry lookahead scheme [2], shown in Figure 1.2. The equations in Eqn. 1.4 show the realization of the carry-lookahead generator. The main idea behind carry lookahead addition is an attempt to generate all incoming carries in parallel (for all the n−1 high order FAs) and avoid the need to wait until the correct carry propagates from the stage (FA) of the adder where it has been generated. This can be accomplished in principle, since the carries generated and the way they propagate depend only on the digits of the original numbers x and y. These digits are available simultaneously to all stages of the adder, and consequently each stage could, in principle, determine the value of its incoming carry and compute the sum bit accordingly. This, however, would require an inordinately large number of inputs to each stage of the adder, rendering this approach impractical [1].

Figure 1.2: Carry lookahead adder [2].


$c_4 = G_0 + c_0 P_0$
$c_8 = G_1 + G_0 P_1 + c_0 P_0 P_1$
$c_{12} = G_2 + G_1 P_2 + G_0 P_1 P_2 + c_0 P_0 P_1 P_2$    (1.4)
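A small Verilog sketch of the group lookahead logic in Eqn. 1.4, assuming four 4-bit groups with group generate/propagate inputs G and P (names are illustrative):

// Group carry-lookahead generator for four 4-bit groups (Eqn. 1.4).
// G[i], P[i] are the group generate/propagate signals of group i.
module group_cla (
  input  [3:0] G, P,
  input        c0,
  output       c4, c8, c12
);
  assign c4  = G[0] | (P[0] & c0);
  assign c8  = G[1] | (P[1] & G[0]) | (P[1] & P[0] & c0);
  assign c12 = G[2] | (P[2] & G[1]) | (P[2] & P[1] & G[0]) | (P[2] & P[1] & P[0] & c0);
endmodule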

1.2.3 Parallel Prefix Adders

A parallel prefix circuit is a combinational circuit with n inputs $x_1, x_2, \ldots, x_n$ producing the outputs $x_1,\; x_2 \circ x_1,\; \ldots,\; x_n \circ x_{n-1} \circ \cdots \circ x_1$, where $\circ$ is an associative binary operation. The first stage of the adder generates the individual P and G signals. The remaining stages constitute the parallel prefix circuit, with the fundamental carry operation serving as the associative binary operation. This part of the adder can be designed in many different ways.

$g_i = a_i b_i$
$p_i = a_i \oplus b_i$    (1.5)

$G_{i:k} = G_{i:j} + P_{i:j} G_{j-1:k}$
$P_{i:k} = P_{i:j} P_{j-1:k}$    (1.6)

Although computing carry-propagate addition can use generate and propagate signals, its implementation in VLSI can be quite inefficient due to the number of wires that have to be connected together. Parallel-prefix adders solve this problem by making the wires shorter, with simple gate structures to aid in passing groups of carries to the next weight [3] [15]. The proof of the parallel prefix adder can be found in [16]. Parallel-prefix adders can be broken down into three stages:

Pre-computation: single-bit carry generate/propagate signals are obtained with Equation 1.5 and the temporary sum is generated. This stage can be simplified by applying the rules defined in Section 1.4.1.

Parallel-prefix tree: the carry at each bit is computed with group carry generate/propagate, where Equation 1.6 is applied. It is possible to simplify the first stage of the tree using the rules defined in Section 1.4.2.

Post-computation: the sum and carry-out are derived with Equation 1.2.

Basic cell definitions of prefix adders are shown in Figure 1.3.

Figure 1.3: Group PG cells [3].
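As a concrete counterpart to the group PG cells of Figure 1.3, the following Verilog sketch implements the fundamental carry operator of Equation 1.6 as black and gray cells (module names are illustrative):

// Black cell: combines two adjacent group (G, P) pairs per Eqn. 1.6.
module black_cell (
  input  g_hi, p_hi,   // (G, P) of the more significant group i:j
  input  g_lo, p_lo,   // (G, P) of the less significant group j-1:k
  output g_out, p_out  // combined (G, P) of group i:k
);
  assign g_out = g_hi | (p_hi & g_lo);
  assign p_out = p_hi & p_lo;
endmodule

// Gray cell: same carry merge, but the group propagate is not needed
// because no further merge follows at this bit position.
module gray_cell (
  input  g_hi, p_hi, g_lo,
  output g_out
);
  assign g_out = g_hi | (p_hi & g_lo);
endmodule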

1.3 Tree Adders

Tree structures have been used for graphically representing the various parallel prefix algorithms. Many state-of-the-art adder circuits use parallel prefix schemes to achieve high performance [17-19]. For wide adders, the delay of carry-lookahead (or carry-skip or carry-select) adders becomes dominated by the delay of passing the carry through the lookahead stages. This delay can be reduced by looking ahead across the lookahead blocks [20]. In general, it is possible to construct a multilevel tree of lookahead structures to achieve a delay that grows with log N. Such adders are variously referred to as tree adders, logarithmic adders, multilevel-lookahead adders, parallel-prefix adders, or simply lookahead adders. There are many ways to build the lookahead tree that offer tradeoffs among the number of


stages of logic, the number of logic gates, the maximum fanout on each gate, and the amount of wiring between stages. Figure 1.4 shows a three-dimensional taxonomy of prefix adders [21]. The far-edge adders are Kogge-Stone [22], Sklansky [23], and Brent-Kung [24], having low logic level with high wiring track count, low logic level with high fanout, and high logic level with low fanout, respectively, as shown in Table 1.1.

Table 1.1: Trade-off between different adder topologies.

Topology           Logic Level   Fanout   Wiring Track
Kogge-Stone [22]   Low           Low      High
Sklansky [23]      Low           High     Low
Brent-Kung [24]    High          Low      Low

Figure 1.4: Taxonomy of prefix networks [3].

The following parallel prefix adders can be found in the literature:

Kogge-Stone: The Kogge-Stone tree [22], shown in Figure 1.5, achieves both log2 N stages and a fanout of 2 at each stage. This comes at the cost of long wires that must be routed between stages. The tree also contains more PG cells; while this may not impact the area if the adder layout is on a regular grid, it will increase power consumption. Despite these costs,


the Kogge-Stone adder is generally used for wide adders because it shows the lowest delay among these structures.

Figure 1.5: Kogge-Stone adder [3].

Sklansky: The Sklansky or divide-and-conquer tree [23], shown in Figure 1.6, reduces the delay to log2 N stages by computing intermediate prefixes along with the large group prefixes. This comes at the expense of fanouts that double at each level: the gates fan out to [8, 4, 2, 1] other columns. These high fanouts cause poor performance on wide adders unless the gates are appropriately sized or the critical signals are buffered before being used for the intermediate prefixes. Transistor sizing can cut into the regularity of the layout because multiple sizes of each cell are required, although the larger gates can spread into adjacent columns. With appropriate buffering, the fanouts can be reduced to [8, 1, 1, 1].


Figure 1.6: Sklansky adder [3].

Brent-Kung: The Brent-Kung tree [24], shown in Figure 1.7, computes prefixes for 2-bit groups. These are used to find prefixes for 4-bit groups, which in turn are used to find prefixes for 8-bit groups, and so forth. The prefixes then fan back down to compute the carries-in to each bit. The tree requires 2(log2 N) − 1 stages. The fanout is limited to 2 at each stage. The diagram shows buffers used to minimize the fanout and loading on the gates, but in practice, the buffers are generally omitted.

Figure 1.7: Brent-Kung adder [3].

Other than those major adders, trade-offs can be achieved with the following adders.


The Han-Carlson trees [25] are a family of networks between Kogge-Stone and Brent-Kung. Figure 1.8 shows such a tree that performs Kogge-Stone on the odd-numbered bits, and then uses one more stage to ripple into the even positions. The Knowles trees [26] are a family of networks between Kogge-Stone and Sklansky. All of these trees have log2 N stages, but differ in the fanout and number of wires. If we say that 16-bit Kogge-Stone and Sklansky adders drive fanouts of [1, 1, 1, 1] and [8, 4, 2, 1] other columns, respectively, the Knowles networks lie between these extremes. For example, Figure 1.9 shows a [2, 1, 1, 1] Knowles tree that halves the number of wires in the final track at the expense of doubling the load on those wires. The Ladner-Fischer trees [27] are a family of networks between Sklansky and Brent-Kung. Figure 1.10 is similar to Sklansky, but computes prefixes for the odd-numbered bits and again uses one more stage to ripple into the even positions. Cells at high-fanout nodes must still be sized or merged appropriately to achieve good speed.

Figure 1.8: Han-Carlson adder [3].


Figure 1.9: Knowles [2,1,1,1] adder [3].

Figure 1.10: Ladner Fischer adder [3].

1.4 Recurrence Algorithms

Recurrence algorithms have been a research area for a long time [8] [28]. Weinberger presented the most widely known carry recurrence for VLSI addition in 1958 [20]. Over the years, several addition algorithms have been developed. These algorithms manipulate the carry and sum equations in an attempt to improve the speed of addition. The equations for sum and carry are defined as Equations 1.2 and 1.3.


Ling modified the algorithm to reduce the complexity of the carry computation at the cost of increased complexity in the sum computation. An analysis was later performed by Doran [29] to determine the set of recurrences which have recurrence properties that are similar to Weinberger's and Ling's.

1.4.1 Weinberger Recurrence

Weinberger [20] demonstrated that addition speed could be improved by parallelizing the computation of the carry. Although widely credited with only the Carry Look-Ahead Adder, Weinberger's recurrence was not limited in group size or number of levels for carry computation [20]. The fundamental advancement of his work was the introduction of generate and propagate, as shown in Eqn. 1.7. Weinberger defined the terms bitwise generate (g), bitwise propagate (p), group generate (G), and group propagate (P). These terms allow the carry computation to be performed in parallel, yielding a significant improvement in performance compared to ripple-carry addition. For a group of 4 bits, the Weinberger recurrence has ten terms for the generation of $G_{5:2}$ from the inputs and four terms for the generation of $P_{5:2}$. The maximum transistor stack height is 5. Weinberger demonstrated that G and P could be used to create blocks of arbitrary size and parallelized to form multiple levels of recurrence [20]. Thus, the majority of parallel prefix adders proposed for high-performance addition employ realizations of Weinberger's recurrence, e.g., Kogge-Stone [22], Brent-Kung [24], Han-Carlson [25], Ladner-Fischer [27], and those described by Knowles [26].

$g_i = a_i b_i$
$p_i = a_i + b_i$    (1.7)

1.4.2 Ling Recurrence

IBM ECL technology limitations on fan-in (limited to 4) and wired-OR (limited to 8) motivated Ling to develop a transformation that reduced the fan-in of Weinberger's recurrence [30] [8]. For clarity, a simple derivation of Ling's transformation will be shown. This derivation provides


the physical meaning of the signals used in Ling's transformation and identifies the favorable characteristics of Ling for implementation in modern CMOS technology. In the derivation, the bitwise generate signal is defined as $g_i = a_i b_i$ and the bitwise propagate signal is defined as $t_i = a_i + b_i$. Note that the propagate signal $t_i$ is the same as Weinberger's $p_i$ (when implemented using an OR). To maintain consistency with Ling's original paper, $t_i$ will be used for propagate. Ling's transformation reduces the complexity of Weinberger's recurrence by factoring $t_i$ from $c_{i+1}$ to create a pseudo-carry ($h_i$) on which the recurrence is performed. The transformation is shown below on $c_1$ to form $h_0$. The carry-out signal, $c_1$, of the first bit position is

$c_1 = g_0 + t_0 c_0$    (1.8)

Ling's transformation uses the property $t_i g_i = g_i$ to form

$c_1 = t_0 g_0 + t_0 c_0 = t_0 (g_0 + c_0)$    (1.9)

where $g_0 + c_0 = h_0$, which leads to

$c_1 = t_0 (g_0 + c_0) = t_0 h_0$    (1.10)

The general transformation of $c_i$ is defined as

$c_i = \begin{cases} t_{i-1} h_{i-1} & \text{if } i > 0 \\ c_0 & \text{if } i = 0 \end{cases}$    (1.11)

where the pseudo-carry, $h_i$, is defined as

$h_i = g_i + c_i$    (1.12)

The physical meaning of the pseudo-carry signal h can be described as follows. By factoring $t_i$ out of the carry expression and propagating $h_i$ instead of $c_{i+1}$, all cases where a carry is generated and/or propagated from the stage preceding stage i are included in $h_i$. This includes the case where a carry-in to the ith stage can be assimilated (which should not result in a carry-out). The assimilate condition is handled when forming $c_{i+1}$ by ANDing $h_i$ with $t_i$ to produce $c_{i+1}$. If the carry-assimilate (carry-kill) condition exists, then $t_i = 0$, which results in $c_{i+1} = 0$. A recurrence for $h_i$ can be defined as has been done previously for Weinberger's $c_i$. The


group pseudo-carry and transmit, which allow for parallel prefix computation, can be defined over a group of bits (capital letters are used to refer to the group):

$T_{i:j} = t_i t_{i-1} \cdots t_j$    (1.13)

$H_{i:j} = g_i + g_{i-1} + t_{i-1} g_{i-2} + t_{i-1} t_{i-2} g_{i-3} + \cdots + t_{i-1} t_{i-2} \cdots t_{j+1} g_j$    (1.14)

The recurrence can be expressed using the $\circ$ operator as

$(H_{i:j},\, T_{i-1:j-1}) \circ (H_{j-1:k},\, T_{j-2:k-1}) = (H_{i:j} + T_{i-1:j-1} H_{j-1:k},\; T_{i-1:j-1} T_{j-2:k-1})$    (1.15)

The transformation from Weinberger's recurrence to Ling's recurrence for a group of 4 bits is shown in the example in Fig. 2. This figure should dispel any difficulties associated with understanding Ling's original derivation. The advantage of using the pseudo-carry instead of the carry is offset by the increased complexity of the sum computation, which requires the real carry to form the individual sum signals. In CMOS technology, the sum can be efficiently calculated conditionally, thus avoiding the AND operation on the critical carry path:

$s_i = \begin{cases} a_i \oplus b_i & \text{if } h_{i-1} = 0 \\ a_i \oplus b_i \oplus t_{i-1} & \text{if } h_{i-1} = 1 \end{cases}$    (1.16)
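To make the relation between the conventional carry and the Ling pseudo-carry concrete, a one-bit Verilog sketch follows (a simplified illustration only, not the thesis implementation):

// One bit position: Weinberger carry vs. Ling pseudo-carry.
// g = a&b (generate), t = a|b (transmit/propagate).
module ling_bit (
  input  a, b,       // operand bits of position i
  input  c_in,       // conventional carry into position i
  input  h_in, t_in, // Ling pseudo-carry and transmit from position i-1
  output c_out,      // conventional carry out: c_{i+1} = g + t*c_in (Eqn. 1.3)
  output h_out       // Ling pseudo-carry: h_i = g_i + c_i with c_i = t_{i-1} h_{i-1}
);
  wire g = a & b;
  wire t = a | b;
  assign c_out = g | (t & c_in);
  assign h_out = g | (t_in & h_in);   // Eqn. 1.12 after substituting Eqn. 1.11
endmodule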

1.5 Conclusion

In this chapter, general information about binary adder realization from a VLSI perspective is described. Parallel prefix adders provide the fastest carry propagation on the critical path for wide adders, and they allow fast layout design because of their regularity. The most important advantage is the design space, with many trade-offs in delay, power dissipation, and area. Also, recurrence algorithms are presented to show that it is possible to shorten the carry path with different propagate and generate terms. In the next chapter, a detailed analysis of modulo adders is given. End-around-carry adders are analyzed in detail as a specialized realization of modulo adders. Analysis of previous

work is also included.


Chapter 2

Modulo Adders
2.1 Introduction

Modular arithmetic has been of interest to researchers in a wide range of areas, since its operations are the basis for systems that use Residue Number Systems (RNS) [31]. Modulo addition/subtraction and multiplication can also be applied to digital filters [32], cryptography [33], error detection and correction [34], as well as checksum computation in high-speed networks [35]. More importantly, modulo 2^n − 1 addition is a common operation that can be implemented in hardware, because of its circuit efficiency and simple implementation [36]. In end-around carry adders, the carry-in depends on the carry-out [4], i.e., the carry-out cout is fed through some logic back to the carry-in cin. In particular, this is used for modulo 2^n + 1 [37] and 2^n − 1 [38] addition, which rely on a decrement and an increment, respectively, of the addition result depending on cout. Since prefix algorithms actually rely on incrementer structures, considering parallel-prefix schemes for this kind of adder is very promising. In order to obtain fast end-around carry adders, both conditions of fast carry-out generation and fast carry-in processing have to be met. This implies that there should be no combinational path between cin and cout. A fast end-around carry adder can be built using the prefix structure. Here the last prefix stage is used as an incrementer which is controlled by the cout of the previous prefix stages.


2.2 Addition in Modulo 2^n − 1 Adder

Addition modulo 2^n − 1, or one's complement addition, can be formulated by the following equation:

$A + B \pmod{2^n - 1} = \begin{cases} A + B - (2^n - 1) = A + B + 1 \pmod{2^n} & \text{if } A + B \geq 2^n - 1 \\ A + B & \text{otherwise} \end{cases}$    (2.1)

However, the condition $A + B \geq 2^n - 1$ is not trivial to compute. It can be rewritten as $A + B \geq 2^n$ with a carry input of 1.

$A + B \pmod{2^n - 1} = \begin{cases} A + B - (2^n - 1) = A + B + 1 \pmod{2^n} & \text{if } A + B \geq 2^n \\ A + B & \text{otherwise} \end{cases}$    (2.2)

Now the carry out cout from the addition A + B can be used to determine whether an increment has to be performed, or, even simpler, cout can be added to the sum of A + B. This equation, however, results in a double representation of zero (i.e., 0 = 00...0 = 11...1).
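A minimal behavioral Verilog sketch of this modulo 2^n − 1 addition, in which the carry out is simply added back to the sum, is shown below (illustrative only; the prefix-based realizations discussed next avoid the long carry path of this naive form):

// Modulo 2^n - 1 (one's complement) adder: the carry out of A + B is
// added back in as the end-around carry.
module mod_2n_minus_1_adder #(parameter N = 8) (
  input  [N-1:0] a, b,
  output [N-1:0] sum
);
  wire [N:0] raw = a + b;             // N+1 bits; raw[N] is the carry out
  assign sum = raw[N-1:0] + raw[N];   // end-around carry increment
endmodule

Note that this form still exhibits the double representation of zero mentioned above: a result of all ones is the second encoding of zero.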

Figure 2.1: Prefix graph with fast end-around carry [4].

The standard approach for the implementation of a modulo 2^n − 1 adder is to use a conventional carry propagate adder (CPA) and have the carry out fed back into the carry in of the adder. This creates the necessary end-around carry needed to have the modulo 2^n − 1 addition operate correctly, as shown in Figure 2.2.


Figure 2.2: General block diagram of a modulo 2^n − 1 adder [4].

2.3 Analysis of Previous End-Around-Carry Adders

The previously defined end-around action can be obtained using different rules. Although the EAC adder has been used [39] and implemented on several microprocessors, very few details exist on their formulations and arithmetic algorithms in today's literature. Schwarz [40] provided explanations about some aspects of the EAC adder's algorithm as a part of a fused multiply-add (FMA) unit. Shedletsky [41] defined the indeterminate behavior of EAC adders, and Liu et al. [42] presented a formal analysis of EAC adders.


Figure 2.3: Block diagram of the 128-bit binary adder [5].

Liu et al. [6] [42] extended the algorithm to make the adder independent, without being part of an FMA unit. The design mainly follows the algorithms of the EAC adder implemented in the IBM POWER6 microprocessor [5]. The additional logic units of the proposed adder are useful to ensure that the whole adder works independently. Another advantage is that it is easier to implement and test the adder in FPGA technology, which enables design space exploration. Figure 2.4 shows the architecture of the adder.


Figure 2.4: Architecture of the EAC adder [6].

EAC means that when subtracting two signed numbers that are in sign-magnitude format, the subtraction is implemented by the addition of the first operand with the Boolean complement of the second operand. For this addition, instead of setting a carry into the least significant digit, the carry out of the most significant digit is taken as the carry in. This ensures that the result of the addition is always a positive magnitude, and preferably only one operand needs to be conditionally complemented. Thus, an EAC adder performs addition similar to other regular adders, and performs subtraction using the end-around carry to ensure the result is positive. The adder shown in Figure 2.4 should satisfy the following conditions: 1) when $x.s = y.s$, the adder should do addition, and we have $s.s = x.s$ and $s.x = x.x + y.y$; 2) when $x.s \neq y.s$, the


adder should do subtraction. If $x.x \geq y.y$, then $s.s = x.s$ and $s.x = x.x - y.y$; if $x.x < y.y$, then $s.s = y.s$ and $s.x = y.y - x.x$. The subtraction operation can be described as follows:

1) Determining which operand is bigger. After a subtraction operation, if the result is positive, operand x is bigger; otherwise y is bigger. When $x.x \geq y.y$, since $x.x - y.y = x.x + \overline{y.y} + 1 = x.x + 2^n - y.y$, the carry out of $x.x + \overline{y.y} + 1$ will be 1. Therefore, $c_{out}$ results as 1 if x is bigger and 0 if y is bigger. Hence the sum equation can be written as $x.x + \overline{y.y} + c_{out}$.

2) When y is bigger, $c_{out} = 0$, and the subtraction can be written as $s.x = y.y - x.x = -(x.x - y.y) = -(x.x + \overline{y.y} + 1) = -(x.x + \overline{y.y} + 0) - 1 = \overline{x.x + \overline{y.y} + 0}$.

3) $c_{out}$ is used to select the correct result:

$s.x = \begin{cases} x.x + \overline{y.y} + c_{out} & \text{if } c_{out} = 1 \\ \overline{x.x + \overline{y.y} + c_{out}} & \text{if } c_{out} = 0 \end{cases}$

In order to implement addition and subtraction in one adder, y should be complemented conditionally. The effective operation can be defined as

$O_s = x.s \oplus y.s$    (2.3)

$y^t = \begin{cases} y.y & \text{if } O_s = 0 \\ \overline{y.y} & \text{if } O_s = 1 \end{cases}$

The sign of the result is determined after the sign logic:

$s.s = \begin{cases} x.s & \text{if } c_{out} = 1 \\ y.s & \text{if } c_{out} = 0 \end{cases}$
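The rules above can be summarized in a short behavioral Verilog sketch (a simplified model for illustration; the ports map to the sign and magnitude fields x.s/x.x, and the module name and interface are assumptions, not taken from [6]):

// Behavioral sign-magnitude add/subtract using the end-around carry rules.
module eac_addsub_behavioral #(parameter N = 8) (
  input          xs, ys,            // operand signs  x.s, y.s
  input  [N-1:0] xm, ym,            // operand magnitudes x.x, y.y
  output         ss,                // result sign  s.s
  output [N-1:0] sm                 // result magnitude
);
  wire         op   = xs ^ ys;                         // effective operation Os (Eqn. 2.3)
  wire [N-1:0] yt   = op ? ~ym : ym;                   // conditionally complemented y
  wire [N:0]   diff = {1'b0, xm} + {1'b0, ~ym} + 1'b1; // x.x + ~y.y + 1
  wire         cout = op ? diff[N] : 1'b0;             // 1 when x.x >= y.y during subtraction
  wire [N:0]   sum  = {1'b0, xm} + {1'b0, yt} + cout;  // x.x + y^t + cout
  // cout = 0 during subtraction: complement the result and take y's sign
  assign sm = (op && !cout) ? ~sum[N-1:0] : sum[N-1:0];
  assign ss = (op && !cout) ? ys : xs;                 // magnitude overflow on addition ignored in this sketch
endmodule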

Generally, implementation of an adder/subtracter is achieved using two different adders, one for addition and one for subtraction. After the results for both of the operations are calculated, the final result is selected with a multiplexer, as shown in Figure 2.5.


Figure 2.5: Architecture of the EAC adder [6].

2.4 Carry-lookahead EAC Logic Unit

The use of an EAC unit helps to implement an adder/subtracter using only one adder. In [40], the algorithm for an EAC unit with four carry bits can be found. The most significant bit is labeled as 0. The group carries for a CLA adder are defined as:

$C_0 = G_0 + P_0 G_1 + P_0 P_1 G_2 + P_0 P_1 P_2 G_3 + P_0 P_1 P_2 P_3 C_{in}$
$C_1 = G_1 + P_1 G_2 + P_1 P_2 G_3 + P_1 P_2 P_3 C_{in}$
$C_2 = G_2 + P_2 G_3 + P_2 P_3 C_{in}$
$C_3 = G_3 + P_3 C_{in}$    (2.4)

If the carry out C0 is fed to the carry in, the EAC operation is achieved as shown in Equation 2.5.

$C_0 = G_0 + P_0 G_1 + P_0 P_1 G_2 + P_0 P_1 P_2 G_3 + P_0 P_1 P_2 P_3$
$C_1 = G_1 + P_1 G_2 + P_1 P_2 G_3 + P_1 P_2 P_3 G_0 + P_0 P_1 P_2 P_3$
$C_2 = G_2 + P_2 G_3 + P_2 P_3 G_0 + P_2 P_3 P_0 G_1 + P_0 P_1 P_2 P_3$
$C_3 = G_3 + P_3 G_0 + P_3 P_0 G_1 + P_3 P_0 P_1 G_2 + P_0 P_1 P_2 P_3$    (2.5)

The combination of the carries in this way results in a carry chain for every group that is the length of the width of the adder. This wrapping of the carries is correct for subtraction but is not correct for addition. To make the adder selectable for addition and subtraction, the P3 term needs to be modified. An extra bit is combined with the least significant bit of the adder to assert carry propagation when the effective operation is subtraction. This bit can be integrated

into $P_3$ to make $P_3^t = 0$ for an effective operation of addition, as defined in Eqn. 2.6.

$P_3^t = \begin{cases} P_3 & \text{if } O_s = 1 \\ 0 & \text{if } O_s = 0 \end{cases}$    (2.6)
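A direct Verilog transcription of Eqns. 2.5 and 2.6 for the four group carries is sketched below (the module interface and signal names are illustrative):

// Carry-lookahead EAC logic for four groups (Eqns. 2.5-2.6).
// Index 0 refers to the most significant group, as in the text.
module cla_eac_logic (
  input  [3:0] G, P,   // group generate/propagate signals
  input        os,     // effective operation: 1 = subtraction, 0 = addition
  output [3:0] C       // group carries; C[0] belongs to the most significant group
);
  wire p3t = os & P[3];   // Eqn. 2.6: the wrap-around propagate is blocked for addition
  assign C[0] = G[0] | (P[0]&G[1]) | (P[0]&P[1]&G[2]) | (P[0]&P[1]&P[2]&G[3])
              | (P[0]&P[1]&P[2]&p3t);
  assign C[1] = G[1] | (P[1]&G[2]) | (P[1]&P[2]&G[3]) | (P[1]&P[2]&p3t&G[0])
              | (P[0]&P[1]&P[2]&p3t);
  assign C[2] = G[2] | (P[2]&G[3]) | (P[2]&p3t&G[0]) | (P[2]&p3t&P[0]&G[1])
              | (P[0]&P[1]&P[2]&p3t);
  assign C[3] = G[3] | (p3t&G[0]) | (p3t&P[0]&G[1]) | (p3t&P[0]&P[1]&G[2])
              | (P[0]&P[1]&P[2]&p3t);
endmodule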

2.5 Conclusion

In this chapter, detailed information about binary modulo adders and previous work on EAC adders is provided. It is shown that feeding cout back into a carry-increment stage provides the EAC operation, as defined for 2^n − 1 adders. The designs in [5] and [12] employ the adder as a part of an FMA flow. However, by adding extra logic, Liu [6] proposes a standalone adder. In the next chapter, a detailed analysis of the proposed parallel prefix EAC adder will be given. The design presents an alternative to the previous work while providing a wider design space.

Chapter 3

Modified Parallel Prefix EAC Adder

3.1 Introduction

Previously proposed adders were analyzed in Section 2.3. Since the adder designed in [5] did not include implementation details about how the blocks were internally built, [6] made a detailed analysis of how the first stage adder and EAC blocks are defined. They also extended the work to a complete adder without an FMA unit. Their implementation was directed to FPGAs. Although the analysis that they provided answered many of the questions about the previously designed work, a number of parts of the adder still need to be analyzed. In this chapter, we propose a modified adder which uses a modified parallel prefix 2^n − 1 adder block as the EAC logic, with conditional sum blocks for flexibility among different adder architectures and lower total propagation delay.

3.2 Proposed Adder

Figure 3.1 shows the architecture of the proposed adder. The first level includes eight 16-bit blocks of Kogge-Stone prefix-2 adders for the 128-bit inputs, and the second level includes the modified 2^n − 1 parallel prefix Kogge-Stone adder. In order to design a standalone adder, the input complement, add/sub, and sign blocks are included as suggested in [6].


Figure 3.1: Architecture of the modified EAC adder.

3.2.1 The 16-bit blocks in EAC adder

Figure 3.2 shows the 16-bit conditional sum blocks. The black and gray blocks are the same as in Fig. 1.3. The dashed lines correspond to the half-sum blocks, $h_i = a_i \oplus b_i$. GG and GP refer to the group generate and propagate signals. As explained in [6], when the carry in to the adder block is assumed to be 0, it is possible to reduce the complexity of the adder. When the carry in is intended to propagate, the generate term in Equation 1.6 can be extended as $G_{i:k} = G_{i:j} + P_{i:j} c_{in}$. Thus, for $c_{in} = 0$ it results in $G_{i:k} = G_{i:j}$. It can be seen that the generated group carry for the corresponding bit position need not be merged with the previous carry. Thus, it is possible to reduce the number of black cells and use gray cells instead. Namely, if there is no carry merge operation in the next level, the corresponding level can be terminated with a gray cell. Figure 1.5 shows the 16-bit Kogge-Stone tree with gray


and black cells. However, in our design, since we included a second stage, we need both the generate and propagate terms. This necessity increases the use of black cells instead of gray cells.

Figure 3.2: 16-bit conditional sum blocks.

The adder in [6] is not clear about how the input carry is merged in the first level of adders. A discussion of the most efficient approaches for the traditional carries can be found in [43]. The carry-in bit can be included either by adding a fast carry increment stage or by treating cin as an extra bit of the preprocessing stage of the adder. The first case is shown in Figure 3.3. The second case can be derived by setting $g_{-1} = c_{in}$ according to Equation 1.6. As a result of these schemes, the complexity increases to solve the carry incorporation problem. Additionally, [6] and [5] do not explain how the carry is propagated after the 8-bit blocks. Although a similar structure is used in [12], the adder architecture is different, and only one set of generate-propagate signals is calculated and propagated in the first level. Since [6] uses two different adders for the conditions cin = 0 and cin = 1, two sets of carries need to be selected before


leaving the first level. This operation makes the calculation more complicated. Thus, we use the architecture in Figure 3.2 to calculate one set of generate-propagate group terms for the condition cin = 0. In the next stage, the conditional carry for cin = 1 is calculated with the simple equation $G_{i:k} = G_{i:j} + P_{i:j}$, which is simply an OR gate.

Figure 3.3: cin merge with fast carry link [7].

Figure 3.4: cin merge with extra bit [7].

As a final step, the sum for each carry condition, cin = 0 and cin = 1, is calculated to be

selected according to the result of the second level of carry calculation.
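The conditional-sum selection described above can be sketched behaviorally in Verilog as follows (a simplified model; the actual blocks are built from the prefix cells of Figure 3.2 rather than behavioral adders, and the names are illustrative):

// Conditional sum block: both sums are precomputed off the critical path,
// and the carry from the second-level EAC block selects between them.
module conditional_sum_block #(parameter N = 16) (
  input  [N-1:0] a, b,
  input          cin_from_eac,            // carry selected by the 2^n - 1 EAC block
  output [N-1:0] sum
);
  wire [N-1:0] sum0 = a + b;              // sum assuming cin = 0
  wire [N-1:0] sum1 = a + b + 1'b1;       // sum assuming cin = 1
  assign sum = cin_from_eac ? sum1 : sum0; // final selection (one MUX level)
endmodule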


3.2.2 Parallel Prefix 2^n − 1 EAC Block

As explained in Chapter 2, 2^n − 1 adders can be used for the EAC calculation. In this section, we extend the use of 2^n − 1 adders to an adder/subtractor. Figure 3.5 shows the modified 8-bit Kogge-Stone EAC block.

Figure 3.5: Modified 8-bit Kogge-Stone EAC block.

The block first takes the group generate and propagate terms that are calculated in the first level. Since there are eight 16-bit adders in the first level, the carry for the whole adder must be calculated first. The 8 carries are merged in the Kogge-Stone adder. In order to achieve the subtraction operation, the final carry at the most significant bit position must be merged with the lower significant positions. This step is also called the carry increment stage. The effective operation signal Os is defined as in Eqn. 2.3. When the operation is subtraction, Os = 1, the AND gate propagates the most significant carry; when the operation is addition, Os = 0, the most significant carry is blocked and the unit works as a regular adder. $c_8^t = O_s \cdot G_{127:0}$ is defined as the end-around carry in Eqn. 3.1.

$c_8^t = \begin{cases} G_{127:0} & \text{if } O_s = 1 \\ 0 & \text{if } O_s = 0 \end{cases}$    (3.1)

In the literature [37] [4] [44] [45] [15], the carry-increment stage has found much usage for the 2^n − 1 operation. Especially for wide adders, N ≥ 64, the final carry needs to travel a very long path to arrive at the least significant position, actually twice the length, from the least significant


position to the least significant position again. It is defined in [40] that the carry needs to travel only the total length in an EAC operation. That result motivated the use of CLA logic as the EAC logic. In our adder, we use a second stage for the EAC calculation instead of one long parallel prefix propagation and carry increment stage. Namely, for a 128-bit EAC adder the shortest stage count is $2^n = 128$, $n = 7$, plus one stage of 127 carry merge terms. In our adder, we have 4 stages for the first level, 3 stages for the second level, and 1 stage of 8 carry merge terms, which makes the total stage count the same, with a much lower count of carry merge terms. Decreasing the number of carry merge terms actually decreases the length of the carry path as well as the delay.

3.3 Implementation and Validation

After the adder architecture is finalized, both the modified and the previously proposed [6] adders are designed in Verilog. A test fixture is created to validate the adder with corner cases, such as the 0-to-1 crossover and carry propagation from the 16th bit to the next bit. A check procedure is defined to signal if the output of the adder and the calculated result are different. The tests show that both adders have similar operation, and thus the modified EAC adder is verified.

3.4 Conclusion

In this chapter, the proposed parallel prefix EAC adder is analyzed. It is shown that using conditional sum blocks solves the carry incorporation problem existing in the previous works. Moreover, the EAC logic in [6] is redesigned with a modified 2^n − 1 adder to provide a wider design space and a shorter carry path. As a final step, the adder is implemented in Verilog and validated through simulation. In the next chapter, a simple method for transistor level realization of the critical path is discussed.

Chapter 4

Critical Path Analysis

4.1 Path Identification

The critical path is identified as the path from the sign logic to the sum at the output, as shown in Figure 4.1 for the proposed adder. In order to conditionally complement the second input, the effective operation Os is calculated from the sign inputs sA and sB. After $B^t$ is calculated, the propagate and generate terms are calculated using Eqn. 1.7. The half-sum block, $h_i = a_i \oplus b_i$, is not on the critical path, because the half-sum is needed for the sum calculation, which is not needed for carry propagation. After 4 stages of the 16-bit adder, the carry is merged with the other group terms in the 8-bit EAC block. The EAC block consists of 3 carry merge stages and 1 carry increment stage. We only take into account the carry merge operation $G_{i:k} = G_{i:j} + P_{i:j} G_{j-1:k}$, which is simply an AND-OR-INVERT (AOI) gate, because the group propagate term is not on the critical path. The most significant carry should either be blocked or propagated. Thus, the Os-controlled AND gate is on the critical path. Since the end-around carry term, the most significant bit in the EAC block, is calculated before the carry increment stage, c8 is not on the critical path. If we choose one of the carries between the last and first bits, it gives a good estimate of the delay of the operation. At the next stage, one of the sums needs to be selected according to the carry from the EAC block as an output for each 16-bit adder. Therefore, one mux is on the critical path. As a final stage, the output needs to be conditionally complemented according to the end-around carry and the operation. The calculation of the condition is not on


the critical path, because it can be calculated during the first MUX stage. Thus, we end up with one XOR stage for complementing.

Figure 4.1: Critical path of the modified EAC adder.

4.2 Path Design

After the critical path is identified, the gate level model is created as in Figure 4.2. All the inverters on the critical path are removed to decrease the number of logic levels, by applying the inverting property of CMOS to the consecutive levels. It can be observed that each stage of the carry merge operation shows an alternating design of AOI and OAI gates. The output load is one minimum size inverter.
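The alternating carry-merge stages can be expressed in Verilog as AOI and OAI cells (a sketch of the inverting forms of the Eqn. 1.6 carry merge; names are illustrative):

// Carry merge as an AND-OR-INVERT gate: ~(G_hi | (P_hi & G_lo)).
module carry_merge_aoi (input g_hi, p_hi, g_lo, output g_out_n);
  assign g_out_n = ~(g_hi | (p_hi & g_lo));
endmodule

// OR-AND-INVERT form used at the next stage on the complemented signals:
// ~(~G_hi & (~P_hi | ~G_lo)) = G_hi | (P_hi & G_lo), so alternating AOI/OAI
// restores the polarity without explicit inverters on the carry path.
module carry_merge_oai (input g_hi_n, p_hi_n, g_lo_n, output g_out);
  assign g_out = ~(g_hi_n & (p_hi_n | g_lo_n));
endmodule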


Figure 4.2: Gate level design of critical path.

The gate level design is just a logical expression of the circuit. Therefore, in order to make accurate simulations, the gates should be mapped to transistor level models. In this phase, we created a library of custom designed gates instead of using a standard cell library, for higher performance.

4.3 Transistor Level Design and Sizing

Full custom design always outperforms standard cell design in terms of power consumption, area, and propagation delay. However, it is hard to automate design from a full custom perspective, and the design process takes a much longer time than design with standard cells. A number of techniques exist when designing at the transistor level.

4.3.1 Logic Level Minimization

Before designing any logic circuit, it is beneficial to exploit logic level minimization. In our design, we tried to combine the pg generation and the first level of AOI, as suggested in [8]. This approach allows a single stage to be removed from the Weinberger adder realization. However, this method increases the stack count, and since our blocks were designed close to minimum width transistors, we did not observe any improvement in the delay. Figure 4.3 shows the reduced first stage.


Figure 4.3: Reduced first stage in Weinberger recursion adder [8].

As another step, Ling recursion is applied to the first stage. However, as suggested in [15], Ling carries must be converted to real carries by ANDing the most significant carry with the corresponding propagate term, $c_{i+1} = d_i p_i$, before leaving the adder block. This calculation is on the critical path. Therefore, Ling recursion is not applied. As a final step, we combined the Os-controlled AND gate with the previous OAI gate. However, we observed that increasing the complexity of the gate did not provide much delay improvement, because the gates are already sized close to minimum.


4.3.2 Late arriving signal exploitation

One of the most important perspectives in transistor level design is to connect the late arriving signals close to the output. As can be seen for the AOI gate in Figure 4.4, assuming A is the late arriving signal, the gate in Figure 4.5 provides better delay.

Figure 4.4: AOI without late arriving exploitation.


Figure 4.5: AOI with late arriving exploitation.

4.3.3 Logical Effort

Sutherland et al. described the very useful concept of Logical Effort [46]. The method of logical effort is founded on a simple model of the delay through a single MOS logic gate. The model describes delays caused by the capacitive load that the logic gate drives and by the topology of the logic gate. Clearly, as the load increases, the delay increases, but the delay also depends on the logic function of the gate. Inverters, the simplest logic gates, drive loads best and are often used as amplifiers to drive large capacitances. Logic gates that compute other functions require more transistors, some of which are connected in series, making them poorer than inverters at driving current. A NAND gate has more delay than an inverter with similar transistor sizes that drives the same load. The method of logical effort quantifies these effects to simplify delay analysis for individual logic gates and multistage logic networks. The logical effort of a logic gate tells how much worse it is at producing output current than an inverter, given that each of its inputs may present the same input capacitance as the


inverter. Reduced output current means slower operation, and thus the logical effort number for a logic gate tells how much more slowly it will drive a load than would an inverter. Equivalently, logical effort is how much more input capacitance a gate must present in order to deliver the same output current as an inverter.

Logical Effort for Multistage Networks

The method of logical effort reveals the best number of stages in a multistage network and how to obtain the least overall delay by balancing the delay among the stages. The notions of logical and electrical effort generalize easily from individual gates to multistage paths. The logical effort along a path compounds by multiplying the logical efforts of all the logic gates along the path. The symbol G denotes the path logical effort, so that it is distinguished from g, the logical effort of a single gate in the path. The subscript i indexes the logic stages along the path.

$G = \prod g_i$    (4.1)

The electrical effort along a path through a network is simply the ratio of the capacitance that loads the last logic gate in the path to the input capacitance of the first gate in the path. The uppercase symbol H indicates the electrical effort along a path. In this case, Cin and Cout refer to the input and output capacitances.

$H = C_{out}/C_{in}$    (4.2)

Branching effort b is used to account for fanout within a network. When fanout occurs within a logic network, some of the available drive current is directed along the analyzed path and some is directed along the off-path. The branching effort b is defined at the output of a logic gate to be

$b = (C_{on\text{-}path} + C_{off\text{-}path})/C_{on\text{-}path}$    (4.3)

Note that if the path does not branch, the branching effort is one. The branching effort along an entire path, B, is the product of the branching efforts at each of the stages along the

path.

$B = \prod b_i$    (4.4)

Utilizing the definitions of logical, electrical, and branching effort along a path, the path effort is defined as F. Note that the path branching and electrical efforts are related to the electrical effort of each stage.

$F = G B H$    (4.5)

Although it is not a direct measure of delay along the path, the path effort holds the key to minimizing the delay. Observe that the path effort depends only on the circuit topology and loading, and not upon the sizes of the transistors used in the logic gates embedded within the network. The path effort is related to the minimum achievable delay along the path. Only a little work is needed to find the best number of stages and the proper transistor sizes to realize the minimum delay. Optimizing the design of an N-stage logic network proceeds from a very simple principle: the path delay is least when each stage in the path bears the same stage effort. This minimum delay is achieved when the stage effort is

$f = g_i h_i = F^{1/N}$    (4.6)

To equalize the effort borne by each stage on a path, and therefore achieve the minimum delay along the path, appropriate transistor sizes for each stage of logic along the path must be chosen. Each logic stage should be designed with electrical effort

$h_i = F^{1/N}/g_i$    (4.7)

From this relationship, it is straightforward to determine the transistor sizes of the gates along a path, starting at the end of the path and working backward to apply the capacitance transformation:

$C_{in,i} = (g_i \cdot C_{out,i})/f$    (4.8)


The equation determines the input capacitance of each gate, which can then be distributed appropriately among the transistors connected to the input.
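As a worked illustration of Eqns. 4.1-4.8 (the numbers are hypothetical and not taken from the spreadsheet in Figure 4.7), consider a four-stage path with typical stage logical efforts of 1, 4/3, 5/3, and 1, a branch of 2 at one node, and a load 20 times the path input capacitance:

$G = 1 \cdot \tfrac{4}{3} \cdot \tfrac{5}{3} \cdot 1 \approx 2.22$
$B = 2, \quad H = C_{out}/C_{in} = 20$
$F = G B H \approx 2.22 \cdot 2 \cdot 20 \approx 88.9$
$f = F^{1/4} \approx 3.07$
$C_{in,4} = g_4 C_{out}/f = (1 \cdot 20\,C_{in})/3.07 \approx 6.5\,C_{in}$

Applying Eqn. 4.8 backward through the remaining stages then yields the input capacitance, and hence the transistor widths, of each gate on the path.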

4.3.4 Design with Helpers

As described in [47], in most adders the stage effort is generally constant if wire capacitance is neglected. This means uniform gate sizes may be used throughout with little loss in performance. It is possible to have a very regular layout in this case. However, adders such as Sklansky show an exponential fanout increase. In this case, the stage effort becomes high for those cells, and as a result a bigger driver is needed. Therefore, Harris et al. [47] propose the concept of helpers. When the stage effort increases, it is a good practice to duplicate driving cells in parallel to maintain a lower delay.

Figure 4.6: EAC logic with helper. Using the concept described by [47], we duplicate the operation controlled NAND gate. That helps to decrease the fanout of the carry increment stage and, consequently, the size of the NAND gate.

4.4 Transistor Sizing

Logical effort has found wide interest in transistor sizing [48-51]. In our proposed adder, we used the concept to size the transistors in the critical path. After the sizes are determined, it is

CHAPTER 4. CRITICAL PATH ANALYSIS

39

possible to use these widths in the remaining blocks because other blocks which are not on critical path will have the same loads and fan-outs within the same logic level. In order to equalize the rising and falling edge of the output signal wp /wn = 2.5/1 ratio is used between NMOS and PMOS transistors. In the transistor level library, late arriving signal exploitation, helpers and logical eort are used to achive highest performance. In addition, the Os controlled AND gate needs to drive 7 OAI cells. If we use two helpers, as described in 4.3.4, instead of one, so that the gates drive 4 OAI at most and it is possible decrease the large size as well as delay. In order to apply logical eort, a spreadsheet is created as in Figure 4.7. After the logical eort parameters are calculated, the width is distributed to each stage according to stage coecients. Since the gates are designed according to 2.5/1 ratio, stage coecients show how large is the gate when compared to inverter.
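A rough numeric sketch of the helper argument above is given below. Only the 7-to-4 OAI fanout split is taken from the text; the gate logical effort and the capacitance values are illustrative assumptions.

# Helper sketch: splitting the 7-OAI fanout of the Os-controlled gate across two parallel
# drivers reduces the stage effort each driver must bear. Logical effort and capacitance
# numbers are illustrative assumptions, not extracted from the design.
g_gate = 4.0 / 3.0        # assumed logical effort of the driving gate
C_oai_input = 2.0         # assumed input capacitance of one OAI cell (unit loads)
C_gate_input = 2.0        # assumed input capacitance of the driving gate

def stage_effort(n_oai_driven):
    return g_gate * (n_oai_driven * C_oai_input) / C_gate_input

print("single driver, 7 OAI loads:", round(stage_effort(7), 2))
print("two helpers, at most 4 OAI loads each:", round(stage_effort(4), 2))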

Figure 4.7: Spreadsheet for logical effort calculation.

After the appropriate transistor sizes are found, simulations are performed using HSPICE [52] and the FreePDK45 library [53] [54]. Figures 4.8 and 4.9 show the transistor-level schematic.


Figure 4.8: Transistor level schematic.

Figure 4.9: Transistor level schematic.


4.5 Simulation Results

During the simulation, a square-wave signal with 150 ps rising and falling edges is applied, and the delay between input and output is measured. The power dissipation of the critical path is measured to estimate the power of one bit of the adder. Temperature and supply voltage are swept over 25-100 °C and 0.8-1.2 V to observe the optimal operating point. It can be observed from Figures 4.10-4.13 that the critical path shows 141.8 ps delay and 18.9 µW power dissipation for one bit when a 10% switching activity is considered [3], which corresponds to 2.42 mW for 128 bits.
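The 128-bit power figure follows from scaling the per-bit measurement, as the short sketch below shows; it assumes every bit slice dissipates roughly the same power as the measured critical-path slice.

# Scaling the measured per-bit critical-path power (at the 10% switching activity
# mentioned above) to the full 128-bit datapath.
per_bit_power_uW = 18.9
bits = 128
total_mW = per_bit_power_uW * bits / 1000.0
print("Estimated 128-bit power:", round(total_mW, 2), "mW")   # about 2.42 mW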

Figure 4.10: Delay vs. Vdd at 25 °C.


Figure 4.11: Power vs. Vdd at 25 °C.

Figure 4.12: Delay vs. Vdd at 100 °C.


Table 4.1: Delay and power dissipation values with respect to VDD and temperature.

Vdd (V)    Delay (ps), 25 °C    Delay (ps), 100 °C    Power (µW), 25 °C    Power (µW), 100 °C
0.8        178.18               269.08                10.33                11.55
1.0        141.83               214.94                18.90                20.59
1.2        125.45               189.31                33.79                36.19

Figure 4.13: Power vs. Vdd at 100 °C.

Table 4.1 shows the trade-off between temperature and supply voltage for the proposed adder. When Vdd is increased from 1.0 V to 1.2 V, delay decreases by 12% and power increases by 79%; when Vdd is decreased from 1.0 V to 0.8 V, delay increases by 26% and power decreases by 45%. When the temperature rises from 25 °C to 100 °C, delay increases by 51% and power increases by 9.2%. For low-power applications, considerably more power can be saved by lowering the supply voltage than is lost in additional delay; for high-speed applications, more power must be provided. Since higher temperature degrades both power and delay, the circuit should be kept at a lower temperature for higher performance.
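The percentages quoted above can be recomputed directly from Table 4.1; the sketch below does so (small differences from the rounded figures in the text are expected).

# Recomputing the voltage/temperature trade-offs from Table 4.1
# (delay in ps, power in uW; 25 C values indexed by Vdd).
delay_25C = {0.8: 178.18, 1.0: 141.83, 1.2: 125.45}
power_25C = {0.8: 10.33, 1.0: 18.90, 1.2: 33.79}
delay_100C_1V, power_100C_1V = 214.94, 20.59

def pct(new, old):
    return 100.0 * (new - old) / old

print("1.0 V -> 1.2 V: delay %+.0f%%, power %+.0f%%"
      % (pct(delay_25C[1.2], delay_25C[1.0]), pct(power_25C[1.2], power_25C[1.0])))
print("1.0 V -> 0.8 V: delay %+.0f%%, power %+.0f%%"
      % (pct(delay_25C[0.8], delay_25C[1.0]), pct(power_25C[0.8], power_25C[1.0])))
print("25 C -> 100 C at 1.0 V: delay %+.0f%%, power %+.1f%%"
      % (pct(delay_100C_1V, delay_25C[1.0]), pct(power_100C_1V, power_25C[1.0])))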


4.6 Conclusion

In this chapter, the methodology for transistor-level critical path optimization is provided. First, the critical path of the adder is identified taking the logic levels into account. Transistors are sized using the logical effort concept [46]. Architecture-level and transistor-level optimizations are applied according to full custom design rules in the FreePDK45 library [54]. As a final step, transient simulations are carried out to calculate the delay and power dissipation. It is observed that the method enables fast comparison of the critical path delay and power dissipation among different adder architectures. In the next chapter, a methodology for fast area estimation and layout entry is given, and a datapath library is generated that exploits the regularity of the proposed adder.

Chapter 5

Datapath Library
5.1 Introduction

Datapaths have been a research topic in recent years as an approach to make layout entry faster [9] [10] [55] [56]. Informally, datapaths are circuits where the same or similar logic is applied to several bits [57]. A datapath stack [58] is made up of many custom word lines, such as registers, ALUs, adders, shifters, multiplexers, and buffers, which form the data flow of the functional units. Datapaths are characterized by a highly regular layout structure. A typical datapath floorplan consists of an array of horizontally oriented words of identical bit cells, called datapath cells, and vertically oriented bit slices, as shown in Figure 5.1. Since each bit slice is replicated a number of times (determined by the datapath width) with very little or no modification, layout generation of such regular structures reduces to a careful design, often by means of handcrafting, of the individual datapath cells. Figure 5.2 shows the regular placement and routing inside a datapath slice. Figure 5.3 shows the schematic layout of a datapath and the corresponding bit slice [55].


Figure 5.1: Global floorplan of a datapath [9].

Figure 5.2: Regularity of placement and routing in a datapath circuit [10].


Figure 5.3: Schematic layout of a datapath and detailed view of a bit cell [11].

Datapath circuits are typically organized in horizontal rows of words representing the same functional block and vertical bit slices, delimited by vertically running power and ground rails. The layout of the datapath cell of bit slice i is identical to that of bit slice (i+1), but mirrored along the vertical axis so that adjacent bit slices can share a common power or ground rail.


Figure 5.4: Representation of a datapath cell [9].

The width of the bit slice, also known as the pitch, is fixed; it determines the width of all the datapath cells, as outlined in Figure 5.4. Power and ground (VDD/VSS) supply rails generally delimit the pitch. Signal nets are connected to the datapath cell components by means of bristles. Vertical bristles, or data lines, provide wiring between the different datapath cells within the same bit slice; they run in parallel with the power rails. Horizontal bristles, or control lines, provide wiring between datapath cells of different bit slices; they span the width of the datapath and run perpendicular to the power rails. Since adders exhibit very regular structures, they can be designed with datapaths.

5.2 Concepts in Full Custom Design

Transistor chaining and device merging. Transistor chaining is a widely used technique to improve both the area and the performance of datapath cells. Several transistors can be chained together by combining their diffusion areas in order to reduce the diffusion capacitance. Diffusion sharing applied to simple logic gates in the same datapath cell is known as device merging.

Transistor folding. Transistor folding is another popular technique aimed at minimizing area and improving the performance of custom designs. Folding changes the aspect ratio of a component while maintaining the required device size (W/L ratio). By folding with different numbers of fingers (poly gates), different component instances can be created for the placement phase.

Intracell sharing. Two component areas (diffusion regions or poly gates) belonging to components from adjacent bit slices can be merged if they share the same global net, such as a power line, control line, or clock signal. In a typical datapath organization, adjacent bit slices are identical copies of each other, reflected with respect to the vertical boundary line. In this case the components can be pushed under the boundary line (ground or power rail) to create a more compact layout.
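As a small numerical illustration of the folding rule above (the device width below is an arbitrary example, not taken from the cell library):

# Transistor folding: splitting a wide device into several fingers changes the cell's
# aspect ratio while preserving the total W/L ratio. Width/length values are illustrative.
W_total_um, L_um = 0.49, 0.05
for fingers in (1, 2, 4):
    w_per_finger = W_total_um / fingers
    print(fingers, "finger(s):", round(w_per_finger, 3), "um per finger,",
          "total W/L =", round(W_total_um / L_um, 1))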

5.3 Datapath Design

Using the concepts defined in Sections 5.2 and 4.3, the logic gates are designed. Figures 5.5(a) and 5.5(b) show the design of the basic AOI and MUX2 cells according to these rules. The AOI cell occupies a 0.77 µm × 1.5 µm area and the MUX2 cell 0.97 µm × 1.5 µm.

(a) AOI Layout

(b) MUX2 Layout

Figure 5.5: Designed basic cells.

According to the values from Figure 4.7, the longest cell has a height of 1.5 µm. Thus, the other cells are designed to match the longest cell in order to maintain regularity and exploit the datapath style. Figure 5.7 shows the bit slice of the 16-bit Kogge-Stone adder. Using small cells allowed us to reduce the height of the bit slice. The bit slice includes the sign logic and the conditional sum calculation. It can be seen from the datapath that the carry-merge and propagate cells alternate from one stage to the next due to the intrinsic inversion of static CMOS gates.

(a) EAC bitslice

(b) EAC last bit

Figure 5.6: Bit slices of the blocks in the adder.


Figure 5.7: Bit slice of the 16-bit Kogge-Stone adder.


Using the same approach as in Figure 5.7, bit slices for the last and the regular bits of the parallel prefix EAC block can be designed. Since the last bit has an irregular layout, it is designed separately. Figures 5.6(a) and 5.6(b) show the designed bit slices for the EAC block.

5.4 Layout Design

Using the datapath designs, it is straightforward to assemble the whole adder. Replicating the bit slice designed in Figure 5.7 gives the layout of the 16-bit Kogge-Stone adder; Figure 5.8(a) shows this layout. The 16-bit adder and the EAC block measure 13.42 µm × 21.98 µm and 5.18 µm × 11.06 µm, respectively. As a final step, using the designed blocks, the total area of the 128-bit EAC adder can be estimated. Two different schemes are considered. Figure 5.9 is a thinner but wider layout intended to be used as part of a larger datapath. Figure 5.10 is a thicker but tighter layout, more compact because half of the adder is stacked on top of the other half. The first and the second are sized 20.8 µm × 175 µm and 35.8 µm × 87.5 µm, respectively.

Figure 5.9: Wide layout.

Figure 5.10: Stacked compact layout.
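The area numbers above can be cross-checked with simple bookkeeping, as in the sketch below; it uses the block dimensions quoted in the text, assumes eight 16-bit blocks plus one EAC block, and ignores routing overhead, as stated earlier.

# Area bookkeeping for the two 128-bit floorplans (dimensions in micrometres, from the text).
ks16_w, ks16_h = 13.42, 21.98      # 16-bit Kogge-Stone block
eac_w, eac_h = 5.18, 11.06         # parallel prefix EAC block

blocks_area = 8 * ks16_w * ks16_h + eac_w * eac_h
wide_area = 20.8 * 175.0           # thin, wide floorplan (Figure 5.9)
stacked_area = 35.8 * 87.5         # stacked, compact floorplan (Figure 5.10)

print("sum of block areas:", round(blocks_area, 1), "um^2")
print("wide layout       :", round(wide_area, 1), "um^2")
print("stacked layout    :", round(stacked_area, 1), "um^2  (about 3,132 um^2, as in Table 5.1)")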


Table 5.1: Results comparison of the proposed adder with the previous work.

            Delay (ps)     Power (mW)    Area (µm²)        Technology
[5]         200 (+29%)     -             -                 65 nm
[12]        270 (+47%)     20 (+88%)     17,237 (+82%)     65 nm
Proposed    142            2.42          3,132             45 nm

5.5 Results

The results of previous works and the proposed adder can be observed in Table 5.1. In previous work, a pipelined 128-bit 5 GHz+ binary floating-point adder is proposed in [5], and a 108-bit EAC adder is proposed in [12]. A direct comparison with these adders is not possible because they implement the adders as part of an FMA unit and are designed in 65 nm IBM SOI technology, which is not publicly available. The adder designed in [5] is a pipelined adder, which does not fall into the scope of this thesis. The adder in [6] is designed in FPGA technology, whereas our adder is a full custom design. Therefore, to the best of our knowledge, this is the first adder to use a small parallel prefix 2^n - 1 EAC block with a full custom design methodology. It can be seen in Table 5.1 that the adder operates with a delay of 142 ps and 2.42 mW power dissipation in a 3,132 µm² area at 25 °C with a 1 V supply. Assuming routing does not contribute significantly to delay and power, the proposed adder shows up to 47% improvement in delay, 81% improvement in area, and 88% improvement in power dissipation compared with the previous works.
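The improvement figures quoted in Table 5.1 follow directly from the raw numbers, as the short sketch below verifies (rounded to the nearest percent).

# Recomputing the relative improvements in Table 5.1 from the raw delay, power, and area values.
proposed = {"delay_ps": 142.0, "power_mW": 2.42, "area_um2": 3132.0}
ref_12 = {"delay_ps": 270.0, "power_mW": 20.0, "area_um2": 17237.0}
ref_5_delay_ps = 200.0

def improvement(ref, new):
    return 100.0 * (ref - new) / ref

print("delay vs [12]:", round(improvement(ref_12["delay_ps"], proposed["delay_ps"])), "%")
print("delay vs [5] :", round(improvement(ref_5_delay_ps, proposed["delay_ps"])), "%")
print("power vs [12]:", round(improvement(ref_12["power_mW"], proposed["power_mW"])), "%")
print("area  vs [12]:", round(improvement(ref_12["area_um2"], proposed["area_um2"])), "%")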

5.6 Conclusion

In this final chapter, a detailed analysis of layout entry through datapath design is provided, together with a comparison between the proposed adder and previous work. The datapath library provides building blocks for early area estimation before the whole adder is designed. Further, routing is simplified with the datapath approach, since wires can be placed regularly inside the bit slice. The cells are designed using full custom design rules to achieve a compact, area-efficient layout. Two different layouts are generated by repeating the bit slices. We aimed to keep the longest carry path under 100 µm so that routing does not contribute significantly to the delay. The final result shows that, in the stacked layout, the longest path remains under 45 µm, and the proposed adder achieves up to 47% improvement in delay, 81% improvement in area, and 88% improvement in power dissipation in 45 nm technology at 25 °C and 1.0 V Vdd.


(a) Layout of 16-bit Kogge-Stone adder

(b) Layout of parallel prefix EAC block

Figure 5.8: Layout of the blocks.

Chapter 6

Conclusion and Future Works


Adders are functional blocks that are generally designed for fast operation; however, power dissipation can no longer be given lower priority. Using parallel prefix adders is a good design practice for trading off speed, power dissipation, and area. It is observed in the literature that large performance gains are no longer easily obtained from the designed circuits. However, the design space is very vast and there always exist possibilities for improvement.

In this thesis, we designed a parallel prefix 2^n - 1 based adder to show that it is possible to shorten the critical path and reduce the power dissipation. After the adder was implemented in Verilog, we performed a critical path analysis. In this top-down design perspective, we preferred full custom design over standard cell design to achieve the best performance. It is well known that analysis and design with a full custom methodology require a large amount of time; thus, specifying the critical path provides fast analysis without designing the whole circuit. In order to make such an analysis, we assumed that wire delay does not contribute the majority of the total propagation delay. This analysis shows that if the critical path is optimized, the total performance is optimized. As a next step, we created a datapath library using the results from the critical path analysis. It is shown that datapath design reduces the complexity of the adder design process because it exploits the regularity of the bit slice. As a general outcome, critical path analysis and datapath design provide fast analysis for comparing different adder architectures from the speed, power dissipation, and area perspectives. Moreover, once the bit slice is designed optimally, it can be used as a building block for fast layout entry. After the datapath design, it is observed in the final layout that the length of the end-around carry path is decreased to less than 45 µm using a stacked layout, which confirms our assumption about routing delay. Under this assumption, the adder shows 142 ps delay, 2.42 mW power dissipation, and under 3,200 sq. micron area.

Our analysis is based on static CMOS design. Recent works show that dynamic adders provide good performance results; although the power dissipation increases in dynamic design, a trade-off can be achieved between speed and power. Moreover, pipelined circuits have found interest in adder design. The operation of this adder can be further analyzed in the dynamic design space. In addition, the use of EAC adders in floating-point units, especially decimal floating-point adders, can be further investigated.

Bibliography
[1] J. Rabaey, A. Chandrakasan, and B. Nikolić, Digital Integrated Circuits, 2/e. Pearson Education, 2003.
[2] I. Koren, Computer Arithmetic Algorithms, ser. A K Peters Series. A K Peters, 2002.
[3] N. Weste and D. Harris, CMOS VLSI Design: A Circuits and Systems Perspective. Pearson/Addison-Wesley, 2005.

[4] R. Zimmermann, "Efficient VLSI implementation of modulo (2^n ± 1) addition and multiplication," in Computer Arithmetic, 1999. Proceedings. 14th IEEE Symposium on, 1999, pp. 158-167.
[5] X. Y. Yu, Y.-H. Chan, M. Kelly, and S. B. Curran, "A 5GHz+ 128-bit binary floating-point adder for the POWER6 processor," in Proc. of ESSCIRC, 2006, pp. 166-169.
[6] F. Liu, Q. Tan, G. Chen, X. Song, O. Ait Mohamed, and M. Gu, "Field programmable gate array prototyping of end-around carry parallel prefix tree architectures," Computers & Digital Techniques, IET, vol. 4, no. 4, pp. 306-316, July 2010.
[7] G. Dimitrakopoulos and D. Nikolos, "High-speed parallel-prefix VLSI Ling adders," Computers, IEEE Transactions on, vol. 54, no. 2, pp. 225-231, Feb. 2005.
[8] B. Zeydel, D. Baran, and V. Oklobdzija, "Energy-efficient design methodologies: High-performance VLSI adders," Solid-State Circuits, IEEE Journal of, vol. 45, no. 6, pp. 1220-1233, June 2010.
[9] M. Ciesielski, S. Askar, and S. Levitin, "Analytical approach to layout generation of datapath cells," Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on, vol. 21, no. 12, pp. 1480-1488, Dec. 2002.
[10] T. Tao Ye and G. De Micheli, "Data path placement with regularity," in Computer Aided Design, 2000. ICCAD-2000. IEEE/ACM International Conference on, 2000, pp. 264-270.
[11] J.-S. Yim and C.-M. Kyung, "Datapath layout optimisation using genetic algorithm and simulated annealing," Computers and Digital Techniques, IEE Proceedings, vol. 145, no. 2, pp. 135-141, Mar. 1998.
[12] X. Y. Zhang, Y.-H. Chan, R. K. Montoye, L. J. Sigal, E. M. Schwarz, and M. Kelly, "A 270ps 20mW 108-bit end-around carry adder for multiply-add fused floating point unit," Journal of Signal Processing Systems, vol. 58, pp. 139-144, 2010.


[13] J. Bruguera and T. Lang, "Floating-point fused multiply-add: reduced latency for floating-point addition," in Computer Arithmetic, 2005. ARITH-17 2005. 17th IEEE Symposium on, June 2005, pp. 42-51.
[14] D. Harris and S. Harris, Digital Design and Computer Architecture, ser. Morgan Kaufmann. Morgan Kaufmann Publishers, 2007.
[15] J. Chen and J. Stine, "Parallel prefix Ling structures for modulo 2^n - 1 addition," in Application-specific Systems, Architectures and Processors, 2009. ASAP 2009. 20th IEEE International Conference on, July 2009, pp. 16-23.
[16] G. Chen and F. Liu, "Proofs of correctness and properties of integer adder circuits," Computers, IEEE Transactions on, vol. 59, no. 1, pp. 134-136, Jan. 2010.
[17] J. Park, H. Ngo, J. Silberman, and S. Dhong, "470 ps 64-bit parallel binary adder [for CPU chip]," in VLSI Circuits, 2000. Digest of Technical Papers. 2000 Symposium on, 2000, pp. 192-193.
[18] D. Patil, O. Azizi, M. Horowitz, R. Ho, and R. Ananthraman, "Robust energy-efficient adder topologies," in Computer Arithmetic, 2007. ARITH '07. 18th IEEE Symposium on, June 2007, pp. 16-28.
[19] G. Dimitrakopoulos, P. Kolovos, P. Kalogerakis, and D. Nikolos, "Design of high-speed low-power parallel-prefix VLSI adders," in PATMOS '04, 2004, pp. 248-257.
[20] A. Weinberger and J. Smith, "A logic for high-speed addition," National Bureau of Standards, Circulation 591, pp. 3-12, 1958.
[21] D. Harris, "A taxonomy of parallel prefix networks," in Signals, Systems and Computers, 2003. Conference Record of the Thirty-Seventh Asilomar Conference on, vol. 2, Nov. 2003, pp. 2213-2217.
[22] P. M. Kogge and H. S. Stone, "A parallel algorithm for the efficient solution of a general class of recurrence equations," Computers, IEEE Transactions on, vol. C-22, no. 8, pp. 786-793, Aug. 1973.
[23] J. Sklansky, "Conditional-sum addition logic," Electronic Computers, IRE Transactions on, vol. EC-9, no. 2, pp. 226-231, June 1960.
[24] R. Brent and H. Kung, "A regular layout for parallel adders," Computers, IEEE Transactions on, vol. C-31, no. 3, pp. 260-264, March 1982.
[25] T. Han and D. A. Carlson, "Fast area-efficient VLSI adders," in IEEE Symposium on Computer Arithmetic, 1987.
[26] S. Knowles, "A family of adders," in Computer Arithmetic, 2001. Proceedings. 15th IEEE Symposium on, 2001, pp. 277-281.
[27] R. E. Ladner and M. J. Fischer, "Parallel prefix computation," Journal of the ACM, vol. 27, pp. 831-838, 1980.


[28] B. Zeydel, T. Kluter, and V. Oklobdzija, "Efficient mapping of addition recurrence algorithms in CMOS," in Computer Arithmetic, 2005. ARITH-17 2005. 17th IEEE Symposium on, June 2005, pp. 107-113.
[29] R. Doran, "Variants of an improved carry look-ahead adder," Computers, IEEE Transactions on, vol. 37, no. 9, pp. 1110-1113, Sep. 1988.
[30] H. Ling, "High-speed binary adder," IBM Journal of Research and Development, vol. 25, no. 3, pp. 156-166, March 1981.
[31] R. I. Tanaka, Residue Arithmetic and its Applications to Computer Technology, 1967.
[32] W. Jenkins and B. Leon, "The use of residue number systems in the design of finite impulse response digital filters," Circuits and Systems, IEEE Transactions on, vol. 24, no. 4, pp. 191-201, Apr. 1977.
[33] X. Lai and J. L. Massey, "A proposal for a new block encryption standard." Springer-Verlag, 1991, pp. 389-404.
[34] S.-S. Yau and Y.-C. Liu, "Error correction in redundant residue number systems," Computers, IEEE Transactions on, vol. C-22, no. 1, pp. 5-11, Jan. 1973.
[35] F. Halsall, Data Communications, Computer Networks and Open Systems (4th ed.). Redwood City, CA, USA: Addison Wesley Longman Publishing Co., Inc., 1995.
[36] V. Paliouras and T. Stouraitis, "Novel high-radix residue number system multipliers and adders," in Circuits and Systems, 1999. ISCAS '99. Proceedings of the 1999 IEEE International Symposium on, vol. 1, Jul. 1999, pp. 451-454.
[37] C. Efstathiou, H. Vergos, and D. Nikolos, "Modulo 2^n - 1 adder design using select-prefix blocks," Computers, IEEE Transactions on, vol. 52, no. 11, pp. 1399-1406, Nov. 2003.
[38] L. Kalampoukas, D. Nikolos, C. Efstathiou, H. Vergos, and J. Kalamatianos, "High-speed parallel-prefix modulo 2^n - 1 adders," Computers, IEEE Transactions on, vol. 49, no. 7, pp. 673-680, Jul. 2000.
[39] A. Beaumont-Smith and C.-C. Lim, "Parallel prefix adder design," in Computer Arithmetic, 2001. Proceedings. 15th IEEE Symposium on, 2001, pp. 218-225.
[40] E. M. Schwarz, High-Performance Energy-Efficient Microprocessor Design, ser. Series on Integrated Circuits and Systems. Springer, 2006, ch. Binary floating-point unit design.
[41] J. Shedletsky, "Comment on the sequential and indeterminate behavior of an end-around-carry adder," Computers, IEEE Transactions on, vol. C-26, no. 3, pp. 271-272, March 1977.
[42] F. Liu, X. Song, Q. Tan, and G. Chen, "Formal analysis of end-around-carry adder in floating-point unit," Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on, vol. 29, no. 10, pp. 1655-1659, Oct. 2010.
[43] A. Goldovsky, R. Kolagotla, C. Nicol, and M. Besz, "A 1.0-nsec 32-bit prefix tree adder in 0.25-µm static CMOS," in Circuits and Systems, 1999. 42nd Midwest Symposium on, vol. 2, 1999, pp. 608-612.


[44] R. Muralidharan and C.-H. Chang, "Hard multiple generator for higher radix modulo 2^n - 1 multiplication," in Integrated Circuits, ISIC '09. Proceedings of the 2009 12th International Symposium on, Dec. 2009, pp. 546-549.
[45] L. Kalampoukas, D. Nikolos, C. Efstathiou, H. Vergos, and J. Kalamatianos, "High-speed parallel-prefix modulo 2^n - 1 adders," Computers, IEEE Transactions on, vol. 49, no. 7, pp. 673-680, Jul. 2000.
[46] I. Sutherland, R. Sproull, and D. Harris, Logical Effort: Designing Fast CMOS Circuits, ser. The Morgan Kaufmann Series in Computer Architecture and Design. Morgan Kaufmann Publishers, 1999.
[47] D. Harris and I. Sutherland, "Logical effort of carry propagate adders," in Signals, Systems and Computers, 2003. Conference Record of the Thirty-Seventh Asilomar Conference on, vol. 1, Nov. 2003, pp. 873-878.
[48] A. Kabbani, D. Al-Khalili, and A. Al-Khalili, "Logical path delay distribution and transistor sizing," in IEEE-NEWCAS Conference, 2005. The 3rd International, June 2005, pp. 391-394.
[49] V. Oklobdzija, B. Zeydel, H. Dao, S. Mathew, and R. Krishnamurthy, "Energy-delay estimation technique for high-performance microprocessor VLSI adders," in Computer Arithmetic, 2003. Proceedings. 16th IEEE Symposium on, June 2003, pp. 272-279.
[50] F. Frustaci, M. Lanuzza, P. Zicari, S. Perri, and P. Corsonello, "Designing high-speed adders in power-constrained environments," Circuits and Systems II: Express Briefs, IEEE Transactions on, vol. 56, no. 2, pp. 172-176, Feb. 2009.
[51] R. Zlatanovici, S. Kao, and B. Nikolic, "Energy-delay optimization of 64-bit carry-lookahead adders with a 240 ps 90 nm CMOS design example," Solid-State Circuits, IEEE Journal of, vol. 44, no. 2, pp. 569-583, Feb. 2009.
[52] HSPICE, "The gold standard for accurate circuit simulation," http://www.synopsys.com/Tools/Verification/AMSVerification/CircuitSimulation/HSPICE/Pages/default.aspx.
[53] J. Stine, I. Castellanos, M. Wood, J. Henson, F. Love, W. Davis, P. Franzon, M. Bucher, S. Basavarajaiah, J. Oh, and R. Jenkal, "FreePDK: An open-source variation-aware design kit," in Microelectronic Systems Education, 2007. MSE '07. IEEE International Conference on, June 2007, pp. 173-174.
[54] NCSU, "45nm variant of the FreePDK process design kit," http://www.eda.ncsu.edu/wiki/FreePDK45:Contents.
[55] T. Jing, X.-L. Hong, Y.-C. Cai, J.-Y. Xu, C.-Q. Yang, Y.-Q. Zhang, Q. Zhou, and W. Wu, "Data-path layout design inside SoC," in Communications, Circuits and Systems and West Sino Expositions, IEEE 2002 International Conference on, vol. 2, June-July 2002, pp. 1406-1410.
[56] W. Dally and A. Chang, "The role of custom design in ASIC chips," in Design Automation Conference, 2000. Proceedings 2000. 37th, 2000, pp. 643-647.


[57] N. H. E. Weste and K. Eshraghian, Principles of CMOS VLSI Design: A Systems Perspective, 1993.
[58] W. Luk and A. Dean, "Multistack optimization for data-path chip layout," Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on, vol. 10, no. 1, pp. 116-129, Jan. 1991.

Appendix A

Verilog Code of the Proposed Adder


module Mod_2n_1_128b_temp(
    input  [127:0] A,
    input          sA,
    input  [127:0] B,
    input          sB,
    output [127:0] S,
    output         sS
);

wire [127:0] iB, iS;
wire Os;
wire [7:0] c, p, g;
wire [7:0] p1;   // p1 bus is referenced below but left undriven in this listing

xor X1 (Os, sA, sB);
assign iB = Os ? ~B : B;

// 16b front and 8b EAC
Mod2n_1_16b_wCI_CS_KS i0 (.A(A[15:0]),    .B(iB[15:0]),    .cin(c[7]), .pg({p[0],g[0]}), .S(iS[15:0]));
Mod2n_1_16b_wCI_CS_KS i1 (.A(A[31:16]),   .B(iB[31:16]),   .cin(c[0]), .pg({p[1],g[1]}), .S(iS[31:16]));
Mod2n_1_16b_wCI_CS_KS i2 (.A(A[47:32]),   .B(iB[47:32]),   .cin(c[1]), .pg({p[2],g[2]}), .S(iS[47:32]));
Mod2n_1_16b_wCI_CS_KS i3 (.A(A[63:48]),   .B(iB[63:48]),   .cin(c[2]), .pg({p[3],g[3]}), .S(iS[63:48]));
Mod2n_1_16b_wCI_CS_KS i4 (.A(A[79:64]),   .B(iB[79:64]),   .cin(c[3]), .pg({p[4],g[4]}), .S(iS[79:64]));
Mod2n_1_16b_wCI_CS_KS i5 (.A(A[95:80]),   .B(iB[95:80]),   .cin(c[4]), .pg({p[5],g[5]}), .S(iS[95:80]));
Mod2n_1_16b_wCI_CS_KS i6 (.A(A[111:96]),  .B(iB[111:96]),  .cin(c[5]), .pg({p[6],g[6]}), .S(iS[111:96]));
Mod2n_1_16b_wCI_CS_KS i7 (.A(A[127:112]), .B(iB[127:112]), .cin(c[6]), .pg({p[7],g[7]}), .S(iS[127:112]));

// 8b EAC logic
Mod2n_1_8b_wo_pg_KS EAC (.p(p[7:0]), .p1(p1[7:0]), .g(g[7:0]), .c(c[7:0]), .Os(Os));

assign S  = Os & (c[7]) ? iS : ~iS;
assign sS = c[7] ? sA : sB;

endmodule

////////////////////////////////////////////////////////////////////////////////////////////////////////////

module Mod2n_1_16b_wCI_CS_KS(
    input  [15:0] A,
    input  [15:0] B,
    input         cin,
    output        p1,
    output [1:0]  pg,
    output [15:0] S
);

wire [1:0] r1c7,  r1c6,  r1c5,  r1c4,  r1c3,  r1c2,  r1c1, r1c0;
wire [1:0] r1c15, r1c14, r1c13, r1c12, r1c11, r1c10, r1c9, r1c8;


pg16 ipg16(.A(A), .B(B), .pg15(r1c15),.pg14(r1c14),.pg13(r1c13),.pg12(r1c12), .pg11(r1c11),.pg10(r1c10),.pg9(r1c9),.pg8(r1c8), .pg7(r1c7),.pg6(r1c6),.pg5(r1c5),.pg4(r1c4), .pg3(r1c3),.pg2(r1c2),.pg1(r1c1),.pg0(r1c0)); wire [1:0] r2c15, r2c13, r2c11, r2c9, r2c7, r2c5, r2c3, r2c1; wire [1:0] r2c14, r2c12, r2c10, r2c8, r2c6, r2c4, r2c2; black black black black black black black black black black black black black black black ir1c15(.pg(r1c15), .pg0(r1c14), .pgo(r2c15)); ir1c14(.pg(r1c14), .pg0(r1c13), .pgo(r2c14)); ir1c13(.pg(r1c13), .pg0(r1c12), .pgo(r2c13)); ir1c12(.pg(r1c12), .pg0(r1c11), .pgo(r2c12)); ir1c11(.pg(r1c11), .pg0(r1c10), .pgo(r2c11)); ir1c10(.pg(r1c10), .pg0(r1c9), .pgo(r2c10)); ir1c9(.pg(r1c9), .pg0(r1c8), .pgo(r2c9)); ir1c8(.pg(r1c8), .pg0(r1c7), .pgo(r2c8)); ir1c7(.pg(r1c7), .pg0(r1c6), .pgo(r2c7)); ir1c6(.pg(r1c6), .pg0(r1c5), .pgo(r2c6)); ir1c5(.pg(r1c5), .pg0(r1c4), .pgo(r2c5)); ir1c4(.pg(r1c4), .pg0(r1c3), .pgo(r2c4)); ir1c3(.pg(r1c3), .pg0(r1c2), .pgo(r2c3)); ir1c2(.pg(r1c2), .pg0(r1c1), .pgo(r2c2)); ir1c1(.pg(r1c1), .pg0(r1c0), .pgo(r2c1));

wire [1:0] r3c15, r3c14, r3c11, r3c10, r3c7, r3c6, r3c3, r3c2; wire [1:0] r3c13, r3c12, r3c9, r3c8, r3c5, r3c4; black black black black black black black black black black black black black black ir2c15(.pg(r2c15), ir2c14(.pg(r2c14), ir2c13(.pg(r2c13), ir2c12(.pg(r2c12), ir2c11(.pg(r2c11), ir2c10(.pg(r2c10), ir2c9(.pg(r2c9), ir2c8(.pg(r2c8), ir2c7(.pg(r2c7), ir2c6(.pg(r2c6), ir2c5(.pg(r2c5), ir2c4(.pg(r2c4), ir2c3(.pg(r2c3), ir2c2(.pg(r2c2), .pg0(r2c13), .pg0(r2c12), .pg0(r2c11), .pg0(r2c10), .pg0(r2c9), .pg0(r2c8), .pg0(r2c7), .pg0(r2c6), .pg0(r2c5), .pg0(r2c4), .pg0(r2c3), .pg0(r2c2), .pg0(r2c1), .pg0(r1c0), .pgo(r3c15)); .pgo(r3c14)); .pgo(r3c13)); .pgo(r3c12)); .pgo(r3c11)); .pgo(r3c10)); .pgo(r3c9)); .pgo(r3c8)); .pgo(r3c7)); .pgo(r3c6)); .pgo(r3c5)); .pgo(r3c4)); .pgo(r3c3)); .pgo(r3c2));

wire [1:0] r4c15, r4c14, r4c13, r4c12, r4c11, r4c10, r4c9, r4c8; wire [1:0] r4c7, r4c6, r4c5, r4c4; black black black black black black black black black black ir3c15(.pg(r3c15), .pg0(r3c11), .pgo(r4c15)); ir3c14(.pg(r3c14), .pg0(r3c10), .pgo(r4c14)); ir3c13(.pg(r3c13), .pg0(r3c9), .pgo(r4c13)); ir3c12(.pg(r3c12), .pg0(r3c8), .pgo(r4c12)); ir3c11(.pg(r3c11), .pg0(r3c7), .pgo(r4c11)); ir3c10(.pg(r3c10), .pg0(r3c6), .pgo(r4c10)); ir3c9(.pg(r3c9), .pg0(r3c5), .pgo(r4c9)); ir3c8(.pg(r3c8), .pg0(r3c4), .pgo(r4c8)); ir3c7(.pg(r3c7), .pg0(r3c3), .pgo(r4c7)); ir3c6(.pg(r3c6), .pg0(r3c2), .pgo(r4c6));


black ir3c5(.pg(r3c5), .pg0(r2c1), .pgo(r4c5)); black ir3c4(.pg(r3c4), .pg0(r1c0), .pgo(r4c4)); wire [1:0] r5c15, r5c14, r5c13, r5c12, r5c11, r5c10, r5c9, r5c8; black black black black black black black black ir4c15(.pg(r4c15), .pg0(r4c7), .pgo(r5c15)); ir4c14(.pg(r4c14), .pg0(r4c6), .pgo(r5c14)); ir4c13(.pg(r4c13), .pg0(r4c5), .pgo(r5c13)); ir4c12(.pg(r4c12), .pg0(r4c4), .pgo(r5c12)); ir4c7(.pg(r4c11), .pg0(r3c3), .pgo(r5c11)); ir4c6(.pg(r4c10), .pg0(r3c2), .pgo(r5c10)); ir4c5(.pg(r4c9), .pg0(r2c1), .pgo(r5c9)); ir4c4(.pg(r4c8), .pg0(r1c0), .pgo(r5c8));

assign pg = r5c15; wire [15:0] r6c0, r6c1; Carry_Inc CIA0(.cin(1b0),.c7(r4c7),.c6(r4c6),.c5(r4c5),.c4(r4c4), .c3(r3c3),.c2(r3c2),.c1(r2c1),.c0(r1c0),.r1c(r6c0[7:0])); Carry_Inc CIA1(.cin(1b1),.c7(r4c7),.c6(r4c6),.c5(r4c5),.c4(r4c4), .c3(r3c3),.c2(r3c2),.c1(r2c1),.c0(r1c0),.r1c(r6c1[7:0])); Carry_Inc CIB0(.cin(1b0),.c7(r5c15),.c6(r5c14),.c5(r5c13),.c4(r5c12), .c3(r5c11),.c2(r5c10),.c1(r5c9),.c0(r5c8),.r1c(r6c0[15:8])); Carry_Inc CIB1(.cin(1b1),.c7(r5c15),.c6(r5c14),.c5(r5c13),.c4(r5c12), .c3(r5c11),.c2(r5c10),.c1(r5c9),.c0(r5c8),.r1c(r6c1[15:8])); wire [15:0] S1,S0; assign S0= {r6c0[14:0],1b0} ^ {r1c15[1],r1c14[1],r1c13[1],r1c12[1],r1c11[1],r1c10[1],r1c9[1],r1c8[1], r1c7[1],r1c6[1],r1c5[1],r1c4[1],r1c3[1],r1c2[1],r1c1[1],r1c0[1]}; assign S1= {r6c1[14:0],1b1} ^ {r1c15[1],r1c14[1],r1c13[1],r1c12[1],r1c11[1],r1c10[1],r1c9[1],r1c8[1], r1c7[1],r1c6[1],r1c5[1],r1c4[1],r1c3[1],r1c2[1],r1c1[1],r1c0[1]}; assign S = cin ? S1 : S0 ; endmodule //////////////////////////////////////////////////////////////////////////////////////////////////////////// module pg16 (A, B, pg15, pg14, pg13, pg12, pg11, pg10, pg9, pg8, pg7, pg6, pg5, pg4, pg3, pg2, pg1, pg0); input [15:0] A, B; output [1:0] pg15, pg14, pg13, pg12, pg11, pg10, pg9, pg8, pg7, pg6, pg5, pg4, pg3, pg2, pg1, pg0; assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign assign pg15 pg14 pg13 pg12 pg11 pg10 pg9 pg8 pg7 pg6 pg5 pg4 pg3 pg2 pg1 pg0 = = = = = = = = = = = = = = = = {(A[15] {(A[14] {(A[13] {(A[12] {(A[11] {(A[10] {(A[9] {(A[8] {(A[7] {(A[6] {(A[5] {(A[4] {(A[3] {(A[2] {(A[1] {(A[0] ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ B[15]), B[14]), B[13]), B[12]), B[11]), B[10]), B[9]), B[8]), B[7]), B[6]), B[5]), B[4]), B[3]), B[2]), B[1]), B[0]), (A[15] (A[14] (A[13] (A[12] (A[11] (A[10] (A[9] (A[8] (A[7] (A[6] (A[5] (A[4] (A[3] (A[2] (A[1] (A[0] & & & & & & & & & & & & & & & & B[15])}; B[14])}; B[13])}; B[12])}; B[11])}; B[10])}; B[9])}; B[8])}; B[7])}; B[6])}; B[5])}; B[4])}; B[3])}; B[2])}; B[1])}; B[0])};


endmodule //////////////////////////////////////////////////////////////////////////////////////////////////////////// module black (pg, pg0, pgo); input [1:0] pg, pg0; output [1:0] pgo; assign pgo[1] = pg[1] & pg0[1]; assign pgo[0] = pg[0] | (pg0[0] & pg[1]) ; endmodule //////////////////////////////////////////////////////////////////////////////////////////////////////////// module Carry_Inc( input cin, input [1:0] c7, c6, c5, c4, c3, c2, c1, c0, output [7:0] r1c ); // Carry Increment Stage gray gray gray gray gray gray gray gray ic7(.pg(c7), ic6(.pg(c6), ic5(.pg(c5), ic4(.pg(c4), ic3(.pg(c3), ic2(.pg(c2), ic1(.pg(c1), ic0(.pg(c0), .pg0(cin), .pg0(cin), .pg0(cin), .pg0(cin), .pg0(cin), .pg0(cin), .pg0(cin), .pg0(cin), .pgo(r1c[7])); .pgo(r1c[6])); .pgo(r1c[5])); .pgo(r1c[4])); .pgo(r1c[3])); .pgo(r1c[2])); .pgo(r1c[1])); .pgo(r1c[0]));

endmodule //////////////////////////////////////////////////////////////////////////////////////////////////////////// module gray (pg, pg0, pgo); input [1:0] pg; input pg0; output pgo; assign pgo = (pg0 & pg[1]) | pg[0]; endmodule //////////////////////////////////////////////////////////////////////////////////////////////////////////// module Mod2n_1_8b_wo_pg_KS( input [7:0] p, input [7:0] g, input Os, input [7:0] p1, output [7:0] c ); wire [1:0] r1c7, r1c6, r1c5, r1c4, r1c3, r1c2, r1c1, r1c0; assign assign assign assign assign assign assign assign r1c7 r1c6 r1c5 r1c4 r1c3 r1c2 r1c1 r1c0 = = = = = = = = {p[7],g[7]}; {p[6],g[6]}; {p[5],g[5]}; {p[4],g[4]}; {p[3],g[3]}; {p[2],g[2]}; {p[1],g[1]}; {p[0],g[0]};

wire [1:0] r2c7, r2c6, r2c5, r2c4, r2c3, r2c2, r2c1;


black black black black black black black

ir1c7(.pg(r1c7), ir1c6(.pg(r1c6), ir1c5(.pg(r1c5), ir1c4(.pg(r1c4), ir1c3(.pg(r1c3), ir1c2(.pg(r1c2), ir1c1(.pg(r1c1),

.pg0(r1c6), .pg0(r1c5), .pg0(r1c4), .pg0(r1c3), .pg0(r1c2), .pg0(r1c1), .pg0(r1c0),

.pgo(r2c7)); .pgo(r2c6)); .pgo(r2c5)); .pgo(r2c4)); .pgo(r2c3)); .pgo(r2c2)); .pgo(r2c1));

wire [1:0] r3c7, r3c6, r3c5, r3c4, r3c3, r3c2; black black black black black black ir2c7(.pg(r2c7), ir2c6(.pg(r2c6), ir2c5(.pg(r2c5), ir2c4(.pg(r2c4), ir2c3(.pg(r2c3), ir2c2(.pg(r2c2), .pg0(r2c5), .pg0(r2c4), .pg0(r2c3), .pg0(r2c2), .pg0(r2c1), .pg0(r1c0), .pgo(r3c7)); .pgo(r3c6)); .pgo(r3c5)); .pgo(r3c4)); .pgo(r3c3)); .pgo(r3c2));

wire [1:0] r4c7, r4c6, r4c5, r4c4; black black black black ir3c7(.pg(r3c7), ir3c6(.pg(r3c6), ir3c5(.pg(r3c5), ir3c4(.pg(r3c4), .pg0(r3c3), .pg0(r3c2), .pg0(r2c1), .pg0(r1c0), .pgo(r4c7)); .pgo(r4c6)); .pgo(r4c5)); .pgo(r4c4));

wire r5c7, r5c6, r5c5, r5c4, r5c3, r5c2, r5c1, r5c0; // End Around Carry Stage assign c0 = r4c7[0] &Os ; gray ir4c6(.pg(r4c6), .pg0(c0), .pgo(r5c6)); gray ir4c5(.pg(r4c5), .pg0(c0), .pgo(r5c5)); gray ir4c4(.pg(r4c4), .pg0(c0), .pgo(r5c4)); gray ir4c3(.pg(r3c3), .pg0(c0), .pgo(r5c3)); gray ir4c2(.pg(r3c2), .pg0(c0), .pgo(r5c2)); gray ir4c1(.pg(r2c1), .pg0(c0), .pgo(r5c1)); gray ir4c0(.pg(r1c0), .pg0(c0), .pgo(r5c0)); assign c = {r4c7[0],r5c6,r5c5,r5c4,r5c3,r5c2,r5c1,r5c0}; endmodule

Appendix B

HSPICE Simulation Files


B.1 Cells

Cell library for simulation. * Cells .subckt inv in out length=0.05u width=0.09u m1 out in vdd vdd PMOS_VTL l=length w=2.5*width m2 out in gnd gnd NMOS_VTL l=length w=width .ends inv .subckt nand2 in1 in2 out length=0.05u width=0.09u m1 out in2 vdd vdd PMOS_VTL L=length W=2.5*width m2 out in1 vdd vdd PMOS_VTL L=length W=2.5*width m3 out in1 1 1 NMOS_VTL L=length W=2*width m4 1 in2 gnd gnd NMOS_VTL L=length W=2*width .ends nand2 .subckt nor2 in1 in2 out length=0.05u width=0.09u m1 out in2 1 1 PMOS_VTL L=length W=5*width m2 1 in1 vdd vdd PMOS_VTL L=length W=5*width m3 out in1 gnd gnd NMOS_VTL L=length W=1*width m4 out in2 gnd gnd NMOS_VTL L=length W=1*width .ends nor2 .subckt tg in pctrl nctrl out length=0.05u width=0.09u m0 in pctrl out vdd PMOS_VTL L=length W=1*width m1 in nctrl out gnd NMOS_VTL L=length W=1*width .ends tg .subckt xor2 in1 in2 out length=0.05u width=0.09u x01 in1 out1 inv L=length W=1*width x02 in2 out2 inv L=length W=1*width x03 in1 in2 out2 out tg L=length W=1*width x04 out1 out2 in2 out tg L=length W=1*width .ends xor2 .subckt xnor2 in1 in2 out length=0.05u width=0.09u x01 in2 out2 inv L=length W=2*width x02 in1 out2 in2 out tg L=length W=1*width m1 out in1 out2 vdd PMOS_VTL L=length W=2*2.5*1*width m2 out in1 in2 gnd NMOS_VTL L=length W=2*width


.ends xnor2

.subckt AOI A B C F length=0.05u width=0.09u
m01 F A 2 2 PMOS_VTL L=length W=5*width
m02 F B 2 2 PMOS_VTL L=length W=5*width
m03 2 C vdd vdd PMOS_VTL L=length W=5*width
m04 F A 1 1 NMOS_VTL L=length W=2*width
m05 1 B gnd gnd NMOS_VTL L=length W=2*width
m06 F C gnd gnd NMOS_VTL L=length W=1*width
.ends AOI

.subckt mux2 in1 in2 sel out length=0.05u width=0.09u
x01 sel nsel inv L=length W=1*width
x02 in1 sel nsel out tg L=length W=1*width
x03 in2 nsel sel out tg L=length W=1*width
.ends mux2

.subckt OAI A B C F length=0.05u width=0.09u
m01 F A 2 2 PMOS_VTL L=length W=5*width
m02 2 B vdd vdd PMOS_VTL L=length W=5*width
m03 F C vdd vdd PMOS_VTL L=length W=2.5*width
m04 F A 1 1 NMOS_VTL L=length W=2*width
* drain of m05 assumed to be node F (printed as "F1" in the extracted listing)
m05 F B 1 1 NMOS_VTL L=length W=2*width
m06 1 C gnd gnd NMOS_VTL L=length W=2*width
.ends OAI

B.2 Simulation Code

* source CRITICAL PATH
* 16x8 EAC KS adder
.include NMOS_VTL_.inc
.include PMOS_VTL_.inc
.include cells.cir

.global vdd gnd
.connect gnd 0

.PARAM W=0.05u L=0.05u N=8 M=4
.PARAM vdd = 1.0V

vdd vdd 0 vdd

V_V1 in0 0
+PULSE 0 1 0 150p 150p 0.5n 3n

X_UI1 in0 in1 inv length=L width=2*W
X_UI2 in1 in inv length=L width=2*W
.connect in sA

X_U1_1 sA n1 inv length=L width=f1*W
X_U1_2 gnd n2 inv length=L width=f1*W
X_U1_3 sA gnd n2 Os tg length=L width=4*W
X_U1_4 n1 n2 gnd Os tg length=L width=4*W

X_U2_1 Os n3 inv length=L width=f2*W


X_U2_2 vdd Os n3 1 tg length=L width=4*W
X_U2_3 gnd n3 Os 1 tg length=L width=4*W

X_U3    1 vdd 2 nand2 length=L width=f3*W
X_U4    2 gnd vdd 3  OAI length=L width=f4*W
X_U4_1  2 gnd vdd 51 OAI length=L width=f4*W
X_U5    3 vdd gnd 4  AOI length=L width=f5*W
X_U5_1  3 vdd gnd 52 AOI length=L width=f5*W
X_U6    4 gnd vdd 5  OAI length=L width=f6*W
X_U6_1  4 gnd vdd 53 OAI length=L width=f6*W
X_U7    5 vdd gnd 6  AOI length=L width=f7*W
X_U8    6 gnd vdd 7  OAI length=L width=f8*W
X_U8_1  6 gnd vdd 54 OAI length=L width=f8*W
X_U9    7 vdd gnd 8  AOI length=L width=f9*W
X_U9_1  7 vdd gnd 55 AOI length=L width=f9*W
X_U10   8 gnd vdd 9  OAI length=L width=f10*W
X_U11   9 vdd 10 nand2 length=L width=f11*W
X_U11_1 9 vdd 50 nand2 length=L width=f11*W
X_U12   10 gnd vdd 11 OAI length=L width=f12*W
X_U12_1 10 gnd vdd 35 OAI length=L width=f12*W
X_U12_2 10 gnd vdd 36 OAI length=L width=f12*W
X_U12_3 10 gnd vdd 37 OAI length=L width=f12*W

X_U13_1 11 n4 inv length=L width=f13*W X_U13_2 gnd 11 n4 12 tg length=L width=f13*W X_U13_3 vdd n4 11 12 tg length=L width=f13*W X_U14_1 X_U14_4 X_U14_2 X_U14_3 12 n6 inv length=L width=f14*W vdd n5 inv length=L width=f14*W 12 vdd n5 s75 tg length=L width=f14*W n6 n5 vdd s75 tg length=L width=f14*W

X_UO s75 out inv length=L width=2*W .PARAM f1=2 f2=2 f3=2.5 f4=2 f5=1.6 f6=1.3 f7=2 f8=1.6 f9=1.3 f10=2 f11=2.5 f12=0.8 f13=2 f14=2 .tran 0.1p 100n .option post=2 nomod LIST .meas tran tplh_inr trig v(in) td=70n val=vdd/2 cross=1 targ v(s75) td=70n val=vdd/2 cross=1 .meas tran tplh_inf trig v(in) td=71n val=vdd/2 cross=1 targ v(s75) td=71n val=vdd/2 cross=1 ************************************************************************************************ .meas tran tplh_inr_Os_in trig v(in) td=70n val=vdd/2 cross=1 targ v(Os) td=70n val=vdd/2 cross=1 .meas tran tplh_inf_Os_in trig v(in) td=71n val=vdd/2 cross=1 targ v(Os) td=71n val=vdd/2 cross=1 .meas tran tplh_inr_1_Os trig v(Os) td=70n val=vdd/2 cross=1 targ v(1) td=70n val=vdd/2 cross=1 .meas tran tplh_inf_1_Os trig v(Os) td=71n val=vdd/2 cross=1 targ v(1) td=71n val=vdd/2 cross=1 .meas tran tplh_inr_2_1 trig v(1) td=70n val=vdd/2 cross=1 targ v(2) td=70n val=vdd/2 cross=1


.meas tran tplh_inf_2_1 trig v(1) td=71n val=vdd/2 cross=1 targ v(2) td=71n val=vdd/2 cross=1 .meas tran tplh_inr_3_2 trig v(2) td=70n val=vdd/2 cross=1 targ v(3) td=70n val=vdd/2 cross=1 .meas tran tplh_inf_3_2 trig v(2) td=71n val=vdd/2 cross=1 targ v(3) td=71n val=vdd/2 cross=1 .meas tran tplh_inr_4_3 trig v(3) td=70n val=vdd/2 cross=1 targ v(4) td=70n val=vdd/2 cross=1 .meas tran tplh_inf_4_3 trig v(3) td=71n val=vdd/2 cross=1 targ v(4) td=71n val=vdd/2 cross=1 .meas tran tplh_inr_5_4 trig v(4) td=70n val=vdd/2 cross=1 targ v(5) td=70n val=vdd/2 cross=1 .meas tran tplh_inf_5_4 trig v(4) td=71n val=vdd/2 cross=1 targ v(5) td=71n val=vdd/2 cross=1 .meas tran tplh_inr_6_5 trig v(5) td=70n val=vdd/2 cross=1 targ v(6) td=70n val=vdd/2 cross=1 .meas tran tplh_inf_6_5 trig v(5) td=71n val=vdd/2 cross=1 targ v(6) td=71n val=vdd/2 cross=1 .meas tran tplh_inr_7_6 trig v(6) td=70n val=vdd/2 cross=1 targ v(7) td=70n val=vdd/2 cross=1 .meas tran tplh_inf_7_6 trig v(6) td=71n val=vdd/2 cross=1 targ v(7) td=71n val=vdd/2 cross=1 .meas tran tplh_inr_8_7 trig v(7) td=70n val=vdd/2 cross=1 targ v(8) td=70n val=vdd/2 cross=1 .meas tran tplh_inf_8_7 trig v(7) td=71n val=vdd/2 cross=1 targ v(8) td=71n val=vdd/2 cross=1 .meas tran tplh_inr_9_8 trig v(8) td=70n val=vdd/2 cross=1 targ v(9) td=70n val=vdd/2 cross=1 .meas tran tplh_inf_9_8 trig v(8) td=71n val=vdd/2 cross=1 targ v(9) td=71n val=vdd/2 cross=1 .meas tran tplh_inr_10_9 trig v(9) td=70n val=vdd/2 cross=1 targ v(10) td=70n val=vdd/2 cross=1 .meas tran tplh_inf_10_9 trig v(9) td=71n val=vdd/2 cross=1 targ v(10) td=71n val=vdd/2 cross=1 .meas tran tplh_inr_11_10 trig v(10) td=70n val=vdd/2 cross=1 targ v(11) td=70n val=vdd/2 cross=1 .meas tran tplh_inf_11_10 trig v(10) td=71n val=vdd/2 cross=1 targ v(11) td=71n val=vdd/2 cross=1 .meas tran tplh_inr_12_11 trig v(11) td=70n val=vdd/2 cross=1 targ v(12) td=70n val=vdd/2 cross=1 .meas tran tplh_inf_12_11 trig v(11) td=71n val=vdd/2 cross=1 targ v(12) td=71n val=vdd/2 cross=1 .meas tran tplh_inr_s75_12 trig v(12) td=70n val=vdd/2 cross=1 targ v(s75) td=70n val=vdd/2 cross=1 .meas tran tplh_inf_s75_12 trig v(12) td=71n val=vdd/2 cross=1 targ v(s75) td=71n val=vdd/2 cross=1 ************************************************************************************************* .PRINT POWER .MEASURE TRAN avg_power AVG POWER from 0ns to 100ns ******** alterations .alter case 2: .TEMP 100 .alter case 3: .TEMP 25 .param vdd=1.2V .alter case 4: .TEMP 100 .param vdd=1.2V .alter case 5: .TEMP 25 .param vdd=0.8V .alter case 6: .TEMP 100 .param vdd=0.8V .end

B.3 Condition of Transistors
1:m1 0:in1 0:in0 0:vdd 1:m2 0:in1 0:in0 0:0 2:m1 0:in 0:in1 0:vdd 2:m2 0:in 0:in1 0:0

Transistor conditions under 25 C and 1.0V supply: element name drain gate source


bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb

0:vdd 0:pmos_vtl 240.0000n 22.5000n 0. 0. 10.0000f 19.7699a 124.7500a 199.6000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 3:m1 0:n1 0:in 0:vdd 0:vdd 0:pmos_vtl 240.0000n 22.5000n 0. 0. 10.0000f 19.7699a 124.7500a 199.6000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000

0:0 0:nmos_vtl 90.0000n 22.5000n 0. 0. 10.0000f 7.9079a 49.9000a 79.8400a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 3:m2 0:n1 0:in 0:0 0:0 0:nmos_vtl 90.0000n 22.5000n 0. 0. 10.0000f 7.9079a 49.9000a 79.8400a 25.0000 1.0000 0. 15.0000 15.0000 5.0000

0:vdd 0:pmos_vtl 240.0000n 22.5000n 0. 0. 10.0000f 19.7699a 124.7500a 199.6000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 4:m1 0:n2 0:0 0:vdd 0:vdd 0:pmos_vtl 240.0000n 22.5000n 0. 0. 10.0000f 19.7699a 124.7500a 199.6000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000

0:0 0:nmos_vtl 90.0000n 22.5000n 0. 0. 10.0000f 7.9079a 49.9000a 79.8400a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 4:m2 0:n2 0:0 0:0 0:0 0:nmos_vtl 90.0000n 22.5000n 0. 0. 10.0000f 7.9079a 49.9000a 79.8400a 25.0000 1.0000 0. 15.0000 15.0000 5.0000


rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff

15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 5:m0 0:in 0:0 0:os 0:vdd 0:pmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0.

15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 5:m1 0:in 0:n2 0:os 0:0 0:nmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0.

15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 6:m0 0:n1 0:n2 0:os 0:vdd 0:pmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0.

15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 6:m1 0:n1 0:0 0:os 0:0 0:nmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0.


sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp

0. 7:m1 0:n3 0:os 0:vdd 0:vdd 0:pmos_vtl 240.0000n 22.5000n 0. 0. 10.0000f 19.7699a 124.7500a 199.6000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 9:m0 0:0 0:n3 0:1 0:vdd 0:pmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000

0. 7:m2 0:n3 0:os 0:0 0:0 0:nmos_vtl 90.0000n 22.5000n 0. 0. 10.0000f 7.9079a 49.9000a 79.8400a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 9:m1 0:0 0:os 0:1 0:0 0:nmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000

0. 8:m0 0:vdd 0:os 0:1 0:vdd 0:pmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 10:m1 0:2 0:vdd 0:vdd 0:vdd 0:pmos_vtl 302.5000n 22.5000n 0. 0. 10.0000f 24.7123a 155.9375a 249.5000a 25.0000

0. 8:m1 0:vdd 0:n3 0:1 0:0 0:nmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 10:m2 0:2 0:1 0:vdd 0:vdd 0:pmos_vtl 302.5000n 22.5000n 0. 0. 10.0000f 24.7123a 155.9375a 249.5000a 25.0000


aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1

1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 10:m3 0:2 0:1 10:1 10:1 0:nmos_vtl 240.0000n 22.5000n 0. 0. 10.0000f 19.7699a 124.7500a 199.6000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0.

1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 10:m4 10:1 0:vdd 0:0 0:0 0:nmos_vtl 240.0000n 22.5000n 0. 0. 10.0000f 19.7699a 124.7500a 199.6000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0.

1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 11:m01 0:3 0:2 11:2 11:2 0:pmos_vtl 490.0000n 22.5000n 0. 0. 10.0000f 39.5397a 249.5000a 399.2000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0.

1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 11:m02 11:2 0:0 0:vdd 0:vdd 0:pmos_vtl 490.0000n 22.5000n 0. 0. 10.0000f 39.5397a 249.5000a 399.2000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0.


delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff

0. 0. 0. 0. 0. 0. 0. 11:m03 0:3 0:vdd 0:vdd 0:vdd 0:pmos_vtl 240.0000n 22.5000n 0. 0. 10.0000f 19.7699a 124.7500a 199.6000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 12:m01 0:51 0:2 12:2 12:2 0:pmos_vtl 490.0000n 22.5000n 0.

0. 0. 0. 0. 0. 0. 0. 11:m04 0:3 0:2 11:1 11:1 0:nmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 12:m02 12:2 0:0 0:vdd 0:vdd 0:pmos_vtl 490.0000n 22.5000n 0.

0. 0. 0. 0. 0. 0. 0. 11:m05 11:f1 0:0 11:1 11:1 0:nmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 12:m03 0:51 0:vdd 0:vdd 0:vdd 0:pmos_vtl 240.0000n 22.5000n 0.

0. 0. 0. 0. 0. 0. 0. 11:m06 11:1 0:vdd 0:0 0:0 0:nmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 12:m04 0:51 0:2 12:1 12:1 0:nmos_vtl 190.0000n 22.5000n 0.


rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod

0. 10.0000f 39.5397a 249.5000a 399.2000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 12:m05 12:f1 0:0 12:1 12:1 0:nmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000

0. 10.0000f 39.5397a 249.5000a 399.2000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 12:m06 12:1 0:vdd 0:0 0:0 0:nmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000

0. 10.0000f 19.7699a 124.7500a 199.6000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 13:m01 0:4 0:3 13:2 13:2 0:pmos_vtl 390.0000n 22.5000n 0. 0. 10.0000f 31.6318a 199.6000a 319.3600a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000

0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 13:m02 0:4 0:vdd 13:2 13:2 0:pmos_vtl 390.0000n 22.5000n 0. 0. 10.0000f 31.6318a 199.6000a 319.3600a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000


rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff

1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 13:m03 13:2 0:0 0:vdd 0:vdd 0:pmos_vtl 390.0000n 22.5000n 0. 0. 10.0000f 31.6318a 199.6000a 319.3600a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0.

1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 13:m04 0:4 0:3 13:1 13:1 0:nmos_vtl 150.0000n 22.5000n 0. 0. 10.0000f 12.6527a 79.8400a 127.7440a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0.

1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 13:m05 13:1 0:vdd 0:0 0:0 0:nmos_vtl 150.0000n 22.5000n 0. 0. 10.0000f 12.6527a 79.8400a 127.7440a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0.

1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 13:m06 0:4 0:0 0:0 0:0 0:nmos_vtl 70.0000n 22.5000n 0. 0. 10.0000f 6.3264a 39.9200a 63.8720a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0.

element name

14:m01

14:m02

14:m03

14:m04


drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min

0:52 0:3 14:2 14:2 0:pmos_vtl 390.0000n 22.5000n 0. 0. 10.0000f 31.6318a 199.6000a 319.3600a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 14:m05 14:1 0:vdd 0:0 0:0 0:nmos_vtl 150.0000n 22.5000n 0. 0. 10.0000f 12.6527a 79.8400a 127.7440a 25.0000 1.0000 0.

0:52 0:vdd 14:2 14:2 0:pmos_vtl 390.0000n 22.5000n 0. 0. 10.0000f 31.6318a 199.6000a 319.3600a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 14:m06 0:52 0:0 0:0 0:0 0:nmos_vtl 70.0000n 22.5000n 0. 0. 10.0000f 6.3264a 39.9200a 63.8720a 25.0000 1.0000 0.

14:2 0:0 0:vdd 0:vdd 0:pmos_vtl 390.0000n 22.5000n 0. 0. 10.0000f 31.6318a 199.6000a 319.3600a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 15:m01 0:5 0:4 15:2 15:2 0:pmos_vtl 315.0000n 22.5000n 0. 0. 10.0000f 25.7008a 162.1750a 259.4800a 25.0000 1.0000 0.

0:52 0:3 14:1 14:1 0:nmos_vtl 150.0000n 22.5000n 0. 0. 10.0000f 12.6527a 79.8400a 127.7440a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 15:m02 15:2 0:0 0:vdd 0:vdd 0:pmos_vtl 315.0000n 22.5000n 0. 0. 10.0000f 25.7008a 162.1750a 259.4800a 25.0000 1.0000 0.


rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa

15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 15:m03 0:5 0:vdd 0:vdd 0:vdd 0:pmos_vtl 152.5000n 22.5000n 0. 0. 10.0000f 12.8504a 81.0875a 129.7400a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0.

15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 15:m04 0:5 0:4 15:1 15:1 0:nmos_vtl 120.0000n 22.5000n 0. 0. 10.0000f 10.2803a 64.8700a 103.7920a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0.

15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 15:m05 15:f1 0:0 15:1 15:1 0:nmos_vtl 120.0000n 22.5000n 0. 0. 10.0000f 10.2803a 64.8700a 103.7920a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0.

15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 15:m06 15:1 0:vdd 0:0 0:0 0:nmos_vtl 120.0000n 22.5000n 0. 0. 10.0000f 10.2803a 64.8700a 103.7920a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0.

APPENDIX B. HSPICE SIMULATION FILES

81

sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat

0. 0. 0. 0. 16:m01 0:53 0:4 16:2 16:2 0:pmos_vtl 315.0000n 22.5000n 0. 0. 10.0000f 25.7008a 162.1750a 259.4800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 16:m05 16:f1 0:0 16:1 16:1 0:nmos_vtl 120.0000n 22.5000n 0. 0. 10.0000f 10.2803a

0. 0. 0. 0. 16:m02 16:2 0:0 0:vdd 0:vdd 0:pmos_vtl 315.0000n 22.5000n 0. 0. 10.0000f 25.7008a 162.1750a 259.4800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 16:m06 16:1 0:vdd 0:0 0:0 0:nmos_vtl 120.0000n 22.5000n 0. 0. 10.0000f 10.2803a

0. 0. 0. 0. 16:m03 0:53 0:vdd 0:vdd 0:vdd 0:pmos_vtl 152.5000n 22.5000n 0. 0. 10.0000f 12.8504a 81.0875a 129.7400a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 17:m01 0:6 0:5 17:2 17:2 0:pmos_vtl 490.0000n 22.5000n 0. 0. 10.0000f 39.5397a

0. 0. 0. 0. 16:m04 0:53 0:4 16:1 16:1 0:nmos_vtl 120.0000n 22.5000n 0. 0. 10.0000f 10.2803a 64.8700a 103.7920a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 17:m02 0:6 0:vdd 17:2 17:2 0:pmos_vtl 490.0000n 22.5000n 0. 0. 10.0000f 39.5397a

APPENDIX B. HSPICE SIMULATION FILES

82

capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod

64.8700a 103.7920a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 17:m03 17:2 0:0 0:vdd 0:vdd 0:pmos_vtl 490.0000n 22.5000n 0. 0. 10.0000f 39.5397a 249.5000a 399.2000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0.

64.8700a 103.7920a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 17:m04 0:6 0:5 17:1 17:1 0:nmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0.

249.5000a 399.2000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 17:m05 17:1 0:vdd 0:0 0:0 0:nmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0.

249.5000a 399.2000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 17:m06 0:6 0:0 0:0 0:0 0:nmos_vtl 90.0000n 22.5000n 0. 0. 10.0000f 7.9079a 49.9000a 79.8400a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0.

APPENDIX B. HSPICE SIMULATION FILES

83

delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model

0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 18:m01 0:7 0:6 18:2 18:2 0:pmos_vtl 390.0000n 22.5000n 0. 0. 10.0000f 31.6318a 199.6000a 319.3600a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 18:m05 18:f1 0:0 18:1 18:1 0:nmos_vtl

0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 18:m02 18:2 0:0 0:vdd 0:vdd 0:pmos_vtl 390.0000n 22.5000n 0. 0. 10.0000f 31.6318a 199.6000a 319.3600a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 18:m06 18:1 0:vdd 0:0 0:0 0:nmos_vtl

0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 18:m03 0:7 0:vdd 0:vdd 0:vdd 0:pmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 19:m01 0:54 0:6 19:2 19:2 0:pmos_vtl

0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 18:m04 0:7 0:6 18:1 18:1 0:nmos_vtl 150.0000n 22.5000n 0. 0. 10.0000f 12.6527a 79.8400a 127.7440a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 19:m02 19:2 0:0 0:vdd 0:vdd 0:pmos_vtl

APPENDIX B. HSPICE SIMULATION FILES

84

w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd

150.0000n 22.5000n 0. 0. 10.0000f 12.6527a 79.8400a 127.7440a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 19:m03 0:54 0:vdd 0:vdd 0:vdd 0:pmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000

150.0000n 22.5000n 0. 0. 10.0000f 12.6527a 79.8400a 127.7440a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 19:m04 0:54 0:6 19:1 19:1 0:nmos_vtl 150.0000n 22.5000n 0. 0. 10.0000f 12.6527a 79.8400a 127.7440a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000

390.0000n 22.5000n 0. 0. 10.0000f 31.6318a 199.6000a 319.3600a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 19:m05 19:f1 0:0 19:1 19:1 0:nmos_vtl 150.0000n 22.5000n 0. 0. 10.0000f 12.6527a 79.8400a 127.7440a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000

390.0000n 22.5000n 0. 0. 10.0000f 31.6318a 199.6000a 319.3600a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 19:m06 19:1 0:vdd 0:0 0:0 0:nmos_vtl 150.0000n 22.5000n 0. 0. 10.0000f 12.6527a 79.8400a 127.7440a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000

APPENDIX B. HSPICE SIMULATION FILES

85

trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff

0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 20:m01 0:8 0:7 20:2 20:2 0:pmos_vtl 315.0000n 22.5000n 0. 0. 10.0000f 25.7008a 162.1750a 259.4800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0.

0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 20:m02 0:8 0:vdd 20:2 20:2 0:pmos_vtl 315.0000n 22.5000n 0. 0. 10.0000f 25.7008a 162.1750a 259.4800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0.

0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 20:m03 20:2 0:0 0:vdd 0:vdd 0:pmos_vtl 315.0000n 22.5000n 0. 0. 10.0000f 25.7008a 162.1750a 259.4800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0.

0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 20:m04 0:8 0:7 20:1 20:1 0:nmos_vtl 120.0000n 22.5000n 0. 0. 10.0000f 10.2803a 64.8700a 103.7920a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0.

APPENDIX B. HSPICE SIMULATION FILES

86

element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf

20:m05 20:1 0:vdd 0:0 0:0 0:nmos_vtl 120.0000n 22.5000n 0. 0. 10.0000f 10.2803a 64.8700a 103.7920a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 21:m03 21:2 0:0 0:vdd 0:vdd 0:pmos_vtl 315.0000n 22.5000n 0. 0. 10.0000f 25.7008a 162.1750a 259.4800a 25.0000 1.0000

20:m06 0:8 0:0 0:0 0:0 0:nmos_vtl 55.0000n 22.5000n 0. 0. 10.0000f 5.1402a 32.4350a 51.8960a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 21:m04 0:55 0:7 21:1 21:1 0:nmos_vtl 120.0000n 22.5000n 0. 0. 10.0000f 10.2803a 64.8700a 103.7920a 25.0000 1.0000

21:m01 0:55 0:7 21:2 21:2 0:pmos_vtl 315.0000n 22.5000n 0. 0. 10.0000f 25.7008a 162.1750a 259.4800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 21:m05 21:1 0:vdd 0:0 0:0 0:nmos_vtl 120.0000n 22.5000n 0. 0. 10.0000f 10.2803a 64.8700a 103.7920a 25.0000 1.0000

21:m02 0:55 0:vdd 21:2 21:2 0:pmos_vtl 315.0000n 22.5000n 0. 0. 10.0000f 25.7008a 162.1750a 259.4800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 21:m06 0:55 0:0 0:0 0:0 0:nmos_vtl 55.0000n 22.5000n 0. 0. 10.0000f 5.1402a 32.4350a 51.8960a 25.0000 1.0000

APPENDIX B. HSPICE SIMULATION FILES

87

min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox

0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 22:m01 0:9 0:8 22:2 22:2 0:pmos_vtl 490.0000n 22.5000n 0. 0. 10.0000f 39.5397a 249.5000a 399.2000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0.

0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 22:m02 22:2 0:0 0:vdd 0:vdd 0:pmos_vtl 490.0000n 22.5000n 0. 0. 10.0000f 39.5397a 249.5000a 399.2000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0.

0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 22:m03 0:9 0:vdd 0:vdd 0:vdd 0:pmos_vtl 240.0000n 22.5000n 0. 0. 10.0000f 19.7699a 124.7500a 199.6000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0.

0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 22:m04 0:9 0:8 22:1 22:1 0:nmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0.

APPENDIX B. HSPICE SIMULATION FILES

88

sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat

0. 0. 0. 0. 0. 22:m05 22:f1 0:0 22:1 22:1 0:nmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 23:m3 0:10 0:9 23:1 23:1 0:nmos_vtl 240.0000n 22.5000n 0. 0. 10.0000f

0. 0. 0. 0. 0. 22:m06 22:1 0:vdd 0:0 0:0 0:nmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 23:m4 23:1 0:vdd 0:0 0:0 0:nmos_vtl 240.0000n 22.5000n 0. 0. 10.0000f

0. 0. 0. 0. 0. 23:m1 0:10 0:vdd 0:vdd 0:vdd 0:pmos_vtl 302.5000n 22.5000n 0. 0. 10.0000f 24.7123a 155.9375a 249.5000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 24:m1 0:50 0:vdd 0:vdd 0:vdd 0:pmos_vtl 302.5000n 22.5000n 0. 0. 10.0000f

0. 0. 0. 0. 0. 23:m2 0:10 0:9 0:vdd 0:vdd 0:pmos_vtl 302.5000n 22.5000n 0. 0. 10.0000f 24.7123a 155.9375a 249.5000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 24:m2 0:50 0:9 0:vdd 0:vdd 0:pmos_vtl 302.5000n 22.5000n 0. 0. 10.0000f

APPENDIX B. HSPICE SIMULATION FILES

89

cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod

19.7699a 124.7500a 199.6000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 24:m3 0:50 0:9 24:1 24:1 0:nmos_vtl 240.0000n 22.5000n 0. 0. 10.0000f 19.7699a 124.7500a 199.6000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000

19.7699a 124.7500a 199.6000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 24:m4 24:1 0:vdd 0:0 0:0 0:nmos_vtl 240.0000n 22.5000n 0. 0. 10.0000f 19.7699a 124.7500a 199.6000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000

24.7123a 155.9375a 249.5000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 25:m01 0:11 0:10 25:2 25:2 0:pmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000

24.7123a 155.9375a 249.5000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 25:m02 25:2 0:0 0:vdd 0:vdd 0:pmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000

APPENDIX B. HSPICE SIMULATION FILES

90

rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk

0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 25:m03 0:11 0:vdd 0:vdd 0:vdd 0:pmos_vtl 90.0000n 22.5000n 0. 0. 10.0000f 7.9079a 49.9000a 79.8400a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 26:m01 0:35 0:10 26:2 26:2

0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 25:m04 0:11 0:10 25:1 25:1 0:nmos_vtl 70.0000n 22.5000n 0. 0. 10.0000f 6.3264a 39.9200a 63.8720a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 26:m02 26:2 0:0 0:vdd 0:vdd

0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 25:m05 25:f1 0:0 25:1 25:1 0:nmos_vtl 70.0000n 22.5000n 0. 0. 10.0000f 6.3264a 39.9200a 63.8720a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 26:m03 0:35 0:vdd 0:vdd 0:vdd

0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 25:m06 25:1 0:vdd 0:0 0:0 0:nmos_vtl 70.0000n 22.5000n 0. 0. 10.0000f 6.3264a 39.9200a 63.8720a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 26:m04 0:35 0:10 26:1 26:1

APPENDIX B. HSPICE SIMULATION FILES

91

model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps

0:pmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 26:m05 26:f1 0:0 26:1 26:1 0:nmos_vtl 70.0000n 22.5000n 0. 0. 10.0000f 6.3264a 39.9200a 63.8720a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000

0:pmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 26:m06 26:1 0:vdd 0:0 0:0 0:nmos_vtl 70.0000n 22.5000n 0. 0. 10.0000f 6.3264a 39.9200a 63.8720a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000

0:pmos_vtl 90.0000n 22.5000n 0. 0. 10.0000f 7.9079a 49.9000a 79.8400a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 27:m01 0:36 0:10 27:2 27:2 0:pmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000

0:nmos_vtl 70.0000n 22.5000n 0. 0. 10.0000f 6.3264a 39.9200a 63.8720a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 27:m02 27:2 0:0 0:vdd 0:vdd 0:pmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000

APPENDIX B. HSPICE SIMULATION FILES

92

rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff

15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 27:m03 0:36 0:vdd 0:vdd 0:vdd 0:pmos_vtl 90.0000n 22.5000n 0. 0. 10.0000f 7.9079a 49.9000a 79.8400a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0.

15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 27:m04 0:36 0:10 27:1 27:1 0:nmos_vtl 70.0000n 22.5000n 0. 0. 10.0000f 6.3264a 39.9200a 63.8720a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0.

15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 27:m05 27:f1 0:0 27:1 27:1 0:nmos_vtl 70.0000n 22.5000n 0. 0. 10.0000f 6.3264a 39.9200a 63.8720a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0.

15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 27:m06 27:1 0:vdd 0:0 0:0 0:nmos_vtl 70.0000n 22.5000n 0. 0. 10.0000f 6.3264a 39.9200a 63.8720a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0.

APPENDIX B. HSPICE SIMULATION FILES

93

element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic

28:m01 0:37 0:10 28:2 28:2 0:pmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 28:m05 28:f1 0:0 28:1 28:1 0:nmos_vtl 70.0000n 22.5000n 0. 0. 10.0000f 6.3264a 39.9200a 63.8720a 25.0000

28:m02 28:2 0:0 0:vdd 0:vdd 0:pmos_vtl 190.0000n 22.5000n 0. 0. 10.0000f 15.8159a 99.8000a 159.6800a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 28:m06 28:1 0:vdd 0:0 0:0 0:nmos_vtl 70.0000n 22.5000n 0. 0. 10.0000f 6.3264a 39.9200a 63.8720a 25.0000

28:m03 0:37 0:vdd 0:vdd 0:vdd 0:pmos_vtl 90.0000n 22.5000n 0. 0. 10.0000f 7.9079a 49.9000a 79.8400a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 29:m1 0:n4 0:11 0:vdd 0:vdd 0:pmos_vtl 240.0000n 22.5000n 0. 0. 10.0000f 19.7699a 124.7500a 199.6000a 25.0000

28:m04 0:37 0:10 28:1 28:1 0:nmos_vtl 70.0000n 22.5000n 0. 0. 10.0000f 6.3264a 39.9200a 63.8720a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 29:m2 0:n4 0:11 0:0 0:0 0:nmos_vtl 90.0000n 22.5000n 0. 0. 10.0000f 7.9079a 49.9000a 79.8400a 25.0000

APPENDIX B. HSPICE SIMULATION FILES

94

nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct

1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 30:m0 0:0 0:11 0:12 0:vdd 0:pmos_vtl 90.0000n 22.5000n 0. 0. 10.0000f 7.9079a 49.9000a 79.8400a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0.

1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 30:m1 0:0 0:n4 0:12 0:0 0:nmos_vtl 90.0000n 22.5000n 0. 0. 10.0000f 7.9079a 49.9000a 79.8400a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0.

1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 31:m0 0:vdd 0:n4 0:12 0:vdd 0:pmos_vtl 90.0000n 22.5000n 0. 0. 10.0000f 7.9079a 49.9000a 79.8400a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0.

1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 31:m1 0:vdd 0:11 0:12 0:0 0:nmos_vtl 90.0000n 22.5000n 0. 0. 10.0000f 7.9079a 49.9000a 79.8400a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0.

APPENDIX B. HSPICE SIMULATION FILES

95

deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff

0. 0. 0. 0. 0. 0. 32:m1 0:n6 0:12 0:vdd 0:vdd 0:pmos_vtl 240.0000n 22.5000n 0. 0. 10.0000f 19.7699a 124.7500a 199.6000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 34:m0 0:12 0:vdd 0:s75 0:vdd 0:pmos_vtl 90.0000n 22.5000n 0. 0.

0. 0. 0. 0. 0. 0. 32:m2 0:n6 0:12 0:0 0:0 0:nmos_vtl 90.0000n 22.5000n 0. 0. 10.0000f 7.9079a 49.9000a 79.8400a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 34:m1 0:12 0:n5 0:s75 0:0 0:nmos_vtl 90.0000n 22.5000n 0. 0.

0. 0. 0. 0. 0. 0. 33:m1 0:n5 0:vdd 0:vdd 0:vdd 0:pmos_vtl 240.0000n 22.5000n 0. 0. 10.0000f 19.7699a 124.7500a 199.6000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 35:m0 0:n6 0:n5 0:s75 0:vdd 0:pmos_vtl 90.0000n 22.5000n 0. 0.

0. 0. 0. 0. 0. 0. 33:m2 0:n5 0:vdd 0:0 0:0 0:nmos_vtl 90.0000n 22.5000n 0. 0. 10.0000f 7.9079a 49.9000a 79.8400a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 35:m1 0:n6 0:vdd 0:s75 0:0 0:nmos_vtl 90.0000n 22.5000n 0. 0.

APPENDIX B. HSPICE SIMULATION FILES

96

cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff element name drain gate source bulk model w eff l eff rd eff rs eff cdsat cssat capbd capbs temp aic nf min rbdb rbsb rbpb rbps rbpd trnqsmod acnqsmod rbodymod rgatemod

10.0000f 7.9079a 49.9000a 79.8400a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 36:m1 0:out 0:s75 0:vdd 0:vdd 0:pmos_vtl 240.0000n 22.5000n 0. 0. 10.0000f 19.7699a 124.7500a 199.6000a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000

10.0000f 7.9079a 49.9000a 79.8400a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0. 36:m2 0:out 0:s75 0:0 0:0 0:nmos_vtl 90.0000n 22.5000n 0. 0. 10.0000f 7.9079a 49.9000a 79.8400a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000

10.0000f 7.9079a 49.9000a 79.8400a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0.

10.0000f 7.9079a 49.9000a 79.8400a 25.0000 1.0000 0. 15.0000 15.0000 5.0000 15.0000 15.0000 0. 0. 1.0000 1.0000 1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0.

APPENDIX B. HSPICE SIMULATION FILES

97

geomod rgeomod delvto mulu0 delk1 delnfct deltox sa sb sd saeff sbeff

1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0.

1.0000 0. 0. 1.0000 0. 0. 0. 0. 0. 0. 0. 0.
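The element echo above is the per-device summary HSPICE writes to its .lis output for the transistors along the critical path. The fragment below is a minimal, hypothetical sketch of a deck that exercises one such stage; the model include path, supply voltage, pulse timing, and load capacitance are assumptions chosen for illustration only, while the model names (pmos_vtl, nmos_vtl) and the 390 nm / 150 nm widths are taken from the listing above.

* critical-path stage sketch (illustrative only, not the full thesis deck)
.include 'NCSU_FreePDK45/ncsu_basekit/models/hspice/hspice_nom.include'   $ assumed model path

.param supply=1.1                         $ assumed nominal FreePDK45 supply

vsup  vdd 0 dc 'supply'                   $ supply source
vin   in  0 pulse(0 'supply' 100p 10p 10p 400p 1n)   $ assumed input stimulus

* one inverting stage sized like 13:m03 (PMOS, 390 nm) and 13:m05 (NMOS, 150 nm)
mp1   out in vdd vdd pmos_vtl W=390n L=50n   $ drawn L = 50 nm (assumption)
mn1   out in 0   0   nmos_vtl W=150n L=50n

cload out 0 10f                           $ illustrative output load

.op                                       $ operating-point analysis
.tran 1p 2n
.measure tran tphl trig v(in) val='supply/2' rise=1 targ v(out) val='supply/2' fall=1
.option post
.end

A deck of this form reports per-element tables in the .lis file and the high-to-low propagation delay of the stage through the .measure statement.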
