Chapter 14 Arithmetic Circuits (I): Adder Designs Rev. 1.0 05/12/2003 Rev. 2.0 06/05/2003 Rev. 2.1 06/12/2003
Building Blocks for Digital Architectures Arithmetic and Unit Bit-sliced datapath (adder, multiplier, shifter, comparator, etc.) - Memory - RAM, ROM, Buffers, Shift registers Control - Finite state machine (PLA, random logic.) - Counters Interconnect - Switches - Arbiters - Bus
Intel Microprocessor Itanium has 6 integer execution units like this
Itanium Integer Datapath Fetzer, Orton, ISSCC’02
Several Implementations of Adders • One-Bit Full Adder (Cell) • Carry-Ripple Adder • Bit-Serial Adder • Mirror Adder • Transmission-Gate Adder • Manchester Adder • Carry lookahead Adder • Carry-Select Adder
Full-Adder (FA) Generate (G) = AB Propagate (P) = A B Å Delete = A B
Boolean Function of Binary Full-Adder CMOS Implementation
Express Sum and Carry as a function of P, G, D Define 3 new variable which ONLY depend on A, B Generate (G) = AB Propagate (P) = A B Å Delete = A B S C D and P Can also derive expressions for and based on o Note that we will be sometimes using an alternate definition for + Propagate (P) = A B
A B A B A B A B 0 0 1 1 2 2 3 3 C C C C C i ,0 o ,0 o ,1 o ,2 o ,3 FA FA FA FA = ( C ) i ,1 S S S S 0 1 2 3 Carry-Ripple Adder Critical Path Worst-case delayis linear with the number of bits tadder = (N-1)tcarry + tsum td = O(N) • Propagation delay (or critical path) is the worst-case delay over all possible input patterns • A= 0001, B=1111, trigger the worst-case delay • A: 0 1, and B= 1111 fixed to set up the worst-case delay transition.
Complimentary Static CMOS Full Adder 28 Transistors • Logic effort of Ci is reduced to 2 (c.f., A and B signals) • Ci is late arrival signal near the output signal • Co needs to be inverted Slow down the ripple propagate
Minimize Critical Path by Reducing Inverting Stages • Exploit Inversion Property • Reduce One inverter delay in each Full-adder (FA) unit
Bit-Serial Adder A B
A Better Structure: The Mirror Adder Exploring the “Self-Duality” of the Sum and Carry functions
Mirror Adder Design • The NMOS and PMOS chains are completely symmetrical • A maximum of two series transistors can be observed in the carry-generation circuitry for good speed. • When laying out the cell, the most critical issue is the minimization of the capacitance at node Co. • The capacitance at node Co is composed of four diffusion capacitances, two internal gate capacitances, and six gate capacitances in the connecting adder cell . • The transistors connected to Ci are placed closest to the output.
Transmission-Gate 6T XOR Gate Truth Table A=0: Pass B Signal A=1: Inverting B Signal
Transmission-Gate Full Adder (24T) • Same delay for Sum and Carry Multiplier design
Manchester Carry-Chain Adder Static Circuits Dynamic Circuits
Manchester Adder Circuits (Weste) Dynamic Static Mux- based 4-bit Section sum<n>
Manchester Adder Circuits (Cont.) • Dynamic stage • When CLK is low, the output node is pre-charged by the p pull-up transistor. • When CLK goes high, the pull-down transistor turns on. • If carry generate G=AB is true the output node discharges. • If carry propagate P=A+B is true a previous carry may be coupled to the output node, conditionally discharging it. • Static stage • This requires P to be generated as AB • The Manchester adder stage improves on the carry-lookahead implementation.
P G P G P G P G 0 1 0 1 2 2 3 3 C C C C C i,0 o,3 o,0 o,1 o,2 Also called Carry-Skip FA FA FA FA P G P G P G P G 0 1 0 1 2 2 3 3 BP=P P P P o 1 2 3 C C C C i,0 o,0 o,1 o,2 r e FA FA FA FA x e C l o,3 p i t l u M Carry-Bypass Adder Design Idea: If ( ) elseKillor Generate then C = C O,3 I,0
Manchester Adder Circuits (Cont.) Wired OR • The control signals T1,T2,and T3 shown in Fig6(b) are generated by: • T1 = -(P0P1P2)P3 • T2 = -P3 • T3 = P0P1P2P3 Fig6. Manchester adder with carry bypass: (a) simple (b) conflict free
Manchester Adder Circuits (Cont.) • The worst case propagation time of a Manchester adder can be improved by bypassing the four stages if all carry-propagate signals are true. • Fig. 6(b) uses a “conflict -free” bypass circuit, which improves the speed by using a 3-input multiplexer that prevents conflicts at the wired OR node in the adder. • In Fig. 6(b), the inverter presented on the Cin signal has been moved to the center of the carry chain to improve speed.
Carry-Bypass Adder (cont.) tadder = tsetup + Mtcarry + (N/M-1)tbypass + (M-1)tcarry + tsum M bits form a Section (N/M) Bypass Stages
Carry Ripple versus Carry Bypass Wordlength (N) > 4~8 is better for Bypass Adder
Setup P,G "0" Carry Propagation "0" "1" "1" Carry Propagation C C o,k-1 2-to-1 Multiplexer o,k+3 Carry Vector Sum Generation Carry-Select Adder
Carry-Select Adder Fig7. Carry-select adder:(a) basic architecture (b) 32-bit carry-select adder example
Square Root Carry Select N-bit adder with P stages: 1st stage adds M bits, 2nd has (M+1) bits
The linear growth of adder carry-delay with the size of the input word for n-bit adder maybe improved by calculation the carries to each stage in parallel. Carry-Lookahead Adders
Carry-Lookahead Adders (cont’d) Carry of the ith stage --- Expanding: For four stages, the appropriate term: C0= G0 + P0CI C1= G1 + P1G0 + P1P0CI C2= G2 + P2G1 + P2P1G0 + P2P1P0CI C3= G3 + P3G2 + P3P2G1 + P3P2P1G0 + P3P2P1P0CI Fig1. Generic carry-lookahead adder
Static CMOS Circuits Expanding Lookahead equations: All the way:
Dynamic CMOS Circuits • The worst-case delay path in this circuit has six • n-transistor in series.
Carry-Lookahead Adders • Size and fan-in of the gates needed to implement this carry-lookahead scheme can clearly get out of hand • Number of stages of lookahead is usually limited to about 4. • The circuit and layout are quite irregular compared with ripple adder designs.
Summary • Datapath designs are fundamentals for high-speed DSP, Multimedia, Communication digital VLSI designs. • Most adders, multipliers, division circuits are now available in Synopsys Designware under different area/speed constraint. • For details, check “Advanced VLSI” notes, or “Computer Arithmetic” textbooks