
Truncated Multiplication with Correction Constant

Michael J. Schulte and Earl E. Swartzlander, Jr.


Abstract: Multiplication is frequently required in digital signal processing.
Parallel multipliers provide a high-speed method for multiplication, but require
large area for VLSI implementations. In most signal processing applications, a
rounded product is desired to avoid growth in word-size. Thus, an important design
goal is to reduce the area requirements of rounded output multipliers. This paper
presents a technique for parallel multiplication which computes the product of two
numbers by summing only the most significant columns of the multiplication
matrix, along with a correction constant. A method for selecting the value of the
correction constant which minimizes the average and mean square error is
introduced. Equations are given for estimating the average, mean square, and
maximum error of the rounded product. With this technique, the hardware
requirements of the multiplier can be reduced by 25 to 35 percent, while limiting
the maximum error of the rounded product to less than one unit in the last place.

1. Introduction

The design of high-speed, area-efficient multipliers is essential for VLSI


implementations of digital signal processing systems. Often, the inputs to these
systems are of limited precision and are inaccurate due to noise. Consequently, an
exact result is not always required and a rounded product is used for further
computation. For parallel multipliers, the area requirements and power
consumption can be reduced by estimating the least significant columns of the
partial product matrix as a constant. Although this estimate introduces additional
error, the error can be made small enough to be acceptable for many applications
by appropriately selecting the value of the correction constant.
The multiplication of an n-bit multiplicand by an n-bit multiplier yields a 2n-bit
product. Figure 1 shows the multiplication matrix for two unsigned numbers A
and B, where A is the multiplicand and B is the multiplier. The values for A, B
and their product P are
    A = \sum_{i=0}^{n-1} a_i \cdot 2^{-n+i}

    B = \sum_{i=0}^{n-1} b_i \cdot 2^{-n+i}

    P = \sum_{i=0}^{2n-1} p_i \cdot 2^{-2n+i}

The partial product bit x_{i,j} is equal to a_j AND b_i.

Figure 1. Multiplication Matrix.
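To make the matrix concrete, the following sketch (illustrative, not from the paper; the operand values and helper names are arbitrary) forms the partial product bits x_{i,j} = a_j AND b_i and checks that summing every column at its positional weight reproduces the exact product A·B.

```python
def to_bits(value, n):
    """Little-endian bits of an n-bit unsigned integer (bit i has weight 2^i)."""
    return [(value >> i) & 1 for i in range(n)]

def partial_product_matrix(a_bits, b_bits):
    """x[i][j] = a_j AND b_i, carrying positional weight 2^(-2n+i+j)."""
    n = len(a_bits)
    return [[a_bits[j] & b_bits[i] for j in range(n)] for i in range(n)]

n = 4
a, b = 0b1011, 0b0110                      # A = a*2^-n, B = b*2^-n
x = partial_product_matrix(to_bits(a, n), to_bits(b, n))

# Summing bit x[i][j] at weight 2^(-2n+i+j) gives the exact product A*B.
product = sum(x[i][j] * 2.0 ** (-2 * n + i + j)
              for i in range(n) for j in range(n))
assert product == (a * b) * 2.0 ** (-2 * n)
```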

In conventional parallel multipliers, the n^2 partial product bits are summed to


compute the 2n-bit product, which is then rounded to n bits. A substantial
hardware savings is realized by summing only the n+k most significant columns of
the matrix. This method of multiplication is called truncated multiplication.
Truncated multiplication leads to two sources of error: reduction error and
rounding error. Reduction error occurs because the n-k least significant columns of
the multiplication matrix are not used to compute the product. Rounding error
occurs because the product is rounded to n bits. To compensate for these two
sources of error a correction constant is added to the n+k most significant columns
of the multiplication matrix. Most of the bits in the correction constant are zero
and do not require additional hardware. Figure 2 shows a matrix for truncated
multiplication, where c_{n+k-1}, c_{n+k-2}, ..., c_1, c_0 is the correction constant.

Figure 2. Truncated Multiplication Matrix.
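As a sketch of this scheme (illustrative names; the correction constant is left as a free parameter here, since its selection is the subject of Section 2), truncated multiplication sums only columns n-k through 2n-1 of the matrix plus a scaled constant:

```python
def truncated_multiply(a, b, n, k, c_int):
    """Truncated product of n-bit unsigned integers a, b (A = a*2^-n).

    c_int is the correction constant C scaled by 2^(2n); the return value
    is the n-bit result scaled by 2^n.
    """
    t = n - k                              # number of discarded ls columns
    total = c_int
    for i in range(n):
        for j in range(n):
            if i + j >= t:                 # keep only the n+k msb columns
                total += (((a >> j) & 1) & ((b >> i) & 1)) << (i + j)
    return total >> n                      # round (truncate) result to n bits

# With k = n nothing is discarded, so (with C = 0) the exact n-bit
# truncated product is recovered.
assert truncated_multiply(200, 180, 8, 8, 0) == (200 * 180) >> 8
```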


In [1], Lin presents three methods for performing truncated multiplication. He
develops a method for determining the correction constant and discusses the
resulting rounding error. In his analysis the reduction error and rounding error are
treated separately which can lead to a poor selection of the correction constant. In


addition, the correction constant is allowed to take arbitrary values. For practical
implementations, however, the correction constant should be limited to the n+k
most significant columns, as shown in Figure 2.
This paper presents a method for truncated multiplication in which the correction
constant compensates for both the rounding error and the reduction error. The value
of the correction constant is restricted so that it can be added to the truncated partial
products with only a small amount of additional hardware. Section 2 gives a
method for selecting the correction constant so that the average and mean square
error are minimized. In Section 3, estimates are derived for the average, mean
square and maximum error of truncated multipliers. Hardware savings for truncated
array [2] and Dadda [3] multipliers are discussed in Section 4. Section 5 examines
two's complement truncated multiplication.

2.

Selecting the Correction Constant

To compensate for the reduction and rounding errors, a correction constant is added
to the truncated partial products. The value of the computed product P' is

    P' = P + E_reduct + E_round + C

where P is the true product, E_reduct is the reduction error, E_round is the rounding
error, and C is the correction constant. To minimize the average error of the
truncated multiplication, the correction constant is selected to be as close as
possible to the additive inverse of the expected value of the sum of the reduction
error and the rounding error. The expected values of these two errors are determined
separately and then added together. Since the reduction and rounding error are both
negative, the correction constant is positive.
To estimate the expected value of the reduction error, it is assumed that the
probability of any input bit a_j or b_i being one is 0.5. The positional weight of
partial product bit x_{i,j} is 2^{-2n+i+j}, and x_{i,j} is equal to 1 if and only if
a_j and b_i are both equal to 1. Therefore, the expected value of x_{i,j} is

    E[x_{i,j}] = 0.25 \cdot 2^{-2n+i+j}

Since all partial product bits in column q have indices i+j = q, and there are q+1
partial product bits in column q, the expected value of the reduction error (i.e., the
additive inverse of the expected value of columns 0 to n-k-1) is

    E_reduct = -(1/4) \sum_{q=0}^{n-k-1} (q+1) \cdot 2^{-2n+q}

To estimate the expected value of the rounding error, it is assumed that the
probability of any product bit p_i being one is 0.5. If the product bits p_{n-k} to
p_{n-1} are truncated, the expected value of the rounding error is

    E_round = -(1/2) \sum_{q=n-k}^{n-1} 2^{-2n+q} = -2^{-n-1} \cdot (1 - 2^{-k})

The expected value of the total error is the sum of the expected reduction error and
the expected rounding error

    E_total = -(1/4) \sum_{q=0}^{n-k-1} (q+1) \cdot 2^{-2n+q} - 2^{-n-1} \cdot (1 - 2^{-k})

As mentioned previously, it is necessary to restrict the correction constant to n+k


bits. This is achieved by setting the correction constant to
    C = -round(2^{n+k} \cdot E_total) / 2^{n+k}

where round(x) indicates x is rounded to the nearest integer.
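The selection procedure above can be sketched in exact rational arithmetic (a hypothetical helper, directly transcribing the formulas for E_reduct, E_round, and C):

```python
from fractions import Fraction

def correction_constant(n, k):
    """E_reduct, E_round, and the n+k-bit constant C per Section 2."""
    # E_reduct = -(1/4) * sum_{q=0}^{n-k-1} (q+1) * 2^(-2n+q)
    e_reduct = -Fraction(1, 4) * sum((q + 1) * Fraction(1, 2 ** (2 * n - q))
                                     for q in range(n - k))
    # E_round = -2^(-n-1) * (1 - 2^-k)
    e_round = -Fraction(1, 2 ** (n + 1)) * (1 - Fraction(1, 2 ** k))
    # C = -round(2^(n+k) * E_total) / 2^(n+k)
    e_total = e_reduct + e_round
    C = Fraction(round(-e_total * 2 ** (n + k)), 2 ** (n + k))
    return e_reduct, e_round, C

_, _, C = correction_constant(8, 3)
print(float(C * 2 ** 8))   # C' = C * 2^n → 0.625
```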

3. Estimating the Average, Mean Square and Maximum Error

If the correction constant is the additive inverse of the expected value of the error,
then the average error of the truncated multiplication is zero. However, since the
correction constant is restricted to the n+k most significant columns, the average
error is equal to the sum of the correction constant and the expected value of the
error

    E_avg = C + E_total

Because the correction constant is computed by rounding the expected value of the
error to n+k bits, the magnitude of E_avg is always less than or equal to 2^{-n-k-1}.
The reduction error and the rounding error are assumed to be independent.
Therefore, the total mean square error of the truncated multiplication is the sum of
the mean square value of each of these errors. To calculate the two errors
separately, a partial correction constant is assigned to each error term. The sum of
the two partial correction constants is equal to the total correction constant C. The
partial correction constants for the reduction and rounding error are chosen as

    C_reduct = -E_reduct

    C_round = C + E_reduct

The variance \sigma^2_{i,j} of partial product bit x_{i,j} is

    \sigma^2_{i,j} = (3/16) \cdot 2^{2(-2n+i+j)}

The mean square reduction error (i.e., the sum of the variances of the partial
product bits in columns 0 through n-k-1) is

    \sigma^2_reduct = (3/16) \sum_{q=0}^{n-k-1} (q+1) \cdot 2^{2(-2n+q)}

The mean square rounding error is the mean square difference between the rounding
correction constant and the value of the truncated bits and is equal to

    \sigma^2_round = 2^{-k} \sum_{q=0}^{2^k-1} (C_round - q \cdot 2^{-n-k})^2

The total mean square error is the sum of the mean square reduction and rounding
error

    \sigma^2_total = (3/16) \sum_{q=0}^{n-k-1} (q+1) \cdot 2^{2(-2n+q)} + 2^{-k} \sum_{q=0}^{2^k-1} (C_round - q \cdot 2^{-n-k})^2

To compute the maximum absolute error, the observation is made that the
maximum absolute error occurs either when all of the partial product bits in
columns 0 to n-k-1 and all the product bits in columns n-k to n-1 are ones or when
they are all zeros. If they are all ones, the maximum absolute error is

    |C - \sum_{q=0}^{n-k-1} (q+1) \cdot 2^{-2n+q} - 2^{-n} \cdot (1 - 2^{-k})| = |C + 4 \cdot E_reduct + 2 \cdot E_round|

If they are all zeros, the maximum absolute error is C. Thus, the maximum
absolute error is

    E_max = max(C, |C + 4 \cdot E_reduct + 2 \cdot E_round|)
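These estimates can be cross-checked by exhaustive simulation for a small multiplier. The sketch below (illustrative; n = 6 and k = 3 are arbitrary, with k chosen above ⌈log2(n)⌉ so the error should stay below one ulp) enumerates all operand pairs, applying the Section 2 constant and final truncation to n bits:

```python
from fractions import Fraction

n, k = 6, 3
t = n - k

# Correction constant per Section 2, in exact rational arithmetic.
e_reduct = -Fraction(1, 4) * sum((q + 1) * Fraction(1, 2 ** (2 * n - q))
                                 for q in range(t))
e_round = -Fraction(1, 2 ** (n + 1)) * (1 - Fraction(1, 2 ** k))
C = Fraction(round(-(e_reduct + e_round) * 2 ** (n + k)), 2 ** (n + k))

worst = Fraction(0)
for a in range(1 << n):
    for b in range(1 << n):
        # Sum only the n+k most significant columns (integer, scaled by 2^(2n)).
        kept = sum(((a >> j) & 1) * ((b >> i) & 1) << (i + j)
                   for i in range(n) for j in range(n) if i + j >= t)
        total = kept + int(C * 2 ** (2 * n))       # add correction constant
        approx = Fraction(total >> n, 1 << n)      # round (truncate) to n bits
        exact = Fraction(a * b, 1 << (2 * n))
        worst = max(worst, abs(approx - exact))

assert worst < Fraction(1, 1 << n)                 # less than one ulp (2^-n)
```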

Table 1 shows the average, mean square and maximum absolute error, and the
correction constant for several truncated multipliers. The following values are used
in the table

    C' = C \cdot 2^n        E'_avg = E_avg \cdot 2^n
    \sigma'^2_total = \sigma^2_total \cdot 2^{2n}        E'_max = E_max \cdot 2^n

In comparison, conventional multipliers which implement round to nearest by
adding a one to column n-1 of the multiplication matrix have values of

    C' = 1/2        E'_avg = 2^{-n-1}        \sigma'^2_total = 1/12        E'_max = 1/2
For the multipliers listed, \sigma'^2_total is less than 0.09 and E'_max is less than 1.0,
for k greater than \lceil log_2(n) \rceil. The average error varies greatly because it
depends on how close the expected value of the error is to a fixed point number which
can be represented using n+k bits.

Table 1. Correction constant C', average error E'_avg, mean square error
\sigma'^2_total, maximum error E'_max, and percent hardware savings for truncated
array and Dadda multipliers, for n = 8, 16, and 24 with various values of k.
[Tabulated entries not reproduced.]
4.

Hardware Savings

Parallel multipliers are often implemented as either array multipliers [2, 4] or as
multiplier trees [3, 5, 6]. Conventional n by n array multipliers require n^2 AND
gates, n^2 - 2n full adders, and n half adders. If the least significant t columns are
not used in the computation, where t = n-k, the hardware saved (for t ≥ 2) is

    t(t+1)/2 AND Gates        (t-1)(t-2)/2 Full Adders        (t-1) Half Adders

To add the correction constant to the truncated partial products, m half adders are
changed to full adders, where m is the number of ones in the correction constant.
Dadda introduced an efficient method for implementing multiplier trees in [3]. A
conventional n by n Dadda multiplier requires n^2 AND gates, n^2 - 4n + 3 full
adders and n - 1 half adders (for n ≥ 3). In addition, a (2n-2)-bit carry look-ahead
adder (CLA) is required to sum the final two rows. The hardware saved with a
truncated Dadda multiplier (for t ≥ 2) is

    t(t+1)/2 AND Gates        (t^2 - 3t + 4)/2 Full Adders

The reduction in the number of half adders is between 1 and t, depending on the
values of n and k. The word-length of the CLA is reduced by t-1 bits. An
additional m full adders are required to add the correction constant to the truncated
partial products.
Table 1 shows the hardware savings for various sizes of truncated multipliers. The
values given correspond to the hardware savings of truncated multipliers compared
with conventional multipliers which implement round to nearest by adding a one
to column n-1. For this table, the relative sizes of the AND gates, half adders and
full adders are 1, 4, and 9, respectively. The relative size of each full adder in the
CLA is 9 and a 4-bit CLA logic block has a relative size of 20.
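As a rough sketch of this bookkeeping for the array case (using the relative sizes above; the helper names are illustrative, and the number of correction-constant ones m is taken as an input rather than derived):

```python
AND, HA, FA = 1, 4, 9          # relative sizes from Section 4

def array_area(n):
    """Conventional n x n array multiplier: n^2 AND, n^2-2n FA, n HA."""
    return n * n * AND + (n * n - 2 * n) * FA + n * HA

def truncated_array_savings(n, k, m):
    """Percent area saved by discarding the t = n-k ls columns (t >= 2)."""
    t = n - k
    saved = (t * (t + 1) // 2) * AND \
          + ((t - 1) * (t - 2) // 2) * FA \
          + (t - 1) * HA
    saved -= m * (FA - HA)     # m half adders upgraded to full adders for C
    return 100.0 * saved / array_area(n)

print(round(truncated_array_savings(8, 4, 0), 2))   # → 9.28 percent
```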

5. Two's Complement Truncated Multiplication

The analysis presented in the previous sections will now be extended for two's
complement multiplication. A two's complement n+1 by n+1 multiplication
matrix [7] is shown in Figure 3. This matrix is similar to the matrix for unsigned
multiplication. However, the most significant bit (msb) of partial products 0
through n-1 and all the bits except the msb of partial product n are complemented,
and a one is added to column n+1. The values of the multiplicand, multiplier and
product are

    A = -a_n + \sum_{i=0}^{n-1} a_i \cdot 2^{-n+i}

    B = -b_n + \sum_{i=0}^{n-1} b_i \cdot 2^{-n+i}

    P = -p_{2n} + \sum_{i=0}^{2n-1} p_i \cdot 2^{-2n+i}

Figure 3: Two's Complement Multiplication Matrix.


The only difference between these equations and the equations for unsigned
numbers is that a msb with a positional weight of -1 is added to each of the
numbers. This change and the change in the multiplication matrix do not affect the
value of the reduction error or the rounding error. Thus, the determination of the
correction constant and the error analysis presented in Sections 2 and 3 are also
valid for two's complement numbers. This assumes that the product is rounded to
n+1 bits and n+k+1 columns are used to compute the product.
To estimate the percent hardware savings, the amount of hardware required for a
conventional two's complement multiplier is approximated by replacing n by n+1
in the equations given in Section 4. This approximation does not take into account
the change in hardware resulting from changing 2n of the AND gates to NAND
gates (to complement the partial product bits) or the addition of a one in column
n+1. The savings of the truncated multiplier is calculated with t = n-k.
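The matrix construction described in this section can be verified exhaustively for small word sizes. The sketch below is one reading of the Figure 3 matrix (a Baugh-Wooley-style scheme, working in integer units; the fixed weight 2^(2n+1) subtracted at the end absorbs the constant terms produced by the complementation):

```python
def bw_matrix_product(a, b, n):
    """Column-sum of the Figure 3 matrix for (n+1)-bit two's complement a, b."""
    s = 0
    for i in range(n + 1):
        for j in range(n + 1):
            bit = ((a >> j) & 1) & ((b >> i) & 1)
            # Complement the msb of partial products 0..n-1 and all but the
            # msb of partial product n.
            if (i < n and j == n) or (i == n and j < n):
                bit ^= 1
            s += bit << (i + j)
    s += 1 << (n + 1)                      # the one added in column n+1
    return s - (1 << (2 * n + 1))          # remove the fixed msb weight

# Exhaustive check over all 4-bit two's complement operand pairs (n = 3).
n = 3
for a in range(-(1 << n), 1 << n):
    for b in range(-(1 << n), 1 << n):
        assert bw_matrix_product(a, b, n) == a * b
```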

6. Conclusion

For applications which do not require exact multiplication, truncated multipliers


offer a substantial hardware savings while introducing only a small amount of
error. By considering both the reduction error and the rounding error when
determining the correction constant, the average and mean square error are
minimized. Given certain hardware and error constraints, the appropriate number of
columns required for the multiplication can be readily determined. The analysis
presented in this paper can also be extended to m by n multipliers, multipliers
which generate the partial products through other techniques (e.g., modified Booth
encoding [8]), multipliers which employ generalized counters [9], and
implementations of merged arithmetic [10].


References

[1] Y.C. Lin. Single precision multiplier with reduced circuit complexity for
signal processing application. IEEE Transactions on Computers, 41:1333-1336,
1992.
[2] S.D. Pezaris. A 40 ns 17-bit array multiplier. IEEE Transactions on
Computers, C-20:442-447, 1971.
[3] L. Dadda. Some schemes for parallel multipliers. Alta Frequenza, 34:349-356,
1965.
[4] G.W. McIver, R.W. Miller, and T.G. O'Shaughnessy. A monolithic 16 by 16
digital multiplier. IEEE International Solid-State Circuits Conference Digest of
Technical Papers, 231-233, 1974.
[5] C.S. Wallace. A suggestion for a fast multiplier. IEEE Transactions on
Electronic Computers, EC-13:14-17, 1964.
[6] M.R. Santoro and M.A. Horowitz. SPIM: a pipelined 64 x 64-bit iterative
multiplier. IEEE Journal of Solid-State Circuits, 24:487-491, 1989.
[7] C.R. Baugh and B.A. Wooley. A two's complement parallel array
multiplication algorithm. IEEE Transactions on Computers, C-22:1045-1047,
1973.
[8] A.D. Booth. A signed binary multiplication technique. Quarterly Journal of
Mechanics and Applied Mathematics, 4:236-240, 1951.
[9] W.J. Stenzel, W.J. Kubitz, and G.H. Garcia. A compact high-speed parallel
multiplication scheme. IEEE Transactions on Computers, C-26:948-957, 1977.
[10] E.E. Swartzlander, Jr. Merged arithmetic. IEEE Transactions on Computers,
C-29:946-950, 1980.
Acknowledgments
The authors are grateful to S h i m Shade for her help in completing this paper.
Michael J. Schulte and Earl E. Swartzlander, Jr.
University of Texas at Austin
Department of Electrical and Computer Engineering
Austin, Texas 78712
United States

