
Fixed-point and floating-point numbers

CS370 Fall 2003

Representations of numbers
- Unsigned integers
- Signed integers
  - 1's and 2's complement representation
- To represent
  - Very large and very small numbers
  - Real numbers in general
- Fixed-point numbers
- Floating-point numbers



Base-10 (decimal) arithmetic


- Uses the ten digits from 0 to 9
- Each column represents a power of 10
  - Thousands (10^3) column
  - Hundreds (10^2) column
  - Tens (10^1) column
  - Ones (10^0) column

1999_10 = 1x10^3 + 9x10^2 + 9x10^1 + 9x10^0



Base-10 (decimal) arithmetic


- Uses the ten digits from 0 to 9
- Each column represents a power of 10
  - Tens (10^1) column
  - Ones (10^0) column
  - Tenths (10^-1) column
  - Hundredths (10^-2) column

19.99_10 = 1x10^1 + 9x10^0 + 9x10^-1 + 9x10^-2



Standard binary representation


- Uses the two digits 0 and 1
- Every column represents a power of 2
  - Eights (2^3) column
  - Fours (2^2) column
  - Twos (2^1) column
  - Ones (2^0) column

1001_2 = 1x2^3 + 0x2^2 + 0x2^1 + 1x2^0



Fixed-point representation
- Uses the two digits 0 and 1
- Every column represents a power of 2
  - Twos (2^1) column
  - Ones (2^0) column
  - Halves (2^-1) column
  - Fourths (2^-2) column

10.01_2 = 1x2^1 + 0x2^0 + 0x2^-1 + 1x2^-2
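
The column-by-column expansion used on these slides can be sketched in a few lines of Python. The function name `positional_value` is hypothetical, and the sketch assumes bases from 2 to 10 and well-formed input (it does not check that each digit is smaller than the base). Exact fractions are used so that the base-10 example is not disturbed by binary rounding.

```python
from fractions import Fraction

def positional_value(digits: str, base: int) -> Fraction:
    """Evaluate a digit string such as '10.01' in the given base (2..10)."""
    whole, _, frac = digits.partition(".")
    value = Fraction(0)
    # Columns left of the point have weights base^0, base^1, ... (right to left).
    for power, d in enumerate(reversed(whole)):
        value += int(d) * Fraction(base) ** power
    # Columns right of the point have weights base^-1, base^-2, ...
    for power, d in enumerate(frac, start=1):
        value += int(d) * Fraction(base) ** -power
    return value

print(float(positional_value("1001", 2)))   # 9.0
print(float(positional_value("10.01", 2)))  # 2.25
print(positional_value("19.99", 10))        # 1999/100
```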



Addition
Base-10        Base-2

  1.25           1.01
+ 1.50         + 1.10
------         ------
  2.75          10.11
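
Fixed-point addition is ordinary integer addition on scaled values. The following is a minimal sketch, assuming two bits to the right of the binary point and non-negative inputs; the names `FRAC_BITS`, `to_fixed`, and `from_fixed` are hypothetical. It adds 1.01 and 1.10 in base 2, which are 1.25 and 1.50 in base 10.

```python
FRAC_BITS = 2  # assumption: two bits to the right of the binary point

def to_fixed(bits: str) -> int:
    """Parse a binary string like '1.01' into an integer scaled by 2**FRAC_BITS."""
    whole, _, frac = bits.partition(".")
    frac = frac.ljust(FRAC_BITS, "0")[:FRAC_BITS]  # pad or truncate the fraction
    return int(whole + frac, 2)

def from_fixed(raw: int) -> str:
    """Format a scaled integer back into a binary string with a point."""
    whole = raw >> FRAC_BITS
    frac = raw & ((1 << FRAC_BITS) - 1)
    return f"{whole:b}.{frac:0{FRAC_BITS}b}"

a = to_fixed("1.01")   # 1.25 is stored as the integer 5
b = to_fixed("1.10")   # 1.50 is stored as the integer 6
total = a + b          # plain integer addition gives 11

print(from_fixed(total))        # 10.11
print(total / 2**FRAC_BITS)     # 2.75
```

Because the binary point sits at a fixed position, the hardware never needs to know about it; only the code that parses and prints values does.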

Range of values in a byte


Lowest exponent   Min   Step     Max       Value of 00110001
 0                0     1        255       49
-1                0     0.5      127.5     24.5
-2                0     0.25     63.75     12.25
-4                0     0.0625   15.9375   3.0625
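
The range of an unsigned 8-bit fixed-point value follows from the weight of its lowest bit. The sketch below assumes an unsigned byte whose least significant bit has weight 2 raised to the lowest exponent; the function names `byte_range` and `byte_value` are hypothetical.

```python
def byte_range(lowest_exponent: int):
    """Return (min, step, max) for an unsigned 8-bit fixed-point value
    whose least significant bit has weight 2**lowest_exponent."""
    step = 2.0 ** lowest_exponent
    return 0.0, step, 255 * step

def byte_value(bits: str, lowest_exponent: int) -> float:
    """Interpret an 8-bit pattern under the same scaling."""
    return int(bits, 2) * 2.0 ** lowest_exponent

for e in (0, -1, -2, -4):
    lo, step, hi = byte_range(e)
    print(e, lo, step, hi, byte_value("00110001", e))
```

Each step to a lower exponent halves both the step size and the maximum: precision is gained at the cost of range.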

Scientific notation (1)


One billion = 1,000,000,000 = 1 x 10^9
- significand or mantissa: 1
- base or radix: 10
- exponent: 9

Scientific notation (2)


1999 = 1.999 x 10^3
- significand or mantissa: 1999
- base or radix: 10
- exponent: 3

The same value can also be written as 19.99 x 10^2 or 199.9 x 10^1.

Practice (base 10)


258 = 2.58 x 10^2
- Mantissa = 258
- Radix = 10
- Exponent = 2

24.25 = 2.425 x 10^1
- Mantissa = 2425
- Radix = 10
- Exponent = 1
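
The practice problems can be checked mechanically. The following sketch uses Python's standard `decimal` module to split a decimal literal into mantissa digits, radix, and exponent, with one digit before the point. The function name `scientific_parts` is hypothetical, and the sketch assumes a positive input given as a string.

```python
from decimal import Decimal

def scientific_parts(text: str):
    """Split a positive decimal literal into (mantissa digits, radix, exponent)
    so that value = d.ddd x 10**exponent with one digit before the point."""
    # normalize() strips trailing zeros, e.g. 1000000000 becomes 1E+9.
    sign, digits, exp = Decimal(text).normalize().as_tuple()
    mantissa = "".join(map(str, digits))
    # Moving the point to just after the first digit shifts it len(digits) - 1 places.
    exponent = len(digits) - 1 + exp
    return mantissa, 10, exponent

print(scientific_parts("258"))         # ('258', 10, 2)
print(scientific_parts("24.25"))       # ('2425', 10, 1)
print(scientific_parts("1000000000"))  # ('1', 10, 9)
```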

Base-2 scientific notation


2.25_ten = 10.01_two
         = 10.01_two x 2^0
         = 1.001_two x 2^1   (normalized)

Numbers are usually normalized, which means that the leading bit is always a 1.
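
Normalization can be sketched with `math.frexp` from the Python standard library, which returns a fraction in [0.5, 1) and a power of two; shifting by one bit gives the form with a leading 1. The function name `normalize_base2` is hypothetical, and the sketch assumes a positive input whose fraction fits exactly in the requested number of bits.

```python
import math

def normalize_base2(x: float, frac_bits: int = 3):
    """Return (significand_bits, exponent) with x = 1.fff_two x 2**exponent.
    Assumes x > 0 and that frac_bits bits hold the fraction exactly."""
    m, e = math.frexp(x)      # x = m * 2**e with 0.5 <= m < 1
    m, e = m * 2, e - 1       # shift so that 1 <= m < 2 (leading bit is 1)
    frac = int((m - 1) * 2**frac_bits)
    return f"1.{frac:0{frac_bits}b}", e

print(normalize_base2(2.25))   # ('1.001', 1)
```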


8-bit floating point format (1)


sign      exponent   significand   number         number
(1 bit)   (3 bits)   (4 bits)      (base 2)       (base 10)
0         001        1001           1.001 x 2^1     2.25
0         011        1100           1.1 x 2^3       12.0
0         111        1110           1.11 x 2^7      224.0
1         001        1110          -1.11 x 2^1      -3.5
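
A decoder for this first format can be sketched as follows. It assumes that the sign bit is the sign of the number, that the 3-bit exponent is stored unbiased, and that the 4-bit significand is read as M.MMM with the leading bit stored explicitly. The function name `decode_v1` is hypothetical.

```python
def decode_v1(bits: str) -> float:
    """Decode 'S EEE MMMM': sign, unbiased 3-bit exponent, and a 4-bit
    significand read as M.MMM (leading bit stored explicitly)."""
    bits = bits.replace(" ", "")
    assert len(bits) == 8
    sign = -1.0 if bits[0] == "1" else 1.0   # assumption: sign of the number
    exponent = int(bits[1:4], 2)             # no bias, so never negative
    significand = int(bits[4:], 2) / 8       # 4 bits, 3 of them after the point
    return sign * significand * 2.0 ** exponent

print(decode_v1("0 001 1001"))  # 2.25
print(decode_v1("0 011 1100"))  # 12.0
print(decode_v1("0 111 1110"))  # 224.0
```

With an unbiased exponent field the smallest scale factor is 2^0, so values below 1 with a normalized significand cannot be represented in this format.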



Improvements
- Bias the exponent
  - Always subtract a fixed amount, e.g., 3
  - Allows representation of negative exponents
- Implicit one
  - The leading one in a phone number such as 1-619-556-0231 is redundant.
  - Why use a bit for the leading one?

8-bit floating-point format (2)


- Exponent (3 bits) is biased by 3
- The leading one of the significand is implicit
- Zero is represented by all zeros

sign      exponent   significand   number         number
(1 bit)   (3 bits)   (4 bits)      (base 2)       (base 10)
0         100        0010           1.001 x 2^1     2.25
0         011        1000           1.1 x 2^0       1.5
0         111        1100           1.11 x 2^4      28.0
1         001        1100          -1.11 x 2^-2     -0.4375
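
The three rules of the improved format (bias of 3, implicit leading one, all zeros for zero) can be sketched as a small decoder. The function name `decode_v2` is hypothetical, and the sketch assumes the sign bit is the sign of the number and that no bit patterns other than all zeros are treated specially.

```python
BIAS = 3  # the exponent field stores the true exponent plus 3

def decode_v2(bits: str) -> float:
    """Decode 'S EEE FFFF' with a biased exponent and an implicit leading one."""
    bits = bits.replace(" ", "")
    assert len(bits) == 8
    if bits == "00000000":
        return 0.0                            # zero is the all-zeros pattern
    sign = -1.0 if bits[0] == "1" else 1.0
    exponent = int(bits[1:4], 2) - BIAS       # true exponent ranges from -3 to 4
    significand = 1 + int(bits[4:], 2) / 16   # implicit 1, then 4 fraction bits
    return sign * significand * 2.0 ** exponent

print(decode_v2("0 100 0010"))  # 2.25
```

Dropping the stored leading one frees a bit, so the significand now carries four fraction bits instead of three.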

IEEE standard floating-point


Single precision
- 32 bits
- sign: 1 bit
- exponent: 8 bits
- significand: 23 bits
- Bias: 127

Double precision
- 64 bits
- sign: 1 bit
- exponent: 11 bits
- significand: 52 bits
- Bias: 1023
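
The field widths above can be inspected directly with Python's standard `struct` module, which packs a float into its IEEE 754 bit pattern. This is a minimal sketch: the function name `ieee_fields` is hypothetical, and it uses the IEEE 754 biases of 127 for single precision and 1023 for double precision. It does not interpret special cases such as zero, infinity, or NaN.

```python
import struct

def ieee_fields(x: float, double: bool = False):
    """Return (sign, biased exponent, fraction bits, unbiased exponent)."""
    if double:
        raw = struct.unpack(">Q", struct.pack(">d", x))[0]  # 64-bit pattern
        exp_bits, frac_bits, bias = 11, 52, 1023
    else:
        raw = struct.unpack(">I", struct.pack(">f", x))[0]  # 32-bit pattern
        exp_bits, frac_bits, bias = 8, 23, 127
    sign = raw >> (exp_bits + frac_bits)
    exponent = (raw >> frac_bits) & ((1 << exp_bits) - 1)
    fraction = raw & ((1 << frac_bits) - 1)
    return sign, exponent, fraction, exponent - bias

# 2.25 = 1.001_two x 2^1: biased exponent 1 + 127 = 128, fraction bits 001 then zeros.
print(ieee_fields(2.25))   # (0, 128, 1048576, 1)
```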


Practice (base 10)

13 = 1.3 x 10^1
   = 1.101_two x 2^3

1.25 = 1.25 x 10^0
     = 1.010_two x 2^0


[Two figures: a 32-bit word with bit positions 31 down to 0, with the exponent and mantissa fields labeled]
