
Circuits and Analog and Digital Design
(this material spans multiple courses)

Programmable and Configurable
Analog / Digital System Design
Paul Hasler
Professor, Georgia Institute of Technology

Power Efficient Computing


Portable Devices
battery powered
(or less)
larger systems
minimize battery size / weight
Get as much computation
as possible

Cortical Neurons
1000s of inputs,
1000s of channel populations,
one output

Equivalent computation ~ 400 MMAC / neuron
(no learning / growth)

~ 20 pW / neuron

Custom Analog ~ 1000 - 10000 times more efficient than Custom Digital
(Mead 1990)

Analog (VMM): 10 MMAC / µW
Digital: 4 MMAC / mW (DSP)
Useful Analog must be Programmable / Configurable

400 MMAC / neuron at 20 pW:
digital is quite far away (100 mW)
analog VMM closer (100 µW)
analog HMM / dendrites get close

~ 500 TMAC
< 10000 neurons
~ 100 kW (comp) with 4000 DSPs
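
The power figures above follow from dividing a computation rate by an efficiency. A minimal sketch of that arithmetic, assuming the efficiencies are read as 4 MMAC/mW for the DSP and 10 MMAC/µW (10 TMAC/W) for the analog VMM; the function and constant names are hypothetical:

```python
# Efficiencies from the slide, expressed in MAC/s per watt.
DIGITAL_DSP_MACS_PER_WATT = 4e9    # 4 MMAC/mW
ANALOG_VMM_MACS_PER_WATT = 10e12   # 10 MMAC/uW = 10 TMAC/W (assumed reading)

def power_watts(macs_per_second, macs_per_second_per_watt):
    """Power needed to sustain a MAC rate at a given efficiency."""
    return macs_per_second / macs_per_second_per_watt

NEURON_RATE = 400e6   # 400 MMAC per neuron
LARGE_RATE = 500e12   # 500 TMAC

digital_neuron = power_watts(NEURON_RATE, DIGITAL_DSP_MACS_PER_WATT)  # about 0.1 W
analog_neuron = power_watts(NEURON_RATE, ANALOG_VMM_MACS_PER_WATT)    # tens of microwatts
digital_large = power_watts(LARGE_RATE, DIGITAL_DSP_MACS_PER_WATT)    # about 125 kW

print(digital_neuron, analog_neuron, digital_large)
```

The digital results reproduce the 100 mW and ~100 kW figures; the analog result lands within the same order of magnitude as the rounded analog VMM figure on the slide.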

History of Digital System Design


[Figure: timeline, 1960 to 2000. Labels: First IC; First CAD (Fairchild, 1967); Handcrafted Design, Every Gate Optimized, Cost only feasible for government contracts; 4004 Intel; Mead & Conway; First VLSI courses; MOSIS; Magic (CAD ventures); MIPS; Speak and Spell (first DSP?); TMS 32010 (NMOS); XC2064; VLSI taught in CMOS; Synthesis tools; First Synthesis classes; Pentium (Intel) (0.8um); TI C54 (fixed point); FPGAs in classes]

A separation of design from technology
(build framework for abstraction)
technologists know how to fabricate smaller and faster transistors
designers know how to coordinate millions of transistors

Reconfigurable Signal Processing

[Figure: design spectrum from 100% S/W (Programmable) to 100% H/W (Fixed Function), plotted against cost, with a "Tech trend" arrow]

FPGAs: Large Configurability
Power: just the MAC engine is around 2-10 MMAC/mW
Baseline static power ~ 0.5 W to 1 W
Signal routing power / memory: ?

DSPs: Low Power Processing
- cell phones (processing < 30 mW average)
- hearing aids (1 mW levels) (AMI / DSP factory)
Power: 54C series 4 MMAC/mW
Power does not include communication off chip (i.e. accessing memory)

Innovation and Process Scaling move solutions towards programmability and reconfigurability

Power = C Vdd^2 f for CMOS
Chip to Chip (10 pF load min, 2.5 V): 32 uW/Mbit (dynamic)
Obtaining data for 4 MMAC computation ~ 4 mW
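
The chip-to-chip numbers can be checked from the dynamic power relation Power = C Vdd^2 f. A rough sketch follows; the activity factor of 0.5 and the two 16-bit operands fetched per MAC are assumptions, chosen because they reproduce the 32 uW/Mbit and ~4 mW figures, and the function name is hypothetical:

```python
def io_power_watts(c_load_farads, vdd_volts, bit_rate_hz, activity=0.5):
    """Dynamic I/O power: activity * C * Vdd^2 * f (activity is an assumed factor)."""
    return activity * c_load_farads * vdd_volts ** 2 * bit_rate_hz

C_LOAD = 10e-12   # 10 pF minimum chip-to-chip load
VDD = 2.5         # volts

# Power per Mbit/s of off-chip traffic.
per_mbit = io_power_watts(C_LOAD, VDD, 1e6)

# Assumption: every MAC fetches two 16-bit operands from off chip.
BITS_PER_MAC = 2 * 16
data_power = io_power_watts(C_LOAD, VDD, 4e6 * BITS_PER_MAC)

print(per_mbit, data_power)  # roughly 31 uW per Mbit/s and 4 mW
```

Under these assumptions, fetching the data for a 4 MMAC computation costs about as much as a 4 MMAC/mW DSP spends computing it, which is why the DSP power figures exclude off-chip communication.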

Modern System Design


Fixed function Digital

Design at gate level
Design at Multipliers and Adders
Design at Basic Algorithms

(1837)
Fixed function Analog
Vector-Matrix Multiplication
Frequency Decomposition
Adaptive Filters
Classifiers (NN, GMM, HMM)

When building analog systems, we expect to build primitives at the basic algorithm level.
Programmable
Digital (Mixed mode)

Analog = programmable and configurable.


How to get enough analog engineers
Hierarchy is a key ingredient to the
success of the digital circuit, and, until
recently, one reason why large analog
designs have been difficult

Levels of Energy Efficiency


Subthreshold
Transistor Operation
Highest throughput /
amount of power

Programmable Circuits
(FG transistors)

Analog Signal Processing

Eliminate mismatch
Programmability

~ x1000 improvement
in power efficiency

Configurable Signal
Processing
Wide accessibility

Moving analog approaches / conceptual framework to a system design approach, similar to digital's system transformation in the 1970s / 80s.

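Measured Channel Current

MOSFET Current-Voltage Curves

EKV Model (DIBL / VA enters through σ)

$$ I = I_f - I_r, \qquad I_{f,r} = 2 I_{th} \ln^2\!\left(1 + e^{(\kappa (V_g - V_T) - V_{s,d} + \sigma V_{d,s}) / 2U_T}\right) $$

$$ I_{th} = \mu C_{ox} U_T^2 (W/L) / \kappa $$

Subthreshold

$$ I = 2 I_{th} \left( e^{(\kappa (V_g - V_T) - V_s + \sigma V_d)/U_T} - e^{(\kappa (V_g - V_T) - V_d + \sigma V_s)/U_T} \right) $$

$$ I_f = 2 I_{th} \, e^{(\kappa (V_g - V_T) - V_s + \sigma V_d)/U_T} $$
(Saturation, $V_{ds} > 4U_T$)

Above-threshold

$$ I_{f,r} = \frac{\mu C_{ox}}{2\kappa} (W/L) \left( \kappa (V_g - V_T) - V_{s,d} + \sigma V_{d,s} \right)^2 $$
(Saturation, $I_r \sim 0$, $V_{ds} > V_{on}$)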
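
The EKV form interpolates between the subthreshold exponential and the above-threshold square law. The sketch below checks both limits numerically. It assumes the standard EKV expression with gate coupling kappa and a DIBL coefficient sigma; the parameter values (I_TH, VT, KAPPA, SIGMA, UT) are illustrative placeholders, not measured data:

```python
import math

# Illustrative parameters (assumptions, not values from the slides).
I_TH = 1e-7     # threshold current, A
VT = 0.7        # threshold voltage, V
KAPPA = 0.7     # gate coupling into the channel
SIGMA = 0.0     # DIBL / Early-effect coefficient
UT = 0.0258     # thermal voltage near 300 K, V

def softplus(x):
    """Numerically stable ln(1 + e^x)."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def ekv_current(vg, vs, vd):
    """Channel current I = I_f - I_r from the EKV interpolation."""
    def branch(v_near, v_far):
        x = (KAPPA * (vg - VT) - v_near + SIGMA * v_far) / (2.0 * UT)
        return 2.0 * I_TH * softplus(x) ** 2
    return branch(vs, vd) - branch(vd, vs)

def subthreshold_saturation(vg, vs, vd):
    """Limit for large negative exponent: forward current is a pure exponential."""
    return 2.0 * I_TH * math.exp((KAPPA * (vg - VT) - vs + SIGMA * vd) / UT)

def above_threshold_saturation(vg, vs, vd):
    """Limit for large positive exponent: forward current is a square law."""
    return I_TH / (2.0 * UT ** 2) * (KAPPA * (vg - VT) - vs + SIGMA * vd) ** 2

sub_full = ekv_current(0.2, 0.0, 1.0)
sub_limit = subthreshold_saturation(0.2, 0.0, 1.0)
above_full = ekv_current(3.0, 0.0, 3.0)
above_limit = above_threshold_saturation(3.0, 0.0, 3.0)
print(sub_full / sub_limit, above_full / above_limit)  # both ratios are close to 1
```

Since I_th = µ Cox UT^2 (W/L) / κ, the prefactor I_TH / (2 UT^2) in the square-law function is the same quantity as (µ Cox / 2κ)(W/L).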

Classic Multilevel EEPROMs


ISD voice recorder ICs (answering machine messages, greeting cards, etc.)
ETANN: Floating-Gate element used for biasing (Holler et al., 1989)

[Figure: EEPROM process, bidirectional tunneling; labels V1, V2, Vtun, tunneling junction, GND]

First reported EEPROM element in standard CMOS (Thomson and Brooke, 1989)

Many standard IC processes allow for EEPROM devices (standard cells, standard process)
Most commercial EEPROMs are multibit

Programmable Analog Transistors

Standard CMOS
Data retention: <5V (0.5 µm) (10 year, 300 K)
Apps: Filters, Data converters, Regulators, etc.

Accuracy ~ 0.1% between 100 pA - 1 µA, ~10e
Write degradation (100C):
Vtun increase less than 25%
Vinj negligible change
(100C is > 10^9 complete FG rewrites)

Otherwise, need a DAC at every parameter and/or memory, etc.

Electron Transport in a subthreshold nFET

Measurements and Modeling of Hot-Electron Injection

Impact Ionization

The mean rate of an impact ionization collision is highly energy dependent.
Impact current is proportional to source current.

pFET Hot-Electron Injection

[Figure: pFET cross-section with well contact (n+), source (p+), gate, drain (p+), n-well and p-substrate; inset of the channel and the drain-to-channel depletion region with steps (1), (2), (3); Vinj = 430mV]

The injected electrons are generated by hole impact ionizations.
Injection current is proportional to source current, and is an exponential function of the drain-to-channel potential.

[Hasler, et. al. 1996, 1997] [Duffy and Hasler, 2003]

Injection Above and Below VT

[Figure: band diagrams (Ec, Ev, Ec,oxide) from source through channel to drain. Panels: pFET injection, Sub VT, Saturation; pFET injection, Above VT, Ohmic]

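Floating-Gate Devices as Circuit elements

Neuron MOS (νMOS) (Shibata and Ohmi, 1992)
[Figure: two-input floating-gate transistors (Gate1, Gate2, Iout, GND); one panel labeled NIPS 1994]

Analog Signal processing at EEPROM densities

4-bit DAC (no sampling)
[Figure: binary-weighted capacitors 8C, 4C, 2C, C driven by bits a3, a2, a1, a0 onto a common node that sets Vout; shown with and without a floating-gate programming connection (Vtun, Vdd, Vg)]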
Prog. Analog ICs: Industrial Respect

Floating-gate transistors
[Figure: amplifier schematic with floating-gate devices; labels M1-M10, In+, In-, Itail, VA, VB, S1, S2, D1, D2, Vtun, Vg, Vdd, bias circuitry, Vout]

Measured Offset Voltage Drift vs. Temperature
Input Offset Voltage Drifts by 130 µV over 170 °C
Input Offset Voltage Reduced to 25 µV

V. Srinivasan, G. Serrano, J. Gray, and P. Hasler, CICC 2005, pp. 739-742.
(Best paper CICC 2005)

Gm-C filters, C4 Filters, ADCs, DACs, V regulators

Floating-Gate Voltage Output DAC

Process / Vdd:   0.5um CMOS / 5V
Linearity:       10bit (INL/DNL)
Epot Accuracy:   < 100uV (measured), < 1uV (theoretical)
Sample Rate:     ~10MSPS (instrumented), >100MSPS (on-chip)
Input caps:      140fF

Analog Signal Processing Techniques


[Figure: signal flow from Vin through Constant Q Filterbanks (out1 ... outn), Vector-Matrix Multiplication (differential currents I1+, I1-, ..., Im+, Im-; Vtun, Vdd; I+out) and Winner-Take-All (V1 ... Vn) to Analog Output(s) and Digital Output(s); die photo scale 1.5mm]

128x32 VMM (0.5 µm): < 1 mm^2
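
The vector-matrix multiplier uses differential output currents (I+ and I-), which lets non-negative stored weights represent signed coefficients. A behavioral sketch of that idea, not a circuit model; the function names and the split into positive and negative weight arrays are assumptions made for illustration:

```python
def vmm(weights_pos, weights_neg, inputs):
    """Differential vector-matrix multiply.

    Each output is (sum of w+ * x) - (sum of w- * x), with both weight
    arrays non-negative, mimicking a differential pair of output currents.
    """
    outputs = []
    for row_pos, row_neg in zip(weights_pos, weights_neg):
        i_plus = sum(w * x for w, x in zip(row_pos, inputs))
        i_minus = sum(w * x for w, x in zip(row_neg, inputs))
        outputs.append(i_plus - i_minus)
    return outputs

def macs_per_vector(rows, cols):
    """Multiply-accumulates performed for one input vector."""
    return rows * cols

result = vmm([[1.0, 0.0], [0.5, 0.5]], [[0.0, 1.0], [0.0, 0.0]], [2.0, 3.0])
print(result, macs_per_vector(128, 32))
```

A 128x32 array performs 4096 multiply-accumulates per input vector in parallel, so the MAC rate is 4096 times the rate at which input vectors are applied.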

Adaptive Filters
[Figure: inputs x1, x2, ..., xN with weights w1, w2, ..., wN]
[Plot: steady-state weights (w1 = cos, w2 = sin) versus rotation parameter (degrees), 0 to 350; weight axis from -0.4 to 0.3]

Gaussian Mixture Models / VQ

Analog--Digital Signal Processing


[Figure: two signal chains.
Real world (analog) -> A/D Converter -> DSP Processor -> Computer (digital)
Real world (analog) -> ASP IC -> A/D (Specialized A/D) -> DSP Processor -> Computer (digital)]

Digital and Analog SP Efficiency

CADSP = Cooperative Analog-Digital Signal Processing
Custom Analog ~ 1000 - 10000 times more efficient than Custom Digital (Mead 1990)

Analog (VMM): 10 MMAC / µW ( = 10 TMAC / W)
Digital: 4 MMAC / mW (DSP)

Computation          MMAC/µW          Ratio to digital
Low Power DSPs       0.02 to 0.002
Analog VMM           1 to 30          1000
Analog Filterbanks   30 to 1000       10000
Analog VQ            1 to 10          300
Analog HMM           > 1000           > 100000

[Figure: processing chain with blocks Microphone, Cepstrum, VQ, HMM and Digital Signal Processing]

Resolution for Analog / Digital Tradeoffs

[Plot: log("Cost") versus Signal-to-Noise (Bits of Resolution), 0 to 16 bits, with one curve for analog and one for digital; the crossover is near ~10bit SNR, with regions labeled "Lower analog cost" and "Lower digital cost"] [Vittoz95, Sarpeshkar98]

[Figure: two partitions of the same system. ADC (16bit) -> FFT -> Remaining DSP, versus Analog filter bank (~FFT) -> 10bit -> Remaining DSP] [Kucic, et. al. 2001]

FPAAs are Gaining Momentum

Jan 2008
Concept

Approach Built on Floating Gate Circuits

Simulation

VLSI

Fabrication

Testing

(3 months)

(T. Hall, P. Hasler, et. al, FPL, Sept. 2002. )

x3
Concept

Simulation/
Synthesis

Testing

VLSI

Fabrication

RASP 2.x:
RASP 2.5, 2.7: 2004-2007
- >50,000 Prog. Analog Devices
- Used by > 100 Eng

x 20

Large-Scale Field
Programmable Analog
Arrays (FPAA)

RASP 1.x (2002)

Can be a prototyping tool, early devices, or final application

RASP 2.8x: 2008- Used by > 100 Eng


RASP 2.9x: 2009-

RASP Programming/Configuration
[Figure: floating-gate switch element in Program mode and in Run mode; labels Vin, Vg, Vd, Vout, Vdd, GND]

CAB placement (each row offset holds eight CABs, one per column offset):

CAB Type   row offset   column offsets
VMM        0            252, 216, 180, 144, 108, 72, 36, 0
GP         56           252, 216, 180, 144, 108, 72, 36, 0
GP         98           252, 216, 180, 144, 108, 72, 36, 0
GP         140          252, 216, 180, 144, 108, 72, 36, 0
GP         182          252, 216, 180, 144, 108, 72, 36, 0
GP         224          252, 216, 180, 144, 108, 72, 36, 0
VMM        266          252, 216, 180, 144, 108, 72, 36, 0
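
The CAB placement list is regular enough to generate rather than store. The sketch below rebuilds the (type, row offset, column offset) entries from the listed values; the variable and function names are hypothetical, and what the offsets index on the chip is not stated in the slides:

```python
# Row offsets and the CAB type placed at each, as listed on the slide.
ROW_OFFSETS = {0: "VMM", 56: "GP", 98: "GP", 140: "GP", 182: "GP", 224: "GP", 266: "VMM"}

# Column offsets run from 252 down to 0 in steps of 36.
COLUMN_OFFSETS = list(range(252, -1, -36))

def cab_table():
    """Return (cab_type, row_offset, column_offset) for every CAB."""
    return [(cab_type, row, col)
            for row, cab_type in ROW_OFFSETS.items()
            for col in COLUMN_OFFSETS]

table = cab_table()
vmm_count = sum(1 for cab_type, _, _ in table if cab_type == "VMM")
print(len(table), vmm_count, table[0], table[-1])
```

This gives 56 CABs in total, 16 of them VMM CABs on the two outer row offsets and 40 general-purpose (GP) CABs in between.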

RASP 2.8 / 2.9 Series of FPAA devices


RASP 2.8 IC family

RASP 2.9 IC family


Switches are not dead weight

3mm

Family of nine FPAA ICs

Generic FPAA Block
FPAA with Channel CABs
FPAA with Channel CABs + Adaptive Synapses
FPAAs with Adaptive blocks

3mm
2.8a: General FPAA
2.8b: BioChannel FPAA
2.8c: Sensor FPAA
2.8d: MITE FPAA
a low-power FPGA

On-chip Programming
120 dB DR TIA
9 bit ramp ADC
7 bit DAC

0.35um CMOS
Used by > 100 Eng.
Size ~ 3mm x 3mm
I/O pins ~ 56 (100 pin package)

Larger Devices: 5mm x 5mm (x3)
100 CABs; potentially 1 TMAC from one chip
Better Reticle Design: more # of devices
Custom versus FPGAs:
x2-3 speed, x10 area, x100 power
Custom versus FPAAs:
< x2 speed, < x2 area, < x2 power

Basic 9-Transistor OTA


FG input 9-Transistor OTA

Floating Capacitors (2 terminals)


pFET Floating-Gate Transistors
FG input 9-Transistor, Buffer Connected OTA
Transmission Gate

nFET Transistors

Looking Closer at
CAB Components

Other RASP 2.8 Architectures


RASP 2.8b: Bio enabled FPAA
RASP 2.8 architecture with transistor channel /
synapses as CAB elements

Inspired by FPNA work [Farquhar et al., 2006]


RASP 2.8d : MITE Enabled FPAA
RASP 2.8 architecture with MITE CAB
design and current mode support circuitry

RASP 2.8c: Sensor enabled FPAA
RASP 2.8 architecture with additional CABs for Universal Sensor Circuits

Next Questions on FPAAs


FAQ on Large-Scale FPAAs
Design time is similar for FPAA-targeted and custom ICs
Size can be similar to custom (programmable caps / I)
Noise levels are similar to custom design
Similar speed as custom, up to routing fabric speed (~10-20 MHz in 0.35um CMOS)
Power levels are often similar to custom solutions
Techniques scale (~ ideal CMOS rules) with process shrink

Node (nm)   Prog #s (M)   TMACs
350         4.0
90          64.0          64
45          256.0         512
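
The scaling entries are consistent with ideal CMOS rules: the number of programmable elements tracks area (1 / node^2), and throughput gains one more factor of 1 / node from speed. A quick check under those assumed exponents; reading the 350 nm entry as 4.0 M programmable elements is also an assumption, and `scale` is a hypothetical helper:

```python
def scale(value, node_from_nm, node_to_nm, exponent):
    """Scale a quantity between process nodes as (node_from / node_to) ** exponent."""
    return value * (node_from_nm / node_to_nm) ** exponent

# 90 nm -> 45 nm: elements scale with area, TMACs with area times speed.
elements_45 = scale(64.0, 90, 45, 2)   # programmable elements, in millions
tmacs_45 = scale(64.0, 90, 45, 3)      # throughput, in TMACs

# 350 nm -> 90 nm, starting from the assumed 4.0 M elements.
elements_90 = scale(4.0, 350, 90, 2)

print(elements_45, tmacs_45, elements_90)
```

The 90 nm to 45 nm step reproduces 256 M elements and 512 TMACs exactly, and the 350 nm to 90 nm step gives roughly 60 M elements, close to the listed 64 M.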

Tools?
Are these available anywhere?
Compiled circuits include:
n-th order filters / filterbanks, capacitive summation / differencing,
Ramp ADC, Algorithmic and Sigma-Delta ADCs, MP3 encoder, WTA,
Analog Distributed Arithmetic, HMM classifiers, Van der Pol oscillator

Building Bridges between


Algorithms and Hardware
Building Infrastructure:
Testing / Demonstration Boards
& teaching how to build
- Wide use of FPAA test platform
- Smaller Board development /
dedicated Programming boards
- FPAA chip specific adaptor boards
(single and multiple chip platforms)

Software Infrastructure / Tools


First automated Simulink to system measurement test, Dec 2008

Some next directions


Targeting to SPICE
Simulink design tools
- More simulation models
- Noise, SNR, Distortion

Starting design
at high level
Extensive Library
(working circuits)
Parameter
Translation

Developed
visual tool for
routing (RAT)

Rapid Prototyping using FPAAs


[Figure: RASP 2.7 photoreceptor response to a paper strip]

Levels of Energy Efficiency


Subthreshold
Transistor Operation
Highest throughput /
amount of power

Programmable Circuits
(FG transistors)

Analog Signal Processing

Eliminate mismatch
Programmability

~ x1000 improvement
in power efficiency

Configurable Signal
Processing
Wide accessibility

Moving analog approaches / conceptual framework to a system design approach, similar to digital's system transformation in the 1970s / 80s.
Large need for tools to compile / program these systems.
Link most useful at the system / signal processing level.
Education / training / foundational theory is critical for designing.
These techniques open further opportunities to utilize / explore biologically inspired techniques.
