(a) Sensing enable (SAEN) timing margin definition: SAEN timing margin = Tm/Tp.
(b) Normalized SAEN timing margin simulated in typical condition.
Fig. 6. Definition of SAEN timing margin and its normalization diagram.
The forced-stack effect [7] can be used in the sensing enable
generation circuit to solve this problem (IVP and IVN in Fig.
4). Fig. 5 compares, in simulation, forced-stack devices against
long-channel-length devices for generating signal SAEN under
the iso-area condition. Iso-area is achieved by making the layout
area after stack forcing (IVP in Fig. 5) identical to that of the
long-channel-length device (INV in Fig. 5). At high voltage the
two methods yield almost the same SAEN pulse width (the ratio
is close to 1), but as the voltage decreases the forced-stack
method yields a wider pulse. As a result, the SAEN timing
margin (defined in Fig. 6(a)) is enhanced at low voltage (shown
in Fig. 6(b)).
IV. EXPERIMENTAL RESULTS
A test-chip has been fabricated in UMC's 90-nm low-leakage
CMOS logic technology. It contains 17 memories together with
an embedded PLL and other test circuits designed for high-speed
SRAM operation and timing measurement. Fig. 7 shows
the layout of the test-chip, which contains compiled SRAMs
[Plot: SAEN pulse width ratio versus supply voltage (0.7 V to 1.8 V), with curves at -40 °C, 25 °C and 125 °C.]
Fig. 5. Signal SAEN pulse width ratio (ratio = Tsaen_forced_stack /
Tsaen_long_channel_length) waveform. Ratio values are obtained
from simulation under the iso-area condition.
SRAM compiler we use the power-gating method only for the
final inverter of the global word line driver and the local word
line driver. This partial power-gating technique does not require
a long wakeup time or status retention circuits, because each
PMOS (MP0) connects to only eight inverters (IV0 - IV7),
which keeps the parasitic capacitance of the virtual power supply
very low. During standby mode only 1 of 32 WLDRV8s is
active; the others reduce the standby current because their
PMOS devices are shut off. Hspice simulation results
show that a shut-off WLDRV8 reduces standby current by 80%
at high temperature, at the expense of
about 5% area overhead and about 8% speed penalty.
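As a rough illustration of what these figures imply for the whole group of word line drivers, the sketch below combines the "1 of 32 active" and "80% reduction" numbers. It assumes, beyond what the text states, that every WLDRV8 draws the same standby current when active and that the 80% reduction applies uniformly to each shut-off driver; the function name and parameters are hypothetical.

```python
def standby_current_fraction(num_drivers=32, num_active=1, shutoff_reduction=0.80):
    """Fraction of the un-gated driver standby current that remains.

    Assumption (not from the paper): all drivers leak equally when active,
    and each shut-off driver keeps (1 - shutoff_reduction) of that leakage.
    """
    if not 0 <= num_active <= num_drivers:
        raise ValueError("num_active must be between 0 and num_drivers")
    num_gated = num_drivers - num_active
    remaining = num_active + num_gated * (1.0 - shutoff_reduction)
    return remaining / num_drivers


if __name__ == "__main__":
    fraction = standby_current_fraction()
    print(f"Remaining driver standby current: {fraction:.1%}")
```

Under these assumptions the drivers retain (1 + 31 x 0.2) / 32 of their un-gated standby current, that is, about 22.5%. This covers only the word line drivers, not the standby current of the whole macro.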
B. Sensing Enable
The bit-line-decoupled latch-type sense amplifier (SA) [9] is
used in the SRAM compiler. Fig. 4 shows the SA block
diagram and the sensing enable pulse generating circuits,
whose output signal SAEN switches all SAs
located in the same block. A latch circuit (keeper) is shared by
the SAs located in Block 1 and Block 2. While signal SAEN is
active, the SA drives the sensing result to SAOUT
and the keeper latches the new data. As the power
supply decreases, the drive ability of the SA weakens,
but the parasitic capacitance of wire SAOUT remains almost
unchanged. The sensing enable design must therefore keep
enough design margin.
Fig. 7. Test-chip layout.
ranging from 64b to 512kb in a variety of aspect ratios. The
features of the test-chip are listed in Table II.
                            32768X16             8192X8
Configuration               BM     | CM          BM     | CM
Area (mm²)                  0.852  | 0.686       0.146  | 0.108
Area Comparison                 +24.20%              +35.18%
Static Current (µA)         9.661  | 8.277       1.320  | 1.187
DC Comparison                   +16.72%              +11.20%
Dynamic Current (mA/MHz)    0.029  | 0.041       0.011  | 0.019
AC Comparison                   -29.27%              -42.11%
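The comparison rows of Table III appear to be the relative difference of the BM macro against the CM macro, (BM - CM) / CM. That reading is inferred from the numbers rather than stated in the text, and the short sketch below recomputes the rows on that assumption. The dictionary layout and function names are hypothetical.

```python
# (BM, CM) value pairs transcribed from Table III.
TABLE_III = {
    "32768X16": {
        "area_mm2": (0.852, 0.686),
        "static_uA": (9.661, 8.277),
        "dynamic_mA_per_MHz": (0.029, 0.041),
    },
    "8192X8": {
        "area_mm2": (0.146, 0.108),
        "static_uA": (1.320, 1.187),
        "dynamic_mA_per_MHz": (0.011, 0.019),
    },
}


def relative_difference_percent(bm, cm):
    """Relative difference of BM against CM, in percent (assumed definition)."""
    return (bm - cm) / cm * 100.0


def comparison_rows(table):
    """Recompute the Area/DC/AC comparison rows for every macro."""
    return {
        macro: {
            metric: relative_difference_percent(bm, cm)
            for metric, (bm, cm) in metrics.items()
        }
        for macro, metrics in table.items()
    }


if __name__ == "__main__":
    for macro, rows in comparison_rows(TABLE_III).items():
        for metric, percent in rows.items():
            print(f"{macro:9s} {metric:20s} {percent:+.2f}%")
```

The recomputed values agree with the printed rows to within 0.01 percentage point; the last digit can differ because the table seems to truncate rather than round in places.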
REFERENCES
[1] M. Yoshimoto, K. Anami, H. Shinohara, T. Yoshihara, H. Takagi, S.
Nagao, S. Kayano, and T. Nakano, "A divided word-line structure in the
static RAM and its application to a 64K full CMOS RAM," IEEE
Journal of Solid-State Circuits, vol. SC-18, no. 5, pp. 479-485, Oct.
1983.
[2] J. C. Tou, P. Gee, J. Duh, and R. Eesley, "A submicrometer CMOS
embedded SRAM compiler," IEEE Journal of Solid-State Circuits, vol.
27, no. 3, pp. 417-424, Mar. 1992.
[3] J. S. Caravella, "A low voltage SRAM for embedded applications,"
IEEE Journal of Solid-State Circuits, vol. 32, no. 3, pp. 428-432, Mar.
1997.
[4] B. S. Amrutur and M. A. Horowitz, "A replica technique for wordline
and sense control in low-power SRAM's," IEEE Journal of Solid-State
Circuits, vol. 33, no. 8, pp. 1208-1219, Aug. 1998.
[5] A. Karandiskar and K. K. Parhi, "Low power SRAM design using
hierarchical divided bit-line approach," Proceedings of the International
Conference on Computer Design: VLSI in Computers and Processors,
pp. 82-88, Oct. 1998.
[6] M. Jagasivamani and D. S. Ha, "Development of a low-power SRAM
compiler," IEEE International Symposium on Circuits and Systems
(ISCAS), vol. 4, pp. 498-501, May 2001.
[7] S. Narendra, S. Borkar, V. De, D. Antoniadis, and A. Chandrakasan,
"Scaling of stack effect and its application for leakage reduction,"
Proceedings of the International Symposium on Low Power Electronics
and Design, pp. 195-200, Aug. 2001.
[8] B. Yang and L. Kim, "A low-power SRAM using hierarchical bit line
and local sense amplifiers," IEEE Journal of Solid-State Circuits, vol. 40,
no. 6, pp. 1366-1376, Jun. 2005.
[9] S. Singh, S. Azmi, N. Agrawal, P. Phani, and A. Rout, "Architecture and
design of a high performance SRAM for SOC design," Design
Automation Conference, pp. 447-451, 2002.
[10] H. Jiang, M. M. Sadowska, and S. R. Nassif, "Benefits and costs of
power-gating technique," Proceedings of the 2005 International
Conference on Computer Design, pp. 559-566, 2005.
[11] J. T. Kao and A. P. Chandrakasan, "Dual-threshold voltage techniques
for low-power digital circuits," IEEE Journal of Solid-State Circuits, vol.
35, no. 7, pp. 1009-1018, July 2000.
V. CONCLUSION
A highly configurable embedded low-power SRAM
compiler based on an industrial 90-nm CMOS process has
been demonstrated. The compiled SRAMs greatly reduce
dynamic current by combining DWL and DBL techniques
with a replica and self-timing scheme. Sufficient
margin simulation and verification, together with robust
circuits, further give the compiled SRAMs wider
margins for correct functionality and accurate characterization.
The test-chip measurement results confirm the correctness of
the design and its low-power efficiency.
ACKNOWLEDGMENT
It is our pleasure to thank Teddy and James for help with
the test-chip design, W. T. and Jason for testing the chips, and
Willis, Jack, Alex and Ya-Qi for helpful discussions on the
circuit design.
TABLE II
FEATURES OF TEST-CHIP
Foundry           UMC
Process           90-nm 1P9M low leakage CMOS
SRAM macros       90-nm 1P5M
Supply voltage    1.2 V
Die size          4000 µm × 4000 µm
Package           QFP 208
Table III gives the power measurement results at the
operating voltage of 1.2 V for two SRAM macros in the
test-chip (table column BM). The data for the CM (Column Mux.)
macros (with bit line partition architecture) come from
Faraday commercial SRAM compiler datasheets. The results
show that this design reduces dynamic current by 29% for
the 512Kb SRAM and by 42% for the 64Kb SRAM. The
static current increases slightly in this work because of the
additional circuit overhead of the DWL implementation.
However, the total average current dissipation is still reduced,
because it is dominated by dynamic current dissipation.
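To make the last point concrete, the sketch below estimates the operating frequency above which the dynamic saving outweighs the static penalty. It assumes a simple linear model, I_total = I_static + I_dynamic x f, which the paper does not state; the function names are hypothetical, and the input values are the Table III figures.

```python
def total_current_uA(static_uA, dynamic_mA_per_MHz, freq_MHz):
    """Total current in uA under the assumed linear model."""
    return static_uA + dynamic_mA_per_MHz * 1000.0 * freq_MHz


def break_even_freq_MHz(bm, cm):
    """Frequency at which BM and CM draw the same total current.

    bm, cm: (static current in uA, dynamic current in mA/MHz).
    Assumes BM has higher static but lower dynamic current than CM.
    """
    static_penalty_uA = bm[0] - cm[0]
    dynamic_saving_uA_per_MHz = (cm[1] - bm[1]) * 1000.0
    if dynamic_saving_uA_per_MHz <= 0:
        raise ValueError("BM has no dynamic-current saving over CM")
    return static_penalty_uA / dynamic_saving_uA_per_MHz


if __name__ == "__main__":
    # (static uA, dynamic mA/MHz) pairs taken from Table III.
    macros = {
        "32768X16": ((9.661, 0.029), (8.277, 0.041)),
        "8192X8": ((1.320, 0.011), (1.187, 0.019)),
    }
    for name, (bm, cm) in macros.items():
        print(f"{name}: break-even at {break_even_freq_MHz(bm, cm):.3f} MHz")
```

Under this assumed model the break-even point is about 0.12 MHz for the 32768X16 macro and about 0.017 MHz for the 8192X8 macro, so at any practical clock rate the BM macros draw less total current.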
TABLE III
COMPARISON WITH OTHER WORKS
Silicon measurement confirmed complete functionality of all
memories over the voltage (0.9 V to 1.8 V) and temperature
(-40 °C to 125 °C) ranges. The memory BIST (Built-In Self-Test)
embedded in the test-chip applied the SCAN, March C- and
March C+ patterns. An embedded PLL was used for
SRAM timing measurements and high-speed memory BIST
testing (the maximum frequency can reach 500 MHz).
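For readers unfamiliar with March tests, the sketch below runs the textbook March C- sequence, {any(w0); up(r0,w1); up(r1,w0); down(r0,w1); down(r1,w0); any(r0)}, on a software model of a bit array. The paper does not describe its BIST implementation, so this is only an illustration of the pattern itself. The memory model, the stuck-at fault class and all names are hypothetical, and the SCAN and March C+ patterns are not covered.

```python
def march_c_minus(memory):
    """Apply March C- to a mutable sequence of bits.

    Returns a list of (element index, address, expected, observed) mismatches.
    """
    size = len(memory)
    ascending = range(size)
    descending = range(size - 1, -1, -1)
    # Each element: (address order, value expected on read, value written).
    elements = [
        (ascending, None, 0),   # any order: w0
        (ascending, 0, 1),      # up:   r0, w1
        (ascending, 1, 0),      # up:   r1, w0
        (descending, 0, 1),     # down: r0, w1
        (descending, 1, 0),     # down: r1, w0
        (ascending, 0, None),   # any order: r0
    ]
    failures = []
    for index, (order, expected, written) in enumerate(elements):
        for address in order:
            if expected is not None and memory[address] != expected:
                failures.append((index, address, expected, memory[address]))
            if written is not None:
                memory[address] = written
    return failures


class StuckAtMemory(list):
    """Hypothetical fault model: one cell always holds a fixed value."""

    def __init__(self, size, stuck_address, stuck_value):
        super().__init__([0] * size)
        self.stuck_address = stuck_address
        self.stuck_value = stuck_value
        super().__setitem__(stuck_address, stuck_value)

    def __setitem__(self, address, value):
        if address == self.stuck_address:
            value = self.stuck_value  # writes to the faulty cell are ignored
        super().__setitem__(address, value)


if __name__ == "__main__":
    print("fault-free:", march_c_minus([1, 0, 1, 1]))
    print("stuck-at-1:", march_c_minus(StuckAtMemory(4, 2, 1)))
```

A fault-free array returns an empty list, while the stuck-at-1 cell is flagged by every read that expects 0 at that address.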