Digital Design Assignment (25 marks)

Analog and digital VLSI design

Objective:

• To learn to write synthesizable Verilog/VHDL code.

• To learn SystemVerilog for verification.

• To develop your intuition for design.

• To be able to optimize a design for speed and power.

• To gain a practical understanding of low-power, high-speed design techniques.

• To use EDA tools efficiently in system design.

General Instructions:

• You are given 9 digital subsystems that are commonly used in processor design. Choose
any one of them and follow the design methodology given below.

• Each group may have at most TWO students.

• At most 6 groups can register for any one of the 9 projects listed. Projects will be
allotted on a first-come, first-registered basis.

• Please inform the IC of your chosen circuit by 24 Jan. 2010.

• You will need to work with SoC Encounter, ModelSim, RTL Compiler, Leonardo Spectrum, etc.

Extra reading is required to understand these circuits. You will also need to explore the tools on your own for the implementation.

Design instructions:

• Understand the function. Design the architecture (the important part) yourself.

• Optimize your design for speed and power. The subsystem is part of a bigger synchronous design.

• Refer to recent papers and choose your own specs. Because the assignment is graded relatively, the more
challenging your specs are compared with those of other groups, the more marks you will be awarded.
"Challenging" refers to how close your specs are to the desired ones.

• Characterize your design by tabulating the obtained values of as many as possible of the parameters (specs)
that you studied in class.

• If you are not able to meet your specs, you can go ahead with the system design, but you must explain why
at the time of demonstration.

• If required, you can change the specifications. However, you must give a clear justification.

• Validate your design at all process corners and across temperature variation. Keep the power dissipation as
small as possible.

• Submit a soft copy of a DETAILED report of your assignment to the IC.

• Your design should meet its specifications at all process corners, with temperature varying from -40 °C to 125 °C.

While designing in RTL Compiler, constraints should be set on:

• Clock (max. possible),
• Clock skew (min. possible),
• Max. fanout (4),
• Input-output pin delay (worst case),
• Operating conditions: worst case (temperature, process and voltage)

Common Specifications:

Technology node: UMC 180 nm CMOS / Faraday 90 nm technology

VDD = 3 V, or the value defined in the technology library

Circuits:

There are 10 digital circuits given below. Some of the required information for each circuit is
given to you. Typical values of performance parameters are also listed in a table.

For most of the circuits given, switches are required. Implement each switch with a simple NMOS or PMOS
transistor, or both (to decrease its ON resistance).

TOPIC 1: 4 KB 4-Way Set-Associative Cache Memory

The implementation of a 4-way set-associative cache is shown in the following diagram. (An n-way
set-associative cache can be implemented in a similar manner.) The index part of the input address is used to
find the proper row in the data memory array and the tag memory array. In this case, however, each row (set)
corresponds to four cache lines (four ways). A row in the data memory holds four cache lines (for 32-byte cache
lines, 128 bytes), and a row in the tag memory array contains four tags and the status bits for those tags
(2 bits per cache line). The tag memory and the data memory are accessed in parallel, but the output data driver
is enabled only if there is a cache hit.

Apply techniques that help achieve the increasingly stringent design targets and constraints of modern
processors. In particular, consider techniques that let the cache be accessed quickly while still achieving a good
hit ratio with near-zero leakage power. Also consider issues such as area cost and bandwidth requirements.
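
The index/tag lookup described above can be sketched as a small behavioural model in Python, which may be handy as a reference when checking RTL in simulation. This is only a sketch under stated assumptions: the 32-bit address width and the LRU replacement policy are not given in this handout, the names split_address and Cache are hypothetical, and the status bits and the data array are left out.

```python
# Behavioural sketch of a 4 KB, 4-way set-associative cache lookup.
# Assumptions (not from the handout): 32-bit addresses, LRU replacement.
# Status bits and the data array are omitted; only tags are tracked.

CACHE_BYTES = 4 * 1024
LINE_BYTES = 32
WAYS = 4
ADDR_BITS = 32  # assumed address width

NUM_SETS = CACHE_BYTES // (LINE_BYTES * WAYS)    # 32 sets (rows)
OFFSET_BITS = LINE_BYTES.bit_length() - 1        # 5 bits select a byte in a line
INDEX_BITS = NUM_SETS.bit_length() - 1           # 5 bits select the set
TAG_BITS = ADDR_BITS - INDEX_BITS - OFFSET_BITS  # remaining 22 bits form the tag


def split_address(addr):
    """Split an address into (tag, index, offset) fields."""
    offset = addr & (LINE_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


class Cache:
    def __init__(self):
        # Each set holds up to WAYS tags, ordered least- to most-recently used.
        self.sets = [[] for _ in range(NUM_SETS)]

    def access(self, addr):
        """Return True on a hit, False on a miss (the line is then filled)."""
        tag, index, _ = split_address(addr)
        ways = self.sets[index]
        if tag in ways:
            ways.remove(tag)   # move the tag to the most-recently-used position
            ways.append(tag)
            return True
        if len(ways) == WAYS:
            ways.pop(0)        # evict the least-recently-used way
        ways.append(tag)
        return False
```

In hardware the four tag comparisons of a set happen in parallel; the sequential search here only models the result, not the timing.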

TOPIC 2: Dual-Loop Delay-Locked Loop

In synchronous systems, the integrated circuits in the system are synchronised to a common reference clock. This
synchronisation often cannot be achieved by distributing a single reference clock to each of the integrated circuits,
for the following reasons, among others. When an integrated circuit receives a reference clock, the circuit often must
condition the reference clock before it can use the clock. For example, the circuit may buffer the incoming reference
clock or may convert the incoming clock from one voltage level to another. This processing introduces its own delay,
with the result that the processed reference clock, which will be referred to as a local clock, often will no longer be
adequately synchronised with the incoming reference clock. The trend towards faster system clock speeds further
aggravates this problem, since faster clock speeds reduce the amount of delay, or clock skew, that can be tolerated.

To remedy this problem, an additional circuit is typically used to synchronise the local clock to the reference clock.
Two common circuits which are used for this purpose are the PLL and the DLL.

USEFUL REFERENCE:

• Uploaded on site

TOPIC 3: Design a High-Speed Ten-Operand 128-Bit Carry-Save Adder

The most important application of a carry-save adder is to sum the partial products in integer multiplication. This
allows for architectures where a tree of carry-save adders (a so-called Wallace tree) is used to reduce the partial
products very quickly. One 'normal' adder is then used to add the last set of carry bits to the last partial sum to give
the final multiplication result. Usually, a very fast carry-lookahead or carry-select adder is used for this last stage in
order to obtain the best performance.
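
The carry-save reduction described above can be sketched as a bit-accurate Python reference model. It is only a sketch, not a synthesizable design: the function names csa and add_operands are hypothetical, the grouping of operands is one simple choice among several, and the result is taken modulo 2^128 (an assumption about how overflow is handled).

```python
WIDTH = 128
MASK = (1 << WIDTH) - 1  # results are kept modulo 2**128 (assumed overflow rule)


def csa(a, b, c):
    """3:2 compressor: three words in, a sum word and a carry word out.

    No carry propagates between bit positions, so the delay of one level
    does not depend on the word width.
    """
    s = a ^ b ^ c
    carry = ((a & b) | (b & c) | (a & c)) << 1
    return s & MASK, carry & MASK


def add_operands(operands):
    """Reduce any number of operands with CSA levels, then one final add."""
    words = [x & MASK for x in operands]
    while len(words) > 2:
        nxt = []
        # Compress groups of three; leftover words pass to the next level.
        for i in range(0, len(words) - 2, 3):
            nxt.extend(csa(words[i], words[i + 1], words[i + 2]))
        nxt.extend(words[len(words) - len(words) % 3:])
        words = nxt
    # Final carry-propagate addition (the 'normal' adder of the last stage).
    return sum(words) & MASK
```

With ten operands this grouping needs five CSA levels (10, 7, 5, 4, 3, 2 words) before the final carry-propagate adder.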

Useful References:

1. Carry-Save Addition, Prof. Loh, CS3220 - Processor Design, February 2, 2005

TOPIC 4: Design of a Low-Power, High-Speed 64 x 64-bit Multiplier

You should try to reduce switching activity and achieve low power dissipation through the Sign-Magnitude (SM)
notation for the multiplicand and through a novel design of the Redundant Binary (RB) adder and Booth decoder. The
high-speed operation may be achieved through the Carry-Propagation-Free (CPF) accumulation of the Partial
Products (PP) by using the RB notation.
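
As a starting point for the Booth decoder mentioned above, the following Python sketch shows radix-4 Booth recoding of the multiplier. The radix is an assumption (the handout does not state one), the function names are hypothetical, and the sketch does not model the SM multiplicand or the RB accumulation; it only shows how the recoded digits select the partial products.

```python
def booth_radix4_digits(y, width):
    """Recode a two's-complement multiplier into radix-4 Booth digits.

    `width` must be even. Returns width/2 digits, least significant first,
    each in {-2, -1, 0, 1, 2}, so that sum(d_k * 4**k) equals the signed
    value of y.
    """
    if width % 2:
        raise ValueError("width must be even")
    y &= (1 << width) - 1
    digits = []
    prev = 0  # implicit bit y[-1] = 0
    for i in range(0, width, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)  # digit = y[i] + y[i-1] - 2*y[i+1]
        prev = b1
    return digits


def booth_multiply(x, y, width):
    """Signed product x*y, with y given as a `width`-bit two's-complement value."""
    digits = booth_radix4_digits(y, width)
    # Each digit selects one partial product: 0, +/-x or +/-2x, weighted by 4**k.
    return sum(d * x * 4 ** k for k, d in enumerate(digits))
```

For a 64-bit multiplier this recoding gives 32 partial products instead of 64, which is what the accumulation stage then has to add.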

USEFUL REFERENCE:
[1] N. Takagi, et al., “High-Speed VLSI Multiplication Algorithm with a Redundant Binary Addition Tree,” IEEE
Trans. on Computers, Vol. C-34, No. 9, pp. 789-796, September 1985.

[2] H. Makino, et al., “A 8.8-ns 54x54-bit Multiplier Using New Redundant Binary Architecture,” Proceedings of 1993
International Conference on Computer Design, Cambridge, MA, USA, pp. 202-205, October 3-6, 1993.

TOPIC 5: Design a Low-Power 32-bit Binary-Binary Logarithmic-Antilogarithmic Converter


USEFUL REFERENCE:

1.) CMOS VLSI Implementation of a Low-Power Logarithmic Converter, Khalid H. Abed, Senior Member,
IEEE, and Raymond E. Siferd, Member, IEEE

2.) IEEE Transactions on Computers, Vol. 52, No. 11, November 2003

TOPIC 6: Design a 32-bit x 32 Radix-4 SRT Divider

Useful reference: Computer Arithmetic: Algorithms and Hardware Designs, by Behrooz Parhami

TOPIC 7: Design of a Fully Pipelined CORDIC Processor for OFDM-Based WLAN

Useful reference: European Journal of Scientific Research, ISSN 1450-216X, Vol. 27, No. 4 (2009),
pp. 588-596

TOPIC 8: Design of a real-time autocorrelator to perform the autocorrelation of 128 samples, each
16 bits wide.

Useful references:

1. Performance Comparison of Autocorrelation and CORDIC Algorithm Implemented on FPGA for OFDM Based
WLAN, 2009 International Conference on Communication Software and Networks, Macau, China,
February 27-28

2. Optimizing Implementation of Autocorrelation Function. Authors: Eng. Mihai Nicolae (Newgate Design SRL),
Dr. Eng. Ioan Rugina (Institute of Solid Mechanics), Eng. Alexandru Vasile (University "Politehnica" Bucharest)

TOPIC 9: Design of HASH encryption (decryption) algorithm SHA-512

Useful reference: Comparative Analysis of the Hardware Implementations of Hash Functions SHA-1
and SHA-512, Tim Grembowski, Roar Lien, Kris Gaj, Nghi Nguyen

TOPIC 10: Design of a programmable ring oscillator using a chain (or chains) of 101 or more inverters. There
can be multiple parallel chains of inverters which can be activated or deactivated through a digital input to change
the frequency of the oscillator by a factor of 2. You have to design a standard cell for a single inverter and explore
the tool to use this standard cell to generate an automatic schematic-driven layout. Implement your design using
your standard cell in 180 nm technology. (This is a schematic-driven layout.)
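
For a first estimate of the oscillation frequency, the ideal relation f = 1 / (2 N t_d) for N inverters with average stage delay t_d can be evaluated as below. This is only a rough sketch: the 50 ps stage delay is a hypothetical placeholder to be replaced by the value from your own simulation, the function name is hypothetical, and switching between two chain lengths is just one possible reading of the programmability requirement.

```python
def ring_frequency(n_stages, stage_delay_s):
    """Ideal oscillation frequency of an inverter ring, in hertz.

    n_stages must be odd so that the loop has a net inversion;
    stage_delay_s is the average propagation delay of one inverter.
    """
    if n_stages < 3 or n_stages % 2 == 0:
        raise ValueError("a ring oscillator needs an odd number (>= 3) of inverters")
    return 1.0 / (2 * n_stages * stage_delay_s)


T_STAGE = 50e-12  # hypothetical 50 ps per stage; use your simulated value

f_101 = ring_frequency(101, T_STAGE)  # shorter chain selected
f_203 = ring_frequency(203, T_STAGE)  # longer chain selected
ratio = f_101 / f_203                 # close to 2, but not exactly 2
```

Because each chain must contain an odd number of inverters, simply doubling the chain length cannot give a ratio of exactly 2; this is one of the points your architecture has to address.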

Useful reference: Cadence user manual and Google search engine


TOPIC 11: A New VLSI Architecture of Parallel Multiplier–Accumulator Based on Radix-2
Modified Booth Algorithm

Useful reference: IEEE Transactions, 2010

TOPIC 12: Design and Implementation of a High-Speed DDR SDRAM (Double Data Rate Synchronous
Dynamic RAM) Controller Using Verilog

Useful reference: IEEE Xplore
