
Multi Dimensional Parity Based Hamming Codes for Correcting the SRAM Memory Faults Under High EMI Conditions

1ALEKYA VADINALA, 2G. KIRAN KUMAR

1M.Tech-VLSI, Dept. of ECE, ASCET, Gudur, Andhra Pradesh, India.
2M.Tech, Associate Professor, Dept. of ECE, ASCET, Gudur, Andhra Pradesh, India.

Abstract: Exposure to electromagnetic radiation (high-speed α-ray particles) is a prominent problem for the
semiconductor memories of on-board computing units used in space applications. This paper therefore proposes an
error detection and correction method to protect semiconductor memories against soft errors. The method is based on
2-D parities. Parity bits are calculated for each row, each column, and each diagonal in the forward slash and
backslash directions of a memory array. The parities are regenerated at the receiver end, and comparing the received
parity bits with the regenerated ones detects an error. As soon as an error is detected, the code corrects it.
Hamming code is used for error detection and correction. It uses parity codes in each of the four directions
(horizontal, vertical, forward slash diagonal, and backslash diagonal) of the data part. The correction code can
correct one error in each row, column, forward slash diagonal, and backslash diagonal. The method is implemented on
an FPGA device and evaluated for the on-chip RAM of a Virtex device. It is a promising technique for detecting and
correcting errors in semiconductor memories in the presence of large electromagnetic interference and hazards, with
low computational complexity.

Proceedings of International Academic Conference on Electrical, Electronics and Computer Engineering, 8th Sept. 2013, Chennai, India
ISBN: 978-93-82702-28-3

I. INTRODUCTION

In many computer systems, the contents of memory are protected by an error detection and correction (EDAC) code.
Single-bit upsets and multiple-bit upsets severely impact the field-level product reliability of semiconductor
memories. To maintain a good level of reliability, memory cells must be protected, and various error detection and
correction methods are used for this purpose. The goal of any EDAC code is to provide protection against transient
errors (soft errors) that manifest themselves as bit flips in the memory. These errors can be caused by single event
upsets (SEUs), power fluctuations, or electromagnetic interference. When a single charged particle strikes the
silicon, it loses its energy through the production of electron-hole pairs, resulting in a dense ionized track in the
local region. This event can provoke a current pulse, which can upset the operation of the integrated circuit. This
work targets the SEU phenomenon, which is defined as a random bit inversion in a storage cell. The targets are
registers, memories, and all latches and flip-flops in general. SEU protection techniques must consider all storage
cells in order to achieve full reliability in the system. Although SEUs are the major concern in space applications,
multiple bit upsets (MBUs) also need to be addressed because of nanometric technologies: when a single high-energy
ion passes through the silicon, it can energize two or more adjacent memory cells.

A. Hardware Redundancy vs. Software EDAC:
To protect semiconductor memories, software EDAC or redundancy can be used. Redundancy can be hardware redundancy,
provided by extra components; time redundancy, provided by extra execution time or by different moments of storage;
or a combination of both. To allow redundancy to detect permanent faults, the repeated computations are performed
differently. TMR (Triple Modular Redundancy) is a suitable technique for SRAM-based FPGAs because of its full
hardware redundancy property in the combinational and sequential logic. Hardware redundancy techniques can be one
solution for protecting memories, but they are very expensive. When hardware redundancy is not possible, software
solutions have to be used. By using software EDAC, transient faults in the combinational logic will never be stored
in the storage cells, and bit flips in the storage cells will never occur or will be immediately corrected.

For applications where read and write operations are done in blocks of words, such as secondary storage systems made
of solid-state memories (RAM discs), software-implemented EDAC could be a better choice than hardware EDAC, because
it can be used with a simple memory system and it provides the flexibility of implementing more complex coding
schemes. With software EDAC, the data read from main memory may be erroneous if an error occurs after the last scrub
operation and before the time of reading. In other words, single-bit errors may cause failures. In contrast, hardware
EDAC checks all the data that is read from memory and corrects single-bit errors. Therefore, hardware EDAC provides
better reliability and, when possible, should be the first choice for protecting the main memory. When hardware EDAC
is not available or affordable, software EDAC can be used as a low-cost solution for enhancing the reliability of
systems. For cases where data is read and written in blocks of words rather than individual words, software EDAC may
be a better choice than hardware EDAC.

The paper is organized as follows: previous related work is reviewed in Section 2, the proposed method of error
detection and correction is described in Section 3, results are given in Section 4, and the paper is concluded in
Section 5.
II. PREVIOUS WORK

To maintain a good level of reliability, memory cells must be protected with protection codes, and various error
detection and correction methods are used for this purpose. The method in [1] is based on hardware and time
redundancy. Although this technique reduces the number of input and output pins of the combinational logic, it
requires additional encoding/decoding circuitry. The reliability issue can be solved, but hardware redundancy schemes
such as duplication or triple modular redundancy are expensive. In [2], the encoder and the decoder can use any error
detection and correction code, but the data is only coded in write operations and decoded in read operations. The
accumulation of upsets is therefore likely to occur, and it depends on how frequently the application issues read and
write requests. To avoid this accumulation, extra logic is needed that constantly detects and corrects upsets in all
coded data. In the architecture given in [3], Memory Mapped Error Correction Code minimizes the need for SRAM
resources. It maintains the error correction code information by placing it within the memory hierarchy and reusing
existing cache storage and control, but it needs redundant information to be stored for error correction.

Hamming code and TMR are compared in [4]. The comparison shows that TMR significantly increases the area of the
memory cells, while the voter is implemented with a small number of logic gates; the number of voters increases
linearly with the number of bits in the protected word. Hamming code produces a small increase in the number of
storage cells, but it needs large coder and decoder logic blocks, which considerably increase the area and the delay
of the critical path. The EDAC method given in [5] is again based on TMR, so it increases the density, as it is a
hardware redundancy method. In [6], active and passive watchdog timers and a voting system are used for error
correction. The algorithm given in [7] periodically activates the available memory protection scheme (parity or
Hamming code) and uses its error detection capabilities to unveil errors in the stored data. The method given in [8]
reduces power consumption in single-error-correcting, double-error-detecting checker circuits that perform memory
error correction. This method can be employed to solve the nonlinear power optimization problem, but it involves
tedious computation of the H-matrix. In [9], heavy-ion and proton measurements that cause single-event multiple
upsets are presented. The hardware configurations used to test the high-density devices in these measurements should
be able to distinguish between single-particle single upsets and single-particle multiple upsets.

The method in [10], named HVD, provides a very high detection coverage rate and can correct up to three upsets in a
data array. It uses parity codes in four directions in the data part to assure the reliability of memories, and it
can detect and correct errors in the real data bits. If a parity bit is itself erroneous, that error is detected by
generating parity bits for the parities, that is, syndrome bits, but this is a complicated process. This paper
presents an easier way to find errors in the parity bits. The data bits and parity bits are taken together as a whole
word, and these words are viewed as an m x n array. Hamming code is used for error detection and correction on this
whole word, which contains both the data bits and the parity bits, across the length of the array. After an error is
found, it can be determined whether it is in a data bit or in a parity bit.

III. PROPOSED METHOD

The proposed method, HVD (Horizontal-Vertical-Diagonal), is based on 2-D parities. Parities are generated in four
directions: horizontal, vertical, forward slash diagonal, and backslash diagonal. The 8 x 8 array is given in
figure 1, where the symbols h, v, f and b denote the parity bits in the horizontal, vertical, forward slash and
backslash directions respectively, and the subscripts indicate the position of the parity.

A. Detection Method:
For the whole array, parities are calculated in all the directions at the receiver end (for example, from h1 to h8
in the horizontal direction in figure 1). These calculated parities are compared against the received parities. If
the comparison shows no difference, the received data is correct and no correction is required. If there is a
difference between the received and calculated parities, the erroneous parity lines are identified and the
correction process starts.
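
The detection step can be illustrated with a minimal Python sketch. It is not the authors' VHDL implementation. The function names `hvd_parities` and `failing_lines`, the use of even parity, the choice of one parity bit for each of the 2n-1 diagonals in each diagonal direction, and the assumption that the stored parity bits are themselves error-free are all assumptions made for illustration.

```python
def hvd_parities(data):
    """Even-parity bits of a square bit array in four directions (assumed layout)."""
    n = len(data)
    h = [0] * n              # horizontal: one parity bit per row
    v = [0] * n              # vertical: one parity bit per column
    f = [0] * (2 * n - 1)    # forward slash diagonals, indexed by r + c
    b = [0] * (2 * n - 1)    # backslash diagonals, indexed by r - c + n - 1
    for r in range(n):
        for c in range(n):
            bit = data[r][c]
            h[r] ^= bit
            v[c] ^= bit
            f[r + c] ^= bit
            b[r - c + n - 1] ^= bit
    return {"h": h, "v": v, "f": f, "b": b}


def failing_lines(received_data, received_parities):
    """Recompute the parities and list the lines that disagree with the received ones."""
    recomputed = hvd_parities(received_data)
    return {
        d: [i for i, (x, y) in enumerate(zip(recomputed[d], received_parities[d])) if x != y]
        for d in "hvfb"
    }


if __name__ == "__main__":
    data = [[0] * 8 for _ in range(8)]
    data[0][3] = 1
    data[6][6] = 1
    stored = hvd_parities(data)          # parities computed before storage
    received = [row[:] for row in data]
    received[2][5] ^= 1                  # inject one bit flip
    print(failing_lines(received, stored))
    # {'h': [2], 'v': [5], 'f': [7], 'b': [4]}
```

A single flipped bit makes exactly one line fail in each direction, and those four lines are the erroneous parity lines passed on to the correction step.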

Figure 1. 2-D coded array



B. Correction Method:
At the end of the detection process, the erroneous parity lines are marked with a circle, as shown in figure 2. For
correction, the candidate bits are marked first.

Figure 2. Coded array with erroneous parities

Wherever at least two of the four erroneous parity lines intersect, that bit in the array is marked as a candidate
bit. The candidate bits for the erroneous parity lines are shown with black squares in figure 3.

Figure 3. Coded array with candidate bits

To find the erroneous bits among the candidate bits, all the candidate bits are checked. A candidate bit at which
all four erroneous parity lines intersect is an erroneous bit; otherwise the bit is correct and is removed from the
set of candidates. The error bits for the set of candidate bits in figure 3 are shown with dark circles in figure 4.
These erroneous bits are flipped to correct them.

Figure 4. Coded array with error bits

To find whether the erroneous bits are data bits or parity bits, the position of each error bit is checked: if the
position of the erroneous bit is 2^k (for k = 0, 1, 2, 3, ...), it is an erroneous parity bit; otherwise it is an
erroneous data bit.

IV. RESULTS

This method is emulated on a Virtex-II Pro platform, and the robustness of the technique is evaluated by random
fault injection. The results show that a large combination of multiple faults can be corrected. The number of errors
that can be corrected depends on the length of the coded word array and increases with it. The maximum number of
errors that can be corrected for an 8 x 8 array is shown with dark circles in figure 5.

Figure 5. 8 x 8 coded array with maximum errors

A. Hardware Analysis:
This method requires only 2 adders, 1 multiplexer and 3 XOR gates for the different code lengths, as shown in
table 1. It does not require any extra hardware as needed in [1] and [2], as it is not a hardware redundancy method.

TABLE 1. HARDWARE ANALYSIS FOR DIFFERENT WORD LENGTH

B. Time Analysis:
For the given method, real time and CPU time are analysed on the same platform for word lengths of 4, 8, 16 and
32 bits. The analysis shows that as the length of the code increases, the time required to correct the error
increases. The CPU time for the different code lengths is given in table 2.

TABLE 2. TIME ANALYSIS FOR DIFFERENT WORD LENGTH
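
The candidate-bit procedure of the correction method (Section III-B) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation. The helper names `line_ids`, `parities` and `correct`, the even-parity convention, the 2n-1 diagonals per direction, and the assumption that the stored parity bits are error-free are all hypothetical choices.

```python
def line_ids(r, c, n):
    """Indices of the row, column, forward slash and backslash lines through cell (r, c)."""
    return {"h": r, "v": c, "f": r + c, "b": r - c + n - 1}


def parities(data):
    """Even parity of every row, column and diagonal of a square bit array."""
    n = len(data)
    p = {"h": [0] * n, "v": [0] * n, "f": [0] * (2 * n - 1), "b": [0] * (2 * n - 1)}
    for r in range(n):
        for c in range(n):
            for d, i in line_ids(r, c, n).items():
                p[d][i] ^= data[r][c]
    return p


def correct(received, stored):
    """Mark candidate bits, keep those on four failing lines, and flip them."""
    n = len(received)
    now = parities(received)
    # Erroneous parity lines: recomputed parity differs from the stored parity.
    bad = {d: {i for i in range(len(stored[d])) if now[d][i] != stored[d][i]} for d in stored}

    def hits(r, c):
        # Number of erroneous parity lines passing through cell (r, c).
        return sum(i in bad[d] for d, i in line_ids(r, c, n).items())

    # Candidate bits: at least two erroneous lines intersect at the cell.
    candidates = [(r, c) for r in range(n) for c in range(n) if hits(r, c) >= 2]
    # Error bits: all four erroneous lines intersect at the cell.
    errors = [(r, c) for (r, c) in candidates if hits(r, c) == 4]
    fixed = [row[:] for row in received]
    for r, c in errors:
        fixed[r][c] ^= 1
    return candidates, errors, fixed


if __name__ == "__main__":
    original = [[(r * 3 + c) % 2 for c in range(8)] for r in range(8)]
    stored = parities(original)
    received = [row[:] for row in original]
    received[1][1] ^= 1
    received[4][6] ^= 1
    candidates, errors, fixed = correct(received, stored)
    print(errors)             # [(1, 1), (4, 6)]
    print(fixed == original)  # True
```

In this run, cell (1, 6) lies on an erroneous row and an erroneous column only, so it is marked as a candidate and then discarded. Two flips that share a line cancel in that line's parity, so neither bit reaches four failing lines; this matches the stated limit of one correctable error per line.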


The VHDL codes are written and simulated for 4, 8, 16 and 32 bit codes. The simulation result for a seven-bit coded
word containing four data bits and a three-bit parity code is given in figure 6.

Figure 6. Simulation results for a 7-bit word

There is no need for the extra calculation of syndrome bits for the parity bits that is needed in [10]. This method
uses Hamming code for error detection and correction, so it can detect and correct only a single error in a
particular line of the coded word array; this is the only limitation of the method. Any other correction code can be
used with this method to increase the number of errors that can be corrected in a single line.
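
For the seven-bit coded word with four data bits and three parity bits, the single-error correction and the power-of-two position rule can be illustrated with a standard Hamming(7,4) code. The Python sketch below is not the authors' VHDL code. The function names and the bit ordering (parity bits at positions 1, 2 and 4, data bits at positions 3, 5, 6 and 7) follow the usual textbook convention and are assumed here.

```python
def hamming74_encode(d):
    """Encode 4 data bits into a 7-bit word; list index 0 holds position 1."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # checks positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # checks positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(word):
    """Return (corrected word, error position or 0, 'parity' / 'data' / None)."""
    # XOR of the positions of all set bits is 0 for a valid codeword and
    # equals the position of the flipped bit after a single error.
    syndrome = 0
    for pos, bit in enumerate(word, start=1):
        if bit:
            syndrome ^= pos
    if syndrome == 0:
        return word[:], 0, None
    fixed = word[:]
    fixed[syndrome - 1] ^= 1
    # A position that is a power of two (1, 2, 4) holds a parity bit.
    kind = "parity" if syndrome & (syndrome - 1) == 0 else "data"
    return fixed, syndrome, kind


if __name__ == "__main__":
    code = hamming74_encode([1, 0, 1, 1])
    print(code)                        # [0, 1, 1, 0, 0, 1, 1]
    bad = code[:]
    bad[5] ^= 1                        # flip position 6 (a data bit)
    print(hamming74_correct(bad))      # ([0, 1, 1, 0, 0, 1, 1], 6, 'data')
    bad = code[:]
    bad[3] ^= 1                        # flip position 4 (a parity bit)
    print(hamming74_correct(bad))      # ([0, 1, 1, 0, 0, 1, 1], 4, 'parity')
```

Because the syndrome directly gives the error position, one check on that position is enough to tell a parity-bit error from a data-bit error.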

CONCLUSION

In this paper, an easy method with fewer computations is presented. It finds the errors by removing candidate bits,
which is simpler than the method given in [10] and so reduces the complexity. The method can detect and correct
errors in the data bits as well as in the parity bits without any extra calculations. As it is a software EDAC
method, no extra hardware circuitry is required; transient faults in the combinational logic are corrected before
they can be stored in the storage cells, and bit flips in the storage cells are corrected immediately. With this
method, a large combination of multiple faults can be corrected. The number of errors that can be corrected
increases with the length of the coded word array.

REFERENCES

[1] Fernanda Lima, Luigi Carro, Ricardo Reis, "Designing Fault Tolerant Systems into SRAM-based FPGAs", DAC'03,
Anaheim, California, USA, June 2-6, 2003.
[2] C. Argyrides, H. R. Zarandi, D. K. Pradhan, "Multiple upsets tolerance in SRAM memory", International Symposium
on Circuits and Systems, New Orleans, LA, May 2007.
[3] Doe Hyun Yoon, Mattan Erez, "Memory Mapped ECC: Low-Cost Error Protection for Last Level Caches", ISCA'09,
Austin, TX, USA, June 20-24.
[4] R. Hentschke, F. Marques, F. Lima, L. Carro, A. Susin, R. Reis, "Analyzing Area and Performance Penalty of
Protecting Different Digital Modules with Hamming Code and Triple Modular Redundancy", IEEE Proceedings of the 15th
Symposium on Integrated Circuits and Systems Design (SBCCI'02), 2002.
[5] Y. Bentoutou, "Program Memories Error Detection and Correction On-Board Earth Observation Satellites", World
Academy of Science, Engineering and Technology 66, 2010.
