
Notes on Using Error Correction with Flash Memory, March 2005, Morgan Colmer (CTO), Global Silicon R&D Labs, Cambridge, UK

1.0 Background

All flash memory products suffer from a finite number of erase cycles that they can withstand. With the die area of flash memories becoming larger all the time, the statistical probability of any given bit in the memory becoming damaged increases.

For bulk storage applications, the popular choice is NAND flash because of its increased data density compared to NOR flash. The chief drawback of NAND flash is that individual bits or bytes cannot be randomly accessed; the device is arranged, like a hard disc drive, into 512 byte sectors. When a flash IC is manufactured and tested, it is expected that some of these sectors will be damaged due to the process, and so extra sectors are available to replace those lost to general semiconductor yield issues. Often there is a complex controller that makes this process invisible to the outside; typically there are 2% extra sectors available for this. A typical NAND flash sector can be reprogrammed about 10,000 times.

Because of the inherent limited endurance of flash memories, many manufacturers put some simple error correction into the memory. Typically, they use Hamming codes and increase the sector size by a further 16 bytes to accommodate the error correction overhead, but this data space is not available to the outside system. Clearly, all of these techniques take up extra die area on the flash device.

Using this error correction, the flash memory can correct only one bit in one sector (1 bit in 4096 bits) and detect 2 bits in error per sector. The flash manufacturers claim that this is sufficient for most purposes; however, filing systems can greatly increase the level of damage sustained by certain sectors, causing the product to fail in a very short period of time.

Filing systems such as FAT16 and FAT32 save two copies of a table that tells the host processor where everything is stored on the device. Every time any part of the bulk memory is changed, these two copies of this essential data are re-written. In NAND flash memory, a single location or byte cannot be individually erased; an entire block (covering several sectors) must be erased and re-written. This causes premature failures in many devices such as thumb drives.

One solution to this is to ensure that certain frequently accessed data items are not written back to the same area of the flash memory but rotate around the memory to spread the “wear and tear” over the entire device.

Clearly, the manufacturers do not want to be constantly increasing the level of in-built sophistication of flash drives, as this increases the cost without necessarily giving a perceived benefit to the user, and the flash manufacturers are unlikely to want to make a big issue about the inherent unreliability of their products!

1.1 The Consumer Audio Market

One of the unique problems faced by the consumer audio manufacturers is that their costs are increasingly being linked to commodity memory market pricing, and as the acceptance of digital media grows, this trend will increase. The end customers, like Wal-Mart, do not allow their audio suppliers to factor this memory price fluctuation into their buy-price (as is the case in the PC market), and this leaves the audio suppliers exposed to the fickle whims and trends of the current memory market.

The consumer audio industry has constantly sought ways of overcoming this, and recycling has become commonplace. DRAM, another commodity memory product, is frequently salvaged from old SIMMs, often at a fraction of the prevailing market prices. With a revolution in NAND flash demand from the audio electronics industry poised to happen, it seems very likely that this type of memory product will also be targeted by component recycling companies.

Recycled flash memory will be characterised by a number of factors: (i) older technology and (ii) higher probability of defective sectors. Any flash controller from a new entrant to this market must be capable of accommodating recycled flash memory.

2.0 Extended Error Correction

To be able to make use of older recycled flash memory, an extended error correction scheme needs to be applied for two reasons: (i) older memory types do not even have the simple Hamming code error correction included in them, and (ii) it is likely that the capability of the Hamming codes has already been exceeded (that’s why it’s being recycled in the first place) and the flash memory is already considered “broken”.

There are many methods of performing error correction on digital data streams, and all of these involve a computational and memory overhead, some more burdensome than others. All FEC (forward error correction) systems are complex and sophisticated pieces of IP that generally take considerable time and effort to develop.

Fortunately for Global Silicon, the Sony Corporation did a lot of thinking about how to implement a powerful yet lightweight error correction algorithm called CIRC.

CIRC, or Cross Interleaved Reed-Solomon Code, is a very powerful error correction algorithm that was designed in the 1970s for the CD player standard. Because memory and MIPS were both very expensive at that time, Sony invested a lot of time and money to come up with an algorithm that was efficient on both.

If only the second stage of error correction (C2) is used, then this, in conjunction with the de-interleaving buffer, would allow up to 4096 contiguous bits in error to be corrected without a single bit of the erroneous data being seen by the host CPU. This is clearly 4096 times better than the current error correction, and without the enormous overhead that might be expected from casual inspection.

In the typical flash memory error correction algorithms used by memory suppliers, the redundancy is 16 bytes in every 512, thus 3.1% of the data stored to the flash memory is error correction overhead. In a system based upon CIRC, this redundancy level rises to 12.5% (only using C2); however, this is for a 4096:1 increase in the error correction capacity. The CIRC error correction can also re-use the extra space available from the now-unused Hamming code system, which takes the data redundancy down to only 9.4%.

The purpose of the error correction, to allow recycled flash memory to be used, is not the only advantage that this system gives: the lifetime of flash products can also be considerably increased without the need for costly silicon solutions aboard the flash memory device.

In the Global Silicon application of this technology it is anticipated that the great majority of the processing would be implemented in software but make use of the special instructions that are present from the CD data decoder to accelerate the process.

The processing overhead for a complete encoder and decoder should range from less than 1 MIPS to 4 MIPS for a typical 128 kbps MP3 file, depending on the level of errors found in the data, and the total memory use would be approximately 1.5 Kbytes.

Clearly this technology is equally applicable to DRAM or any other type of solid state memory device.

As a method of further improving the error correction capabilities, it is possible to additionally interleave the data to allow multiple sectors to be corrected. If, for example, the data was written to the memory device interleaved over 4 sectors, then the error correction system would be able to fully recover the data from 4 consecutive sectors that were completely corrupted. This extra interleaving comes at the cost of extra memory being required to process the data, but clearly the maximum length of correctable data can be extended to any length given sufficient working memory in the CPU.

The diagram below shows the increased interleaving structure when operated over four 512 byte sectors of a typical NAND flash memory.
Key
bn = bit number, wn = word number (in this
case the words are 8 bits long).
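
The idea from section 1.0 of rotating frequently written data around the memory can be illustrated with a small simulation. This is only a sketch under stated assumptions: the block count, the update count and the simple round-robin policy are hypothetical choices for illustration and do not describe any particular controller.

```python
from collections import Counter

def simulate_hot_item(n_updates, n_blocks, rotate):
    """Count erase cycles per block when one hot item (for example a
    copy of the FAT) is rewritten n_updates times.

    rotate=False: the item always lives in block 0.
    rotate=True:  the item moves to the next block after every rewrite
                  (a hypothetical round-robin policy).
    """
    erases = Counter()
    block = 0
    for _ in range(n_updates):
        erases[block] += 1                  # each rewrite costs one erase
        if rotate:
            block = (block + 1) % n_blocks  # spread the wear
    return erases

fixed = simulate_hot_item(100_000, 1024, rotate=False)
rotated = simulate_hot_item(100_000, 1024, rotate=True)
print("worst block, fixed location:", max(fixed.values()))
print("worst block, rotating      :", max(rotated.values()))
```

With a fixed location the single hot block absorbs every erase and soon passes the roughly 10,000 cycle endurance quoted above, whereas rotation keeps every block far below that figure for the same workload.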
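
The "correct one bit, detect two bits per 4096-bit sector" behaviour described in section 1.0 can be sketched as follows. This is a simplified Hamming-style scheme written for illustration, not the layout used by any flash manufacturer: it stores the XOR of the positions of all set bits plus one overall parity bit, and it assumes the stored check value itself is read back without error. The names `ecc` and `check_and_correct` are hypothetical.

```python
def ecc(sector):
    """Return (position_xor, parity) for a sector given as bytes.

    position_xor is the XOR of the bit indices of every set bit
    (12 bits are enough for a 4096-bit sector); parity is the
    overall parity of the sector.
    """
    pos_xor = 0
    parity = 0
    for byte_index, byte in enumerate(sector):
        for bit in range(8):
            if (byte >> bit) & 1:
                pos_xor ^= byte_index * 8 + bit
                parity ^= 1
    return pos_xor, parity

def check_and_correct(sector, stored):
    """Check a bytearray against its stored ECC, fixing one bit in place.

    Assumption: `stored` is itself error-free.
    """
    pos_xor, parity = ecc(sector)
    syndrome = pos_xor ^ stored[0]
    parity_changed = parity != stored[1]
    if syndrome == 0 and not parity_changed:
        return "ok"
    if parity_changed:
        # Odd number of flips: treat as a single error at bit `syndrome`.
        sector[syndrome // 8] ^= 1 << (syndrome % 8)
        return "corrected"
    # Parity unchanged but syndrome non-zero: two bits flipped.
    return "uncorrectable"

good = bytes(range(256)) * 2            # one 512-byte sector
code = ecc(good)
damaged = bytearray(good)
damaged[100] ^= 0x10                    # flip a single bit
print(check_and_correct(damaged, code), bytes(damaged) == good)
```

A second flipped bit in the same sector leaves the parity unchanged, so the routine can only report the sector as uncorrectable, which is the limitation that motivates the extended scheme in section 2.0.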
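
The redundancy percentages quoted in section 2.0 can be checked with a few lines of arithmetic. The 512-byte sector and 16 spare bytes come from the text; the figure of 64 parity bytes per sector is an assumption inferred here, because it is the value that reproduces both the 12.5% and the 9.4% figures.

```python
SECTOR_BYTES = 512    # data bytes per sector (from the text)
HAMMING_SPARE = 16    # spare bytes per sector for Hamming ECC (from the text)
CIRC_PARITY = 64      # assumption: 12.5% of 512 bytes

hamming_overhead = HAMMING_SPARE / SECTOR_BYTES
circ_overhead = CIRC_PARITY / SECTOR_BYTES
# Re-using the 16 spare bytes means only the remainder costs user space.
circ_net_overhead = (CIRC_PARITY - HAMMING_SPARE) / SECTOR_BYTES

print(f"Hamming only        : {hamming_overhead:.3%}")
print(f"CIRC (C2 only)      : {circ_overhead:.3%}")
print(f"CIRC using spare area: {circ_net_overhead:.3%}")
```

The exact values are 3.125%, 12.5% and 9.375%, which the text rounds to 3.1%, 12.5% and 9.4%.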
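
The effect of the additional interleaving described at the end of section 2.0 can be sketched with a generic block interleaver. This is not the CIRC interleaving structure itself: the codeword length, the depths and the helper names are hypothetical, and no Reed-Solomon decoding is performed. The sketch only counts how many symbols of any one codeword a contiguous burst of damage can touch.

```python
def interleave(codewords):
    """Write symbol j of every codeword before symbol j+1 of any codeword."""
    depth = len(codewords)
    length = len(codewords[0])
    return bytes(codewords[i][j] for j in range(length) for i in range(depth))

def deinterleave(stream, depth):
    """Recover the original codewords from the stored stream."""
    return [stream[i::depth] for i in range(depth)]

def worst_hits(burst_start, burst_len, depth):
    """Largest number of damaged symbols landing in any single codeword."""
    hits = [0] * depth
    for pos in range(burst_start, burst_start + burst_len):
        hits[pos % depth] += 1
    return max(hits)

# Round trip: 16 hypothetical codewords of 32 symbols each.
codewords = [bytes((i + j) % 256 for j in range(32)) for i in range(16)]
assert deinterleave(interleave(codewords), 16) == codewords

# A 128-byte burst of corruption in the stored stream:
print("depth 16:", worst_hits(0, 128, 16), "damaged symbols per codeword")
print("depth 64:", worst_hits(0, 128, 64), "damaged symbols per codeword")
```

Quadrupling the depth cuts the damage seen by each codeword to a quarter for the same burst, so the longest correctable burst grows in proportion to the depth, at the cost of a proportionally larger working buffer.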
