Web Site: www.ijettcs.org Email: editor@ijettcs.org, editorijettcs@gmail.com Volume 2, Issue 3, May June 2013 ISSN 2278-6856
Research Scholar, VLSI Systems Research Group, Electronics & Communication Engineering, KL University, Guntur, A.P, India
2
VLSI Systems Research Group Head, Electronics & Communication Engineering, KL University, Guntur, A.P, India
3
M.Tech Student, Electronics & Computer Engineering, KL University, Guntur, A.P, India
1.1 Basics of CAM We now take a more detailed look at CAM architecture. A small model is shown in Figure 1, which depicts a CAM consisting of 4 words, each containing 3 bits arranged horizontally (corresponding to 3 CAM cells). There is a match line corresponding to each word (ML0, ML1, etc.) feeding into match line sense amplifiers (MLSAs), and there is a differential search line pair corresponding to each bit of the search word (SL0, SL0', SL1, SL1', etc., where the prime denotes the complement). A CAM search operation begins by loading the search-data word into the search-data registers and then precharging all match lines high, putting them all temporarily in the match state. Next, the search line drivers broadcast the search word onto the differential search lines, and each CAM core cell compares its stored bit against the bit on its corresponding search lines. Match lines on which all bits match remain in the precharged-high state. Match lines that have at least one mismatching bit discharge to ground. Each MLSA then detects whether its match line is in the match condition or the miss condition. Finally, the encoder maps the match line of the matching location to its encoded address [1].
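The search sequence described above (precharge, compare, encode) can be sketched behaviourally. This is only an illustrative model under stated assumptions: the function name `cam_search`, the bit-string representation, and the choice to report the lowest matching address are hypothetical and are not taken from [1].

```python
from typing import List, Optional


def cam_search(stored_words: List[str], search_word: str) -> Optional[int]:
    """Behavioural model of one CAM search; returns an address or None."""
    # Precharge phase: every match line starts high (temporary match state).
    match_lines = [True] * len(stored_words)

    # Evaluation phase: a single differing bit discharges that word's match line.
    for address, word in enumerate(stored_words):
        for stored_bit, search_bit in zip(word, search_word):
            if stored_bit != search_bit:
                match_lines[address] = False
                break

    # Encoding phase: map the match line that stayed high to its address.
    # Assumption: the lowest matching address is reported.
    for address, still_high in enumerate(match_lines):
        if still_high:
            return address
    return None


# A 4-word by 3-bit array, the same size as the small model in Figure 1.
words = ["101", "010", "110", "001"]
hit = cam_search(words, "110")
miss = cam_search(words, "111")
print(hit, miss)
```

In hardware all words are compared in parallel; the nested loops here only reproduce the logical result, not the timing.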
Keywords: Content-addressable memory (CAM), match line sensing, review, search line power.
1. INTRODUCTION
Most memory devices store and retrieve data by addressing specific memory locations. This access path becomes the limiting factor for systems that depend on fast memory access. The time required to find data stored in memory can be reduced if the data can be identified by its content rather than by its address. A memory used for this purpose is a Content Addressable Memory (CAM). CAM is used in applications where search time is critical and must be very short. It is well suited to functions such as Ethernet address lookup, data compression, and security or encryption processing on a packet-by-packet basis in high performance data switches. It can also be operated as a data parallel or Single Instruction/Multiple Data (SIMD) processor. Since CAM is an extension of RAM, the features of RAM must be understood first. In general, RAM has two operations, read and write, whereas CAM has three: read, write, and compare [1]. The compare operation makes CAM useful in a variety of applications such as network routers. A network router forwards incoming packets from the sender port to the proper destination port by looking up its routing table. CAMs are used in network routers for fast forwarding of packets.
Figure 1 Simple schematic of a CAM
CAM core cells and match line structures are discussed in Sections 2 and 3. Match line sensing schemes and search line driving approaches are reviewed in Sections 4 and 5. The conclusion is given at the end.
2. CORE CELLS
A CAM can be implemented using two kinds of core cell: the NOR cell and the NAND cell.
2.1 NOR Cell Figure 2 shows a NOR type CAM cell. The NOR cell implements the comparison between the complementary stored bit, D (and D'), and the complementary search data on the complementary search line, SL (and SL'), using four comparison transistors, M1 through M4, which are all typically minimum-size to maintain high cell density. These transistors implement the pull down path of a dynamic XNOR logic gate with inputs SL and D. Each pair of transistors, M1/M3 and M2/M4, forms a pull down path from the match line, ML, such that a mismatch of SL and D activates at least one of the pull down paths, connecting ML to ground. A match of SL and D disables both pull down paths, disconnecting ML from ground. The NOR nature of this cell becomes clear when multiple cells are connected in parallel to form a CAM word by shorting the ML of each cell to the ML of adjacent cells. The pull down paths connect in parallel, resembling the pull down path of a CMOS NOR logic gate. There is a match condition on a given ML only if every individual cell in the word has a match [1].
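The NOR behaviour can be captured in a few lines of logic. The sketch below is a hedged illustration: the function names are hypothetical, and the assignment of signals to each series transistor pair is an assumption made for the example rather than a transcription of Figure 2.

```python
from typing import Sequence


def nor_cell_pulls_down(d: int, sl: int) -> bool:
    """True when the cell connects ML to ground, i.e. on a bit mismatch."""
    d_bar, sl_bar = 1 - d, 1 - sl
    # Assumed pairing: one series pair conducts when D = 1 and SL' = 1,
    # the other when D' = 1 and SL = 1. Both pairs are off on a match.
    path_a = d == 1 and sl_bar == 1
    path_b = d_bar == 1 and sl == 1
    return path_a or path_b


def nor_word_matches(stored: Sequence[int], search: Sequence[int]) -> bool:
    """ML stays precharged only if no cell in the word opens a pull down path."""
    return not any(nor_cell_pulls_down(d, s) for d, s in zip(stored, search))


print(nor_word_matches([1, 0, 1], [1, 0, 1]))
print(nor_word_matches([1, 0, 1], [1, 1, 1]))
```

Because the cells share one match line, the word-level result is the NOR of the per-cell mismatch signals, which is why a single mismatching bit is enough to discharge ML.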
2.2 NAND Cell Figure 3 shows a NAND type CAM cell. The NAND cell implements the comparison between the stored bit, D, and the corresponding search data on the corresponding search lines (SL, SL'), using the three comparison transistors M1, MD, and MD', which are all typically minimum-size to maintain high cell density. We illustrate the bit-comparison operation of a NAND cell through an example. Consider the case of a match when SL = 1 and D = 1. Pass transistor MD is ON and passes the logic 1 on SL to node B. Node B is the bit-match node, which is logic 1 if there is a match in the cell. The logic 1 on node B turns ON transistor M1. Note that M1 is also turned ON in the other match case, when SL = 0 and D = 0. In this case, the transistor MD' passes logic high to raise node B. The remaining cases, where SL ≠ D, result in a miss condition, and accordingly node B is logic 0 and the transistor M1 is OFF. Node B is a pass-transistor implementation of the XNOR function. The NAND nature of this cell becomes clear when multiple NAND cells are serially connected. In this case, the MLn and
Figure 4 Structure of a NOR match line with n cells.
3.2 NAND Match Line Figure 5 shows the NAND match line. A number of cells, n, are cascaded to form the NAND match line. The precharge pMOS transistor, Mpre, sets the initial voltage of the match line, ML, to the supply voltage, VDD. Next, the evaluation nMOS transistor, Meval, turns ON. In the case of a match, all nMOS transistors, M1 through Mn, are ON, effectively creating a path to ground from the ML node and hence discharging ML to ground. In the case of a miss, at least one of the series nMOS transistors, M1 through Mn, is OFF, leaving the ML voltage high. A sense amplifier, MLSA, detects the difference between the match (low) voltage and the miss (high) voltage. The NAND match line has an explicit evaluation transistor, Meval, unlike the NOR match line, where the CAM cells themselves perform the evaluation. There is a potential charge-sharing problem in the NAND match line. Charge sharing can occur between the ML node and the intermediate MLi nodes. This charge sharing may cause the ML node voltage to drop sufficiently low that the MLSA detects a false match. A technique that eliminates charge sharing is to precharge high, in addition to ML, the intermediate match nodes ML1 through MLn-1. This procedure eliminates charge sharing, since the intermediate match nodes and the ML node are initially shorted. However, there is an increase in the power consumption due to the search line precharge. A feature of the NAND match line is that a miss stops signal propagation, such that no power is consumed past the final matching transistor in the serial nMOS chain. Typically, only one match line is in the match state; consequently, most match lines have only a small number of transistors in the chain that are ON, and thus only a small amount of power is consumed. Two drawbacks of the NAND match line are a quadratic delay dependence on the number of cells and a low noise margin.
The quadratic delay dependence comes from the fact that adding a NAND cell to a NAND match line adds both a series resistance, due to the series nMOS transistor, and a capacitance to ground, due to the nMOS diffusion capacitance. These elements form an RC ladder structure whose overall time constant has a quadratic dependence on the number of NAND cells. The low noise margin is caused by the use of nMOS pass transistors for the comparison circuitry. NOR cells avoid this problem by applying maximum gate voltage to all CAM cell transistors when conducting [1].
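The quadratic dependence can be checked with a first-order Elmore estimate of a uniform RC ladder. This is a minimal sketch under simplifying assumptions that are not from the source: every series transistor is modelled as the same resistance `r_on`, every intermediate node as the same capacitance `c_node`, and the function name is hypothetical.

```python
def nand_chain_elmore_delay(n_cells: int, r_on: float = 1.0, c_node: float = 1.0) -> float:
    """Elmore time constant of a uniform RC ladder with n_cells sections."""
    # The capacitance at the k-th node from the grounded end discharges
    # through k series resistances, so it contributes k * R * C.
    return sum(k * r_on * c_node for k in range(1, n_cells + 1))


for n in (8, 16, 32):
    # The sum equals R * C * n * (n + 1) / 2, so doubling n roughly
    # quadruples the delay.
    print(n, nand_chain_elmore_delay(n))
```

With unit resistance and capacitance the estimate reduces to n(n+1)/2, which grows as n squared for long chains.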
Figure 6 Segmented Match Line Architecture.
4.1.1 Pre-Charging In the traditional NOR match-line structure, the entire match line is pre-charged during every search operation whenever a word under comparison does not match the comparison signal. The segmented match line architecture (SMA), however, segments the total match line capacitance and pre-charges only a subset of the match line segments. As a result, SMA reduces the power consumption in match lines by reducing the total capacitance seen by the power source. SMA is similar to the conditional pre-charging methods in that only a subset of the match line is pre-charged in the first phase. The conditional pre-charging scheme, however, needs to pre-charge the remaining match lines depending on the first-stage results. SMA instead performs charge sharing in place of the second pre-charging, and thus does not draw further current from the power source. The pre-charging time is also reduced because of the smaller RC time constant. The total charging time is further reduced because the two charged segments, 1 and 4, are charged at the same time [6].
4.1.2 Charge Sharing The charge stored in the nth charged segment originates from either pre-charging or charge sharing. The charge size is referred to as Qn, where n is the segment number. Q1 and Q4 reach their maximum values when the respective match-line segments, 1 and 4, are pre-charged. The charge in the charged segments is then shared with other match line partitions when CSC is enabled. The static match voltage at the left segments, Vlf, is established after charge sharing. The voltage is determined by the charge conservation rule as shown in (1). The capacitance of the nth segment is represented as Cn.
(1)
The voltage at the right segments, Vrf, can be similarly calculated. The two voltages (Vlf, Vrf) do not have to be the same. The static match voltages are typically not the rail voltage. It is, therefore, important for the static voltages to meet the minimum sensing voltage of the match sensor block.
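The charge conservation rule itself is easy to illustrate numerically. The sketch below is not a reconstruction of (1); it only applies the general rule that connected capacitors settle to total charge divided by total capacitance. The function name, the two-segment grouping, and the capacitance values are assumptions chosen for the example.

```python
from typing import Sequence


def shared_voltage(charges: Sequence[float], capacitances: Sequence[float]) -> float:
    """Voltage after the listed segments are shorted together."""
    # Charge is conserved, so V = (sum of Qn) / (sum of Cn).
    return sum(charges) / sum(capacitances)


# Hypothetical values: capacitances in fF, voltages in V, charges in fC.
vdd = 1.0
c1, c2 = 10.0, 30.0      # assumed pre-charged segment and its uncharged neighbour
q1, q2 = c1 * vdd, 0.0   # segment 1 pre-charged to VDD, segment 2 empty
v_static = shared_voltage([q1, q2], [c1, c2])
print(v_static)
```

The example shows why the static voltage is usually below the rail: the charge of the pre-charged segment is spread over the larger combined capacitance.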
NI(1) and NI(4) determine the minimum static voltage and should be carefully selected.
4.1.3 Evaluation After the pre-charging operation, the partial evaluation results for the four segments are merged to determine the final match result. The process is called merging segmented match lines (MSMs). The MSM phase can be broken down into three sub-operations. In the first operation, the charge is shared through CSC. The first
Figure 7 Low-Swing Scheme in [5].
The tank capacitor is pre-charged to VDD and then shared with a match line; the size of the tank capacitor is chosen to create a low-voltage swing on each match line. The technique requires an additional tank capacitor. SMA achieves the same effect by pre-charging the pre-charged segments, without additional capacitors. SMA is not limited to a single voltage swing and has the flexibility to create an arbitrary voltage at each match line by choosing the number of NOR circuits in the pre-charged segments, without externally generated reference voltages. The case in which Ctank is selected to give a match line voltage swing of VDD/2 will be referred to as low-swing VDD/2 (LS-VDD/2). Once the static voltages at each match line are sensed, the charge stored in all segments is re-cycled for subsequent search operations if the word comparison result is a match. The re-cycled charge is then accumulated with the shared charge from pre-charging. The charge shared voltages approach the rail voltage, VDD, when a word continuously produces the match result. The charge shared voltage has minimum and maximum values, which are referred to as Vm-min and Vm-max, respectively. The voltage boundary is formulated as
(2)
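For the low-swing scheme of Figure 7, the effect of the tank capacitor size can be sketched with the same charge-sharing arithmetic. This is an illustrative model, not the circuit of [5]: it assumes the match line starts fully discharged, ignores parasitics, and uses hypothetical function names and capacitance values.

```python
def match_line_swing(vdd: float, c_tank: float, c_ml: float) -> float:
    """Match line voltage after a tank charged to vdd is shared with it."""
    # Assumes the match line holds no charge before sharing.
    return vdd * c_tank / (c_tank + c_ml)


def tank_for_swing(target_fraction: float, c_ml: float) -> float:
    """Tank capacitance that yields a swing of target_fraction * VDD."""
    if not 0.0 < target_fraction < 1.0:
        raise ValueError("target_fraction must lie strictly between 0 and 1")
    # Solving f = Ct / (Ct + Cml) for Ct gives Ct = f * Cml / (1 - f).
    return target_fraction * c_ml / (1.0 - target_fraction)


c_ml = 20.0                         # hypothetical match line capacitance in fF
c_tank = tank_for_swing(0.5, c_ml)  # the LS-VDD/2 case
print(c_tank, match_line_swing(1.0, c_tank, c_ml))
```

Under these assumptions a swing of VDD/2 requires a tank capacitor equal to the match line capacitance, which illustrates the area cost that SMA avoids by reusing match line segments.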
4.2.1.1 Mismatch Before the searching process, SP = 0 and SEARCH = SEARCH_EN, which is pulled high at the beginning of the searching process. Then, MN1 is turned on to charge ML(i), such that KP is discharged but not pulled all the way down to 0. If any CAM cell mismatches, MSi is turned on to make a current path between ML(i) and SML(i), such that SML(i) is charged by ML(i). When the voltage of SML(i) is high enough to turn off MP3, the voltage of KP is pulled down such that MATCHB equals logic 1, indicating that the comparison result of the word is a mismatch. Through two feedback paths, MATCHB turns MN3 on and MP1 off, respectively, such that the current path of MP1 is shut off to choke the charge current of ML(i), and SP is discharged via MN3 to turn off MN1. The former constitutes a positive loop from MATCHB to KP through MN3 and MN2, which pulls down KP more quickly. Therefore, the power consumption is reduced after the searching process [7].
4.2.1.2 Match If all of the CAM cells match, ML(i) and SML(i) are isolated without any current path. The voltage difference between ML(i) and SML(i) creates an output current of the differential pair (MP2 and MP3) that charges KP and SP. As soon as KP is charged high, MATCHB becomes logic 0, indicating that the comparison is a match. After SP is raised high, SEARCH becomes logic 0 and turns off MN1 to choke the charge current to ML(i) [7].
4.2 Self Disabled Sensing Technique The self-disabled sensing technique chokes the charge current fed into the ML right after the matching comparison result is generated. Figure 8 shows the CAM architecture, where block C and block DMLSA denote the CAM cell and the differential MLSA (DMLSA), respectively. The prototype CAM is 128 words × 32 bits. The Search Word Register loads the search key and feeds it into all the CAM cells. Each DMLSA charges its ML and senses the voltage variation to generate the match signal, which is sent to the Address Encoder. In general, either exactly one word or no word matches the search key, so the Address Encoder generates the corresponding address code or a no-match signal after the searching process [7].
Figure 8 Architecture of the Self disabled sensing CAM.
4.2.1 DMLSA Figure 9 shows the DMLSA schematic diagram. The DMLSA senses the voltage on the ML(i) and SML(i) to
4.3
4.3.1 Parity Bit Based CAM The parity bit based CAM design is shown in Figure 10. It consists of the original data segment and an extra one-bit segment derived from the actual data bits. Only the parity bit is obtained, i.e., whether the number of 1s is odd or even. The
Figure 10 Parity bit based CAM.
4.3.2 Gated Power Match Line Sense Amplifier The CAM architecture is shown in Figure 11. The CAM cells are organized into rows (words) and columns (bits). Each cell has the same number of transistors as the conventional P-type NOR CAM cell and uses a similar ML structure. However, the COMPARISON unit, i.e., transistors M1-M4, and the SRAM unit, i.e., the cross-coupled inverters, are powered by two separate metal rails, namely VDDML and VDD, respectively. VDDML is independently controlled by a power transistor (Px) and a feedback loop that can automatically turn off the ML current to save power. The purpose of having two separate power rails (VDD and VDDML) is to completely isolate the SRAM cell from any possibility of power disturbances during the COMPARE cycle. The gated-power transistor, Px, is controlled by a feedback loop, denoted as Power Control, which automatically turns off Px once the voltage on the ML reaches a certain threshold. At the beginning of each cycle, the ML is first initialized by a global control signal
Figure 11 (a) CAM Architecture. (b) Each cell powered by two different rails.
(6)
6. CONCLUSION
In this paper, CAM, its applications, and its basic operation are introduced. The two main CAM cells, the NOR cell and the NAND cell, and their operation are discussed. The discussion is extended to the match lines built from these cells, and in particular to the match line sensing techniques and search line driving approaches that are used to reduce the power consumption of CAM. In the future, many further techniques can be applied to the design of low power CAMs.
References
[1] Kostas Pagiamtzis and Ali Sheikholeslami, "Content-Addressable Memory (CAM) Circuits and Architectures: A Tutorial and Survey," IEEE Journal of Solid-State Circuits, Vol. 41, No. 3, March 2006.
[2] C. Zukowski and S. Wang, "Use of Selective Precharge for Low-Power Content-Addressable Memories," in Proc. IEEE Int. Symp. Circuits Syst., Jun. 9-12, 1997, pp. 1788-1791.
[3] C. Lin, J. Chang, and B. Liu, "A Low-Power Precomputation-Based Fully Parallel Content-Addressable Memory," IEEE J. Solid-State Circuits, Vol. 38, No. 4, pp. 654-662, Apr. 2003.
[4] Sanghyeon Baeg, "Low-Power Ternary Content-Addressable Memory Design Using a Segmented Match Line," IEEE Transactions on Circuits and Systems I: Regular Papers, Vol. 55, No. 6, July 2008.
[5] M. Khellah and M. Elmasry, "Use of Charge Sharing to Reduce Energy Consumption in Wide Fan-In Gates," in Proc. IEEE Int. Symp. Circuits Syst., 1998, pp. 9-12.
[6] G. Kasai, Y. Takarabe, K. Furumi, and M. Yoneda, 200-MHz/200-MSPS 3.2W at 1.5V Vdd, 9.4Mbits