
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: REGULAR PAPERS, VOL. 51, NO. 5, MAY 2004
A Ranked Order Filter Implementation for Parallel Analog Processing

Jonne Poikonen, Student Member, IEEE, and Ari Paasio, Member, IEEE
Abstract: Order statistic filtering, the generalization of which is ranked order filtering, is needed for many image-processing functions, including median filtering and mathematical morphology. Combining order statistic functionality with the parallel operation and local connectivity of array processing approaches, such as the cellular nonlinear network model, has the potential for very high performance in image processing. This paper examines the implementation of programmable ranked order extraction with a very compact hardware realization of an analog current-mode ranked order filter. The considerable savings in the required circuit area, compared to other circuits, make it possible to use the structure as a building block in a massively parallel signal processing array. The operation of the circuit is analyzed in detail with the help of simulations, and measurement results obtained from a test chip manufactured in a 0.18-µm standard digital CMOS technology are also presented. The simulations and measurement results verify the correct operation of the circuit and show that it is well suited for inclusion in every cell of a large parallel processor array. This makes many grayscale processing functions available with truly parallel operation and therefore very high performance.
Index Terms: Analog array processing, order statistic filtering, ranked order filter.
I. INTRODUCTION
Order statistic filters are nonlinear filters which select an output from a set of inputs according to the order statistics of the inputs. The generalized case of order statistic filtering is ranked order filtering, which means extracting, from a set of inputs, the one with the selected rank. The most important special cases of order statistic filtering are the extraction of the minimum, maximum, and median of a given input set. These functions can be used for a variety of grayscale image processing functions such as median filtering, weighted median filtering [1], and image processing techniques based on mathematical morphology, which require local minimum and maximum operations and, in the case of soft morphology, also more general order statistics [2]. Implementing order statistic functionality in an analog parallel array processing system with local connectivity, such as those based on the cellular nonlinear network (CNN) model [3], can result in very high efficiency in grayscale
Manuscript received July 29, 2003; revised January 10, 2004. This paper was recommended by Guest Editor Á. Zarándy. J. Poikonen is with the Department of Information Technology, Laboratory of Electronics and Information Technology, Department of Applied Physics, University of Turku, FIN-20014 Turku, Finland, and also with the Turku Centre for Computer Science (TUCS) Graduate School, FIN-20014 Turku, Finland (e-mail: jokapo@utu.fi). A. Paasio is with the Department of Information Technology, University of Turku, FIN-20014 Turku, Finland (e-mail: arjupa@utu.fi). Digital Object Identifier 10.1109/TCSI.2004.827620
image processing, since it combines very effective nonlinear processing with a simultaneous application of the function to each pixel in an image. It has been shown that order statistic operations, ranked order filtering in the most general case, can be implemented with the CNN model by using difference-controlled nonlinear templates [4], [5]. These template operations are, however, difficult to realize in practice. In fact, to the current knowledge of the authors, there are no CNN hardware implementations which can directly process the necessary nonlinear functions in parallel for the entire array. A CNN-based implementation of median filtering was presented in [6]; however, the circuit processes the image in a clocked row-by-row manner. Algorithmic methods can be used to sort inputs according to their ranks in many successive processing steps; however, if the signal values have to be ordered through recursive processing, the performance advantage of the locally connected array architecture can be lost. New hardware architectures are needed to take full advantage of the theoretically available performance. The problem with adding any new hardware components to a processing core of a large parallel array is the increased circuit area and power consumption. It is therefore crucial to optimize any new circuit implementation in terms of complexity and area while still preserving adequate accuracy of processing. This paper examines an analog circuit for ranked order extraction, presented first in [7], which requires only a small number of transistors per input signal. Since the transistors can also be fairly small in size, the circuit is small enough to be realistically included in each processing cell of an analog parallel processing array without severely restricting the possible number of cells. This design is part of an approach [8] where some of the universality of the original CNN model is traded off for higher performance in the most important functionalities.
In this approach, several dedicated processing cores are included in the processor cells, each realizing different linear or nonlinear functions for binary or grayscale inputs. This paper first takes a general look at the principles of ranked order filtering in Section II, and then examines some of the previous implementations of analog ranked order filters in Section III. In Section IV, the proposed ranked order filter circuit and its operation are analyzed, along with accuracy and performance issues concerning the implementation. Circuit simulations are shown in Section V, and Section VI presents measurement results from a test chip manufactured in a 0.18-µm digital CMOS technology.
1057-7122/04$20.00 © 2004 IEEE
II. RANKED ORDER FILTERING
Ranked order filtering is the generalization of order statistic filtering, which is based on sorting the magnitudes of input signals and selecting the input which has the correct order statistic, depending on the desired function. A ranked order filter extracts from $N$ inputs the one with the selected rank $k$, i.e., the $k$th largest. An alternative notation, which is also sometimes used, is to denote the smallest input with the rank 1. The most important special cases of ranked order extraction are the selection of the maximum ($k = 1$), minimum ($k = N$), or the median ($k = (N+1)/2$) from a given set of input signals. These functions can all be performed with any implementation where the rank to be extracted can be programmed freely. The signal with the desired rank, i.e., the $k$th largest element among $N$ inputs, can be found by solving the equation

$$\sum_{i=1}^{N} f(x_i - y) = b \tag{1}$$

where $x_i$ are the inputs, $y$ is the output, and the function $f$ is defined as

$$f(x) = \begin{cases} 1, & x > 0 \\ -1, & x < 0 \end{cases} \tag{2}$$

and $b$ is a bias term which has to be between $2k - N - 2$ and $2k - N$ for rank $k$ [9]. The bipolar function of (2) can also be replaced with a unipolar unit step function. As will be discussed later, this leads to fewer devices in the circuit realization of the ranked order filter. In the unipolar case, the rank equation to be solved can be expressed as

$$\sum_{i=1}^{N} u(x_i - y) = b \tag{3}$$

where the function $u$ is now a step function given by

$$u(x) = \begin{cases} 1, & x > 0 \\ 0, & x < 0 \end{cases} \tag{4}$$

and $b$ also has a different range, $k - 1 < b < k$, which is not dependent on the number of inputs, since there are no negative terms in the summation of (3). The inputs of the ranked order filter can also be assigned different weights. This makes it possible to implement, for example, weighted median filtering, which is a generalization of standard median filters. The selection of correct weights for different inputs allows the filtering behavior to be better controlled [1]. The generalized weighted ranked order operation for the unipolar case can be derived from (3), for the $k$th order statistic, as

$$\sum_{i=1}^{N} w_i \, u(x_i - y) = b \tag{5}$$

where $w_i$ is the weight assigned to input $x_i$ [9].
III. PREVIOUS FILTER IMPLEMENTATIONS
Several circuits for finding the maximum, minimum, and median have been presented in the literature, e.g., [10]–[13].
In a cellular processing architecture, there is, however, not enough silicon area available to implement different order statistic functions with separate hardware components. The ability to extract
Fig. 1. Implementing (1) with transconductance amplifiers.
any desired rank is therefore necessary, in order to widen the range of possible operations which can be performed with the same circuitry, and to increase the efficiency of the architecture. Previous realizations of analog ranked order filters are given, e.g., in [9], [14]–[18]. In some circuits, the operation is not truly parallel [16], or multistage sorting networks are used [14], which results in large and complex hardware for a larger number of inputs. Another method, illustrated in Fig. 1, is based on the use of transconductance amplifiers [9], [17], which can directly implement the nonlinearity of (2). The rank to be extracted is selected with a bias source. This approach is, however, not the optimal solution for large processor arrays. The transistors in a high-gain transconductance amplifier have to be rather large, and dedicating an amplifier to each input takes up a large circuit area. The amplifiers also consume static power, which is not desirable in massively parallel arrays that are also targeted at low-power applications. So-called corner errors can also cause problems in this approach. Corner errors are created when two or more inputs are very close to each other and to the value to be extracted. If the gain of the amplifier is not high enough, inputs that are close to each other can cause the bias current in Fig. 1 to divide between several branches, not only between the desired branch and the output, as would be the ideal case. Gain and accuracy can be increased by using cascaded subthreshold amplifiers [9] or multistage amplifiers [17]; however, this still leaves the problem of large circuit area unsolved and will probably even make it worse. An alternative to the approach described above, which operates on voltage signals, is to use current-mode inputs, which also makes the realization more compatible with the targeted parallel processor architecture [8].
An accurate and more hardware-efficient ranked order filter, compared to the implementations using transconductance amplifiers, was presented in [15]. In this circuit, current comparators are used, one for each input variable, to achieve the required high gain and to form the difference between the output and the input values. The bidirectional output currents of the comparators are summed together to implement the left-hand side of (1), and the sum is added to a bias current, which implements the right-hand side of the equation and is used to select the extracted rank. Feedback is used to reach a balance between input and output according to (1). This circuit is very modular; to extract the rank from N input values, it requires a fixed number of transistors per input plus three additional transistors common to the whole circuit. The three additional transistors consist of an output transistor and of two transistors required to create a bidirectional adjustable current source for supplying the bias values. The total number of transistors is still rather large for the circuit to be included in every cell of a large array. It also has to be noted that the current comparators consume a considerable amount of static power.
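The balance that these comparator-based circuits reach can be illustrated with a small behavioral model. The Python sketch below is not taken from [9], [15], or [17]: it assumes a `tanh` characteristic with a hypothetical `gain` parameter for each comparator or amplifier, sums the bipolar outputs as on the left-hand side of (1), and finds the output by bisection instead of analog feedback. The bias is assumed to sit midway between the two values at which the selected rank would change. With high gain the solution lands on the selected input; with low gain and two closely spaced inputs it drifts away from it, which corresponds to the corner error discussed above.

```python
import math


def solve_rank_equation(inputs, k, gain):
    """Return y such that sum(tanh(gain * (x - y))) equals the rank bias.

    Assumptions (not from the paper): tanh models a finite-gain bipolar
    comparator, and the bias is 2*k - N - 1 for the k-th largest of N inputs.
    """
    n = len(inputs)
    bias = 2 * k - n - 1

    def residual(y):
        # Summed comparator outputs minus the bias; decreases as y grows.
        return sum(math.tanh(gain * (x - y)) for x in inputs) - bias

    lo, hi = min(inputs) - 1.0, max(inputs) + 1.0
    for _ in range(100):  # bisection stands in for the analog feedback loop
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    close_inputs = [2.99, 3.0, 1.0]  # two inputs only 0.01 apart
    for gain in (1e4, 10.0):
        y = solve_rank_equation(close_inputs, k=1, gain=gain)
        print(f"gain={gain:g}: output={y:.4f}, error={y - max(close_inputs):+.4f}")
```

The gain values are arbitrary and only chosen so that the comparator transition is either much narrower or much wider than the spacing of the two largest inputs.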
Fig. 2. Three-input version of the proposed ranked order filter circuit.
A voltage-mode ranked order filter which uses only two transistors for each input was presented in [18]. The circuit as such, however, operates poorly: the simulated performance is inaccurate, with large corner errors. A solution to improve performance was proposed, where one of the two transistors used for each of the inputs of the ranked order filter is replaced with a supertransistor consisting of a cascade connection of three transistors and two additional bias current sources for each input. This improves the performance of the circuit; however, it also results in a higher transistor count and a large increase in current consumption, since each input is now biased with two additional currents, one of which had a fairly large value of 50 µA [18].
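Since the circuit proposed in the next section builds on the unipolar form of the rank equation, a behavioral reference for (3) and its weighted form (5) may be useful at this point. The Python sketch below is an illustration only; the function names are hypothetical, and it assumes the convention of Section II in which rank 1 is the largest input and an integer weight counts how many times an input is replicated. For each candidate output it sums the unit steps of the inputs strictly above it and checks whether the requested rank falls inside the range of ranks that the candidate covers.

```python
def step_sum(inputs, weights, y):
    """Weighted count of inputs strictly larger than y (unipolar unit steps)."""
    return sum(w for x, w in zip(inputs, weights) if x > y)


def ranked_order(inputs, k, weights=None):
    """Return the input holding rank k, where rank 1 is the largest input.

    Integer weights act as replication counts, so valid ranks run from 1 to
    sum(weights). Names and structure are illustrative, not from the paper.
    """
    if weights is None:
        weights = [1] * len(inputs)
    if not 1 <= k <= sum(weights):
        raise ValueError("rank out of range")
    for x in inputs:
        above = step_sum(inputs, weights, x)
        # x covers ranks above+1 .. above+(total weight of inputs equal to x).
        tied = sum(wj for xj, wj in zip(inputs, weights) if xj == x)
        if above < k <= above + tied:
            return x
    raise RuntimeError("unreachable for a valid rank")


if __name__ == "__main__":
    values = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(ranked_order(values, 3))                   # unweighted median
    print(ranked_order(values, 4, [3, 1, 1, 1, 1]))  # median with weight 3 on the first input
```

Treating a weight as a replication count means that assigning weight 3 to one of five inputs gives seven effective inputs, so the median moves from rank 3 to rank 4.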
IV. PROPOSED RANKED ORDER FILTER
A. Circuit Setup
A very compact circuit for ranked order extraction, first presented in [7], is shown in Fig. 2. In an array processor architecture with 1-neighborhood connectivity, the actual ranked order filter in each cell has nine inputs, one for the value of the cell itself and eight for the values of the closest neighboring cells. For the sake of clarity, Fig. 2 shows a three-input version of the circuit; however, the principle of operation is the same regardless of the number of inputs. In the schematic simulations performed for the ranked order filter circuit, a full nine-input version was used. The proposed ranked order circuit uses only a small number of transistors to extract a signal with the desired rank from its inputs. The coefficient current source, switch, and feedback transistors are required for each input, while the bias transistor and the output transistor are common to all inputs. The input transistors, which are also shown in Fig. 2, are not exclusive to the ranked order filter; they are also used to provide neighborhood connectivity for other
circuitry in the processor cell. In terms of reduced complexity, this circuit is a clear improvement over the other compact realizations of ranked order filtering discussed above [15], [18], which required more transistors. Because the ranked order filter is used in a current-mode processing system [8], the closest target of comparison for the circuit is that of [15]. The number of devices in the proposed ranked order circuit is reduced as a result of two improvements over other implementations. First, in this circuit, the current comparators, which use six transistors per input in [15], are replaced with a much simpler structure. The second modification is to use only unipolar current instead of the bipolar current in the current comparator output, as in [15]. This saves an additional two transistors per input variable. One extra transistor can also be removed from the bias section because only one-directional bias current has to be provided. The transistors can also be shared by the binary and grayscale processing cores present in each cell [8], which leads to a more efficient use of the available circuit area. The absence of an inverter-based current comparator also yields a considerable improvement in the power consumption of the circuit. Compared to [15], this circuit also has the advantage that weighted ranked order filtering can be very easily implemented.
B. Circuit Operation
The analysis of the operation of the ranked order filter circuit in Fig. 2 can begin with a look at the functions of the different devices in the circuit. The circuit has the same kind of feedback structure as the winner-take-all circuit of [10]; however, the addition of controlled coefficient and bias current sources allows for fully programmable selection of any desired rank to be extracted. The circuit of Fig. 2 also directly extracts the input current with the correct rank, instead of an output voltage logarithmically related to the input current [10].
A basic building block for the ranked order circuit is the combination of a coefficient current source and a switch transistor. The output of each block is determined by the difference between the output current of the ranked order circuit and the input related to that block. This block basically has three modes of operation, depending on whether the input in question has an order statistic rank which is higher than, equal to, or lower than that to be extracted. In the first operation mode, when the input corresponding to a given switch transistor is larger than the output value of the ranked order filter, the switch conducts all of the current the coefficient current source is able to deliver, which is controlled by voltage Vcoeff. This is called a unity current in the rest of the text. In the second mode, when the input has the rank that is to be extracted, the switch restricts the current from the source, allowing only some part of the unity current through. In the third operation mode, when the input is smaller than the output of the ranked order filter, the switch transistor is not conducting at all and therefore the output current of such a block is virtually zero. This way, the blocks can implement the unipolar step function of (4). The summing of the currents of all of the switch/source blocks at the common node Vcont implements the left-hand side of (3). The right-hand side is then simply implemented by the bias current provided by the bias transistor, and the current values for selecting the desired rank can be directly derived as stated in Section II for the unipolar case

$$(k - 1)\, I_{\mathrm{unit}} < I_{\mathrm{bias}} < k\, I_{\mathrm{unit}} \tag{6}$$

where $k$ is the rank to be extracted and $I_{\mathrm{unit}}$ is the unrestricted unity current. A second important part of the circuit is the feedback transistor, which has the voltage Vcont as input and the feedback current as output. One feedback transistor and one switch/source block is required for each input signal.
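The rank-selection rule can be made concrete with a short numerical sketch. It assumes, consistently with the bias values quoted for the simulations in Sections IV-C and V, that rank k (with rank 1 the largest input) requires a bias between (k − 1) and k times the unity current, and it picks the middle of that window. The helper names and the midpoint choice are illustrative assumptions rather than part of the circuit description, and the switch/source block is reduced to an ideal unit step.

```python
def bias_window(k, unity_current):
    """Bias range (low, high) assumed to select rank k; rank 1 is the maximum."""
    if k < 1:
        raise ValueError("rank must be at least 1")
    return (k - 1) * unity_current, k * unity_current


def bias_for_rank(k, unity_current):
    """Midpoint of the window, leaving equal margin to both neighboring ranks."""
    low, high = bias_window(k, unity_current)
    return 0.5 * (low + high)


def block_current(input_current, output_current, unity_current):
    """Idealized switch/source block: a unit step on the input-output difference.

    The partially conducting mode (input equal to output) is not modeled here.
    """
    return unity_current if input_current > output_current else 0.0


if __name__ == "__main__":
    unity = 500e-9  # 500 nA, the unity current value quoted for the simulations
    for name, k in (("maximum", 1), ("median", 5), ("second smallest", 8), ("minimum", 9)):
        print(f"{name}: rank {k}, bias {bias_for_rank(k, unity) * 1e6:.2f} uA")
```

With a 500-nA unity current the midpoints are 0.25, 2.25, 3.75, and 4.25 µA for ranks 1, 5, 8, and 9 of nine inputs, which coincide with the bias values used for the maximum, median, second smallest, and minimum extractions in Section V.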
The output current of the circuit is available at the drain of the output transistor, which is connected to a diode-connected load transistor in Fig. 2. When the bias current is selected according to (6), (3) is automatically realized by the circuit. If, for example, the largest of the input currents is to be extracted, the bias current can be set to half of the unity current, to guarantee robust operation. Because the bias current must equal the sum of the coefficient currents, the voltage Vcont at the gate nodes of the feedback transistors automatically settles to a value that assures this condition. The feedback transistors try to sink the same amount of current, since they have the same gate voltage, but because the input currents are not equal, the control voltages at the inputs settle into different levels. In this equilibrium, some of the switch transistors conduct the whole unity current, some do not conduct at all, and ideally, one of the switches limits the current in such a way that the sum of the coefficient currents equals the bias current. In the branch where this restricted conduction takes place, the feedback current equals the input current. In the example of the maximum extraction, the output currents from the other switch circuits equal zero, and the control voltage of the largest input sets the current through its switch transistor to equal the bias current. When the circuit is in the equilibrium state, the current sunk by the corresponding feedback transistor equals the largest input current, and since the other input currents are smaller, the control voltages of the other branches settle to levels at which their switch transistors do not conduct. If, in the equilibrium, the control voltage were to become larger, Vcont would get higher and so would the current, which in turn would lower Vcont. The same also applies for a case where the control voltage would become lower. Now, because the output transistor has the same size and the same control voltage as the feedback transistors, the output current equals the feedback current of the selected branch, i.e., the largest input current in this example, and in this way the desired operation of extracting the input with the correct rank is achieved. Weighted ranked order extraction, which allows, e.g., weighted median filtering to be included in the available nonlinear functions, can be very easily implemented with the proposed ranked order filter. The assigning of different weights to different inputs can be understood, as can be seen from (5), as the number of replicated input samples to be included in the set of inputs. In the proposed architecture, the weighting can be achieved by multiplying the value of the unity current for the corresponding input. The basic assumption for nonweighted filtering was that all of the switch/source circuits, when conducting fully, provided the same unity current. If, for example, a weight of three is to be assigned to the first input of the ranked order filter, the control voltage or the aspect ratio of the corresponding coefficient transistor would be set in such a manner that the magnitude of the coefficient current would be three times the unity current in the unrestricted case. This also means that now, in terms of (3), there are effectively $N + 2$ input signals, where $N$ is the number of inputs in the unweighted case. This also changes the correct rank for the median, which is now $(N + 3)/2$. This, of course, affects the selection of the bias current.
C. Accuracy and Performance Considerations
There are different error mechanisms and performance issues that should be considered for the ranked order filter circuit.
These include corner errors, the robustness of the rank selection, and the errors caused by device mismatch between the different inputs and the output of the ranked order filter. The different aspects of accuracy are closely related to the minimization of circuit area, and some tradeoffs between the sizing of the devices and the achieved accuracy are inevitable. In the target application of array processing, a satisfactory accuracy would be in the range of 5–6 bits. The maximum error in processing should conform to this requirement, so that the probability or the effects of larger errors are small enough to make these occurrences negligible, i.e., an average accuracy of 6 bits might not be enough. Not all of the transistors in the ranked order filter, however, share the same accuracy requirements, which means that circuit area can be saved by correct optimization of the different devices. The transistors which have the largest effect on the accuracy of the circuit are the feedback transistors and the output transistor, because the extracted input is mirrored to the output transistor from the feedback transistor corresponding to the correct rank. A large mismatch between the feedback transistors and the output transistor directly translates into a large error in the operation of the circuit. It is therefore clear that as much area as possible should be devoted to these devices. The operation of the circuit is not as sensitive to mismatch in the transistors
Fig. 3. Differential operation of the coefficient current sources.
which make up the switch/source block. As can be seen from (6), to extract the correct rank, the bias value can be set anywhere within a fairly large range, relative to the sum of the coefficient currents. Larger mismatch can therefore be allowed in the unity currents without losing the correct functionality of the circuit. The accuracy required for the rank selection bias current, however, depends on the number of input signals to the ranked order filter, since the magnitude of the bias value for extracting the minimum or other higher ranks becomes larger with a large number of inputs and thus a smaller relative error can be tolerated. The coefficient and bias currents should also be correlated, i.e., when implemented in a large processor array they should be created locally from the same reference source, to avoid a situation where, in the worst case, the value of the unity current and the bias current would have significant gain errors in opposite directions. The least critical devices in terms of mismatch are the switch transistors, which can be made very small, with a minimum channel length. The transistor sizes used in the test chip, discussed later, were selected according to what would be desirable, area-wise, when the circuit is integrated in a large array structure, in order to verify whether this sizing is adequate or whether larger devices have to be used. The W/L ratios of the transistors were 0.5 µm/0.5 µm for the coefficient current sources, 0.5 µm/1 µm for the feedback transistors, 0.5 µm/0.18 µm for the switches, and 0.8 µm/0.5 µm for the input transistors. The bias current was directly provided by an on-chip digital-to-analog converter (DAC). Another source of possible inaccuracy, in addition to device mismatch, is the corner error. As was already mentioned, the corner error, which is created when two inputs are very close to
each other and to the rank to be extracted, is a serious source of errors for ranked order filters based on low-gain transconductance amplifiers; the current-mode ranked order filter of [15], however, exhibited very small corner errors. The sensitivity of the proposed ranked order filter to closely spaced inputs was examined through the simulation shown in Fig. 3. The difference between the output currents of two switch/source blocks was plotted as a function of the difference in the input currents corresponding to the respective blocks. One of the input currents was kept constant at 3 µA, the other input was swept from 0 to 5 µA, and the maximum was extracted by setting the bias current to 250 nA; the value of the unity current was 500 nA. It can be seen that the switch/source block corresponding to the swept input starts to conduct when the difference between the inputs becomes smaller than approximately 10 nA, after which the output value is no longer unambiguously determined by just one of the inputs. The magnitude of this difference, at 0.3% of the input values, is very small compared to the allowed mismatch in the current mirroring, which is approximately 2%–3% of the input range of the filter. The corner error is therefore not a dominant factor limiting the performance of the circuit.
D. Power Consumption
One of the design requirements for any component in a large parallel processor array is low power consumption. The ranked order filter discussed here is also very well suited for its target application in this regard. The overall power consumption during processing is dependent on the inputs to the circuit; the power consumption due to the rank selection, i.e., caused by the
Fig. 4. Simulated maximum extraction and the extraction error.
bias current, is, however, more straightforward to determine. If a value of 500 nA is used for the unity currents of the coefficient current sources, the bias current used to select the extracted rank will be in the range of 0 to 4.5 µA in the case of nine inputs, and the maximum power consumption due to biasing is 8.1 µW with a 1.8-V power-supply voltage. However, the circuit also works with smaller unity currents, such as 250 nA, in which case the bias power consumption is halved. The bias power consumption is at its maximum when the minimum of the inputs is extracted. However, in this case, the input currents are limited to approximately the value of the output, i.e., the inputs cannot be larger than the extracted minimum output. This means that although the bias power consumption is at a maximum, possible large input currents are limited at the inputs, thus lowering power consumption. An approximation of the maximum power consumption of the circuit can be made if the maximum of the input signal range is set at 5 µA, the maximum value is applied to all of the nine inputs, and the minimum is extracted with a unity current value of 500 nA. This leads to a maximum power consumption of approximately 100 µW.
V. CIRCUIT SIMULATIONS
The ranked order circuit was simulated in Cadence Spectre at schematic level by using a 0.18-µm CMOS technology and Level 50 transistor models. The power supply voltage was 1.8 V. The sizes of the transistors were the same as those given in Section IV, with the size of the bias transistor at 1 µm/1 µm. The circuit was not optimized for the best possible
simulated performance; rather, the device sizes were selected targeting the lowest possible area usage. First, the static behavior of a nine-input ranked order circuit was simulated. One of the inputs was swept from 0 to 5 µA while the other inputs were at the following levels: 2 µA, 2.5 µA, four times 3.35 µA, and two times 3.5 µA. The output current, conducted through a load provided by a diode-connected PMOS transistor, was plotted for different extracted ranks. In Fig. 4, the extracted maximum current and the error in the maximum extraction, i.e., the difference between the input current to the circuit (through the input transistor) and the extracted output, are shown as a function of the swept input current. For the maximum extraction, the bias current was set to 250 nA. In the next plot, Fig. 5, the extraction of the minimum and the error in the extraction are shown for the same set of input currents. The bias current was now set to a value of 4.25 µA to extract the ninth rank. In the third simulation, shown in Fig. 6, the second smallest current was extracted, with the inputs set as before. The bias current value was now set to 3.75 µA. The extraction of the median and the weighted median were also simulated. Eight of the inputs were now set to equally spaced values of 1, 1.5, 2, 2.5, 3, 3.5, 4, and 4.5 µA, and the final input was swept from 0 to 5 µA. Fig. 7 shows the result of the unweighted median extraction and Fig. 8 shows a weighted median extraction, where the ninth, swept input was assigned a weight of 3. For the median extraction, the bias value was set to 2.25 µA; however, since the weighted median extraction was the equivalent of extracting the median from 11 inputs, the bias was set to 2.75 µA. In the final dc simulation, the inputs were set to the same equally spaced values as in the median
Fig. 5. Simulated minimum extraction and the extraction error.
Fig. 6. Simulated extraction of the second smallest value.
simulation, with the swept input replaced by a value of 5 µA, and the bias current was swept from 0 to 5 µA to illustrate the robustness of the rank selection. The nine extracted output ranks can be seen in Fig. 9. A transient simulation was also performed to examine the speed of the evaluation. A 5-MHz sine input was applied to
one of the inputs, and the other inputs were set to 3 × 1 µA, 2 × 1.5 µA, 1 × 2 µA, and 2 × 2.5 µA. The ideal sine input and the extracted maximum output are shown in Fig. 10. It can be seen that the circuit settles to a new value very quickly. Because the transient performance is dependent on the capacitive load of the inputs and on the gain of the circuit, it is related to the
Fig. 7. Simulated extraction of the median.
Fig. 8. Simulated extraction of the weighted median.
sizing of the transistors and to the values of coefficient and bias currents used. However, the speed of the circuit is sufficient for the parallel processing application, where the input signals are ideally constant during the processing.
Fig. 9. Simulated extraction of all nine ranks by sweeping the bias current value.
Fig. 10. Simulated transient performance.
The simulations show that the ranked order filter circuit performs the different order statistic operations correctly and with high performance. The accuracy is well in line with the desired specifications, that is, the nominal errors are insignificant compared to the magnitude of the mismatch errors allowed in the practical realization. The operation speed of the circuit is also very high, which means that the order statistic functions can be processed with a convergence time of less than 50 ns.
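For readers who want a reference against which such extraction errors can be judged, the static stimuli described above can be reproduced with an ideal, purely behavioral rank extractor. The Python sketch below uses sorting in place of the analog circuit, so it yields the error-free target curves only; the function names, the sweep resolution, and the replication-based treatment of the weight are assumptions made for illustration. Currents are in microamperes.

```python
def kth_largest(values, k):
    """Ideal rank extraction: rank 1 is the largest value."""
    return sorted(values, reverse=True)[k - 1]


def sweep(fixed_inputs, k, swept_weight=1, steps=11, start=0.0, stop=5.0):
    """Sweep one extra input from start to stop and return (swept, output) pairs.

    swept_weight replicates the swept input, which is how a weight is
    interpreted in the ranked order equations (an idealization of the circuit).
    """
    points = []
    for n in range(steps):
        swept = start + (stop - start) * n / (steps - 1)
        values = list(fixed_inputs) + [swept] * swept_weight
        points.append((swept, kth_largest(values, k)))
    return points


# Fixed input sets quoted in Section V, in microamperes.
STATIC_SET = [2.0, 2.5, 3.35, 3.35, 3.35, 3.35, 3.5, 3.5]
MEDIAN_SET = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]

if __name__ == "__main__":
    for label, fixed, k, weight in (
        ("maximum", STATIC_SET, 1, 1),
        ("minimum", STATIC_SET, 9, 1),
        ("second smallest", STATIC_SET, 8, 1),
        ("median", MEDIAN_SET, 5, 1),
        ("weighted median (weight 3)", MEDIAN_SET, 6, 3),
    ):
        curve = sweep(fixed, k, swept_weight=weight)
        print(label, [round(out, 2) for _, out in curve])
```

The weighted case uses rank 6 because a weight of 3 on the swept input turns the nine inputs into eleven effective inputs, as stated for the weighted median simulation.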
Fig. 11. Layout of the nine-input ranked order filter.
Fig. 12. Measurement setup.
VI. TEST CHIP MEASUREMENTS
A test circuit of a nine-input ranked order filter was manufactured in a 0.18-µm standard digital technology. The sizes of the transistors were those given in Section IV. The layout of the nine-input ranked order filter is shown in Fig. 11, where the layout area for the ranked order circuit is approximately 6 × 13 µm, which also includes additional switches used to select/deselect inputs to the circuit. Fig. 12 shows a schematic of the test chip setup for two inputs and the output of the ranked order filter. The test chip also includes other circuitry, which partly shares the same input/output pads.
The input currents to the filter and the bias current for selecting the desired rank to be extracted are provided by ten on-chip 8-bit digital-to-analog converters. The coefficient currents are mirrored directly from an input transistor biased with an off-chip resistor. Some switches have been added to the circuit to allow for the measurement of different aspects of the system. Switches SW1 n, connected directly to a common output pad, are used to measure the outputs of the DACs, while switches SW2 n (ON/OFF switches in Fig. 11) disconnect the coefficient current sources from the ranked order filter, i.e., the inputs to the ranked order filter can be switched off when necessary. Switch SW3 is used to measure the output of the
984
TABLE I MEASURED CURRENT VALUES AND THEIR MISMATCHES
Fig. 13.
Measured extraction of all nine ranks by changing the bias current.
ranked order filter, and switch SW4 can be used to connect the common gate-node to an output pad, for example, in order to measure the outputs of the the coefficient current sources or the bias DAC. The NMOS switches were driven from a higher 2.5-V power supply, to limit their effects on the current values. For the measurements presented here, the input DACs were biased with a current value of 125 nA, which was mirrored directly to the fourth binary weighted output. The smaller binary weighted current sources were scaled down from the bias value, i.e., the nominal LSB current of the input DACs was 15.6 nA. The rank-selection DAC was biased with a current twice the value of the input DACs. The mismatch between the input and output transistors of the ranked order filter was measured by turning off all but one of the ranked order circuit inputs at a time, and setting the bias value to realize maximum extraction. With this setup, the one input being examined is directly mirrored to the output of the circuit. Table I shows some selected values, first measured from the outputs of the DACs by connecting switches SW1 n to a 1-V bias voltage, and the corresponding output values of the ranked order
circuit. The output of the circuit was also connected to a 1-V bias voltage for the current measurements. It can be noticed that there is a common negative gain error between the output and all inputs, i.e., the output is always smaller than the input. For large input values this is partly due to the high resistance of the ranked order inputs, caused by the narrow and long feedback transistors, and can also be noticed in the simulations, where the input currents to the circuit are also smaller than ideal. Long channel devices are however preferred because of mismatch concerns. Table I also shows the calculated mismatches for the inputs, if the output of the ranked order filter were to be compensated with a gain of 10%, which could be quite simply implemented. The mismatch after compensation is reasonably good, considering the small size of the transistors. The mismatch between the unity currents provided by the coefficient current sources was also measured by connecting the gate node of the feedback transistors to an output pin through transistor SW4. The results are also shown in Table I. The ideal unity current, created with an off-chip resistor was 506 nA. It can be noticed that, considering the small size of the coefficient
985
Fig. 14.
Measured maximum extraction.
Fig. 15.
Measured minimum extraction.
transistors, the mismatch is not very severe, and is well tolerated because of the robustness of the bias selection. Fig. 13 shows the measured output as nine input values, with the values of Table I were applied to the circuit and the rank selection bias current was swept from 0 to approximately 4.5 A. It can be noticed that the circuit clearly distinguishes all nine ranks and the selection of the correct rank is robust. It can be seen that the output decreases slightly over each bias region, which can also be noticed in simulations. However, the change in the output is small, considering the targeted resolution, and therefore, not a serious problem.
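The DAC scaling quoted in the measurement setup can be checked with a short calculation. The sketch below assumes that the "fourth binary-weighted output" is counted from the LSB (weight 8), which reproduces the stated 15.6-nA LSB; the constant and function names are hypothetical. The last line compares nine unity currents with the 4.5-µA bias sweep, which suggests, although the text here does not state it, that each rank occupies roughly one unity current of bias.

```python
DAC_BITS = 8                 # from the text: 8-bit DACs
INPUT_DAC_BIAS_NA = 125.0    # from the text: bias current of the input DACs
UNITY_CURRENT_NA = 506.0     # from the text: ideal coefficient (unity) current
NUM_INPUTS = 9               # nine-input ranked order filter

# Assumption: the bias is mirrored to the fourth output counted from the LSB,
# i.e. the branch with binary weight 2**3 = 8.
BIASED_BIT_INDEX = 3


def lsb_current_na(bias_na: float, biased_bit_index: int) -> float:
    """LSB current when the bias sets the branch of weight 2**biased_bit_index."""
    return bias_na / (2 ** biased_bit_index)


def full_scale_na(lsb_na: float, bits: int) -> float:
    """Nominal full-scale output with every binary-weighted branch switched on."""
    return lsb_na * (2 ** bits - 1)


input_lsb = lsb_current_na(INPUT_DAC_BIAS_NA, BIASED_BIT_INDEX)     # 15.625 nA
input_full_scale = full_scale_na(input_lsb, DAC_BITS)               # about 3.98 uA

# The rank-selection DAC is biased with twice the current of the input DACs.
rank_lsb = lsb_current_na(2 * INPUT_DAC_BIAS_NA, BIASED_BIT_INDEX)  # 31.25 nA
rank_full_scale = full_scale_na(rank_lsb, DAC_BITS)                 # about 7.97 uA

# Nine unity currents, for comparison with the ~4.5 uA bias sweep of Fig. 13.
nine_unity_currents = NUM_INPUTS * UNITY_CURRENT_NA                 # 4554 nA
```

Under these assumptions the doubled bias of the rank-selection DAC gives it enough range to cover the full nine-rank sweep, whereas an input DAC alone tops out near 4 µA.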
The extraction of the maximum, minimum, and median values was measured with four constant inputs and a fifth input varied with an input DAC. The constant inputs had the values given in Table I for inputs 2, 4, 6, and 8, and input 1 was swept over the input range. Inputs 3, 5, 7, and 9 were turned off. As functions of input 1, the maximum extraction is shown in Fig. 14, the minimum extraction in Fig. 15, and the median extraction in Fig. 16.
Fig. 16. Measured median extraction.
The operation of the circuit was also measured with the inputs set to values close to each other. Five inputs were used, set to 230, 257, 272, 280, and 321 nA. To remove the effects of input–output mismatch, the values were first measured one at a time from the output of the circuit by turning off the other inputs. When all five inputs were applied simultaneously, the measured extracted outputs were: minimum […] nA, maximum […] nA, and median […] nA. These values show very accurate operation, considering the maximum accuracy of the current measurements, which was 1 nA for small currents ([…] µA). This confirms that, as the simulations also showed, corner error is not significant for the circuit. The mismatch between the inputs and the output of the ranked order filter creates errors of considerably larger magnitude.
The measurement results show that the ranked order filter operates very well; however, the errors caused by mismatch in the circuit are larger than desired. The target is a maximum error of 2%–3% of an input range of 4–4.5 µA, i.e., a maximum error of approximately 100 nA. The errors caused by mismatch in the feedback transistors can be decreased by making the devices larger. Because this initial test circuit is so small, the transistor sizes can still be increased without making the circuit area too large for the application. The layout of the ranked order filter cell can also be further optimized in terms of mismatch sensitivity and unused layout space.
VII. CONCLUSION
A very compact current-mode ranked order filter circuit was discussed and its operation was analyzed with the help of simulations. The correct operation was also verified through measurements of a test circuit manufactured in a 0.18-µm CMOS technology. The low number of transistors in the implementation and the small size of the transistors make this ranked order filter a good hardware realization for a massively parallel analog array processor. The layout area of the nine-input ranked order filter was only 6 × 13 µm. The circuit can perform fully programmable ranked order extraction with very high speed of operation and low power consumption. The simulation and measurement results show that the selection of the rank to be extracted is very robust, and the manufactured circuit extracts the different ranks correctly. Corner errors are not significant for the circuit. The measured mismatch between the inputs and outputs of the ranked order filter is still larger than preferred; however, the transistors in the circuit are very small and can be made larger to increase accuracy without making the whole circuit prohibitively large for inclusion in a parallel processor array. This shows that truly parallel order statistic processing in an array processing architecture can be practically realized.
REFERENCES
[1] L. Yin, R. Yang, M. Gabbouj, and Y. Neuvo, "Weighted median filters: A tutorial," IEEE Trans. Circuits Syst. II, vol. 43, pp. 157–192, Mar. 1996.
[2] J. Goutsias, L. Vincent, and D. S. Bloomberg, Mathematical Morphology and Its Applications to Image and Signal Processing. Norwell, MA: Kluwer Academic, 2000.
[3] T. Roska and L. O. Chua, "The CNN universal machine: An analogic array computer," IEEE Trans. Circuits Syst. II, vol. 40, pp. 163–173, Mar. 1993.
[4] B. E. Shi, "Order statistic filtering with cellular neural networks," in Proc. 3rd IEEE Int. Workshop on Neural Networks and Their Applications, Rome, Italy, Dec. 1994, pp. 441–443.
[5] C. Rekeczky, T. Roska, and A. Ushida, "CNN-based difference-controlled adaptive nonlinear image filters," Int. J. Circuit Theory Applicat., vol. 26, pp. 375–423, 1998.
[6] K. Slot, J. Kowalski, A. Napieralski, and T. Kacprzak, "Analogue median/average image filter based on cellular neural network paradigm," Electron. Lett., vol. 35, no. 19, pp. 1619–1620, 1999.
[7] A. Paasio and K. Halonen, "An analogue circuit for weighted ranked order filtering," in Proc. Eur. Conf. Circuit Theory and Design (ECCTD'01), vol. 1, Espoo, Finland, 2001, pp. 125–128.
[8] A. Paasio, A. Kananen, M. Laiho, and K. Halonen, "A compact computational core for image processing," in Proc. Eur. Conf. Circuit Theory and Design (ECCTD'01), vol. 1, Espoo, Finland, 2001, pp. 337–339.
[9] K. Urahama and T. Nagao, "Direct analog rank filtering," IEEE Trans. Circuits Syst. II, vol. 42, pp. 385–388, July 1995.
[10] J. Lazzaro, S. Ryckebusch, M. Mahowald, and C. A. Mead, "Winner-take-all networks of O(N) complexity," Adv. Neural Info. Processing Syst., vol. 1, pp. 703–711, 1989.
[11] C. Y. Huang, C. J. Wang, and B. D. Liu, "Modular current-mode multiple input minimum circuit for fuzzy logic controllers," in Proc. IEEE Int. Symp. Circuits and Systems, vol. 3, 1996, pp. 361–363.
[12] P. Dietz and R. Carley, "An analog circuit technique for finding the median," in Proc. IEEE Custom Integrated Circuits Conf., San Diego, CA, May 1993, pp. 6.1.1–6.1.4.
[13] I. E. Opris and G. T. A. Kovacs, "A high-speed median circuit," IEEE J. Solid-State Circuits, vol. 32, pp. 905–908, June 1997.
[14] I. E. Opris, "Analog rank extractors," IEEE Trans. Circuits Syst. I, vol. 44, pp. 1114–1121, Dec. 1997.
[15] G. Fikos, S. Vlassis, and S. Siskos, "High-speed, accurate analogue CMOS rank filter," Electron. Lett., vol. 36, no. 7, pp. 593–594, 2000.
[16] B. P. Tan and D. M. Wilson, "Semiparallel rank order filtering in analog VLSI," IEEE Trans. Circuits Syst. II, vol. 48, pp. 198–205, Feb. 2001.
[17] J. Ramirez-Angulo, C. Lackey, and A. Diaz-Sanchez, "Compact continuous-time analog rank-order filter implementation in CMOS technology," in Proc. IEEE Int. Symp. Circuits and Systems (ISCAS'02), Phoenix, AZ, May 2002, pp. V-65–68.
[18] J. Ramirez-Angulo, R. Gonzalez-Carvajal, and G. O. Ducoudray, "New very compact CMOS continuous-time, low-voltage analog rank-order filter architecture," in Proc. IEEE Int. Symp. Circuits and Systems (ISCAS'03), Bangkok, Thailand, May 2003, pp. 805–808.
Jonne Poikonen (S'03) was born in Raisio, Finland, in 1975. He received the M.Sc. degree in electronics from the University of Turku, Turku, Finland, in 2001. He is currently working toward the Ph.D. degree at the same university. He is also working in the Department of Information Technology, University of Turku, and is affiliated with the Turku Centre for Computer Science Graduate School, Turku, Finland. His main research interests include physical implementations of analog signal-processing techniques for array processing.
Ari Paasio (S'97–M'99) was born in Turku, Finland, in 1969. He received the M.Sc. degree in electrical engineering, the Licentiate of Technology degree, and the Doctor of Technology degree from the Helsinki University of Technology, Helsinki, Finland, in 1993, 1996, and 1999, respectively. From 1992 to 1998, he was employed by the Electronic Circuit Design Laboratory, Helsinki University of Technology. Since May 1998, he has been partly affiliated with the University of Turku. Since January 2003, he has been a Professor of Microelectronics, University of Turku. His research interests include system design aspects of very large-scale integration implementations, especially silicon implementations of parallel processor arrays.

1057-7122/04$20.00 © 2004 IEEE
