Parallel processing in digital signal processing (DSP) is a technique that duplicates function units so that different tasks (signals) can be processed simultaneously.[1] Accordingly, the same processing can be performed on different signals by the corresponding duplicated function units. Furthermore, because of this duplication, a parallel DSP design often has multiple outputs, resulting in higher throughput than a non-parallel design.
Contents
1 Conceptual Example
2 Parallel Processing Versus Pipelining
3 Parallel FIR Filters
4 Parallel 1st-order IIR Filters
5 Parallel Processing for Low Power
6 Reference
Conceptual Example
Consider a function unit (F0) and three tasks (T0, T1 and T2). The time required for the function unit F0 to process these tasks is $t_0$, $t_1$ and $t_2$ respectively. If the three tasks are processed in sequential order, the time required to complete them is $t_0 + t_1 + t_2$.
However, if we duplicate the function unit into two additional copies (F), so that each task runs on its own unit, the aggregate time is reduced to $\max(t_0, t_1, t_2)$, which is smaller than the sequential total.
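The comparison above can be checked with a few lines of Python. This is only a minimal sketch: the task durations and the function names `sequential_time` and `parallel_time` are hypothetical, chosen for illustration.

```python
# Hypothetical durations t0, t1, t2 in arbitrary time units.
task_times = [3.0, 5.0, 2.0]


def sequential_time(times):
    """One function unit handles the tasks one after another."""
    return sum(times)


def parallel_time(times):
    """One function unit per task, all started together: the slowest task dominates."""
    return max(times)


seq = sequential_time(task_times)
par = parallel_time(task_times)
print(f"sequential: {seq}, parallel: {par}, speedup: {seq / par:.2f}x")
```

The speedup approaches the number of copies only when the tasks take similar amounts of time; a single long task limits the gain.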
en.wikipedia.org/wiki/Parallel_Processing_(DSP_implementation)
which is shown in the following figure. Assume the computation time is $T_m$ for a multiplication unit and $T_a$ for an add unit. The sample period is given by
By parallelizing it, the resultant architecture is shown as follows. The sample period now becomes
$$T_{sample} = \frac{T_{clock}}{N},$$
where $N$ represents the number of copies. Note that $T_{sample} \neq T_{clock}$ in a parallel system, while $T_{sample} = T_{clock}$ holds in a pipelined system.
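To make the block-processing idea concrete, the following Python sketch models an N-parallel FIR filter in software. It is an illustration under stated assumptions, not the architecture from the figure: the coefficients in `h`, the choice of three copies, the clock period value, and the function names `fir_sequential` and `fir_parallel` are all hypothetical. Each pass of the outer loop stands for one clock cycle, in which `n_copies` input samples are consumed and `n_copies` outputs are produced.

```python
def fir_sequential(x, h):
    """Reference FIR filter: one output sample per sample period."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, coeff in enumerate(h):
            if n - k >= 0:
                acc += coeff * x[n - k]
        y.append(acc)
    return y


def fir_parallel(x, h, n_copies):
    """Block FIR filter: every clock cycle produces n_copies outputs."""
    if len(x) % n_copies != 0:
        raise ValueError("pad the input to a multiple of n_copies")
    y = [0.0] * len(x)
    for block in range(len(x) // n_copies):   # one iteration models one clock cycle
        for i in range(n_copies):             # models the duplicated units working side by side
            n = block * n_copies + i
            y[n] = sum(c * x[n - k] for k, c in enumerate(h) if n - k >= 0)
    return y


h = [0.5, 0.3, 0.2]                  # hypothetical coefficients
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # hypothetical input samples
clock_period = 9.0                   # hypothetical, arbitrary time units
n_copies = 3

print(fir_parallel(x, h, n_copies))
print("effective sample period:", clock_period / n_copies)
```

The outputs match the sequential filter sample for sample; only the number of samples handled per clock cycle changes, which is why the effective sample period is the clock period divided by the number of copies.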
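Parallel 1st-order IIR Filters
Consider a first-order IIR filter with input $u(n)$, output $y(n)$ and transfer function
$$H(z) = \frac{z^{-1}}{1 - a z^{-1}},$$
where $|a| < 1$ for stability; such a filter has only one pole, located at $z = a$. The corresponding recursive representation is
$$y(n+1) = a\,y(n) + u(n).$$
Consider the design of a 4-parallel architecture ($N = 4$). In such a parallel system, each delay element represents a block delay, and the clock period is four times the sample period. Therefore, by iterating the recursion with $n = 4k$, we have
$$y(4k+4) = a^4 y(4k) + a^3 u(4k) + a^2 u(4k+1) + a\,u(4k+2) + u(4k+3).$$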
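The iteration with $n = 4k$ can be checked numerically. The sketch below assumes the first-order recursion $y(n+1) = a\,y(n) + u(n)$, in which the input name `u` and the function names `iir_sequential` and `iir_block4` are assumptions made for illustration. Unrolling that recursion four times gives $y(4k+4) = a^4 y(4k) + a^3 u(4k) + a^2 u(4k+1) + a\,u(4k+2) + u(4k+3)$, which the block version evaluates once per clock cycle.

```python
def iir_sequential(u, a, y0=0.0):
    """y(n+1) = a*y(n) + u(n); returns [y(0), y(1), ..., y(len(u))]."""
    y = [y0]
    for sample in u:
        y.append(a * y[-1] + sample)
    return y


def iir_block4(u, a, y0=0.0):
    """Returns [y(0), y(4), y(8), ...] using the 4-step unrolled recursion."""
    if len(u) % 4 != 0:
        raise ValueError("pad the input to a multiple of 4")
    a2, a3, a4 = a ** 2, a ** 3, a ** 4
    y = [y0]
    for k in range(len(u) // 4):              # one iteration models one clock cycle
        u0, u1, u2, u3 = u[4 * k: 4 * k + 4]
        y.append(a4 * y[-1] + a3 * u0 + a2 * u1 + a * u2 + u3)
    return y


a = 0.9                                              # hypothetical pole, |a| < 1
u = [1.0, 0.5, -0.25, 2.0, 0.0, 1.5, -1.0, 0.75]     # hypothetical input
print(iir_block4(u, a))
print(iir_sequential(u, a)[::4])                     # every fourth sequential output
```

Both printed lists agree up to floating-point rounding, and the feedback coefficient of the block recursion is `a ** 4` rather than `a`.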
The resultant parallel design has the following properties. The pole of the original filter is at $z = a$, while the pole of the parallel system is at $z = a^4$, which is closer to the origin. This pole movement improves the robustness of the system to round-off noise. The hardware complexity of this architecture is $N \times N$ multiply-add operations. Note that this quadratic increase in hardware complexity can be reduced by exploiting concurrency and incremental computation to avoid repeated calculations.
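One possible reading of the incremental-computation remark is sketched below in Python. It assumes the recursion $y(n+1) = a\,y(n) + u(n)$ with $N = 4$; the function name `iir_block4_incremental` and the test values are hypothetical. Only the block state $y(4k)$ is updated with the four-step unrolled recursion, and the remaining outputs of the block are derived from it one step at a time. Under this reading a block costs 4 + 3 = 7 multiply-adds instead of 16.

```python
def iir_block4_incremental(u, a, y0=0.0):
    """Return [y(0), ..., y(len(u))] for y(n+1) = a*y(n) + u(n), four samples per block."""
    if len(u) % 4 != 0:
        raise ValueError("pad the input to a multiple of 4")
    a2, a3, a4 = a ** 2, a ** 3, a ** 4
    y = [y0]
    state = y0                                # y(4k), held across block delays
    for k in range(len(u) // 4):
        u0, u1, u2, u3 = u[4 * k: 4 * k + 4]
        # Incremental outputs: one multiply-add each, reusing the previous result.
        y1 = a * state + u0
        y2 = a * y1 + u1
        y3 = a * y2 + u2
        # State update: depends only on y(4k) and the block inputs (4 multiply-adds).
        state = a4 * state + a3 * u0 + a2 * u1 + a * u2 + u3
        y.extend([y1, y2, y3, state])
    return y


a = 0.9                                              # hypothetical pole, |a| < 1
u = [1.0, 0.5, -0.25, 2.0, 0.0, 1.5, -1.0, 0.75]     # hypothetical input
print(iir_block4_incremental(u, a))
print("original pole magnitude:", abs(a), "block pole magnitude:", abs(a) ** 4)
```

The state update does not wait for the incremental outputs, so the feedback loop still closes once per clock cycle.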
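Parallel Processing for Low Power
The dynamic power consumption of a CMOS circuit with supply voltage $V_0$ and clock frequency $f$ is
$$P_0 = C_{total} V_0^2 f,$$
where $C_{total}$ represents the total capacitance of the CMOS circuit. For a parallel version, the charging capacitance remains the same, but the total capacitance increases $N$ times. To maintain the same sample rate, the clock period of the $N$-parallel circuit increases to $N$ times the propagation delay of the original circuit, so the time available for charging is $N$ times longer. The supply voltage can therefore be reduced to $\beta V_0$, with $0 < \beta < 1$. The power consumption of the $N$-parallel system can then be formulated as
$$P_{para} = (N C_{total}) (\beta V_0)^2 \frac{f}{N} = \beta^2 C_{total} V_0^2 f = \beta^2 P_0.$$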
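The power argument can be illustrated numerically. The sketch below assumes the standard CMOS dynamic-power model $P = C_{total} V^2 f$ and writes the reduced supply voltage as $\beta V_0$ with a scaling factor $0 < \beta < 1$; the function names and all numeric values, including `beta`, are hypothetical. How far the voltage can actually be lowered depends on a delay model that is not covered here, so `beta` is simply supplied as an input.

```python
def cmos_dynamic_power(c_total, v_supply, f_clock):
    """Dynamic power of a CMOS circuit: P = C_total * V^2 * f."""
    return c_total * v_supply ** 2 * f_clock


def parallel_power(c_total, v0, f_clock, n_copies, beta):
    """N-parallel version: N times the capacitance, clock slowed by N, supply scaled to beta*v0."""
    return cmos_dynamic_power(n_copies * c_total, beta * v0, f_clock / n_copies)


# Hypothetical numbers, chosen only for illustration.
p0 = cmos_dynamic_power(c_total=1e-9, v_supply=5.0, f_clock=50e6)
p_par = parallel_power(c_total=1e-9, v0=5.0, f_clock=50e6, n_copies=3, beta=0.6)
print(f"original: {p0:.3f} W, parallel: {p_par:.3f} W, ratio: {p_par / p0:.3f}")
```

The factor of N in capacitance cancels against the factor of 1/N in clock frequency, so the ratio depends only on `beta ** 2`.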
Reference
1. K.K. Parhi, VLSI Digital Signal Processing Systems: Design and Implementation, John Wiley & Sons, 1999.
2. Slides for VLSI Digital Signal Processing Systems: Design and Implementation, John Wiley & Sons, 1999 (ISBN 0-471-24186-5): http://www.ece.umn.edu/users/parhi/slides.html
Retrieved from "http://en.wikipedia.org/w/index.php?title=Parallel_Processing_(DSP_implementation)&oldid=500001867". Categories: Digital signal processing. This page was last modified on 30 June 2012 at 03:59. Text is available under the Creative Commons Attribution-ShareAlike License; additional terms may apply. See Terms of Use for details. Wikipedia is a registered trademark of the Wikimedia Foundation, Inc., a non-profit organization.