
EURASIP Journal on Wireless Communications and Networking
OFDMA Architectures, Protocols, and Applications
Guest Editors: Victor C. M. Leung, Alister G. Burr, Lingyang Song, Yan Zhang, and Thomas Michael Bohnert
Copyright © 2009 Hindawi Publishing Corporation. All rights reserved.
This is a special issue published in volume 2009 of “EURASIP Journal on Wireless Communications and Networking.” All articles are
open access articles distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and
reproduction in any medium, provided the original work is properly cited.
Editor-in-Chief
Luc Vandendorpe, Université catholique de Louvain, Belgium
Associate Editors
Thushara Abhayapala, Australia
Mohamed H. Ahmed, Canada
Farid Ahmed, USA
Carles Antón-Haro, Spain
Anthony C. Boucouvalas, Greece
Lin Cai, Canada
Yuh-Shyan Chen, Taiwan
Pascal Chevalier, France
Chia-Chin Chong, South Korea
Soura Dasgupta, USA
Ibrahim Develi, Turkey
Petar M. Djurić, USA
Mischa Dohler, Spain
Abraham O. Fapojuwo, Canada
Michael Gastpar, USA
Alex Gershman, Germany
Wolfgang Gerstacker, Germany
David Gesbert, France
Fary Ghassemlooy, UK
Christian Hartmann, Germany
Stefan Kaiser, Germany
George K. Karagiannidis, Greece
Chi Chung Ko, Singapore
Visa Koivunen, Finland
Nicholas Kolokotronis, Greece
Richard Kozick, USA
Sangarapillai Lambotharan, UK
Vincent Lau, Hong Kong
David I. Laurenson, UK
Tho Le-Ngoc, Canada
Wei Li, USA
Tongtong Li, USA
Zhiqiang Liu, USA
Steve McLaughlin, UK
Sudip Misra, India
Ingrid Moerman, Belgium
Marc Moonen, Belgium
Eric Moulines, France
Sayandev Mukherjee, USA
Kameswara Rao Namuduri, USA
Amiya Nayak, Canada
Claude Oestges, Belgium
A. Pandharipande, The Netherlands
Phillip Regalia, France
A. Lee Swindlehurst, USA
George S. Tombras, Greece
Lang Tong, USA
Athanasios Vasilakos, Greece
Ping Wang, Canada
Weidong Xiang, USA
Yang Xiao, USA
Xueshi Yang, USA
Lawrence Yeung, Hong Kong
Dongmei Zhao, Canada
Weihua Zhuang, Canada
Contents
OFDMA Architectures, Protocols, and Applications, Victor C. M. Leung, Alister G. Burr, Lingyang Song, Yan Zhang, and Thomas Michael Bohnert
Volume 2009, Article ID 703083, 4 pages
A Fast LMMSE Channel Estimation Method for OFDM Systems, Wen Zhou and Wong Hing Lam
Linearly Time-Varying Channel Estimation and Symbol Detection for OFDMA Uplink Using Superimposed Training, Han Zhang, Xianhua Dai, Dong Li, and Sheng Ye
DFT-Based Channel Estimation with Symmetric Extension for OFDMA Systems, Yi Wang, Lihua Li, Ping Zhang, and Zemin Liu
Near-Optimum Detection with Low Complexity for Uplink Virtual MIMO Systems, Sanhae Kim, Oh-Soon Shin, and Yoan Shin
Separate Turbo Code and Single Turbo Code Adaptive OFDM Transmissions, Lei Ye and Alister Burr
Multiresolution with Hierarchical Modulations for Long Term Evolution of UMTS, Américo Correia, Nuno Souto, Armando Soares, Rui Dinis, and João Silva
An Opportunistic Error Correction Layer for OFDM Systems, Xiaoying Shao, Roel Schiphorst, and Cornelis H. Slump
Service Differentiation in OFDM-Based IEEE 802.16 Networks, Yi Zhou, Kai Chen, Jianhua He, Haibin Guan, Yan Zhang, and Alei Liang
Multiuser Radio Resource Allocation for Multiservice Transmission in OFDMA-Based Cooperative Relay Networks, Xing Zhang, Shuping Chen, and Wenbo Wang
Throughput Analysis of Band-AMC Scheme in Broadband Wireless OFDMA System, Sung K. Kim and Chung G. Kang
Contiguous Frequency-Time Resource Allocation and Scheduling for Wireless OFDMA Systems with QoS Support, I. Gutiérrez, F. Bader, R. Aquilué, and J. L. Pijoan
OFDMA-Based Medium Access Control for Next-Generation WLANs, H. M. Alnuweiri, Y. Pourmohammadi Fallah, P. Nasiopoulos, and S. Khan
Multiuser Resource Allocation Maximizing the Perceived Quality, Andreas Saul and Gunther Auer
Admission Control Threshold in Cellular Relay Networks with Power Adjustment, Ki-Dong Lee and Byung K. Yi
Advanced Receiver Design for Quadrature OFDMA Systems, Lin Luo, Jian (Andrew) Zhang, and Zhenning Shi
Residue Number System Arithmetic Assisted Coded Frequency-Hopped OFDMA, Dalin Zhu and Balasubramaniam Natarajan
Implementation of a Smart Antenna Base Station for Mobile WiMAX Based on OFDMA, Seungheon Hyeon, Changhoon Lee, Chang-eui Shin, and Seungwon Choi
Hindawi Publishing Corporation
doi:10.1155/2009/703083
Editorial
OFDMA Architectures, Protocols, and Applications
Victor C. M. Leung,1 Alister G. Burr,2 Lingyang Song,3 Yan Zhang,4 and Thomas Michael Bohnert5
1 Department of Electrical and Computer Engineering, The University of British Columbia, 2332 Main Mall,
Vancouver, BC, Canada V6T 1Z4
2 Department of Electronics, University of York, York, YO10-5DD, UK
3 Peking University, China
4 Simula Research Laboratory, 1325 Lysaker, Norway
5 SAP Research CEC Zurich, Switzerland
Correspondence should be addressed to Yan Zhang, yanzhang@ieee.org
Received 31 March 2009; Accepted 31 March 2009
Copyright © 2009 Victor C. M. Leung et al. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly
cited.
Welcome to this special issue of the EURASIP Journal on Wireless Communications and Networking (JWCN). This special issue is devoted to the latest research and development on Orthogonal Frequency-Division Multiple Access (OFDMA), from the physical and network layers to practical applications. OFDMA technologies are currently attracting intensive attention in wireless communications to meet the ever-increasing demands arising from the explosive growth of Internet, multimedia, and broadband services. OFDMA-based systems are able to deliver high data rates, operate in the hostile multipath radio environment, and allow efficient sharing of limited resources such as spectrum and transmit power between multiple users. OFDMA has been used in the mobility mode of IEEE 802.16 WiMAX, is currently a working specification in 3GPP Long Term Evolution downlink, and is the candidate access method for the IEEE 802.22 “Wireless Regional Area Networks.” Clearly, recent advances in wireless communication technology have led to significant innovations that enable OFDMA-based wireless access networks to provide better Quality-of-Service (QoS) than ever with convenient and inexpensive deployment and mobility.

However, regardless of the technology used, OFDMA networks must not only be able to provide reliable and high-quality broadband services but also be implemented cost-effectively and be operated efficiently. OFDMA presents many of the advantages and challenges of OFDM systems for single users, and the extension to multiple users introduces many further challenges and opportunities, both on the physical layer and at higher layers. These requirements present many challenges in the design of network architectures and protocols, which have motivated a significant amount of research in the area. Also, many critical problems associated with the applications of OFDMA technologies in future wireless systems are still looking for efficient solutions.

The aim of this special issue is to present a collection of high-quality research papers that report the latest research advances in OFDMA communications, networks, and systems, and their application in future wireless systems. For this special issue, we selected 17 papers from 36 submissions. The selected papers may be classified into four categories: Channel Estimation, Coding and Modulation, QoS and Resource Allocation, and Systems and Implementation. The first part includes 4 papers. The second part contains 3 papers on coding and modulation. There are 7 papers on QoS and resource allocation management, and 3 papers were selected on systems and implementation issues. A detailed overview of the selected works is given below.

Channel Estimation. This part describes the recent advances on channel estimation in OFDMA systems.

The first paper, “A fast LMMSE channel estimation method for OFDM systems,” reports a fast linear minimum mean square error (LMMSE) channel estimation method for OFDM systems. In comparison with conventional LMMSE channel estimation, the proposed channel estimation method does not require statistical knowledge
of the channel in advance and avoids the inversion of a large-dimensional matrix by using the FFT operation. Therefore, the computational complexity can be reduced significantly. Numerical results show that the NMSE of the proposed method is very close to that of the conventional LMMSE method. In addition, computer simulation shows that the performance of the proposed method is almost the same as that of the conventional LMMSE method in terms of bit error rate.

The second paper, “Linearly time-varying channel estimation and symbol detection for OFDMA uplink using superimposed training,” addresses superimposed training (ST)-based linearly time-varying (LTV) channel estimation and symbol detection for OFDMA systems. The study estimates the LTV channel transfer functions over the whole frequency band by using a weighted average procedure, thereby providing validity for adaptive resource allocation. In addition, an iterative symbol detector is presented to mitigate the superimposed training effects on information sequence recovery.

The third paper, “DFT-based channel estimation with symmetric extension for OFDMA systems,” presents a partial frequency response channel estimator for OFDMA systems. The partial frequency response is obtained by using the least squares (LS) method. A symmetric extension method is proposed to reduce the leakage power. After the IDFT of the symmetrically extended signal, the leakage power of the channel impulse response is self-cancelled. Simulation results show that the accuracy of the estimator is significantly higher than that of the conventional DFT-based channel estimator.

The fourth paper, “Near-optimum detection with low complexity for uplink virtual MIMO systems,” proposes two efficient MIMO decoding schemes that achieve near-optimum performance with low complexity for uplink virtual MIMO systems. The system has an iterative channel decoder using bit log-likelihood ratio information. The simulation results show that the proposed schemes achieve almost the same block error rate performance as that of the optimal MLD with only a minor increase in computational complexity.

Coding and Modulation. The first paper, “Separate turbo code and single turbo code adaptive OFDM transmissions,” studies adaptive modulation and adaptive rate turbo-coding in OFDM to increase throughput on the time and frequency selective channel. The adaptive turbo-code scheme is based on a subband adaptive method, and the paper compares two adaptive systems: a conventional approach where a separate turbo code is used for each subband and a single turbo code adaptive system which uses a single turbo code over all subbands. Simulation results show that the single turbo code adaptive system provides a significant performance improvement.

The second paper, “Multiresolution with hierarchical modulations for long term evolution of UMTS,” investigates mobile TV services over UMTS Long Term Evolution (LTE). By using multiresolution with hierarchical modulations, this service is expected to be broadcast to larger groups, achieving a significant reduction in transmission power or increasing the average throughput. The presence of interactivity will allow for a certain amount of link quality feedback for groups or individuals. This study performs a system-level simulation of multicellular networks considering broadcast/multicast transmissions using the OFDM/OFDMA-based LTE technology with respect to the number of TV channels with given bit rate and total spectral efficiency and coverage. Multiresolution with hierarchical modulations is able to achieve a much higher throughput gain compared to single-resolution systems of the Multimedia Broadcast/Multicast Service (MBMS) standardized in Release 6.

The third paper, “An opportunistic error correction layer for OFDM systems,” proposes a cross-layer approach to reduce the power consumption of ADCs in OFDM systems. The scheme is based on resolution-adaptive ADCs and Fountain codes. The key part of the proposed system is that the dynamic range of ADCs can be reduced by discarding subcarriers that are attenuated by the channel. Correspondingly, the power consumption in ADCs can be decreased. The receiver only decodes subcarriers (i.e., Fountain encoded packets) with the highest SNR. Others are discarded. With this approach, more than 70% of the energy consumption in the ADCs can be saved compared with the conventional IEEE 802.11a WLAN system under the same channel conditions.

QoS and Resource Allocation. The third part focuses on resource allocation and QoS issues. The issues cover medium access control, cross-layer design, service differentiation, and admission control in IEEE 802.11 Wireless LAN and IEEE 802.16 Wireless MAN (or WiMAX).

The first paper, “Service differentiation in OFDM-based IEEE 802.16 networks,” proposes several service differentiation approaches, which are based on the contention-based bandwidth request scheme and achieved by assigning different channel access parameters and/or bandwidth allocation priorities to different services. Additionally, the study proposes an effective analytical model to study the impacts of the service differentiation approaches, which can be used for the configuration and optimization of the service differentiation services. The service differentiation approaches and the analytical model are evaluated by simulation. It is observed that the analytical model has high accuracy. Service can be efficiently differentiated by the initial backoff window in terms of throughput and channel access delay. The service differentiation can be further improved if combined with the bandwidth allocation priority approach, without adverse impact on the overall system throughput.

The second paper, “Multiuser radio resource allocation for multiservice transmission in OFDMA-based cooperative relay networks,” studies multiservice transmission over OFDMA-based cooperative relay networks. The work proposes a framework to adaptively allocate power, subcarriers, and data rate to maximize system spectral efficiency under QoS constraints. The single-user scenario is first investigated in a point-to-point cooperative relay network. Then multiservice transmission is investigated in a multiuser point-to-multipoint scenario. Several suboptimal resource
allocation algorithms are proposed to reduce the computational complexity. Simulation results show that the proposed algorithms yield both high spectral efficiency and low outage probability.

The third paper, “Throughput analysis of band-AMC scheme in broadband wireless OFDMA system,” performs an analysis of the maximum system throughput for a band-AMC under various system parameters. In particular, the practical features of resource management for OFDMA systems are modeled and evaluated within the current analytical framework. The results demonstrate that the band-AMC mode outperforms the diversity mode only by providing the channel qualities for a subset of good subbands, confirming the multiuser and multiband diversity gain that can be achieved by the band-AMC mode.

The fourth paper, “Contiguous frequency-time resource allocation and scheduling for wireless OFDMA systems with QoS support,” presents a joint scheduling and resource allocation scheme for the OFDMA system with contiguous subcarrier permutation. The proposed algorithm provides contiguous sets of frequency-time resource units following a rectangular shape, yielding a reduction of the required burst signalling. The joint scheme has two phases, one driven by the QoS requirements and one by the emptying status of the input buffers. For each phase, a specific prioritization function is defined in order to obtain a trade-off between fairness and spectral efficiency maximization.

The fifth paper, “OFDMA-based medium access control for next-generation WLANs,” studies a new adaptive MAC design based on OFDMA technology. The design uses OFDMA to reduce collisions during transmission request phases and makes channel access more predictable. To improve throughput, the study combines the OFDMA access with a Carrier Sense Multiple Access (CSMA) scheme. Data transmission opportunities are assigned through an access point that can schedule traffic streams in both the time and frequency (subchannel) domains. The results demonstrate the effectiveness of the proposed MAC and compare it to existing mechanisms through simulation experiments and by deriving an analytical model for the operation of the MAC in saturation mode.

The sixth paper, “Multiuser resource allocation maximizing the perceived quality,” addresses multiuser resource allocation for time/frequency-slotted wireless communication systems. A framework for application-driven cross-layer optimization (CLO) between the application (APP) layer and the medium access control (MAC) layer is developed. The objective is to maximize the user-perceived quality by joint optimization of the rate of the information bit-stream provided by the APP layer and the adaptive resource assignment on the MAC layer. Assuming adaptive transmission with long-term channel state information at the transmitter (CSIT), the optimization problem is analyzed mathematically, and the analysis is then used as the basis for a CLO algorithm. The proposed CLO framework supports user priorities such that premium users perceive a better service quality than ordinary users and have a higher chance of being served.

The seventh paper, “Admission control threshold in cellular relay networks with power adjustment,” designs admission capacity planning in a cellular network using a cooperative relaying mechanism called decode-and-forward. The work mathematically formulates the dropping ratio using the randomness of “channel gain.” With this, the admission threshold planning problem is formulated as a simple optimization problem. The simplicity of the problem formulation facilitates its solution in real time. The proposed planning method can provide an attractive guideline for dimensioning a cellular relay network with cooperative relays.

Systems and Implementation. The first paper, “Advanced receiver design for quadrature OFDMA systems,” investigates various detection techniques, such as linear zero forcing (ZF) equalization, minimum mean square error (MMSE) equalization, decision feedback equalization (DFE), and turbo joint channel estimation and detection, for Q-OFDMA systems to mitigate the noise enhancement effect and improve the bit error ratio (BER) performance. It is shown that advanced detection, for example, DFE and the turbo receiver, can significantly improve the performance of Q-OFDMA.

The second paper, “Residue number system arithmetic assisted coded frequency-hopped OFDMA,” presents a residue number system arithmetic-based frequency-hopped (FH) pattern design. The proposed FH scheme guarantees orthogonality among intracell users while randomizing the intercell interference and providing frequency diversity gains. Simulation results demonstrate the gains due to frequency diversity and intercell interference diversity on the system bit error rate (BER) performance. Furthermore, the BER performance gain is consistent across all cells, which makes the scheme superior to other FH pattern designs that have larger performance variations across cells.

The third paper, “Implementation of a smart antenna base station for mobile WiMAX based on OFDMA,” presents the implementation of a smart antenna base station for OFDMA-based WiMAX. To implement the base station, the paper addresses a number of key issues in baseband signal processing related to symbol-timing acquisition, the beamforming scheme, and calibration. Experimental tests were performed to verify the validity of the solutions. Results showed a 3.5-fold (5.5 dB) link-budget enhancement on the uplink compared to a single-antenna system.

In conclusion, this issue of EURASIP JWCN offers a ground-breaking view into the recent advances in OFDMA communications and networks. The popularity of submissions indicates that OFDMA is a worldwide focus that has universal appeal in terms of research, industry, and standardization. This issue offers both academic and industry appeal: the former as a basis toward future research directions and the latter toward viable commercial applications. In the longer term, OFDMA communications and networks will be characterized by their critical role in consumer, business, and government applications in the areas of radio communications, LTE, LTE Advanced, WiMAX, and cognitive radio.

Finally, we would like to express our gratitude to the Editor-in-Chief of EURASIP JWCN, Dr. Luc Vandendorpe, for his advice, patience, and encouragement from the beginning until the final stage. We thank all anonymous reviewers who spent much of their precious time reviewing all the papers. Their timely reviews and comments greatly helped us select the best papers for this special issue. We also thank all authors who have submitted their papers for consideration for this issue.

We hope you will enjoy reading the great selection of papers in this issue.

Victor C. M. Leung
Alister G. Burr
Lingyang Song
Yan Zhang
Thomas Michael Bohnert
doi:10.1155/2009/752895
Research Article
A Fast LMMSE Channel Estimation Method for OFDM Systems
Wen Zhou and Wong Hing Lam

Department of Electrical and Electronics Engineering, The University of Hong Kong, Hong Kong
Correspondence should be addressed to Wen Zhou, wenzhou@eee.hku.hk
Received 20 July 2008; Revised 10 January 2009; Accepted 20 March 2009
Recommended by Lingyang Song
A fast linear minimum mean square error (LMMSE) channel estimation method has been proposed for Orthogonal Frequency
Division Multiplexing (OFDM) systems. In comparison with the conventional LMMSE channel estimation, the proposed channel
estimation method does not require statistical knowledge of the channel in advance and avoids the inversion of a large-dimensional
matrix by using the fast Fourier transform (FFT) operation. Therefore, the computational complexity can be reduced
significantly. The normalized mean square errors (NMSEs) of the proposed method and the conventional LMMSE estimation have
been derived. Numerical results show that the NMSE of the proposed method is very close to that of the conventional LMMSE
method, which is also verified by computer simulation. In addition, computer simulation shows that the performance of the
proposed method is almost the same as that of the conventional LMMSE method in terms of bit error rate (BER).
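Copyright © 2009 W. Zhou and W. H. Lam. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly
cited.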
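The abstract states that an FFT can stand in for a large matrix inversion. One standard way this can happen, sketched below, is when the matrix is circulant: its eigenvalues are the DFT of its first column, so a system of the form (R + σ²I)x = b can be solved by element-wise division in the DFT domain. This is only an illustration of that general principle, not a reproduction of the paper's estimator. The names `dft`, `circulant`, `solve_regularized_circulant`, and all numerical values are hypothetical, and a naive O(N²) DFT is used in place of an FFT so that the example needs nothing beyond the standard library.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(N^2) DFT (or inverse DFT); a real system would call an FFT."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * m * k / n)
                  for m in range(n))
        out.append(acc / n if inverse else acc)
    return out


def circulant(first_col):
    """Build the circulant matrix whose first column is first_col."""
    n = len(first_col)
    return [[first_col[(i - j) % n] for j in range(n)] for i in range(n)]


def solve_regularized_circulant(first_col, noise_var, b):
    """Solve (R + noise_var * I) x = b for circulant R without inverting R."""
    eig = dft(first_col)   # eigenvalues of R
    b_hat = dft(b)         # right-hand side in the DFT domain
    x_hat = [bk / (lk + noise_var) for bk, lk in zip(b_hat, eig)]
    return dft(x_hat, inverse=True)


# Hypothetical symmetric correlation sequence; R itself is singular here.
first_col = [2.0, 1.0, 0.0, 1.0]
noise_var = 0.5
b = [1.0, 2.0, 3.0, 4.0]

eigenvalues = dft(first_col)
x = solve_regularized_circulant(first_col, noise_var, b)

# Check the solution against an explicit matrix-vector product.
R = circulant(first_col)
n = len(b)
residual = max(
    abs(sum(R[i][j] * x[j] for j in range(n)) + noise_var * x[i] - b[i])
    for i in range(n)
)
```

In this toy case one eigenvalue of R is zero, so R has no ordinary inverse, while adding the noise variance to every eigenvalue makes the regularized system solvable. With an FFT in place of the naive DFT the cost would be O(N log N) rather than the O(N³) of a general matrix inversion.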
1. Introduction

Orthogonal frequency division multiplexing (OFDM) is an efficient high data rate transmission technique for wireless communication [1]. OFDM offers the advantages of high spectrum efficiency, simple and efficient implementation by using the fast Fourier transform (FFT) and the inverse fast Fourier transform (IFFT), mitigation of intersymbol interference (ISI) by inserting a cyclic prefix (CP), and robustness to frequency selective fading channels. Channel estimation plays an important part in OFDM systems. It can be employed for the purpose of detecting the received signal, improving the capacity of orthogonal frequency division multiple access (OFDMA) systems by cross-layer design [2], and improving the system performance in terms of bit error rate (BER) [3–5].

1.1. Previous Work. The present channel estimation methods generally can be divided into two kinds. One kind is based on pilots [6–9], and the other is blind channel estimation [10–12], which does not use pilots. Blind channel estimation methods avoid the use of pilots and have higher spectral efficiency. However, they often suffer from high computational complexity and low convergence speed since they often need a large amount of received data to obtain some statistical information such as cyclostationarity induced by the cyclic prefix. Therefore, blind channel estimation methods are not suitable for applications with fast varying fading channels. Moreover, most practical communication systems, such as the Worldwide Interoperability for Microwave Access (WiMAX) system, adopt pilot-assisted channel estimation, so this paper studies the first kind.

For the pilot-aided channel estimation methods, there are two classical pilot patterns, which are the block-type pattern and the comb-type pattern [4]. In the block-type pattern, the pilots are inserted into all the subcarriers of one OFDM symbol with a certain period. The block-type pattern can be adopted in slow fading channels, that is, when the channel is stationary within a certain period of OFDM symbols. In the comb-type pattern, the pilots are inserted at some specific subcarriers in each OFDM symbol. The comb-type pattern is preferable in fast varying fading channels, that is, when the channel varies over two adjacent OFDM symbols but remains stationary within one OFDM symbol. Channel estimation based on the comb-type pilot arrangement has been shown to be more applicable than the block-type one, since it can track fast varying fading channels [4, 13]. Channel estimation based on the comb-type pilot arrangement is often performed in two steps. Firstly, it estimates the channel frequency response on all pilot
subcarriers, by the least squares (LS) method, the LMMSE method, and so on. Secondly, it obtains the channel estimates on all subcarriers by interpolation, including data subcarriers and pilot subcarriers in one OFDM symbol. There are several interpolation methods, including the linear interpolation method, the second-order polynomial interpolation method, and phase-compensated interpolation [4].

In [14], the linear minimum mean square error (LMMSE) channel estimation method based on the channel autocorrelation matrix in the frequency domain has been proposed. To reduce the computational complexity of LMMSE estimation, a low-rank approximation to LMMSE estimation has been proposed by singular value decomposition [6]. The drawback of LMMSE channel estimation [6, 14] is that it requires knowledge of the channel autocorrelation matrix in the frequency domain and of the signal to noise ratio (SNR). Though the system can be designed for a fixed SNR and channel frequency autocorrelation matrix, the performance of the OFDM system will degrade significantly due to the mismatched system parameters. In [15], a channel estimation method exploiting channel correlation in both the time and frequency domains has been proposed. Similarly, it needs to know the channel autocorrelation matrix in the frequency domain, the Doppler shift, and the SNR in advance. Mismatched parameters of the Doppler shift and the delay spread will degrade the performance of the system [16]. It is noted that the channel estimation methods proposed in [6, 14–16] can be adopted in either the block-type pilot pattern or the comb-type pilot pattern.

When the assumption that the channel is time-invariant within one OFDM symbol is not valid due to high Doppler shift or synchronization error, the intercarrier interference (ICI) has to be considered. Some channel estimation and signal detection methods have been proposed to compensate for the ICI effect [17, 18]. In [17], a new equalization technique to suppress ICI in the LMMSE sense has been proposed. Meanwhile, the authors reduced the complexity of the channel estimator by using the energy distribution information of the channel frequency matrix. In [18], the authors proposed a new pilot pattern, that is, the grouped and equispaced pilot pattern, and corresponding channel estimation and signal detection to suppress ICI.

1.2. Contributions. In this paper, the OFDM system framework based on the comb-type pilot arrangement is adopted, and we assume that the channel remains stationary within one OFDM symbol, and therefore there is no ICI effect. We propose a fast LMMSE channel estimation method. The proposed method has three advantages over the conventional LMMSE method. Firstly, the proposed method does not require knowledge of the channel autocorrelation matrix and SNR in advance but can achieve almost the same performance as the conventional LMMSE channel estimation in terms of the normalized mean square error (NMSE) of channel estimation and bit error rate (BER). Secondly, the proposed method needs only the fast Fourier transform (FFT) operation instead of the inversion of a large-dimensional matrix. Therefore, the computational complexity can be reduced significantly, compared with the conventional LMMSE method. Thirdly, the proposed method can track the changes of channel parameters, that is, the channel autocorrelation matrix and SNR. However, the conventional LMMSE method cannot track the channel. Once the channel parameters change, the performance of the conventional LMMSE method will degrade due to the parameter mismatch.

1.3. Organization. The paper is organized as follows. Section 2 describes the OFDM system model. Section 3 describes the proposed fast LMMSE channel estimation. We analyze the mean square error (MSE) of the proposed fast LMMSE channel estimation and the MSE of the conventional LMMSE channel estimation in Section 4. The simulation results and numerical results of the proposed algorithm are discussed in Section 5, followed by the conclusion in Section 6.

2. System Model

The pilot-assisted (i.e., training sequence assisted) OFDM system model is shown in Figure 1. For $N$ subcarriers in the OFDM system, the transmitted signal $x(i, n)$ in the time domain after the inverse fast Fourier transform (IFFT) is given by

\[ x(i, n) = \mathrm{IFFT}_N[X(i, k)] = \frac{1}{N} \sum_{k=0}^{N-1} X(i, k) \exp\left(\frac{j 2\pi n k}{N}\right), \tag{1} \]

where $X(i, k)$ denotes the transmitted signal in the frequency domain at the $k$th subcarrier in the $i$th OFDM symbol. The comb-type pilot pattern [4] is adopted in this paper. The pilot subcarriers are inserted at equal spacing into each OFDM symbol. It is assumed that the total number of pilot subcarriers is $N_p$, and the insertion gap is $R$. Each OFDM symbol is composed of the pilot subcarriers and the data subcarriers. It is assumed that the index of the first pilot subcarrier is $k_0$. Therefore, the set of the indices of pilot subcarriers, $\eta$, can be written as

\[ \eta = \{ k \mid k = mR + k_0,\ m = 0, 1, \ldots, N_p - 1 \}, \tag{2} \]

where $k_0 \in [0, R)$. The received signal $Y(i, k)$ in the frequency domain after the FFT can be written as

\[ Y(i, k) = X(i, k) H(i, k) + W(i, k), \tag{3} \]

where $W(i, k)$ denotes the additive white Gaussian noise (AWGN) with zero mean and variance $\sigma_w^2$, and $H(i, k)$ is the frequency response of the radio channel at the $k$th subcarrier of the $i$th OFDM symbol. Then, the received pilot signal $Y_p(i, k)$ is extracted from $Y(i, k)$ to perform channel estimation. As shown in Figure 2, the channel estimator first performs channel frequency response estimation at the pilot subcarriers. There are several channel estimation methods for this part, such as the LS and LMMSE estimators [4]. Next, once the channel frequency response estimate at the pilot subcarriers, $\hat{H}_p(i, k)$, is obtained, the estimator performs interpolation to obtain the channel frequency response estimate at all subcarriers. There
are linear interpolation [4], second-order polynomial interpolation [4], and discrete Fourier transform (DFT) based interpolation [19] methods, among others. In our system model, linear interpolation is adopted. After channel estimation, maximum likelihood detection is performed to obtain the estimated frequency-domain signal $\hat{X}(i,k)$, which is given by

$$\hat{X}(i,k) = \arg\min_{S} \left| Y(i,k) - \hat{H}(i,k)\,S \right|^{2}, \tag{4}$$

where $S \in s$, and $s$ is the set containing all constellation points, which depends on the modulation method, that is, the signal mapper. For instance, if QPSK modulation is adopted, $s = \{(1/\sqrt{2})(1+j),\ (1/\sqrt{2})(1-j),\ (1/\sqrt{2})(-1+j),\ (1/\sqrt{2})(-1-j)\}$. Finally, the estimated signal $\hat{X}(i,k)$ passes through the signal demapper to obtain the received bit sequence.

3. The Proposed Fast LMMSE Algorithm

3.1. Properties of the Channel Correlation Matrix in Frequency Domain. The channel impulse response in the time domain can be expressed as

$$h(i,n) = \sum_{l=0}^{L-1} h_l(i)\,\delta(n - \tau_l), \tag{5}$$

where $h_l(i)$ is the complex gain of the $l$th path in the $i$th OFDM symbol period, $\delta(\cdot)$ is the Kronecker delta function, $\tau_l$ is the delay of the $l$th path in units of sample points, and $L$ is the number of resolvable paths. Assume that the path gains $h_l(i)$ are mutually independent and that the power of the $l$th path is $\sigma_l^2$. The channel is normalized so that $\sigma_h^2 = \sum_l \sigma_l^2 = 1$. The channel response in the frequency domain, $H(i,k)$, is the FFT of $h(i,n)$:

$$H(i,k) = \mathrm{FFT}_N\bigl(h(i,n)\bigr) = \sum_{m=0}^{N-1} h(i,m)\exp\!\left(-\frac{j2\pi mk}{N}\right), \tag{6}$$

where $\mathrm{FFT}_N(\cdot)$ denotes the $N$-point FFT operation. The channel autocorrelation matrix in the frequency domain can be expressed as

$$\begin{aligned}
R_{HH}(m,n) &= E\bigl[H(i,m)H^{*}(i,n)\bigr] \\
&= E\!\left[\sum_{k=0}^{N-1} h(i,k)\exp\!\left(-\frac{j2\pi km}{N}\right)\cdot\sum_{k=0}^{N-1} h^{*}(i,k)\exp\!\left(\frac{j2\pi kn}{N}\right)\right] \\
&= E\!\left[\sum_{k=0}^{N-1} |h(i,k)|^{2}\exp\!\left(-\frac{j2\pi k(m-n)}{N}\right)\right] \\
&= \sum_{l=0}^{L-1} \sigma_l^{2}\exp\!\left(-\frac{j2\pi \tau_l (m-n)}{N}\right),
\end{aligned} \tag{7}$$

where $E(\cdot)$ denotes expectation. Denote the channel autocorrelation matrix by $R_{HH} = [R_{HH}(i,j)]_{N\times N}$. From (7), each entry depends only on $(m-n)$ modulo $N$, so $R_{HH}$ is a circulant matrix. Therefore, as in [20], the eigenvalues of $R_{HH}$ are given by

$$[\lambda_0\ \lambda_1\ \cdots\ \lambda_{N-1}] = \mathrm{FFT}_N\bigl([R_{HH}(0,0)\ R_{HH}(0,1)\ \cdots\ R_{HH}(0,N-1)]\bigr). \tag{8}$$

Formula (8) can be equivalently written as

$$\lambda_k = \sum_{n=0}^{N-1} R_{HH}(0,n)\exp\!\left(-\frac{j2\pi nk}{N}\right),\quad k = 0, 1, \ldots, N-1. \tag{9}$$

It follows from (7) and (9) that the number of nonzero eigenvalues of $R_{HH}$ is equal to the total number of resolvable paths, $L$ (see Appendix A). Since $R_{HH}$ is Hermitian and hence diagonalizable, its rank equals the number of its nonzero eigenvalues. Therefore the rank of $R_{HH}$ is $L$, and $R_{HH}$ is singular because $L$ is smaller than $N$. The matrix $R_{HH}$ has no inverse, only a Moore-Penrose pseudoinverse. However, the rank of the matrix $R_{HH} + \sigma_w^2 I$ is $N$ (see Appendix A), where $I$ is the $N \times N$ identity matrix. Therefore, $R_{HH} + \sigma_w^2 I$ is nonsingular and invertible.

3.2. The Proposed Fast LMMSE Channel Estimation Algorithm. Let

$$H_p(i) = \bigl[H_p(i,0)\ H_p(i,1)\ \cdots\ H_p(i,N_p-1)\bigr]^{T} \tag{10}$$

denote the channel frequency response at the pilot subcarriers of the $i$th OFDM symbol, and let

$$Y_p(i) = \bigl[Y_p(i,0)\ Y_p(i,1)\ \cdots\ Y_p(i,N_p-1)\bigr]^{T} \tag{11}$$

denote the vector of received signals at the pilot subcarriers of the $i$th OFDM symbol after the FFT. Denote the pilot signal of the $i$th OFDM symbol by $X_p(i,j)$, $j = 0, 1, \ldots, N_p-1$. The channel estimate at the pilot subcarriers based on the least square (LS) criterion is given by

$$\hat{H}_{p,\mathrm{ls}}(i) = \bigl[\hat{H}_{p,\mathrm{ls}}(i,0)\ \hat{H}_{p,\mathrm{ls}}(i,1)\ \cdots\ \hat{H}_{p,\mathrm{ls}}(i,N_p-1)\bigr]^{T} = \left[\frac{Y_p(i,0)}{X_p(i,0)}\ \frac{Y_p(i,1)}{X_p(i,1)}\ \cdots\ \frac{Y_p(i,N_p-1)}{X_p(i,N_p-1)}\right]^{T}. \tag{12}$$
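The rank argument of Section 3.1 can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names, the toy sizes ($N = 16$, $L = 3$), and the noise variance are illustrative assumptions, and a plain $O(N^2)$ DFT stands in for an FFT routine. It builds the first row of $R_{HH}$ from (7), obtains the eigenvalues via (8) and (9), and counts the nonzero ones. Evaluating (9) with (7) shows that the only nonzero eigenvalues sit at the path delays, with $\lambda_{\tau_l} = N\sigma_l^2$.

```python
import cmath

def autocorr_first_row(delays, powers, n):
    """First row R_HH(0, k), i.e. equation (7) evaluated at m = 0."""
    return [
        sum(p * cmath.exp(2j * cmath.pi * tau * k / n)
            for tau, p in zip(delays, powers))
        for k in range(n)
    ]

def dft(x):
    """Plain O(N^2) DFT (stand-in for an N-point FFT)."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
        for k in range(n)
    ]

def rank_from_eigenvalues(eigs, tol=1e-9):
    """Count eigenvalues whose magnitude exceeds a small tolerance."""
    return sum(1 for lam in eigs if abs(lam) > tol)

# Toy values (assumptions for illustration only).
N = 16                      # number of subcarriers
delays = [0, 2, 5]          # path delays tau_l in samples, L = 3
powers = [0.6, 0.3, 0.1]    # path powers sigma_l^2, summing to 1
noise_var = 0.1             # sigma_w^2

row = autocorr_first_row(delays, powers, N)
eigs = dft(row)                                   # equations (8)-(9)
rank_r = rank_from_eigenvalues(eigs)              # equals L

# Adding sigma_w^2 * I only changes the first entry of the first row.
noisy_row = [row[0] + noise_var] + row[1:]
rank_noisy = rank_from_eigenvalues(dft(noisy_row))  # equals N
```

With these toy values, `rank_r` equals the number of paths and `rank_noisy` equals `N`, which mirrors the statement that $R_{HH}$ is singular while $R_{HH} + \sigma_w^2 I$ is invertible.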
Figure 1: Baseband OFDM system. (Block diagram. Transmitter: bit sequence, signal mapper, pilot insertion and OFDM symbol forming giving $X(i,k)$, S/P, IFFT, P/S, CP insertion giving $x(i,n)$. The signal passes through the channel with AWGN. Receiver: CP removal, S/P, FFT giving $Y(i,k)$, channel estimation from $Y_p(i,k)$ giving $\hat{H}(i,k)$, maximum likelihood detection giving $\hat{X}(i,k)$, P/S, signal demapper, received bit sequence.)
Figure 2: Channel estimation based on comb-type pilots. (Block diagram. The extracted received pilot signal $Y_p(i,k)$ and the pilots $X_p(i,k)$ feed the pilot subcarrier estimation, which outputs $\hat{H}_p(i,k)$, the estimated channel frequency response at pilot subcarriers. Channel interpolation then outputs $\hat{H}(i,k)$, the estimated channel frequency response at all subcarriers.)

The LMMSE estimator at the pilot subcarriers is given by [6]

$$\hat{H}_{p,\mathrm{lmmse}}(i) = \bigl[\hat{H}_{p,\mathrm{lmmse}}(i,0)\ \hat{H}_{p,\mathrm{lmmse}}(i,1)\ \cdots\ \hat{H}_{p,\mathrm{lmmse}}(i,N_p-1)\bigr]^{T} = R_{H_pH_p}\left(R_{H_pH_p} + \frac{\beta}{\mathrm{SNR}} I\right)^{-1}\hat{H}_{p,\mathrm{ls}}(i), \tag{13}$$

where $R_{H_pH_p}$ is the channel autocorrelation matrix at the pilot subcarriers, defined by $R_{H_pH_p} = E\{H_p H_p^{H}\}$, and $(\cdot)^{H}$ denotes Hermitian transpose. It is easy to verify that $R_{H_pH_p}$ is circulant, that its rank is equal to $L$, and that the rank of $R_{H_pH_p} + \sigma_w^2 I$ is equal to $N_p$. The signal-to-noise ratio (SNR) is defined by $\mathrm{SNR} = E|X_p(k)|^2/\sigma_w^2$, and $\beta = E|X_p(k)|^2\,E|1/X_p(k)|^2$ is a constant depending on the signal constellation. For 16QAM modulation $\beta = 17/9$, and for QPSK and BPSK modulation $\beta = 1$. If the channel autocorrelation matrix $R_{H_pH_p}$ and the SNR are known in advance, $R_{H_pH_p}(R_{H_pH_p} + (\beta/\mathrm{SNR})I)^{-1}$ needs to be calculated only once. However, the autocorrelation matrix $R_{H_pH_p}$ and the SNR are often unknown in advance and time varying, so the LMMSE channel estimator cannot be applied directly in practice. To solve this problem, we propose the fast LMMSE channel estimation algorithm. The algorithm can be divided into three steps.

The first step is to obtain $\hat{R}_{H_pH_p}$, the estimate of the channel autocorrelation matrix $R_{H_pH_p}$. Firstly, we obtain the least square (LS) channel estimate at the pilot subcarriers in the time domain, $\hat{h}_{p,\mathrm{ls}}(i,k)$, which is given by

$$\hat{h}_{p,\mathrm{ls}}(i,k) = \frac{1}{N_p}\sum_{n=0}^{N_p-1}\hat{H}_{p,\mathrm{ls}}(i,n)\exp\!\left(\frac{j2\pi nk}{N_p}\right),\quad k = 0, 1, \ldots, N_p-1. \tag{14}$$

Secondly, the most significant taps (MST) algorithm [21] has been proposed to obtain a refined channel estimate in the time domain. The MST algorithm processes each OFDM symbol by keeping the most significant $L$ paths in terms of power and setting the other taps to zero. Compared with the LS method, it significantly reduces the influence of AWGN and other interference. However, the same noise and interference may cause the algorithm to choose wrong paths and omit true ones. Thus, we improve the algorithm of [21] by processing several adjacent OFDM symbols jointly. We calculate the average power of each tap over $N_{\mathrm{MST}}$ adjacent OFDM symbols, $P_{\mathrm{LS}}(k)$, which is given by

$$P_{\mathrm{LS}}(k) = \frac{1}{N_{\mathrm{MST}}}\sum_{i=0}^{N_{\mathrm{MST}}-1}\bigl|\hat{h}_{p,\mathrm{ls}}(i,k)\bigr|^{2},\quad k = 0, 1, \ldots, N_p-1. \tag{15}$$
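Equations (12), (14), and (15) can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the function names are hypothetical, a plain $O(N^2)$ DFT pair replaces the FFT, and the toy channel (two taps, $N_p = 8$, noise-free, constant over four symbols) is chosen only so that the result is easy to read.

```python
import cmath

def dft(x):
    """Plain O(N^2) DFT (stand-in for an FFT)."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]

def idft(x):
    """Inverse DFT with the 1/N factor, matching equation (14)."""
    n = len(x)
    return [sum(x[m] * cmath.exp(2j * cmath.pi * m * k / n) for m in range(n)) / n
            for k in range(n)]

def ls_estimate(y_pilot, x_pilot):
    """Equation (12): divide each received pilot by the known pilot symbol."""
    return [y / x for y, x in zip(y_pilot, x_pilot)]

def average_tap_power(h_ls_symbols):
    """Equation (15): average |h_ls(i, k)|^2 over the N_MST symbols."""
    n_mst = len(h_ls_symbols)
    n_p = len(h_ls_symbols[0])
    return [sum(abs(h[k]) ** 2 for h in h_ls_symbols) / n_mst
            for k in range(n_p)]

# Toy setup (assumptions): taps at delays 0 and 2, BPSK pilots, no noise.
taps = [1.0, 0, 0.5j, 0, 0, 0, 0, 0]
x_pilot = [1, -1, 1, -1, 1, -1, 1, -1]
H = dft(taps)                                   # true response at the pilots

h_ls_symbols = []
for _ in range(4):                              # N_MST = 4 identical symbols
    y_pilot = [h * x for h, x in zip(H, x_pilot)]
    H_ls = ls_estimate(y_pilot, x_pilot)        # equation (12)
    h_ls_symbols.append(idft(H_ls))             # equation (14)

p_ls = average_tap_power(h_ls_symbols)          # equation (15)
```

In this noise-free case `p_ls` is nonzero only at the two true delays. With noise present, every entry of `p_ls` picks up a noise floor, which is what the tap selection that follows has to separate from the true paths.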
Then we choose the $L'$ most significant taps from $P_{\mathrm{LS}}(k)$ and store their indices in a set $\alpha'$. Finally, the refined channel estimate in the time domain, $\hat{h}_{p,\mathrm{MST}}$, is given by

$$\hat{h}_{p,\mathrm{MST}}(i,k) = \begin{cases}\hat{h}_{p,\mathrm{ls}}(i,k), & \text{if } k \in \alpha',\\ 0, & \text{if } k \notin \alpha',\end{cases}\quad k = 0, 1, \ldots, N_p-1,\ i = 0, 1, \ldots, N_{\mathrm{MST}}-1. \tag{16}$$

Denote the first row of the matrix $\hat{R}_{H_pH_p}$ by $\hat{A}$. Then, from (7), $\hat{A}$ is given by

$$\hat{A} = N_p \cdot \mathrm{IFFT}_{N_p}[P_{\mathrm{MST}}], \tag{17}$$

where $P_{\mathrm{MST}}$ is a $1 \times N_p$ vector with entries

$$P_{\mathrm{MST}}(k) = \begin{cases}P_{\mathrm{LS}}(k), & \text{if } k \in \alpha',\\ 0, & \text{if } k \notin \alpha',\end{cases}\quad k = 0, 1, \ldots, N_p-1. \tag{18}$$

Since the matrix $\hat{R}_{H_pH_p}$ is circulant, it can be obtained by circular shifts of $\hat{A}$.

The second step is to obtain the estimate of the SNR. The estimate, $\widehat{\mathrm{SNR}}$, is given by

$$\widehat{\mathrm{SNR}} = \frac{\sum_k P_{\mathrm{MST}}(k)}{\sum_k P_{\mathrm{LS}}(k) - \sum_k P_{\mathrm{MST}}(k)}. \tag{19}$$

The third step is to obtain $\hat{R}_{H_pH_p}(\hat{R}_{H_pH_p} + (\beta/\widehat{\mathrm{SNR}})I)^{-1}$, the estimate of the matrix $R_{H_pH_p}(R_{H_pH_p} + (\beta/\mathrm{SNR})I)^{-1}$. We refer to the matrix $R_{H_pH_p}(R_{H_pH_p} + (\beta/\mathrm{SNR})I)^{-1}$ as the LMMSE matrix in this paper. Since both $R_{H_pH_p}$ and $(R_{H_pH_p} + (\beta/\mathrm{SNR})I)^{-1}$ are circulant matrices, their product is also a circulant matrix. Therefore, we need only compute the estimate of the first row of the LMMSE matrix. Denote the first row of the LMMSE matrix by $B$. The estimate of $B$, $\hat{B}$, is given by (see Appendix B)

$$\hat{B} = \mathrm{IFFT}_{N_p}\!\left[\frac{P_{\mathrm{MST}}(0)}{P_{\mathrm{MST}}(0) + \beta/(N_p\widehat{\mathrm{SNR}})}\ \ \frac{P_{\mathrm{MST}}(1)}{P_{\mathrm{MST}}(1) + \beta/(N_p\widehat{\mathrm{SNR}})}\ \cdots\ \frac{P_{\mathrm{MST}}(N_p-1)}{P_{\mathrm{MST}}(N_p-1) + \beta/(N_p\widehat{\mathrm{SNR}})}\right], \tag{20}$$

where $\mathrm{IFFT}_{N_p}(\cdot)$ denotes the $N_p$-point IFFT operation. The estimated LMMSE matrix $\hat{R}_{H_pH_p}(\hat{R}_{H_pH_p} + (\beta/\widehat{\mathrm{SNR}})I)^{-1}$ can therefore be obtained by circular shifts of $\hat{B}$. The channel estimate in the frequency domain at the pilot subcarriers for the $i$th OFDM symbol is given by

$$\hat{H}_{p,\mathrm{fast\,lmmse}}(i) = \hat{R}_{H_pH_p}\left(\hat{R}_{H_pH_p} + \frac{\beta}{\widehat{\mathrm{SNR}}} I\right)^{-1}\hat{H}_{p,\mathrm{ls}}(i),\quad i = 0, 1, \ldots, N_{\mathrm{MST}}-1. \tag{21}$$

Figure 3: The first row of the channel autocorrelation matrix $R_{H_pH_p}$, $A$. (Plot of the magnitude of the first row of the channel autocorrelation matrix versus the index within the first row.)

The proposed fast LMMSE algorithm avoids matrix inversion and can be very efficient, since it uses only FFT and circular shift operations. The algorithm can be summarized as follows.

Step 1. Obtain the LS channel estimate of the pilot signal in the time domain, $\hat{h}_{p,\mathrm{ls}}(i,k)$, by formula (14).

Step 2. Calculate the average power of each tap over $N_{\mathrm{MST}}$ OFDM symbols, $P_{\mathrm{LS}}(k)$, by formula (15). Then choose the $L'$ most significant taps from $P_{\mathrm{LS}}(k)$ and keep them as $P_{\mathrm{MST}}(k)$, by formula (18).

Step 3. Obtain the estimate of the SNR, $\widehat{\mathrm{SNR}}$, by formula (19).

Step 4. Obtain the estimate of the first row of the LMMSE matrix, $\hat{B}$, by formula (20).

Step 5. Obtain the estimate of the LMMSE matrix, $\hat{R}_{H_pH_p}(\hat{R}_{H_pH_p} + (\beta/\widehat{\mathrm{SNR}})I)^{-1}$, by circular shifts of $\hat{B}$. Then the channel estimate in the frequency domain at the pilot subcarriers can be obtained by formula (21).

Note that estimating the LMMSE matrix requires only $N_p$-point FFT operations and circular shifts. This reduces the computational complexity significantly compared with the conventional LMMSE estimator, which requires the inversion of a large matrix.

4. Analysis of the Mean Square Error of the Proposed Fast LMMSE Algorithm

In this section, we present the mean square error (MSE) of the proposed fast LMMSE algorithm. Firstly, we present
the MSE of the LMMSE algorithm for comparison. We study two cases. One is the MSE analysis for matched SNR, that is, the designed SNR is equal to the true SNR, and the other is the MSE analysis for mismatched SNR. Secondly, we present the MSE of the proposed fast LMMSE algorithm. Similarly, we study two cases, one for matched SNR and the other for mismatched SNR.

Figure 4: The first row of the LMMSE matrix $R_{H_pH_p}(R_{H_pH_p} + (\beta/\mathrm{SNR})I)^{-1}$ with different SNRs. (Plot of the magnitude of the first row of the LMMSE matrix versus the index within the first row, for SNR = 5 dB, 10 dB, and 20 dB.)

Figure 5: Normalized mean square error (NMSE) of channel estimation of the LMMSE algorithm versus that of the proposed fast LMMSE algorithm by computer simulation and numerical method. (Plot of NMSE versus SNR (dB). Curves: LS, simulation; the proposed fast LMMSE, simulation; the proposed fast LMMSE, numerical method; LMMSE, simulation; LMMSE, numerical method.)

4.1. MSE Analysis of the Conventional LMMSE Algorithm. Denote the MSE of the LMMSE algorithm by $\phi_{\mathrm{MSE}}(\mathrm{SNR}, \mathrm{SNR}_{\mathrm{design}})$, where SNR is the true SNR and $\mathrm{SNR}_{\mathrm{design}}$ is the designed SNR.

(i) MSE Analysis for Matched SNR. The MSE of the LMMSE algorithm at the pilot subcarriers for matched SNR can be derived as [22]

$$\phi_{\mathrm{MSE}}(\mathrm{SNR}, \mathrm{SNR}) = \frac{1}{N_p}\sum_{k=0}^{N_p-1} E\bigl|\hat{H}_{p,\mathrm{lmmse}}(i,k) - H_p(i,k)\bigr|^{2} = 1 - A\cdot\left[\left(R_{H_pH_p} + \frac{\beta}{\mathrm{SNR}} I\right)^{-1}\right]^{H}\cdot A^{H}, \tag{22}$$

where $A$ is the first row of the matrix $R_{H_pH_p}$, and $(\cdot)^{H}$ denotes Hermitian transpose.

(ii) MSE Analysis for Mismatched SNR. The MSE of the LMMSE algorithm at the pilot subcarriers for mismatched SNR can be derived as [22]

$$\begin{aligned}
\phi_{\mathrm{MSE}}(\mathrm{SNR}, \mathrm{SNR}_{\mathrm{design}}) &= \frac{1}{N_p}\sum_{k=0}^{N_p-1} E\bigl|\hat{H}_{p,\mathrm{lmmse}}(i,k) - H_p(i,k)\bigr|^{2} \\
&= 1 + A\cdot\left(R_{H_pH_p} + \frac{\beta}{\mathrm{SNR}_{\mathrm{design}}} I\right)^{-1}\cdot\left(R_{H_pH_p} + \frac{\beta}{\mathrm{SNR}} I\right)\cdot\left[\left(R_{H_pH_p} + \frac{\beta}{\mathrm{SNR}_{\mathrm{design}}} I\right)^{-1}\right]^{H}\cdot A^{H} \\
&\quad - 2A\cdot\left[\left(R_{H_pH_p} + \frac{\beta}{\mathrm{SNR}_{\mathrm{design}}} I\right)^{-1}\right]^{H}\cdot A^{H},
\end{aligned} \tag{23}$$

where $A$ is the first row of the matrix $R_{H_pH_p}$, and $(\cdot)^{H}$ denotes Hermitian transpose. Setting $\mathrm{SNR}_{\mathrm{design}} = \mathrm{SNR}$ in (23) recovers (22).

4.2. MSE Analysis for the Proposed Fast LMMSE Algorithm. Let us denote the MSE of the proposed fast LMMSE algorithm by $\varphi_{\mathrm{MSE}}(\mathrm{SNR}, \widehat{\mathrm{SNR}})$, where SNR is the true SNR, and $\widehat{\mathrm{SNR}}$ is the estimated SNR or the designed SNR.
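Before the derivation, it may help to see the estimator being analysed in executable form. The sketch below covers Steps 2 to 5 of Section 3.2 under stated assumptions: the function names, the argument `n_keep` (playing the role of $L'$), and the toy input are hypothetical, and a plain $O(N^2)$ DFT pair replaces the FFT. Passing `snr_design` corresponds to using a designed SNR instead of formula (19). Because the LMMSE matrix is circulant with first row $\hat{B}$ from (20), applying it in (21) is equivalent, by the convolution theorem, to scaling the LS taps by per-tap weights and transforming back, which is how the sketch applies it.

```python
import cmath

def dft(x):
    """Plain O(N^2) DFT (stand-in for an FFT)."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]

def idft(x):
    """Inverse DFT with the 1/N factor."""
    n = len(x)
    return [sum(x[m] * cmath.exp(2j * cmath.pi * m * k / n) for m in range(n)) / n
            for k in range(n)]

def fast_lmmse(h_ls_symbols, n_keep, beta=1.0, snr_design=None):
    """Steps 2-5 for a block of time-domain LS estimates from equation (14)."""
    n_mst = len(h_ls_symbols)
    n_p = len(h_ls_symbols[0])
    # Equation (15): average tap power over the block.
    p_ls = [sum(abs(h[k]) ** 2 for h in h_ls_symbols) / n_mst
            for k in range(n_p)]
    # Equation (18): keep the n_keep strongest taps, zero the rest.
    order = sorted(range(n_p), key=lambda k: p_ls[k], reverse=True)
    keep = set(order[:n_keep])
    p_mst = [p_ls[k] if k in keep else 0.0 for k in range(n_p)]
    # Equation (19), unless a designed SNR is supplied.
    if snr_design is None:
        noise = sum(p_ls) - sum(p_mst)
        if noise <= 0:
            raise ValueError("no residual noise power; supply snr_design")
        snr = sum(p_mst) / noise
    else:
        snr = snr_design
    # Per-tap weights whose IDFT is the first row B of equation (20).
    gamma = [p / (p + beta / (n_p * snr)) for p in p_mst]
    b_hat = idft(gamma)
    # Equation (21), applied as tap weighting followed by a DFT.
    h_fast = [dft([g * t for g, t in zip(gamma, h)]) for h in h_ls_symbols]
    return h_fast, b_hat, gamma, snr

# Toy input (assumption): one symbol, true taps at 0 and 2 plus a small floor.
h_ls_demo = [[1, 0.01, 0.5, 0.01, 0.01, 0.01, 0.01, 0.01]]
h_fast, b_hat, gamma, snr_hat = fast_lmmse(h_ls_demo, n_keep=2)
```

The weights `gamma` in this sketch correspond to the quantities $\gamma(l)$ that appear in the MSE expressions of this section.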
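(i) MSE for Matched SNR. The MSE of the proposed fast LMMSE algorithm is given by

$$\begin{aligned}
\varphi_{\mathrm{MSE}}(\mathrm{SNR}, \mathrm{SNR}) &= E\!\left[\frac{1}{N_p}\sum_{k=0}^{N_p-1}\bigl|\hat{H}_{p,\mathrm{fast\,lmmse}}(i,k) - H_p(i,k)\bigr|^{2}\right] \\
&= E\bigl|\hat{H}_{p,\mathrm{fast\,lmmse}}(i,0) - H_p(i,0)\bigr|^{2} \\
&= E\!\left[\left|\sum_{k=0}^{N_p-1}\left\{\frac{1}{N_p}\sum_{l=0}^{N_p-1}\gamma(l)\exp\!\left(j\frac{2\pi}{N_p}lk\right)\right\}\hat{H}_{p,\mathrm{ls}}(i,k) - H_p(i,0)\right|^{2}\right] \\
&= E\!\left[\left|\sum_{l=0}^{N_p-1}\gamma(l)\left\{\frac{1}{N_p}\sum_{k=0}^{N_p-1}\exp\!\left(j\frac{2\pi}{N_p}lk\right)\hat{H}_{p,\mathrm{ls}}(i,k)\right\} - H_p(i,0)\right|^{2}\right] \\
&= E\!\left[\left|\sum_{l=0}^{N_p-1}\gamma(l)\,\hat{h}_{p,\mathrm{ls}}(i,l) - H_p(i,0)\right|^{2}\right] \\
&= E\!\left[\left|\sum_{l=0}^{N_p-1}\gamma(l)\,\hat{h}_{p,\mathrm{ls}}(i,l) - \sum_{j=0}^{N_p-1}h_p(i,j)\right|^{2}\right],
\end{aligned} \tag{24}$$

where $\gamma(l) = P_{\mathrm{MST}}(l)/(P_{\mathrm{MST}}(l) + \beta/(N_p\,\mathrm{SNR}))$, $l = 0, 1, \ldots, N_p-1$. If the number of OFDM symbols used to estimate the average power of each tap, $N_{\mathrm{MST}}$, is large, we can replace $\gamma(l)$ with $E(\gamma(l))$ in (24), and (24) becomes

$$\begin{aligned}
\varphi_{\mathrm{MSE}}(\mathrm{SNR}, \mathrm{SNR}) &= E\!\left[\left|\sum_{l=0}^{N_p-1}E[\gamma(l)]\,\hat{h}_{p,\mathrm{ls}}(i,l) - \sum_{j=0}^{N_p-1}h_p(i,j)\right|^{2}\right] \\
&\approx E\!\left[\left|\sum_{l=0}^{N_p-1}\frac{E\bigl|\hat{h}_{p,\mathrm{MST}}(l)\bigr|^{2}}{E\bigl|\hat{h}_{p,\mathrm{MST}}(l)\bigr|^{2} + \beta/(N_p\cdot\mathrm{SNR})}\,\hat{h}_{p,\mathrm{ls}}(i,l) - \sum_{j=0}^{N_p-1}h_p(i,j)\right|^{2}\right].
\end{aligned} \tag{25}$$

Figure 6: NMSE of the LMMSE algorithm with matched SNR and mismatched SNRs versus SNR, by simulation and numerical method, respectively. (Plot of NMSE versus SNR (dB). Curves: LMMSE, matched SNR, numerical method; LMMSE, SNR design = 5 dB, 10 dB, and 20 dB, numerical method; LMMSE, SNR design = 5 dB, simulation.)

If the improved MST algorithm chooses $L'$ ($L' \geq L$) paths, where $L$ is the number of resolvable paths of the dispersive channel, and the chosen $L'$ paths contain all $L$ channel paths without omission, then (25) can be further written as

$$\begin{aligned}
\varphi_{\mathrm{MSE}}(\mathrm{SNR}, \mathrm{SNR}) &= E\!\left[\left|\sum_{j=0}^{N_p-1}\frac{E\bigl|\hat{h}_{p,\mathrm{MST}}(j)\bigr|^{2}}{E\bigl|\hat{h}_{p,\mathrm{MST}}(j)\bigr|^{2} + \beta/(N_p\cdot\mathrm{SNR})}\,\hat{h}_{p,\mathrm{ls}}(i,j) - \sum_{j=0}^{N_p-1}h_p(i,j)\right|^{2}\right] \\
&= 1 + \sum_{l=0}^{L-1}\gamma_1(\tau_l)^{2}\left(\sigma_l^{2} + \frac{1}{\mathrm{SNR}\cdot N_p}\right) + (L'-L)\left(\frac{1/(\mathrm{SNR}\cdot N_p)}{1/(\mathrm{SNR}\cdot N_p) + \beta/(\mathrm{SNR}\cdot N_p)}\right)^{2}\frac{1}{\mathrm{SNR}\cdot N_p} - 2\sum_{l=0}^{L-1}\gamma_1(\tau_l)\,\sigma_l^{2} \\
&= 1 + \sum_{l=0}^{L-1}\gamma_1(\tau_l)^{2}\left(\sigma_l^{2} + \frac{1}{\mathrm{SNR}\cdot N_p}\right) + (L'-L)\left(\frac{1/\mathrm{SNR}}{(1/\mathrm{SNR}) + \beta/\mathrm{SNR}}\right)^{2}\frac{1}{\mathrm{SNR}\cdot N_p} - 2\sum_{l=0}^{L-1}\gamma_1(\tau_l)\,\sigma_l^{2},
\end{aligned} \tag{26}$$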
where $\tau_l$ is the channel delay of the $l$th resolvable path, $\sigma_l^2$ is the power of the $l$th path, and

$$\gamma_1(i) = \begin{cases}\dfrac{\sigma_l^{2} + 1/(\mathrm{SNR}\cdot N_p)}{\sigma_l^{2} + 1/(\mathrm{SNR}\cdot N_p) + \beta/(\mathrm{SNR}\cdot N_p)}, & \text{if } i \in \alpha,\\[2ex] \dfrac{1/\mathrm{SNR}}{(1/\mathrm{SNR}) + \beta/\mathrm{SNR}}, & \text{if } i \notin \alpha,\end{cases}\qquad \alpha = \{\tau_l : l = 0, 1, \ldots, L-1\}. \tag{27}$$

(ii) MSE for Mismatched SNR. Similarly, the MSE of the proposed fast LMMSE algorithm for mismatched SNR is given by

$$\begin{aligned}
\varphi_{\mathrm{MSE}}(\mathrm{SNR}, \widehat{\mathrm{SNR}}) &= E\!\left[\left|\sum_{l=0}^{N_p-1}\gamma'(l)\,\hat{h}_{p,\mathrm{ls}}(i,l) - \sum_{j=0}^{N_p-1}h_p(i,j)\right|^{2}\right] \\
&= 1 + \sum_{l=0}^{L-1}\gamma_2(\tau_l)^{2}\left(\sigma_l^{2} + \frac{1}{\mathrm{SNR}\cdot N_p}\right) + (L'-L)\left(\frac{1/\mathrm{SNR}}{(1/\mathrm{SNR}) + \beta/\widehat{\mathrm{SNR}}}\right)^{2}\frac{1}{\mathrm{SNR}\cdot N_p} - 2\sum_{l=0}^{L-1}\gamma_2(\tau_l)\,\sigma_l^{2},
\end{aligned} \tag{28}$$

where $\gamma'(l) = P_{\mathrm{MST}}(l)/(P_{\mathrm{MST}}(l) + \beta/(N_p\,\widehat{\mathrm{SNR}}))$, $l = 0, 1, \ldots, N_p-1$, $\tau_l$ is the channel delay of the $l$th resolvable path, $\sigma_l^2$ is the power of the $l$th path, and

$$\gamma_2(i) = \begin{cases}\dfrac{\sigma_l^{2} + 1/(\mathrm{SNR}\cdot N_p)}{\sigma_l^{2} + 1/(\mathrm{SNR}\cdot N_p) + \beta/(\widehat{\mathrm{SNR}}\cdot N_p)}, & \text{if } i \in \alpha,\\[2ex] \dfrac{1/\mathrm{SNR}}{(1/\mathrm{SNR}) + \beta/\widehat{\mathrm{SNR}}}, & \text{if } i \notin \alpha,\end{cases}\qquad \alpha = \{\tau_l : l = 0, 1, \ldots, L-1\}. \tag{29}$$

Note that since the channel is assumed to be normalized, the MSE of the proposed fast LMMSE algorithm and the MSE of the conventional LMMSE algorithm are equal to their respective normalized mean square errors (NMSEs). In addition, to compare the above NMSE analysis with the NMSE obtained by computer simulation, we define the NMSE obtained by simulation as

$$\mathrm{NMSE}_{\mathrm{simu}} = \frac{\sum_{i=0}^{K-1}\sum_{j=0}^{N_p-1}\bigl|\hat{H}_p(i,j) - H_p(i,j)\bigr|^{2}}{\sum_{i=0}^{K-1}\sum_{j=0}^{N_p-1}\bigl|H_p(i,j)\bigr|^{2}}, \tag{30}$$

where $\hat{H}_p(i,j)$ denotes the channel estimate at the $j$th pilot subcarrier in the $i$th OFDM symbol, obtained by the LMMSE algorithm or the proposed fast LMMSE algorithm, and $K$ denotes the number of OFDM symbols in the simulation.

Figure 7: NMSE of the proposed fast LMMSE algorithm with matched SNR and mismatched SNRs versus SNR, by simulation and numerical method, respectively. (Plot of NMSE versus SNR (dB). Curves: the proposed fast LMMSE, matched SNR, numerical method; SNR design = 5 dB, 10 dB, and 20 dB, numerical method; SNR design = 5 dB, 10 dB, and 20 dB, simulation.)

5. Numerical and Simulation Results

Both computer simulation and numerical evaluation have been used to investigate the performance of the proposed fast LMMSE algorithm for channel estimation. In the simulation, we employ the COST207 channel model [23] with six paths, that is, $L = 6$, and a maximum delay spread of 2.5 microseconds. The channel power intensity profile is listed in Table 1. The number of subcarriers of the OFDM system, $N$, is equal to 2048, and the CP length is equal to 128 sample points. The bandwidth of the system is 20 MHz, so that one OFDM symbol period is $T_s = 102.4$ microseconds and the CP period is $T_{\mathrm{CP}} = 6.4$ microseconds, which exceeds the maximum delay spread of 2.5 microseconds. The total number of pilots $N_p$ is equal to 128, and the pilot gap $R$ is 16. The transmitted signal is BPSK modulated, and the Doppler shift is 100 Hz.

5.1. Channel Autocorrelation Matrix under Different SNRs. Figure 3 shows the magnitude of the first row of the channel autocorrelation matrix $R_{H_pH_p}$, $A$. Since the channel autocorrelation matrix is circulant, it is enough to show its first row. Observe that the magnitude of $A$ varies approximately periodically, and the period is 13 pilot subcarriers. Since the channel
power intensity profile decays exponentially, the period of the first row of the channel autocorrelation matrix is determined by the delay of the second path. The delay of the second path is 0.5 microseconds, that is, 10 sample points. According to (7), with the parameter $N$ replaced by $N_p$, the period is $N_p/\tau_1 = 128/10 = 12.8$. Therefore, the period is about 13, as shown in Figure 3. Figure 4 shows the magnitude of the first row of the LMMSE matrix $R_{H_pH_p}(R_{H_pH_p} + (\beta/\mathrm{SNR})I)^{-1}$ with SNRs of 5 dB, 10 dB, and 20 dB, respectively. Since the LMMSE matrix is also circulant, it is sufficient to depict its first row. Observe that the first row of the LMMSE matrix is symmetric about index 64. It is also approximately periodic, with a period of about 13 pilot subcarriers, and it varies insignificantly when the SNR changes from 5 dB to 20 dB. In addition, the local maxima of the curves correspond to strong correlation between pilot subcarriers, and the local minima correspond to weak correlation between pilot subcarriers.

Table 1: Channel power intensity profile.

Tap   Delay (us)   Gain (dB)   Doppler spectrum
1     0            0.0         Clarke [24]
2     0.5          −6.0        Clarke
3     1.0          −12.0       Clarke
4     1.5          −18.0       Clarke
5     2.0          −24.0       Clarke
6     2.5          −30.0       Clarke

Figure 8: Bit error rate (BER) of the LS, LMMSE, the proposed fast LMMSE, and perfect channel estimation versus SNR. (Plot of BER versus SNR (dB). Curves: LS; the proposed fast LMMSE algorithm; LMMSE; perfect channel estimation.)

5.2. Normalized Mean Square Error (NMSE) Comparison of Channel Estimation between LMMSE Algorithm and the Proposed Fast LMMSE Algorithm. Figure 5 shows the NMSE of channel estimation of the LMMSE algorithm versus that of the proposed fast LMMSE algorithm, by computer simulation and numerical method, respectively. The numerical results of the LMMSE algorithm and the proposed fast LMMSE algorithm are obtained by (22) and (26), respectively. The simulation results are obtained by (30), where $\hat{H}_p$ is replaced with $\hat{H}_{p,\mathrm{lmmse}}$ for the LMMSE algorithm and with $\hat{H}_{p,\mathrm{fast\,lmmse}}$ for the proposed fast LMMSE algorithm. For the proposed fast LMMSE algorithm, the number of OFDM symbols used to obtain the average power of each tap, $N_{\mathrm{MST}}$, is 20, and the number of chosen paths, $L'$, is 10. The number of OFDM symbols in the simulation, $K$, is 5000 for both algorithms. Observe that the NMSE of the proposed fast LMMSE algorithm is very close to that of the LMMSE algorithm in theory over the SNR range from 0 dB to 25 dB. In addition, for the LMMSE algorithm the numerical result is verified by the simulation. For the proposed fast LMMSE algorithm, the simulation result agrees well with the numerical result, except that it is slightly higher at low SNR. Observe that both the proposed fast LMMSE algorithm and the LMMSE algorithm are superior to the LS algorithm. For instance, the LMMSE algorithm has about 16 dB gain over the LS algorithm at the same MSE over the SNR range from 0 dB to 25 dB.

Figure 6 shows the normalized mean square error (NMSE) of the LMMSE algorithm with matched SNR and mismatched SNRs versus SNR, by simulation and numerical method, respectively. Firstly, we explain how the numerical curves are obtained. For the curves with matched SNR, we use (22) to calculate the MSEs under different SNRs. For the curves with mismatched SNRs, that is, designed SNRs, we use (23). Secondly, for the curves with mismatched SNRs obtained by computer simulation, we use the designed SNR (predetermined and fixed) instead of the true SNR in (13) to obtain the channel estimate at the pilot subcarriers. Observe that the analytical results are well verified by computer simulation for the designed SNRs of 5 dB, 10 dB, and 20 dB. For the designed SNR of 5 dB, the MSE closely follows the matched-SNR curve within the range from 0 dB to about 10 dB. However, when the SNR increases, an MSE floor of about $2 \times 10^{-3}$ occurs. A similar trend can be found for the designed SNR of 10 dB. Observe that the curve for the designed SNR of 20 dB closely follows the matched-SNR curve within the SNR range from 0 dB to 25 dB. Therefore, if we know only the channel autocorrelation matrix $R_{H_pH_p}$ and do not know the SNR, these results suggest using a higher designed SNR in (13) when performing channel estimation.

Figure 7 shows the NMSE of the proposed fast LMMSE algorithm with matched SNR and mismatched SNRs versus SNR, by simulation and numerical method, respectively.
Firstly, we briefly explain how the numerical curves are obtained. For the curve with matched SNR, we use (26). For the curves with mismatched SNRs, that is, designed SNRs, we use (28). To verify the numerical results, we perform computer simulation for each designed SNR. In the computer simulation, step 3 of the proposed fast LMMSE algorithm is modified by setting the estimated SNR, $\widehat{\mathrm{SNR}}$, to the designed SNR. For instance, if we choose the designed SNR to be 10 dB, $\widehat{\mathrm{SNR}}$ is set to 10 dB in step 3 of the proposed fast LMMSE algorithm instead of being obtained by formula (19). For the computer simulation, the number of OFDM symbols used to obtain the average power of each tap, $N_{\mathrm{MST}}$, is 20, and the number of chosen paths, $L'$, is 10. The number of OFDM symbols in the simulation, $K$, is 5000. Observe that the analytical results are well verified by computer simulation for the designed SNRs of 5 dB, 10 dB, and 20 dB. For the designed SNR of 5 dB, the MSE closely follows the matched-SNR curve within the range from 0 dB to about 10 dB. However, when the SNR increases, an MSE floor of about $2 \times 10^{-3}$ occurs. A similar trend can be found for the designed SNR of 10 dB. Observe that the curve for the designed SNR of 20 dB closely follows the matched-SNR curve within the SNR range from 0 dB to 25 dB.

5.3. Bit Error Rate (BER) Comparison between LMMSE Algorithm and the Proposed Fast LMMSE Algorithm. Figure 8 shows the BER of LS, LMMSE, the proposed fast LMMSE, and perfect channel estimation, respectively. We adopt linear interpolation to obtain the channel frequency response at all subcarriers after the channel frequency response at the pilot subcarriers is obtained by the LS, LMMSE, or proposed fast LMMSE estimator. Once the channel frequency response is obtained, we use maximum likelihood detection to obtain the estimated signal $\hat{X}(i,k)$. Perfect channel estimation means that the channel frequency response is known by the receiver in advance. Observe that the BER of the LMMSE estimator is very close to that of the proposed fast LMMSE estimator over the SNR range from 0 dB to 25 dB. Both are about 1 dB worse than the perfect channel estimator over this range. The LMMSE estimator and the proposed fast LMMSE estimator are about 3 to 4 dB better than the LS estimator at the same BER over the SNR range from 0 dB to 25 dB.

Figure 9 shows the BER performance of the LMMSE channel estimation with matched SNR and the LMMSE channel estimation with designed SNRs. The LMMSE channel estimator with designed SNR uses a predetermined and fixed SNR in (13) instead of the true SNR. Observe that the BERs of the LMMSE estimator with designed SNRs of 5 dB, 10 dB, and 20 dB almost overlap within the lower SNR range from 0 dB to 15 dB. However, when the SNR increases from 15 dB to 25 dB, the BER of the LMMSE estimator with a higher designed SNR is better than that with a lower designed SNR. The results are consistent with the NMSEs in Figure 6. Therefore, when the SNR is mismatched, designing for a higher SNR is preferable.

Figure 9: BER comparison between LMMSE channel estimation with matched SNR and LMMSE channel estimation with designed SNRs. (Plot of BER versus SNR (dB). Curves: LMMSE, matched SNR; LMMSE, SNR design = 5 dB, 10 dB, and 20 dB.)

Figure 10: BER comparison between the proposed fast LMMSE channel estimation with estimated SNR and the proposed fast LMMSE channel estimation with designed SNRs. (Plot of BER versus SNR (dB). Curves: the proposed fast LMMSE, estimated SNR; the proposed fast LMMSE, SNR design = 5 dB, 10 dB, and 20 dB.)

Figure 10 shows the BER of the proposed fast LMMSE estimator with estimated SNR and the proposed fast LMMSE estimator with designed SNRs. The proposed fast LMMSE estimator with estimated SNR is the algorithm summarized in Section 3. The proposed fast LMMSE estimator with designed SNR modifies step 3 of the proposed algorithm by using a predetermined and fixed SNR instead of using formula
(19) to obtain the estimated SNR. Observe that the BERs of the proposed fast LMMSE estimator with designed SNRs of 5 dB, 10 dB, and 20 dB almost overlap within the lower SNR range from 0 dB to 15 dB. However, when the SNR increases from 15 dB to 25 dB, the BER of the proposed fast LMMSE estimator with a higher designed SNR is better than that with a lower designed SNR. Thus, when the SNR is mismatched, designing for a higher SNR is preferable.

6. Conclusion

In this paper, a fast LMMSE channel estimation method has been proposed and thoroughly investigated for OFDM systems. The conventional LMMSE channel estimation requires the channel statistics, that is, the channel autocorrelation matrix in the frequency domain and the SNR, which are often unavailable in practical systems, so its application is limited. The proposed method efficiently estimates the channel autocorrelation matrix by the improved MST algorithm and calculates the LMMSE matrix by Kumar's fast algorithm, exploiting the circulant property of the channel autocorrelation matrix, so that the computational complexity is reduced significantly. We present the MSE analysis for the proposed method and the conventional LMMSE method and investigate the MSE under two cases, that is, matched SNR and mismatched SNR. Numerical results and computer simulation show that, when the SNR is mismatched, designing for a higher SNR is preferable.

Appendices

A.

In this appendix, we prove that the rank of $R_{HH}$ is equal to $L$ and the rank of $R_{HH} + \sigma_w^2 I$ is equal to $N$. From (7) and (9),

$$\begin{aligned}
\lambda_k &= \sum_{n=0}^{N-1} R_{HH}(0,n)\exp\!\left(-\frac{j2\pi nk}{N}\right) \\
&= \sum_{n=0}^{N-1}\sum_{l=0}^{L-1}\sigma_l^{2}\exp\!\left(\frac{j2\pi\tau_l n}{N}\right)\exp\!\left(-\frac{j2\pi nk}{N}\right) \\
&= \begin{cases}0, & \text{for } k \notin \alpha,\\ N\sigma_l^{2}, & \text{for } k = \tau_l \in \alpha,\end{cases}
\end{aligned} \tag{A.1}$$

where $\alpha = \{\tau_l \mid l = 0, 1, \ldots, L-1\}$, $\tau_l$ is the delay of the $l$th path, and $L$ is the number of resolvable paths. The last step holds because the sum over $n$ of $\exp(j2\pi(\tau_l - k)n/N)$ equals $N$ when $k = \tau_l$ and zero otherwise. Thus, the number of nonzero eigenvalues of $R_{HH}$ is equal to $L$.

Denote the eigenvalues of the matrix $R_{HH} + \sigma_w^2 I$ by $\mu_k$, $k = 0, 1, \ldots, N-1$. We obtain

$$\begin{aligned}
[\mu_0\ \mu_1\ \cdots\ \mu_{N-1}] &= \mathrm{FFT}_N\bigl([R_{HH}(0,0) + \sigma_w^{2}\ \ R_{HH}(0,1)\ \cdots\ R_{HH}(0,N-1)]\bigr) \\
&= [\lambda_0 + \sigma_w^{2}\ \ \lambda_1 + \sigma_w^{2}\ \cdots\ \lambda_{N-1} + \sigma_w^{2}].
\end{aligned} \tag{A.2}$$

Therefore the number of nonzero eigenvalues of the matrix $R_{HH} + \sigma_w^2 I$ is $N$, and its rank is $N$.

B.

In this appendix, we show the derivation of (20). Since the matrix $\hat{R}_{H_pH_p} + (\beta/\widehat{\mathrm{SNR}})I$ is circulant, its inverse $(\hat{R}_{H_pH_p} + (\beta/\widehat{\mathrm{SNR}})I)^{-1}$ can be obtained by Kumar's fast algorithm [25]. Denote the first row of $\hat{R}_{H_pH_p} + (\beta/\widehat{\mathrm{SNR}})I$ by $C$, so that

$$C = \left[\hat{R}_{H_pH_p}(0,0) + \frac{\beta}{\widehat{\mathrm{SNR}}}\ \ \hat{R}_{H_pH_p}(0,1)\ \cdots\ \hat{R}_{H_pH_p}(0,N_p-1)\right]. \tag{B.3}$$

Kumar's fast algorithm can be summarized as follows.

Step 1. Compute the $N_p$-point FFT of the vector $C$ to obtain

$$D = [d_0\ d_1\ \cdots\ d_{N_p-1}] = \mathrm{FFT}_{N_p}(C). \tag{B.4}$$

Step 2. $E$ can be obtained from (B.4) as

$$E = \left[\frac{1}{d_0}\ \frac{1}{d_1}\ \cdots\ \frac{1}{d_{N_p-1}}\right]. \tag{B.5}$$

Step 3. Denote the first row of the matrix $(\hat{R}_{H_pH_p} + (\beta/\widehat{\mathrm{SNR}})I)^{-1}$ by $F$. Then $F$ is given by the $N_p$-point IFFT of the vector $E$:

$$F = \mathrm{IFFT}_{N_p}(E). \tag{B.6}$$

The above three steps can be combined as

$$F = \mathrm{IFFT}_{N_p}\Bigl\{\mathbf{1}\cdot\bigl[\mathrm{diag}\bigl(\mathrm{FFT}_{N_p}(C)\bigr)\bigr]^{-1}\Bigr\}, \tag{B.7}$$

where $\mathbf{1} = [1\ 1\ \cdots\ 1]_{1\times N_p}$, and $\mathrm{diag}\{\cdot\}$ denotes the diagonalization operation. The matrix $(\hat{R}_{H_pH_p} + (\beta/\widehat{\mathrm{SNR}})I)^{-1}$ can be obtained from the $1 \times N_p$ vector $F$ by circular shifts. Denote the first row of the matrix $\hat{R}_{H_pH_p}(\hat{R}_{H_pH_p} + (\beta/\widehat{\mathrm{SNR}})I)^{-1}$ by $\hat{B}$,
by $\mathbf{B}$, the first column of the matrix $(\mathbf{R}_{H_pH_p} + (\beta/\mathrm{SNR})\mathbf{I})^{-1}$ by $\mathbf{G}$. It follows that

$$B(j) = \sum_{i=0}^{N_p-1} A(i)\, G\big((i-j) \bmod N_p\big), \quad j = 0, 1, \dots, N_p-1, \tag{B.8}$$

where $B(i)$, $A(i)$, and $G(i)$ are the $i$th elements of the vectors $\mathbf{B}$, $\mathbf{A}$, and $\mathbf{G}$, respectively. $\mathbf{A}$ is the first row of the matrix $\mathbf{R}_{H_pH_p}$. Since $\mathbf{G} = \mathbf{F}^H$ and $G(i) = G^{*}(N_p - i)$, where $(\cdot)^{*}$ denotes conjugate and $(\cdot)^H$ denotes Hermitian transpose, (B.8) can be equivalently written as

$$B(j) = \sum_{i=0}^{N_p-1} A(i)\, F\big((j-i) \bmod N_p\big), \quad j = 0, 1, \dots, N_p-1. \tag{B.9}$$

Or equivalently,

$$\mathbf{B} = \mathbf{A} \otimes \mathbf{F}, \tag{B.10}$$

where $\otimes$ denotes circulant convolution, and $F(i)$ is the $i$th entry of the vector $\mathbf{F}$. Using the property of DFT, (B.10) can be written as

$$\mathbf{B} = \mathbf{A} \otimes \mathbf{F} = \mathrm{IFFT}_{N_p}\big\{\mathrm{FFT}_{N_p}(\mathbf{A}) \cdot \mathrm{diag}\big(\mathrm{FFT}_{N_p}(\mathbf{F})\big)\big\}. \tag{B.11}$$

Using (17), (B.3), and (B.7), (B.11) can be further written as

$$\mathbf{B} = \mathrm{IFFT}_{N_p}\left[\frac{P_{\mathrm{MST}}(0)}{P_{\mathrm{MST}}(0) + \beta/(N_p\,\mathrm{SNR})},\ \frac{P_{\mathrm{MST}}(1)}{P_{\mathrm{MST}}(1) + \beta/(N_p\,\mathrm{SNR})},\ \cdots,\ \frac{P_{\mathrm{MST}}(N_p-1)}{P_{\mathrm{MST}}(N_p-1) + \beta/(N_p\,\mathrm{SNR})}\right]. \tag{B.12}$$

References

[1] S. B. Weinstein and P. M. Ebert, “Data transmission by frequency-division multiplexing using the discrete Fourier transform,” IEEE Transactions on Communications, vol. 19, no. 5, part 1, pp. 628–634, 1971.
[2] D. S. W. Hui, V. K. N. Lau, and W. H. Lam, “Cross-layer design for OFDMA wireless systems with heterogeneous delay requirements,” IEEE Transactions on Wireless Communications, vol. 6, no. 8, pp. 2872–2880, 2007.
[3] S. Coleri, M. Ergen, A. Puri, and A. Bahai, “Channel estimation techniques based on pilot arrangement in OFDM systems,” IEEE Transactions on Broadcasting, vol. 48, no. 3, pp. 223–229, 2002.
[4] M.-H. Hsieh and C.-H. Wei, “Channel estimation for OFDM systems based on comb-type pilot arrangement in frequency selective fading channels,” IEEE Transactions on Consumer Electronics, vol. 44, no. 1, pp. 217–225, 1998.
[5] Y. Zeng, W. H. Lam, and T. S. Ng, “Semiblind channel estimation and equalization for MIMO space-time coded OFDM,” IEEE Transactions on Circuits and Systems I, vol. 53, no. 2, pp. 463–473, 2006.
[6] O. Edfors, M. Sandell, J.-J. van de Beek, S. K. Wilson, and P. O. Börjesson, “OFDM channel estimation by singular value decomposition,” IEEE Transactions on Communications, vol. 46, no. 7, pp. 931–939, 1998.
[7] O. Simeone, Y. Bar-Ness, and U. Spagnolini, “Pilot-based channel estimation for OFDM systems by tracking the delay-subspace,” IEEE Transactions on Wireless Communications, vol. 3, no. 1, pp. 315–325, 2004.
[8] Y. Zhao and A. Huang, “A novel channel estimation method for OFDM mobile communication systems based on pilot signals and transform-domain processing,” in Proceedings of the 47th IEEE Vehicular Technology Conference (VTC ’97), vol. 3, pp. 2089–2093, Phoenix, Ariz, USA, May 1997.
[9] J.-C. Lin, “Least-squares channel estimation for mobile OFDM communication on time-varying frequency-selective fading channels,” IEEE Transactions on Vehicular Technology, vol. 57, no. 6, pp. 3538–3550, 2008.
[10] R. Lin and A. P. Petropulu, “Linear precoding assisted blind channel estimation for OFDM systems,” IEEE Transactions on Vehicular Technology, vol. 54, no. 3, pp. 983–995, 2005.
[11] X. D. Cai and A. N. Akansu, “A subspace method for blind channel identification in OFDM systems,” in Proceedings of the IEEE International Conference on Communications (ICC ’00), vol. 2, pp. 929–933, New Orleans, La, USA, June 2000.
[12] X. G. Doukopoulos and G. V. Moustakides, “Blind adaptive channel estimation in OFDM systems,” IEEE Transactions on Wireless Communications, vol. 5, no. 7, pp. 1716–1725, 2006.
[13] S. Coleri, M. Ergen, A. Puri, and A. Bahai, “A study of channel estimation in OFDM systems,” in Proceedings of the 56th IEEE Vehicular Technology Conference (VTC ’02), vol. 2, pp. 894–898, Vancouver, Canada, September 2002.
[14] J.-J. van de Beek, O. Edfors, M. Sandell, S. K. Wilson, and P. O. Börjesson, “On channel estimation in OFDM systems,” in Proceedings of the 45th IEEE Vehicular Technology Conference (VTC ’95), vol. 2, pp. 815–819, Chicago, Ill, USA, July 1995.
[15] Y. Li, L. J. Cimini Jr., and N. R. Sollenberger, “Robust channel estimation for OFDM systems with rapid dispersive fading channels,” IEEE Transactions on Communications, vol. 46, no. 7, pp. 902–915, 1998.
[16] M. Morelli and U. Mengali, “A comparison of pilot-aided channel estimation methods for OFDM systems,” IEEE Transactions on Signal Processing, vol. 49, no. 12, pp. 3065–3073, 2001.
[17] C. Kuo and J.-F. Chang, “Equalization and channel estimation for OFDM systems in time-varying multipath channels,” in Proceedings of the 15th IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC ’04), vol. 1, pp. 474–478, Barcelona, Spain, September 2004.
[18] W.-G. Song and J.-T. Lim, “Channel estimation and signal detection for MIMO-OFDM with time varying channels,” IEEE Communications Letters, vol. 10, no. 7, pp. 540–542, 2006.
[19] M. J. Fernández-Getino García, J. M. Páez-Borrallo, and S. Zazo, “DFT-based channel estimation in 2D-pilot-symbol-aided OFDM wireless systems,” in Proceedings of the 53rd IEEE Vehicular Technology Conference (VTC ’01), vol. 2, pp. 810–814, Rhodes, Greece, May 2001.
[20] A. Böttcher and S. M. Grudsky, Spectral Properties of Banded Toeplitz Matrices, SIAM, Philadelphia, Pa, USA, 2005.
[21] H. Minn and V. K. Bhargava, “An investigation into time-domain approach for OFDM channel estimation,” IEEE Transactions on Broadcasting, vol. 46, no. 4, pp. 240–248, 2000.
[22] S. Haykin, Adaptive Filter Theory, Publishing House of Electronics Industry, Beijing, China, 4th edition, 2002.
[23] M. Failli, “Digital land mobile radio communications - COST 207,” Tech. Rep. EUR 12160, Commission of the European Communities, Brussels, Belgium, September 1988.
[24] T. S. Rappaport, Wireless Communications Principles and Practice, Publishing House of Electronics Industry, Beijing, China, 2002.
[25] R. Kumar, “A fast algorithm for solving a Toeplitz system of equations,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 33, no. 1, pp. 254–267, 1985.
doi:10.1155/2009/307375
Research Article
Linearly Time-Varying Channel Estimation and Symbol
Detection for OFDMA Uplink Using Superimposed Training
Han Zhang, Xianhua Dai, Dong Li, and Sheng Ye

Department of Electronics & Communication Engineering, Sun Yat-Sen University, Guangzhou 510275, China
Correspondence should be addressed to Xianhua Dai, issdxh@mail.sysu.edu.cn
Received 30 July 2008; Revised 22 November 2008; Accepted 27 January 2009
We address the problem of superimposed training (ST)-based linearly time-varying (LTV) channel estimation and symbol detection for orthogonal frequency-division multiple access (OFDMA) systems at the uplink receiver. The LTV channel coefficients are modeled by truncated discrete Fourier bases (DFBs). By judiciously designing the superimposed pilot symbols, we estimate the LTV channel transfer functions over the whole frequency band by using a weighted average procedure, which makes the estimates usable for adaptive resource allocation. We also present a performance analysis of the channel estimation approach and derive a closed-form expression for the channel estimation variances. In addition, an iterative symbol detector is presented to mitigate the effect of the superimposed training on information sequence recovery. With the iterative mitigation procedure, the demodulator achieves a considerable gain in signal-to-interference ratio and exhibits a symbol error rate (SER) performance nearly indistinguishable from that of frequency-division multiplexed training. Compared to existing frequency-division multiplexed training schemes, the proposed algorithm requires no additional bandwidth and also supports adaptive resource allocation.
Copyright © 2009 Han Zhang et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. Introduction

Orthogonal frequency-division multiple access (OFDMA) is a promising technique for future high-speed broadband wireless communication systems, and it has recently been proposed or adopted in many industry standards (e.g., IEEE 802.16e [1] and 3GPP Long Term Evolution (LTE) [2]). In OFDMA, subcarriers are grouped into sets, each of which is assigned to a different user. Interleaved, random, or clustered assignment schemes can be used for this purpose. Such a system, however, relies on knowledge of the channel state information (CSI) of the propagation channel. In many mobile wireless communication systems, transmission is impaired by both delay and Doppler spreads [3–10], resulting in in-band and out-of-band interference.

Channel estimation in OFDMA uplinks is challenging, since the different channel responses of the individual users need to be tracked simultaneously at the base station (BS). The problem is even more critical for OFDMA systems with adaptive resource allocation, since the uplink channels have to be estimated over the whole frequency band. In conventional pilot-aided approaches, wherein the pilot symbols are frequency-division multiplexed (FDM) with the data symbols [3–8, 10–15], channel estimation can only be performed within the subband of each user separately, since each user is assigned only a subset of the whole frequency band. This may be a great disadvantage for OFDMA systems with adaptive resource allocation. In addition, extra bandwidth is required for transmitting the known pilot symbols. In recent years, an alternative and promising approach, referred to as superimposed training (ST), has been widely studied [9, 16–24]. In ST, additional periodic training sequences are arithmetically added to the information sequence in the time or frequency domain, and the channel transfer function can thus be estimated by using first-order statistics. The advantage of the scheme is that there is no loss in information rate, which enables higher bandwidth efficiency. In this scheme, however, the information sequences act as interference to channel estimation, since the pilot symbols are superimposed at a low power on the information sequences at the transmitter. To circumvent the problem, it was recommended in [16–22, 24] that a periodic impulse train with a period larger than the channel order be superimposed in the time domain, so that the channel is estimated by averaging the estimates of multiple training periods to reduce the information sequence interference. For multicarrier systems, that is, SISO/OFDM systems, [19] suggested a similar scheme that superimposes the periodic impulse training sequences on the time-domain modulated signals, while for single-carrier systems, a novel block transmission method is proposed in the frequency domain in [23], where an information-sequence-dependent component is added to the superimposed training so as to remove the effect of the information sequence on the channel estimation at the receiver. In [24], an iterative approach is provided where the information sequence is exploited to enhance the channel estimation performance. These schemes, however, are restricted to the case where the channel is linearly time-invariant (LTI), and cannot be extended to the linearly time-varying (LTV) channel, since the variation of the channel coefficients may severely degrade the simple average-based solution. A combined approach is developed in [9, 11] to solve the problem of channel estimation for LTV channels. However, it is only suitable for single-carrier transmission. In addition, some useful power is spent on the ST that could otherwise have been allocated to the information sequence. This lowers the effective signal-to-noise ratio (SNR) of the information sequence and affects the symbol error rate (SER) at the receiver, which may be a great disadvantage for wireless communication systems with a limited transmission power. On the other hand, the interference to information sequence recovery due to the embedded training sequences may severely degrade the SER performance at the receiver. Previous papers focus merely on suppressing the information sequence interference, whereas little research has addressed the cancellation of the superimposed training effect for information sequence recovery.

Figure 1: System model. (Block diagram: users 1 to N, subcarrier allocation based on channel state information, IDFT, add CP, LTV channel with AWGN, remove CP, DFT, demodulator.)

In this paper, we propose a new ST-based channel estimator that can overcome the aforementioned shortcomings in estimating the LTV channel for OFDMA uplink systems. In contrast to previous works, the main contributions of this paper are twofold. First, we extend conventional LTI-based ST schemes [16–24] to the case where the channel coefficient is linearly time-varying. By resorting to truncated discrete Fourier bases (DFBs) to model the LTV channel, we adopt a two-step approach to estimate the time-varying channel coefficients over multiple OFDMA symbols. Unlike the conventional FDM training strategy [12–15], where channel estimation can only be performed within the subband of each user separately, the LTV uplink channel transfer functions over the whole frequency band can be estimated directly by using specifically designed superimposed training. Furthermore, we present a performance analysis of the channel estimator. We demonstrate by simulation that the estimation variance, unlike that of conventional ST-based schemes for LTI channels [16–22, 24], approaches a fixed lower bound as the training length increases. Second, an iterative symbol detection algorithm is adopted to mitigate the effect of the superimposed training on information sequence recovery. In the simulations presented in this paper, we compare the results of our approach with those of the FDM training approaches [12–15], as the latter serve as a “benchmark” in related works. It is shown that the proposed algorithm outperforms FDM training, and the demodulator exhibits an SER performance nearly indistinguishable from that of [14].

The rest of the paper is organized as follows. Section 2 presents the channel and system models. In Section 3, we estimate the LTV channel coefficients by using the proposed channel estimator. In Section 4, we present the closed-form
expression of the channel estimation variances of Section 3. An iterative symbol detector is provided in Section 5. Section 6 reports on some simulation experiments carried out in order to test the validity of the theoretical results, and we conclude the paper with Section 7.

Notation 1. The letter $t$ represents the time-domain variable, and $k$ is the frequency-domain variable. Bold letters denote matrices and column vectors, and the superscripts $[\cdot]^T$ and $[\cdot]^H$ represent the transpose and conjugate transpose operations, respectively. $\mathbf{I}_K$ denotes the identity matrix of size $K$, and $[\cdot]_{k,t}$ denotes the $(k,t)$ element of the specified matrix.

2. Channel and System Model

Consider an OFDMA uplink system with $N$ active users sharing a bandwidth of $Z$, as shown in Figure 1. Although there are many subcarrier assignment protocols, in this paper we assume that a consecutive set of subcarriers is assigned to each user. This assumption is especially feasible when the adaptive modulation and coding (AMC) protocol is employed rather than the partial usage of subchannels (PUSC) protocol [12–15]. The $i$th symbol of the $n$th user is denoted by

$$\mathbf{S}_n(i) = [0, \dots, s_n(i,0), \dots, s_n(i,k), \dots, s_n(i,K-1), 0, \dots, 0]^T, \quad n = 1, \dots, N, \tag{1}$$

where $s_n(i,k)$, $k = 0, \dots, K-1$, is the transmitted data symbol, $K$ is the number of subcarriers allocated to the $n$th user, and $B = NK$ is the OFDM symbol size.

At the transmit terminals, an inverse fast Fourier transform (IFFT) is used as a modulator. The modulated outputs are given by

$$\mathbf{X}_n(i) = [x_n(i,0), \dots, x_n(i,t), \dots, x_n(i,B-1)]^T = \mathbf{F}^{-1}\mathbf{S}_n(i), \tag{2}$$

where $\mathbf{F}^{-1}$ is the IFFT matrix with $[\mathbf{F}^{-1}]_{k,t} = e^{j2\pi kt/B}$ and $j^2 = -1$. Then, a cyclic prefix (CP) of length $L$ is added to $\mathbf{X}_n(i)$, which is propagated through the respective channel. At the receiver, the received signals, after discarding the CP, can be written as

$$y(i,t) = \sum_{n=1}^{N} \mathbf{X}_n(i) \otimes \mathbf{h}(t) + v(t) = \sum_{n=1}^{N}\sum_{l=0}^{L-1} h_l(t)\, x_n(i,t-l) + v(i,t), \quad t = 1, \dots, B, \tag{3}$$

where $\mathbf{h}(t) = [h_0(t), \dots, h_{L-1}(t), 0, \dots, 0]^T$ is the $B \times 1$ impulse response vector of the propagation channel, with the channel coefficients $h_l(t)$, $l = 0, \dots, L-1$, being functions of the time variable $t$. The notation $\otimes$ represents cyclic convolution, and $v(i,t)$ is the additive noise with variance $E_v$.

As mentioned in [3], the coefficients of the time- and frequency-selective channel can be modeled as Fourier basis expansions. Thereafter, this model was intensively investigated and applied in block transmission, channel estimation, and equalization (e.g., [4–8]). In this paper, we extend the block-by-block process [4–8] to the case where multiple OFDMA symbols are utilized. Within a time interval or segment $\{t : (\ell-1)\Omega \le t \le \ell\Omega\}$, the channel coefficients in (3) can be approximated by truncated discrete Fourier bases (DFBs) as

$$h_l(t) \approx \sum_{q=0}^{Q} h_{l,q}\, e^{j2\pi(q-Q/2)t/\Omega}, \quad (\ell-1)\Omega \le t \le \ell\Omega, \quad \ell = 1, 2, \dots, \tag{4}$$

where $h_{l,q}$ is a constant coefficient, $l = 0, \dots, L-1$ is the multipath delay, $Q$ represents the basis expansion order that is generally defined as $Q \ge 2 f_d \Omega / f_s$ [3–8], $\Omega > B$ is the segment length, and $\ell$ is the segment index. Unlike [4–8], the approximation frame $\Omega$ covers multiple OFDM symbols, denoted by $i = 1, \dots, I$, where $I = \Omega/B'$ and $B' = B + L$. Stacking the received signals in (3) to form a vector and then performing the FFT operation, we obtain the demodulated signals as

$$\mathbf{U}(i) = [u(i,0), \dots, u(i,k), \dots, u(i,B-1)]^T = \mathbf{F}\,[y(i,0), \dots, y(i,t), \dots, y(i,B-1)]^T. \tag{5}$$

From (3)-(4) and the duality of time and frequency, the FFT demodulated outputs in (5) can be written as

$$\begin{aligned} u(i,k) &= \mathrm{FFT}\Big\{\sum_{n=1}^{N}\sum_{l=0}^{L-1} h_l(t)\, x_n(i,t-l) + v(i,t)\Big\} \\ &= \sum_{n=1}^{N}\sum_{l=0}^{L-1} \mathrm{FFT}\{h_l(t)\} \otimes \mathrm{FFT}\{x_n(i,t)\} + v(i,k) \\ &= \sum_{n=1}^{N}\sum_{l=0}^{L-1} \mathrm{FFT}\Big\{\sum_{q=0}^{Q} h_{l,q}\, e^{j2\pi(q-Q/2)t/\Omega}\Big\} \otimes \mathbf{S}_n(i) + v(i,k), \end{aligned} \tag{6}$$

where $\mathrm{FFT}\{\cdot\}$ represents the FFT vector of the specified function with length $B$, and $v(i,k)$ is the frequency-domain noise. Note that the vectors $\mathrm{FFT}\{h_l(t)\}$ in (6) should be computed according to the variation of the propagation channel during an OFDM symbol time interval. Specifically, the variation of the LTV channel is associated with the OFDM symbol size as well as the Doppler frequency or mobile velocity.

In this paper, we focus on slowly time-varying channel estimation. We follow the slowly time-varying assumption, under which the time-varying channel coefficients can be approximated as LTI during one OFDM symbol period but vary significantly across multiple symbols [25].
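As a rough numerical illustration of the truncated DFB model (4) and of the slowly time-varying assumption, the following Python sketch synthesizes one channel tap from $Q+1$ basis coefficients, recovers the coefficients by projecting a full frame onto the bases, and measures how far the tap drifts from its mid-symbol value within one OFDM symbol. The sizes (`B`, `L`, `I`, `Q`), the coefficient values, and all helper names are assumptions chosen for illustration and are not taken from the paper.

```python
import cmath
import math

# Illustrative (assumed) parameters: symbol size B, CP length L, I symbols per frame.
B, L, I, Q = 16, 4, 8, 2
B_PRIME = B + L            # symbol length including the cyclic prefix
OMEGA = I * B_PRIME        # frame (segment) length in samples


def basis(q, t):
    """q-th truncated discrete Fourier basis evaluated at sample t, as in (4)."""
    return cmath.exp(2j * math.pi * (q - Q / 2) * t / OMEGA)


def tap(coeffs, t):
    """Time-varying tap h_l(t) synthesized from its Q+1 basis coefficients."""
    return sum(c * basis(q, t) for q, c in enumerate(coeffs))


def project(samples):
    """Recover the coefficients from a full frame of noise-free samples.

    The bases are mutually orthogonal over OMEGA consecutive samples,
    so a plain correlation divided by OMEGA inverts the model.
    """
    return [
        sum(s * basis(q, t).conjugate() for t, s in enumerate(samples)) / OMEGA
        for q in range(Q + 1)
    ]


def max_drift_within_symbol(coeffs, i):
    """Largest |h_l(t) - h_l(t_i)| over the i-th symbol (i = 1, ..., I)."""
    start = (i - 1) * B_PRIME
    t_mid = start + B / 2
    return max(abs(tap(coeffs, t) - tap(coeffs, t_mid))
               for t in range(start, start + B_PRIME))


coeffs_true = [0.3 - 0.1j, 1.0 + 0.0j, -0.2 + 0.4j]   # assumed example values
frame = [tap(coeffs_true, t) for t in range(OMEGA)]
coeffs_rec = project(frame)
drift = max(max_drift_within_symbol(coeffs_true, i) for i in range(1, I + 1))
print("recovered coefficients:", coeffs_rec)
print("worst drift from the mid-symbol value:", drift)
```

The drift shrinks as the frame grows relative to the symbol length for a fixed $Q$, which is the regime in which treating the channel as constant over one symbol is reasonable.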
The channel transfer function during an OFDMA symbol can then be approximated as

$$\hbar_l(t) = \sum_{q=0}^{Q} h_{l,q}\, e^{j2\pi(q-Q/2)t/\Omega} \approx \sum_{q=0}^{Q} h_{l,q}\, e^{j2\pi(q-Q/2)t_i/\Omega}, \quad t = (i-1)B', \dots, iB', \tag{7}$$

where $t_i = (\ell-1)\Omega + (i-1)B' + B/2$ is the mid-sample of the $i$th OFDMA symbol. In (7), the LTV channel coefficients are in fact approximated by the mid-values of the LTV channel model (4) at the $i$th symbol. Since the proposed channel estimation will be performed within one single frame $\Omega$, we omit the frame index $\ell$ and thus have $t_i = (i-1)B' + B/2$ for simplification.

Figure 2: Superimposed training sequences of different users are distributed over the whole frequency band of OFDMA uplink system. (Axes: user index versus subbands 1 to N over the whole frequency band of OFDMA. Legend: information sequence in subband; ST spreading the whole frequency band with training power $E_p$.)

Accordingly, the vectors $\mathrm{FFT}\{h_l(t)\}$ in (6) are thus computed as $\delta$-sequences, and the FFT demodulated signals at the subcarrier $k$ of the $i$th OFDMA symbol can be rewritten as
$$\begin{aligned} u(i,k) &= \sum_{n=1}^{N}\sum_{l=0}^{L-1}\Big[\sum_{q=0}^{Q} h_{l,q}\, e^{j2\pi(q-Q/2)t_i/\Omega}\Big] e^{-j2\pi kl/B}\, s_n(i,k) + v(i,k) \\ &= \sum_{n=1}^{N}\sum_{l=0}^{L-1} \hbar_l(i)\, e^{-j2\pi kl/B}\, s_n(i,k) + v(i,k), \end{aligned} \tag{8}$$

where $\hbar_l(i) = \sum_{q=0}^{Q} h_{l,q}\, e^{j2\pi(q-Q/2)t_i/\Omega}$.

In conventional FDM training schemes [12–14], where each user is assigned only a subset of the subcarriers, the channel estimation cannot be performed over the whole frequency band. This may be a great disadvantage for OFDMA systems with adaptive resource allocation.

3. Superimposed Training-Based Solution

In this section, we propose an ST-based two-step approach to estimate the channel transfer functions over the whole frequency band and, at the same time, to overcome the above-mentioned shortcoming of conventional ST-based schemes in estimating LTV channels.

3.1. Channel Estimation over One OFDMA Symbol. The new ST strategy for estimating the LTV channel of the OFDMA uplink system is illustrated in Figure 2. Accordingly, the transmitted symbol in (2) can be rewritten as

$$\mathbf{S}_n(i) = [p_n(i,0), \dots, p_n(i,(n-1)K-1),\ s_n(i,0) + p_n(i,(n-1)K), \dots, s_n(i,K-1) + p_n(i,nK-1),\ p_n(i,nK), \dots, p_n(i,B-1)]^T, \quad n = 1, \dots, N, \tag{9}$$

where $p_n(i,k)$, $k = 0, \dots, B-1$, are the superimposed pilots of the $n$th user. By (8), we notice that the signal at the receiver end is overlapped across different users. To circumvent this problem, we adopt the training scheme

$$p_n(i,k) = \sqrt{E_p}\, e^{-j2\pi k(n-1)L/B}, \quad k = 0, \dots, B-1, \tag{10}$$

where $E_p$ is the fixed power of the pilot symbols.

Note that the pilot symbols in (10) are complex exponential functions superimposed over all subcarriers, so the corresponding time-domain signals of the various users are in fact $\delta$-sequences, $p_n(i,t) = \sqrt{E_p B}\,\delta(t-(n-1)L)$, $n = 1, \dots, N$, which form a disjoint set with an interval $L$. Therefore, using the specifically designed training sequence (10), the training signals of the various users are decoupled. The sequence (10), however, possibly leads to high signal peaks at the sample instants $t = (n-1)L$, $n = 1, \dots, N$. One simple way to suppress these undesired signal peaks is the scrambling procedure [25] (details are not addressed here since they are beyond the scope of this paper).

Substituting the specifically designed pilot sequence (10) into (8), we have

$$\begin{aligned} u(i,k) &= \sum_{n=1}^{N}\sum_{l=0}^{L-1} \hbar_l(i)\, e^{-j2\pi kl/B}\, p_n(i,k) + \sum_{n=1}^{N}\sum_{l=0}^{L-1} \hbar_l(i)\, e^{-j2\pi kl/B}\, s_n(i,k) + v(i,k) \\ &= \sqrt{E_p}\sum_{n=1}^{N}\sum_{l=0}^{L-1} \hbar_l(i)\, e^{-j2\pi kl/B}\, e^{-j2\pi k(n-1)L/B} + w(i,k) \\ &= \sqrt{E_p}\sum_{\kappa=0}^{NL-1} \lambda_\kappa(i)\, e^{-j2\pi k\kappa/B} + w(i,k), \end{aligned} \tag{11}$$

where $w(i,k) = \sum_{n=1}^{N}\sum_{l=0}^{L-1} \hbar_l(i)\, e^{-j2\pi kl/B}\, s_n(i,k) + v(i,k)$.

In (11), the channel transfer functions of all users are in fact incorporated into a single vector with entries $\lambda_\kappa(i)$, $\kappa = 0, \dots, NL-1$.
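The decoupling property of the training scheme (10) can be checked with a small sketch. It builds the pilots of a few users, confirms that each one is a single time-domain impulse at $t = (n-1)L$, and then reads the channel taps of every user directly from a pilot-only, noise-free received symbol. The sizes, the pilot power, the example channels, the unitary $1/\sqrt{B}$ scaling of the inverse DFT, and the helper names are all assumptions made for this illustration; data symbols and noise are deliberately left out, so the estimates here are exact.

```python
import cmath
import math

B, N, L = 16, 3, 4        # assumed sizes; N * L must not exceed B
E_P = 0.25                # assumed pilot power


def pilot(n, k):
    """Frequency-domain pilot of user n (1-based) on subcarrier k, as in (10)."""
    return math.sqrt(E_P) * cmath.exp(-2j * math.pi * k * (n - 1) * L / B)


def idft(freq):
    """Unitary inverse DFT (1/sqrt(B) scaling, an assumption of this sketch)."""
    size = len(freq)
    return [
        sum(x * cmath.exp(2j * math.pi * k * t / size) for k, x in enumerate(freq))
        / math.sqrt(size)
        for t in range(size)
    ]


def circular_convolve(signal, taps):
    """Cyclic convolution of a length-B signal with a short tap vector."""
    size = len(signal)
    return [
        sum(h * signal[(t - l) % size] for l, h in enumerate(taps))
        for t in range(size)
    ]


# Hypothetical per-user channels, constant over the symbol (L taps each).
channels = {
    1: [1.0 + 0.2j, 0.5j, -0.3, 0.1],
    2: [0.8, -0.4 + 0.1j, 0.2j, 0.05],
    3: [0.6 - 0.6j, 0.3, 0.1 + 0.1j, -0.2j],
}

# Time-domain pilots: each should be a single impulse at t = (n - 1) * L.
pilots_time = {n: idft([pilot(n, k) for k in range(B)]) for n in channels}

# Pilot-only, noise-free received signal: sum over users of pilot (*) channel.
received = [0j] * B
for n, taps in channels.items():
    contribution = circular_convolve(pilots_time[n], taps)
    received = [r + c for r, c in zip(received, contribution)]

# Read the taps of user n from samples (n - 1) * L, ..., (n - 1) * L + L - 1.
scale = math.sqrt(E_P * B)
estimates = {
    n: [received[(n - 1) * L + l] / scale for l in range(L)] for n in channels
}
print(estimates)
```

With data symbols present, the same samples would also carry the information-sequence interference, which is why the per-symbol estimate is only an intermediate result.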
These entries follow the relationship $\lambda_{(n-1)L+l}(i) = \hbar_l(i)$, $l = 0, \dots, L-1$, $n = 1, \dots, N$. By (10)-(11), we have the IFFT modulated signals

$$\tilde{x}_n(i,t) = [\mathbf{F}^{-1}\mathbf{S}_n(i)]_{t,1} = x_n(i,t) + \sqrt{E_p B}\,\delta(t-(n-1)L), \quad n = 1, \dots, N, \tag{12}$$

where $x_n(i,t)$ is the IFFT modulated signal of the information sequence $s_n(i,k)$. The received signals (3) in the time domain can thus be obtained as

$$\begin{aligned} y(i,t) &= \sum_{n=1}^{N}\sum_{l=0}^{L-1} \hbar_l(i)\sqrt{E_p B}\,\delta(t-(n-1)L-l) + \sum_{n=1}^{N}\sum_{l=0}^{L-1} \hbar_l(i)\, x_n(i,t-l) + v(i,t) \\ &= \lambda_{(n-1)L+l}(i)\sqrt{E_p B}\,\delta(t-(n-1)L-l) + \varepsilon_{n,l}(i,t) + v(i,t), \quad n = 1, \dots, N, \end{aligned} \tag{13}$$

where $\varepsilon_{n,l}(i,t) = \sum_{n=1}^{N}\sum_{l=0}^{L-1} \hbar_l(i)\, x_n(i,t-l)$ is the interference to channel estimation due to the information sequence. Consequently, the channel estimation can be performed in the time domain as

$$\hat{\lambda}_{(n-1)L+l}(i) = \hat{\hbar}_l(i) = \hbar_l(i) + \frac{\sum_{n=1}^{N}\sum_{\kappa=0}^{L-1} \hbar_\kappa(i)\, x_n(i,(n-1)L-\kappa)}{\sqrt{E_p B}} + \frac{v(i,(n-1)L+l)}{\sqrt{E_p B}}, \quad i = 1, \dots, I. \tag{14}$$

3.2. Channel Estimation over Multiple OFDMA Symbols. From (14), we note that the information sequence interference (the second term of (14)) can hardly be neglected unless a large pilot power $E_p$ is used. The conventional ST schemes in [16–22, 24] average the channel estimates over multiple OFDM symbols (or training periods) to suppress the information sequence interference, which works when the channel is linearly time-invariant during the record length. This arithmetic averaging, however, is no longer feasible for the channel assumed in this paper, wherein the channel coefficients are time-varying over multiple OFDMA symbols. In this section, we develop a weighted average approach to suppress the information sequence interference over multiple OFDMA symbols, thus overcoming the shortcoming of conventional ST-based schemes for linearly time-varying channel estimation.

We take the LTV channel coefficient estimate of each OFDMA symbol, $\hat{\hbar}_l(i)$, $i = 1, \dots, I$, from (14) as a temporal result, and then form the vector $\hat{\boldsymbol{\hbar}}_l = [\hat{\hbar}_l(1), \dots, \hat{\hbar}_l(I)]^T$. Following the channel model in (7), we have

$$\boldsymbol{\hbar}_l = \boldsymbol{\eta}\,\mathbf{h}_{l,q} = \begin{bmatrix} e^{j2\pi(0-Q/2)t_1/\Omega} & \cdots & e^{j2\pi(Q-Q/2)t_1/\Omega} \\ \vdots & \ddots & \vdots \\ e^{j2\pi(0-Q/2)t_I/\Omega} & \cdots & e^{j2\pi(Q-Q/2)t_I/\Omega} \end{bmatrix} \begin{bmatrix} h_{l,0} \\ \vdots \\ h_{l,Q} \end{bmatrix}, \quad n = 1, \dots, N, \quad l = 0, \dots, L-1, \tag{15}$$

where $\mathbf{h}_{l,q} = [h_{l,0}, \dots, h_{l,Q}]^T$ collects the complex exponential coefficients modeling the LTV channel, and $\boldsymbol{\eta}$ is an $I \times (Q+1)$ matrix with $[\boldsymbol{\eta}]_{i,q} = e^{j2\pi(q-Q/2)t_i/\Omega}$. Thus, when $I \ge Q+1$, the matrix $\boldsymbol{\eta}$ is of full column rank, and the basis expansion model coefficients can be estimated by

$$\hat{\mathbf{h}}_{l,q} = \boldsymbol{\eta}^{+}\hat{\boldsymbol{\hbar}}_l, \quad l = 0, \dots, L-1. \tag{16}$$

Substituting $t_i = (i-1)B' + B/2$ into the matrix $\boldsymbol{\eta}$, we have the pseudoinverse matrix

$$[\boldsymbol{\eta}^{+}]_{q,i} = e^{-j2\pi(q-Q/2)((i-1)B'+B/2)/\Omega}/I. \tag{17}$$

By (16)-(17), the modeling coefficients are estimated over all OFDMA symbols of the frame and can be rewritten as

$$\hat{h}_{l,q} = \sum_{i=1}^{I} e^{-j2\pi(q-Q/2)t_i/\Omega}\,\hat{\hbar}_l(i)/I. \tag{18}$$

In fact, (18) is estimated over multiple OFDMA symbols with the weighting function $e^{-j2\pi(q-Q/2)t_i/\Omega}/I$. Similar to the averaging procedure of the LTI case [16–22, 24], it is thus anticipated that the weighted average estimation may also exhibit a considerable performance improvement for time-varying channels over a long frame $\Omega$.

Compared with conventional ST schemes, which are generally limited to the case of LTI channels [16–22, 24], the proposed weighted average approach can estimate the LTV channels of OFDMA uplink systems. The proposed channel estimation is composed of two steps. First, with the specially designed training signals in (10), we estimate the channel coefficients during each OFDMA symbol as temporal results. Second, the temporal channel estimates are further enhanced over multiple OFDMA symbols by using a weighted average procedure. That is, not only the target symbol but also the OFDMA symbols over the whole frame are invoked for channel estimation.

On the other hand, the proposed ST-based approach can be utilized to estimate the uplink channel over the whole frequency band, thus overcoming the shortcoming of FDM training methods [12–14], where channel estimation can only be performed within the subband of each user separately.

4. Channel Estimation Analysis

In this section, we analyze the performance of the proposed channel estimator in Section 3 and derive a closed-form
expression of the channel estimation variance, which can in turn be used for superimposed training power allocation. Before going further, we make the following assumptions.

(H1) The information sequence $\mathbf{S}_n(i)$ is equi-powered, finite-alphabet, i.i.d., with zero mean and variance $E_s$, and uncorrelated with the additive noise $\{v_n(i,t)\}$.

(H2) The LTV channel coefficients $\hbar_l$ are i.i.d. complex Gaussian variables.

The interference vector caused by the information sequence in (13)-(14) can be rewritten as

$$\begin{aligned} \boldsymbol{\varepsilon}(i) &= [\varepsilon_{1,0}(i), \dots, \varepsilon_{1,L-1}(i), \dots, \varepsilon_{N,0}(i), \dots, \varepsilon_{N,L-1}(i)]^T \\ &= \frac{1}{\sqrt{E_p B}}\Big[\sum_{n=1}^{N}\sum_{\kappa=0}^{L-1} \hbar_\kappa(i)\, x_n(i,B-\kappa), \dots, \sum_{n=1}^{N}\sum_{\kappa=0}^{L-1} \hbar_\kappa(i)\, x_n(i,(N-1)L+L-\kappa)\Big]^T. \end{aligned} \tag{19}$$

The additive noise vector is also given by

$$\boldsymbol{\upsilon}(i) = [\upsilon(i,0), \dots, \upsilon(i,NL-1)]^T = \frac{1}{\sqrt{E_p B}}[v(i,0), \dots, v(i,(n-1)L+l), \dots, v(i,NL-1)]^T. \tag{20}$$

By (H1), $v(i,t)$ is also independent of $\varepsilon_{n,l}(i)$. We first calculate the variance of $\upsilon(i,t)$ in (20) by

$$\mathrm{var}(\upsilon(i,t)) = \frac{1}{B E_p} E\{|v(i,t)|^2\} = \frac{\sigma_v^2}{B E_p}. \tag{21}$$

We also note that the estimation error $\varepsilon_{n,l}(i) = \sum_{n=1}^{N}\sum_{\kappa=0}^{L-1} \hbar_\kappa(i)\, x_n(i,(n-1)L-\kappa)$ is approximately Gaussian distributed for a large symbol size $B$. The estimation variance due to the information sequence interference, therefore, can be obtained as

$$\mathrm{var}(\varepsilon_{n,l}(i)) = E\{|\varepsilon_{n,l}(i)|^2\} = \frac{1}{B E_p}\sum_{l=0}^{L-1} |\hbar_l(i)|^2 E_s. \tag{22}$$

Since (22) depends upon the channel transfer functions (equivalently, the channel impulse response), we define the normalized variance as

$$\mathrm{nvar}(\varepsilon_{n,l}(i)) = \frac{1}{|\hbar(i)|^2}\,\mathrm{var}(\varepsilon_{n,l}(i)), \tag{23}$$

where $|\hbar(i)|^2 = \sum_{l=0}^{L-1} |\hbar_l(i)|^2 / L$. Following the definition (23), we obtain the normalized variance as

$$\mathrm{nvar}(\varepsilon_{n,l}(i)) = \frac{\mathrm{var}(\varepsilon_{n,l}(i))}{|\hbar(i)|^2} = \frac{E_s\sum_{l=0}^{L-1} |\hbar_l(i)|^2}{B E_p |\hbar(i)|^2} = \frac{L}{B}\,\frac{E_s}{E_p}. \tag{24}$$

From (24), we find that the estimation variance due to the information interference is directly proportional to the information-to-pilot power ratio $E_s/E_p$, which results in an inaccurate solution for the general case $E_p \ll E_s$.

We then analyze the estimation performance of (16)–(18) over multiple OFDMA symbols. Neglecting the modeling error, we use $\mathbf{h}_{l,q}$ to evaluate the channel estimation variance. Define

$$\boldsymbol{\varepsilon}_{n,l} = [\varepsilon_{n,l}(1), \dots, \varepsilon_{n,l}(I)]^T, \qquad \boldsymbol{\upsilon} = [\upsilon(1), \dots, \upsilon(I)]^T. \tag{25}$$

By (H1)-(H2), the MSE of the weighted average estimator is given by

$$\begin{aligned} \mathrm{MSE}^{(\mathrm{ave})} &\overset{\mathrm{def}}{=} E\{\|\mathbf{h}_{l,q} - \hat{\mathbf{h}}_{l,q}\|^2\} = E\{\|\boldsymbol{\eta}^{+}(\boldsymbol{\varepsilon}_{n,l} + \boldsymbol{\upsilon})\|^2\} \\ &= \mathrm{tr}\{\boldsymbol{\eta}^{+} E[\boldsymbol{\varepsilon}_{n,l}\boldsymbol{\varepsilon}_{n,l}^H](\boldsymbol{\eta}^{+})^H\} + \mathrm{tr}\{\boldsymbol{\eta}^{+} E[\boldsymbol{\upsilon}\boldsymbol{\upsilon}^H](\boldsymbol{\eta}^{+})^H\} \\ &= \frac{1}{I}\sum_{i=1}^{I}\big[\mathrm{var}(\upsilon(i)) + \mathrm{var}(\varepsilon_{n,l}(i))\big]\,\mathrm{tr}\big[(\boldsymbol{\eta}^H\boldsymbol{\eta})^{-1}\big]. \end{aligned} \tag{26}$$

Note that the column vectors of the matrix $\boldsymbol{\eta}$ in (15) are in fact FFT vectors of an $I \times I$ matrix; we thus have $\boldsymbol{\eta}^H\boldsymbol{\eta} = I\,\mathbf{I}_{Q+1}$ and $\mathrm{tr}[(\boldsymbol{\eta}^H\boldsymbol{\eta})^{-1}] = (Q+1)/I$. Substituting (21)-(22) into (26), we then obtain the variance of the weighted average estimate $\hat{h}_{l,q}$ associated with $\varepsilon_{n,l}(i)$, $i = 1, \dots, I$, as

$$\rho_{l,q} = \frac{(Q+1)E_s}{B I^2 E_p}\sum_{i=1}^{I}\sum_{l=0}^{L-1} |\hbar_l(i)|^2 = \frac{(Q+1)E_s}{\Omega I E_p}\sum_{i=1}^{I}\sum_{l=0}^{L-1} |\hbar_l(i)|^2. \tag{27}$$

By analogy, the variance due to the additive noise $\upsilon(i)$, $i = 1, \dots, I$, can also be derived as

$$E\{|\upsilon|^2\} = \frac{(Q+1)E_v}{B I E_p} = \frac{(Q+1)E_v}{\Omega E_p}. \tag{28}$$

Combining the variances in (27) and (28), we have the weighted average estimation variance

$$\mathrm{MSE}^{(\mathrm{ave})} = \frac{(Q+1)E_s}{\Omega I E_p}\sum_{i=1}^{I}\sum_{l=0}^{L-1} |\hbar_l(i)|^2 + \frac{(Q+1)E_v}{\Omega E_p}. \tag{29}$$

In (29), the last term is due to the additive noise. In general, since the LTV channel model satisfies $(Q+1)/\Omega \ll 1$, the additive noise is greatly suppressed by the weighted average procedure. On the other hand, the estimation variance due to the information sequence interference (the first term in (29)) may be the dominant component of the channel estimation error, especially at high SNR. Similar to (23), we derive the normalized variance of the information sequence interference by removing the channel gain:

$$\mathrm{nvar}(\rho_{l,q}) = \frac{1}{|\hbar|^2}\,\mathrm{var}(\rho_{l,q}). \tag{30}$$
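The two identities behind (26) and the closed form (29) lend themselves to a quick numerical check. The sketch below builds the matrix $\boldsymbol{\eta}$ of (15) for an assumed frame, verifies that $\boldsymbol{\eta}^H\boldsymbol{\eta}$ is $I$ times the identity (so the pseudoinverse reduces to the weighted average (18)), and evaluates (29) for given powers. The frame sizes, coefficient values, power values, and function names are assumptions introduced for this illustration only.

```python
import cmath
import math

B, L, I, Q = 16, 4, 8, 2        # assumed frame parameters
B_PRIME = B + L
OMEGA = I * B_PRIME
T_MID = [(i - 1) * B_PRIME + B / 2 for i in range(1, I + 1)]   # mid-samples t_i


def eta(i, q):
    """Entry of the I x (Q+1) matrix in (15) for symbol i (0-based) and basis q."""
    return cmath.exp(2j * math.pi * (q - Q / 2) * T_MID[i] / OMEGA)


def gram():
    """Gram matrix eta^H eta, expected to equal I times the identity."""
    return [
        [sum(eta(i, q).conjugate() * eta(i, r) for i in range(I))
         for r in range(Q + 1)]
        for q in range(Q + 1)
    ]


def weighted_average(per_symbol):
    """Weighted average (18): per-symbol tap values to basis coefficients."""
    return [
        sum(eta(i, q).conjugate() * per_symbol[i] for i in range(I)) / I
        for q in range(Q + 1)
    ]


def mse_ave(tap_energy_per_symbol, e_s, e_p, e_v):
    """Closed-form MSE (29); tap_energy_per_symbol[i] is the sum over l of |tap|^2."""
    interference = (Q + 1) * e_s * sum(tap_energy_per_symbol) / (OMEGA * I * e_p)
    noise = (Q + 1) * e_v / (OMEGA * e_p)
    return interference + noise


coeffs = [0.3 - 0.1j, 1.0 + 0.0j, -0.2 + 0.4j]        # assumed h_{l,q} values
per_symbol = [sum(c * eta(i, q) for q, c in enumerate(coeffs)) for i in range(I)]
recovered = weighted_average(per_symbol)
# The Gram matrix is diagonal here, so the trace of its inverse is the sum
# of the reciprocals of its diagonal entries.
trace_inv = sum(1 / gram()[q][q].real for q in range(Q + 1))
print(recovered, trace_inv, mse_ave([1.0] * I, e_s=1.0, e_p=0.25, e_v=0.1))
```

In this noise-free setting the weighted average returns the coefficients exactly; with interference and noise added to `per_symbol`, the same function would produce the estimate whose variance (29) describes.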
where || = i=1 l=0 |l (i)| /LI. From (29) and (30), it removed at OFDMA uplink receiver before recovering the
follows that data symbols
I L−1
! (Q + 1)Es |l (i)|
2

N
i=1 l=0
nvar ρl,q = 2 )U(i) = U(i) −
H(i)P n (i) = H(i)S(i) + Ξ(i) + v(i), (32)

BE p I 2 (31) n=1
L(Q + 1)Es B L(Q + 1) Es

= ≈ . where H(i) is an M × M matrix with the diagonal
ΩE p B Ω Ep elements being the estimated channel frequency-domain
From (31), the normalized variance is directly proportional
transfer function, that is, diag(H(i)) 0), . . . , H(i,
= [H(i, k),
L−1
to the information-pilot power ratio Es /E p and the ratio B − 1)] (with H(i,
. . . , H(i, T
k) = l=0 l (i)e − j2πkl/B ) and the
of the unknown parameter number L(Q + 1) over the
remaining entries being zeros. Ξ(i) = [H(i) − H(i)]P(i) is the
frame length Ω. In particular, with the specifically designed residual error of the superimposed pilots.
training sequence (10), the closed-form estimation variance Note that Ξ(i) is distributed over the whole frequency
(31) may provide a guideline for signal power allocation tone; whereas owing to the specifically designed training
at transmitter, for example, for a given threshold of the signals in (10), the time-domain received signals affected by
estimation variance $\varphi$ (channel gain has been normalized), the minimum training power $E_p$ should at least satisfy the approximate constraint $E_p \ge \varphi\Omega E_s/NL(Q+1)$.

Compared with the variances of channel estimation over one OFDMA symbol as in (22)–(24), the estimation variances (29)–(31) of the weighted average estimator (15)–(18) are significantly reduced owing to the fact that $\Omega/B(Q+1) \gg 1$. Theoretically, the weighted average operation can be considered an effective approach to estimating the LTV channel, since the information sequence interference can be effectively suppressed over multiple OFDMA symbols. As stated in the conventional ST-based schemes [16–22, 24], channel estimation performance improves as the recorded frame length $\Omega$ grows, that is, the estimation variance approaches zero as $\Omega \to \infty$. This is easily understood: a larger frame length $\Omega$ means more observation samples and hence a lower MSE level. From the LTV channel model (4), however, we note that as the frame length $\Omega$ is increased, the corresponding truncated DFB requires a larger order $Q$ to model the LTV channel (to maintain a tight channel model), and the minimum order should satisfy $Q/2 \ge f_d\Omega/f_s$, where $f_d$ and $f_s$ are the Doppler frequency and sampling rate, respectively [1–8]. Consequently, as the frame length $\Omega$ increases, the LTV channel estimation variance (31) approaches only a fixed lower bound associated with the system Doppler frequency as well as the information-pilot power ratio. This is quite different from ST training in estimating LTI channels [16–22, 24].

5. Iterative Symbol Detector

Unlike the FDM trainings [10, 12–15, 25], the pilot sequences in (10) are superimposed on the information sequences and thus interfere with information sequence recovery. The existing ST approaches [9, 11, 16–22, 24] merely focus on information sequence interference suppression, whereas little work addresses the cancellation of the ST effect for information sequence recovery. In this section, we provide a new iterative symbol detector to cancel the residual training effects on symbol recovery.

As in the symbol detection of the conventional ST-based approach, the contribution of the training sequences is firstly

the residual error is concentrated only during a sequence of sample periods $y(i,(n-1)L+\kappa)$, $\kappa = 0,\ldots,L-1$, $n = 1,\ldots,N$. In order to mitigate the residual error, a natural idea is to reconstruct the above time-domain signals at $t = (n-1)L+\kappa$, $\kappa = 0,\ldots,L-1$, $n = 1,\ldots,N$. In our proposed iterative method, we carry out the following steps.

Step 1. By (32), we perform zero-forcing equalization by

$$\hat{S}(i) = \left[\hat{S}_1(i),\ldots,\hat{S}_N(i)\right]^T = \left(H(i)\right)^{\dagger}\hat{U}(i). \tag{33}$$

The information symbols, owing to the finite alphabet set property, can be recovered by a hard detector as

$$\bar{s}_n(i,k) = \arg\min_{s_n(i,k)\in\Theta}\left|\hat{s}_n(i,k) - s_n(i,k)\right|^2, \tag{34}$$

where $\Theta$ is the finite alphabet set from which the transmitted data are drawn, for example, 4-PSK and 8-PSK signals, and so forth.

Step 2. Reconstructing the time-domain received signal vectors with the estimated channel coefficients in (16) and the data sequences in (34), respectively, we obtain

$$\hat{Y}(i) = \left[\hat{y}(i,0),\ldots,\hat{y}(i,t),\ldots,\hat{y}(i,B-1)\right]^T = F^{-1}\hat{U}(i). \tag{35}$$

Step 3. Replacing the contaminated signals $y(i,(n-1)L+\kappa)$ by the reconstructed signals $\hat{y}(i,(n-1)L+\kappa)$ in (35), the received signal vector is then updated by

$$\tilde{Y}(i) = \left[\hat{y}(i,0),\hat{y}(i,1),\ldots,\hat{y}(i,(n-1)L+\kappa),\ldots,\hat{y}(i,(N-1)L+L-1),\,y(i,NL),\ldots,y(i,B-1)\right]^T. \tag{36}$$

Step 4. Using the updated signals in (36), we detect the information symbols by (32)–(36) in the forthcoming iteration.

Step 5. Repeat Steps 1–4 until the change in the improved SER performance over successive iterations is below a given threshold.
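The hard decision in (34) and the stopping rule of Step 5 can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `psk_alphabet`, `hard_detect`, and `should_stop` are hypothetical, the alphabet is assumed to be unit-energy PSK, and the SER history is assumed to be available (for example, from a simulation with known transmitted symbols).

```python
import cmath


def psk_alphabet(order):
    """Unit-energy PSK constellation with `order` points (e.g. 4 or 8)."""
    return [cmath.exp(2j * cmath.pi * m / order) for m in range(order)]


def hard_detect(soft_symbols, alphabet):
    """Map each soft (equalized) symbol to the alphabet point at minimum
    squared Euclidean distance, in the spirit of (34)."""
    return [min(alphabet, key=lambda a: abs(s - a) ** 2) for s in soft_symbols]


def should_stop(ser_history, threshold):
    """Step 5 style rule: stop once the SER change between the last two
    iterations falls below `threshold` (assumed rule, for illustration)."""
    return len(ser_history) >= 2 and abs(ser_history[-2] - ser_history[-1]) < threshold


# Usage: three noisy QPSK observations mapped back onto the alphabet.
qpsk = psk_alphabet(4)
decisions = hard_detect([0.9 + 0.1j, -0.2 + 1.1j, -0.8 - 0.3j], qpsk)
```

In a full detector, `hard_detect` would be applied to the zero-forcing output of (33) in every iteration, and the loop over Steps 1–4 would end when `should_stop` returns true.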
When the SER of the initial hard detector in (34) is lower than a certain threshold, the reconstructed signals in the current iteration should be closer to the original signals $\tilde{y}(i,(n-1)L+\kappa)$ than those of the previous iteration, that is, $|y_{\mathrm{cur}}(i,(n-1)L+\kappa) - \tilde{y}(i,(n-1)L+\kappa)| < |y_{\mathrm{pre}}(i,(n-1)L+\kappa) - \tilde{y}(i,(n-1)L+\kappa)|$, where $\tilde{y}(i,t)$ is the pure IFFT-modulated information signal of $U(i) = \sum_{n=1}^{N} H(i)S_n(i)$, and $y_{\mathrm{cur}}(i,(n-1)L+\kappa)$ and $y_{\mathrm{pre}}(i,(n-1)L+\kappa)$, $\kappa = 0,\ldots,L-1$, are the reconstructed signals by (36) in the current and previous iterations, respectively. Additionally, the number of iterations depends crucially on the size of the reconstructed signals over one OFDMA symbol period, that is, $\tau = NL/B$. Based on experimental studies, the proposed iterative method should satisfy the constraint $\tau \le 0.2$. Commonly, such a constraint can be satisfied in practical implementations by simply adjusting the total frequency bandwidth and the number of active users.

Obviously, the SER performance degradation owing to the residual effect of superimposed training is mitigated by the proposed iterative approach. Compared with conventional ST methods [9, 11, 16–22, 24], the iterative scheme offers an alternative way to enhance the channel estimation performance by using a large training power $E_p$ without sacrificing SER performance.

6. Simulation Results and Discussion

In this section, we present numerical examples to validate our analytical results. We assume an OFDMA uplink system with $B = 512$, in which all subcarriers are equally divided into $N = 4$ subbands that are assigned to four users. The transmitted data symbols $s_n(i,k)$ are QPSK signals with symbol rate $f_s = 10^7$/second. The channel is assumed to have $L = 10$, and the coefficients $h_{n,l}(t)$ are generated as low-pass, Gaussian, zero-mean random processes, correlated in time with the correlation functions according to Jakes' model $r_n(\tau) = \mu_n^2 J_0(2\pi f_n\tau)$, $n = 1,\ldots,4$, where $f_n$ is the Doppler frequency associated with the $n$th user. CP length is chosen to be 15 to avoid intersymbol interference. The additive noise is a Gaussian and white random process with zero mean.

We run simulations with the Doppler frequency $f_n = 300$ Hz, which corresponds to a maximum mobility speed of 162 km/h when the users operate at a carrier frequency of 2 GHz. In order to model the LTV channel, the frame is designed as $\Omega = (B + \text{CP length}) \times 256 = 136192$, that is, each frame consists of 256 OFDMA symbols. During the frame, the channel variation is $f_n\Omega/f_s = 4.1$. Notice that the channel variation during an OFDM symbol is $f_n B/f_s = 0.0154$, and thus can be neglected. Over the total frame $\Omega$, we utilize the truncated DFB of order $Q = 10$ to model the LTV channel coefficients. The LTV channel modeled by the truncated DFB, however, exhibits modeling errors at the outermost samples. A possible explanation is that as the Fourier basis expansions are truncated in (4), an effect similar to the Gibbs phenomenon, together with spectral leakage, may lead to modeling inaccuracy at the beginning and the end of the frame [3, 5, 7–9]. To circumvent the problem, the frames are designed to partially overlap, for example, $(l-1)\Omega - \gamma B \le t \le l\Omega$, $l = 2, 3, \ldots$, where $\gamma$ is a positive integer. By the frame overlap, the LTV channel at the beginning and the end of the frame can be modeled and estimated accurately from the neighboring frames.

To evaluate the proposed channel estimator, we resort to the MSE of channel estimation to measure the estimation performance, which is defined as

$$\mathrm{MSE} = \frac{\sum_{i=1}^{\Omega/B}\mathrm{MSE}(i)}{\Omega/B} = \frac{B}{\Omega}\sum_{i=1}^{\Omega/B} E\left\{\frac{\sum_{t=0}^{B-1}\sum_{l=0}^{L-1}\left|h_l(i,t) - \sum_{q=0}^{Q}\hat{h}_{l,q}\,e^{j2\pi(q-Q/2)t/\Omega}\right|^2}{BL\,|h_l(i,t)|^2}\right\}, \tag{37}$$

where $\mathrm{MSE}(i)$ denotes the MSE of the $i$th OFDMA symbol.

6.1. Channel Estimation. We firstly examine the ST-based weighted channel estimation scheme under different IPR to verify the channel estimation variance analysis in Figure 3. From Figure 3, the MSE curves are almost independent of the additive white Gaussian noise, especially for SNR > 5 dB, since the additive noise has been greatly suppressed by the weighted average procedure. In addition, the results shown in Figure 3 are consistent with the closed-form estimation variance as formulated in (29)–(31), wherein the estimation variances are directly proportional to the unknown parameter $L(Q+1)$ and inversely proportional to information-to-pilot power ratio $E_s/E_p$, respectively.

Figure 3: MSE versus SNR, with the LTV channel of $f_n = 300$ Hz and $\Omega = 13.62$ milliseconds under the different IPR and system unknowns $NL$. (Axes: mean square error (MSE) versus signal-noise ratio (dB); curves for $E_p = 0.1E_s$ and $E_p = 0.01E_s$, each with $NL = 40$ and $NL = 20$.)

Then, we compare the developed channel estimator with the conventional ST-based method under the different
Doppler frequencies. It is clear from Figure 4 that our estimation approach achieves performance indistinguishable from that of the conventional ST-based scheme in estimating the LTI channel of $f_n = 0$ Hz, and the MSE level is significantly reduced as the averaging length increases. However, the shortcoming of conventional ST appears when the channel being estimated is linearly time-varying. Comparatively, by using the weighted average procedure, our proposed approach performs well for LTV channel estimation at different Doppler frequencies, that is, $f_n = 100$ Hz and 300 Hz. On the other hand, we also observe that as the frame length $\Omega$ increases, the MSE approaches a constant (lower bound) that is associated with the Doppler frequency. This agrees with the theoretical analysis in Section 4.

Figure 4: MSE versus frame length under the different Doppler frequencies, with $\Omega = 13.62$ milliseconds, $E_p = 0.01E_s$, $NL = 40$, and SNR = 20 dB. (Horizontal axis: OFDMA symbol number of total frame; curves for conventional ST and weighted average, each at $f_d = 0$ Hz, 100 Hz, and 300 Hz.)

Figure 5 displays the comparison between the proposed algorithm and the channel estimator of [14], wherein the uplink channel over the whole frequency band is reconstructed with the aid of estimated subband channel transfer functions. Owing to the time variation of channel coefficients between OFDMA symbols, the channel estimation performed in [14] is required in each separate OFDMA symbol. Since the total number of known pilots should be larger than or at least equal to the total channel unknowns $NL = 40$, 64 pilot tones (with 16 pilot symbols in each subband of an individual user) are utilized within one OFDMA symbol. Correspondingly, 12.5% of the total bandwidth is spent on transmitting the pilot symbols. Comparatively, the proposed ST-based channel estimation approach, without entailing any additional bandwidth or constraint, outperforms the FDM training-based estimator [14] by using a small pilot power of $E_p = 0.02E_s$. Furthermore, the iterative method developed in [24] can be directly employed herein to further improve the estimation performance of our algorithm.

Figure 5: Comparison between the proposed estimation algorithm and that of [14] with $f_d = 300$ Hz. (Horizontal axis: signal-noise ratio (dB); curves for the FDM training based channel estimator [22] and the proposed channel estimator with $E_p = 0.01E_s$ and $E_p = 0.02E_s$.)

6.2. Symbol Detection. As aforementioned, symbol detection in the demodulator of ST-based schemes [9, 11, 16–22, 24] is affected by the residual contribution of embedded pilot symbols. Herein, we carry out simulation experiments to assess the effectiveness of the proposed iterative symbol detector.

Figure 6: SER versus SNR for different demodulators with $E_p = 0.01E_s$ and $f_d = 300$ Hz. (Axes: SER versus signal-noise ratio (dB); curves for conventional ST, the proposed iterative detector, and the FDM training scheme [22].)

Figure 6 illustrates the SER performance versus SNR with IPR as $E_p = 0.01E_s$. As shown in Figure 6, although the
channel estimator performs well in estimating the LTV channel coefficients, the conventional demodulator still exhibits poor SER performance owing to the residual error of the embedded training sequences. In contrast, with the proposed iterative mitigation procedure, the demodulator achieves a considerable gain over the conventional ST-based method. This confirms that the above-mentioned residual interference can be effectively mitigated with the developed iterative approach. As a comparison, we also list the SER performance based on the FDM training scheme [14], where information sequences and pilot symbols are frequency-division multiplexed and the symbol detection can thus be performed without additional pilot interference. We observe that the performance of the two demodulators is in general indistinguishable (15 dB∼25 dB), which confirms that the effects of the above-mentioned residual training on information sequence recovery have been effectively cancelled by the proposed iterative approach.

Figure 7 depicts the SER performance under different reconstructed signal sizes over one OFDMA symbol period, that is, $\tau = NL/B$. As stated in Section 5, the minimum number of iterations needed to achieve a steady SER performance depends crucially on the above constraint $\tau$. It is observed that when $\tau = NL/B \le 10\%$, a significant SER performance improvement is achieved in the very first iterations (the first 2∼3 iterations). Meanwhile, the number of iterations required to achieve the steady-state SER performance increases with $\tau$. For the situation that $NL/B > 20\%$, the iterative cancellation may not converge and the SER remains at a high level. Therefore, $\tau \le 0.2$ can be approximately considered as the upper bound for the implementation of the proposed iterative detection approach.

Figure 7: SER of the iterative symbol detection versus the iteration number under SNR = 24 dB, $E_p = 0.01E_s$. (Axes: symbol error rate (SER) versus iteration number; curves for $NL/B = 20/512$, $40/512$, and $80/512$.)

6.3. Complexity Analysis. The description of the proposed channel estimation method in Section 3 shows that the overall complexity comes from the complex matrix pseudoinverse operation in (16). Note that (16) can be reduced to a weighted average process in (18). Thus, compared to the ST-based estimator within one OFDMA symbol (13), only $(Q + I + 1)$ additional complex multiplications and $(Q + I)$ complex additions are required to obtain the accurate time-domain CSI $h_l(t)$ of uplink OFDMA systems.

7. Conclusion

In this paper, we have developed a new method for estimating the LTV channels of uplink OFDMA systems by using superimposed training. We extend conventional LTI-based ST schemes to the case where the channel coefficient is linearly time-varying. By resorting to the truncated discrete Fourier bases (DFBs) to model the LTV channel, we adopt a two-step approach to estimate the time-varying channel coefficients over multiple OFDMA symbols. We also present a performance analysis of the channel estimation approach and derive a closed-form expression for the channel estimation variances. It is shown that the estimation variances, unlike in conventional superimposed training, approach a fixed lower bound that can only be reduced by increasing the pilot power. In addition, an iterative symbol detector was presented to mitigate the superimposed training effects on information sequence recovery, thereby offering an alternative way to enhance the channel estimation performance by using a large training power without sacrificing SER performance. Compared with the existing FDM training schemes, the new estimator can estimate the channel transfer function over the whole frequency band without a loss of rate, and thus enables a higher efficiency with an advantage for adaptive system resource allocation.

Acknowledgments

The authors would like to thank the editor and the reviewers for their helpful comments. This work is supported by the National Natural Science Foundation of China (NSFC), Grant 60772132, Key Project of Natural Science Foundation of Guangdong Province, Grant 8251027501000011, Science & Technology Project of Guangdong Province, Grant 2007B010200055, Industry-Universities-Research Cooperation Project of Guangdong Province and Ministry of Education of China, Grant 2007A090302116, and also supported in part by joint foundation of NSFC and Guangdong Province U0635003.

References

[1] IEEE LAN/MAN Standards Committee, “IEEE 802.16e: air interface for fixed and mobile broadband wireless access systems,” 2005.
[2] 3GPP TR 25.913 (V7.3.0), “Requirements for evolved UTRA (E-UTRA) and evolved UTRAN (E-UTRAN),” March 2006.
[3] G. B. Giannakis and C. Tepedelenlioğlu, “Basis expansion models and diversity techniques for blind identification and equalization of time-varying channels,” Proceedings of the IEEE, vol. 86, no. 10, pp. 1969–1986, 1998.
[4] T. Zemen and C. F. Mecklenbräuker, “Time-variant channel estimation using discrete prolate spheroidal sequences,” IEEE Transactions on Signal Processing, vol. 53, no. 9, pp. 3597–3607, 2005.
[5] Z. Tang, R. C. Cannizzaro, G. Leus, and P. Banelli, “Pilot-assisted time-varying channel estimation for OFDM systems,” IEEE Transactions on Signal Processing, vol. 55, no. 5, part 2, pp. 2226–2238, 2007.
[6] W.-S. Hou and B.-S. Chen, “ICI cancellation for OFDM communication systems in time-varying multipath fading channels,” IEEE Transactions on Wireless Communications, vol. 4, no. 5, pp. 2100–2110, 2005.
[7] X. Dai, “Optimal training design for linearly time-varying MIMO/OFDM channels modelled by a complex exponential basis expansion,” IET Communications, vol. 1, no. 5, pp. 945–953, 2007.
[8] X. Ma, G. B. Giannakis, and B. Lu, “Block differential encoding for rapidly fading channels,” IEEE Transactions on Communications, vol. 52, no. 3, pp. 416–425, 2004.
[9] J. K. Tugnait and S. He, “Doubly-selective channel estimation using data-dependent superimposed training and exponential basis models,” IEEE Transactions on Wireless Communications, vol. 6, no. 11, pp. 3877–3883, 2007.
[10] K.-C. Hung and D. W. Lin, “Optimal delay estimation for phase-rotated linearly interpolative channel estimation in OFDM and OFDMA systems,” IEEE Signal Processing Letters, vol. 15, pp. 349–352, 2008.
[11] M. Ghogho and A. Swami, “Estimation of doubly-selective channels in block transmissions using data-dependent superimposed training,” in Proceedings of the European Signal Processing Conference (EUSIPCO ’06), Florence, Italy, September 2006.
[12] P. Fertl and G. Matz, “Multi-user channel estimation in OFDMA uplink systems based on irregular sampling and reduced pilot overhead,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’07), vol. 3, pp. 297–300, Honolulu, Hawaii, USA, April 2007.
[13] M. R. Raghavendra, E. Lior, S. Bhashyam, and K. Giridhar, “Parametric channel estimation for pseudo-random tile-allocation in uplink OFDMA,” IEEE Transactions on Signal Processing, vol. 55, no. 11, pp. 5370–5381, 2007.
[14] K. Hayashi and H. Sakai, “Uplink channel estimation for OFDMA system,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’07), vol. 3, pp. 285–288, Honolulu, Hawaii, USA, April 2007.
[15] Y. Ma and R. Tafazolli, “Channel estimation for OFDMA uplink: a hybrid of linear and BEM interpolation approach,” IEEE Transactions on Signal Processing, vol. 55, no. 4, pp. 1568–1573, 2007.
[16] G. T. Zhou, M. Viberg, and T. McKelvey, “A first-order statistical method for channel estimation,” IEEE Signal Processing Letters, vol. 10, no. 3, pp. 57–60, 2003.
[17] J. K. Tugnait and W. Luo, “On channel estimation using superimposed training and first-order statistics,” IEEE Communications Letters, vol. 7, no. 9, pp. 413–415, 2003.
[18] A. G. Orozco-Lugo, M. M. Lara, and D. C. McLernon, “Channel estimation using implicit training,” IEEE Transactions on Signal Processing, vol. 52, no. 1, pp. 240–254, 2004.
[19] Q. Yang and K. S. Kwak, “Superimposed-pilot-aided channel estimation for mobile OFDM,” Electronics Letters, vol. 42, no. 12, pp. 722–724, 2006.
[20] S. He, J. K. Tugnait, and X. Meng, “On superimposed training for MIMO channel estimation and symbol detection,” IEEE Transactions on Signal Processing, vol. 55, no. 6, part 2, pp. 3007–3021, 2007.
[21] N. Chen and G. T. Zhou, “Superimposed training for OFDM: a peak-to-average power ratio analysis,” IEEE Transactions on Signal Processing, vol. 54, no. 6, part 1, pp. 2277–2287, 2006.
[22] T. Cui and C. Tellambura, “Pilot symbols for channel estimation in OFDM systems,” in Proceedings of the IEEE Global Telecommunications Conference (GLOBECOM ’05), vol. 4, pp. 2229–2233, St. Louis, Mo, USA, November 2005.
[23] M. Ghogho, D. McLernon, E. Alameda-Hernandez, and A. Swami, “Channel estimation and symbol detection for block transmission using data-dependent superimposed training,” IEEE Signal Processing Letters, vol. 12, no. 3, pp. 226–229, 2005.
[24] T.-J. Liang, W. Rave, and G. Fettweis, “Iterative joint channel estimation and decoding using superimposed pilots in OFDM-WLAN,” in Proceedings of the IEEE International Conference on Communications (ICC ’06), vol. 7, pp. 3140–3145, Istanbul, Turkey, July 2006.
[25] I. Barhumi, G. Leus, and M. Moonen, “Optimal training design for MIMO OFDM systems in mobile wireless channels,” IEEE Transactions on Signal Processing, vol. 51, no. 6, pp. 1615–1624, 2003.
doi:10.1155/2009/647130
Research Article
DFT-Based Channel Estimation with Symmetric
Extension for OFDMA Systems
Yi Wang,1, 2 Lihua Li,1, 2 Ping Zhang,1, 2 and Zemin Liu1, 2

1 Key Laboratory of Universal Wireless Communications, Beijing University of Posts and Telecommunication,
Ministry of Education, Beijing 100876, China
2 Wireless Technology Innovation Institute, Beijing University of Posts and Telecommunication, Beijing 100876, China
Correspondence should be addressed to Yi Wang, wangyi81@gmail.com
Recommended by Yan Zhang
A novel partial frequency response channel estimator is proposed for OFDMA systems. First, the partial frequency response is
obtained by least square (LS) method. The conventional discrete Fourier transform (DFT) method will eliminate the noise in time
domain. However, after inverse discrete Fourier transform (IDFT) of partial frequency response, the channel impulse response
will leak to all taps. As the leakage power and noise are mixed up, the conventional method will not only eliminate the noise, but
also lose the useful leaked channel impulse response and result in mean square error (MSE) floor. In order to reduce MSE of the
conventional DFT estimator, we have proposed the novel symmetric extension method to reduce the leakage power. The estimates
of partial frequency response are extended symmetrically. After IDFT of the symmetric extended signal, the leakage power of
channel impulse response is self-cancelled efficiently. Then, the noise power can be eliminated with very small leakage power loss.
The computational complexity is very small, and the simulation results show that the accuracy of our estimator has increased
significantly compared with the conventional DFT-based channel estimator.
Copyright © 2009 Yi Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which
permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. Introduction

Orthogonal frequency-division multiplexing (OFDM) is an effective technique for combating multipath fading and for high-bit-rate transmission over mobile wireless channels. In an OFDM system, the entire channel is divided into many narrow subchannels, which are transmitted in parallel, thereby increasing the symbol duration and reducing the ISI.

Channel estimation has been successfully used to improve the performance of OFDM systems. It is crucial for diversity combination, coherent detection, and space-time coding. Various OFDM channel estimation schemes have been proposed in the literature. The LS or the linear minimum mean square error (LMMSE) estimation was proposed in [1]. Reference [2] also proposed a low-complexity LMMSE estimation method by partitioning the channel covariance matrix into some small matrices on the basis of coherent bandwidth. However, these modified LMMSE methods still have quite high computational complexity for practical implementation and require exact channel covariance matrices. Reference [3] introduced additional DFT processing to obtain the frequency response of the LS-estimated channel. In contrast to the frequency-domain estimation, the transform-domain estimation method uses the time-domain properties of channels. Since a channel impulse response is not longer than the guard interval in an OFDM system, the LS and the LMMSE were modified in [4, 5] by limiting the number of channel taps in time domain. References [6, 7] compared the performance of various channel estimation methods and showed that the DFT-based estimation can achieve significant performance benefits if the maximum channel delay is known. References [8–11] improved upon this idea by considering only the most significant channel taps. Reference [12] further investigated how to eliminate the noise on the insignificant taps by an optimal threshold.

However, in many applications such as OFDMA systems, only the estimates of partial frequency response are available, and the estimate of channel impulse response in time domain
cannot be obtained from the conventional DFT method. After IDFT of the partial frequency response, the channel impulse response will leak to all taps in time domain. As the noise and leakage power are mixed up, the conventional DFT method will not only eliminate the noise, but also lose the useful channel leakage power and result in an MSE floor. We have proposed the novel symmetric extension method to reduce the leakage power. The mathematical expression of the MSE of the conventional DFT estimator and the upper bound of the MSE of our proposed estimator are derived in this paper.

The rest of the paper is organized as follows. Section 2 describes the system model and briefly introduces the statistics of the mobile wireless channel. Section 3 proposes the novel channel-estimation approach for OFDMA systems. Section 4 presents computer simulation results to demonstrate the effectiveness of the proposed estimation approach. Finally, a conclusion is given in Section 5.

2. System and Channel Model

Consider an OFDMA system that has $N$ subcarriers. The data stream is modulated by inverse fast Fourier transform (IFFT), and a guard interval is added for every OFDM symbol to eliminate ISI caused by the multipath fading channel. At the receiver, with the $i$th OFDM symbol, the $k$th subcarrier of the received signal is denoted as

$$Y_{k,i} = H_{k,i}\cdot X_{k,i} + N_{k,i}, \tag{1}$$

where $X_{k,i}$ are the pilot subcarriers (for simplicity, it is assumed that $|X_{k,i}| = 1$), $H_{k,i}$ represents the channel frequency response on the $k$th subcarrier, and $N_{k,i}$ is the AWGN with zero mean and variance $\sigma^2$.

The complex baseband representation of the mobile wireless channel impulse response can be described by [13]

$$h(t,\tau) = \sum_{k}\gamma_k(t)\,c(\tau-\tau_k), \tag{2}$$

where $\tau_k$ is the delay of the $k$th path, $\gamma_k(t)$ is the corresponding complex amplitude, and $c(t)$ is the shaping pulse. For OFDM systems with proper cyclic extension and timing, it has been shown in [14] that the channel frequency response can be expressed as

$$H_{i,k} \triangleq \sum_{l=0}^{L-1} h_{i,l}\,e^{-j(2\pi kl/N)}, \tag{3}$$

where $h_{i,l} \triangleq h(iT_f, l(T_s/N))$, and $T_f$ and $T_s$ in the above expression are the block length and the symbol duration, respectively. In (3), $h_{i,l}$, for $l = 0, 1, \ldots, L-1$, are WSS narrowband complex Gaussian processes. $L$ is the number of multipath taps. The average power of $h_{i,l}$ and $L$ depend on the delay profile and dispersion of the wireless channels.

3. Channel Estimation Based on Symmetric Extension

3.1. Conventional DFT Method. For simplicity, the index $i$ is omitted in the following formulation. The LS channel estimator is denoted as

$$\hat{H}_k^{LS} = \frac{Y_k}{X_k} = H_k + \frac{N_k}{X_k}, \quad 0 \le k \le N-1. \tag{4}$$

After IFFT, the time-domain expression of $\hat{H}_k^{LS}$ is denoted as

$$\hat{h}_n^{LS} = \frac{1}{N}\sum_{k=0}^{N-1}\hat{H}_k^{LS}\,e^{j(2\pi/N)kn} = \frac{1}{N}\sum_{k=0}^{N-1}\left(H_k + \frac{N_k}{X_k}\right)e^{j(2\pi/N)kn} = h_n + z_n, \tag{5}$$

where $h_n$ is the channel impulse response on the $n$th path and $z_n = (1/N)\sum_{k=0}^{N-1}(N_k/X_k)\,e^{j(2\pi/N)kn}$. Most mobile wireless channels are characterized by discrete multipath arrivals, that is, the magnitude of $h_n$ for most $n$ is zero or very small; hence, these channel taps can be ignored. Let $L_{GP}$ denote the length of the guard interval; then the maximum length of nonzero $h_n$ is $L_{GP}$, and $h_n = 0$ for $L_{GP} \le n \le N-1$. In the conventional DFT method, in order to eliminate the noise,

$$\hat{h}_n^{DFT} = \begin{cases}\hat{h}_n^{LS}, & n = 0,\ldots,L_{GP}-1,\\ 0, & n = L_{GP},\ldots,N-1.\end{cases} \tag{6}$$

The estimate of frequency response is denoted as

$$\hat{H}_k^{DFT} \triangleq \sum_{n=0}^{L-1}\hat{h}_n^{DFT}\,e^{-j(2\pi/N)nk}. \tag{7}$$

The basic block diagram of DFT-based estimation is shown in Figure 1.

Figure 1: Block diagram of the conventional DFT-based channel estimation (LS channel estimation, N-point IDFT, zeroing of the taps beyond the first $L$, N-point DFT).

3.2. Partial Frequency Response by Conventional DFT. In an OFDMA system, as the pilot only occupies part of the total subcarriers, we can only get the estimates of the partial frequency response, which is denoted as

$$\hat{H}_k^{partial} = \hat{H}_{k+M_1}^{LS}, \quad k = 0,\ldots,M-1, \tag{8}$$
where $M$ is the length of the partial frequency response. For simplicity, we consider $M_1 = 0$ in this paper. However, with only minor modification, the result discussed here is applicable to any $M_1$. The $M$-point IFFT result of $\hat{H}_k^{partial}$ is denoted as

$$\hat{h}_n^{partial} = \frac{1}{M}\sum_{k=0}^{M-1}\hat{H}_k^{partial}\,e^{j(2\pi kn/M)} = \frac{1}{M}\sum_{k=0}^{M-1}H_k\,e^{j(2\pi kn/M)} + \frac{1}{M}\sum_{k=0}^{M-1}\frac{N_k}{X_k}\,e^{j(2\pi kn/M)} = h_n^{partial} + z_n^{partial}, \tag{9}$$

where $z_n^{partial} = (1/M)\sum_{k=0}^{M-1}(N_k/X_k)\,e^{j(2\pi kn/M)}$, and $h_n^{partial}$ is denoted as

$$h_n^{partial} = \frac{1}{M}\sum_{k=0}^{M-1}H_k\,e^{j(2\pi kn/M)} = \frac{1}{M}\sum_{k=0}^{M-1}\sum_{l=0}^{L-1}h_l\,e^{-j(2\pi kl/N)}\,e^{j(2\pi kn/M)} = \frac{1}{M}\sum_{l=0}^{L-1}h_l\,C_{partial}(n,l,M,N), \tag{10}$$

where $C_{partial}(n,l,M,N) = \sum_{k=0}^{M-1}e^{j(2\pi n/M - 2\pi l/N)k}$. From (10), it can be seen that the channel impulse response $h_n$ will leak to all taps of $h_n^{partial}$. The conventional DFT method is no longer applicable as $h_n^{partial}$ will be nonzero due to the power leakage; the noise and leakage power are mixed up. The elimination of noise will also cause the loss of useful channel impulse response leakage.

It is assumed that each path is an independent zero-mean complex Gaussian random process. The leakage power-to-noise power ratio (LNR) on the $n$th tap in the conventional DFT method can be denoted as

$$\mathrm{LNR}_n^{partial} = \frac{E\left|h_n^{partial}\right|^2}{E\left|z_n^{partial}\right|^2} = \frac{\sum_{l=0}^{L-1}\sigma_l^2\left|C_{partial}(n,l,M,N)\right|^2}{M\sigma^2}, \tag{11}$$

where $\sigma_l^2$ is the average power of the $l$th path. As the channel power mainly focuses on the low-frequency band, in order to eliminate the noise in the high-frequency band, let $L_{partial}$ denote the threshold, and the noise is eliminated by the conventional DFT method,

$$\hat{g}_n^{partial} = \begin{cases}\hat{h}_n^{partial}, & 0 \le n \le L_{partial}-1 \ \text{or}\ M-L_{partial} \le n \le M-1,\\ 0, & L_{partial} \le n \le M-1-L_{partial}.\end{cases} \tag{12}$$

The corresponding estimate of partial frequency response is denoted as

$$\hat{U}_k^{partial} = \sum_{n=0}^{M-1}\hat{g}_n^{partial}\,e^{-j(2\pi kn/M)}, \quad k = 0,\ldots,M-1. \tag{13}$$

The basic block diagram of partial frequency response DFT-based estimation is shown in Figure 2.

Figure 2: Block diagram of the partial frequency response DFT-based channel estimation (LS channel estimation, M-point IDFT, zeroing of the taps from $L_{partial}$ to $M-1-L_{partial}$, M-point DFT).

3.3. Partial Frequency Response Estimation by Symmetric Extension Method. As $H_k$, $0 \le k \le N-1$, are the samples of the continuous and periodic channel frequency response, in time domain, the IFFT result of $H_k$, $0 \le k \le N-1$, will only concentrate on a few taps. However, the IFFT result of the partial frequency response samples $H_k$, $0 \le k \le M-1$, will leak to all taps. This is because $H_k$, $0 \le k \le M-1$, are samples of the partial frequency response, and after periodic expansion, the continuity of the signal is severely destroyed. If the leakage power is reduced significantly compared with the noise power, the noise can still be eliminated efficiently with very small loss of leakage power. Inspired by this, in order to reduce the leakage power, we have proposed the novel symmetric extension method to construct a new sequence with better continuity. $\hat{H}_k^{partial}$ is extended with a symmetric copy of itself, and the symmetrically extended signal is denoted as

$$\hat{H}_k^{symmetric} = \begin{cases}\hat{H}_k^{partial}, & 0 \le k \le M-1,\\ \hat{H}_{2M-1-k}^{partial}, & M \le k \le 2M-1.\end{cases} \tag{14}$$

After the $2M$-point IFFT, the time-domain expression of $\hat{H}_k^{symmetric}$ is denoted as $\hat{h}_n^{symmetric}$:

$$\begin{aligned}\hat{h}_n^{symmetric} &= \frac{1}{2M}\sum_{k=0}^{2M-1}\hat{H}_k^{symmetric}\,e^{j(2\pi kn/2M)}\\ &= \frac{1}{2M}\sum_{k=0}^{M-1}H_k\left(e^{j(2\pi n/2M)k} + e^{j(2\pi n/2M)(2M-1-k)}\right) + \frac{1}{2M}\sum_{k=0}^{M-1}\frac{N_k}{X_k}\left(e^{j(2\pi n/2M)k} + e^{j(2\pi n/2M)(2M-1-k)}\right)\\ &= h_n^{symmetric} + z_n^{symmetric},\end{aligned} \tag{15}$$
where $z_n^{symmetric} = (1/2M)\sum_{k=0}^{M-1}(N_k/X_k)\left(e^{j(2\pi n/2M)k} + e^{j(2\pi n/2M)(2M-1-k)}\right)$, and $h_n^{symmetric}$ is denoted as

$$\begin{aligned}h_n^{symmetric} &= \frac{1}{2M}\sum_{k=0}^{2M-1}H_k^{symmetric}\,e^{j(2\pi kn/2M)}\\ &= \frac{1}{2M}\sum_{k=0}^{M-1}H_k\left(e^{j(2\pi n/2M)k} + e^{j(2\pi n/2M)(2M-1-k)}\right)\\ &= \frac{1}{2M}\sum_{k=0}^{M-1}\sum_{l=0}^{L-1}h_l\,e^{-j(2\pi kl/N)}\left(e^{j(2\pi n/2M)k} + e^{j(2\pi n/2M)(2M-1-k)}\right)\\ &= \frac{1}{2M}\sum_{l=0}^{L-1}h_l\,C_{symmetric}(n,l,M,N),\end{aligned} \tag{16}$$

where $C_{symmetric}(n,l,M,N) = \sum_{k=0}^{M-1}e^{-j(2\pi l/N)k}\left(e^{j(2\pi n/2M)k} + e^{j(2\pi n/2M)(2M-1-k)}\right)$.

The leakage power-to-noise power ratio (LNR) on the $n$th tap can be denoted as

$$\mathrm{LNR}_n^{symmetric} = \frac{E\left|h_n^{symmetric}\right|^2}{E\left|z_n^{symmetric}\right|^2} = \frac{\sum_{l=0}^{L-1}\sigma_l^2\left|C_{symmetric}(n,l,M,N)\right|^2}{2M\sigma^2}. \tag{17}$$

Let $L_{symmetric}$ denote the threshold. Using the conventional DFT method, the noise and leakage power is eliminated by

$$\hat{g}_n^{symmetric} = \begin{cases}\hat{h}_n^{symmetric}, & 0 \le n \le L_{symmetric}-1 \ \text{or}\ 2M-L_{symmetric} \le n \le 2M-1,\\ 0, & L_{symmetric} \le n \le 2M-1-L_{symmetric}.\end{cases} \tag{18}$$

After the $2M$-point FFT,

$$\hat{G}_k^{symmetric} = \sum_{n=0}^{2M-1}\hat{g}_n^{symmetric}\,e^{-j(2\pi kn/2M)}, \quad k = 0,\ldots,2M-1. \tag{19}$$

The corresponding estimate of partial frequency response is denoted as

$$\hat{U}_k^{symmetric} = \frac{\hat{G}_k^{symmetric} + \hat{G}_{2M-1-k}^{symmetric}}{2}, \quad k = 0,\ldots,M-1. \tag{20}$$

The basic block diagram of our proposed symmetric extension DFT-based estimation is shown in Figure 3.

Figure 3: Block diagram of our proposed symmetric extension DFT-based channel estimation (LS channel estimation, symmetric extension, 2M-point IDFT, zeroing of the taps from $L_{symmetric}$ to $2M-1-L_{symmetric}$, 2M-point DFT, combination).

3.4. Performance Analysis. From (13), the MSE of the conventional DFT method without symmetric extension is written as

$$\mathrm{MSE}_{partial} = \frac{1}{M}\sum_{k=0}^{M-1}E\left|\hat{U}_k^{partial} - H_k\right|^2. \tag{21}$$

From (20), the MSE of our proposed estimator is

$$\mathrm{MSE}_{symmetric} = \frac{1}{M}\sum_{k=0}^{M-1}E\left|\hat{U}_k^{symmetric} - H_k\right|^2. \tag{22}$$

The estimation error of the conventional method is divided into two parts. The first part arises when $L_{partial} \le n \le M-1-L_{partial}$: the leakage power of $h_n^{partial}$ is lost as it is forced to be zero. The second part arises when $n < L_{partial}$ or $n > M-1-L_{partial}$: the error is caused by AWGN. The estimation error can be written as

$$\mathrm{ERROR}_{partial} = h_n^{partial} - \hat{g}_n^{partial} = \begin{cases}h_n^{partial}, & L_{partial} \le n \le M-1-L_{partial},\\ z_n^{partial}, & \text{others}.\end{cases} \tag{23}$$
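Similarly, the estimation error of our proposed method is also divided into two parts. It can be written as

$$\mathrm{ERROR}_{\mathrm{symmetric}} = h_n^{\mathrm{symmetric}} - g_n^{\mathrm{symmetric}} = \begin{cases} h_n^{\mathrm{symmetric}}, & L_{\mathrm{leakage}} \le n \le 2M-1-L_{\mathrm{leakage}}, \\ z_n^{\mathrm{symmetric}}, & \text{others}. \end{cases} \tag{24}$$

According to the Parseval theorem, (21) can be written as

$$\begin{aligned} \mathrm{MSE}_{\mathrm{partial}} &= \frac{1}{M} E\left[\sum_{k=0}^{M-1} \left|U_k^{\mathrm{partial}} - H_k\right|^2\right] \\ &= E\left[\sum_{n=0}^{M-1} \left|h_n^{\mathrm{partial}} - g_n^{\mathrm{partial}}\right|^2\right] \\ &= E\left[\sum_{n=L_{\mathrm{partial}}}^{M-1-L_{\mathrm{partial}}} \left|h_n^{\mathrm{partial}}\right|^2\right] + E\left[\sum_{n=0}^{L_{\mathrm{partial}}-1} \left|z_n^{\mathrm{partial}}\right|^2\right] + E\left[\sum_{n=M-L_{\mathrm{partial}}}^{M-1} \left|z_n^{\mathrm{partial}}\right|^2\right] \\ &= \frac{1}{M^2}\sum_{n=L_{\mathrm{partial}}}^{M-1-L_{\mathrm{partial}}}\sum_{l=0}^{L-1} \sigma_l^2 \left|C_{\mathrm{partial}}(n,l,M,N)\right|^2 + \frac{L_{\mathrm{partial}}}{M}\sigma^2 + \frac{L_{\mathrm{partial}}}{M}\sigma^2 \\ &= \frac{1}{M^2}\sum_{n=L_{\mathrm{partial}}}^{M-1-L_{\mathrm{partial}}}\sum_{l=0}^{L-1} \sigma_l^2 \left|C_{\mathrm{partial}}(n,l,M,N)\right|^2 + \frac{2L_{\mathrm{partial}}}{M}\sigma^2. \end{aligned} \tag{25}$$

From (24), (22) can be rewritten as

$$\begin{aligned} \mathrm{MSE}_{\mathrm{symmetric}} &= \frac{1}{M} E\left[\sum_{k=0}^{M-1} \left|U_k^{\mathrm{symmetric}} - H_k\right|^2\right] \\ &= \frac{1}{M} E\left[\sum_{k=0}^{M-1} \left|\frac{G_k^{\mathrm{symmetric}} + G_{2M-1-k}^{\mathrm{symmetric}}}{2} - H_k\right|^2\right] \\ &= \frac{1}{4M} E\left[\sum_{k=0}^{M-1} \left|\left(G_k^{\mathrm{symmetric}} - H_k\right) + \left(G_{2M-1-k}^{\mathrm{symmetric}} - H_k\right)\right|^2\right] \\ &\le \frac{1}{4M}\cdot 2\cdot E\left[\sum_{k=0}^{M-1} \left(\left|G_k^{\mathrm{symmetric}} - H_k\right|^2 + \left|G_{2M-1-k}^{\mathrm{symmetric}} - H_k\right|^2\right)\right]. \end{aligned} \tag{26}$$

According to the Parseval theorem,

$$\begin{aligned} &\frac{1}{2M} E\left[\sum_{k=0}^{M-1} \left(\left|G_k^{\mathrm{symmetric}} - H_k\right|^2 + \left|G_{2M-1-k}^{\mathrm{symmetric}} - H_k\right|^2\right)\right] \\ &\quad = E\left[\sum_{n=0}^{2M-1} \left|h_n^{\mathrm{symmetric}} - g_n^{\mathrm{symmetric}}\right|^2\right] \\ &\quad = E\left[\sum_{n=L_{\mathrm{leakage}}}^{2M-1-L_{\mathrm{leakage}}} \left|h_n^{\mathrm{symmetric}}\right|^2\right] + E\left[\sum_{n=0}^{L_{\mathrm{leakage}}-1} \left|z_n^{\mathrm{symmetric}}\right|^2\right] + E\left[\sum_{n=2M-L_{\mathrm{leakage}}}^{2M-1} \left|z_n^{\mathrm{symmetric}}\right|^2\right] \\ &\quad = \frac{1}{4M^2}\sum_{n=L_{\mathrm{leakage}}}^{2M-1-L_{\mathrm{leakage}}}\sum_{l=0}^{L-1} \sigma_l^2 \left|C_{\mathrm{symmetric}}(n,l,M,N)\right|^2 + \frac{L_{\mathrm{leakage}}}{M}\sigma^2. \end{aligned} \tag{27}$$

From (26), (27), the upper bound of the MSE of our proposed estimator is

$$\mathrm{MSE}_{\mathrm{symmetric}}^{\mathrm{upper}} = \frac{1}{4M^2}\sum_{n=L_{\mathrm{leakage}}}^{2M-1-L_{\mathrm{leakage}}}\sum_{l=0}^{L-1} \sigma_l^2 \left|C_{\mathrm{symmetric}}(n,l,M,N)\right|^2 + \frac{L_{\mathrm{leakage}}}{M}\sigma^2. \tag{28}$$

3.5. Estimator Complexity. The conventional DFT-based channel estimator is very attractive for its good performance and low complexity. Its main computational cost is an M-point IFFT and FFT. Our proposed symmetric extension method also inherits the low complexity of the DFT estimator, and its main computational cost is a 2M-point IFFT and FFT. As the complexity of FFT and IFFT is significantly reduced nowadays, our proposed method can provide a good tradeoff between performance and complexity.

4. Performance Results

We investigate the performance of our proposed estimator through computer simulation. An OFDMA system with N = 512 subcarriers is considered, with a guard interval of $L_{\mathrm{GP}} = 64$. The sampling rate is 7.68 MHz, and the subcarrier frequency spacing is 15 kHz. A six-path channel model is used. The power profile is given by P = [−3, 0, −2, −6, −8, −10] dB, and the delay profile after sampling is τ = [0, 2, 4, 12, 18, 38]. Each path is an independent zero-mean complex Gaussian random process.

Figures 4 and 5 show the comparison of LNR between the conventional DFT method and our proposed method. $\sigma^2$ is normalized to 1, and M is set to 16 and 64. It should be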
noted that the FFT length of the conventional DFT method is M, while the FFT length of our proposed method is 2M due to the symmetric extension. That is why the two curves have different lengths. It is shown that $\mathrm{LNR}_n^{\mathrm{partial}}$ is much larger than $\mathrm{LNR}_n^{\mathrm{symmetric}}$. Compared with the conventional method, the leakage power is significantly self-cancelled by the symmetric extension method.

Figure 6 shows the theoretical MSE of the conventional DFT method when M = 16. The MSE is calculated under SNR = 5 dB, 10 dB, and 20 dB, respectively. The MSE is large when $L_{\mathrm{partial}}$ is small: although most noise can be eliminated, the channel power $h_n^{\mathrm{partial}}$ is also lost, and the MSE is mainly caused by the loss of $h_n^{\mathrm{partial}}$. When $L_{\mathrm{partial}}$ is large, although the loss of $h_n^{\mathrm{partial}}$ is small, the noise cannot be eliminated efficiently, and the MSE is mainly caused by the noise.

Figure 7 shows the upper bound of the MSE of our proposed method. Compared with Figure 6, the upper bound of the MSE of our proposed method is smaller than the MSE of the conventional DFT method. This is because in our proposed method the channel leakage is significantly reduced, and the elimination of noise will cause less channel leakage power loss.

Figure 8 shows the MSE performance comparison of different methods. M is set to 16. In the conventional DFT method, $L_{\mathrm{partial}}$ is set to 4 and 6; as the FFT length of our proposed method is doubled, the corresponding threshold $L_{\mathrm{symmetric}}$ is set to 8 and 12. When SNR is low, both the conventional DFT method and our proposed method can reduce the MSE. However, when SNR is higher than 15 dB, there is an evident MSE floor larger than $10^{-2}$ in the conventional DFT method, while in our proposed method the MSE floor is eliminated efficiently. This is because when SNR is low, the MSE is mainly caused by the noise, not the loss of channel leakage power. When SNR is high, the MSE is mainly caused by the leakage power loss instead. As the leakage power is significantly reduced in our proposed symmetric extension method, even when SNR is high, the noise can still be eliminated at a very small expense of channel leakage power loss. Figure 8 also shows the effect of the threshold. It can be seen that when SNR is low, a smaller threshold has better MSE performance than a larger threshold, and when SNR is high, it has worse MSE performance. This is because with the decrease of the threshold, more noise can be eliminated, but more channel leakage power will be lost, and with the increase of the threshold, less channel leakage power will be lost, but less noise is eliminated.

Figure 9 shows the MSE performance when M is set to 64, $L_{\mathrm{partial}}$ is set to 16 and 24, and $L_{\mathrm{symmetric}}$ is 32 and 48. The simulation result is similar to Figure 8. This indicates that our method is effective for different values of M.

Figure 10 shows the raw BER performance with different channel estimation methods. Each subcarrier is modulated by 16 QAM. M is set to 16, $L_{\mathrm{partial}}$ = 4, 6, and $L_{\mathrm{symmetric}}$ = 8, 12. The channel is equalized by the zero-forcing algorithm. It can be seen that the BER with the conventional DFT channel estimator still encounters a BER floor because of the channel estimation errors, while in our proposed symmetric extension method, as the accuracy of the channel estimator is significantly increased, the BER performance is also improved.

Figure 4: LNR comparison when M = 16. [Plot: LNR versus tap index n for the conventional (partial) and symmetric methods.]
Figure 5: LNR comparison when M = 64. [Plot: LNR versus tap index n for the conventional (partial) and symmetric methods.]
Figure 6: Theoretical MSE of conventional partial frequency response DFT-based channel estimator. [Plot: theoretical MSE versus $L_{\mathrm{partial}}$ for SNR = 5 dB, 10 dB, and 20 dB.]
Figure 7: Theoretical upper bound of the MSE of our proposed symmetric extension DFT-based channel estimator. [Plot: theoretical MSE upper bound versus $L_{\mathrm{symmetric}}$ for SNR = 5 dB, 10 dB, and 20 dB.]
Figure 8: Comparing MSE performance with proposed estimator, conventional DFT estimator, and LS estimator, when M = 16, $L_{\mathrm{partial}}$ = 4, 6, and $L_{\mathrm{symmetric}}$ = 8, 12. [Plot: MSE versus SNR (dB).]
Figure 9: Comparing MSE performance with proposed estimator, conventional DFT estimator, and LS estimator, when M = 64, $L_{\mathrm{partial}}$ = 16, 24, and $L_{\mathrm{symmetric}}$ = 32, 48. [Plot: MSE versus SNR (dB).]
Figure 10: Comparing BER performance with proposed estimator, conventional DFT estimator, and LS estimator, when M = 16, $L_{\mathrm{partial}}$ = 4, 6, and $L_{\mathrm{symmetric}}$ = 8, 12. [Plot: BER versus SNR (dB).]
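The two estimators compared in this section can be sketched in a few lines. The following is a minimal, noise-free illustration of (12)–(13) and (14), (18)–(20), not the authors' simulation code: the function names, the direct O(n²) DFT, and the single-path test channel (N = 512, M = 16, delay 2) are assumptions chosen only to keep the example self-contained.

```python
import cmath

def dft(x):
    """Direct DFT of a list of complex samples."""
    size = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / size) for n in range(size))
            for k in range(size)]

def idft(X):
    """Direct inverse DFT (with the 1/size factor)."""
    size = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / size) for k in range(size)) / size
            for n in range(size)]

def keep_edges(h, L):
    """Keep the first L and last L taps and zero the rest, as in (12) and (18)."""
    size = len(h)
    return [h[n] if n < L or n >= size - L else 0 for n in range(size)]

def conventional_dft_estimate(H_ls, L_partial):
    """M-point IDFT, thresholding (12), M-point DFT (13)."""
    return dft(keep_edges(idft(H_ls), L_partial))

def symmetric_extension_estimate(H_ls, L_symmetric):
    """Symmetric extension (14), 2M-point IDFT, thresholding (18),
    2M-point DFT (19), and combination of mirrored bins (20)."""
    M = len(H_ls)
    H_ext = list(H_ls) + list(H_ls)[::-1]   # second half is H_{2M-1-k}
    G = dft(keep_edges(idft(H_ext), L_symmetric))
    return [(G[k] + G[2 * M - 1 - k]) / 2 for k in range(M)]

def mse(estimate, reference):
    """Mean squared error over the M allocated subcarriers."""
    return sum(abs(a - b) ** 2 for a, b in zip(estimate, reference)) / len(reference)

# Hypothetical noise-free example: one path at delay 2, N = 512, M = 16.
N, M, delay = 512, 16, 2
H = [cmath.exp(-2j * cmath.pi * k * delay / N) for k in range(M)]
print("conventional, L_partial = 4:  ", mse(conventional_dft_estimate(H, 4), H))
print("symmetric,    L_symmetric = 8:", mse(symmetric_extension_estimate(H, 8), H))
```

Because the example contains no noise, the printed values reflect only the leakage lost by thresholding, which is the effect the symmetric extension is meant to reduce.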
5. Conclusion

A simple DFT-based channel estimation method with symmetric extension is proposed in this paper. In order to increase the estimation accuracy, the noise is eliminated in time domain. As both the noise and the channel impulse leakage power will be eliminated, we have proposed the novel symmetric extension method to reduce the channel leakage power. The noise can be efficiently eliminated with very small loss of channel leakage power. The simulation results show that, compared with the conventional DFT method, the MSE of our proposed method is significantly reduced.

References
[1] O. Edfors, M. Sandell, J.-J. van de Beek, S. K. Wilson, and P. O. Borjesson, “OFDM channel estimation by singular value decomposition,” IEEE Transactions on Communications, vol. 46, no. 7, pp. 931–939, 1998.
[2] M. Noh, Y. Lee, and H. Park, “Low complexity LMMSE channel estimation for OFDM,” IEE Proceedings: Communications, vol. 153, no. 5, pp. 645–650, 2006.
[3] Y. Zhao and A. Huang, “A novel channel estimation method for OFDM mobile communication systems based on pilot signals and transform-domain processing,” in Proceedings of the 47th IEEE Vehicular Technology Conference (VTC ’97), vol. 3, pp. 2089–2093, Phoenix, Ariz, USA, May 1997.
[4] J.-J. van de Beek, O. Edfors, M. Sandell, S. K. Wilson, and P. O. Borjesson, “On channel estimation in OFDM systems,” in Proceedings of the 45th IEEE Vehicular Technology Conference (VTC ’95), vol. 2, pp. 815–819, Chicago, Ill, USA, July 1995.
[5] O. Edfors, M. Sandell, J.-J. van de Beek, S. K. Wilson, and P. O. Borjesson, “Analysis of DFT-based channel estimators for OFDM,” Wireless Personal Communications, vol. 12, no. 1, pp. 55–70, 2000.
[6] A. Dowler, A. Doufexi, and A. Nix, “Performance evaluation of channel estimation techniques for a mobile fourth generation wide area OFDM system,” in Proceedings of the 56th IEEE
2040, Vancouver, Canada, September 2002.
[7] B. Yang, K. B. Letaief, R. S. Cheng, and Z. Cao, “Channel estimation for OFDM transmission in multipath fading channels based on parametric channel modeling,” IEEE Transactions on Communications, vol. 49, no. 3, pp. 467–479, 2001.
[8] H. Minn and V. K. Bhargava, “An investigation into time-domain approach for OFDM channel estimation,” IEEE Transactions on Broadcasting, vol. 46, no. 4, pp. 240–248, 2000.
[9] M. R. Raghavendra and K. Giridhar, “Improving channel estimation in OFDM systems for sparse multipath channels,” IEEE Signal Processing Letters, vol. 12, no. 1, pp. 52–55, 2005.
[10] O. Simeone, Y. Bar-Ness, and U. Spagnolini, “Pilot-based channel estimation for OFDM systems by tracking the delay-subspace,” IEEE Transactions on Wireless Communications, vol. 3, no. 1, pp. 315–325, 2004.
[11] J. Oliver, R. Aravind, and K. M. M. Prabhu, “Sparse channel estimation in OFDM systems by threshold-based pruning,” Electronics Letters, vol. 44, no. 13, pp. 830–832, 2008.
[12] W. Yi, L. Lihua, Z. Ping, and L. Zemin, “Optimal threshold for channel estimation in MIMO-OFDM system,” in Proceedings of the IEEE International Conference on Communications (ICC ’08), pp. 4376–4380, Beijing, China, May 2008.
[13] R. Steele, Mobile Radio Communications, IEEE Press, New York, NY, USA, 1992.
[14] Y. Li, L. J. Cimini Jr., and N. R. Sollenberger, “Robust channel estimation for OFDM systems with rapid dispersive fading channels,” IEEE Transactions on Communications, vol. 46, no. 7, pp. 902–915, 1998.
doi:10.1155/2009/307407
Research Article
Near-Optimum Detection with Low Complexity for
Uplink Virtual MIMO Systems
Sanhae Kim,1, 2 Oh-Soon Shin,2 and Yoan Shin2

1 FLYVO R&D Center, POSDATA Co. Ltd., Bundang-gu, Seongnam-city, Gyeonggi-do 463-775, South Korea
2 School of Electronic Engineering, Soongsil University, 1 Sangdo-dong, Dongjak-gu, Seoul 156-743, South Korea
Correspondence should be addressed to Yoan Shin, yashin@e.ssu.ac.kr
Received 10 June 2008; Revised 4 December 2008; Accepted 9 January 2009
Recommended by Alister G. Burr
In mobile worldwide interoperability for microwave access (WiMAX) or 3rd Generation partnership project long-term evolution
(3GPP-LTE), uplink virtual multiple input multiple output (MIMO) technology is adopted to perform spatial multiple access
with two portable subscriber stations (PSSs), where each PSS has an antenna. As two PSSs transmit simultaneously on the
same orthogonal frequency division multiple access (OFDMA) resource blocks, the overall uplink capacity will be doubled.
To employ this interesting technique with high performance, most system vendors demand the optimal maximum-likelihood
detection (MLD) scheme in the radio access station (RAS). However, the optimal MLD is difficult to implement due to its
explosive computational complexity. In this paper, we propose two efficient MIMO decoding schemes that achieve near-optimum
performance with low complexity for uplink virtual MIMO systems that have an iterative channel decoder using bit log-
likelihood ratio (LLR) information. The simulation results show that the proposed schemes have almost the same block error
rate (BLER) performance as that of the optimal MLD with only about 15.75% and 28% computational complexity in terms of real
multiplication, when both PSSs transmit 16 quadrature amplitude modulation (QAM) signals, and only about 3.77% and 7.22%
for 64 QAM signals.
Copyright © 2009 Sanhae Kim et al. This is an open access article distributed under the Creative Commons Attribution License,
1. Introduction

Multiple input multiple output (MIMO) techniques are expected to be introduced in most mobile communication systems to increase wireless channel capacity. Adopting MIMO schemes, diversity gain and coding gain can be simultaneously achieved, since a number of independent radio channels are generated by placing multiple antennas in the transmitter and the receiver [1]. In particular, to effectively guarantee user throughput in an uplink situation, multiuser MIMO schemes have recently drawn increased attention [2–4]. The uplink MIMO techniques have also been adopted in mobile worldwide interoperability for microwave access (WiMAX) systems that are based on the IEEE 802.16e-2005 standards [5] or 3rd generation partnership project long-term evolution (3GPP-LTE) systems [6]. In particular, for the uplink mobile WiMAX situation, virtual MIMO, which is called “collaborative spatial multiplexing (CSM),” is adopted as a mandatory profile in the IEEE 802.16e standards. In the virtual MIMO scheme, each data stream of two portable subscriber stations (PSSs) is simultaneously transmitted through the same OFDMA resources. Assuming the perfect cancellation of multiuser interferences, the achievable channel capacity of uplink mobile WiMAX using the virtual MIMO technology can be increased in proportion to the number of PSSs.

In 2007, the WiMAX forum mobile radio conformance test (RCT) [7] provided a criterion to verify the system performance of mobile WiMAX Wave-II MIMO, based on the well-known optimal maximum likelihood detection (MLD), which can achieve the best performance [8]. However, the optimal MLD requires a sizeable amount of computational complexity on the receiver side, which may exponentially increase as the number of transmit antennas and the modulation level increase. Therefore, suboptimal detection algorithms that can reduce the complexity are required. Among previous studies, QR decomposition and M-algorithm-MLD (QRM-MLD) and sphere decoding (SD) schemes have been
reported to achieve a near ML performance. However, these schemes require additional computation after the hard decision for log-likelihood ratio (LLR) information of all bits [9, 10]. Another previous work has proposed parallel detection (PD) based on successive symbol cancellation [11]. By making use of the property of the MIMO channel, the algorithm can attain near MLD performance with a slight increase in computational complexity. However, PD cannot obtain LLR information for all the transmit layers, because this scheme considers all possible symbols for the first transmit layer only.

In this paper, we propose a two-step MIMO decoding scheme that is an extension of PD with low computational complexity for feasible implementation of uplink mobile WiMAX systems that have an iterative channel decoder using bit LLR information. Unlike the optimal MLD, in which all the layers are fully searched in all possible combinations of symbol sets to determine the LLR values, the proposed scheme performs a search for only one transmit layer in the first step, and then the LLR values of the residual transmit layers are simply determined in the second step. These procedures have only 15.75% computational complexity in terms of real multiplication as compared to the optimal MLD, when both PSSs transmit 16 quadrature amplitude modulation (QAM) signals, and only 3.77% for 64 QAM signals. Nevertheless, the proposed scheme is shown to achieve reasonable block error rate (BLER) performance comparable to the optimal MLD. We also propose another MIMO decoding scheme that performs an independent search for each transmit layer. This scheme achieves exactly the same BLER performance of the optimal MLD with only 28% and 7.22% computational complexity, when both PSSs transmit 16 QAM and 64 QAM signals, respectively.

This paper is organized as follows: Section 2 introduces the uplink virtual MIMO systems, and describes our proposed MIMO decoding algorithms. The computational complexity of the MLD and the proposed schemes are analyzed in Section 3. We introduce link-level simulation environments for mobile WiMAX systems and discuss the simulation results in Section 4, followed by concluding remarks in Section 5.

2. Virtual MIMO Decoding Schemes

Figure 1 shows a virtual MIMO system where for simplicity we consider two PSSs and one radio access station (RAS). We assume a single transmitting antenna for each PSS, and two PSSs transmit data streams on the same OFDMA resources simultaneously. As the RAS receives multiple data streams through two antennas, it makes a 2 × 2 independent fading channel condition. Figures 2 and 3 show the block diagram of the transmitter for one PSS and the receiver for the RAS with two receiving antennas in mobile WiMAX systems, respectively.

[Figure 1 diagram: PSS #1 and PSS #2 reach the RAS over independent fading channels $\mathbf{h}_1 = [h_{11}, h_{21}]^T$ and $\mathbf{h}_2 = [h_{12}, h_{22}]^T$.]
Figure 1: Uplink virtual MIMO systems.

[Figure 2 diagram: source → channel encoding (CTC) → QAM modulation (mapping) → subcarrier permutation → pilot inserting → OFDM modulation.]
Figure 2: Transmitter architecture of one PSS.

[Figure 3 diagram: OFDM demodulation, pilot separating/channel estimation, and subcarrier demapping for each receiving antenna; joint MIMO decoding and LLR calculation; channel decoding (CTC) for the source of PSS #1 and of PSS #2.]
Figure 3: Receiver architecture of RAS with two receiving antennas.

For the purposes of a more general discussion, we consider a MIMO system with $N_r$ receiving antennas and $N_t$ PSSs, where $N_r \ge N_t$. The overall channel $\mathbf{H}$, which is an $N_r \times N_t$ complex matrix, can be represented as

$$\mathbf{H} = \begin{bmatrix} h_{1,1} & \cdots & h_{1,N_t} \\ \vdots & \ddots & \vdots \\ h_{N_r,1} & \cdots & h_{N_r,N_t} \end{bmatrix}. \tag{1}$$

The modulated symbols transmitted from $N_t$ PSSs can be written as

$$\mathbf{s} = \begin{bmatrix} s_1 & s_2 & \cdots & s_{N_t} \end{bmatrix}^T, \tag{2}$$

where T is the transpose operation. The received signal vector $\mathbf{r}$ of size $N_r \times 1$ is then expressed as

$$\mathbf{r} = \mathbf{H} \cdot \mathbf{s} + \mathbf{n}, \tag{3}$$

where $\mathbf{n}$ is the zero-mean additive white Gaussian noise (AWGN) vector.

Now we explain two proposed schemes that have low complexity for MIMO signal detection. Moreover, we introduce the optimal MLD scheme for the purpose of comparison.

2.1. Proposed Decoding Scheme I. The proposed decoding scheme I, which is an extended type of modified PD, consists of two steps. It performs full interference cancellation for only one transmit layer in the first step, and then the LLR values of the residual transmitting layers are simply determined in the second step. As the full search is performed in only one transmit layer, the computational complexity will be significantly reduced. Figure 4 illustrates the block diagram of the proposed decoding scheme.

Before going into the first step, the decoder calculates the signal-to-noise ratio (SNR) of the two transmit layers from channel responses in the receiver side. Because the second layer performs the cancellation with all symbol candidates, the index of the transmit signal with a lower SNR among the PSSs is selected as the second transmit layer.

In the first step, it performs a full interference cancellation of the second layer as follows:

$$\tilde{\mathbf{r}}_i = \mathbf{r} - \mathbf{h}_2 s_{2,i}, \quad i = 1, \ldots, M_2, \tag{4}$$

where $M_m$ is the number of constellations for the mth transmit layer and $s_{m,i}$ is the ith constellation symbol of the
mth layer. The reconstructed symbols for the first transmit layer can be expressed as

$$y_{1,i} = \mathbf{h}_1^H \tilde{\mathbf{r}}_i, \quad i = 1, \ldots, M_2, \tag{5}$$

$$\hat{s}_{1,i} = Q\left(y_{1,i}\right), \quad i = 1, \ldots, M_2. \tag{6}$$

The scheme then calculates the squared Euclidean distance (ED) after canceling the reconstructed symbols and adds it to a list:

$$\hat{\mathbf{r}}_i = \tilde{\mathbf{r}}_i - \mathbf{h}_1 \hat{s}_{1,i}, \qquad \mathrm{ED}_i^2 = \left\|\hat{\mathbf{r}}_i\right\|^2, \quad i = 1, \ldots, M_2. \tag{7}$$

For the last part of the first step, we find the index of the minimum squared ED from the list as follows:

$$p = \arg\min_{i \in \{1, \ldots, M_2\}} \mathrm{ED}_i^2. \tag{8}$$

The idea of the second step is to generate only the neighboring symbols of $\hat{s}_{1,p}$ by inverting each bit in $\hat{s}_{1,p}$ in order to reconstruct the symbol candidates for the first transmit layer. The additional squared EDs are added to the list. These procedures make $C_{k,l} \ne \emptyset$ (l = 0, 1) for all the kth bits due to the listed additional squared EDs with the bit-inverted symbols in the first layer. Then, it can calculate the LLR information of all bits directly without any additional computation, unlike other conventional suboptimum schemes [9–11].

First, in the second step, the scheme takes $\hat{s}_{1,p}$ and reconstructs the symbol $\bar{s}_{1,j}$ with the jth bit inverted as

$$\bar{s}_{1,j} = Q_j\left(\hat{s}_{1,p}\right), \quad j = 1, \ldots, \log_2 M_1, \tag{9}$$

where $Q_j(\cdot)$ is the jth bit inverse operator. For example, if $\hat{s}_{1,p}$ of the 16 QAM symbol is “0001” in bit representation, $\bar{s}_{1,2}$ will be “0101” with the second bit inverted. The scheme can now calculate new squared EDs after recancellation of the bit-inverted symbols:

$$\hat{\mathbf{r}}_j = \tilde{\mathbf{r}}_p - \mathbf{h}_1 \bar{s}_{1,j}, \qquad \mathrm{ED}_{j+M_2}^2 = \left\|\hat{\mathbf{r}}_j\right\|^2, \quad j = 1, \ldots, \log_2 M_1. \tag{10}$$

After performing the second step, the scheme has $(M_2 + \log_2 M_1)$ squared EDs in the list. The LLR of the kth bit for the proposed scheme can be calculated directly using these squared EDs in the list as follows:

$$\Lambda_k = \min_{s \in C_{k,0}^{\mathrm{Prop}}} \mathbf{ED}^2 - \min_{s \in C_{k,1}^{\mathrm{Prop}}} \mathbf{ED}^2, \tag{11}$$

where $C_{k,l}^{\mathrm{Prop}}$ (l = 0, 1) is the set of candidate symbols with the kth bit being fixed to l, of which the size is $(M_2 + \log_2 M_1)$, and $\mathbf{ED}^2$ is the squared ED vector of these candidate symbol sets.

Note that the proposed scheme can be extended to $N_t$-transmit antennas by taking the zero forcing (ZF) or minimum mean-square error (MMSE) algorithm for the symbol reconstruction of the residual transmit layers in (5) and executing the second step $N_t - 1$ times.

2.2. Proposed Decoding Scheme II. Another proposed scheme performs a full interference cancellation for all layers in parallel. In other words, the proposed scheme II performs only the first step of proposed scheme I in parallel for every transmit layer. Figure 5 illustrates the block diagram of the proposed decoding scheme for the first transmit layer. The following procedure describes the proposed decoding algorithm for the mth transmit layer.

First, the scheme performs a full interference cancellation of the mth transmit layer as

$$\tilde{\mathbf{r}}_i = \mathbf{r} - \mathbf{h}_m s_{m,i}, \quad i = 1, \ldots, M_m, \tag{12}$$

where $M_m$ is the number of constellations for the mth layer and $s_{m,i}$ is the ith constellation symbol of the mth layer.

The reconstructed symbols for the residual transmit layer $\bar{m}$ (the layer other than m) can be expressed as

$$y_{\bar{m},i} = \mathbf{h}_{\bar{m}}^H \tilde{\mathbf{r}}_i, \qquad \hat{s}_{\bar{m},i} = Q\left(y_{\bar{m},i}\right), \quad i = 1, \ldots, M_m. \tag{13}$$
[Figure 4 diagram: first step, for i = 1, …, $M_2$: canceller ($\mathbf{h}_2$), symbol reconstruction ($\mathbf{h}_1^H$), symbol hard decision, canceller ($\mathbf{h}_1$), squared Euclidean distance, index of minimum ED = p; second step, for j = 1, …, $\log_2 M_1$: symbol reconstruction with jth bit inverse, canceller ($\mathbf{h}_1$), squared Euclidean distance; all results are collected in the squared ED list.]
Figure 4: Block diagram of the proposed decoding scheme I.

[Figure 5 diagram: for i = 1, …, $M_1$: canceller ($\mathbf{h}_1$), symbol reconstruction ($\mathbf{h}_2^H$), symbol hard decision, canceller ($\mathbf{h}_2$), squared Euclidean distance, giving $\mathrm{ED}_{1,1}^2, \ldots, \mathrm{ED}_{1,M_1}^2$.]
Figure 5: Block diagram of the proposed decoding scheme II for the first transmit layer.
The scheme then calculates the squared ED after canceling the reconstructed symbols and adds it to a list:

$$\hat{\mathbf{r}}_i = \tilde{\mathbf{r}}_i - \mathbf{h}_{\bar{m}} \hat{s}_{\bar{m},i}, \qquad \mathrm{ED}_{m,i}^2 = \left\|\hat{\mathbf{r}}_i\right\|^2, \quad i = 1, \ldots, M_m. \tag{14}$$

The LLR of the kth bit for the mth transmit layer can be calculated directly using these squared EDs in the list as follows:

$$\Lambda_{k,m} = \min_{s \in C_{k,0}^{m}} \mathbf{ED}_m^2 - \min_{s \in C_{k,1}^{m}} \mathbf{ED}_m^2. \tag{15}$$

Here, $C_{k,l}^{m}$ (l = 0, 1) is the candidate symbol set for the mth transmit layer satisfying that the kth bit is l, for which the size is $M_m$. Moreover, $\mathbf{ED}_m^2$ is the squared ED vector of the candidate symbol sets for the mth transmit layer.

To take all the LLR information from $N_t$ transmit layers, these procedures are performed in parallel for every transmit layer.

The BLER performance of the proposed scheme is exactly the same as that of the MLD in the 2 × 2 MIMO antenna configuration (i.e., two PSSs and one RAS) systems. Since the decoding procedure is performed independently at each transmit layer, the computational complexity linearly increases in proportion to the number of transmit antennas $N_t$ and its QAM modulation level M.

2.3. Optimal MLD Scheme [12]. The MLD can detect the desired signal by calculating the minimum squared ED for all possible combinations of symbol set $C^{N_t}$. The error distance vector $\mathbf{e}$ and the detected signal can be expressed as

$$\mathbf{e} = \mathbf{r} - \mathbf{H} \cdot \mathbf{s}, \qquad \hat{\mathbf{s}} = \arg\min_{\mathbf{s} \in C^{N_t}} \left\|\mathbf{e}\right\|^2. \tag{16}$$

The LLR of the kth bit can be calculated as follows:

$$\Lambda_k = \min_{\mathbf{s} \in C_{k,0}^{N_t}} \left\|\mathbf{e}\right\|^2 - \min_{\mathbf{s} \in C_{k,1}^{N_t}} \left\|\mathbf{e}\right\|^2, \tag{17}$$

where $C_{k,l}^{N_t}$ (l = 0, 1) is the set of $N_t$ symbol combinations with the kth bit being fixed to l.

This scheme is optimal and can achieve the best performance for MIMO systems. However, the computational complexity increases exponentially according to the number of transmitting antennas and the number of symbol constellations M.

3. Complexity Analysis and Comparison

To compare the computational complexity, we consider real multiplication and real addition operations with two PSSs and two receiving antennas in the RAS. Tables 1, 2, and 3 show the complexity analyses for the optimal MLD, the proposed scheme I, and the proposed scheme II per one subcarrier. The symbols in the tables are defined in Section 2.

Table 1: Complexity analysis for the optimal MLD scheme ($N_t = N_r = 2$, per one subcarrier).

| Operations | Real multiplication | Real addition | No. of iterations |
|---|---|---|---|
| For i = 1: $M_1$ | - | - | - |
| $\tilde{\mathbf{r}}_i = \mathbf{r} - \mathbf{h}_1 s_{1,i}$ | 8 | 8 | $M_1$ |
| For j = 1: $M_2$ | - | - | - |
| $\mathbf{e}_{i,j} = \tilde{\mathbf{r}}_i - \mathbf{h}_2 s_{2,j}$ | 8 | 8 | $M_1 M_2$ |
| $\mathrm{ED}_{i,j}^2 = \|\mathbf{e}_{i,j}\|^2$ | 4 | 3 | $M_1 M_2$ |
| End | - | - | - |
| End | - | - | - |
| Total | $8M_1 + 12M_1M_2$ | $8M_1 + 11M_1M_2$ | - |

Table 2: Complexity analysis for the proposed scheme I ($N_t = N_r = 2$, per one subcarrier).

| Operations | Real multiplication | Real addition | No. of iterations |
|---|---|---|---|
| [Ordering] | | | |
| $\mathrm{SNR}_1 = |h_{11}|^2 + |h_{21}|^2$ | 4 | 3 | 1 |
| $\mathrm{SNR}_2 = |h_{12}|^2 + |h_{22}|^2$ | 4 | 3 | 1 |
| [First step] | | | |
| For i = 1: $M_2$ | - | - | - |
| $\tilde{\mathbf{r}}_i = \mathbf{r} - \mathbf{h}_2 s_{2,i}$ | 8 | 8 | $M_2$ |
| $y_{1,i} = \mathbf{h}_1^H \tilde{\mathbf{r}}_i$ | 8 | 6 | $M_2$ |
| $\hat{s}_{1,i} = Q(y_{1,i})$ | - | - | - |
| $\hat{\mathbf{r}}_i = \tilde{\mathbf{r}}_i - \mathbf{h}_1 \hat{s}_{1,i}$ | 8 | 8 | $M_2$ |
| $\mathrm{ED}_i^2 = \|\hat{\mathbf{r}}_i\|^2$ | 4 | 3 | $M_2$ |
| End | - | - | - |
| [Second step] | | | |
| For j = 1: $\log_2 M_1$ | - | - | - |
| $\hat{\mathbf{r}}_j = \tilde{\mathbf{r}}_p - \mathbf{h}_1 \bar{s}_{1,j}$ | 8 | 8 | $\log_2 M_1$ |
| $\mathrm{ED}_{j+M_2}^2 = \|\hat{\mathbf{r}}_j\|^2$ | 4 | 3 | $\log_2 M_1$ |
| End | - | - | - |
| Total | $28M_2 + 12\log_2 M_1 + 8$ | $25M_2 + 11\log_2 M_1 + 6$ | - |

Table 3: Complexity analysis for the proposed scheme II ($N_t = N_r = 2$, per one subcarrier).

| Operations | Real multiplication | Real addition | No. of iterations |
|---|---|---|---|
| For i = 1: $M_m$ | - | - | - |
| $\tilde{\mathbf{r}}_i = \mathbf{r} - \mathbf{h}_m s_{m,i}$ | 8 | 8 | $M_1$, $M_2$ |
| $y_{\bar{m},i} = \mathbf{h}_{\bar{m}}^H \tilde{\mathbf{r}}_i$ | 8 | 6 | $M_1$, $M_2$ |
| $\hat{s}_{\bar{m},i} = Q(y_{\bar{m},i})$ | - | - | - |
| $\hat{\mathbf{r}}_i = \tilde{\mathbf{r}}_i - \mathbf{h}_{\bar{m}} \hat{s}_{\bar{m},i}$ | 8 | 8 | $M_1$, $M_2$ |
| $\mathrm{ED}_{m,i}^2 = \|\hat{\mathbf{r}}_i\|^2$ | 4 | 3 | $M_1$, $M_2$ |
| End | - | - | - |
| Total | $28M_1 + 28M_2$ | $25M_1 + 25M_2$ | - |
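The decoding procedures of Section 2, whose operations are tallied in Tables 1–3, can be sketched for the 2 × 2 case as follows. This is a minimal illustration rather than the authors' implementation: the function names, the Gray bit mapping of the 16 QAM constellation, the example channel and noise values, and the division of the matched-filter output by $\|\mathbf{h}\|^2$ before the hard decision are all assumptions made here for a self-contained example.

```python
import itertools

def qam16():
    """16-QAM as (bits, symbol) pairs; this Gray mapping is illustrative only."""
    level = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}
    return [(b, complex(level[b[:2]], level[b[2:]]))
            for b in itertools.product((0, 1), repeat=4)]

def cancel(r, h, s):
    """Interference cancellation r - h*s, as in (4) and (12)."""
    return [ri - hi * s for ri, hi in zip(r, h)]

def energy(v):
    """Squared Euclidean norm of a vector."""
    return sum(abs(x) ** 2 for x in v)

def reconstruct(r_t, h, const):
    """Matched filter and hard decision, (5)-(6). Dividing by ||h||^2 is an
    assumption so that the slicer sees an unscaled symbol."""
    y = sum(hi.conjugate() * ri for hi, ri in zip(h, r_t)) / energy(h)
    return min(const, key=lambda p: abs(y - p[1]))

def llr(cands, nbits):
    """Per-bit LLR: minimum ED over bit = 0 minus minimum ED over bit = 1."""
    return [min(e for b, e in cands if b[k] == 0) -
            min(e for b, e in cands if b[k] == 1) for k in range(nbits)]

def scheme1(r, h1, h2, const):
    """Two-step search, (4)-(11); LLRs of layer-1 bits followed by layer-2 bits."""
    n = len(const[0][0])
    step1 = []
    for b2, s2 in const:                      # full search over layer 2 only
        r_t = cancel(r, h2, s2)
        b1, s1 = reconstruct(r_t, h1, const)
        step1.append((b1, b2, r_t, energy(cancel(r_t, h1, s1))))
    b1p, b2p, r_tp, _ = min(step1, key=lambda c: c[3])   # index p in (8)
    symbol_of = dict(const)
    cands = [(b1 + b2, ed) for b1, b2, _, ed in step1]
    for j in range(n):                        # second step: invert bit j of s_{1,p}
        flipped = b1p[:j] + (1 - b1p[j],) + b1p[j + 1:]
        cands.append((flipped + b2p, energy(cancel(r_tp, h1, symbol_of[flipped]))))
    return llr(cands, 2 * n)

def scheme2(r, h_m, h_other, const):
    """Per-layer search, (12)-(15); LLRs of the bits of layer m."""
    cands = []
    for b, s in const:
        r_t = cancel(r, h_m, s)
        _, s_other = reconstruct(r_t, h_other, const)
        cands.append((b, energy(cancel(r_t, h_other, s_other))))
    return llr(cands, len(const[0][0]))

def mld(r, h_m, h_other, const):
    """Exhaustive search, (16)-(17), reported for the bits of layer m."""
    cands = [(b, energy(cancel(cancel(r, h_m, s), h_other, s_other)))
             for b, s in const for _, s_other in const]
    return llr(cands, len(const[0][0]))

# Hypothetical channel, symbols, and noise for a quick comparison.
const = qam16()
h1, h2 = [0.8 + 0.3j, -0.2 + 0.9j], [0.1 - 0.7j, 1.1 + 0.4j]
noise = (0.3 - 0.2j, -0.25 + 0.4j)
r = [a * const[5][1] + b * const[12][1] + n for a, b, n in zip(h1, h2, noise)]
print("scheme I :", scheme1(r, h1, h2, const))
print("scheme II:", scheme2(r, h1, h2, const) + scheme2(r, h2, h1, const))
print("MLD      :", mld(r, h1, h2, const) + mld(r, h2, h1, const))
```

With an exact nearest-symbol slicer, the per-layer search of scheme II returns the same LLR values as the exhaustive search in this 2 × 2 setting, which is consistent with the identical BLER reported for the two; scheme I evaluates far fewer candidates and so may differ slightly.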
Table 4: Complexity comparison for real multiplication operations ($N_t = N_r = 2$, per one subcarrier).

| Modulation level (PSS 1, PSS 2) | Optimal MLD: No. of oper. | Optimal MLD: % | Proposed scheme I: No. of oper. | Proposed scheme I: % | Proposed scheme II: No. of oper. | Proposed scheme II: % |
|---|---|---|---|---|---|---|
| QPSK ($M_1 = 4$), QPSK ($M_2 = 4$) | 224 | 100 | 144 | 64.29 | 224 | 100.0 |
| 16 QAM ($M_1 = 16$), QPSK ($M_2 = 4$) | 800 | 100 | 168 | 21.00 | 560 | 70.00 |
| 16 QAM ($M_1 = 16$), 16 QAM ($M_2 = 16$) | 3,200 | 100 | 504 | 15.75 | 896 | 28.00 |
| 64 QAM ($M_1 = 64$), 64 QAM ($M_2 = 64$) | 49,664 | 100 | 1,872 | 3.77 | 3,584 | 7.22 |

Table 5: Complexity comparison for real addition operations ($N_t = N_r = 2$, per one subcarrier).

| Modulation level (PSS 1, PSS 2) | Optimal MLD: No. of oper. | Optimal MLD: % | Proposed scheme I: No. of oper. | Proposed scheme I: % | Proposed scheme II: No. of oper. | Proposed scheme II: % |
|---|---|---|---|---|---|---|
| QPSK ($M_1 = 4$), QPSK ($M_2 = 4$) | 208 | 100 | 128 | 61.54 | 200 | 96.15 |
| 16 QAM ($M_1 = 16$), QPSK ($M_2 = 4$) | 736 | 100 | 150 | 20.38 | 500 | 67.93 |
| 16 QAM ($M_1 = 16$), 16 QAM ($M_2 = 16$) | 2,944 | 100 | 450 | 15.29 | 800 | 27.17 |
| 64 QAM ($M_1 = 64$), 64 QAM ($M_2 = 64$) | 45,568 | 100 | 1,672 | 3.67 | 3,200 | 7.02 |
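The entries of Tables 4 and 5 follow from the totals in Tables 1–3, which can be checked with a short calculation. The sketch below is illustrative, with hypothetical function names; it also assumes that the outer MLD loop runs over the smaller constellation, which is the choice that reproduces the mixed 16 QAM/QPSK row.

```python
from math import log2

def mld_ops(m1, m2):
    """(multiplications, additions) from the Table 1 totals.
    Assumption: the outer loop runs over the smaller constellation."""
    outer, inner = min(m1, m2), max(m1, m2)
    return 8 * outer + 12 * outer * inner, 8 * outer + 11 * outer * inner

def scheme1_ops(m1, m2):
    """(multiplications, additions) from the Table 2 totals."""
    bits = int(log2(m1))
    return 28 * m2 + 12 * bits + 8, 25 * m2 + 11 * bits + 6

def scheme2_ops(m1, m2):
    """(multiplications, additions) from the Table 3 totals."""
    return 28 * (m1 + m2), 25 * (m1 + m2)

for m1, m2 in [(4, 4), (16, 4), (16, 16), (64, 64)]:
    ref_mul, ref_add = mld_ops(m1, m2)
    for name, ops in (("scheme I", scheme1_ops), ("scheme II", scheme2_ops)):
        mul, add = ops(m1, m2)
        print(f"M1={m1:2d} M2={m2:2d} {name:9s} "
              f"mult {mul:5d} ({100 * mul / ref_mul:6.2f}%)  "
              f"add {add:5d} ({100 * add / ref_add:6.2f}%)")
```

The percentages are relative to the optimal MLD counts for the same pair of modulation levels.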
From these analyses, we can compare the number of operations in various sets of modulation levels. Tables 4 and 5 show the numerical comparisons of real multiplication and real addition per one subcarrier, respectively.

As shown in these tables, when both PSSs transmit 16 QAM signals, the complexity of the proposed scheme I is only 15.75% and 15.29% of the optimal MLD for real multiplication and real addition, respectively. The proposed scheme II requires only about 28% and 27.17% computational complexity as compared to the optimal MLD under the same condition. In addition, when both PSSs transmit 64 QAM signals, the computational complexity of the proposed scheme I is only 3.77% and 3.67% as compared to the optimal MLD in terms of real multiplication and real addition, respectively. Under the same condition, the proposed scheme II needs only about 7.22% and 7.02% of the optimal MLD. The relative computational complexity of the proposed schemes decreases significantly as the symbol modulation level increases.

4. Simulation Results

The performance of the proposed MIMO decoding schemes was evaluated through link-level simulations under the mobile WiMAX specifications. We considered the Vehicular-A channel environment in recommendations ITU-R M.1225 [13] at 60 km/h mobile velocity. We also assumed that there were no correlations between the two PSSs. The convolutional turbo coding (CTC) with code rate r was utilized as the channel coding, for which the maximum number of iterations was eight. The packet size was the same as the CTC block size, which was 144, 216, 288, and 432 bits when the modulation and coding scheme (MCS) level was (QPSK, r = 1/2), (QPSK, 3/4), (16 QAM, 1/2), and (16 QAM, 3/4), respectively. We assumed that the uplink channel response was perfectly known at the RAS, and there were no time/frequency offsets in the system. We also assumed that the power offset between the two PSSs was 0 dB. The data packet was fully loaded in 12 OFDMA symbols per frame; we considered only the partial usage of subchannels (PUSC) [5] mode with a subchannel rotation enabled as a type of subchannelization. The main system parameters of the mobile WiMAX for the simulation are described in Table 6.

For the BLER performance comparison, other conventional spatial multiple access decoding schemes, including the optimal MLD, MMSE nulling, and MMSE-ordered successive interference cancellation (MMSE-OSIC) [14], are involved in our link-level simulations.

Figures 6, 7, 8, and 9 show the BLER performances of the first PSS when the MCS of both PSSs are (QPSK, 1/2), (QPSK, 3/4), (16 QAM, 1/2), and (16 QAM, 3/4). As shown in these results, the proposed scheme I has at most 0.2 dB performance degradation from the optimal MLD at a BLER of $10^{-2}$. Specifically in Figure 6, the BLER performance of the proposed algorithm is almost the same as the MLD in
the case of (16 QAM, 3/4). Moreover, the proposed scheme II has no performance degradation as compared to the optimal MLD for every MCS set.

Figures 10 and 11 show the BLER performances of the first PSS when the MCS combinations are {(16 QAM, 1/2), (QPSK, 1/2)} and {(16 QAM, 3/4), (QPSK, 3/4)}, respectively. Here, {A, B} means that the first PSS transmits with A MCS level and the second transmits with B MCS level. As shown in the results, the proposed scheme I has about 1.0 dB and 0.3 dB performance degradation from the optimal MLD at BLER of 10^-3, respectively. However, the proposed scheme II, which has about two times complexity of the proposed scheme I, shows almost the same BLER performance of the optimal MLD scheme. We also observe that MMSE nulling and MMSE-OSIC suffer more performance degradation as the code rate increases.

Table 6: Main system parameters for the mobile WiMAX systems [5].

Parameters                       Values
Bandwidth                        8.75 MHz
Sampling factor                  8/7
Sampling frequency               10 MHz
Number of FFT points             1,024
Tone spacing                     9.765625 kHz
Effective signal bandwidth       8.447 MHz
Basic OFDMA symbol time          102.4 μs
Cyclic prefix time               12.8 μs
OFDMA symbol time                115.2 μs
TDD frame length                 5 ms
Number of symbols in a frame     42
Number of DL/UL symbols          27:15

Figure 6: Comparison of the BLER performance of the first PSS for the proposed and the conventional schemes when both MSs transmit with (QPSK, 1/2). [Plot: BLER versus received SNR (dB); curves: MLD, MMSE, MMSE-OSIC, Proposed I, Proposed II.]

Figure 7: Comparison of the BLER performance of the first PSS for the proposed and the conventional schemes when both MSs transmit with (QPSK, 3/4). [Plot: BLER versus received SNR (dB); curves: MLD, MMSE, MMSE-OSIC, Proposed I, Proposed II.]

Figure 8: Comparison of the BLER performance of the first PSS for the proposed and the conventional schemes when both MSs transmit with (16 QAM, 1/2). [Plot: BLER versus received SNR (dB); curves: MLD, MMSE, MMSE-OSIC, Proposed I, Proposed II.]

Figure 9: Comparison of the BLER performance of the first PSS for the proposed and the conventional schemes when both MSs transmit with (16 QAM, 3/4). [Plot: BLER versus received SNR (dB); curves: MLD, MMSE, MMSE-OSIC, Proposed I, Proposed II.]

Figure 10: Comparison of the BLER performance of the first PSS for the proposed and the conventional schemes when the MCS combination is {(16 QAM, 1/2), (QPSK, 1/2)}. [Plot: BLER versus received SNR (dB); curves: MLD, MMSE, MMSE-OSIC, Proposed I, Proposed II.]

Figure 11: Comparison of the BLER performance of the first PSS for the proposed and the conventional schemes when the MCS combination is {(16 QAM, 3/4), (QPSK, 3/4)}. [Plot: BLER versus received SNR (dB); curves: MLD, MMSE, MMSE-OSIC, Proposed I, Proposed II.]

5. Conclusion

In this paper, we have proposed suitable decoding schemes, which achieve near-optimal ML performance with low computational complexity in uplink virtual MIMO systems that utilize LLR information at the channel decoder. As the proposed schemes almost satisfy the criterion of the WiMAX forum mobile RCT, which is based on the performance of ML, the whole system can achieve a higher margin of implementation loss.

The link-level simulation is performed under the assumption of perfect synchronization between two PSSs, and performance of the virtual MIMO system may be decreased in case of any imperfect synchronization. The link-level performance shows that the proposed schemes have almost the same BLER performance as compared to the optimal MLD. The proposed scheme I has only 15.75% computational complexity in terms of real multiplication as compared to the optimal MLD when both PSSs transmit 16 QAM signals, and only 3.77% for 64 QAM signals. Moreover, the proposed scheme II achieves exactly the same BLER performance of the optimal MLD with only 28% and 7.22% computational complexity of the optimal MLD, when both PSSs transmit 16 QAM and 64 QAM signals, respectively. We expect that there is more significant complexity reduction in systems that transmit with a higher modulation level.

Acknowledgments

This work was supported by the Soongsil University Research Fund, and the MKE under the ITRC support program supervised by the IITA (IITA-2008-C1090-0803-0002), Republic of Korea.

References

[1] A. J. Paulraj, D. A. Gore, R. U. Nabar, and H. Bölcskei, “An overview of MIMO communications—a key to gigabit wireless,” Proceedings of the IEEE, vol. 92, no. 2, pp. 198–217, 2004.
[2] A. M. Sayeed, “A virtual MIMO channel representation and applications,” in Proceedings of IEEE Military Communications Conference (MILCOM ’03), vol. 1, pp. 615–620, Boston, Mass, USA, October 2003.
[3] S. W. Kim and R. Cherukuri, “Cooperative spatial multiplexing for high-rate wireless communications,” in Proceedings of the 6th IEEE Workshop on Signal Processing Advances in Wireless Communications (SPAWC ’05), pp. 181–185, New York, NY, USA, June 2005.
[4] Z. Shi, H. Sun, C. Zhao, and Z. Ding, “Linear precoder

optimization for ARQ packet retransmissions in centralized
multiuser MIMO uplinks,” IEEE Transactions on Wireless
[5] IEEE 802.16 Standard Committee, “IEEE 802.16-2005: IEEE
Standard for Local and Metropolitan Area Networks—Part 16:
Air Interface for Fixed and Mobile Broadband Wireless Access
Systems—Amendment 2: Physical Layer and Medium Access
Control Layers for Combined Fixed and Mobile Operation in
Licensed Bands,” February 2006.
[6] 3GPP TR 25.814, “Physical layer aspect for evolved Universal
Terrestrial Radio Access (UTRA),” v7.1.0 (Release 7), Septem-
ber 2006.
[7] WiMAX Forum, “Mobile Radio Conformance Tests Amend-
ment: Wave 2 Tests,” July 2007.
[8] X. Zhu and R. D. Murch, “Performance analysis of maximum
likelihood detection in a MIMO antenna system,” IEEE
Transactions on Communications, vol. 50, no. 2, pp. 187–191,
2002.
[9] H. Kawai, K. Higuchi, N. Maeda, et al., “Likelihood function
for QRM-MLD suitable for soft-decision turbo decoding and
its performance for OFCDM MIMO multiplexing in multi-
path fading channel,” IEICE Transactions on Communications,
vol. E88-B, no. 1, pp. 47–57, 2005.
[10] B. M. Hochwald and S. ten Brink, “Achieving near-capacity on
a multiple-antenna channel,” IEEE Transactions on Communi-
cations, vol. 51, no. 3, pp. 389–399, 2003.
[11] Y. Li and Z.-Q. Luo, “Parallel detection for V-BLAST system,”
in Proceedings of IEEE International Conference on Communi-
cations (ICC ’02), vol. 1, pp. 340–344, New York, NY, USA,
April 2002.
[12] A. Paulraj, R. Nabar, and D. Gore, Introduction to Space-
Time Wireless Communications, Cambridge University Press,
Cambridge, UK, 2003.
[13] Recommendation ITU-R M.1225, “Guidelines for Evaluation
of Radio Transmission Technologies for IMT-2000,” 1997.
[14] P. W. Wolniansky, G. J. Foschini, G. D. Golden, and R.
A. Valenzuela, “V-BLAST: an architecture for realizing very
high data rates over the rich-scattering wireless channel,”
in Proceedings of the International Symposium on Signals,
Systems and Electronics (ISSSE ’98), pp. 295–300, Pisa, Italy,
September-October 1998.
doi:10.1155/2009/609386
Research Article
Separate Turbo Code and Single Turbo Code Adaptive
OFDM Transmissions
Lei Ye1 and Alister Burr2

1 Department of Communication Engineering, University of Chongqing, Chongqing 400044, China
2 Department of Electronics, University of York, Heslington, York YO10 5DD, UK
Correspondence should be addressed to Lei Ye, yelei0815@gmail.com
Received 30 June 2008; Revised 22 December 2008; Accepted 18 January 2009
This paper discusses the application of adaptive modulation and adaptive rate turbo coding to orthogonal frequency-division
multiplexing (OFDM), to increase throughput on the time and frequency selective channel. The adaptive turbo code scheme is
based on a subband adaptive method, and compares two adaptive systems: a conventional approach where a separate turbo code is
used for each subband, and a single turbo code adaptive system which uses a single turbo code over all subbands. Five modulation
schemes (BPSK, QPSK, 8AMPM, 16QAM, and 64QAM) are employed and turbo code rates considered are 1/2 and 1/3. The
performances of both systems with high (10^-2) and low (10^-4) BER targets are compared. Simulation results for throughput and
BER show that the single turbo code adaptive system provides a significant improvement.
Copyright © 2009 L. Ye and A. Burr. This is an open access article distributed under the Creative Commons Attribution License,
1. Introduction

When transmitted in time dispersive channels, the bit error rate (BER) achieved on different orthogonal frequency-division multiplexing (OFDM) subcarriers depends on the frequency domain channel transfer function. The bit errors are normally concentrated in a few severely faded subcarriers in conventional nonadaptive OFDM systems. In the rest of the subcarriers, there are normally no bit errors. If we can identify these high BER subcarriers and apply more powerful, lower rate forward error correction (FEC) codes, the overall BER of the whole OFDM frame will be improved, while employing higher order modulation and higher rate codes on the high-quality subcarriers can improve overall throughput.

Adaptive modulation was proposed for exploiting the time variant Shannonian channel capacity of fading channel by Steele and Webb [1]. In addition to excluding some fading subcarriers and varying the modulation mode, that is adaptive modulation, code rate can also be adapted. Subsequently, Tang [2] contributed an intelligent learning scheme for the appropriate adjustment of switching thresholds. Bizzarri et al. proposed adaptive space-time-frequency coding schemes for multiple-input multiple-output (MIMO) OFDM in [3]. In the present paper, we propose the use of adaptive modulation modes and adaptive code rates for turbo-coded OFDM in two different ways: a single turbo code scheme, and a separate turbo code scheme. In both schemes, each subband [4] contains a set of subcarriers for which the modulation mode and turbo code rate is determined by the signal-to-noise ratio (SNR) of subcarriers in this subband, but in the single turbo code scheme only one turbo code is used over all subbands, while the separate turbo code scheme uses a distinct turbo code frame for each subband.

In the remainder of this paper, we first give a description of the system structure of both adaptive turbo-coded OFDM schemes. We then discuss and compare the numerical results of simulation. The final section gives a summary of the work.

2. System Structure

This adaptive turbo code OFDM system is based on a subband adaptive method. For the separate turbo code adaptive OFDM scheme, the subcarriers are divided into several subbands. Then, the SNR of each subcarrier within the subband is calculated and the modulation mode and code rate for each subband are determined from these SNR values.
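
The per-subband decision described above can be sketched in a few lines of Python. The sketch uses the relation the paper gives later as (4), local SNR = |H_n|^2 times the overall SNR, and the worst-subcarrier rule of its fixed threshold adaptation. The switching thresholds below are hypothetical placeholders: the paper derives its thresholds from simulated BER curves and does not list them. All function names are likewise illustrative.

```python
import cmath
import math

# Hypothetical switching thresholds in dB (lowest local SNR at which each
# scheme is allowed). Placeholders only: the paper does not list its values.
THRESHOLDS_DB = [
    (2.0, "BPSK"),
    (5.0, "QPSK"),
    (9.0, "8AMPM"),
    (12.0, "16QAM"),
    (18.0, "64QAM"),
]


def channel_transfer_function(impulse_response, n_subcarriers):
    """Frequency-domain transfer function H_n as the DFT of the impulse response."""
    return [
        sum(h * cmath.exp(-2j * cmath.pi * n * k / n_subcarriers)
            for k, h in enumerate(impulse_response))
        for n in range(n_subcarriers)
    ]


def local_snr_db(transfer_function, overall_snr_db):
    """Local SNR of each subcarrier, gamma_n = |H_n|^2 * gamma, expressed in dB."""
    gamma = 10.0 ** (overall_snr_db / 10.0)
    # The floor of 1e-12 only avoids log10(0) on a perfectly nulled subcarrier.
    return [10.0 * math.log10(max(abs(h) ** 2 * gamma, 1e-12))
            for h in transfer_function]


def fixed_threshold_scheme(worst_snr_db):
    """Pick the highest-order scheme whose threshold the worst subcarrier meets."""
    chosen = "no transmission"
    for threshold_db, scheme in THRESHOLDS_DB:
        if worst_snr_db >= threshold_db:
            chosen = scheme
    return chosen


def adapt_subbands(snr_db, n_subbands):
    """Split the subcarriers into equal subbands and choose one scheme per subband."""
    size = len(snr_db) // n_subbands
    return [fixed_threshold_scheme(min(snr_db[b * size:(b + 1) * size]))
            for b in range(n_subbands)]
```

For example, `adapt_subbands(local_snr_db(channel_transfer_function([1.0, 1.0], 64), 20.0), 4)` switches off the two subbands around the spectral null of a two-tap channel while the other two carry the highest-order scheme.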
Figure 1: System structure of separate turbo code adaptive system. [Block diagram. Transmitter: data, S/P, turbo coding and modulation per subband, interleave to OFDM frame, IFFT, add CP, channel. Receiver: remove CP, FFT, interleave to turbo code frame, turbo decoding and demodulation per subband, P/S. A scheme type decision block is also shown.]
Then, separate turbo coding and modulation are employed for each subband. At the transmitter, before adding a cyclic prefix, an inverse fast Fourier transform (IFFT) is applied to the OFDM frames.

After transmission through the fading channel, an FFT is applied to the received signal after removal of the cyclic prefix. Then, each subband is demodulated and decoded separately. The structure of the separate turbo coded adaptive OFDM system used is illustrated in Figure 1.

For the single turbo code adaptive OFDM scheme, the information bits are firstly turbo coded by a single turbo code, then modulated separately for each subband. In the same way, at the receiver, the signal from the output of the FFT is demodulated separately then decoded as a single turbo code frame. In this scheme, when other parameters are the same, the length of turbo code used is much longer than for the separate turbo code scheme, therefore it can get better system performance with the single turbo code adaptive OFDM system. The structure of the single turbo code adaptive OFDM system used is illustrated in Figure 2.

2.1. Transmission Block Structure. In the separate turbo code scheme, the OFDM frame is divided into several subbands, and the turbo code frame combines the same subband of several OFDM symbols. Figure 3 shows the structure of the block and the relationship of turbo code frame and OFDM frame of the separate turbo code scheme.

In the single turbo code scheme, the OFDM frame is also divided into several subbands, each subband using a different modulation scheme. The turbo code frame combines several OFDM symbols. Figure 4 shows the structure of the block and the relationship of turbo code frame, OFDM frame, and cyclic prefix of single turbo code scheme.

Figure 3 shows that the number of modulated symbols in each turbo code frame after coding and modulation should be the same (because the number of subcarriers in each subband is the same). Hence, if different modulation mode and code rate are used for different turbo code frames (different subbands), the length of turbo code must be different for each frame, and therefore the length of the turbo code interleaver must be variable. As shown in Figure 4, the turbo code length of the single turbo code system is also variable because the modulation scheme for the subbands changes. In this work, we choose to use an S-random interleaver for both schemes, which was first described in [5], and hence we need to provide a variable-length S-random interleaver.

Popovski presents a flexible length S-random interleaver algorithm in [6]. In our simulation, we provided a simplified algorithm to generate a flexible S-random interleaver with the same S-parameter as a given interleaver.

Assume that we already have a length K S-random interleaver with S-parameter S, denoted by \pi_K, and let L > K be the maximal interleaver length of interest. Starting from N = K, each permutation \pi_{N+1} of length K < N + 1 ≤ L is obtained from the permutation \pi_N by inserting N at position j_N as follows:

\[
\pi_{N+1}(i) =
\begin{cases}
\pi_N(i), & 0 \le i < j_N, \\
N, & i = j_N, \\
\pi_N(i-1), & j_N < i \le N.
\end{cases}
\qquad (1)
\]
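
The insertion step (1), together with the spacing check that this section describes for selecting j_N, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it takes the first position that passes the check, skips p = j_N when testing the spacing (that position would trivially give D = 0), and all function names are hypothetical.

```python
def insert_largest(pi, j):
    """Equation (1): build pi_{N+1} from pi_N by inserting the value N at position j."""
    n = len(pi)
    return pi[:j] + [n] + pi[j:]


def new_element_is_spread(pi_next, j, s):
    """Check D = |pi(p) - pi(j)| > S for every p within S positions of j, p != j."""
    lo = max(0, j - s)
    hi = min(len(pi_next) - 1, j + s)
    return all(abs(pi_next[p] - pi_next[j]) > s
               for p in range(lo, hi + 1) if p != j)


def extend_by_one(pi, s):
    """Return (pi_{N+1}, j_N) for the first valid insertion position, or None."""
    for j in range(len(pi) + 1):
        candidate = insert_largest(pi, j)
        if new_element_is_spread(candidate, j, s):
            return candidate, j
    return None


def extend_to_length(pi, s, target_length):
    """Grow pi one element at a time and record the insertion positions j_K, j_{K+1}, ..."""
    positions = []
    while len(pi) < target_length:
        step = extend_by_one(pi, s)
        if step is None:
            raise ValueError("no valid insertion position for this S-parameter")
        pi, j = step
        positions.append(j)
    return pi, positions


def is_s_random(pi, s):
    """Full check: entries at most S positions apart differ by more than S."""
    return all(abs(pi[a] - pi[b]) > s
               for a in range(len(pi))
               for b in range(a + 1, min(len(pi), a + s + 1)))
```

Only the new element needs checking at each step: inserting one entry never moves two existing entries closer together, so pairs that already satisfied the spacing rule still do.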
The length N + 1 S-random interleaver is selected using the following steps for a given interleaver of length N with S-parameter S. For each j_N from 0 to N, create a length N + 1 permutation using (1) above. For each, calculate

\[
D = \left| \pi_{N+1}(p) - \pi_{N+1}(j_N) \right|,
\]

for

\[
p =
\begin{cases}
0 \to j_N + S, & j_N < S, \\
j_N - S \to j_N + S, & S \le j_N < N - S, \\
j_N - S \to N, & j_N \ge N - S.
\end{cases}
\qquad (2)
\]

Then, if D is larger than S for all p, this permutation \pi_{N+1} is a length N + 1 S-random interleaver with S-parameter S. Otherwise, we calculate D for the next permutation \pi_{N+1}.

Using this algorithm, S-random interleavers of every length of interest up to L > K may easily be obtained. Once we find all the positions j for all the lengths of interest, the interleaver can be defined using very simple rules by adding positions to the original interleaver. The system needs only to store the original length interleaver \pi_K and the values j_K, j_{K+1}, ..., j_{L−1}.

Figure 2: System structure of one turbo code adaptive system. [Block diagram. Transmitter: data, turbo code, S/P, modulation per subband, interleave to OFDM frame, IFFT, add CP, channel. Receiver: remove CP, FFT, interleave to turbo code frame, demodulation per subband, P/S, turbo decoder. A scheme type decision block is also shown.]

Figure 3: Transmission block structure of separate turbo code adaptive system. [Time-frequency grid: OFDM symbols 1 to F along the time axis, subbands 1 to T along the frequency axis; each turbo code frame covers one subband across the OFDM symbols.]

Figure 4: Transmission block structure of one turbo code adaptive system. [Time-frequency grid: OFDM symbols 1 to F along the time axis, subbands 1 to T along the frequency axis, with one turbo code frame.]

2.2. Adaptation Algorithm for Both Schemes. The adaptation algorithms used for both schemes are the same. Firstly, the
SNR of each subcarrier needs to be calculated. We assume that the impulse response of the fading channel is time-invariant for the duration of one OFDM symbol. Therefore, the frequency domain channel transfer function H_n can be determined by a Fourier transform of the impulse response. The received data symbols R_n can be expressed as

\[
R_n = S_n \cdot H_n + n_n, \qquad (3)
\]

where S_n is transmission signal and n_n is Gaussian noise. Since the channel’s frequency domain transfer function H_n is independent of the noise power in each subcarrier, the local SNR of each subcarrier n can be expressed as

\[
\gamma_n = |H_n|^2 \cdot \gamma, \qquad (4)
\]

where \gamma is the overall SNR. If there is no inter-subcarrier interference (ISI), or interference from other sources, the value \gamma_n determines the bit error probability of subcarrier n.

Then, the threshold for the given long-term BER target should be determined. Five modulation schemes, namely, BPSK, QPSK, 8AMPM, 16QAM, and 64QAM are used in these adaptive turbo coded OFDM systems. The turbo code rate is 1/2. Therefore, there are in total 5 modulation and code schemes for modulation and coding, plus the nontransmission case. By simulating the nonadaptive OFDM system with these schemes separately, the SNR thresholds for a given long-term target BER can be determined. The length of turbo code affects the BER performance of the turbo code especially when the turbo code length is short. This will affect the adaptive system threshold as well. Figure 5 gives the BER curves for each of these schemes with different length of turbo code. Here, we used soft-output demodulation in this system. From this figure, the threshold for the BER target can be determined.

Figure 5: BER of nonadaptive turbo coded OFDM system. [Plot: BER curves for BPSK, QPSK, 8AMPM, 16QAM, and 64QAM, each with several turbo code lengths between 378 and 27642.]

Figure 6: Rayleigh fading channel impulse response. [Plot: amplitude versus sample index.]

For subband adaptive OFDM transmission, there are several subcarriers with different local SNRs in each subband; therefore the lowest quality subcarrier in the subband is chosen to compare with the threshold, which is fixed threshold adaptation.

In the fixed threshold adaptation, using a conservative approach, the worst subcarrier in each subband is used for channel quality estimation, to determine the modulation mode and the code rate. Therefore, the overall BER in one subband is normally lower than the BER target. If the overall BER can be closer to the BER target (though still below it) by choosing a more suitable modulation mode or code rate, the throughput of the system will be higher. Therefore, we propose an optimal adaptation algorithm giving a better tradeoff between throughput and overall BER by choosing more suitable schemes for each subband on the basis of fixed threshold adaptation.

Firstly, the local SNR for each subcarrier is calculated. Then, the fixed threshold adaptation algorithm should be used to calculate the original scheme A_n for each subband. For each subband, if a higher order scheme A_{n+1} is employed, the estimated BER of each subcarrier in the subband can be obtained from the BER curve of the nonadaptive system (Figure 5), and hence the estimated average BER of this subband with this scheme A_{n+1} can be obtained. If the estimated average BER is still lower than the BER target, then we calculate the estimated average BER of this subband with higher order schemes A_{n+2} and A_{n+3}, and so forth, till the estimated average BER is worse than the BER target. By using this algorithm, the highest order scheme, and therefore the highest throughput, which can fulfil the BER target is found.

3. Numerical Results

3.1. Transmission Assumptions. In the simulation for both systems, we set the BER target to 10^-2. There are 768 subcarriers in each OFDM symbol, which are split into 16 subbands with 48 subcarriers in each. In the separate turbo code adaptive system, for every turbo code frame, subbands from 12 OFDM symbols are combined, which means the length of the signal after turbo encoding and modulation is 48 × 12 = 576 modulation symbols. For the single turbo code adaptive system, we still assumed 12 OFDM symbols are combined, and all 768 × 12 = 9216 subcarriers are included in one turbo code frame. So, the length of the signal after turbo encoding and modulation is 768 × 12 = 9216 modulation
symbols. For schemes of higher order than QPSK, bit-interleaved coded modulation is used [7].

A frequency selective fading channel is assumed in this simulation. The impulse response h(τ, t) was generated using a tapped delay line channel model in which each tap amplitude follows an independent Rayleigh distribution. The impulse response is shown in Figure 6.

In this simulation, perfect knowledge of the channel transfer function at the receiver is assumed. Also, the channel impulse response does not change during one turbo code frame (12 OFDM symbols) block.

3.2. Simulation Results. In our separate turbo code and single turbo code adaptive turbo-coded OFDM systems, as mentioned above, there are five modulation modes with code rate 1/2, plus the nontransmission case. In the separate turbo code system, if one subband is determined to employ BPSK, the length of the turbo code frame used in this subband is 282, while for the subbands employing QPSK, 8AMPM, 16QAM, and 64QAM, the lengths of the turbo code frames are 570, 858, 1146, and 1722, respectively. Compared to the separate turbo code system, the length of the turbo code in the single turbo code adaptive system is much longer. When none of the subbands are marked as nontransmission, the length of a turbo code with BPSK in all subbands is 4602, and the length of a turbo code with 64QAM in all subbands is 27642.

Figure 7 illustrates the BER performance of both adaptive turbo coded OFDM systems using these 5 modulation schemes, with the optimal adaptation algorithm. Also, Figure 8 shows the throughput in bits per second (bps) of both systems using the same adaptation algorithm.

The dotted lines in Figure 7 are the BER performances of the nonadaptive OFDM system with 1/2 rate turbo code and the 5 modulation schemes mentioned above. The red solid line with circles in both Figures 7 and 8 is the BER and throughput performance for the separate turbo code system, while the blue solid line with squares in both figures is the BER and throughput performance for the single turbo code system. These figures show that the single turbo code adaptive OFDM system can provide better bps throughput performance than the separate turbo code adaptive OFDM system with the same adaptation algorithm. Both of the systems fulfil the BER target, but the single turbo code system is closer to it.

Figure 7: BER of separate turbo code system and single turbo code system. [Plot: BER versus SNR; curves: BPSK, QPSK, 8AMPM, 16QAM, 64QAM, separate turbo code adaptive system, single turbo code adaptive system.]

Figure 8: Throughput of separate turbo code system and one turbo code system. [Plot: BPS versus SNR; curves: separate turbo code adaptive system, single turbo code adaptive system.]

3.3. Simulation Results with Rate 1/3 BPSK. As shown in Figure 8, the difference in throughput between these two schemes is smaller when the overall SNR is low. The reason is that when the SNR is low, there are more subbands with no transmission, so the length of the turbo code in the one turbo code system is similar to the length in the separate turbo code system. Hence, a more powerful, lower rate FEC code can be included in the adaptation algorithm to reduce the number of nontransmission subbands. In this case, we choose code rate 1/3 with BPSK as another option.

In the separate turbo code system, it is easy to add a rate 1/3 BPSK scheme. In the single turbo code system, because only one turbo code is used, we initially use the rate 1/3 code. The subbands with code rate 1/2 are achieved by puncturing the parity bit sequences. The process of puncturing is illustrated in Figure 9.
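
The turbo code frame lengths quoted in Section 3.2 are consistent with a simple count: modulation symbols per frame, times bits per symbol, times the code rate, minus 6. The Python sketch below reproduces the quoted numbers under that reading. The offset of 6 is inferred here from the numbers themselves (it would be compatible with trellis termination bits, but the paper does not say so), and the names are illustrative.

```python
from fractions import Fraction

# Bits carried by one modulation symbol of each scheme.
BITS_PER_SYMBOL = {"BPSK": 1, "QPSK": 2, "8AMPM": 3, "16QAM": 4, "64QAM": 6}

# Assumption inferred from the quoted frame lengths, not stated in the paper.
ASSUMED_OVERHEAD_BITS = 6


def turbo_frame_length(n_symbols, scheme, code_rate=Fraction(1, 2)):
    """Data bits per turbo code frame for a block of n_symbols modulation symbols."""
    coded_bits = n_symbols * BITS_PER_SYMBOL[scheme]
    return int(coded_bits * code_rate) - ASSUMED_OVERHEAD_BITS


if __name__ == "__main__":
    separate = 48 * 12   # one subband over 12 OFDM symbols
    single = 768 * 12    # all subcarriers over 12 OFDM symbols
    for scheme in BITS_PER_SYMBOL:
        print(scheme, turbo_frame_length(separate, scheme),
              turbo_frame_length(single, scheme))
```

With the same modulation in every subband, the single-code frame carries 16 subbands' worth of coded bits in one codeword, which is where its longer interleaver comes from.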
Rate 1/3
k =   0  1  2  3  4  5  6  7  ···
X     1  1  1  1  1  1  1  1
P1    1  1  1  1  1  1  1  1
P2    1  1  1  1  1  1  1  1
Output sequence: X_0 P1_0 P2_0 X_1 P1_1 P2_1 X_2 P1_2 P2_2 ···

Rate 1/2
k =   0  1  2  3  4  5  6  7  ···
X     1  1  1  1  1  1  1  1
P1    1  x  1  x  1  x  1  x
P2    x  1  x  1  x  1  x  1
Output sequence: X_0 P1_0 X_1 P2_1 X_2 P1_2 X_3 P2_3 X_4 ···

Figure 9: Puncture process of different code rate turbo code (“1” in the table means the code bit is included, “x” means the code bit is punctured).
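
The puncturing pattern of Figure 9 can be written out directly: every systematic bit is kept, and the two parity streams are kept alternately, P1 at even k and P2 at odd k. The Python sketch below is one way to express it. The function names are illustrative, and the streams are lists of labels only so that the output can be compared with the sequences in the figure.

```python
def rate_third_stream(systematic, parity1, parity2):
    """Unpunctured rate 1/3 output: X_k, P1_k, P2_k for every k."""
    out = []
    for x, p1, p2 in zip(systematic, parity1, parity2):
        out.extend([x, p1, p2])
    return out


def puncture_to_rate_half(systematic, parity1, parity2):
    """Rate 1/2 output: keep X_k always, P1_k for even k and P2_k for odd k."""
    out = []
    for k, x in enumerate(systematic):
        out.append(x)
        out.append(parity1[k] if k % 2 == 0 else parity2[k])
    return out


if __name__ == "__main__":
    n = 4
    xs = [f"X{k}" for k in range(n)]
    p1 = [f"P1_{k}" for k in range(n)]
    p2 = [f"P2_{k}" for k in range(n)]
    print(rate_third_stream(xs, p1, p2))
    print(puncture_to_rate_half(xs, p1, p2))
```

Because both rates come from the same mother code, a single encoder can serve subbands at rate 1/3 and rate 1/2 within one frame by switching only the puncturing.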
Figure 10 illustrates the BER performance of both adaptive turbo coded OFDM systems using these 6 modulation and code schemes, with the optimal adaptation algorithm. Also, Figure 11 shows the throughput in bps of both systems using the same adaptation algorithm.

The magenta solid lines with crosses in both Figures 10 and 11 are the BER and throughput performance for the separate turbo code system with code rate 1/3 BPSK, while the black solid lines with stars in both figures are the BER and throughput performance for the single turbo code system. The BER performance of both systems is similar to that of the systems without rate 1/3 BPSK, but the throughput performance at lower SNR is slightly higher than that of the systems without rate 1/3 BPSK.

Figure 10: BER of separate turbo code system and one turbo code system with rate 1/3 BPSK. [Plot: BER versus SNR; curves: BPSK, QPSK, 8AMPM, 16QAM, 64QAM, separate turbo code system with rate 1/3 BPSK, single turbo code system with rate 1/3 BPSK.]

Figure 11: Throughput of separate turbo code system and one turbo code system with rate 1/3 BPSK. [Plot: BPS versus SNR; curves: separate turbo code system with rate 1/3 BPSK, single turbo code system with rate 1/3 BPSK.]

3.4. Simulation Results for Lower BER Target 10^-4. The advantage of the single turbo coded schemes in the cases simulated here is relatively small because of the relatively high BER target, at which point the required SNR is not much affected by the code length. So, a lower BER target of 10^-4 was chosen for detailed simulation.

Figure 12 illustrates the BER performance of both adaptive turbo-coded OFDM systems with this new BER target. Also, Figure 13 shows the throughput in bps of both systems with the same BER target. The red solid lines with stars in both Figures 12 and 13 are the BER and throughput performance for the separate turbo code system, and the black solid lines with circles in both figures are the BER and throughput
performance for the single turbo code system. For this BER target, the single turbo code adaptive turbo-coded OFDM system still outperforms the separate turbo code adaptive system.

Figure 12: BER of separate turbo code system and one turbo code system with BER target 10^-4. [Plot: BER versus SNR; curves: BPSK, QPSK, 8AMPM, 16QAM, 64QAM, separate turbo adaptive system, single turbo code adaptive system.]

Figure 13: Throughput of separate turbo code system and one turbo code system with BER target 10^-4. [Plot: BPS versus SNR; curves: separate turbo code with low BER target, single turbo code system with low BER target.]

4. Summary

This paper has presented two adaptive modulation and code rate turbo-coded OFDM schemes, namely the separate turbo code system and the single turbo code system, including descriptions of the system structures and the transmission signal block structure design. A flexible length S-random interleaver algorithm is used. The performances of the two adaptive systems have also been compared. As shown in the numerical results, for both high and low BER targets, the single turbo code adaptive system provides better performance by using longer turbo codes.

The gap between the turbo-coded systems and the Shannon bound remains large, indicating that substantial further gains should be possible, given that turbo codes are in principle able to approach this bound very closely. The reasons for the gap here include the rather simple approach to coded modulation for higher order modulation, and that for low code rates the turbo frame (in terms of data bits) is relatively short in both systems. Future work will investigate the use of improved coded modulation and the effect of more realistic channel estimation.

References

[1] R. Steele and W. T. Webb, “Variable rate QAM for data transmission over Rayleigh fading channels,” in Proceedings of IEEE Wireless Conference, pp. 1–14, Calgary, Canada, July 1991.
[2] C. Tang, “An adaptive learning approach to adaptive modulation,” in Proceedings of International Conference on Third Generation Wireless and Beyond (3Gwireless ’01), San Francisco, Calif, USA, May-June 2001.
[3] E. Bizzarri, A. S. Gallo, and G. M. Vitetta, “Adaptive space-time-frequency coding schemes for MIMO OFDM,” in Proceedings of IEEE Global Telecommunications Conference (GLOBECOM ’04), vol. 2, pp. 933–937, Dallas, Tex, USA, November-December 2004.
[4] L. Hanzo, C. H. Wong, and M. S. Yee, Adaptive Wireless Transceivers: Turbo-Coded, Turbo-Equalized and Space-Time Coded TDMA, CDMA, and OFDM Systems, Wiley-IEEE Press, Madison, Wis, USA, 2002.
[5] S. Dolinar and D. Divsalar, “Weight distributions for turbo codes using random and nonrandom permutations,” Tech. Rep. TDA PR 42–122, Jet Propulsion Laboratory, Pasadena, Calif, USA, August 1995.
[6] P. Popovski, L. Kocarev, and A. Risteski, “Design of flexible-length S-random interleaver for turbo codes,” IEEE Communications Letters, vol. 8, no. 7, pp. 461–463, 2004.
[7] G. White, Optimised turbo codes for wireless channels, Ph.D. thesis, University of York, York, UK, 2001.
doi:10.1155/2009/240140
Research Article
Multiresolution with Hierarchical Modulations for
Long Term Evolution of UMTS
Américo Correia,1, 2 Nuno Souto,1, 2 Armando Soares,2 Rui Dinis,1 and João Silva1, 2
1 Instituto de Telecomunicações (IT), Av. Rovisco Pais, 1 Lisboa 1049-001, Portugal
2 Instituto Superior de Ciências do Trabalho e da Empresa (ISCTE ), Av. das Forças Armadas, Lisboa 1649-026, Portugal
Correspondence should be addressed to Américo Correia, americo.correia@lx.it.pt
Received 30 July 2008; Revised 10 December 2008; Accepted 26 February 2009
In the Long Term Evolution (LTE) of UMTS, the Interactive Mobile TV scenario is expected to be a popular service. By using multiresolution with hierarchical modulations, this service is expected to be broadcast to larger groups, achieving a significant reduction in transmission power or an increase in the average throughput. Interactivity in the uplink direction will not be affected by multiresolution in the downlink channels, since it will be supported by dedicated uplink channels. The presence of interactivity will allow for a certain amount of link quality feedback for groups or individuals. As a result, an optimization of the achieved throughput will be possible. In this paper, system level simulations of multicellular networks considering broadcast/multicast transmissions using the OFDM/OFDMA-based LTE technology are presented to evaluate the capacity, in terms of number of TV channels with given bit rates or total spectral efficiency and coverage. Multiresolution with hierarchical modulations is evaluated in terms of the achievable throughput gain compared to single resolution systems of Multimedia Broadcast/Multicast Service (MBMS) standardised in Release 6.
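Copyright © 2009 Américo Correia et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.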
1. Introduction
Third-generation (3G) wireless systems, based on wideband code-division multiple access (WCDMA) radio access technology, are now being deployed on a broad scale all over the world. However, user and operator requirements and expectations are continuously evolving, and competing radio access technologies are emerging. Thus it was important for 3GPP to start considering the next steps in 3G evolution, in order to ensure 3G competitiveness in a 10-year perspective and beyond. As a consequence, 3GPP launched the study item evolved UTRA and UTRAN, the aim of which was to study means to achieve further substantial leaps in terms of service provisioning and cost reduction. The overall target of this long-term evolution (LTE) of 3G was to arrive at an evolved radio access technology that can provide service performance on a par with current fixed line access. As it is generally assumed that there will be a convergence towards the use of Internet Protocol (IP)-based protocols (i.e., all services in the future will be carried on top of IP), the focus of this evolution was on enhancements for packet-based services. 3GPP aimed to conclude the evolved 3G radio access technology in 2008, with subsequent initial deployment in the 2009-2010 time frame. At this point it is important to emphasize that this evolved RAN is an evolution of the current 3G networks, building on investments already made. The 3GPP community has been working on LTE, and various contributions were made to implement MBMS in LTE [1].
Orthogonal frequency division multiplexing/orthogonal frequency division multiple access (OFDM/OFDMA) [2–4], used in the physical layer (downlink connection) of LTE, is an attractive choice to meet requirements for high data rates, with correspondingly large transmission bandwidths and flexible spectrum allocation. OFDM also allows for a smooth migration from earlier radio access technologies and is known for high performance in frequency-selective channels. It further enables frequency-domain adaptation, provides benefits in broadcast scenarios, and is well suited for multiple-input multiple-output (MIMO) processing.
The possibility to operate in vastly different spectrum allocations is essential. Different bandwidths are realized by varying the number of subcarriers used for transmission, while the subcarrier spacing remains unchanged. In this way operation in spectrum allocations of 1.4, 3, 5, 10, 15, and 20 MHz can be supported.
For MBMS support within a certain cell coverage area for a given coverage target, the Modulation and Coding Scheme (MCS) of the MBMS transport channel typically has to be designed under worst-case assumptions. Apart from cell-edge users experiencing large intercell interference, users with better channel conditions (closer to the base station) could receive the same service with a better quality (e.g., video resolution), as their receiving SNR would allow usage of a higher-rate MCS. Hierarchical modulation [5–8], which has been specified for broadcast systems like Digital Video Broadcast Terrestrial (DVB-T) or MediaFLO, is one way of accounting for unequal receiving conditions. Here, a signal constellation like 16QAM, with each symbol being represented by four bits, is interpreted in the sense that the first two bits belong to an underlying QPSK alphabet. This enables the use of two independent data streams with different sensitivity requirements. In the example above, the so-called high-priority stream employs QPSK modulation and is designed to cover the whole service area. The low-priority stream requires the constellation to be demodulated as 16QAM, and provides an additional or refined service via the two additional bits. These may transport an additional MBMS channel with a different type of service, or an enhancement stream that, for example, enhances the resolution of the base stream. A design parameter that determines the constellation layout allows control of the amount of distortion that the enhancement symbols add to the baseline constellation, and can be used to control the ratio of coverage areas or service data rates. A theoretical evaluation of this type of modulation, explicitly showing the dependence of the performance of the individual bit streams on the constellation design parameter, has been previously presented in [9, 10].
Introducing multiresolution in a broadcast system mainly affects two parts, source coding and distribution/signalling. Until recently the source coding has been aimed toward achieving the highest compression ratio possible [11]. With the development of cellular phones into competent multimedia terminals and the integration of the cellular networks with the Internet, the result is a more heterogeneous network with regard to terminal capabilities and connection speed.
In this work it is assumed that scalable source coders are used and scalability is done in layers. The coded stream consists of one basic layer to encode the basic quality and consecutive refinement or enhancement layers for higher quality. The source coder can generate a total of L layers. For simplicity it is also assumed that all layers require the same data rate and target bit error rate. Specifically for broadcast and multicast transmissions in a mobile cellular network, depending on the communication link conditions, some receivers will have better signal-to-noise ratios (SNR) than others and thus the capacity of the communication link for these users is higher.
Hierarchical constellations and MIMO (spatial multiplexing [12, 13]) are methods to offer multiresolution. The authors of this paper have previously analyzed and evaluated these two forms of multiresolution considering the WCDMA technology in [14–16]. In OFDMA-based networks, the transmission of different fractions of the total set of subcarriers (chunks) depending on the position of the mobiles is another way to offer multiresolution. Any of these methods is able to provide unequal bit error protection. In any case there are two or more classes of bits with different error protection, to which different streams of information can be mapped. Regardless of the channel conditions, a given user always attempts to demodulate both the more protected bits and the other bits that carry the additional resolution. Depending on its position inside the cell, more or fewer blocks with additional resolution will be correctly received by the mobile user. However, the basic quality will always be correctly received independently of the position of any user, within the 95% coverage target.
With increasing distance between terminals and base station, decreasing bit rates are correctly received due to the decrease of SNR. Adaptive Modulation and Coding (AMC) is a technique that maximizes the total throughput for unicast transmissions. The decrease of SNR with the distance is common to unicast and broadcast/multicast transmissions. However, for broadcast/multicast the same video content is transmitted and AMC is not possible without personal uplink feedback. With the introduction of multiresolution techniques the maximization of the total throughput is the goal to achieve. System-level simulations for broadcast/multicast with multiresolution are necessary to evaluate the achievable throughput gain compared to single resolution systems.
In this paper, Section 2 refers to the objectives and requirements, and in Section 3 the evaluation methodology and simulation assumptions are presented. In Section 4 the system level results are presented, and finally in Section 5 the summary and conclusions are presented.
2. Objectives and Requirements
The introduction of hierarchical modulation in a broadcast cellular system requires scalable video coding, as shown in Figure 1 [11, 14], where the base layer transmission provides the minimum quality, and one or more enhancement layers offer improved quality at increasing bit/frame rates and resolutions. This method significantly decreases the storage costs of the content provider compared to simulcast distribution, where for a single video sequence excessive video sequences must be stored at the server to enable its distribution to different customers with different terminal capabilities. Besides being a potential solution for content adaptation, scalable video schemes may also allow an efficient usage of radio resources in enhanced MBMS.
According to Release 6 of 3GPP the single resolution scheme corresponds to transmission of QPSK with more than 95% coverage. The assignment of the fraction of the
total transmission power reserved for MBMS has implications for the coverage and average throughput of the multiresolution based on the hierarchical 16-QAM scheme. The multicell interference distribution also has a strong impact on the coverage and throughput. An interesting design parameter is the channel bit rate (and its coding rate) associated with the multiresolution scheme. An optimization of this parameter also has a strong impact on the achievable coverage and average throughputs.
Figure 1: Scalable video transmission. (Labels: Node B; UE1: base layer + enhanced layer; UE2: base layer.)
Regardless of the channel conditions and user location, a given user always attempts to demodulate both the base layer and the enhancement layer carrying additional resolution. For good multiresolution design, the basic information will always be correctly received independently of the position of any user, within the 95% coverage target. However, depending on its position inside the cell, more or fewer blocks with additional resolution will be correctly received by the mobile user.
The objective of this work is the design of multiresolution schemes in different scenarios, namely, multicell with intercell interference without and with macrodiversity support, and to measure the corresponding multiresolution gain of total throughput compared to the reference total throughput of the single resolution scheme based on the QPSK transmission.
3. Evaluation Methodology and Simulation Assumptions
Typically, radio network simulations can be classified as either link level (radio link between the base station and the user terminal) or system level (several base stations with a large number of mobile users). A single approach would be preferable, but the complexity of such a simulator (including everything from transmitted waveforms to multicell network) is far too high for the required simulation resolutions and simulation time. Therefore, separate but interconnected link and system level approaches are needed.
The link level simulator is needed for the system simulator to build a receiver model that can predict the receiver Block Error Rate/Bit Error Rate (BLER/BER) performance, taking into account channel estimation, interleaving, modulation, receiver structure, and decoding. The system level simulator is needed to model a system with a large number of mobiles and base stations, and algorithms operating in such a system.
As the simulation is divided in two parts, an approach for linking the two simulators must be defined. Conventionally, the information obtained from the link level simulator is inserted in the system level simulator through the utilization of a specific performance parameter (BLER) corresponding to a determined signal-to-interference-plus-noise ratio (SNR) estimated in the terminal or base station. Figure 2 shows the interaction between the simulators.
Figure 2: Interaction between link level simulator and system level simulator. (Labels: simulation parameters; link level simulator; system level simulator; results; SNR and BLER exchanged between the two simulators.)
3.1. Link-Level Simulator Design. The link-level simulator (LLS) was developed in Matlab and took into account the specifications of 3GPP MBMS Release 7 [17] regarding the signal processing of transport and physical channels, satisfying two essential requirements:
(i) serve as reference for all the link level simulations with multiresolution and parameters estimation,
(ii) serve as a platform for the different multiresolution improvements tested and quantified.
Typical time interval of each link level simulation is 0.5 seconds (as shown in Table 1). The entire OFDMA signal processing at the transmitter was included in the LLS as well as several different receiver structures. To achieve reliable channel estimation and data detection we employ a receiver capable of jointly performing these tasks through iterative processing. The structure of the iterative receiver is shown in Figure 3 (see also [18]).
The receiver structure for the additive white Gaussian noise (AWGN) channel is less complex (only a few turbo-decoder iterations and no channel estimation nor channel equalization required).
Figure 3: Iterative receiver structure. (Blocks: DFT, channel equalization, demodulator, log2 M parallel chains each with de-interleaver and channel decoder, decision devices, transmitted signal rebuilder, channel estimator.)
Multipath Rayleigh fading channels were considered in the simulator due to the sensitivity of hierarchical high-order QAM modulations to the channel parameters estimation. As indicated, the receiver structure is nonlinear, iterative, and includes channel parameters estimation for the analyzed multipath Rayleigh fading channel [19]. This explains why we used a different approach for the link level simulations compared to the typical 3GPP methodology, which maps against coded AWGN curves for various transport formats.
3.2. Radio Access Network System Level Simulator. For the purpose of validating the work presented in this section, a system level simulator was developed in Java, using a discrete event-based philosophy, which captures the dynamic behavior of the Radio Access Network System. This dynamic behavior includes the user (e.g., mobility and variable traffic demands), radio interface and Radio Access Network (RAN) with some level of abstraction. The system level simulator (SLS) works at Transmission Time Interval (TTI) rate and the typical time interval of each simulation is 600 seconds. Table 1 shows the simulation parameters. It presents the parameters used in the link and system level simulations based on 3GPP documents [20–23].
The channel model used in the system level simulator considers three types of losses: distance loss, shadowing loss and multipath fading loss (one value per TTI). The model parameters depend on the environment. For the distance loss the Okumura-Hata Model from the COST 231 project was used (see [24]). Shadowing is due to the existence of large obstacles like buildings and the movement of UEs in and out of the shadows. This is modelled through a process with a lognormal distribution and a correlation distance. The multipath fading in the system level simulator corresponds to the 3GPP channel model, where the ITU Vehicular A (30 km/h) (see [19] Annex B) environment was chosen as reference. The latter model was also used in the link level simulator but at a much higher rate. The Vehicular A (with velocity v = 30 km/h) channel model was chosen because it is an important test channel in 3GPP specifications; also, it allows for direct comparison with previous system level simulations done by the authors [25]. In OFDM systems the important parameter is the maximum delay of the multipath profile and its relation with the duration of the time guard between OFDM symbols to avoid intersymbol interference. 3GPP has specified a short time guard of about 4.75 μs and a long one of 16.67 μs. The long time guard was considered in this paper, making the performance less sensitive to the chosen propagation channel. However, there is a reduction of the transmitted bit rates.
In the radio access network subsystem system level simulator only the resulting fading loss of the channel model, expressed in dB, is taken into account. The fading model is provided by the link level simulator through a trace of average fading values (in dB), one per Transmission Time Interval (TTI) or Subframe duration. For each environment the mobile speed is the same and several traces of fading values are provided for each pair of antennas. A uniform distribution of mobile users is generated at the beginning of each simulation. The typical number of users chosen for each simulation run was 20 per sector. Each mobile has random mobility with the specified 30 km/h.
Dynamic system level simulators like the one presented in this paper are very accurate; the main limitation is the hypothetical urban macrocellular test scenario, which is different from any real one.
Figure 4 illustrates the cellular layout (trisectorial antenna pattern) indicating the fractional frequency reuse of 1/3 considered in the system level simulations. 1/3 of the available bandwidth was used in each sector to reduce the multicell interference. As indicated in Figure 4, the identification of the sources of multicell interference, that is, use of the same adjacent subcarriers (named physical resource blocks or chunks), is given by the sectors with the same colour/number, namely, red/one, green/two, or yellow/three.
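The three loss terms of the system level channel model described in Section 3.2 (distance loss, shadowing loss, and multipath fading loss) can be combined as in the following sketch. It is a minimal illustration, not the simulator used in this paper: the carrier frequency, the antenna heights, the 50 m correlation distance, and the first-order exponential shadowing correlation are all assumptions introduced here, and the function names are hypothetical. Only the 10 dB shadowing standard deviation is taken from Table 1. The COST 231-Hata formula is nominally specified for distances of 1 km and above, so its use at shorter ranges is an extrapolation.

```python
import math
import random


def cost231_hata_db(d_km, f_mhz=2000.0, h_bs_m=30.0, h_ue_m=1.5, metro=False):
    """COST 231-Hata median path loss in dB (medium city unless metro=True).

    f_mhz, h_bs_m and h_ue_m are assumed values, not taken from the paper.
    """
    a_hm = (1.1 * math.log10(f_mhz) - 0.7) * h_ue_m - (1.56 * math.log10(f_mhz) - 0.8)
    c_m = 3.0 if metro else 0.0
    return (46.3 + 33.9 * math.log10(f_mhz) - 13.82 * math.log10(h_bs_m) - a_hm
            + (44.9 - 6.55 * math.log10(h_bs_m)) * math.log10(d_km) + c_m)


def next_shadowing_db(prev_db, moved_m, sigma_db=10.0, d_corr_m=50.0, rng=random):
    """Lognormal shadowing update with exponential spatial correlation.

    d_corr_m is a hypothetical correlation distance; sigma_db follows Table 1.
    """
    rho = math.exp(-moved_m / d_corr_m)
    return rho * prev_db + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, sigma_db)


def total_loss_db(d_km, shadow_db, fading_db):
    """Sum of distance loss, shadowing loss and the per-TTI fading loss from a trace."""
    return cost231_hata_db(d_km) + shadow_db + fading_db
```

The fading term would be read, one value per TTI, from the trace supplied by the link level simulator, while the shadowing value would be updated whenever the user moves.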
Table 1: Link and system level simulation parameters for urban macrocellular scenario.
Parameter                                          Value
Transmission bandwidth                             10 MHz
Cyclic prefix size                                 72
FFT size                                           1024
Carriers space (kHz)                               15
Available bandwidth                                9 MHz
Sample time (ns)                                   130
Max Tx power (dBm)/sector                          46
Number of used subcarriers/sector                  200
Number of used subcarriers/cell                    600
Freq. reuse                                        1/3
Subframe duration (ms)                             0.5
Interfering cells transmit with % of max power     90
Cell radius (m)                                    750
InterSite distance (m)                             1500
Cellular layout                                    Hexagonal
Sectors                                            3 sectors/cell
Number of cell sites                               19
Antenna gain of the base station                   17.5 dBi
Width of beam of the antenna at −3 dB              70 degrees
Front/back ratio of the antenna                    20 dB
Antenna pattern radiation of the base station      Gaussian
Propagation model                                  Okumura-Hata
Downlink thermal noise                             −100 dBm
Cable loss                                         3 dB
Fade out standard deviation due to shadowing       10 dB
Figure 4: Cellular layout including the frequency reuse of 1/3 (colours/numbers of the cells).
For 16-QAM hierarchical constellations two classes of bits with different error protection are used. The blue colour around the antennas only indicates the approximate coverage of the weak bits blocks, while the other colours indicate the coverage of the strong bits blocks.
This is the case for the scenario to be analyzed with one radio link between the mobile and the closest base station. No time synchronism is assumed between the transmissions from different base stations with the same colour, resulting in interference from all but one cell with the same colour. However, in the scenario with macrodiversity combining the two best radio links, it is assumed that there is time synchronization between the two closest base station sites with the same colour. In this case the multicell interference is reduced because only the other base station sites with the same colour remain unsynchronized and able to interfere.
Figure 5 illustrates the time and frequency division of the physical resource blocks (PRBs) considering that there are three sectors per cell. To combat the frequency selective fading, adjacent PRBs should belong to different sectors as indicated in Figure 5. In each sector the total bandwidth should be available in 1/3 of each subslot of 0.5 ms; in addition, the allocation of the physical resource blocks by the sectors should be dynamic instead of fixed. For the system level simulation results presented in the paper what matters is the identification of the interfering PRBs. Fixed or variable positions of PRBs within the same Subframe only matter if there is no coordination between adjacent base stations to avoid intercell interference. We have assumed that this interference avoidance coordination exists. Variable positions of PRBs within one Subframe are better to combat fast fading effects due to multipath channels.
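The interleaved assignment of PRBs to the three sectors can be sketched as a simple cyclic mapping. This is an illustrative assumption rather than the allocation rule of the simulator: the mapping function, the `shift` parameter, and the figure of 60 PRBs per cell (600 used subcarriers in groups of 10) are introduced here only to show how adjacent PRBs end up in different sectors while each sector keeps one third of the PRBs.

```python
NUM_SECTORS = 3
PRBS_PER_CELL = 60  # assumed: 600 used subcarriers / 10 subcarriers per PRB


def sector_of(prb, shift=0):
    """Sector (1, 2 or 3) owning a PRB; shift rotates the pattern cyclically."""
    return (prb + shift) % NUM_SECTORS + 1


def allocation(shift=0):
    """Sector index for every PRB of the cell, in frequency order."""
    return [sector_of(p, shift) for p in range(PRBS_PER_CELL)]
```

With `shift=0` the pattern along frequency is 1, 2, 3, 1, 2, 3, and with `shift=2` it becomes 3, 1, 2. How often the shift would change in a dynamic allocation is not specified here and is left open.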
Figure 5: Time and frequency division of the physical resource blocks. (Axes: frequency versus time in steps of 0.5 ms; entries are the sector numbers 1, 2, 3 assigned to each PRB.)
4. System-Level Performance Results
To study the behavior of the proposed OFDM multiresolution schemes, several simulations were performed for 16-QAM hierarchical modulations.
16-QAM hierarchical constellations are constructed using a main QPSK constellation where each symbol is in fact another QPSK constellation, as shown in Figure 6.
The main parameter for defining one of these constellations is the ratio between d1 and d2 as shown in Figure 6:
$$\frac{d_1}{d_2} = k, \quad \text{where } 0 < k \le 0.5. \tag{1}$$
Two classes of bits with different error protection were used. Each information stream was encoded with a block size of 2560 bits per Subframe duration of 0.5 ms. One third of the total physical resource blocks (PRB) are transmitted in each sector. This corresponds to an instantly occupied bandwidth of 3 MHz, where we have considered 20 PRBs each with 150 kHz of adjacent bandwidth (corresponding to 10 subcarriers with frequency spacing of 15 kHz). The number of adjacent subcarriers in each PRB was a study item in 3GPP by the time we started our simulation work. We have considered PRBs with 10 adjacent subcarriers instead of 12 as currently specified by 3GPP. However, this change in the size of the PRBs does not change our simulation results for the propagation channels and velocity chosen. We have also chosen PRBs of this size to have an integer number of TV channels (i.e., PRBs) each with bit rate of 256 kbps for the chosen fractional frequency reuse of 1/3. Otherwise it would not be possible to compare directly the OFDM/OFDMA results with those obtained previously with the WCDMA technology. All the parameters used for OFDM during these simulations were based on 3GPP documents [20–23].
We have considered that three different coding rates are used, namely, 1/2, 2/3 and 3/4. This leads to total transmitted information bit rates per cell sector of 5120 kbps, 6825 kbps, and 7680 kbps, respectively. Considering that each PRB carries a different TV program channel, this corresponds to channel bit rates of 256 kbps, 341 kbps and 384 kbps, respectively. We have evaluated in the link level simulations the hierarchical 16-QAM with different values of k for these three channel bit rates. In Figures 7 and 8 we present the BLER versus Es/N0 for the channel bit rates 256 kbps and 384 kbps, respectively.
In the legend H1 denotes the strong bits block and H2 the weak bits. H1, k = 0.1 corresponds to the leftmost curve, requiring the minimum Es/N0, and H2, k = 0.1 is the rightmost curve, requiring the maximum Es/N0. H1, k = 0.5 and H2, k = 0.5 correspond to the two inner curves that almost overlap (same Es/N0) in the two figures. k = 0 corresponds to QPSK and its BLER performance is presented only in Figure 7. As expected, QPSK has a better coverage than any of the H1 blocks, but obviously its bit rate is half of that of the set H1+H2 for each k ≠ 0.
Comparison between these two figures indicates that, considering any BLER and in particular the reference BLER of 1%, higher channel bit rates require higher SNR to offer any given BLER, resulting in less coverage. However, higher channel bit rates can provide higher maximum throughputs. For k = 0.1 the coverage of the strong blocks is the maximum; however, the coverage of the corresponding weak blocks is the minimum. As a result the total throughput of both types of blocks is the smallest. Notice that k = 0.5 corresponds to the 16QAM uniform constellation, where the strong bits are the standard bits of QPSK modulation; however, their coverage is less than that of QPSK. The coverage of the corresponding weak blocks (k = 0.5) is the maximum, resulting in the highest total throughput of both types of blocks.
For the reference BLER of 1%, the spread in Es/N0 values for different k values is much higher for weak blocks compared to strong blocks. As a result, we observe a small coverage gain for smaller k values, but associated with a high loss of total throughput (strong + weak blocks). This can be observed in Figure 9, where the difference in required SNR, relative to QPSK, is presented versus k, taking the reference BLER of 1%.
We have chosen the k = 0.5 curves for the system level simulations because in this case there is the minimum difference between the BLER performance of H1 and H2, which is expected to assure the best combination of coverage and throughput.
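The role of the parameter k defined in (1) can be made concrete by generating the constellation points. The sketch below is an illustration under stated assumptions, not the modulator used in the link level simulator: d2 and d1 are treated as the per-axis offsets of the basic and enhancement QPSK constellations (the ratio d1/d2 is what matters), the bit labels follow the reading of Figure 6 in which the first two bits select the quadrant and the last two bits select the point inside it, the horizontal position in the figure is taken as the real part, and the points are normalized to unit average energy. The function name is hypothetical.

```python
import math


def hier16qam(k, d2=1.0):
    """Return {bit label: complex point} for hierarchical 16-QAM with d1/d2 = k."""
    if not 0.0 <= k <= 0.5:
        raise ValueError("k must lie in [0, 0.5]")
    d1 = k * d2
    scale = 1.0 / math.sqrt(2.0 * (d1 * d1 + d2 * d2))  # unit average symbol energy
    points = {}
    for bits in range(16):
        # b[0], b[1]: strong (basic QPSK) bits; b[2], b[3]: weak (enhancement) bits
        b = [(bits >> s) & 1 for s in (3, 2, 1, 0)]
        # Assumed labelling: in each pair, first bit 0 = upper half, second bit 0 = right half
        x = d2 * (1 - 2 * b[1]) + d1 * (1 - 2 * b[3])
        y = d2 * (1 - 2 * b[0]) + d1 * (1 - 2 * b[2])
        points[format(bits, "04b")] = complex(x, y) * scale
    return points
```

For k = 0.5 the per-axis levels are proportional to −3, −1, 1, 3, which is the uniform 16QAM grid, and for k = 0 the sixteen labels collapse onto the four QPSK points, in line with the limiting cases discussed in the text.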
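Figure 6: Signal constellation for 16-QAM hierarchical modulation. (Basic QPSK labels, rows top to bottom: 01 00 / 11 10; enhancement QPSK labels: 01 00 / 11 10; resulting 16-QAM labels: 0101 0100 0001 0000 / 0111 0110 0011 0010 / 1101 1100 1001 1000 / 1111 1110 1011 1010; axes I and Q; d1 and d2 are the distances used in (1).)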
Figure 7: BLER versus Es/N0 for hierarchical 16-QAM varying k, Rb = 256 kbps, VehA 30 km/h. (Axes: BLER versus Es/N0 in dB; curves: QPSK, k = 0; H1 and H2 for k = 0.1, 0.2, 0.3, 0.4, 0.5.)
Figure 9: ΔSNR versus k for hierarchical 16-QAM, 256 kbps, VehA 30 km/h. (Axes: ΔSNR in dB versus k; curves: H1, H2.)
Figure 8: BLER versus Es/N0 for hierarchical 16-QAM varying k, Rb = 384 kbps, VehA 30 km/h. (Axes: BLER versus Es/N0 in dB; curves: H1 and H2 for k = 0.1, 0.2, 0.3, 0.4, 0.5.)
In the system level simulations mobile users receive strong and weak bits blocks transmitted from base stations. Each block undergoes small- and large-scale fading and multicell interference. In terms of coverage or throughput, the SNR of each block is computed taking into account all the above impairments and, based on the comparison between the reference SNR at a BLER of 1% and the evaluated SNR, it is decided whether or not the block is correctly received. This is done for all the transmitted blocks for all users in all sectors of the 19 cells, during typically 10 minutes.
Figure 10 presents the coverage versus the fraction of the total transmitted power (Ec/Ior), for the multicell interference scenario where there is interference only from 1/3 of the sectors due to the frequency reuse of 1/3 (see Figure 4). All interfering sites transmit with 90% of the maximum power, according to the parameters indicated in Table 1. The cell radius is 750 m, and we have separated strong blocks (H1) from weak blocks (H2) without including macrodiversity combining. The multicell interference is 90% of the maximum transmitted power in each site. For Ec/Ior = 50% and channel bit rate 256 kbps the coverage of H1 is 95% and for H2 is 85%. For the same Ec/Ior, but 384 kbps data rate, the coverage values of H1 and H2 are 39% and 30%, respectively. In both cases there is a difference of about 10% between the coverage of H1 and H2 due to the chosen k = 0.5.
Figure 10: Average coverage (%) versus Ec/Ior, 1 Radio Link, k = 0.5. (Multicell interference scenario, 750 m; axes: average coverage in % versus multicast channel Ec/Ior in %; curves: H1 and H2 for 256 kbps, 341 kbps, and 384 kbps.)
Figure 11 presents the coverage versus Ec/Ior, separating strong blocks (H1) from weak blocks (H2), with macrodiversity combining of the best two radio links. For Ec/Ior = 20%, regardless of the channel bit rate and the type of blocks, the coverage is always above 95%. However, for 384 kbps the coverage values of H1 and H2 are different from each other. Only for Ec/Ior = 50% the coverage of strong blocks is above or equal to 95% for 384 kbps, but for 256 kbps the coverage value for strong blocks is above 95% for Ec/Ior = 5%. This indicates that as long as there is macrodiversity combining of the two best links it is possible to increase the channel bit rate or increase the number of transmitted channels keeping the same bit rate.
Figure 11: Average coverage (%) versus Ec/Ior, 2 Radio Links, k = 0.5. (Multicell interference scenario, 750 m; axes: average coverage in % versus multicast channel Ec/Ior in %.)
Figure 12: Throughput versus Ec/Ior, R = 750 m, k = 0.5. (Multicell interference scenario, 750 m; axes: average UE throughput in kbps versus multicast channel Ec/Ior in %; curves: 1RL and 2RL for each channel bit rate.)
Figure 13: Throughput versus distance between UEs and BS, k = 0.5. (Multicell interference scenario, 750 m; axes: average UE throughput in kbps versus distance to BS in m.)
Figure 14: Average coverage (%) versus Ec/Ior, 2 Radio Links, k = 0.4. (Multicell interference scenario, 750 m; axes: average coverage in % versus multicast channel Ec/Ior in %; curves: H1 and H2 for 256 kbps, 341 kbps, and 384 kbps.)
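The reception decision described above, in which the evaluated SNR of each block is compared with the reference SNR at a BLER of 1%, can be turned into coverage and throughput figures as in the following sketch. The exact definitions used by the simulator are not given in the text, so two assumptions are made and labeled in the code: a user counts as covered when its block error ratio over the run does not exceed 1%, and the average throughput is the channel bit rate scaled by the mean fraction of correctly received blocks. Function and parameter names are hypothetical.

```python
def user_block_success(snr_trace_db, ref_snr_db):
    """Fraction of TTIs in which the block SNR meets the reference SNR."""
    ok = sum(1 for s in snr_trace_db if s >= ref_snr_db)
    return ok / len(snr_trace_db)


def coverage_and_throughput(users_snr_db, ref_snr_db, bit_rate_kbps, max_bler=0.01):
    """Return (coverage fraction, average per-user throughput in kbps).

    Assumed definitions: a user is covered if its block error ratio is at
    most max_bler; throughput is bit rate times mean block success ratio.
    """
    success = [user_block_success(t, ref_snr_db) for t in users_snr_db]
    covered = sum(1 for s in success if 1.0 - s <= max_bler)
    coverage = covered / len(success)
    avg_throughput = bit_rate_kbps * sum(success) / len(success)
    return coverage, avg_throughput
```

As a usage example, two users with per-TTI SNR traces `[10, 10, 10, 10]` and `[10, 2, 10, 2]` dB, a reference SNR of 5 dB, and a 256 kbps channel give a coverage of 0.5 and an average throughput of 192 kbps. The same routine would be run separately for the H1 and H2 blocks, each with its own reference SNR.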
100
Multi-cell interference scenario, 750 m Figure 12 considers the throughput distribution as func-
90 tion of the Ec /Ior for multicellular network with and without
80 macrodiversity for the cell radius of 750 m. We observe a
70 considerable gain in throughput when macrodiversity (2RL)

60 is considered compared to the single radio link case. This is
50 particularly true for the high bit rate 384 kbps. For the low
40 bit rate the macrodiversity gain is not so substantial as the
30 throughput performance is already good for a single radio
20 link.
10 Figure 13 considers the throughput distribution as func-
0 tion of the distance between UEs and BS for the Ec /Ior = 90%,
0 10 20 30 40 50 60 70 80 90 100
with and without macrodiversity for the same cell radius of
Multicast channel Ec /lor (%) 750 m. For the chosen Ec /Ior , macrodiversity (2RL) assure
H1 (256 kbps) H2 (341 kbps) almost the maximum throughput for 256 kbps, however it
H2 (256 kbps) H1 (384 kbps) is more obvious the decrease in throughput for 384 kbps and
H1 (341 kbps) H2 (384 kbps) mobile users at the cell borders. It is obvious that without
macrodiversity (1RL case), only for the 256 kbps channel,
Figure 15: Average coverage (%) versus Ec /Ior , 2 Radio Links, k = the throughput is almost the maximum regardless of the
0.1.
distance. For the high bit rate 384 kbps a single radio link
only offers high throughput for users close to the base station.
Based on these results for the 16QAM multiresolution scheme in the multicellular network, with macrodiversity combining (compared to one radio link) it is possible to increase the channel bit rate keeping the same number of channels, or to increase the number of channels keeping the same bit rate per channel. In terms of broadcasting mobile TV channels it might be important to increase the InterSite distance to 1500 m to reduce the number of sites.

In Figures 14 and 15 the coverage performance curves for k = 0.4 and k = 0.1, versus Ec/Ior, are presented and should be compared to the corresponding figure with k = 0.5, Figure 11. As expected, the difference of coverage between H1 and H2 blocks increases with decreasing k. This is more noticeable for small k values such as k = 0.1, where even with macrodiversity combining the coverage of H2 blocks is rather low.

In Figures 16 and 17 the throughput performance versus Ec/Ior, for k = 0.4 and k = 0.1, is presented and should be compared to Figure 12. With or without macrodiversity combining there is about the same throughput for k = 0.5 and k = 0.4. However, there is a substantial decrease in throughput for k = 0.1 without and especially with macrodiversity combining, independently of the channel bit rate.

Figure 16: Throughput versus Ec/Ior, k = 0.4. Multi-cell interference scenario, 750 m; curves for 1RL and 2RL at the different channel bit rates.
To get the 16QAM multiresolution gain compared to the single resolution with QPSK we should compute the aggregate throughput in all the cell area with multiresolution and divide by the single resolution aggregate throughput in the cell area. As the coverage of QPSK blocks is the same as that of the strong bits blocks of hierarchical 16QAM due to macrodiversity combining, the comparison of the aggregate throughput is based on the different coverage of the weak bits blocks.
Figure 17: Throughput versus Ec/Ior, k = 0.1. Multi-cell interference scenario, 750 m; horizontal axis: multicast channel Ec/Ior (%); curves for 1RL and 2RL at the different channel bit rates.

From Figures 12 and 16 it is clear that the smallest throughput gain is achieved for coding rate = 1/2 (256 kbps). For this case, the throughput gain is two; remember that the single resolution throughput of QPSK is 128 kbps. The highest throughput gain is achieved for coding rate = 3/4 (384 kbps). For this case, the throughput gain is almost three (for k = 0.5 the throughput of 384 kbps is achieved up to 600 m from the base station (BS) as shown in Figure 13). However, for k = 0.1 the throughput gain never reaches two (see Figure 17). So it is important to choose k values in [0.4, 0.5] to achieve the highest multiresolution gain.

5. Summary and Conclusions

We have studied and evaluated the use of QAM hierarchical constellations in an OFDM system as a multiresolution scheme for the enhanced MBMS network. Scenarios based on multicell networks without and with macrodiversity combining were evaluated using multiresolution based on 16QAM hierarchical modulation.

We can conclude that multiresolution works fine with any of the analyzed scenarios, multicell networks without or with macrodiversity combining. Indeed it works better with multicell with macrodiversity than with multicell without macrodiversity. In multicell networks without macrodiversity, due to the higher sensitivity to the channel bit rate of higher-order constellations, we can increase the channel bit rate of each TV channel for users close to the base station. In the multicell scenario with macrodiversity, the multiresolution schemes become less sensitive to the used channel bit rates.

In multicell without macrodiversity, to achieve higher multiresolution gain it is suggested to use the channel bit rate of 256 kbps, that is, the channel coding rate of 1/2. As long as there is previous recording of link quality information in the cell, it is recommended that a few different groups should be formed with different channel bit rates in order to increase the levels of multiresolution. One way to achieve this is the combination of hierarchical QAM modulations with MIMO 2 × 2.

It was concluded that to achieve the highest multiresolution gain it is important to choose k values in [0.4, 0.5] and avoid smaller k values.

For the high channel bit rate of 384 kbps, the spectral efficiency achieved per cell sector, considering that 20 TV channels are transmitted simultaneously in the total bandwidth of 10 MHz, is 0.768 bps/Hz/cell. This value of spectral efficiency is valid for users at the cell border. The InterSite-distance (ISD) associated to this spectral efficiency is 1500 m. Alternatively, 30 TV channels with 256 kbps could be transmitted at the same time as indicated in Table 2.

Table 2: Capacity values for 16QAM hierarchical multiresolution OFDMA.

| QoS | No. of channels | Spectral efficiency | ISD | Bandwidth |
|---|---|---|---|---|
| 256 kbps | 30 | 0.768 bps/Hz/cell | 1500 m | 10 MHz |
| 384 kbps | 20 | 0.768 bps/Hz/cell | 1500 m | 10 MHz |

Table 3 shows the capacity of MBMS single resolution taking into account results for the standard MBMS normalized in Release 6 and as presented in [25] for the same scenario with macrodiversity of two radio links.

Table 3: Capacity values for QPSK single resolution, CDMA scheme for 5 MHz bandwidth.

| QoS | No. of channels | Spectral efficiency | ISD | Bandwidth |
|---|---|---|---|---|
| 256 kbps | 7 | 0.358 bps/Hz/cell | 1000 m | 5 MHz |

The comparison between Tables 2 and 3 is not straightforward due to the difference of bandwidth and ISD. However, it is possible to draw a capacity gain of at least two between hierarchical 16QAM and QPSK (notice that higher ISD is an advantage for broadcasting).

In the future we will study and evaluate the use of 64QAM hierarchical constellations and MIMO (spatial multiplexing) in an OFDM/OFDMA system as other multiresolution schemes for the enhanced MBMS network. The scenario based on the use of single-frequency network (SFN) with the Multimedia Broadcast over SFN (MBSFN) channel will also be evaluated for 16QAM hierarchical modulation and compared with the present work.

References

[1] “Feasibility study on improvement of the multimedia broadcast multicast service (MBMS),” Tech. Rep. 25.905 version 7.2.0 Release 7, 3GPP, Sophia Antipolis Cedex, France, June 2007, http://www.3gpp.org.
[2] H. Sari, Y. Levy, and G. Karam, “An analysis of orthogonal frequency-division multiple access,” in Proceedings of IEEE Global Telecommunications Conference (GLOBECOM ’97), vol. 3, pp. 1635–1639, Phoenix, Ariz, USA, November 1997.
[3] I. Koffman and V. Roman, “Broadband wireless access solutions based on OFDM access in IEEE 802.16,” IEEE Communications Magazine, vol. 40, no. 4, pp. 96–103, 2002.
[4] J. A. C. Bingham, “Multicarrier modulation for data transmission: an idea whose time has come,” IEEE Communications Magazine, vol. 28, no. 5, pp. 5–14, 1990.
[5] T. Cover, “Broadcast channels,” IEEE Transactions on Information Theory, vol. 18, no. 1, pp. 2–14, 1972.
[6] K. Ramchandran, A. Ortega, K. M. Uz, and M. Vetterli, “Multi-resolution broadcast for digital HDTV using joint source/channel coding,” IEEE Journal on Selected Areas in Communications, vol. 11, no. 1, pp. 6–23, 1993.
[7] H. Jiang and P. A. Wilford, “A hierarchical modulation for upgrading digital broadcast systems,” IEEE Transactions on Broadcasting, vol. 51, no. 2, pp. 223–229, 2005.
[8] S. Wang, S. Kwon, and B. K. Yi, “On enhancing hierarchical modulation,” in Proceedings of IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB ’08), pp. 1–6, Las Vegas, Nev, USA, March-April 2008.
[9] P. K. Vitthaladevuni and M.-S. Alouini, “A closed-form expression for the exact BER of generalized PAM and QAM constellations,” IEEE Transactions on Communications, vol. 52, no. 5, pp. 698–700, 2004.
[10] N. Souto, F. A. B. Cercas, R. Dinis, and J. Silva, “On the BER performance of hierarchical M-QAM constellations with diversity and imperfect channel estimation,” IEEE Transactions on Communications, vol. 55, no. 10, pp. 1852–1856, 2007.
[11] M. Vetterli and K. M. Uz, “Multiresolution coding techniques for digital television: a review,” Multidimensional Systems and Signal Processing, vol. 3, no. 2-3, pp. 161–187, 1992.
[12] G. J. Foschini, “Layered space-time architecture for wireless communication in a fading environment when using multi-element antennas,” Bell Labs Technical Journal, vol. 1, no. 2, pp. 41–59, 1996.
[13] G. J. Foschini and M. J. Gans, “On limits of wireless communications in fading environments when using multiple antennas,” Wireless Personal Communications, vol. 6, no. 3, pp. 311–335, 1998.
[14] A. Soares, N. Souto, J. Silva, P. Eusébio, and A. Correia, “Effective radio resource management for MBMS in UMTS networks,” Wireless Personal Communications, vol. 42, no. 2, pp. 185–211, 2007.
[15] A. Soares, J. Silva, N. Souto, F. Leitão, and A. Correia, “MIMO based radio resource management for UMTS multicast broadcast multimedia services,” Wireless Personal Communications, vol. 42, no. 2, pp. 225–246, 2007.
[16] A. Correia, N. Souto, J. Silva, and A. Soares, “Air interface enhancements for MBMS,” in Handbook of Mobile Broadcasting, B. Furht and S. Ahson, Eds., chapter 17, CRC Press, Taylor & Francis, New York, NY, USA, 2008.
[17] “Technical specification group radio access network; physical layers aspects for evolved (UTRA),” Tech. Rep. 25.814 version 7.1.0 Release 7, 3GPP, Sophia Antipolis Cedex, France, September 2006, http://www.3gpp.org.
[18] N. Souto, A. Correia, R. Dinis, J. Silva, and L. Abreu, “Multiresolution MBMS transmissions for MIMO UTRA LTE systems,” in Proceedings of IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB ’08), pp. 1–6, Las Vegas, Nev, USA, March-April 2008.
[19] “User equipment radio transmission and reception (FDD),” Tech. Rep. TS 25.101-version 6.2.0, Release 6, 3GPP, Sophia Antipolis Cedex, France, October 1999, http://www.3gpp.org.
[20] “Feasibility study for evolved universal terrestrial radio access (UTRA) and universal terrestrial radio access network (UTRAN),” Tech. Rep. 25.912 version 7.1.0 Release 7, 3GPP, Sophia Antipolis Cedex, France, September 2006, http://www.3gpp.org.
[21] “Feasibility study for orthogonal frequency division multiplexing (OFDM) for UTRAN enhancement,” Tech. Rep. 25.892 version 6.0.0 Release 6, 3GPP, Sophia Antipolis Cedex, France, June 2004, http://www.3gpp.org.
[22] “Evolved universal terrestrial radio access (E-UTRA); radio frequency (RF) system scenarios,” Tech. Rep. 36.942 Release 8, 3GPP, Sophia Antipolis Cedex, France, December 2008, http://www.3gpp.org.
[23] “LTE physical layer framework for performance verification,” Tech. Rep. R1-070674 TSG-RAN1#48, 3GPP, Sophia Antipolis Cedex, France, February 2007, http://www.3gpp.org.
[24] E. Damosso, Ed., “Digital Mobile Radio Towards Future Generation Systems,” COST 231, European Commission, Luxemburg, Germany, 1999.
[25] A. Correia, J. Silva, N. Souto, L. A. C. Silva, A. B. Boal, and A. Soares, “Multi-resolution broadcast/multicast systems for MBMS,” IEEE Transactions on Broadcasting, vol. 53, no. 1, pp. 224–233, 2007.
doi:10.1155/2009/750735
Research Article
An Opportunistic Error Correction Layer for OFDM Systems
Xiaoying Shao, Roel Schiphorst, and Cornelis H. Slump

The Signals and Systems Group, Department of Electrical Engineering Mathematics and Computer Science (EEMCS),
University of Twente, 7500 AE Enschede, The Netherlands
Correspondence should be addressed to Xiaoying Shao, x.shao@ewi.utwente.nl
We propose a novel cross layer scheme to reduce the power consumption of ADCs in OFDM systems. The ADCs in a receiver can
consume up to 50% of the total baseband energy. Our scheme is based on resolution-adaptive ADCs and Fountain codes. In a
wireless frequency-selective channel some subcarriers have good channel conditions and others are attenuated. The key part of the
proposed system is that the dynamic range of ADCs can be reduced by discarding subcarriers that are attenuated by the channel.
Correspondingly, the power consumption in ADCs can be decreased. In our approach, each subcarrier carries a Fountain-encoded
packet. To protect Fountain-encoded packets against bit errors, an LDPC code has been used. The receiver only decodes subcarriers
(i.e., Fountain-encoded packets) with the highest SNR. Others are discarded. For that reason an LDPC code with a relatively high
code rate can be used. The new error correction layer does not require perfect channel knowledge, so it can be used in a realistic
system where the channel is estimated. With our approach, more than 70% of the energy consumption in the ADCs can be saved
compared with the conventional IEEE 802.11a WLAN system under the same channel conditions and throughput. In addition,
it requires 7.5 dB less SNR than the 802.11a system. To reduce the overhead of Fountain codes, we apply message passing and
Gaussian elimination in the decoder. In this way, the overhead is 3% for a small block size (i.e., 500 packets). Using both methods
results in an efficient system with low delay.
Copyright © 2009 Xiaoying Shao et al. This is an open access article distributed under the Creative Commons Attribution License,
1. Introduction

The wireless channel is a very hostile environment. Therefore, it is a challenge to communicate both reliably and with a high throughput. In this paper, we investigate a novel error-correction layer based on Fountain codes, orthogonal frequency-division multiplexing (OFDM), and adaptive analog-to-digital conversion to mitigate the effects of a wireless channel at a lower power consumption compared to traditional solutions.

OFDM has become a popular scheme for recent WLAN standards which operate at a high bit rate [1–3]. The main advantage of OFDM over the single-carrier scheme is its ability to eliminate inter-symbol interference (ISI) without complex equalization filters in the receiver [4]. OFDM has a high peak-to-average power ratio (PAPR); therefore it requires analog-to-digital converters (ADC) with a high dynamic range. These high-resolution ADCs can take up to 50% of the baseband power [5].

In the current generation of WLAN equipment (based on IEEE 802.11a [6]), the forward error-correction (FEC) layer is based on rate compatible punctured codes (RCPC). These codes have good performance for random bit errors, but poorer performance for burst bit errors. For that reason, an interleaver is applied to randomize the burst errors of the wireless channel. On the other hand, the wireless channel is changing in time. This means that some packets are received with a “good” channel and others over a “bad” channel. The error-correction layer based on RCPC has been designed in such a way that for most channel realizations the bit-error rate (BER) is zero. For a small part of the channel, bit errors will occur and retransmission is necessary. Although this solution works well in practical systems, it is not energy-efficient for two reasons.

(i) Packets which have encountered “bad” conditions are still processed by the entire receiver chain.
(ii) Fixed high-resolution ADCs are used in the current WLAN systems, designed for worst-case scenarios.
In this paper we propose a new error-correction layer, which does not have these disadvantages. It is an opportunistic error correction layer because it processes only “good” packets. Also, it is a low power-consumption scheme as the resolution of the ADC is adapted to the minimum for each scenario instead of being designed for worst-case situations.

A further resolution reduction of the ADC can be achieved by discarding those parts of the channel with deep fading. Taking Figure 1 as an example, the dynamic range of the whole channel is around 18.8 dB. From this figure, we can see that deep fading does not happen everywhere and only occurs in the frequency band from −8 to 0 MHz. By discarding this 8 MHz sub-band, the dynamic range of the channel is reduced to around 10.4 dB. The current WLAN standards do not support this approach, as all sub-bands are considered equally important by the FEC layer.

Figure 1: Example of the baseband transfer function of a frequency-selective channel model A (magnitude in dB versus frequency in MHz).

Therefore, we propose a novel FEC layer based on Fountain codes that allows us to discard those parts of the channel with deep fading. In [7], MacKay describes the encoder of a Fountain code as a metaphorical fountain that produces a stream of encoded packets. Anyone who wishes to receive the encoded file holds a bucket under the fountain and collects enough packets to recover the original data. It does not matter which packets are received; only a minimum number of packets have to be received correctly [7]. In other words, the Fountain-encoded packets are independent of each other.

To apply Fountain codes in WLAN systems, we divide a block of source bits into a set of packets which are encoded by a Fountain code. A Fountain-encoded packet is transmitted over a subcarrier. Thus, multiple packets are transmitted simultaneously, using frequency division multiplexing. In our system the transmitter generates an abundance of packets and the receiver can discard Fountain-encoded packets which are transmitted over the subcarriers with deep fading. Correspondingly, the power consumption in the ADCs decreases.

The proposed method is an opportunistic error correction layer because it does not process all received packets but only processes “good” packets. This error-correction layer is able to cope with discarding packets because Fountain-encoded packets are independent of each other. Also, less power is consumed as the resolution of the ADC is adapted to the minimum required in each case, compared to using a fixed-resolution ADC. Thus, it is a novel cross-layer approach which integrates the error-correction into the physical layer of an OFDM system.

The outline of this paper is as follows. We propose two techniques which together form the new error-correction layer and reduce the power consumption: Fountain codes and a resolution-adaptive ADC. First, Fountain codes are discussed, followed by the resolution-adaptive ADC. A practical example is given in this paper considering the IEEE 802.11a system. In Section 4, a description is given of the IEEE 802.11a system model together with our proposed modifications. Finally, the simulation results are described, which compare the conventional 802.11a system with our modifications. The paper ends with conclusions and future work.

2. Fountain Codes

The proposed error-correction layer is generic: any Fountain code (e.g., Luby Transform (LT) codes [8], Raptor codes [9], etc.) can be applied in it. In this paper, we use LT codes in the proposed error-correction layer.

Consider a file of size $K$ packets $s_1, s_2, \ldots, s_K$ to be encoded by a Fountain code. A “packet” has $m$ bits and is considered as an elementary unit here. At each clock cycle, indexed by $n$, the encoder randomly chooses several packets, and computes the bitwise sum (XOR) of these source packets to generate the corresponding transmitted packet. The number of packets used is random, as well as the selection of the packets used. The Fountain code can supply us with a stream of packets based on source packets $s_1, s_2, \ldots, s_K$. In practical situations, however, only a fixed number of packets $N$ are generated.

At the receiver side enough packets have to be received for successful decoding. The required number of received packets $N$ is slightly larger than the number of source packets $K$ and is defined by

$$N = K(1 + \varepsilon), \tag{1}$$

where $\varepsilon$ is the percentage of extra packets and is called the overhead.

After receiving $N$ packets, the receiver can recover the source packets using the message-passing algorithm, which has a linear decoding cost. By using message-passing to decode LT codes, the practical block size for LT codes with small $\varepsilon$ (e.g., within 5%) is on the order of $10^4$ or higher, which prevents the Fountain scheme from efficiently supporting real-time applications (i.e., low delay) [10]. For low failure probability (e.g., 1%), using message-passing decoding, the practical overhead for small block size (i.e., on the order of $10^3$) is much larger than in theory [7]. In [11], the authors show that the practical overhead of LT codes is 14% when $K = 2000$, which limits the application of LT codes in practical systems to $K \le 2000$. The practical overhead becomes smaller for a larger number of source packets. Although a larger number of packets decreases the overhead, this also results in more delay. In addition, if the message-passing decoding fails, it does not mean that the source packets are not recoverable. Gaussian elimination can also be used for decoding, if the matrix G can be transformed into an up-converted matrix.
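The encoding step and Eq. (1) described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: packets are modeled as Python integers, the degree is drawn from the ideal soliton distribution purely as an assumption (the text does not specify the degree distribution at this point), and the function names `required_packets`, `lt_encode_packet`, and `lt_encode` are hypothetical.

```python
import math
import random


def required_packets(k: int, overhead: float) -> int:
    """Eq. (1): N = K(1 + eps), rounded up to a whole packet."""
    # round() guards against floating-point noise such as 515.0000000000001
    return math.ceil(round(k * (1.0 + overhead), 9))


def ideal_soliton_weights(k: int) -> list[float]:
    """Weights for degrees 1..K (assumed distribution, not taken from the paper)."""
    return [1.0 / k] + [1.0 / (d * (d - 1)) for d in range(2, k + 1)]


def lt_encode_packet(source: list[int], rng: random.Random) -> tuple[list[int], int]:
    """Return (indices of the source packets used, XOR of those packets)."""
    k = len(source)
    # Random number of source packets to combine
    degree = rng.choices(range(1, k + 1), weights=ideal_soliton_weights(k), k=1)[0]
    # Random selection of distinct source packets
    chosen = sorted(rng.sample(range(k), degree))
    encoded = 0
    for i in chosen:
        encoded ^= source[i]  # bitwise sum (XOR)
    return chosen, encoded


def lt_encode(source: list[int], overhead: float, seed: int = 0) -> list[tuple[list[int], int]]:
    """Generate N = K(1 + eps) encoded packets from K source packets."""
    rng = random.Random(seed)
    n = required_packets(len(source), overhead)
    return [lt_encode_packet(source, rng) for _ in range(n)]
```

For example, `required_packets(500, 0.03)` gives the 515 packets implied by a 3% overhead at K = 500, and `required_packets(2000, 0.14)` gives 2280 for the 14% overhead quoted from [11].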
However, Gaussian elimination has higher complexity compared to the message-passing algorithm. The decoding cost of using the message-passing algorithm scales as $K \log_e K$ and the cost of using the Gaussian elimination algorithm is on the order of $K^3$ [7]. In [12], the authors propose a fast Gaussian elimination algorithm over GF(2) with reduced cost $O(K^2)$. The message-passing algorithm has lower decoding costs (computational complexity) but requires more overhead (i.e., Fountain-encoded packets) for successful decoding compared to the Gaussian elimination.

Therefore, we can combine both methods to give low overhead and a reasonable complexity. Gaussian elimination is applied after the message-passing algorithm. Packets which cannot be retrieved by message-passing will be decoded by Gaussian elimination. By using both methods, the number of source packets can be small and the practical overhead is reduced as shown in Figure 2. From this figure, we can see that the overhead of using the message-passing plus Gaussian elimination for $K = 500$ can be reduced from 42% to 3% in comparison to only message-passing decoding.

Figure 2: The overhead of Fountain codes (LT codes, c = 0.03, δ = 0.3) versus K. Fountain-encoded packets are transmitted over the erasure channel with the erasure probability of 20%. The dash-dot curve is the overhead of LT codes only using message-passing decoding and the solid curve is the overhead of LT codes using message-passing algorithm and Gaussian elimination together to decode.

Furthermore, the complexity of this scheme is increased to $O(K_1 \log_e K_1) + O((K - K_1)^2)$, where $K_1$ is the number of source packets recovered by the message-passing algorithm and $K - K_1$ is the number of source packets recovered by the Gaussian elimination algorithm. For $K = 500$, on average around 250 source packets can be decoded by the message-passing algorithm and the rest of the packets can be recovered by Gaussian elimination. In this case, the complexity is around $6 \times 10^4$, which is around 25% of the complexity of only using the Gaussian elimination algorithm for decoding. However, the overhead by using both methods can be reduced from 42% to 3% compared with the overhead of only using message-passing.

As mentioned before, Fountain-encoded packets are assumed to be transmitted over the erasure channel, which means that the encoded packet is either received error-free or not received at all. However, wireless channels are not erasure channels. To convert the wireless channel into an erasure channel, error-correction codes are applied to each Fountain-encoded packet in practical systems [7]. Both the Low Density Parity Check (LDPC) codes [13] and Turbo codes [14] are good error-correction codes which allow the transmission data rate close to the Shannon limit, but the complexity of LDPC codes is lower than Turbo codes and the performance of LDPC codes is better than Turbo codes for short-length blocks [15]. Therefore, in this paper LDPC codes are used together with Cyclic Redundancy Check (CRC) to make the wireless channel behave like an erasure channel.

Our FEC encoding scheme is performed as follows. First, a Fountain-encoded packet is created. Then, a CRC is added. Finally, the packet is encoded by an LDPC code to combat bit errors introduced by the channel.

At the receiver, each Fountain-encoded packet is first LDPC decoded if its energy is equal to or higher than a threshold (i.e., corresponding to BER ≤ $10^{-5}$). The received packet is discarded if its energy is below the threshold. If the LDPC decoding fails, the received packet is discarded as well. If the LDPC decoding succeeds, the CRC is used to identify any errors undetected by the LDPC codes. If the CRC decoder detects an error, the receiver assumes that the whole packet has been lost. Once the receiver gets $N$ surviving Fountain-encoded packets, it starts to recover the source data.

3. Resolution-Adaptive ADC

Wireless channels in OFDM systems are fading channels and are modeled as frequency-selective channels [4, 16]. An example is depicted in Figure 1. If a “bad” channel (in our definition, a channel with a large difference in energy between subcarriers, that is, one that requires a large dynamic range of the ADC) is encountered, the required dynamic range of the ADC is higher than for a “good” one (e.g., a channel with flat fading). In addition, the ADC power consumption can be almost 50% of the total baseband power consumption [5]. This means that a resolution-adaptive ADC can potentially save power. A CMOS implementation of such an ADC is described in [17]. In this implementation, the power consumption scales linearly with the number of quantization levels.
3.1. Minimum Number of Quantization Levels. In OFDM receivers, demodulation of the subcarriers is performed in the frequency domain. For that reason, it is not clear beforehand how many ADC bits are necessary for proper decoding. In [18], the authors have derived a relation between the quantization noise in the time domain and frequency domain. However, results are shown only for nonfading channels. In this section, we present a scheme to design an optimum low-resolution ADC for frequency-selective channels.

Because the quantization noise depends on the signal, we first analyze the statistical characteristics of the ADC input $r_n$. The channel is supposed to be noiseless, so the output at the $n$th moment $r_n$ is defined as

$$r_n = \sum_{l=0}^{L-1} h_l x_{n-l}, \tag{2}$$

where $L$ is the number of channel taps, $h_l$ the channel taps, and $x$ the transmitted signal. We assume that the quantization noise is dominant, so other noise (e.g., thermal noise) is ignored in this paper. From [18], we know that $x_n$ can be modeled as a complex Gaussian-distributed random variable with zero mean and a variance of 1. The elements in vector $[x_0, x_1, \ldots, x_{N-1}]$ are mutually independent.

According to the central limit theorem [19], the sum of a sequence of independent, identically distributed random variables tends to be Gaussian-distributed, so the probability density function of $r_n$ can be described as

$$f(r_n) \approx \frac{1}{\pi}\, e^{-|r_n|^2 / \sum_l |h_l|^2}. \tag{3}$$

In other words, $r_n \sim \mathcal{CN}\!\left(0, \sum_l |h_l|^2\right)$.

The ADC output $y_n$ is expressed by

$$y_n = Q(r_n) = \sum_l h_l x_{n-l} + n_n, \tag{4}$$

where $n_n$ is the quantization noise in the time domain. From [18], we know that $n_n$ is uniformly distributed with zero mean and a variance of $\Delta^2/6$, where $\Delta$ is the uniform quantization step.

Due to the additional cyclic prefix in each OFDM symbol, the convolution in (4) can be considered as a cyclic convolution [4]. So, after the OFDM demodulation, we can write $Y_k$ as

$$\begin{aligned}
Y_k &= \frac{1}{\sqrt{N}} \sum_n y_n e^{-j(2\pi/N)nk} \\
&= \frac{1}{\sqrt{N}} \sum_n \left( \sum_l h_l x_{n-l} + n_n \right) e^{-j(2\pi/N)nk} \\
&= \frac{1}{\sqrt{N}} \sum_n \sum_l x_{n-l} e^{-j(2\pi/N)(n-l)k}\, h_l e^{-j(2\pi/N)lk} + \frac{1}{\sqrt{N}} \sum_n n_n e^{-j(2\pi/N)nk} \\
&= H_k X_k + N_k,
\end{aligned} \tag{5}$$

where $N_k$ is the quantization noise in the frequency domain and $H_k$ is the fading over the $k$th subcarrier defined by

$$H_k = \sum_l h_l e^{-j(2\pi/N)lk}. \tag{6}$$

In [18], the authors have shown that $N_k$ is a Gaussian-distributed random variable with zero mean and a variance of $\Delta^2/6$. Thus, for each subcarrier, the variance of the quantization noise is the same, but the signal-to-(quantization)-noise ratio (SNR) is different due to different fading:

$$\mathrm{SNR}_k = \frac{|H_k|^2}{\Delta^2/6}. \tag{7}$$

Error correcting codes can be applied to mitigate the effects of quantization and each code has a certain SNR threshold to achieve BER at a certain order (e.g., $10^{-4}$) or lower. So, the quantization step $\Delta$ can be determined once the error correcting code is chosen and the channel is estimated. In practical systems, the ADC resolution is finite. This means that for the same channel, the required dynamic range of the ADC is larger for higher code rates.

If some clipping is allowed, the number of quantization levels $N_q$ is given by [18]

$$N_q = \frac{2C}{\Delta}, \tag{8}$$

where $C$ is equal to $3\sigma_{r_n}$. Once the channel is fixed, $N_q$ is only dependent on $\Delta$. In such a case, $\Delta$ depends not only on the applied error-correction codes in the system, but also on how the encoded bits are transmitted. Assume that the Fountain-encoded packets are transmitted over a wireless channel as shown in Figure 1 and that a packet is received correctly when SNR ≥ 12 dB. There are two schemes to transmit these Fountain-encoded packets:

(i) Scheme I is to transmit each packet over all subcarriers like current WLAN systems, which means that the SNR of the worst subcarrier should be at least equal to 12 dB. In this case, the required number of quantization levels $N_q$ is 54 for the example in Figure 3.

(ii) Scheme II is to transmit each packet over one subcarrier. Since each Fountain-encoded packet is independent, it does not matter if we discard some packets which are transmitted over “bad” subcarriers. From Figure 3, we can see that by discarding 15 subcarriers, $N_q$ can be reduced to 38 in comparison to Scheme I.

3.2. Power Consumption. The power consumption of the ADC is proportional to the number of quantization levels $N_q$, which is related to the effective number of bits (ENOB) by

$$N_q = 2^{\mathrm{ENOB}}. \tag{9}$$
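The chain from Eqs. (6) to (9) can be traced numerically with a short sketch. It assumes a known set of channel taps, picks the largest step $\Delta$ for which the weakest retained subcarrier still meets the SNR threshold in Eq. (7), and takes $\sigma_{r_n}$ as the square root of $\sum_l |h_l|^2$. The function names, the rounding up of $N_q$ to an integer, and the parameter `n_discard` (number of weakest subcarriers dropped, as in Scheme II) are illustrative assumptions and are not taken from the paper.

```python
import cmath
import math


def subcarrier_gains(taps: list[complex], n_fft: int) -> list[float]:
    """|H_k|^2 for k = 0..N-1, with H_k computed as in Eq. (6)."""
    gains = []
    for k in range(n_fft):
        h_k = sum(h * cmath.exp(-2j * math.pi * l * k / n_fft)
                  for l, h in enumerate(taps))
        gains.append(abs(h_k) ** 2)
    return gains


def quantization_levels(taps: list[complex], n_fft: int,
                        snr_min_db: float = 12.0,
                        n_discard: int = 0) -> tuple[int, float]:
    """Return (N_q, ENOB) after discarding the n_discard weakest subcarriers."""
    gains = sorted(subcarrier_gains(taps, n_fft))
    weakest_kept = gains[n_discard]  # worst subcarrier still decoded
    snr_min = 10.0 ** (snr_min_db / 10.0)
    # Eq. (7): |H_k|^2 / (delta^2 / 6) >= snr_min  =>  largest admissible delta
    delta = math.sqrt(6.0 * weakest_kept / snr_min)
    # Assumed standard deviation of the ADC input r_n
    sigma_r = math.sqrt(sum(abs(h) ** 2 for h in taps))
    # Eq. (8) with C = 3 * sigma_r, rounded up to an integer number of levels
    n_q = math.ceil(2.0 * 3.0 * sigma_r / delta)
    # Eq. (9): N_q = 2^ENOB
    return n_q, math.log2(n_q)
```

With a hypothetical two-tap channel such as `[1.0, 0.5]` and 64 subcarriers, calling `quantization_levels` with `n_discard=15` returns fewer levels than with `n_discard=0`, which is the effect Scheme II exploits. For the counts quoted in the text, 38/54 is about 0.70, so under a power model that is linear in $N_q$ Scheme II would need roughly 70% of the Scheme I ADC power for that channel realization.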
[Plot: magnitude (dB) in the frequency domain versus subcarrier index, from −30 to 30.]
Figure 3: The difference in the number of ADC levels Nq between transmission Scheme I and transmission Scheme II. In this example, Nq = 54 levels are required for Scheme I, in which each Fountain packet is transmitted over all subcarriers; Nq can be reduced to 38 levels when 15 subcarriers are discarded in Scheme II, in which each Fountain packet is transmitted over one subcarrier only.

Thus, Nq is a measure of the power consumption:

$$P = \sum_{i=0}^{M_c-1} \alpha_i N_{q_i} M, \tag{10}$$

where Mc is the number of channel realizations, αi is the percentage of the ith channel realization where useful information is transmitted, Nqi is the number of quantization levels used in the ith channel realization, and M is the number of samples per MAC frame.

When Scheme II is applied, the power consumption of the ADC can be reduced by discarding “bad” subcarriers. However, discarding packets transmitted over “bad” subcarriers increases the number of packets that must be transmitted. Therefore, there is a tradeoff in the power consumption of the ADC between the number of lost subcarriers and the number of transmitted packets.

So far, we have designed the quantization scheme for OFDM systems over frequency-selective channels under the assumption of perfect channel knowledge. However, in practical systems, the channel cannot be perfectly estimated, which affects the design of the quantization scheme. We will discuss this influence in the following section.

4. System Model

As mentioned earlier, our opportunistic error-correction layer is based on Fountain codes and resolution-adaptive ADCs, which have been explained in the previous sections. The proposed error-correction layer can be applied in OFDM systems. The IEEE 802.11a system is taken as an example of an OFDM system in this paper.

In this section, the system model of an IEEE 802.11a transceiver is discussed, as shown in Figure 4. It is a simplified model with focus on the (de)modulation and (en/dec)coding of the bit stream. This means that we assume, for example, that there is no adjacent channel interference.

The FEC layer in the current IEEE 802.11a system is based on RCPC. RCPC performs well for random bit errors. An interleaver is used to remove the burst errors. Although this solution works well in practical systems, it is not optimal. First, packets that have encountered a “bad” channel condition are still processed by the entire receiver chain. Although the IEEE 802.11a standard uses a form of adaptive modulation, it consists of only 6 modes, which is very coarse. The transmitter continuously tries to use the highest code rate, but adaptation is relatively slow and each mode is designed for the “average” channel. This means that for most packets, the code rate and hence capacity can be increased. Furthermore, the resolution of the applied ADCs is fixed for an 802.11a system.

[Block diagram. (a) Transmitter: convolutional encoder, interleaving, mapping, IFFT, DAC, up conversion. (b) Receiver: down conversion, ADC, FFT, demapping, deinterleaving, convolutional decoder.]
Figure 4: Conventional 802.11a (a) transmitter and (b) receiver.

In Figure 5, we show the new error-correction layer that mitigates both problems. The key idea is to generate additional packets with the Fountain encoder. First, the source packets are encoded by the Fountain encoder. Then, a CRC checksum is added to each Fountain-encoded packet and LDPC encoding is applied. The code rate of the LDPC code is chosen relatively high, as only packets with high SNR have to be decoded; the others are discarded. Each encoded packet is transmitted on one subcarrier of the OFDM system.

At the receiver side, we assume that the synchronization is perfect and the channel is estimated by an adaptive ADC at high resolution. After that, the adaptive ADC can be reduced to the minimum necessary resolution for each channel realization. In the transmitter, more Fountain-encoded packets are created than necessary for decoding. The receiver now has the freedom to discard some of the received packets. A further resolution reduction can be achieved by discarding the packets which are transmitted over “bad” subcarriers.

If the SNR of the subcarrier is equal to or above the threshold, the received Fountain-encoded packet will go through LDPC decoding; otherwise it will be discarded. In our implementation, we choose a threshold of 12 dB for the used LDPC code. This means that the receiver is allowed to discard several subcarriers (i.e., packets) to lower the dynamic range of the ADC and hence the power consumption. After the LDPC decoding, the CRC checksum is used to discard erroneous packets. As only packets with a high SNR are processed by the receiver, this will not happen very often.

[Block diagram. (a) Transmitter: Fountain encoding, CRC encoding, LDPC encoding, mapping, IFFT, DAC, up conversion. (b) Receiver: down conversion, ADC, FFT, channel estimation; a received packet with SNR < threshold is discarded; a packet with SNR ≥ threshold passes through LDPC decoding, CRC decoding, and Fountain decoding.]
Figure 5: Proposed 802.11a (a) transmitter and (b) receiver. In the transmitter, source packets are first encoded by Fountain codes, then LDPC and CRC are applied to each Fountain-encoded packet; after that, each encoded packet is transmitted over a subcarrier. In the receiver, the channel is first estimated by high-resolution ADCs, then the resolution of the ADCs is adapted to the minimum according to the estimated channel knowledge. Each received packet is decoded by LDPC and CRC if SNR ≥ threshold; otherwise, it will be discarded. When the receiver gets N Fountain-encoded packets, it can recover the source file.

In practical systems, the channel cannot be perfectly estimated. High-resolution ADCs are applied during channel estimation, and the channel is estimated, for example, by the zero-forcing algorithm. A set of training symbols defined in [6] is used to estimate the channel, so we have

$$Y_t = H_k X_t + N_h, \tag{11}$$

where Xt is the training symbol, Yt is the received training symbol, Hk is the channel response of the kth subcarrier, and Nh is the quantization noise from the adaptive ADCs operating at high resolution. The kth subcarrier can be estimated by

$$\hat{H}_k = \frac{Y_t}{X_t} = H_k + \frac{N_h}{X_t}. \tag{12}$$

So, we can rewrite the output signal in the frequency domain after quantization, defined in (5), as

$$Y_k = H_k X_k + N_a = \hat{H}_k X_k - \frac{N_h}{X_t} X_k + N_a = \hat{H}_k X_k + N, \tag{13}$$
where Na is the quantization noise from resolution-adaptive ADCs. The variance of N (σN²) is equal to σNh² + σNa². Therefore, with the channel estimation error, the SNR for each subcarrier defined in (7) can be updated as

$$\mathrm{SNR}_k = \frac{|\hat{H}_k|^2}{\sigma_N^2} = \frac{|\hat{H}_k|^2}{\sigma_{N_h}^2 + \sigma_{N_a}^2} = \frac{|\hat{H}_k|^2}{\sigma_{N_h}^2 + \Delta^2/6}. \tag{14}$$

As we can see, the noise in this equation is composed of the quantization noise from high-resolution ADCs for the channel estimation and from resolution-adaptive ADCs for the user data.

[Plot: BER (10^−5 to 10^−2) versus SNR (11 to 12 dB); curves: with perfect H, without perfect H.]
Figure 6: BER of the (175,255) LDPC code. The dash-dot line is the BER curve with perfect channel knowledge and the solid line is the BER curve without perfect channel knowledge.

The channel estimation error will affect the SNR threshold for correct LDPC decoding, as shown in Figure 6. From this figure, we can see that the BER degradation can be neglected (within a 0.1 dB gap for BER on the order of 10^−5), which means the received packets can still go through the LDPC decoder when SNR ≥ 12 dB. Though the channel estimation error does not influence the LDPC decoding much, the number of quantization levels Nq will be affected, as the LDPC decoder needs to know the SNR defined in (14). How Nq is influenced by the channel estimation error depends on the design method of the quantization scheme. There are two design methods as follows.

(i) Method I: Assume σa² > σh² and the number of lost subcarriers is fixed, so the quantization step Δ can be derived from (14) and defined as

$$\Delta = \sqrt{6\left(\frac{|\hat{H}_k|^2}{\mathrm{SNR}_k} - \sigma_h^2\right)} \tag{15}$$

and Nq can be determined by (8).

(ii) Method II: Assume σa² ≫ σh², so that (14) can be rewritten as

$$\mathrm{SNR}_k = \frac{|\hat{H}_k|^2}{\Delta^2/6}, \tag{16}$$

so Δ is defined as

$$\Delta = \sqrt{\frac{6\,|\hat{H}_k|^2}{\mathrm{SNR}_k}}, \tag{17}$$

and Nq follows from (8) as well. In this case, we will have a smaller Nq compared to Method I, but we might lose more subcarriers (i.e., packets) since the SNR defined in (14) is smaller than the SNR we assume in (17).

In order to see the influence of the channel estimation error and the performance of the quantization design Methods I and II, we give an example as shown in Figure 7. In this example, we only consider the 52 active subcarriers defined in [6] and assume that no subcarrier is discarded. In the case of perfect channel estimation, 248 quantization levels are required. When the channel is estimated by the zero-forcing algorithm, the required Nq using Method I is 396, and using Method II, 216 levels are needed, which is fewer than in the case of perfect channel estimation. However, one extra subcarrier is discarded when Method II is applied, because this subcarrier has an SNR lower than the threshold (i.e., 12 dB). Obviously, Method II requires a smaller Nq than Method I, though in this case we might lose more subcarriers than we expect. Method II is chosen to design the quantization scheme in this paper.

As we mentioned before, there is a tradeoff in the power consumption of ADCs between the number of lost subcarriers and the number of transmitted packets. Figure 8 depicts the relation between power consumption (dynamic range) and the number of discarded subcarriers, with and without perfect channel knowledge. In each case the same amount of information was transmitted and decoded successfully by the receiver, and the subcarriers with the lowest energy are discarded. From this figure, we can also see that the channel estimation error does not really affect the total power consumption. With perfect channel estimation, the minimum power consumption is reached when 14 subcarriers are discarded. Without perfect channel knowledge, the lowest power consumption is obtained when 15 subcarriers are discarded.

In the following section, we compare both systems for the same bit rate.
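The relations in (12), (14), (15), and (17) can be illustrated with a short numerical sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the example channel value, and the estimation-noise variance are hypothetical, and the mapping from Δ to Nq in (8) is not reproduced because it lies outside this section.

```python
import math


def zf_channel_estimate(y_t: complex, x_t: complex) -> complex:
    """Zero-forcing estimate of one subcarrier, as in (12): H_hat = Y_t / X_t."""
    return y_t / x_t


def delta_method_1(h_hat: complex, snr_db: float, sigma_h2: float) -> float:
    """Quantization step from (15); keeps the estimation-noise variance sigma_h2."""
    snr = 10.0 ** (snr_db / 10.0)
    inner = abs(h_hat) ** 2 / snr - sigma_h2
    if inner <= 0.0:
        raise ValueError("estimation noise alone already exceeds the noise budget")
    return math.sqrt(6.0 * inner)


def delta_method_2(h_hat: complex, snr_db: float) -> float:
    """Quantization step from (17); neglects the estimation-noise variance."""
    snr = 10.0 ** (snr_db / 10.0)
    return math.sqrt(6.0 * abs(h_hat) ** 2 / snr)


def achieved_snr_db(h_hat: complex, delta: float, sigma_h2: float) -> float:
    """Per-subcarrier SNR from (14) for a given quantization step."""
    return 10.0 * math.log10(abs(h_hat) ** 2 / (sigma_h2 + delta ** 2 / 6.0))


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    h_hat = zf_channel_estimate(y_t=0.8 + 0.2j, x_t=1.0 + 0.0j)
    sigma_h2 = 1e-3
    for name, delta in (
        ("Method I", delta_method_1(h_hat, 12.0, sigma_h2)),
        ("Method II", delta_method_2(h_hat, 12.0)),
    ):
        print(name, round(delta, 4), round(achieved_snr_db(h_hat, delta, sigma_h2), 3))
```

Under these assumptions, Method II yields the larger step (hence fewer levels), but the SNR it actually achieves according to (14) falls slightly below the 12 dB target, which is consistent with the remark that Method II may lose more subcarriers than expected.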
[Plot: magnitude (dB) versus subcarrier index, from −30 to 30; curves: real H and estimated H.]
Figure 7: The influence of the channel estimation error and the difference in Nq between the quantization design Method I and the quantization design Method II. In this example, we assume the worst subcarrier has an SNR of 12 dB. Nq = 248 for the case with perfect channel knowledge. With the estimated channel knowledge, Nq = 348 for the quantization Method I and Nq = 216 for the quantization Method II.

[Plot: power consumption P (×10^8) versus number of lost subcarriers, from 0 to 20; curves: with perfect channel knowledge and without perfect channel knowledge.]
Figure 8: Power consumption defined in (10) versus the number of lost subcarriers for a Fountain code with a packet size of 168 bits. The dot points are with perfect channel knowledge and the circle points are without perfect channel knowledge.

5. Performance Analysis

In this section we compare three scenarios. Channel model A is used in all our simulations and we simulate at least 1 million bits per simulation. The first scenario, Scenario I, is a conventional IEEE 802.11a system with 16-QAM modulation and code rate 1/2. This mode has a throughput of 24 Mbit/s (source bits). As the standard allows 10% packet loss [6], the effective throughput is 0.9 · 24 = 21.6 Mbit/s. Moreover, we assume that conventional ADCs are used, of which the resolution has been designed for 90% of the channel realizations. In Scenario II, the conventional ADCs are replaced by resolution-adaptive ADCs, which are designed to allow 10% packet loss. Finally, in Scenario III, we apply the new opportunistic error-correction layer, which has the same effective throughput as Scenario II.

As discussed before, the channel estimation error can be neglected when high-resolution ADCs are used. In our simulations, we use the parameters of Table 1. The “SNR in frequency domain” is the minimal SNR for each subcarrier. If this value is met, the packet error rate (PER) will be less than 10%, as required by the standard [6]. Symbols are transmitted in bursts (i.e., MAC frames), and in 802.11a, 500 OFDM symbols are packed into one burst.

Table 1: System setup comparison for three scenarios (Nc: the number of data carriers, N: the number of subcarriers, Ns: the number of symbols per MAC frame).

                          Scenario I     Scenario II    Scenario III
ADC                       normal         res. adapt.    res. adapt.
FEC                       RCPC           RCPC           LDPC + Fount. codes
Code rate                 0.5            0.5            0.66
Modulation                16-QAM         16-QAM         16-QAM
Nc                        48             48             48
N                         64             64             64
Ns                        500            500            500
Effective throughput      21.6 Mbit/s    21.6 Mbit/s    21.6 Mbit/s
SNR in freq. domain       9.0 dB         9.0 dB         12.0 dB

From Figure 8, one can derive that the minimal power consumption for Scenario III will be reached if about 15 subcarriers can be discarded without perfect channel knowledge. So, the LDPC code and CRC checksum have to be chosen in such a way that the total throughput is equal to that of Scenario I and about 15 subcarriers are discarded by the receiver.

We replace the error-correction layer by a 7-bit CRC checksum and a (175,255) LDPC code, which has a code rate of 0.66. For the Fountain code part, we use an LT code with parameters c = 0.03 and δ = 0.3. The resulting Fountain code packets are transmitted on separate subcarriers and over multiple MAC frames. On average, 14 subcarriers can be discarded by the receiver, which is close to the optimal value if there is no perfect channel estimation.

Figure 9 shows the consumed power (per source bit) for each scenario versus the Fountain code block length K under the condition of nonperfect channel knowledge. For each simulation point, 2000 Fountain code bursts are transmitted. The power consumption in Scenario I is constant for each
K since the conventional ADC is designed for the worst case. In Scenario II, the power consumption on average is about 58% of the power consumed in Scenario I. The difference in the power consumption for different K in this scenario is due to the channel randomness. In Scenario III, the power consumption for different K is inversely proportional to the overhead of LT codes. The average power consumption for receiving 1 source bit in Scenario III is about 48% of the average power consumed in Scenario II and about 28% of the average power consumed in Scenario I.

[Plot: power consumption P (0 to 90) versus Fountain code block length K (100 to 1000); curves: Scenarios I, II, and III, each with perfect H and imperfect H.]
Figure 9: The power consumption (defined in (10)) per source bit. The curves with circle marks are for Scenario I, the curves with square marks are for Scenario II, and the curves with triangle marks are for Scenario III. The solid curves are with perfect channel knowledge and the dash-dot curves are without perfect channel knowledge.

Furthermore, Figure 9 shows the power consumption with perfect channel knowledge. As we can see, for each scenario, the consumed power differs little between perfect and nonperfect channel estimation. This difference depends on how accurately the channel is estimated. As we know, the zero-forcing estimation algorithm assumes no noise in the received symbol, which means this algorithm performs better when the SNR is higher. Figure 7 also shows that “good” subcarriers can be estimated more accurately than “bad” subcarriers. In Scenario III we only need to take care of “good” subcarriers, but we have to take care of all subcarriers in Scenarios I and II. From (8), we can see that Nq is determined by the quantization step Δ. The threshold of the used LDPC code is 12 dB, which means Δ depends on the wanted subcarrier with the lowest energy Hk as defined in (14). In a word, Nq is determined by Hk. |Hk|² in Scenario III is larger than |Hk|² in Scenarios I and II. So, the difference in Nq between perfect and nonperfect channel knowledge is smaller in Scenario III than in Scenarios I and II, as we can see in Figure 9. In this figure, in Scenario III both curves overlap, but this does not happen in Scenarios I and II. Therefore, the new error-correction layer is less sensitive to the channel estimation error than the conventional error-correction layer.

Thus, the resolution-adaptive ADC can save around 42% power and the novel opportunistic error-correction layer can save an additional 30% power consumption. In total, the new method reduces the power consumption in ADCs by 72% compared to the current 802.11a standard.

[Plot: BER (10^−6 to 10^−1) versus SNR (14 to 23 dB); curves: convolutional codes, LDPC, opportunistic error correction.]
Figure 10: FEC layers comparison over channel model A with the same effective transmission data rate of 21.6 Mbit/s (i.e., PER = 10%, BER = 2 × 10^−4). The SNR is in the time domain and the channel estimation is assumed to be perfect. The blue circle points are for convolutional codes, the red square points are for LDPC codes from 802.11n, and the black star points are for the opportunistic error-correction layer (Fountain codes + LDPC plus CRC). For SNR = 16 dB and higher, no errors are detected in the opportunistic error-correction layer. So, for SNR = 16 dB, we represent BER = 0 by 10^−6.

From Section 3, we can see that low power consumption means a low SNR requirement. Compared to the RCPC codes, LDPC codes have better performance, close to the Shannon limit. For that reason, LDPC codes have been adopted by the IEEE 802.11n standard. In order to check how our scheme performs with respect to the required SNR, we compare convolutional codes, LDPC codes from the IEEE 802.11n standard, and our opportunistic error-correction layer under the condition of the same effective throughput (i.e., 21.6 Mbit/s). Figure 10 shows the simulation results over the noisy channel model A with perfect channel estimation when K = 500. For each simulation point, more than 1 million bits are transmitted. From this figure, we can see that the required SNR for convolutional codes is 23 dB when BER = 2 × 10^−4. A similar value for this channel model is reported in [20]. Figure 10 shows that LDPC codes have a gain of around 4 dB compared to convolutional codes. However, the proposed method has a gain of around 7.5 dB.
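The power figures quoted in this section can be cross-checked with a few lines of arithmetic, together with a direct transcription of the metric in (10). This is a minimal sketch: the function name `adc_power` is hypothetical, αi is assumed to be given as a fraction rather than a percentage, and the two ratios are simply the rounded values stated in the text.

```python
def adc_power(alphas, nq_levels, samples_per_frame):
    """Power metric of (10): P = sum_i alpha_i * Nq_i * M.

    alphas: fraction of each channel realization carrying useful information
            (assumed here to be in [0, 1]).
    nq_levels: number of quantization levels used in each realization.
    samples_per_frame: M, the number of samples per MAC frame.
    """
    if len(alphas) != len(nq_levels):
        raise ValueError("alphas and nq_levels must have the same length")
    return sum(a * nq * samples_per_frame for a, nq in zip(alphas, nq_levels))


# Ratios quoted in the text (rounded values).
P2_OVER_P1 = 0.58  # Scenario II relative to Scenario I
P3_OVER_P2 = 0.48  # Scenario III relative to Scenario II

p3_over_p1 = P2_OVER_P1 * P3_OVER_P2                     # about 0.28
saving_adaptive_adc = 1.0 - P2_OVER_P1                   # about 0.42
saving_total = 1.0 - p3_over_p1                          # about 0.72
saving_additional = saving_total - saving_adaptive_adc   # about 0.30

if __name__ == "__main__":
    print(f"Scenario III / Scenario I: {p3_over_p1:.2f}")
    print(f"Saving from adaptive ADC:  {saving_adaptive_adc:.2f}")
    print(f"Additional saving:         {saving_additional:.2f}")
    print(f"Total saving:              {saving_total:.2f}")
```

The product 0.58 × 0.48 ≈ 0.28 reproduces the stated 28% relative to Scenario I, and the 72% total saving splits into the 42% from the resolution-adaptive ADC plus roughly 30 percentage points from the opportunistic error-correction layer.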
6. Conclusions and Future Work

In this paper, we propose a novel cross-layer scheme which integrates the error correction into the physical layer. This new opportunistic error-correction layer is designed for OFDM systems (e.g., the IEEE 802.11a system) and is based on Fountain codes and resolution-adaptive ADCs. Each Fountain-encoded packet is transmitted over a subcarrier. By discarding Fountain-encoded packets that have been transmitted over “bad” subcarriers, the dynamic range of the ADCs can be reduced. Correspondingly, the power consumption of the ADCs can be lowered as well.

The ADCs in a receiver can consume up to 50% of the total baseband energy, so it is advantageous to lower their power consumption. The resolution-adaptive ADC can save on average around 42% energy consumption compared to the conventional ADC. Fountain codes together with LDPC plus CRC codes allow the power consumption in the ADC to be decreased by an additional 30%. So, the new opportunistic error-correction layer can reduce the power consumption in the ADC by more than 70% compared with the conventional IEEE 802.11a system. In addition, it requires 7.5 dB less SNR than the 802.11a system.

Besides, we have shown that the new error-correction layer is robust against channel estimation errors. So, ADCs can also be adapted to the minimum resolution in a realistic system where the channel is estimated. Moreover, by using the message-passing algorithm and the Gaussian elimination algorithm, the new error-correction scheme can be applied to a small packet size (e.g., K = 500) with low overhead (e.g., 3%), which makes this new scheme efficient.

Here, we assume that there is no adjacent channel interference, an assumption that does not hold in real wireless channels. Further research focuses on the optimization of this new error-correction layer for wireless channels with adjacent channel interference.

Acknowledgments

The authors thank the anonymous reviewers for the useful comments. Also, the authors thank the Dutch Ministry of Economic Affairs under the IOP Generic Communication - SenterNovem Program for the financial support.

References

[1] A. Bahai, B. Saltzberg, and M. Ergen, Multi-Carrier Digital Communications: Theory and Applications of OFDM, Springer, New York, NY, USA, 2004.
[2] H. Liu and G. Li, OFDM-Based Broadband Wireless Networks: Design and Optimization, John Wiley & Sons, New York, NY, USA, 2005.
[3] M. Engels, Wireless OFDM Systems: How to Make Them Work? Kluwer Academic Publishers, Dordrecht, The Netherlands, 2002.
[4] D. Tse and P. Viswanath, Fundamentals of Wireless Communication, Cambridge University Press, New York, NY, USA, 2005.
[5] J. Thomson, B. Baas, E. M. Cooper, et al., “An integrated 802.11a baseband and MAC processor,” in Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC ’02), vol. 1, pp. 126–451, San Francisco, Calif, USA, February 2002.
[6] IEEE 802.11a-1999, “Supplement to Information Technology - Telecommunications and Information Exchange Between Systems - Local and Metropolitan Area Networks - Specific Requirements - Part 11: Wireless LAN Medium Access Control and Physical Layer Specifications: High Speed Physical Layer in the 5 GHz Band,” November 1999.
[7] D. J. C. MacKay, “Fountain codes,” IEE Proceedings: Communications, vol. 152, no. 6, pp. 1062–1068, 2005.
[8] M. Luby, “LT codes,” in Proceedings of the 43rd Annual IEEE Symposium on Foundations of Computer Science, pp. 271–280, Vancouver, Canada, November 2002.
[9] A. Shokrollahi, “Raptor codes,” IEEE Transactions on Information Theory, vol. 52, no. 6, pp. 2551–2567, 2006.
[10] H. Zhu, C. Zhang, and J. Lu, “Designing of fountain codes with short code-length,” in Proceedings of the 3rd International Workshop on Signal Design and Its Applications in Communications (IWSDA ’07), pp. 65–68, Chengdu, China, September 2007.
[11] X. Shao, R. Schiphorst, and C. H. Slump, “Opportunistic error correction for WLAN applications,” in Proceedings of the International Conference on Wireless Communications, Networking and Mobile Computing (WiCOM ’08), pp. 1–5, Dalian, China, October 2008.
[12] A. Bogdanov, M. C. Mertens, C. Paar, J. Pelzl, and A. Rupp, “A parallel hardware architecture for fast Gaussian elimination over GF(2),” in Proceedings of the 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM ’06), pp. 237–248, Napa, Calif, USA, April 2006.
[13] Y. Kou, S. Lin, and M. P. C. Fossorier, “Low-density parity-check codes based on finite geometries: a rediscovery and new results,” IEEE Transactions on Information Theory, vol. 47, no. 7, pp. 2711–2736, 2001.
[14] C. Berrou, A. Glavieux, and P. Thitimajshima, “Near Shannon limit error-correcting coding and decoding: turbo-codes. 1,” in Proceedings of the IEEE International Conference on Communications (ICC ’93), vol. 2, pp. 1064–1070, Geneva, Switzerland, May 1993.
[15] E. Jacobsen, “LDPC FEC for 802.11n application,” IEEE, 2003.
[16] J. G. Proakis, Digital Communications, McGraw Hill, New York, NY, USA, 2001.
[17] S. Nahata, K. Choi, and J. Yoo, “A high-speed power and resolution adaptive flash analog-to-digital converter,” in Proceedings of the IEEE International SOC Conference, pp. 33–36, Santa Clara, Calif, USA, September 2004.
[18] X. Shao and C. H. Slump, “Quantization effects in OFDM systems,” in Proceedings of the 29th Symposium on Information Theory in the Benelux, pp. 93–103, Leuven, Belgium, May 2008.
[19] S. M. Ross, Introduction to Probability Models, Academic Press, Orlando, Fla, USA, 2003.
[20] A. Doufexi, S. Armour, M. Butler, et al., “A comparison of the Hiperlan/2 and IEEE 802.11a wireless LAN standards,” IEEE Communications Magazine, vol. 40, no. 5, pp. 172–180, 2002.
doi:10.1155/2009/635304
Research Article
Service Differentiation in OFDM-Based IEEE 802.16 Networks
Yi Zhou,1 Kai Chen,2 Jianhua He,3 Haibin Guan,4 Yan Zhang,5 and Alei Liang1
1 School of Electronic, Information and Electrical Engineering, Shanghai Jiaotong University, Shanghai 200240, China
2 School of Information Security and Engineering, Shanghai Jiaotong University, Shanghai 200240, China
3 Institute of Advanced Telecommunications, Swansea University, Swansea SA2 8PP, UK
4 Department of Computer Science, Shanghai Jiaotong University, Shanghai 200240, China
5 Simula Research Laboratory, 1325 Lysaker, Norway
Correspondence should be addressed to Kai Chen, kchen@sjtu.edu.cn
Received 1 August 2008; Accepted 1 December 2008
The IEEE 802.16 network is widely viewed as a strong candidate solution for broadband wireless access systems. Various flexible
mechanisms related to QoS provisioning have been specified for uplink traffic at the medium access control (MAC) layer in
the standards. Among these mechanisms, the bandwidth request scheme can be used to indicate and request bandwidth demands to
the base station for different services. Due to the diverse QoS requirements of the applications, service differentiation (SD) is
desirable for the bandwidth request scheme. In this paper, we propose several SD approaches. The approaches are based on the
contention-based bandwidth request scheme and are achieved by assigning different channel access parameters and/or
bandwidth allocation priorities to different services. Additionally, we propose an effective analytical model to study the impacts of the
SD approaches, which can be used for the configuration and optimization of the SD services. It is observed from simulations that
the analytical model has high accuracy. Services can be efficiently differentiated with the initial backoff window in terms of throughput
and channel access delay. Moreover, the service differentiation can be improved if combined with the bandwidth allocation priority
approach without adverse impacts on the overall system throughput.
Copyright © 2009 Yi Zhou et al. This is an open access article distributed under the Creative Commons Attribution License, which
1. Introduction

In today's telecommunications, networking and services are changing rapidly to support the next generation Internet user environment. Broadband wireless access (BWA) is one of the most promising solutions for broadband access and will play an important role in the next generation Internet. BWA systems are being increasingly deployed and used in the last mile for extending or enhancing Internet connectivity for fixed and/or mobile clients located on the edge of the wired network. The IEEE 802.16 standard has been developed as one of the technical solutions for BWA systems [1]. The physical (PHY) layer and MAC layer specifications are defined in 802.16d for fixed BWA, and enhanced with low to moderate mobility support in 802.16e [1]. Worldwide Interoperability for Microwave Access (WiMAX) was founded to promote the compatibility of 802.16 products. IEEE 802.16 networks can have a wide variety of applications, including high-speed Internet access, backhaul for WiFi hotspot and cellular networks [2–7]. It can provide a cost-effective alternative to the existing solutions for these applications.

Flexible bandwidth request and allocation mechanisms have been specified in the 802.16 standards to support different scheduling services for uplink traffic, namely, unsolicited grant services (UGSs), real-time polling services (rtPSs), nonreal-time polling services (nrtPSs), and best-effort (BE) services [1–3, 8, 9]. UGS connections periodically receive bandwidth grants for the uplink traffic from the base station without sending bandwidth requests. rtPS connections periodically receive bandwidth grants to send bandwidth requests to the base station to indicate bandwidth demands. On the other hand, nrtPS and best-effort (BE) services need to transmit bandwidth requests via random access or by piggybacking the requests on already granted data transmissions. nrtPS connections may also receive sporadic exclusive bandwidth request opportunities to request bandwidth from the base station.
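The way the four scheduling services obtain uplink bandwidth, as described above, can be summarized in a small lookup structure. This is only an illustrative sketch of the paragraph's content: the names `RequestMechanism`, `SCHEDULING_SERVICES`, and `uses_contention` are hypothetical and do not come from the 802.16 standard or from this paper.

```python
from enum import Enum


class RequestMechanism(Enum):
    """Ways a connection can obtain uplink bandwidth, per the description above."""
    UNSOLICITED_GRANT = "periodic grants without sending bandwidth requests"
    PERIODIC_POLL = "periodic grants used to send bandwidth requests"
    SPORADIC_POLL = "sporadic exclusive bandwidth request opportunities"
    RANDOM_ACCESS = "contention-based bandwidth requests"
    PIGGYBACK = "requests piggybacked on already granted data transmissions"


# Hypothetical summary table; keys are the service abbreviations used in the text.
SCHEDULING_SERVICES = {
    "UGS": {RequestMechanism.UNSOLICITED_GRANT},
    "rtPS": {RequestMechanism.PERIODIC_POLL},
    "nrtPS": {
        RequestMechanism.RANDOM_ACCESS,
        RequestMechanism.PIGGYBACK,
        RequestMechanism.SPORADIC_POLL,
    },
    "BE": {RequestMechanism.RANDOM_ACCESS, RequestMechanism.PIGGYBACK},
}


def uses_contention(service: str) -> bool:
    """Return True if the service may send requests via random access."""
    return RequestMechanism.RANDOM_ACCESS in SCHEDULING_SERVICES[service]


if __name__ == "__main__":
    for name in SCHEDULING_SERVICES:
        print(name, "contention-based:", uses_contention(name))
```

Only nrtPS and BE rely on contention, which is why the random access scheme is the natural place to apply service differentiation for these classes.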
UGS and rtPS scheduling services can be used for applications with stringent QoS requirements, for example, VoIP and VoD. However, one problem of the advanced scheduling services (UGS and rtPS) is that good knowledge of the traffic characteristics (e.g., packet arrival interval and packet size) is required to provide satisfactory QoS and efficiently utilize bandwidth [10]. Since the traffic characteristics of many applications are not known in advance, traffic transported via the random bandwidth request scheme will achieve better bandwidth utilization, which makes random access very important for applications over IEEE 802.16 networks.

Several papers have investigated the IEEE 802.16 random access scheme by simulation, theoretical analysis, or both. Vinel et al. accurately analyzed the truncated binary exponential backoff (TBEB) algorithm specified in the 802.16 standard [11]. The performance of the 802.16 contention-based CDMA requesting mechanism is analyzed with the orthogonal frequency-division multiple access (OFDMA) physical layer [12–15]. Attempts at analyzing the random access protocol and finding the optimal access parameters are made in [16, 17]. The 802.16 bandwidth request scheme is analyzed with saturated traffic and limited bandwidth in [14, 18]. Random access and polling mechanisms for WiMAX networks are compared in [9]. The OFDM- and OFDMA-based random access schemes are compared by simulation in [10].

In this paper, we investigate several SD approaches for the bandwidth request scheme with limited bandwidth. As the bandwidth request scheme is anticipated to support applications with diverse QoS requirements, for example, emergency services with bursty traffic and real-time voting for live entertainment, it is important to implement SD over the bandwidth request scheme [19]. A QoS differentiation scheme for IEEE 802.16 mesh networks was investigated in [20]. However, to the best of our knowledge, no papers have been published so far with a comprehensive study of SD with the 802.16 bandwidth request scheme. Moreover, most of the existing work has not considered the constraints of available bandwidth [11, 13, 16, 17]. In addition, we also propose an analytical model to study the impacts and effectiveness of the SD approaches. The analytical model can be used as a tool for the adaptive configuration of the SD approaches. In the remainder of the paper, the bandwidth request scheme is introduced in Section 2. SD approaches are also described in Section 2. The analytical model is presented in Section 3. Simulation results are presented and analyzed in Section 4. Finally, Section 5 concludes the paper.

2. Bandwidth Request Scheme

2.1. Overview of MAC Layer Protocol. The 802.16 standard specifies two modes for sharing the wireless medium: point-to-multipoint (PMP) and mesh modes. A base station (BS) serves multiple subscriber stations (SSs) in the PMP mode. The downlink (from the BS to the user) operates on a PMP basis and is generally broadcast. All the transmissions in the uplink from SSs are directed to and centrally coordinated by the BS. Mesh mode is an optional configuration, in which SSs can communicate with the BS over multiple hops. The IEEE 802.16 standard specifies four physical layers: SC and SCa for single carrier transmission in line-of-sight (LoS) and non-line-of-sight (NLoS) environments, and OFDM and OFDMA for multicarrier transmission in NLoS environments. A common MAC layer is defined for all physical layers with small adaptations to the different physical layers. In the following, we focus on the bandwidth request scheme in PMP networks with the OFDM physical layer. It is worth noting, however, that the proposed SD approaches and analytical model can be easily extended to PMP networks with the OFDMA physical layer, as the bandwidth request scheme is very similar for both the OFDM and OFDMA physical layers.

The MAC layer supports both time-division duplexing (TDD) and frequency-division duplexing (FDD) modes. In the TDD mode, which is investigated in this paper, each MAC frame consists of a downlink subframe followed by an uplink subframe. The investigation can be applied to the FDD mode as well. The downlink subframe starts with a preamble and a frame control header. The frame control header specifies the presence of control information within the downlink subframe. The downlink MAP (DL-MAP) and uplink MAP (UL-MAP) follow the frame control header and specify the usage of the downlink and uplink subframes, respectively. In particular, the DL-MAP defines the starting times, destinations, and burst profiles (modulation and coding) of the data bursts within the downlink subframe. The UL-MAP allocates resources in the uplink subframe to the different subscriber stations for purposes such as ranging, bandwidth requesting, and data burst transmission.

2.2. Basic Bandwidth Request Scheme. The bandwidth request scheme in 802.16 networks consists mainly of the procedures of bandwidth requesting and granting. A general method of bandwidth requesting is that SSs send a stand-alone bandwidth request header to indicate the required bandwidth (in bytes) in the transmission opportunities (TOs), which are allocated by the BS in the uplink subframes. The BS will process the successfully received bandwidth request headers from the SSs and allocate available bandwidth to the SSs. The bandwidth allocation will be indicated in the downlink subframes. If bandwidth is granted to an SS, the SS can send a collision-free transmission burst in the allocated bandwidth. Each burst is associated with a physical burst profile (i.e., modulation and error control coding schemes). In addition, a collision-free bandwidth request header can be sent together with data in the allocated bandwidth to request further bandwidth if the SS has more traffic to send. The above general requesting method can be used for different services. Alternative bandwidth requesting mechanisms include contention-based focused bandwidth requests (for the WirelessMAN-OFDM physical specification only) and contention-based CDMA bandwidth requests (for WirelessMAN-OFDMA only) [1]. In this paper, we focus on
the general bandwidth requesting method without request piggyback.

As the traffic from the nrtPS and BE connections can be bursty and is hard to predict, the TBEB algorithm has been specified in the 802.16 standard for collision resolution. When an SS has a packet to send to the BS, it uses the TBEB algorithm to determine in which frame and in which TO to transmit its bandwidth request [1]. A backoff process is initiated, with the backoff counter uniformly chosen in $[0, W_0 - 1]$, where $W_0$ is the initial backoff window. The backoff counter decreases by one for each eligible TO in the frames, without the need to sense the channel. Once the backoff counter reaches zero, the SS can transmit its bandwidth request header to the BS. After the request header is transmitted, the SS sets a timer (with a value of $N_w$ in units of frames) to wait for the bandwidth grant from the BS. If the transmitted request header does not collide with other request headers and can be decoded by the BS, the BS may allocate the requested bandwidth to the SS provided that bandwidth is available. If the SS receives a bandwidth grant before the timer expires, it can use the granted bandwidth to transmit a data burst in the subsequent frame [1]. However, if the SS fails to receive a bandwidth grant before the timer times out, the SS will retransmit the bandwidth request header to the BS. The bandwidth request header can be retransmitted up to a maximal number of $m$ retries. For every retransmission, the backoff window is doubled until it reaches the maximal backoff window (denoted by $W_x$), which means that the backoff counter for the $i$th retransmission of a bandwidth request header will be uniformly chosen in $[0, W_i - 1]$ with $W_i = \min(2^i W_0, W_x)$. If the bandwidth request header does not result in a successful bandwidth grant after $m$ retries, the SS will stop this bandwidth request attempt and discard the packet associated with it. If there are still packets left in the buffer, the SS can start the TBEB algorithm again to request bandwidth.

It is noted that the BS will determine and broadcast the channel access parameters, such as the initial backoff window $W_0$, the maximal backoff window $W_x$, the maximal number of retries $m$, and the timeout value $N_w$. It is also the BS's responsibility to determine the number of TOs for the SSs in each uplink subframe.

2.3. Service Differentiation Schemes. From the bandwidth request scheme, we can find that the quality of service for the nrtPS and BE connections will be affected directly by the number of eligible TOs in a frame, the random channel access parameters, and the policy by which the BS grants available bandwidth to the successfully received bandwidth request headers. To differentiate the services for the connections, we can use the following intuitive methods.

(1) Allocating different TOs for the connections.
(2) Setting different channel access parameters.
(3) Giving priorities on allocating bandwidth to SSs with successfully received bandwidth request headers.

The differentiation methods can be used separately or jointly. In this paper, we will consider the approaches of differentiating channel access parameters and priority-based bandwidth allocation. In the 802.16 standard, the maximal number of retries $m$ is required to be no less than 16. Therefore, setting different $m$ for the connections will not be an efficient SD approach. With regard to the priority-based bandwidth allocation, equal priority and absolute priority-based schemes are considered. In the absolute priority-based scheme, the available bandwidth is allocated to the higher priority connections first. The lower priority connections will receive bandwidth grants only after all the higher priority connections have received bandwidth grants. For connections with the same priority and insufficient bandwidth, the BS randomly chooses some connections to serve.

In general, we will assume that a BS serves $N_u$ independent SSs in the PMP network. There are $K$ service classes, with service class $k$ associated with $N_{u,k}$ SSs, and $N_u = \sum_{k=1}^{K} N_{u,k}$. Denote the duration of an OFDM symbol by $T_{sym}$ in seconds and the frame duration by $T_f$ in seconds. Each REQ transmission opportunity (TO) consists of $T_r$ OFDM symbols. Assume that a fixed number ($N_r$) of TOs is assigned in each uplink subframe, and that the bandwidth (in bytes) requested by each request header can be accommodated by one transmission burst. Each REQ is assumed to request bandwidth for a data packet of a fixed length of $L$ bytes. Each burst consists of $T_d$ OFDM symbols. Bandwidth for $N_d$ bursts is allocated for the bandwidth requesting scheme in each uplink subframe.

In principle an SS can have traffic from multiple connections. For convenience, we assume that each SS has only one connection with the BS, and that the connection belongs to one of the two predefined service classes in the network. Each service class is associated with a set of channel access parameters. A priority policy is used for the service classes on bandwidth allocation. Without loss of generality, we assume that class $i$ connections (SSs) have higher priority than class $j$ connections (SSs) with $1 \le i < j \le K$. Therefore, the channel parameters $W_0$ and $W_x$ for class 1 connections are not larger than those for class 2 connections, and class 1 connections have higher or equal priority compared to class 2 connections on receiving bandwidth grants from the BS.

3. Analytical Model

3.1. Model Assumption. For simplicity, we model 2 service classes in the PMP networks, with $N_{u,1}$ and $N_{u,2}$ SSs associated with the first and the second service classes, respectively. Each SS has saturated traffic to send to the BS. If the number of successfully received bandwidth request headers (simply abbreviated as REQ) in a frame is larger than $N_d$, the BS will simply serve $N_d$ randomly chosen SSs and drop the REQs received from the other SSs. It is trivial to extend the model to more than 2 service classes and unequal priority on the bandwidth allocation. Let $W_{0,k}$ and $W_{x,k}$ denote the initial and maximal backoff windows of TBEB, for $k \in [1, 2]$, respectively. Denote the maximal number of retries as $m_k$ for service class $k$. The contention window $W_{i,k}$
for backoff stage $i$ ($i \in [0, m_k]$) is $\min(2^i W_{0,k}, W_{x,k})$, where the function min() returns the minimum of its arguments. If a class $k$ SS is unsuccessful in a TO transmission due to either collision or insufficient bandwidth, the SS waits $N_w$ frames before retransmission.

3.2. REQ Transmission Probability. We assume that each SS transmits independently of the other SSs in steady state. Let $\tau_k$ denote the probability that a tagged class $k$ SS transmits a REQ over the TOs in a general frame. Let $p_k$ denote the probability that a TO transmission from the tagged class $k$ SS fails due to either collision or insufficient bandwidth at the BS. $\tau_k$ and $p_k$ are assumed to be constant in the steady state.

To derive the expression for $\tau_k$, we define a contention resolution process (CRP) as the process of the tagged SS from the initialization of TBEB to the end of TBEB for a packet transmission attempt. At the end of a TBEB, the packet is either discarded or successfully sent to the BS. $\tau_k$ is therefore computed for the tagged class $k$ SS as the ratio of the average number of TO transmissions in a CRP for a class $k$ SS (denoted by $N_{t,k}$) to the average period of a CRP (denoted by $N_{crp,k}$, in frames) for a class $k$ SS. We can obtain $\tau_k$ ($k \in [1, 2]$) by

$$\tau_k = \frac{N_{t,k}}{N_{crp,k}}. \tag{1}$$

Let $p_{req,k}(i)$ denote the probability that the number of class $k$ REQ transmissions is exactly $i$ in a CRP, for $i \in [1, m + 1]$. We have $p_{req,k}(i) = (1 - p_k) p_k^{i-1}$ for $i \in [1, m]$, and $p_{req,k}(m + 1) = p_k^m$. The average number of REQ transmissions in a CRP ($N_{t,k}$) can be computed by

$$N_{t,k} = \sum_{i=1}^{m+1} i\, p_{req,k}(i) = \sum_{i=0}^{m} (i + 1)(1 - p_k) p_k^{i} + (m + 1) p_k^{m+1}. \tag{2}$$

The average period of a CRP depends on the number of TO transmissions in a CRP by an SS and the outcomes of the transmissions. Let $N_{z,k}$ denote the number of frames required for the tagged class $k$ SS to transmit a data packet after a successful TO transmission. Denote by $N_{b,k}(i)$ the average of the total number of frames that the tagged class $k$ SS spends on the backoff processes for TO transmissions in a CRP, under the condition of exactly $i$ retries in the CRP. Assume that $W_{0,k}$ is divisible by $N_r$. Then $N_{b,k}(i)$, for $i \in [0, m]$, is given by

$$N_{b,k}(i) = \sum_{j=0}^{i} \frac{1}{W_{j,k}} \sum_{c=0}^{W_{j,k}-1} \left\lceil \frac{c + 1}{N_r} \right\rceil = \sum_{j=0}^{i} \frac{N_r + W_{j,k}}{2 N_r}, \tag{3}$$

where $\lceil x \rceil$ is the minimal integer $n$ no smaller than $x$.

The average period $N_{crp,k}$ of a CRP for the tagged class $k$ SS can be computed by

$$N_{crp,k} = \sum_{i=0}^{m} (1 - p_k) p_k^{i} \left[ i (N_w - 1) + N_{b,k}(i) + N_{z,k} \right] + p_k^{m+1} \left[ (m + 1)(N_w - 1) + N_{b,k}(m + 1) \right]. \tag{4}$$

3.3. Probability of Unsuccessful REQ Transmission. Note that here an unsuccessful REQ means either a collided REQ transmission or an uncollided REQ that does not receive a bandwidth allocation from the BS due to limited bandwidth. Let $p_{q,k}(l)$ denote the probability of exactly $l$ TO transmissions from the $N_{u,k}$ class $k$ SSs in a frame, and let $N_{tr,k}$ denote the average number of TO transmissions from class $k$ SSs in a frame. We can calculate them with given $\tau_k$ by

$$p_{q,k}(l) = \binom{N_{u,k}}{l} \tau_k^{l} (1 - \tau_k)^{N_{u,k} - l}, \tag{5}$$

and $N_{tr,k} = \sum_{l=1}^{N_{u,k}} l\, p_{q,k}(l)$.

A collision will happen if more than one REQ is transmitted in the same TO of a frame. Denote by $p_{uc}(s_1, s_2 \mid l_1, l_2)$ the probability that exactly $s_1$ REQs among the $l_1$ REQs from class 1 SSs and $s_2$ REQs among the $l_2$ REQs from class 2 SSs are uncollided in a frame, where $s_1 \le l_1$, $s_2 \le l_2$, and $s_1 + s_2 \le N_r$. We can compute $p_{uc}(s_1, s_2 \mid l_1, l_2)$ by (6), with a method similar to that for a classic occupancy problem given in [21]:

$$p_{uc}(s_1, s_2 \mid l_1, l_2) = \frac{1}{N_r^{l_1 + l_2}} \binom{l_1}{s_1} \binom{l_2}{s_2} \frac{N_r!}{(N_r - s_1 - s_2)!} \times \sum_{v=0}^{\min(N_r - s_1 - s_2,\; l_1 + l_2 - s_1 - s_2)} (-1)^{v} \binom{N_r - s_1 - s_2}{v} (N_r - s_1 - s_2 - v)^{l_1 + l_2 - s_1 - s_2 - v}, \tag{6}$$

for $0 \le s_1 \le l_1$, $0 \le s_2 \le l_2$, and $s_1 + s_2 \le N_r$; otherwise $p_{uc}(s_1, s_2 \mid l_1, l_2) = 0$. In (6), the first factor $1/N_r^{l_1 + l_2}$ is the probability of each arrangement of random transmission of $l_1 + l_2$ REQs over $N_r$ TOs; the second, the third, and the fourth factors calculate the number of ways that $s_1$ of the $l_1$ class 1 REQs and $s_2$ of the $l_2$ class 2 REQs are chosen and each of the chosen REQs is transmitted over one of the $N_r$ TOs without collision. The summation calculates the number of ways that the remaining $l_1 + l_2 - s_1 - s_2$ REQs are transmitted over the remaining $N_r - s_1 - s_2$ TOs and none of the $l_1 + l_2 - s_1 - s_2$ REQ transmissions is successful.

Then we can compute the average number of successful REQs transmitted from class $k$ SSs in a frame (denoted by $N_{suc,k}$) by (7) for $k = 1$:

$$N_{suc,1} = \sum_{l_1=0}^{N_{u,1}} \sum_{l_2=0}^{N_{u,2}} p_{q,1}(l_1)\, p_{q,2}(l_2) \sum_{s_1=1}^{l_1} \sum_{s_2=0}^{l_2} p_{uc}(s_1, s_2 \mid l_1, l_2)\, \frac{s_1 \min(N_d, s_1 + s_2)}{s_1 + s_2}, \tag{7}$$
and by (8) for $k = 2$, respectively:

$$N_{suc,2} = \sum_{l_1=0}^{N_{u,1}} \sum_{l_2=0}^{N_{u,2}} p_{q,1}(l_1)\, p_{q,2}(l_2) \sum_{s_1=0}^{l_1} \sum_{s_2=1}^{l_2} p_{uc}(s_1, s_2 \mid l_1, l_2)\, \frac{s_2 \min(N_d, s_1 + s_2)}{s_1 + s_2}. \tag{8}$$

The probability of unsuccessful REQ transmission $p_k$ can be simply computed from $N_{suc,k}$ and the average number of transmitted REQs in one frame:

$$p_1 = 1 - \frac{N_{suc,1}}{\sum_{l=1}^{N_{u,1}} p_{q,1}(l)\, l}, \qquad p_2 = 1 - \frac{N_{suc,2}}{\sum_{l=1}^{N_{u,2}} p_{q,2}(l)\, l}. \tag{9}$$

The values of $\tau_k$ and $p_k$ ($k \in [1, 2]$) can be obtained by solving the nonlinear equations (1) and (9) with numeric techniques.
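As one ingredient of such a numerical solution, the map from a given failure probability $p_k$ to the transmission probability $\tau_k$ in (1)-(4), together with the binomial term in (5), can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the backoff time is evaluated from the ceiling sum in (3), and the value used for $N_{z,k}$ (one frame) is an assumption, since the text does not state it.

```python
from math import ceil, comb


def contention_window(i, w0, wx):
    """Contention window of backoff stage i: W_i = min(2^i * W0, Wx)."""
    return min((2 ** i) * w0, wx)


def mean_backoff_frames(i, w0, wx, n_r):
    """N_b(i) as in (3): average frames spent in backoff over stages 0..i."""
    total = 0.0
    for j in range(i + 1):
        w = contention_window(j, w0, wx)
        # A counter value c expires in the ceil((c + 1) / N_r)-th frame.
        total += sum(ceil((c + 1) / n_r) for c in range(w)) / w
    return total


def tx_probability(p, m, w0, wx, n_r, n_w, n_z):
    """tau as in (1): ratio of N_t from (2) to N_crp from (4).

    n_z (frames needed to send the data packet after a successful request)
    is an assumed input; the paper does not give its value.
    """
    n_t = sum((i + 1) * (1 - p) * p ** i for i in range(m + 1))
    n_t += (m + 1) * p ** (m + 1)
    n_crp = sum(
        (1 - p) * p ** i
        * (i * (n_w - 1) + mean_backoff_frames(i, w0, wx, n_r) + n_z)
        for i in range(m + 1)
    )
    n_crp += p ** (m + 1) * (
        (m + 1) * (n_w - 1) + mean_backoff_frames(m + 1, w0, wx, n_r)
    )
    return n_t / n_crp


def tx_count_pmf(l, n_users, tau):
    """p_q(l) as in (5): probability that exactly l of n_users SSs transmit."""
    return comb(n_users, l) * tau ** l * (1 - tau) ** (n_users - l)
```

A full solver would alternate between this map and (5)-(9) until $\tau_k$ and $p_k$ stop changing; that outer iteration is omitted here.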
3.4. Throughput and Delay. With the transmission probability $\tau_k$ and the probability of unsuccessful REQ transmission $p_k$, we can calculate the performance metrics of interest for service differentiation, including throughput and channel access delay for each service class. In addition, the overall system bandwidth efficiency can also be computed.

Define the throughput of a single SS (denoted by $\theta_k$ for service class $k$, $k \in [1, 2]$) as the average number of bits transmitted from an SS to the BS in one second. The throughput of a single SS depends largely on the physical burst profile. Let $B_{sym}$ denote the number of uncoded bits that can be transmitted with an OFDM symbol for an SS. Then the throughput $\theta_k$ of a class $k$ SS can be computed for $k \in [1, 2]$:

$$\theta_k = \frac{N_{suc,k}\, T_d\, B_{sym}}{T_f}, \tag{10}$$

where $T_f$ is the duration of a frame. Define the channel access delay $D_k$ for service class $k$ ($k \in [1, 2]$) as the average time elapsed between the beginning of the frame in which the first backoff process is initiated for a packet from a class $k$ SS and the end of the frame in which the last backoff process ends in a bandwidth request attempt. Then we can simply get $D_k = N_{crp,k} T_f$.

We define the bandwidth efficiency (denoted by $\eta$) of a bandwidth request scheme as the ratio of the data packet length (denoted by $T_d$, in OFDM symbols) to the average bandwidth (in OFDM symbols) consumed to successfully transmit a data packet. Let $T_r$ denote the bandwidth (in OFDM symbols) needed for a TO, which is two for the general OFDM-based bandwidth request scheme. The bandwidth efficiency $\eta$ for the whole system can be calculated by

$$\eta = \frac{T_d \left( N_{suc,1} + N_{suc,2} \right)}{N_r T_r + N_d T_d}. \tag{11}$$

Figure 1: Throughput of single SS versus the number of SSs with equal bandwidth allocation priority and differentiated channel access parameters: $W_{0,1} = W_{0,2} = N_r$, $W_{x,1} = 2^4 W_{0,1}$, and $W_{x,2} = 2^8 W_{0,2}$. (Horizontal axis: number of subscriber stations; vertical axis: throughput per subscriber station in kbps; curves: analysis and simulation for classes 1 and 2 with $N_r/N_d = 1.5$ and $N_r/N_d = 3$.)

4. Numerical Results

A discrete event driven simulator for the IEEE 802.16 bandwidth request scheme has been implemented. The simulator can be configured with various system and channel access parameters. With the simulator, we can investigate the effectiveness of the analytical model and how the network performances can be differentiated under the conditions of a changing number of SSs and limited bandwidth.

For the results presented in the paper, the channel bandwidth is set to 20 MHz in the physical layer. We assume an ideal channel, where frames will be successfully received unless a collision happens. The frame duration is set to 10 milliseconds. There are 256 independent subcarriers in an OFDM symbol and 844 OFDM symbols in a frame. Assume that each TO takes 2 OFDM symbols ($T_r = 2$) and each transmission burst takes 16 OFDM symbols ($T_d = 16$). The number of uncoded bits $B_{sym}$ that can be transmitted in an OFDM symbol is set to 384, obtained with the burst profile of 16QAM modulation and 1/2 coding rate. The value of the timer $N_w$ is set to 2. For the bandwidth configuration, the number of bursts $N_d$ that can be transmitted in a frame is set to 12. The number of TOs $N_r$ in a frame has two configurations: $N_r = 1.5 N_d$ and $N_r = 3 N_d$. The maximal number of retries $m$ is set to 16 for both service classes in all the tests.

4.1. Differentiation with Channel Access Parameters. In this section, we investigate the accuracy of the analytical model
and the achievable SD performances by tuning channel access parameters. In this case, only the channel access parameters are used for SD and the priority on bandwidth allocation is the same for the two classes of SSs.

We first test the impact of the maximal backoff window on SD. The initial backoff windows for both classes are set to $W_{0,1} = W_{0,2} = N_r$. The SD is achieved by setting $W_{x,1} = 2^4 W_{0,1}$ and $W_{x,2} = 2^8 W_{0,2}$. The throughput of a single SS $\theta$, the channel access delay $D$, and the REQ transmission probability $\tau$ are plotted in Figures 1, 2, and 3, respectively. As the REQ unsuccessful probability is very close for both service classes, the corresponding results are not presented. The number of SSs in the figures is the sum of class 1 and class 2 SSs, and the number of class 1 SSs is fixed at 10. The throughput and channel access delay of class 1 SSs are represented by solid lines, and those of class 2 SSs by dashed lines in the figures. We also use the symbols "square" and "diamond" to denote the performances with $N_r = 1.5 N_d$ and $N_r = 3 N_d$, respectively. The throughput, channel access delay, and REQ transmission probability for each class of SSs are averaged over the corresponding SSs in the classes. For comparison, the symbols corresponding to analytical results are not filled with any color and the symbols corresponding to simulation results are filled with black color. Each simulation result is obtained by averaging 30 simulations. We can observe from Figures 1, 2, and 3 that the analytical model has very high accuracy. It is also observed that differentiating services with only the maximal backoff window is not so effective. Especially when the network is not congested (e.g., less than 40 SSs in the network), the SD is not obvious in Figure 1.

Figure 2: Delay of SS versus the number of SSs with equal bandwidth allocation priority and differentiated channel access parameters: $W_{0,1} = W_{0,2} = N_r$, $W_{x,1} = 2^4 W_{0,1}$, and $W_{x,2} = 2^8 W_{0,2}$. (Horizontal axis: number of subscriber stations; vertical axis: channel access delay in ms.)

Figure 3: REQ transmission probability of SS versus the number of SSs with equal bandwidth allocation priority and differentiated channel access parameters: $W_{0,1} = W_{0,2} = N_r$, $W_{x,1} = 2^4 W_{0,1}$, and $W_{x,2} = 2^8 W_{0,2}$. (Horizontal axis: number of subscriber stations; vertical axis: transmission probability per frame.)

Next, we investigate the SD performances with different initial backoff windows and maximal backoff windows. We set different initial backoff windows and maximal backoff windows for the two classes of SSs. We keep $W_{0,1} = N_r$, $W_{x,1} = 2^4 W_{0,1}$, and $W_{x,2} = 2^8 W_{0,2}$, but change $W_{0,2}$ to $2 N_r$ and $3 N_r$. The corresponding throughput associated with $W_{0,2} = 2 N_r$ and $W_{0,2} = 3 N_r$ is plotted in Figures 4 and 5, respectively. Compared to the results in Figure 1, again it can be observed that the analytical model has very high accuracy. We can find that the initial backoff window is much more effective than the maximal backoff window for SD, for the whole range of the number of SSs in the network. The SD is more obvious with increased differentiation on the initial backoff window. The class 1 SS throughput is more than twice the class 2 SS throughput in most of the investigated cases for $W_{0,2} = 3 N_r$. Class 1 SS throughput is proportional to the throughput of class 2 SS with differentiated initial backoff window, which can help to configure the channel access parameters for SD.

Figure 4: Throughput of single SS versus the number of SSs with equal bandwidth allocation priority and differentiated channel access parameters: $W_{0,1} = N_r$, $W_{0,2} = 2 N_r$, $W_{x,1} = 2^4 W_{0,1}$, and $W_{x,2} = 2^8 W_{0,2}$.

Figure 6: Delay of SS versus the number of SSs with equal bandwidth allocation priority and differentiated channel access parameters: $W_{0,1} = N_r$, $W_{0,2} = 2 N_r$, $W_{x,1} = 2^4 W_{0,1}$, and $W_{x,2} = 2^8 W_{0,2}$.

Similarly, we plotted the results of channel access delay in Figures 6 and 7, and the REQ transmission probability in Figures 8 and 9, for $W_{0,2} = 2 N_r$ and $W_{0,2} = 3 N_r$, respectively. It can be observed from Figures 8 and 9 that the REQ transmission probability of class 1 SSs does not change much, as their initial backoff window is unchanged. The REQ transmission probability of class 2 SSs reduces with the increased initial backoff window. Consequently, the throughput of class 2 SSs largely reduces and the channel access delay increases.
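One plausible way to see why the maximal backoff window alone differentiates weakly at low load is to list the contention windows $W_i = \min(2^i W_0, W_x)$ of the two classes stage by stage. The Python sketch below does this for the configuration $W_{0,1} = W_{0,2} = N_r$, $W_{x,1} = 2^4 W_{0,1}$, $W_{x,2} = 2^8 W_{0,2}$ with $N_d = 12$ and $m = 16$; the helper names are hypothetical and the interpretation is ours, not a result stated in the paper.

```python
def contention_windows(w0, wx, m):
    """Contention windows W_i = min(2^i * W0, Wx) for backoff stages 0..m."""
    return [min((2 ** i) * w0, wx) for i in range(m + 1)]


def first_differing_stage(windows_a, windows_b):
    """Index of the first backoff stage whose windows differ, or None."""
    for stage, (a, b) in enumerate(zip(windows_a, windows_b)):
        if a != b:
            return stage
    return None


N_D = 12  # bursts per frame, as configured in Section 4
M = 16    # maximal number of retries, as configured in Section 4

for ratio in (1.5, 3):
    n_r = int(ratio * N_D)
    class1 = contention_windows(n_r, 2 ** 4 * n_r, M)
    class2 = contention_windows(n_r, 2 ** 8 * n_r, M)
    stage = first_differing_stage(class1, class2)
    print(f"N_r = {n_r}: windows first differ at backoff stage {stage}")
```

Under these settings the two classes use identical windows for the first several backoff stages, so the classes are only treated differently for requests that have already failed repeatedly, which is rare when few SSs contend.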
4.2. Bandwidth Allocation Priorities. Next, we will investigate the impact of bandwidth allocation priority on SD performances. We keep the configurations of channel access parameters the same as those in Section 4.1, except for changing the equal bandwidth allocation priority to absolute priority. Only simulation results are presented for the investigation of bandwidth allocation priority performance.

Figure 5: Throughput of single SS versus the number of SSs with equal bandwidth allocation priority and differentiated channel access parameters: $W_{0,1} = N_r$, $W_{0,2} = 3 N_r$, $W_{x,1} = 2^4 W_{0,1}$, and $W_{x,2} = 2^8 W_{0,2}$. (Horizontal axis: number of subscriber stations; curves: analysis and simulation for classes 1 and 2 with $N_r/N_d = 1.5$ and $N_r/N_d = 3$.)

Figure 7: Delay of single SS versus the number of SSs with equal bandwidth allocation priority and differentiated channel access parameters: $W_{0,1} = N_r$, $W_{0,2} = 3 N_r$, $W_{x,1} = 2^4 W_{0,1}$, and $W_{x,2} = 2^8 W_{0,2}$.

Figure 9: REQ transmission probability of SS versus the number of SSs with equal bandwidth allocation priority and differentiated channel access parameters: $W_{0,1} = N_r$, $W_{0,2} = 3 N_r$, $W_{x,1} = 2^4 W_{0,1}$, and $W_{x,2} = 2^8 W_{0,2}$.

The simulation results of throughput and channel access delay with absolute bandwidth allocation priority and differentiated maximal backoff window are shown by the symbols filled with black color in Figures 10 and 11; the initial backoff window is set to the same value for both classes of SSs. As observed previously, the maximal backoff window makes a minor contribution to SD. Therefore, it is easy to understand the impact of bandwidth allocation priority on SD from Figures 10 and 11. It is observed that in the case of $N_r = 1.5 N_d$, bandwidth allocation priority has almost no impact on SD. However, in the case of $N_r = 3 N_d$, bandwidth allocation priority is shown to be effective and can make a more significant contribution than the initial backoff window, especially when the number of SSs is large. The reason is that when the number of TOs is small, channel access is the bottleneck and the initial backoff window is more critical than bandwidth allocation priority. In contrast, when the number of TOs in a frame is large, the bottleneck is no longer in the channel access, and bandwidth allocation plays an important role. In this case, bandwidth allocation priority can be more effective. But the SD performance obtained with bandwidth allocation priority is more nonlinear than that obtained with channel access parameters as the number of SSs in the network increases.

The joint impact of channel access parameters and bandwidth allocation priority on SD is illustrated in Figures 10 and 11 by the symbols without filling. It is observed that in the case of a small number of TOs ($N_r = 1.5 N_d$), the contribution of the initial backoff window dominates in SD. However, the joint impact of initial backoff window and bandwidth allocation priority is larger than that which can be achieved when they are separately used for SD.
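The two grant policies compared in this section can be sketched as follows. This is a minimal Python illustration under the model assumptions of Section 3.1 (each uncollided REQ asks for exactly one burst, and at most $N_d$ bursts are available per frame); the function name, the argument names, and the use of SS identifiers are hypothetical and are not taken from the simulator described in the paper.

```python
import random


def allocate_grants(class1_reqs, class2_reqs, n_d, absolute_priority, rng=random):
    """Pick which uncollided REQs receive one of the n_d bursts in a frame.

    class1_reqs / class2_reqs: SS identifiers whose REQs were received.
    Returns (granted_class1, granted_class2) as two lists of identifiers.
    """
    if absolute_priority:
        # Class 1 is served first; ties within a class are broken at random.
        pool1 = list(class1_reqs)
        rng.shuffle(pool1)
        granted1 = pool1[:n_d]
        pool2 = list(class2_reqs)
        rng.shuffle(pool2)
        granted2 = pool2[: n_d - len(granted1)]
    else:
        # Equal priority: n_d REQs are drawn at random from both classes.
        pool = [(1, ss) for ss in class1_reqs] + [(2, ss) for ss in class2_reqs]
        rng.shuffle(pool)
        chosen = pool[:n_d]
        granted1 = [ss for cls, ss in chosen if cls == 1]
        granted2 = [ss for cls, ss in chosen if cls == 2]
    return granted1, granted2
```

With absolute priority, class 2 only receives whatever bursts class 1 leaves unused, which is consistent with the observation above that the policy matters most when many REQs get through (large $N_r$) and bandwidth, not channel access, is the bottleneck.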
Figure 8: REQ transmission probability of SS versus the number of SSs with equal bandwidth allocation priority and differentiated channel access parameters: $W_{0,1} = N_r$, $W_{0,2} = 2 N_r$, $W_{x,1} = 2^4 W_{0,1}$, and $W_{x,2} = 2^8 W_{0,2}$. (Horizontal axis: number of subscriber stations; curves: analysis and simulation for classes 1 and 2 with $N_r/N_d = 1.5$ and $N_r/N_d = 3$.)

Figure 10: Throughput of single SS versus the number of SSs with absolute bandwidth allocation priority and differentiated channel access parameters: $W_{0,1} = W_{0,2} = N_r$, $W_{x,1} = 2^4 W_{0,1}$, and $W_{x,2} = 2^8 W_{0,2}$. (Horizontal axis: number of subscriber stations; curves: classes 1 and 2 with $N_r/N_d = 1.5$ and $3$, and $W_{0,2}/N_r = 1$ and $3$.)

Figure 11: Channel access delay versus number of SSs with absolute bandwidth allocation priority and differentiated channel access parameters: $W_{0,1} = W_{0,2} = N_r$, $W_{x,1} = 2^4 W_{0,1}$, and $W_{x,2} = 2^8 W_{0,2}$. (Horizontal axis: number of subscriber stations; curves: classes 1 and 2 with $N_r/N_d = 1.5$ and $3$, and $W_{0,2}/N_r = 1$ and $3$.)

5. Conclusion

In this paper, we investigated the service differentiation (SD) for the bandwidth request scheme specified in the IEEE 802.16 standard. Several SD approaches including differentiating channel access parameters (mainly initial and maximal backoff windows) and bandwidth allocation priority are studied in detail. An analytical model is proposed to understand the impact of the SD approaches, which can be used to adaptively configure and optimize the system performances. Simulation validates the analytical model. Through the numerical results, it was observed that using the maximal backoff window cannot effectively differentiate service. Instead, the initial backoff window-based SD approach is very effective, and it is slightly worse than the priority-based bandwidth allocation scheme in SD when the number of REQ transmission opportunities (TOs) in a frame is large. When the number of TOs is small, the initial backoff window-based SD approach is much more effective than the bandwidth allocation priority-based approach. The service can be better differentiated when the initial backoff window and bandwidth allocation priority are jointly used.

Acknowledgments

This work is supported by the National Grand Fundamental Research 973 Program of China under Grant no. 2007CB316506, 863 Program of China under Grant no. 2006AA01Z169, and by the European Union through the Welsh Assembly Government.

References

[1] “IEEE Standard for Local and metropolitan area networks—part 16: air interface for fixed broadband wireless access systems,” 2004.
[2] J. He, K. Yang, K. Guild, and H.-H. Chen, “Application of IEEE 802.16 mesh networks as the backhaul of multihop cellular networks,” IEEE Communications Magazine, vol. 45, no. 9, pp. 82–90, 2007.
[3] C. Cicconetti, A. Erta, L. Lenzini, and E. Mingozzi, “Performance evaluation of the IEEE 802.16 MAC for QoS support,” IEEE Transactions on Mobile Computing, vol. 6, no. 1, pp. 26–38, 2007.
[4] J. He, K. Yang, and K. Guild, “A dynamic bandwidth reservation scheme for hybrid IEEE 802.16 wireless networks,” in Proceedings of IEEE International Conference on Communications (ICC ’08), pp. 2571–2575, Beijing, China, May 2008.
[5] T.-C. Chen, Y.-Y. Chen, and J.-C. Chen, “An efficient energy saving mechanism for IEEE 802.16e wireless MANs,” IEEE Transactions on Wireless Communications, vol. 7, no. 10, pp. 3708–3712, 2008.
[6] M. P. Anastasopoulos, P.-D. M. Arapoglou, R. Kannan, and P. G. Cottis, “Adaptive routing strategies in IEEE 802.16 multi-hop wireless backhaul networks based on evolutionary game theory,” IEEE Journal on Selected Areas in Communications, vol. 26, no. 7, pp. 1218–1225, 2008.
[7] H. Zhu, Y. Tang, and I. Chlamtac, “Unified collision-free coordinated distributed scheduling (CF-CDS) in IEEE 802.16 mesh networks,” IEEE Transactions on Wireless Communications, vol. 7, no. 10, pp. 3889–3903, 2008.
[8] J. Borin and N. Fonseca, “Scheduler for IEEE 802.16 networks,” IEEE Communications Letters, vol. 12, no. 4, pp. 274–276, 2008.
[9] Q. Ni, A. Vinel, Y. Xiao, A. Turlikov, and T. Jiang, “Wireless broadband access: WiMax and beyond—investigation of bandwidth request mechanisms under point-to-multipoint mode of WiMAX networks,” IEEE Communications Magazine, vol. 45, no. 5, pp. 132–138, 2007.
[10] D. Staehle and R. Pries, “Comparative study of the IEEE 802.16 random access mechanisms,” in Proceedings of the International Conference on Next Generation Mobile Applications, Services and Technologies (NGMAST ’07), pp. 334–339, Cardiff, UK, September 2007.
[11] A. Vinel, Y. Zhang, M. Lott, and A. Tiurlikov, “Performance

analysis of the random access in IEEE 802.16,” in Proceedings
of the 16th IEEE International Symposium on Personal, Indoor
and Mobile Radio Communications (PIMRC ’05), vol. 3, pp.
1596–1600, Berlin, Germany, September 2005.
[12] J.-B. Seo, H.-W. Lee, and O.-H. Cho, “Queueing behavior of
IEEE802.16 random access protocol for sporadic data trans-
missions,” in Proceedings of the 15th International Conference
on Computer Communications and Networks (ICCCN ’06), pp.
351–357, Arlington, Va, USA, October 2006.
[13] H.-W. Lee and J.-B. Seo, “Queueing performance of IEEE
802.16 random access protocol with bulk transmissions,” in
Proceedings of IEEE International Conference on Communica-
tions (ICC ’07), pp. 5963–5968, Glasgow, UK, June 2007.
[14] J. He, Z. Tang, and H. Chen, “Performance comparison of
OFDM bandwidth request schemes in fixed IEEE 802.16
networks,” IEEE Communications Letters, vol. 12, no. 4, pp.
283–285, 2008.
[15] S. Kwon and D. Cho, “CDMA code-based bandwidth request
mechanism in IEEE 802.16j mobile multi-hop relay (MMR)
systems,” in Proceedings of the 68th IEEE Vehicular Technology
Conference (VTC ’08), pp. 1–5, Calgary, Canada, September
2008.
[16] J. Yan and G.-S. Kuo, “Cross-layer design of optimal con-
tention period for IEEE 802.16 BWA systems,” in Proceedings of
IEEE International Conference on Communications (ICC ’06),
vol. 4, pp. 1807–1812, Istanbul, Turkey, July 2006.
[17] A. Doha, H. Hassanein, and G. Takahara, “Performance
evaluation of reservation medium access control in IEEE
802.16 networks,” in Proceedings of the 4th ACS/IEEE Inter-
national Conference on Computer Systems and Applications
(AICCSA ’06), pp. 369–374, Dubai, UAE, March 2006.
[18] J. He, K. Guild, K. Yang, and H.-H. Chen, “Modeling
contention based bandwidth request scheme for IEEE 802.16
networks,” IEEE Communications Letters, vol. 11, no. 8, pp.
698–700, 2007.
[19] K. Chen, Y. Zhou, J. He, and Z. Tang, “Service differentiation
for the bandwidth request scheme in fixed IEEE 802.16 net-
works,” in Proceedings of the 4th IEEE International Conference
on Circuits and Systems for Communications (ICCSC ’08), pp.
718–722, Shanghai, China, May 2008.
[20] H. Hu, Y. Zhang, and H.-H. Chen, “An effective QoS
differentiation scheme for wireless mesh networks,” IEEE
Network, vol. 22, no. 1, pp. 66–73, 2008.
[21] W. Feller, An Introduction to Probability Theory and Its
Applications, vol. 1, John Wiley & Sons, New York, NY, USA,
1957.
doi:10.1155/2009/940518
Research Article
Multiuser Radio Resource Allocation for Multiservice
Transmission in OFDMA-Based Cooperative Relay Networks
Xing Zhang, Shuping Chen, and Wenbo Wang

Wireless Signal Processing & Networks Lab (WSPN), Key Lab of Universal Wireless Communications,
Beijing University of Posts and Telecommunications, P.O. Box 93, Beijing 100876, China
Correspondence should be addressed to Xing Zhang, zhangx@bupt.edu.cn
Received 29 July 2008; Accepted 20 October 2008
The problem of multiservice transmission in OFDMA-based cooperative relay networks is studied comprehensively. We propose a framework that adaptively allocates power, subcarriers, and data rate in an OFDMA system to maximize spectral efficiency subject to the QoS requirements of multiple users with multiple services. We first concentrate on the single-user scenario, which considers multiservice transmission in a point-to-point cooperative relay network. Based on the analysis of the single-user scenario, we then extend multiservice transmission to the multiuser point-to-multipoint scenario. Building on this framework, we propose several suboptimal radio resource allocation algorithms for multiservice transmission in OFDMA-based cooperative relay networks that further reduce the computational complexity. Simulation results show that the proposed algorithms yield much higher spectral efficiency and much lower outage probability, making them flexible and efficient for OFDMA-based cooperative relay systems.
Copyright © 2009 Xing Zhang et al. This is an open access article distributed under the Creative Commons Attribution License,
1. Introduction

Orthogonal frequency division multiplexing (OFDM) has received considerable research attention in recent decades, and many systems, standards, and networks have adopted OFDM as the key technique. For multiuser applications, one way of applying OFDM is through OFDM-TDMA or OFDM-CDMA, where different users are allocated different time slots or different spreading codes. However, the fact that each user has to transmit its signal over the entire spectrum leads to an averaged-down effect in the presence of deep fading and narrowband interference. Alternatively, one can divide the total bandwidth into frequency blocks (one or a cluster of OFDM subcarriers) so that multiple access can be accommodated in an orthogonal frequency division multiple access (OFDMA) fashion; some of the literature calls this OFDM-FDMA. An OFDMA system is defined as one in which each user occupies a subset of frequency blocks and each block is assigned exclusively to one user at any time (e.g., one time slot); thus the radio resources are allocated in both the frequency (subcarrier) domain and the time domain, as shown in Figure 1 [1]. An advantage of an OFDMA system over OFDM-TDMA and OFDM-CDMA systems is the elimination of intracell interference (users with different subcarriers in the same cell will not interfere with each other).

OFDMA-related technologies are currently attracting intensive attention in wireless communications to meet the ever-increasing demands arising from the explosive growth of Internet, multimedia, and broadband services. OFDMA-based systems are able to deliver high data rates, operate in the hostile multipath radio environment, and allow efficient sharing of limited resources such as spectrum and transmit power among multiple users. OFDMA has been used in the mobility mode of IEEE 802.16 WiMAX [2], is currently a working specification in 3GPP long-term evolution (LTE) and LTE-advanced [3], and is also the candidate access method for the IEEE 802.22 “wireless regional area networks” (WRANs). Clearly, recent advances in wireless communication technology have led to significant innovations that enable OFDMA-based wireless access networks to provide better quality of service (QoS) than ever with convenient and inexpensive deployment and mobility.
[Figure 1 block diagram. Transmitter: data of users 1, ..., K (rates R_1, ..., R_K and BER_1, ..., BER_K) and channel state information (CSI) enter the OFDMA resource allocation block (subcarrier, AMC, and time slot allocation), which feeds subcarriers 1, ..., K, followed by IFFT and parallel-to-serial conversion and guard prefix insertion. Receiver of user k: guard prefix removal, FFT and serial-to-parallel conversion, resource selection (subcarrier and time slot selection) based on the resource assignment information, and de-AMC to recover the data for user k.]
Figure 1: Diagram of orthogonal frequency division multiple access (OFDMA) systems.
However, regardless of the technology used, OFDMA networks must not only provide reliable and high-quality broadband services, but also be implemented cost-effectively and operated efficiently. OFDMA presents many of the advantages and challenges of OFDM systems for single users, and the extension to multiple users introduces many further challenges and opportunities, both on the physical layer and at higher layers. These requirements present many challenges in the design of network architectures and protocols, which have motivated a significant amount of research in the area. Radio resource allocation (RRA) is essential for system performance enhancement, and for OFDMA systems it has brought many challenges [1, 4]. Many studies have investigated adaptive subcarrier, bit, and power allocation in OFDMA systems [5–8]. These papers show that when the channel state information (CSI) is available at the transmitter (e.g., water-filling power allocation can be utilized as the optimal power allocation for multicarrier systems), the system capacity can be greatly increased by exploiting the frequency domain diversity as well as multiuser diversity. However, this type of allocation does not consider the time-varying nature of the fading channel; if the temporal channel state information is also known beforehand (through channel prediction or feedback information from the receiver), it can be utilized to bring the time domain diversity as well as multiuser diversity to further improve the spectral efficiency.

Recently, cooperative communications have also received considerable research attention in academia, industry, and standards bodies [9–12]. Several cooperative strategies have been proposed, such as the amplify-and-forward (AF) protocol, the decode-and-forward (DF) protocol, and the coded cooperation (CC) protocol. AF is an attractive cooperative protocol in which the relay simply amplifies the signal received from the source and transmits the amplified signal to the destination; it has very low complexity and requires no decoding at relay nodes.

Several papers have addressed the problem of resource allocation in OFDMA-based cooperative relay systems [13–18]. In [13], the authors study the power allocation mechanism for capacity maximization for fixed power at the source and relay nodes, respectively. Reference [14] proposes a suboptimal power allocation for the AF protocol aiming at maximizing the system capacity using an equivalent channel gain model, [15] studies the power allocation for the DF protocol, and [16] studies the power allocation of a MIMO OFDM system for the AF protocol, where the purpose is to maximize system capacity. In [17], the author studies the problem of minimizing power under the rate constraint and obtains the adaptive bit/rate allocation scheme through the Lagrange theorem. Reference [18] studies the optimal source/relay/subcarrier allocation problem using a graph theoretical approach by transforming it into a linear optimal distribution problem in a directed graph, and then obtains the optimal relay and subcarrier allocation scheme.

In summary, the current literature is mainly focused on the problem of power allocation for system capacity maximization or data rate allocation for transmit power minimization. Current studies have not considered traffic transmission in such systems, especially multiuser multiservice transmission under QoS constraints. Meanwhile, the problem of subcarrier allocation has not been thoroughly studied, especially for multiservice transmission in cooperative communication systems. (The work in [18]
considers subcarrier allocation, but it does not consider power and rate allocation or different traffic transmissions under QoS constraints.) The future network will be a multiuser, multitraffic (multiservice) network, and different services will have completely different characteristics and QoS requirements; multiple traffic transmissions in future OFDMA-based cooperative relay networks therefore pose a great challenge to resource scheduling and allocation. This paper addresses the problem of multiple traffic/service transmission in OFDMA-based cooperative relay networks. We consider how to adaptively allocate power, subcarriers, and data rate to maximize system spectral efficiency while satisfying the QoS requirements of multiple users with multiple services. First, we concentrate on the single-user scenario, considering multiservice transmission in a point-to-point cooperative relay network; then, based on the analysis of the single-user case, we extend multiservice transmission to the multiuser point-to-multipoint case.

Specifically, the major contributions of this paper can be summarized as follows:

(i) a system model is proposed to study the radio resource allocation of multiservice transmission in OFDMA-based cooperative relay networks;
(ii) a framework is given to adaptively allocate power, subcarriers, and data rate to maximize system spectral efficiency while satisfying the QoS requirements of multiple users with multiple services;
(iii) several suboptimal resource allocation algorithms are proposed for multiservice transmission in OFDMA-based cooperative relay networks to reduce the computational complexity;
(iv) the resource scheduling process is decomposed into several steps, that is, the first step performs an initial search without any constraint and, in the following step, a complexity-reduced resource reallocation procedure is performed for each resource unit; through this multistep scheduling procedure the scheduling complexity is greatly reduced.

The rest of this paper is organized as follows: in Section 2, the system model for multiservice transmission in OFDMA-based cooperative relay networks is given and described in detail; in Section 3, we give the framework of multiservice transmission in a single-user point-to-point cooperative relay network; multiuser multiservice transmission in multiuser point-to-multipoint cooperative relay networks is given in Section 4, and several transmission algorithms are also presented; in Section 5, simulation results and analyses are given to verify the proposed algorithms. Finally, we conclude the paper and outline future work in Section 6.

2. System Model for Multiservice Transmission in OFDMA-Based Cooperative Relay Networks

First we give the framework of radio resource allocation for point-to-point OFDMA-based cooperative relay networks and propose the optimal power, subcarrier, and data rate allocation scheme to maximize the system spectral efficiency under the multiservice QoS constraints (data rate and bit-error rate (BER)) for multiple traffic transmission. Based on this scheme, we propose a suboptimal scheme to further reduce the computational complexity of the resource allocation scheme.

[Figure 2 diagram: source S, relay R, and destination D connected by OFDMA links, labeled $a_j$ (source to destination, $h_{s,d}^j$), $b_j$ (source to relay, $h_{s,r}^j$), and $c_j$ (relay to destination, $h_{r,d}^j$).]
Figure 2: OFDMA-based cooperative relay transmission model.

2.1. System Model of Single-User Point-to-Point OFDMA-Based Cooperative Relay Network. Figure 2 gives the system model for the OFDMA-based cooperative relay network. Each node has only one antenna. OFDMA is used for the channel access between the source and relay, relay and destination, and source and destination. The total data transmission period is divided into two parts: first, the source node transmits to the destination node, and both the relay node and the destination receive the data; then, the relay node forwards the data it received in the first period to the destination using the AF protocol. At the destination, maximal ratio combining (MRC) is used to recover the signal.

Suppose that the OFDMA subcarrier set is $\Sigma$ with cardinality $|\Sigma| = N$, that is, there are $N$ orthogonal subcarriers available in the system. Let $h_{s,d}^j$, $h_{s,r}^j$, and $h_{r,d}^j$ be the channel coefficients of the $j$th subcarrier between the source and destination, source and relay, and relay and destination, respectively. Denote $a_j = |h_{s,d}^j|^2$, $b_j = |h_{s,r}^j|^2$, and $c_j = |h_{r,d}^j|^2$. We suppose the channel experiences flat fading during each OFDM symbol period and that the channels of different symbols are independent. Let $P_s$ and $P_r$ be the transmission powers at the source and relay node, respectively, with $P_s + P_r \le P$; the power allocated to the $j$th subcarrier at the source and relay node is $P_j^s$ and $P_j^r$, respectively. Then, the total power constraint can be written as

$$\sum_j \left(P_j^s + P_j^r\right) = \sum_j P_j \le P, \tag{1}$$

in which $P_j$ is the sum of the power allocated to the $j$th subcarrier at the source and relay node. Let the power allocated to the $j$th subcarrier be $p_j^s = \kappa_j P_j$ at the source node and $p_j^r = (1 - \kappa_j) P_j$ at the relay node, where $\kappa_j \in (0, 1]$ is defined as the power allocation proportional factor.
For the amplify-and-forward (AF) protocol, the channel capacity between the source and destination node can be written as

$$C_j = \frac{1}{2}\log\left(1 + \frac{p_j^s a_j}{\Gamma\sigma^2} + \frac{p_j^s b_j \cdot p_j^r c_j}{\Gamma\sigma^2\left(\sigma^2 + p_j^s b_j + p_j^r c_j\right)}\right), \tag{2}$$

in which $\Gamma = -\ln(5\mu)/1.5$ is the SNR gap relating the performance of an M-ary QAM modulated signal to the Shannon capacity of the channel [19–21], $\mu$ is the BER requirement for the data transmission, $\sigma^2$ is the noise power, and the factor $1/2$ reflects that the data transmission is divided into two periods. In the high SNR regime, the above equation can be simplified as [22]

$$C_j \approx \frac{1}{2}\log\left(1 + \frac{p_j^s a_j}{\Gamma\sigma^2} + \frac{p_j^s b_j \cdot p_j^r c_j}{\Gamma\sigma^2\left(p_j^s b_j + p_j^r c_j\right)}\right). \tag{3}$$

Substituting $p_j^s = \kappa_j P_j$ and $p_j^r = (1 - \kappa_j) P_j$ into the above equation, we get

$$C_j = \frac{1}{2}\log\left(1 + \frac{\kappa_j P_j a_j}{\Gamma\sigma^2} + \frac{\kappa_j P_j b_j \cdot (1 - \kappa_j) P_j c_j}{\Gamma\sigma^2\left(\kappa_j P_j b_j + (1 - \kappa_j) P_j c_j\right)}\right) = \frac{1}{2}\log\left(1 + h_j \frac{P_j}{\Gamma\sigma^2}\right), \tag{4}$$

in which $h_j = \kappa_j a_j + \kappa_j b_j (1 - \kappa_j) c_j / \left(\kappa_j b_j + (1 - \kappa_j) c_j\right)$ can be regarded as the equivalent channel coefficient of the $j$th subcarrier between the source and destination node.

The power allocation proportional factor $\kappa_j$ is chosen to maximize the SNR of the $j$th subcarrier, that is,

$$\kappa_j = \arg\max_{\kappa_j}\left[\frac{\kappa_j P_j a_j}{\Gamma\sigma^2} + \frac{P_j}{\Gamma\sigma^2}\cdot\frac{\kappa_j b_j (1 - \kappa_j) c_j}{\sigma^2/P_j + \kappa_j b_j + (1 - \kappa_j) c_j}\right]. \tag{5}$$

This problem is similar to [23, Theorem 5]; using a similar deduction, we can get the optimal factor $\kappa_j$ as

$$\kappa_j = \begin{cases} 1, & D_j < 0, \\ \min\left\{1, \dfrac{D_j (C_j + 1) - E_j}{C_j - B_j}\right\}, & D_j > 0, \end{cases} \tag{6}$$

in which $B_j = b_j P_j/\Gamma\sigma^2$, $C_j = c_j P_j/\Gamma\sigma^2$, $A_j = a_j P_j/\Gamma\sigma^2$, $D_j = B_j C_j + A_j (C_j - B_j)$, and $E_j = B_j C_j (B_j + 1)(C_j + 1)$.

2.2. System Model of Multiuser Point-to-Multipoint OFDMA-Based Cooperative Relay Network. Figure 3 gives the point-to-multipoint OFDMA-based cooperative relay network model for the multiuser scenario. Here the source node communicates with $K$ destination nodes. Let $\Lambda$ be the destination node set. The relay node also utilizes the AF protocol to forward the data. In practical scenarios, the source node can be the base station (BS) or access point (AP), the relay node can be the relay station, and the destination nodes can be the access users. Here, we consider the downlink case.

[Figure 3 diagram: source S, relay R, and destinations D1, D2, ..., DK, with a separate subcarrier set for each of D1, D2, ..., DK.]
Figure 3: System model for OFDMA-based point-to-multipoint cooperative relay network.

Similar to the parameters in the single-user point-to-point scenario, we assume that the system subcarrier set is $\Sigma$ and its cardinality is $|\Sigma| = N$. Let the channel coefficient of the $j$th subcarrier from the source to the relay be $h_{s,r}^j$, from the source to the $k$th destination be $h_{s,k}^j$, and from the relay to the $k$th destination be $h_{r,k}^j$. Also, we assume that the channel remains constant during an OFDM symbol, and let $a_k^j = |h_{s,k}^j|^2$, $b_j = |h_{s,r}^j|^2$, and $c_k^j = |h_{r,k}^j|^2$. Using a deduction similar to (1)–(6), we obtain the equivalent channel gain of the $j$th subcarrier between the source node and each destination as follows:

$$h_k^j = \kappa_k^j a_k^j + \frac{\kappa_k^j b_j \left(1 - \kappa_k^j\right) c_k^j}{\kappa_k^j b_j + \left(1 - \kappa_k^j\right) c_k^j}, \tag{7}$$

in which the parameter $\kappa_k^j$ is

$$\kappa_k^j = \begin{cases} 1, & D_k^j < 0, \\ \min\left\{1, \dfrac{D_k^j \left(C_k^j + 1\right) - E_k^j}{C_k^j - B_k^j}\right\}, & D_k^j > 0, \end{cases} \tag{8}$$

in which we have $B_k^j = b_j P_k^j/\Gamma\sigma^2$, $C_k^j = c_k^j P_k^j/\Gamma\sigma^2$, $A_k^j = a_k^j P_k^j/\Gamma\sigma^2$, $D_k^j = B_k^j C_k^j + A_k^j \left(C_k^j - B_k^j\right)$, and $E_k^j = B_k^j C_k^j \left(B_k^j + 1\right)\left(C_k^j + 1\right)$.

The corresponding channel capacity of the $j$th subcarrier for the $k$th user is

$$C_k^j = \frac{1}{2}\log\left(1 + \frac{\kappa_k^j P_k^j a_k^j}{\Gamma\sigma^2} + \frac{\kappa_k^j P_k^j b_j \cdot \left(1 - \kappa_k^j\right) P_k^j c_k^j}{\Gamma\sigma^2\left(\kappa_k^j P_k^j b_j + \left(1 - \kappa_k^j\right) P_k^j c_k^j\right)}\right) = \frac{1}{2}\log\left(1 + h_k^j \frac{P_k^j}{\Gamma\sigma^2}\right), \tag{9}$$

in which $P_k^j = p_{j,k}^s + p_{j,k}^r$ is the sum of the power allocated to the $j$th subcarrier of the $k$th user at the source and relay node.

Next, we give the multiservice transmission for both the single-user and multiuser scenarios in OFDMA-based cooperative relay networks.
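As an illustration of (4) and (5), the following Python sketch evaluates the equivalent channel gain $h_j$ and the high-SNR capacity of a single subcarrier. It is a minimal sketch under stated assumptions, not the authors' implementation: all function names are hypothetical, the logarithm is taken as base 2 (suggested by the factor $4^R$ in (14)), the gains $b_j$ and $c_j$ are assumed strictly positive, and the power split $\kappa_j$ is found by a plain grid search over the objective of (5) instead of the closed form (6).

```python
import math


def snr_gap(ber):
    """SNR gap Gamma = -ln(5 * mu) / 1.5 for a target BER mu."""
    return -math.log(5.0 * ber) / 1.5


def equivalent_gain(kappa, a, b, c):
    """Equivalent channel gain h_j defined below eq. (4); assumes b, c > 0."""
    relay_term = kappa * b * (1.0 - kappa) * c / (kappa * b + (1.0 - kappa) * c)
    return kappa * a + relay_term


def af_snr(kappa, a, b, c, p_total, gamma, noise):
    """Objective of eq. (5): direct-link term plus relayed AF term."""
    direct = kappa * p_total * a / (gamma * noise)
    relayed = (p_total / (gamma * noise)) * kappa * b * (1.0 - kappa) * c / (
        noise / p_total + kappa * b + (1.0 - kappa) * c)
    return direct + relayed


def best_kappa(a, b, c, p_total, gamma, noise, steps=1000):
    """Grid search over kappa in (0, 1]; a numerical stand-in for eq. (6)."""
    grid = [i / steps for i in range(1, steps + 1)]
    return max(grid, key=lambda k: af_snr(k, a, b, c, p_total, gamma, noise))


def capacity(h, p_total, gamma, noise):
    """High-SNR capacity of eq. (4) in bit/s/Hz (base-2 logarithm assumed)."""
    return 0.5 * math.log2(1.0 + h * p_total / (gamma * noise))


if __name__ == "__main__":
    gamma = snr_gap(1e-3)                      # illustrative BER target
    a, b, c, p_j, noise = 0.2, 1.5, 1.0, 10.0, 1.0   # illustrative values only
    kappa = best_kappa(a, b, c, p_j, gamma, noise)
    h = equivalent_gain(kappa, a, b, c)
    print(kappa, h, capacity(h, p_j, gamma, noise))
```

With $\kappa_j = 1$ the relay term vanishes and $h_j$ reduces to the direct-link gain $a_j$, which matches the first branch of (6). The same functions apply per user in the multiuser case by passing $a_k^j$, $b_j$, $c_k^j$, and $P_k^j$.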
3. Multiservice Transmission for Single-User Point-to-Point OFDMA-Based Cooperative Relay Network

In this section, we give the power, subcarrier, and rate allocation scheme supporting multiservice transmission based on the system model in Section 2 (the equivalent channel $h_j$ of the $j$th subcarrier and the corresponding capacity equation). Here, we consider multiservice transmission between source and destination with two classes of services. One is real-time service (RT, denoted service A), for example, VoIP or streaming media. Generally, this kind of service has a data rate requirement, and an outage occurs when the offered data rate is lower than the required data rate. The other is nonreal-time service (NRT, denoted service B), for example, file downloading, e-mail, or HTTP. This kind of service has no strict data rate requirement, but in order to improve the system spectral efficiency and service quality, a data rate as high as possible is preferred. Meanwhile, these two kinds of service have different BER requirements; generally, RT service is less sensitive to BER than NRT service.

3.1. Optimal Resource Allocation. In this paper, we consider the problem of dynamically allocating radio resources such as power, subcarriers, and data rate so as to guarantee the QoS of multiple services (both RT and NRT services); specifically, the aim of the resource allocation is to maximize the data rate of the NRT service while guaranteeing the data rate and BER requirements of the RT service and the BER requirement of the NRT service. This optimal resource allocation problem can be described as follows:

$$\max \sum_{j\in\Phi_b} \frac{1}{2}\log\left(1 + \frac{p_j^s a_j}{\Gamma_b\sigma^2} + \frac{p_j^s b_j \cdot p_j^r c_j}{\Gamma_b\sigma^2\left(\sigma^2 + p_j^s b_j + p_j^r c_j\right)}\right), \tag{10}$$

$$\text{s.t.}\quad \sum_{j\in\Phi_a} \frac{1}{2}\log\left(1 + \frac{p_j^s a_j}{\Gamma_a\sigma^2} + \frac{p_j^s b_j \cdot p_j^r c_j}{\Gamma_a\sigma^2\left(\sigma^2 + p_j^s b_j + p_j^r c_j\right)}\right) = R, \tag{C10-1}$$

$$\sum_j \left(P_j^s + P_j^r\right) \le P,\quad P_j^s \ge 0,\quad P_j^r \ge 0\quad \forall j \in \Sigma, \tag{C10-2}$$

in which the objective function (10) maximizes the data rate of service B, (C10-1) is the data rate requirement of service A, (C10-2) is the power constraint, and $\Phi_a \subseteq \Sigma$ and $\Phi_b \subset \Sigma$ are the subcarrier sets allocated to services A and B, respectively, with $\Phi_a \cup \Phi_b \subseteq \Sigma$. Here $\Gamma_a = -\ln(5\mu_a)/1.5$ and $\Gamma_b = -\ln(5\mu_b)/1.5$, where $\mu_a$ and $\mu_b$ denote the BER requirements for services A and B, respectively.

The above problem is a nonlinear optimization problem, which is very hard to solve directly. To obtain a closed-form solution, we first substitute (4) into the above equations, which transforms the problem into the following equivalent optimization problem in the high SNR regime:

$$\max \sum_{j\in\Phi_b} \frac{1}{2}\log\left(1 + h_j \frac{P_j}{\Gamma_b\sigma^2}\right), \tag{11}$$

$$\text{s.t.}\quad \sum_{j\in\Phi_a} \frac{1}{2}\log\left(1 + h_j \frac{P_j}{\Gamma_a\sigma^2}\right) = R, \tag{C11-1}$$

$$\sum_j P_j \le P,\quad P_j \ge 0\quad \forall j \in \Sigma. \tag{C11-2}$$

In problem (11), subcarriers and power are coupled: the power needed decreases when more subcarriers are available, and more power is needed when fewer subcarriers are available. This problem is still a nonlinear optimization problem, but for fixed subcarrier sets $\Phi_a$ and $\Phi_b$ there is only power coupling between services A and B, so the problem can be greatly simplified. Since the priority of service A is higher than that of service B, we first allocate power to service A and make its power as low as possible while satisfying the QoS requirement of service A. This leaves more power to service B, which in turn improves the system efficiency. In this way, the above problem can be transformed into two suboptimal problems as follows.

Problem 1.

$$\min P_a = \sum_{j\in\Phi_a} P_j, \tag{12}$$

$$\text{s.t.}\quad \sum_{j\in\Phi_a} \frac{1}{2}\log\left(1 + h_j \frac{P_j}{\Gamma_a\sigma^2}\right) = R,\quad P_j \ge 0\quad \forall j \in \Phi_a. \tag{C12-1}$$

Problem 2.

$$\max \sum_{j\in\Phi_b} \frac{1}{2}\log\left(1 + h_j \frac{P_j}{\Gamma_b\sigma^2}\right), \tag{13}$$

$$\text{s.t.}\quad \sum_{j\in\Phi_b} P_j \le P - P_a,\quad P_j \ge 0\quad \forall j \in \Phi_b. \tag{C13-1}$$

Problems 1 and 2 can be solved independently using Lagrange multipliers and the KKT (Karush-Kuhn-Tucker) conditions. We get the following power and data rate allocation scheme.

For Problem 1, the power allocated to the $j$th subcarrier ($j \in \Phi_a$) is

$$P_j = \Gamma_a\sigma^2\left[\left(\frac{4^R}{\prod_{j\in\Phi_a} h_j}\right)^{1/|\Phi_a|} - \frac{1}{h_j}\right]. \tag{14}$$

The supported rate is

$$r_j = \frac{R}{|\Phi_a|} + \frac{1}{2|\Phi_a|}\log\left(\frac{h_j^{|\Phi_a|}}{\prod_{j\in\Phi_a} h_j}\right). \tag{15}$$

For Problem 2, the power allocated to the $j$th subcarrier ($j \in \Phi_b$) is

$$P_j = \frac{P - P_a}{|\Phi_b|} + \frac{\Gamma_b\sigma^2}{|\Phi_b|}\left(\sum_{j\in\Phi_b} \frac{1}{h_j} - \frac{|\Phi_b|}{h_j}\right), \tag{16}$$
and the supported rate is

$$r_j = \frac{1}{2}\log\left(\frac{h_j}{|\Phi_b|}\right) + \frac{1}{2}\log\left(\frac{P - P_a}{\Gamma_b\sigma^2} + \sum_{j\in\Phi_b} \frac{1}{h_j}\right), \tag{17}$$

in which $|\Phi|$ is the cardinality of set $\Phi$.

Thus, (14)–(17) give the optimal power and data rate allocation scheme supporting multiservice transmission when the subcarrier sets $\Phi_a$ and $\Phi_b$ are given. Both the power and the rate allocation obey the water-filling strategy. The optimal scheme can be obtained by searching all the possible subcarrier sets $\Phi_a$ and $\Phi_b$ and allocating power and data rate according to (14)–(17) for each given subcarrier set. In this way, we compute the data rate of service B, $R_b = \sum_{j\in\Phi_b} r_j$, for every set, compare the $R_b$ achieved by all possible subcarrier allocation schemes, and select the joint subcarrier, power, and rate allocation scheme that achieves the largest $R_b$.

3.2. Suboptimal Searching Algorithm. Since every subcarrier can be allocated to service A, to service B, or to neither (the subcarrier is not allocated to any service and remains unused), the computational complexity of searching for the optimal scheme is $O(3^N)$, which is infeasible in practice, especially for a large number of subcarriers $N$. We therefore need a suboptimal search algorithm that achieves near-optimal performance for a given computational complexity.

For the same subcarrier set, the optimal power allocation scheme given by (14) is superior to equal power allocation [6–8]; hence, if a certain subcarrier set $X$ can meet service A's data rate and BER requirements under equal power allocation, then $X$ will surely satisfy service A's requirements under optimal power allocation. Because of service A's characteristics, allocating too much radio resource to service A degrades the total system spectral efficiency; the best solution allocates as many resources as possible to service B under the condition that service A's QoS requirement is guaranteed. Considering the above requirements, we propose Algorithm 1.

In this search algorithm, $\Omega = \left\{ j \in \Sigma \mid \tfrac{1}{2}\log\left(1 + h_j (P/N)/(\Gamma_a\sigma^2)\right) \ge R/k \right\}$, and we use Flag to further reduce the complexity of the searching process. The algorithm needs to be executed only $N$ times in the worst case (linear); in many cases it is executed fewer than $N$ times before the condition is met, that is, Flag = True. In this way, we can greatly reduce the computational complexity. Compared with the optimal search algorithm, whose complexity $O(3^N)$ increases exponentially, the complexity of this algorithm is greatly reduced.

A more detailed flowchart of this algorithm is given in Appendix A.

4. Multiuser Multiservice Transmission for Multiuser Point-to-Multipoint OFDMA-Based Cooperative Relay Network

In Section 3, we analyzed the problem of multiservice transmission in a single-user point-to-point OFDMA-based cooperative relay network; based on that, in this section we concentrate on the multiuser scenario. First, we divide the destination nodes (users) into two groups. One group consists of the real-time (RT) service users, who have specific rate requirements. The target rate for this kind of user is generally a fixed value, for example, 64 kbps for CBR video service and 12.2 kbps for VoIP service. When the actual data rate is lower than the target rate, an outage event occurs. The other group consists of the best-effort service users, who have no rate requirement. In order to improve system throughput as much as possible, we want this kind of user to have as high a rate as possible. Since RT service has stricter requirements than nonreal-time (NRT) service, real-time users should have higher priority in resource allocation. Here, each user has one kind of service, but our study can also be extended to multiple services per user.

Current studies on multicarrier resource allocation can be summarized as two problems.

(1) Power minimization under rate requirements for real-time users; for example, [5] proposes power minimization while guaranteeing users' minimum QoS requirements, and others use genetic algorithms to analyze this problem.

(2) Rate maximization under a power constraint.

In this paper, we consider the problem of resource allocation for multiuser multiservice transmission in cooperative relay networks. Our study considers both of the above problems, but it is not merely a mixture of the two, because of the characteristics inherent in cooperative relay systems. After resources are allocated to real-time users, the factors affecting nonreal-time service users include not only the remaining power, but also the remaining subcarriers and their channel gains for nonreal-time users, which makes the problem very complex.

Similar to point-to-point transmission, our goal here is to improve the system throughput as much as possible while guaranteeing the QoS requirements of real-time service (data rate and BER). If possible, admission control should guarantee that the system resources meet the needs of all real-time service users; if not, some users will suffer an outage. In this study, we use the following formulation to describe the problem:

$$\max \sum_{k\in\Psi_b}\sum_{j\in\Phi_b} w_k^j\,\frac{1}{2}\log\left(1 + \frac{p_{j,k}^s a_k^j}{\Gamma_b\sigma^2} + \frac{p_{j,k}^s b_j \cdot p_{j,k}^r c_k^j}{\Gamma_b\sigma^2\left(\sigma^2 + p_{j,k}^s b_j + p_{j,k}^r c_k^j\right)}\right), \tag{18}$$

$$\text{s.t.}\quad \sum_{j\in\Phi_a} w_k^j\,\frac{1}{2}\log\left(1 + \frac{p_{j,k}^s a_k^j}{\Gamma_a\sigma^2} + \frac{p_{j,k}^s b_j \cdot p_{j,k}^r c_k^j}{\Gamma_a\sigma^2\left(\sigma^2 + p_{j,k}^s b_j + p_{j,k}^r c_k^j\right)}\right) = R_k\quad \forall k \in \Psi_a, \tag{C18-1}$$
STEP I: reorder the subcarriers in descending order of $h_j$, and let $k = N$, Flag = False;
STEP II: find the subcarrier set $\Omega$;
STEP III: decide whether $|\Omega|$ is greater than $k$; if yes, go to STEP IV, or else go to STEP V;
STEP IV: allocate the $k$ subcarriers with the smallest $h_j$ in the subcarrier set $\Omega$ to service A, and allocate the remaining subcarriers to service B; go to STEP VI;
STEP V: allocate the $k$ subcarriers with the largest $h_j$ in the subcarrier set $\Omega$ to service A, and allocate the remaining subcarriers to service B; then let Flag = True and go to STEP VI;
STEP VI: according to (14)–(17), compute the allocated power and rate of each subcarrier and compute $R_b = \sum_{j\in\Phi_b} r_j$; check the condition Flag = True: if it is true, go to STEP VII; else let $k = k - 1$; if $k = 0$, go to STEP VII, or else go to STEP II;
STEP VII: compare $R_b(k)$ of each loop, and select the resource allocation scheme that achieves the largest $R_b(k)$ as the optimal allocation scheme.
Algorithm 1: Suboptimal search algorithm for resource allocation in single-user OFDMA-based cooperative system.
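STEP VI of Algorithm 1 relies on the closed forms (14)–(17). The following Python sketch evaluates them for one fixed pair of subcarrier sets $\Phi_a$ and $\Phi_b$. It is an illustrative sketch under assumptions rather than the authors' code: the function names are hypothetical, the logarithm is taken as base 2 (consistent with the factor $4^R$ in (14)), and the inputs are the equivalent gains $h_j$ of the subcarriers in each set. The closed forms can return a negative power on a very weak subcarrier; the sketch does not clip or re-solve in that case, so a caller would have to treat such a split as infeasible, which is a handling choice not described in the text.

```python
import math


def allocate_service_a(h_a, rate_req, gamma_a, noise):
    """Minimum-power allocation for service A on Phi_a: eqs. (14) and (15)."""
    n = len(h_a)
    log_prod = sum(math.log2(h) for h in h_a)            # log2 of prod h_j
    level = 2.0 ** ((2.0 * rate_req - log_prod) / n)      # (4^R / prod h_j)^(1/n)
    powers = [gamma_a * noise * (level - 1.0 / h) for h in h_a]       # eq. (14)
    rates = [rate_req / n + (n * math.log2(h) - log_prod) / (2.0 * n)
             for h in h_a]                                            # eq. (15)
    return powers, rates


def allocate_service_b(h_b, budget, gamma_b, noise):
    """Water-filling of the remaining power P - Pa on Phi_b: eqs. (16), (17)."""
    n = len(h_b)
    inv_sum = sum(1.0 / h for h in h_b)
    powers = [budget / n + gamma_b * noise / n * (inv_sum - n / h)
              for h in h_b]                                           # eq. (16)
    common = 0.5 * math.log2(budget / (gamma_b * noise) + inv_sum)
    rates = [0.5 * math.log2(h / n) + common for h in h_b]            # eq. (17)
    return powers, rates


if __name__ == "__main__":
    p_total, noise = 6.0, 1.0                 # illustrative values only
    p_a, r_a = allocate_service_a([0.5, 1.0], 1.0, 1.0, noise)
    p_b, r_b = allocate_service_b([2.0, 4.0], p_total - sum(p_a), 1.0, noise)
    print(sum(r_a), sum(p_a), sum(r_b))
```

By construction the rates returned by `allocate_service_a` sum to the required rate $R$, and the powers returned by `allocate_service_b` sum to the remaining budget $P - P_a$; both properties give a quick sanity check on any candidate split produced by the search.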

$$\sum_{k=1}^{K} w_k^j \le 1\quad \forall j \in \Sigma, \tag{C18-2}$$

$$\sum_k \sum_j \left(p_{j,k}^s + p_{j,k}^r\right) \le P,\quad p_{j,k}^s \ge 0,\quad p_{j,k}^r \ge 0\quad \forall k \in \Lambda,\ j \in \Sigma, \tag{C18-3}$$

$$w_k^j \in \{0, 1\}\quad \forall k \in \Lambda,\ j \in \Sigma, \tag{C18-4}$$

in which $\Psi_a$ and $\Psi_b$ are the user sets for real-time and nonreal-time services, respectively, and $\Psi_a \cup \Psi_b = \Lambda$. $\Phi_a$ and $\Phi_b$ are the subcarrier sets allocated to real-time service users and nonreal-time service users, respectively, with $\Phi_a \cap \Phi_b = \emptyset$ and $\Phi_a \cup \Phi_b = \Sigma$. The other parameters are the same as those in the point-to-point case. Condition (C18-1) is the rate requirement for real-time service; (C18-2) and (C18-4) state that each subcarrier can be allocated to only one user; (C18-3) is the total power constraint together with the nonnegative power constraint. This is a nonlinear optimization problem whose optimal solution is very hard to obtain. We therefore use a decomposition method: substituting (9) into (18), in the high SNR region the above problem is equivalent to

$$\max \sum_{k\in\Psi_b}\sum_{j\in\Phi_b} w_k^j\,\frac{1}{2}\log\left(1 + h_k^j \frac{P_k^j}{\Gamma\sigma^2}\right), \tag{19}$$

$$\text{s.t.}\quad \sum_{j\in\Phi_a} w_k^j\,\frac{1}{2}\log\left(1 + h_k^j \frac{P_k^j}{\Gamma\sigma^2}\right) = R_k\quad \forall k \in \Psi_a, \tag{C19-1}$$

$$\sum_{k=1}^{K} w_k^j \le 1\quad \forall j \in \Sigma, \tag{C19-2}$$

$$\sum_k \sum_j P_k^j \le P,\quad P_k^j \ge 0\quad \forall k \in \Lambda,\ j \in \Sigma, \tag{C19-3}$$

$$w_k^j \in \{0, 1\}\quad \forall k \in \Lambda,\ j \in \Sigma. \tag{C19-4}$$

For fixed subcarrier sets $\Phi_a$ and $\Phi_b$, the above problem can be divided into two subproblems.

Problem 3.

$$\min P_a = \sum_{k\in\Psi_a}\sum_{j\in\Phi_a} P_k^j, \tag{20}$$

$$\text{s.t.}\quad \sum_{j\in\Phi_a} w_k^j\,\frac{1}{2}\log\left(1 + h_k^j \frac{P_k^j}{\Gamma\sigma^2}\right) = R_k\quad \forall k \in \Psi_a, \tag{C20-1}$$

$$\sum_{k=1}^{K} w_k^j \le 1\quad \forall j \in \Phi_a, \tag{C20-2}$$

$$P_k^j \ge 0\quad \forall k \in \Psi_a,\ j \in \Phi_a, \tag{C20-3}$$

$$w_k^j \in \{0, 1\}\quad \forall k \in \Psi_a,\ j \in \Phi_a. \tag{C20-4}$$

Problem 4.

$$\max \sum_{k\in\Psi_b}\sum_{j\in\Phi_b} w_k^j\,\frac{1}{2}\log\left(1 + h_k^j \frac{P_k^j}{\Gamma\sigma^2}\right), \tag{21}$$

$$\text{s.t.}\quad \sum_{k=1}^{K} w_k^j \le 1\quad \forall j \in \Phi_b, \tag{C21-1}$$

$$\sum_{k\in\Psi_b}\sum_{j\in\Phi_b} P_k^j \le P - P_a,\quad P_k^j \ge 0\quad \forall k \in \Psi_b,\ j \in \Phi_b, \tag{C21-2}$$

$$w_k^j \in \{0, 1\}\quad \forall k \in \Psi_b,\ j \in \Phi_b. \tag{C21-3}$$

Problem 3 can be regarded as power minimization under rate constraints, and its solution is similar to that in [5]. After deciding the subcarrier set allocated to the real-time users, we can use a similar method to allocate resources to users. Problem 4 is rate maximization under a power constraint, which can be solved using the method in [24]. In this way, we can compute the throughput of the nonreal-time users according to Problems 3 and 4 by searching all the possible subcarrier sets $\Phi_a$ and $\Phi_b$, then compare the total system sum rate of the different combinations, and select the power and rate allocation solution that achieves the largest system capacity while guaranteeing the QoS requirements of all real-time users.
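The exhaustive search over subcarrier sets described above can be sketched as a plain enumeration. The Python sketch below is only an illustration under assumptions: `exhaustive_search` and `search_space_size` are hypothetical names, label 0 marks an unused subcarrier and labels 1, ..., K mark users, and `score` is a caller-supplied hypothetical function that would solve Problems 3 and 4 for one assignment and return the resulting NRT throughput, or `None` when the real-time rate requirements cannot be met.

```python
from itertools import product


def search_space_size(n_subcarriers, n_users):
    """Number of assignments when each subcarrier goes to one of K users or to nobody."""
    return (n_users + 1) ** n_subcarriers


def exhaustive_search(n_subcarriers, n_users, score):
    """Try every assignment and keep the one with the largest score.

    assignment[j] == 0 means subcarrier j is unused; assignment[j] == k
    (1 <= k <= K) means it is given to user k. `score` returns a number,
    or None if the assignment is infeasible.
    """
    best_value, best_assignment = None, None
    for assignment in product(range(n_users + 1), repeat=n_subcarriers):
        value = score(assignment)
        if value is None:
            continue
        if best_value is None or value > best_value:
            best_value, best_assignment = value, assignment
    return best_value, best_assignment


if __name__ == "__main__":
    # Toy score for demonstration only: count subcarriers given to user 1.
    toy = lambda assignment: sum(1 for u in assignment if u == 1)
    print(exhaustive_search(3, 2, toy))
    print(search_space_size(64, 4))   # size of the space for 64 subcarriers, 4 users
```

The enumeration is only practical for very small $N$; the size of the search space grows exponentially with the number of subcarriers, which is the motivation for a suboptimal scheme.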
STEP I: allocate subcarriers to RT service users according to the maximal allocation criterion or the best-first method [1];
STEP II: according to the solution of Problem 1 in (11) and (12), allocate power and data rate for the subcarriers of each user;
STEP III: compare (8) and (9); if the former is larger than the latter, go to STEP IV, else go to STEP I;
STEP IV: according to $h_k^j$, allocate the $j$th subcarrier to the user $k^*$ who achieves the largest $h_k^j$, then perform power and rate allocation using water-filling over the subcarrier set $\mathcal{J} = \{h_{k^*}^j\}$.
Algorithm 2: Suboptimal search algorithm for resource allocation in multiuser OFDMA-based cooperative system.
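STEP IV of Algorithm 2 can be illustrated with a short Python sketch. It is a sketch under assumptions, not the authors' implementation: the function names are hypothetical, `gains[k][j]` holds the equivalent gain $h_k^j$ of NRT user $k$ on remaining subcarrier $j$, the logarithm is taken as base 2, and the water-filling step uses the closed forms (25) and (26) without clipping negative powers. The helper `usage_factors` evaluates the power and bandwidth usage factors of (22) and (23); how these factors enter STEP III is not spelled out in the algorithm box, so the helper only computes them.

```python
import math


def best_user_per_subcarrier(gains):
    """For each subcarrier j, pick the user k* with the largest h_k^j."""
    n_users, n_sub = len(gains), len(gains[0])
    return [max(range(n_users), key=lambda k: gains[k][j]) for j in range(n_sub)]


def water_fill(h, budget, gamma, noise):
    """Closed-form water-filling over the selected gains: eqs. (25) and (26)."""
    n = len(h)
    inv_sum = sum(1.0 / g for g in h)
    powers = [budget / n + gamma * noise / n * (inv_sum - n / g) for g in h]
    common = 0.5 * math.log2(budget / (gamma * noise) + inv_sum)
    rates = [0.5 * math.log2(g / n) + common for g in h]
    return powers, rates


def usage_factors(rt_powers, p_total, n_rt_subcarriers, n_subcarriers):
    """Power usage (22) and bandwidth usage (23) of the real-time users."""
    return sum(rt_powers) / p_total, n_rt_subcarriers / n_subcarriers


if __name__ == "__main__":
    gains = [[0.2, 0.9, 0.5], [0.6, 0.3, 0.5]]    # illustrative values only
    k_star = best_user_per_subcarrier(gains)
    h_star = [gains[k][j] for j, k in enumerate(k_star)]
    print(k_star, water_fill(h_star, 4.0, 1.0, 0.1))
```

Ties between users are broken in favor of the lowest user index, which is an arbitrary choice of this sketch.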
The optimal subcarrier search requires searching all the combinations of the subcarrier sets $\Phi_a$ and $\Phi_b$. Since each subcarrier can be allocated to any user or to nobody, the number of combinations is $(K + 1)^N$, and the computational complexity of the subcarrier search is $O[(K + 1)^N]$. When the number of users and subcarriers is large (in practice, $N$ is very large), this complexity is prohibitive. Thus, a suboptimal resource allocation scheme is practical and useful. Next, we give our proposed scheme.

From the above analysis we can see that the difficulty of the problem is that there is both power coupling and bandwidth (subcarrier) coupling in the multiservice multiuser environment. That is, if more bandwidth is allocated to the real-time service (since real-time service has strict QoS requirements, resource allocation for real-time service has higher priority), the power required to meet the real-time service's QoS requirement is reduced; conversely, with less bandwidth, more power is required to meet the real-time service users' QoS requirements. When solving this problem, we “remove” the bandwidth coupling and analyze the problem from the point of view of power coupling, so that the optimization problem can be solved in theory.

These two kinds of coupling make the allocation problem more complicated, but we have found that whenever “too much” bandwidth or “too much” power is allocated to the real-time service, the throughput of the nonreal-time (NRT) service users is not the highest, and thus the experience of the nonreal-time service is not optimal either (which will be further verified in the following simulations). Based on this, we define two resource usage factors for real-time service users.

Power Usage.

$$\lambda_p = \frac{\sum_{k\in\Psi_a}\sum_{j\in\Phi_a} P_k^j}{P} = \frac{P_a}{P}, \tag{22}$$

Bandwidth Usage.

$$\lambda_b = \frac{|\Phi_a|}{|\Sigma|} = \frac{N_a}{N}. \tag{23}$$

When $\lambda_p = \lambda_b$, we believe that the power and bandwidth allocated to real-time users are relatively optimal. Based on this and together with the scheme in the point-to-point resource allocation, we propose Algorithm 2.

[Figure 4 plot: throughput of service B (bps/Hz) versus average SNR (dB), from 10 to 26 dB, for optimal search, the proposed suboptimal algorithm, and fixed allocation.]
Figure 4: Throughput of service B (service A's required rate R = 0.5 bps/Hz).

The criterion in STEP IV can be obtained from the following optimization problem:

$$\max \sum_{j\in\mathcal{J}} \frac{1}{2}\log\left(1 + h_{k^*}^j \frac{P_{k^*}^j}{\Gamma\sigma^2}\right), \tag{24}$$

$$\sum_{j\in\mathcal{J}} P_{k^*}^j \le P - P_a,\quad P_{k^*}^j \ge 0\quad \forall j \in \mathcal{J}. \tag{C24-1}$$

Solving the above problem, we get

$$P_{k^*}^j = \frac{P - P_a}{|\mathcal{J}|} + \frac{\Gamma\sigma^2}{|\mathcal{J}|}\left(\sum_{j\in\mathcal{J}} \frac{1}{h_{k^*}^j} - \frac{|\mathcal{J}|}{h_{k^*}^j}\right). \tag{25}$$

The achievable data rate of subcarrier $j$ is

$$r_{k^*}^j = \frac{1}{2}\log\left(\frac{h_{k^*}^j}{|\mathcal{J}|}\right) + \frac{1}{2}\log\left(\frac{P - P_a}{\Gamma\sigma^2} + \sum_{j\in\mathcal{J}} \frac{1}{h_{k^*}^j}\right), \tag{26}$$

in which $P_{k^*}^j$ is the power allocated to the $j$th subcarrier of the $k^*$th user and $r_{k^*}^j$ is the corresponding data rate.

In STEP I, we can adopt different subcarrier allocation schemes (i.e., maximum or best first); in the following simulations, we allocate the subcarriers to the user who can
1.2 100
1.1
Throughput of service B (bps/Hz)
Outage probability
0.9
10−1
0.8
0.7
0.6
0.5
10−2
0.4
0.3
0.2
10 12 14 16 18 20 22 24 26 4 5 6 7
Average SNR (dB) Average SNR (dB)
Optimal search Proposed algorithm
Proposed suboptimal algorithm Fixed allocation A
Fixed allocation Fixed allocation B
Figure 5: Throughput of service B (service A’ required rate R = Figure 7: Outage probability of service A (RT) users.
1.5 bps/Hz).
1
10−1
Throughput of best effort traffic (bps/Hz)
0.8
Outage probability
10−2
0.6
0.4
10−3
0.2
0
10−4
10 12 14 16 18 20 22 24 26
10 12 14 16 18 20
Average SNR (dB)
Average SNR (dB)
Proposed algorithm
Optimal search
Fixed allocation A
Proposed suboptimal algorithm
Fixed allocation B
Fixed allocation
Figure 6: Outage probability of service A (the required rate is R = Figure 8: Throughput of nonreal-time (NRT) service users.
0.5 bps/Hz).
obtain the largest channel gain, if some subcarriers with the algorithm. Simulations assume that the BER requirements
largest channel gain have been already allocated to some for service A (RT service) and service B (NRT service)
users, then allocate the subcarriers with the second largest j j j
are 10−3 and 10−6 , respectively. hs,d , hs,r , and hr,d are
channel gain to this user. Rayleigh-distributed random variables (RV). There are N =
The proposed algorithm need only to be executed N/K 8 subcarriers. In addition to the optimal and suboptimal
times in the worst case, and the computational complexity algorithms, our simulations also give the performance of
will be O( N/K ). fixed resource allocation, that is, allocating a fixed number
Also in Appendix B, a more detailed flow chart is shown. of subcarriers to service A and allocating the remaining
subcarriers to service B, then performing power and rate
5. Simulation Results and Analysis allocation according to (11)∼(14). Service B’s throughput
and service A’s outage probability are used to reflect the
5.1. Resource Allocation in Single-User Point-to-Multipoint performance during the simulations, here outage probability
OFDMA-Based Cooperative Relay Network. We use Monte is defined as the probability that service A’s obtained rate
Carlo simulations to verify the performance of the proposed lower than the target rate R.
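The outage definition above can be made concrete with a small Monte Carlo sketch. This is a simplified illustration rather than the simulation used in the paper: it assumes a single direct link (no relay combining), equal power on all subcarriers, unit-mean exponentially distributed power gains (Rayleigh amplitude), base-2 logarithms, and a rate normalized by the number of subcarriers. The function name `outage_probability` and its parameters are hypothetical.

```python
import math
import random

def outage_probability(snr_db, target_rate, n_sub=8, trials=2000, seed=0):
    """Estimate Pr{achieved rate < target_rate} by Monte Carlo simulation."""
    rng = random.Random(seed)
    snr = 10 ** (snr_db / 10)  # average system SNR, P / sigma^2
    outages = 0
    for _ in range(trials):
        # Rayleigh amplitude implies an exponentially distributed power gain
        gains = [rng.expovariate(1.0) for _ in range(n_sub)]
        # equal power split P/N; factor 1/2 accounts for two-slot relaying
        rate = sum(0.5 * math.log2(1 + g * snr / n_sub) for g in gains) / n_sub
        if rate < target_rate:
            outages += 1
    return outages / trials
```

Under these assumptions the estimate falls from nearly one at low SNR to nearly zero at high SNR, which is the qualitative behaviour the outage curves are meant to show.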
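Figure 9: Flow chart of the suboptimal search algorithm for resource allocation in the single-user OFDMA-based cooperative system (see Appendix A). [Flow chart steps: sort $h^j$ in descending order, set $k = N$ and Flag = false; find the subcarrier set $\Omega = \{ j \in \Sigma : \frac{1}{2} \log(1 + h^j \frac{P/N}{\Gamma_a \sigma^2}) \ge R/k \}$; if $|\Omega| > k$, allocate the $k$ lower subcarriers to A and the other subcarriers to B, otherwise allocate the $k$ higher subcarriers to A and the other subcarriers to B; calculate $P^j$ and $r^j$ as in (11)–(14) and $R_b(k) = \sum_{j \in \Phi_b} r^j$; decrement $k$ until $k = 0$; finally find the maximum $R_b(k)$.]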
Figures 4 and 5 give the throughput of service B versus average SNR under the condition that service A's required rate is 0.5 bps/Hz and 1.5 bps/Hz, respectively. The average system SNR is defined as $P/\sigma^2$. From these figures, we can see that our proposed suboptimal algorithm achieves a remarkable performance improvement over the fixed allocation scheme and is very near the optimal resource allocation algorithm.

Figure 6 gives the outage probability of service A when the required rate R = 0.5 bps/Hz. From this figure, it is seen that the optimal algorithm can always guarantee service A's QoS requirements, and its outage probability is zero. Our proposed algorithm can achieve comparable performance with the optimal algorithm, while the computational complexity is greatly reduced.

Figure 10: Flow chart of the suboptimal search algorithm for resource allocation in the multiuser OFDMA-based cooperative system (see Appendix B).

5.2. Resource Allocation in Multiuser Point-to-Multipoint OFDMA-Based Cooperative Relay Network. For the multiuser point-to-multipoint scenario, there are altogether 5 users in the simulations, among whom two are real-time (RT) service users and the remaining three are nonreal-time (NRT) service users. There are N = 32 subcarriers. The required data rate for real-time service is R = 2 bps/Hz; the target BERs for real-time service and nonreal-time service are $10^{-3}$ and $10^{-6}$, respectively. The system average SNR is defined as $\gamma = 1/\sigma^2$. In the simulations, fixed resource allocation is also performed for comparison. The fixed resource allocation allocates a fixed number of subcarriers to real-time service users, assigning subcarriers to the user with the highest channel gains; the remaining subcarriers are allocated to nonreal-time (NRT) service users.

Figure 7 gives the outage probability for real-time (RT) service users, in which fixed allocations A and B denote that 10 and 5 subcarriers, respectively, are allocated to real-time service users. It is seen that allocating more subcarriers to real-time service users can significantly reduce the outage probability of the real-time service. This is because, if more subcarriers are allocated to real-time service users, the probability that the achievable data rate meets the target rate is higher, and therefore the outage probability is much lower. Our proposed algorithm considers both subcarrier allocation and the corresponding power allocation jointly, which effectively guarantees the QoS of real-time services.

Figure 8 gives the throughput for nonreal-time (NRT) service users, from which we can see that our proposed suboptimal algorithm can achieve much higher throughput. When the system SNR is low, the throughput achieved by the fixed allocation of 5 subcarriers to real-time service (with the remaining subcarriers allocated to nonreal-time service) is lower than that of the fixed allocation of 10 subcarriers; as the system SNR increases, the throughput of the fixed allocation of 5 subcarriers becomes higher than that of the fixed allocation of 10 subcarriers. (The conclusion is that when the system SNR is lower, a large number of subcarriers should be allocated to guarantee the rate requirement.) The reason is that when the SNR is lower, if a small number of subcarriers is allocated to real-time service users, more power is needed to guarantee the QoS requirements, and thus the remaining power allocated to nonreal-time service users is smaller. Although more subcarriers are allocated to the nonreal-time service, its throughput cannot be improved because of the power constraint. When the SNR is large, the nonreal-time service can get more subcarriers as well as more power, and thus higher throughput can be obtained.

For our proposed suboptimal resource allocation algorithm, since power and subcarrier allocation are balanced, the subcarriers and power allocated to nonreal-time service users are more reasonable, which allows the proposed algorithm to achieve much larger performance gains.
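The balance between power and subcarrier allocation mentioned above is measured by the usage factors $\lambda_p$ in (22) and $\lambda_b$ in (23). The following sketch computes both for a given real-time allocation. The function names, the example numbers, and the tolerance used to decide whether the two factors are "balanced" are illustrative assumptions and are not taken from the paper.

```python
def power_usage(rt_powers, total_power):
    """lambda_p in (22): share of total power P used by real-time users."""
    return sum(rt_powers) / total_power

def bandwidth_usage(n_rt_subcarriers, n_subcarriers):
    """lambda_b in (23): share of the N subcarriers given to real-time users."""
    return n_rt_subcarriers / n_subcarriers

def is_balanced(rt_powers, total_power, n_subcarriers, tol=0.05):
    """True when lambda_p and lambda_b differ by at most tol (assumed tolerance)."""
    lam_p = power_usage(rt_powers, total_power)
    lam_b = bandwidth_usage(len(rt_powers), n_subcarriers)
    return abs(lam_p - lam_b) <= tol
```

With N = 32 subcarriers, giving 8 subcarriers and a quarter of the power to real-time users yields $\lambda_p = \lambda_b = 0.25$, the balanced case; giving the same 8 subcarriers half of the power does not.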
6. Conclusions and Future Work

Radio resource allocation in OFDMA system has received considerable attention in recent years. This paper studies the problem of multiservice transmission in OFDMA-based cooperative relay networks. A framework is proposed to adaptively allocate power, subcarriers, and data rate to maximize system spectral efficiency under the constraints of satisfying multiuser multiservices' QoS requirements. First, we concentrate on the single-user scenario considering multiservice transmission in point-to-point cooperative relay network; then based on the analysis of single-user case, we extend the multiservice transmission to multiuser point-to-multipoint case. Based on the framework, we propose several suboptimal resource allocation algorithms for multiservice transmission in OFDMA-based cooperative relay networks to further reduce the computational complexity. Simulation results show the proposed algorithms yield much higher spectral efficiency and much lower outage probability, which are flexible and efficient for the downlink of OFDMA system. This paper will provide insight in the design of OFDMA-based cooperative relay network, which can support efficient multiservice transmission while satisfying the services' QoS requirements.

Appendices

A. Flow Chart of the Suboptimal Search Algorithm for Resource Allocation in Single-User OFDMA-Based Cooperative System

Flow chart of the suboptimal search algorithm for resource allocation in single-user OFDMA-based cooperative system is shown in Figure 9.

B. Flowchart of the Suboptimal Search Algorithm for Resource Allocation in Multiuser OFDMA-Based Cooperative System

Flowchart of the suboptimal search algorithm for resource allocation in multiuser OFDMA-based cooperative system is shown in Figure 10.

Acknowledgments

The authors would like to thank Dr. Yong Li, Mr. Lei Fu, and other researchers from WSPN Laboratory of BUPT for their fruitful discussions and valuable suggestions. This work is jointly supported by the National Basic Research Program of China (973 Program) (Grant no. 2007CB310602) and Specialized Research Fund for the Doctoral Program of Higher Education (Grant no. 200800131015).

References

[1] X. Zhang and W. Wang, “Multiuser frequency-time domain radio resource allocation in downlink OFDM systems: capacity analysis and scheduling methods,” Computers & Electrical Engineering, vol. 32, no. 1–3, pp. 118–134, 2006.
[2] IEEE 802.16 Wireless Metropolitan Area Network, October 1999.
[3] X. Zhang, R. Zhu, S. Liu, and W. Wang, “Modeling and performance evaluation of 3GPP long-term evolution (LTE) system, part I: modeling methodology and simulation platform & part II: numerical investigations and performance analysis,” in Proceedings of the Annual OPNET Technology Conference (OPNETWORK ’06), Washington, DC, USA, August-September 2006.
[4] E. Zhou and X. Zhang, Next Generation Mobile Communication System: OFDM and MIMO, Posts & Telecom Press, Beijing, China, 2008.
[5] C. Y. Wong, R. S. Cheng, K. B. Letaief, and R. D. Murch, “Multiuser OFDM with adaptive subcarrier, bit, and power allocation,” IEEE Journal on Selected Areas in Communications, vol. 17, no. 10, pp. 1747–1758, 1999.
[6] Y. J. Zhang and K. B. Letaief, “Multiuser adaptive subcarrier-and-bit allocation with adaptive cell selection for OFDM systems,” IEEE Transactions on Wireless Communications, vol. 3, no. 5, pp. 1566–1575, 2004.
[7] K. Kim, H. Kim, Y. Han, and S.-L. Kim, “Iterative and greedy resource allocation in an uplink OFDMA system,” in Proceedings of the 15th IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC ’04), vol. 4, pp. 2377–2381, Barcelona, Spain, September 2004.
[8] X. Zhang, E. Zhou, R. Zhu, S. Liu, and W. Wang, “Adaptive multiuser radio resource allocation for OFDMA systems,” in Proceedings of IEEE Global Telecommunications Conference (GLOBECOM ’05), vol. 6, pp. 3846–3850, St. Louis, Mo, USA, December 2005.
[9] J. N. Laneman and G. W. Wornell, “Distributed space-time-coded protocols for exploiting cooperative diversity in wireless networks,” IEEE Transactions on Information Theory, vol. 49, no. 10, pp. 2415–2425, 2003.
[10] J. N. Laneman, D. N. C. Tse, and G. W. Wornell, “Cooperative diversity in wireless networks: efficient protocols and outage behavior,” IEEE Transactions on Information Theory, vol. 50, no. 12, pp. 3062–3080, 2004.
[11] A. Sendonaris, E. Erkip, and B. Aazhang, “User cooperation diversity, part I: system description,” IEEE Transactions on Communications, vol. 51, no. 11, pp. 1927–1938, 2003.
[12] A. Sendonaris, E. Erkip, and B. Aazhang, “User cooperation diversity, part II: implementation aspects and performance analysis,” IEEE Transactions on Communications, vol. 51, no. 11, pp. 1939–1948, 2003.
[13] I. Hammerström and A. Wittneben, “On the optimal power allocation for nonregenerative OFDM relay links,” in Proceedings of IEEE International Conference on Communications (ICC ’06), vol. 10, pp. 4463–4468, Istanbul, Turkey, July 2006.
[14] G.-D. Yu, Z.-Y. Zhang, Y. Chen, S. Chen, and P.-L. Qiu, “Power allocation for non-regenerative OFDM relaying channels,” in Proceedings of the International Conference on Wireless Communications, Networking and Mobile Computing (WCNM ’05), vol. 1, pp. 185–188, Wuhan, China, September 2005.
[15] Y. Wang, X. Qu, T. Wu, and B. Liu, “Power allocation and subcarrier pairing algorithm for regenerative OFDM relay system,” in Proceedings of the 65th IEEE Vehicular Technology Conference (VTC ’07), pp. 2727–2731, Dublin, Ireland, April 2007.
[16] I. Hammerström and A. Wittneben, “Power allocation
schemes for amplify-and-forward MIMO-OFDM relay links,”
IEEE Transactions on Wireless Communications, vol. 6, no. 8,
pp. 2798–2802, 2007.
[17] B. Gui and L. J. Cimini Jr., “Bit loading algorithms for
cooperative OFDM systems,” in Proceedings of IEEE Mili-
tary Communications Conference (MILCOM ’07), pp. 1–7,
Orlando, Fla, USA, October 2007.
[18] G. Li and H. Liu, “Resource allocation for OFDMA relay
networks with fairness constraints,” IEEE Journal on Selected
Areas in Communications, vol. 24, no. 11, pp. 2061–2069, 2006.
[19] A. J. Goldsmith and S.-G. Chua, “Variable-rate variable-
power MQAM for fading channels,” IEEE Transactions on
[20] A. J. Goldsmith and S.-G. Chua, “Adaptive coded modulation
for fading channels,” IEEE Transactions on Communications,
vol. 46, no. 5, pp. 595–602, 1998.
[21] X. Qiu and K. Chawla, “On the performance of adaptive
modulation in cellular systems,” IEEE Transactions on Com-
munications, vol. 47, no. 6, pp. 884–895, 1999.
[22] M. O. Hasna and M.-S. Alouini, “End-to-end performance
of transmission systems with relays over Rayleigh-fading
channels,” IEEE Transactions on Wireless Communications, vol.
2, no. 6, pp. 1126–1131, 2003.
[23] Y. Zhao, R. Adve, and J. L. Teng, “Improving amplify-and-
forward relay networks: optimal power allocation versus
selection,” IEEE Transactions on Wireless Communications, vol.
6, no. 8, pp. 3114–3123, 2007.
[24] J. Jang and K. B. Lee, “Transmit power adaptation for
multiuser OFDM systems,” IEEE Journal on Selected Areas in
doi:10.1155/2009/147231
Research Article
Throughput Analysis of Band-AMC Scheme in Broadband
Wireless OFDMA System
Sung K. Kim1 and Chung G. Kang2

1 Electronics and Telecommunication Research Institute, Korea 138 Gajeongno, Yuseong-Gu, Daejeon 305-700, South Korea
2 School of Electrical Engineering, Korea University, 5-1, Anam-dong, Sungbuk-Ku, Seoul 136-701, South Korea
Correspondence should be addressed to Chung G. Kang, ccgkang@korea.ac.kr
Received 1 August 2008; Revised 26 December 2008; Accepted 23 February 2009
In broadband wireless Orthogonal Frequency Division Multiple Access (OFDMA) systems where a set of subcarriers are shared
among multiple users, the overall system throughput can be improved by a band-AMC mode that assigns each subband, a set
of contiguous subcarriers within a coherence bandwidth, to the individual user with the better channel quality. As long as channel
qualities for the subbands of all users are known a priori, multiuser and multiband gains can be simultaneously achieved with
opportunistic scheduling. This paper presents an analytical means of evaluating the maximum system throughput for a band-
adaptive modulation and coding (AMC) mode under the various system parameters. In particular, the practical features of
resource management for OFDMA system are carefully modeled within the current analytical framework. Our numerical results
demonstrate that band-AMC mode outperforms the diversity mode only by providing the channel qualities for a subset of good
subbands, confirming the multiuser and multiband diversity gain that can be achieved by the band-AMC mode.
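Copyright © 2009 S. K. Kim and C. G. Kang. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.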
1. Introduction

Demands for high bandwidth multimedia information in the mobile environment have spawned the development of various mobile broadband wireless access (BWA) systems for high-speed communication. Particular examples include the mobile WiMAX, which is based on the IEEE 802.16e Mobile Wireless MAN technologies, and 3GPP's new standards for 3G long-term evolution (LTE). The IEEE 802.16e standard aims to unify the underlying solutions [1], specifying two flavors of OFDM systems: one simply identified as Orthogonal Frequency Division Multiplexing (OFDM), the other Orthogonal Frequency Division Multiple Access (OFDMA). OFDMA is considered to be one of the most spectrally efficient multiple access alternatives for mobile BWA systems. It has the ability to dynamically assign a subset of the subcarriers to individual users, attuning the technology to the particular mobility requirement. This scheme fully takes advantage of multiuser diversity, in conjunction with the frequency diversity inherent in the OFDM scheme. In fact, the mobile BWA system must contend with fluctuations across the frequency band, in addition to time variations. In the multiuser scenarios upon a multicarrier system, a subcarrier in deep fading to one user may be of good quality to another user, which lends support to dynamic subcarrier allocation on improving system throughput [2–5]. The different signal quality (e.g., carrier-to-interference ratio or CIR) seen at each subcarrier governs the capacity of each subcarrier. Ideally, a different modulation and coding level should be selected for each subcarrier in order to maximize the capacity. This particular approach is referred to as an adaptive modulation and coding (AMC) scheme.

For the fast selective AMC scheme, Channel Quality Indication (CQI) must be reported immediately for all subcarriers within the entire bandwidth, which allows for selecting the appropriate modulation and coding level for each subcarrier without incurring a channel mismatching problem. It usually involves unrealistic feedback overhead, especially under the fast fading channel. Fortunately, however, recent broadband measurements indicate that per-subcarrier information is typically not necessary [2]. Namely, the feedback coefficient is sufficient for a group of several subcarriers in the fast selective AMC process, while the coherence bandwidth of the channel is larger than that of
subband. In general, further enhancement can be realized by providing CQI reports for a set of optimum subcarriers. This particular approach is specified as a band-AMC mode in the IEEE 802.16 Task Group e standard. Due to the frequency-selective and time-varying nature of the broadband channel, it is not straightforward to evaluate the average throughput of the OFDMA system without resorting to computer simulation. Furthermore, it becomes more involved as many parameters are configured to optimize system performance. For example, the number of bands selected for reporting CQI information is one important parameter that governs overall average system throughput.

Figure 1: Construction of diversity and band-AMC subchannels. [(a) Diversity subchannels; (b) Band-AMC subchannels (Band 1, ..., Band M). Axes: data subcarrier index and symbol index.]

The objective of this paper is to develop an analytical means of evaluating the maximum system throughput of band-AMC mode. In particular, practical features of resource management for the OFDMA system are carefully modeled within the proposed analytical framework. We consider order statistics to model the statistical nature of multiuser/multiband diversity in the OFDMA system. Order statistics have been a unique research area for statisticians for some time, with special application in statistical estimation. Recently, a more general case of order statistics has captured the attention of researchers in the area of signal processing and wireless communication systems [6, 7].
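As a small illustration of how order statistics capture multiuser diversity, the sketch below considers the largest of N independent CIR values. It assumes the CIRs are exponentially distributed with a common mean (the Rayleigh-fading model adopted in Section 2.2), in which case the mean of the maximum equals the mean CIR multiplied by the harmonic number $H_N = 1 + 1/2 + \cdots + 1/N$. The function names and parameters are hypothetical and are not part of the paper's analysis.

```python
import random

def expected_best_cir(mean_cir, n_users):
    """Mean of the largest of n_users iid exponential CIRs: mean_cir * H_n."""
    return mean_cir * sum(1.0 / k for k in range(1, n_users + 1))

def simulated_best_cir(mean_cir, n_users, trials=20000, seed=1):
    """Monte Carlo estimate of the same quantity (best user selected per trial)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # expovariate takes the rate, i.e. the inverse of the mean
        total += max(rng.expovariate(1.0 / mean_cir) for _ in range(n_users))
    return total / trials
```

The gain grows only logarithmically with the number of users: under these assumptions, going from 1 to 4 users roughly doubles the mean CIR of the selected user.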
The remainder of this paper is organized as follows. In Section 2, the operational concept of band-AMC mode is described, formulating the scheduling problem under consideration. In Section 3, the maximum average throughput of the band-AMC system is derived, using the order statistics. Then, Section 4 presents the numerical results to demonstrate the advantage of using the band-AMC mode with the sufficient number of CQI reports for the selection bands. Furthermore, the maximum throughput bounds that depend on the multi-user diversity and multiband effect are provided. Finally, concluding remarks are provided in Section 5.

2. System Description

2.1. Diversity Mode versus Band-AMC Mode. In the OFDMA system, all available subcarriers are shared by multiple users in each symbol, as opposed to the OFDM system where all subcarriers must be assigned to a single user. In general, the advantage of the OFDMA system is the multiuser diversity gain that can be obtained by selecting only good subcarriers for individual user, so as to fill the whole band with the multiple users. In other words, a “water-pouring” type of adaptive subcarrier and bit allocation algorithms can be evoked for maximizing system capacity [8]. However, this involves reporting the channel quality indicator (CQI) for each subcarrier of every user. In practice, it may cost a prohibitive amount of overhead, especially under the fast fading channel condition in the mobile communication. Instead, a subset of subcarriers can be randomly selected in each symbol, which can warrant a frequency diversity effect over a frequency-selective fading channel. Toward this end, a subchannel is defined as a basic unit of resource allocation, which consists of a finite number of subcarriers, for example, 48 subcarriers in the IEEE 802.16e standard.

Two different types of subchannel allocation modes are defined in the IEEE 802.16e OFDMA specifications: diversity and band-AMC modes. As shown in Figure 1, the difference between these two modes depends on how subcarriers are selected to form a subchannel. In the diversity mode, the subcarriers belonging to a subchannel are randomly distributed over the entire bandwidth, facilitating the frequency diversity effect over a frequency-selective fading channel in the broadband OFDMA system. In this case, the channel quality of each subchannel is determined by taking the average SNR over all corresponding subcarriers. In the band-AMC mode, on the other hand, a subchannel consists of a set of contiguous subcarriers and furthermore, a whole channel bandwidth can be divided into the multiple number of subbands (also, referred to as a band for short in the sequel). A finite number of contiguous subchannels form a subband, which spreads within the coherence bandwidth, thus requiring only a single value of CQI for each band to specify the channel condition. Therefore, the band-AMC mode does not incur too much feedback overhead cost, especially when a channel condition does not change too rapidly as in a fixed or low mobility environment [3]. It is opposed to the diversity mode which is more appropriate to mobile application under the fast fading channel condition.

2.2. Multiple-Access Interference and CIR Distribution. For users with similar propagation environments, the mean carrier-to-interference ratio can be represented as

\[ \frac{C}{I} = \frac{\beta P_T / (r_d/r_0)^n}{\sum_{i \ne d} \beta P_T / (r_i/r_0)^n} = \frac{r_d^{-n}}{\sum_{i \ne d} r_i^{-n}}, \tag{1} \]

where $r_*$ is the distance separating the transmitter from the receiver, the subscript $d$ denotes the desired user, $i \ne d$ corresponds to the interfering cochannel users, $\beta$ is the loss at distance $r_0$, and $n$ is a pathloss exponent that depends on the propagation environment.

If channel measurements are taken at a number of random locations, then the received amplitude typically follows a Rayleigh distribution. Assuming that instantaneous interference is constant, a carrier-to-interference ratio for each subband is shown to be exponentially distributed in a frequency nonselective channel. In particular, if $\gamma_0$ is the mean value of the carrier-to-interference ratio at a specified distance $r_d$ from the transmitter, then the distribution of the observed carrier-to-interference ratio $\gamma$ has the following probability density function [9]:

\[ f(\gamma) = \begin{cases} \dfrac{1}{\gamma_0} e^{-\gamma/\gamma_0}, & \gamma \ge 0, \\ 0, & \text{otherwise.} \end{cases} \tag{2} \]

2.3. Multi-user and Multi-band Scheduling Problem. Assume that there are N active users in a band-AMC system with M subbands. Figure 2 illustrates a system model of a downlink band-AMC scheduler with multiple bands that are shared among the different users. Based on CQI for an individual subband, a packet scheduler in the base station must determine which band to be assigned to each user along with the corresponding MCS level. A reasonable amount of resources must be reserved for CQI report, while ensuring that too much overhead does not overwhelm the overall system efficiency. Meanwhile, in the case that the number of band-AMC users is not sufficiently large in each cell, multi-user and multi-band diversity gain tends to be strictly limited, degrading the overall system throughput performance. Therefore, an optimum portion of band-AMC region must be configured in each frame. In sequel, however, we just focus on the scheduling problem, assuming that some portion of frame is reserved solely for the band-AMC mode users.

Figure 2: Band-AMC system model. [Diagram: subscriber stations 1, ..., N report band CQI (C/I for bands 1, ..., M) over the CQI channel to the base station, where a packet scheduler performs multi-user/multi-band allocation from the data queues.]

Note that each user experiences a varying channel quality for each band. Let $\gamma_{i,j}$ be the carrier-to-interference ratio (CIR) of the band $j$ for the user $i$. We assume that each user measures the CQI for all bands in terms of the CIR $\{\gamma_{i,j}\}$ and then selects a preferred subset of bands with the $\mu$-best CIR's ($\mu = 1, 2, \ldots, M$) for the CQI feedback. The partial CQI report reduces the feedback overhead cost while trading off the throughput performance. Some users may select the same band within the same time slot. Let $\Omega_j$ denote a set of users who have chosen the band $j$ in their CQI reports in the same frame. We assume that the packet scheduler is designed to select a single user for each band so that the overall bandwidth utilization can be maximized, that is,

\[ i_j^* = \arg\max_{i \in \Omega_j} \gamma_{i,j} \quad \text{for } j = 1, \ldots, M. \tag{3} \]

This particular scheduler, frequently referred to as a max C/I-scheduler, is one of the most typical opportunistic packet schedulers in the broadband wireless mobile systems.

3. System-Level Performance Analysis

In this section, the average throughput performance of the band-AMC system is evaluated. It is assumed that a full buffer traffic model is used, that is, infinite traffic waiting for each user. Depending on the channel quality, it is assumed that each user belongs to one of L groups. The channel quality of all users in the same group is identically distributed. Let $B_t$ and $B_c$ represent the total bandwidth and coherence bandwidth, respectively. Then, the total number of independent subbands can be given approximately by $M = B_t/B_c$. Note that the optimal number of subbands may be greater than or equal to $B_t/B_c$. For example, it has been demonstrated in [5] that the optimum contribution to performance improvement is found for $B_c \approx 4 \cdot B_s$, where $B_s$ denotes the bandwidth of subband. Nevertheless, M can be still fixed to the minimum number of independent subbands, that is, M is just large enough to warrant the independence of channel qualities between the adjacent
subbands. Determining a proper M is beyond the scope of this paper.

Now let a vector $\boldsymbol{\gamma}_i^{(l)} = \{\gamma_{i,1}^{(l)}, \gamma_{i,2}^{(l)}, \ldots, \gamma_{i,M'}^{(l)}\}$ represent the sampled values of a channel quality for user $i$ in group $l$. Note that $M$ is not always necessarily equal to $M'$. Therefore, we consider two different cases: $M < M'$ and $M = M'$. For the case of $M = M'$, there is no correlation between those samples, that is, the channel quality for each subband is independent of each other. Denoting $m_j^{(l)}$ as the expected value of CIR for band $j$ in group $l$, then the following probability density function (PDF) for CIR of the corresponding band under the condition that $M = M'$ can be obtained:

\[ f_{\gamma_{i,j}^{(l)}}(\gamma) = \frac{1}{m_j^{(l)}} e^{-\gamma/m_j^{(l)}}. \tag{4} \]

For the diversity channel, meanwhile, CIR for each user $i$ in group $l$ is given by taking average of CIRs for all subbands, that is, $\bar{\gamma}_i^{(l)} = (1/M') \sum_{j=1}^{M'} \gamma_{i,j}^{(l)}$. In the case that $\gamma_{i,j}^{(l)}$ are identically distributed over a whole bandwidth, $\{\bar{\gamma}_i^{(l)}\}$ turns out to be the normalized $M'$-Erlang random variables.

The design of band-AMC system depends on the bandwidth of each subband, channel characteristics, the number of users served by band-AMC mode, the feedback overhead to report the CQI of subbands, and so on. Consider the situation that the bandwidth of subband, $B_s$, chosen by band-AMC system is subject to the nonflat fading characteristics. This particular situation can be specified by $B_t/M \approx g \times B_c$ for $g > 1$, which corresponds to the case of $M < M'$. Then, the observed channel quality for each subband cannot be represented by (4). When the bandwidth $B_s$ is divided into several adjacent segments, each with the bandwidth of $B_c$, it can be now approximated as $\Gamma_{i,j}^{(l)} \approx (1/g) \sum_{k=1}^{g} \gamma_{i,(j-1) \cdot g + k}^{(l)}$, $j = 1, 2, \ldots, [M'/g]$. Then, (4) is replaced with the following PDF:

\[ f_{\Gamma_{i,j}^{(l)}}(\gamma) = g \cdot \left[ f_{\gamma_{i,(j-1) \cdot g+1}^{(l)}}(x) * f_{\gamma_{i,(j-1) \cdot g+2}^{(l)}}(x) * \cdots * f_{\gamma_{i,j \cdot g}^{(l)}}(x) \right]_{x = g \cdot \gamma}, \tag{5} \]

where $*$ denotes the convolution operation: $x(t) * h(t) = \int_0^{\infty} x(\tau) h(t - \tau)\, d\tau$.

3.1. CQI Report for Band-AMC Mode. Suppose that every band-AMC user feeds back $\mu$-best CQI reports to the base station in every scheduling interval. To represent the chance that each subband is selected for feedback, define a band selection vector for user $i$ as follows:

\[ \Lambda_i^{(l)}(\mu) = \left\{ \lambda_{i,1}^{(l)}(\mu), \lambda_{i,2}^{(l)}(\mu), \ldots, \lambda_{i,M}^{(l)}(\mu) \right\}, \tag{6} \]

where $\lambda_{i,j}^{(l)}(\mu)$ is the probability that user $i$ in group $l$ has a preference to the band $j$ within $\mu$ chances, that is, $\sum_{j=1}^{M} \lambda_{i,j}^{(l)}(\mu) = \mu$. In the case that samples in the subband are independent and identically distributed, it is obvious that $\Lambda_i^{(l)}(\mu) = \{\mu/M, \mu/M, \ldots, \mu/M\}$. However, care must be taken that the independence assumption is retained when the $\gamma$'s are no longer identically distributed, that is, for the independent but nonidentically distributed (inid) case.

Let $F_{i,(\mu:\mathcal{M}-\{j\})}^{(l)}(\gamma; F)$ denote the CDF of $\mu$th-order statistics, exclusive of band $j$ within the entire band pool, where $\mathcal{M}$ represents a band set of the system, that is, $\mathcal{M} = \{1, 2, \ldots, M\}$. Hence, the probability that the user $i$ in group $l$ selects the band $j$ is given by

\[ \lambda_{i,j}^{(l)}(\mu) = \Pr\left\{ \gamma_{i,j}^{(l)} > \gamma_{i,(M-\mu:\mathcal{M}-\{j\})}^{(l)} \right\} = \int_0^{\infty} F_{i,(M-\mu:\mathcal{M}-\{j\})}^{(l)}(\gamma; F)\, f_{\gamma_{i,j}^{(l)}}(\gamma)\, d\gamma, \quad \text{for } 1 \le \mu \le M - 1. \tag{7} \]

The CDF of the $k$th-order statistic $\gamma_{(k)}$ is generalized to

\[ F_{(k:M)}(\gamma; F) = \sum_{i=k}^{M} \sum_{S_i} \prod_{l=1}^{i} F_{j_l}(\gamma) \prod_{l=i+1}^{M} \left[ 1 - F_{j_l}(\gamma) \right], \tag{8} \]

where the summation $S_i$ extends over all permutations $(j_1, \ldots, j_n)$ of $1, \ldots, n$ for which $j_1 < \cdots < j_i$ and $j_{i+1} < \cdots < j_n$ [10]. For the distribution of order statistics in the inid case, however, the density of every possible order must be found out separately on a case-by-case basis, which makes (8) involve complicated and tedious calculation, especially as the number of bands increases. Fortunately, an alternative method for computing $F_{(k:M)}(\gamma; F)$ has been provided by Cao and West [7]. It is acceptable to have results and recurrence relations valid in the iid case, requiring only simple modification to hold quite generally. For convenience of notation, let $\bar{F}_i(\gamma)$ denote $1 - F_i(\gamma)$. Starting with

\[ F_{1:m}(\gamma) = 1 - \prod_{i=1}^{m} \left[ 1 - F_i(\gamma) \right], \tag{9} \]

they prove the following relation:

\[ F_{k:m}(\gamma) = F_{k-1:m}(\gamma) - H_k(\gamma) \left[ 1 - F_{1:m}(\gamma) \right], \tag{10} \]

where $H_1(\gamma) = 1$, and

\[ H_k(\gamma) = \frac{1}{k-1} \sum_{i=1}^{k-1} (-1)^{i+1} L_i H_{k-i} \quad \text{for } k = 2, \ldots, m \tag{11} \]

with

\[ L_k = \sum_{i=1}^{m} \left[ \frac{F_i(\gamma)}{\bar{F}_i(\gamma)} \right]^k. \tag{12} \]

Now from (7) and (9)–(12), the band selection vector can be directly determined. It is obvious that $\Lambda_i^{(l)}(\mu) = \{1, 1, \ldots, 1\}$ is obtained with $\mu = M$, which corresponds to the case of full CQI feedback.

3.2. Maximum System Throughput in Band-AMC Mode. Let $n_l$ denote the total number of users in group $l$. The
probability that band $j$ is simultaneously selected by $x_j^{(l)}$ users can be written as follows:

$$\Pr\big\{x_j^{(l)} = x\big\} = {}_{n_l}C_x \cdot \big(\lambda_{i,j}^{(l)}\big)^{x} \cdot \big(1 - \lambda_{i,j}^{(l)}\big)^{n_l - x}, \tag{13}$$

where ${}_{n_l}C_x = n_l!/\big(x!\,(n_l - x)!\big)$.

Similarly, a vector $\mathbf{x}_j = [x_j^{(1)}, x_j^{(2)}, \ldots, x_j^{(L)}]$ is defined to represent the distribution of order statistics in the corresponding band $j$. By means of the max C/I-scheduling scheme, the received signal quality $\gamma_j^{*}$ is then expressed as

$$\gamma_j^{*} = \max_{\gamma}\Big\{\gamma_{1,j}^{(1)}, \gamma_{2,j}^{(1)}, \ldots, \gamma_{x_j^{(1)},j}^{(1)},\; \gamma_{1,j}^{(2)}, \gamma_{2,j}^{(2)}, \ldots, \gamma_{x_j^{(2)},j}^{(2)},\; \ldots,\; \gamma_{1,j}^{(L)}, \gamma_{2,j}^{(L)}, \ldots, \gamma_{x_j^{(L)},j}^{(L)}\Big\}. \tag{14}$$

In general, the optimum signal quality in band $j$ is expected as the number of users selecting the corresponding band increases. By order statistics, the conditional CDF of the received CIR in band $j$ given $\mathbf{x}_j$ can be calculated as

$$\Pr\big\{\gamma_j^{*} < \gamma \mid \mathbf{x}_j\big\} = \Pr\big\{\gamma_{1,j}^{(1)} < \gamma\big\} \cdots \Pr\big\{\gamma_{x_j^{(1)},j}^{(1)} < \gamma\big\} \cdots \Pr\big\{\gamma_{1,j}^{(L)} < \gamma\big\} \cdots \Pr\big\{\gamma_{x_j^{(L)},j}^{(L)} < \gamma\big\} = \prod_{l=1}^{L} \Pr\big\{\gamma^{*}_{(l),j} < \gamma \mid x_j^{(l)}\big\}, \tag{15}$$

where $\gamma^{*}_{(l),j} = \max_{\gamma}\{\gamma_{1,j}^{(l)}, \gamma_{2,j}^{(l)}, \ldots, \gamma_{x_j^{(l)},j}^{(l)}\}$ and

$$\Pr\big\{\gamma^{*}_{(l),j} < \gamma \mid x_j^{(l)}\big\} = \big[F_j^{(l)}(\gamma)\big]^{x_j^{(l)}}. \tag{16}$$

Therefore, the CDF of the received CIR in band $j$ can be expressed as

$$S_j(\gamma) = \Pr\big\{\gamma_j^{*} < \gamma\big\} = \sum_{\forall \mathbf{x}_j} \Pr\big\{\mathbf{x}_j = [x_j^{(1)}, x_j^{(2)}, \ldots, x_j^{(L)}]\big\} \cdot \prod_{l=1}^{L} \Pr\big\{\gamma^{*}_{(l),j} < \gamma \mid x_j^{(l)}\big\}. \tag{17}$$

When the existing cellular systems are considered, in which multi-path fading is dominant, the rate function of the Shannon type with the log-based linear relationship between rate and CIR may not be valid. In practice, a link-level simulation is performed in order to determine the required CIR for a given data rate, so as to meet the target frame error rate (FER). Let $A$ denote a set of MCS levels with the corresponding data rates $\{R_m\}$, with the data rate for MCS level $m$ defined by $R_m$. To meet the given level of FER, a range of CIR is prescribed for each data rate $R_m$. More specifically, the CIR required for $R_m$ is prescribed as $\Gamma_m \le \gamma_j^{*} \le \Gamma_{m+1}$. For the given target FER, the average system throughput of band $j$ is defined as follows:

$$U_j = \sum_{m \in A} R_m \cdot \Pr\big\{\Gamma_m \le \gamma_j^{*} \le \Gamma_{m+1}\big\} = \sum_{m \in A} R_m \cdot \big[S_j(\Gamma_{m+1}) - S_j(\Gamma_m)\big]. \tag{18}$$

Considering overall bandwidth, therefore, the average throughput of the band-AMC system is provided by $U = \sum_{j=1}^{M} U_j$.

4. Numerical Results

Extensive numerical solutions are studied for evaluating a theoretical system throughput in this section. The multiuser diversity effect on system-level performance is investigated by varying the number of user groups, for example, L = 1, 3. For L = 3, users are divided into 3 groups with a mix of 0.2:0.3:0.5. Furthermore, the number of total active users ranges from 10 to 70, that is, N = 10, . . . , 70. We consider the OFDMA parameters for the WiBro system, a mobile version of WiMAX, derived from the IEEE 802.16d wireless MAN standard [1]. The corresponding parameters are listed in Table 1. Furthermore, an example of the transmission scheme for AMC under investigation is summarized in Table 2. In Table 2, data rate Rm is for downlink transmission when the ratio of DL:UL is given by 27:15. It also specifies the minimum required CIR to achieve a target FER of 1%.

Table 1: Basic OFDMA system parameters.
Parameters                                Value
Frequency                                 2.3 GHz
System bandwidth                          8.75 MHz
FFT size                                  1024
Number of data subcarriers                768
Number of symbols per frame               Downlink: 27 symbols; Uplink: 15 symbols
Channel coding                            Convolutional turbo code
Frame duration                            5 ms
Symbol duration                           115.2 μs
Number of subcarriers per subchannel      48
Number of subcarriers per CQI channel     48

It is important to note that system throughput depends not only on the mean channel quality but also on the user distribution. In the current numerical analysis, we consider the scenarios with the mean channel qualities given by m1 = (9.1 dB), m3 = (12 dB, 10 dB, 6 dB) for L = 1 and L = 3, respectively. To impartially compare the performance for various user distributions, the overall mean channel quality needs to be kept the same across the cases. Furthermore, it is assumed that M′ = 12 while M = 12, 6, and 3, respectively.

Figures 3 and 4 present a comparison of average throughput for the band-AMC and diversity schemes by varying the number of users with M = 12 for L = 1 and L = 3, respectively. From these results, a multiuser diversity gain is clearly observed, that is, the system throughput increases with the number of users in the system. Furthermore, it is shown that the band-AMC mode outperforms the diversity mode, when each user provides a sufficient number of band CQI reports. For the results in Figure 3, more than 20% of throughput is improved by the band-AMC mode with a full
Table 2: Transmission modes for AMC.
MCS level m Modulation Coding rate CIR for 1% FER (dB) Data rate∗ Rm (kbps)
1 QPSK 1/12 −2.2 614.4
2 QPSK 1/6 0.1 1,228.8
3 QPSK 1/3 2.9 2,457.6
4 QPSK 1/2 6.0 3,686.4
5 QPSK 2/3 10.2 4,915.2
6 16QAM 1/2 10.9 7,372.8
7 16QAM 2/3 15.2 9,830.4
8 64QAM 2/3 20.2 14,745.6
9 64QAM 5/6 28.6 18,432
∗Data Rate Rm is for DL transmission when the ratio of DL:UL is given by 27:15.
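The thresholds and rates of Table 2 plug directly into (13) and (16)–(18). The following is a minimal sketch for a single user group with exponentially distributed CIR, as in (4). The function names are illustrative and not taken from the paper. Two assumptions are made: the top MCS level has no upper CIR limit, and the rates are used exactly as listed in Table 2 with no per-band scaling.

```python
import math

# (CIR threshold in dB for 1% FER, data rate in kbps), copied from Table 2.
MCS_TABLE = [
    (-2.2, 614.4), (0.1, 1228.8), (2.9, 2457.6), (6.0, 3686.4), (10.2, 4915.2),
    (10.9, 7372.8), (15.2, 9830.4), (20.2, 14745.6), (28.6, 18432.0),
]


def db_to_linear(x_db):
    """Convert a dB value to a linear power ratio."""
    return 10.0 ** (x_db / 10.0)


def best_user_cdf(gamma, n_users, lam, mean_cir):
    """S_j(gamma) of (17) for one group: the binomial weights of (13)
    applied to F(gamma)^x from (16)."""
    f = 1.0 - math.exp(-gamma / mean_cir)  # exponential CDF matching the PDF in (4)
    return sum(
        math.comb(n_users, x) * lam**x * (1.0 - lam) ** (n_users - x) * f**x
        for x in range(n_users + 1)
    )


def band_throughput(n_users, lam, mean_cir_db, table=MCS_TABLE):
    """U_j of (18): sum over MCS levels of R_m * [S_j(Gamma_{m+1}) - S_j(Gamma_m)]."""
    mean_cir = db_to_linear(mean_cir_db)
    thresholds = [db_to_linear(t) for t, _ in table]
    total = 0.0
    for m, (_, rate) in enumerate(table):
        s_low = best_user_cdf(thresholds[m], n_users, lam, mean_cir)
        if m == len(table) - 1:
            s_high = 1.0  # assumption: no upper CIR limit for the top MCS level
        else:
            s_high = best_user_cdf(thresholds[m + 1], n_users, lam, mean_cir)
        total += rate * (s_high - s_low)
    return total
```

Here `lam` plays the role of the band selection probability, so `lam = 1.0` corresponds to full CQI feedback. The term with no selecting user contributes the same constant to every value of the CDF and therefore cancels in (18), which is how an unfilled band ends up with zero throughput in this sketch.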
Figure 3: Average system throughput performance: L = 1. (Axes: throughput in kbps versus the number of users.)
Figure 4: Average system throughput performance: L = 3. (Axes: throughput in kbps versus the number of users.)
(Legend entries across the two panels: Diversity; Band-AMC with μ = 3, 4, 5, 6, and 12.)
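The partial-feedback results rest on order-statistic CDFs for independent but non-identically distributed (inid) band CIRs. As a minimal sketch, the recurrence (9)–(12) can be evaluated at a single point as shown below and checked against a direct enumeration in the spirit of (8). The function names are illustrative and not from the paper, and the input CDF values are assumed to lie strictly between 0 and 1 so that the ratios in (12) stay finite.

```python
import math
from itertools import combinations


def order_stat_cdf(k, cdf_values):
    """CDF of the k-th smallest of independent, non-identical variables.

    cdf_values[i] holds F_i(gamma) at one fixed gamma, each strictly in (0, 1).
    """
    surv = [1.0 - f for f in cdf_values]                # 1 - F_i
    ratios = [f / s for f, s in zip(cdf_values, surv)]  # F_i / (1 - F_i)

    def power_sum(p):
        return sum(r**p for r in ratios)                # L_p of (12)

    f_min = 1.0 - math.prod(surv)                       # (9): CDF of the minimum
    h = [None, 1.0]                                     # h[k] = H_k, with H_1 = 1
    cdf = f_min
    for kk in range(2, k + 1):
        h_k = sum(
            (-1) ** (i + 1) * power_sum(i) * h[kk - i] for i in range(1, kk)
        ) / (kk - 1)                                    # (11)
        h.append(h_k)
        cdf -= h_k * (1.0 - f_min)                      # (10)
    return cdf


def order_stat_cdf_bruteforce(k, cdf_values):
    """Direct enumeration: probability that at least k variables are below gamma."""
    m = len(cdf_values)
    total = 0.0
    for i in range(k, m + 1):
        for below in combinations(range(m), i):
            p = 1.0
            for idx in range(m):
                p *= cdf_values[idx] if idx in below else 1.0 - cdf_values[idx]
            total += p
    return total
```

The enumeration grows combinatorially with the number of bands, whereas the recurrence needs only the power sums of (12), which is the practical reason for preferring it when M = 12.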
CQI report, that is, μ = 12, over the diversity mode. For a partial band CQI report of μ ≤ 3, however, the band-AMC mode performs worse than the diversity mode, suffering from a significant performance loss as compared to that with full band CQI feedback. For μ > 3, the band-AMC mode outperforms the diversity mode as long as there is a sufficient number of users, implying that its performance is mainly governed by the multiuser diversity gain.

As for L = 3, it is observed from Figure 4 that not much multiuser gain can be achieved with the diversity mode. We note that the band-AMC mode is almost always superior to the diversity mode, even with a very small number of band CQI reports, as long as there is a sufficient number of users in the system. It is also found that the maximum multiuser and multiband diversity gain has been achieved by the band-AMC mode, corresponding to an increase in the average throughput of 2.54 Mbps.

Let us now consider the CQI signaling overhead cost associated with the band-AMC mode. If the CQI report period is 30 milliseconds, the overhead for full band CQI feedback in WiBro can be approximated by 0.0694 · N · μ (%). For example, the feedback overhead in the uplink becomes 10.41% when N = 30 and μ = 5. In Figures 3 and 4, note that a diminishing gain is observed as the number of band CQI reports increases. Taking the overhead associated with the CQI report for each band into account, a reasonable number of CQI reports exists to warrant the maximum system throughput, for example, μ = 5 and 6, with a sufficient number of band-AMC users. Adopting the same test parameters as in Figure 4, the mean channel quality of each band for each group is given by Figure 5.

Figure 6 presents the probability that each subband is not chosen by any user, that is, the probability of unfilled band, computed for the cases of iid and inid, respectively. From
a practical viewpoint, depending on the number of band CQI reports for each user and the number of users, some subbands may not be selected, so that a part of the bandwidth is wasted. This particular point is clearly observed in Figure 6. When N = 10 and μ ≤ 6, more than 40% of the entire bandwidth is not used. Furthermore, it is found that there is not much difference between the iid case and the inid case.

Figure 7 presents the throughput performance as the number of bands varies, that is, M = 3, 6, and 12, when L = 1 and μ = M. As expected, the band-AMC system with 12 bands demonstrates the best throughput performance. Meanwhile, as the number of bands decreases, the throughput curves approach that of the diversity scheme. Since the multiband diversity effect mainly depends on the number of independent subbands, note that no further improvement will be found even if M is greater than M′.

Figure 5: Example of mean channel quality: L = 3. (Axes: mean channel quality in dB versus band index 1 to 12; curves: Group 1, Group 2, Group 3; annotated mean: 9.1 dB.)
Figure 6: The probability of unfilled band: L = 3. (Axes: probability of unfilled band versus the number of users; curves: No. CQI = 1, 2, 3, 6 for the inid case and No. CQI = 2, 3, 6 for the iid case.)
Figure 7: Average system throughput performance as varying the number of bands available (L = 1, M = μ). (Axes: throughput in kbps versus the number of users; curves: Diversity; Band-AMC with M = 12, 6, and 3.)

5. Conclusions

In this paper, the maximum possible throughput of the band-AMC mode in the OFDMA system has been numerically evaluated using the order statistics for various system-level parameters, including the number of band CQI reports, the total number of available bands, and mean channel qualities. A conventional system-level simulation involves too much complexity associated with various physical parameters and thus the proposed analytical approach will be useful for dimensioning the system and configuring the optimal set of parameters. Our numerical results confirm the multiuser and multiband diversity gain that can be achieved by the band-AMC mode. It has been shown that the band-AMC mode outperforms the diversity mode only by providing the channel qualities for a subset of good subbands. Depending on the average CINR for each subband and how fast the channel varies for individual subband, for example, measured in terms of standard deviation of CINR for each subband, the band-AMC and diversity modes can be adaptively combined, so as to maximize the overall system throughput. Toward that end, the current analytical framework can be a useful basis for operation of the band-AMC mode under the varying traffic and CQI report constraints.

References

[1] IEEE Standard 802.16-2004, “IEEE Standard for local and metropolitan area networks Part 16: Air Interface for Fixed Broadband Wireless Access Systems”.
[2] B. Classon, P. Sartori, V. Nangia, X. Nangia, and K. Baum, “Multi-dimensional adaptation and multi-user scheduling techniques for wireless OFDM systems,” in Proceedings of IEEE International Conference on Communications (ICC ’03), vol. 3, pp. 2251–2255, Anchorage, Alaska, USA, May 2003.
[3] P. Song and L. Cai, “Multi-user subcarrier allocation with min-
imum rate requests for downlink OFDM packet transmission,”
in Proceedings of the 59th IEEE Vehicular Technology Conference
(VTC ’04), vol. 4, pp. 1920–1924, Milan, Italy, May 2004.
[4] G. Song, Y. Li, L. J. Cimini Jr., and H. Zheng, “Joint
channel-aware and queue-aware data scheduling in multiple
shared wireless channels,” in Proceedings of IEEE Wireless
Communications and Networking Conference (WCNC ’04), vol.
3, pp. 1939–1944, Atlanta, Ga, USA, March 2004.
[5] S. Yoon, C. Suh, Y. Cho, and D. S. Park, “Orthogonal frequency
division multiple access with an aggregated sub-channel
structure and statistical channel quality measurement,” in
Proceedings of the 60th IEEE Vehicular Technology Conference
(VTC ’04), vol. 2, pp. 1023–1027, Los Angeles, Calif, USA,
September 2004.
[6] A. Harel and H. Cheng, “Applications of order statistics to
queueing and scheduling,” in Proceedings of the 34th IEEE
Conference on Decision and Control (CDC ’95), vol. 1, pp. 847–
852, New Orleans, La, USA, December 1995.
[7] G. Cao and M. West, “Computing distributions of order
statistics,” Communications in Statistics: Theory and Methods,
vol. 26, no. 3, pp. 755–764, 1997.
[8] K. M. Ok and C. G. Kang, “Complexity-reduced adaptive
subchannel, bit, and power allocation algorithm and its
throughput analysis for cellular OFDM system,” IEICE Trans-
actions on Communications, vol. E90, no. 2, pp. 269–276, 2007.
[9] W. C. Jakes, Microwave Mobile Communications, John Wiley &
Sons, New York, NY, USA, 1994.
[10] H. A. David and H. N. Nagaraja, Order Statistics, John Wiley
& Sons, New York, NY, USA, 2003.
doi:10.1155/2009/134579
Research Article
Contiguous Frequency-Time Resource Allocation and
Scheduling for Wireless OFDMA Systems with QoS Support
I. Gutiérrez,1 F. Bader,2 R. Aquilué,1 and J. L. Pijoan1

1 Enginyeria i Arquitectura La Salle, Ramon Llull University, Ps. Bonanova, 8. 08022 Barcelona, Spain
2 Access Technologies Department, Centre Tecnològic de Telecomunicació de Catalunya (CTTC), PMT, Avenue Canal Olímpic, s/n 08860 Castelldefels, Spain
Correspondence should be addressed to I. Gutiérrez, igutierrez@salle.url.edu
Received 22 July 2008; Accepted 24 February 2009
Recommended by Thomas Michael Bohnert
The orthogonal frequency division multiple access (OFDMA) scheme has been selected as a potential candidate for many emerging
broadband wireless access standards. In this paper, a new joint scheduling and resource allocation scheme is proposed for the
OFDMA systems using contiguous subcarrier permutation. The proposed resource allocation algorithm provides contiguous sets
of frequency-time resource units following a rectangular shape yielding a reduction on the required burst signalling. The joint
scheduling and resource allocation process is divided into two phases: the QoS requirements fulfilment and the input buffers
emptying status. For each phase, a specific prioritization function is defined in order to obtain a trade-off between the fairness
and the spectral efficiency maximization. The new prioritization scheme provides a reduction of 50% of the 99th percentile from
the delivered packets delay in case of non real-time services, and 30% of the packet loss rate in case of real-time services compared
to the proportional fair scheduling function. On the other hand, it is also demonstrated that using the rectangular data packing
algorithm, the number of required bursts per frame can be reduced up to a few tenths without compromising the performance.
Copyright © 2009 I. Gutiérrez et al. This is an open access article distributed under the Creative Commons Attribution License,
1. Introduction

The forthcoming 4th generation (4G) wireless networks are expected to support high data rates (i.e., spectral efficiencies from 10 to 20 bits/s/Hz are required) and high amounts of simultaneous users, especially in the downlink communication mode [1]. Recently, the major 3G standardization bodies, that is, the 3G Partnership Project (3GPP) and the 3GPP2, have defined the orthogonal frequency division multiple access (OFDMA) scheme as the dominant physical layer (PHY) communication technology. As the early stages of 4G wireless networking unfold, system developers are beginning to consider the OFDMA solution as the best suited for WiMAX (IEEE 802.16e/m) [2] systems and other multicarrier-based equipment (e.g., 3G-LTE, VSF-OFCDM from NTT-DoCoMo, or FLASH-OFDM from Qualcomm) [3, 4].

The OFDMA technique efficiently combines discrete multicarrier modulation with frequency division multiple access. The advantages of OFDMA include the flexibility in subcarrier allocation, the absence of multiuser interference due to subcarrier orthogonality, and the simplicity of the receiver among others. In current OFDMA systems like IEEE 802.16e, the subcarriers are grouped into larger units referred to as subchannels [2]. Then, these subchannels are grouped into bursts, where each burst is mapped to one user (in unicast) or a group of users (in broadcast). The burst allocation and the modulation and coding scheme (MCS) applied to each burst are adapted on a frame basis. This allows the base station (BS) to dynamically adjust the bandwidth usage per user according to the users’ requirements, that is, the quality of service and the users’ current channel state.

Scheduling policies based on weighted fair queuing techniques have been designed to balance the system throughput and fairness among users [5]. One of the most popular scheduling policies, currently used in the 3G networks, is the proportional fair scheduler (PFS) [6–8]. In each radio resource unit, the PFS assigns each user a priority that is proportional to the channel quality and inversely
[Figure 1 diagram labels: preamble, FCH, DL-MAP + UL-MAP, and bursts #1 to #5 laid out over N subchannels and NS OFDM symbols in the downlink subframe, followed by TTG, the uplink subframe, and RTG; each burst is located by its symbol offset ti and subchannel offset ci, with width wi and height hi.]

Table 1: Signalling data per burst used in the DL-MAP.
Field                              Size in bits
Number of CIDs, J                  8
CIDs (optional)                    J · 16
MCS                                4
OFDMA symbol offset, ti            8
Subchannel offset, ci              6
Number of OFDMA symbols, wi        7
Number of subchannels, hi          6
Boosting                           3
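Since every burst carries the fields of Table 1, the signalling cost of a frame grows with the number of bursts. The following is a minimal sketch of that bookkeeping. The names are illustrative, only the per-burst fields of Table 1 are counted, and any other DL-MAP overhead is deliberately left out.

```python
# Fixed per-burst field sizes in bits, copied from Table 1.
FIXED_FIELD_BITS = {
    "number_of_cids": 8,
    "mcs": 4,
    "ofdma_symbol_offset": 8,
    "subchannel_offset": 6,
    "number_of_ofdma_symbols": 7,
    "number_of_subchannels": 6,
    "boosting": 3,
}
CID_BITS = 16  # size of each optional CID entry (the J * 16 row of Table 1)


def burst_signalling_bits(num_cids=0):
    """Signalling bits for one burst that lists num_cids connection identifiers."""
    return sum(FIXED_FIELD_BITS.values()) + CID_BITS * num_cids


def frame_signalling_bits(cids_per_burst):
    """Total per-burst signalling for a frame; one list entry per burst."""
    return sum(burst_signalling_bits(j) for j in cids_per_burst)
```

For instance, `frame_signalling_bits([1, 1, 1])` counts three bursts with one CID each, so halving the number of bursts halves this part of the control overhead.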

proportional to the offered data rate. However, the main drawback of PFS comes from the fact that it considers full buffers and constant bit rate (CBR) streams. Clearly, multimedia networks have to deal with different traffic types, for example, variable bit rate (VBR) streams with very strict packet delay requirements. Recent trends in packet scheduling consider cross-layer implementations such as those proposed in [9–11]. Liu et al. proposed in [9] a scheduling algorithm where a priority is assigned to each user according to its instantaneous channel and service status. The channel state is obtained directly from the average received signal-to-noise ratio (SNR), and the service status is obtained from the delay of the head-of-line packet. The same principle is extended to the OFDMA system in [10], where the priorities are also assigned as a function of the subchannel index. Furthermore, Jeong et al. in [11] proposed to prioritize the packets according to the so-called “emergency factor” which is the ratio between the packet delay and the maximum delay constraint. Therefore users with a higher emergency factor are scheduled first.

Figure 1: IEEE 802.16e OFDMA frame in TDD mode and burst structure. (Legend: MRU, burst.)

However, none of those proposals has considered the effects of the resource allocation regarding the required signalling and its payload, nor the need for rectangular shaped bursts. Each burst is signalled at least by its position in the frame (starting subcarrier and symbol, ci and ti in Figure 1), the number of allocated MRUs in frequency and time (hi and wi), the MCS, and (optionally) the associated service flow or connection identifier (SFID/CID) [3]. Table 1 summarizes the fields that are transmitted for each burst. In this proposal, we define one burst as a set of continuous minimum resource units (MRUs) (logical or physical) in both time and frequency domains following a rectangular shape containing data from one service flow. Each service flow is a unidirectional stream of packets with a particular set of QoS parameters [2]. Ben-Shimol et al. proposed in [12] to allocate the resources following a “raster approach” to fit the resources into a rectangular shaped burst such that the resources are allocated first in frequency direction and later in time direction (see Figure 1). Another algorithm that minimizes the number of bursts given the amount of resources allocated to each user has been proposed by Erta et al. in [13]. However, the works in [12, 13] have been conceived considering that the channel within each subchannel is uncorrelated among subcarriers (thus a subcarrier permutation algorithm is assumed); thus the number of MRUs allocated to each user can be determined a priori according to the average SNR. Though these proposals may achieve a good tradeoff between complexity and spectral efficiency, the gain from frequency scheduling (and multiuser diversity) is minimized since the channel effects have been averaged through all the bandwidth.

In this paper, a new dynamic radio resource management scheme considering the rectangular burst shape required for the IEEE 802.16e frames is presented. The proposed algorithm, which can be used indistinctly in case of correlated or uncorrelated channels per subchannel, jointly performs packet scheduling, resource allocation as well as adaptive modulation and coding (AMC) when uniform power allocation is applied. The main contributions from this paper are (i) a new resource allocation algorithm which reduces the number of bursts per frame by allocating continuous MRUs, hence reducing the required signaling per frame, and (ii) a new prioritization function which allocates the resources in a fair fashion as the PFS. In order to assess the performance of the proposed scheduler (which is able to cope with maximum packet delays and VBR streams) different performance analyses are provided where the PFS is also studied and compared. The paper focuses on the downlink communication mode based on IEEE 802.16e system parameters. However, it can be also applied to any other OFDMA-based scheme. Furthermore, since the user’s data are in almost all the cases packed together in the time and/or the frequency domain, the mobile stations (MSs) power consumption is also reduced due to the reduced number of active symbols (shorter connection in time) or the reduced number of active subchannels (lower computational cost at the receiver) [14].

The rest of the paper is organized as follows. In Section 2 the system model considered is described. The proposed radio resource management scheme is then studied in depth in Section 3. Afterwards, the performance of the
proposal is shown in Section 4 obtained over extensive computer simulations. Finally, some conclusions are drawn in Section 5, where the benefits and the drawbacks of the overall approach are highlighted and summarized.

2. System Description

We consider in this proposal the downlink mode in the IEEE 802.16e PMP (point-to-multipoint) system with one single cell with a total of K MSs within its cell area with no interference sources. We consider only the time division duplexing (TDD) scheme; thus channel reciprocity can be assumed between uplink and downlink. The whole TDD frame is formed by a total of Ns symbols with Tframe duration. The number of downlink and uplink OFDM symbols usually follows the ratio 2 : 1 or 3 : 1; however, it can be adjusted by the BS according to users’ demand [2].

The whole transmission bandwidth BW is formed by a total of Nc subcarriers where only Nused are active. The active subcarriers include both the pilot subcarriers and the data subcarriers which will be mapped over different subchannels according to the specific subcarrier permutation scheme [2]. For the full usage of subcarriers (FUSC), pilot subcarriers are allocated first and the remaining subcarriers are grouped into subchannels where the data subcarriers are mapped. On the other hand, the partial usage of subcarriers (PUSC) and the adjacent subcarrier permutation (usually referred to as Band AMC) map all the pilots and data subcarriers to the subchannels, and therefore each subchannel contains its own set of pilot subcarriers. For the FUSC and PUSC, the subcarriers assigned to each subchannel are distant in frequency, whereas for the Band AMC the subcarriers from one subchannel are adjacent. Note that the FUSC and PUSC increase the frequency diversity and average the interference, whereas the Band AMC mapping mode is more convenient for loading and beamforming where multiuser diversity is increased [10].

As it is depicted in Figure 1, the MRUs allocated to any data stream within an OFDMA frame have a two-dimensional shape constructed by at least one subchannel and one OFDM symbol. In the IEEE 802.16e standard the specific size of the MRU varies according to the permutation scheme; concretely for the Band AMC it may take the shapes 9 × 6, 18 × 3, or 27 × 2 (subcarriers × time symbols, resp.), where 1/9 of the subcarriers are dedicated to pilots. We define an MRU as a resource unit formed by a set of Nsc × Nst symbols in frequency and time domains, respectively. Once the size of the MRUs is defined we can obtain the total number of MRUs per frame Q × T, where Q = Nc/Nsc is the number of subchannels and T = Ns/Nst defines the number of the time slots.

Several MRUs may be grouped into a data region or burst (see Figure 1), formed by successive MRUs in frequency and in time directions. Both the MRU and the data region always follow a rectangular shape structure. We consider the case that the transmitted data in each burst belongs to only one service flow (i.e., to a single MS), and the MCS applied to each burst might be adapted. Since the MS receiver needs to know how the downlink frame is organized in order to properly decode the data, the downlink control channel includes the number of bursts transmitted as well as the signalling for each burst. In the IEEE 802.16e each burst is signalled by the parameters indicated in Table 1. Multicast transmission is addressed by mapping different connection identifiers (CIDs) to each burst, where the BS is responsible for issuing the service flow identifiers (SFIDs) and mapping them to single CIDs. As it is shown in Figure 1, the signalling bits described in Table 1 are those used in the DL-MAP structure and transmitted at the beginning of each frame after the synchronization preamble and the frame control header (FCH) [2].

3. Radio Resource Management

One of the main goals of the radio resource management function is to maximize the spectral efficiency. This is performed at the BS by the radio resource agent and by the radio resource controller which can be implemented apart from the BS. The tasks performed include the channel estimation, the channel quality indicators management, and the control of the radio resources assigned to the BS. Since most of the tasks related to resource allocation and scheduling are not defined in the 802.16.a/e standards, each operator or system developer can tune and optimize its network according to collected performances and metrics [15].

In Figure 2, the protocol stack according to the IEEE 802.16e standard is depicted. As it was previously mentioned, only the medium access controller (MAC) layer and the physical (PHY) layer are defined within the standard [2]. This work will focus on the MAC layer blocks which perform the resource allocation and scheduling procedures and those implied blocks (i.e., the input queuing buffers), the packet data unit (PDU) management and fragmentation, and the burst mapping. Therefore, all blocks within the dotted line shaded shape are affected by the current proposal. On the other hand, the air link control (ALC) is in charge of collecting the MS’s channel state information which is later used by the scheduling and resource allocation processes as well as other procedures such as the power control or the ranging among others.

Following the block diagram in Figure 2, each data stream is classified according to its class of service and mapped to a single service flow (SF). Without loss of generality, in this work it is considered that each MS has only one active SF. The packets from each SF are then independently buffered and each incoming packet is time stamped. The packets are asynchronously received at the input buffers following a rate that depends on the specific SF properties. Five service classes are defined in the IEEE 802.16e [2] as follows:

(i) unsolicited grant service (UGS) class: designed to support real-time SFs that generate fixed-size data packets on a periodic basis (e.g., VoIP);
(ii) real-time polling service (rtPS) class: fitted to support real-time SFs that generate variable-size data packets on a periodic basis (e.g., video conference, MPEG, etc.);
(iii) extended real-time polling service (ertPS) class: similar to the UGS class, but some of the periodic packets might be missing due to silence periods (e.g., VoIP with silence suppression);
(iv) nonreal-time polling (nrtPS) class: in this case the SFs are variable packet size data packets, delay tolerant, where only minimum data rate is specified;
(v) best effort (BE) class: designed to support a data transmission when no minimum service level is required.

As it is depicted in Figure 2, the data from the input buffers is monitored by the scheduling and resource allocation block. During each frame all the input packets are evaluated for transmission, and according to the channel state from each user and the scheduling policy some of the packets are scheduled (and may be fragmented) for transmission in the subsequent frame. The scheduling process is strictly connected to the resource allocation process since the latter determines how many resources are assigned to each SF in every frame. Once the resources per SF have been resolved, the packet data unit (PDU) block prepares the data that will be mapped into each burst at the PHY layer. Thus, the PDU block and its counterpart at the MS side are responsible for the fragmentation and the reconstruction of the network layer packets. Finally, the burst mapping block breaks the packet data units in order to map each fragment into one physical burst. Each physical burst may apply a different MCS. The MCS for each burst is obtained according to the effective SNR (SNReff) of the channel over the MRUs assigned to the burst. For low mobility scenarios we can consider the channel for each subcarrier nearly constant during the whole frame; thus, the SNReff is an arbitrary function of the different postprocessing SNR per subcarrier (SNRi) and the MCS,

SNReff = f(SNR1, SNR2, . . . , SNRn, MCS), (1)

where SNReff would be the SNR that, in case of an additive white Gaussian noise (AWGN) channel, would give the same bit error rate (BER). Several metrics such as the exponential effective SNR mapping (EESM) [16], the mean instantaneous capacity (MIC), or others based on the mutual information per bit can be applied to obtain the SNReff [15, 17]. In our proposal, the harmonic mean of the channel values has been used as proposed in [18], which gives a tight lower bound of the BER and is independent of the MCS. Next subsections describe the scheduling and resource allocation algorithms presented in this paper.

Figure 2: Protocol stack at the BS and layers interaction. (Blocks shown: convergence sublayer with IP, ATM, header suppression, and packet classifier (QoS); MAC common part sublayer with queuing buffers, connection management, MBS, scheduling + RA, network entry, PDU, handoffs, burst mapping, power control, and air link management; security sublayer with encryption, authentication, and key management; PHY with randomizer, MSs channel estimation, channel coder, bit interleaver, pilot insertion, modulation, signalling, frame format, and IFFT + guard interval insertion. Legend: data flow, control.)

3.1. Resource Allocation and MCS Selection Problem Formulation. The main goal of the resource allocation and scheduling mechanisms is to maximize the system throughput (i.e., the spectral efficiency) while guaranteeing the QoS constraints for each SF. Actually, most of these constraints are defined by the average bit rate, the peak bit rate, the minimum bit rate, the maximum tolerated delay per packet (and jitter), and the average bit error rate (or packet error rate). Nevertheless, one key issue for any resource allocation scheme is to minimize the signalling that is required to inform the receivers how the frame is structured. Following the IEEE 802.16e transmission format, since each burst requires a specific signalling, it is suitable that all the scheduled packets belonging to the same SF are transmitted within the minimum number of bursts so that the signalling is minimized.

Thus the optimum shape and position of each burst (with its respective MCS) are explored while the QoS requirements are fulfilled for each user. To reduce the algorithm complexity, the optimization problem formulation considers uniform power allocation across subcarriers and that each SF is allocated a single burst per frame. According to these premises and considering that there are M active SFs, the resource allocation and the rate adaptation problem that
guarantees the different QoS requirements while maximizing the spectral efficiency can be mathematically expressed by

\[
\arg\max_{\xi} \left\{ \sum_{i=1}^{M} \eta_i \sum_{n=1}^{Q} \sum_{k=1}^{T} \xi_i(n,k) - M \cdot I_{CC} \right\}, \tag{2}
\]

s.t.

\[
b_i = T_{\mathrm{frame}} \sum_{p=1}^{P_i} \frac{L_{i,p}}{\tau_{\max,i} - \tau_{i,p}}, \tag{3}
\]

with

\[
\xi_i(n,k) \cdot \xi_j(n,k) = 0, \quad \text{for } i \neq j,\; n \in [0, Q-1],\; k \in [0, T-1], \tag{4}
\]

\[
\eta_i \big|_{\mathrm{BER} \le \mu} = \psi\left(\mathrm{SNR}_{\mathrm{eff},i}\right), \tag{5}
\]

\[
R_i = \eta_i \cdot \sum_{n=1}^{Q} \sum_{k=1}^{T} \xi_i(n,k) \ge b_i. \tag{6}
\]

In (2), the term I_CC denotes the number of signaling bits that must be transmitted within the control channel for each burst. The minimum required bits per frame b_i for the ith SF are obtained by (3), where L_i,p is the pth packet size in bits from the ith SF, τ_i,p is the packet delay (the time the packet has been queued in the buffer), τ_max,i is the maximum allowed delay per packet for the ith SF, and P_i is the total number of queued packets. ξ_i is a binary Q × T matrix which indicates which MRUs are allocated to the ith SF (i.e., ξ_i(n, k) = 1 means the (n, k) MRU has been assigned to the ith SF). To force each burst to follow a rectangular shape, the ones in ξ_i must be placed inside a rectangle. Considering that the ith burst starts at n_i and k_i, with h_i and w_i the number of MRUs in frequency and time, respectively, ξ_i is given by

\[
\xi_i(n,k) =
\begin{cases}
1, & \text{if } (n_i \le n \le n_i + h_i - 1) \text{ and } (k_i \le k \le k_i + w_i - 1), \\
0, & \text{otherwise.}
\end{cases} \tag{7}
\]

Equation (4) guarantees that the different bursts do not overlap (as seen in Figure 1). Finally, (5) and (6) determine the actual number of bits R_i transmitted within the ith burst. The term η_i represents the upper layer throughput (in bits) per MRU, and it is obtained as a function of the SNR_eff calculated for each burst, the available MCS, and the upper bound on the BER.

3.2. Proposed Joint Packet Scheduling and Resource Allocation. The solution of (2) to (6) might be obtained using nonlinear programming techniques. However, such techniques are not feasible for practical systems due to their prohibitive computational complexity. Furthermore, the problem as defined from (2) to (6) is very rigid since it forces the number of bursts to be equal to the number of service flows, and in consequence all service flows are scheduled during each frame. However, the optimum number of bursts, B, should be adapted to the different channel conditions (an MS may experience deep fading during certain frames). In addition, using a unique burst per user may decrease the spectral efficiency when the burst spans a large bandwidth, due to the effect of frequency selective fading.
To overcome these limitations, the authors propose a low complexity iterative algorithm (O(K N_sc N_st)) that adapts the number of bursts for user scheduling and resource allocation purposes. In order to maximize the spectral efficiency while meeting the QoS requirements of the service flows, the resource allocation and rate adaptation problem described in Section 3.1 is divided into two stages: the minimum requirements fulfilment and the spectral efficiency maximization. For each stage a different prioritization function is applied.

3.2.1. Service Flows Prioritization. In order to select which resources will be assigned to each SF (and thus to each MS), each ith service is assigned a priority over each nth subchannel (we assume that the channel is constant in time during the whole frame, that is, a low mobility environment). For the well-known PFS [7], the priority ϕ_i(n) assigned to each ith SF in each nth subchannel is given by

\[
\varphi_i(n)\big|_{\mathrm{PFS}} =
\begin{cases}
\dfrac{1}{\overline{Th}_i(t)} \cdot \dfrac{\eta_i(n)}{\eta_{\max}}, & \text{if } \sum_{p=1}^{P} L_{i,p} > 0, \\
0, & \text{otherwise,}
\end{cases} \tag{8}
\]

where η_i(n) is the spectral efficiency achieved by the highest MCS that can be applied on the nth subchannel giving an instantaneous BER lower than a certain upper bound BER_max. Thus, η_i(n) = 0 denotes a deep fade in the nth subchannel for the ith MS, and clearly in this case the priority becomes zero. η_max is the spectral efficiency achieved by the highest MCS. \overline{Th}_i(t) is the average throughput obtained by a moving average window with α as the latency scale, and Th_i(t) is the instantaneous throughput, thus

\[
\overline{Th}_i(t) = \frac{1}{\alpha}\, Th_i(t) + \left(1 - \frac{1}{\alpha}\right) \cdot \overline{Th}_i(t-1), \quad \text{with } Th_i(t) \ge 0. \tag{9}
\]

On the other hand, fairness might also be achieved by means of ad hoc user satisfaction indicators as proposed in [9–11]. However, most of these algorithms have been designed based on the average bit rate requirements, considering neither the state of the buffers nor the VBR nature of the traffic. To overcome these restrictions, the authors propose a time stamped packets scheduling (TSPS) function based on the input buffers status, the time stamps of each packet, and the channel metrics. For the TSPS, the users' priorities ϕ_i(n) are given by

\[
\varphi_i(n) =
\begin{cases}
\min\left(\dfrac{b_i}{b_{\max}}, 1\right) \cdot \dfrac{\eta_i(n)}{\eta_{\max}}, & \text{if } \forall p \rightarrow \tau_{i,p} < \tau_{\max,i} - \Delta\tau, \\
P_{\mathrm{urgency}} \cdot \dfrac{\eta_i(n)}{\eta_{\max}}, & \text{otherwise,}
\end{cases} \tag{10}
\]
with

\[
b_i =
\begin{cases}
T_{\mathrm{frame}} \displaystyle\sum_{p=1}^{P} \frac{L_{i,p}}{\tau_{\max,i} - \Delta\tau - \tau_{i,p}}, & \text{if } \forall p \rightarrow \tau_{i,p} < \tau_{\max,i} - \Delta\tau, \\
T_{\mathrm{frame}} \displaystyle\sum_{\substack{p=1 \\ p \neq p'}}^{P} \frac{L_{i,p}}{\tau_{\max,i} - \Delta\tau - \tau_{i,p}} + \sum_{p'} L_{i,p'}, & \text{otherwise,}
\end{cases} \tag{11}
\]

where min(x, y) takes the minimum value of x and y. The term b_i in (11) is the minimum number of bits that should be transmitted in the current frame in order to achieve a delay for each packet τ_i,p ≤ τ_max,i − Δτ, where Δτ is a guard time. b_max is a normalization factor equal to the maximum number of bits that could be transmitted within a frame using the highest MCS. Furthermore, in case any packet from the ith SF is close to exceeding its maximum delay, the term b_i/b_max is substituted by an urgency factor P_urgency, which boosts the data transfer from the ith SF [11]. Analogously, a packet that is close to reaching the maximum delay is entirely considered for transmission in the current frame by including the whole packet in b_i. The value of P_urgency might be different for each class of service (i.e., P_urgency = 100 for the UGS and rtPS types, P_urgency = 10 for the nrtPS, otherwise P_urgency = 1). Those classes of service whose packets may be dropped in case of excessive delay should be prioritized. Furthermore, notice that in case an SF has not been allocated the minimum resources b_i during the current allocation process, its priority in the next frame will be automatically increased. Finally, in case a buffer is empty the priority given to that SF is zero.
In order to check the performance of the TSPS proposal, a modified version of the PFS called buffer-based PFS (b²PFS) is also introduced where, instead of balancing the throughput of the different users, the scheduler levels the number of buffered bits from each user and in consequence VBR streams can be managed (improving the performance of the PFS). Thus for the b²PFS scheduler (8) is substituted by

\[
\varphi_i(n)\big|_{b^2\mathrm{PFS}} =
\begin{cases}
\dfrac{\overline{L}_i(t)}{\sum_i \overline{L}_i(t)} \cdot \dfrac{\eta_i(n)}{\eta_{\max}}, & \text{if } \sum_{p=1}^{P} L_{i,p} > 0, \\
0, & \text{otherwise,}
\end{cases} \tag{12}
\]

with

\[
\overline{L}_i(t) = \frac{1}{\alpha}\, L_i(t) + \left(1 - \frac{1}{\alpha}\right) \cdot \overline{L}_i(t-1), \quad \text{with } L_i(t) = \sum_{p} L_{i,p}. \tag{13}
\]

3.2.2. Iterative Resource Allocation and Scheduling Algorithm. Once the priority for each SF over each subchannel ϕ_i(n) and the minimum bits per frame b_i have been obtained, the MRUs are allocated iteratively in order to guarantee the QoS of all SFs (their minimum required bits per frame). The flowchart of the proposed algorithm is shown in Figure 4. Two cases are considered during each iteration: (i) a new burst might be created and (ii) an already existing burst might be increased by allocating another MRU (or a group of MRUs) to the burst. In the second case, when one MRU is allocated to an existing burst no extra signaling is required; however, the enlargement of the burst may lead to a reduction of the MCS level.
As can be observed in Figure 3, each burst may be increased in four directions, that is, top, bottom, left, and right with respect to its position in the frame. In order to determine in which direction the increase is more advantageous, an equivalent priority D_x (x ∈ {T, B, L, R}) is assigned to each direction (as indicated in Figure 3), where D_x is obtained by averaging the priority values ϕ_i(n) of the MRUs that are covered by the enlarged burst. If there is any occupied MRU in the x direction, or the burst is at the frame boundary, then D_x is forced to 0. An example of the increasing principle is shown in Figure 3, where three bursts have been created after 15 iterations and the number inside each MRU indicates the order in which the MRUs have been allocated. Note that as the burst increases more MRUs are allocated per iteration and, as a consequence, the resource allocation process is accelerated.

Figure 3: Burst increase options and example of bursts increments after 15 iterations. (Axes: N_S OFDM symbols, N_SC subchannels; increase directions labelled D_T,i, D_B,i, D_L,i, D_R,i.)

The algorithm, depicted in Figure 4, starts without any allocated burst (B = 0). For the first burst, the (n, k)th MRU is allocated according to the ith service flow and the nth subchannel combination that maximizes the value of ϕ_i(n). The position on the time axis of the MRU allocated to the first burst is forced to k = 0. Once the first burst is created, the iterative process starts checking the possible increments of the already existing bursts while at the same time it tries to generate new bursts. Iteratively, the option with the highest priority is allocated a new MRU (in case of creating a new burst) or a group of MRUs (in case of enlarging an existing burst). In case a new burst is created It has been stated before that Yi(n) is time independent (the channel is assumed constant for each subcarrier during the whole frame). As a result, in case a new burst is assigned to one subchannel, its position in the time axis is determined by that position which maximizes the distance
to other already allocated MRUs. This ensures that the newly created burst has a higher chance of being increased than if it were placed near the other already created bursts. Nevertheless, in order to achieve the lowest number of bursts, the equivalent priorities associated to each burst increment are multiplied by a P_burst factor (e.g., P_burst = 5) to favour the enlargement of the existing bursts over the generation of new ones.

Figure 4: Resource allocation and scheduling algorithm flowchart. (Blocks shown: Start: B = 0; reset ξ, θ; obtain b_i and ϕ_i(n) for minimum requirements allocation; k = 0; does the kth SF have any allocated burst?; can any of its bursts be increased?; burst increase: best burst, increase direction and equivalent priority, ϕ_eq,k × P_burst; new burst: best TTI burst placement, equivalent priority ϕ_eq,k; k < K?; next allocated MRU: arg max ϕ; is it a new burst? B = B + 1; update ξ, θ; estimate SNR_eff; obtain MCS; update R_i, L_i; requirements fulfilled (R_i > b_i)?; is there any unallocated MRU?; update b_i and ϕ_i(n) for spectral efficiency maximization.)

Table 2: Parameters of the simulated classes of service.

Class of service     Average bit rate [Kbps]   Peak bit rate [Kbps]   Max. delay [ms]   Packet rate [packets/s]
rtPS (videocall)     380                       2000                   50                10
nrtPS (streaming)    2000                      10000                  300               10
UGS                  0.015                     0.015                  75                10
WWW                  -                         2000                   ∞                 Variable
FTP                  -                         10000                  ∞                 Variable

The algorithm is then iterated until all the requirements are fulfilled or all the resources have been allocated. The number of bursts is not fixed and may change from frame to frame depending on the state of the buffers, the QoS requirements, and the channel state conditions. Moreover, since each SF may have more than one burst, another auxiliary matrix θ with size (Q × T) is defined. Each value of θ indicates to which burst the MRU is allocated. Both matrices ξ and θ are updated each time a new MRU is allocated.
Considering the MCS applied in each burst, we can obtain how many bits from each buffer are going to be transmitted and thus check whether the minimum requirements are met. If the minimum requirements are satisfied, that is, R_i ≥ b_i for i = 1, ..., K, and there is still any unassigned MRU, these unallocated resources should be used to flush the input buffers. Since the minimum requirements for the SFs have already been allocated, the spectral efficiency can be maximized by transmitting the data from those SFs associated to the best channel conditions. Considering that the status of the input buffers has been updated according to R_i, we can apply the same algorithm but with the following scheduling priority ϕ_i(n):

\[
\varphi_i(n) =
\begin{cases}
\dfrac{\eta_i(n)}{\eta_{\max}}, & \text{if } \forall L_i > 0, \\
0, & \text{otherwise.}
\end{cases} \tag{14}
\]
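The two prioritization stages can be illustrated with a short sketch. The code below is only one possible reading of (10), (11), and (14), not the authors' simulator: the `Packet` class, the function names, and the argument names are hypothetical, and packets that have reached the guard deadline are assumed to be counted whole in b_i, as the text describes.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    size_bits: int   # L_{i,p}
    delay_s: float   # tau_{i,p}: time already spent in the queue


def min_required_bits(packets, tau_max, guard, t_frame):
    """Sketch of (11): bits to send in this frame so that each packet
    can still meet the deadline tau_max - guard."""
    deadline = tau_max - guard
    bits = 0.0
    for p in packets:
        if p.delay_s < deadline:
            # Spread the packet over the time remaining before the deadline.
            bits += t_frame * p.size_bits / (deadline - p.delay_s)
        else:
            # Urgent packet: included entirely in b_i (assumption from the text).
            bits += p.size_bits
    return bits


def tsps_priority(packets, eta_n, eta_max, b_max, tau_max, guard,
                  t_frame, p_urgency):
    """Sketch of (10): first-stage TSPS priority on one subchannel."""
    if not packets:
        return 0.0  # an empty buffer gets zero priority
    deadline = tau_max - guard
    if all(p.delay_s < deadline for p in packets):
        b_i = min_required_bits(packets, tau_max, guard, t_frame)
        return min(b_i / b_max, 1.0) * eta_n / eta_max
    return p_urgency * eta_n / eta_max


def stage2_priority(buffered_bits, eta_n, eta_max):
    """Sketch of (14): second-stage priority (spectral efficiency only)."""
    return eta_n / eta_max if buffered_bits > 0 else 0.0
```

With τ_max = 50 ms, Δτ = 10 ms, and T_frame = 5 ms, a freshly queued 1000-bit packet contributes 125 bits to b_i, whereas the same packet queued for 45 ms triggers the urgency branch.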
Now, the number of required bits per frame b_i is directly obtained from the remaining buffered bits after the previous allocation process, that is,

\[
b_i = \sum_{p=1}^{P} L_{i,p}. \tag{15}
\]

Finally, the joint scheduling and resource allocation process ends when one of two conditions is met: (i) all the MRUs have been allocated, or (ii) the input buffers have been emptied. The number of bits allocated to each SF is then determined by the number of bursts associated to that SF and the MCS of each burst. Since the packets must be received in the correct order, the data from the buffers is extracted from older packets to newer packets (as in a first-in first-out queue). The delivered packet delay τ_i,p is then measured as the time from when the packet is queued at the buffer until the instant at which all the bits from the packet have been transmitted.

4. Performance Results

The simulated scenario is a single cell system environment with the main system parameters detailed in Table 3. The simulations have been carried out using a custom simulator developed in C++ with the IT++ communication libraries. The simulator includes both the link level and the system level properties, where both the MAC and the PHY properties of the WiMAX system are considered (see Table 1 parameters). During each simulation run, the users are dropped at different positions following a uniform distribution within the cell area. The position of the MSs remains fixed during the whole simulation process, while the speed of each MS is only employed to determine the Doppler effect and the channel coherence time [17]. A simulation time of 50 seconds is considered to be enough to ensure the convergence of the service flows and the performance metrics. The full process is repeated with the MSs dropped at new random locations. The number of simulated drops is 25, which makes the results independent of the users' position. Without loss of generality, and to simplify the results, a single SF is assigned to each user. The channel estimation is assumed ideal at the base station, and packet retransmission is not considered.

Table 3: System parameters.

OFDMA air interface and system level parameters
Carrier frequency                      3.5 GHz
Bandwidth                              20 MHz
Sampling frequency                     22.857 Msps
Subcarrier permutation                 Band AMC
CP                                     0.125%
FFT length                             2048
# of used subcarriers                  1728
# of subcarriers per MRU               18
# of OFDM symbols per MRU              3
# of data symbols per MRU              48 (efficiency = 8/9)
Modulation                             M-QAM, M = {4, 16, 64}
Channel coding                         Punctured convolutional
Bit error rate (BER)                   < 10^−6
Channel model                          Pedestrian B
MS velocity                            10 Km/h
Channel estimation and feedback        Ideal
Shadowing standard deviation           5 dB
BS Tx power                            49 dBm
BS antenna gain and pattern            14 dB (sectorial antenna), 70°
MS antenna gain and pattern            0 dB, Omnidirectional
Other link budget parameters           BS height = 30 m, MS height = 1.5 m, MS noise figure = 7 dB, Connectors loss = 2 dB
Path loss, urban environment           139.57 + 28·log10(R), R = distance BS to MS in Km
Thermal noise                          −174 dBm/Hz
# of sectors simulated                 1
Frame duration, T_frame                5 ms
DL/UL rate                             2:1
# of OFDM symbols in the DL subframe   30

Five service classes, summarized in Table 2, have been considered according to the traffic models in [17, 19]. For the rtPS and nrtPS, the flows consist of variable size packets generated periodically (every 100 milliseconds) according to the video conference and multimedia streaming models in [19]. For the UGS, packets are of fixed size and periodically generated (e.g., VoIP). Finally, the web browsing and file transfer protocols are modelled as asynchronous processes that generate variable size packets following the models described in [17]. The packets from each SF are buffered in independent queues where each packet is monitored by its size in bits and the time it has spent in the buffer. A maximum BER, BER_max < 10^−6 after channel coding, is required for all the service classes. In this case, the minimum effective SNR values per MCS with the mandatory punctured convolutional coding defined in the IEEE 802.16e standard [2] (constraint length 7 and native code rate 1/2) are the following: [7, 8.7, 9.6, 11.2] for QPSK, [13.9, 15.6, 16.6, 18] for 16QAM, and [20, 21.7, 22.7, 24.3] for 64QAM with coding rates of 1/2, 2/3, 3/4, and 5/6, respectively. To obtain the effective SNR, the channel values inside each subchannel are merged by the harmonic mean which, despite being a very simple calculation that is independent of the modulation and coding, is able to extract the effective channel very accurately [18].
First, the performance of the proposed TSPS prioritization function is evaluated and compared to the PFS and the b²PFS prioritization functions by means of the cumulative density function (cdf) of the delay of the delivered packets (P(τ_i,p < τ)) (see [17] for more information on the measurement procedure). The allocation algorithm follows the one proposed in Section 3 with P_burst = 10. For the PFS and b²PFS scheduling functions, the number of bits per frame b_i that should be transmitted is equal to the number of buffered bits (b_i = L_i(t)). The latency scale for both the PFS and the b²PFS is fixed to 10 frames (i.e., α = 10).
The packet delay statistics obtained with the different scheduling functions in case of nrtPS traffic are depicted in Figure 5, where the number of MSs within the cell is K = 15. The traffic from all the users is modelled according to [19] as VBR streams with an average data rate of 2 Mbps (an average system throughput of 30 Mbps is then required). The maximum allowed delay per packet is 300 milliseconds. The 99th percentile of the delivered packets delay measured using each prioritization function is 275 milliseconds for TSPS, 535 milliseconds for PFS, and 530 milliseconds for the b²PFS. For the TSPS scheduler, the improvement due to the urgency factor (P_urgency) is clearly appreciated since the slope of the cdf changes for delays higher than τ_max − Δτ, where the guard time was fixed to Δτ = 0.2 × τ_max. Furthermore, we can also observe that the maximum delay of the b²PFS scheme is much lower than for the PFS. This difference in performance comes from the fact that the b²PFS considers the states of the buffers; thus, when a large packet is received the priority for that queue is increased until all the buffers have a similar number of queued bits. On the other hand, the PFS is designed to balance the throughput of all the users during short periods of time. Using the same configuration with K = 15 and the same average bit rate equal to 2 Mbps, we have observed that for CBR traffic the 99th percentile is obtained at 55 milliseconds, 100 milliseconds, and 125 milliseconds for TSPS, PFS, and b²PFS, respectively, so the b²PFS scheme performs better than the PFS only for VBR traffic, as expected.

Figure 5: Cumulative density function of the packet delay for nonreal-time traffic and K = 15 users. (Panel title: Non-real time traffic, VBR; x-axis: packet delay τ (s); y-axis: cumulative density function p(τ < X); curves: TSPS, PFS, b²PFS.)

Figure 6: Cumulative density function of the packet delay for real-time traffic and K = {50, 100} users. (Panel title: Real time traffic, VBR; x-axis: packet delay τ (s); curves: TSPS, PFS, and b²PFS for K = 50 and K = 100.)

In case of rtPS traffic, each user stream is also modelled as a VBR stream with an average bit rate of 380 Kbps. For the rtPS traffic, a packet that is not transmitted within the maximum delay is deleted from the queue and discarded. For this case, two parameters have been analyzed: the delay statistics of the delivered packets and the packet loss rate (i.e., the number of discarded packets divided by the number of queued packets). Figure 6 shows the cdf of the packet delay for this scenario with 50 and 100 users. As shown in Figure 6, for K = 50 all the prioritization schemes achieve a delay lower than the maximum (τ_max = 50 milliseconds); in fact, the 99th percentile measured over τ_i,p is 25 milliseconds for TSPS and PFS, and 15 milliseconds for the b²PFS. Furthermore, the packet loss rate for each scheme is 0% for the TSPS, 1.6 · 10^−3 % for the PFS, and 1.6 · 10^−4 % for the b²PFS. In case K = 100, it can be observed that the PFS is the only one that achieves lower packet delays, whereas the TSPS sent most of the packets when the urgency factor was active (the urgency factor is applied when τ_i,p ≥ τ_max − Δτ = 0.04 s). For K = 100, the packet loss rate for each scheduling function is 8.98%, 33.4%, and 16.97% for the TSPS, the PFS, and the b²PFS, respectively. Note that although the TSPS sends most of the packets when they are about to expire, it achieves a lower packet loss rate.
So, although the TSPS initially implies an increase in computational complexity since it requires more information about the buffers status (i.e., each packet must be time stamped for the TSPS scheduler), its superiority has been shown for real-time and nonreal-time applications.
Moreover, there is no need to update the priorities each time an MRU is allocated; thus, the computational complexity is also drastically reduced compared to the PFS and the b²PFS. Another advantage of the TSPS is that it can easily manage different traffic types by applying different maximum delay bounds to each stream. Figure 7 shows the performance of the TSPS over heterogeneous traffic. In this scenario K = 50, where 10 users require nrtPS, 13 users require rtPS, 10 users are browsing internet files (World Wide Web (WWW) service), 5 are downloading files using the file transfer protocol (FTP), and 12 users demand UGS connections for applications such as Voice over IP. The total measured downlink throughput is 26.54 Mbps, and the maximum delay for each service is the one indicated in Table 2. For the WWW and the FTP services, although there is no delay restriction (i.e., τ_max = ∞), a maximum delay of τ_max = 60 seconds and τ_max = 90 seconds has been assumed, respectively, so that the performance of each can be better appreciated. Figure 7 clearly shows that each traffic type achieves a maximum packet delay lower than the maximum tolerated. The 99th percentile for the delay sensitive applications is at 95 milliseconds, 25 milliseconds, and 15 milliseconds for the nrtPS, the rtPS, and the UGS, respectively. Note that the UGS achieves a lower delay than the rtPS despite having a higher maximum delay (75 milliseconds versus 50 milliseconds in Table 2). This is justified by the fact that the packets of the UGS service are much smaller than those of the rtPS; thus, fragmentation is not applied in most cases.

Figure 7: Cumulative density function of the packet delay for mixed traffic obtained with the TSPS scheduling function and K = 50 users. (Panel title: Mixed traffic; x-axis: packet delay τ (s); curves: nrtPS, τ_max = 300 ms; rtPS, τ_max = 50 ms; WWW, τ_max = 60 s; FTP, τ_max = 90 s; UGS, τ_max = 75 ms.)

Figure 8: Probability density function of the number of bursts per frame for nrtPS, K = 15 users and different values of the P_burst factor (when the TSPS prioritization function is applied). (Panel title: Non-real time traffic, VBR; x-axis: number of bursts per frame, B; y-axis: probability density function p(B = X); curves: P_burst = 0, 1, 5, 10, 100.)

Having illustrated the advantages of the proposed TSPS prioritization function, the following figures depict the performance of the authors' proposed resource allocation algorithm described in Figure 4. In Figure 8, the statistics (by means of the probability density function (pdf)) of the number of bursts per frame obtained with the proposed algorithm are shown. The considered scenario is formed by K = 15 users, each requiring nrtPS services. The number of bursts per frame is analyzed as a function of the P_burst factor with values P_burst = {0, 1, 5, 10, 100}. The proposed TSPS prioritization function is applied. In case P_burst = 0, the algorithm considers that each newly allocated MRU is a new burst. This is the maximum granularity case, but clearly in this extreme case the signalling is unaffordable. It can be observed in Figure 8 how, for P_burst > 0, the algorithm starts to merge the MRUs into bursts. For P_burst = 1, half of the allocated MRUs go to an existing burst (both new bursts and existing bursts have the same priority). The number of bursts for P_burst = 1 is still unaffordable in terms of required signalling. However, for P_burst ≥ 5 the number of bursts is lower than 60 for all the simulated frames. Furthermore, in case P_burst = 5, the achieved number of bursts per frame is lower than 24 in 99% of the transmitted frames, which can be considered a very encouraging result. In addition, a soft limiter can be included in the algorithm to limit the maximum number of bursts per frame to 20 without affecting the spectral efficiency too much. Therefore, assuming that approximately 60 bits are required for signaling each burst [2] and using a QPSK modulation with a code rate 1/3, the downlink signaling zone (i.e., the DL-MAP) would span less than 2 OFDM symbols (20 bursts × 60 bits = 1200 bits, that is, 1800 QPSK symbols at 2/3 bit per symbol, against 1728 used subcarriers per OFDM symbol). Hence, the loss due to the downlink signaling is about 6.7% (2 out of 30 OFDM symbols) for the downlink mode when having a total of 30 OFDM symbols per subframe.
On the other hand, the spectral efficiency obtained by the proposed algorithm defined in Section 3.2 is plotted in
Figure 9 as a function of the P_burst factor. The simulated scenario is exactly the same as in Figure 8. It is clear that as P_burst increases the spectral efficiency decreases. In case P_burst = 0, two main behaviors are observed. First, almost all the frames sent with a very high spectral efficiency achieve the maximum throughput, which is approximately 46 Mbps; however, it can be observed that many frames have been sent quite unfilled due to the lack of buffered bits, leading to a low system throughput (peak on the left side of the figure). Furthermore, when computing the 99th percentile of the packet delay for each P_burst value, the delay values obtained are {160, 185, 250, 275, 725} milliseconds for P_burst = {0, 1, 5, 10, 100}, respectively. Combining these results with those of Figure 8, it can be concluded that P_burst = 5 offers the best trade-off between granularity (i.e., spectral efficiency), required signalling, and the required QoS.

Figure 9: Probability density function of the system throughput for nrtPS, K = 15 users and different values of the P_burst factor (when the TSPS prioritization function is applied). (Panel title: Non-real time traffic, VBR; x-axis: system throughput (bps), ×10^7; y-axis: probability density function; curves: P_burst = 0, 1, 5, 10, 100.)

5. Conclusions

In this paper, a new scheduling prioritization function is proposed as well as a continuous frequency and time resource allocation scheme for OFDMA systems (following the data packing standardized in the IEEE 802.16e) which can be applied with both subcarrier permutation schemes (contiguous or distributed). Moreover, the proposed time stamped packet scheduling (TSPS) scheme has been shown to handle delay sensitive applications (i.e., rtPS and nrtPS) while obtaining high spectral efficiencies (multiuser diversity and frequency scheduling are exploited). A 50% reduction of the 99th percentile of the delivered packets delay and a 30% reduction of the packet loss rate (compared to the PFS function) are achieved in case of nrtPS and rtPS streams, respectively. On the other hand, the proposed resource allocation algorithm, which packs users' data into rectangles based on iterative burst increments, gives an important reduction in the number of required bursts per frame. According to the simulations carried out, it is concluded that if the priority associated to increasing an existing burst is five times that of generating new bursts (i.e., P_burst = 5), a signaling loss lower than 10% can be achieved without sacrificing spectral efficiency. Finally, another advantage of the proposed resource allocation algorithm that has been observed during simulations is its lower computational complexity compared to the case where each MRU is independently evaluated. Since in many cases several MRUs might be allocated in a single iteration, the number of required iterations is reduced as the number of bursts per frame is decreased.

Acknowledgment

This work was partially supported by the European ICT-2008-211887 project PHYDYAS.

References

[1] Recommendation ITU-R M.1645, “Framework and overall objectives of the future development of IMT-2000 and systems beyond IMT-2000,” June 2003.
[2] IEEE 802.16e-2005, “IEEE standard for local and metropolitan area networks, part 16: air interface for fixed and mobile broadband wireless access systems, amendment 2: physical and medium access control layers for combined fixed and mobile operation in licensed bands and corrigendum 1,” February 2006.
[3] V. Seba and B. Modlic, “Multiple access techniques for future generation mobile networks,” in Proceedings of the 47th International Symposium Electronics in Marine (ELMAR ’05), pp. 339–344, Zadar, Croatia, June 2005.
[4] J. Moon, J.-Y. Ko, and Y.-H. Lee, “A framework design for the next-generation radio access system,” IEEE Journal on Selected Areas in Communications, vol. 24, no. 3, pp. 554–564, 2006.
[5] V. Bharghavan, S. Lu, and T. Nandagopal, “Fair queuing in wireless networks: issues and approaches,” IEEE Personal Communications, vol. 6, no. 1, pp. 44–53, 1999.
[6] H. J. Kushner and P. A. Whiting, “Convergence of proportional-fair sharing algorithms under general conditions,” IEEE Transactions on Wireless Communications, vol. 3, no. 4, pp. 1250–1259, 2004.
[7] H. Kim and Y. Han, “A proportional fair scheduling for multicarrier transmission systems,” IEEE Communications Letters, vol. 9, no. 3, pp. 210–212, 2005.
[8] B. Classon, K. Baum, V. Nangia, et al., “Overview of UMTS air-interface evolution,” in Proceedings of the 64th IEEE Vehicular Technology Conference (VTC ’06), pp. 1–5, Montreal, Canada, September 2006.
[9] Q. Liu, X. Wang, and G. B. Giannakis, “A cross-layer scheduling algorithm with QoS support in wireless networks,” IEEE Transactions on Vehicular Technology, vol. 55, no. 3, pp. 839–847, 2006.
[10] L. Wan, W. Ma, and Z. Guo, “A cross-layer packet scheduling and subchannel allocation scheme in 802.16e OFDMA system,” in Proceedings of the IEEE Wireless Communications and Networking Conference (WCNC ’07), pp. 1865–1870, Hong

Kong, March 2007.
[11] S. S. Jeong, D. G. Jeong, and W. S. Jeon, “Cross-layer design
of packet scheduling and resource allocation in OFDMA
wireless multimedia networks,” in Proceedings of the 63rd IEEE
313, Melbourne, Australia, May 2006.
[12] Y. Ben-Shimol, I. Kitroser, and Y. Dinitz, “Two-dimensional
mapping for wireless OFDMA systems,” IEEE Transactions on
Broadcasting, vol. 52, no. 3, pp. 388–396, 2006.
[13] A. Erta, C. Cicconetti, and L. Lenzini, “A downlink data
region allocation algorithm for IEEE 802.16e OFDMA,” in
Proceedings of the 6th International Conference on Information,
Communications and Signal Processing (ICICS ’07), pp. 1–5,
Singapore, December 2007.
[14] C. Desset, E. B. de Lima Filho, and G. Lenoir, “WiMAX down-
link OFDMA burst placement for optimized receiver duty-
cycling,” in Proceedings of the IEEE International Conference on
Communications (ICC ’07), pp. 5149–5154, Glasgow, Scotland,
June 2007.
[15] J. Andrews, A. Ghosh, and R. Muhammed, Fundamentals
of WiMAX, Understanding Broadband Wireless Networking,
Prentice-Hall, Englewood Cliffs, NJ, USA, 2007.
[16] A. Mourad, On the system level performance of MC-CDMA sys-
tems in the downlink, Ph.D. thesis, Ecole Nationale Supérieure
des Télécommunications de Bretagne, Brest, France, June
2006.
[17] IEEE 802.16m-07/002r4, “TGm System Requirements Docu-
ment (SRD),” October 2007.
[18] M. O. Hasna and M.-S. Alouini, “Application of the harmonic
mean statistics to the end-to-end performance of transmission
systems with relays,” in Proceedings of the IEEE Global
Telecommunications Conference (GLOBECOM ’02), vol. 2, pp.
1310–1314, Taipei, Taiwan, November 2002.
[19] IST-2001-32620—MATRICE, D1.3, “Specification of the performance evaluation methodology and the target performance,” December 2002.
doi:10.1155/2009/512865
Research Article
OFDMA-Based Medium Access Control for Next-Generation WLANs
H. M. Alnuweiri,1 Y. Pourmohammadi Fallah,2 P. Nasiopoulos (EURASIP Member),3 and S. Khan3
1 Department of Electrical and Computer Engineering, Texas A&M University at Qatar, P.O. Box 23874, Doha, Qatar
2 Institute of Transportation Systems (EECS, CEE), University of California Berkeley, 604 Davis Hall, Berkeley, CA 94720-1710, USA
3 Department of Electrical and Computer Engineering, The University of British Columbia, 5500-2332 Main Mall, Vancouver, BC, Canada V6T 1Z4
Correspondence should be addressed to H. M. Alnuweiri, hussein.alnuweiri@qatar.tamu.edu
Received 4 August 2008; Revised 6 December 2008; Accepted 18 January 2009
Existing medium access control (MAC) schemes for wireless local area networks (WLANs) have been shown to lack scalability in
crowded networks and can suffer from widely varying delays rendering them unsuited to delay sensitive applications, such as voice
and video communications. These deficiencies are mainly due to the use of random multiple access techniques in the MAC layer.
The design of these techniques is highly linked to the choice of the underlying physical (PHY) layer technology. The advent of new
PHY schemes that are based on orthogonal frequency division multiple access (OFDMA) provides new opportunities for devising
more efficient MAC protocols. We propose a new adaptive MAC design based on OFDMA technology. The design uses OFDMA
to reduce collision during transmission request phases and makes channel access more predictable. To improve throughput, we
combine the OFDMA access with a carrier sense multiple access (CSMA) scheme. Data transmission opportunities are assigned
through an access point that can schedule traffic streams in both time and frequency (subchannels) domains. We demonstrate the
effectiveness of the proposed MAC and compare it to existing mechanisms through simulation and by deriving an analytical model
for the operation of the MAC in saturation mode.
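Copyright © 2009 H. M. Alnuweiri et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.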
1. Introduction

The throughput and scalability of a wireless local area network (WLAN) are greatly dependent on its medium access control (MAC) protocol and its associated multiple access (MA) scheme. The most important aspect of a MAC protocol is the ability to efficiently coordinate access among multiple contending stations (STA) for a shared resource. MAC protocols for WLANs must also address several other key issues, such as coping with a varying and potentially large number of users, maintaining acceptable delays for multimedia traffic, transmitting over a randomly varying wireless channel, and introducing minimal overhead to the transmitted data stream.
The original designs of MAC protocols for WLANs included the CSMA/CA (CSMA with collision avoidance) mechanism in IEEE 802.11 WLANs and time division multiple access (TDMA) in HIPERLAN [1–3]. The latter was never really accepted by the industry, while IEEE 802.11 (dubbed WiFi) quickly became the technology of choice for WLANs worldwide. TDMA required complex scheduling in a central node and used fixed time slot-based transmissions, as opposed to variable length packet transmissions. Such a scheme had limited flexibility in low cost WLANs that are dominated by bursty data traffic. Instead, simpler random access CSMA-based protocols were preferred. However, as the experience with deploying 802.11 based networks matured, it became obvious that the simple CSMA/CA MAC mechanism can become quite inefficient, mostly due to its collision prone random access and recovery processes. Although mechanisms to control and minimize the impact of collisions (such as collision detection in
the CSMA/CD mechanism of Ethernet) exist, a wireless medium does not allow for collision detection through power measurement. Furthermore, the collision avoidance mechanism (CA), used in CSMA/CA protocols, has a very limited capacity in improving the performance of the MAC. In fact, the performance of the CSMA/CA protocol used in the IEEE 802.11 standard deteriorates rapidly as the number of stations increases and/or the amount of traffic grows [4].
One of the main objectives of our proposed MAC is to address this issue of low throughput (or efficiency) of the CSMA/CA MAC layer under heavy loading conditions in crowded WLANs. Another major issue is to improve support for persistent real-time traffic, produced by multimedia applications, while maintaining the random access and variable length packet transmission features of the original MAC. To achieve these goals, we utilize the orthogonal frequency division multiple access (OFDMA) scheme [5, 6]. OFDMA is currently used in the IEEE 802.16e standard for wireless metropolitan area networks [7]. Multimedia traffic has traditionally suited more centrally controlled MAC protocols such as TDMA, over a random access network [8]. In our proposed MAC, we utilize controlled CSMA access to provide better scheduling opportunity for multimedia traffic. At the same time, we preserve and increase the efficiency of the random access feature through combining OFDMA with CSMA.
In the next section, we briefly review some relevant works and techniques and elaborate on the advantages of an OFDMA-based MAC. We present our OFDMA-based MAC design in Section 3. The proposed design is then analyzed and evaluated in Section 4. Open issues and future research directions are discussed in Section 5.

2. Multiple Access Scheme for WLANs

Providing contention minimized random access is one of the main aspects of our proposed design. For this purpose, we can combine CSMA with any multiple access scheme such as frequency division multiple access (FDMA), code division multiple access (CDMA) (similar to the method in [9]), TDMA, or OFDMA.
It is well known that OFDMA has a clear capacity advantage over FDMA, therefore we do not discuss FDMA here. Also FDMA-based schemes such as multichannel slotted Aloha have been well examined in the past [10]. As described below OFDMA has several advantages over CDMA and TDMA schemes as well and is our primary choice.
In the CDMA technique, the transmitted signal is multiplexed with a high bit rate pseudocode sequence, to distinguish it from other simultaneous transmissions in time and frequency. However, this high bit rate transmission is susceptible to channel frequency selectivity and requires complex channel equalization at the receiver to recover from the resulting intersymbol interference. It is widely known that indoor radio channels experience high frequency selectivity due to their rich scattering environments. Furthermore, CDMA schemes suffer from the near-far phenomenon; a nearby STA’s high power transmission may completely block out faraway STA’s lower power transmissions. OFDM-based techniques such as OFDMA have been proven to perform well and are more easily implemented for WLANs.
TDMA-based MAC designs are known to be mostly suitable for constant bit rate voice communications but not for bursty data or video traffic [8]. Moreover, in TDMA, where STAs share the available bandwidth in time, uplink connection request transmissions have to happen in exact time slots. Such a mechanism has strict synchronization needs and requires all STAs to be aware of these time slots and to be perfectly aligned with the central controller. If CSMA rules are used with TDMA to avoid collision and relax synchronization requirements, the interframe spacing required between slots will reduce the throughput of the system.
OFDMA, on the other hand, is based on OFDM, which is inherently more suited for wireless channels. OFDM can convert the frequency selective channel into a parallel but orthogonal set of smaller frequency-flat channels which the receiver can deal with without needing expensive channel equalization. With the use of guard intervals, ISI can be completely avoided in OFDM systems. OFDMA allows assigning these subchannels to different STAs. Given this ability of OFDMA and its superior performance in wireless environment typical to WLANs, a system using OFDMA is of more interest to future multiple access wireless networks.
Some recent works on MAC design based on OFDMA or OFDM are reported in [11, 12]. The design in [12] is based on a combination of OFDM and TDMA and attempts to provide better performance through intelligent assignment of subcarriers in time domain. This scheme is not designed for efficient random access which is one of our objectives. The work in [11] presents a MAC that relies on OFDMA for all of its operation and resolves collisions through changing subchannel assignments. While such methods and the standard 802.16 MAC protocol [7] provide novel schemes using OFDMA, they require complex scheduling, are more suitable for consistent voice-like traffic, and do not provide efficient random access capabilities. We present a new design that is different in many aspects and is more suitable for heterogeneous traffic, which is typical for WLANs. Our design is also more flexible and easily extendable for operation in different environments.

2.1. OFDMA System Specifications. An important aspect of an OFDMA system that is of interest to the design of a MAC scheme is the number of subchannels. OFDMA allows subcarriers to be grouped and assigned to different users. These groups of subcarriers are known as subchannels (also called subbands in some literature). In our scheme, the number of subchannels is primarily determined by the MAC to achieve optimal performance. This issue is discussed in Section 4.
The number of subcarriers in an OFDMA scheme is a design factor that can be adjusted for different environments. For the IEEE 802.16 standard, a fixed separation of subcarriers is considered [7, 13]. Depending on channel bandwidth, which may range from 1.25 to 40 MHz, the number of subcarriers ranges from 128 to 2048. The 802.11 a/g standard
uses OFDM with 64 subcarriers in a 20 MHz BW [1]. We do not make an assumption on the number of subcarriers in this article.
Our scheme does not require any specific choice of guard interval for OFDM symbols, or the guard bands for the channel. Any existing PHY specification for OFDMA or OFDM systems (e.g., from 802.16 or 802.11 standards) is acceptable. Subchannels formed from grouping subcarriers together may be adjacent or may be distributed. In general, distributed permutations outperform adjacent subchannels in high mobility applications while adjacent subcarriers can be used to provide higher throughput in fixed, portable, and low mobility environments [13].

3. Hybrid OFDMA/CSMA MAC Design

Our proposed MAC uses a two-stage frame delivery process, consisting of a transmission opportunity request (TR) phase, followed by a scheduled data transmission (ST) phase. A station willing to send data has to inform the central controller, the access point (AP), of its required transmission opportunity (TO). The TO request is sent in a contention phase with reduced collision (using OFDMA). Once the AP receives this message, it will assign contention free TOs to the station. A station can also transmit a more elaborate TO request, such as, a periodic TO assignment for its voice or video traffic. The AP can then regularly assign TOs to the station, without needing subsequent TO requests from the station (similar to unsolicited access grant in 802.16 [7]). Apart from the obvious decrease in control messages that need to be sent, such a mechanism reduces contention-based TO requests, thereby providing additional QoS guarantees for multimedia traffic. Following the grant of a contention free TO, the station can send longer data frames.
The two phases (ST and TR) are separated by time spaces (called interframe spaces or IFSs) that coordinate transition between phases using carrier sensing. A simplified view of the MAC operation timeline is depicted in Figure 1. We define the IFS values in our mechanism as: minimum IFS (MIFS), controlled access IFS (CIFS), and random access IFS (RIFS); where MIFS < CIFS < RIFS. These values are PHY specific and are, as in the 802.11 standard [1], determined by the specifications of the clear channel assessment (CCA) mechanism, MAC and PHY processing delays, and transceiver turn around times. For example, MIFS is the nominal time that the MAC and PHY will require to receive the last symbol of a frame at the air interface, process the frame, and prepare to respond with the first symbol on the air interface. CIFS is MIFS plus the CCA time, and RIFS is CIFS plus CCA time.
The TR phase starts after an MIFS interval, following an explicit TR start message from the AP or automatically after sensing the entire channel (not just the subchannel) idle for RIFS seconds. Transition between TR and ST phases happens after sensing the entire channel idle for CIFS after the backoff slots in the TR phase. If there has been no message in the TR phase, and AP has no data or message to send, the TR phase is repeated with stations starting to count backoff slots after a RIFS again. In our scheme, the TR phase always uses OFDMA, whereby a set of subcarriers (i.e., a subchannel) is assigned to each station for sending the TR messages. When the number of subchannels is equal to or more than the number of stations, the TR phase will be contention free. To increase the utilization and the multiplexing gain for the TR phase, we can assign more than one station to each subchannel; this means that a probability of collision between stations exists within each subchannel. However, this probability can be capped through dynamically adjusting the number of subchannels and the number of stations assigned to each subchannel. A CSMA/CA scheme is used to resolve the probable collisions, and a limited number of time slots are allocated for the backoff process. This constitutes a hybrid and dynamic OFDMA/CSMA scheme for the TR phase. We evaluate the operation of different TR phase settings later in this article.
When the TR phase completes, the AP processes the received TRs and schedules the subsequent ST phase. Transmission in the ST phase is contention free and can be from one station to the AP or to another station. The contention free transmission may be done in several ways. A simple method is to assign all subcarriers to one station and schedule stations in the time domain (i.e., OFDM with time domain multiplexing). In this mode, either a transmission schedule (TS) for stations is broadcasted, or individual TO assignment messages are sent to stations at appropriate times. The other option for the ST phase is to utilize OFDMA to allow concurrent data transmission by multiple stations. In this mode, a subchannel assignment (SA) map is broadcasted to inform the stations of their assigned subchannels. The SA map may include downlink transmissions by the AP, and therefore it is not equivalent to the uplink map (UL Map) in other standards. Utilizing OFDMA in the ST phase incurs great complexity in scheduling stations and may lead to inefficiency if a schedule that fills all subchannels at all times is not feasible. In this article, we only consider the former case where OFDMA is used in the TR phase and the ST phase uses controlled CSMA.

3.1. TO Request (TR) Phase. During the TR phase, each station can transmit its TO requests in its assigned subchannel according to the CSMA/CA rules. Each subchannel runs its own separate CSMA/CA procedure. An important part of the design for our system is the subchannel assignment for random access messages, that is, TO requests. Assuming that N subcarriers (e.g., N = 256) are available, and there are n stations, we divide the channel to M subchannels, or in other words, assign k = N/M subcarriers to each station. If M = 1 or k = N, we have an all OFDM system with CSMA/CA operation, similar to 802.11. If M = n, we have the other extreme (k = N/n), which is an OFDMA system without contention (no need for CSMA). A balance can be found between the wasted bandwidth due to collision and that due to subchannels sometimes remaining unused. This is discussed in more detail in the next section.
To handle the possible contention in each subchannel, we employ a CSMA/CA mechanism. Each subchannel uses its
[Figure 1 diagram: subchannels versus time. TR phases carry TO requests from STA 1, STA 2, and STA 3 on separate subchannels; the scheduled transmission (ST) phase carries the AP Tx schedule and the data frames of STA1, STA2, and STA3.]
Figure 1: Two-phase MAC operation: TO request (TR) phase, scheduled transmission (ST) phase.
own separate CSMA/CA procedure. Collisions are resolved using a random exponential backoff process. There are other collision resolution mechanisms such as changing subchannel assignment as described in [11]. However, such methods increase system complexity without providing significant gain in performance. Thus we use independent CSMA/CA mechanisms in each subchannel. The CSMA/CA scheme operates similarly to the process used in 802.11, except for the carrier sensing in each subchannel that uses OFDM symbol detection instead of simple channel power sensing.
The TR phase timeline includes a limited number (denoted as q) of minimum length time slots followed by a transmission opportunity for the TR message. The minimum time slots are used by the backoff process of the CSMA/CA scheme to avoid collision. The duration of such time slots is derived from the minimum time that is needed for the detection of an OFDMA symbol in the subchannel. When a station wants to transmit a TR message, it chooses a random backoff number from the interval (0, CWmin). The stations decrement their backoff counter with each empty slot in the TR phase passing; when the counter reaches zero, they transmit the TR message immediately if they sense no other station already transmitting in the subchannel. After a station transmits a TR message, it will wait for a response from the AP, in the form of a poll or position in the schedule. If no response is received, the station will interpret this as a collision (or lost packet), double its contention window size, and then select a new random backoff number from the new contention window. The lower and upper limits of the contention window size can be traffic class dependent to provide prioritized quality of service. These values can also be adaptively assigned to stations by the AP, to achieve better performance.
To achieve even higher performance, the AP can dynamically change the number of subchannels and assign stations to different subchannels. This can be done using the network allocation vector (NAV) field in the standard CTS frame sent by the AP. The access point keeps track of the number of associated stations (Q) and the number of active stations (n). Active stations are ones that have set up a traffic stream with the AP, have indicated a nonzero queue size, or have transmitted in the past L beacon intervals (L being a configuration parameter). When assigning stations to subchannels, the access point first assigns the active stations and then distributes the rest of the stations.

3.2. Scheduled Transmission (ST) Phase. The ST phase starts after sensing the channel idle for a CIFS following a TR phase (Figure 2). Transmission of TR messages in the TR phase starts during the first q backoff slots and lasts for a fixed duration of the TR message. This means that if there is at least one TR message, the ST phase has to wait until the end of the TR messages. If there are no TR messages, the ST phase may start after the backoff slots plus CIFS.
All subchannels in the ST phase are assigned to only one station at a given time (i.e., OFDM operation), and stations are scheduled in time. The schedule and order of access to the medium are determined by the access point and enforced through broadcast messages indicating the schedule, or through explicit poll (TO assignment) messages. Transmission by stations is only allowed in response to such messages. This policy along with a carrier sense mechanism is used to coordinate contention free access to the medium during the ST phase. This mechanism utilizes the interframe spaces described earlier.
The first message in ST is sent by the AP. If this message is a schedule announcement, the stations will send their responses in the specified order with CIFS spacing. Acknowledgements are sent after MIFS following the data frame. This process continues until the end of the scheduled TOs with CIFS spacing separating the TOs. Since MIFS < CIFS, the AP can reclaim the channel by transmitting after an MIFS, and change the schedule or end the ST phase. If explicit poll messages are used instead of transmission schedules, the AP will send the first poll (TO assignment) message after CIFS > MIFS. All message exchanges during a transmission opportunity use MIFS spacing. The use of time-based TO allows for multirate operation of the PHY while maintaining the temporal fairness (as discussed in other works [14]). The next TO is always initiated by the AP after a CIFS. This process continues until the end of the ST phase when the AP either indicates the end through an ST end frame or leaves the channel idle for at least RIFS > CIFS (triggering a TR phase). Figure 2 demonstrates the timeline of the ST phase in two modes: using the explicit polling and using schedule announcement messages.
There are several options for enhancing the efficiency of the ST phase. One option is to piggyback the poll messages on the acknowledgement packets if the recipient of the station’s packet is the access point (which is the case in most scenarios). Further enhancements can be achieved using block acknowledgment.
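The backoff rules of the TR phase described in Section 3.1 can be illustrated with a minimal Python sketch. It is not the authors' simulator: the function names are hypothetical, the backoff interval (0, CWmin) is read as the integers 0 to CW − 1, and the reset of the contention window after a successful request is an assumption borrowed from 802.11 DCF, since the text only specifies the doubling after a missing response.

```python
import random


def tr_subchannel_outcome(backoffs, q):
    """Classify one subchannel in one TR phase.

    backoffs: backoff counters (in slots) of the stations that have a
              TR message pending on this subchannel.
    q:        number of backoff slots in the TR phase.
    Returns ("idle", None), ("success", station_index) or ("collision", None).
    """
    if not backoffs:
        return ("idle", None)
    first = min(backoffs)
    if first >= q:
        # No counter reaches zero within the q backoff slots.
        return ("idle", None)
    # Stations whose counter expires first transmit; the others sense the
    # subchannel busy and defer.
    winners = [i for i, b in enumerate(backoffs) if b == first]
    if len(winners) == 1:
        return ("success", winners[0])
    return ("collision", None)


def next_contention_window(cw, cw_min, cw_max, got_response):
    """Double the window when no AP response arrives, capped at cw_max.

    Resetting to cw_min after a response is an assumption (802.11-like).
    """
    if got_response:
        return cw_min
    return min(2 * cw, cw_max)


def draw_backoff(cw, rng):
    """Uniform backoff number in 0..cw-1 (assumed reading of the interval)."""
    return rng.randrange(cw)
```

For example, with q = 8 the counters [3, 5, 7] yield a successful request by the first station, [5, 2, 2] yield a collision, and [9, 12] leave the subchannel idle for this TR phase.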
[Figure 2 diagrams: subchannels versus time for panels (a) and (b). Each panel shows a TR phase with TO requests from STA 1, STA 2, and STA 3, an ST phase with STA1 and STA2 data frames each followed by an AP ACK, and the next TR phase. In (a) the AP sends a poll (TO assignment) before each data frame; in (b) the AP sends one ST schedule. Interframe spaces: RIFS ∼ 30 μs, CIFS ∼ 20 μs, MIFS ∼ 10 μs.]
Figure 2: Hybrid MAC operation, different operation methods for ST phase: (a) explicit polls, (b) schedule broadcast.
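As an illustration of the schedule-broadcast mode in Figure 2(b), the following Python sketch lays out the frames of one ST phase using the spacing rules of Section 3.2: the AP starts after CIFS, each TO starts after CIFS, and each ACK follows its data frame after MIFS. The function name, the frame airtimes passed in, and the omission of propagation delay are assumptions made for the example; the IFS defaults are the approximate values quoted with Figure 2.

```python
MIFS_US = 10  # approximate values quoted with Figure 2 (microseconds)
CIFS_US = 20


def st_schedule_timeline(schedule_us, data_frames, ack_us,
                         cifs_us=CIFS_US, mifs_us=MIFS_US):
    """Return (label, start_us, end_us) for each frame of one ST phase.

    schedule_us: airtime of the AP schedule announcement (assumed input).
    data_frames: list of (station_name, data_airtime_us) in scheduled order.
    ack_us:      airtime of one ACK (assumed input).
    Time 0 is the instant the channel becomes idle after the TR phase.
    """
    events = []
    t = cifs_us  # the ST phase starts after the channel is idle for CIFS
    events.append(("AP schedule", t, t + schedule_us))
    t += schedule_us
    for station, airtime in data_frames:
        t += cifs_us  # scheduled TOs are separated by CIFS
        events.append((station + " data", t, t + airtime))
        t += airtime
        t += mifs_us  # the ACK is sent MIFS after the data frame
        events.append(("ACK to " + station, t, t + ack_us))
        t += ack_us
    return events


if __name__ == "__main__":
    for event in st_schedule_timeline(50, [("STA1", 200), ("STA2", 100)], 20):
        print(event)
```

Because MIFS < CIFS, an AP that wants to change the schedule could insert its own frame MIFS after any ACK in this timeline, before the next station's CIFS expires.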
3.3. Quality of Service and Multimedia Support. To enable QoS and multimedia provisioning for the proposed MAC, we propose two classes of service: prioritized random access and scheduled guaranteed access. To provide priority services, we specify different limits for contention window sizes. The smaller the limits are, the higher the priority of access to the channel will be. In this case, it is also required that the access point schedules TOs for higher priority stations ahead of the others (using a simple priority scheduler).
The scheduled guaranteed access mechanism is usually used for traffic of persistent session (e.g., voice or video) type. This mode requires more complex scheduling schemes in the access point. The AP must also be aware of the traffic behavior of the requesting stations and their QoS requirements. When a traffic stream is set up between the AP and a station, the scheduler inside the AP is configured to send unsolicited TO assignments (e.g., polls) to the station. For this purpose any scheduling mechanism can be used. An example of such mechanisms is the controlled access phase scheduling mechanism [9], which was originally developed for 802.11e MAC but is applicable with modifications to the MAC protocol proposed here.

4. Analysis and Performance Evaluation

To analyze the performance of the proposed MAC, we first analytically model the MAC operation and derive the throughput in saturation mode (in which all stations always have data to send). We extend the analysis by simulation experiments for different traffic load conditions.

4.1. Analytical Modeling of the MAC in Saturation Mode. Knowing that each subchannel employs a CSMA/CA scheme with exponential random backoff, similar to the distributed coordination function (DCF) of 802.11, we can reuse the model developed for its backoff process [4]. This model has been shown to accurately model the exponential backoff process in saturation mode [14–16].
A further extension of the model described here can include nonsaturation scenarios, but this will not be considered here any further. The work in [4] models the MAC events of successful transmission, collision, and idle waiting. The duration between transitions to each state is called a “slot” and the probability of a slot containing each event is found. For the model in [4] to be correct, two fundamental assumptions are made. First, after each idle slot, each station may attempt to transmit with an independent and constant probability τ. Second, regardless of the number of past collisions, a transmission attempt may result in a collision with an independent and constant probability p. The backoff process is then modeled by a two-dimensional Markov chain and the following two equations are found for τ and p [4]:
$$\tau = \frac{2(1-2p)}{(1-2p)(W+1) + pW\left(1-(2p)^{m}\right)}, \qquad p = 1-(1-\tau)^{n-1}, \tag{1}$$
where n is the number of contending stations, W is CWmin, and m is defined so that CWmax = W · 2^m. Solving these equations using numerical methods, one can find τ and p based on the known values of W, m, and n. Using the values of τ and p, we can find the probability that at least one transmission happens in a given slot ($P_{tr}$) and the probability of a transmission in a slot being successful ($P_s$). $P_{tr}$ is the complement of the probability of no station transmitting ($P_{idle} = (1-\tau)^{n}$) and is simply derived as follows:
$$P_{tr} = 1-(1-\tau)^{n}. \tag{2}$$
$P_s$ can be described as the probability of exactly one transmission given that there has been a transmission on the channel, thus
$$P_s = \frac{n\tau(1-\tau)^{n-1}}{1-(1-\tau)^{n}}. \tag{3}$$
The above probabilities are valid for our system and 802.11 DCF; however, their meanings are different in each MAC. For DCF, these probabilities describe the transition probabilities between successful, collision, and idle states. In our system, by contrast, they only describe the probabilities that each of the q backoff slots in a TR phase subchannel contains a transition to successful, collision, or idle states. For our MAC scheme, the value of n in the above equations changes to the number of stations assigned to each subchannel.
For our system, we need to find the probability that a TR phase subchannel is successful in delivering a request. This probability, denoted as $P_{suc}^{TR}$, can be found considering that a successful transmission can happen at any slot during a TR phase ($P_s P_{tr}$), assuming that its previous slots were idle ($(P_{idle})^{i-1}$ for the ith slot). This can be written by summing the probabilities of the ith slot being successful and all i − 1 previous slots empty. Knowing that the number of backoff slots in a TR phase is q, we have
$$P_{suc}^{TR} = P_s P_{tr} + P_{idle} P_s P_{tr} + P_{idle}^{2} P_s P_{tr} + \cdots + P_{idle}^{q-1} P_s P_{tr} = P_s P_{tr} \sum_{i=1}^{q} P_{idle}^{\,i-1}. \tag{4}$$
Similarly, we can find the probability of a TR phase subchannel being idle as the probability that all q slots are idle:
$$P_{idle}^{TR} = P_{idle}^{\,q}. \tag{5}$$
Using the above probabilities, we can find the expected throughput of the system by dividing the expected amount of traffic delivered in each ST phase, by the expected duration of the ST phase (E[ST]) plus the expected duration of the TR phase (E[TR]). The expected amount of traffic served in each ST depends on the size of the TOs assigned by the AP. For simplicity, we assume that the TOs of all stations contain a single packet transmission. The expected length of a packet (E[P]) times the expected number of successful TR subchannels ($M_{suc}$) gives the expected amount of data served in an ST. If the number of stations assigned to all subchannels is equal, we have $M_{suc} = M \cdot P_{suc}^{TR}$, otherwise $M_{suc} = \sum_{i=1}^{M} P_{suc}^{i}$, where $P_{suc}^{i}$ is the $P_{suc}^{TR}$ of the ith subchannel. With these values, we have the throughput as follows:
$$S = \frac{M_{suc} \cdot E[P]}{E[ST] + E[TR]}. \tag{6}$$
The length of the ST phase can be found by multiplying the expected number of successful TR subchannels and the expected duration of a TO (E[TO]), plus the duration of transmitting the SA map of length $L_{SA}$ ($T_{SA}$) if one is used
$$\begin{aligned} E[ST] &= T_{SA} + M_{suc} \cdot E[TO], \\ E[TO] &= \frac{2H_{phy}}{R_b} + \frac{H_{mac} + E[P] + L_{ack}}{R} + T_{MIFS} + 2\delta + T_{CIFS}, \\ T_{SA} &= \frac{H_{phy}}{R_b} + \frac{L_{SA}}{R} + \delta + T_{CIFS}, \end{aligned} \tag{7}$$
where δ is the propagation delay, R and $R_b$ are the PHY operational and basic rates, for example, R = 54 Mbps and $R_b$ = 6 Mbps for the IEEE 802.11 a/g standard. Also, $L_{ack}$ is the length of the Ack message and $H_{phy}$ is the length of the PHY header.
We approximate the expected length of a TR phase as the duration of one TR message plus half of the backoff slots (since the random backoff number is uniformly chosen from the contention window). For the case of M subchannels, the length of the TR message sent using OFDMA is stretched due to the use of a lower number of subcarriers. Thus,
$$E[TR] = T_{RIFS} + M\left(\frac{H_{phy}}{R_b} + \frac{L_{TR}}{R}\right) + \frac{T_{slot}\, q}{2} + \delta, \tag{8}$$
where $L_{TR}$ is the length of a TR message in bits and $T_{slot}$ denotes the duration of backoff slots in each TR phase.
To evaluate the analytical model and examine the performance of the proposed MAC scheme, we assumed a specific OFDMA-based PHY similar to the one specified in the IEEE 802.16 standard and computed the normalized throughput (throughput provided to layers above the MAC divided by the PHY operational rate). We then compared the results with those obtained from simulation experiments, using a discrete event simulator written in C language. The results are depicted in Figure 3, which shows that the model matches the simulation very closely and is therefore quite accurate. The accuracy and simplicity of using the model for calculating the throughput make it easy to devise adaptive schemes for the access point that can maximize the system throughput as the number of active stations changes over time.
To compare the achievable throughput of the proposed MAC to that of the basic CSMA/CA, we repeat the derivations for the saturation throughput of the 802.11 DCF here [4, 15]:
$$S_{802.11} = \frac{P_s P_{tr} E[P]}{(1-P_{tr})\,T_{slot} + P_{tr} P_s T_s + P_{tr}(1-P_s)\,T_c}, \tag{9}$$
where $T_c$ and $T_s$ denote the durations of time spent in collision or successful transmission and are given as follows
Table 1: Parameters used for simulation and numerical analysis.

Parameter    Value     Parameter                Value
Tslot        16 μs     BW                       20 MHz
TMIFS        10 μs     Number of subcarriers    256
TCIFS        20 μs     Rb                       6 Mbps
TRIFS        30 μs     R                        54 Mbps
ACW Min      16        Lack                     12 B
ACW Max      256       Lcts, Lrts               18 B
q            8         Hphy                     120 bits
δ            1 μs      Hmac                     30 B
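
As a rough numerical illustration of (4) and (5), the following Python sketch evaluates the TR-phase success and idle probabilities for q = 8 backoff slots (the value listed in Table 1). The per-slot probabilities passed in (`P_S`, `P_TR`, `P_IDLE`) are placeholder values chosen for illustration, not results from the paper, and the function names are hypothetical. The closed-form geometric series is included only as a cross-check of the sum.

```python
def tr_success_probability(p_s: float, p_tr: float, p_idle: float, q: int) -> float:
    """Equation (4): slot i succeeds (p_s * p_tr) after i - 1 idle slots, summed over q slots."""
    return p_s * p_tr * sum(p_idle ** (i - 1) for i in range(1, q + 1))


def tr_idle_probability(p_idle: float, q: int) -> float:
    """Equation (5): all q backoff slots of the TR phase stay idle."""
    return p_idle ** q


def tr_success_closed_form(p_s: float, p_tr: float, p_idle: float, q: int) -> float:
    """Same sum as (4), written as a finite geometric series (requires p_idle != 1)."""
    return p_s * p_tr * (1.0 - p_idle ** q) / (1.0 - p_idle)


# Placeholder per-slot probabilities (assumptions, not values from the paper).
P_S, P_TR, P_IDLE = 0.8, 0.3, 0.7
Q = 8  # number of backoff slots per TR phase, from Table 1

p_suc_tr = tr_success_probability(P_S, P_TR, P_IDLE, Q)
p_idle_tr = tr_idle_probability(P_IDLE, Q)
print(f"P_suc^TR = {p_suc_tr:.4f}, P_idle^TR = {p_idle_tr:.4f}")
```

Writing the sum in closed form makes the dependence on q explicit, which may be convenient when q is one of the quantities being tuned.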
Figure 3: Model validation, comparing simulation and analytical results. (Normalized throughput versus number of stations; simulation and model curves for M = 1, 4, and 16.)

Figure 4: CSMA/CA versus hybrid OFDMA/CSMA (M: no. of subchannels). (Curves plotted against the number of stations: CSMA/CA, CSMA/CA with RTS, and hybrid MAC with M = 1, 4, and 16.)
Figure 5: Normalized saturation throughput versus number of stations and number of subchannels.

for normal CSMA/CA and CSMA/CA in RTS/CTS mode (where each transmission is preceded by an RTS/CTS cycle):

$$T_s = 2H_{phy}/R_b + \left( H_{mac} + E[P] + L_{ack} \right)/R + T_{MIFS} + 2\delta + T_{RIFS},$$
$$T_c = H_{phy}/R_b + \left( H_{mac} + E[P^{*}] \right)/R + T_{RIFS} + \delta, \tag{10}$$

and for the RTS/CTS case:

$$T_s = 4H_{phy}/R_b + \left( H_{mac} + E[P] + L_{rts} + L_{cts} \right)/R + 3T_{MIFS} + 4\delta + T_{RIFS},$$
$$T_c = H_{phy}/R_b + \left( H_{mac} + L_{rts} \right)/R + T_{RIFS} + \delta, \tag{11}$$

where $H_{mac}$ is the length of the MAC header. For a fair comparison, we replaced the IFS values of 802.11 DCF in (10) and (11) with the IFS values defined in our MAC.
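
To make (9) to (11) concrete, here is a minimal Python sketch that plugs the Table 1 parameters into the timing expressions and evaluates the normalized 802.11 saturation throughput. Several things are assumptions on our part and not taken from the paper: the grouping of the frame lengths over the data rate R (the printed equations lost their parentheses), a fixed 2000-byte packet so that E[P*] = E[P], and the placeholder values for `P_S` and `P_TR`, which in the model come from the backoff analysis in [4, 15]. All function names are hypothetical.

```python
# Table 1 parameters, converted to seconds, bits, and bit/s.
T_SLOT, T_MIFS, T_RIFS, DELTA = 16e-6, 10e-6, 30e-6, 1e-6
R_B, R = 6e6, 54e6                      # basic rate and data rate
H_PHY = 120                             # PHY header, bits
H_MAC, L_ACK, L_RTS, L_CTS = 30 * 8, 12 * 8, 18 * 8, 18 * 8  # bytes to bits


def basic_times(e_p: float, e_p_star: float) -> tuple:
    """Equation (10): success and collision durations for basic CSMA/CA."""
    t_s = 2 * H_PHY / R_B + (H_MAC + e_p + L_ACK) / R + T_MIFS + 2 * DELTA + T_RIFS
    t_c = H_PHY / R_B + (H_MAC + e_p_star) / R + T_RIFS + DELTA
    return t_s, t_c


def rts_cts_times(e_p: float) -> tuple:
    """Equation (11): success and collision durations in RTS/CTS mode."""
    t_s = (4 * H_PHY / R_B + (H_MAC + e_p + L_RTS + L_CTS) / R
           + 3 * T_MIFS + 4 * DELTA + T_RIFS)
    t_c = H_PHY / R_B + (H_MAC + L_RTS) / R + T_RIFS + DELTA
    return t_s, t_c


def saturation_throughput(p_s: float, p_tr: float, e_p: float, t_s: float, t_c: float) -> float:
    """Equation (9): delivered payload bits per second of channel time."""
    denom = (1 - p_tr) * T_SLOT + p_tr * p_s * t_s + p_tr * (1 - p_s) * t_c
    return p_s * p_tr * e_p / denom


E_P = 2000 * 8            # assumed fixed packet size in bits, so E[P*] = E[P]
P_S, P_TR = 0.8, 0.3      # placeholder probabilities (assumptions)

ts_basic, tc_basic = basic_times(E_P, E_P)
ts_rts, tc_rts = rts_cts_times(E_P)
s_basic = saturation_throughput(P_S, P_TR, E_P, ts_basic, tc_basic) / R
s_rts = saturation_throughput(P_S, P_TR, E_P, ts_rts, tc_rts) / R
print(f"normalized throughput: basic {s_basic:.3f}, RTS/CTS {s_rts:.3f}")
```

Dividing by R gives the throughput normalized to the 54 Mbps PHY rate, which is the quantity the figures report.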
Figure 4 shows that, compared to a pure CSMA/CA system, the proposed hybrid MAC scheme achieves up to 30% performance gain. It also shows that for the parameters used in the simulation, the maximum throughput is achieved when each subchannel is assigned to around 4 stations. This is due to the use of the specific contention window sizes given in Table 1. Using the model developed in this section, one can devise an optimization scheme that maximizes the throughput by adjusting the values of the contention window sizes, as well as the number of subchannels. Further analysis of Figure 4 shows that by dynamically adjusting the number of subchannels, we can always maintain a higher throughput than CSMA/CA-based schemes for WLANs with more than 4 stations. With fewer than 4 stations, pure CSMA schemes perform well, due to lower overhead.

To better understand the effect of the number of subchannels on the throughput of the proposed MAC, we have set up two experiments that allowed us to observe the normalized throughput (based on the presumed 54 Mbps PHY transmission rate) versus the number of subchannels
and the number of stations in a 3D graph. In the first experiment, we measured the total MAC throughput in saturation mode. The results depicted in Figure 5 show that the maximum throughput is achieved when the number of stations is almost 4 times the number of subchannels, for the set of parameters defined in Table 1. The figure also shows that throughput performance is not very sensitive to the number of subchannels when a high number of stations are accessing the channel. This can simplify the task of finding an optimum number of subchannels.

To see the effect of increasing the load in the network, we set up another experiment with 32 stations and observed the normalized throughput for different numbers of subchannels, as the network load increased. The traffic sources in this experiment were Poisson sources generating 2000-byte packets (the Poisson model is only important for nonsaturation cases). The experiment results shown in Figure 6 indicate that at low loads the performance of the MAC is more or less the same for any number of subchannels. However, when the load nears 35% of the PHY bit rate, the cases with a larger number of stations assigned to each subchannel experience more performance degradation. The case with 64 subchannels also performs poorly since there are only 32 stations and half the subchannels are wasted. As in the previous experiment, the best performance is achieved when 8–32 subchannels are used.

Figure 6: Normalized throughput versus load and number of subchannels: 32 stations throughput versus offered load. (Axes: load as a fraction of the PHY rate, number of subchannels, normalized throughput.)

5. Conclusion

The MAC protocol proposed in this article combines OFDMA with CSMA/CA mechanisms and significantly increases the performance and utilization efficiency of a WLAN in the MAC layer. Our results indicate that our protocol works best when the ratio of stations to channels is about 4 or 8. Below and above these ratios, performance tends to degrade. That may become a bottleneck if the AP can only offer a few channels, say 4. From a practical perspective, one does not expect more than 20 to 30 stations to be associated with the same AP, making our scheme suitable for most practical WLANs.

We have selected OFDMA as the basis of our system due to its several advantages over other systems. Compared to CDMA systems, OFDMA can combat fading with less complexity. OFDMA can also achieve higher spectral efficiency. In comparison to TDMA-based systems, our system has a simpler random access scheme that does not require synchronization, and it is suitable for a combination of data and multimedia traffic.

We presented the fundamental regulations and requirements of the proposed OFDMA-based MAC and developed a model for saturation throughput analysis. This model can be used for dynamically adjusting the number of subchannels and subchannel assignment to achieve optimal performance. Devising more complex analytical models and optimization algorithms for the OFDMA-based MAC is an interesting area of research. Another research subject that can be based on the proposed MAC is the design of scheduling algorithms for the contention free phase of the MAC operation. Such schedulers are required for QoS provisioning in the MAC layer.

Acknowledgment

This research was supported by a research grant from Qatar Foundation, under the National Priorities Research Program (NPRP) Grant 26-6-7-10.

References

[1] “Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) specifications,” ANSI/IEEE Std 802.11: 1999 (E) Part 11, ISO/IEC 8802-11, 1999.
[2] Amendment to IEEE Standard 802.11e, “Medium Access Control (MAC) Quality of Service (QoS) Enhancements,” July 2005.
[3] “Broadband Radio Access Networks (BRAN); HIPERLAN Type 2 Functional Specification; Data Link Control (DLC) Layer; Part 1: Basic Data Transport Function,” ETSI Report TR101761-1, v.1.1.1, ETSI, Nice, France, April 2000.
[4] G. Bianchi, “Performance analysis of the IEEE 802.11 distributed coordination function,” IEEE Journal on Selected Areas in Communications, vol. 18, no. 3, pp. 535–547, 2000.
[5] J. C.-I. Chuang and N. Sollenberger, “Beyond 3G: wideband wireless data access based on OFDM and dynamic packet assignment,” IEEE Communications Magazine, vol. 38, no. 7, pp. 78–87, 2000.
[6] C. Y. Wong, R. S. Cheng, K. B. Letaief, and R. D. Murch, “Multiuser OFDM with adaptive subcarrier, bit, and power allocation,” IEEE Journal on Selected Areas in Communications, vol. 17, no. 10, pp. 1747–1758, 1999.
[7] “IEEE Standard for Local and Metropolitan Area Networks Part 16: Air Interface for Fixed Broadband Wireless Access Systems,” ANSI/IEEE Std 802.16-2004 (Revision of IEEE Std 802.16-2001).
[8] A. Jamalipour, T. Wada, and T. Yamazato, “A tutorial on multiple access technologies for beyond 3G mobile networks,” IEEE Communications Magazine, vol. 43, no. 2, pp. 110–117, 2005.
[9] F. Cuomo, A. Baiocchi, and R. Cautelier, “A MAC protocol for a wireless LAN based on OFDM-CDMA,” IEEE Communications Magazine, vol. 38, no. 9, pp. 152–159, 2000.
[10] Z. Zhang and Y. Liu, “Multichannel Aloha data networks for personal communications services (PCS),” in Proceedings of IEEE Global Telecommunications Conference (GLOBECOM ’92), vol. 1, pp. 21–25, Orlando, Fla, USA, December 1992.
[11] Y.-J. Choi, S. Park, and S. Bahk, “Multichannel random access in OFDMA wireless networks,” IEEE Journal on Selected Areas in Communications, vol. 24, no. 3, pp. 603–613, 2006.
[12] X. Wang and W. Xiang, “An OFDM-TDMA/SA MAC protocol with QoS constraints for broadband wireless LANs,” Wireless Networks, vol. 12, no. 2, pp. 159–170, 2006.
[13] H. Yaghoobi, “Scalable OFDMA Physical Layer in IEEE 802.16 WirelessMAN,” Intel Technology Journal, vol. 8, no. 3, pp. 201–212, 2004.
[14] Y. P. Fallah, Per-session weighted fair scheduling for real time multimedia in multi-rate Wireless Local Area Networks, Ph.D. thesis, University of British Columbia, Vancouver, Canada, March 2007.
[15] Y. P. Fallah and H. M. Alnuweiri, “Modeling and performance evaluation of frame bursting in wireless LANs,” in Proceedings of the International Wireless Communications and Mobile Computing Conference (IWCMC ’06), pp. 869–874, ACM, Vancouver, Canada, July 2006.
[16] J. W. Robinson and T. S. Randhawa, “Saturation throughput analysis of IEEE 802.11e enhanced distributed coordination function,” IEEE Journal on Selected Areas in Communications, vol. 22, no. 5, pp. 917–928, 2004.
doi:10.1155/2009/341689
Research Article
Multiuser Resource Allocation Maximizing the Perceived Quality
Andreas Saul and Gunther Auer

DOCOMO Euro-Labs, Landsberger Str. 312, 80687 Munich, Germany
Correspondence should be addressed to Andreas Saul, saul@docomolab-euro.com
Received 1 August 2008; Accepted 24 January 2009
Recommended by Thomas Michael Bohnert
Multiuser resource allocation for time/frequency slotted wireless communication systems is addressed. A framework for
application driven cross-layer optimization (CLO) between the application (APP) layer and medium access control (MAC) layer
is developed. The objective is to maximize the user-perceived quality by jointly optimizing the rate of the information bit-stream
served by the APP layer and the adaptive resource assignment on the MAC layer. Assuming adaptive transmission with long-term
channel state information at the transmitter (CSIT), we present a novel CLO algorithm that substantially reduces the amount of
parameters to be exchanged between optimizer and layers. The proposed CLO framework supports user priorities where premium
users perceive a superior service quality and have a higher chance to be served than ordinary users.
Copyright © 2009 A. Saul and G. Auer. This is an open access article distributed under the Creative Commons Attribution License,
1. Introduction

With the high envisaged data rates of beyond 3rd generation (B3G) wireless communication systems [1, 2], multimedia broadband applications can be offered to mobile users. Multimedia applications are characterized by a multitude of data rate and quality of service (QoS) requirements. On the other hand, owing to the nature of the mobile radio channel, frequency selective fading, distance dependent path loss, and shadowing cause vast variations in the attainable spectral efficiency per user. The objective of multiuser resource allocation is to assign the available resources over the shared wireless medium to mobile users running different applications [3].

Orthogonal frequency division multiple access (OFDMA) provides orthogonal transmission slots in time and frequency, which may be flexibly assigned to the individual users [4, 5]. In B3G systems, this feature is exploited by the medium access control (MAC) layer to freely distribute the available bandwidth between users [6]. Provided channel state information at the transmitter (CSIT) is available, the number of transmitted information bits per slot can be adjusted to the channel conditions of a particular user.

The application (APP) layer outputs encoded applications, for example, a video stream. For the scalable video coding (SVC) extension [7, 8] of the advanced video coding (AVC) standard H.264/MPEG-4 AVC the stream may be received with a variable information bit rate. Other kinds of video streams may be encoded or transcoded [9] with the desired data rate. In general, any application may be delivered with a variable information bit rate, allowing user-perceived quality to be traded against data rate.

The high level of flexibility and adaptability offered by emerging system architectures provides an opportunity for dynamic allocation of resources across users and applications, to increase the network resource usage and to enhance the user satisfaction. This effectively requires interaction between system layers, a paradigm known as cross-layer design [10–12]. For the multiuser resource allocation problem at hand, a global cross-layer optimization (CLO) problem is formulated: maximize the user-perceived quality by tuning the served data rate on the APP layer jointly with the adaptive resource assignment on the MAC layer. Application-driven CLO has been studied for systems supporting one single type of applications [11, 13, 14] as well as for various application classes [15].

Several publications [15–17] consider a logarithmic relation between utility metric and data rate, which may result in a concave optimization problem. A more realistic utility metric, measuring the user-perceived quality, is given by the concept of mean opinion score (MOS) [18]. In [15],
a framework is established that allows the MOS to be formulated mathematically for multiple applications like voice, video streaming, and file download. The resulting nonconcave optimization problem may be approximated, for example, with a greedy algorithm that maximizes the sum of the MOSs for all users [19].

In this paper, the optimum multiuser resource allocation supporting multiple applications is derived in closed form for the case of adaptive transmission with long-term CSIT, assuming a logarithmic relation between utility metric and data rate. Interestingly, the cross-layer optimization problem is shown to become independent of the channel conditions but is entirely determined by the application characteristics, provided that the offered data rate at the APP layer is matched to the adaptive transmission parameters in the MAC layer. For the special case where all users share the same application class, it turns out that the overall perceived quality is maximized when all users are allocated the same bandwidth, which corresponds to equal resource sharing. This implies that users with good channel conditions transmit with higher rate and therefore enjoy better QoS, as adaptive transmission is more bandwidth efficient in this case. This is in sharp contrast to conventional approaches for QoS provisioning that assume a fixed target rate per user [3–5], where users with poor channel conditions are allocated more bandwidth, so that all receivers perceive the same QoS.

The theoretical analysis serves as a basis for a novel CLO algorithm that allows for a more realistic utility function that is based on the MOS. The proposed algorithm for the underlying nonconcave optimization problem is easy to implement and exhibits significantly lower complexity than the generic solutions in [19, 20]. Moreover, priority classes can be supported in the way that premium users perceive superior service quality and are more likely to be served, even under poor channel conditions. The proposed framework can also cater for additional constraints, such as a guaranteed minimum perceived quality for all users.

The developed CLO framework for application driven multiuser resource allocation is evaluated by mathematical and numerical analysis. We elaborate for which application classes CLO attains the most significant gains, and the origin of these gains is identified. Furthermore, the computational cost and the overhead due to exchange of CLO related parameters between layers is studied. It is demonstrated that the overhead of the proposed CLO framework grows only linearly with the number of users and available slots, which compares to an exponentially growing overhead for conventional techniques [11, 12, 21, 22]. This is particularly relevant to B3G systems with their high degree of freedom for resource allocation, due to the large number of served users and available slots.

The remainder of this paper is structured as follows. Section 2 provides an overview of the considered multiuser downlink with focus on MAC and APP layers. Section 3 introduces the CLO framework and the flow of exchanged parameters between layers and optimizer. In Section 4, the optimum multiuser resource allocation strategy is derived, assuming idealized application characteristics. The proposed CLO framework for the more realistic nonconcave optimization problem is established in Section 5, and its performance is evaluated by computer simulations in Section 6.

2. System Overview

A wireless downlink shared by $K$ users is considered. An application server is transferring multimedia applications via core network and base station to mobile users. There are $K$ applications, which, without loss of generality, generate $K$ bit-streams, associated with $K$ different users.

2.1. Link and Physical Layer. In the considered shared wireless downlink the resources are divided into slots occupying a given bandwidth and time, which can be flexibly allocated to users. A scenario where mobile users travel with potentially high velocities is considered. The high dynamics of the time varying channel prohibit the utilization of instantaneous CSIT. However, long-term CSIT that includes distance dependent path loss and log-normal shadowing is assumed to be available. As the long-term CSIT is constant over the whole frequency band, multiuser scheduling corresponds to the well known packet-based generalized processor sharing (PGPS) [23]. A PGPS scheduler aims to assign slots to user $k$ proportionally to a coefficient $\alpha_k$, which serves as input parameter for the scheduler, as illustrated in Figure 1.

Figure 1: Packet-based generalized processor sharing (PGPS). (Example shares: user 1, $\alpha_1$ = 40%; user 2, $\alpha_2$ = 40%; user 3, $\alpha_3$ = 20%.)

The long-term CSIT makes it possible to extract the average signal-to-noise ratio (SNR) for user $k$, which is used to select an appropriate modulation and coding scheme for the respective user. The spectral efficiency of the selected symbol mapping and coding scheme for user $k$ is denoted by $\eta_k$ in [bit/s/Hz]. Denote the number of symbols per slot by $n_{slot}$; the number of transmitted information bits per slot for user $k$ amounts to $\eta_k n_{slot}$. Given that user $k$ is assigned all available slots $N_{slot}$ exclusively, the maximum achievable data rate is $R_{max,k} = N_{slot} n_{slot} \eta_k$. The actual data rate provided to user $k$ by the PGPS scheduler is then given by

$$R_k = \alpha_k R_{max,k} = \alpha_k N_{slot} n_{slot} \eta_k. \tag{1a}$$

Additionally, the constraints

$$0 \le \alpha_k \le 1 \;\; \forall k \in \mathcal{K}, \qquad \sum_{k \in \mathcal{K}} \alpha_k = 1 \tag{1b}$$

need to be fulfilled with $\mathcal{K} = \{1, \ldots, K\}$ being the set of all users; that is, the amount of assigned resources cannot be negative and the sum of all assigned resources equals the available resources.
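
The link model in (1a) and (1b) is simple enough to sketch directly. The Python snippet below checks the constraints on the shares and computes the maximum and actual data rates per user. The shares are the 40/40/20 split shown in Figure 1; the spectral efficiencies, the number of symbols per slot, and the number of slots are illustrative assumptions, and `pgps_rates` is a hypothetical helper name.

```python
def pgps_rates(alpha, eta, n_slot, n_slots_total):
    """Return (R_max, R) per user following (1a), after checking (1b)."""
    if len(alpha) != len(eta):
        raise ValueError("alpha and eta must have one entry per user")
    if any(a < 0.0 or a > 1.0 for a in alpha):
        raise ValueError("each share alpha_k must lie in [0, 1]")
    if abs(sum(alpha) - 1.0) > 1e-9:
        raise ValueError("the shares alpha_k must sum to 1")
    # R_max,k = N_slot * n_slot * eta_k: user k gets every slot.
    r_max = [n_slots_total * n_slot * e for e in eta]
    # R_k = alpha_k * R_max,k: user k gets a fraction alpha_k of the slots.
    rates = [a * r for a, r in zip(alpha, r_max)]
    return r_max, rates


ALPHA = [0.4, 0.4, 0.2]        # shares from Figure 1
ETA = [1.0, 2.0, 4.0]          # assumed spectral efficiencies per user
N_SYMBOLS_PER_SLOT = 100       # assumed n_slot
N_SLOTS = 50                   # assumed N_slot

r_max, rates = pgps_rates(ALPHA, ETA, N_SYMBOLS_PER_SLOT, N_SLOTS)
for k, (rm, r) in enumerate(zip(r_max, rates), start=1):
    print(f"user {k}: R_max = {rm:.0f}, R = {r:.0f}")
```

Note that the optimizer only needs the vector of maximum rates to emulate any rate combination, since every other quantity follows from the chosen shares.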
2.2. Application Layer. The objective MOS is recommended as utility metric for voice transmission by the ITU-T [18] as a measure for the user satisfaction. Practically, the MOS may take values between 1 (not acceptable) and 4.5 (very satisfied). In [15], the MOS is extended to other services like video streaming, file download, and web browsing. The obtained mathematical model of the user-perceived quality can be used as universal utility metric for CLO, allowing for joint optimization of different application classes.

The application characteristic is mainly influenced by data rate and packet losses, described by the applications’ rate-loss distortion [24]. In this paper, the perceived quality is exclusively expressed as a function of the data rate $R_k$, while packet losses are not considered as an explicit parameter. While this conveniently simplifies the analysis, this choice requires some further motivation, since certain kinds of source encoded bit-streams are sensitive to packet losses [11].

Packet losses may be caused by transmission errors over the mobile radio channel or by system overload. Regarding the wireless channel, the link layer may compensate for packet losses by means of adaptive modulation and channel coding in combination with automatic repeat request (ARQ). While link adaptation ensures that transmission errors occur with low probability, low latency retransmissions of erroneous packets within the link layer [6] maintain reliable delivery of packets, at the expense of a certain rate reduction.

In an overloaded scenario, the offered load by the APP layer exceeds the capacity of the wireless link. Such an overload scenario can be effectively avoided by a fine grained adjustment of the offered data rate at the APP layer so as to match the capacity of the wireless link.

For instance, in case of video streaming, transcoding [9] or using the SVC extension of H.264/MPEG-4 AVC [7, 8] allows the data rate to be varied with rather fine granularity. As packets can be dropped at either the application server or the base station, a low latency rate adaption mechanism is feasible at the same physical location as the scheduler in the MAC layer, effectively allowing perceived quality to be expressed by data rate.

Moreover, the possibility to selectively drop packets offers one further opportunity to adjust the data rate. Likewise, for file downloads the data rate can also be adjusted in arbitrarily small steps. Hence, it is reasonable to assume that the application data rates can be adjusted continuously.

2.2.1. Video Streaming. We choose video streaming as one relevant example of an application class. In [25], a simple concave rate-distortion model is proposed for H.264/MPEG-4 AVC that relates the data rate of a video stream to the peak signal-to-noise ratio (PSNR):

$$\mathrm{PSNR}_k\,[\mathrm{dB}] = a + b \sqrt{\frac{R_k}{c}} \left( 1 - \frac{c}{R_k} \right). \tag{2}$$

The parameters $a$, $b$, and $c$ characterize a specific video stream or sequence, which is source encoded with rate $R_k$. These parameters may be determined by matching the distortion-rate model to the measured bit stream of a video.

According to [15, 26], the relationship between PSNR and MOS may be approximated by the bounded logarithmic function:

$$\mathrm{MOS}_k(\mathrm{PSNR}_k) = \begin{cases} 1 & : \mathrm{PSNR}_k \le \mathrm{PSNR}_{1.0}, \\ d \log \mathrm{PSNR}_k + e & : \mathrm{PSNR}_{1.0} < \mathrm{PSNR}_k < \mathrm{PSNR}_{4.5}, \\ 4.5 & : \mathrm{PSNR}_k \ge \mathrm{PSNR}_{4.5}, \end{cases} \tag{3a}$$

with

$$d = \frac{3.5}{\log \mathrm{PSNR}_{4.5} - \log \mathrm{PSNR}_{1.0}}, \qquad e = \frac{\log \mathrm{PSNR}_{4.5} - 4.5 \log \mathrm{PSNR}_{1.0}}{\log \mathrm{PSNR}_{4.5} - \log \mathrm{PSNR}_{1.0}}. \tag{3b}$$

The parameters $\mathrm{PSNR}_{1.0}$ and $\mathrm{PSNR}_{4.5}$ denote the PSNR at which the perceived quality drops to “not acceptable” (MOS = 1.0) and exceeds “very satisfied” (MOS = 4.5), respectively.

The rate-distortion characteristic of a video typically varies over time, which means that the parameters $a$, $b$, and $c$ are time variant. For example, during a scene cut a higher data rate is required to maintain a certain quality. As an example, Figure 2 shows the rate-MOS model for $\mathrm{PSNR}_{1.0}$ = 30 dB and $\mathrm{PSNR}_{4.5}$ = 42 dB of the well known “Foreman” video. The 9 different curves correspond to different parts of the video of 1 second duration each.

Figure 2: Time variant application characteristic of “Foreman” video stream. (Mean opinion score versus data rate in bit/s, logarithmic rate axis.)

3. Application-Driven Cross-Layer Optimization

Cross-layer design implies that additional parameters are to be exchanged between link and APP layers, denoted as control information. Figure 3 illustrates the system architecture including the flow of control information. In the following, the architecture, functional blocks, and variables depicted in Figure 3 are described.
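
The bounded logarithmic PSNR-to-MOS mapping of (3a) and (3b) can be sketched in a few lines of Python. The default thresholds are the 30 dB and 42 dB values quoted for the “Foreman” example; the function name `psnr_to_mos` is hypothetical, and the natural logarithm is used here on the understanding that the choice of base cancels out in (3a) once d and e are computed with the same base.

```python
import math


def psnr_to_mos(psnr_db: float, psnr_1_0: float = 30.0, psnr_4_5: float = 42.0) -> float:
    """Map a PSNR value in dB to a MOS in [1, 4.5] following (3a) and (3b)."""
    if psnr_db <= psnr_1_0:
        return 1.0            # "not acceptable" floor
    if psnr_db >= psnr_4_5:
        return 4.5            # "very satisfied" ceiling
    span = math.log(psnr_4_5) - math.log(psnr_1_0)
    d = 3.5 / span
    e = (math.log(psnr_4_5) - 4.5 * math.log(psnr_1_0)) / span
    return d * math.log(psnr_db) + e


for psnr in (28.0, 30.0, 36.0, 42.0, 45.0):
    print(f"PSNR = {psnr:4.1f} dB -> MOS = {psnr_to_mos(psnr):.2f}")
```

The coefficients d and e are chosen so that the logarithmic segment meets the two bounds continuously, which is easy to confirm by evaluating the function just inside each threshold.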
Figure 3: Control information processing and flow. (Labels recoverable from the block diagram: application parameters U, adaptive applications, application models, application server, optimizer, core network, link model, cross-layer optimizer, adaptive scheduler, modulation, data rate estimation, base station; signals R, MOS, Ropt, α, αopt, Rmax, and data.)

Figure 4: Visualization of operating modes. (Local utility metric versus parameter value; legend: model (proposed), operating point (conventional), operating mode.)

Figure 5: Considered generic application characteristic for one example application class. (Mean opinion score versus data rate, with R1.0 and R4.5 marked on the rate axis.)

3.1. Layer Model. A major challenge in cross-layer design is the abstraction of parameters exchanged as control information. In order to limit the amount of control information, we introduce a layer model at the optimizer that emulates the relevant characteristics of the layer. The parameters of the layer model are determined at the corresponding layer, and only these parameters are sent as control information to the optimizer. The optimizer then tunes the model so as to identify the operating modes that maximize the chosen utility, which are then fed back to the system layers.

Figure 4 demonstrates the difference between the proposed model-based approach, and conventional parameter abstraction based on operating modes (crosses) and points (circles) [11, 12, 21, 22]. The X-axis indicates the choice of one parameter $a_1$, and the Y-axis indicates the corresponding utility metric $u = f(a_1, a_2, \ldots)$. Depending on the choice of $a_1$ and further parameters $a_2, \ldots$ that cannot be determined from Figure 4 different operating modes of the utility metric are achieved.

For instance, applied to a video stream the local utility $f$ could be the PSNR or MOS, and according to (2) the parameters $a_1, \ldots$ might represent source coding parameters such as the chosen codec, the frame rate, and the data rate $R_k$. As a second example, applied to the PHY layer the local utility might be the sum throughput of all users, and $a_1, \ldots$ are parameters such as the channel coefficients or the velocity of the mobile terminal.

Following the conventional idea of parameter exchange, an intralayer optimization might deliver the subset of operating modes that maximize the utility function $u$, called efficient set in [22], also known as Pareto frontier. These operating modes are the crosses being located on the curve in Figure 4. A subset of operating modes is selected as operating points (circles). These are provided to the optimizer, which performs CLO by choosing the overall best operating point.

The proposed layer model is the curve in Figure 4, which represents an approximation of the utility metric $u = f(a_1, a_2, \ldots)$ as a continuous function. As demonstrated in the following the proposed parameter abstraction by a layer model exhibits a significant advantage for multiuser resource allocation, due to the potentially large number of available slots.

3.1.1. Link Layer Model. For conventional CLO the parameters that are provided to the optimizer are the set of possible data rates for all users $\{R_k\}$ in (1). Considering an OFDMA-based B3G air interface with a large number of available slots, a prohibitive set of possible data rates is obtained. Instead of offering a set of discrete values to the optimizer, the link layer model defines the shares of the available resources per user, $\alpha_k \in [0, 1]$ in (1), as continuous functions. The factors $\alpha_k$ allow the optimizer to tune the link layer model. Then, according to (1) an arbitrary number of data rate combinations $R_1, \ldots, R_K$ can be emulated at the optimizer. The only required parameters at the optimizer are the set of $K$ parameters $\{R_{max,k}\}$. Hence, the link layer model for the optimizer is fully determined by (1). Once the optimizer has found an optimum set of coefficients $\{\alpha_{opt,k}\}$, these are fed back to the link layer.
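
To give a sense of scale for the "prohibitive set" of data rates, the Python sketch below evaluates the three parameter-count expressions listed in the paper's Table 1 (Number of exchanged parameters) for the two settings given there, (Nslot = 52, K = 2) and (Nslot = 8, K = 8). The expressions are transcribed from that table as we read it, and the function names are hypothetical.

```python
from math import comb


def params_all_schedules(k: int, n_slot: int) -> int:
    """K data rates for each of the K**n_slot slot assignments, plus one."""
    return k ** (n_slot + 1) + 1


def params_distinct_rate_schedules(k: int, n_slot: int) -> int:
    """K data rates for each distinct split of n_slot slots among K users, plus one.

    comb(k + n_slot - 1, n_slot) equals (K + Nslot - 1)! / (Nslot! (K - 1)!).
    """
    return comb(k + n_slot - 1, n_slot) * k + 1


def params_model_based(k: int) -> int:
    """Model-based proposal: grows linearly in the number of users."""
    return 2 * k - 1


for n_slot, k in ((52, 2), (8, 8)):
    print(
        f"Nslot={n_slot:2d}, K={k}: "
        f"all schedules {params_all_schedules(k, n_slot):.1e}, "
        f"distinct rates {params_distinct_rate_schedules(k, n_slot):.1e}, "
        f"model-based {params_model_based(k)}"
    )
```

Even for these small examples the gap between the conventional counts and the model-based count spans several orders of magnitude.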
3.1.2. Application Layer Model. The considered generic application characteristic resembles a bounded logarithmic relation between perceived quality and data rate as illustrated in Figure 5, described by the MOS as a function of the data rate $R_k$ of user $k \in \mathcal{K}$

$$\mathrm{MOS}_k(R_k) = \begin{cases} 1 & : R_k \le R_{1.0,k}, \\ \mathrm{MOS}_{0,k} \log \dfrac{R_k}{R_{0,k}} & : R_{1.0,k} < R_k < R_{4.5,k}, \\ 4.5 & : R_k \ge R_{4.5,k}, \end{cases} \tag{4a}$$

with

$$\mathrm{MOS}_{0,k} = \frac{3.5}{\log \left( R_{4.5,k} / R_{1.0,k} \right)}, \tag{4b}$$

$$R_{0,k} = R_{1.0,k} \left( \frac{R_{1.0,k}}{R_{4.5,k}} \right)^{1/3.5}, \tag{4c}$$

$$0 \le R_{1.0,k} < R_{4.5,k} \quad \forall k \in \mathcal{K}. \tag{4d}$$

The semilogarithmic plot of Figure 5 visualizes the related parameters: the parameter $\mathrm{MOS}_{0,k}$ determines the slope of $\mathrm{MOS}_k(R_k)$ while $R_{0,k}$ shifts the curve along the X-axis.

Each user’s application characteristic can be parametrized by only two parameters, $\{R_{1.0,k}, R_{4.5,k}\}$, or alternatively $\{\mathrm{MOS}_{0,k}, R_{0,k}\}$. The optimizer then tunes the model by maximizing the user-perceived quality and returns the set of optimum user data rates to the APP layer.

3.2. Parameter Exchange

3.2.1. System Description. Figure 3 shows a block diagram of the considered CLO framework and illustrates the signal flow of the exchanged control information between optimizer and layers. In order to formally describe the proposed model-based method of parameter exchange and optimization, we define the vector

$$\mathbf{R}_{max} = \left( R_{max,1}, \ldots, R_{max,K} \right)^{T} \tag{5}$$

containing the maximum data rates of all users, the vector

$$\boldsymbol{\alpha} = \left( \alpha_1, \ldots, \alpha_K \right)^{T} \tag{6}$$

containing the optimization coefficients, the vector

$$\mathbf{R} = \left( R_1, \ldots, R_K \right)^{T} \tag{7}$$

containing the actual data rates of all users, and the vector

$$\mathbf{U} = \left( U_1, \ldots, U_K \right)^{T}. \tag{8}$$

The parameter $U_k$ describes the application characteristic for user $k$, which is $R_{1.0,k}$ and $R_{4.5,k}$ for the APP layer model from Section 3.1.2. In addition more detailed information about the applications in a real system may also be contained in $U_k$.

The link layer model described in Section 3.1.1 is defined by the vector function $\mathbf{f}_L = (f_{L,1}, \ldots, f_{L,K})^{T}$ with elements

$$f_{L,k} : \left( \alpha_k, R_{max,k} \right) \longrightarrow R_k = f_{L,k}(\alpha_k), \tag{9}$$

which is given by (1). This means that based on the optimization coefficients $\boldsymbol{\alpha}$, which reflect the resource allocation on the link layer, the achievable data rates $\mathbf{R}$ of the users are determined.

The application layer models detailed in Section 3.1.2, $\mathbf{f}_A = (f_{A,1}, \ldots, f_{A,K})^{T}$, are defined by the relationship

$$f_{A,k} : \left( U_k, R_k \right) \longrightarrow \mathrm{MOS}_k = f_{A,k}(R_k). \tag{10}$$

That means for each application $k$ there is a corresponding application model $f_{A,k}$ available at the optimizer. The application model establishes a relationship between the data rate $R_k$ and a utility metric. As common utility metric the mean opinion score $\mathrm{MOS}_k$ is used, defined by the vector

$$\mathbf{MOS} = \left( \mathrm{MOS}_1, \ldots, \mathrm{MOS}_K \right)^{T} \tag{11}$$

containing the MOS of all users, which according to Figure 3 is delivered to the optimizer.

The optimizer uses a utility function

$$f_O : \left( f_{A,1}, \ldots, f_{A,K} \right) \longrightarrow f_O\left( f_{A,1}, \ldots, f_{A,K} \right) \tag{12}$$

providing a relationship between applications. The utility function should be symmetric regarding a permutation of its arguments and monotonic for each argument. We decide to maximize the sum of the MOSs of all applications and choose the utility function

$$f_O\left( f_{A,1}, \ldots, f_{A,K} \right) = \sum_{k \in \mathcal{K}} f_{A,k}. \tag{13}$$

Using this utility function, the optimization problem

$$\arg\max_{\{\alpha_1, \ldots, \alpha_K\}} f_O\left( f_{A,1}\left( f_{L,1}(\alpha_1) \right), \ldots, f_{A,K}\left( f_{L,K}(\alpha_K) \right) \right) \tag{14a}$$

subject to

$$0 \le \alpha_k \;\; \forall k \in \mathcal{K}, \qquad \sum_{k \in \mathcal{K}} \alpha_k = 1 \tag{14b}$$

is to be solved, which delivers $\boldsymbol{\alpha}_{opt}$ and via (1) also $\mathbf{R}_{opt}$. The optimizer outputs the resource assignments $\boldsymbol{\alpha}_{opt}$ and rate allocation $\mathbf{R}_{opt}$ to the MAC and APP layer, respectively.

3.2.2. Required Overhead. Reviewing the exchanged parameters, we notice that the vectors $\mathbf{R}_{max}$ and $\boldsymbol{\alpha}$ contain only long-term information. No instantaneous CSIT, power allocation, modulation, or schedules have to be exchanged between PHY/MAC layer and the optimizer. Likewise the APP layer model specified in Section 3.1.2 is determined by only two parameters that are slowly time varying. This has the advantage that the system is less sensitive to delays caused by parameter exchange between layers and the optimizer. Robustness against delays is of importance for CLO as base station and application server are most likely located at different physical locations so that control information is to be exchanged over the core network.

If the principles of conventional CLO systems [21] are applied to our case, all considered schedules have to be
transmitted from the link layer to the optimizer. For each schedule at least the K data rates that the users achieve are transmitted. For Nslot slots there are

$$K^{N_{\mathrm{slot}}} \tag{15}$$

permutations (each representing one possible schedule). However, since a PGPS scheduler does not utilize channel knowledge, all slots may be considered equally. The scheduler’s task is to assign K users to Nslot slots (which means to find all combinations of K elements, Nslot at a time) whereas one user may be scheduled in multiple slots (repetitions are allowed). Hence, the actual number of schedules is smaller than (15) and is given by [27]

$$\binom{K + N_{\mathrm{slot}} - 1}{N_{\mathrm{slot}}} = \frac{(K + N_{\mathrm{slot}} - 1)!}{N_{\mathrm{slot}}!\,(K - 1)!}. \tag{16}$$

This means that for the conventional system [21]

$$K\,\frac{(K + N_{\mathrm{slot}} - 1)!}{N_{\mathrm{slot}}!\,(K - 1)!} \tag{17}$$

data rate values have to be transmitted to the optimizer and one value is fed back as the chosen schedule.

Table 1: Number of exchanged parameters.

Number of slots Nslot                                                                   52        8
Number of users K                                                                       2         8
Exchanged parameters for:
  all possible schedules                      K^(Nslot+1) + 1                           9.0e15    1.3e8
  only schedules with different data rates    K (K + Nslot − 1)! / (Nslot! (K − 1)!) + 1    107       5.1e4
  model-based proposal                        2K − 1                                    3         15

Table 1 shows some numerical examples for the number of exchanged parameters. Although conventional CLO attains a significant reduction of exchanged parameters by intralayer optimization, which allows considering only a subset of schedules (16), the control information overhead may still be prohibitive for a high number of users and slots. In contrast, the proposed parameter abstraction needs to transmit only K data rates from the link layer to the optimizer, while K − 1 values are fed back. Of particular advantage is the fact that the control information overhead is independent of the number of slots Nslot.

application model (4) the optimization problem (14) can be formulated as follows:

$$\boldsymbol{\alpha}_{\mathrm{opt}} = \arg\max_{\boldsymbol{\alpha}} \sum_{k \in \mathcal{K}} \mathrm{MOS}_k(R_k) = \arg\max_{\boldsymbol{\alpha}} \sum_{k \in \mathcal{K}} \mathrm{MOS}_k(\alpha_k R_{\max,k}) \tag{18a}$$

subject to

$$0 \le \alpha_k \;\; \forall k \in \mathcal{K}, \qquad \sum_{k \in \mathcal{K}} \alpha_k = 1. \tag{18b}$$

As the above optimization problem is neither convex nor concave, we first define an idealized utility that produces a concave optimization problem.

4.2. Unbounded Application Characteristic. Removing the bounds in the application model (4) results in an unbounded logarithmic relation between utility metric and data rate. The unbounded optimization problem is formulated as:

$$\boldsymbol{\alpha}_{\mathrm{opt}} = \arg\max_{\boldsymbol{\alpha}} \sum_{k \in \mathcal{K}} \mathrm{MOS}_{0,k} \log\frac{\alpha_k R_{\max,k}}{R_{0,k}} \tag{19a}$$

subject to

$$0 \le \alpha_k \;\; \forall k \in \mathcal{K}, \tag{19b}$$

$$\sum_{k \in \mathcal{K}} \alpha_k = 1. \tag{19c}$$

Because log(αk Rmax,k / R0,k) = log αk + log(Rmax,k / R0,k) and the second term does not depend on α, the optimization (19a) can be simplified as:

$$\boldsymbol{\alpha}_{\mathrm{opt}} = \arg\max_{\boldsymbol{\alpha}} \sum_{k \in \mathcal{K}} \mathrm{MOS}_{0,k} \log \alpha_k = \arg\max_{\boldsymbol{\alpha}} f(\boldsymbol{\alpha}, \mathbf{MOS}_0) \tag{20}$$

with the equivalent utility function

$$f(\boldsymbol{\alpha}, \mathbf{MOS}_0) \triangleq \sum_{k \in \mathcal{K}} \mathrm{MOS}_{0,k} \log \alpha_k. \tag{21}$$

The vector MOS0 ≜ (MOS0,1, . . . , MOS0,K)^T contains coefficients that characterize the K applications as defined in (4b). Note that f(α, MOS0) and, hence, the solution of the unbounded optimization problem is independent of the physical radio channel, characterized by Rmax,k, and only depends on MOS0, which is determined by the ratio between R1.0,k and R4.5,k.
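The entries of Table 1 can be reproduced from expressions (15) to (17) and the model-based count 2K − 1. The following is a minimal sketch; the function names are hypothetical, and the "+ 1" term stands for the single value fed back as the chosen schedule.

```python
from math import comb


def values_all_schedules(num_users, num_slots):
    """K data rates for each of the K**Nslot schedules (15), plus one fed-back value."""
    return num_users * num_users ** num_slots + 1


def values_distinct_rate_schedules(num_users, num_slots):
    """Expression (17): K data rates per schedule counted in (16), plus one fed-back value."""
    return num_users * comb(num_users + num_slots - 1, num_slots) + 1


def values_model_based(num_users):
    """K long-term data rates to the optimizer and K - 1 coefficients fed back."""
    return 2 * num_users - 1


# The two example columns of Table 1: (K, Nslot) = (2, 52) and (8, 8).
for k, n in [(2, 52), (8, 8)]:
    print(k, n, values_all_schedules(k, n),
          values_distinct_rate_schedules(k, n), values_model_based(k))
```

Only the model-based count is independent of the number of slots, which is the point made in the text above.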
4. Optimum Resource Assignment

Based on the model-based CLO framework the optimum resource allocation assuming an idealized utility is derived in closed form in this section. The mathematical analysis is the basis of an optimization algorithm presented in Section 5, which maximizes a more realistic utility.

4.1. Problem Statement. The objective is to maximize the sum MOS of all users. With the specific link model (1) and

For finding a closed form solution of the optimum resource assignment αopt in (19), in the following we prove the concavity of the optimization problem, derive the optimum share of resources between two users, and find a solution for the absolute resource share of a user.

Reformulating the constraint (19c) as:

$$\alpha_\ell = 1 - \sum_{k \in \mathcal{K},\, k \neq \ell} \alpha_k, \qquad \ell \in \mathcal{K} \tag{22}$$
and inserting the result into (21) yields

$$f(\boldsymbol{\alpha}, \mathbf{MOS}_0) = \sum_{k \in \mathcal{K},\, k \neq \ell} \mathrm{MOS}_{0,k} \log \alpha_k + \mathrm{MOS}_{0,\ell} \log\Big(1 - \sum_{n \in \mathcal{K},\, n \neq \ell} \alpha_n\Big). \tag{23}$$

Now, the first and second partial derivatives in directions of αk and αm can be determined,

$$\left.\frac{\partial f}{\partial \alpha_k}\right|_{k \neq \ell} = \frac{\mathrm{MOS}_{0,k}}{\alpha_k} - \frac{\mathrm{MOS}_{0,\ell}}{1 - \sum_{n \in \mathcal{K},\, n \neq \ell} \alpha_n}, \tag{24}$$

$$\left.\frac{\partial^2 f}{\partial \alpha_k^2}\right|_{k \neq \ell} = -\frac{\mathrm{MOS}_{0,k}}{\alpha_k^2} - \frac{\mathrm{MOS}_{0,\ell}}{\big(1 - \sum_{n \in \mathcal{K},\, n \neq \ell} \alpha_n\big)^2}, \tag{25}$$

$$\left.\frac{\partial^2 f}{\partial \alpha_k\, \partial \alpha_m}\right|_{k,m \neq \ell,\; k \neq m} = -\frac{\mathrm{MOS}_{0,\ell}}{\big(1 - \sum_{n \in \mathcal{K},\, n \neq \ell} \alpha_n\big)^2}. \tag{26}$$

Considering (4b) and (4d), it follows that MOS0,k > 0 ∀k ∈ K so that

$$\frac{\partial^2 f}{\partial \alpha_k^2} < 0 \quad \forall k \in \mathcal{K} \tag{27}$$

and

$$\frac{\partial^2 f}{\partial \alpha_k\, \partial \alpha_m} < 0 \quad \forall k \in \mathcal{K}. \tag{28}$$

This means that the graph is strictly concave downwards and any extremum not being located on the domain borders maximizes the utility. Therefore, provided for all k ∈ K the following condition is satisfied

$$\left.\frac{\partial f}{\partial \alpha_k}\right|_{k \neq \ell} = 0, \tag{29}$$

the global maximum is found. Setting (24) to zero yields

$$\Big(\frac{\mathrm{MOS}_{0,\ell}}{\mathrm{MOS}_{0,k}} + 1\Big)\alpha_k = 1 - \sum_{n \in \mathcal{K},\, n \neq \ell, k} \alpha_n. \tag{30}$$

Likewise, the optimum share for user ℓ, αℓ, when αk is fixed, is determined by differentiating (24) with respect to αℓ and setting the result to zero, which corresponds to swapping users k and ℓ in (30). By combining the result with (30) the dependency on other users n ≠ k, ℓ disappears. This means that the relation between the optimum resource assignments of any two users, k and ℓ, is independent of all other users’ utility functions. After some algebraic manipulations the relation

$$\alpha_k = \frac{\mathrm{MOS}_{0,k}}{\mathrm{MOS}_{0,\ell}}\,\alpha_\ell \tag{31}$$

between the optimization coefficients of users k and ℓ is obtained.

For finding an absolute value for the optimization coefficients α the relation (31) is inserted into the constraint (19c), which yields

$$\alpha_\ell = \frac{\mathrm{MOS}_{0,\ell}}{\sum_{k \in \mathcal{K}} \mathrm{MOS}_{0,k}} \tag{32}$$

as the final solution of the unbounded optimization problem (19).

As a special case it can be easily seen from (32) that if all users have the same parameter MOS0,k, then the resources are distributed equally to the users,

$$\mathrm{MOS}_{0,1} = \cdots = \mathrm{MOS}_{0,K} \;\Longrightarrow\; \alpha_k = \frac{1}{K} \quad \forall k \in \mathcal{K}. \tag{33}$$

Interestingly, given that all users use the same application, the optimum resource allocation for the unbounded problem results in an equal resource scheduler where all users are assigned the same number of slots. This implies that users experiencing a good channel receive higher data rates and therefore enjoy better QoS, as adaptive transmission is more bandwidth efficient in this case.

In summary, the optimum resource allocation for the unbounded optimization problem (32) is independent of the channel conditions; the number of assigned slots (the allocated bandwidth) is exclusively determined by the application characteristics; users with a good channel enjoy higher data rates. On the other hand, all users are given a fair share of the available resources. This is in sharp contrast to a maximum throughput scheduler, which exclusively serves good users while users experiencing a poor channel starve for resources. The significance of this finding is that the maximized utility in (19) is an idealized measure of user-perceived quality.

4.3. Subset of Users. For solving the bounded optimization problem (18), it is useful to solve the unbounded problem only for a subset of “variable” users Kvar ⊆ K. The remaining users Kfix = K \ Kvar have fixed optimization coefficients αk and are not subject to optimization. Here, the notation K \ Kvar denotes the relative complement of set Kvar in set K.

The constraint (19c) is rewritten as

$$\sum_{k \in \mathcal{K}_{\mathrm{var}}} \alpha_k = 1 - \sum_{m \in \mathcal{K}_{\mathrm{fix}}} \alpha_m. \tag{34}$$

Following the derivation in Section 4.2, inserting (31) gives

$$\sum_{k \in \mathcal{K}_{\mathrm{var}}} \frac{\mathrm{MOS}_{0,k}}{\mathrm{MOS}_{0,\ell}}\,\alpha_\ell = 1 - \sum_{m \in \mathcal{K}_{\mathrm{fix}}} \alpha_m, \tag{35}$$

which finally yields

$$\alpha_\ell = \Big(1 - \sum_{m \in \mathcal{K}_{\mathrm{fix}}} \alpha_m\Big) \frac{\mathrm{MOS}_{0,\ell}}{\sum_{k \in \mathcal{K}_{\mathrm{var}}} \mathrm{MOS}_{0,k}}. \tag{36}$$
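The closed-form results (32) and (36) are simple enough to evaluate directly. The sketch below is illustrative only: the function names and the dictionary used to hold the fixed coefficients are assumptions, and all MOS0,k values are taken to be positive as required by the derivation.

```python
def unbounded_allocation(mos0):
    """Shares according to (32): alpha_l = MOS0_l / sum over all users of MOS0_k."""
    total = sum(mos0)
    return [m / total for m in mos0]


def subset_allocation(mos0, fixed):
    """Shares according to (36) for the variable users.

    mos0  -- list with MOS0_k of all users (assumed positive)
    fixed -- dict mapping the index of each user in K_fix to its fixed alpha
    """
    remaining = 1.0 - sum(fixed.values())
    var_total = sum(m for k, m in enumerate(mos0) if k not in fixed)
    return [fixed[k] if k in fixed else remaining * m / var_total
            for k, m in enumerate(mos0)]


# Equal MOS0 for all users gives equal shares, as stated in (33).
print(unbounded_allocation([2.0, 2.0, 2.0, 2.0]))
```

Calling `subset_allocation` with an empty `fixed` dictionary reduces to `unbounded_allocation`, mirroring how (36) reduces to (32) when Kfix is empty.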
5. Optimization Algorithm Maximizing the User-Perceived Quality

Based on the analytical solution for the unbounded problem in Section 4, an optimization algorithm for the bounded problem (18) is presented in this section. In an intermediate step a solution for the upper bounded problem is derived, where the application characteristic MOSk(Rk) is upper bounded at an MOS of 4.5. Then the solution of the bounded problem is developed, and its computational complexity is assessed. Finally, the proposed CLO algorithm is extended to support different priority classes.

5.1. Upper Bounded Problem. We define the upper bounded application characteristic by

$$\mathrm{MOS}^u_k(R_k) = \begin{cases} \mathrm{MOS}_{0,k} \log\dfrac{R_k}{R_{0,k}} & : R_k < R_{4.5,k}, \\[1ex] 4.5 & : R_k \ge R_{4.5,k}, \end{cases} \tag{37}$$

which gives the upper bounded optimization problem

$$\arg\max_{\boldsymbol{\alpha}} \sum_{k \in \mathcal{K}} \mathrm{MOS}^u_k(\alpha_k R_{\max,k}) \tag{38a}$$

subject to

$$0 \le \alpha_k \;\; \forall k \in \mathcal{K}, \qquad \sum_{k \in \mathcal{K}} \alpha_k = 1. \tag{38b}$$

Let Ropt,k = αopt,k Rmax,k denote the optimum rate allocation of user k of the unbounded problem (32). In case Ropt,k > R4.5,k, the rate for user k may be reduced to R4.5,k without sacrificing service quality, and the retained resources can be given to users with Ropt,ℓ < R4.5,ℓ, ℓ ≠ k. A solution of this concave problem is found by the iterative algorithm:

Step 1. Initially, Kfix = ∅ and Kvar = K.

Step 2. Solve unbounded problem (36).

Step 3. Users with Ropt,k ≥ R4.5,k are moved from Kvar to Kfix and set αk = R4.5,k / Rmax,k.

Step 4. If any user has been moved in Step 3, continue with Step 2, otherwise stop.

If any of the application characteristics deviates from (4), Step 2 can be replaced by a conventional algorithm that solves the unbounded problem. Alternatively, appropriate values for R1.0,k and R4.5,k can be chosen to approximate the real application characteristic, giving rise to a certain deviation from the exact solution. Optionally, this approximation could be used as a starting point for an applicable conventional algorithm.

remaining users the upper bounded optimization problem from Section 5.1 is solved. In case dropped users are selected appropriately in the first step, the remaining served users will always achieve data rates Rk > R1.0,k so that the solution for the bounded problem is optimum.

The iterative algorithm for the solution of the bounded problem is formulated as follows.

Step 1. Initially, all users are served.

Step 2. Drop users as detailed in Steps 2.1–2.4.

Step 2.1. If stop criterion is fulfilled, continue with Step 3.

Step 2.2. Solve upper bounded problem for the served users as described in Section 5.1.

Step 2.3. User $k_{\mathrm{drop}} = \arg\max_k \sum_{k',\, k' \neq k} \mathrm{MOS}^u_{k'}$ is dropped by setting αkdrop = 0.

Step 2.4. Continue with Step 2.1.

Step 3. Solve upper bounded problem for the served users as described in Section 5.1 and stop.

In this algorithm the stop criterion determines how many users are served. When the objective is to maximize the sum of all users’ MOS, referred to as “increase sum MOS”, an appropriate strategy is to continue dropping users until this does not further improve the sum MOS.

An alternative stop criterion is to check

$$\alpha_k - \alpha_{\mathrm{stop},k} > 0 \quad \forall k, \tag{39a}$$

where

$$\alpha_{\mathrm{stop},k} \triangleq \alpha_k \,\big|\, \mathrm{MOS}_k(\alpha_k) = \mathrm{MOS}_{\mathrm{stop},k}. \tag{39b}$$

This condition checks whether the MOS that would be achieved with the allocated resources αk exceeds a certain minimum MOSstop,k ∈ [1, 4.5]. Setting MOSstop,k = 1 ∀k ∈ K ensures that only a minimum of users are dropped, while no resources are wasted on users that would anyhow experience unacceptable service quality of MOSk(αk) = 1. On the other hand, higher values of MOSstop,k enforce a certain minimum perceived quality. This variant of the algorithm is therefore termed “reduce outage”.

As the above discussion touches upon the issue of admission control, other criteria that determine which users are admitted to the system might be introduced. For example, in a cellular system it might be desirable to give priority to users that hand over from a neighboring cell rather than to serve a user who wishes to enter the network.
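The iteration for the upper bounded problem of Section 5.1 (start with Kfix empty, solve (36), cap users that reach R4.5,k, repeat until nobody moves) can be sketched as follows. This is one possible reading of Steps 1 to 4, not the authors' implementation; the function name and the list-based interface are assumptions, and all inputs are taken to be positive.

```python
def upper_bounded_allocation(mos0, r45, rmax):
    """Return the shares alpha_k after iterating Steps 1-4 of Section 5.1.

    mos0 -- MOS0_k per user, r45 -- R4.5_k per user, rmax -- Rmax_k per user
    """
    num_users = len(mos0)
    fixed = {}  # Step 1: K_fix is empty, all users are variable
    while True:
        variable = [k for k in range(num_users) if k not in fixed]
        if not variable:
            # Every user is capped at R4.5; some resources may stay unused.
            return [fixed[k] for k in range(num_users)]
        # Step 2: closed-form solution (36) for the variable users
        remaining = 1.0 - sum(fixed.values())
        var_total = sum(mos0[k] for k in variable)
        alpha = {k: remaining * mos0[k] / var_total for k in variable}
        # Step 3: users whose rate reaches R4.5 are moved to K_fix
        moved = [k for k in variable if alpha[k] * rmax[k] >= r45[k]]
        if not moved:
            # Step 4: nobody moved, so the current allocation is final
            return [fixed[k] if k in fixed else alpha[k]
                    for k in range(num_users)]
        for k in moved:
            fixed[k] = r45[k] / rmax[k]


# Two users with equal MOS0: user 0 needs only 10% of the resources to
# reach R4.5, so the freed share goes to user 1.
print(upper_bounded_allocation([1.0, 1.0], [1.0, 10.0], [10.0, 10.0]))
```

Each pass either stops or moves at least one user to Kfix, so the loop runs at most as many times as there are users.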
5.2. Bounded Problem. We approach the bounded optimization problem (18) by dividing it into two subproblems: first, a subset of users is determined who cannot be served and therefore get no resources, αk = 0; second, for the

5.3. Computational Complexity. An appealing feature is that the proposed optimization algorithm deterministically terminates after a certain time. To prove this the worst case run time is calculated in the following. Since in each iteration at least one user is dropped, there are at most K iterations in the outer loop. The inner loop computes the solution of the upper bounded problem. In the worst case, only one user is moved from Kvar to Kfix per inner iteration, so that the number of inner iterations at most equals the number of served users. Summing over the outer iterations, K + (K − 1) + · · · + 1, the total number of iterations is therefore upper bounded by K(1 + K)/2.

An observation from the simulation results in Section 6 is that typically most users can transmit. Hence, the number of iterations for the outer loop is likely to be significantly smaller than K. Likewise, trials suggest that for the inner loop it is rather unlikely that more than two iterations are required. Since the essential calculation within the inner loop is given by the closed form expression (36), the total complexity of the optimization algorithm is low.

5.4. Priority Classes. In order to support different priority classes, the utility function is adjusted in the following. Let λk ∈ R be a real number that reflects the priority of user k where, without loss of generality, λk > λℓ indicates that user k has a higher priority than user ℓ. Priority classes are incorporated into the utility function by substituting the application dependent constant MOS0,k in (19) by the function gk(MOS0,k, λk), that is,

$$\sum_{k \in \mathcal{K}} g_k(\mathrm{MOS}_{0,k}, \lambda_k) \log\frac{\alpha_k R_{\max,k}}{R_{0,k}}. \tag{40}$$

In the calculation of the first and second partial derivatives in direction of αk and αm in (24), (25), and (26), MOS0,k is treated as a constant. Therefore, the derivation of the unbounded optimization problem in Section 4.2 also applies to the priority function gk(MOS0,k, λk), if the following condition holds

$$\frac{\partial g_k(\mathrm{MOS}_{0,k}, \lambda_k)}{\partial \alpha_\ell} = 0 \quad \forall \{k, \ell\} \in \mathcal{K}^2. \tag{41}$$

Likewise, (4b) and (4d) strictly require a positive constant MOS0,k, which translates to

$$g_k(\mathrm{MOS}_{0,k}, \lambda_k) > 0 \quad \forall k \in \mathcal{K}. \tag{42}$$

Under these conditions, the conclusions from Section 4.2 apply: the utility function that supports priority classes (40) is strictly concave downwards, and the underlying optimization problem is solved by substituting MOS0,k with gk(MOS0,k, λk) in (31), (32), and (36).

An intuitive realization of a priority function that satisfies the constraints (41) and (42) is given by

$$g_k(\mathrm{MOS}_{0,k}, \lambda_k) = \lambda_k\, \mathrm{MOS}_{0,k}, \qquad \lambda_k > 0 \;\; \forall k \in \mathcal{K}, \tag{43}$$

which is similar to the approach described in [19]. This function is applied for obtaining the numerical results presented in Section 6.5.

There are several possibilities how to further incorporate priority classes, for example, by adjusting the upper bound of the upper bounded optimization problem, the stop criterion or by using an alternative criterion for dropping users.

6. Performance Evaluation

The performance of the proposed CLO framework is evaluated by means of system simulations. The link layer parameters listed in Table 2 mostly follow the WINNER (World Wireless Initiative New Radio, URL: www.ist-winner.org) system concept [2].

Table 2: Link layer parameters.

Transmission scheme          OFDMA
Number of subcarriers        N = 416
Cyclic prefix duration       3.2 μs
Symbol mapping               BPSK, 4-, 16-, 64-QAM
Channel coding               Convol., Rc ∈ {1/4, 1/3, 1/2, 9/16, 2/3, 3/4}
Channel bandwidth            B = 16.25 MHz
Channel model                WINNER urban macro-cell [28]
Duplex ratio DL/UL           1/1
Cell radius                  50 · · · 500 m
Shadowing                    log-normal, σs = 8 dB
Path loss                    38.4 dB + 35.0 dB log10(d/m)
Center frequency             f0 = 5.25 GHz
Transmit power               10 W
Antenna gain                 8 dBi
Noise figure                 7 dB
Noise spectrum density       −174 dBm/Hz
Delay spread                 τds = 313 ns
Maximum Doppler speed        v = 50 km/h
Slot size (freq. × time)     8 × 12
Number of users              K = 1, . . . , 64
Number of available slots    Nslot = 52
Scheduler                    PGPS

6.1. Simulation Setup. We consider an OFDMA downlink that occupies a bandwidth of B = 16.25 MHz. Due to the inherent orthogonality of orthogonal frequency division multiplexing (OFDM), each subcarrier in each OFDM symbol may be assigned to a different user without causing interference, so that users can be scheduled independently in time and frequency. Adjacent subcarriers and OFDM symbols are correlated and, therefore, experience a similar channel gain. In order to limit the signaling overhead 8 × 12 symbols are grouped to form one slot.

The WINNER typical urban macrocell channel (model C2 [28]) is used, which models channel attenuation due to frequency selective fading, distance dependent path loss and log-normal shadowing [29]. Instantaneous channel variations due to velocities of mobile users are generated using Jakes’ model [30]. The channel model is implemented such that the average SNR always allows transmission with the lowest supported modulation and coding scheme. This is motivated by the fact that users with lower SNR would not be able to establish a connection to the base station and, hence, cannot request to be served. While the average SNR
always exceeds the given limit, the instantaneous SNR may be significantly lower due to frequency selective fading.

Mobile velocities up to v = 50 km/h are assumed, which implies that instantaneous CSIT may not be available. It is assumed that the average SNR over all simultaneously transmitted slots is available for link adaptation. Hence, the same modulation and coding scheme is applied to all subcarriers of one user during one slot duration. However, slots assigned to different users will typically use a different modulation and coding scheme.

The transmitter chooses the symbol mapping with cardinality M and code rate Rc of a convolutional code, based on the average SNR of each user k (see Figure 6). Note that due to half-duplex transmission the average data rate is only half of the instantaneous data rates indicated in Figure 6. The modulation and coding scheme is selected that achieves the largest spectral efficiency ηk = Rc log2 M at a frame error rate (FER) of 10^−2. The SNR values for which FER = 10^−2 are determined by reference simulations and are stored in a look-up table. It is assumed that an ARQ protocol at the link layer takes care of error events by retransmitting erroneously received packets. Due to the low occurrence of errors at FER = 10^−2 retransmissions only have marginal impact on the throughput and will therefore not affect the perceived quality. Hence, simulations assume that packets are always received error free.

Figure 6: Adaptive modulation: relation between instantaneous data rate and signal-to-noise ratio (SNR). [Plot: data rate Rmax,k (Mbit/s) versus signal-to-noise ratio (dB), −5 to 30 dB; steps labeled by modulation and code rate, from BPSK up to 64QAM with Rc = 3/4.]

For CLO the long-term average data rate Rmax,k = ηk Nslot for each user k indicates the link capacity and is the relevant abstraction of the link layer. Figure 7 shows the cumulative distribution function (CDF) of Rmax,k, which is averaged over a large number of randomly chosen channel realizations and user locations within a cell.

Figure 7: CDF of maximum data rate Rmax,k, which characterizes the communications channel on the link layer. [Plot: cumulative distribution versus data rate Rmax,k (bit/s), logarithmic axes.]

Simulations are executed as follows: every 100 milliseconds independent snapshots of path loss and shadowing realizations are generated for each user according to a uniform user distribution within the cell area. Then Rmax is estimated and passed to the optimizer. CLO is performed to determine the optimum share of resources αopt, which is subsequently fed back to the PGPS scheduler at the MAC layer.

After the 100-millisecond snapshot, the actually achieved average data rates are determined. The actually achieved data rates may deviate from the optimizer’s estimate Rmax. Each user’s MOS is determined based on the user’s application and the achieved data rate. Then, the CDF of the MOS averaged over all users is calculated.

6.2. Performance of Different Optimization Algorithms. In Figure 8, the CDF of the MOS is shown for the different resource allocation strategies and optimizer variants discussed in Section 5. The applications of all K = 16 users are described by the same parameters R1.0 = 100 kbit/s and R4.5 = 1 Mbit/s (compare Figure 5). As a reference equal resource allocation with αk = 1/16 for all 16 users is also plotted, which is the optimum resource assignment of the unbounded optimization problem (19) (see Section 4.2). Greedy resource allocation [19], as a conventional technique for solving optimization problems, is also included for comparison. From our experience the Greedy algorithm is significantly more computationally expensive than the proposed CLO algorithm. The other two curves show the performance of the proposed algorithm, the “increase sum MOS,” and the “reduce outage” variants, where the stop criterion is set to MOSstop,k = 1 ∀k ∈ K.

As seen in Figure 8, both variants outperform equal resource allocation and achieve a comparable average MOS as greedy resource allocation. Compared to equal resource allocation, any performance improvement of the considered optimization algorithms is due to the bounds in the MOS trajectory, since users with Rk = Rmax,k/16 > R4.5 perceive the same QoS as if they were served with the reduced rate Rk = R4.5. Likewise, users with Rk < R1.0 perceive the same QoS as a user who is not served at all. The “reduce outage”
variant serves practically all users, although some perceive a poor service quality. In contrast, the “increase sum MOS” variant tends to drop users with poor quality and assigns the freed resources to served users. This is due to the objective, which aims to maximize the sum MOS of all users: a user will be dropped, if the increase in MOS of the served users outweighs the decrease in MOS of dropping a certain user.

Figure 8: CDF of perceived quality for different optimization algorithms. [Plot: horizontal axis mean opinion score, 1 to 4.5. Legend: equal resources, MOS = 3.818; reduce outage, MOS = 4.02; greedy allocation, MOS = 4.024; increase sum MOS, MOS = 4.033.]

6.3. Deviation due to Application Model Abstraction. In Section 6.2, the application is characterized by the idealized bounded logarithmic relationship (4), so that the APP layer model at the optimizer perfectly matches the application characteristics. In order to assess the benefits of CLO in a real system with real applications running, a video streaming example is chosen where the user-perceived quality is approximated as described in Section 2.2.1. Eight different H.264/MPEG-4 AVC videos in common intermediate format (CIF) resolution at 30 Hz frame rate are cut into snippets containing one group of pictures (GOP) each. With a GOP size of 32 frames the snippets contain approximately 1 second of video. Further parameters of the videos are summarized in Tables 3 and 4. The snippets are subsequently analyzed to extract the parameters a, b, and c for each snippet.

In order to assess the effect of rate variations of the video stream over time, for each 100-millisecond PHY channel snapshot a new (random) snippet of the respective video stream is used. For the proposed optimization algorithm from Section 5 the parameters R1.0,k and R4.5,k are estimated by the application server for each video snippet and provided to the CLO. Because the optimization algorithm is based on the bounded logarithmic relationship (4), which deviates from the actually used video model (3), the decided resource distribution will be suboptimum. For comparison CLO with greedy optimization using the exact video model (3) is also simulated.

Table 3: Video parameters.

Video coding              H.264/MPEG-4 AVC [7]
Implementation            JSVM 9.12.2, 25 April 2008 [31]
Resolution                CIF (352 × 288)
Frame rate                30 Hz
Chroma subsampling        4:2:0
GOP size                  32
GOP coding structure      I-P-· · ·-P
PSNR range                PSNR1.0 = 30 dB, PSNR4.5 = 42 dB

Table 4: Transmitted video streams.

Video name    Duration (GOP)    Average desired data rate R4.5    Ratio R4.5/R1.0
Foreman       9                 2,156 kbit/s                      18
Mother        9                 447 kbit/s                        26
News          9                 638 kbit/s                        11
Container     9                 1,159 kbit/s                      22
Salesman      9                 2,265 kbit/s                      40
Bus           4                 4,141 kbit/s                      7
City          9                 2,202 kbit/s                      13
Crew          9                 2,677 kbit/s                      15

As seen in Figure 9, for the considered real video streams similar conclusions as for the generic applications from Section 6.2 can be drawn. The considered optimizers exhibit similar performance, achieving a significantly superior MOS with respect to equal resource allocation.

6.4. Guaranteed Service Quality. It may be desirable to support the demand for minimum QoS. This may be accomplished by tuning the parameter MOSstop of the stop criterion in the “reduce outage” variant of the proposed optimization algorithm. As the stop criterion controls which users are dropped from the list of active users (see Section 5.2), setting MOSstop to a value in the range [1, 4.5] ensures that all served users achieve at least a minimum perceived quality of MOSstop.

Figure 10 shows the CDF of the achieved sum MOS for MOSstop = 2.0 and MOSstop = 3.0. The higher MOSstop, the fewer users achieve the required data rates due to poor channel conditions, and the more users are therefore not served. On the other hand, the served users with better channels benefit from freed resources of the dropped users, which improves their perceived quality.

Figure 11 shows the MOS, averaged over all users and channel realizations, against MOSstop. The choice of MOSstop affects the overall perceived quality and the maximum is approached for MOSstop ≈ 2. In case MOSstop < 2, users with poor channels are served, which have only a marginal contribution to the overall sum MOS. On the other hand, if MOSstop > 2, an increasing number of users are denied service, which cannot be compensated by the enhanced QoS
of the remaining active users. The perceived quality achieved by the “increase sum MOS” variant, which approximates the maximum sum MOS, is also indicated in Figure 10.

Figure 9: CDF of the perceived quality for video streaming with nonconcave application characteristic. [Plot: horizontal axis mean opinion score, 1 to 4.5. Legend: reduce outage, MOS = 4.072; greedy allocation, MOS = 4.079; increase sum MOS, MOS = 4.089.]

Figure 10: CDF of the perceived quality for different minimum MOS constraints MOSstop. For comparison the “increase sum MOS” variant is also included. [Plot: horizontal axis mean opinion score, 1 to 4.5. Legend: MOS > 3; increase sum MOS; MOS > 2.]

Figure 11: Average MOS as a function of the minimum MOS constraints MOSstop. [Plot: average mean opinion score versus guaranteed mean opinion score MOSstop, 1 to 4.5. Legend: increase sum MOS bound; reduce outage.]

6.5. Traffic Priority Classes. The performance of CLO supporting different traffic priority classes developed in Section 5.4 is examined in Figure 12. The K = 16 users, all running the same applications, are split up into two priority groups of 8 users each; premium and ordinary users are given a priority of λk = 2 and λk = 1, respectively.

Figure 12 shows the CDF of the sum MOS. Premium users exhibit a significantly better MOS than ordinary users and are more likely to be served.

Figure 12: CDF of the perceived quality for ordinary and premium traffic. [Plot: horizontal axis mean opinion score, 1 to 4.5. Legend: ordinary users, MOS = 3.77; no priorities, MOS = 4.033; premium users, MOS = 4.21.]

6.6. Application Characteristic. In order to identify for which application characteristics CLO is most effective, different generic application classes are examined, characterized by their relationship between data rate and perceived quality as described by the parameters R1.0 and R4.5 (see Figure 5), for the “increase sum MOS” variant of the proposed optimization algorithm. The application characteristics are the same for all users and R4.5 = 10R1.0 is chosen. Figure 13 shows the average MOS against the required data rate for a maximum perceived quality of R4.5, for a system with K = 16 users.

As seen from Figure 13, the attainable gains of CLO maximizing the sum MOS (solid lines) over equal resource
allocation (dashed lines) are dependent on both R4.5 and the ratio R4.5/R1.0. For low data rate requirements the CLO gain diminishes, as there is an excess of available resources to serve all users with excellent quality MOS = 4.5. For increasing data rate requirements the CLO gain depends on the ratio R4.5/R1.0, in the way that the CLO gain increases with decreasing R4.5/R1.0. This is explained by the fact that for an increasing ratio R4.5/R1.0 the MOS characteristic as a function of the data rate, MOSk(Rk) in (4), approaches the unbounded problem addressed in Section 4.2, for which according to (33) equal resource allocation is optimum. In other words, the attainable CLO gains over equal resource allocation with αk = 1/K are due to users whose rates Rk = Rmax,k/K are outside the logarithmic range of MOSk(Rk). As the logarithmic range is specified by the ratio R4.5/R1.0, the lower R4.5/R1.0 the higher the gains to be achieved by optimization.

Figure 13: Impact of rate-distortion characteristic on the average MOS. Solid and dashed lines show results for the proposed CLO and equal resource allocation, respectively. [Plot: mean opinion score versus demanded data rate R4.5 (bit/s). Legend: R4.5/R1.0 = 10; R4.5/R1.0 = 100.]

6.7. Mixed Service Classes. In Figures 14–16 a scenario with two user groups is investigated. Each of the two user groups runs applications of a different service class, characterized by different data rate requirements, as illustrated in Figure 14. Low- and high-rate users request a minimum data rate R1.0 = Rlow and R1.0 = Rhigh, respectively.

Figure 14: Example characteristic for two user groups running different application classes. [Plot: mean opinion score versus data rate, with Rlow and Rhigh marked; curves for low-rate users and high-rate users.]

Figure 15 shows the CLO gain in sum MOS relative to equal resource allocation against the number of users in each group. Results are plotted for different values of Rhigh and Rlow, for a total number of K = 16 users and R4.5/R1.0 = 10. Interestingly, in some cases the overall MOS gain for scenarios with mixed service classes exceeds the case when all users are within either of the service classes. This is due to the freed resources by replacing a high-rate user by a less demanding low-rate user, which allows the remaining users to fetch some of the freed resources.

Figure 15: Gain in average MOS for different ratios of low-rate and high-rate users. [Plot: mean opinion score gain versus number of high-rate users, 0 to 16. Legend: Rlow = 2 × 10^4 bit/s, Rhigh = 2 × 10^5 bit/s; Rlow = 6 × 10^4 bit/s, Rhigh = 1.8 × 10^5 bit/s; Rlow = 2 × 10^4 bit/s, Rhigh = 6 × 10^4 bit/s.]

Figure 16: CDF of the perceived quality for two application classes. Both high- and low-rate users benefit from CLO. [Plot: horizontal axis mean opinion score, 1 to 4.5. Curves for high-rate users and low-rate users, each with equal resources and with optimization.]

The relationship between low- and high-rate users is further investigated in Figure 16, which shows the CDF of the sum MOS for both user groups. Corresponding to the
maximum in Figure 15, there are 4 users with Rlow = 6 × 10^4 bit/s and 12 users with Rhigh = 1.8 × 10^5 bit/s. An appealing observation is that both user groups gain from CLO. While the average gain is ΔMOS = 0.22, low- and high-rate users gain ΔMOS = 0.10 and ΔMOS = 0.26 in overall perceived quality, respectively.

Figure 17: Overall gain achieved by CLO in terms of number of served users K. Solid and dashed lines correspond to the CLO variant “increase sum MOS” and equal resource allocation, respectively. [Plot: mean opinion score versus number of users, 10 to 80. Legend: Scenario 1; Scenario 2.]

7. Conclusion

Resource allocation with QoS constraints where multiple users share a wireless downlink is one key challenge in the design of future wireless systems. The MOS is chosen as a universal utility metric for the user-perceived quality for CLO between link and APP layer.

Adaptive transmission based on long-term CSIT over a time and frequency selective fading channel is considered, including distance dependent path loss and log-normal shadowing. Applications are described by a rate-distortion characteristic, expressed by the MOS. With these settings a model-based CLO framework is devised, which emulates the functionalities of the system layers within the optimizer. Compared to known CLO approaches significantly fewer parameters need to be exchanged. Simulations of a video streaming scenario confirm that model mismatch, where the APP layer model at the optimizer is not perfectly matched to the actual application, only results in modest performance degradation.

As a metric for the user satisfaction we chose to maximize the sum MOS, which resulted in a nonconcave optimization problem. Given an idealized utility metric with an unbounded logarithmic relation between perceived quality and data rate, a concave problem is retained, so that the optimum resource allocation is derived in closed form.

One noteworthy result of the analysis is that the optimum solution is independent of the physical channels and is solely described by the application characteristics.

The theoretical findings are the basis for a low complexity and easy to implement CLO algorithm for the more realistic
nonconcave optimization problem. The proposed iterative
optimization algorithm is significantly less complex than
6.8. System Performance. In order to assess the attainable known optimization algorithms and has the appealing
MOS gains from a system level perspective, the average feature to deterministically terminate.
sum MOS is plotted against the number of users K in The proposed algorithm offers an additional degree
Figure 17. Two scenarios are investigated. In scenario 1, of freedom to the network operator to configure its own
there are two groups with equal number of users, where policies, such as enhancing user satisfaction, ensuring a
low- and high-rate users request the rate Rlow = 2 × minimum perceived quality to all users, or to operate the
104 bit/s and Rhigh = 2 × 105 bit/s, respectively. It can wireless system with higher load so as to maximize revenue.
be deduced from Figure 17 that CLO maximizing the sum Furthermore, different priority classes can be supported.
MOS (solid lines) increases the number of users being The attainable gains of CLO strongly depend on the
served with the same average perceived quality by more than application characteristics. The higher the sensitivity of the
60%, compared to equal resource allocation (dashed lines). perceived quality to changes of the data rate, the more
In scenario 2, all users run the same application with a considerable the gains that can be achieved. Dependent
desired data rate of R4.5 = 6 × 105 bit/s and R4.5 /R1.0 = on the application more than 60%, additional users can
100, which achieves a comparably small MOS gain of at be served without sacrificing user satisfaction. If multiple
most ΔMOS = 0.11, as reported in Section 6.6 for K = service classes with different application characteristic are
16 users. In scenario 2, CLO also enables to serve more running simultaneously, all users can be expected to benefit
users with the same perceived quality, although in this from CLO. In some cases additional CLO gains that exploit a
case the gains diminish for increasing number of users. certain mix of service classes are observed.
In line with the discussion in Section 6.6, for scenario 2
gains of CLO over equal resource allocation are mainly Acknowledgment
in the region where the sum MOS is high, since then
users, whose rate Rk = Rmax,k /K is outside the logarithmic This paper was presented in part at the IEEE Int. Conf. on
range of MOSk (Rk ) in (4), are more likely. Otherwise, Communications (ICC’2007), Glasgow, UK, at the IEEE Int.
equal resource allocation tends to approach the optimum Symp. on Wireless Communication Systems (ISWCS’2007),
resource allocation strategy, leading to diminishing CLO Trondheim, Norway, and at the IEEE Vehicular Technology
gains. Conference (VTC’2008 Spring), Singapore.
References of the IEEE International Conference on Communications (ICC

’07), pp. 4560–4565, Glasgow, Scotland, June 2007.
[1] 3GPP TS 36.211 V8.5.0 Release 8, “3rd Generation Partnership [18] ITU-T Recommendation P.800, “Methods for subjective deter-
Project (3GPP); Evolved Universal Terrestrial Radio Access (E- mination of transmission quality,” International Telecommu-
UTRA); Physical Channels and Modulation,” December 2008. nications Union, Geneva, Switzerland, August 1996.
[2] IST-4-027756 WINNER II, “D6.13.14 WINNER II system [19] S. Khan, S. Duhovnikov, E. Steinbach, and W. Kellerer, “MOS-
concept description,” December 2007. based multiuser multiapplication cross-layer optimization for
[3] M. Andrews, K. Kumaran, K. Ramanan, A. Stolyar, P. Whiting, mobile multimedia communication,” Advances in Multimedia,
and R. Vijayakumar, “Providing quality of service over a vol. 2007, Article ID 94918, 11 pages, 2007.
shared wireless link,” IEEE Communications Magazine, vol. 39, [20] S. P. Boyd and L. Vandenberghe, Convex Optimization, Cam-
no. 2, pp. 150–153, 2001. bridge University Press, Cambridge, UK, 1st edition, 2004.
[4] C. Y. Wong, R. S. Cheng, K. B. Letaief, and R. D. Murch, [21] L.-U. Choi, M. T. Ivrlač, E. Steinbach, and J. A. Nossek,
“Multiuser OFDM with adaptive subcarrier, bit, and power “Bottom-up approach to cross-layer design for video trans-
allocation,” IEEE Journal on Selected Areas in Communications, mission over wireless channels,” in Proceedings of the 61st IEEE
vol. 17, no. 10, pp. 1747–1758, 1999. Vehicular Technology Conference (VTC ’05), vol. 5, pp. 3019–
[5] M. Ergen, S. Coleri, and P. Varaiya, “QoS aware adaptive 3023, Stockholm, Sweden, May-June 2005.
resource allocation techniques for fair scheduling in OFDMA [22] J. Brehmer, C. Guthy, and W. Utschick, “An efficient approx-
based broadband wireless access systems,” IEEE Transactions imation of the OFDMA outage probability region,” in Pro-
on Broadcasting, vol. 49, no. 4, pp. 362–370, 2003. ceedings of the 7th Workshop on Signal Processing Advances
[6] M. Sternad, T. Svensson, T. Ottosson, A. Ahlen, A. Svensson, in Wireless Communications (SPAWC ’06), pp. 1–5, Cannes,
and A. Brunstrom, “Towards systems beyond 3G based on France, July 2006.
adaptive OFDMA transmission,” Proceedings of the IEEE, vol. [23] A. K. Parekh and R. G. Gallager, “A generalized processor
95, no. 12, pp. 2432–2455, 2007. sharing approach to flow control in integrated services
[7] ITU-T Recommendation H.264, “Advanced video coding for networks—the single node case,” in Proceedings of the 11th
generic audiovisual services,” November 2007. Annual Conference of the IEEE Computer and Communications
[8] H. Schwarz, D. Marpe, and T. Wiegand, “Overview of the Societies (INFOCOM ’92), vol. 2, pp. 915–924, Florence, Italy,
scalable video coding extension of the H.264/AVC standard,” May 1992.
IEEE Transactions on Circuits and Systems for Video Technology, [24] G. J. Sullivan and T. Wiegand, “Rate-distortion optimization
vol. 17, no. 9, pp. 1103–1120, 2007. for: video compression,” IEEE Signal Processing Magazine, vol.
[9] I. Ahmad, X. Wei, Y. Sun, and Y.-Q. Zhang, “Video transcod- 15, no. 6, pp. 74–90, 1998.
ing: an overview of various techniques and research issues,” [25] L. U. Choi, M. T. Ivrlač, E. Steinbach, and J. A. Nossek,
IEEE Transactions on Multimedia, vol. 7, no. 5, pp. 793–804, “Sequence-level models for distortion-rate behaviour of com-
2005. pressed video,” in Proceedings of the International Conference
[10] S. Shakkottai, T. S. Rappaport, and P. C. Karlsson, “Cross- on Image Processing (ICIP ’05), vol. 2, pp. 486–489, Genova,
layer design for wireless networks,” IEEE Communications Italy, September 2005.
Magazine, vol. 41, no. 10, pp. 74–80, 2003. [26] O. Nemethova, M. Ries, M. Zavodsky, and M. Rupp, “PSNR-
[11] S. Khan, Y. Peng, E. Steinbach, M. Sgroi, and W. Kellerer, based estimation of subjective time-variant video quality
“Application-driven cross-layer optimization for video for mobiles,” in Proceedings of the International Conference
streaming over wireless networks,” IEEE Communications on Measurement of Audio and Video Quality in Networks
Magazine, vol. 44, no. 1, pp. 122–130, 2006. (MESAQIN ’06), Prague, Czech Republic, June 2006.
[12] E. Setton, T. Yoo, X. Zhu, A. Goldsmith, and B. Girod, “Cross- [27] E. Kreyszig, Advanced Engineering Mathematics, John Wiley &
layer design of ad hoc networks for real-time video streaming,” Sons, New York, NY, USA, 7th edition, 1993.
IEEE Wireless Communications, vol. 12, no. 4, pp. 59–64, 2005. [28] IST-2003-507581 WINNER, “D5.4 final report on link level
[13] L.-U. Choi, W. Kellerer, and E. Steinbach, “On cross-layer and system level channel models, ver. 1.4,” November 2005.
design for streaming video delivery in multiuser wireless [29] T. S. Rappaport, Wireless Communications: Principles and Prac-
environments,” EURASIP Journal on Wireless Communications tice, Prentice-Hall, Englewood Cliffs, NJ, USA, 2nd edition,
and Networking, vol. 2006, Article ID 60349, 10 pages, 2006. 2002.
[14] M. van der Schaar and N. Sai Shankar, “Cross-layer wireless [30] W. C. Jakes, Microwave Mobile Communications, John Wiley &
multimedia transmission: challenges, principles, and new Sons, New York, NY, USA, 1974.
paradigms,” IEEE Wireless Communications, vol. 12, no. 4, pp. [31] Joint video team (JVT), “JSVM software manual, version
50–58, 2005. 9.12.2, April 25th, 2008,” Heinrich-Hertz-Institut, June
[15] S. Khan, S. Duhovnikov, E. Steinbach, M. Sgroi, and W. 2008, http://ip.hhi.de/imagecom G1/savce/downloads/SVC-
Kellerer, “Application-driven cross-layer optimization for Reference-Software.htm.
mobile multimedia communication using a common applica-
tion layer quality metric,” in Proceedings of the International
Wireless Communications and Mobile Computing Conference
(IWCMC ’06), pp. 213–218, Vancouver, Canada, July 2006.
[16] A. Sang, X. Wang, M. Madihian, and R. D. Gitlin, “A flexible
downlink scheduling scheme in cellular packet data systems,”
IEEE Transactions on Wireless Communications, vol. 5, no. 2,
pp. 568–576, 2006.
[17] X. Zhang, M. Tao, and C. S. Ng, “Time sharing policy in wire-
less networks for variable rate transmission,” in Proceedings
doi:10.1155/2009/275121
Research Article
Admission Control Threshold in Cellular Relay Networks
with Power Adjustment
Ki-Dong Lee and Byung K. Yi

Research and Standards Department, LG Electronics Mobile Research, San Diego, CA 92131, USA
Correspondence should be addressed to Ki-Dong Lee, kidonglee@lge.com
Received 2 August 2008; Revised 22 November 2008; Accepted 6 January 2009
In the cellular network with relays, the mobile station can benefit from both coverage extension and capacity enhancement.
However, the operation complexity increases as the number of relays grows. Furthermore, in the cellular network with
cooperative relays, it is even more complex because of an increased dimension of signal-to-noise ratios (SNRs) formed in
the cooperative wireless transmission links. In this paper, we propose a new method for admission capacity planning in a
cellular network using a cooperative relaying mechanism called decode-and-forward. We mathematically formulate the dropping
ratio using the randomness of “channel gain.” With this, we formulate an admission threshold planning problem as a simple
optimization problem, where we maximize the accommodation capacity (in number of connections) subject to two types of
constraints. (1) A constraint that the sum of the transmit powers of the source node and relay node is upper-bounded where both
nodes can jointly adjust the transmit power. (2) A constraint that the dropping ratio is upper-bounded by a certain threshold value.
The simplicity of the problem formulation facilitates its solution in real-time. We believe that the proposed planning method can
provide an attractive guideline for dimensioning a cellular relay network with cooperative relays.
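Copyright © 2009 K.-D. Lee and B. K. Yi. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.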
1. Introduction

It is expected that both the operational complexity and the signaling burden are increased as the number of communication nodes increases in cellular networks. However, the large number of nodes distributed over the service area may act as a relay node for other nodes so that the transmit power and the achievable rate can be improved [1]. The use of relays is considered to be one of the most attractive strategies for the next generation wireless network [2]. Also, orthogonal frequency-division multiple access (OFDMA) is one of the most promising solutions to provide a high-performance physical layer in emerging cellular networks. OFDMA is based on OFDM and inherits immunity to intersymbol interference and frequency selective fading. Recently, adaptive resource management for multiuser OFDMA systems has attracted enormous research interest [3–7]. In [3], the authors studied how to minimize the total transmission power while satisfying a minimum rate constraint for each user. The problem was formulated as an integer programming problem and a continuous-relaxation-based suboptimal solution method was studied. In [4], a class of computationally inexpensive methods for power allocation and subcarrier assignment were developed, which are shown to achieve comparable performance, but do not require intensive computation.

Specifically for data traffic, several studies have considered providing a fair opportunity for users to access a wireless system so that no user may dominate in resource occupancy while others starve. In [5], the authors proposed a fair scheduling scheme to minimize the total transmit power by allocating subcarriers to the users and then to determine the number of bits transmitted on each subcarrier. Also, they developed suboptimal solution algorithms by using the linear programming technique and the Hungarian method. A new scheme to fairly allocate subcarriers, rate, and power for multiuser OFDMA system was proposed [6], where a new generalized proportional fairness criterion, based on Nash bargaining solutions and coalitions, was used. The study in [6] is very different from the previous
OFDMA scheduling studies in the sense that the resource allocation is performed with a game-theoretic decision rule. They proposed a very fast near-optimal algorithm using the Hungarian method. They showed by simulations that their fair-scheduling scheme provides a similar overall rate to that of the rate-maximizing scheme. In [7], the authors provided achievable rate formulations from the physical layer perspective and studied algorithms using the Lagrangian multiplier technique, where they showed that their algorithms can find the global optimum even in the case that the problems are nonconvex.

Most previous work on resource allocation in OFDMA systems, however, did not consider the connection-level performance which is limited by the fluctuations in performance, for example, signal-to-noise ratio (SNR), in the lower layer. Because of the random nature of user mobility, the average channel gain of a targeted group of users (referred to simply as the average channel gain in the rest of the paper) in a cellular relay network changes over time, causing the average SNR of the user group to continuously change and fluctuate. Figure 1 presents an example where the maximum number of users has been accommodated in the best SNR case, which may cause a portion of them to be dropped if the SNR falls from that point. Since the maximum achievable transmit rate is bounded by the SNR, ongoing connections may experience outage events and, furthermore, the dropping ratio increases for any given number of connections admitted in the system. Therefore, it is necessary to take the fluctuating nature of SNR into account when planning for the admission capacity threshold value.

[Figure 1 diagram: a base station (BS), relay stations RS1, RS2, and RS3, and mobile station movements B0 → B1, S0 → S1, and U0 → U1.]
Figure 1: Three examples of channel gain change according to movement of mobile station (MS). Case 1 (B0 → B1): MS goes away from BS with RS-MS distance constant. The channel gain between RS and MS increases but it does not increase the achievable rate between MS and BS. Case 2 (S0 → S1): MS gets close to RS with BS-MS distance constant. The channel gain between BS and MS decreases and it decreases the achievable rate between BS and MS. Case 3 (U0 → U1): MS gets close to RS whereas it goes away from BS. The channel gain between RS and MS increases but the channel gain between BS and MS decreases, which finally decreases the achievable rate between BS and MS.

In this paper, more specifically, we consider admission capacity planning for cellular networks with cooperative relays [1], considering the randomness of channel gains between three types of links formed in cooperative relaying as shown in Figure 1. In cooperative relaying through decode-and-forward, the achievable rate of the link between the source node and the destination node is characterized by the stochastic channel gains of three links: source-relay, source-destination, and relay-destination. This figure depicts three exemplary cases. In Case 1, where mobile station (MS) moves from point B0 to B1, the distance between base station (BS) and MS gets longer with the distance between MS and relay station (RS) kept. The channel gain between RS and MS increases but the channel gain between BS and MS does not increase. Supposing that the current achievable rate between MS and BS (in the cooperatively formed link, but not in the direct transmission link) is upper bounded by the channel gain of BS-MS link, the movement from point B0 to point B1 will cause a reduction of the achievable rate in the cooperatively formed link between BS and MS. However, supposing that there is surplus power used in the transmitter (either of BS or of MS), the movement does not necessarily lead to a reduction in the achievable rate. In Case 2, where MS moves from point S0 to S1, the MS-RS distance gets shorter with the MS-BS distance unchanged. In Case 3, where MS moves from point U0 to U1, the MS-BS distance gets larger whereas the MS-RS distance gets shorter. For example, suppose that BS is transmitting some packets to MS. Also, suppose that the current transmit power vector is in equilibrium. Then BS needs to adjust the transmit power so as not to lose the current level of achievable rate. However, if BS adjusts the transmit power but RS does not, RS may cause a certain level of power waste, which also increases the interference to other receivers.

In [8], Niyato and Hossain studied two call admission schemes in OFDMA networks. However, they did not consider the nonstationary nature of SNR in determining the threshold value for admission control. Also, the network model does not include relaying architectures. These two points are the major difference between their contributions and ours.

In [9], we considered a capacity planning problem in cooperative cellular relay network but no power adjustment was considered. In this paper, however, we propose a new method for admission capacity planning in OFDMA cellular networks with cooperative relays with power adjustment between source and relay nodes, which takes into consideration the random nature of the average channel gain. We derive the dropping ratio, and formulate an optimization problem to maximize the admission capacity subject to a dropping ratio constraint. The simplicity of the problem formulation enables the admission capacity planning problem to be solved in real-time.

There are extensive studies on subcarrier and power allocations in OFDM (see [3–7] and the references therein), where the authors assume that the SNR is not variable during the scheduling period. The results of these studies can be used in an adaptive manner in accordance with the frequent changes of SNR. Regardless of adaptations with respect to
SNR variations, outage events of ongoing real-time connections are unavoidable in the cases that the instantaneous capacity with respect to the locations of users residing in a cell becomes lower than the minimum capacity required to serve those connections. A simple solution to improve the dropping ratio of ongoing connections is to apply a certain “bound” to the maximum number of connections. Because of the simplicity of this type of solution, it is useful for practical applications. However, it is necessary to investigate how to find appropriate bounds for connection admission that take into account the particular characteristics of OFDM systems, which differentiates this problem from similar problems in the other wireless systems.

[Figure 2 diagram: time axis with Frame no. N, Frame no. N + 1, ...; each frame is split into Slot no. 1 and Slot no. 2.]
Figure 2: An example of frame structure (in the time domain) for decode-and-forward relaying. The first slot is used for source node to transmit whereas the second slot for relay node to relay the received data from the source node.

The main objective of this work is to find appropriate upper bounds of the number of connections that can be admitted in the system so that the dropping ratio is upper bounded by a certain threshold. More specifically, the objective is to maximize the admission capacity while keeping the dropping ratio upper bounded by a certain threshold value. In this paper, we call these upper bounds the “admission capacity.” We consider the case that the channel gain of user $j$ using subcarrier $i$, denoted by $G_{ij}$, is a random variable that varies over time. In this case, the optimal subcarrier and power allocations will vary over time as they are completely dependent on the values of the random variables $G_{ij}$'s. We assume the perfect condition that optimum power and subcarrier allocations are made given the values of $G_{ij}$'s. This assumption is necessary and widely adopted in the literature to enable an analytical evaluation of the achievable system capacity. For example, in capacity planning of CDMA systems with time-division duplex (TDD), it is commonly assumed to have perfect power control and resource allocation [10].

We consider an OFDMA cellular relay network with cooperative relaying called decode-and-forward [1]. A cell has a total of $C$ subcarriers and each user has a transmission power limit of $p$. In a single link (without cooperative diversity scheme), the throughput of user $j$ using subcarrier $i$, denoted by $R_{ij}$, is given by

\[
R_{ij} = W \log_2\bigl(1 + a \cdot G_{ij}\, p_{ij}\bigr), \tag{1}
\]

where $W$ is the bandwidth of a subcarrier, $a \approx -1.5/(\sigma^2 \cdot \log(5 \cdot \mathrm{BER}))$ (BER denotes desired bit-error rate), $G_{ij}$ denotes the channel gain of user $j$ at subcarrier $i$, $\sigma^2$ is the thermal noise power, and $p_{ij}$ denotes the power allocated to user $j$ at subcarrier $i$ [6]. Each connection has the minimum rate requirement $\phi$ such that an outage event occurs if the assigned rate is smaller than the minimum required transmit rate $\phi$.

In cellular networks, the user nodes are normally mobile, which implies that the channel gains $G_{ij}$'s can be considered as random variables. The allocation of subcarrier and power is dependent upon the instantaneous values of the random variables. In such situations, we propose an alternative to approximate the total rate of connections when $y$ connections are ongoing as follows [11]:

\[
R(y; p_s, p_r) \approx C \cdot \frac{W}{2} \cdot \min\Bigl\{ \log_2\Bigl(1 + \frac{a y}{C} \cdot G_{s,r}\, p_s\Bigr),\ \log_2\Bigl(1 + \frac{a y}{C} \cdot \bigl(G_{s,d}\, p_s + G_{r,d}\, p_r\bigr)\Bigr) \Bigr\}, \tag{2}
\]

where $W$ is the bandwidth of a subcarrier, the factor 1/2 is because of the two-slot frame structure for cooperative relaying (as in Figure 2), $G_{(\cdot,\cdot)} = (1/(yC)) \sum_{i=1}^{C} \sum_{j=1}^{y} G_{ij}$ in the associated link $(\cdot,\cdot)$, and $(y/C) \cdot p_{(\cdot)}$ is the average power allocated to a subcarrier at the associated node $(\cdot)$. By letting $\alpha \triangleq CW/2$ and $\beta(y) \triangleq (a/C)\,y$, we can rewrite (2) as

\[
R(y; p_s, p_r) \approx \alpha \cdot \min\Bigl\{ \log_2\bigl(1 + \beta(y) \cdot G_{s,r}\, p_s\bigr),\ \log_2\bigl(1 + \beta(y) \cdot (G_{s,d}\, p_s + G_{r,d}\, p_r)\bigr) \Bigr\}. \tag{3}
\]

There are practical reasons to use $G$ instead of the individual random variables $G_{ij}$'s. First, the variances of $G_{ij}$'s with respect to indices $i$ and $j$ are small in the case of group-mobility users because the users are located at nearly the same position with respect to the base station. Second, the mean value $G$ is an unbiased estimator that provides sufficient statistical information on the targeted population. The probability density function (pdf) of random variable $G$ is denoted by $f_G(\cdot)$. In the case of a system filled with individual mobility users, the approximation used in (3) may not be sufficiently accurate because the channel gains and allocated powers of individual mobility users are quite different, which is beyond the scope of this work. In the case of group-mobility users, however, because of the first reason, the approximation is much more accurate.

2. Dropping Ratio Formulation

In this section, we derive the dropping ratio $D(y; p_s, p_r)$ when $y$ connections are ongoing. The dropping ratio is defined as the average fraction of the total number of connections suffering from outages:

\[
D(y; p_s, p_r) = \Pr\bigl\{ R(y; p_s, p_r) < y \cdot \phi \bigr\}. \tag{4}
\]

By letting $\rho(y) = (y \cdot \phi/\alpha)$, we have

\[
D(y; p_s, p_r) = \Pr\Bigl\{ \min\bigl\{ 1 + \beta(y) \cdot G_{sr}\, p_s,\ 1 + \beta(y) \cdot (G_{sd}\, p_s + G_{rd}\, p_r) \bigr\} < 2^{\rho(y)} \Bigr\}. \tag{5}
\]
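The definitions in (1)–(5) can be explored numerically. The following is a minimal sketch, not the authors' implementation: it evaluates the rate approximation (3) and estimates the dropping ratio (4) by Monte Carlo sampling. It assumes SI units for the Table 1 defaults, the natural logarithm in the definition of $a$, and a hypothetical gain model in which the three average gains are independent draws from a normal distribution with mean 100 and standard deviation 5, floored at zero. The function names (`total_rate`, `dropping_ratio_mc`, `gauss_gain`) are illustrative only.

```python
import math
import random

# Table 1 defaults in SI units (the unit conversion is an assumption of this sketch).
SIGMA2 = 1e-11   # thermal noise power sigma^2 (W)
BER = 1e-5       # desired bit-error rate
C = 128          # number of subcarriers
W = 25_000.0     # bandwidth of a subcarrier (Hz)
PHI = 100e3      # minimum required rate per connection (bit/s)

A_CONST = -1.5 / (SIGMA2 * math.log(5 * BER))  # constant a of (1); natural log assumed
ALPHA = C * W / 2                               # alpha = CW/2


def beta(y):
    """beta(y) = (a/C) * y, as defined before (3)."""
    return A_CONST / C * y


def rho(y):
    """rho(y) = y * phi / alpha, as defined before (5)."""
    return y * PHI / ALPHA


def total_rate(y, ps, pr, g_sr, g_sd, g_rd):
    """Approximate total rate (3) in bit/s for given average link gains."""
    b = beta(y)
    return ALPHA * min(math.log2(1 + b * g_sr * ps),
                       math.log2(1 + b * (g_sd * ps + g_rd * pr)))


def gauss_gain(rng):
    """Hypothetical gain model: N(100, 5) read as mean 100, std 5, floored at 0."""
    return max(rng.gauss(100.0, 5.0), 0.0)


def dropping_ratio_mc(y, ps, pr, draw_gain, trials=20_000, seed=0):
    """Monte Carlo estimate of (4): fraction of draws with R(y) < y * phi."""
    rng = random.Random(seed)
    drops = 0
    for _ in range(trials):
        rate = total_rate(y, ps, pr, draw_gain(rng), draw_gain(rng), draw_gain(rng))
        if rate < y * PHI:
            drops += 1
    return drops / trials


if __name__ == "__main__":
    for y in (600, 610, 620):
        print(y, dropping_ratio_mc(y, 0.05, 0.05, gauss_gain))
```

Working with the rate inequality of (4) directly, rather than with the equivalent form (5), avoids evaluating $2^{\rho(y)}$ for large $y$.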
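Let $A = 1 + \beta(y) \cdot G_{sr}\, p_s$ and $B = 1 + \beta(y) \cdot (G_{sd}\, p_s + G_{rd}\, p_r)$. Under the assumption that random variables $G_{sr}$, $G_{sd}$, and $G_{rd}$ are mutually independent, we can rewrite the above expression as

\begin{align}
D(y; p_s, p_r) &= \Pr\bigl\{\min(A, B) < 2^{\rho(y)}\bigr\} \notag\\
&= 1 - \Pr\bigl\{\min(A, B) \ge 2^{\rho(y)}\bigr\} \tag{6}\\
&= 1 - \Pr\bigl\{A \ge 2^{\rho(y)}\bigr\} \cdot \Pr\bigl\{B \ge 2^{\rho(y)}\bigr\}, \notag
\end{align}

and, therefore,

\begin{align}
D(y; p_s, p_r) &= 1 - \Bigl[1 - \Pr\bigl\{1 + \beta(y) \cdot G_{sr}\, p_s < 2^{\rho(y)}\bigr\}\Bigr] \cdot \Bigl[1 - \Pr\bigl\{1 + \beta(y) \cdot (G_{sd}\, p_s + G_{rd}\, p_r) < 2^{\rho(y)}\bigr\}\Bigr] \notag\\
&= 1 - \Bigl[1 - \Pr\Bigl\{G_{sr} < \frac{2^{\rho(y)} - 1}{\beta(y) \cdot p_s}\Bigr\}\Bigr] \cdot \Bigl[1 - \Pr\Bigl\{G_{sd}\, p_s + G_{rd}\, p_r < \frac{2^{\rho(y)} - 1}{\beta(y)}\Bigr\}\Bigr]. \tag{7}
\end{align}

We can rewrite this as follows:

\begin{align}
D(y; p_s, p_r) &= \Pr\Bigl\{G_{sr} < \frac{2^{\rho(y)} - 1}{\beta(y) \cdot p_s}\Bigr\} + \int_0^{(2^{\rho(y)} - 1)/(\beta(y) p_r)} F_{G_{sd}}\Bigl(\frac{2^{\rho(y)} - 1}{\beta(y) p_s} - \frac{p_r}{p_s} \cdot x\Bigr) \cdot f_{G_{rd}}(x)\, dx \notag\\
&\quad - \Pr\Bigl\{G_{sr} < \frac{2^{\rho(y)} - 1}{\beta(y) \cdot p_s}\Bigr\} \cdot \int_0^{(2^{\rho(y)} - 1)/(\beta(y) p_r)} F_{G_{sd}}\Bigl(\frac{2^{\rho(y)} - 1}{\beta(y) p_s} - \frac{p_r}{p_s} \cdot x\Bigr) \cdot f_{G_{rd}}(x)\, dx \notag\\
&= F_{G_{sr}}\Bigl(\frac{2^{\rho(y)} - 1}{\beta(y) \cdot p_s}\Bigr) + \int_0^{(2^{\rho(y)} - 1)/(\beta(y) p_r)} F_{G_{sd}}\Bigl(\frac{2^{\rho(y)} - 1}{\beta(y) p_s} - \frac{p_r}{p_s} \cdot x\Bigr) \cdot f_{G_{rd}}(x)\, dx \notag\\
&\quad - F_{G_{sr}}\Bigl(\frac{2^{\rho(y)} - 1}{\beta(y) \cdot p_s}\Bigr) \cdot \int_0^{(2^{\rho(y)} - 1)/(\beta(y) p_r)} F_{G_{sd}}\Bigl(\frac{2^{\rho(y)} - 1}{\beta(y) p_s} - \frac{p_r}{p_s} \cdot x\Bigr) \cdot f_{G_{rd}}(x)\, dx. \tag{8}
\end{align}

3. Minimization of Dropping Ratio

We find the maximum $y$ that satisfies a dropping ratio constraint by solving the following simple problem (P).

3.1. Problem Formulation. We have the following:

\begin{align}
\text{(P)}\quad & \text{maximize } y \notag\\
& \text{subject to } D(y; p_s, p_r) \le \gamma_O, \tag{9}\\
& \phantom{\text{subject to }} w \cdot p_s + (1 - w) \cdot p_r \le p, \tag{10}
\end{align}

where $p_s$, $p_r$ are nonnegative real numbers, $y$ is a nonnegative integer, and $w$, $p$ are given values.

The role of problem (P) is to find the maximum $y$ that satisfies a dropping ratio constraint. In other words, it is to maximize $y$ subject to a constraint that the dropping ratio is less than or equal to $\gamma_O$.

3.2. Solution Method of (P).

Proposition 1. The dropping probability $D(y; p_s, p_r)$ is a strictly increasing function of $y$.

Proof. Let

\begin{align}
h(y) &\triangleq \frac{2^{\rho(y)} - 1}{\beta(y)\, p_s}, \notag\\
H(y) &\triangleq \int_0^{(2^{\rho(y)} - 1)/(\beta(y) p_r)} F_{G_{sd}}\Bigl(\frac{2^{\rho(y)} - 1}{\beta(y) p_s} - \frac{p_r}{p_s} \cdot x\Bigr) \cdot f_{G_{rd}}(x)\, dx \notag\\
&= \int_0^{\infty} F_{G_{sd}}\Bigl(\frac{2^{\rho(y)} - 1}{\beta(y) p_s} - \frac{p_r}{p_s} \cdot x\Bigr) \cdot f_{G_{rd}}(x)\, dx. \tag{11}
\end{align}

Then

\begin{align}
\frac{dD(y; p_s, p_r)}{dy} &= \frac{d}{dy} F_{G_{sr}}\Bigl(\frac{2^{\rho(y)} - 1}{\beta(y) \cdot p_s}\Bigr) + \frac{d}{dy} H(y) - \frac{d}{dy} F_{G_{sr}}\Bigl(\frac{2^{\rho(y)} - 1}{\beta(y) \cdot p_s}\Bigr) \cdot H(y) - F_{G_{sr}}\Bigl(\frac{2^{\rho(y)} - 1}{\beta(y) \cdot p_s}\Bigr) \cdot \frac{d}{dy} H(y) \notag\\
&= f_{G_{sr}}\Bigl(\frac{2^{\rho(y)} - 1}{\beta(y) \cdot p_s}\Bigr) \frac{dh(y)}{dy} + \int_0^{\infty} f_{G_{sd}}\Bigl(\frac{2^{\rho(y)} - 1}{\beta(y) p_s} - \frac{p_r}{p_s} \cdot x\Bigr) \cdot f_{G_{rd}}(x)\, dx\, \frac{dh(y)}{dy} \notag\\
&\quad - f_{G_{sr}}\Bigl(\frac{2^{\rho(y)} - 1}{\beta(y) \cdot p_s}\Bigr) \frac{dh(y)}{dy} \cdot H(y) - F_{G_{sr}}\Bigl(\frac{2^{\rho(y)} - 1}{\beta(y) \cdot p_s}\Bigr) \cdot \int_0^{\infty} f_{G_{sd}}\Bigl(\frac{2^{\rho(y)} - 1}{\beta(y) p_s} - \frac{p_r}{p_s} \cdot x\Bigr) \cdot f_{G_{rd}}(x)\, dx\, \frac{dh(y)}{dy}. \tag{12}
\end{align}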
This can be rewritten as

\begin{align}
\frac{dD(y; p_s, p_r)}{dy} &= f_{G_{sr}}\Bigl(\frac{2^{\rho(y)} - 1}{\beta(y) \cdot p_s}\Bigr) \frac{dh(y)}{dy} \cdot \underbrace{\bigl[1 - H(y)\bigr]}_{>0} \notag\\
&\quad + \underbrace{\Bigl[1 - F_{G_{sr}}\Bigl(\frac{2^{\rho(y)} - 1}{\beta(y) \cdot p_s}\Bigr)\Bigr]}_{>0} \cdot \int_0^{\infty} f_{G_{sd}}\Bigl(\frac{2^{\rho(y)} - 1}{\beta(y) p_s} - \frac{p_r}{p_s} \cdot x\Bigr) \cdot f_{G_{rd}}(x)\, dx\, \frac{dh(y)}{dy}. \tag{13}
\end{align}

Here,

\begin{align}
\frac{dh(y)}{dy} &= \frac{\rho'(y)\, 2^{\rho(y)} \ln 2 \cdot \beta(y) - \bigl(2^{\rho(y)} - 1\bigr) \cdot \beta'(y)}{\beta(y)^2} \cdot \frac{1}{p_s} \notag\\
&= \frac{2^{\rho(y)} \cdot \bigl(\rho(y) \cdot \ln 2 - 1\bigr) + 1}{\beta(y) \cdot y} \cdot \frac{1}{p_s} \qquad \bigl(\because\ \beta(y) = \beta'(y) \cdot y\bigr). \tag{14}
\end{align}

The second line of (14) also uses $\rho(y) = \rho'(y) \cdot y$, which holds because $\rho(y) = y\phi/\alpha$ is linear in $y$.

Let

\[
g(y) \triangleq 2^{\rho(y)} \cdot \bigl(\rho(y) \cdot \ln 2 - 1\bigr) + 1. \tag{15}
\]

Then we have the following:

\begin{align}
g(0) &= 0, \notag\\
g(y) &> g(0), \quad \forall y > 0 \qquad \bigl(\because\ \rho'(y) > 0\bigr). \tag{16}
\end{align}

Indeed, $g'(y) = 2^{\rho(y)} (\ln 2)^2\, \rho(y)\, \rho'(y)$, which is positive for $y > 0$. Thus, $dh(y)/dy > 0$. This yields $dD(y; p_s, p_r)/dy > 0$. This completes the proof.

Proposition 2. For a given value of $p_s$, a feasible solution $y^*$ is the global optimal solution if and only if

\[
y^* = \sup\bigl\{ y : D(y; p_s, p_r) \le \gamma_0 \bigr\}. \tag{17}
\]

In other words, the following solution:

\[
y^* = D^{-1}(\gamma_0) \tag{18}
\]

is the unique global optimum. Since $dD/dy > 0$, $D$ is invertible.

Proof. Suppose that there is a feasible solution $y_0$ better than $y^*$: that is, $y_0 \ge y^* + 1$ and $D(y_0) \le \gamma_0$. Since $D(y; p_s, p_r)$ is strictly increasing, $D(y^* + k) > \gamma_0$ for all $k \ge 1$, which means that the two inequalities under this assumption cannot hold at the same time. Therefore, there are no solutions better than $y^*$.

From these two Propositions, we have the following result.

Proposition 3. A feasible solution $y^*$ is the global optimal solution if and only if

\[
y^* = \sup\bigl\{ y : D(y; p_s, p_r) \le \gamma_0 \bigr\}, \quad \forall p_s. \tag{19}
\]

In other words, the following solution:

\[
y^* = D^{-1}(\gamma_0), \quad \forall p_s \tag{20}
\]

is the unique global optimum.

Proposition 4. The constraint (10) is binding at the optimum.

Proof. The accommodation capacity $y$ is an increasing function of $p_s$ and $p_r$. However, in decode-and-forward cooperation, the capacity is not a strictly increasing function of them because there may exist a portion of wastage on either side, which does not necessarily contribute to the increase of the achievable rate. However, if there is a waste on one side, either source node or relay node, the other side is binding. Therefore, the former can reduce a certain portion of the transmit power and the latter can increase a certain portion of the transmit power such that the constraint (10) is not violated. As long as there is a waste on one side, the other is binding; this means every power vector $(p_s, p_r)$ that has a waste has room to improve the capacity $y$, that is, it is not optimal. Therefore, even if $y$ is not a strictly increasing function of $p_s$ and $p_r$, the power vector $(p_s, p_r)$ is not optimal if the equality does not hold.

Table 1: Parameters used in experiments.

Item   Value        Description
p      50           Avg. transmit power (mW)
σ²     1e−11        Thermal noise level (W)
C      128          No. of subcarriers
BER    1e−5         Desired bit-error rate
W      25000        Bandwidth of subcarrier (Hz)
φ      100          Min. required rate per connection (Kbps)
G      ∼N(100, 5)
w      0.5

[Figure 3 plot: admission capacity y (580 to 610, left axis) and D(y) (0.0095 to 0.011, right axis) versus ps (30 to 70), with the level γO = 0.01 marked.]
Figure 3: Relation between the admission capacity and the transmit power of the source node when transmitting its own traffic (γO = 0.01).
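Propositions 1–4 suggest a simple numerical procedure: because $D$ is increasing in $y$, the largest admissible integer $y$ can be found by bisection, and because constraint (10) is binding, $p_r$ follows from $p_s$. The sketch below illustrates this under assumptions that are not stated in the paper: the three gains are independent and Gaussian with mean 100 and standard deviation 5 (not truncated at zero), so that both probabilities in (7) have a closed form; powers are in watts; and the logarithm in $a$ is natural. The helper names (`dropping_ratio`, `admission_capacity`, `relay_power`) are hypothetical.

```python
import math

# Table 1 defaults in SI units (an assumption of this sketch).
SIGMA2, BER, C, W, PHI = 1e-11, 1e-5, 128, 25_000.0, 100e3
A_CONST = -1.5 / (SIGMA2 * math.log(5 * BER))  # constant a of (1); natural log assumed
ALPHA = C * W / 2                               # alpha = CW/2
MU, SD = 100.0, 5.0                             # assumed Gaussian gain model


def norm_cdf(x, mean, sd):
    """CDF of a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def dropping_ratio(y, ps, pr):
    """Form (7) of D(y; ps, pr) for independent Gaussian gains; requires ps > 0."""
    if y == 0:
        return 0.0
    beta = A_CONST * y / C
    t = (2.0 ** (y * PHI / ALPHA) - 1.0) / beta           # (2^rho - 1) / beta
    p_a = norm_cdf(t / ps, MU, SD)                         # Pr{G_sr < t / ps}
    # G_sd*ps + G_rd*pr is Gaussian with mean MU*(ps+pr), std SD*sqrt(ps^2+pr^2).
    p_b = norm_cdf(t, MU * (ps + pr), SD * math.hypot(ps, pr))
    return 1.0 - (1.0 - p_a) * (1.0 - p_b)


def admission_capacity(ps, pr, gamma, y_max=2000):
    """Largest integer y in [0, y_max] with D(y) <= gamma, found by bisection."""
    lo, hi = 0, y_max
    if dropping_ratio(hi, ps, pr) <= gamma:
        return hi
    # Invariant: D(lo) <= gamma < D(hi).
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if dropping_ratio(mid, ps, pr) <= gamma:
            lo = mid
        else:
            hi = mid
    return lo


def relay_power(ps, p_avg=0.05, w=0.5):
    """Relay power from constraint (10) taken with equality (Proposition 4)."""
    return (p_avg - w * ps) / (1.0 - w)


if __name__ == "__main__":
    for ps_mw in (30, 40, 50, 60, 70):
        ps = ps_mw * 1e-3
        print(ps_mw, admission_capacity(ps, relay_power(ps), gamma=0.01))
```

Bisection needs only about $\log_2(y_{\max})$ evaluations of $D$, which is in line with the claim that the planning problem can be solved in real time. The default `y_max` keeps $2^{\rho(y)}$ within floating-point range for the Table 1 values.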
[Figure 4 plot: maximum number of connections y (620 to 780) versus γO (1E−3 to 1); curves for BER = 1E−4, 1E−5, 1E−6, and 1E−7.]
Figure 4: The maximum number of connections y versus γO with respect to BER (ps = pr = 50 mW, σ² = 10^−11, φ = 100 Kbps).

[Figure 5 plot: maximum number of connections y (620 to 780) versus γO (1E−3 to 1); curves for φ = 98, 100, and 102 Kbps.]
Figure 5: The maximum number of connections y versus γO with respect to φ (ps = pr = 50 mW, σ² = 10^−11, BER = 1e−5).

[Figure 6 plot: maximum number of connections y (620 to 780) versus γO (1E−3 to 1); curves for ps = 30, 40, 45, and 50 mW.]
Figure 6: The maximum number of connections y versus γO with respect to ps (pr = 50 mW, σ² = 10^−11, φ = 100 Kbps, BER = 1e−5).

[Figure 7 plot: maximum number of connections y (620 to 780) versus γO (1E−3 to 1); curves for pr = 30, 40, 45, and 50 mW.]
Figure 7: The maximum number of connections y versus γO with respect to pr (ps = 50 mW, σ² = 10^−11, φ = 100 Kbps, BER = 1e−5).
Using Proposition 4, we may eliminate one of the variables $p_s$, $p_r$ by setting the equality of (10). With this, we can simply solve the problem.

3.3. Experimental Results. We examine the three proposed methods for various pdfs of the average channel gain $G$ and for various values of BER, $\phi$, $\sigma^2$, and $p$. In our simulation setups, the transmission power is $p = 50$ mW, the thermal noise power is $\sigma^2 = 10^{-11}$ W, the number of subcarriers is $C = 128$ over a 3.2-MHz band, BER $= 10^{-5}$, and the minimum rate requirement is $\phi = 100$ kbps, which are used as default values.

Figure 3 shows the relation between the admission capacity and the transmit power of the source node when the source node spends 50% of its whole activity duration for transmitting its own traffic and the other 50% for relaying others' traffic. It is observed that the admission capacity increases as the portion of used power at the source node becomes greater than that of the relay node.
For different requirements of bit-error rates, Figure 4 shows the maximum number of connections that can be accommodated in a cell with a target dropping ratio of $\gamma_O$. It is observed that the admission capacity decreases as the bit-error rate requirement becomes more stringent. For the given setup of this numerical experiment, it is observed for $\gamma_O = 0.01$ that the admission capacity increases at a rate of roughly 8% per 10-fold increase in the targeted bit-error rate.

For different values of required data rates, Figure 5 shows the maximum number of connections that can be accommodated in a cell with a target dropping ratio of $\gamma_O$. We test how much the admission capacity increases or decreases if the required data rate decreases or increases by 2%. For $\gamma_O = 0.01$, the admission capacity at $\phi = 100$ Kbps is 644. If there is a 2% decrease in required data rate, the admission capacity increases to 659 (a 2.3% increase), whereas if there is a 2% increase in required data rate, the admission capacity decreases to 631 (a 2.0% decrease).

For different levels of transmit power for the source node and relay node, Figures 6 and 7 show the maximum number of connections that can be accommodated in a cell with a target dropping ratio of $\gamma_O$. It is commonly observed, for any $\gamma_O$ less than a certain value (e.g., 0.1), that an increase in transmit power greater than a certain value (e.g., 45 mW) results in a remarkable increase in admission capacity. Also, it is observed, by comparison of these figures, that adjusting the transmit power of relay nodes does little to increase the admission capacity when the transmit power level is small (e.g., less than 45 mW in this experimental setup).

4. Conclusions

Since the admission capacity, defined as the upper bound of the number of connections that a base station can accommodate, fluctuates in accordance with the signal-to-noise ratio in cellular networks with node cooperation diversity, it is highly probable that a portion of ongoing connections may be dropped prior to their normal completion because of outage events. In this paper, we have developed a method for admission capacity planning in an OFDMA-based cellular relay system with the cooperative diversity scheme called decode-and-forward. Taking into account the fluctuations of the average channel gains in the multihop cellular network, we have derived the dropping ratio at the connection level. Based on this metric, we have formulated a problem to optimize admission capacity under given conditions. Because of the simplicity of its formulation, each problem can be solved in real-time. We believe that the proposed capacity planning method can be effectively applied in the design and dimensioning of OFDMA cellular networks with cooperative relays.

References

[1] J. N. Laneman, D. N. C. Tse, and G. W. Wornell, “Cooperative diversity in wireless networks: efficient protocols and outage behavior,” IEEE Transactions on Information Theory, vol. 50, no. 12, pp. 3062–3080, 2004.
[2] IEEE 802.16 Task Group m (TGm), 802.16m System Requirements Document (SRD).
[3] C. Y. Wong, R. S. Cheng, K. B. Letaief, and R. D. Murch, “Multiuser OFDM with adaptive subcarrier, bit, and power allocation,” IEEE Journal on Selected Areas in Communications, vol. 17, no. 10, pp. 1747–1758, 1999.
[4] D. Kivanc, G. Li, and H. Liu, “Computationally efficient bandwidth allocation and power control for OFDMA,” IEEE Transactions on Wireless Communications, vol. 2, no. 6, pp. 1150–1158, 2003.
[5] M. Ergen, S. Coleri, and P. Varaiya, “QoS aware adaptive resource allocation techniques for fair scheduling in OFDMA based broadband wireless access systems,” IEEE Transactions on Broadcasting, vol. 49, no. 4, pp. 362–370, 2003.
[6] Z. Han, Z. Ji, and K. J. R. Liu, “Fair multiuser channel allocation for OFDMA networks using Nash bargaining solutions and coalitions,” IEEE Transactions on Communications, vol. 53, no. 8, pp. 1366–1376, 2005.
[7] Y. Yao and G. B. Giannakis, “Rate-maximizing power allocation in OFDM based on partial channel knowledge,” IEEE Transactions on Wireless Communications, vol. 4, no. 3, pp. 1073–1083, 2005.
[8] D. Niyato and E. Hossain, “Connection admission control algorithms for OFDM wireless networks,” in Proceedings of IEEE Global Telecommunications Conference (GLOBECOM ’05), vol. 5, pp. 2455–2459, St. Louis, Mo, USA, November 2005.
[9] K.-D. Lee, B. K. Yi, S. Kwon, and S. Kim, “Capacity planning of OFDMA cellular networks with decode-and-forward relaying,” in Proceedings of IEEE Wireless Communications and Networking Conference (WCNC ’08), pp. 1911–1915, Las Vegas, Nev, USA, March-April 2008.
[10] L.-C. Wang, S.-Y. Huang, and Y.-C. Tseng, “Interference analysis and resource allocation for TDD-CDMA systems to support asymmetric services by using directional antennas,” IEEE Transactions on Vehicular Technology, vol. 54, no. 3, pp. 1056–1069, 2005.
[11] K.-D. Lee and V. C. M. Leung, “Capacity planning for group-mobility users in OFDMA wireless networks,” EURASIP Journal on Wireless Communications and Networking, vol. 2006, Article ID 75820, 12 pages, 2006.
doi:10.1155/2009/953018
Research Article
Advanced Receiver Design for Quadrature OFDMA Systems
Lin Luo,1, 2 Jian (Andrew) Zhang,1, 2 and Zhenning Shi1, 2

1 Department of Information Engineering, the Australian National University, Canberra, ACT 0200, Australia
2 Canberra Research Laboratory, National ICT Australia (NICTA), Canberra, ACT 2601, Australia
Correspondence should be addressed to Lin Luo, lin.luo@ieee.org
Received 1 August 2008; Revised 24 December 2008; Accepted 24 January 2009
Quadrature orthogonal frequency division multiple access (Q-OFDMA) systems have been recently proposed to reduce the peak-to-average power ratio (PAPR) and complexity, and to improve carrier frequency offset (CFO) robustness and frequency diversity relative to conventional OFDMA systems. However, the Q-OFDMA receiver obtains frequency diversity at the cost of noise enhancement, so Q-OFDMA systems achieve better performance than OFDMA only in the higher signal-to-noise ratio (SNR) range. In this paper, we investigate various detection techniques for Q-OFDMA systems, such as linear zero forcing (ZF) equalization, minimum mean square error (MMSE) equalization, decision feedback equalization (DFE), and turbo joint channel estimation and detection, to mitigate the noise enhancement effect and improve the bit error ratio (BER) performance. It is shown that advanced detection, for example, the DFE and turbo receivers, can significantly improve the performance of Q-OFDMA.
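Copyright © 2009 Lin Luo et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.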
1. Introduction

Future broadband wireless communication systems require high-speed data transmission through severe multipath wireless channels. As an effective antimultipath multiple access scheme, orthogonal frequency division multiple access (OFDMA) is endorsed by leading standards such as HIPERLAN/2, IEEE802.11, and IEEE802.16, and by the downlink in the 3GPP long-term evolution (LTE). Nevertheless, to support access by a number of users, the number of subcarriers, N, in OFDMA systems is usually very large, which provides flexibility and high spectrum efficiency at the expense of high complexity, severe PAPR, and sensitivity to CFO in general. Alternatively, single-carrier transmission with cyclic prefix (CP) is a closely related transmission scheme, which significantly reduces PAPR and CFO sensitivity, with the same multipath interference mitigation property as OFDM [1, 2]. As an extension of the single carrier with frequency domain equalization (SC-FDE) [2] to accommodate multiuser access, single-carrier frequency division multiple access (SC-FDMA) [3] is adopted as the uplink multiple access scheme in 3GPP LTE. However, the noise enhancement and higher complexity introduced by discrete Fourier transform (DFT) spreading and inverse DFT (IDFT) despreading limit the applications of SC-FDMA. More importantly, from the viewpoint of the user end (UE), the usable and legal resource blocks of subcarriers are limited; therefore, the complete FFT/IFFT computation for OFDMA and SC-FDMA demodulation is not necessary, especially under the low-power consideration of battery-driven handsets.

The Quadrature OFDMA (Q-OFDMA) systems [4] overcome the aforementioned problems with improved performance and reduced complexity. Based on the concept of the layered fast Fourier transform (FFT) structure [4], the intermediate domain is introduced and a Q-OFDMA system has multiple small-size inverse FFTs (IFFTs) in the transmitter, which results in a loss of the subcarrier orthogonality. At the receiver, the orthogonality is recovered by FFT operations.

In terms of minimizing the bit error ratio (BER), the optimum maximum likelihood (ML) [5] detector is able to utilize both the diversity and coding gain furnished by frequency-selective fading channels. However, in most practical systems, the linear equalizer (LE) [5–7] and decision feedback equalizer (DFE) [5–9] have been designed for complexity reasons. Turbo equalization [10–13] has been extensively studied when signal-to-noise ratio (SNR) and channel impulse response (CIR) are precisely known to the receiver. In cases where such information is not available or
time varying and thus needs to be tracked, channel information should be estimated. Methods [14–16] attempt to perform estimation and equalization jointly, which improves the system performance at the cost of intractable complexity.

From the BER performance analysis of Q-OFDMA systems [17], we find the essential characteristics of the Q-OFDMA systems. When a linear zero forcing (ZF) equalizer is employed, there is a tradeoff between noise enhancement, error propagation, and frequency diversity gain, which is set by the value of P. When SNR is small, Q-OFDMA systems with smaller P have better BER performance, while with increasing SNR, Q-OFDMA systems with larger P become superior. The exact SNR point where one system starts to outperform the other depends on the channel condition and modulation scheme [4]. As a special case of P = 1, the Q-OFDMA system becomes the conventional OFDMA system, which outperforms the Q-OFDMA system (1 < P < N) only in the low SNR range. This problem can be solved by utilizing advanced receivers, which is the motivation of this paper. When a linear minimum mean square error (MMSE) equalizer is used, for BPSK modulated signals, the Q-OFDMA system is always better than the OFDMA system with ZF equalizer (for conventional OFDMA systems, ZF is already the maximum likelihood solution and the MMSE equalizer cannot achieve better BER performance [18]). Other advanced equalizers, such as the decision feedback equalizer and iterative equalizers, can efficiently improve the performance of the Q-OFDMA system, whose complexity is similar to that of the linear equalized OFDMA/SC-FDMA systems.

In this paper, we focus on analyzing the various detection techniques for Q-OFDMA systems, including ZF and MMSE LEs, DFE, and iterative equalization. The rest of this paper is organized as follows. In Section 2, the Q-OFDMA system based on the layered FFT structure is presented. We present signal detection and decoding techniques for Q-OFDMA and analyze the performance in Section 3. Finally, we demonstrate the performance of Q-OFDMA systems using various detection techniques by simulations in Section 4.

The following notations will be used throughout the paper. Matrices and vectors are denoted by symbols in bold face, $x_{i,j}$ indicates the $(i, j)$th element of a matrix $\mathbf{X}$, and $x(i)$ indicates the element $i$ in a vector $\mathbf{x}$. $\mathrm{Tr}[\cdot]$ denotes the trace of a matrix, $E[\cdot]$ denotes the expectation, $|\cdot|$ and $\hat{\cdot}$ denote the absolute value and estimated value, respectively. $\circledast$ and $\odot$ denote the circular convolution and element-wise product of two vectors, respectively. $(\cdot)^{-1}$, $(\cdot)^{T}$, and $(\cdot)^{H}$ represent inverse, transpose, and Hermitian conjugate. $\mathbf{x}$, $\breve{\mathbf{x}}$, and $\tilde{\mathbf{x}}$ denote symbol $\mathbf{x}$ in the time domain, intermediate domain, and frequency domain, respectively.

2. Q-OFDMA System Model

To compare Q-OFDMA with the well-known OFDMA and SC-FDMA systems, Figure 1 shows the differences among the core baseband modules of the three systems. At the transmitter, each user's data is first encoded, interleaved, and mapped to a certain constellation. Unlike the subchannel in conventional OFDMA systems, which is defined in the one-dimensional frequency domain, subchannels in Q-OFDMA systems are defined over a two-dimensional array in the intermediate domain [4]. This array is $P \times Q$, where both $P$ and $Q$ are powers of 2, and $N = PQ$ is equivalent to the total number of subcarriers in ordinary OFDMA systems. Thanks to the judicious use of the divide-and-conquer approach in the computation of the DFT [5], smaller IFFTs/FFTs are utilized in the transmitter/receiver of Q-OFDMA, which results in reduced complexity and PAPR.

Given three N-point time-domain symbols, $\mathbf{x}$, $\mathbf{h}$, and their circular convolution output $\mathbf{y} = \mathbf{x} \circledast \mathbf{h}$, their DFTs have the relationship $\tilde{\mathbf{y}} = \sqrt{N}\, \tilde{\mathbf{x}} \odot \tilde{\mathbf{h}}$. If we rearrange the frequency domain symbols $\tilde{\mathbf{x}}$, $\tilde{\mathbf{h}}$, and $\tilde{\mathbf{y}}$ into $P \times Q$ matrices ($PQ = N$) row-wise according to the layered IFFT structure concept, the vectors $\tilde{\mathbf{x}}_q$, $\tilde{\mathbf{h}}_q$, and $\tilde{\mathbf{y}}_q$ from the $q$th column of the matrices retain that $\tilde{\mathbf{y}}_q = \sqrt{N}\, \tilde{\mathbf{x}}_q \odot \tilde{\mathbf{h}}_q$, where $[\tilde{\mathbf{y}}_q]_p = \tilde{y}(pQ + q)$, $[\tilde{\mathbf{x}}_q]_p = \tilde{x}(pQ + q)$, $[\tilde{\mathbf{h}}_q]_p = \tilde{h}(pQ + q)$, and $p = 0, 1, \ldots, P - 1$.

Define the intermediate-domain symbols $\{\breve{\mathbf{x}}_q, \breve{\mathbf{h}}_q, \breve{\mathbf{y}}_q\}$ as the IDFTs of $\{\tilde{\mathbf{x}}_q, \tilde{\mathbf{h}}_q, \tilde{\mathbf{y}}_q\}$, given by

$$\breve{\mathbf{x}}_q = \mathbf{F}_P^H \tilde{\mathbf{x}}_q, \qquad \breve{\mathbf{h}}_q = \mathbf{F}_P^H \tilde{\mathbf{h}}_q, \qquad \breve{\mathbf{y}}_q = \mathbf{F}_P^H \tilde{\mathbf{y}}_q, \tag{1}$$

where $\mathbf{F}_P^H$ is the normalized P-point IDFT matrix. According to the convolution property of the DFT, we get $\breve{\mathbf{y}}_q = \sqrt{Q}\, \breve{\mathbf{x}}_q \circledast \breve{\mathbf{h}}_q$, which establishes the relationship of the symbols in the intermediate domain, and can be expressed in matrix form as

$$\breve{\mathbf{y}}_q = \sqrt{Q}\, \breve{\mathbf{H}}_q \breve{\mathbf{x}}_q, \tag{2}$$

where the $P \times P$ circulant matrix $\breve{\mathbf{H}}_q$ represents the dispersive channel, with $[\breve{\mathbf{H}}_q]_{i,j} = \breve{h}(((i - j) \bmod P)Q + q)$, where $\breve{h}(\cdot)$ denotes the channel response in the intermediate domain.

At the receiver of the Q-OFDMA system, in order to realize a one-tap equalization, the weighting outputs are transformed from the intermediate domain to the frequency domain as

$$\tilde{\mathbf{y}}_q = \mathbf{F}_P \breve{\mathbf{y}}_q = \sqrt{N}\, \mathbf{D}_q \mathbf{F}_P \breve{\mathbf{x}}_q + \mathbf{F}_P \breve{\mathbf{n}}_q, \tag{3}$$

where $\breve{\mathbf{n}}_q \sim \mathcal{N}(0, N_0)$ are additive white Gaussian noise (AWGN) samples, the symbol energy of modulation symbols $\breve{\mathbf{x}}_q$ is $E_s$, and

$$\mathbf{D}_q = \frac{\mathbf{F}_P \breve{\mathbf{H}}_q \mathbf{F}_P^H}{\sqrt{P}} = \mathrm{diag}(\tilde{\mathbf{h}}_q) \tag{4}$$

indicates the diagonalized channel matrix. This scheme recovers the orthogonality between subcarriers in the frequency domain to allow for a simple one-tap equalization, similar to that for conventional OFDMA systems.

An interesting observation is that (3) actually resembles the results obtained in precoded OFDMA systems [18], with a precoding matrix $\mathbf{F}_P$. Thus, frequency diversity can be achieved without introducing any complexity relating to precoders in the transmitter, and PAPR is reduced as well.
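The diagonalization in (4) can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes the normalized (unitary) DFT convention stated for (1), and the helper names (`dft_matrix`, `diagonalized_channel`) and the example channel taps are hypothetical. It builds the circulant intermediate-domain channel matrix for one column $q$ and confirms that the normalized DFT pair turns it into a diagonal matrix holding the frequency-domain channel samples of that column.

```python
import cmath
import math


def dft_matrix(size):
    """Normalized (unitary) DFT matrix F of the given size."""
    scale = 1 / math.sqrt(size)
    return [[scale * cmath.exp(-2j * math.pi * r * c / size) for c in range(size)]
            for r in range(size)]


def hermitian(a):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[a[r][c].conjugate() for r in range(len(a))] for c in range(len(a[0]))]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def matvec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def diagonalized_channel(h_time, P, Q, q):
    """Return (D_q, h_q): D_q = F_P H_q F_P^H / sqrt(P) and column q of the channel DFT."""
    N = P * Q
    h_freq = matvec(dft_matrix(N), h_time)           # frequency-domain channel
    h_q = [h_freq[p * Q + q] for p in range(P)]      # samples pQ + q of column q
    F_P = dft_matrix(P)
    h_breve = matvec(hermitian(F_P), h_q)            # intermediate domain, as in (1)
    # Circulant matrix whose (i, j) entry depends on (i - j) mod P
    H_breve = [[h_breve[(i - j) % P] for j in range(P)] for i in range(P)]
    D = matmul(matmul(F_P, H_breve), hermitian(F_P))
    D = [[x / math.sqrt(P) for x in row] for row in D]
    return D, h_q


# Hypothetical 3-tap channel, N = 8 split as P = 4, Q = 2
h = [1.0, 0.5, 0.25j, 0.0, 0.0, 0.0, 0.0, 0.0]
D, h_q = diagonalized_channel(h, P=4, Q=2, q=1)
largest_off_diagonal = max(abs(D[i][j]) for i in range(4) for j in range(4) if i != j)
print(largest_off_diagonal < 1e-9)
```

The check relies only on the fact that a unitary DFT diagonalizes any circulant matrix, so it holds for every column $q$ and any channel taps.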
[Figure 1 block diagrams.
Q-OFDMA: subchannel assignment, P Q-pt IFFTs, interleaving, P/S & add CP, channel, remove CP & S/P, P Q-pt FFTs, subchannel collection, equalization, detect.
OFDMA: subchannel assignment, N-pt IFFT, P/S & add CP, channel, remove CP & S/P, N-pt FFT, subchannel collection, equalization, detect.
SC-FDMA: P-pt FFT, subchannel assignment, N-pt IFFT, P/S & add CP, channel, remove CP & S/P, N-pt FFT, subchannel collection, equalization, P-pt IFFT, detect.]

Figure 1: System structure comparison.
3. Signal Detection

In this section, we present techniques for signal detection, including ZF and MMSE equalizers, DFE, and the turbo receiver, specifically for Q-OFDMA systems.

3.1. Low-Complexity Linear Detections. The simplest detection is ZF equalization, and the subchannel signal $\breve{\mathbf{x}}_q$ can be calculated as

$$\hat{\breve{\mathbf{x}}}_q = \frac{\mathbf{F}_P^H \mathbf{D}_q^{-1} \mathbf{F}_P \breve{\mathbf{y}}_q}{\sqrt{N}}, \tag{5}$$

which leads to the average BER for a Q-OFDMA system with M-ary QAM modulation as [17]

$$(P_e)_{\mathrm{ZF}} = \frac{4\left(1 - 1/\sqrt{M}\right)}{Q \log_2 M} \sum_{q=0}^{Q-1} Q\!\left(\sqrt{\frac{\left(3/(M-1)\right)\gamma}{(1/P)\sum_{p=0}^{P-1} |h_{p,q}|^{-2}}}\right), \tag{6}$$

where $\gamma = E_s/N_0$, $|h_{p,q}| = |\tilde{h}_{pQ+q}|$, and $Q(x) = \int_x^{+\infty} \exp(-t^2/2)\, dt / \sqrt{2\pi}$. From (6) we can see that, similar to single-carrier systems [2], any small channel coefficient $\tilde{h}_{pQ+q}$ leads to noise enhancement and error propagation in a group of P subcarriers. On the other hand, frequency diversity is improved by averaging channel power over the same group of subcarriers.

Another low-complexity alternative, the MMSE equalizer, can efficiently solve these problems. Similar to that in conventional OFDMA systems, the MMSE equalizer for Q-OFDMA incurs a marginal increase in complexity by requiring the estimation of noise variance $\sigma_n^2$, and is given by

$$\hat{\breve{\mathbf{x}}}_q = \frac{\mathbf{F}_P^H \mathbf{D}_q^H \left(\mathbf{D}_q \mathbf{D}_q^H + \gamma^{-1} \mathbf{I}\right)^{-1} \mathbf{F}_P \breve{\mathbf{y}}_q}{\sqrt{N}}, \tag{7}$$

where $\gamma = E_s/N_0$, and $\mathbf{I}$ is an identity matrix.

3.2. Decision Feedback Detection. The class of decision-directed detectors improves the system performance at the cost of complexity. Current DFE techniques can be operated in the time domain [5], frequency domain [9], or with a hybrid structure [7, 8], where the feedforward filter is realized in the frequency domain, while the feedback filter is realized in the time domain. Similar to the time-domain DFE (TD-DFE), the hybrid-domain DFE (HD-DFE) is affected by the precursors of the intersymbol interference (ISI) and error propagation. Since both the signal processing and the filter design are performed entirely in the frequency domain, the frequency-domain DFE (FD-DFE) only requires a quarter of the complexity of the HD-DFE, whose complexity is half of that of the TD-DFE [9]. Regarding the DFE work presented in this paper, our main contribution lies in extending the general DFE concept to the Q-OFDMA systems and testing its performance, instead of proposing a new DFE structure.

Applied to the signal represented in (3), the block DFE, as shown in Figure 2, can be realized with HD-DFE and FD-DFE. The block FD-DFE, as shown in Figure 2(b), can be described by the following equations:

$$\begin{aligned}
\boldsymbol{\alpha} &= \mathbf{A}\mathbf{F}_P \breve{\mathbf{y}}_q = \sqrt{N}\, \mathbf{A}\mathbf{D}_q \mathbf{F}_P \breve{\mathbf{x}}_q + \mathbf{A}\mathbf{F}_P \breve{\mathbf{n}}_q, \\
\hat{\tilde{\mathbf{x}}}_q &= \boldsymbol{\alpha} - \mathbf{B}\mathbf{F}_P \hat{\breve{\mathbf{x}}}_q, \\
\hat{\breve{\mathbf{x}}}_q &= \mathcal{T}\!\left(\mathbf{F}_P^H \hat{\tilde{\mathbf{x}}}_q\right),
\end{aligned} \tag{8}$$

where the feedforward and feedback filters, $\mathbf{A}$ and $\mathbf{B}$, respectively, are chosen to minimize the mean square error (MSE) and whiten the noise at the input of the decision device $\mathcal{T}(\cdot)$. Since we can only feed back decisions in a causal fashion, $\mathbf{B}$ is usually chosen to be a strictly upper or lower triangular matrix with zero diagonal entries. The matrices $\mathbf{A}$ and $\mathbf{B}$ are designed according to MMSE criteria. When $\mathbf{B}$ is chosen to be triangular and the MSE of the block estimate before the decision device is minimized, the feedforward and feedback filters can be expressed as [19]

$$\mathbf{U}^H \boldsymbol{\Lambda} \mathbf{U} = \mathbf{R}_{\breve{x}}^{-1} + \mathbf{F}_P^H \mathbf{D}_q^H \mathbf{F}_P \mathbf{R}_{\breve{n}}^{-1} \mathbf{F}_P^H \mathbf{D}_q \mathbf{F}_P, \tag{9a}$$

$$\mathbf{G}_{\mathrm{mmse}} = \mathbf{R}_{\breve{x}} \mathbf{F}_P^H \mathbf{D}_q^H \left(\mathbf{F}_P \mathbf{R}_{\breve{n}} \mathbf{F}_P^H + \mathbf{D}_q \mathbf{F}_P \mathbf{R}_{\breve{x}} \mathbf{F}_P^H \mathbf{D}_q^H\right)^{-1}, \tag{9b}$$

$$\mathbf{A} = \mathbf{F}_P \mathbf{U} \mathbf{G}_{\mathrm{mmse}}, \tag{9c}$$

$$\mathbf{B} = \mathbf{F}_P (\mathbf{U} - \mathbf{I}) \mathbf{F}_P^H, \tag{9d}$$

where we assume the autocorrelation matrices $\mathbf{R}_{\breve{x}}$ and $\mathbf{R}_{\breve{n}}$ are known, (9a) is obtained using Cholesky decomposition, $\mathbf{U}$ is an upper triangular matrix with unit diagonal, $\boldsymbol{\Lambda}$ is a diagonal matrix, and for simplicity, the factor $\sqrt{N}$ is absorbed in $\mathbf{D}_q$.
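The linear detectors of (5) and (7) differ only in the one-tap gain applied in the frequency domain. The sketch below illustrates this under stated assumptions: the function names (`one_tap_gains`, `equalize`, `noise_free_channel`) and the example channel are hypothetical, the scaling follows (3), (5), and (7) as written, and noise is omitted so that the result can be checked exactly. It is meant only to show why a small channel coefficient inflates the ZF gain while the MMSE gain stays bounded.

```python
import cmath
import math


def dft(v, inverse=False):
    """Normalized P-point DFT (or IDFT) of a vector."""
    size = len(v)
    sign = 1 if inverse else -1
    scale = 1 / math.sqrt(size)
    return [scale * sum(v[n] * cmath.exp(sign * 2j * math.pi * k * n / size)
                        for n in range(size))
            for k in range(size)]


def one_tap_gains(d, gamma=None):
    """Per-subcarrier gains: ZF (1/d) if gamma is None, else MMSE as in (7)."""
    if gamma is None:
        return [1 / dk for dk in d]
    return [dk.conjugate() / (abs(dk) ** 2 + 1 / gamma) for dk in d]


def equalize(y_breve, d, N, gamma=None):
    """Estimate the intermediate-domain symbols from y_breve, following (5) or (7)."""
    y_freq = dft(y_breve)                                   # F_P y
    weighted = [g * y for g, y in zip(one_tap_gains(d, gamma), y_freq)]
    return [x / math.sqrt(N) for x in dft(weighted, inverse=True)]


def noise_free_channel(x_breve, d, N):
    """Noise-free version of (3), mapped back to the intermediate domain."""
    x_freq = dft(x_breve)
    y_freq = [math.sqrt(N) * dk * xk for dk, xk in zip(d, x_freq)]
    return dft(y_freq, inverse=True)


# Hypothetical diagonal of D_q with one weak subcarrier (0.05j), P = 4, N = 8
d = [1 + 0.5j, 0.8, 0.05j, -0.6 + 0.2j]
x = [1, -1, 1, 1]                                           # BPSK symbols
y = noise_free_channel(x, d, N=8)
print([round(v.real, 6) for v in equalize(y, d, N=8)])      # ZF estimate
print(abs(one_tap_gains(d)[2]), abs(one_tap_gains(d, gamma=10.0)[2]))
```

With noise present, the large ZF gain on the weak subcarrier scales the noise by the same factor, and the P-point IDFT spreads it over all P symbols of the group; the MMSE gain trades a small bias for a bounded noise contribution.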
[Figure 2 block diagrams.
(a) The input $\breve{\mathbf{y}}_q$ passes through a P-point FFT, the feedforward filter A (output $\boldsymbol{\alpha}$), and a P-point IFFT; the output of feedback filter B is subtracted before the decision device.
(b) The input $\breve{\mathbf{y}}_q$ passes through a P-point FFT and the feedforward filter A (output $\boldsymbol{\alpha}$); the output of feedback filter B, which is fed by a P-point FFT of the decisions, is subtracted; the result passes through a P-point IFFT and the decision device.]

Figure 2: Decision feedback detector for Q-OFDMA systems: (a) hybrid domain DFE, and (b) frequency domain DFE.

[Figure 3 block diagram. Forward path: P-point FFT, MMSE equalizer, P-point IFFT, demodulator, deinterleaver, decoder. Feedback path: interleaver, modulator, P-point FFT, channel estimator. The LLRs $L_E$, $L_e^E$, $L_D$, and $L_e^D$ are exchanged between the demodulator and the decoder.]

Figure 3: The turbo receiver for Q-OFDMA systems.
Since DFE takes into account the finite-alphabet property of the information symbols and the decision feedback filter eliminates the intersymbol interference from previously detected symbols, the performance of DFE is usually better than that of linear detectors, especially at moderate to high SNR values, where decision errors are less likely to propagate.

3.3. Turbo Detection with Soft Interference Cancellation. In this section, as shown in Figure 3, we propose an iterative receiver for joint estimation, equalization, and decoding for the Q-OFDMA systems based on the turbo processing principle. The estimator makes use of training symbols and the soft-decoded data information to track the channel frequency response. The equalizer can use the re-estimated channel to detect the transmitted data iteratively until a satisfactory outcome is obtained. We can judiciously choose estimation, equalization, and decoding algorithms according to the performance/complexity tradeoff.

For the $p$th element of $\tilde{\mathbf{y}}$, we rewrite (3) as

$$\tilde{y}(p) = (\mathbf{D}\mathbf{F}_P)_{p,p}\, \breve{x}(p) + \sum_{k \neq p} (\mathbf{D}\mathbf{F}_P)_{p,k}\, \breve{x}(k) + (\mathbf{F}_P \breve{\mathbf{n}})(p). \tag{10}$$

From (10), we can see that the precoding matrix $\mathbf{F}_P$ breaks the orthogonal character of $\mathbf{D}$ and introduces ISI, which can be eliminated by the following turbo equalization.

The equalizer gives the MMSE estimates $\hat{\breve{\mathbf{x}}}$ of $\breve{\mathbf{x}}$ based on the received signal $\tilde{\mathbf{y}}$ and the a priori information of $\breve{\mathbf{x}}$, that is, $E(\breve{\mathbf{x}})$ and $\mathrm{Cov}(\breve{\mathbf{x}}, \breve{\mathbf{x}})$. After passing through a demapping module, the extrinsic information for each coded bit is delivered as [11]

$$L_e^E(d_n) = \ln \frac{P\left(\hat{\breve{x}}(p) \mid d_n = 1\right)}{P\left(\hat{\breve{x}}(p) \mid d_n = 0\right)} \tag{11}$$

$$= \ln \frac{\sum_{\forall \mathbf{d}: d_n = 1} P\left(\hat{\breve{x}}(p) \mid \mathbf{d}\right) \prod_{\forall n': n' \neq n} P(d_{n'})}{\sum_{\forall \mathbf{d}: d_n = 0} P\left(\hat{\breve{x}}(p) \mid \mathbf{d}\right) \prod_{\forall n': n' \neq n} P(d_{n'})} \tag{12}$$

$$= \underbrace{\ln \frac{P\left(d_n = 1 \mid \hat{\breve{x}}(p)\right)}{P\left(d_n = 0 \mid \hat{\breve{x}}(p)\right)}}_{L_E(d_n)} - \underbrace{\ln \frac{P(d_n = 1)}{P(d_n = 0)}}_{L_D(d_n)}. \tag{13}$$

As we can see in Figure 3, the output of the demodulator, $L_E(d_n)$, has been defined as the a posteriori log-likelihood ratio (LLR) of the coded bit $d_n$, and the output of the interleaver, $L_D(d_n)$, as the a priori LLR of $d_n$. The extrinsic information, $L_e^E(d_n)$, is a function of $\hat{\breve{x}}(p)$ and the a priori information about the coded bits other than the $n$th bit, that is, $L_D(d_{n'})$, $n' \neq n$, from the previous iteration. For the initial equalization stage, no a priori information is available and hence we have $L_D(d_n) = 0$, $\forall n$. The extrinsic information $L_e^E(d_n)$, which is independent of $L_D(d_n)$, is deinterleaved
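and fed into the decoder as the a priori information for the decoder. Based on the a priori LLR $L_E(c_n)$, the decoder provides the a posteriori LLR of each coded bit as follows:

$$L_e^D(c_n) = \ln \frac{P\left(\{L_E(c_n)\} \mid c_n = 1\right)}{P\left(\{L_E(c_n)\} \mid c_n = 0\right)} = \underbrace{\ln \frac{P\left(c_n = 1 \mid \{L_E(c_n)\}\right)}{P\left(c_n = 0 \mid \{L_E(c_n)\}\right)}}_{L_D(c_n)} - \underbrace{\ln \frac{P(c_n = 1)}{P(c_n = 0)}}_{L_E(c_n)}. \tag{14}$$

At the last iteration, a hard decision is made as

$$\hat{b}_m = \arg\max_{b \in \{0,1\}} P\left(b_m = b \mid \{L_E(c_n)\}\right). \tag{15}$$

Here, the interleaver/deinterleaver module shuffles coded bits to decorrelate errors introduced by the decoder/equalizer, and assures that, locally over several iterations, the $d_n$ are independent and the $L_D(d_n)$ are true a priori information on the $d_n$, which makes the iterative error correction possible.

3.3.1. MMSE Criteria. To perform MMSE estimation, we require the statistics $\bar{\breve{x}}(p) \triangleq E[\breve{x}(p)]$ and $\breve{v}(p) \triangleq \mathrm{Cov}[\breve{x}(p), \breve{x}(p)]$ of the symbols $\breve{x}(p)$, which can be computed from the a priori LLR of the coded bits, $L_D(d_n)$. For simplicity, we assume BPSK modulation is used in the following analysis. The soft estimates and their variance are defined as [11]

$$\bar{\breve{x}}(p) = \tanh\!\left(\frac{L_D(d_n)}{2}\right), \tag{16}$$

$$\breve{v}(p) = 1 - \left|\bar{\breve{x}}(p)\right|^2. \tag{17}$$

Define

$$\begin{aligned}
\bar{\breve{\mathbf{x}}}^{p} &= \left[\bar{\breve{x}}(1), \ldots, \bar{\breve{x}}(p-1), 0, \bar{\breve{x}}(p+1), \ldots, \bar{\breve{x}}(P)\right]^T, \\
\breve{\mathbf{V}}^{p} &= \mathrm{Diag}\left[\breve{v}(1), \ldots, \breve{v}(p-1), 1, \breve{v}(p+1), \ldots, \breve{v}(P)\right],
\end{aligned} \tag{18}$$

a soft interference cancellation is performed on $\tilde{\mathbf{y}}$ to obtain

$$\breve{\mathbf{s}} \triangleq \tilde{\mathbf{y}} - \mathbf{D}\mathbf{F}_P \bar{\breve{\mathbf{x}}} = \mathbf{D}\mathbf{F}_P\left(\breve{\mathbf{x}} - \bar{\breve{\mathbf{x}}}\right) + \mathbf{F}_P \breve{\mathbf{n}}, \tag{19}$$

which is then fed into a linear MMSE filter, and we get

$$\breve{z}(p) \triangleq \mathbf{w}_p^H \breve{\mathbf{s}}, \tag{20}$$

where the filter $\mathbf{w}_p$ is chosen to minimize the MSE between the coded bit $\breve{x}$ and the filter output $\breve{z}$, that is,

$$\begin{aligned}
\mathbf{w}_p &= \arg\min E\left\{\left|\breve{x} - \breve{z}\right|^2\right\} \\
&= \mathrm{Cov}[\tilde{\mathbf{y}}, \tilde{\mathbf{y}}]^{-1}\, \mathrm{Cov}[\breve{x}, \tilde{\mathbf{y}}] \\
&= \left(\sigma_{\breve{n}}^2 \mathbf{I} + \mathbf{D}\mathbf{F}_P \breve{\mathbf{V}}^{p} (\mathbf{D}\mathbf{F}_P)^H\right)^{-1} \mathbf{D}\mathbf{F}_P \boldsymbol{\varepsilon}_p \\
&= \left(\sigma_{\breve{n}}^2 \mathbf{I} + \mathbf{D}\mathbf{F}_P \breve{\mathbf{V}} (\mathbf{D}\mathbf{F}_P)^H + \left(1 - \breve{v}(p)\right) \mathbf{D}\mathbf{F}_P \boldsymbol{\varepsilon}_p (\mathbf{D}\mathbf{F}_P \boldsymbol{\varepsilon}_p)^H\right)^{-1} \mathbf{D}\mathbf{F}_P \boldsymbol{\varepsilon}_p,
\end{aligned} \tag{21}$$

where $\boldsymbol{\varepsilon}_p$ is a column vector whose P elements are all zeros except the $p$th element, which is one. Thus, the MMSE estimate $\hat{\breve{x}}$ of $\breve{x}$ can be given by [11]

$$\hat{\breve{x}}(p) = \bar{\breve{x}}(p) + \breve{z}(p). \tag{22}$$

We apply (19) to (22) and formulate the MMSE estimate as

$$\begin{aligned}
\hat{\breve{x}}(p) &= \bar{\breve{x}}(p) + \mathbf{w}_p^H\left(\tilde{\mathbf{y}} - \mathbf{D}\mathbf{F}_P \bar{\breve{\mathbf{x}}}\right) \\
&= \mathbf{w}_p^H\left(\tilde{\mathbf{y}} - \mathbf{D}\mathbf{F}_P \bar{\breve{\mathbf{x}}} + \bar{\breve{x}}(p)\, \mathbf{D}\mathbf{F}_P \boldsymbol{\varepsilon}_p\right),
\end{aligned} \tag{23}$$

whose statistical mean $\mu_{\breve{x}}(p)$, $\breve{x} \in \mathcal{B}$ (for BPSK, $\mathcal{B} = \{+1, -1\}$), and variance $\sigma_{\breve{x}}^2(p)$ are computed as

$$\begin{aligned}
\mu_{\breve{x}}(p) &= \mathbf{w}_p^H\left(E\left[\tilde{\mathbf{y}} \mid \breve{x}(p) = \breve{x}\right] - \mathbf{D}\mathbf{F}_P \bar{\breve{\mathbf{x}}} + \bar{\breve{x}}(p)\, \mathbf{D}\mathbf{F}_P \boldsymbol{\varepsilon}_p\right) = \breve{x}\, \mathbf{w}_p^H \mathbf{D}\mathbf{F}_P \boldsymbol{\varepsilon}_p, \\
\sigma_{\breve{x}}^2(p) &= E\left[\left|\hat{\breve{x}}(p) - \mu_{\breve{x}}(p)\right|^2\right] = \mathbf{w}_p^H \mathbf{D}\mathbf{F}_P \boldsymbol{\varepsilon}_p \left(1 - (\mathbf{D}\mathbf{F}_P \boldsymbol{\varepsilon}_p)^H \mathbf{w}_p\right).
\end{aligned} \tag{24}$$

Thus, the output extrinsic LLR $L_e^E(d_n)$ in (11) of the equalizer is given by

$$\begin{aligned}
L_e^E(d_n) &= \ln \frac{P\left(\hat{\breve{x}}(p) \mid d_n = 1\right)}{P\left(\hat{\breve{x}}(p) \mid d_n = 0\right)} = \ln \frac{P\left(\hat{\breve{x}}(p) \mid \breve{x}(p) = +1\right)}{P\left(\hat{\breve{x}}(p) \mid \breve{x}(p) = -1\right)} \\
&= \frac{2\, \hat{\breve{x}}(p)\, \mu_{\breve{x}=+1}(p)}{\sigma_{\breve{x}=+1}^2(p)} = \frac{2\, \mathbf{w}_p^H\left(\tilde{\mathbf{y}} - \mathbf{D}\mathbf{F}_P \bar{\breve{\mathbf{x}}} + \bar{\breve{x}}(p)\, \mathbf{D}\mathbf{F}_P \boldsymbol{\varepsilon}_p\right)}{1 - (\mathbf{D}\mathbf{F}_P \boldsymbol{\varepsilon}_p)^H \mathbf{w}_p}.
\end{aligned} \tag{25}$$

For the initial iteration, we have $L_D(d_n) = 0$, $\forall n$, $\bar{\breve{x}}(p) = 0$, and $\breve{v}(p) = 1$, $\forall p$; the MMSE linear equalizer solution is then simplified to

$$\mathbf{w}_p = \left(\sigma_{\breve{n}}^2 \mathbf{I} + \mathbf{D}\mathbf{D}^H\right)^{-1} \mathbf{D}\mathbf{F}_P \boldsymbol{\varepsilon}_p, \tag{26}$$

and the corresponding MMSE output and LLR are given by

$$\hat{\breve{x}}(p) = \mathbf{w}_p^H \tilde{\mathbf{y}}, \qquad L_e^E(d_n) = \frac{2\, \mathbf{w}_p^H \tilde{\mathbf{y}}}{1 - (\mathbf{D}\mathbf{F}_P \boldsymbol{\varepsilon}_p)^H \mathbf{w}_p}. \tag{27}$$

To alleviate the high complexity of computing $\mathbf{w}_p$ for each iteration, in the first several iterations we utilize the coefficient vector $\mathbf{w}_p$ of the first iteration to compute $\hat{\breve{x}}(p)$ and $L_e^E(d_n)$ according to (27).

In the following iterations, an approximately perfect a priori LLR, $|L_D(d_n)| \to \infty$, $\forall n$, is available, which leads to $\bar{\breve{\mathbf{x}}}^{p} = \left(\breve{x}(1), \ldots, \breve{x}(p-1), 0, \breve{x}(p+1), \ldots, \breve{x}(P)\right)^T$ and $\breve{v}(p) = 0$, $\forall p$. $\mathbf{w}_p$ is then simplified to

$$\mathbf{w}_p = \left(\sigma_{\breve{n}}^2 \mathbf{I} + \mathbf{D}\mathbf{F}_P \boldsymbol{\varepsilon}_p (\mathbf{D}\mathbf{F}_P \boldsymbol{\varepsilon}_p)^H\right)^{-1} \mathbf{D}\mathbf{F}_P \boldsymbol{\varepsilon}_p = \frac{\mathbf{D}\mathbf{F}_P \boldsymbol{\varepsilon}_p}{\sigma_{\breve{n}}^2 + (\mathbf{D}\mathbf{F}_P \boldsymbol{\varepsilon}_p)^H \mathbf{D}\mathbf{F}_P \boldsymbol{\varepsilon}_p}. \tag{28}$$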
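Two pieces of the derivation above are easy to check numerically: the soft BPSK statistics of (16) and (17), and the closed form of (28), which avoids a matrix inverse. The following is a minimal sketch under stated assumptions: the function names and the example vector are hypothetical, and `a` stands for the column $\mathbf{D}\mathbf{F}_P \boldsymbol{\varepsilon}_p$. The closed form is verified by substituting it back into the linear system it is meant to solve, rather than by inverting the matrix.

```python
import math


def soft_bpsk_stats(llr):
    """Soft symbol mean and variance from an a priori LLR, as in (16) and (17)."""
    mean = math.tanh(llr / 2)
    variance = 1 - mean ** 2
    return mean, variance


def perfect_prior_filter(a, noise_var):
    """Closed-form filter of (28); a is the column D F_P eps_p as a list."""
    energy = sum(abs(ak) ** 2 for ak in a)          # a^H a
    return [ak / (noise_var + energy) for ak in a]


def system_residual(a, w, noise_var):
    """Largest entry of (noise_var * I + a a^H) w - a; zero if w solves (28)."""
    inner = sum(ak.conjugate() * wk for ak, wk in zip(a, w))   # a^H w
    return max(abs(noise_var * wk + ak * inner - ak) for ak, wk in zip(a, w))


# No prior knowledge gives a zero mean and unit variance (the initial iteration)
print(soft_bpsk_stats(0.0))

# Hypothetical column of D F_P for P = 4
a = [0.3 + 0.1j, -0.2j, 0.7, 0.05 - 0.4j]
w = perfect_prior_filter(a, noise_var=0.1)
print(system_residual(a, w, noise_var=0.1) < 1e-12)
```

The closed form follows from the rank-one structure of the matrix in (28), which is why the later iterations of the turbo receiver need only a scalar division per symbol.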
Table 1: System receiver complexity in terms of numbers of complex multiplications per frame. For the Q-OFDMA systems, the linear MMSE equalizer, FD-DFE, and turbo receiver (the complexity of the decoder is excluded, i denotes the number of the iterations) are listed for comparison. For the conventional OFDMA system, only the maximum likelihood solution, linear ZF equalizer, is compared. For SC-FDMA system, the linear MMSE equalizer, which reduces the effect of the noise enhancement, is compared. For the example scenario, numerical values are for N = 1024, Q = 16, P = 64, and M = 1.

System     Equalizer   Complexity                                Example
Q-OFDMA    MMSE        N/2 log2 Q + MP log2 P + 2MP              2560
Q-OFDMA    FD-DFE      N/2 log2 Q + 2MP log2 P + 3MP             3008
Q-OFDMA    Turbo       N/2 log2 Q + i(4MP + MP log2 P) − MP      3264 (i = 2), 5184 (i = 5)
OFDMA      ZF          N/2 log2 N + MP                           5184
SC-FDMA    MMSE        N/2 log2 N + MP + P/2 log2 P              5376
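The example column of Table 1 can be reproduced directly from the complexity expressions. The short sketch below does this; the function name `receiver_complexity` and the dictionary keys are hypothetical labels chosen for illustration, while the expressions themselves are those listed in the table.

```python
import math


def receiver_complexity(N, P, Q, M, iterations):
    """Complex multiplications per frame, following the expressions in Table 1."""
    log2 = math.log2
    q_fft = N / 2 * log2(Q)      # P Q-point FFTs in the Q-OFDMA receiver
    n_fft = N / 2 * log2(N)      # one N-point FFT in OFDMA / SC-FDMA
    counts = {
        "Q-OFDMA MMSE": q_fft + M * P * log2(P) + 2 * M * P,
        "Q-OFDMA FD-DFE": q_fft + 2 * M * P * log2(P) + 3 * M * P,
        "Q-OFDMA Turbo": q_fft + iterations * (4 * M * P + M * P * log2(P)) - M * P,
        "OFDMA ZF": n_fft + M * P,
        "SC-FDMA MMSE": n_fft + M * P + P / 2 * log2(P),
    }
    # All sizes are powers of 2, so the counts are whole numbers
    return {name: round(value) for name, value in counts.items()}


# Example scenario of Table 1: N = 1024, Q = 16, P = 64, M = 1
for i in (2, 5):
    print(i, receiver_complexity(N=1024, P=64, Q=16, M=1, iterations=i))
```

Under these expressions the turbo receiver with five iterations reaches the same count as the conventional OFDMA receiver, which is the break-even point visible in the table.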
3.3.2. Matched Filter Criteria. Analyzing (26), we find that in the first iteration, the channel $\mathbf{D}$, which is estimated based on the training sequence, may not be reliable. In order to reduce the complexity, the matrix inverse can be bypassed by replacing the MMSE equalizer with an approximate matched filter as [20]

$$\mathbf{w}_p = \frac{\mathbf{D}\mathbf{F}_P \boldsymbol{\varepsilon}_p}{\sigma_{\breve{n}}^2 + \mathrm{Tr}[\mathbf{D}\mathbf{D}^H]}. \tag{29}$$

3.3.3. Turbo Channel Estimation. As a result of (3), channel estimation can be easily implemented by transmitting carefully chosen training symbols $\breve{\mathbf{x}}_{\mathrm{tr}}$ such that each element in $\mathbf{F}_P \breve{\mathbf{x}}_{\mathrm{tr}}$ has unity magnitude. However, the estimation based on training symbols may not be reliable, especially when the channel is time varying and channel tracking is needed. In this section, we propose an iterative channel estimation technique in conjunction with data detection. The idea is to first use training symbols to perform an initial estimation; then the soft data information delivered by the decoder is utilized in estimation. At the last iteration, when the decoding information from the decoder becomes reliable, advanced estimators, that is, the maximum likelihood or MMSE estimator, are employed to provide further performance improvement.

From (4), we can see $\mathbf{D}\mathbf{F}_P = \mathbf{F}_P \breve{\mathbf{H}}$, which is a frequency response of the channel. Therefore, we can use $\mathbf{H} = \mathbf{D}\mathbf{F}_P$ as the channel estimates for Q-OFDMA systems. The channel estimation method is summarized in the following steps:

(1) Initial channel estimation

$$\hat{H}^{1}_{p,p} = \frac{\tilde{y}(p)}{\breve{x}_T(p)} = H_{p,p} + \Delta_T(p), \tag{30}$$

where $\breve{x}_T(p)$ is the training symbol and $\Delta_T(p)$ is AWGN with zero mean and variance $(\sigma_n^2 + \sigma_{\mathrm{ISI}}^2)$. Once the initial channel estimates are obtained, the detected soft data symbols $\bar{\breve{x}}$ are obtained by (16) for BPSK modulation.

(2) Iterative channel estimation. In this stage, data-aided LS channel estimation is utilized:

$$\hat{H}^{2}_{p,p} = \frac{\tilde{y}(p)}{\bar{\breve{x}}(p)} = H_{p,p} + \Delta(p). \tag{31}$$

Similar to the initial estimation stage, it can be shown that $\Delta(p)$ has zero mean and variance $(\sigma_n^2 + \sigma_{\mathrm{ISI}}^2)$.

(3) Final channel estimation. In the last iteration, the decoding information from the decoder becomes very reliable, and the MMSE estimator [5] is able to provide further performance improvement.

3.4. Complexity Analysis. Complexity is defined as the number of complex multiplications required in processing each frame. FFT complexity is based on the radix-2 algorithm, which means the computational complexity for an N-point FFT/IFFT is $O(N/2 \log_2 N)$. Assume user-k occupies M subchannels in Q-OFDMA systems, and equivalently, MP subcarriers in conventional OFDMA systems.

With a linear equalizer, a general OFDMA receiver includes an N-point FFT and a one-tap equalizer, and the complexity is $N/2 \log_2 N + MP$. For an SC-FDMA receiver, referring to Figure 1, an extra P-point IFFT is required on top of the OFDMA receiver, thus the complexity is $N/2 \log_2 N + MP + P/2 \log_2 P$. For a Q-OFDMA system, the receiver includes P Q-point FFTs, M P-point IFFTs, M P-point weighting operators, and M one-tap equalizers. The complexity is $N/2 \log_2 Q + MP \log_2 P + 2MP$. When the channels change, the computational complexity of the linear ZF/MMSE equalizer is $O(P^3)$ for Q-OFDMA systems, and $O(N^3)$ for OFDMA/SC-FDMA systems, where N equals Q ($Q \geq 1$) times P. From Table 1, we note that the receiver of the Q-OFDMA with linear equalizer only requires half of the complexity of the OFDMA, whose complexity is similar to the SC-FDMA system.

The complexity of decision feedback detection is comparable to that of linear detectors, because the feedforward and feedback filters only have matrix-vector multiplications. Additionally, an FD-DFE equalizer in Figure 2(b) needs an extra P-point FFT for the feedback filter, that is, cancellation is performed in the frequency domain. Therefore, the
complexity of the receiver of Q-OFDMA with FD-DFE is 10−1

N/2 log2 Q + 2MP log2 P + 3MP.
The complexity of the turbo receiver mainly comes from the MMSE equalizer, the MAP decoder, and the number of iterations. For each iteration, the MMSE equalizer performs three FFT operations, whose complexity is $O((P/2)\log_2 P)$ for Radix-2 algorithms, and four matrix operations whose complexity is $O(P^2)$. For the MAP decoder, the complexity of the soft output Viterbi algorithm (SOVA) with five iterations is twice that of the Viterbi algorithm, and the ratio becomes three with ten iterations [21]. Compared with the linear equalizer and DFE, the complexity analysis is far more complicated for joint turbo estimation, equalization, and decoding. Assuming the channel is fixed, given the MMSE equalizer, the overall complexity of the turbo receiver of the Q-OFDMA system is $(N/2)\log_2 Q + i(4MP + MP\log_2 P) - MP$, which excludes the complexity of the decoder, and where i denotes the number of iterations.
In our previous work, we found that larger P leads to more reduction in complexity of Q-OFDMA, lower PAPR at the transmitter, and better CFO robustness [4]. Thus in Q-OFDMA systems with turbo receiver, P should be chosen carefully within system constraints according to the complexity/performance tradeoff.

[Figure 4: BER comparison of uncoded systems with QPSK modulation under CM2 channel model, where N = 1024. Axes: BER versus Eb/N0 (dB). Curves: Q-OFDMA (P = 256), SC-FDMA, OFDMA.]
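As a rough illustration of the complexity/performance tradeoff, the two operation counts quoted above for the FD-DFE receiver and for the turbo receiver can be evaluated numerically. The sketch below simply transcribes those two expressions; the function names and the parameter values in the usage lines are hypothetical and are not taken from the simulations in this paper.

```python
from math import log2


def fd_dfe_complexity(N, Q, M, P):
    """Operation count quoted for Q-OFDMA with FD-DFE:
    (N/2) log2(Q) + 2 M P log2(P) + 3 M P."""
    return (N / 2) * log2(Q) + 2 * M * P * log2(P) + 3 * M * P


def turbo_complexity(N, Q, M, P, iterations):
    """Operation count quoted for the turbo receiver (decoder excluded):
    (N/2) log2(Q) + i (4 M P + M P log2(P)) - M P, with i = iterations."""
    per_iteration = 4 * M * P + M * P * log2(P)
    return (N / 2) * log2(Q) + iterations * per_iteration - M * P


# Hypothetical parameter values, chosen only to exercise the formulas.
N, Q, M, P = 1024, 4, 4, 256
print(fd_dfe_complexity(N, Q, M, P))
for i in (1, 2, 5):
    print(i, turbo_complexity(N, Q, M, P, i))
```

Under these expressions the turbo receiver cost grows linearly with the number of iterations, which is one way to see why the iteration count has to be weighed against the BER gain.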
4. Simulations

In this section, we present the BER performance of Q-OFDMA systems with different receivers, including linear ZF and MMSE, DFE, and iterative (turbo) receivers. In OFDMA, subcarriers are first grouped per Q successive subcarriers, and each subchannel occupies one subcarrier in each group with a fixed index. Distributed SC-FDMA is used in the simulation; the subcarriers of each user are spread over the entire signal band with a fixed index. For simplicity, system imperfections such as CFO and PAPR distortions are not introduced in the simulation. In each simulation result, BER is averaged over a number of channel realizations. In coded systems, each user's data is encoded with a rate-1/2 convolutional code, and a rectangular interleaver is applied to the coded bits before modulation. SOVA is used for decoding. The initial channel coefficients are estimated by the matched filter scheme over two consecutive training symbols.

Two types of channel models are simulated to compare system performance. One is the CM2 channel model from IEEE 802.15.3a, which is a dense non-line-of-sight multipath model with tens of significant taps. The other is the SUI3 channel model from IEEE 802.16, which is a sparse channel model with only a few taps and small normalized delay spread. In either case, the length of the guard interval is set to 64, and any channel impulse response longer than 64 is truncated to 64 taps to avoid ISI.

Figure 4 presents an uncoded case to illustrate a few key points about the systems comparison under the CM2 channel model. All of the MMSE equalized systems are with 16QAM modulation. The parameter N is fixed at 1024, with 16 users sharing 64 subcarriers in all three systems. It can be noticed that when SNR is small, noise enhancement dominates the system performance and Q-OFDMA is inferior to conventional OFDMA systems; with SNR increasing, the noise enhancement effect is relatively suppressed and the diversity improvement makes Q-OFDMA superior. It also shows that the OFDMA performance is generally better than that of SC-FDMA with the linear MMSE receiver.

[Figure 5: BER of uncoded systems with BPSK modulation under CM2 channel model, where N = 256. Axes: BER versus SNR (dB). Curves: Q-OFDMA P = 64, Q-OFDMA P = 16, OFDMA1, OFDMA2, OFDMA3, with ZF and MMSE equalization.]

[Figure 6: BER performance comparison between Q-OFDMA systems with different receivers in CM2 channel model, with QPSK modulation. Axes: BER versus SNR (dB). Curves: Q-OFDMA with ZF (no iteration), MMSE (no iteration), MMSE (2 iterations), MMSE (5 iterations), and FD-DFE.]

[Figure 7: BER performance comparison between Q-OFDMA systems with different receivers in WiMAX channel model, with 64-QAM modulation. Axes: BER versus SNR (dB). Curves: Q-OFDMA with ZF (no iteration), MMSE (no iteration), MMSE (2 iterations), MMSE (5 iterations).]

We depict the simulation results in Figure 5 for uncoded systems with BPSK modulation under the CM2 channel model. Four users equally sharing 256 subcarriers are simulated and parameters are set as N = 256, P = 16 and 64 for Q-OFDMA, and P = 64 for general OFDMA (the subchannel
length is 64). From the figure, we can see that the linear MMSE equalizer can significantly improve the performance of Q-OFDMA systems by suppressing the noise enhancement effect, while for general OFDMA systems it is known that the MMSE equalizer has almost the same performance as the ZF equalizer.

Figure 6 shows the system performance with QPSK modulation under the CM2 channel model. From the figure, we can see that DFE detection further reduces the effect of noise enhancement and improves the system performance compared with linear detectors. The proposed iterative (turbo) receiver scheme performs better than Q-OFDMA systems with linear and decision feedback detectors. The Q-OFDMA system with 2 iterations reaches BER = 10^−4 at 17 dB SNR, which is about 2 dB lower than MMSE equalized Q-OFDMA without iteration, and Q-OFDMA systems with more iterations perform better. Figure 7 shows BER performance for systems with 64-QAM modulation under the SUI3 channel model. Subcarriers have very high correlation due to the very limited number of multipath signals. In this case, the influence of frequency diversity is weakened, while the noise propagation is highlighted in Q-OFDMA systems. However, the BER performance of Q-OFDMA systems with different numbers of iterations shows a trend similar to that of Figure 6.

5. Conclusions

In this paper, we analyze linear, decision-directed, and iterative (turbo) detection for Q-OFDMA systems to mitigate the noise enhancement effect and improve the BER performance. Furthermore, a dedicated turbo equalizer in conjunction with channel estimation for Q-OFDMA systems is proposed and evaluated. We can judiciously choose estimation, equalization, and decoding algorithms according to the performance/complexity tradeoff. From simulations on wireless dispersive channels, we have shown that Q-OFDMA with FD-DFE achieves improved performance. Since both the signal processing and the filter design are performed entirely in the frequency domain, the complexity of FD-DFE Q-OFDMA is similar to that of the linearly equalized Q-OFDMA systems. Moreover, by reducing the interference and noise enhancement effect, and increasing the reliability of the detected data, the iterative receiver for joint estimation, equalization, and decoding significantly improves the performance of the Q-OFDMA system, with complexity similar to that of the linearly equalized OFDMA/SC-FDMA systems.

Acknowledgments

NICTA is funded by the Australian Government as represented by the Department of Broadband, Communications and the Digital Economy, and the Australian Research Council through the ICT Centre of Excellence program.

References

[1] A. Czylwik, “Comparison between adaptive OFDM and single carrier modulation with frequency domain equalization,” in Proceedings of the 47th IEEE Vehicular Technology Conference (VTC ’97), vol. 2, pp. 865–869, Phoenix, Ariz, USA, May 1997.
[2] D. Falconer, S. L. Ariyavisitakul, A. Benyamin-Seeyar, and B. Eidson, “Frequency domain equalization for single-carrier broadband wireless systems,” IEEE Communications Magazine, vol. 40, no. 4, pp. 58–66, 2002.
[3] H. G. Myung, J. Lim, and D. J. Goodman, “Single carrier FDMA for uplink wireless transmission,” IEEE Vehicular Technology Magazine, vol. 1, no. 3, pp. 30–38, 2006.
[4] J. Zhang, L. Luo, and Z. Shi, “Quadrature OFDMA systems,” in Proceedings of the 50th Annual IEEE Global Telecommunications Conference (GLOBECOM ’07), pp. 3734–3739, Washington, DC, USA, November 2007.
[5] J. G. Proakis, Digital Communications, McGraw-Hill, New York, NY, USA, 4th edition, 2001.
[6] G. K. Kaleh, “Channel equalization for block transmission systems,” IEEE Journal on Selected Areas in Communications, vol. 13, no. 1, pp. 110–121, 1995.
[7] F. Pancaldi and G. M. Vitetta, “Block channel equalization in the frequency domain,” IEEE Transactions on Communications, vol. 53, no. 3, pp. 463–471, 2005.
[8] N. Benvenuto and S. Tomasin, “On the comparison between OFDM and single carrier modulation with a DFE using a frequency-domain feedforward filter,” IEEE Transactions on
[9] N. Benvenuto and S. Tomasin, “Iterative design and detection of a DFE in the frequency domain,” IEEE Transactions on
[10] C. Douillard, M. Jézéquel, C. Berrou, A. Picart, P. Didier, and A. Glavieux, “Iterative correction of intersymbol interference: turbo-equalization,” European Transactions on Telecommunications, vol. 6, no. 5, pp. 507–511, 1995.
[11] X. Wang and H. V. Poor, “Iterative (turbo) soft interference cancellation and decoding for coded CDMA,” IEEE Transactions on Communications, vol. 47, no. 7, pp. 1046–1061, 1999.
[12] M. Tüchler and J. Hagenauer, “Linear time and frequency domain turbo equalization,” in Proceedings of the 53rd IEEE
1453, Rhodes, Greece, May 2001.
[13] M. Tüchler, R. Koetter, and A. C. Singer, “Turbo equalization: principles and new results,” IEEE Transactions on Communications, vol. 50, no. 5, pp. 754–767, 2002.
[14] L. M. Davis, I. B. Collings, and P. Hoeher, “Joint MAP equalization and channel estimation for frequency-selective and frequency-flat fast-fading channels,” IEEE Transactions on
[15] M. Qaisrani and S. Lambotharan, “An iterative (turbo) channel estimation and symbol detection technique for doubly selective channels,” in Proceedings of the 65th IEEE Vehicular Technology Conference (VTC ’07), pp. 2253–2256, Dublin, Ireland, April 2007.
[16] T. Zemen, J. Wehinger, C. Mecklenbräuker, and R. Müller, “Iterative detection and channel estimation for MC-CDMA,” in Proceedings of IEEE International Conference on Communications (ICC ’03), vol. 5, pp. 3462–3466, Anchorage, Alaska, USA, May 2003.
[17] L. Luo, J. Zhang, and Z. Shi, “BER analysis for asymmetric OFDM systems,” in Proceedings of IEEE Global Telecommunications Conference (GLOBECOM ’08), pp. 1–6, New Orleans, La, USA, November-December 2008.
[18] Z. Wang and G. B. Giannakis, “Complex-field coding for OFDM over fading wireless channels,” IEEE Transactions on Information Theory, vol. 49, no. 3, pp. 707–720, 2003.
[19] A. Stamoulis, G. B. Giannakis, and A. Scaglione, “Block FIR decision-feedback equalizers for filterbank precoded transmissions with blind channel estimation capabilities,” IEEE Transactions on Communications, vol. 49, no. 1, pp. 69–83, 2001.
[20] H. Omori, T. Asai, and T. Matsumoto, “A matched filter approximation for SC/MMSE iterative equalizers,” IEEE Communications Letters, vol. 5, no. 7, pp. 310–312, 2001.
[21] G. Bauch and V. Franz, “A comparison of soft-in/soft-out algorithms for “turbo-detection”,” in Proceedings of the International Conference on Telecommunications (ICT ’98), pp. 259–263, Porto Carras, Greece, June 1998.
doi:10.1155/2009/263695
Research Article
Residue Number System Arithmetic Assisted Coded
Frequency-Hopped OFDMA
Dalin Zhu and Balasubramaniam Natarajan

Department of Electrical & Computer Engineering, Kansas State University, 2061 Rathbone Hall, Manhattan, KS 66506, USA
Correspondence should be addressed to Dalin Zhu, dalinz@ksu.edu
Received 31 July 2008; Revised 17 December 2008; Accepted 23 February 2009
We propose an RNS arithmetic-based FH pattern design approach that is well suited and easy to implement for practical
OFDMA systems. The proposed FH scheme guarantees orthogonality among intracell users while randomizing the intercell
interferences and providing frequency diversity gains. We present detailed construction procedures and performance analysis for
both independent and cluster hopping scenarios. Using simulation results, we demonstrate the gains due to frequency diversity
and intercell interference diversity on the system bit error rate (BER) performance. Furthermore, the BER performance gain is
consistent across all cells, unlike other FH pattern design schemes such as the Latin squares (LS)-based FH pattern design where
wide performance variations are observed across cells.
Copyright © 2009 D. Zhu and B. Natarajan. This is an open access article distributed under the Creative Commons Attribution
cited.
1. Introduction

Orthogonal frequency division multiplexing (OFDM) has been widely accepted as an enabling technology for next generation wireless communication systems. In OFDM, high-rate data streams can be broken down into a number of parallel lower-rate streams, thereby avoiding the need for complex equalization. OFDM also forms the foundation for a multiple access scheme termed as orthogonal frequency division multiple access (OFDMA). In OFDMA, each user is assigned a fraction of available subcarriers based upon his/her demand for bandwidth. The advantages of OFDMA include (1) the flexibility in subcarriers’ allocation; (2) the absence of multiuser interference due to subcarriers’ orthogonality; (3) the simplicity of the receiver design [1].

In order to enhance system throughput and spectral efficiency, frequency hopping (FH) is generally used in OFDMA cellular systems. It is desirable for FH patterns to satisfy the following conditions [2]: (i) minimize intracell interference; (ii) average intercell interference; (iii) avoid ambiguity while identifying users; (iv) exploit frequency diversity by forcing hops to span a large bandwidth. The first aspect is relatively easy to achieve by using orthogonal hopping patterns within a cell. To average intercell interferences, hopping patterns are constructed in a way that two users in different cells interfere with each other only during a small fraction of all hops. The third condition requires base stations to have the capability of distinguishing different users efficiently according to their unique FH signatures. Finally, the last requirement not only ensures the security of the transmission, but also mitigates the effect of fading by exploiting frequency diversity.

Frequency hopping pattern design has received considerable attention in both commercial and military communication systems. There has been extensive work on designing FH-OFDMA systems [3–10]. In [3], concepts of fast frequency hopping along with OFDM are illustrated. In [4], authors show that the expected number of collisions per symbol under both independent and cluster hopping does not depend on the hopping strategy. In their later work [5], it is shown that the number of collisions can be further reduced by using space-frequency coding in multiple-antenna systems. Orthogonal Latin squares (LSs) are presented as FH patterns in TCM/BICM coded OFDMA in [6]. In LS-aided FH-OFDMA systems, it is seen that there is a wide variability in the performance of users in different cells. Therefore, it is not an effective scheme if one considers fairness to be important. Welch-Costas array is introduced in [7] and evaluated in [8] for coded
FH-OFDMA. Here, although users across cells experience significant performance improvements, users within a cell may not occupy all of the available bandwidth to exploit full frequency diversity. Other aspects focusing on preventing hostile jamming and pilot-assisted channel estimation in FH-OFDMA are explored in [9, 10], respectively.

In this paper, we propose a novel frequency hopping pattern design strategy based on RNS arithmetic for practical OFDMA cellular systems. We show that the resulting patterns are orthogonal within a cell and intersect only once across cells in a frequency hopping cycle. RNS arithmetic has found applications in many areas. However, its use in designing frequency hopping patterns is rarely considered [11, 12]. In [11], the design procedure can be visualized as a “top-down” approach where a given bandwidth is divided into multiple candidate subbands based on a predetermined moduli set. As a result, if the moduli set changes, the bandwidth of subcarriers varies. In this work, the division of bandwidth into candidate subcarriers is assumed to be given or determined in advance. Therefore, we can consider our proposed approach as a “bottom-up” method driven by grouping and indexing the subcarriers according to the RNS arithmetic. For practical OFDMA cellular systems, the proposed “bottom-up” approach is more feasible. For example, in downlink OFDMA cellular systems, a fixed number of subcarriers (e.g., 1024) with identical subcarrier bandwidth within each cell is usually assumed. Furthermore, for reducing intercell interference, [11] suggests the use of different moduli sets for adjacent cells. This approach results in adjacent cells employing different numbers of subcarriers with different bandwidths across cells. Once again, this is a stringent requirement that may not be feasible in practice. In this work, we invoke the use of the so-called two-stage and multistage selection algorithms to construct RNS-FH patterns such that (1) different users can use the spectral resources simultaneously within each cell and (2) the same number of subcarriers can be employed from cell to cell. Additionally, the proposed FH sequences force the intracell interferences to zero and average out the intercell interferences. The performance of the proposed FH pattern combined with both independent and cluster hopping schemes is characterized. Simulation results show that RNS-FH OFDMA has significantly better BER performance relative to the traditional OFDMA scheme without FH. Another aspect that makes the RNS-FH pattern design outperform other existing FH techniques is that user hopping patterns span a larger bandwidth. Therefore, the channel fades associated with consecutive hops become independent. Moreover, with the use of FEC codes over multiple hops, the system can correct errors due to subcarriers that experience deep fades or subcarriers that are severely interfered by others.

The rest of the paper is organized as follows. In Section 2, the system model along with the signal transmission scheme, access strategies, and interference models is introduced. In Section 3, detailed RNS-FH pattern design procedures along with comparisons with the existing technique are presented. Simulation results with performance analysis are given in Section 4. Finally, we conclude this paper in Section 5.

2. System Model

In this section, we first describe the signal transmission scheme for each individual user in an OFDMA system. Then, we introduce the access model and interference model under both independent and cluster hopping schemes.

2.1. Signal Transmission Scheme. The block diagram of the FEC coded FH-OFDMA system is shown in Figure 1. Here, data bits of every user are first channel coded and then mapped to complex constellation points. We assume that there are M users in the system, utilizing a total of N OFDM subcarriers. Each user is assigned a specific set of subcarriers out of the total available subcarriers according to his/her data rate. Let $N_i$ be the number of subcarriers allocated to user i. Then, user i transmits the information symbols $x^i = (x_1^i, x_2^i, \ldots, x_{N_i}^i)^T$ ($(\cdot)^T$ represents the transpose operation) on the assigned $N_i$ subcarriers. Therefore, the baseband transmitted signal of user i can be expressed as

$$s^i(t) = \sum_{k=1}^{N_i} x_k^i e^{j2\pi (k/T) t}, \quad 0 \le t < T, \tag{1}$$

where $s^i(t)$ represents the time-domain signal, and T denotes one OFDM symbol duration. Since this is an OFDMA system, it is important to remember that every user is assigned a different set of subcarriers for transmission, and this allocation is dynamic in the case of frequency hopping OFDMA. That is, in the IFFT module, the frequency assignment follows a predetermined FH pattern. Moreover, each user transmits zeros on subcarriers which are not assigned to him/her.

[Figure 1: The block diagram of coded FH-OFDMA system. Transmitter blocks: binary inputs, FEC encoder, modulator, IFFT (with hopping pattern generator and coded streams from other users), P/S, add CP, wireless channel. Receiver blocks: remove CP, P/S, FFT, demodulator, FEC decoder, binary outputs (with decoded streams for other users).]

For convenience, we denote by $C_i$ the set of subcarriers assigned to user i. Hence, the N × 1 information symbol vector of user i can be written as

$$x^i(k) = \begin{cases} 0, & k \notin C_i, \\ x_k^i, & k \in C_i. \end{cases} \tag{2}$$

The discrete form of the transmitted signal $s^i(t)$ is then given as

$$s^i = F x^i, \tag{3}$$

where F is the IFFT matrix defined as

$$F = \frac{1}{\sqrt{N}} \begin{pmatrix} W_N^{00} & \cdots & W_N^{0(N-1)} \\ \vdots & \ddots & \vdots \\ W_N^{(N-1)0} & \cdots & W_N^{(N-1)(N-1)} \end{pmatrix}, \tag{4}$$

where $W_N^{pq} = e^{j2\pi pq/N}$.
Let $h^i = [h^i(0), h^i(1), \ldots, h^i(N-1)]$ denote the channel impulse response vector; then its Fourier transform is

$$H^i = F^H h^i, \tag{5}$$

where $(\cdot)^H$ represents the Hermitian transpose. In general, each channel impulse response is a function of time and
access delay which can be modeled as a tapped delay line, that is,

$$h^i(\tau, t) = \sum_{l=1}^{L} h_l^i(t)\,\delta(\tau, \tau_l), \tag{6}$$

where L is the number of multipaths and $\tau_l$ is the time delay of the lth path. The tap coefficients are independent, zero mean, circularly symmetric complex Gaussian random processes at each instant t, that is, $h_l^i(t) \sim \mathcal{CN}(0, \sigma_l^2)$ with the total power normalized to unity, that is, $\sum_{l=1}^{L} \sigma_l^2 = 1$. In this work, we use Jakes’ model to describe the time/frequency variation of each channel coefficient. Therefore, the spaced-frequency ($\Delta f$) spaced-time ($\Delta t$) correlation function of the channel frequency response can be expressed as [13]

$$r_H(\Delta f, \Delta t) = \sum_{l=1}^{L} \sigma_l^2 J_0(2\pi f_D \Delta t)\, e^{-j2\pi \Delta f \tau_l}, \tag{7}$$

where $f_D$ is the Doppler frequency.
At the receiver end, after FFT, the received signal corresponding to user i on subcarrier k is

$$r^i(k) = H^i(k) x^i(k). \tag{8}$$

Then, the overall received signal, which is a superposition of the signals transmitted from all M users, is

$$r(k) = \sum_{i=1}^{M} r^i(k) + n(k) = \sum_{i=1}^{M} H^i(k) x^i(k) + n(k), \tag{9}$$

where n(k) is the Fourier transform of the noise vector.

2.2. Access Model. In this part, clustered and independent FH-OFDMA are introduced, and closed form expressions of the expected number of collisions per symbol under both of these two hopping strategies are presented.

2.2.1. Clustered FH-OFDMA. In cluster hopping, each user selects a set of contiguous subcarriers, termed a cluster, to transmit the information symbols. Specifically, the hopping takes place among clusters of subcarriers based on predetermined FH patterns. Therefore, collisions occur among clusters first, and then across all OFDM subcarriers within that cluster. The expected number of symbol losses per cluster collision corresponds to [4]

$$E_c = N_c P_{int}, \tag{10}$$

where $N_c$ is the number of subcarriers per cluster and $P_{int}$ represents the probability that at least one interfering user collides with the desired user. For cluster hopping, we have $N/N_c$ hopping clusters. Therefore, the collision probability between the desired user and the interfering user in one cluster is $1/(N/N_c)$. Hence, the probability that at least one of the M − 1 users collides with the desired user can be expressed as

$$P_{int} = 1 - \left(1 - \frac{1}{N/N_c}\right)^{M-1}. \tag{11}$$

For convenience, throughout the rest of this paper, we assume that each user employs the same number of subcarriers ($N_c$) per cluster.

2.2.2. Independent FH-OFDMA. In independent hopping, subcarriers occupied by a user are selected independently from all available subcarriers. In other words, the $N_c$ subcarriers in one cluster are no longer contiguous, and they are chosen in a pseudorandom fashion across the frequency spectrum. With independent hopping, the expected number of symbols lost per symbol collision is given by [4]

$$E_c = \sum_{x=1}^{N_c} x\, p_{N_c}(x), \tag{12}$$

where $p_{N_c}(x)$ is the probability that x subcarriers out of the $N_c$ subcarriers occupied by each user experience collisions due to interfering users.
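The expressions (10)-(12) are simple enough to check numerically. The following is a minimal sketch, not the authors' simulation code: the function names are hypothetical, and the Monte Carlo check assumes that each interfering user picks one of the N/Nc clusters uniformly and independently at random, which is the assumption behind the per-user collision probability 1/(N/Nc) used in (11).

```python
import random


def cluster_collision_probability(n_subcarriers, cluster_size, n_users):
    """Eq. (11): probability that at least one of the other M - 1 users
    lands on the desired user's cluster."""
    n_clusters = n_subcarriers / cluster_size  # N / Nc hopping clusters
    return 1.0 - (1.0 - 1.0 / n_clusters) ** (n_users - 1)


def expected_losses_clustered(n_subcarriers, cluster_size, n_users):
    """Eq. (10): Ec = Nc * Pint."""
    p_int = cluster_collision_probability(n_subcarriers, cluster_size, n_users)
    return cluster_size * p_int


def expected_losses_independent(collision_pmf):
    """Eq. (12): Ec = sum over x = 1..Nc of x * p_Nc(x),
    where collision_pmf[x - 1] holds p_Nc(x)."""
    return sum(x * p for x, p in enumerate(collision_pmf, start=1))


def simulate_cluster_collision(n_subcarriers, cluster_size, n_users,
                               trials=20000, seed=0):
    """Monte Carlo estimate of Pint under uniform, independent cluster choices
    (an assumption of this sketch)."""
    rng = random.Random(seed)
    n_clusters = n_subcarriers // cluster_size
    hits = 0
    for _ in range(trials):
        desired = rng.randrange(n_clusters)
        if any(rng.randrange(n_clusters) == desired for _ in range(n_users - 1)):
            hits += 1
    return hits / trials


# Illustrative values only: N = 30 subcarriers, Nc = 5, M = 6 users.
print(cluster_collision_probability(30, 5, 6))
print(expected_losses_clustered(30, 5, 6))
print(simulate_cluster_collision(30, 5, 6))
```

The simulated estimate should agree with (11) up to Monte Carlo noise; `expected_losses_independent` takes the collision probabilities as an input so that any expression for them can be plugged in.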
Theorem 1. For the independent FH-OFDMA scheme described above, $p_{N_c}(x)$ corresponds to

$$p_{N_c}(x) = \binom{N_c}{x} \left[1 - \left(\frac{N - 2N_c + x}{N - N_c + x}\right)^{M-1}\right]^{x} \times \left[\prod_{y=0}^{N_c-1} \frac{N - N_c + x - y}{N - y}\right]^{M-1}. \tag{13}$$

Proof. $p_{N_c}(x)$ is the probability that x subcarriers of the desired user collide with the subcarriers of interfering users given that each user occupies a total of $N_c$ subcarriers. It is evident that the number of possible combinations of x subcarriers that experience collisions is $\binom{N_c}{x}$. Define $q_{N_c}(a)$ as the probability that a symbols are collision-free given that each user occupies $N_c$ subcarriers. Furthermore, define $p_{N_c}(b \mid c)$ as the conditional probability that b symbols collide given that c symbols are collision-free. Therefore, we can write $p_{N_c}(x)$ as

$$p_{N_c}(x) = \binom{N_c}{x}\, q_{N_c}(N_c - x)\, p_{N_c}(x \mid N_c - x). \tag{14}$$

Here, $q_{N_c}(N_c - x)$ corresponds to

$$q_{N_c}(N_c - x) = \left[\prod_{k=0}^{N_c-1} \frac{N - N_c + x - k}{N - k}\right]^{M-1}. \tag{15}$$

Equation (15) denotes the probability that the desired user’s remaining $N_c - x$ subcarriers are collision-free, that is, none of the other M − 1 users within the same cell occupies these subcarriers. $p_{N_c}(x \mid N_c - x)$ is expressed as [4]

$$p_{N_c}(x \mid N_c - x) = \left[1 - \left(\frac{N - 2N_c + x}{N - N_c + x}\right)^{M-1}\right]^{x}. \tag{16}$$

Equation (16) represents the conditional probability that each of the x subcarriers of the desired user collides given that the other $N_c - x$ subcarriers are collision-free. By substituting (15) and (16) into (14), we obtain the result in (13).

2.3. Interference Model. In this paper, we model intercell interferences as additive complex Gaussian-distributed distortions. This model is accurate when interferences from adjacent cells are perfectly randomized with respect to the cell of interest. Models specific to clustered and independent FH-OFDMA are presented in the following.

2.3.1. Clustered FH-OFDMA. In clustered FH-OFDMA, if interference occurs on any symbol on one subcarrier in the cluster, all other symbols in the same cluster will also experience interferences from adjacent cells. Hence, the interference for the ith user can be modeled as [6]

$$r^i = H^i x^i + n^i + e^i, \tag{17}$$

where $r^i$ is an $N_c \times 1$ vector representing the received signal of user i; $x^i$ is the $N_c \times 1$ transmitted signal vector; $H^i$ is an $N_c \times N_c$ matrix that contains the frequency domain representations of the channel impulse response; $n^i$ is an $N_c \times 1$ vector whose components are complex Gaussian random variables with zero mean and variance $\sigma^2$. Here, the $N_c \times 1$ vector $e^i$ is the interference vector that captures the interference from all adjacent cells. The components of $e^i$ are i.i.d. complex Gaussian random variables independent of $x^i$, $H^i$, and $n^i$ with mean zero and variances $(\sigma_1^2, \ldots, \sigma_{N_c}^2)^T$. The variances correspond to

$$\sigma_j^2 = \frac{\rho E_s}{\mathrm{SIR}_s}, \quad j = 1, 2, \ldots, N_c, \tag{18}$$

where $\mathrm{SIR}_s$ denotes the symbol signal-to-interference ratio and $\rho \in \{1, 0\}$ characterizes the presence/absence of a collision between users in different cells. That is, if there is a collision, $\rho$ equals one; if there is no collision, $\rho$ is set to zero. Furthermore, $\rho$ can be modeled as a Bernoulli random variable with probability of collision equal to p (i.e., $P(\rho = 1) = p$ and $P(\rho = 0) = 1 - p$), which can be expressed as

$$p = 1 - \left(1 - \frac{1}{N/N_c}\right)^{M-1}, \tag{19}$$

where M is the number of active users. If the system is fully loaded, then $M = N/N_c$. If there is a collision, that is, $\rho = 1$, then all subcarriers in the cluster will be affected by the intercell interference.

2.3.2. Independent FH-OFDMA. In independent hopping, since subcarriers are selected independently of all other subcarriers according to predetermined FH patterns, collisions occur independently. Hence, for the kth subcarrier of the ith user,

$$r^i(k) = H^i(k) x^i(k) + n^i(k) + e^i(k). \tag{20}$$

Here, the interference power $\sigma^2$ of the i.i.d. complex Gaussian random variable $e^i(k)$ corresponds to

$$\sigma^2 = \frac{\rho E_s}{\mathrm{SIR}_s}, \tag{21}$$

where $\rho = 1, 0$ with probabilities p and 1 − p, respectively. The collision probability p is given by

$$p = 1 - \left(1 - \frac{1}{N}\right)^{MN_c - 1}. \tag{22}$$

For a fully loaded system with independent hopping, M is identical to N and $N_c$ becomes one.

3. RNS-FH Pattern Design

RNS is defined by the choice of v positive integers $m_i$ (i = 1, 2, . . . , v), referred to as moduli [14]. If all the moduli are pairwise relatively prime, any integer $N_k$ which falls in the range of $[0, M_r)$ can be uniquely and unambiguously represented by the residue sequence

$(r_{k,1}, r_{k,2}, \ldots, r_{k,v})$, where $M_r = \prod_{i=1}^{v} m_i$ and $r_{k,i} = N_k \bmod m_i$ for i = 1, 2, . . . , v. Here, $N_k$ is used to describe the kth user’s FH address. To recover $N_k$, or to distinguish users at the base station, the Chinese remainder theorem (CRT) is generally used, which is well known for its capability of solving a set of linear congruences simultaneously. According to the CRT, it can be shown that the numerical value of $N_k$ can be computed as [15]

$$N_k = \left(\sum_{i=1}^{v} r_{k,i}\, a_i M_i\right) \bmod M_r, \tag{23}$$

where $M_i = M_r/m_i$ and $a_i = M_i^{-1} \bmod m_i$ for i = 1, 2, . . . , v.

Theorem 2. The residue sequences obtained using the RNS arithmetic as described above are orthogonal.

Proof. In order to prove that the residue sequences are orthogonal, we need to show that every $N_k$ in the range of $[0, M_r)$ has a unique residue set that is different from the residue sets generated by other integers within the same range. We prove this by contradiction as follows.
Assume that $N_1$ and $N_2$ are different integers in the same range $[0, M_r)$ with the same residue set. That is,

$$N_1 \bmod m_i = N_2 \bmod m_i, \quad i = 1, 2, \ldots, v. \tag{24}$$

Therefore, we have

$$(N_1 - N_2) \bmod m_i = 0. \tag{25}$$

Thus, we can conclude from (25) that $N_1 - N_2$ is a common multiple of all the $m_i$, and hence a multiple of their least common multiple (LCM). Furthermore, since the $m_i$ are pairwise relatively prime, their LCM is $M_r = \prod_{i=1}^{v} m_i$, so $N_1 - N_2$ must be a multiple of $M_r$. However, this cannot hold, since $N_1 \neq N_2$ together with $N_1 < M_r$ and $N_2 < M_r$ implies $0 < |N_1 - N_2| < M_r$. Therefore, by contradiction, $N_1$ and $N_2$ cannot have the same residue set. In general, the residue set $(r_{k,1}, r_{k,2}, \ldots, r_{k,v})$ generated by $N_k$ is unique and can be used to represent the integer $N_k$ if $N_k < M_r$.

Following the RNS arithmetic presented above, we propose to design FH patterns that satisfy all the requirements described in Section 1 while avoiding the limitations in [11]. Detailed procedures of constructing RNS-FH patterns are given in the following subsections. The first part describes the two-stage algorithm, while the second part introduces the multistage algorithm, which can be considered as a generalization of the two-stage algorithm. At the end of this section, we compare our proposed RNS-FH pattern design strategy with the method presented in [11].

3.1. Two-Stage Algorithm. In this part, the detailed procedure of constructing RNS-FH patterns via the so-called two-stage algorithm is introduced. We present the algorithm for a cluster hopping OFDMA system. It is straightforward to extend the algorithm to the independent hopping scenario. The steps involved in the two-stage selection algorithm are given as follows.

(1) Divide the total available subcarriers N into $M_c$ clusters with each cluster containing $N_c$ contiguous subcarriers.
(2) If $M_c$ can be written as a product of two relatively prime integers, for example, $M_c = a_1 \cdot b_1$, we can first group the $M_c$ clusters into $a_1$ groups with $b_1$ clusters in each group. Then, we index the groups from 0 to $a_1 - 1$.
(3) Index the clusters in each group from 0 to $b_1 - 1$.
(4) At the 0th time slot, assign integer $N_k$ to user k as its FH address according to its access order to the system, where $0 < N_k \le M_c$.
(5) If $N_k \bmod \{a_1, b_1\} = \{\hat{a}_1, \hat{b}_1\}$, then user k selects the $\hat{b}_1$th cluster out of the $\hat{a}_1$th group for transmission.
(6) At the $t_s$th time slot, assign integer $N_k + t_s$ to user k as its current FH address and repeat step 5.
(7) Repeat steps 4–6 until one mutually orthogonal FH pattern is obtained.
(8) If $M_c$ can be expressed as products of other combinations of two relatively prime integers, for example, $M_c = a_2 \cdot b_2 = \cdots = a_w \cdot b_w$, then w different orthogonal FH patterns can be obtained by repeating steps 2–7 w times.

An example is given in Figure 2 to illustrate the two-stage RNS-assisted frequency hopping strategy. Here, 6 users access the system (M = 6); the total number of subcarriers is 30 (N = 30) and they are divided into 6 clusters ($M_c = 6$) with each cluster containing 5 contiguous subcarriers ($N_c = 5$). At the 0th time slot, the FH address assigned to the 5th user is 5 according to his/her access order to the system. Therefore, 5 mod {2, 3} = {1, 2}. User 5 will choose the 2nd cluster of subcarriers out of the 1st group of clusters to transmit. At the 1st time slot, the FH address assigned to this user becomes 5 + 1 = 6. Since 6 mod {2, 3} = {0, 0}, he/she will select the 0th cluster of subcarriers out of the 0th group of clusters for transmission at this time. This process continues until one FH sequence of length $M_c$ is constructed.

[Figure 2: One example of RNS-assisted two-stage hopping strategy. Axes: time slots (0 ∼ 5) versus user index (1 ∼ 6).]

3.2. Multistage Algorithm. The multistage algorithm is an extension of the two-stage algorithm. Introducing the multistage algorithm can not only enhance the flexibility of the
pattern design, but also strengthen the robustness of the entire FH scheme. We describe the multistage algorithm assuming an independent hopping scheme with each user employing the same number of subcarriers, that is, Ni = Nc for i = 1, 2, ..., M. The steps involved in the multistage algorithm correspond to the following.

(1) If N can be written as a product of m pairwise relative primes, for example, N = a1 · b1 · c1 ..., we can first group N subcarriers into a1 groups with b1 subgroups in each group. Then, we index the first-stage groups from 0 to a1 − 1.

(2) Index the second-stage groups in each first-stage group from 0 to b1 − 1. Then group the subcarriers in each second-stage group into c1 subgroups.

(3) Similar steps continue on until all of the subcarriers are grouped and indexed at the mth-stage.

(4) At the 0th time slot, assign integer set {Nk, Nk + M, ..., Nk + MNc} to user k as its FH addresses, where Nk is its access order to the system, 0 < Nk ≤ N.

(5) If Nk mod {a1, b1, c1, ...} = {â1, b̂1, ĉ1, ...}, then user k first selects the b̂1th second-stage group out of the â1th first-stage group, then similar selecting procedures continue on until the subcarrier at the mth-stage has been extracted out for transmission.

(6) The process in step 5 is repeated on the other elements in the integer set of user k until Nc subcarriers have been extracted out for user k to transmit.

(7) At the ts th time slot, assign integer set {Nk + ts, Nk + M + ts, ..., Nk + MNc + ts} as the current FH addresses of user k and repeat steps 5-6.

(8) Repeat steps 4–7 until one mutually orthogonal FH pattern is obtained.

(9) If N can be expressed as products of other combinations of m pairwise relative primes, for example, N = a2 · b2 · c2 ... = ... = aw · bw · cw ..., then w different orthogonal FH patterns can be obtained by repeating steps 1–8, w times.

It is easy to visualize the multistage algorithm by using a tree diagram. An example is given in Figure 3. Here, 30 users access the system (M = 30); a total of 30 subcarriers are used, that is, N = a1 · b1 · c1 = 2 · 3 · 5 = 30. Two specific examples are illustrated as follows: (1) consider user 2. The subcarriers used by this user at the 0th time slot can be calculated as follows: 2 mod {2, 3, 5} = {0, 2, 2}; that is, in the 0th first-stage group, the 2nd subcarrier out of the 2nd second-stage group is selected for transmission. This is indicated with a solid line in Figure 3; (2) consider user 27. 27 mod {2, 3, 5} = {1, 0, 2}; that is, in the 1st first-stage group, the 2nd subcarrier out of the 0th second-stage group is selected by the 27th user for transmission at the 0th time slot. This is indicated with a dashed line in the figure. This procedure continues until an FH sequence of length N is completed. We should note that in this example, the system is fully loaded (M = N = 30). For M < N, each user is assigned a set of FH addresses rather than one unique FH signature. For example, consider user 2 in Figure 3, the 2nd subcarrier occupied by user 2 at the 0th time slot is determined starting from his/her current FH address 2 + 30 = 32 and following the steps as before. These steps are repeated until Nc subcarriers for user 2 are identified. Extrapolating the procedure across the time axis, an entire FH sequence of length N is designed.

Figure 3: One example of RNS-assisted multistage hopping strategy. (Tree diagram; axes: user index 1 ∼ 30, time slots 0 ∼ 29.)

With respect to the design procedures, the major difference between independent hopping and cluster hopping is the following: in independent hopping, each FH address specifies a single subcarrier that can be used. Therefore, if users have very high bandwidth/rate or other QoS requirements, multiple FH addresses can be given to accommodate. In cluster hopping scenario, a user may demand only one unique FH address as a single address completely specifies all Nc subcarriers required for transmission. Fully loaded independent hopping system is a special case of cluster hopping with one subcarrier in each cluster.

From Figures 2 and 3, it is evident that the proposed RNS-FH patterns guarantee the orthogonality among different users within a cell. That is, users within the same cell will not interfere with each other when they simultaneously access the system. The next example, which is shown in Figure 4, demonstrates that if different RNS-FH patterns are assigned to adjacent cells, intercell interferences can be perfectly averaged. In this example, N is set to 10 while the moduli sets used to construct FH patterns in cells 1 and 2 are
{a1 = 2, b1 = 5} and {a2 = 5, b2 = 2}, respectively. From Figure 4, it is evident that every user in cell 1 experiences interference from different users from cell 2 during each of his/her hops. For example, in the first OFDM symbol duration, user 1 in cell 1 is interfered by user 8 from cell 2; in the next OFDM symbol slot, user 1 is interfered by user 5 from cell 2 and so on. In general, users from different cells collide only once during a frequency hopping cycle under the proposed scheme. Therefore, full interference diversity is exploited in the case of RNS-FH patterns.

Figure 4: Two different FH patterns are given and their only collision point is highlighted. (Rows: subcarriers; columns: OFDM symbols; entries: user indices.)

Cell-1            Cell-2
10  9  8  7  6    10  9  8  7  6
 6  5  4  3  2     5  4  3  2  1
 2  1 10  9  8     6  5  4  3  2
 8  7  6  5  4     1 10  9  8  7
 4  3  2  1 10     2  1 10  9  8
 5  4  3  2  1     7  6  5  4  3
 1 10  9  8  7     8  7  6  5  4
 7  6  5  4  3     3  2  1 10  9
 3  2  1 10  9     4  3  2  1 10
 9  8  7  6  5     9  8  7  6  5

The properties of the proposed RNS-FH patterns can be summarized as follows.

(1) At most, a size of N × N mutually orthogonal FH pattern can be obtained for the independent hopping scheme. The size becomes Mc × Mc for the cluster hopping.

(2) If N (Mc) can be written as a product of m pairwise relative primes, then at least, (m−1)m! different RNS-FH patterns can be obtained.

(3) With the use of the same moduli set, for independent hopping, RNS-FH patterns constructed after N frames (Mc for cluster hopping) are actually periodical extensions of the RNS-FH pattern designed during the first N (Mc) frames.

(4) With knowledge of moduli and residue, the base station can regenerate the entire RNS-FH pattern using the CRT.

3.3. Comparison with [11]. In this section, we compare our proposed RNS-FH pattern design method with the technique presented in [11] (which also considers RNS as the design metric).

First of all, although both strategies (one proposed here and the other presented in [11]) use the RNS arithmetic as a basis, the mechanisms of determining the hopping sequence are different. In [11], the FH scheme can be visualized as a “top-down” approach where a given bandwidth is divided into multiple candidate subcarriers in multistages according to the predetermined moduli set (see [11, Figure 2]). That is, the choice of the moduli set (top level decision) determines the number of subcarriers that can be used (bottom level decision) for hopping. This scheme is driven in conjunction with MFSK-modulated signals and a reference register C, which has the same length as the moduli set (v), providing reference to each user in order to enable synchronous transmission. However, in our work, we assume that the division of the frequency bandwidth has already been done in advance. That is, the number of subcarriers that can be used for hopping is given (bottom level decision). Based on this number, we employ a proper moduli set to group and index each of the candidate subcarriers (top level decision). Therefore, we can interpret our proposed initialization process as a “bottom-up” approach (see Figure 3). It is important to note that in practical OFDMA cellular systems, the division of the bandwidth within a cell is usually fixed and predetermined (e.g., 1024 subcarriers). Therefore, our “bottom-up” approach is more suitable for such practical systems. Furthermore, unlike the length-v reference register C that is used in [11], the FH scheme proposed in this paper invokes the use of only a length-one register to store the time index which in turn can be used to calculate current FH address of each user at the base station.

Secondly, for reducing intercell interference, [11] suggests the use of different moduli sets for adjacent cells. Since the choice of the moduli set determines the number of subcarriers used for hopping, a different moduli set in adjacent cells will result in different number of subcarriers in adjacent cells. If the total bandwidth is the same for all cells, this approach translates into subcarriers in adjacent cells having different bandwidths. This may be an unrealistic assumption for practical OFDMA systems. If the method in [11] is applied to a practical scenario using fixed number of subcarriers (each with the same bandwidth), high intercell interference will result (as shown in Figure 8). Our proposed “bottom-up” approach does not suffer from this drawback as it is built on the premise that the number of subcarriers and their bandwidths are fixed across cells.

In summary, the method proposed in this work is flexible and well suited for practical OFDMA cellular systems.

4. Simulation Results

Parameters of the simulated system are provided in Table 1. The cyclic prefix within one OFDM symbol duration is assumed long enough to eliminate ISI (intersymbol interference). Two 6-ray channel pulse responses are considered following the UTRA vehicular test environment [16]. In Figure 5, the correlation functions of these two channels are plotted versus the variation of Δf, while Δt = 1 slot and fD Ts = 0.01. From Figure 5, we can conclude that if small hopping intervals occur frequently in an FH pattern, Veh B can provide more frequency diversity than Veh A.

Theoretical (see (10) and (12)) and simulated expected number of collisions per symbol in RNS-FH OFDMA are
given in Figure 6. The high collision probability severely limits the number of active users that can be simultaneously supported by the FH system.

Figure 5: Channel correlation function. (Channel correlation in the frequency domain, Δt = 1 slot; |rH(Δf, Δt)| versus Δf in subcarriers; curves: Veh A, Veh B.)

Figure 6: Expected number of collisions per symbol versus the number of users. (Curves: cluster and independent hopping, analytical and simulated.)

Table 1: System parameters.

Parameter                   Value
Transmission BW             5 MHz
Carrier frequency           2 GHz
OFDM symbol duration        100 μs
CP duration                 10 μs
Tone spacing                11 kHz
FFT size                    128
Occupied subcarriers        110
Channel impulse response    Veh A/Veh B
Channel coding              1/2 convolutional code
Modulation                  QPSK
Time slots                  10

In Figure 7, bit error rate (BER) versus SNR of RNS-FH OFDMA under both cluster and independent hopping is plotted. The main objective of this example is to characterize the effects of frequency diversity exploited by RNS-FH patterns on system performance. Here, we assume that 10 users are in the system with 11 subcarriers assigned to each via the two-stage RNS hopping strategy. For cluster hopping, the moduli set used is {a1 = 2, b1 = 5}, while for independent hopping, it is {a1 = 2, b1 = 55}. It is observed that both independent and clustered RNS-FH OFDMA dramatically outperform the regular OFDMA scheme without hopping in both Veh A and Veh B environments. Another observation is that under both independent and cluster hopping, the system performs better in Veh A. That is, in the proposed RNS-FH patterns, large hopping intervals occur more frequently than small hopping distances. This characteristic is very important since it reveals that users occupy a wide bandwidth during a small fraction of all hops.

Furthermore, since independent hopping scheme results in a much larger FH pattern than cluster hopping, more frequency diversity can be exploited in the independent hopping case. This is also clearly reflected by the simulation results shown in Figure 7. For example, at a BER level of 10^-3, nearly 8 dB gain is offered by independent hopping relative to cluster hopping in Veh A environment.

Figure 7: BER versus SNR of RNS-FH OFDMA under cluster and independent hopping with different channel conditions. N = 110, M = Mc = 10, Nc = 11, fD Ts = 0.01. (Curves: no hopping, cluster hopping, and independent hopping, each in Veh A and Veh B.)

Figure 8 quantifies the intercell interferences experienced by different users in the cell of interest, averaged across time. The x-axis represents the indices of the users within
the cell of interest while the y-axis characterizes the time-averaged intercell interference-to-signal power ratio for a given user. Two situations are considered: (1) different RNS-FH patterns are allocated to the cell of interest and the interfering cell (denoted by the solid line); (2) the same RNS-FH pattern as the cell of interest is assigned to the interfering cell (denoted by the dashed line). Here, we model the intercell interference as additive Gaussian-distributed distortion. Therefore, in scenario (1), users in the cell of interest will experience different interferences from the interfering cell across all hops, which in turn induces interference diversity. Figure 8 clearly demonstrates that by employing the proposed method (i.e., allocating a different RNS-FH pattern to the interfering cell), the intercell interference floor can be significantly lowered relative to the scenario where all cells employ identical RNS-FH patterns.

Figure 8: Intercell interference-to-signal power ratio for given users under different RNS-FH patterns and identical RNS-FH patterns assignments across cells. (Interference-to-signal power ratio in dB versus users' indices; curves: Diff. RNS-FH, Same RNS-FH.)

Figure 9: BER versus SIR of RNS-FH OFDMA under cluster hopping with different channel conditions. N = 110, M = Mc = 10, Nc = 11, SNR = 25 dB, fD Ts = 0.01. (Panel title: Effect of inter-cell interference on system performance with cluster hopping; curves: no hopping, same RNS-FH, and diff. RNS-FH, each in Veh A and Veh B.)

Figure 10: BER versus SIR of RNS-FH OFDMA under independent hopping with different channel conditions. N = 110, M = Mc = 10, Nc = 11, SNR = 25 dB, fD Ts = 0.01. (Panel title: Effect of inter-cell interference on system performance with independent hopping; curves: no hopping, same RNS-FH, and diff. RNS-FH, each in Veh A and Veh B.)

Figures 9 and 10 show the effects of intercell interference diversity on system performance. BER versus signal-to-interference ratio (SIR) is plotted under cluster and independent hopping in Figures 9 and 10, respectively. For cluster hopping, the FH pattern assigned to the interfering cell is constructed by using {a2 = 5, b2 = 2} while it is {a2 = 55, b2 = 2} for the independent hopping scenario. We simulate the case where the same RNS-FH pattern used in the cell of interest is assigned to adjacent interfering cells. Thus, users in the cell of interest will be affected by the same interferences from adjacent cells during all hops. Therefore, no interference diversity is exploited. Simulation results also reflect this feature. When the same RNS-FH pattern is assigned, frequency diversity as a result of hopping reduces the interference floor. Therefore, the no hopping case still exhibits the worst BER performance. When different patterns are allocated to interfering cells, the interference diversity along with frequency diversity further improves system BER performance. For example, in cluster hopping (Figure 9), with different pattern assignments, nearly 3 dB gain at a BER level of 10^-2 is achieved relative to the system employing identical hopping. This gain grows to 5 dB under independent hopping scenario (Veh B environment).
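The contrast between identical and different pattern assignments can be illustrated with a small sketch of the RNS address mapping from Section 3, using the N = 10 example with moduli sets {2, 5} and {5, 2}. This is only an illustration under stated assumptions, not the simulation code behind the figures: the function names (rns_subcarrier, fh_pattern, collision_counts) are hypothetical, users are numbered 1 to N with FH address k + t in slot t, and the residues are combined with the first residue as the outermost group index, following the worked examples for users 2 and 27.

```python
from math import gcd, prod


def rns_subcarrier(address, moduli):
    """Map an FH address to a subcarrier index by successive residue selection.

    Assumption: the first residue selects the first-stage group, the next
    residue selects the subgroup inside it, and so on (mixed-radix index).
    """
    index = 0
    for m in moduli:
        index = index * m + address % m
    return index


def fh_pattern(moduli):
    """Return pattern[t][k-1] = subcarrier used by user k (1..N) in slot t."""
    # Pairwise coprime moduli make the residue tuples unique (CRT).
    assert all(gcd(a, b) == 1
               for i, a in enumerate(moduli) for b in moduli[i + 1:])
    n = prod(moduli)
    return [[rns_subcarrier(k + t, moduli) for k in range(1, n + 1)]
            for t in range(n)]


def collision_counts(pattern_a, pattern_b):
    """counts[k][j] = number of slots in which user k+1 of cell A and
    user j+1 of cell B occupy the same subcarrier."""
    n = len(pattern_a)
    counts = [[0] * n for _ in range(n)]
    for row_a, row_b in zip(pattern_a, pattern_b):
        owner_b = {sub: j for j, sub in enumerate(row_b)}
        for k, sub in enumerate(row_a):
            counts[k][owner_b[sub]] += 1
    return counts


cell1 = fh_pattern([2, 5])  # moduli set of cell 1 in the N = 10 example
cell2 = fh_pattern([5, 2])  # moduli set of cell 2

same = collision_counts(cell1, cell1)  # identical patterns in both cells
diff = collision_counts(cell1, cell2)  # different patterns

# Distinct interfering users seen by user 1 of cell 1 over one hopping cycle.
interferers_same = sum(1 for c in same[0] if c > 0)
interferers_diff = sum(1 for c in diff[0] if c > 0)
print(interferers_same, interferers_diff, max(map(max, diff)))
```

Under these assumptions, every slot of each pattern is a permutation of the subcarriers (no intracell collisions), and the first slot reproduces the first OFDM symbol of both grids in Figure 4. With identical patterns, user 1 meets the same single interferer in every slot; with the two different moduli orderings, the interference in this sketch is spread over nine different users, and no pair of users meets more than twice per cycle.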
BER versus user loads is plotted in Figures 11 and 12 under cluster and independent hopping, respectively, in both Veh A and Veh B. Effects of frequency and interference diversities on system performance are explored at given SNR and SIR. It is evident that the system throughput can be significantly enhanced by assigning different RNS-FH patterns to different cells, while it is severely limited if no hopping occurs. Furthermore, the performance gap between the identical hopping and the different hopping decreases with the increase in user loads. That is, the benefit of intercell interference diversity is greater for lower user loads.

Figure 11: BER versus user loads of RNS-FH OFDMA under cluster hopping with different channel conditions, N = 110, M = Mc = 10, Nc = 11, SNR = 25 dB, SIR = 15 dB, fD Ts = 0.01. (Panel title: Cluster-hopped RNS-FH OFDMA; curves: no hopping, same RNS-FH, and diff. RNS-FH, each in Veh A and Veh B.)

Figure 12: BER versus user loads of RNS-FH OFDMA under independent hopping with different channel conditions, N = 110, M = Mc = 10, Nc = 11, SNR = 25 dB, SIR = 15 dB, fD Ts = 0.01. (Panel title: Independent-hopped RNS-FH OFDMA; curves: no hopping, same RNS-FH, and diff. RNS-FH, each in Veh A and Veh B.)

Figure 13 illustrates that by increasing the cluster size (the number of subcarriers in one cluster), or the number of active users, the number of collisions increases. This in turn induces degradation in BER performance as can be seen from Figure 13.

Figure 13: Performance of cluster-hopped RNS-FH OFDMA with different cluster sizes and different number of active users, fD Ts = 0.01. (BER versus SNR in dB; curves: cluster hopping with M = 10, Nc = 10; M = 6, Nc = 10; M = 10, Nc = 5; M = 6, Nc = 5.)

Finally, we compare our proposed RNS-FH pattern design strategy with state-of-the-art FH pattern designs. Specifically, our benchmark for comparison is the Latin squares (LS)-aided FH pattern design presented in [6]. In our proposed RNS-FH pattern, the spacing between hops in time and frequency is far enough that subcarriers employed in a single time slot are weakly correlated. This feature provides remarkable performance improvements that are consistent across all cells. However, in LS-aided FH pattern design, performances in different cells may vary a lot. Relative comparisons are given in Figure 14, where two Latin squares-based FH patterns A4 and A38 [6] are employed. In LS A38, smaller hops happen more frequently, and for such smaller hops, Veh B exploits more frequency diversity than Veh A. The opposite is also true for LS A4. Using simulation results, we first observe that in RNS-aided FH-OFDMA, different RNS-FH patterns provide nearly the same BER performance, while it varies a lot in LS-aided FH-OFDMA; the second observation is that our proposed RNS-FH patterns have similar BER performances to LS A4 while outperforming LS A38. Although there may exist LS-aided FH pattern that has better performance than the proposed
scheme, the performance variations in LS-aided FH pattern design really limit their applications.

Figure 14: Performance of independent-hopped RNS-FH OFDMA versus LS-FH OFDMA, N = 110, M = Mc = 10, Nc = 11, LS A4 and A38 are used, different moduli sets m1 = {2, 55} and m2 = {2, 11, 5} are applied to construct RNS-FH patterns, fD Ts = 0.01. (BER versus SNR in dB; curves: LS-FH with A4 and A38, RNS-FH with m1 and m2, each in Veh A and Veh B.)

5. Conclusions

In this paper, we propose an RNS arithmetic-based FH pattern design that is well suited and easy to implement for practical OFDMA cellular systems. RNS-FH patterns not only guarantee zero collision within a cell, but also average the intercell interferences by assigning different FH patterns to adjacent cells. Additionally, by having a large spacing between the hopping frequencies, the RNS-FH patterns exploit frequency diversity effectively and provide significant improvement in BER performance. The BER performance gain is consistent across all cells unlike other FH pattern design schemes such as the LS-based method where wide performance variations are observed across cells. Simulation experiments demonstrate the superior performance of the RNS-FH scheme in terms of frequency diversity and intercell interference diversity under both independent and cluster hopping strategies.

References

[1] S. Gault, W. Hachem, and P. Ciblat, “Performance analysis of an OFDMA transmission system in a multicell environment,” IEEE Transactions on Communications, vol. 55, no. 4, pp. 740–751, 2007.
[2] M. K. Simon, J. K. Omura, R. A. Scholtz, and B. K. Levitt, Spread Spectrum Communications, Computer Science Press, Rockville, Md, USA, 1985.
[3] T. Scholand, T. Faber, A. Seebens, et al., “Fast frequency hopping OFDM concept,” Electronics Letters, vol. 41, no. 13, pp. 748–749, 2005.
[4] T. Kurt and H. Deliç, “On symbol collisions in FH-OFDMA,” (VTC ’04), vol. 4, pp. 1859–1863, Milan, Italy, May 2004.
[5] T. Kurt and H. Deliç, “Space-frequency coding reduces the collision rate in FH-OFDMA,” IEEE Transactions on Wireless
[6] K. Stamatiou and J. G. Proakis, “A performance analysis of coded frequency-hopped OFDMA,” in Proceedings of IEEE Wireless Communications and Networking Conference (WCNC ’05), vol. 2, pp. 1132–1137, New Orleans, La, USA, March 2005.
[7] S. V. Maric and O. Moreno, “Using costas arrays to construct frequency hop patterns for OFDM wireless systems,” in Proceedings of the 40th IEEE Conference on Information Sciences and Systems (CISS ’06), pp. 505–507, Princeton, NJ, USA, March 2007.
[8] C. Wang, X. Zhang, and D. Yang, “Evaluation of welch-costas frequency hopping pattern for OFDM cellular system,” in Proceedings of the 18th IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC ’07), pp. 1–5, Athens, Greece, September 2007.
[9] T. Li, Q. Ling, and J. Ren, “A spectrally efficient frequency hopping system,” in Proceedings of the 50th IEEE Global Telecommunications Conference (GLOBECOM ’07), pp. 2997–3001, Washington, DC, USA, November 2007.
[10] B. M. Popovic and Y. Li, “Frequency-hopping pilot patterns for OFDM cellular systems,” IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol. E89-A, no. 9, pp. 2322–2328, 2006.
[11] L.-L. Yang and L. Hanzo, “Residue number system assisted fast frequency-hopped synchronous ultra-wideband spread-spectrum multiple-access: a design alternative to impulse radio,” IEEE Journal on Selected Areas in Communications, vol. 20, no. 9, pp. 1652–1663, 2002.
[12] J. Chen, T. Lv, and H. Zheng, “Joint cross-layer design for wireless QoS content delivery,” in Proceedings of IEEE International Conference on Communications (ICC ’04), vol. 7, pp. 4243–4247, Paris, France, June 2004.
[13] J. G. Proakis, Digital Communications, McGraw Hill, New York, NY, USA, 4th edition, 2001.
[14] K. W. Watson and C. W. Hastings, “Self-checked computation using residue arithmetic,” Proceedings of the IEEE, vol. 54, no. 12, pp. 1920–1931, 1966.
[15] L.-L. Yang and L. Hanzo, “Redundant residue number system based error correction codes,” in Proceedings of the 54th IEEE Vehicular Technology Conference (VTC ’01), vol. 3, pp. 1472–1476, Atlantic City, NJ, USA, October 2001.
[16] ETSI TR 101 112, UMTS 30.03, V3.1.0, Annex B, Std.
doi:10.1155/2009/950674
Research Article
Implementation of a Smart Antenna Base Station for
Mobile WiMAX Based on OFDMA
Seungheon Hyeon, Changhoon Lee, Chang-eui Shin, and Seungwon Choi

Department of Electronics and Computer Engineering, Hanyang University, 17 Haengdang-Dong, Seongdong-Gu,
Seoul 133-791, South Korea
Correspondence should be addressed to Seungwon Choi, choi@dsplab.hanyang.ac.kr
Received 1 August 2008; Revised 7 January 2009; Accepted 12 February 2009
We present an implementation of a mobile-WiMAX (m-WiMAX) base station (BS) that supports smart antenna (SA) functionality.
To implement the m-WiMAX SA BS, we must address a number of key issues in baseband signal processing related to symbol-
timing acquisition, the beamforming scheme, and accurate calibration. We propose appropriate solutions and implement an m-
WiMAX SA BS accordingly. Experimental tests were performed to verify the validity of the solutions. Results showed a 3.5-time
(5.5 dB) link-budget enhancement on the uplink compared to a single antenna system. In addition, the experimental results were
consistent with the results of the computer simulation.
Copyright © 2009 Seungheon Hyeon et al. This is an open access article distributed under the Creative Commons Attribution
cited.
1. Introduction

Modern mobile communication requires not only a high data rate transmission but also a relatively fast mobility. The mobile WiMAX (m-WiMAX) based on orthogonal frequency division multiple access (OFDMA) is believed to be a solution that addresses both of these requirements [1]. Moreover, the application of smart antenna (SA) technologies to OFDMA is regarded as a key solution for increasing the data rates and the mobility of fourth generation (4G) wireless communication systems operating in frequency-selective fading environments. However, there are several things to consider in baseband signal processing when implementing SA systems in OFDMA. These include the performance of symbol-timing acquisition, the beamforming scheme, and accurate calibration.

The SA system enlarges cell coverage through beamforming. However, to obtain effectively enlarged cell coverage, performance of the initial acquisition and symbol synchronization should also be enhanced. Since initial acquisition is performed prior to calculating the weight vector, an algorithm to enlarge the acquisition coverage is required. Moreover, in the contention-based ranging used in m-WiMAX, since classification of the ranging signal by the user is impossible prior to decoding, it is difficult to properly apply a weight to the desired ranging signal.

Various beamforming algorithms for OFDMA communications have been investigated [2, 3]. However, most of the research focuses on beamforming per subcarrier using the conventional single-carrier beamforming algorithm. This approach causes high computational loads and increases system complexity.

The calibration technique is essential for the SA system to apply a proper beamforming weight to the transmission. Without an accurate calibration technique, the advantages of SA technology cannot be provided in the downlink [4]. More specifically, even if the optimal weight vector is computed from the received signal, downlink (DL) beamforming can never be optimized without accurate calibration. The primary reason is that the beamforming parameter for the DL is, in most cases, heavily dependent upon the parameter values computed during the uplink (UL). Thus, the overall communication quality of the SA base-station (BS) system cannot be improved without a proper calibration technique.

In this paper, we propose solutions for these problems and implement an m-WiMAX SA BS accordingly. In Section 2, we propose our solutions, and Section 3 shows
the implementation of the m-WiMAX SA BS. Each signal-processing module is described in detail in this section. The performance of the m-WiMAX SA BS is presented in comparison to the conventional single-antenna BS in Section 4, and computer-simulation results are shown to verify our experimental results. Finally, we conclude this paper in Section 5.

2. Considerations for Implementation of the m-WiMAX SA BS

This section addresses some essential problems that must be considered when implementing the m-WiMAX SA BS. These include the performance of symbol-timing acquisition, an optimized beamforming scheme, and accurate calibration. For SA BS to provide effective coverage, the coverage of the symbol-timing acquisition must be enhanced. The optimized beamforming scheme is essential to implement an SA BS. Finally, to provide proper downlink and uplink beamformings, a pragmatic procedure for automatic calibration is required for the SA BS. In the following subsections, we propose solutions to these problems.

2.1. Ranging Processing. The problem of ranging arises because the propagation delays between the SA BS and each of the mobile stations (MSs) in a given cell are different, so the arrival time of the signal associated with each of the subscribers cannot be the same. Beamforming gain can be obtained in the SA BS only when symbol time synchronization is performed properly. Thus, proper symbol time synchronization is a prerequisite if the SA BS is to enhance communication capacity and cell coverage.

Time synchronization, which is used to compensate for differences in propagation delays, is referred to as “ranging” in the mobile-WiMAX system. Each subscriber randomly selects a ranging code, allocates that code to the ranging channel, and transmits it in the form of a ranging symbol. The BS then checks whether or not the ranging code has been transmitted in a given uplink frame at each frame time throughout the code detection procedure. When the BS detects the ranging code transmitted by a subscriber, it finds the ranging code index and estimates the propagation delay associated with that MS.

Figure 1 illustrates the ranging channel receiver in an m-WiMAX SA BS. This algorithm is less complex and more efficient than conventional correlation-based algorithms [5, 6]. In other words, for an N-subcarrier m-WiMAX system, the conventional correlation-based algorithm requires N complex multipliers while the proposed ranging algorithm requires only log4 N − 1. Assuming that the propagation delay of the ranging symbol arriving at the BS is τ, the receiving (RX) signal of each antenna is not retrieved correctly because of the propagation delay. Based on the correlation characteristics of the pseudorandom binary sequence (PRBS) and the circular shift property of the discrete Fourier transform operator, after the fast Fourier transform (FFT) operation, the signal of each antenna is descrambled using the ranging code transmitted by the target subscriber and then becomes a rectangular function with its phase rotated in proportion to the propagation delay. After taking the inverse-FFT (IFFT) of the descrambled signal, the absolute value of the signal of each antenna is summed. This value is denoted as Z[n] and has its maximum value when n = τ. The structure of the ranging channel receiver shown in Figure 1 provides a diversity gain in both ranging code detection and propagation delay estimation because the detection variable is obtained through a noncoherent combination at each antenna path.

The signal received through antenna path, l, can be written as

$$ r_l[n] = x_m[n-\tau]\, e^{-j2\pi (d_l/\lambda_c)\sin\theta_m} + w_l[n], \quad n = 0, 1, \ldots, N-1, \tag{1} $$

where xm[n] is the time-domain symbol obtained as the result of an IFFT at subscriber m, dl is the distance between the lth and reference antenna element, θm is the direction of arrival (DOA), and λc is the wavelength of the received signal at its carrier frequency. For simplicity, but without loss of generality, we have assumed that there are no other user signals. The FFT of (1) can then be written as

$$ R_l[k] = \begin{cases} X_m[k]\, e^{-j(2\pi/N)k\tau}\, e^{-j2\pi (d_l/\lambda_c)\sin\theta_m} + W_l[k], & N-C \le k \le N-1, \\ W_l[k], & 0 \le k < N-C, \end{cases} \tag{2} $$

where C is the length of the ranging code. To apply the proposed algorithm, the received signal shown in (2) is descrambled with the ranging code, Xm[k], and processed with the IFFT operator as shown in Figure 1. In the case of i = m, the result of the IFFT operation can be written as

$$ h_{l,m}[n] = \frac{1}{N}\, \frac{1 - e^{j(2\pi/N)C(n-\tau)}}{1 - e^{j(2\pi/N)(n-\tau)}}\, e^{j(2\pi/N)(N-C)(n-\tau)}\, e^{-j2\pi (d_l/\lambda_c)\sin\theta_m} + w_l[n] * x_m[n], \quad n = 0, 1, \ldots, N-1. \tag{3} $$

The received signal shown in (3) is a complex Gaussian random process with a mean of C/N, which implies that the detection variable obtained at each antenna channel, Zl[τ], is a noncentral chi-square random process with two degrees of freedom. The detection variable of the array antenna system consisting of L antenna elements is consequently a noncentral, chi-square distributed random variable with 2L degrees of freedom, and can be written as

$$ p_Z(\alpha) = \begin{cases} \dfrac{1}{2\sigma^2} \left( \dfrac{\alpha}{\sigma^2 \gamma} \right)^{(L-1)/2} \exp\left[ -\dfrac{1}{2} \left( \dfrac{\alpha}{\sigma^2} + \gamma \right) \right] I_{L-1}\left( \sqrt{\dfrac{\gamma\alpha}{\sigma^2}} \right), & \alpha \ge 0, \\ 0, & \text{otherwise}, \end{cases} \tag{4} $$

where γ = (μI^2 + μQ^2)(L/σ^2), I_{L−1}(·) is the modified Bessel function of the first kind of order L − 1, and where μI and
1st ant.
CP r1 [n] Tile R1 [k] H1,m [k] h1,m [n] Z1 [n]

remover FFT permutation IFFT 2
2nd ant.
r2 [n] R2 [k] H2,m [k] h2,m [n] Z2 [n]

CP Tile
remover FFT IFFT 2
permutation
m
Z[n] Select first peak
with threshold τ
. . Z[n] > β
Lth ant. .. ..
rL [n] RL [k] HL,m [k] hL,m [n]

CP Tile ZL [n]
remover FFT permutation IFFT 2
Xi [k]
Ranging code
generator
Figure 1: Ranging processing for the m-WiMAX SA system.
μQ denote the real and imaginary parts of hl,m [n]. The mean 1
and variance of the detection variable in an array system Exact time estimation probability, Pc 0.98
consisting of L antenna elements are expressed as 0.96
0.94
E[Z] = L 2σ 2 + μ2I + μQ , 2
0.92
2 (5)
E Z−Z = L 4σ 2 + 4σ 2 μ2I + μ2Q , 0.9
0.88
where Z denotes E[Z]. The mean and variance of the 0.86
detection variable increase linearly in accordance with the 0.84
number of antenna elements, as shown in (5). This means
0.82
that the SNR of the ranging code detector increases in
proportion to L, where the SNR of the ranging channel 0.8
2 −10 −9 −8 −7 −6 −5 −4 −3 −2 −1 0 1 2 3
receiver is defined as (E[Z])2 /E[(Z − Z) ]. Eb /N0 (dB)
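The descramble-then-IFFT detection described by (2) and (3) can be illustrated with a short NumPy sketch. This is only a minimal illustration under stated assumptions, not the implemented receiver: the function names, the random ±1 code, the values N = 1024 and C = 144, the half-wavelength element spacing, the noise level, and the threshold choice are all hypothetical, and the strongest sample above the threshold is taken instead of the first-peak rule shown in Figure 1.

```python
import numpy as np


def detect_ranging(rx_freq, code, beta):
    """Descramble each antenna's FFT output with a candidate code, take the
    IFFT, and combine the squared magnitudes over the antennas.

    rx_freq: (L, N) array of R_l[k]; code: (N,) candidate X_i[k], zero outside
    the ranging band; beta: detection threshold (assumed given).
    Returns (estimated delay in samples or None, detection variable Z[n]).
    """
    descrambled = rx_freq * np.conj(code)     # H_{l,i}[k]
    h = np.fft.ifft(descrambled, axis=1)      # h_{l,i}[n], includes the 1/N factor
    z = np.sum(np.abs(h) ** 2, axis=0)        # Z[n] summed over the L antennas
    n_hat = int(np.argmax(z))                 # simplification: strongest sample
    return (n_hat if z[n_hat] > beta else None), z


def simulate_uplink(num_ant=4, n_fft=1024, code_len=144, tau=37,
                    theta_deg=20.0, noise_std=0.05, seed=0):
    """Build R_l[k] as in (2) for one transmitted code (all values hypothetical)."""
    rng = np.random.default_rng(seed)
    k = np.arange(n_fft)
    band = k >= n_fft - code_len              # ranging band: N - C <= k <= N - 1
    codes = np.zeros((2, n_fft))
    codes[:, band] = rng.choice([-1.0, 1.0], size=(2, code_len))
    delay = np.exp(-2j * np.pi * k * tau / n_fft)
    # Array phase term with d_l / lambda_c = 0.5 * l (half-wavelength spacing).
    steer = np.exp(-2j * np.pi * 0.5 * np.arange(num_ant)
                   * np.sin(np.deg2rad(theta_deg)))
    noise = noise_std * (rng.standard_normal((num_ant, n_fft))
                         + 1j * rng.standard_normal((num_ant, n_fft))) / np.sqrt(2)
    rx = steer[:, None] * (codes[0] * delay)[None, :] + noise
    return rx, codes


rx, codes = simulate_uplink()
beta = 0.5 * 4 * (144 / 1024) ** 2            # half of the ideal peak L * (C/N)^2
tau_hat, _ = detect_ranging(rx, codes[0], beta)   # code that was transmitted
missed, _ = detect_ranging(rx, codes[1], beta)    # code that was not transmitted
print(tau_hat, missed)
```

With the transmitted code, the peak of Z[n] sits at the delay and has a height close to L(C/N)^2, in line with the mean of C/N noted for (3); with a different code the energy is spread over all samples and stays below the threshold.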
On the contrary, if the signal of each antenna is descrambled with a code that is different from the one transmitted by the target subscriber, $Z[n]$ approaches zero due to the correlation characteristics of the ranging codes.

Figure 2 illustrates the probability, $P_C$, of estimating the exact propagation delay provided by the proposed ranging channel receiver in terms of the number of antenna elements. As shown in the figure, the performance of the propagation delay estimation improves as the number of antenna elements increases. For a $P_C$ of at least 99%, the minimum $E_b/N_0$ of the communication channel with an array system of four antenna elements is about −4.4 dB. Compared to the BS consisting of a single-antenna element, the BS consisting of four antenna elements provides a performance enhancement of approximately 6.0 dB in the SNR.

Figure 2: Symbol-timing acquisition probability of the proposed ranging algorithm. (Exact time estimation probability, $P_C$, versus $E_b/N_0$ in dB, for $L$ = 1, 2, 3, and 4.)

2.2. Beamforming Scheme. The conventional beamforming algorithms for OFDMA use samples in time to estimate the statistical characteristic of the spatial channel [2, 3]. This approach avoids the effect of frequency selective fading. However, it is difficult to obtain enough samples to estimate the statistical characteristic of a spatial channel in an m-WiMAX waveform which is a packet-based communication. Note that the spatial-channel basis is independent of both time and frequency in narrowband communications. Therefore, we can obtain enough samples to estimate the spatial-channel basis in both the time and frequency domains.

Figure 3: Calculation of autocorrelation matrix for the m-WiMAX SA system. (Time-frequency grid of data and pilot subcarriers, comparing the samples used by the proposed scheme and by the conventional scheme.)

Figure 4: Performance comparison of the proposed beamforming scheme to the conventional scheme. ($L$ = 4, QPSK, Rayleigh fading, $f_d$ = 266.667 Hz; bit error rate versus $E_b/N_0$ in dB for $L$ = 1 SISO, conventional beamforming, and proposed beamforming.)

In this paper, we propose a beamforming scheme that uses samples from both the time and frequency domains to estimate a spatial-channel basis which is used as the beamforming-weight vector. The processing procedure for the proposed scheme is depicted in Figure 3. In Figure 3, $n$ and $k$ are the time and frequency indices, and $N$ and $K$ are the total number of pilot subcarriers in the time and frequency domains of a given packet. Compared to conventional beamforming, the biggest advantage of the proposed scheme is that more samples can be obtained from the given OFDMA symbols (i.e., $NK > K$) to calculate the weight vector. The second advantage is that the delay for converging the weight vector calculated by the adaptive
algorithm is reduced. In this paper, the Lagrange multiplier-based algorithm [7] is used for the beamforming scheme.

Figure 4 shows the performance comparison between the conventional beamforming and the proposed beamforming when the m-WiMAX packet consists of 15 OFDMA symbols. In this computer simulation, quadrature phase-shift keying (QPSK) was used as the modulation and the SA BS had a four-element array. The channel environment for the simulation was a Rayleigh fading channel of which the maximum Doppler-frequency component was 266.77 Hz. Note that the channel environment did not correspond to the experimental test in Section 4. As shown in Figure 4, the bit error rate (BER) performance of the conventional beamforming was degraded by 1.2 dB due to the lack of samples.

2.3. Calibration. The problem of calibration occurs because the phase characteristics of the radio frequency (RF)/intermediate frequency (IF) chains associated with each antenna are different in both the receiving (RX) and transmitting (TX) modes. Several calibration techniques for the SA system have been proposed [8–11]. Of these techniques, we chose to use [11] because it offers simple and accurate calibration. Although the experimental data in [11] was obtained using the CDMA2000 1x standard, it is noteworthy that this technique can be applied to the OFDMA standard. Another advantage is that this technique can be applied while the SA system is operating.

The chosen calibration technique requires the installation of an additional antenna which is used to transmit a test signal to, or receive a test signal from, each antenna element for the RX and TX calibrations, respectively. This additional antenna transmits the test signal through an RX carrier frequency and receives the test signal through a TX carrier frequency. The calibration is performed separately, since the RX and TX modes exist separately in the frame format of mobile WiMAX. By using a test signal orthogonal to the RX/TX signal, the influence on the SA BS can be minimized when the calibration operation is performed.

The RX path calibration is performed using the following procedure.

(1) The additional calibration antenna generates and transmits a test signal.
(2) Each RX path in the SA system receives the signal simultaneously.
(3) The calibration processor calculates a calibration value for each RX path in the SA system.

An exact numerical analysis of the procedure is given in [11]. The phase delay of the wireless path between each antenna and the additional antenna can be calculated by making a connection between each antenna path and the additional antenna path with a cable. The phase difference between each antenna RX path is obtained by correlating the received signal from each antenna path with the test signal.

The TX path calibration is performed separately from the RX path calibration using the following procedure.

(1) The calibration processor generates N (the number of antenna elements) orthogonal test signals for each TX path of the SA system.
(2) Each path transmits the signals.
(3) The additional calibration antenna receives the signals.
(4) The calibration processor calculates the calibration value for each TX path of the SA system.

As shown in [6], the phase difference between each antenna and the reference antenna is almost eliminated using
the calibration. As a result, a proper beam pattern can be obtained.

3. Implementation of the m-WiMAX SA BS

Figure 5 shows the baseband-SA modem for the m-WiMAX SA BS. The SA modem consists of eight fixed point digital signal processors (DSPs), two field programmable gate arrays (FPGAs), and a general purpose processor (GPP). In the modem, three DSPs exist for redundancy and are not used for signal processing. Five DSPs are used for encoding/decoding, beamforming, calibration, and ranging processing. Two FPGAs perform FFT/IFFT and permutations. Finally, the GPP is used for medium access control (MAC) to interface between the SA BS and the network. The detailed functionality of each device is described as follows.

Figure 5: Photograph of the SA modem for the m-WiMAX SA system. (Labeled parts: MAC GPP module, power block, DL, UL, BF, CAL, and RNG DSPs with their ROM and SDRAM, rear and front FPGAs, LVDS block, and the DSPs reserved for redundancy.)

Figure 6 shows the signal flow of the baseband as well as the allocation of the signal processing components to the devices in the SA modem. In the case of UL, the received signal is fed into the frontFPGA via low-voltage differential signaling (LVDS). The frontFPGA removes the CP of the received OFDMA symbols and passes them to the rearFPGA. The rearFPGA performs FFT, tile permutation, and UL weighting. The ranging code is also descrambled in the rearFPGA. The descrambled ranging channel is passed to RNG DSP for estimating the symbol timing, and the data channel is passed to UL DSP for decoding. The beamforming-weight vector is calculated by BF DSP using the pilots embedded in the permutated data channel. The BF DSP returns the weight vector to the rearFPGA. The weight vector is used for both UL and DL, since the m-WiMAX is operated in time-division duplex (TDD) mode. The decoded data is analyzed in MAC GPP and sent to the network. In DL, the MAC protocol data unit (PDU) is fed into DL DSP for encoding. The encoded data is passed to the rearFPGA for DL weighting, cluster permutation, and IFFT. The frontFPGA receives the OFDMA symbol and adds the CP. When the DL frame clock is enabled, the frontFPGA sends the OFDMA symbol to the intermediate frequency (IF) module via LVDS. The calibration is performed independently of UL/DL processing. The result of the calibration is multiplied with the weight vector in BF DSP to compensate for the amplitude and phase differences among the RF/IF chains.

Figure 6: Functional allocation for baseband modem of the SA system. (Blocks: frontFPGA with CP removal, CP addition, and buffering; rearFPGA with FFT, tile permutation, UL weighting, ranging code correlator, IFFT, cluster permutation, and DL weighting; RNG_DSP with ranging code detector and delay estimation, producing ranging_code_num and propagation_delay; UL_DSP with channel estimation, subcarrier rearrangement, digital demodulation, slot concatenation, bit deinterleaver, zero padding, channel decoding, and randomization; DL_DSP with randomization, channel coding, puncturing, bit interleaver, slot concatenation, digital modulation, subcarrier arrangement, and pilot insertion; CAL_DSP with calibration processing; BF_DSP with weight calculation; MAC_GPP exchanging MAC PDUs.)

Figure 7 describes how the signal processing is performed in synchronization with the system clock. The system clock (sysClk in Figure 7) generates a 10 MHz pulse. The frmSync is raised at the beginning of every frame duration, and UL_DL is toggled at every DL and UL duration. In Figure 7, we can see that all signal processes in Figure 6 are performed in parallel.

Figure 7: Timing diagram for baseband signal processing. (Signals sysClk, frmSync, and UL_DL, with the activity of the frontFPGA, rear FPGA, BF_DSP, CAL_DSP, RNG_DSP, UL_DSP, and DL_DSP over alternating UL and DL durations.)

Figure 8 is a photograph of the up-down converter unit (UDCU) employed in our SA BS. The UDCU consists of an analog-to-digital (A/D) converter, a digital-to-analog (D/A) converter, an Up/Down converter, and automatic gain control (AGC). When transmitting, the digital data from the SA modem is converted to the corresponding analog signal through D/A conversion. This analog signal is converted to an RF signal via the Up-converter. Then, the RF signal is transmitted through the front-end unit (FEU). When receiving, the received signal obtained from the FEU is first fed into the AGC. Then, the output of the AGC is converted to a digital signal which is sent to the SA modem.

The FEU, shown in Figure 9, includes a TDD switch and a low-noise amplifier (LNA). The TDD switch isolates the transmit and receive signals from each other in accordance with the DL and UL duration. The LNA amplifies the received signal with a noise level that is as low as possible.

The array antenna was implemented using five patch-type elements. The element spacing was a half-wavelength (6.52 cm). Four elements were used for transmitting and receiving signals, and the other element was used for calibration.

The signal processing modules presented in this section were integrated into the m-WiMAX SA BS. A photograph of the entire SA BS is provided with a description of the experimental environment in the next section.

4. Experimental Results

In this section, experimental results obtained from the implemented m-WiMAX SA BS are presented, including the symbol-timing estimation probability for the ranging process, the accuracy of the phase-delay compensation for the calibration, and throughput. In addition, various computer simulations supported the validity of our experimental results.

Figure 10 shows the experimental environment that included the implemented m-WiMAX SA BS, a six-element array antenna, mobile-station emulator, signal generator, spectrum analyzer, and server and client laptops which were connected to the BS and MS via Ethernet. Four elements of the array antenna were used to transmit and receive the m-WiMAX signal, and the other element was used for the proposed calibration. An additional element, connected to the spectrum analyzer, was used for measuring the signal-to-noise ratio (SNR) at the RF input of the SA BS. The signal generator radiated additive white Gaussian noise to control the SNR. To compare the performance between the SA BS and the conventional single antenna BS, two SA modems for the SA BS were used simultaneously. One SA modem was set to the conventional single-antenna mode by receiving the signal from an element of the array antenna, and the other modem was set to the SA mode. The system parameters used in this test are summarized in Table 1.

Figure 11 shows a comparison of the symbol-timing estimation probability of the conventional ranging process and the proposed ranging process. The experimental results were obtained by averaging the measurements during 10 000 frames, that is, a 50-second period. In addition, the experimental results coincided well with the results of computer
simulations which were calculated by compensating for the SNR in Figure 2. As shown in Figure 11, the proposed ranging process provided an enhancement of about 5.7 dB in the SNR required for a given symbol-timing estimation probability compared to the conventional ranging process.

Figure 8: Photograph of the UDCU for the m-WiMAX SA BS. (Labeled parts: UD/AD converters #0 and #1, LVDS block, and AGC.)

Figure 9: Photograph of the FEU for the m-WiMAX SA BS.

Figure 10: Photograph of experimental environment. (Labeled items: array antenna for BS, m-WiMAX SA BS, MS emulator, antenna for MS, server/client laptop, signal generator and its antenna, and spectrum analyzer.)

Table 1: System parameters of the implemented m-WiMAX SA BS.

System parameter              Value
Channel bandwidth             8.75 MHz
FFT size                      1024 points
CP ratio                      1/8
Subcarrier spacing            11.156 kHz
OFDMA symbol duration         100.840 μs
Number of symbols (DL/UL)     27/15
Frame length                  5 ms
Modulation scheme             QPSK
Number of antennas (BS/MS)    4/1

Figure 11: Experimental results of the proposed ranging algorithm. (Symbol-timing estimation probability versus SNR at the RF input in dB; computer simulation and experimental results for L = 1 and L = 4.)

Figures 12 and 13 show the measurements of the relative phase differences between each RF/IF chain and a reference RF/IF chain before and after the proposed calibration. As shown in Figure 12, the relative phase delay at each RF/IF chain differed from the others but remained nearly constant over time. From the measurements, we observed that the phase delay of the RF/IF chain associated with each antenna element remained steady for a duration of over 500 symbols. Figure 13 shows the phase delay after the proposed calibration. The standard deviation of the residual phase error of the relative phase delay at each antenna element was 2–3° and remained steady for five hours. Figure 13 shows that the proposed calibration technique eliminated the phase difference of the RF/IF chain associated with the antenna elements.

Finally, Figure 14 shows the measured uplink throughput of the conventional single-antenna BS and SA BS. The experimental results were averaged over five minutes per given SNR. To measure the throughput of both BSs, a movie file was uploaded from the client laptop, which was connected to the MS, to the server laptop connected to the BS. In other words, the experiment was performed with packet-based communication. As shown in Figure 14,
the proposed beamforming scheme provides a 5.5 dB link-budget enhancement. These results mean that the proposed beamforming scheme can be implemented. In addition, the experimental results are consistent with the results from the computer simulation.

Figure 12: Phase characteristics obtained by experiment before calibration. (Phase characteristic in degrees versus time in OFDMA symbol durations, 0 to 500, for antennas 0 to 3.)

Figure 13: Phase characteristics obtained by experiment after calibration. (Phase characteristic in degrees versus time in OFDMA symbol durations, 0 to 500, for antennas 0 to 3.)

Figure 14: Throughput of implemented SA system obtained by experiment. (Throughput in Kbps versus SNR at the RF input in dB; experimental results and computer simulation for L = 1 and L = 4.)

5. Conclusion

In this paper, we addressed three key issues in implementing the m-WiMAX SA BS: ranging, beamforming, and calibration.

First, the proposed ranging process significantly reduced calculation loads using IFFT instead of a correlation operation. Moreover, the proposed process achieved diversity gain in the received signals from each antenna path.

Second, the proposed beamforming scheme addressed the lack of samples in OFDM-based packet communications. The proposed scheme used time and frequency samples for obtaining the statistical properties of the spatial channel.

Finally, the calibration method, which can be applied while the SA system is operating, was proposed. Although additional antenna chains are required, the proposed method provided fast and accurate performance.

The experimental results and computer simulations verified the validity of these solutions. As shown in Section 4, the proposed solutions can be applied to the m-WiMAX SA BS. In addition, the m-WiMAX SA BS increased the link-budget by 5.5 dB.

It should be noted that the experiments described in this paper represent lab tests only, which might be quite different from the outdoor environments in which m-WiMAX is used. As shown in Figure 10, the MS in our lab tests was located just 4–5 meters away from the BS in a direct line of sight. Since a mobile fading environment cannot easily be set up in the laboratory, we checked the proposed beamforming scheme in fading environments through various computer simulations. As shown in Figures 2 and 4, it is clear that the proposed beamforming scheme provided a remarkable improvement in mobile fading environments as well as in the static circumstances of the lab tests. Another limitation of the experimental tests was that the calibration performance was not verified in the throughput tests shown in Figure 14. Note that as the calibration was used for downlink beamforming, the uplink performance shown in this paper does not confirm the validity of the proposed calibration procedure except that the phase differences at each antenna channel were equalized as shown in Figures 12 and 13. Future tests could include the downlink measurements to verify the actual performance of the proposed calibration procedure.

Acknowledgments

This work was partly supported by the IT R&D program of MIC/IITA (2007-S001-01, Implementation of Advanced-MIMO system) and the HY-SDR research center at Hanyang University, Seoul, South Korea under the ITRC program of MIC, South Korea.
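As a companion to the RX path calibration procedure of Section 2.3, the correlation step can be sketched in NumPy. This is a rough illustration under stated assumptions rather than the method analyzed in [11]: the function name, the unit-modulus random test signal, the chain phases, and the noise level are all hypothetical, and amplitude mismatch and the wireless-path phase between the calibration antenna and each element are ignored.

```python
import numpy as np


def estimate_rx_calibration(rx, test_signal, ref=0):
    """Estimate the phase of each RX path relative to a reference path by
    correlating the received samples with the known test signal.

    rx: (L, T) samples received on the L RX paths; test_signal: (T,) samples.
    Returns (relative phases in radians, unit-modulus calibration values).
    """
    corr = rx @ np.conj(test_signal)                  # one correlation per path
    rel_phase = np.angle(corr * np.conj(corr[ref]))   # phase relative to path `ref`
    cal = np.exp(-1j * rel_phase)                     # value that cancels that phase
    return rel_phase, cal


# Hypothetical scenario: four RX chains with different, time-invariant phases.
rng = np.random.default_rng(1)
num_paths, num_samples = 4, 512
test_signal = np.exp(2j * np.pi * rng.random(num_samples))   # assumed test signal
true_phase = np.deg2rad([0.0, 95.0, -140.0, -40.0])          # assumed chain phases
noise = 0.1 * (rng.standard_normal((num_paths, num_samples))
               + 1j * rng.standard_normal((num_paths, num_samples)))
rx = np.exp(1j * true_phase)[:, None] * test_signal[None, :] + noise

rel_phase, cal = estimate_rx_calibration(rx, test_signal)
# Residual relative phase of each chain once its calibration value is applied.
residual = np.angle(cal * np.exp(1j * (true_phase - true_phase[0])))
print(np.rad2deg(rel_phase), np.rad2deg(residual))
```

Multiplying each path (or, as described in Section 3, the weight vector) by its calibration value leaves only a small residual phase, which is the quantity compared before and after calibration in Figures 12 and 13.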
References
[1] WiMAX Forum, “Mobile WiMAX—part I: a technical over-
view and performance evaluation,” http://www.wimaxforum
.org/.
[2] Y. Li and N. R. Sollenberger, “Adaptive antenna arrays for
OFDM systems with cochannel interference,” IEEE Transac-
tions on Communications, vol. 47, no. 2, pp. 217–229, 1999.
[3] Y.-F. Chen and C.-P. Li, “Adaptive beamforming schemes for
interference cancellation in OFDM communication systems,”
(VTC ’04), vol. 1, pp. 103–107, Milan, Italy, May 2004.
[4] M. Wennström, T. Öberg, and A. Rydberg, “Effects of finite
weight resolution and calibration errors on the performance of
adaptive array antennas,” IEEE Transactions on Aerospace and
Electronic Systems, vol. 37, no. 2, pp. 549–562, 2001.
[5] J. J. van de Beek, M. Sandell, and P. O. Börjesson, “ML
estimation of timing and frequency offset in OFDM systems,”
IEEE Transactions on Signal Processing, vol. 45, no. 7, pp. 1800–
1805, 1997.
[6] X. Fu and H. Minn, “Initial uplink synchronization and power
control (ranging process) for OFDMA systems,” in Proceedings
of the IEEE Global Telecommunications Conference (GLOBE-
COM ’04), vol. 6, pp. 3999–4003, IEEE Communications
Society, Dallas, Tex, USA, November-December 2004.
[7] S. Choi and D. Shim, “A novel adaptive beamforming
algorithm for a smart antenna system in a CDMA mobile
communication environment,” IEEE Transactions on Vehicular
Technology, vol. 49, no. 5, pp. 1793–1806, 2000.
[8] J. Litva and T. K. Lo, Digital Beamforming in Wireless
Communications, Artech House, Norwood, Mass, USA, 1996.
[9] S. Mano and T. Katagi, “A method for measuring amplitude
and phase of each radiating element of a phased array
antenna,” Journal of the Institute of Electronics and Communi-
cation Engineers of Japan, vol. 65, no. 5, pp. 555–560, 1982.
[10] K. Nishimori, K. Cho, Y. Takatori, and T. Hori, “Automatic
calibration method of adaptive array for FDD systems,” in
Proceedings of the IEEE Antennas and Propagation Society
International Symposium (APS ’00), vol. 2, pp. 910–913, Salt
Lake, Utah, USA, July 2000.
[11] S. Hyeon, Y. Yun, and S. Choi, “Novel automatic calibration
technique for smart antenna systems,” Digital Signal Process-
ing, vol. 19, no. 1, pp. 14–21, 2009.
