
International Journal of Electronics and Communication Engineering and Technology (IJECET)

Volume 8, Issue 1, January - February 2017, pp. 18-31, Article ID: IJECET_08_01_003
Available online at
http://www.iaeme.com/IJECET/issues.asp?JType=IJECET&VType=8&IType=1
ISSN Print: 0976-6464 and ISSN Online: 0976-6472
IAEME Publication

HYPERSPECTRAL IMAGERY CLASSIFICATION USING TECHNOLOGIES OF COMPUTATIONAL INTELLIGENCE
Priya G. Deshmukh
Electronics Department, Amrutvahini College of Engineering,
Sangamner, Maharashtra, India

Prof. M. P. Dongare
Assistant Professor, Electronics Department, Amrutvahini College of Engineering,
Sangamner, Maharashtra, India

ABSTRACT
Texture information is exploited for the classification of hyperspectral imagery (HSI) at high
spatial resolution. For this purpose, the framework employs local binary patterns (LBP) to extract
local image features such as edges, corners and spots. After LBP feature extraction, two levels of
fusion are applied together with Gabor features and spectral features: feature-level fusion and
decision-level fusion. In feature-level fusion, multiple features are concatenated before pattern
classification, while decision-level fusion works on the probability outputs of each individual
classification pipeline and combines the distinct decisions into a final one. Decision-level fusion
uses either a hard fusion method (majority voting) or a soft fusion method, the linear logarithmic
opinion pool at the probability level (LOGP). In addition, an extreme learning machine (ELM)
classifier, which is more computationally efficient than a support vector machine (SVM), is used to
provide the probability classification outputs. It has a simple structure with one hidden layer and
one linear output layer, and it trains much faster than an SVM.
Key words: Decision fusion, extreme learning machine (ELM), Gabor filter, hyperspectral imagery
(HSI), local binary patterns (LBPs), pattern classification.
Cite this Article: Priya G. Deshmukh and Prof. M.P. Dongare, Hyperspectral Imagery
Classification using Technologies of Computational Intelligence, International Journal of
Electronics and Communication Engineering and Technology, 8(1), 2017, pp. 18-31.
http://www.iaeme.com/IJECET/issues.asp?JType=IJECET&VType=8&IType=1

1. INTRODUCTION
The aim is to develop a technique for classifying hyperspectral images using tools of computational
intelligence. Classification of hyperspectral imagery (HSI) at high spatial resolution is performed by
exploiting texture information. The method uses local binary patterns to extract local features, and an
efficient extreme learning machine with a very simple structure is employed as the classifier. Many
algorithms have been proposed to improve local image features for the classification of hyperspectral
images; currently, feature-level fusion simply concatenates a pair of different features (i.e., Gabor
features, LBP features, and spectral features) in the feature space.
A) LBP Features - LBP for HSI classification works on a grayscale image with a single spectral band. In
this method, the linear prediction error (LPE) is used for unsupervised band selection, i.e., LPE is first
applied to select a set of distinctive and informative bands. For each selected band, an LBP code is
generated for every pixel in the entire image to form an LBP code image; a local LBP image patch is then
extracted from the code image and its histogram is calculated. The performance of LPE is better than that
of principal component analysis.
After LBP feature extraction, two levels of fusion are applied together with Gabor features and
spectral features: feature-level fusion and decision-level fusion. In feature-level fusion, multiple
features are concatenated before pattern classification, while decision-level fusion works on the
probability outputs of each individual classification pipeline and combines the distinct decisions into a
final one. Decision-level fusion uses either a hard fusion method (majority voting) or a soft fusion
method, the linear logarithmic opinion pool at the probability level (LOGP).
B) ELM - The extreme learning machine (ELM) classifier is used to provide probability classification
outputs from the LBP features. ELM is a neural-network-based method with a very simple structure
consisting of one hidden layer and one linear output layer. It is much faster to train because the input
weights are generated randomly and the output weights are determined analytically (by a least squares
method), which reduces the computational cost.

2. LITERATURE REVIEW
There is great interest in exploiting spatial information to improve HSI classification. In previous
work, an SVM classifier with composite kernels was employed to combine both spectral and spatial
information, referred to as SVM-CK [9]. Further approaches include SVM-MRF [10], based on a segmentation
map obtained by a pixel-wise SVM classifier, and a Gaussian mixture model classifier with a Markov random
field (MRF-GMM). Moreover, the morphological profile (MP) [7], generated by certain morphological
operators, is widely used for modeling structural information. MPs are extracted from principal
components (PCs), but fine structures tend to be present in the minor PCs rather than in the major PCs.
Before SVM-CK [8], kernel discriminant analysis was employed, but it suffers from a heavy computational
load. Later approaches include Gabor texture features, Gabor texture features combined with the
gray-level co-occurrence matrix, different MPs, the urban complexity index, and Gabor features with band
selection.

3. PROBLEM STATEMENT
Hyperspectral image processing has been a very dynamic area in remote sensing and other applications in
recent years. Hyperspectral images provide ample spectral information to identify and distinguish
spectrally similar materials, enabling more accurate and detailed information extraction. A wide range of
advanced classification techniques is available based on spectral and spatial information. To improve
classification accuracy, it is essential to identify and reduce uncertainties in the image processing
chain. A large number of high-spatial-resolution images are available thanks to advances in sensor
technology. In conventional HSI classification systems, classifiers consider only spectral signatures and
ignore the spatial information at neighboring locations. We therefore focus on the classification of
hyperspectral images using local binary patterns and technologies of computational intelligence.

4. PROPOSED SOLUTION
This method has two primary stages: effective texture feature extraction, and fusion of the extracted
local LBP features, global Gabor features, and original spectral features. First, LPE is applied to the
image for band selection, producing grayscale band images. LBP codes are then generated for each pixel in
the image; from the code image, a local LBP image patch is extracted and its histogram is calculated.
After this, the extracted local LBP features are fused with the Gabor and spectral features. In this
process, LOGP plays a vital role in merging the probability outputs of the multiple texture and spectral
features. The Gabor filter is used as a global operator to capture global texture properties such as
orientation and scale, while LBP characterizes local spatial textures such as edges, corners, and knots.
Finally, the Gabor and LBP texture features are combined for better HSI classification.

5. SYSTEM OVERVIEW
5.1. Hyper Spectral Image Classification Approaches
The prefix "hyper" means "over", i.e., too many, and refers to the large number of measured wavelength
bands. Hyperspectral images are spectrally over-determined, which provides ample spectral information to
recognize and distinguish spectrally unique materials. Hyperspectral imagery therefore offers the
potential for more accurate and detailed information extraction than any other type of remotely sensed
data [1]. Hyperspectral images are 3-D data, with a spectral signature for the scene spread over several
bands. Generally, the high-dimensional spectral information is used to perform operations such as
pixel-by-pixel classification of the scene. Band selection and feature extraction methods have been
developed to improve the performance of parametric classifiers such as maximum likelihood (ML), distance
classifiers, and clustering methods; however, the classification accuracies of these methods do not match
those obtained for grayscale or color images. An important part of image analysis, called classification,
is to identify groups of pixels that have similar spectral characteristics and to determine the various
features represented by these groups. To classify an image visually, the analyst's ability to use visual
elements (tone, contrast, shape, etc.) is required. Digital classification is based on the spectral
information used to create the image: each individual pixel is classified according to its spectral
characteristics, and all pixels in the image are then assigned to particular classes (e.g., water,
coniferous forest, deciduous forest, corn, wheat, etc.). The classified image is called a thematic map of
the original image. Classification is thus performed for the observation of land-use patterns, geology,
vegetation types, or rainfall. In image classification we have to distinguish between spectral classes
and information classes. Spectral classes are groups of pixels that have approximately uniform spectral
characteristics. The main objective of image classification procedures is to automatically categorize all
pixels in an image into land-cover classes. Based on pixel information, classifiers can be grouped into
per-pixel, sub-pixel, per-field, knowledge-based, contextual, and multiple-classifier approaches;
per-pixel classifiers are either parametric or non-parametric. Depending on the use of training samples,
images can be classified by supervised or unsupervised classification. Unsupervised classification is the
identification of natural groups, whereas supervised classification uses samples of known identity to
assign unclassified pixels to one of several informational classes. The supervised method follows steps
such as feature extraction, training, and labeling. In the first step, the image is transformed into a
feature image to reduce the data dimensionality and improve data interpretability; this optional phase
comprises techniques such as the IHS (intensity-hue-saturation) transformation, principal component
analysis, and the linear mixture model. In the training phase, a set of training samples in the image is
selected to specify each class. The training samples train the classifier to identify the classes and are
used to determine the rules that allow a class label to be assigned to each pixel in the image.
Hyperspectral image classification approaches are classified as shown in Figure 1.


Figure 1 Hyperspectral image classification


On the basis of pixel information, images can be classified using per-pixel, sub-pixel, per-field,
knowledge-based, contextual, and multiple classifiers. In per-pixel classification, the entire scene is
processed pixel by pixel, which is why it is referred to as pixel-based classification; per-pixel
classifiers are not suitable in many applications because they basically handle only spectral
information. In a sub-pixel classifier, each pixel is not restricted to a single category; it deals with
mixed-pixel problems. In a per-field classifier, the scene is divided into homogeneous image segments
using an extended version of the Gaussian maximum likelihood (GML) algorithm. A contextual classifier
uses the spectral information at each pixel to predict its class, not independently of other pixels, but
by also utilizing information from neighboring pixels.

5.2. Local Binary Pattern (LBP)


The LBP operator is an image operator that transforms an image into an array of labels describing the
small-scale appearance of the image. These labels, or most often their histogram, are then used for
further image analysis. The basic version of the local binary pattern operator works on a 3×3 pixel block
of an image. Each pixel in this block is thresholded by the value of the center pixel, multiplied by a
power of two, and the results are summed to obtain a label for the center pixel. Since the neighborhood
consists of 8 pixels, a total of 2^8 = 256 different labels can be obtained based on the relative gray
values of the center pixel and its neighbors. An example of an LBP image and histogram is shown in
Figure 2.

Figure 2 Example of an input image, the corresponding LBP image and histogram


Figure 3 The circular (8,1), (16,2) and (8,2) neighborhoods. The pixel values are bilinearly interpolated
whenever the sampling point is not in the center of a pixel
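
As an illustration, the following minimal NumPy sketch (not part of the original paper; the function name
and the clockwise neighbor ordering are illustrative choices) computes the basic 8-neighbor LBP code image
described above. The histogram of the resulting codes is the texture descriptor used in later sections.

import numpy as np

def lbp_image(gray):
    """Basic 8-neighbor LBP over a 2-D grayscale array (border pixels are skipped)."""
    gray = np.asarray(gray, dtype=np.float64)
    rows, cols = gray.shape
    codes = np.zeros((rows, cols), dtype=np.uint8)
    # clockwise offsets of the 8 neighbors in a 3x3 block, starting at the top-left corner
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = gray[r, c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                if gray[r + dr, c + dc] >= center:   # threshold against the center pixel
                    code |= (1 << bit)               # weight the binary label by a power of two
            codes[r, c] = code
    return codes

# histogram of the 256 possible labels, used as the texture descriptor
# hist = np.bincount(lbp_image(img).ravel(), minlength=256)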

5.3. Mappings of the LBP Labels: Uniform Patterns


In many texture analysis applications it is desirable to have features that are invariant to rotations of
the input image. Rotating the input image has two effects: each local neighborhood is rotated to another
pixel location, and, within each neighborhood, the sampling points on the circle surrounding the center
pixel are rotated into a different orientation, since the LBP(P, R) patterns are obtained by circular
sampling around the center pixel. Another extension of the original operator is the use of uniform
patterns [5]. For this, a uniformity measure U(pattern) is defined as the number of bitwise transitions
from 0 to 1 or vice versa when the bit pattern is considered circular. A local binary pattern is called
uniform if its uniformity measure is at most 2. For example, the patterns 00000000 (0 transitions),
01110000 (2 transitions) and 11001111 (2 transitions) are uniform, whereas the patterns 11001001 (4
transitions) and 01010011 (6 transitions) are not. In the uniform LBP mapping, each uniform pattern
receives a separate output label, and all non-uniform patterns are assigned to a single label. Hence, the
number of different output labels for patterns of P bits is P(P - 1) + 3: the uniform mapping produces 59
output labels for neighborhoods of 8 sampling points and 243 labels for neighborhoods of 16 sampling
points. The reasons for neglecting the non-uniform patterns are twofold. First, most local binary
patterns in natural images are uniform: in experiments with texture images, uniform patterns account for
slightly less than 90% of all patterns with the (8, 1) neighborhood and for around 70% with the (16, 2)
neighborhood, and in experiments with facial images [1] it was found that 90.6% of the patterns in the
(8, 1) neighborhood and 85.2% of the patterns in the (8, 2) neighborhood are uniform. The second reason
for considering only uniform patterns is statistical robustness. Using uniform patterns produces better
recognition results in many applications: uniform patterns themselves are more stable, i.e., less
affected by noise, and considering only uniform patterns makes the number of possible LBP labels
significantly lower, so that reliable estimation of their distribution requires fewer samples. The
uniform patterns allow the LBP method to be seen as a unifying approach to the traditionally divergent
statistical and structural models of texture analysis [5]. Every pixel is labeled with the code of the
texture primitive that best matches its local neighborhood, so each LBP code can be considered as a
micro-texton. The local primitives detected by the LBP include spots, flat areas, edges, edge ends,
curves, and so on; some examples are shown in Figure 4 for the LBP(8, R) operator. In the figure, ones
are represented as bold black circles and zeros are white. The LBP distribution therefore has the
properties of a structural analysis method, namely texture primitives and placement rules, while at the
same time the distribution is just a statistic of a non-linearly filtered image, which clearly makes the
method a statistical one.


Figure 4 Different texture primitives detected by the LBP


For these reasons, the LBP distribution can be successfully used in recognizing a wide variety of
different textures, to which statistical and structural methods have normally been applied separately.
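
For illustration, here is a small sketch (illustrative code, not from the paper) of the uniform-pattern
mapping described above, which assigns one label per uniform pattern and a single shared label to all
non-uniform patterns, giving P(P - 1) + 3 labels in total.

import numpy as np

def transitions(code, bits=8):
    """Number of 0/1 transitions in the circular bit pattern of an LBP code."""
    pattern = [(code >> i) & 1 for i in range(bits)]
    return sum(pattern[i] != pattern[(i + 1) % bits] for i in range(bits))

def uniform_mapping(bits=8):
    """Map each of the 2**bits codes to a label: one label per uniform pattern
    (at most two transitions) and a single shared label for all other patterns,
    giving P*(P-1) + 3 labels in total (59 labels for P = 8)."""
    table = np.zeros(2 ** bits, dtype=np.int32)
    next_label = 0
    nonuniform_label = bits * (bits - 1) + 2   # index of the last (shared) label
    for code in range(2 ** bits):
        if transitions(code, bits) <= 2:
            table[code] = next_label
            next_label += 1
        else:
            table[code] = nonuniform_label
    return table

# table = uniform_mapping(8); assert table.max() + 1 == 8 * 7 + 3   # 59 labels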

5.4. Decision Fusion

Flowchart 1 Decision Fusion Approach


A decision fusion approach is developed to combine the results from supervised and unsupervised
classifiers. The final output takes advantage of the power of a support-vector-machine-based supervised
classification in class separation and of the capability of an unsupervised classifier, such as K-means
clustering, to reduce the impact of trivial spectral variation in homogeneous regions. Three
decision-level fusion methods and four schemes for the input data are used for hyperspectral remote
sensing image classification. The first scheme is the most common one, in which the original
hyperspectral dataset is used by the different classifiers. The second scheme is an improved one, in
which all classifiers still use an identical input dataset, but this dataset consists of both the
original data and texture features derived from it. In the third scheme, all wavebands are divided into
different groups based on inter-band correlation analysis; each group of data, including texture
features, is fed to a specific classifier, so the inputs of the multiple classifiers differ, but every
group should be a representative subset of the original data. In the fourth scheme, the first ten
components obtained by applying the MNF transformation to the original data, together with texture
features, are used as the input of the different classifiers. The minimum (or maximum) noise fraction
(MNF) transform is a modification of principal component analysis that normalizes each band of the
hyperspectral image by its noise level prior to processing; this reduces the influence of noise in the
transformed images because the noisier bands are de-emphasized. Noise is generally estimated using
"shift-difference" statistics, in which the difference between adjacent pixels is assumed to be an
estimate of the noise.
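
As a sketch of the shift-difference noise estimate mentioned above (illustrative code, assuming a
rows x cols x bands data cube; the function name is not from the paper), the noise covariance can be
estimated from differences between horizontally adjacent pixels before applying the MNF transform.

import numpy as np

def shift_difference_noise(cube):
    """Estimate the noise covariance from differences between horizontally
    adjacent pixels, as used before the MNF transform (cube: rows x cols x bands)."""
    diff = cube[:, 1:, :] - cube[:, :-1, :]
    d = diff.reshape(-1, cube.shape[2]).astype(np.float64)
    # Var(n1 - n2) = 2 * Var(n) for independent, identically distributed noise
    return np.cov(d, rowvar=False) / 2.0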

5.5. Support Vector Machine


Support vector machines (SVMs) have recently been used with success for the classification of
hyperspectral images. The method is a robust alternative for pattern recognition with hyperspectral data:
because it is based on a geometric point of view, no statistical estimation has to be performed. SVM
outperforms classical supervised classification algorithms such as maximum likelihood when the number of
spectral bands increases or the number of training samples remains limited. The technique consists of
finding the optimal separating surface between classes; the training samples that define this surface are
called support vectors. If the training data set is not linearly separable, a kernel method is used to
simulate a non-linear projection of the data into a higher-dimensional space in which the classes are
linearly separable. Moreover, since no statistical estimation is involved, a small number of training
samples (provided they are representative) is enough to find the support vectors. Such a classifier
therefore has very interesting properties for hyperspectral image processing: it is not affected by the
Hughes phenomenon (for a limited number of training samples, the classification rate decreases as the
dimension increases), and it may separate classes even with a small number of training samples spaced
very close to each other. This separability remains quite difficult even with techniques dedicated to
hyperspectral data such as spectral angle mapping or spectral unmixing. However, such separability
measures are based on the dot product or the geometric distance between vectors and do not take the
spectral meaning and behavior into consideration. The spectral signature of an object retains the same
shape even though it is observed under different illumination conditions, and it should be classified in
the same way. It is therefore proposed to integrate spectral knowledge into SVM classifiers for
processing hyperspectral data cubes, which improves the thematic classification of the hyperspectral data
cube. The process has been applied to hyperspectral images from the CASI sensor.
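
A hedged sketch of pixel-wise SVM classification with an RBF kernel follows, assuming scikit-learn is
available; the array names and parameter values are illustrative, not those used in the paper.

import numpy as np
from sklearn.svm import SVC   # assumes scikit-learn is installed

def svm_classify(X_train, y_train, X_test, C=100.0, gamma='scale'):
    """Pixel-wise SVM with an RBF kernel, covering the non-linearly separable
    case via the kernel trick described above."""
    clf = SVC(kernel='rbf', C=C, gamma=gamma, probability=True)
    clf.fit(X_train, y_train)                  # the support vectors are selected here
    return clf.predict(X_test), clf.predict_proba(X_test)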

5.6. Extreme Learning Machine


The extreme learning machine (ELM) [4] belongs to the class of single-hidden-layer feed-forward neural
networks (SLFNs). Traditionally, such networks are trained with gradient-based methods such as the
back-propagation algorithm. Instead of iterative tuning, ELM randomly generates the hidden-node
parameters and analytically determines the output weights, which makes learning extremely fast. ELM is
computationally efficient and tends to achieve similar or even better generalization performance than
SVMs. However, because of the randomly assigned input weights and biases, ELM can produce a large
variation in classification accuracy even with the same number of hidden nodes. In previous work, ELM was
employed as a pixel-wise classifier, which means that only the spectral signature was exploited, ignoring
the spatial information at neighboring locations. Yet for HSI it is highly probable that two adjacent
pixels belong to the same class, and using both spectral and spatial information has been verified to
improve HSI classification accuracy significantly [1]. There are two main categories of methods utilizing
spatial information: extracting some type of spatial feature (e.g., texture, morphological profiles, and
wavelet features), and directly using the pixels in a small neighborhood for joint classification,
assuming that these pixels usually share the same class membership. In the first category (which
increases the feature dimensionality), Gabor features have been successfully used for hyperspectral image
classification [1] owing to their ability to represent useful spatial information.

6. SYSTEM ANALYSIS
6.1. Band Selection
Hyperspectral images consist of a large number of spectral bands, many of which contain redundant
information. Band selection, such as LPE [1], reduces the dimensionality by selecting a subset of
spectral bands with distinctive and informative features. Linear projections, such as PCA, can also
transform the high-dimensional data into a lower-dimensional subspace. In previous studies [1], [5], both
LPE and PCA were investigated for spatial-feature-based hyperspectral image classification, and the
classification performance of LPE was found to be superior to that of PCA; the reason may be that fine
spatial structures tend to be present in the minor PCs rather than in the major PCs. Thus, band selection
(i.e., LPE) is employed in this research. LPE [1] is a simple and efficient band selection method based
on band-similarity measurement. Assume there are two initial bands B1 and B2. Every other band B can be
approximated as B' = a0 + a1 B1 + a2 B2, where a0, a1, a2 are the parameters that minimize the linear
prediction error e = ||B - B'||^2. Let the parameter vector be a = (a0, a1, a2)^T. A least squares
solution is employed to obtain a = (X_B1B2^T X_B1B2)^(-1) X_B1B2^T X_B, where X_B1B2 is an N×3 matrix
whose first column is all ones, whose second column is the B1 band, and whose third column is the B2
band. Here, N is the total number of pixels, and X_B is the B-th spectral band. The band that produces
the maximum error e is considered the most dissimilar band to B1 and B2 and is selected. Using these
three bands, a fourth band can be found with the same strategy, and so on. More implementation details
can be found in [5].
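
The greedy LPE band selection described above can be sketched as follows (illustrative NumPy code; the
function name and the choice of initial bands are assumptions, not from the paper).

import numpy as np

def lpe_band_selection(cube, n_bands, init=(0, 1)):
    """Greedy LPE band selection: repeatedly pick the band that is worst
    approximated (largest linear prediction error) by the bands chosen so far.
    `cube` is a rows x cols x bands array, reshaped to an N x B pixel matrix."""
    X = cube.reshape(-1, cube.shape[-1]).astype(np.float64)
    N, B = X.shape
    selected = list(init)
    while len(selected) < n_bands:
        A = np.hstack([np.ones((N, 1)), X[:, selected]])   # columns [1, B1, B2, ...]
        errors = np.full(B, -np.inf)
        for b in range(B):
            if b in selected:
                continue
            coef, *_ = np.linalg.lstsq(A, X[:, b], rcond=None)
            residual = X[:, b] - A @ coef
            errors[b] = np.sum(residual ** 2)               # linear prediction error e
        selected.append(int(np.argmax(errors)))             # most dissimilar band
    return selected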

6.2. Feature Extraction


The LBP is a grayscale and rotation-invariant texture operator, and it provides an effective texture
feature extraction approach that is well suited to HSI. To find a set of distinctive and informative
bands, LPE-based band selection is first employed. For each band, the LBP code is computed for every
pixel in the entire image to form an LBP code image, and then, for each local patch centered at a pixel
of interest, the LBP histogram is generated. The second component is the effective fusion of the
extracted local LBP features, global Gabor features, and original spectral features, in which LOGP plays
a vital role in merging the probability outputs of the multiple texture and spectral features.

Figure 5 Pixel orientation

Figure 6 Example of LBP binary thresholding. (a) Center pixel t_c and its eight circular neighbors {t_i},
i = 0, ..., 7, with radius r = 1. (b) 3×3 sample block. (c) Binary labels of the eight neighbors
With center pixel t_c, each of its neighbors (t_0 to t_7) is assigned a binary label, either 0 or 1,
depending on its intensity relative to that of the center pixel t_c. All samples are equispaced on a
circle of radius r, where r is the distance between a neighbor and the center pixel. For m neighbors
{t_i}, i = 0, ..., m - 1, the LBP code of t_c is given by

LBP_{m,r}(t_c) = Σ_{i=0}^{m-1} s(t_i - t_c) 2^i    (1)

where s(t_i - t_c) = 1 if t_i > t_c, and s(t_i - t_c) = 0 otherwise.
Figure 6 shows an example of the binary thresholding process for (m, r) = (8, 1). LBP divides the
examined window into cells (e.g., 16×16 pixels). Each pixel in a cell is compared with each of its eight
neighbors, following the pixels along the circle clockwise or counter-clockwise: if the center pixel
value is greater than the neighbor's value, write 0, otherwise write 1, which yields an 8-digit binary
number. If the LBP code is calculated in the clockwise direction, the bit string 11001010 (with the i-th
bit weighted by 2^i) gives the code 83. Assuming the coordinate of t_c is (0, 0), each neighbor has a
coordinate of (r sin(2πi/m), r cos(2πi/m)). In practice, the parameter set (m, r) may change, e.g.,
(4, 1), (8, 2), etc. The locations of circular neighbors that do not fall exactly on image grid points
are estimated by bilinear interpolation [5]. The output of the LBP operator in (1) indicates that the
binary labels in a neighborhood, represented as an m-bit binary number (with 2^m distinct values),
reflect texture orientation and smoothness in a local region. After obtaining the LBP code, an occurrence
histogram, as a nonparametric statistical estimate, is computed over a local patch. A binning procedure
is required to guarantee that the histogram features have the same dimension.

Figure 7 Implementation of LBP feature extraction


After band selection, the LBP feature extraction process or Gabor filtering is applied to each selected
band image. Figure 7 illustrates the implementation of LBP feature extraction: the LBP code is first
calculated for the entire image to form an LBP code image, and the LBP features are then generated for
the pixel of interest from its corresponding local LBP image patch. Note that the patch size is a
user-defined parameter.
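
A minimal sketch of the patch-histogram step of Figure 7 follows (illustrative code; it assumes an LBP
code image such as the one produced by the earlier LBP sketch, and the default patch size is an arbitrary
choice, since the paper leaves it user defined).

import numpy as np

def lbp_patch_histogram(lbp_codes, row, col, patch=21, n_labels=256):
    """Normalized histogram of LBP codes in a (patch x patch) window centered at
    the pixel of interest, as in the local-patch step of Figure 7."""
    half = patch // 2
    r0, r1 = max(0, row - half), min(lbp_codes.shape[0], row + half + 1)
    c0, c1 = max(0, col - half), min(lbp_codes.shape[1], col + half + 1)
    window = lbp_codes[r0:r1, c0:c1]
    hist = np.bincount(window.ravel(), minlength=n_labels).astype(np.float64)
    return hist / hist.sum()    # occurrence histogram as the LBP feature vector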

6.3. Gabor Filter


The Gabor filter is a band-pass filter whose response depends on orientation; a circularly symmetric
Gabor filter is generally preferred so that all directions are covered by the pass band. The Gabor
features consist of the magnitude of the signal power in the corresponding pass band of each
Gabor-filtered image. The Gabor filter can be represented as

G_{λ,θ,ψ,σ,γ}(a, b) = exp(-(a'^2 + γ^2 b'^2) / (2σ^2)) exp(i(2π a'/λ + ψ))

where
a' = a cos θ + b sin θ
b' = -a sin θ + b cos θ    (2)

In this equation, λ is the wavelength of the sinusoidal factor, θ the orientation separation angle (π/8,
π/4, π/2, etc.), ψ the phase offset, σ the standard deviation of the Gaussian envelope, and γ the spatial
aspect ratio. Setting ψ = 0 and ψ = π/2 returns the real and imaginary parts of the Gabor filter,
respectively.
The standard deviation σ is related to the wavelength λ through the half-response spatial frequency
bandwidth bw of the filter:

σ = (λ/π) sqrt(ln 2 / 2) (2^bw + 1) / (2^bw - 1)    (3)
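
For illustration, a small NumPy sketch of the real part of the Gabor kernel in (2); the odd kernel size
and the default sigma (corresponding roughly to a one-octave bandwidth) are illustrative assumptions, not
values from the paper.

import numpy as np

def gabor_kernel(size, wavelength, theta, psi=0.0, sigma=None, gamma=0.5):
    """Real part of the Gabor filter of Eq. (2) with wavelength (lambda),
    orientation (theta), phase offset (psi), Gaussian std (sigma) and
    spatial aspect ratio (gamma). `size` is assumed odd."""
    if sigma is None:
        sigma = 0.56 * wavelength          # rough default, ~1-octave bandwidth
    half = size // 2
    b, a = np.mgrid[-half:half + 1, -half:half + 1].astype(np.float64)
    a_rot = a * np.cos(theta) + b * np.sin(theta)      # a' in Eq. (2)
    b_rot = -a * np.sin(theta) + b * np.cos(theta)     # b' in Eq. (2)
    envelope = np.exp(-(a_rot ** 2 + (gamma ** 2) * b_rot ** 2) / (2 * sigma ** 2))
    carrier = np.cos(2 * np.pi * a_rot / wavelength + psi)
    return envelope * carrier

# Example: the four orientations used in Figure 8
# kernels = [gabor_kernel(31, 8.0, t) for t in (0, np.pi/4, np.pi/2, 3*np.pi/4)]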


6.4. Comparison of Gabor & LBP


From the above description, it can be seen that the Gabor filter is a global operator while the LBP is a
local one; the Gabor and LBP features therefore represent texture information from different
perspectives.

Figure 8 Example. (a) Input image. (b) LBP-coded image (different intensities representing different
codes). (c)-(f) Filtered images obtained by the Gabor filter with different θ values. (c) Gabor feature
image, θ = 0. (d) Gabor feature image, θ = π/4. (e) Gabor feature image, θ = π/2. (f) Gabor feature image,
θ = 3π/4
Figure 8 illustrates a comparison between LBP and Gabor features on a natural image (boat) of size
256×256. Figure 8(b) shows the LBP-coded image obtained using (1) with (m, r) = (8, 1), and Figure
8(c)-(f) shows the filtered images obtained by the Gabor filter with different θ (i.e., 0, π/4, π/2, and
3π/4). In Figure 8, the Gabor features, produced by the average magnitude response of each Gabor-filtered
image, reflect the global signal power, while the LBP-coded image gives a better expression of detailed
local spatial features such as edges, corners, and knots. Hence, to obtain better results, it is
necessary to apply the global Gabor filter as a supplement to the local LBP operator, which lacks
consideration of distant pixel interactions. As stated earlier, the Gabor filter captures the global
texture information of an image and LBP represents the local texture information; HSI data usually
contain homogeneous regions where pixels fall into the same class, and Gabor features are able to reflect
such global texture information because the Gabor filter effectively captures the orientation and scale
of physical structures in the scene. Hence, combining Gabor and LBP features can achieve better
classification performance than using LBP features alone.

6.5. Classifier
ELM [2], [4] is a classifier based on a neural network with only one hidden layer and one linear output
layer. The weights between the input and the hidden layer are randomly assigned, and the weights of the
output layer are computed using a least squares method, so the computational cost is much lower than that
of other neural-network-based methods. For C classes, let the class labels be defined as y_k ∈ {1, -1}
(1 ≤ k ≤ C); a row vector y = [y_1, ..., y_k, ..., y_C] then indicates the class to which a sample
belongs. For example, if y_k = 1 and the other elements of y are -1, the sample belongs to the k-th
class. The training samples and their corresponding labels are represented as {x_i, y_i}, i = 1, ..., n,
where x_i ∈ R^d and y_i ∈ R^C. The output function of an ELM with L hidden nodes can be expressed as


f_L(x_i) = Σ_{j=1}^{L} β_j h(w_j · x_i + b_j) = y_i,   i = 1, 2, ..., n    (4)


where h(·) is a nonlinear activation function (e.g., the sigmoid function), β_j ∈ R^C denotes the weight
vector connecting the j-th hidden node to the output nodes, w_j ∈ R^d represents the weight vector
connecting the j-th hidden node to the input nodes, and b_j is the bias of the j-th hidden node. The term
w_j · x_i denotes the inner product of w_j and x_i. If a value of 1 is appended to x_i to make it a
(d + 1)-dimensional vector, the bias can be treated as an element of the weight vector, which is also
randomly assigned. For the n equations, (4) can be written as

H β = Y    (5)

where Y = [y_1; y_2; ...; y_n] ∈ R^{n×C}, β = [β_1; β_2; ...; β_L] ∈ R^{L×C}, and H is the hidden-layer
output matrix of the neural network, expressed as

H = [h(x_1); ...; h(x_n)] =
[ h(w_1 · x_1 + b_1)  ...  h(w_L · x_1 + b_L) ]
[         ...                     ...         ]    (6)
[ h(w_1 · x_n + b_1)  ...  h(w_L · x_n + b_L) ]

In (6), h(x_i) = [h(w_1 · x_i + b_1), ..., h(w_L · x_i + b_L)] is the output of the hidden nodes in
response to the input x_i, which maps the data from the d-dimensional input space to the L-dimensional
feature space. In most cases, the number of hidden neurons is much smaller than the number of training
samples, i.e., L << n, and the least squares solution (7) described in [4] can be used:

β = H† Y    (7)

where H† is the Moore-Penrose generalized inverse of the matrix H, with H† = H^T (H H^T)^(-1). For better
stability and generalization, a positive value 1/C is normally added to each diagonal element of H H^T.
As a result, the output function of the ELM classifier is expressed as

f_L(x_i) = h(x_i) β = h(x_i) H^T (I/C + H H^T)^(-1) Y    (8)

In ELM, the feature mapping h(x_i) is assumed to be known. Recently, kernel-based ELM (KELM) [4] has been
proposed by extending the explicit activation functions in ELM to implicit mapping functions, which
exhibit a better generalization capability. If the feature mapping is unknown, a kernel matrix for ELM
can be defined as

Ω_ELM = H H^T,   Ω_ELM(i, j) = h(x_i) · h(x_j) = K(x_i, x_j)    (9)

Hence, the output function of KELM is given as

f_L(x) = [K(x, x_1), ..., K(x, x_n)] (I/C + Ω_ELM)^(-1) Y    (10)

The label of an input sample is finally determined by the index of the output node with the largest
value. In the experiments, the kernel version of ELM is implemented. The training of ELM involves only
one analytical step, whereas the standard SVM needs to solve a large constrained optimization problem; it
will be demonstrated that ELM can provide a classification accuracy similar to or even better than that
of SVM.
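
A minimal NumPy sketch of kernel ELM training and prediction following (8)-(10) is given below, assuming
an RBF kernel; the regularization constant C and kernel parameter gamma are illustrative values, and the
function names are not from the paper.

import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """RBF kernel matrix K(A, B) between row-sample matrices A and B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

def kelm_train(X, y, n_classes, C=100.0, gamma=1.0):
    """Kernel ELM training: solve (I/C + Omega) alpha = T once (one analytical step)."""
    T = -np.ones((X.shape[0], n_classes))
    T[np.arange(X.shape[0]), y] = 1.0                    # +1 for the true class, -1 otherwise
    K = rbf_kernel(X, X, gamma)                          # Omega_ELM in Eq. (9)
    alpha = np.linalg.solve(np.eye(X.shape[0]) / C + K, T)
    return alpha

def kelm_predict(X_train, alpha, X_test, gamma=1.0):
    """Output scores from Eq. (10); the predicted class is the largest output node."""
    scores = rbf_kernel(X_test, X_train, gamma) @ alpha
    return np.argmax(scores, axis=1), scores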


6.6. Feature-Level Fusion (FF)


Feature-level fusion is employed in the proposed classification framework, as shown in Figure 9.

Figure 9 Feature level fusion


Each feature reflects different properties and has its own meaning: the Gabor features provide spatial
localization and orientation selectivity, the LBP features capture local image texture (e.g., edges,
corners, etc.), and the spectral features represent the correlation among bands. For different
classification tasks these features have their own advantages and disadvantages, and it is difficult to
determine which one is always optimal [1]. It is therefore straightforward to stack multiple features
into a composite one. In this fusion strategy, feature normalization before stacking is a necessary
pre-processing step that adjusts the scale of the feature values. A simple treatment is a linear
transformation of the data that preserves the relationships among the values; for instance, a min-max
technique maps all values into the range [0, 1]. Here, the three aforementioned features, i.e., LBP
features (local texture), Gabor features (global texture), and selected bands (spectral features), and
their combinations, such as LBP + Gabor + spectral features, LBP + spectral features, Gabor + spectral
features, etc., will be discussed. Note that there are at least two potential disadvantages of
feature-level fusion: 1) the multiple feature sets to be stacked may be incompatible, which causes the
induced feature space to be highly nonlinear, and 2) the induced feature space has a much larger
dimensionality, which may deteriorate classification accuracy and processing efficiency.
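
A short sketch of the feature-level fusion step (illustrative code, not from the paper): min-max
normalize each feature set and stack the results into one composite vector per sample.

import numpy as np

def minmax_normalize(F):
    """Scale each feature dimension to [0, 1], as in the min-max step above."""
    fmin, fmax = F.min(axis=0), F.max(axis=0)
    return (F - fmin) / np.maximum(fmax - fmin, 1e-12)

def feature_level_fusion(*feature_sets):
    """Stack normalized LBP, Gabor and spectral features into one composite
    vector per sample (each input is an n_samples x n_features array)."""
    return np.hstack([minmax_normalize(F) for F in feature_sets])

# fused = feature_level_fusion(lbp_features, gabor_features, spectral_features)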

6.7. Decision-Level Fusion (DF)

Figure 10 Decision level fusion


Different from feature-level fusion, decision-level fusion [3], [5] merges the results of a classifier
ensemble built on multiple features, as shown in Figure 10. This mechanism combines distinct
classification results into a final decision, improving on the accuracy of a single classifier that uses
one type of feature. The main objective is to utilize the information of each type of feature, compute
the probability outputs with ELM, and then combine them with the soft LOGP rule for the final decision.
Since the output function of ELM estimates the accuracy of the predicted label and reflects the
classifier confidence, a conditional class probability is derived from the decision function; the
probability should be higher for a larger output of the decision function. Following Platt's empirical
analysis, a scaling function of the following form is used:

p_q(y_k | x) = 1 / (1 + exp(A_k f_L(x) + B_k))    (11)

where p_q(y_k | x) is the conditional class probability of the q-th classifier, f_L(x) is the output
decision function of each ELM, and (A_k, B_k) are parameters estimated for the ELM of class k
(1 ≤ k ≤ C). The parameters A_k and B_k are found by minimizing the cross-entropy error over the
validation data; note that A_k is negative. In the proposed framework, LOGP [3], [5] uses the conditional
class probabilities to estimate a global membership function P(y_k | x) as a weighted product of these
output probabilities. The final class label y is given according to

y = arg max_{k = 1, ..., C} P(y_k | x)    (12)

where the global membership function is

P(y_k | x) = Π_{q=1}^{Q} p_q(y_k | x)^{α_q}    (13)

log P(y_k | x) = Σ_{q=1}^{Q} α_q log p_q(y_k | x)    (14)

with the classifier weights {α_q}, q = 1, ..., Q, distributed uniformly over all classifiers (α_q = 1/Q)
and Q being the number of pipelines (classifiers) in Figure 10.
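
A small sketch of the soft LOGP fusion rule in (12)-(14) follows (illustrative code; it assumes the
per-pipeline probability outputs, e.g. Platt-scaled ELM scores, have already been computed upstream).

import numpy as np

def logp_fusion(probability_outputs, weights=None):
    """Soft decision fusion via the log-opinion pool of Eqs. (12)-(14):
    probability_outputs is a list of (n_samples x n_classes) arrays, one per
    classification pipeline; uniform weights 1/Q are used when none are given."""
    Q = len(probability_outputs)
    if weights is None:
        weights = np.full(Q, 1.0 / Q)
    log_pool = sum(w * np.log(np.clip(P, 1e-12, 1.0))
                   for w, P in zip(weights, probability_outputs))
    return np.argmax(log_pool, axis=1)     # final class label per sample

# labels = logp_fusion([p_lbp, p_gabor, p_spectral])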

6.8. Comparison between FF & DF


The DF-based method is superior to the FF-based method, since the FF-based method cannot take advantage
of the discriminative power of each individual feature image. In addition, the multiple feature sets
stacked by FF may be incompatible, and the resulting composite feature has a large dimensionality.

7. CONCLUSION
In this paper, an LBP-based framework was proposed to extract local image features for the classification
of HSI. Specifically, LBP was applied to a subset of the original bands selected by the LPE method. Two
types of fusion (feature level and decision level) were defined on the extracted LBP features together
with the Gabor features and the selected spectral bands. A soft decision fusion process for ELM utilizing
LOGP was proposed to merge the probability outputs of the multiple texture and spectral features. The
experimental results show that local LBP representations are effective for HSI spatial feature
extraction, because they encode the image texture configuration while providing local structure patterns.
Moreover, the decision-level fusion of kernel ELM provides effective classification and is superior to
SVM-based methods. In contrast, feature-level fusion simply concatenates the different features (i.e.,
Gabor features, LBP features, and spectral features) in the feature space.

REFERENCES
[1] C. Chen, W. Li, H. Su, and K. Liu, "Spectral-spatial classification of hyperspectral image based on
kernel extreme learning machine," Remote Sens., vol. 6, no. 6, pp. 5795-5814, Jun. 2014.
[2] R. Moreno, F. Corona, A. Lendasse, M. Grana, and L. S. Galvao, "Extreme learning machines for soybean
classification in remote sensing hyperspectral images," Neurocomputing, vol. 128, no. 27, pp. 207-216,
Mar. 2014.
[3] W. Li, S. Prasad, and J. E. Fowler, "Decision fusion in kernel-induced spaces for hyperspectral image
classification," IEEE Trans. Geosci. Remote Sens., vol. 52, no. 6, pp. 3399-3411, Jun. 2014.
[4] Y. Bazi et al., "Differential evolution extreme learning machine for the classification of
hyperspectral images," IEEE Geosci. Remote Sens. Lett., vol. 11, no. 6, pp. 1066-1070, Jun. 2014.
[5] Z. Guo, L. Zhang, and D. Zhang, "Rotation invariant texture classification using LBP variance (LBPV)
with global matching," Pattern Recognit., vol. 43, no. 3, pp. 706-719, Mar. 2010.
[6] X. Kang, S. Li, and J. A. Benediktsson, "Spectral-spatial hyperspectral image classification with
edge-preserving filtering," IEEE Trans. Geosci. Remote Sens., vol. 52, no. 5, pp. 2666-2677, May 2014.
[7] M. Fauvel, J. A. Benediktsson, J. Chanussot, and J. R. Sveinsson, "Spectral and spatial
classification of hyperspectral data using SVMs and morphological profiles," IEEE Trans. Geosci. Remote
Sens., vol. 46, no. 11, pp. 3804-3814, Nov. 2008.
[8] C. Chen and J. E. Fowler, "Single image super-resolution using multihypothesis prediction," in Proc.
46th Asilomar Conf. Signals, Syst., Comput., Pacific Grove, CA, USA, Nov. 2012, pp. 608-612.
[9] C. Chen et al., "Multihypothesis prediction for noise-robust hyperspectral image classification,"
IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 7, no. 4, pp. 1047-1059, Apr. 2014.
[10] X. Huang and L. Zhang, "An SVM ensemble approach combining spectral, structural, and semantic
features for the classification of high-resolution remotely sensed imagery," IEEE Trans. Geosci. Remote
Sens., vol. 51, no. 1, pp. 257-272, Jan. 2013.
[11] Sorna Percy. G and Dr. T. Arumuga Maria Devi, "An Efficiently Identify the Diabetic Foot ULCER Based
on Foot Anthropometry Using Hyper Spectral Imaging," International Journal of Information Technology &
Management Information System, 7(2), 2016, pp. 36-44.
[12] Preethi N Patil and G. G. Rajput, "Detection and Classification of Non Proliferative Diabetic
Retinopathy Stages Using Morphological Operations and SVM Classifier," International Journal of Computer
Engineering & Technology, 4(6), 2013, pp. 1-8.