INSTITUTE OF ENGINEERING
CENTRAL CAMPUS, PULCHOWK
by
Ajeya Acharya
2073/MSTR/252
A THESIS PROPOSAL
SUBMITTED TO THE DEPARTMENT OF CIVIL ENGINEERING
September, 2018
ABSTRACT
CONTENTS
ABSTRACT
LIST OF FIGURES
LIST OF TABLES
Chapter 1
INTRODUCTION
1.1 Background
1.2 Statement of Problem
1.3 Objectives
1.4 Overall Framework of the Study
Chapter 2
LITERATURE REVIEW
Chapter 3
METHODOLOGY
3.1 Methods of cluster analysis
3.2 Cluster validation and selection of best cluster:
3.3 Questionnaire Form Survey:
3.4 Factor analysis (KMO test)
3.5 Reliability test (Cronbach’s Alpha)
3.6 PLOS model based upon perception of pedestrian
Chapter 4
STUDY AREA and DATA COLLECTION
4.1 Study Area:
4.2 Data Collection:
Chapter 5
RESULT and ANALYSIS
Chapter 6
SUMMARY AND CONCLUSION
REFERENCES
LIST OF FIGURES
LIST OF TABLES
Chapter 1
INTRODUCTION
1.1 Background
The urban population of Nepal has increased significantly, from 269 thousand in 1955 to
5.8 million in 2018, and is estimated to grow to around 8 million by the year 2030. Kathmandu is
the only city in Nepal with a population of more than 1 million. The motor vehicle population has
also been growing at a significant annual rate. Transportation is one of the important factors in
urban development, and every modal trip includes a significant proportion of walking,
which implies that pedestrians are an inseparable part of the transportation system. In the design of
urban and transportation facilities, the needs of pedestrians must be considered along with the needs
of motor vehicles. A large percentage of people in Kathmandu travel on foot or by public
transportation, but despite the exponential increase in vehicular traffic, adequate attention has not
been given to public transport and pedestrian facilities, which has led to higher pedestrian
fatalities. In addition, high levels of air and noise pollution from the large vehicle population,
together with heat, dust, poor walking surfaces and long trip distances, lead people to prefer
driving and riding to walking.
Walking provides mobility to a large percentage of people in cities like Kathmandu, especially
tourists, students and poor people. It is also essential for supporting public transport and for
short-distance trips by people owning private vehicles. Improved pedestrian safety and a safer, more
walkable environment have many benefits, such as improved accessibility for pedestrians, reduced
transportation cost, increased parking efficiency, a more aesthetic environment, reduced pollution,
and improved health due to walking.
According to the Highway Capacity Manual (HCM), Pedestrian Level of Service (PLOS) is a “qualitative
measure that describe the operational characteristics of pedestrian which is based upon several
service measures like speed, travel time, comfort, convenience, interruptions and freedom to
maneuver.” PLOS is described by six classes, from “A” to “F”, that rank operations from best to worst
for each type of facility.
Traffic on the urban streets of Nepal is highly heterogeneous, consisting of various kinds of vehicles
with different operational behaviors, such as motorbikes, cars, buses, micros, tempos, etc., moving
on the same carriageway. Nepal does not have a flexible working-hour system, so most
people, such as students and employees, travel in the same time window, which results in heavy
congestion at peak hours. Highly heterogeneous traffic flow, poor enforcement of traffic law,
illegal parking, poor surface condition, obstructions, unauthorized vendor activities, etc. are some
of the major factors that affect the PLOS of urban off-street facilities, i.e. sidewalks. Sidewalk
characteristics in low-income countries differ considerably from those in high-income, developed
countries. Therefore, PLOS analysis based upon models from developed countries may not suit a
low-income country like Nepal.
In Nepal, there is no established methodology to evaluate Pedestrian Level of Service (PLOS) in urban
areas. Suitable methodologies need to be developed to help in the planning, design and
operation phases of transportation projects. This study will attempt to define and
evaluate PLOS for urban sidewalks in Nepal. Both qualitative and quantitative methods will be used
to define and classify PLOS. The qualitative method will be based upon analysis of pedestrian
perception in real time, and the quantitative method will classify PLOS in terms of speed, flow,
space and volume-to-capacity ratio using cluster analysis algorithms.
1.2 Statement of Problem
Rapid growth in the number of vehicles threatens pedestrian safety. Engineers often fail
to provide satisfactory roadside facilities for pedestrians, or they compromise pedestrian safety in
order to design better facilities for other modes. A large percentage of deaths and injuries are
among pedestrians, as they are the most vulnerable road users. A safe environment must be provided
for pedestrians, one that keeps them free of conflicts with other modes of transportation.
PLOS analysis defines the operating condition of a pedestrian facility, and this analysis helps in
addressing growth management. PLOS criteria judge the operational efficiency of pedestrian
infrastructure. The road users, traffic, road facilities and environmental factors of Nepal are
completely different from those of developed countries such as the US, Canada and Australia.
Modified criteria are therefore necessary in the Nepalese context for the qualitative measurement of
the service level of sidewalks.
1.3 Objectives
This thesis will attempt to determine the factors that affect sidewalk performance, based
upon the perception of pedestrians, and the level of service will be calculated from the information
they provide. In addition, videographic techniques will be adopted to obtain pedestrian data for the
quantitative estimation of PLOS.
1. To identify criteria appropriate for evaluating pedestrian LOS on urban streets in the
Nepalese context.
2. To determine perception-based PLOS by means of questionnaire surveys.
3. To determine PLOS in terms of pedestrian average speed, space, flow rate and v/c ratio with the
help of various clustering algorithms.
4. To find the most suitable cluster analysis algorithm for defining the ranges of PLOS.
5. To compare the obtained ranges of PLOS with existing, widely used international
PLOS models.
1.4 Overall Framework of the Study
[Figure: flowchart of the overall framework of the study; only the step "Data Collection" is recoverable]
Chapter 2
LITERATURE REVIEW
Level of Service (LOS) is used by traffic engineers to evaluate the effectiveness of a transport
facility. LOS was first introduced in the Highway Capacity Manual (HCM) in 1965 to describe the
quality of service provided by a given facility under different conditions. Later, HCM 2000 divided
pedestrian facilities into two broad categories, uninterrupted and interrupted. HCM 2010 analyzed
PLOS by measuring the pedestrian flow rate, which incorporates speed, density and volume, and the
sidewalk space. Pedestrian speed decreases as volume and density increase; pedestrian space is
reduced, which in turn reduces the ease of maneuver.
The HCM defines six levels of service, ranging from A to F. On a sidewalk with LOS A, pedestrians
are able to move along their desired path without needing to alter their movement. With LOS B, there
is sufficient area for pedestrians to walk freely and bypass others, with an occasional need to
adjust their path to avoid conflicts. At LOS C, pedestrians need to adjust their path frequently to
avoid conflicts, but space is sufficient for normal walking speed. At LOS D, freedom to walk at
normal speed and to bypass slower pedestrians is restricted. At LOS E, virtually all pedestrians
have their walking speed restricted and need to adjust their walking behavior frequently. LOS F is
the worst condition, where all walking speeds are severely restricted and there is frequent contact
with other pedestrians.
The following steps are used to determine the PLOS of a sidewalk in the HCM.
[Figure: flowchart of the HCM procedure; only the step "Determine LOS" is recoverable]
Jaskiewicz (2000) proposed a method of evaluating PLOS based upon trip quality. Nine specific
measures of the pedestrian system were evaluated in terms of pleasantness, safety, and
functionality: enclosure/definition, complexity of path networks, building
articulation, complexity of spaces, transparency, buffers, shade trees,
overhangs/awnings/varied roof lines, and physical components/conditions. Each of these measures
was derived from a combination of safety issues and volume and capacity considerations.
Miller et al. (2000) used visualization as a simulation tool to validate and calibrate PLOS in
suburban areas. The simulations were produced in a short time, and respondents were able to
understand from the visualizations what types of improvement were being considered.
Pascal (2003) included obstacles in pedestrian simulations. A person requires 0.3 m of lateral
spacing on each side and extra longitudinal space for speed deviation. Based on this research, the
measured distance to obstacles is 0.45 m for walls, 0.35 m for fences and roadways, and 0.3 m for poles.
Sarkar (2003) introduced some major theoretical guidelines for the qualitative evaluation of the
levels of comfort offered along walkways in major activity centers. Research on urban design,
environmental psychology, landscape architecture, and urban planning was used to develop the
method. The method included two separate evaluations: the service level, which gives standards
for overall desirable and undesirable comfort conditions at the macro level, and the quality
level, which looks at the finer, micro-level details of pedestrian comfort. Service level and
quality level were based on physical, physiological, and psychological comfort. Comfort
requirements vary depending on cultural and spatial context.
Rahaman (2005) explored the qualitative level of pedestrian comfort in Dhaka using six
broad categories of roadside walking environment: safety, security, convenience and
comfort, continuity of the walkway, system coherence, and attractiveness, each assessed through
specific facilities. Some qualitative data were collected by observation survey, whereas the
walkers' responses were recorded through a questionnaire survey. The questionnaire was designed to
obtain the opinion of pedestrians on the sidewalk environment against those six criteria. The
research concluded that the safety and convenience of pedestrians were neglected. Hence, city
authorities must give more attention to pedestrian infrastructure rather than to that for motorized
vehicles.
Kim et al. (2006) found that street performers have a negative impact on pedestrian LOS because
they create congestion, limit access and interfere with pedestrian flows. Petritsch et al. (2006)
incorporated traffic volumes on the adjacent roadway and exposure at conflict points with
intersections and driveways. The study revealed that traffic volumes on the adjacent roadway and
the density of conflict points along the facility are the primary factors in the LOS model for
pedestrians traveling along urban arterials with sidewalks.
Dandan (2007) developed a method to assess pedestrian LOS from pedestrian perceptions.
Respondents were categorized into three groups based on age, gender, and walking experience,
and a questionnaire survey was then conducted. Stepwise regression was used to build the model,
which included variables such as bicycle volume, pedestrian volume, vehicle volume,
driveway access quantity per meter, and distance between the sidewalk and vehicle lanes. The study
examined methods of assessing pedestrian level of service by analyzing the relationship between
pedestrians' subjective perceptions and the quality of the physical road facilities as well as
traffic flow operation. The model was developed using 395 real-time observations from 12
urban roadway segment sidewalks in China. After data collection, the Pearson correlation
coefficient was calculated to check the linear relation between pairs of variables, and the
following model was developed:
\[ \mathrm{PLOS} = -1.43 + 0.006\,Q_B - 0.003\,Q_P + 0.056\,\frac{Q_V}{W_r} + 11.24\,(P - 1.17\,P^{3}) \]
Where
$Q_B$ = bicycle traffic during a 5 min period
$Q_P$ = pedestrian traffic during a 5 min period
$Q_V$ = vehicle traffic during a 5 min period
$P$ = driveway access quantity per meter
$W_r$ = distance between sidewalk and vehicle lane
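As a rough illustration of how the regression above would be evaluated, the following sketch codes it as a plain function. It assumes that "P3" in the source denotes P cubed; the function name and the input values are hypothetical and are not taken from Dandan (2007).

```python
def dandan_plos(q_b, q_p, q_v, p, w_r):
    """Evaluate the regression quoted above.

    q_b, q_p, q_v: bicycle, pedestrian and vehicle counts in a 5 min period
    p: driveway access quantity per meter
    w_r: distance between sidewalk and vehicle lane
    Assumption: the cubic term reads P**3 (the superscript is lost in the source).
    """
    return (-1.43 + 0.006 * q_b - 0.003 * q_p
            + 0.056 * q_v / w_r + 11.24 * (p - 1.17 * p ** 3))


# Hypothetical inputs, for illustration only
score = dandan_plos(q_b=50, q_p=100, q_v=200, p=0.05, w_r=2.0)
```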
Jianhong et al. (2008) analyzed pedestrian flow characteristics on one-way
passageways, two-way passageways, descending stairways, and ascending stairways in Shanghai
metro stations and found that, from the viewpoint of macroscopic statistics, consistent rules for
flow, density, and speed could be applied to both pedestrian flows and vehicle flows.
Aultman-Hall et al. (2009) found that season and weather affect levels of pedestrian
volume in downtown Montpelier, Vermont. Precipitation reduces the average hourly volume
by nearly 13% and the winter months reduce it by 16%. It was noted that, at best, a combination of
weather variables accounts for 30% of the variance measured in hourly volumes.
Australian Method: The Australian method for PLOS depends on three groups of factors, namely
physical characteristics, location factors, and user factors. Pedestrian conditions are described by
a PLOS grade from PLOS A (ideal pedestrian conditions) to PLOS E (unsuitable pedestrian
conditions). Physical characteristics include path width, surface quality, obstructions, crossing
opportunities, and support facilities. Location factors address issues related to connectivity, path
environment, and the potential for vehicle conflict. Path environment is a measure of the degree
of pleasantness of the surrounding environment and relates to distance from the roadway. User
factors take into consideration pedestrian volume, mix of path users and personal security. The
method also effectively determines which factors contribute to a high or low PLOS. When using the
Australian method to determine the PLOS values for the study sidewalks, geometric, location
and user factors are considered.
Hidayat (2010) determined factors affecting sidewalk performance based on pedestrians'
perception. A questionnaire with 27 items was developed to measure pedestrian perception in
five different areas: (a) safety, (b) sidewalk performance, (c) accessibility, (d) vendors' presence,
and (e) comfort/convenience. It was believed that each item could potentially affect sidewalk
performance. Data collection was performed in the Pratunam area, one of the commercial areas of
Bangkok, Thailand, where street vendors operate side by side along the sidewalks.
The majority of respondents (60%) were female. Respondents were grouped by age into under 18 years
(32%), 18 to 30 years (61%), and 31 to 56 years (5%). Walking behavior included walking in pairs
(45%), walking alone (29%), walking in a group of three (12%), and walking in a group of more than
three persons (10%). About 67% of respondents stated that walking was their main mode during the
survey. The Kaiser-Meyer-Olkin (KMO) test and/or Bartlett's test of sphericity were then applied
to the interview data to check whether factor analysis was appropriate. A reliability
test was used to measure the consistency of the questionnaire.
Sahani and Bhuyan (2013, 2015) studied off-street pedestrian facilities in mid-sized cities in India.
They calculated pedestrian space, flow rate, v/c ratio and average walking speed as measures of
effectiveness for Bhubaneswar and Rourkela, cities with populations of less than a million, using a
self-organizing map in an artificial neural network system. Clustering analysis was taken up in their
study. The Self-Organizing Map (SOM), a clustering method in Artificial Neural
Networks (ANN), was used for grouping a subset of small typical traffic patterns to determine the
appropriate number of groups.
Prativa Gywali (2014) developed a PLOS model in the context of Kathmandu city, Nepal. Pedestrian
perception was also considered to a small extent during the analysis.
Y=1.76158+0.0048FR+1.00495D-0.2531W-0.6712B
Where,
Y= PLOS
FR= Flow Rate
D= Density
W= Width
B= Buffer
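The linear model above can be evaluated directly. The sketch below is only an illustration: the function name and the input values are hypothetical, and the units of each variable are not stated in the source, so they are left unspecified here.

```python
def gywali_plos(flow_rate, density, width, buffer):
    """Evaluate Y = 1.76158 + 0.0048 FR + 1.00495 D - 0.2531 W - 0.6712 B."""
    return (1.76158 + 0.0048 * flow_rate + 1.00495 * density
            - 0.2531 * width - 0.6712 * buffer)


# Hypothetical inputs, for illustration only
y = gywali_plos(flow_rate=10.0, density=0.5, width=2.0, buffer=1.0)
```

Under this model a wider sidewalk or a larger buffer lowers Y, while a higher flow rate or density raises it.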
Cluster Analysis:
Clustering is the formation of groups of objects based upon the information in the data that
describes their relationships. It is a way to form groups from a large data set. Various cluster
analysis methods are used to define pedestrian level of service categories, such as k-means, fuzzy
c-means, hierarchical agglomerative clustering, SOM in ANN, affinity propagation and GA Fuzzy Cluster
Analysis.
K-means clustering
It is a type of unsupervised learning, used when the data are unlabeled (i.e., data
without predefined groups). The algorithm finds groups in the data, with the number of groups
represented by the variable K. The algorithm works iteratively to assign each data point to one of
K groups based on the features provided. Data points are clustered based on similarity of features.
The k-means function partitions the observed data into k mutually exclusive clusters and returns
a vector of indices indicating the cluster to which each observation has been assigned. The
algorithm minimizes the sum of distances from each object to its cluster centroid, moving
objects between clusters until the sum cannot be decreased further.
Kim and Yamashita (2005) used k-means clustering to analyze the pedestrian crash pattern. They
illustrated the use of k-means clustering technique to analyze the locations and patterns of traffic
accidents. They also found that for pedestrian safety analysis k-means is the most appropriate
method to locate compact, localized clusters.
Fuzzy c-means clustering
Fuzzy c-means clustering is a generalization of partition clustering methods (such as k-means)
that allows an individual to be classified into more than one cluster. Suppose we have k clusters
and we define a set of variables that represent the probability that object i is classified into
cluster k. In partition clustering algorithms, one of these values will be one and the rest will be
zero, so these algorithms classify an individual into one and only one cluster. In fuzzy clustering,
however, objects are not assigned to a particular cluster: they possess a membership function
indicating the strength of membership in all or some of the clusters. This is called fuzzification
of the cluster configuration. The concept of a membership function derives from fuzzy logic, an
extension of Boolean logic in which the concepts of true and false are replaced by that of partial
truth. Boolean logic can be represented by set theory, and in an analogous manner, fuzzy logic is
represented by fuzzy set theory.
Chakroborthy and Kikuchi (1990) have shed light on the application of fuzzy set theory to the
analysis of highway capacity and level of service. The authors have shown the inadequacies in use
of the current procedure to determine highway capacity and service level. The values of input
variables and output variables involved in calculating capacity and service level were represented
by the fuzzy numbers. In this study it has been shown that it is much better if the levels of service
categories are defined as fuzzy sets.
Hierarchical agglomerative clustering
Hierarchical clustering investigates groupings in the data simultaneously over several scales by
creating a cluster tree. The tree is not a single set of clusters, but rather a multi-level
hierarchy, where clusters at one level are joined as clusters at the next higher level. This allows
us to decide what level or scale of clustering is most appropriate in our application.
Lingra (1995) compared the Hierarchical Agglomerative Clustering and Kohonen Neural Network
methods for classifying traffic patterns. It was noted
that the Kohonen neural network integrates the hierarchical grouping of complete
patterns and the least-mean-square approach for classifying incomplete patterns. It is advantageous
to use hierarchical grouping on a small subset of typical traffic patterns to determine the
appropriate number of groups and change its parameters to reflect the changing traffic patterns.
Such an approach is useful in using hour-to-hour and day-to-day traffic variations in addition to
the monthly traffic-volume variation in classifying highway sections.
Self-Organizing Map (SOM) Clustering
A Self-Organizing Map (SOM) is a type of Artificial Neural Network (ANN). It has the capability to
learn the pattern of its inputs and to find correlations between inputs and responses. An ANN may be
used for clustering speed data in the particular problem of defining the LOS of urban streets.
Levy et al. (1994) compared the ability of supervised and unsupervised
learning methods for classification and clustering. Garni and Abdennour (2008) developed a
method to detect and count the vehicles plying the road from video data using an ANN.
The authors applied a self-organizing neural network pattern recognition method to
classify highway traffic states into distinctive cluster centers.
Jian-ming (2010) devised a way to combine ANN and Genetic Algorithm methods for the
prediction of traffic volume in the Shanghai Metropolitan Area. The accuracy of future traffic
volume prediction improved with this combined algorithm. Cetiner et al. (2010) developed
a back-propagation neural network traffic flow model for predicting the traffic volume of Istanbul
City. The model uses historical data at major junctions of the city to predict future traffic
volume. Florio and Mussone (1995) took advantage of the application of ANN to
classification problems to develop the flow-density relationship of a motorway. The authors defined
the stability and instability of vehicle spacing in the traffic stream. Murat and Basken (2006) used
ANN to determine non-uniform delay, which is part of the total vehicular delay at signalized
intersections. Sharma et al. (1994) studied and compared the learning ability of both supervised
and unsupervised learning methods for clustering.
Affinity Propagation (AP) clustering
Affinity propagation is a clustering method developed by Frey and Dueck (2007). It
initially considers all data points as potential center points. Messages are exchanged between data
points, each reflecting the current interest of one data point in selecting another data point as
its center point, also called an exemplar. Researchers have used this algorithm to solve various
clustering problems. Frey and Dueck (2007) used the AP algorithm to cluster images of faces and
genes in microarray data. They found that AP performed accurately and in less than one-hundredth of
the time taken by other conventional methods of clustering. Conroy and Xi (2009) developed a
semi-supervised AP algorithm for face-image clustering and functional Magnetic Resonance Imaging
(fMRI) volumetric pixel clustering. Xia et al. (2008) presented two variants of AP for grouping
large-scale data with a dense similarity matrix: the local approach was Partition Affinity
Propagation (PAP) and the global approach was Landmark Affinity Propagation (LAP). Refianti et al.
(2012) compared the accuracy and effectiveness of the AP and K-Means algorithms. They found that AP
was more effective than K-Means when both algorithms were applied to the relationship between two
variables, i.e. Grade Point Average (GPA) and duration of Bachelor-Thesis completion, at Gunadarma
University. Zhang and Zhuang (2008) presented a modified AP algorithm called voting partition
affinity propagation (voting-PAP), which is a method of clustering using evidence accumulation.
Yang et al. (2010) used the AP clustering algorithm in traffic engineering. The authors proposed a
model-based temporal association scheme and novel pre-processing and post-processing operations,
which together with affinity propagation make a successful method for vehicle detection and traffic
surveillance. Zhang et al. (2012) proposed an instant traffic clustering algorithm using AP to find
points on a road having similar traffic patterns. The authors found the algorithm to be suitable for
predicting the traffic pattern and for finding the influence of the traffic pattern at one point on
that at another point.
Chapter 3
METHODOLOGY
Pedestrian LOS is determined by two methods. One is a quantitative method using cluster analysis of
pedestrian data (average speed, flow rate, v/c ratio and pedestrian space), and the other is a
qualitative method using a questionnaire survey based on pedestrian perceptions.
3.1 Methods of cluster analysis
The key steps involved in applying the methodology are:
[Figure: flowchart of the k-means procedure. Recoverable steps: assign number of clusters; centroid; object-to-centroid distance; group; end when no object moves group]
Mathematically,
Step 1: From a data set of N points, the k-means algorithm allocates each data point to one of c
clusters so as to minimize the within-cluster sum of squares.
\[ D_{ik}^{2} = (x_k - v_i)^{T} (x_k - v_i), \quad 1 \le i \le c, \; 1 \le k \le N \]
Where,
$D_{ik}^{2}$ is the entry of the distance matrix between data point k and cluster center i,
$x_k$ is the kth data point,
$v_i$ is the mean of the data points in cluster i, called the cluster center.
Step 2: Select for each cluster the points that have the minimal distance from its centroid.
Step 3: Calculate the cluster centers
\[ v_i^{(l)} = \frac{\sum_{j=1}^{N_i} x_j}{N_i} \]
and repeat the steps while
\[ \max_i \left| v_i^{(l)} - v_i^{(l-1)} \right| \neq 0 \]
Where $N_i$ is the number of objects in cluster i, $x_j$ is the jth object in cluster i, and l is
the iteration number.
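The three steps can be sketched in a few lines of plain Python. This is a minimal illustration rather than the implementation intended for the study: the function name `kmeans`, the choice of the first k points as initial centres, and the walking-speed values are all assumptions made here.

```python
def kmeans(points, k, max_iter=100):
    """Plain k-means on a list of equal-length tuples."""
    # Assumption: the first k points are used as initial centres. This keeps the
    # sketch deterministic but is not a recommended initialisation for real data.
    centres = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Steps 1-2: assign each point to the centre with the smallest squared distance
        labels = [
            min(range(k), key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centres[i])))
            for p in points
        ]
        # Step 3: recompute each centre as the mean of its members
        new_centres = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                new_centres.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centres.append(centres[i])  # keep the centre of an empty cluster
        if new_centres == centres:  # stop when no centre moves
            break
        centres = new_centres
    return labels, centres


# Hypothetical walking speeds (m/min), one feature per pedestrian
speeds = [(82.0,), (80.0,), (61.0,), (59.0,), (78.0,), (60.0,)]
labels, centres = kmeans(speeds, k=2)
```

With PLOS data, k would be 6 (classes A to F) and each tuple would hold one of the measures of effectiveness, such as average speed or pedestrian space.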
Fuzzy c-means clustering
The Fuzzy c-means clustering algorithm is based on the minimization of an objective function
called c-means functional.
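The functional itself is missing from the text at this point. Assuming the standard fuzzy c-means form, which matches the symbols defined immediately below (N, the number of data points, and C, the number of clusters, are notation assumed here), it would read:

```latex
\[
J_m = \sum_{i=1}^{N} \sum_{j=1}^{C} u_{ij}^{\,m} \, \lVert x_i - c_j \rVert^{2}
\]
```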
Here,
m = any real number greater than 1 (1 < m < ∞)
$u_{ij}$ = degree of membership of $x_i$ in cluster j
$x_i$ = ith item of the d-dimensional measured data
$c_j$ = d-dimensional center of cluster j
||*|| = any norm expressing the similarity between any measured data and the center.
Fuzzy partitioning is carried out through an iterative optimization of the objective function $J_m$,
with the membership $u_{ij}$ and the cluster centers $c_j$ updated at each iteration step until the
termination criterion is met.
Where
Ɛ = termination criterion between 0 and 1,
k = iteration step.
This procedure converges to a local minimum or a saddle point of $J_m$.
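The update equations are not legible in the text, so the sketch below assumes the standard fuzzy c-means updates (membership from relative distances, centres as membership-weighted means) and stops when the largest change in any membership falls below Ɛ. The function name, default values and sample points are hypothetical.

```python
def fuzzy_c_means(points, centres, m=2.0, eps=1e-5, max_iter=100):
    """Alternate membership and centre updates until memberships change by less than eps."""
    dim = len(points[0])
    u = [[0.0] * len(centres) for _ in points]
    for _ in range(max_iter):
        new_u = []
        for x in points:
            d = [sum((a - b) ** 2 for a, b in zip(x, cj)) ** 0.5 for cj in centres]
            if min(d) == 0.0:
                # A point lying exactly on a centre gets full membership there
                hits = [1.0 if dj == 0.0 else 0.0 for dj in d]
                row = [h / sum(hits) for h in hits]
            else:
                # Assumed update: u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1))
                row = [1.0 / sum((dj / dk) ** (2.0 / (m - 1.0)) for dk in d) for dj in d]
            new_u.append(row)
        # Assumed update: c_j = sum_i u_ij^m x_i / sum_i u_ij^m
        # (raises ZeroDivisionError in the degenerate case of a cluster with zero total weight)
        centres = []
        for j in range(len(new_u[0])):
            w = [row[j] ** m for row in new_u]
            centres.append(tuple(
                sum(wi * x[t] for wi, x in zip(w, points)) / sum(w) for t in range(dim)
            ))
        change = max(abs(a - b) for ra, rb in zip(new_u, u) for a, b in zip(ra, rb))
        u = new_u
        if change < eps:
            break
    return u, centres


# Hypothetical one-dimensional data with two obvious groups
pts = [(1.0,), (2.0,), (9.0,), (10.0,)]
memberships, fcm_centres = fuzzy_c_means(pts, centres=[(0.0,), (5.0,)])
```

Each row of `memberships` sums to one, so a pedestrian observation can belong partly to two adjacent LOS classes instead of being forced into a single one.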
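Hierarchical agglomerative clustering
Step 1: Find the distance between every pair of objects in the data set. The Euclidean distance
between objects i and j, each described by p variables, is given by,
\[ D(i,j) = \sqrt{(x_{i1} - x_{j1})^{2} + (x_{i2} - x_{j2})^{2} + \cdots + (x_{ip} - x_{jp})^{2}} \]
The Minkowski distance, which is a generalization of both the Euclidean and Manhattan distances, is
given by,
\[ D(i,j) = \left( |x_{i1} - x_{j1}|^{q} + |x_{i2} - x_{j2}|^{q} + \cdots + |x_{ip} - x_{jp}|^{q} \right)^{1/q} \]
Here, q ≥ 1; for q = 1 this distance is the Manhattan distance and for q = 2 it is the
Euclidean distance.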
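A short sketch of the Minkowski distance shows how the two special cases fall out of one formula. The function name and the sample points are illustrative only.

```python
def minkowski(x, y, q=2):
    """Minkowski distance of order q between two equal-length sequences."""
    if q < 1:
        raise ValueError("q must be at least 1")
    return sum(abs(a - b) ** q for a, b in zip(x, y)) ** (1.0 / q)


manhattan = minkowski((0.0, 0.0), (3.0, 4.0), q=1)  # sum of absolute differences
euclidean = minkowski((0.0, 0.0), (3.0, 4.0), q=2)  # straight-line distance
```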
Step 2: Group the objects into a binary, hierarchical cluster tree. Here, we link together pairs of
objects that are in close proximity using the linkage function. The linkage function uses the
distance information generated in step 1 to determine the proximity of objects to each other. As
objects are paired into binary clusters, the newly formed clusters are grouped into larger clusters
until a hierarchical tree is formed.
Step 3: Determine where to divide the hierarchical tree into clusters. Here, we divide the objects
in the hierarchical tree into clusters using the cluster function. The cluster function can create
clusters by detecting natural groupings in the hierarchical tree or by cutting off the hierarchical
tree at an arbitrary point.
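The linking and cutting steps can be illustrated with a small pure-Python sketch. It assumes single linkage (distance between the closest pair of members) and cuts the tree by stopping at a requested number of clusters; the text does not state which linkage rule will be used, and the function name and data are hypothetical.

```python
def agglomerate(points, n_clusters):
    """Single-linkage agglomerative clustering; returns clusters as lists of point indices."""
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5

    clusters = [[i] for i in range(len(points))]  # every object starts as its own cluster
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members
                d = min(dist(points[i], points[j]) for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # link the closest pair of clusters
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical one-dimensional observations
groups = agglomerate([(1.0,), (1.2,), (5.0,), (5.1,), (9.0,)], n_clusters=3)
```

Stopping at `n_clusters` corresponds to cutting the hierarchical tree at the level that leaves the desired number of groups.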
SOM Algorithm:
In a SOM, a set of nodes is arranged in a geometric pattern, typically a 2-dimensional lattice.
The arrangement of neurons may follow a grid, hexagonal or random topology. Each node is associated
with a weight vector of the same dimension as the input space. The purpose of the SOM is to
find a good mapping from the input space to the lattice. During training, each input vector is
presented to the map. Clustering with the SOM algorithm follows two steps.
Step 1:
Compare the input data with all the weight vectors $m_i(t)$ and
identify the Best Matching Unit (BMU) on the map. The BMU is the node having the lowest
Euclidean distance with respect to the input pattern x(t). The final topological organization of
the map is heavily influenced by this distance.
The BMU $m_c(t)$ is identified by:
\[ \lVert x(t) - m_c(t) \rVert \le \lVert x(t) - m_i(t) \rVert \quad \text{for all } i \]
Step 2: Update the weight vectors of the BMU and its neighbours as:
\[ m_i(t+1) = m_i(t) + h_{b(x),i}(t) \, \big( x(t) - m_i(t) \big) \]
where $h_{b(x),i}$ is the neighbourhood function defined as:
\[ h_{b(x),i}(t) = \alpha(t) \exp\!\left( -\frac{\lVert r_i - r_{b(x)} \rVert^{2}}{2\sigma^{2}(t)} \right) \]
Here $0 < \alpha(t) < 1$ is the learning factor, which decreases with each iteration, and $r_i$ and
$r_{b(x)}$ are the locations of the neurons in the lattice.
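A single training step can be sketched as follows. The sketch assumes the learning factor enters once, through a Gaussian neighbourhood function, and it updates every node with a weight that decays with lattice distance from the BMU. The function name, the two-node map and the parameter values are hypothetical.

```python
import math


def som_step(weights, positions, x, alpha, sigma):
    """One SOM update: locate the BMU, then pull each node toward the input x."""
    def sqdist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    # Step 1: the BMU is the node whose weight vector is closest to x
    bmu = min(range(len(weights)), key=lambda i: sqdist(x, weights[i]))
    updated = []
    for w, r in zip(weights, positions):
        # Step 2: Gaussian neighbourhood around the BMU's lattice position
        h = alpha * math.exp(-sqdist(r, positions[bmu]) / (2.0 * sigma ** 2))
        updated.append(tuple(wi + h * (xi - wi) for wi, xi in zip(w, x)))
    return bmu, updated


# Hypothetical map with two nodes on a line and a one-dimensional input
bmu, new_weights = som_step(
    weights=[(0.0,), (10.0,)], positions=[(0, 0), (1, 0)], x=(1.0,), alpha=0.5, sigma=1.0
)
```

In a full training run, `alpha` and `sigma` would both shrink over the iterations so that the map settles.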
[Figure: clustering flowchart. Recoverable steps: start; construct similarity matrix; change in decision?; end]
Here,
$d(v_k, x_i)$ = Euclidean distance between the object $x_i = (x_{i1}, x_{i2}, \ldots, x_{in})$ and the
center of cluster $v_k = (v_{k1}, v_{k2}, \ldots, v_{kn})$
$m \in (1, \infty)$ is the exponential weight that determines the fuzziness of the clusters
The local minimum obtained with the fuzzy c-means algorithm often differs from the global
minimum. Because of the large volume of calculation, searching for the global minimum of the
function J is difficult. A genetic algorithm (GA), which uses the principle of survival of the
fittest, gives good results for such optimization problems.
3.2 Cluster validation and selection of best cluster:
Of the six clustering methods above, the one most relevant to cities in the Nepalese context has to
be determined, and this can be evaluated with the help of the Silhouette Width Index.
The Silhouette Width Index was proposed by Rousseeuw (1987) to evaluate clustering results.
The silhouette width index s(i) is a composite index that reflects the compactness and separation of
the clusters. The average s(i) over all data points reflects the quality of the clustering result.
A larger silhouette value signifies a better clustering.
Silhouette width is calculated as follows:
\[ s(i) = \frac{b(i) - a(i)}{\max\{a(i), b(i)\}} \]
Where,
a(i) = average distance of data point i to the other data points in the same cluster
b(i) = average distance of that data point to all the data points belonging to the nearest
cluster
The silhouette ranges from -1 to 1. A high value indicates that the object is well matched to its
own cluster and poorly matched to neighboring clusters. If most objects have a high value, then
the clustering configuration is appropriate. If many points have a low or negative value, then the
clustering configuration may have too many or too few clusters.
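The silhouette calculation can be sketched in plain Python as below. The function name and the sample data are hypothetical; the value 0 for a single-member cluster is a common convention adopted here, not something stated in the text.

```python
def silhouette_widths(points, labels):
    """Return s(i) for every point; labels must be a list with at least two distinct values."""
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5

    widths = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points) if labels[j] == labels[i] and j != i]
        if not own:
            widths.append(0.0)  # convention assumed here for a single-member cluster
            continue
        a = sum(own) / len(own)
        # b(i): smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(p, q) for j, q in enumerate(points) if labels[j] == other)
            / labels.count(other)
            for other in set(labels) if other != labels[i]
        )
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return widths


# Hypothetical observations already split into two clusters
sample = [(0.0,), (1.0,), (10.0,), (11.0,)]
widths = silhouette_widths(sample, [0, 0, 1, 1])
mean_width = sum(widths) / len(widths)
```

Running each clustering method on the same data and comparing `mean_width` is one way to carry out the selection described in this section.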
3.3 Questionnaire Form Survey:
The key steps involved in applying the methodology are:
[Figure: flowchart of the questionnaire survey methodology; only the step "PLOS modelling" is recoverable]
Where
R = [rij] is the correlation matrix and
U = [uij] is the partial covariance matrix.
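The expression to which these symbols belong is missing from the text. Assuming it is the overall Kaiser-Meyer-Olkin measure of sampling adequacy in its usual form, with sums taken over all pairs of distinct variables, it would read:

```latex
\[
\mathrm{KMO} = \frac{\sum_{i \neq j} r_{ij}^{2}}
                    {\sum_{i \neq j} r_{ij}^{2} + \sum_{i \neq j} u_{ij}^{2}}
\]
```

Under this form, values close to 1 indicate that the partial terms are small relative to the correlations, which supports the use of factor analysis.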