Abstract: Lane-based road network information, such as the number and locations of traffic lanes on a road, plays an important role in intelligent transportation systems. In this paper, we propose a method called Collecting Lane-based Road Information via Crowdsourcing (CLRIC), which automatically extracts the detailed lane structure of roads from crowdsourcing data collected by vehicles. First, CLRIC filters high-precision GPS data from the raw trajectories using region growing clustering with prior knowledge. Second, CLRIC mines the number and locations of traffic lanes through an optimized constrained Gaussian mixture model. Experiments are conducted with taxi GPS trajectories in Wuhan, China, and the results, evaluated quantitatively against satellite images and human interpretation, show that CLRIC recovers detailed road networks with the number and locations of traffic lanes.

Index Terms: Lane-based road information, crowdsourcing data, high-precision GPS data filtering, spatiotemporal GPS trajectories.

I. INTRODUCTION

applied in the Open Street Map (OSM) for road-level map construction, which uses the Global Positioning System (GPS) for localization [6]–[9]. However, low-end GPS devices and urban canyons with tall buildings reduce the position accuracy of GPS data to about 10–15 m in urban areas. It is therefore challenging to extract lane-based road information from low-precision crowdsourcing GPS data.

In this paper, we propose CLRIC: collecting lane-level road network information via crowdsourcing. CLRIC can automatically extract the detailed lane structure of roads using crowdsourcing GPS data collected by vehicles. CLRIC is based on two key observations. The first observation is that, according to GPS error analysis [10], high-precision GPS trajectories with accuracies of about 3 m still exist in raw vehicle trajectories. Thus, region growing clustering with prior knowledge (RGCPK) is used in the CLRIC system to select high-precision GPS data from low-precision raw GPS data. The second observation is that vehicle trajectories contain abundant information regarding road networks [11]–[13], traffic conditions [14], [15], points
optional compared with professional way, so the raw GPS data are mixed with many outliers. At present, there are several ways to optimize raw trajectories, such as filtering, map matching, and clustering. Filtering is suitable where high-sampling-rate trajectory data are particularly noisy, or when it is necessary to derive other quantities such as speed or direction from the trajectory data [18]. Map matching is another way to optimize raw trajectory data, in which each trajectory is matched to the road centerline it corresponds to [19]. In addition, some researchers have proposed using clustering methods to remove outliers. The authors of [14] used a kernel density method to identify and remove outliers. The authors of [4] sort all the data points in ascending order according to their distances from the median and then choose 95% of the sorted data points as the experimental data. However, all these methods [4], [14], [18], [19] have their defects. Filtering is sensitive to the sampling rate of GPS data, so it is poorly suited to GPS data with a low sampling rate. Map matching is valid for road-level information extraction, such as road network updating and traffic flow detection, but it is useless for lane-based road information extraction because each GPS point is matched to the road centerline. Besides, the existing clustering methods [4], [14] depend heavily on parameter settings and cannot remove outliers that are mixed into high-density point clusters.

Extracting information from pre-processed GPS data is a key issue in the geographic domain. In this study, we focus on lane-based information extraction from GPS data. There has also been work on completely automated methods aimed at inferring road maps from crowdsourcing data. Those methods include matching GPS traces to prototypical shapes [20], using an incremental method to process GPS traces that can be used to generate road maps [21], [22], and applying clustering methods or artificial algorithms to extract road networks from GPS traces [23]–[25]. Besides, OpenStreetMap uses user-contributed GPS trajectories to create free digital maps that are open for editing from registered users. Likewise, WikiMapia, Google Maps, and other map applications let users update maps. The methods proposed in [20]–[25] can generate and update road-level maps from crowdsourcing data, while detailed road network generation has gradually shifted down to lane-based road network information such as the number and locations of traffic lanes.

Lane-based information extraction from vehicle trajectories starts with differential GPS data and concludes with a refinement of an existing map, including finding lanes and lane transitions through the intersections [26], [27]. This process involves smoothing and filtering the GPS data, matching it to an existing map, spline fitting for the road centerlines, clustering to find lanes, and refinement of the intersection geometry [28]. The authors of [29] proposed using vehicle trajectories collected by mobile phones equipped with GPS and MEMS (Micro-Electro-Mechanical System) sensors to generate lane-level road maps in open areas. The lane-level information was extracted by statistically analyzing the probability density distribution of trajectories based on non-parametric Kernel Density Estimation. However, the methods discussed in [26]–[29] are based on the assumption that GPS trajectories from different lanes are well separated. For low-precision crowdsourcing GPS data, this assumption is seriously violated, and therefore we propose CLRIC to extract lane structure from a mass of low-precision crowdsourcing GPS data in urban areas.

III. COLLECTING LANE-BASED ROAD INFORMATION VIA CROWDSOURCING

The overview of the CLRIC system is shown in Fig. 1. As seen in Fig. 1, CLRIC includes two steps:
Step 1) Select high-precision data from crowdsourcing data based on region growing clustering with prior knowledge (RGCPK). The positional accuracy of the selected data can approach 3 m.
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.
Fig. 2. Trajectories and trajectory vectors. (a) and (b) show the trajectory and
trajectory vector respectively, where N indicates the north and is the angle
with north of vector v 1i .
where Per_i represents the percentage of ST_i, and N(ST_i) and N(T) are the numbers of tracking points of ST_i and T, respectively

$$p(x) = \sum_{j=1}^{k} w_j \frac{1}{\sqrt{2\pi\sigma^{2}}} \exp\left(-\frac{(x-\mu_j)^{2}}{2\sigma^{2}}\right) \tag{11}$$
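As a rough illustration of the mixture density in (11), the following Python sketch evaluates a one-dimensional Gaussian mixture whose components share a single standard deviation. The function name, the symbol names (weights, means, sigma), and the two-lane sample values are assumptions made for this example only; they are not taken from the paper.

```python
import math


def mixture_density(x, weights, means, sigma):
    """Evaluate a 1-D Gaussian mixture density at x.

    All k components share the same standard deviation sigma
    (an assumption of this sketch); weights should sum to 1.
    """
    norm = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    return sum(
        w * norm * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))
        for w, mu in zip(weights, means)
    )


# Hypothetical two-lane road: lane centers 3.5 m apart, equal traffic share.
weights = [0.5, 0.5]
means = [-1.75, 1.75]   # lateral offsets from the road centerline, in meters
sigma = 1.0             # shared spread of GPS points around each lane center

density_left = mixture_density(-1.75, weights, means, sigma)
density_right = mixture_density(1.75, weights, means, sigma)
```

With equal weights and a shared sigma, the density is symmetric about the road centerline, which is why the two lane centers in this example receive the same value.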
Fig. 7. Lane centerline. (a) is the fitting result of CGMM and the location of each lane is shown in (b).
Fig. 10. The construction rule of road. Part1 and Part2 are the most likely parts
of road to add lanes.
Fig. 11. Collection of DGPS trajectories and synchronized GPS trajectories. (a) shows the driving region of the shuttle vehicles; (b) is a magnification of (a), in which the black points and blue points represent GPS data and DGPS data, respectively.
Fig. 12. Crowdsourcing data collected by taxis. (a) shows the road network driven by the taxis and (b) shows the raw trajectories collected by the taxis.
Thus, we present a method to optimize the extracted number of lanes, as follows.

First, Nlane_{i+1} is compared with Nlane_i and Nlane_{i+2}: Nlane_{i+1} is replaced by Nlane_i when Nlane_i and Nlane_{i+1} differ while Nlane_i and Nlane_{i+2} are the same.

Second, the results of the first step are clustered according to the value of Nlane_i and their arrangement; for instance, Nlane_e, Nlane_{e+1}, Nlane_{e+2}, ..., Nlane_{e+c} are clustered together when their values are the same, with e < t and e + c < t. Each cluster corresponds to one number of lanes. Assume there are s clusters, recorded as C_j = {Nl_j, nc_j}, where Nl_j is the number of lanes of cluster C_j and nc_j is the total number of Nlane_i that belong to C_j, j = 1, 2, ..., s.

Finally, C_{j+1} is compared with C_j: Nl_{j+1} of C_{j+1} is replaced by Nl_j of C_j when Nl_{j+1} and Nl_j differ and nc_{j+1} < cv, where cv is a constraint value that depends largely on the road construction rules.

IV. EXPERIMENTS

CLRIC includes two steps: high-precision data selection and lane-based information extraction. Thus, to evaluate the performance of CLRIC, we used two different types of data sets: a training data set of ten shuttle vehicle traces and a test data set of thousands of taxi GPS traces. Both data sets were collected in urban areas. The training data set was used to extract prior knowledge for high-precision data selection and to verify the performance of the region growing clustering with prior knowledge method. The test data set served as the main data source for lane-based information extraction. The two data sets are introduced as follows.

The training data set was collected by shuttle vehicles. Each shuttle vehicle was equipped with a GPS logger and an Inertial Measurement Unit (IMU) that recorded two kinds of traces: GPS traces based on the GPS single-point positioning technique and synchronized DGPS traces based on differential global positioning technology. The positional accuracy of the GPS and DGPS data in urban areas was about 10–15 m and 0.5 m, respectively. The sampling interval for the training data set was 1 s. The data collection period for the shuttle vehicles was seven days. We obtained about 40 thousand GPS and DGPS points, shown in Fig. 11. The prior knowledge for high-precision data selection from the crowdsourcing data was extracted from part of the training data set by analyzing the similarity of the DGPS data and its synchronized GPS data. The remaining training trajectory data were used to evaluate the performance of RGCPK.

The test data set was collected by thousands of taxis in Wuhan using the single-point positioning technique; the GPS devices were placed at the center of the taxi roofs. The sampling interval of the taxi traces ranged from 10 s to 20 s, while their positioning accuracy ranged from 10 m to 15 m in urban areas. Each taxi recorded traces for an average of 14 days. We collected about 200 billion GPS points, shown in Fig. 12. According to each tracking point location and heading direction, we got about
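The three-step optimization of the lane-count sequence given before Section IV can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name and the sample sequence are hypothetical, replacements are applied in place from left to right, and each cluster in the last step is compared with the already-updated value of its predecessor; the text does not specify these details.

```python
from itertools import groupby


def smooth_lane_counts(lane_counts, cv):
    """Smooth a sequence of per-segment lane counts (Nlane_i).

    cv is the constraint value: clusters shorter than cv are absorbed
    by the preceding cluster. Returns a new list of the same length.
    """
    counts = list(lane_counts)

    # Step 1: replace an isolated value whose two neighbors agree.
    for i in range(len(counts) - 2):
        if counts[i] != counts[i + 1] and counts[i] == counts[i + 2]:
            counts[i + 1] = counts[i]

    # Step 2: cluster runs of equal values as [Nl_j, nc_j].
    clusters = [[value, len(list(run))] for value, run in groupby(counts)]

    # Step 3: a short cluster takes the lane count of its predecessor.
    for j in range(len(clusters) - 1):
        if clusters[j + 1][0] != clusters[j][0] and clusters[j + 1][1] < cv:
            clusters[j + 1][0] = clusters[j][0]

    # Expand the clusters back into one value per road segment.
    return [value for value, size in clusters for _ in range(size)]


# Hypothetical detection results along one road, with cv = 3 segments.
raw = [3, 3, 4, 3, 3, 3, 2, 2, 3, 3, 3, 3]
smoothed = smooth_lane_counts(raw, cv=3)
```

In this example the single 4 is removed by the first step, and the two-segment run of 2 is shorter than cv, so it is absorbed by the preceding three-lane cluster in the last step.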
TABLE I
THE WEIGHTS OF SIMILARITY EVALUATION MODEL
TABLE II
THE PRIOR KNOWLEDGE EXTRACTION
Fig. 14. High-precision data selection; (a) indicates the result of test data set selection, and (b) shows the selecting results of a part of training data set. The
red-solid points and black-empty circles represent outliers and high-precision data.
Fig. 15. CGMM results. (a), (b), (c) and (d) indicate the overview of k = 2, k = 3, k = 4, k = 5, respectively.
Fig. 16. The lane detection results. The black-solid points and red-empty circles represent the true value and detection results for the number of lanes.
Fig. 17. The optimized results of the number of lanes detection. The blue-empty circles and black-solid points represent the optimized results and truth value of the number of lanes.
C. Quantitative Evaluation

1) The Performance of Region Growing Clustering With Prior Knowledge Method (RGCPK): To evaluate the performance of the proposed RGCPK, we applied RGCPK to two data sets, shown in Fig. 19. The first, drawn from the training data set, was used to estimate the accuracy of the selected data; the other was used to assess the performance of RGCPK for lane number identification.

The measurement errors of the selected data points were computed against the synchronized DGPS data. The results show that the position accuracy of the selected data reaches 3.02 ± 1.2 m, where 3.02 m is the average value and 1.2 m is the standard deviation.

For the test data set, we could not estimate the position accuracy of the selected data because there was no high-precision synchronized DGPS data. Thus, the performance of RGCPK for crowdsourcing data was evaluated by comparing it with the
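An accuracy figure of the form mean ± standard deviation, as reported for the selected training data, could be computed along the following lines. This is only a sketch: it assumes that each selected GPS point and its synchronized DGPS point are already paired and expressed in a planar coordinate system in meters, and it uses the sample standard deviation; the function name and the sample coordinates are hypothetical.

```python
import math
import statistics


def position_error_stats(gps_points, dgps_points):
    """Return (mean, standard deviation) of GPS-to-DGPS distances.

    Both arguments are equally long sequences of (x, y) tuples in meters,
    paired by timestamp. At least two pairs are needed for the
    sample standard deviation.
    """
    if len(gps_points) != len(dgps_points):
        raise ValueError("GPS and DGPS sequences must be synchronized")
    errors = [
        math.hypot(gx - dx, gy - dy)
        for (gx, gy), (dx, dy) in zip(gps_points, dgps_points)
    ]
    return statistics.mean(errors), statistics.stdev(errors)


# Hypothetical paired samples (planar coordinates in meters).
gps = [(3.0, 0.0), (0.0, 4.0), (3.0, 4.0)]
dgps = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
mean_error, std_error = position_error_stats(gps, dgps)
```

Whether the sample or the population standard deviation is the better choice depends on how the evaluation points were drawn; `statistics.pstdev` can be substituted for the latter.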
TABLE III
LANE NUMBER IDENTIFICATION COMPARISONS
Fig. 18. The locations of each lane, where the yellow line depicts the centerline
of each lane and the white line shows the incorrect identification of lane
centerline.
Fig. 20. Lane width comparison between detecting value and actual value.
REFERENCES

[1] A. B. Hillel, R. Lerner, D. Levi, and G. Raz, "Recent progress in road and lane detection: A survey," Mach. Vis. Appl., vol. 25, no. 3, pp. 727–745, Apr. 2014.
[2] M. Thuy and F. León, "Lane detection and tracking based on lidar data," Metrol. Meas. Syst., vol. 17, no. 3, pp. 311–321, 2010.
[3] B. Yang, Z. Dong, and W. Dai, "Hierarchical extraction of urban objects from mobile laser scanning data," ISPRS J. Photogramm. Remote Sens., vol. 99, pp. 45–57, Jan. 2015.
[4] Y. Chen and J. Krumm, "Probabilistic modeling of traffic lanes from GPS traces," in Proc. 18th SIGSPATIAL Int. Conf. Adv. Geographic Inf. Syst., 2010, pp. 81–88.
[5] A. G. O. Yeh et al., "Hierarchical polygonization for generating and updating lane-based road network information for navigation from road markings," Int. J. Geographical Inf. Sci., vol. 29, no. 9, pp. 1–24, 2015.
[6] B. Zhou et al., "ALIMC: Activity landmark-based indoor mapping via crowdsourcing," IEEE Trans. Intell. Transp. Syst., vol. 16, no. 5, pp. 1–11, Oct. 2015.
[7] N. D. Lane, S. B. Eisenman, M. Musolesi, E. Miluzzo, and A. T. Campbell, "Urban sensing systems: Opportunistic or participatory?" in Proc. 9th Workshop Mobile Comput. Syst. Appl., 2008, pp. 11–16.
[8] M. Haklay and P. Weber, "OpenStreetMap: User-generated street maps," IEEE Pervasive Comput., vol. 7, no. 4, pp. 12–18, Oct.–Dec. 2008.
[9] B. Hull et al., "Cartel: A distributed mobile sensor computing system," in Proc. 4th Int. Conf. Embedded Netw. Sens. Syst., 2006, pp. 125–138.
[10] H. W. Mckenzie, C. L. Jerde, D. R. Visscher, E. H. Merrill, and M. A. Lewis, "Inferring linear feature use in the presence of GPS measurement error," Environ. Ecol. Stat., vol. 16, no. 4, pp. 531–546, 2009.
[11] J. Wang et al., "A novel approach for generating routable road maps from vehicle GPS trajectories," Int. J. Geographical Inf. Sci., vol. 29, no. 1, pp. 69–91, Jan. 2014.
[12] L. Tang, F. Huang, X. Zhang, and H. Xu, "Road network change detection based on floating car data," J. Netw., vol. 7, no. 7, pp. 1063–1070, 2012.
[13] P. Yin et al., "Mining GPS data for trajectory recommendation," in Advances in Knowledge Discovery and Data Mining. New York, NY, USA: Springer-Verlag, 2014, pp. 50–61.
[14] C. de Fabritiis, R. Ragona, and G. Valenti, "Traffic estimation and prediction based on real time floating car data," in Proc. IEEE 11th Int. ITSC, 2008, pp. 197–203.
[15] L. Tang, X. Chang, and Q. Li, "Public travel route optimization based on ant colony optimization algorithm and taxi GPS data," Chin. J. Highway Transp., vol. 24, no. 2, pp. 89–95, 2011.
[16] Y. Zheng, L. Zhang, X. Xie, and W.-Y. Ma, "Mining interesting locations and travel sequences from GPS trajectories," in Proc. Int. World Wide Web Conf., 2009, pp. 791–800.
[17] D. Sun et al., "Urban travel behavior analyses and route prediction based on floating car data," Transp. Lett. Int. J. Transp. Res., vol. 6, no. 3, pp. 118–125, Jul. 2014.
[18] W. C. Lee and J. Krumm, "Trajectory preprocessing," in Computing With Spatial Trajectories. New York, NY, USA: Springer-Verlag, 2011, pp. 3–33.
[19] S. Brakatsoulas, D. Pfoser, R. Salas, and C. Wenk, "On map-matching vehicle tracking data," in Proc. 31st Int. Conf. Very Large Data Bases, 2005, pp. 853–864.
[20] Y. Yanagisawa, J. Akahani, and T. Satoh, "Shape-based similarity query for trajectory of mobile objects," in Proc. 4th Int. Conf. Mobile Data Manage., Melbourne, Vic., Australia, Jan. 21–24, 2003, pp. 63–77.
[21] R. Bruntrup, S. Edelkamp, S. Jabbar, and B. Scholz, "Incremental map generation with GPS traces," in Proc. IEEE Intell. Transp. Syst., Sep. 13–15, 2005, pp. 574–579.
[22] J. Li, Q. Qin, C. Xie, and Y. Zhao, "Integrated use of spatial and semantic relationships for extracting road networks from floating car data," Int. J. Appl. Earth Observ. Geoinf., vol. 19, no. 10, pp. 238–247, 2012.
[23] A. Fathi and J. Krumm, "Detecting road intersections from GPS traces," in Geographic Information Science. Berlin, Germany: Springer-Verlag, 2010, pp. 56–69.
[24] G. Agamennoni, J. Nieto, and E. M. Nebot, "Robust inference of principal road paths for intelligent transportation systems," IEEE Trans. Intell. Transp. Syst., vol. 12, no. 1, pp. 298–308, Mar. 2011.
[25] L. Cao and J. Krumm, "From GPS traces to a routable road map," in Proc. 17th ACM SIGSPATIAL Int. Conf. Adv. Geographic Inf. Syst., 2009, pp. 3–12.
[26] S. Rogers, P. Langley, and C. Wilson, "Mining GPS data to augment road models," in Proc. 5th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, New York, NY, USA, 1999, pp. 104–113.
[27] K. Wagstaff, C. Cardie, S. Rogers, and S. Schrödl, "Constrained k-means clustering with background knowledge," in Proc. 18th ICML, San Francisco, CA, USA, 2011, pp. 577–584.
[28] S. Edelkamp and S. Schrödl, "Route planning and map inference with global positioning trajectories," Comput. Sci. Perspective, vol. 2598, pp. 128–151, 2003.
[29] A. Uduwaragoda, A. S. Perera, and S. A. D. Dias, "Generating lane level road data from vehicle trajectories using kernel density estimation," in Proc. IEEE 16th Int. Annu. ITSC, Oct. 6–9, 2013, pp. 384–391.
[30] X. Liu et al., "Road recognition using coarse-grained vehicular traces," HP Lab., Palo Alto, CA, USA, 2012, pp. 1–10.
[31] J. G. Lee and J. Han, "Trajectory clustering: A partition-and-group framework," in Proc. ACM SIGMOD Int. Conf. Manage. Data, 2007, pp. 593–604.

Luliang Tang received the Ph.D. degree from Wuhan University, Wuhan, China, in 2007. He is currently a Professor with Wuhan University. His research interests include space-time GIS, GIS for transportation, and change detection.

Xue Yang received the M.Eng. degree from Wuhan University, Wuhan, China, in 2013. She is currently working toward the Ph.D. degree in the State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University. Her research interests include intelligent transportation system, spatiotemporal data analysis, and information mining.

Zhen Dong received the M.Eng. degree from Wuhan University, Wuhan, China, in 2013. He is currently working toward the Ph.D. degree in the State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University. His research interests include intelligent transportation system, computer vision, and LiDAR data processing.

Qingquan Li received the Ph.D. degree in geographic information system (GIS) and photogrammetry from Wuhan Technical University of Surveying and Mapping, Wuhan, China, in 1998. He is currently a Professor with Shenzhen University, Guangdong, China, and Wuhan University, Wuhan. His research areas include dynamic data modeling in GIS, surveying engineering, and intelligent transportation system.