
2011 IEEE Intelligent Vehicles Symposium (IV) Baden-Baden, Germany, June 5-9, 2011

Vision based rail track and switch recognition for self-localization of trains in a rail network
Jürgen Wohlfeil
German Aerospace Center (DLR), Institute of Robotics and Mechatronics
Rutherfordstr. 2, 12489 Berlin, Germany
Abstract— A collision avoidance system for railroad vehicles needs to determine their location in the railroad network precisely and reliably. For a vehicle-based system that is independent of the infrastructure, it is vital to determine the direction a railroad vehicle turns at switches. In this paper a vision based approach is presented that achieves this reliably, even under difficult conditions. In the images of a camera that observes the area in front of a railroad vehicle, the rail tracks are detected in real-time. From the perspective of the moving railroad vehicle, rail tracks branch and join from/to the currently travelled rail track. By tracking these rail tracks in the images, switches are detected as they are passed. It is shown that the followed track can be determined at branching switches. The approach is tested with real data from test rides in different locations and under a variety of weather conditions and environments. It proved to be very robust and of high practical use for track-selective self-localization of railroad vehicles, which is mandatory for collision avoidance.

I. INTRODUCTION

Trains are considered to be a safe means of transportation. However, in increasingly dense railroad networks collisions may occur from time to time. A Railway Collision Avoidance System such as RCAS [1] [2], which can be deployed on top of any existing safety infrastructure in train networks, can effectively prevent such collisions. On the basis of a digital map of the rail network, but without any additional information from the infrastructure, each train continuously localizes itself. It continuously communicates its determined position and speed to other trains via a mobile communication system. This enables the early detection of collision courses and helps to increase safety independently of the infrastructure.

One important component of self-localization in a known rail network is the automatic detection of switches. On the one hand, switches are important structural elements in the network that can be used as landmarks for self-localization. On the other hand, they are the critical points of navigation. Therefore it is vital for self-localization to determine the course that a train takes when passing a branching switch.

Passed switches and the track that is followed at a switch can be determined with several techniques. Global Navigation Satellite Systems (GNSS) are vital for self-localization and precise enough to determine the position of a train along the track. But they are not precise enough to clearly determine which route the train takes when passing a switch. Instead, several GNSS measurements have to be evaluated until the route the train took can be clearly determined. At shunting speeds this can already take some tens

of meters [3]. In addition to that, GNSS is not reliable close to high mountains or buildings and generally unavailable in tunnels. Therefore, additional sensors are required for the immediate and highly reliable detection of switches and the direction a train turns. One possible option is to use an eddy current sensor mounted on the underside of a train, which is very expensive. Another option is to use an inertial measurement unit (IMU) and determine in which direction the train turns at a switch by measuring the vehicle's acceleration and rotation. At low speed and/or large curve radii the rotation and acceleration are very low, and very expensive IMUs are required to determine them reliably.

In this contribution a cost-efficient vision based approach for switch detection is presented. It is based on images from a digital camera observing the space in front of (or behind) a train. By recognizing all rail tracks visible in a certain detection area of the images it can provide a lot of information about the currently passed part of the rail network. This includes the detection and localization of parallel rail tracks close by and the estimation of the curve radius of the currently travelled rail track. However, only the detection of switches and the determination of the direction the train turns when passing a switch is the subject of this contribution.

A. Related Work

The basic concepts for vision based rail track recognition come from the automotive sector. For many years it has been of great interest to detect the lanes of the road by their markings automatically, e.g. for driver assistance. A good overview of different approaches coming from this range of application is given by J. C. McCall et al. [4]. For rail track maintenance vehicles travelling in convoy, a vision based anti-collision approach has been presented by Maire [5]. It is based on the detection of the rails in front of a vehicle in order to estimate the length of the free, non-occluded rail track.
Another interesting approach for railroad track extraction using dynamic programming has been developed by Kaleli and Akgul [6]. Both approaches aim at analyzing the entire track visible in front of the train. Detecting obstacles on the rails is only possible for a very limited distance. This may be reasonable for relatively slow rail track maintenance vehicles but will not be applicable for most of the heavy cargo and fast passenger

978-1-4577-0891-6/11/$26.00 ©2011 IEEE


Fig. 1. a: Original camera image with rail hypotheses (white lines) and the center lines (black lines) of the two determined rail tracks at a switch. b: linear edges extracted from the analyzed area displayed as black pixels. c: target function (darker values indicate higher function values).

trains. The stopping distance of such trains is usually much larger than the part of the track visible from the train.

B. Main Concept of the Approach

Instead of analyzing the rail track as far as possible, the presented approach only detects the rail tracks immediately in front of the train. The detection of collision courses is performed by the Railway Collision Avoidance System, which communicates with other vehicles far beyond the visible part of the track. Even though the upper surface of rails is typically reflective and not as light as road markings, a rail track can be detected better than road markings. Unlike road markings, rails are always present in front of a rail vehicle and their distance (track gauge) is always exactly the same. An additional advantage is that only large curve radii, down to a defined minimum radius, have to be expected.

Similar to the other approaches mentioned above, this approach is based on the detection of rails by their edges in the images. As described in Section II, linear edges are extracted from the images in a first step and then used to generate hypotheses for single rails. The known track gauge is used to find plausible pairs of rail hypotheses belonging to the same rail track. In Section III it is described how the determined rail tracks are tracked through a series of subsequent images in order to detect switches. An overview of the different processing steps is given in Fig. 2. The results

of an empirical test of the approach are presented in Section IV, and conclusions are drawn in Section V.

Fig. 2. Overview of the different processing steps of the approach.

II. RAIL TRACK DETECTION

A. Edge Extraction

Each rail is characterized by at least one sharp linear edge in the image. Due to this, the first step for the recognition of rail tracks is to extract these edges from a relatively small area of the image. This area is chosen as close to the vehicle


as possible. But it has to be wide enough to include enough space on both sides of the track to see branching and joining tracks early enough. Depending on the orientation and optics of the camera this might be the central or lower area of the image (see the light rectangles in the images of Fig. 5).

Similar to the first step of the robust and well performing extraction of lane markings in [7], the eigenvalues of local grey value gradients are determined. The grey value gradients gx and gy in x- and y-direction are calculated for every pixel of the analyzed area using the optimized 3x3 edge operator proposed by Ando [8]. Regarding the grey value gradient as a 2-dimensional vector g = (gx, gy)^T, this edge operator is optimized in terms of the magnitude and angle of the gradient. The gradients are analyzed in order to determine linear edges and their precise orientations. In order to save computation time, only those pixels are processed whose grey value gradient magnitude |g| is larger than the magnitudes of half of the gradients in their 3x3 neighbourhood. For these pixels the 2x2 covariance matrix S is calculated with

    S = ( Cov_xx  Cov_xy )
        ( Cov_yx  Cov_yy )

where the covariance Cov_ab is calculated from the n gradients g in the 3x3 neighbourhood N (including the central pixel). Considering that the mean value of grey value gradients is zero by definition, Cov_ab simply results to:

    Cov_ab = (1/n) * sum over N of (g_a * g_b)

The covariance matrix S has two eigenvalues λ1 and λ2 (with λ1 ≥ λ2). They can be calculated straightforwardly with

    λ1,2 = ( Cov_xx + Cov_yy ± sqrt( (Cov_xx − Cov_yy)² + 4·Cov_xy² ) ) / 2

A small value of λ1 (and so of λ2) indicates that the grey values around the examined pixel are almost homogeneous. If both eigenvalues have large values there is an edge or at least a non-linear, inhomogeneous structure around the pixel. Only for pixels located at linear edges is λ1 much larger than λ2. Pixels are classified with the help of an empirically determined threshold tL and factor mL: a pixel is classified as a linear edge if both λ1 ≥ tL and λ2/λ1 ≤ mL hold. In Fig. 1b the pixels classified as linear edges extracted from the image above are displayed in black. From the calculated covariance matrix S the two eigenvectors e1 and e2 are also derived. In the case of a linear edge, e1 is oriented orthogonal to the linear edge (in the direction where the grey values change) and e2 parallel to it. In comparison to the single grey value gradients, the orientations of the eigenvectors are very accurate. They contribute a lot to the detection of the rails, which is described in the following section.

B. Generation of Single Rail Hypotheses

Candidates for rails are all long, linear structures in the analyzed area of the image. A two-dimensional target function is used to find them. It is defined for a range of discrete lateral positions and angles at which rails could be present in the image. The values of the target function express the probability of rails with the corresponding lateral positions and angles. In Fig. 1c a full target function is shown.

Fig. 3. A small part of the left rails in Fig. 1: Calculating the position of the linear edge K in the target function.
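The eigenvalue-based edge classification and the voting into the target function can be sketched as follows. This is a minimal illustration, not the paper's implementation: the gradients are assumed to come from some 3x3 operator (the paper uses Ando's optimized operator), and the threshold tL, the factor mL, the bin ranges and all numeric values are illustrative assumptions.

```python
import math

def eigen_2x2(cxx, cxy, cyy):
    """Eigenvalues (l1 >= l2) and the dominant eigenvector e1 of the
    symmetric 2x2 matrix [[cxx, cxy], [cxy, cyy]]."""
    tr = cxx + cyy
    rt = math.sqrt((cxx - cyy) ** 2 + 4.0 * cxy ** 2)
    l1, l2 = (tr + rt) / 2.0, (tr - rt) / 2.0
    if abs(cxy) > 1e-12:
        e1 = (cxy, l1 - cxx)            # from (S - l1*I) v = 0
    else:
        e1 = (1.0, 0.0) if cxx >= cyy else (0.0, 1.0)
    n = math.hypot(e1[0], e1[1])
    return l1, l2, (e1[0] / n, e1[1] / n)

def classify_pixel(gradients, t_l=100.0, m_l=0.2):
    """Covariance test over the gradient vectors of a 3x3 neighbourhood.
    Returns (is_linear_edge, e1); e1 points across the edge. The mean
    gradient is taken as zero, as in the text above."""
    n = len(gradients)
    cxx = sum(gx * gx for gx, gy in gradients) / n
    cyy = sum(gy * gy for gx, gy in gradients) / n
    cxy = sum(gx * gy for gx, gy in gradients) / n
    l1, l2, e1 = eigen_2x2(cxx, cxy, cyy)
    # linear edge: strong gradients (l1 large) sharing one orientation (l2 << l1)
    return (l1 >= t_l and l2 <= m_l * l1), e1

def vote_target_function(edge_pixels, y_c, sigma=1.0):
    """Gaussian voting into the (lateral position, angle) target function
    of Section II-B. edge_pixels: (x, y, angle_deg), where angle_deg is
    the along-edge direction e2 measured against the image column axis."""
    target = {}
    for x, y, a in edge_pixels:
        # extend e2 until it intersects the central row y_c
        x_c = x + (y_c - y) * math.tan(math.radians(a))
        for xb in range(int(x_c) - 3, int(x_c) + 4):   # nearby discrete bins
            w = math.exp(-((xb - x_c) ** 2) / (2.0 * sigma ** 2))
            key = (xb, round(a))
            target[key] = target.get(key, 0.0) + w
    return target
```

Local maxima of the returned target function then serve as rail hypotheses; angles are simply binned to whole degrees here, whereas a real implementation would vote over a discretized angle range as well.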

The target function is built up as follows (see also Fig. 3): For each linear edge K, the eigenvector e2 (belonging to the smaller eigenvalue) is extended and its intersection xc with yc is calculated. yc is the location of the central row of the analyzed area of the image, for which the target function is built up. K supports the assumption that there is a linear edge at xc with the orientation θK. This is taken into account by increasing the values around the location (xc, θK) of the target function. At this location a Gaussian distribution function with a standard deviation of one pixel is added to the target function. The other pixels of the linear edge refer to almost the same position and orientation and increase the values of the target function around (xc, θK) as well. After adding the contributions of all linear edges to the target function, clear local maxima can be found at locations and angles where long linear edges with a uniform orientation are present in the image. These maxima can be caused by the linear edges of rails as well as by other objects or shadows. In Fig. 1, for example, one bar of the fence and two of its shadows have been chosen as rail hypotheses in addition to the actual rails. At one rail the bottom edge was detected in addition to the upper edge, which can easily be handled, as described in the following section. The presented technique of edge detection has proved to be faster and more robust in this case than the Hough transform for lines (which would achieve similar results). Thanks to the classification of edges, only linear edges are used to build up the target function. By ignoring other edges, time is saved and the adverse influence of irrelevant edges on the target function


is suppressed. Due to the precisely known orientations of the linear edges (thanks to both the Ando operator and the eigendecomposition), only very few elements of the target function have to be modified when adding an edge.

C. Determination of Rail Tracks

The next step is to analyze the hypotheses for rails in order to find plausible pairs belonging to the same track. It takes advantage of two conditions that are always met by rail tracks:
1) The gauge width is given and always (nearly) constant.
2) The rails belonging to the same track are parallel.
These conditions are evaluated for different pairs of rail hypotheses in object space. To be able to determine the location of rails in object space, it is assumed that all rails are located on a ground plane. This way a basic photogrammetric approach can be used to project the rail hypotheses from the image plane onto the ground plane. At the setup of the camera system, the focal distance c and the principal point H are determined by a basic geometric camera calibration procedure. With these parameters, the position of an image point p'(x', y') in the camera coordinate system, p(x, y, z), can be calculated (see Fig. 4).
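The projection onto the ground plane and the subsequent gauge-based pairing can be sketched as follows. This is a simplified illustration rather than the paper's implementation: the exterior orientation is reduced to a pure pitch rotation (the paper uses a full offset O and rotation R), and the focal distance, tolerances and sample values are assumptions; only the standard gauge of 1.435 m is a real-world constant.

```python
import math

def to_ground(x_img, y_img, c, principal, cam, pitch_deg):
    """Project an image point onto the ground plane Z = 0.
    Vehicle frame: X lateral, Y forward, Z up; the camera sits at
    cam = (X0, Y0, Z0) and is pitched down by pitch_deg."""
    dx, dy = x_img - principal[0], y_img - principal[1]
    p = math.radians(pitch_deg)
    # viewing-ray direction rotated from camera into vehicle coordinates
    Dx = dx
    Dy = -dy * math.sin(p) + c * math.cos(p)
    Dz = -dy * math.cos(p) - c * math.sin(p)
    if Dz >= 0.0:
        return None                      # ray does not hit the ground
    t = -cam[2] / Dz                     # intersect the ray with Z = 0
    return cam[0] + t * Dx, cam[1] + t * Dy

def pair_rails(rails, w_n=1.435, t_w=0.1, t_d=3.0):
    """Greedy pairing of projected rail hypotheses into tracks.
    rails: list of (lateral position in m, orientation in deg).
    Pairs within the gauge and angle tolerances are sorted by the
    cost (w - w_n)^2 * d^2 and consumed greedily."""
    pairs = []
    for i in range(len(rails)):
        for j in range(i + 1, len(rails)):
            w = abs(rails[i][0] - rails[j][0])
            d = abs(rails[i][1] - rails[j][1])
            if w_n - t_w <= w <= w_n + t_w and d <= t_d:
                pairs.append(((w - w_n) ** 2 * d ** 2, i, j))
    pairs.sort()
    used, tracks = set(), []
    for _, i, j in pairs:
        if i not in used and j not in used:
            used |= {i, j}
            # track centre = midpoint of its two rails
            tracks.append((rails[i][0] + rails[j][0]) / 2.0)
    return tracks
```

Sorting by the cost and consuming rails greedily mirrors the procedure in the text; note that with the product form of the cost, any exactly parallel pair within the tolerances receives zero cost regardless of its gauge deviation.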

Fig. 4. Visualization of the relationship between image coordinates p'(x', y') and object coordinates P(X, Y, Z) with respect to the railroad vehicle. The image plane is logically flipped in front of the center of projection O.

Next, the position of this point in the object coordinate system P(X, Y, Z) is determined with the help of the exterior orientation, consisting of the offset O and the spatial rotation R of the camera. O and R are also retrieved during the camera calibration procedure. The object coordinate system is fixed to a defined point of the railroad vehicle, moving and rotating with the vehicle. Each rail hypothesis can now be projected onto the ground plane at Z = 0 to retrieve its estimated position and orientation in object space. Each rail hypothesis is compared with every other hypothesis with respect to the gauge width w and the difference d in their orientations. A reasonable tolerance in gauge width tw and orientation difference td can be defined. Within this tolerance all pairs are determined where wN − tw ≤ w ≤ wN + tw and |d| ≤ td, where wN is the known gauge width. Every pair A that fulfils both of these conditions is assigned the cost CA = (w − wN)² · d². All pairs are sorted by their costs and, beginning with the pair A with the lowest cost CA, the final rail tracks are determined. After the determination of each new rail track, the remaining list of pairs is cleaned of all pairs containing one of the new track's rails. After this the next rail track is determined. This procedure is repeated until the list is empty. As a result there is a set of determined rail tracks with known coordinates in object space. By definition, the center of the rail track at the distance of the projected central row (yc) is used for further processing. As already mentioned in Section I, the determined rail tracks can be used for various applications. In this paper the focus is on the detection of switches, which is the subject of the following section.

III. SWITCH RECOGNITION

A. Tracking of Rail Tracks

Switches are detected via rail tracks that appear and join or branch from the currently travelled rail track. In order to achieve this, rail tracks that appear in the field of view have to be detected and tracked continuously. In each new image the rail tracks are determined as described, without using any knowledge about the rail tracks detected in the previously processed image. In the first image, a new tracker is initialized for each detected rail track with the lateral position of the current center of the track in object space. In each of the following images it is attempted to reassign the existing trackers to the newly determined rail tracks. A detected rail track has to be close to the predicted position of a tracker to be reassigned. Every tracker has an individual rating r ∈ [0, 1] that is initialized with rmin when a new tracker is generated. If a track can be reassigned, the rating r of the tracker is raised by rd and its lateral velocity Vc is calculated. Vc is used to predict the track's new position in the following image. If no track can be assigned to the tracker, the lateral position is set to the predicted position and the rating r is decreased by rd. If r falls below zero the tracking is given up. Only while r ≥ rmin is the track regarded as detected. For the experiment described in Section IV the values rd = 0.1 and rmin = 0.5 have led to robust tracking, even if tracks were not detected in a small number of images.

B. Detection of Switches

From the perspective of the moving railroad vehicle, tracks branch from or join the currently travelled track when passing a switch. If they cross the track they are regarded as joining and then branching. Knowing the minimum curve radius (specified for the railroad vehicle), the maximum lateral deviation s1 of the center Xt of the currently travelled track from the center of the vehicle can be defined. Hence, we expect one of the


Fig. 5. Examples for the detection of different switch types. a: passing straight and b: turning to the left at a branching switch. c: coming from the left and turning to the left at a double switch crossing. d: passing a joining switch. The black curves show the lateral positions X of the detected rail tracks over time t. The image shown above is taken at the time marked by the horizontal grey line. The arrows point at the time when a switch was detected.
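The tracking and branch/join logic illustrated in Fig. 5 can be sketched roughly as follows. Only rd = 0.1 and rmin = 0.5 are taken from the text; the matching radius and the hysteresis limits s1 and s2 are assumed, illustrative values.

```python
class TrackedRail:
    """Rating-based tracker for one rail-track centre (Section III-A)."""

    def __init__(self, x, r_min=0.5, r_d=0.1, match_radius=0.5):
        self.x, self.v = x, 0.0          # lateral position and velocity
        self.rating = r_min              # rating r, initialized with r_min
        self.r_min, self.r_d, self.match_radius = r_min, r_d, match_radius

    def detected(self):
        return self.rating >= self.r_min  # detected only while r >= r_min

    def alive(self):
        return self.rating >= 0.0         # tracking is given up below zero

    def update(self, detections):
        """Reassign to the detection closest to the predicted position,
        raising the rating on success and lowering it otherwise."""
        pred = self.x + self.v
        near = [d for d in detections if abs(d - pred) <= self.match_radius]
        if near:
            best = min(near, key=lambda d: abs(d - pred))
            self.v, self.x = best - self.x, best
            self.rating = min(1.0, self.rating + self.r_d)
        else:
            self.x = pred                 # coast on the prediction
            self.rating -= self.r_d


def switch_events(positions, s1=1.5, s2=2.0):
    """Hysteresis over the two limits s1 < s2 (Section III-B): a tracked
    centre leaving [-s1, +s1] past +/-s2 is a branch; one entering the
    range from outside is a join. positions: lateral centres per frame."""
    events, prev = [], positions[0]
    inside = abs(prev) <= s1
    for x in positions[1:]:
        if inside and abs(x) >= s2:
            events.append('branch_left' if x < 0 else 'branch_right')
            inside = False
        elif not inside and abs(x) <= s1:
            events.append('join_left' if prev < 0 else 'join_right')
            inside = True
        prev = x
    return events
```

Requiring both limits to be crossed before an event is counted corresponds to the hysteresis described in the text and suppresses jitter of tracks hovering near a single threshold.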

tracked rails to stay always within the range [−s1, +s1]. If the center of a rail track falls below −s1, it must be a rail track that branches to the left, and the vehicle follows the rightmost branch. If a rail track exceeds +s1 the opposite case is true: the vehicle passes a switch and follows the leftmost branch. Moreover, if a track comes into the range [−s1, +s1] from the outside, it must be due to a switch where a track joins the currently travelled track from the left or the right, respectively. To make the recognition of branches more reliable, the limit s2 is introduced, which is slightly larger than s1. s1 and s2 are used like the two levels of a hysteresis function: only if a track passes both limits is it counted as a branch/join.

In the first example in Fig. 5 (a) the vehicle follows the straight track. The travelled track stays very close to the center (in the lower part it diverges slightly due to a wide curve before the switch). The branching track appears almost next to the current track and crosses s1 and s2. The arrow indicates the point in time when the switch is detected. In the second example (b) the vehicle follows the left branch, which is detected after one rail track exceeds s2 (again indicated by an arrow). The third example (c) shows what happens at a double switch crossing, which is actually detected as two switches: first, a rail track joins the current track from the right; then a track vanishes to the right. In the last example (d) a track is joining from the right.

These examples show that with the presented approach the two basic types of switches (joining and branching) can be distinguished. In each case it can be distinguished whether the track is joining or branching from the left or the right. Complex switches like double switches are recognized as the two or more basic switches they consist of. This information is very

beneficial for the localization of the vehicle in a known rail network [3].

IV. EXPERIMENTAL RESULTS

The presented approach has been implemented in C/C++ to enable real-time processing on a laptop with an Intel Core 2 Quad Q9300 CPU. It was already presented as a part of DLR's RCAS on May 11th, 2010 at the railway test site in Wegberg-Wildenrath, Germany [9], where it proved to work very well even under bad weather conditions. Unfortunately the results of the live demonstration have not been stored and cannot be re-evaluated. Instead, recordings from different test rides are used to evaluate the approach by processing them offline. Recorded images from six different test rides in three different places have been collected. Near the Braunschweig central station three early test rides were performed with DLR's RailDriVE [10]. The camera (a Prosilica GC1380H) was mounted on top at a height of about 3.6 m with a wide-angle 4.8 mm lens, oriented almost horizontally (see Fig. 5d).¹ At one test ride in Lenggries and two in Wegberg-Wildenrath the camera was mounted on the coupling of an Integral VT train at a height of 1.26 m with a pitch angle of 11.6° using a 12 mm lens. The parameters of the exterior and interior orientation, as well as the analyzed area of the image, are different for each installation. All other parameters are equal for all of the evaluated test rides, with a total length of over 30 km, on which 110 switches (69 branching and 41 joining) were passed.
¹The horizon is not in the centre of the image because the upper part was cropped in order to improve the visibility of the rail tracks.


The six test rides took place under various challenging lighting conditions, including low sun elevations from the rear and the front, heavy clouds, drizzle and light rain. A variety of switch types is included, as well as mixed gauge widths (~25 km) and third rails (~22 km, see Fig. 5). Despite these difficulties, of the 110 switches only two were not detected and one was detected by mistake. One of the misses has two reasons: First, the ground was very dark in the images because the automatic exposure control of the camera was influenced by a very large area of bright sky. Second, an antenna of the vehicle was visible in the bottom-left of the image and impaired the detection of rail tracks in this region. The other missed switch and the false detection occurred on a mixed-gauge track with a third rail, both impairing the tracking of the correct rails. In addition, rain drops smudged the image. Even under the influence of these simultaneously occurring difficulties, almost all of the switches were detected correctly. But such conditions obviously affect the reliability of the approach slightly.

V. CONCLUSIONS AND FUTURE WORK

With the presented approach, switches, as well as the direction the train turns at a switch, can be detected very early and reliably. How early a switch is detected depends on the camera geometry and orientation, the wheelbase of the vehicle, and the maximum expected curve radius. In all configurations used for testing, switches were always detected before the front of the vehicle reached the frog of the switch. Although the sample of test data is too small for strong statistical evidence, the results of the empirical test suggest a high robustness and reliability of the approach, even under challenging conditions. They show as well that in rare cases switches can remain undetected or can be detected by mistake. This fact must be taken into account in the collision avoidance system that makes use of the approach.
In any case it helps to improve the reliability of self-localization to a high degree. At the live demonstration in Wegberg-Wildenrath it also proved its real-time capability on mobile hardware.

Some scenarios that are important for the practical application of the approach have not been tested yet. One of them is operation at night and in tunnels. It is quite possible that the rails directly in front of the vehicle can be detected well in the darkness when illuminated by the standard vehicle lighting. Another scenario includes snow that covers the rails. If the layer of snow is thin, the edges of the rails are expected to still be sharp enough to be detected. But under a thick layer of snow the detection of rails is expected to fail completely. Very heavy snowfall, rainfall, or fog is expected to impair the operation as well. On the other hand, it should be generally impossible to conduct a train safely under such weather conditions anyway. Other objects with sharp edges do not affect the rail track detection if their edges are not parallel or their distance differs from the gauge width. Even if edges have the same orientation and distance as rails, it is very unlikely that they lead to a false detection of a switch. This is because they

would have to appear like a rail track for a certain time while passing the limits s1 and s2. But although it is very unlikely, objects with sharp edges can cause false detections, as mentioned in the previous section.

As only rail tracks a few meters in front of the train are to be detected, the resolution of the camera images can be low. Tests have shown that a single rail can have a width of down to 5 pixels in the analyzed area of the images and still be detected with sufficient accuracy in position and orientation. A higher resolution does not lead to significantly better results but raises the amount of computation time needed to process the images. Thus, the approach can make use of a cheap camera that is possibly already installed in the railroad vehicle. In an operational setup it could be placed behind the windshield of the vehicle. If the train moves opposite to the direction the camera is facing, switches are detected in the same way as if it moves forward. Especially for passenger trains with driver cabins on both ends, the simultaneous detection of switches at both ends could increase the reliability even more.

There is a chance to reduce the probability of false detections even further by using a stereo camera system instead of a single camera. With such a system the detections of both cameras can be checked for consistency. As the two cameras view the scene from different perspectives, sharp edges would not only need to have the correct distance in their projection onto the ground level; they would also need to actually have the same height as the rails, which is very unlikely. In the context of DLR's RCAS project the presented approach will be tested, applied, and improved continuously.

REFERENCES
[1] T. Strang, M. Meyer zu Hörste and X. Gu, "A Railway Collision Avoidance System exploiting Ad-hoc Inter-Vehicle Communications and GALILEO," Proceedings of the 13th World Congress and Exhibition on Intelligent Transportation Systems and Services, London, UK, 2006.
[2] C. Rico Garcia, A. Lehner, T. Strang and M. Röckl, "Comparison of Collision Avoidance Systems and Applicability to Rail Transport," Proceedings of the 7th International Conference on Intelligent Transport Systems Telecommunications, ISBN 1-4244-1177-7, 2007, pp. 521-526.
[3] K. Lüddecke and C. Rahmig, "Evaluating Multiple GNSS Data in a Multi-Hypothesis Based Map-Matching Algorithm for Train Positioning," to be published in Proc. of IEEE Intelligent Vehicles Symposium '11 together with this paper, 2011.
[4] J. C. McCall and M. M. Trivedi, "Video-based lane estimation and tracking for driver assistance: survey, system, and evaluation," IEEE Transactions on Intelligent Transportation Systems, vol. 7, issue 1, 2006, pp. 20-37.
[5] F. Maire, "Vision based anti-collision system for rail track maintenance vehicles," Proc. of IEEE Conference on Advanced Video and Signal Based Surveillance, London, 2007, pp. 170-175.
[6] F. Kaleli and Y. S. Akgul, "Vision-Based Railroad Track Extraction Using Dynamic Programming," Proceedings of the 12th Int. IEEE Conference on Intelligent Transportation Systems, 2009.
[7] J. Wohlfeil and R. Reulke, "Detection and tracking of vehicles with a moving camera," Proc. of Optische Technologien in der Fahrzeugtechnik, Stuttgart, 2008.
[8] S. Ando, "Consistent Gradient Operators," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, 2000.
[9] (2011) The Railway Collision Avoidance System (RCAS) website. [Online]. Available: http://www.collision-avoidance.org
[10] (2011) RailDriVE - Rail Driving Validation Environment. [Online]. Available: http://messtec.dlr.de/de/technologie/verkehrssystemtechnik/raildrive-rail-driving-validation-environment/

