Extraction of Public Transport Linegraphs from OpenStreetMap

The following selected aspects provide an overview of the ideas behind the implemented code and contribute to the transparency of the work and the results. The used concepts in this project are rather simple and mainly supposed to show the possibility of adjusting the osm-data in a specific way, especially for Freiburg, a city (operating a tram system) in south-west Germany. In order to get correct results with the tool for all cities, more advanced algorithms, which cover more osm-data and transport system conditions, might be beneficial.

Selected Implementation Aspects

Extraction of Relevant Data

To parse large osm(xml) files the lmxl.etree API for python is used.

The following tags are considered to find the relevant nodes, ways and relations of a specified transport system in an osm file:[3][4]

tram subway
nodes k="tram"
k="station" and v="subway"
k="subway" and v="yes"
ways k="tram"
k="railway" and v="subway"
relations v="tram" k="route" and v="subway"

Furthermore the following tags are considered to either exclude objects or to gather additional information:[5][6][7]

tram subway
nodes k="construction"
ways k="railway_switch" and v="no"
k="service" and v="siding"
k="railway_switch" and v="no"
relations v="historic"

Including Stations in Ways

The stations of the public transport systems are not by default included in the ways of OpenStreetMap data. Thus, after the different nodes of the stations are merged together to new center stations, the new center stations need to be included in the ways for the first time, in order to simplify merging the ways between the stations later. Since ways in OSM consist of ordered node lists, a simple distance measure could be used to check whether a station should be included in a way: Whenever a node of a way is close (e.g. within 40 metres) to one of the new stations, the station gets included in the way instead of the original node.

Map extract from OSM. Stations and ways are not connected by default.
The algorithm includes the station nodes in the ways.

Connecting Stations through Ways

In order to further simplify merging the ways in the next step, the goal of this step is to create ways, which run from station to station, meaning that each ordered node list starts and ends with a station node at the end of this step. To achive this, every way containing station nodes is split at that station nodes, so that a station node is either at the beginning or the end of a way segment. Starting from a way segment with a station node at the beginning, a new way segment is added to a way in an iterative loop, if the last node id of the current way and the first node id of the new way segment match together. The loop terminates as soon as a way segment, which contains a station node at the end, is added, the way contains a switch node or no additional way segment can be found. This simple search already finds most of the ways, at least of the Freiburg sample data.

However, to also allow for ways to be found that are not constructed in that precise order, every way is also saved in the reverted direction and the loop can change between these sets of ways in between.

Merging Ways between Stations

To merge all ways between stations, so that only one way exists between a set of two stations, the shortest way (or more precice: the way which consists of the minimum number of nodes) of all ways between two stations is filtered. For every node of this shortest way, the closest nodes of the other ways are computed using a euclidean distance measure. Then one single new node is calculated at the center of those nodes to assure that the new way runs in the middle of the merged ways.

According to the merged ways the line numbers of the transport system are also merged and assigned to the resulting new line.

The ways running between a set of two stations should be merged.
The new merged ways run in the middle of the old ways.

Improvements of Line Numbers

Some quality improvements need to be made with regard to missing or wrong line numbers. Especially if ways split after crossroads, the line numbers aren’t splitted accordingly. Thus, a couple of existing line number conditions are checked and line numbers are either added or deleted. The conditions are easy to determine if stations on both sides of a way only have two incident egdes (ways). Then it's obvoius that the same line numbers should to be at both sides (ways) of a station. Since conditions get more complex, if stations have more than two incident edges (see the example below), a correct line numbering is only assured for a data set of Freiburg from October 2018.

This shows the situation at the 'Bertoldsbrunnen', a tram station in Freiburg as a test case. The '2' (red circle) is impossible (or rather highly unlikely) and needs to be deleted. (A line number leaving a station three times (station e) and then stopping one station later ((station h), with only two incident edges) is considered unlikely.
The unlikely line number is deleted.