The following selected aspects provide an overview of the ideas behind the implemented code and contribute to the transparency of the work and the results. The used concepts in this project are rather simple and mainly supposed to show the possibility of adjusting the osm-data in a specific way, especially for Freiburg, a city (operating a tram system) in south-west Germany. In order to get correct results with the tool for all cities, more advanced algorithms, which cover more osm-data and transport system conditions, might be beneficial.

**Extraction of Relevant Data**

To parse large osm(xml) files the lmxl.etree API for python is used.

The following tags are considered to find the relevant nodes, ways and relations of a specified transport system in an osm file:[3][4]

tram | subway | |
---|---|---|

nodes |
k="tram" v="tram_stop" |
k="station" and v="subway" k="subway" and v="yes" |

ways |
k="tram" v="tram" v="light_rail" |
k="railway" and v="subway" |

relations | v="tram" | k="route" and v="subway" |

Furthermore the following tags are considered to either exclude objects or to gather additional information:[5][6][7]

tram | subway | |
---|---|---|

nodes |
k="construction" k="historic:railway" |
k="construction" k="historic:railway" |

ways |
k="railway_switch" and v="no" k="service" and v="siding" k="historic" k="construction" v="construction" |
k="railway_switch" and v="no" k="historic" k="construction" v="construction" |

relations | v="historic" |

**Including Stations in Ways**

The stations of the public transport systems are not by default included in the ways of OpenStreetMap data. Thus, after the different nodes of the stations are merged together to new center stations, the new center stations need to be included in the ways for the first time, in order to simplify merging the ways between the stations later. Since ways in OSM consist of ordered node lists, a simple distance measure could be used to check whether a station should be included in a way: Whenever a node of a way is close (e.g. within 40 metres) to one of the new stations, the station gets included in the way instead of the original node.

**Connecting Stations through Ways**

In order to further simplify merging the ways in the next step, the goal of this step is to create ways, which
run from station to station, meaning that each ordered node list starts and ends with a station node
at the end of this step.
To achive this, every way containing station nodes is split at that station nodes, so that a station node
is either at the beginning or the end of a way segment. Starting from a way segment with a station node at the beginning, a new way segment
is added to a way in an iterative loop, if the last node id of the current way and the first node id of the new way
segment match together. The loop terminates as soon as a way segment, which contains a station node at the end, is
added, the way contains a switch node or no additional way segment can be found.
This simple search already finds most of the ways, at least of the Freiburg sample data.

However, to also allow for ways to be found that are not constructed in that precise order,
every way is also saved in the reverted direction and the loop can change between these sets of ways
in between.

**Merging Ways between Stations**

To merge all ways between stations, so that only one way exists between a set of two stations,
the shortest way (or more precice: the way which consists of the minimum number of nodes) of all
ways between two stations is filtered. For every node of this shortest way, the closest nodes of
the other ways are computed using a euclidean distance measure. Then one single new node is
calculated at the center of those nodes to assure that the new way runs in the middle of the merged ways.

According to the merged ways the line numbers of the transport system are also merged and
assigned to the resulting new line.

**Improvements of Line Numbers**

Some quality improvements need to be made with regard to missing or wrong line numbers. Especially if ways split after crossroads, the line numbers arenâ€™t splitted accordingly. Thus, a couple of existing line number conditions are checked and line numbers are either added or deleted. The conditions are easy to determine if stations on both sides of a way only have two incident egdes (ways). Then it's obvoius that the same line numbers should to be at both sides (ways) of a station. Since conditions get more complex, if stations have more than two incident edges (see the example below), a correct line numbering is only assured for a data set of Freiburg from October 2018.