PT De Lijn Vlaanderen BE Import
- 1 About
- 2 Import Plan Outline
- 3 Goals
- 4 Schedule
- 5 Import Data
- 6 Data Preparation
- 7 Data Merge Workflow
- 8 QA
This import project page is about the De Lijn public transport import. We have permission to add their data to Openstreetmap.org.
This dataset contains all the stops and the timetables for buses and trams in Flanders (and some in Wallonia and The Netherlands for some lines).
Import Plan Outline
The goal of the import is to use this dataset in mapping activities. We are not attempting a blind import: all data has to be seen by human eyes before it can appear in the OSM data.
While the technical implementation was being worked out, many stops were already added. I only learned recently how to write this kind of proposal.
Link to permission (if required): I asked for permission several times over the years. Eventually we got permission to add the data and use the PDF files on http://www.delijn.be/aanpassingen/ with the help of okfn.be
OSM attribution (if required): https://wiki.openstreetmap.org/wiki/Contributors#De_Lijn
ODbL Compliance verified: the reason De Lijn didn't feel comfortable sharing the data previously had nothing to do with the license, but rather with perceived liability, i.e. where people will go to complain when errors creep in. Permission to add the data was received over a year ago and no one has complained about it in the meantime. There is also a feedback loop for when we find errors in the data, so they know about it.
We also have explicit permission to use the zone information, the colours and the internal 4 digit route references.
Each stop and each route is vetted before it gets added to OSM. No automatic import is to take place. As a safeguard, the root element of the OSM file carries an upload="no" attribute:
<osm version="0.6" upload="no" generator="Python script">
which makes JOSM show an additional warning if somebody tries to upload all of the data at once.
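Since the generator attribute names a Python script, a header like the one above could be produced along these lines. This is a minimal sketch, not the actual conversion script: the function name build_osm_document, the layout of the stop dictionaries and the sample values are assumptions made for illustration. Negative ids follow the JOSM convention for objects that do not exist on the server yet.

```python
import xml.etree.ElementTree as ET


def build_osm_document(stops):
    """Return OSM XML text whose root element is marked upload="no".

    `stops` is a hypothetical list of dicts with the keys
    id, lat, lon and tags (a dict of OSM key/value pairs).
    """
    root = ET.Element(
        "osm",
        {"version": "0.6", "upload": "no", "generator": "Python script"},
    )
    for stop in stops:
        node = ET.SubElement(
            root,
            "node",
            {
                "id": str(stop["id"]),
                "lat": str(stop["lat"]),
                "lon": str(stop["lon"]),
            },
        )
        # One <tag k="..." v="..."/> child per OSM tag.
        for key, value in stop["tags"].items():
            ET.SubElement(node, "tag", {"k": key, "v": value})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Invented sample stop, only to show the shape of the input.
    sample = [{
        "id": -1,
        "lat": 50.8,
        "lon": 3.12,
        "tags": {"highway": "bus_stop", "odbl": "new"},
    }]
    print(build_osm_document(sample))
```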
The file is only meant as an aid to facilitate adding/integrating stops manually, not for automatic upload/import.
Data Reduction & Simplification
The technical nitty gritty for converting the data can be found here:
The latest version of the resulting osm file can be found here:
In this file, every stop that is not in OSM yet gets an odbl=new tag. This has nothing to do with the ODbL; the tag is used because it gets removed automatically before JOSM uploads the data. created_by is used for the same reason, to hold the street and city/village name. This is handy for cross-checking the street names and occasionally for moving a stop next to the correct street near street corners.
The file contains all stops. For each stop a route_ref has been calculated from the timetable information that is part of the data. To select a group of stops in order to add route relations in the next step, this search expression (a regular expression, RE) can be used:
RR route_ref="(^|.+;)26(;.+|$)" inview odbl=new
26 gets replaced with the route number you want to work on.
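To see why the expression is anchored this way, the same pattern can be tried outside JOSM. The sketch below uses Python's re module; the helper name serves_route and the sample route_ref values are made up for illustration. The point is that 26 has to be a complete entry in the semicolon-separated list, so 126 or 260 do not match.

```python
import re


def serves_route(route_ref, route):
    """True if `route` is one complete entry of the ';'-separated route_ref."""
    # Same shape as the JOSM search expression, with the route number
    # substituted in (and escaped in case it contains regex characters).
    pattern = r"(^|.+;)" + re.escape(route) + r"(;.+|$)"
    return re.search(pattern, route_ref) is not None


if __name__ == "__main__":
    for value in ["26", "12;26;30", "26;126", "126", "260", "12;260;5"]:
        print(value, serves_route(value, "26"))
```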
Then copy/paste the selected stops to your work layer and reposition them one by one, checking the names for abbreviations which weren't converted properly, and add the zone information.
In order to add the route relations, the member stops need to be uploaded first, then the file needs to be saved and a script needs to run to update the local DB.
After that, createOSMrouterelations.py can be used to create all route relations whose sequences of stops differ in order. In the case of telescopic lines, only the longest sequence of stops gets a route relation.
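The rule for telescopic lines can be illustrated with a small sketch. This is not the code of createOSMrouterelations.py. It assumes that a shortened variant is one whose stops form a contiguous run inside a longer variant, and the function names and stop identifiers are hypothetical.

```python
def is_contiguous_run(short, long):
    """True if `short` appears as an unbroken run of stops inside `long`."""
    n = len(short)
    return any(
        tuple(long[i:i + n]) == tuple(short)
        for i in range(len(long) - n + 1)
    )


def drop_telescoped(sequences):
    """Keep only stop sequences that are not contained in a longer one.

    Sequences that visit stops in a different order (for example the
    opposite direction) are kept, since each needs its own relation.
    """
    # Deduplicate, then visit the longest sequences first so that every
    # shorter candidate is compared against everything already kept.
    unique = sorted({tuple(s) for s in sequences}, key=lambda s: (-len(s), s))
    kept = []
    for seq in unique:
        if not any(is_contiguous_run(seq, longer) for longer in kept):
            kept.append(seq)
    return kept


if __name__ == "__main__":
    variants = [
        ("A", "B", "C", "D"),   # full run
        ("B", "C"),             # shortened run, dropped
        ("D", "C", "B", "A"),   # opposite direction, kept
    ]
    print(drop_telescoped(variants))
```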
|name||ongoing effort to expand abbreviations automatically and to streamline/generalise others like O.C.m.w. -> OCMW|
At some point we decided to add the name of the village/city to distinguish between all the stops named Dorp, Kerk or Markt, except for bigger cities like Antwerpen, Brussels, Gent and their suburbs.
|ref||internal ref number of De Lijn, as found on every stop pole in the field|
|zone||not included in the data for some strange reason, can be found on http://www.delijn.be/aanpassingen/|
public_transport=platform / bus=yes will be added if and when those tags get rendered, if ever. At the moment they would be quite useless, except to make JOSM's validator shut up, as it doesn't like highway=bus_stop combined with a platform role.
When a stop is served by more than one operator (common in the Brussels region, but also near the linguistic border or when De Lijn serves stops in The Netherlands), one node per operator is used. This facilitates automated QA. All these stops are combined in a stop_area relation.
There are some instances where a tram stop and a bus stop share a ref number. In this case the bus stop gets the ref, the tram stop doesn't have a ref. Both stops are combined in a stop_area relation.
|from||Menen Station Perron 5|
|name||De Lijn 54 Menen Station - Roeselare Station|
|route||bus (or tram)|
|to||Roeselare Station Perron 2|
|via||needs to be added manually|
ways: get no roles and form an ordered sequence from beginning to end (they need to be added manually, although I do have a script which runs inside JOSM and can find the nearest way to a stop)
stops: get a platform role automatically; this needs to be changed to a more correct role for stops where one can only board or only get off.
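Finding the nearest way to a stop comes down to a point-to-segment distance check. The sketch below is a standalone illustration, not the script that runs inside JOSM: the function names and input layout are assumptions, and it uses a flat-earth (equirectangular) approximation, which should be adequate over the few tens of metres between a stop and the road.

```python
import math

EARTH_RADIUS_M = 6371000.0


def _to_local_xy(lat, lon, lat0):
    """Project lat/lon to approximate metres around latitude lat0."""
    x = math.radians(lon) * EARTH_RADIUS_M * math.cos(math.radians(lat0))
    y = math.radians(lat) * EARTH_RADIUS_M
    return x, y


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b, all in planar metres."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Clamp the projection so it stays on the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def nearest_way(stop, ways):
    """Return (way_id, distance_in_metres) of the way closest to `stop`.

    `stop` is a (lat, lon) tuple; `ways` maps a way id to a list of
    (lat, lon) tuples. Returns None if no way has at least two nodes.
    """
    lat0 = stop[0]
    p = _to_local_xy(stop[0], stop[1], lat0)
    best = None
    for way_id, coords in ways.items():
        pts = [_to_local_xy(lat, lon, lat0) for lat, lon in coords]
        for a, b in zip(pts, pts[1:]):
            d = _point_segment_distance(p, a, b)
            if best is None or d < best[1]:
                best = (way_id, d)
    return best
```

A candidate found this way still needs a human check, since the closest way is not necessarily the one the bus drives on (a parallel cycleway or service road may be nearer).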
|colour||from http://www.delijn.be/aanpassingen/ with colour_picker|
source = De Lijn;AGIV
AGIV for the use of highres imagery which is available for the Flemish region and Brussels
Bing for Wallonia and The Netherlands
Data Transformation Results
Data Merge Workflow
Tedious manual labour
If people want to join in, send me a message (Polyglot), I'll explain what you need to know during a few hangouts. This usually takes several hours...
Dedicated upload account
It has become obvious we will never, ever agree on this. It is not practical to have to switch between accounts and it will never be, as in the process of adding stops and creating/maintaining route relations a lot of other work is also performed, like adding street names from AGIV WMS, adding cycle ways, pedestrian crossings, give_way, stop signs from AGIV or Bing aerial imagery. At times even houses and addresses will be added in the same changeset from CRAB data.
If this remains mandatory, I will create umpteen accounts, but I will give no guarantee that I'll always think of changing when appropriate, so good luck untangling that mess. Also, make sure I can create accounts from the same email address, as it's going to be a lot worse when complaints get sent to a bunch of email addresses which never get checked.
Every stop gets vetted. The data from upstream serves as one of many references; integration/conflation is manual labour.
Conflation has to be done by each individual contributor. It's better to let a human decide on this.
For the stops I have a script which generates output in wiki format where names and route_refs are compared. JOSM RC is used to make it easy to upload them.
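A remote control link in wiki output could look like the sketch below. It is not the actual QA script: the helper names, the margin and the sample values are assumptions. It relies on JOSM's remote control load_and_zoom command on localhost port 8111, with its left, right, top, bottom and select parameters, and wraps the URL in wiki external-link syntax.

```python
from urllib.parse import urlencode

JOSM_RC_BASE = "http://localhost:8111/load_and_zoom"


def josm_rc_link(lat, lon, node_id=None, margin=0.001):
    """Build a remote control URL that loads the area around a stop."""
    params = {
        "left": f"{lon - margin:.6f}",
        "right": f"{lon + margin:.6f}",
        "top": f"{lat + margin:.6f}",
        "bottom": f"{lat - margin:.6f}",
    }
    if node_id is not None:
        # Select the existing OSM node once the area has loaded.
        params["select"] = f"node{node_id}"
    return f"{JOSM_RC_BASE}?{urlencode(params)}"


def wiki_link(label, lat, lon, node_id=None):
    """Wrap the URL in wiki external-link syntax: [url label]."""
    return f"[{josm_rc_link(lat, lon, node_id)} {label}]"


if __name__ == "__main__":
    # Invented coordinates and node id, for illustration only.
    print(wiki_link("Menen Station", 50.8, 3.12, node_id=123))
```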
For routes, it's work in progress. QA and maintenance on them would be a lot easier if it were possible to use route segments.