Google Summer of Code/2018/GTFS to OSM Staging
GTFS Integration Tool
Many transport operators provide their transit data as GTFS. Some even publish it under a license that allows the data on stops, lines and their itinerary variations to be added to OpenStreetMap.org.
Preparation of the student to write a proposal for this GSoC project
- Do some mapping in your surroundings, preferably with JOSM to understand OSM's data model (nodes, ways, relations)
- Map some bus/tram/metro/train lines you know according to the PT version 2 specification. Also find some route relations that are not mapped according to this scheme and understand the differences. There are still many of those in the data and at some point they would need to be converted.
- Familiarise yourself with the GTFS data format
- Set up a GeoDjango environment on a hosting service of your choice, such as PythonAnywhere, Heroku, DigitalOcean, the AWS free tier, Google Cloud Platform or Bitnami
- Look at what the Overpass API can do. You can use Overpass Turbo to experiment with queries.
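As a first experiment with GTFS and Overpass together, the following minimal Python sketch reads the standard GTFS stops.txt columns (stop_id, stop_name, stop_lat, stop_lon), derives a bounding box and builds an Overpass QL query string for that area. The sample stops, the function names, the margin and the choice of tag filters are assumptions made for illustration only. The query is only built, not sent, and it can be pasted into Overpass Turbo to try it out. route_master relations have only relations as members, so they would probably need a separate step and are left out here.

```python
import csv
import io

# Hypothetical two-stop feed, standing in for the stops.txt of a real GTFS zip.
SAMPLE_STOPS_TXT = """stop_id,stop_name,stop_lat,stop_lon
S1,Central Station,50.8800,4.7150
S2,Market Square,50.8790,4.7010
"""


def read_stops(stops_txt):
    """Parse the text of a GTFS stops.txt file into a list of dicts."""
    reader = csv.DictReader(io.StringIO(stops_txt))
    return [
        {
            "stop_id": row["stop_id"],
            "name": row["stop_name"],
            "lat": float(row["stop_lat"]),
            "lon": float(row["stop_lon"]),
        }
        for row in reader
    ]


def bounding_box(stops, margin=0.01):
    """Return (south, west, north, east) around all stops, padded by margin degrees."""
    lats = [s["lat"] for s in stops]
    lons = [s["lon"] for s in stops]
    return (min(lats) - margin, min(lons) - margin,
            max(lats) + margin, max(lons) + margin)


def overpass_query(bbox):
    """Build an Overpass QL query for stops and route relations in the bbox.

    Overpass expects bounding boxes in (south, west, north, east) order.
    The tag filters are one possible selection, not a complete one.
    """
    south, west, north, east = bbox
    area = f"({south:.5f},{west:.5f},{north:.5f},{east:.5f})"
    return (
        "[out:json][timeout:60];\n"
        "(\n"
        f'  node["highway"="bus_stop"]{area};\n'
        f'  node["public_transport"="platform"]{area};\n'
        f'  relation["type"="route"]{area};\n'
        ");\n"
        "out body;"
    )


if __name__ == "__main__":
    stops = read_stops(SAMPLE_STOPS_TXT)
    print(overpass_query(bounding_box(stops)))
```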
- The primary goal is to provide a web interface where a GTFS file can be uploaded, after which it is processed and its data is added to a PostGIS database. Then download all public transport related data for that area using the Overpass API and load it into the database as well. Then compare stop details and positions and report discrepancies to the user. Do the same for lines (route_master relations) and their itinerary variations (route relations).
- Show the user where the differences are and prepare data in OSM format, so it can be passed to JOSM using its remote control functionality.
- If the GTFS file contains shapes, compare those to the resulting shapes of the route relations and report if there are differences.
- Keep track of the changes the user applies in OSM, or chooses not to apply in some cases where the data in OpenStreetMap is more detailed than what the operator provides.
- Also report when the routes in OSM are no longer continuous, or when not all stops are served, or not served in the correct order.
- Keep track of the licensing. Sometimes we have permission to work with the internal data, but not with the shapes in the GTFS.
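The stop comparison and the stop order check described above could start out along the lines of the following Python sketch. It assumes that OSM stops carry a ref tag equal to the GTFS stop_id, which will not hold for every operator, and it uses an arbitrary 25 m threshold for reporting a position difference. The dict layout, the function names and the report wording are hypothetical. A real implementation would more likely run such comparisons as spatial queries in PostGIS.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 coordinates."""
    radius = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))


def compare_stops(gtfs_stops, osm_stops, max_distance_m=25.0):
    """Return a list of (stop_id, problem) tuples, in GTFS stop order.

    Assumption: an OSM stop matches a GTFS stop when its "ref" equals stop_id.
    """
    osm_by_ref = {s["ref"]: s for s in osm_stops if s.get("ref")}
    report = []
    for stop in gtfs_stops:
        osm = osm_by_ref.get(stop["stop_id"])
        if osm is None:
            report.append((stop["stop_id"], "missing in OSM"))
            continue
        if osm.get("name") != stop["name"]:
            report.append((stop["stop_id"], "name differs"))
        distance = haversine_m(stop["lat"], stop["lon"], osm["lat"], osm["lon"])
        if distance > max_distance_m:
            report.append((stop["stop_id"], "position differs"))
    return report


def stop_order_matches(gtfs_sequence, osm_sequence):
    """True when a route relation serves the same stops in the same order."""
    return list(gtfs_sequence) == list(osm_sequence)


# Hypothetical sample data for illustration.
GTFS_STOPS = [
    {"stop_id": "S1", "name": "Central Station", "lat": 50.8800, "lon": 4.7150},
    {"stop_id": "S2", "name": "Market Square", "lat": 50.8790, "lon": 4.7010},
    {"stop_id": "S3", "name": "Harbour", "lat": 50.8700, "lon": 4.6900},
]
OSM_STOPS = [
    {"ref": "S1", "name": "Central Station", "lat": 50.8800, "lon": 4.7150},
    {"ref": "S2", "name": "Market Sq.", "lat": 50.8800, "lon": 4.7010},
]

if __name__ == "__main__":
    for stop_id, problem in compare_stops(GTFS_STOPS, OSM_STOPS):
        print(stop_id, problem)
```

Matching on ref alone keeps the sketch short. Where the reference tags are missing, a fallback on name and proximity would be one option to explore.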