GTFS
The GTFS (General Transit Feed Specification) is a data format created for sharing public transportation information such as stops, routes and timetables.
GTFS is useful for data consumers who want to offer public transport routing based on OSM data, because in many cases timetables change so often that representing them in OSM is practically impossible. In some cities timetables can be expected to change daily due to road or track closures and renovations. In some areas timetables change substantially several times a year, for example when holidays or school terms start and end.
In such cases data consumers can use OSM data for roads, footways and stop positions, and take the available public transport trips directly from the organization that publishes the timetable.
It was developed by Google and originally called the Google Transit Feed Specification. It is now maintained by the MobilityData organization, which also maintains tools for using GTFS data, a database of GTFS data, and the General Bikeshare Feed Specification.
Structure of GTFS
a GTFS trip identifies a single trip of a bus ...
- GTFS file trips.txt
- specifies the trip_id
- links a trip to a route_id in routes.txt
- links a trip to a service_id in calendar.txt and calendar_dates.txt (on which days the trip will be done)
- if shapes exist, shape_id links the trip to a path that a vehicle travels in shapes.txt
- specifies some more information like the headsign
for each trip_id there are several entries in the GTFS file stop_times.txt
- trip_id to identify which trip the entry belongs to
- stop_id, the id of a stop of the trip
- stop_sequence, the order of the stop for this particular trip (1,2,3,...)
- arrival_time and departure_time are other entries that are not so relevant for OSM
detailed information of stops is in the GTFS file stops.txt
- stop_id to link it to trips via stop_times.txt
- stop_name
- stop_lat and stop_lon, the position of the stop (platform)
- GTFS stops are usually platforms beside the road (public_transport=platform in PTv2 jargon)
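The links between trips.txt, stop_times.txt and stops.txt described above can be sketched in a few lines of Python. This is only an illustration: the feed excerpts are made up, the helper names read_rows and stops_for_trip are hypothetical, and a real feed would be read from the files of the GTFS zip archive rather than from inline strings.

```python
import csv
import io

# Tiny made-up feed excerpts; a real feed would be read from the .txt files.
TRIPS_TXT = """route_id,service_id,trip_id,trip_headsign,shape_id
R1,WEEKDAY,T1,Central Station,S1
"""

STOP_TIMES_TXT = """trip_id,arrival_time,departure_time,stop_id,stop_sequence
T1,08:05:00,08:05:00,B,2
T1,08:00:00,08:00:00,A,1
T1,08:12:00,08:12:00,C,3
"""

STOPS_TXT = """stop_id,stop_name,stop_lat,stop_lon
A,Market Square,48.1000,11.5000
B,Town Hall,48.1010,11.5020
C,Central Station,48.1030,11.5050
"""


def read_rows(text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def stops_for_trip(trip_id, stop_times, stops):
    """Return the stops of one trip, ordered by stop_sequence."""
    stops_by_id = {s["stop_id"]: s for s in stops}
    entries = [st for st in stop_times if st["trip_id"] == trip_id]
    # stop_sequence is stored as text, so compare it numerically.
    entries.sort(key=lambda st: int(st["stop_sequence"]))
    return [stops_by_id[st["stop_id"]] for st in entries]


trips = read_rows(TRIPS_TXT)
stop_times = read_rows(STOP_TIMES_TXT)
stops = read_rows(STOPS_TXT)

for trip in trips:
    ordered = stops_for_trip(trip["trip_id"], stop_times, stops)
    names = [s["stop_name"] for s in ordered]
    print(trip["trip_id"], trip["route_id"], trip["trip_headsign"], names)
```

The ordered stop list is the part that is most directly comparable to the platform members of a PTv2 route relation.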
a GTFS shape identifies the path that a vehicle will travel along a route ...
- GTFS file shapes.txt identifies a path (GPX route)
- usually, several trips use the same path/shape with different departure times (such as "every 10 minutes between 7 AM and 7 PM")
- for the passengers the path/shape is not so important as long as departure and arrival stops are on the path
- the bus driver must know which roads to take to get from stop(n) to stop(n+1)
- GTFS shapes define how buses will travel along roads to pass the stops (OSM's PTv2 defines public_transport=stop_position for this purpose)
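As a rough sketch of how a path can be rebuilt from shapes.txt, the points of each shape_id can be grouped and sorted by shape_pt_sequence. The sample rows are invented and load_shapes is a hypothetical helper name; the column names are those of the GTFS specification.

```python
import csv
import io
from collections import defaultdict

# Made-up excerpt of shapes.txt; rows are deliberately out of order.
SHAPES_TXT = """shape_id,shape_pt_lat,shape_pt_lon,shape_pt_sequence
S1,48.1010,11.5020,20
S1,48.1000,11.5000,10
S1,48.1030,11.5050,30
S2,48.2000,11.6000,1
S2,48.2010,11.6010,2
"""


def load_shapes(text):
    """Return {shape_id: [(lat, lon), ...]} with points in travel order."""
    points = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        points[row["shape_id"]].append(
            (
                int(row["shape_pt_sequence"]),
                float(row["shape_pt_lat"]),
                float(row["shape_pt_lon"]),
            )
        )
    # Sort by sequence number, then drop it and keep only the coordinates.
    return {
        shape_id: [(lat, lon) for _, lat, lon in sorted(pts)]
        for shape_id, pts in points.items()
    }


shapes = load_shapes(SHAPES_TXT)
for shape_id, path in shapes.items():
    print(shape_id, len(path), "points, starting at", path[0])
```

The resulting coordinate lists could, for example, be drawn as an overlay to compare a shape with the ways of an OSM route relation.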
Tags
Judging by the tags currently in use according to Taginfo, no decision has yet been made about the namespace: both the underscore (gtfs_*) and the colon (gtfs:*) are used as separators.
The situation is similar for multiple values, where it is not clear how to handle these cases (example).
At least for routes, gtfs_id=* or gtfs:id=* can be misleading, as it is not clear which of route_id, shape_id or trip_id is meant, and more than one of the three ids can be added to a PTv2 route relation.
Overview of used tags
The big difference of opinion is whether to use an underscore or a colon between the "gtfs" prefix and the rest of the key.
Keys with wiki pages
- gtfs:feed=* (highest priority identifier for the whole GTFS data set; it (or one of its alternatives) should be included when any other gtfs keys are present)
- gtfs:name=*
- gtfs:release_date=*
- gtfs:route_id=* (identifier to associate type=route_master relations with routes)
- gtfs:shape_id=* (preferred identifier for a route variant -- but not always present, does not provide information about stop positions)
- gtfs:stop_id=* (useful if the on-the-ground ref=* for a stop is different from the identifier in the GTFS data set)
- gtfs:trip_id=* (alternative for route variants with only one trip)
- gtfs:trip_id:sample=* (fallback for identifying a route variant -- but more likely to change, provides information on stop positions and their sequence only)
- gtfs_id=*
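To illustrate how the colon-namespaced keys listed above might be combined on a route variant relation, here is a minimal sketch. It assumes the preference order described in the list (shape id if the feed has shapes, otherwise a sample trip id) and that the route id is repeated on the variant relation; the function name route_variant_tags and the feed value "XX-demo" are made up for the example and are not an established convention.

```python
def route_variant_tags(feed, route_id, shape_id=None, sample_trip_id=None):
    """Build a dict of gtfs:* tags for one route variant (illustrative only)."""
    # gtfs:feed should accompany any other gtfs key.
    tags = {"gtfs:feed": feed, "gtfs:route_id": route_id}
    if shape_id:
        # Preferred identifier for a route variant, if the feed has shapes.
        tags["gtfs:shape_id"] = shape_id
    elif sample_trip_id:
        # Fallback: one representative trip of the variant.
        tags["gtfs:trip_id:sample"] = sample_trip_id
    return tags


# Hypothetical values, not taken from any real feed.
print(route_variant_tags("XX-demo", "R1", shape_id="S1"))
print(route_variant_tags("XX-demo", "R1", sample_trip_id="T1"))
```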
Keys by use (over 100 uses as of when this was updated)
Currently unused tags previously mentioned on this page
- Mapping to OSM tags (draft)
Alternative for stops
In Europe, the European standard IFOPT is defined for public transport stops, and in some GTFS data the stop_code is identical to the IFOPT reference. In these situations, instead of gtfs_id=*, gtfs_stop_code=* or gtfs:stop_id=*, it is wise to use the established ref:IFOPT=*.
Data sources
- PTNA - Public Transport Network Analysis aggregates open and correctly licensed GTFS data from some countries. More countries can easily be supported if demanded and links to sources are provided.
- GTFS Data Exchange - Data available for 1000 transit agencies (as of 9 Dec 2016), though licensing varies. Soon to be shutting down.
- Mobility Database (formerly TransitFeeds/OpenMobilityData) - open source aggregation project of GTFS data.
- Transitland at transit.land - commercially funded aggregation of GTFS data.
- transport.data.gouv.fr - French open data GTFS (ODbL)
- European Union NAPs - links to `.pdf` with EU National Access Points (see also unofficial list)
Visualizing GTFS
- PTNA - nice online visualization of aggregated and correctly licensed GTFS data with tag recommendations for route relations and map overlay for shapes.
Conversion of OpenStreetMap and GTFS
OSM → GTFS
- osm2gtfs - An extendable Python script that queries OpenStreetMap data about public transport, combines it with time information provided from a different source and converts it into the GTFS format.
GTFS → OSM
- GO-Sync (aka gtfs-osm-sync) - a desktop tool to synchronize GTFS feeds with OSM
- GTFS-OSM-Validator - console tool that will read GTFS and output exact problems it finds in OSM
- gtfs-sql-importer - This tool can convert GTFS to a PostGIS SQL schema, where the GTFS data can be further manipulated. More examples of this tool can be found in GTFS SQL examples.
Editor support
- The external JOSM preset Public Transport GTFS and rule Public Transport GTFS support some of the tags.
Software using tags
- PTNA evaluates gtfs:feed=*, gtfs:release_date=*, gtfs:route_id=*, gtfs:shape_id=*, gtfs:trip_id:sample=* and gtfs:trip_id=* to provide a link from the relation to the GTFS data.
Discussions
- GO-Sync - a GTFS and OpenStreetMap data synchronization tool - a Google Groups thread announcing gtfs-osm-sync, and difficulties of multiple operators for bus stops
- GO-Sync - a GTFS and OpenStreetMap data synchronization tool - gtfs-osm-sync announcement on Talk-transit
- GTFS compatibility (and [1] and [2]) - discussion on Talk-transit
- Bus stops in North America from GTFS data - thread on Talk-transit
- Proposal:GTFS Tagging Standard