GTFS

GTFS (General Transit Feed Specification) is a data format created for sharing public transportation information such as bus stops, bus routes, and timetables.

It lets consumers of OSM data provide public transport routing, because in many cases timetables change so often that representing them in OSM is practically impossible. In some cities timetables can be expected to change daily due to road or track closures and renovations. In some areas timetables change massively several times a year, for example when holidays or school terms start and end.

In such cases data consumers can take roads, footways, and stop positions from OSM and take the available public transport trips directly from the organization.

It was developed by Google and was originally called the Google Transit Feed Specification. It is now maintained by the MobilityData organization, which also maintains tools for using GTFS data, a database of GTFS data, and the General Bikeshare Feed Specification.

Structure of GTFS

A GTFS trip identifies a single trip of a bus ...

  • GTFS file trips.txt
    • specifies the trip_id
    • links a trip to a route_id in routes.txt
    • links a trip to a service_id in calendar.txt and calendar_dates.txt (the days on which the trip runs)
    • if shapes exist, shape_id links the trip to a path that a vehicle travels in shapes.txt
    • specifies some more information like the headsign

For each trip_id there are several entries in the GTFS file stop_times.txt

  • trip_id to identify which trip the entry belongs to
  • stop_id, the id of a stop of the trip
  • stop_sequence, the order of the stop for this particular trip (1,2,3,...)
  • arrival_time and departure_time, further fields that are less relevant for OSM

Detailed information about stops is in the GTFS file stops.txt

  • stop_id to link it to trips via stop_times.txt
  • stop_name
  • stop_lat and stop_lon, the position of the stop (platform)
  • GTFS stops are usually platforms beside the road (public_transport=platform in PTv2 jargon)
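
The links between trips.txt, stop_times.txt, and stops.txt described above can be followed with a few lines of Python. This is only a minimal sketch: the sample rows, IDs, and coordinates are made up for illustration, and the helper names read_rows and stops_for_trip are hypothetical, not part of any GTFS tool. Real feeds ship these tables as separate files inside a ZIP archive and usually contain more columns.

```python
import csv
import io

# Made-up sample data (assumption); column names follow the GTFS fields named above.
TRIPS_TXT = """\
route_id,service_id,trip_id,trip_headsign,shape_id
R1,WEEKDAY,T1,Central Station,S1
"""

STOP_TIMES_TXT = """\
trip_id,arrival_time,departure_time,stop_id,stop_sequence
T1,08:05:00,08:05:00,B,2
T1,08:00:00,08:00:00,A,1
T1,08:12:00,08:12:00,C,3
"""

STOPS_TXT = """\
stop_id,stop_name,stop_lat,stop_lon
A,Market Square,50.0000,8.0000
B,Town Hall,50.0010,8.0020
C,Central Station,50.0030,8.0050
"""


def read_rows(text):
    """Parse one GTFS table (CSV with a header line) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def stops_for_trip(trip_id, stop_times, stops):
    """Return (stop_sequence, stop_id, stop_name, lat, lon) tuples in travel order."""
    stops_by_id = {s["stop_id"]: s for s in stops}
    entries = [st for st in stop_times if st["trip_id"] == trip_id]
    # stop_sequence is stored as text, so sort numerically rather than as strings.
    entries.sort(key=lambda st: int(st["stop_sequence"]))
    result = []
    for st in entries:
        stop = stops_by_id[st["stop_id"]]
        result.append((
            int(st["stop_sequence"]),
            st["stop_id"],
            stop["stop_name"],
            float(stop["stop_lat"]),
            float(stop["stop_lon"]),
        ))
    return result


trip = read_rows(TRIPS_TXT)[0]
print(trip["trip_id"], "on route", trip["route_id"], "to", trip["trip_headsign"])
for row in stops_for_trip(trip["trip_id"], read_rows(STOP_TIMES_TXT), read_rows(STOPS_TXT)):
    print(row)
```

The rows in stop_times.txt are not guaranteed to be stored in travel order, which is why the sketch sorts by stop_sequence before looking up each stop.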

A GTFS shape identifies the path that a vehicle will travel along a route ...

  • GTFS file shapes.txt identifies a path (GPX route)
    • usually, several trips use the same path/shape with different departure times (such as "every 10 minutes between 7 AM and 7 PM")
    • for the passengers the path/shape is not so important as long as departure and arrival stops are on the path
    • the bus driver must know which roads to take to get from stop(n) to stop(n+1)
    • GTFS shapes define how buses will travel along roads to pass the stops (OSM's PTv2 defines public_transport=stop_position for this purpose)
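
As a rough illustration of the points above, the following Python sketch rebuilds each shape as an ordered list of coordinates and groups trips by the shape they share. The sample rows are invented and the function names shape_points and trips_by_shape are hypothetical; the column names shape_pt_lat, shape_pt_lon, and shape_pt_sequence are the ones used in shapes.txt.

```python
import csv
import io
from collections import defaultdict

# Made-up sample data (assumption), with shape points deliberately out of order.
SHAPES_TXT = """\
shape_id,shape_pt_lat,shape_pt_lon,shape_pt_sequence
S1,50.0010,8.0020,2
S1,50.0000,8.0000,1
S1,50.0030,8.0050,3
"""

TRIPS_TXT = """\
route_id,service_id,trip_id,shape_id
R1,WEEKDAY,T1,S1
R1,WEEKDAY,T2,S1
R1,WEEKDAY,T3,
"""


def shape_points(shapes_text):
    """Map each shape_id to its (lat, lon) points, ordered by shape_pt_sequence."""
    raw = defaultdict(list)
    for row in csv.DictReader(io.StringIO(shapes_text)):
        raw[row["shape_id"]].append((
            int(row["shape_pt_sequence"]),
            float(row["shape_pt_lat"]),
            float(row["shape_pt_lon"]),
        ))
    # Sorting the tuples orders them by sequence number first.
    return {sid: [(lat, lon) for _, lat, lon in sorted(pts)] for sid, pts in raw.items()}


def trips_by_shape(trips_text):
    """Map each shape_id to the trip_ids that use it; trips without a shape are skipped."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(trips_text)):
        if row.get("shape_id"):
            grouped[row["shape_id"]].append(row["trip_id"])
    return dict(grouped)


print(shape_points(SHAPES_TXT))
print(trips_by_shape(TRIPS_TXT))
```

Trip T3 has an empty shape_id in the sample, reflecting that shapes are optional in a feed.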

Tags

Looking at the tags currently in use on Taginfo, no decision about the namespace has been made yet: both the underscore (gtfs_*) and the colon (gtfs:*) are used.

The situation is similar for multiple values: it is not clear how to handle these cases (example).

At least for routes, gtfs_id=* or gtfs:id=* can be misleading: it is not clear which of route_id, shape_id, or trip_id is meant, and more than one of the three IDs can be added to a PTv2 route relation.

Overview of used tags

The main disagreement is whether to use an underscore or a colon between the "gtfs" prefix and the rest of the key.
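
Because both spellings occur in the data, a data consumer may want to look for either one. The Python sketch below splits the GTFS-related keys of one OSM object by prefix style. The function name gtfs_keys and the example tag values are hypothetical and only serve as illustration; the sketch does not recommend either spelling.

```python
def gtfs_keys(tags):
    """Split the GTFS-related keys of one OSM object by prefix style."""
    return {
        "colon": sorted(k for k in tags if k.startswith("gtfs:")),
        "underscore": sorted(k for k in tags if k.startswith("gtfs_")),
    }


# Hypothetical tags of a route relation that mixes both styles.
example_tags = {
    "type": "route",
    "route": "bus",
    "gtfs:route_id": "R1",
    "gtfs_id": "R1",
}

print(gtfs_keys(example_tags))
```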

Keys with wiki pages

  • gtfs:feed=* (highest priority identifier for the whole GTFS data set; it (or one of its alternatives) should be included when any other gtfs keys are present)
  • gtfs:name=*
  • gtfs:release_date=*
  • gtfs:route_id=* (identifier to associate type=route_master relations with routes)
  • gtfs:shape_id=* (preferred identifier for a route variant -- but not always present, does not provide information about stop positions)
  • gtfs:stop_id=* (useful if the on-the-ground ref=* for a stop is different from the identifier in the GTFS data set)
  • gtfs:trip_id=* (alternative for route variants with only one trip)
  • gtfs:trip_id:sample=* (fallback for identifying a route variant -- but more likely to change, provides information on stop positions and their sequence only)
  • gtfs_id=*

Keys by use (over 100 uses as of when this was updated)

Currently unused tags previously mentioned on this page

Alternative for stops

In Europe, the European standard IFOPT is defined for public transport stops, and in some GTFS data the stop_code is identical to the IFOPT reference. In these situations it is wise to use the established ref:IFOPT=* instead of gtfs_id=*, gtfs_stop_code=*, or gtfs:stop_id=*.

Data sources

Visualizing GTFS

  • PTNA - nice online visualization of aggregated and correctly licensed GTFS data with tag recommendations for route relations and map overlay for shapes.

Conversion of OpenStreetMap and GTFS

OSM → GTFS

  • osm2gtfs - An extendable Python script that queries OpenStreetMap data about public transport, combines it with time information provided from a different source, and converts it into the GTFS format.

GTFS → OSM

  • GO-Sync (aka gtfs-osm-sync) - a desktop tool to synchronize GTFS feeds with OSM
  • GTFS-OSM-Validator - console tool that reads GTFS and reports the exact problems it finds in OSM
  • gtfs-sql-importer - This tool can convert GTFS to an SQL PostGIS schema where GTFS can be further manipulated. More examples of this tool can be found in GTFS SQL examples.

Editor support

Software using tags

Discussions

External links