GTFS SQL examples

From OpenStreetMap Wiki
Jump to navigation Jump to search

Introduction

gtfs-sql-importer is a tool that can import GTFS file into postgis-enabled database. Check out documentation before how to import it. Once you import both GTFS into your database and import your .PBF using osm2pgsql, you can query them both and extract interesting information.

Examples here assume that you imported GTFS into gtfs schema and that osm2pgsql is in public (this is default behaviour).

Examples

Merging all stops in GTFS and OSM and getting their distance

Tries to match stop_id from GTFS and ref from OSM to get stop from OSM. Ideally, it should match gtfs_id, not ref.

WITH belgrade_osm_stops AS (
    -- filter only Belgrade (from serbia.pbf)
    SELECT *
    FROM planet_osm_point
    WHERE (highway='bus_stop' OR railway='tram_stop')
      AND st_transform(way, 4326) && ST_MakeEnvelope(20.1, 44.63, 20.77, 45.01, 4326)
),
joined AS (
    -- join stop_id from GTFS and ref from OSM
    SELECT *, st_transform(osm_stops.way, 4326) points4326,
      st_distance(ST_Transform(stops.the_geom::geometry, 3857), osm_stops.way) AS distance
    FROM gtfs.stops stops
    LEFT JOIN belgrade_osm_stops osm_stops ON stops.stop_id=osm_stops.ref
)
SELECT stop_id, stop_name, the_geom as gtfs_geo, osm_id, name as osm_name, ref, points4326, distance
FROM joined
ORDER BY distance

All stops from GTFS missing in OSM in Belgrade

Building on top of query above, this one find only stops where there is no match with OSM and those stops can be exported to CSV and imported further into QGIS/JOSM to figure out is there are really missing stops or just ref tag is missing.

WITH belgrade_osm_stops AS (
    -- filter only Belgrade (from serbia.pbf)
    SELECT *
    FROM planet_osm_point
    WHERE (highway='bus_stop' or railway='tram_stop') AND st_transform(way, 4326) && ST_MakeEnvelope(20.1, 44.63, 20.77, 45.01, 4326)
),
joined AS (
    -- join with GTFS data
    SELECT *, st_distance(ST_Transform(stops.the_geom::geometry, 3857), osm_stops.way) as distance
    FROM gtfs.stops stops
    LEFT JOIN belgrade_osm_stops osm_stops ON stops.stop_id=osm_stops.ref
),
missing_stops AS (
    -- filter only those that do not exist
    SELECT stop_id, stop_name, the_geom AS gtfs_geo
    FROM joined
    WHERE distance IS null
),
stops_with_routes AS (
    -- join with trip and routes, so we get all lines that goes through those missing stops
    SELECT ms.stop_id, ms.stop_name, ms.gtfs_geo, string_agg(DISTINCT r.route_short_name, ',') AS routes
    FROM gtfs.trips t
    INNER JOIN gtfs.stop_times st ON st.trip_id = t.trip_id
    INNER JOIN missing_stops ms ON ms.stop_id = st.stop_id
    INNER JOIN gtfs.routes r ON t.route_id = r.route_id
    GROUP BY ms.stop_id, ms.stop_name, ms.gtfs_geo
)
SELECT stop_id, stop_name, ST_X(gtfs_geo) AS long, ST_Y (gtfs_geo) AS lat, routes
FROM stops_with_routes