Automated Edit West Midlands NAPTAN data

From OpenStreetMap Wiki

Proposed Automated Edit of NAPTAN data for the West Midlands

This project is being performed in conjunction with the local public transport authority, Transport for West Midlands (TfWM), by Brian Prangle (OSM username brianboru).

Rationale

Our NAPTAN data was imported in 2009 without highway=bus_stop tags so that we could survey each stop and add the tag once its location had been verified as correct, or had been corrected. In 8 years we have added tags to c. 8,500 nodes out of a total dataset of 12,500. Not all have been surveyed. Some might still be attached to highways. Some will have shelter tags. Many route_ref tags will be out of date.

In those 8 years there will have been many additions and deletions of bus stops in the NAPTAN database, which might or might not have been surveyed and added to OSM. OSM additions are unlikely to carry the NAPTAN-originated data.

In the intervening 8 years NAPTAN positional accuracy has improved to 1m, thanks to the GPS instrumentation of buses and the general uplift in vehicle telemetry, and it is no longer reliant on third-party surveying.

There are new fields in the NAPTAN data that are not in the OSM dataset and that need to be reviewed.

In short, our efforts need a refresh that does not depend on a surveying and manual editing effort that is beyond us.

TfWM approached us with a proposal to collaborate on improving the West Midlands NAPTAN data in OSM and bringing it up to date, because their strategy for online information services will rely on OSM as a basemap layer.

Extent

All NAPTAN data within the West Midlands. West Midlands is defined as the boundary of the West Midlands Combined Authority (i.e. Birmingham, Walsall, Wolverhampton, Dudley, Sandwell, Solihull, Coventry).


Licence

NAPTAN data is available via data.gov.uk under the Open Government Licence (OGL).

Local Community involvement

The mappamercia community has been informed and consulted and there is agreement to proceed with the collaborative update.

A task list has been drawn up and shared with the local community, who discussed it at our February monthly meeting (5 members present) and fed the results back to TfWM.

Posts to be made to talk WestMidlands and talk transit (and talk GB, because this might be a template for updating NAPTAN data generally?). To be completed once this page is complete.

Process

Two TfWM developers have been assigned to this task and currently I spend half a day a week with them planning how to do this. Our approach will be to split the task into a number of passes, changing one attribute of the dataset with each pass. This is a cautious approach, but it should make human review easier because only one attribute needs to be checked each time.

It will also enable us to learn and improve. Between each upload there will be a period for community review.
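A reviewer could back up the one-attribute-per-pass rule with a small check. The following is only a sketch: the function name `unexpected_changes`, the before/after dictionaries and the sample tags are made up for illustration and are not part of the TfWM tooling.

```python
from typing import Dict, List, Tuple

Tags = Dict[str, str]


def unexpected_changes(
    before: Dict[int, Tags],
    after: Dict[int, Tags],
    allowed_key: str,
) -> List[Tuple[int, str]]:
    """Return (node_id, tag_key) pairs that changed although only
    `allowed_key` was meant to change in this pass."""
    problems = []
    for node_id in sorted(set(before) | set(after)):
        old = before.get(node_id, {})
        new = after.get(node_id, {})
        for key in sorted(set(old) | set(new)):
            if old.get(key) != new.get(key) and key != allowed_key:
                problems.append((node_id, key))
    return problems


# Hypothetical sample data: a pass that should only touch route_ref.
before = {
    1: {"name": "High St", "route_ref": "11A"},
    2: {"name": "Station Rd", "route_ref": "9"},
}
after = {
    1: {"name": "High St", "route_ref": "11A;11C"},
    2: {"name": "Station Road", "route_ref": "9;X8"},
}

problems = unexpected_changes(before, after, allowed_key="route_ref")
print(problems)  # node 2 also had its name changed
```

An empty result means the pass touched nothing but the intended attribute, which is the condition the human review is meant to confirm.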

Following the update we need to work out processes for protecting the data in OSM and for regularly updating OSM data from NAPTAN data.

Prior to extracting and merging data it was necessary to clean some OSM data: namely, removing bus stop nodes that were on highways and placing them to the side, and restoring to nodes those bus stops that had been edited into polygons (for the shelters).
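Finding the bus stop nodes that sit on a highway can be done mechanically. The sketch below is not the method that was actually used for the cleanup; it assumes the data is held as a list of elements in the shape of Overpass JSON output (`type`, `id`, `tags`, and `nodes` for ways) and that NAPTAN stops can be recognised by a `naptan:AtcoCode` tag. The function name and sample data are hypothetical.

```python
def stops_on_highways(elements):
    """Return ids of NAPTAN-tagged nodes that are also vertices of a highway way."""
    highway_node_ids = set()
    for el in elements:
        if el.get("type") == "way" and "highway" in el.get("tags", {}):
            highway_node_ids.update(el.get("nodes", []))
    return sorted(
        el["id"]
        for el in elements
        if el.get("type") == "node"
        and "naptan:AtcoCode" in el.get("tags", {})
        and el["id"] in highway_node_ids
    )


# Hypothetical sample: node 2 is a stop that is part of way 10, node 4 is
# a stop placed beside the road, node 1 is an ordinary way vertex.
elements = [
    {"type": "way", "id": 10, "nodes": [1, 2, 3],
     "tags": {"highway": "residential"}},
    {"type": "node", "id": 1, "lat": 52.4800, "lon": -1.9000},
    {"type": "node", "id": 2, "lat": 52.4801, "lon": -1.9001,
     "tags": {"naptan:AtcoCode": "EXAMPLE-A"}},
    {"type": "node", "id": 4, "lat": 52.4802, "lon": -1.9003,
     "tags": {"naptan:AtcoCode": "EXAMPLE-B"}},
]

print(stops_on_highways(elements))
```

The nodes it reports still need a human decision on which side of the road the stop belongs.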

Process Overview

Extract data from OSM with Overpass Turbo as raw data and load it into SQL Server for any manipulation needed to update it with current NAPTAN data.
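As an illustration of the extraction step, an Overpass QL query for NAPTAN-tagged nodes can be assembled as below. This is a sketch under assumptions: the `naptan:AtcoCode` key is taken as the marker for imported stops, and the bounding box is a rough illustrative rectangle, not the Combined Authority boundary. The function name is hypothetical, and the resulting text would be pasted into Overpass Turbo rather than run from Python.

```python
def build_naptan_query(south: float, west: float, north: float, east: float,
                       timeout_s: int = 180) -> str:
    """Build an Overpass QL query for nodes carrying a naptan:AtcoCode tag.

    Overpass bounding boxes are ordered (south, west, north, east).
    """
    bbox = f"{south},{west},{north},{east}"
    return (
        f"[out:json][timeout:{timeout_s}];\n"
        f'node["naptan:AtcoCode"]({bbox});\n'
        "out meta;"  # meta output includes version, changeset and user
    )


# Illustrative bounding box only (an assumption, not the real extent).
query = build_naptan_query(52.3, -2.2, 52.7, -1.4)
print(query)
```

Using `out meta` keeps the version and changeset of each node, which helps when the edited data is later uploaded or reviewed.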

Convert to CSV and open as a layer in JOSM. Add any OSM-specific tags. Carry out a sample human review. Upload from JOSM in 500-unit "chunks" from a maximum dataset size of approx. 12,500.

Human community review, reverting if necessary. Correct and repeat.
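The CSV conversion and the chunk arithmetic in the overview can be sketched as follows. The column layout (id, lat, lon, then one column per tag), the function names and the sample stop are assumptions for illustration; the real export came from SQL Server and may be laid out differently.

```python
import csv
import io
import math


def nodes_to_csv(elements, tag_keys):
    """Write OSM nodes to CSV text: id, lat, lon, then the selected tags."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "lat", "lon", *tag_keys])
    for el in elements:
        if el.get("type") != "node":
            continue
        tags = el.get("tags", {})
        writer.writerow(
            [el["id"], el["lat"], el["lon"], *(tags.get(k, "") for k in tag_keys)]
        )
    return buffer.getvalue()


def chunk_count(total_objects, chunk_size=500):
    """Number of upload chunks needed for a dataset of the given size."""
    return math.ceil(total_objects / chunk_size)


# Hypothetical sample stop; the AtcoCode value is made up.
sample = [
    {"type": "node", "id": 101, "lat": 52.4797, "lon": -1.9027,
     "tags": {"name": "Moor Street", "naptan:AtcoCode": "EXAMPLE-A"}},
]
csv_text = nodes_to_csv(sample, ["name", "naptan:AtcoCode"])
print(csv_text)
print(chunk_count(12500))  # 25 chunks of 500
```

At the quoted sizes a full pass is 25 chunks of 500, so a failed upload can be resumed or reverted a chunk at a time.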


Process Detail

{| class="wikitable"
|+ NAPTAN bus stop clean up West Mids
|-
! Task
! Priority
! Issues
! OSM Community Comments
! Process
|-
| Each task to be performed separately
|
| Can JOSM scale to c. 12,000 node edits? If yes, we have the current skillset; if no, toolset choice and skill acquisition, or subdivide data by LA boundary (this will probably be more difficult than learning new skills)
| JOSM should be able to scale to this, but to be safe there is an Advanced feature in the upload dialog box that allows you to set the chunk size of uploads. Local experience suggests setting it to 500 works well.
| Upload successfully carried out. CLOSED
|-
| Identifying OSM changesets
|
| By changeset or username?
| Either TfWM create a unique username for this work or use #TfWM NAPTAN update 2017 as the changeset comment
| TfWM created a unique username and are using #tfwmnaptanrefresh as changeset comment. CLOSED
|-
| Positional accuracy: move existing OSM nodes to current NAPTAN co-ordinates
|
| Comparison needed with those actually surveyed before updating, or just accept new NAPTAN data accuracy?
| Accept current NAPTAN accuracy. There are still some very old edits where the bus stop node is on the highway rather than to the side. Moving these will distort the attached way. Either needs a script or a prior cleanup.
| On examination the current NAPTAN position data was not as good as we were expecting, so the OSM position was used. N/A. CLOSED
|-
| Tag all nodes with both old and new OSM public transport tags, i.e. highway=bus_stop and public_transport=platform
|
| Ensure CUS stops not tagged
| OK, reluctantly add platform. Also need to add bus=yes for platform type
| CUS stops filtered out in first refresh
|-
| Remove OSM nodes that are deleted in NAPTAN database
|
|
| OK
| Query in Overpass for naptan:BusStopType=DEL. Removed highway=bus_stop tag and left the NAPTAN node suitably commented. CLOSED
|-
| Add new NAPTAN nodes that were not in original import
|
| Avoid duplication with nodes added manually
| Use a proximity script to determine presence of duplicates. How to handle duplicates? First determine extent of problem. Possible solution: remove bus stop tag from OSM node and add FIXME note
|
|-
| Import new data fields added to NAPTAN data since original import
|
| Shelter ref nos. Currently tagged as asset_ref: should this be changed to shelter_ref or something similar?
| OK with shelter_ref tag. Additional request: is there additional shelter data, e.g. type, seats, electronic display?
| Add shelter=yes tag where necessary
|-
| Standardise naming convention to Stop Ref (e.g. SN5) at Interchanges and Road Name/Common Name at non-interchange bus stops
|
|
| OK
| CLOSED. Change made on 14th Feb 2017
|-
| Standardise naming convention in Bus Stations to "Stand X"
|
| Cartographically cleaner, and naming the Bus Station in the name is redundant as the node is located in the Bus Station
| OK
|
|-
| Add new route_ref tags
|
|
| OK
| CLOSED. Change made 14th Feb 2017
|-
| Build stop area relations from NAPTAN stop areas
|
|
| OK
|
|-
| Add bus stops to route relations
|
|
| OK
|
|-
| Add public_transport=stop_position on highway perpendicular to bus stop nodes
|
| Can this be automated?
| OK for CUS stops. Reluctantly agree for MKD stops
|
|-
| Update bus route relations
|
|
| OK but a huge task. Historically that's how it's been done, but it is manually intensive and prone to obsolescence. A much better solution is to keep the data separate in another layer, but there will still be OSM public transport enthusiasts adding relation data. Plus it is a good concept to know that a highway is a bus route
|
|-
| Complete adding all bus lanes
|
| OSM manual task: currently only Birmingham is complete
| Ask TfWM to contact other LAs for the information in open format. More likely to get a positive response than if OSM ask
|
|-
| Complete adding all bus lane conditional restrictions
|
| OSM manual task: nowhere is complete
| Ask TfWM to contact other LAs for the information in open format. More likely to get a positive response than if OSM ask
|
|-
| Add new bus lane camera enforcement locations from bcc
|
| Andy Radford working on presenting the data in clean and accurate format. Mechanical bulk edit when ready. OSM task. Needs to coincide with bcc work schedule
| OK. Add enforcement relation too?
|
|-
| Add TfWM CCTV locations
|
| How to avoid duplication with existing OSM data. Depending on number, this could be done by tagging with operator=TfWM and then manually comparing and deleting duplicates
| Still to be discussed
|
|-
| Data maintenance once cleanup is complete
|
| 2 issues: protecting the OSM data from inadvertent edits, and adding new TfWM amendments
| Suggest running an automated report on OSM data to detect changes. Automatic reverts are likely to meet wider OSM community opposition, so a process needs to be elaborated
|
|-
| Add Swift collectors
| 1
| Accuracy for stop locations needs improving
| Correlate shelter asset no. with bus stop Atco ref and substitute Atco lat/lon, then re-import.
| Decided to do this manually
|}
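The proximity script suggested in the task list for new NAPTAN nodes could look something like the sketch below, which flags new stops lying within a set distance of an existing OSM stop. The 25 m threshold, the function names and the sample coordinates are assumptions for illustration, not values agreed with TfWM or the community.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def possible_duplicates(new_stops, osm_stops, max_distance_m=25.0):
    """Return (new_id, osm_id, distance_m) for every pair within the threshold."""
    pairs = []
    for new in new_stops:
        for old in osm_stops:
            d = haversine_m(new["lat"], new["lon"], old["lat"], old["lon"])
            if d <= max_distance_m:
                pairs.append((new["id"], old["id"], round(d, 1)))
    return pairs


# Hypothetical sample data: one new stop close to an existing OSM stop.
new_stops = [{"id": "new-1", "lat": 52.4800, "lon": -1.9000}]
osm_stops = [
    {"id": 2001, "lat": 52.4801, "lon": -1.9000},  # roughly 11 m away
    {"id": 2002, "lat": 52.4900, "lon": -1.9000},  # over 1 km away
]

for new_id, osm_id, dist in possible_duplicates(new_stops, osm_stops):
    print(new_id, osm_id, dist)
```

This compares every new stop with every existing one, which is slow but workable for a dataset of around 12,500 nodes. Counting the flagged pairs is one way to determine the extent of the problem before deciding how duplicates should be handled.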