NaPTAN/Lancashire

NaPTAN Lancashire is a planned import, subject to approval, of the NaPTAN dataset covering the Lancashire area of the United Kingdom. A summary and review of the import process can be found here. It is very similar to the recently completed NaPTAN/Aberdeen import.


Goals

   Import data specifically for the Lancashire area as defined in the NaPTAN download (ATCO code 250)
   Import stops of type BCT ("On-street Bus / Coach / Trolley Stop.")
   Import BCT stops of type MKD ("Marked (pole, shelter etc)")
   Import only those stops marked Act (Active).
   Manually conflate and review the data before upload using JOSM
   Split the edits up according to the "Town" NaPTAN data field, which keeps the project manageable.
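
The selection rules above could also be expressed as a script. The following is a minimal Python sketch, not part of the process described on this page. The column names (ATCOCode, StopType, BusStopType, Status) and the lower-case status value are assumptions about the NaPTAN csv layout, and the sample rows are made up.

```python
import csv
import io


def select_stops(rows):
    """Keep active, marked, on-street bus stops in ATCO area 250."""
    return [
        row for row in rows
        if row["ATCOCode"].startswith("250")       # Lancashire area prefix
        and row["StopType"] == "BCT"               # on-street bus/coach/trolley stop
        and row["BusStopType"] == "MKD"            # marked (pole, shelter etc)
        and row["Status"].strip().lower() == "act"  # active records only
    ]


# Made-up sample rows; the column names are assumed, not confirmed.
sample = io.StringIO(
    "ATCOCode,StopType,BusStopType,Status,Town\n"
    "2500IMG001,BCT,MKD,act,Chorley\n"
    "2500IMG002,BCT,CUS,act,Chorley\n"
    "2500IMG003,BCT,MKD,del,Leyland\n"
    "2500IMG004,RLY,,act,Preston\n"
)
selected = select_stops(csv.DictReader(sample))
print([row["ATCOCode"] for row in selected])
```

Records that fail the filter are simply left out here; 'del' records would still need the separate manual check described under Data Reduction & Simplification.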

Schedule

As this is data for a specific region and I intend to split it into smaller areas, there is no pressing need to import everything at once. I plan to import the data as quickly as possible, but the priority is accurate conflation (in test runs this has not taken long using the process described below). I do not expect the process to take more than a day or two per town.

Import Data

Background

Data source site: https://data.gov.uk/dataset/ff93ffc1-6656-47d8-9155-85ea0b8f2251/national-public-transport-access-nodes-naptan

Data license: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3

OSM attribution (if required): https://wiki.openstreetmap.org/wiki/Contributors#Department_for_Transport_-_NaPTAN_data

ODbL Compliance verified: yes as per the NaPTAN entry at Import/Catalogue

Import Type

One time import, human conflated and uploaded using JOSM.

Data Preparation

Data Reduction & Simplification

There is not really anything to simplify: this import is just bus stop nodes. Other data is present in NaPTAN and will be left out of this import.

Several dataset records have the Town missing or misspelt. These will be handled manually and sanity-checked.

Records marked 'del' will be manually checked, and any OSM bus stops at those locations will be marked with a fixme or deleted, depending on available evidence.

Tagging Plans

The naptan: tagging namespace already exists, and I will be consulting the talk GB mailing list to determine which tags are valuable to import.

Following the Aberdeen discussion on the talk GB mailing list I've arrived at the use of the following tags:

   highway=bus_stop
   public_transport=platform
   bus=yes
   name [Imported - defer to existing tagging during conflation, use field naptan:CommonName]
   naptan:AtcoCode [Imported - useful for linking OSM data to NaPTAN data]
   naptan:NaptanCode [Imported - same as above, plus some external services may use this code, it is frequently found on my local bus stop notice boards]
   naptan:CommonName [Imported - should match name but may differ, is what the indicator is relative to]
   naptan:Indicator [Imported - useful to distinguish stops and sometimes can be loc_ref data]
   naptan:verified=no [Gets deleted upon survey verification according to the original guidelines]
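
As an illustration of the tag list above, here is a minimal Python sketch that maps one NaPTAN record to the proposed OSM tags. The input column names (ATCOCode, NaptanCode, CommonName, Indicator) are assumptions about the csv layout, the function name is hypothetical, and the sample values are made up. During conflation an existing OSM name would still take precedence over the imported one.

```python
def naptan_to_osm_tags(row):
    """Build the proposed OSM tag set for one NaPTAN stop record."""
    tags = {
        "highway": "bus_stop",
        "public_transport": "platform",
        "bus": "yes",
        "name": row.get("CommonName", ""),
        "naptan:AtcoCode": row.get("ATCOCode", ""),
        "naptan:NaptanCode": row.get("NaptanCode", ""),
        "naptan:CommonName": row.get("CommonName", ""),
        "naptan:Indicator": row.get("Indicator", ""),
        "naptan:verified": "no",  # removed later, once the stop is surveyed
    }
    # Drop tags whose source field was empty rather than uploading blank values.
    return {key: value for key, value in tags.items() if value}


# Made-up record for illustration only.
example_row = {
    "ATCOCode": "2500IMG001",
    "NaptanCode": "lanexmpl",
    "CommonName": "Example Road",
    "Indicator": "",
}
example_tags = naptan_to_osm_tags(example_row)
print(example_tags)
```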

Notably, these do not cover all naptan: tags which were present in the old import. The reason for this is to avoid bloating the OSM database with unnecessary tags. As per the import guidelines: "Your data source may have many many fields, but OSM data elements with many many tags can be difficult to work with."

Changeset Tags

Changesets will be marked with source=naptan_import as was previously done (see changesets of user NaPTAN) and uploaded by user naptan_import_lancashire. Changeset descriptions will be "Uploading NaPTAN data for [Town]".
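
For reference, the changeset tagging described above amounts to the following key/value pairs. This is a small Python sketch with a hypothetical helper name; it assumes the description is stored in the standard changeset comment key.

```python
def changeset_tags(town):
    """Return the changeset tags proposed for one Town upload."""
    return {
        "source": "naptan_import",
        # Assumption: the changeset description goes in the "comment" key.
        "comment": f"Uploading NaPTAN data for {town}",
    }


print(changeset_tags("Chorley"))
```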

Data Transformation

Data will be downloaded from the NaPTAN site as csv for area 250 (Lancashire). The data for each [Town] will be extracted, using MS Excel's native filters and sorting, to a [Town] extract csv which will be imported into JOSM. Analysis of the downloaded data has shown several mistakes, mostly in the Town field, which may be missing, misspelt, or wrong (e.g. Withnell, an outlying village of Chorley, has been entered in the Town field). These will be manually checked and corrected prior to importing. My local knowledge allows me to do this effectively.

Load the csv into JOSM, conflate using the JOSM conflation plugin, and upload.
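
The per-Town extraction is done with Excel filters in this proposal. A scripted equivalent might look like the Python sketch below, which is an alternative illustration rather than the method used here. The Town column name is an assumption, the corrections table is hypothetical (its entries would come from the manual review), and the sample rows are made up.

```python
from collections import defaultdict

# Hypothetical corrections found during manual review: raw Town value -> Town to file under.
TOWN_CORRECTIONS = {
    "Withnell": "Chorley",
    "Chorly": "Chorley",
}


def group_by_town(rows, corrections):
    """Group stop records by corrected Town; blank Towns go to 'UNKNOWN'."""
    groups = defaultdict(list)
    for row in rows:
        town = row.get("Town", "").strip()
        town = corrections.get(town, town) or "UNKNOWN"
        groups[town].append(row)
    return dict(groups)


# Made-up records for illustration only.
rows = [
    {"ATCOCode": "2500IMG001", "Town": "Chorley"},
    {"ATCOCode": "2500IMG002", "Town": "Withnell"},
    {"ATCOCode": "2500IMG003", "Town": "Chorly"},
    {"ATCOCode": "2500IMG004", "Town": ""},
    {"ATCOCode": "2500IMG005", "Town": "Leyland"},
]
by_town = group_by_town(rows, TOWN_CORRECTIONS)
for town, stops in sorted(by_town.items()):
    print(town, len(stops))
```

Each group could then be written out with csv.DictWriter as one extract per Town. Records in the UNKNOWN group still need a manual decision before import.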

Team Approach

Just me.

Workflow

   The NaPTAN csv will be filtered and sorted into [Town] extracts.
   Output data will be loaded into JOSM.
   JOSM conflation plugin will be used to conflate with existing data.
   Data will be uploaded.
   Reversion can be achieved with JOSM reverter plugin if necessary.

Conflation

Data can be conflated using the JOSM conflation plugin. Below is my tested step-by-step process for this.

   Open up one of the csv files in JOSM.
   Download bus stop data for the current area as a new layer from Overpass API
       [out:xml][timeout:25][bbox:{{bbox}}]; ( node["public_transport"="platform"]; node["highway"="bus_stop"]; ); (._;>;); out meta;
   Configure the conflation settings
       Set the existing OSM data layer as active and Ctrl + A to select all stops, click freeze to make these the conflation reference.
       Set the imported data layer as active and Ctrl + A to select all stops, click freeze to make these the conflation subject.
       Using default settings, generate matches (can tweak if necessary, but NaPTAN data seems to be close to true stop positions).
   Go through the list of matches and perform conflation.
       Check that the stop nodes being matched are actually the same stop. If not, click remove in the conflation window to unmatch them. If other stops are nearby then check for matches amongst those and manually conflate.
       Check the position of the stop against aerial imagery (Bing and Esri enhanced have best alignment in this region). If the existing data has a bad position, change the active layer and move the node as needed.
       With position and match verified, click conflate in the conflation window. This brings up a list of tag conflicts if there are any. Defer to any existing stop name. If the existing stop has naptan:verified=yes, delete the naptan:verified tag as per the original import guidelines.
   Upload the data layer which was originally the imported data (now the imported and conflated data).
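
To make the idea of generating matches more concrete, the Python sketch below proposes, for each imported stop, the nearest existing OSM stop within a distance threshold. It is a simplified illustration only and is not the algorithm used by the JOSM conflation plugin. The 50 m threshold, the record fields (ref, id, lat, lon) and the coordinates are all assumptions made up for the example. Every proposed match would still be reviewed by hand as described above.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


def propose_matches(imported, existing, max_dist_m=50.0):
    """Return (imported ref, existing id or None, distance or None) per imported stop."""
    matches = []
    for imp in imported:
        best, best_dist = None, None
        for osm in existing:
            dist = haversine_m(imp["lat"], imp["lon"], osm["lat"], osm["lon"])
            if best_dist is None or dist < best_dist:
                best, best_dist = osm, dist
        if best is not None and best_dist <= max_dist_m:
            matches.append((imp["ref"], best["id"], best_dist))
        else:
            # No candidate close enough: treat as a new stop to review manually.
            matches.append((imp["ref"], None, None))
    return matches


# Made-up coordinates for illustration only.
imported = [
    {"ref": "2500IMG001", "lat": 53.6530, "lon": -2.6320},
    {"ref": "2500IMG002", "lat": 53.7000, "lon": -2.7000},
]
existing = [
    {"id": 1, "lat": 53.6531, "lon": -2.6320},
    {"id": 2, "lat": 53.6600, "lon": -2.6320},
]
matches = propose_matches(imported, existing)
for ref, osm_id, dist in matches:
    print(ref, osm_id, None if dist is None else round(dist, 1))
```

A nearest-neighbour rule like this can pair stops on opposite sides of a road, which is one reason the manual check of each match matters.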