Mechanical Edits/willkmis/Long Beach Transit tag upgrades

Page content created as advised on Automated Edits code of conduct#Document and discuss your plans.

As of 12/20/2023, this mechanical edit is complete. See below for implementation details.

Who

User:Willkmis, using the OSM account of the same name.

Motivation

The bus stops of the Long Beach Transit agency were imported in 2019, but the data were mapped incorrectly onto OSM tags. Notably, the stops' names were imported as stop_name=* and their reference numbers as stop_code=*. These uses are non-standard and contradict the documented practice on the wiki for bus stops, which recommends name=* and ref=*, respectively. In fact, Long Beach Transit stops make up the vast majority of the uses of both stop_name and stop_code. The import has other non-standard aspects as well: the names are set in all caps and carry, as a suffix, the cardinal direction of the corner on which the stop sits, a practice that does not appear to be followed in any other major US city.

This mechanical edit will change the unusual imported values into standard tags with standard title case formatting. This edit will affect approximately 1800 nodes in the Long Beach Transit service area.

Method

  1. Investigate and manually rectify unusual tagging, including nodes with both name=* and stop_name=*, stop_code=* without stop_name=*, pre-existing ref=* or network=* values, etc.
    1. Duplicate names can be due either to a name added in an intervening edit or to a misnaming, such as name=LBT. Only 25 nodes contain both name and stop_name.
    2. Existing networks can be due to shared bus stops with other agencies, in which case LBT information should be added in semicolon-delimited form.
    3. Some bus stops had stop_name upgraded to name previously but retain stop_code, which should be upgraded to ref. Their names also contain spelled-out corner directions, which should be deleted for consistency. These will be treated manually in a separate changeset.
  2. Download all affected data using this overpass query:
    [out:json][timeout:150];
    (
      nwr["stop_name"]({{bbox}});
    );
    out body;
    >;
    out skel qt;
    
  3. Add standard NSI preset values for Long Beach Transit: network=Long Beach Transit and network:wikidata=Q6672372. If network tags already exist, manually review.
  4. Modify tags: stop_name -> name and stop_code -> ref
  5. Run custom validators:
    1. Convert names to title case (e.g. "ATLANTIC & COLUMBIA NE" to "Atlantic & Columbia Ne")
      *["name"]["name"=~/^[A-Z 0-9 &\.\(\)]+$/] {
          assertNoMatch: "way \"name\"=Redmond Way";
          assertMatch: "way \"name\"=REDMOND WAY";
          throwWarning: tr("name is ALL CAPS, may need to be Title Case");
          fixAdd: concat("name=", title(tag("name")));
      }
      
      (Based on this code from user watmildon)
    2. Remove directional suffixes (e.g. "Atlantic & Columbia Ne" to "Atlantic & Columbia")
      *["name"]["name"=~/^.+ [NESWnesw]{2}$/] {
          assertNoMatch: "way \"name\"=Redmond Way";
          assertMatch: "way \"name\"=Redmond Way NW";
          throwWarning: tr("name has 2 letter directional ending");
          fixAdd: concat("name=", substring(tag("name"), 0, length(tag("name"))-3));
      }
      
      *["name"]["name"=~/^.+ [NESWnesw]{1}$/] {
          assertNoMatch: "way \"name\"=Redmond Way";
          assertMatch: "way \"name\"=Redmond Way N";
          throwWarning: tr("name has 1 letter directional ending");
          fixAdd: concat("name=", substring(tag("name"), 0, length(tag("name"))-2));
      }
      
    3. Expand PCH abbreviation (e.g. "Pch & Atlantic" -> "Pacific Coast Highway & Atlantic")
      *["name"]["name"=~/^.*Pch.*$/] {
          assertNoMatch: "way \"name\"=Pacific Coast Highway & Atlantic";
          assertMatch: "way \"name\"=Pch & Atlantic";
          throwWarning: tr("Unexpanded PCH abbreviation");
          fixAdd: concat("name=", replace(tag("name"), "Pch", "Pacific Coast Highway"));
      }
      
  6. Manually review the edited names to ensure no problems were introduced.
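
The tag changes in steps 3 to 5 can be approximated outside JOSM as a rough check. The following is a minimal Python sketch, not the tool used for the edit: the function names are hypothetical, the title-casing is only an approximation of JOSM's title() function, and any node that already carries name, ref or network is simply returned as None so that it can be reviewed manually, as step 1 describes.

```python
import re

# Same idea as the two MapCSS suffix rules: a trailing space followed by
# one or two direction letters. (Hypothetical helper, not from the edit.)
DIRECTION_SUFFIX = re.compile(r"^(.+) [NESWnesw]{1,2}$")


def title_case(name):
    """Capitalize each run of letters/digits, e.g. 'PCH & 7TH' -> 'Pch & 7th'.

    This only approximates JOSM's title(); results may differ on edge cases.
    """
    return re.sub(r"[A-Za-z0-9']+", lambda m: m.group(0).capitalize(), name)


def normalize_name(raw):
    """Title-case, drop the directional suffix, then expand 'Pch'."""
    name = title_case(raw)
    match = DIRECTION_SUFFIX.match(name)
    if match:
        name = match.group(1)
    return name.replace("Pch", "Pacific Coast Highway")


def upgrade_tags(tags):
    """Return upgraded tags, or None if the node needs manual review."""
    if "stop_name" not in tags or any(k in tags for k in ("name", "ref", "network")):
        return None  # unusual combination: leave for manual review (step 1)
    new = dict(tags)
    new["name"] = normalize_name(new.pop("stop_name"))
    if "stop_code" in new:
        new["ref"] = new.pop("stop_code")
    new["network"] = "Long Beach Transit"
    new["network:wikidata"] = "Q6672372"
    return new


print(upgrade_tags({"highway": "bus_stop",
                    "stop_name": "ATLANTIC & COLUMBIA NE",
                    "stop_code": "1234"}))
```

The stop code 1234 is made up for illustration. As with the validator rules, the one-letter suffix pattern can also strip a legitimate trailing letter, which is one reason the manual review in step 6 is still needed.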


A test edit following the process above was done here: https://www.openstreetmap.org/changeset/144533560
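
The triage in step 1 of the method could also be scripted before any manual work. The sketch below assumes the input is the "elements" list from an Overpass JSON response, where each element has an "id" and an optional "tags" object. The function names and the "Example Transit" agency are hypothetical, and the grouping only mirrors the combinations listed in step 1.

```python
def triage(elements):
    """Group element ids by the unusual tag combinations named in step 1."""
    groups = {
        "name_and_stop_name": [],
        "stop_code_without_stop_name": [],
        "existing_ref_or_network": [],
    }
    for element in elements:
        tags = element.get("tags", {})
        if "name" in tags and "stop_name" in tags:
            groups["name_and_stop_name"].append(element["id"])
        if "stop_code" in tags and "stop_name" not in tags:
            groups["stop_code_without_stop_name"].append(element["id"])
        is_import_stop = "stop_name" in tags or "stop_code" in tags
        if is_import_stop and ("ref" in tags or "network" in tags):
            groups["existing_ref_or_network"].append(element["id"])
    return groups


def add_semicolon_value(existing, value):
    """Append value to a semicolon-delimited tag value unless already present."""
    values = [v.strip() for v in (existing or "").split(";") if v.strip()]
    if value not in values:
        values.append(value)
    return ";".join(values)


# Hypothetical shared stop that already belongs to another agency's network.
print(add_semicolon_value("Example Transit", "Long Beach Transit"))
```

Note that the Overpass query above only returns objects with stop_name, so finding stop_code without stop_name would need a separate query on that key.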

Discussion

Feedback has been solicited on OSMUS Slack and the OSM Community Forum, and via direct message with the original importer. The proposed edits were favorably received in all fora, and no objections were raised after two weeks.

Implementation

A demonstration edit following the above procedure was done here: 144533560

Rectification of unusual tagging combinations began on 12/13/2023. This included manual investigation of:

The main bulk edit was performed on 12/19/2023, in changeset 145317056.

Further tail work post-mass edit to clean up and standardize data was completed on 12/20/2023. This included:

  • Cleanup of bus stops with unusual delimiters or characters that caused them to be missed by my simple regex, fixed in changesets 145317596, 145317606, 145318013, and 145318024
  • Expansion of abbreviations, fixed in changesets 145317908 and 145319310
  • Follow-up to upgrade some stops that were missed in the original bulk edit because of conflation errors, which arose from duplicate stops that belonged to route relations, fixed in changeset 145355713
  • Tagging upgrades for stops that contained stop_code but had previously had stop_name changed to name. I removed the directional suffixes for consistency. These were treated in changeset 145355954
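
The first item above arises because the all-caps rule only accepts a fixed set of characters. A small Python sketch of how such leftovers could be found is shown below. It reuses the character class from the validator rule, but the function name and the sample names are hypothetical, since the exact characters involved are not recorded here.

```python
import re

# Character class copied from the all-caps validator rule in the Method section.
ALL_CAPS_RULE = re.compile(r"^[A-Z 0-9 &\.\(\)]+$")


def find_missed_all_caps(names):
    """Return names that are still all caps but that the rule would not match."""
    return [
        name for name in names
        if name.isupper() and not ALL_CAPS_RULE.match(name)
    ]


# Hypothetical sample names; "/" , "'" and "-" are outside the rule's character class.
sample = [
    "ATLANTIC & COLUMBIA NE",
    "ATLANTIC / COLUMBIA NE",
    "Atlantic & Columbia",
    "O'NEIL - 7TH",
]
print(find_missed_all_caps(sample))
```

Running a check like this on the downloaded names after the bulk edit would list candidates for the follow-up changesets without widening the original regex.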