Mechanical Edits/willkmis/Long Beach Transit tag upgrades
Page content created as advised on Automated Edits code of conduct#Document and discuss your plans.
As of 12/20/2023, this mechanical edit is complete. See below for implementation details.
Who
User:Willkmis, using the OSM account of the same name.
Motivation
The bus stops of the Long Beach Transit agency were imported in 2019, but the data were not mapped onto standard OSM tagging practices. Notably, the stops' names were imported as stop_name=*, and their reference numbers were imported as stop_code=*. These uses are non-standard and contradict the documented practice on the wiki for bus stops, which recommends name=* and ref=*, respectively. In fact, Long Beach Transit stops make up the vast majority of the use of both stop_name and stop_code. There are also other non-standard aspects of this import, such as the names being set in all caps and including the cardinal direction of the stop's corner as a suffix, a convention that does not appear to be used in any other major US city.
This mechanical edit will change the unusual imported values into standard tags with standard title case formatting. This edit will affect approximately 1800 nodes in the Long Beach Transit service area.
Method
- Investigate and manually rectify unusual tagging, including nodes with both name=* and stop_name=*, stop_code=* without stop_name=*, pre-existing ref=* or network=* values, etc.
  - Duplicate names can be due either to an intervening naming or to a misnaming, such as name=LBT. Only 25 nodes contain both name and stop_name.
  - Existing networks can be due to bus stops shared with other agencies, in which case LBT information should be added in semicolon-delimited form.
  - Some bus stops have name values previously upgraded from stop_name but retain stop_code, which should be upgraded to ref, and have spelled-out corner directions, which should be deleted for consistency. These will be treated manually in a separate changeset.
- Download all affected data using this Overpass query:
[out:json][timeout:150]; ( nwr["stop_name"]({{bbox}}); ); out body; >; out skel qt;
- Add standard NSI preset values for Long Beach Transit: network=Long Beach Transit and network:wikidata=Q6672372. If network tags already exist, manually review.
- Modify tags: stop_name -> name and stop_code -> ref
- Run custom validators:
  - Convert names to title case (e.g. "ATLANTIC & COLUMBIA NE" to "Atlantic & Columbia Ne") (based on this code from user watmildon)
      *["name"]["name"=~/^[A-Z 0-9 &\.\(\)]+$/] {
        assertNoMatch: "way \"name\"=Redmond Way";
        assertMatch: "way \"name\"=REDMOND WAY";
        throwWarning: tr("name is ALL CAPS, may need to be Title Case");
        fixAdd: concat("name=", title(tag("name")));
      }
  - Remove directional suffixes (e.g. "Atlantic & Columbia Ne" to "Atlantic & Columbia")
      *["name"]["name"=~/^.+ [NESWnesw]{2}$/] {
        assertNoMatch: "way \"name\"=Redmond Way";
        assertMatch: "way \"name\"=Redmond Way NW";
        throwWarning: tr("name has 2 letter directional ending");
        fixAdd: concat("name=", substring(tag("name"), 0, length(tag("name"))-3));
      }
      *["name"]["name"=~/^.+ [NESWnesw]{1}$/] {
        assertNoMatch: "way \"name\"=Redmond Way";
        assertMatch: "way \"name\"=Redmond Way N";
        throwWarning: tr("name has 1 letter directional ending");
        fixAdd: concat("name=", substring(tag("name"), 0, length(tag("name"))-2));
      }
  - Expand PCH abbreviation (e.g. "Pch & Atlantic" -> "Pacific Coast Highway & Atlantic")
      *["name"]["name"=~/^.*Pch.*$/] {
        assertNoMatch: "way \"name\"=Pacific Coast Highway & Atlantic";
        assertMatch: "way \"name\"=Pch & Atlantic";
        throwWarning: tr("Unexpanded PCH abbreviation");
        fixAdd: concat("name=", replace(tag("name"), "Pch", "Pacific Coast Highway"));
      }
- Manually review the edited names to ensure no problems were introduced.
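The three name fixes in the validator step can also be approximated outside JOSM, which may help when spot-checking names during the manual review. The following is a minimal Python sketch, not the code used for the edit: the function name `normalize_stop_name` is hypothetical, the one- and two-letter suffix rules are folded into a single pattern, and Python's `str.title()` stands in for the MapCSS `title()` function, so the two may disagree on edge cases (for example, `str.title()` turns "1ST" into "1St").

```python
import re

# Same character class as the all-caps validator rule.
ALL_CAPS = re.compile(r"^[A-Z 0-9 &\.\(\)]+$")
# One- or two-letter compass suffix, combining both directional rules.
DIRECTION_SUFFIX = re.compile(r"^(.+) [NESWnesw]{1,2}$")


def normalize_stop_name(name):
    """Apply the three name fixes in the order the Method list gives them."""
    # 1. Title-case names that are entirely upper case.
    if ALL_CAPS.match(name):
        name = name.title()  # approximation of MapCSS title()
    # 2. Drop a trailing compass suffix such as " Ne" or " N".
    match = DIRECTION_SUFFIX.match(name)
    if match:
        name = match.group(1)
    # 3. Expand the title-cased PCH abbreviation.
    return name.replace("Pch", "Pacific Coast Highway")


if __name__ == "__main__":
    for raw in ("ATLANTIC & COLUMBIA NE", "PCH & ATLANTIC", "Redmond Way"):
        print(raw, "->", normalize_stop_name(raw))
```

Like the validator rules, the suffix pattern removes any trailing one- or two-letter token made of N, E, S, and W, so results still need the manual review described in the last step.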
A test edit following the process above was done here: https://www.openstreetmap.org/changeset/144533560
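The tag changes in the Method list (key renames plus the NSI network values) can be illustrated on a plain dictionary of tags. This is a sketch under stated assumptions rather than the procedure actually used: `upgrade_tags` and `append_value` are hypothetical helpers, the stop code "1234" and the agency "Other Agency" are invented for the example, and appending to an existing network value automatically is a simplification, since the page calls for manual review in that case.

```python
LBT_NETWORK = "Long Beach Transit"
LBT_WIKIDATA = "Q6672372"


def append_value(existing, value):
    """Add `value` to a semicolon-delimited tag value unless already present."""
    parts = [p.strip() for p in existing.split(";") if p.strip()]
    if value not in parts:
        parts.append(value)
    return ";".join(parts)


def upgrade_tags(tags):
    """Return a copy of `tags` with stop_name/stop_code moved to name/ref."""
    out = dict(tags)
    for old, new in (("stop_name", "name"), ("stop_code", "ref")):
        if old not in out:
            continue
        if new in out and out[new] != out[old]:
            # Conflicting values are resolved by hand on the page,
            # so the sketch refuses instead of guessing.
            raise ValueError(f"{new!r} already set to a different value")
        out[new] = out.pop(old)
    # Assumption: shared stops get LBT appended in semicolon-delimited form.
    out["network"] = append_value(out.get("network", ""), LBT_NETWORK)
    out["network:wikidata"] = append_value(
        out.get("network:wikidata", ""), LBT_WIKIDATA
    )
    return out


if __name__ == "__main__":
    stop = {"highway": "bus_stop", "stop_name": "ATLANTIC & COLUMBIA NE",
            "stop_code": "1234"}  # hypothetical stop
    print(upgrade_tags(stop))
```

Raising on a conflicting name or ref mirrors the decision to handle those nodes manually before the bulk edit.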
Discussion
Feedback has been solicited on OSMUS Slack and the OSM Community Forum, and via direct message with the original importer. The proposed edits were favorably received in all fora, and no objections were raised after two weeks.
Implementation
A demonstration edit following the above procedure was done here: 144533560
Rectification of unusual tagging combinations began on 12/13/2023. This included manual investigation of:
- Nodes containing both ref and stop_code, fixed in changesets 145097349, 145097802, 145098104, 145098153, 145098285, and 145098398
- Nodes containing both name and stop_name, or other spurious names, fixed in changesets 145173712 and 145276172
- Nodes with incorrect network, fixed in changeset 145276304
The main bulk edit was performed on 12/19/2023, in changeset 145317056
Further tail work post-mass edit to clean up and standardize data was completed on 12/20/2023. This included:
- Clean up of bus stops with unusual delimiters or characters that caused them to be missed by my simple regex, fixed in changesets 145317596, 145317606, 145318013, and 145318024
- Expansion of abbreviations, fixed in changesets: 145317908 and 145319310
- Follow-up to upgrade some stops that were missed in the original bulk edit because of conflation errors, which arose from duplicate stops that belonged to route relations, fixed in changeset 145355713
- Tagging upgrades for stops which contain stop_code but previously had stop_name changed to name. I removed the directional suffixes for consistency. These were treated in changeset 145355954