Talk:Import/Catalogue/Bus stop import Norway

From OpenStreetMap Wiki

Cleanup workflow

I have found quite a lot of nonsense (apart from a few cases of regressions where the imported data apparently has replaced good existing data). One major issue is platforms that simply do not exist in reality. Many bus routes have stops where there is no indication whatsoever. Nevertheless, the import has created (badly placed) nodes for them, e.g., https://www.openstreetmap.org/node/6372380794. There are probably some bureaucratic reasons for their existence in the external DB but they really should not be in OSM at all in this form. However, removing them might not be the best option because further imports might re-introduce them. Since we have stored the external IDs in the ref:nsrq field I propose leaving the wrong nodes in OSM with only this tag and a note. Further imports should handle these nodes accordingly (i.e., not touch them). Any thoughts? --Stefanct (talk) 07:21, 28 May 2022 (UTC)
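A minimal sketch of how the proposal above could look on the import side. This is not the actual import script: the function names, the dictionary layout of the NSR records and the example IDs are all hypothetical, and it assumes a stripped node carries nothing but ref:nsrq and, optionally, a note.

```python
# Hypothetical sketch of the proposed "leave a stripped marker node" rule.
# Nothing here is taken from the real import script.

NSR_KEY = "ref:nsrq"
MARKER_KEYS = {NSR_KEY, "note"}  # assumption: a stripped node has only these tags


def is_marker(tags):
    """True if an OSM node was stripped down to its NSR reference (plus a note)."""
    return NSR_KEY in tags and set(tags) <= MARKER_KEYS


def refs_to_skip(osm_nodes):
    """Collect the NSR IDs that mappers have marked as 'do not touch'."""
    return {tags[NSR_KEY] for tags in osm_nodes if is_marker(tags)}


def quays_to_process(nsr_quays, osm_nodes):
    """Drop every NSR record whose ID belongs to a stripped marker node."""
    skip = refs_to_skip(osm_nodes)
    return [quay for quay in nsr_quays if quay["id"] not in skip]


# Made-up example data: one ordinary stop and one stripped marker node.
osm_nodes = [
    {"highway": "bus_stop", "name": "Example stop", NSR_KEY: "NSR:Quay:1"},
    {NSR_KEY: "NSR:Quay:2", "note": "Does not exist on the ground"},
]
nsr_quays = [{"id": "NSR:Quay:1"}, {"id": "NSR:Quay:2"}, {"id": "NSR:Quay:3"}]

remaining = quays_to_process(nsr_quays, osm_nodes)
```

With such a rule the marker node would neither be updated nor be re-created as "missing" on the next run, while ordinary and new stops would still be processed.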

*Continues ranting* It's even worse... there are some nodes that are even duplicates imported years apart that are placed at completely wrong locations... but even have unique IDs (e.g., https://www.openstreetmap.org/node/6372379489). I'll strip them for now from their attributes because they are way too bad to ignore. This seems to be another one of those imports that OSM suffers from for years... why don't we ever learn. :( Make them available as easily usable data but let people import them one by one after verification. Else you will continue to degrade OSM as so many imports before. --Stefanct (talk) 07:50, 28 May 2022 (UTC)
The bureaucratic reason is a legal requirement of all public transport to report, maintain and use a centralised database of stops for all public transport information. Long story short, it harmonises the data and duplicates can be avoided. This data is made openly available and OSM has made use of this to populate the map with all actively used stops (very roughly 100.000 positions). The requirement of the data is to place the "Quay" object exactly on the position for boarding (bus front door/edge of the sidewalk - or best estimate when there is no sidewalk). OSM uses a script to harvest all updates and improvements to the dataset and can benefit thereof. For users to monitor and import all these stops individually would be impossible, so the downsides of incorrectly placed stops are a price worth paying, imho. The NSR database is continually maintained and improved upon - as NKA says, slowly, but actively - since it is used by the whole public transport sector in Norway. If you have a list of obviously misplaced data I'd much rather you report them to me instead of modifying them in OSM, so that they can be fixed upstream. --Johan Wiklund, mapping director at Entur AS (talk) 11:11, 28 May 2022 (UTC)
I have heard that argument so often but it has been disproved so many times already. The problem is that importing data is easy but the true burden is maintenance. I am sure I am not telling you anything new there :) Usually, imports introduce a lot of "unattended" data that then degrades over time - even if the initial quality was very high (in my hometown we have imported a hundred thousand trees with very detailed metadata - however, it was a one-time dump). In contrast to about every other import I had to deal with, here at least there is some continuous effort to improve the data. That is great to see and I would have ranted less if I had known that. I am very skeptical about the "reporting upstream" attitude though. That does not really seem feasible when I look at the number of changes I would make if I edited more in Norway but I'll stop touching anything with the NSR tags from now on. --Stefanct (talk) 12:34, 28 May 2022 (UTC)
I agree that there is a difficult situation with reporting upstream, because how would an OSM user know how to, unless I put my e-mail address in a note on every stop :) Also, a lot of people make edits which I cannot accept into the official dataset due to a different view on "what is correct", but any feedback on erroneous stops is looked into and I sometimes check out the stops in OSM to see what changes have been made - and when someone has found something that is obviously wrong I correct it upstream, or request someone with local knowledge to correct it. The benefit here is that our dataset can be improved from downstream (OSM) into the upstream (NSR) in a dataset which is published to the whole world. Google Maps, Apple Maps, HERE etc. also use our stop place data. So, any edits in OSM can have value - even if some of them don't. The negative aspect of users editing the stops in OSM is that the script no longer touches them, so that if there are actual changes (from upstream), these would not be updated in OSM. In a sense, once a user touches a highway=bus_stop in Norway, they claim responsibility for maintaining it. So the situation is the opposite of what you mentioned about "unattended data". It will please you to know that the stop "Ramberg" has both been adjusted in NSR and reported to the local authority due to the unused shelter further up the road. It will also please you to know that there are usually ~100 edits to NSR each week by the local authorities and Entur which slowly increases the data quality. We have no trees. I should also mention that our journey planner (entur.no) uses public_transport=platform as a routable area for walking/cycling, so having these mapped (when meaningful) and connected to the rest of the road network is a primary concern for us in terms of OSM-data. :) --Johan Wiklund, mapping director at Entur AS (talk) 14:02, 28 May 2022 (UTC)
Regarding "Lofotenekspressen", I think this is outdated now. I checked and there is no longer a 23-720 line. I think it might be line 300 now, but they have 3 lines called 300 so I'll have to take it up with Nordland fylkeskommune. I think they are in the process of renumbering their lines. If you are interested in updating public transport routes in Norway you can use our APIs to find the exact journey patterns. You can find the Authorities here: https://api.entur.io/graphql-explorer/journey-planner?query=%7Bauthorities%7Bid%7D%7D, the Lines here: https://api.entur.io/graphql-explorer/journey-planner?query=%7Blines%28authorities%3A%22NOR%3AAuthority%3A12%22%29%7Bid%20publicCode%20name%7D%7D, the pattern(s) of stops (JourneyPattern) in a Line here: https://api.entur.io/graphql-explorer/journey-planner?query=%7Bline%28id%3A%22NOR%3ALine%3A12_8640%22%29%7Bid%20name%20publicCode%20transportMode%20transportSubmode%20operator%7Bname%7DjourneyPatterns%7Bquays%7Bid%20name%7D%7D%7D%7D (I believe this is Lofotenekspressen). --Johan Wiklund, mapping director at Entur AS (talk) 14:02, 28 May 2022 (UTC)
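The three explorer links above are ordinary GraphQL queries, percent-encoded into the query parameter. A small sketch for building such links for other authorities or lines follows. The helper names are made up for illustration, and only the URL pattern and the field names visible in the links above are used; nothing else about the API is assumed.

```python
from urllib.parse import parse_qs, quote, urlsplit

# Base address taken from the links above.
EXPLORER_BASE = "https://api.entur.io/graphql-explorer/journey-planner"


def explorer_url(query):
    """Percent-encode a GraphQL query into an explorer link."""
    return f"{EXPLORER_BASE}?query={quote(query, safe='')}"


def query_from_url(url):
    """Recover the readable GraphQL query from an explorer link."""
    return parse_qs(urlsplit(url).query)["query"][0]


def lines_query(authority_id):
    """All lines of one authority, as in the second link above."""
    return '{lines(authorities:"%s"){id publicCode name}}' % authority_id


def journey_pattern_query(line_id):
    """Stop patterns (quays) of one line, as in the third link above."""
    return (
        '{line(id:"%s"){id name publicCode transportMode transportSubmode '
        "operator{name}journeyPatterns{quays{id name}}}}" % line_id
    )


authorities_url = explorer_url("{authorities{id}}")
lines_url = explorer_url(lines_query("NOR:Authority:12"))
pattern_url = explorer_url(journey_pattern_query("NOR:Line:12_8640"))
```

Swapping in another authority or line ID from the first two queries gives the corresponding link, and query_from_url turns an existing link back into readable form.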

---

I think you don't fully understand how this works for bus stops in Norway. These are official bus stops from the database and used by the different bus operators. Some stops are poorly placed and can be corrected in OSM. This will in turn give Entur a notification due to the unique ID and the official database can be updated. The second example is not a duplicate as each side of the road has its own stop with a unique ID. This is not a mistake. Gazer75 (talk) 08:22, 28 May 2022 (UTC)
I do fully understand that these are meant to reflect official stops coming from an external faulty database, which is why I started this discussion in the first place. Also, look again, you did not grasp the problem. --Stefanct (talk) 08:30, 28 May 2022 (UTC)
How is it faulty? Sure some of the stops might be poorly placed, but that is probably due to old data before the availability of high quality digital map data. 99% of the stops are good or have been improved since being imported, thus benefiting both parties. So if you found poorly placed stop nodes, fix them and all is good. Gazer75 (talk) 08:36, 28 May 2022 (UTC)
The real Strand is 20km further west: https://www.openstreetmap.org/node/6372378473 And no the ratio is rather 1% of the stops are good but that's not the point anyway. "Just fix them" does not really work for nodes that should not exist in either database as they would be imported again later into OSM because they seem "missing" in OSM. I doubt that deletions in OSM get propagated to the NSR so just deleting them in OSM (and implicitly doing the QA for incompetent external operators) does not really work... --Stefanct (talk) 09:15, 28 May 2022 (UTC)
Actually no, both are accurate as they exist in different counties and/or municipalities. Please do not start trying to fix things before fully understanding how things work. There are like 6 stops around Norway that are called Strand. Gazer75 (talk) 09:28, 28 May 2022 (UTC)
"Interesting". And the Strand in Bogen is exactly at the same location as the stop that is actually called Bogen? May I see your source for that please, because it not only makes no sense at all but also contradicts the official timetables? --Stefanct (talk) 09:41, 28 May 2022 (UTC)
The stop position is wrongly named Bogen, Bogen stop is a bit further east on E10. Official timetables can be found on entur.no if you want to search for them. There you can even find all routes using a specific stop if you like. The NSR database is actually used by the public transport companies for the routes so removing them from both databases as you say above would basically break all routes for them. By moving a node with an NSR ref it will end up in a file that employees at Entur will check regularly and, if required, verify with the county responsible for that stop. It's mentioned in the update section. Gazer75 (talk) 09:57, 28 May 2022 (UTC)
Right, it was the mislabeled Bogen stop that was completely wrong, not the Strand one, thanks, fixed. Also, thanks for the pointer to the update section, I did indeed miss it. However, it does not really explain what would happen on imports if an entity with an NSR ID is deleted in OSM. I would presume it gets imported again if it wasn't also deleted in the actual NSR database? That would be a problem. --Stefanct (talk) 12:34, 28 May 2022 (UTC)
Your example compares a highway=bus_stop with a public_transport=stop_position, which are two different things (not a duplicate). You are right that some of the locations could have better quality, but there is a reporting mechanism in the import which slowly improves the quality. There are hundreds of improvements each week. --NKA (talk) 08:53, 28 May 2022 (UTC)
I am very well aware of how public_transport=stop_position and public_transport=platform work, thank you. See above. The main page and both of you refer to the back-propagation in very superficial words. How does it work? Where is it documented? Are there humans in the loop? What happens to nodes with ref:nsrq that get deleted in OSM? --Stefanct (talk) 09:15, 28 May 2022 (UTC)
Please note that the import is not touching public_transport=stop_position at all. In general, stop positions are not used in Norway, with some exceptions. Many of the existing stop positions were created years ago but are not being maintained by their creators or anyone else (with notable exceptions). Also please note that many bus stops in rural areas in Norway are not visible on the ground. They might not even have a sign, but the bus will stop there. And another thing: The practice in Norway is to keep roundabouts as one closed way, even in route relations. --NKA (talk) 10:24, 28 May 2022 (UTC)
There were about a dozen or so nodes that were added to a route relation without any tags that presumably have been stop positions previously. I'd have to investigate to find them again because I fixed the relation already. If it wasn't the import then somebody else made those errors, fair enough. It's a pity that Norway seems to still use the legacy PT scheme in most cases. I wonder why? The V2 scheme is far from perfect but a huge step forward. In fact it's impossible to tag all details commonly mapped today with the old scheme. And it's also what's suggested at Norway/Public_transport.
The stops without any physical evidence should be tagged accordingly with either unsigned=yes or physically_present=no (predominately used in the UK but specifically invented for bus stops).
What's the rationale behind the roundabouts? That breaks most validators and makes life harder for humans too. I of course split them to fix the routes as you probably discovered. There are even tools to do that automatically... not without reason. :) --Stefanct (talk) 12:34, 28 May 2022 (UTC)
"Please note that the import is not touching public_transport=stop_position at all." Well, that's actually not true. It does at least add them, cf. #11 of https://www.openstreetmap.org/node/21522243/history. And very often they break the PTv2 scheme by removing public_transport=platform from the nodes that also contain the highway=bus_stop tag. :( I would understand if any highway=platform would be replaced but removing public_transport=platform actually worsens the data without need. --Stefanct (talk) 12:34, 28 May 2022 (UTC)