I'm mainly interested in improving OSM data for geocoding ("place finding") purposes. This means working on names, administrative regions, postal codes etc.
I propose to link place nodes to their respective admin relation in an automated manner.
- There are many administrative regions and many place nodes, but a large share of them are not linked to each other.
- Manually adding 'admin_centre' members to relations is tedious. I've tried both the iD and JOSM editors. Although both are comfortable for other use cases, linking place nodes to relations in them is extremely time-consuming.
- To fill the gap, I've implemented an automatic resolver and published it on GitHub.
- To broaden the effect, I think it's a good idea to submit the resolved links directly to the OSM source data. I will document the whole approach and announce it on the imports mailing list before making any changes.
- ~19,000 admin relations (counting only those with both admin_level and wikidata tags)
- only ~9,100 of these relations have a member with the role "admin_centre" or "label"
- in contrast, there are ~150,000 place nodes
- easier handling of geocoding tasks
- provide a better reference point than the geographic centre of the polygon
- improve the visibility of place nodes acting as admin_centres
- lay the groundwork for further improvements
Some of the following could become goals of later efforts:
- I won't move or improve the position of centre nodes
- I won't check whether a centre node lies inside a landuse area
- I won't remove duplicate place nodes
Overview: see flowchart
- take a PBF file and extract admin relations and place nodes into separate files
- match admin relations to place nodes by position and name. A place node must lie inside its respective admin boundary.
- review generated mappings
- if required, remove opted-out IDs
- upload mappings to OSM in small batches (e.g. one changeset for each county)
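The opt-out and batching steps could look roughly like the following sketch. It is only an illustration, not the published resolver: the `Mapping` record, its `county` field, and the `batchByCounty` method are hypothetical names, and it assumes every mapping already carries a non-null label for the county it belongs to. Relation IDs and node IDs are kept in separate opt-out sets because OSM numbers relations and nodes independently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Illustrative sketch only; all names in this class are hypothetical. */
final class MappingBatcherSketch {

    /** One reviewed link between an admin relation and a place node. */
    record Mapping(long relationId, long placeId, String name, String county) {}

    /**
     * Drops mappings whose relation or node was opted out, then groups the
     * rest by county so each group can be uploaded as its own changeset.
     * Assumes county is never null (TreeMap rejects null keys).
     */
    static Map<String, List<Mapping>> batchByCounty(
            List<Mapping> mappings,
            Set<Long> optedOutRelations,
            Set<Long> optedOutNodes) {
        Map<String, List<Mapping>> batches = new TreeMap<>();
        for (Mapping m : mappings) {
            boolean optedOut = optedOutRelations.contains(m.relationId())
                    || optedOutNodes.contains(m.placeId());
            if (optedOut) {
                continue; // skip anything a mapper asked to leave untouched
            }
            batches.computeIfAbsent(m.county(), k -> new ArrayList<>()).add(m);
        }
        return batches;
    }

    public static void main(String[] args) {
        // Made-up IDs and names, for demonstration only.
        List<Mapping> reviewed = List.of(
                new Mapping(1L, 10L, "A", "County X"),
                new Mapping(2L, 20L, "B", "County X"),
                new Mapping(3L, 30L, "C", "County Y"));
        Map<String, List<Mapping>> batches =
                batchByCounty(reviewed, Set.of(2L), Set.of());
        batches.forEach((county, batch) ->
                System.out.println(county + ": " + batch.size() + " mapping(s)"));
    }
}
```

Grouping before upload keeps each changeset small and regional, which makes a single batch easier to review or revert.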
The solution is implemented in Java. I'm using the following tools and libs to solve the problem:
- Osmium to extract relevant admin relations and place nodes to separate files
- osm4j to read place nodes
- index all place nodes in an R-tree (JTS STRtree)
- use osm4j to read admin relations and convert them to JTS polygons
- for each imported polygon:
  - if the admin relation has admin_level < 6, ignore it
  - if the admin relation already has an admin_centre member, ignore it
  - query the R-tree for all covered place nodes
  - search for a place node with the same name as the admin relation
- write a table with mapped IDs of admin relations and place nodes
- TODO: create changesets
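The per-polygon rules above can be sketched in plain Java as follows. This is a simplified illustration under stated assumptions, not the published resolver: it has no osm4j or JTS dependency, replaces the R-tree query with a linear scan, reduces each admin boundary to a single outer ring without holes, and uses a ray-casting test as a stand-in for a JTS polygon check. All type and method names are hypothetical. Names are compared for exact equality and the first match wins; how the real resolver treats several same-named candidates is not specified here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Illustrative sketch only; all names in this class are hypothetical. */
final class AdminCentreMatcherSketch {

    /** A place node reduced to the fields the matching needs. */
    record Place(long id, String name, double lon, double lat) {}

    /** An admin relation reduced to one outer ring of {lon, lat} pairs. */
    record Admin(long id, String name, int adminLevel,
                 boolean hasAdminCentre, double[][] ring) {}

    /** One resolved link, i.e. one row of the output table. */
    record Mapping(long relationId, long placeId, String name) {}

    /** Ray-casting point-in-polygon test (stand-in for a JTS check). */
    static boolean contains(double[][] ring, double x, double y) {
        boolean inside = false;
        for (int i = 0, j = ring.length - 1; i < ring.length; j = i++) {
            double xi = ring[i][0], yi = ring[i][1];
            double xj = ring[j][0], yj = ring[j][1];
            // Does the edge (j -> i) cross the horizontal ray going right from (x, y)?
            boolean crosses = (yi > y) != (yj > y)
                    && x < (xj - xi) * (y - yi) / (yj - yi) + xi;
            if (crosses) {
                inside = !inside;
            }
        }
        return inside;
    }

    /** Applies the skip rules, then looks for a same-named place inside the ring. */
    static Optional<Mapping> resolve(Admin admin, List<Place> places) {
        if (admin.adminLevel() < 6 || admin.hasAdminCentre() || admin.name() == null) {
            return Optional.empty();
        }
        for (Place p : places) { // linear scan; the real pipeline queries an R-tree
            if (admin.name().equals(p.name())
                    && contains(admin.ring(), p.lon(), p.lat())) {
                return Optional.of(new Mapping(admin.id(), p.id(), p.name()));
            }
        }
        return Optional.empty();
    }

    /** Resolves every admin relation and collects the successful mappings. */
    static List<Mapping> resolveAll(List<Admin> admins, List<Place> places) {
        List<Mapping> result = new ArrayList<>();
        for (Admin a : admins) {
            resolve(a, places).ifPresent(result::add);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up coordinates and IDs, for demonstration only.
        double[][] square = {{0, 0}, {10, 0}, {10, 10}, {0, 10}};
        List<Place> places = List.of(
                new Place(1L, "Foo", 20, 20),  // right name, outside the ring
                new Place(2L, "Foo", 5, 5),    // right name, inside the ring
                new Place(3L, "Bar", 5, 5));   // inside, wrong name
        List<Admin> admins = List.of(new Admin(100L, "Foo", 8, false, square));
        for (Mapping m : resolveAll(admins, places)) {
            System.out.println(m.relationId() + "\t" + m.placeId() + "\t" + m.name());
        }
    }
}
```

With tens of thousands of place nodes a linear scan per polygon is slow. That is why the pipeline indexes the nodes in an R-tree first and runs the exact polygon test only on the candidates the index returns.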
A 1000-line sample of resolved admin_centres can be found here: admins-resolved-germany-sample.tsv
The full result for Germany is here
|relationID||placeID||name||relation link||place link|