OSM - GWR Street and Place Names Comparison
http://qa.poole.ch/ch-roads/ shows a comparison of named features (street and place names) in OSM with a list generated by the Federal Statistical Office from the Eidg. Gebäude- und Wohnungsregister (GWR). This gives a broad completeness metric. The GWR contains information on all houses and apartments in Switzerland and is updated regularly. OSM will typically contain more named features than the GWR list, but it is fair to say that the GWR contains essentially all inhabited places that you would want to navigate to, georeference or similar.
- while permission has been obtained to use the street/place list, the GWR address data was originally not available for inclusion in OSM for legal reasons. This has since changed, please see Switzerland/GWR
- the numbers take the following OSM keys into account: name, name:de, name:fr, name:it, name:rm, alt_name, official_name, short_name, name:left and name:right on ways and polygons with a highway tag and on nodes and polygons with a place tag. Named landuse features are currently not used.
- please do not simply change the OSM name tag to reflect the GWR value. The OSM name tag should reflect the unabbreviated signed name of the feature and that should have been surveyed on the ground. In case of conflicts use the alt_name and official_name tags. Most of the time such conflicts are due to punctuation and capitalization differences.
- please do not "googlefy" the OSM data: if a name is actually the name of an area, a house or similar, add a node or a polygon. DO NOT add the name to nearby roads (see the blog post on sosm.ch).
- do not guess what name a feature should have from the list, survey the location! The noname layer on http://qa.poole.ch/ can help to find likely locations for missing roads.
- the "Typos" column counts the number of near matches (compared in lower case) with a Levenshtein distance of at most one. The "Approx. Matches" column in the per-municipality pages uses a distance of at most 3 (the names with a distance of one are shown in red).
- the table is sortable by the numeric values.
- the match percentage takes into account "exact matches" (name and type of feature) as well as matches with the same name but a different feature type
- the data is updated daily
- the GWR supports a geometry type for the roads, which can be "road", "point" or "area". Unfortunately this value can also be empty (mapped to "unknown") or set to "none". How this is handled seems to depend very much on the local municipality.
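The near-match logic described in the notes above can be illustrated with a small Python sketch. It is not the code behind the QA site: the function names `levenshtein` and `classify` are made up for this example, and it assumes that an "exact" match means case-sensitive string equality, while the "Typos" (distance at most 1) and "Approx. Matches" (distance at most 3) thresholds are taken from the description and applied to lower-cased names.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def classify(gwr_name: str, osm_names: list[str]) -> str:
    """Classify one GWR name against the OSM names of a municipality.

    Hypothetical helper: returns 'exact', 'typo', 'approx' or 'none'.
    Assumption: 'exact' is case-sensitive equality; near matches are
    computed on lower-cased names.
    """
    if gwr_name in osm_names:
        return "exact"
    target = gwr_name.lower()
    best = min((levenshtein(target, n.lower()) for n in osm_names),
               default=None)
    if best is None:
        return "none"
    if best <= 1:
        return "typo"    # counted in the "Typos" column
    if best <= 3:
        return "approx"  # listed under "Approx. Matches"
    return "none"
```

With these assumptions a pure punctuation or capitalization difference such as "Dorfstrasse" versus "Dorf-Strasse" ends up as a typo candidate, which fits the remark that most conflicts are due to punctuation and capitalization.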
Status & News
2017-11-28 Updated GWR data to 2017-09 dataset. Note it seems as if there are no more entries with "Unknown" status.
2015-11-28 Updated GWR data to 2015-11 dataset.
2015-01-04 Updated GWR data to 2014-12-01 dataset.
2013-12-27 Updated GWR data to 2013-12-01 dataset. Changed logic for counting roads, see the blog article on sosm.ch
2013-06-25 Updated GWR data to 2013-06-01 dataset. Not much movement, however there seems to be a negative trend in that the number of roads that are classified as roads seems to be decreasing and in some cases they are obviously being reclassified as types "none" or "unknown" (example Bözberg).
2013-05-03 Slight change in logic: count non-exact matches against the number of road names found. This increased the number of GWR road names found by roughly 150. A lot of these are correctly not roads in OSM and are misclassified in GWR.
2013-03-23 Added support for named junction nodes.
2013-03-21 Added short_name support.
2013-03-02 Some municipalities have all their forest roads and similar in the GWR list. This caused the numbers to look worse than they really were in some places. In some cases, for example Hedingen, substantially more forest roads were counted than "real" ones. I'm now suppressing such roads, which has lowered the overall object count by roughly 7'500 (of these a bit over 1'000 are named in OSM, so in some places the match percentages may actually go down).
2013-02-06 Started following the updates to the municipality boundaries, see Switzerland/2013_Municipiality_Mergers
2013-01-12 Switched to using boundary polygons that have been expanded by 20 m (using ST_Buffer). This addresses the issue of roads where only one side (just the houses) is in a municipality and the centre line is in the neighbouring municipality. This increased the match count by roughly a further 230 roads.
2013-01-09 Updated GWR list to 2012-12-01 data, only small changes in results.
2012-11-18 Table on the per municipality page is now sortable.
2012-11-13 Clicking on the "m" in the main list after the municipality name will open a window with the noname map centered on the place in question.
2012-11-12 Added maps.
2012-11-12 Update: added percentage of exact road matches to table. Added PLZ6 (4-digit Swiss post code plus 2-digit extension) to per-municipality files; note there are some obvious errors in the GWR data with respect to the post codes. The lower-case version of the names is now used for calculating the Levenshtein distance. This increases the number of typo candidates, but seems to make more sense.
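The effect of the 2013-01-12 change (municipality boundaries expanded by 20 m) can be illustrated in plain Python. The site itself uses PostGIS ST_Buffer; the sketch below is only an approximation of the idea for a single point: a point lies in the expanded polygon if it is inside the original polygon or within 20 m of its outline. It assumes a simple polygon without holes and coordinates in a projected system with metre units; all function names are hypothetical.

```python
import math

Point = tuple[float, float]


def point_in_polygon(p: Point, ring: list[Point]) -> bool:
    """Ray-casting test for a simple polygon given as a list of vertices."""
    x, y = p
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def distance_to_segment(p: Point, a: Point, b: Point) -> float:
    """Shortest distance from point p to the segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def within_buffered(p: Point, ring: list[Point], buffer_m: float = 20.0) -> bool:
    """True if p is inside the polygon or at most buffer_m from its outline."""
    if point_in_polygon(p, ring):
        return True
    n = len(ring)
    return any(distance_to_segment(p, ring[i], ring[(i + 1) % n]) <= buffer_m
               for i in range(n))
```

For a 100 m square municipality, a road centre line point 10 m outside the boundary is now attributed to it, while a point 25 m outside is not.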