Import/Catalogue/KMS

Example of housenumbers on map from KMS data set

The KMS import is a one-time import of nodes with house numbers for Denmark. Import-specific information is tagged with the kms: prefix.

Source

Screenshot of original page with statement of free usage

The data originates from KMS (Kort & Matrikelstyrelsen), the Danish National Survey and Cadastre. The data was available from the KMS web site.

The data is from 2002, so it contains obsolete information about the municipalities and counties that existed before the municipal reform of 2007. Furthermore, the data does not contain the new zip codes that have since been created to resolve ambiguous addresses.

Data

The data was split into several data sets covering different counties. Each set contained the following files:

  • vejnavn.txt
  • altvej.txt
  • kommune.txt
  • post.txt
  • sogn.txt
  • adresser.txt
  • amt.txt

Coordinates have been converted from UTM32 to latitude/longitude.
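
The conversion tool used for the import is not documented here. As a rough illustration only, the sketch below applies the standard series formulas for the inverse Transverse Mercator projection to UTM zone 32 coordinates. It assumes the WGS84 ellipsoid and the northern hemisphere; the datum of the original KMS files is not stated on this page, so results may need a datum shift. The function name utm32_to_latlon is made up for this example.

```python
import math

# WGS84 ellipsoid constants (an assumption; the KMS datum is not stated).
A = 6378137.0                    # semi-major axis in metres
F = 1 / 298.257223563            # flattening
E2 = F * (2 - F)                 # first eccentricity squared
EP2 = E2 / (1 - E2)              # second eccentricity squared
K0 = 0.9996                      # UTM scale factor on the central meridian
LON0_ZONE32 = math.radians(9.0)  # central meridian of UTM zone 32
FALSE_EASTING = 500000.0


def utm32_to_latlon(easting, northing):
    """Convert UTM zone 32 (northern hemisphere) metres to (lat, lon) in degrees."""
    x = easting - FALSE_EASTING
    m = northing / K0
    mu = m / (A * (1 - E2 / 4 - 3 * E2**2 / 64 - 5 * E2**3 / 256))
    e1 = (1 - math.sqrt(1 - E2)) / (1 + math.sqrt(1 - E2))

    # Footpoint latitude: the latitude on the central meridian with the same northing.
    phi1 = (mu
            + (3 * e1 / 2 - 27 * e1**3 / 32) * math.sin(2 * mu)
            + (21 * e1**2 / 16 - 55 * e1**4 / 32) * math.sin(4 * mu)
            + (151 * e1**3 / 96) * math.sin(6 * mu)
            + (1097 * e1**4 / 512) * math.sin(8 * mu))

    sin1, cos1, tan1 = math.sin(phi1), math.cos(phi1), math.tan(phi1)
    c1 = EP2 * cos1**2
    t1 = tan1**2
    n1 = A / math.sqrt(1 - E2 * sin1**2)               # prime vertical radius
    r1 = A * (1 - E2) / (1 - E2 * sin1**2) ** 1.5      # meridional radius
    d = x / (n1 * K0)

    lat = phi1 - (n1 * tan1 / r1) * (
        d**2 / 2
        - (5 + 3 * t1 + 10 * c1 - 4 * c1**2 - 9 * EP2) * d**4 / 24
        + (61 + 90 * t1 + 298 * c1 + 45 * t1**2 - 252 * EP2 - 3 * c1**2) * d**6 / 720)
    lon = LON0_ZONE32 + (
        d
        - (1 + 2 * t1 + c1) * d**3 / 6
        + (5 - 2 * c1 + 28 * t1 - 3 * c1**2 + 8 * EP2 + 24 * t1**2) * d**5 / 120) / cos1
    return math.degrees(lat), math.degrees(lon)
```

A point with easting 500000 lies on the central meridian of the zone, so it should come out at 9 degrees east, which is a quick sanity check for any implementation.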

Import

Peter Brodersen (User:Findvej) performed the import. The data was converted to the JOSM file format and uploaded using the Bulk_import.pl tool.

The imported data uses the well-known tags addr:housenumber, addr:street, addr:postcode and addr:city.

All information from the KMS data set has been added in kms:-prefixed tags to preserve the original information. This includes the name of the local city, the (pre-2007) municipality, the (pre-2007) county and the parish (Danish: Sogn), as well as the former IDs of the street, municipality, county and parish.
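
To make the tagging scheme concrete, here is a minimal sketch of how one address could be written as a node element in a JOSM-style file. The addr: keys are the ones named above. The kms: key names (parish_name, street_id), the helper build_address_node and all values are hypothetical placeholders, since the exact kms: keys are not listed on this page.

```python
import xml.etree.ElementTree as ET


def build_address_node(node_id, lat, lon, addr, kms):
    """Return a <node> element carrying addr:* tags and kms:* tags."""
    node = ET.Element("node", id=str(node_id), lat=f"{lat:.7f}", lon=f"{lon:.7f}")
    for key, value in addr.items():
        ET.SubElement(node, "tag", k=f"addr:{key}", v=value)
    for key, value in kms.items():
        ET.SubElement(node, "tag", k=f"kms:{key}", v=value)
    return node


# A negative id marks an object that has not been uploaded yet in JOSM files.
node = build_address_node(
    -1, 55.676, 12.568,
    addr={"housenumber": "12", "street": "Eksempelvej",
          "postcode": "1234", "city": "Eksempelby"},
    kms={"parish_name": "Eksempel Sogn", "street_id": "0042"},  # hypothetical keys
)
print(ET.tostring(node, encoding="unicode"))
```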

Import progress

The import began on 4 February 2009 at 23:00 (CEST) and ended on 7 February 2009 at 17:00 (CEST).

Import progress (bugfix)

131,344 addresses were missing from the first import. These were addresses without a known parish name in the KMS data set (mainly in the middle of Jutland), which were not included in the first database extract.

These were inserted early on 7 February between 02:15 and 06:15 (CEST).

The import is now complete.

Visibility and usage

The housenumber values are rendered in both the Mapnik and Osmarender layers at the highest zoom level:

Examples of individual nodes:

Unnamed streets

The result can be used to derive street names for unnamed streets when all the nodes along a street have the same addr:street value:

Where the Streets Have No Name (in our data)
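
The unanimity rule for unnamed streets can be sketched as follows. The function infer_street_name is a made-up helper, and deciding which address nodes count as lying "along" a given way is left to the caller; this only covers the final check.

```python
def infer_street_name(addr_streets):
    """Return the shared addr:street value, or None if the values disagree or are missing.

    addr_streets is an iterable of addr:street values taken from the
    address nodes judged to lie along one unnamed way.
    """
    names = {name.strip() for name in addr_streets if name and name.strip()}
    return names.pop() if len(names) == 1 else None
```

Returning None on any disagreement is deliberately conservative: a street with mixed values probably needs a manual look rather than an automatic name.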


Unmarked streets

The result can be used to locate areas where streets are missing:

Areas with house numbers but no streets
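
One simple way to find such areas automatically is to bucket coordinates into a coarse grid and report cells that hold many address nodes but no street nodes. This is only a sketch: the cell size, the threshold and the function names are arbitrary assumptions, and a real check would work on way geometry rather than on points.

```python
from collections import Counter


def cell_of(lat, lon, cell_deg=0.01):
    """Map a coordinate to a grid cell index (cell size in degrees is an arbitrary choice)."""
    return (int(lat // cell_deg), int(lon // cell_deg))


def cells_missing_streets(address_points, street_points, min_addresses=5, cell_deg=0.01):
    """Return grid cells with at least min_addresses address nodes and no street nodes.

    Both inputs are iterables of (lat, lon) tuples.
    """
    street_cells = {cell_of(lat, lon, cell_deg) for lat, lon in street_points}
    counts = Counter(cell_of(lat, lon, cell_deg) for lat, lon in address_points)
    return sorted(
        cell for cell, n in counts.items()
        if n >= min_addresses and cell not in street_cells
    )
```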


Bugs and caveats

Overlapping nodes

Every address is created as a node, and some addresses have the same coordinates. However, the Address comment mentions that multiple house numbers should be entered as one value, separated by commas.

This is not always possible, as some addresses with the same coordinates are associated with different street names.

The JOSM Validator might be picky about this. Its conflict resolution does not seem to offer a way to aggregate different values into a single comma-separated value.
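
The aggregation described in this section could be done outside JOSM before upload. The sketch below is not part of the import; it groups addresses by coordinates and street name and joins the house numbers with commas, leaving addresses that share coordinates but not a street name as separate nodes. The dict keys and the function name are assumptions made for the example.

```python
from collections import defaultdict


def merge_overlapping(addresses):
    """Group addresses by (lat, lon, street) and join their house numbers with commas.

    Each address is a dict with 'lat', 'lon', 'street' and 'housenumber' keys.
    """
    groups = defaultdict(list)
    for a in addresses:
        groups[(a["lat"], a["lon"], a["street"])].append(a["housenumber"])
    return [
        {"lat": lat, "lon": lon, "street": street, "housenumber": ",".join(numbers)}
        for (lat, lon, street), numbers in groups.items()
    ]
```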

Cascaded nodes

Some addresses around the same block are placed very close to each other in the KMS data. The grid-like positions of the numbers result from pre-adjustment, leaving many house numbers in the same area with slightly displaced positions.

This shows up as large chunks in the rendering and might look spurious, but it simply reflects the original KMS data. It has not really been a concern earlier (apart from the lack of precision), as nodes have typically been searched and viewed individually rather than presented all together.

Recent address data (OSAK) are more detailed and do not contain these types of chunks.

Cascaded nodes resulting in blocks of housenumbers


Existing addresses

Existing nodes with address information are not taken into account.

A quick scan of denmark.osm before the import revealed 475 entries where addr:housenumber is present. The import will therefore result in duplicated information. A quick solution would be to wipe all existing housenumber tags. However, the manually added information might be more up to date (with newer addresses) or more precisely located.

A better approach would be to search for duplicates after the import, based on street, housenumber and perhaps postcode or proximity, and then remove one of the entries. This search might also reveal how many of the existing nodes are duplicates, giving a better idea of how to proceed.
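
A duplicate search along these lines might look like the sketch below. It matches on normalised street and housenumber and then applies a proximity limit. The 100 m threshold, the dict layout and the function names are assumptions for illustration, and the distance uses a flat-earth approximation that is only adequate over short ranges.

```python
import math


def distance_m(lat1, lon1, lat2, lon2):
    """Approximate distance in metres (equirectangular approximation)."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    dx = math.radians(lon2 - lon1) * math.cos(mean_lat)
    dy = math.radians(lat2 - lat1)
    return 6371000.0 * math.hypot(dx, dy)


def normalise(value):
    """Lower-case and collapse whitespace so trivial spelling differences still match."""
    return " ".join(value.lower().split())


def find_duplicates(existing, imported, max_dist_m=100.0):
    """Return (existing, imported) pairs with equal street and housenumber that lie close together.

    Each node is a dict with 'lat', 'lon', 'street' and 'housenumber' keys.
    """
    index = {}
    for node in imported:
        key = (normalise(node["street"]), normalise(node["housenumber"]))
        index.setdefault(key, []).append(node)

    pairs = []
    for old in existing:
        key = (normalise(old["street"]), normalise(old["housenumber"]))
        for new in index.get(key, []):
            if distance_m(old["lat"], old["lon"], new["lat"], new["lon"]) <= max_dist_m:
                pairs.append((old, new))
    return pairs
```

The resulting pairs are candidates for review rather than automatic deletion, since the manually added node may be the more accurate one.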

Large data set

The data for Denmark is now about 10 times as large as before, meaning that nodes with addr:housenumber make up about 90% of the total content. This could make working with editors and data sets slower.

The XAPI supports tag predicates, giving the ability to fetch only specific tags in the result set. However, it does not yet support the exclusion of specific tags.

If this feature becomes available (and editors support it), it could be an easy way of cutting the content and transfer size for day-to-day operations that do not require the KMS data set.
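
For reference, a positive tag predicate request can be assembled as in the sketch below. It follows the bracketed predicate form used by XAPI ([key=value] plus [bbox=left,bottom,right,top]), but the base URL is a placeholder, the helper name is made up, and the exact syntax should be checked against the XAPI documentation before use. There is no equivalent form for excluding a tag, which is the limitation described above.

```python
def xapi_node_url(base_url, key, value="*", bbox=None):
    """Build an XAPI-style URL for nodes matching one tag predicate.

    bbox, if given, is (left, bottom, right, top) in degrees.
    """
    url = f"{base_url}/node[{key}={value}]"
    if bbox is not None:
        left, bottom, right, top = bbox
        url += f"[bbox={left},{bottom},{right},{top}]"
    return url


# Placeholder host; substitute a real XAPI endpoint.
print(xapi_node_url("http://xapi.example.org/api/0.6", "addr:housenumber",
                    bbox=(12.4, 55.6, 12.7, 55.8)))
```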

Links