Data is sourced from the LINZ simple street address layer, NZ Street Address.
Conversion and Tagging
Subsets of addresses to be processed differently.
address_type: Water | Road (initially will ignore water)
town_city + suburb_locality: Suburb+City | Locality | Town
We won't use the _ascii fields.
Tags in generated osmchange file
|LINZ||OSM||Comment|
|address_id||LINZ:address_id or ref:linz:address_id ref=*||Explicit connection to source data, to be used for maintenance|
|full_road_name||addr:street=*||E.g. "Open Map Street"|
|address_number + address_number_suffix||addr:housenumber=*||E.g. "2" or "3A"|
|unit_value||addr:unit=*||E.g. "A" or "1" or "1-198" only if present in source data (about 150k addresses)|
|<coordinates>||<location>||Datum converted NZ to OSM|
|suburb_locality||addr:suburb=* or addr:hamlet=*||suburb when town_city also present, otherwise hamlet.|
|town_city||addr:city=*||When present in source, suburb_locality is always present|
NOTE1: the inclusion of addr:city, addr:hamlet, addr:suburb keys is mainly for the mapper to verify against the underlying map.
When this information is already present in e.g. place=* on an area (very likely in urban areas), the redundant information will be removed by the mapper doing the upload. The mapper may also choose to create or adjust a suburb or place boundary or POI manually.
NOTE2: About 5k addresses are "ranged", i.e. they have an "address_number_high" in the source database. A few of these also have unit numbers, so the full address in the source data looks like "4B/22-26 High Street". This import proposal uses only "address_number" and ignores "address_number_high" as redundant.
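The mapping in the table and notes above could be sketched roughly as follows. This is only an illustration, not the import script: the function name `linz_to_osm_tags` is hypothetical, the input is assumed to be one LINZ row as a dict keyed by column name, and `ref:linz:address_id` is picked from the two candidate key names listed in the table.

```python
def linz_to_osm_tags(row):
    """Map one LINZ address row (a dict keyed by column name) to OSM tags."""
    suffix = row.get("address_number_suffix") or ""
    tags = {
        # Assumption: one of the two candidate key names from the table.
        "ref:linz:address_id": str(row["address_id"]),
        "addr:street": row["full_road_name"],
        # address_number_high is deliberately ignored (see NOTE2).
        "addr:housenumber": f'{row["address_number"]}{suffix}',
    }
    if row.get("unit_value"):
        tags["addr:unit"] = row["unit_value"]
    if row.get("town_city"):
        # Suburb + city case: suburb_locality is a suburb.
        tags["addr:city"] = row["town_city"]
        tags["addr:suburb"] = row["suburb_locality"]
    else:
        # Locality only: suburb_locality becomes a hamlet.
        tags["addr:hamlet"] = row["suburb_locality"]
    return tags
```

For the ranged example "4B/22-26 High Street" this yields addr:unit=4B and addr:housenumber=22.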
Keys on each changeset
- source=* https://data.linz.govt.nz/layer/3353-nz-street-address
- source:revision=* The linz dataset revision number, e.g. "43"
- attribution=* https://wiki.openstreetmap.org/wiki/Contributors#LINZ
- comment=* e.g. "LINZ addresses for <suburb> <city>"
- Obtain all OSM items that contain addr:housenumber
- Find centroid of ways (buildings) to use as position for proximity testing.
- Generate table of node/way id, obj type, position, addr:street, addr:housenumber
- Convert LINZ positions to WGS84 SRID=4326 for comparison with OSM positions
- Match with LINZ data on proximity + housenumber (+ street if present).
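The matching step in the list above might look something like the sketch below. It assumes both datasets have already been reduced to dicts with `housenumber`, `street`, `lat` and `lon` keys in WGS84 (for ways, the centroid), and it uses a 50 m threshold that is purely a placeholder, not a figure from this proposal. The function names are hypothetical. The brute-force comparison per house number is only suitable for small tests; a real run over 1.9 million addresses would want a spatial index.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def find_duplicates(linz_rows, osm_items, max_dist_m=50.0):
    """Return (address_id, osm_type, osm_id, distance_m) for likely duplicates."""
    # Index OSM items by normalised house number to limit comparisons.
    by_number = {}
    for item in osm_items:
        by_number.setdefault(item["housenumber"].strip().upper(), []).append(item)

    matches = []
    for row in linz_rows:
        number = row["housenumber"].strip().upper()
        for item in by_number.get(number, []):
            # Compare street names only when the OSM item has one.
            if item.get("street") and item["street"].lower() != row["street"].lower():
                continue
            d = haversine_m(row["lat"], row["lon"], item["lat"], item["lon"])
            if d <= max_dist_m:
                matches.append((row["address_id"], item["osm_type"], item["osm_id"], round(d, 1)))
    return matches
```

Items with the right number but a mismatching street, or beyond the threshold, are left out here; those are the cases that would go to manual review.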
Other oddities, such as relations and interpolations, also occur. Initially, relations can be ignored. Members that are points or polygons with addr:housenumber can be identified as duplicates by number and proximity to a LINZ address with the same number.
What to do with the duplicates?
Eventually, all items that are real duplicates would have LINZ id attributes added.
Nodes with only addr:housenumber that aren't part of a relation - add addr:street
Addresses that seem to be duplicates, but are far away from the LINZ point location would be reviewed by a person. E.g. sometimes houses get tagged with the correct number, but the wrong street name.
Polygons (i.e. buildings) need discussion. EliotB's opinion is to add the address as a node, which applies to the location independent of what is built on it: the building can be demolished, but the address remains valid.
An idea of how many duplicates there might be:
# Get NZ items that contain addr:housenumber
> wget -O nz_addr.osm "http://www.overpass-api.de/api/xapi_meta?*[addr:housenumber=*][bbox=157.5,-59.0,179.9,-25.5]"
> spatialite_osm_raw -d nz_address-osm_raw.spatialite -o nz_addr.osm
Analysis : nodes 12727, ways 33009, relations 22
Approximately 2% of NZ addresses are already in OSM in some form (40K/2M). The remaining 98% will be new.
In the first pass, potential duplicates will be identified, and saved as a separate dataset for later processing. The remaining non-duplicates will be uploaded in batches.
Batch membership will be determined by shared town_city and suburb_locality values.
A quick delve into the data gives
- Water addresses 160, Road 1.9 million
- Localities (town_city = NULL) 1930 distinct. 1 to 2800 items per locality. 25 localities with > 1000 addresses
- town + suburb 1182 distinct, 258 of which have town=suburb. 645 have <1K addresses, 907 < 2K, 6 > 10K
Using the above distinction would give about 3100 separate batches.
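The batching described above could be sketched as a simple grouping on the (town_city, suburb_locality) pair. The function name `batch_addresses` is hypothetical, rows are assumed to be dicts keyed by LINZ column name, and water addresses are skipped as stated earlier.

```python
from collections import defaultdict

def batch_addresses(rows):
    """Group LINZ rows into batches keyed by (town_city, suburb_locality)."""
    batches = defaultdict(list)
    for row in rows:
        if row.get("address_type") == "Water":
            continue  # water addresses are ignored initially
        # Localities have no town_city, so their key is (None, locality).
        key = (row.get("town_city") or None, row["suburb_locality"])
        batches[key].append(row)
    return dict(batches)
```

With the counts quoted above, this would produce roughly 1930 locality batches plus 1182 town + suburb batches.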
An osmchange file will be generated for each place. After review, each will be uploaded as a separate changeset.
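As a rough illustration of what generating one such file could involve, the sketch below builds an osmChange document with one new node per address. The function name and the `generator` string are made up for this example, each address is assumed to be a dict with `lat`, `lon` and a ready-made `tags` dict, and the changeset id that the API requires on upload is left out because it is only known once the changeset has been opened.

```python
import xml.etree.ElementTree as ET

def build_osmchange(addresses, generator="linz-address-import-sketch"):
    """Return an osmChange XML string that creates one node per address."""
    root = ET.Element("osmChange", version="0.6", generator=generator)
    create = ET.SubElement(root, "create")
    for i, addr in enumerate(addresses, start=1):
        # New elements use negative placeholder ids; the server assigns real ones.
        node = ET.SubElement(
            create, "node",
            id=str(-i),
            lat=f'{addr["lat"]:.7f}',
            lon=f'{addr["lon"]:.7f}',
        )
        for key, value in sorted(addr["tags"].items()):
            ET.SubElement(node, "tag", k=key, v=value)
    return ET.tostring(root, encoding="unicode")
```

The resulting string can be written to a file per batch and opened in an editor for review before upload.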
Record the version of the LINZ database used to generate each import.
Periodically retrieve a new version of the LINZ dataset, and obtain a list of additions, deletions and changes w.r.t. the previous set. The list would be checked manually against OSM (potentially a bot could do the checking; this can be investigated after the manual process has been trialled).
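One way the comparison between two dataset versions might be computed is sketched below. It assumes each version has been loaded as a dict mapping address_id to change_id, and that change_id differs whenever an address has been modified (the source describes it as the identifier for the address version, but that behaviour has not been verified here). The function name is hypothetical.

```python
def diff_revisions(old, new):
    """Compare two {address_id: change_id} dicts from successive LINZ revisions.

    Returns three sorted lists of address_id: added, deleted, changed.
    """
    added = sorted(new.keys() - old.keys())
    deleted = sorted(old.keys() - new.keys())
    # Assumption: a different change_id means the address was modified.
    changed = sorted(a for a in old.keys() & new.keys() if old[a] != new[a])
    return added, deleted, changed
```

The three lists could then be looked up in OSM via the stored LINZ address id for the manual check.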
How to find new things entered on OSM side? Addressed items lacking linz:address_id would be candidates.
- Thread on imports list and maybe others.
- Thread in NZ OpenGIS group and maybe others.
- Real time chat
- See Addresses#Denmark and Addresses#Norway for examples of other countries with full scale address 'import'.
Details of source database
|Column Name||Data Type||Length||Precision||Scale||Example||Description|
|address_id||integer|| ||32||0||505588||AIMS unique identifier for an address.|
|change_id||integer|| ||32||0||1304726||AIMS unique identifier for the address version.|
|address_type||varchar||20|| || ||Road||The type of address. Includes: Road and Water.|
|unit_value||varchar||70|| || || ||Alpha numeric value for a unit|
|address_number||integer|| ||32||0||1||Address number|
|address_number_suffix||varchar|| || || ||A||Alpha numeric characters that may follow the address number.|
|address_number_high||integer|| ||32||0|| ||High address number of a ranged address.|
|water_route_name||varchar||100|| || || ||Name of the beach the water address relates to. Currently this contains the captured segment of coastline. This will be blank for ROAD addresses.|
|water_name||varchar||100|| || || ||Water body the address relates to. This will be blank for ROAD addresses.|
|suburb_locality||varchar||80|| || ||Dannemora||Suburb/Locality from the NZ Localities (NZ Fire Service owned dataset).|
|town_city||varchar||80|| || ||Auckland||Town/City from the NZ Localities (NZ Fire Service owned dataset).|
|full_address_number||varchar||100|| || ||1A||All number components concatenated for an address.|
|full_road_name||varchar||250|| || ||Joe Bloggs Road||All road name components concatenated for an address. This has been derived from the ‘Landonline: Roads’ data and will move to using the new ‘Roads’ data tables when they become available.|
|full_address||varchar||400|| || ||1A Joe Bloggs Road, Dannemora, Auckland||All address components concatenated for an address.|
|road_section_id||integer|| ||32||0||199943||Landonline Road Centreline ID (RCL_ID).|
|gd2000_xcoord||numeric|| ||12||8||174.9255518167||NZGD2000 X-coordinate for the address in metres.|
|gd2000_ycoord||numeric|| ||12||8||-36.9246773||NZGD2000 Y-coordinate for the address in metres.|
|shape||geometry|| || || ||<geometry>||Spatial geometry for the point in long/lat GD2000 EPSG 4167.|
|ascii variants|| || || || || ||not going to use|