Brisbane City Council Property and Address Import

This page describes the plan for importing the Brisbane City Council Property and Address dataset, which contains address data covering Brisbane City in QLD, Australia. As of 2019-02-16 the import is at its first stage: importing addresses in areas with little existing address data.

Goals

The Brisbane City area is missing a lot of useful address data. Mapping it manually is time-consuming and distracts mappers from their own interests.

Schedule

The data is essentially ready as of 2019-01-16; we hope to have it imported by February 2019.

Import Data

Background

Provide links to your sources.

Data source site: https://www.data.brisbane.qld.gov.au/data/dataset/property-address-data
Data license: CC BY 4.0
Link to permission (if required): Waiver

OSM Data Files

https://github.com/zayuim/osm-misc

Import Type

Identify if this is a one-time or recurring import and whether you'll be doing it with automated scripts, etc.

Identify what method will be used for entering the imported data into the OSM database - e.g. API, JOSM, upload.py, etc.

Data Preparation

Data Reduction & Simplification

Only addresses classified as official will be uploaded. Only addresses which aren't already in OSM will be uploaded.
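The two filters above can be sketched as follows. This is only an illustration: the field names (`status`, `housenumber`, `street`, `suburb`), the value `"official"`, and the idea of comparing on a normalised (house number, street, suburb) key are assumptions, not taken from the dataset or from the actual import.

```python
def normalise(value):
    """Lower-case and collapse whitespace so keys compare reliably."""
    return " ".join(str(value).split()).lower()


def address_key(housenumber, street, suburb):
    """Build a comparison key for one address (hypothetical scheme)."""
    return (normalise(housenumber), normalise(street), normalise(suburb))


def select_for_upload(rows, existing_keys, official_value="official"):
    """Keep rows that are official and whose address is not already in OSM.

    rows          -- iterable of dicts with assumed keys
                     'status', 'housenumber', 'street', 'suburb'
    existing_keys -- set of address_key() tuples built from current OSM data
    """
    selected = []
    for row in rows:
        if normalise(row["status"]) != official_value:
            continue  # not classified as official
        key = address_key(row["housenumber"], row["street"], row["suburb"])
        if key in existing_keys:
            continue  # already mapped in OSM
        selected.append(row)
    return selected
```

Matching on a text key like this will miss addresses that are spelled differently in OSM, so it complements rather than replaces manual review.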

Tagging Plans

Normal 'addr' tags will be used, as well as source tags.

  • addr:housenumber - the house number from the dataset
  • addr:street - the street name and street type from the dataset, merged (e.g. Example Road)
  • addr:suburb - the suburb from the dataset
  • addr:city - always "Brisbane"
  • source - Brisbane City Council
  • source:date - 2018-12-14
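As a rough sketch, the tagging plan above corresponds to a mapping like the one below. The function name and its parameters are hypothetical; only the tag keys and the fixed values for `addr:city`, `source` and `source:date` come from this page.

```python
def build_tags(housenumber, street_name, street_type, suburb):
    """Return the OSM tags for one address, following the plan above.

    The parameter names are assumptions about how the dataset's
    fields might be passed in; they are not the dataset's column names.
    """
    street = f"{street_name} {street_type}".strip()
    return {
        "addr:housenumber": str(housenumber).strip(),
        "addr:street": street,
        "addr:suburb": suburb.strip(),
        "addr:city": "Brisbane",
        "source": "Brisbane City Council",
        "source:date": "2018-12-14",
    }
```

For example, `build_tags("12", "Example", "Road", "Toowong")` gives an `addr:street` of `Example Road`.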

Changeset Tags

Data Transformation

The data was downloaded as CSV. From there, the OSM tags were prepared in LibreOffice Calc, which I also used to change the casing of names. I used a Python script to merge columns in the CSV document.
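The original script is not reproduced here; the following is a minimal sketch of what merging the street columns and fixing the casing could look like in Python alone. The column names `STREET_NAME` and `STREET_TYPE` are assumptions, and `string.capwords` is used as a stand-in for the casing step that was actually done in Calc.

```python
import csv
import io
import string


def merge_street_columns(csv_text, name_col="STREET_NAME", type_col="STREET_TYPE"):
    """Add an 'addr:street' field built from the name and type columns.

    name_col and type_col are hypothetical column names.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        merged = f"{row[name_col]} {row[type_col]}"
        # capwords capitalises after whitespace only, so "JOHN'S" becomes
        # "John's"; it also drops the trailing space left by an empty type.
        row["addr:street"] = string.capwords(merged)
        rows.append(row)
    return rows
```

Simple capitalisation gets names such as "McDonald" wrong (it yields "Mcdonald"), so the result would still need a manual pass.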

In JOSM I deleted any addresses that weren't marked as official, ran the validator, and fixed any issues that arose.

Problems with original data

I have raised these issues through the Brisbane City data feedback page (and have received a response). The problems are:

  • House numbers of "0": many records have a house number of 0, and these don't seem to correspond to anything.
  • Address positions in the Port of Brisbane are not correct; many are on top of each other.
  • Duplicate addresses (where two items have the same house number and street)
  • Addresses in the same position
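The problems listed above can be flagged automatically before review. The sketch below is one way to do it, assuming each record is a dict with hypothetical keys `housenumber`, `street`, `lat` and `lon`; rounding coordinates to 7 decimal places (the precision OSM stores) is also an assumption about what counts as "the same position".

```python
from collections import defaultdict


def find_problems(rows):
    """Group records by the issues listed above.

    rows -- list of dicts with assumed keys
            'housenumber', 'street', 'lat', 'lon'
    """
    zero_numbers = [r for r in rows if str(r["housenumber"]).strip() == "0"]

    by_address = defaultdict(list)   # same house number and street
    by_position = defaultdict(list)  # same coordinates
    for r in rows:
        address = (str(r["housenumber"]).strip(), r["street"].strip().lower())
        position = (round(float(r["lat"]), 7), round(float(r["lon"]), 7))
        by_address[address].append(r)
        by_position[position].append(r)

    return {
        "zero_numbers": zero_numbers,
        "duplicate_addresses": [g for g in by_address.values() if len(g) > 1],
        "shared_positions": [g for g in by_position.values() if len(g) > 1],
    }
```

Each entry in `duplicate_addresses` and `shared_positions` is a group of records, so a reviewer can inspect the whole cluster at once.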

Data Merge Workflow

Team Approach

It will be solo at first, but conflation will be a team effort.

References

Pre-existing address data will be evaluated in the import.

Workflow

Detail the steps you'll take during the actual import.

Prep

  • Use Overpass to grab the current address data.
  • Compare it with the final address data.
  • Move addresses in suburbs that already have data to separate files.
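The first prep step could look roughly like the sketch below, which builds an Overpass QL query for every element carrying `addr:housenumber` inside a bounding box and posts it to the public Overpass instance. The function names are hypothetical, the bounding box must be supplied by the caller, and the query actually used for this import may have differed.

```python
import json
import urllib.parse
import urllib.request

# Public Overpass API instance; any other instance can be substituted.
OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_address_query(south, west, north, east, timeout=180):
    """Return Overpass QL selecting nodes, ways and relations with a house number."""
    bbox = f"{south},{west},{north},{east}"
    return (
        f"[out:json][timeout:{timeout}];"
        f'nwr["addr:housenumber"]({bbox});'
        "out tags center;"
    )


def fetch_addresses(query, url=OVERPASS_URL):
    """Send the query to Overpass and return the list of matching elements."""
    data = urllib.parse.urlencode({"data": query}).encode("utf-8")
    with urllib.request.urlopen(url, data=data) as response:
        return json.load(response)["elements"]
```

`out tags center;` returns the tags plus a single coordinate per element, which is enough for comparing against point addresses. For an area the size of Brisbane it may be kinder to the server to query suburb by suburb.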

First Import: Addresses in areas without existing data

  • With the remaining data, compare against the existing data and choose, case by case, whether to keep the old address or replace it with the new one.
  • After doing this for a large enough area, take a fairly large chunk of addresses in the now-blank spaces and import it part by part.

Second Import: Conflation in areas with many addresses

  • Afterwards, we take the data excluded from the first import and compare it suburb by suburb...

Conflation

Pre-existing address data will be downloaded in a separate layer and compared with the import data. Suburbs which already have data will be excluded from the export and moved to separate .osm files, where the community will carefully conflate them.
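The split described above might be sketched like this. The `suburb` key and the function name are hypothetical, and how a suburb is judged to "already have data" (here, simply appearing in a supplied list) is an assumption.

```python
from collections import defaultdict


def split_by_suburb(import_rows, existing_suburbs):
    """Separate rows ready for import from rows held back for conflation.

    import_rows      -- list of dicts with an assumed 'suburb' key
    existing_suburbs -- names of suburbs that already have address data in OSM
    Returns (ready, held_back), where held_back maps suburb name -> rows,
    so each suburb can be written to its own file.
    """
    known = {s.strip().lower() for s in existing_suburbs}
    ready = []
    held_back = defaultdict(list)
    for row in import_rows:
        suburb = row["suburb"].strip()
        if suburb.lower() in known:
            held_back[suburb].append(row)
        else:
            ready.append(row)
    return ready, dict(held_back)
```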

QA

Auditing 400,000+ addresses will be difficult. By using the JOSM validator and comparing against OSM data, we can catch any breaking errors.

See also

The email to the Imports mailing list was sent on YYYY-MM-DD and can be found in the archives of the mailing list at [1].