Import/Milwaukee County, Wisconsin addresses


Goals

To add the vast majority of addresses in Milwaukee County, WI to OpenStreetMap without creating duplicates. This import is completed.

Progress/Schedule

The import should start in late 2022-early 2023 and last 1-2 months, depending on interest from local mappers.

  • Data processing was done in late 2022.
  • All active local mappers were messaged and asked for comments or concerns and whether they would like to participate (corporate and StreetComplete-only users were not queried).
    • Feedback has been positive, just a comment to be careful when merging in the areas with some addressing already
  • An email was sent to the imports mailing list on 2022-11-13.
    • The QA process was modified based on feedback.
  • The import was started on 2022-11-25 via the account popball-import.
  • The import was finished on 2022-12-06

Import Data

Background

Data source site: https://gis-mclio.opendata.arcgis.com/maps/MCLIO::address-points

Data license: Public Domain (confirmed with the municipality that this is the case)

Type of license: Public Domain

Link to permission: n/a

ODbL Compliance verified: Yes

The data is of rather good quality, and the address points are directly above the buildings they represent. Therefore conflation should yield good results.

Import Type

A one time import that will be completed in many small uploads.

Data Preparation

Data Reduction & Simplification

The data was converted to OSM XML in JOSM using the OpenData plugin.

The following fields were used:

  • HOUSENO
  • HOUSESX
  • DIR
  • STREET
  • PDIR
  • MUNI
  • UNIT (mostly deleted)
  • ZIP_CODE
  • ADDR_STATUS (Only used for filtering)

Fields not relevant to OSM were deleted. These are:

  • OBJECTID
  • SOURCE_OID
  • TAXKEY
  • ALT_ID
  • DATE_CHANGED
  • COMMENT
  • SOURCE
  • SOURCE_DATE
  • SOURCE_ID
  • FULLADDR
  • ADDR_STATUS
  • STREET_LN_OID
  • BLDG_POLY_ID
  • MAILABLE

Using JOSM, the address points were filtered. Specifically, this removed houses without house numbers or without a street specified (house names do not exist in Milwaukee County, at least as part of the official address). In addition, the ADDR_STATUS field was used to detect addresses unwanted for import. These include utility rights of way, freeway rights of way, railroad rights of way, waterways, and vacant lots. Parking lot addresses were also filtered out (these are rarely verifiable on the ground and usually duplicate the addresses of the buildings they serve). This data may be useful in a future import, however.
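The filtering rules above could be sketched in Python roughly as follows. This is only an illustration: the import performed this step in JOSM, and every ADDR_STATUS value shown here (including "ACTIVE") is a hypothetical placeholder, since the county's actual status codes are not listed on this page.

```python
# Hypothetical ADDR_STATUS values; the county's real codes are not listed on this page.
EXCLUDED_STATUSES = {
    "UTILITY ROW", "FREEWAY ROW", "RAILROAD ROW",
    "WATERWAY", "VACANT LOT", "PARKING LOT",
}

def keep_address(row):
    """Return True if an address record should stay in the import set."""
    house_no = str(row.get("HOUSENO") or "").strip()
    street = str(row.get("STREET") or "").strip()
    status = str(row.get("ADDR_STATUS") or "").strip().upper()
    # Drop records without a house number or without a street.
    if not house_no or not street:
        return False
    # Drop records whose status marks them as unwanted for import.
    return status not in EXCLUDED_STATUSES

sample = [
    {"HOUSENO": "123", "STREET": "MAIN", "ADDR_STATUS": "ACTIVE"},
    {"HOUSENO": "", "STREET": "MAIN", "ADDR_STATUS": "ACTIVE"},
    {"HOUSENO": "500", "STREET": "", "ADDR_STATUS": "ACTIVE"},
    {"HOUSENO": "900", "STREET": "OAK", "ADDR_STATUS": "VACANT LOT"},
]
kept = [row for row in sample if keep_address(row)]
```

Only the first sample record survives; the others lack a house number, lack a street, or carry an excluded status.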

Using JOSM and Python scripts, the street types (ST, AVE, etc.), directions (N, S, E, W), and post directions (like directions, but after the street name) were expanded from their abbreviations into their full counterparts (e.g. ST -> Street). The fields were converted from all caps to title case. Then, using a Python script, the street name, type, directions, and post directions were combined to get the addr:street=* field for OSM. Housenumber extensions (HOUSESX) were appended to the house number (123 + A => 123A).
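As a rough sketch of the expansion and combination described above (not the actual import scripts), the Python below shows one way it might look. The lookup tables are deliberately partial, the function names are hypothetical, and passing the street type as a separate argument is an assumption about how the source data is laid out.

```python
# Partial, illustrative lookup tables; a real script would need the full lists.
STREET_TYPES = {"ST": "Street", "AVE": "Avenue", "RD": "Road", "DR": "Drive"}
DIRECTIONS = {"N": "North", "S": "South", "E": "East", "W": "West"}

def title_case(text):
    """Capitalize each word; unlike str.title(), this keeps '76TH' as '76th'."""
    return " ".join(word.capitalize() for word in text.split())

def build_street(direction, street, street_type, post_direction=""):
    """Combine the expanded parts into an addr:street value."""
    parts = [
        DIRECTIONS.get(direction.upper(), title_case(direction)),
        title_case(street),
        STREET_TYPES.get(street_type.upper(), title_case(street_type)),
        DIRECTIONS.get(post_direction.upper(), title_case(post_direction)),
    ]
    # Skip empty parts so a missing direction leaves no stray spaces.
    return " ".join(part for part in parts if part)

def build_housenumber(house_no, suffix=""):
    """Append the extension to the number (123 + A => 123A)."""
    return f"{str(house_no).strip()}{suffix.strip()}"

print(build_street("N", "76TH", "ST"))   # North 76th Street
print(build_housenumber("123", "A"))     # 123A
```

Simple per-word capitalization does not handle names such as "MCKINLEY", so a real script would likely need a list of exceptions.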

Using JOSM, the dataset was split into tracts of approximately 5000 addresses for easier manageability.

Zip codes often contained a Zip+4 code. A dash was added to fit the standard formatting of Zip+4.
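A minimal sketch of the ZIP+4 formatting, assuming the source stores ZIP+4 values as nine digits without a separator (the exact source format is not stated here). The function name is hypothetical.

```python
def format_zip(raw):
    """Format a ZIP code, inserting the dash for ZIP+4 values."""
    text = str(raw).strip()
    digits = "".join(ch for ch in text if ch.isdigit())
    if len(digits) == 9:
        return f"{digits[:5]}-{digits[5:]}"
    if len(digits) == 5:
        return digits
    # Unexpected shape: return unchanged so it can be reviewed by hand.
    return text

print(format_zip("532081234"))  # 53208-1234
print(format_zip("53208"))      # 53208
```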

Due to a bug in the JOSM OpenData import plugin, addresses "stacked" on one point were reduced to only one address in the stack. The additional addresses were added back by reading the CSV directly with a Python script and editing the OSM files. Stacked addresses had their housenumbers concatenated with commas unless they were on different streets, in which case a new address node was added.
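The stacking rule above can be illustrated with a short Python sketch. It is not the script used for the import. The record layout (dicts with lat, lon, street, and housenumber keys) and the function name are assumptions made for the example.

```python
from collections import defaultdict

def merge_stacked(records):
    """Merge addresses sharing a point and a street into one node.

    Addresses on the same point but on different streets stay separate nodes.
    """
    groups = defaultdict(list)
    for rec in records:
        key = (rec["lat"], rec["lon"], rec["street"])
        if rec["housenumber"] not in groups[key]:
            groups[key].append(rec["housenumber"])
    return [
        {
            "lat": lat,
            "lon": lon,
            "addr:street": street,
            "addr:housenumber": ",".join(numbers),
        }
        for (lat, lon, street), numbers in groups.items()
    ]

stacked = [
    {"lat": 43.03, "lon": -87.99, "street": "North 76th Street", "housenumber": "170"},
    {"lat": 43.03, "lon": -87.99, "street": "North 76th Street", "housenumber": "170A"},
    {"lat": 43.03, "lon": -87.99, "street": "West Bluemound Road", "housenumber": "7600"},
]
nodes = merge_stacked(stacked)
```

Here the two 76th Street addresses collapse into one node tagged 170,170A, while the Bluemound Road address becomes a second node at the same point.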

Unit numbers were removed for the most part when they corresponded to units in the same building. This helps simplify the merging process and should lead to more consistent results.

Duplicate addresses were detected with JOSM and cleaned up (after units were removed).

A few fixme=* tags were added. This was done where the dataset included the same housenumber for a number of buildings (this is accurate in some housing complexes) but it would be desirable for the exact unit numbers to be captured on the ground.

In cases where a single building has a large number of addresses (more than 10, for example), the addresses were replaced with address interpolation, since very long addr:housenumber=* values are undesirable.

Tagging Plans

addr:housenumber=*, addr:street=*, addr:city=*, addr:postcode=*, and addr:state=* will be used on each address point. addr:unit=* will be added where available and appropriate. No source tags will be used on the addresses.
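To make the tagging plan concrete, here is a small Python sketch of the tag set for one address point. The function name is hypothetical, and the addr:state value "WI" is an assumption based on the county being in Wisconsin.

```python
def to_osm_tags(housenumber, street, city, postcode, unit=None, state="WI"):
    """Build the tag dictionary for one address point (no source tag)."""
    tags = {
        "addr:housenumber": housenumber,
        "addr:street": street,
        "addr:city": city,
        "addr:postcode": postcode,
        "addr:state": state,
    }
    # addr:unit is only added where a unit is available and appropriate.
    if unit:
        tags["addr:unit"] = unit
    return tags

tags = to_osm_tags("123A", "North 76th Street", "Milwaukee", "53208-1234")
```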

Changeset Tags

The changeset should have source=Milwaukee County LIO

Data Transformation Results

The scripts used to process the address points, a sample of the processed address points, and the entire compressed set of points are available on GitHub.

Data Merge Workflow

Team Approach

While getting local consensus, active local mappers will be asked if they want to participate in merging the data. If this is the case, then the processed tracts will be assigned to the mapper to import.

Workflow

Note: Do all imports via a dedicated import account.

  1. Open one tract in JOSM
  2. Within the tract, remove any address points not corresponding to addresses according to OSM standards. This includes addresses in freeway rights of way, utility rights of way, demolished buildings, etc. (Most of these should have been removed already, but some may still remain.)
  3. Run JOSM validation to find any anomalies (most importantly duplicate housenumbers).
  4. Manually conflate any non-building addresses with areas. This includes things like cemeteries, parks, etc. Also manually conflate any buildings which are multipolygons.
  5. Run conflation using the JOSM plugin to find matches. The 'subject' of the conflation should be any building=* as well as any points with addr:housenumber already filled in (to avoid duplication). These filters work well for selecting the subject for JOSM conflation.
  6. Review the address nodes which did not match with anything:
    1. If a building has multiple address nodes, unmatch the node that automatic conflation matched with it and keep the nodes within the building.
    2. Manually match buildings which automatic conflation missed.
    3. Delete address nodes which no longer refer to objects on the ground. Typically this will happen if a building was demolished.
  7. Pay special attention to conflations with a large distance, as these are more likely to be faulty.
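For the review of long-distance conflations, a helper along the following lines could list the matches worth a second look. This is a sketch only: the JOSM conflation plugin already reports match distances, the function names are hypothetical, and the 30 m threshold is an arbitrary assumption rather than a value used in this import.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))

def flag_distant_matches(matches, threshold_m=30.0):
    """Return (id, distance) for matches farther apart than the threshold.

    matches: iterable of (match_id, (lat, lon) of address node,
             (lat, lon) of the matched building).
    """
    flagged = []
    for match_id, (lat1, lon1), (lat2, lon2) in matches:
        distance = haversine_m(lat1, lon1, lat2, lon2)
        if distance > threshold_m:
            flagged.append((match_id, distance))
    return flagged

matches = [
    ("a", (43.0300, -87.9900), (43.0300, -87.9900)),  # same location
    ("b", (43.0300, -87.9900), (43.0310, -87.9900)),  # roughly 111 m apart
]
flagged = flag_distant_matches(matches)
```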

Conflation

The JOSM conflation tool will be used to conflate the addresses with the existing buildings (the vast majority of buildings in the county already have outlines from a prior import).

Quality Assurance

JOSM address data validation was run on the dataset, and will be run with the merged data before upload. Additionally, we will run JOSM/Plugins/FixAddresses, which scans addr:street=* names and compares them with the names of the surrounding streets.

As a "sanity test" of the data, the data was conflated locally in an area where many housenumbers were tagged already (the neighborhood bounded by Bluemound Road, 76th Street, Hawley Road, and I-94). Out of 1317 address nodes, 170 conflicts had to be resolved.

  • 86 had a missing address extension in OSM (e.g. 170 vs 170,170A).
  • 18 were conflicts due to multiple addresses separated by commas vs using separate nodes.
  • 17 conflicts were due to the street name (Blue Mound Road vs Bluemound Road; even street signs are inconsistent on this issue).
  • 14 were due to the imported data node being in the wrong place. Two neighboring houses would have their addresses swapped in this case.
  • 12 were due to a missing housenumber in OSM (on a house with multiple housenumbers).
  • 9 were conflation conflicts due to nodes selecting the wrong matching node in the subject.
  • 7 were due to incorrect data in OSM (invariably due to small typos).
  • 5 were due to a missing address extension in the import dataset.
  • 2 were due to a missing address in the import dataset.
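As a quick arithmetic check of the breakdown above, the categories can be summed in Python and compared with the reported total. The dictionary keys are shorthand labels chosen for this example.

```python
# Conflict counts from the sanity test, keyed by shorthand labels.
conflicts = {
    "missing extension in OSM": 86,
    "comma-separated vs separate nodes": 18,
    "street name spelling": 17,
    "import node in wrong place": 14,
    "missing housenumber in OSM": 12,
    "wrong conflation match": 9,
    "incorrect data in OSM": 7,
    "missing extension in import data": 5,
    "missing address in import data": 2,
}

total_conflicts = sum(conflicts.values())
total_nodes = 1317
conflict_rate = total_conflicts / total_nodes

print(total_conflicts)          # 170
print(f"{conflict_rate:.1%}")   # 12.9%
```

The categories add up to the 170 reported conflicts, or about 12.9% of the 1317 address nodes in the test area.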

Tract Status

Tract File User Status
addresses_tract1 watmildon Completed
addresses_tract2 popball Completed
addresses_tract3 popball Completed
addresses_tract4 watmildon Completed
addresses_tract5 watmildon Completed
addresses_tract6 popball Completed
addresses_tract7 watmildon Completed
addresses_tract8 watmildon Completed
addresses_tract9 Popball Completed
addresses_tract10 Popball Completed
addresses_tract11 Popball Completed
addresses_tract12 Popball Completed
addresses_tract13 Popball Completed
addresses_tract14 Popball Completed
addresses_tract15 Popball Completed
addresses_tract16 Popball Completed
addresses_tract17 watmildon Completed
addresses_tract18 watmildon Completed
addresses_tract19 watmildon Completed
addresses_tract20 watmildon Completed
addresses_tract21 Popball Completed
addresses_tract22 watmildon Completed
addresses_tract23 watmildon Completed
addresses_tract24 watmildon Completed
addresses_tract25 Popball Completed
addresses_tract26 watmildon Completed
addresses_tract27 watmildon Completed
addresses_tract28 watmildon Completed
addresses_tract29 watmildon Completed
addresses_tract30 watmildon Completed
addresses_tract31 watmildon Completed
addresses_tract32 watmildon Completed
addresses_tract33 watmildon Completed
addresses_tract34 watmildon Completed
addresses_tract35 watmildon Completed
addresses_tract36 watmildon Completed
addresses_tract37 watmildon Completed
addresses_tract38 watmildon Completed
addresses_tract39 watmildon Completed
addresses_tract40 watmildon Completed
addresses_tract41 Popball Completed
addresses_tract42 watmildon Completed
addresses_tract43 Popball Completed
addresses_tract44 Popball Completed
addresses_tract45 Popball Completed
addresses_tract46 Popball Completed
addresses_tract47 Popball Completed
addresses_tract48 Popball Completed
addresses_tract49 Popball Completed
addresses_tract50 Popball Completed
addresses_tract51 Popball Completed
addresses_tract52 Popball Completed
addresses_tract53 Popball Completed
addresses_tract54 Popball Completed
addresses_tract55 Popball Completed
addresses_tract56 Popball Completed
addresses_tract57 Popball Completed
addresses_tract58 Popball Completed
addresses_tract59 Popball Completed
addresses_tract60 Popball Completed
addresses_tract61 Popball Completed
addresses_tract62 Popball Completed
addresses_tract63 Popball Completed
addresses_tract64 Popball Completed
addresses_tract65 Popball Completed
addresses_tract66 Popball Completed
addresses_tract67 popball Completed
addresses_tract68 Popball Completed
addresses_tract69 popball Completed
addresses_tract70 Popball Completed
addresses_tract71 Popball Completed
addresses_tract72 Popball Completed
addresses_tract73 Popball Completed
addresses_tract74 Popball Completed