Import/Boston Street Address Management (SAM) Import

From OpenStreetMap Wiki

Current Status

2016-04-01: Awaiting response from City of Boston regarding attribution requirements since the data is licensed under CC BY 4.0.

2016-09-29: Restarted trying to reach City of Boston GIS over the phone to get a definite answer on attribution requirements.

About

Boston, Massachusetts is relatively well mapped, but its buildings usually lack addr:housenumber=* and addr:street=* tags. Since the city is not built on a grid, streets wander, which makes finding the right house with OpenStreetMap a challenge. The City of Boston publishes a sizeable amount of data through its Open Government initiative. One of the freely available datasets is the Live Street Address Management (SAM) dataset, which identifies the location of the buildings on the map. This import uses that dataset.

As part of the import, existing buildings that span multiple tax parcels will be split provided that the roof pattern looks different among different parts of the building.

Import Plan Outline

Since a large number of Boston buildings were imported from a LIDAR/orthoimagery survey (or traced from Bing), a sizable number of OSM buildings actually represent two or more real-world buildings. The first step is therefore to split them on the map itself, which can be done while waiting for approval.

Import will be performed by neighborhood in these stages:

  1. Import unique buildings.
  2. Regenerate a set of .osc, .osn files, post the links here for ongoing review by interested parties.
  3. Upload ONLY unique buildings into OSM (no fixmes, no notes, etc.).
  4. Modify the remaining buildings so that they become unique (verify against the Tax Parcel Map and Bing, and roughly against MassGIS Data - 2015 WorldView Orthoimagery). Skip any building that cannot be made unique.
  5. Add missing buildings and verify that they are unique.
  6. Repeat from step 2.
  7. When there are no more buildings to split, import -addresses.osc. This puts address nodes onto buildings that are confined to a single tax parcel yet have more than one address. The number of such buildings is relatively low (see the data below).

Goals

Add the missing house numbers and associate the building ways with the street they are on. Improve navigation for offline applications and provide correct destination information for OSRM, GraphHopper, etc. Because addresses are sometimes allocated seemingly at random, having correct addr:housenumber=* values is extremely helpful.

Schedule

Total time: 2 weeks from the date of approval.

  • Initial upload for Jamaica Plain (the current location of User:Rye), corrections (1 day).
  • Unique addr:housenumber upload for all the neighborhoods (1 day).
  • Splitting the buildings according to the Boston Tax Parcel map, adding missing buildings and upload of the changes in two steps (rest of the time).

Import Data

Background

Data source site: http://bostonopendata.boston.opendata.arcgis.com/datasets/b6bffcace320448d96bb84eabb8a075f_0

Tax Parcel Basemap: http://app01.cityofboston.gov/ParcelViewer/?pid=1103032000 (http://gis.cityofboston.gov/arcgis/rest/services/Basemaps/base_map_webmercatorV2/MapServer)

Tax Parcel Shapefile: http://boston.maps.arcgis.com/home/item.html?id=cd6d9058ee9d4475b924751a2bb9d263

MassGIS Data - 2015 WorldView Orthoimagery: http://www.mass.gov/anf/research-and-tech/it-serv-and-support/application-serv/office-of-geographic-information-massgis/datalayers/colororthos2015wv.html

Data license: CC BY 4.0 - Open and Protected Data Policy, awaiting attribution clarification as per Import/ODbL_Compatibility.

Type of license (if applicable): CC BY 4.0

ODbL Compliance verified: in progress

Script repository: https://bitbucket.org/roman-yepishev/osm-boston-sam-import (Heavily Work In Progress, spaghetti code by User:Rye).

CartoDB visualization: https://rye.cartodb.com/viz/e81047a4-fbfb-11e5-a30c-0ef24382571b/embed_map

OSM Data Files

MediaFire folder with all the data: http://www.mediafire.com/folder/j7lb1tuhwd17i

These files are for reference only and to increase the bus factor (in case User:Rye is hit by a bus). These are now ready for upload (no safeguards of any kind).

Unique buildings: ways that can be uniquely tagged and require no further manual interaction - .osc file.

Address nodes: nodes that bear addr:* tags for the buildings that are within the same tax parcel yet having multiple numbers - .osc file.

Fixme buildings: (not for upload into OSM) ways that are tagged with more than one address, and ways that are already tagged with an address that does not match City data - .osc file.

Missing/fixme building markers: (not for upload into OSM) Missing buildings, buildings that are most likely not split properly (same as first item in Fixme buildings) according to both tax parcels and SAM, buildings with unknown street or a street name that does not match SAM exactly. .gpx is for field verification, .osn for JOSM.

| Name / Poly | Unique buildings | Address nodes | Fixme buildings | Missing/fixme building markers |
|---|---|---|---|---|
| Allston / Brighton | 6509 | 30 | 85 | 250 (.gpx, .osn.gz) |
| Back Bay | 1516 | 4 | 12 | 34 (.gpx, .osn.gz) |
| Bay Village | 234 | 0 | 17 | 39 (.gpx, .osn.gz) |
| Beacon Hill | 1233 | 2 | 12 | 51 (.gpx, .osn.gz) |
| Charlestown | 1765 | 14 | 16 | 95 (.gpx, .osn.gz) |
| Chinatown | 158 | 3 | 6 | 20 (.gpx, .osn.gz) |
| Dorchester | 13835 | 51 | 22 | 306 (.gpx, .osn.gz) |
| East Boston | 4272 | 32 | 128 | 692 (.gpx, .osn.gz) |
| Fenway / Kenmore | 350 | 5 | 85 | 558 (.gpx, .osn.gz) |
| Financial District/Downtown | 290 | 2 | 22 | 63 (.gpx, .osn.gz) |
| Government Center/Faneuil Hall | 38 | 0 | 1 | 22 (.gpx, .osn.gz) |
| Hyde Park | 7617 | 8 | 6 | 140 (.gpx, .osn.gz) |
| Jamaica Plain | 4143 | 7 | 59 | 177 (.gpx, .osn.gz) |
| Mattapan | 5535 | 15 | 33 | 199 (.gpx, .osn.gz) |
| Mission Hill | 864 | 8 | 26 | 204 (.gpx, .osn.gz) |
| North End | 968 | 9 | 6 | 49 (.gpx, .osn.gz) |
| Roslindale | 6748 | 9 | 40 | 239 (.gpx, .osn.gz) |
| Roxbury | 6322 | 217 | 196 | 788 (.gpx, .osn.gz) |
| South Boston | 5617 | 84 | 42 | 284 (.gpx, .osn.gz) |
| South End | 2741 | 8 | 32 | 158 (.gpx, .osn.gz) |
| West End | 117 | 0 | 5 | 26 (.gpx, .osn.gz) |
| West Roxbury | 8275 | 36 | 22 | 135 (.gpx, .osn.gz) |

Last updated: 2016-04-11 18:08:18-0400

Previews

| File | Description | Size (bytes) | Last updated |
|---|---|---|---|
| Massachusetts-latest+boston-sam_2.obf | OsmAnd map + unique Boston SAM numbers | 211408445 | 2016-04-11 16:38:38 |
| massachusetts-latest+boston-sam.osm.pbf | OSM PBF export + unique Boston SAM numbers | 222951460 | 2016-04-11 10:27:25 |

Import Type

Although split across many uploads, this is a one-time action. The scripts are released into the public domain to enable updates of this kind in the future. Future updates will require the same steps to be taken.

JOSM will be used to review the changes, resolve conflicts and upload the resulting updates as a separate 'ryebread_bos_sam_import' user.

Data Preparation

Data Reduction & Simplification

Only updated way tags will be uploaded. If there are missing ways in OSM and the buildings are found in Bing/Boston Tax Parcel basemap, these buildings will be added after the initial import of unique addresses.

Tagging Plans

Excluded from processing (won't be updated, won't appear in fixme/notes):

  1. Buildings that have source:addr=survey.
  2. Buildings that have at least one entrance=yes node with addr:housenumber=* defined.
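The two exclusion rules above can be written as a small predicate. This is only a sketch under stated assumptions: the function name `is_excluded` and the plain-dict representation of tags are hypothetical and are not taken from the import scripts.

```python
from typing import Iterable, Mapping

Tags = Mapping[str, str]


def is_excluded(building_tags: Tags, node_tags: Iterable[Tags]) -> bool:
    """Return True if the import should leave this building untouched.

    building_tags: tags on the building way.
    node_tags: tags of each node that belongs to the way (assumed input shape).
    """
    # Rule 1: the address was already surveyed on the ground.
    if building_tags.get("source:addr") == "survey":
        return True
    # Rule 2: at least one entrance=yes node already carries a house number.
    for tags in node_tags:
        if tags.get("entrance") == "yes" and "addr:housenumber" in tags:
            return True
    return False
```

A building for which this returns True would be skipped entirely, so it would not show up in the fixme or notes output either.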

The following tags will be added to buildings affected by the import:

Since the street names may be different in various official sources, a separate mapping table is set up. The script adds a note/gpx waypoint pointing to a building that is about to be tagged with an unknown street name.
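A street-name mapping table of this kind might look roughly like the sketch below. The table entries, the upper-case SAM spelling and the function name `resolve_street` are all hypothetical and serve only to illustrate the lookup and the fallback note. The real table lives in the script repository.

```python
from typing import Optional, Set, Tuple

# Hypothetical mapping entries: SAM spelling -> street name as used in OSM.
STREET_NAME_MAP = {
    "BOYLSTON ST": "Boylston Street",
    "CENTRE ST": "Centre Street",
}


def resolve_street(sam_name: str, known_osm_streets: Set[str]) -> Tuple[Optional[str], Optional[str]]:
    """Return (osm_street_name, note_text).

    Exactly one of the two values is None: either the street is known and
    can be used for addr:street, or a note is produced for manual review.
    """
    # Normalise case and repeated whitespace before the lookup.
    key = " ".join(sam_name.upper().split())
    osm_name = STREET_NAME_MAP.get(key)
    if osm_name is not None and osm_name in known_osm_streets:
        return osm_name, None
    return None, f"Unknown street name from SAM: {sam_name!r}"
```

The note text would then be written out as a note or GPX waypoint placed at the building, so that the name can be checked by hand before the table is extended.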

Changeset Tags

| Tag | Value |
|---|---|
| comment | Import Boston SAM addresses (Neighborhood Name) |
| source | Boston Street Address Management |

Data Transformation

osmconvert will be used to convert the .pbf export to .osm. MariaDB is used to store both the OSM data and the SAM addresses. XML is processed with an actual XML parser library.
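As an illustration of processing the converted .osm file with a real XML parser rather than string matching, the sketch below streams building ways out of OSM XML using Python's standard library. The function name `iter_building_ways` and the sample data are assumptions made for this example. The actual scripts load the data into MariaDB and may be structured quite differently.

```python
import io
import xml.etree.ElementTree as ET
from typing import IO, Dict, Iterator, Tuple


def iter_building_ways(osm_xml: IO[bytes]) -> Iterator[Tuple[int, Dict[str, str]]]:
    """Yield (way_id, tags) for every way tagged building=* in an .osm stream."""
    for _event, elem in ET.iterparse(osm_xml, events=("end",)):
        # Only act on top-level OSM objects; their <tag>/<nd> children
        # are complete by the time the parent's "end" event fires.
        if elem.tag not in ("node", "way", "relation"):
            continue
        if elem.tag == "way":
            tags = {t.get("k"): t.get("v") for t in elem.findall("tag")}
            if "building" in tags:
                yield int(elem.get("id")), tags
        elem.clear()  # drop children to keep memory use down on large files


# Tiny made-up sample standing in for the converted Massachusetts extract.
SAMPLE = b"""<?xml version='1.0' encoding='UTF-8'?>
<osm version="0.6">
  <node id="1" lat="42.31" lon="-71.11"/>
  <way id="100"><nd ref="1"/><tag k="building" v="yes"/></way>
  <way id="101"><nd ref="1"/><tag k="highway" v="residential"/></way>
</osm>"""

buildings = list(iter_building_ways(io.BytesIO(SAMPLE)))
```

Streaming with `iterparse` is chosen here because a statewide extract is large. A database import, as used by the actual workflow, serves the same purpose.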

Data Transformation Results

See OSM Data Files above.

Data Merge Workflow

Team Approach

I'm going to perform this work solo.

References

Buildings that only have 'fixme' added by the system will not be uploaded (as that's bad data).

Workflow

  1. Import fresh massachusetts_latest.pbf file into MariaDB
  2. Run boston-sam-generate-osm.py "Neighborhood Name".
  3. Open resulting polygon, notes file and osc file in JOSM with Boston Tax Parcel Basemap as imagery.
  4. Evaluate the changes. Validate the street names.
    1. For buildings that have more than one address:
      1. Using Bing and the Tax Parcel Basemap, verify that the buildings are actually split. If they are, split them in OSM.
      2. If the buildings aren't split on the Tax Parcel Basemap and the aerial image does not show a party wall, manually mark the building with "fixme=survey required" and remove any addr: tags.
      3. Upload the OSM building changes as a regular user.
      4. Re-fetch data for the neighborhood and re-run the import script.
  5. Upload the resulting dataset in chunks of 100 objects as 'Boston SAM Import' user.
  6. If needed, revert the changesets produced by 'Boston SAM Import' user.
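The upload in step 5 is done through JOSM, but the chunking itself is easy to picture. The sketch below splits a list of objects into batches of 100. The helper name `chunked` is hypothetical and is not part of JOSM or of the import scripts.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(objects: Sequence[T], size: int = 100) -> Iterator[List[T]]:
    """Yield consecutive batches of at most `size` objects."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(objects), size):
        yield list(objects[start:start + size])


# Example: 250 modified ways become three changesets.
batches = list(chunked(list(range(250))))
```

Small changesets keep each upload easy to review and, if step 6 becomes necessary, easy to revert individually.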

Conflation

A building with source:addr=survey will not be modified.

If a building already has the correct addr:housenumber=*, addr:city=*, addr:postcode=*, and addr:street=*, no changes will be uploaded. If a building has a number that differs from the one in SAM, the Tax Parcel Basemap, Bing, and search engines will be used to verify the address. If the new address is not verifiable, the building stays in -fixmes.osm. If the existing postcode starts with the ZIP retrieved from SAM, it is left unchanged, as the ZIP+4 format may have been used during manual entry.
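The conflation rules can be summarised as a decision function. The sketch below is one conservative reading of them, not the script's actual logic: it assumes that any existing address tag that conflicts with SAM sends the building to manual verification, and the names `conflate` and `ADDR_KEYS` as well as the example address are hypothetical.

```python
from typing import Dict, Mapping, Tuple

ADDR_KEYS = ("addr:housenumber", "addr:street", "addr:city", "addr:postcode")


def conflate(existing: Mapping[str, str], sam: Mapping[str, str]) -> Tuple[str, Dict[str, str]]:
    """Return (action, tags_to_add) where action is 'skip', 'update' or 'verify'."""
    # Surveyed addresses are never modified.
    if existing.get("source:addr") == "survey":
        return "skip", {}
    changes: Dict[str, str] = {}
    for key in ADDR_KEYS:
        old, new = existing.get(key), sam[key]
        if old is None:
            changes[key] = new  # tag is missing, take it from SAM
        elif old == new:
            continue  # already correct
        elif key == "addr:postcode" and old.startswith(new):
            continue  # likely ZIP+4 entered by hand, leave unchanged
        else:
            # Assumption: any conflicting value needs manual verification.
            return "verify", {}
    return ("update", changes) if changes else ("skip", {})


# Made-up example record in the shape assumed above.
sam_record = {
    "addr:housenumber": "12",
    "addr:street": "Centre Street",
    "addr:city": "Boston",
    "addr:postcode": "02130",
}
```

Buildings that come back as 'verify' correspond to the ones kept in the fixmes file until the address has been checked against the parcel map and imagery.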

Because present-day Boston was formed by merging a number of cities, some street names are not unique within the official city borders (e.g. Boylston Street exists in both the Jamaica Plain and Back Bay neighborhoods). The street mapping script may therefore map the wrong SAM street ID. This is not an issue, since addr:street=* references the street name, not an ID.

QA

  1. Using JOSM, verify that no buildings about to be uploaded have +addr: in fixme.
  2. Verify that there is exactly one building way for each addr:housenumber=* on a given addr:street=*.
  3. At the moment the collection script does not support buildings represented as multipolygon relations. These trigger a false positive for a missing building and have to be checked manually. As of 2016-03-19 there are 48 relations tagged as building in the Boston area being processed.
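The one-to-one check in step 2 can be automated before upload. The following is a minimal sketch, assuming the buildings are available as (way id, tags) pairs. The function name `find_duplicate_addresses` and that input shape are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping, Tuple

Address = Tuple[str, str]  # (addr:street, addr:housenumber)


def find_duplicate_addresses(
    buildings: Iterable[Tuple[int, Mapping[str, str]]]
) -> Dict[Address, List[int]]:
    """Return addresses that are carried by more than one building way."""
    by_address: Dict[Address, List[int]] = defaultdict(list)
    for way_id, tags in buildings:
        street = tags.get("addr:street")
        number = tags.get("addr:housenumber")
        if street and number:
            by_address[(street, number)].append(way_id)
    # Keep only the addresses that violate the one-to-one rule.
    return {addr: ids for addr, ids in by_address.items() if len(ids) > 1}
```

Because some street names repeat across neighborhoods, a check like this is best run on one neighborhood at a time. An empty result means every address in that batch maps to a single way.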

See also

Thread in talk-us@: https://lists.openstreetmap.org/pipermail/talk-us/2016-March/015994.html