VCGI E911 address points import


VCGI E911 address points import is an import of the Emergency 911 (E911) address dataset, which is maintained and made available by the Vermont Center for Geographic Information (VCGI). The dataset covers the U.S. state of Vermont. As of September 2022, the import is at the planning stage.

The initial plan is to start small and minimize risk by working town by town. This import project focuses on smaller towns with fewer than 100 existing OSM addresses. This approach will make it easier to spot issues and undo errors, and will make the process more robust before it is applied to larger or more complicated towns. Larger towns, and towns that already have a significant number of building outlines or addresses, are excluded from this "phase one" project.

Goals

The goal is to import missing Vermont addresses.

Schedule

2021-07 assessment and manual entry of VCGI address data
2022-09 Public planning starts
2022-10 creation of street expansion script and generation of draft OSM data files
TBD Initial town import applied in JOSM

Import Data

Background

Data source site: https://geodata.vermont.gov/datasets/VCGI::vt-data-e911-site-locations-address-points-1/about
Data license: VCGI has stated that the data has been made public with the intent that it be used in projects like OSM. An employee (J. McMullen) responded to an inquiry about data licensing on 2018-10-08: "The data that is post [sic] to the VCGI data warehouse is considered public data. There are no licensing restrictions on that data. Its posted to hopefully make every one else’s data better and at the same time making ours more accurate." Esri has evaluated the dataset (listed here), found it compatible with OSM, and includes the address points in the RapiD editor.

OSM Data Files

Import Type

Data imports will be done manually with JOSM.

Data Preparation

Data Reduction & Simplification

For the scope of this import project, we will work on towns that have fewer than 100 existing OSM addresses. Addresses that already exist in OSM will be manually removed from the import data. As progress is made, the plan is to develop a script that will help with the matching process, but ultimately every such address will be removed from the import data manually and individually.

If a reliable script is made, an update to this project proposal will be made. The Vermont mapping and Import communities will be notified for project review and approval.

Tagging Plans

The following tags will be included for each node:

OSM tag            VCGI source                  Notes
addr:city          TOWNNAME
addr:housenumber   HOUSE_NUMBER
addr:street        PD + SN + ST + SD            PD = Prefix Direction, SN = Street Name, ST = Street Type, SD = Suffix Direction
addr:postcode      ZIP
addr:state         "VT" for all nodes
ref:vcgi:esiteid   ESITEID                      Unique ID from VCGI to facilitate tracking and future updates
source             "VCGI/E911_address_points"
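
The tag mapping above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual script: it assumes each VCGI record arrives as a dictionary keyed by the column names listed in the table, and the function name build_osm_tags is hypothetical. Street names are only joined here; case and abbreviation handling is a separate step.

```python
def build_osm_tags(record):
    """Map one VCGI E911 record (dict of column name -> value) to OSM tags.

    Assumption: the record uses the column names from the tagging table
    (TOWNNAME, HOUSE_NUMBER, PD, SN, ST, SD, ZIP, ESITEID).
    """
    # Join the non-empty street components in PD + SN + ST + SD order.
    parts = [record.get(key) for key in ("PD", "SN", "ST", "SD")]
    street = " ".join(str(p).strip() for p in parts if p and str(p).strip())
    return {
        "addr:city": record["TOWNNAME"],
        "addr:housenumber": str(record["HOUSE_NUMBER"]),
        "addr:street": street,
        "addr:postcode": str(record["ZIP"]),
        "addr:state": "VT",  # constant for every node
        "ref:vcgi:esiteid": str(record["ESITEID"]),
        "source": "VCGI/E911_address_points",
    }
```

Keeping ref:vcgi:esiteid on every node is what would let a later update match an OSM node back to its VCGI record.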

Data Transformation

VCGI data is all uppercase and includes abbreviations in street names. A script has been written that converts the elements to title case and makes the following transformations.

Street name suffixes (e.g. Ave., St., Rd.) are expanded using this list.

Street names that include "U.S. Route" will be transformed from "US ROUTE #" to "U.S. Route #".

Addresses on Vermont state routes will be transformed from "VT ROUTE #" to "Vermont Route #".

VCGI addresses that have no house number, or a house number of 0, are excluded.

Through testing of the script, several other custom "text cleanups" have been incorporated. As new exceptions are encountered, the list of transformations will continue to grow. See comments in the script for further details.
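
As a rough illustration of the rules in this section, the following Python sketch applies title casing, suffix expansion, the two route rewrites, and the house number filter. It is not the project's script: the SUFFIXES table is a tiny hypothetical subset of the real expansion list, the function names are invented for this example, and it assumes a route name such as "US ROUTE 7" arrives whole in the street name (SN) field.

```python
import re

# Hypothetical, deliberately small subset of a suffix expansion table.
SUFFIXES = {"AVE": "Avenue", "ST": "Street", "RD": "Road", "LN": "Lane", "DR": "Drive"}


def title_case(text):
    """Capitalize each word; unlike str.title(), '1ST' becomes '1st'."""
    return " ".join(word.capitalize() for word in (text or "").split())


def keep_address(house_number):
    """Return False for a missing house number or a house number of 0."""
    text = "" if house_number is None else str(house_number).strip()
    return text not in ("", "0")


def transform_street(pd="", sn="", st="", sd=""):
    """Build an OSM-style street name from uppercase VCGI components."""
    route = re.fullmatch(r"(US|VT) ROUTE (\S+)", (sn or "").strip().upper())
    if route:
        prefix = "U.S." if route.group(1) == "US" else "Vermont"
        name = f"{prefix} Route {route.group(2)}"
    else:
        name = title_case(sn)
    # Expand a known street type abbreviation, otherwise just title-case it.
    street_type = SUFFIXES.get((st or "").strip().upper(), title_case(st))
    parts = [title_case(pd), name, street_type, title_case(sd)]
    return " ".join(p for p in parts if p)
```

For example, transform_street(sn="MAIN", st="ST") gives "Main Street" and transform_street(sn="VT ROUTE 22A") gives "Vermont Route 22A". Real data needs more special cases than this, which is why the list of custom cleanups keeps growing.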

Data Merge Workflow

Team Approach

Preliminary work is being done by OSM user jared and Adam Franco, with review from others on the #local-vermont Slack channel.

Workflow

  • Towns will be prioritized. Towns with a small number of address points and few existing OSM address nodes will be worked on first. The plan is to never commit a changeset larger than a town. Most towns are relatively small, but if the process is successful and larger towns are going to be imported, it is possible that they would be broken up into smaller import files.
  • For each town, an OSM file will be generated using E911 addresses from VCGI:
    • Address point data (primarily street name) will be transformed, and expanded to meet OSM standards.
    • This initial town OSM file will be conflated against existing OSM address data and sorted into several buckets:
      • no-match: to be reviewed and imported
      • tag-conflicts: to be reviewed and manually fixed if needed
      • review-distance: addresses that already exist in OSM, but are significantly far from the E911 location
      • review-multiples: addresses that already exist in OSM multiple times. This is often the case if both a business and a building are tagged with the same address. These are not necessarily errors, but might be worth reviewing.
      • matches: to be skipped as the address already exists in OSM close to the E911 location.
  • A visual check in JOSM will be made to make sure the address nodes look reasonable (e.g. confirm the points all fall within the town boundary and are on or very near their associated buildings).
  • OSM files will be made available in advance to give others the opportunity to inspect them.
  • Once everything looks good, the changeset will be committed through JOSM. The preliminary import user will be "jared-import".
  • Progress updates will be provided here and on the #local-vermont Slack channel so that others can keep an eye out for any issues.

Conflation

As described in the workflow above, the initial town OSM file will be conflated against existing OSM address data and sorted into several buckets, each of which will be handled separately.
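
The bucket sorting can be sketched as follows. This is only an illustration under stated assumptions, not the project's conflation code: addresses are matched on house number plus street name, a "tag conflict" is taken to mean a differing addr:city or addr:postcode, and the 50 m REVIEW_DISTANCE_M threshold is a made-up value. All function names are hypothetical.

```python
import math

REVIEW_DISTANCE_M = 50.0  # hypothetical threshold, not taken from the project


def distance_m(a, b):
    """Approximate great-circle distance in metres (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a["lat"], a["lon"], b["lat"], b["lon"]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000 * math.asin(math.sqrt(h))


def address_key(addr):
    """Case-insensitive (house number, street) pair used for matching."""
    return (addr["addr:housenumber"].strip().lower(),
            addr["addr:street"].strip().lower())


def classify(candidate, existing):
    """Return the bucket name for one import address."""
    same = [e for e in existing if address_key(e) == address_key(candidate)]
    if not same:
        return "no-match"
    if len(same) > 1:
        return "review-multiples"
    match = same[0]
    # Assumed definition of a conflict: OSM already has a different value.
    for tag in ("addr:city", "addr:postcode"):
        if tag in match and match[tag] != candidate.get(tag):
            return "tag-conflicts"
    if distance_m(candidate, match) > REVIEW_DISTANCE_M:
        return "review-distance"
    return "matches"
```

Only the "no-match" bucket would feed the import file directly; the other buckets exist so that a person looks at them before anything is changed.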

QA

Over 50,000 VCGI address points have already been added manually using RapiD. The data is considered highly accurate.

Initial imports will be kept small enough that data can be manually inspected to confirm that transformations were done properly. Points will be verified in JOSM to make sure alignment is accurate.

OSM files will be made available to other contributors for verification. Conflated files for review are available at https://github.com/JaredOSM/vermont-address-import/tree/main/data_files_to_import/conflated

As QA issues are found in initial small imports, this section will be expanded.

Comparing E911 addresses to existing OSM addresses in JOSM

  1. Open JOSM and install the "conflation" plugin.
  2. Load the existing OSM addresses for a town in JOSM using an Overpass Query (see below)
  3. Open the import project's .osm file for the town in JOSM as a new layer
  4. In the Conflation plugin's dialog, select the existing objects as the subject and the import's objects as the reference, and have it look for matches.
  5. Compare the conflicts and missing addresses, looking for issues like invalid name conversions or other problems.

Loading existing OSM addresses for a town into JOSM

Use "Expert Mode" to download from an Overpass query, replacing "Middlebury" with the name of the town you are interested in. This will load all buildings and other features that have street or housenumber tags:

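[out:xml][timeout:90];
// Resolve the state and town areas first, outside the union, so that only map features end up in the result.
area["ISO3166-2"="US-VT"]["admin_level"="4"]["boundary"="administrative"]->.state;
area["name"="Middlebury"]["admin_level"="8"]["boundary"="administrative"]->.city;
// Collect every node, way, and relation inside both areas that has an address tag or is a building.
(
  nwr["addr:street"](area.city)(area.state);
  nwr["addr:housenumber"](area.city)(area.state);
  nwr["building"](area.city)(area.state);
);
// Add the member nodes and ways that the collected features refer to.
(._;>;);
out meta;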
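
When checking many towns, substituting the town name by hand gets tedious. The following Python sketch generates a query of the same shape for any town name; the template and the function name overpass_query_for_town are illustrative assumptions, and the resulting text still has to be pasted into JOSM's Overpass download dialog manually.

```python
# Query template with the town name left as a placeholder.
QUERY_TEMPLATE = """[out:xml][timeout:90];
area["ISO3166-2"="US-VT"]["admin_level"="4"]["boundary"="administrative"]->.state;
area["name"="{town}"]["admin_level"="8"]["boundary"="administrative"]->.city;
(
  nwr["addr:street"](area.city)(area.state);
  nwr["addr:housenumber"](area.city)(area.state);
  nwr["building"](area.city)(area.state);
);
(._;>;);
out meta;
"""


def overpass_query_for_town(town):
    """Return the Overpass query text for one Vermont town."""
    # Reject characters that would break out of the quoted name value.
    if '"' in town or "\\" in town:
        raise ValueError("town name must not contain quotes or backslashes")
    return QUERY_TEMPLATE.format(town=town)


if __name__ == "__main__":
    print(overpass_query_for_town("Middlebury"))
```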

See also

The preliminary import plan was shared with the #local-vermont Slack channel on 2022-09-14.

The imports-us mailing list was notified on September 17, 2022 (link).