Import/Maine E911 Addresses

From OpenStreetMap Wiki

The Maine E911 Address Import is an import of Maine E911 Addresses (covering Maine in the United States). The import is directed by Alex, who welcomes feedback and collaborators.

FAQ for mappers encountering these addresses

Q: Should I add buildings and then merge address nodes with the buildings?

   A: Yes! ~ But don't merge multiple addresses into one building; it's best to leave those as separate nodes.

Q: Should I use aerial imagery to align address nodes?

   A: Yes!

Q: Should I delete addresses which seem like they shouldn't be there?

   A: Yes! ~ But remember, your imagery might be out of date.

Q: Should I change a tag on an address which seems wrong?

   A: Yes!

Q: Should I message blackboxlogic about something which might be a serious issue or I don't understand?

   A: Yes!

Q: Should I worry about messing something up?

   A: No! ~ I trust you.

Current Status

This import is COMPLETE

To see activity, review the history of user blackboxlogic+bot

Goals

  • Import addresses from the Maine E911 address set into OpenStreetMap (which previously lacked local addresses) to improve local routing
  • Meet community expectations of quality, transparency, and collaboration

Schedule

  • July, 2019 - Reviewed existing conflation tools, started developing OsmPipeline for this import
  • August, 2019 - Evaluated the data quality and license of the data source
  • December, 2019 - Finished the OsmPipeline project, produced sample results, drafted this proposal, and formally requested review
  • January, 2020 - Began processing the import
  • May, 2020 - Finished initial import
  • June, 2020 - Fixing issues in the "mistakes were made" section
  • July, 2020 - Done

Outreach, Q&A, Review & Feedback

I have formally requested review of this proposal through community channels.

Please critique any parts in which you have expertise or interest:

  • This proposal
  • OsmPipeline (the software that I wrote to automate most of the data processing)
  • the sample results (see Sample Data)

If there are no unresolved issues, the import will begin in January 2020.

How to Respond

I'll be watching for feedback on:

  • the discussion side of this page
  • the #local-maine channel in OSM US Slack

I can be contacted directly by email, OSM message, @blackboxlogic in the OSM US Slack, or issues on my GitHub.

Data Source

Website: https://www.maine.gov/geolib/, specifically: https://gis.maine.gov/arcgis/rest/services/Location/Maine_E911_Addresses_Roads_PSAP/MapServer/1
Data license: "Access: Public Use: User assumes risk. NOT FOR EMERGENCY DISPATCH USE."
Type of license (if applicable): Public Domain
Link to permission (if required): https://www.mainelegislature.org/legis/statutes/1/title1sec538.html
OSM attribution (if required): http://wiki.openstreetmap.org/wiki/Contributors#Maine
ODbL Compliance verified: yes

Permissions

Hello, State of Maine GIS data that can be accessed via the Internet without first authenticating as a named user to gain access, is considered information in the public domain. Reasonable efforts have been made to ensure data is complete and accurate at time of publishing, however it is the end user's responsibility to determine suitability for use. Reference: <https://www.mainelegislature.org/legis/statutes/1/title1sec538.html>

You have made reference to Maine Parcels. Please note that Maine Parcels Organized Towns <https://geolibrary-maine.opendata.arcgis.com/datasets/maine-parcels-organized-towns>, Maine Parcels Organized Towns ADB <https://geolibrary-maine.opendata.arcgis.com/datasets/maine-parcels-organized-towns-adb>, and Maine Parcels Unorganized Territory <https://geolibrary-maine.opendata.arcgis.com/datasets/maine-parcels-unorganized-territory> are not the authoritative source for address information. Address information is available in Maine E911 Addresses <https://geolibrary-maine.opendata.arcgis.com/datasets/maine-e911-addresses>

Thank you for contacting the Maine Office of Information Technology. Sincerely, Todd M

Data Processing Plan

Many technical details are simplified for this summary. Please get in touch for more detail.

All data manipulation was performed using OsmPipeline (a C# tool I wrote for this import). The import tools and process allow for future imports of updated data without conflict. The following is an explanation of its design and intended use.

Partitioning the Region

This process will be run individually per municipality in Maine (one town at a time). For a large town, manually reviewing the change can take a few hours. There are nearly 700 municipalities in Maine, but most are very small. After completing a municipality, I will note the progress.

Data Source Translation

  • The user selects a target municipality and a folder is created with this name to hold the resulting data files
  • The reference data for this municipality is fetched from Maine's E911 API and saved in ReferenceRaw.GeoJson
  • Points are omitted if they are on the municipality's Blacklist in the progress report
    • This is a curated list of points that failed any validation step and, upon manual review, were declared undesirable (invalid, inaccurate or redundant)
  • The data is validated (see Translation Validation below). Errors are printed to console for manual review
  • Points are translated from GeoJson features into OSM nodes (see Translating Tags below) and saved in ReferenceTranslated.osm
  • Similar addresses are combined (if their tags differ only by addr:unit=* and level=* and they are within 5 meters of each other)
    • The effect is to turn an apartment building with multiple units into a single address, but not to change a trailer park
    • Nodes at the exact same location (but with tags preventing them from being combined) are nudged slightly to de-stack them
  • Reference processing is finished. The results are saved in Reference.osm
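
The combining step above can be illustrated with a minimal Python sketch. This is not OsmPipeline's code (which is C#); the node layout (dicts with lat, lon and tags), the function names, the equirectangular distance approximation, and the choice to drop addr:unit=* and level=* from the combined node are all assumptions made for illustration.

```python
import math

UNIT_KEYS = {"addr:unit", "level"}  # tags allowed to differ between combinable nodes
MAX_DISTANCE_M = 5.0                # combining radius stated in the list above


def distance_m(a, b):
    """Approximate ground distance in meters between two (lat, lon) pairs."""
    mean_lat = math.radians((a[0] + b[0]) / 2)
    dx = math.radians(b[1] - a[1]) * math.cos(mean_lat)
    dy = math.radians(b[0] - a[0])
    return 6371000 * math.hypot(dx, dy)


def base_tags(tags):
    """Tags with the unit-level keys removed, in a comparable form."""
    return frozenset((k, v) for k, v in tags.items() if k not in UNIT_KEYS)


def combine_similar(nodes):
    """Fold nodes that differ only by unit/level and sit within 5 m of a kept node."""
    combined = []
    for node in nodes:
        position = (node["lat"], node["lon"])
        for kept in combined:
            same_address = base_tags(kept["tags"]) == base_tags(node["tags"])
            close = distance_m((kept["lat"], kept["lon"]), position) <= MAX_DISTANCE_M
            if same_address and close:
                # Assumption: the combined node loses its unit-level detail.
                for key in UNIT_KEYS:
                    kept["tags"].pop(key, None)
                break
        else:
            combined.append({"lat": node["lat"], "lon": node["lon"],
                             "tags": dict(node["tags"])})
    return combined
```

With this rule, units stacked on one apartment building collapse into a single address, while units of a trailer park (same street address, but tens of meters apart) stay separate.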

Translating Tags

Translation Validation

Elements that don't pass these validation rules produce output in console for manual review

  • OBJECTIDs must be unique
  • The [street] SUFFIX, PREDIR, and POSTDIR must be in the list of expected values
  • FLOOR and BUILDING must be convertible to a number
  • ZIPCODE must be 5 numerals
  • ADDRESS_NUMBER must be numeric, greater than zero, and not 9999
  • LAT, LON must not be zero
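
As a rough illustration, the rules above could be checked as in the following Python sketch. The field names come from the list; the expected-value sets, the treatment of empty FLOOR and BUILDING values as acceptable, and the function names are assumptions, and OsmPipeline's real checks may differ.

```python
import re

# Assumed example lists; the real expected values are not given on this page.
EXPECTED_DIRECTIONS = {"", "N", "S", "E", "W", "NE", "NW", "SE", "SW"}
EXPECTED_SUFFIXES = {"", "ST", "RD", "AVE", "LN", "DR"}


def _is_blank_or_number(value):
    """True when the value is empty or can be parsed as a number."""
    if value is None or str(value).strip() == "":
        return True
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def validate(features):
    """Return (OBJECTID, message) pairs for every rule a feature breaks."""
    problems = []
    seen_ids = set()
    for f in features:
        oid = f.get("OBJECTID")
        if oid in seen_ids:
            problems.append((oid, "OBJECTID is not unique"))
        seen_ids.add(oid)
        for key, allowed in (("SUFFIX", EXPECTED_SUFFIXES),
                             ("PREDIR", EXPECTED_DIRECTIONS),
                             ("POSTDIR", EXPECTED_DIRECTIONS)):
            if str(f.get(key) or "").strip().upper() not in allowed:
                problems.append((oid, f"unexpected {key}"))
        for key in ("FLOOR", "BUILDING"):
            if not _is_blank_or_number(f.get(key)):
                problems.append((oid, f"{key} is not a number"))
        if not re.fullmatch(r"[0-9]{5}", str(f.get("ZIPCODE") or "")):
            problems.append((oid, "ZIPCODE is not 5 numerals"))
        number = str(f.get("ADDRESS_NUMBER") or "")
        if not re.fullmatch(r"[0-9]+", number) or int(number) in (0, 9999):
            problems.append((oid, "ADDRESS_NUMBER is invalid"))
        if not f.get("LAT") or not f.get("LON"):
            problems.append((oid, "LAT or LON is zero or missing"))
    return problems
```

Each returned pair corresponds to one line of console output for manual review.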

Conflating into OSM

  • The reference's bounding box (plus a small margin) is calculated, and this map region is fetched from OSM's API and saved in Subject.osm
    • If the region is too large to fetch (over 50,000 nodes), then the region is automatically subdivided and retried.
  • Subject elements (nodes, ways and relations) are matched and merged with Reference nodes (see Conflation Validation below)
    • Only OSM elements with these tags can be matched, aka "addressable features"
    • Matches are found by name, by address addr:street=* + addr:housenumber=*, then by location (exclusively near addressable features)
      • "Exclusively near": Centroids are within 30 meters and the next closest thing is 4x farther away (modeled after my own decision making process)
      • String matching is loose, agnostic of punctuation, spacing, capitalization and accent characters
      • Number matching is loose, accounting for different formats like lists, ranges, lists of ranges and agnostic of delimiter.
      • When a match is found, the new tags are applied to the existing element and any tag conflicts are flagged. If the existing element is a node, it is moved to the position of the reference node
    • Otherwise, unmatched nodes are just added
    • Three files are created for easy reviewing, but these files should never be committed to OSM because they contain Review Tags
      • Conflated.Review.osm has elements which failed validation
      • Conflated.Create.osm has new elements which are added
      • Conflated.Modify.osm has existing elements which were modified, (including child elements for ways and relations for proper rendering)
      • Review Tags are custom tags to make manual review simpler. They are never committed to OSM
        • maineE911id={id} - The original objectID(s) from the E911 data-set
        • maineE911id:{key}=added - This {key} was added
        • maineE911id:{key}=change from {oldValue} - This key was changed
        • maineE911id:moved={distance} meters {direction} - The element has been moved this far in this direction
        • maineE911id:info={validation message} - Details which might help a reviewer understand why software made a decision
        • maineE911id:warn={validation message} - The reason that the element failed validation. This element will be included unless blacklisted
        • maineE911id:error={validation message} - The reason that the element failed validation. This element will be excluded unless whitelisted
        • maineE911id:whitelist=yes - Indicates that the element has been manually included in the change even though failing validation
    • An OsmChange file, Conflated.osc, is created to be committed to OSM's API after the user has finished manual review
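
The loose string and number matching described above can be sketched in Python as follows. This is an illustration only, not OsmPipeline's implementation: the function names, the set of accepted delimiters, and the decision to expand a range such as "6-10" to every integer it spans are assumptions.

```python
import re
import unicodedata


def normalize_text(value):
    """Reduce a string to lowercase letters and digits, without accents."""
    decomposed = unicodedata.normalize("NFKD", value)
    without_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return "".join(c for c in without_accents.casefold() if c.isalnum())


def expand_housenumbers(value):
    """Expand lists, ranges and lists of ranges ('6;8;10', '6-10', '1-3, 7') to a set of ints."""
    numbers = set()
    tidy = re.sub(r"\s*-\s*", "-", value.strip())  # allow spaces around a range dash
    for part in re.split(r"[;,/&\s]+", tidy):
        span = re.fullmatch(r"([0-9]+)-([0-9]+)", part)
        if span:
            low, high = sorted((int(span.group(1)), int(span.group(2))))
            numbers.update(range(low, high + 1))  # assumption: every integer in the range
        elif re.fullmatch(r"[0-9]+", part):
            numbers.add(int(part))
    return numbers


def housenumber_matches(a, b):
    """True when two addr:housenumber values are the same or share a number."""
    if normalize_text(a) == normalize_text(b):
        return True
    return bool(expand_housenumbers(a) & expand_housenumbers(b))
```

For example, normalize_text treats "St. John's Café" and "st johns cafe" as the same name, and housenumber_matches("8", "6;8;10") and housenumber_matches("8", "6-10") are both true.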

Conflation Validation

Elements that don't pass these validation rules are "exceptional" and placed into their own map layer for manual review

  • Matches by name or address tags must be within 30 meters of the centroid of the other element
  • Matches by name or address tags must not match multiple subject elements (unless it is an exact match to exactly one of them)
  • Multiple reference elements can match to the same subject element, but not if they offer different non-addressy tags.
  • Matched elements must not have fixme=* or gnis:reviewed=no if they were matched by geometry
  • Matched nodes which are moved must not be a member of a way or have highway=*
  • Matched element must not be a relation and must not be "large".
  • While merging, tag conflicts are a validation error, except for tags known to be synonymous, or which can be changed to a more specific descendant tag
  • Measure if most points were moved in the same direction and warn that the data seems shifted
  • An addr:street=* or addr:place=* could not be formatted to match existing elements because there were multiple elements with different formats
  • An addr:street=* couldn't be matched to a highway and the name looked like it could be a place, like an island
  • Validate that there is a highway=* with a name=* similar to this addr:street=* (abandoned: roads were so lacking that this check was pointless)
  • And more... Check the source code or message me if you really want all the details.
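
To make the distance rules concrete, here is a minimal Python sketch of the 30 meter limit together with the "exclusively near" rule from Conflating into OSM (the next closest candidate must be at least 4x farther away). It is a sketch under assumptions, not OsmPipeline's code: the function names, the candidate format (id plus centroid), and the equirectangular distance approximation are all hypothetical.

```python
import math


def distance_m(a, b):
    """Approximate ground distance in meters between two (lat, lon) pairs."""
    mean_lat = math.radians((a[0] + b[0]) / 2)
    dx = math.radians(b[1] - a[1]) * math.cos(mean_lat)
    dy = math.radians(b[0] - a[0])
    return 6371000 * math.hypot(dx, dy)


def exclusively_near(reference, candidates, max_distance_m=30.0, exclusivity=4.0):
    """Return the id of the one candidate that is unambiguously nearest, else None.

    reference: (lat, lon) of the reference node
    candidates: list of (id, (lat, lon)) centroids of addressable features
    """
    ranked = sorted(
        ((distance_m(reference, centroid), cid) for cid, centroid in candidates),
        key=lambda pair: pair[0],
    )
    if not ranked:
        return None
    nearest_d, nearest_id = ranked[0]
    if nearest_d > max_distance_m:
        return None  # nothing within the distance limit
    if len(ranked) > 1:
        next_d = ranked[1][0]
        if next_d == 0 or next_d < exclusivity * nearest_d:
            return None  # a second candidate is too close to pick confidently
    return nearest_id
```

A candidate about 10 meters away is accepted when the runner-up is about 100 meters away, but rejected as ambiguous when the runner-up is only about 20 meters away.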

Manual Review

  • A human reviews the console output and three review files as layers in JOSM, alongside Subject.osm and background imagery
    • The command "review reference" shows a summary of all fields being added; "list [key]" (for example "list name") shows just those values with their respective counts
    • Remove bad values using "filter [key]" like "filter name" and pressing Y or N (or another option like Move) for each value as it appears
    • The command "review" will open JOSM with all output layers
    • Look for clusters of nodes which are not positioned correctly. Add a note to the map to flag the problem
    • Inspect any large objects being modified. A common problem is hospital grounds having a building tag added
    • Special attention is taken to the review file, as these elements have failed validation. They can be manually blacklisted, whitelisted, edited in iD, or ignored. From experience, many of the exceptions come from incorrect data in OSM, which I correct during this step.
    • If adjustments are made, the user selects specific steps to re-run and re-evaluate (minimizing load on exterior resources)

Committing the Change

Only after the change has been reviewed, the user will direct OsmPipeline to commit the change file to OSM's API. The commit author will be OSM user blackboxlogic+bot using basic authentication. If the commit is too large for OSM, it is broken into pieces, then committed.
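
Breaking a large commit into pieces can be as simple as the following Python sketch. The function name is hypothetical, and the default of 10,000 elements per piece is an assumption about the server's changeset limit, not a value taken from OsmPipeline. A real implementation would also need to keep dependent elements in a workable order.

```python
def split_changes(changes, max_elements=10000):
    """Split a list of changed elements into consecutive pieces of at most max_elements."""
    if max_elements < 1:
        raise ValueError("max_elements must be at least 1")
    return [changes[i:i + max_elements] for i in range(0, len(changes), max_elements)]
```

Each piece would then be uploaded as its own changeset.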

ChangeSet Tags

Sample Data

This is an early example of the results (covering only Westbrook, Maine) which has not yet been committed to OSM. The full import will cover the entire state of Maine.

Here is the Westbrook OsmChange file, which is [a sample of] the final product, but I found that an osc file is not easy to review. I recommend reviewing the following .osm files, which group the changes by change type. They also include the child elements for modified ways and relations (which wouldn't render in JOSM otherwise), and include Review Tags.

  • Nodes in Westbrook which will be Added
  • Existing elements in Westbrook which will be Modified
  • A layer of Review (elements that failed validation). Each element has a 'warn' or 'error' tag describing the problem

Risks & Known Issues

  • New address nodes may be placed near existing Points Of Interest without combining them since I have no way to confidently determine that the POI is related to this address
  • I could be blocked or throttled from OSM API or Maine's API for exceeding "normal" use patterns
  • Each municipality may have its own data quality issues. I may need to refine the process or skip entire towns if they are too mangled. For example: I've noticed that Lewiston has many addresses missing ADDRESS_NUMBER
  • The data source sometimes locates addresses at the roadside instead of on a building
  • If the import is re-run (for example: a year later to include fresh data), more development may be needed to prevent a situation where a point was added, an OSM mapper deleted the point, and the point will be re-added. This could be mitigated by pre-filtering the data source by last-modified date
  • Many roads in Maine are missing or misnamed. I'm mostly ignoring the issue for this import. The Geofabrik Address Validation layer (https://tools.geofabrik.de/osmi/#) is going to get very messy
  • Some neighborhoods (example) have their addresses bunched together. When I see it, I'm adding a note "The addresses imported on this neighborhood need to be aligned with the correct buildings"

Mistakes Were Made

  • It isn't always clear when to use Municipality vs PostalCommunity for addr:city=*
    • PostalCommunity is sometimes a neighboring municipality (should not be used) or a village (should be used).
    • This should be revisited once municipal boundaries are fixed in Maine.
  • I composed street names as [PreDir] [Name] [PostDir] [Suffix] but it should have been [PreDir] [Name] [Suffix] [PostDir]. Fixed in 80656824
  • Addresses may have a place's name (like an island) in addr:street=* (it should be in addr:place=*). Fixed in 86482696
  • Some addresses didn't merge into elements when addr:housenumber=* was a list or range (8 should merge into "6;8;10" and "6-10"). Won't fix; the effect was small.
  • E911 Street names are missing punctuation. Fixed in 86259403
  • Some E911 notes ended up in the name=* field. Fixed in 86063644, 86067866, 86067948, and 86068378.
  • addr:unit=* is intended to hold values like "B" or "2", but I populated it with "Apartment B" or "Unit 2". Fixed in 86087432.
  • Since name was not used for matching early on, new elements may have been imported next to old elements with the same name.
    • Fixed in 86584007, 86583981, 86583264, 86582353, 86581526, 86580910, 86579885, 86579485, 86578734, 86578069, 86577284, 86576240, 86574500, 86572900, 86572495, 86571429, 86569836.
  • Matching by geometry is wrong for un-reviewed GNIS elements or anything with a fixme tag. Fixed in 86292688
  • I wish I had designed this process so the work could be divided between mappers. I'll keep this in mind for my next import.
  • I regret including unit numbers; they are generally low value and were high effort.
  • I manually resolved many errors in the E911 dataset, but I didn't track them in a way that I could meaningfully report them to the team that maintains it.
  • Diagonal direction abbreviations like "NE" were mis-translated to "North East" instead of "Northeast". Fixed in 88317479
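
Two of the fixes above (the [PreDir] [Name] [Suffix] [PostDir] order and expanding "NE" to "Northeast") can be illustrated with a short Python sketch. The function name, its parameters, and the suffix table are hypothetical and incomplete; this is not the translation table used by OsmPipeline.

```python
# Diagonal directions expand to one word ("Northeast"), not two ("North East").
DIRECTIONS = {
    "N": "North", "S": "South", "E": "East", "W": "West",
    "NE": "Northeast", "NW": "Northwest", "SE": "Southeast", "SW": "Southwest",
}
# Assumed, incomplete sample of suffix expansions.
SUFFIXES = {"ST": "Street", "RD": "Road", "AVE": "Avenue", "LN": "Lane", "DR": "Drive"}


def _expand(value, table):
    """Expand a known abbreviation; pass anything else through unchanged."""
    text = (value or "").strip()
    return table.get(text.upper(), text)


def compose_street(predir, name, suffix, postdir):
    """Build addr:street as [PreDir] [Name] [Suffix] [PostDir], skipping empty parts."""
    parts = [
        _expand(predir, DIRECTIONS),
        (name or "").strip(),
        _expand(suffix, SUFFIXES),
        _expand(postdir, DIRECTIONS),
    ]
    return " ".join(part for part in parts if part)
```

For instance, compose_street("", "Main", "ST", "NE") places the direction after the suffix and spells it as a single word.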

Thank You

  • Harel helped me polish and publish OsmApiClient
  • Ben wrote an amazing base library called OsmSharp, and accepted my pull requests
  • Vorpalblade gave me some early guidance
  • Alan offered much-needed feedback and support
  • Slack's #imports, #tagging, #dev channels handled my endless questions

Up Next

Import/Maine E911 Roads