Utah/CacheValleyAddressImport

Scope

Import address data for Cache Valley, UT. This includes the regions Logan, North Logan, Hyde Park, Smithfield, Mendon, Wellsville, Paradise, Hyrum, Providence, and other local municipalities.

Future imports could re-use much of this procedure to import addresses to any part of Utah.

Community Buy-In

Please notify Xvtn of any issues with past or future progress in this ongoing import.

License Approval

UGRC Waiver Letter

The source dataset from UGRC is licensed under the Creative Commons Attribution 4.0 International License (Utah GIS License Page).

According to the Utah OSM Wiki page, "the UGRC office has signed a waiver so we can use those datasets to improve OSM." A copy of the waiver letter is shown here and on the UT wiki page.

Documentation

Data Flow

UGRC Shapefile -> QGIS Filter Data -> OGR2OSM Translation Script to OSM format -> JOSM Quality Check -> Upload to OSM Database

Please see the README in the translation script's GitHub repository for detailed technical instructions.

Data Format, Conflation

The data will first be imported as individual address point nodes. This approach has several benefits:

  • The import for an area could be considered "done" with the address points. They are usable by geocoders, search engines, etc.
  • Some buildings contain multiple addresses; these should be left as points anyway.
  • It's easy to manually merge an address point and building outline where it's obvious they belong together.

Next, where appropriate, address points are merged with other features:

  • Using the JOSM Conflation plugin, move address tags to nodes and areas where it can reasonably be assumed they represent the same feature: buildings, certain amenities, etc.
  • Manually review cases where it isn't possible to automatically merge.
    • Where possible, do armchair mapping to resolve conflicts with obvious solutions.
    • Use notes or a fixme tag to queue up conflicts for in-person review.
    • Check on the ground to resolve remaining issues.
  • These steps are all performed before uploading to OSM.
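The merge rule above can be illustrated with a small Python sketch. It is not how the JOSM Conflation plugin works internally; it only shows one conservative rule that matches the reasoning in this section: merge an address into a building when exactly one address point falls inside the outline, and keep the points separate when there are several. The function names, the dictionary layout, and the use of plain (x, y) coordinates are all assumptions made for illustration.

```python
# Illustrative sketch only; names and data layout are hypothetical.

def point_in_polygon(point, polygon):
    """Ray-casting test. polygon is a list of (x, y) vertices."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def classify_buildings(buildings, address_points):
    """Decide which buildings can take an address automatically.

    buildings:      dict of building id -> list of (x, y) vertices
    address_points: dict of address id -> (x, y)
    Returns (merge, keep_as_points):
      merge          maps building id -> the single address id inside it
      keep_as_points maps building id -> list of address ids (two or more)
    """
    merge, keep_as_points = {}, {}
    for b_id, polygon in buildings.items():
        hits = [a_id for a_id, pt in address_points.items()
                if point_in_polygon(pt, polygon)]
        if len(hits) == 1:
            merge[b_id] = hits[0]
        elif len(hits) > 1:
            keep_as_points[b_id] = hits
    return merge, keep_as_points
```

Address points that fall outside every building appear in neither result, so they stay as standalone nodes and can be reviewed manually.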

Tag Format

  • Required:
    • addr:{housenumber, street, postcode, city, country, state}
  • Optional / as applicable:
    • addr:unit for apartment units, etc. According to the wiki, the secondary unit designator is not commonly used, but mail sometimes cannot be delivered to addresses that omit it. It therefore seems worth concatenating the designator with the unit identifier, e.g. addr:unit=Apt 3 rather than addr:unit=3.
    • name from the LandmarkNa field, if present.
    • fixme, when the translation script detects a potential issue. For example, one address in Clarkston has no street suffix, so the script does what it can (addr:street=North 300) and flags the address for manual review.
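As a rough illustration of the tag rules above, the following Python sketch builds the tags for one address record. It is not the actual translation script. The function name `address_tags` and every input key except `LandmarkNa` (`number`, `street_prefix`, `street_name`, `street_suffix`, `unit_type`, `unit_id`, `city`, `postcode`) are hypothetical placeholders, and the fixme wording is invented for the example.

```python
# Hypothetical sketch of the tag rules; not the actual translation script.
# Every input key except "LandmarkNa" is a placeholder name.

def address_tags(record):
    """Build OSM address tags from one source record (a plain dict)."""
    street_parts = [
        record.get("street_prefix"),
        record.get("street_name"),
        record.get("street_suffix"),
    ]
    tags = {
        "addr:housenumber": record["number"],
        # Join only the parts that are present.
        "addr:street": " ".join(p for p in street_parts if p),
        "addr:city": record["city"],
        "addr:postcode": record["postcode"],
        "addr:state": "UT",
        "addr:country": "US",
    }
    # Concatenate designator and identifier: "Apt" + "3" -> "Apt 3".
    if record.get("unit_id"):
        unit_parts = [record.get("unit_type"), record["unit_id"]]
        tags["addr:unit"] = " ".join(p for p in unit_parts if p)
    # Landmark name, only when the source provides one.
    if record.get("LandmarkNa"):
        tags["name"] = record["LandmarkNa"]
    # Flag records with no street suffix for manual review.
    if not record.get("street_suffix"):
        tags["fixme"] = "Street suffix missing in source data; verify addr:street"
    return tags
```

For a record with `street_prefix` "North", `street_name` "300", and an empty `street_suffix`, the sketch produces addr:street=North 300 together with a fixme tag, mirroring the Clarkston case described above.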

Future Updates

Rather than using foreign keys, which are now discouraged, updates will be handled mainly using the same process as the initial import: select areas that have missing or incorrect data, combine automatically where possible, and manually review conflicts as necessary.
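One way to find missing data without foreign keys is to compare addresses by their tag values. The Python sketch below is only an assumption about how such a comparison could work, not part of the documented procedure; the function names and the choice of housenumber, street, and city as the comparison key are hypothetical.

```python
# Hypothetical sketch: find source addresses not yet present in OSM
# by comparing tag values instead of relying on a foreign key.

KEY_TAGS = ("addr:housenumber", "addr:street", "addr:city")


def address_key(tags):
    """Normalize identifying tags so case and spacing differences still match."""
    return tuple(" ".join(tags.get(k, "").split()).lower() for k in KEY_TAGS)


def missing_addresses(source, existing):
    """Return the source tag dicts whose key does not appear in existing."""
    existing_keys = {address_key(t) for t in existing}
    return [t for t in source if address_key(t) not in existing_keys]
```

Addresses returned by `missing_addresses` are candidates for import; anything that matches only partially would still need the manual review described above.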

Other Details

  • I (Xvtn) am prepared to do this import myself. I'd love to develop a plan to collaborate with other mappers if there is interest.
  • The import will be done in smaller chunks, for example small municipalities such as Clarkston, UT. Changeset size will be limited to a single municipality.
  • The "bot" account used will be Xvtn_Import.

Sample Data

Data files from before (.gpkg) and after (.osm) the translation script are available in the Git repository. Note that the .osm files have had no manual work; they are straight out of the script.

Timeline / Schedule / Progress

  • Preparation
    • 2022-10-05 - Initial Proposal
    • 2022-10-12 - Approval from imports@osm
  • Import Data
    • 2022-10-13 - Import Clarkston
    • 2022-10-16 - Import Newton
    • 2022-10-18 - Import Providence
    • 2022-10-24 - Import North Logan
    • 2022-12-13 - Import Richmond
    • 2023-05-14 - Import Logan
    • Ongoing
  • Updates
    • Need to finish importing data first