Utah/UtahBuildingsImport

From OpenStreetMap Wiki

Utah Buildings Import is an import of the Utah AGRC dataset which contains building footprints and address information covering sections of the state of Utah. If you'd like to help with this import, message OSM user osmjwh to get started.

Goals

Improve address and building footprint coverage in Utah

Schedule

2017-11 to 2017-12: Planning stage

2018-01 to ??: Manual implementation and minor ongoing updates to processing scripts

Import Data

Background

The data is sourced from the Utah AGRC and is a publicly funded governmental dataset. This import focuses primarily on the Building and Address datasets.

Data source site: https://gis.utah.gov/data/location/

Data license: No license (explicit permission has been given to use the data to contribute to OSM)

Type of license (if applicable): n/a (No license has been selected yet)

Link to permission (if required): https://lists.openstreetmap.org/pipermail/imports-us/2018-January/000836.html

OSM attribution (if required): n/a

ODbL Compliance verified: n/a

Import Details

Import Type

One time import, done in many separate uploads using a manual extraction process from QGIS, script-assisted processing with JOSM, and a manual review and upload.

Team Approach

OSM user osmjwh plans to lead the effort, but others are absolutely welcome to help.

Conflation Plan

At this time, we plan to use a manual conflation strategy: each import is kept small enough that duplicated features are easy to identify. When a duplicate is found, the imported feature is compared to the existing feature, and the one with the more detailed footprint is kept. All tags are merged so that nothing is lost when the less detailed feature is deleted. Any conflicting tags are investigated for quality, and the higher-quality value is kept.
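The tag-merging part of this strategy can be illustrated with a minimal Python sketch. It is not part of the import scripts; the function name `merge_tags` and the plain-dictionary representation of tags are assumptions made for illustration. It copies over every tag that the kept feature lacks and reports conflicting values for manual investigation rather than resolving them automatically.

```python
def merge_tags(kept, dropped):
    """Merge tags of the feature being deleted into the feature being kept.

    Both arguments are plain dicts of OSM tags (hypothetical representation).
    Returns (merged, conflicts), where conflicts maps each key that has
    different values to a (kept_value, dropped_value) pair.
    """
    merged = dict(kept)
    conflicts = {}
    for key, value in dropped.items():
        if key not in merged:
            # Tag only exists on the dropped feature: carry it over.
            merged[key] = value
        elif merged[key] != value:
            # Same key, different value: leave the kept value in place
            # and flag the key for a manual quality check.
            conflicts[key] = (merged[key], value)
    return merged, conflicts
```

Any key returned in `conflicts` still needs a human decision about which value is of higher quality.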

OSM User Requirement

These imports must be made using the UtahBuildingsImport OSM user. If interested in contributing, contact osmjwh to get credentials for this user.

Data Processing Scripts

A large portion of the data processing is scripted, using the JOSM Scripting plugin (JOSM/Plugins/Scripting) and the scripts found at the OSM Utah Buildings Imports code repository.

Changeset Tags

Changesets will be tagged with the following tags:

  • comment=UtahBuildingsImport - Include a description of what area is being imported here
  • source=Utah AGRC
  • import=yes
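As a small illustration, the tag set above could be assembled as follows. This is only a sketch in Python; the helper name `changeset_tags` and its `area_description` parameter are hypothetical and not part of the import scripts.

```python
def changeset_tags(area_description):
    """Build the changeset tags listed above for one upload.

    area_description is free text naming the area being imported
    (hypothetical parameter, chosen for this sketch).
    """
    return {
        "comment": f"UtahBuildingsImport - {area_description}",
        "source": "Utah AGRC",
        "import": "yes",
    }
```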

QA

We plan to QA using a peer review method.

Workflow

Before Starting

Make sure you have the following installed:

The following plugins are optional, but very helpful:

Data Extraction/Preparation

The dataset will be processed in small subsets of roughly 10-30 city blocks. QGIS is used to extract this subset from the AddressPoints.shp and Buildings.shp files using the following process for each file. Note that the same area should be extracted for both files.

  1. Open the shapefile in QGIS
  2. Select the entities to upload
    1. It is usually easier to orient yourself if you have aerial imagery, available by downloading the OpenLayers plugin for QGIS.
  3. Copy the selected entities (for addresses, this may take a while)
  4. Paste as new vector layer (Edit -> Paste Features As... -> New Vector Layer)
  5. Save this new layer as ESRI shapefile

Then, open both of the new ESRI shapefiles in JOSM using the opendata plugin, by following the instructions here.

Example

See *.shp file examples of extracted Address and Building datasets here

Data Processing

Buildings Shapefile

After preparing the data as specified in the "Data Extraction" section, remove all tags from the Buildings layer. Select all items in this layer, de-select all nodes, and assign all the polygons the tag of building=yes.

If present, process the Type tag as appropriate based on knowledge of OSM tags to further refine the building tag value.

This process is automated by running the "UtahBuildingsImport_Buildings.js" script from the OSM Utah Buildings Imports code repository.

Addresses Shapefile

Process the address points layer by converting the following AGRC tags to the relevant OSM tags:

Shapefile Attribute | OSM Tag | Description
AddNum | addr:housenumber=* | The house number of the address.
City | addr:city=* | The city of the address in all caps. Decapitalize prior to assigning to the OSM tag.
ParcelID | utahagrc:parcelid=* | The Utah AGRC parcel ID number.
PtLocation | name=* | The common name of the address in all caps. This is not present on most address points.
State | addr:state=* | The US State of the address. These should all be 'UT'.
ZipCode | addr:postcode=* | The postcode of the address.
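The attribute mapping in the table can be sketched in Python as below. This is an illustration only, not the repository script: the names `ATTRIBUTE_TO_OSM` and `to_osm_tags` are hypothetical, and attributes are assumed to arrive as a plain dictionary. Only City is decapitalized, because that is the only conversion the table specifies.

```python
# Shapefile attribute -> OSM tag key, taken from the table above.
ATTRIBUTE_TO_OSM = {
    "AddNum": "addr:housenumber",
    "City": "addr:city",
    "ParcelID": "utahagrc:parcelid",
    "PtLocation": "name",
    "State": "addr:state",
    "ZipCode": "addr:postcode",
}


def to_osm_tags(attrs):
    """Convert one address point's shapefile attributes to OSM tags.

    attrs is a dict of attribute name -> value (assumed representation).
    Missing or empty attributes are skipped; unmapped ones are ignored.
    """
    tags = {}
    for attr, key in ATTRIBUTE_TO_OSM.items():
        raw = attrs.get(attr)
        if raw is None:
            continue
        value = str(raw).strip()
        if not value:
            continue
        if attr == "City":
            # The source data is all caps. str.title() is a simple
            # approximation of decapitalizing and can mis-case some names.
            value = value.title()
        tags[key] = value
    return tags
```

Because `str.title()` capitalizes the first letter of every word, unusual city names may still need a manual check.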

Furthermore, keep the following tags (if present) to help in processing street/location names:

Shapefile Attribute | Description
PrefixDir | The street prefix in N, S, E, or W, representing North, South, East, or West. For example, for 123 S 4500 E, PrefixDir=S. Only present where applicable.
PtType | Denotes the type of building. Common values are 'Residential', 'Commercial', etc.
StreetName | The name of the street in all caps. This does not include the street type. For example, for Abc Ave, StreetName=ABC.
StreetType | The type of the street in all caps. For example, for Abc Ave, StreetType=AVE. Options include 'AVE', 'CIR', 'DR', 'ST', 'WAY', etc.
SuffixDir | The street suffix in N, S, E, or W, representing North, South, East, or West. For example, for 123 S 4500 E, SuffixDir=E. Only present where applicable.

To process the street names, follow the steps below:

  1. For each street in the dataset, find all entities in the layer that have a StreetName matching it. Combine the StreetName and StreetType or SuffixDir tags to create the addr:street OSM tag.
    1. For named streets (e.g. Wilson Ave), the StreetType tag contains the type of street (St, Ave, Cir, etc). For example, for Wilson Ave, StreetName = Wilson and StreetType = Ave. If there are multiple street types for the same name (e.g. Wilson Ave and Wilson Ct), make sure to include the StreetType in your entity query.
    2. For numbered streets (e.g. 700 East), the SuffixDir tag contains the direction. For example, for 700 East, StreetName = 700 and SuffixDir = E.
  2. Assign the selection an addr:street OSM tag according to the tagging above. This will require decapitalizing the street name and fully spelling out the street type or direction.
  3. Remove all points with duplicate addresses, especially those with a PtType="BASE ADDRESS" tag.

The process above is automated by running the "UtahBuildingsImport_Addresses.js" script from the OSM Utah Buildings Imports code repository.
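For readers who want to see the street-name logic without opening the script, here is a minimal Python sketch of steps 1 and 2. It is not the repository script. The function name `build_addr_street` and the lookup tables are assumptions, the list of street types is deliberately incomplete, and appending both StreetType and SuffixDir when both are present is a guess at the intended behavior.

```python
# Abbreviation -> full word. Partial list for illustration only.
STREET_TYPES = {
    "AVE": "Avenue",
    "CIR": "Circle",
    "CT": "Court",
    "DR": "Drive",
    "ST": "Street",
    "WAY": "Way",
}
DIRECTIONS = {"N": "North", "S": "South", "E": "East", "W": "West"}


def build_addr_street(attrs):
    """Build an addr:street value from StreetName, StreetType and SuffixDir.

    attrs is a dict of shapefile attributes (assumed representation).
    Returns None when there is no street name.
    """
    name = (attrs.get("StreetName") or "").strip().title()
    if not name:
        return None
    street_type = (attrs.get("StreetType") or "").strip().upper()
    suffix = (attrs.get("SuffixDir") or "").strip().upper()

    parts = [name]
    if street_type:
        # Unknown abbreviations are kept in title case so they stand out
        # during review and can be spelled out by hand.
        parts.append(STREET_TYPES.get(street_type, street_type.title()))
    if suffix:
        parts.append(DIRECTIONS.get(suffix, suffix))
    return " ".join(parts)
```

With this sketch, StreetName=WILSON and StreetType=AVE gives "Wilson Avenue", and StreetName=700 with SuffixDir=E gives "700 East".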

Next, merge the Address and Building layers together.

Merging AGRC Shapefiles

Using the "BuildingsTools" JOSM plugin, merge the address points into the building shapes (Data -> Merge Address Points). This will only merge the addresses if there is a single address inside of the building footprint. You will likely run into the following issues:

  1. The address point lies just outside of a building footprint. In this case, just move the address inside of the building footprint and merge the address points again.
  2. A building has 3 address points with the same house number within the same building. Just delete two of these address points and merge the address points again.
  3. A building has 2 address points with different house numbers within the same building. If this is a house, it is likely a duplex. You may choose to split the house into two separate shapes, each with the relevant house number.
  4. A commercial building has many address points. You may choose to split the commercial building into smaller spaces, or leave the address points unmerged depending on the building configuration. It may be easier to merge the data into any existing OSM nodes if the addresses are not merged into the buildings.
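To find the problem cases before merging, it can help to count how many address points fall inside each footprint. The following Python sketch does this with a standard ray-casting point-in-polygon test. It is not how the JOSM plugin is implemented; the function names, the coordinate tuples, and the dictionary of footprints are all assumptions for illustration, and coordinates are treated as planar, which is reasonable only for small areas.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: is point (x, y) inside polygon [(x, y), ...]?

    The polygon is a list of vertices without a repeated closing vertex.
    Points exactly on an edge may be classified either way.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def addresses_per_footprint(footprints, address_points):
    """Map each footprint id to the indices of address points inside it.

    footprints: dict of id -> list of (x, y) vertices (hypothetical input).
    address_points: list of (x, y) tuples.
    """
    return {
        building_id: [
            index
            for index, point in enumerate(address_points)
            if point_in_polygon(point, polygon)
        ]
        for building_id, polygon in footprints.items()
    }
```

Footprints with exactly one index are the ones that merge cleanly. Those with none or several correspond to the issues listed above and need manual attention.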

Processing Merged AGRC Data

Process the PtType and UnitType tags using the following guidance:

  1. If the UnitType tag is "APT", adjust building=yes to building=apartments.
  2. If the PtType tag is "Commercial", adjust building=yes to building=commercial.
  3. If the PtType tag is "Residential", and the building tag hasn't been adjusted to apartments in the steps above, apply the tag building=house.

The process above is automated by running the "UtahBuildingsImport_Merged.js" script from the OSM Utah Buildings Imports code repository.
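The three rules can be expressed as a short Python sketch. This is an illustration rather than the repository script: the function name `refine_building_tag` and the dictionary representation are assumptions, and comparing values case-insensitively is also an assumption, since the exact capitalization in the data may vary.

```python
def refine_building_tag(tags):
    """Refine building=yes using the UnitType and PtType attributes.

    tags is a dict holding both OSM tags and leftover AGRC attributes
    (assumed representation). A new dict is returned; the input is not
    modified. Buildings already tagged with something other than
    building=yes are left alone.
    """
    refined = dict(tags)
    if refined.get("building") != "yes":
        return refined

    unit_type = (refined.get("UnitType") or "").strip().lower()
    pt_type = (refined.get("PtType") or "").strip().lower()

    if unit_type == "apt":
        refined["building"] = "apartments"
    elif pt_type == "commercial":
        refined["building"] = "commercial"
    elif pt_type == "residential":
        # Only reached when the building was not tagged as apartments.
        refined["building"] = "house"
    return refined
```

Anything that still has building=yes afterwards goes through the manual review step.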

Review

Next, review the remaining footprints with building=yes.

  1. There may be some larger public buildings, like churches or libraries, so tag them accordingly.
  2. Some of these may be sheds, in which case apply the building=shed tag.
  3. Most of the remaining entities are garages, so apply the building=garage tag.

Using an aerial imagery underlay that has been calibrated using GPS traces, adjust the building entities so that they match the aerial imagery footprints.

If using automated scripts, review the objects that have the "utahagrc:review" tag. As these objects are reviewed and tagging is corrected, remove the "utahagrc:review" tag.

At this point, remove all AGRC tagging that was not removed earlier in the process. All that should remain are name, building, and various addr tags.
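This final cleanup amounts to a simple filter on tag keys. A minimal Python sketch, assuming tags are held in a plain dictionary and using the hypothetical function name `strip_agrc_tags`:

```python
def strip_agrc_tags(tags):
    """Keep only name, building, and addr:* tags; drop everything else.

    This also drops utahagrc:* keys, so run it only after any
    utahagrc:review objects have been reviewed.
    """
    return {
        key: value
        for key, value in tags.items()
        if key in ("name", "building") or key.startswith("addr:")
    }
```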

Example

See a *.osm file example of a fully processed dataset (the same as was extracted above) here

Data Merging

Finally, data should be merged into a downloaded OSM layer by selecting both layers and clicking 'Merge'. Then upload this layer and tag the changeset as specified in the Import Details section above.

Note that with large changesets, uploads can take a long time. To speed things up, consider adjusting the upload settings to create a single changeset for a larger number of objects (5000 or so) at once.

Warning about canceling uploads

DO NOT CANCEL IN THE MIDDLE OF THE UPLOAD! The data uploaded so far will be committed to OSM, but the remaining data will not. If you just try to upload all of the data again, it will cause a large number of upload errors because it will identify that duplicates are being uploaded. At this point, you have a few options:

  1. Revert your changeset and re-upload your work (not especially easy or clean)
  2. Figure out what was committed by your change and remove it from either your upload or OSM.

Even after this, you will likely deal with duplicated objects, for which the verifier can be a big help.

Basically, it turns into a huge mess so just don't cancel your uploads halfway through.