Address Data Import for Oconee County, GA, USA

From OpenStreetMap Wiki

This Import Plan Outline is intended to help ensure that your "Import plan" document covers as many of the common questions about imports as possible. Just create your own page and copy and paste the wiki text from here (starting from below the line)

Please! If you identify ways that this outline didn't meet the needs of your import (key evidence of this: tons of questions or alarm bells on mailing lists!), please return and fix this page.


(Import title) is an import of (name of dataset) dataset which is of type (data type) covering (broad location in country). The import is currently (as of 05/18/2022) at the planning stage.

Goals

Identify the goals of the import.

Schedule

  • Reached out to the county GIS office to open communication in mid-May 2022.
  • Reached out to the Imports mailing list around the same time.
  • Transformed and filtered the data in June and July 2022.
  • Will notify the Imports mailing list and Slack of the intention to import in mid-to-late July 2022.

Import Data

Background

Provide links to your sources.

Data source site: http://website.tld

Data Projection Type: NAD_1983_StatePlane_Georgia_West_FIPS_1002_Feet, which corresponds to EPSG:2240 (NAD83 / Georgia West (ftUS))
Data license: https://website.tld/license
Type of license (if applicable): e.g. CC-BY-SA, Public Domain, Public Domain with Attribution, etc.
Link to permission (if required): e.g. link to mail list reference url - http://lists.openstreetmap.org/pipermail/imports/2012-December/001617.html
OSM attribution (if required): http://wiki.openstreetmap.org/wiki/Contributors#yourdataprovider
ODbL Compliance verified: yes/no

Ian Van Giesen <ianvangiesen@gmail.com> Wed, May 18, 2022 at 9:06 AM
To: mbeal@oconee.ga.us, nwilliams@oconee.ga.us

Title: Inquiry for Address Data of Oconee County

Good morning Matt Beal and Nicole Wiliams,

Hope all is well with you both.

My name is Ian Van Giesen and I am a once and longtime resident of Oconee County (grew up here). I am also an avid mapper of Athens-Clarke County and the surrounding area through OpenStreetMap (OSM), a project to create and distribute free geographic data for people across the globe. To my knowledge it is the largest open-source, volunteer-managed alternative to other large GIS and map providers in the world.

I am emailing you because I am interested in adding addresses to Oconee County. I have worked with the GIS Office in Athens-Clarke County, where I was able to acquire address data (stored separately from buildings and parcels) that I am currently importing there. Do you have an address dataset available that I could import into OSM?

I look forward to your response to this inquiry.

Thank you very much for your time and consideration,

Ian

Matt Beal <mbeal@oconee.ga.us> Wed, Jun 15, 2022 at 8:30 AM
To: Nicole Williams <nwilliams@oconee.ga.us>, "ianvangiesen@gmail.com" <ianvangiesen@gmail.com>

Ian,


We have no objections to our address data being in OSM.


Best,

Matthew D. Beal

GIS Administrator

Oconee County, Georgia

www.OconeeCounty.com

706-310-3546

OSM Data Files

Link to your source data files that you have prepared for the import - e.g. the .osm files you have derived from the data sources.

Import Type

Identify if this is a one-time or recurring import and whether you'll be doing it with automated scripts, etc.

Identify what method will be used for entering the imported data into the OSM database - e.g. API, JOSM, upload.py, etc.

Data Preparation

QGIS was used to add latitude and longitude attributes to the geometries and to export the address data as a .csv file for use in R. The GitHub page for this project will include an RStudio project with the files and script used to create the .csv file for import. JOSM was used to create the .osm file.

I used the following Reddit thread to help add longitude and latitude to the shapefile in QGIS so that I could work with it in RStudio: https://www.reddit.com/r/QGIS/comments/k8sf01/can_not_get_latitude_and_longitude_coordinates/
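The last step of the preparation described above, turning a table of addresses into an .osm file, was done in JOSM. As an illustration only, the following Python sketch shows what such a file contains when built by hand. It is not the method used for this import. The row layout, the sample coordinates, the sample address and the generator name are all made up for the example, and whether JOSM accepts the output without further attributes should be checked before relying on it.

```python
import xml.etree.ElementTree as ET


def rows_to_osm(rows):
    """Build an OSM XML element with one new node per address row.

    Each row is a dict with "lat", "lon" and a "tags" dict
    (a hypothetical layout chosen for this sketch).
    """
    root = ET.Element("osm", version="0.6", generator="address-import-sketch")
    for index, row in enumerate(rows, start=1):
        node = ET.SubElement(
            root,
            "node",
            id=str(-index),  # negative ids denote objects not yet in the OSM database
            visible="true",
            lat=f"{row['lat']:.7f}",
            lon=f"{row['lon']:.7f}",
        )
        for key, value in sorted(row["tags"].items()):
            if value:  # skip empty values rather than writing empty tags
                ET.SubElement(node, "tag", k=key, v=value)
    return root


# Made-up sample row; not taken from the county dataset.
sample = [
    {
        "lat": 33.8626,
        "lon": -83.4088,
        "tags": {
            "addr:housenumber": "23",
            "addr:street": "Example Street",
            "addr:unit": "",
        },
    },
]

root = rows_to_osm(sample)
print(ET.tostring(root, encoding="unicode"))
```

Negative ids are used so that an editor treats the nodes as new objects rather than edits to existing ones.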

Tagging Plans

To map source attributes to OSM tags, I removed all attributes that have no commonly used counterpart among the OSM tags in my area. For example, I chose addr:unit over alternatives such as addr:flats for unit numbers, as it seems more customary and appropriate in Georgia and in the U.S.

I plan to have only the necessary addr:*=* key-value pairs present in the OSM file before upload.

I will use the following tags for the addresses I import:

addr:housenumber=*

addr:unit=*

addr:street=*

addr:postcode=*

addr:city=*

I did not include a ref:*=* value, for two reasons:

  1. I did not confirm that each address has a unique ID that stays stable when the database is updated.
  2. The relatively small number of addresses, and their relative simplicity (most are associated with stand-alone structures), mean that an update and re-import of addresses in Oconee County should be relatively easy (especially with the JOSM conflation plugin).
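The rule that only the five planned addr:* keys reach the OSM file can be checked mechanically before upload. The sketch below is one way to do that in Python, under the assumption that each address is available as a dict of tag keys and string values. The function name and the sample values (including the leftover OBJECTID attribute) are hypothetical.

```python
# The five keys listed in the tagging plan above.
PLANNED_KEYS = {
    "addr:housenumber",
    "addr:unit",
    "addr:street",
    "addr:postcode",
    "addr:city",
}


def keep_planned_tags(tags):
    """Return only the planned addr:* tags, trimmed, dropping empty values."""
    cleaned = {}
    for key, value in tags.items():
        if key in PLANNED_KEYS and value and value.strip():
            cleaned[key] = value.strip()
    return cleaned


# Made-up example: a leftover source attribute and an empty unit are dropped.
raw = {
    "addr:housenumber": " 1010 ",
    "addr:street": "Example Road",
    "addr:unit": "",
    "addr:city": "Watkinsville",
    "OBJECTID": "42",
}
print(keep_planned_tags(raw))
```

Because no ref:*=* key is in the planned set, any source identifier that slips through the earlier column trimming would be removed here as well.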

Changeset Tags

Describe how you'll use changeset tags in the import.

I will use the following changeset tags:

I will also be importing under the account and username IanVG_Import, to keep the changesets in a separate account.

The link to this wiki page will also be added under source, in order to link curious and interested mappers to this page.

Data Transformation

After importing a .csv file from QGIS into RStudio, the data needed to be trimmed down:

  1. I deleted all columns that contained no values (i.e. only NA values). They are not listed here for the sake of simplicity, but removing them cut the number of columns roughly in half.
  2. I deleted the remaining columns that did not hold information useful for address nodes.
    1. The source columns already held values suitable for addr:street=*, so I did not need to manipulate the data at all for that tag.
    2. I was able to use a column directly for addr:housenumber=* values; no filtering or transformation was required.
    3. The Building and Unit columns caused a little trouble, but I decided to keep the text (e.g. "Building", "Suite") in the values, as suggested by a user (bgo-...) on Slack (link here).
      1. I then combined (concatenated) the two columns into one value for the addresses that had both Building and Unit values (e.g. "Building 100 Suite 105").
    4. I decided to keep all addresses in the Placement column that were associated with either structures or parcels.
      1. I asked about this on Slack and was informed that individual parcel address nodes are not harmful.
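The trimming steps above were carried out in R. The following Python sketch mirrors two of them, dropping all-NA columns and concatenating Building and Unit, on rows already read into dicts (for example with csv.DictReader). Only the Building and Unit column names come from the description above. The Number and Alias columns and all sample values are made up, and treating both the empty string and "NA" as missing is an assumption about how the export encodes missing values.

```python
NA_VALUES = {"", "NA"}  # assumed encodings of a missing value in the exported .csv


def drop_empty_columns(rows):
    """Remove columns whose value is missing in every row."""
    if not rows:
        return []
    keep = [
        column
        for column in rows[0]
        if any((row.get(column) or "").strip() not in NA_VALUES for row in rows)
    ]
    return [{column: row.get(column, "") for column in keep} for row in rows]


def combine_unit(row):
    """Join the Building and Unit values into a single unit string."""
    parts = [(row.get(column) or "").strip() for column in ("Building", "Unit")]
    return " ".join(part for part in parts if part not in NA_VALUES)


# Made-up rows for illustration.
rows = [
    {"Number": "100", "Building": "Building 100", "Unit": "Suite 105", "Alias": "NA"},
    {"Number": "200", "Building": "NA", "Unit": "", "Alias": ""},
]
trimmed = drop_empty_columns(rows)
print(list(trimmed[0]))
print(combine_unit(trimmed[0]))
```

Rows without a Building or Unit value yield an empty string, which can then be left out of the output so that no empty addr:unit tag is written.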

Data Transformation Results

Post a link to your OSM XML files.

Data Merge Workflow

Team Approach

The user IanVG will be doing this import solo. The account for importing the data will of course be the IanVG_Import account.

References

List all factors that will be evaluated in the import.

Workflow

Detail the steps you'll take during the actual import.

Information to include:

  • Step by step instructions
  • Changeset size policy
  • Revert plans

The steps I will take during the import are:

  1. Load the shapefile into JOSM using the opendata plugin.
  2. Use the Overpass Query wizard to run: "addr:housenumber" = * OR building=* in "Oconee County" for nodes and ways only, and use the Search Area of Interest to download only within the bounding box of the Oconee County administrative area.
  3. Use the filter tool to filter out everything except:
    • Buildings
    • Nodes and ways with addresses
  4. In addition, filter out "type=relation" objects, as the conflation plugin cannot handle them.
  5. Use the conflation plugin (as specified below) to conflate the right points into the map.
    • Use: Disambiguating Method with Centroid Distance set to 10.0
  6. Manually import and merge (when possible) the address nodes with building ways.
  7. Merge without Warning: addr:city, addr:postcode, addr:street and level
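To illustrate what a centroid-distance threshold does in the conflation step, the sketch below pairs each address point with the nearest building centroid and discards pairs farther apart than the threshold. It is not the conflation plugin's algorithm. It assumes the 10.0 setting is a distance in meters, which has not been verified here, and it uses a plain vertex average as the centroid. All ids and coordinates are made up.

```python
import math

EARTH_RADIUS_M = 6371000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def centroid(outline):
    """Average of the outline vertices (pass each corner once, without the
    repeated closing node); a simple stand-in for a true polygon centroid."""
    lats = [lat for lat, _ in outline]
    lons = [lon for _, lon in outline]
    return sum(lats) / len(lats), sum(lons) / len(lons)


def match_addresses(addresses, buildings, max_distance_m=10.0):
    """Map each address id to the nearest building id within the threshold, else None."""
    centroids = {b_id: centroid(outline) for b_id, outline in buildings.items()}
    matches = {}
    for addr_id, (lat, lon) in addresses.items():
        best_id, best_distance = None, max_distance_m
        for b_id, (c_lat, c_lon) in centroids.items():
            distance = haversine_m(lat, lon, c_lat, c_lon)
            if distance <= best_distance:
                best_id, best_distance = b_id, distance
        matches[addr_id] = best_id
    return matches


# Made-up data: one small square building and two address points.
buildings = {
    "b1": [(33.85995, -83.40005), (33.85995, -83.39995),
           (33.86005, -83.39995), (33.86005, -83.40005)],
}
addresses = {"a1": (33.86001, -83.40001), "a2": (33.86100, -83.40000)}
print(match_addresses(addresses, buildings))
```

Addresses left unmatched by a threshold like this are the ones that would need the manual review described in the steps above.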

Conflation

Identify your approach to conflation here.

QA

Add your QA plan here.

See also

The email to the Imports mailing list was sent on YYYY-MM-DD and can be found in the archives of the mailing list at [1].