Address Data Import for Athens-Clarke County






Address Data Import for Athens-Clarke County is an import of the Athens-Clarke County Open Data address dataset, which is of type address and covers Athens-Clarke County in the United States. The import is currently (as of January 24, 2021) at the planning stage.

Goals

The goal of this import is to augment existing address data in Athens-Clarke County.

Schedule

The timeline of the import is yet to be determined, as there is currently only one active participant in the project, IanVG. The addresses are expected to be added manually/mechanically through JOSM using the Open Data plugin, so the process will not be instant.

Import Data

Github Project Page

The github project page for the address import is located here.

Background


Data source site: https://data-athensclarke.opendata.arcgis.com/
Data license: Currently unknown. No formal license has been identified, and I may still ask about one, but written permission to use the data has been received (see the email exchange below).
Type of license (if applicable): Unknown (see above).
Link to permission (if required): The written permission is reproduced in the email exchange below.

Ian Van Giesen <ianvangiesen@gmail.com> Sat, Sep 19, 2020 at 12:03 PM
To: joseph.dangelo@accgov.com

Hello Joseph Angelo,

Hope all is well with you.

My name is Ian Van Giesen and I am a current resident and student in Athens. I am also an avid mapper of Athens-Clarke County through OpenStreetMap (OSM), a project to create and distribute free geographic data for people across the globe. To my knowledge it is the largest open-source, volunteer-managed alternative to other large GIS and map providers in the world. I have spent a good deal of time the past 5 months greatly improving the quality of the map around the Athens-Clarke County area, having added several thousand new building outlines that have enhanced the quality of the map around the county.

I am emailing you because I am at a point now where I am interested in adding addresses to all the buildings and houses that I have added in. I recently became aware of the ACC Open Data portal which appears to provide a great deal of useful information, including address points. As far as I can tell, the ACC Address Point data set would be perfect for importing en masse all the address points needed to get a majority of addresses added into OSM.

My question for you is whether or not there are any copyright restrictions on the data or if the data is in the public domain.

In any case I look forward to your response to this inquiry.

Thank you very much,

Ian

Joseph D'Angelo <Joseph.D'Angelo@accgov.com> Mon, Sep 21, 2020 at 8:29 AM
To: Ian Van Giesen <ianvangiesen@gmail.com>

Good morning Ian,


Thank you very much for your inquiry, your request, and most of all your contributions to OSM. We strive to offer as many up-to-date, clean, and accurate open datasets as possible for just the kind of purpose you’re describing. Please feel free to use any information you find on the portal in any way you see fit. I am glad you asked!


Also available via the portal are all known building footprints. They typically run about a month behind new construction, and like the addresses, are available to all.


Sincerely,


Joseph D’Angelo

OSM attribution (if required): http://wiki.openstreetmap.org/wiki/Contributors#yourdataprovider
ODbL Compliance verified: Yes.

OSM Data Files

The .osm file for import is here.

Import Type

This is a one-time import and will be done by hand. The addresses of Athens will be added mechanically using JOSM and its Open Data plugin. This will take time, but it allows for careful and consistent mapping of the addresses.

Data Preparation

Data Reduction & Simplification

Describe your plans, if any, to reduce the amount of data you'll need to import.

Examples of this include removing information that is already contained in OSM or simplifying shapefiles.

Tagging Plans

Describe your plan for mapping source attributes to OSM tags.

Changeset Tags

Describe how you'll use changeset tags in the import.

Data Transformation


The transformations I am performing on the .csv file containing the address data include:

DELETING THE FOLLOWING COLUMNS:

  • OBJECTID
  • ParcelID
  • FullAdd
  • FullHouse
  • PreDir
  • PostDir
  • FullStreetName
  • UnitType
  • Building
  • POSTALCITY
  • State
  • AddClass
  • AddType
  • AddStart
  • AddEnd
  • EditDate
  • JOINID
  • GlobalID
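As a rough illustration of the column deletion listed above, the following Python sketch drops those columns from rows read with the standard `csv` module. The sample row is hypothetical and is not taken from the real dataset; only the column names come from the list above.

```python
import csv
import io

# Columns listed above as deleted from the source CSV.
DROP_COLUMNS = {
    "OBJECTID", "ParcelID", "FullAdd", "FullHouse", "PreDir", "PostDir",
    "FullStreetName", "UnitType", "Building", "POSTALCITY", "State",
    "AddClass", "AddType", "AddStart", "AddEnd", "EditDate", "JOINID",
    "GlobalID",
}


def drop_columns(rows, drop=DROP_COLUMNS):
    """Return copies of the row dicts without the dropped columns."""
    return [{k: v for k, v in row.items() if k not in drop} for row in rows]


# Hypothetical sample row; the real data comes from the portal's CSV export.
sample = io.StringIO(
    "OBJECTID,AddressID,HouseNumber,StreetName,State\n"
    "1,1001,250,Oglethorpe Ave,GA\n"
)
kept = drop_columns(csv.DictReader(sample))
print(kept)
```

The same function could be applied to a `csv.DictReader` opened on the downloaded file instead of the in-memory sample.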

TRANSFORMING THE FOLLOWING COLUMN NAMES:

  • AddressID -> ref:athensclarkeaddress
    • Whether to include this tag is undecided; it will likely be omitted if a mechanical update process is used. Waiting for clarification of the guidance on this issue.
  • HouseNumber -> addr:housenumber
  • StreetName -> addr:street
  • Unit -> addr:unit
  • Floor -> level
  • City -> addr:city
  • Zipcode -> addr:postcode
  • Notes on the Floor -> level mapping: the column is renamed to "level" because the floor is not officially part of the complete address for the data points.
    • Changed BASEMENT to -1 (as per the level page).
    • Subtracted one from every level row with a numerical value (e.g. all the 1's became 0's).
      • This is in accordance with the level page: -1 is for basements, 0 for ground level, and 1 for the first level above ground level.
    • Deleted the field (but not the row) for the ".." level value.
    • Changed the one BOTTOM value to -1.
    • Changed the 396 "GROUND" values to 0.
    • Deleted the fields (not the rows) with the values "D", "X", and "SIDE".
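The column renaming and level adjustments above could be sketched in Python roughly as follows. This is only an illustration of the rules in the list, not the procedure actually used; the function names `normalize_level` and `to_osm_tags` are hypothetical, and the sketch assumes every numeric floor value is a plain non-negative integer.

```python
# Source column -> OSM tag key, as listed above.
# Whether ref:athensclarkeaddress is kept at all is still undecided.
RENAME = {
    "AddressID": "ref:athensclarkeaddress",
    "HouseNumber": "addr:housenumber",
    "StreetName": "addr:street",
    "Unit": "addr:unit",
    "Floor": "level",
    "City": "addr:city",
    "Zipcode": "addr:postcode",
}


def normalize_level(floor):
    """Convert a source Floor value to an OSM level string ("" = no tag)."""
    value = floor.strip().upper()
    if value in ("BASEMENT", "BOTTOM"):
        return "-1"
    if value == "GROUND":
        return "0"
    if value.isdecimal():
        # Source counts the ground floor as 1; OSM counts it as 0.
        return str(int(value) - 1)
    # "..", "D", "X", "SIDE" and blanks: drop the field, keep the row.
    return ""


def to_osm_tags(row):
    """Build an OSM tag dict from one source row, skipping empty values."""
    tags = {}
    for source_column, osm_key in RENAME.items():
        value = row.get(source_column, "").strip()
        if source_column == "Floor":
            value = normalize_level(value)
        if value:
            tags[osm_key] = value
    return tags


# Hypothetical example row.
print(to_osm_tags({"HouseNumber": "250", "StreetName": "Oglethorpe Ave",
                   "Floor": "2", "Unit": ""}))
```

Empty values are skipped so that a row with a deleted level field still yields its address tags.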

FILTERING AND THEN DELETING THE FOLLOWING COLUMNS:

  • Include only 'Active' from the Status column and delete the rest.
    • Still waiting for response on the meaning of 'Retired', 'Reserved', and 'Potential'
  • Potentially only including blanks from 'Anomoly' column
    • I'm going to include only blanks from the 'Anomoly' column for now. Maybe later I will go back and individually add the addresses flagged in that column.
  • Potentially only including blanks from 'Comments' column
    • I deleted all entries (rows) with any comments. I may go back later to update these addresses. Some of the comments indicated that the address was soon to change or that it needed to be merged with another.
  • Street type: "er" for 250 Oglethorpe Er; not sure what this means, so I deleted it.
  • Filtered out the rows with a value in the HouseNumEx column: "2021, January, 02" (stands for one-half); A; B; C; D; E.
    • Maybe we'll add these in later, as there is a good deal of them (3398 of them to be exact).
    • After this I deleted the FullHouseNumber column, as it's now the same as addr:housenumber.
  • I ended up deleting rows with an addr:housenumber value of 0. I didn't see anything for them on Google Maps.
  • After everything was said and done, there ended up being 925 duplicate address rows, so they had to be taken out as well. Once that is done, I will re-merge the .csv with the shapefile, and hopefully everything will work out.
    • Used R to remove the duplicate addresses. This number turned out to be around 700. I did not include either member of each duplicate address pair, as extra information is needed to uniquely identify these addresses in OSM.
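The filtering and duplicate removal described above can be illustrated with a short Python sketch. The actual duplicate removal was done in R, so this is only a stand-in for the idea. It assumes the original source column names (before renaming), and the choice of `KEY_COLUMNS` as the definition of a "duplicate address" is an assumption made for illustration.

```python
from collections import Counter

# Assumed definition of "same address"; not confirmed by the import plan.
KEY_COLUMNS = ("HouseNumber", "StreetName", "Unit", "City", "Zipcode")


def filter_rows(rows):
    """Keep Active rows with no anomaly, comment, or house number extension."""
    kept = []
    for row in rows:
        if row.get("Status", "").strip() != "Active":
            continue
        if row.get("Anomoly", "").strip():      # column name as spelled above
            continue
        if row.get("Comments", "").strip():
            continue
        if row.get("HouseNumEx", "").strip():
            continue
        if row.get("HouseNumber", "").strip() == "0":
            continue
        kept.append(row)
    return kept


def address_key(row):
    """Case- and whitespace-insensitive key identifying an address."""
    return tuple(row.get(c, "").strip().upper() for c in KEY_COLUMNS)


def drop_duplicate_groups(rows):
    """Drop every row whose address occurs more than once (all copies)."""
    counts = Counter(address_key(r) for r in rows)
    return [r for r in rows if counts[address_key(r)] == 1]


# Hypothetical rows for illustration only.
rows = [
    {"Status": "Active", "HouseNumber": "10", "StreetName": "Example St"},
    {"Status": "Active", "HouseNumber": "10", "StreetName": "Example St"},
    {"Status": "Active", "HouseNumber": "12", "StreetName": "Example St"},
    {"Status": "Retired", "HouseNumber": "14", "StreetName": "Example St"},
    {"Status": "Active", "HouseNumber": "0", "StreetName": "Example St"},
]
result = drop_duplicate_groups(filter_rows(rows))
print(result)
```

Unlike a typical "keep the first copy" de-duplication, `drop_duplicate_groups` removes all members of a duplicate group, matching the decision above not to include either part of a duplicate pair.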

Data Transformation Results

Post a link to your OSM XML files.

In QGIS

Load the shapefile for the address points into QGIS, then import the edited .csv file. Once both are loaded, join the .csv file to the shapefile, using the ref:athensclarkeaddress column in the .csv and the AddressID column in the shapefile as the join fields. There will be a large number of null values, because the table join matches the complete dataset from the shapefile against the already filtered data from the .csv file.

In the attribute table, choose Select by Expression and write "FIELD_NAME" IS null (replacing FIELD_NAME with one of the actual joined field names). Click "Select Features", then delete the selected features (rows) using the little red trash can.

Right-click on the shapefile layer (not the .csv layer), hover over Export, click "Save As", and export as an ESRI shapefile with an EPSG projection (the exact code is not recorded here).
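The net effect of the join followed by deleting the null rows is that only shapefile features with a matching row in the filtered .csv survive. A minimal Python sketch of that logic is given below; it is not a replacement for the QGIS steps, and the feature dicts with `lon`/`lat` keys are a hypothetical stand-in for the shapefile's point features.

```python
def keep_joined_features(features, csv_rows,
                         feature_key="AddressID",
                         csv_key="ref:athensclarkeaddress"):
    """Attach CSV attributes to features and drop features with no match."""
    by_id = {str(row[csv_key]): row for row in csv_rows}
    joined = []
    for feature in features:
        match = by_id.get(str(feature[feature_key]))
        if match is None:
            # In QGIS these are the features whose joined fields are NULL.
            continue
        joined.append({**feature, **match})
    return joined


# Hypothetical data for illustration only.
features = [
    {"AddressID": 1001, "lon": -83.0, "lat": 33.0},
    {"AddressID": 1002, "lon": -83.1, "lat": 33.1},
]
csv_rows = [
    {"ref:athensclarkeaddress": "1001", "addr:housenumber": "250"},
]
joined = keep_joined_features(features, csv_rows)
print(joined)
```

Both keys are compared as strings because identifiers read from a .csv are text, while shapefile attributes may be numeric.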

Data Merge Workflow

Team Approach

I'll be working solo on this project. I am reaching out to others regarding this import, but likely only IanVG will be involved. I have also reached out to Randal Hale about bringing him back onto the project; he was the one who originally initiated the data import.


I will create a new account, IanVG_Imports, to facilitate the import.

Workflow

  • Load the shapefile into JOSM using the opendata plugin. Manually import the address nodes and, when possible, merge them with building ways.

Conflation

Conflation is a manual process in JOSM: existing ways and nodes will be checked first to determine whether the address already exists.
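Although the check is done by hand, the comparison it involves can be sketched in Python. The snippet below is only an illustration of "does this address already exist", assuming existing OSM objects are available as plain tag dictionaries and that matching on house number plus street name is sufficient; both assumptions are mine and are not part of the plan.

```python
def normalize(value):
    """Upper-case and collapse whitespace so minor differences do not matter."""
    return " ".join(value.upper().split())


def already_mapped(candidate, existing_tag_sets):
    """Return True if an existing object carries the same number and street."""
    key = (normalize(candidate["addr:housenumber"]),
           normalize(candidate["addr:street"]))
    for tags in existing_tag_sets:
        if "addr:housenumber" not in tags or "addr:street" not in tags:
            continue
        existing = (normalize(tags["addr:housenumber"]),
                    normalize(tags["addr:street"]))
        if existing == key:
            return True
    return False


# Hypothetical existing OSM objects, represented only by their tags.
existing = [
    {"building": "yes"},
    {"building": "house", "addr:housenumber": "250",
     "addr:street": "Oglethorpe  Ave"},
]
print(already_mapped({"addr:housenumber": "250",
                      "addr:street": "oglethorpe ave"}, existing))
```

A check like this could only flag candidates for review; the decision to merge or skip an address remains a manual one in JOSM.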

QA

No QA plan.

See also

The email to the Imports mailing list was sent on 2020-09-19 and can be found in the archives of the mailing list at [1].

Old Building Footprint Import Discussion

Location: Athens-Clarke County, Georgia, USA



About: We received a data donation from the GIS department of Athens-Clarke County. Out of all the data, the most useful for our efforts is the building footprint data. That data layer appears to contain about 55,688 polygons. Some are multi-polygons and will have to be dealt with individually.


Plan: We consider this import to be large (at least for us).

1. Segment the county into a grid. Since we want to do a neighborhood at a time we are using the census to create a "fluid" grid.

PDF Showing Grid

2. Data will be separated into approximately 75 projects, since 75 grids were created. That gives us a manageable upload size, since the census will have smaller grids in more densely populated areas. Larger grids will be in the rural areas.

3. We plan on using the open data tool with JOSM. It gives us a chance for validation checks before an upload. We also plan on testing with OGR2OSM.

TAGS that will be used: building=yes and source=ACCGIS

If the opportunity arises we may do some classification by census block into houses, commercial, residential, shed, etc.