Kiowa building import

From OpenStreetMap Wiki

Bing Building Import for Kiowa County, Colorado

Goals

The goal of this import is to add the missing buildings in Kiowa County, Colorado. There are 2531 buildings in the Microsoft data file, but only 258 are currently in OpenStreetMap. This is a very remote and sparsely populated part of my state.

Although there is no open address data for this area yet, just having the building footprints is useful for local volunteer fire/rescue districts. As I plan to do some ground-truthing and data collection in this area, I thought it would be nice to have the buildings in OSM.

There are barely any amenities, or much of anything else, in this area in OSM, but all these small hamlets/villages/towns usually have a gas station, a small convenience stop, and maybe a restaurant or cafe. Collecting this data will help both recreation users and the local volunteer emergency responders. I use ODK Collect with custom XLSForms for data collection.

Schedule

Once there is community approval, with only 2531 buildings, this is a relatively easy import to do solo, probably in one day. I also have experience with this task. Since I plan to be in this area in a few weeks, the ideal timeline is to have this done by then.

Import Data

Background

Data source site: Footprint data on GitHub

Data license: ODbL

ODbL Compliance verified: yes

The Microsoft building footprints have been used by many, many imports.

OSM Data Files

There are three files for the county. Two are the original extracts of the footprints and of the OSM data. The third is the post-conflation output, before validation has started. Here are the ones for Kiowa County.

Import Type

Data import will be done manually using JOSM, with a mix of Bing and Esri imagery for validation. I plan to ground-truth this area to collect data on amenities once this import is done, and clear up any other issues with the import.

There are other very remote and sparsely populated counties in southeastern Colorado, and there aren't that many buildings to import in the entire area. So once this import is done, I'll start on the next county. Someday I'll find open address data for this county, but that's a whole other future project.

Data Preparation

Data Reduction & Simplification

Luckily for this county, the data files aren't that large once you reduce the data for the entire state. I first extracted the county boundary from OSM as a GeoJSON file. I then used osmium to convert the PBF file I got from Geofabrik into a GeoJSON file. After that conversion was done, I used the clipping function of ogr2ogr to produce smaller county-sized data files. The footprint data is already in GeoJSON format.
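The reduction steps above can be sketched as a short Python helper that builds the osmium and ogr2ogr command lines. This is only an illustration: all file names are hypothetical placeholders, and the exact options used for this import are not recorded here. It assumes the `osmium export` subcommand of osmium-tool and the `-clipsrc` option of GDAL's ogr2ogr.

```python
import subprocess
from pathlib import Path


def build_commands(state_pbf, boundary_geojson, footprints_geojson, out_dir):
    """Return the conversion and clipping commands as argument lists.

    All paths are hypothetical placeholders chosen for this sketch.
    """
    out = Path(out_dir)
    state_geojson = out / "state-osm.geojson"

    # Convert the state-wide PBF extract into GeoJSON.
    export_osm = ["osmium", "export", str(state_pbf), "-o", str(state_geojson)]

    # Clip the state-wide OSM data to the county boundary.
    clip_osm = [
        "ogr2ogr", "-f", "GeoJSON", "-clipsrc", str(boundary_geojson),
        str(out / "county-osm.geojson"), str(state_geojson),
    ]

    # Clip the state-wide footprint file to the same boundary.
    clip_footprints = [
        "ogr2ogr", "-f", "GeoJSON", "-clipsrc", str(boundary_geojson),
        str(out / "county-footprints.geojson"), str(footprints_geojson),
    ]
    return [export_osm, clip_osm, clip_footprints]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Calling `run_all(build_commands(...))` requires both tools to be installed; building the commands separately makes it easy to print and inspect them first.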

Once I had the two input data files, I ran my own conflation software on them, producing a county-wide GeoJSON file containing only the buildings that are in the Microsoft footprints but not in OSM.
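To illustrate the idea of keeping only footprints that are not already mapped, here is a minimal sketch. It is not the conflation software used for this import. It assumes both inputs are lists of GeoJSON features with Polygon geometries, and it uses a simple bounding-box overlap test, which is much cruder than a real geometric comparison. The function names are made up for this example.

```python
def bbox(feature):
    """Bounding box (min_x, min_y, max_x, max_y) of a GeoJSON Polygon feature."""
    ring = feature["geometry"]["coordinates"][0]  # outer ring only
    xs = [point[0] for point in ring]
    ys = [point[1] for point in ring]
    return min(xs), min(ys), max(xs), max(ys)


def boxes_overlap(a, b):
    """True if two bounding boxes intersect or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def only_in_footprints(footprints, osm_buildings):
    """Keep footprints whose bounding box overlaps no existing OSM building."""
    osm_boxes = [bbox(feature) for feature in osm_buildings]
    kept = []
    for feature in footprints:
        box = bbox(feature)
        if not any(boxes_overlap(box, osm_box) for osm_box in osm_boxes):
            kept.append(feature)
    return kept
```

The pairwise comparison is quadratic, which is acceptable for a few thousand buildings but would need a spatial index for larger areas.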

Tagging Plans

Like any import of the Microsoft building footprints, there are only two tags added. These are source=bing and building=yes. Ground-truthing is required to add any other tags.
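As a small illustration of the tagging plan, the sketch below sets the two tags on every feature of a GeoJSON FeatureCollection. The function name is hypothetical, and it assumes any existing properties on the footprints can be discarded.

```python
def tag_footprints(collection):
    """Return a new FeatureCollection where every feature has only the two import tags."""
    tagged = []
    for feature in collection["features"]:
        new_feature = dict(feature)  # shallow copy, the input is left unchanged
        new_feature["properties"] = {"building": "yes", "source": "bing"}
        tagged.append(new_feature)
    return {"type": "FeatureCollection", "features": tagged}
```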

Data Transformation

As the Microsoft building footprints lack any tags, there is nothing to transform.

Data Merge Workflow

Team Approach

With only 2531 buildings in the Microsoft data file, I plan to do this as a solo project.

Workflow

The plan is to initially import the buildings not in any town, as having these remote inhabited buildings on the map is incredibly useful for emergency response. Someday I'll add addresses if I can find a data source with an appropriate license.

Once I have the remote individual buildings imported, I'll do each town. Most of these towns have only a few hundred people, so each town will be a single upload.
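The split into a remote batch and one batch per town could be prepared ahead of time. The sketch below is one possible way to do it, assuming Polygon features and a hand-made dictionary of town bounding boxes; the town names and boxes are hypothetical inputs, not data from this import. In practice the same selection can simply be done by hand in JOSM.

```python
def first_vertex(feature):
    """First vertex (x, y) of the outer ring of a GeoJSON Polygon feature."""
    x, y = feature["geometry"]["coordinates"][0][0][:2]
    return x, y


def split_batches(features, town_boxes):
    """Group features by town bounding box; everything else goes to 'remote'.

    town_boxes maps a town name to (min_x, min_y, max_x, max_y).
    """
    batches = {"remote": []}
    for name in town_boxes:
        batches[name] = []
    for feature in features:
        x, y = first_vertex(feature)
        for name, (min_x, min_y, max_x, max_y) in town_boxes.items():
            if min_x <= x <= max_x and min_y <= y <= max_y:
                batches[name].append(feature)
                break
        else:
            # No town box matched, so this is a remote building.
            batches["remote"].append(feature)
    return batches
```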

Since this area is much too large to use the Tasking Manager for, I'll simply delete the buildings from the result data after they have been uploaded. There are so few buildings that it's not difficult to work with a large area.

I'll use a changeset hashtag of kiowa-import with the import user of rob-import.

Conflation

I use my own software for building footprint conflation that I developed for HOT. This software has been utilized in East Africa for importing these same Microsoft footprints, under funding from Microsoft.

The software is GPLv3, and is available here. There is additional documentation on the conflation process and the software in the doc directory.

QA

The small data sets reduce the validation difficulty considerably. Having worked with the Bing footprints for our East Africa import, I'm quite familiar with the false positives in footprint data. These include large rocks, haystacks, etc. misidentified as buildings. The other problem is that when buildings are close together, the footprint data contains only one polygon for the group. As OSM prefers each building to be separate, the footprint is deleted and the buildings are traced manually using imagery. The footprint detection also fails to recognize round buildings much of the time and turns them into squares, which is a real problem in Africa. At least for this area, almost all houses are rectangular, and anything round is usually a water tank or grain silo.

I usually use a mix of imagery, since it's impossible to tell how old any one source is, and features can be more obvious in images taken at different times. When I can't make a clear decision on a footprint, I'll drop it from the import.

Initial validation with imagery shows the quality of the footprint data is so-so. Many round grain silos are misidentified as buildings, and there are a lot of them in this county. Luckily they're usually in groups, and usually near farms with many buildings, so they are not hard to find. The geometry of non-rectangular buildings is pretty good.
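One way to prioritize the manual review is to flag compact footprints (square or rounder) for a closer look. This is not part of the actual workflow; it is a rough sketch using the Polsby-Popper compactness score, 4πA/P², which is 1.0 for a circle and about 0.785 for a square. The threshold of 0.75 is an arbitrary assumption, and the score cannot tell a square house from a silo drawn as a square, so every flagged footprint still needs checking against imagery.

```python
import math


def compactness(ring):
    """Polsby-Popper score of a closed lon/lat ring (first point equals last)."""
    # Scale longitudes by cos(latitude) so shapes are not stretched east-west.
    lat0 = math.radians(sum(point[1] for point in ring) / len(ring))
    pts = [(point[0] * math.cos(lat0), point[1]) for point in ring]
    twice_area = 0.0
    perimeter = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:]):
        twice_area += x1 * y2 - x2 * y1  # shoelace formula
        perimeter += math.hypot(x2 - x1, y2 - y1)
    if perimeter == 0:
        return 0.0
    return 4 * math.pi * (abs(twice_area) / 2) / perimeter ** 2


def flag_compact(features, threshold=0.75):
    """Return Polygon features whose outer ring scores at or above the threshold."""
    return [
        feature for feature in features
        if compactness(feature["geometry"]["coordinates"][0]) >= threshold
    ]
```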