San Francisco Building Height Import

From OpenStreetMap Wiki

Current Status

In accordance with the Import/Guidelines:

Join us:

Background

In 2016, the SF city government published a LIDAR-derived building footprint dataset. Our goal is to add height tags to existing OSM building footprints. The data is licensed CC0, making it suitable to combine with OSM.

comparison of OSM building footprints / LIDAR 1 meter raster

Why does this belong in OSM?

San Francisco has building footprints for the entire city already in OSM, thanks to the work of the Mapbox Data Team. In 2016, the SF local government released a raw LIDAR dataset for the city as well as a LIDAR-derived building footprint dataset (Documentation PDF).

Adding a height attribute to the existing ODbL-licensed footprints will substantially improve OSM, especially when used with popular renderers like OSM Buildings and Tangram.

NOTE: This import only covers buildings modeled as Ways and excludes the ~870 building multipolygon relations in the affected area. discussion

There are many other completed or proposed building imports. From 2016 alone, see the discussions for Bend, Oregon; Miami-Dade County; Los Angeles; Dakota County, Minnesota; and Norfolk, Virginia.

We think this import is less complex than other building imports for a few reasons:

  • No possibility of overwriting mappers' work, since we're not adding any new nodes, ways, or relations, nor are we attempting to conflate or merge data from other footprints or 3D models. We're only adding a height tag to buildings that don't already have one.
  • We have access to the raw LIDAR data - we will manually verify that the automatically derived height values are reasonable by checking for agreement between LIDAR and footprints, presence of trees, etc.
  • The scope is limited: the only change being made is the addition of one tag to ~150,000 buildings in a geographically constrained area.

Is the data accurate?

We don't want to pollute OSM with inaccurate height data - the height values we automatically assign to footprints should be at least as good as a human would measure without the aid of surveying instruments.

NOTE: The source LIDAR data is from 2010, so any changes in height since then won't be reflected. We're OK with this kind of error.

We can be confident in our workflow if:

  • The raw LIDAR data matches real-world measurements, and:
  • Our method of assigning a height to a building footprint matches real-world measurements.

How we're assigning heights

The SFdata LIDAR-derived building footprints do not necessarily line up with the traced OSM footprints. We tag an OSM footprint's height by matching it to an SFdata footprint where the overlap between the two covers at least 70% of each footprint's area, and then using that footprint's hgt_Median_m (median LIDAR height) value. This excludes cases where the OSM and SFdata datasets split a building into different parts. The 70% threshold is enough to cover roughly 85% of OSM building footprints in the coverage area.

See PostGIS implementation here: https://github.com/osmlab/sf_building_height_import/blob/master/sql/create_tables.sql
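The mutual-overlap rule described above can be illustrated with a minimal Python sketch. This is not the PostGIS implementation linked above: it assumes axis-aligned rectangles as simplified stand-ins for real footprint polygons, and the names Rect, mutual_match, and assign_height are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Rect(NamedTuple):
    """Axis-aligned rectangle used as a simplified stand-in for a footprint."""
    minx: float
    miny: float
    maxx: float
    maxy: float


def area(r: Rect) -> float:
    return max(0.0, r.maxx - r.minx) * max(0.0, r.maxy - r.miny)


def intersection_area(a: Rect, b: Rect) -> float:
    w = min(a.maxx, b.maxx) - max(a.minx, b.minx)
    h = min(a.maxy, b.maxy) - max(a.miny, b.miny)
    return w * h if w > 0 and h > 0 else 0.0


def mutual_match(osm: Rect, sfdata: Rect, threshold: float = 0.7) -> bool:
    """True when the overlap covers at least `threshold` of BOTH footprints."""
    a_osm, a_sf = area(osm), area(sfdata)
    if a_osm == 0 or a_sf == 0:
        return False
    inter = intersection_area(osm, sfdata)
    return inter / a_osm >= threshold and inter / a_sf >= threshold


def assign_height(
    osm: Rect,
    candidates: List[Tuple[Rect, float]],
    threshold: float = 0.7,
) -> Optional[float]:
    """Return the median height of the first matching SFdata footprint.

    `candidates` pairs each SFdata footprint with its hgt_Median_m value.
    Returns None when nothing matches, so the building is left untagged.
    """
    for footprint, hgt_median_m in candidates:
        if mutual_match(osm, footprint, threshold):
            return hgt_median_m
    return None
```

Under these assumptions, an OSM footprint that is slightly offset from its SFdata counterpart still matches, while an OSM footprint covering a building that SFdata splits into two halves does not, because each half covers only about 50% of the OSM area.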

Evaluating Raw data

We want to sanity check the raw data to ensure it matches reality, and that terrain elevation is correctly subtracted out. One way to assess the accuracy of the data is to compare LIDAR readings for buildings with known heights, such as famous landmarks.

Landmark                LIDAR max height (m)    Actual height (m, Wikipedia)
Sutro Tower             286.39                  283.3
Transamerica Pyramid    258.49                  258.1
Ferry Building          71.39                   75 (tower)

From the above, we can conclude that the LIDAR data has reasonable measurements for tall structures, even in areas of varied elevation.
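As a worked check of the landmark comparison, the following Python sketch computes the absolute and relative difference between each LIDAR reading and its reference height. The values are copied from the table above; the function name height_errors is hypothetical.

```python
# (landmark, LIDAR max height in m, reference height in m) from the table above
LANDMARKS = [
    ("Sutro Tower", 286.39, 283.3),
    ("Transamerica Pyramid", 258.49, 258.1),
    ("Ferry Building", 71.39, 75.0),
]


def height_errors(rows):
    """Return (name, difference in m, difference in % of reference) per row."""
    results = []
    for name, lidar_m, reference_m in rows:
        diff_m = lidar_m - reference_m
        diff_pct = 100.0 * diff_m / reference_m
        results.append((name, round(diff_m, 2), round(diff_pct, 2)))
    return results


if __name__ == "__main__":
    for name, diff_m, diff_pct in height_errors(LANDMARKS):
        print(f"{name}: {diff_m:+.2f} m ({diff_pct:+.2f}%)")
```

The differences come out to roughly +3.1 m, +0.4 m, and -3.6 m, all within 5% of the reference heights.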

Evaluating our method of assigning heights

There are some caveats to automatically tagging the height data. Examples:

  • a building footprint traced from satellite imagery may not line up exactly with its LIDAR footprint; our method should not take height values from adjacent buildings
  • a building may have a sloped or gabled roof
  • a building may be covered by trees

REMINDER: The OSM guidelines at Key:height specify that the tagged height value should be the maximum height of the building, e.g. the top point of a gabled roof.

Mapillary has freely licensed street-level imagery, which we can use to estimate the heights of buildings and check whether they match the values our method produces.

Comparison Images

Our method is successful at tagging buildings of very different heights; in this case a 6-, 2-, and 4-story building are assigned heights that seem plausible.

Raw LIDAR imagery layer is inset; this is what mappers will see in JOSM as part of the workflow. Note that the center white building is split into two parts in the SFdata dataset, while the OSM footprint has only one part; it does not meet the 70% match threshold.

Our method is also successful at tagging buildings with small variations in height. Note in pink a rough estimation of a building's height, assuming a standard doorway height of 2 meters. This shows the tagged height is a plausible value for a 3-story building. The values are not perfect, since they are median heights and the roofs of some buildings are gabled. This can be corrected by mappers, who can query the data manually.

Failure Modes

We've identified a few instances where our automatic height tagging method produces a wrong result.

In this case our LIDAR method identifies the building footprint outlined in red as ~18 m tall. In reality, only one narrow tower is that tall; the existing OSM footprint fails to capture the parts of the building with different heights.

Our strategy: mappers can use their discretion to skip the height tag on the footprint and make a note to improve the building:part entities, outside of the workflow of this import.

In this case a 2-3 story building is covered by a tree; our method assigns it a height of ~14 meters which is unreasonable.

Our strategy: Trees are easily identifiable in the raw LIDAR imagery (see inset), so mappers should identify occluding trees in JOSM and query for the correct height using QGIS. If it is too difficult to determine the height of the building the tag should be skipped.

Building Survey

Team members Maning Sambale and Chetan Gowda also performed a survey of 87 buildings in the Sunset District (a residential area), Downtown/Tenderloin (where more skyscrapers are located), Cow Hollow (another residential area), and an industrial area.

Google spreadsheet of building height differences

For each building, the number of floors visible in Mapillary was multiplied by 3 meters for a rough approximation of the building height (this value is also used by many OSM renderers for building:levels). This approximation was then compared to the tagged height from our LIDAR process:

These results suggest:

  • Our tagging method is good at identifying building heights up to around 4-5 stories tall (12-15 m). 80% of surveyed buildings have a difference of less than 4 meters between the LIDAR-derived tag and the approximate height based on levels.
  • Mapillary imagery is often misleading for tall buildings because building parts are not clear. All errors in the above chart with a difference of more than 10 meters are due to building parts not visible on Mapillary. These parts are clearly visible in the raw hillshade data - see the inset below. Buildings tagged 15 m or taller should be treated with extra scrutiny in our import process - there are only around 2300 of these (<2% of total).

Example of building with discrepancy between Mapillary imagery and results - tall building part not visible from street level.
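The survey comparison described above can be sketched in Python as follows. The 3 meters per floor, the 4 meter tolerance, and the 15 m scrutiny threshold come from the text; the function names and any sample inputs are hypothetical and are not taken from the survey spreadsheet.

```python
METERS_PER_LEVEL = 3.0        # rough floor height used in the survey
SCRUTINY_THRESHOLD_M = 15.0   # buildings this tall or taller get extra review


def approx_height(levels: int) -> float:
    """Approximate a building height from the floors visible at street level."""
    return levels * METERS_PER_LEVEL


def compare(levels: int, lidar_height_m: float) -> dict:
    """Compare the level-based approximation with the LIDAR-derived height."""
    approx_m = approx_height(levels)
    return {
        "approx_m": approx_m,
        "diff_m": abs(lidar_height_m - approx_m),
        "needs_scrutiny": lidar_height_m >= SCRUTINY_THRESHOLD_M,
    }


def share_within(rows, tolerance_m: float = 4.0) -> float:
    """Fraction of (levels, lidar_height_m) rows that differ by < tolerance_m."""
    rows = list(rows)
    if not rows:
        return 0.0
    close = sum(1 for lv, h in rows if compare(lv, h)["diff_m"] < tolerance_m)
    return close / len(rows)
```

For example, a building with 3 visible floors and a LIDAR-derived height of 10.2 m gives an approximation of 9 m and a difference of about 1.2 m, so it would count toward the share of buildings within 4 meters.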

Import Workflow

Tags

Here are all the affected tags and how they relate to accepted OSM mapping practices.

Shapefile Attribute      OSM Tag            Description
maxheight - minheight    height=*           The height of the building above ground level.
(none)                   building=*         We're only modifying ways/relations with a building tag that don't currently have a height.
(none)                   building:part=*    We'll add heights to these too, but in most cases they're part of detailed models that already have height tags, so we won't be overwriting other mappers' work.
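The tagging rule in the table above (only touch elements with a building or building:part tag that lack a height) can be expressed as a short Python sketch. The function names are hypothetical, and formatting the value in meters with one decimal place is an assumption, not something specified by this import.

```python
def should_tag(tags: dict) -> bool:
    """True for building / building:part elements with no existing height."""
    is_building = "building" in tags or "building:part" in tags
    return is_building and "height" not in tags


def with_height(tags: dict, height_m: float) -> dict:
    """Return a copy of `tags` with height added, never overwriting a value."""
    new_tags = dict(tags)
    if should_tag(tags):
        # Assumption: plain number in meters, one decimal place.
        new_tags["height"] = f"{height_m:.1f}"
    return new_tags
```

Because existing height values are never replaced, running the sketch over an already detailed 3D model leaves it unchanged.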


Imagery

The main imagery layer we're using is a colorized elevation raster + hillshade of the LIDAR data. It is available as a TMS layer for use in JOSM:

tms:https://s3-us-west-2.amazonaws.com/openmassing/sf_lidar/{z}/{x}/{y}.png

This imagery layer is intended to make differences in height clear, as well as make the shape of trees identifiable.
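For readers who want to look at a single tile outside JOSM, here is a minimal Python sketch that fills in the URL template above for a given longitude and latitude. It uses the standard web mercator tile formula and assumes the {y} placeholder follows the usual slippy-map (top-origin) convention; the function names are hypothetical, and no request is made to the server.

```python
import math

TILE_URL = "https://s3-us-west-2.amazonaws.com/openmassing/sf_lidar/{z}/{x}/{y}.png"


def lonlat_to_tile(lon: float, lat: float, zoom: int):
    """Convert WGS84 lon/lat to web mercator tile x/y at the given zoom."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


def tile_url(lon: float, lat: float, zoom: int) -> str:
    """Fill in the URL template for the tile containing the given point."""
    x, y = lonlat_to_tile(lon, lat, zoom)
    return TILE_URL.format(z=zoom, x=x, y=y)


if __name__ == "__main__":
    # A point in downtown San Francisco, chosen only for illustration.
    print(tile_url(-122.4194, 37.7749, 16))
```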

LIDAR tileset

Workflow

OSM Tasking Manager

SF Building Height Import on OSM Task Manager

Note: once we're live, we'll move this to the main OSM tasking manager. This is a custom instance for now.

Mappers will create an import-specific account on openstreetmap.org, e.g. "bdon_sfimport", and join the OSMTM project.

There are 143953 buildings without a height tag in the area covered by LIDAR. These are split into tasks matching the area of a z16 or 17 web mercator tile, sized to a maximum of 500 buildings per task. There are 804 tasks - an average of 179 buildings per task.
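One way such a task split could work is sketched below in Python. This is an assumption about the approach, not the script actually used: buildings are counted per z16 tile, and any z16 tile holding more than 500 buildings is replaced by its z17 child tiles. The function name build_tasks is hypothetical, and the sketch does not handle a z17 tile that itself exceeds the limit.

```python
from collections import Counter

MAX_BUILDINGS_PER_TASK = 500


def build_tasks(z17_tiles, max_buildings: int = MAX_BUILDINGS_PER_TASK) -> dict:
    """Group buildings into z16 tasks, splitting crowded tiles into z17 tasks.

    `z17_tiles` holds one (x, y) z17 tile coordinate per building.
    Returns a dict mapping (zoom, x, y) to the number of buildings in the task.
    """
    per_z17 = Counter(z17_tiles)

    # A z17 tile (x, y) lies inside the z16 tile (x // 2, y // 2).
    per_z16 = Counter()
    for (x, y), count in per_z17.items():
        per_z16[(x // 2, y // 2)] += count

    tasks = {}
    for (x, y), count in per_z17.items():
        parent = (x // 2, y // 2)
        if per_z16[parent] <= max_buildings:
            tasks[(16, parent[0], parent[1])] = per_z16[parent]
        else:
            tasks[(17, x, y)] = count
    return tasks
```

The average quoted above follows directly from the totals: 143953 buildings / 804 tasks ≈ 179 buildings per task.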

SF building height import tasks.png

JOSM

Mappers will use JOSM with the LIDAR imagery tileset and a custom MapCSS style. They'll download a changeset into JOSM via a link on the Task Manager. This JOSM changeset is dynamically generated from the main OSM API (bbox endpoint) and the LIDAR height data.
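To make the bbox step concrete, the Python sketch below converts a web mercator task tile into a bounding box and builds a request URL for the OSM API 0.6 map call. It only illustrates the bbox endpoint; how the height data is merged into the download is not shown, and the function names are hypothetical.

```python
import math

OSM_MAP_ENDPOINT = "https://api.openstreetmap.org/api/0.6/map"


def tile_bounds(x: int, y: int, zoom: int):
    """Return (left, bottom, right, top) in degrees for a web mercator tile."""
    n = 2 ** zoom

    def lon(tile_x: float) -> float:
        return tile_x / n * 360.0 - 180.0

    def lat(tile_y: float) -> float:
        return math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * tile_y / n))))

    # Tile y grows southward, so the bottom edge is at y + 1.
    return lon(x), lat(y + 1), lon(x + 1), lat(y)


def map_url(x: int, y: int, zoom: int, base: str = OSM_MAP_ENDPOINT) -> str:
    """Build a map?bbox=left,bottom,right,top URL covering one task tile."""
    left, bottom, right, top = tile_bounds(x, y, zoom)
    return f"{base}?bbox={left:.7f},{bottom:.7f},{right:.7f},{top:.7f}"
```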

SF building height import josm.png

Steps for mappers:

  1. Inspect each building and ensure that its OSM footprint aligns with the LIDAR shape.
  2. If the OSM footprint is not detailed enough to capture elevation details like towers or building parts, remove the "height" tag in the right editing panel.
  3. If the LIDAR footprint is obscured by trees, refer to QGIS to find the correct height, and then edit the "height" value in the right editing panel.
  4. Upload and mark task as done.
  5. Make a note in the Task Manager of which areas were particularly tricky so the validator knows where to double-check.

All changesets will have the comment:

San Francisco Building Height Import #sfbuildingheights https://wiki.openstreetmap.org/wiki/San_Francisco_Building_Height_Import

Validators should follow the above steps, double-checking any heights that were entered manually from QGIS.

QGIS

Link to the raw LIDAR dataset:

Mappers should download the raw LIDAR raster along with an OSM reference layer.

Future Data Updates

The SF County LIDAR data is from 2014 and is the most recent dataset. Since the data is raw LIDAR, there is no metadata such as building IDs, so there are no additional tags to add as part of the initial import.

Team and Supporters