San Francisco Building Height Import

From OpenStreetMap Wiki

Current Status

In accordance with the Import/Guidelines:

Join us:


The SF city government has published a 1-meter-resolution LIDAR dataset from 2014. Our goal is to add height tags to existing OSM building footprints. The data is licensed CC0, making it suitable for combining with OSM data.

comparison of OSM building footprints / LIDAR 1 meter raster

Why does this belong in OSM?

San Francisco already has building footprints for the entire city in OSM, thanks to the work of the Mapbox Data Team. This is the highest-quality open building dataset available for San Francisco. The city's own building footprint dataset has many erroneously merged buildings (it was created by segmenting the LIDAR data).

Adding a height attribute to the existing, ODbL-licensed footprints will substantially improve OSM, especially when used with popular renderers like OSM Buildings and Tangram.

There are many other completed or proposed building imports. See the discussions for Bend, Oregon; Miami-Dade County; Los Angeles; Dakota County, Minnesota; and Norfolk, Virginia, from 2016 alone.

We think this import is less complex than other building imports for a few reasons:

  • No possibility of overwriting mappers' work, since we're not adding any new nodes, ways, or relations, nor are we attempting to conflate or merge data from other footprints or 3D models. We're only adding a height tag to buildings that don't already have one.
  • We have access to the raw LIDAR data - we will manually verify that the automatically derived height values are reasonable by checking for agreement between LIDAR and footprints, presence of trees, etc.
  • The scope is limited: the only change being made is the addition of one tag to ~150,000 buildings in a geographically constrained area.

Is the data accurate?

We don't want to pollute OSM with inaccurate height data - the height values we automatically assign to footprints should be at least as good as a human would measure without the aid of surveying instruments.

NOTE: The LIDAR data is from 2014, so any changes in height since then won't be reflected. We're OK with this kind of error.

We can be confident in our workflow if:

  • The raw LIDAR data matches real-world measurements, and:
  • Our method of assigning a height to a building footprint matches real-world measurements.

Evaluating Raw data

We want to sanity check the raw data to ensure it matches reality, and that terrain elevation is correctly subtracted out. One way to assess the accuracy of the data is to compare LIDAR readings for buildings with known heights, such as famous landmarks.

Landmark | LIDAR height (m) | Actual height (m, per Wikipedia)
Sutro Tower | 302.18 | 283.3
Transamerica Pyramid | 259.86 | 258.1
Ferry Building | 70.7 | 75 (tower)

From the above, we can conclude that the LIDAR data gives reasonable measurements for tall structures, even in areas of varied elevation.
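As a quick check of the comparison above, the differences between the LIDAR readings and the reference heights can be computed directly. This is a minimal sketch; the function and variable names are illustrative only, and the numbers are copied from the table.

```python
# Heights in meters, copied from the landmark table above.
landmarks = [
    ("Sutro Tower", 302.18, 283.3),
    ("Transamerica Pyramid", 259.86, 258.1),
    ("Ferry Building", 70.7, 75.0),
]


def height_error(lidar_m, reference_m):
    """Return (difference in meters, difference in percent) of LIDAR vs. reference."""
    diff = lidar_m - reference_m
    return diff, 100.0 * diff / reference_m


errors = {name: height_error(lidar, ref) for name, lidar, ref in landmarks}

for name, (diff, pct) in errors.items():
    print(f"{name}: {diff:+.2f} m ({pct:+.1f}%)")
```

With these values the Transamerica Pyramid agrees to within about 2 m, while Sutro Tower and the Ferry Building differ by roughly 19 m and 4 m respectively.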

How we're assigning heights

Our goal is to avoid any kind of "black box" process to find the height value. We're determining the height of a building by shrinking its footprint by 2 meters to avoid any adjacent buildings, and then finding the maximum height value in that area. See PostGIS implementation here:
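To illustrate the idea (this is not the PostGIS implementation referenced above), here is a pure-Python sketch. It assumes a raster stored as a list of rows with 1 m cells, with terrain elevation already subtracted, and a footprint given as a list of (x, y) vertices in the same meter grid. A cell counts as inside the shrunken footprint when its center is inside the polygon and at least 2 m from every edge. All function names are hypothetical.

```python
import math


def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def distance_to_segment(x, y, x1, y1, x2, y2):
    """Shortest distance from point (x, y) to the segment (x1, y1)-(x2, y2)."""
    dx, dy = x2 - x1, y2 - y1
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(x - x1, y - y1)
    t = max(0.0, min(1.0, ((x - x1) * dx + (y - y1) * dy) / length_sq))
    return math.hypot(x - (x1 + t * dx), y - (y1 + t * dy))


def building_height(raster, polygon, shrink_m=2.0, cell_m=1.0):
    """Maximum raster value over cells whose centers lie inside the footprint
    and at least shrink_m from its boundary. Returns None if no cell qualifies
    (for example, a footprint narrower than 2 * shrink_m)."""
    n = len(polygon)
    best = None
    for row, line in enumerate(raster):
        for col, value in enumerate(line):
            x, y = (col + 0.5) * cell_m, (row + 0.5) * cell_m
            if not point_in_polygon(x, y, polygon):
                continue
            edge_dist = min(
                distance_to_segment(x, y, *polygon[i], *polygon[(i + 1) % n])
                for i in range(n)
            )
            if edge_dist >= shrink_m and (best is None or value > best):
                best = value
    return best


# Made-up example: an 8 m x 8 m footprint over a 10 m building, where a
# neighboring 30 m building overlaps the last meter of the footprint.
raster = [[10.0 if col < 7 else 30.0 for col in range(12)] for row in range(8)]
footprint = [(0, 0), (8, 0), (8, 8), (0, 8)]

shrunk = building_height(raster, footprint)                   # uses the 2 m shrink
unshrunk = building_height(raster, footprint, shrink_m=0.0)   # no shrink
```

In this example the shrunken footprint ignores the overlapping neighbor, while the unshrunken one picks up its height. Footprints too narrow to survive the shrink return None and would need separate handling.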

Evaluating our method of assigning heights

There are some caveats to automatically tagging the height data. Examples:

  • a building footprint traced from satellite imagery may not line up exactly with its LIDAR footprint; our method should not take height values from adjacent buildings
  • a building may have a sloped or gabled roof
  • a building may be covered by trees

REMINDER: The OSM guidelines at Key:height specify that the tagged height value should be the maximum height of the building, e.g. the top point of a gabled roof.

Mapillary has freely licensed street-level imagery which we can use to estimate heights of buildings and determine that they match the values our method produces.

Comparison Images

Our method is successful at tagging adjacent buildings of very different heights; in this case 6-, 2-, and 4-story buildings are assigned heights that seem plausible.

Raw LIDAR imagery layer is inset; this is what mappers will see in JOSM as part of the workflow. Note that the center white building is not segmented in OSM; the height of the taller building part is correctly used.

Our method is also successful at tagging buildings with small variations in height. Note in pink a rough estimate of a building's height, assuming a standard doorway height of 2 meters. This shows the tagged height is a plausible value for a 3-story building.
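The doorway comparison amounts to a simple proportion. The sketch below assumes the door and the roofline are measured in pixels on the same street-level photo and sit at a similar distance from the camera; the pixel values and the 25% tolerance are made-up illustration values, not part of the import.

```python
DOOR_HEIGHT_M = 2.0  # standard doorway height assumed in the text


def estimate_height_from_photo(building_px, door_px, door_height_m=DOOR_HEIGHT_M):
    """Scale the known door height by the building/door pixel ratio."""
    if door_px <= 0:
        raise ValueError("door_px must be positive")
    return building_px / door_px * door_height_m


def is_plausible(tagged_m, estimated_m, tolerance=0.25):
    """True if the tagged height is within `tolerance` (a fraction) of the estimate."""
    return abs(tagged_m - estimated_m) <= tolerance * estimated_m


# Hypothetical measurements: building spans 540 px, doorway spans 120 px.
estimate = estimate_height_from_photo(540, 120)
```

Perspective distortion makes this a rough estimate only, which is why it is used as a plausibility check rather than as a source of tag values.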

Failure Modes

We've identified a few instances where our automatic height tagging method produces a wrong result.

In this case our LIDAR method reports that the building footprint outlined in red is 35 m tall. In reality only one narrow tower is that tall; the existing OSM footprint fails to capture the parts of the building with different heights.

Our strategy: mappers can use their discretion to skip the height tag on the footprint and make a note to improve the building:part entities, outside of the workflow of this import.

In this case a 3-story building is covered by a tree; our method assigns it a height of 18 meters, which is unreasonable.

Our strategy: Trees are easily identifiable in the raw LIDAR imagery (see inset), so mappers should identify occluding trees in JOSM and query for the correct height using QGIS. If it is too difficult to determine the height of the building the tag should be skipped.

Import Workflow


Here are all the affected tags and how they relate to accepted OSM mapping practices.

Shapefile Attribute | OSM Tag | Description
maxheight - minheight | height=* | The height of the building above ground level.
(none) | building=* | We're only modifying ways/relations with a building tag that don't currently have a height.
(none) | building:part=* | We'll add heights to these too, but in most cases they're part of detailed models that already have height tags, so we won't be overwriting other mappers' work.
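The tag rules in the table can be sketched as two small helpers. The function names are hypothetical, and rounding to one decimal place is an assumption made here, not something the import specifies. Key:height values default to meters, so no unit suffix is added.

```python
def needs_height(tags):
    """True for building or building:part elements that have no height tag yet."""
    is_building = "building" in tags or "building:part" in tags
    return is_building and "height" not in tags


def height_value(maxheight, minheight):
    """Format maxheight - minheight as a height=* value in meters."""
    return f"{maxheight - minheight:.1f}"


print(needs_height({"building": "yes"}))
print(height_value(25.3, 12.1))
```

Elements that already carry a height tag are left untouched, which is what keeps existing mappers' work from being overwritten.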


The main imagery layer we're using is a colorized elevation raster + hillshade of the LIDAR data. It is available as a TMS layer for use in JOSM:


This imagery layer is intended to make differences in height clear, as well as make the shape of trees identifiable.

LIDAR tileset


OSM Tasking Manager

SF Building Height Import on OSM Task Manager

Note: once we're live, we'll move this to the main OSM tasking manager. This is a custom instance for now.

Mappers will create an import-specific account (e.g. "bdon_import") and join the OSMTM project.

There are 168,104 buildings without a height tag in the area covered by LIDAR. These are split into tasks matching the area of a z16 or z17 Web Mercator tile, sized to a maximum of 500 buildings per task. There are 990 tasks, an average of 170 buildings per task.
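One way such a split could work is sketched below, under the assumption that each building is represented by a (lon, lat) centroid. Buildings are first binned into z16 tiles using the standard Web Mercator tile formula, and any z16 tile holding more than the limit is replaced by its z17 child tiles. This is an illustration, not the script used to generate the actual tasks, and it does not handle a z17 tile that still exceeds the limit.

```python
import math
from collections import Counter


def lonlat_to_tile(lon, lat, zoom):
    """Standard Web Mercator (slippy map) tile indices for a WGS84 point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


def assign_tasks(points, max_per_task=500):
    """Map building centroids to tasks. Returns {(zoom, x, y): building count}."""
    z16_counts = Counter(lonlat_to_tile(lon, lat, 16) for lon, lat in points)
    tasks = {}
    for lon, lat in points:
        tile16 = lonlat_to_tile(lon, lat, 16)
        if z16_counts[tile16] <= max_per_task:
            key = (16, *tile16)
        else:
            # Too many buildings in this z16 tile: use the z17 child instead.
            key = (17, *lonlat_to_tile(lon, lat, 17))
        tasks[key] = tasks.get(key, 0) + 1
    return tasks


# Average quoted in the text: 168,104 buildings over 990 tasks.
average_per_task = round(168104 / 990)
```

Calling `assign_tasks` with a list of centroids returns one entry per task tile; the average works out to about 170 buildings per task, matching the figure above.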

SF building height import tasks.png


Mappers will use JOSM with the LIDAR imagery tileset and a custom MapCSS style. They'll download a changeset into JOSM via a link on the Task Manager. This JOSM changeset is dynamically generated from the main OSM API (bbox endpoint) and the LIDAR height data.
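A rough sketch of how such a file could be produced is shown below. It assumes the OSM XML for the task area has already been fetched (for example from the API's map call with a bbox parameter) and that precomputed heights are available in a dictionary keyed by element type and id. The `add_heights` function and the `heights` lookup are hypothetical; the `action="modify"` attribute is how JOSM's file format marks locally changed objects.

```python
import xml.etree.ElementTree as ET


def add_heights(osm_xml, heights):
    """Add height tags to buildings lacking one and flag them as modified.

    heights: {(element type, id string): height string}, e.g. {("way", "1"): "9.5"}.
    """
    root = ET.fromstring(osm_xml)
    for element in root:
        if element.tag not in ("way", "relation"):
            continue
        tags = {t.get("k"): t.get("v") for t in element.findall("tag")}
        height = heights.get((element.tag, element.get("id")))
        is_building = "building" in tags or "building:part" in tags
        if height is None or not is_building or "height" in tags:
            continue  # never overwrite an existing height
        ET.SubElement(element, "tag", {"k": "height", "v": height})
        element.set("action", "modify")  # lets JOSM upload the change
    return ET.tostring(root, encoding="unicode")


# Made-up sample data for illustration.
SAMPLE = """<osm version="0.6">
  <way id="1"><nd ref="10"/><tag k="building" v="yes"/></way>
  <way id="2"><nd ref="11"/><tag k="building" v="yes"/><tag k="height" v="12"/></way>
  <way id="3"><nd ref="12"/><tag k="highway" v="residential"/></way>
</osm>"""

result = add_heights(
    SAMPLE,
    {("way", "1"): "9.5", ("way", "2"): "20.0", ("way", "3"): "3.0"},
)
```

Only way 1 is changed here: way 2 already has a height and way 3 is not a building. The mapper still reviews every proposed value in JOSM before uploading.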

SF building height import josm.png

Steps for mappers:

  1. Inspect each building and ensure that its OSM footprint aligns with the LIDAR shape.
  2. If the OSM footprint is not detailed enough to capture elevation details like towers or building parts, remove the "height" tag in the right editing panel.
  3. If the LIDAR footprint is obscured by trees, refer to QGIS to find the correct height, and then edit the "height" value in the right editing panel.
  4. Upload and mark task as done.
  5. Make a note in the Task Manager of which areas were particularly tricky, so the validator knows where to double-check.

All changesets will have the comment:

San Francisco Building Height Import #sfbuildingheights

Validators should follow the steps above, double-checking any heights that were determined manually in QGIS.


Link to the raw LIDAR dataset:

Mappers should download the raw LIDAR raster along with an OSM reference layer.

Future Data Updates

The SF County LIDAR data is from 2014 and is the most recent dataset. Since the data is raw LIDAR, there is no metadata such as building IDs, so there are no additional tags to add as part of the initial import.

Team and Supporters