Santa Clara County, California/San Jose building import

This is an in-progress import of municipal building and address data throughout the city of San José, California.

Source

The City of San José makes several GIS layers available as shapefiles. Basemap contains parcel outlines, while Basemap_2 contains address points, building footprints, and condominium outlines. The data comes projected in a mix of NAD 1983/CORS96 California state plane 3 (ESRI:103240) and NAD83 California zone 3 (EPSG:2227). Basemap_2 is the same shapefile from which San José sidewalks were imported in 2017 and 2018. The building footprints generally date from 2006. Parcel and condo outlines will not be imported; instead, the import script uses them to cross-reference addresses to buildings where possible. Building footprints include height and elevation information. Address points include full address information as well as unit type, "place type," exploded street name, associated parcel, and "Addtl_Loc", which seems to refer to the parcel owner. The city website says the data is updated monthly, but we do not plan to do a continuous import.

License

This work is in the public domain in the United States because it is a work of the State of California that was in any way "involved in the governmental process" and "prepared, owned, used or retained by any state or local agency" or officer. That work is available pursuant to court interpretation of the Sunshine Amendment of the Constitution of California, and/or the California Public Records Act (CPRA), which contained no relevant provision(s) for copyright.

It is not copyrighted because (lacking an exception in statute like those for works of the Department of Toxic Substances Control or works of certain colleges established by statute) "unrestricted disclosure is required".

See County of Santa Clara v. CFAC. In brief, the "CPRA contains no provisions either for copyrighting [this work] or for conditioning its release on an end user or licensing agreement by the requester. The record thus must be disclosed as provided in the CPRA, without any such conditions or limitations." Subject to general disclaimers.

Preprocessing

start.sh downloads the shapefile data and existing OSM data, imports both into a PostgreSQL/PostGIS database, runs merge.sql, and then runs ogr2osm once for each traffic analysis zone (TAZ).

merge.sql:

  1. removes inactive addresses;
  2. finds and deletes address points that don't have a matching street name in OSM;
  3. detects and merges groups of address points on grids;
  4. flags addresses and buildings that intersect existing OSM data;
  5. on condo parcels where there is only one address or merged address, merges addresses onto the building closest to the address point;
  6. on small- to medium-sized parcels where there is only one address or merged address that is not a hospital or school, merges addresses onto the building closest to the address point;
  7. on parcels where there are more buildings than addresses and every address point intersects a building, merges addresses onto intersecting buildings;
  8. on parcels where all addresses have a common Addtl_Loc value, assigns that value to the parcel; and
  9. assigns buildings, addresses, and named parcels to the nearest TAZ key.
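
Steps 5 and 6 above both attach a parcel's only address to the building closest to the address point. The following is a minimal Python sketch of that idea, not the actual merge.sql logic, which runs in PostGIS. The function names, the dictionary layout, and the use of a vertex-average centroid as the distance target are all assumptions made for illustration.

```python
from math import hypot


def centroid(ring):
    """Average of the ring's vertices (a rough stand-in for a true centroid).

    Assumes the ring is given without a repeated closing vertex.
    """
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def merge_single_address(parcel_addresses, parcel_buildings):
    """Attach a parcel's only address to its nearest building.

    Returns the id of the chosen building, or None when the parcel does not
    have exactly one address or has no buildings. In that case nothing is
    modified and the address stays a standalone point.
    """
    if len(parcel_addresses) != 1 or not parcel_buildings:
        return None
    address = parcel_addresses[0]
    ax, ay = address["point"]

    def distance(building):
        cx, cy = centroid(building["ring"])
        return hypot(cx - ax, cy - ay)

    nearest = min(parcel_buildings, key=distance)
    # Copy the address tags onto the building's own tags.
    nearest["tags"].update(address["tags"])
    return nearest["id"]


# Hypothetical parcel with two buildings and one address point.
buildings = [
    {"id": 1, "ring": [(0, 0), (10, 0), (10, 10), (0, 10)], "tags": {"building": "yes"}},
    {"id": 2, "ring": [(50, 0), (60, 0), (60, 10), (50, 10)], "tags": {"building": "yes"}},
]
addresses = [{"point": (48, 5), "tags": {"addr:housenumber": "200"}}]
chosen = merge_single_address(addresses, buildings)
```

Coordinates here are treated as planar, which matches projected state plane data but would not be appropriate for raw latitude and longitude.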

basemap.py is an ogr2osm translation filter that maps fields in the shapefile or database to OSM tags.

Some parts of the scripts above are based on the scripts used in the Hamilton County Building Import. The TAZ areas are the same as the ones used in the crossings part of the sidewalk import.

Tag mapping

BuildingFootprint mapping

  Source tag    OSM tag
  always        building=yes
  BLDGHEIGHT    height=*
  BLDGELEV      ele=*

Site_Address_Points mapping

  Source tag               OSM tag
  Inc_Muni                 addr:city=*
  Add_Number, AddNum_Suf   addr:housenumber=*
  CompName                 addr:street=*
  Unit                     addr:unit=*
  Post_Code                addr:postcode=*
  Place_Type               see below

Place_Type mapping

  Source value   OSM tag
  ED             amenity=school
  FB             amenity=place_of_worship
  GO             office=government
  GQ             amenity=social_facility
  HS             amenity=hospital
  HT             tourism=hotel
  RE             club=sport
  RT             amenity=restaurant
  RL             shop=yes
  TR             public_transport=platform

Place_Type on merged buildings

  Source value      OSM tag
  BU                building=commercial
  ED                building=school
  FB                building=religious
  GO                building=government
  HS                building=hospital
  HT                building=hotel
  MH                building=static_caravan
  Condominium, MF   building=residential
  RL, RT            building=retail
  SF                building=house

Parcel with Addtl_Loc

  Source tag   OSM tag
  always       landuse=residential
  Addtl_Loc    name=*
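
The mappings above can be expressed as plain Python lookups. This is only a sketch of the kind of logic a translation filter such as basemap.py might contain; it does not use the ogr2osm API, and the function names are hypothetical. Two behaviours are assumptions not stated on this page: Add_Number and AddNum_Suf are joined with a single space, and height and elevation values are copied through without unit conversion.

```python
# Place_Type codes for standalone address points, from the table above.
PLACE_TYPE_POINT_TAGS = {
    "ED": ("amenity", "school"),
    "FB": ("amenity", "place_of_worship"),
    "GO": ("office", "government"),
    "GQ": ("amenity", "social_facility"),
    "HS": ("amenity", "hospital"),
    "HT": ("tourism", "hotel"),
    "RE": ("club", "sport"),
    "RT": ("amenity", "restaurant"),
    "RL": ("shop", "yes"),
    "TR": ("public_transport", "platform"),
}

# building=* values for buildings that received a merged address.
PLACE_TYPE_BUILDING_VALUES = {
    "BU": "commercial", "ED": "school", "FB": "religious",
    "GO": "government", "HS": "hospital", "HT": "hotel",
    "MH": "static_caravan", "Condominium": "residential",
    "MF": "residential", "RL": "retail", "RT": "retail", "SF": "house",
}


def _clean(value):
    """Turn a raw attribute into a stripped string ('' for missing values)."""
    return "" if value is None else str(value).strip()


def address_tags(attrs):
    """Map Site_Address_Points attributes to OSM tags, skipping empty fields."""
    # Assumption: number and suffix are joined with one space.
    parts = (_clean(attrs.get("Add_Number")), _clean(attrs.get("AddNum_Suf")))
    candidates = {
        "addr:housenumber": " ".join(p for p in parts if p),
        "addr:street": _clean(attrs.get("CompName")),
        "addr:unit": _clean(attrs.get("Unit")),
        "addr:city": _clean(attrs.get("Inc_Muni")),
        "addr:postcode": _clean(attrs.get("Post_Code")),
    }
    tags = {key: value for key, value in candidates.items() if value}
    place = PLACE_TYPE_POINT_TAGS.get(_clean(attrs.get("Place_Type")))
    if place:
        tags[place[0]] = place[1]
    return tags


def building_tags(attrs, place_type=None):
    """Map BuildingFootprint attributes to OSM tags.

    place_type is the Place_Type of a merged address, if any; unknown or
    missing codes fall back to building=yes.
    """
    tags = {"building": PLACE_TYPE_BUILDING_VALUES.get(place_type, "yes")}
    # Assumption: values are copied as-is, with no unit conversion.
    for source, key in (("BLDGHEIGHT", "height"), ("BLDGELEV", "ele")):
        value = _clean(attrs.get(source))
        if value:
            tags[key] = value
    return tags
```

For example, `building_tags({"BLDGHEIGHT": 7.5, "BLDGELEV": 25}, "SF")` yields a building=house with the height and elevation copied over, while a footprint with no merged address keeps building=yes.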

Issues needing manual resolution

  • Some buildings have a negative height.
  • The building data is at least two years old; some buildings have been demolished or rebuilt during that time. Check against aerial imagery, and if there is no outline for a new building, remove building=* and add demolished:building=*.
  • Occasionally, the merge script will choose the incorrect building to tag with an address, or the data has mismatching parcel information for an address.
  • The merge script leaves address points for cases where it isn't sure they can be matched. In cases where there are no conflicts and the assignment is obvious, lone address points should be merged onto buildings.
  • Some merged address points include so many unit numbers that the field length exceeds the maximum.
  • Addresses tagged as "Retail" are imported as shop=yes. The tag carries useful information, but it is considered a tagging mistake and is not rendered in the default tile layer. These should be changed to specific shop types, which can be done after the import.
  • Some buildings are split into different pieces in the data set. Ideally, these should be mapped as building:part=* features.
  • Sometimes it's better to assign an address to a site boundary than to a single building, for example at hospitals, schools, or apartment complexes. The merge script leaves these address points separate so the cases can be evaluated manually.
  • Some street names in the address data don't exactly match any street names in OSM. Reconciling them needs external validation, so those address points will be saved for after the import.
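
Several of the issues above can be detected mechanically before a reviewer opens a task. The sketch below is a hypothetical helper and is not part of the import scripts. It assumes tags are held as a dictionary of strings, and it assumes that the "maximum" field length refers to the OSM API limit of 255 characters per tag value.

```python
# OSM API limit on the length of a tag value, in characters.
MAX_TAG_VALUE_LENGTH = 255


def review_flags(tags):
    """Return a list of human-readable issues found in one feature's tags."""
    flags = []

    # Negative heights appear in the source data and need manual correction.
    try:
        height = float(tags.get("height", ""))
    except ValueError:
        height = None  # missing or non-numeric height: nothing to check
    if height is not None and height < 0:
        flags.append("negative height")

    # Merged address points can accumulate very long unit lists.
    if len(tags.get("addr:unit", "")) > MAX_TAG_VALUE_LENGTH:
        flags.append("addr:unit too long")

    # shop=yes should eventually become a specific shop type.
    if tags.get("shop") == "yes":
        flags.append("generic shop=yes")

    return flags
```

A feature with no detected problems returns an empty list, so the result can be used directly to decide whether a task needs extra attention.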

Workflow

We are using the OSMUS Tasking Manager to distribute the generated tasks of non-conflicting data to volunteers to review and import. Importing conflicting data will be done similarly in a second phase after all non-conflicting data is imported.

Import accounts

External links