Cupertino import

From OpenStreetMap Wiki

This page details a completed import of building footprint and address data for the city of Cupertino, California. The data was requested from the city government GIS department by User:Erjiang.


Licensing and restrictions

The data came with a standard disclaimer that it should not be used for planning purposes and that the city does not guarantee the correctness of the data. When asked about restrictions that would prevent it from being imported into OpenStreetMap, the city replied saying, "Our disclaimer does not restrict you from importing the data to openstreetmap."

Original email about disclaimer:
Delivered-To: eric@doublemap.com
Received: by 10.76.60.138 with SMTP id h10csp631448oar;
        Thu, 1 Oct 2015 09:08:03 -0700 (PDT)
X-Received: by 10.68.94.3 with SMTP id cy3mr12819064pbb.113.1443715682957;
        Thu, 01 Oct 2015 09:08:02 -0700 (PDT)
Return-Path: <gis@cupertino.org>
Received: from mail.cupertino.org (64-165-34-28.cupertino.org. [64.165.34.28])
        by mx.google.com with ESMTPS id sz5si9914382pab.19.2015.10.01.09.08.02
        for <eric@doublemap.com>
        (version=TLSv1 cipher=ECDHE-RSA-AES128-SHA bits=128/128);
        Thu, 01 Oct 2015 09:08:02 -0700 (PDT)
Received-SPF: pass (google.com: best guess record for domain of gis@cupertino.org designates 64.165.34.28 as permitted sender) client-ip=64.165.34.28;
Authentication-Results: mx.google.com;
       spf=pass (google.com: best guess record for domain of gis@cupertino.org designates 64.165.34.28 as permitted sender) smtp.mailfrom=gis@cupertino.org
Received: from CH-EXCHANGE2K10.ad.cupertino.org ([fe80::30e2:b47e:de54:f842])
 by CH-EXCHANGE2K10.ad.cupertino.org ([fe80::30e2:b47e:de54:f842%23]) with
 mapi id 14.03.0224.002; Thu, 1 Oct 2015 09:07:47 -0700
From: GIS Coordinator <gis@cupertino.org>
To: Eric Jiang <eric@doublemap.com>
Subject: RE: Cupertino building outlines in OpenStreetMap
Thread-Topic: Cupertino building outlines in OpenStreetMap
Thread-Index: AQHQ99iMOMccl0DJiUOQc2E0NR0zD55SWQTQgAB2KYCABAa1QA==
Date: Thu, 1 Oct 2015 16:07:46 +0000
Message-ID: <2F63A5118871B64C8DCDC1ED90D7C472ADD6353F@CH-EXCHANGE2K10.ad.cupertino.org>
References: <CAOfJSTzUcCCePzMWM2U1vruo6eOZgJ=4B57NY8ShQQosvRuzzw@mail.gmail.com>
 <2F63A5118871B64C8DCDC1ED90D7C472ADD5D464@CH-EXCHANGE2K10.ad.cupertino.org>
 <CAOfJSTyYOsNpD_grCC9+PbTJaRZdeHJkqc=zM_SjQDMU6d_fxw@mail.gmail.com>
 <CAOfJSTxFrjCG_tXWfrQOnoaSwQ8f4pbQBpCYeACdCebbsFfb2Q@mail.gmail.com>
In-Reply-To: <CAOfJSTxFrjCG_tXWfrQOnoaSwQ8f4pbQBpCYeACdCebbsFfb2Q@mail.gmail.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-originating-ip: [192.168.101.12]
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64
MIME-Version: 1.0

T3VyIGRpc2NsYWltZXIgZG9lcyBub3QgcmVzdHJpY3QgeW91IGZyb20gaW1wb3J0aW5nIHRoZSBk
YXRhIHRvIG9wZW5zdHJlZXRtYXAuDQoNClRlcmkgR2VyaGFyZHQgR0lTUA0KR0lTIENvb3JkaW5h
dG9yDQpDaXR5IG9mIEN1cGVydGlubw0KNDA4Ljc3Ny4zMzExDQp0ZXJpZ0BjdXBlcnRpbm8ub3Jn
DQoNCi0tLS0tT3JpZ2luYWwgTWVzc2FnZS0tLS0tDQpGcm9tOiBFcmljIEppYW5nIFttYWlsdG86
ZXJpY0Bkb3VibGVtYXAuY29tXSANClNlbnQ6IE1vbmRheSwgU2VwdGVtYmVyIDI4LCAyMDE1IDEy
OjM1IFBNDQpUbzogR0lTIENvb3JkaW5hdG9yIDxnaXNAY3VwZXJ0aW5vLm9yZz4NClN1YmplY3Q6
IFJlOiBDdXBlcnRpbm8gYnVpbGRpbmcgb3V0bGluZXMgaW4gT3BlblN0cmVldE1hcA0KDQpJIHJl
YWxpc2UgbXkgZmlyc3QgZW1haWwgd2FzIG5vdCB2ZXJ5IGNsZWFyIC0gSSB3b3VsZCBsaWtlIHRv
IGltcG9ydCB0aGlzIGRhdGEgaW50byBPcGVuU3RyZWV0TWFwLiBTaW5jZSBPcGVuU3RyZWV0TWFw
IGlzIGEgcHVibGljIHByb2plY3Qgd2hvc2UgZGF0YSBpcyBhdmFpbGFibGUgdG8gYW55b25lLCBJ
IHdhbnRlZCB0byBjaGVjayB0aGF0IHdlIHdvdWxkbid0IHJ1biBpbnRvIGFueSBsZWdhbCBpc3N1
ZXMgYnkgZG9pbmcgc28uDQoNCkVyaWMNCg0KT24gTW9uLCBTZXAgMjgsIDIwMTUgYXQgMTI6MzEg
UE0sIEVyaWMgSmlhbmcgPGVyaWNAZG91YmxlbWFwLmNvbT4gd3JvdGU6DQo+IFRoYW5rIHlvdSBm
b3Igc2VuZGluZyBtZSB0aGlzIGRhdGEhIFdoaWxlIEkgdW5kZXJzdGFuZCB0aGF0IHRoZSBjaXR5
IA0KPiBwcm92aWRlcyBubyBndWFyYW50ZWVzIG9uIHRoZSBkYXRhLCBJIGRvIHdhbnQgdG8gZG91
YmxlLWNoZWNrIHRoYXQgDQo+IHRoaXMgZGF0YSBpcyBpbiB0aGUgcHVibGljIGRvbWFpbiAtIGlm
IHRoZXJlIGFyZSBhbnkgY29weXJpZ2h0IG9yIA0KPiBsaWNlbnNpbmcgcmVzdHJpY3Rpb25zIG9u
IGl0LCBwbGVhc2UgbGV0IG1lIGtub3cuDQo+DQo+IFRoYW5rcyENCg0KDQoNCg0KLS0NCkVyaWMg
SmlhbmcsIERvdWJsZU1hcA0KU3VpdGUgNDAxIHwgNDI5IE4uIFBlbm5zeWx2YW5pYSBTdHJlZXQg
fCBJbmRpYW5hcG9saXMsIElOIDQ2MjA0IE9mZmljZSArMSA4NTUuNDYzLjY2NTUgZXJpY0Bkb3Vi
bGVtYXAuY29tIHwgd3d3LmRvdWJsZW1hcC5jb20NCg==
Follow-up email about restrictions:
Delivered-To: eric@doublemap.com
Received: by 10.76.60.138 with SMTP id h10csp1349619oar;
        Mon, 28 Sep 2015 07:56:46 -0700 (PDT)
X-Received: by 10.68.242.130 with SMTP id wq2mr26757790pbc.117.1443452205296;
        Mon, 28 Sep 2015 07:56:45 -0700 (PDT)
Return-Path: <gis@cupertino.org>
Received: from mail.cupertino.org (mail.cupertino.org. [64.165.34.28])
        by mx.google.com with ESMTPS id rt6si11759471pbb.18.2015.09.28.07.56.42
        for <eric@doublemap.com>
        (version=TLSv1 cipher=ECDHE-RSA-AES128-SHA bits=128/128);
        Mon, 28 Sep 2015 07:56:45 -0700 (PDT)
Received-SPF: pass (google.com: best guess record for domain of gis@cupertino.org designates 64.165.34.28 as permitted sender) client-ip=64.165.34.28;
Authentication-Results: mx.google.com;
       spf=pass (google.com: best guess record for domain of gis@cupertino.org designates 64.165.34.28 as permitted sender) smtp.mailfrom=gis@cupertino.org
Received: from CH-EXCHANGE2K10.ad.cupertino.org ([fe80::30e2:b47e:de54:f842])
 by CH-EXCHANGE2K10.ad.cupertino.org ([fe80::30e2:b47e:de54:f842%23]) with
 mapi id 14.03.0224.002; Mon, 28 Sep 2015 07:56:36 -0700
From: GIS Coordinator <gis@cupertino.org>
To: Eric Jiang <eric@doublemap.com>
Subject: RE: Cupertino building outlines in OpenStreetMap
Thread-Topic: Cupertino building outlines in OpenStreetMap
Thread-Index: AQHQ99iMOMccl0DJiUOQc2E0NR0zD55SCnXw
Date: Mon, 28 Sep 2015 14:56:34 +0000
Message-ID: <2F63A5118871B64C8DCDC1ED90D7C472ADD5D464@CH-EXCHANGE2K10.ad.cupertino.org>
References: <CAOfJSTzUcCCePzMWM2U1vruo6eOZgJ=4B57NY8ShQQosvRuzzw@mail.gmail.com>
In-Reply-To: <CAOfJSTzUcCCePzMWM2U1vruo6eOZgJ=4B57NY8ShQQosvRuzzw@mail.gmail.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: yes
X-MS-TNEF-Correlator:
x-originating-ip: [192.168.101.12]
Content-Type: multipart/mixed;
	boundary="_007_2F63A5118871B64C8DCDC1ED90D7C472ADD5D464CHEXCHANGE2K10a_"
MIME-Version: 1.0

--_007_2F63A5118871B64C8DCDC1ED90D7C472ADD5D464CHEXCHANGE2K10a_
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64

SEkgRXJpYywNCg0KV2UgY3VycmVudGx5IGhhdmUgbm8gcGxhbnMgdG8gcHV0IG91ciBkYXRhIG9u
IE9wZW5TdHJlZXRNYXAuIEhvd2V2ZXIgb3VyIGRhdGEgaXMgYXZhaWxhYmxlIHRvIHRoZSBwdWJs
aWMgdGhyb3VnaCBhIHJlcXVlc3Qgc3VjaCBhcyB0aGlzLiANCg0KVGhlIENpdHkgb2YgQ3VwZXJ0
aW5vIGRvZXMgbm90IGd1YXJhbnRlZSB0aGUgYWNjdXJhY3ksIGFkZXF1YWN5LCBjb21wbGV0ZW5l
c3Mgb3IgdXNlZnVsbmVzcyBvZiBhbnkgaW5mb3JtYXRpb24uIFRoZSBDaXR5IGRvZXMgbm90IHdh
cnJhbnQgdGhlIHBvc2l0aW9uYWwgb3IgdGhlbWF0aWMgYWNjdXJhY3kgb2YgdGhlIEdJUyBkYXRh
LiBUaGUgR0lTIGRhdGEgYW5kIGNhcnRvZ3JhcGhpYyBkaWdpdGFsIGZpbGVzIGFyZSBub3QgbGVn
YWwgcmVwcmVzZW50YXRpb25zIG9mIHRoZSBkZXBpY3RlZCBkYXRhLiBJbmZvcm1hdGlvbiBzaG93
biBvbiB0aGVzZSBsYXllcnMgaXMgZGVyaXZlZCBmcm9tIHB1YmxpYyByZWNvcmRzIHRoYXQgYXJl
IGNvbnN0YW50bHkgdW5kZXJnb2luZyBjaGFuZ2UuIFVuZGVyIG5vIGNpcmN1bXN0YW5jZXMgc2hh
bGwgR0lTIG1hcHBpbmcgYmUgdXNlZCBmb3IgZmluYWwgZGVzaWduIHB1cnBvc2VzLiBXaGlsZSBl
dmVyeSBlZmZvcnQgaGFzIGJlZW4gbWFkZSB0byBlbnN1cmUgdGhlIGNvbnRlbnQsIHNlcXVlbmNl
LCBhY2N1cmFjeSwgdGltZWxpbmVzcyBvciBjb21wbGV0ZW5lc3Mgb2YgbWF0ZXJpYWxzIHByZXNl
bnRlZCB3aXRoaW4gdGhlc2UgcGFnZXMsIHRoZSBDaXR5IG9mIEN1cGVydGlubyBhc3N1bWVzIG5v
IHJlc3BvbnNpYmlsaXR5IGZvciBlcnJvcnMgb3Igb21pc3Npb25zLCBhbmQgZXhwbGljaXRseSBk
aXNjbGFpbXMgYW55IHJlcHJlc2VudGF0aW9ucyBhbmQgd2FycmFudGllcywgaW5jbHVkaW5nLCB3
aXRob3V0IGxpbWl0YXRpb24sIHRoZSBpbXBsaWVkIHdhcnJhbnRpZXMgb2YgbWVyY2hhbnRhYmls
aXR5IGFuZCBmaXRuZXNzIGZvciBhIHBhcnRpY3VsYXIgcHVycG9zZS4NCg0KDQpUZXJpIEdlcmhh
cmR0IEdJU1ANCkdJUyBDb29yZGluYXRvcg0KQ2l0eSBvZiBDdXBlcnRpbm8NCjQwOC43NzcuMzMx
MQ0KdGVyaWdAY3VwZXJ0aW5vLm9yZw0KDQotLS0tLU9yaWdpbmFsIE1lc3NhZ2UtLS0tLQ0KRnJv
bTogRXJpYyBKaWFuZyBbbWFpbHRvOmVyaWNAZG91YmxlbWFwLmNvbV0gDQpTZW50OiBGcmlkYXks
IFNlcHRlbWJlciAyNSwgMjAxNSAyOjI0IFBNDQpUbzogR0lTIENvb3JkaW5hdG9yIDxnaXNAY3Vw
ZXJ0aW5vLm9yZz4NClN1YmplY3Q6IEN1cGVydGlubyBidWlsZGluZyBvdXRsaW5lcyBpbiBPcGVu
U3RyZWV0TWFwDQoNCkhlbGxvLA0KDQpJIGFtIGludGVyZXN0ZWQgaW4gd2hldGhlciBpbXBvcnQg
Q3VwZXJ0aW5vIGJ1aWxkaW5nIGZvb3RwcmludCBkYXRhIGludG8gT3BlblN0cmVldE1hcC4gT3Bl
blN0cmVldE1hcCBpcyBhIHZvbHVudGVlci1lZGl0ZWQgbWFwIG9mIHRoZSB3b3JsZCB0aGF0IHBy
b3ZpZGVzIGl0cyBkYXRhIGFzIG9wZW4gZGF0YSBmb3IgYW55b25lIHRvIHVzZSAod3d3Lm9wZW5z
dHJlZXRtYXAub3JnKS4gTWFueSBvdGhlciBwYXJ0cyBvZiB0aGUgQmF5IEFyZWEgYWxyZWFkeSBo
YXZlIGNvbXBsZXRlIGJ1aWxkaW5nIGRhdGEgaW4gT1NNIChTRiwgUGFsbyBBbHRvKSwgc28gSSdt
IGhvcGluZyB0byBpbmNsdWRlIEN1cGVydGlubyBhcyB3ZWxsLg0KDQpJIHdvdWxkIGxpa2UgdG8g
a25vdyBpZiB0aGlzIGRhdGEgaXMgYXZhaWxhYmxlIGFuZCBpZiBzbywgd2hldGhlciBpdCdzIGlu
IHRoZSBwdWJsaWMgZG9tYWluIG9yIGF2YWlsYWJsZSB1bmRlciBhbiBvcGVuIGxpY2Vuc2Ugb3Ig
c2ltaWxhci4NCg0KVGhhbmtzLA0KRXJpYw0KDQotLQ0KRXJpYyBKaWFuZywgRG91YmxlTWFwDQpT
dWl0ZSA0MDEgfCA0MjkgTi4gUGVubnN5bHZhbmlhIFN0cmVldCB8IEluZGlhbmFwb2xpcywgSU4g
NDYyMDQgT2ZmaWNlICsxIDg1NS40NjMuNjY1NSBlcmljQGRvdWJsZW1hcC5jb20gfCB3d3cuZG91
YmxlbWFwLmNvbQ0K
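
The message bodies above are stored base64-encoded, as declared by their Content-Transfer-Encoding headers. As a minimal sketch, assuming you only want to read a text/plain body, the snippet below decodes the opening of the first message quoted above (dated 1 Oct 2015). The helper name decode_body is illustrative and is not part of the import scripts.

```python
import base64

# Opening of the base64 body of the first message above. The second chunk is
# cut at a 4-character boundary so that it decodes cleanly on its own.
encoded = (
    "T3VyIGRpc2NsYWltZXIgZG9lcyBub3QgcmVzdHJpY3QgeW91IGZyb20gaW1wb3J0aW5nIHRoZSBk"
    "YXRhIHRvIG9wZW5zdHJlZXRtYXAu"
)


def decode_body(b64_text: str) -> str:
    """Decode a base64 body whose declared charset is utf-8."""
    return base64.b64decode(b64_text).decode("utf-8")


decoded = decode_body(encoded)
print(decoded)
```

To read a whole message, paste every base64 line of that body into one string. Python's base64.b64decode discards characters outside the base64 alphabet by default, so embedded line breaks are harmless.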

Pre-existing data

The vast majority of buildings within Cupertino are unmapped. For the buildings that are drawn, there are only a handful of areas that have addresses, and many of the addresses lack addr:street tags. Where available, pre-existing data will be preferred over the imported data, unless imagery or surveys show that one source is far better.

Import contents

Three shapefiles were provided by the city:

  • Building footprints
  • Building addresses (approx. 16,000)
  • Secondary addresses (apartment unit numbers, suite numbers, etc.)

Building footprints

The building footprints provided by the city are current as of 2015, have good resolution, and line up well with Bing imagery.

The building shapefile also has a "Height" attribute. According to the city, the height is measured in feet and was acquired using LIDAR. The values are given only in whole feet, but otherwise seem to be accurate.

Building addresses

Overall, the address data look very good. Addresses for single-family residences (the majority of the data) are mapped to the building entrances, which should make merging with the building footprints very easy. Some address points do not match up perfectly with a building entrance, and others represent a group of buildings. These will be handled manually.

Secondary addresses

There is no plan to import the secondary addresses.


Tagging

Tags are extracted from the address data and reformatted for OSM conventions. The following tags will be added to the buildings and/or address points:

  • addr:housenumber
  • addr:street
  • addr:postcode (zip or zip+4 where available)
  • building=yes
  • height (in meters: the source height in feet divided by 3.281)
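
The height conversion in the list above can be sketched as follows. This is only an illustration, not the script used for the import: rounding to one decimal place and treating a height of 0 as "no data" are both assumptions.

```python
FEET_PER_METER = 3.281  # conversion factor quoted in the tag list above


def feet_to_meters(height_ft, ndigits=1):
    """Convert feet to meters. Rounding to one decimal is an assumption."""
    return round(height_ft / FEET_PER_METER, ndigits)


def height_tag(height_ft):
    """Return the OSM height value as a string, or None if no height is recorded."""
    if not height_ft:  # treats 0 or None as "no data" (assumption)
        return None
    return str(feet_to_meters(height_ft))


print(height_tag(20))
```

Because the source heights are whole feet, one decimal place in meters already carries slightly more precision than the original measurement.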

The buildings shapefile also comes with a "Floors" attribute that seems intended to indicate the number of floors in the building, but most buildings have Floors=0. Only a handful of buildings have Floors=1, and there do not seem to be any other values in use.

Process

Tasks will be coordinated using the US OSM Tasking Manager: http://tasks.openstreetmap.us/project/6

  1. Merge addresses with building footprints. (See the scripts in the GitHub repo.)
  2. Manually import good-looking data one neighborhood at a time to resolve conflicts with existing data.
  3. Survey any suspicious data and import only if it is valid.
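
As a rough illustration of step 1, the sketch below attaches an address to a building when exactly one address point falls inside the footprint. It is not the script from the repo. The data layout (dicts with "ring", "point" and "tags" keys) and the sample coordinates are made up for the example, and coordinates are assumed to be planar. Address points that sit exactly on a building entrance may lie on the outline rather than inside it, so a real merge would probably also need a distance tolerance.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test. ring is a list of (x, y) vertices, closed implicitly."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def merge_addresses(buildings, addresses):
    """Return (merged_buildings, leftover_addresses).

    A building receives address tags only if exactly one address point lies
    inside it. Everything else is left over for manual handling.
    """
    hits = {}
    leftovers = []
    for addr in addresses:
        x, y = addr["point"]
        containing = [i for i, b in enumerate(buildings)
                      if point_in_polygon(x, y, b["ring"])]
        if len(containing) == 1:
            hits.setdefault(containing[0], []).append(addr)
        else:
            leftovers.append(addr)

    merged = []
    for i, b in enumerate(buildings):
        tags = dict(b["tags"])
        inside = hits.get(i, [])
        if len(inside) == 1:
            tags.update(inside[0]["tags"])
        else:
            leftovers.extend(inside)  # several addresses in one building
        merged.append({"ring": b["ring"], "tags": tags})
    return merged, leftovers


# Made-up sample data: two square buildings and two address points.
buildings = [
    {"ring": [(0, 0), (10, 0), (10, 10), (0, 10)], "tags": {"building": "yes"}},
    {"ring": [(20, 0), (30, 0), (30, 10), (20, 10)], "tags": {"building": "yes"}},
]
addresses = [
    {"point": (5, 5), "tags": {"addr:housenumber": "101"}},
    {"point": (50, 50), "tags": {"addr:housenumber": "999"}},
]
merged, leftovers = merge_addresses(buildings, addresses)
print(merged[0]["tags"], len(leftovers))
```

Leftover address points correspond to the cases described above that are handled manually.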

Contributor's guide

If you are relatively new to using JOSM or importing data, please look at the Louisville, Kentucky/Building Outlines Import/Contributor Guide page. That guide was written for Louisville, but the same process applies here, especially with regard to working with data in JOSM.

How to get help

For questions and concerns about the data, contact user "erjiang" via OSM messages. For cases where you are unsure of a specific building and you cannot survey it yourself, you can use your best judgment and then leave an OSM Note at that location asking for someone to examine it. (See this video guide to adding OSM notes in JOSM). If you have many concerns or are very unsure about a region, it's best to leave a comment in the tasking manager and let someone else look into it.

Create an import account

Use a separate account for importing data, instead of your regular OSM account. A good pattern to follow is: your_username_cupertino_import

Acquire JOSM

You will need the JOSM editor to open the import data files. It runs on Java, so it will run on most computers.

Tasking manager

You can find the project in the US OSM Tasking Manager here: http://tasks.openstreetmap.us/project/6

Log in using your new OSM account, and pick a region.

Click the green "Start Mapping" button, which will lock that region so that other people don't try editing it at the same time.

Opening data in JOSM

There will be a link in the comments that looks like "http://erjiang.github.io/-XXXXX.osm". Copy that URL into your browser to download the file, and then open it in JOSM. You will see the building data for that area in JOSM. (If there is more than one link in the comments, one of them is a mistake and is for another region. Download both and then just delete the one that isn't the region you selected.)

Download OSM map data in JOSM (Ctrl+Shift+Down) and make sure to check the "Download as new layer" box. You will now have at least two layers: one with the to-be-imported data, one with OSM data.

Merging data

This is the part that requires careful inspection.

Some buildings in the import are already mapped in OSM. We prefer to keep the existing OSM building instead of the import data unless the imported data is much better. However, the imported data has some tags that aren't in the OSM data. You can copy the tags from the import data into the existing OSM building:

  1. Examine tags in both data sets to see if there are any conflicts.
  2. Select the building in the import data and copy it (Ctrl+C).
  3. Select the existing OSM building and paste the tags (Ctrl+Shift+V).
  4. Delete the building from the import data.
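
The conflict check in step 1 amounts to comparing two sets of key/value pairs. The sketch below illustrates that logic outside JOSM. It is not part of the import scripts, and the tag values are made up. Existing OSM values are kept, and any key where the two sources disagree is reported for manual review.

```python
def merge_tags(existing, imported):
    """Return (merged, conflicts).

    Import tags are added only where the existing object has no value.
    conflicts maps each disputed key to (existing_value, imported_value).
    """
    merged = dict(existing)
    conflicts = {}
    for key, value in imported.items():
        if key not in merged:
            merged[key] = value
        elif merged[key] != value:
            conflicts[key] = (merged[key], value)
    return merged, conflicts


# Made-up example values.
existing = {"building": "house", "addr:housenumber": "101"}
imported = {
    "building": "yes",
    "addr:housenumber": "101",
    "addr:street": "Example Street",
    "height": "6.1",
}
merged, conflicts = merge_tags(existing, imported)
print(merged)
print(conflicts)
```

Here the existing building=house is kept over the generic building=yes from the import, consistent with the preference for pre-existing data stated above.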

Other things to watch out for:

  • Cut-off buildings on the edges of the data. This data set includes some buildings along the boundaries that are actually outside Cupertino. Many of these buildings were cut off by an imaginary boundary line. Delete any buildings that are not whole.
  • Address points that don't correspond to buildings. There are some address points that are not in a building. This may be because the building was demolished or has not been built yet. Typically, these address points should not be imported and a note should be left for someone to survey.
  • Crooked lines. Some buildings were drawn sloppily, and corners that should be right angles are not right angles. Fix up these building outlines as needed.
  • Points of Interest. Some address points or buildings match up with a corresponding restaurant or other business. You can look up the business's website to see if these should be merged, but be careful: some businesses only occupy one unit of a multi-tenant building.
  • Intersections with existing features. There may be places where buildings intersect streets or other existing OSM data. Check these against aerial imagery.
  • Bad geometry. This includes duplicate points in a way (the validator will find these) or buildings that are self-intersecting.
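
Two of the checks above, duplicate points and crooked corners, are simple enough to sketch in code. This is only an illustration of what to look for, not a replacement for the JOSM validator. It assumes planar (projected) coordinates and a closed ring whose last vertex repeats the first. The thresholds are arbitrary assumptions, and self-intersection is not checked here.

```python
import math


def duplicate_points(ring):
    """Indices i where ring[i] repeats the vertex immediately before it."""
    return [i for i in range(1, len(ring)) if ring[i] == ring[i - 1]]


def corner_angles(ring):
    """Angle in degrees at each corner of a closed ring (first == last)."""
    pts = ring[:-1]  # drop the repeated closing vertex
    n = len(pts)
    angles = []
    for i in range(n):
        ax, ay = pts[i - 1]
        bx, by = pts[i]
        cx, cy = pts[(i + 1) % n]
        v1 = (ax - bx, ay - by)
        v2 = (cx - bx, cy - by)
        cross = v1[0] * v2[1] - v1[1] * v2[0]
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        angles.append(abs(math.degrees(math.atan2(cross, dot))))
    return angles


def crooked_corners(ring, min_dev=0.5, max_dev=10.0):
    """Indices of corners that are nearly, but not quite, right angles.

    min_dev and max_dev (in degrees) are arbitrary thresholds for this sketch.
    """
    return [i for i, angle in enumerate(corner_angles(ring))
            if min_dev < abs(angle - 90.0) <= max_dev]


square = [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]
skewed = [(0, 0), (10, 0), (10.5, 10), (0, 10), (0, 0)]
print(crooked_corners(square), crooked_corners(skewed))
print(duplicate_points([(0, 0), (0, 0), (10, 0), (10, 10), (0, 0)]))
```

Corners that deviate by more than max_dev are assumed to be intentional shapes, such as bay windows or angled walls, and are not flagged.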


When you are done with this step, you should still have two separate layers, but they should not have any duplicate elements between them.

Run validator

The JOSM validator can catch many common errors in both the OSM data and the imported data. Deselect any data in the JOSM editor and then click the Validation button in the lower-right corner.

Upload data

Make sure you are logged in with your import account, not your regular OSM account.

First, switch to the OSM data layer and upload any changes you made to the existing OSM data. Then, switch to the import data layer and upload the data there to OSM.

Finally, mark the task as completed in the tasking manager.

Data

The data and scripts for this import can be found at: https://github.com/erjiang/cupertino-addresses

If you do not have Python handy, the finalized output, separated into regions, can be found at: http://erjiang.github.io/ (but see the Process section if you are interested in contributing)

Import users

The username "erjiang_imports" will be used for the import.