User:Ale Zena IT/Sandbox

About

This page describes the import of buildings in Bologna using data provided by the Municipality of Bologna.

The import has been discussed on the OSM mailing list and the Italian OSM mailing list. This wiki page is the result of consensus there.

Import Plan Outline

Goals

The goal of this import is to use the high-quality dataset provided by the Municipality of Bologna to replace the building data already in OSM, which came from an earlier import of lower-quality data provided by Regione Emilia-Romagna.

Schedule

TBD.

Import Data

Background

Building data format

Building data comes in ESRI Shapefile format.


Legal

Data source site: http://dati.comune.bologna.it/node/177
Data license: http://dati.comune.bologna.it/node/3656
Type of license: CC BY 4.0, ODbL compliant (a standard CC BY 4.0 license plus an ODbL addendum that makes it compatible with OpenStreetMap)
OSM attribution: https://wiki.openstreetmap.org/wiki/Contributors#Bologna
(Please, remember to update that section with information about the buildings)

ODbL Compliance verified: yes

From the license specific page http://dati.comune.bologna.it/node/3656

(in Italian): "In caso di riuso del dataset "Edifici" da parte di OpenStreetMap, l'attribuzione da parte di OpenStreetMap e dei suoi utenti attraverso http://wiki.openstreetmap.org/wiki/Contributors è sufficiente a soddisfare l'obbligo di attribuzione in favore del Comune di Bologna in "maniera ragionevole" in applicazione dell'articolo 3(a)(1) della licenza CC BY 4.0."

(in English): "In case of reuse of dataset “Edifici” by OpenStreetMap, the attribution by OpenStreetMap and its users through http://wiki.openstreetmap.org/wiki/Contributors is sufficient to provide attribution to Comune di Bologna in a "reasonable manner" in accordance with Section 3(a)(1) of the CC BY 4.0 license."

Import Type

The dataset will be imported in several changesets, since it comprises more than 66,000 buildings or building parts.

Each small chunk of the dataset will be loaded in JOSM, where it will manually replace the existing OpenStreetMap data before the upload.
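One possible way to cut the dataset into small chunks is to group buildings by a regular grid over their centroids. The sketch below is only an illustration: the function name chunk_by_grid, the input structure (a dictionary of building id to lon/lat centroid) and the 0.01 degree cell size are assumptions, not part of the agreed plan.

```python
import math
from collections import defaultdict


def chunk_by_grid(centroids, cell_deg=0.01):
    """Group building ids by the grid cell that contains their centroid.

    centroids: dict mapping a building id to a (lon, lat) tuple in WGS84.
    cell_deg:  grid cell size in degrees (assumed value, tune as needed).
    Returns a dict mapping (col, row) grid cells to lists of building ids.
    """
    chunks = defaultdict(list)
    for building_id, (lon, lat) in centroids.items():
        cell = (math.floor(lon / cell_deg), math.floor(lat / cell_deg))
        chunks[cell].append(building_id)
    return dict(chunks)
```

Each resulting list could then be exported and loaded in JOSM as one chunk.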

Data Preparation

Tagging Plans

The data is provided as a shapefile. The shapefile consists of a collection of polygons, one for each building part.

Each polygon has the following keys:

  • ID_VIA: a unique numeric identifier for street (see "elenco_viesit.csv" table)
  • CIV: housenumber
  • SUB: subordinate, if present
  • CIVICO: housenumber with subordinate
  • ZACCESSO: access type
  • ZPRINCIP: 1 if main entrance, 0 otherwise

elenco_viesit.csv table contains the following keys:

  • Id_VIA: a unique numeric identifier for street
  • NOME_VIA: street name

The shapefile will be converted to OSM XML using ogr2osm. The projection is correctly detected automatically as EPSG:4326 (WGS84 latitude-longitude).

The tags that will be used in the final upload are ref, building, building:height and sometimes name or description.

The tags will be as follows:

  • ref will contain the number in PROGAGG.
  • building will contain the building use from DEC_TIPO, after preprocessing the field value to choose the OSM category that best fits the building.
  • building:height will contain the "ALT_UV" value divided by 1000. Values of 0 will be omitted.
  • name or description will, where applicable, contain useful information from the "NOTEOGG" field.
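The tag mapping above can be sketched as a small Python function, assuming each shapefile record arrives as a dictionary of field names to string values. The function name filter_tags, the DEC_TIPO keys in BUILDING_TYPES, the building=yes fallback and the use of description rather than name are all illustrative assumptions; the real DEC_TIPO categories are not listed on this page.

```python
# Hypothetical DEC_TIPO -> OSM building values. Placeholder keys only:
# the real categories must be taken from the dataset itself.
BUILDING_TYPES = {
    "residenziale": "residential",
    "industriale": "industrial",
    "chiesa": "church",
}


def filter_tags(attrs):
    """Translate one shapefile record into a dict of OSM tags."""
    tags = {}

    # ref takes the number stored in PROGAGG.
    if attrs.get("PROGAGG"):
        tags["ref"] = attrs["PROGAGG"]

    # building: map DEC_TIPO to an OSM category, falling back to "yes"
    # (the fallback is an assumption).
    dec_tipo = (attrs.get("DEC_TIPO") or "").strip().lower()
    tags["building"] = BUILDING_TYPES.get(dec_tipo, "yes")

    # building:height: ALT_UV divided by 1000, omitted when 0 or invalid.
    try:
        alt_uv = float(attrs.get("ALT_UV") or 0)
    except ValueError:
        alt_uv = 0.0
    if alt_uv > 0:
        tags["building:height"] = f"{alt_uv / 1000:g}"

    # description: free-text note from NOTEOGG, when present.
    note = (attrs.get("NOTEOGG") or "").strip()
    if note:
        tags["description"] = note

    return tags
```

A function of this shape could be adapted to whatever translation hook the ogr2osm version in use expects.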

Dedicated upload account

Some dedicated accounts will be used to upload the imported data.

Changeset Tags

Changesets will be tagged with source=OpenData Bologna.

Data Transformation

The street names in NOME_VIA in "elenco_viesit.csv" do not follow Italian naming conventions. A Python script will be used to normalize them and write a new file called "elenco_viesit_osm.csv", which will have a new key named "NOME_VIA_OSM" containing the normalized street name.

The "elenco_viesit_osm.csv" table will be manually edited to correct errors (e.g. normalization failures, apostrophes used instead of accents, people's names written as surname + name instead of name + surname, misspelled names). It is better to edit the table once now than several times after the import.
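As a rough illustration of the normalization step, the sketch below title-cases each street name, keeps common prepositions in lowercase and adds the NOME_VIA_OSM column. It is not the script that will actually be used: the function names, the LOWERCASE_WORDS list, the comma delimiter and the assumption that the source names are in upper case are all hypothetical.

```python
import csv
import io

# Words kept in lowercase inside a street name (assumed list).
LOWERCASE_WORDS = {"di", "del", "dello", "della", "dei", "degli", "delle",
                   "da", "de", "e", "al", "alla"}


def _normalize_word(word, first):
    """Normalize one lowercase word; `first` marks the first word of the name."""
    head, sep, tail = word.partition("'")
    if sep and tail:
        # Elided form such as dell'indipendenza -> dell'Indipendenza.
        head = head.capitalize() if first else head
        return head + "'" + tail.capitalize()
    if not first and word in LOWERCASE_WORDS:
        return word
    return word.capitalize()


def normalize_street_name(raw):
    """Return a title-cased street name with lowercase prepositions."""
    words = raw.strip().lower().split()
    return " ".join(_normalize_word(w, i == 0) for i, w in enumerate(words))


def add_osm_names(src, dst):
    """Copy a CSV from src to dst, appending a NOME_VIA_OSM column."""
    reader = csv.DictReader(src)
    fieldnames = list(reader.fieldnames) + ["NOME_VIA_OSM"]
    writer = csv.DictWriter(dst, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row["NOME_VIA_OSM"] = normalize_street_name(row["NOME_VIA"])
        writer.writerow(row)


# Small in-memory demonstration with made-up rows.
sample = io.StringIO("Id_VIA,NOME_VIA\n1,VIA DELL'INDIPENDENZA\n2,PIAZZA DI PORTA RAVEGNANA\n")
result = io.StringIO()
add_osm_names(sample, result)
print(result.getvalue())
```

A rule-based pass like this cannot tell a surname that starts with an elided particle from a preposition, nor restore accents written as apostrophes, which is why the manual review of the table remains necessary.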

After that, ogr2osm will be used to convert the shapefile to OSM XML format using the above tagging plan.

The source scripts will be available at https://github.com/musuruan/osm_imports

Data Transformation Results

OSM XML file: https://dl.dropboxusercontent.com/u/12575912/biella_civici_od.osm

OSM XML with addresses already in OSM merged: https://dl.dropboxusercontent.com/u/12575912/biella_civici_od_merged.osm

Data Merge Workflow

Addresses already in OSM will be extracted using the following Overpass query:


<osm-script>
  <query into="comune" type="area">
    <has-kv k="admin_level" v="8"/>
    <has-kv k="name" v="Biella"/>
  </query>
  <union>
    <query type="node">
      <area-query from="comune"/>
      <has-kv k="addr:housenumber"/>
    </query>
    <query type="way">
      <area-query from="comune"/>
      <has-kv k="addr:housenumber"/>
    </query>
    <item/>
    <recurse type="down"/>
  </union>
  <print mode="meta"/>
</osm-script>

Running the query returns only a few house numbers: 17 nodes and 1 building.
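To check these counts on a saved query result, the OSM XML output can be parsed with the Python standard library. This is a minimal sketch: the function name count_addressed is an assumption, and it expects the XML text of the Overpass response as input. Untagged nodes returned by the downward recursion are ignored because they carry no addr:housenumber tag.

```python
import xml.etree.ElementTree as ET


def count_addressed(osm_xml):
    """Count nodes and ways that carry an addr:housenumber tag.

    osm_xml: OSM XML text, for example the saved output of the query above.
    Returns a dict such as {"node": 17, "way": 1}.
    """
    root = ET.fromstring(osm_xml)
    counts = {"node": 0, "way": 0}
    for kind in counts:
        for elem in root.findall(kind):
            has_number = any(
                tag.get("k") == "addr:housenumber" for tag in elem.findall("tag")
            )
            if has_number:
                counts[kind] += 1
    return counts
```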

Since address data in the Municipality of Biella data source is placed exclusively on nodes (a wise choice, because a building can have several entrances and therefore several addresses), the only address currently placed on a building will be removed.

Addresses already present will be merged. This will be done manually, since there are only a few.

Addresses not found in the open data source will be tagged with fixme and must later be verified on the ground.
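Although the merge is planned as a manual step, the rule for flagging unmatched addresses can be sketched in a few lines. The function names, the case-insensitive comparison on street and house number, and the wording of the fixme value are assumptions made for illustration.

```python
def address_key(tags):
    """Build a case-insensitive (street, housenumber) key from OSM tags."""
    street = tags.get("addr:street", "").strip().casefold()
    number = tags.get("addr:housenumber", "").strip().casefold()
    return (street, number)


def flag_unmatched(existing, imported_keys):
    """Add a fixme tag to existing addresses missing from the open data.

    existing:      list of tag dicts for addresses already in OSM.
    imported_keys: set of address_key() values built from the open data.
    The tag dicts are modified in place and the list is returned.
    """
    for tags in existing:
        if address_key(tags) not in imported_keys:
            tags["fixme"] = "address not found in open data, verify on the ground"
    return existing
```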

Team Approach

This will be a solo effort. Import will be managed by Andrea Musuruane.

Workflow

Step by step instructions:

  1. Normalize street names and create "elenco_viesit_osm.csv" table
  2. Manually edit "elenco_viesit_osm.csv" table to correct errors
  3. Run ogr2osm to export the data in OSM XML
  4. Run overpass query to export the existing addresses
  5. Merge these addresses in JOSM
  6. Upload the changeset in OSM

The changeset should be small enough to be uploaded at once.

If a problem occurs during the import, the changeset will be reverted using the JOSM Reverter Plugin.

Conflation

See #Data Merge Workflow.

QA

Street names

After the import, addr:street values may differ slightly from the names of the streets themselves.

These differences should be caught using OSM Inspector (map already centered on Biella).

Since we spent a lot of time editing the "elenco_viesit_osm.csv" table, addr:street name should be used as a reference.
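As a complement to OSM Inspector, the same comparison can be approximated offline. The sketch below lists addr:street values that have no identically named street and suggests the closest existing street name using difflib from the Python standard library. The function name, the inputs (two plain lists of names) and the 0.6 similarity cutoff are assumptions.

```python
import difflib


def find_street_mismatches(addr_streets, highway_names):
    """Map each unmatched addr:street value to its closest street name.

    addr_streets:  iterable of addr:street values from the imported data.
    highway_names: iterable of name values of streets already in OSM.
    Returns {addr_street: closest_highway_name_or_None}.
    """
    known = set(highway_names)
    candidates = sorted(known)
    mismatches = {}
    for street in sorted(set(addr_streets)):
        if street in known:
            continue  # exact match, nothing to review
        close = difflib.get_close_matches(street, candidates, n=1, cutoff=0.6)
        mismatches[street] = close[0] if close else None
    return mismatches
```

When a suggestion is returned, the street name in OSM would be the one to correct, since addr:street is the reference.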

Unmarked streets

The result can be used to locate areas where streets are missing.

Missing roads will be created in JOSM using PCN 2012 aerial imagery.

Unnamed streets

The result can be used to derive street names for unnamed streets when all the nodes along the street have the same addr:street value.
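The rule for unnamed streets can be expressed as a small check: a name is proposed only when every address node along the street carries the same, non-empty addr:street value. The function name and its input (the list of addr:street values collected along one street) are assumptions.

```python
def derive_street_name(addr_street_values):
    """Return the shared addr:street value, or None if there is no consensus.

    addr_street_values: addr:street values of the address nodes found along
    one unnamed street (None or "" for nodes without the tag).
    """
    values = list(addr_street_values)
    if not values or any(not value for value in values):
        return None  # no nodes, or at least one node without addr:street
    if len(set(values)) == 1:
        return values[0]
    return None  # conflicting names, needs a survey
```

A proposed name would still need to be checked before it is applied to the street.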

Missing road names will be identified using the OpenStreetMap NoName Map Overlay:
tms:http://tile3.poole.ch/noname/{zoom}/{x}/{y}.png

OSM Inspector can also be used to find these streets.