Import/Catalogue/Address import for Biella
- 1 About
- 2 Import Plan Outline
- 3 Goals
- 4 Schedule
- 5 Import Data
- 6 Data Preparation
- 7 Data Merge Workflow
- 8 QA
This page describes the import of addresses using the data provided by the Municipality of Biella (Italy).
Import Plan Outline
The goal of this import is to use the high-quality dataset provided by the Municipality of Biella to steadily improve the addresses available in OSM. It will not be a blind import: all data will be edited by a local mapper.
Address format in Biella
House numbering in Biella follows the European scheme.
An address in Biella is determined by its streetname and housenumber.
A housenumber is also unique per street.
Housenumbers can include a subordinate, noted with a suffix letter (e.g. in "7a", "a" is the subordinate). Subordinates usually arise when a new house is built between existing houses with consecutive housenumbers. E.g. when a house is built between numbers 7 and 9, the new house will most likely get number 7a (since even numbers are reserved for the other side).
The postal code boundary is the same as the municipal boundary and therefore the only postcode for Biella is "13900".
Data source site: http://www.comune.biella.it/sito/index.php?biella-open-data
Data license: http://www.dati.gov.it/iodl/2.0/
Type of license: IODL v2.0
OSM attribution: http://wiki.openstreetmap.org/wiki/Contributors#Biella
ODbL Compliance verified: yes
From the IODL 2.0 license (in Italian): "indicare la fonte delle Informazioni e il nome del Licenziante, includendo, se possibile, una copia di questa licenza o un collegamento (link) ad essa."
Translation: "state the data source and the licensor name, including, if possible, a copy of this License or a connection (link) to it."
It should be enough to add the attribution to the Contributors page, as was already done for Venice.
The dataset will be imported as a single changeset.
The dataset will be loaded in JOSM and merged manually with existing OpenStreetMap data prior to the upload.
The data is provided as a shapefile. This shapefile consists of a collection of point features, one for each housenumber.
Each node has the keys:
- ID_VIA: a unique numeric identifier for street (see "elenco_viesit.csv" table)
- CIV: housenumber
- SUB: subordinate, if present
- CIVICO: housenumber with subordinate
- ZACCESSO: access type
- ZPRINCIP: 1 if main entrance, 0 otherwise
elenco_viesit.csv table contains the following keys:
- Id_VIA: a unique numeric identifier for street
- NOME_VIA: street name
The shapefile will be converted to OSM XML using ogr2osm. The projection is correctly detected automatically as EPSG:4326 (WGS84 latitude-longitude).
The tags that will be used in the final upload are addr:housenumber, addr:street, addr:postcode and addr:city.
The tags will be as follows:
- addr:housenumber will contain the number in CIVICO converted to lowercase (for subordinates).
- addr:street will contain the street name in NOME_VIA (found through ID_VIA) but normalized to follow Italian conventions.
- addr:postcode will contain "13900".
- addr:city will contain "Biella".
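The tagging plan above can be sketched as a small mapping function. This is only an illustration, not the script used for the import: the function name `to_osm_tags`, the `street_names` lookup (assumed to map ID_VIA to the already normalized street name) and the sample record values are all hypothetical.

```python
def to_osm_tags(record, street_names):
    """Map one shapefile record to the four addr:* tags of the tagging plan.

    `record` is assumed to be a dict keyed by the shapefile field names;
    `street_names` is assumed to map ID_VIA (as a string) to a normalized name.
    """
    return {
        # CIVICO already includes the subordinate; lowercase it ("7A" -> "7a")
        "addr:housenumber": str(record["CIVICO"]).strip().lower(),
        # Street name resolved through ID_VIA
        "addr:street": street_names[str(record["ID_VIA"])],
        # Constant values for the whole municipality
        "addr:postcode": "13900",
        "addr:city": "Biella",
    }


# Hypothetical sample values, for illustration only
street_names = {"42": "Via Italia"}
record = {"ID_VIA": 42, "CIV": 7, "SUB": "A", "CIVICO": "7A"}
tags = to_osm_tags(record, street_names)
```

The fields ZACCESSO and ZPRINCIP are deliberately ignored here, since the plan lists only the four addr:* tags for the final upload.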
Dedicated upload account
The account Andrea Musuruane import will be used to upload the imported data.
The street names in NOME_VIA in "elenco_viesit.csv" do not follow Italian naming conventions. A Python script will be used to normalize street names. A new file called "elenco_viesit_osm.csv" will be written. It will have a new key named "NOME_VIA_OSM" with the normalized street name.
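A minimal sketch of what such a normalization step could look like is given here. It is not the actual script: it assumes the source names are written in capitals, that the CSV is comma-separated, and that lowercasing a short list of Italian particles is an acceptable approximation of the naming conventions. The function names and the sample row are hypothetical.

```python
import csv
import io

# Assumed list of particles kept lowercase inside a street name
LOWERCASE_WORDS = {"di", "del", "della", "dello", "dei", "degli", "delle",
                   "da", "e", "al", "alla", "in"}
# Assumed list of elided particles, e.g. "dell'artigianato"
ELIDED = {"d", "dell", "all", "sull", "nell"}


def _capitalize(word):
    # Elided forms keep the particle lowercase: "dell'artigianato" -> "dell'Artigianato"
    if "'" in word:
        head, tail = word.split("'", 1)
        if head in ELIDED:
            return head + "'" + tail.capitalize()
    return word.capitalize()


def normalize_street_name(raw):
    words = raw.strip().lower().split()
    result = []
    for i, word in enumerate(words):
        # Particles stay lowercase unless they open the name
        if i > 0 and word in LOWERCASE_WORDS:
            result.append(word)
        else:
            result.append(_capitalize(word))
    return " ".join(result)


def add_normalized_column(src, dst):
    """Copy the CSV from `src` to `dst`, adding the NOME_VIA_OSM column."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=list(reader.fieldnames) + ["NOME_VIA_OSM"])
    writer.writeheader()
    for row in reader:
        row["NOME_VIA_OSM"] = normalize_street_name(row["NOME_VIA"])
        writer.writerow(row)


# Hypothetical sample row, using in-memory files instead of the real tables
source = io.StringIO("Id_VIA,NOME_VIA\n1,VIA DELLA REPUBBLICA\n")
target = io.StringIO()
add_normalized_column(source, target)
```

A rule-based approach like this will get some names wrong (Roman numerals, for instance, would be turned into "Xx"), which is why the output still has to be checked by hand.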
The "elenco_viesit_osm.csv" table will be manually edited to correct errors (e.g. normalization failures, apostrophes used instead of accents, people's names written as surname + name instead of name + surname, misspelled names, etc.). It is better to edit once now than multiple times after the import.
After that, ogr2osm will be used to convert the shapefile to OSM XML format using the above tagging plan.
Source scripts will be found at https://github.com/musuruan/osm_imports
Data Transformation Results
OSM XML with addresses already in OSM merged: https://dl.dropboxusercontent.com/u/12575912/biella_civici_od_merged.osm
Data Merge Workflow
Addresses already in OSM will be extracted using the following Overpass query:
<osm-script>
  <query into="comune" type="area">
    <has-kv k="admin_level" v="8"/>
    <has-kv k="name" v="Biella"/>
  </query>
  <union>
    <query type="node">
      <area-query from="comune"/>
      <has-kv k="addr:housenumber"/>
    </query>
    <query type="way">
      <area-query from="comune"/>
      <has-kv k="addr:housenumber"/>
    </query>
    <item/>
    <recurse type="down"/>
  </union>
  <print mode="meta"/>
</osm-script>
If you perform the query, you'll see there are just a few housenumbers: 17 nodes and 1 building.
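To check such counts without opening the result in an editor, the OSM XML returned by Overpass could be inspected with a few lines of Python. The sketch below is only an assumption about how one might do it; the function name and the tiny sample document are hypothetical. Because the query recurses down, the output also contains the untagged nodes of the matched ways, so only elements that actually carry addr:housenumber are counted.

```python
import xml.etree.ElementTree as ET


def count_addressed_elements(osm_xml):
    """Count nodes and ways that carry an addr:housenumber tag."""
    root = ET.fromstring(osm_xml)
    counts = {"node": 0, "way": 0}
    for kind in counts:
        for element in root.findall(kind):
            keys = {tag.get("k") for tag in element.findall("tag")}
            if "addr:housenumber" in keys:
                counts[kind] += 1
    return counts


# Hypothetical sample: one addressed node, one plain way node, one addressed building
SAMPLE = """<osm version="0.6">
  <node id="1" lat="45.56" lon="8.05"><tag k="addr:housenumber" v="7a"/></node>
  <node id="2" lat="45.56" lon="8.05"/>
  <way id="3"><nd ref="2"/><tag k="building" v="yes"/><tag k="addr:housenumber" v="12"/></way>
</osm>"""

counts = count_addressed_elements(SAMPLE)
```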
Since address data in the Municipality of Biella data source is placed exclusively on nodes (a wise choice because a building can have different entrances and therefore different addresses), the only address currently tagged on a building will be removed.
Addresses already present will be merged. This will be done manually since there are only a few.
Addresses not found in the open data source will be tagged with fixme and must later be verified on the ground.
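The merge itself is manual, but the comparison behind the fixme rule can be illustrated with a short sketch. It assumes that two addresses match when street and housenumber are equal ignoring case and surrounding spaces; the function names, the fixme wording and the sample tags are hypothetical.

```python
def address_key(tags):
    """Comparison key: (street, housenumber), case- and space-insensitive."""
    return (tags.get("addr:street", "").strip().lower(),
            tags.get("addr:housenumber", "").strip().lower())


def flag_unmatched(existing, imported):
    """Return the existing tag sets, adding a fixme where no imported address matches."""
    known = {address_key(tags) for tags in imported}
    result = []
    for tags in existing:
        if address_key(tags) not in known:
            # Hypothetical fixme wording
            tags = dict(tags, fixme="address not found in Biella open data, verify on the ground")
        result.append(tags)
    return result


# Hypothetical sample data
imported = [{"addr:street": "Via Italia", "addr:housenumber": "7a"}]
existing = [
    {"addr:street": "Via Italia", "addr:housenumber": "7A"},
    {"addr:street": "Via Italia", "addr:housenumber": "99"},
]
flagged = flag_unmatched(existing, imported)
```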
This will be a solo effort. Import will be managed by Andrea Musuruane.
Step by step instructions:
- Normalize street names and create "elenco_viesit_osm.csv" table
- Manually edit "elenco_viesit_osm.csv" table to correct errors
- Run ogr2osm to export the data in OSM XML
- Run overpass query to export the existing addresses
- Merge these addresses in JOSM
- Upload the changeset in OSM
The changeset should be small enough to be uploaded at once.
In case of import problems, the changeset will be reverted using the JOSM Reverter Plugin.
See the "Data Merge Workflow" section.
After the import, addr:street names could be slightly different than street names.
These differences should be caught using OSM Inspector (map already centered on Biella).
Since we spent a lot of time editing the "elenco_viesit_osm.csv" table, addr:street name should be used as a reference.
The result can be used to locate areas where streets are missing.
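OSM Inspector is the tool named for this check. Purely as an illustration of the underlying comparison, the addr:street values that have no identically named street can be listed with a set difference. The function name and the sample names are hypothetical, and exact string equality is an assumption.

```python
def streets_without_highway(addr_streets, highway_names):
    """List addr:street values for which no street carries the same name."""
    return sorted(set(addr_streets) - set(highway_names))


# Hypothetical sample data
addr_streets = ["Via Italia", "Via Italia", "Via Roma"]
highway_names = ["Via Italia"]
missing = streets_without_highway(addr_streets, highway_names)
```

A name in the result means either that the street is missing or unnamed in OSM, or that the two spellings differ.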
Missing roads will be created in JOSM using PCN 2012 aerial images.
The result can be used to derive street names for unnamed streets when all the nodes along the street have the same addr:street value.
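The "all nodes agree" rule can be written down in a few lines. The sketch assumes that the address nodes lying along one unnamed street have already been collected by some other means; the function name and sample values are hypothetical.

```python
def unanimous_street_name(addr_streets):
    """Return the street name if every address node agrees on it, else None."""
    names = {name for name in addr_streets if name}
    if len(names) == 1:
        return names.pop()
    # No address nodes, or conflicting values: do not guess a name
    return None


# Hypothetical sample values
agreed = unanimous_street_name(["Via Roma", "Via Roma", "Via Roma"])
conflicting = unanimous_street_name(["Via Roma", "Via Italia"])
```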
Missing road names will be identified using the OpenStreetMap NoName Map Overlay.
OSM Inspector can also be used to find these streets.