IT:Toscana import numeri civici 2017


Goals

This page describes the import of addresses from the data provided by the Regione Toscana (Tuscany) in central Italy.

The import is being discussed on the Italian mailing list in order to reach a consensus.


Schedule

The import will take place once all the bureaucratic steps have been completed.

Data Available

Data Source

The data are provided by the Regione Toscana open data portal. They are stored in a single large ZIP file available for download.

As soon as the formatted and split data are available, the download links will be published here.

Legal

The data are released under CC-BY for use within OSM. There is an agreement between Wikimedia Italia (the Italian OSM chapter) and Regione Toscana; below is an extract of an email from Regione Toscana clarifying its position on the licence:
By virtue of the agreement signed with WikiMedia (www.regione.toscana.it/documents/10180/70968/Accordo_wikimedia_RT_20140924.pdf/e05d1e81-2b09-4b70-9862-b3d86b8c9301 ), which provides that "The Region undertakes to make available its geographic knowledge assets, released as Open Data, under licence conditions suitable for full adoption within the databases and projects of the Wikimedia Association..." and that "The Association undertakes to work to promote the adoption and integration of the data and documents acquired from Regione Toscana within its own databases and projects, with the obligation to cite this agreement", as early as November 2014 there was an exchange of emails and the supply of the road graph and house number data under a CC-BY licence (modifying, exclusively for OSM and by virtue of the signed agreement, the native CC-BY-SA licence; see the email of 9/10/2014 to Simone Cortesi: "The Region undertakes to make available its geographic knowledge assets, released as Open Data, under licence conditions suitable for full adoption within the databases and projects of the Wikimedia Association. This means that the data we share with OSM will have licence conditions "suitable for full adoption within the databases and projects of the Wikimedia Association", and will therefore continue to be CC-BY and CC-BY-SA (depending on the case) as downloads from the regional sites, and can be passed on, exclusively to OSM and on the basis of the signed agreement, as CC-BY.") in EPSG 4326 coordinates, as requested by Simone.

Briefly: under the agreement between Regione Toscana and Wikimedia Italia, Regione Toscana releases its whole open data catalog under licences compatible with those used by Wikimedia Italia in its projects. Since 2014 Regione Toscana has provided these data with an explicit "fully OSM compatible license".

Import Type

House numbering throughout Toscana, except in the Province of Florence, follows the European scheme. The Province of Florence follows a slightly different scheme that adds a red-coloured numbering (as Genova does); for details, see the Tagging Plans section.

An address is determined by its streetname and housenumber.

A housenumber is also unique per street.

Housenumbers can include a subordinate, written as a suffix of letters or numbers (e.g. in "7a" the subordinate is "a"; in "7/1" it is "1").

The dataset will be normalized, and data that already exist in OpenStreetMap will be set aside.

The existing OpenStreetMap data are not present in the dataset and will be analyzed and merged later.

Data Preparation

Tagging Plans

The data are provided as a QGIS project called "iternet.qgs". Open the project in QGIS, right-click the "civici" layer and save it in CSV format with the EPSG:4326 projection.

The result is a CSV file containing a collection of point features, one for each housenumber.

The relevant columns in each row are:

  • X: longitude
  • Y: latitude
  • COMUNE: city name
  • DUG: first part of the street name (usually the word "Via", "Strada", "Corso", etc)
  • INDIRIZZO: second part of the street name
  • CIVICO: numeric part of the house number
  • ESPONENTE: suffix letters or numbers
  • COLORE: (only for the Province of Florence) indicates whether the number is a red one ("rosso")
  • passocar: if yes, a vehicle access is certainly present


Data Transformation

The software used to normalize the data is OpenRefine.

All the rows with blank housenumber or an invalid housenumber (like zero or 99999999) have been deleted. All the rows with blank street name have been deleted.
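The cleaning itself was done in OpenRefine; purely as an illustration, the same filter could be sketched in Python as below. The column used for the blank street name check (INDIRIZZO) and the exact set of invalid placeholder values are assumptions, since the text names zero and 99999999 only as examples.

```python
import csv
import io

# Assumption: placeholder values treated as invalid house numbers.
INVALID_NUMBERS = {"0", "99999999"}


def is_valid_row(row):
    """Return True if the row has a usable house number and street name."""
    civico = (row.get("CIVICO") or "").strip()
    # Assumption: a "blank street name" means an empty INDIRIZZO column.
    street = (row.get("INDIRIZZO") or "").strip()
    if not civico or civico in INVALID_NUMBERS:
        return False
    return bool(street)


def filter_rows(csv_text):
    """Parse CSV text and keep only the rows accepted by is_valid_row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if is_valid_row(row)]


# Made-up sample rows, for illustration only.
sample = (
    "X,Y,COMUNE,DUG,INDIRIZZO,CIVICO,ESPONENTE\n"
    "11.25,43.77,FIRENZE,VIA,ROMA,12,A\n"
    "11.26,43.78,FIRENZE,VIA,ROMA,0,\n"
    "11.27,43.79,FIRENZE,VIA,,5,\n"
    "11.28,43.80,FIRENZE,VIA,ROMA,99999999,\n"
)
kept = filter_rows(sample)
```

Only the first sample row survives: the others have a zero number, a blank street name, or the 99999999 placeholder.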


The tags that will be used in the final upload are addr:housenumber, addr:street, vehicle, addr:city.

The tags will be as follows:

  • addr:housenumber will contain the housenumber, obtained by joining CIVICO+ESPONENTE, plus COLORE where necessary.
  • addr:street will contain the street name, normalized to follow Italian conventions and obtained by joining DUG+INDIRIZZO.
  • addr:city will contain COMUNE.
  • vehicle will be applied only where a vehicle access licence exists (derived from the 'passocar' value).
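The mapping above could be sketched roughly as follows. This is not the actual OpenRefine recipe: the plain concatenation of CIVICO and ESPONENTE, the `red_suffix` marker for red numbers, the "yes" value of passocar and the `vehicle=yes` tag value are all assumptions made for illustration.

```python
def _clean(row, key):
    """Return the stripped value of a column, or "" if it is missing."""
    return (row.get(key) or "").strip()


def build_tags(row, red_suffix="r"):
    """Build OSM address tags from one CSV row (a dict of column -> string).

    red_suffix is a hypothetical marker for Florence's red numbers; the
    actual encoding is not specified in the text.
    """
    # Assumption: CIVICO and ESPONENTE are joined without a separator.
    number = _clean(row, "CIVICO") + _clean(row, "ESPONENTE")
    if _clean(row, "COLORE").lower() == "rosso":
        number += red_suffix
    tags = {
        "addr:housenumber": number,
        "addr:street": f'{_clean(row, "DUG")} {_clean(row, "INDIRIZZO")}',
        "addr:city": _clean(row, "COMUNE"),
    }
    # Assumption: passocar holds "yes" and maps to vehicle=yes.
    if _clean(row, "passocar").lower() == "yes":
        tags["vehicle"] = "yes"
    return tags


# Made-up rows, for illustration only.
plain = build_tags({"CIVICO": "7", "ESPONENTE": "a", "COLORE": "",
                    "DUG": "Via", "INDIRIZZO": "Roma", "COMUNE": "Pisa",
                    "passocar": ""})
red = build_tags({"CIVICO": "12", "ESPONENTE": "", "COLORE": "rosso",
                  "DUG": "Via", "INDIRIZZO": "Roma", "COMUNE": "Firenze",
                  "passocar": "yes"})
```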

The Italian community is discussing whether or not to use the vehicle tag.


Data Transformation Result

The output has been split into several files, one for each Comune (addr_city). When a single city has more than 10,000 addresses (the object limit for a single import upload), its file will be split into parts.
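As a rough sketch of this splitting step (the real files were produced with the tooling described above), rows can be grouped by COMUNE and each group cut into parts of at most 10,000 objects. The `<comune>_<part>` naming is hypothetical.

```python
from collections import defaultdict

MAX_OBJECTS = 10_000  # object limit for a single upload, as stated in the text


def split_by_comune(rows, max_objects=MAX_OBJECTS):
    """Group rows by COMUNE and cut each group into parts of at most max_objects."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["COMUNE"]].append(row)

    parts = {}
    for comune, items in groups.items():
        for index, start in enumerate(range(0, len(items), max_objects), start=1):
            # Hypothetical part naming: "<comune>_<part number>".
            parts[f"{comune}_{index}"] = items[start:start + max_objects]
    return parts


# Made-up rows and a small limit so the effect is easy to see.
rows = [{"COMUNE": "Pisa"}] * 25 + [{"COMUNE": "Lucca"}] * 5
parts = split_by_comune(rows, max_objects=10)
```

With a limit of 10, the 25 rows for Pisa end up in three parts (10, 10 and 5 rows) and the 5 rows for Lucca in a single part.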

Dedicated upload account

The accounts used for the import are: Ale_Zena_IT-import, italia-import.


Changeset Tags

Changesets will be tagged with source=dati Regione Toscana.


Workflow

The data are first normalized: records written in capitals are converted to the OSM name standard.
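A minimal sketch of such a capitalization fix is shown below. The list of words kept in lower case and the handling of elided articles are assumptions about the Italian naming conventions, not the rules actually applied in OpenRefine, and real street names will need manual review for exceptions.

```python
# Assumption: small words kept in lower case inside Italian street names.
LOWERCASE_WORDS = {
    "di", "del", "dello", "della", "dei", "degli", "delle",
    "da", "dal", "dalla", "a", "al", "alla", "ai", "alle",
    "e", "in", "il", "lo", "la", "le", "i", "gli",
}
# Assumption: elided forms kept in lower case before an apostrophe.
ELIDED = {"d", "l", "dell", "all", "dall", "nell", "sull"}


def _cap(word):
    """Upper-case the first character and leave the rest untouched."""
    return word[:1].upper() + word[1:]


def normalize_name(name):
    """Convert an upper-case street name to mixed case."""
    out = []
    for index, word in enumerate(name.strip().lower().split()):
        if "'" in word:
            head, tail = word.split("'", 1)
            if not (index > 0 and head in ELIDED):
                head = _cap(head)
            out.append(head + "'" + _cap(tail))
        elif index > 0 and word in LOWERCASE_WORDS:
            out.append(word)
        else:
            out.append(_cap(word))
    return " ".join(out)
```

For instance, `normalize_name("VIA DEI MILLE")` gives "Via dei Mille" and `normalize_name("VIA DELL'ORTO")` gives "Via dell'Orto".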

The data are split by municipality (Comune) and, because of the new limits on imported objects, saved in files of up to 10,000 objects per changeset.

If an import problem occurs, the changeset will be reverted using the JOSM Reverter Plugin.

To avoid duplicating existing addresses, the data for each municipality will be compared with the existing data using the 'Conflate' JOSM plugin.

QA

Street names

After the import, addr:street values may differ slightly from the names of the streets themselves.

These differences should be caught using OSM Inspector.

Unmarked streets

The result can be used to locate areas where streets are missing.

Missing roads will be created in JOSM using PCN 2012 aerial imagery.

Unnamed streets

The result can be used to derive street names for unnamed streets when all the nodes along the street have the same addr:street value.
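The rule can be sketched as a small check. How the address nodes "along the street" are selected is out of scope here, so the function below simply assumes it receives the addr:street values of those nodes; the function name is hypothetical.

```python
def derive_street_name(addr_streets):
    """Return the shared addr:street value, or None if the nodes disagree.

    addr_streets is the list of addr:street values of the address nodes
    found along one unnamed street (selection of the nodes is assumed).
    """
    values = [(value or "").strip() for value in addr_streets]
    if values and values[0] and all(value == values[0] for value in values):
        return values[0]
    # Empty input, a missing value, or conflicting names: no safe suggestion.
    return None
```

A name is suggested only when every node agrees; any mismatch or missing value returns None, so the street is left for manual review.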

Missing road names will be identified using the OpenStreetMap NoName Map Overlay:
tms:http://tile3.poole.ch/noname/{zoom}/{x}/{y}.png

OSM Inspector can also be used to find these streets.