Import/Catalogue/Bergamo address import

Goals

This page documents the import of address data provided by the City of Bergamo (Italy). The import was initially carried out without following the import guidelines. Cascafico noticed this and notified the talk-it mailing list.

There is general agreement that the address data for the City of Bergamo are useful. Since they are released under an ODbL-compatible license, we would like to avoid a revert if possible. The Italian community members have therefore written this procedure in an effort to preserve the data, even though they were added without following the import guidelines.

The import has been discussed on the Italian OSM mailing list. This wiki page is the result of consensus there.

Address format

House numbering follows the European scheme.

An address is determined by its street name and house number. A house number is also unique per street. House numbers can include a subordinate. These are noted with suffix letters (e.g. in "7a", "a" is the subordinate).

Schedule

The data were imported during April and May 2020.

Import Data

Background

Data source site: https://www.dati.lombardia.it/Territorio/Comune-Bergamo-Numerazione-civica/pcif-9jj7
Data license: https://www.dati.gov.it/content/italian-open-data-license-v20
Type of license: IODLv2
Link to permission: Not required
OSM attribution: https://wiki.openstreetmap.org/wiki/Contributors#Italy
ODbL Compliance verified: yes

The IODLv2 license is compatible with the ODbL as stated in the FAQ written by the license issuer.

From the IODL 2.0 license (in Italian): "indicare la fonte delle Informazioni e il nome del Licenziante, includendo, se possibile, una copia di questa licenza o un collegamento (link) ad essa."

Translation: "state the data source and the licensor name, including, if possible, a copy of this License or a connection (link) to it."

It should be enough to add the attribution to the Contributors page, as was already done for the City of Venice.

OSM Data Files

Not available.

Import Type

This is a one-time import and it has been performed manually using JOSM.

Data Preparation

Data Reduction & Simplification

Unknown.

Tagging Plans

The data are provided in a CSV file.

Each address has the following keys:

  • ClasseToponimo: DUG (Via, Piazza, Lungomare, Salita, etc.)
  • DescrizioneToponimo: street name
  • Numero: housenumber (without subordinate)
  • Subalterno: subordinate
  • CAP: postal code
  • SezioneISTAT: ISTAT census section
  • Lat: latitude
  • Lon: longitude
  • Location: ordered pair of latitude and longitude

The tags used in the final upload are addr:housenumber, addr:postcode, addr:street, addr:city and addr:country.

The tags should be as follows:

  • addr:housenumber contains Numero + Subalterno converted to lowercase, as required by Italian rules.
  • addr:street contains ClasseToponimo + " " + DescrizioneToponimo.
  • addr:postcode contains CAP.
  • addr:city contains Bergamo.
  • addr:country contains IT.
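The mapping above can be sketched in Python. This is a minimal illustration, not the original uploader's procedure: the function name row_to_tags is hypothetical, each row is assumed to be a dict keyed by the CSV column names (for example as produced by csv.DictReader), and the slash for numerical subordinates follows the convention described in the QA section of this page.

```python
def row_to_tags(row):
    """Map one CSV row (a dict keyed by the column names) to OSM address tags."""
    number = row["Numero"].strip()
    # Subalterno may be empty or missing; normalise it to a lowercase string.
    sub = (row.get("Subalterno") or "").strip().lower()
    if sub.isdigit():
        # Assumption: numerical subordinates are joined with a slash (e.g. 4/1).
        housenumber = f"{number}/{sub}"
    else:
        # Letter subordinates are appended directly (e.g. 7a).
        housenumber = number + sub
    street = f'{row["ClasseToponimo"].strip()} {row["DescrizioneToponimo"].strip()}'
    return {
        "addr:housenumber": housenumber,
        "addr:street": street,
        "addr:postcode": row["CAP"].strip(),
        "addr:city": "Bergamo",
        "addr:country": "IT",
    }
```

The delimiter and encoding of the published CSV are not stated here, so any code that reads the file would have to be adjusted to the actual download.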

Changeset Tags

Changeset tags haven't been used in the import.

We have identified the following changesets in which data were imported:

  • CAP 24121
  1. 83273684
  2. 83289667
  • CAP 24122
  1. 83385416
  2. 83370612
  3. 83398787
  • CAP 24123
  1. 83408444
  2. 83406193
  • CAP 24124
  1. 83423357
  2. 83423337
  3. 83423324
  • CAP 24125
  1. 83462316
  • CAP 24126
  1. 83528925
  • CAP 24127
  1. 83668871
  • CAP 24128
  1. 84365785
  • CAP 24129
  1. 84567356

Data Transformation

The original uploader has transformed the data with this procedure.

Data Transformation Results

Not available.

Data Merge Workflow

Team Approach

The import was performed - prior to the creation of this page - by:

Workflow

The original uploader has used this workflow.

Conflation

It seems conflation was performed manually.

QA

Imported data

Addresses in Bergamo can be extracted using the following overpass query:

[out:json][timeout:25];
area[name="Bergamo"][admin_level=8]->.searchArea;
nwr["addr:housenumber"](area.searchArea);
out body;
>;
out skel qt;

Subordinates

Right now subordinates are written in uppercase. They must be corrected to lowercase to follow Italian rules.

Run the following overpass query and use this procedure:

[out:xml][timeout:25];
area[name="Bergamo"][admin_level=8]->.searchArea;
nwr["addr:housenumber"~"[0-9]+([A-Z]|BIS|A1)"](area.searchArea);
out meta;
>;
out meta qt;

You can use Notepad++ to transform subordinates to lowercase with the following search & replace regex:

Search: (addr:housenumber\s=\s\d+)(BIS|A1|[A-Z])
Replace with: $1\L$2

Or you can use sed under Linux:

$ sed -i -E 's/(addr:housenumber = [0-9]+)([A-Z]|BIS|A1)/\1\L\2/' filename.txt
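For editors who prefer a script, the same conversion can be sketched in Python. This assumes the text is in the "key = value" form that the search patterns above expect; the function name lowercase_subordinates is hypothetical.

```python
import re

# Longer alternatives come first so that "BIS" is not matched as a lone "B".
PATTERN = re.compile(r"(addr:housenumber\s=\s\d+)(BIS|A1|[A-Z])")

def lowercase_subordinates(text):
    """Lowercase the subordinate part of every addr:housenumber line in text."""
    # Keep the key and the digits as they are, lowercase only the subordinate.
    return PATTERN.sub(lambda m: m.group(1) + m.group(2).lower(), text)
```

Lines whose house number has no uppercase subordinate are left unchanged.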

Level0 will only get the first 500 items from overpass. Therefore you'll need to run this procedure more than once.

Numerical Subordinates

Although current rules in Italy prohibit this, there are still many cases where the subordinate is a number rather than a letter. In these cases a separator is mandatory. The slash was chosen (e.g. addr:housenumber=4/1).

Unfortunately many numerical subordinates have been wrongly imported. For example, 7/1 has been imported as 71. These must be found in the original dataset and amended in OSM.

A supporting uMap (fed by the original dataset) has been created to spot and fix numerical subordinates.
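One possible way to shortlist candidates before checking them on the map is to compare the dataset with the addresses currently in OSM. The sketch below is only an illustration under stated assumptions: find_collapsed is a hypothetical helper, dataset rows are assumed to be (street, Numero, Subalterno) string triples, and OSM addresses are assumed to be available as a set of (addr:street, addr:housenumber) pairs.

```python
def find_collapsed(dataset_rows, osm_addresses):
    """Return (street, wrong, right) triples for likely collapsed subordinates.

    dataset_rows: list of (street, numero, subalterno) string triples.
    osm_addresses: set of (street, housenumber) pairs currently in OSM.
    """
    # House numbers that genuinely exist without a subordinate, per street.
    plain = {(street, num) for street, num, sub in dataset_rows if not sub}
    suspects = []
    for street, num, sub in dataset_rows:
        if not sub.isdigit():
            continue  # only numerical subordinates are affected
        collapsed = num + sub  # e.g. "7" + "1" gives "71"
        key = (street, collapsed)
        # Flag the OSM value only if the dataset has no genuine number like it.
        if key in osm_addresses and key not in plain:
            suspects.append((street, collapsed, f"{num}/{sub}"))
    return suspects
```

When the collapsed value also exists as a genuine house number on the same street, the sketch stays silent and the case still has to be checked by hand.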

addr:country

All imported addr:* nodes have the addr:country=IT tag. These tags will be deleted because it is quite obvious the addresses are located in Italy.

addr:* on buildings

Address data in Italy must be placed exclusively on nodes because the house number identifies the external access (door, gate, etc) leading from the street to the housing units (houses, stores, offices, etc). Moreover data placed on nodes have already been imported.

Therefore addresses on buildings are both duplicates and incorrectly placed, and will be removed. OSM Inspector can be used to find these buildings.

Duplicate addresses

Conflation was not performed very well by the original uploader. There are many duplicate address nodes (nodes with the same address). To fix them, download all addresses in Bergamo into JOSM using the aforementioned overpass query and run the JOSM Validator. Fix errors reported for "Duplicate house numbers" and "House number without street".
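To illustrate what "duplicate" means here, the following sketch groups nodes by street and house number. It is not how the JOSM Validator is implemented; duplicate_addresses is a hypothetical helper, and nodes are assumed to be (node_id, tags) pairs extracted from the query result.

```python
from collections import defaultdict

def duplicate_addresses(nodes):
    """Return {(street, housenumber): [node ids]} for addresses used more than once."""
    groups = defaultdict(list)
    for node_id, tags in nodes:
        key = (tags.get("addr:street"), tags.get("addr:housenumber"))
        if None in key:
            continue  # incomplete addresses are a separate problem
        groups[key].append(node_id)
    # Keep only the addresses that appear on more than one node.
    return {key: ids for key, ids in groups.items() if len(ids) > 1}
```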

Street names

addr:street values may differ slightly from the names of the streets currently mapped.

These differences should be caught using OSM Inspector (map already centred on the City of Bergamo).

Unmarked streets

It is possible to locate areas where streets are missing.

Missing roads can be traced in JOSM using "Esri World Imagery (Clarity) Beta".

Unnamed streets

It is possible to derive street names for unnamed streets when all the nodes along the street have the same addr:street value.

Missing road names will be identified using the OpenStreetMap NoName Map Overlay:

tms:http://tile3.poole.ch/noname/{zoom}/{x}/{y}.png

OSM Inspector can also be used to find these streets.
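The rule for deriving a name for an unnamed street can be expressed as a small check. This is a sketch under the assumption that the addr:street values of the address nodes along one street have already been collected into a list; derive_street_name is a hypothetical helper, and deciding which nodes lie "along" a street is left to the mapper.

```python
def derive_street_name(addr_streets):
    """Return the shared addr:street value, or None if the nodes disagree."""
    # Ignore missing or blank values and compare the rest after trimming.
    names = {s.strip() for s in addr_streets if s and s.strip()}
    # A name is derived only when every node carries the same value.
    return names.pop() if len(names) == 1 else None
```

A result of None means the nodes disagree or carry no street name, so the street should be checked manually.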

See also

The email to the Imports mailing list was sent on 2020-06-28 and can be found in the archives of the mailing list at [1].