Import/Catalogue/Venice addresses import
About
This page describes the import into OpenStreetMap of address data provided by the Municipality of Venice (Italy).
The Municipality of Venice released their complete address data in 2023.
The import will be discussed on the Italian OSM mailing list [https:// TBD]. This wiki page will reflect the consensus reached there.
Import Data
Background
Address format
House numbering follows the European scheme: an address is determined by its street name and housenumber, and each housenumber is unique within its street. In several cases, Venice island itself uses place-based addresses (a place name followed by a housenumber) instead of street-based ones.
Housenumbers can include:
- subordinates, noted with suffix letters (e.g. in "7a" the subordinate is "a"); subordinates usually arise when a new house is built between existing houses with consecutive housenumbers
- extensions, noted with a slash "/" followed by an integer; most cases occur when a single entrance is shared by different buildings.
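The two housenumber components described above can be separated with a small parser. This is a minimal sketch, not part of the import toolchain: it assumes a housenumber is made of digits, an optional single suffix letter, and an optional "/" followed by an integer. Whether both components ever occur together in the Venice data is an assumption, and the names `HouseNumber` and `parse_housenumber` are illustrative.

```python
import re
from typing import NamedTuple, Optional

# Assumed shape: digits, optional suffix letter, optional "/<integer>" extension.
HOUSENUMBER_RE = re.compile(r"^(\d+)\s*([A-Za-z])?(?:\s*/\s*(\d+))?$")


class HouseNumber(NamedTuple):
    number: int
    subordinate: Optional[str]  # suffix letter, e.g. "a" in "7a"
    extension: Optional[int]    # integer after the slash, e.g. 3 in "12/3"


def parse_housenumber(text: str) -> Optional[HouseNumber]:
    """Return the parsed parts, or None when the text does not fit the assumed shape."""
    match = HOUSENUMBER_RE.match(text.strip())
    if match is None:
        return None
    number, subordinate, extension = match.groups()
    return HouseNumber(
        int(number),
        subordinate.lower() if subordinate else None,
        int(extension) if extension else None,
    )
```

Values that return None would be candidates for manual review rather than automatic tagging.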
Data quality
Possible offset issues due to source data or reprojection will be inspected. [Pellestrina housenumber offset]
Legal
Data source site: https://portale.comune.venezia.it/node/117/80
Data license: https://portale.comune.venezia.it/filebrowser/download/14249419
Type of license: IODL 2.0
Source data
The dataset is identified by the string "Strato03_GestioneViabilita_Indirizzi" and was downloaded from the "toponimi numeri civici" page of the Venice municipality website.
Import Type
The dataset will be cleaned and converted to OSM format with OpenRefine; it will then be conflated with OSM Conflator and published in a shared audit map prior to upload.
Data Preparation
The operations applied to the original dataset are listed in this operations file. Because the dataset is large (86k nodes), the import will be split on a geographic basis, i.e. Pellestrina Island, Venice Island, and the Mestre and Marghera mainland.
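One way to perform such a geographic split is to assign each record to a bounding box. The sketch below is only an illustration under that assumption: the actual split may use administrative boundaries instead, and the region boxes must be supplied by the importer (none are given on this page). The function name `split_by_region` is hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

BBox = Tuple[float, float, float, float]  # (min_lat, min_lon, max_lat, max_lon)


def split_by_region(rows: Iterable[dict], regions: Dict[str, BBox]) -> Dict[str, List[dict]]:
    """Assign each row to the first region whose bounding box contains it."""
    batches: Dict[str, List[dict]] = {name: [] for name in regions}
    batches["unassigned"] = []
    for row in rows:
        lat, lon = float(row["lat"]), float(row["lon"])
        for name, (min_lat, min_lon, max_lat, max_lon) in regions.items():
            if min_lat <= lat <= max_lat and min_lon <= lon <= max_lon:
                batches[name].append(row)
                break
        else:
            # No box matched: keep the row visible instead of silently dropping it.
            batches["unassigned"].append(row)
    return batches
```

Rows that end up in "unassigned" indicate a gap between the chosen boxes and should be checked before upload.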
Tagging
The CSV file derived from the QGIS conversion consists of a collection of point features, one for each housenumber.
The following fields will be evaluated:
- INDIRIZZO: place, housenumber, subordinate (San Polo 1175b)
- DENOMINAZI: street, housenumber, subordinate (Via Piave 13b)
- lat
- lon
- IDMASTER: official housenumber id, used for conflation and optionally for OSM loc_ref tag
Housenumber
addr:housenumber has been built by title-casing the INDIRIZZO and DENOMINAZI fields.
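The actual transformation is done in OpenRefine; the following is only a rough Python equivalent to show the idea. It assumes that the housenumber is the last whitespace-separated token of the field and starts with a digit, which matches the two examples above but may not hold for every record. The function name `split_address` is illustrative.

```python
import re
from typing import Optional, Tuple

# Assumption: the housenumber is the last token and starts with a digit.
TRAILING_NUMBER_RE = re.compile(r"^(.*\S)\s+(\d\S*)$")


def split_address(field: str) -> Optional[Tuple[str, str]]:
    """Split 'VIA PIAVE 13B' into ('Via Piave', '13b'); return None if no number is found."""
    match = TRAILING_NUMBER_RE.match(field.strip())
    if match is None:
        return None
    name, number = match.groups()
    # str.title() would turn '13b' into '13B', so only the name part is title-cased.
    return name.title(), number.lower()
```

Title-casing the whole field in one step would capitalise the subordinate letter, which is why the name and the number are handled separately here.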
Changeset Tags
Changesets will be tagged with:
- source=Comune di Venezia
- source:license=IODL 2.0
- type=import
- url=https://wiki.openstreetmap.org/w/index.php?title=Import/Catalogue/Venice_addresses_import
This way, people will know that the data has been imported following the guidelines and can find this page for details.
Data Transformation
After the data preparation process, the following workflow has been performed on a subset (Pellestrina Island):
- the pruned dataset records have been converted to a JSON file;
- the JSON file has been processed through OSM Conflator, using this profile;
- a preview of the conflated data has been uploaded to an audit map for shared review.
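The first step of the workflow above (records to JSON) could look like the sketch below. Two things are assumptions: the output layout (an array of objects with `id`, `lat`, `lon` and `tags`), which is the source format OSM Conflator is understood to accept for JSON input, and the column names `street` and `housenumber`, which are hypothetical names for the fields produced during data preparation.

```python
import csv
import io
import json


def csv_to_conflator_json(csv_text: str) -> str:
    """Convert prepared CSV rows into a JSON array of source points."""
    points = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        points.append({
            "id": row["IDMASTER"],             # official housenumber id
            "lat": float(row["lat"]),
            "lon": float(row["lon"]),
            "tags": {
                "addr:street": row["street"],            # hypothetical column
                "addr:housenumber": row["housenumber"],  # hypothetical column
            },
        })
    return json.dumps(points, ensure_ascii=False, indent=2)
```

Place-based addresses would need addr:place instead of addr:street; that branch is left out to keep the sketch short.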
Data Transformation Results
After completion of the audit process, the OSM XML upload candidate file will be available here TODO
Data Merge Workflow
Non-node objects
Address data in Italy must be placed exclusively on nodes, because the housenumber identifies the external access that leads from the street to the housing units (houses, stores, offices, etc.). Please read https://wiki.openstreetmap.org/wiki/IT:Addresses#Regole_specifiche_per_l.27Italia (in Italian) for more details. At present, a query for housenumbers applied to polygons or multipolygons returns 1134 matches. The distance between a dataset node and the corresponding polygon centroid can often exceed the 10-meter radius used for conflation; such cases (tagged with fixme "suppressed or wrong position: please check") will need post-import QA inspection.
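To illustrate why a polygon centroid can fall outside the conflation radius, the distance between a dataset node and a centroid can be checked with the haversine formula. This is a generic sketch, not the conflator's own distance code; the function names and the default of 10 m (taken from the paragraph above) are the only inputs assumed.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in metres


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_radius(node: tuple, centroid: tuple, radius_m: float = 10.0) -> bool:
    """True when the (lat, lon) node lies within radius_m of the (lat, lon) centroid."""
    return haversine_m(*node, *centroid) <= radius_m
```

A difference of 0.0001 degrees of latitude is already about 11 m, so an entrance node on the edge of a medium-sized building can easily miss its centroid.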
Conflation
Conflation is performed by OSM Conflator. Objects tagged with addr:housenumber will be extracted from OSM within the bounding box defined by the source dataset. The conflator output will be used to generate a public audit map for visual review.
OSM objects to be conflated
The following query gathers OSM objects for Pellestrina Island:
[out:xml][timeout:25];
area[name="TBD"]["admin_level"=TBD]->.searchArea;
(
  nwr["addr:housenumber"](area.searchArea);
);
out meta qt center;
At present (February 2024) there are about 16k addresses already in OpenStreetMap. In the Pellestrina subset there are about TBD addresses; the data exported by the query above (export.osm) will be piped to the conflator.
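Before conflation it can be useful to see how many of the exported address objects are nodes, ways or relations. The sketch below counts them in an OSM XML document such as export.osm; it is an auxiliary check suggested here, not a documented step of the import, and the function name is illustrative.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_address_elements(osm_xml: str) -> Counter:
    """Count elements carrying addr:housenumber, grouped by OSM element type."""
    root = ET.fromstring(osm_xml)
    counts: Counter = Counter()
    for element in root:
        if element.tag not in ("node", "way", "relation"):
            continue  # skip <note>, <meta>, <bounds> and similar children
        if any(tag.get("k") == "addr:housenumber" for tag in element.findall("tag")):
            counts[element.tag] += 1
    return counts
```

For a file on disk, the content can be read with `open("export.osm", encoding="utf-8").read()` and passed to the function.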
Addresses and tags already present are merged by the conflator, treating addr:housenumber and addr:street from the dataset as authoritative. Existing OSM addresses that remain unmatched will be kept, so that other useful tags (amenities, shops, etc.) are not removed.
Matching addrs
Any match between an input dataset address and an OSM element within a given range (defined in profile.py) will be considered, and a proposed change will be displayed in the audit map as a blue pin.
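The profile actually used is linked above. For orientation only, a profile expressing the settings mentioned on this page might look like the sketch below. The option names (`source`, `dataset_id`, `query`, `max_distance`, `master_tags`, `delete_unmatched`, `tag_unmatched`) are given as recalled from the OSM Conflator documentation and should be verified against it; the `dataset_id` value is hypothetical.

```python
# profile.py (sketch): verify option names against the OSM Conflator documentation.

source = "Comune di Venezia"   # value for the source tag, as in the changeset tags
dataset_id = "venice_addr"     # hypothetical dataset identifier

# Select OSM objects that carry an addr:housenumber key (assumed query syntax).
query = [("addr:housenumber",)]

# Matching range in metres; this page mentions a 10-meter usable radius.
max_distance = 10

# Tags for which the dataset value is considered authoritative.
master_tags = ("addr:housenumber", "addr:street")

# Keep unmatched OSM addresses and flag them instead of deleting them.
delete_unmatched = False
tag_unmatched = {"fixme": "this addr is missing from source dataset: please check"}
```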
New addrs
Any input dataset address that has no OSM match within the above range will generate a proposal for a new OSM address, displayed in the audit map as a green pin.
Not in dataset
Existing OSM elements that have no match in the input dataset will generate a proposal for a fixme tag with the text 'this addr is missing from source dataset: please check'. They will be displayed in the audit map as a blue pin.
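The three outcomes described above (matched, new, not in dataset) can be illustrated with a simplified matcher. This is not OSM Conflator's algorithm, which is more elaborate; it is a greedy nearest-neighbour sketch that assumes each OSM element can be paired with at most one dataset point and that only distance is compared. All names are illustrative.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[str, float, float]  # (identifier, lat, lon)


def planar_distance_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Equirectangular approximation, adequate over a few tens of metres."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    dx = math.radians(lon2 - lon1) * math.cos(mean_lat)
    dy = math.radians(lat2 - lat1)
    return 6371008.8 * math.hypot(dx, dy)


def classify(dataset: List[Point], osm: List[Point], radius_m: float = 10.0) -> Dict[str, list]:
    """Pair each dataset point with the nearest free OSM element within radius_m."""
    result: Dict[str, list] = {"matched": [], "new": [], "not_in_dataset": []}
    free = {osm_id: (lat, lon) for osm_id, lat, lon in osm}
    for ds_id, lat, lon in dataset:
        best = min(free, key=lambda k: planar_distance_m(lat, lon, *free[k]), default=None)
        if best is not None and planar_distance_m(lat, lon, *free[best]) <= radius_m:
            result["matched"].append((ds_id, best))
            del free[best]  # each OSM element is used at most once
        else:
            result["new"].append(ds_id)
    # Whatever is left in OSM has no counterpart in the dataset.
    result["not_in_dataset"] = list(free)
    return result
```

In this simplified picture, "matched" and "not_in_dataset" correspond to the blue pins and "new" to the green pins.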
Conflator output example
pi@raspberrypi:~/OSM $ conflate -i pellestrina.json --osm export.osm -v -c preview.json profile.py
08:37:53 Found 421 duplicates in the dataset
08:37:53 Read 4876 items from the dataset
08:37:53 Downloaded 1085 objects from OSM
08:38:13 Matched 790 points
08:38:13 Removed 401 unmatched duplicates
08:38:13 Adding 3685 unmatched dataset points
08:38:14 Deleted 0 and retagged 295 unmatched objects from OSM
pi@raspberrypi:~/OSM
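The counts in the log above are consistent with each other under one plausible reading: dataset points that are neither matched nor removed as duplicates are added, and OSM objects that are not matched are retagged. The short check below works through that arithmetic; the interpretation is an assumption, while the numbers are copied from the log.

```python
# Figures copied from the conflator log above.
read_items = 4876          # "Read 4876 items from the dataset"
matched = 790              # "Matched 790 points"
removed_duplicates = 401   # "Removed 401 unmatched duplicates"
osm_objects = 1085         # "Downloaded 1085 objects from OSM"

# Assumed relationships between the log lines.
added = read_items - matched - removed_duplicates
retagged = osm_objects - matched

print(added, retagged)  # 3685 295
```

Both results agree with the "Adding 3685 unmatched dataset points" and "retagged 295 unmatched objects" lines.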
Conflator re-run
Once the audit is completed, the audited data is downloaded from the conflator project page (example) and reprocessed.
pi@raspberrypi:~/OSM $ conflate -i pellestrina.json -a audit_pellestrina.json -o pellestrina.osm profile.py
[some echoes...]
pi@raspberrypi:~/OSM
Candidates
Municipio | Audit published | Post audit conflator run | File |
---|---|---|---|
9 | 2020-06-01 | 2021-05-18 | TBD.osm |
Team Approach
This import is managed and supervised by:
- Cascafico (import account: attilaimport)
During the upload process, the subset import will be evaluated; the batching criterion will possibly be the municipal district (Municipio, in Italian).
Reverting
In case of import anomalies, changeset(s) will be reverted using OSM reverter scripts or, if possible, the JOSM Reverter Plugin.
Post-import QA
Street names
After the import, addr:street values could differ slightly from the current street names.
These differences should be caught using OSM Inspector (map already centered on Milan).
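As a complement to OSM Inspector, near-miss spellings can also be listed offline. The sketch below pairs each addr:street value that has no exact counterpart among the highway names with its closest candidate, using `difflib` from the Python standard library. The normalisation rules and the 0.8 cutoff are arbitrary assumptions, and the function names are illustrative.

```python
import difflib
import unicodedata
from typing import Iterable, List, Tuple


def normalise(name: str) -> str:
    """Lower-case, strip accents and collapse whitespace, for comparison only."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.lower().split())


def suspicious_street_names(
    addr_streets: Iterable[str], highway_names: Iterable[str], cutoff: float = 0.8
) -> List[Tuple[str, str]]:
    """Pair each addr:street lacking an exact highway name with its closest candidate."""
    highway_names = list(highway_names)
    by_norm = {normalise(name): name for name in highway_names}
    exact = set(highway_names)
    pairs = []
    for street in sorted(set(addr_streets)):
        if street in exact:
            continue  # identical spelling, nothing to review
        close = difflib.get_close_matches(normalise(street), list(by_norm), n=1, cutoff=cutoff)
        if close:
            pairs.append((street, by_norm[close[0]]))
    return pairs
```

Each returned pair is a hint for manual review, not an automatic correction.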
Unmarked streets
The result can be used to locate areas where streets are missing.
Missing roads will be created in JOSM using PCN 2012 aerial imagery.
Unnamed streets
The result can be used to derive street names for unnamed streets when all the nodes along the street have the same addr:street value.
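The rule above (a name is derived only when all address nodes along a street agree) can be expressed as a small function. How address nodes are assigned to an unnamed way is outside this sketch and is assumed to have been done beforehand; the names `unanimous_street_name` and `propose_names` are illustrative.

```python
from typing import Dict, Iterable, List, Optional


def unanimous_street_name(addr_streets: Iterable[str]) -> Optional[str]:
    """Return the shared addr:street value, or None if the nodes disagree or are absent."""
    values = {value.strip() for value in addr_streets if value and value.strip()}
    return values.pop() if len(values) == 1 else None


def propose_names(nodes_by_way: Dict[str, List[str]]) -> Dict[str, str]:
    """Map each unnamed way id to a proposed name when its address nodes are unanimous."""
    proposals = {}
    for way_id, streets in nodes_by_way.items():
        name = unanimous_street_name(streets)
        if name is not None:
            proposals[way_id] = name
    return proposals
```

Ways whose nodes carry two or more different addr:street values are skipped and left for manual inspection.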
Missing road names will be identified using the OpenStreetMap NoName Map Overlay: tms:http://tile3.poole.ch/noname/{zoom}/{x}/{y}.png
OSM Inspector can also be used to find these streets.
Non-node objects
Since several polygon and multipolygon OSM address objects will be flagged as wrongly positioned, they will have to be adjusted or deleted manually.
See also
The email to the Imports mailing list was sent on 2020-04-04 and can be found in the imports mailing list archives.