Import/Catalogue/Bed and Breakfast-RAFVG


About

This page describes the import procedure that was followed to upload 410 new POIs to planet.osm. RAFVG publishes a Bed & Breakfast dataset of 600+ rows under an open data licence, but without geographic coordinates. Since house numbers had recently been imported from an RAFVG dataset, the B&B addresses could be geocoded.

Dataset

Bed and Breakfast is a fairly new dataset (Oct 17) with more than 600 POIs. It provides many useful fields, such as:

  • name and operator
  • phone
  • email
  • site
  • opening hours
  • category (standard, comfort, superior)

Cleaning data

Data cleaning was done with OpenRefine and the Reconcile plugin, connected as a reconciliation service. To standardize the messy B&B addresses (entered by the B&B operators themselves), Reconcile was fed an authoritative set of highway names derived from overpass-turbo (see the Strade d'Italia OSM diary entry).
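The idea behind matching messy street names against an authoritative list can be sketched in a few lines of Python. This is not the algorithm used by OpenRefine or Reconcile; it is a minimal illustration using the standard library's difflib, and the function name, the sample street list and the cutoff values are assumptions made for the example.

```python
import difflib

def match_street(raw, known_streets, cutoff=0.8):
    """Return the closest authoritative street name, or None if no candidate is close enough."""
    # Compare case-insensitively, but return the authoritative spelling.
    lookup = {name.casefold(): name for name in known_streets}
    hits = difflib.get_close_matches(
        raw.strip().casefold(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[hits[0]] if hits else None

# Hypothetical authoritative list (in the import it came from overpass-turbo).
known = ["Via Roma", "Via Giuseppe Garibaldi", "Piazza della Libertà"]

print(match_street("  via roma", known))
print(match_street("via g. garibaldi", known, cutoff=0.7))
print(match_street("zzzzzz", known))
```

Rows for which no candidate passes the cutoff return None and are best reviewed by hand rather than forced to a match.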

Record format and tagging plan

The RAFVG dataset table structure will be adapted and pruned through OpenRefine.

RAFVG Bed and Breakfast - record format
Field | Name | Description:it | Description:en | Example | Tagged as
1 | codice esercizio | codice univoco scheda BB | unique BB code | 48460 | ref
2 | PROVINCIA | nome provincia | province name | Gorizia |
3 | COMUNE | nome comune | municipality name | Cormons |
4 | CATEGORIA | categoria | category | Superior |
5 | DENOMINAZIONE | denominazione | name | Il Gelso di Pippo Franco |
6 | INDIRIZZO | indirizzo | address | Via Bancaria, 7 |
7 | CAP | codice di avviamento postale | postcode | 33100 |
8-10 | TELEFONO/FAX/CELLULARE | recapito telefonico | phone/fax/other | 0433 555666 |
11 | EMAIL | posta elettronica | e-mail | pippo@franco.it |
12 | N_CAMERE | numero di camere | number of rooms | |
13 | N_POSTI_LETTO | numero di posti letto | number of beds | |

Geocoding

The csvgeocode command-line tool was used. An issue with apostrophes in municipality names and addresses was worked around by removing that character from the source data.
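A minimal Python sketch of that workaround is shown below. It assumes the rows are available as dictionaries keyed by the column names COMUNE and INDIRIZZO; the function names are hypothetical and this is not the script used for the import. The rows are copied, so the original values stay untouched.

```python
APOSTROPHES = "'\u2019"  # straight and typographic apostrophes

def strip_apostrophes(value):
    """Remove apostrophes from a string without touching anything else."""
    return "".join(ch for ch in value if ch not in APOSTROPHES)

def prepare_for_geocoding(rows, columns=("COMUNE", "INDIRIZZO")):
    """Return copies of the rows with apostrophes removed from the given columns."""
    prepared = []
    for row in rows:
        cleaned = dict(row)  # copy, so the source row is left as-is
        for col in columns:
            if cleaned.get(col):
                cleaned[col] = strip_apostrophes(cleaned[col])
        prepared.append(cleaned)
    return prepared

# Illustrative row only, not taken from the dataset.
sample = [{"COMUNE": "Farra d'Isonzo", "INDIRIZZO": "Via San Francesco d'Assisi, 3"}]
print(prepare_for_geocoding(sample))
```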

Here is a run using the Mapbox service:

$ csvgeocode input.csv output.csv --handler mapbox --delay 1000 --verbose --url "http://api.tiles.mapbox.com/v4/geocode/mapbox.places/{{INDIRIZZO}},{{CAP}} {{COMUNE}}.json?access_token="

And here is a run using Nominatim instead:

$ csvgeocode input.csv output.csv --handler osm --delay 1000 --verbose --url "http://nominatim.openstreetmap.org/search?q={{INDIRIZZO 1}}, {{COMUNE}}&format=json"
Rows geocoded: 468
Rows failed: 114
Time elapsed: 879.4 seconds
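To make the URL templates in the commands above easier to read: each {{COLUMN}} placeholder stands for the value of that column in the current row. The Python sketch below illustrates the substitution only; it is not csvgeocode's code. The function name is hypothetical, and the percent-encoding step is an assumption about what a client would normally do, not a verified fix for the apostrophe problem in csvgeocode.

```python
import re
from urllib.parse import quote

def fill_template(template, row):
    """Replace each {{COLUMN}} placeholder with the percent-encoded value from row."""
    def substitute(match):
        column = match.group(1)
        # A missing column raises KeyError instead of silently producing an empty query.
        return quote(str(row[column]), safe="")
    return re.sub(r"\{\{\s*(.*?)\s*\}\}", substitute, template)

template = "http://nominatim.openstreetmap.org/search?q={{INDIRIZZO}}, {{COMUNE}}&format=json"
# Illustrative row only, not taken from the dataset.
row = {"INDIRIZZO": "Via Bancaria, 7", "COMUNE": "Farra d'Isonzo"}
print(fill_template(template, row))
```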

The 114 rows that were not geocoded exposed a geocoder issue with apostrophes in the city field. Since the apostrophe cannot be escaped, the workaround is either to remove it (e.g. Farra d'Isonzo >> Farra disonzo) or to use the postcode instead. The same problem affects addresses, where the only solution found is to remove the character (e.g. San Francesco d'Assisi >> San Francesco dassisi). These edits are for the sake of geocoding only, since no addr:* tags will be imported.

In addition, some of the "successful" rows may have been matched even though the house number was missing, in which case the result is the highway centroid coordinates. To limit these false positives, municipalities without addresses (here in red) were isolated and the nodes geocoded there were excluded.
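The exclusion of municipalities without addresses can be sketched as a simple filter. This is an illustration in Python, not the procedure actually used; the function name is hypothetical and the municipality names below are placeholders, not the ones that were excluded in the import.

```python
def exclude_unreliable(rows, municipalities_without_addresses, column="COMUNE"):
    """Split geocoded rows into (kept, dropped) by municipality name."""
    blocked = {name.strip().casefold() for name in municipalities_without_addresses}
    kept, dropped = [], []
    for row in rows:
        # Rows in a blocked municipality may carry street-centroid coordinates.
        if row[column].strip().casefold() in blocked:
            dropped.append(row)
        else:
            kept.append(row)
    return kept, dropped

# Placeholder data for illustration.
geocoded = [
    {"COMUNE": "Comune A", "lat": 45.90, "lon": 13.50},
    {"COMUNE": "Comune B", "lat": 46.00, "lon": 13.20},
]
kept, dropped = exclude_unreliable(geocoded, ["comune b"])
print(len(kept), len(dropped))
```

Keeping the dropped rows in a separate list makes it possible to review them by hand later instead of discarding them outright.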

Conflating

Finally, conflation was run to generate an audit map. After a manual check, 410 nodes were written to the final OSM file.
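As a rough illustration of what a conflation step checks, the Python sketch below separates geocoded candidates that lie close to an existing OSM node (to be reviewed on the audit map) from those that look new. It is not the conflation tool used for this import; the function names, the lat/lon dictionary keys and the 100 m threshold are all assumptions.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def split_for_audit(candidates, existing, max_distance_m=100.0):
    """Return (new, review): candidates far from, or near to, any existing node."""
    new, review = [], []
    for cand in candidates:
        near = any(
            haversine_m(cand["lat"], cand["lon"], node["lat"], node["lon"]) <= max_distance_m
            for node in existing
        )
        (review if near else new).append(cand)
    return new, review

# Placeholder coordinates for illustration.
existing_nodes = [{"lat": 45.9000, "lon": 13.5000}]
candidates = [{"lat": 45.9001, "lon": 13.5001}, {"lat": 46.0000, "lon": 13.2000}]
new, review = split_for_audit(candidates, existing_nodes)
print(len(new), len(review))
```

Distance alone cannot decide whether two nodes describe the same B&B, which is why a manual check of the audit map is still needed.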

Files

The files involved can be found here.