OpenData Puglia Import
OpenData Puglia Import is an import of several CSV datasets produced by the Apulia region (Regione Puglia) in Italy, covering places of cultural and tourist interest. The import is currently (Feb 12, 2018) only planned and has not been executed yet.
The import task has been discussed on the email@example.com list (Nov 2017 - Dec 2017).
The goal of this project is to import data of interest into OSM. The data relate only to the Apulia region in Italy. The datasets contain data from two important websites, Apulia Digital Library (DL) and viggiareconnoi.it (VCN), and mainly represent churches, manor farms, tourist attractions, paintings, etc.
Jan 2018 - Ongoing
The import will be performed by a dedicated account, User:innovaP_importOSM.
Provide links to your sources.
Data source site: Dataset
Data license: IODL v2
Data license: explicit permission from the data owner is pending
Type of license (if applicable): IODL v2.
Link to permission (if required): PENDING
OSM attribution (if required): http://wiki.openstreetmap.org/wiki/Contributors#yourdataprovider
ODbL Compliance verified: -
OSM Data Files
Link to your source data files that you have prepared for the import - e.g. the .osm files you have derived from the data sources.
Bulk import of data missing from OSM. Duplicates are detected semi-automatically during the import, thus avoiding the creation of entities already present in OSM.
Data Reduction & Simplification
After a manual clean-up of the dataset, removing incomplete and wrong records, the result will be processed by a Python script.
The script will extract all data and process the dataset row by row.
Not all columns will be extracted, only the relevant ones: name, description, category, lat, lon and website (a reference to the original DL or VCN web portal).
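The column-extraction step described above can be sketched as follows. The column names and the sample row are illustrative assumptions, since the real CSV headers are not shown on this page:

```python
import csv
import io

# Only these columns are carried over; everything else is dropped.
RELEVANT = ["name", "description", "category", "lat", "lon", "website"]

def extract_rows(csv_text):
    """Yield one dict per dataset row, keeping only the relevant columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        yield {key: row[key].strip() for key in RELEVANT if key in row}

# Hypothetical sample row; the real dataset may use Italian headers.
sample = (
    "name,description,category,lat,lon,website,extra\n"
    "Torre di Mola,Coastal watchtower,Torri,41.06,17.09,https://example.org,ignored\n"
)
rows = list(extract_rows(sample))
```

Irrelevant columns (here `extra`) are simply never copied into the output dict.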
Common tags are:
Category in dataset -> OSM matching tag(s)
Torri -> man_made=tower
Chiese e cattedrali -> building=church or building=cathedral
musei -> tourism=museum
Basiliche e santuari -> amenity=place_of_worship
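The manually populated category dictionary mentioned below could look like this minimal sketch, covering only the four categories listed on this page (the real dictionary holds many more entries):

```python
# Dataset category -> OSM tags. Populated by hand, as described in the text.
CATEGORY_TAGS = {
    "Torri": {"man_made": "tower"},
    "Chiese e cattedrali": {"building": "church"},  # or building=cathedral
    "musei": {"tourism": "museum"},
    "Basiliche e santuari": {"amenity": "place_of_worship"},
}

def tags_for(category):
    """Return the OSM tags for a dataset category, or None if unmapped."""
    return CATEGORY_TAGS.get(category)
```

Unmapped categories return `None`, so the script can skip or flag rows it does not know how to tag.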
No changeset tags have been used.
A Python script will process the dataset and export it to OSM XML.
By processing we mean:
- Check that the extracted data is correct;
- Check whether the "new" node is actually missing from OSM. This is done by querying Overpass Turbo, looking at nodes and ways within 200 metres of the given lat and lon. A query example is shown below;
- Find and assign the right category for the node being added (a dictionary has been populated manually with all the potential tags associated with the categories in the dataset);
- Add the description found in the dataset;
- Add a reference to the resource page on the DL or VCN website;
- Serialize all correct data to OSM XML format.
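The final serialization step could be sketched as below. The element layout and the negative ids marking new objects follow common OSM XML practice; the generator name and tag keys are assumptions, not taken from the actual script:

```python
import xml.etree.ElementTree as ET

def to_osm_xml(rows):
    """Serialize processed rows as OSM XML nodes (negative id = new object)."""
    root = ET.Element("osm", version="0.6", generator="opendata-puglia-import")
    for i, row in enumerate(rows, start=1):
        node = ET.SubElement(
            root, "node",
            id=str(-i),
            lat=row["lat"], lon=row["lon"],
        )
        # Plain columns carried over from the dataset.
        for key in ("name", "description", "website"):
            if row.get(key):
                ET.SubElement(node, "tag", k=key, v=row[key])
        # Tags assigned from the category dictionary.
        for k, v in row.get("tags", {}).items():
            ET.SubElement(node, "tag", k=k, v=v)
    return ET.tostring(root, encoding="unicode")

# Hypothetical example row, already processed.
xml_out = to_osm_xml([{
    "lat": "40.74", "lon": "17.57",
    "name": "Esempio", "website": "https://example.org",
    "tags": {"tourism": "museum"},
}])
```

The resulting file can then be reviewed and uploaded with JOSM.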
Query to Overpass turbo example
Node in dataset: Grotta di Santa Maria d'Agnano
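The query itself is not reproduced on this page; a query of the shape described above (nodes and ways within 200 m of the dataset coordinates) would look like the following Overpass QL sketch, where the coordinates and the name filter are illustrative assumptions:

```
[out:json][timeout:25];
(
  node(around:200, 40.7433, 17.5697)["name"];
  way(around:200, 40.7433, 17.5697)["name"];
);
out center;
```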
This node (Grotta di Santa Maria d'Agnano) will be skipped because a very similar node (Santa Maria di Agnano) already exists on OSM near the given coordinates, so there is no point in processing and adding it.
Data Transformation Results
Data Merge Workflow
Describe if you'll be doing this solo or as a team.
List all factors that will be evaluated in the import.
Detail the steps you'll take during the actual import.
Information to include:
- Step by step instructions
- Changeset size policy
- Revert plans
Identify your approach to conflation here.
Add your QA plan here.