AGIV CRAB Import

From OpenStreetMap Wiki

About

This import project page is about the AGIV CRAB database import. The dataset was released in 2013 under the new Flemish OpenData License.

This dataset contains all addressing information from the entire Flanders area in Belgium.

Import Plan Outline

  • Convert AGIV CRAB data to basic addresses: commune_name, postalcode, street, housenumber.
  • Split this data up into small tasks per street (commune_name, postalcode, street) and upload this to http://addr.openstreetmap.fr/vlaanderen/.
  • A small task is created per street (split by postalcode).
  • Mappers will take tasks (in their local areas preferably) and merge the data into OSM using JOSM.

Goals

The goal of the import is to use this dataset in mapping activities. We are not attempting a blind import: all data has to be seen by human eyes before it can appear in the OSM data.

Schedule

The schedule is very flexible but people are eager to get started. The sooner there is agreement on how to do this import the better.

There will be two phases:

  • An experimental phase with a small group of local OSM veterans.
  • A second phase in which everybody can participate; the only requirement is a good understanding of JOSM.

Import Data

Background

Data source site: https://download.agiv.be/Producten/Detail?id=102&title=CRAB_adresposities
Data license: http://agiv.be/gis/producten/?artid=2101
Type of license (if applicable): Flemish OpenData License (translation: AGIV_CRAB_Import/Free_open_data_licence_Flanders)
Link to permission (if required): According to the license, no separate permission needed to be obtained, but it was explicitly confirmed by Laura D'heer at AGIV.
OSM attribution (if required): https://wiki.openstreetmap.org/wiki/Contributors#AGIV
ODbL Compliance verified: yes

Import Type

This import will not be a machine import. The entire address dataset is divided into small tasks per street/postalcode and merged with OpenStreetMap data manually, using an address import tool made by OSM-FR together with JOSM to conflate the data per street.

Data Preparation

Data Reduction & Simplification

There is no data reduction, but addresses in the source data can have multiple points: at building level (centered on the building) and at parcel level. There is consensus to use the position of the address on the building, similar to what is already done in OSM.

The building position is also considered the most accurate position by AGIV.
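
The selection rule above can be sketched as follows. This is only an illustration, not the actual import script: the field names (`postalcode`, `street`, `housenumber`, `kind`) and the labels `building` and `parcel` are hypothetical stand-ins for whatever codes the CRAB source data really uses.

```python
from typing import Iterable

# Hypothetical precision labels; the real CRAB dataset uses its own codes.
# A lower rank means a more preferred position.
PREFERENCE = {"building": 0, "parcel": 1}


def best_positions(points: Iterable[dict]) -> dict:
    """Keep one position per address, preferring building-level points."""
    best = {}
    for point in points:
        key = (point["postalcode"], point["street"], point["housenumber"])
        # Unknown kinds rank after all known kinds.
        rank = PREFERENCE.get(point["kind"], len(PREFERENCE))
        if key not in best or rank < best[key][0]:
            best[key] = (rank, point)
    return {key: point for key, (rank, point) in best.items()}
```

Each address keeps exactly one point, so a parcel-level position is only used when no building-level position is available.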

Tagging Plans

We will be adding the address to the building outline.

The housenumber and the street are the only information that will be used:

  • addr:housenumber : The housenumber corresponding to the housenumber in the source data.

Source: huisnr.huisnr

  • addr:street : The street corresponding to the streetname in the source-data.

Source: straatnm.straatnm

Changeset Tags

source = AGIV CRAB

Data Transformation

A Python script extracts basic address information from the AGIV CRAB dataset and outputs the result as a CSV file. The entire script can be found here:

http://github.com/xivk/crab-tools/tree/master/python

The script finishes in about 30-40 minutes. The result is one large file that contains, for every address, the best position found in the source data (according to the rules above). The source Belgian Lambert 72 X-Y coordinates are converted to the WGS 84 latitude/longitude coordinates used in OSM.

Data Transformation Results

The results of the transformation can be found here for one commune:

http://github.com/xivk/crab-tools/blob/master/python/2275.csv

The six columns that are used are:

  • HUISNR
  • STREET_NL
  • PKANCODE
  • COMMUNE_NL
  • LAT
  • LON

PKANCODE, COMMUNE_NL and STREET_NL are used to split the data into smaller chunks. LAT/LON contain the position of the address node, and HUISNR and STREET_NL are used to generate the addr:housenumber and addr:street tags.
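
As a rough sketch of how these columns could be used, the snippet below groups rows into one task per (PKANCODE, COMMUNE_NL, STREET_NL) and builds the two tags for each address node. It assumes a comma-separated file with a header row carrying the column names listed above; the real file layout may differ, and the function name `split_into_tasks` is invented for this example.

```python
import csv
import io
from collections import defaultdict


def split_into_tasks(csv_text: str, delimiter: str = ",") -> dict:
    """Group address rows into one task per (postcode, commune, street)."""
    tasks = defaultdict(list)
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    for row in reader:
        key = (row["PKANCODE"], row["COMMUNE_NL"], row["STREET_NL"])
        tasks[key].append({
            "lat": float(row["LAT"]),
            "lon": float(row["LON"]),
            # Only these two tags are generated for each address node.
            "tags": {
                "addr:housenumber": row["HUISNR"],
                "addr:street": row["STREET_NL"],
            },
        })
    return dict(tasks)
```

Keying on the postal code as well as the street name keeps a street that crosses a postal code boundary in separate tasks, as described in the import plan.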

Data Merge Workflow

Team Approach

The import will be conducted in two phases. In the first phase, a handful of mappers will start the import and get a direct feel for the intricacies of the task. This group is currently made up of Marc Gemis, Ben Abelshausen, <to be decided >. Individuals who would like to join this group should send an email to the Belgian mailing list.

Based on findings and progress in the first phase, tasks will be opened to a broader group of people, kicking off with a community session to get everyone up to speed.

Workflow

All data are converted from the AGIV CRAB database by the above Python script and made available via http://addr.openstreetmap.fr/vlaanderen/. A separate task is made for each street, and each task is represented by a yellow pin.
Import tasks indicated by pins

By clicking on a task icon, one can see some details of the task. The mapper can then click on the link "JOSM Remote xxxxx", where xxxxx is the number of the task.

Dialog for import task

The data will be downloaded to the mapper's computer. For this to work, the mapper needs to install JOSM and enable the editor's remote control feature.
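
For illustration, a remote control link of this kind could be built as shown below. This is a sketch under assumptions: it uses JOSM's remote control `import` command on the default local address, and the task data URL in the usage note is a made-up placeholder. The exact link format generated by the import tool is not documented here.

```python
from urllib.parse import urlencode

# Default local address on which JOSM's remote control listens.
JOSM_REMOTE = "http://127.0.0.1:8111"


def josm_import_url(data_url: str) -> str:
    """Build a remote control link that asks JOSM to import data_url."""
    return f"{JOSM_REMOTE}/import?{urlencode({'url': data_url})}"
```

Calling `josm_import_url("http://example.org/task/123.osm")` yields a link that, when opened in a browser while JOSM is running with remote control enabled, asks JOSM to fetch and load that file.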

The data will be placed in a new layer in JOSM. The data consists of individual address nodes with the housenumber and the streetname.

Josm with imported address data

The mapper can then download the existing OSM data into that layer. It is the mapper's task to merge the imported data with the OSM data before uploading to OSM. During this merge, the house numbers are placed on the buildings and duplicate house numbers are removed. When there is a conflict with existing data, it is advised to contact the mapper who made those changes.

Josm with imported address data and downloaded OSM-data

The house numbers should be placed on the building, as we are used to in Belgium.

A more detailed description, which can hopefully also be used as a tutorial, can be found here: WikiProject_Belgium/Using_AGIV_Crab_data.

Dedicated upload account

When importing AGIV CRAB data the mapper will use a dedicated account. It is suggested to use {username}_crab. Each changeset will also be tagged with source=AGIV CRAB.

Conflation

Conflation has to be done by each individual contributor.

QA

The address itself is considered the reference between the AGIV CRAB dataset and OSM. There are several ways to extract addresses from OSM; once that is done, comparing them with the AGIV CRAB database becomes trivial.
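
A minimal sketch of such a comparison, assuming both sides have already been reduced to (street, housenumber) pairs for the same postal code area. The helper names and the normalisation (collapsing whitespace and ignoring case) are choices made for this example, not part of the import plan.

```python
def normalise(street: str, housenumber: str) -> tuple:
    """Collapse whitespace and ignore case so trivial differences match."""
    return (" ".join(street.split()).casefold(), housenumber.strip().casefold())


def compare_addresses(crab, osm) -> dict:
    """Compare two iterables of (street, housenumber) pairs."""
    crab_set = {normalise(street, number) for street, number in crab}
    osm_set = {normalise(street, number) for street, number in osm}
    return {
        "missing_in_osm": sorted(crab_set - osm_set),
        "only_in_osm": sorted(osm_set - crab_set),
    }
```

Addresses listed under `only_in_osm` are not necessarily wrong: they may point to errors or omissions in the CRAB reference data and are worth checking by hand.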

We have an unofficial agreement to report errors in the AGIV CRAB reference data when we find them.