AIV GRB Building Import

From OpenStreetMap Wiki

WORK IN PROGRESS - Using AGIV CRAB Adressenlijst import as template

About

This import project page is about the AIV GRB Building database import.

The dataset, whose full name is 'Grootschalig Referentie Bestand Vlaanderen', was released under the new Flemish OpenData License (Gratis Open Data Licentie Vlaanderen v. 1.02). It contains data on parcels, buildings, road boundaries, sewer manholes, ...

In addition, to help detect certain building types, building height info (min/max/average) is taken from the '3D GRB LOD1' ('level of detail 1') dataset.

The import uses the building layers of this dataset to add missing buildings in Flanders, since some areas still lack a significant number of buildings.

Import Plan Outline

The import happens in 3 main steps

  • Data preparation
  • Data insertion
  • Data validation

Data preparation means: take all the building data from the GRB dataset (layers 'GBG' and 'GBA'), add height data from the 3D GRB LOD1 dataset, compare the result with the current OpenStreetMap data (based on the current tagging in OSM and the landuse currently mapped at the location), and suggest which tags the building should get. This step is done 'behind the scenes' for the entirety of Flanders, so end users will have access to this data. It will not be recalculated on the fly, but will be updated periodically.

A front-end tool will allow the user performing the import to select a (limited) area for which they wish to add the missing buildings. The data will be pulled from the pre-processed data of the first step and inserted into OSM.

While the data is carefully prepared, it is still the importing party's responsibility to VERIFY the data. To ensure proper validation by humans, data insertion therefore needs to be limited to sufficiently small chunks.

Goals

The goal of this import is to use this high-quality dataset to significantly increase the building contour data available in OSM. This will not be a blind import: all data will be edited by local mappers.

Schedule

Data preparation

Data import preparation is complete and can be re-run in an automated manner to sync with both OSM and government datasets.

Communication with the local community has taken place through the Riot Chat channel and has been summarized in the Belgian subsection of the OSM forums: https://forum.openstreetmap.org/viewtopic.php?id=61597

The data can be previewed here: http://dataviewer.grbosm.site/ , by clicking 'enable info window' on the top left and hovering over buildings to see the underlying data taken into account.

As a 'proof of concept' of what the end result could look like, this link http://tiles.grbosm.site/slide/app/index.html#16/50.9523/3.1193 gives a slide-over comparison: moving the slider at the top from left to right shows the difference between the dataset with and without the buildings.

Data import

This step is currently in progress. It is estimated to be finished by the end of June 2018.

Data validation

Since this step is not automated, it is more about creating a 'how to' as a guideline. This work can continue in parallel with the data import step.

Assuming approval of the import, the aim is to have things up and running by August 2018.

There is no 'end date' set for completion, for two reasons:

- We want the import to be processed by people with local knowledge, and are therefore dependent on the presence of volunteers with said knowledge.

- The source data receives periodic updates, thus at some point people will be looking to import just those changes.

Import Data

Background

About GRB

GRB, or 'Grootschalig Referentiebestand', is a database with large-scale data, such as buildings, parcels, roads and their layout, waterways, railways and constructions. It is managed by the Agentschap Informatie Vlaanderen (AIV). It was compiled in a uniform way for each municipality in Flanders, based on terrain survey and aerial imagery. (summary translated from: https://www.agiv.be/producten/grb/objectcatalogus/entiteiten )

About 3D GRB LOD1 DHMVII

The 3D GRB - buildings LOD1 is a 'block model' (so footprint + height data) of all buildings present in the aforementioned GRB, where height data is derived from the DHMV II (Digital Height Model Vlaanderen, second version). It allows rendering the building objects in simplified 3D and querying building height data efficiently.

The dataset builds on the layers GBG (building footprints), GBA (other building parts) and KNW (artificial, non-building constructions).

(summary translated from: https://download.agiv.be/Producten/Detail?id=1210&title=3D_GRB_Gebouw_LOD1_DHMV_II )

Legal

Data source site: https://download.agiv.be/Producten/Detail?id=1&title=GR
Data license: https://download.agiv.be/Producten/GetGebruiksvoorwaardenPDF?id=1210
Type of license (if applicable): Flemish OpenData License (translation: AGIV_CRAB_Import/Free_open_data_licence_Flanders)
Link to permission (if required): The license is compatible for use within OSM.
OSM attribution (if required): https://wiki.openstreetmap.org/wiki/Contributors#AGIV
ODbL Compliance verified: yes

Import Type

This import will not be an automated bulk import. It will be split into smaller pieces (approximately 1000 buildings at most) to ensure data validation.
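A minimal sketch of such chunking, assuming the buildings selected for an area are already available as a Python sequence. The function name `chunk_buildings` and its `max_size` parameter are illustrative and not part of the import tool.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunk_buildings(buildings: Sequence[T], max_size: int = 1000) -> Iterator[List[T]]:
    """Yield consecutive batches of at most max_size buildings (hypothetical helper)."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    for start in range(0, len(buildings), max_size):
        yield list(buildings[start:start + max_size])


# Example: 2500 placeholder buildings are split into three reviewable batches.
batches = list(chunk_buildings(list(range(2500))))
batch_sizes = [len(batch) for batch in batches]
```

Each batch can then be validated by a human before the next one is started.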

Data Preparation

Data Combination

Data from three sources is combined:

  1. OpenStreetMap (all building contours, landuses and their attributes)
  2. GRB-data (from layers GBG, GBA and KNW + the OIDN/UIDN fields and the source date)
  3. 3D GRB-data (data H_DTM_MIN, H_DTM_GEM, H_DSM_MAX, H_DSM_P99, HN_MAX and HN_P99)

Note that for the 3D GRB data, DTM and DSM stand for Digital Terrain Model and Digital Surface Model respectively.

The difference between DSM and DTM gives the building height. The data source lists both the maximum value and the value beneath which 99% of the points are located.

In turn, comparing those two values makes it possible to detect 'flat' versus 'pointy' roof structures.
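As an illustration only, the comparison between the maximum and the 99th percentile height could look like the sketch below. It assumes that HN_MAX and HN_P99 are heights above the terrain in metres; the function name `roof_profile` and the 0.5 m tolerance are hypothetical and are not taken from the actual decision tree.

```python
def roof_profile(hn_max: float, hn_p99: float, tolerance_m: float = 0.5) -> str:
    """Return 'flat' when the highest point barely exceeds the 99th percentile.

    Assumption: a large gap between HN_MAX and HN_P99 means a small part of
    the roof sticks out far above the rest, which suggests a 'pointy' roof.
    The tolerance is an illustrative value, not one used by the import.
    """
    gap = hn_max - hn_p99
    return "flat" if gap <= tolerance_m else "pointy"


# Example values (invented for illustration, not taken from the dataset).
garage = roof_profile(hn_max=3.1, hn_p99=3.0)
tower = roof_profile(hn_max=25.0, hn_p99=12.0)
```

A real threshold would have to be tuned against buildings whose roof shape is known.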

Data analysis - decision tree for building types

Building on the 'extended dataset' created by combining the data from the three sources, a decision tree determines which tag should be suggested for building=*.

For reference, the decision tree is as follows (in pseudocode):

INSERT DECISION TREE

The result of that analysis can be previewed at http://grbtiles.byteless.net/grb.html, by toggling 'Enable Info Window' in the top bar and hovering over buildings.

Tagging scheme

Mandatory tags

Only the streetname and housenumber are mandatory tags, as those are the only tags needed to have a complete and non-ambiguous address (together with the boundaries that should already be present in the data, and can be corrected or improved at any time).

  • addr:housenumber=*: The house number corresponding to the house number in the source data.

Source: Huisnummer

  • addr:street=*: The street name corresponding to the street name in the source data.

Source: Straatnm

Optional tags, provided by the tool

Mappers also get a lot of freedom to adapt the tagging to their own workflow or individual preferences. For the following tags, it is up to the individual mapper to decide whether or not to add them. The import tool provides checkboxes with which the mapper can choose to add each of them.

  • addr:flats=*: A list of flats derived from all addresses with different flat numbers at this position. This data is optional because:
    • There's no uniform notation, which makes the data not so usable.
    • The CRAB data is more likely to contain mistakes, so about every apartment requires a survey.
    • The data isn't needed to find an address (since all numbers should have the same entrance, and postboxes next to each other).

Source: Appartementnummer / Busnummer

  • addr:postcode=* and addr:city=*: The postal code and municipality of the address. This data is optional because:
    • It isn't needed since the boundaries are present
    • Not all mappers agree to put the municipality inside the addr:city=* tag, some prefer to put the name of the postal-code zone there, which is the name of the part-municipality in some cases.
    • It's duplicated data, making it harder to maintain.
    • Sometimes, it's handy to add the postal code and municipality to clarify border cases, or to use filters and queries in JOSM.

Optional tags, not provided by the tool

  • AssociatedStreet relation: Exact tagging for the relation is decided by the mapper. This is optional because:
    • We can't provide the relation in the import tool. The relation might already be partially present, and all members would have to be swapped (since we map building outlines and not nodes).
    • The data can already be derived from other tags and boundaries
    • Some users might want to add the data, to clarify border cases, or to use filters and queries in JOSM.

Forbidden tags

The following tags can be downloaded through the import tool, but they only serve to inspect the quality of CRAB data (in order to determine where a mistake might be made). They may never appear in the OSM database.

  • CRAB:herkomst=*: This tag denotes the source AGIV gives for the location (in Dutch). It ranges from front-door precision to municipality center (with several interpolations in-between). It's handy to see this quality description, but it should never appear in OSM like this, because we alter the quality (and hopefully improve it) by adding the address to a building.
  • CRAB:hnrlabels=*: This tag lists the housenumber labels for that address. Housenumber labels appear when different CRAB addresses overlap (due to lack of precision, or because they belong on the same physical object). This should not get in OSM because either the precision should be improved (move addresses to the right buildings), or a double housenumber should be put in the addr:housenumber=* key.
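A minimal sketch of how such inspection-only tags could be removed before upload, assuming the tags of an object are held in a plain Python dict. The names `FORBIDDEN_KEYS` and `strip_forbidden_tags` are hypothetical and not part of the import tool.

```python
from typing import Dict

# Keys that serve only to inspect source quality and may never reach OSM.
FORBIDDEN_KEYS = frozenset({"CRAB:herkomst", "CRAB:hnrlabels"})


def strip_forbidden_tags(tags: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of tags without the inspection-only keys."""
    return {key: value for key, value in tags.items() if key not in FORBIDDEN_KEYS}


# Example with invented tag values.
downloaded = {
    "addr:housenumber": "12",
    "CRAB:herkomst": "example value",
    "CRAB:hnrlabels": "12-14",
}
cleaned = strip_forbidden_tags(downloaded)
```

Returning a copy leaves the downloaded tags intact, so the mapper can still consult the quality information while editing.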

Changeset Tags

The source tags are documented on the import site. These should be:

  • source=Agiv GRB: We use the GRB database from Agiv, and we use Agiv aerial imagery to draw the outlines. No other data is used in most cases.
  • source:date=yyyy-mm-dd: To detect changes
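For illustration, the two source tags could be assembled as follows. This is a sketch: the helper name `changeset_source_tags` is an assumption, and only the tag keys and the `Agiv GRB` value come from the list above.

```python
from datetime import date
from typing import Dict


def changeset_source_tags(source_date: date) -> Dict[str, str]:
    """Build the source tags, formatting the date as yyyy-mm-dd."""
    return {
        "source": "Agiv GRB",
        "source:date": source_date.isoformat(),
    }


# Example with an arbitrary date.
tags = changeset_source_tags(date(2018, 5, 1))
```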


Data Merge Workflow

Team Approach

The tools will be available to everyone, with specific guidelines to achieve optimal quality. The dataset will be used more as an additional source than as a direct dump (comparable to using Bing imagery to map things).

Workflow

See ToDo:LINK for the workflow documentation.

Dedicated upload account

Since mappers will be mapping much more than just the addresses provided in the source dataset (building outlines will also be mapped), and in some cases, surveying is part of the job, this cannot be considered a normal import. It's more comparable to mapping stuff based on background imagery. Here the housenumbers are used as a background to map the buildings. Many users will also map things next to the housenumbers in the same session (because they surveyed something, or because they notice something on the imagery).

As such, we consider the requirement for a dedicated user account as a limitation for the contributors.

Conflation

The tools provide information about the available addresses, but individual mappers must decide whether to draw new building outlines or merge the data with existing building outlines.

QA

There will be continuous QA through the comparison tools. Every mapper will map and check the region they know. Besides the comparison between OSM and CRAB, other tools such as Osmose and Keep Right will also be used from time to time.


When mistakes in CRAB are found, Agiv provides tools to report them, so that they can be corrected and the differences between OSM and CRAB disappear in the next data update. The reaction time depends on the municipalities, but it is usually a few weeks.