GRBimport

!!!This is a draft page - Please don't take this as a reference yet!!!

High-level overview

The Flemish government has made a dataset of all buildings in the region. The data contains high quality addresses, is measured to centimetre level precision and is continuously updated.

Their main focus is accuracy, not recency. Changes are only made when an as-built plan is received or when surveyors have been on the ground. In the worst case, a change on the ground should be mapped within a year. [link: details of update procedure]

Only what can be surveyed from public land is measured on the ground, so the back sides of buildings are drawn from high-resolution imagery. The building dataset contains addresses, which are periodically updated from the central address database (CRAB).

The import toolset is not meant to load the data into OSM as fast as possible. Instead, it is a tool that assists mappers in improving their mapping quality. The end result will be better than either dataset on its own.

The main issues we have had to solve:

  • Conversion to OSM categories: It isn't always possible to convert the GRB categories to a detailed OSM tag, but a good effort goes a long way. In case of doubt, we can use building=yes. The import toolset will raise warnings when an edit would change existing tags. Because of how we plan to tag, QA tools to identify possible tagging mistakes will be easy to set up. Several categories of mistakes are already identified and accounted for. [main articles]
  • Existing buildings: About 1.5 million buildings exist in OSM already. The toolset will respect these objects and their mappers. Only the geometry will be changed; the object ID will remain. The top 100 building contributors will be contacted. [main article]
  • Address data: Complex addresses are flattened onto buildings in GRB. These addresses will not be imported. Instead, they can be sourced directly from CRAB, helped by local surveys. [main article]
  • Complexity of the mapper's job: To ensure quality imports, the tool will be properly documented, and there will be workshops and hangouts. Only people with a minimum amount of training will be allowed to import into OSM with the toolset. However, we have to be careful not to be off-putting: people are already deleting and re-tracing buildings from GRB background images, or doing wild imports. The toolset will definitely improve on current mapping practices.
  • Updating: Our community has the skills to keep using the GRB to detect errors once a larger number of GRB buildings has been imported. In many cases we will be faster than the GRB, in many others we will be slower. Marc Ducobu has presented at SotM [link] about the exact same problem in Brussels, where a building import happened a lot earlier. These tools will serve as inspiration or will be extended to cover Flanders. [article: updating challenges]


To do: add links to detailed sections, extend these sections. Joost schouppe (talk) 08:23, 13 December 2016 (UTC)

Scope

This page applies to the Flemish geographical area in Belgium including Brussels.

Terms

Related terms

  • GRB  : Grootschalig ReferentieBestand
  • LRD  : Large-scale Reference Database, the English translation of GRB
  • AGIV : Agency for Geographic Information Flanders (Agentschap voor Geografische Informatie Vlaanderen). This name is deprecated and should now be AIV (Agency for Information Flanders). However, the old name is still in much wider use than the new one.
  • URBIS: Brussels Urban Information System

About GRB

In the period 2000-2013, the Large-Scale Reference Database (LRD) (in Dutch: GRB - Grootschalig ReferentieBestand) was established for all Flemish cities and municipalities. This detailed database focuses on government-operated infrastructure in the public domain. It contains, amongst other things, building footprints, road morphology, road infrastructure and administrative parcels. Everything that is accessible or visible from the public domain was collected by professional surveyors to within a 25 cm accuracy. Everything in the public domain is checked for updates at least once a year. After the original creation of these databases, no updates were made at the level of the inner terrains (areas not accessible to surveyors) until 2013. In that year an update based on terrain changes (anomalies) was carried out: midscale aerial imagery covering more than 5,500 km² was screened for changes at the level of the inner terrains. In total, more than 400,000 anomalies were checked and updated by means of photogrammetric mapping on the midscale imagery.

In December 2015, the data was made publicly available, although registration was required. The database is maintained and updated by Flemish municipalities using administrative procedures, geodetic measurements and satellite photos. It forms the geographical basis onto which others can graft their own data. The Flemish government expects feedback channels to have a positive effect on the quality of the data. Furthermore, it expects the development of new datasets and the creation of applications based on GRB data.

A detailed analysis of GRB entities and their possible translation to OSM objects is available at the WikiProject Belgium/GRB page.

License

AGIV released various datasets under the Flemish Open Data Licence (compatible with the ODbL), which have been used by different mappers:

  • CRAB - database with addresses and positions (see AGIV_CRAB_Import).
  • Orthophotos - a dataset with aerial imagery. It includes historical imagery, and new imagery with a 25 cm resolution, refreshed every year.
  • GRB - database with numerous geographical features, including e.g. waterbodies, street areas and building outlines. Most publicly accessible features (e.g. the front sides of buildings) have been measured on the ground with great accuracy (typically better than 10 cm).


Import Plan Outline

[x] Analyse GRB data

[x] Determine subsets of data to be imported (GRB entities); selected: Gbg, Knw, Gba

[x] Process data (code and procedure on GitHub)

[x] Build software (dev beta version ready, production version underway)

[ ] Create import case

Schedule

Planning meeting with interested volunteers scheduled for ?

Source Data

Current raw source datafiles can be retrieved here

BETA test site

Public vector layer

Curl data request example

   curl 'http://grbtiles.byteless.net/postgis_geojson.php?bbox=352711.8458126,6581269.4005653,353825.55720112,6581655.7658565' -H 'Pragma: no-cache' -H 'Accept-Encoding: gzip, deflate, sdch' -H 'Accept-Language: en,en-US;q=0.8,nl;q=0.6,af;q=0.4,fr;q=0.2' -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36' -H 'Accept: */*' -H 'Referer: http://grbtiles.byteless.net/' -H 'X-Requested-With: XMLHttpRequest' -H 'Connection: keep-alive' -H 'Cache-Control: no-cache' --compressed
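For scripted use, the same request URL can be assembled without the browser headers. The sketch below is a minimal Python illustration and not part of the toolset. It assumes that the bbox values are Web Mercator (EPSG:3857) metres in the order min x, min y, max x, max y; this is inferred from the magnitude of the numbers in the curl example, not from any documentation. The function names are hypothetical, and whether the server answers requests that lack the headers shown above has not been verified.

```python
import math
from urllib.parse import urlencode

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator (EPSG:3857)


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Convert WGS84 longitude/latitude in degrees to Web Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return x, y


def build_grb_request_url(
    west, south, east, north,
    base_url="http://grbtiles.byteless.net/postgis_geojson.php",
):
    """Build the GeoJSON request URL for a lon/lat bounding box.

    Assumption: the endpoint expects bbox=minx,miny,maxx,maxy in EPSG:3857.
    """
    min_x, min_y = lonlat_to_web_mercator(west, south)
    max_x, max_y = lonlat_to_web_mercator(east, north)
    bbox = ",".join(f"{v:.7f}" for v in (min_x, min_y, max_x, max_y))
    # Keep the commas unescaped so the URL looks like the curl example.
    return f"{base_url}?{urlencode({'bbox': bbox}, safe=',')}"


if __name__ == "__main__":
    # Illustrative coordinates only; pick any small area in Flanders.
    print(build_grb_request_url(3.168, 50.789, 3.178, 50.791))
```

The resulting URL can be passed to curl or any HTTP client. Keeping the bounding box small limits the size of the GeoJSON response.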

Import type

This is strictly a semi-automated import. The main reason is that we want the data to be verified by a human. During data analysis and tool development it became clear that we cannot do a fully automated import. This is not because the data is bad, but because it takes a human to merge existing OSM data with GRB data intelligently. JOSM plays a crucial role in our workflow.

There is no deadline. The intended use is to help our large community of surveying mappers determine correct addresses, street names and building contours. The work is done on a small scale, although it is also very easy to complete areas with few buildings. Using the web tool, we exclude existing buildings and take diffs in the web browser by combining JOSM features with the Overpass API.

Import guidelines for GRB

Import guideline | Required | GRB import
Step 1.1 - Prerequisites | n/a | The GRB data for Belgium contains building outlines, addresses and other objects.
Step 1.2 - Community Buy-in | acceptance by the community | Several threads on GRB can be found on the talk-be list.
Step 1.3 - Documentation | license, updating the catalogue and the import | The GRB is provided by law under an ODbL-compatible license.
Step 1.4 - Import Review | n/a | The import will be discussed IRL at a community meeting which will probably be planned December 11 2016. Several test cases have been performed with limited impact in areas with few existing buildings, but more complex, dense areas have also been tested.
Step 1.5 - Uploading | dedicated user accounts | Data is uploaded using JOSM. Dedicated accounts are not needed, as this tool is meant to assist existing mappers and we prefer drip-feeding the OSM database. It should be stressed that this is not an import in the usual sense of the word. Even though we are making an import case here, the workflow really amounts to migrating GRB data into OSM.
2. Make sure data license is OK | yes | see Step 1.3
3. Document your import on the wiki | yes | see Step 1.3 and separate section below
4. Use a dedicated user account | no | see Step 1.5
5. Check tags | n/a | See separate section below. Some tags have to be translated manually; verdieping is one of them. It usually denotes a floor with a road below it, but this cannot be determined automatically. It can also be a second legal entity/object on top of another. Since we have Mapillary and free, detailed 10 cm imagery from AGIV, a human can determine the correct action: keep, merge, retag or drop.
6. Work small - you have time | n/a | The merge has no end date. GRB is updated as well, and we have accounted for this. The DB(s) will be updated with new information at select intervals. Updates require everything to be rebuilt from zero, since the DB uses the OSM schema and we cannot guarantee that buildings keep the same osm_id across imports. Hence we use source:* tags to match this data. Keep the changesets small.
7. Don't screw up the data! | yes | The merge is incremental and gradual. Since we are trickle-feeding OSM rather than bulk importing, the chance of major screw-ups is small. Deletes are not really part of the plan unless a complicated merge has to be performed; that is the main reason this is semi-automatic.
8. Don't put data on top of data | yes | The current buildings will NOT be replaced. The mapper is required to use the Replace Geometry plugin (CTRL+SHIFT+G) to merge new geometries with historic building data. This is essential in order to account for changes easily.
9. Simplifying | yes | The GRB data is very accurate and comes with a government-pledged QA guarantee, so it suffers from what we call overnoding, i.e. too many nodes in an arc and/or circle. The web interface provides an option with a very sane default to prevent this. Sometimes small tweaks with the slider are needed. In the background, all the different entities are separate objects; they are glued to each other when exporting to JOSM, which prevents many validator warnings.

Workflow

Once the data is converted to OSM objects, it can be loaded into JOSM. In simple cases, new objects are created. Some building types will need manual review to check what tag they need. If buildings are already present, the OSM object will be retained. The tags will be compared and extended if necessary. The geometry will be updated. If the part of the building that is not visible from the public domain looks better mapped in OSM than in the GRB object, that part of the OSM geometry is retained too. Unique identifiers are added to the object to make clear where the geometry came from, and at which version. This makes quality assurance possible, and makes it clear that the GRB information has already been merged.

Step-by-step:

  • First zoom in on the area of interest
  • Consult Overpass API to get the existing GRB buildings in OSM (button)
  • Filter layers to take a cross-section (button)
  • Open JOSM
  • Open the zone (button)
  • Export Layer to JOSM
  • Edit in JOSM
  • Check the man_made structures and merge buildings with the Replace Geometry plugin in JOSM
  • Validate in JOSM (very important)
  • Upload data
  • Wait a while for Overpass to process the changeset after uploading
  • Repeat on area, filter again (existing OSM buildings will be excluded)
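The Overpass step in the list above could be approximated by a query like the one built below. This is a hedged sketch: the exact query issued by the web tool's button is not documented here, so the filter (buildings that already carry a source:geometry:oidn tag) and the function name are assumptions for illustration. The bounding box order south, west, north, east is the one Overpass QL expects.

```python
def build_overpass_query(south, west, north, east, timeout_s=25):
    """Return an Overpass QL query for buildings already tagged with a GRB oidn.

    Assumption: "existing GRB buildings" means building=* objects that carry
    a source:geometry:oidn tag. Only ids and tags are requested, not geometry.
    """
    bbox = f"{south},{west},{north},{east}"
    return (
        f"[out:json][timeout:{timeout_s}];\n"
        "(\n"
        f'  way["building"]["source:geometry:oidn"]({bbox});\n'
        f'  relation["building"]["source:geometry:oidn"]({bbox});\n'
        ");\n"
        "out tags;"
    )


if __name__ == "__main__":
    # Illustrative bounding box only.
    print(build_overpass_query(50.78, 3.16, 50.80, 3.18))
```

The printed query can be pasted into any Overpass API front end. Because Overpass needs some time to pick up new changesets, results right after an upload may still show the old state.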

Related wiki pages

License

Sources

Belgian Open data sources

Reference sources

GRB Software

Related tools

Related software

Data handling

Data Reduction and Simplification

The datasets have been cleaned of all objects that clash with their OSM counterparts. As an example, GRB uses two classifications of buildings: the main building and a non-main building. Combined with address data, we can decide with reasonable confidence which OSM category a building belongs to. (todo: gplv2: elaborate on classifications)

Overnoding (inherited from the GRB source) is also tackled in the web-based toolset, and has been studied extensively. Objects from different GRB entities do not share nodes in the PostGIS DB; their nodes are merged when the data is exported (opened in JOSM). The result therefore depends on which objects you actually export.
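To illustrate what reducing overnoding means in practice, the sketch below implements the classic Douglas-Peucker line simplification in Python. This is a swapped-in, well-known technique chosen for illustration; the algorithm actually used by the web tool and the meaning of its slider value are not documented on this page. The tolerance is in the same units as the coordinates, and all names are hypothetical.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all 2D tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment (e.g. a closed ring): distance to the point a.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify_line(points, tolerance):
    """Douglas-Peucker: drop nodes closer than `tolerance` to the simplified line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], first, last)
        if d > max_dist:
            max_dist, index = d, i
    if max_dist <= tolerance:
        return [first, last]
    # Keep the farthest node and simplify both halves recursively.
    left = simplify_line(points[: index + 1], tolerance)
    right = simplify_line(points[index:], tolerance)
    return left[:-1] + right


if __name__ == "__main__":
    # A wall with one redundant node 1 cm off the straight line (metres).
    wall = [(0.0, 0.0), (1.0, 0.01), (2.0, 0.0), (2.0, 2.0)]
    print(simplify_line(wall, tolerance=0.05))
```

A larger tolerance removes more nodes but can distort curved walls, which is presumably why manual tweaking is sometimes needed.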

Tagging Plan

Source tags

What the wiki says

Before you read further, please read the wiki page on multiple source tags, Key:source, which explains why using the source namespace is appropriate. Quoting: "If a feature has multiple tags you may want to make use of the source namespace to indicate precisely which tag your source refers to." See also Key:ref for the difference in use, and for why we use the namespace provided under source:*=*.

More notably: "Other tags exist or can be established within the source namespace." So we should definitely use this to be able to couple our buildings back to the source data. It maintains the link from source (GRB) to target (OSM) and back. The choice at time of development after careful considerations was:

Each of the chosen tags is explained in more detail below.

source:geometry:entity=*

Both address data and geometry information come from the GRB database. The entity is the layer we got the object from. It is essential, because the oidn is a unique identifier in the GRB dataset but is only unique per layer. If we omit the entity, automated validation/QA work becomes impossible, as we cannot know with certainty which source layer an object came from.

source:geometry:date=*

As far as we could determine, this date represents the last time an object was updated. We are not sure whether it is the date of measurement or of the update itself. What we do know for sure is that it changes when a building is re-measured and/or the structure is expanded or modified, and the new measurements are entered into the database. GRB has regular updates, so we can use this date to distill a list of buildings that have been updated since their entry in OSM. A further advantage is that the date is human-readable.

It is important that future mappers who look at the object get an idea of how recent the data is. It will help them avoid making edits based on older satellite pictures, aerial imagery or other data sources.

source:geometry:oidn=*

The most important one. Together with the entity we can uniquely match this object back to the source.

source:geometry:uidn=*

This one could be considered for dropping from the tag list. Whenever a change happens, this number gets bumped, which allows for automated matching. But as far as we can tell, this number and source:geometry:date=* always move together. The difference is that the date is human-readable and this is just a number. In GRB it is described as the version number of the object.
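The matching described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the toolset's code: objects are represented as plain tag dictionaries, the sample values are invented, and a building counts as outdated when its source:geometry:date differs from the date in the current GRB extract. Dates are compared for equality only, since the exact date format is not specified here.

```python
def grb_key(tags):
    """Return the (entity, oidn) pair identifying a GRB object, or None.

    The oidn is only unique per layer, so the entity must be part of the key.
    """
    entity = tags.get("source:geometry:entity")
    oidn = tags.get("source:geometry:oidn")
    if entity is None or oidn is None:
        return None
    return (entity, oidn)


def find_outdated(osm_objects, grb_objects):
    """List keys of OSM objects whose GRB counterpart has a different date."""
    grb_dates = {}
    for tags in grb_objects:
        key = grb_key(tags)
        if key is not None:
            grb_dates[key] = tags.get("source:geometry:date")
    outdated = []
    for tags in osm_objects:
        key = grb_key(tags)
        if key is None or key not in grb_dates:
            continue  # not imported from GRB, or no longer present in GRB
        if tags.get("source:geometry:date") != grb_dates[key]:
            outdated.append(key)
    return outdated


if __name__ == "__main__":
    # Invented sample data: the same oidn occurs in two different layers.
    osm = [
        {"building": "house", "source:geometry:entity": "Gbg",
         "source:geometry:oidn": "1001", "source:geometry:date": "2015-03-01"},
        {"building": "shed", "source:geometry:entity": "Gba",
         "source:geometry:oidn": "1001", "source:geometry:date": "2014-06-10"},
        {"building": "yes"},  # mapped by hand, no GRB reference
    ]
    grb = [
        {"source:geometry:entity": "Gbg", "source:geometry:oidn": "1001",
         "source:geometry:date": "2016-09-15"},
        {"source:geometry:entity": "Gba", "source:geometry:oidn": "1001",
         "source:geometry:date": "2014-06-10"},
    ]
    print(find_outdated(osm, grb))
```

Using only the oidn as key would wrongly pair the house and the shed in this sample, which is the reason the entity tag is kept.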

We decided not to put a source tag on each object, but to put one on the changeset instead. Many source references can already be found on Belgian objects in OSM (historical, addressing etc.), and adding ours to every object feels like polluting this tag unnecessarily. Since we do need to know where the data comes from, we tag the changeset.

Changeset Tags

Required on the changeset!

source=GRB


Other import cases

In order to get a good case together for this semi-auto import work, we can learn from previous requests:


Mailing list

TODO

Previous cases show the issues many people have with fully automated approaches. Our plan to do this in small pieces is therefore sound, but it needs explaining. We need to address:

  • what the data is
  • what the license is (is it OSM ready ?)
  • procedures (Glenn will do this)
  • detailed tag explanations
  • document use cases (Glenn)
