Spanish Cadastre/Buildings Import/Data Conversion/Problems

From OpenStreetMap Wiki

This page describes the problems encountered during data conversion for this buildings import, and their solutions.

Data simplification

Issues related to the reduction of the data set.

Buildings with multipart geometries

The Building data set contains two types of geometries.

  • Polygons: Formed by a list of rings. The first ring is the building footprint; the following ones are inner rings that correspond to holes in the building. If there is only one ring, the building corresponds to an OSM way. If there are several rings, the result is an OSM multipolygon relation in which the first way has the outer role and the others have the inner role. These relations are necessary to represent the holes, and there aren't many of them.
  • Multipolygon: Formed by several polygons. In OSM they correspond to relations with several outer ways. This occurs because there are buildings with the same Cadastral reference value (field 'localId'). As this field is not going to be imported, there is no need to keep these relations in OSM; they would only increase the number of relations to be imported.
Multipolygon relation with many outer ways

For more information, see the Wikipedia article on WKT.

Solution: Buildings with multipart geometry are separated by this algorithm.
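The separation described above can be sketched as follows. This is only an illustration, not the CatAtom2Osm code: the feature layout (a dictionary with hypothetical "attrs" and "polygons" keys) and both function names are assumptions made for the example.

```python
from typing import Dict, List, Tuple

Ring = List[Tuple[float, float]]
Polygon = List[Ring]  # first ring is the footprint, the rest are holes


def split_multipart(features: List[Dict]) -> List[Dict]:
    """Return one feature per polygon, copying the attributes of the original."""
    result = []
    for feature in features:
        for polygon in feature["polygons"]:
            result.append({"attrs": dict(feature["attrs"]), "polygons": [polygon]})
    return result


def osm_element_type(polygon: Polygon) -> str:
    """A polygon with a single ring maps to a way, otherwise to a relation."""
    return "way" if len(polygon) == 1 else "multipolygon relation"
```

After splitting, only polygons with holes still need a relation; every other building becomes a plain closed way.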

Building parts below ground level

The building parts layer contains geometries whose fields include 'numberOfFloorsAboveGround' and 'numberOfFloorsBelowGround'.

There are parts of buildings that do not have levels above the ground. These are generally parts that lie outside the footprint of the building or that correspond to elements such as stairs. If the numbers of levels above and below ground are both zero, the part corresponds to a porch.

Building parts entirely underground.

Solution: During tag transformation, building:part=roof is assigned to the building parts without levels above and below ground ('numberOfFloorsAboveGround' = 'numberOfFloorsBelowGround' = 0). The algorithm for building parts outside the building footprint removes the remaining building parts below ground.
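As a rough sketch of this rule, the classification of a part by its two level values could look like this. The function name and the three labels are hypothetical; only the building:part=roof tag and the two field names come from the description above.

```python
def classify_part(floors_above: int, floors_below: int) -> str:
    """Classify a building part from its two level attributes.

    floors_above stands for 'numberOfFloorsAboveGround' and
    floors_below for 'numberOfFloorsBelowGround'.
    """
    if floors_above == 0 and floors_below == 0:
        return "roof"  # tagged building:part=roof (porch-like element)
    if floors_above == 0:
        return "underground"  # entirely below ground, removed later
    return "regular"
```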

Reduction of building parts

In the data set of building parts, only two attributes are imported, the number of levels above and below ground. The set contains buildings with adjacent parts that have the same level values. Their number can be reduced by merging the geometries of these parts into one.
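A minimal sketch of how mergeable parts might be found is shown below. It assumes that adjacent parts share an edge with identical vertex coordinates, and it only groups the parts; computing the merged geometry is left out. The "ring" and "levels" keys and the function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float]


def ring_edges(ring: List[Point]) -> Set[frozenset]:
    """Undirected edges of a closed ring (first point repeated at the end)."""
    return {frozenset((a, b)) for a, b in zip(ring, ring[1:])}


def group_mergeable_parts(parts: List[Dict]) -> List[List[int]]:
    """Group indices of parts that share an edge and have equal level values."""
    parent = list(range(len(parts)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = [ring_edges(p["ring"]) for p in parts]
    for i in range(len(parts)):
        for j in range(i + 1, len(parts)):
            same_levels = parts[i]["levels"] == parts[j]["levels"]
            if same_levels and edges[i] & edges[j]:
                parent[find(j)] = find(i)

    groups: Dict[int, List[int]] = {}
    for i in range(len(parts)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```

Each returned group with more than one index is a candidate for merging into a single part.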

Another strategy to further reduce the data is to delete those building parts whose levels match the maximum and minimum values for the building. These values can be transferred to the building footprint, and the parts, as they contain no other information, become redundant and can be deleted. In these cases, the building parts will not completely cover the footprint.

The resulting 3D scheme worked correctly (2017) on some renderers, but not on others.

Scheme of 3D buildings not fully overlapped by building parts

This simplification was abandoned as of March 2021 (from CatAtom2Osm 1.3) to better comply with the Simple 3D Buildings scheme, which requires the entire building outline to be filled with building part areas. However, when the number of levels is the same throughout the building, no parts are added (there would be only one, and it would coincide with the outline).

The CatAtom3Dfix tool has been developed to correct the data already imported.

Solution: Algorithm for reduction of building parts.

Detection of swimming pools inside buildings

Some pools may be located within the building footprint. In these cases, the Cadastre data contains two entries with identical geometry, one in the pool data set and another in the building parts data set. The building part with the same shape as the swimming pool is used to define the cavity that holds the water on the roof of the building. If the tags location=roof and layer=1 are used on the pool, the matching building part is redundant and can be deleted. The inner rings of building geometries that are equal to the swimming pool can also be deleted. If the entire building is equal to the swimming pool, the reviewed cases show that these are false buildings, which can also be deleted.

Sometimes, these swimming pools are not located on the roof of the building, but inside. There is no way to identify these cases automatically and they are left for manual review.

Solution: Algorithm for detection of swimming pools inside buildings.
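The matching of pools against building parts with identical geometry could be sketched as below. This is an illustration under assumptions, not the project's algorithm: rings are compared after normalizing their start vertex and direction, the dictionary keys are hypothetical, and inner rings and false buildings are not handled.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def normalize_ring(ring: List[Point]) -> Tuple[Point, ...]:
    """Canonical form of a ring, independent of start vertex and direction."""
    pts = ring[:-1] if ring[0] == ring[-1] else list(ring)
    start = pts.index(min(pts))
    forward = pts[start:] + pts[:start]
    backward = [forward[0]] + forward[:0:-1]
    return tuple(min(forward, backward))


def process_pools(pools: List[Dict], parts: List[Dict]) -> List[Dict]:
    """Tag pools that match a building part and return the parts to keep."""
    part_shapes = {normalize_ring(p["ring"]) for p in parts}
    matched = set()
    for pool in pools:
        shape = normalize_ring(pool["ring"])
        if shape in part_shapes:
            pool["tags"].update({"location": "roof", "layer": "1"})
            matched.add(shape)
    return [p for p in parts if normalize_ring(p["ring"]) not in matched]
```

Pools that are actually inside the building rather than on its roof would receive the same tags here, which is why those cases still need manual review.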

Building parts outside the footprint

In the 3D model in use, the parts of a building must be contained within the building's outline. However, in the Cadastre data we can find parts outside the footprint, even after removing the underground parts. The image shows an example: the contours of two buildings are in green and the external parts in red, seen from the front on the left and from above on the right.

Buildings with parts outside the footprint

They correspond to building parts on sloping ground that are below ground on one side and above ground on the other. There are two possible solutions. The first is to expand the footprint of the building to encompass the external attached parts, and to promote a part to a building when it is not attached to the footprint. The second is to exclude these parts from the import.

You can also find 'orphan' building parts, that is, parts with no associated building.

Building with a part outside its footprint and not contiguous

Solution: The algorithm for building parts outside the building footprint deletes the parts that lie outside the footprint of their building, if that building exists. If it finds parts without an associated building, it does not delete them and generates an outline from the union of the parts.
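The control flow of this solution can be sketched as follows. Several things here are assumptions: parts are linked to buildings through a hypothetical "ref" key, and the containment test is a crude bounding-box comparison standing in for a real geometric 'within' test, which can be passed in through the is_within parameter.

```python
def bbox(ring):
    """Bounding box (min x, min y, max x, max y) of a ring."""
    xs, ys = [p[0] for p in ring], [p[1] for p in ring]
    return min(xs), min(ys), max(xs), max(ys)


def bbox_within(inner, outer):
    """Crude stand-in for a real 'part within footprint' geometry test."""
    ax1, ay1, ax2, ay2 = bbox(inner)
    bx1, by1, bx2, by2 = bbox(outer)
    return bx1 <= ax1 and by1 <= ay1 and ax2 <= bx2 and ay2 <= by2


def filter_parts(parts, footprints, is_within=bbox_within):
    """Split parts into those kept inside a footprint and orphan parts."""
    kept, orphans = [], []
    for part in parts:
        footprint = footprints.get(part["ref"])
        if footprint is None:
            orphans.append(part)  # not deleted; an outline is built from them
        elif is_within(part["ring"], footprint):
            kept.append(part)
        # otherwise the part is outside its building footprint and is dropped
    return kept, orphans
```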

Duplicated nodes

In most cases, the Cadastre building layer has an estimated accuracy of 0.1 meters. We can find closely spaced nodes separated by a few centimeters. They can be consecutive in the same geometry or belong to two adjacent buildings. Their existence can cause topological errors and complicates the resolution of both this problem and that of unneeded nodes.

  • Duplicated nodes.
  • They can produce topological errors.

Solution: algorithm to add topological nodes and simplify duplicate nodes.
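One simple way to merge such nodes is to snap every vertex to the first vertex already seen within a tolerance, so that adjacent geometries end up sharing identical coordinates. The sketch below assumes projected coordinates in meters and a caller-chosen tolerance; it is quadratic in the number of nodes and is not the project's implementation.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def merge_close_nodes(rings: List[List[Point]], tolerance: float) -> List[List[Point]]:
    """Snap every vertex to the first vertex seen within `tolerance`."""
    representatives: List[Point] = []

    def snap(p: Point) -> Point:
        for q in representatives:
            if math.dist(p, q) <= tolerance:
                return q
        representatives.append(p)
        return p

    result = []
    for ring in rings:
        snapped = [snap(p) for p in ring]
        # Drop consecutive repeats created by snapping.
        cleaned = [p for i, p in enumerate(snapped) if i == 0 or p != snapped[i - 1]]
        result.append(cleaned)
    return result
```

A spatial index would be needed to make this practical for a whole municipality.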

Unneeded nodes

The Cadastre data contains an excessive number of nodes for each geometry.

Too many nodes in a straight line.

Solution: An algorithm to simplify geometries is used to identify and delete them.
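A minimal sketch of one possible simplification is to drop every vertex that lies on the straight line through its two neighbours, using a cross product test. The tolerance value is an assumption and depends on the coordinate units; a real implementation would also have to preserve nodes shared with neighbouring geometries.

```python
def remove_collinear(ring, tolerance=1e-9):
    """Drop vertices of a closed ring that lie on the line through their neighbours."""
    pts = ring[:-1]  # closed ring: the last point repeats the first
    kept = []
    n = len(pts)
    for i in range(n):
        (ax, ay), (bx, by), (cx, cy) = pts[i - 1], pts[i], pts[(i + 1) % n]
        # Cross product of segments a->b and b->c; zero means collinear.
        cross = (bx - ax) * (cy - by) - (by - ay) * (cx - bx)
        if abs(cross) > tolerance:
            kept.append(pts[i])
    return kept + kept[:1]  # close the ring again
```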

Error correction

Issues related to data quality.

Topological errors

The Cadastre data may contain topological errors. These can result in geometries that overlap instead of being adjacent.

  • Duplicate red crosses indicate segments with topological errors.
  • Overlapping buildings.

Solution: algorithm to simplify geometries.

Vertices with too low angle

The original data contains geometries with vertices that form too small an angle with their adjacent vertices. They look like the result of computing the difference between two polygons whose segments should be adjacent but are topologically incorrect. This problem is not detected by the JOSM validation tests.

Vertex with too small angle.

Another variant consists of two consecutive vertices with small angles (a 'zig-zag').

Vertices in zig-zag.

Solution: The algorithm to delete invalid geometries also deals with this problem.
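Detecting such vertices amounts to measuring the angle between the two segments that meet at each vertex. The sketch below flags vertices whose angle is under a threshold; the 5 degree default is a hypothetical value chosen for the example, not the one used by the import tool.

```python
import math


def vertex_angle(a, b, c):
    """Angle in degrees (0 to 180) at b between segments b-a and b-c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    return abs(math.degrees(math.atan2(cross, dot)))


def sharp_vertices(ring, min_angle=5.0):
    """Vertices of a closed ring whose angle is below `min_angle` degrees."""
    pts = ring[:-1]  # closed ring: the last point repeats the first
    n = len(pts)
    return [
        pts[i]
        for i in range(n)
        if vertex_angle(pts[i - 1], pts[i], pts[(i + 1) % n]) < min_angle
    ]
```

In a zig-zag, two consecutive vertices would both be returned by sharp_vertices.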

Junk geometries

The original data contains buildings or parts that seem to be the result of computing the difference between polygons whose sections should be adjacent but are topologically incorrect. This problem is not detected by the JOSM validation tests.

Geometry with too low area.

Solution: The algorithm to delete invalid geometries deletes geometries that do not pass the GEOS validation tests and marks with a warning those whose area is below a threshold.
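The area check can be illustrated with the shoelace formula. This sketch assumes projected coordinates in meters and a caller-supplied threshold; it does not reproduce the GEOS validity tests, only the warning for small areas.

```python
def ring_area(ring):
    """Unsigned area of a closed ring computed with the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def small_geometries(rings, min_area):
    """Indices of the rings whose area is below `min_area`."""
    return [i for i, ring in enumerate(rings) if ring_area(ring) < min_area]
```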

Data management

Issues related to the import preparation.

Split of data in tasks

It is not advisable to upload large amounts of data to the OSM servers at one time. That is why it is necessary to divide the Cadastre data, which covers an entire municipality, into smaller fractions or tasks. It is proposed to use the task manager to create projects for the data to be imported.

We can use the <CadastralZoning> elements to split the data in tasks. These elements can be of two types according to the value of the field <cp:levelName>:

  • Polygons: They form a complete partition of the municipality. That is, the union of all of them is equal to the area of the municipality without any polygon overlapping another. The buildings contained in 'Polygons' but not in 'Blocs' correspond to the Rustic Cadastre.
  • Blocs: They cover the main populated places. Each area covers a group of buildings surrounded by thoroughfares. They neither overlap nor are adjacent to each other, but they are contained in the 'Polygons'. The buildings contained in 'Blocs' correspond to the Urban Cadastre.

It is important that the ways contained in each task do not share nodes with the ways contained in another task. Otherwise, before uploading a task, it would be necessary to check for matches with the existing nodes and merge them to avoid duplicates. For this reason, adjacent blocs are merged before using them.

  • Example of the Cadastral Zoning data set.
  • Detail of the zones.

Solution: First it is necessary to generate files in which each task is represented by an area. Then the buildings are split into one OSM file per task.
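The split into tasks could be sketched as below. How a building is matched to a zone is not stated here, so this example assumes the mean of the building's vertices is tested against each zone with a ray casting test, trying 'Blocs' before 'Polygons'. All function names and the dictionary layout are hypothetical.

```python
def point_in_ring(point, ring):
    """Ray casting test; points exactly on the boundary get no special handling."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def mean_of_vertices(ring):
    """Average of the vertices of a closed ring (a rough centre point)."""
    pts = ring[:-1]
    return (sum(p[0] for p in pts) / len(pts), sum(p[1] for p in pts) / len(pts))


def assign_tasks(buildings, blocs, polygons):
    """Map each zone label to the list of building ids that fall inside it."""
    tasks = {}
    for building_id, ring in buildings.items():
        point = mean_of_vertices(ring)
        label = None
        for zones in (blocs, polygons):  # urban 'Blocs' take priority
            label = next((k for k, z in zones.items() if point_in_ring(point, z)), None)
            if label is not None:
                break
        tasks.setdefault(label, []).append(building_id)
    return tasks
```

Buildings that fall in no zone end up under the None key, so they can be reviewed rather than silently lost.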

Thoroughfare names correction

Thoroughfare names in the addresses dataset are in capital letters and without accent marks, and they contain abbreviations and information that does not belong to the name. It is necessary to correct the names according to the normalization rules (es). In addition, the name of the thoroughfare in the Cadastre data may contain errors or discrepancies with respect to the information collected on the ground.

Solution: Process for thoroughfare names conflation.
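A first automatic pass over the names might look like the sketch below. The abbreviation table and the list of lowercase words are small illustrative subsets assumed for the example, not the normalization rules themselves, and missing accent marks cannot be restored this way, which is one reason a conflation step against other sources is still needed.

```python
# Illustrative subsets only; the real rules are more extensive.
ABBREVIATIONS = {"CL": "Calle", "AV": "Avenida", "PZ": "Plaza"}
LOWERCASE_WORDS = {"de", "del", "la", "las", "el", "los", "y"}


def normalize_name(raw: str) -> str:
    """Expand a leading abbreviation and fix the capitalization of a name."""
    words = raw.split()
    if not words:
        return ""
    result = [ABBREVIATIONS.get(words[0], words[0].capitalize())]
    for word in words[1:]:
        lower = word.lower()
        result.append(lower if lower in LOWERCASE_WORDS else lower.capitalize())
    return " ".join(result)
```

For example, "CL JUAN RUMEU GARCIA" becomes "Calle Juan Rumeu Garcia", still lacking the accent of "García".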

Problems with addresses

Problemas direcciones postales.png

This image shows some of the problems we can find with the addresses. It is a combination of the OSM map, Cartociudad portal numbers (the small labels) and the converted Cadastral data for buildings and addresses. The addresses of type "entrance" show the icon of a door and those of type "parcel" a blue icon of the number plate.

  • The address nodes are displaced with respect to the footprint of the building or the parcel contour.
  • Addresses of type "entrance" and "parcel" are scattered unevenly.
  • Addresses displaced with respect to their correct position. Some examples have been pointed out with arrows.
  • Buildings with several addresses. The buildings between Juan Rumeu García and Rafael Arocha Guillama streets and between the previous one and Lorenzo García del Castillo street have access and a portal number from two streets. This is not a problem if both addresses are of the "entrance" type and are correctly located. This problem is not exclusive to Cadastre, it can also occur in Cartociudad.
  • If the program is intended to move the entrances to the nearest point on the footprint of the building, it must be borne in mind that in some cases the buildings are inscribed within the parcel, as with chalet-like houses or buildings surrounded by gardens. This is the case for the buildings located in the upper part of the previous image. If the parcel has private access with a barrier (wall, hedge, fence), the entrance node must be located on the contour of the parcel, which will not be imported [1]. The program will not move to the contour of the building those addresses located farther away than a threshold.

Solution: Addresses are assigned to different elements according to this table:

Summary of address placement
<AD:specification> | No. of buildings | Address position | Notes
parcel | 0 | N/A | Not imported
parcel | 1 | closed way or relation | Address tags on the building footprint
parcel | > 1 | N/A | Not imported
entrance | 0 | N/A | Not imported
entrance | >= 1 | node | Move the entrance node to the nearest building footprint only if the distance is under a threshold and the new position is not a building corner; otherwise not imported.
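The placement rules in the table can be sketched as two small functions. This is a hedged illustration: the function names, the returned labels and the threshold parameter are assumptions, and the footprint is taken to be a single closed ring in projected coordinates.

```python
import math


def address_target(specification, n_buildings):
    """Element that receives the address, or None when it is not imported."""
    if specification == "parcel":
        return "building" if n_buildings == 1 else None  # closed way or relation
    if specification == "entrance":
        return "node" if n_buildings >= 1 else None
    return None


def nearest_point_on_ring(point, ring):
    """Closest point to `point` on the boundary of a closed ring, and its distance."""
    px, py = point
    best, best_dist = None, math.inf
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        dx, dy = x2 - x1, y2 - y1
        length_sq = dx * dx + dy * dy
        t = 0.0 if length_sq == 0 else ((px - x1) * dx + (py - y1) * dy) / length_sq
        if t <= 0.0:
            candidate = (x1, y1)  # projection falls before the segment start
        elif t >= 1.0:
            candidate = (x2, y2)  # projection falls past the segment end
        else:
            candidate = (x1 + t * dx, y1 + t * dy)
        dist = math.dist(point, candidate)
        if dist < best_dist:
            best, best_dist = candidate, dist
    return best, best_dist


def place_entrance(point, ring, threshold):
    """New position of an entrance node, or None when it is not imported."""
    candidate, dist = nearest_point_on_ring(point, ring)
    if dist > threshold or candidate in ring:
        return None  # too far away, or the new position is a building corner
    return candidate
```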