Spanish Cadastre/Buildings Import/Data Conversion/Software/Specifications

From OpenStreetMap Wiki

This page describes the solutions to some of the problems faced by the software developed for this Buildings Import.

Data transformation

Algorithm to split multipart buildings

Motivation: buildings and other datasets with multipart geometries.


  1. Iterate over each building.
  2. If its geometry is of type WKBMultiPolygon:
  3. Add a new building for each polygon, copying the original fields.
  4. Delete the old building.

From this point on, the 'localId' field is no longer a unique identifier for buildings.
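
The steps above can be sketched in plain Python. This is only an illustration and not the code of layer.PolygonLayer.explode_multi_parts: the dictionary layout of each building (a GeoJSON-like 'geometry' entry) is an assumption made for the example.

```python
def explode_multi_parts(buildings):
    """Return a new list where every multipart building is split.

    Each building is a dict with a GeoJSON-like 'geometry' entry
    (hypothetical layout, used only for this sketch).
    """
    result = []
    for building in buildings:
        geom = building['geometry']
        if geom['type'] == 'MultiPolygon':
            # One new building per polygon, copying the original fields.
            for polygon in geom['coordinates']:
                single = dict(building)
                single['geometry'] = {'type': 'Polygon', 'coordinates': polygon}
                result.append(single)
            # The multipart original is not copied over, so it is dropped.
        else:
            result.append(building)
    return result


square = [[(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]]
other = [[(5, 5), (6, 5), (6, 6), (5, 6), (5, 5)]]
buildings = [
    {'localId': 'A',
     'geometry': {'type': 'MultiPolygon', 'coordinates': [square, other]}},
    {'localId': 'B', 'geometry': {'type': 'Polygon', 'coordinates': square}},
]
exploded = explode_multi_parts(buildings)
print([b['localId'] for b in exploded])  # ['A', 'A', 'B']
```

The repeated 'A' in the output shows why 'localId' stops being unique after this step.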

Implementation: layer.PolygonLayer.explode_multi_parts

Building parts outside the building footprint

Motivation: To delete building parts below ground level and building parts outside the building footprint, and to create a footprint for parts without an associated building.


  1. Select parts with 'numberOfFloorsAboveGround' = 0.
  2. Delete them.
  3. For each feature in this layer:
  4. If it is a building part:
    1. If it has levels under ground ('numberOfFloorsBelowGround' < 0) but not above ground ('numberOfFloorsAboveGround' = 0), drop it.
    2. If it has an associated building and is not inside it, drop it. A part has an associated building if there is a building with the value of its cadastral reference in the 'base:localId' field.
    3. If it does not have an associated building, add it to a dictionary grouped by cadastral reference.
  5. With this dictionary of parts without an associated building, generate for each cadastral reference the building footprint by merging all of its parts.
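
A minimal sketch of the classification in steps 1 to 4, assuming simplified data: geometries are axis-aligned rectangles (minx, miny, maxx, maxy), each cadastral reference has a single footprint, and the keys 'ref', 'above' and 'geom' are hypothetical names. The final merge of orphan parts into a footprint needs a real geometry library and is not shown.

```python
from collections import defaultdict


def box_within(inner, outer):
    """True if rectangle inner = (minx, miny, maxx, maxy) lies inside outer."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def classify_parts(buildings, parts, within=box_within):
    """Split parts into kept, dropped and orphans grouped by reference."""
    footprints = {b['ref']: b['geom'] for b in buildings}
    kept, dropped = [], []
    orphans = defaultdict(list)
    for part in parts:
        if part['above'] == 0:
            dropped.append(part)            # nothing above ground level
        elif part['ref'] in footprints:
            if within(part['geom'], footprints[part['ref']]):
                kept.append(part)
            else:
                dropped.append(part)        # outside its building footprint
        else:
            orphans[part['ref']].append(part)  # no associated building
    return kept, dropped, dict(orphans)


buildings = [{'ref': 'X1', 'geom': (0, 0, 10, 10)}]
parts = [
    {'ref': 'X1', 'above': 2, 'geom': (1, 1, 5, 5)},      # kept
    {'ref': 'X1', 'above': 0, 'geom': (1, 1, 5, 5)},      # below ground only
    {'ref': 'X1', 'above': 1, 'geom': (8, 8, 12, 12)},    # outside footprint
    {'ref': 'Y2', 'above': 1, 'geom': (20, 20, 22, 22)},  # orphan
    {'ref': 'Y2', 'above': 3, 'geom': (22, 20, 24, 22)},  # orphan
]
kept, dropped, orphans = classify_parts(buildings, parts)
```

Each list in 'orphans' holds the parts that would be merged into a new footprint for that cadastral reference.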

Implementation: layer.ConsLayer.remove_outside_parts

Reduction of building parts

Motivation: Reduction of building parts.


  1. For each building and its parts (matched by their cadastral references):
  2. Calculate the minimum and maximum values of levels under and above ground in the building parts.
  3. Assign these values to the building footprint.
  4. For each pair of different values of levels under and above ground in the parts of the building, and for each set of parts with these values:
  5. If these values are equal to the values calculated for the building footprint, delete the parts.
  6. Otherwise, merge the adjacent parts.

Implementation: layer.ConsLayer.merge_building_parts

Detection of swimming pools inside buildings

Motivation: Detection of swimming pools inside buildings.


  1. For each building and its parts (matched by their cadastral references):
  2. For each swimming pool in the same parcel (matched by their cadastral references):
  3. If the pool is inside a building, assign it the tag layer=1.
  4. If the building footprint coincides with the pool, delete the building.
    1. Otherwise:
    2. Delete the inner rings of the building geometry that coincide with the pool.
    3. Delete the parts of the building that coincide with the pool.

Implementation: layer.ConsLayer.merge_building_parts

Invalid geometries

Motivation: To delete junk geometries and vertices with too small an angle.


  • 'acute_inv' = 5°
  • 'min_area' = 0.05 m²
  • 'dist_inv' = 10 cm
  • 'straight_thr' = 2°


[Figure: Acute angles.svg]
  1. For each geometry 'geom' in this layer:
  2. For each ring with index 'i' in the geometry:
  3. For each vertex 'v' in the ring:
  4. If the angle of the vertex is acute ('angle_v' < 'acute_inv'):
    1. Try to delete the vertex.
    2. If the resulting geometry is not valid or its area is less than 'min_area':
      1. If this is the outermost ring (i == 0), the geometry is invalid; drop it.
      2. If this is an inner ring (i > 0), delete the ring from the geometry.
    3. Otherwise, if the ring has more than four vertices:
      1. Take the angle 'angle_a' at the nearest adjacent vertex ('va').
      2. Take the distance 'c' from 'va' to the segment from 'v' to the farther adjacent vertex ('vb').
      3. If 'va' is acute ('angle_a' < 'acute_inv') and passes the filter ('c' < 'dist_inv'), 'v' and 'va' form a zig-zag; delete them.
      4. If 'va' is not straight (|180 - angle_a| < 'straight_thr') and passes the filter ('c' < 'dist_inv'), this is a spike vertex.
        1. Take the point 'vx' formed by projecting the previous segment of 'va' onto the segment 'v'-'vb' (see the figure).
        2. Delete 'v'.
        3. Move 'va' to 'vx' and record this movement in a dictionary 'va'->'vx'.
  5. If there are remaining movements, iterate over each geometry.
  6. If there is a vertex equal to 'va', move it to 'vx'.
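
The detection in step 4 can be sketched as follows. This covers only the acute-angle test, not the zig-zag and spike repairs, and is not the code of delete_invalid_geometries. Rings are assumed to be closed lists of (x, y) tuples in metres; the function names are hypothetical.

```python
import math

ACUTE_INV = 5.0  # degrees, the 'acute_inv' parameter listed above


def vertex_angle(prev, v, nxt):
    """Angle in degrees (0..180) at v between segments v-prev and v-nxt."""
    a1 = math.atan2(prev[1] - v[1], prev[0] - v[0])
    a2 = math.atan2(nxt[1] - v[1], nxt[0] - v[0])
    angle = abs(math.degrees(a1 - a2)) % 360
    return 360 - angle if angle > 180 else angle


def acute_vertices(ring, acute_inv=ACUTE_INV):
    """Indices of the vertices of a closed ring with angle < acute_inv."""
    pts = ring[:-1]  # drop the repeated closing vertex
    n = len(pts)
    return [i for i in range(n)
            if vertex_angle(pts[i - 1], pts[i], pts[(i + 1) % n]) < acute_inv]


# A 10 m square with a 20 m long, 2 cm wide spike on its top side.
ring = [(0, 0), (10, 0), (10, 10), (5.01, 10), (5, 30), (4.99, 10),
        (0, 10), (0, 0)]
print(acute_vertices(ring))  # [4], the tip of the spike
```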

Implementation: layer.PolygonLayer.delete_invalid_geometries

Algorithm to add topological nodes and simplify duplicate nodes

Motivation: Duplicated nodes and topological errors.


  • This problem occurs in buildings, building parts and swimming pools alike. Its resolution is interdependent: a node of a building may not be duplicated at first, but it is once the other datasets are considered. For this reason, the elements of the three datasets must be copied into the same data warehouse. From now on, 'element' refers to buildings, their parts and other constructions (pools).
  • The coordinates are kept in the original UTM projection to simplify the distance calculation.


  • 'dup_thr' = 1.2 cm. This value was established by testing, taking into account that JOSM draws coordinates with arbitrary precision, whereas its duplicated-nodes validation rounds coordinates to the seventh decimal place. An angle of 10⁻⁷ degrees is about 0.011 meters (1.1 cm) in latitude.
  • 'dist_thr' = 2 cm.
  • 'straight_thr' = 2°.
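
The figure quoted for 'dup_thr' can be checked with a short calculation. The sketch below assumes a spherical Earth with the mean radius of 6,371 km, which is an approximation.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius (spherical approximation)

step_deg = 1e-7  # seventh decimal place of a coordinate in degrees
step_m = math.radians(step_deg) * EARTH_RADIUS_M  # arc length along a meridian
print(round(step_m * 100, 2), "cm")  # 1.11 cm
```

The result is just below the 1.2 cm chosen for 'dup_thr', so two nodes closer than the threshold may round to the same seventh-decimal coordinate.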


  1. For each geometry 'geom' in this layer:
  2. For each vertex 'point' in the outermost ring:
  3. If these coordinates have not been visited previously:
  4. Search for geometries intersecting a bounding box of radius 'dist_thr' around 'point'.
  5. For each candidate geometry:
    1. Search for the vertex nearest to 'point' in the candidate geometry.
    2. If the distance to the nearest vertex ('dist_v') is 0, it belongs to 'geom'.
      1. If either of the adjacent vertices 'va' and 'vb' is nearer than 'dup_thr', delete it.
    3. If 0 < 'dist_v' < 'dup_thr', it does not belong to 'geom'. Move it to the position of 'point'.
    4. If 'dist_v' > 'dup_thr', it does not belong to 'geom'. 'point' is a topological point if it fulfills these conditions:
      1. The distance from 'point' to the nearest segment in the candidate geometry is below 'dist_thr', and the point of this segment nearest to 'point' is not one of its ends.
      2. The angle formed by 'point' with the ends of the segment is straight (its difference from 180° is below 'straight_thr').
      3. Inserting it as a vertex in the candidate geometry does not produce an invalid geometry.
    5. If 'point' is a topological point, insert it as a new vertex in the candidate geometry.
  • Topological error not corrected.
  • Topological error corrected.
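
Conditions 1 and 2 of the topological point test can be sketched for a single segment. This is an illustration, not the code of layer.PolygonLayer.topology: the function name is hypothetical, coordinates are assumed to be in metres, and neither the 'dist_v' > 'dup_thr' precondition nor the validity check of condition 3 is covered.

```python
import math

DIST_THR = 0.02     # metres (2 cm)
STRAIGHT_THR = 2.0  # degrees


def is_topological_point(point, seg_a, seg_b,
                         dist_thr=DIST_THR, straight_thr=STRAIGHT_THR):
    """Check the distance and straight-angle conditions against one segment."""
    px, py = point
    ax, ay = seg_a
    bx, by = seg_b
    dx, dy = bx - ax, by - ay
    length2 = dx * dx + dy * dy
    if length2 == 0:
        return False  # degenerate segment
    # Parameter of the projection of 'point' on the segment (0 at a, 1 at b).
    t = ((px - ax) * dx + (py - ay) * dy) / length2
    if not 0 < t < 1:
        return False  # the nearest point is one of the ends
    nx, ny = ax + t * dx, ay + t * dy
    if math.hypot(px - nx, py - ny) >= dist_thr:
        return False  # too far from the segment
    # Angle at 'point' formed with both ends of the segment.
    a1 = math.atan2(ay - py, ax - px)
    a2 = math.atan2(by - py, bx - px)
    angle = abs(math.degrees(a1 - a2)) % 360
    if angle > 180:
        angle = 360 - angle
    return abs(180 - angle) < straight_thr


a, b = (0.0, 0.0), (10.0, 0.0)
on_wall = is_topological_point((5.0, 0.01), a, b)        # 1 cm off mid-segment
too_far = is_topological_point((5.0, 0.5), a, b)         # 50 cm away
past_end = is_topological_point((10.01, 0.0), a, b)      # projects beyond b
near_corner = is_topological_point((0.005, 0.015), a, b) # angle far from 180°
```

Only the first case passes; the last one is close enough to the segment but fails the straight-angle condition.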

Implementation: layer.PolygonLayer.topology

Algorithm to simplify geometries

Motivation: Unneeded nodes. An external simplification tool does not give all the control we would like, nor does it allow refining the results, so a custom algorithm was developed.

We intend to eliminate excessive vertices in straight lines, such as the red vertices in the image.

Excessive nodes in a straight line.

To do this, each vertex of a geometry is examined. The vertex is a candidate to be eliminated if it is not a corner. It is considered that it is not a corner if the angle it forms with the previous and next vertices does not differ much from 180° (it is straight).

The red node is a candidate to be deleted.

The straight angle condition is not enough. If the distance between the previous and next vertices is large enough, the cathetus can measure meters even though the angle is almost straight, and removing the vertex would modify the building excessively. The cathetus is the shortest distance between the vertex and the line joining the previous and next vertices.

Removing this node causes a displacement of several meters.

A vertex cannot be deleted if another geometry has a vertex in the same position, because that would lead to topological errors. In the following image the green nodes can be removed, but the red ones cannot, because they belong to the parts of the building (light gray).

Node belonging to several geometries.

We have to check the angle in all the geometries that have a node in that position. In the previous image the green nodes belong to two geometries: the footprint of the building and its parts.


  • The elements of the buildings, building parts and other constructions (pools) datasets must be in the same data warehouse.
  • Duplicate nodes have been removed.
  • Topological nodes have been added.
  • The coordinates are maintained in the original UTM projection to simplify the distance calculation.



  1. For each node 'point' in the layer:
  2. Check whether 'point' is a corner in any of the geometries that have a node at the same coordinates.
  3. If it is not a corner in any of them, 'point' can be deleted from all those geometries.
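
A minimal single-pass sketch of these steps, assuming closed rings given as lists of (x, y) tuples in metres with exactly equal coordinates for shared nodes. The 2° angle threshold is borrowed from the previous sections and the cathetus threshold 'CATH_THR' is a hypothetical value; neither is stated for this algorithm. Rings that would end up with fewer than three vertices are not handled.

```python
import math

STRAIGHT_THR = 2.0  # degrees, assumed equal to 'straight_thr' used elsewhere
CATH_THR = 0.5      # metres, hypothetical limit for the cathetus


def must_keep(prev, v, nxt, straight_thr=STRAIGHT_THR, cath_thr=CATH_THR):
    """True if v is a corner or removing it would move the outline too much."""
    a1 = math.atan2(prev[1] - v[1], prev[0] - v[0])
    a2 = math.atan2(nxt[1] - v[1], nxt[0] - v[0])
    angle = abs(math.degrees(a1 - a2)) % 360
    if angle > 180:
        angle = 360 - angle
    base = math.hypot(nxt[0] - prev[0], nxt[1] - prev[1])
    if base == 0:
        return True  # degenerate neighbourhood, keep the vertex
    cross = ((nxt[0] - prev[0]) * (v[1] - prev[1])
             - (nxt[1] - prev[1]) * (v[0] - prev[0]))
    cathetus = abs(cross) / base  # distance from v to the line prev-nxt
    return abs(180 - angle) > straight_thr or cathetus > cath_thr


def simplify(rings):
    """Delete nodes that are not needed in any ring sharing their position."""
    needed = set()
    for ring in rings:
        pts = ring[:-1]  # drop the repeated closing vertex
        n = len(pts)
        for i, v in enumerate(pts):
            if must_keep(pts[i - 1], v, pts[(i + 1) % n]):
                needed.add(v)
    result = []
    for ring in rings:
        pts = [v for v in ring[:-1] if v in needed]
        result.append(pts + pts[:1])  # close the ring again
    return result


footprint = [(0, 0), (2.5, 0), (5, 0), (10, 0), (10, 4), (5, 4), (0, 4), (0, 0)]
part1 = [(0, 0), (2.5, 0), (5, 0), (5, 4), (0, 4), (0, 0)]
part2 = [(5, 0), (10, 0), (10, 4), (5, 4), (5, 0)]
simple_footprint, simple_part1, simple_part2 = simplify([footprint, part1, part2])
```

In this example the node (2.5, 0) is straight in both the footprint and the first part, so it is removed from both. The nodes (5, 0) and (5, 4) are straight in the footprint but are corners of the parts, so they are kept everywhere.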

Implementation: layer.PolygonLayer.simplify

Operations on data

Algorithm to generate definition files for the task manager

Motivation: Creating projects in the task manager to split the data into tasks.


  1. Create one dataset for rustic zoning, 'rustic_zoning', and another for urban zoning, 'urban_zoning'.
  2. Copy the features with the value 'POLIGONO' in the levelName field of the 'zoning' dataset to 'rustic_zoning', and those with the value 'MANZANA' to 'urban_zoning', splitting multipart geometries.
  3. Merge adjacent 'MANZANA' features (those sharing a segment) to avoid placing buildings with common walls in different tasks.
  4. Assign a unique identifier to each polygon.

Results: Two output files are generated: urban_zoning.geojson and rustic_zoning.geojson.
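
Steps 1, 2 and 4 can be sketched as below. The feature layout (a 'polygons' list per feature) and the integer 'label' used as identifier are assumptions for the example; merging adjacent 'MANZANA' features (step 3) requires a geometry library and is left out.

```python
def split_zoning(zoning):
    """Split zoning features into rustic and urban single-polygon datasets."""
    rustic, urban = [], []
    targets = {'POLIGONO': rustic, 'MANZANA': urban}
    for feature in zoning:
        target = targets.get(feature['levelName'])
        if target is None:
            continue  # any other value is ignored in this sketch
        # Split multipart geometries: one output feature per polygon.
        for polygon in feature['polygons']:
            target.append({'levelName': feature['levelName'],
                           'polygon': polygon})
    # Step 3 (merging adjacent 'MANZANA' features) is omitted here.
    for dataset in (rustic, urban):
        for number, feature in enumerate(dataset, start=1):
            feature['label'] = number  # hypothetical identifier format
    return rustic, urban


zoning = [
    {'levelName': 'MANZANA', 'polygons': ['block-1a', 'block-1b']},
    {'levelName': 'POLIGONO', 'polygons': ['polygon-7']},
    {'levelName': 'MANZANA', 'polygons': ['block-2']},
]
rustic_zoning, urban_zoning = split_zoning(zoning)
print(len(rustic_zoning), len(urban_zoning))  # 1 3
```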

Implementation: catatom2osm.CatAtom2Osm.get_zoning

Algorithm to split the data into tasks

Motivation: Generating the files to split the data into tasks.


  1. For each feature of type building or swimming pool in the 'building' dataset:
    1. If it has no task label assigned:
    2. Search for an urban parcel that contains it and assign it that parcel's label.
    3. If none is found, search for a rustic polygon that contains it and assign it that polygon's label.
  2. Building part features receive the same task label as their associated building.

Once the task labels have been assigned, the features belonging to each task are extracted and the corresponding OSM files are created.
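
A minimal sketch of the label assignment, assuming rectangles (minx, miny, maxx, maxy) as stand-ins for real geometries and hypothetical keys ('kind', 'ref', 'geom', 'task') and labels. The returned dictionary plays the role of the per-task OSM files; it is not the code of set_tasks or process_tasks.

```python
from collections import defaultdict


def box_contains(outer, inner):
    """True if rectangle outer = (minx, miny, maxx, maxy) contains inner."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def find_label(geom, zones):
    """Label of the first zone containing geom, or None."""
    for zone in zones:
        if box_contains(zone['geom'], geom):
            return zone['label']
    return None


def set_tasks(features, urban_zones, rustic_zones):
    """Assign task labels and return the features grouped by task."""
    for feat in features:
        if feat['kind'] in ('building', 'pool') and feat.get('task') is None:
            # Urban zones are tried first, rustic zones only as a fallback.
            feat['task'] = (find_label(feat['geom'], urban_zones)
                            or find_label(feat['geom'], rustic_zones))
    task_by_ref = {f['ref']: f['task']
                   for f in features if f['kind'] == 'building'}
    for feat in features:
        if feat['kind'] == 'part':
            feat['task'] = task_by_ref.get(feat['ref'])  # same as its building
    tasks = defaultdict(list)
    for feat in features:
        tasks[feat['task']].append(feat)
    return dict(tasks)


urban = [{'label': 'u00001', 'geom': (0, 0, 100, 100)}]
rustic = [{'label': 'r001', 'geom': (0, 0, 1000, 1000)}]
features = [
    {'kind': 'building', 'ref': 'A', 'geom': (10, 10, 20, 20)},
    {'kind': 'part', 'ref': 'A', 'geom': (10, 10, 15, 20)},
    {'kind': 'pool', 'ref': 'B', 'geom': (500, 500, 505, 510)},
]
tasks = set_tasks(features, urban, rustic)
```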

Implementation: layer.ConsLayer.set_tasks and catatom2osm.CatAtom2Osm.process_tasks

Thoroughfare names conflation

Motivation: Correction of thoroughfare names.

Steps: The correction is done in two phases.

  • On the first run of the program for a municipality, automatically:
  1. Download from OSM the ways and relations tagged highway=* and name=*, and the squares tagged place=square with name=*.
  2. For each Cadastre thoroughfare name:
    1. Find in OSM the most similar street name that is near the nodes representing the addresses with that thoroughfare name.
    2. If none is found, convert the name according to the normalization rules.
      1. The Cadastre thoroughfare type abbreviations (specified in this document) are expanded using a dictionary that the user can customize.
  3. Generate a correction file with the Cadastre thoroughfare names and the proposed transformation for each one.
  • Next, the correction file is reviewed manually. The program uses this file on its next run to transform the Cadastre names. For some thoroughfare types, the name is placed in an addr:place=* tag instead of addr:street=*.
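
The automatic phase can be sketched with the standard library difflib module. Several things here are assumptions: the abbreviation dictionary contains only a few illustrative entries, the similarity cutoff is arbitrary, the name is normalized before the comparison, and the list of OSM names is assumed to be already restricted to those near the addresses. It is not the code of get_translations.

```python
import difflib

# Illustrative subset of thoroughfare type abbreviations (an assumption,
# to be checked against the Cadastre documentation).
HIGHWAY_TYPES = {'CL': 'Calle', 'AV': 'Avenida', 'PZ': 'Plaza'}


def normalize(name, types=HIGHWAY_TYPES):
    """Expand the leading type abbreviation and use title case."""
    abbr, _, rest = name.partition(' ')
    prefix = types.get(abbr)
    if prefix is None:
        return name.title()
    return (prefix + ' ' + rest.title()).strip()


def propose_name(cat_name, nearby_osm_names, cutoff=0.8):
    """Closest nearby OSM name, or the normalized Cadastre name if none."""
    normalized = normalize(cat_name)
    matches = difflib.get_close_matches(normalized, nearby_osm_names,
                                        n=1, cutoff=cutoff)
    return matches[0] if matches else normalized


matched = propose_name('CL MAYOR', ['Calle Mayor', 'Plaza Nueva'])
fallback = propose_name('PZ ESPAÑA', [])
print(matched, '|', fallback)  # Calle Mayor | Plaza España
```

Each pair of Cadastre name and proposal would become one row of the correction file that is then reviewed by hand.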

Implementation: catatom2osm.CatAtom2Osm.get_translations