Spanish Cadastre/Buildings Import/Data Conversion/Software

CatAtom2Osm is the tool developed to convert the datasets for this Buildings Import.

Installation

The installation procedure is described here.

Via a Docker image

A Docker image is available here, along with a quick guide to installation and use (in Spanish).

Settings

By default the software uses Spanish to translate the thoroughfare types. To use another language, edit the file 'setup.py' and change 'es' to 'cat' for Catalan, or to 'gl' for Galician, in these lines:

   # Dictionary for default 'highway_types.csv'
   highway_types = highway_types_es
   # List of highway types to translate as place addresses
   place_types = place_types_es
   # List of place types to remove from the name
   remove_place_from_name = [place_types_es[26]]

User guide

Basic use (buildings)

The program is executed in the command shell, from the folder that you want to dedicate to downloading the data.

To download a municipality you need its Cadastral code. If you don't know it, you can run the program with the -l option and the code of a province (two digits) to list all municipalities in the province.

   catatom2osm -l 02
   Territorial Office 02 - Albacete
   ==================================================================
    02001-ABENGIBRE
    02002-ALATOZ
    02900-ALBACETE
   ...
    02084-VILLAVERDE DE GUADALIMAR
    02085-VIVEROS
    02086-YESTE
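If you want to look up codes from a script, the listing can be parsed line by line. The following is a minimal sketch, assuming the listing keeps the "code-NAME" line format shown above; `parse_listing` is a hypothetical helper and not part of CatAtom2Osm.

```python
import re

# Matches lines such as " 02002-ALATOZ": five digits, a hyphen, then the name.
LINE = re.compile(r"^\s*(\d{5})-(.+?)\s*$")


def parse_listing(text):
    """Return {cadastral_code: municipality_name} from the '-l' listing."""
    result = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:  # header and separator lines do not match and are skipped
            result[match.group(1)] = match.group(2)
    return result


# Sample lines copied from the listing above.
sample = """Territorial Office 02 - Albacete
==================================================================
 02002-ALATOZ
 02900-ALBACETE
 02086-YESTE
"""
municipalities = parse_listing(sample)
print(municipalities["02900"])  # ALBACETE
```

To feed it real output you would have to capture what the program prints, for example with `subprocess.run`. Whether the listing is written to standard output is an assumption that should be checked first.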

Once you know the code, you can try the simplest option:

   catatom2osm -tm 02069

The program will create the folder '02069', download the necessary zip files from Cadastre into it, and generate some files. Review the file 'report.txt' and take a look at the OSM files in the 'tasks' folder.

Addresses conversion and conflation

If you use the program with the default options, in addition to the buildings it will try to convert the addresses data and conflate it with the OSM data:

    catatom2osm 02069

The program ends by indicating that a file for converting the thoroughfare names, 'highway_names.csv', has been generated. You must check it as part of the process of Review of thoroughfare names (es).

After you have checked the conversion file, you can run the program again to continue the process. As a result, the OSM files in 'tasks' will now contain the addresses in addition to the buildings.

Access to facade photos

The -d option converts the addresses into a separate file, 'address.osm', apart from the buildings. Don't import it, but you can use it as a source to find addresses missing in OSM and place them manually in the appropriate location.

   catatom2osm -d 02069

This file contains the links to the facade photos of each parcel, which are used to check the house numbers and additional information about the buildings. That is why it is also generated with the -b and -t options. To view the photos, the Tag2Link plugin must be enabled. With the file opened in JOSM, select a node and, in the Selection List dialog, right-click the entry for this node. Select View image in the context menu to open the image in your browser.

The image=* tag should not be uploaded to OSM.
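If you copy address nodes by hand, the image tags can be stripped beforehand. The following is a minimal sketch using Python's standard library, assuming the usual OSM XML layout in which tags are child elements with 'k' and 'v' attributes; `strip_image_tags` is a hypothetical helper and the sample node is made up.

```python
import xml.etree.ElementTree as ET


def strip_image_tags(root):
    """Remove every <tag k="image" .../> element; return how many were removed."""
    removed = 0
    # Collect the elements first so that removal does not disturb iteration.
    for element in list(root.iter()):
        for tag in [t for t in element.findall("tag") if t.get("k") == "image"]:
            element.remove(tag)
            removed += 1
    return removed


# Made-up sample node; real files contain the photo link in the 'v' attribute.
sample = """<osm version="0.6">
  <node id="-1" lat="39.2" lon="-2.15">
    <tag k="addr:housenumber" v="12"/>
    <tag k="image" v="http://example.invalid/photo"/>
  </node>
</osm>"""
root = ET.fromstring(sample)
removed = strip_image_tags(root)
remaining = [t.get("k") for t in root.iter("tag")]
print(removed, remaining)  # 1 ['addr:housenumber']
```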

This is not the only way to access the images; the JOSM pointinfo plugin lets you do it more comfortably.

Other options

-b: A single 'building.osm' file is generated with all the buildings. Don't import it, but you can use it to inspect/validate the results for the municipality as a whole. For the largest municipalities, take into account that this option consumes more memory than -t.

-z: Generates only the boundaries of polygons and blocks for the task manager. Don't import them, but they can serve as a guide for drawing barriers and areas over an aerial background image.

-p: A file 'parcel.osm' is generated with the cadastral parcels. Don't import it.

Specifications

To better understand the internal functioning of the program you can consult its specifications or its documentation.

Generated files

Name | Option | Description
A.ES.SDGC.AD.02001.zip | -d (default) | Addresses data set downloaded from Cadastre.
A.ES.SDGC.BU.02001.zip | -t (default) or -b | Buildings data set downloaded from Cadastre.
A.ES.SDGC.CP.02001.zip | always | Cadastral parcels data set downloaded from Cadastre.
address.osm | -d, -b, -t | Used to access facade photos. Don't import.
boundary.poly | always | Used when it's necessary to manually download data for conflation.
building.osm | -b | Conversion of buildings into a single file instead of tasks. Don't import.
catatom2osm.log | always | Log file; it is created in the current folder.
current_address.osm | -d without -m | OSM addresses.
current_building.bak.osm | -t or -b without -m | OSM buildings.
current_building.osm | -t or -b without -m | OSM buildings with conflicts (overlapping with Cadastre buildings).
current_highway.osm | -d without -m | OSM highways.
highway_names.csv | -d (default) | Table with street names in Cadastre and their proposed conversion for conflation of thoroughfare names.
highway_types.csv | -d (default) | Table with the abbreviations used by Cadastre to designate thoroughfare types and their complete designation. It is created in the installation folder on the first run of the program and is used when the name of the street is not found in OSM.
parcel.osm | -p | Conversion of cadastral parcels. Don't import.
report.txt | -t, -d, -b | It is important to review this file following the results report procedure.
rustic_zoning.geojson | always | Rustic polygon boundaries for the task manager.
urban_zoning.geojson | always | Urban block boundaries for the task manager.
zoning.geojson | always | Rustic polygon and urban block boundaries for the task manager.
tasks | -t (default) | Folder containing the OSM files to import. They correspond to rustic polygons if the name begins with the letter 'r', or to urban blocks if the name begins with the letter 'u'.
*.shp | --log=DEBUG | Program debugging files.
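The naming rule for the 'tasks' folder makes it easy to check a run from a script. The following is a minimal sketch, assuming only the 'r'/'u' prefix rule described above; `count_tasks` is a hypothetical helper and the file names in the demonstration are made up.

```python
from pathlib import Path
import tempfile


def count_tasks(tasks_dir):
    """Count task files by the first letter of their name ('r' rustic, 'u' urban)."""
    counts = {"rustic": 0, "urban": 0, "other": 0}
    for path in Path(tasks_dir).iterdir():
        if not path.is_file():
            continue
        if path.name.startswith("r"):
            counts["rustic"] += 1
        elif path.name.startswith("u"):
            counts["urban"] += 1
        else:
            counts["other"] += 1
    return counts


# Demonstration on a temporary folder with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("r001.osm", "r002.osm", "u00001.osm"):
        (Path(tmp) / name).touch()
    demo_counts = count_tasks(tmp)
print(demo_counts)  # {'rustic': 2, 'urban': 1, 'other': 0}
```

For a real run you could call `count_tasks("02069/tasks")` and compare the result with the 'Rustic tasks files' and 'Urban tasks files' lines of 'report.txt'.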

Additional information on the transformation of thoroughfare names

When the program does not find the name of a thoroughfare in OSM, it uses the one from Cadastre after transforming it. Some details of this transformation are explained here, but they are not critical: the idea is that the correct names are entered in OSM beforehand and the program takes them from there.

  • All street names are preceded by a two-letter code that indicates the generic designation of the type of thoroughfare. The list of possible values is specified in this document, but some more may appear in the data.
    • For some types of road there is more than one possible translation, such as AL = Aldea / Alameda. The program does not correct it.
    • Some types of road do not refer to any road, but to a place or territorial area. In these cases the name must go in the addr:place=* tag instead of addr:street=*.
  • Articles may appear displaced from the beginning to the end of the name, separated by commas or parentheses. The program does not correct this.
  • It also does not correct spelling errors and special characters.
  • Additional text may appear after the name. Generally it is the name of one or several places where the thoroughfare is located. The locations can be in parentheses or separated by a hyphen. This information is not part of the name and should not be in the name tag. The program does not correct it.

The values for the translation of the road types are read from the 'highway_types.csv' file, which you can optionally complete or adapt.
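The type-code translation can be pictured as follows. This is a minimal sketch and not the program's actual code: it assumes the code is separated from the name by a space, the `HIGHWAY_TYPES` dictionary is an illustrative excerpt, and `translate_name` is a hypothetical helper.

```python
# Illustrative excerpt only; the real table is read from 'highway_types.csv'.
HIGHWAY_TYPES = {"CL": "Calle", "AV": "Avenida", "PZ": "Plaza"}


def translate_name(cadastre_name, highway_types=HIGHWAY_TYPES):
    """Replace the leading two-letter type code with its full designation."""
    cleaned = cadastre_name.strip()
    code, _, rest = cleaned.partition(" ")
    prefix = highway_types.get(code.upper())
    if prefix is None or not rest:
        # Unknown code or nothing after it: only normalise the letter case.
        return cleaned.title()
    return f"{prefix} {rest.title()}"


print(translate_name("CL MAYOR"))        # Calle Mayor
print(translate_name("AV REINA SOFIA"))  # Avenida Reina Sofia
```

As the list above notes, misplaced articles, spelling errors and trailing place names are not corrected, so a sketch like this does not try to handle them either.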

Other aspects of the translation of names that can be configured in the 'setup.py' file are the following:

  • place_types: When generating the output files in OSM format, the name of the thoroughfare is placed in addr:place=* instead of addr:street=* if the type of thoroughfare is in this list.
  • lowcase_words: List of words that must always be lowercase.
  • excluded_highways: List of road types whose addresses will be excluded from the import. Includes the type 'Disseminated'.
  • highway_types: The file 'highway_types.csv' is not incorporated in the download of the program, it is generated in the installation folder after the first execution from the content of this variable.
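The effect of these options can be sketched as below. All three collections hold hypothetical values chosen for illustration (the real ones are defined in 'setup.py'), and `address_tags` is a simplified stand-in for what the program does, not its implementation.

```python
# Hypothetical values for illustration; the real lists live in 'setup.py'.
PLACE_TYPES = {"Aldea", "Lugar"}
EXCLUDED_HIGHWAYS = {"Diseminado"}  # assumed Spanish value of the excluded type
LOWCASE_WORDS = {"de", "del", "la", "el"}


def lower_connectors(name, lowcase_words=LOWCASE_WORDS):
    """Title-case a name but keep listed words lowercase (except the first)."""
    words = name.title().split()
    return " ".join(
        word.lower() if i > 0 and word.lower() in lowcase_words else word
        for i, word in enumerate(words)
    )


def address_tags(highway_type, name):
    """Return the address tag for one thoroughfare, or None if it is excluded."""
    if highway_type in EXCLUDED_HIGHWAYS:
        return None
    full_name = lower_connectors(f"{highway_type} {name}")
    key = "addr:place" if highway_type in PLACE_TYPES else "addr:street"
    return {key: full_name}


print(address_tags("Calle", "VIRGEN DE LA CABEZA"))
# {'addr:street': 'Calle Virgen de la Cabeza'}
print(address_tags("Aldea", "SANTA MARTA"))
# {'addr:place': 'Aldea Santa Marta'}
print(address_tags("Diseminado", "CASAS"))
# None
```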

Results report

The 'report.txt' file contains the following sections:

General information

Contains data obtained from the administrative boundary relation of the municipality. The program tries to locate this relation by searching OSM for the municipality name closest to the name in Cadastre. This matters because the administrative boundary is used to limit the data downloaded from OSM for conflation. In most cases the match is correct, but both names are shown so that you can confirm there is no error. If the name obtained from OSM is not correct, see Bad match of municipality name. If 'None' appears as the name of the municipality, see the warning 'Failed to find administrative boundary, falling back to bounding box'. When available, the population data and the Wikipedia/Wikidata links appear; take this opportunity to confirm or update them.

   Municipality: La Roda
   Cadastre name: LA RODA
   Code: 02069
   Surface: 397.8 km²
   Population: 16299 hab. (2009)
   Wikipedia: https://www.wikipedia.org/wiki/es:La Roda (Albacete)
   Wikidata: https://www.wikidata.org/wiki/Q630706
   Date: 15/11/17
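To compare the two names automatically, the 'Key: value' lines can be read into a dictionary. The following is a minimal sketch, assuming the report keeps the layout shown above; `parse_report_lines` is a hypothetical helper.

```python
def parse_report_lines(text):
    """Turn 'Key: value' lines into a dict, splitting on the first colon only."""
    data = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            data[key.strip()] = value.strip()
    return data


# Lines copied from the example report above.
sample = """
   Municipality: La Roda
   Cadastre name: LA RODA
   Code: 02069
   Wikidata: https://www.wikidata.org/wiki/Q630706
"""
info = parse_report_lines(sample)
print(info["Code"])                                           # 02069
print(info["Municipality"].upper() == info["Cadastre name"])  # True
```

Splitting on the first colon only keeps URLs such as the Wikidata link intact. A plain uppercase comparison is just a first check; names with different spelling or word order still need a manual look.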

System information

Contains data about the versions of the program, the operating system, the memory and processor of the system, time and resources used.

   Application version: CatAtom2Osm 2017-11-14
   Platform: Linux 4.4.0-96-generic #119-Ubuntu SMP Tue Sep 12 14:59:54 UTC 2017 x86_64 x86_64
   QGIS version: 2.14.11-Essen
   CPU count: 4
   CPU frequency: 3600.0 Mhz
   Execution time: 153.54 seconds
   Total memory: 7928.30 GB
   Physical memory usage: 251.81 GB
   Virtual memory usage: 1014.31 GB

Addresses

Contains information about the addresses with these sections:

Input data

Date of publication and number of features in the Cadastral data set.

   Source date: 2017-09-05
   Feature count: 6016
     Type entrance: 4710
     Type parcel: 1306
   Postal codes: 2
   Street names: 237

Process

Count of features deleted/transformed by the program.

   Addresses without house number deleted: 547
   Addresses without associated building deleted: 537
   Addresses belonging to multiple buildings deleted: 1165
   'Parcel' addresses not unique for it building deleted: 2

Conflation

Objects existing in OSM and conflicts.

   OSM addresses: 882
   Addresses rejected because they exist in OSM: 471

Output data

Elements generated in the output files.

   Addresses: 3294
     In entrance nodes: 2316
     In buildings: 978
     Type addr:street: 3290
     Type addr:place: 4

Buildings

Input data

   Source date: 2017-09-05
   Feature count: 22685
     Buildings: 6103
     Buildings parts: 16265
     Swimming pools: 317

Process

   Parts outside footprint deleted: 512
   Parts with no floors above ground: 265
   Building footprints created: 5
   Buildings with multipart geometries: 1689
   Buildings resulting from splitting multiparts: 4065
   Parts merged to the footprint: 10000
   Adjacent parts merged: 497
   Spike vertices deleted: 4
   Close vertices merged: 324
   Topological points created: 7494
   Simplified vertices: 18122

Conflation

   Buildings/pools in OSM: 748
     With conflict: 606

Output data

   Nodes: 72292
   Ways: 14247
   Relations: 451
   Feature count: 13792
     Buildings: 8484
     Buildings parts: 4991
     Swimming pools: 317
   Building types counter: industrial: 814, office: 10, residential: 4216, yes: 2434, retail: 78, ruins: 64, public: 90, barn: 778
   Max. levels above ground (level: # of buildings): 1: 2629, 2: 2634, 3: 492, 4: 135, 5: 94, 6: 46, 7: 16, 8: 1, 9: 1, 10: 1, 101: 1
   Min. levels below ground (level: # of buildings): 1: 397, 2: 12, 3: 1
   Rustic tasks files: 114
   Urban tasks files: 326

Problems

This is the most important part of the report: it details issues that might require your action.

Fixmes

Number of fixmes reported in the OSM files. You must review them and delete the 'fixme' tag before uploading the file. You can find:

  • Area too big: This building is bigger than the area set in the 'warning_max_area' option of the 'setup.py' file.
  • Area too small: This building is smaller than the area set in the 'warning_min_area' option of the 'setup.py' file.
  • This part is bigger than its building: A building part can't be greater than the building that contains it.
  • Missing building footprint for this part: The building footprint didn't pass the validation checks and was deleted, leaving orphaned building parts.
  • GEOS validation: The geometry didn't pass the validation tests of the GEOS library.
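Before uploading, it can help to list which fixmes a task file still contains. The following is a minimal sketch using Python's standard library; `count_fixmes` is a hypothetical helper, the sample ways are made up, and the exact wording of the fixme values in real files is an assumption based on the list above.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def count_fixmes(root):
    """Count the values of every 'fixme' tag in a parsed OSM document."""
    return Counter(
        tag.get("v") for tag in root.iter("tag") if tag.get("k") == "fixme"
    )


# Made-up sample; the fixme texts are assumed to match the list above.
sample = """<osm version="0.6">
  <way id="-10"><tag k="building" v="yes"/><tag k="fixme" v="Area too big"/></way>
  <way id="-11"><tag k="building" v="barn"/><tag k="fixme" v="Area too small"/></way>
  <way id="-12"><tag k="building" v="yes"/></way>
</osm>"""
fixmes = count_fixmes(ET.fromstring(sample))
print(sum(fixmes.values()))  # 2
```

For a file on disk, `ET.parse(path).getroot()` gives the root element to pass in. An empty counter means no 'fixme' tags are left.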

Warnings

  • Failed to find administrative boundary, falling back to bounding box: There was a problem (probably a communication failure) when trying to obtain the OSM administrative boundary relation for this municipality, so the program continues using the bounding box around the municipality to download the data for conflation. The resulting files will be slightly larger, but this is not a big concern.
  • No OSM data were obtained from '%s': The download of OSM data produced an empty result. This may be because there is no data or because the Overpass API is overloaded (see Manually download data for conflation).
  • OSM building with id %s is not valid: A building with invalid geometry was found in OSM. If it is not going to be replaced, you must fix it.
  • There are %d address without house number in the OSM data: When the program downloads the data for conflation, it only queries for addresses with a house number. This warning may appear if you download the data manually.
  • Detected a %s geometry in the '%s' layer: Very unlikely; contact the developer.

Report validations

The program performs some checks by adding up the values obtained in the report. If the results do not match, the problem is reported. This should not normally happen; if it does, notify the developer.

The checks carried out are:

  • Sum of address types should be equal to the input addresses.
  • Sum of output and deleted addresses should be equal to the input addresses.
  • Sum of entrance and building address should be equal to output addresses.
  • Sum of street and place addresses should be equal to output addresses.
  • Sum of buildings, parts and pools should be equal to the feature count.
  • Sum of output and deleted minus created building features should be equal to input features.
  • Sum of building types should be equal to the number of buildings.
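Several of these checks can be reproduced by hand with the figures of the La Roda example report. The following is a worked sketch, not the program's own validation code; treating the addresses rejected during conflation as part of the deleted total is an assumption that happens to make the sums agree.

```python
# Figures copied from the example report for La Roda (02069).
addresses_in = 6016
by_type = {"entrance": 4710, "parcel": 1306}
deleted = [547, 537, 1165, 2]  # the four "deleted" lines under Process
rejected = 471                 # already in OSM (assumed to count as deleted)
addresses_out = 3294
by_location = {"entrance nodes": 2316, "buildings": 978}
by_tag = {"addr:street": 3290, "addr:place": 4}

features_out = 13792
buildings_out, parts_out, pools_out = 8484, 4991, 317
building_types = {
    "industrial": 814, "office": 10, "residential": 4216, "yes": 2434,
    "retail": 78, "ruins": 64, "public": 90, "barn": 778,
}

checks = {
    "address types": sum(by_type.values()) == addresses_in,
    "output + deleted": addresses_out + sum(deleted) + rejected == addresses_in,
    "entrance + building": sum(by_location.values()) == addresses_out,
    "street + place": sum(by_tag.values()) == addresses_out,
    "buildings + parts + pools": buildings_out + parts_out + pools_out == features_out,
    "building types": sum(building_types.values()) == buildings_out,
}
for name, passed in checks.items():
    print(f"{name}: {'ok' if passed else 'MISMATCH'}")
```

With these figures every check passes, which is the expected state of a healthy report.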

Possible problems

When using the program you may find the following problems.

Manually download data for conflation

Sometimes, particularly for large municipalities, the Overpass servers may be overloaded. If the program can't download the data for conflation, try the --log=DEBUG option, which exposes the URL of the Overpass query. Copy and paste it into your browser and save the result as 'current_highway.osm', 'current_address.osm' or 'current_building.osm', as appropriate.
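Saving the query result can also be scripted instead of using the browser. The following is a minimal sketch with Python's standard library; `save_conflation_data` is a hypothetical helper, the URL in the demonstration is a placeholder rather than a real Overpass query, and the `fetch` parameter exists only so that the example runs without network access.

```python
from pathlib import Path
from urllib.request import urlopen
import tempfile

# File names expected by the program, as listed above.
TARGETS = {
    "highway": "current_highway.osm",
    "address": "current_address.osm",
    "building": "current_building.osm",
}


def download(url):
    """Fetch a URL and return the response body as bytes."""
    with urlopen(url, timeout=300) as response:
        return response.read()


def save_conflation_data(url, kind, folder=".", fetch=download):
    """Save the result of an Overpass query under the expected file name."""
    target = Path(folder) / TARGETS[kind]
    target.write_bytes(fetch(url))
    return target


# Offline demonstration with a fake fetch function and a placeholder URL.
with tempfile.TemporaryDirectory() as tmp:
    saved = save_conflation_data(
        "https://example.invalid/query", "building", tmp,
        fetch=lambda url: b"<osm/>",
    )
    saved_name, saved_bytes = saved.name, saved.read_bytes()
print(saved_name)  # current_building.osm
```

In real use you would pass the URL shown in the debug log, the municipality folder as `folder`, and leave `fetch` at its default.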

If you still can't obtain the data this way, try downloading a PBF file from the HOT Export Tool or Geofabrik and running the 'extract' script (.sh or .bat) in the program installation folder.

Bad match of municipality name

The name in the 'Municipality' line of 'report.txt' doesn't match the correct municipality. This means that the program failed to find, in OSM, the name most similar and closest to the name in Cadastre. To correct it, locate in OSM the identifier of the administrative boundary relation of the municipality and set it in the 'mun_fails' option at the end of the 'setup.py' file. The search algorithm has been tested for all the municipalities, so this situation should be exceptional.

See also