Spanish Cadastre/Buildings Import/Data Conversion/Validation

From OpenStreetMap Wiki

This page shows the results of validating the data conversion process on a sample of municipalities for this Buildings Import.

List of proposed municipalities

We have selected the 16 municipalities with the largest number of buildings to import, the two Autonomous Cities in Africa, 8 random municipalities with between 1,000 and 10,000 buildings, and 10 random municipalities with fewer than 1,000 buildings. The number of buildings refers to data published in September 2017.
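The selection rule above can be sketched in a few lines. This is only an illustration: the function name, the input layout and the synthetic counts are assumptions, not the script actually used to draw the sample.

```python
import random

def select_sample(building_counts, special_codes, seed=0):
    """Pick validation municipalities following the strata described above.

    building_counts: dict mapping municipality code -> number of buildings.
    special_codes: codes that are always included (e.g. the two Autonomous Cities).
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    candidates = {c: n for c, n in building_counts.items() if c not in special_codes}
    # The 16 municipalities with the largest number of buildings.
    largest = sorted(candidates, key=candidates.get, reverse=True)[:16]
    remaining = {c: n for c, n in candidates.items() if c not in largest}
    medium = sorted(c for c, n in remaining.items() if 1000 <= n <= 10000)
    small = sorted(c for c, n in remaining.items() if n < 1000)
    return {
        "largest": largest,
        "special": sorted(special_codes),
        "medium": rng.sample(medium, min(8, len(medium))),
        "small": rng.sample(small, min(10, len(small))),
    }

# Synthetic counts for demonstration only (not real Cadastre data).
demo_counts = {f"{i:05d}": i * 300 for i in range(1, 101)}
sample = select_sample(demo_counts, special_codes={"00050", "00051"})
```

With real data, `building_counts` would hold one entry per municipality and `special_codes` the codes of Ceuta and Melilla.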

Candidate municipalities for validation
No. Code Name Community Buildings Area (km²) Population (inhab., 2016)
1 28900 Madrid Madrid 122 839 605.8 3 165 541
2 30030 Murcia Región de Murcia 80 495 881.9 441 003
3 08900 Barcelona Cataluña 70 296 102.2 1 608 746
4 41900 Sevilla Andalucía 58 559 140.8 690 566
5 51016 Cartagena Región de Murcia 53 283 398.3 569 009
6 29900 Málaga Andalucía 51 055 398.3 569 009
7 35017 Las Palmas de G.C. Canarias 41 941 100.6 378 998
8 54057 Vigo Galicia 41 597 109.1 292 817
9 07040 Palma Islas Baleares 38 586 208.6 402 949
10 14900 Córdoba Andalucía 36 944 1 255.2 326 609
11 46900 Valencia Valenciana 36 407 134.7 790 201
12 50900 Zaragoza Aragón 35 355 973.8 661 108
13 38023 S.C. de La Laguna Canarias 31 976 102.1 153 111
14 06900 Badajoz Extremadura 23 133 1 440.4 149 946
15 52024 Gijón Asturias 22 648 181.7 273 422
16 47900 Valladolid Castilla y León 16 999 197.9 301 876
17 56101 Melilla Melilla 9 866 12.3 86 026
18 55101 Ceuta Ceuta 7 304 18.5 84 519
19 13028 Campo de Criptana Castilla-La Mancha 6 158 302.4 13 949
20 45123 Olías del Rey Castilla-La Mancha 3 067 39.9 7 357
21 25070 Les Borges Blanques Cataluña 2 898 61.6 6 000
22 39074 Santa María de Cayón Cantabria 2 474 48.2 9 078
23 12085 Oropesa Valenciana 2 378 26.4 9 245
24 04101 Viator Andalucía 2 166 21.0 5 699
25 44260 Valderrobres Aragón 1 270 124.0 2 311
26 09361 Santa María del Campo Castilla y León 1 223 60.3 584
27 49189 Quiruelas de Vidriales Castilla y León 864 28.0 706
28 10189 Torrecilla de los Ángeles Extremadura 772 43.3 640
29 16005 Albalate de las Nogueras Castilla-La Mancha 671 40.1 276
30 19190 Ledanca Castilla-La Mancha 618 47.3 110
31 26069 Grañón La Rioja 595 31.0 275
32 37350 La Vellés Castilla y León 535 25.5 557
33 05015 Arevalillo Castilla y León 389 15.0 86
34 17184 Sant Miquel de Fluvià Cataluña 376 3.5 742
35 22279 Salillas Aragón 204 28.3 98
36 42095 Centenera de Andaluz Castilla y León 110 19.9 21

The validation process generates the following data.

Quantitative results

This table shows the size of the data before and after the conversion. The 'Cadastre' columns give the number of objects in the original data and the 'Import' columns the number of objects in the converted data.

Quantitative results
Code Municipality Buildings (Cadastre) Buildings (Import) Building parts (Cadastre) Building parts (Import) Swimming pools Addresses (Cadastre) Addresses (Import) Increment of vertices Nodes Ways Rels. Tasks
04101 Viator 2 166 2 780 4 443 1 144 125 2 349 1 245 -3 914 20 158 4 162 111 174
05015 Arevalillo 389 455 617 83 - 370 296 -83 2 478 538 - 37
06900 Badajoz 23 133 37 652 89 242 28 982 3 216 25 593 15 320 -89 450 382 186 72 429 2 178 2 382
07040 Palma 38 586 58 549 251 422 90 174 4 780 42 211 25 881 -219 843 823 658 157 680 3 885 3 318
08900 Barcelona 70 296 81 530 319 857 200 277 962 76 581 57 122 -112 969 1 396 249 310 118 21 165 4 807
09361 Santa María del Campo 1 223 1 554 2 286 372 7 1 313 514 -2 108 8 523 1 937 6 210
10189 Torrecilla de los Ángeles 772 839 1 120 139 11 824 603 -660 4 905 999 6 73
12085 Oropesa 2 378 3 420 12 328 7 181 732 3 449 1 864 -12 523 72 186 11 797 287 338
13028 Campo de Criptana 6 158 7 346 12 949 3 504 197 6 754 4 871 -8 898 57 324 11 427 345 469
14900 Córdoba 36 944 60 614 151 432 46 847 11 220 39 733 25 827 -47 807 709 871 126 104 6 859 3 434
16005 Albalate de las Nogueras 671 726 954 165 15 796 614 -583 3 987 906 - 68
17184 Sant Miquel de Fluvià 376 484 902 253 47 444 262 -1 558 4 335 785 4 46
19190 Ledanca 618 649 868 151 11 620 500 -273 3 525 816 4 82
22279 Salillas 204 302 511 96 1 150 109 -171 1 849 401 2 32
25070 Les Borges Blanques 2 898 3 553 9 039 3 768 121 2 776 1 962 -8 117 28 335 7 487 44 169
26069 Grañón 595 632 834 147 6 639 524 -117 3 055 785 2 45
28900 Madrid 122 839 152 757 803 272 413 499 13 358 133 806 83 636 -699 973 3 065 059 615 106 27 851 11 893
29900 Málaga 51 055 64 533 205 596 81 952 5 643 55 213 40 987 -146 346 875 863 158 176 5 196 5 733
30030 Murcia 80 495 108 169 280 302 92 392 8 471 100 392 58 282 -371 151 1 185 938 216 027 6 605 9 474
35017 Las Palmas de G.C. 41 941 48 065 167 768 71 574 656 45 760 31 115 -186 819 678 114 128 493 7 821 3 742
37350 La Vellés 535 681 1 438 353 17 575 395 -645 5 820 1 067 20 62
38023 S.C. de La Laguna 31 976 38 144 95 112 39 402 307 35 963 25 944 +3 443 414 341 81 870 3 810 2 775
39074 Santa María de Cayón 2 474 3 307 10 140 3 638 37 2 291 1 408 -7 600 34 510 7 036 58 313
41900 Sevilla 58 559 67 250 217 421 83 613 1 696 60 188 48 994 -156 206 864 906 164 951 12 623 6 442
42095 Centenera de Andaluz 110 117 147 19 3 190 93 -65 701 139 - 21
44260 Valderrobres 1 270 1 347 2 834 1 133 9 1 457 1 158 -2 176 9 166 2 511 22 151
45123 Olías del Rey 3 067 3 809 11 965 4 066 588 5 000 2 561 -14 155 46 151 8 584 91 241
46900 Valencia 36 407 42 465 213 502 115 304 654 41 040 31 598 -279 238 769 447 169 270 10 514 4 082
47900 Valladolid 16 999 25 305 104 519 42 506 696 19 594 11 644 -106 226 399 859 75 684 3 415 1 933
49189 Quiruelas de Vidriales 864 1 257 1 880 445 27 988 432 -1 157 8 240 1 740 17 87
50900 Zaragoza 35 355 52 589 202 242 89 366 3 984 38 747 8 643 -144 409 736 015 150 238 3 607 3 375
51016 Cartagena 53 283 68 969 206 714 58 207 3 897 61 314 41 883 -207 487 809 949 135 712 3 598 5 164
52024 Gijón 22 648 32 306 110 477 47 401 915 29 620 12 427 -105 982 389 881 81 359 928 2 168
54057 Vigo 41 597 61 252 175 336 73 643 2 349 59 594 26 267 -101 474 649 104 138 187 1 189 3 160
55101 Ceuta 7 304 8 702 21 444 6 837 147 7 952 6 184 -15 059 88 295 16 271 502 853
56101 Melilla 9 866 11 071 31 503 12 992 228 10 287 9 420 -17 733 123 969 25 474 1 087 1 308
Total (36 municipalities): 806 051 1 053 180 3 722 416 1 621 625 65 133 914 573 580 585 -3 069 532 14 677 952 2 886 266 123 852 78 661
Variation (Import relative to Cadastre): Buildings +30.7% · Building parts -56.4% · Addresses -36.5% · Vertices -17.3%

The number of buildings increases because, in the original data, a group of buildings located in the same parcel (sharing the cadastral reference) is represented by a single geometry of multipolygon type. During conversion they are separated into individual buildings. In a few cases there are also building parts outside any footprint, for which a building footprint has been created. On the other hand, the process of eliminating junk geometries can remove some buildings.
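As a rough illustration of why splitting multipolygons raises the building count, the sketch below turns one multipolygon feature into one building per polygon. The dictionary layout and the reference value are hypothetical and do not reflect the converter's internal data model.

```python
def split_multipolygon(feature):
    """Return one building per polygon, each keeping the shared cadastral reference.

    feature is a hypothetical dict: {"ref": str, "polygons": [polygon, ...]},
    where each polygon is a list of rings (outer ring first, then any holes).
    """
    return [
        {"ref": feature["ref"], "rings": polygon}
        for polygon in feature["polygons"]
    ]

# Two separate buildings in the same parcel, the second one with a courtyard.
ring_a = [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]
ring_b = [(20, 0), (40, 0), (40, 20), (20, 20), (20, 0)]
hole_b = [(25, 5), (35, 5), (35, 15), (25, 15), (25, 5)]
parcel = {"ref": "hypothetical-ref", "polygons": [[ring_a], [ring_b, hole_b]]}

buildings = split_multipolygon(parcel)  # one Cadastre object becomes two buildings
```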

For building parts, the number is significantly reduced as described in building parts below ground level, reduction of building parts, detection of swimming pools inside buildings and junk geometries.

The number of swimming pools is not modified during the conversion.

The number of addresses is reduced because addresses that are not associated with exactly one building are not imported. Addresses that already exist in OSM are not imported either.
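The address rule can be expressed as a simple filter. The sketch below assumes a hypothetical mapping from address identifiers to the building references they are linked to; the identifiers are invented for the example.

```python
def importable_addresses(address_buildings, existing_in_osm):
    """Keep addresses linked to exactly one building and not already in OSM.

    address_buildings: dict address_id -> list of building references (hypothetical layout).
    existing_in_osm: set of address_ids already present in OSM.
    """
    return [
        address_id
        for address_id, refs in address_buildings.items()
        if len(set(refs)) == 1 and address_id not in existing_in_osm
    ]

addresses = {
    "addr-1": ["bldg-A"],            # exactly one building: candidate
    "addr-2": ["bldg-A", "bldg-B"],  # ambiguous: skipped
    "addr-3": [],                    # no building: skipped
    "addr-4": ["bldg-C"],            # already mapped in OSM: skipped
}
kept = importable_addresses(addresses, existing_in_osm={"addr-4"})
```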

The final number of nodes is the result of adding topological points and removing duplicated nodes, unnecessary nodes and vertices at acute angles.
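To give an idea of what removing unnecessary nodes means, here is a minimal sketch that drops repeated consecutive vertices and vertices lying on a straight line between their neighbours. It is a simplification written for this page: the function name and tolerance are assumptions, and a real implementation must also keep nodes shared with neighbouring geometries.

```python
def simplify_ring(ring, tolerance=1e-9):
    """Remove duplicated consecutive vertices and vertices on straight lines.

    ring: closed list of (x, y) tuples (first vertex repeated at the end).
    Returns a closed ring with the redundant vertices removed.
    """
    points = []
    for p in ring[:-1]:  # work without the closing vertex
        if not points or p != points[-1]:
            points.append(p)
    if len(points) > 1 and points[0] == points[-1]:
        points.pop()
    kept = []
    n = len(points)
    for i in range(n):
        prev_pt, pt, next_pt = points[i - 1], points[i], points[(i + 1) % n]
        # Cross product of the incoming and outgoing segments; zero means collinear.
        cross = ((pt[0] - prev_pt[0]) * (next_pt[1] - pt[1])
                 - (pt[1] - prev_pt[1]) * (next_pt[0] - pt[0]))
        if abs(cross) > tolerance:
            kept.append(pt)
    return kept + kept[:1]  # close the ring again

square = [(0, 0), (0, 0), (1, 0), (2, 0), (2, 2), (0, 2), (0, 0)]
clean = simplify_ring(square)
```

In this example the repeated first vertex and the midpoint `(1, 0)` are dropped, leaving the four corners of the square.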

Each task contains an average of 13 buildings. Tasks from the Rustic Cadastre usually contain more buildings, but those buildings are also usually less complex.
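The totals row can be cross-checked with a little arithmetic. The sketch below takes the totals from the table and computes each variation relative to the Cadastre figure; that choice of base is an assumption, and using the Import figure as the base gives a different percentage for buildings.

```python
def variation(before, after):
    """Percentage change from 'before' to 'after', relative to 'before'."""
    return 100.0 * (after - before) / before

# (Cadastre, Import) totals taken from the table above.
totals = {
    "buildings": (806051, 1053180),
    "building_parts": (3722416, 1621625),
    "addresses": (914573, 580585),
}
variations = {name: variation(c, i) for name, (c, i) in totals.items()}

# Vertices: the table gives the final node count and the increment,
# so the count before cleaning is the final count minus the increment.
nodes_after = 14677952
vertex_increment = -3069532
variations["vertices"] = variation(nodes_after - vertex_increment, nodes_after)

# Average number of imported buildings per task.
buildings_per_task = 1053180 / 78661

for name, value in variations.items():
    print(f"{name}: {value:+.1f}%")
print(f"buildings per task: {buildings_per_task:.1f}")
```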

Qualitative results

This table shows the number of problems detected in each municipality.

Qualitative results
Code Municipality Fixmes (AG AP PM FC VG Total) JOSM errors (EG V2 VF IP Total) JOSM warnings (EE EI ND NP OD RM VP Total)
04101 Viator - - - - - - - - - - - - 2 - - - - - 2
05015 Arevalillo - - - - - - - - - - - - - - - - - - -
06900 Badajoz 13 2 - 7 - 22 - - - - - - 3 - - - - - 3
07040 Palma 6 52 - - - 58 1 - - - 1 6 11 1 - 1 - - 19
08900 Barcelona 27 33 2 6 - 68 1 - - 1 2 76 3 2 - - - 5 86
09361 Santa María del Campo - - - - - - - - - - - - - - - - - - -
10189 Torrecilla de los Ángeles - - - - - - - - - - - - - - - - - - -
12085 Oropesa 2 - - - - 2 - - - - - - - - - - - - -
13028 Campo de Criptana - - - - - - - - - - - - 10 - - - - - 10
14900 Córdoba 2 16 3 135 - 156 - - - - - 6 6 - - 1 - 1 14
16005 Albalate de las Nogueras - - - - - - - - - - - - - - - - - - -
17184 Sant Miquel de Fluvià - - - - - - - - - - - - - - - - - - -
19190 Ledanca - - - - - - - - - - - - - - - - - - -
22279 Salillas - - - - - - - - - - - - - - - - - - -
25070 Les Borges Blanques - - 1 - - 1 - - - - - - - - - - - - -
26069 Grañón - - - - - - - - - - - - - - - - - - -
28900 Madrid 90 200 4 4 1 299 4 1 1 - 6 376 29 1 1 56 1 10 474
29900 Málaga 10 5 2 - - 17 1 - - - 1 6 9 - - 20 - 3 38
30030 Murcia 9 12 21 6 - 48 - - - - - 4 19 1 - 3 - 1 28
35017 Las Palmas de G.C. 4 9 - - - 13 - - - - - 3 4 1 - - - - 8
37350 La Vellés - - - - - - - - - - - - 1 - - - - - 1
38023 S.C. de La Laguna 4 8 4 6 - 22 - - - - - 11 6 - - 6 - 2 25
39074 Santa María de Cayón 1 - - - - 1 - - - - - - - - - - - - -
41900 Sevilla 12 21 4 - - 37 1 - - - 1 8 4 - - 9 - 1 22
42095 Centenera de Andaluz - - - - - - - - - - - - - - - - - - -
44260 Valderrobles - - - - - - - - - - - - - - - - - - -
45123 Olías del Rey 1 - - - - 1 - - - - - - - - - - - - -
46900 Valencia 11 26 4 11 - 52 - - - - - 31 12 - - - - 2 45
47900 Valladolid 14 43 1 - - 58 - - - - - 4 3 - - - - - 7
49189 Quiruelas de Vidriales - - - - - - - - - - - - - - - - - - -
50900 Zaragoza 27 18 3 - - 48 2 - - - 2 6 13 - - - - - 19
51016 Cartagena 3 12 1 - - 16 - - - - - 1 1 - - 1 - - 3
52024 Gijón 7 8 - - - 15 - - - - - - 2 - - - - - 2
54057 Vigo 4 15 1 5 - 25 - - - - - 6 8 - - 1 - 2 17
55101 Ceuta 1 3 2 - - 6 - - - - - 2 3 - - - - - 5
56101 Melilla 2 4 - 71 - 77 - - - - - - - - - - - 1 1
Total 250 487 53 251 1 1 042 10 1 1 1 13 546 149 6 1 98 1 28 829


The meaning of the columns corresponds to the following key.

Keys for qualitative results
Fixmes
Key Meaning Total Potential no. Percentage
AG Area too big 250 2 886 266 0.0087%
AP Area too small 487 2 886 266 0.0169%
PM This part is bigger than its building 53 2 886 266 0.0018%
FC Missing building footprint for this part 251 2 886 266 0.0087%
VG GEOS validation 1 2 886 266 0.0000%
JOSM/Validator errors
Key Meaning Total Potential no. Percentage
EG Too big building 10 1 053 180 0.0009%
V2 Way with more than 2 000 nodes 1 2 886 266 0.0000%
VF Role verification problem 1 123 852 0.0008%
IP Intersection between multipolygon ways 1 123 852 0.0008%
JOSM/Validator warnings
Key Meaning Total Potential no. Percentage
EE Building inside a building 546 1 053 180 0.0518%
EI Intersecting buildings 149 1 053 180 0.0141%
ND Mixed type duplicated nodes 6 3 069 532 0.0002%
NP Nodes in the same location 1 14 677 952 0.0000%
OD Other duplicated nodes 98 3 069 532 0.0032%
RM Relations with the same members 1 123 852 0.0008%
VP Ways with same position 28 2 886 266 0.0010%

The percentage of problems detected, relative to the number of objects that could be affected by each problem, is very low. These are the possible types:

Fixmes

Number of corrections reported by the conversion tool in the OSM files.

  • Area too big: The building area is greater than the value of the 'warning_max_area' option in the file 'setup.py'.
  • Area too small: The building area is smaller than the value of the 'warning_min_area' option in the file 'setup.py'.
  • This part is bigger than its building: A building part that is larger than the building to which it belongs has been found.
  • Missing building footprint for this part: The building footprint did not pass the validation tests and was removed, so its building parts remain orphaned.
  • GEOS validation: The geometry did not pass the validation tests of the GEOS library.
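The two area fixmes amount to a threshold check. The following sketch uses the option names mentioned above, but the threshold values, the function names and the plain-Python area computation are assumptions made for illustration; coordinates are assumed to be projected, in metres.

```python
WARNING_MIN_AREA = 1.0      # m², assumed value for illustration only
WARNING_MAX_AREA = 30000.0  # m², assumed value for illustration only

def ring_area(ring):
    """Area of a closed ring of (x, y) tuples using the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def area_fixme(ring, min_area=WARNING_MIN_AREA, max_area=WARNING_MAX_AREA):
    """Return the fixme text for a building outline, or None if the area is in range."""
    area = ring_area(ring)
    if area < min_area:
        return "Area too small"
    if area > max_area:
        return "Area too big"
    return None

def square(side):
    """Closed square ring with the given side length, used as test geometry."""
    return [(0, 0), (side, 0), (side, side), (0, side), (0, 0)]

results = [area_fixme(square(s)) for s in (0.5, 10, 200)]
```

With the assumed thresholds, a 0.5 m square is flagged as too small, a 10 m square passes, and a 200 m square is flagged as too big.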

JOSM/Validator errors

Number of errors reported by the JOSM validator.

  • Too big building: It could be a false building.
  • Way with more than 2 000 nodes: It could be a false building.
  • Role verification problem: Multipolygon relation without inner rings.
  • Intersection between multipolygon ways: Topological errors not fixed by the correction algorithm.

Warnings in JOSM/Validator

Number of warnings reported by the JOSM validator.

  • Building inside a building: Topological errors not fixed by the correction algorithm.
  • Intersecting buildings: Topological errors not fixed by the correction algorithm.
  • Mixed type duplicated nodes: Duplicated nodes not fixed by the correction algorithm.
  • Nodes in the same location: Duplicated nodes not fixed by the correction algorithm.
  • Other duplicated nodes: Duplicated nodes not fixed by the correction algorithm.
  • Relations with the same members: Usually duplicated buildings with different uses or state of conservation.
  • Ways with same position: Usually duplicated buildings with different uses or state of conservation.

Accuracy with respect to aerial images

A manual inspection of the data against aerial images has been carried out. The process consisted of selecting a random sample of the task files generated by the program and reviewing them in JOSM, counting the number of elements that require some type of manual correction before import. The detected problems that need manual correction have been moved to this catalog.

Manual inspection
Code Municipality Reviewer Total tasks (Rustic, Urban) Tasks reviewed (Rustic, Urban) Objects (Total, Reviewed, Corrections, Percentage)
29900 Málaga Daniel Capilla (talk) 55 5 678 4 17 152 128 1 524 110 6.0%
38023 S.C. de La Laguna Javiersanp (talk) 65 2 710 3 28 77 853 811 158 19.5%
39005 Potes Jesús Gómez (talk) 3 54 1 15 1 151 298 103 34.5%
06074 Llerena Matías Taborda 22 169 10 177
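The inspection procedure (draw a random sample of task files, review them, and compute the share of objects needing correction) can be sketched as follows. The file names and function names are hypothetical; the example rate uses the reviewed and corrected object counts reported for S.C. de La Laguna.

```python
import random

def sample_tasks(task_files, k, seed=None):
    """Draw up to k task files at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(task_files, min(k, len(task_files)))

def correction_rate(reviewed_objects, corrections):
    """Percentage of reviewed objects that needed a manual correction."""
    return 100.0 * corrections / reviewed_objects

# Hypothetical task file names; the real names depend on the converter's output.
urban_tasks = [f"urban-task-{i:04d}.osm" for i in range(2710)]
to_review = sample_tasks(urban_tasks, k=28, seed=1)

# Objects reviewed and corrected in S.C. de La Laguna, from the table above.
rate = correction_rate(reviewed_objects=811, corrections=158)
```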

Comparison of CatAtom2Osm and Cat2Osm2

Cat2Osm2 was a great tool with a lot of work behind it, and it was used in the first attempt to import data from the Spanish Cadastre. It gave many of us access to the Cadastre data and exposed problems that we want to correct before importing this time. These are some arguments in favor of replacing it with CatAtom2Osm.

  • Example of a block of raw data.
  • Example of a block transformed with Cat2Osm2.
  • Example of a block transformed with CatAtom2Osm.

These screenshots show a block with the raw data (on the left), transformed using Cat2Osm2 (center) and transformed using CatAtom2Osm (on the right). Running the validation tool on the Cat2Osm2 output reports 14 problems, versus 0 for the CatAtom2Osm output. The reason is that the Cadastre data includes topological problems, which with Cat2Osm2 had to be corrected manually.

As for the number of elements, the Cat2Osm2 result needs 128 ways and 606 nodes, while the CatAtom2Osm result uses 2 relations, 125 ways and 481 nodes. The number of nodes is lower because more unnecessary nodes on straight lines are cleaned up. The two additional relations are needed to represent buildings and parts with holes. Their absence caused the "Building inside a building" warning in the validator.

  • Detail of a block of raw data.
  • Detail of a block transformed with Cat2Osm2.
  • Detail of a block transformed with CatAtom2Osm.

This series of screenshots shows the topological problems and the unnecessary nodes in more detail.

Cat2Osm2 took the addresses from the Cadastre and made some corrections to the thoroughfare names, such as the use of uppercase and lowercase. It was then necessary to select and correct each street in each of the import files into which the data were divided, which was tedious and repetitive work. CatAtom2Osm collects the thoroughfare names from OSM and combines them with the Cadastre addresses. The result is reviewed globally before the names appear in the task files, which reduces the effort required.

When Cat2Osm2 started, Simple 3D Buildings was a recent scheme and was not applied. The different levels of a building were converted into individual buildings that together add up to the footprint of the real building. This works well for 3D rendering, but it gives an incorrect result if you want to count the number of buildings or assign properties to a building as a whole. In CatAtom2Osm, each real building is a way or multipolygon relation, and the different levels are represented as building parts.
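The difference between the two models shows up as soon as buildings are counted. The sketch below represents each OSM element only by its tags; the tag values are illustrative and the exact tags written by either tool are an assumption here.

```python
def count_buildings(elements):
    """Count elements carrying a 'building' tag; building parts are not counted."""
    return sum(1 for tags in elements if "building" in tags)

# One real building with three height zones, each level mapped as its own building.
levels_as_buildings = [
    {"building": "yes", "building:levels": "1"},
    {"building": "yes", "building:levels": "3"},
    {"building": "yes", "building:levels": "5"},
]

# The same building in the Simple 3D Buildings style: one footprint plus parts.
footprint_with_parts = [
    {"building": "yes"},
    {"building:part": "yes", "building:levels": "1"},
    {"building:part": "yes", "building:levels": "3"},
    {"building:part": "yes", "building:levels": "5"},
]

count_old = count_buildings(levels_as_buildings)
count_new = count_buildings(footprint_with_parts)
```

The first model counts three buildings for one real building, while the footprint-plus-parts model counts one.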

Cat2Osm2 divided the data into files by block. Some blocks are adjacent and their buildings have common walls. When importing the data block by block, it was necessary to manually merge the nodes of these walls with the data already imported. CatAtom2Osm avoids this problem by merging adjacent blocks before dividing the data into tasks.
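One way to picture the merging of adjacent blocks is to group blocks that share at least one node, for instance with a union-find structure. This is a simplified sketch, not CatAtom2Osm's actual algorithm: the input layout is hypothetical, and sharing a single node is used as a stand-in for sharing a wall.

```python
def merge_adjacent_blocks(blocks):
    """Group blocks that share at least one node.

    blocks: dict block_id -> set of (x, y) node coordinates (hypothetical layout).
    Returns a sorted list of groups, each a sorted list of block ids.
    """
    parent = {block_id: block_id for block_id in blocks}

    def find(block_id):
        # Follow parent links to the group representative, compressing the path.
        while parent[block_id] != block_id:
            parent[block_id] = parent[parent[block_id]]
            block_id = parent[block_id]
        return block_id

    owner = {}  # first block seen for each node
    for block_id, nodes in blocks.items():
        for node in nodes:
            if node in owner:
                parent[find(block_id)] = find(owner[node])
            else:
                owner[node] = block_id

    groups = {}
    for block_id in blocks:
        groups.setdefault(find(block_id), []).append(block_id)
    return sorted(sorted(group) for group in groups.values())

blocks = {
    "A": {(0, 0), (1, 0), (1, 1), (0, 1)},
    "B": {(1, 0), (2, 0), (2, 1), (1, 1)},  # shares the wall (1,0)-(1,1) with A
    "C": {(5, 5), (6, 5), (6, 6), (5, 6)},  # isolated block
}
groups = merge_adjacent_blocks(blocks)
```

Here blocks A and B end up in the same group, so their common wall would be handled within a single task, while C stays on its own.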

As an additional bonus, CatAtom2Osm automatically downloads the data from the ATOM services, whereas Cat2Osm2 requires a digital certificate to access the download page. CatAtom2Osm also takes about half the processing time, as the following measurements for municipality 38023 show.

   /usr/bin/time -v java -jar cat2osm2.jar 38023
   Command being timed: "java -jar cat2osm2.jar 38023"
   Elapsed (wall clock) time (h:mm:ss or m:ss): 28:00.84
   Maximum resident set size (kbytes): 2157576
   Minor (reclaiming a frame) page faults: 70411
   Voluntary context switches: 107306
   Involuntary context switches: 18901
   Swaps: 0
   File system inputs: 16544
   File system outputs: 2007656
   /usr/bin/time -v catatom2osm -btd 38023
   Command being timed: "catatom2osm -btd 38023"
   Elapsed (wall clock) time (h:mm:ss or m:ss): 13:08.88
   Maximum resident set size (kbytes): 2812460
   Minor (reclaiming a frame) page faults: 1236035
   Voluntary context switches: 445
   Involuntary context switches: 4681
   Swaps: 0
   File system inputs: 37192
   File system outputs: 10066256
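The time saving can be derived from the two "Elapsed (wall clock) time" lines above. The helper below is a small sketch written for this page; it only parses the `m:ss` or `h:mm:ss` format printed by `/usr/bin/time -v`.

```python
def elapsed_seconds(text):
    """Convert an 'h:mm:ss' or 'm:ss' elapsed-time string into seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds

cat2osm2_time = elapsed_seconds("28:00.84")     # Cat2Osm2 run
catatom2osm_time = elapsed_seconds("13:08.88")  # CatAtom2Osm run

# Percentage of wall-clock time saved by CatAtom2Osm in this measurement.
saving = 100.0 * (1 - catatom2osm_time / cat2osm2_time)
print(f"{cat2osm2_time:.2f} s vs {catatom2osm_time:.2f} s, saving {saving:.0f}%")
```

This is a single measurement on one municipality, so the figure should be read as indicative only.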

On the downside, CatAtom2Osm does not import parcels.