Romania CLC Import

From OpenStreetMap Wiki
CLC land cover import for Romania
Status: Proposed (under way)
Proposed by: stefanu
Tagging: landuse=*
Applies to: <type>
Definition: Land cover import acceptance
Rendered as: <appearance>
Draft start:
RFC start: *
Vote start: 2010-01-12
Vote end: 2010-02-28

What is this all about

This is what the Garmin-format map looks like after building it with Solution #1; for a close-up, see the image below
Detail of the Garmin-format map built with Solution #1; the image shows the Danube at Turnu Severin, with data that extends beyond the Romanian border; OSM data, however, is present on the Romanian side only
Area around Campulung Moldovenesc, mapped in great detail but almost impossible to match to CLC with reasonable effort
Same area with CLC data instead of OSM land use
A close-up comparison; display at full resolution to see the details
Overlapped CLC and 'cultura.ro' imports; red shows CLC data, which seems more precise; yellow shows 'cultura.ro' data that may be used for settlement boundaries
CLC and existing OSM data around Sighisoara; forest polygons match almost perfectly; the blue contour is CLC data

This page discusses various options for importing large amounts of data from the Corine Land Cover project. Several tests have been performed, resulting in a valid map in Garmin IMG format (using splitmap and mkgmap).

Corine Land Cover data is in the public domain; in addition, express permission has been granted for this import (read about it here). Other mapping communities, most notably in France, have also been importing data from the same source.

About land cover

The Corine Land Cover project (CLC for short) provides land cover data for the whole European Union in ESRI shapefile format, free of charge and with no licensing restrictions. Simply put, it is a very large set of tagged polygons that covers the entire surface of the EU. The polygons are split into about 45 categories, almost all of which are present inside the Romanian border contour.

Data processing

The ESRI shapefiles have been processed with ogr2ogr for the coordinate translation, then with a custom processing utility written in C++. The utility uses a country contour polygon, optionally expanded to cover the area just 'outside' the territory. It then computes the intersection of this contour with every CLC polygon, resulting in a polygon set that covers 100% of the given territory.
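The clipping step described above can be illustrated with a minimal sketch. It is not the C++ utility used for the import, whose algorithm is not described on this page; it assumes a simplified case where the clip region is an axis-aligned rectangle and uses Sutherland-Hodgman clipping. A real country contour is not convex, so a general polygon clipper would be needed there. The function names `clip_to_rect` and `area` are made up for this example.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]


def clip_to_rect(polygon: List[Point], xmin: float, ymin: float,
                 xmax: float, ymax: float) -> List[Point]:
    """Clip a polygon to an axis-aligned rectangle (Sutherland-Hodgman).

    Assumption: the rectangle stands in for the country contour; a real
    contour is non-convex and needs a general-purpose clipper.
    """

    def clip_edge(points: List[Point],
                  inside: Callable[[Point], bool],
                  intersect: Callable[[Point, Point], Point]) -> List[Point]:
        result: List[Point] = []
        for i, cur in enumerate(points):
            prev = points[i - 1]  # wraps around to the last vertex for i == 0
            if inside(cur):
                if not inside(prev):
                    result.append(intersect(prev, cur))  # entering the region
                result.append(cur)
            elif inside(prev):
                result.append(intersect(prev, cur))      # leaving the region
        return result

    def at_x(x: float) -> Callable[[Point, Point], Point]:
        # Intersection of segment p-q with the vertical line at x.
        def f(p: Point, q: Point) -> Point:
            t = (x - p[0]) / (q[0] - p[0])
            return (x, p[1] + t * (q[1] - p[1]))
        return f

    def at_y(y: float) -> Callable[[Point, Point], Point]:
        # Intersection of segment p-q with the horizontal line at y.
        def f(p: Point, q: Point) -> Point:
            t = (y - p[1]) / (q[1] - p[1])
            return (p[0] + t * (q[0] - p[0]), y)
        return f

    edges = [
        (lambda p: p[0] >= xmin, at_x(xmin)),
        (lambda p: p[0] <= xmax, at_x(xmax)),
        (lambda p: p[1] >= ymin, at_y(ymin)),
        (lambda p: p[1] <= ymax, at_y(ymax)),
    ]
    points = list(polygon)
    for inside, intersect in edges:
        if not points:
            break  # polygon lies completely outside the rectangle
        points = clip_edge(points, inside, intersect)
    return points


def area(polygon: List[Point]) -> float:
    """Unsigned polygon area computed with the shoelace formula."""
    total = 0.0
    for i, (x1, y1) in enumerate(polygon):
        x2, y2 = polygon[(i + 1) % len(polygon)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0
```

Clipping every CLC polygon this way and dropping the empty results leaves only the parts that fall inside the clip region, which is the effect the intersection step aims for.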

The problem

The main issue is the existing land cover polygons in the OSM database, which users have worked hard to contribute. Some of this data, although very detailed, covers only a small clipped perimeter (as in the sample image) and will be hard to correlate with CLC data. In other areas, such as the surroundings of Sighisoara, the existing forest areas almost match the CLC data.

Using a December planet extract for Romania, the following figures have been computed; as more up-to-date and verified counts become available, they will be listed here.

Data | Count
Existing OSM land cover ways or relations | 7,193 (more details here)
Users creating existing land cover | 184, some being bots
CLC polygons | 312,591 (does not cover all land cover types)
CLC points for above polygons | 17,558,483 (does not cover all land cover types)

Solution #1

Deletion of all existing land cover polygons in the OSM database, and replacing them with CLC data

Pros :

  • good precision
  • 100% territory coverage

Cons :

  • Requires a massive deletion of data contributed by OSM surveyors

Solution #1 improved

(see the comment below by Strainu) Re-tagging of all existing land cover polygons in the OSM database to make them invisible, and adding CLC data as visible items

Pros :

  • good precision
  • 100% territory coverage

Cons :

  • Some work will eventually be needed to merge the data
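The re-tagging idea behind "Solution #1 improved" can be sketched as follows. This is only an illustration: the key prefix `pre_clc:`, the note text and the function name `hide_landcover` are hypothetical, and this page does not say which tags the actual import used. The sketch assumes that renaming the `landuse` key is enough for renderers to ignore the polygon, and it adds a note so that the original contributor can understand and revert the change.

```python
from typing import Dict

# Assumption: only the landuse key is treated as land cover here.
LANDCOVER_KEYS = ("landuse",)
# Hypothetical prefix; the real import may have used a different one.
HIDDEN_PREFIX = "pre_clc:"
# Hypothetical note text explaining the change to other mappers.
DEFAULT_NOTE = "land cover tag renamed during CLC import; revert if this data is better"


def hide_landcover(tags: Dict[str, str], note: str = DEFAULT_NOTE) -> Dict[str, str]:
    """Return a copy of the tags with land cover keys renamed.

    Other tags are kept unchanged. A note is added only when something
    was renamed and the element has no note already.
    """
    new_tags: Dict[str, str] = {}
    changed = False
    for key, value in tags.items():
        if key in LANDCOVER_KEYS:
            new_tags[HIDDEN_PREFIX + key] = value
            changed = True
        else:
            new_tags[key] = value
    if changed:
        new_tags.setdefault("note", note)
    return new_tags
```

Because the original value is preserved under the prefixed key, undoing the change for a single polygon amounts to renaming the key back.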

Solution #2

Full import of all CLC data, even if it overlaps with existing land cover polygons

Pros :

  • good precision
  • 100% territory coverage

Cons :

  • Produces overlapping polygons
  • A lot of work needed to manually inspect polygons and delete duplicates, possibly ending up deleting a large amount of the existing data, since aligning old and CLC polygons may be tricky.

Solution #3

Selective import of CLC data, ruling out polygons that overlap with existing data

Pros :

  • No existing data is deleted

Cons :

  • Poor territory coverage
  • A lot of work needed to manually inspect the result
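The selection step of Solution #3 could look roughly like the sketch below. It assumes a deliberately coarse overlap test based on bounding boxes, which rejects some CLC polygons that do not actually touch existing data; an exact geometric test and a spatial index would be needed for a data set of this size. All function names are made up for this example.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def bbox(polygon: List[Point]) -> Box:
    """Axis-aligned bounding box of a polygon."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    return (min(xs), min(ys), max(xs), max(ys))


def boxes_overlap(a: Box, b: Box) -> bool:
    """True if the two boxes share any area or touch along an edge."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def select_non_overlapping(clc_polygons: List[List[Point]],
                           existing_polygons: List[List[Point]]) -> List[List[Point]]:
    """Keep only CLC polygons whose bounding box misses all existing data.

    Assumption: a bounding-box check is an acceptable first filter; it is
    conservative and never keeps a polygon that really overlaps.
    """
    existing_boxes = [bbox(p) for p in existing_polygons]
    selected = []
    for polygon in clc_polygons:
        box = bbox(polygon)
        if not any(boxes_overlap(box, other) for other in existing_boxes):
            selected.append(polygon)
    return selected
```

The polygons filtered out this way are exactly the ones that would still need manual inspection, which is where the poor coverage listed above comes from.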

Voting

Enter your vote here

Solution #1 Solution #1 improved Solution #2 Solution #3 Import should not be performed Other
--stefanu 12:25, 11 March 2010 (UTC)
--diciu 09:50, 16 March 2010
--indreias 09:52, 16 March 2010
--rares.rusu 10:10, 16 March 2010
--Strainu 10:08, 16 March 2010 (UTC)
--manuelciosici 07:51, 30 July 2010 (UTC)


Write your votes and opinions here

  • --stefanu 09:13, 12 January 2010 (UTC) Looking at the final result, solution #1 looks best, even if it means user contributions will be replaced with a massive import.
  • --twist3r 19:52, 13 January 2010 . Solution #1 seems to be the best at the moment.
  • --Cipt2001 18:29, 13 January 2010 (UTC) I agree with solution #1 (which seems to be the fastest), but I would like to create an OSM extract of the data to be deleted (probably using XAPI)
  • --stefanu 18:42, 13 January 2010 (UTC) in any case, a backup of the modified/deleted data should be created
  • --owene 22:20, 13 January 2010 (UTC) I'm against #1 because in OSM philosophy we should improve existing, surveyed data, not delete and replace it with other, possibly less accurate data just because of the higher available volume. Then why survey? What if we get something better next? Do we delete the imported CLC data? A backup of deleted data is not a solution, as backups get out of sight and forgotten. I am for a modified solution #3. In areas where there is no data, CLC improves the map. In areas where there is possibly accurate or highly accurate data (see the C-lung Moldvenesc region), CLC data should be imported with modified tags (CLC:landuse=forest instead of landuse=forest) so it will be in OSM, hidden, and can be easily compared and used if it improves the map. Unused CLC data, for example data that still has CLC:landuse, should be deleted after 1 year.
  • --diciu 16:36, 9 March 2010. I like Solution #3.
  • --rares.rusu 16:46, 9 March 2010. I propose to go with #1. From my experience in OSM at the current rate of manual additions of land cover we will never have a good-enough land cover map. So importing everything from a trusted source will bring a HUGE benefit to the way OSM looks + it will make it more appealing to use, thus making it more popular.
  • --indreias 17:37, 9 March 2010. I prefer solution #3 - please keep the voting till the end of this week and post the final decision.
  • --Strainu 19:06, 10 March 2010 (UTC) I prefer the solution proposed by owene, only reversed - keep the CLC data visible and modify the tags on existing landcover data. Plus, add a note explaining the change. This way, users that introduced the landcover data and knew what they were doing can choose to revert the change if CLC data is inferior to their own. If I had to choose strictly from the 3 options, I would choose #3.
  • --stefanu 11:56, 11 March 2010 (UTC). Solution #3 reversed proposed by Strainu is like Solution #1 without delete, just re-tagging. This may indeed be the best compromise of all.

Final conclusion

Based on the limited feedback on this page and on the Romanian talk list, the accepted solution is "Solution #1 improved". More information about the import stages can be found here: Corine Land Cover Romania 2006

The aftermath

The import finished on December 7th, 2010. A number of manual adjustments still have to be made. Please see Corine Land Cover for Romania aftermath for more details.