Somalia/Somalia Imports/Somalia UNSOS waterways import

From OpenStreetMap Wiki

The Somalia UNSOS waterways import is an import of waterway lines covering the southern part of Somalia. The import has passed all the Import Guidelines steps, and it is now an ongoing crowdsourced task.

Goals

For years the United Nations Global Service Centre (UNGSC) has been extracting data of different countries in Africa for the United Nations peacekeeping missions. Some of the data sets are still quite up to date and usable for importing into OSM.

One of these files is the waterway set produced by the United Nations Support Office in Somalia (UNSOS), which covers around 108,000 km of waterways in total. The data were generated using SPOT satellite imagery.

UNSOS Waterways coverage.

The Unite Maps Initiative and its community, UN Mappers, aim to import these waterways into OSM in a crowdsourced way: merging them with the existing data, improving their tagging and/or geometries when needed, and dismissing those that aren't correct, using aerial imagery for validation.

Schedule

  1. Preparation, discussion - started March 30th 2020.
    1. Discussion with the Somalian OSM community and the HOT list: 3 days, started April 3rd 2020. It got some feedback from one user of the HOT mailing list, but by private message only.
    2. Discussion in the Imports list: It lasted from the 7th to the 15th of April 2020.
  2. This data import won't be a blind import, but a crowdsourced one, where users are expected to check the validity of the data against aerial imagery, and correct the geometry and/or tagging of each waterway according to that imagery. Any user is welcome, but solid experience with waterway mapping is required.
  3. Bearing in mind the quantity of data, this import will be split into several files and done one file after the other, so we expect it to take several months.

Import data

Data description

The original file is in shapefile format, and consists of around 108,000 km of waterways across the southern part of Somalia. UNSOS has given permission to import the data into OSM.

The data was generated by human visual feature extraction from SPOT imagery, for the production of topographic maps at 1:50,000 scale for the UN peacekeeping missions in various African countries. Feature extraction and data validation were mainly performed by people without direct knowledge of the area, but with support from local GIS specialists working in the field.

Only the waterway segments that are new to OSM will be imported, except when these UN waterway segments are a clear improvement of poor OSM ways. Some cases of this are: an OSM waterway traced using low resolution aerial imagery (Landsat...), an OSM waterway traced with cloudy or outdated imagery, etc. In these cases the UN waterways will replace these wrong OSM segments, keeping the history as much as possible, and hence improving the OSM database.

Background

Data source site: https://www.dropbox.com/s/ivj8m6oyabjc9y2/UNSOS_waterways.zip?dl=0
Data license: UNSOS has given permission to import the data into OSM to UNGSC, by internal UN email.
OSM attribution: https://wiki.openstreetmap.org/wiki/Contributors#United_Nations_Support_Office_in_Somalia_.28UNSOS.29
ODbL Compliance verified: yes

OSM Data Files

Here you can download the whole data set in .osm.pbf format, but without being simplified with the Simplify Way JOSM tool yet.

For a simplified one, you can check the file for the first project in the Tasking Manager.

Import type

The best approach to import this data is manually by the community, with data split in a number of files, that will correspond to an equal number of projects in the HOT Tasking Manager. Advantages:

  1. Any volunteers may join the import effort, at any time (although only skilled mappers should join).
  2. We can check, at any time, the mapping progress of the import.
  3. We can easily validate each task too, and check the validation progress.
  4. Easy to set up.

We will provide a link for each Tasking Manager project in a table of the import workflow wiki, with info about the total size (ways, nodes, km) and expected time of work for each one. We will open the different projects progressively, when the previous jobs are being finished.

Data preparation

Data reduction & simplification

As explained above, UNSOS waterways that already have an OSM counterpart of equal or better quality won't be imported. Data reduction will therefore be done by the users involved in the import, before uploading.

As for simplification, all ways will be simplified with the JOSM Simplify Way (Shift+Y) tool before the import process starts, so the users participating in the import won't have to bother about this. This reduces the number of nodes by around 25%, while keeping the simplified ways within 3 m of the original ones.
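To illustrate what a 3 m error threshold means, here is a minimal sketch of a Douglas-Peucker style simplification. It is not JOSM's own code, and JOSM's Simplify Way tool may differ in its details. The function names are hypothetical, and the sketch assumes the node coordinates have already been projected to planar metres.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b; all are (x, y) tuples in metres."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        # Degenerate segment: a and b are the same point.
        return math.hypot(px - ax, py - ay)
    # Projection of p onto the segment, clamped to its two ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def simplify(points, max_error=3.0):
    """Drop nodes that lie within max_error metres of the simplified line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    worst_index, worst_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > worst_dist:
            worst_index, worst_dist = i, d
    if worst_dist <= max_error:
        # Every intermediate node is close enough: keep only the end nodes.
        return [first, last]
    # Otherwise keep the farthest node and simplify both halves recursively.
    left = simplify(points[:worst_index + 1], max_error)
    right = simplify(points[worst_index:], max_error)
    return left[:-1] + right

# Hypothetical way: the node at (10, 1) is only 1 m off the line and is removed.
way = [(0, 0), (10, 1), (20, 0), (30, 10), (40, 0)]
simplified = simplify(way, max_error=3.0)
```

With this toy input, four of the five nodes are kept, because the sharp bend at (30, 10) lies well beyond the 3 m threshold.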

There is 1 single way (out of the total of 44,360 ways) that was extracted using Google imagery (the one with the TXT=Based on Google Imagery tag in the original dataset). That way has been verified using Maxar imagery, so it won't be an issue.

Tagging plans

The original tagging follows the Technical Reference Document (TRD) of the Multinational Geospatial Co-production Program (MGCP).

Most of the tags of the original dataset aren't relevant, so we ignore them. Here we list all the original tags with their corresponding translation into the OSM tagging schema, plus additional tags for all segments:

UNSOS key | UNSOS values meaning | OSM tag | Comments
ACC=1/2 | ACC=1 means accurate (44,315 ways) and ACC=2 means approximate (45 ways). | We ignore it. | Information not relevant. The only 45 ways tagged as approximate will be checked by users like the others.
ACE=-32767.0/-32768.0 | Unknown. | We ignore it. |
ACE_EVAL=21 | All ways have the same value ACE_EVAL=21, which means FZD: Evaluation deferred. | We ignore it. | Information not relevant.
ALE=-32765.0/-32768.0 | Unknown. | We ignore it. |
ALE_EVAL=21/998 | ALE_EVAL=21 (135 ways) means FZD: Evaluation deferred and ALE_EVAL=998 (44,225 ways) means Not Applicable. | We ignore it. | Information not relevant.
CDA=-32768/0/1000 | Unknown. | We ignore it. |
CPYRT_NOTE=* | | We ignore it. | Not relevant. UNSOS has given permission to import the data to OSM.
FCSubtype=0/1/2 | It's equivalent to F_CODE=*. | We ignore it. |
FUN=-32768/6 | All ways except one have the value FUN=-32768, of unknown meaning. FUN=6 means Fully Functional. | We ignore it. | Information not relevant.
F_CODE=BH020/BH030/BH140 | F_CODE=BH020 (1 way only) means Ditch, F_CODE=BH030 (2,307 ways) means Canal and F_CODE=BH140 (42,052 ways) means River. | All ways with F_CODE=BH020/BH030 will be translated to waterway=ditch, and ways with F_CODE=BH140 will be translated as either waterway=river or waterway=stream, depending on the value of the HYP=* tag (see HYP=* further down). | There are barely any canals in Somalia, and a random check of the ways with F_CODE=BH030 shows that they are generally waterway=ditches. The ways with F_CODE=BH140 are either rivers or streams, so we use HYP=* to decide between waterway=river and waterway=stream.
GFID=* | Unknown. | We ignore it. |
HYP=1/2/4 | HYP=1 means Perennial, HYP=2 means Intermittent and HYP=4 means Dry in the original dataset. | intermittent=yes for all waterways. For waterways with F_CODE=BH140 and HYP=1 or HYP=2 we add waterway=river, and for waterways with F_CODE=BH140 and HYP=4 we add waterway=stream. | This tag is difficult to translate to different OSM tags. The huge majority of the waterways are intermittent. In the cases where they aren't, the intermittent=yes tag will be deleted by the users during the import.
LBV=-32767.0 | Unknown. | We ignore it. |
LOC=-32768/44 | All ways except one have the value LOC=-32768, of unknown meaning. LOC=44 means On Surface. | We ignore it. | Information not relevant.
NAM=* | NAM=UNK (41,213 ways) means Unknown, NAM=N/A (2,280 ways) means Not Available and NAM=Null String (27 ways) is self-descriptive. | name=* or no tag. | We have 840 ways with 277 different names. For all ways with NAM=UNK, NAM=N/A or NAM=Null String we won't add this tag.
NFI=N/A/N_A/Null String | Not available or Null String. | We ignore it. |
NFN=* | It's identical to NFI=*. | We ignore it. |
NVS=32768/0 | There is only one way with value NVS=0, which means Unknown. | We ignore it. |
OBJECTID=* | | We ignore it. |
RBV=-32767 | Unknown. | We ignore it. |
SHAPE_Leng=* | | We ignore it. |
SHL=-32768/0/8 | Unknown. | We ignore it. |
SHR=-32768/0/8 | Unknown. | We ignore it. |
SRC_DATE=* | It's the source date of the way. | source:date=* | 93 different values, from source:date=2005-12-28 up to source:date=2015-06-25.
SRC_INFO=* | Imagery used for the data extraction. For 139 ways this info is not available. For the rest, 34,886 used SPOT5 images, 8,777 SPOT6 images, and 558 SPOT7 images. | source=UNSOS |
SRC_NAME=0/110/112 | SRC_NAME=0 (139 ways) means Source is not known, SRC_NAME=110 (9,321 ways) means Very High Resolution Commercial Monoscopic Imagery, and SRC_NAME=112 (34,900 ways) means High Resolution Commercial Monoscopic Imagery. | We ignore it. | We've checked the 139 ways. Most of them are very short segments that don't present any problem and can be checked against imagery.
TID=-32768/0/1000/1001 | Unknown. | We ignore it. |
TIER_NOTE=* | | We ignore it. |
TXT=* | | We ignore it. | Only one way has TXT=Based on Google Imagery. We've checked it against Maxar imagery, and it's correct, so no problem.
UPD_DATE=N_A | Not available. | We ignore it. |
UPD_INFO=N_A | Not available. | We ignore it. |
UPD_NAME=0/998 | UPD_NAME=0 means Source is not known and UPD_NAME=998 means There is no possible value in the attribute range that would be applicable. (May occur when the attribute is not applicable to the feature type (for example: the Airfield Type attribute of a Settlement feature type).) | We ignore it. |
WCC=-32768/0/1/4/7 | Unknown. | We ignore it. |
WID=-32765.0/-32767.0/5.0/10.0/15.0/20.0/24.0 | Unknown, but its meaning is probably the width of the waterway in meters. | We ignore it. |
WCC=-32768/0/1/2/998 | Unknown. | We ignore it. |
WatrcrsL_M=2/3/4/5/6 | Unknown. | We ignore it. |
ZVAL_TYPE=0/3 | ZVAL_TYPE=0 (139 ways) means Unknown and ZVAL_TYPE=3 (44,221 ways) means Feature is 2D only. | We ignore it. | Information not relevant.
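The translation rules in this table can be summarised as a small function. This is only a sketch of the mapping described above, not the procedure actually used (the tags were translated in JOSM). The function name is hypothetical, and it assumes that the attributes of each way are available as a dictionary, that HYP is stored as an integer or an integer-like string, and that SRC_DATE already has the form shown in the table.

```python
# NAM values that mean "no usable name" according to the table.
MISSING_NAMES = {"UNK", "N/A", "Null String"}

def translate_tags(unsos):
    """Translate the attributes of one UNSOS way into the proposed OSM tags."""
    # Tags added to every segment.
    osm = {"intermittent": "yes", "source": "UNSOS"}

    f_code = unsos.get("F_CODE")
    hyp = str(unsos.get("HYP", ""))
    if f_code in ("BH020", "BH030"):
        # Ditch and Canal both become waterway=ditch.
        osm["waterway"] = "ditch"
    elif f_code == "BH140":
        # River: HYP decides between river and stream.
        if hyp in ("1", "2"):
            osm["waterway"] = "river"
        elif hyp == "4":
            osm["waterway"] = "stream"
        else:
            raise ValueError(f"unexpected HYP value: {hyp!r}")
    else:
        raise ValueError(f"unexpected F_CODE value: {f_code!r}")

    # Only real names are kept.
    name = unsos.get("NAM")
    if name and name not in MISSING_NAMES:
        osm["name"] = name

    # SRC_DATE is copied to source:date unchanged (format is an assumption).
    src_date = unsos.get("SRC_DATE")
    if src_date:
        osm["source:date"] = src_date
    return osm

# Hypothetical input record, for illustration only.
example = translate_tags(
    {"F_CODE": "BH140", "HYP": 4, "NAM": "UNK", "SRC_DATE": "2005-12-28"}
)
```

All other UNSOS attributes are simply dropped, which matches the "We ignore it" rows.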

Changeset tags

We will use the following changeset tags:

where:

  • imageryProvider = Maxar, Bing, Maxar;Bing or whatever other list of imagery providers used while importing the data.
  • NUMBEROFPROJECT is the Tasking Manager project number.

Example:

Data transformation

  1. The original file (in shapefile format) was opened with JOSM + Open Data plugin, and saved in osm format.
  2. We deleted all non-relevant tags and translated the info of the rest to produce the proposed tags.
  3. The resulting file will be divided in pieces, one for each Tasking Manager project, again with the JOSM editor.
  4. Finally, we apply the JOSM Simplify Way tool (Shift+Y) with a 3 m maximum error to all waterways of each file. This step reduces the number of nodes by roughly 25%.

Data transformation results

You can download here the file for the first project in the Tasking Manager.

Data merge workflow

Team approach

This import (data integration) will be done through the HOT Tasking Manager, so the number of people importing the data is unknown. We don't expect a big number of users, as we require them to be experienced mappers. Among the skills required:

  • Good experience with JOSM.
  • Experience with previous imports through the Tasking Manager is not necessary, but a plus.
  • S/he knows how to use JOSM filters.
  • Skilled at working with waterways. S/he needs to know well how to merge nodes (M), join nodes to ways (J), combine ways (C) and unglue ways (G).
  • S/he knows how to use the Download Along... JOSM option (Alt+Shift+D).
  • S/he knows how to use the Replace Geometry tool of the UtilsPlugin2 JOSM plugin (Ctrl+Shift+G), and why it is so useful.
  • S/he knows how to use the ToDo JOSM plugin.
  • S/he knows how to create and deal with waterway relations.
  • S/he knows how to deal with conflicts.

References

Following the import guidelines, this import was first discussed on the local Somalian OSM mailing list, and to get more feedback it was also shared with the HOT mailing list.

With the inputs from those lists, a thread was opened in the imports list.

Workflow

We will use the JOSM editor for this import.

Step by step instructions

There are several ways to reach the same results. We have tried to find the simplest and least error-prone one, which also ensures a higher level of consistency across different import volunteers. The proposed workflow is as follows:

  1. Choose one task of one of the TM import projects. The user will be presented with 2 sets of data (the OSM data and the new UNSOS waterway segments to be imported) for the task square in two different layers, which we will merge into a single layer.
  2. Using filters, we download the areas around the UNSOS segments that extend beyond the downloaded area, using the Download along... JOSM option.
  3. We go segment by segment using the Todo list JOSM plugin. For those segments that aren't yet mapped in OSM, we:
    1. Always bear in mind that UNSOS segments are meant to help us improve the hydro coverage of the map areas, but it's us mappers who decide whether a waterway is a good candidate to be integrated in the OSM database or not. So don't hesitate to ignore (delete) those segments that you consider shouldn't be imported, as you would if you were mapping waterways from scratch.
    2. Check if the waterway=* tag is ok according to the aerial imagery. If not, we change the value to the best one. Same for the intermittent=* tag.
    3. Go all along the segment to check if the accuracy is just acceptable. If not acceptable in an area, we correct it. We can do that using the Improve Way Accuracy tool (W).
    4. Although rare, some waterways may have a wrong flow direction, and have to be reversed (R). In case we are unsure about the direction of the water flow, we will try to get a general view of the whole area, and if still in doubt we will add a fixme=Please check direction tag.
    5. Quite often we will find waterways that end in deserted areas, in a plain, or similar area, seeping inside the ground, before reaching (connecting) another waterway or the sea shore. In these cases, we will add a waterway=stream_end tag to the last waterway node. This tag is well documented in the wiki, so if you get a JOSM warning you can safely ignore it.
    6. Then, we have to check crossings with all kinds of highways (this includes highway=tracks and highway=paths). For the huge majority of the crossings you will have to add either bridge=*+layer=1 or ford=yes to the highway, depending on the crossing. If it is a ford=*, make sure you join the crossing with a node. If the waterway is wide enough, you may consider mapping the ford as a way all across the riverbed.
    7. But for some waterways that cross a highway through a culvert (this happens sometimes with waterway=ditches, for example), you will have to cut the segment of the waterway that makes the culvert and add the tunnel=culvert+layer=-1 tags. If unsure about the crossing, you can add a fixme=Please check this waterway crossing tag.
    8. If a highway shares part of its path with the riverbed of a waterway, please add the flood_prone=yes tag to the segment that goes inside the riverbed.
    9. If a UNSOS waterway segment is part of a longer waterway, we combine it with the rest of the segments into a single way, but be careful if one has a name=* tag and the other doesn't. If we have to combine it with an OSM way, we will generally keep the tags of the OSM way, unless we find that the UNSOS tags are more correct. If we have a conflict of name=* tags, we put the UNSOS waterway name in an alt_name=* tag.
  4. For the UNSOS waterway segments that already have a waterway mapped in OSM (where the UNSOS segment is of better quality than its OSM counterpart), we replace the OSM segment with the UNSOS one. We will use the Replace Geometry tool (Ctrl+Shift+G) of the UtilsPlugin2 JOSM plugin, so that we don't lose any tag of the OSM segment and we also keep the way history. We will also apply steps 3.1 to 3.9, but we will keep the OSM way tags unless we have clear reasons to change them.
  5. We finally upload the data to OSM, using an import-specific user account and the aforementioned changeset tagging. We are then ready to start a new task.
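The crossing rules in steps 3.6 to 3.8 can be summarised as a lookup from the kind of crossing to the element that receives the tags. This is only an illustrative sketch: the function name and the crossing categories are hypothetical, bridge=yes is an assumed value for bridge=*, and the text does not say which element should carry the fixme tag, so placing it on the waterway is an assumption too. In practice the mapper decides the kind of crossing from aerial imagery.

```python
def crossing_tags(kind):
    """Return (element to tag, tags) for one highway/waterway crossing."""
    rules = {
        # Highway passes over the waterway (bridge=yes is an assumed value).
        "bridge": ("highway", {"bridge": "yes", "layer": "1"}),
        # Highway goes through the water; the two ways must share a node.
        "ford": ("highway", {"ford": "yes"}),
        # Waterway passes under the highway: cut out and tag that waterway segment.
        "culvert": ("waterway", {"tunnel": "culvert", "layer": "-1"}),
        # Highway runs along the riverbed for part of its path.
        "shared_riverbed": ("highway", {"flood_prone": "yes"}),
        # Mapper cannot tell from imagery (tagged element is an assumption).
        "unsure": ("waterway", {"fixme": "Please check this waterway crossing"}),
    }
    if kind not in rules:
        raise ValueError(f"unknown crossing kind: {kind!r}")
    element, tags = rules[kind]
    return element, dict(tags)

element, tags = crossing_tags("culvert")
```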

A detailed workflow wiki can be read here.

Changeset policy

Changesets will be small in size, so no issues expected in this respect.

Revert plan

If something goes wrong (quite unlikely based on experience), JOSM reverter will be used.

Conflation

Conflation is explained in the import workflow wiki, in the UNSOS waterways that are already mapped in OSM and Crossing highways sections.

The import workflow will basically keep the tagging and/or geometries of the existing OSM waterways, unless the UNSOS equivalent is of better quality or simply more correct. If waterway names conflict, the OSM name will be preserved and the UNSOS name will go into an alt_name=* tag.

We will also make sure that crossings between highways and waterways are correctly mapped.
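The default tag handling during conflation can be sketched as follows. The function name is hypothetical, and two points are assumptions rather than stated policy: UNSOS tags missing from the OSM way are added, and an existing alt_name=* is extended with a semicolon. The mapper can still override the result when the UNSOS tags are clearly more correct.

```python
def merge_tags(osm_tags, unsos_tags):
    """Merge tags of an existing OSM way with its UNSOS equivalent; OSM wins."""
    merged = dict(unsos_tags)
    merged.update(osm_tags)  # existing OSM values take precedence

    osm_name = osm_tags.get("name")
    unsos_name = unsos_tags.get("name")
    if osm_name and unsos_name and osm_name != unsos_name:
        # Name conflict: keep the OSM name, store the UNSOS name in alt_name.
        existing = [v for v in merged.get("alt_name", "").split(";") if v]
        if unsos_name not in existing:
            existing.append(unsos_name)
        merged["alt_name"] = ";".join(existing)
    return merged

# Hypothetical names, for illustration only.
result = merge_tags(
    {"waterway": "river", "name": "Example River"},
    {"waterway": "stream", "name": "Example Wadi", "intermittent": "yes"},
)
```

Here the OSM way keeps waterway=river and its name, gains intermittent=yes, and records the UNSOS name in alt_name=*.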

QA

Validation of the import will be done by a second user in the Tasking Manager.