WikiProject Tanzania/Tanzania MSD & JSI road import
- 1 Goals
- 2 Schedule
- 3 Import data
- 4 Data preparation
- 5 Data merge workflow
Over the past year, the Medical Stores Department (MSD), a parastatal organization, and JSI staff collected GPS data (hereafter MSD&JSI data) from medical supply delivery vehicles across Tanzania and processed it into OSM road network improvements. The data was processed externally in ArcGIS and QGIS, so an import process is needed to bring this comprehensive road data into OSM. Additional data was generated using Bing aerial imagery.
The data covers about 60,000 km of roads.
- Preparation, discussion - due to start July 29th 2015.
- The import could be ready to start in about 20 to 30 days. A conservative estimate (times may be shorter):
- Discussion with Tanzania OSM community and HOT: 3-7 days
- Discussion with Imports list: 15-25 days
- Setting the TM projects: 1 day
As for the import itself, we estimate a total of between 100 and 150 hours:
- It won't be just a blind import: users will tag MSD&JSI roads, add some extra ways (finishing incomplete ways, adding missing streets, etc.) and correct many roads that are already present in OSM but have wrong geometry, missing tags (especially surface=*), etc.
- 5 volunteers, mapping an average of 4 hours/day, would take between 5 and 8 days to complete. And it is 60,000 km of new roads!
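The estimate above can be checked with a few lines of arithmetic. This is only a sketch: the function name `days_to_complete` is made up for illustration, and it assumes every volunteer maps the same number of hours each day.

```python
import math


def days_to_complete(total_hours, volunteers, hours_per_day):
    """Days needed if all volunteers map hours_per_day every day."""
    return total_hours / (volunteers * hours_per_day)


# Figures taken from the estimate: 100 to 150 hours, 5 volunteers, 4 hours/day.
low = days_to_complete(100, 5, 4)
high = days_to_complete(150, 5, 4)

print(low, high)        # 5.0 7.5
print(math.ceil(high))  # 8, the upper bound rounded up to whole days
```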
The original file is in shapefile format, and consists of about 60,000 km of detailed additional road data across Mainland Tanzania based on GPS data and Bing imagery. The Medical Stores Department (MSD) recently gave JSI permission to upload this data to OSM, as the data was collected by GPS devices on MSD vehicles.
For the time being, only segments that are new to OSM will be imported, except when MSD&JSI segments are a clear improvement of poor OSM ways. Some cases of this are: an OSM road traced using low resolution aerial imagery (Landsat...), an OSM road traced with cloudy imagery, OSM roads that are the result of a poor quality former road import (Africover...), etc. In these cases the MSD&JSI roads will replace these wrong OSM segments and hence improve the OSM database.
Although this is quite rare, we may find an MSD&JSI segment that is incomplete, like the one in the next image, where the upper way is unfinished at its western end. In this case, if the unfinished part is not too long, we will finish that segment using aerial imagery (Bing or Mapbox).
Also quite unusually, we may find a residential area where only part of the residential streets have MSD&JSI segments. Again, although this is not part of the import, we can take some time to complete the network of residential streets, as otherwise it would remain unfinished.
Very rarely, we may find an MSD&JSI road that does not appear to exist, like the one shown in the next image. In this screenshot, the MSD&JSI way is in red, and in blue we have drawn a road that runs nearby but cannot be traced for the whole length of the JSI road. In this case, we will skip importing this road and report it to the JSI staff so they can check it again. In any case, this issue is extremely rare.
Data license: You can see a scan of the license here
The best approach is to import this data manually through several HOT Tasking Manager projects. Advantages:
- Any volunteers may join the import task, at any time (although only skilled mappers should join, and non-skilled volunteers should be discouraged).
- No need to manually download the data from any server. Each TM project will smartly provide the MSD&JSI roads for each task.
- We can check, at any time, the progress of the import.
- We can easily validate each task too.
- Easy to set up.
We need to split the file into several chunks/areas (around 4 or 5 in total), as retrieving the data for each task would take too long if we used only one file. Each piece will make up one project in the Tasking Manager, and we can publish new projects progressively as the previous ones are finished.
Data reduction & simplification
As explained before, MSD&JSI segments that already have an OSM counterpart of equivalent or better quality won't be imported. We will delete those segments beforehand, making the import faster and easier.
Apart from that, the number of nodes in all MSD&JSI roads will be reduced by around 50%. This will also be done before the import process starts, using the JOSM Simplify Way tool.
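To give an idea of what node simplification does, here is a minimal sketch of the Ramer-Douglas-Peucker technique on planar coordinates. It is a generic illustration only: this page does not say which algorithm or tolerance the JOSM Simplify Way tool uses, and the names `simplify`, `point_segment_distance` and the sample coordinates are invented for the example.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (planar coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Drop nodes that lie within `tolerance` of the simplified line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, worst = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > worst:
            index, worst = i, d
    if worst <= tolerance:
        return [first, last]
    # Keep the farthest node and simplify both halves recursively.
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


# Hypothetical way with slight GPS jitter on two straight stretches.
way = [(0, 0), (1, 0.01), (2, -0.01), (3, 0), (4, 2), (5, 2.01), (6, 2)]
simplified = simplify(way, tolerance=0.05)
print(len(way), "->", len(simplified))  # 7 -> 4
```

The nodes that survive are the ones where the road actually changes direction, which is why the shape of the way is preserved even though the node count drops.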
Every MSD&JSI segment comes with 19 tags. Of those, the only one that could be of use would be the Road_type=* tag. This tag has 4 values:
- Road_type=LOCAL: They could translate to different OSM highway=* values, like highway=service, highway=residential and highway=unclassified.
- Road_type=SECONDARY: They could translate to highway=unclassified and highway=tertiary. Sometimes, a segment of a whole road tagged as Road_type=SECONDARY is followed by a segment Road_type=LOCAL when crossing a residential area, and changes back to a Road_type=SECONDARY segment when out of that settlement.
- Road_type=MAJOR: It denotes a higher level than Road_type=SECONDARY but, again, it can translate to different OSM values, like highway=tertiary, highway=secondary, highway=primary and highway=trunk.
- Road_type=MAJOR-LOCAL: Segments of Road_type=MAJOR road traversing a residential area.
So it's clearly impossible to automatically translate these tags into the OSM tagging schema. We will instead follow the Highway Tag Africa wiki for road tagging as reference.
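The ambiguity can be made explicit with a small lookup table. This is only a sketch: the dictionary name `CANDIDATE_HIGHWAY_VALUES` and the helper `is_unambiguous` are invented for illustration, and Road_type=MAJOR-LOCAL is left out because no candidate OSM values are listed for it above.

```python
# Candidate highway=* values per Road_type, as listed above.
CANDIDATE_HIGHWAY_VALUES = {
    "LOCAL": ["service", "residential", "unclassified"],
    "SECONDARY": ["unclassified", "tertiary"],
    "MAJOR": ["tertiary", "secondary", "primary", "trunk"],
}


def is_unambiguous(road_type):
    """True only if a Road_type maps to exactly one highway=* value."""
    return len(CANDIDATE_HIGHWAY_VALUES.get(road_type, [])) == 1


ambiguous = [rt for rt in CANDIDATE_HIGHWAY_VALUES if not is_unambiguous(rt)]
print(ambiguous)  # ['LOCAL', 'SECONDARY', 'MAJOR']
```

Since every Road_type value has more than one candidate, the choice has to be made by a mapper looking at each road.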
We propose the following base tagging (that means, the tags all users will get for all MSD&JSI segments that they have to manually modify when needed):
- highway=road -> This tag value has always to be changed by the users to highway=unclassified, highway=tertiary, highway=residential, highway=secondary, highway=service, highway=track, etc., taking the Highway Tag Africa as reference.
- surface=unpaved -> The vast majority of the roads are unpaved, especially the MSD&JSI roads that aren't yet in OSM. So users will only change this tag to surface=paved in the few cases where the road is paved.
- source=msd.or.tz & deliver.jsi.com
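As an illustration of the base tagging, the sketch below replaces all tags on every way of a tiny OSM XML snippet with the three base tags listed above. It is not the method used for this import (the tags were set in JOSM); the function name `apply_base_tags` and the sample data are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Base tags proposed above; mappers later refine highway and surface by hand.
BASE_TAGS = {
    "highway": "road",
    "surface": "unpaved",
    "source": "msd.or.tz & deliver.jsi.com",
}


def apply_base_tags(osm_root):
    """Replace all tags on every <way> with BASE_TAGS; return the way count."""
    count = 0
    for way in osm_root.iter("way"):
        for tag in way.findall("tag"):
            way.remove(tag)
        for key, value in BASE_TAGS.items():
            ET.SubElement(way, "tag", k=key, v=value)
        count += 1
    return count


# Hypothetical sample: one way carrying one of the original shapefile tags.
SAMPLE = """<osm version="0.6">
  <node id="-1" lat="-6.80" lon="39.28"/>
  <node id="-2" lat="-6.81" lon="39.29"/>
  <way id="-3">
    <nd ref="-1"/>
    <nd ref="-2"/>
    <tag k="Road_type" v="LOCAL"/>
  </way>
</osm>"""

root = ET.fromstring(SAMPLE)
n_ways = apply_base_tags(root)
way_tags = {t.get("k"): t.get("v") for t in root.find("way").findall("tag")}
print(n_ways, way_tags)
```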
We will use the following changeset tags:
- comment=Tanzania MSD & JSI Roads import. #hotosm-project-NUMBEROFPROJECT
- source=msd.or.tz & deliver.jsi.com;imageryProvider
- imageryProvider = Bing or Mapbox or Bing;Mapbox, depending on the imagery provider/s used while importing the data.
- NUMBEROFPROJECT is the Tasking Manager project number.
For example, for project 1194 using Bing imagery:
- comment=Tanzania MSD & JSI Roads import. #hotosm-project-1194
- source=msd.or.tz & deliver.jsi.com;Bing
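The changeset tagging scheme can be expressed as a small helper. This is a sketch under assumptions: the function name `changeset_tags` is invented, and restricting providers to Bing and Mapbox simply mirrors the values listed above.

```python
ALLOWED_PROVIDERS = ("Bing", "Mapbox")


def changeset_tags(project_number, imagery_providers):
    """Build the comment and source changeset tags for one TM project."""
    unknown = [p for p in imagery_providers if p not in ALLOWED_PROVIDERS]
    if unknown or not imagery_providers:
        raise ValueError("imagery providers must be Bing and/or Mapbox")
    # Keep a fixed order so that both providers always give "Bing;Mapbox".
    ordered = [p for p in ALLOWED_PROVIDERS if p in imagery_providers]
    return {
        "comment": "Tanzania MSD & JSI Roads import. "
                   f"#hotosm-project-{project_number}",
        "source": "msd.or.tz & deliver.jsi.com;" + ";".join(ordered),
    }


tags = changeset_tags(1194, ["Bing"])
print(tags["comment"])  # Tanzania MSD & JSI Roads import. #hotosm-project-1194
print(tags["source"])   # msd.or.tz & deliver.jsi.com;Bing
```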
- The original file (in shapefile format) was opened with JOSM + Open Data plugin, and saved in osm format.
- We deleted all tags and added to all ways the 4 proposed tags.
- We took the resulting file and we divided it in pieces (one for each HOT Tasking Manager project), again with the JOSM editor.
- We removed all MSD&JSI segments already mapped in OSM, except those cases where the MSD&JSI segments are a clear improvement of the OSM counterparts.
- Finally, we applied the JOSM Simplify Way tool to all MSD&JSI segments of all files. This step reduced the number of nodes by roughly 50%.
Data transformation results
Data merge workflow
This import (data integration) will be done through the HOT Tasking Manager, so the number of people importing the data is unknown. We don't expect a large number of users, as we require them to be experienced mappers. Among the skills required:
- Good experience with JOSM.
- Experience with previous imports through the Tasking Manager is not necessary, but a plus.
- S/he knows how to use JOSM filters.
- Skilled at working with highways. S/he needs to know well how to merge nodes (M), join ways (J), combine ways (C) and unglue ways (G).
- S/he knows how to use the Download Along plugin (Alt+Shift+D).
- S/he knows how to use the Replace Geometry tool (UtilsPlugin2), and why it is so useful.
- S/he knows how to deal with conflicts.
With the inputs from those lists, we will open a thread in the imports list.
We will use JOSM editor for this import.
There are several ways to reach the same results. We have tried to find the simplest one, the least error-prone and the one that ensures the highest level of consistency across different import volunteers. The proposed workflow is as follows:
- Choose one task from one of the TM import projects. The user will be presented with 2 sets of data (the OSM data and the new MSD&JSI segments) for the task square in two different layers, which we will merge into a single layer.
- Using filters, we download the areas around the MSD&JSI segments that have a part of them beyond the downloaded area, using for this task the Download along JOSM plugin.
- We go segment by segment. For those segments not yet in OSM, we:
- Check the surface (in case it is paved, we change the value accordingly).
- Go along the whole segment to check whether the accuracy is acceptable. Where it is not acceptable, we correct it.
- Also, check river crossings. For those crossings, add a bridge or a ford, depending on the crossing. If unsure (lack of good imagery), just connect them and add a fixme=Check this waterway crossing tag.
- We check what level of classification is the most suitable for the segment (tertiary, unclassified, service, etc.), having the Highway Tag Africa wiki as reference.
- If the JSI segment is part of a longer highway, we combine it with the rest of the segments into a single way. If we have to combine it with an OSM way, we will generally keep the tags of the OSM way, unless we find a reason not to do so.
- Finally, check both ends of the resulting highway, and in case there is a crossing-by OSM highway, join the MSD&JSI highway to it (you may need to unglue it from another MSD&JSI segment/s).
- For the few MSD&JSI segments that are already mapped in OSM but are of much better quality than their OSM counterparts, we replace the OSM segment with the new one. We will use the Replace Geometry tool (Ctrl+Shift+G) of UtilsPlugin2, so that we don't lose any tags of the OSM segment and we also keep the way history. We will also apply steps 3.1 to 3.5, but we will in any case keep the highway=* and, especially, the ref=* tags of the OSM way, unless we have clear reasons to change them.
- For all MSD&JSI segments that are in an area where we can't check its surface (due to lack of high resolution imagery, cloudy imagery, etc), we will delete the surface=* tag.
- We finally upload the data to OSM, using an import-specific user account and the aforementioned changeset tagging. We are then ready to start a new task.