Humanitarian OSM Team/Open Mapping Hub West and Northern Africa/Data Quality Approach

From OpenStreetMap Wiki

An Approach for Improving OpenStreetMap Data Quality in West and Northern Africa

Background

OpenStreetMap (OSM) is a free, crowdsourced and open-source mapping platform that has gained tremendous popularity and widespread adoption over the years. OSM is highly valued for its flexibility, real-time updating, and accessibility. However, despite its numerous advantages, the quality of OSM data can vary greatly, depending on the level of contributor expertise, their availability, and the quality assurance mechanisms in place. Since numerous contributors in the region are continuously creating OSM data, improving the quality of the data has become a crucial task to make it more reliable and usable to support the several use cases across the region and the entire world.

This document provides broad guidelines to help contributors learn the different approaches to improving the quality of OpenStreetMap data in the West and Northern Africa region. It can also be used by any other open mapping community that finds this information useful.


Objectives

Overall, the goal of this approach is to provide insight on how to create accurate and up-to-date OpenStreetMap data that effective decision making can rely on.

The key objectives are:

  • Have a clear understanding of the data quality issues across the region.

By working collaboratively with local communities, we can ensure that OSM data remains a valuable resource for end users.

  • Build a data quality network to increase trust and credibility in OSM data.

Different partners have repeatedly raised concerns about the quality and reliability of OSM data. It is therefore necessary to prioritize data quality and devote resources to supporting the production of good quality data.

  • Define strategy to increase data quality for remote and field mapping.

To ensure data quality in mapping, it is essential to follow specific guidelines and quality control measures.


Our Approach

Improving data quality in OSM requires a concerted effort from the community of contributors as well as the use of appropriate tools and techniques. Here is our approach.

Have a clear understanding of data quality issues across the region.

Data quality assessment will be conducted to understand areas where data may be incomplete, inaccurate, or outdated. This will help to identify areas with low community participation which may be due to lack of local knowledge to keep the data up to date. Quality assessment will be conducted through:

  • Visual inspection using mapping tools such as iD Editor and JOSM.
  • Statistical analysis to identify patterns and trends with tools such as OSM Analytics.
  • Comparison with other data sources, such as satellite imagery and government data.
  • Crowdsourcing by engaging communities in the region to identify gaps in the data and make corrections through different activities including mapping parties, editing challenges and discussions on online forums and OSM email lists.

The hub will work closely with community groups and local organizations through different granting schemes to encourage more contributors to get involved and improve the accuracy and completeness of OSM data.


Build a Data Quality Expert Network.

OpenStreetMap (OSM) communities in the region are continuously creating OSM data from different sources, including remote mapping, field mapping, and imports from other datasets. To consistently ensure that the data provided is fit for purpose, the hub will establish a network of data quality experts who can respond to data quality needs and issues raised by the communities. This network of expert volunteer contributors will be recruited from existing OSM communities and will provide technical support on different data quality issues. Interacting with local communities will be essential to creating a thriving mapping community and supporting the growth of quality contributions from local OSM communities.

Among other skills, the data quality expert network members will be able to:

1. Provide guidance and training to contributors: Providing clear and consistent guidelines on how to map and tag features in OpenStreetMap referencing the OSM Wiki Map Features and HOT Data Models can help to reduce errors and ensure consistency with the data. Providing training sessions, online or in-person, on best practices and necessary skills to identify bad tag values will help to improve data quality. The training will be targeted to all sorts of map contributors from beginners to advanced mappers.

2. Provide workshops on automated validation tools: Automated validation tools help to identify errors in the data, such as missing attributes, geometrical inconsistencies, or duplicated features, so that the identified errors can be corrected. This will be a great way to increase technical resources in the region and fill the knowledge gap of contributors. The workshops will also target newly established OSM communities and the wider mapping community (individuals or smaller groups that are not OSM communities).

3. Design a full-fledged data quality training module: It is essential to have a robust data quality module that contributors can reference to reinforce their learning about the importance of data quality and the skills and tools necessary to ensure data quality in OSM. This training module will include defined learning objectives tailored to different audiences. The learning objectives will align with the Top 10 data quality aspects that HOT is focusing on. The training content will also cover best practices for data quality management and data governance.


Strategy to improve data quality for field and remote mapping.

Data quality assurance is critical for ensuring data quality in field and remote mapping in OSM. The quality of data in OpenStreetMap is highly contextual: different uses of OpenStreetMap data require different quality metrics, depending on the context in which the data is going to be used. For the purpose of this document, we define OpenStreetMap data quality as fitness for purpose, that is, data that conforms to the minimum requirements of its users.

Following specific guidelines and quality control measures can help ensure the accuracy and completeness of mapped data. By adhering to these best practices, contributors will create good quality and reliable OSM data.

This workflow can always be adapted, customized, and improved for organized field mapping activities. The WNA Hub has defined best practices for field mapping, grouped into practices before, during, and after the data collection exercise.

Before data collection.

1. Create specific guidelines (checklists, policies, workflows) for enumerators to ensure accurate field data collection. The field mapping process involves collecting data on physical features such as roads, buildings, and waterways, and the accuracy of the data depends on the skills and techniques used during the mapping process. The guidelines will help enumerators to understand the mapping project and get familiar with the tools and techniques for data collection.

2. Define the data model based on the project's objectives that includes a list of features to be mapped and the tagging guideline. Reference OSM Wiki Map Features and the HOT Data Models for feature tagging guidelines.

3. Commit enough time and resources to train enumerators and conduct pilot field data collection activities to understand potential data issues ahead of deployment. This will help:

  • Train enumerators on how to ask questions, probe for additional information and record responses
  • Identify potential data issues, such as confusing questions that are difficult to answer
  • Build enumerators' confidence, leading to high quality data collection during actual deployment

4. Ensure that the devices used for data collection are of high quality and calibrated correctly.

During data collection.

The mapping supervisor should constantly check the data collected and provide regular feedback to the enumerators to ensure that the data collected meets the project’s objectives and guidelines. Supervising the data collection also provides insight into the coverage of data collection in the area of interest (AOI).
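As a rough illustration of how a supervisor might estimate coverage of the AOI, the sketch below divides the AOI bounding box into a grid and reports the share of cells that contain at least one collected point. The function name, the input format (longitude/latitude tuples), and the grid size are assumptions made for this example and are not part of any WNA Hub tool.

```python
from math import ceil


def coverage_ratio(points, bbox, cell_size_deg=0.01):
    """Return the share of grid cells in the AOI that contain a collected point.

    points: iterable of (lon, lat) tuples recorded by enumerators.
    bbox: (min_lon, min_lat, max_lon, max_lat) of the AOI.
    cell_size_deg: hypothetical grid resolution in degrees.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    # Number of grid columns and rows needed to cover the bounding box.
    n_cols = max(1, ceil((max_lon - min_lon) / cell_size_deg))
    n_rows = max(1, ceil((max_lat - min_lat) / cell_size_deg))

    visited = set()
    for lon, lat in points:
        # Ignore points recorded outside the AOI.
        if not (min_lon <= lon <= max_lon and min_lat <= lat <= max_lat):
            continue
        # Clamp so that points on the upper edge fall in the last cell.
        col = min(int((lon - min_lon) / cell_size_deg), n_cols - 1)
        row = min(int((lat - min_lat) / cell_size_deg), n_rows - 1)
        visited.add((col, row))

    return len(visited) / (n_cols * n_rows)


collected = [(0.1, 0.1), (0.2, 0.2), (0.7, 0.1), (5.0, 5.0)]
print(coverage_ratio(collected, bbox=(0.0, 0.0, 1.0, 1.0), cell_size_deg=0.5))
```

A bounding box is only a coarse stand-in for a real AOI polygon, so a low ratio should prompt a closer look on the map rather than be read as a final figure.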

After data collection.

A comprehensive data cleaning workflow should be provided at this stage. In the meantime, the following general guidelines should be followed:

1. Validation to check for:

  • Naming conventions
  • Tag completeness
  • Positional accuracy and logical consistency of data

2. Prepare public documentation on the OpenStreetMap Wiki

3. Monitor local changeset discussion. If there is a discussion about the mapping activity, provide appropriate and timely responses.
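The validation checks listed above could be sketched roughly as follows. The required tags, the naming rules, and the feature format (a dict with "kind", "tags", "lon" and "lat") are hypothetical choices made for illustration; a real project would take them from its own data model and tagging guidelines.

```python
# Hypothetical data model: tag keys that must be present per feature type.
REQUIRED_TAGS = {
    "building": ["building"],
    "amenity": ["amenity", "name"],
}


def validate_feature(feature, bbox):
    """Return a list of human-readable issues found for one collected feature."""
    issues = []
    tags = feature.get("tags", {})

    # Tag completeness: every required key must be present and non-empty.
    for key in REQUIRED_TAGS.get(feature.get("kind"), []):
        if not tags.get(key):
            issues.append(f"missing tag: {key}")

    # Naming conventions (illustrative rules only).
    name = tags.get("name")
    if name is not None:
        if name != name.strip():
            issues.append("name has leading or trailing whitespace")
        if len(name) > 3 and name.isupper():
            issues.append("name is all upper case")

    # Positional sanity check: the feature should lie inside the AOI bounding box.
    min_lon, min_lat, max_lon, max_lat = bbox
    if not (min_lon <= feature["lon"] <= max_lon
            and min_lat <= feature["lat"] <= max_lat):
        issues.append("position outside the AOI")

    return issues


school = {
    "kind": "amenity",
    "tags": {"amenity": "school", "name": " ECOLE PRIMAIRE"},
    "lon": 2.0,
    "lat": 0.5,
}
for issue in validate_feature(school, bbox=(0.0, 0.0, 1.0, 1.0)):
    print(issue)
```

Checks like these only flag candidates for review; a validator still needs to decide whether each flagged feature is actually wrong.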

WNA Hub best remote mapping practices.

1. Double check for overlapping active projects on the Tasking Manager. If you are new to the Tasking Manager, reference this onboarding presentation and this introductory page to learn more.

2. Onboard new mappers: Introducing new mappers to remote mapping can be a great way to expand the community but it is essential to introduce them to the remote mapping process properly to ensure accurate contribution to OSM.

  • Provide an overview of the OSM project by explaining the mission of the project, highlighting the importance of accurate mapping, and the role of volunteers in contributing to the project.
  • Explain the concept of remote mapping including the use of satellite imagery and tools and techniques used to map remotely. Guidance should also be provided on how to map and identify features such as roads, buildings and waterways, as well as how to add tags.
  • For new mappers, choose an area that is easy to map such as a small town or their neighborhood. This will help them get comfortable with the tools and techniques used in remote mapping.
  • Reference available resources for the onboarding process.

3. Remember to always reference the tagging guidelines for the best tags for the mapping activity to ensure that contributions are useful.

4. Track suspicious edits related to feature tracing and tagging while the mapping activity is ongoing, using OSMCha filters and changeset discussions. More tools for detecting such errors and improving data quality will be made available soon.

5. Monitor local changeset discussions and Tasking Manager discussions. If there is a discussion about the mapping activity, provide appropriate and timely responses.
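The check for overlapping active projects in step 1 can be approximated with a simple bounding box comparison, as in the sketch below. It does not call the Tasking Manager; the project IDs and bounding boxes are made-up inputs, and the function names are hypothetical.

```python
def bboxes_overlap(a, b):
    """Return True if two (min_lon, min_lat, max_lon, max_lat) boxes intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def overlapping_projects(new_bbox, active_projects):
    """Return the IDs of active projects whose bounding box overlaps new_bbox.

    active_projects: dict mapping a project ID to its bounding box.
    """
    return [
        project_id
        for project_id, bbox in active_projects.items()
        if bboxes_overlap(new_bbox, bbox)
    ]


# Made-up example data: two active projects and one planned AOI.
active = {
    101: (0.0, 0.0, 1.0, 1.0),
    102: (2.0, 2.0, 3.0, 3.0),
}
print(overlapping_projects((0.5, 0.5, 1.5, 1.5), active))
```

Because project areas are usually irregular polygons, a bounding box overlap is only a first hint, and any match should be confirmed by viewing both projects on the map.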

Community feedback.

The WNA Hub understands that OpenStreetMap is a community-owned project and that feedback is key to its success, since it runs through every stage of the mapping process. The hub will therefore use feedback to improve the quality of community contributions to OpenStreetMap and to ensure that the most up-to-date data is available. This will be done in several ways, including:

1. Changeset monitoring: For all hub-organized mapathons and those organized by other local open mapping communities within the region, there shall be a dedicated person to monitor changesets using OSMCha, flag any suspicious changesets, and provide feedback to mappers.

2. Tasking Manager communication system: The Tasking Manager has a feedback system that allows mappers to communicate with each other in the form of comments and username tagging. Project managers will also use the communication system to reach out to beginner mappers and direct them to training resources and tutorials.

3. Feedback collection drives: Through this initiative, the hub will reach out to:

  • Local partners and other map users to collect feedback on the quality and completeness of the data. This will be carried out through surveys and in-person interviews during visits to partner offices.
  • Local mapping groups to regularly check for updates and ensure that the data stays relevant and up-to-date. This will be carried out through online feedback sessions.


We look forward to hearing from you on ways to improve this approach. Please share your feedback with us.