Mexico's Administrative Divisions Import Project

From OpenStreetMap Wiki

About

This project imports the national, state, municipal and sub-municipal level divisions present in the Marco Geoestadístico Nacional (MGN), published by INEGI, through a community-monitored process with technical help from Telenav.

This project is possible thanks to the release of Mexico's geographical data as open data at the end of 2014 under the National Digital Strategy.

The import project has been discussed over the last 6 months among members of the local community, who agreed to move forward with the project.

Import Plan Outline

The import plan is divided into two stages. The first stage covers the official administrative boundaries (National, State and Municipal); the second stage covers the sub-municipal boundaries (localities, AGEB, blocks).

Goals

  • Have the best local map in the world.
  • Enhance the current OSM administrative division coverage of Mexico with open data made available by the government at the end of 2014.
  • Get complete and accurate administrative boundaries of Mexico (National, State and Municipal levels).
  • Get complete and accurate political and administrative boundaries of Mexico at the sub-municipal level (locality, block, AGEB).

Schedule

1st stage

2015 Q1 - INEGI Datasets analysis
2015 Q2 - Community validation and import
We hope to be finished by the end of October, 2015.
2015 Q4 - A user objected to the import, and the import team put it on hold after a review by the DWG. Please refer to the current status section at the bottom of this wiki page for more details.
2016 Q1 - This stage of the Import is partially finished (3 states not completed and 3 states partially completed).
2016 Q1 - Work finished

2nd stage

2016 - TBD

Import Data

The data to be imported comes from the National Geostatistical Framework 2015 released in the INEGI Datasets.

Background

One current problem with Mexico's data in OpenStreetMap is that the administrative boundaries for municipalities are incomplete. Municipalities are the second-level administrative division in Mexico, the first being the state. There are 2456 municipalities in Mexico, including those in Mexico City, which are also second-level administrative divisions but go by a different name, “delegations”.

While validating the municipal level, we will also review the state and national level boundaries in the same project. Coverage at the state and national levels is not poor, but we have detected some errors in the boundaries currently in OSM and would like to validate them during the import. Since the government data is authoritative for these three levels, importing it will contribute to more correct and valid admin levels in OSM.

The national and state limits are already covered in OSM, but we will take this opportunity to validate them against the latest MGN, taking the MGN as the gold reference. If we find updates or modifications in the MGN that are missing from the current OSM coverage, we will import them.

For the municipal level, as noted above, only certain regions of the country are covered, and re-validating them all would be hard work. Since replacing those boundaries has no impact, and since the latest MGN boundary data is the most authoritative data publicly available, we think it is best to replace any previous municipal boundary.

The MGN has the most authoritative data about Mexico's limits because it is based on the following study: Harmonization of the National Geostatistical Framework Political-Administrative Boundaries, which incorporates the political-administrative limits validated at the national level with each local government.

Data sources

INEGI

Data license

This data was released under the following terms of use:

LIBRE USO MX

Attribution: Attribution to the source is mandatory under the above terms of use and it's the only requirement to use the data in OSM.

OSM Data Files

A data sample of the imported data in .osm format can be downloaded from here.

Import Type

One time import
Method: Manual upload

Data Preparation

  • Create an inventory of available data
  • INEGI Data Analysis
  • Evaluation of data quality
  • Data gaps between INEGI and OSM
  • Evaluation of the topological accuracy of attributes
  • Evaluation of geometric accuracy of the attributes
  • Selection of layers suitable for import

Tagging Plans

Municipal Boundary and Changeset Tags

Key | Value
admin_level=* | 6
boundary=* | administrative
name=* | official name of the municipality
INEGI:MUNID=* | Municipality ID generated from INEGI data
source=* | INEGI, MGN 2014 v6.2
comment=* | INEGI Boundaries import 1st Stage
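As a minimal sketch of how the tagging plan above could be applied in a script, the following builds the tag sets for one municipality. The function names are hypothetical, and the split between tags placed on the boundary relation and tags placed on the changeset (here only `comment`) is an assumption, since the table lists both together.

```python
def municipality_tags(name, mun_id):
    """Tags for one municipal boundary relation, following the table above.

    Assumption: every key except `comment` goes on the relation itself.
    """
    return {
        "admin_level": "6",
        "boundary": "administrative",
        "name": name,                     # official name of the municipality
        "INEGI:MUNID": str(mun_id),       # ID generated from INEGI data
        "source": "INEGI, MGN 2014 v6.2",
    }


def changeset_tags():
    """Tags for the upload changeset (assumption: only `comment` lives here)."""
    return {"comment": "INEGI Boundaries import 1st Stage"}
```

The municipality name and ID passed in would come from the attribute table of the INEGI dataset; the exact column names are not documented here.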


Data Transformation

The only data transformation documented so far is transforming the original projection of the dataset (ITRF92) to WGS84. It's important to note that we did not simplify the boundaries in any way.

The geometries from the original .shp file were converted in QGIS using Polygon to lines, then saved again as a .shp file.

Then, using ogr2osm (https://github.com/pnorman/ogr2osm), we converted the file into an .osm file.

Then, using Osmosis, the .osm file was converted into .osm.pbf. Osmosis cannot convert files that have negative IDs, so we first had to create the .osm.pbf file with positive IDs, and then use search and replace in Notepad++ to turn <nd ref=' into <nd ref='- and <node id=' into <node id='- so that the file has negative IDs.

This is necessary because otherwise the file would keep positive IDs, and JOSM would upload data that tries to overwrite the existing nodes with IDs 1 to 1000xxx, the first nodes ever created in the OSM database in 2004-2005. If the node and way IDs are negative, JOSM knows that this is new data not yet added to the map.
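The ID negation described above could also be scripted instead of done by hand in a text editor. The following is a minimal sketch using only the Python standard library, not the procedure the team actually used. It assumes the data is available as OSM XML, and it negates relation IDs and member references as well, which goes slightly beyond the two search-and-replace patterns mentioned above. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def _negative(value):
    """Return the ID as a string with a negative sign (already-negative IDs are kept)."""
    return str(-abs(int(value)))


def negate_ids(osm_xml):
    """Make every element ID and every reference to it negative in an OSM XML string."""
    root = ET.fromstring(osm_xml)
    for elem in root.iter():
        if elem.tag in ("node", "way", "relation") and "id" in elem.attrib:
            elem.set("id", _negative(elem.get("id")))
        elif elem.tag in ("nd", "member") and "ref" in elem.attrib:
            # References must be negated too, or ways would point at old nodes.
            elem.set("ref", _negative(elem.get("ref")))
    return ET.tostring(root, encoding="unicode")
```

Parsing the XML rather than replacing text has the advantage that it does not depend on the quoting style or attribute order used in the file.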

Then we loaded this file into an internal tool called Mexico Split. This tool is designed to eliminate the duplicated border ways shared by adjacent polygons, by detaching them from their polygons and replacing them with a single way common to the two polygons involved. Besides this main purpose, the tool also splits any resulting way longer than 2000 segments into shorter ways, groups the ways into relations according to the borders they define, and adds some predefined tags to these ways and relations.
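The Mexico Split tool itself is internal and its code is not shown here. As an illustration of the splitting step only, the following sketch cuts a way, given as an ordered list of node references, into pieces of at most 2000 segments each. The function name and the data representation are assumptions.

```python
def split_way(node_refs, max_segments=2000):
    """Split an ordered list of node refs into ways of at most max_segments segments.

    Consecutive pieces share their boundary node, so the line stays connected.
    """
    if max_segments < 1:
        raise ValueError("max_segments must be at least 1")
    if len(node_refs) < 2:
        return [list(node_refs)]
    pieces = []
    start = 0
    last = len(node_refs) - 1
    while start < last:
        end = min(start + max_segments, last)
        pieces.append(list(node_refs[start:end + 1]))
        start = end  # the next piece begins at the node where this one ended
    return pieces
```

For example, a way with 5000 segments (5001 nodes) would be split into three pieces of 2000, 2000 and 1000 segments.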

Data Merge Workflow

1st Stage: National to municipal levels

[Figure: INEGI admin import workflow]

  1. Manual cleanup - Identify easily fixable issues by visual inspection and fix them with JOSM.
  2. National admin level difference analysis - Identify the differences and overlaps between the MGN and the current OSM map.
  3. State admin level difference analysis - Identify the differences and overlaps between the MGN and the current OSM map.
  4. Municipal admin level difference analysis - Identify the differences and overlaps between the MGN and the current OSM map.
  5. Back up the current relation data for admin_level=6 so that any valid tags from the current relations (for example admin_centre=* and label=*) are saved and added to the imported data.
  6. Current admin_level=6 boundaries will be erased and replaced with the same boundary data from the MGN, with the correct tagging scheme added, since identifying each and every current admin_level=6 boundary and populating it with the new tags would be time consuming and error prone. We validate that any boundary being erased is replaced with the same valid boundary from the MGN dataset. This approach in no way harms users' contributions, since the data being erased is replaced by the same boundaries that were already in OSM, as long as they were valid (please refer to the conflation explanation below).
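The backup and restore in steps 5 and 6 can be sketched as follows. This is an illustrative outline rather than the team's actual script. It assumes relations are represented as dictionaries with a "tags" mapping and that old and new relations are matched by their name tag; both the representation and the matching rule are hypothetical.

```python
def backup_tags(old_relations, keys=("admin_centre", "label")):
    """Map each old relation's name to the subset of its tags worth keeping."""
    backup = {}
    for rel in old_relations:
        kept = {k: v for k, v in rel["tags"].items() if k in keys}
        if kept:
            backup[rel["tags"].get("name")] = kept
    return backup


def restore_tags(new_relations, backup):
    """Copy backed-up tags onto imported relations without overwriting imported values."""
    for rel in new_relations:
        saved = backup.get(rel["tags"].get("name"), {})
        for key, value in saved.items():
            rel["tags"].setdefault(key, value)
    return new_relations
```

Using `setdefault` means that a value coming from the MGN import always wins over a backed-up value with the same key.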

2nd Stage: Sub-municipal levels

TBD

Conflation

Regarding the conflation requirements for this data, it's evident that a user couldn't have contributed more accurate, authoritative or correct data about administrative boundaries than the data released by the federal government for the National, State and Municipal levels. (This assumption doesn't hold for sub-municipal levels, hence the assignment of the sub-municipal divisions import to the 2nd stage.) Given the nature of the information in these layers (administrative boundaries), they should be suitable for a one time import that replaces or removes previously contributed boundaries at those levels.

Team

  • User:mvexel: OSM expert, will coordinate import effort.
  • User:Baditaflorin: OSM expert, will help with the import effort.
  • User:KristenK: OSM expert, will help to create import code and data sources quality assessment.
  • User:Edvac: OSM imports expert, will help to guide the team through the import process.
  • User:oldtopos: OSM expert, will help to review import code.
  • User:Andresuco: Local mapper and lifelong resident, will help in documentation and review resulting data.
  • User:ROSMAPEB: Local geographer and lifelong resident, will help evaluating correctness of mapping INEGI MGN levels to OSM levels.
  • User:gozmir: Local and lifelong resident, will help to manage the communication between all the involved parties in the project.
  • User:Vramirez122000: Community member from Puerto Rico, will help with guidance and previous experience in imports.
  • User:Igeopr: Community member from Puerto Rico, will help with guidance and previous experience in imports.
  • User:Irk_Ley: Local surveyor and lifelong resident from the state of Veracruz.


References

QA

A script will be run after the import to retrieve all boundaries, export them to a shapefile and compare them with the official .shp data for the National, State and Municipal levels.
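The QA script is not published on this page. As a rough sketch of what such a comparison could look like, the following checks that the same municipality IDs exist on both sides and that the polygon areas agree within a tolerance. It assumes each boundary has already been loaded as a single ring of (x, y) pairs keyed by municipality ID; comparing areas is only a coarse proxy for comparing geometries, and all names are hypothetical.

```python
def ring_area(ring):
    """Unsigned planar area of a ring of (x, y) pairs, using the shoelace formula."""
    pts = list(ring)
    total = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def compare_boundaries(imported, official, rel_tol=1e-6):
    """Return (missing, unexpected, mismatched) lists of municipality IDs."""
    missing = sorted(set(official) - set(imported))      # in official data only
    unexpected = sorted(set(imported) - set(official))   # in OSM export only
    mismatched = []
    for mun_id in sorted(set(imported) & set(official)):
        a = ring_area(imported[mun_id])
        b = ring_area(official[mun_id])
        if abs(a - b) > rel_tol * max(a, b, 1e-12):
            mismatched.append(mun_id)
    return missing, unexpected, mismatched
```

A real check would more likely use a geometry library to compare the shapes directly, but the structure of the report (missing, unexpected, mismatched) would be similar.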

Rollback

If any incorrect data is introduced, it can be rolled back using the JOSM reverter plugin.

Summary of current status

In summary, as of Jan 2016 we have finished importing the municipal limits of 22 states; 3 states are partially completed, and for 7 states the import has not yet started.

So far we have imported 1,198 municipalities (one relation per municipality), and 1,259 municipalities remain to be imported.

The data summary by state can be found at the following link: https://goo.gl/mvmJ5b

The import is incomplete until the remaining 7 full states and 3 partially imported states are finished, which might happen during 2016 Q1. It's worth noting that, since there are no admin_level=6 boundaries in those 7 pending states, we will continue with a similar import flow; we expect no new objection regarding those states, since nothing will be deleted or replaced there.

The 2nd stage of this import won't be reached until there's an agreement among community members regarding the assignment of the admin_levels. Based on the experience from the current import, it's likely that this part of the import cannot be addressed before the second half of 2016.

Update [February 10, 2016]: We have restarted the import and are currently importing the states of Campeche, Quintana Roo, Tabasco and Yucatán.

Status of the fixes requested by the DWG

After a user from the MX community objected to the import, the DWG became involved in verifying the claims and requested that the import team fix the issues reported by the claimant.

All the issues reported by the claimant to the DWG were reviewed by the import team and fixed when the requested fix was valid. A summary of the fixes was communicated on December 4th 2015 through the talk-mx mailing list ([1]).

According to the backup plan and in relation to the claimant's requests, we reinstated the following information linked to the admin_level 6 boundaries that we imported, all of which existed as part of the relations for these limits before the import:

  • 343 ‘wikipedia’ tags
  • 22 ‘population’ tags
  • 120 ‘admin_centre’ tags

It's important to note that we restored the previous boundaries at other admin levels if they were linked to level 6, although we didn’t validate whether they were correct because that is out of the scope of the import; we are focused only on the correctness of the level 6 boundaries. There are minor differences between the admin_level=4 and admin_level=6 boundaries, which will be reviewed case by case.

If invalid boundaries (e.g. admin_level=8) were linked to an admin_level 6 boundary before the import, they will remain invalid after the import. We agreed to reinstate whatever was linked to the boundaries at admin_level=6 as long as it was valid, but we cannot validate the correctness of any level other than admin levels 4 and 6, so we reinstated some deleted data even if it wasn't valid. In summary, a reinstated boundary at admin_level=8 or admin_level=10, for example, that was bad before the import will remain bad until the community fixes it, since it's not related to this import. The reason is that there's no agreement within the community regarding any of the admin levels below 6, and the import was focused only on establishing a proper base of admin level 6 boundaries.

We made an effort to delete orphan boundaries and remainders from the import, but if you detect one please notify the import team and we will delete it.

Another important note is that the claimant who raised this issue with the DWG stated early in this controversy that, regardless of the fixes, he would still revert or delete the data we imported when it doesn't fit his views.[2] We don't think this is valid behavior, since we've put a lot of effort into this import and the fixes requested by the DWG, and regardless of our actions the claimant might still destroy our contribution. We can't dedicate resources to permanently monitoring whether the boundaries remain valid, hence we can't guarantee they will remain so, even after we finish this project.