From OpenStreetMap Wiki


The Mississippi Geospatial Clearinghouse (MGC) is a distribution point for public domain data, including data from MARIS. It provides access to a comprehensive spatial information warehouse of Geographic Information Systems (GIS) resources for Mississippi, for use by government, academia, and the private sector.

Right to Use

Note on copyright from e-mail exchange with MGC:

From: GIShelp at its dot ms dot gov
To: eric dot ladner at gmail dot com
Date: Mon, 20 Dec 2010 07:13:21 -060

You may use the data for that.  That data is freely available to the public.

GIS Help Desk
301 N. Lamar Street, Suite 508
Jackson, Mississippi 39201-1495
GISHelp at its dot state dot ms dot us


The following users appear to be participating in the bulk import process:

Process Used for Imports

  1. Data is downloaded from MGC in shapefile format.
  2. Shapefile data is loaded into PostGIS via shp2pgsql
  3. Data is cleansed, analyzed, simplified and cross referenced in PostGIS
  4. Spot checks and spatial queries are done in QGIS (to avoid conversion back and forth between formats)
  5. Data is extracted from PostGIS with pgsql2shp in manageable subsets based on township/range
  6. The subset shapefiles are converted to OSM via ogr2osm
    1. Custom translators are used for building and parcel data to map database data to OSM tag/value pairs
  7. Spot checking, tagging of mass data and minor edits are handled in JOSM
    1. So far, tagging for the building data is simple: building=yes, source=MGC
  8. Data is uploaded in small batches via JOSM (there's got to be a better way to do that...)
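The command-line steps above (2, 5 and 6) can be sketched as a small Python helper that only builds the argument lists for shp2pgsql, pgsql2shp and ogr2osm. This is an illustration rather than the script actually used for the import: the SRID, database name, table name, translator file name and the `twp`/`rng` column names are all hypothetical, and only commonly documented options are used.

```python
from pathlib import Path


def load_command(shapefile: str, table: str, srid: int) -> list[str]:
    """Build the shp2pgsql call; it writes SQL to stdout, to be piped into psql."""
    # -s sets the SRID, -I also creates a spatial index on the geometry column.
    return ["shp2pgsql", "-s", str(srid), "-I", shapefile, table]


def extract_command(database: str, query: str, out_base: str) -> list[str]:
    """Build the pgsql2shp call that dumps one query result to a shapefile."""
    # -f names the output file; the final argument may be a table or a query.
    return ["pgsql2shp", "-f", out_base, database, query]


def convert_command(shapefile: str, translator: str) -> list[str]:
    """Build the ogr2osm call for one subset shapefile."""
    osm_file = str(Path(shapefile).with_suffix(".osm"))
    # -t selects the translation module, -o the output .osm file.
    return ["ogr2osm", "-t", translator, "-o", osm_file, shapefile]


if __name__ == "__main__":
    # Hypothetical names throughout: adjust to the real layer and schema.
    query = "SELECT * FROM buildings WHERE twp = '7S' AND rng = '10W'"
    for command in (
        load_command("buildings.shp", "buildings", 4326),
        extract_command("mgc", query, "t7s_r10w"),
        convert_command("t7s_r10w.shp", "mgc_buildings.py"),
    ):
        print(" ".join(command))
```

Building the argument lists separately from running them makes it easy to review each command before passing it to `subprocess.run`, which matters when looping over many township/range subsets.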

These are the programs which have been used to perform data manipulation.

  • PostGIS (storage during cleansing, reduction, cross referencing)
  • shp2pgsql (loading)
  • pgsql2shp (unloading)
  • ogr2osm (final conversion to OSM format)
  • SQuirreL SQL (data manipulation)
  • QGIS (validation)

Building data is simplified: round features contain 300+ points each, and these are reduced considerably.
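The page does not say which method is used for the point reduction. As one plausible illustration, the sketch below applies Douglas-Peucker simplification (the algorithm behind PostGIS `ST_Simplify`) in plain Python. The function names and the tolerance value are assumptions, and the tolerance is expressed in the units of the coordinates.

```python
import math


def _distance_to_segment(p, a, b):
    """Distance from point p to the segment a-b (all are (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        # Degenerate segment, e.g. a closed ring whose end points coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker simplification of an open or closed list of points."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the vertex farthest from the chord joining the end points.
    index, farthest = max(
        ((i, _distance_to_segment(points[i], first, last))
         for i in range(1, len(points) - 1)),
        key=lambda item: item[1],
    )
    if farthest <= tolerance:
        return [first, last]
    # Keep that vertex and simplify both halves recursively.
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


if __name__ == "__main__":
    # A round building drawn with 300 vertices, closed by repeating the first.
    ring = [(10 * math.cos(2 * math.pi * i / 300),
             10 * math.sin(2 * math.pi * i / 300)) for i in range(300)]
    ring.append(ring[0])
    reduced = simplify(ring, 0.1)
    print(len(ring), "->", len(reduced))
```

A larger tolerance removes more vertices but flattens curved walls more visibly, so the value is worth checking against imagery for a few round buildings before applying it to a whole layer.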

Parcel data is used to create a cross reference to the building table based on ST_Contains(parcel_geom,ST_Centroid(building_geom)). This data is also analyzed so that the biggest building on a particular parcel is the only one that is tagged with address information (to avoid duplicate address numbers). This keeps small sheds and secondary buildings from getting tagged with address data.
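The import does this step inside PostGIS. Purely to illustrate the logic, the sketch below reproduces it in plain Python: each building is assigned to the parcel that contains its centroid, and only the largest building per parcel is selected for address tags. It assumes simple polygons given as lists of (x, y) vertices, takes "biggest" to mean largest footprint area, and uses hypothetical function names and identifiers.

```python
def ring_area_and_centroid(ring):
    """Shoelace area and centroid of a simple polygon given as (x, y) vertices."""
    twice_area = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    centroid = (cx / (3.0 * twice_area), cy / (3.0 * twice_area))
    return abs(twice_area) / 2.0, centroid


def contains(ring, point):
    """Ray-casting point-in-polygon test (boundary cases are not treated specially)."""
    x, y = point
    inside = False
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        if (y0 > y) != (y1 > y) and x < x0 + (y - y0) * (x1 - x0) / (y1 - y0):
            inside = not inside
    return inside


def address_buildings(parcels, buildings):
    """Return {parcel_id: building_id} for the largest building on each parcel."""
    best = {}  # parcel_id -> (area, building_id)
    for building_id, ring in buildings.items():
        area, centroid = ring_area_and_centroid(ring)
        for parcel_id, parcel_ring in parcels.items():
            if contains(parcel_ring, centroid):
                if parcel_id not in best or area > best[parcel_id][0]:
                    best[parcel_id] = (area, building_id)
                break  # a centroid is assumed to fall in at most one parcel
    return {parcel_id: entry[1] for parcel_id, entry in best.items()}


if __name__ == "__main__":
    parcels = {"P1": [(0, 0), (10, 0), (10, 10), (0, 10)]}
    buildings = {
        "house": [(1, 1), (5, 1), (5, 5), (1, 5)],
        "shed": [(6, 6), (8, 6), (8, 8), (6, 8)],
    }
    print(address_buildings(parcels, buildings))
```

Testing the centroid rather than the whole footprint tolerates buildings that slightly overlap a parcel boundary because of digitizing error.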

Parcel data is also used to group buildings into township and range blocks for exporting manageable OSM files later on.

Address data (in the parcel table) is cleansed and normalized. Street names are left as is, but abbreviations are expanded (PL -> Place, ST -> Street, etc.). These addresses are also checked manually, and outlying data is corrected by hand. The estimated error rate is less than 0.5% once cleansing is done.
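A minimal sketch of the abbreviation expansion might look like the following. Only the PL and ST mappings come from the text; the other table entries, the function name, and the choice to expand only the final word (so that a leading "St" as in "St Charles" is left alone) are assumptions.

```python
# Only PL and ST come from the text above; the other entries are assumed examples.
SUFFIXES = {
    "PL": "Place",
    "ST": "Street",
    "RD": "Road",
    "AVE": "Avenue",
    "DR": "Drive",
}


def expand_street(name: str) -> str:
    """Expand an abbreviated street-type suffix, leaving the rest of the name as is."""
    words = name.split()
    if words:
        # Compare case-insensitively and ignore a trailing period ("St.").
        key = words[-1].rstrip(".").upper()
        words[-1] = SUFFIXES.get(key, words[-1])
    return " ".join(words)


if __name__ == "__main__":
    for raw in ("Main ST", "St Charles PL", "Beach Blvd"):
        print(raw, "->", expand_street(raw))
```

Suffixes missing from the table pass through unchanged, which leaves them visible for the manual check described above.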


The proposed convention is to tag imported data with the prefix "MGC:". Each data set carries different attributes, but most have UID numbers and some other data (such as the county name on the building footprints).

Detailed tagging data will be developed with each layer (outlined below).
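As a hedged illustration of the proposed scheme, the function below maps the attributes of one building footprint to OSM tags: the fixed building=yes and source=MGC tags, plus each non-empty source attribute under the "MGC:" prefix. The field names `UID` and `COUNTY` and the lower-casing of keys are assumptions, not the final tagging. The function is written as a plain Python function; the hook name and form that ogr2osm expects from a translation module differ between versions, so it would need adapting before use as a translator.

```python
def filter_tags(attrs: dict) -> dict:
    """Map one building footprint's shapefile attributes to OSM tags."""
    tags = {"building": "yes", "source": "MGC"}
    for field, value in attrs.items():
        text = "" if value is None else str(value).strip()
        if text:
            # Carry non-empty source attributes over under the proposed "MGC:" prefix.
            tags["MGC:" + field.lower()] = text
    return tags


if __name__ == "__main__":
    # Hypothetical attribute names and values for a single footprint.
    print(filter_tags({"UID": 1234, "COUNTY": "Harrison", "NOTES": ""}))
```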

Imported layers

Please refer to the MGC list of available data layers.

Import status and mapping notes for each layer follow.

MDEM Coast Building Footprints

Work on this layer is beginning.

POI Layers