Albania TPGInc Import/Roadmatcher

From OpenStreetMap Wiki

This process was originally developed for the Canadian import by Jason Reid, Steve Singer, Frank Steggink, and Sam Venkemans. It has been modified to work with The Pineridge Group Inc.'s data for Albania by Adam Dunn and James Michael Dupont.

The process has the following high-level steps:

  • Generate bounded ESRI Shapefiles of the current OSM data and TPG data
  • Import these shapefiles into OpenJUMP Roadmatcher
  • Perform AutoConflation in Roadmatcher, followed up by manual adjustments in Roadmatcher, using TPG as a base.
  • Use the result file from Roadmatcher in pineridge2osm.py to generate standalone.osm and excluded.osm
  • Import the standalone.osm file into OSM using JOSM

Generating Bounded Shapefiles

First, you need PostgreSQL with the PostGIS extensions. The examples on this page use a database named "gis". Instructions for setting it up are at Mapnik/PostGIS#Create_Database. That page tells you to log in as a user (or root) that has access to PostgreSQL; the next seven commands on this page should also be entered as that user. Next, import the spatial reference definitions into the gis database by running the following command (you only need to do this once per computer for the whole import; the path may differ depending on your PostgreSQL and PostGIS versions):

psql gis < /usr/share/postgresql-8.3-postgis/spatial_ref_sys.sql

In order to import the OSM data to PostGIS you need to get osm2pgsql: Osm2pgsql#From_source.

Download the LL_Roads2 shapefile set (.shp, .shx, etc.) from xhema.flossk.org. The name LL_Roads2 is hardcoded, so you will have to change the code in pineridge2osm.py if you want to use another shapefile.

Load the TPG data into PostGIS (only do this once, unless The Pineridge Group releases a newer version and you need to import the newer data). In this command, 4326 is the SRID of the data (EPSG:4326, i.e. WGS 84 latitude/longitude), albania_tpg_roadseg is the name of the table (hardcoded in the SQL functions imported later on), and gis is the name of the database:

shp2pgsql -s 4326 LL_Roads2 albania_tpg_roadseg | psql -d gis -f -

Now you need to get the current OSM data for a bounding box. There are two possible ways to go about this:

  • You can use JOSM to download a specific lat/long area by going through the "Bounding Box" tab in the download dialog box. I would suggest sizes of around 0.1 degrees lat and long.
  • If you can get the OSM data as bounded shapefiles (from Cloudmade or some other provider), that is fine, but The Pineridge Group shapefile still has to be bounded, so you still need PostgreSQL. (This method is not yet described on this page.)
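
To cover an area larger than a single download, the suggested 0.1 degree size can be applied as a grid. The following is a minimal Python sketch and not part of the import tooling; the function name split_into_tiles and the rounding to seven decimals are illustrative assumptions.

```python
import math


def split_into_tiles(min_lon, min_lat, max_lon, max_lat, step=0.1):
    """Yield (min_lon, min_lat, max_lon, max_lat) tiles of at most `step` degrees."""
    # Round before ceil so float noise (e.g. 1.0000000000000142) does not add a tile.
    cols = max(1, math.ceil(round((max_lon - min_lon) / step, 9)))
    rows = max(1, math.ceil(round((max_lat - min_lat) / step, 9)))
    for row in range(rows):
        for col in range(cols):
            west = round(min_lon + col * step, 7)
            south = round(min_lat + row * step, 7)
            # Clamp the last row and column to the requested extent.
            east = min(round(west + step, 7), max_lon)
            north = min(round(south + step, 7), max_lat)
            yield (west, south, east, north)
```

Each tuple can then be entered in the JOSM "Bounding Box" tab and reused for the matching PostGIS export of the same area.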

If you want to work with .osm files (i.e. you are using the JOSM method, not the Cloudmade method), you need to copy the osm2pgsql style file into your current directory (only do this once):

cp /path/to/osm2pgsql/default.style .

Then load the .osm file into PostgreSQL (repeat this for each area you work on):

osm2pgsql -v -l -d gis DownloadedOSMData.osm

To export the geometry from PostGIS, you will need SQL functions. Use the albania-functions.sql (gitorious repository link). You can import them into PostGIS with the following command (remember that "gis" is the name of the database here) (only do this once):

cat albania-functions.sql | psql -d gis

Finally, we get reprojected, bounded shapefiles out of PostgreSQL. The text inside POLYGON() is in WKT format. It is simply the five points that make up a closed rectangle (the first point is repeated as the last), written as longitude latitude pairs. The numbers shown are for an area near Gjirokastra (repeat this for each area):

pgsql2shp -f OSM.shp gis "select * FROM select_albania_osm_roadtile('SRID=4326;POLYGON((20 40,20.1 40,20.1 40.1,20 40.1,20 40))')"
pgsql2shp -f TPG.shp gis "select * FROM select_albania_tpg_roadtile('SRID=4326;POLYGON((20 40,20.1 40,20.1 40.1,20 40.1,20 40))')"
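
Typing the five corner points by hand is error-prone, so the polygon and the two export commands can be generated from the four bounding values. This is only a sketch: the helper names format_coord, bbox_to_wkt and export_commands are hypothetical, while the pgsql2shp arguments and the SQL function names are copied from the commands above.

```python
def format_coord(value):
    """Format a coordinate with up to 7 decimals and no trailing zeros."""
    return f"{value:.7f}".rstrip("0").rstrip(".")


def bbox_to_wkt(min_lon, min_lat, max_lon, max_lat):
    """Return a closed rectangle as WKT, in longitude latitude order."""
    corners = [
        (min_lon, min_lat),
        (max_lon, min_lat),
        (max_lon, max_lat),
        (min_lon, max_lat),
        (min_lon, min_lat),  # repeat the first point to close the ring
    ]
    ring = ",".join(f"{format_coord(x)} {format_coord(y)}" for x, y in corners)
    return f"POLYGON(({ring}))"


def export_commands(min_lon, min_lat, max_lon, max_lat, database="gis"):
    """Build the two pgsql2shp argument lists shown on this page."""
    ewkt = "SRID=4326;" + bbox_to_wkt(min_lon, min_lat, max_lon, max_lat)
    commands = []
    for shapefile, function in (("OSM.shp", "select_albania_osm_roadtile"),
                                ("TPG.shp", "select_albania_tpg_roadtile")):
        query = f"select * FROM {function}('{ewkt}')"
        commands.append(["pgsql2shp", "-f", shapefile, database, query])
    return commands
```

Each returned list can be passed to subprocess.run() as is, which avoids shell quoting problems with the parentheses and quotes in the query.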

You can now exit the special user that has access to postgresql.

OpenJUMP RoadMatcher

You will need to install and run OpenJUMP with RoadMatcher; see RoadMatcher for details. For the conflation options in RoadMatcher, a reasonable starting point is Minimum Segment Size: 5E-7, Standalone Distance: 5E-4, Match Distance: 5E-4. Feel free to adjust these numbers, as the optimal values differ between sparse and dense areas. In dense areas tighter tolerances work better, so you might end up using 5E-8, 5E-5 and 5E-5, respectively. You also need to make sure that ET_ID from The Pineridge Group data is exported into the result layer, because this is the ID that pineridge2osm.py uses later on.

See the export instructions at RoadMatcher#Exporting.
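
To get a feel for these tolerances, it helps to convert them to ground distance. The sketch below assumes the values are in degrees (the data is loaded as EPSG:4326 and the small magnitudes suggest it, but the page does not say so explicitly) and uses the common approximation of 111,320 m per degree of latitude; the function name and the default latitude of 40 are illustrative.

```python
import math

METRES_PER_DEGREE = 111_320.0  # approximate length of one degree of latitude


def tolerance_in_metres(degrees, latitude=40.0):
    """Return (north-south, east-west) ground distance for a tolerance in degrees."""
    north_south = degrees * METRES_PER_DEGREE
    # A degree of longitude shrinks with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude))
    return north_south, east_west
```

Under these assumptions a Match Distance of 5E-4 is roughly 56 m north-south and 43 m east-west near Gjirokastra, and 5E-5 is a tenth of that.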

pineridge2osm.py

Download pineridge2osm.py from Gitorious (try the tar.gz link on the right, or go to each file, scroll to the bottom and click on "Raw blob data", or git clone the whole repository).

The script needs the Shapely library for Python, which you can install with:

sudo aptitude install python-shapely

Create a text file that contains the bounding WKT (e.g. name it GjirokastraWKT.txt). This is the same polygon that you passed to PostGIS earlier, without the SRID=4326; prefix:

POLYGON((20 40,20.1 40,20.1 40.1,20 40.1,20 40))
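
A mistyped or unclosed ring in this file is easy to miss. The following sketch checks the text before it is handed to the script. It is not part of pineridge2osm.py, the names read_ring and ring_bounds are hypothetical, and it only accepts the plain single-ring form shown above.

```python
import re


def read_ring(wkt_text):
    """Parse a single-ring POLYGON WKT and return its points as (lon, lat) floats."""
    match = re.fullmatch(r"\s*POLYGON\s*\(\(\s*(.+?)\s*\)\)\s*", wkt_text)
    if match is None:
        raise ValueError("expected text of the form POLYGON((x y,x y,...))")
    points = []
    for pair in match.group(1).split(","):
        lon, lat = pair.split()
        points.append((float(lon), float(lat)))
    if len(points) < 4 or points[0] != points[-1]:
        raise ValueError("ring must have at least 4 points and end where it starts")
    return points


def ring_bounds(points):
    """Return (min_lon, min_lat, max_lon, max_lat) of the parsed ring."""
    lons = [lon for lon, _ in points]
    lats = [lat for _, lat in points]
    return min(lons), min(lats), max(lons), max(lats)
```

For example, ring_bounds(read_ring(open("GjirokastraWKT.txt").read())) should give back the same four values that were used for the JOSM download.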

You will need a GML version of LL_Roads2. GML is used because it is an XML-based open standard rather than a proprietary format like ESRI Shapefiles. You may be able to get the GML file from xhema.flossk.org or, to save 161MB of bandwidth, you can create it from the shapefiles with the ogr2ogr application (only do this once):

ogr2ogr -f "GML" LL_Roads2.gml LL_Roads2.shp

Run pineridge2osm.py (Gjirokastra is used as the running example) (repeat for each area):

./pineridge2osm.py -i LL_Roads2.gml -b GjirokastraWKT.txt -e GjirokastraResult.jml -o GjirokastraMatchedOut.osm

This will result in three files:

  • GjirokastraMatchedOut.osm - the full set of roads from TPG for the bounded area
  • GjirokastraMatchedOut.osm.excluded.osm - the set of TPG roads that Roadmatcher found as already being in OSM
  • GjirokastraMatchedOut.osm.standalone.osm - the set of TPG roads that Roadmatcher found are not yet in OSM

Open one or more of these files in JOSM and start fixing topology, adding names, improving classification, adding fast food restaurants, and generally making OSM a better map. Some roads will be tagged with highway=boulevard. These roads need to be split into two separate one-way ways to represent the boulevard configuration.
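
Because this step is repeated for each area, the command line and the expected output names can be derived from the area name. This is a sketch that simply follows the file naming used in the Gjirokastra example; the function name pineridge_command is hypothetical, and the option letters are copied from the command above.

```python
def pineridge_command(area, gml="LL_Roads2.gml"):
    """Return (argument list, expected output files) for one area."""
    base = f"{area}MatchedOut.osm"
    command = [
        "./pineridge2osm.py",
        "-i", gml,                    # full TPG road set in GML
        "-b", f"{area}WKT.txt",       # bounding polygon as WKT
        "-e", f"{area}Result.jml",    # result layer exported from RoadMatcher
        "-o", base,
    ]
    outputs = [base, base + ".excluded.osm", base + ".standalone.osm"]
    return command, outputs
```

After running the command (for instance with subprocess.run), checking that all three output files exist is a quick way to confirm the run completed.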

Tag Mapping

The following shows how pineridge2osm.py translates attributes in The Pineridge Group data into tags in OSM. Along with the OSM-specific tags, the original TYPE value is placed in tpg:type=*, CATEGORY in tpg:category=*, and Cat2 in cat2=*, so no information is lost in the conversion. This list attempts to be a complete reference of all (type, cat, cat2) tuples. If you discover another tuple, please report its et_id to the developers. Please do not edit this table unless you also edit the Python script. If you would like changes to the tag mapping, please discuss them with a developer on the mailing lists.

TPG TYPE | TPG CAT | TPG CAT2 | OSM tags
Railroad | 0 | 0 | railway=rail
(blank) | 0 | 0 | highway=residential
(blank) | 0 | 11 | highway=residential
(blank) | 0 | 80 | highway=residential
Nat | 1 | 1 | highway=primary
National Asphalted Road | 1 | 10 | highway=trunk
National Asphalted Road | 1 | 11 | highway=primary
Well-Kept Gravel Road | 2 | 10 | highway=unclassified, surface=gravel
Well-Kept Gravel Road | 2 | 20 | highway=unclassified, surface=gravel
Seasonal Road | 3 | 10 | highway=unclassified, seasonal:winter=no, seasonal:summer=yes, seasonal:autumn=yes, seasonal:spring=yes
Seasonal Road | 3 | 30 | highway=unclassified, seasonal:winter=no, seasonal:summer=yes, seasonal:autumn=yes, seasonal:spring=yes
Village Road | 4 | 10 | highway=tertiary
Village Road | 4 | 11 | highway=tertiary
Village Road | 4 | 40 | highway=tertiary
Dwelling Area Road | 4 | 40 | highway=residential
(blank) | 5 | 10 | highway=residential
Dwelling Area Road | 5 | 10 | highway=residential
Dwelling Area Road | 5 | 11 | highway=residential
Boulevard | 5 | 50 | highway=boulevard
Dwelling Area Road | 5 | 50 | highway=residential
Dwelling Area Road | 6 | 60 | highway=residential
Village Road | 6 | 60 | highway=tertiary
Footpath | 7 | 70 | highway=footway
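
The mapping above amounts to a lookup keyed on the (type, cat, cat2) tuple. The sketch below illustrates that idea with a handful of rows from the table. It is not the actual code of pineridge2osm.py, and the names TAG_MAP and translate are hypothetical.

```python
# A few rows from the table, keyed on (TPG TYPE, TPG CAT, TPG CAT2).
TAG_MAP = {
    ("Railroad", 0, 0): {"railway": "rail"},
    ("National Asphalted Road", 1, 10): {"highway": "trunk"},
    ("National Asphalted Road", 1, 11): {"highway": "primary"},
    ("Well-Kept Gravel Road", 2, 10): {"highway": "unclassified", "surface": "gravel"},
    ("Village Road", 4, 40): {"highway": "tertiary"},
    ("Dwelling Area Road", 4, 40): {"highway": "residential"},
    ("Boulevard", 5, 50): {"highway": "boulevard"},
    ("Footpath", 7, 70): {"highway": "footway"},
}


def translate(tpg_type, category, cat2):
    """Return the OSM tags for one TPG road, keeping the original values."""
    key = (tpg_type, category, cat2)
    if key not in TAG_MAP:
        raise KeyError(f"unmapped tuple {key}; report its et_id to the developers")
    tags = dict(TAG_MAP[key])  # copy so the shared table is never modified
    tags["tpg:type"] = tpg_type
    tags["tpg:category"] = str(category)
    tags["cat2"] = str(cat2)
    return tags
```

Keying on the whole tuple matters because the same cat and cat2 pair can map to different highway values depending on the type, as with Village Road and Dwelling Area Road at 4, 40.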