LINZ Trial Import

From OpenStreetMap Wiki
This page describes a historic artifact in the history of OpenStreetMap. It does not reflect the current situation, but instead documents the historical concepts, issues, or ideas.
About
Captured time
2009


User:JoeRichards ran an import of LINZ & NZOGPS and put the results on a dev server. The maps can be compared against live OSM data here.

The source code is visible in a pretty trac page here

or download via subversion, e.g.

svn co http://svn.openstreetmap.org/applications/utils/import/linz2osm

TODO:

  • JR (or anyone with me)- move this over to the new dev server with a working API (the old dev server's API server doesn't work for this instance)
  • JR - fix the Gmap api key (since the server is now called old-dev, not dev)


The actual XML OSM files output by the script run are here:

http://joerichards.dev.openstreetmap.org/files.html

This page is intended for listing problems and queries regarding the test import and preparing for the next run.

Issues/Bugs/Problems/Questions

Roundabouts

roundabouts are broken

Roundabouts are apparently split. This is caused by roundabouts being tagged as junction=roundabout, but not highway=*

roundabouts are there, just lacking highway tags

Roundabouts are present, just split up and lacking highway=* tags

Problem:

The split roundabouts may be caused by one of the authoring
requirements for Garmin maps: each roundabout is split into
two separate polylines to avoid a self-intersecting line.

Right, they look like a donut with two bits taken out. At least the ones I've spotted so far.

Solution

  • user:JoeRichards - Tagging the roundabout as junction=roundabout is the obvious part, but it also needs to be tagged as highway=*. The exact highway value (road, primary, tertiary etc.) is determined by the roads that lead into the roundabout. As such it's not easy to script, since the script would have to work out what connects to the roundabout and take the value from those ways.
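
A minimal sketch of how the script could work out the highway value from the connecting ways. It assumes the roundabout pieces and the roads leading into them share node ids, that link roads can be counted as their parent class, and that the ranking list below is acceptable; the function name and the ranking are illustrative and not part of the existing import script.

```python
# Assumed ranking of highway values, most important first (illustrative, not exhaustive).
HIGHWAY_RANK = ["motorway", "trunk", "primary", "secondary", "tertiary",
                "unclassified", "residential", "service", "track"]


def roundabout_highway(roundabout_nodes, ways):
    """Pick a highway=* value for a roundabout from the ways that touch it.

    roundabout_nodes: node ids making up the roundabout (all split pieces).
    ways: iterable of (node_ids, tags) pairs for the imported ways.
    Returns the highest-ranked connecting class, or "road" if none is found.
    """
    ring = set(roundabout_nodes)
    best = None
    for node_ids, tags in ways:
        if tags.get("junction") == "roundabout":
            continue  # skip the roundabout's own pieces
        if not ring.intersection(node_ids):
            continue  # this way does not touch the roundabout
        value = tags.get("highway", "")
        # Assumption: treat e.g. secondary_link as secondary for ranking.
        if value.endswith("_link"):
            value = value[:-len("_link")]
        if value in HIGHWAY_RANK:
            if best is None or HIGHWAY_RANK.index(value) < HIGHWAY_RANK.index(best):
                best = value
    return best or "road"
```

If the imported ways do not share node ids, the nodes would first have to be matched by coordinate, which this sketch does not attempt.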

Speed limits

Problem:

  • Speed limits are looked up by the script using codes; the actual speed limits are not present in the NZOGPS .MP files
  • At the moment they are set to these (weird) numbers, which may well correspond to actual speed limits, but the numbers themselves are not right: speedval = [8, 20, 40, 56, 72, 93, 108, 128]

Proposed Solution(s):

  • examine roads with known speed limits and compare them with the resulting .OSM output from the script. If these are consistently the same (but with the wrong number) we adjust the speed limit, e.g. 56 becomes 50.
  • Apparently the speedval from NZOGPS is not official data; it's just based on road type with a dash of local knowledge about traffic conditions. In that case these should not be imported, and it should be left to the end-user app to decide how to interpret the speed from the road type.
Typical road speeds in NZ are 100 km/h for motorways and highways and 50 km/h for residential roads. The remaining 9.999% will be one of 5, 10, 15, 20, 30, 40, 60, 70, 80, 90.
  • others?
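
The first proposal (compare known limits with the script output and adjust only where the result is consistent) could be sketched as below. The function name and the sample pairs in the usage note are hypothetical; no real survey data is implied.

```python
from collections import defaultdict


def derive_speed_mapping(samples):
    """Build a speedval -> real limit table from roads with known limits.

    samples: iterable of (speedval_from_script, known_limit_kmh) pairs.
    Only speedvals whose samples all agree on one limit go into the mapping;
    inconsistent ones are returned separately for manual review.
    """
    seen = defaultdict(set)
    for speedval, known_limit in samples:
        seen[speedval].add(known_limit)
    mapping = {sv: next(iter(limits))
               for sv, limits in seen.items() if len(limits) == 1}
    conflicts = {sv: sorted(limits)
                 for sv, limits in seen.items() if len(limits) > 1}
    return mapping, conflicts
```

For example, with made-up samples `[(56, 50), (56, 50), (72, 70), (72, 80)]` the mapping would contain only 56 → 50, and 72 would be reported as a conflict. If the speedval turns out to be unofficial, as the second proposal suggests, none of this should be imported.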

Braided rivers

Problem:

  • large, complex "braided" rivers appear to be absent from the source NZOGPS data
  • smaller rivers are present, but very blocky.

Proposed Solution(s):

  • Graeme Williams could find out why these are missing from NZOGPS and we work from there
Apparently they were cut down in that dataset to save space for more road-related things.
  • import from another dataset (what others contain these?)
-- Just grab the original data from the LINZ topo dataset river_poly layer. Koordinates.com have these available for download as a shapefile. --HB
  • What else has been cut down to save space?

Town sizes

Problem:

too many place names at 1:1.5M

At the moment the Garmin codes (which roughly specify a region/city/town/locality etc.) are translated into OSM tags, but they might not be at the right scale (e.g. a village might be tagged as a hamlet). This might mean that we see too many (I think!) or too few localities at a particular zoom level.


Solution(s):

  • manually review a handful of the places, check their official populations (against the size guidelines in Key:place) and see if they match up. If a certain category is consistently wrong, we can change the code to retag it.
  • Note you may want to scale the "official" OSM guidelines to reflect NZ's small and low population (and place name) density.
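
A rough sketch of the review step, with a scale factor for NZ as suggested above. The population thresholds are placeholders chosen for illustration only and should be replaced with whatever Key:place currently recommends; the function names are hypothetical.

```python
# Placeholder thresholds (lower bound, place value), largest first.
# These numbers are assumptions for illustration; check Key:place before use.
THRESHOLDS = [(100000, "city"), (10000, "town"), (200, "village")]


def place_for_population(population, scale=1.0):
    """Return a place=* value for a population, with thresholds scaled for NZ."""
    for lower_bound, value in THRESHOLDS:
        if population >= lower_bound * scale:
            return value
    return "hamlet"


def mismatches(places, scale=1.0):
    """places: iterable of (name, imported_place_value, official_population).

    Returns (name, imported value, suggested value) for every disagreement.
    """
    result = []
    for name, imported, population in places:
        suggested = place_for_population(population, scale)
        if imported != suggested:
            result.append((name, imported, suggested))
    return result
```

If one imported category shows up in the mismatch list again and again, that is the signal to change the translation for the corresponding Garmin code.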

Abbreviations used

Problem: The NZOGPS data uses common (and not-so-common) abbreviations such as St for Street. OpenStreetMap doesn't use abbreviations, so we need to translate these back to their unabbreviated forms without doing it in the wrong places, e.g. St Heliers is not Street Heliers.

A list of abbreviations and their full versions (thanks to Mike Oberdries) is here: http://docs.google.com/Doc?docid=0AT8ugvNNx8UZZHh3OHZ0NV83Mmdxc3pwdmZ4&hl=en

  • The original LINZ database has the full/expanded version with correct title casing.

Solution(s)

  • make a mapping table using the above data and incorporate this into the script.
    • does this work? are the abbreviations only at the end of a placename, or can they exist in other places -- User:JoeRichards
  • Use way 'Name ID' to look up correct string in LINZ topo shapefile .dbf file (see comments re. fixing 'Arthur'S' in #Minor problems section below) --Hamish
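
A minimal sketch of the mapping-table approach, assuming abbreviations only need expanding when they are the final word of the name (which is exactly the open question above). The table is a tiny illustrative subset, not the real abbreviation list, and the function name is hypothetical.

```python
# Tiny illustrative subset; the real table would come from the abbreviation list.
SUFFIXES = {"St": "Street", "Rd": "Road", "Ave": "Avenue"}


def expand_suffix(name):
    """Expand an abbreviation only when it is the final word of the name."""
    words = name.split()
    if len(words) > 1 and words[-1] in SUFFIXES:
        words[-1] = SUFFIXES[words[-1]]
    return " ".join(words)
```

With this rule "Queen St" becomes "Queen Street" while "St Heliers" is left alone, and "St Heliers Bay Rd" becomes "St Heliers Bay Road". Abbreviations in the middle of a name would still need the DBF lookup or manual review.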


Link roads

Robin P said:

yes, the on/off ramps have all been imported as secondary_link - they
should be motorway_link

TODO - check this outside of Auckland and in Auckland. They might just be motorway_link in the examples he's checked (since he's in Auckland). Or it might be that all of them really need to be motorway_link after all.

Dual-carriageways

>> there are a lot of dual-carriageways that have been imported as single
>> carriageway roads. again, is this a lack of data?
>
> Are you zoomed in enough?  At zoom 16 they should be visible.  Again,
> something for the check-list, cheers!

no, looks like it's a lack of data

Double-check. Is it just a particular area? Can we get around this during our manual merge process?

General issues/bugs

  • Roads seem to be tagged with all access restrictions eg most roads have emergency=yes,goods=yes,motorcar=yes,psv=yes,taxi=yes,foot=yes,bicycle=yes,hgv=yes. I believe normal practice is to only include access tags where they differ from the defaults for the highway tag used. For example highway=residential implies yes for motorcar, motorcycle, goods, hgv, psv, moped, horse, bicycle and foot. More details here OSM_tags_for_routing/Access-Restrictions --rcr
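
One way the script could drop the redundant access tags is sketched below. Only the highway=residential defaults quoted above are filled in; the table for other highway values is left empty on purpose, and the names are hypothetical.

```python
# Modes that highway=residential already implies "yes" for, per the note above.
RESIDENTIAL_DEFAULT_YES = {"motorcar", "motorcycle", "goods", "hgv", "psv",
                           "moped", "horse", "bicycle", "foot"}

# Only residential is filled in; other highway values would need their own sets.
IMPLIED_YES = {"residential": RESIDENTIAL_DEFAULT_YES}


def strip_default_access(tags):
    """Return a copy of tags without access keys that merely restate the default."""
    implied = IMPLIED_YES.get(tags.get("highway"), set())
    return {k: v for k, v in tags.items()
            if not (k in implied and v == "yes")}
```

Keys that are not in the default set (such as emergency or taxi in the example above) and values that differ from the default (such as bicycle=no) are kept as they are.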

Problems with Garmin types

  • (0x7) translated to landuse=aeroway. Should probably be aeroway=aerodrome. Airport node is fine, just the way representing the airport area is not showing. Example is Christchurch International Airport --rcr
  • (0xe) currently not translated. Appears to be an area corresponding to airport runways and maybe apron. Could be translated to aeroway=taxiway and aeroway=apron, but in the case of taxiway it would need to somehow be converted to a way rather than an area. Example is Christchurch International Airport. Smaller airfields seem to be present but are missing info to identify them; for example Rangiora airfield just has a Garmin type of (0xe) and name=Aerodrome, and there are others with name=Airstrip. --rcr
http://koordinates.com/layer/560-nz-runways-and-airstrips-v14/
http://koordinates.com/layer/123-nz-airports-v14/
  • (0x13) translated to building=yes. These look OK at lower zoom levels but are very inaccurate when compared to satellite imagery. Problems include buildings overlapping roads, and irregular shapes and sizes that don't match the actual building. See here for a sample; it will be a lot worse if rendered at max zoom. --rcr
-- maybe these have been cut down to save space in the .mp file, same as the river_poly's were? or perhaps they are just crude representations for the topo maps -- HB
http://koordinates.com/layers/global/oceania/new-zealand/?q=building
  • (0x17) translated as leisure=park. These appear to be a mixture of parks, reserves and domains, so leisure=park may not always be the best translation; landuse=recreation_ground or leisure=nature_reserve may be more appropriate. Since they are all one Garmin type it may not be possible for the script to differentiate, but in some cases the words Domain, Park or Reserve appear in the name, so that could be used in some cases? --rcr
  • (0x650c) currently just tagged with name. Appear to be small offshore islands or in some cases big rocks. place=islet or place=island probably the best translation. --rcr
  • (0x900) translated to place=city, examples I have checked so far are actually towns. 30 examples in the Canterbury.osm file. --rcr
  • (0x6406) translated as place=locality,mountain_pass=yes. Seem to be a mixture of passes and fords. In OSM mountain_pass=yes and highway=ford usually applies to nodes on a highway but these ones from the LINZ data are all just standalone nodes which is understandable since many are saddles and cols in remote areas with no roads or tracks nearby. --rcr
http://koordinates.com/layer/588-nz-saddles-v14/
http://koordinates.com/layer/587-nz-fords-v14/
  • (0x6412) translated as place=locality, highway=track. Mostly look like names for tramping/hiking tracks so highway=footway might be more accurate but is still not valid since they are just nodes not ways. Not sure there is an existing corresponding valid tag. --rcr
A matrix of OSM icons can be found here. See transport → track (no listed OSM Condition)
  • (0x6616) translated to natural=peak, look good but no elevation info, is it available? --rcr
-- Yes, the LINZ topo4 shapefiles' height_pnt.dbf includes a single-precision floating point column called "Elevation" containing the peak height (ridges too). You can also get the trig station database from LINZ as a separate dataset. --HB
see http://koordinates.com/layers/global/oceania/new-zealand/?q=height
  • unsealed roads are showing on linz dev map as tracks
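
For the (0x17) case, the name-keyword idea could look roughly like this. Which tag each keyword should map to is an assumption made for illustration (Reserve to leisure=nature_reserve, Domain to landuse=recreation_ground) and would need agreeing before use; the function name is hypothetical.

```python
def tags_for_0x17(name):
    """Guess tags for a Garmin 0x17 area from keywords in its name."""
    words = name.lower().split()
    if "reserve" in words:
        # Assumed mapping: reserves become nature reserves.
        return {"leisure": "nature_reserve", "name": name}
    if "domain" in words:
        # Assumed mapping: domains become recreation grounds.
        return {"landuse": "recreation_ground", "name": name}
    # "Park", or no keyword at all: keep the script's current translation.
    return {"leisure": "park", "name": name}
```

Areas without any of the keywords fall back to the current leisure=park translation and would still need a manual look.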

Minor problems

Also need to check this:

>> also, what are the rust-brown areas. in some places they appear to
>> correspond to landuse=industrial/commercial/retail in others to
>> building=yes?
>
> I'd say building from what you mentioned, but if you have any specific
> examples, take a look at the .osm files

yeah, 0x13 has been imported as building, probably should be
landuse=commercial or retail 

hmm, 0x13 looks like it includes a lot of things. hard to say what to
do here. newmarket and eden terrace are covered in them
  • Probably better to take everything but road data directly from the LINZ topo v14 layers. --Hamish

Data sources

Used in import

Sources: LINZ, Corax, Zenbu
http://mapcenter2.cgpsmapper.com/mapsetview.php?id=185
Public data license: BSD-like with attribution:
"Where data from NZTopo is reproduced, derived or copied,
the following acknowledgement note must be shown on the
product and associated media:

   Sourced from NZTopo Database. Crown Copyright Reserved"
-- http://www.linz.govt.nz/topography/topographic-data/vector-extracts/index.aspx

Additional data

Many LINZ layers are available for download from Koordinates.com.

For example: train stations:
http://koordinates.com/layer/606-nz-railway-stations-v14/

A comprehensive place name database can be downloaded from http://www.geonames.org. Geonames data is largely an import (http://forum.geonames.org/gforum/posts/list/26.page) of the LINZ placename data, from several years ago - better to go to the source http://www.linz.govt.nz/placenames/find-names/nz-gazetteer-official-names/index.aspx . In my opinion that data is of questionable value, odd mix of data, coords read from paper maps... --Zenbu

We have started mapping the LINZ layers to OSM types. Each layer needs to be looked at individually to see what tags we should use and what manual work will need to be done to clean things up once imported. We are using the Chatham Islands as a test bed, as there is very little OSM data there already, so we won't be overwriting any existing work. See the List of LINZ layers and notes for details of the progress. I will update this section with notes on the mapping process once I have this documented - Barnaclebarnes


Comparative datasets

  • Transit NZ GPS traces of all roads in GPX format
data: http://www.openstreetmap.org/traces/tag/TransitHSDC2008
about: http://www.gis.org.nz/wiki/Transit_High_Speed_Data_Collection_Survey

Import script

Previous version:

Which was in turn based on:


Fixed Now

  • name of Arthur's Pass is name=Arthur'S Pass (note capital 'S')
    • fixed using nice little python function (see below), Thanks! --User:JoeRichards
    • could be translation problem or in original data. --rcr
    • It's the conversion code itself, trying to be smart but being buggy. This is caused by the function which is used to capitalise words: the names in the nzogps/linz data are all upper case, so it tries to convert them. --JoeRichards
    • It's a limitation of .title(). Jesse Hager suggests:
 def capwords(words):
     return ' '.join([x.capitalize() for x in words.split()])
    • Alternatively, the .mp file retains the LINZ ID number, and the original LINZ .dbf has correctly cased and expanded road names, so we could churn through a little program to select the correct name from the DB for each road. e.g. for Glen Nevis Station Rd. in Southland.osm, the "Name ID" is 1030000192208.
LINZ topo v4 DBF: 
 Id         : 111056
 Name       : Glen Nevis Station Road
 Crt date   : 
 Mod date   : 
 Surface    : unmetalled
 Status     : 
 Lane count : 1
 Hway numbe : 
 Road image : 
 Way count  : 
 Name id    : 1030000192208
Southland.mp:
 ;sufi=1030000192208
 ;Auto-numbered=20080524
 [POLYLINE]
 Type=0x6
 Label=GLEN NEVIS STATION RD
 EndLevel=1
 CityIdx=4
 RoadID=22944
 RouteParam=2,0,0,0,0,0,0,0,0,0,0,0
 Data0= [...]
Joe's Southland.osm:
  <way id="-129" visible="true">
   <tag k="source" v="LINZ & NZ Open GIS" />
   <tag k="linz:sufi" v="1030000192208" />
   <tag k="linz:garmin_type" v="0x6" />
   <tag k="highway" v="residential" />
   <tag k="name" v="Glen Nevis Station Rd" />
   <tag k="linz:RoadID" v="22944" />
   ...
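
To illustrate how the fields in the Southland.mp record above could be pulled out, here is a small sketch of a parser for a single [POLYLINE] record. It assumes the ;sufi= comment sits directly with the record as in the sample; the function name is hypothetical and this is not the existing import script's parser.

```python
def parse_mp_polyline(lines):
    """Extract the sufi comment and key=value fields from one [POLYLINE] record."""
    record = {}
    for raw in lines:
        line = raw.strip()
        if line.startswith(";sufi="):
            record["sufi"] = line[len(";sufi="):]
        elif line.startswith(";") or line.startswith("["):
            continue  # other comments and section markers
        elif "=" in line:
            key, value = line.split("=", 1)
            record[key] = value
    return record
```

For the sample record this yields sufi 1030000192208, Type 0x6, Label GLEN NEVIS STATION RD and RoadID 22944, which are the values that end up in the linz:* tags of the .osm output.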

So process would be like:

  1. Get the ;sufi= value from the .mp file
  2. Look up that Name ID value in the LINZ topo shapefile .dbf database
  3. If found, check that the two names match up to the last space, i.e. everything before the final (possibly abbreviated) word after a .split(' '). If so, use the expanded name from the DBF. If not, output a debug message + full details to stderr and use the capwords() or .title() version of the .mp string.
  4. If not found in the DBF, use capwords() on the .mp string and a switch table to expand common ' Rd$'→Road, ' St$'→Street, etc.
  5. Profit!
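
The steps above can be sketched as follows. A plain dict stands in for the .dbf lookup (reading the real file is not shown), the comparison in step 3 ignores case, and the suffix expansion from step 4 is left out; resolve_name and dbf_names are hypothetical names.

```python
import sys


def capwords(words):
    # Capitalise each whitespace-separated word; avoids the "Arthur'S" problem of .title().
    return ' '.join(x.capitalize() for x in words.split())


def resolve_name(sufi, mp_label, dbf_names):
    """Pick the best name for a road.

    sufi: the ;sufi= value from the .mp file (step 1).
    mp_label: the Label= string from the .mp file.
    dbf_names: dict mapping Name ID -> full LINZ name (stand-in for the .dbf).
    """
    fallback = capwords(mp_label)
    full = dbf_names.get(sufi)  # step 2
    if full is None:
        return fallback  # step 4 (abbreviation expansion would go here)
    # Step 3: compare everything before the last word, ignoring case.
    mp_stem = mp_label.upper().split()[:-1]
    dbf_stem = full.upper().split()[:-1]
    if mp_stem == dbf_stem:
        return full
    print("name mismatch for sufi %s: %r vs %r" % (sufi, mp_label, full),
          file=sys.stderr)
    return fallback
```

With the Glen Nevis record above, the .mp label GLEN NEVIS STATION RD and the DBF name agree up to the last word, so the full DBF name is used.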


The same is true for road surface type and number of lanes, etc.