New Zealand/Place names

New Zealand Place Names

This page is concerned with establishing consensus on how to deal with New Zealand place names.

It is for general use but is being developed in support of the LINZ data import.

There are two main issues:

  • Localization (l10n) and language translations
  • Mapping settlement size to a place tag (e.g. at what population does a village become a town?)


Localization

See the main OSM page: Key:name#Localization

  • The ISO 639-1 two letter code for Māori is "mi".
  • The ISO 639-1 two letter code for English is "en".

Together with the country code, we have mi_NZ and en_NZ locales.

An introduction to UTF-8 and Unicode

Non-Latin letters, including the macronised Latin vowels (ā, ē, ī, ō, ū) used in te reo Māori, are stored using UTF-8, an encoding of Unicode.

Some introductions to the details of how that works:

Unicode Macrons, mainly for Te Reo Māori
Character Encodings
Māori spell checker for OpenOffice howto
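
As a small illustration of the encoding discussed above, the following Python sketch shows how a macronised letter is represented in Unicode and in UTF-8. The example name and the use of Python are assumptions made for illustration; nothing here is specific to OSM tooling.

```python
import unicodedata

# "Kaikōura"; \u014d is LATIN SMALL LETTER O WITH MACRON
name = "Kaik\u014dura"

# In UTF-8, plain ASCII letters take one byte each; the macronised o takes two.
encoded = name.encode("utf-8")
print(len(name), "characters,", len(encoded), "bytes")
print(encoded)

# The same visible letter can be one code point (NFC form) or a plain "o"
# followed by a combining macron, U+0304 (NFD form).
composed = unicodedata.normalize("NFC", name)
decomposed = unicodedata.normalize("NFD", name)
print(len(composed), len(decomposed))
print(composed == decomposed)
```

The two normalization forms look the same on screen but do not compare equal, which is worth keeping in mind when comparing names that come from different sources.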

How to tag

name=the default name, used locally
name:mi=the name in Māori
name:en=the name in English
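
A data consumer might read these keys roughly as in the sketch below: prefer the language-specific name and fall back to the default name. The function name display_name and the sample tag values are hypothetical, chosen only to illustrate the scheme; this is not how any particular renderer or editor is known to work.

```python
def display_name(tags, lang=None):
    """Return tags['name:<lang>'] if present, otherwise the default name.

    `tags` is a plain dict of OSM key/value strings. Returns None if
    neither a localized nor a default name exists.
    """
    if lang:
        localized = tags.get(f"name:{lang}")
        if localized:
            return localized
    return tags.get("name")


# Illustrative values only, not a tagging recommendation.
tags = {"name": "Taranaki", "name:en": "Mount Egmont"}

print(display_name(tags, "en"))  # uses name:en
print(display_name(tags, "mi"))  # no name:mi, falls back to name
print(display_name(tags))        # no language requested, uses name
```

With a fallback like this, a language-specific key that merely repeats the default name adds no information for the consumer.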

TODO: Open question: when to use which? For example:

name=Taranaki
name:en=Mount Egmont
name:mi=Taranaki  (redundant?)

name=Kaikōura or Kaikoura?
name:en=Kaikoura  (redundant? Can/should we really call this English, or just skip the :en version?)
name:C=Kaikoura  (LANG=C style for a plain ASCII version? The average OSM mapper who is not a UNIX programmer probably won't know what it means.)
name:mi=Kaikōura  (if the macronised version is used in the main name=, is this redundant?)
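
One consideration that may bear on the plain ASCII question: for macronised vowels, an ASCII form can be derived mechanically from the macronised name. The sketch below shows one way to do that in Python, together with a check for name:* keys that simply repeat the default name. The helper names strip_macrons and redundant_name_keys are hypothetical, and the sketch does not settle which tags should be used.

```python
import unicodedata

COMBINING_MACRON = "\u0304"


def strip_macrons(text):
    """Return `text` with macrons removed, leaving other accents intact."""
    # NFD splits each macronised letter into base letter + combining macron.
    decomposed = unicodedata.normalize("NFD", text)
    without_macrons = decomposed.replace(COMBINING_MACRON, "")
    # Recompose so any remaining accented letters return to single code points.
    return unicodedata.normalize("NFC", without_macrons)


def redundant_name_keys(tags):
    """List the name:* keys whose value is identical to the default name."""
    default = tags.get("name")
    return sorted(
        key for key, value in tags.items()
        if key.startswith("name:") and value == default
    )
```

For example, strip_macrons applied to Kaikōura gives Kaikoura, and redundant_name_keys applied to the Taranaki tags above reports name:mi.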

Converting text

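If you need to convert text from ISO-8859-1 (Latin-1) encoding to UTF-8, which can represent macronised letters, you can use the GNU iconv utility (at least on Linux). Note that ISO-8859-1 itself has no macronised vowels, so the conversion changes only the encoding; it does not add macrons.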

iconv -f ISO_8859-1 -t UTF-8 iso_file > utf_file
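
Where iconv is not available, the same conversion can be sketched in a few lines of Python. The function name latin1_to_utf8 is hypothetical, and the sketch assumes the input really is ISO-8859-1 and is small enough to read into memory.

```python
from pathlib import Path


def latin1_to_utf8(src, dst):
    """Re-encode the file at `src` from ISO-8859-1 to UTF-8, writing to `dst`."""
    # Read raw bytes so that line endings pass through unchanged.
    raw = Path(src).read_bytes()
    # Every byte value is valid ISO-8859-1, so decoding cannot fail;
    # a wrongly assumed source encoding therefore goes undetected.
    text = raw.decode("iso-8859-1")
    Path(dst).write_bytes(text.encode("utf-8"))
```

Calling latin1_to_utf8("iso_file", "utf_file") mirrors the iconv command above.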

Interesting stuff to know

  • Components of Māori place names translated
http://www.nzhistory.net.nz/culture/tereo-100words#place

Sizes of populated places

Much of the work of matching New Zealand's ideas of what these places are to OSM's global conventions on what to call them has already been done as part of the LINZ data import, and is set up in Rob C's LINZ2OSM web app. TODO: summarise and copy that here so that everyone is working from the same page and nobody reinvents a wheel that already exists.