Multilingual names

From OpenStreetMap Wiki

There are various situations where we need to handle multilingual names, where a feature has different names in different languages. This is very common for the names of major world cities. In some regions, places and streets have names in all local languages. People will also want to render maps in a specific language (Map Internationalization).

Names#Localization is the primary documentation describing how to use Key:name with a language code. People seem to generally agree on using name:code=* where code is a language's ISO 639-1 code, or ISO 639-2 if an ISO 639-1 code doesn't exist.
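The name:code=* convention can be read by a data consumer with a simple fallback chain. The following is a minimal sketch, not an official algorithm: the function name localized_name and the idea of passing an ordered list of preferred codes are assumptions made for illustration.

```python
def localized_name(tags, preferred_codes):
    """Return the first available name:<code> value, else the plain name.

    `tags` is a plain dict of OSM tags; `preferred_codes` is an ordered
    list of language codes (hypothetical helper, for illustration only).
    """
    for code in preferred_codes:
        value = tags.get(f"name:{code}")
        if value:
            return value
    # No localized variant found: fall back to the main name tag.
    return tags.get("name")


liege = {"name": "Liège", "name:wa": "Lîdje"}
walloon_label = localized_name(liege, ["wa"])
german_label = localized_name(liege, ["de", "en"])
```

Here walloon_label picks up the name:wa value, while german_label falls back to name=* because neither name:de nor name:en is present.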

Issues:

  • what should the plain name=* tag be (local name, at least if not disputed), and how to specify the language that the name tag is in.
  • how to specify transliterations. It has been proposed to use a key of name:code_trans where code is as above and trans specifies the transliteration. There doesn't appear to be a standard list of transliteration codes.

On this page we give some country by country information particularly regarding the language used in the main 'name' tag. See Exonyms for some examples.

Belarus

Belarus has two official languages (Belarusian and Russian). Because Russian is used more widely, the name tag value should be in Russian. To add a Belarusian value, use the name:be tag.

Belgium

Belgium has three official languages (Dutch, French and German). Dutch is spoken in Flanders, French in Wallonia, German in the East-Kantons. In Brussels however, both French and Dutch are spoken. Street names in Brussels and some surrounding towns and villages are bilingual too.

In Brussels use name:fr for the French name and name:nl for the Dutch name. The currently used convention for name is name=French - Dutch or name=Dutch - French, such as name=Grand Place - Grote Markt. The order of the names is chosen by the first person creating the object, and should be left as is afterwards if there is no real reason to change it. Changing the order for the sake of having your own language first will be considered vandalism.
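The Brussels convention can be checked mechanically. The sketch below assumes tags are held in a plain dict and that the separator is exactly " - " as in the example above; the function name brussels_name_is_consistent is hypothetical. It accepts either order, since the order chosen by the first mapper should be left alone.

```python
def brussels_name_is_consistent(tags, separator=" - "):
    """Check that name=* combines name:fr and name:nl in either order."""
    fr = tags.get("name:fr")
    nl = tags.get("name:nl")
    if not fr or not nl:
        # Both language-specific tags are needed for a bilingual name.
        return False
    accepted = {fr + separator + nl, nl + separator + fr}
    return tags.get("name") in accepted


grand_place = {
    "name": "Grand Place - Grote Markt",
    "name:fr": "Grand Place",
    "name:nl": "Grote Markt",
}
ok = brussels_name_is_consistent(grand_place)
```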

In Wallonia, use "name" for the common French name; if you want to add the Walloon name, use "name:wa". For example, the city of Liège is tagged:

name=Liège
name:wa=Lîdje

Bulgaria

The official language in Bulgaria is Bulgarian written in Cyrillic.

Road and other signs in Bulgaria are written in Bulgarian and are usually transcribed to the Latin alphabet according to the Bulgarian transliteration law.

  • The geographic terms: планина (mountain), равнина (plain), низина (lowland), плато (plateau), град (city), село (village), река (river), езеро (lake), залив (bay) etc., which are part of the geographic name, are generally transliterated as follows:

Стара планина Stara planina
Атанасовско езеро Atanasovsko ezero

  • Geographic terms that are not part of the name should be translated (the example here is in English):

Нос Емине Cape Emine

  • The terms "северен" (northern), "южен" (southern), "източен" (eastern), "западен" (western), "централен" (central) and similar ones, in case they are part of the geographic name, should be transliterated:

Централен Балкан Tsentralen Balkan
София-юг Sofia-yug
Перник-север Pernik-sever

Example:

  • name=: name in Bulgarian (бул. Източен)
  • int_name=: transliterated name (bul. Iztochen)
  • name:en= (optional): Iztochen Blvd.
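The transliterated int_name values above can be approximated in code. This is only a sketch: the letter table below is assumed to follow the official streamlined romanisation (including the word-final "ия" to "ia" rule), it should be checked against the text of the transliteration law before use, and the function names are hypothetical. It does not decide whether a geographic term should be translated rather than transliterated.

```python
import re

# Assumed letter table (streamlined system); verify against the law.
BG_TO_LATIN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e",
    "ж": "zh", "з": "z", "и": "i", "й": "y", "к": "k", "л": "l",
    "м": "m", "н": "n", "о": "o", "п": "p", "р": "r", "с": "s",
    "т": "t", "у": "u", "ф": "f", "х": "h", "ц": "ts", "ч": "ch",
    "ш": "sh", "щ": "sht", "ъ": "a", "ь": "y", "ю": "yu", "я": "ya",
}


def _transliterate_word(match):
    word = match.group(0)
    suffix = ""
    # Word-final "ия" is written "ia" (for example София -> Sofia).
    if len(word) > 2 and word.lower().endswith("ия"):
        word, suffix = word[:-2], "ia"
    out = []
    for ch in word:
        latin = BG_TO_LATIN.get(ch.lower(), ch)
        # Keep initial capitals: "Ц" becomes "Ts", not "TS".
        out.append(latin.capitalize() if ch.isupper() else latin)
    return "".join(out) + suffix


def transliterate_bg(text):
    """Transliterate each run of Cyrillic letters; leave the rest as is."""
    return re.sub(r"[А-Яа-я]+", _transliterate_word, text)


int_name = transliterate_bg("бул. Източен")
```

Under these assumptions the sketch reproduces the examples given in this section, such as Tsentralen Balkan and Sofia-yug.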

China

Names in China are written with Chinese characters, but an English version and a pinyin version can be very useful for foreigners, so they should be included as well. Pinyin is written with accents (tone marks). These accents are mandatory and cannot be omitted.

In most cases, pinyin without tones does not add any information to the English name, so it doesn't have to be added.

A complete standard entry would include:
name=<Chinese>
name:zh=<Chinese>
name:en=<English>
name:zh_pinyin=<Chinese pinyin (with tones)>

An example of the given methodology follows...
name=朝阳公园南路
name:zh=朝阳公园南路
name:en=Chaoyang Park South Road
name:zh_pinyin=Cháoyánggōngyuán Nánlù
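An entry like the one above can be sanity-checked automatically. The sketch below is an illustration only: the function names are hypothetical, and the tone check merely looks for at least one tone diacritic, so it cannot tell whether every syllable carries the right tone.

```python
import unicodedata

# Combining macron, acute, caron and grave: the four pinyin tone marks.
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def has_tone_marks(pinyin):
    """True if the string contains at least one pinyin tone diacritic."""
    decomposed = unicodedata.normalize("NFD", pinyin)
    return any(ch in TONE_MARKS for ch in decomposed)


def check_zh_entry(tags):
    """Return a list of problems found in a Chinese name entry."""
    problems = []
    if tags.get("name") != tags.get("name:zh"):
        problems.append("name and name:zh differ")
    pinyin = tags.get("name:zh_pinyin")
    if pinyin is not None and not has_tone_marks(pinyin):
        problems.append("name:zh_pinyin has no tone marks")
    return problems


road = {
    "name": "朝阳公园南路",
    "name:zh": "朝阳公园南路",
    "name:en": "Chaoyang Park South Road",
    "name:zh_pinyin": "Cháoyánggōngyuán Nánlù",
}
problems = check_zh_entry(road)
```

Note that the diaeresis in ü is deliberately not counted as a tone mark.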

zh_pinyin is not a standard language code (as pinyin is not a language, only a romanisation system), but it cannot interfere with other codes, so it is possible to convert it later if a more coherent system is developed.

It is also possible to add local names using the same template name:code if the language has an ISO 639-1 or ISO 639-2 code (for example 'bo' for Tibetan, 'ug' for Uighur, etc.)

Places follow the same convention. The only difference is that Chinese maps add 市,镇,省,县 at the end of the name, so we follow this convention, but do not translate this part in English.
place=city
name=自贡市
name:en=Zigong
name:zh=自贡市
name:zh_pinyin=Zìgòng Shì

place=town
name=淅川县
name:en=Xichuan
name:zh=淅川县
name:zh_pinyin=Xīchuān Xiàn

(A standard language code for Pinyin would be name:zh-Latn-pinyin=*.)

Cornwall

Cornwall is a county in the UK. Cornish (Kernewek) was spoken there up to the 19th century. Some people think Cornwall should or could be its own country, or at least keep its own identity, and some road signs to attractions have had the Cornish flag spray-painted over the English Heritage sign.

Some of the street signs of Kerrier District Council have the names also written in Cornish. It seems good to add this data to OSM, though the English name is likely to be the only one used.

The ISO 639-1 code is kw. So name=English Street, and name:kw=Cornish Translation (if shown).

Croatia

All name=* tags should be in Croatian. Bilingual places are no exception. Italian names should be in name:it=* tags, and Serbian names should be in name:sr=*.

This was decided by vote on our mailing list: https://lists.openstreetmap.org/pipermail/talk-hr/2013-May/001815.html

Finland

Some cities/places/streets have names in Finnish and Swedish. Use name key for the commonly used language (mostly Finnish, use Swedish version if municipality is mainly Swedish) and name:fi and name:sv for the specific language version.

Three Sami languages (Northern, Inari and Skolt Sami) are recognized in the north, in the following municipalities: Enontekiö, Utsjoki, Inari and Sodankylä. The ISO 639-1 code system covers only Northern Sami (tagged as name:se), but with the ISO 639-2 code system all three languages can be tagged with the name key. Northern Sami is tagged with name:sme, Inari Sami with name:smn and Skolt Sami with name:sms.

More information on the WikiProject Finland page.

France

Bilingual streetsign in Perpignan, both in Catalan and French.

France has multiple languages besides French. There are also bilingual streetsigns. Some of the ISO codes are br (Breton), ca (Catalan), co (Corsican), oc (Occitan), eu (Euskara), vls (West Flemish), gsw (here: Alsatian). (note: gsw is also used for the German dialects in Switzerland and the dialects in Südbaden, Germany)

Germany

In the eastern part of Germany some people speak Sorbian. The centre of the Upper Sorbian (name:hsb) speech area is Bautzen, while Cottbus is the centre for Lower Sorbian (name:dsb).

Greece

The official language in Greece is Modern Greek.

  • Road and other signs in Greece are written in Modern Greek and are usually transcribed to the Latin alphabet using ISO 843:1999 (same as ELOT 743)
  • Way names are written in the Greek genitive case (no English equivalent; it is used to describe the owner of something, in this case who a way, such as a street, avenue or square, is named after). This differs from the name itself, which is expressed in the nominative case
  • "Οδός", Greek for "street" and "road", transcribes to "Odos". Although it appears on most signs, it is rarely used in speech and search, as the only other type of way in use is "avenue" ("λεωφόρος")
  • "Λεωφόρος", Greek for "avenue", transcribes to "Leoforos"
  • When there is a first and a last name, like in Eleftheriou Venizelou Avenue (Λεωφόρος Ελευθερίου Βενιζέλου) the order of first and last names is kept, unlike in printed guides where they were reversed for indexing reasons

So, "Λεωφόρος Δημοκρατίας", (after "Δημοκρατία", meaning "Democracy Avenue") is transliterated to "Leoforos Dimokratias". It can also be expressed as "Dimokratia Avenue", if "λεωφόρος" is translated instead of transcribed. Note that "Dimokratia" (nominative) becomes "Dimokratias" (with an "s", genitive) after "Leoforos", but remains nominative before "Avenue", hence the dilemma: do concepts like "Avenue" get transcribed or translated? If translated, how should the name be expressed, in the nominative or in the genitive case? Then again, translation makes the name language-specific, while transcription does not. So, it's transcription for int_name=* and maybe translation for name:en=*.

Resulting naming convention (examples: Οδός Σταδίου and Λεωφόρος Ελευθερίου Βενιζέλου, also widely known as Λεωφόρος Πανεπιστημίου):

  • name=* and (optionally) name:el=* name in Greek ("Οδός" is omitted), eg. Σταδίου, Λεωφόρος Ελευθερίου Βενιζέλου
  • int_name=* transcribed name (genitive case is preserved), eg. Stadiou, Leoforos Eleftheriou Venizelou
  • name:en=* (optional): transcribed name (nominative case with the suffix Street, Avenue etc.), Stadio Street, Eleftherios Venizelos Avenue. The first example sounds really wrong, but why use name:en=*, if it would be the same as int_name?
  • old_name=* (optional): old Greek Name, eg. blank, Λεωφόρος Πανεπιστημίου


Haiti

The country has two official languages: French and Haitian Creole. Street names are generally noted only in French, though.

  • name= Names in both languages (Haitian Creole first, followed by French)
  • name:ht= Name in Haitian Creole
  • name:fr= Name in French


Example:

  • name= Okay Les Cayes
  • name:ht= Okay
  • name:fr= Les Cayes

Hawaiʻi

The US state of Hawaiʻi has both English and Hawaiian (haw) as its official languages. Most place names retain the aboriginal Hawaiian names, with a few notable examples like Pearl Harbor. Most place names are now officially spelled with the ʻokina (glottal stop) and kahakō (macron) where appropriate. Street signs may or may not have the proper diacritics.

Hong Kong

Under "One Country, Two Systems", in Hong Kong, one of the special administrative regions of China, both Chinese and English are gazetted as the official languages. As a result, both Chinese and English names are used. The current bilingual (Chinese zh and English en) tagging conventions are as follows:

  • name= Names in both languages (Traditional Chinese first, followed by English)
  • name:zh= Name in Traditional Chinese
  • name:en= Name in English

Take Connaught Road Central (干諾道中) in Central for example.

  • name=干諾道中 Connaught Road Central
  • name:zh=干諾道中
  • name:en=Connaught Road Central

Iran

We use name:fa and full_name:fa for Persian street names. See Iran_tagging#Naming for more information.


Ireland (Republic)

Officially, all street signage must be bilingual, Irish and English, unless the name is of a place in a Gaeltacht (area where a majority speaks Irish as a first language), in which case signs must be Irish-language only.

Certain rare examples exist of placenames outside Gaeltachts where the English version has been successfully supplanted by the Irish version, at least when written. Such towns include Portlaois/Port Laoise and Dunleary/Dún Laoghaire (formerly Kingstown). Examples (past and current) do exist of places where the English name has endured despite signage that would suggest differently (Kells/Ceanannas, Bagenalstown/Muine Bheag, Charleville/Rath Luirc/An Rath). For these places, only local knowledge will allow you to tag "correctly".

The safest tagging practice is generally to use the English-language version, where given, as the primary name, and to include the Irish-language version under name:ga=*. Where only a single name is provided, this name will be in Irish, but is likely, in the absence of local knowledge to the contrary, to be the best primary name.

In Gaeltacht areas, it is probably best to use the Irish version of a name as primary name and tag the English version (where known) under name:en=*. This simple rule of thumb does break down where a Gaeltacht place is better known in English, or whose popularly used local Irish version is different to the officially recorded name, as in Dingle, a mostly English-speaking town in a Gaeltacht whose local Irish name is Daingean Uí Chúis but which is officially known as An Daingean.

Isle of Man

Many street signs in the Isle of Man have both an English version and a Manx version of the name. The Manx version should be added to the object using the name:gv=* tag, and the English version added with the name:en=*. Whichever version is used as the primary name (usually the English version) should be used in the main name=* tag as well.

Italy

Friuli-Venezia Giulia

In Friuli Venezia Giulia many places and geographical features (rivers, mountains...) have a toponym in Friulian. These names should be added as name:fur=*. In some villages the names of the roads are in both Italian and Friulian, so you should add them in the same way. In certain cases there are two versions of the toponym, one in standard Friulian and one in the local form; you should add the local form too using loc_name=*. Here you can find a document with a complete list of toponyms in standard Friulian. There's a rendering which shows Friulian names and you can find it here.

Some features have also a name in German and/or Slovenian; you should add them using name:de=* and name:sl=*

Alto Adige/Südtirol (South Tyrol)

Languages of South Tyrol. Majorities per municipality in 2001

German and Italian are both official languages of South Tyrol. In some eastern municipalities Ladin is the third official language (cf. Wikipedia). Every town has two (or three) official names, and even the street names are mostly bilingual.

The German and Italian (and Ladin) names of towns, streets, POIs, etc. are tagged separately with the respective name:* tags (name:it=* for Italian, name:de=* for German and name:lld=* for Ladin). There seems to be consensus on including both languages in the general name tag, separated by a dash, where the (locally) more commonly used name comes first (most of the time this will be the language that dominates in the municipality). Examples are "Bolzano - Bozen" (Italian first) and "Brixen - Bressanone" (German first).

Examples:

name=Bolzano - Bozen
name:de=Bozen
name:it=Bolzano

name=Brixen - Bressanone
name:de=Brixen
name:it=Bressanone

name=Urtijëi (or name=Urtijëi - St. Ulrich - Ortisei)
name:de=St. Ulrich
name:it=Ortisei
name:lld=Urtijëi

Sardegna (Sardinia)

Languages of Sardinia.

In Sardinia, Sardinian and Catalan toponyms are protected by the national law 482/1999 and all the names in the local languages are co-official with Italian. These names should be added as name:sc=* (for Sardinian) or name:ca=* (for Catalan), and the name tag must also contain both the local and the Italian name. You can find the local toponyms here. Other local languages with related place names are Corsican (Sassarese and Gallurese), tagged name:co=*, and Ligurian, tagged name:lij=*.

Many places and geographical features (rivers, mountains...) have a toponym in the local languages. In some villages the names of the roads are in both languages, so you should add them in the same way. In certain cases there are two versions of the toponym, one in the standard form and one in the local form; you should add the local form too using loc_name=*.

The correct tagging would be like:

name=Nùgoro/Nuoro
name:it=Nuoro
name:sc=Nùgoro

or

name=L'Alguer/Alghero
name:it=Alghero
name:ca=L'Alguer
name:co=L'Aliera
name:sc=S'Alighera

Japan

See Japan tagging#Names and JA:Naming_sample for more information.

Usage                                          Example
name=Japanese                                  name=首都高速道路
name:en=English or romanization of Japanese    name:en=Metropolitan Expressway
name:ja=Japanese                               name:ja=首都高速道路
name:ja_rm=romanization of Japanese            name:ja_rm=Shuto Kōsoku Dōro

(A standard language code for romanized Japanese would be name:ja-Latn=*.)
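If a consumer prefers the standard codes, the non-standard romanisation keys mentioned on this page can be mapped to them when reading data. The sketch below is an illustration of that mapping only and not a recommendation to mass-edit the database; the function name with_standard_keys is hypothetical.

```python
# Non-standard romanisation keys and the standard equivalents
# mentioned on this page.
STANDARD_KEYS = {
    "name:zh_pinyin": "name:zh-Latn-pinyin",
    "name:ja_rm": "name:ja-Latn",
}


def with_standard_keys(tags):
    """Return a copy of tags with romanisation keys renamed."""
    result = dict(tags)
    for old_key, new_key in STANDARD_KEYS.items():
        # Never overwrite a value already stored under the standard key.
        if old_key in result and new_key not in result:
            result[new_key] = result.pop(old_key)
    return result


expressway = {"name": "首都高速道路", "name:ja_rm": "Shuto Kōsoku Dōro"}
converted = with_standard_keys(expressway)
```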

Lebanon

Street signs in Lebanon are written in Arabic and in French. The tagging is:

name=طرابلس for the Arabic name
name:ar=طرابلس for the Arabic name
name:fr=Tripoli for the French version

Luxembourg

In Luxembourg, village names and their signs are bilingual, in Luxembourgish and French. For example, Bascharage in French is Nidderkäerjeng in Luxembourgish. Most street names are noted only in French, but some bilingual street signs have been installed.

name:lb=* can be used for Luxembourgish names when there is one.

Examples

for rue de Koerich / Kärcherwee, the correct tagging is:

name = rue de Koerich
name:lb = Kärcherwee

for rue Saint-Ulric / Tilleschgaass, the correct tagging is:

name = rue Saint-Ulric
name:lb = Tilleschgaass

Morocco

Street signs in Morocco are written in Arabic and in French. The tagging is:

name=الرباط for the Arabic name
name:fr=Rabat for the French version

Scotland

Some signs in the Highlands of Scotland have names in both English and Gaelic, usually only for place names (Mallaig, Fort William etc.), but sometimes they appear on street signs, such as this one in Oban.

Gaelic names can be tagged using name:gd=*, with name=* for the name in English. In places where Gaelic is the main language (eg the Western Isles), the Gaelic name can be tagged using name=*, with name:en=* for the English name. An example of this in the wild is Steòrnabhagh (Stornoway).

gd is the ISO 639-1 code for Scottish Gaelic, which isn't the same as other Gaelics (e.g. Irish), so this code is only for Scottish Gaelic (which is used in Canada as well).

An OSM rendering showing the Gaelic names is available at OSM Alba.

Some places (mainly islands) have Old Norse names as well, use name:non=* for these. (non is the ISO 639-2 code for Old Norse)

Serbia

The Serbian language uses both the Cyrillic (official) and Latin (popular) alphabets, so the naming scheme needs to support both. The current proposal is to use name:sr=* for the Cyrillic script, and name:sr-Latn=* for Latin. The locale tags have been chosen according to BCP 47.

The default name=* tag should be filled in using Serbian Cyrillic, unless the feature is in an area where the language and script of the sizable ethnic minority is in official use (e.g. Subotica or Preševo). In that case, name:sr=* must be added, as well as the tag specifying the language of the default tag (e.g. name:hu=*, name:sq=*, etc.).
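Because the two Serbian alphabets correspond letter for letter, a Latin value can be derived from a Cyrillic one. The following is a minimal sketch with hypothetical function names; it assumes the input is Serbian Cyrillic and capitalises digraphs as "Lj", "Nj" and "Dž", which is not right for text written entirely in capitals.

```python
SR_CYRILLIC_TO_LATIN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ",
    "е": "e", "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k",
    "л": "l", "љ": "lj", "м": "m", "н": "n", "њ": "nj", "о": "o",
    "п": "p", "р": "r", "с": "s", "т": "t", "ћ": "ć", "у": "u",
    "ф": "f", "х": "h", "ц": "c", "ч": "č", "џ": "dž", "ш": "š",
}


def sr_to_latin(text):
    """Convert Serbian Cyrillic text to the Serbian Latin alphabet."""
    out = []
    for ch in text:
        latin = SR_CYRILLIC_TO_LATIN.get(ch.lower())
        if latin is None:
            out.append(ch)  # digits, punctuation, non-Serbian letters
        elif ch.isupper():
            out.append(latin.capitalize())
        else:
            out.append(latin)
    return "".join(out)


def add_latin_name(tags):
    """Fill name:sr-Latn from name:sr when it is missing."""
    result = dict(tags)
    if "name:sr" in result and "name:sr-Latn" not in result:
        result["name:sr-Latn"] = sr_to_latin(result["name:sr"])
    return result


belgrade = add_latin_name({"name": "Београд", "name:sr": "Београд"})
```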

Spain

Depending on the region, you may find three additional languages: Català (ca), Galego (gl) and Euskera (eu).

Català is spoken in Catalunya, the Illes Balears and several places in the Comunitat Valenciana (the north-east and the Balearic Islands). The official place name is usually the Catalan one, but you will sometimes find names in Spanish. For example, the official name of one of the provinces is Lleida, but by tradition the place is also known, and marked on several maps, as Lérida.

The correct tagging would be:

name=Lleida
name:es=Lérida
name:ca=Lleida

The same applies to Galego, spoken in the north-west corner, and to Euskera, spoken in the Basque region and Navarra. There are a few exceptions, such as Donostia - San Sebastián and Vitoria - Gasteiz, where the official form depends on the local administration and the historic use of the names. In such a case the correct tagging would be:

name=Vitoria - Gasteiz
name:es=Vitoria
name:eu=Gasteiz

Please check the actual situation for each place name.

Switzerland

The ISO code for the Rhaeto-Romance languages, including Romansh, the fourth national language, is rm (ISO 639-1; ISO 639-2: roh). Currently the ISO 639-1 and 639-2 specifications do not differentiate between Romansh dialects, so rm should be used mainly for the more or less official "Rumantsch Grischun". The problem of missing ISO codes for the different Romansh dialects still seems unresolved and needs discussion.

For names in Swiss German dialects ('Schwiizertüütsch', 'Schwyzerdütsch'), add the following tag alongside the name=* tag: name:gsw=a Swiss German name. For explanations about the notation of Swiss German dialects see GISpunkt HSR Wiki.

Example: name:gsw=Züri as an addition to name=Zürich and name:en=Zurich. Code gsw most probably means "German SWiss". See [1] and [2].

See de:Switzerland/Map Features#Mehrsprachige Benennung (name).

Taiwan

Taiwan uses traditional Chinese characters. Names in Chinese should use these characters. Romanized names should be mentioned as the name:en=Street Name field.

Taiwan#Translation (中文地址英譯) mentions how to translate names.

Following the Taiwan Post Office guide, the following rules should be applied:

Name (Chinese)  Name (English)  Abbreviation (English)
路              Road            Rd.
西              West            W.
東              East            E.
段              Section         Sec.
街              Street          St.
巷              Lane            Ln.
弄              Alley           Aly.

Since "Aly." and "Ln." sound quite cryptic and don't save that many characters, the whole English name will be used for Lanes and Alleys.

A whole standard entry should include:

name=<Chinese>
name:zh=<Chinese>
name:en=<English>

Here is a real life example:

name=南京東路三段256巷28弄
name:zh=南京東路三段256巷28弄
name:en=Alley 28, Lane 256, Sec. 3, Nanjing E. Rd.


Different romanisations are in use in Taiwan. name:en=* should use the name given on the street sign. (Maybe other romanisations should be entered as well, although it is unclear how, as different forms are sometimes used, e.g. on maps or on business cards, and it should be possible to find streets based on such variants.)

Thailand

Thailand has names written in Thai script. Many street signs list English names as well.

We state the local (Thai) name in the name tag. For other languages the name-tag is extended by the language code. A special rendering allows selection of the language to display as well as bilingual naming: http://thaimap.osm-tools.org/

Example:

name=เชียงใหม่
name:en=Chiang Mai
name:th=เชียงใหม่

Tunisia

Street signs in Tunisia are written in Arabic and in French. The tagging is:

name=عربي for the Arabic name
name:ar=عربي for the Arabic name
name:fr=français for the French version

Wales

In Wales, for instance, many if not most, streets have a Welsh name and an English name.

Simply tagging 'name=Terrace Road' doesn't work, as this ignores the Welsh version. Using nat_name, as currently recommended in Map features isn't satisfactory, as there is ambiguity over what is the 'national' name (English and Welsh both have equivalent legal status).

A suggestion is to use 'name:language=streetname', where 'language' is the two-letter language code from ISO 639-1, e.g. name:en=Terrace Road and name:cy=Fford-y-Mor.

There is a Welsh-language render available at http://brasskipper.org.uk/cyosm/. It renders name:cy=* where it exists, and falls back to name=* where it is unavailable.

A good background email [3] from Simon Hewison provides context.

If you're going to use name:language=placename, then I'd suggest that within Wales, you use name:cy=placename.

cy is the two letter ISO639-1 language code for the Welsh language.

There are some places in Wales where nobody uses an English placename, or there is no English placename, eg. Ystrad Mynach.

There are also some places in Wales where there are no commonly used Welsh placenames, and Welsh speakers tend to use the English language place name, eg. Crosskeys.

There are also some places in Wales where the signposted Welsh language placenames are wrong, and were made up by well-meaning but non-Welsh speaking civil servants after 1971 when protesters had been defacing or destroying English language only signs, and road signs in Wales were supposed to include the national language of Wales. An example of this was Aberdaugleddau, known to English speakers as Milford Haven. There are some signs about that incorrectly signposted it as Milffordd, though last time I was there, they had been amended with stickers with the correct spelling. This example must have been a guess at transliterating place names, since Haverfordwest was Hwlffordd.

Finally, some places in Wales have English language names that are an attempted anglicisation of the Welsh placename, so that it's not a translation, but something easier for non-Welsh speakers to pronounce and spell. For instance, Caerphilly (English) should actually be spelt Caerffili (Welsh). In many cases, early English mapmakers attempted to spell Welsh words using phonetic English and English grammar. The result is that these forms of names appear all over Wales.

Much of this information comes from http://www.caerphilly.gov.uk/pdf/equalities/welsh-language-scheme-05-08.pdf

As such, I reckon that you should set the name tag to the Welsh name in areas where the Welsh language has a high concentration of native speakers, and use name:en for the English name in such situations.

Ditto for street names. Many street names are signposted in Welsh only, yet they do have official English translations, eg. Stryd-y-Capel becomes Chapel Street, but it's never signposted as Chapel Street, and most English speakers refer to it as Stryd-y-Capel in their address. In which case, set the name tag to be the Welsh language.

There really is no substitute for local knowledge when doing place names in mapping. Since Wales has very little decent Yahoo maps coverage, we're stuck with visiting with a GPS and old maps like New Popular Edition (which regularly mis-spelt place names in Wales). If as an English speaker you want to map Wales, go and visit the place. It's a marvellous place. Speak to the locals, they're mostly friendly, especially the members of Cymdeithas yr Iaith (Welsh Language Society) if you show an interest in getting Welsh language placenames on maps with the correct spelling.

Shared boundary features

Sometimes a boundary is a shared feature like a river with different names on each side of the border. One example is the Rhine river separating Germany and France.

Always add name:code=* for each involved language, and for compatibility with older rendering engines, also set name=* to both names, separated by a forward slash with spaces in between, sorted in (a somewhat neutral) Unicode alphabetical order.

For the Rhine river, this would be:
name=Le Rhin / Rhein
name:fr=Le Rhin
name:de=Rhein
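The rule above can be sketched in code as follows. This is an illustration only: the function name shared_boundary_name is hypothetical, the caller is assumed to know which language codes are involved, and "Unicode alphabetical order" is interpreted here as plain code point order, which is what Python's sorted() gives for strings.

```python
def shared_boundary_name(tags, codes):
    """Build name=* from the name:<code> values of the involved languages.

    Names are de-duplicated, sorted by Unicode code point and joined
    with a forward slash surrounded by spaces.
    """
    names = {tags[f"name:{code}"] for code in codes if tags.get(f"name:{code}")}
    return " / ".join(sorted(names))


rhine = {"name:fr": "Le Rhin", "name:de": "Rhein", "name:en": "Rhine"}
combined = shared_boundary_name(rhine, ["de", "fr"])
```

For the Rhine this yields the same value as the name=* tag shown above; name:en is ignored because English is not one of the involved languages.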

Note that, in the rare case when the feature borders multiple language regions, like the Danube river, joining all names within name=* can produce an inconveniently long string. Though not implemented anywhere yet, one could:

  • create a relation for the whole feature without name=*, only name:code=* tags
  • break down the feature for each pair of adjacent regions (usually countries)
  • for each pair of regions, create a relation for the feature's parts separating the two regions with name=* containing the two local names and no name:code=* tags

Finally, if only part of the feature is used as a boundary, set name=* of the non-boundary parts to the name used in the containing region, as in most other objects.