Talk:Multilingual names

From OpenStreetMap Wiki

Transliteration or/and translation?

I'm not sure whether transliteration is better than translation. Usage is very tricky. I have two of the rare recent paper city maps of Ulaanbaatar in hand: on both maps, "Ulaanbaatar" is written like this, i.e. according to the ISO 9 transliteration from Cyrillic, and this is the usual spelling inside Mongolia for foreigners. But the French embassy, for instance, still uses "Oulan-Bator", the traditional French transliteration (following the French pronunciation of "ou"), which is much better known among French people living in France than "Ulaanbaatar". Airlines often use "Ulan Bator", the traditional English transliteration, and Russians write "Улан-Батор", a Russian phonetic spelling that exists because Cyrillic was not yet used in Mongolia when the town was given its name.
This is not all. Mongolian has been written in 10 different scripts over its history, and 2 of them are still in everyday use: Cyrillic in (independent) Mongolia, and the Mongol-Uighur script in Inner Mongolia and, as a cultural matter (it is taught to all pupils in junior high schools), in Mongolia too. The transliteration of the Mongol-Uighur spelling would be "Ulaganbagator", which is never used. The translation is "Red-Hero", which is also not used.
This is still not all. On my maps, made for tourists (locals almost never use maps), the main street is written "Peace avenue": a translation. The 2 main boulevards are named "Baga toyruu" and "Ih toyruu" on one map, "Baga toiruu" and "Ikh toiruu" on the other, all non-ISO 9 transliterations (the meanings are "small boulevard" and "big boulevard"). And this is indeed the way foreigners usually talk: "Peace avenue", "Ih toiruu", etc. A secondary street is named "Zaluuchuud avenue" on both maps: the combination of a transliteration (without the grammatical case) and a translation. Both maps have a "Seoul street", where "Seoul" is written according to the usual English/French spelling of that city, not as a transliteration from the Mongolian spelling, which would be "Sôùl" (ISO 9). The 2 maps diverge on the name of the main bridge: "Enkhtaivany bridge" (non-ISO 9 transliteration + translation) on one, "Peace bridge" on the other.
The reason why the ISO 9 standard is not always followed is that it has not even been translated into Mongolian and that, for some letters, it leads to an English pronunciation too far from the Mongolian one, while an obvious non-standard transliteration is phonetically better. According to ISO 9, the transliteration of тойруу should be "tojruu", though nothing sounds like an English "j" in the Mongolian word. A problem with using ISO transliteration standards is that they are copyrighted (and expensive), so we cannot provide them freely on OpenStreetMap and cannot expect all contributors to buy them.
The first question is: "Is the usage of foreigners living in the place so important that it should be followed?" I'd say no. Should we follow the usage of local cartographers? I'm not sure it's always a good idea, and this usage sometimes differs from one cartographer to another. Should we follow Post Office usage, since the Universal Postal Union makes it compulsory to write addresses in Latin characters for international mail? The U.P.U. doesn't say whether this has to be a transliteration (and which one) or a translation, and local post offices might accept several solutions. I propose that the following always be provided:

  • the original name in the original language,
  • a translation into English (if the name is not English and has a meaning in the present local language),
  • a transliteration into Latin characters (if the original script is not Latin) according to systematic rules among those usually accepted locally. By "systematic", I don't mean universal. For instance, the transliteration rules for Cyrillic could be different for Ukrainian and for Mongolian.
  • the traditional English name if there is one.

The reason for translation is that it's culturally interesting and sometimes gives geographical information. For instance, the county of Erdenet city (a name meaning "Precious", because it's a mining town) is called "Bayan-Ôndôr", which means "Rich-height", while the central municipality is called "Uurhaičid", meaning "Miners", which is even clearer. My proposal implies that the number of local names could be multiplied by up to 3. If there are 2 different local names with different meanings in non-Latin scripts, this makes 6 fields. It should then be clear which field is the translation/transliteration of which one. So "name:en" is not sufficient at all in this case, because it doesn't say whether it's a transliteration, a translation or a traditional name, and doesn't say, for a translation or transliteration, what it is the translation or transliteration of. "name:en" should only be used when English is indeed (one of) the local language(s), or if there is a proper traditional English name. I'd suggest "translation:mn/en" and "transliteration:mn-Cyrl/Latn" (the hyphen cannot be used between 2 languages because it is already used in dialect codes, such as "es-AR" for Argentinian Spanish, and in script codes, such as "mn-Cyrl" for Mongolian Cyrillic). We also need a way to specify the usual local language(s) for each large zone. And we may also need a way to say, for each place, which of the translation, the transliteration and the traditional name is most used. For Ulaanbaatar, we could have:

  • name:mn-Cyrl=Улаанбаатар
  • translation:mn/en=Red-Hero
  • transliteration:mn-Cyrl/Latn=Ulaanbaatar
  • name:en=Ulan Bator

and optionally:

  • name:mn-Mong=(Sorry, I have no Mongolian-Uighur script keyboard to write this. See here)
  • transliteration:mn-Mong/Latn=Ulaganbagator
  • name:fr=Oulan-Bator
  • translation:mn/fr=Héros-Rouge
  • name:ru=Улан-Батор

etc. I prefer calling "Ulan Bator" and "Oulan-Bator" proper English and French names rather than (phonetic) transcriptions (which could be specified as "transcription:mn/en" and "transcription:mn/fr"), because they are not the result of a transcription system still in use nowadays.

For Lyons:

  • name:fr=Lyon
  • name:en=Lyons

"Lyon" has no meaning in the present French language, so it needs no translation.
-- 17:41, 10 March 2009 (UTC)
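
To make the proposed key scheme concrete, here is a minimal Python sketch that splits keys such as "translation:mn/en" into their parts. Only the key formats come from the proposal above; the function name parse_proposed_key and the returned dictionary layout are assumptions made purely for illustration, and none of this is an established OSM convention.

```python
def parse_proposed_key(key):
    """Split a key from the proposal above into kind, source and target.

    Hypothetical helper: the key formats come from the proposal,
    the function and field names are illustrative only.
    """
    kind, _, rest = key.partition(":")
    if kind not in ("name", "translation", "transliteration", "transcription"):
        raise ValueError(f"unknown key kind: {key!r}")
    if kind == "name":
        # name:mn-Cyrl -> a plain name in one language (and optional script)
        return {"kind": kind, "source": rest, "target": None}
    # translation:mn/en -> "/" separates source from target, because "-"
    # is already used inside codes such as mn-Cyrl or es-AR
    source, sep, target = rest.partition("/")
    if not sep or not source or not target:
        raise ValueError(f"expected source/target in {key!r}")
    return {"kind": kind, "source": source, "target": target}


# The Ulaanbaatar tags listed in the proposal
ulaanbaatar = {
    "name:mn-Cyrl": "Улаанбаатар",
    "translation:mn/en": "Red-Hero",
    "transliteration:mn-Cyrl/Latn": "Ulaanbaatar",
    "name:en": "Ulan Bator",
}

for key, value in ulaanbaatar.items():
    print(parse_proposed_key(key), value)
```

Because the source and target are explicit, a data consumer can tell which field is the translation or transliteration of which name, which is the point the proposal makes about "name:en" alone being ambiguous.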

Wales-centric (Greece mentioned)

This seems to be only about Wales. Could it be changed to be as generic as possible? Bruce89 21:02, 26 April 2007 (BST)

I agree. It would be nice to have the article extended a bit to include other examples. For Greece, for instance, the name= tag should use the Greek spelling, and so on. Drawing the line at the language most frequently used locally can be hard sometimes, I guess, though. Karlskoging1 21:41, 26 April 2007 (BST)
In Athens, I saw a "mess" of Greek and English street names in the name tag. This is a big problem, because you can't find street names when searching. So something should be done about it. I agree that the name tag should be in Greek, and name:en should be used for the English transliteration, which is on the street signs. --Willem1 19:22, 1 March 2009 (UTC)
I think a solution is to ask each renderer whether they can combine two tags in the displayed name of streets and other named objects. I've asked at mapnik-users today about it.

Missing ISO639-1 language codes

I just discovered that there are no ISO639-1 language codes for at least two of the minority languages used in Sweden. I suggest that in such cases we use the ISO639-2 language codes instead. Karlskoging1 22:05, 26 April 2007 (BST)

I am already using the ISO 639-2 language code for Old Norse (non) for some Scottish Islands. Bruce89 22:31, 26 April 2007 (BST)


I'm a bit concerned this doesn't preserve all needed information.

If we have, say,

  • name=[name in Welsh]
  • name:en=[name in English],

or indeed, if we have

  • name=[name in English]
  • name:cy=[name in Welsh]

then we lose the information as to which language the default name is in. If I'm rendering a map of the UK in English, I can easily pull out name:en before name, so this isn't a terrible problem. However, if you were specifying complex rules as to language preference order, this might be a problem, especially when rendering large areas where several different languages are likely to occupy the 'name' field: imagine that I wanted to show all places with Welsh names with that Welsh name, but to show English names in preference to Gaelic ones. On the Welsh nodes I'd want to pick out "name:cy", "name", then finally "name:en", but on Gaelic nodes I'd want to pick out "name:cy", "name:en", "name". If we never had a bare "name" field this wouldn't be a problem. Morwen 12:35, 27 April 2007 (BST)

This is a fair point, but ATM renderers ignore all name:code=* tags, hence name=* containing the default language. I'm not sure if it would be possible to tell a worldwide renderer to use the local languages if they weren't in the name=* tag. Bruce89 13:23, 27 April 2007 (BST)
Well, you could have a defaultlanguage=* tag, or declare that if you have a "name" you also need a "name:code". I can think of several other ways of doing this, with varying complexity. Morwen 14:09, 27 April 2007 (BST)

A way to keep all the information is badly needed.-- 17:54, 10 March 2009 (UTC)

I tend to use lang=* - probably from a HTML analogy. --tms13 17:15, 13 November 2009 (UTC)
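
The preference-order problem Morwen describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: "defaultlanguage" is the tag floated in this thread, not an established key, and the helper names pick_name and preference_for, as well as the two sample nodes, are made up for the example.

```python
def pick_name(tags, preferences):
    """Return the first non-empty value among the preferred keys, else None."""
    for key in preferences:
        value = tags.get(key)
        if value:
            return value
    return None


def preference_for(tags, default_language_key="defaultlanguage"):
    """Choose a key order depending on the language of the bare name tag.

    Assumes a hypothetical defaultlanguage=* tag as suggested above.
    """
    if tags.get(default_language_key) == "gd":
        # Gaelic default name: prefer Welsh, then English, then the bare name
        return ["name:cy", "name:en", "name"]
    # Otherwise: Welsh, then the bare name, then English
    return ["name:cy", "name", "name:en"]


# Illustrative nodes, not real OSM data
welsh_node = {"name": "Caerdydd", "name:en": "Cardiff", "defaultlanguage": "cy"}
gaelic_node = {"name": "Steòrnabhagh", "name:en": "Stornoway", "defaultlanguage": "gd"}

print(pick_name(welsh_node, preference_for(welsh_node)))
print(pick_name(gaelic_node, preference_for(gaelic_node)))
```

Without some tag recording the language of the bare name value, preference_for has nothing to branch on, which is exactly the information loss raised in this thread.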

Street name

Surely this should be moved to Bilingual names, and the text changed accordingly. This doesn't only apply to street names. Bruce89 16:31, 8 May 2007 (BST)

Rendering names where there are two

Has there been any progress about rendering multilingual street and place names? In Brussels all name tags are now getting something like "Bruxelles - Brussel", since it's no option to show just one of them.

If I can make a suggestion, I'd like to see something like "name=Bruxelles;Brussel", where the renderer chooses an appropriate method for displaying both (such as each on a separate line for place names, or separated by a dash in a long street). We can have multiple values for each key separated by a semicolon, so why not make use of that? The other option is to have no name key at all and a new tag like "display_languages=fr;nl", but I like the former more. --Eimai 21:37, 3 January 2008 (UTC)

That would be really useful indeed. --Moyogo 16:29, 4 January 2008 (UTC)
Perhaps it would be best that the name tag is the default local language. If someone wants to render a map in English they can start with name:en=* tags and fall back to name=*; equally, if someone wants to produce a hybrid English/French map they can combine the name:fr and name:en tags into the rendered name... Changing the values purely for rendering such as this is dirty imho... dkt 11:00, 9 December 2008 (UTC)
Well the name tag should hold the default local language name yes. But I think the question was about situations where there *is* no default local language. The streets have two different equally valid names.
So looking in Brussels at the moment this tram stop for example has been set up with the correct values in 'name:fr' and 'name:nl' but the value in the main 'name' tag should (some might argue) use a Semi-colon value separator to indicate that it has two different names.
Anyone developing a renderer would then have to decide what to do about that (e.g. swap in a hyphen instead). I believe technically it would be easy to do that for Mapnik renderers e.g. it could be added as a new feature of osm2pgsql. Other rendering systems and mobile apps everywhere could make a similar change, and then we could slowly transition to a more technically correct set up of all the name tags in places like this
...and the end result would be no difference (at best, although anything not transitioned would show an ugly ';' in there) All in all, quite a lot of faffing around just to satisfy some tagging pedantry. I can see why this hasn't happened yet! :-)
-- Harry Wood 12:59, 18 March 2012 (UTC)
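
What a renderer might do with a semicolon-separated name can be sketched as follows. This is a minimal illustration in Python, not how Mapnik or osm2pgsql behave; the function name display_name and its joiner parameter are assumptions for the example.

```python
def display_name(name, joiner=" - "):
    """Turn a semicolon-separated name value into a display string.

    Hypothetical renderer-side helper: empty parts are dropped and
    surrounding whitespace is trimmed before joining.
    """
    parts = [part.strip() for part in name.split(";")]
    return joiner.join(part for part in parts if part)


# A long street label on one line, a place label on two lines
print(display_name("Bruxelles;Brussel"))
print(display_name("Bruxelles;Brussel", joiner="\n"))
```

A value without a semicolon passes through unchanged, so such a step would not affect names that have only one form.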


Is there really a point in making a difference between accented and non-accented names? One major highway in Brazil is named Rodovia Governador Mário Covas, how should this be tagged in other languages? Is name:en Rodovia Governador Mário Covas, Rodovia Governador Mario Covas, Governor Mario Covas Highway or Governador Mário Covas Highway?


Is "ga" the correct ISO code for Galego / Galician? "ethnologue" suggests "gl" or "glg", with "ga" being Gaelic / Irish. (SomeoneElse 20:28, 27 January 2009 (UTC))

Breizh - Brittany - Bretagne

The system indicated for Wales does not work in France for bilingual streets, in French and in the Breton language (Brezhoneg [1]) (br). Many towns and villages in the west of Brittany have bilingual street names.

If you do this: name:fr=[name in French] and name:br=[name in Breton], then we lose each name!

Today, the only way is to put the bilingual form into the same field: name=[name in French] - [name in Breton], which is not satisfactory. Do you have any other way to suggest? Thanks! --Rimael 19:47, 2 August 2009 (UTC)
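
One possible direction, sketched here in Python, is for the data consumer to build the bilingual label from the per-language tags instead of storing it in name. This assumes a renderer that reads name:fr and name:br at all, which was not the case when this question was asked; the function name bilingual_label is hypothetical.

```python
def bilingual_label(tags, languages, joiner=" - "):
    """Build a label from name:<lang> tags, falling back to the bare name.

    Hypothetical helper: languages are tried in the given order and
    identical values are only shown once.
    """
    values = []
    for lang in languages:
        value = tags.get(f"name:{lang}")
        if value and value not in values:
            values.append(value)
    if values:
        return joiner.join(values)
    return tags.get("name")


# Placeholder values, as used in the question above
street = {"name:fr": "[name in French]", "name:br": "[name in Breton]"}
print(bilingual_label(street, ["fr", "br"]))
```

With this approach the two names stay separately searchable, and the order and separator become a rendering decision rather than part of the data.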

For China

(Discussion moved from the main page Ouleyang 08:21, 11 August 2010 (BST))

The way that names have been tagged in Hangzhou and Shanghai so far is as follows...
name:zh_py=<Chinese pinyin (toneless)>
name:zh_pyt=<Chinese pinyin (with tones)>

An example of the given methodology follows...
name:en=Chaoyang Park South Road
name:zh_py=Chaoyanggongyuan Nanlu
name:zh_pyt=Cháoyánggōngyuán Nánlù

This gives us the ability to render maps useful to as many users as possible, default rendering would use Chinese, which is good from the point of view that most of the population in China reads Chinese, but it would be easy to render maps with English and pinyin with tones as well for particular uses... For someone who reads Chinese, they'd only need the Chinese, for someone who doesn't, the English could make the map more understandable, if they want to try to communicate a name to someone in Chinese, having pinyin would be very helpful, if they want to be understood, having pinyin with tones would be very important...

Dtucny 01:00, 15 October 2007 (BST)

zh_py and zh_pyt are not defined in any standard, so we should not use them. Furthermore, zh_py could be generated automatically from zh_pyt. I propose zh, zh-Hans and zh-Hant. For places in mainland China, zh=zh-Hans, otherwise zh=zh-Hant. (comment added by Python eggs)

Thanks for the comment Python eggs... As far as I'm aware, no standard language code exists for Pinyin, with tones or without... There has been discussion about creating language codes for it (in August 2008), but I'm not aware that this has been put into a standard as yet, and the discussion didn't even venture into the use of tones; when that happens, we can batch convert the tags if needed... Converting from pinyin including tones to pinyin without tones is possible, as you have said, though I think there is benefit to having both as options at data entry. It is definitely more difficult to add pinyin including tones than without, and I would expect that not everyone would want to, but where it is missing, someone else could easily see that and add it, whereas with a single pinyin field that could include tones or not, that would be less easy to spot.

Your point about the use of zh-Hans and zh-Hant has definite merit though, and I have used them before. The ideal combination of names that could be captured would be Simplified Chinese, Traditional Chinese, Pinyin, Cantonese Pinyin and English, as there is no way to automatically convert between any of these without some pretty large lookup tables, and even then it wouldn't likely get everything correct...

So, for now, I stand by the proposal above, but, do admit there is room for improvement especially regarding handling of Simplified and Traditional Chinese... dkt 12:08, 21 April 2009 (UTC)
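
The point that toneless pinyin can be generated from pinyin with tones can be illustrated with a short Python sketch using only the standard library. The function name strip_tones is made up for this example, and the choice to keep the diaeresis on "ü" is an assumption, since conventions for toneless pinyin differ on that.

```python
import unicodedata

# Combining marks used for the four pinyin tones:
# macron (1st), acute (2nd), caron (3rd), grave (4th).
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def strip_tones(pinyin):
    """Remove pinyin tone marks while keeping other diacritics such as ü."""
    # Decompose so each tone mark becomes a separate combining character
    decomposed = unicodedata.normalize("NFD", pinyin)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    # Recompose what is left (e.g. u + diaeresis back into ü)
    return unicodedata.normalize("NFC", kept)


print(strip_tones("Cháoyánggōngyuán Nánlù"))
```

The reverse direction is not possible without a dictionary, which supports the argument for entering the toned form where it is known.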

There is also the issue of parts of China, such as the autonomous regions of Xinjiang and Tibet, where the first official language is not written in Chinese script, e.g. Uyghur or Tibetan. Here the name tag should probably have at least the Simplified Chinese and Uyghur names. We may also need the Latin-script version of these languages, such as Tibetan Pinyin (tb_py), although there are 5 versions of this for Uyghur, and a Cyrillic version too, since Uyghur areas previously governed by Russia use this. Of course, since Uyghur is written right to left, it will appear written first if included on the right-hand side of a bi/trilingual name tag. There are also often a range of older 'international names' for places in these areas. jamesks 31 July 2009 (UTC)

Tibetan : name:bo
Uyghur : name:ug
Manchu : name:mnc
Mongolian : name:mn
Russian : name:ru

name:zh=<Simplified Chinese>
int_name=<latin script names>
name:ru=<Russian>
name:zh_py=<Chinese pinyin (toneless)>
name:ug_??=<latin Uyghur>

Rename page or re-organise a little

We've had a section for a long time (although the page title moved recently): Names#Localization, which documents how to use 'name' with a language code. By comparison this page carries only a small description followed by lots of country-by-country info. I think we should either

  • Make this page the primary documentation, meaning Names#Localization details would move to here, and that page would just have a short paragraph and a link to here.
  • Rename this page to something like 'Name tag use by country' ...not a great name, but you see what I mean.

I think the first option makes most sense.

-- Harry Wood 13:17, 18 March 2012 (UTC)

Standard language codes

Some of our language tagging does not follow the standards. Maybe some usage has to be grandfathered in, but I wonder if we can change to following the standards. New language tagging should follow the standards and renderers should be written to understand standard language-code format.

Examples of correct language codes. The names of romanizations are registered as IANA variant subtags – we shouldn’t just make up more codes unless we use an -x- private-use code.

  • bg – Bulgarian
  • bg-Latn – Bulgarian in Latin characters
  • zh – Chinese
  • zh-Latn – Chinese in Latin characters
  • zh-Latn-pinyin – Chinese in pinyin romanization (not zh_pinyin, zh_py, nor zh_pyt)
  • zh-Latn-wadegile – Chinese in Wade–Giles romanization
  • zh-Hans – Simplified Chinese
  • zh-Hant – Traditional Chinese
  • ja – Japanese
  • ja-Latn – Japanese in Latin characters (not ja_rm)
  • ja-Latn-hepburn – Japanese in Hepburn romanization
  • ja-Latn-alalc97 – Japanese in Library of Congress romanization
  • ja-Latn-x-osm – Japanese in Latin characters, according to some private OSM scheme
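
As a rough illustration of how existing keys could be moved to the standard codes listed above, here is a small Python sketch. The mapping contains only the non-standard suffixes named in this list; the names LEGACY_SUFFIXES and standardise_key are assumptions for the example. Note that zh_py and zh_pyt both map to the same code here, so the toneless/toned distinction would be lost, and a real migration would have to decide how to handle that.

```python
# Non-standard suffixes mentioned above, mapped to the suggested codes
LEGACY_SUFFIXES = {
    "zh_pinyin": "zh-Latn-pinyin",
    "zh_py": "zh-Latn-pinyin",
    "zh_pyt": "zh-Latn-pinyin",
    "ja_rm": "ja-Latn",
}


def standardise_key(key):
    """Rewrite e.g. name:zh_py to name:zh-Latn-pinyin; leave other keys alone."""
    prefix, sep, suffix = key.partition(":")
    if sep and suffix in LEGACY_SUFFIXES:
        return f"{prefix}:{LEGACY_SUFFIXES[suffix]}"
    return key


for key in ("name:zh_py", "name:ja_rm", "name:en", "name"):
    print(key, "->", standardise_key(key))
```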

A few more romanization methods are registered in the Unicode CLDR,[2] and can be used with the t singleton and m0 separator. In this case, an ISO date can be added indicating a version of a standard.

  • mn-Latn – Mongolian in Latin characters
  • en-t-mn – English translated from Mongolian
  • mn-Latn-t-mn-Cyrl – Mongolian transliterated from Cyrillic into Latin
  • und-Latn-t-und-Cyrl – Text transliterated from Cyrillic to Latin (und = undetermined language)
  • ja-Latn-t-ja-Jpan-m0-alaloc – Japanese in Library of Congress romanization (equivalent to the shorter version above)
  • ja-Latn-t-ja-Jpan-m0-alaloc-1949 – Japanese in Library of Congress romanization, 1949 version
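
The structure of these tags (target, then the t singleton, then the source, then fields such as m0) can be shown with a deliberately simplified Python sketch. It is not a full BCP 47 parser: it ignores other extensions and private-use sequences and does no validation against the registry. The function name parse_t_extension and its return shape are assumptions for the example.

```python
import re

# Extension T field keys are one letter followed by one digit, e.g. "m0"
FIELD_KEY = re.compile(r"^[a-z][0-9]$")


def parse_t_extension(tag):
    """Split a tag into (target, source, fields) around the "t" singleton.

    Simplified sketch: returns (tag, None, {}) when there is no "t" subtag.
    """
    subtags = tag.split("-")
    lowered = [s.lower() for s in subtags]
    if "t" not in lowered:
        return tag, None, {}
    t_index = lowered.index("t")
    target = "-".join(subtags[:t_index])
    source_parts = []
    fields = {}
    current = None
    for subtag, low in zip(subtags[t_index + 1:], lowered[t_index + 1:]):
        if FIELD_KEY.match(low):
            # A new field such as m0 starts; following subtags are its values
            current = low
            fields[current] = []
        elif current is None:
            source_parts.append(subtag)
        else:
            fields[current].append(subtag)
    source = "-".join(source_parts) or None
    return target, source, fields


for example in ("en-t-mn", "mn-Latn-t-mn-Cyrl", "ja-Latn-t-ja-Jpan-m0-alaloc-1949"):
    print(parse_t_extension(example))
```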

CLDR v24 transforms:

  • alaloc – American Library Association-Library of Congress
  • bgn – US Board on Geographic Names
  • buckwalt – Buckwalter Arabic transliteration system
  • din – Deutsches Institut für Normung
  • gost – Euro-Asian Council for Standardization, Metrology and Certification
  • iso – International Organization for Standardization
  • mcst – Korean Ministry of Culture, Sports and Tourism
  • satts – Standard Arabic Technical Transliteration System (SATTS)
  • ungegn – United Nations Group of Experts on Geographical Names

Language and script codes are governed by BCP 47: Tags for Identifying Languages and the IANA Language Subtag Registry. Also relevant might be BCP 47 Extension T: Transformed Content and BCP 47 Extension U (Unicode).

Codes should be kept as short as possible. Michael Z. 2013-11-06 07:28 z

Sardegna Edit War

There is an edit war on what the section about multilingual naming in Sardegna (Italy) should look like.

The subject is currently also discussed on talk-it.

I reverted the paragraph to a state that is compatible with the pre edit-war content, which apparently is what the community on talk-it decided about: “Santo cielo, ancora? Abbiamo discusso per settimane di questa cosa, basta!” (quoting Luca Meloni)

If you think the paragraph needs updating, please do not continue with the edit-war, but discuss here, at talk-it, or wherever, until you reach some kind of consensus with the community. Only after that, the paragraph should be updated.

--Tyr (talk) 19:12, 9 January 2014 (UTC)

This edit was added citing a nonexistent and unsourced "co-officiality" for six (!!!) different languages. This formulation is a one-off invented here, instead of using the rules already established for Friulian and all other local/regional languages. In addition, this formulation puts a local name in front of the official, main and commonly used name, both locally and internationally (e.g. Nuoro, comune.Cagliari and Cagliariturismo, comune.Alghero and Algheroturismo). In the community there's still the same debate with no general consensus. --Drinz (talk) 15:49, 12 January 2014 (UTC)
  1. The local languages are official in Sardinia thanks to Italian and regional law.
  2. This formulation is a standard proposed for all the other regions.
  3. The local names are put before the Italian ones because the majority of the population have those local languages as their native ones.
  4. In the community there is a general consensus on this standard. Consensus is different from unanimity; the latter is simply impossible to obtain. The discussion was closed months ago; it is only because of you that it is now open again.
  5. Don't open a meaningless discussion here. The discussion place is the mailing list.--L2212 (talk) 17:19, 12 January 2014 (UTC)
+1 for the local mailing list as the "best" place to discuss this.
But also: please note that the legal fine points of certain laws are not really the most important thing in OSM. Instead, we have always mapped what is on the ground – here that would be the term(s) that local people use and what local signposts read.
-- Tyr (talk) 21:22, 12 January 2014 (UTC)

Tamazight language ISO code not supported

Hello, I noticed that the official language of Morocco is not supported when naming places in this country.

Name:ar and Name:fr work fine but Name:zgh should show ⵜⴰⵎⴰⵣⵉⵖⵜ

Here is the SIL report on the language code:

The Ethnologue report:

Gagnabil (talk) 02:13, 29 April 2015‎ (UTC)

Not sure what you mean by "not supported". The tags name:zgh and name:ber work fine. Tags are always written in the Roman/UK alphabet; the content of the tag can be written in Tamazight. So, for one example, Ben Slimane Airport is written ⴰⵣⴰⴳⵯⵣ ⵏ ⴱⵏ ⵙⵍⵉⵎⴰⵏ in Berber/Tamazight and tagged correctly. See
Also, please sign your comments with four tildes.
Johnparis (talk) 12:01, 7 August 2016 (UTC)

Edit warring

Could the people who have been editing this page today stop changing it please and explain why their preferred version is better?--Andrew (talk) 20:36, 3 August 2016 (UTC)

  • The user User:Fayor is editing a topic that's still being discussed in the community (because he reopened the issue) without the approval of the community, and without telling anyone about it. --L2212 (talk) 20:51, 3 August 2016 (UTC) suggests that there are two sides to the edit war in the wiki. That edit war was also happening in OSM itself and the participants have both been blocked by the DWG temporarily to provide a "cool down" period. --SomeoneElse (talk) 21:18, 3 August 2016 (UTC)
The edit war is the restart of an old discussion, the one described in the "Sardegna Edit War" section above. The difference is that this time the user Fayor is in the role that was Drinz's.--L2212 (talk) 01:05, 4 August 2016 (UTC)


Seems there has been no discussion or project on naming places in Iraq so far, so I'd like to start. For the time being, there's some write-up in WikiProject_Iraq#Place_names_.28proposal_for_handling_multiple_names_and_multiple_languages.29 Øukasz (talk) 18:21, 1 October 2016 (UTC).

Hello, I left a message on the project's talk page--Ghybu (talk) 14:37, 4 October 2016 (UTC)