Talk:Multilingual names

From OpenStreetMap Wiki

Transliteration and/or translation?

I'm not sure whether transliteration is better than translation. Usage is very tricky. I have in hand two of the rare recent paper city maps of Ulaanbaatar: on both maps, "Ulaanbaatar" is written like this, i.e. according to the ISO 9 transliteration from Cyrillic, and this is the usual spelling used inside Mongolia for foreigners. But the French embassy, for instance, still uses "Oulan-Bator", following the traditional French transliteration system (based on the French pronunciation of "ou"), which is much better known among French people living in France than "Ulaanbaatar". Airlines often use "Ulan Bator", the traditional English transliteration, and Russians write "Улан-Батор", a Russian phonetic spelling that arose because Cyrillic was not yet used in Mongolia when the town was given its name.
That is not all. Mongolian has been written in 10 different scripts over its history, and 2 of them are still in everyday use: Cyrillic in (independent) Mongolia, and the Mongol-Uighur script in Inner Mongolia and, as a cultural matter (it is taught to all pupils in junior high schools), in Mongolia too. The transliteration of the Mongol-Uighur spelling would be "Ulaganbagator", which is never used. The translation is "Red-Hero", which is also not used.
That is still not all. On my maps, made for tourists (locals almost never use maps), the main street is written "Peace avenue": a translation. The 2 main boulevards are named "Baga toyruu" and "Ih toyruu" on one map, "Baga toiruu" and "Ikh toiruu" on the other, all non-ISO 9 transliterations (the meanings are "small boulevard" and "big boulevard"). And this is indeed the way foreigners usually talk: "Peace avenue", "Ih toiruu", etc. A secondary street is named "Zaluuchuud avenue" on both maps: a combination of a transliteration (without the grammatical case) and a translation. Both maps have a "Seoul street", where "Seoul" is written according to the usual English/French spelling of that town, not as a transliteration from the Mongolian spelling, which would be "Sôùl" (ISO 9). The 2 maps diverge on the name of the main bridge: "Enkhtaivany bridge" (non-ISO 9 transliteration + translation) on one, "Peace bridge" on the other.
The reason why the ISO 9 standard is not always followed is that it has not even been translated into Mongolian and that, for some letters, it leads to an English pronunciation too far from the Mongolian one, while an obvious non-standard transliteration is phonetically better. According to ISO 9, the transliteration of тойруу should be "tojruu", though nothing in the Mongolian word sounds like an English "j". A further problem with ISO transliteration standards is that they are copyrighted (and expensive), so we cannot provide them freely on OpenStreetMap, and we cannot expect all contributors to buy them.
The first question is: "Is the usage of foreigners living in the place so important that it should be followed?". I'd say no. Should we follow the usage of local cartographers? I'm not sure it's always a good idea, and these usages sometimes differ from one cartographer to another. Should we follow Post Office usage, since the Universal Postal Union makes it compulsory to write addresses in Latin characters for international mail? The U.P.U. doesn't say whether it has to be a transliteration (and which one) or a translation, and local post offices might accept several solutions. I propose that the following always be provided:

The reason for translation is that it's culturally interesting and sometimes gives geographical information. For instance, the county of Erdenet city (a name meaning "Precious", because it's a mining town) is called "Bayan-Ôndôr", which means "Rich-height", while the central municipality is called "Uurhaičid", meaning "Miners", which is even clearer. My proposal implies that the number of local names could be multiplied by up to 3. If there are 2 different local names with different meanings in non-Latin scripts, this makes 6 fields. It should then be clear which field is the translation/transliteration of which one. So "name:en" is not sufficient at all in this case, because it doesn't say whether it's a transliteration, a translation or a traditional name, and doesn't say, for a translation or transliteration, what it is the translation or transliteration of. "name:en" should only be used when English is indeed (one of) the local language(s), or if there is a proper traditional English name. I'd suggest "translation:mn/en" and "transliteration:mn-Cyrl/Latn" (the hyphen cannot be used between 2 languages because it is already used in dialect codes, such as "es-AR" for Argentinian Spanish, and in script codes, such as "mn-Cyrl" for Mongolian Cyrillic). We also need a way to specify the usual local language(s) for each large zone. And we may also need a way to say, for each place, which of the translation, the transliteration and the traditional name is most used. For Ulaanbaatar, we could have:

and optionally:

etc. I prefer calling "Ulan Bator" and "Oulan-Bator" proper English and French names rather than (phonetic) transcriptions (which could be specified as "transcription:mn/en" and "transcription:mn/fr"), because they are not the result of a transcription system still in use nowadays.

For Lyons:

"Lyon" has no meaning in present-day French, so it needs no translation.
--http://solages.site.voila.fr/index_en.html 17:41, 10 March 2009 (UTC)

Wales-centric (Greece mentioned)

This seems to be only about Wales. Could it be changed to be as generic as possible? Bruce89 21:02, 26 April 2007 (BST)

I agree. It would be nice to have the article extended a bit to include other examples. For Greece, for instance, the name= tag should use the Greek spelling, and so on. Drawing the line at the language most frequently used locally can sometimes be hard, though, I guess. Karlskoging1 21:41, 26 April 2007 (BST)
In Athens, I saw a "mess" of Greek and English street names in the name tag. This is a big problem, because you can't find street names when searching. So something should be done about it. I agree that the name tag should be in Greek, and that name:en should be used for the English transliteration, which is on the street signs. --Willem1 19:22, 1 March 2009 (UTC)
I think one solution is to ask the developers of each renderer whether they can combine two tags in the displayed name of streets and other objects. I asked on mapnik-users about it today.

Missing ISO 639-1 language codes

I just discovered that there are no ISO 639-1 language codes for at least two of the minority languages used in Sweden. I suggest that in such cases we use the ISO 639-2 language codes instead. Karlskoging1 22:05, 26 April 2007 (BST)

I am already using the ISO 639-2 language code for Old Norse (non) for some Scottish Islands. Bruce89 22:31, 26 April 2007 (BST)

Information

I'm a bit concerned this doesn't preserve all needed information.

If we have, say,

or indeed, if we have

then we lose the information as to which language the default name is in. If I'm rendering a map of the UK in English, I can easily pull out name:en before name, so this isn't a terrible problem. However, if you were specifying complex rules for language preference order, this might be a problem, especially when rendering large areas where several different languages are likely to occupy the 'name' tag: imagine that I wanted to show all places with Welsh names with that Welsh name, but to show English names in preference to Gaelic ones. On the Welsh nodes I'd want to pick out "name:cy", "name", then finally "name:en", but on Gaelic nodes I'd want to pick out "name:cy", "name:en", "name". If we never had a bare "name" field this wouldn't be a problem. Morwen 12:35, 27 April 2007 (BST)

This is a fair point, but ATM renderers ignore all name:code=* tags, hence name=* containing the default language. I'm not sure if it would be possible to tell a worldwide renderer to use the local languages if they weren't in the name=* tag. Bruce89 13:23, 27 April 2007 (BST)
Well, you could have a defaultlanguage=* tag, or declare that if you have a "name" you also need a "name:code". I can think of several other ways of doing this, with varying complexity. Morwen 14:09, 27 April 2007 (BST)

A way to keep all the information is badly needed.--http://solages.site.voila.fr/index_en.html 17:54, 10 March 2009 (UTC)

I tend to use lang=* - probably by analogy with HTML. --tms13 17:15, 13 November 2009 (UTC)
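
The preference-order problem described above can be sketched in a few lines of Python. This is only an illustration, not how any existing renderer works: the function name pick_name is made up, and it assumes the hypothetical defaultlanguage=* tag suggested earlier in this thread (a lang=* tag would work the same way). The bare name=* value is treated as being in whatever language that tag declares.

```python
def pick_name(tags, preferred, default_lang_key="defaultlanguage"):
    """Return the best name for an ordered list of preferred language codes.

    `tags` is a plain dict of OSM-style tags. The key named by
    `default_lang_key` is an assumption (not an established tag): it states
    which language the bare name=* value is written in.
    """
    default_lang = tags.get(default_lang_key)
    for code in preferred:
        # An explicit name:<code> tag always wins for that language.
        value = tags.get(f"name:{code}")
        if value:
            return value
        # Otherwise the bare name counts, if it is declared to be in this language.
        if code == default_lang and tags.get("name"):
            return tags["name"]
    # Nothing matched the preferences: fall back to the bare name, if any.
    return tags.get("name")


welsh_node = {"name": "Caerdydd", "defaultlanguage": "cy", "name:en": "Cardiff"}
gaelic_node = {"name": "Steòrnabhagh", "defaultlanguage": "gd", "name:en": "Stornoway"}

print(pick_name(welsh_node, ["cy", "en", "gd"]))
print(pick_name(gaelic_node, ["cy", "en", "gd"]))
```

With the preference order cy, en, gd, the Welsh node yields its Welsh name and the Gaelic node yields its English name, which is the behaviour asked for above. Without a declared default language, the bare name can only be used as a last resort.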

Street name

Surely this should be moved to Bilingual names, and the text changed accordingly. This doesn't only apply to street names. Bruce89 16:31, 8 May 2007 (BST)

Rendering names where there are two

Has there been any progress on rendering multilingual street and place names? In Brussels all name tags are now getting something like "Bruxelles - Brussel", since showing just one of them is not an option.

If I may make a suggestion, I'd like to see something like "name=Bruxelles;Brussel", where the renderer chooses an appropriate method for displaying both (such as each on a separate line for place names, or separated by a dash along a long street). We can have multiple values for each key separated by a semicolon, so why not make use of that? The other option is to have no name key at all and a new tag like "display_languages=fr;nl", but I prefer the former. --Eimai 21:37, 3 January 2008 (UTC)

That would be really useful indeed. --Moyogo 16:29, 4 January 2008 (UTC)
Perhaps it would be best that the name tag is the default local language, if someone wants to render a map in English they can start with name:en=* tags and fallback to name=*, equally, if someone wants to produce a hybrid English/French map they can combine the name:fr and name:en tags into the rendered name... Changing the values purely for rendering such as this is dirty imho... dkt 11:00, 9 December 2008 (UTC)
Well the name tag should hold the default local language name yes. But I think the question was about situations where there *is* no default local language. The streets have two different equally valid names.
So looking in Brussels at the moment this tram stop for example has been set up with the correct values in 'name:fr' and 'name:nl' but the value in the main 'name' tag should (some might argue) use a Semi-colon value separator to indicate that it has two different names.
Anyone developing a renderer would then have to decide what to do about that (e.g. swap in a hyphen instead). I believe technically it would be easy to do that for Mapnik renderers e.g. it could be added as a new feature of osm2pgsql. Other rendering systems and mobile apps everywhere could make a similar change, and then we could slowly transition to a more technically correct set up of all the name tags in places like this
...and the end result would be no difference (at best, although anything not transitioned would show an ugly ';' in there) All in all, quite a lot of faffing around just to satisfy some tagging pedantry. I can see why this hasn't happened yet! :-)
-- Harry Wood 12:59, 18 March 2012 (UTC)
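
For what it's worth, the renderer-side change discussed above is small. Here is a minimal Python sketch, assuming a name value that uses the semicolon separator; the function name display_name and the default " - " separator are illustrative choices, not part of osm2pgsql or Mapnik.

```python
def display_name(name_value, separator=" - "):
    """Turn a semicolon-separated name value into a string for display.

    Empty pieces and surrounding whitespace are dropped, so "A; B" and
    "A;B;" both give the same result as "A;B".
    """
    parts = [part.strip() for part in name_value.split(";")]
    parts = [part for part in parts if part]
    return separator.join(parts)


# A street label, with both names on one line.
print(display_name("Bruxelles;Brussel"))
# A place label, with each name on its own line.
print(display_name("Bruxelles;Brussel", separator="\n"))
```

A name without a semicolon passes through unchanged, so data that has not been transitioned would render exactly as before.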

Accented/non-accented?

Is there really a point in making a difference between accented and non-accented names? One major highway in Brazil is named Rodovia Governador Mário Covas, how should this be tagged in other languages? Is name:en Rodovia Governador Mário Covas, Rodovia Governador Mario Covas, Governor Mario Covas Highway or Governador Mário Covas Highway?

Spain

Is "ga" the correct ISO code for Galego / Galician? "ethnologue" suggests "gl" or "glg", with "ga" being Gaelic / Irish. (SomeoneElse 20:28, 27 January 2009 (UTC))

Breizh - Brittany - Bretagne

The system indicated for Wales does not work in France for bilingual streets, in French and in Breton (Brezhoneg, code br). Many towns and villages in the west of Brittany have bilingual street names.

If you do this: name:fr=[name in French] and name:br=[name in Breton], then we lose both names!

Today, the only way is to put the bilingual form into a single field: name=[name in French] - [name in Breton], which is not satisfactory. Do you have any other way to suggest? Thanks! --Rimael 19:47, 2 August 2009 (UTC)

For China

(Discussion moved from the main page Ouleyang 08:21, 11 August 2010 (BST))

The way that names have been tagged in Hangzhou and Shanghai so far is as follows...
name=<Chinese>
name:zh=<Chinese>
name:en=<English>
name:zh_py=<Chinese pinyin (toneless)>
name:zh_pyt=<Chinese pinyin (with tones)>

An example of the given methodology follows...
name=朝阳公园南路
name:zh=朝阳公园南路
name:en=Chaoyang Park South Road
name:zh_py=Chaoyanggongyuan Nanlu
name:zh_pyt=Cháoyánggōngyuán Nánlù

This gives us the ability to render maps useful to as many users as possible, default rendering would use Chinese, which is good from the point of view that most of the population in China reads Chinese, but it would be easy to render maps with English and pinyin with tones as well for particular uses... For someone who reads Chinese, they'd only need the Chinese, for someone who doesn't, the English could make the map more understandable, if they want to try to communicate a name to someone in Chinese, having pinyin would be very helpful, if they want to be understood, having pinyin with tones would be very important...

Dtucny 01:00, 15 October 2007 (BST)

zh_py and zh_pyt are not defined in any standard, so we should not use them. Furthermore, zh_py could be generated automatically from zh_pyt. I propose zh, zh-Hans and zh-Hant. For places in mainland China, zh=zh-Hans, otherwise zh=zh-Hant. (comment added by Python eggs)

Thanks for the comment, Python eggs... As far as I'm aware, no standard language code exists for Pinyin, with tones or without... There has been discussion about creating language codes for it (in August 2008), but I'm not aware that this has been put into a standard as yet, and the discussion didn't even venture into the use of tones. When that happens, we can batch convert the tags if needed... Converting from pinyin with tones to pinyin without tones is possible, as you have said, though I think there is benefit in having both as options at data entry. It is definitely more difficult to add pinyin with tones than without, and I would expect that not everyone would want to, but where it is missing, someone else could easily see that and add it, whereas with a single pinyin field that may or may not include tones, that would be less easy to spot.

Your point about the use of zh-Hans and zh-Hant has definite merit though, and I have used them before. The ideal combination of names to capture would be Simplified Chinese, Traditional Chinese, Pinyin, Cantonese Pinyin and English, as there is no way to convert automatically between any of these without some pretty large lookup tables, and even then it wouldn't likely get everything correct...

So, for now, I stand by the proposal above, but, do admit there is room for improvement especially regarding handling of Simplified and Traditional Chinese... dkt 12:08, 21 April 2009 (UTC)
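
The point that the toneless form can be derived from the toned form can be illustrated with a short Python sketch using only the standard unicodedata module. The function name strip_tones is made up for this example, and it assumes that tones are written with the usual four diacritics (macron, acute, caron, grave); the diaeresis of ü is deliberately kept.

```python
import unicodedata

# Combining marks used for the four pinyin tones:
# macron (1st), acute (2nd), caron (3rd), grave (4th).
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def strip_tones(pinyin):
    """Return pinyin with tone marks removed, keeping other diacritics such as ü."""
    # Decompose so that each tone mark becomes a separate combining character.
    decomposed = unicodedata.normalize("NFD", pinyin)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    # Recompose so that e.g. u + diaeresis becomes a single ü again.
    return unicodedata.normalize("NFC", kept)


print(strip_tones("Cháoyánggōngyuán Nánlù"))  # Chaoyanggongyuan Nanlu
```

Going the other way (adding tones to toneless pinyin) is not possible without a dictionary, which is the argument above for letting mappers enter the toned form separately.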

There is also the issue of parts of China, such as the autonomous regions of Xinjiang and Tibet, where the first official language (e.g. Uyghur or Tibetan) is not written in Chinese script. Here the name tag should probably have at least the Simplified Chinese and Uyghur names. We may also need Latin-script versions of these languages, such as Tibetan Pinyin (tb_py), although there are 5 versions of this for Uyghur, and a Cyrillic version too, since Uyghur areas formerly governed by Russia use it. Of course, since Uyghur is written right to left, it will appear to be written first if included on the right-hand side of a bi- or trilingual name tag. There are also often a range of older 'international names' for places in these areas. jamesks 31 July 2009 (UTC)

Tibetan : name:bo
Uyghur : name:ug
Manchu : name:mnc
Mongolian : name:mn
Russian : name:ru

name=<Chinese>
name:zh=<Simplified Chinese>
int_name=<latin script names>
name:en=<English>
name:ru=<Russian>
name:zh_py=<Chinese pinyin (toneless)>
name:ug=<Uyghur>
name:ug_??=<latin Uyghur>


Rename page or re-organise a little

We've had a section for a long time (although the page title moved recently): Names#Localization, which documents how to use 'name' with a language code. By comparison, this page carries only a short description followed by lots of country-by-country info. I think we should either

I think the first option makes most sense.

-- Harry Wood 13:17, 18 March 2012 (UTC)
