Talk:Proposed features/Language information for name

From OpenStreetMap Wiki

tag name

Generally I really appreciate this proposal, we have been missing a way to tell which is the default language used in the name tag. Minor nitpicks: why do you suggest an abbreviation (lang) rather than a word like name_language=* or name:language=*? This concept could also be extended to other name tag variations like loc_name_language=* or official_name_language=* (here the colon approach would make it clearer what this is about, e.g. loc_name:language=*). --Dieterdreist (talk) 12:34, 5 May 2017 (UTC)

Thanks for your feedback! The proposal now uses “language” instead of “lang”. About the colon approach and extending it to other keys like loc_name: I was also thinking about that. When writing the proposal, I was a little reluctant about the colon approach, because usually what you find after the colon is the language code itself within the key, and then a localized name in the value. On the other hand, there already exist non-language-code suffixes (:right and :left for border features), so even now a data consumer cannot rely on the assumption that everything after “name:” is a language code. I’ve changed the proposal to use the colon. Let’s see what the others think about it… --Sommerluk (talk) 08:28, 6 May 2017 (UTC)
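The point above, that a data consumer cannot treat every suffix after “name:” as a language code, can be illustrated with a minimal Python sketch. The function name `localized_names`, the suffix list and the regular expression are assumptions made for illustration only; the thread itself names just :left, :right and the proposed :language suffix, and the pattern is a rough approximation of a language code, not a full validator.

```python
import re

# Hypothetical list of suffixes that are not language codes. The thread
# mentions :left and :right; :language is the suffix proposed here.
NON_LANGUAGE_SUFFIXES = {"left", "right", "language"}

# Rough approximation of a language code such as "fr", "nds" or "zh-Hant".
# This is an assumption for illustration, not a complete validator.
LANGUAGE_CODE = re.compile(r"^[a-z]{2,3}(-[A-Za-z0-9]+)*$")


def localized_names(tags):
    """Return {language_code: name} for keys shaped like name:<code>."""
    result = {}
    for key, value in tags.items():
        if not key.startswith("name:"):
            continue
        suffix = key[len("name:"):]
        if suffix in NON_LANGUAGE_SUFFIXES:
            continue  # e.g. name:left, name:right, name:language
        if LANGUAGE_CODE.match(suffix):
            result[suffix] = value
    return result


example_tags = {
    "name": "Rue Haute - Hoogstraat",
    "name:fr": "Rue Haute",
    "name:nl": "Hoogstraat",
    "name:left": "A",
    "name:language": "fr;nl",
}
names_by_language = localized_names(example_tags)
```

With an explicit exclusion list, a new suffix such as :language does not get mistaken for a localized name, but the list has to be maintained by every consumer.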

That proposition needs default languages to be defined first

You say "This tag is not always necessary" which, as often in OSM, is not really precise.
That means that if there is no name=* tag, the region's default language applies.
Alas, OSM seems to hate tagging default values inside its database.
Rather they are scattered in many other places and if one changes, all applications have to be updated instead of a transparent OSM map update.
See the nice Proposed features/Defaults that they removed (even though it is used).
So, your proposal puts the cart before the horse: default languages should be defined first.
If you want to do that, call on my advice on how to do it.
Here in Belgium, we have defined the linguistic regions relations.
But, owing to that OSM shortcoming, they are of course not used as defaults. --Papou (talk) 22:47, 5 May 2017 (UTC)

Thanks for the feedback. After skimming the proposal you mentioned, I did not find anything about languages. It seems to be rather about road speed limits. Anyway, it tries to provide defaults for already existing keys, for example maxspeed: following that proposal, if the real-world maxspeed differs from the default in this region, the mapper can of course add a maxspeed=* tag to the object. As there is not yet a key that describes the language of the name=* key, I cannot see how the proposal for defaults could help here. Also, the proposal for defaults has been abandoned since 2010… --Sommerluk (talk) 08:28, 6 May 2017 (UTC)
This proposal doesn't need to define a default language, it simply introduces a tag to say in which language(s) the name tag is given. --Dieterdreist (talk) 08:54, 6 May 2017 (UTC)

I don't see how you can say that def:conditions;new tag = default_value is limited to maxspeed.
It's about anything and if it doesn't speak of languages, it's because it can't speak of everything.
You still don't say how to determine the language of a name without a :language tag. Papou (talk) 13:00, 26 May 2017 (UTC)

multilingual areas

Can we have

name:language=fr - nl

for places where name is like this:

name=Rue Haute - Hoogstraat

--Polyglot (talk) 19:57, 23 May 2017 (UTC)

Thanks for your feedback. The proposal for multilanguage names is name:language=fr;nl, basically because the “;” character is already known from other keys as a value separator. In Belgium, multilanguage names are usually like “a - b”, but there are other regions of the world where the tagging convention is different: “a/b” or “a / b”. Nebulon42 has made a good overview. With name:language=* I want to propose a tag with a clean and strict syntax that can easily (and unambiguously!) be processed. That’s the reason why the proposal uses the “;”. --Sommerluk (talk) 20:39, 23 May 2017 (UTC)
My preference would be to use the same separator in both the name field and the name:language field. That will be a lot easier to parse automatically. If it had made sense to use ; between both languages, then we would have used that, but in something that's going to be rendered on maps, as is, it would have looked extremely ugly.
The hyphen "-" is unsuitable as a separator (IMHO) because it also occurs in regular names (that are not multilingual), e.g. Dessau-Roßlau. Better use something that doesn't occur, I would suggest the slash "/" (the traditional OSM-multivalue-approach would be the semicolon ";", but it is ugly in rendered maps). --Dieterdreist (talk) 11:17, 24 May 2017 (UTC)
The separator is not "-", it is " - ", which is quite suitable. I'm a big proponent of using ";" as a separator for machine readable tags, but since no preprocessing will happen on name tags to convert it into something readable to humans, before rendering, it's better to create human readable strings for situations like in Brussels. --Polyglot (talk) 09:29, 27 May 2017 (UTC)
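To make the separator discussion above concrete, here is a minimal Python sketch of how a data consumer might read name:language=fr;nl and try to pair its codes with the parts of a multilingual name=* value. The function names and the list of candidate separators (" - ", " / ", "/") are assumptions for illustration, taken from the conventions mentioned in this thread; nothing here is defined by the proposal itself.

```python
def parse_name_language(value):
    """Split a name:language value such as "fr;nl" into language codes."""
    return [code.strip() for code in value.split(";") if code.strip()]


def split_multilingual_name(name, languages, separators=(" - ", " / ", "/")):
    """Pair the parts of a name with language codes.

    Returns a dict {language_code: part}, or None when no candidate
    separator yields exactly one part per language.
    """
    if len(languages) == 1:
        # A single language: never split, so "Dessau-Roßlau" stays whole.
        return {languages[0]: name}
    for separator in separators:
        parts = name.split(separator)
        if len(parts) == len(languages):
            return dict(zip(languages, (part.strip() for part in parts)))
    return None


codes = parse_name_language("fr;nl")
brussels = split_multilingual_name("Rue Haute - Hoogstraat", codes)
dessau = split_multilingual_name("Dessau-Roßlau", parse_name_language("de"))
```

The sketch only splits the name when name:language lists more than one code, which is one way the strict “;” syntax in the language tag can coexist with a human-readable separator in the name itself.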

Use geographic boundaries

This is the question you will almost certainly get, so I will ask: Why aren't you simply using geographic boundaries to determine the glyph in the name tag? Everything in Japan is rendered with the Japanese glyphs, everything in China with the Chinese and everything in Korea in Korean Hanja. Where we have name:ja or name:ko, we already have the language information and can use those glyphs.

Partial answer: Yes, but there are special cases where it doesn't work. For example in Korea, the name:zh tag is sometimes used for Korean Hanja, which can be different from Chinese. There is currently no way to determine whether the tag name:zh is to be rendered in actual Chinese or Korean Hanja. However: I'm not sure how this proposal will solve this issue, because it only seems to be aimed at the name= tag, not its language-specific name:xy= tags. --Panoramedia (talk) 14:21, 24 May 2017 (UTC)

In addition to what you have already mentioned, we can add this argument: The assumption that all name=* in China are Simplified Chinese is not true. In China there is more than one language; there are quite a few regional languages. The geographic boundary might be a not-so-bad approximation when looking at CJK characters only, but it will still be only an approximation and it would only solve the CJK glyph issue. Having name:language=* works potentially for all languages, also when language boundaries and country boundaries are different. The name:language=* information can be used to tell an OpenType rendering engine to use the specific rendering rules for a particular language. Today we have OpenType fonts that are developed with internationalization in mind and that provide various glyph variants that can be chosen based on the language (also for various African languages that are written with the Latin alphabet but sometimes use different glyph variants). The corresponding OpenType feature is called “locl”. --Sommerluk (talk) 15:12, 24 May 2017 (UTC)
Thanks for the clarification. Just to make it super clear, please add info on how name:language will solve your examples and how they will be tagged: Node with name:en=Beijing name=北京市 name:ja=北京市 name:zh=北京市. Additionally, maybe a node in Korea with Korean Hanja and Chinese (it doesn't exist because there is no way to tag it, since both are tagged name:zh= at the moment and this proposal doesn't address that). --Panoramedia (talk) 15:46, 24 May 2017 (UTC)
Node with name:en=Beijing name=北京市 name:ja=北京市 name:zh=北京市. The default map style only renders name=北京市 and ignores the other tags like name:en=Beijing. (This means it uses local names by default to show that OSM is an international project.) However, the map style cannot know what language name=北京市 is. Currently, it defaults worldwide to Japanese (which is sort of arbitrary) and uses Japanese glyphs if various glyph variants exist. When name:language=zh is available, the map style could instead adapt its rendering and use Chinese instead of Japanese glyphs. The other example: name=* and name:language=ko will work without problems, both if name=* is Hangul and if name=* is Hanja. OpenType smart fonts need language information, not script information (the script information is already contained in the string that will get rendered). It is up to the font to deal correctly with this information. The default style uses the font “Noto”, which does indeed deal correctly with this information. So it will work in both cases. --Sommerluk (talk) 19:34, 24 May 2017 (UTC)
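The rendering decision described above can be sketched in a few lines of Python. This is not how the default map style is implemented; the function name `rendering_language` is hypothetical, and the fallback of "ja" only mirrors the statement above that the style currently defaults to Japanese. The handling of values with several languages (falling back to the default) is likewise an assumption, since the thread does not say how a renderer should treat them.

```python
def rendering_language(tags, fallback="ja"):
    """Pick the language code to hand to the text renderer for name=*.

    Uses name:language when it names exactly one language; otherwise
    returns the fallback (assumed here to be "ja", as described above).
    """
    value = tags.get("name:language", "")
    codes = [code.strip() for code in value.split(";") if code.strip()]
    if len(codes) == 1:
        return codes[0]
    return fallback


beijing = {"name": "北京市", "name:en": "Beijing", "name:language": "zh"}
untagged = {"name": "北京市", "name:en": "Beijing"}

language_with_tag = rendering_language(beijing)
language_without_tag = rendering_language(untagged)
```

The returned code would then be passed to the font shaping step so that a font with language-dependent glyph variants can select the right ones.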