Proposal:Language information for name

From OpenStreetMap Wiki
Revision as of 16:31, 30 April 2023 by TigerfellBot (talk | contribs) (TigerfellBot moved page Proposed features/Language information for name to Proposal:Language information for name: running task 'Proposal namespace': moving proposals into the proposal namespace)
Language information for name
Proposal status: Rejected (inactive)
Proposed by: sommerluk
Tagging: name:language=code
Applies to: For usage together with non-localized name keys
Definition: Describes the language of the values of the name keys (using the same codes as Multilingual names).
Rendered as: Not rendered itself, but useful for better rendering of name=*
Draft started: 2017-04-30
RFC start: 2017-05-05
Vote start: 2017-05-23
Vote end: 2017-06-06

Rationale

There are many applications that use the name=* tag in OSM. The name=* tag is usually used when the name is intentionally wanted in the default (local) language. (Example: OsmAnd lets you choose between “local names” and a specific language. The default style at openstreetmap.org uses name=* exclusively because it always wants to show the local names. In these cases, localized tags like name:en, name:ja, name:de … are intentionally not used.)

The content of name=* is plain Unicode. The problem is that this is not enough to render the text correctly. Some glyphs (character shapes) differ between the four variants of the CJK script (Japanese, Traditional Chinese, Simplified Chinese, Korean), but Unicode encodes them at the same code point. Likewise, some Cyrillic glyphs have four variants (Russian, Bulgarian, Serbian, Macedonian) that are encoded at the same Unicode code point. On the web, this problem is easily solved: the HTML code contains a language tag that provides the necessary information about the language, so the browser can display everything correctly. In OSM this information is missing.

Deducing this information from the country in which the OSM element is located is not very reliable. Within the same country there may be (many) more languages than just one. And within the same region there may be objects whose name is in a language other than the majority language of that region. This approach is error-prone, so it is not an option.

Deducing this information by comparing with the other name:en, name:ja, name:de … tags does not help either. Example: The node http://www.openstreetmap.org/node/25248662 (English: Beijing) has name=北京市, name:ja=北京市 and name:zh=北京市. They are identical, so we cannot reliably determine the language of the name value. It would also not work for dual-language names like “Bruxelles - Brussel”, where none of the name:??=* tags has an identical value.
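The limits of this comparison can be illustrated with a minimal Python sketch. The function name languages_matching_name is hypothetical, and the name:en, name:fr and name:nl values in the sample tag sets are assumptions made for illustration; only the Beijing name, name:ja and name:zh values come from the example above.

```python
def languages_matching_name(tags):
    """Return the codes of all name:<code> tags whose value equals name.

    Naive heuristic: it treats every name:<suffix> key as a language,
    which is itself an assumption.
    """
    name = tags.get("name")
    if name is None:
        return []
    matches = []
    for key, value in tags.items():
        prefix, sep, code = key.partition(":")
        if prefix == "name" and sep and code and value == name:
            matches.append(code)
    return sorted(matches)


# Sample tag sets (partly assumed, see the note above).
beijing = {"name": "北京市", "name:ja": "北京市", "name:zh": "北京市", "name:en": "Beijing"}
brussels = {"name": "Bruxelles - Brussel", "name:fr": "Bruxelles", "name:nl": "Brussel"}

print(languages_matching_name(beijing))   # ['ja', 'zh'] -> ambiguous
print(languages_matching_name(brussels))  # [] -> no match at all
```

In the first case the heuristic returns two candidates, in the second none, so in neither case does it yield a usable language for name=*.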

Usage

Use the suffix “:language” together with (non-localized) name keys.

It describes the language of the values of the name keys (using the same codes that are already used as key suffixes in Multilingual names).

Example: name=London and name:language=en

For double names in the name=* tag (like “Bruxelles - Brussel”), use a semicolon-separated list: name:language=fr;nl

This tag is not always necessary. But in regions where a different language/script combination makes a difference in rendering, this tag is useful.
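A data consumer could read the proposed tag roughly as follows. This is a minimal Python sketch, not part of the proposal; the function name name_languages and the choice to return an empty list when the tag is missing are assumptions.

```python
def name_languages(tags):
    """Return the language codes listed in name:language, or [] if the tag is absent."""
    raw = tags.get("name:language", "")
    # The proposal uses a semicolon-separated list for double names.
    return [code.strip() for code in raw.split(";") if code.strip()]


print(name_languages({"name": "London", "name:language": "en"}))                  # ['en']
print(name_languages({"name": "Bruxelles - Brussel", "name:language": "fr;nl"}))  # ['fr', 'nl']
print(name_languages({"name": "Paris"}))                                          # []
```

Because the tag is optional, a consumer still needs its own fallback for the empty case.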

Representation

Not rendered itself, but it can be used to make correct language-specific rendering of name=* possible. It can also be used by text-to-speech engines and for more specialized processing of multilingual name=* values.
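As one possible illustration of language-specific rendering, the Python sketch below wraps a name in an HTML element with a lang attribute, in the same way the Rationale describes for web pages. The function name html_label is hypothetical. The sketch assumes that the OSM language code can be used directly as an HTML language tag, and it omits the attribute for multi-language values because the proposal does not define how the listed codes map to the parts of the name.

```python
from html import escape


def html_label(tags):
    """Return an HTML span for name=*, with a lang attribute if exactly one language is given."""
    name = tags.get("name")
    if name is None:
        return None
    codes = [c.strip() for c in tags.get("name:language", "").split(";") if c.strip()]
    if len(codes) == 1:
        # Assumption: the OSM code is also a valid HTML language tag.
        return '<span lang="{}">{}</span>'.format(escape(codes[0]), escape(name))
    # No code, or several codes: leave the language unspecified.
    return "<span>{}</span>".format(escape(name))


print(html_label({"name": "北京市", "name:language": "zh"}))
# <span lang="zh">北京市</span>
print(html_label({"name": "Bruxelles - Brussel", "name:language": "fr;nl"}))
# <span>Bruxelles - Brussel</span>
```

With the lang attribute set, a browser can pick the glyph variant that fits the language.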

Voting

Voting on this proposal has been closed.

It was rejected with 15 votes for, 8 votes against and 3 abstentions.

15/23 = 65%, less than the required 74% approval
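
The approval figure can be reproduced with a short calculation. This Python sketch takes the vote counts and the 74% threshold from the result above; leaving abstentions out of the denominator follows the 15/23 figure.

```python
votes_for, votes_against, abstentions = 15, 8, 3
required = 0.74  # approval threshold stated above

# Abstentions are not counted, as in the 15/23 figure.
approval = votes_for / (votes_for + votes_against)

print(f"{approval:.1%}")     # 65.2%
print(approval >= required)  # False
```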

  • I approve this proposal. --Panoramedia (talk) 20:31, 23 May 2017 (UTC)
  • I approve this proposal. --Waldhans (talk) 20:34, 23 May 2017 (UTC)
  • I approve this proposal. --Sommerluk (talk) 20:41, 23 May 2017 (UTC)
  • I approve this proposal. I had a similar idea for multilingual names, but Sommerluk's proposal is a superset hence better suited. Name rendering of osm-carto could IMO really benefit from this addition. --Nebulon42 (talk) 20:55, 23 May 2017 (UTC)
  • I approve this proposal. --Mfuji (talk) 04:16, 24 May 2017 (UTC)
  • I approve this proposal. --Hufgardm (talk) 05:47, 24 May 2017 (UTC)
  • I approve this proposal. I also support extending this to name variant tags if deemed helpful, e.g. loc_name:language=<language-code> --Dieterdreist (talk) 11:38, 24 May 2017 (UTC)
  • I approve this proposal. Also in support of Dieterdreist's suggestion. --Artoria2e5 (talk) 14:29, 24 May 2017 (UTC)


  • I have comments but abstain from voting on this proposal. I only map in the United States and this is not an issue for me, so I don't really know enough about it to make an informed decision. --Dr Centerline (talk)
  • I have comments but abstain from voting on this proposal. Unfortunately this proposal doesn't solve all naming issues, because for the `name` tag you have to select one language (or in practice at most two, joined with a dash), which is a real problem for multilingual countries and for the names of continents, oceans, rivers, etc. Information about the source language is very useful, but the source language is not always the same as the language of the `name` tag now. My somewhat radical vision is to use language-specific name tags plus information about the source language, because end users know only a few languages (mostly just one) and see the text on a signboard. --Tbicr (talk) 17:32, 24 May 2017 (UTC)
  • I approve this proposal. This will help rendering, the solution for multiple languages will help here in Ethiopia where we use several different scripts and languages, language-code=am;en.
  • I approve this proposal. This definitely helps interpret the language of the `name` as well as define what the locale language is. We could potentially use this tag on areas like boundaries to create shapes of areas that define the popular language of features within. --Planemad/Talk 08:49, 25 May 2017 (UTC)
  • I approve this proposal. --Johan Jönsson (talk) 10:24, 25 May 2017 (UTC)
  • I oppose this proposal. Either we have to use :language for every name or there is a default mechanism for it. But OSM has just abandoned the proposal for defaults. Please restore that proposal to define a default :language mechanism first. Papou (talk) 03:16, 26 May 2017 (UTC)
As explained at https://wiki.openstreetmap.org/wiki/Talk:Proposed_features/Language_information_for_name#That_proposition_needs_default_languages_to_be_defined_first the proposal you mentioned defines default values for already existing keys (like maxspeed) – so it cannot express any information for which you do not have a normal key yet. And the proposal for default values is from 2010 and seems to be abandoned. It neither conflicts with this current proposal nor is there any overlap. --Sommerluk (talk) 08:30, 26 May 2017 (UTC)
I don't see how you can say that def:conditions;new tag = default_value is limited to maxspeed.
It's about anything and if it doesn't speak of languages, it's because it can't speak of everything.
You still don't say how to determine the language of a name without a :language tag. Papou (talk) 12:58, 26 May 2017 (UTC)
  • I approve this proposal. I'd prefer to use the same separator as used in the corresponding tags, but I guess it's indeed possible to use the name:isolang to get the name's languages. I thought this would be a lot easier to parse by scripts. In general I do agree with the proposal though and it's long overdue. Polyglot (talk) 09:43, 27 May 2017 (UTC)
  • I oppose this proposal. The proposal violates the current expectation that name:* tags contain a valid name of the object. So it adds additional processing effort for those who don't care about the tag. The proposed tag also has no advantage at all over simply adding the name:<language code>=<name in that language>. It just becomes harder to parse for data consumers because it adds an additional indirection step. There is also no gain for mappers because they have to add the same number of tags. The proposed multivalue is even worse because there is no indication how the multiple values map to the content of the name tag. --Lonvia (talk) 08:37, 2 June 2017 (UTC)
The advantage is that, for example, openstreetmap-carto, which by policy uses only local names from name=*, could now know what this default language is and do correct CJK rendering (which is currently not possible). I agree that “name:language” is not an elegant key, but on the other hand you already have name:left=* and name:right=* right now. --Sommerluk (talk) 11:48, 2 June 2017 (UTC)
  • I oppose this proposal. First, the proposal is just bad in its form. Every tag of the form "name:something" means "something" is a language, except when it is actually "language", then it means something totally different. That's difficult to understand and difficult to use. Second, we do have a way to set the language already, why a second one? Just tag both "name=SOMENAME" and "name:whateverlanguage=SOMENAME". Does work already and doesn't add a second mechanism that people would have to implement. Even worse is the idea that multiple name versions in the name tag ("Bruxelles - Brussel") can somehow be fixed by this using several languages in the "name:language" tag. This is unimplementable, there are different versions how different names appear in those name tags and there is no way we can put all this in a nice user interface for the mapper or tell the user of the data how to simply use this information. Finally: Is there a realistic chance that this new tag will be used often enough that anybody implementing this wouldn't have to implement the fallback when the tag isn't available, too? The proposal makes an admittedly bad situation more complicated. We need to fix this, but this is not the way. -- Joto (talk) 09:02, 2 June 2017 (UTC)
We already have name:left=*, name:right=*, name:prefix=*, and name:etymology=*. —seav (talk) 12:19, 3 June 2017 (UTC)
  • I oppose this proposal. As per Joto. --Ethylisocyanat (talk) 10:51, 2 June 2017 (UTC)
  • I approve this proposal. --Michi (talk) 18:48, 2 June 2017 (UTC)
  • I oppose this proposal. The whole point is not to have to parse the name tag. How is it going to be split? Bruselle,Brusel? Brusselle - Brussel? Brusselle;Brussel? Bruselle Brussel? This violates the K.I.S.S. and undermines the already established name:* key which is much easier to parse than this garbage. --James2432 (talk) 10:09, 3 June 2017 (UTC)
  • I oppose this proposal. I can see the benefits/use of having such a mechanism for recording the language of tags. But to avoid confusion with name:<langcode> and for consistency with source:<tagname>=* I think it would be better as language:name=<langcode>. Rjw62 (talk) 10:59, 3 June 2017 (UTC)
  • I approve this proposal. —seav (talk) 12:21, 3 June 2017 (UTC)
  • I oppose this proposal. This would add an unwanted layer of complexity to parse the name attribute, for the dual language names simply adding `name:lang1=Name` & `name:lang2=Name` is plenty, no need to add separators in the main name attribute. --DenisCarriere (talk) 17:12, 3 June 2017 (UTC)
I fail to see how this oppose reason has anything to do with the proposal. Many years even before this proposal was created, we already have name=Bruxelles - Brussel. The proposal merely allows one to interpret what language is in the name=* tag. And you don't have to parse this tag if you don't want to. —seav (talk) 17:43, 3 June 2017 (UTC)
If this was a good proposal, the name:language=* value would be similar to the name=*. I mean, to parse name=Bruxelles - Brussels, you would need something like name:language=fr - nl, not name:language=fr,nl, as in other countries they would have written in such a case name=Bruxelles / Brussels.
/ and - are both valid parts of a name, that's why fr,nl is not a proper solution. BTW, the rule is country/county wide as far as I know: in Belgium you will find 'fr' or 'nl' or 'fr - nl' or 'nl -fr', some with 'de' too. If you have '* - *' you can safely check for 'name:nl' and 'name:fr' to determine if it's 'nl - fr' or 'fr - nl'. Why tag each object when there are only 4 possibilities, country wide? It's sufficient to give the rules at the right admin level. Here name:schema=% - % or better name:separator=- and name:main_languages=de,fr,nl: we say the form of the combination and the possible languages (those possible on the name=* tag, not the name:xx=* names of course). Because this (IMHO bad) proposal has been accepted, we need to add something like name:separator=- to make it somehow usable. Thanks to the logic given at a higher level (district, country etc...) the tools can help for coherent namings (here name=fr - nl or nl - de or fr etc...). --Nospam2015
  • I approve this proposal. Pizzaiolo (talk) 17:37, 3 June 2017 (UTC)
  • I oppose this proposal. The problem in question does not seem to be a true problem to me, because the deduction by country is not as unreliable as described. For the countries with more than one language, these are usually not the ones with different glyphs for the same code. For example, simplified Chinese and Tibetan coexist in mainland China but they are not a case of different glyphs. Japanese, Korean and traditional Chinese are rarely, if ever, used in mainland China names. The Brussel example does not involve different glyphs. For the Cyrillic countries, I've checked that they do not use the other three controversial languages massively in their own country. So, it is not worthwhile for the limited special cases to mandate that the language of the majority of the names be specified by another new tag. If this proposal is finally approved, I still recommend it should only be used for the special occasions, not for the majority of the names whose language can be deduced correctly. --Rc1028 (talk) 07:02, 5 June 2017 (UTC)
  • I have comments but abstain from voting on this proposal. Can't we solve this problem with the Wikidata tag wikidata=*? I know that for the moment, there is not much exploitation of this tag, but one must start one day. --Gendy54 (talk) 15:49, 5 June 2017 (UTC)
No, not every thing is notable enough to be wikidata-ed. For example, the simplified character 门 (door) has different glyphs in zh/ja, and doors/gates are... ubiquitous. --Artoria2e5 (talk) 00:19, 22 March 2019 (UTC)