Proposed features/Language information for name

From OpenStreetMap Wiki

Voting

Instructions for voting
  • Log in to the wiki if you are not already logged in.
  • Scroll down to voting and click 'Edit source'. Copy and paste the appropriate code from this table on its own line at the bottom of the text area:
I approve this proposal: {{vote|yes}} --~~~~
I oppose this proposal: {{vote|no}} reason --~~~~ (replace reason with your reason(s) for voting no)
I abstain from voting but have comments: {{vote|abstain}} comments --~~~~ (use this if you don't want to vote but have comments; replace comments with your comments)

Note: The ~~~~ automatically inserts your name and the current date.


Language information for name
Status: Proposed (under way)
Proposed by: sommerluk
Tagging: name:language=code
Applies to: For usage together with non-localized name keys
Definition: Describes the language of the values of the name keys (using the same codes as Multilingual names).
Rendered as: Not rendered itself, but useful for better rendering of name=*
Drafted on: 2017-04-30
RFC start: 2017-05-05
Vote start: 2017-05-23
Vote end: 2017-06-06


Rationale

Many applications use the name=* tag in OSM. You usually use the name=* tag when you intentionally want the name in the default (local) language. Example: OsmAnd lets you choose between “local names” and a specific language, and the default style at openstreetmap.org uses name=* exclusively because it always wants to show the local names. These applications intentionally do not use localized tags like name:en, name:ja, name:de…

The content of name=* is plain Unicode. The problem is that this is not enough to render the text correctly. Some glyphs (character shapes) differ between the four variants of the CJK script (Japanese, Traditional Chinese, Simplified Chinese, Korean), but Unicode encodes them at the same code point. Likewise, some Cyrillic glyphs have four variants (Russian, Bulgarian, Serbian, Macedonian) that are encoded at the same Unicode code point. On the web, this problem is easily solved: the HTML code contains a language tag that provides the necessary information about the language, so the browser can display everything correctly. In OSM this information is missing.

Deducing this information from the country in which an OSM element is located is not reliable. A single country may have (many) more than one language, and even within a single region there may be objects whose name is in a language other than the majority language of that region. This approach is error-prone, so it is not an option.

Deducing this information by comparing name=* with the localized name:en, name:ja, name:de … tags does not help either. Example: the node http://www.openstreetmap.org/node/25248662 (English: Beijing) has name=北京市, name:ja=北京市 and name:zh=北京市. All three values are identical, so we cannot reliably determine the language of the name value. The comparison also fails for dual-language names like “Bruxelles - Brussel”, where none of the name:??=* tags has an identical value.
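The ambiguity can be illustrated with a short Python sketch. The Beijing tag values are the ones quoted above; the name:fr and name:nl values for Brussels and the helper matching_languages are assumptions made only for this illustration.

```python
def matching_languages(tags):
    """Return the language codes whose name:<code> value equals name."""
    name = tags.get("name")
    prefix = "name:"
    return sorted(
        key[len(prefix):]
        for key, value in tags.items()
        if key.startswith(prefix) and value == name
    )


# Tag values quoted in the rationale for node 25248662.
beijing = {"name": "北京市", "name:ja": "北京市", "name:zh": "北京市"}

# Illustrative values (assumed) for a dual-language name.
brussels = {
    "name": "Bruxelles - Brussel",
    "name:fr": "Bruxelles",
    "name:nl": "Brussel",
}

print(matching_languages(beijing))   # ['ja', 'zh']: two candidates, no way to choose
print(matching_languages(brussels))  # []: no localized value matches at all
```

In the first case the comparison yields more than one candidate language, in the second it yields none, so neither result tells a renderer which language name=* is written in.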

Usage

Use the suffix “:language” together with (non-localized) name keys.

It describes the language of the values of the name keys (using the same codes that are already used as key suffixes in Multilingual names).

Example: name=London and name:language=en

For double names in the name=* tag (like “Bruxelles - Brussel”), use a semicolon-separated list: name:language=fr;nl

This tag is not always necessary, but it is useful in regions where the language/script combination makes a difference in rendering.
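A data consumer could read the tag roughly as follows. This is a minimal Python sketch, not part of the proposal; the function name name_languages and the choice to return an empty list when the tag is absent are assumptions.

```python
def name_languages(tags):
    """Return the language codes declared in name:language, in order.

    An element without the tag yields an empty list, since the tag
    is optional.
    """
    raw = tags.get("name:language", "")
    return [code.strip() for code in raw.split(";") if code.strip()]


print(name_languages({"name": "London", "name:language": "en"}))
# ['en']
print(name_languages({"name": "Bruxelles - Brussel", "name:language": "fr;nl"}))
# ['fr', 'nl']
print(name_languages({"name": "Paris"}))
# []
```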

Representation

Not rendered itself, but it can be used to make correct language-specific rendering of name=* possible. It can also be used by text-to-speech engines and for more specialized processing of multilingual name=* values.
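As an illustration of language-specific rendering, a renderer might pick a font variant from the tag. The following Python sketch is hypothetical: the mapping FONT_BY_LANGUAGE, the font family names, the language codes used as keys and the fallback font are all assumptions, and no existing renderer is claimed to work this way.

```python
# Hypothetical mapping from language code to font family (assumed values).
FONT_BY_LANGUAGE = {
    "ja": "Noto Sans CJK JP",
    "ko": "Noto Sans CJK KR",
    "zh-Hans": "Noto Sans CJK SC",
    "zh-Hant": "Noto Sans CJK TC",
}
DEFAULT_FONT = "Noto Sans"  # assumed fallback when no mapping applies


def font_for(tags):
    """Pick a font for name=* from the first mapped code in name:language."""
    raw = tags.get("name:language", "")
    codes = [code.strip() for code in raw.split(";") if code.strip()]
    for code in codes:
        if code in FONT_BY_LANGUAGE:
            return FONT_BY_LANGUAGE[code]
    return DEFAULT_FONT


print(font_for({"name": "東京", "name:language": "ja"}))    # Noto Sans CJK JP
print(font_for({"name": "London", "name:language": "en"}))  # Noto Sans
print(font_for({"name": "北京市"}))                          # Noto Sans
```

Without name:language the last call can only fall back to a default, which is the situation described in the rationale.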

Voting

  • I approve this proposal. --Panoramedia (talk) 20:31, 23 May 2017 (UTC)
  • I approve this proposal. --Waldhans (talk) 20:34, 23 May 2017 (UTC)
  • I approve this proposal. --Sommerluk (talk) 20:41, 23 May 2017 (UTC)
  • I approve this proposal. I had a similar idea for multilingual names, but Sommerluk's proposal is a superset hence better suited. Name rendering of osm-carto could IMO really benefit from this addition. --Nebulon42 (talk) 20:55, 23 May 2017 (UTC)
  • I approve this proposal. --Mfuji (talk) 04:16, 24 May 2017 (UTC)
  • I approve this proposal. --Hufgardm (talk) 05:47, 24 May 2017 (UTC)
  • I approve this proposal. I also support extending this to name variant tags if deemed helpful, e.g. loc_name:language=<language-code> --Dieterdreist (talk) 11:38, 24 May 2017 (UTC)
  • I approve this proposal. Also in support of Dieterdreist's suggestion. --Artoria2e5 (talk) 14:29, 24 May 2017 (UTC)
  • I have comments but abstain from voting on this proposal. comments --Dr Centerline (talk) I only map in the United States and this is not an issue for me, so I don't really know enough about it to make an informed decision.
  • I have comments but abstain from voting on this proposal. Unfortunately this proposal doesn't solve all naming issues, because for `name` tag you should select one (or practically maximum two with dash) languages, that is really problem for multilingual countries, names of continents, oceans, rivers and etc. However information of source language is very useful, but source language not always same with `name` tag now. My a bit radical vision to use language name tags + info of source language, because end user know only a few languages (mostly one at all) and see text on signboard. --Tbicr (talk) 17:32, 24 May 2017 (UTC)
  • I approve this proposal. This will help rendering, the solution for multiple languages will help here in Ethiopia where we use several different scripts and languages, language-code=am;en.
  • I approve this proposal. This definitely helps interpret the language of the `name` as well as define what the locale language is. We could potentially use this tag on areas like boundaries to create shapes of areas that define the popular language of features within Planemad/Talk 08:49, 25 May 2017 (UTC)
  • I approve this proposal. --Johan Jönsson (talk) 10:24, 25 May 2017 (UTC)
  • I oppose this proposal. Either we have to use :language for every name or there is a default mechanism for it. But OSM has just abandoned the proposal for defaults. Please restore that proposal to define a default :language mechanism first. Papou (talk) 03:16, 26 May 2017 (UTC)
As explained at https://wiki.openstreetmap.org/wiki/Talk:Proposed_features/Language_information_for_name#That_proposition_needs_default_languages_to_be_defined_first the proposal you mentioned defines default values for yet existing keys (like maxspeed) – so it cannot express any information for which you do not have a normal key yet. And the proposal for default values is from 2010 and seems to be abandoned. It does neither conflict with this current proposal nor is there any overlap. --Sommerluk (talk) 08:30, 26 May 2017 (UTC)
I don't see how you can say that def:conditions;new tag = default_value is limited to maxspeed.
It's about anything and if it doesn't speak of languages, it's because it can't speak of everything.
You still don't say how to determine the language of a name without a :language tag. Papou (talk) 12:58, 26 May 2017 (UTC)
  • I approve this proposal. I'd prefer to use the same separator as used in the corresponding tags, but I guess it's indeed possible to use the name:isolang to get the name's languages. I thought this would be a lot easier to parse by scripts. In general I do agree with the proposal though and it's long overdue. Polyglot (talk) 09:43, 27 May 2017 (UTC)