Key:default_language

From OpenStreetMap Wiki
Description
The most likely language of the name tags in the region.
Group: names
Used on these elements
  • should not be used on nodes
  • should not be used on ways
  • may be used on areas (and multipolygon relations)
  • may be used on relations
Status: undefined

This tag gives the most likely language of the name=* tags in a region (e.g. a country). It helps data consumers decide whether to use name=* or a name:xx=* tag when drawing localized maps. It does not indicate which languages are spoken in the area, nor the region's official languages; it applies only to name=*, alt_name=*, and similar name tags.

  • Use the largest possible admin region for this value, e.g. a country relation with admin_level=2.
  • Do not set it on any smaller sub-regions unless their default language is different.
  • A region may contain a sub-region with a different default_language.
  • Always use just one language code per region (see Multilingual Regions below).
  • In rare cases, additional non-admin regions might be required to carry a default_language. Avoid this where possible.
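The rules above imply that a data consumer looks for the innermost enclosing region that carries a default_language and lets it override any outer region. The following is a minimal sketch of that lookup, not a reference implementation: the function names, the "innermost first" list layout, and the sample tags are all hypothetical, and how a consumer finds the enclosing regions is left out.

```python
from typing import Optional


def effective_default_language(region_chain: list[dict]) -> Optional[str]:
    """Return the default_language of the innermost region that sets one.

    region_chain holds the tag dictionaries of the enclosing regions,
    innermost first (an assumed layout, chosen for this sketch).
    """
    for tags in region_chain:
        lang = tags.get("default_language")
        if lang:
            return lang
    return None


def pick_label(tags: dict, user_lang: str, default_lang: Optional[str]) -> Optional[str]:
    """Choose a label for a feature, given the region's default language."""
    # name=* is assumed to already be in the user's language: use it as is.
    if default_lang == user_lang and "name" in tags:
        return tags["name"]
    # Otherwise prefer an explicit name:xx=* tag, falling back to name=*.
    return tags.get(f"name:{user_lang}") or tags.get("name")


# Illustrative data only: a sub-region overriding its country's value.
chain = [
    {"admin_level": "4", "default_language": "fr"},
    {"admin_level": "2", "default_language": "en"},
]
feature = {"name": "Montréal", "name:en": "Montreal"}
label = pick_label(feature, "en", effective_default_language(chain))
```

With this layout a sub-region only needs the tag when its value differs from its parent's, which matches the advice to tag the largest possible region.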

Multilingual Regions


This article or section contains questionable, contentious or controversial information. See the talk page for more information.
There is currently no clear evidence as to why a multilingual region should avoid listing multiple languages.

This tag should never be set to multiple semicolon-separated values, e.g. default_language=fr;en, just because multiple languages are used in an area, because such a value has no meaning to the data consumer. Two or more language codes do not identify the language of any specific name tag, and therefore should not be used.

The documentation for this key has long stated that listing multiple languages for a multilingual area renders the default_language key useless. However, this has not been shown to hold for data consumers as a whole. For example, this blog post from Nominatim explains the advantage of listing multiple default languages for an area that is truly multilingual and for which breaking the languages down into fine-grained regions does not help:

First of all, it is of course a bit naive to assume that exactly one language is spoken in each country. Even using a more fine-grained regions doesn’t help very much. There are enough regions where many languages are in parallel use. Fortunately, there is no need to determine exactly one language for our names. If multiple languages are spoken in a country, then we can simply take the name and analyse it multiple times, once for every language. There will be a couple of false results but still far less than our language-unaware algorithm produces now. As long as the language list for the country is complete, the right result will be in there as well.

- From "Detecting languages" (Section: "Multi-lingual countries"); 10 October 2021 on nominatim.org [1]

If a region uses mixed languages in all of its name tags, e.g. separated by a dash or semicolon as in "[name_in_en] - [name_in_zh]", you could set default_language=en - zh, but this is not ideal. Please look at all name tags in the region with the Overpass Turbo query below. If you see that the majority of all tags are in a single language, please use that language (unless you can identify sub-areas with different languages). It is OK for the country or city name itself to contain multiple languages: such places usually have many other names defined, so localizing them is not a problem. It is the majority of the other, smaller locations that causes localization problems.

Replace the trailing zeros in area() with the region's relation ID, keeping the leading 36 so that the area ID still has 10 digits (in other words, add the relation ID to 3600000000).

[out:csv("name")]; area(3600000000)->.boundaryarea;
( node(area.boundaryarea)[name];
  way(area.boundaryarea)[name];
  rel(area.boundaryarea)[name];
); out;
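The ID substitution described above is simple arithmetic, so it can be scripted. The following is a small sketch that builds the same query text for a given relation ID; the function names are hypothetical, and the example ID 51477 is only a placeholder for whichever relation you are inspecting.

```python
AREA_ID_OFFSET = 3_600_000_000  # Overpass area ID = this offset + relation ID


def area_id_for_relation(relation_id: int) -> int:
    """Convert an OSM relation ID into the matching Overpass area ID."""
    if relation_id <= 0:
        raise ValueError("relation_id must be a positive integer")
    return AREA_ID_OFFSET + relation_id


def build_name_query(relation_id: int) -> str:
    """Return the Overpass query listing every name=* value in the region."""
    area_id = area_id_for_relation(relation_id)
    return (
        f'[out:csv("name")]; area({area_id})->.boundaryarea;\n'
        "( node(area.boundaryarea)[name];\n"
        "  way(area.boundaryarea)[name];\n"
        "  rel(area.boundaryarea)[name];\n"
        "); out;"
    )


# Placeholder relation ID; paste the printed text into Overpass Turbo.
query = build_name_query(51477)
print(query)
```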

See also

  • Sophox query