Module:Languages/config

From OpenStreetMap Wiki

This is a data module that contains configuration for Module:Languages.

Usage

The module exports only simple types, so it can be loaded as a data module:

local config = mw.loadData("Module:Languages/config")

Module:Languages is used by templates that are transcluded many times on a given page, so loading this module as a data module helps to avoid wasteful Lua processing.
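
As a minimal sketch of how a consuming module might read this configuration: the helper name autonymFor is hypothetical and is not part of Module:Languages, and the fallback table is a small stand-in so that the snippet also runs outside Scribunto, where the global mw is not defined.

```lua
-- Inside Scribunto, load the real configuration as a read-only data table.
-- Outside Scribunto (where the global `mw` is nil), fall back to a small
-- stand-in table so that this sketch can still be run.
local config = mw and mw.loadData("Module:Languages/config") or {
	languageNamesByCode = { pt = "português", ro = "română" },
	deprecatedLanguageCodes = { ["pt-br"] = "pt", ["ro-md"] = "ro" },
}

-- Hypothetical helper: returns the autonym for a language code, following
-- deprecatedLanguageCodes to the preferred replacement first.
local function autonymFor(code)
	local preferred = config.deprecatedLanguageCodes[code] or code
	return config.languageNamesByCode[preferred]
end

print(autonymFor("pt-br"))
```

Because the table returned by mw.loadData is read-only, a consumer like this one only looks values up and never assigns into the configuration.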

Structure

languageNamesByCode

The languageNamesByCode table has ISO 639 language codes as keys and autonyms (native names) as values. Module:Languages uses this table to label language links. The table is derived from MediaWiki's internal localization data, which has several inaccuracies for historical reasons. If the table lacks a language code that is used on this wiki, or a language's name is incorrect, add or correct the entry in languageNamesByCode after the call to mw.language.fetchLanguageNames().

languageCodes

The languageCodes table contains the ISO 639 language codes of languages used on this wiki, sorted by the languages' autonyms (native names). Module:Languages uses this table to arrange the list of links. The list appears on almost every page on this wiki, so it should be as short as possible.

A language code may be included only if this wiki has at least one content page in the language, based on the page title (which should contain the language code either as a namespace or a pseudonamespace). Empty categories and redirects alone are insufficient for a language to be listed. The codes pt-br, ro-md, and zh-tw are deprecated and thus omitted, even though a few content pages remain under those pseudonamespaces.
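
The title-based rule above can be sketched as a prefix lookup. This is an illustration under assumptions, not how Module:Languages is implemented: the helper name languageCodeForTitle, the sample titles, and the shortened stand-in tables are all hypothetical, and prefixes nested inside other namespaces are not handled.

```lua
-- Stand-in data: a subset of namespacesByLanguage and languageCodes.
local namespacesByLanguage = { de = "DE", en = "", fr = "FR" }
local languageCodes = { "de", "en", "fr", "pt", "zh-hans" }

-- Build a lookup from lowercased title prefix to language code. A
-- pseudonamespace is assumed to be the language code itself.
local codesByPrefix = {}
for _, code in ipairs(languageCodes) do
	codesByPrefix[code] = code
end
for code, namespace in pairs(namespacesByLanguage) do
	if namespace ~= "" then
		codesByPrefix[namespace:lower()] = code
	end
end

-- Hypothetical helper: guesses the language of a page from its full title.
-- A title without a recognized prefix is assumed to be in English.
local function languageCodeForTitle(fullTitle)
	local prefix = fullTitle:match("^([^:]+):")
	return prefix and codesByPrefix[prefix:lower()] or "en"
end

print(languageCodeForTitle("Pt:Example page"))
```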

If languageCodes is missing a language code that is used on this wiki or the language is misplaced in the list:

  1. Add the missing language code anywhere in the languageCodes table.
  2. Remove the language code from minorLanguageCodes, if it is listed there.
  3. Enter p.languageCodesSortedByName() in the debug console below the source code editor, then press Enter.
  4. Copy and paste the result into the source code editor, replacing the languageCodes table's previous contents.

If a language code in languageCodes is no longer used on this wiki, remove it from languageCodes and add it to minorLanguageCodes (if the language code is still valid) or deprecatedLanguageCodes (if the language code is invalid and should not be used anymore).
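
After moving codes between the tables, a quick consistency check can catch slips. The following is a minimal sketch, assuming a configuration table shaped like the one this module exports; the function findProblems and the sample data are hypothetical and not part of the module.

```lua
-- Hypothetical checker: returns a list of human-readable problems found in a
-- configuration table shaped like the one this module exports.
local function findProblems(config)
	local problems = {}
	local listed = {}
	for _, code in ipairs(config.languageCodes) do
		listed[code] = true
		-- Every code in languageCodes needs an autonym for its link label.
		if not config.languageNamesByCode[code] then
			table.insert(problems, code .. " has no autonym")
		end
		-- Deprecated codes are meant to be omitted from languageCodes.
		if config.deprecatedLanguageCodes[code] then
			table.insert(problems, code .. " is deprecated")
		end
	end
	-- A code belongs in languageCodes or minorLanguageCodes, not both.
	for _, code in ipairs(config.minorLanguageCodes) do
		if listed[code] then
			table.insert(problems, code .. " is in both lists")
		end
	end
	return problems
end

-- Stand-in configuration with one deliberate mistake ("cy" in both lists).
local sample = {
	languageNamesByCode = { cy = "Cymraeg", de = "Deutsch" },
	languageCodes = { "cy", "de" },
	minorLanguageCodes = { "cy", "ga" },
	deprecatedLanguageCodes = { ["pt-br"] = "pt" },
}
for _, problem in ipairs(findProblems(sample)) do
	print(problem)
end
```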

languageCodesSortedByName

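This function sorts the contents of languageCodes by autonym, then logs the sorted list to the debug console as quoted, comma-terminated codes that can be pasted back into the source. It is defined only when the current page is Module:Languages/config, so it is available in this module's debug console but is not exported to any other module or template. The restriction exists because the NFD normalization required for Unicode-aware sorting is very memory-intensive.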
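
The sort-by-folded-key approach can be illustrated in plain Lua. This is a simplified sketch rather than the module's code: string.lower stands in for the case folding and NFD-based diacritic folding that the real function performs through Scribunto, and the three autonyms are stand-in data. Without folding, a byte-wise comparison would place "Deutsch" and "English" before "dansk", because uppercase letters sort before lowercase ones.

```lua
-- Stand-in autonyms; the real function reads languageNamesByCode.
local names = { en = "English", da = "dansk", de = "Deutsch" }
local codes = { "en", "da", "de" }

-- Compute each sorting key once, up front, rather than inside the comparator.
-- string.lower stands in for caseFold plus diacritic folding here.
local sortingKeys = {}
for _, code in ipairs(codes) do
	sortingKeys[code] = names[code]:lower()
end

table.sort(codes, function (a, b)
	return sortingKeys[a] < sortingKeys[b]
end)

-- Format the sorted codes as quoted, comma-terminated Lua string literals.
-- The extra parentheses discard gsub's second return value (the match count).
local formatted = (table.concat(codes, " "):gsub("(%S+)", "\"%1\","))
print(formatted)
```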

local p = {}

-- Map all known language codes to their autonyms per MediaWiki, then fix up a
-- few codes and names that are incorrect in MediaWiki for historical reasons.
local languageNamesByCode = mw.language.fetchLanguageNames()
languageNamesByCode.gcf = "kréyòl gwadloupéyen" -- Guadeloupean Creole French
languageNamesByCode["sr-cyrl"] = languageNamesByCode["sr-ec"]
languageNamesByCode["sr-latn"] = languageNamesByCode["sr-el"]

--- A table mapping language codes to their autonyms. Every language code in
--- languageCodes must have a pair in this table, but this table has many pairs
--- that go unused in languageCodes.
p.languageNamesByCode = languageNamesByCode

--- A table of ISO 639 language codes for the languages used on this wiki,
--- sorted by autonym. A language code may be included only if there is at least
--- one content page in the language that has the code in its title (either as a
--- namespace or a pseudonamespace). Empty categories and redirects don’t count.
p.languageCodes = {
	-- After modifying this table, rerun p.languageCodesSortedByName() in the
	-- debug console below and paste the re-sorted results here. This ensures
	-- that the codes are sorted by language name.
	"af", "ast", "az", "id", "ms", "bs", "br", "ca", "cs", "da", "de", "et",
	"en", "es", "eo", "eu", "fr", "fy", "gd", "gl", "hr", "ia", "is", "it",
	"ht", "gcf", "ku", "lv", "lb", "lt", "hu", "nl", "no", "nn", "oc", "pl",
	"pt", "ro", "sc", "sq", "sk", "sl", "sr-latn", "fi", "sv", "tl", "vi", "tr",
	"diq", "el", "be", "bg", "mk", "mn", "ru", "sr", "uk", "hy", "he", "ar",
	"fa", "pa", "pnb", "ps", "skr", "ne", "mr", "bn", "ta", "ml", "si", "th",
	"my", "ko", "ka", "tzm", "zh-hans", "zh-hant", "ja", "yue",
}

--- A table of language codes for languages that are only nominally used on this
--- wiki. A language code is included in this table if there is at least one
--- non-redirect page in the language, such as a category or template. If there
--- is a content page in the language, place the code in languageCodes instead.
--- Unlike the table above, this is sorted by code (and manually).
p.minorLanguageCodes = {
	"ab", "am", "an", "as", "av", "ay", "ba", "bm", "bo", "co", "cy", "dv",
	"ext", "ga", "gsw", "gu", "ha", "hi", "ie", "ig", "jv", "kk", "km",
	"kn", "ky", "la", "ldn", "li", "lo", "mg", "min", "mt", "nan", "nds",
	"nds-nl", "om", "or", "sa", "sd", "so", "su", "sw", "te", "tg", "tk", "ug",
	"ur", "uz", "vec", "wa", "wo", "wuu", "xh", "yi", "yo", "za", "zh", "zu",
}

--- A table mapping deprecated language codes to their preferred replacements.
--- Deprecated language codes should not be used on new pages, but a few content
--- pages remain under these pseudonamespaces for now.
p.deprecatedLanguageCodes = {
	["pt-br"] = "pt",
	["ro-md"] = "ro",
	["zh-tw"] = "zh-hant",
}

--- A table mapping language codes to their content namespaces. For historical
--- reasons, several early languages got dedicated content namespaces, but most
--- languages rely on pseudonamespaces in the main content namespace. Apart from
--- pseudonamespaces, the main content namespace is assumed to be in English.
p.namespacesByLanguage = {
	["de"] = "DE",
	["en"] = "",
	["es"] = "ES",
	["fr"] = "FR",
	["it"] = "IT",
	["ja"] = "JA",
	["nl"] = "NL",
	["ru"] = "RU",
}

--- A table mapping certain language codes to the names of the categories
--- tracking missing translations in those languages. This table only includes
--- languages that have dedicated namespaces.
p.unavailablePageCategoryNames = {
	["de"] = "Pages unavailable in German",
	["en"] = "Pages unavailable in English",
	["es"] = "Pages unavailable in Spanish",
	["fr"] = "Pages unavailable in French",
	["it"] = "Pages unavailable in Italian",
	["ja"] = "Pages unavailable in Japanese",
	["nl"] = "Pages unavailable in Dutch",
	["ru"] = "Pages unavailable in Russian",
}

if mw.title.getCurrentTitle().fullText == "Module:Languages/config" then
	--- Logs a table of language codes sorted by autonym. This function is only
	--- available in the debug console, because NFD normalization uses a lot of
	--- memory.
	p.languageCodesSortedByName = function ()
		local siteLanguage = mw.getContentLanguage()
		
		local codes = {}
		local sortingKeys = {}
		for _, code in ipairs(p.languageCodes) do
			table.insert(codes, code)
			local foldedName = siteLanguage:caseFold(languageNamesByCode[code])
			-- Fold diacritics: decompose to NFD so that each diacritic becomes a
			-- separate combining character, then delete every character that is
			-- not a letter, punctuation, or whitespace.
			sortingKeys[code] = mw.ustring.gsub(mw.ustring.toNFD(foldedName),
				"[^%a%p%s]+", "")
		end
		table.sort(codes, function (a, b)
			return sortingKeys[a] < sortingKeys[b]
		end)
		mw.log((table.concat(codes, " "):gsub("(%S+)", "\"%1\",")))
	end
end

return p