Data items

From OpenStreetMap Wiki

Intro

Data items are a way to document all OSM metadata, such as keys and tags, in every language on this wiki in a structured form that is useful to both humans and tools.

  • Tools such as the iD editor and Taginfo can now get tag information without complex and error-prone parsing of the wiki markup. Eventually the data may include tag suggestions, validation rules, common pitfalls, presets, and more.
  • Data consumers can get structured metadata to help process the main OSM database.
  • This wiki can now show data as info cards and tables, without information duplication and complicated template hackery.
  • All metadata can be analyzed using Sophox queries (see query examples).

This page documents how to store structured tag metadata on this wiki using data items provided by the Wikibase extension - the same software that runs Wikidata (initial discussion). This project's goal is NOT to replace the primary tag storage for the OSM database, nor to use opaque IDs instead of the human readable key=value strings to tag features. We are only trying to improve metadata documentation, making it more useful to various tools.

How can I help?

  • Add tag descriptions and translations. See the following 3 minute video.
How to add tag descriptions
Looking for volunteers:
Add descriptions and translations
Community and content
  • Set up a wiki portal, possibly similar to Wikidata's community portal (but simpler), where the community can:
    • propose new properties
    • write guidelines/docs
    • discuss Wikibase data structures
  • Create Lua modules to generate tag tables, such as {{Template:Bridge:movable}}, {{Map Features:highway}}, or {{Template:Religions}}.
    • Implementation note: Wikibase only links each Tag to its corresponding Key; a Key does not list all of its possible Tags. To generate a table, the list of items must be stored somewhere. We could create a new WB key property that lists all tags and use a bot to maintain it, or we could pass the needed tags as template parameters, e.g. for highway, {{...|motorway|trunk|primary|secondary|...}}. A template parameter list does not need to be localized, and it can specify the proper ordering of items (not available in WB). Lua code would use mw.wikibase.getEntityIdForTitle("Tag:highway=motorway") to find the right data.
Technical
  • Add Wikibase support to external tools. Simple usage: fetch the localized description of a key or tag. Complex usage: let the user add a missing description or even edit an existing one, especially when the user is creating a new key.
  • Port simple validation rules, e.g. regex-based, to use Wikibase data.
  • Help parse various tables of tag data. Even if you can only generate plain files with data, user:Yurik can quickly import them.

Tag Keys

Each OSM Key is stored as a separate page in the Item namespace. For example, see bridge:movable (Q104), which describes bridge:movable=*:

property type value example description
description string en - The mechanism by which a movable bridge moves to clear the way below.
uk - Механізм, що приводить в дію рухомий міст для вивільнення шляху під ним.
This is the primary way to describe the key, using proper sentences whose first word is capitalized and that end with a period. Must not contain any wiki markup or HTML. Must be shorter than 250 characters. When translating, it is usually enough to add just the description to the item. Any key:... and tag:...=... will be automatically shown as links.
label string en - bridge:movable Label usage is still somewhat undecided for the key/tag data items, so it is best not to rely on it for anything. For now, a bot sets the English label to the key itself, exactly the same as P16 below. For some languages, the nativekey (localized key) was added to the labels as well. Do not copy the English label to any other language. Note that, as with "en", a localized label must be unique within its language.
sitelink string Key:bridge:movable Links to the Key:... pages, even if the page does not exist. The sitelink is shown on top of the page on the other side as the page title.
instance of (P2)
that class of which this subject is a particular example and member (subject typically an individual member with a proper name label); different from P3 (subclass of)
Item key (Q7) Indicate the type of the item. Set to Q7 for keys.
permanent key ID (P16)
A string representing the key ID. Once set on a key data item, this value should never be changed.
string bridge:movable Shows the exact form of the key as used in OSM. Must never be changed once the item is created. Due to technical limitations, keys "Key:water tap", "Key:water_tap", and "Key:water_tap_" have identical wiki pages/sitelinks: "Key:water tap". In this case, set this property to multiple strings, and mark one of them with the "preferred" rank.
use on nodes (P33)
Use status to indicate if this key or tag should be used on nodes. Use P26 qualifier to limit to a specific language region. Use P11 reference to link to discussions.

use on ways (P34)

Use status to indicate if this key or tag should be used on ways. Use P26 qualifier to limit to a specific language region. Use P11 reference to link to discussions.

use on areas (P35)

Use status to indicate if this key or tag should be used on areas (closed ways). Use P26 qualifier to limit to a specific language region. Use P11 reference to link to discussions.

use on relations (P36)

Use status to indicate if this key or tag should be used on relations. Use P26 qualifier to limit to a specific language region. Use P11 reference to link to discussions.
Item is allowed (Q8000)
or

is prohibited (Q8001)

Sets if this key is allowed on nodes/ways/areas/relations. In the future we may want to use other statuses like approved (Q15), but this is not yet supported. See also limiting per locale below.
image (P28)  [ image caption (P47) ]
Image of a relevant illustration of the subject.
[A qualifier to add to an image to specify image caption in a specific language.]
string Noexit.jpg An image stored either on Wikimedia Commons (preferable) or on the OSM wiki, given without the File: prefix. To use a different image for a specific language region, add another value and set limited to language (P26) on it. Make sure to give the default image the "preferred" rank. Use the image caption (P47) qualifier to set the caption in any language (the English caption is shown if the requested language is not found, or any other caption if there is no English one).
group (P25)
indicates which group the given tag or a key belongs to. Target must have instance-of = group
Item bridges (Q4712) The group this item belongs to. In the current model, each key belongs to just one group. In theory we could use it to attach multiple groups, changing the meaning of the "group" to something like a "label"/"meta-tag".
status (P6)  [ proposal discussion (P11) ]
Community acceptance status. Use reference to link to the proposal discussion page (P11).
[link to the key or tag proposal page]
Item approved (Q15)
  reference link
community's approval status, together with a reference link to the discussion page (optional)
key type (P9)
Type of the key entity, e.g. enum, external id. Do not use this for groups or statuses.
Item well-known values (Q8) Describes the type of values this key is expected to have. If there is a well known list of values, use Q8. Other types are TBD.
Wikidata concept (P12)
this item represents a concept described by the given Wikidata item
Item Q787417 A link to the Wikidata item, stored as an external ID (string). Must be a Q-number.
value validation regex (P13)
Regular expression to test the validity of the tag's value. May also be used for role names. The wrapping ^( and )$ are assumed. Do not use for enum-like values, e.g. noexit=yes should be a tag, not a regex.
string [0-9]+ A regular expression that can be used to validate the value of this key. In this case the value must be one or more digits. Validators will require the entire expression to match the string, i.e. they will add ^( in front and )$ at the end.
See population (Q574) example.
documentation wiki pages (P31)
Wiki pages for this data item in different languages. There should be no more than one value per language.
multilingual string Key:bridge:movable (English)

Cs:Key:bridge:movable (Czech)
...

Each value is the name of the wiki page in a specific language. This value is usually updated with a bot.
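
The wrapping rule described for value validation regex (P13) can be sketched as follows. This is a minimal illustration, not the code of any existing validator: the function name is hypothetical, and it assumes the stored pattern is compatible with Python's re syntax, which this page does not specify.

```python
import re


def is_valid_value(p13_pattern: str, value: str) -> bool:
    """Check a tag value against a P13 pattern (hypothetical helper).

    The stored pattern is treated as if it were wrapped in ^( and )$.
    re.fullmatch on the grouped pattern gives the same whole-string
    behaviour, and unlike a bare "$" in Python it does not accept a
    trailing newline.
    """
    return re.fullmatch("(" + p13_pattern + ")", value) is not None


# The grouping matters for alternations: without the parentheses,
# "^yes|no$" would accept any string that merely starts with "yes".
print(is_valid_value("[0-9]+", "12345"))
print(is_valid_value("[0-9]+", "12a"))
print(is_valid_value("yes|no", "yes please"))
```

The same helper applies to role patterns such as platform:[0-9]+, since the page states that P13 may also be used for role names.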

Tag values

For keys like Key:highway, there is a list of well-known values such as highway=residential, highway=service, and highway=footway. These values are stored similarly to keys. See bridge:movable=bascule (Q888), which describes bridge:movable=bascule. See all items that link to bridge:movable.

property type value example description
description string en - A type of movable bridge, a bascule bridge contains one or two spans, one end of which is free and swings upwards. A counterweight at the pivoting end of the span or spans balances the weight as the free end rises.
pl - Most zwodzony jest to rodzaj mostu w którym co najmniej jedno przęsło jest podnoszone. Mosty zwodzone mogą być jedno- lub dwuskrzydłowe.
This is the primary way to describe a tag, using proper sentences (first word capitalized, ending with a period). Must not contain any wiki markup or HTML. Must be shorter than 250 characters. When translating, it is usually enough to add just the description to the item. Any key:... and tag:...=... will be automatically shown as links.
label string en - bridge:movable=bascule Label usage is still somewhat undecided for the key/tag data items, so it is best not to rely on it for anything. For now, a bot sets the English label to the tag itself, exactly the same as P19 below. For some languages, the nativekey=nativevalue (localized key/value) was added to the labels as well. Do not copy the English label to any other language. Note that, as with "en", a localized label must be unique within its language.
sitelink string Tag:bridge:movable=bascule Links to the Tag:... pages, even if the page does not exist. Sitelink is shown in the upper right corner of the item page.
instance of (P2) Item tag (Q2) Indicate the type of the item. Set to Q2 for tags.
permanent tag ID (P19) string bridge:movable=bascule Shows the exact form of the tag as used in OSM. Must never be changed once the item is created. Due to technical limitations, tags "Tag:water tap=yes", "Tag:water_tap=yes", and "Tag:water_tap=yes_" have identical wiki pages/sitelinks: "Tag:water tap=yes". In this case, set this property to multiple strings, and mark one of them as "preferred" (small up arrow left of value).
key for this tag (P10) Item bridge:movable (Q104) Every tag item links to the corresponding key item, making it easier to query and validate.

Tags may also use the following properties: use on nodes (P33), use on ways (P34), use on areas (P35), use on relations (P36), image (P28), group (P25), status (P6), value validation regex (P13), and documentation wiki pages (P31). See their descriptions in the Tag Keys section above.

Relations

Similar to keys and tags, here is an example restriction relation (Q16054) copied from Relation:restriction.

property type value example notes
description string ... This is the primary description of the relation, using proper sentences whose first word is capitalized and that end with a period. Must not contain any wiki markup or HTML. Must be shorter than 250 characters. When translating, it is usually enough to add just the description to the item. Any key:... and tag:...=... will be automatically shown as links.
label string en - restriction relation Short relation description. Do not add a copy of the English label to any other languages.
sitelink string Relation:restriction Links to the corresponding Relation:... wiki page, even if the page does not exist. Sitelink is shown in the upper right corner of the item page.
instance of (P2) Item relation type (Q6) Indicate the type of the item. Must be set to Q6 for relations.
permanent relation type ID (P41)
A string representing the role ID, e.g. "motorway". Once set on a relation data item, this value should never be changed.
string restriction Shows the exact form of the relation ID as used in OSM. Must never be changed once the item is created. Due to technical limitations, sitelinks "Relation:destination sign", "Relation:destination_sign", and "Relation:destination_sign_" resolve to the identical wiki sitelink "Relation:destination sign". In this case, set this property to multiple strings, and mark one of them with the "preferred" rank.
tag for this relation type (P40)
For a given relation item, links to the corresponding tag item, e.g. type=multipolygon.
Item type=restriction (Q16013) Every relation item links to the corresponding type=* tag item, making it easier to query and validate.

Relations may also use image (P28), group (P25), status (P6), and documentation wiki pages (P31). See their descriptions in the Tag Keys section above.

Relation Roles

Members of a relation can be labeled with "roles", e.g. the "inner" and "outer" ways in a multipolygon relation. Each role of each relation type has its own data item. See, for example, boundary=admin_centre (Q16060).

property type value example notes
description string ... This is the primary description of the relation role, using proper sentences whose first word is capitalized and that end with a period. Must not contain any wiki markup or HTML. Must be shorter than 250 characters. When translating, it is usually enough to add just the description to the item. Any key:... and tag:...=... will be automatically shown as links.
label string en - boundary admin center role Short relation role description. Do not add a copy of the English label to any other languages.
sitelink string Relation:boundary=admin center Links to the Relation:<relation>=<role> wiki page, even if the page does not exist. If the role is empty, use the Relation:relation= form. If the role has a variable portion, e.g. route=platform:<number>, set the sitelink to the fixed part -- Relation:route=platform:, and use value validation regex (P13) to validate the variable part. Sitelink is shown in the upper right corner of the item page.
instance of (P2) Item relation member role (Q4667) Indicate the type of the item. Must be set to Q4667 for the relation member roles.
relation role ID (P21)
A string in a "relationtype=role" format. Should only be set on relation role items. Once set on a role item, the value should never be changed.
string boundary=admin_center Shows the exact form of the relation role ID as used in OSM. Must never be changed once the item is created. Due to technical limitations, sitelinks "Relation:boundary=admin_center", "Relation:boundary=admin center", and "Relation:boundary=admin_center_" resolve to the identical wiki sitelink "Relation:boundary=admin center". In this case, set this property to multiple strings, and mark one of them with the "preferred" rank.
belongs to relation type (P43)
For a given relation role (e.g. "inner"), links to the corresponding relation type (e.g. "multipolygon")
Item boundary relation (Q16019) Every relation member role links to the corresponding relation item, making it easier to query and validate.
value validation regex (P13)
Regular expression to test the validity of the tag's value. May also be used for role names. The wrapping ^( and )$ are assumed. Do not use for enum-like values, e.g. noexit=yes should be a tag, not a regex.
string platform:[0-9]+ A regular expression that can be used to validate the variable part of the role. In this case the variable part must be one or more digits, as in the route=platform:<number> role. Validators will convert the regular expression into the ^(platform:[0-9]+)$ form (for the given example).

Relation member roles may also use the following properties: use on nodes (P33), use on ways (P34), use on areas (P35), use on relations (P36), image (P28), group (P25), status (P6), and documentation wiki pages (P31). See their descriptions in the Tag Keys section above.

Storing Geographical Differences

A phone booth looks very different depending on the geographical region, e.g. the country. To indicate that an image, or any other value of the data item, is specific to a location, use the limited to region (P48) qualifier with a geographical region item.

A geographical region item is a data item with the instance of (P2) = geographic region (Q19531), and it contains a geographic code (P49) property set to one or more country codes.

The limited to region (P48) qualifier should eventually replace the limited to language (P26).

Storing Locale Differences

Most translated Key:... and Tag:... pages tend to have mismatching parameters such as status, group, or the types of elements the tag should be used on. Some mismatches are deliberate, the result of careful evaluation by the local community (see noexit (Q501)), but many others are simply stale and need to be fixed, or possibly removed from the template's parameters so that the template uses the underlying data item.

All locale differences are stored using limited to language (P26), following these rules:

  • The value with no qualifiers is the default. It should have the preferred rank, but it is OK to keep the normal rank when the property has no other values.
  • All language-specific values must use limited to language (P26) and have the normal rank.
  • Each value must be used only once, possibly with multiple qualifier values (e.g. a property of access:lhv (Q33) can have only one is allowed (Q8000) and one is prohibited (Q8001)).
  • Each language qualifier can only be used once for the whole property.
  • A language must not be listed if its value is the same as the default.
  • If there is no value without qualifiers, the default is not set (e.g. the English page has no onRelation= parameter).

Property Rank Value Qualifier Meaning
group (P25) Preferred rank bridges (Q4712) no qualifiers This value is used for English page and all other language pages except those explicitly listed below.
Normal rank properties (Q4671) limited to language (P26)
Italian-language documentation (Q7798)
Finnish-language documentation (Q7791)
This value is only used for the Italian and Finnish pages.
Normal rank placement (Q4707) limited to language (P26)
Czech-language documentation (Q7785)
This value is only used for the Czech pages.
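
The lookup implied by the table above can be sketched like this. The statement layout used here (plain dictionaries with value, rank, and limited_to_language fields) is a simplified assumption made for illustration, not the raw Wikibase JSON, and the function name is hypothetical.

```python
from typing import Optional

# Simplified copy of the group (P25) example from the table above.
GROUP_STATEMENTS = [
    {"value": "Q4712", "rank": "preferred", "limited_to_language": []},
    {"value": "Q4671", "rank": "normal",
     "limited_to_language": ["Q7798", "Q7791"]},  # Italian, Finnish
    {"value": "Q4707", "rank": "normal",
     "limited_to_language": ["Q7785"]},           # Czech
]


def value_for_language(statements: list,
                       language_item: Optional[str]) -> Optional[str]:
    """Return the value that applies to the given language item.

    A statement whose P26 qualifiers list the language wins; otherwise
    the statement without qualifiers (the default) is used. None means
    that no default is set.
    """
    default = None
    for statement in statements:
        limited_to = statement.get("limited_to_language", [])
        if language_item is not None and language_item in limited_to:
            return statement["value"]
        if not limited_to:
            default = statement["value"]
    return default


print(value_for_language(GROUP_STATEMENTS, "Q7798"))  # Italian page
print(value_for_language(GROUP_STATEMENTS, None))     # default (English)
```

Because each language qualifier may appear only once per property, the first match is unambiguous and no tie-breaking by rank is needed in this sketch.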

Meta item

There are many data items which are neither a Key nor a Tag:

OSM Concepts
element (Q9), key (Q7), tag (Q2), status (Q11), group (Q12)
Statuses of type status (Q11)
de facto (Q13), in use (Q14), approved (Q15), rejected (Q16), voting (Q17), draft (Q18), abandoned (Q19), proposed (Q20), obsolete (Q5060), deprecated (Q5061), discardable (Q7550)
Statuses of type element status (Q8010)
is allowed (Q8000), is prohibited (Q8001)
Special
OSM concept (Q10), sandbox item (Q2761)

Item Creation Process

A bot has created all significantly used keys and tags, and will continue creating these items when they are detected in the OSM database (taginfo API) or on the wiki. The bot will:

  • create an item for any key with 10+ usages if it matches ^[a-z0-9]+([-:_\.][a-z0-9]+)*$, or for any key with 1000+ usages regardless of its syntax (see talk page)
  • set the item's label to be the same as the key
  • set the item's description from the info card of the corresponding wiki page (if available, in all languages)
  • set used-by, recommended tags, implies, and any other easy-to-figure-out data from the info cards
  • not update any fields modified by a user, e.g. if the FR description has been changed by a user, the bot should not change it
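
The first rule in the list above can be expressed as a small predicate. This is only a sketch of the stated thresholds, not the bot's actual code; the function name is hypothetical, and "10+" and "1000+" are read as "at least 10" and "at least 1000".

```python
import re

# Key syntax pattern quoted from the rule above.
KEY_SYNTAX = re.compile(r"^[a-z0-9]+([-:_\.][a-z0-9]+)*$")


def bot_would_create_key_item(key: str, usage_count: int) -> bool:
    """Apply the stated creation rule to a key and its usage count."""
    if usage_count >= 1000:
        # Heavily used keys qualify regardless of their syntax.
        return True
    # fullmatch avoids accepting a key with a trailing newline.
    return usage_count >= 10 and KEY_SYNTAX.fullmatch(key) is not None


print(bot_would_create_key_item("bridge:movable", 10))
print(bot_would_create_key_item("Bridge", 500))
```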

Eventually, it would be better for OSM tools (iD, JOSM, ...) to ask the user for the metadata, and use MW API to create new items.

API access and querying

  • The easiest way for an external tool to get all the data about a key is to use this API call:
https://wiki.openstreetmap.org/w/api.php?action=wbgetentities&sites=wiki&titles=Key:bridge:movable&languages=en|fr
Use languages to filter labels and descriptions to the needed languages.
Add &format=json&formatversion=2 to get the actual JSON instead of HTML.
Due to MediaWiki limitations, the titles value should be ("Key:" + key).replace('_', ' ').trim(), with every underscore replaced, not just the first one. Use permanent key ID (P16) to get the actual form of the key, and make sure to take the "preferred" value in case more than one value is present.
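
The request described above could be assembled as in the following sketch. It only builds the URL and reads a value out of an already decoded response; no network call is made. The function names are hypothetical, and the sample entity is hand-written to mirror the usual Wikibase JSON layout (claims, mainsnak, datavalue, rank) rather than copied from the live wiki.

```python
from typing import Optional
from urllib.parse import urlencode

API_ENDPOINT = "https://wiki.openstreetmap.org/w/api.php"


def key_sitelink_title(key: str) -> str:
    """Turn an OSM key into the sitelink title expected by the API."""
    # Replace every underscore with a space, then trim the result.
    return ("Key:" + key).replace("_", " ").strip()


def build_key_query_url(key: str, languages: list) -> str:
    """Build the wbgetentities URL for one key."""
    params = {
        "action": "wbgetentities",
        "sites": "wiki",
        "titles": key_sitelink_title(key),
        "languages": "|".join(languages),
        "format": "json",
        "formatversion": "2",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def preferred_string_value(entity: dict, property_id: str) -> Optional[str]:
    """Return the preferred-rank string value, or the first one if none."""
    statements = entity.get("claims", {}).get(property_id, [])
    if not statements:
        return None
    chosen = next(
        (s for s in statements if s.get("rank") == "preferred"),
        statements[0],
    )
    return chosen["mainsnak"]["datavalue"]["value"]


# Hand-written sample with two P16 values, one marked as preferred.
SAMPLE_ENTITY = {
    "claims": {
        "P16": [
            {"rank": "normal",
             "mainsnak": {"datavalue": {"value": "water_tap_"}}},
            {"rank": "preferred",
             "mainsnak": {"datavalue": {"value": "water_tap"}}},
        ]
    }
}

print(build_key_query_url("bridge:movable", ["en", "fr"]))
print(preferred_string_value(SAMPLE_ENTITY, "P16"))
```

Note that urlencode percent-encodes the colon and the pipe character; the server decodes them back to the same parameter values as in the URL shown above.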

Quality Control

Several additional extensions are designed to validate Wikibase data and to find items that do not pass validation. These capabilities may not be installed in the first deployment stage.

Limitations

  • Wikibase's "Commons File" properties do not yet support files stored on this wiki. Instead, we use a regular string property to store the image name, and use a gadget (see your preferences) to show strings as images.
  • The sitelink in the upper right corner does not show whether the Tag:* or a Key:* page exists or not.
  • All sitelinks must use spaces instead of underscores. API sitelink search does not work otherwise. See permanent key ID (P16) and permanent tag ID (P19) for the correct value. Note that regular MediaWiki Key:* and Tag:* pages have the same issue, and use a special hack to change the title.
  • MediaWiki converts underscores to spaces and trims the result, so Key:_abc_ would become Key: abc. There is no way to have two items with sitelinks Key:_abc and Key:_abc_ -- they are treated as the same, so this fails.

See also