Data items are a way to document all OSM metadata like keys and tags in every language on this wiki in a structured way, useful to both humans and tools.
- Tools such as the iD editor and Taginfo can now get tag information without complex and error-prone parsing of the wiki markup. Eventually the data may include tag suggestions, validation rules, common pitfalls, presets, and more.
- Data consumers can get structured metadata to help process the main OSM database.
- This wiki can now show data as info cards and tables, without information duplication and complicated template hackery.
- All metadata can be analyzed using Sophox queries (see query examples).
This page documents how to store structured tag metadata on this wiki using data items provided by the Wikibase extension - the same software that runs Wikidata (initial discussion). This project's goal is NOT to replace the primary tag storage for the OSM database, nor to use opaque IDs instead of the human readable key=value strings to tag features. We are only trying to improve metadata documentation, making it more useful to various tools.
Where to find them
Data items are located in the Item: namespace. Each data item has a page title consisting of a "Q" followed by a numeric ID.
Every Key: and Tag: page has a corresponding data item. Follow the "Data item" link in the "Tools" section of the page's sidebar, or enter the page's title into Special:ItemByTitle/wiki/.
How can I help?
- Add tag descriptions and translations. See the following 3 minute video.
|description||string||en - The mechanism by which a movable bridge moves to clear the way below.
uk - Механізм, що приводить в дію рухомий міст для вивільнення шляху під ним.
|This is the primary way to describe the key using proper sentences that end with a period, and whose first word is capitalized. Must not contain any wiki markup or HTML. Must be less than 250 symbols. When translating, it is usually enough to add just the description to the item. Any key:... and tag:...=... will be automatically shown as links.|
||Label usage is still a bit undecided for the key/tag data items, so it is best not to use it for anything. For now, the bot sets the English label to the key's value, exactly the same as P16 below. Some languages have a nativekey (localized key) that was added to the labels as well. Do not add a copy of the English label to any other languages. Note that, as with "en", the localized label must be unique in that language.|
|sitelink||string||Key:bridge:movable||Links to the Key:... pages, even if the page does not exist. The sitelink is shown on top of the page on the other side as the page title.|
|instance of (P2)
||Item||key (Q7)||Indicate the type of the item. Set to Q7 for keys.|
|permanent key ID (P16)
||Shows the exact form of the key as used in OSM. Must never be changed once the item is created. Due to technical limitations, keys "Key:water tap", "Key:water_tap", and "Key:water_tap_" have identical wiki pages/sitelinks - "Key:water tap". In this case, set this property to multiple strings, but mark one as "preferred".|
|use on nodes (P33)
||Item||is allowed (Q8000)
||Sets if this key is allowed on nodes/ways/areas/relations. In the future we may want to use other statuses like approved (Q15), but this is not yet supported. See also limiting per locale below.|
|image (P28) [ image caption (P47) ]
||An image stored either on Wikimedia Commons (preferable) or on the OSM wiki, without the File: prefix. To use a different image for a specific language region, add another value and set limited to language (P26). Make sure to set preferred status for the default image. Use image caption (P47) qualifier to indicate image caption for any language (will show English if not found, or any other if EN does not exist).|
|group (P25)
||Item||bridges (Q4712)||The group this item belongs to. In the current model, each key belongs to just one group. In theory we could use it to attach multiple groups, changing the meaning of the "group" to something like a "label"/"meta-tag".|
|status (P6) [ proposal discussion (P11) ]
|The community's approval status, optionally with a reference link to the discussion page.|
|key type (P9)
||Item||well-known values (Q8)||Describes the type of values this key is expected to have. If there is a well known list of values, use Q8. Other types are TBD.|
|Wikidata concept (P12)
||Item||Q787417||A link to the Wikidata item, stored as an external ID (string). Must be a Q-number.|
|value validation regex (P13)
||A regular expression that can be used to validate the value of this key. In this case the value must be one or more digits. Validators will require the entire expression to match the string, i.e. they will add |
See population (Q574) example.
|documentation wiki pages (P31)
||multilingual string||Key:bridge:movable (English)
|Each value is the name of the wiki page in a specific language. This value is usually updated with a bot.|
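The whole-string matching described for value validation regex (P13) in the table above can be illustrated with a minimal Python sketch. The pattern [0-9]+ is an assumed stand-in for the "one or more digits" rule, and is_valid_value is a hypothetical helper name, not part of any existing OSM tool.

```python
import re

# Assumed example pattern for "one or more digits"; the real pattern is
# whatever string is stored in the item's P13 statement.
DIGITS_PATTERN = "[0-9]+"


def is_valid_value(pattern: str, value: str) -> bool:
    """Return True only if the entire value matches the pattern."""
    # re.fullmatch requires the whole string to match, so a value with
    # trailing non-matching characters is rejected.
    return re.fullmatch(pattern, value) is not None
```

With this sketch, a value such as 12345 passes, while 123abc and the empty string are rejected, because a partial match is not enough.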
For keys like Key:highway, there is a list of well-known values such as highway=residential, highway=service, highway=footway. These values are stored similarly to keys. See bridge:movable=bascule (Q888), which describes the tag bridge:movable=bascule. See all items that link to bridge:movable.
|description||string||en - A type of movable bridge, a bascule bridge contains one or two spans, one end of which is free and swings upwards. A counterweight at the pivoting end of the span or spans balances the weight as the free end rises.
pl - Most zwodzony jest to rodzaj mostu w którym co najmniej jedno przęsło jest podnoszone. Mosty zwodzone mogą być jedno- lub dwuskrzydłowe.
|This is the primary way to describe a tag using proper sentences (first word capitalized, ending with a period). Must not contain any wiki markup or HTML. Must be less than 250 symbols. When translating, it is usually enough to add just the description to the item. Any key:... and tag:...=... will be automatically shown as links.|
||Label usage is still a bit undecided for the key/tag data items, so it is best not to use it for anything. For now, the bot sets the English label to the tag's value, exactly the same as P19 below. Some languages have a nativekey=nativevalue (localized key/value) that was added to the labels as well. Do not add a copy of the English label to any other languages. Note that, as with "en", the localized label must be unique in that language.|
|sitelink||string||Tag:bridge:movable=bascule||Links to the Tag:... pages, even if the page does not exist. Sitelink is shown in the upper right corner of the item page.|
|instance of (P2)||Item||tag (Q2)||Indicate the type of the item. Set to Q2 for tags.|
|permanent tag ID (P19)||string||
||Shows the exact form of the tag as used in OSM. Must never be changed once the item is created. Due to technical limitations, tags "Tag:water tap=yes", "Tag:water_tap=yes", and "Tag:water_tap=yes_" have identical wiki pages/sitelinks - "Tag:water tap=yes". In this case, set this property to multiple strings, but mark one as "preferred" (small up arrow left of value).|
|key for this tag (P10)||Item||bridge:movable (Q104)||Every tag item links to the corresponding key item, making it easier to query and validate.|
Tags may also have the following properties: use on nodes (P33), use on ways (P34), use on areas (P35), use on relations (P36), image (P28), group (P25), status (P6), value validation regex (P13), documentation wiki pages (P31). See their descriptions in the Tag Key section above.
|description||string||...||This is the primary description of the relation, using proper sentences (first word capitalized, ending with a period). Must not contain any wiki markup or HTML. Must be less than 250 symbols. When translating, it is usually enough to add just the description to the item. Any key:... and tag:...=... will be automatically shown as links.|
||Short relation description. Do not add a copy of the English label to any other languages.|
|sitelink||string||Relation:restriction||Links to the corresponding Relation:... wiki page, even if the page does not exist. Sitelink is shown in the upper right corner of the item page.|
|instance of (P2)||Item||relation type (Q6)||Indicate the type of the item. Must be set to Q6 for relations.|
|permanent relation type ID (P41)
||Shows the exact form of the relation ID as used in OSM. Must never be changed once the item is created. Due to technical limitations, sitelinks "Relation:destination sign", "Relation:destination_sign", and "Relation:destination_sign_" have identical wiki sitelink - "Relation:destination sign". In this case, set this property to multiple strings, but mark one with a preferred rank.|
|tag for this relation type (P40)
||Item||type=restriction (Q16013)||Every relation item links to the corresponding type=* tag item, making it easier to query and validate.|
Members of a relation can be labeled with "roles", e.g. "inner" and "outer" ways in the multipolygon relation. Each role for each relation type has its own data item. For an example, see boundary=admin_centre (Q16060).
|description||string||...||This is the primary description of the relation role, using proper sentences (first word capitalized, ending with a period). Must not contain any wiki markup or HTML. Must be less than 250 symbols. When translating, it is usually enough to add just the description to the item. Any key:... and tag:...=... will be automatically shown as links.|
||Short relation description. Do not add a copy of the English label to any other languages.|
|sitelink||string||Relation:boundary=admin center||Links to the Relation:<relation>=<role> wiki page, even if the page does not exist. If the role is empty, use Relation:relation= form. If the role has a variable portion, e.g. route=platform:<number>, set sitelink to the fixed part -- Relation:route=platform:, and use value validation regex (P13) to validate the variable part. Sitelink is shown in the upper right corner of the item page.|
|instance of (P2)||Item||relation member role (Q4667)||Indicate the type of the item. Must be set to Q4667 for the relation member roles.|
|relation role ID (P21)
||Shows the exact form of the relation role ID as used in OSM. Must never be changed once the item is created. Due to technical limitations, sitelinks "Relation:boundary=admin_center", "Relation:boundary=admin center", and "boundary=admin_center_" have identical wiki sitelink - "Relation:boundary=admin center". In this case, set this property to multiple strings, but mark one with a preferred rank.|
|belongs to relation type (P43)
||Item||boundary relation (Q16019)||Every relation member role links to the corresponding relation item, making it easier to query and validate.|
|value validation regex (P13)
||A regular expression that can be used to validate the variable part of the role. In this case the value must be one or more digits, e.g. for the route=platform:<number> role. Validators will convert regex expression into the |
Relation member roles may also have the following properties: use on nodes (P33), use on ways (P34), use on areas (P35), use on relations (P36), image (P28), group (P25), status (P6), documentation wiki pages (P31). See their descriptions in the Tag Key section above.
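For roles with a variable portion, such as the route=platform:<number> role described in the table above, a validator could split the role into its fixed part and its variable part. The following Python sketch shows one possible way to do that. The function name role_matches and the pattern [0-9]+ are assumptions made for illustration, not an existing implementation.

```python
import re


def role_matches(role: str, fixed_prefix: str, variable_regex: str) -> bool:
    """Check a role against a fixed prefix plus a regex for the remainder."""
    # The fixed part corresponds to what the sitelink stores,
    # e.g. "platform:" for Relation:route=platform:.
    if not role.startswith(fixed_prefix):
        return False
    variable_part = role[len(fixed_prefix):]
    # The variable part must match the P13 regex in its entirety.
    return re.fullmatch(variable_regex, variable_part) is not None
```

For example, role_matches("platform:12", "platform:", "[0-9]+") accepts the role, while a bare "platform:" or "platform:1a" is rejected.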
Storing Geographical Differences
A phone booth looks very different depending on the geographical region, e.g. a country. To indicate that an image, or any other value of the data item, is specific to a location, use the limited to region (P48) qualifier with a geographical region item.
Storing Locale Differences
Tag:... pages in different languages tend to have mismatching parameters like status, group, or the types of elements the tag should be used on. While some mismatches are deliberate results of a careful local community evaluation (see noexit (Q501)), many others are simply stale and need to be fixed, or possibly removed from the template's parameters so that the template uses the underlying data item.
All locale differences are stored using limited to language (P26). The value with no qualifiers is the default. It should have the preferred rank, but it is OK to keep normal rank when there are no other values for the property. All language-specific values must use limited to language (P26) and have normal rank. Each value must be used only once, possibly with multiple qualifier values (e.g. a property access:lhv (Q33) can have only one is allowed (Q8000) and one is prohibited (Q8001)). Each language qualifier can only be used once for the whole property. Language must not be listed if it is the same as the default. If there is no value without qualifiers, it means that the default is not set (e.g. English page has no onRelation= parameter).
|group (P25)||bridges (Q4712)||no qualifiers||This value is used for English page and all other language pages except those explicitly listed below.|
|properties (Q4671)||limited to language (P26)||This value is only used for the Italian and Finnish pages.|
|placement (Q4707)||limited to language (P26)||This value is only used for the Czech pages.|
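The lookup rule implied by the table above (use the language-specific value if the language is listed, otherwise fall back to the default) can be sketched in Python. The data structure below is a simplified stand-in invented for this example, not the actual Wikibase JSON format, and the language codes it, fi, and cs are assumed to be the codes used for the Italian, Finnish, and Czech pages.

```python
from typing import Optional

# Simplified stand-in for the statements of group (P25) shown above.
# "languages" holds the limited to language (P26) qualifier values;
# an empty list marks the default value (no qualifiers).
GROUP_STATEMENTS = [
    {"value": "Q4712", "languages": []},            # bridges (default)
    {"value": "Q4671", "languages": ["it", "fi"]},  # properties
    {"value": "Q4707", "languages": ["cs"]},        # placement
]


def value_for_language(statements: list, language: str) -> Optional[str]:
    """Return the value that applies to the given language, if any."""
    default = None
    for statement in statements:
        if language in statement["languages"]:
            # A language-specific value always wins over the default.
            return statement["value"]
        if not statement["languages"]:
            default = statement["value"]
    # None means that no default is set for this property.
    return default
```

Because each language qualifier can only be used once per property, the first language-specific match is the only one, so the order of statements does not affect the result.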
There are many data items which are neither a Key nor a Tag:
- OSM Concepts
- element (Q9), key (Q7), tag (Q2), status (Q11), group (Q12)
- Statuses of type status (Q11)
- de facto (Q13), in use (Q14), approved (Q15), rejected (Q16), voting (Q17), draft (Q18), abandoned (Q19), proposed (Q20), obsolete (Q5060), deprecated (Q5061), discardable (Q7550)
- Statuses of type element status (Q8010)
- is allowed (Q8000), is prohibited (Q8001)
- OSM concept (Q10), sandbox item (Q2761)
Item Creation Process
A bot has created all significantly used keys and tags, and will continue creating these items when they are detected in the OSM database (taginfo API) or on the wiki. The bot will:
- create an item for any key with 10+ usages if it matches ^[a-z0-9]+([-:_\.][a-z0-9]+)*$ (i.e. a sequence of one or more words separated by single dashes, colons, underscores or periods, where words contain only lower case English letters and numbers), or for any key with 1000+ usages regardless of the key syntax (see talk page)
- set item's label to be the same as the key
- set item's description from the corresponding wiki page's info card (if available, from all languages)
- set used-by, recommended tags, implies, and any other easy-to-figure-out data from the info cards.
- NOT update any fields modified by a user, e.g. if the French description has been changed by a user, the bot should not change it.
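The key creation rule from the list above can be expressed as a short Python check. This is only a sketch of the stated thresholds and regular expression. The function name qualifies_for_item is hypothetical, and the bot's actual source code may differ.

```python
import re

# The key syntax pattern quoted in the creation rules above.
KEY_SYNTAX = re.compile(r"^[a-z0-9]+([-:_\.][a-z0-9]+)*$")


def qualifies_for_item(key: str, usage_count: int) -> bool:
    """Apply the documented thresholds for creating a key item."""
    # Keys with 1000+ usages qualify regardless of their syntax.
    if usage_count >= 1000:
        return True
    # Otherwise the key needs 10+ usages and must match the pattern.
    return usage_count >= 10 and KEY_SYNTAX.fullmatch(key) is not None
```

For instance, bridge:movable with 10 usages qualifies, while a key with a doubled or trailing underscore only qualifies once it reaches 1000 usages.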
Eventually, it would be better for OSM tools (iD, JOSM, ...) to ask the user for the metadata, and use the MediaWiki API to create new items.
Item Deletion Process
Data items can be deleted manually by administrators. They might delete a Data Item if:
- there is no regular wiki page here (aside from discussions or user pages) that describes a key/tag/rel/... and
- there is no proposal associated with the Data Item and
- the Data Item does not qualify for creation according to the Item Creation Process.
API access and querying
- The easiest way for an external tool to get all the data about a key is to use this API call:
- Use languages to filter labels and descriptions to the needed languages.
- Add &format=json&formatversion=2 to get the actual JSON instead of HTML.
- Due to MediaWiki limitations, the titles value should be ("Key:" + key).replace('_', ' ').trim(). Use permanent key ID (P16) to get the actual format of the key. Make sure to get the "preferred" value, just in case more than one value is present.
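The points above can be combined into a small Python sketch that normalizes the title, builds a query URL, and picks the preferred permanent key ID (P16) value from a response. Several details here are assumptions not confirmed by this page: the endpoint URL, the use of the standard Wikibase wbgetentities action, the site ID wiki, and the shape of the claims JSON (taken from the usual Wikibase JSON layout). All function names are hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumed endpoint; adjust if the wiki's API lives elsewhere.
API_ENDPOINT = "https://wiki.openstreetmap.org/w/api.php"


def sitelink_title(key: str) -> str:
    """Normalize a key into the title form described above."""
    return ("Key:" + key).replace("_", " ").strip()


def build_query_url(key: str, languages: list) -> str:
    """Build a wbgetentities URL for one key (assumed parameters)."""
    params = {
        "action": "wbgetentities",  # assumed Wikibase API action
        "sites": "wiki",            # assumed site ID of this wiki
        "titles": sitelink_title(key),
        "languages": "|".join(languages),
        "format": "json",
        "formatversion": "2",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def preferred_key_id(entity: dict) -> Optional[str]:
    """Return the P16 value with preferred rank, else the first value."""
    statements = entity.get("claims", {}).get("P16", [])
    preferred = [s for s in statements if s.get("rank") == "preferred"]
    chosen = preferred or statements
    if not chosen:
        return None
    # Assumes the standard Wikibase layout for string values.
    return chosen[0]["mainsnak"]["datavalue"]["value"]
```

The sketch only builds the URL and leaves the HTTP request to the caller, so it can be used with whichever HTTP client a tool already depends on.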
Tracking changes to data items
To track changes to data items, you can add a data item to your watchlist like any other page on the wiki. You can also configure your watchlist to automatically include changes to the data item associated with any wiki page you are watching, by opening the "Filter changes (use menu or search for filter name)" dropdown and checking the "Data item edits" checkbox. To make these changes permanent, click the button or check the "Show data item edits in your watchlist" checkbox in your watchlist preferences.
Changes to data items are included in Special:RecentChanges, Special:RecentChangesLinked, and Special:Watchlist by default. To filter out all edits to data items, click the Namespaces button in the filter panel, check "Item", and click "Exclude selected". To filter out changes to labels, descriptions, or aliases but include changes to statements, click the Tags button in the filter panel, check "Data item terms", and click "Exclude selected". (This latter filter is useful for ignoring translations and is implemented via an "abuse filter" and corresponding tag.)
To make any of these changes permanent, click the button.
There are several additional extensions designed to validate Wikibase data, and find items that do not pass validation. Installing such capabilities may not be done in the first deployment stage.
Limitations and known issues
- Wikibase's "Commons File" properties do not yet support files stored on this wiki. Instead, we use a regular string property to store the image name, and use a gadget (see your preferences) to show strings as images.
- The sitelink in the upper right corner does not show whether the Tag:* or a Key:* page exists or not.
- All sitelinks must use spaces instead of underscores. API sitelink search does not work otherwise. See permanent key ID (P16) and permanent tag ID (P19) for the correct value. Note that regular MediaWiki Key:* and Tag:* pages have the same issue, and use a special hack to change the title.
- MediaWiki removes spaces/underscores from the key, so Key:_abc_ would become Key: abc. There is no way to have two items with sitelinks Key:_abc and Key:_abc_ -- they are treated as the same, and fail.
- Data item titles and talk page titles are not human readable, e.g. Item:Q5007 vs Tag:amenity=shelter
- Item "Q" numbers collide with those used on the main Wikidata site (e.g. Wikidata Q5007 vs data item Q5007). Despite numerous attempts by the OSM community, and a working implementation by Yurik, the fix for this issue was declined by the maintainers of the Wikibase software.
- The Wikibase software and the Wikidata project are sometimes confused with these data items, which in the past have been called "the wikibase" or "wiki data items" by some users.
- Adding a data item to the Special:Watchlist will generate more watchlist notifications, including for all language translations. There are ways to work around it (TODO: add watchlist instructions).
- Many data items were created by bot in large batches of several hundred in one day, including for all tags with more than 10 uses and a certain format. Many of these items only include the key=value and no additional information. This makes it difficult for wiki users to review the new items.
- Because of this, many experienced wiki users and mappers are not following the data item pages, so any mistakes are less likely to be fixed as quickly as mistakes on Tag and Key wiki pages.
The current data item editing experience should be improved, especially in these areas:
- Most data items were created by bot, and are still updated by bot, unless a human user has edited the item. The bot source code is available, but it is not yet documented.
- Editing a data item should be made simpler by a direct edit button on the key/tag/relation page, without navigating to the data item page itself. The data item editor can already be enabled in user preferences, but there is more work to be done in polishing this feature.
- Currently there is no bulk-editing interface available, other than writing a bot. We should enable the Quick Statements tool to simplify such operations.
- The current system of having the tag description in two places (wiki page and data items) has been creating some problems. The difference is shown by a small icon, and can also be queried in Sophox (TODO: add query link here).
- It is not currently possible to copy the data item content, edit it outside the browser as text, and copy it back, or to make multiple changes to a data item at one time.