User:Joto/How to invent tags
OpenStreetMap lives because people can and do invent new tags all the time. But inventing tags that are easy to understand and easy to use, for mappers and users of the data alike, is difficult. Over the years we have gained experience with what works and what doesn't. This page is my personal list of dos and don'ts. Maybe it can be one input for a general "tagging best practice" wiki page that will capture the community consensus. You are welcome to edit this page to fix typos etc., but please leave larger changes for me (Jochen). Feel free to add comments to the discussion page.
This page is meant to help you when inventing tags, not restrict you. There are exceptions to all the rules. People reading this page can learn from the experience of the community and then form their own opinion.
This page reflects my current thinking on these issues. I have changed my mind about some of these things in the past and I reserve the right to do so in the future.
There is no particular order to the following list.
- 1 The principle of least surprise
- 2 Characters used
- 3 Do think about the mappers
- 4 Do think about the users
- 5 Do not invent tags you are not using yourself
- 6 Do use descriptive names
- 7 Do not use abbreviations and acronyms
- 8 Do not use generic words for keys
- 9 Separate common concepts from specialized ones
- 10 Every tag should stand on its own
- 11 Do not re-define existing tags
- 12 Do not build elaborate hierarchies
- 13 Do not be afraid of leaving things for later
- 14 Use value yes as unspecific case
- 15 Take care with defaults
- 16 Find out what's there already
- 17 See also
The principle of least surprise
Keep the Principle of least surprise in mind.
Characters used
If you are inventing a tag that will be used worldwide, stick with lowercase Latin characters (a-z) and the underscore (_) for keys. You can also use the colon (:) as a special separation character. If you have a more or less definite list of values (like the values for the highway=* tag), you should restrict those values to the same list of characters if possible. Of course, for keys like name=* no such constraint exists.
If you are inventing a tag that will be used in one country, region, or in one language only, feel free to use other characters specific to that area.
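The character rule for worldwide keys is simple enough to check mechanically. The following is only a sketch: the function name `is_portable_key` and the exact pattern are my own assumptions for illustration, not an official OSM rule. The pattern encodes "lowercase a-z and underscore, with single colons as separators".

```python
import re

# Assumed pattern (not an official rule): one or more parts made of
# a-z and underscore, separated by single colons.
PORTABLE_KEY = re.compile(r"[a-z_]+(?::[a-z_]+)*")


def is_portable_key(key: str) -> bool:
    """Return True if the key uses only a-z, '_' and ':' as a separator."""
    return PORTABLE_KEY.fullmatch(key) is not None
```

With this sketch, keys such as hiking_trail or addr:housenumber pass, while Name or max-speed do not. Keys meant for one region or language would need a wider pattern.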
Do think about the mappers
Think about what the mappers will do with your tag. Where do they get the information they need? Separate information that is easy to obtain from information that is difficult to obtain. Separate information that always exists from information that is optional.
- Example: Power lines. Most mappers will not know what voltage a power line has. But mapping a power line makes sense even if you don't know what voltage it has. So there should be one tag for the power line itself, and possibly another one for the voltage which can be added by those people who know more.
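A data user can then treat the extra detail as optional. The sketch below assumes the tags power=line for the line itself and voltage=* (in volts) for the detail; the function name `describe_power_line` is hypothetical and only there for illustration.

```python
from typing import Dict, Optional


def describe_power_line(tags: Dict[str, str]) -> Optional[str]:
    """Describe a power line; the voltage is optional extra detail."""
    # Assumption: power=line marks the line, voltage=* holds volts.
    if tags.get("power") != "line":
        return None
    voltage = tags.get("voltage")
    if voltage is None:
        # The element is still useful without the hard-to-get detail.
        return "power line (voltage unknown)"
    return f"power line ({voltage} V)"
```

The point is that a mapper who only knows "there is a power line" can contribute a complete, usable element, and somebody else can add the voltage later.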
Do think about the users
Think about how your tags will be used. What kind of maps will use them? What kind of programs will be needed to extract this information from OSM and bring it into a usable form? If you are unsure, ask others who have done these things before. Sometimes tags are easy for a human to understand but very difficult for a computer to use, especially with the imperfect programs we currently have.
Do not invent tags you are not using yourself
You should invent tags for the things you want to map or the data you want to use. Do not invent tags just for the sake of it. If you have a concrete problem you are trying to solve, you are better qualified to invent tags that actually make sense.
Do use descriptive names
Use something like "hiking_trail" that is more descriptive than "trail". This helps people understand the tag without reading the documentation. Descriptive names are often longer, but that doesn't matter. Most tags are not typed in these days anyway. Either you have auto-completion or you are using buttons and menus in your editor.
Do not use abbreviations and acronyms
You should not use abbreviations and acronyms in tag names. It might be obvious to you what they mean, but it will not be obvious to everybody. A long name is also much easier to search for.
Do not use generic words for keys
Do not use generic words like id, type, or class for keys. Other people do that too and you can never figure out what was meant. The "class" of a restaurant is different from the "class" of a road. Instead use descriptive and specific tags.
- The type=* tag on relations is an exception, because it describes a very general concept of how a relation is to be interpreted.
Separate common concepts from specialized ones
A "name" is something common to many things. It's an arbitrary text we humans use to call things. There is nothing fundamentally different between the name of a street and the name of a restaurant. On the other hand, a road number (such as "M2" for motorway 2) differs in many ways from the number of, say, a bus line. Both "name" the thing in a way, but they are given out by different organisations, they have a different structure, and they are used in different ways. It might make sense to just put a label on the map with the name of any old thing, even if you don't know what the thing is. But I wouldn't want the bus line number to appear on a map looking like a motorway shield.
Of course, there is only a gradual difference here. Try to think about the different uses of a tag in different contexts and see whether they are more alike or more different. The most common uses should inform the decision whether to use a common tag for different things or different tags.
Every tag should stand on its own
Ideally tags should be understandable on their own without any other tags on the same element. That makes them much easier to understand and easier to use.
Do not re-define existing tags
Existing tags that are already in widespread use should not change their meaning. You will break existing applications and confuse the mappers. Sometimes this leads to results that might look ugly and unsystematic, but it makes sense, because it often means that very common tags and uses are given priority.
- Example: The highway=* tag had been in use for a long time when the question arose of how to handle roads under construction. The systematic approach would have been to tag those roads with their normal tag (like highway=residential) and add a key like construction=yes.
But this would have broken the common case: most use cases don't care about highways under construction, because they are not yet usable as highways. Instead, the tag combination highway=construction with construction=residential was used.
The common case did not have to change, at the price of a more complex special case.
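To see why this keeps the common case working, consider a consumer that only knows a fixed list of highway values. The sketch below is an illustration under my own assumptions: the set `USABLE_HIGHWAYS` is a small made-up subset and `usable_roads` is a hypothetical helper, not code from any real router or renderer.

```python
from typing import Dict, List

# Illustrative subset only; a real application would know more values.
USABLE_HIGHWAYS = {"motorway", "primary", "residential"}


def usable_roads(elements: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only elements whose highway value is a known usable road type."""
    return [e for e in elements if e.get("highway") in USABLE_HIGHWAYS]


roads = [
    {"highway": "residential", "name": "Finished Street"},
    # Under construction: the highway value itself changes, so older
    # code that has never heard of construction skips it automatically.
    {"highway": "construction", "construction": "residential"},
]
finished = usable_roads(roads)
```

Here `finished` contains only the first element. Had the second road been tagged highway=residential plus construction=yes, this code would have treated it as a usable road until somebody updated it.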
Do not build elaborate hierarchies
Do not build hierarchies of tags, for instance "plant=tree", "tree=conifer", "conifer=fir_tree", "fir_tree=douglas_fir". Hierarchies might seem logical to you, but they often are not. Other people use different ways of classifying objects into hierarchies. And it's harder to use the data, because you have to look through all tags to get usable information.
Instead, make sure all tags work on their own: with "tree=douglas_fir" you don't need "plant=tree", because the key "tree" already carries that information. You might still need a tag for the "conifer" or "fir_tree" concepts, but we leave that to the botanists.
Note that this is a fictional example. I am not suggesting that we should use those tags.
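The extra work for data users can be sketched with the same fictional tags. Both function names are hypothetical; the first has to walk the chain of keys, the second is a single lookup.

```python
from typing import Dict, Optional


def species_from_hierarchy(tags: Dict[str, str], root: str = "plant") -> Optional[str]:
    """Follow key -> value -> key ... until the chain ends."""
    key = root
    value = None
    seen = set()
    # The 'seen' set guards against cycles such as a=b, b=a.
    while key in tags and key not in seen:
        seen.add(key)
        value = tags[key]
        key = value
    return value


def species_flat(tags: Dict[str, str]) -> Optional[str]:
    """With a self-contained tag, one lookup is enough."""
    return tags.get("tree")


hierarchical = {
    "plant": "tree",
    "tree": "conifer",
    "conifer": "fir_tree",
    "fir_tree": "douglas_fir",
}
flat = {"tree": "douglas_fir"}
```

Both `species_from_hierarchy(hierarchical)` and `species_flat(flat)` give "douglas_fir", but the hierarchical version only works if every mapper filled in every level and agreed on the same classification.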
Do not be afraid of leaving things for later
OSM tags evolve over time. That's how it works. You don't have to take every eventuality into account when creating your tags. You can leave something for the next round.
Use value yes as unspecific case
When you have a key and a list of specific values for that key, you often need a value for the "unknown" case (for instance, there is building=hut and building=house, but sometimes you just want to say it's a building of unspecified type). In OSM, yes is used most often in those cases, and you should do the same (building=yes).
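For a data user, this convention makes the generic question ("is it a building?") independent of the specific one ("what kind?"). A minimal sketch, with a hypothetical helper name `building_info`:

```python
from typing import Dict, Optional, Tuple


def building_info(tags: Dict[str, str]) -> Tuple[bool, Optional[str]]:
    """Return (is_building, specific_type); the type is None when unspecified."""
    value = tags.get("building")
    if value is None:
        return (False, None)
    if value == "yes":
        # "yes" means: it is a building, the type is simply not given.
        return (True, None)
    return (True, value)
```

A renderer that only draws building outlines can ignore the second element entirely, while a more specialised user can look at it when it is present.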
Take care with defaults
Defaults should be explicit and not change between values.
- Example: Highways are generally not one-way. So you don't need to tag every highway=* with oneway=no. But some people argue that highway=motorway is an exception and that it implies oneway=yes. That's confusing. Try to avoid this situation.
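A default that does not change between values keeps the reading code trivial. The sketch below shows that idea only; it is not how any particular router handles motorways. The helper `is_oneway` is hypothetical and, as a simplification, treats only the value yes as one-way.

```python
from typing import Dict


def is_oneway(tags: Dict[str, str]) -> bool:
    """One default (not one-way) for every highway value."""
    value = tags.get("oneway")
    if value is None:
        # Same default regardless of highway=*; no per-value exceptions.
        return False
    # Simplification: only "yes" counts as one-way in this sketch.
    return value == "yes"
```

Under this rule a motorway carriageway has to be tagged oneway=yes explicitly. A default that depends on the highway value would instead need a lookup table that every data user has to know and keep in sync.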