Semi-colon value separator: Difference between revisions

From OpenStreetMap Wiki

Revision as of 18:19, 4 December 2010

We use a semi-colon value separator (the ';' character) in our tag values in some situations. This can be necessary when a single element needs to take multiple values for the same key.

Examples

nat_ref=B500;B550 for a section of road that is designated both B500 and B550. You would only do this if the same section of road takes both ref values. (These are places where authorities have screwed up their numbering systems. Quite rare, although surprisingly common on U.S. interstates!) Note: If there is any point on this section of road where you move from one ref to the other, then this is not the correct approach. Instead you would place a node and split the way at that point.

Some more minor "properties" tags lend themselves to taking multiple values. For example when mapping a car shop you can add a tag: service=dealer;tyres;repair. If this is more than just an infrequent possibility, then you would expect the tag documentation to give an example illustrating this approach, as you see on the Tag:shop=car page.

When NOT to use a semi-colon value separator

In general avoid ';' separated values whenever possible. Don't use them in your mapping, and don't propose them on the wiki if there are better ways of representing things. Why?...

The whole idea of sticking semi-colons into tag values is very contrary to the aim of keeping it simple. Tags are supposed to be dead easy for mappers:

  • Easy for new mappers to learn how to use (not just 20-something male geeks; think mothers and five-year-old children).
  • Easy for people to read, and to type quickly while mapping.
  • Easy for people to type on mobile device touch screens.

Tags are also supposed to be dead easy for data users to work with. In truth this is not the case. There are many aspects of OpenStreetMap tagging which make life very difficult for data users, and the semi-colon value separator is one of them. For the sake of anyone trying to use the data (people building software for rendering, searching, "find my nearest cafe" mobile apps, etc.) we should minimise use of values with special characters.

It is particularly important to (wherever reasonably possible) avoid ';' separated values in more important "top-level" tags. That is, tags which define what an element is. In situations where you have multiple values, there are normally a couple of alternative approaches:

  • Choose one of the values: Take the overriding "primary" value, and go with that. Example: You're mapping something which is a cafe but also a bar. It's much more helpful to just pick amenity=cafe or amenity=bar (look at the cafe/bar, and make a choice. Is it primarily a cafe, or primarily a bar?). It is not a good idea to map it as amenity=cafe;bar.
  • Split the element: Separate things out into separate nodes/ways to allow them to be tagged separately with normal tags. This is a good option where things are in separate spatial locations anyway. Example: You're mapping a library which has a cafe inside it. Place a node for the cafe, and then represent the library (a larger building) either as an area, or as a separate node at the library's own centrepoint. It is not a good idea to map it as amenity=library;cafe.

In both examples, if you use ';' in the amenity value, then that isn't going to show up in a "find my nearest cafe" mobile app any time soon. Even though it is entirely possible for systems to parse the value, and split it by the ';' character, most existing systems don't, and most systems probably aren't ever going to.
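To make the difference concrete, here is a minimal sketch of what a data consumer would have to do. The helper names tag_values and has_value are made up for this illustration and do not come from any real OSM library; the tags are assumed to be held in a plain dictionary.

```python
def tag_values(raw):
    """Split a raw tag value on ';' and return the non-empty parts."""
    return [part for part in raw.split(";") if part]


def has_value(tags, key, wanted):
    """Return True if `wanted` is one of the ';'-separated values of tags[key]."""
    return wanted in tag_values(tags.get(key, ""))


plain = {"amenity": "cafe"}
combined = {"amenity": "cafe;bar"}

# An exact-match lookup finds the plain value but misses the combined one.
print(plain["amenity"] == "cafe")     # True
print(combined["amenity"] == "cafe")  # False

# Splitting on ';' first finds the cafe in both cases.
print(has_value(plain, "amenity", "cafe"))     # True
print(has_value(combined, "amenity", "cafe"))  # True
```

The extra step is small, but every data consumer has to know to take it, for every key where it might matter.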

Quirks and inconsistencies

Other character separators?

There are ongoing debates about various aspects of tag value separators, but we have more or less agreed that the separator character is a semi-colon ';'. You may see other characters being used to separate values. In the past people have suggested, for example, "/" (solidus), " " (space), "-" (hyphen), or "#" (number sign). The semicolon is now widely accepted as the character to use, and is supported by Potlatch and JOSM (these editors will automatically put in ';' chars when merging elements).
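As a rough illustration of what merging with ';' could look like, here is a short sketch. It is not the actual JOSM or Potlatch code; the function merge_tags and its behaviour (keep the first element's order, skip duplicates) are assumptions made for this example only.

```python
def merge_tags(a, b, sep=";"):
    """Merge two tag dictionaries, joining differing values for the same key."""
    merged = dict(a)
    for key, value in b.items():
        if key not in merged:
            # Key only exists on the second element: copy it over.
            merged[key] = value
            continue
        # Key exists on both: append any values we do not already have.
        parts = merged[key].split(sep)
        for part in value.split(sep):
            if part not in parts:
                parts.append(part)
        merged[key] = sep.join(parts)
    return merged


way_1 = {"highway": "primary", "nat_ref": "B500"}
way_2 = {"highway": "primary", "nat_ref": "B550"}

print(merge_tags(way_1, way_2))
# {'highway': 'primary', 'nat_ref': 'B500;B550'}
```

Identical values are left alone, so only the keys that really differ end up with a ';' in them.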

Space character padding

Should it be...

service=dealer;tyres;repair

or

service=dealer; tyres; repair ?

In a lot of tag documentation we show semi-colon separated values without any additional spacing. However, it is very common to add a space character after each of the ';' characters. JOSM and Potlatch (both versions) are currently inconsistent with each other in their approach to automatic value separating.
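Since both forms exist in the data, anyone parsing these values is probably best off tolerating both. A minimal sketch, assuming we simply strip whitespace around each part (split_values and normalise are hypothetical helper names, not part of any editor or library):

```python
def split_values(raw):
    """Split on ';' and drop surrounding whitespace and empty parts."""
    return [part.strip() for part in raw.split(";") if part.strip()]


def normalise(raw, padded=False):
    """Rewrite a multi-value string in one consistent style."""
    sep = "; " if padded else ";"
    return sep.join(split_values(raw))


print(split_values("dealer;tyres;repair"))    # ['dealer', 'tyres', 'repair']
print(split_values("dealer; tyres; repair"))  # ['dealer', 'tyres', 'repair']
print(normalise("dealer; tyres; repair"))     # dealer;tyres;repair
```

Note that stripping assumes leading and trailing spaces are never a meaningful part of a value.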

Escaping with ';;'

It has been suggested that, if a semi-colon exists in the actual value of the data, mappers should enter it as two consecutive semicolons ';;'. This is an "escape character" approach used in computer programming and data formats [1]. The implication is that people developing parsing systems should take ';;' into account. In practice this situation crops up approximately never, and so is supported by approximately zero parsing systems.
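For anyone who does want to take ';;' into account, a parser along the following lines would do it. This is only a sketch of the suggestion as described above; split_escaped and join_escaped are hypothetical names, and the tag value used is invented for the example.

```python
def split_escaped(raw):
    """Split on single ';' characters, treating ';;' as a literal ';'."""
    values, current = [], []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == ";":
            if i + 1 < len(raw) and raw[i + 1] == ";":
                # Doubled semi-colon: keep one literal ';' in the value.
                current.append(";")
                i += 2
                continue
            # Single semi-colon: end of the current value.
            values.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    values.append("".join(current))
    return values


def join_escaped(values):
    """Join values with ';', doubling any ';' inside a value."""
    return ";".join(value.replace(";", ";;") for value in values)


print(split_escaped("Smith;; Jones;bar"))     # ['Smith; Jones', 'bar']
print(join_escaped(["Smith; Jones", "bar"]))  # Smith;; Jones;bar
```

One consequence of this scheme is that an empty value in the middle of a list can no longer be expressed: 'a;;b' is read as the single value 'a;b'. A run of three semi-colons is also ambiguous, and this sketch reads the first two as the literal.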