Talk:Semi-colon value separator

From OpenStreetMap Wiki

Discuss Semi-colon value separator here:

find my nearest cafe

Why can't "find my nearest cafe" find something tagged as amenity=cafe;bar? If that is the case, then the *semi-colon is NOT a value separator*, because it does not separate the value into individual ones. --Jakubt 11:23, 2 July 2011 (BST)
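To make the question concrete, here is a minimal Python sketch of the difference between a consumer that compares the raw value and one that splits on ';' first. The function names are hypothetical and are not taken from any real search tool.

```python
def matches_exact(tags, key, wanted):
    """Naive consumer: compares the whole raw value with the wanted one."""
    return tags.get(key) == wanted


def matches_split(tags, key, wanted):
    """Consumer that treats ';' as a separator and trims whitespace."""
    raw = tags.get(key, "")
    return wanted in [part.strip() for part in raw.split(";")]


tags = {"amenity": "cafe;bar"}
print(matches_exact(tags, "amenity", "cafe"))  # False
print(matches_split(tags, "amenity", "cafe"))  # True
```

A search that only does the first kind of comparison will miss amenity=cafe;bar, which is the behaviour described above.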

why not multiple tags instead

What is the (technical) reason for not using a tag with the same key several times? Imagine using


instead of


It seems to solve the inability of the semicolon to act as a separator in searches, and it also greatly simplifies usage for non-geeks.

-- User:Jakubt 10:23, 2 July 2011

Why would such a simple, straightforward approach be unfeasible? --solitone 21:17, 20 November 2012 (UTC)
Well, it's not unfeasible, except that it would be a low-level change to the OpenStreetMap data representation, database and API, which would mean all data consumers and editors would need code changes to work with it. Changes like this can be made, but they can only be rolled out as a new numbered version of the OSM API, e.g. "API 0.7" (not likely to happen any time soon).
Should such a change be considered for "API 0.7"? Maybe. It solves this particular niggling data problem, but it has consequences for people trying to understand OpenStreetMap (how to use the data, or how to add tags in an editor). It introduces a new thing mappers can do wrong in the editor (accidentally adding the same key twice) and a new set of tag formulations for data consumers to contend with (though probably easier than contending with stuffed-in ';' characters).
I wouldn't really describe this as a simple straightforward change.
-- Harry Wood 23:58, 20 November 2012 (UTC)
How about numbering the tags and adding a 'tag:multi' tag whose value says how many values of this tag should be considered?
(E.g. amenity:multi=2, amenity1=cafe, amenity2=bar.) This way the renderer has a chance to render multiple tags next to each other. This is especially useful for merged highways with multiple 'ref' values.
And no API change necessary.
--Mixmaxtw (talk) 06:37, 12 April 2013 (UTC)
Yeah, so obviously you're now just suggesting another tagging scheme (fitting within the way the API/data format currently works). Did you mean to start a new heading for that idea? In the discussion below they're also suggesting bunging in numbers. -- Harry Wood (talk) 05:06, 13 April 2013 (UTC)

multiple values for tags that are somehow connected to each other

This does not fix the problem of having multiple values for tags that are somehow connected to each other. Imagine you want to tag the opening hours of a bank where the ATM has different opening hours than the bank itself. Or a road where different speed limits apply at different hours or for different vehicles. I would rather see a way to connect such tags, such as numbered namespaces, for example tagging the bank with amenity:0=bank, opening_hours:0=Mo-Fr 08-17:00, amenity:1=atm, opening_hours:1=Mo-Su 06:00-22:00. --Candid Dauth 00:41, 21 March 2009 (UTC)

You gave me an idea for the multiple tags problem: use subscripts. For example, amenity[0]=bank, amenity[1]=atm, amenity[2]=... No subscript on a tag implies tag[0]. It's not as elegant as the multiple tags as proposed above, but the idea might be useful somewhere.--Elyk 02:46, 21 March 2009 (UTC)
This could solve currently unmappable limits that only apply to specified vehicles as well. For example, near my home there is a road that limits the weight to 3.5 tonnes, except for agricultural vehicles. With some “for” or similar tag, we could limit a limit to specified vehicles, in this example maxweight[0]=3.5; maxweight[1]=-1; for[1]=agricultural (presuming -1 means “no limit”). Another example, on the Autobahn 8, there is a segment where between 6 and 20 o’clock, maxspeed is limited to 100 km/h, and 60 km/h for HGVs. Currently, this is unmappable, but with “tag arrays” this would be easy: maxspeed[0]=100; hour_on[0]=6; hour_off[0]=20; maxspeed[1]=60; for[1]=hgv; hour_on[1]=6; hour_off[1]=20. As a consequence, we wouldn’t have hundreds of different tag keys for the access of different vehicles but only the “access” and the “for” tag, for a cyclepath for example: access[0]=no; for[0]=all; access[1]=yes; for[1]=foot; access[2]=designated; for[2]=bicycle. --Candid Dauth 02:03, 24 March 2009 (UTC)
Not quite. What if there are some unrelated arrays on the same way? For example suppose the road has name[0]=foo; name[1]=bar. Does this mean that road 'foo' has maxspeed=100, while road 'bar' has maxspeed=60? This doesn't make sense to a human, but a parser might get confused. Of course you could use different indices (e.g. name[2]=foo; name[3]=bar) but that gets ugly and abuses the whole array idea.--Elyk 05:14, 24 March 2009 (UTC)
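The subscript idea and the objection above can be sketched in Python. This is only an illustration under stated assumptions: the key[n] syntax is the proposal from this thread (with no subscript meaning index 0), and `group_by_index` is a hypothetical helper, not an existing tool.

```python
import re
from collections import defaultdict

# A key optionally followed by a numeric subscript, e.g. "maxspeed[1]".
INDEXED_KEY = re.compile(r"^(?P<key>[^\[\]]+)(?:\[(?P<index>\d+)\])?$")


def group_by_index(tags):
    """Group tags by subscript; a key without a subscript counts as [0]."""
    groups = defaultdict(dict)
    for raw_key, value in tags.items():
        match = INDEXED_KEY.match(raw_key)
        if match is None:
            continue  # skip keys that do not fit the proposed syntax
        index = int(match.group("index") or 0)
        groups[index][match.group("key")] = value
    return dict(groups)


way = {
    "maxspeed[0]": "100", "hour_on[0]": "6", "hour_off[0]": "20",
    "maxspeed[1]": "60", "for[1]": "hgv", "hour_on[1]": "6", "hour_off[1]": "20",
    # An unrelated array on the same way, as in the objection:
    "name[0]": "foo", "name[1]": "bar",
}
groups = group_by_index(way)
print(groups[1])
```

Because grouping is purely by index, name "bar" lands in the same group as maxspeed 60 and for=hgv, which is exactly the confusion a parser could run into.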
One problem is that all of the tags don't form any kind of hierarchy.
|-- amenity=bank
|-- opening_hours=Mo-Fr 08-17:00
|-- amenity=atm
\-- opening_hours=Mo-Su 06:00-22:00
Logically this should form a tree:
|-- amenity=bank
|   \-- opening_hours=Mo-Fr 08-17:00
\-- amenity=atm
    \-- opening_hours=Mo-Su 06:00-22:00
What is the best way to represent this?--Elyk 05:49, 24 March 2009 (UTC)
What about
(common part of node tags)
name=Bank of ...
(first subitem)
1:opening_hours=Mo-Fr 08-17:00
(second subitem)
2:opening_hours=Mo-Su 06:00-22:00
name=... (default name)
maxweight=3.5 (default maxweight)
1:maxweight=5 (override default maxweight for 1:type=agricultural)
2:maxweight=no (override default maxweight for 1:type=non-agricultural)
-- wildMan 15:34, 23 July 2009 (UTC)
Another problem is that some tag combinations don't easily form a tree. In the maxweight example above, which tag do you choose as the root?
|-- maxweight=3.5
\-- maxweight=-1
    \-- for=agricultural
|-- maxweight=3.5
\-- for=agricultural
    \-- maxweight=-1
You could argue either way. Maybe the maxweight=3.5 could be a child of another for=non-agricultural tag?--Elyk 05:49, 24 March 2009 (UTC)
Your problem above with the two names of the road could perhaps be solved by double tagging:
  • name[0]=foo
  • maxspeed[0]=100
  • for[0]=motor_vehicle
  • name[1]=foo
  • maxspeed[1]=60
  • for[1]=hgv
  • name[2]=bar
  • maxspeed[2]=100
  • for[2]=motor_vehicle
  • name[3]=bar
  • maxspeed[3]=60
  • for[3]=hgv
This looks a bit confusing, but I think it would actually be a solution without tag hierarchy. (And I don’t think that such cases occur very often in reality.) We could also shorten this:
  • name[0,1]=foo
  • name[2,3]=bar
  • maxspeed[0,2]=100
  • maxspeed[1,3]=60
  • for[0,2]=motor_vehicle
  • for[1,3]=hgv
(Which keeps the logic, but is even more confusing ;-)) --Candid Dauth 17:05, 24 March 2009 (UTC)
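For what it's worth, the shortened form can be expanded back into the long form mechanically. A minimal Python sketch, assuming the key[i,j] syntax proposed above; `expand_shorthand` is a hypothetical helper name.

```python
import re

# A key followed by one or more comma-separated indices, e.g. "name[0,1]".
SHORTHAND_KEY = re.compile(r"^([^\[\]]+)\[(\d+(?:,\d+)*)\]$")


def expand_shorthand(tags):
    """Expand keys like name[0,1]=foo into one tag group per index."""
    expanded = {}
    for raw_key, value in tags.items():
        match = SHORTHAND_KEY.match(raw_key)
        if match is None:
            continue  # ignore keys without a subscript in this sketch
        key, indices = match.group(1), match.group(2)
        for index in indices.split(","):
            expanded.setdefault(int(index), {})[key] = value
    return expanded


short = {
    "name[0,1]": "foo", "name[2,3]": "bar",
    "maxspeed[0,2]": "100", "maxspeed[1,3]": "60",
    "for[0,2]": "motor_vehicle", "for[1,3]": "hgv",
}
for index, group in sorted(expand_shorthand(short).items()):
    print(index, group)
```

The four resulting groups correspond to the twelve tags of the long list, so the shorthand loses no information; whether a mapper can read it is a separate question.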
Doesn't that already make relations a much more appealing solution for such complex cases? Alv 17:18, 24 March 2009 (UTC)
How? By defining that members of a relation inherit its tags? --Candid Dauth 22:07, 26 March 2009 (UTC)
If you represent the ATM as a separate node (an amenity=atm node separate from amenity=bank), then all of these awkward schemes for making one tag relate to another value are not necessary. (We're not designing a programming language here! Keep it simple!) It creates its own problems of course, because now the ATM is not as clearly related to the bank, so I guess Alv is suggesting using a relation for that. Personally I just map them separately anyway (without a relation). If some automated process really needs to know that the ATM is related to the bank, it could use nearby-location heuristics, ...or we could stick relations on them. In any case, I would suggest that mapping them as separate elements is the solution for this particular ATM<->bank example. Are there any other examples where property tags need to associate themselves with different values within another multi-value tag? It seems to me that to worry about that kind of thing is to try to be too flexibly generic at the expense of mapping simplicity. -- Harry Wood 14:37, 5 December 2010 (UTC)

Warning in Potlatch2

Potlatch2 displays a warning "The tag contains more than one value - please check". It was about a path that is a ski piste in winter and a mountain bike route in summer, tagged as "route:mtb;ski". So what to do? --User:Gerdami 13:49, 29 January 2012 (UTC)

I second that question. I have the same warning with multiple animals living in the same cage in the zoo (like "animal:donkey;lama"). Positron96 13:14, 27 June 2012 (BST)
From what I understand, it's just a warning, not an error. So you should be able to use those multiple values. However, as I read on this wiki page, this practice is discouraged, and so Potlatch warns you. Following the reasoning in this wiki, you had better choose either mtb or ski, depending on the main usage your route has. --solitone 21:13, 20 November 2012 (UTC)
Yeah it's just a warning, but it's important to realise that little/no software dealing with mountain bike and ski routes will actually know how to decode your mashed-together tag. Yes it's algorithmically possible, but you're making developers work very hard. Instead perhaps you should follow a 'split the element' approach. Map the ski route and the mountain bike route as two different ways. They could overlap each other, sharing nodes (which has its own problems), or in fact it wouldn't be all that unreasonable to have them following close to each other but not exactly the same way. I'm not an expert, but I think generally a mountain bike follows a more specific eroded zig-zagging pathway down the mountain, while a ski piste is probably easier just mapped as a centreline. Anyway, two different ways, problem solved.
-- Harry Wood 23:47, 20 November 2012 (UTC)
I face this very same issue, as bikers ride the hiking trails (route=hiking) in my area. I thought I might use route=hiking;mtb, but then I read Semi-colon value separator#When_NOT_to_use_a_semi-colon_value_separator, and changed my mind. Plus, it's true that very few applications handle such a tagging scheme; for instance, Waymarked Trails: Hiking map does not. This explains why multiple values for route=* are almost unused at the moment [1].
Since both hikers and bikers follow the same trail, I wouldn't use two different ways to map one single trail, though. I feel a better practice would rather be using two relations--the first with route=hiking, and the second with route=mtb. The second relation could be created as a copy of the first (very practical from a mapping perspective), hence the two relations would share the same members, and the only difference between them would be in route=*. It's true you would end up with a duplicate relation, nevertheless you would still have one single way, which corresponds to reality, as you only have one physical trail.
Any thoughts?
--solitone 13:58, 21 November 2012 (UTC)
Yes there's no restriction on the number of relations you can attach to a particular element, even several of the same type (type=route). I'm usually quite anti relations, because they get over-used in many silly ways. Using relations is not the solution for all of these tagging problems (contrary to what some mappers seem to think!) For example it would be rather a messy mis-use to attach multiple relations onto animal cages. But in the case of overlapping hiking/biking routes, it's sensible yes. It's already quite common to use relations for routes, and this is an established reasonably elegant way of dealing with awkward overlaps. -- Harry Wood 15:18, 21 November 2012 (UTC)
As for animals in a cage, maybe this is an example where semi-colon value separators make more sense, a bit like the car service types. You're capturing lots of extra hyper-detail which is (perhaps) less likely to be consumed in a way which would require distinct parsing and machine processing of the different animals. But yes, potlatch will warn you.
-- Harry Wood 23:47, 20 November 2012 (UTC)


The page suggests doubling of literal semicolons to escape them. That won't work, because

  • foo;;bar may either mean foo;bar or foo + (empty value) + bar
  • foo;;;bar may either mean foo; + bar or foo + ;bar or foo + (empty value) + (empty value) + bar

While empty values are certainly always useless, a leading or trailing semicolon may be desired sometimes, e.g. when an inscription on a stone ends with a ";" because the rest of the text fell victim to weathering.

The usual way to escape special characters is to use a dedicated escape character:
\; → ;
\\ → \
--Fkv (talk) 19:13, 23 January 2015 (UTC)
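The backslash scheme proposed above could be sketched as follows. This is only an illustration in Python: `encode` and `decode` are hypothetical helper names, and nothing here is part of the OSM API or of what the page currently recommends.

```python
def encode(values):
    """Join values with ';', escaping literal '\\' and ';' with a backslash."""
    escaped = [v.replace("\\", "\\\\").replace(";", "\\;") for v in values]
    return ";".join(escaped)


def decode(text):
    """Split on unescaped ';' and undo the backslash escapes."""
    values, current = [], []
    chars = iter(text)
    for ch in chars:
        if ch == "\\":
            # Take the next character literally; a dangling backslash
            # at the very end of the text is kept as it is.
            current.append(next(chars, "\\"))
        elif ch == ";":
            values.append("".join(current))
            current = []
        else:
            current.append(ch)
    values.append("".join(current))
    return values


print(encode(["foo;", "bar"]))   # foo\;;bar
print(encode(["foo", ";bar"]))   # foo;\;bar
print(decode(encode(["foo;", "bar"])))
```

The two inputs that are indistinguishable under the doubling rule produce different strings here, and each decodes back to the list it came from.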

If empty values are ever allowed then it doesn't work. Now at this point I would be tempted to say that you are disappearing up your own arse overthinking a problem that basically doesn't exist... except
I do like the point you're making here, because it actually means the overthinking that other people have done... hasn't been thought through properly :-) Now that I think about it, if you come across a ';;' in the data, you're probably far more likely to be seeing a case where a user has entered an empty value into some buggy mobile app, rather than this escaping meaning.
I've added a note that empty values should not be permitted. To me that seems like common sense, and actually that's a more useful rule to be stating.
-- Harry Wood (talk) 13:49, 26 January 2015 (UTC)
By the way, in my original incarnation of this page, the "When NOT to use" section was higher up the page, and all this escaping nonsense was placed further down as a sort of quirky footnote at the bottom. In my opinion that was a clearer way to arrange things -- Harry Wood (talk) 13:52, 26 January 2015 (UTC)
Leaving empty values aside, there's still the other problem that ;;; may mean literal ";" + separator or separator + literal ";". This can be coded, but not decoded. --Fkv (talk) 20:04, 26 January 2015 (UTC)
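The "can be coded, but not decoded" point is easy to demonstrate. A minimal Python sketch, where `encode_doubling` is a hypothetical helper implementing the doubling rule the page suggests:

```python
def encode_doubling(values):
    """Join values with ';', doubling every literal ';' inside a value."""
    return ";".join(v.replace(";", ";;") for v in values)


a = encode_doubling(["foo;", "bar"])   # literal ';' then separator
b = encode_doubling(["foo", ";bar"])   # separator then literal ';'
print(a)       # foo;;;bar
print(b)       # foo;;;bar
print(a == b)  # True
```

Two different value lists produce the same string, so no decoder can tell them apart, even with empty values forbidden.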
Added this note to the page. Probably we should just state instead "it is impossible to decode multiple values using strings without an escape character and a null character" and not mess with the ;; suggestion. How many software developers have fallen into this trap? Xxzme (talk) 04:48, 27 January 2015 (UTC)
No null character needed. I think that the backslash \ is a sufficient escape character, because it never occurs in normal text. --Fkv (talk) 06:37, 27 January 2015 (UTC)
"\ is a sufficient escape character, because it never occurs in normal text" What? Do you have basic knowledge of encoding theory?
You definitely have no idea how taginfo works. Use the search box in the top-right corner to look for \ in the value part [2]. Otherwise you wouldn't make such claims.
Regarding my earlier point: the "When NOT to use" section is now higher up the page, and this escaping nonsense is lower down in a "syntax" section -- Harry Wood (talk) 01:32, 31 May 2015 (UTC)

Page focus changed from semi-neutral to the clear message Avoid semi-colon value separator

This change was recently discussed on the tagging list; there is no better solution at the moment for regular mappers. Details are on the page and on the tagging@ list. Xxzme (talk) 04:10, 27 January 2015 (UTC)

This may be your perception, but as a matter of fact the majority of participants in the mailing list disagree with you and your page edits. Regular mappers really do not need to care about parsing. They only need to find the ";" key on their keyboards. --Fkv (talk) 06:31, 27 January 2015 (UTC)
I don't care about people on the tagging list. I do care about regular mappers.
I don't care about keyboards. User interfaces are not limited to keyboard input.
There are dozens of tagging schemes and thousands of mappers who use these tags to prove you are wrong.
Do you know how to use Taginfo? [3] [4] [5] [6]
8735 distinct people have used the tag fuel:diesel=*. You can throw your tagging list in the garbage can.
I have no idea what you are trying to achieve here. Do you have a better solution to the problem? Can you improve the database schema and the relevant API code? Xxzme (talk) 06:45, 27 January 2015 (UTC)
When you want to use taginfo, you simply go to the key page (such as ) and you type the value in the search field inside the box. You will have as a result everything a regular mapper needs.
I don't buy your argument about the other tags --> "People did this decision in the past in cases X and Y, so we have to follow it for everything, whether that's the best decision for other cases or not" ?
--Jgpacker (talk) 10:58, 27 January 2015 (UTC)
You don't buy what? Is this how you make arguments? Wow.
So what. I'm not your teacher. If you choose to be ignorant, then so be it. Don't fool other people into thinking it is a good idea to use semicolons.
10 points why it is wrong to use semicolon or multiple values in value part of tags
3 alternatives to users how to avoid them
If you want to use semicolons, then you should prove that every statement on this page is wrong, not me. 11:10, 27 January 2015 (UTC)
Here is how you should make statements. Your link to cuisine only proves my words: search for ; in taginfo.
There are only 2212 cuisine values with ;. The top multivalued value in cuisine has only 642 instances.
You only prove my words: most cuisine tags are single-valued or tagged using the single-node-per-tag approach.
9679 values in total; this means your approach is also less popular across all mappers around the world. This is not just your single opinion of how things look. These are numbers.
The top cuisine values are without semicolons; you don't need complex tools to examine the most popular values and sum the numbers. The numbers will always be against you. Always. It only proves my statement that most OSM data is single-valued, not multivalued.
There is no need for semicolons/multiple values when you have 3 different approaches to avoid them entirely. Xxzme (talk) 12:15, 27 January 2015 (UTC)

Jgpacker disruptive opinion based edits without arguments or talk at this page / tagging@ list

[7][8] [9] [10] [11] [12] [13] [14]
I have no idea what this person is trying to achieve, or whether they are trying to fool someone. There are plenty of links present to prove the statements on this page. If something is missing, then the missing parts should be added; not every single change should be reverted. Xxzme (talk) 10:36, 27 January 2015 (UTC)
The discussion in the tagging list is still ongoing. You must not make such edits before reaching consensus. I was simply reverting them. The majority of people in the mailing list do not agree with your views. --Jgpacker (talk) 10:47, 27 January 2015 (UTC)
Please settle the discussion before making extensive documentation changes, or else I'll ban both of you. --Dee Earley (talk) 10:50, 27 January 2015 (UTC)
Hi, my changes were simply the reversion of documentation change; exactly because the discussion is still ongoing (mostly in the tagging mailing list). --Jgpacker (talk) 11:00, 27 January 2015 (UTC)
Then you will ban the wrong person. The discussion on the tagging list was 115 messages long. There has been no activity since my last message to the list. My last message should look like this. It was posted on 2015-01-25, and I hadn't seen a reply to it for 2 days, so I decided to make the actual updates to the wiki. Jgpacker doesn't like the changes and wants to revert these edits without arguments. He did the same thing on the tagging list, and now he wants to repeat this trick here. Not going to happen. Xxzme (talk) 11:02, 27 January 2015 (UTC)
You're both edit warring. I don't care who's "in the right" at the moment. I can revert back to the state this morning until there is some form of consensus on the change. --Dee Earley (talk) 11:08, 27 January 2015 (UTC)
Quote: "Then you will ban wrong person.". My opinion: I don't think so.
Quote: "No activity since my last message to the list.". Maybe that's because you are simply ignored? This has - I'm just guessing - maybe something to do with lines like "Are you and idiot?" in many of your mails.
I support a revert to the state this morning and I most definitively support a ban of Xxzme. --Imagic (talk) 11:30, 27 January 2015 (UTC)
About the second quote: I suppose he was banned from the mailing list before sending his last message (for insulting a user even after receiving a warning). --Jgpacker (talk) 11:35, 27 January 2015 (UTC)
I have no idea; maybe my messages are not being processed by the moderator or something. My last message was on 2015-01-25, but I cannot see it in the list archive. Xxzme (talk) 11:48, 27 January 2015 (UTC)
Pages have been reverted to Decemberish, and the semicolon page has been renamed back and protected for a week. Please sort yourselves out and introduce gradual changes if deemed necessary rather than wide sweeping changes to how the entire data set should be handled. Have a nice day. --Dee Earley (talk) 11:49, 27 January 2015 (UTC)
December page more or less is okay, danke. Xxzme (talk) 11:50, 27 January 2015 (UTC)

Sorry my original words seem to have caused a heated debate. I would be interested in any consensus emerging on the mailing list, but really it requires calmer discussion. On the wiki here too. This discussion is confrontational. The nice thing about wiki discussions, is that we can purge them onto an archive page. I suggest we do that with this discussion, so that we can take a breather, and then continue more calmly with "Avoid semi-colons?" below.

-- Harry Wood (talk) 17:11, 27 January 2015 (UTC)

Avoid semi-colons?

I wrote the original page here. There's some wording on there which is fairly strong, "In general avoid ';' separated values whenever possible", but that's not an absolute "DO NOT USE". The page goes on to give an example of how to avoid them in some cases where a lot of mappers may be tempted to use them (the amenity=cafe;bar and amenity=library;cafe examples). But there are cases where they can be useful, and not too damaging to the simplicity of the data. So this is tricky. I wanted to use clear strong words to try to avoid mappers going overboard with using these characters. But the page is not intended as a ban on all semi-colons. To rename this whole page as "Avoid using semi colons" is not appropriate. Hopefully we can agree on this. I think it was just a misunderstanding.

-- Harry Wood (talk) 17:11, 27 January 2015 (UTC)

Well, a better solution would be two separate pages: a neutral page saying that, in general, separated values should be avoided whenever possible. But I also need to point users to the alternative schemes and the disadvantages (the latter section may need some rework or more links). Under no circumstances will I agree to this content being inaccessible simply because there is no consensus about it. How is this even possible in OSM? Is there a good idea for how to split the older version of the page into two pages? Xxzme (talk) 17:27, 27 January 2015 (UTC)
The wiki page should reflect consensus. That can be a difficult thing to achieve. You don't have to agree with the consensus, but you should make an effort to reflect it when you're making wiki edits. There's no single author in charge of the text. If you're suggesting that you could go off onto another wiki page where you can be the single author. Well I'm afraid that won't work. ALL wiki pages should reflect consensus. The only exception to that is if you make a subpage from your user page and label it clearly as an essay/opinion piece (we used to have a Template:Essay for this. Not sure why it was deleted)
In the main namespace I don't think this is a big enough topic to spread onto multiple pages. We should document semi-colon value separators on this page... and yes... we should reflect consensus. This is actually a powerful thing about wikis. Some people love semi-colon values, some people hate them, and together they're forced to agree upon what the text of this wiki page should say. It's not always like that. Across a lot of pages of the wiki we can be quite bold and make some sweeping changes as long as they feel like improvements, but on pages like this we know that there is a weight of opinion on either side of a debate. It's not acceptable to do things like renaming the whole page to push one point of view.
-- Harry Wood (talk) 13:05, 29 January 2015 (UTC)
Concerning "You don't have to agree with the consensus": If someone disagrees, it's not a consensus by definition. With more than 500000 users, you will never reach a consensus. Therefore, the wiki cannot reflect a consensus. It should reflect usage and points of view supported by a notable portion of the community. I agree that single-user opinions as presented by Xxzme belong on the respective user page or a subpage thereof. Discussion pages and proposals are also fine, of course. --Fkv (talk) 13:44, 29 January 2015 (UTC)
"If someone disagrees, it's not a consensus by definition" . No I think you're thinking of "unanimous agreement". Yeah the contents of the wiki are not governed by unanimous agreement :-) -- Harry Wood (talk) 14:58, 29 January 2015 (UTC)
Please don't start fighting over the correct term now! PLEASE!! ;-) --Imagic (talk) 15:42, 29 January 2015 (UTC)
Hehe. Yeah no I thought it was fairly clear what "consensus" means, but this did give me pause for thought. Maybe to be clear, I should say a longer thing... the wiki contents are governed by Consensus decision-making. And "views supported by a notable portion of the community" is maybe just another way of saying that. ...I think we know what we mean -- Harry Wood (talk) 15:54, 29 January 2015 (UTC)

Most of our data/mappers follow approach pick one value

50M tags:

10M tags:

1M tags:

If some person wants to use a semicolon in a value despite all the challenges, putting other mappers in troublesome situations while avoiding all the alternatives, that's only the fault of the person who fails to see the real reasons behind such a distribution of tags and the preference of countless users (not to use multiple values in the value part / not to deal with all the related trouble). Xxzme (talk) 11:49, 27 January 2015 (UTC)