Proposal talk:Boolean values

From OpenStreetMap Wiki

The point I am trying to make here is quite simple, but is one that has very little support. What we see on the screen is purely controlled by the tools, and providing a 'yes'/'no' on the screen is no different to a translated version of the same. It is about time that we started to take the internal 'bloat' a little more seriously and this is a good starting point. STORE a single bit in the database, but DISPLAY it in the format appropriate for the user. Of course the biggest hole here IS knowing that the key is a boolean, and a properly structured XML 'schema' would provide that information. If we start nailing down the core keys and providing them with a defined set of options then everything starts to come under control. It does not prevent 'personal' free format data being added - if the key allows - but defined data types like 'boolean' will automatically filter some of the edge cases, and things like layer would also be restricted to approved values. I'm only saying 'NO' to this proposal because once again it misses the whole background problem. TRANSLATION is becoming a major requirement, and while mapping 'yes' and 'no' to other words is not rocket science, having to search for xx='yes' INTERNALLY is stupid, when xx=1 is natural for the computer. USERS do not need to see the 0/1 at all. Lsces 11:30, 3 October 2009 (UTC)

I won't start another "central authority" debate here, but storing true boolean values, rather than strings, internally doesn't work for a simple reason: There currently are no keys which are limited to boolean values only. Look at the examples from the proposal: There is "-1" for oneway (even if I don't like that value much), time intervals for lit, bridge types for bridge and contact_line/rail for electrified. --Tordanik 12:17, 3 October 2009 (UTC)
It is the big challenge of OSM to keep tags and values open while strengthening their semantics. Openness leaves room for creativity and makes it easy for newbies to start. Closed semantics and syntax make computing easier. If OSM tags were governed by a DTD, the process of creating a new tag, or of getting started with mapping, would be hard. And evolution would be slower: new tags and new values would be rejected, so people would fall back on approximate tags and values. What would be the whole set of values for the layer tag? Would 'null' or an equivalent not be a significant one? And how would we distinguish between layer=0 and layer=no? Is there a closed set of values for a bicycle lane? Untyped values are an opportunity for the development of OSM. But computed use of the data will require more consistent syntax. I'm sure that before long a DTD or something similar will appear to help develop data use. Fixing the boolean values to yes/no is a way to simplify, by saying that it is a semantic boolean rather than a logical one. So lit can have both semantic boolean and text/open_hours values. But it is too early to fix the type of the tags: the lit example is a good illustration. FrViPofm 12:51, 3 October 2009 (UTC)
The sticking point here seems to be the 0/1 - how about if it WAS t/f instead? THAT would provide an inbuilt indication of the type of tag. However I DO think it is about time some of the flexibility is reined in, so that locking a key to a particular set of values - in this case affirmative/negative at the display level - would start to get some areas under control. Lsces 13:17, 3 October 2009 (UTC)
This is actually quite an interesting idea but it’s got various ramifications. Most obviously, if it’s done in the server it’s got to wait until API 0.7 because all users of the database have got to be rewritten; editors don’t have to wait of course, though standardising on ‘yes’/‘no’ makes it a bit easier. Tag values such as electrified=contact_line can be dealt with by allowing boolean and string values to coexist with the same key, which allows new boolean keys to come into use but may lead to users used to typing ‘yes’ or ‘no’ continuing to do so and reintroducing variations in boolean values. --Wynndale 12:32, 11 October 2009 (UTC)

true boolean

If it should be truly boolean, then the only accepted values should be TRUE and FALSE (in upper case). IMO, since tags are edited by humans, I see no point in forcing a special way of tagging. I agree that 1/0/-1 is very computer-like, and I avoid using them. If I need to tag a oneway street that goes in the opposite direction, I reverse the way before adding a true or yes value, though backwards and reverse could be possible alternatives to -1. Why shouldn't bridge=1/yes/true be accepted equally? As far as I know, bridge=* can have other true values such as swinging, suspension, and more. Most renderers accept all of these values equally, and should continue to do that no matter what the outcome of this vote is. Limiting the accepted true/false values in renderers will only give a lot of false falses in the rendered map, as we will still have free-hand tagging. --Skippern 04:05, 6 October 2009 (UTC)

Applies to ?

The proposal says (in the "Applies to" section) that it applies to "All keys that expects a boolean value", but does not specify which exactly those are. Currently (API 0.6) NONE of the keys expects a boolean value (they expect a string value), so the proposal targets an empty set of keys (i.e. it is useless). Furthermore, in the "Tagging" section, the proposal mentions the oneway, lit, bridge and electrified keys as examples, but in fact none of those is a boolean - neither according to the wiki specification, nor according to actual usage. As implementing such a restriction requires an exact list of all keys which are to be restricted to boolean values, the proponent should have listed them all in advance - the proposal can serve no practical purpose without listing them. (Note that such an issue would have been quickly noticed in the "draft" or "RFC" phases of the proposal if the proposal had actually followed the proposal creation guidelines, but that is another issue, which it is too late to try to fix now as we're in the "voting" phase.)

Anyway, if such a thing as forcing only specific sets of values to be valid is indeed wanted (it might create more problems than it fixes, really), then the correct procedure would NOT be to implement a boolean type, but instead to implement (in addition to the current string type, and maybe additional integer/floating-point numeric types) a fixed-set data type (like enum in C or SQL). Then one of those sets/enums could be made to have only two values ("yes" and "no"), thus simulating a boolean for all practical purposes. Other sets could be made to support almost-boolean values (like "yes" / "no" / "-1" for the "oneway" key, for example), or even more complex sets of fixed values (like "yes" / "no" / "contact_line" / "rail" / etc. for the "electrified" key).
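
A minimal sketch of the fixed-set idea, in Python. The `ALLOWED_VALUES` table, the `some_flag` key and the `is_valid` helper are hypothetical names invented for illustration; nothing like them exists in API 0.6, and the value sets are only the ones mentioned in this discussion.

```python
# Hypothetical per-key value sets (an assumption for illustration only).
# Keys that are not listed here remain unrestricted free-form strings.
ALLOWED_VALUES = {
    # A two-value set simulates a boolean ("some_flag" is a made-up key).
    "some_flag": frozenset({"yes", "no"}),
    # An "almost-boolean" set.
    "oneway": frozenset({"yes", "no", "-1"}),
    # A larger fixed set.
    "electrified": frozenset({"yes", "no", "contact_line", "rail"}),
}


def is_valid(key: str, value: str) -> bool:
    """Return True if value is acceptable for key under the sets above."""
    allowed = ALLOWED_VALUES.get(key)
    # Unlisted keys keep today's behaviour: any string is accepted.
    return allowed is None or value in allowed
```

With these assumed sets, `is_valid("oneway", "-1")` is accepted while a typo such as `is_valid("oneway", "yws")` is rejected, and any key outside the table is left alone.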

Note however that changing the API from (unrestricted) strings to restricted data types (numerics and/or sets) would lead to issues (actually design decisions), such as the inability to use custom tag values until the server code is modified to accept them (if ever, depending on how much the value is used, and on how much free time both the people trying to use non-popular values and the server admins have to negotiate them). It would also require a constant (if maybe small) amount of work to be done forever (as the need for new tags/values would probably never cease).

Another way would be to leave the API using free-format string values, but gradually change the editors to allow users to enter only a specific set of values by default (while allowing users to extend those sets if need be). This would have almost none of the many issues of API/database restrictions, but then again the renderers wouldn't have the fixed data type guarantee and would still have to parse all the values as they do now, making it less useful. --mnalis 16:07, 11 October 2009 (UTC)

I agree that RFC would have avoided some misunderstandings, but the intention of the proposal is actually quite clear: It recommends that people use the string "yes" instead of all value strings that are equivalent with "yes", and "no" instead of all values that are equivalent with "no" - for any key where there are values that are equivalent with "yes" or "no".
It does not intend to change anything about value strings that are not equivalent with "yes" or "no". It also does not change internal data types - keys and values will remain strings, and the string values will not be validated by the server. In fact, this proposal doesn't have anything to do with the API or servers at all. --Tordanik 19:29, 11 October 2009 (UTC)
Thanks for the clarification, Tordanik. The original proposal might have been quite clear to you, but it is quite clear (pardon my pun) that it was not clear at all to everybody (or even to the majority, judging from the comments). That is exactly why the draft and RFC phases of a proposal are there, and why it is a terrible idea to ignore them because something "makes sense to you" (of course it always would!)
As I understand it now (which still might be wrong), what you actually wanted to propose was just to prefer "yes" to other affirmative synonyms (both in English and other languages), and to prefer "no" to other negative synonyms (also both in English and other languages) in all values (most, if not all, of which are NOT boolean values). You also did NOT want to enforce those values by technological means (like API strong typing, or putting limits in editors), but just to prefer them by majority agreement, right? Probably the most confusing part of the proposal was that you insisted on talking about boolean values, while NONE of the values of the tags you mentioned as examples actually WERE boolean values. "Boolean value" means something has EXACTLY two states (one corresponding to true, and another one to false). "oneway" values do not. Values for "lit" do not. Nor do values for "bridge" or "electrified". NONE of those are boolean values, as they all have more than TWO values. The fact that some of the values they have are "yes" and "no" (and/or "true" and "false", and/or "0" and "1", and/or...) does NOT make them boolean in the slightest. You also said it applies to "all keys that expects a boolean value", which has the same problem (in a strict view, it would apply to none of the keys, as they are all string values; in a broader view, it would apply to ALMOST none of the keys). If those were corrected, and the "editor auto-change" (although I would accept "editor recommendations") and "bot changer" clauses were removed from the additional comments (see the main article for my reasoning why), I would quite agree with that proposal. But that hypothetical proposal and this proposal have very little in common, and it is quite late to try to fix this proposal now (as we're waaay past the "draft" and "RFC" phases). It would probably be best to obsolete this one and try again (the right way that time, following the proposal guidelines!) some time later, when people have had some time to forget about this one (so they won't think it is the same old stuff they've been through). --mnalis 21:38, 13 November 2009 (UTC)
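
The intent described in this thread (prefer "yes" and "no" over their synonyms, and leave every other value alone) could be sketched roughly as follows. The synonym lists and the `preferred_value` helper are assumptions made up for illustration, using only values mentioned on this page; a real list would have to be agreed on by the community, and nothing here is enforced by the API.

```python
# Assumed synonym lists, built only from values mentioned in this discussion.
AFFIRMATIVE = {"yes", "true", "1"}
NEGATIVE = {"no", "false", "0", "nein", "non", "ne", "nej"}


def preferred_value(value: str) -> str:
    """Map yes/no synonyms to "yes"/"no"; return other values unchanged."""
    v = value.strip().lower()
    if v in AFFIRMATIVE:
        return "yes"
    if v in NEGATIVE:
        return "no"
    # Not equivalent to yes or no (e.g. "-1", "contact_line"): leave as is.
    return value
```

Values such as oneway=-1 or electrified=contact_line pass through untouched, which is why the keys stay plain strings rather than becoming booleans.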

(real) Rationale ?

The "Rationale" section in the proposal actually says it doesn't really care about the feature, and that the reason the proposal is put up is to stop the endless discussion on the list (which might be a good reason, but is not really a rationale for the issue in question: the boolean values). So the REAL question is more along the lines of "what is the reason to *implement* the boolean values? What would we gain by it?". I can see three possible reasons for the idea (feel free to add more):

  • it would solve the issue with typos (for example "yws" instead of "yes") when entering the values (as non-approved values simply would not be accepted by the server), which I think everyone would agree is a good idea. To some extent, editors already offset those errors by having drop-down lists of values.
  • it would restrict value proliferation, and limit duplicate values for the same thing: "no" / "0" / "nein" / "non" / "ne" / "nej" / ... This is a good reason in itself, but that coin also has its negative side: at the same time it would require a central authority to approve any changes to any tag (if it happens to stop being boolean, as most of them do sooner or later), and until such approval is coded in the server, people would still be restricted to the old set of values. In addition to such unwanted delay and bureaucracy (currently you can just start using new values anytime), there is also the implied question of who would have the power to decide which tags and values are accepted, and which are not. This might actually lead to the creation of new duplicate tags (unwanted by all sides!), as that would be easier than waiting for the approved values of old tags to be changed. Those and other problems should be weighed against the positive sides.
  • it would make the renderer's job easier. Not really. Changing the imaginary function is_true(value) from if (value =~ /^(no|0|false|nein|ne|nej)$/) { return false } to if (value eq "no") { return false } would bring only an extremely small benefit to the simplicity (as you can see from the above pseudocode) and speed of the renderer. Actually, having to check all the values on upload would much more likely make the whole thing slower (as there are more uploads than renders) and more complex (as it would have to check each value against the list of allowed values for that key).
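
To make the last point concrete, here is a rough Python rendering of the two variants of the imaginary is_true function. The names `is_true_tolerant` and `is_true_strict` are made up for this sketch, and the assumption that everything not recognised as negative counts as true is mine; the pseudocode above only shows the branch that returns false.

```python
import re

# Negative spellings taken from the pseudocode above.
NEGATIVE_RE = re.compile(r"no|0|false|nein|ne|nej")


def is_true_tolerant(value: str) -> bool:
    """Current situation: several spellings are treated as 'false'."""
    return NEGATIVE_RE.fullmatch(value) is None


def is_true_strict(value: str) -> bool:
    """After standardising: only the literal string "no" is 'false'."""
    return value != "no"
```

The two functions differ by a single line, which is the whole of the simplification on the renderer side. They disagree only on values like "nein", which the strict version would treat as true unless the data had been cleaned up first.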