Talk:Date specification


To "avoid parsing ambiguities and allow easier filtering of tags"

Regarding [1] :

  • if there are real ambiguities in the previous version of the document, please point out precisely which ones
  • a fixed number of digits etc. can't be enforced. If you don't know on which day of 1761 a road was opened, it would be wrong to enter a fake day/month just to simplify parsing. Hence relying on this will only lead to greater ambiguity, and filtering/searching tags will work worse.
  • while parsing of the original format may appear complex, it is only a slight variation of the standard, which is fairly widely implemented. The programmer's point of view in a very similar case has also been discussed here: Talk:Proposed_features/Date_namespace#It_is_awful_from_the_point_of_view_of_querying_and_data_processing

RicoZ (talk) 16:03, 19 April 2017 (UTC)

Parsing ambiguities are what this is all about. But the double hyphen is not required and is almost never used anywhere. Most tags never use it for date ranges, because it is not needed for correct parsing.
The format of individual dates should still honor the ISO format, with 4-digit years and 2-digit months and days of the month. Have you really looked at the data, and at how dates are actually entered in the database and parsed?
It is enough to signal the cases where ambiguities occur, but with the ISO format enforced (with required hyphen separators between elements), we can easily determine whether there is a date range or not: the double hyphen is never needed if 4-digit years are present (and this is almost always the case in most tags). Likewise, requiring hyphens between date elements (ISO format) means that "1761" is unambiguously a year, not a fake month plus a 2-digit year: all date elements have a fixed number of digits (2 or 4).
Beside this there's another (more complex) specification for opening_hours that allows specifying dates partially, with repetitions or other conditions. — Verdy_p (talk) 20:05, 19 April 2017 (UTC)
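The rule argued for in the comment above (4-digit years, 2-digit months and days, a single "-" as the range separator) can be sketched in Python as follows. This is only an illustration, not an existing OSM parser: the names DATE and parse_date_value and the shape of the returned tuple are hypothetical, and calendar validity (for example month 13) is deliberately not checked.

```python
import re

# One date with fixed-width ASCII fields: YYYY, YYYY-MM or YYYY-MM-DD.
DATE = r"[0-9]{4}(?:-[0-9]{2}(?:-[0-9]{2})?)?"
SINGLE = re.compile(DATE)
# A range is two dates joined by one "-"; either side may be missing
# (open-ended range), but not both.
RANGE = re.compile(rf"({DATE})?-({DATE})?")


def parse_date_value(value):
    """Return ("date", d, None), ("range", start, end) or None (hypothetical helper)."""
    if SINGLE.fullmatch(value):
        return ("date", value, None)
    m = RANGE.fullmatch(value)
    if m and (m.group(1) or m.group(2)):
        return ("range", m.group(1), m.group(2))
    return None  # not in the fixed-width form


for v in ["1761", "1761-1780", "1980-01-01-1990-12-31", "1761-", "1980-1-1"]:
    print(v, "->", parse_date_value(v))
```

Under these assumptions a separator hyphen is always followed by four digits (or by the end of the string), while a hyphen inside a date is followed by exactly two digits, so the regular expression engine finds at most one way to split the value.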
Pardon? I have parsing ambiguities with your text. Can you just answer my concerns point by point? And no, this wiki page has been in place for far too long for you to replace it with something diametrically opposed without prior discussion. RicoZ (talk) 22:36, 19 April 2017 (UTC)
But your text contradicts many existing uses in the database. No, there is not any ambiguity. You've introduced requirements that did not exist, by not looking at existing data (which has been there longer than your text).
You were strictly alone in putting that text up in 2014, without ever discussing it and without checking what was in the database long before. It was rapidly corrected to include the current practices, and you've made completely false assumptions about the ISO 8601 format and your supposed "backward compatibility". The "--" has never been recommended anywhere by anyone except you. The correction was made rapidly once you linked your page to other pages, because it was wrong (and also incompatible with the scheme adopted for opening_hours, where this was discussed and where "-" is used consistently).
Your reverts look as if you want to ignore all current best practices. You made errors but cannot admit it.
I've not removed your proposal to use "--", but it is definitely not needed at all :
  • it's not compatible with ISO 8601 (unlike what you wrote, which is false)
  • it is not backward compatible (unlike what you wrote, which is false)
  • it has never been discussed and approved (unlike what you wrote, which is false)
  • it is not the common practice (unlike what you wrote, which is false)
  • it is not even needed for standard date ranges (with years)
and for monthly and yearly recurrent dates (the only case where "--" is usable) it is NOT what was adopted and discussed for opening hours (which use "-" only, but make the distinction by using 3-letter abbreviated month names instead of month numbers).
For this reason I do think that even your proposal should be rejected. Only "-" should be kept, and only for date ranges with years (and NO: this does not create any ambiguity), the other cases (with missing years) being treated with opening_hours, which is preferred to your (incorrectly justified) "--" proposal.
And visibly you don't know what the ISO 8601 standard says about the validity of formats.
What are the parsing problems? You spoke about "1761" alone, but there's no ambiguity at all: it is only a year. Give just one example! You can't. Your 2014 text was invalid (and not discussed at all, otherwise you would have seen the problem that it did not work like this in MANY existing data).
Summary: you did not perform any search, you just wanted to document some specific cases you have used locally (in your region of interest or for some tags).
I did not invalidate your personal (undiscussed) "--" proposal for date ranges, but added what was already used (a single "-" almost everywhere). And I made sure that the ISO 8601 format was respected ("80-1-1" is NOT a valid ISO date format; "1980-01-01" is valid and works with simpler parsers).
Verdy_p (talk) 23:35, 19 April 2017 (UTC)
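The distinction drawn above between "80-1-1" and "1980-01-01" can be checked mechanically. The following Python sketch is an assumption-laden illustration, not part of any OSM tool: the helper name is_strict_iso_date is hypothetical, it only handles full YYYY-MM-DD dates, and it relies on the standard library's datetime.date constructor to reject impossible calendar dates.

```python
import re
from datetime import date

# Exactly 4 ASCII digits for the year and 2 each for month and day.
STRICT_DATE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def is_strict_iso_date(value):
    """True if value is a fixed-width YYYY-MM-DD calendar date (hypothetical helper)."""
    m = STRICT_DATE.fullmatch(value)
    if m is None:
        return False  # wrong number of digits or wrong separators
    year, month, day = (int(g) for g in m.groups())
    try:
        date(year, month, day)  # raises ValueError for e.g. 30 February
    except ValueError:
        return False
    return True


for v in ["1980-01-01", "80-1-1", "1980-1-1", "1980-02-30"]:
    print(v, is_strict_iso_date(v))
```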
If you had read the ISO 8601 standard correctly, you would know that "--" was already used in an older version for dates with missing years (this feature has been removed: ISO formats must start with a 4-digit year, possibly prefixed by a sign for years BCE, and there's no "0000"). Date intervals in ISO 8601 use a slash "/" instead... Your "--" is an invention of yours alone, not used by anyone else.
Verdy_p (talk) 23:50, 19 April 2017 (UTC)
Sure, it has been a while since I wrote this text. It was not questioned until you came, so there was no discussion. There is at least one existing implementation (independent of me) which proves that it works. Have you tried to implement your idea? You are making it more complicated to implement, more verbose and more error-prone. Why? Just to avoid the use of "--" in some cases? I see you made some improvements to your idea; however, there are still many things which at this point are unclear to me or dubious. Did I anywhere claim that "80-1-1" was a valid date? If you have this impression, I don't object to clarifying that this is indeed not a valid date. However, 1980-1-1 is a valid date. Also, in your latest version ( [2] ) you contradict your own claim (above) that a double hyphen is not needed at all.
This page was never meant to replace or compete with the opening hours date specification, they are for different purposes.
Looking again at the state of ISO 8601 and related standards, I notice that many things have changed since I wrote this proposal, so it is a good idea to look at it again.
At this point we should take a step back and regard your variant and mine as proposals. So please restore my original proposal, create your proposal and move both to the proposal space. Then we can both work on our proposals and probably agree on something. RicoZ (talk) 21:05, 21 April 2017 (UTC)
It is not "my idea" but the most common practice currently in OSM. Only you seem to support the "--" which is not compatible with ISO 8601 and causes ambiguities.
Yes, the "-" separator has been in use for a very long time (long before the undiscussed proposal you wrote in mid-2014 without checking).
And frankly I don't understand why you think your idea is simpler when in fact it was inconsistent, and also incompatible with opening_hours (which has been discussed a lot, also uses "-", and promises to stay compatible with simple date ranges written with "-"), not with the "--" that you alone support and which is used by almost no one (except those who read your page during the short period after you linked it to other pages).
But I've not removed your proposal; I just think it is not needed at all. "--" for date ranges has never been part of any standard. But many applications and users expect a single "-" (or an en-dash = half-cadratin hyphen, not an em-dash = cadratin hyphen).
Note also that "1980-1-1" is NOT valid in ISO 8601. Standard date ranges MUST start with a year anyway, and that year must be written with 4 digits. This means that the "-" separator in a date range can only come before the 4 digits of a year (or occur at the start or end for open-ended ranges, which are not ambiguous at all either). There's never any ambiguity for standard date ranges.
I only note that it could have been needed for recurrent monthly or yearly date ranges omitting years, but another solution was adopted for recurrent dates, i.e. the opening_hours specification, where months in this case use 3 letters, not digits, and where recurrent ranges of days in one month are also written with two digits (avoiding all confusion of these days with years, which must have 4 digits).
All practices adopted require a fixed number of digits (2 for months and days, 4 for years).
When you reverted my old edit, you wanted to erase a practice that had been in use for longer than your proposal, as if it did not exist, even though it was already used massively and caused absolutely no interpretation problem for anyone. Once again, it was not "my idea" but an idea already used by many people and already widely understood.
The only alternative I've seen in some tags is to add extra spaces around "-" in date ranges, but only for typographical purposes. In that case the "-" should also become a typographic en-dash, including in opening_hours. This is not needed: a typographic representation would depend on the language used, and would also reformat the dates in national formats instead of the ISO 8601 format, possibly with non-ASCII digits in Arabic, Farsi, Chinese, Japanese... For this reason even these spaces are not needed: a renderer that displayed those dates for a language/typographic convention would rewrite the value completely, but we won't do that in OSM tags, which are meant for technical use and should have a form that is easily parsable without unneeded variants. The simplest and most efficient parsers will prefer keeping only the strict ISO format for dates, and don't need extra characters such as a second "-", which may cause unmodified ISO 8601 parsers to break.
Verdy_p (talk) 01:21, 22 April 2017 (UTC)