Talk:Overpass API/Overpass QL

From OpenStreetMap Wiki
Jump to: navigation, search
The old posts can be found on Talk:Overpass API/OverpassQL

Attic data ("date")

Example: The relation of Lake Nasser with the ID 280282. The changeset for the relations state contained in the first ODbL planet file was opened on 2012-06-02T13:23:36Z and closed on 2012-06-02T13:25:49Z. So querying for the relation for 2012-06-02T13:23:00Z returns nothing while a query for 2012-06-02T13:26:00Z returns the relation for this date. (Why a query for 2012-06-02T13:23:40Z doesn't work but for 2012-06-02T13:24:36Z does although at both times the change was not complete and what the query will return a more knowledgeable person may explain.) Malenki (talk)

A changeset is not a "db transaction", where updates become only visible to the rest of the world, once you close it. In fact, any change you upload will be become immediately visible for the rest of the world and you can still add further changes later on to the same changeset! For Overpass API, only the object timestamp is relevant. The changeset open/close timestamps can be ignored for the purpose of finding out, if an object is returned or not. Mmd (talk) 17:52, 16 February 2015 (UTC)

If your query asks for an object in a state prior to what this planet file contains the API will return nothing.

It would return the state contained in the first planet. Due to a DB inconsistency, will indeed return nothing at this time, but as soon as the DB is rebuilt, you will get the state as in the first planet again (tested on my local instance). Mmd (talk) 18:36, 16 February 2015 (UTC)

Key/value matches regular expression (~"key regex"~"value regex")

Is it possible to simplify the syntax for filters using regexps for keys, but only equality for the value ?

A filter such as:

[~"^(.*_)name(:(en(-.*)?)?$"~"^Dummy \(Region\)$"]

could be more simply written as

[~"^(.*_)name(:en)?$"="Dummy (Region)"]

without having to escape some characters in the value and adding the extra "ˆ" and "$" bounds. I think that Overpass could perform itself this transform to the equivalent regexp.

Also, there should exist a way to query names by language (using a simpler mechanism, comparable to the lang() selector in CSS, i.e. using BCP47 language code resolution rules). The query above would become:

[~"^(.*_)name$":lang("en")="Dummy (Region)"]

Or even more simply (if we don't care about "alt_ame", "old_name", "loc_name" and so on that this simpler form will not match):

["name":lang("en")="Dummy (Region)"]

And Overpass would transform the key regexp to match the ":languagecode" key suffix.

Still Overpass will infer the regexp, and there is still no support for excluding elements that match the specified name. Effectively the "!=" comparator is still not supported but could be supported using an internally generated difference of two sets, or an union of queries for each key value, depending on statistics:

When only one or or a few selective keys are matching, using an union may be a lot faster than using a difference between two large sets; it will be always faster if the input set has only one key matching the key regexp, or even no key at all in which case the output set will imediately be empty : the Overpass query engine can easily generate for each input set an index of all keys that are effectively used in it, possibly even with a counter for each one in order to evaluate some selectivity independantly of their values).

Thanks. — — Verdy_p (talk) 17:42, 14 June 2015 (UTC)

Please read this Github ticket on regular expression support. Some of the points you're proposing are already mentioned there but got postponed. There's no more recent status on this, it simply hasn't been implemented due to other priorities/lack of time and/or resources, etc. I would suggest to create additional Github tickets for everything which is not covered in that ticket. To be honest, the :lang("en") looks like a very special case and is not very likely to be adopted at all. Just give it a try on Github. Mmd (talk) 18:14, 14 June 2015 (UTC)
The lang() feature is very documented for use in XML, CSS, and very convenient for localisation (this includes performing Overpass QL to return data exposed to users on the map or hen clicking on a marker to sho details, ithout having long list of less relevant languages for translations); it handles correct fallbacks using BCP 47 resolution rules, not having to build complex regexps for them, and in case of multiple matches, it uses the most specific one (it acts just like a max() aggregator to return a single value matching the maximum specificity of the selector). And it reduces a lot the volume of data returned and loaded from the net, to be parsed by the client in a more memory-limited environment such as 32-bit browsers or smartphones. It saves also storage resources on the server for the temporary results that can be discarded much faster as the data ill be loaded by clients also faster. — Verdy_p (talk) 12:57, 5 March 2017 (UTC)

Closed Way

Is there a possibility to differenciate closed and unclosed ways?--Jojo4u (talk) 20:15, 30 October 2015 (UTC)

Broken discussion link

There is a link to about memory usage. Unfortunatelly the link (Gmane) is broken. Do anybody knows another place to read it? --naoliv (talk) 10:50, 31 January 2017 (UTC), Mmd (talk) 12:51, 31 January 2017 (UTC)
Unfortunately, the links above are pointing to the full archive. As a result, you don't know what to do if you experience error messages like "runtime error: Query run out of memory using about 2048 MB of RAM.",. If there is a link to the specific thread, I would gladly include a summary here. Zstadler (talk) 06:19, 24 March 2017 (UTC)
Rephrased for clarity Zstadler (talk) 09:53, 16 October 2017 (UTC)

Valid hours, minutes, and seconds in date() and is_date()

The Date Check and Normalizer section says:

  • The hour, if present, must be less or equal to 60,
  • The minutes and seconds, if present, must be less or equal to 60.

Can someone confirm that this should be changed to:

  • The hour, if present, must be less than 24.
  • The minutes and seconds, if present, must be less than 60.

Zstadler (talk) 09:50, 16 October 2017 (UTC)

Actually the seconds may be equal to 60 (or even 61 very rarely) when these are "leap seconds" which may be added on the last minute of the last day of January or December within the UTC calendar time, and consequently in all timezones aligned with a static offset from UTC, including for daylight. (It is also possible that these same days may not have a second number 59 or even 58 exceptionally because one or two seconds may be substracted too on these dates). This occurs to maintain the effective average duration of days so that the observed zenith of the sun will remain within at most 1 or 2 seconds of midday, for the next 6 months.
As the rotation of Earth is not regular, and Earth rotation is also slightly increasing (very slowly), it is unavoidable: this occurs because the official duration of the second is no longer aligned to a constant 1/86400 fraction of the observed solar day on Earth, but on a constant number of pulsations in coordinated reference atomic clocks. For physics it is extremely important now to have very precise measurements of the second, independantly of the gross irregularities of Earth rotation.
After some major earthquakes on Earth, there's a sudden variation of the duration of the day, and slight changes of position of the rotation axis. As well other planets (notably the tide effect of the Moon and major eruptions from the Sun) are affecting the Earth rotation, as well as when the Earth passes through clouds of dust or is impacted by meteorites (notably in August) and this also results in slight changes of the furation of the day and deceleration of the Earth rotation (so days are gradually become longer over time, but after some natural events, the rotation may accelerate again and we could have shorter days.
So a UTC time "23:59:60.00" is valid on 30 June and 31 December and is one second (sometimes 2 seconds!) before "00:00:00.00" on the next Gregorian day...
If necesssary the UTC standard boday may need to add/substract leap seconds more often (e.g. also at end of March or September) but given the current margins (1 or 2 seconds of difference of time between UTC clock and observed solar clock), it should be extremely exceptional. These adjustments are announced several months before they occur (you cannot predict when these adjustments will be made). — Verdy_p (talk) 21:11, 16 October 2017 (UTC)

I think the documentation exactly describes the implementation (year should be >= 1000, rather than > 1000). In case you wonder, here's the respective link to the implementation: Mmd (talk) 09:41, 17 October 2017 (UTC)
@Zstadler: Unfortunately, some of your rephrasing introduced semantic changes and as such no longer correctly represent, how a specific function operates. Also, I wouldn't recommend to change names of some functions (e.g. Union vs. Unique), as this will create inconsistencies between the code and the documentation. I already started fixing them / reverted them to the original documentation. Please note that the part you were editing is automatically generated from the source code. BTW:It would have been much easier to work on the spelling thing first, make much smaller changes. Mmd (talk) 12:31, 17 October 2017 (UTC)
While a few people read or write the code, many more people read the documentation and use the Overpass QL. The use of "Union" for the u() aggregator is therefore misleading many more people. In the usual sense, a "union" contains many different items, while the u() aggregator produced a single result. Moreover, it returns the diagnostic text "< multiple values found >" when any of its arguments are different. I also doubt the code developers can be confused by the documentation... Zstadler (talk) 17:10, 17 October 2017 (UTC)
I think it would make more sense to propose this change to Roland, and see if he's willing to change the terminology (=code + wiki), rather than just go ahead and do those changes in the wiki. It's not meant to be restrictive, but rather to avoid confusion in bug reports and conversations further down the road, as devs probably won't have an idea what "unique" refers to in the first place and then need to first dig through all the changes people made to the wiki. Mmd (talk) 17:38, 17 October 2017 (UTC)

Reverting and accepting changes

@Mmd: It would have been nice if the page says it is automatically generated and how to contribute. Zstadler (talk) 12:48, 17 October 2017 (UTC)

Sure: It really only affects the part you were editing, namely chapters 8 + 9. All other chapters don't include automatically generated parts. As an example, please see . I don't know exactly which script is used by Roland, best is to ask him. Mmd (talk) 12:58, 17 October 2017 (UTC)
I would be happy to send a pull request and discuss on GitHub the issues you found. Zstadler (talk) 14:04, 17 October 2017 (UTC)
Ok, great. In any case, I would highly recommend to get in touch with Roland via email and check what his plans are regarding this "in code" documentation, e.g. if/when/how changes you propose via Pull Request will be re-published on the Wiki page. Mmd (talk) 16:48, 17 October 2017 (UTC)

I'm not that strict on changes. I do understand what Mmd says and agree with the objectives. It is probably a less-than-optimal idea to change documentation without changing to code because the code may have error messages relying on the notions in documentation. Or the notion may have already been used elsewhere in the wiki.

However, I will not steamroll the content of this page (not even chapter 8 + 9). When there is time for the next release then I will do a diff on the content to the content copied from the source code and merge back to the source code the changes I have found.

In the given case, I suggest to

  • leave a comment in the wiki: "Suggestion: rename the aggregator union to unique"
  • write a hint in the clear text like "not to be confused with the statement union"

Thank you for the suggested change, it is by the way a good idea. -- drolbr

Regular expression dialect

There are several different regular expression dialects. It would be nice to have a link to the syntax of the regular expression package used by Overpass. Zstadler (talk) 12:07, 20 October 2017 (UTC)

Currently, POSIX extended regular expressions are being used by default, see Mmd (talk) 12:15, 20 October 2017 (UTC)
Thanks! It seems like POSIX extended regular expressions would not allow the use of Unicode hex values instead of the characters themselves... Zstadler (talk) 14:03, 20 October 2017 (UTC)
Right, there's an open issue for this: Mmd (talk) 14:17, 20 October 2017 (UTC)