Overpass API/Wording


The Overpass API inevitably can feel quite complicated due to its range of advanced features and functionality. However, bad wording of messages should not add an extra layer of confusion. For that reason I've set up this page with the intention that native (British) English speakers should review whether the wording makes sense. As a secondary aim, this page may help to decipher error messages.

I'll look at this page from time to time and fix messages to use better English if there is consensus on an improved message.

Please note that I won't keep this page up to date with every new version of Overpass API. This is not because I don't want to, but simply because the release process is already bloated. Please remind me to update this page if you get a message that is not on this list.

General concepts

Attic

This denotes all the OSM data that is no longer valid in an up-to-date dataset. Examples are old versions of objects or deleted objects.

Perhaps "outdated", "stale" or even "museum" might be more explanatory? -- User:Messpert 19:24, 26 July 2016

Deriveds, Areas

Most often you will receive a copy of valid OSM data from Overpass API. This is why the service is called a "read-only mirror". However, sometimes it is handy to get data that has been computed from OSM data instead. Areas are an example: when you want to restrict a search to a given location, or want to know which areas contain given objects, it doesn't matter whether these areas are represented in OSM as closed ways or as (multipolygon) relations. This is why you can use an area in that case, regardless of which kind of object it was derived from.

More examples for deriveds are on the way, but currently areas are the only class of deriveds.

Diff

This denotes a difference between two OSM datasets. One class of examples is the files published once per minute from planet.osm that allow you to keep the database up to date. The other class of examples is the difference files produced by Overpass API to represent the changes in the result of a query between two different timestamps.

meta

The data of an OSM object that doesn't represent what's on the ground but rather tells how this data got into the database. In detail, these are version number, change timestamp, changeset id, user id, and user name.

Recurse

In hindsight, it's a horrible misnomer. OSM uses IDs to link nodes, ways, and relations together. An important tool is to traverse from object to object, either by resolving references to the referred object or by tracing back to all the objects that link to a given object. For want of a better word, I called that process "to recurse".

Set

Queries are composed of multiple statements such that each statement has a narrow but well-defined job. To make this work it is necessary to somehow funnel the result of one statement into the next statement as input. This is the purpose of sets: during execution of the query, the interpreter keeps track of a list of named sets of objects. Objects in this sense are OSM objects or deriveds. Each statement then takes zero, one or more sets by name as input and produces output from that. This means that the interpreter will manipulate the content of the sets according to the statement's semantics.

Usually, you will have to deal with only one set, named by default _. If you use this set, you don't have to care about sets at all.

Statement

This is an atomic instruction for the interpreter to compute results from input sets or from everywhere. Examples are:
- a query statement. This produces a set of results from nothing.
- a recurse statement. This necessarily has a set as input and produces another set as output that depends on the input.
- a print statement. This needs a set as input to print it to the output stream but doesn't change the list of sets in the interpreter.
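
As a rough illustration of how statements and named sets fit together, the following Python sketch models the interpreter's list of sets as a dictionary. It is a toy model and not how Overpass API is implemented: the function names, the object representation, and the tiny stand-in database are all hypothetical.

```python
# Toy model of statements working on named sets.
# All names and data below are hypothetical, for illustration only.
DEFAULT_SET = "_"

# A tiny stand-in "database": node ids with tags, and ways referring to node ids.
NODES = {1: {"amenity": "pub"}, 2: {}, 3: {"amenity": "cafe"}}
WAYS = {10: [1, 2], 11: [2, 3]}


def query_nodes(sets, key, into=DEFAULT_SET):
    """Query statement: needs no input set, produces a set from the whole database."""
    sets[into] = {("node", i) for i, tags in NODES.items() if key in tags}


def recurse_node_to_way(sets, from_set=DEFAULT_SET, into=DEFAULT_SET):
    """Recurse statement: reads one set and writes the ways that refer to its nodes."""
    node_ids = {i for kind, i in sets.get(from_set, set()) if kind == "node"}
    sets[into] = {("way", w) for w, refs in WAYS.items() if node_ids & set(refs)}


def print_set(sets, from_set=DEFAULT_SET):
    """Print statement: reads a set and leaves the list of sets unchanged."""
    return sorted(sets.get(from_set, set()))


sets = {}
query_nodes(sets, "amenity")            # fills the default set "_"
recurse_node_to_way(sets, into="ways")  # reads "_", writes the set "ways"
print(print_set(sets, "ways"))
```

If every statement uses the default set, the `from_set` and `into` arguments can be left out entirely, which mirrors the remark above that you usually don't have to care about sets at all.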

Timestamp

There are a lot of events in the OSM universe that justify keeping a timeline for them.
1. The reality. Things get constructed or destroyed. This is usually encoded in tags like "start_date" or "end_date". Destroyed objects are called "historic".
2. The representation in OSM. Things get mapped or are deleted from the database. Deleted or obsolete data can be identified by version numbers.
3. The distribution of data. Data is represented as OSM datasets or databases, and changes are transmitted by diff files.

We are dealing here with the second timeline. It can happen that the geometry of a way and even the topology of a relation change, but no new version of the element is created. This is why Overpass API works instead based on timestamps: for a given second in time, it is clear enough what the geometry and topology looked like in the OSM database.

Timeout, Space limit, Rate limit

Overpass API is designed to work as a publicly available resource. This means that we have to develop a notion of what a fair share of use is. The timeout is the maximum number of seconds that a query may run. The user sets this timeout beforehand, and queries are accepted or rejected based on how much time they request. Similarly, the space limit is the maximum amount of RAM that a query proposes to use. In both cases, the server aborts a query if it exceeds its limit.

The rate limit restricts the number of parallel requests per IP. Parallel is not as strictly parallel as you might think, because a waiting time is added after each request. This waiting time depends on the server load, which allows higher quotas at lower load levels and vice versa.

Messages from the dispatcher

 Too Many Requests. Another request from your IP is still running.

The server thinks that too many requests specifically from your IP address have been completed in the last few minutes. One challenge with the wording of this message is that it refers to the availability of slots. If you have just been served a bunch of requests then you may get the message although no parallel request is running, because all slots are on waiting time.

The correct metaphor is that of a leaky bucket. When a query is completed, tokens are taken out of the bucket. Tokens are put back after a time that depends on the load of the server, until the bucket is full again.

An equivalent metaphor is: For each IP address there is a set of slots (e.g. 3 slots). An accepted query occupies a slot for its runtime and in addition for the waiting time after the query is completed.
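
The slot metaphor can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the dispatcher's actual code: the class and method names are hypothetical, the runtime is assumed to be known when the query is accepted, and the waiting time is passed in as a fixed number although on a real server it depends on the load.

```python
class SlotLimiter:
    """Toy model of per-IP slots (hypothetical; not the dispatcher's code)."""

    def __init__(self, slots=3):
        self.slots = slots
        self.busy_until = {}  # ip -> list of times at which a slot becomes free

    def try_accept(self, ip, now, runtime, waiting_time):
        """Return True and occupy a slot, or False if all slots are taken."""
        # Forget slots whose runtime plus waiting time has already passed.
        active = [t for t in self.busy_until.get(ip, []) if t > now]
        if len(active) >= self.slots:
            self.busy_until[ip] = active
            return False  # the situation reported as "Too Many Requests"
        # The slot stays occupied for the runtime and for the waiting time after it.
        active.append(now + runtime + waiting_time)
        self.busy_until[ip] = active
        return True


limiter = SlotLimiter(slots=3)
ip = "192.0.2.1"  # address from the documentation range, used as a placeholder
# Three short queries finish after one second, yet keep their slots while waiting.
first_three = [limiter.try_accept(ip, now=0, runtime=1, waiting_time=10) for _ in range(3)]
at_5 = limiter.try_accept(ip, now=5, runtime=1, waiting_time=10)
at_12 = limiter.try_accept(ip, now=12, runtime=1, waiting_time=10)
print(first_three, at_5, at_12)
```

The request at time 5 is refused even though no query is running any more, which is exactly the case where the wording "is still running" is misleading.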

 Gateway Timeout. Probably the server is overcrowded.

The global load on the server is so high that it has to reject the query. This happens more or less by chance, and the whole purpose of the aforementioned resource management is to prevent this from happening to low-volume users.

A potentially better phrasing is:

Gateway timeout. The server is probably too busy to handle your request.

 Query run out of memory in "STATEMENT" at line "LINE" using about "SIZE" MB of RAM.

The query has used, or is surely going to use, more memory than the maximum it has proposed to use.

 Query timed out in "STATEMENT" at line "LINE" after "SECONDS" seconds.

The query has used, or is surely going to use, more runtime than the maximum it has proposed to use.

Messages from the parser

The largest number of messages comes from the parser. This is no surprise because most of the input comes through the query language.

 An empty query is not allowed

The server has seen a request with no query attached.

 encoding error

The classification of all input problems that are severe enough to reject the query.

 encoding remark

The classification of all input problems that might help to explain problems. But the query is still executable.

 Input too long

The length of the input is bigger than 1 MB, the current query size limit.

A potential improvement may be:

Input too long. The maximum query length is 1 MB. --Gregoryw (talk) 19:55, 26 July 2016 (UTC)

 Invalid regular expression

The operating system has told us that the given character sequence is not a valid POSIX extended regular expression.

 Invalid UTF-8 character (value below 32) in escape sequence.

The whole XML ecosystem (including OSM data) doesn't allow any such Unicode characters, with only TAB, CR, and LF as exceptions. Hence, it is best to reject them right away. Otherwise they may act as building blocks for security holes in applications that use Overpass API.
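
A client can look for such characters before sending a query. The following Python sketch only illustrates the rule described above (code points below 32, except TAB, CR, and LF); the function name is hypothetical and this is not the server's own check.

```python
# TAB, CR, and LF are the only characters below 32 that are tolerated.
ALLOWED_CONTROL = {"\t", "\r", "\n"}


def find_forbidden_control_chars(text):
    """Return (position, code point) for each disallowed character below 32."""
    return [
        (i, ord(ch))
        for i, ch in enumerate(text)
        if ord(ch) < 32 and ch not in ALLOWED_CONTROL
    ]


print(find_forbidden_control_chars("a\tb\x07c"))
```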

 Element "STATEMENT" must not contain text.

This applies to input in the XML query language only. XML allows text other than whitespace between elements, but this text isn't evaluated by Overpass API. Hence, any such text is rejected at input time.

 Parameter "jsonp" must contain only letters, digits, or the underscore.

This mitigates a security issue with jsonp. To prevent an attacker from injecting JavaScript into a client application, any service that returns jsonp should ensure that the wrapper function is a proper simple function name with parentheses and nothing else.
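
A check of this kind could look like the following Python sketch. It is a guess at the rule from the wording of the message, not the server's code: the function name is hypothetical, and reading "letters" as ASCII letters only is an assumption.

```python
import re

# Assumption: "letters" means ASCII letters; the message itself does not say.
_CALLBACK = re.compile(r"[A-Za-z0-9_]+")


def is_safe_jsonp_callback(name):
    """True if the name consists only of letters, digits, or underscores."""
    return _CALLBACK.fullmatch(name) is not None


print(is_safe_jsonp_callback("handleResult_1"))  # plain function name
print(is_safe_jsonp_callback("alert(1)//"))      # attempt to smuggle in code
```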

 Parameter "template" must not contain slashes.

If you provide a CGI parameter with a template name then this parameter must not contain slashes. This constraint avoids attacks on the host server.

 Parse error at line %lu:

This applies to input in the XML query language only. It's a message from the XML parser that the input is not well-formed XML. Details follow after the colon.

 Unexpected end of input.

The syntax is organized in blocks. For example, statements are blocks on their own. But some statements can contain sub-statements. This message is reported when the input ends but one or more blocks are incomplete w.r.t. their syntax.

 Unknown tag "STATEMENT" in line "LINE"

This applies to input in the XML query language only. There is a valid XML tag with unknown name where an XML tag denoting a well-known statement is expected.

 Your input contains only whitespace.

Apart from some whitespace, the server hasn't found anything in your request that it can treat as a query.

 Your input is empty.

The server hasn't found anything in your request that it can treat as query.

QL specific issues

These are messages that come from the QL parser. Hence, they can only appear if the server tries to parse the query as QL.

 A bounding box needs four comma separated values.

The server has found an expression in parentheses that contains more than one number. This can only end up with something valid if it is a bounding box. A valid bounding box must contain exactly four numbers, separated by commas.

Very minor improvement:

A bounding box needs four comma-separated values. --Gregoryw (talk) 20:02, 26 July 2016 (UTC)

 A recursion from type 'X' produces Y.

The server has found a valid recursion expression in parentheses. But the result of the expression has a different type than the type of this query statement.

 X expected - Y found

At most stages of the parsing, there is quite a short list of what is acceptable as the next token. The parser will tell you as Y the type of what it has parsed as the next token. X is the type or list of types that the parser would have been able to process at this point. The individual entries can be explicitly mentioned characters or tokens or something from the following list:

 Positive integer
 Floating point number
 CSV format specification
 CSV output header line (true or false)
 CSV separator character
 List of Coordinates

These are the possible types besides explicitly listed characters or tokens.

 Invalid parameter for print

The parser is inside a statement that started with out and expects another parameter token, but the token it found is not an accepted parameter value.

 Latitudes in bounding boxes must be between -90.0 and 90.0.
 Longitudes in bounding boxes must be between -180.0 and 180.0.

The parser tries to interpret this parenthesized expression as a bounding box, but the numbers it found are outside the range for degrees.
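
The bounding box checks behind these messages can be approximated on the client side. The Python sketch below is an illustration only: the function name is hypothetical, the order south, west, north, east is assumed, and the input is assumed to consist of numbers (anything else raises a ValueError here, whereas the real parser reports its own message).

```python
def check_bbox(text):
    """Return the matching message for a bad bounding box, or None if it passes."""
    parts = [p.strip() for p in text.split(",")]
    if len(parts) != 4:
        return "A bounding box needs four comma separated values."
    # Assumed order of the values: south, west, north, east.
    south, west, north, east = (float(p) for p in parts)
    if not all(-90.0 <= lat <= 90.0 for lat in (south, north)):
        return "Latitudes in bounding boxes must be between -90.0 and 90.0."
    if not all(-180.0 <= lon <= 180.0 for lon in (west, east)):
        return "Longitudes in bounding boxes must be between -180.0 and 180.0."
    return None


for candidate in ("50.7,7.1,50.8,7.2", "50.7,7.1,50.8", "95,7.1,50.8,7.2"):
    print(candidate, "->", check_bbox(candidate))
```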

 Unknown query clause

The parser has found an expression in parentheses that starts with a literal, but that literal is not the name of a query clause.

 Unknown type

The parser tries to interpret the expression as a query statement. But the statement doesn't start with node, way, or relation.

Bad parameter values

This class of messages means that a single parameter of a single statement is already so odd that it cannot be processed, regardless of whether the rest of the query is well-formed or not.

 For the attribute "X" of the element "Y" the only allowed values are Z.

The statement Y requires for the parameter X a value as described by Z. Z can be a list of possible values or a type (like positive integers, non-empty strings, or nonnegative floats).

 ... the only allowed values are floats between MIN and MAX.
 ... the only allowed values are floats between MIN and MAX or an empty value.

These are two specializations that exist in addition to the aforementioned message.

 For the attribute "bounds" of the element "polygon-query" an even number of float values must be provided.
 For the attribute "bounds" of the element "polygon-query" at least 3 lat/lon float value pairs must be provided.

And two more specific format requirements for a long enough list of coordinates to form a polygon.
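
The two requirements can be checked in this order, as the following Python sketch shows. It assumes that the bounds attribute holds whitespace-separated numbers in lat/lon order; the function name is hypothetical and the returned strings are shortened versions of the messages above.

```python
def check_polygon_bounds(bounds):
    """Return a shortened message for bad polygon bounds, or None if they pass."""
    # Assumption: the values are separated by whitespace.
    values = [float(v) for v in bounds.split()]
    if len(values) % 2 != 0:
        return "an even number of float values must be provided"
    if len(values) // 2 < 3:
        return "at least 3 lat/lon float value pairs must be provided"
    return None


print(check_polygon_bounds("50.7 7.1 50.7 7.2 50.8 7.15"))  # a triangle
print(check_polygon_bounds("50.7 7.1 50.7"))                # a value is missing
print(check_polygon_bounds("50.7 7.1 50.7 7.2"))            # only two pairs
```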

 The attribute "X" must be empty or contain a timestamp exactly in the form "yyyy-mm-ddThh:mm:ssZ".
 The attribute "X" must contain a timestamp exactly in the form "yyyy-mm-ddThh:mm:ssZ".

The message that an attribute should have been a timestamp but contains something else.
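
The required form can be tested with a regular expression, as in this Python sketch. It checks the shape of the string only and does not verify that the date exists in the calendar; whether the server does more than that is not covered here. The function name and the allow_empty flag are hypothetical.

```python
import re

# Shape "yyyy-mm-ddThh:mm:ssZ": digits in fixed positions, literal "T" and "Z".
_TIMESTAMP = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z")


def is_valid_timestamp(value, allow_empty=False):
    """True if the value has exactly the required form (or is empty, if allowed)."""
    if value == "":
        return allow_empty
    return _TIMESTAMP.fullmatch(value) is not None


print(is_valid_timestamp("2016-07-26T19:24:00Z"))
print(is_valid_timestamp("2016-07-26 19:24:00"))
print(is_valid_timestamp("", allow_empty=True))
```

The allow_empty flag distinguishes the two variants of the message: one for attributes that may be left empty and one for attributes that must be set.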

 The only allowed values for "augmented" are an empty value or "deletions".

Yet another message to refuse the value of the given parameter as unsuitable.

 Unknown attribute

For the given statement the given attribute is not known. Can only happen in the XML query language.

Bad statement tree

These messages inform that the created tree of statements would not make sense. The query is rejected in all these cases.

 A set difference takes exactly two substatements: the set of elements to copy to the result minus the set of elements to leave out in the result.

A difference statement with a different number than two substatements has been provided.

 difference takes always two operands

Same thing, different context.

Better phrasing:

difference always requires two operands --Gregoryw (talk) 20:11, 26 July 2016 (UTC)

 X cannot be subelement of element Y

The combination of this type of statement and this type of substatement doesn't make sense.

 "newer" can appear only inside "query" statements.

A specific variant of this, because newer is very restricted in what its parents can be.

 Nesting of statements limited to 1023 levels

Statements cannot be nested more than 1023 levels deep to avoid a stack overflow in processing.

A minor tweak:

Nesting of statements is limited to 1023 levels --Gregoryw (talk) 20:17, 26 July 2016 (UTC)

Bad combinations

This class of messages is emitted when the given combination of settings doesn't make sense or the required functionality isn't implemented.

 A regular expression for a key can only be combined with a regular expression as value criterion

If you specify a has-kv statement that searches for keys with a regular expression then you need to search for values with a regular expression as well. This has no semantic reason, but the other execution paths aren't implemented.

 In the element "has-kv" the attribute "regk" must be combined with "regv".
 In the element "has-kv" regular expressions on keys cannot be combined with negation.

Same problem as before but a different variant.

 A role can only be specified for values "..."

This applies to recurse statements only. If the type of recurse involves relations then the resolving of memberships can be restricted to a specific role. This message is returned if a type of recurse that doesn't involve relations is combined with a role restriction.

 Exactly one of the two attributes "name" and "uid" must be set.

If you use a user statement then you need to provide exactly one of either the user name or the user id. Each of them is already unique, hence combining both criteria is either redundant or a contradiction.

 In the element "has-kv" only one of the attributes "k" and "regk" can be nonempty.
 In the element "has-kv" only one of the attributes "v" and "regv" can be nonempty.

Either you restrict the accepted values by an exact value to match or you restrict the accepted values by a regular expression.

 In the element "has-kv" the attribute "modv" can only be empty or set to "not".

In a has-kv statement you can specify whether to apply the filter positively (keep only matches) or negatively (discard matches, keep everything else). For positive filtering leave modv empty, for negative filtering set it to not.

 The value of attribute "n" of the element "X" must always be greater or equal than the value of attribute "s"

In a bounding box, the latitude of the north edge must always be greater than or equal to the latitude of the south edge.

 The attribute "augmented" can only be set if the attribute "from" is set.

This applies only to the osm-script statement in the XML representation. The attribute augmented only makes sense when you query for a diff. A diff necessarily requires the attribute from.

 The attributes "since" and "until" must be set either both or none.

This applies to the changed statement. This is to select objects by whether they changed within a time range. That requires the start and end of the time range.

Explanatory texts

 Node X is contained in an odd number of ways.

This is a message from make-area that finds a data error in a relation. As a result, no area is generated for this relation. The particular error condition is that node X is first or last node in an odd number of ways. That always means that the relation's geometry is broken.
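
The error condition can be reproduced with a simple count of way endpoints. The Python sketch below is an illustration of the condition as described above, not the code of make-area; the function name and the data layout (a mapping from way id to its ordered node ids) are hypothetical.

```python
from collections import Counter


def nodes_with_odd_endpoint_count(ways):
    """Return the nodes that are first or last node in an odd number of ways.

    ways: mapping of way id -> list of node ids in order (hypothetical layout).
    """
    counts = Counter()
    for refs in ways.values():
        if refs:
            counts[refs[0]] += 1   # first node of the way
            counts[refs[-1]] += 1  # last node; a closed way counts its node twice
    return sorted(node for node, count in counts.items() if count % 2 == 1)


closed_ring = {100: [1, 2, 3], 101: [3, 4, 1]}  # the two ways form a ring
broken_ring = {100: [1, 2, 3], 101: [3, 4, 5]}  # the ring is not closed
print(nodes_with_odd_endpoint_count(closed_ring))
print(nodes_with_odd_endpoint_count(broken_ring))
```

In a properly closed ring every endpoint is shared by two ways, so the list is empty; a gap leaves two nodes with an odd count.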

 Node X is not contained in set SET
 Node X referred by way Y is not contained in set SET

This is a message from make-area that finds a data error in a relation. As a result, no area is generated for this relation. The particular error condition is that node X is not found in the input. Because this input usually comes from a preceding recurse, this means that the database has an inconsistency.

 Please enter your query and terminate it with CTRL+D.

This is a reminder from the command line tool osm3s_query to enter input on stdin.

In the returned payload

 Areas based on data until:

This is the explanation for the timestamp of the received areas.

 Data included until: 

This is the explanation for the timestamp of the received OSM data.

 The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.

This is a reminder of the license of all OSM data.

 OSM3S Response

The title of the response. This is a leftover from the time before Overpass API was baptised Overpass API.

 No {{node:..}} found in template.
 No {{way:..}} found in template.
 No {{relation:..}} found in template.
 No results found.
 Template not found.

There is something wrong with the keywords in the given template. Templates were a feature to produce a result as formatted HTML. But as they never gained traction, I would like to phase them out.

 unknown object

This term should not appear. But it is a fallback if the CSV output engine gets an object of a type other than node, way, relation, or area.

Parser transformations

These messages don't mean that a fatal error has happened. It's something else: while Overpass API tries to be quite permissive about what is acceptable as input, it has to do some normalizing. This means adding extra lines or wrapping tags, or decoding optional encodings. As a result, your query may be processed by the parser in a different form than you intended because a transformation got in your way.

 No input found from GET method. Trying to retrieve input by POST method.

Queries can be submitted either by appending them to the URL or by sending them as the payload of the request. The former is called GET, the latter is called POST. The parser probes the URL first. If nothing is found there then this message is added and the parser checks whether a payload has been sent.

 Only whitespace found from GET method. Trying to retrieve input by POST method.

A slight modification of the preceding message. The URL did contain some whitespace but no query. The parser then has a look at the payload.

 Found input from GET method. Thus, any input from POST is ignored.

At the end of the URL something has been found that looks like it can be fed to the parser. In that case the server doesn't look at the request payload, if any.

 The first non-whitespace character is '<'. Thus, your input will be interpreted verbatim.

The parser has got a literal less-than character as the first non-whitespace character. This is taken as an indication that we have unencoded XML as input.

 The server now removes the CGI character escaping.

The server thinks that you or your client has encoded the URL according to the rules of CGI encoding and tries to decode that back to the plain query before parsing.
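
What such an encoding looks like, and how it is undone, can be shown with Python's standard library. This is a client-side illustration only; the example query is made up, and the sketch says nothing about how the server implements the decoding.

```python
from urllib.parse import quote, unquote

# A made-up query for illustration.
query = 'node["amenity"="pub"](50.7,7.1,50.8,7.2);out;'

# Percent-encode every character that is not a letter, digit, or one of "_.-~".
encoded = quote(query, safe="")

# Decoding restores the plain query that the parser will see.
decoded = unquote(encoded)

print(encoded)
print(decoded == query)
```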

 Your input contains an 'osm-script' tag. Thus, a line with the datatype declaration is added. This shifts line numbering by N line(s).

A proper XML document requires a document type declaration before the first tags. This is a boilerplate line. You haven't put such a line in front of the XML, hence the server will generate it for you.

 Your input starts with a tag but not the root tag. Thus, a line with the datatype declaration and a line with the root tag 'osm-script' is added. This shifts line numbering by N line(s).

A proper XML document requires after the document type declaration a single tag that frames all other tags, the so-called root tag. This is almost a boilerplate notation, too. But you could at least have tweaked some parameters. You haven't and you haven't even added the tag. Hence the server will generate the document type declaration and a root tag with default settings for you.

Messages from the backend

Like the parser, the backend has a lot of more or less informational messages that do not necessarily mean a fatal error. But in case of a fatal error, they may prove useful to figure out the real source of trouble.

 File_Blocks_Index: Unsupported index file format version
 Random_File_Index: Unsupported index file format version

You are trying to read a format variant that the software version you are using cannot read. Try to use a newer version or a version with more comprehensive feature support.

 File error caught: ...

This may or may not be fatal. It depends on what is in the dots, and that is relayed from your file system.

 LZ4: output buffer too small during compression
 LZ4: output buffer too small during decompression

If you use the LZ4 compression method it might have screwed up itself. It's still sort of experimental.

 Unknown argument: ...

You have provided an argument on the command line that the called tool does not support. The command will instead list all acceptable arguments. Thus you can check whether you have made a typo.

 X has changed at timestamp TIMESTAMP in two different diffs.
 Version VERSION has a later or equal timestamp (TIMESTAMP) than version VERSION (TIMESTAMP) of X.
 X appears multiple times at timestamp TIMESTAMP

These are purely informational and appear only if you have turned on attic data. They tell you that the main OSM database has produced a constellation of data such that a version of an object is essentially inaccessible. Normally, it is guaranteed that every state that you have ever seen can be reproduced later with the right timestamp as argument. These are the exceptions where a version gets buried because a later version has the same or even an earlier timestamp.
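
The condition behind the second of these messages can be sketched as follows. This Python example is an illustration only and compares neighbouring versions of a single object; the function name and the data layout are hypothetical, and timestamps are assumed to be strings of the form "yyyy-mm-ddThh:mm:ssZ", which compare correctly as plain text.

```python
def find_buried_versions(versions):
    """Return (older, newer) version pairs where the older timestamp is not earlier.

    versions: list of (version_number, timestamp) pairs for one object.
    """
    ordered = sorted(versions)  # sort by version number
    buried = []
    for (older, older_ts), (newer, newer_ts) in zip(ordered, ordered[1:]):
        if older_ts >= newer_ts:
            # No timestamp can select the older version any more.
            buried.append((older, newer))
    return buried


history = [
    (1, "2013-01-01T00:00:00Z"),
    (2, "2013-01-01T00:00:00Z"),  # same second as version 1
    (3, "2013-02-01T00:00:00Z"),
]
print(find_buried_versions(history))
```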

Fatal errors

While the details of these messages may help to figure out the cause of the problem, it is almost certain that your database is damaged beyond repair if you get one of them.

 X used in Y not found.

The OSM element Y (a way or a relation) has a reference to an OSM object that is not in the database. There are a few exceptions where the main database did indeed deliver inconsistent objects. But unless you process data from late 2012, it is very likely that the referred object got lost from the database, which means that the data in the database is incomplete.

 Bad geometry for way X

Same thing as before, but in particular for ways. A referred node is missing, hence the server cannot figure out the geometry of the way from the given data.

 Block_Backend::1: one index expected - several found.
 Block_Backend: an item's size exceeds block size.
 Block_Backend: index out of range.
 File_Blocks_Index: bad pos in index file
 File_Blocks::read_block: Index inconsistent
 Random_File: bad pos in index file

The database structure doesn't make sense. The database backend is organized as a huge file divided into blocks and a list of the content of all blocks. The server didn't find in the given block the data that the list of content announced, or other assumptions of the index file format are violated by this file. Sometimes, this is because you have run an update on a database with an extra db-dir parameter while a dispatcher was still running. In that case, the two processes don't know of each other and are likely to clash.

 X cannot be expanded at timestamp TIMESTAMP

The backend does a simple compression of attic objects: it only stores a delta to the next newer version. Due to bugs in the software this might end up in trouble if for some reason the reference object has been overwritten. Please always report these incidents to the bug tracker on GitHub; there is always an as yet undetected bug behind them.

Progress messages

These messages are printed by the data update mechanism in the backend and help to check that it is still working properly.

 elapsed node X
 elapsed relation X
 elapsed way X

The tool is going to flush the data into the files right now and has parsed up to element X so far.

 finished reading nodes. 
 finished reading relations. 
 finished reading ways. 

The tool has finished reading all the available data of type X.

 Flushing to database .

The tool now writes the data to the files.

 Reading XML file ...

The tool parses the input which is always an XML file.

 Reorganizing the database ...

The tool has produced multiple intermediate data flushes and now merges them into a single file.

 Update complete.

The tool has completed the whole update successfully.