Overpass API/Overpass QL

From OpenStreetMap Wiki

Overview

Overpass QL is the second query language for the Overpass API and was designed as an alternative to Overpass XML. It has a C-style syntax: the whole query source code is divided into statements, and every statement ends with a semicolon. It has imperative semantics: the statements are processed one after another and change the execution state according to their semantics.

The execution state consists of the default set, potentially other named sets, and for block statements a stack. A set can contain nodes, ways, relations and areas, also of mixed type and of any number. Sets are created as result sets of statements and are read by subsequent statements as input. Unless you specify a named set as input or result, all input is read from and all results are written to the default set, named _ (a single underscore). Names for sets may consist of letters, digits and the underscore but must not start with a digit.
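The naming rule for sets can be checked mechanically. The following is a minimal Python sketch; the function name is_valid_set_name is hypothetical, and it assumes that "letters" means ASCII letters, which the text above does not state explicitly.

```python
import re

# Assumption: "letters" means ASCII letters only.
_SET_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_set_name(name: str) -> bool:
    """Return True if name consists of letters, digits and underscores
    and does not start with a digit."""
    return _SET_NAME.fullmatch(name) is not None
```

With this sketch, "_" and "a1" are accepted, while "1a" and the empty string are rejected.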

There are several different types of statement. You almost always need the print statement, which is called an action, because it has an effect outside the execution state (the output). The other statements are grouped into

  • Standalone queries: These are complete statements on their own.
  • Clauses: They are always part of a query statement and contain the interesting selectors and filters.
  • Block statements: They group statements and enable disjunctions as well as loops.
  • Settings: Things like output format that can be set once at the beginning.

Sets

Overpass QL can work with sets. By default, everything is read from and sent to the default set "_".

To send something to a different set, use the "->" syntax. For example

  (node[name="Foo"];)->.a;

will store all nodes with name=Foo in set "a".

To select something from a set, append ".a" to the type specifier.

  node.a[amenity=foo];

will return all nodes in the set "a" that have the tag amenity=foo.

Block statements

union

The union block statement is written as a pair of parentheses. Inside the union, any sequence of statements can be placed, including nested union and foreach statements.

It takes no input set. It produces a result set. Its result set is the union of the result sets of all substatements, regardless of whether a substatement has a redirected result set or not.

Example:

  (node[name="Foo"];way[name="Foo"];);

This collects in the first statement all nodes that have a name tag "Foo" and in the second statement all ways that have a name tag "Foo". After the union statement, the result set is the union of the result sets of both statements.
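The rule that a union collects the results of all substatements, even redirected ones, can be modelled with plain Python sets. This is only a toy model: elements are (type, id) pairs, the ids are made up, and the function name run_union is hypothetical.

```python
def run_union(substatements):
    """Model a union block.

    substatements is a list of (elements, target_set_name) pairs,
    one pair per substatement inside the parentheses.
    """
    sets = {}
    combined = set()
    for elements, target in substatements:
        sets[target] = set(elements)  # each substatement writes its own result set
        combined |= set(elements)     # the union collects it regardless of the target
    sets["_"] = combined              # here the union itself writes to the default set
    return sets


sets = run_union([
    ({("node", 1)}, "a"),   # a substatement redirected to set a
    ({("way", 10)}, "_"),   # a substatement writing to the default set
])
```

After the call, sets["_"] holds both elements, and sets["a"] still holds the redirected substatement result.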

The result set of the union statement can be redirected with the usual postfix notation:

Example:

  (node[name="Foo"];way[name="Foo"];)->.a;

Same as the preceding example, but the result is written into the variable a.

difference

The difference block statement is written as a pair of parentheses. Inside the difference statement, exactly two statements must be placed, and between them a minus sign.

It takes no input set. It produces a result set. Its result set contains all elements that are result of the first substatement and not contained in the result of the second substatement.

Example:

  (node[name="Foo"]; - node(50.0,7.0,51.0,8.0););

This collects all nodes that have a name tag "Foo" but are not inside the given bounding box.
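In terms of plain Python sets, the difference statement behaves like set subtraction. The sketch below uses made-up element ids and a hypothetical helper named difference; it models only the semantics, not the server.

```python
# Toy model: an element is a (type, id) pair; the ids are made up.
named_foo = {("node", 1), ("node", 2), ("node", 3)}  # stands for node[name="Foo"]
in_bbox = {("node", 2), ("node", 4)}                 # stands for node(50.0,7.0,51.0,8.0)


def difference(first, second):
    """Keep the elements of the first result that are not in the second."""
    return {element for element in first if element not in second}


result = difference(named_foo, in_bbox)
```

Here result contains nodes 1 and 3: they carry the name but are outside the bounding box.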

The result set of the difference statement can be redirected with the usual postfix notation:

Example:

  (node[name="Foo"]; - node(50.0,7.0,51.0,8.0);)->.a;

Same as the preceding example, but the result is written into the variable a.

foreach

The foreach block statement is written as the keyword foreach, followed by a pair of parentheses. Inside these parentheses, any sequence of statements can be placed, including nested union and foreach statements.

It takes an input set. It produces no result set. The foreach statement loops over the content of the input set, once for every element in the input set.

Example:

  way[name="Foo"];foreach((._;>;);out;);

For each way that has a name tag with value "Foo", this prints the nodes that belong to the way, immediately followed by the way itself. In detail, the result set of way[name="Foo"] is taken as input set. Then, for each element in this input set, the loop body is executed once. Inside the loop body, the union of the element and its nodes is taken. Then this union is printed.
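The control flow of this example can be sketched in Python. The data in way_nodes is hypothetical, and the function foreach_way_with_nodes is only a model of the loop, not of the server implementation.

```python
# Hypothetical toy data: way id -> ids of its member nodes.
way_nodes = {10: [1, 2], 11: [2, 3]}


def foreach_way_with_nodes(input_way_ids):
    """Model of foreach((._;>;);out;) for an input set of ways."""
    printed = []
    for way_id in input_way_ids:                               # one iteration per element
        loop_set = {("way", way_id)}                           # "._": the current element
        loop_set |= {("node", n) for n in way_nodes[way_id]}   # ">": its member nodes
        printed.append(sorted(loop_set))                       # "out": print the union
    return printed


output = foreach_way_with_nodes([10, 11])
```

Each entry of output corresponds to one pass through the loop body. Node 2 appears in both entries because it belongs to both ways.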

The input set of the foreach statement can be taken from a variable with the usual prefix notation:

Example:

  foreach.a(...);

This loops over the content of a.

The name of the variable to put the loop element into can also be chosen by adding a postfix immediately before the opening parenthesis.

Example:

  foreach->.b(...);

This puts the element to loop over into the variable b.

Example for both input and loop set changed:

  foreach.a->.b(...);


Standalone queries

Item

The item standalone query consists only of an input set prefix.

It takes the input set specified by its prefix. It reproduces its input set as result set. This is particularly useful for union statements.

The most common usage is the usage with the default input set:

  ._;

But of course other sets are possible too:

  .a;

Recurse up

The recurse up standalone query is written as a single less-than sign (<).

It takes an input set. It produces a result set. Its result set consists of all ways that have a node from the input set as a member, all relations that have a node or way from the input set as a member, and all relations that have a way from the result set as a member.

Example:

  <;

The input set of the recurse up statement can be chosen with the usual prefix notation:

  .a <;

The result set of the recurse up statement can be redirected with the usual postfix notation:

  < ->.b;

Of course, you can also change both:

  .a < ->.b;
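The three parts of the recurse up result can be traced on a small example. The following Python sketch uses hypothetical toy data (way_nodes, relation_members) and a hypothetical function recurse_up; it models only the membership rule described above.

```python
# Hypothetical toy data: members of each way and each relation.
way_nodes = {10: {1, 2}, 11: {3}}
relation_members = {
    100: {("node", 3)},
    101: {("way", 10)},
    102: {("relation", 100)},
}


def recurse_up(input_set):
    """Model of '<': parent ways and relations of the input set."""
    input_nodes = {ident for kind, ident in input_set if kind == "node"}
    # Ways that have a node from the input set as a member.
    ways = {("way", w) for w, nodes in way_nodes.items() if nodes & input_nodes}
    # Relations may refer to input nodes, input ways, or the ways just found.
    candidates = {e for e in input_set if e[0] in ("node", "way")} | ways
    relations = {
        ("relation", r)
        for r, members in relation_members.items()
        if members & candidates
    }
    return ways | relations
```

Starting from node 1, this yields way 10 and relation 101. Relation 102 is not found, because it only has a relation as a member.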

Recurse up relations

The recurse up relations standalone query has a similar syntax to the recurse up query and differs only in two aspects:

  • It is written as a double less than.
  • It also recursively returns all relations that have a relation appearing in the input set as a member.

In particular, you can change the input and/or result set with the same notation as for the recurse up standalone query.

Precisely, the recurse up relations standalone query returns the transitive and reflexive closure of membership backwards.

Example:

  <<;

Recurse down

The recurse down standalone query has a similar syntax to the recurse up query and differs only in two aspects:

  • It is written as a single greater-than sign (>).
  • It returns the node members of all ways from the input set, the way and node members of all relations from the input set, and the node members of all ways that are in the result set.

In particular, you can change the input and/or result set with the same notation as for the recurse up standalone query.

Example:

  >;

Recurse down relations

The recurse down relations standalone query has a similar syntax to the recurse down query and differs only in two aspects:

  • It is written as a double greater than.
  • It also recursively returns all relations that are members in a relation appearing in the input set.

In particular, you can change the input and/or result set with the same notation as for the recurse down standalone query.

Precisely, the recurse down relations standalone query returns the transitive and reflexive closure of membership.

Example:

  >>;

Query for areas

The coord-query standalone query is written with the keyword is_in in one of the variants explained below.

It takes an input set. It produces a result set. Its result set contains all areas in which at least one of the given nodes or the given pair of coordinates lies.

In its simplest form, it takes its input set as the coordinates to search for. Example:

  is_in;

The input set can be chosen with the usual prefix notation:

  .a is_in;

The result set can be redirected with the usual postfix notation:

  is_in->.b;

Of course, you can also change both:

  .a is_in->.b;

Instead of taking existing nodes, you can also specify coordinates with two floating point numbers, separated by a comma. They are interpreted as latitude, longitude. In this case, the input set is ignored. Example:

  is_in(50.7,7.2);

Also in this variant, the result set can be redirected with the usual postfix notation:

  is_in(50.7,7.2)->.b;

Clauses

The most important statement is the query statement. This is not a single statement but rather consists of one of the type specifiers node, way or relation (or shorthand rel), followed by one or more clauses. The result set is the set of all elements that match the conditions of all the clauses.

Example:

  node[name="Foo"];

Here, node is the type specifier, [name="Foo"] is the clause and the semicolon ends the statement.

The query statement has a result set that can be changed with the usual postfix notation.

  node[name="Foo"]->.a;

In addition, individual clauses may have input sets of their own, which can be changed per clause. See the respective clause for details.

By tag

The has-kv clause selects all elements that have, or do not have, a tag with a certain value. It supports the basic OSM types node, way, and relation as well as the extended type area.

It has no input set. As for all clauses, the result set is specified by the whole statement, not the individual clause.

All variants consist of an opening bracket, then a string literal in single or double quotes. Then the variants differ. All variants end with a closing bracket. If the string literal consists only of letters, the quotes can be omitted.

The most common variant selects all elements where the tag with the given key has a specific value. This variant contains after the key literal an equal sign and a further literal containing the value. Examples, all equivalent:

  node["name"="Foo"];
  node[name=Foo];
  node['name'="Foo"];
  node[name="Foo"];
  node["name"='Foo'];

If you have a digit, whitespace or whatever in the value, you do need single or double quotes:

  node["name"="Foo Street"];
  node["name"='Foo Street'];
  node[name="Foo Street"];

The second variant selects all elements that have a tag with a certain key and an arbitrary value. It contains nothing between the key literal and the closing bracket:

  node["name"];
  node['name'];
  node[name];

The third variant selects all elements that have a tag with a certain key and a value that matches some regular expression. It contains after the key literal a tilde, then a second literal for the regular expression to search for:

  node["name"~"^Foo$"];    /* finds exactly "Foo" */
  node["name"~"^Foo"];     /* finds anything that starts with "Foo" */
  node["name"~"Foo$"];     /* finds anything that ends with "Foo" */
  node["name"~"Foo"];      /* finds anything that contains the substring "Foo" */
  node["name"~"."];        /* finds any value, equivalent to the second variant node["name"] */

Please note that in QL you need to escape backslashes: ["name"~"^St\."] results in the regular expression ^St. (which finds every name starting with "St" followed by any character), while ["name"~"^St\\."] produces the regular expression ^St\. that was most likely intended (which finds every name starting with "St."). This is due to the C escaping rules and doesn't apply to the XML syntax.
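When queries are generated by a program, the doubling of backslashes can be left to a small helper. The following Python sketch is one way to do this; the function name ql_string is hypothetical, and it covers only backslashes and double quotes.

```python
def ql_string(text: str) -> str:
    """Quote text as a double-quoted Overpass QL string literal.

    Backslashes are doubled first, then double quotes are escaped.
    """
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


regex = r"^St\."                               # the regular expression actually meant
clause = '["name"~' + ql_string(regex) + "]"   # the clause as written in QL
```

The resulting clause contains two backslashes before the dot, matching the second form discussed above.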

You can also search case-insensitively:

  node["name"~"^Foo$",i];    /* finds "foo", "FOO", "fOo", "Foo" etc. */

Both the key and value variants, with and without regular expressions, can be negated. A negated clause selects the elements that have a tag with the given key but no matching value, as well as the elements that have no tag with the given key at all:

  node["name"!="Foo"];
  node["name"!~"Foo"];
  node["name"!~"Foo",i];

Bounding box

The bbox-query clause selects all elements within a certain bounding box.

It has no input set. As for all clauses, the result set is specified by the whole statement, not the individual clause.

It consists of an opening parenthesis. Then follow four floating point numbers, separated by commas. The clause ends with a closing parenthesis.

The floating point numbers give the limits of the bounding box: The first is the southern limit or minimum latitude. The second is the western limit, usually the minimum longitude. The third is the northern limit or maximum latitude. The last is the eastern limit, usually the maximum longitude. If the second argument is bigger than the fourth argument, the bounding box crosses the longitude of 180 degrees.

Example:

  node(50.6,7.0,50.8,7.3);
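The argument order is easy to get wrong when a clause is built from variables. A minimal Python sketch follows; the function names bbox_clause and crosses_antimeridian are hypothetical, and the latitude check is a sanity check added for illustration, not a documented server rule.

```python
def bbox_clause(south: float, west: float, north: float, east: float) -> str:
    """Format a bounding box clause in the order south, west, north, east."""
    if south > north:
        # Sanity check of this sketch: minimum latitude comes first.
        raise ValueError("southern limit must not exceed northern limit")
    return f"({south},{west},{north},{east})"


def crosses_antimeridian(west: float, east: float) -> bool:
    """True if the box wraps around the longitude of 180 degrees."""
    return west > east


query = "node" + bbox_clause(50.6, 7.0, 50.8, 7.3) + ";"
```

For the values above, query is the same statement as in the example. A box from 170.0 to -170.0 degrees longitude would be reported as crossing the longitude of 180 degrees.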

Recurse

The recurse clause selects all elements that are members of an element from the input set or have an element of the input set as member, depending on the given parameter.

The input set can be changed with an adapted prefix notation. As for all clauses, the result set is specified by the whole statement, not the individual clause.

It consists of an opening parenthesis. Then follows one of the acronyms w (forward from ways), r (forward from relations), bn (backward from nodes), bw (backward from ways), or br (backward from relations). Then follows an optional input set declaration. The clause ends with a closing parenthesis.

Examples with default input set:

  node(w);        // select child nodes from all ways of the input set
  node(r);        // select node members of relations of the input set
  way(bn);        // select parent ways for all nodes from the input set
  way(r);         // select way members of relations from the input set
  rel(bn);        // select relations that have node members from the input set
  rel(bw);        // select relations that have way members from the input set
  rel(r);         // select all members of type relation from all relations of the input set
  rel(br);        // select all parent relations of all relations from the input set

Example with modified input set:

  node(w.foo);

You can also restrict the recurse to a specific role. Just add a colon and then the name of the role before the closing parenthesis.

Examples with default input set:

  node(r:"role");        // select node members of relations of the input set
  way(r:"role");         // select way members of relations from the input set
  rel(bn:"role");        // select relations that have node members from the input set
  rel(bw:"role");        // select relations that have way members from the input set
  rel(r:"role");         // select all members of type relation from all relations of the input set
  rel(br:"role");        // select all parent relations of all relations from the input set

Example with modified input set:

  node(r.foo:"role");

And you can also search explicitly for empty roles:

  node(r:"");
  node(r.foo:"");

By element id

The id-query clause selects the element of the given type with the given id. Besides the OSM data types node, way, and relation, it also supports the type area.

It has no input set. As for all clauses, the result set is specified by the whole statement, not the individual clause.

It consists of an opening parenthesis. Then follows a positive integer. The clause ends with a closing parenthesis.

Examples:

  node(1);
  way(1);
  rel(1);
  area(1);

Around

The around clause selects all elements within a certain radius around the elements in the input set.

The input set can be changed with an adapted prefix notation. As for all clauses, the result set is specified by the whole statement, not the individual clause.

It consists of an opening parenthesis. Then follows the keyword around. Then follows optionally an input set declaration. Then follows a single floating point number that denotes the radius in meters. The clause either ends with a closing parenthesis or is followed by two comma separated floating point numbers indicating latitude and longitude and then finally a closing parenthesis.

Examples:

  node(around:100.0);
  way(around:100.0);
  rel(around:100.0);

Example with modified input set:

  node(around.a:100.0);

Examples with coordinates:

  node(around:100.0,50.7,7.1);
  way(around:100.0,50.7,7.1);
  rel(around:100.0,50.7,7.1);
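To get a feeling for what a radius in meters means around a coordinate, the test can be approximated on the client side. The Python sketch below uses the haversine formula on a sphere with a mean Earth radius of 6371 km. Both are assumptions of this sketch: the text above does not say how the server computes distances, so results near the radius may differ. It only covers single points, not ways or relations.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumption: mean Earth radius, spherical model


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_radius(lat, lon, center_lat, center_lon, radius_m):
    """Approximate the test behind (around:radius,lat,lon) for one point."""
    return distance_m(lat, lon, center_lat, center_lon) <= radius_m
```

A point 0.0005 degrees of latitude north of 50.7, 7.1 is roughly 56 meters away and therefore within a radius of 100.0, while a point 0.002 degrees north is not.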

By polygon

The polygon clause selects all elements of the chosen type inside the given polygon.

It has no input set. As for all clauses, the result set is specified by the whole statement, not the individual clause.

It consists of an opening parenthesis. Then follows the keyword poly and a colon. Then follows a string containing an even number of floating point numbers, separated only by whitespace. Each pair of floating point numbers represents a coordinate, in order latitude, then longitude. The clause ends with a closing parenthesis.

An example (a triangle near Bonn, Germany):

  node(poly:"50.7 7.1 50.7 7.2 50.75 7.15");
  way(poly:"50.7 7.1 50.7 7.2 50.75 7.15");
  rel(poly:"50.7 7.1 50.7 7.2 50.75 7.15");
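If the polygon is available as a list of coordinate pairs, the string can be assembled programmatically. A minimal Python sketch with a hypothetical function poly_clause follows; the requirement of at least three points is an assumption of this sketch, not a rule stated above.

```python
def poly_clause(points):
    """Build a polygon clause from an iterable of (lat, lon) pairs."""
    points = list(points)
    if len(points) < 3:
        # Assumption of this sketch: fewer than three points make no polygon.
        raise ValueError("a polygon needs at least three points")
    # Latitude first, then longitude, separated only by whitespace.
    coords = " ".join(f"{lat} {lon}" for lat, lon in points)
    return f'(poly:"{coords}")'


triangle = [(50.7, 7.1), (50.7, 7.2), (50.75, 7.15)]
query = "node" + poly_clause(triangle) + ";"
```

For the triangle near Bonn, query equals the first statement of the example above.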

Newer

The newer clause selects all elements that have been changed since the given date.

It has no input set. As for all clauses, the result set is specified by the whole statement, not the individual clause.

It consists of an opening parenthesis. Then follows the keyword newer, a colon and a date specification. Please note that this date specification cannot be abbreviated and has to be put in single or double quotes. The clause ends with a closing parenthesis.

Example:

  node(newer:"2012-09-14T07:00:00Z");

This finds all nodes that have changed since 14 Sep 2012, 07:00 UTC.
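Since the date cannot be abbreviated, it helps to generate it from a timestamp instead of typing it by hand. The Python sketch below formats a timezone-aware datetime in the same pattern as the example; the function name newer_clause is hypothetical.

```python
from datetime import datetime, timezone


def newer_clause(moment: datetime) -> str:
    """Format a timezone-aware datetime as a newer clause in UTC."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    utc = moment.astimezone(timezone.utc)
    return '(newer:"' + utc.strftime("%Y-%m-%dT%H:%M:%SZ") + '")'


query = "node" + newer_clause(datetime(2012, 9, 14, 7, 0, 0, tzinfo=timezone.utc)) + ";"
```

For the timestamp above, query reproduces the statement from the example.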

By user

The user clause selects all elements that have been last touched by the specified user.

It has no input set. As for all clauses, the result set is specified by the whole statement, not the individual clause.

It consists of an opening parenthesis. Then follows either the keyword user, a colon and a string literal denoting the user name to search for, or the keyword uid, a colon and the user id of the user to search for. The clause ends with a closing parenthesis.

Example:

  node(user:"Steve");
  node(uid:1);

By area

The area clause selects all elements of the chosen type that are inside the given area.

The input set can be changed with an adapted prefix notation. As for all clauses, the result set is specified by the whole statement, not the individual clause.

It consists of an opening parenthesis. Then follows the keyword area. Then can follow a colon and a nonnegative integer. The clause ends with a closing parenthesis.

Nodes are found if they are properly inside or on the border of the area. Ways are found if at least one point (including points on a segment) is properly inside the area. A way ending on the border and not otherwise crossing the area is not found. A relation is found if one of its members is properly inside the area.

If the area clause is provided without an integer, the areas from the input set are used. An example:

  node(area);
  way(area);
  rel(area);

The example with modified input set:

  node(area.a);
  way(area.a);
  rel(area.a);

If an integer is added, the input set is ignored and instead the area that has the given integer as id is taken.

  node(area:2400000001);
  way(area:2400000001);
  rel(area:2400000001);

Area pivot

The pivot clause selects the element of the chosen type that defines the outline of the given area.

The input set can be changed with an adapted prefix notation. As for all clauses, the result set is specified by the whole statement, not the individual clause.

It consists of an opening parenthesis. Then follows the keyword pivot. The clause ends with a closing parenthesis.

For each area in the input set, the statement finds the element that the area has been generated from, which is either a multipolygon relation or a way.

Examples:

  way(pivot);
  rel(pivot);

The example with modified input set:

  way(pivot.a);
  rel(pivot.a);

Actions

There is currently only one action. This action prints out the content of its input set.

print

The out action can be configured with an arbitrary number of parameters that are appended, separated by whitespace, between the word out and the semicolon.

The out action takes an input set. It doesn't return a result set. The input set can be changed by prepending the variable name.

Allowed values are:

  • one of ids, skel, body, or meta for the degree of verbosity. Default is body.
  • one of qt or asc for the sort order. Default is asc.
  • a non-negative integer for the maximum number of elements to print. Default is to set no limit at all.

The degrees of verbosity mean:

  • ids: Print only the ids of the elements.
  • skel: Print also the information necessary for geometry. These are the coordinates for nodes, and the way and relation member ids for ways and relations.
  • body: Print all information necessary to use the data. In addition to skel, these are the tags for all elements and the roles for relation members.
  • meta: Print everything known about the elements. In addition to body, this includes for all elements the version, changeset id, timestamp and the user data of the user that last touched the object.

The sort order is either by id (use asc) or by quadtile index (use qt). The latter is roughly geographical and significantly faster than ordering by id.

Example:

  out;

Print the elements without meta information.

Example:

  out meta;

Print the elements with meta information.

Example:

  out 99;

Print at most 99 elements.

Example:

  out meta qt 1000000;

Print up to 1000000 elements, ordered by location, with meta data.

Example:

  .a out;

Reads from variable a the data to output.
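The parameters of the out action can be combined by a small helper that validates them before the statement is written. This Python sketch is hypothetical (the function out_statement and its parameter names are not part of the API); it omits parameters that equal the defaults named above.

```python
VERBOSITY = ("ids", "skel", "body", "meta")
ORDER = ("asc", "qt")


def out_statement(verbosity="body", order="asc", limit=None, input_set=None):
    """Build an out statement; default values are left out of the text."""
    if verbosity not in VERBOSITY:
        raise ValueError(f"unknown verbosity: {verbosity}")
    if order not in ORDER:
        raise ValueError(f"unknown sort order: {order}")
    parts = ["out"]
    if verbosity != "body":
        parts.append(verbosity)
    if order != "asc":
        parts.append(order)
    if limit is not None:
        if limit < 0:
            raise ValueError("limit must be non-negative")
        parts.append(str(limit))
    statement = " ".join(parts) + ";"
    if input_set is not None:
        statement = f".{input_set} {statement}"  # prepend the input set
    return statement
```

For instance, out_statement("meta", "qt", 1000000) produces the fourth example above, and out_statement(input_set="a") produces the last one.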


Settings

timeout

The timeout setting has one parameter, a non-negative integer. Default value is 180.

This parameter indicates the maximum allowed runtime for the query in seconds, as expected by the user. If the query runs longer than this time, the server may abort the query with a timeout. As a second effect, the higher this value, the more likely the server is to reject the query before executing it.

So, if you send a really complex big query, prefix it with a higher value, e.g. 3600 for an hour. And ensure that your client is patient enough not to abort because of a timeout of its own.

Example:

  [timeout:180]

element-limit

The maxsize setting has one parameter, a non-negative integer. Default value is 536870912.

This parameter indicates the maximum allowed memory for the query in bytes of RAM on the server, as expected by the user. If the query needs more RAM than this value, the server may abort the query with a memory exhaustion. As a second effect, the higher this value, the more likely the server is to reject the query before executing it.

So, if you send a really complex big query, prefix it with a higher value, e.g. 1073741824 for a gigabyte.

Example:

  [maxsize:1073741824]

output

The out setting can take one of the four values xml, json, custom or popup. Default value is xml.

The values custom and popup require further configuration. Please see details in the output formats documentation.

Example:

  [out:json]
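Settings are given once at the beginning of a query. The Python sketch below joins several of them into one leading statement. It assumes that settings are written back to back and closed by a single semicolon, as the [bbox]; prefix in the complete example of the global bbox section suggests; the function name settings_statement is hypothetical.

```python
def settings_statement(**settings):
    """Join settings such as out, timeout and maxsize into one statement.

    Assumption: settings are written back to back, then one semicolon.
    """
    return "".join(f"[{key}:{value}]" for key, value in settings.items()) + ";"


header = settings_statement(out="json", timeout=180)
query = header + 'node[name="Foo"];out;'
```

Here header is [out:json][timeout:180]; and the query statements follow it directly.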

Global bbox

The bbox setting can define a bounding box that is then implicitly added to all queries (unless they specify a different explicit bbox).

The bounding box is written in order southern lat, western lon, northern lat, eastern lon (which is the standard order).

Example:

  [bbox:50.6,7.0,50.8,7.3]

This enforces a bounding box roughly around the German city of Bonn, which is at 50.7 degrees latitude, 7.1 degrees longitude.

If a query is URL encoded as value of the data= parameter, the bounding box can also be appended as a separate parameter. It then has the order lon-lat (west, south, east, north). This is the common order for OpenLayers and other frameworks.

Complete Example:

  /api/interpreter?data=[bbox];node[amenity=post_box];out;&bbox=7.0,50.6,7.3,50.8

This finds all post boxes roughly in Bonn, Germany.
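Because the bbox URL parameter uses a different order than the bbox setting inside the query, it is safer to convert between the two in one place. The Python sketch below builds such a URL with the standard library; the function name interpreter_url is hypothetical, and the path /api/interpreter is taken from the example above.

```python
from urllib.parse import urlencode


def interpreter_url(base, query, south, west, north, east):
    """Build a request URL with the bounding box as a separate parameter.

    The arguments follow the QL order (south, west, north, east);
    the URL parameter is written in lon-lat order instead.
    """
    bbox = f"{west},{south},{east},{north}"
    # safe="," keeps the commas of the bbox value readable.
    return base + "?" + urlencode({"data": query, "bbox": bbox}, safe=",")


url = interpreter_url(
    "/api/interpreter",
    "[bbox];node[amenity=post_box];out;",
    50.6, 7.0, 50.8, 7.3,
)
```

Unlike the hand-written example, the query text is percent-encoded here, while the bbox parameter comes out as 7.0,50.6,7.3,50.8.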

Special syntax

Comments

The query language allows comments in the same style as C source code:

  out; // A single line comment
  /* Comments starting with a star must always be closed. */
  /* But they can span
         multiple lines. */

Escaping

The following escape sequences are recognized:

  • \n: escapes a newline
  • \t: escapes a tab
  • \", \': escapes the respective quotation mark
  • \\: escapes the backslash
  • \u#### (the hash characters stand for four hexadecimal digits): escapes the respective unicode character, see Unicode escape sequences
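How these sequences are decoded can be modelled in a few lines of Python. This is a sketch of the rules listed above, not the server's parser. It assumes that \n stands for a newline character, and that a backslash before any other character is simply dropped, which matches the ^St\. example in the section on regular expressions; the function name ql_unescape is hypothetical.

```python
import re

# Escape sequences with a fixed replacement (assumption: \n is a newline).
_SIMPLE = {"n": "\n", "t": "\t", '"': '"', "'": "'", "\\": "\\"}
# Either \u followed by four hex digits, or a backslash and any one character.
_ESCAPE = re.compile(r"""\\(?:u([0-9A-Fa-f]{4})|(.))""", re.DOTALL)


def ql_unescape(body: str) -> str:
    """Decode escape sequences in the body of a string literal."""
    def replace(match):
        if match.group(1) is not None:            # \u#### form
            return chr(int(match.group(1), 16))
        char = match.group(2)
        # Unknown sequences: the backslash is dropped (assumption).
        return _SIMPLE.get(char, char)
    return _ESCAPE.sub(replace, body)
```

With this model, the literal body ^St\. decodes to ^St. while ^St\\. decodes to ^St\. with a single backslash.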