Databases and data access APIs

From OpenStreetMap Wiki
(Redirected from Database schema)
Jump to: navigation, search

This page provides an overview of the databases that could be used to store and manipulate OSM data, how to obtain data to populate the databases, and how to query them to find something useful.

It is intended as an overview for new developers who wish to write software to use OSM data, and not for end users of the information.

Contents

Sources of OSM Data

See also Downloading data for a run down of the basic options

The various sources of OSM data (either the whole world, or a small part of it) are identified below with links to other Wiki pages which provide more detail.

The most of the following methods of obtaining data return the data in the OSM XML format that can be used by other tools to populate the database. The format of the data is described in Data Primitives.

Planet.osm

Every week a dump of the entire current OSM dataset is saved in different formats and made available as Planet.osm. Quite a few people break this file down into smaller files for different regions and make extracts available separately on mirror servers. Various tools are available to cut the Planet file up into smaller areas if required.

Differences between the live OSM data and the planet dump are also published each minute as changeset, so it is possible to maintain an up-to date copy of the OSM dataset.



XAPI

The Xapi servers allow OSM data to be downloaded in XML format for a given region of the globe, filtered by tag. Xapi will return quite larger areas (city level) of the globe if requested, which makes it different to the standard OSM API described below.

API

The main API is the method of obtaining OSM data used by editors, as this is the only method of changing the OSM data in the live database. The API page provides a link to the specification of the protocol to be used to obtain data. Its limitations are that it will only return very small areas <0.25deg square.

This method of obtaining data should therefore be reserved for editing applications, with other methods being used for rendering, routing or other purposes.

Overpass API aka Server Side Script

Allows quite complex queries on larger areas.

TRAPI

The Tile Read-only API (TRAPI) is a method of obtaining OSM data in a way that is optimised for rendering by the TilesAtHome project that produces the osmarender layer on the main OSM Slippy Map.

Are there any live TRAPI servers?? The wiki page does not provide a URL....

ROMA

ROMA is a slightly different implementation of TRAPI

Choice of DBMS

There are several different databases systems used by OSM users:

Database Benefits Disbenefits Used By
PostgreSQL Can handle large datasets. The PostGIS extension allows the use geographic extensions Requires database server to be installed, with associated administrative overhead Main OSM API, Mapnik renderer
MySQL Can handle large datasets Does not have geographic extensions. Requires database server to be installed, with associated administrative overhead The main database API used MySQL until version 0.6, when it was changed to Postgresql
SQLite Small, does not require a database server Will struggle with large datasets - See Mail Archive Microcosm
GT.M Xapi
MonetDB
CouchDB GeoCouch extension OSMCouch
MongoDB Osmo

Database Schemas

OSM uses different database schemas for different applications. Unfortunately, the various schemas do not really have a fixed name.

The schema name should link to a description of the schema, the "created with" Link to an manual how to create the schema and fill it with data.

Schema name Used By Created With Uses geospatial columns (PostGIS) Uses hstore columns update-able via diffs Description / References
main, core or backend API osmosis No No  ? The postgresql database used by the main OSM API.
PostGIS Mapnik, Kothic JS osm2pgsql Yes optional slim variant A lossy database structure optimised for rendering using Mapnik. It only adds features that have certain tags, as defined in the config file, and it converts nodes and ways to linestrings and polygons. This means that you can't tell which linestring is connected to which, but for rendering a map that's not important (as opposed to routing).
gazetteer Nominatim osm2pgsql  ?  ?  ?
PostGIS snapshot jXAPI osmosis Yes Yes Yes schema for storing a snapshot of OSM data. Node locations are always stored as a point. Ways are related to nodes as in the normal API schema, however they may optionally have bounding box and/or full linestring columns added as well allowing a full set of geo-spatial operations to be performed on them.
PostGIS simple  ?? osmosis Yes No Yes This is effectively an older version of the snapshot schema where tags are still stored in separate tags tables instead of hstore columns. It is recommended to use the newer "Snapshot Schema" versions of these tasks where possible due to the improved performance they provide.
unnamed KeepRight Optimized for finding errors or inconsistencies in OSM data.

Tools to Populate Databases

Tool Database Schema Notes
osm2pgsql PostgreSQL The Mapnik specific schema
osmosis PostgreSQL, MySQL Several schemas provided with osmosis but you can define your own
osm2postgresql PostgreSQL Starts with default osmosis import/schema. Then polygons are created based on ways and relations. Nodes with tags are copied in a nodes_with_tags table.
Imposm ??? ???
osm2gis ??? ???
Personal tools
Namespaces
Variants
Actions
site
Toolbox