SQLite

From OpenStreetMap Wiki
Jump to navigation Jump to search

SQLite (sqlite.org) is a local Database Management System. Data is stored in a local file under a well-documented file format.

SQLite is just a container format; the choice of what can go where in the DB, i.e. the schema, is chosen by the program that creates the database. For example, the DB can store simple geo data, like latitude and longitude columns; store structured geo data, when an appropriate schema is used; store pre-computed tiles (which can then be served offline, for example in OsmAnd); or do as a basic, generic container to build new standard file formats.

SpatiaLite [1] extends SQLite with spatial indexing and spatial functions.

Obtaining a DB with OSM data

Some processed data providers provide SQLite or GeoPackage files directly (a GeoPackage is a SQLite file). Note that that page does not list all the formats for all the providers.

Otherwise you can download OSM in another format (for example the native PBF format) and use a tool to convert to SQLite.

Historic options:

SQLite files and QGis

Often, it's more useful to store geodata in small SQLite. It doesn't have geo-functions or a spatial index, like SpatiaLite does, but this is often not needed at all. SQLite gives a very lightweight way to store such information.

QGIS can produce both kinds of DBs, but it doesn't add the correct CRS/SRS (information about the coordinate system being used). That might be a bug, but doesn't work up to version 2.18.3 and maybe later. The problem results into QGIS complaining about a missing CRS/SRS on re-opening at file and forces it to be EPSG:4326. That's good for OpenStreetMap data, but not really usefull for anything else.

To add the missing CRS/SRS, keep in mind to store metadata in QGIS and select "SQLite" as export format (not "SpatiaLite"!). In the following example, I'll set SRID to 3035 used in Europe for maps about statistics:

  1. Open the SQLite file, e.g., with sqlite3: sqlite3 database.sqlite
  2. Add the information about spatial reference to the geometry column (table is "demo_data"):
UPDATE geometry_columns SET srid = 3035 WHERE table = 'demo_data';
  1. Add the definition of the spatial reference:
INSERT INTO spatial_ref_sys (srid, auth_name, auth_srid, srtext) VALUES (3035, 'epsg', 3035, 'PROJCS["ETRS89 / ETRS-LAEA",GEOGCS["ETRS89",DATUM["European_Terrestrial_Reference_System_1989",SPHEROID["GRS 1980",6378137,298.257222101,AUTHORITY["EPSG","7019"]],AUTHORITY["EPSG","6258"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.01745329251994328,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4258"]],UNIT["metre",1,AUTHORITY["EPSG","9001"]],PROJECTION["Lambert_Azimuthal_Equal_Area"],PARAMETER["latitude_of_center",52],PARAMETER["longitude_of_center",10],PARAMETER["false_easting",4321000],PARAMETER["false_northing",3210000],AUTHORITY["EPSG","3035"],AXIS["X",EAST],AXIS["Y",NORTH]]');

Now you're done. You can look up the string for spatial reference, f.e. on this website: https://spatialreference.org/ref/epsg/3035/postgis/ But this is important:

  1. Remove "proj4text," from the bracketed part and remove "+proj=..." from the second one.
  2. Make the first number of the second bracketed part be equal to your EPSG-code, f.e. "3035".

Tools using SQLite

Microcosm is a map API server that uses Sqlite and the R*tree module.

Derived formats

Some standard formats are defined as a SQLite DB with a specific schema.

  • GeoPackage can contain vector features and raster tile images.
  • MBTiles can contain raster tile images.

Optimize searching by tag

ogr2ogr stores the OSM tags in SQLite or GeoPackage files in suboptimal ways (JSON, HSTORE, ogrconf.ini's attributes). There is a feature request to store in two columns key and value[2] to allow for a query that is performant, readable and flexible at the same time. For now, you can implement that schema by hand, see SQLite/fid_to_tag schema.

General usage

Visualisation tools

General tools to navigate in a SQLite database:

  • sqlitebrowser: open the tab Navigate data, select the table in the dropdown box, filter columns by typing something in the text filters below the table column headers
  • sqlite3 command-line program:
    • Type sqlite3 to run SQL interactively in the terminal window
    • Or give it a SQL as an argument. For example, using the above optimization tutorial: sqlite3 file.sqlite "select fid from points_fid_to_tag where key = 'amenity' and value = 'atm' limit 10;"

Indices

You can make indices on any column of any DB table. You can pass a query to .expert to auto-suggest indices that speed up that specific query. [3] [4] Beware that WHERE uses indices which have WHERE's columns as their first columns[5] and that at most one index is used for each table in a query[6].

For example, ogr2ogr makes a column osm_id. To make an index to speed up the WHERE osm_id = ... part of a query, run:

CREATE INDEX index_points_by_osm_id ON points(osm_id);

To check that a query uses an index (for example, for the optimization tutorial above):

username@machine$ sqlite3 the_file.sqlite
sqlite> EXPLAIN QUERY PLAN
   ...>  SELECT points.geometry
   ...>    FROM points, points_fid_to_tag
   ...>   WHERE points.ogc_fid = points_fid_to_tag.ogc_fid
   ...>         AND points_fid_to_tag.key = 'amenity'
   ...>         AND points_fid_to_tag.value = 'bench';
QUERY PLAN
|--SCAN points_fid_to_tag
`--SEARCH points USING INTEGER PRIMARY KEY (rowid=?)
sqlite>
sqlite> -- Make indices per above
sqlite> CREATE INDEX ...
sqlite>
sqlite> EXPLAIN QUERY PLAN
   ...>  SELECT points.geometry
   ...>    FROM points, points_fid_to_tag
   ...>   WHERE points.ogc_fid = points_fid_to_tag.ogc_fid
   ...>         AND points_fid_to_tag.key = 'amenity'
   ...>         AND points_fid_to_tag.value = 'bench';
QUERY PLAN
|--SEARCH points_fid_to_tag USING COVERING INDEX index_points_fid_to_tag_by_all (key=? AND value=?)
`--SEARCH points USING INTEGER PRIMARY KEY (rowid=?)

See also

References

  1. release notes, details
  2. https://github.com/OSGeo/gdal/issues/7552
  3. .expert is documented here in section 17.
  4. If you run .expert and it gives error sqlite3_expert_new: no such module: VirtualSpatialIndex on Debian, then install the package libsqlite3-mod-spatialite and run SELECT load_extension("mod_spatialite");.
  5. query planner documentation's section "3. WHERE Clause Analysis"
  6. query planner documentation's section "8. Choosing Between Multiple Indexes"