MonetDB/Schema

From OpenStreetMap Wiki
Jump to: navigation, search

This is a schema for API v0.6 data in MonetDB.

planet-100512.osm imported in this schema has a size of 63GB. While importing, you will need more than 100GB of available disk space (dbfarm). Please keep in mind that the CSV data (that can be deleted after the import) takes another 80GB.

The schema will be created automatically by POI tools/Importer.

Tables

START TRANSACTION;                                                                                                                                           
CREATE TABLE nodes_legacy (id integer, long numeric(10,7), lat numeric(10,7), uid integer, timestamp timestamptz, zcurve bigint);
CREATE TABLE node_tags (node integer, k varchar(255), v varchar(1024));                                                                                      
CREATE TABLE ways (id integer,uid integer, timestamp timestamptz);
CREATE TABLE way_tags (way integer, k varchar(255), v varchar(1024));                                                                                        
CREATE TABLE way_nds (way integer, idx integer, to_node integer);
CREATE TABLE relations(id integer, uid integer, timestamp timestamptz);                                                                                      
CREATE TABLE relation_members_node (relation integer, idx integer, to_node integer, role varchar(255));
CREATE TABLE relation_members_relation (relation integer, idx integer, to_relation integer, role varchar(255));                                              
CREATE TABLE relation_members_way (relation integer, idx integer, to_way integer, role varchar(255));
CREATE TABLE relation_tags (relation integer, k varchar(255), v varchar(1024)); 
COMMIT;


START TRANSACTION;                                                                                                                                           
CREATE SEQUENCE s_nodes AS INTEGER;
ALTER SEQUENCE s_nodes RESTART WITH (SELECT MAX(id) FROM nodes_legacy);                                                                                      
ALTER TABLE nodes_legacy ALTER COLUMN id SET NOT NULL;
ALTER TABLE nodes_legacy ALTER COLUMN id SET DEFAULT NEXT VALUE FOR "s_nodes";                                                                               
ALTER TABLE nodes_legacy ADD CONSTRAINT pk_nodes_id PRIMARY KEY (id);
CREATE SEQUENCE s_ways AS INTEGER;                                                                                                                           
ALTER SEQUENCE s_ways RESTART WITH (SELECT MAX(id) FROM ways);
ALTER TABLE ways ALTER COLUMN id SET NOT NULL;                                                                                                               
ALTER TABLE ways ALTER COLUMN id SET DEFAULT NEXT VALUE FOR "s_ways";                                                                                        
ALTER TABLE ways ADD CONSTRAINT pk_ways_id PRIMARY KEY (id);
CREATE SEQUENCE s_relations AS INTEGER;                                                                                                                      
ALTER SEQUENCE s_relations RESTART WITH (SELECT MAX(id) FROM relations);
ALTER TABLE relations ALTER COLUMN id SET NOT NULL;                                                                                                          
ALTER TABLE relations ALTER COLUMN id SET DEFAULT NEXT VALUE FOR "s_relations";
ALTER TABLE relations ADD CONSTRAINT pk_relations_id PRIMARY KEY (id);                                                                                       
ALTER TABLE relation_members_node ADD CONSTRAINT pk_relation_members_node PRIMARY KEY (relation, idx);
ALTER TABLE relation_members_way ADD CONSTRAINT pk_relation_members_way PRIMARY KEY (relation,idx);                                                          
ALTER TABLE relation_members_relation ADD CONSTRAINT pk_relation_members_relation PRIMARY KEY (relation,idx);
COMMIT;

Constraints

Test for violations

Test for constraint violations before applying constraints. Tests taken from OSM Fixer.

Check for foreign key violations in the way-nodes table:

SELECT DISTINCT way_nds.way as way, to_node FROM way_nds
LEFT JOIN nodes_legacy ON way_nds.to_node = nodes_legacy.id
WHERE nodes_legacy.id IS NULL;


Check for foreign key violations in relations:

SELECT DISTINCT * FROM (
  SELECT DISTINCT relation_members_way.relation AS relation FROM relation_members_way
  LEFT JOIN ways ON relation_members_way.to_way = ways.id WHERE ways.id IS NULL
  UNION
  SELECT DISTINCT relation_members_relation.relation AS relation FROM relation_members_relation
  LEFT JOIN relations ON relation_members_relation.to_relation = relations.id WHERE relations.id IS NULL
  UNION
  SELECT DISTINCT relation_members_node.relation AS relation FROM relation_members_node
  LEFT JOIN nodes_legacy ON relation_members_node.to_node = nodes_legacy.id WHERE nodes_legacy.id IS NULL
) AS bigquery ORDER BY relation;


Script to write affected members in file "broken-relations" to files:

#!/bin/bash

# the file 'broken-relations' contains results from foreign key violations (referential violations) in relations, one id per line
echo "ways that were deleted, but are still referenced in relations:"
for id in $(cat broken-relations)
do
  echo "SELECT relation_members_way.to_way FROM relation_members_way  LEFT JOIN ways ON relation_members_way.to_way = ways.id WHERE ways.id IS NULL AND relation_members_way.relation=$id;" | mclient -f csv
done

echo
echo "relations that were deleted, but are still referenced in relations:"
for id in $(cat broken-relations)
do
  echo "SELECT relation_members_relation.to_relation FROM relation_members_relation  LEFT JOIN ways ON relation_members_relation.to_relation = ways.id WHERE ways.id IS NULL AND relation_members_relation.relation=$id;" | mclient -f csv
done

echo
echo "nodes that were deleted, but are still referenced in relations:"
for id in $(cat broken-relations)
do
  echo "SELECT relation_members_node.to_node FROM relation_members_node  LEFT JOIN ways ON relation_members_node.to_node = ways.id WHERE ways.id IS NULL AND relation_members_node.relation=$id;" | mclient -f csv
done

Change optimizer pipeline

Change MonetDB's optimizer to the Nov2009 optimizer pipeline for a speedup

SET optimizer='nov2009_pipe';

and switch back to the usual optimizer after applying the constraints.

SET optimizer='default_pipe';

This might not be neccessary anymore in future.

Apply constraints

START TRANSACTION;
ALTER TABLE node_tags ADD CONSTRAINT fk_node_tags_node FOREIGN KEY (node) REFERENCES nodes_legacy (id);                                                      
ALTER TABLE node_tags ADD CONSTRAINT pk_node_tags UNIQUE (node, k, v);
ALTER TABLE way_tags ADD CONSTRAINT fk_way_tags_way FOREIGN KEY (way) REFERENCES ways (id);                                                                  
ALTER TABLE way_tags ADD CONSTRAINT pk_way_tags UNIQUE (way, k, v);
ALTER TABLE way_nds ADD CONSTRAINT pk_way_nds PRIMARY KEY (way, idx);                                                                                        
ALTER TABLE way_nds ADD CONSTRAINT fk_way_nds_way FOREIGN KEY (way) REFERENCES ways (id);
ALTER TABLE way_nds ADD CONSTRAINT fk_way_nds_node FOREIGN KEY (to_node) REFERENCES nodes_legacy (id);                                                       
ALTER TABLE relation_tags ADD CONSTRAINT fk_relation_tags FOREIGN KEY (relation) REFERENCES relations (id);
ALTER TABLE relation_tags ADD CONSTRAINT pk_relation_tags UNIQUE (relation, k, v);                                                                           
ALTER TABLE relation_members_node ADD CONSTRAINT fk_relation_members_node FOREIGN KEY (relation) REFERENCES relations (id);
ALTER TABLE relation_members_node ADD CONSTRAINT fk_relation_members_tonode FOREIGN KEY (to_node) REFERENCES nodes_legacy (id);                              
ALTER TABLE relation_members_way ADD CONSTRAINT fk_relation_members_way FOREIGN KEY (relation) REFERENCES relations (id);
ALTER TABLE relation_members_way ADD CONSTRAINT fk_relation_members_toway FOREIGN KEY (to_way) REFERENCES ways (id);                                         
ALTER TABLE relation_members_relation ADD CONSTRAINT fk_relation_members_relation FOREIGN KEY (relation) REFERENCES relations (id);
ALTER TABLE relation_members_relation ADD CONSTRAINT fk_relation_members_torelation FOREIGN KEY (to_relation) REFERENCES relations (id);                     
COMMIT;


Sideways cracking

Cracking is an internal optimization which reorders the database entries for faster access after analyzing queries.

Sideways cracking is the optimization pipeline for multicolumn queries. The cracker pipelines need a read-only database.

Make the database read-only in monetdb.conf:

gdk_readonly=yes

In the configuration file you can also change the optimizer pipeline. Change

sql_optimizer=default_pipe

to

sql_optimizer=sidcrack_pipe

or if you just want to change the optimizer temporarily, do it in SQL:

SET optimizer='sidcrack_pipe';

You need to use single quotes in this case.