Osmosis/pgsnapshot

From OpenStreetMap Wiki

This page collects information on the Osmosis pgsnapshot schema, mainly database size and import times.

Size

The size of a pgsnapshot database depends on the options used when setting up the database and on the data imported.

Bloat

Tables and indexes that are subject to UPDATEs or DELETEs accumulate "bloat": dead tuples on disk that are no longer needed. VACUUM or autovacuum makes the space held by dead tuples available for reuse, but does not return it to the operating system. Reclaiming the disk space requires a CLUSTER or VACUUM FULL, and these take days to complete. For a continually growing database like OSM, autovacuum keeps table bloat within fixed bounds, but indexes periodically need rebuilding.

A reimport is probably a better use of time than a CLUSTER or VACUUM FULL.
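To see how many dead tuples autovacuum is dealing with, and to rebuild a bloated index, statements along the following lines could be used. This is a minimal sketch that only builds SQL text: the helper name is hypothetical, and the index names are the btree and GIN indexes listed in the size tables on this page. `pg_stat_user_tables` and `REINDEX INDEX` are standard PostgreSQL; a plain REINDEX blocks writes to the table while it runs, so it needs to be scheduled around replication updates.

```python
# SQL text only; run the statements with psql or a driver of your choice.

# Standard statistics view: live/dead tuple counts and the last autovacuum run.
DEAD_TUPLE_QUERY = """
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
""".strip()

# btree and GIN indexes named in the size tables on this page (GiST ones left out).
BTREE_AND_GIN_INDEXES = [
    "pk_nodes", "idx_nodes_tags",
    "pk_ways", "idx_ways_tags",
    "pk_way_nodes", "idx_way_nodes_node_id",
    "pk_relations", "idx_relations_tags",
    "pk_relation_members", "idx_relation_members_member_id_and_type",
]


def reindex_statements(index_names):
    """Return one REINDEX statement per index name (hypothetical helper)."""
    return [f"REINDEX INDEX {name};" for name in index_names]


if __name__ == "__main__":
    print(DEAD_TUPLE_QUERY)
    for stmt in reindex_statements(BTREE_AND_GIN_INDEXES):
        print(stmt)
```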

The numbers below come from a database that ran from Sept 2012 to May 2013 (7 months). They show how much larger each old index was than the same index after reindexing. They cover btree indexes and GIN tag indexes only, not GiST indexes.

| Table | Increase in size of the old index |
| --- | --- |
| way_nodes | +50% |
| ways | +100% |
| relation_members | +130% |
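The percentages above express the old index size relative to the freshly rebuilt one. A small sketch of that arithmetic, assuming both sizes are byte counts such as those returned by PostgreSQL's `pg_relation_size`; the function names and the 3 GB / 2 GB sample are illustrative, not measurements from this page:

```python
def bloat_percent(old_bytes, rebuilt_bytes):
    """Percentage by which the old index exceeds the rebuilt index."""
    if rebuilt_bytes <= 0:
        raise ValueError("rebuilt_bytes must be positive")
    return (old_bytes - rebuilt_bytes) / rebuilt_bytes * 100.0


def reclaimable_fraction(increase_percent):
    """Share of the old index that a rebuild frees, given its bloat percentage."""
    return increase_percent / (100.0 + increase_percent)


if __name__ == "__main__":
    # Illustrative sizes only: an index that shrinks from 3 GB to 2 GB was 50% larger.
    print(bloat_percent(3 * 1024**3, 2 * 1024**3))
    # For the +100% reported for ways, half of the old index was reclaimable.
    print(reclaimable_fraction(100))
```

Read this way, the +130% for relation_members means a little over half of the old index was dead space.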

Size

The numbers below are for the full planet kept up to date to April 2013, with linestrings built.

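users

| Column name | Size |
| --- | --- |
| id | |
| name | |
| Total | 10MB |

| Index name | Size | Creation time |
| --- | --- | --- |
| pk_users | 6808kB | |
| Total | | |
| Grand total (table and indexes) | | |
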
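nodes (1.91b rows)

| Column name | Size |
| --- | --- |
| id | 14GB |
| version | 7.3GB |
| user_id | 7.3GB |
| tstamp | 14GB |
| changeset_id | 14GB |
| tags | 17GB |
| geom | 52GB |
| Total | 188GB |

| Index name | Size | Creation time |
| --- | --- | --- |
| idx_nodes_geom | 91GB | 14h24m |
| idx_nodes_tags | 15GB | 1h13m |
| pk_nodes | 40GB | 59m |
| Total | 146GB | 16h36m |
| Grand total (table and indexes) | 334GB | |
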
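ways (185m rows)

| Column name | Size |
| --- | --- |
| id | 1414MB |
| version | 707MB |
| tstamp | 1414MB |
| changeset_id | 1414MB |
| tags | 18GB |
| nodes | 18GB |
| bbox | 20GB |
| linestring | 39GB |
| Total | 108GB |

| Index name | Size | Creation time |
| --- | --- | --- |
| pk_ways | 3969MB | 18m |
| idx_ways_bbox | 9727MB | 52m |
| idx_ways_linestring | 9727MB | 59m |
| idx_ways_tags | 8706MB | 1h38m |
| Total | | |
| Grand total (table and indexes) | | |
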
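way_nodes (2.3b rows)

| Column name | Size |
| --- | --- |
| Total | 110GB |

| Index name | Size | Creation time |
| --- | --- | --- |
| pk_way_nodes | 67GB | 1h9m |
| idx_way_nodes_node_id | 47GB | 1h56m |
| Total | | |
| Grand total (table and indexes) | | |
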
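relations (1.98m rows)

| Column name | Size |
| --- | --- |
| id | 15MB |
| version | 7735kB |
| tstamp | 15MB |
| changeset_id | 15MB |
| tags | 231MB |
| Total | 353MB |

| Index name | Size | Creation time |
| --- | --- | --- |
| pk_relations | 42MB | 5s |
| idx_relations_tags | | 1m42s |
| Total | | |
| Grand total (table and indexes) | | |
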
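relation_members (xxx rows)

| Column name | Size |
| --- | --- |
| Total | |

| Index name | Size | Creation time |
| --- | --- | --- |
| pk_relation_members | 674MB | 39s |
| idx_relation_members_member_id_and_type | 674MB | 56s |
| Total | | |
| Grand total (table and indexes) | | |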
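The page does not say how the per-column and per-index figures were obtained. One plausible way to produce comparable numbers is to sum `pg_column_size` over a column and to call `pg_relation_size` on an index; both are standard PostgreSQL functions, while the helper names below are hypothetical. The sketch only builds the SQL text and assumes the table, column and index names are trusted identifiers.

```python
def column_size_query(table, column):
    """SQL that sums the stored size of one column over the whole table."""
    # Names are interpolated directly, so pass trusted identifiers only.
    return (
        f"SELECT pg_size_pretty(sum(pg_column_size({column}))) "
        f"FROM {table};"
    )


def index_size_query(index):
    """SQL that reports the on-disk size of one index."""
    return f"SELECT pg_size_pretty(pg_relation_size('{index}'));"


if __name__ == "__main__":
    print(column_size_query("nodes", "geom"))
    print(index_size_query("idx_nodes_geom"))
```

Each column query scans the whole table, so on a full planet it can be expected to run for a long time.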

Import times

The most efficient way to import the planet is with --write-pgsql-dump. Besides producing more efficient SQL, it lets osmosis use all of the machine's memory in the first stage and leaves all of it for index builds in the later stage.

Values are for a 20213MB PBF planet updated to May 31, 2013. The database was on a 6-disk RAID10 array of consumer 7200RPM drives, in a machine with 32GB of memory and an AMD Phenom II X6 1090T. The planet file and the dump .txt files were on a RAID10 array of 2TB 7200RPM drives with 1TB platters.

Osmosis 0.43 was used with the command

       osmosis --read-pbf-fast workers=3 --write-pgsql-dump nodeLocationStoreType=InMemory keepInvalidWays=yes enableBboxBuilder=yes enableLinestringBuilder=yes

Total time was 4:18:34. Resource usage for the run, in the format printed by GNU time -v:

       User time (seconds): 37114.18
       System time (seconds): 773.61
       Percent of CPU this job got: 244%
       Elapsed (wall clock) time (h:mm:ss or m:ss): 4:18:34
       Average shared text size (kbytes): 0
       Average unshared data size (kbytes): 0
       Average stack size (kbytes): 0
       Average total size (kbytes): 0
       Maximum resident set size (kbytes): 126439008
       Average resident set size (kbytes): 0
       Major (requiring I/O) page faults: 1103342
       Minor (reclaiming a frame) page faults: 22992736
       Voluntary context switches: 4913429
       Involuntary context switches: 7905070
       Swaps: 0
       File system inputs: 91933400
       File system outputs: 831028824
       Socket messages sent: 0
       Socket messages received: 0
       Signals delivered: 0
       Page size (bytes): 4096
       Exit status: 0
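The Osmosis command as quoted on this page does not show the input file or the output directory. A sketch of how the full invocation might be assembled, using the options from that command plus the `file=` and `directory=` arguments of the two tasks; the planet path and dump directory are placeholders, and the function name is hypothetical:

```python
import shlex


def osmosis_dump_command(planet_pbf, dump_dir, workers=3):
    """Build the argv list for the dump stage (options as quoted on this page)."""
    return [
        "osmosis",
        "--read-pbf-fast", f"file={planet_pbf}", f"workers={workers}",
        "--write-pgsql-dump", f"directory={dump_dir}",
        "nodeLocationStoreType=InMemory",
        "keepInvalidWays=yes",
        "enableBboxBuilder=yes",
        "enableLinestringBuilder=yes",
    ]


if __name__ == "__main__":
    # Placeholder paths; substitute your own planet file and dump directory.
    argv = osmosis_dump_command("planet-latest.osm.pbf", "pgimport")
    print(shlex.join(argv))
    # To actually run it: subprocess.run(argv, check=True)
```

The resulting .txt files are then loaded into PostgreSQL with \copy; Osmosis ships a pgsnapshot_load_0.6.sql script that can serve as a starting point for that second stage.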

| Table | Dump .txt size | \copy time | Rows | Size in DB without indexes |
| --- | --- | --- | --- | --- |
| users | 3.8MB | 851ms | 234k | 10MB |
| nodes | 194GB | 2h48m | 1.91b | 188GB |
| ways | 154GB | 2h2m | 185m | 108GB |
| way_nodes | 48GB | 1h40m | 2.27b | 110GB |
| relations | 306MB | 14s | 1.98m | 353MB |
| relation_members | 563MB | 78s | 22.4m | 1225MB |
| Total | 397GB | 6h33m | - | - |
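The figures in the table can be turned into load rates, which makes tables of very different sizes easier to compare. The sketch below is only a convenience calculation on the numbers above; the parsing helper is hypothetical and handles just the h/m format used for the three largest tables.

```python
import re


def duration_seconds(text):
    """Parse durations such as '2h48m' or '59m' into seconds."""
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?", text)
    if not match or not text:
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60


# (rows, load time) for the three largest tables, copied from the table above.
LOADS = {
    "nodes": (1.91e9, "2h48m"),
    "ways": (185e6, "2h2m"),
    "way_nodes": (2.27e9, "1h40m"),
}


def rows_per_second(rows, duration):
    """Average number of rows loaded per second."""
    return rows / duration_seconds(duration)


if __name__ == "__main__":
    for table, (rows, duration) in LOADS.items():
        print(f"{table}: {rows_per_second(rows, duration):,.0f} rows/s")
```

This works out to roughly 190k rows per second for nodes, 25k for ways and 380k for way_nodes, which fits the ways rows being much wider once bboxes and linestrings are included.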

With bboxes only: 3:45:03 wall clock, ways.txt 82GB.

With linestrings only: 3:54:14 wall clock, ways.txt 120GB.

With neither: ways.txt 48GB.
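These ways.txt sizes are consistent with the 154GB reported in the table for the run with both builders enabled, if each builder is assumed to add a fixed amount to the file. A short check of that arithmetic (variable names are illustrative):

```python
# ways.txt sizes in GB, as reported above.
NEITHER = 48
BBOX_ONLY = 82
LINESTRING_ONLY = 120

bbox_cost = BBOX_ONLY - NEITHER              # extra GB from the bbox column
linestring_cost = LINESTRING_ONLY - NEITHER  # extra GB from the linestring column

# Assumes the two columns add independently to the file size.
predicted_both = NEITHER + bbox_cost + linestring_cost

if __name__ == "__main__":
    print(bbox_cost, linestring_cost, predicted_both)
```

Under that assumption the bbox column accounts for 34GB of the dump and the linestring column for 72GB.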