Osm2pgsql/benchmarks
Background
Benchmarks of osm2pgsql are useful reference points because importing OSM data is highly dependent on machine hardware and software configuration. Importing a complete planet file can take days even on a typical higher-end desktop machine. Importing an extract (subset) of the planet file takes considerably less time and should be preferred over the full planet file whenever possible.
What affects import time?
A partial list of variables that affect import time:
- Hardware
- Hard disk throughput ( HDD, SSD )
- RAM size
- CPU
- Operating system
- OS used and whether it is a 32-bit or 64-bit version
- Linux disk scheduler (CFQ, noop, deadline, ...)
- Linux file system (ext4/LVM, xfs/RAID, ...)
- Database
- Version
- PostgreSQL 8.2 and PostGIS 1.5
- PostgreSQL 9.1 and PostGIS 2.0
- PostgreSQL 9.2 and PostGIS 2.1
- ...
- PostgreSQL config settings ( postgresql.conf )
- shared_buffers
- work_mem
- maintenance_work_mem
- synchronous_commit
- fsync
- autovacuum
- checkpoint_segments
- ...
- Osm2pgsql application
- osm2pgsql version ( 32bit, 64bit )
- parameters
- "Slim Mode" or not
- Size of the node cache (-C command line argument)
- Flat node storage
- Hstore / style
- Parallelisation ( --number-processes )
- Use case
- size of the import file: full planet or an extract
- input reader and input file format
- the experimental 'primitive' XML parser reads OSM XML input about 30% faster than the default 'libxml2' reader (but it currently seems to have a bug in its support for some XML entities, such as the apostrophe entity)
- the 'pbf' input reader reads OSM PBF files about twice as fast as the 'libxml2' parser parses an equivalent OSM XML file
- other customisation ( special index )
For more information, see Frederik Ramm: Optimising the Mapnik/osm2pgsql Rendering Toolchain 2.0 @ SOTM 2012. A combined example invocation follows.
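As a concrete illustration, here is a minimal sketch of a planet import that combines several of the variables listed above; the database name, cache size and paths are illustrative assumptions, not recommendations:

# Hypothetical import combining the knobs discussed above:
# --slim               keep intermediate data in the database (enables updates)
# -C 16000             node cache size in MB
# --flat-nodes FILE    store node positions in a file instead of the database
# --hstore             keep all tags in an hstore column
# --number-processes   parallelise where supported
osm2pgsql --slim -d gis -C 16000 --flat-nodes /path/to/node.cache \
    --hstore --number-processes 6 planet-latest.osm.pbf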
Example Output
The following is sample output generated while importing a planet file with osm2pgsql, for reference. The Linux time command was used to report how long osm2pgsql took to finish. Note that the NOTICE lines are normal when importing into an empty database.
~/osm2pgsql$ time osm2pgsql -s -v -U mapper -S ./default.style -d gis -C 3000 ../planet/planet-100324.osm.bz2
osm2pgsql SVN version 0.69-20672
Using projection SRS 900913 (Spherical Mercator)
Setting up table: planet_osm_point
NOTICE: table "planet_osm_point" does not exist, skipping
NOTICE: table "planet_osm_point_tmp" does not exist, skipping
Setting up table: planet_osm_line
NOTICE: table "planet_osm_line" does not exist, skipping
NOTICE: table "planet_osm_line_tmp" does not exist, skipping
Setting up table: planet_osm_polygon
NOTICE: table "planet_osm_polygon" does not exist, skipping
NOTICE: table "planet_osm_polygon_tmp" does not exist, skipping
Setting up table: planet_osm_roads
NOTICE: table "planet_osm_roads" does not exist, skipping
NOTICE: table "planet_osm_roads_tmp" does not exist, skipping
Mid: pgsql, scale=100, cache=3000MB, maxblocks=384001*8192
Setting up table: planet_osm_nodes
NOTICE: table "planet_osm_nodes" does not exist, skipping
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "planet_osm_nodes_pkey" for table "planet_osm_nodes"
Setting up table: planet_osm_ways
NOTICE: table "planet_osm_ways" does not exist, skipping
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "planet_osm_ways_pkey" for table "planet_osm_ways"
Setting up table: planet_osm_rels
NOTICE: table "planet_osm_rels" does not exist, skipping
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "planet_osm_rels_pkey" for table "planet_osm_rels"
Reading in file: ../planet/planet-100324.osm.bz2
Processing: Node(574797k) Way(43465k) Relation(87k)
excepton caught processing way id=110802
excepton caught processing way id=110803
Processing: Node(574797k) Way(43465k) Relation(465k)
Node stats: total(574797076), max(673005476)
Way stats: total(43465572), max(53189409)
Relation stats: total(465800), max(533629)
Going over pending ways
processing way (12179k)
Going over pending relations
node cache: stored: 386841238(67.30%), storage efficiency: 98.38%, hit rate: 65.63%
Committing transaction for planet_osm_roads
Committing transaction for planet_osm_polygon
Committing transaction for planet_osm_line
Sorting data and creating indexes for planet_osm_polygon
Sorting data and creating indexes for planet_osm_roads
Sorting data and creating indexes for planet_osm_line
Committing transaction for planet_osm_point
Sorting data and creating indexes for planet_osm_point
Completed planet_osm_point
Completed planet_osm_roads
Completed planet_osm_polygon
Stopping table: planet_osm_nodes
Stopping table: planet_osm_rels
Stopping table: planet_osm_ways
Building index on table: planet_osm_rels
Stopped table: planet_osm_nodes
Building index on table: planet_osm_ways
Stopped table: planet_osm_rels
Completed planet_osm_line
Stopped table: planet_osm_ways
real 2985m27.269s
user 327m47.240s
sys 35m32.480s
Explanation and progress bar
If you want to know how far the import has gone, compare the statistics with the "Processing: Node" section of the output. Note that processing a relation takes approximately 10 times as long as processing a way (which in turn takes approximately 10 times as long as processing a node). In my case, nodes were processed at 40.8k/s, ways at 0.13k/s and relations at ..k/s. Also, closing a table takes approximately as long as importing the data, so once the nodes are imported you are roughly 1/6 of the way through the total import.
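Using these weights, a rough percentage can be computed from the counters in the "Processing:" line. A minimal shell sketch; all counts are illustrative (taken in thousands from the output above), and the expected totals are assumed to come from a previous import or published planet statistics:

# Weighted progress of the reading phase, assuming one way costs ~10 nodes
# and one relation ~10 ways, as described above. All numbers illustrative.
DONE_NODES=574797; DONE_WAYS=43465; DONE_RELS=87       # current Processing: line (thousands)
TOTAL_NODES=574797; TOTAL_WAYS=43465; TOTAL_RELS=465   # expected totals (thousands)
WEIGHTED_DONE=$((DONE_NODES + 10*DONE_WAYS + 100*DONE_RELS))
WEIGHTED_TOTAL=$((TOTAL_NODES + 10*TOTAL_WAYS + 100*TOTAL_RELS))
echo "reading phase roughly $((100*WEIGHTED_DONE/WEIGHTED_TOTAL))% complete"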
Benchmarks
Benchmarks contributed by users. Currently these simply use the time command to report how long the osm2pgsql task takes to complete. If you do not have time available, please provide some other meaningful metric. A better organization and formatting standard for this section is needed.
Update performance osm2pgsql classic vs. persistent node cache
While import performance mattered greatly when imports took weeks, today it is more interesting to improve the time it takes to catch up and keep the database updated. For example, in the above example it takes 24 hours to import and much longer to catch up with the current diffs. Even assuming a planet dump can be downloaded in a couple of hours, an installation has to catch up at least the 3 days it takes to actually produce the dump. Experience further shows that rendering and updating tend to impact each other, leading to installations falling substantially behind over time.
Kai Krüger has added an experimental "persistent" node cache mode to osm2pgsql that promises some improvements. To see if this is actually the case, I ran a further import on the above hardware with the persistent node cache enabled and updated the DB for a while.
Results for 15x6h updates starting 1st August 2012:
osm2pgsql classic: 36 hours
osm2pgsql experimental: 27.7 hours
Notes:
- in both cases osm2pgsql was run with --number-processes 1; both numbers would be substantially lower if run with more parallel processes, but there was an issue with this in combination with the persistent node cache at the time. This issue has since been fixed.
- not all 6h updates showed the same performance gains, this probably warrants further investigation.
- no significant difference in import time.
As a further comparison, numbers from the fastest installation I currently have (Intel E5-1650): 4.8h. A sketch of such an update run follows.
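For orientation, a catch-up run of this kind boils down to repeatedly applying replication diffs in append mode. A minimal sketch, assuming the diff file changes.osc.gz has already been downloaded (for example by osmosis) and the database was imported with --slim:

# Apply one replication diff to an existing slim-mode database.
# The node cache can be much smaller for updates than for the initial import.
osm2pgsql --append --slim -C 2000 --number-processes 1 -d gis changes.osc.gz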
Planet import on custom E5-1650 (32GB RAM - 6 cores)
- Intel E5-1650
- 32GB ECC RAM
- Supermicro X9SRA motherboard
- Ubuntu 12.04 LTS
- PostgreSQL 9.1 (note: these settings are for import only; a sketch for applying and reverting them follows this list)
- shared_buffers = 4GB
- work_mem = 100MB
- effective_io_concurrency = 2
- fsync = off (import only!)
- synchronous_commit = off
- full_page_writes = off
- checkpoint_segments = 100
- checkpoint_completion_target = 0.9
- autovacuum = off (import only!)
- Kai Krüger tile server packages
- osm2pgsql SVN version 0.80.0 (64bit id space)
- Planet from 2012-07-04
- 1 x Intel 520 120GB SSD - / (OS etc)
- 1 x OCZ Agility 3 240GB SSD - rendering database without slim tables
- 2 x WD VelociRaptor WD1000DHTZ, 10'000rpm, 64MB, 1TB, SATA-3 in software RAID 1 - pg_xlog, slim tables and indices
- 2 x Seagate Barracuda 3TB, 7200rpm, 64MB, 3TB, SATA-3 in software RAID 1
- Notes:
- using the deadline I/O scheduler on the SSD has no apparent effect in Ubuntu 12.04
- if you don't want or need ECC support, a core i7 39xx processor will give the same performance with substantially cheaper memory. For example a Hetzner EX10 with a 240GB SSD should give even better performance (due to more memory).
- increasing maintenance_work_mem over 4GB had no significant effect, except that with 16GB the import process became very slow.
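For reference, a minimal sketch of switching the cluster into this "import mode" and back; the configuration path matches the Ubuntu/PostgreSQL 9.1 setup above but should be adjusted to your layout. Note that fsync, synchronous_commit, full_page_writes and autovacuum take effect on a reload, while changing shared_buffers requires a restart:

# Append the import-only overrides and reload PostgreSQL.
CONF=/etc/postgresql/9.1/main/postgresql.conf
sudo tee -a "$CONF" >/dev/null <<'EOF'
# --- import-only settings, remove after the import ---
fsync = off
synchronous_commit = off
full_page_writes = off
autovacuum = off
EOF
sudo service postgresql reload
# After the import: delete the block from postgresql.conf and reload again.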
No hstore
- DB config
- effective_cache_size = 16GB
- maintenance_work_mem = 1GB
- time /usr/local/bin/osm2pgsql -r pbf --tablespace-main-index gisidx --tablespace-slim-data gisslim --tablespace-slim-index gisslimidx --slim -C 16000 --number-processes 6 planet-120704.osm.pbf
Processing: Node(1507455k 311.7k/s) Way(141480k 38.61k/s) Relation(1469760 74.91/s)  parse time: 28120s
Node stats: total(1507455627), max(1812375450) in 4836s
Way stats: total(141480878), max(170086664) in 3664s
Relation stats: total(1469765), max(2263585) in 19620s
Pending ways took 6071s at a rate of 13397.84/s
Pending relations took 1s at a rate of 0.00/s
All indexes on planet_osm_roads created in 682s
All indexes on planet_osm_point created in 1340s
All indexes on planet_osm_line created in 3723s
All indexes on planet_osm_polygon created in 4274s
Stopped table: planet_osm_ways in 23804s
Osm2pgsql took 57999s overall (16.1 hours)
real 966m38.279s
user 324m38.769s
sys 35m32.480s
- DB config
- effective_cache_size = 28GB
- maintenance_work_mem = 4GB (setting this to 16GB led to extremely long indexing run times, probably would require more memory in the machine)
- time /usr/local/bin/osm2pgsql -r pbf --tablespace-main-index gisidx --tablespace-slim-data gisslim --tablespace-slim-index gisslimidx --slim -C 16000 --number-processes 6 planet-120704.osm.pbf
Processing: Node(1507455k 299.8k/s) Way(141480k 38.37k/s) Relation(1469760 75.33/s)  parse time: 28227s
Node stats: total(1507455627), max(1812375450) in 5028s
Way stats: total(141480878), max(170086664) in 3687s
Relation stats: total(1469765), max(2263585) in 19512s
81338263 Pending ways took 5986s at a rate of 13588.08/s
0 Pending relations took 1s at a rate of 0.00/s
All indexes on planet_osm_roads created in 766s
All indexes on planet_osm_point created in 1366s
All indexes on planet_osm_line created in 4028s
Stopped table: planet_osm_ways in 17558s
Osm2pgsql took 51778s overall (14.4 hours)
real 862m58.007s
user 325m26.384s
sys 42m20.939s
- DB config
- effective_cache_size = 28GB
- maintenance_work_mem = 4GB (setting this to 16GB led to extremely long indexing run times, probably would require more memory in the machine)
- Indices of the slim tables moved to SSD
- time /usr/local/bin/osm2pgsql -r pbf --tablespace-main-index gisidx --tablespace-slim-data gisslim --tablespace-slim-index gisidx --slim -C 16000 --number-processes 6 planet-120704.osm.pbf
- SSD usage after import: /dev/sdb1 234152908 181270760 41160648 82% /db
Processing: Node(1507455k 301.6k/s) Way(141480k 37.49k/s) Relation(1469760 81.71/s)  parse time: 26760s
Node stats: total(1507455627), max(1812375450) in 4998s
Way stats: total(141480878), max(170086664) in 3774s
Relation stats: total(1469765), max(2263585) in 17988s
81338263 Pending ways took 6158s at a rate of 13208.55/s
0 Pending relations took 1s at a rate of 0.00/s
All indexes on planet_osm_roads created in 760s
All indexes on planet_osm_point created in 1395s
All indexes on planet_osm_line created in 3950s
All indexes on planet_osm_polygon created in 4449s
Stopped table: planet_osm_ways in 7086s
Osm2pgsql took 40016s overall (11.1 hours)
real 666m56.457s
user 335m7.557s
sys 40m0.366s
Pro memoria: base tablespace 49GB, gisidx 14GB, gisslim 112GB, gisslimidx 114GB
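For reference, the --tablespace-main-index and --tablespace-slim-* options used in the runs above refer to PostgreSQL tablespaces that must exist before the import. A minimal sketch; the mount-point paths are hypothetical, and the directories must be owned by the postgres user:

# Create the tablespaces referenced by the osm2pgsql invocations above.
sudo -u postgres psql <<'EOF'
CREATE TABLESPACE gisidx     LOCATION '/ssd/pg_gisidx';
CREATE TABLESPACE gisslim    LOCATION '/raid/pg_gisslim';
CREATE TABLESPACE gisslimidx LOCATION '/raid/pg_gisslimidx';
EOF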
With hstore
- DB config
- effective_cache_size = 28GB
- maintenance_work_mem = 8GB (note no difference to using 4GB)
- Slim indices on SSD
- time /usr/local/bin/osm2pgsql -r pbf --tablespace-main-index gisidx --tablespace-slim-data gisslim --tablespace-slim-index gisidx --slim -C 16000 --number-processes 6 --hstore planet-120704.osm.pbf
- SSD usage /dev/sdb1 234152908 197481692 24949716 89% /db
- slim tables ~128GB
Processing: Node(1507455k 297.3k/s) Way(141480k 34.32k/s) Relation(1469760 74.25/s)  parse time: 28990s
Node stats: total(1507455627), max(1812375450) in 5071s
Way stats: total(141480878), max(170086664) in 4123s
Relation stats: total(1469765), max(2263585) in 19796s
81560808 Pending ways took 6559s at a rate of 12434.95/s
0 Pending relations took 2s at a rate of 0.00/s
All indexes on planet_osm_roads created in 822s
All indexes on planet_osm_point created in 2106s
All indexes on planet_osm_line created in 4402s
All indexes on planet_osm_polygon created in 4794s
Stopped table: planet_osm_ways in 7116s
Osm2pgsql took 42685s overall (11.9 hours)
real 711m24.621s
user 351m38.131s
sys 41m8.714s
No hstore, persistent node cache
Persistent node cache located on the Intel SSD. Slim DB indices are down from ~120GB to 81GB, and slim DB tables are down from ~120GB to 46GB.
- DB config
- effective_cache_size = 16GB
- maintenance_work_mem = 1GB
- time /usr/local/bin/osm2pgsql -r pbf --tablespace-main-index gisidx --tablespace-slim-data gisslim --tablespace-slim-index gisslimidx --slim -C 16000 --number-processes 6 --flat-nodes node.cache planet-120704.osm.pbf
Processing: Node(1507455k 1779.8k/s) Way(141480k 37.74k/s) Relation(1469760 74.66/s)  parse time: 24282s
Node stats: total(1507455627), max(1812375450) in 847s
Way stats: total(141480878), max(170086664) in 3749s
Relation stats: total(1469765), max(2263585) in 19686s
Pending ways took 7113s at a rate of 11435.16/s
Pending relations took 2s at a rate of 0.00/s
All indexes on planet_osm_roads created in 738s
All indexes on planet_osm_point created in 1354s
All indexes on planet_osm_line created in 3895s
All indexes on planet_osm_polygon created in 4375s
Stopped table: planet_osm_ways in 26922s
Osm2pgsql took 58330s overall (16.2 hours)
real 972m10.578s
user 317m51.896s
sys 97m48.319s
Flat file node cache located on the Intel SSD. Complete DB located on the 240GB SSD.
- DB config
- effective_cache_size = 28GB
- maintenance_work_mem = 4GB
- time /usr/local/bin/osm2pgsql -r pbf --tablespace-main-index gisidx --tablespace-slim-data gisidx --tablespace-slim-index gisidx --slim -C 16000 --number-processes 6 --flat-nodes node.cache planet-120704.osm.pbf
- SSD usage after import: /dev/sdb1 234152908 193942684 28488724 88% /db
Processing: Node(1507455k 1765.2k/s) Way(141480k 39.04k/s) Relation(1469760 129.80/s)  parse time: 15802s
Node stats: total(1507455627), max(1812375450) in 854s
Way stats: total(141480878), max(170086664) in 3624s
Relation stats: total(1469765), max(2263585) in 11324s
81338263 Pending ways took 5196s at a rate of 15654.02/s
0 Pending relations took 1s at a rate of 0.00/s
All indexes on planet_osm_roads created in 773s
All indexes on planet_osm_point created in 1480s
All indexes on planet_osm_line created in 4232s
All indexes on planet_osm_polygon created in 4798s
Stopped table: planet_osm_ways in 7163s
Osm2pgsql took 28165s overall (7.8 hours)
real 469m24.696s
user 312m14.919s
sys 38m22.284s
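A note on sizing the --flat-nodes file used in the last two runs: it stores each node location at a fixed offset of roughly 8 bytes per node ID, so its size is governed by the highest node ID in the planet rather than by the node count. A quick estimate, with the maximum ID taken from the log above:

# Approximate flat node cache size: maximum node ID * 8 bytes.
MAX_NODE_ID=1812375450   # from "Node stats: ... max(1812375450)" above
echo "approx $((MAX_NODE_ID * 8 / 1024 / 1024 / 1024)) GiB"   # ~13 GiB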
Planet import on custom server VM (10 cores, 24 GB, RAID5)
Same configuration as Osm2pgsql/benchmarks#Planet_import_on_custom_server_VM_.2810_cores.2C_16_GB.2C_RAID5.29, except with 8 GB more RAM, and I used -C 23500.
Osm2pgsql took 393444s overall (4.5 days)
IBM LS21 blade (4 opteron cores @ 2.4GHz / 8GB RAM) with SSD
Less than 3 hours to import the France extract. Diff updates: 1 hour of diffs applied in 5-20s.
12h30 for the Europe extract import.
See the full hardware/software/parameter details and osm2pgsql sessions.
Dell PowerEdge R520 2 x Xeon E5-2420@1.9Ghz (12 cores) 128GB RAM
- Slackware Linux 14 64-bit, XFS filesystems, 24 logical CPUs
- PostgreSQL 9.1.8
- Postgres on 6x600GB 15kRPM SAS drives in RAID6 (PERC H710p)
- Planet pbf file on 2x2TB 7.2kRPM SAS in RAID1 (PERC H710p)
/usr/local/bin/osm2pgsql -U mapnik -P 5432 -r pbf --slim --create -C 16000 --number-processes 12 --hstore --flat-nodes /home/node.cache /storage/planet/planet-130313.osm.pbf
All indexes on planet_osm_roads created in 1716s
All indexes on planet_osm_point created in 3727s
All indexes on planet_osm_line created in 11446s
All indexes on planet_osm_polygon created in 11841s
Stopped table: planet_osm_ways in 21888s
Osm2pgsql took 71613s overall (~20h, 20GB planet pbf)
- This was the 2nd run on this machine, and the results could surely be better. The CPU seems to have been the bottleneck here.
- shared_buffers=20GB
- temp_buffers=100MB
- work_mem=1GB
- maintenance_work_mem=4GB
- checkpoint_segments = 400
- checkpoint_timeout = 1h
- fsync=off
- autovacuum=off
- effective_io_concurrency = 6
I would greatly appreciate expert advice on the Postgres configuration. tomasz (@t) salwach.pl
AWS RDS db.m2.2xlarge, 600GB storage, 6000 provisioned IOPS
- load was performed from another EC2 instance
- total import time: 116004s
osm2pgsql -U gisuser --slim -C 20000 -d gis --host myhost --flat-nodes=flat.file --number-processes 12 --hstore ./planet-latest.osm.pbf
osm2pgsql SVN version 0.85.0 (64bit id space)
Using projection SRS 900913 (Spherical Mercator)
Setting up table: planet_osm_point
NOTICE: table "planet_osm_point_tmp" does not exist, skipping
Setting up table: planet_osm_line
NOTICE: table "planet_osm_line_tmp" does not exist, skipping
Setting up table: planet_osm_polygon
NOTICE: table "planet_osm_polygon_tmp" does not exist, skipping
Setting up table: planet_osm_roads
NOTICE: table "planet_osm_roads_tmp" does not exist, skipping
Allocating memory for dense node cache
Allocating dense node cache in one big chunk
Allocating memory for sparse node cache
Sharing dense sparse
Node-cache: cache=20000MB, maxblocks=2560000*8192, allocation method=11
Mid: loading persistent node cache from flat.file
Allocated space for persistent node cache file
Maximum node in persistent node cache: 0
Mid: pgsql, scale=100 cache=20000
Setting up table: planet_osm_nodes
Setting up table: planet_osm_ways
Setting up table: planet_osm_rels
Reading in file: ./planet-latest.osm.pbf
Node stats: total(2084251660), max(2530028096) in 1974s
Way stats: total(204149024), max(245905425) in 10574s
Relation stats: total(2234511), max(3322905) in 25338s
Committing transaction for planet_osm_point
Committing transaction for planet_osm_line
Committing transaction for planet_osm_polygon
Committing transaction for planet_osm_roads
Going over pending ways...
Maximum node in persistent node cache: 2530213887
125034024 ways are pending
Using 12 helper-processes
All child processes exited
125034024 Pending ways took 23071s at a rate of 5419.53/s
Committing transaction for planet_osm_point
Committing transaction for planet_osm_line
Committing transaction for planet_osm_polygon
Committing transaction for planet_osm_roads
Going over pending relations...
Maximum node in persistent node cache: 2530213887
0 relations are pending
Using 12 helper-processes
All child processes exited
0 Pending relations took 69s at a rate of 0.00/s
Sorting data and creating indexes for planet_osm_point
Sorting data and creating indexes for planet_osm_line
Sorting data and creating indexes for planet_osm_polygon
node cache: stored: 2084251660(100.00%), storage efficiency: 86.72% (dense blocks: 2199604, sparse nodes: 75448717), hit rate: 100.00%
Sorting data and creating indexes for planet_osm_roads
Maximum node in persistent node cache: 2530213887
Stopping table: planet_osm_nodes
Stopping table: planet_osm_ways
Stopping table: planet_osm_rels
Building index on table: planet_osm_ways (fastupdate=off)
Building index on table: planet_osm_rels (fastupdate=off)
Stopped table: planet_osm_nodes in 0s
Stopped table: planet_osm_rels in 407s
Analyzing planet_osm_roads finished
Analyzing planet_osm_line finished
Analyzing planet_osm_polygon finished
Analyzing planet_osm_point finished
Copying planet_osm_roads to cluster by geometry finished
Creating geometry index on planet_osm_roads
Creating osm_id index on planet_osm_roads
Creating indexes on planet_osm_roads finished
All indexes on planet_osm_roads created in 6883s
Completed planet_osm_roads
Copying planet_osm_point to cluster by geometry finished
Creating geometry index on planet_osm_point
Creating osm_id index on planet_osm_point
Creating indexes on planet_osm_point finished
All indexes on planet_osm_point created in 9503s
Completed planet_osm_point
Copying planet_osm_line to cluster by geometry finished
Creating geometry index on planet_osm_line
Creating osm_id index on planet_osm_line
Creating indexes on planet_osm_line finished
All indexes on planet_osm_line created in 28117s
Completed planet_osm_line
Copying planet_osm_polygon to cluster by geometry finished
Creating geometry index on planet_osm_polygon
Creating osm_id index on planet_osm_polygon
Creating indexes on planet_osm_polygon finished
All indexes on planet_osm_polygon created in 37781s
Completed planet_osm_polygon
Stopped table: planet_osm_ways in 54957s
Osm2pgsql took 116004s overall
Parallels Virtual Machine on 15 inch Retina MacBook Pro (Early 2013)
- 2.7GHz Core i7 (3rd Gen)
- 16GB RAM
- 512GB SSD
- OSX 10.9.3
VM Configuration
- Parallels 9
- Ubuntu 12.04
- 6 of 8 available cores
- 12GB RAM
- 410GB 'expanding' drive
The options used:
- -s : store temp data in the database; saves RAM and enables future updating
- -d gis : use the 'gis' database
- -C 10240 : use 10GB of RAM to cache nodes for processing
- --number-processes 5 : use up to 5 processes
- --flat-nodes /usr/local/share/flat_nodes/nodes.flat : store nodes in a file rather than in the database
osm2pgsql -s -d gis -C 10240 --number-processes 5 --flat-nodes /usr/local/share/flat_nodes/nodes.flat ~/planet/planet-latest.osm.pbf
Processing: Node(2407993k 1722.5k/s) Way(240538k 15.03k/s) Relation(2651850 139.00/s)  parse time: 36485s
Committing transaction for planet_osm_point
Committing transaction for planet_osm_line
Committing transaction for planet_osm_polygon
Committing transaction for planet_osm_roads
Going over pending ways...
WARNING: Failed to fork helper processes. Falling back to only using 1
153417313 Pending ways took 55154s at a rate of 2781.62/s
Committing transaction for planet_osm_point
Committing transaction for planet_osm_line
Committing transaction for planet_osm_polygon
Committing transaction for planet_osm_roads
Going over pending relations...
0 Pending relations took 2s at a rate of 0.00/s
Sorting data and creating indexes for planet_osm_line
node cache: stored: 1270253309(52.75%), storage efficiency: 94.64% (dense blocks: 1200297, sparse nodes: 56537016), hit rate: -151.30%
Sorting data and creating indexes for planet_osm_polygon
Sorting data and creating indexes for planet_osm_point
Sorting data and creating indexes for planet_osm_roads
Maximum node in persistent node cache: 2932867071
Stopping table: planet_osm_nodes
Stopping table: planet_osm_ways
Stopped table: planet_osm_nodes in 0s
Stopping table: planet_osm_rels
Building index on table: planet_osm_rels (fastupdate=off)
Building index on table: planet_osm_ways (fastupdate=off)
Analyzing planet_osm_point finished
Analyzing planet_osm_line finished
Analyzing planet_osm_polygon finished
Analyzing planet_osm_roads finished
Stopped table: planet_osm_rels in 389s
Copying planet_osm_roads to cluster by geometry finished
Copying planet_osm_point to cluster by geometry finished
Creating indexes on planet_osm_roads finished
All indexes on planet_osm_roads created in 4041s
Completed planet_osm_roads
Creating indexes on planet_osm_point finished
All indexes on planet_osm_point created in 7745s
Completed planet_osm_point
Copying planet_osm_line to cluster by geometry finished
Creating indexes on planet_osm_line finished
All indexes on planet_osm_line created in 22119s
Completed planet_osm_line
Copying planet_osm_polygon to cluster by geometry finished
Creating indexes on planet_osm_polygon finished
All indexes on planet_osm_polygon created in 33888s
Completed planet_osm_polygon
Stopped table: planet_osm_ways in 61265s
Osm2pgsql took 152995s overall (45.5 hours)
NOTES:
- I had to use osm2pgsql 0.81, as the current git version (0.85) crashed while processing relations (twice).
- I also upgraded libgeos from 3.2.2 to 3.3.9 while attempting to fix the above crash. It didn't fix it, but I stuck with the new version.
- The Parallels image file grew to 414GB; my '410GB' expanding drive was 98% full at the end. I would recommend making your drive as big as possible.
- My Mac actually ran out of space about 30 hours in (the VM drive thought it had more space, because Parallels allows you to make expanding drives as big as you like). Thankfully the VM paused while I freed up more space, and the import process did not crash.
Desktop Core i7 3770 + 16GB RAM + SSD
See : Osm2pgsql/benchmarks/i7-3770-SSD
Europe import on Windows Server 2012 HDD
The whole process took only ~15 days :-)
- Date of import: 21.01.2017 (Europe pbf file downloaded a few days earlier from the main page of http://download.geofabrik.de/)
- CPU: Xeon E3-1246 v3 3.5 GHz (8 cores)
- Memory: 32 GB
- Hard drive: double HDD (!) WDC WD2000FYYZ-01UL1B2 (WD 2TB 7200 RPM 64MB Cache SATA 6.0Gb/s)
- OS: Windows Server 2012 64-bit
- osm2pgsql : Newest version downloaded from https://ci.appveyor.com/api/buildjobs/iisabxffssilv7gl/artifacts/osm2pgsql_Release.zip
PS C:\OSM\rendering> osm2pgsql -U postgres -m -d osm -p osm -E 3857 -s -S C:\OSM\osm2pgsql\default.style C:\OSM\Data\europe-latest.osm.pbf
osm2pgsql version 0.92.0 (64 bit id space)
Using built-in tag processing pipeline
Using projection SRS 3857 (Spherical Mercator)
Setting up table: osm_point
Setting up table: osm_line
Setting up table: osm_polygon
Setting up table: osm_roads
Allocating memory for sparse node cache
Node-cache: cache=800MB, maxblocks=12800*65536, allocation method=9
Mid: pgsql, scale=100 cache=800
Setting up table: osm_nodes
Setting up table: osm_ways
Setting up table: osm_rels
Reading in file: C:\OSM\Data\europe-latest.osm.pbf
Using PBF parser.
Processing: Node(1827652k 131.7k/s) Way(223763k 0.78k/s) Relation(3148140 15.84/s)
Standard exception processing relation id=6540291: TopologyException: side location conflict at -85245.029999999999 6666499.7699999996
Processing: Node(1827652k 131.7k/s) Way(223763k 0.78k/s) Relation(3254340 15.98/s)
Standard exception processing relation id=6752256: TopologyException: side location conflict at 538047.93000000005 5671707.1900000004
Processing: Node(1827652k 131.7k/s) Way(223763k 0.78k/s) Relation(3316870 16.07/s)  parse time: 507324s [5d20h55m]
Node stats: total(1827652739), max(4604487172) in 13878s [3h51m]
Way stats: total(223763746), max(465494091) in 287105s [3d07h45m]
Relation stats: total(3316871), max(6875737) in 206341s [2d09h19m]
Committing transaction for osm_point
Committing transaction for osm_line
Committing transaction for osm_polygon
Committing transaction for osm_roads
Setting up table: osm_nodes
Setting up table: osm_ways
Setting up table: osm_rels
Using built-in tag processing pipeline
Going over pending ways...
165944299 ways are pending
Using 8 helper-processes
Finished processing 165944299 ways in 200997 s [2d07h49m]
165944299 Pending ways took 200997s at a rate of 825.61/s
Committing transaction for osm_point
Committing transaction for osm_line
Committing transaction for osm_polygon
Committing transaction for osm_roads
Going over pending relations...
0 relations are pending
Using 8 helper-processes
Finished processing 0 relations in 0 s
Committing transaction for osm_point
WARNING: there is no transaction in progress
Committing transaction for osm_line
WARNING: there is no transaction in progress
Committing transaction for osm_polygon
WARNING: there is no transaction in progress
Committing transaction for osm_roads
WARNING: there is no transaction in progress
Sorting data and creating indexes for osm_point
Sorting data and creating indexes for osm_roads
Sorting data and creating indexes for osm_polygon
Sorting data and creating indexes for osm_line
Stopping table: osm_nodes
Stopped table: osm_nodes in 0s
Stopping table: osm_ways
Building index on table: osm_ways
Stopping table: osm_rels
Building index on table: osm_rels
Copying osm_roads to cluster by geometry finished
Creating geometry index on osm_roads
Creating osm_id index on osm_roads
Creating indexes on osm_roads finished
All indexes on osm_roads created in 13205s [3h40m]
Completed osm_roads
Copying osm_point to cluster by geometry finished
Creating geometry index on osm_point
Creating osm_id index on osm_point
Creating indexes on osm_point finished
All indexes on osm_point created in 30703s [8h31m]
Completed osm_point
Stopped table: osm_rels in 51844s
Copying osm_line to cluster by geometry finished
Creating geometry index on osm_line
Creating osm_id index on osm_line
Creating indexes on osm_line finished
All indexes on osm_line created in 103561s [1d04h46m]
Completed osm_line
Copying osm_polygon to cluster by geometry finished
Creating geometry index on osm_polygon
Creating osm_id index on osm_polygon
Creating indexes on osm_polygon finished
All indexes on osm_polygon created in 324884s [3d18h14m]
Completed osm_polygon
Stopped table: osm_ways in 562079s [6d12h57m]
node cache: stored: 52428801(2.87%), storage efficiency: 50.00% (dense blocks: 0, sparse nodes: 52428801), hit rate: -6.78%
Osm2pgsql took 1270744s overall [14d18h59m]
Planet import on Ubuntu 17.10 NVMe
- all work done in November 2017
- Hardware
- Hetzner PX61-NVMe
- Intel® Xeon® E3-1275 v5 Quad-Core
- 2 x 512 GB NVMe Gen3 x4 SSD
- 1x 960 GB 6 Gb/s SSD Datacenter Edition
- 64 GB ECC DDR4
- Software
- Ubuntu 17.10
- PostgreSQL 10.1 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 6.3.0-12ubuntu2) 6.3.0 20170406, 64-bit
- POSTGIS="2.4.1 r16012" PGSQL="100" GEOS="3.5.1-CAPI-1.9.1 r4246" PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 2.2.1, released 2017/06/23" LIBXML="2.9.4" LIBJSON="0.12.1" LIBPROTOBUF="1.2.1" RASTER
- osm2pgsql version 0.92.1 (64 bit id space)
- German OSM-Carto-Style with hstore-only schema
- PostgreSQL configuration
- /etc/postgresql/10/main/postgresql.conf
shared_buffers = 14GB
work_mem = 1GB
maintenance_work_mem = 8GB
effective_io_concurrency = 500
max_worker_processes = 8
max_parallel_workers_per_gather = 2
max_parallel_workers = 8
checkpoint_timeout = 1h
max_wal_size = 5GB
min_wal_size = 1GB
checkpoint_completion_target = 0.9
random_page_cost = 1.1
min_parallel_table_scan_size = 8MB
min_parallel_index_scan_size = 512kB
effective_cache_size = 22GB
- OSM2PGSQL
osm2pgsql -c -E 3857 -p planet_osm_hstore -d gisdb -H /var/run/postgresql/ -C 28000 -G --hstore --style /projects/openstreetmap-carto-de/hstore-only.style --tag-transform-script /projects/openstreetmap-carto-de/openstreetmap-carto.lua --number-processes 8 --tablespace-main-data ssd2 --tablespace-main-index ssd1 --tablespace-slim-data ssd2 --tablespace-slim-index ssd1 --flat-nodes /projects/nodecache/node.cache --slim /projects/downloads/planet-latest.osm.pbf
Using projection SRS 3857 (Spherical Mercator)
Setting up table: planet_osm_hstore_point
Setting up table: planet_osm_hstore_line
Setting up table: planet_osm_hstore_polygon
Setting up table: planet_osm_hstore_roads
Allocating memory for dense node cache
Allocating dense node cache in one big chunk
Allocating memory for sparse node cache
Sharing dense sparse
Node-cache: cache=28000MB, maxblocks=448000*65536, allocation method=11
Mid: pgsql, scale=100 cache=28000
Setting up table: planet_osm_hstore_nodes
Setting up table: planet_osm_hstore_ways
Setting up table: planet_osm_hstore_rels
Reading in file: planet-latest.osm.pbf
Using PBF parser.
Processing: Node(4181183k 1034.2k/s) Way(451967k 26.84k/s) Relation(5368700 584.13/s)  parse time: 30073s
Node stats: total(4181183058), max(5225692843) in 4043s
Way stats: total(451967894), max(540101215) in 16839s
Relation stats: total(5368704), max(7729119) in 9191s
Finished processing 304423213 ways in 8016 s
304423213 Pending ways took 8016s at a rate of 37976.95/s
Stopped table: planet_osm_hstore_nodes in 0s
Stopped table: planet_osm_hstore_rels in 649s
All indexes on planet_osm_hstore_roads created in 1810s
All indexes on planet_osm_hstore_point created in 7406s
All indexes on planet_osm_hstore_line created in 10446s
All indexes on planet_osm_hstore_polygon created in 16379s
Stopped table: planet_osm_hstore_ways in 31478s
node cache: stored: 3255953602(77.87%), storage efficiency: 88.72% (dense blocks: 416059, sparse nodes: 130831138), hit rate: 76.37%
Osm2pgsql took 69595s overall
- resulting tables
\d+
 Schema | Name                      | Type  | Owner    | Size       | Description
--------+---------------------------+-------+----------+------------+-------------
 public | geography_columns         | view  | postgres | 0 bytes    |
 public | geometry_columns          | view  | postgres | 0 bytes    |
 public | planet_osm_hstore_line    | table | gisuser  | 61 GB      |
 public | planet_osm_hstore_nodes   | table | gisuser  | 8192 bytes |
 public | planet_osm_hstore_point   | table | gisuser  | 14 GB      |
 public | planet_osm_hstore_polygon | table | gisuser  | 91 GB      |
 public | planet_osm_hstore_rels    | table | gisuser  | 2655 MB    |
 public | planet_osm_hstore_roads   | table | gisuser  | 10196 MB   |
 public | planet_osm_hstore_ways    | table | gisuser  | 106 GB     |
 public | spatial_ref_sys           | table | postgres | 4440 kB    |
\di+
 Schema | Name                            | Type  | Owner    | Table                     | Size       | Description
--------+---------------------------------+-------+----------+---------------------------+------------+-------------
 public | idx_country_osm_grid_geometry   | index | postgres | country_osm_grid          | 1200 kB    |
 public | planet_osm_hstore_line_index    | index | gisuser  | planet_osm_hstore_line    | 15 GB      |
 public | planet_osm_hstore_line_pkey     | index | gisuser  | planet_osm_hstore_line    | 3224 MB    |
 public | planet_osm_hstore_nodes_pkey    | index | gisuser  | planet_osm_hstore_nodes   | 8192 bytes |
 public | planet_osm_hstore_point_index   | index | gisuser  | planet_osm_hstore_point   | 5353 MB    |
 public | planet_osm_hstore_point_pkey    | index | gisuser  | planet_osm_hstore_point   | 2139 MB    |
 public | planet_osm_hstore_polygon_index | index | gisuser  | planet_osm_hstore_polygon | 31 GB      |
 public | planet_osm_hstore_polygon_pkey  | index | gisuser  | planet_osm_hstore_polygon | 6596 MB    |
 public | planet_osm_hstore_rels_parts    | index | gisuser  | planet_osm_hstore_rels    | 2024 MB    |
 public | planet_osm_hstore_rels_pkey     | index | gisuser  | planet_osm_hstore_rels    | 116 MB     |
 public | planet_osm_hstore_roads_index   | index | gisuser  | planet_osm_hstore_roads   | 1252 MB    |
 public | planet_osm_hstore_roads_pkey    | index | gisuser  | planet_osm_hstore_roads   | 333 MB     |
 public | planet_osm_hstore_ways_nodes    | index | gisuser  | planet_osm_hstore_ways    | 207 GB     |
 public | planet_osm_hstore_ways_pkey     | index | gisuser  | planet_osm_hstore_ways    | 9722 MB    |
 public | spatial_ref_sys_pkey            | index | postgres | spatial_ref_sys           | 200 kB     |
- Conclusions
- 2 x 512GB was enough for a full planet import using --slim and unlogged tables; import time was about 12 hours for the full planet
- for an updatable database this meant too much tablespace shuffling
- the additional 1TB SSD finally made continuous updates workable
- for updates, osmosis + osm2pgsql is used
- catching up 6 days took about 6 hours; minutely updates take 2-20 seconds (depending on the count and size of the changesets)
- the PostgreSQL settings are working well, but they are bound to the hardware in use (DON'T copy & paste blindly!)
- unsure whether the parallel settings had a positive effect
- the flat nodes file is best kept on the fastest disk
- if the import is interrupted, not all is lost: if it happens during index generation, you can create the indexes manually (see the sketch after this list)
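A minimal recovery sketch for the last point: recreate the geometry indexes osm2pgsql would have built, using the planet_osm_hstore prefix and gisdb database from this import (adjust table and index names to your schema):

# Rebuild the GiST geometry indexes by hand after an interrupted import.
sudo -u postgres psql -d gisdb <<'EOF'
CREATE INDEX planet_osm_hstore_polygon_index
    ON planet_osm_hstore_polygon USING GIST (way);
CREATE INDEX planet_osm_hstore_line_index
    ON planet_osm_hstore_line USING GIST (way);
EOF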
Debian 8.11 planet import
- hardware
- 8 cpus
- Xeon E31270@3.4Ghz
- 16GB RAM
- 465GB WD HD (WD5003ABYX-0)
- software
- Debian 8.11
- postgres 10.6
- osm2pgsql version 0.96.0 (64 bit id space)
- planet as of 29.1.19:
- 5G nodes
- 559M ways
- 6.5M relations
and /etc/postgresql/10/main/postgresql.conf has the following (amongst others):
shared_buffers = 4GB
work_mem = 512MB
maintenance_work_mem = 7GB
#effective_io_concurrency = 1
max_worker_processes = 8
#max_parallel_workers_per_gather = 2
max_parallel_workers = 8
checkpoint_timeout = 10min
#max_wal_size = 1GB
#min_wal_size = 80GB
checkpoint_completion_target = 0.9
#random_page_cost = 4.0
#min_parallel_table_scan_size = 8MB
#min_parallel_index_scan_size = 512kB
effective_cache_size = 10GB
running an import with
sudo nice -n -20 sudo -u nominatim ./utils/setup.php --osm-file /home/jeremyr/osm/planet-latest.osm.pbf --all --osm2pgsql-cache 8192 --threads 8 2>&1 | tee setup.log
which winds up doing
/srv/nominatim/Nominatim-3.2.0/build/osm2pgsql/osm2pgsql --flat-nodes /srv/nominatim/Nominatim-3.2.0/build/flatnode.file -lsc -O gazetteer --hstore --number-processes 1 -C 8192 -P 5432 -d nominatim /home/jeremyr/osm/planet-latest.osm.pbf
I see ~2000k nodes/s and 0.4k ways/s; the estimated time until the ways are finished is 2 weeks. If relation processing is ~1/10 the speed of way processing, relations should take a few hours; and if closing the database takes about as long as building it, that is another 2 weeks.
I imagine that some of the import is nonlinear in the number of nodes/ways/relations, but I am not versed in the details. If that is the case, the size of the database is even more important to report, since kways/s on given hardware will not be constant across databases of different sizes.
Debian 8.8 planet import
- hardware
- 32 cpus
- Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
- 258GB RAM
- 17TB Avago HD (MR9361-8i)
- software
- Debian 8.8
- postgres 10.6
- osm2pgsql version 0.96.0 (64 bit id space)
- libosmium 2.14.2
- planet as of 29.1.19:
- 5G nodes
- 559M ways
- 6.5M relations
and /etc/postgresql/10/main/postgresql.conf has the following (amongst others):
shared_buffers = 16GB
maintenance_work_mem = 16GB
work_mem = 10GB
effective_cache_size = 64GB
synchronous_commit = off
checkpoint_timeout = 1h
checkpoint_completion_target = 0.9
Import took less than half a day:
Node stats: total(4963309823), max(6188986148) in 26971s
Way stats: total(554719341), max(661058995) in 8177s
Relation stats: total(6490364), max(9192176) in 1437s