Osm2pgsql/benchmarks

From OpenStreetMap Wiki
Jump to navigation Jump to search

Background

Benchmarks of osm2pgsql are important metrics for users to reference because importing OSM data is highly dependent on machine hardware and software configurations. Importing a complete planet file can take days even on a typical higher end desktop machine. Importing an extract(subset) of the planet file can take considerably less time to import and should be used if possible for your import instead of the planet file.

This page contains benchmarks for both legacy and modern hardware and software. Carefully review what is an appropriate comparison for your configuration!

What affects import time?

Partial list of variables that affect the time it takes to import.

  • Hardware
    • Hard disk throughput ( HDD, SSD )
    • RAM size
    • CPU
  • Operating system
    • OS used and 32 bit or 64 bit version
    • linux disk scheduler ( CFG, noop, deadline, ... )
    • linux file system ( ext4/LVM , xfs/raid, ...)
  • Database
    • Version
      • PostgreSQL 8.2 and PostGIS 1.5
      • PostgreSQL 9.1 and PostGIS 2.0
      • PostgreSQL 9.2 and PostGIS 2.1
      • ...
    • PostgreSQL config settings ( postgresql.conf )
      • shared_buffers
      • work_mem
      • maintanance_work_mem
      • synchronous_commits
      • fsync
      • autovacuum
      • checkpoint_segments
      • ...
  • Osm2pgsql application
    • osm2pgsql version ( 32bit, 64bit )
    • parameters
      • "Slim Mode" or not
      • Size of the node cache (-C command line argument)
      • Flat node storage
      • Hstore / style
      • Parallelisation ( --number-processes )
  • Use case
    • size of the import file : Full planet or Extracts
    • input reader and input file format
      • the experimental 'primitive' XML parser reads OSM XML input about 30% faster than the default 'libxml2' reader (but seams it currently as a bug in supporting some xml entities such as ' )
      • the 'pbf' input reader reads OSM PBF files about twice as fast as the 'libxml2' parser parses an equivalent OSM XML file
    • other customisation ( special index )

more info - see Frederik Ramm: Optimising the Mapnik/osm2pgsql Rendering Toolchain 2.0 @ SOTM 2012


Benchmarks

List of benchmarks contributed by users. Currently simply using time command to return length of time' it takes osm2pgsql task to complete. If you do not have time available please provide some other meaningful metric. Better organization and formatting standard for this section is needed.

AWS EC2 x1e.4xlarge, 16 vCPU, 47 ECU, 488GiB of ram, 500GB GP SSD storage

  • Ubuntu Server 18.04 LTS
  • db and load performed on the same machine
  • peak ram usage: 312GB
  • total import time: 1h46min
  • final storage usage: 128GB

PosgreSQL config differences from default install:

  • shared_buffers = 2048MB
  • checkpoint_timeout = 30min
$ time osm2pgsql -C 480000 -O gazetteer planet-190916.osm.pbf

(...)

Osm2pgsql took 6379s overall

real	106m28.153s
user	130m8.427s
sys	16m42.562s

Desktop Debian 9, 4 cores i5-6500 CPU @ 3.20GHz/32GB RAM, 1TB+500GB SSD (hstore slim drop flat-nodes and ZFS filesystem)

Summary

  • Full planet import using osmtilemaker (planet-200107.osm.pbf 49GB).
  • Date of import: 2020/01/07
  • Full import lasted 51,8 hours
  • Disk usage
    • planet.osm.pbf file: 49 GB
    • PostgreSQL/PostGIS OSM database (ZFS filesystem): peak usage 460 GB, end size: 185 GB.
    • nodes.cache: 53 GB

Hardware

  • Processor model: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
  • 4 cores (1 thread per core)
  • 32GB RAM/32GB swap
  • 500GB SSD disk (to host planet-200107.osm.pbf and nodes.cache)
  • 1TB SSD disk with ZFS filesystem with compression=lz4 and recordsize=8k for database

Operating system

  • Debian 9 Stretch
  • Docker-ce 19.03.5

Database

  • PostgreSQL 10 and PostGIS 2.5

shared_buffers = 512MB
work_mem = 64MB
maintenance_work_mem = 1024MB
wal_buffers = -1
checkpoint_completion_target = 0.9
random_page_cost = 2.0
cpu_tuple_cost = 0.05
autovacuum_analyze_scale_factor = 0.2
full_page_writes = off

Osm2pgsql

  • osm2pgsql version 1.2.0
  • osm2pgsql command options:

--hstore --style openstreetmap-carto.style --tag-transform-script openstreetmap-carto.lua --slim --drop --flat-nodes nodes.cache --cache 25000 --number-processes 4 --multi-geometry

Osm2pgsql logs extract

osm2pgsql version 1.2.0 (64 bit id space)
Node-cache: cache=25000MB, maxblocks=400000*65536, allocation method=11
Processing: Node(5682827k 1919.9k/s) Way(630406k 7.36k/s) Relation(7379140 176.55/s) parse time: 130354s
Node stats: total(5682827154), max(7094416299) in 2960s
Way stats: total(630406201), max(759474335) in 85597s
Relation stats: total(7379167), max(10509928) in 41796s
All indexes on planet_osm_roads created in 13768s
All indexes on planet_osm_point created in 15274s
All indexes on planet_osm_line created in 28082s
All indexes on planet_osm_polygon created in 55963s
Osm2pgsql took 186318s overall
Mid: removing persistent node cache at /docker_mounted_volumes/osm_nodes/nodes.cache
node cache: stored: 2988125793(52.58%), storage efficiency: 91.19% (dense blocks: 368683, sparse nodes: 128277398), hit rate: 51.38%

Azure VM (CentOS 7.8, 16 vcpus, 64GB RAM, 2TB + 128GB + 128GB Premium SSD)

Summary

  • Full planet import based on https://github.com/Overv/openstreetmap-tile-server (planet-200831.osm.pbf 53GB).
  • Date of import: 2020/10
  • Full import 6 days 9 hours
  • Disk usage
    • planet.osm.pbf file: 53 GB
    • PostgreSQL/PostGIS OSM database: peak usage 1.1 TB, end size: 880 GB.
    • nodes.cache: 59 GB

Hardware

  • 16 cores
  • 64GB RAM
  • 128GB SSD disk to host planet-200831.osm.pbf file
  • 128GB SSD disk to host flat-nodes file
  • 2TB SSD disk for PostgreSQL database

Operating system

  • Centos 7.8.2003
  • Docker 19.03.13

Database

  • PostgreSQL 10 and PostGIS 2.4

shared_buffers = 2GB
min_wal_size = 1GB
max_wal_size = 2GB
maintenance_work_mem = 10GB
max_connections = 250
temp_buffers = 32MB
work_mem = 256MB
wal_buffers = 1024kB
wal_writer_delay = 500ms
commit_delay = 10000
max_wal_size = 2880MB
random_page_cost = 1.1
track_activity_query_size = 16384
autovacuum_vacuum_scale_factor = 0.05
autovacuum_analyze_scale_factor = 0.02

Osm2pgsql

  • osm2pgsql version 1.2.0
  • osm2pgsql command options:

--create --slim -G --hstore --tag-transform-script /home/renderer/src/openstreetmap-carto/openstreetmap-carto.lua -S /home/renderer/src/openstreetmap-carto/openstreetmap-carto.style /data.osm.pbf --number-processes 14 -C 32000 --flat-nodes /nodes/flat_nodes.bin osm2pgsql version 1.2.0

Osm2pgsql logs extract

sudo -u renderer osm2pgsql -d gis --create --slim -G --hstore --tag-transform-script /home/renderer/src/openstreetmap-carto/openstreetmap-carto.lua -S /home/renderer/src/openstreetmap-carto/openstreetmap-carto.style /data.osm.pbf --number-processes 14 -C 32000 --flat-nodes /nodes/flat_nodes.bin osm2pgsql version 1.2.0

Processing: Node(6282141k 1619.5k/s) Way(693811k 3.89k/s) Relation(8092025 54.9/s)  parse time: 329824s
Node stats: total(6282141789), max(7859748112) in 3879s
Way stats: total(693811435), max(842504393) in 178503s
Relation stats: total(8092025), max(11562182) in 147442s

Stopped table: planet_osm_nodes in 0s
Stopped table: planet_osm_rels in 674s
All indexes on planet_osm_roads created in 4405s
All indexes on planet_osm_point created in 6702s
All indexes on planet_osm_line created in 20419s
All indexes on planet_osm_polygon created in 41617s
Stopped table: planet_osm_ways in 219360s

Osm2pgsql took 549186s overall
node cache: stored: 3803116413(60.54%), storage efficiency: 90.67% (dense blocks: 473077, sparse nodes: 159429692), hit rate: 59.15%

Europe Import (Ubuntu 20.04, AMD EPYC 7502P (32 Cores), 128 GB RAM, 960 GB NVMe SSD)

Summary

  • Europe import based on https://download.geofabrik.de/europe.html (europe-latest.osm.pbf 22.5GB).
  • Date of import: 2020/12
  • Full import 5.8 hours
  • Disk usage
    • europe-latest.osm.pbf file: 22.5 GB
    • PostgreSQL/PostGIS OSM database: end size: 158 GB.
    • nodes.cache: 80 GB

Hardware

  • 32 cores
  • 128 GB RAM
  • 960 GB NMVe SSD disk

Operating system

  • Ubuntu Server 20.04
  • Docker 20.10.0

Database

  • PostgreSQL 12.5 and PostGIS 3.0

Non-Default settings:

shared_buffers = 20000 MB
maintenance_work_mem = 1000 MB
work_mem = 5000 MB
max_worker_processes = 32
max_parallel_workers = 32

Osm2pgsql

  • osm2pgsql version 1.2.1
  • osm2pgsql command options:

osm2pgsql -d gis -U xxx --create -G --hstore -W -l --slim --drop --style ~/src/openstreetmap-carto/openstreetmap-carto.style --tag-transform-script ~/src/openstreetmap-carto/openstreetmap-carto.lua -C 80000 --number-processes 32 ~/europe-latest.osm.pbf

Osm2pgsql log

Allocating memory for dense node cache
Allocating dense node cache in one big chunk
Allocating memory for sparse node cache
Sharing dense sparse
Node-cache: cache=80000MB, maxblocks=1280000*65536, allocation method=11
Mid: pgsql, cache=80000
Setting up table: planet_osm_nodes
Setting up table: planet_osm_ways
Setting up table: planet_osm_rels
Using lua based tag processing pipeline with script /home/renderaccount/src/openstreetmap-carto/openstreetmap-carto.lua
Using projection SRS 4326 (Latlong)
Setting up table: planet_osm_point
Setting up table: planet_osm_line
Setting up table: planet_osm_polygon
Setting up table: planet_osm_roads

Reading in file: /home/renderaccount/europe-latest.osm.pbf
Using PBF parser.
Processing: Node(2660149k 833.1k/s) Way(318682k 35.89k/s) Relation(5358780 1927.62/s)  parse time: 14852s
Node stats: total(2660149455), max(8223095445) in 3193s
Way stats: total(318682136), max(884214400) in 8879s
Relation stats: total(5360596), max(12010480) in 2780s
Stopping table: planet_osm_nodes
Stopped table: planet_osm_nodes in 3s
Stopping table: planet_osm_ways
Stopped table: planet_osm_ways in 1s
Stopping table: planet_osm_rels
Stopped table: planet_osm_rels in 0s
Sorting data and creating indexes for planet_osm_point
Sorting data and creating indexes for planet_osm_roads
Sorting data and creating indexes for planet_osm_polygon
Sorting data and creating indexes for planet_osm_line
Copying planet_osm_roads to cluster by geometry finished
Creating geometry index on planet_osm_roads
Creating indexes on planet_osm_roads finished
All indexes on planet_osm_roads created in 410s
Completed planet_osm_roads
Copying planet_osm_point to cluster by geometry finished
Creating geometry index on planet_osm_point
Copying planet_osm_line to cluster by geometry finished
Creating geometry index on planet_osm_line
Copying planet_osm_polygon to cluster by geometry finished
Creating geometry index on planet_osm_polygon
Creating indexes on planet_osm_point finished
All indexes on planet_osm_point created in 1972s
Completed planet_osm_point
Creating indexes on planet_osm_line finished
All indexes on planet_osm_line created in 2653s
Completed planet_osm_line
Creating indexes on planet_osm_polygon finished
All indexes on planet_osm_polygon created in 6022s
Completed planet_osm_polygon

Osm2pgsql took 20879s overall
node cache: stored: 2660149455(100.00%), storage efficiency: 57.81% (dense blocks: 244710, sparse nodes: 1298474118), hit rate: 100.00%


Planet, 32GB + 72GB swap, 4 cores, 2x SSD RAID 0, Ubuntu 20.04 ~25h

Xeon(R) CPU E3-1245 V2 @ 3.40GHz
RAM: 32GB
RAID-0 2x SSD 480GB INTEL SSDSC2BB48 
72GB SWAP

osm2pgsql version 1.2.1 (64 bit id space)
Postgresql 13

Time: 88528s (<25h)

Postgres config:

wal_level=minimal
hot_standby=off
max_wal_senders=0
checkpoint_timeout=1d
checkpoint_completion_target=0.90
max_parallel_workers_per_gather=3
max_wal_size = 10GB
min_wal_size = 1GB
shared_buffers = 8GB
dynamic_shared_memory_type = posix 

osm2pgsql -d XXX -U XXX  --create --slim --flat-nodes ./nodes --multi-geometry  -G --hstore  \
       --tag-transform-script ~/src/openstreetmap-carto/openstreetmap-carto.lua \
       -C 72000 --number-processes 2 \
       -S ~/src/openstreetmap-carto/openstreetmap-carto.style ./planet-latest.osm.pbf

....
Reading in file: ./planet-latest.osm.pbf
Processing: Node(7086449k 1608.7k/s) Way(787322k 16.38k/s) Relation(9109670 473.55/s)  parse time: 71718s
Node stats: total(7086449253), max(8931477723) in 4405s
Way stats: total(787322904), max(965461232) in 48076s
Relation stats: total(9109683), max(12991924) in 19237s
...
All indexes on planet_osm_point created in 5429s
...
All indexes on planet_osm_roads created in 754s
...
Stopped table: planet_osm_rels in 200s
Osm2pgsql took 88528s overall
node cache: stored: 7086449253(100.00%), storage efficiency: 83.69% (dense blocks: 972608, sparse nodes: 249971189), hit rate: 100.00%


Planet Latest (13/3/22), Dell R730 288GB DDR4 2133 + 256GB swap, 2x Xeon 2699v4 22C/44T, 4x Samsung 870 SSD RAID 0, Centos Stream 9 22h1m56s

Software

  • osm2pgsql 1.5.2
  • Posgresql 14.1
  • PostGIS 3.2

Command

nohup time osm2pgsql -d gis --create --slim -G --hstore --tag-transform-script ~/git/openstreetmap-carto/openstreetmap-carto.lua -C 250000 --number-processes 32 -S ~/git/openstreetmap-carto/openstreetmap-carto.style planet-latest.osm.bz2 > osm2pgsql.out 2> osm2pgsql.err < /dev/null &

Output

2022-03-13 15:12:01  osm2pgsql version 1.5.2 (1.5.2)
2022-03-13 15:12:01  Database version: 14.1
2022-03-13 15:12:01  PostGIS version: 3.2
2022-03-13 15:12:01  Setting up table 'planet_osm_point'
2022-03-13 15:12:02  Setting up table 'planet_osm_line'
2022-03-13 15:12:02  Setting up table 'planet_osm_polygon'
2022-03-13 15:12:02  Setting up table 'planet_osm_roads'
2022-03-14 05:37:12  Reading input files done in 51910s (14h 25m 10s).
2022-03-14 05:37:12    Processed 7538107733 nodes in 23694s (6h 34m 54s) - 318k/s
2022-03-14 05:37:12    Processed 840641373 ways in 23276s (6h 27m 56s) - 36k/s
2022-03-14 05:37:12    Processed 9712543 relations in 4940s (1h 22m 20s) - 2k/s
2022-03-14 05:37:20  Clustering table 'planet_osm_line' by geometry...
2022-03-14 05:37:20  Clustering table 'planet_osm_point' by geometry...
2022-03-14 05:37:20  Clustering table 'planet_osm_roads' by geometry...
2022-03-14 05:37:20  Clustering table 'planet_osm_polygon' by geometry...
2022-03-14 05:37:23  Done postprocessing on table 'planet_osm_nodes' in 0s
2022-03-14 05:37:23  Building index on table 'planet_osm_ways'
2022-03-14 05:37:23  Building index on table 'planet_osm_rels'
2022-03-14 05:47:34  Creating geometry index on table 'planet_osm_roads'...
2022-03-14 05:49:51  Creating geometry index on table 'planet_osm_point'...
2022-03-14 05:52:05  Creating osm_id index on table 'planet_osm_roads'...
2022-03-14 05:52:11  Analyzing table 'planet_osm_roads'...
2022-03-14 06:10:03  Creating geometry index on table 'planet_osm_line'...
2022-03-14 06:30:15  Creating geometry index on table 'planet_osm_polygon'...
2022-03-14 06:34:07  Creating osm_id index on table 'planet_osm_point'...
2022-03-14 06:35:07  Analyzing table 'planet_osm_point'...
2022-03-14 07:30:00  Creating osm_id index on table 'planet_osm_line'...
2022-03-14 07:31:54  Analyzing table 'planet_osm_line'...
2022-03-14 12:24:01  Creating osm_id index on table 'planet_osm_polygon'...
2022-03-14 12:27:57  Analyzing table 'planet_osm_polygon'...
2022-03-14 13:13:57  Done postprocessing on table 'planet_osm_ways' in 27394s (7h 36m 34s)
2022-03-14 13:13:57  Done postprocessing on table 'planet_osm_rels' in 327s (5m 27s)
2022-03-14 13:13:57  All postprocessing on table 'planet_osm_point' done in 3467s (57m 47s).
2022-03-14 13:13:57  All postprocessing on table 'planet_osm_line' done in 6874s (1h 54m 34s).
2022-03-14 13:13:57  All postprocessing on table 'planet_osm_polygon' done in 24637s (6h 50m 37s).
2022-03-14 13:13:57  All postprocessing on table 'planet_osm_roads' done in 891s (14m 51s).
2022-03-14 13:13:57  osm2pgsql took 79316s (22h 1m 56s) overall.


Planet import on Windows 10 (64 GB RAM, 24 core CPU, 1 TB NVMe SSD) ~18h

System:

  • OS: Windows 10 21H2
  • RAM: Corsair Vengeance LPX 64 GB (4 x 16 GB @ 2133 MHz)
  • CPU: AMD Ryzen 9 3900X (24 cores @ 3.8 GHz)
  • SSD: Samsung 970 Evo 1 TB (800 GB free space available)

Notes:

Command:

C:\osm\bin\osm2pgsql\osm2pgsql.exe --verbose --disable-parallel-indexing --slim --drop --flat-nodes=C:\osm\logs\flat-nodes.bin --multi-geometry --hstore --database=gis --user=postgres --password --tag-transform-script=C:\osm\scripts\openstreetmap-carto\openstreetmap-carto.lua --style=C:\osm\scripts\openstreetmap-carto\openstreetmap-carto.style C:\osm\pbf\planet.osm.pbf

Output:

2022-06-21 22:52:57  osm2pgsql version 1.6.0
2022-06-21 22:53:02  [0] Database version: 14.4
2022-06-21 22:53:02  [0] PostGIS version: 3.2
2022-06-21 22:53:02  [0] Reading file: C:\osm\pbf\planet.osm.pbf
2022-06-21 22:53:02  [0] Started pool with 1 threads.
2022-06-21 22:53:02  [0] Loading persistent node cache from 'C:\osm\logs\flat-nodes.bin'.
2022-06-21 22:53:02  [0] Mid: pgsql, cache=800
2022-06-21 22:53:02  [0] Setting up table 'planet_osm_nodes'
2022-06-21 22:53:02  [0] Setting up table 'planet_osm_ways'
2022-06-21 22:53:02  [0] Setting up table 'planet_osm_rels'
2022-06-21 22:53:02  [0] Using projection SRS 3857 (Spherical Mercator)
2022-06-21 22:53:02  [0] Using lua based tag transformations with script C:\osm\scripts\openstreetmap-carto\openstreetmap-carto.lua
2022-06-21 22:53:02  [0] Setting up table 'planet_osm_point'
2022-06-21 22:53:02  [0] Setting up table 'planet_osm_line'
2022-06-21 22:53:02  [0] Setting up table 'planet_osm_polygon'
2022-06-21 22:53:02  [0] Setting up table 'planet_osm_roads'
2022-06-22 08:17:37  [0] Reading input files done in 33875s (9h 24m 35s).
2022-06-22 08:17:37  [0]   Processed 7741381450 nodes in 2982s (49m 42s) - 2596k/s
2022-06-22 08:17:37  [0]   Processed 866111767 ways in 19739s (5h 28m 59s) - 44k/s
2022-06-22 08:17:37  [0]   Processed 9979763 relations in 11154s (3h 5m 54s) - 895/s
2022-06-22 08:17:42  [0] Removing persistent node cache at 'C:\osm\logs\flat-nodes.bin'.
2022-06-22 08:17:50  [0] Dropping table 'planet_osm_nodes'
2022-06-22 08:17:50  [0] Table 'planet_osm_nodes' dropped in 0s
2022-06-22 08:17:50  [0] Dropping table 'planet_osm_ways'
2022-06-22 08:18:04  [0] Table 'planet_osm_ways' dropped in 14s
2022-06-22 08:18:04  [0] Dropping table 'planet_osm_rels'
2022-06-22 08:18:06  [0] Table 'planet_osm_rels' dropped in 2s
2022-06-22 08:18:06  [0] Done postprocessing on table 'planet_osm_nodes' in 0s
2022-06-22 08:18:06  [0] Done postprocessing on table 'planet_osm_ways' in 0s
2022-06-22 08:18:06  [0] Done postprocessing on table 'planet_osm_rels' in 0s
2022-06-22 08:18:06  [1] Starting task...
2022-06-22 08:18:06  [1] Clustering table 'planet_osm_point' by geometry...
2022-06-22 08:18:06  [1] Using native order for clustering table 'planet_osm_point'
2022-06-22 08:29:33  [1] Creating geometry index on table 'planet_osm_point'...
2022-06-22 09:00:27  [1] Analyzing table 'planet_osm_point'...
2022-06-22 09:00:28  [1] Done task in 2542096ms.
2022-06-22 09:00:28  [1] Starting task...
2022-06-22 09:00:28  [0] All postprocessing on table 'planet_osm_point' done in 2542s (42m 22s).
2022-06-22 09:00:28  [1] Clustering table 'planet_osm_line' by geometry...
2022-06-22 09:00:28  [1] Using native order for clustering table 'planet_osm_line'
2022-06-22 09:52:27  [1] Creating geometry index on table 'planet_osm_line'...
2022-06-22 10:58:13  [1] Analyzing table 'planet_osm_line'...
2022-06-22 10:58:16  [1] Done task in 7067928ms.
2022-06-22 10:58:16  [1] Starting task...
2022-06-22 10:58:16  [0] All postprocessing on table 'planet_osm_line' done in 7067s (1h 57m 47s).
2022-06-22 10:58:16  [1] Clustering table 'planet_osm_polygon' by geometry...
2022-06-22 10:58:16  [1] Using native order for clustering table 'planet_osm_polygon'
2022-06-22 13:43:53  [1] Creating geometry index on table 'planet_osm_polygon'...
2022-06-22 16:46:19  [1] Analyzing table 'planet_osm_polygon'...
2022-06-22 16:46:25  [1] Done task in 20886325ms.
2022-06-22 16:46:25  [1] Starting task...
2022-06-22 16:46:25  [0] All postprocessing on table 'planet_osm_polygon' done in 20886s (5h 48m 6s).
2022-06-22 16:46:25  [1] Clustering table 'planet_osm_roads' by geometry...
2022-06-22 16:46:25  [1] Using native order for clustering table 'planet_osm_roads'
2022-06-22 16:52:33  [1] Creating geometry index on table 'planet_osm_roads'...
2022-06-22 16:55:31  [1] Analyzing table 'planet_osm_roads'...
2022-06-22 16:55:31  [1] Done task in 545890ms.
2022-06-22 16:55:31  [0] All postprocessing on table 'planet_osm_roads' done in 545s (9m 5s).
2022-06-22 16:55:31  [0] osm2pgsql took 64949s (18h 2m 29s) overall.

Planet import on Ubuntu Linux (128 GB RAM, 32 core CPU, 2 TB NVMe SSD, PG15) ~13.5h

System:

Hardware assembled at Crunchy Data Benchmark Lab by Greg Smith. Presentation of results 2021-11-18 at Virtual PostGIS Day 2021

Command:

Planet data as of 2021-09-02, 119G .bzip2 download. osm2pgsql and PostgreSQL 15 database run on the same server. Final size 989 GB, build disk usage peak ~1.4TB.

osm2pgsql -d gis -v --create --slim
--cache 40000 --number-processes 4
--flat-nodes /inland/15/nodes.db
--hstore --hstore-add-index
/var/lib/postgresql/osm/planet-latest.osm.bz2

PostgreSQL Tuning:

At current Planet size improvements from OSM cache increase peaked around 40GB of RAM, with only marginal benefit going to 50GB. With 128GB that leaves enough memory for aggressive database memory tuning as well.

shared_buffers = 32GB
maintenance_work_mem = 4GB
work_mem = 4GB
max_wal_size = 16GB
min_wal_size = 4GB
effective_cache_size = 80GB

# Below here are loading tweaks not recommended for production

checkpoint_timeout = 60min
synchronous_commit = off
fsync = off
autovacuum = off
wal_level = minimal
max_wal_senders = 0

Background writer tuning:

An aggressive background writer tuning moved a lot of the index building load from the CREATE INDEX to the BGW process, often over 150MB/s:

bgwriter_delay = 10ms
bgwriter_lru_maxpages = 500
bgwriter_lru_multiplier = 4.0

Note that alternately the background writer can be turned off altogether, as there is no foreground latency for it to optimize anyway, and on some systems the CPU time it uses isn't worth spending on its job:

bgwriter_lru_maxpages = 0
bgwriter_flush_after = 0

Unsuccessful improvement attempts:

Alternate system using newest Ryzen 9 7700X/64GB was much faster at some operations. But for the full Planet, 128GB of RAM is the only way I've found to make the Way times in particular competitive now.

There are so few disk bottlenecks left at PCIe 4.0, the 5GB/s read rated drive used is more than enough to keep up. A 4 disk RAID0 or SSDs on the newer P18 chipsets were only trivially faster.

Some runs set shared_buffers as high as 48GB with clear benefits. This 32GB value is a compromise to cut down on Out of Memory Killer action ruining the load. The two work_mem settings can be dropped to 2GB as a way to fit more buffer cache, but that change is barely above the run to run noise.

Output:

2022-10-27 01:42:51  osm2pgsql version 1.6.0
2022-10-27 01:42:51  [0] Database version: 15.0 (Ubuntu 15.0-1.pgdg22.04+1)
2022-10-27 01:42:51  [0] PostGIS version: 3.3
2022-10-27 01:42:51  [0] Reading file: /var/lib/postgresql/osm/planet-latest.osm.bz2
2022-10-27 01:42:51  [0] Started pool with 4 threads.
2022-10-27 01:42:51  [0] Loading persistent node cache from '/inland/15/nodes.db'.
2022-10-27 01:42:51  [0] Mid: pgsql, cache=40000
2022-10-27 01:42:51  [0] Setting up table 'planet_osm_nodes'
2022-10-27 01:42:51  [0] Setting up table 'planet_osm_ways'
2022-10-27 01:42:51  [0] Setting up table 'planet_osm_rels'
2022-10-27 01:42:51  [0] Using projection SRS 3857 (Spherical Mercator)
2022-10-27 01:42:51  [0] Using built-in tag transformations
2022-10-27 01:42:51  [0] Setting up table 'planet_osm_point'
2022-10-27 01:42:51  [0] Setting up table 'planet_osm_line'
2022-10-27 01:42:51  [0] Setting up table 'planet_osm_polygon'
2022-10-27 01:42:51  [0] Setting up table 'planet_osm_roads'
2022-10-27 08:00:44  [0] Reading input files done in 22673s (6h 17m 53s).
2022-10-27 08:00:44  [0]   Processed 7882127434 nodes in 11533s (3h 12m 13s) - 683k/s
2022-10-27 08:00:44  [0]   Processed 883613025 ways in 5395s (1h 29m 55s) - 164k/s
2022-10-27 08:00:44  [0]   Processed 10176596 relations in 5745s (1h 35m 45s) - 2k/s
2022-10-27 08:00:44  [0] Overall memory usage: peak=143600MByte current=140706MByte
2022-10-27 08:00:44  [1] Starting task...
2022-10-27 08:00:44  [4] Starting task...
2022-10-27 08:00:44  [2] Starting task...
2022-10-27 08:00:44  [3] Starting task...
2022-10-27 08:00:44  [1] Clustering table 'planet_osm_point' by geometry...
2022-10-27 08:00:44  [4] Clustering table 'planet_osm_line' by geometry...
2022-10-27 08:00:44  [3] Clustering table 'planet_osm_roads' by geometry...
2022-10-27 08:00:44  [2] Clustering table 'planet_osm_polygon' by geometry...
2022-10-27 08:00:45  [4] Using native order for clustering table 'planet_osm_line'
2022-10-27 08:00:45  [3] Using native order for clustering table 'planet_osm_roads'
2022-10-27 08:00:45  [2] Using native order for clustering table 'planet_osm_polygon'
2022-10-27 08:00:45  [1] Using native order for clustering table 'planet_osm_point'
2022-10-27 08:04:09  [3] Creating geometry index on table 'planet_osm_roads'...
2022-10-27 08:06:06  [3] Creating osm_id index on table 'planet_osm_roads'...
2022-10-27 08:06:27  [3] Creating hstore indexes on table 'planet_osm_roads'...
2022-10-27 08:08:38  [1] Creating geometry index on table 'planet_osm_point'...
2022-10-27 08:10:06  [3] Analyzing table 'planet_osm_roads'...
2022-10-27 08:10:25  [3] Done task in 580104ms.
2022-10-27 08:10:25  [3] Starting task...
2022-10-27 08:10:25  [3] Done task in 0ms.
2022-10-27 08:10:25  [3] Starting task...
2022-10-27 08:10:25  [0] Done postprocessing on table 'planet_osm_nodes' in 0s
2022-10-27 08:10:25  [3] Building index on table 'planet_osm_ways'
2022-10-27 08:14:47  [1] Creating osm_id index on table 'planet_osm_point'...
2022-10-27 08:15:40  [1] Creating hstore indexes on table 'planet_osm_point'...
2022-10-27 08:27:38  [4] Creating geometry index on table 'planet_osm_line'...
2022-10-27 08:35:38  [4] Creating osm_id index on table 'planet_osm_line'...
2022-10-27 08:36:07  [1] Analyzing table 'planet_osm_point'...
2022-10-27 08:36:15  [1] Done task in 2130567ms.
2022-10-27 08:36:15  [1] Starting task...
2022-10-27 08:36:15  [1] Building index on table 'planet_osm_rels'
2022-10-27 08:37:03  [4] Creating hstore indexes on table 'planet_osm_line'...
2022-10-27 08:38:29  [1] Done task in 134154ms.
2022-10-27 08:40:45  [2] Creating geometry index on table 'planet_osm_polygon'...
2022-10-27 08:47:23  [4] Analyzing table 'planet_osm_line'...
2022-10-27 08:47:31  [4] Done task in 2806718ms.
2022-10-27 08:57:48  [2] Creating osm_id index on table 'planet_osm_polygon'...
2022-10-27 09:00:23  [2] Creating hstore indexes on table 'planet_osm_polygon'...
2022-10-27 09:23:17  [2] Analyzing table 'planet_osm_polygon'...
2022-10-27 09:23:20  [2] Done task in 4955960ms.
2022-10-27 15:14:20  [3] Done task in 25435433ms.
2022-10-27 15:14:20  [0] Done postprocessing on table 'planet_osm_ways' in 25435s (7h 3m 55s)
2022-10-27 15:14:20  [0] Done postprocessing on table 'planet_osm_rels' in 134s (2m 14s)
2022-10-27 15:14:20  [0] All postprocessing on table 'planet_osm_point' done in 2130s (35m 30s).
2022-10-27 15:14:20  [0] All postprocessing on table 'planet_osm_line' done in 2806s (46m 46s).
2022-10-27 15:14:20  [0] All postprocessing on table 'planet_osm_polygon' done in 4955s (1h 22m 35s).
2022-10-27 15:14:20  [0] All postprocessing on table 'planet_osm_roads' done in 580s (9m 40s).
2022-10-27 15:14:20  [0] Overall memory usage: peak=143600MByte current=881MByte
2022-10-27 15:14:20  [0] osm2pgsql took 48689s (13h 31m 29s) overall.

Planet import on AWS EC2 m6gd.16xlarge

CPU: 64 ARM vCPU Memory: 256 GiB Storage: 2*1900GB NVMe SSD osm2pgsql version 1.7.1 Database version: 14.5 (Ubuntu 14.5-2.pgdg22.04+2) PostGIS version: 3.3

sudo -u tile osm2pgsql --slim --database=gis \
--cache=0 --flat-nodes=/store/database/nodes \
--number-processes=20 --multi-geometry --hstore \
--style=/srv/tile.openstreetmap.org/styles/default/openstreetmap-carto.style \
--tag-transform-script=/srv/tile.openstreetmap.org/styles/default/openstreetmap-carto.lua \
/store/tmp/planet-221031.osm.pbf
2022-11-10 14:55:14  Reading input files done in 32396s (8h 59m 56s).
2022-11-10 14:55:14    Processed 8006526349 nodes in 4680s (1h 18m 0s) - 1711k/s
2022-11-10 14:55:14    Processed 897621157 ways in 23333s (6h 28m 53s) - 38k/s
2022-11-10 14:55:14    Processed 10353417 relations in 4383s (1h 13m 3s) - 2k/s
2022-11-10 18:12:46  Done postprocessing on table 'planet_osm_ways' in 11848s (3h 17m 28s)
2022-11-10 18:12:46  Done postprocessing on table 'planet_osm_rels' in 245s (4m 5s)
2022-11-10 18:12:46  All postprocessing on table 'planet_osm_point' done in 3681s (1h 1m 21s).
2022-11-10 18:12:46  All postprocessing on table 'planet_osm_line' done in 6570s (1h 49m 30s).
2022-11-10 19:30:03  All postprocessing on table 'planet_osm_polygon' done in 16488s (4h 34m 48s).
2022-11-10 19:30:03  All postprocessing on table 'planet_osm_roads' done in 503s (8m 23s).
2022-11-10 19:30:03  osm2pgsql took 48886s (13h 34m 46s) overall.

Tuning:

shared_buffers = 8GB
maintenance_work_mem = 7144MB
work_mem = 128MB
max_wal_size = 2880MB
min_wal_size = 80MB
effective_cache_size = 16GB

Planet import on CloudFerro eo2a.4xlarge

  • Ubuntu 22.04
  • 128 GB RAM, 32 cores, ext4 on SSD disks

Full planet import (planet-221226.osm.pbf, ~66 GB): ~56h, 1,6 TB

Tuning: https://osm2pgsql.org/doc/manual.html#tuning-the-postgresql-server + shared_buffers = 2GB

Script:

sudo -u _renderd osm2pgsql \                                                                         
  -d gis \                                                                                                
  --create \                                                                                              
  --slim -G \                                                                                             
  --hstore \                                                                                              
  --tag-transform-script ~/src/openstreetmap-carto/openstreetmap-carto.lua \                              
  -C 660000 \                                                                                             
  --number-processes 32 \                                                                                 
  -S ~/src/openstreetmap-carto/openstreetmap-carto.style \                                                
  ~/data/planet-221226.osm.pbf
$ ./import_planet.sh
2023-01-04 16:43:22  osm2pgsql version 1.6.0
2023-01-04 16:43:22  Database version: 14.5 (Ubuntu 14.5-0ubuntu0.22.04.1)
2023-01-04 16:43:22  PostGIS version: 3.2
2023-01-04 16:44:04  Setting up table 'planet_osm_point'
2023-01-04 16:44:05  Setting up table 'planet_osm_line'
2023-01-04 16:44:07  Setting up table 'planet_osm_polygon'
2023-01-04 16:44:11  Setting up table 'planet_osm_roads'
2023-01-05 13:10:43  Reading input files done in 73592s (20h 26m 32s).
2023-01-05 13:10:43    Processed 8119216287 nodes in 21821s (6h 3m 41s) - 372k/s
2023-01-05 13:10:43    Processed 909904675 ways in 25775s (7h 9m 35s) - 35k/s
2023-01-05 13:10:43    Processed 10494303 relations in 25996s (7h 13m 16s) - 404/s
2023-01-05 13:10:44  Clustering table 'planet_osm_roads' by geometry...
2023-01-05 13:10:44  Clustering table 'planet_osm_line' by geometry...
2023-01-05 13:10:44  Clustering table 'planet_osm_point' by geometry...
2023-01-05 13:10:44  Clustering table 'planet_osm_polygon' by geometry...
2023-01-05 13:10:47  Done postprocessing on table 'planet_osm_nodes' in 0s
2023-01-05 13:10:47  Building index on table 'planet_osm_ways'
2023-01-05 13:10:47  Building index on table 'planet_osm_rels'
2023-01-05 13:18:49  Creating geometry index on table 'planet_osm_roads'...
2023-01-05 13:25:19  Creating osm_id index on table 'planet_osm_roads'...
2023-01-05 13:25:50  Analyzing table 'planet_osm_roads'...
2023-01-05 13:30:04  Creating geometry index on table 'planet_osm_point'...
2023-01-05 14:18:24  Creating geometry index on table 'planet_osm_line'...
2023-01-05 14:30:48  Creating osm_id index on table 'planet_osm_point'...
2023-01-05 14:33:23  Analyzing table 'planet_osm_point'...
2023-01-05 15:06:54  Creating geometry index on table 'planet_osm_polygon'...
2023-01-05 16:11:46  Creating osm_id index on table 'planet_osm_line'...
2023-01-05 16:17:42  Analyzing table 'planet_osm_line'...
2023-01-05 22:28:54  Creating osm_id index on table 'planet_osm_polygon'...
2023-01-05 22:39:44  Analyzing table 'planet_osm_polygon'...
2023-01-07 00:08:53  Done postprocessing on table 'planet_osm_ways' in 125886s (34h 58m 6s)
2023-01-07 00:08:53  Done postprocessing on table 'planet_osm_rels' in 385s (6m 25s)
2023-01-07 00:08:53  All postprocessing on table 'planet_osm_point' done in 4959s (1h 22m 39s).
2023-01-07 00:08:53  All postprocessing on table 'planet_osm_line' done in 11221s (3h 7m 1s).
2023-01-07 00:08:53  All postprocessing on table 'planet_osm_polygon' done in 34146s (9h 29m 6s).
2023-01-07 00:08:53  All postprocessing on table 'planet_osm_roads' done in 907s (15m 7s).
2023-01-07 00:08:53  osm2pgsql took 199531s (55h 25m 31s) overall.