Osm2pgsql/benchmarks

From OpenStreetMap Wiki
< Osm2pgsql(Redirected from Osm2pgsql benchmarks)
Jump to navigation Jump to search

Background

Benchmarks of osm2pgsql are important metrics for users to reference because importing OSM data is highly dependent on machine hardware and software configurations. Importing a complete planet file can take days even on a typical higher end desktop machine. Importing an extract(subset) of the planet file can take considerably less time to import and should be used if possible for your import instead of the planet file.

This page contains benchmarks for both legacy and modern hardware and software. Carefully review what is an appropriate comparison for your configuration!

The official manual pages for osm2pgsql with many tips and technical details can be found here:

osm2pgsql manual

What affects import time?

Partial list of variables that affect the time it takes to import.

  • Hardware
    • Hard disk throughput ( HDD, SSD )
    • RAM size
    • CPU
  • Operating system
    • OS used (64 bit version required)
    • linux disk scheduler ( CFG, noop, deadline, ... )
    • linux file system ( ext4/LVM , xfs/raid, ...)
  • Database
    • Version
      • PostgreSQL 10 and PostGIS 2.5
      • PostgreSQL 15 and PostGIS 3.3
      • ...
    • PostgreSQL config settings ( postgresql.conf )
      • shared_buffers
      • work_mem
      • maintanance_work_mem
      • fsync
      • checkpoint_segments
      • ...
  • Osm2pgsql application
    • osm2pgsql version (v1.2.1 .. v1.8.0)
    • parameters
      • "Slim Mode" or not
      • Size of the node cache (-C command line argument)
      • Flat node storage
      • hstore / jsonb
      • Style: "flex" or "pgsql" output (flex is new and recommended going into the future)
      • Parallelisation ( --number-processes )
  • Use case
    • size of the import file : Full planet or Extracts
    • input reader and input file format
      • the 'pbf' input reader reads OSM PBF files about twice as fast as the 'libxml2' parser parses an equivalent OSM XML file
    • other customisation ( special index )

more info - see Frederik Ramm: Optimising the Mapnik/osm2pgsql Rendering Toolchain 2.0 @ SOTM 2012

Benchmarks

List of benchmarks contributed by users. Currently simply using time command to return length of time' it takes osm2pgsql task to complete. If you do not have time available please provide some other meaningful metric. Better organization and formatting standard for this section is needed.

Gazetteer on AWS EC2 x1e.4xlarge, 16 vCPU, 47 ECU, 488GiB of ram, 500GB GP SSD storage ~2h

  • Ubuntu Server 18.04 LTS
  • db and load performed on the same machine
  • peak ram usage: 312GB
  • total import time: 1h46min
  • final storage usage: 128GB

PosgreSQL config differences from default install:

  • shared_buffers = 2048MB
  • checkpoint_timeout = 30min
$ time osm2pgsql -C 480000 -O gazetteer planet-190916.osm.pbf

(...)

Osm2pgsql took 6379s overall

real	106m28.153s
user	130m8.427s
sys	16m42.562s

Europe (22.5GB PBF, 2020-12) on Ubuntu 20.04, AMD EPYC 7502P (32C/64T), 128 GB RAM, 960 GB NVMe SSD ~6h

Summary

  • Europe import based on https://download.geofabrik.de/europe.html (europe-latest.osm.pbf 22.5GB).
  • Date of import: 2020/12
  • Full import 5.8 hours
  • Disk usage
    • europe-latest.osm.pbf file: 22.5 GB
    • PostgreSQL/PostGIS OSM database: end size: 158 GB.
    • nodes.cache: 80 GB

Hardware

  • 32 cores
  • 128 GB RAM
  • 960 GB NMVe SSD disk

Operating system

  • Ubuntu Server 20.04
  • Docker 20.10.0

Database

  • PostgreSQL 12.5 and PostGIS 3.0

Non-Default settings:

shared_buffers = 20000 MB
maintenance_work_mem = 1000 MB
work_mem = 5000 MB
max_worker_processes = 32
max_parallel_workers = 32

Osm2pgsql

  • osm2pgsql version 1.2.1
  • osm2pgsql command options:

osm2pgsql -d gis -U xxx --create -G --hstore -W -l --slim --drop --style ~/src/openstreetmap-carto/openstreetmap-carto.style --tag-transform-script ~/src/openstreetmap-carto/openstreetmap-carto.lua -C 80000 --number-processes 32 ~/europe-latest.osm.pbf

Osm2pgsql log

Allocating memory for dense node cache
Allocating dense node cache in one big chunk
Allocating memory for sparse node cache
Sharing dense sparse
Node-cache: cache=80000MB, maxblocks=1280000*65536, allocation method=11
Mid: pgsql, cache=80000
Setting up table: planet_osm_nodes
Setting up table: planet_osm_ways
Setting up table: planet_osm_rels
Using lua based tag processing pipeline with script /home/renderaccount/src/openstreetmap-carto/openstreetmap-carto.lua
Using projection SRS 4326 (Latlong)
Setting up table: planet_osm_point
Setting up table: planet_osm_line
Setting up table: planet_osm_polygon
Setting up table: planet_osm_roads

Reading in file: /home/renderaccount/europe-latest.osm.pbf
Using PBF parser.
Processing: Node(2660149k 833.1k/s) Way(318682k 35.89k/s) Relation(5358780 1927.62/s)  parse time: 14852s
Node stats: total(2660149455), max(8223095445) in 3193s
Way stats: total(318682136), max(884214400) in 8879s
Relation stats: total(5360596), max(12010480) in 2780s
Stopping table: planet_osm_nodes
Stopped table: planet_osm_nodes in 3s
Stopping table: planet_osm_ways
Stopped table: planet_osm_ways in 1s
Stopping table: planet_osm_rels
Stopped table: planet_osm_rels in 0s
Sorting data and creating indexes for planet_osm_point
Sorting data and creating indexes for planet_osm_roads
Sorting data and creating indexes for planet_osm_polygon
Sorting data and creating indexes for planet_osm_line
Copying planet_osm_roads to cluster by geometry finished
Creating geometry index on planet_osm_roads
Creating indexes on planet_osm_roads finished
All indexes on planet_osm_roads created in 410s
Completed planet_osm_roads
Copying planet_osm_point to cluster by geometry finished
Creating geometry index on planet_osm_point
Copying planet_osm_line to cluster by geometry finished
Creating geometry index on planet_osm_line
Copying planet_osm_polygon to cluster by geometry finished
Creating geometry index on planet_osm_polygon
Creating indexes on planet_osm_point finished
All indexes on planet_osm_point created in 1972s
Completed planet_osm_point
Creating indexes on planet_osm_line finished
All indexes on planet_osm_line created in 2653s
Completed planet_osm_line
Creating indexes on planet_osm_polygon finished
All indexes on planet_osm_polygon created in 6022s
Completed planet_osm_polygon

Osm2pgsql took 20879s overall
node cache: stored: 2660149455(100.00%), storage efficiency: 57.81% (dense blocks: 244710, sparse nodes: 1298474118), hit rate: 100.00%

Planet (49GB PBF, 2020-01-07) on Desktop Debian 9, 4 cores i5-6500 CPU @ 3.20GHz/32GB RAM, 1TB+500GB SSD (hstore slim drop flat-nodes and ZFS filesystem ~52h

Summary

  • Full planet import using osmtilemaker (planet-200107.osm.pbf 49GB).
  • Date of import: 2020/01/07
  • Full import lasted 51,8 hours
  • Disk usage
    • planet.osm.pbf file: 49 GB
    • PostgreSQL/PostGIS OSM database (ZFS filesystem): peak usage 460 GB, end size: 185 GB.
    • nodes.cache: 53 GB

Hardware

  • Processor model: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
  • 4 cores (1 thread per core)
  • 32GB RAM/32GB swap
  • 500GB SSD disk (to host planet-200107.osm.pbf and nodes.cache)
  • 1TB SSD disk with ZFS filesystem with compression=lz4 and recordsize=8k for database

Operating system

  • Debian 9 Stretch
  • Docker-ce 19.03.5

Database

  • PostgreSQL 10 and PostGIS 2.5

shared_buffers = 512MB
work_mem = 64MB
maintenance_work_mem = 1024MB
wal_buffers = -1
checkpoint_completion_target = 0.9
random_page_cost = 2.0
cpu_tuple_cost = 0.05
autovacuum_analyze_scale_factor = 0.2
full_page_writes = off

Osm2pgsql

  • osm2pgsql version 1.2.0
  • osm2pgsql command options:

--hstore --style openstreetmap-carto.style --tag-transform-script openstreetmap-carto.lua --slim --drop --flat-nodes nodes.cache --cache 25000 --number-processes 4 --multi-geometry

Osm2pgsql logs extract

osm2pgsql version 1.2.0 (64 bit id space)
Node-cache: cache=25000MB, maxblocks=400000*65536, allocation method=11
Processing: Node(5682827k 1919.9k/s) Way(630406k 7.36k/s) Relation(7379140 176.55/s) parse time: 130354s
Node stats: total(5682827154), max(7094416299) in 2960s
Way stats: total(630406201), max(759474335) in 85597s
Relation stats: total(7379167), max(10509928) in 41796s
All indexes on planet_osm_roads created in 13768s
All indexes on planet_osm_point created in 15274s
All indexes on planet_osm_line created in 28082s
All indexes on planet_osm_polygon created in 55963s
Osm2pgsql took 186318s overall
Mid: removing persistent node cache at /docker_mounted_volumes/osm_nodes/nodes.cache
node cache: stored: 2988125793(52.58%), storage efficiency: 91.19% (dense blocks: 368683, sparse nodes: 128277398), hit rate: 51.38%

Planet (53GB PBF, 2020-10) on Azure VM (CentOS 7.8, 16 vcpus, 64GB RAM, 2TB + 128GB + 128GB Premium SSD) ~6d9h

Summary

  • Full planet import based on https://github.com/Overv/openstreetmap-tile-server (planet-200831.osm.pbf 53GB).
  • Date of import: 2020/10
  • Full import 6 days 9 hours
  • Disk usage
    • planet.osm.pbf file: 53 GB
    • PostgreSQL/PostGIS OSM database: peak usage 1.1 TB, end size: 880 GB.
    • nodes.cache: 59 GB

Hardware

  • 16 cores
  • 64GB RAM
  • 128GB SSD disk to host planet-200831.osm.pbf file
  • 128GB SSD disk to host flat-nodes file
  • 2TB SSD disk for PostgreSQL database

Operating system

  • Centos 7.8.2003
  • Docker 19.03.13

Database

  • PostgreSQL 10 and PostGIS 2.4

shared_buffers = 2GB
min_wal_size = 1GB
max_wal_size = 2GB
maintenance_work_mem = 10GB
max_connections = 250
temp_buffers = 32MB
work_mem = 256MB
wal_buffers = 1024kB
wal_writer_delay = 500ms
commit_delay = 10000
max_wal_size = 2880MB
random_page_cost = 1.1
track_activity_query_size = 16384
autovacuum_vacuum_scale_factor = 0.05
autovacuum_analyze_scale_factor = 0.02

Osm2pgsql

  • osm2pgsql version 1.2.0
  • osm2pgsql command options:

--create --slim -G --hstore --tag-transform-script /home/renderer/src/openstreetmap-carto/openstreetmap-carto.lua -S /home/renderer/src/openstreetmap-carto/openstreetmap-carto.style /data.osm.pbf --number-processes 14 -C 32000 --flat-nodes /nodes/flat_nodes.bin osm2pgsql version 1.2.0

Osm2pgsql logs extract

sudo -u renderer osm2pgsql -d gis --create --slim -G --hstore --tag-transform-script /home/renderer/src/openstreetmap-carto/openstreetmap-carto.lua -S /home/renderer/src/openstreetmap-carto/openstreetmap-carto.style /data.osm.pbf --number-processes 14 -C 32000 --flat-nodes /nodes/flat_nodes.bin osm2pgsql version 1.2.0

Processing: Node(6282141k 1619.5k/s) Way(693811k 3.89k/s) Relation(8092025 54.9/s)  parse time: 329824s
Node stats: total(6282141789), max(7859748112) in 3879s
Way stats: total(693811435), max(842504393) in 178503s
Relation stats: total(8092025), max(11562182) in 147442s

Stopped table: planet_osm_nodes in 0s
Stopped table: planet_osm_rels in 674s
All indexes on planet_osm_roads created in 4405s
All indexes on planet_osm_point created in 6702s
All indexes on planet_osm_line created in 20419s
All indexes on planet_osm_polygon created in 41617s
Stopped table: planet_osm_ways in 219360s

Osm2pgsql took 549186s overall
node cache: stored: 3803116413(60.54%), storage efficiency: 90.67% (dense blocks: 473077, sparse nodes: 159429692), hit rate: 59.15%

Planet (66GB PBF, 2022-12-26) on CloudFerro eo2a.4xlarge, 32 cores, 128GB RAM ~56h

  • Ubuntu 22.04
  • 128 GB RAM, 32 cores, ext4 on SSD disks

Full planet import (planet-221226.osm.pbf, ~66 GB): ~56h, 1,6 TB

Tuning: https://osm2pgsql.org/doc/manual.html#tuning-the-postgresql-server + shared_buffers = 2GB

Script:

sudo -u _renderd osm2pgsql \                                                                         
  -d gis \                                                                                                
  --create \                                                                                              
  --slim -G \                                                                                             
  --hstore \                                                                                              
  --tag-transform-script ~/src/openstreetmap-carto/openstreetmap-carto.lua \                              
  -C 660000 \                                                                                             
  --number-processes 32 \                                                                                 
  -S ~/src/openstreetmap-carto/openstreetmap-carto.style \                                                
  ~/data/planet-221226.osm.pbf
$ ./import_planet.sh
2023-01-04 16:43:22  osm2pgsql version 1.6.0
2023-01-04 16:43:22  Database version: 14.5 (Ubuntu 14.5-0ubuntu0.22.04.1)
2023-01-04 16:43:22  PostGIS version: 3.2
2023-01-04 16:44:04  Setting up table 'planet_osm_point'
2023-01-04 16:44:05  Setting up table 'planet_osm_line'
2023-01-04 16:44:07  Setting up table 'planet_osm_polygon'
2023-01-04 16:44:11  Setting up table 'planet_osm_roads'
2023-01-05 13:10:43  Reading input files done in 73592s (20h 26m 32s).
2023-01-05 13:10:43    Processed 8119216287 nodes in 21821s (6h 3m 41s) - 372k/s
2023-01-05 13:10:43    Processed 909904675 ways in 25775s (7h 9m 35s) - 35k/s
2023-01-05 13:10:43    Processed 10494303 relations in 25996s (7h 13m 16s) - 404/s
2023-01-05 13:10:44  Clustering table 'planet_osm_roads' by geometry...
2023-01-05 13:10:44  Clustering table 'planet_osm_line' by geometry...
2023-01-05 13:10:44  Clustering table 'planet_osm_point' by geometry...
2023-01-05 13:10:44  Clustering table 'planet_osm_polygon' by geometry...
2023-01-05 13:10:47  Done postprocessing on table 'planet_osm_nodes' in 0s
2023-01-05 13:10:47  Building index on table 'planet_osm_ways'
2023-01-05 13:10:47  Building index on table 'planet_osm_rels'
2023-01-05 13:18:49  Creating geometry index on table 'planet_osm_roads'...
2023-01-05 13:25:19  Creating osm_id index on table 'planet_osm_roads'...
2023-01-05 13:25:50  Analyzing table 'planet_osm_roads'...
2023-01-05 13:30:04  Creating geometry index on table 'planet_osm_point'...
2023-01-05 14:18:24  Creating geometry index on table 'planet_osm_line'...
2023-01-05 14:30:48  Creating osm_id index on table 'planet_osm_point'...
2023-01-05 14:33:23  Analyzing table 'planet_osm_point'...
2023-01-05 15:06:54  Creating geometry index on table 'planet_osm_polygon'...
2023-01-05 16:11:46  Creating osm_id index on table 'planet_osm_line'...
2023-01-05 16:17:42  Analyzing table 'planet_osm_line'...
2023-01-05 22:28:54  Creating osm_id index on table 'planet_osm_polygon'...
2023-01-05 22:39:44  Analyzing table 'planet_osm_polygon'...
2023-01-07 00:08:53  Done postprocessing on table 'planet_osm_ways' in 125886s (34h 58m 6s)
2023-01-07 00:08:53  Done postprocessing on table 'planet_osm_rels' in 385s (6m 25s)
2023-01-07 00:08:53  All postprocessing on table 'planet_osm_point' done in 4959s (1h 22m 39s).
2023-01-07 00:08:53  All postprocessing on table 'planet_osm_line' done in 11221s (3h 7m 1s).
2023-01-07 00:08:53  All postprocessing on table 'planet_osm_polygon' done in 34146s (9h 29m 6s).
2023-01-07 00:08:53  All postprocessing on table 'planet_osm_roads' done in 907s (15m 7s).
2023-01-07 00:08:53  osm2pgsql took 199531s (55h 25m 31s) overall.


Planet (72.2GB PBF, 2023-01-23) on HP Z840 workstation, 256GB DDR4 2133, 2x Xeon E5-2699v4 22C, 5x Samsung 970 EVO Plus 2TB NVMe, Ubuntu 22.04 (flex | --slim) ~10.5h

Software

  • osm2pgsql 1.8.0
  • PostgreSQL 15.2
  • PostGIS 3.3.2

This import used the new "flex" output of osm2pgsql, including "stage 2" processing to create de-duplicated line geometries of OpenStreetMap boundary polygons according to a modified variant of this "flex output" variant of the openstreetmap-carto style. The modified variant includes one more table table definition than the original (waterway route table), and uses WGS 1984 output coordinate system rather than Web Mercator.

Import done on a locally running Windows Hyper-V instance of Ubuntu, with Ubuntu 22.04 as guest system, and Windows 10, 22H2 as host system.

All Samsung 2TB NVMe drives attached over PCIe using two "ASUS Hyper M.2 x16 Gen. 4" cards connected on the PCIe3 x16 slots (PCIe 4x4x4x4 bifurcation set in the BIOS).

NOTE: It was discovered after the import, that the Windows Storage Space created through the Windows GUI for the 5x Samsung 970 EVO Plus drives did not put the drives in an actual striped RAID configuration as expected, but only used a single column, which isn't striped RAID 0, but just combines the capacity of the drives. This negatively affects the combined total performance of the drives in the Storage Space. For a new run with properly configured Windows Storage Space, see next benchmark immediately below this one.

Hyperthreading and NUMA were disabled in the BIOS on the HP Z840 workstation to create a single processor group, due to the Windows 10 issue with processor groups on systems with > 64 logical processors (https://bitsum.com/general/the-64-core-threshold-processor-groups-and-windows/).

Command

osm2pgsql -O flex -d <DATABASE> -U <USER_NAME> --create -W -C 75000 --slim --drop --flat-nodes 'data/nodes.bin' --number-processes 32 -P 5433 -S 'osm2pgsql/flex-config/openstreetmap-carto.lua' 'data/planet-230123.osm.pbf'

Output

2023-02-09 19:49:17  osm2pgsql version 1.8.0 (1.7.2-164-g1359e2ea)
Password:
2023-02-09 19:49:24  Database version: 15.2 (Ubuntu 15.2-1.pgdg22.04+1)
2023-02-09 19:49:24  PostGIS version: 3.3
2023-02-10 01:55:00  Reading input files done in 21934s (6h 5m 34s).                      
2023-02-10 01:55:00    Processed 8172948111 nodes in 1997s (33m 17s) - 4093k/s
2023-02-10 01:55:00    Processed 915512510 ways in 12874s (3h 34m 34s) - 71k/s
2023-02-10 01:55:00    Processed 10604686 relations in 7063s (1h 57m 43s) - 2k/s
2023-02-10 01:55:00  Reprocess marked ways (stage 2)...
2023-02-10 01:55:17  Creating id index on table 'planet_osm_line'...
2023-02-10 01:58:41  Creating id index on table 'planet_osm_polygon'...
2023-02-10 02:05:21  Creating id index on table 'planet_osm_transport_line'...
2023-02-10 02:07:48  Creating id index on table 'planet_osm_transport_polygon'...
2023-02-10 02:07:49  Creating id index on table 'planet_osm_admin'...
2023-02-10 02:07:49  Creating id indexes took 769s (12m 49s)
2023-02-10 02:07:51  There are 2473761 ways to reprocess...
2023-02-10 02:21:42  Dropping table 'planet_osm_nodes'
2023-02-10 02:21:42  Table 'planet_osm_nodes' dropped in 0s
2023-02-10 02:21:42  Dropping table 'planet_osm_ways'
2023-02-10 02:22:04  Table 'planet_osm_ways' dropped in 22s
2023-02-10 02:22:04  Dropping table 'planet_osm_rels'
2023-02-10 02:22:05  Table 'planet_osm_rels' dropped in 0s
2023-02-10 02:22:05  Done postprocessing on table 'planet_osm_nodes' in 0s
2023-02-10 02:22:05  Done postprocessing on table 'planet_osm_ways' in 0s
2023-02-10 02:22:05  Done postprocessing on table 'planet_osm_rels' in 0s
2023-02-10 02:22:05  Clustering table 'planet_osm_point' by geometry...
2023-02-10 02:22:35  Clustering table 'planet_osm_polygon' by geometry...
2023-02-10 02:22:35  Clustering table 'planet_osm_line' by geometry...
2023-02-10 02:22:46  No indexes to create on table 'planet_osm_route'.
2023-02-10 02:22:46  Clustering table 'planet_osm_transport_polygon' by geometry...
2023-02-10 02:22:46  Clustering table 'planet_osm_transport_line' by geometry...
2023-02-10 02:22:46  Analyzing table 'planet_osm_route'...
2023-02-10 02:22:46  Clustering table 'planet_osm_waterway' by geometry...
2023-02-10 02:22:47  Clustering table 'planet_osm_admin' by geometry...
2023-02-10 02:22:55  Creating index on table 'planet_osm_transport_polygon' ("way")...
2023-02-10 02:22:56  Analyzing table 'planet_osm_transport_polygon'...
2023-02-10 02:23:17  Creating index on table 'planet_osm_waterway' ("way")...
2023-02-10 02:23:19  Analyzing table 'planet_osm_waterway'...
2023-02-10 02:23:48  Creating index on table 'planet_osm_admin' ("way")...
2023-02-10 02:23:53  Analyzing table 'planet_osm_admin'...
2023-02-10 03:04:39  Creating index on table 'planet_osm_point' ("way")...
2023-02-10 03:23:04  Analyzing table 'planet_osm_point'...
2023-02-10 03:23:12  All postprocessing on table 'planet_osm_point' done in 3666s (1h 1m 6s).
2023-02-10 03:39:06  Creating index on table 'planet_osm_transport_line' ("way")...
2023-02-10 04:03:46  Creating index on table 'planet_osm_line' ("way")...
2023-02-10 04:16:11  Analyzing table 'planet_osm_transport_line'...
2023-02-10 04:47:41  Analyzing table 'planet_osm_line'...
2023-02-10 04:47:54  All postprocessing on table 'planet_osm_line' done in 8748s (2h 25m 48s).
2023-02-10 05:02:29  Creating index on table 'planet_osm_polygon' ("way")...
2023-02-10 06:19:32  Analyzing table 'planet_osm_polygon'...
2023-02-10 06:19:40  All postprocessing on table 'planet_osm_polygon' done in 14254s (3h 57m 34s).
2023-02-10 06:19:40  All postprocessing on table 'planet_osm_transport_line' done in 6857s (1h 54m 17s).
2023-02-10 06:19:40  All postprocessing on table 'planet_osm_transport_polygon' done in 54s.
2023-02-10 06:19:40  All postprocessing on table 'planet_osm_route' done in 53s.
2023-02-10 06:19:40  All postprocessing on table 'planet_osm_admin' done in 111s (1m 51s).
2023-02-10 06:19:40  All postprocessing on table 'planet_osm_waterway' done in 79s (1m 19s).
2023-02-10 06:19:43  osm2pgsql took 37818s (10h 30m 18s) overall.

Planet (74.9GB PBF, 2023-07-03) on HP Z840 workstation, 512GB DDR4 2133, 2x Xeon E5-2699v4 22C, 5x Samsung 970 EVO Plus 2TB NVMe RAID 0, 4x Intel OPTANE P1600X 110GB NVMe RAID 0, Ubuntu 22.04 (flex | --slim) ~9h

Software

  • osm2pgsql 1.8.1
  • PostgreSQL 15.3
  • PostGIS 3.3.3

This import used the new "flex" output of osm2pgsql, including "stage 2" processing to create de-duplicated line geometries of OpenStreetMap boundary polygons according to a modified variant of this "flex output" variant of the openstreetmap-carto style. The modified variant includes one more table table definition than the original flex style ('planet_osm_waterway'), and four more tables ('planet_osm_admin','planet_osm_transport_line','planet_osm_transport_polygon','planet_osm_waterway') compared to old style "pgsql output", while dropping the "pgsql output" specific 'planet_osm_roads' table. It also uses WGS 1984 output coordinate system rather than Web Mercator.

NOTE: Due to the different table structure and database schema, for a fairer comparison of the total processing time of this "flex output" style benchmark compared to old style "pgsql output" as is used in most other benchmarks you see on this webpage, you should substract some time of the 9h benchmark value. A reasonable amount for that is probably 0.5 hours, based on adding up the times for "Reading input files" and the postprocessing time of the largest table ('planet_osm_polygon'), which means 8.5h is a fair benchmark value to compare with "pgsql output" benchmarks on this webpage. Note that postprocessing of the different tables happens in parallel, so if enough resources (CPU, io) are available, only the largest table will ultimately determine the total processing time.

Import done on a locally running Windows Hyper-V instance of Ubuntu, with Ubuntu 22.04 as guest system, and Windows 10, 22H2 as host system. 420GB of the 512GB RAM of the workstation assigned to the Ubuntu Hyper-V instance, additionally 100GB swap defined under Ubuntu which uses the virtualized NVMe RAID.

All Samsung 2TB NVMe drives attached over PCIe using two "ASUS Hyper M.2 x16 Gen. 4" cards connected on the PCIe3 x16 slots (PCIe 4x4x4x4 bifurcation set in the BIOS), the Intel OPTANE drives connected over two x8 PCIe Gen.4 cards and inserted in the remaining last two empty PCIe x8 slots of the HP Z840 (PCIe 4x4 bifurcation set in the BIOS).

NOTE: This import session used properly configured Windows Storage Spaces of type "Simple" which is equivalent to RAID 0 if Windows PowerShell is used to define the correct number of "columns" to truly stripe the data across all drives. This cannot be set using the Windows GUI for Storage Spaces, but requires PowerShell.

Hyperthreading and NUMA were disabled in the BIOS on the HP Z840 workstation to create a single processor group, due to the Windows 10 issue with processor groups on systems with > 64 logical processors (https://bitsum.com/general/the-64-core-threshold-processor-groups-and-windows/).

This import, which is 1 1/2 hours faster than the previous one despite slightly larger PBF, differs from the previous one in four aspects:

  • Properly defined Windows Storage Space as software RAID 0
  • Intel OPTANE drives used for PostgreSQL WAL
  • Upgrade to 512GB instead of 256GB RAM
  • osm2pgsql cache set to 0

Of these, the proper definition of the Storage Space, and thus taking full advantage of the RAIDed Samsung and OPTANE drives, is probably the main reason for the faster import of nodes, ways and relations at the start of osm2pgsql processing as seen in the log below. Adding the OPTANE drives was just an experiment, as the specific P1600X ones, aren't the fastest OPTANE options out there. In fact, they are low end, but still have part of the typical low queue depth high throughput of which OPTANE is known. That said, it was only a temporary offer that made this worthwhile, and other more recent PCIe4 and 5 NVMe drives are probably better options (although unsupported in full by the HP Z840 which only features PCIe3). Putting PostgreSQL's WAL on a separate set of drives should theoretically increase performance though, although it is unlikely to play a major role here, as osm2pgsql uses UNLOGGED tabels for at least part of the import steps, minimizing WAL usage. The upgrade in RAM allowed setting bigger work memory settings in PostgreSQL, and is likely the main cause for the speed up of the indexing processing as witnessed by the considerably shorter timings displayed for the "All processing on table <X> done in <Y> seconds" log lines. The biggest win actually appears in the indexing stage, see the timings of both runs. Osm2pgsql cache was experimentally set to 0 based on a remark of one of the osm2pgsql maintainers, so as to free up memory for the osm2pgsql main process, and didn't seem to hurt import in this configuration, although further testing should follow.

Postgres config:

  • shared_buffers = 275GB
  • temp_buffers = 8000MB
  • work_mem = 4000MB
  • maintenance_work_mem = 16000MB
  • autovacuum_work_mem = 4000MB
  • wal_level = minimal
  • synchronous_commit = off
  • wal_compression = on
  • wal_buffers = 128MB
  • max_wal_senders = 0
  • checkpoint_timeout = 5min
  • checkpoint_completion_target = 0.9
  • max_wal_size = 150GB
  • min_wal_size = 10GB
  • random_page_cost = 1.0
  • parallel_tuple_cost = 0.025
  • parallel_setup_cost = 100
  • effective_cache_size = 400GB
  • effective_io_concurrency = 500
  • maintenance_io_concurrency = 500
  • max_worker_processes = 264
  • max_parallel_maintenance_workers = 22
  • max_parallel_workers_per_gather = 44
  • parallel_leader_participation = off
  • max_parallel_workers = 264

Command

osm2pgsql -O flex -d <DATABASE> -U <USER_NAME> --create -W -C 0 --slim --drop --flat-nodes 'data/nodes.bin' --number-processes 32 -P 5433 -S 'osm2pgsql/flex-config/openstreetmap-carto.lua' 'data/planet-230703.osm.pbf'

Output

2023-07-12 10:03:56  osm2pgsql version 1.8.1 (1.8.1-101-geaff0924)
Password:
2023-07-12 10:04:04  Database version: 15.3 (Ubuntu 15.3-1.pgdg22.04+1)
2023-07-12 10:04:04  PostGIS version: 3.3
2023-07-12 15:49:52  Reading input files done in 20743s (5h 45m 43s).                     
2023-07-12 15:49:52    Processed 8484080355 nodes in 1807s (30m 7s) - 4695k/s
2023-07-12 15:49:52    Processed 951070773 ways in 11522s (3h 12m 2s) - 83k/s
2023-07-12 15:49:52    Processed 11073095 relations in 7414s (2h 3m 34s) - 1k/s
2023-07-12 15:49:53  Reprocess marked ways (stage 2)...
2023-07-12 15:50:35  Creating id index on table 'planet_osm_line'...
2023-07-12 15:53:49  Creating id index on table 'planet_osm_polygon'...
2023-07-12 16:00:22  Creating id index on table 'planet_osm_transport_line'...
2023-07-12 16:09:14  Creating id index on table 'planet_osm_transport_polygon'...
2023-07-12 16:09:14  Creating id index on table 'planet_osm_admin'...
2023-07-12 16:09:14  Creating id indexes took 1161s (19m 21s)
2023-07-12 16:09:16  There are 2516850 ways to reprocess...
2023-07-12 16:23:45  Dropping table 'planet_osm_nodes'
2023-07-12 16:23:45  Table 'planet_osm_nodes' dropped in 0s
2023-07-12 16:23:45  Dropping table 'planet_osm_ways'
2023-07-12 16:24:09  Table 'planet_osm_ways' dropped in 24s
2023-07-12 16:24:09  Dropping table 'planet_osm_rels'
2023-07-12 16:24:10  Table 'planet_osm_rels' dropped in 0s
2023-07-12 16:24:10  Done postprocessing on table 'planet_osm_nodes' in 0s
2023-07-12 16:24:10  Done postprocessing on table 'planet_osm_ways' in 0s
2023-07-12 16:24:10  Done postprocessing on table 'planet_osm_rels' in 0s
2023-07-12 16:24:16  Clustering table 'planet_osm_polygon' by geometry...
2023-07-12 16:24:23  Clustering table 'planet_osm_transport_line' by geometry...
2023-07-12 16:24:23  Clustering table 'planet_osm_transport_polygon' by geometry...
2023-07-12 16:24:24  Clustering table 'planet_osm_point' by geometry...
2023-07-12 16:24:24  Clustering table 'planet_osm_admin' by geometry...
2023-07-12 16:24:24  Clustering table 'planet_osm_waterway' by geometry...
2023-07-12 16:24:24  No indexes to create on table 'planet_osm_route'.
2023-07-12 16:24:24  Clustering table 'planet_osm_line' by geometry...
2023-07-12 16:24:24  Analyzing table 'planet_osm_route'...
2023-07-12 16:24:31  Creating index on table 'planet_osm_transport_polygon' ("way")...
2023-07-12 16:24:32  Analyzing table 'planet_osm_transport_polygon'...
2023-07-12 16:24:52  Creating index on table 'planet_osm_waterway' ("way")...
2023-07-12 16:24:54  Analyzing table 'planet_osm_waterway'...
2023-07-12 16:25:11  Creating index on table 'planet_osm_admin' ("way")...
2023-07-12 16:25:17  Analyzing table 'planet_osm_admin'...
2023-07-12 17:09:18  Creating index on table 'planet_osm_point' ("way")...
2023-07-12 17:16:01  Analyzing table 'planet_osm_point'...
2023-07-12 17:16:06  All postprocessing on table 'planet_osm_point' done in 3115s (51m 55s).
2023-07-12 17:35:22  Creating index on table 'planet_osm_transport_line' ("way")...
2023-07-12 17:42:59  Analyzing table 'planet_osm_transport_line'...
2023-07-12 17:56:23  Creating index on table 'planet_osm_line' ("way")...
2023-07-12 18:07:42  Analyzing table 'planet_osm_line'...
2023-07-12 18:07:48  All postprocessing on table 'planet_osm_line' done in 6217s (1h 43m 37s).
2023-07-12 18:40:21  Creating index on table 'planet_osm_polygon' ("way")...
2023-07-12 19:05:55  Analyzing table 'planet_osm_polygon'...
2023-07-12 19:06:01  All postprocessing on table 'planet_osm_polygon' done in 9710s (2h 41m 50s).
2023-07-12 19:06:01  All postprocessing on table 'planet_osm_transport_line' done in 4733s (1h 18m 53s).
2023-07-12 19:06:01  All postprocessing on table 'planet_osm_transport_polygon' done in 25s.
2023-07-12 19:06:01  All postprocessing on table 'planet_osm_route' done in 20s.
2023-07-12 19:06:01  All postprocessing on table 'planet_osm_admin' done in 69s (1m 9s).
2023-07-12 19:06:01  All postprocessing on table 'planet_osm_waterway' done in 48s.
2023-07-12 19:06:12  osm2pgsql took 32527s (9h 2m 7s) overall.

Planet (75.0GB PBF, 2023-07-10) on HP Z840 workstation, 512GB DDR4 2133, 2x Xeon E5-2699v4 22C, 5x Samsung 970 EVO Plus 2TB NVMe RAID 0, 4x Intel OPTANE P1600X 110GB NVMe RAID 0, Ubuntu 22.04 (flex | non-slim) ~8h

Software

  • osm2pgsql 1.8.1
  • PostgreSQL 15.3
  • PostGIS 3.3.3

This - one hour faster - import session than the previous one listed directly above, was a first attempt to take full advantage of the large available RAM in the upgraded HP Z840 workstation. With 512GB RAM, it proved possible importing the data without --slim, storing all intermediate data for nodes, ways, and relations directly in memory.

In order to make this possible, the PostgreSQL config was changed slightly (all other settings still the same):

  • shared_buffers = 75GB
  • maintenance_work_mem = 32000MB

Actually, the only really relevant change is to shared_buffers. By reducing it from 275GB to 75GB, memory was freeed up for osm2pgsql. The previous very high setting was actually another experiment, where there seems discussion in the PostgreSQL community to giving either a minor, or major amount of RAM to PostgreSQL, to either allow PostgreSQL itself to maximally cache data (very large setting of shared_buffers), or the operating system (small setting of shared_buffers).

No -C/--cache was set on the command line. In non-slim mode, osm2pgsql by default stores all nodes, ways and relations in memory, so caching has no meaning and the cache setting is apparently (and logically) ignored.

Also, based on the previous enlarged setting of maintenance_work_mem, that seemed to enhance the speed of indexing, the maintenance_work_mem was doubled to 32GB from 16GB. However, looking at the benchmark results, there appears to be no benefit on the indexing stages ("All postprocessing on table X done in Y s"), as the timings are virtually the same, so setting the maintenance_work_mem above 16GB or the 10GB recommended in the osm2pgsql manual, doesn't seem to make sense.

The one hour shorter time seems wholly attributable to a much faster relation loading stage, which went down from 2h3m to just 40m. This came at the cost of slightly slower way loading (83k/s down to 70k/s). Why the way stage loading is slightly slower, is unclear to me, I would have expected it to be slightly faster as well with data in RAM.

All in all, it seems that on modern NVMe backed hardware, the benefits of running fully in RAM, although still measurable, are limited, and using --flat_nodes and --slim with a proper NVMe RAID seems enough to get acceptable import times that come close to full in-RAM imports, reducing the need for a very large amount of RAM and risk of out of memory errors.

Top osm2pgsql memory usage witnessed was about 325GB, with all nodes, ways and relations loaded. For storing nodes, ways, and relations in RAM, it therefor seems probable that about 5x times the size of the imported PBF is needed (375GB at current 75GB PBF). To determine the total amount of needed RAM + swap for a non-slim import, you will need to add the expected memory usage of postgresql, the largest chunk of which is likely shared_buffers, but also work_mem settings need to be taken into account.

E.g., with current Planet of 75GB:

  • Allocate 5x "size of PBF" = 5x75GB = 375GB for osm2pgsql alone
  • If shared buffers is 75GB, and expected work_mem and other memory usage is another 75GB, a total of 375+75+75 = 525GB RAM + swap is expected to be needed for a successful non-slim import of Planet.

Command

osm2pgsql -O flex -d <DATABASE> -U <USER_NAME> --create -W --number-processes 32 -P 5433 -S 'osm2pgsql/flex-config/openstreetmap-carto.lua' 'data/planet-230710.osm.pbf'

Output

2023-07-19 09:51:01  osm2pgsql version 1.8.1 (1.8.1-101-geaff0924)
2023-07-19 09:51:09  Database version: 15.3 (Ubuntu 15.3-1.pgdg22.04+1)
2023-07-19 09:51:09  PostGIS version: 3.3
2023-07-19 14:48:55  Reading input files done in 17726s (4h 55m 26s).                     
2023-07-19 14:48:55    Processed 8508054840 nodes in 1761s (29m 21s) - 4831k/s
2023-07-19 14:48:55    Processed 953992088 ways in 13566s (3h 46m 6s) - 70k/s
2023-07-19 14:48:55    Processed 11109244 relations in 2399s (39m 59s) - 5k/s
2023-07-19 14:48:56  Reprocess marked ways (stage 2)...
2023-07-19 14:49:10  Creating id index on table 'planet_osm_line'...
2023-07-19 14:52:05  Creating id index on table 'planet_osm_polygon'...
2023-07-19 14:57:27  Creating id index on table 'planet_osm_transport_line'...
2023-07-19 14:59:05  Creating id index on table 'planet_osm_transport_polygon'...
2023-07-19 14:59:06  Creating id index on table 'planet_osm_admin'...
2023-07-19 14:59:06  Creating id indexes took 609s (10m 9s)
2023-07-19 14:59:07  There are 2524685 ways to reprocess...
2023-07-19 15:07:55  No indexes to create on table 'planet_osm_route'.
2023-07-19 15:07:55  Analyzing table 'planet_osm_route'...
2023-07-19 15:08:01  Clustering table 'planet_osm_waterway' by geometry...
2023-07-19 15:08:01  Clustering table 'planet_osm_line' by geometry...
2023-07-19 15:08:01  Clustering table 'planet_osm_transport_line' by geometry...
2023-07-19 15:08:01  Clustering table 'planet_osm_admin' by geometry...
2023-07-19 15:08:01  Clustering table 'planet_osm_point' by geometry...
2023-07-19 15:08:01  Clustering table 'planet_osm_transport_polygon' by geometry...
2023-07-19 15:08:01  Clustering table 'planet_osm_polygon' by geometry...
2023-07-19 15:08:08  Creating index on table 'planet_osm_transport_polygon' ("way")...
2023-07-19 15:08:10  Analyzing table 'planet_osm_transport_polygon'...
2023-07-19 15:08:26  Creating index on table 'planet_osm_waterway' ("way")...
2023-07-19 15:08:27  Analyzing table 'planet_osm_waterway'...
2023-07-19 15:08:45  Creating index on table 'planet_osm_admin' ("way")...
2023-07-19 15:08:50  Analyzing table 'planet_osm_admin'...
2023-07-19 15:45:16  Creating index on table 'planet_osm_point' ("way")...
2023-07-19 15:51:58  Analyzing table 'planet_osm_point'...
2023-07-19 15:52:04  All postprocessing on table 'planet_osm_point' done in 2707s (45m 7s).
2023-07-19 16:14:28  Creating index on table 'planet_osm_transport_line' ("way")...
2023-07-19 16:23:49  Analyzing table 'planet_osm_transport_line'...
2023-07-19 16:33:30  Creating index on table 'planet_osm_line' ("way")...
2023-07-19 16:46:45  Analyzing table 'planet_osm_line'...
2023-07-19 16:46:51  All postprocessing on table 'planet_osm_line' done in 5994s (1h 39m 54s).
2023-07-19 17:18:05  Creating index on table 'planet_osm_polygon' ("way")...
2023-07-19 17:46:49  Analyzing table 'planet_osm_polygon'...
2023-07-19 17:46:54  All postprocessing on table 'planet_osm_polygon' done in 9597s (2h 39m 57s).
2023-07-19 17:46:54  All postprocessing on table 'planet_osm_transport_line' done in 4618s (1h 16m 58s).
2023-07-19 17:46:54  All postprocessing on table 'planet_osm_transport_polygon' done in 75s (1m 15s).
2023-07-19 17:46:54  All postprocessing on table 'planet_osm_route' done in 63s (1m 3s).
2023-07-19 17:46:54  All postprocessing on table 'planet_osm_admin' done in 115s (1m 55s).
2023-07-19 17:46:54  All postprocessing on table 'planet_osm_waterway' done in 95s (1m 35s).
2023-07-19 17:46:55  osm2pgsql took 28545s (7h 55m 45s) overall.

Planet (75.1GB PBF, 2023-07-17) on HP Z840 workstation, 512GB DDR4 2133, 2x Xeon E5-2699v4 22C, 5x Samsung 970 EVO Plus 2TB NVMe RAID 0, 4x Intel OPTANE P1600X 110GB NVMe RAID 0, Ubuntu 22.04 (flex | --slim) ~8h

Software

This is actually an interesting new run done with a custom compiled non-release version of osm2pgsql 1.8.1+. This version features two important changes compared to released osm2pgsql 1.8.1:

- Commit https://github.com/openstreetmap/osm2pgsql/commit/5dba24058233f0aff38eb1d271ffdcc5ca735c33 that made relation loading 25% faster in --slim mode by ensuring proper use of a database ID index. There are no benefits of this commit in non-slim mode, where relation loading is already very fast, and osm2pgsql uses its own memory optimized RAM data structures to store and access nodes, ways and relations of the middle, so follows a different processing path.

- Introduces a new optionally set osm2pgsql 'middle' database structure / schema, with different contents of the nodes/ways/relations slim tables, and introduces a new 'osm2gpsql_properties' table to permanently store last used import command line settings. The new 'middle' structure did not seem to have a significant impact on performance (which also wasn't a goal of that development), even if the new command line option --middle-with-nodes was chosen, which adds a 'planet_osm_nodes' table copy next to 'nodes.bin' flat-nodes file with only the tagged nodes (the flat-nodes file contains all nodes, tagged or untagged ones). Note that in previous versions of osm2pgsql, including the official 1.8.1, and without specifying this new command line option, no 'planet_osm_nodes' table is created if a --flat-nodes file is used, as all nodes are already stored in the flat nodes file. The new option is only of limited use at the current state of development of osm2pgsql (2023-07-27), as the 'planet_osm_nodes' table cannot yet be used in osm2pgsql 'stage2' flex processing from within a LUA flex style file. This awaits further development of osm2pgsql. Nonetheless it was interesting to add this option to the command line to see if it had any impact on loading performance, which it apparently didn't, and is most likely due to the fact that only tagged nodes are loaded, significantly reducing the overall speed of node record insertion compared to storing all nodes in a --slim import without --flat-nodes (where all 8.5B+ nodes need to be loaded in the database), and thus not creating a bottleneck.

What is interesting to observe in comparison to the last non-slim run, is that after the commit that made relation loading faster in --slim mode, and with current Planet's ratio of ways:relations of about 100:1, the total time to load nodes, ways and relations is virtually identical in --slim + --flat-nodes and non-slim modes (+/- 5h). This is a coincidence, but kind of nice to see happening.

The slightly slower speed of nodes loading in --slim mode, and significantly (2.5x) slower relation loading, is compensated by a faster way loading (85k/s in --slim versus 70k/s in non-slim mode). The reason for the initially unexpected faster way loading in --slim mode versus the all-RAM non-slim mode, lies in the abundant RAM of the used hardware, and thus the --flat-nodes file being entirely cached in RAM, and the flat nodes file using a structure and non-compressed data format optimized for speed and not memory conservation like in non-slim mode.

This latest --slim run:

Output

2023-07-24 14:11:35  Reading input files done in 18693s ('''5h 11m 33s''').                     
2023-07-24 14:11:35    Processed 8518951560 nodes in 2126s (35m 26s) - '''4007k/s'''
2023-07-24 14:11:35    Processed 955223105 ways in 11190s (3h 6m 30s) - '''85k/s'''
2023-07-24 14:11:35    Processed 11128923 relations in 5377s (1h 29m 37s) - '''2k/s'''

The previous non-slim run:

Output

2023-07-19 14:48:55  Reading input files done in 17726s ('''4h 55m 26s''').                     
2023-07-19 14:48:55    Processed 8508054840 nodes in 1761s (29m 21s) - '''4831k/s'''
2023-07-19 14:48:55    Processed 953992088 ways in 13566s (3h 46m 6s) - '''70k/s'''
2023-07-19 14:48:55    Processed 11109244 relations in 2399s (39m 59s) - '''5k/s'''

Lastest Postgres config used in this run:

  • shared_buffers = 75GB
  • temp_buffers = 8000MB
  • work_mem = 4000MB
  • hash_mem_multiplier = 4.0
  • maintenance_work_mem = 16000MB
  • autovacuum_work_mem = 4000MB
  • wal_level = minimal
  • synchronous_commit = off
  • wal_compression = on
  • wal_buffers = 128MB
  • max_wal_senders = 0
  • checkpoint_timeout = 15min
  • checkpoint_completion_target = 0.9
  • max_wal_size = 150GB
  • min_wal_size = 10GB
  • sequential_page_cost = 1.0
  • random_page_cost = 1.0
  • parallel_tuple_cost = 0.025
  • parallel_setup_cost = 100
  • bgwriter_delay = 100ms
  • bgwriter_lru_maxpages = 200
  • bgwriter_lru_multiplier = 4.0
  • bgwriter_flush_after = 512kB
  • effective_cache_size = 400GB
  • effective_io_concurrency = 1000
  • maintenance_io_concurrency = 1000
  • max_worker_processes = 264
  • max_parallel_maintenance_workers = 22
  • max_parallel_workers_per_gather = 44
  • parallel_leader_participation = off
  • max_parallel_workers = 264

Command

osm2pgsql -O flex -d <DATABASE> -U <USER_NAME> --create -W -C 0 --slim --drop --flat-nodes '<FOLDER>nodes.bin' --middle-database-format new --middle-with-nodes --number-processes 32 -P 5433 -S 'osm2pgsql/flex-config/openstreetmap-carto.lua' 'data/planet-230717.osm.pbf'

Output

2023-07-24 08:57:36  osm2pgsql version 1.8.1 (1.8.1-183-g5dba2405)
2023-07-24 08:57:44  Database version: 15.3 (Ubuntu 15.3-1.pgdg22.04+1)
2023-07-24 08:57:44  PostGIS version: 3.3
2023-07-24 08:57:44  Storing properties to table 'osm2pgsql_properties'.
2023-07-24 14:11:35  Reading input files done in 18693s (5h 11m 33s).                     
2023-07-24 14:11:35    Processed 8518951560 nodes in 2126s (35m 26s) - 4007k/s
2023-07-24 14:11:35    Processed 955223105 ways in 11190s (3h 6m 30s) - 85k/s
2023-07-24 14:11:35    Processed 11128923 relations in 5377s (1h 29m 37s) - 2k/s
2023-07-24 14:11:35  Reprocess marked ways (stage 2)...
2023-07-24 14:11:50  Creating id index on table 'planet_osm_line'...
2023-07-24 14:14:13  Creating id index on table 'planet_osm_polygon'...
2023-07-24 14:19:18  Creating id index on table 'planet_osm_transport_line'...
2023-07-24 14:21:06  Creating id index on table 'planet_osm_transport_polygon'...
2023-07-24 14:21:07  Creating id index on table 'planet_osm_admin'...
2023-07-24 14:21:07  Creating id indexes took 571s (9m 31s)
2023-07-24 14:21:09  There are 2528460 ways to reprocess...
2023-07-24 14:34:29  Dropping table 'planet_osm_nodes'
2023-07-24 14:34:33  Table 'planet_osm_nodes' dropped in 3s
2023-07-24 14:34:33  Dropping table 'planet_osm_ways'
2023-07-24 14:35:02  Table 'planet_osm_ways' dropped in 29s
2023-07-24 14:35:02  Dropping table 'planet_osm_rels'
2023-07-24 14:35:03  Table 'planet_osm_rels' dropped in 1s
2023-07-24 14:35:03  Done postprocessing on table 'planet_osm_nodes' in 0s
2023-07-24 14:35:03  Done postprocessing on table 'planet_osm_ways' in 0s
2023-07-24 14:35:03  Done postprocessing on table 'planet_osm_rels' in 0s
2023-07-24 14:35:03  Clustering table 'planet_osm_point' by geometry...
2023-07-24 14:35:03  Clustering table 'planet_osm_line' by geometry...
2023-07-24 14:35:14  No indexes to create on table 'planet_osm_route'.
2023-07-24 14:35:14  Clustering table 'planet_osm_transport_polygon' by geometry...
2023-07-24 14:35:14  Clustering table 'planet_osm_polygon' by geometry...
2023-07-24 14:35:14  Clustering table 'planet_osm_admin' by geometry...
2023-07-24 14:35:14  Clustering table 'planet_osm_waterway' by geometry...
2023-07-24 14:35:14  Clustering table 'planet_osm_transport_line' by geometry...
2023-07-24 14:35:14  Analyzing table 'planet_osm_route'...
2023-07-24 14:35:20  Creating index on table 'planet_osm_transport_polygon' ("way")...
2023-07-24 14:35:21  Analyzing table 'planet_osm_transport_polygon'...
2023-07-24 14:35:41  Creating index on table 'planet_osm_waterway' ("way")...
2023-07-24 14:35:43  Analyzing table 'planet_osm_waterway'...
2023-07-24 14:36:01  Creating index on table 'planet_osm_admin' ("way")...
2023-07-24 14:36:06  Analyzing table 'planet_osm_admin'...
2023-07-24 15:11:23  Creating index on table 'planet_osm_point' ("way")...
2023-07-24 15:17:45  Analyzing table 'planet_osm_point'...
2023-07-24 15:17:51  All postprocessing on table 'planet_osm_point' done in 2568s (42m 48s).
2023-07-24 15:41:26  Creating index on table 'planet_osm_transport_line' ("way")...
2023-07-24 15:50:40  Analyzing table 'planet_osm_transport_line'...
2023-07-24 16:01:10  Creating index on table 'planet_osm_line' ("way")...
2023-07-24 16:14:52  Analyzing table 'planet_osm_line'...
2023-07-24 16:14:59  All postprocessing on table 'planet_osm_line' done in 5995s (1h 39m 55s).
2023-07-24 16:45:48  Creating index on table 'planet_osm_polygon' ("way")...
2023-07-24 17:16:03  Analyzing table 'planet_osm_polygon'...
2023-07-24 17:16:08  All postprocessing on table 'planet_osm_polygon' done in 9664s (2h 41m 4s).
2023-07-24 17:16:08  All postprocessing on table 'planet_osm_transport_line' done in 4544s (1h 15m 44s).
2023-07-24 17:16:08  All postprocessing on table 'planet_osm_transport_polygon' done in 20s.
2023-07-24 17:16:08  All postprocessing on table 'planet_osm_route' done in 21s.
2023-07-24 17:16:08  All postprocessing on table 'planet_osm_admin' done in 66s (1m 6s).
2023-07-24 17:16:08  All postprocessing on table 'planet_osm_waterway' done in 45s.
2023-07-24 17:16:09  Storing properties to table 'osm2pgsql_properties'.
2023-07-24 17:16:09  osm2pgsql took 29905s (8h 18m 25s) overall.

Planet (75.3GB PBF, 2023-07-31) on HP Z840 workstation, 512GB DDR4 2133, 2x Xeon E5-2699v4 22C, 5x Samsung 970 EVO Plus 2TB NVMe RAID 0, 4x Intel OPTANE P1600X 110GB NVMe RAID 0, Ubuntu 22.04 (pgsql | non-slim) ~7h

Software

This run and the next run are both based on osm2pgsql "old" 'pgsql' output. 'pgsql' output will eventually be deprecated and removed from osm2pgsql, but since most other benchmarks presented on this page are based on it, it is good to add a true comparison, as all other tests up to now with the HP Z840 used the 'flex' output option, which uses a different processing pipeline in osm2pgsql. The style used is the current official one of openstreetmap-carto.

Note that although it appears that 'pgsql' output is faster than the new 'flex' output, it has to be taken into account that the custom flex style used in the other 'flex' output based runs is considerably more complicated than the style used for these 'pgsql' output results, and produces more tables and a larger database, than the pgsql style. E.g., the creation of the 'planet_osm_transport_line' table, that duplicates all ways tagged with highway or railway tags into a separate large table next to classic 'planet_osm_line', means many ways defined in the PBF have to be written twice to the database. This may well explain all, or at least part, of the reduced performance figures for way processing in 'flex' output. Other more advanced processing, including 'stage 2' flex processing, is implemented as well, requiring additional processing time as well compared to classic pgsql output.

Data is from a non-slim run.

Command

osm2pgsql -d <DATABASE> -U <USER_NAME> --create --hstore --hstore-add-index -W -l -G --number-processes 32 --style 'osm2pgsql/openstreetmap-carto.style' --tag-transform-script 'osm2pgsql/openstreetmap-carto.lua' 'data/planet-230731.osm.pbf'

Output

2023-08-06 23:27:45  osm2pgsql version 1.8.1 (1.8.1-205-g93c35831)
2023-08-06 23:27:53  Database version: 15.3 (Ubuntu 15.3-1.pgdg22.04+1)
2023-08-06 23:27:53  PostGIS version: 3.3
2023-08-06 23:27:53  Storing properties to table 'osm2pgsql_properties'.
2023-08-06 23:27:53  Setting up table 'planet_osm_point'
2023-08-06 23:27:58  Setting up table 'planet_osm_line'
2023-08-06 23:28:23  Setting up table 'planet_osm_polygon'
2023-08-06 23:29:56  Setting up table 'planet_osm_roads'
2023-08-07 03:44:46  Reading input files done in 15290s (4h 14m 50s).                     
2023-08-07 03:44:46    Processed 8541000694 nodes in 1572s (26m 12s) - 5433k/s
2023-08-07 03:44:46    Processed 957810113 ways in 11405s (3h 10m 5s) - 84k/s
2023-08-07 03:44:46    Processed 11155881 relations in 2313s (38m 33s) - 5k/s
2023-08-07 03:44:46  Clustering table 'planet_osm_point' by geometry...
2023-08-07 03:44:46  Clustering table 'planet_osm_line' by geometry...
2023-08-07 03:44:46  Clustering table 'planet_osm_polygon' by geometry...
2023-08-07 03:44:46  Clustering table 'planet_osm_roads' by geometry...
2023-08-07 03:51:20  Creating geometry index on table 'planet_osm_roads'...
2023-08-07 03:53:38  Creating hstore indexes on table 'planet_osm_roads'...
2023-08-07 03:56:15  Creating geometry index on table 'planet_osm_point'...
2023-08-07 04:02:00  Analyzing table 'planet_osm_roads'...
2023-08-07 04:04:29  Creating hstore indexes on table 'planet_osm_point'...
2023-08-07 04:20:04  Creating geometry index on table 'planet_osm_line'...
2023-08-07 04:34:25  Creating hstore indexes on table 'planet_osm_line'...
2023-08-07 04:45:54  Creating geometry index on table 'planet_osm_polygon'...
2023-08-07 04:59:43  Analyzing table 'planet_osm_line'...
2023-08-07 05:03:43  Analyzing table 'planet_osm_point'...
2023-08-07 05:03:52  All postprocessing on table 'planet_osm_point' done in 4745s (1h 19m 5s).
2023-08-07 05:03:52  All postprocessing on table 'planet_osm_line' done in 4501s (1h 15m 1s).
2023-08-07 05:20:21  Creating hstore indexes on table 'planet_osm_polygon'...
2023-08-07 06:17:25  Analyzing table 'planet_osm_polygon'...
2023-08-07 06:17:30  All postprocessing on table 'planet_osm_polygon' done in 9164s (2h 32m 44s).
2023-08-07 06:17:30  All postprocessing on table 'planet_osm_roads' done in 1038s (17m 18s).
2023-08-07 06:17:30  Storing properties to table 'osm2pgsql_properties'.
2023-08-07 06:17:31  osm2pgsql took 24577s (6h 49m 37s) overall.

Planet (75.3GB PBF, 2023-07-31) on HP Z840 workstation, 512GB DDR4 2133, 2x Xeon E5-2699v4 22C, 5x Samsung 970 EVO Plus 2TB NVMe RAID 0, 4x Intel OPTANE P1600X 110GB NVMe RAID 0, Ubuntu 22.04 (pgsql | --slim) ~7.5h

Software

Same as last run above, also osm2pgsql 'pgsql' output, but with --slim and --flat-nodes.

Command

osm2pgsql -d <DATABASE -U <USER_NAME> --create --hstore --hstore-add-index -W -l -G --slim --drop --flat-nodes '<FOLDER>nodes.bin' --number-processes 32 --style 'osm2pgsql/openstreetmap-carto.style' --tag-transform-script 'osm2pgsql/openstreetmap-carto.lua' 'data/planet-230731.osm.pbf' -C 0

Output

2023-08-07 09:44:23  osm2pgsql version 1.8.1 (1.8.1-205-g93c35831)
2023-08-07 09:44:31  Database version: 15.3 (Ubuntu 15.3-1.pgdg22.04+1)
2023-08-07 09:44:31  PostGIS version: 3.3
2023-08-07 09:44:31  Storing properties to table 'osm2pgsql_properties'.
2023-08-07 09:44:31  Setting up table 'planet_osm_point'
2023-08-07 09:44:43  Setting up table 'planet_osm_line'
2023-08-07 09:45:09  Setting up table 'planet_osm_polygon'
2023-08-07 09:46:35  Setting up table 'planet_osm_roads'
2023-08-07 14:43:11  Reading input files done in 17793s (4h 56m 33s).                     
2023-08-07 14:43:11    Processed 8541000694 nodes in 1632s (27m 12s) - 5233k/s
2023-08-07 14:43:11    Processed 957810113 ways in 9689s (2h 41m 29s) - 99k/s
2023-08-07 14:43:11    Processed 11155881 relations in 6472s (1h 47m 52s) - 2k/s
2023-08-07 14:43:49  Dropping table 'planet_osm_nodes'
2023-08-07 14:43:49  Table 'planet_osm_nodes' dropped in 0s
2023-08-07 14:43:49  Dropping table 'planet_osm_ways'
2023-08-07 14:45:05  Table 'planet_osm_ways' dropped in 76s (1m 16s)
2023-08-07 14:45:05  Dropping table 'planet_osm_rels'
2023-08-07 14:45:08  Table 'planet_osm_rels' dropped in 3s
2023-08-07 14:45:08  Done postprocessing on table 'planet_osm_nodes' in 0s
2023-08-07 14:45:08  Done postprocessing on table 'planet_osm_ways' in 0s
2023-08-07 14:45:08  Done postprocessing on table 'planet_osm_rels' in 0s
2023-08-07 14:45:08  Clustering table 'planet_osm_point' by geometry...
2023-08-07 14:45:08  Clustering table 'planet_osm_line' by geometry...
2023-08-07 14:45:08  Clustering table 'planet_osm_polygon' by geometry...
2023-08-07 14:45:08  Clustering table 'planet_osm_roads' by geometry...
2023-08-07 14:50:50  Creating geometry index on table 'planet_osm_roads'...
2023-08-07 14:52:17  Creating hstore indexes on table 'planet_osm_roads'...
2023-08-07 14:56:04  Creating geometry index on table 'planet_osm_point'...
2023-08-07 15:00:40  Analyzing table 'planet_osm_roads'...
2023-08-07 15:04:15  Creating hstore indexes on table 'planet_osm_point'...
2023-08-07 15:20:01  Creating geometry index on table 'planet_osm_line'...
2023-08-07 15:34:30  Creating hstore indexes on table 'planet_osm_line'...
2023-08-07 15:46:54  Creating geometry index on table 'planet_osm_polygon'...
2023-08-07 16:00:26  Analyzing table 'planet_osm_line'...
2023-08-07 16:04:52  Analyzing table 'planet_osm_point'...
2023-08-07 16:05:01  All postprocessing on table 'planet_osm_point' done in 4792s (1h 19m 52s).
2023-08-07 16:05:01  All postprocessing on table 'planet_osm_line' done in 4522s (1h 15m 22s).
2023-08-07 16:21:10  Creating hstore indexes on table 'planet_osm_polygon'...
2023-08-07 17:18:11  Analyzing table 'planet_osm_polygon'...
2023-08-07 17:18:16  All postprocessing on table 'planet_osm_polygon' done in 9187s (2h 33m 7s).
2023-08-07 17:18:16  All postprocessing on table 'planet_osm_roads' done in 936s (15m 36s).
2023-08-07 17:18:16  Storing properties to table 'osm2pgsql_properties'.
2023-08-07 17:18:16  osm2pgsql took 27225s (7h 33m 45s) overall.

Facebook Daylight v1.32 (149.3GB PBF, 2023-10-11) on HP Z840 workstation, 512GB DDR4 2133, 2x Xeon E5-2699v4 22C, 5x Samsung 970 EVO Plus 2TB NVMe RAID 0, 4x Intel OPTANE P1600X 110GB NVMe RAID 0, Ubuntu 22.04 (flex | --slim) ~14h

Software

  • osm2pgsql 1.9.2
  • PostgreSQL 15.4
  • PostGIS 3.4

Import based on a file containing the Facebook Daylight map distribution Planet PBF file + all the machine learning buildings and administrative boundaries of the v1.32 release, as combined with osmium.

Command

osm2pgsql -O flex -d <DATABASE -U <USER_NAME> --middle-database-format new --schema 'osm' --create -W -C 0 --slim --drop --flat-nodes '<FOLDER>/nodes.bin' --number-processes 32 -P 5433 -S 'osm2pgsql/flex-config/openstreetmap-carto.lua' 'data/daylight-all-v1.32.osm.pbf'

Output

2023-10-13 22:46:33  osm2pgsql version 1.9.2 (1.8.1-319-gbf00c512)
2023-10-13 22:46:41  Database version: 15.4 (Ubuntu 15.4-2.pgdg22.04+1)
2023-10-13 22:46:41  PostGIS version: 3.4
2023-10-13 22:46:41  Storing properties to table '"osm"."osm2pgsql_properties"'.
2023-10-13 22:46:53  WARNING: The add_row() function is deprecated. Please read
2023-10-13 22:46:53  WARNING: https://osm2pgsql.org/doc/tutorials/switching-from-add-row-to-insert/
2023-10-14 07:16:46  Reading input files done in 30594s (8h 29m 54s).                     
2023-10-14 07:16:46    Processed 12427937505 nodes in 2070s (34m 30s) - 6004k/s
2023-10-14 07:16:46    Processed 1779505576 ways in 22040s (6h 7m 20s) - 81k/s
2023-10-14 07:16:46    Processed 10383409 relations in 6484s (1h 48m 4s) - 2k/s
2023-10-14 07:16:46  Reprocess marked ways (stage 2)...
2023-10-14 07:17:00  Creating id index on table 'planet_osm_line'...
2023-10-14 07:19:27  Creating id index on table 'planet_osm_polygon'...
2023-10-14 07:30:32  Creating id index on table 'planet_osm_transport_line'...
2023-10-14 07:32:15  Creating id index on table 'planet_osm_transport_polygon'...
2023-10-14 07:32:15  Creating id index on table 'planet_osm_admin'...
2023-10-14 07:32:15  Creating id indexes took 928s (15m 28s)
2023-10-14 07:32:18  There are 2585439 ways to reprocess...
2023-10-14 07:42:35  Dropping table 'planet_osm_nodes'
2023-10-14 07:42:35  Table 'planet_osm_nodes' dropped in 0s
2023-10-14 07:42:35  Dropping table 'planet_osm_ways'
2023-10-14 07:43:26  Table 'planet_osm_ways' dropped in 51s
2023-10-14 07:43:26  Dropping table 'planet_osm_rels'
2023-10-14 07:43:28  Table 'planet_osm_rels' dropped in 1s
2023-10-14 07:43:28  Done postprocessing on table 'planet_osm_nodes' in 0s
2023-10-14 07:43:28  Done postprocessing on table 'planet_osm_ways' in 0s
2023-10-14 07:43:28  Done postprocessing on table 'planet_osm_rels' in 0s
2023-10-14 07:43:30  No indexes to create on table 'planet_osm_route'.
2023-10-14 07:43:30  Analyzing table 'planet_osm_route'...
2023-10-14 07:43:39  Clustering table 'planet_osm_waterway' by geometry...
2023-10-14 07:43:39  Clustering table 'planet_osm_polygon' by geometry...
2023-10-14 07:43:39  Clustering table 'planet_osm_transport_polygon' by geometry...
2023-10-14 07:43:39  Clustering table 'planet_osm_line' by geometry...
2023-10-14 07:43:39  Clustering table 'planet_osm_transport_line' by geometry...
2023-10-14 07:43:39  Clustering table 'planet_osm_admin' by geometry...
2023-10-14 07:43:39  Clustering table 'planet_osm_point' by geometry...
2023-10-14 07:43:46  Creating index on table 'planet_osm_transport_polygon' ("way")...
2023-10-14 07:43:48  Analyzing table 'planet_osm_transport_polygon'...
2023-10-14 07:44:05  Creating index on table 'planet_osm_waterway' ("way")...
2023-10-14 07:44:06  Analyzing table 'planet_osm_waterway'...
2023-10-14 07:44:25  Creating index on table 'planet_osm_admin' ("way")...
2023-10-14 07:44:30  Analyzing table 'planet_osm_admin'...
2023-10-14 08:17:19  Creating index on table 'planet_osm_point' ("way")...
2023-10-14 08:24:13  Analyzing table 'planet_osm_point'...
2023-10-14 08:24:19  All postprocessing on table 'planet_osm_point' done in 2451s (40m 51s).
2023-10-14 08:37:37  Creating index on table 'planet_osm_transport_line' ("way")...
2023-10-14 08:47:59  Analyzing table 'planet_osm_transport_line'...
2023-10-14 08:55:09  Creating index on table 'planet_osm_line' ("way")...
2023-10-14 09:06:59  Analyzing table 'planet_osm_line'...
2023-10-14 09:07:05  All postprocessing on table 'planet_osm_line' done in 5016s (1h 23m 36s).
2023-10-14 11:24:30  Creating index on table 'planet_osm_polygon' ("way")...
2023-10-14 12:36:08  Analyzing table 'planet_osm_polygon'...
2023-10-14 12:36:15  All postprocessing on table 'planet_osm_polygon' done in 17566s (4h 52m 46s).
2023-10-14 12:36:15  All postprocessing on table 'planet_osm_transport_line' done in 3877s (1h 4m 37s).
2023-10-14 12:36:15  All postprocessing on table 'planet_osm_transport_polygon' done in 23s.
2023-10-14 12:36:15  All postprocessing on table 'planet_osm_route' done in 11s.
2023-10-14 12:36:15  All postprocessing on table 'planet_osm_admin' done in 135s (2m 15s).
2023-10-14 12:36:15  All postprocessing on table 'planet_osm_waterway' done in 42s.
2023-10-14 12:36:16  Storing properties to table '"osm"."osm2pgsql_properties"'.
2023-10-14 12:36:16  osm2pgsql took 49774s (13h 49m 34s) overall.

Facebook Daylight v1.38 (234.2GB PBF, 2024-01-17) on HP Z840 workstation, 512GB DDR4 2133, 2x Xeon E5-2699v4 22C, 5x Samsung 970 EVO Plus 2TB NVMe RAID 0, 2x Intel OPTANE 905P 960GB NVMe RAID 0, Ubuntu 22.04 (flex | --slim) ~22.5h

Software

  • osm2pgsql 1.10.0
  • PostgreSQL 16.1
  • PostGIS 3.4

Import based on a file containing the Facebook Daylight map distribution Planet PBF file + all the machine learning buildings and administrative boundaries of the v1.38 release, including the new Google Open Buildings integration by Facebook. All combined with osmium. The resulting 'planet_osm_polygon' table has 2.448B records, including approximately 1.8B buildings.

NOTE: This is un updated import that replaces the first Google Open Buildings integration of Facebook Daylight v1.37 benchmark posted recently. That version of Daylight turned out to have a major issue, likely duplicate IDs, that was revealed after further examining the imported data, and that caused the nodes of buildings to be merged with other features, causing malformed geometries and the loss of approximately 900M building records in the main 'planet_osm_polygon' table. A similar issue, but in that case involving country boundary relations, affected Daylight v1.32. The new successfull import of v1.38 therefor now features a much larger Polygon table with over 2.4B records, and consequently longer import time due to the considerably larger output database.

This import used a changed hardware configuration compared to the last v1.32 Facebook Daylight import: it swapped out two of the Intel OPTANE P1600X 110GB drives for larger Intel OPTANE 905P drives, which finally have become affordable (but are nowhere near the performance of the newer PCIe4 P5800X OPTANE drives). Benchmarking using CrystalDiskMark has shown these drives to be only marginally faster on IOPS and QD1T1 over the P1600X (maybe 5%), but they have significantly higher sequential read/write and much higher drive capacity (960GB). Just like the P1600Xs, these 905P drives do show considerably higher random read IOPS and performance compared to the Samsung 970 EVO Plus RAID at low queue depth (about 2.5x higher). The random low queue depth write IOPS aren't much higher though than regular PCIe NVMe drives. As a consequence of all this, the benefits of the OPTANE drives for the specific processing of osm2pgsql seem limited over ordinary NVMe drives, and regular newer PCIe 4 and 5 drives are probably a better choice.

The main reason to try this new setup was also not so much to enhance the speed of the osm2pgsql import, which only marginally seemed to benefit from the new drives if at all, but to allow storing derived data tables and especially database indexes that will be created in subsequent processing on OPTANE. The planned additional processsing after the import does rely on high random read / write IO and heavy indexing of the tables, and could potentially benefit from having all indexes stored on OPTANE, although testing of this is ongoing. This however, was the main drive to try out this new setup with the 905Ps, not so much to enhance the osm2pgsql import speed.

Command

osm2pgsql -O flex -d <DATABASE -U <USER_NAME> --middle-database-format new --schema 'osm' --tablespace-slim-data '<TABLESPACE_ON_INTEL_OPTANE_RAID>' --tablespace-slim-index '<TABLESPACE_ON_INTEL_OPTANE_RAID>' --create -W -C 0 --slim --drop --flat-nodes '<FOLDER_ON_INTEL_OPTANE_RAID>/nodes.bin' --number-processes 32 -P 5433 -S 'osm2pgsql/flex-config/openstreetmap-carto.lua' '<FOLDER_ON_SAMSUNG_RAID>/daylight-all-v1.38.osm.pbf'

Output

2024-01-25 07:48:23  osm2pgsql version 1.10.0 (1.8.1-345-gfbdfef85)
Password:
2024-01-25 07:48:42  Database version: 16.1 (Ubuntu 16.1-1.pgdg22.04+1)
2024-01-25 07:48:42  PostGIS version: 3.4
2024-01-25 07:48:42  Storing properties to table '"osm"."osm2pgsql_properties"'.
2024-01-25 07:49:16  WARNING: The add_row() function is deprecated. Please read
2024-01-25 07:49:16  WARNING: https://osm2pgsql.org/doc/tutorials/switching-from-add-row-to-insert/
2024-01-25 20:51:44  Reading input files done in 46948s (13h 2m 28s).                     
2024-01-25 20:51:44    Processed 16681096063 nodes in 2584s (43m 4s) - 6456k/s
2024-01-25 20:51:44    Processed 2761636064 ways in 38320s (10h 38m 40s) - 72k/s
2024-01-25 20:51:44    Processed 10704410 relations in 6044s (1h 40m 44s) - 2k/s
2024-01-25 20:51:44  Reprocess marked ways (stage 2)...
2024-01-25 20:52:02  Creating id index on table 'planet_osm_line'...
2024-01-25 20:55:05  Creating id index on table 'planet_osm_polygon'...
2024-01-25 21:16:04  Creating id index on table 'planet_osm_transport_line'...
2024-01-25 21:18:09  Creating id index on table 'planet_osm_transport_polygon'...
2024-01-25 21:18:10  Creating id index on table 'planet_osm_admin'...
2024-01-25 21:18:10  Creating id indexes took 1585s (26m 25s)
2024-01-25 21:18:33  There are 2760896 ways to reprocess...
2024-01-25 21:29:41  Dropping table 'planet_osm_nodes'
2024-01-25 21:29:41  Table 'planet_osm_nodes' dropped in 0s
2024-01-25 21:29:41  Dropping table 'planet_osm_ways'
2024-01-25 21:31:04  Table 'planet_osm_ways' dropped in 82s (1m 22s)
2024-01-25 21:31:04  Dropping table 'planet_osm_rels'
2024-01-25 21:31:05  Table 'planet_osm_rels' dropped in 1s
2024-01-25 21:31:05  Done postprocessing on table 'planet_osm_nodes' in 0s
2024-01-25 21:31:05  Done postprocessing on table 'planet_osm_ways' in 0s
2024-01-25 21:31:05  Done postprocessing on table 'planet_osm_rels' in 0s
2024-01-25 21:31:05  Clustering table 'planet_osm_point' by geometry...
2024-01-25 21:31:06  Clustering table 'planet_osm_line' by geometry...
2024-01-25 21:31:22  Clustering table 'planet_osm_polygon' by geometry...
2024-01-25 21:31:22  No indexes to create on table 'planet_osm_route'.
2024-01-25 21:31:22  Clustering table 'planet_osm_transport_line' by geometry...
2024-01-25 21:31:22  Clustering table 'planet_osm_admin' by geometry...
2024-01-25 21:31:22  Analyzing table 'planet_osm_route'...
2024-01-25 21:31:22  Clustering table 'planet_osm_transport_polygon' by geometry...
2024-01-25 21:31:22  Clustering table 'planet_osm_waterway' by geometry...
2024-01-25 21:31:30  Creating index on table 'planet_osm_transport_polygon' ("way")...
2024-01-25 21:31:32  Analyzing table 'planet_osm_transport_polygon'...
2024-01-25 21:31:53  Creating index on table 'planet_osm_waterway' ("way")...
2024-01-25 21:31:54  Analyzing table 'planet_osm_waterway'...
2024-01-25 21:32:11  Creating index on table 'planet_osm_admin' ("way")...
2024-01-25 21:32:18  Analyzing table 'planet_osm_admin'...
2024-01-25 22:10:53  Creating index on table 'planet_osm_point' ("way")...
2024-01-25 22:19:20  Analyzing table 'planet_osm_point'...
2024-01-25 22:19:27  All postprocessing on table 'planet_osm_point' done in 2901s (48m 21s).
2024-01-25 22:33:38  Creating index on table 'planet_osm_transport_line' ("way")...
2024-01-25 22:45:56  Analyzing table 'planet_osm_transport_line'...
2024-01-25 22:52:20  Creating index on table 'planet_osm_line' ("way")...
2024-01-25 23:07:02  Analyzing table 'planet_osm_line'...
2024-01-25 23:07:10  All postprocessing on table 'planet_osm_line' done in 5764s (1h 36m 4s).
2024-01-26 04:17:54  Creating index on table 'planet_osm_polygon' ("way")...
2024-01-26 06:19:10  Analyzing table 'planet_osm_polygon'...
2024-01-26 06:19:20  All postprocessing on table 'planet_osm_polygon' done in 31694s (8h 48m 14s).
2024-01-26 06:19:20  All postprocessing on table 'planet_osm_transport_line' done in 4497s (1h 14m 57s).
2024-01-26 06:19:20  All postprocessing on table 'planet_osm_transport_polygon' done in 30s.
2024-01-26 06:19:20  All postprocessing on table 'planet_osm_route' done in 30s.
2024-01-26 06:19:20  All postprocessing on table 'planet_osm_admin' done in 77s (1m 17s).
2024-01-26 06:19:20  All postprocessing on table 'planet_osm_waterway' done in 55s.
2024-01-26 06:19:21  Storing properties to table '"osm"."osm2pgsql_properties"'.
2024-01-26 06:19:22  osm2pgsql took 81039s (22h 30m 39s) overall.

Planet (PBF, 2022-10-31) on AWS EC2 m6gd.16xlarge, 64 ARM vCPU, 256 GiB, 2x 1900GB NVMe ~13.5h

CPU: 64 ARM vCPU Memory: 256 GiB Storage: 2*1900GB NVMe SSD osm2pgsql version 1.7.1 Database version: 14.5 (Ubuntu 14.5-2.pgdg22.04+2) PostGIS version: 3.3

sudo -u tile osm2pgsql --slim --database=gis \
--cache=0 --flat-nodes=/store/database/nodes \
--number-processes=20 --multi-geometry --hstore \
--style=/srv/tile.openstreetmap.org/styles/default/openstreetmap-carto.style \
--tag-transform-script=/srv/tile.openstreetmap.org/styles/default/openstreetmap-carto.lua \
/store/tmp/planet-221031.osm.pbf
2022-11-10 14:55:14  Reading input files done in 32396s (8h 59m 56s).
2022-11-10 14:55:14    Processed 8006526349 nodes in 4680s (1h 18m 0s) - 1711k/s
2022-11-10 14:55:14    Processed 897621157 ways in 23333s (6h 28m 53s) - 38k/s
2022-11-10 14:55:14    Processed 10353417 relations in 4383s (1h 13m 3s) - 2k/s
2022-11-10 18:12:46  Done postprocessing on table 'planet_osm_ways' in 11848s (3h 17m 28s)
2022-11-10 18:12:46  Done postprocessing on table 'planet_osm_rels' in 245s (4m 5s)
2022-11-10 18:12:46  All postprocessing on table 'planet_osm_point' done in 3681s (1h 1m 21s).
2022-11-10 18:12:46  All postprocessing on table 'planet_osm_line' done in 6570s (1h 49m 30s).
2022-11-10 19:30:03  All postprocessing on table 'planet_osm_polygon' done in 16488s (4h 34m 48s).
2022-11-10 19:30:03  All postprocessing on table 'planet_osm_roads' done in 503s (8m 23s).
2022-11-10 19:30:03  osm2pgsql took 48886s (13h 34m 46s) overall.

Tuning:

shared_buffers = 8GB
maintenance_work_mem = 7144MB
work_mem = 128MB
max_wal_size = 2880MB
min_wal_size = 80MB
effective_cache_size = 16GB

Planet (PBF) on Ubuntu 20.04, Xeon E3-1245 v2, 32GB + 72GB swap, 4C/8T, 2x 480GB SSD RAID 0 ~25h

Xeon(R) CPU E3-1245 V2 @ 3.40GHz
RAM: 32GB
RAID-0 2x SSD 480GB INTEL SSDSC2BB48 
72GB SWAP

osm2pgsql version 1.2.1 (64 bit id space)
Postgresql 13

Time: 88528s (<25h)

Postgres config:

wal_level=minimal
hot_standby=off
max_wal_senders=0
checkpoint_timeout=1d
checkpoint_completion_target=0.90
max_parallel_workers_per_gather=3
max_wal_size = 10GB
min_wal_size = 1GB
shared_buffers = 8GB
dynamic_shared_memory_type = posix 

osm2pgsql -d XXX -U XXX  --create --slim --flat-nodes ./nodes --multi-geometry  -G --hstore  \
       --tag-transform-script ~/src/openstreetmap-carto/openstreetmap-carto.lua \
       -C 72000 --number-processes 2 \
       -S ~/src/openstreetmap-carto/openstreetmap-carto.style ./planet-latest.osm.pbf

....
Reading in file: ./planet-latest.osm.pbf
Processing: Node(7086449k 1608.7k/s) Way(787322k 16.38k/s) Relation(9109670 473.55/s)  parse time: 71718s
Node stats: total(7086449253), max(8931477723) in 4405s
Way stats: total(787322904), max(965461232) in 48076s
Relation stats: total(9109683), max(12991924) in 19237s
...
All indexes on planet_osm_point created in 5429s
...
All indexes on planet_osm_roads created in 754s
...
Stopped table: planet_osm_rels in 200s
Osm2pgsql took 88528s overall
node cache: stored: 7086449253(100.00%), storage efficiency: 83.69% (dense blocks: 972608, sparse nodes: 249971189), hit rate: 100.00%

Planet (PBF, 2022-06-21) on Windows 10, 64 GB RAM, AMD Ryzen 9 3900X 12C/24T CPU, 1 TB NVMe SSD ~18h

System:

  • OS: Windows 10 21H2
  • RAM: Corsair Vengeance LPX 64 GB (4 x 16 GB @ 2133 MHz)
  • CPU: AMD Ryzen 9 3900X (12C/24T @ 3.8 GHz)
  • SSD: Samsung 970 Evo 1 TB (800 GB free space available)

Notes:

Command:

C:\osm\bin\osm2pgsql\osm2pgsql.exe --verbose --disable-parallel-indexing --slim --drop --flat-nodes=C:\osm\logs\flat-nodes.bin --multi-geometry --hstore --database=gis --user=postgres --password --tag-transform-script=C:\osm\scripts\openstreetmap-carto\openstreetmap-carto.lua --style=C:\osm\scripts\openstreetmap-carto\openstreetmap-carto.style C:\osm\pbf\planet.osm.pbf

Output:

2022-06-21 22:52:57  osm2pgsql version 1.6.0
2022-06-21 22:53:02  [0] Database version: 14.4
2022-06-21 22:53:02  [0] PostGIS version: 3.2
2022-06-21 22:53:02  [0] Reading file: C:\osm\pbf\planet.osm.pbf
2022-06-21 22:53:02  [0] Started pool with 1 threads.
2022-06-21 22:53:02  [0] Loading persistent node cache from 'C:\osm\logs\flat-nodes.bin'.
2022-06-21 22:53:02  [0] Mid: pgsql, cache=800
2022-06-21 22:53:02  [0] Setting up table 'planet_osm_nodes'
2022-06-21 22:53:02  [0] Setting up table 'planet_osm_ways'
2022-06-21 22:53:02  [0] Setting up table 'planet_osm_rels'
2022-06-21 22:53:02  [0] Using projection SRS 3857 (Spherical Mercator)
2022-06-21 22:53:02  [0] Using lua based tag transformations with script C:\osm\scripts\openstreetmap-carto\openstreetmap-carto.lua
2022-06-21 22:53:02  [0] Setting up table 'planet_osm_point'
2022-06-21 22:53:02  [0] Setting up table 'planet_osm_line'
2022-06-21 22:53:02  [0] Setting up table 'planet_osm_polygon'
2022-06-21 22:53:02  [0] Setting up table 'planet_osm_roads'
2022-06-22 08:17:37  [0] Reading input files done in 33875s (9h 24m 35s).
2022-06-22 08:17:37  [0]   Processed 7741381450 nodes in 2982s (49m 42s) - 2596k/s
2022-06-22 08:17:37  [0]   Processed 866111767 ways in 19739s (5h 28m 59s) - 44k/s
2022-06-22 08:17:37  [0]   Processed 9979763 relations in 11154s (3h 5m 54s) - 895/s
2022-06-22 08:17:42  [0] Removing persistent node cache at 'C:\osm\logs\flat-nodes.bin'.
2022-06-22 08:17:50  [0] Dropping table 'planet_osm_nodes'
2022-06-22 08:17:50  [0] Table 'planet_osm_nodes' dropped in 0s
2022-06-22 08:17:50  [0] Dropping table 'planet_osm_ways'
2022-06-22 08:18:04  [0] Table 'planet_osm_ways' dropped in 14s
2022-06-22 08:18:04  [0] Dropping table 'planet_osm_rels'
2022-06-22 08:18:06  [0] Table 'planet_osm_rels' dropped in 2s
2022-06-22 08:18:06  [0] Done postprocessing on table 'planet_osm_nodes' in 0s
2022-06-22 08:18:06  [0] Done postprocessing on table 'planet_osm_ways' in 0s
2022-06-22 08:18:06  [0] Done postprocessing on table 'planet_osm_rels' in 0s
2022-06-22 08:18:06  [1] Starting task...
2022-06-22 08:18:06  [1] Clustering table 'planet_osm_point' by geometry...
2022-06-22 08:18:06  [1] Using native order for clustering table 'planet_osm_point'
2022-06-22 08:29:33  [1] Creating geometry index on table 'planet_osm_point'...
2022-06-22 09:00:27  [1] Analyzing table 'planet_osm_point'...
2022-06-22 09:00:28  [1] Done task in 2542096ms.
2022-06-22 09:00:28  [1] Starting task...
2022-06-22 09:00:28  [0] All postprocessing on table 'planet_osm_point' done in 2542s (42m 22s).
2022-06-22 09:00:28  [1] Clustering table 'planet_osm_line' by geometry...
2022-06-22 09:00:28  [1] Using native order for clustering table 'planet_osm_line'
2022-06-22 09:52:27  [1] Creating geometry index on table 'planet_osm_line'...
2022-06-22 10:58:13  [1] Analyzing table 'planet_osm_line'...
2022-06-22 10:58:16  [1] Done task in 7067928ms.
2022-06-22 10:58:16  [1] Starting task...
2022-06-22 10:58:16  [0] All postprocessing on table 'planet_osm_line' done in 7067s (1h 57m 47s).
2022-06-22 10:58:16  [1] Clustering table 'planet_osm_polygon' by geometry...
2022-06-22 10:58:16  [1] Using native order for clustering table 'planet_osm_polygon'
2022-06-22 13:43:53  [1] Creating geometry index on table 'planet_osm_polygon'...
2022-06-22 16:46:19  [1] Analyzing table 'planet_osm_polygon'...
2022-06-22 16:46:25  [1] Done task in 20886325ms.
2022-06-22 16:46:25  [1] Starting task...
2022-06-22 16:46:25  [0] All postprocessing on table 'planet_osm_polygon' done in 20886s (5h 48m 6s).
2022-06-22 16:46:25  [1] Clustering table 'planet_osm_roads' by geometry...
2022-06-22 16:46:25  [1] Using native order for clustering table 'planet_osm_roads'
2022-06-22 16:52:33  [1] Creating geometry index on table 'planet_osm_roads'...
2022-06-22 16:55:31  [1] Analyzing table 'planet_osm_roads'...
2022-06-22 16:55:31  [1] Done task in 545890ms.
2022-06-22 16:55:31  [0] All postprocessing on table 'planet_osm_roads' done in 545s (9m 5s).
2022-06-22 16:55:31  [0] osm2pgsql took 64949s (18h 2m 29s) overall.

Planet (119GB BZIP2, 2021-09-02) on Ubuntu Linux 128 GB RAM, AMD Ryzen 9 5950X 16C/32T CPU, 2 TB NVMe SSD, PG15 ~13.5h

System:

Hardware assembled at Crunchy Data Benchmark Lab by Greg Smith. Presentation of results 2021-11-18 at Virtual PostGIS Day 2021

Command:

Planet data as of 2021-09-02, 119G .bzip2 download. osm2pgsql and PostgreSQL 15 database run on the same server. Final size 989 GB, build disk usage peak ~1.4TB.

osm2pgsql -d gis -v --create --slim
--cache 40000 --number-processes 4
--flat-nodes /inland/15/nodes.db
--hstore --hstore-add-index
/var/lib/postgresql/osm/planet-latest.osm.bz2

PostgreSQL Tuning:

At current Planet size improvements from OSM cache increase peaked around 40GB of RAM, with only marginal benefit going to 50GB. With 128GB that leaves enough memory for aggressive database memory tuning as well.

shared_buffers = 32GB
maintenance_work_mem = 4GB
work_mem = 4GB
max_wal_size = 16GB
min_wal_size = 4GB
effective_cache_size = 80GB

# Below here are loading tweaks not recommended for production

checkpoint_timeout = 60min
synchronous_commit = off
fsync = off
autovacuum = off
wal_level = minimal
max_wal_senders = 0

Background writer tuning:

An aggressive background writer tuning moved a lot of the index building load from the CREATE INDEX to the BGW process, often over 150MB/s:

bgwriter_delay = 10ms
bgwriter_lru_maxpages = 500
bgwriter_lru_multiplier = 4.0

Note that alternately the background writer can be turned off altogether, as there is no foreground latency for it to optimize anyway, and on some systems the CPU time it uses isn't worth spending on its job:

bgwriter_lru_maxpages = 0
bgwriter_flush_after = 0

Unsuccessful improvement attempts:

Alternate system using newest Ryzen 9 7700X/64GB was much faster at some operations. But for the full Planet, 128GB of RAM is the only way I've found to make the Way times in particular competitive now.

There are so few disk bottlenecks left at PCIe 4.0, the 5GB/s read rated drive used is more than enough to keep up. A 4 disk RAID0 or SSDs on the newer P18 chipsets were only trivially faster.

Some runs set shared_buffers as high as 48GB with clear benefits. This 32GB value is a compromise to cut down on Out of Memory Killer action ruining the load. The two work_mem settings can be dropped to 2GB as a way to fit more buffer cache, but that change is barely above the run to run noise.

Output:

2022-10-27 01:42:51  osm2pgsql version 1.6.0
2022-10-27 01:42:51  [0] Database version: 15.0 (Ubuntu 15.0-1.pgdg22.04+1)
2022-10-27 01:42:51  [0] PostGIS version: 3.3
2022-10-27 01:42:51  [0] Reading file: /var/lib/postgresql/osm/planet-latest.osm.bz2
2022-10-27 01:42:51  [0] Started pool with 4 threads.
2022-10-27 01:42:51  [0] Loading persistent node cache from '/inland/15/nodes.db'.
2022-10-27 01:42:51  [0] Mid: pgsql, cache=40000
2022-10-27 01:42:51  [0] Setting up table 'planet_osm_nodes'
2022-10-27 01:42:51  [0] Setting up table 'planet_osm_ways'
2022-10-27 01:42:51  [0] Setting up table 'planet_osm_rels'
2022-10-27 01:42:51  [0] Using projection SRS 3857 (Spherical Mercator)
2022-10-27 01:42:51  [0] Using built-in tag transformations
2022-10-27 01:42:51  [0] Setting up table 'planet_osm_point'
2022-10-27 01:42:51  [0] Setting up table 'planet_osm_line'
2022-10-27 01:42:51  [0] Setting up table 'planet_osm_polygon'
2022-10-27 01:42:51  [0] Setting up table 'planet_osm_roads'
2022-10-27 08:00:44  [0] Reading input files done in 22673s (6h 17m 53s).
2022-10-27 08:00:44  [0]   Processed 7882127434 nodes in 11533s (3h 12m 13s) - 683k/s
2022-10-27 08:00:44  [0]   Processed 883613025 ways in 5395s (1h 29m 55s) - 164k/s
2022-10-27 08:00:44  [0]   Processed 10176596 relations in 5745s (1h 35m 45s) - 2k/s
2022-10-27 08:00:44  [0] Overall memory usage: peak=143600MByte current=140706MByte
2022-10-27 08:00:44  [1] Starting task...
2022-10-27 08:00:44  [4] Starting task...
2022-10-27 08:00:44  [2] Starting task...
2022-10-27 08:00:44  [3] Starting task...
2022-10-27 08:00:44  [1] Clustering table 'planet_osm_point' by geometry...
2022-10-27 08:00:44  [4] Clustering table 'planet_osm_line' by geometry...
2022-10-27 08:00:44  [3] Clustering table 'planet_osm_roads' by geometry...
2022-10-27 08:00:44  [2] Clustering table 'planet_osm_polygon' by geometry...
2022-10-27 08:00:45  [4] Using native order for clustering table 'planet_osm_line'
2022-10-27 08:00:45  [3] Using native order for clustering table 'planet_osm_roads'
2022-10-27 08:00:45  [2] Using native order for clustering table 'planet_osm_polygon'
2022-10-27 08:00:45  [1] Using native order for clustering table 'planet_osm_point'
2022-10-27 08:04:09  [3] Creating geometry index on table 'planet_osm_roads'...
2022-10-27 08:06:06  [3] Creating osm_id index on table 'planet_osm_roads'...
2022-10-27 08:06:27  [3] Creating hstore indexes on table 'planet_osm_roads'...
2022-10-27 08:08:38  [1] Creating geometry index on table 'planet_osm_point'...
2022-10-27 08:10:06  [3] Analyzing table 'planet_osm_roads'...
2022-10-27 08:10:25  [3] Done task in 580104ms.
2022-10-27 08:10:25  [3] Starting task...
2022-10-27 08:10:25  [3] Done task in 0ms.
2022-10-27 08:10:25  [3] Starting task...
2022-10-27 08:10:25  [0] Done postprocessing on table 'planet_osm_nodes' in 0s
2022-10-27 08:10:25  [3] Building index on table 'planet_osm_ways'
2022-10-27 08:14:47  [1] Creating osm_id index on table 'planet_osm_point'...
2022-10-27 08:15:40  [1] Creating hstore indexes on table 'planet_osm_point'...
2022-10-27 08:27:38  [4] Creating geometry index on table 'planet_osm_line'...
2022-10-27 08:35:38  [4] Creating osm_id index on table 'planet_osm_line'...
2022-10-27 08:36:07  [1] Analyzing table 'planet_osm_point'...
2022-10-27 08:36:15  [1] Done task in 2130567ms.
2022-10-27 08:36:15  [1] Starting task...
2022-10-27 08:36:15  [1] Building index on table 'planet_osm_rels'
2022-10-27 08:37:03  [4] Creating hstore indexes on table 'planet_osm_line'...
2022-10-27 08:38:29  [1] Done task in 134154ms.
2022-10-27 08:40:45  [2] Creating geometry index on table 'planet_osm_polygon'...
2022-10-27 08:47:23  [4] Analyzing table 'planet_osm_line'...
2022-10-27 08:47:31  [4] Done task in 2806718ms.
2022-10-27 08:57:48  [2] Creating osm_id index on table 'planet_osm_polygon'...
2022-10-27 09:00:23  [2] Creating hstore indexes on table 'planet_osm_polygon'...
2022-10-27 09:23:17  [2] Analyzing table 'planet_osm_polygon'...
2022-10-27 09:23:20  [2] Done task in 4955960ms.
2022-10-27 15:14:20  [3] Done task in 25435433ms.
2022-10-27 15:14:20  [0] Done postprocessing on table 'planet_osm_ways' in 25435s (7h 3m 55s)
2022-10-27 15:14:20  [0] Done postprocessing on table 'planet_osm_rels' in 134s (2m 14s)
2022-10-27 15:14:20  [0] All postprocessing on table 'planet_osm_point' done in 2130s (35m 30s).
2022-10-27 15:14:20  [0] All postprocessing on table 'planet_osm_line' done in 2806s (46m 46s).
2022-10-27 15:14:20  [0] All postprocessing on table 'planet_osm_polygon' done in 4955s (1h 22m 35s).
2022-10-27 15:14:20  [0] All postprocessing on table 'planet_osm_roads' done in 580s (9m 40s).
2022-10-27 15:14:20  [0] Overall memory usage: peak=143600MByte current=881MByte
2022-10-27 15:14:20  [0] osm2pgsql took 48689s (13h 31m 29s) overall.


Planet (BZIP2, 2022-03-13) on Dell R730 288GB DDR4 2133 + 256GB swap, 2x Xeon E5-2699v4 22C/44T, 4x Samsung 870 SSD RAID 0, Centos Stream 9 ~22h

Software

  • osm2pgsql 1.5.2
  • PostgreSQL 14.1
  • PostGIS 3.2

Command

nohup time osm2pgsql -d gis --create --slim -G --hstore --tag-transform-script ~/git/openstreetmap-carto/openstreetmap-carto.lua -C 250000 --number-processes 32 -S ~/git/openstreetmap-carto/openstreetmap-carto.style planet-latest.osm.bz2 > osm2pgsql.out 2> osm2pgsql.err < /dev/null &

Output

2022-03-13 15:12:01  osm2pgsql version 1.5.2 (1.5.2)
2022-03-13 15:12:01  Database version: 14.1
2022-03-13 15:12:01  PostGIS version: 3.2
2022-03-13 15:12:01  Setting up table 'planet_osm_point'
2022-03-13 15:12:02  Setting up table 'planet_osm_line'
2022-03-13 15:12:02  Setting up table 'planet_osm_polygon'
2022-03-13 15:12:02  Setting up table 'planet_osm_roads'
2022-03-14 05:37:12  Reading input files done in 51910s (14h 25m 10s).
2022-03-14 05:37:12    Processed 7538107733 nodes in 23694s (6h 34m 54s) - 318k/s
2022-03-14 05:37:12    Processed 840641373 ways in 23276s (6h 27m 56s) - 36k/s
2022-03-14 05:37:12    Processed 9712543 relations in 4940s (1h 22m 20s) - 2k/s
2022-03-14 05:37:20  Clustering table 'planet_osm_line' by geometry...
2022-03-14 05:37:20  Clustering table 'planet_osm_point' by geometry...
2022-03-14 05:37:20  Clustering table 'planet_osm_roads' by geometry...
2022-03-14 05:37:20  Clustering table 'planet_osm_polygon' by geometry...
2022-03-14 05:37:23  Done postprocessing on table 'planet_osm_nodes' in 0s
2022-03-14 05:37:23  Building index on table 'planet_osm_ways'
2022-03-14 05:37:23  Building index on table 'planet_osm_rels'
2022-03-14 05:47:34  Creating geometry index on table 'planet_osm_roads'...
2022-03-14 05:49:51  Creating geometry index on table 'planet_osm_point'...
2022-03-14 05:52:05  Creating osm_id index on table 'planet_osm_roads'...
2022-03-14 05:52:11  Analyzing table 'planet_osm_roads'...
2022-03-14 06:10:03  Creating geometry index on table 'planet_osm_line'...
2022-03-14 06:30:15  Creating geometry index on table 'planet_osm_polygon'...
2022-03-14 06:34:07  Creating osm_id index on table 'planet_osm_point'...
2022-03-14 06:35:07  Analyzing table 'planet_osm_point'...
2022-03-14 07:30:00  Creating osm_id index on table 'planet_osm_line'...
2022-03-14 07:31:54  Analyzing table 'planet_osm_line'...
2022-03-14 12:24:01  Creating osm_id index on table 'planet_osm_polygon'...
2022-03-14 12:27:57  Analyzing table 'planet_osm_polygon'...
2022-03-14 13:13:57  Done postprocessing on table 'planet_osm_ways' in 27394s (7h 36m 34s)
2022-03-14 13:13:57  Done postprocessing on table 'planet_osm_rels' in 327s (5m 27s)
2022-03-14 13:13:57  All postprocessing on table 'planet_osm_point' done in 3467s (57m 47s).
2022-03-14 13:13:57  All postprocessing on table 'planet_osm_line' done in 6874s (1h 54m 34s).
2022-03-14 13:13:57  All postprocessing on table 'planet_osm_polygon' done in 24637s (6h 50m 37s).
2022-03-14 13:13:57  All postprocessing on table 'planet_osm_roads' done in 891s (14m 51s).
2022-03-14 13:13:57  osm2pgsql took 79316s (22h 1m 56s) overall.

Planet (2023-09-26 , 71 GB ) on Azure : Standard E32as v5 (32 vcpus, 256 GiB memory) , Postgres : 15.3, Azure - Memory Optimized, E32ads_v5, 32 vCores, 256 GiB RAM, 4096 GiB storage ~ 27 Hour

Import Command :

osm2pgsql version : 1.8.1

PostGIS version : 3.3.1

osm2pgsql --create --slim --extra-attributes --output=flex --style /opt/raw-data-api/backend/raw.lua /tmp/data/planet-latest.osm.pbf --number-processes 32 -C 75000 --flat-nodes '/tmp/data/nodes.bin'


Disk Speed and specs  :

— 5000 IOps , 1000 GB Premium Disk (Azure)


Virtual Machine :

Azure : Standard E32as v5 (32 vcpus, 256 GiB memory)


Postgres :

Version : 15.3

Specs : Memory Optimized, E32ads_v5, 32 vCores, 256 GiB RAM, 4096 GiB storage

Tuning :

maintenance_work_mem - 2097151 KB

shared_buffers - 8388608 x 8 KB

work_mem - 167772 KB

checkpoint_timeout - 3600    x sec

max_worker_processes - 32

max_parallel_workers - 16

max_parallel_maintenance_workers = 4

max_parallel_workers_per_gather = 4


Planet data :

71G , Downloaded on Sep 26 11:26 , From planet osm : planet-latest.osm.pbf (https://planet.openstreetmap.org/planet/)  


Lua Script :

https://github.com/hotosm/raw-data-api/blob/develop/backend/raw.lua


Import logs :

2023-09-27 08:25:04  Reading input files done in 65996s (18h 19m 56s).                    

2023-09-27 08:25:04    Processed 8631263509 nodes in 8113s (2h 15m 13s) - 1064k/s

2023-09-27 08:25:04    Processed 968694815 ways in 11267s (3h 7m 47s) - 86k/s

2023-09-27 08:25:04    Processed 11290477 relations in 46616s (12h 56m 56s) - 242/s

2023-09-27 11:45:12  Creating index on table 'ways_line' ("geom")...

2023-09-27 13:12:08  Creating index on table 'ways_poly' ("geom")...

2023-09-27 13:37:08  Creating index on table 'relations' ("geom")...

2023-09-27 13:46:55  Creating id index on table 'ways_line'...

2023-09-27 14:01:44  Analyzing table 'ways_line'...

2023-09-27 14:15:36  Creating id index on table 'relations'...

2023-09-27 14:16:12  Analyzing table 'relations'...

2023-09-27 16:24:31  Creating id index on table 'ways_poly'...

2023-09-27 16:26:45  Done postprocessing on table 'planet_osm_ways' in 28901s (8h 1m 41s)

2023-09-27 16:26:45  Done postprocessing on table 'planet_osm_rels' in 483s (8m 3s)

2023-09-27 16:26:45  All postprocessing on table 'nodes' done in 9234s (2h 33m 54s).

2023-09-27 16:26:45  All postprocessing on table 'ways_line' done in 20211s (5h 36m 51s).

2023-09-27 16:47:18  Analyzing table 'ways_poly'...

2023-09-27 16:47:39  All postprocessing on table 'ways_poly' done in 30155s (8h 22m 35s).

2023-09-27 16:47:39  All postprocessing on table 'relations' done in 21118s (5h 51m 58s).

2023-09-27 16:47:39  osm2pgsql took 96151s (26h 42m 31s) overall.

Final Space Used :

nodes.bin size : 84GB Database table size after import ( only geom and id index ) : 896 GB


Database Update :

It took approx 2 hour to cover a day update gap You can directly use osm2pgsql replication command instead of this , I have added this in order to update couple of extra attributes ! (https://github.com/hotosm/raw-data-api/blob/develop/backend/replication)

python replication update --flat_nodes /tmp/data/nodes.bin --skip_cupdate -s raw.lua  --max-diff-size 10 – --number-processes 32


Planet (72.0GB PBF, 2024-01-27) on 220GB DDR4 2133, Xeon E5-2680v4@2.40GHz 14C, NVME 2GB , Ubuntu 22.04 (non-slim) ~11h

Hardware:

Operating System: Ubuntu 22.04
Kernel Version: 6.5.0-15-generic (64-bit)
Graphics Platform: X11
Processors: 28 × Intel® Xeon® CPU E5-2680 v4 @ 2.40GHz
Memory: 220,1 GiB of RAM
Samsung SSD 870 EVO 500GB -- system and planet.osm
Patriot M.2 P300 2048GB -- postgresql database only.

Postgres configuration

Just defaults


Command line

osm2pgsql -d gis -U zkir --create  -G --hstore --tag-transform-script ~/src/openstreetmap-carto/openstreetmap-carto.lua --number-processes 8 -S ~/src/openstreetmap-carto/openstreetmap-carto.style -r pbf ~/data/planet-latest-updated.osm.pbf

Import logs :

2024-01-27 05:08:54  osm2pgsql version 1.10.0 (1.10.0-13-gcc8470b2)
2024-01-27 05:08:54  Database version: 16.1 (Ubuntu 16.1-1.pgdg22.04+1)
2024-01-27 05:08:54  PostGIS version: 3.4
2024-01-27 05:08:54  Storing properties to table '"public"."osm2pgsql_properties"'.
2024-01-27 05:08:54  Setting up table 'planet_osm_point'
2024-01-27 05:08:54  Setting up table 'planet_osm_line'
2024-01-27 05:08:54  Setting up table 'planet_osm_polygon'
2024-01-27 05:08:54  Setting up table 'planet_osm_roads'
2024-01-27 13:39:16  Reading input files done in 30622s (8h 30m 22s).
2024-01-27 13:39:16    Processed 8905535812 nodes in 4720s (1h 18m 40s) - 1887k/s
2024-01-27 13:39:16    Processed 996304807 ways in 23228s (6h 27m 8s) - 43k/s
2024-01-27 13:39:16    Processed 11796935 relations in 2674s (44m 34s) - 4k/s
2024-01-27 13:39:16  Clustering table 'planet_osm_point' by geometry...
2024-01-27 13:39:19  Clustering table 'planet_osm_line' by geometry...
2024-01-27 13:39:19  Clustering table 'planet_osm_polygon' by geometry...
2024-01-27 13:39:19  Clustering table 'planet_osm_roads' by geometry...
2024-01-27 13:53:22  Creating geometry index on table 'planet_osm_roads'...
2024-01-27 13:55:33  Analyzing table 'planet_osm_roads'...
2024-01-27 14:02:56  Creating geometry index on table 'planet_osm_point'...
2024-01-27 14:13:20  Analyzing table 'planet_osm_point'...
2024-01-27 14:13:21  All postprocessing on table 'planet_osm_point' done in 2044s (34m 4s).
2024-01-27 14:50:55  Creating geometry index on table 'planet_osm_line'...
2024-01-27 15:07:17  Analyzing table 'planet_osm_line'...
2024-01-27 15:07:17  All postprocessing on table 'planet_osm_line' done in 5280s (1h 28m 0s).
2024-01-27 15:39:02  Creating geometry index on table 'planet_osm_polygon'...
2024-01-27 16:15:19  Analyzing table 'planet_osm_polygon'...
2024-01-27 16:15:19  All postprocessing on table 'planet_osm_polygon' done in 9362s (2h 36m 2s).
2024-01-27 16:15:19  All postprocessing on table 'planet_osm_roads' done in 977s (16m 17s).
2024-01-27 16:15:19  Storing properties to table '"public"."osm2pgsql_properties"'.
2024-01-27 16:15:19  osm2pgsql took 39985s (11h 6m 25s) overall.

Notes: After import was completed, init_update failed, due to non slim mode.

osm2pgsql-replication init -d gis --server https://planet.openstreetmap.org/replication/hour
[CRITICAL]: osm2pgsql middle table "public.planet_osm_ways" not found in database "gis". Database needs to be imported in --slim mode.

next test will be in slim mode.

Planet (72.0GB PBF, 2024-01-27) on 220GB DDR4 2133, Xeon E5-2680v4@2.40GHz 14C, NVME 2GB , Ubuntu 22.04 (slim) ~17h

Run with same parameters as above, but in slim mode (--slim -C 0 --flat-nodes), because I need regular updates. it took 6 hours longer, 17 hours this time!

it seems that this parameters do not take advantage of huge RAM size. Processing relations (Processed 11800687 relations ) is 4 (!) times slower. Post-processing is also slower.


Hardware:

Operating System: Ubuntu 22.04
Kernel Version: 6.5.0-15-generic (64-bit)
Graphics Platform: X11
Processors: 28 × Intel® Xeon® CPU E5-2680 v4 @ 2.40GHz
Memory: 220,1 GiB of RAM
Samsung SSD 870 EVO 500GB -- system and planet.osm
Patriot M.2 P300 2048GB -- postgresql database only.

Postgres configuration

Just defaults


Command line

osm2pgsql -d gis -U zkir --create --slim  -G --hstore --tag-transform-script ~/src/openstreetmap-carto/openstreetmap-carto.lua -C 0 --flat-nodes ~/data/nodes.bin --number-processes 16 -S ~/src/openstreetmap-carto/openstreetmap-carto.style -r pbf ~/data/planet-latest-updated.osm.pbf

Import logs :

2024-01-28 12:39:58  osm2pgsql version 1.10.0 (1.10.0-13-gcc8470b2)
2024-01-28 12:39:58  Database version: 16.1 (Ubuntu 16.1-1.pgdg22.04+1)
2024-01-28 12:39:58  PostGIS version: 3.4
2024-01-28 12:39:58  Storing properties to table '"public"."osm2pgsql_properties"'.
2024-01-28 12:39:59  Setting up table 'planet_osm_point'
2024-01-28 12:39:59  Setting up table 'planet_osm_line'
2024-01-28 12:39:59  Setting up table 'planet_osm_polygon'
2024-01-28 12:40:00  Setting up table 'planet_osm_roads'
2024-01-28 22:59:21  Reading input files done in 37161s (10h 19m 21s).                    
2024-01-28 22:59:21    Processed 8908224076 nodes in 5167s (1h 26m 7s) - 1724k/s
2024-01-28 22:59:21    Processed 996551518 ways in 21816s (6h 3m 36s) - 46k/s
2024-01-28 22:59:21    Processed 11800687 relations in 10178s (2h 49m 38s) - 1k/s
2024-01-28 22:59:22  Clustering table 'planet_osm_point' by geometry...
2024-01-28 22:59:22  Clustering table 'planet_osm_line' by geometry...
2024-01-28 22:59:24  Done postprocessing on table 'planet_osm_nodes' in 0s
2024-01-28 22:59:24  Clustering table 'planet_osm_polygon' by geometry...
2024-01-28 22:59:24  Clustering table 'planet_osm_roads' by geometry...
2024-01-28 22:59:24  Building index on table 'planet_osm_ways'
2024-01-28 22:59:24  Building index on table 'planet_osm_rels'
2024-01-28 23:24:58  Creating geometry index on table 'planet_osm_roads'...
2024-01-28 23:29:52  Creating osm_id index on table 'planet_osm_roads'...
2024-01-28 23:30:15  Analyzing table 'planet_osm_roads'...
2024-01-28 23:41:57  Creating geometry index on table 'planet_osm_point'...
2024-01-29 00:12:05  Creating osm_id index on table 'planet_osm_point'...
2024-01-29 00:15:08  Analyzing table 'planet_osm_point'...
2024-01-29 01:23:19  Creating geometry index on table 'planet_osm_line'...
2024-01-29 01:45:18  Creating osm_id index on table 'planet_osm_line'...
2024-01-29 01:53:43  Analyzing table 'planet_osm_line'...
2024-01-29 03:19:42  Creating geometry index on table 'planet_osm_polygon'...
2024-01-29 04:12:53  Creating osm_id index on table 'planet_osm_polygon'...
2024-01-29 04:25:46  Analyzing table 'planet_osm_polygon'...
2024-01-29 05:48:07  Done postprocessing on table 'planet_osm_ways' in 24523s (6h 48m 43s)
2024-01-29 05:48:07  Done postprocessing on table 'planet_osm_rels' in 927s (15m 27s)
2024-01-29 05:48:07  All postprocessing on table 'planet_osm_point' done in 4547s (1h 15m 47s).
2024-01-29 05:48:07  All postprocessing on table 'planet_osm_line' done in 10465s (2h 54m 25s).
2024-01-29 05:48:07  All postprocessing on table 'planet_osm_polygon' done in 19585s (5h 26m 25s).
2024-01-29 05:48:07  All postprocessing on table 'planet_osm_roads' done in 1854s (30m 54s).
2024-01-29 05:48:07  Storing properties to table '"public"."osm2pgsql_properties"'.
2024-01-29 05:48:07  osm2pgsql took 61688s (17h 8m 8s) overall.