Osm2pgsql/benchmarks
Background
Benchmarks of osm2pgsql are a useful reference for users because import time depends heavily on the machine's hardware and software configuration. Importing a complete planet file can take days, even on a typical higher-end desktop machine. Importing an extract (a subset) of the planet file takes considerably less time and should be preferred over the full planet file where possible.
What affects import time?
Partial list of variables that affect the time it takes to import.
- Size of the node cache (-C command line argument)
- PostgreSQL config settings
- Hard disk throughput
- Operating system used, and whether it is the 32-bit or 64-bit version
- Input reader and input file format
- the experimental 'primitive' XML parser reads OSM XML input about 30% faster than the default 'libxml2' reader
- the 'pbf' input reader reads OSM PBF files about twice as fast as the 'libxml2' parser parses an equivalent OSM XML file
Example Output
The following sample output, generated while importing a planet file with osm2pgsql, is provided for reference. The Linux time command was used in this example to report how long osm2pgsql took to finish. Note that the NOTICE lines are normal when importing into an empty database.
~/osm2pgsql$ time osm2pgsql -s -v -U mapper -S ./default.style -d gis -C 3000 ../planet/planet-100324.osm.bz2
osm2pgsql SVN version 0.69-20672
Using projection SRS 900913 (Spherical Mercator)
Setting up table: planet_osm_point
NOTICE: table "planet_osm_point" does not exist, skipping
NOTICE: table "planet_osm_point_tmp" does not exist, skipping
Setting up table: planet_osm_line
NOTICE: table "planet_osm_line" does not exist, skipping
NOTICE: table "planet_osm_line_tmp" does not exist, skipping
Setting up table: planet_osm_polygon
NOTICE: table "planet_osm_polygon" does not exist, skipping
NOTICE: table "planet_osm_polygon_tmp" does not exist, skipping
Setting up table: planet_osm_roads
NOTICE: table "planet_osm_roads" does not exist, skipping
NOTICE: table "planet_osm_roads_tmp" does not exist, skipping
Mid: pgsql, scale=100, cache=3000MB, maxblocks=384001*8192
Setting up table: planet_osm_nodes
NOTICE: table "planet_osm_nodes" does not exist, skipping
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "planet_osm_nodes_pkey" for table "planet_osm_nodes"
Setting up table: planet_osm_ways
NOTICE: table "planet_osm_ways" does not exist, skipping
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "planet_osm_ways_pkey" for table "planet_osm_ways"
Setting up table: planet_osm_rels
NOTICE: table "planet_osm_rels" does not exist, skipping
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "planet_osm_rels_pkey" for table "planet_osm_rels"
Reading in file: ../planet/planet-100324.osm.bz2
Processing: Node(574797k) Way(43465k) Relation(87k)
excepton caught processing way id=110802
excepton caught processing way id=110803
Processing: Node(574797k) Way(43465k) Relation(465k)
Node stats: total(574797076), max(673005476)
Way stats: total(43465572), max(53189409)
Relation stats: total(465800), max(533629)
Going over pending ways
processing way (12179k)
Going over pending relations
node cache: stored: 386841238(67.30%), storage efficiency: 98.38%, hit rate: 65.63%
Committing transaction for planet_osm_roads
Committing transaction for planet_osm_polygon
Committing transaction for planet_osm_line
Sorting data and creating indexes for planet_osm_polygon
Sorting data and creating indexes for planet_osm_roads
Sorting data and creating indexes for planet_osm_line
Committing transaction for planet_osm_point
Sorting data and creating indexes for planet_osm_point
Completed planet_osm_point
Completed planet_osm_roads
Completed planet_osm_polygon
Stopping table: planet_osm_nodes
Stopping table: planet_osm_rels
Stopping table: planet_osm_ways
Building index on table: planet_osm_rels
Stopped table: planet_osm_nodes
Building index on table: planet_osm_ways
Stopped table: planet_osm_rels
Completed planet_osm_line
Stopped table: planet_osm_ways
real 2985m27.269s
user 327m47.240s
sys 35m32.480s
Explanation and progressbar
If you want to know how far the import has progressed, look up the statistics and compare them with the "Processing: Node" section of the output. Note that processing a relation takes approximately 10 times as long as processing a way, which in turn takes approximately 10 times as long as processing a node. In my case, nodes were processed at 40.8k/s, ways at 0.13k/s and relations at ..k/s. Also, closing the tables takes approximately as long as importing the data. So once the nodes have been imported, you are roughly 1/6 of the way through the total import.
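The comparison described above can be sketched in Python. This is a minimal illustration, not part of osm2pgsql: the function names parse_progress and fraction_done are hypothetical, and the totals are copied from the "Node stats", "Way stats" and "Relation stats" lines of the sample run. For an import that is still running, the totals would have to come from elsewhere, for example from an earlier import of a similar planet file.

```python
import re

# Matches osm2pgsql progress lines such as:
#   Processing: Node(574797k) Way(43465k) Relation(87k)
PROGRESS_RE = re.compile(r"Node\((\d+)k\)\s+Way\((\d+)k\)\s+Relation\((\d+)k\)")


def parse_progress(line):
    """Return the processed object counts (in thousands) from a progress line."""
    match = PROGRESS_RE.search(line)
    if match is None:
        raise ValueError(f"not a progress line: {line!r}")
    nodes, ways, relations = (int(group) for group in match.groups())
    return {"node": nodes, "way": ways, "relation": relations}


def fraction_done(progress_k, totals):
    """Compare processed counts (thousands) with expected totals (absolute numbers)."""
    return {
        kind: min(1.0, progress_k[kind] * 1000 / totals[kind])
        for kind in progress_k
    }


# Totals taken from the stats lines of the sample run (an assumption for any other file).
totals = {"node": 574797076, "way": 43465572, "relation": 465800}
progress = parse_progress("Processing: Node(574797k) Way(43465k) Relation(87k)")
done = fraction_done(progress, totals)
for kind, frac in done.items():
    print(f"{kind}: {frac:.1%}")
```

The per-type fractions only show how much of each object type has been read. They do not account for the differing cost per object or for the time spent closing tables mentioned above.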
Benchmarks
List of benchmarks contributed by users. Currently these simply use the time command to report how long the osm2pgsql task takes to complete. If you do not have a time measurement available, please provide some other meaningful metric. Better organization and a formatting standard for this section are needed.
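Since the results below are reported as raw time output, a small helper can convert them to hours for comparison. The following is a sketch only: parse_time_output is a hypothetical name, and it assumes the "NNNmSS.sss" format printed by the bash time keyword, as seen in the results on this page. Other shells and /usr/bin/time print different formats.

```python
import re

# Matches entries such as "real 2985m27.269s" (bash `time` format).
TIME_RE = re.compile(r"(real|user|sys)\s+(\d+)m([\d.]+)s")


def parse_time_output(text):
    """Return a dict mapping 'real', 'user' and 'sys' to seconds."""
    result = {}
    for label, minutes, seconds in TIME_RE.findall(text):
        result[label] = int(minutes) * 60 + float(seconds)
    return result


# Values from the "Planet / 8GB DKB" benchmark on this page.
times = parse_time_output("real 2985m27.269s user 327m47.240s sys 35m32.480s")
for label, seconds in times.items():
    print(f"{label}: {seconds / 3600:.2f} hours")
```

The "real" value is the wall-clock duration and is the figure quoted as the result of each benchmark. "user" and "sys" only count CPU time used by osm2pgsql itself, not by the PostgreSQL server.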
Planet / 8GB DKB
- Import Description: planet file (planet-100324.osm.bz2)
- RAM: 8GB
- CPU: Xeon X3220 2.4GHz
- DISK(s): 1TB Western Digital Black / 500GB partitions
- OS: Ubuntu 9.10 64 bit
- osm2pgsql SVN version 0.69-20672
- slim mode
- --cache 3000
- time osm2pgsql -s -v -U mapper -S ./default.style -d gis -C 3000 ../planet/planet-100324.osm.bz2
- PostgreSQL 8.4.2
- shared_buffers = 128MB
- maintenance_work_mem = 256MB
- checkpoint_segments = 20
- autovacuum = off
- PostGIS 1.5
- Results: (49.75 Hours)
real 2985m27.269s user 327m47.240s sys 35m32.480s
Planet / 32GB rw
- Import Description: planet file (planet-100414.osm.bz2)
- RAM: 32GB
- CPU: 2 x Xeon X5520 2.26GHz
- DISK(s): 2 x 1TB 7200 rpm SATA2 drives; RAID0
- OS: Ubuntu 10.04 (Lucid Lynx) 64 bit
- osm2pgsql SVN version 0.69-20937
- slim mode
- --cache 4096
- time ./osm2pgsql -S default.style -C 4096 --slim -d gis /home/nerd/planet/planet-100414.osm.bz2
- PostgreSQL 8.4.3
- shared_buffers = 128MB
- maintenance_work_mem = 4096MB
- checkpoint_segments = 20
- autovacuum = off
- PostGIS 1.5
- Results: (20.6 hours)
real 1236m30.801s user 272m56.090s sys 15m6.180s
sly's benchmark 05/2010 (influence of SSD, software RAID and memory on import and diffs)
Benchmarks:
- Computer A: 2GB RAM / 2 SATA magnetic drives, software RAID 0 (10000 rpm)
- Computer B (tests): 32GB RAM / 2 SATA magnetic drives, software RAID 1 (10000 rpm)
- Computer C (new): 8GB RAM / 2 SATA SSD drives, RAID 0 (INTEL SSDSA2M080)
The software is the same in all cases: default lenny 64-bit packages (postgres 8.3 and osm2pgsql from SVN of 05/2010). The osm2pgsql style file is a home-made one, so the results are hard to compare with other benchmarks. In particular, the -x switch was used to import timestamps, which proved to be extremely disk-unfriendly.
- case 1) geofabrik europe.osm.bz2 import
($/home/ressource-for-osm/osm2pgsql/osm2pgsql -C 1200 -s -S ./style -G -x -m -d gis europe.osm.bz2)
- A: ~7 days
- B: ~3 days (2.5 days with -C 16000)
- C: 8 hours
- case 2) Mean import time for a minutely diff (averaged over 24 consecutive hours)
($time /home/ressource-for-osm/osm2pgsql/osm2pgsql -C 400 --bbox -27,31,50,72 -e 18 -o./regeneration_old_tiles/expire_file_list -x -G -a -s -S ./default.style -m -d gis temporaire.osc)
- A: ~50 seconds
- B: ~30 seconds
- C: ~3 seconds
Computer B, despite having slower storage due to RAID 1, proved faster than Computer A, probably helped by the Linux kernel's well-handled memory cache.
DELL® PowerEdge R210
- Import Description: planet file (planet-101020.osm.bz2)
- RAM: 8GB
- CPU: Intel(R) Xeon(R) CPU L3426 @ 1.87GHz
- DISK(s): 1.7TB
- OS: Debian 64
- osm2pgsql : SVN version 0.69
- time osm2pgsql -S beciklo.style -G -s -v -m --bbox -27,31,50,72 -d gis -C 3072 planet-*.osm.bz2
- PostgreSQL 8.4.5
- shared_buffers = 512MB
- maintenance_work_mem = 512MB
- checkpoint_segments = 20
- autovacuum = off
- PostGIS 1.5
- Results: (4.45 days)
real 6419m44.318s user 634m29.431s sys 167m56.762s
Dedibox pro : DELL® PowerEdge R210
- Import Description: planet file (07/07/2011)
- RAM: 16GB
- CPU: Intel(R) Xeon(R) CPU L3426 @ 1.87GHz (x8)
- OS: Debian 64
- osm2pgsql : 0.69+r20104-2 (squeeze debian package)
- time osm2pgsql -S mapnik/Beciklo/style/Becyklo.style -G -s -v -m -d osm -C 10240 planet-latest.osm.bz2
- PostgreSQL 8.4.8
- shared_buffers = 1024MB
- maintenance_work_mem = 512MB
- checkpoint_segments = 20
- autovacuum = on
- PostGIS 1.5.1
- Results: 1 day 16 h ...
real 2420m15.421s user 830m9.141s sys 19m51.334s
custom i7 system
- Import Description: planet file (11/09/2011)
- RAM: 12GB
- CPU: Intel(R) Core(TM) i7 CPU 960 @ 3.20GHz
- OS: Ubuntu 11.10 x86_64
- osm2pgsql : osm2pgsql SVN version 0.70.5 (ubuntu package)
- time osm2pgsql -d osm -S /usr/share/osm2pgsql/default.style -G -v -m -s -K -C 8192 planet-111109.osm.bz2
- PostgreSQL 9.1
- shared_buffers = 1024MB
- maintenance_work_mem = 2048MB
- checkpoint_segments = 20
- autovacuum = off
- PostGIS 1.5.3 (ubuntu package)
- Results: 5 days ...
real 7211m6.191s user 595m47.626s sys 73m32.200s
Hetzner : Root Server EX 4
- Import Description: planet file (2011-12-29)
- RAM: 16GB
- CPU: Intel® Core™ i7-2600 Quad-Core
- OS: Ubuntu 11.10 x86_64
- osm2pgsql SVN version 0.80.0 (32bit id space)
- time osm2pgsql --create --database gis --username osm --prefix planet --slim --cache 2048 --hstore planet-latest.osm.bz2
- PostgreSQL 9.1.1
- shared_buffers = 128MB
- maintenance_work_mem = 256MB
- checkpoint_segments = 20
- autovacuum = off
- PostGIS 1.5 (came with PostgreSQL)
- Results: 7.28 days
real 10483m29.951s user 640m39.706s sys 49m43.398s
Hetzner : Root Server EX 4 (with HW RAID) with reasonable settings
- Import Description: planet file pbf (2012-01-04)
- RAM: 16GB
- CPU: Intel® Core™ i7-2600 Quad-Core CLEANMAP HW
- OS: Ubuntu 11.10 x86_64
- osm2pgsql SVN version 0.80.0 (32bit id space)
- osm2pgsql -r pbf --tablespace-main-index gisidx --tablespace-slim-data gisslim --tablespace-slim-index gisslimidx --slim -C 12000 --number-processes 2 planet-120104.osm.pbf
- PostgreSQL 9.1.1
- shared_buffers = 1GB (note: detuned for import; normal value is 4GB)
- maintenance_work_mem = 1GB
- checkpoint_segments = 100
- autovacuum = off
- PostGIS 1.5 (came with PostgreSQL)
- Results: 29h 12min (105178s)
Hetzner : Root Server EX 4 (stock) with reasonable settings
Amazon AWS EC2 : High-Memory Double Extra Large Instance
- Import Description: Planet binary file (2012-01-25)
- RAM: 34.2GB
- CPU: 13 EC2 Compute Units (4 virtual cores with 3.25 EC2 Compute Units each) One EC2 Compute Unit (ECU) provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.
- DISK(s): 8 50GB EBS blocks, RAID 0, deadline IO scheduler for Planet import file and PostgreSQL data files
- OS: Amazon Linux x86_64
- osm2pgsql SVN version 0.80.0 (64bit id space)
- time osm2pgsql -r pbf -S /mnt/data/openpistemap/osm2pgsql/default.style -C 20000 --slim -d gis --cache-strategy dense /mnt/planet/planet-latest.osm.pbf
- PostgreSQL 8.4.9 (Amazon Linux package)
- Postgis 1.5.2
- shared_buffers = 4000MB
- temp_buffers = 128MB
- work_mem = 512MB
- maintenance_work_mem = 16000MB
- checkpoint_segments = 20
- fsync = off
- autovacuum = off
- Results: 23.3 hours
real 1396m37.462s user 237m43.645s sys 58m24.671s
- Cost: $1.14 per hour
Amazon AWS EC2 : High-Memory Quadruple Extra Large Instance
- Import Description: Planet binary file (2012-01-25)
- RAM: 68.4GB
- CPU: 26 EC2 Compute Units (8 virtual cores with 3.25 EC2 Compute Units each) One EC2 Compute Unit (ECU) provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.
- DISK(s): 1 20GB EBS block for Planet import file and 1 500GB EBS block for PostgreSQL data files
- OS: Amazon Linux x86_64
- osm2pgsql SVN version 0.80.0 (64bit id space)
- time osm2pgsql -r pbf -S /usr/local/share/osm2pgsql/default.style -C 40000 --slim -d gis /mnt/planet/planet-latest.osm.pbf
- PostgreSQL 8.4.9 (Amazon Linux package)
- Postgis 1.5.2
- shared_buffers = 8000MB
- work_mem = 512MB
- maintenance_work_mem = 8000MB
- checkpoint_segments = 20
- autovacuum = off
- Results: 28.9 hours
- Cost: $2.28 per hour
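The hourly prices and durations of the two EC2 benchmarks can be combined into a rough total cost per import. The sketch below is an estimate under stated assumptions: import_cost is a hypothetical helper, only the instance price listed above is counted (EBS storage, I/O and data transfer charges are ignored), and the option to round up to whole hours is an assumption about how usage might be billed.

```python
import math


def import_cost(hours, rate_per_hour, bill_whole_hours=False):
    """Estimated instance cost in USD for an import lasting `hours`."""
    billed = math.ceil(hours) if bill_whole_hours else hours
    return round(billed * rate_per_hour, 2)


# Durations (hours) and prices (USD per hour) from the two EC2 benchmarks above.
runs = {
    "High-Memory Double Extra Large": (23.3, 1.14),
    "High-Memory Quadruple Extra Large": (28.9, 2.28),
}
for name, (hours, rate) in runs.items():
    print(f"{name}: ${import_cost(hours, rate):.2f}")
```

With these figures the smaller instance, which used the RAID 0 EBS setup, was both faster and cheaper per import than the larger instance with a single EBS data volume.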