Osm2pgsql

From OpenStreetMap Wiki
Jump to: navigation, search
Available languages
Deutsch English русский

osm2pgsql is a command-line based program that converts OpenStreetMap data to postGIS-enabled PostgreSQL databases.

Such databases are often used to render OSM data visually using Mapnik, as PostgreSQL is one of the most efficient and flexible formats Mapnik can use for querying large amounts of data. For more information on using osm2pgsql to render OpenStreetMap data with Mapnik, see the Mapnik page.

Its homepage is: https://github.com/openstreetmap/osm2pgsql It currently does not have a sole maintainer but is maintained by a community of contributors and Osm2pgsql does not a designate 'stable' or 'development' versions but most recent builds should be able to be used in a production environment. Until April 2013, it was hosted on svn.

It is highly recommended to use a version released since Sept. 2, 2012, which has support for 64-bit_Identifiers.


OS Support

osm2pgsql is available for Linux, Mac OS X, and Windows. Most osm2pgsql users run ubuntu or a debian-based distribution and you'll find more complete and up to date documentation and support for it.

Installation

To use a version of osm2pgsql which supports 64-bit IDs add the option --enable-64bit-ids to your ./configure statement.

For Debian or Ubuntu

From the package manager

On Ubuntu 13.04 (Raring) or later, install the osm2pgsql package from universe.

On Debian Wheezy or later, install the osm2pgsql package from main.

On Debian Squeeze systems, it's highly recommended to compile from source to get the latest features, otherwise you get an outdated version which lacks important features like 64bit IDs, hstore or pbf support.

From source

When compiling under Ubuntu (12.04 LTS), you will need the following dependencies:

sudo apt-get install build-essential libxml2-dev libgeos++-dev libpq-dev libbz2-dev proj libtool automake git

Note : For ubuntu 13.04 and debian Wheezy, proj is replaced by libproj-dev

If you want PBF read support, you will also need libprotobuf-c0-dev and protobuf-c-compiler:

sudo apt-get install libprotobuf-c0-dev protobuf-c-compiler

libprotobuf-c0-dev needs to be at least in version 0.14-1. Ubuntu <= 10.04 has only 0.11, so you need to build it from source [2]. To compile from source:

sudo apt-get install protobuf-compiler libprotobuf-dev libprotoc-dev subversion
svn checkout http://protobuf-c.googlecode.com/svn/trunk/ protobuf-c-read-only
cd protobuf-c-read-only
./autogen.sh
make
sudo make install

If you want to use lua scripts for tag_transform, you will need to install lua5.2 liblua5.2-0 liblua5.2-dev and liblua5.1-0

sudo apt-get install lua5.2 liblua5.2-0 liblua5.2-dev liblua5.1-0

After that, follow the from source instructions.

Fedora

Fetch the dependencies:

yum install geos-devel proj-devel postgresql-devel libxml2-devel bzip2-devel 
yum install gcc-c++ protobuf-c-devel autoconf automake libtool

Then follow the from source instructions. If you prefer the package from the Fedora Package Collection, just use yum.

yum install osm2pgsql

openSUSE

For openSUSE 11.3 and newer: First add the Geo package repository (adapt URL to correct openSUSE version):

sudo zypper ar http://download.opensuse.org/repositories/Application:/Geo/openSUSE_11.3/ "Geo"
sudo zypper refresh

Then install osm2pgsql:

sudo zypper install osm2pgsql

For openSUSE 11.2 or older you will need to compile from source.

Arch Linux

Build the osm2pgsql-git package from the AUR. (Download the tarball and compile/install with 'makepkg', or use an AUR helper such as 'yaourt'.)

FreeBSD

make -C /usr/ports/converters/osm2pgsql install clean

or

portinstall osm2pgsql

Windows

Binary


This is by far and away the easiest way to get Osm2pgsql running on Windows.

The newest windows binary is from February 6, 2013 at http://customdebug.com/osm/osm2pgsql.zip

This version does not appear to handle all objects properly (64-bit problem), checking git not all 32-bit variables for IDs had been caught when this was compiled. See Cygwin below for current workarounds.

It contains 32 and 64 bit binaries with 32 and 64 bit node id (four binaries). It supports pbf and all standard osm2pgsql options except fork and flat-nodes. This version has not been tested in a production environment so far. The linked PostgreSQL version is 9.2.1 and it was tested with postgis 2.0.1 and mapnik 2.

For using osm2pgsql just unzip, add directory to your path and run.

  1. Extract the zip to a new directory on your local disk
  2. Add the new directory path to your system variable path ([3])
  3. log out and back in so the new variable is active
  4. Open up command prompt ie... run
  5. Change directories so that you are located in the extracted osm2pgsql directory (NOTE this step should not be needed if its in the system path).
  6. now run the > osm2pgsql -h this will show you the help
  7. on windows it is essential to use parameter -S or --style to locate the stylefile. Default is a unix like path
  8. Projections other than 900913 (Spherical Mercator) are not supported on Windows as the code has an embedded UNIX style pathname
  • Example prompt> osm2pgsql -c -d gis -U postgres -H localhost -P 5432 N:\Geo_data\OpenStreetMap\OSM_xml\yourOSM_filename.bz2
    • if you set up a password for the user (e.g. "xxx"), you will need to set it into environment variable by running > set pgpassword=xxx
    • if you get an error AddGeometryColumn() - invalid SRID, you will need to import spatial_ref_sys.sql from the PostGIS directory


    Place commands in a DOS Batch File....
    
    Database Host is called db1, substitute localhost or your server name
    places database in a seperate schema to ensure that it can be backed up and restored 
    without causing any PostGis conflicts on the restore.
    900913.txt can be found using google very quickly
    The style file was copied and placed in the same directory as the batch file was executing in
    user osm was created for DB Access, hence the User GRANT (User postgres is used for the actual load)
   DROPDB -U postgres -h db1 planetosm
   CREATEDB -U postgres -h db1 -T template_postgis_20 -e -O postgres planetosm
   psql -h db1 -t -d planetosm -U postgres -a  -c  "CREATE SCHEMA markware;"
   psql -h db1 -t -d planetosm -U postgres -a  -c  "ALTER DATABASE planetosm SET search_path='markware','public';"
   psql -h db1 -t -d planetosm -U postgres -a  -c  "GRANT ALL PRIVILEGES ON DATABASE planetosm to osm;"
   psql -h db1 -t -d planetosm -U postgres -a  -c  "CREATE EXTENSION hstore;"
   psql -h db1 -U postgres -f 900913.txt planetosm 
   VACUUMDB -h db1 -U postgres -d planetosm -z -e    
   
   :note - user AND password seem to be needed as an environment variable to ensure successful login
   SET pgpass=secret
   SET pguser=postgres
   :DO NOT FORGET APPEND FLAG ON SUBSEQUENT UPDATES IF YOU ARE ADDING MULTIPLE SMALLER EXTRACTS TO THE SAME DATABASE
   osm2pgsql --keep-coastlines --unlogged --cache-strategy sparse -x -k -s -C 1000 -c -H db1 
         -d planetosm -S asia.style philippines-latest.osm.pbf


If you are using osm2pgsql on windows and are using postgis 2.0 or newer, note that you will have to manually patch your database.

   2014-05-12 - Note, we were able to use and style the loaded data without having to apply this patch

From Source

For native Windows compilation with only minor source adjustments it is recommended to use a C99 compatible compiler such as the Intel Compiler.

Using MinGW, there are currently a lot of problems to solve when building osm2pgsql:

osm2pgsql configure script does not find zlib and other libraries

The configure script created by autogen does not work correctly because the parameters for defining the libraries to be linked (e.g. -lz for libz) are not at the correct position. They have to be placed at the end of the command line. By manually adding the correspondent parameters to the

Workaround: Edit the configure script and add the libraries to link manually:

ac_link='$CXX -o conftest$ac_exeext $CXXFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS -lz >&5'

Note: Adding the debug parameter d to autogen.sh (autoreconf -vfid) is extremely helpful in this situation for finding out which line to edit.

Linking problem

build_geometry.cpp:206: undefined reference to `geos::geom::Coordinate::Coordinate(double, double, double)'

Workaround: Edit build_geometry.cpp and add #define GEOS_INLINE

Used versions: libxml2-2.7.8, zlib-1.2.5, proj-4.7.0, geos-3.2.2

Jburgess, who compiled the newest windows binary (2010-04-10), used mingw32 to cross-compile the windows binary from a Linux host.

Cygwin

Cross-compilation problems with osm2pgsql are documented on github. Whilst these problems are being worked on, an alternative compilation path for windows users targeting Cygwin has been used.

This can be run directly in Windows as all relevant DLLs are included in the distribution. Alternatively it can be used under Cygwin within Windows.

Binary

Download the zipped osm2pgsql Cygwin package (latest version (43) here: https://vanguard.houghtonassociates.com/browse/OSM-OSM2PSQL-43/artifact).

Unzip this into a directory usually cygwin_package, but perhaps better called osm2pgsql.

You can run osm2pgsql within this directory and/or add the directory to your PATH variable. (In practice I have a single windows directory for core OSM apps (osmosis, osmconvert, osmfilter, osm2pgsql, ogr2osm etc.) and copy apps and their libraries there.

(A quick hack if running this under Cygwin itself which I've used is simply to copy the .exe and .dll files to /usr/bin, some of the .dlls will already exist. Osm2pgsql will then be in the path for Cygwin.)



Mac OS X

Binary Installer

SVN snapshot builds of osm2pgsql are packaged and available at: https://github.com/openstreetmap/osm2pgsql/issues/15

With Homebrew

brew install osm2pgsql

In order to import .pbf files, install protobuf-c package first:

brew install protobuf-c

You'll then need to manually point to your protobuf-c libraries/headers:

brew install osm2pgsql --with-protobuf-c=$HOMEBREW_PREFIX/opt/protobuf-c

As of August 2013, the the osm2pgsql Homebrew formula[4] no longer requires the path to protobuf and instead builds protobuf support with simply:

brew install osm2pgsql --with-protobuf-c

With MacPorts

The only catch to installing on Mac OS X with MacPorts is the fact that the default Postgres 8.3 installation doesn't install pg_config in the shell, so the call to pg_config --includedir and pg_config --libdir fails. Therefore, either change these paths in the Makefile or add pg_config to your shell via .bashrc, etc. For reference, an installation of 8.3 usually places the binary file at /opt/local/lib/postgresql83/bin/pg_config.

With KyngChaos binaries

PATH=$PATH:/Library/Frameworks/GEOS.framework/unix/bin/
CFLAGS="-I/Library/Frameworks/PROJ.framework/unix/include"
LDFLAGS="-L/Library/Frameworks/PROJ.framework/unix/lib/"
make

From source (generic)

You will need the 'autoconf' command, libtool and other dependencies. If not already installed, install them.

If you need o5m file format support, refer to osm2pgsql/o5m, otherwise go on as described here:

You can get the source of osm2pgsql (28 mb) from git

git clone https://github.com/openstreetmap/osm2pgsql.git


Next, enter the newly created directory containing the source for the utility:

cd osm2pgsql/

If no Makefile and configure script exist, generate them with:

./autogen.sh
./configure

Optionally, you can configure the compiler to produce a faster binary that can only run on CPUs that have the same capabilities as yours.

sed -i 's/-g -O2/-O2 -march=native -fomit-frame-pointer/' Makefile

Finally, compile the sources into an executable program:

make

If you are using an old version of GCC, you may get an error: unrecognized option `-Wextra'. In this case, edit the Makefile to remove this unnecessary option.

Osm2pgsql schema

Information about the schema created by Osm2pgsql can be found here: Osm2pgsql/schema.

Import style

Aspects of how osm2pgsql converts OSM data into PostgreSQL tables can be configured via a style file. The default style file that is installed with osm2pgsql is suitable for rendering the standard OSM Mapnik style or similar styles. With a custom style file, you can control how different object types and tags map to different columns and data types in the database.

The style file is a plain text file containing 4 columns separated by spaces. As each OSM object is processed it is checked against conditions specified in the first two columns. If they match, processing options from the third and fourth columns are applied.

  1. OSM object type: can be node, way or both separated by a comma. way will also apply to relations with type=multipolygon, type=boundary, or type=route; all other relations are ignored by osm2pgsql.
  2. Tag: the tag to match on. If the fourth column is linear or polygon, a column for this tag will be created in each of the point, line, polygon, and road tables.
  3. PostgreSQL data type: This specifies how data will be stored in the tag's postgresql table column. Possible values are text, int4, real. If the fourth column is delete or phstore, this column has no meaning and should just be set to text.
  4. Flag: Zero or more flags separated by commas. Possible values:
    • linear: Specifies that ways with this tag should be imported as lines by default, even if they are closed. Other conditions can still override this. See the Osm2pgsql/schema#planet_osm_line schema documentation for details on how route relations are handled.
    • polygon: Specifies that closed ways with this tag should be imported as polygons by default. This will override any linear flags that would apply to the same object. Closed ways with area=yes and closed relations with type=multipolygon or type=boundary will be imported as polygons even if no polygon flag is set. Non-closed ways and closed ways with area=no will always be imported as lines.
    • delete: Prevents the specified tag (but not the entire object) from being stored in the database. Useful for tags that tend to have long values but will not be used for rendering, such as source=*. This flag only affects --slim mode imports.
    • phstore: Behaves like the polygon flag, but is used in hstore mode when you do not want to turn the tag into a separate PostgreSQL column.
    • nocache: This flag only applies to nodes and can be used for tags where it is unlikely that the node will also be a part of a way (eg, tags that suggest an independent POI). If you are not using --slim mode, this may break things.

The style file may also contain comments. Any text between a # and the end of the line will be ignored.

Special 'tags'

There are several special values that can be used in the tag column (second column) of the style file for creating additional fields in the database which contain things other than tag values.

  • way_area: Creates a database column that stores the area (calculated in the units of the projection, normally Mercator meters) for any objects imported as polygons. Use with real as the data type.
  • z_order: Adds a column that is used for ordering objects in the render. It mostly applies to objects with highway=* or railway=*. Use with int4 as the data type.
  • osm_user: Adds a column that stores the username of the last user to edit an object in the database.
  • osm_uid: Adds a column that stores the user ID number of the last user to edit an object in the database.
  • osm_version: Adds a column that stores the version of an object in the database (ie, how many times the object has been modified).
  • osm_timestamp: Adds a column that stores the date and time that the most recent version of an object was added to OpenStreetMap.

To use the osm_user, osm_uid, osm_version, and osm_timestamp tags, you must use the osm2pgsql option --extra-attributes when importing.

Avoiding pitfalls

osm2pgsql is a specialized tool and there's some behavior hardcoded into it that you should know about.

Area imports

For users of osm2pgsql whose version are older than v28868, the bbox parameter does not work with .pbf files. As a workaround, use a preprocessing tool like osmconvert to create your extract as a .osm file.

Coastlines

The natural=coastline tag is suppressed by default, even if you import the natural=* key. The main mapnik map renders coastlines from shapefiles so it does not need them. As of revision 24605, you can use the --keep-coastlines parameter to change this behavior if you want coastlines in your database.

Compiling Problems

Error:

/usr/bin/ld: cannot find -lstdc++
collect2: ld returned 1 exit status
make: *** [osm2pgsql] Fehler 1

the new version od gcc don't need lstdc++ anymore: remove the line in the 'Makefile' (or comment it):

LDFLAGS += -lstdc++

to

#LDFLAGS += -lstdc++

PostgreSQL intarray module

If you are using an osm2pgsql version older than 0.80 and are planning on using diffs then you need to activate the intarray contrib module. Do not activate it if you are using osm2pgsql 0.80 or newer:

psql gis < /usr/share/postgresql/8.3/contrib/_int.sql

Usage

For basic usage, see the man page of osm2pgsql (man osm2pgql) and the README found on its github page.


Before running osm2pgsql, you first must create your postgresql database and enable postgis on it.


Basic usage:

osm2pgsql -s -U postgres -d nameofdatabase /file/path/toosm/fileorpbf/name.osm

In the example above, postgres is the user of the database, the user enabled slim mode (generally recommended).

The above sample is feasible for someone looking to export a city's OSM data into a postGIS database. For more advanced and larger datasets, read the Optimization section.

Additional parameters:

-G|--multi-geometry Generate multi-geometry features in postgresql tables.

--flat-nodes flat-nodes is an alternative node store in a file which uses ~20GB (Oct 2013) instead of 100GB in Postgresql. It is also faster during import and during diff updates, and thus can only be used in slim mode. Put this file on a SSD, if you can.

Optimization

Best practices

see : Frederik Ramm : Optimising the Mapnik/osm2pgsql Rendering Toolchain 2.0 @ SOTM 2012

Slim mode

osm2pgsql has two main modes of running - normal and slim mode.

It is highly recommended to run osm2pgsql in slim mode. Some important features (including incremental updates (planet diffs, the initial load to populate the track tables, and proper evaluation of multipolygons) only work in slim mode.

The normal mode uses RAM for intermediate storage, Slim mode uses object tracking tables

  • planet_osm_nodes
  • planet_osm_ways
  • planet_osm_rels

in the database on-disk. You must use slim-mode for planet imports on 32-bit systems, since there are too many nodes to store in RAM otherwise. This limitation doesn't apply to 64-bit systems.

One benefit of not using slim mode is that osm2pgsql is a bit faster since it doesn't read from the DB until the index creation.

Implement slim mode by using the '-s' option: "osm2pgsql -s -d ...."

Parameters

  • Large imports into PostGIS are very sensitive to maintenance and monitoring configuration: it is smart to increase the value of checkpoint_segments so that autovacuum tasks don't slow down imports.
  • osm2pgsql relies much on its node cache during import. If the nodes do not fit into the cache in slim mode it needs to do database lookups which slow down the process. (Without slim mode, it fails if the nodes do not fit in the cache). Use enough cache so all nodes are cached. -C 22000 seems to do the job, even if that means you have to configure more swap space.
    • Without slim mode, there is also a "way" cache which takes up about as much space as the node cache does, but is not charged against -C. In slim mode, there is no "way" cache.

Fast disks

The bottleneck is usually the I/O when running osm2pgsql with --slim or even without it.

Multicore

With a bzipped planet file, run osm2pgsql as

bunzip -c [planet.osm.bz2] | osm2pgsql [options] /dev/stdin

Uses one core for decompressing the bzip2, and another for osm2pgsql. This is tested on Ubuntu 11.04. But doesn't work within a screen session.

Benchmarks

Please see /benchmarks for general benchmarking information.

Updating Data in Database

osm2pgsql is one of the tools used to keep OSM data updated with changes to the OpenStreetMap servers in your own PostgreSQL database. Please see Minutely_Mapnik for more information.

Bug reports

Please report bugs to the github tracker.

Until April 2013, bug reports were filed in Trac under the mapnik component.

hstore

Hstore is for sets of key/value pairs. As associative array datatype, just like a hash in perl or dictionary in python.

This should come in handy especially for rarely used tags. Using hstore, one can use any tag in sql queries like this:

gis=> SELECT COUNT(*) FROM planet_osm_point WHERE ((tags->'man_made') = 'tower');
 COUNT
-------
  447
(1 Zeile)

build osm2pgsql from svn you now will have a -k option

-k|--hstore		Generate an additional hstore (key/value) column to  postgresql tables

install into postgresql

Before postgresql 9.1

$ psql -d gis -f /usr/share/postgresql/8.4/contrib/hstore.sql

Postgresql 9.1 and later

This sql code will install the extension in your current database[1] :

CREATE EXTENSION hstore;

Getting a List of all Dirty Tiles

Speaking about the file "dirty_tiles", osm2pgsql does not provide the names of all affected tiles because this list is stripped of all redundant information. If there is, for example, the tile 3/0/1 (zoom, x, y) in the list, the also affected tile 2/0/0 will be omitted. Furthermore, if all four subtiles – 3/0/0, 3/0/1, 3/1/0 and 3/1/1 – were affected, none of them would appear in the list, they would be replaced by the entry 2/0/0.

This is a very effective way to reduce the dirty-tiles list length, however, certain applications will still need a list of all affected tiles. To accomplish this, you may adapt one of osm2pgsql's source files: "expire-tiles.c". At first, create a new file and name it "expire-tiles.diff". Now copy this contents into the new file:

*** expire-tiles_0_80_0.c    2011-05-21 15:13:49.000000000 +0200
--- expire-tiles.c    2011-08-10 16:23:05.000000000 +0200
*************** static int _mark_tile(struct tile ** tre
*** 105,114 ****
--- 105,146 ----
   */
  static int mark_tile(struct tile ** tree_head, int x, int y, int zoom) {
  	return _mark_tile(tree_head, x, y, zoom, 0);
  }
  
+ #if 1
+ static void output_dirty_tile(FILE * outfile, int x, int y, int zoom, int min_zoom) {
+   // writes a tile into output file, including all subsequent tiles
+   // with higher zoom levels;
+   int x_max, y_max;
+   int tile_size;
+ 
+   tile_size= 1;
+   while (zoom <= Options->expire_tiles_zoom) {
+     if(zoom>=min_zoom) {
+       x_max = x + tile_size;
+       while (x < x_max) {
+         y_max = y + tile_size;
+         while (y < y_max) {
+ 		      if ((outcount++ % 1000) == 0) {
+ 			      fprintf(stderr, "\rWriting dirty tile list (%ik)", outcount / 1000);
+ 			      fflush(stderr);
+ 		      }
+ 		      fprintf(outfile, "%i/%i/%i\n", zoom, x, y);
+           y++;
+         }
+         y-= tile_size;
+         x++;
+       }
+       x-= tile_size;
+     }
+     zoom++;
+     x<<= 1; y<<= 1;
+     tile_size<<= 1;
+   }
+ }
+ #else
  static void output_dirty_tile(FILE * outfile, int x, int y, int zoom, int min_zoom) {
  	int	y_min;
  	int	x_iter;
  	int	y_iter;
  	int	x_max;
*************** static void output_dirty_tile(FILE * out
*** 131,148 ****
--- 163,191 ----
  			}
  			fprintf(outfile, "%i/%i/%i\n", out_zoom, x_iter, y_iter);
  		}
  	}
  }
+ #endif
  
  static void _output_and_destroy_tree(FILE * outfile, struct tile * tree, int x, int y, int this_zoom, int min_zoom) {
  	int	sub_x = x << 1;
  	int	sub_y = y << 1;
  	FILE *	ofile;
  
  	if (! tree) return;
  
+ #if 1
+   if(this_zoom >= min_zoom) {
+     if ((outcount++ % 1000) == 0) {
+       fprintf(stderr, "\rWriting dirty tile list (%ik)", outcount / 1000);
+       fflush(stderr);
+     }
+     fprintf(outfile, "%i/%i/%i\n", this_zoom, x, y);
+   }
+ #endif
+ 
  	ofile = outfile;
  	if ((tree->complete[0][0]) && outfile) {
  		output_dirty_tile(outfile, sub_x + 0, sub_y + 0, this_zoom + 1, min_zoom);
  		ofile = NULL;
  	}

Use these commands to apply the changes and to recompile osm2pgsql's source:

patch -b expire-tiles.c expire-tiles.diff
make

See also

  • Manually building a tile server - Also explains how to create the database for osm2pgsql, import data, etc
  • PostGIS/Installation - Explains how to create the database for osm2pgsql
  • Mapnik - Explains how to import OSM data
  • High Road - collection of Postgres views that make rendering roads from Osm2pgsql schema easier and better-looking

Alternatives to Osm2pgsql


  1. [1]