These installation instructions are for the older Nominatim V1. For the current version of Nominatim, see Nominatim/Installation.
These scripts are used in conjunction with the -O gazetteer mode of osm2pgsql to generate a database suitable for geocoding.
- 1 Changes
- 2 Prerequisites
- 3 First Installation
- 3.1 Make the database
- 3.2 Import OSM data
- 3.3 Build the transliteration module
- 3.4 Various supplementary data, used to patch holes in OSM data
- 3.5 Create website user
- 3.6 Add gazetteer functions to database
- 3.7 Copy the data into the live tables
- 3.8 Index the database
- 3.9 Various 'special' words for searching
- 3.10 Arrange permissions
- 3.11 Set up the website
- 4 Updates
Prerequisites
- GCC compiler http://gcc.gnu.org/
- PostgreSQL http://www.postgresql.org/
- Proj4 http://trac.osgeo.org/proj/
- GEOS http://trac.osgeo.org/geos/
- PostGIS http://postgis.refractions.net/
- PHP http://php.net/ (both apache and command line)
- PEAR::DB http://pear.php.net/package/DB
In standard Debian/Ubuntu distributions these should all be available as packages.
apt-get install php5-pgsql postgis postgresql php5 php-pear gcc proj libgeos-c1 postgresql-contrib postgresql-8.4-postgis postgresql-server-dev-8.4
pear install DB
PostgreSQL and PostGIS Version
Please be aware that various problems have been found running Nominatim on PostgreSQL 8.4. Note that 8.4 also slows down the indexing massively(!). It is currently recommended to use PostgreSQL 8.3. Initial testing suggests that version 9.0 works.
Also, Nominatim makes heavy use of the ST_CONTAINS PostGIS function which seems to be rather slow on PostGIS versions before 1.5, so if you have the option of using PostGIS 1.5 or later, that would be a good idea. Some versions of 1.3 also have stability problems.
For a full planet install you will need a minimum of 300GB of hard disk space. On the OSM Nominatim server (http://wiki.openstreetmap.org/wiki/Servers/katie) the initial import (osm2pgsql) takes around 30 hours, and the rest of the indexing process takes approximately 10 days using both processors in parallel.
On a 16-core 48 GB machine with fast disks, the initial import takes around 4 hours, and the rest of the indexing process still takes about a week.
Note: you may still find the database name "gazetteer" or "gazetteerworld" and the user name "twain" hard-coded in some parts. If you use different names, grep for these and change them.
Make the database
sudo su postgres
createdb gazetteer
createlang plpgsql gazetteer
cat /usr/share/postgresql/8.3/contrib/_int.sql | psql gazetteer
(Install location of /contrib and /postgis directories may differ on your machine.)
cat /usr/share/postgresql/8.3/contrib/pg_trgm.sql | psql gazetteer
cat /usr/share/postgresql-8.3-postgis/lwpostgis.sql | psql gazetteer
(lwpostgis.sql is replaced with postgis.sql in newer versions of postgis)
cat /usr/share/postgresql-8.3-postgis/spatial_ref_sys.sql | psql gazetteer
Full script for Ubuntu 11.04:
cat /usr/share/postgresql/8.4/contrib/_int.sql | psql gazetteer
cat /usr/share/postgresql/8.4/contrib/pg_trgm.sql | psql gazetteer
cat /usr/share/postgresql/8.4/contrib/postgis-1.5/postgis.sql | psql gazetteer
cat /usr/share/postgresql/8.4/contrib/postgis-1.5/spatial_ref_sys.sql | psql gazetteer
You might want to tune your PostgreSQL installation so that the later steps make best use of your hardware:
- check shared_buffers, maintenance_work_mem, and work_mem and set them to larger values than the default (e.g. 4G/1G/1G?)
- set random_page_cost to 1.5 (lower than the default of 4).
- for the initial import, set autovacuum=off and fsync=off; restore these to "on" later.
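As a sketch, the corresponding entries in postgresql.conf might look like this (the values are examples for a machine with plenty of RAM, not recommendations; tune them to your own hardware):

```
shared_buffers = 4GB
maintenance_work_mem = 1GB
work_mem = 1GB
random_page_cost = 1.5
# For the initial import only; set both back to on afterwards
autovacuum = off
fsync = off
```

Remember to restart PostgreSQL after changing these settings.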
Import OSM data
First, download a planet file.
Compile osm2pgsql (from http://svn.openstreetmap.org/applications/utils/export/osm2pgsql) unless you have already got a package for it:
cd osm2pgsql
./autogen.sh
./configure
make
Load the planet file. The database created in this step is not compatible with one you might already have for rendering, created without the -O gazetteer option:
./osm2pgsql -lsc -O gazetteer -C 2000 -d gazetteer planet.osm.bz2
Make sure that you have -l (--latlon) and -s (--slim); these are required. -C is the cache size in MB. If you have enough memory, set -C to 8 times the highest node ID divided by one million (at the time of writing, -C 8000 is sure to give you the best performance; higher values do not improve performance). If you do not have that much memory, use as much as you have.
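The rule of thumb above can be computed directly. A small sketch, using a hypothetical highest node ID of one billion (look up the actual current value yourself, e.g. from the last node in your planet file):

```shell
# Hypothetical highest node ID -- replace with the real current value
highest_node_id=1000000000
# 8 times the highest node ID, divided by one million, gives -C in MB
cache_mb=$((highest_node_id * 8 / 1000000))
echo "$cache_mb"
```

For this example the result is 8000, i.e. you would pass -C 8000 to osm2pgsql.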
You do not have to expand the planet file, osm2pgsql will handle the bzip.
Ignore notices about missing functions and data types.
If you get a projection initialization error, your proj installation can't be found in the expected location. Copying the proj folder to /usr/share/ will solve this.
Build the transliteration module
Up to osm2pgsql 0.69
cd gazetteer
make
Update gazetteer-functions.sql to give the absolute path to the module, replacing /home/twain/osm2pgsql/gazetteer/gazetteer.so.
osm2pgsql 0.70.x and beyond
With the autotools changes in osm2pgsql 0.70.x, the gazetteer.so shared library is built as sourcedir/gazetteer/.libs/gazetteer.so in the (hidden) .libs subdirectory, and installed as $prefix/lib/osm2pgsql/gazetteer.so (with /usr/local being the default $prefix).
Up to 0.70.5 it is still necessary to change the path to gazetteer.so in gazetteer-functions.sql manually. Starting with 0.70.6 make will take care of setting the installation path in the gazetteer-functions.sql.
Various supplementary data, used to patch holes in OSM data
cd gazetteer
cat import_country_osm_grid.sql | psql gazetteer
cat import_worldboundaries.sql | psql gazetteer
cat import_country_name.sql | psql gazetteer
cat import_gb_postcode.sql | psql gazetteer
cat import_gb_postcodearea.sql | psql gazetteer
cat import_us_state.sql | psql gazetteer
cat import_us_statecounty.sql | psql gazetteer
Create website user
Create a website user; in this example, www-data:
createuser -SDR www-data
Add gazetteer functions to database
First, move gazetteer.so to somewhere in your /usr/local/lib directory tree (/usr/local/lib/gazetteer/gazetteer.so). Gazetteer.so is found in ~mapnik/bin/osm2pgsql/gazetteer/.libs/ if you've followed the default installation instructions. When you've installed a .deb package you can try dpkg -S gazetteer.so.
Edit the path to gazetteer.so in gazetteer-functions.sql to where you've found/installed the lib.
cat gazetteer-functions.sql | psql gazetteer
The output is long and you can easily miss errors that scroll past; pipe it through less.
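Another way to avoid missing errors is to capture the output and filter it afterwards. This is a generic sketch using sample text standing in for the real run; in practice you would pipe the output of psql (e.g. cat gazetteer-functions.sql | psql gazetteer 2>&1) into the same filter:

```shell
# Sample psql-style output standing in for the real run;
# grep -n shows only the error lines, with their line numbers
printf 'NOTICE: ok\nERROR: type "x" does not exist\nNOTICE: done\n' \
  | grep -n '^ERROR'
```

This prints only the line 2:ERROR: type "x" does not exist, so a single error in thousands of notices cannot scroll past unseen.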
Ignore any errors about place_boundingbox not existing in the first run!
If you get an error ERROR: type "planet_osm_ways" does not exist, that means you are trying to run the script on a non-slim database which is not supported; you need to specify -s on first import.
cat gazetteer-tables.sql | psql gazetteer
If you get the error ERROR: operator class "gin_trgm_ops" does not exist for access method "gin", then you have forgotten to load the pg_trgm.sql contrib module as described earlier. You can do that now and re-run only the gazetteer-tables step.
cat gazetteer-functions.sql | psql gazetteer
You really do need to run gazetteer-functions.sql TWICE!
Copy the data into the live tables
This does the first stage of indexing using various triggers and will take a while (for a full planet, somewhere between 10 and 30 hours depending on your setup):
cat gazetteer-loaddata.sql | psql gazetteer
If this errors out you've probably missed an error in a step before this.
If you're curious about the progress, execute the SQL command select pg_size_pretty(pg_relation_size('placex')); and run the same query with place instead of placex. When loading is finished, placex will be within +/- 25% of the size of place.
It is advisable to run PostgreSQL's ANALYZE command (for example, psql -c "ANALYZE;" gazetteer) after the loaddata step.
Index the database
This will take a very long time - up to 10 days for a full planet on a small machine.
For small imports (single country) you can use:
cat gazetteer-index.sql | psql gazetteer
For anything large you will need to use the util.update.php script instead.
Be sure to fix the database connection string in that PHP script so it can access your database.
If you have a multi processor system you can make use of the multithreading mode like this:
./util.update.php --index --index-instances 8 --max-load 10 --max-blocking 10
This will create 8 threads, and pause them only if the load goes above 10 or there are more than 10 blocking processes according to /proc/stat.
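The load check can be approximated from the shell. This is only a sketch of the idea behind --max-load, not the script's actual implementation; it uses the 1-minute load average from /proc/loadavg and a threshold of 10:

```shell
# Read the 1-minute load average (first field of /proc/loadavg)
load=$(cut -d' ' -f1 /proc/loadavg)
max_load=10
# Floating-point comparison via awk: above the threshold, workers would pause
awk -v l="$load" -v m="$max_load" 'BEGIN { print (l > m) ? "pause" : "run" }'
```

On an otherwise idle machine this prints "run"; under heavy load it prints "pause", which is when the indexing threads would be held back.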
As a sanity check after indexing, if you do a select count(*) from search_name you should end up with something in the region of 20 million.
Various 'special' words for searching
There is a detailed description in the file itself and at Nominatim/Special_Phrases.
cat import_specialwords.sql | psql gazetteer
Arrange permissions
Ensure that all tables in the database are owned by the www-data user:
for tbl in `psql -qAt -c "select tablename from pg_tables where schemaname = 'public';" gazetteer` ; do psql -c "alter table $tbl owner to \"www-data\"" gazetteer; done
Set up the website
cp website/* ~/public_html/
You will need to make sure website/.htlib/settings.php is configured with correct values.
Change CONST_Website_BaseURL and CONST_Database_DSN according to your installation
Updates
If you want to run continuous updates, you can either use Osmosis to download replication diffs or have the util.update.php script load hourly or daily diffs.
Update the table 'import_status' to reflect the date of your planet dump file (you might want to have a day's overlap to ensure that no data is missed).
Edit util.update.php to replace /home/twain with the location you wish to store your files, then run:
./util.update.php --import-daily --import-all --index