Taginfo/Running

From OpenStreetMap Wiki

Most people will not have to run Taginfo themselves. If you just need access to the Taginfo data you can download it from http://taginfo.openstreetmap.org/download or use the API. But if you want to run Taginfo yourself, for instance to run it with a different dataset or do development on it, this page explains how.

If you are running your own taginfo instance, we recommend you join the mailing list, so that you get notified when the software is updated or if there are other issues.

Note that all of this is harder and more complex than it should really be. I'll try to clean up those scripts to make it a bit easier... Joto 10:20, 4 March 2011 (UTC)

Get taginfo

Taginfo is Open Source software and maintained at https://github.com/joto/taginfo . You need git to download the current version:

 git clone https://github.com/joto/taginfo.git

Installing the configuration

Taginfo comes with an example config file called taginfo-config-example.json. Make a copy of that file under the name taginfo-config.json and put it in the parent directory of the directory you found it in, i.e. it will end up outside the directory with the git repository.

The config file is in JSON format. There are some settings you need to change and some you can leave alone or change as you like. You have to change at least: instance.url, instance.name, instance.description, instance.icon, instance.contact, opensearch.shortname, and opensearch.contact. For the sources settings see below.
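
As a quick check before going further, the following Python sketch lists which of the settings named above are missing from your config or still carry the value from the example file. It is not part of taginfo. It assumes that a dotted name such as instance.url refers to a "url" entry inside an "instance" object in the JSON file; the function names are made up for this illustration.

```python
import json

# Settings the text above says must be changed.
REQUIRED_SETTINGS = [
    "instance.url", "instance.name", "instance.description", "instance.icon",
    "instance.contact", "opensearch.shortname", "opensearch.contact",
]

def lookup(config, dotted_name):
    """Return the value at e.g. 'instance.url', or None if it is absent.

    Assumption: dotted names map to nested JSON objects.
    """
    node = config
    for part in dotted_name.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node

def settings_to_review(example, config, names=REQUIRED_SETTINGS):
    """List settings that are missing or still equal to the example value."""
    todo = []
    for name in names:
        value = lookup(config, name)
        if value is None or value == lookup(example, name):
            todo.append(name)
    return todo

def load(path):
    """Read a JSON config file into a dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

Called as settings_to_review(load("taginfo/taginfo-config-example.json"), load("taginfo-config.json")) from the parent directory of the git repository, it returns the names you still need to look at. An empty list means every listed setting differs from the example.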

Dependencies

Read the README file. Some dependencies are described there and you need to install them.

On Debian, the following should install all required dependencies:

 sudo apt-get install sqlite3 libsqlite3-dev ruby-dev ruby curl m4
 sudo apt-get install zlib1g-dev libosmpbf-dev libprotobuf-dev libboost-dev libgeos-dev libexpat1-dev libsparsehash-dev libgd2-xpm-dev
 sudo gem install mongrel json sqlite3 sinatra sinatra-r18n rack-contrib

You may also need:

 sudo apt-get install libgeos++-dev

Compiling tagstats

First install Osmium (it is a header-only library, so you don't have to compile it) and its dependencies. If needed you can change the settings tagstats.cxxflags and tagstats.geodistribution in your config file.

Then go into the directory tagstats and compile the binary:

 cd tagstats
 make

The configuration for the geographical distribution map

The geographical distribution map shows where in the world each key is used. Two static maps (one for nodes, one for ways) for each key are created when tagstats runs.

To define which area your map shows, you'll have to define the bounding box with the left, bottom, right, and top settings. You also have to define the width and height of the image. The larger your image is, the more memory is needed when running tagstats and later more disk space for the database files!

We need to keep all the locations of all the nodes in memory to create the way maps. To keep the memory usage within reason, only 2 bytes are allotted for each coordinate pair, so width*height must not be bigger than 65536 (256*256). (You can change this limit by changing tagstats.geodistribution_int=uint16_t to uint32_t in your config file and recompiling tagstats. You'll need twice as much memory, but your image can then be up to 65535*65535 pixels.)

There is an additional configuration parameter called scale_image. That's the factor the resulting image is scaled by when it is shown in the user interface. This makes the image bigger, but doesn't enhance the resolution.

If your memory is really tight and you don't need the distribution maps for ways, you can comment out all lines containing TAGSTATS_GEODISTRIBUTION_FOR_WAYS in the Makefile and recompile tagstats. If you are working with the whole planet or whole continents of data, change SparseTable to Mmap in the tagstats.geodistribution setting in the config file and recompile. Note that with Mmap storage you'll need 2 bytes of storage per node, so that's over 1 GByte. See the Osmium documentation for details on what Mmap and SparseTable mean.
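
The size rule above is easy to get wrong, so here is a small Python sketch of the check for the default uint16_t setting. It only restates the arithmetic from the paragraph above; the constant and function names are invented for this example and do not exist in taginfo.

```python
# Limit stated above for the default uint16_t setting: 256*256 = 65536 cells.
UINT16_CELL_LIMIT = 256 * 256

def fits_default_limit(width, height):
    """Return True if a width x height image stays within the uint16_t limit."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return width * height <= UINT16_CELL_LIMIT
```

For instance, fits_default_limit(360, 180) is true (64800 cells), while fits_default_limit(690, 300) is false (207000 cells). By this rule, an image of the size used in the Switzerland example further down would need the uint32_t setting.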

It is your job to provide a background image and set the location in the geodistribution.background_image configuration parameter. See below how to create such an image.

Import data

Setting up the data sources

Taginfo comes with several scripts to import/create all the data it needs. They get the data from the different sources: An OSM file (planet), JOSM configuration, etc.

Source databases can either be created from the actual sources or they can be downloaded from the main taginfo site at http://taginfo.openstreetmap.org/download . It is better to use the download if you want to use the same data anyway, especially for the "wiki" source, because otherwise it will get thousands of pages from the wiki. You can change those settings in the taginfo-config.json config file. To get everything except the OSM data from the taginfo site, set it like this:

  ...
  "sources": {
        "download": "languages josm potlatch wiki",
        "create": "db",
  ...

Setting up the build directory

You have to give the update scripts a directory where they should store all the data. We refer to it as BUILD_DIR in the following.

Creating the databases

Now you can build all the databases:

 cd sources
 ./update_all.sh BUILD_DIR

Depending on the sources, the imports can take quite a while. Running taginfo on a planet dump can take many hours and needs a lot of RAM.

To create some statistics during an update, taginfo needs the statistics from the last update run. So you will have to run the update twice before you get all statistics to work properly. (But most of the stuff will already work after the first run.)

Moving data to the right place

After running the imports you have to move the data to the right place for the webserver. Please note that at the moment the webserver expects to find the data in the directory ../../data relative to the directory it was started from. Assuming the data should be stored in INSTALL_DIR, do the following to move the databases:

 mv BUILD_DIR/taginfo-*.db BUILD_DIR/*/taginfo-*.db INSTALL_DIR/
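
If you drive the update from a Python script rather than from the shell, the mv command above can be sketched as follows. This is an illustration only, not something taginfo ships. The function name is made up, and it uses the same two file patterns as the command above.

```python
import glob
import os
import shutil

def move_databases(build_dir, install_dir):
    """Move taginfo-*.db files from build_dir and its direct subdirectories."""
    patterns = [
        os.path.join(build_dir, "taginfo-*.db"),       # BUILD_DIR/taginfo-*.db
        os.path.join(build_dir, "*", "taginfo-*.db"),  # BUILD_DIR/*/taginfo-*.db
    ]
    moved = []
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            target = os.path.join(install_dir, os.path.basename(path))
            shutil.move(path, target)
            moved.append(target)
    return moved
```

The function returns the list of new file locations, which is handy for logging what an update run actually produced.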

Depending on where your data is, a script like this will do the entire database update process for you:

 cd /osm/taginfo/taginfo/sources
 ./update_all.sh /osm/taginfo/var/sources
 mv /osm/taginfo/data/taginfo-* /osm/taginfo/data/old/
 mv /osm/taginfo/var/sources/taginfo-*.db /osm/taginfo/var/sources/*/taginfo-*.db /osm/taginfo/data/
 mv /osm/taginfo/var/sources/download/* /osm/taginfo/download/

After an update of the databases you have to restart the taginfo webserver so that it will pick up the new databases!

Webserver

The taginfo webserver presents the user interface. It is written in Ruby using the Sinatra framework.

Start webserver from command line

It's easiest to start taginfo from the command line:

 cd web
 ./taginfo.rb PORT

If you do not give a port number Taginfo will start in debugging mode on port 4567.

Running webserver under Apache/Passenger

Passenger (http://www.modrails.com/) is an Apache module to run Ruby/Rails applications. You can run taginfo under Passenger with this config.ru, which should be placed in the web directory:

 require 'rubygems'
 require 'sinatra'
 require './taginfo.rb'
 
 set :run, false
 set :environment, :production
 
 run Taginfo

Adapting the map view

The map view that shows the distribution of tags will by default compile maps for the entire planet. This section describes what you have to do to create maps for a specific region only.

Computing the Region Boundaries

First, you will need the boundaries of the region you want to show. For example, the data of the Switzerland excerpt has a BBOX of (5.95 45.81, 10.49 47.80), so to round things up, we choose boundaries of (5.93 45.81, 10.53 47.81) for the image to be shown.

Taginfo uses a simple linearly scaled map in WGS84 projection. So, you will also need to calculate a scale factor for your map. Simply start from the area covered in degrees. For the Switzerland example, it covers an area of 4.6°x2°, so a scale factor of 150 results in a map of size 690x300, which is just right. (At this point you see why it is necessary to round up the area covered by the bounding box. If your scaling does not result in an image with integral numbers for width and height, you will find later that your background map and the distribution maps are slightly out of sync.)
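
The calculation above can be sketched in Python as follows. The function is hypothetical and only reproduces the arithmetic from this section. It takes the coordinates as strings and uses decimal arithmetic so that values like 4.6 * 150 come out exactly, and it refuses bounding boxes that do not give whole-pixel sizes.

```python
from decimal import Decimal

def image_size(left, bottom, right, top, scale):
    """Return (width, height) in pixels for a bounding box given in degrees.

    Coordinates are passed as strings, e.g. "5.93", to avoid float rounding.
    """
    width = (Decimal(right) - Decimal(left)) * Decimal(scale)
    height = (Decimal(top) - Decimal(bottom)) * Decimal(scale)
    for name, value in (("width", width), ("height", height)):
        if value != value.to_integral_value():
            raise ValueError(f"{name} is {value} pixels, not a whole number")
    return int(width), int(height)
```

With the rounded Switzerland boundaries, image_size("5.93", "45.81", "10.53", "47.81", 150) gives (690, 300). The unrounded bounding box (5.95 45.81, 10.49 47.80) raises an error at the same scale, because its height would be 298.5 pixels.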

Creating the Base Map Using Mapnik

You will need a database with the outlines of the regions you want to show. Try to use and adapt the standard style sheet for the OSM Mapnik map. The map needs to be in WGS84 projection, so make sure your style sheet starts like this:

 <Map background-color="#d7d8f3" buffer-size="0" srs='+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs'>

Then the following Python script will create a PNG of just the right size:

 import mapnik2 as mapnik
 
 m = mapnik.Map(690, 300)
 mapnik.load_map(m, 'taginfo_outline.xml')
 
 bbox = mapnik.Box2d(5.93, 45.81, 10.53, 47.81)
 m.zoom_to_box(bbox)
 
 im = mapnik.Image(690, 300)
 mapnik.render(m, im)
 im.save('worldp.png', 'png256')

Again, this is for the Switzerland example and assumes your style file is called taginfo_outline.xml. So, adapt it as required.