Daily update an OSM XML file

From OpenStreetMap Wiki

This page describes how to keep a local OSM file up-to-date.

Please note that there are alternative update strategies. For further details refer to the all-purpose program Osmosis and to the specialized update program osmupdate.

Purpose

Why should you update an OSM file every day? Some applications need an OSM file as input. Of course, you can download a new copy of the file every time you need one, but this can cause a lot of data traffic.

You should not do this daily update if...
  • you keep every available OSM item in a PostgreSQL database which is kept up-to-date directly with Osmosis,
  • you need minutely updates,
  • you need an up-to-date OSM file only every other month or
  • you need only a very small region, e.g. a city, for which you could easily download its .pbf or .osm.bz2 file every day.
You should do this daily update in every other case, especially if...
  • you do filtering with osmfilter on a regular basis,
  • you want to minimize data traffic,
  • you want to minimize hard disk space,
  • your computer has little main memory or
  • you want to reduce CPU load (e.g., having an Intel Atom or a virtual server).

Prerequisites

Hardware

A weak CPU, e.g. Intel Atom or a virtual Internet server, will suffice. 1 GB RAM is recommended, but 512 MB will be enough if you reduce the required main memory for the program osmconvert a bit (add parameter -h=300).

Operating System

We assume that you use Linux as your operating system. However, the software you need is available for Windows too. The commands will differ slightly.

Prepare your System

First, create a new directory for all the files used in the following steps. If you do not have a specific preference, name the new directory "osmupdate" and create it in your home directory:

mkdir ~/osmupdate
cd ~/osmupdate

Get osmconvert Program

This program will be needed: osmconvert. You can download it as a binary; however, it is recommended to download and compile the source code because the binary may be out of date. To download and compile it from the command line, enter this command:

wget -O - http://m.m.i24.cc/osmconvert.c |cc -x c - -lz -o osmconvert

Get a Border Polygon

As soon as you have chosen a geographical region, try to get a border polygon for it. A couple of polygons are available at openstreetmap.org. If there is no suitable polygon, you can create a new one. Alternatively, you can use a bounding box. In this case, you will have to replace the "-B=a.poly" in the following commands with e.g. "-b=11,49,12,49.5".

If you want to create an easy border polygon by hand, open a new file with the name a.poly and enter the corner points of the polygon, using the following format. For example:

 -1 57
 2 57
 3.5 56.3
 3 55
 -1.2 55

(Be sure to start every line with a space character.)
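
If a rectangular region is sufficient, the corner points can also be generated instead of typed. The following is only a minimal sketch. It assumes the simple format shown above (a leading space, then longitude and latitude) and the min_lon,min_lat,max_lon,max_lat order suggested by the bounding box example. The function name bbox_to_poly is hypothetical and not part of osmconvert.

```shell
# Hypothetical helper: print the four corners of a bounding box
# (min_lon min_lat max_lon max_lat), one " lon lat" pair per line.
# printf reuses its format string for each further pair of arguments.
bbox_to_poly() {
  printf ' %s %s\n' "$1" "$2" "$3" "$2" "$3" "$4" "$1" "$4"
}

# Usage (writes the polygon file used in the following commands):
#   bbox_to_poly 11 49 12 49.5 >a.poly
```

Every generated line starts with a space character, as required.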

In the following example we will download a border polygon for Germany and name it "a.poly":

wget -O a.poly https://trac.openstreetmap.org/export/24667/applications/utils/osm-extract/polygons/germany.poly

Get the OSM XML File

You should try to download an OSM file of the region you chose. Regional OSM files are available through several servers; for a list see Planet. For Germany, there is a file available at geofabrik.de. Each of the following commands will download the file, clip it to the selected region and store it in an .o5m file named a.o5m in one run. The first command will be considerably faster because the PBF format is more efficient than .osm for this purpose.

wget -O - https://download.geofabrik.de/osm/europe/germany.osm.pbf |./osmconvert - -B=a.poly --out-o5m >a.o5m
wget -O - https://download.geofabrik.de/osm/europe/germany.osm.bz2 |bunzip2 |./osmconvert - -B=a.poly --out-o5m >a.o5m

If there is no file for your region, choose a larger one which covers your region or choose the whole planet.osm file.

wget -O - http://planet.osm.org/pbf-experimental/planet-latest.osm.pbf | ./osmconvert - -B=a.poly --out-o5m >a.o5m

Although .pbf files are packed originally, we choose to reformat them because using the .o5m format will speed up the processing a bit.

After downloading and converting the OSM file, you might have to apply the latest changes by hand. Planet files are usually up to one week old; regional files may be more recent, but do not expect them to have been exported at midnight. The germany.osm.pbf, for example, becomes available at about 03:00 or 04:00 each day, but reflects the state of the day before yesterday at about 20:00. Therefore you will have to apply at least the latest two .osc change files to bring the regional OSM file up to date.

Download the necessary change files from planet.osm.org/daily and put them into your osmupdate directory. Now, unpack all these .osc files (command gunzip) and apply them with osmconvert to your .o5m file. For example:

rm -f b.o5m
./osmconvert a.o5m -B=a.poly a.osc --out-o5m >b.o5m
mv -f b.o5m a.o5m
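
To decide which change files to fetch, it can help to compute their names. The following is a minimal sketch which assumes the YYYYMMDD-YYYYMMDD.osc.gz naming used by the daily update script further down this page, and GNU date as found on Linux. The function name osc_name_for is hypothetical.

```shell
# Hypothetical helper: name of the daily change file that ends on the
# given UTC day (format YYYY-MM-DD). Requires GNU date.
osc_name_for() {
  printf '%s-%s.osc.gz\n' \
    "$(date -u -d "$1 -1 day" +%Y%m%d)" \
    "$(date -u -d "$1" +%Y%m%d)"
}

# Names of the two most recent change files, as of today (UTC)
osc_name_for "$(date -u -d yesterday +%Y-%m-%d)"
osc_name_for "$(date -u +%Y-%m-%d)"
```

Whether the newest of these files already exists on the server depends on the time of day.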

If you do not want to update the .o5m file by hand, you can use the following script instead. It will automatically download and apply the last 8 change files.

rm -f b.o5m *.osc.gz
OSCFILES=$(wget -O - http://planet.osm.org/daily/ |grep ".osc.gz" |sed s"/<a href=\"/\n/" |sed s"/\">/\n/" |grep -v "<" |grep ".osc" |tail -8 |sed s"/^/ http:\/\/planet.osm.org\/daily\//" |tr -d "\n")
wget $OSCFILES
gunzip *.osc.gz
./osmconvert a.o5m -B=a.poly *.osc --out-o5m >b.o5m
mv -f b.o5m a.o5m
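
The second line of this script extracts the file names from the HTML directory listing with a chain of grep and sed calls. The same idea can be sketched in a more readable form. This is only an illustration: it assumes that the listing links each file as href="name.osc.gz", the sample lines are made up, and the function name extract_osc_names is hypothetical.

```shell
# Hypothetical helper: read an HTML directory listing on stdin and print
# the names of the linked .osc.gz files, one per line.
extract_osc_names() {
  grep -o 'href="[^"]*\.osc\.gz"' | sed -e 's/^href="//' -e 's/"$//'
}

# Made-up sample of a listing, for illustration only
sample='<a href="../">Parent</a>
<a href="20120314-20120315.osc.gz">20120314-20120315.osc.gz</a>
<a href="20120315-20120316.osc.gz">20120315-20120316.osc.gz</a>'

# Print the newest name; "tail -8" would keep the last 8 names instead
printf '%s\n' "$sample" | extract_osc_names | tail -1
```

Against the real server, the output of wget -O - for the daily directory would be piped into the function in place of the sample.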

Daily Update

Two steps are necessary to perform the daily update: get the latest .osc file and apply the changes to the existing local .o5m file "a.o5m". The following commands will do this. A log file is written, to document every step, and the responsible user will be mailed if anything goes wrong.

Emails can only be sent if the appropriate packages have been installed on your system, which is typically the case on an Internet server. Otherwise, remove the lines which contain the command mail.

Create a new file named osmupdate.sh; use gedit as editor. In case you do not have a graphical environment, use nano or vi, for example. Then enter these contents (replace my_user_name with the name of your account on the machine and my_email@address.com with your actual email address):

#!/bin/bash
cd /home/my_user_name/osmupdate  # (insert your user name here)

# rotate log and write new headline
mv upd.log upd.log_temp
tail -10000 upd.log_temp>upd.log
rm upd.log_temp
echo >>upd.log
echo Starting update script >>upd.log
date >>upd.log

# ensure that the last downloaded .osc is not too old
if [ "0"$(stat -L -c"%Y" a.osc 2>/dev/null) -lt $(date -d"yesterday 01:00" +"%s") ]; then
  echo "Error: Did not download yesterday's OSM change file." >>upd.log
  (echo "Did not download yesterday's OSM change file"; ls -lL) |mail -a "From: osmupdate" -s "No .osc file download yesterday" my_email@address.com
  exit
fi

# get the latest .osc file
OSCFILE=$(date -u -d yesterday +%Y%m%d)"-"$(date -u +%Y%m%d)".osc.gz"
echo Latest changefile is $OSCFILE>> upd.log
date >>upd.log
t=0  # time we have waited, in minutes
while [ $t -le 720 ]; do  # try max 720 minutes (12 hours) to get the .osc file
  rm -f a.osc
  wget -O - http://planet.osm.org/daily/$OSCFILE 2>/dev/null | gunzip >a.osc
  ls -l a.osc >>upd.log
  date >>upd.log
  if [ "0"$(stat -Lc%s a.osc 2>/dev/null) -lt 1000000 ]; then
    echo "OSM change file "$OSCFILE" not yet available; will wait some minutes" >>upd.log
    sleep 1260; t=$(expr $t + 21)
    continue;
    fi
  if (! tail -4 a.osc |grep -c "</osm" >/dev/null); then
    echo "OSM change file "$OSCFILE" without end tag" >>upd.log
    echo "Will wait some hours and hope OSM team will fix it" >>upd.log
    sleep 10860; t=$(expr $t + 181)
    continue;
    fi
  t=0
  break
  done
if [ $t -gt 0 ]; then  # timeout
  echo "Error: No valid OSM change file "$OSCFILE"" >>upd.log
  (echo "No valid changefile available: "$OSCFILE; ls -lL) |mail -a "From: osmupdate" -s "No .osc file today" my_email@address.com
  exit
  fi

# apply the .osc file to the .o5m file
rm -f b.o5m
./osmconvert a.o5m a.osc -B=a.poly --out-o5m 2>>upd.log >b.o5m
date >>upd.log
if [ "0"$(stat -c%s b.o5m 2>/dev/null) -lt 900000 ]; then
  ls -l b.o5m >>upd.log
  echo "Updated .o5m file too small (implausible)" >>upd.log
  exit
fi
./osmconvert -t --out-o5m b.o5m
if [ $? -ne 0 ]; then
  (echo "osmconvert reported an error."; ./osmconvert -t --out-o5m b.o5m 2>&1) |mail -a "From: osmupdate" -s "osmconvert error" my_email@address.com
  exit
fi
mv -f a.o5m aa.o5m
mv -f b.o5m a.o5m
ls -l a.o5m >>upd.log
date >>upd.log
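
The script relies on two plausibility checks for the downloaded change file: a minimum file size and the presence of the closing tag near the end of the file. As a minimal sketch, these checks could be written as small functions. The names size_of and has_end_tag are hypothetical, and GNU stat is assumed, as in the script itself.

```shell
# Hypothetical helpers mirroring two checks of the script above.

# Print the size of a file in bytes, or 0 if it does not exist.
size_of() {
  stat -Lc%s "$1" 2>/dev/null || echo 0
}

# Succeed if one of the last four lines contains the closing tag.
has_end_tag() {
  tail -4 "$1" 2>/dev/null | grep -q "</osm"
}

# Same thresholds as in the script: at least 1000000 bytes and an end tag
if [ "$(size_of a.osc)" -lt 1000000 ] || ! has_end_tag a.osc; then
  echo "a.osc is missing, too small or incomplete"
fi
```

Printing 0 for a missing file serves the same purpose as the "0" prefix in the script: the numeric comparison never receives an empty string.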

The new file osmupdate.sh must be made executable:

chmod ug+x osmupdate.sh

To run the update job every day, register the script with cron, for instance by creating a file in the /etc/cron.d directory:

echo "45 16 * * * my_user_name /home/my_user_name/osmupdate/osmupdate.sh >/dev/null 2>&1" | sudo tee /etc/cron.d/osmupdate >/dev/null

(You will have to replace my_user_name with the name of your account on the machine, of course.)

Please ensure that the time of day (16:45 in the example) is not close to the time at which the change files are usually generated. Cron interprets the time you enter as local time, whereas the new planet change file is usually made available every night between 01:00 and 06:00 UTC.
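
To see which local times correspond to this UTC window on your machine, a small conversion helps. The following is a minimal sketch which assumes GNU date; the function name utc_to_local is hypothetical.

```shell
# Hypothetical helper: convert a UTC time of day (HH:MM) to local time.
# Requires GNU date; the local time zone comes from the system or $TZ.
utc_to_local() {
  date -d "$1 UTC" +%H:%M
}

echo "Change files appear between $(utc_to_local 01:00) and $(utc_to_local 06:00) local time"
```

Choose a cron time that lies well outside the printed window.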

Benchmarks

(Please add comments.)