Tiles@home/Currency

From OpenStreetMap Wiki

Software described on this page or in this section is unlikely to be compatible with API v0.4, API v0.5, and API v0.6 (current version) deployed in 2007 and 2009.
If you have fixed the software, or concluded that this notice does not apply, remove it.

This page describes methods to verify that the tiles@home tiles are current (i.e. not out of date) and complete (i.e. for every location we have OSM data for, there's also a tile).

NOTE: The scripts discussed here need a minor overhaul, because the stats file format changed when layers were introduced.

A temporary fix is to convert the new stats file format back to the old format:

 zcat latest.txt.gz | grep ",12,1," | \
   awk -F "," '{print $1 "," $2 "," $3 "," $7 "," $5 "," $6}' > stats.txt

This produces an old-format stats file, and thereby a working lastmodtile.pl and comparestats.pl. The resulting stats file contains only tiles with zoom=12 and layer=1.

Prerequisites

You will need the latest Planet.osm (hereafter called the "planet file") and the latest tiles@home statistics file (http://dev.openstreetmap.org/~ojw/Stats/, hereafter called the "stats file"), as well as the Perl scripts lastmodtile.pl, comparestats.pl, and checkl7.pl from the SVN repository (osm/utils/tilesAtHome/tools/check-currency).

Introduction

For the slippy map, there is nothing special about any one zoom level. The tiles@home process, however, uses level-12 tiles, a.k.a. "tilesets", as a kind of pivot element in tile generation. Whenever a tiles@home client renders a level-12 tile, it also renders all level-13 to level-17 tiles contained therein; the level-7 to level-11 tiles are created from the level-12 tiles using bitmap operations only (i.e. they are not independently rendered).

This means that level-12 tiles are the main focus of currency checking. If all level-12 tiles are current, then so should all level-13 to level-17 tiles (if they exist), and the level-7 to level-11 tiles can be generated quickly where required.
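The relationship between a level-12 tile and the tiles above and below it is plain integer arithmetic on the tile coordinates. The following is a minimal sketch, assuming the usual slippy-map numbering in which each zoom step doubles x and y; the function names are made up for illustration and are not part of the tiles@home tools.

```python
def ancestor(x, y, zoom, target_zoom):
    """Return the (x, y) of the tile at target_zoom that contains tile (x, y) at zoom."""
    if target_zoom > zoom:
        raise ValueError("target_zoom must not exceed zoom")
    shift = zoom - target_zoom
    return x >> shift, y >> shift


def descendant_block(x, y, zoom, target_zoom):
    """Return (x_min, y_min, side) of the square of tiles at target_zoom
    covered by tile (x, y) at zoom; the square is side * side tiles."""
    if target_zoom < zoom:
        raise ValueError("target_zoom must not be below zoom")
    side = 1 << (target_zoom - zoom)
    return x * side, y * side, side


def tiles_below(zoom, max_zoom):
    """Count all tiles on levels zoom+1 .. max_zoom contained in one tile at zoom."""
    return sum(4 ** (z - zoom) for z in range(zoom + 1, max_zoom + 1))


if __name__ == "__main__":
    print(ancestor(2377, 1177, 12, 7))           # level-7 tile above a level-12 tile
    print(descendant_block(2377, 1177, 12, 17))  # level-17 tiles inside it
    print(tiles_below(12, 17))                   # tiles on levels 13 to 17
```

Under these assumptions, one level-12 tile covers 1364 tiles on levels 13 to 17, and each level-7 tile covers a 32 x 32 block of level-12 tiles.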

Step 1: Finding Data Timestamps for Tiles

The first thing to do is process the planet file, compute which level-12 tile each node lies on, and generate a list containing all level-12 tiles for which data is present together with the time when that data was last changed.

The lastmodtile.pl script will do that:

 perl lastmodtile.pl planet-010110.osm > lastmodtile.out

It currently requires about 700 MB RAM, creates an output file of about 1 MB, and runs for 15 minutes on an average PC. The output file contains tile-x, tile-y, and timestamp (as epoch value):

 2377 1177 1158005727
 2041 1358 1167479172
 2105 1356 1167514001
 ...

With minor modifications to the script, you could also find the last modification timestamps for tiles at other zoom levels, but that is not required here.
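The core of this step can be sketched as follows. This is not the code of lastmodtile.pl; it is a minimal illustration that assumes the standard slippy-map (spherical Mercator) tile formula and assumes the nodes have already been extracted from the planet file as (latitude, longitude, epoch timestamp) tuples. The function names are hypothetical.

```python
import math


def tile_for(lat, lon, zoom=12):
    """Return the (x, y) slippy-map tile containing a point (assumed standard formula)."""
    n = 1 << zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that points on the eastern or southern edge stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def last_modified_per_tile(nodes, zoom=12):
    """Map each tile to the newest timestamp of any node lying on it.

    nodes: iterable of (lat, lon, epoch_timestamp) tuples (assumed input shape).
    """
    latest = {}
    for lat, lon, timestamp in nodes:
        key = tile_for(lat, lon, zoom)
        if timestamp > latest.get(key, 0):
            latest[key] = timestamp
    return latest


if __name__ == "__main__":
    sample = [
        (0.0, 0.0, 1158005727),
        (0.0, 0.0, 1167479172),
        (51.5, -0.12, 1167514001),
    ]
    for (x, y), timestamp in sorted(last_modified_per_tile(sample).items()):
        # Same layout as the output shown above: tile-x, tile-y, timestamp.
        print(x, y, timestamp)
```

Keeping only one timestamp per tile is what keeps the output small even though the input covers every node in the planet file.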

Step 2: Finding outdated level-12 tiles from tiles@home

This step compares the last modification timestamp of each level-12 tile, taken from the stats file, with the last modification timestamp of the data on that tile as determined in step 1.

Provided that you have the files lastmodtile.out and stats.txt in your current directory, just run

 perl comparestats.pl > comparestats.out

The output is rather self-explanatory. Every tile falls into one of five groups:

  • spurious: Tiles which exist in tiles@home and seem non-empty (this is guessed from the PNG file size) but for which there is no OSM data. Such tiles may be leftovers of deletions (i.e. generated at a time when OSM data was indeed available), but they may also represent cases where something rendered on a neighbouring tile spills over onto this tile.
  • obsolete: Tiles which exist in tiles@home, seem empty, and have no data in the OSM planet file; they could probably be deleted without harm.
  • current: Tiles which exist in tiles@home and are newer than the last change in the OSM data.
  • outdated: Tiles which exist in tiles@home but the corresponding OSM data has changed since the tile was created.
  • missing: Tiles which do not exist in tiles@home, but OSM data is present and they could meaningfully be rendered.

The script accepts the names of these five groups (spurious, obsolete, current, outdated, missing) on the command line; if one or more are specified, it outputs only the tile names of those tiles that fall into the specified categories. If no arguments are given, a complete list is generated that names the status of each tile.
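The five groups described above amount to a small decision table. The sketch below is an illustration only and is not taken from comparestats.pl: the function name and argument shapes are hypothetical, the "seems empty" guess is passed in as a ready-made boolean, and treating equal timestamps as outdated is an assumption.

```python
def classify(tile_time, data_time, seems_empty=False):
    """Classify one level-12 tile.

    tile_time:   epoch time the tile was last rendered, or None if no tile exists
    data_time:   epoch time the OSM data on the tile last changed, or None if no data
    seems_empty: size-based guess that the PNG is empty (assumed to be precomputed)
    """
    if tile_time is None:
        # No tile: it is only of interest if there is data to render.
        return "missing" if data_time is not None else None
    if data_time is None:
        # Tile without data: the emptiness guess separates the two groups.
        return "obsolete" if seems_empty else "spurious"
    # Both exist: the tile must be strictly newer than the data (assumption).
    return "current" if tile_time > data_time else "outdated"


if __name__ == "__main__":
    print(classify(1167514001, 1158005727))   # tile newer than data
    print(classify(1158005727, 1167514001))   # data changed after rendering
    print(classify(None, 1167479172))         # data but no tile
    print(classify(1158005727, None, True))   # empty tile, no data
    print(classify(1158005727, None, False))  # non-empty tile, no data
```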

Note that a tile may be reported as "missing" although in reality the tile area contains only data that is not rendered by osmarender (for example, a few nodes or segments without any attributes). In that case, re-generating the tile would not lead to a tile being uploaded, and the tile would still be missing the next time round.

You should of course always check the comparestats.pl output but if you are reasonably confident, you could do something like

 perl comparestats.pl missing outdated | while read x y
 do
    # Quote the URL so the shell does not treat each "&" as a background operator.
    wget -O /dev/null "http://dev.openstreetmap.org/~ojw/NeedRender/?x=$x&y=$y&priority=2&src=foobar"
 done

to automatically enqueue requests for all tiles that the script thinks are either missing or outdated. Use caution: make sure you have used a current stats file, or you will request many more updates than necessary. Be aware that anything rendered between the creation of your stats file and the time you run your analysis cannot be taken into account by this mechanism. TODO: Use a fine-grained approach that, before requesting tile rendering, uses the individual status check from tiles@home/APIs to check the latest status of the tile in question.
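If the requests are generated from a script rather than from the shell, the request URL used in the loop above can be assembled with proper query-string encoding. This is a minimal sketch that only builds the URL and sends nothing; the parameter names x, y, priority, and src are the ones shown above, while the function name and its defaults are hypothetical.

```python
from urllib.parse import urlencode

# Endpoint as given on this page.
NEED_RENDER_URL = "http://dev.openstreetmap.org/~ojw/NeedRender/"


def need_render_url(x, y, priority=2, src="foobar"):
    """Build the render-request URL for one level-12 tile (does not send it)."""
    query = urlencode({"x": x, "y": y, "priority": priority, "src": src})
    return NEED_RENDER_URL + "?" + query


if __name__ == "__main__":
    # Each input line has the form "x y", as read by the shell loop above.
    for line in ["2377 1177", "2041 1358"]:
        x, y = line.split()
        print(need_render_url(int(x), int(y)))
```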

Step 3: Finding outdated low-zoom tiles

Since low-zoom tiles (levels 7 to 11) are generated from level-12 tiles and not directly from OSM data, a look at the stats file alone can tell us which tiles need to be re-generated. The algorithm is:

  1. From all existing level-12 tiles, compute the list of all level-7 tiles that should exist (divide the x and y of each level-12 tile by 32, discarding the remainder).
  2. For each of the level-7 tiles that should exist, compare the last modification time of the newest level-12 tile contained with the modification time of the level-7 tile (using 0 if the tile does not exist). Re-generate if level-12 has newer modification time.

The script checkl7.pl does this. It reads a stats file on standard input, takes the same optional arguments as comparestats.pl, and produces similar output.

The script ignores level-12 tiles that (judging from their size) seem to be empty.
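The two-step algorithm above, including the rule that empty-looking level-12 tiles are ignored, can be sketched as follows. This is not the code of checkl7.pl: the input is assumed to be already parsed into (zoom, x, y, mtime, seems_empty) tuples, which is a hypothetical record layout and not the actual stats file format.

```python
def level7_tiles_to_regenerate(tiles):
    """Return the sorted list of level-7 (x, y) tiles that need re-generation.

    tiles: iterable of (zoom, x, y, mtime, seems_empty) tuples (assumed layout).
    """
    newest_level12 = {}  # level-7 (x, y) -> newest mtime of a contained level-12 tile
    level7_mtime = {}    # level-7 (x, y) -> mtime of the existing level-7 tile

    for zoom, x, y, mtime, seems_empty in tiles:
        if zoom == 12:
            if seems_empty:
                continue  # empty-looking level-12 tiles are ignored
            key = (x // 32, y // 32)  # step 1: containing level-7 tile
            newest_level12[key] = max(newest_level12.get(key, 0), mtime)
        elif zoom == 7:
            level7_mtime[(x, y)] = mtime

    # Step 2: a missing level-7 tile counts as modification time 0.
    return sorted(
        key
        for key, newest in newest_level12.items()
        if newest > level7_mtime.get(key, 0)
    )


if __name__ == "__main__":
    sample = [
        (12, 2377, 1177, 200, False),
        (12, 2378, 1177, 100, False),
        (7, 74, 36, 150, False),     # older than its newest level-12 tile
        (12, 2041, 1358, 100, False),
        (7, 63, 42, 300, False),     # newer than its level-12 tile
        (12, 2105, 1356, 50, False), # its level-7 tile does not exist
        (12, 0, 0, 999, True),       # empty-looking, ignored
    ]
    print(level7_tiles_to_regenerate(sample))
```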