User:Mmd/Overpass API/install/attic

From OpenStreetMap Wiki

Populating the DB with attic data (0.7.50 and newer)

** This section is still incomplete / experimental. If you don't want to provide previous versions of OSM objects via [date:...], [diff:...] or [adiff:...], you can skip this section **

Since version 0.7.50, Overpass API can also store the full OSM object history ("attic data") in the Overpass DB. By specifying a date or a time range in a query, you can retrieve data as it was in the OSM database at that time, or find out what has been created, changed or deleted since then or within a given time frame.
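To illustrate what such queries look like, here is a minimal sketch that builds the settings prefix for a [date:...], [diff:...] or [adiff:...] query. The helper name attic_setting, the query body, the bounding box and the endpoint URL in the final comment are assumptions made for illustration only.

```shell
#!/bin/sh
# Build the settings prefix for an attic query.
# Usage: attic_setting date|diff|adiff <timestamp> [<second timestamp>]
# (attic_setting is a hypothetical helper, not part of Overpass API.)
attic_setting() {
    mode=$1
    from=$2
    to=${3:-}
    if [ -n "$to" ]; then
        # Time range: two timestamps separated by a comma
        printf '[%s:"%s","%s"]' "$mode" "$from" "$to"
    else
        # Single point in time
        printf '[%s:"%s"]' "$mode" "$from"
    fi
}

# Illustrative query body: pubs in an assumed bounding box around Mainz
BODY='node[amenity=pub](49.95,8.20,50.05,8.35);out meta;'

# State of the data as of 1 January 2013
QUERY="$(attic_setting date 2013-01-01T00:00:00Z);$BODY"
echo "$QUERY"

# To send it to your own instance (the URL is an assumption):
# curl -s --data-urlencode "data=$QUERY" http://localhost/api/interpreter
```

Passing two timestamps, for example attic_setting adiff 2013-01-01T00:00:00Z 2013-02-01T00:00:00Z, yields the time-range form.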

There are two ways to populate the database with attic data:

  1. Start with the first available planet file since the license change to ODbL and apply all diffs published thereafter.
  2. Use a full history file (experimental!)

Using a planet file and applying daily/hourly/minutely diffs

Steps (to be confirmed):

  1. Populate the DB using the first available ODbL planet file, which dates from September 2012.
  2. Apply minutely (or hourly, or daily) diffs using the --keep-attic parameter to retain the object history.

Using Full History Files (highly experimental)

*** Using Full History extracts to populate the Overpass API is discouraged at this time ***

Full history files are available either as a full planet file or as extracts for smaller regions (see Planet.osm/full).

The following example is intended as a starting point to demonstrate the overall process. It uses a rather small extract for the city of Mainz, which is provided by User:MaZderMind. Please refer to the Full History Extracts page for a recent .osh.pbf file of the city of Mainz. Depending on your available hardware, you could also choose a larger file for a dry run before processing the full history dump.

As full history dumps and extracts are usually provided in .osh.pbf format (full history file in Protocolbuffer Binary Format), we use osmconvert for an on-the-fly conversion to the .osc file format. Please refer to the osmconvert wiki page for compilation instructions. osmconvert needs to be installed in the tools directory ($TOOLS_DIR).

Using osmconvert (at least 0.7V required!)

$TOOLS_DIR/osmconvert mainz.osh.pbf --out-osc | $EXEC_DIR/bin/update_database --db-dir=$DB_DIR --keep-attic --flush-size=8

Caveat: When running update_database with --keep-attic, using the OSM Change format (.osc) is an absolute MUST. Feeding it the full history format (.osh) instead will generate NO warning message. However, the database will be completely unusable afterwards: it will contain invalid data, and performance will be worse by a factor of about 100.
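Because a wrong input format is not reported, it can help to check the converted XML before importing it. The following is a minimal sketch, assuming the osmconvert output has first been written to a file instead of being piped directly. It relies on the root element of an OSM Change file being osmChange. The function name is_osc and the sample file names are made up for this example.

```shell
#!/bin/sh
# Return 0 if the XML file looks like OSM Change (.osc), 1 otherwise.
# Only the first 4 KB are inspected, so large files are not read in full.
# (is_osc is a hypothetical helper, not part of Overpass API.)
is_osc() {
    head -c 4096 "$1" | grep -q '<osmChange'
}

# Self-contained demo with two tiny hand-written files
tmp=$(mktemp -d)
printf '<?xml version="1.0"?>\n<osmChange version="0.6">\n</osmChange>\n' > "$tmp/sample.osc"
printf '<?xml version="1.0"?>\n<osm version="0.6">\n</osm>\n' > "$tmp/sample.osh"

for f in "$tmp/sample.osc" "$tmp/sample.osh"; do
    if is_osc "$f"; then
        echo "${f##*/}: OSM Change format, safe to import"
    else
        echo "${f##*/}: not OSM Change format, do not import"
    fi
done
rm -rf "$tmp"
```

In a real run you would call is_osc on the converted file and abort before starting update_database if the check fails.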

update_database uses quite a large amount of main memory by default. If your update_database process gets killed, you can reduce the flush size at the expense of more frequent flushes to the database. The flush size is set via the --flush-size parameter. The default value of 16 requires more than 4 GB of memory, while --flush-size=2 keeps the update_database process down to around 1 GB, even for large full history extracts like Germany. For performance reasons, you should try to increase this value as much as your available memory permits.
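As a rough illustration of this trade-off, the sketch below maps available memory to a flush size using only the two figures mentioned above (16 for more than 4 GB, 2 for around 1 GB). The function name pick_flush_size and the exact thresholds are assumptions. Intermediate values such as 8 are not covered because no memory figures are given for them.

```shell
#!/bin/sh
# Pick a --flush-size from the two data points given in the text:
#   16 (default) needs more than 4 GB, 2 stays at around 1 GB.
# Usage: pick_flush_size <available memory in MB>
# (pick_flush_size is a hypothetical helper; thresholds are assumptions.)
pick_flush_size() {
    mem_mb=$1
    if [ "$mem_mb" -gt 4096 ]; then
        echo 16
    elif [ "$mem_mb" -ge 1024 ]; then
        echo 2
    else
        echo "less than 1 GB available, update_database is likely to be killed" >&2
        return 1
    fi
}

# Example: a machine with 2 GB of free memory
echo "--flush-size=$(pick_flush_size 2048)"
```

On Linux, the available memory in MB could be read from /proc/meminfo, for example with awk '/MemAvailable/ {print int($2/1024)}' /proc/meminfo.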

Open Issues

  • Performance: Running Overpass API queries on a full history extract DB results in a massive increase in runtime and memory requirements (4s => 40 minutes for the Germany extract, see filter_attic_elements). Using the .osh format will create an invalid/unusable DB, use .osc instead!
  • osmconvert 0.7U does not return ways with visible=false (see GitHub ticket, also reported on the osmconvert wiki discussion page). Fixed in osmconvert 0.7V.
  • update_database implicitly assumes that deleted nodes do not have any lat/lon values and does not yet evaluate the visible="false" flag. Unfortunately, osmium_convert outputs deleted nodes with lat="0.0000000" lon="0.0000000" along with visible="false". update_database treats these as proper nodes rather than deleted ones (see GitHub ticket). Obsolete when using .osc instead of .osh.
  • Both osmium_convert and osmconvert produce XML for relation id 240302 version 2 that cannot be handled by the Expat parser used in Overpass API (see GitHub ticket for osmium_convert). => Fixed, need to wait for the next available full history extract.