History API and Database
The entire history of OpenStreetMap spans about seven years now. Every day, a lot of new data is added, existing data modified, and stale or incorrect data removed. While the history of each object that ever existed in OSM lives on in the main database, there is no easy way to replicate the full history for an area. There is a full history planet dump that is made available at irregular intervals, but none of the existing OpenStreetMap data processing tools are suitable to process this file. Also, there is no suitable database schema for a full history OSM database.
The general purpose of a full history database and API is to make it easy to answer questions of the type 'What did OpenStreetMap look like in area X on date Y?'. This is useful in many ways:
- Visualizations: historical maps, animations
- Visualizations of changesets: objects before they are modified or deleted
- Visualization of objects deleted in the last x days
- Monitoring mapping in a local zone
- Monitoring modifications of specific tags
- Monitoring new users
- Tracing vandalism
- Insight into user and social dynamics
- Quality analysis
- Undelete changesets
- Undelete objects
- ...Add yours...
Questions for a History API
This is a never-complete list of possible questions one might have for a history API:
- Give me the data for bounding box X on date Y.
- Give me the data inside a Boundary relation.
- How many features of type X existed in OSM on date Y?
- Between which dates was user X adding features of type Y?
- Give me the list of users with fewer than 10 changesets created.
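The first question in this list ("data for bounding box X on date Y") can be illustrated with a minimal Python sketch. It assumes that every version of every node is already available in memory; the `NodeVersion` record and the `snapshot` function are hypothetical names made up for this illustration, not part of any existing OSM tool or API, and the sample data is invented. Ways and relations would additionally need their member nodes resolved at the same date.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class NodeVersion:
    """One version of a node (hypothetical record layout)."""
    node_id: int
    version: int
    timestamp: datetime
    visible: bool                  # False means this version is a deletion
    lat: Optional[float] = None
    lon: Optional[float] = None


def snapshot(versions: Iterable[NodeVersion],
             bbox: Tuple[float, float, float, float],
             when: datetime) -> Dict[int, NodeVersion]:
    """Return the nodes that were visible inside bbox at the time `when`."""
    min_lon, min_lat, max_lon, max_lat = bbox
    latest: Dict[int, NodeVersion] = {}
    for v in versions:
        if v.timestamp > when:
            continue               # this version did not exist yet
        current = latest.get(v.node_id)
        if current is None or v.version > current.version:
            latest[v.node_id] = v  # keep the newest version not later than `when`
    return {
        node_id: v
        for node_id, v in latest.items()
        if v.visible
        and v.lat is not None and v.lon is not None
        and min_lat <= v.lat <= max_lat
        and min_lon <= v.lon <= max_lon
    }


# Invented sample history for two nodes.
history = [
    NodeVersion(1, 1, datetime(2008, 1, 1), True, 52.0, 5.0),
    NodeVersion(1, 2, datetime(2010, 1, 1), True, 60.0, 5.0),  # moved out of the box
    NodeVersion(2, 1, datetime(2009, 1, 1), True, 52.1, 5.1),
    NodeVersion(2, 2, datetime(2010, 6, 1), False),            # deleted
]
bbox = (4.0, 51.0, 6.0, 53.0)      # (min_lon, min_lat, max_lon, max_lat)
mid_2009 = snapshot(history, bbox, datetime(2009, 6, 1))
early_2011 = snapshot(history, bbox, datetime(2011, 1, 1))
```

Scanning every version is only workable for small extracts; a real history database would presumably index versions by time and location instead.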
There are several data sources that contain history information:
- The Full History Dump (see Planet.osm/full) - regional extracts are also available.
- The Hourly and Minutely replication diffs
- Osmosis History-Diff-Files
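As a rough illustration of what reading such history data involves, here is a minimal Python sketch that groups node versions by ID from OSM XML in which the same object appears once per version. The `SAMPLE` document is invented and much simpler than a real dump, and `versions_by_node` is a hypothetical helper, not an existing tool. The sketch uses the standard library's streaming `iterparse` because history files are far too large to load as a whole.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict

# Invented sample: node 1 has two versions, the second one being a deletion.
SAMPLE = """<?xml version='1.0' encoding='UTF-8'?>
<osm version="0.6">
  <node id="1" version="1" timestamp="2008-01-01T00:00:00Z" visible="true" lat="52.0" lon="5.0"/>
  <node id="1" version="2" timestamp="2009-01-01T00:00:00Z" visible="false"/>
  <node id="2" version="1" timestamp="2009-06-01T00:00:00Z" visible="true" lat="52.1" lon="5.1"/>
</osm>
"""


def versions_by_node(stream):
    """Collect all versions of each node, keyed by node ID."""
    history = defaultdict(list)
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "node":
            history[int(elem.get("id"))].append({
                "version": int(elem.get("version")),
                "timestamp": elem.get("timestamp"),
                # Assumption: a missing attribute means the version is visible.
                "visible": elem.get("visible", "true") == "true",
                "lat": elem.get("lat"),   # may be None for deleted versions
                "lon": elem.get("lon"),
            })
            elem.clear()                  # drop attributes we no longer need
    return dict(history)


history = versions_by_node(io.BytesIO(SAMPLE.encode("utf-8")))
```

Keeping every version in a dictionary still does not scale to a full planet; the point is only to show that a history-aware reader must not assume one element per object ID.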
Goals
- To design a database schema for a full history OSM database
- To develop tools for generating and processing historical OSM data:
  - Generating full history dumps for arbitrary regions
  - Importing full history dumps into a full history database
  - Applying existing diffs to an existing full history database
- To design a full history OpenStreetMap API:
  - Retrieve full history for a bounding box
  - Retrieve full history for a boundary relation
  - Retrieve feature X on date Y
  - When was feature X deleted / created?
  - How long has feature X existed?
  - Which users were active in area X between dates Y and Z?
  - ...Your History API request could be here...
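Two of the requests above, "When was feature X deleted / created?" and "How long has feature X existed?", come down to the same computation over an object's version list. The following Python sketch assumes each version is reduced to a `(timestamp, visible)` pair and that timestamp order matches version order; `visible_intervals` and `total_lifetime` are hypothetical names chosen for this example, and the sample versions are invented.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Version = Tuple[datetime, bool]    # (timestamp, visible)


def visible_intervals(versions: List[Version],
                      now: datetime) -> List[Tuple[datetime, datetime]]:
    """Return the (created, deleted) intervals during which the object existed.

    An object can be deleted and later restored, so there may be several
    intervals. An object that still exists gets `now` as its interval end.
    """
    intervals = []
    start = None
    for timestamp, visible in sorted(versions, key=lambda v: v[0]):
        if visible and start is None:
            start = timestamp                  # created or undeleted
        elif not visible and start is not None:
            intervals.append((start, timestamp))  # deleted
            start = None
    if start is not None:
        intervals.append((start, now))
    return intervals


def total_lifetime(versions: List[Version], now: datetime) -> timedelta:
    """Sum the lengths of all intervals in which the object was visible."""
    return sum((end - begin for begin, end in visible_intervals(versions, now)),
               timedelta())


# Invented example: created, edited, deleted, then restored.
versions = [
    (datetime(2008, 1, 1), True),
    (datetime(2009, 1, 1), True),
    (datetime(2010, 1, 1), False),
    (datetime(2010, 7, 1), True),
]
now = datetime(2011, 1, 1)
intervals = visible_intervals(versions, now)
lifetime = total_lifetime(versions, now)
```

The first element of the first interval answers "when was it created", and the second element of any closed interval answers "when was it deleted".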
Following discussions at FOSSGIS 2011, we're planning a hack weekend in spring 2011 to get everyone interested in this topic together and start working on solutions. If you think you can contribute, sign up!
Relevant projects and pages
- The full history planet description page.
- User MaZderMind describes the challenges involved and takes a stab at an algorithm for parsing the full history data dump.
- There's also an experimental Osmosis plugin.
- A GitHub repo where User MaZderMind is working on a history API.
- A GitHub repo where Martijn van Exel started an OSM history parser in Python. Status is pre-alpha (it doesn't work yet).
- A GitHub repo with a Mac OS X (10.6+) desktop application by Martijn van Exel that retrieves OSM history for a bounding box.
- OSM History Viewer
- The History Browser
- Frederik Ramm operated a history service before Historic Planet was available.
- Jochen Topf extended his OSM framework osmium so that it could, in theory, read history dumps.
- Talk at FOSSGIS 2011 by Peter Körner about this topic (in German), Video and Slides available.
- MapQuest is working on an OSM API implementation that would eventually support history.
- Full-History extracts
- OWL (OpenStreetMap Watch List) is drifting towards becoming something of a history server/API.
Who's involved with this topic?
- Martijn van Exel is carrying out research that will involve historical analysis of OSM.
- Steven M. Ottens is helping Martijn with his research, setting up servers and visualizing results.
- Peter Körner is working on history support in osmium and other tools and publishes the history extracts.
- Paweł Paprota is working on OWL.