History API and Database
Background
The entire history of OpenStreetMap spans several years now. Every day, a lot of new data is added, existing data modified, and stale or incorrect data removed. While the history of each object that ever existed in OSM lives on in the main database, there is no easy way to replicate the full history for an area. There is a full history planet dump that is made available at irregular intervals, but none of the existing OpenStreetMap data processing tools are suitable to process this file. Also, there is no suitable database schema for a full history OSM database.
Purposes
The general purpose of a full history database and API is to make it easy to answer questions of the type 'What did OpenStreetMap look like in area X on date Y'. This is useful in many ways:
- Visualizations: historical maps, animations
- Visualizations of changesets: objects before they are modified or deleted
- Visualization of objects deleted in the last x days
- Monitoring mapping in a local zone
- Monitoring modifications of specific tags
- Monitoring new users
- Tracing vandalism
- Insight into user and social dynamics
- Quality analysis
- Undelete changesets
- Undelete objects
- ...Add yours...
Questions for a History API
This is a never-complete list of possible questions one might have for a history API:
- Give me the data for bounding box X on date Y.
- Give me the data inside a Boundary relation.
- How many features of type X existed in OSM on date Y?
- Between which dates was user X adding features of type Y?
- Give me the list of users who have created fewer than 10 changesets.
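As an illustration of the first question above, here is a minimal sketch of how a snapshot for a bounding box on a given date could be derived from node versions. The `NodeVersion` record and the `snapshot` function are hypothetical names invented for this example; they assume only that each version in a full history dump carries an id, a version number, a timestamp, a visibility flag and coordinates.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class NodeVersion:
    """One version of a node (hypothetical record layout for this sketch)."""
    node_id: int
    version: int
    timestamp: datetime
    visible: bool                 # False means this version is a deletion
    lat: Optional[float] = None   # deleted versions carry no coordinates
    lon: Optional[float] = None


def snapshot(
    versions: Iterable[NodeVersion],
    bbox: Tuple[float, float, float, float],
    when: datetime,
) -> List[NodeVersion]:
    """Return the nodes that existed inside bbox at the moment `when`.

    bbox is (min_lon, min_lat, max_lon, max_lat).
    """
    min_lon, min_lat, max_lon, max_lat = bbox

    # Step 1: for every node, keep the highest version created at or before `when`.
    latest: Dict[int, NodeVersion] = {}
    for v in versions:
        if v.timestamp > when:
            continue
        current = latest.get(v.node_id)
        if current is None or v.version > current.version:
            latest[v.node_id] = v

    # Step 2: drop nodes that were deleted or lay outside the box at that time.
    result = [
        v for v in latest.values()
        if v.visible
        and v.lat is not None and v.lon is not None
        and min_lon <= v.lon <= max_lon
        and min_lat <= v.lat <= max_lat
    ]
    return sorted(result, key=lambda v: v.node_id)


def utc(year: int, month: int, day: int) -> datetime:
    return datetime(year, month, day, tzinfo=timezone.utc)


# Made-up example data: node 1 is deleted in 2011, node 2 moves away in 2012.
example = [
    NodeVersion(1, 1, utc(2010, 1, 1), True, 52.0, 5.0),
    NodeVersion(1, 2, utc(2011, 1, 1), False),
    NodeVersion(2, 1, utc(2010, 6, 1), True, 52.1, 5.1),
    NodeVersion(2, 2, utc(2012, 1, 1), True, 60.0, 10.0),
]
print([v.node_id for v in snapshot(example, (4.0, 51.0, 6.0, 53.0), utc(2010, 7, 1))])
```

The sketch covers nodes only. Ways and relations would additionally need their members resolved at the same date, which is left out here.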
Datasources
There are several data-sources that contain history information:
- The Full History Dump (see Planet.osm/full) - regional extracts are also available.
- The Hourly and Minutely replication diffs
- Osmosis History-Diff-Files
Goals
- To design a database schema for a full history OSM database
- To develop tools for generating and processing historical OSM data:
  - Generating full history dumps for arbitrary regions
  - Importing full history dumps into a full history database
  - Applying existing diffs to an existing full history database
- To design a full history OpenStreetMap API:
  - Retrieve full history for a bounding box
  - Retrieve full history for a boundary relation
  - Retrieve feature X on date Y
  - When was feature X deleted / created?
  - How long has feature X existed?
  - Which users were active in area X between dates Y and Z?
  - ...Your History API request could be here...
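The creation, deletion and lifetime questions in the list above can be answered from the ordered version list of a single object. The following is a minimal sketch under that assumption; `visible_intervals` and `total_lifetime` are hypothetical helper names, and the input is assumed to be a sequence of `(timestamp, visible)` pairs ordered by version number. An object can be deleted and later undeleted, so the result is a list of intervals rather than a single pair of dates.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Sequence, Tuple

# (start, end); end is None while the object still exists.
Interval = Tuple[datetime, Optional[datetime]]


def visible_intervals(history: Sequence[Tuple[datetime, bool]]) -> List[Interval]:
    """Turn an ordered version list into the periods during which the object existed."""
    intervals: List[Interval] = []
    start: Optional[datetime] = None
    for timestamp, visible in history:
        if visible and start is None:
            # First version, or an undelete: a new period of existence begins.
            start = timestamp
        elif not visible and start is not None:
            # A deletion closes the current period.
            intervals.append((start, timestamp))
            start = None
    if start is not None:
        intervals.append((start, None))
    return intervals


def total_lifetime(intervals: Sequence[Interval], now: datetime) -> timedelta:
    """Sum the length of all periods, counting open periods up to `now`."""
    total = timedelta()
    for start, end in intervals:
        total += (end if end is not None else now) - start
    return total


# Made-up example: created, edited, deleted, then undeleted.
history = [
    (datetime(2010, 1, 1), True),
    (datetime(2010, 3, 1), True),
    (datetime(2010, 6, 1), False),
    (datetime(2011, 1, 1), True),
]
periods = visible_intervals(history)
print(periods)
print(total_lifetime(periods, now=datetime(2011, 1, 11)))
```

The first interval's start is the creation date, and each closed interval's end is a deletion date.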
Relevant projects and pages
- A hack weekend in spring 2011 was dedicated to history.
- The full history planet description page.
- User MaZderMind describes the challenges involved as well as a stab at an algorithm to parse the full history data dump.
- There's also an experimental Osmosis plugin.
- A GitHub repo where user MaZderMind is working on a history API.
- A GitHub repo where Martijn van Exel started an OSM history parser in Python. Status is pre-alpha (it doesn't work yet).
- A GitHub repo with a Mac OS X (10.6+) desktop application by Martijn van Exel that retrieves OSM history for a bounding box.
- OSM History Viewer
- The History Browser
- Frederik Ramm operated a history service before Historic Planet was available.
- Jochen Topf extended his OSM framework Osmium so that it can, in theory, read history dumps.
- Talk at FOSSGIS 2011 by Peter Körner about this topic (in German), Video and Slides available.
- MapQuest is working on an OSM API implementation that would eventually support history.
- Full-History extracts
- OWL (OpenStreetMap Watch List) is drifting towards becoming something of a history server/API.
- OSMHistoryServer and NepalOSMHistory are a server/API and visualization client, respectively. Both tools work (together), and are in active development.
- The OSHDB is a high-performance data analysis framework for analysing full-history data.
- The ohsome API is a generic web API for querying statistics and data from the OSM history.
People
Who's involved with this topic?
- Martijn van Exel is interested in the user / data analysis opportunities a history API would offer, and has done things like the Brave Mappers project.
- Steven M. Ottens is helping Martijn with his research, setting up servers and visualizing results.
- Peter Körner is working on history support in Osmium and other tools, and publishes the history extracts.
- Paweł Paprota is working on OWL.
- Max von Hippel and Kathmandu Living Labs (KLL) built the OSMHistoryServer and NepalOSMHistory projects, and KLL continues to develop both.