User:Rickm/Various implementation of history ideas

From OpenStreetMap Wiki
Jump to: navigation, search

The basic theory I use when making histories is to have a datatable where every object has a object_id(oid) and a version_id (vid). When a new object is created its version id is its object id. When the object is altered, a new tuple is created with a new version_id, but the object_id of the very first version, when I did it every tuple also recorded the VID of the objects latest version, however for the number of tuples we'll have in the database, I think a boolean of the tuple of 'latest' would be more efficent.
A CHANGE table is used to store which user changed the object, what the change was, and the time of that change.

So the schema for our OSM node table would be
node: version_id, object_id, lat, lng, latest, tags, deleted
node_change: change_id, object_id, old_version_id, new_version_id, change_type, user_id, timestamp

Changetype is a numeric identifer which is used in creating the History output. eg.
change_type 0 refers to a lat,lng change, so the History would output the old lat,lng and the new lat,lng
change_type 1 refers to a tag change, so the History output could print out the old tags and the new tags (and probably highlight the difference)
change type 2 refers to a deletion

An addition to this idea would be to store current nodes in a seperate table, until they are edited, at which point they get moved to an archive table (similar to what happens now). This would be worthwhile if searching the database for (object_id = node_id) && (latest = true) is less efficient than storing an index of the latest node ID for each object. In this case, we wouldn't need the latest tag. What we do is have another table NODE_TRACKER that keeps track of what the current latest version of a node is.
Node_tracker: object_id, latest_version_id
(This isn't fully thought out or tested yet, the stuff above this I've actually implemented in my own programs)