User:Frederik Ramm/Ideas for API 0.7
- 1 Multi-Layer Capability
- 2 Area Data Type
- 3 Relations with extended Roles
- 4 Symlinks and/or Diff storage on the database backend
- 5 Return Version Number with "Gone" Result
- 6 Allow "if-modified-since" requests
- 7 Extend "Capabilities" Response with Policy Information
Multi-Layer Capability
OSM doesn't have "layers" as such, but we still have many things that are thematically fairly disjoint. For example, someone who wants to edit the country's power grid will *not* want to download every single object in an area; he only wants the power lines and masts (and might use dimmed tiles in his editor as a background). The same goes for someone doing house number mapping - no need to bother him with tons of objects he is not going to touch anyway.
People might even want to map ancient Roman roads - but someone editing today's city map does not want to be disturbed by them.
So, we need something like a tag filter to be applied when downloading stuff for a bbox. This is probably more difficult than it sounds at first because it will require the API to pass additional information like "this object is used by another object which you have not downloaded but trust me you cannot delete it and even moving it might upset things...".
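A minimal sketch of what such a filtered download might compute, assuming a simple in-memory model. The `Node` and `Way` classes, the `filtered_download` function and the notion of a "pinned" node are all hypothetical names made up for illustration; nothing like this exists in the API today.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    id: int
    tags: dict = field(default_factory=dict)


@dataclass
class Way:
    id: int
    node_ids: list
    tags: dict = field(default_factory=dict)


def filtered_download(nodes, ways, wanted_keys):
    """Return (ways, nodes, pinned_node_ids) for a tag-filtered bbox download.

    "Pinned" nodes are delivered to the client but are also used by ways
    that were filtered out, so the client must not delete them and should
    be careful about moving them.
    """
    def matches(tags):
        return any(key in tags for key in wanted_keys)

    kept_ways = [w for w in ways if matches(w.tags)]
    kept_way_ids = {w.id for w in kept_ways}

    # Deliver every node of a kept way, plus nodes that match on their own.
    needed = {nid for w in kept_ways for nid in w.node_ids}
    kept_nodes = [n for n in nodes if n.id in needed or matches(n.tags)]
    kept_node_ids = {n.id for n in kept_nodes}

    # Nodes referenced by ways the client did not receive.
    used_elsewhere = {
        nid for w in ways if w.id not in kept_way_ids for nid in w.node_ids
    }
    return kept_ways, kept_nodes, kept_node_ids & used_elsewhere
```

With a power line over nodes 1 and 2 and a residential road over nodes 2 and 3, a download filtered on the `power` key would return the power line and nodes 1 and 2, with node 2 flagged as pinned because the road still uses it.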
Area Data Type
They're not dead unless you've seen the corpse.
Proper area data types would replace our current "multipolygon" relations plus lots of simple areas that we currently map using closed ways.
The advantage (over closed ways) would be that even without semantics (i.e. not looking at the tags) you could know whether something is just a closed line (e.g. a roundabout) or an area.
I've moved some of the stuff that was here to The Future of Areas.
Here is a nice quote from the mailing list that could serve as an illustration for anyone thinking about the problem:
- I checked that the biggest lake polygon in the official Finnish mapping data has 287273 vertices, 5484 islands and the total length of this simple polygon is more than 5276 kilometers. It would be quite a beast as OSM multipolygon relation.
Moving to the Area Data Type
If you want any change in OSM, you not only have to describe (and possibly implement) what you want; you also have to think about a method of transition from the status quo to whatever it is that you would like to have.
So here's the (draft) plan:
- For every multipolygon and boundary relation that currently exists, create an area object with the same tags and describing the same area.
- If processing a boundary relation and that boundary relation is either part of some kind of hierarchy (French mappers are known to build such) or has extra roles that need to be preserved (e.g. admin_centre=<some node>), then preserve the boundary relation but drop all inner/outer way members from it and add the new area as one member in the role "area".
- (define what exactly roles to be preserved are)
- If processing a multipolygon relation, or if processing a boundary relation with other members than those listed above, just delete the members from the relation.
- If the area data type should be one that is based on nodes not ways, then the ways currently used by the multipolygon or boundary relation might become obsolete now. A way is obsolete if it carries no tags, or if all its tags are represented by the area object. E.g. a member way of a boundary relation that is only tagged as boundary=administrative, admin_level=x will be obsolete and should be deleted according to the special deletion procedure below. If the same way also has waterway=river, it needs to be preserved but the boundary tag may be removed. This is true only if the area data type is not based on ways.
- Delete the relation if it is not needed any more.
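The obsolescence rule for member ways can be sketched as below. This is only an illustration under the assumption that the area type is node-based; `remaining_way_tags` and `is_obsolete` are hypothetical helper names, and tags are modelled as plain dictionaries.

```python
def remaining_way_tags(way_tags, area_tags):
    """Tags a member way must keep after its area has been created.

    A tag is dropped from the way only if the area carries the same
    key with the same value.
    """
    return {k: v for k, v in way_tags.items() if area_tags.get(k) != v}


def is_obsolete(way_tags, area_tags):
    """A way is obsolete if it has no tags, or all of them are on the area."""
    return not remaining_way_tags(way_tags, area_tags)


area = {"boundary": "administrative", "admin_level": "8"}

# Pure boundary way: everything is represented by the area.
plain = {"boundary": "administrative", "admin_level": "8"}

# Boundary way that is also a river: it survives with its river tag only.
river = {"boundary": "administrative", "admin_level": "8", "waterway": "river"}
```

Here `is_obsolete(plain, area)` is true, while `remaining_way_tags(river, area)` leaves only `waterway=river`, so the river way is preserved and loses its boundary tags.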
For each multipolygon processed, open a new changeset with a suitable changeset comment. The changeset is an integral part of this procedure; it can be used to
- link the new area object to the now-deleted relation object for someone researching object history
- explain which parts of the original relation have been dropped and why
- collect source attributes from objects that have been merged into the new area object
- optionally contain a short human-readable object history ("converted from multipolygon relation originally created on <date> by <user> and since edited by <user>,<user>,<user>")
t.b.d.: how to treat closed ways when migrating?
- which parts of the above migration are manual, and which are automatic?
- What about editors and applications? Will there be some sort of emulation for a while, some kind of backwards compatibility or a time when the API supports both new and old style areas in parallel?
- user interface in editors?
- changes in processing (osm2pgsql etc)
- make statistics about which types of polygons we currently have - take into consideration that while there will be many small polygons, large polygons affect mappers more.
Relations with extended Roles
Perhaps we need a full set of tags for a relation member role (grammatically speaking, something like adverbs instead of adjectives). This would allow you to describe more precisely the way in which something is a member of something else.
A possible example is when relation membership is constrained in some way:
Say you have a bus stop, and two bus routes that serve the bus stop. One bus route serves it on Sundays only; the other serves it on weekdays only. Now where do you put that information? The bus stop is not a "Sunday only" bus stop - it is in use on all days. The bus routes #1 and #2 are not "Sunday only" bus routes - they operate throughout the week. It is just the membership of this bus stop in the route that is "Sunday only". You would like to somehow say: This object is a member of this relation, but qualified by the following tags: "Sunday only".
In pseudo OSM XML, instead of this:
<relation id="1">
  <tag k="name" v="Bus route 1" />
  <tag k="note" v="Stops at node #2 only on Sundays!" />
  <member type="node" ref="1" role="stop" />
  <member type="node" ref="2" role="stop" />
  <member type="node" ref="3" role="stop" />
</relation>
you want this:
<relation id="1">
  <tag k="name" v="Bus route 1" />
  <member type="node" ref="1">
    <tag k="role" v="stop" />
  </member>
  <member type="node" ref="2">
    <tag k="role" v="stop" />
    <tag k="operating_days" v="Sunday" />
  </member>
  <member type="node" ref="3">
    <tag k="role" v="stop" />
  </member>
</relation>
This is just one example; you might also want to add a "note" tag to a relation membership (not a note to the relation, and not a note to the object, but a note that explains why you made this object a member), or myriad other uses. Some people have abused the "role" property for this in the past, basically collapsing every bit of information that concerns the membership into that one role field, with some special character as a separator.
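To illustrate how a client might consume such per-member tags, here is a small sketch using Python's standard `xml.etree.ElementTree`. The XML is the pseudo format from this page, not anything the current API produces, and `member_tags` is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# Pseudo OSM XML with tags nested inside each <member> (proposed, not real).
PSEUDO_XML = """
<relation id="1">
  <tag k="name" v="Bus route 1" />
  <member type="node" ref="1"><tag k="role" v="stop" /></member>
  <member type="node" ref="2">
    <tag k="role" v="stop" />
    <tag k="operating_days" v="Sunday" />
  </member>
  <member type="node" ref="3"><tag k="role" v="stop" /></member>
</relation>
"""


def member_tags(relation_xml):
    """Return a list of (type, ref, tags) tuples, one per member, in order.

    A list is used rather than a dict because the same object may be a
    member of a relation more than once.
    """
    root = ET.fromstring(relation_xml)
    return [
        (
            m.get("type"),
            int(m.get("ref")),
            {t.get("k"): t.get("v") for t in m.findall("tag")},
        )
        for m in root.findall("member")
    ]


members = member_tags(PSEUDO_XML)
```

The second entry of `members` then carries both `role=stop` and `operating_days=Sunday`, while the relation's own `name` tag stays separate from the membership tags.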
Symlinks and/or Diff storage on the database backend
It might make sense to allow the Rails API to refer back to previous versions. For example, if an object is created, then modified, and the change then reverted, we have an identical version 1 and 3 (apart from timestamp, author, changeset). If the object in question is a 500-node way then this means a significant waste of storage. It would be cool if we could just store "version 3 is the same as version 1".
A more complex version of that is storing differences only: "version 3 has the same 500 nodes as version 2, but node 501 has been added at the end". The logical next step would be allowing real diff uploads, i.e. "dear API, please add this node to the end of way so-and-so but save me the hassle of uploading the whole way".
A lesser form of this idea is to at least make the API silently ignore update requests that would not actually change an object, returning the existing version number "n" (instead of "n+1").
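A rough sketch of the "symlink" idea and of silently ignoring no-op updates, assuming content-addressed storage keyed by a hash of the object's content. The `VersionStore` class is purely hypothetical and ignores per-version metadata such as timestamp, author and changeset, which would still have to be stored for every version.

```python
import hashlib
import json


class VersionStore:
    """Stores each distinct object content once; versions only reference it."""

    def __init__(self):
        self.payloads = {}  # digest -> content, stored once
        self.versions = []  # versions[i] holds the digest of version i + 1

    @staticmethod
    def _digest(content):
        canonical = json.dumps(content, sort_keys=True).encode()
        return hashlib.sha256(canonical).hexdigest()

    def update(self, content):
        """Store an update and return the resulting version number."""
        digest = self._digest(content)
        if self.versions and self.versions[-1] == digest:
            # Nothing would change: return the existing version n, not n + 1.
            return len(self.versions)
        # A revert to older content reuses the payload already stored.
        self.payloads.setdefault(digest, content)
        self.versions.append(digest)
        return len(self.versions)

    def get(self, version):
        return self.payloads[self.versions[version - 1]]
```

Creating a way, modifying it and then reverting it yields versions 1, 2 and 3 but only two stored payloads, and uploading version 3's content again returns 3 without creating a version 4.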
Return Version Number with "Gone" Result
When you try to retrieve an object that has been deleted, you only get a "Gone" message but not the version number in which the deletion occurred. If you want to retrieve the last version before deletion, you currently have to load the full history which, especially for complex relations, can be very large.
- Return version number with "Gone" so that people can access version n-1 easily.
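One way this could look, sketched as plain functions rather than real HTTP handling. The `X-Deleted-Version` header name and both function names are assumptions made up for this example; the proposal itself does not say how the version number should be transported.

```python
def read_object(history, obj_id):
    """Simulate a read request; returns (status, headers, body).

    `history` maps an object id to its list of versions, oldest first.
    Each version is a dict with at least "version" and "visible".
    """
    versions = history.get(obj_id)
    if not versions:
        return 404, {}, None
    latest = versions[-1]
    if latest["visible"]:
        return 200, {}, latest
    # Deleted: answer "Gone", but say in which version the deletion happened.
    return 410, {"X-Deleted-Version": str(latest["version"])}, None


def last_visible_version(headers):
    """Version number the client should request to see the pre-deletion state."""
    return int(headers["X-Deleted-Version"]) - 1
```

A client that receives the 410 could then ask for version n-1 directly with a single versioned read instead of downloading the whole history.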
Allow "if-modified-since" requests
Suppose I have a number of very large ways or relations and want to bring them up to date. Currently I'll have to retrieve them in full, unnecessarily straining API and bandwidth. It would be good to have a request that does "give me way #3456 but only if you have a higher version than 5 which is what I already have here".
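The server-side decision is simple enough to sketch. The function below is a hypothetical illustration that borrows the HTTP status 304 ("Not Modified") for the "you already have the latest version" case; how the known version would actually be passed in a request is left open here.

```python
def fetch_if_newer(server_objects, known_versions):
    """Return {id: (status, body)} for a batch of conditional reads.

    `server_objects` maps id -> current object dict with a "version" key.
    `known_versions` maps id -> the version the client already holds.
    """
    result = {}
    for obj_id, known in known_versions.items():
        current = server_objects[obj_id]
        if current["version"] > known:
            result[obj_id] = (200, current)  # send the full object
        else:
            result[obj_id] = (304, None)     # nothing newer, send no body
    return result
```

For way #3456 held locally at version 5, the client would only receive the full way if the server has version 6 or later.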
Extend "Capabilities" Response with Policy Information
As OSM is getting larger, we have a greater variety of clients talking to the API.
I propose that the response to the "capabilities" request be enhanced with a number of yet-to-be-defined policy requirements. Such policies could, for example, be
- the suppression of certain types of background imagery
- a limit to the number of new objects that may be created by anyone in a given session or time frame
- a limit to the number of certain requests that may be made in a given session or time frame
We would expect any software that uses the API to implement these restrictions, and we would ban software that does not comply from accessing the API. (We would not attempt to enforce these rules, we'd just expect software to play by the rules and call out those that don't.)
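As an illustration of how a well-behaved client might honour such policies, here is a sketch that parses a made-up capabilities document. The `<policy>`, `<imagery_blacklist>` and `<max_new_objects>` elements and their attributes are invented for this example, as are the function names and the example URL pattern.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical capabilities response; element names are assumptions.
CAPABILITIES = r"""
<osm>
  <policy>
    <imagery_blacklist regex=".*\.example\.com/.*" />
    <max_new_objects per_session="1000" />
  </policy>
</osm>
"""


def load_policy(xml_text):
    """Extract the policy section into a plain dictionary."""
    policy = ET.fromstring(xml_text).find("policy")
    return {
        "blacklist": [
            re.compile(e.get("regex"))
            for e in policy.findall("imagery_blacklist")
        ],
        "max_new_objects": int(
            policy.find("max_new_objects").get("per_session")
        ),
    }


def imagery_allowed(policy, url):
    """True if no blacklist pattern matches the whole imagery URL."""
    return not any(p.fullmatch(url) for p in policy["blacklist"])


def may_create(policy, created_so_far, count):
    """True if creating `count` more objects stays within the session limit."""
    return created_so_far + count <= policy["max_new_objects"]


policy = load_policy(CAPABILITIES)
```

An editor would call `imagery_allowed` before offering a background layer and `may_create` before uploading, refusing locally rather than waiting for the server to object.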