The Future of Areas/Fixing Multipolygons

From OpenStreetMap Wiki

If the OSM API does not introduce an area type that guarantees referential integrity, the following is an option to fix the shortcomings of multipolygons. The goal is to render and modify partially downloaded areas without invalidating the whole object.

For this, the foremost thing needed is a reliable way to determine, for each part, whether the 'inside' of the area lies to its left or right (see e.g. Sanderd17's proposal).

Relations are (at this time) vulnerable to referenced data being changed without the relation being notified (i.e. a relation may not be updated when a member changes or is deleted). Therefore it is desirable to have data structures that give good hints to detect and, where possible, repair invalid states, ideally with little or no extra data to be downloaded. Instead of re-stuffing the role field of multipolygon members to store

  • a specific ring association
  • whether the inside is on the left or right side of the member (as if it were part of an outer ring)
  • whether the member is part of an outer or inner ring (toggling left/right at decoding time if it is an inner part)

there is a better way to do it:

A relation of type=ring may contain only ways that can be joined into exactly one closed, non-self-intersecting way.

.. which may be extended to allow its members to be trees of way-concatenable relations which, once flattened, yield the same result. This extension would implement the concept of boundary segments and variant #2 of super multipolygons.

A relation of type=polyring may contain only other relations of type=ring.

.. which may be extended to allow its members to be trees of relations of type=polyring, with only relations of type=ring at the leaves. This extension would implement the concept of super areas.

Accepted role values for members are
 for type=ring relations: left or right.
 for type=polyring relations: outer or inner.
Both type=ring and type=polyring may be areas
(so that type=polyring relations with only a single member need not be maintained).
When a ring is defined or created, the role of each way member should be computed as
 left for ways running with the orientation of the whole ring
 right for ways running against the orientation of the whole ring
For this computation, any whole ring (type=ring) is oriented counter-clockwise around its inside.
The inside is the smaller of the two surfaces into which the ring divides the sphere.
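
The role computation described above can be sketched as follows. This is only an illustration under stated assumptions: the ways are passed in ring order as lists of node ids, the ring is small enough that a planar shoelace area in (x, y) coordinates stands in for the spherical "smaller surface" rule, and the names `signed_area` and `compute_roles` are hypothetical, not part of any OSM tool.

```python
def signed_area(points):
    """Shoelace formula: positive if the points run counter-clockwise."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def compute_roles(ways, coords):
    """Return 'left'/'right' for each way of a ring.

    ways   -- list of node-id lists, ASSUMED to be given in ring order
    coords -- node id -> (x, y), ASSUMED planar (small ring)
    """
    forward = []  # True where a way is traversed in its own direction
    ring = []     # node ids of the joined ring, in traversal order
    for i, way in enumerate(ways):
        if i == 0:
            # Orient the first way so that its end touches the next way.
            nxt = ways[1] if len(ways) > 1 else way
            fwd = way[-1] in (nxt[0], nxt[-1])
        elif way[0] == ring[-1]:
            fwd = True
        elif way[-1] == ring[-1]:
            fwd = False
        else:
            raise ValueError("way %d does not connect to the ring" % i)
        nodes = way if fwd else way[::-1]
        ring.extend(nodes if not ring else nodes[1:])
        forward.append(fwd)
    if ring[0] != ring[-1]:
        raise ValueError("ways do not form a closed ring")
    # If the traversal itself is counter-clockwise, forward ways run with
    # the ring orientation (left); otherwise the meaning is flipped.
    ccw = signed_area([coords[n] for n in ring[:-1]]) > 0
    return ["left" if fwd == ccw else "right" for fwd in forward]
```

For a unit square with nodes 1 to 4 in counter-clockwise order, the way 1-2-3 runs with the ring and the way 1-4-3 runs against it, so they would receive left and right respectively, regardless of which of the two is listed first.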

For any valid type=ring (standing by itself, not as a member of a polyring), it is asserted that

  • joining all members, reverse concatenating those having role right, results in exactly one closed, not self-intersecting way with the inside on its left side (i.e. oriented counter-clockwise around inside)
  • joining all members, reverse concatenating those having role left, results in exactly one closed, not self-intersecting way with the inside on its right side (i.e. oriented clockwise around inside)
  • all role information can be recomputed if all members are known, e.g. have been downloaded; this is asserted for invalid type=ring relations as well, see healing rings further down
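
The first assertion above can be illustrated with a short decoding sketch. It assumes members are given as (node ids, role) pairs in any order, and the function name `join_ring` is hypothetical. It checks only that no node is visited twice; a real geometric self-intersection test would need coordinates and is left out here.

```python
def join_ring(members):
    """Join ring members into one closed node list, counter-clockwise
    around the inside, by reversing every member that has role 'right'.

    members -- list of (node_ids, role) pairs, in any order
    """
    oriented = []
    for nodes, role in members:
        if role not in ("left", "right"):
            raise ValueError("unexpected role %r" % role)
        oriented.append(list(nodes) if role == "left" else list(reversed(nodes)))

    ring = oriented.pop(0)
    while oriented:
        # Find an unused member that starts where the ring currently ends.
        nxt = next((w for w in oriented if w[0] == ring[-1]), None)
        if nxt is None:
            raise ValueError("definition gap after node %r" % ring[-1])
        oriented.remove(nxt)
        ring.extend(nxt[1:])

    if len(ring) < 4 or ring[0] != ring[-1]:
        raise ValueError("members do not form a closed way")
    # Node-level check only: no node may be visited twice.
    if len(set(ring[:-1])) != len(ring) - 1:
        raise ValueError("ring touches or crosses itself at a node")
    return ring
```

Swapping the two role names in the comparison would produce the clockwise ring of the second assertion instead.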
The inside of a type=ring may by definition never be larger than half the sphere's surface.
The inside of a type=polyring could extend to the whole sphere surface minus the smallest surface definable by a type=ring.

To define an area larger than half of the sphere's surface, define a type=ring relation as usual and then add it as inner to a type=polyring.

Role inner for a member of type=polyring flips inside to outside for that member.
Role outer does nothing.
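
A minimal sketch of the role flip follows. The per-member rule is taken from the text above. How several members combine is not spelled out here, so the rule in `polyring_covers` (inside at least one outer member, if any exist, and inside the flipped side of every inner member) is purely an assumption for illustration. Both function names are hypothetical.

```python
def member_covers(point_in_ring, role):
    """Apply the polyring role to a ring's own inside/outside answer."""
    if role == "outer":
        return point_in_ring       # outer does nothing
    if role == "inner":
        return not point_in_ring   # inner flips inside to outside
    raise ValueError("unexpected role %r" % role)


def polyring_covers(hits):
    """hits -- list of (point_in_ring, role), one pair per member ring.

    ASSUMPTION (not defined by the proposal text): the point belongs to
    the area if it is covered by at least one outer member (when there
    are any) and by every inner member after flipping.
    """
    outers = [member_covers(hit, role) for hit, role in hits if role == "outer"]
    inners = [member_covers(hit, role) for hit, role in hits if role == "inner"]
    if len(outers) + len(inners) != len(hits):
        raise ValueError("unexpected role in polyring")
    return (any(outers) if outers else True) and all(inners)
```

With a single inner member, a point outside that ring counts as inside the polyring, which is how an area larger than half the sphere would be expressed.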

The biggest advantage of this, besides an uncluttered definition of areas, is that rings will have timestamps, so we can heal individual rings, without the need to download areas using that ring. Healing is done by comparing the timestamps of the downloaded members of a ring to the timestamp of the ring.

  • If all member timestamps are older than or equal to the ring's timestamp, the role information of the ring is considered valid.
  • If a member with a newer timestamp is found, that member might invalidate the ring (because the member might have been reversed, been part of a split operation, or have had start/end nodes removed, etc.). To check this and, if necessary, repair its role, (some) connected members must be retrieved (up to the closest member having a timestamp older than that of the ring). In the worst case these will be all members of the ring; in the best case a reliable member is already part of the partially downloaded dataset and dirty roles can be checked/repaired from there.
  • If a dirty member of a ring results in definition gaps (i.e. its start and/or end node is not found in any other member), manual fixing is required. If there is more than one gap, the inside cannot be determined confidently. With only one gap, the inside can be determined, but the geometry of the gap can only be replaced by a straight line, possibly leading to large errors. Most editors do not break relations on way split events, but rather fix them automatically (that is, if they have all relations referring to the way being split in the dataset), so definition gaps should rarely be produced.
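
The timestamp comparison in the list above might look roughly like this. The function name `classify_members` and the three status labels are hypothetical, and the dates in the usage example are made up for illustration.

```python
from datetime import datetime


def classify_members(ring_timestamp, member_refs, downloaded):
    """Classify each member of a ring by comparing timestamps.

    ring_timestamp -- timestamp of the type=ring relation
    member_refs    -- ids of all way members of the ring
    downloaded     -- way id -> timestamp, only for ways in the dataset
    """
    status = {}
    for ref in member_refs:
        if ref not in downloaded:
            status[ref] = "unknown"   # not part of the partial download
        elif downloaded[ref] <= ring_timestamp:
            status[ref] = "trusted"   # role information still valid
        else:
            status[ref] = "dirty"     # changed after the ring; role must be checked
    return status


# Hypothetical data: way 2 was edited after the ring was last saved.
ring_ts = datetime(2012, 5, 1)
downloaded = {1: datetime(2012, 4, 1), 2: datetime(2012, 6, 1)}
print(classify_members(ring_ts, [1, 2, 3, 4], downloaded))
```

A dirty member next to a trusted one can be checked from the downloaded data alone; a dirty member surrounded by unknown ones would require retrieving its neighbours first.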

If partial ring data is downloaded, edited and uploaded, the timestamp of the ring will be updated. That, however, should only be done after dirty, not-downloaded way members have been checked (and repaired if necessary). Otherwise these not-downloaded way members will be considered valid the next time the ring (or parts of it) is downloaded. To avoid lots of false hits, additional timestamps for way objects, or clever ways to diff the version histories of the objects, are open ends of research here.

One way to explore might be to simply append the current version of the ring to the role values of all members that were not downloaded in a partial download+edit session. The next time the ring is downloaded, we will know to check the timestamps of members with a versioned role value not against the latest timestamp of the ring, but against the timestamp of the ring version given by the role suffix. The downside of this is that we need to retrieve the version history of the ring (down to the oldest version demanded by a role suffix). The upside is that parts of the ring can be healed by different, individual edit sessions.
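
As a sketch of the versioned-role idea: the text does not fix a syntax for the suffix, so the form `left:7` (role, colon, ring version) used below is an assumption made only for this example, as are the function names and the sample history.

```python
from datetime import datetime


def split_role(role):
    """Split 'left:7' into ('left', 7); a plain 'left' gives ('left', None).

    The ':' separator is an ASSUMED syntax, not defined by the proposal.
    """
    base, sep, version = role.partition(":")
    return base, (int(version) if sep else None)


def reference_timestamp(role, ring_history, latest_version):
    """Pick the ring timestamp a member must be compared against.

    ring_history   -- ring version -> timestamp of that version
    latest_version -- current version number of the ring
    """
    _, version = split_role(role)
    return ring_history[latest_version if version is None else version]


# Hypothetical history of a ring with three versions.
history = {
    1: datetime(2012, 1, 1),
    2: datetime(2012, 3, 1),
    3: datetime(2012, 6, 1),
}
print(reference_timestamp("right:2", history, latest_version=3))
print(reference_timestamp("left", history, latest_version=3))
```

A member carrying a suffix is thus compared against the older ring version at which it was last known to be checked, while unsuffixed members are compared against the latest one.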

Example Object(s)

  <relation id='30'>
    <tag k='type' v='ring' />
    <member type='way' ref='1' role='left' />
    <member type='way' ref='2' role='left' />
    <member type='way' ref='3' role='left' />
    <member type='way' ref='4' role='right' />
  </relation>
  <relation id='20'>
    <tag k='type' v='ring' />
    <member type='way' ref='5' role='right' />
    <member type='way' ref='6' role='left' />
    <member type='way' ref='7' role='right' />
    <member type='way' ref='8' role='left' />
  </relation>
  <relation id='10'>
    <tag k='type' v='ring' />
    <tag k='landuse' v='basin' />
    <member type='way' ref='9' role='right' />
  </relation>
  <relation id='42'>
    <tag k='type' v='polyring' />
    <tag k='landuse' v='farmland' />
    <member type='relation' ref='10' role='inner' />
    <member type='relation' ref='20' role='outer' />
    <member type='relation' ref='30' role='outer' />
  </relation>
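
Objects like these could be checked against the accepted member types and role values with a few lines of Python. The sketch below uses the standard library's `xml.etree.ElementTree`; it assumes each relation element is properly closed and wrapped in a root `osm` element, covers only the basic rules (not the tree extensions), and the names `ALLOWED` and `role_errors` are hypothetical.

```python
import xml.etree.ElementTree as ET

# relation type -> (allowed member types, allowed roles), basic rules only
ALLOWED = {
    "ring": ({"way"}, {"left", "right"}),
    "polyring": ({"relation"}, {"outer", "inner"}),
}

SAMPLE = """
<osm>
  <relation id='10'>
    <tag k='type' v='ring' />
    <tag k='landuse' v='basin' />
    <member type='way' ref='9' role='right' />
  </relation>
  <relation id='42'>
    <tag k='type' v='polyring' />
    <tag k='landuse' v='farmland' />
    <member type='relation' ref='10' role='inner' />
  </relation>
</osm>
"""


def role_errors(osm_xml):
    """Return (relation id, member ref, role) for every offending member."""
    errors = []
    root = ET.fromstring(osm_xml)
    for rel in root.iter("relation"):
        rel_type = next(
            (t.get("v") for t in rel.findall("tag") if t.get("k") == "type"),
            None,
        )
        if rel_type not in ALLOWED:
            continue  # not a ring or polyring, nothing to check
        member_types, roles = ALLOWED[rel_type]
        for m in rel.findall("member"):
            if m.get("type") not in member_types or m.get("role") not in roles:
                errors.append((rel.get("id"), m.get("ref"), m.get("role")))
    return errors


print(role_errors(SAMPLE))
```

Checking that every member of a polyring is itself a relation of type=ring would additionally require looking up the referenced relations, which this sketch does not attempt.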

Similarities to other types

Is type=ring similar to type=route? Yes and no. The difference is that type=route relations may be open-ended (i.e. routes do not need to form a ring) and are bi-directional (so much so that the backward and forward roles were invented for the few route members breaking this rule). Tag-wise there is another argument against reusing type=route relations: they describe linear features, while type=ring would describe the simplest form of a distinct area.