Duplicate nodes map

From OpenStreetMap Wiki
Duplicate nodes map, zoomed out to show the whole world

The Duplicate nodes map (http://matt.dev.openstreetmap.org/dupe_nodes/) shows the locations of all duplicate nodes, that is, places where two or more nodes have exactly the same coordinates.
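
As a rough illustration of that definition, duplicates can be found by grouping nodes on their exact position. This is only a minimal sketch and not how the map itself is built: the function name `find_duplicate_nodes`, the `(node_id, lat, lon)` tuple layout and the choice to compare at 7 decimal places are all assumptions made for the example.

```python
from collections import defaultdict

def find_duplicate_nodes(nodes):
    """Group node ids by position and return only the groups with more than one id.

    `nodes` is an iterable of (node_id, lat, lon) tuples (a hypothetical layout).
    """
    by_position = defaultdict(list)
    for node_id, lat, lon in nodes:
        # Compare at 7 decimal places, the precision assumed for this sketch.
        key = (round(lat * 10**7), round(lon * 10**7))
        by_position[key].append(node_id)
    return [sorted(ids) for ids in by_position.values() if len(ids) > 1]

nodes = [
    (1, 51.5000000, -0.1000000),
    (2, 51.5000000, -0.1000000),  # same position as node 1
    (3, 51.5000001, -0.1000000),  # slightly further north, so not a duplicate
]
print(find_duplicate_nodes(nodes))  # [[1, 2]]
```

Note that the check looks only at coordinates. It says nothing about whether the nodes ought to be merged.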

Duplicate nodes map (About page)

The map is updated every minute and covers duplicate nodes worldwide. This is made possible by some clever OWL-based trickery behind the scenes. Developed by User:Matt, it supersedes the more simplistic display at TIGER fixup/250 cities/duplicate nodes. Unlike that earlier display, which showed only duplicate nodes on highways, the new map shows all duplicate nodes. It is important to note that, despite the design of the map, not all of these are necessarily errors, and they should not all be "fixed" by merging nodes.

Dupe nodes leaderboard

The Dupe nodes leaderboard is an interesting by-product, showing which users have recently cleared or created the most duplicate nodes. This is a crude metric which can reflect unfairly on the users concerned. For example, the users who have cleared the most dupe nodes are mostly cheating by making Automated Edits. This is discouraged, so they're not really heroes. In fact they may well be creating more work for the rest of us: in addition to the cases listed below, the TIGER data sometimes has railway lines in the wrong place, so that they are drawn crossing highways which they do not actually cross. If an automated edit eliminates the dupe node by joining such a road and railway, it creates a new (worse) problem. Please use the dupe nodes map to direct your manual fixing of the data.

Meanwhile, the users who have created the most duplicate nodes could be regarded as villains, since large numbers of duplicate nodes may be a sign of a badly performed import. However, such a user may equally be someone who is undoing badly performed duplicate elimination.

Legitimate dupe nodes?

Is there ever a legitimate reason for having a duplicate node in the data? There are a few real-world things which some people have argued should be represented as duplicate nodes. In most cases there is a way of representing these things without using duplicate nodes, but nonetheless, if someone wants to argue that it's a good idea, then we need to stop and think about it. This is another reason for not auto-fixing all of them using a bot.

Reasons for possibly legitimate use of dupe nodes in the data:

  • Multi-tier roads such as those in Chicago. Dupe nodes could (should?) be avoided by staggering the nodes very slightly.
  • Likewise two-tier walkways, e.g. if you try to map walkways inside a shopping mall or train station. This kind of 3D multi-level mapping presents all kinds of problems for our simple data model. Again, dupe nodes could be avoided by staggering the nodes very slightly.
  • There was an import of radio mast towers somewhere, which for some reason represented them as dupe nodes.
  • It has been argued that boundaries which run along highways (common in the TIGER data) might better be expressed as a duplicated way with duplicate nodes - or at least that this is a better state of affairs than joining them. See talk page and mailing list thread.
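
The "staggering" mentioned for multi-tier roads and walkways can be sketched as shifting one of the two nodes by the smallest step that still gives it different coordinates. This is a sketch under assumptions only: the helper `stagger`, the step size of 1e-7 degrees (roughly a centimetre of latitude) and the example coordinates are illustrative, not an established convention.

```python
def stagger(lat, lon, steps=1):
    """Return a position shifted north by `steps` units of 1e-7 degrees.

    The step size is an assumption for this sketch.
    """
    scaled_lat = round(lat * 10**7) + steps
    return scaled_lat / 10**7, lon

# Arbitrary example position for the upper tier of a two-level road.
upper = (41.8868000, -87.6270000)
lower = stagger(*upper)

# The two tiers no longer share exactly the same coordinates.
print(upper != lower)  # True
```

In practice a mapper would simply nudge the node in an editor. The code only shows how small the offset needs to be.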

Fixing dupes

The obvious way to get rid of duplicate nodes is to merge them, thus joining the ways. However, as discussed above, this is not always ideal. A much slower fixup process would eventually get rid of dupes caused by the TIGER import, at least in those counties with imprecise data, and would solve more important problems than two nodes having the same coordinates.
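
To make the side effect of merging concrete, the following minimal sketch shows how replacing one duplicate node with the other joins two previously separate ways. The `merge_nodes` helper and the dictionary layout (way name mapped to a list of node ids) are hypothetical and chosen only for illustration. They are not taken from any OSM tool.

```python
def merge_nodes(ways, keep_id, drop_id):
    """Return a copy of `ways` with every reference to `drop_id` replaced by `keep_id`."""
    return {
        way_id: [keep_id if n == drop_id else n for n in node_ids]
        for way_id, node_ids in ways.items()
    }

# A road and a railway whose nodes 2 and 20 sit at the same coordinates.
ways = {
    "road": [1, 2, 3],
    "railway": [10, 20, 30],
}
merged = merge_nodes(ways, keep_id=2, drop_id=20)

# After the merge the two ways share node 2, so they are now connected.
shared = set(merged["road"]) & set(merged["railway"])
print(shared)  # {2}
```

If the road and railway do not really meet at that point, the merged data is now wrong in a way that is harder to spot than the original duplicate. This is why each case deserves a manual look.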