What's the problem with mechanical edits?

From OpenStreetMap Wiki

So you've stumbled across something tagged Building=house and you thought: Hey, that's silly, it won't appear on the map because building has a capital B and everything is case sensitive. You fire up taginfo and you find that there are 321 uses of that erroneously capitalized tag world-wide. Imagine that! 321 houses that aren't rendered because of a simple tagging error. You're going to fix that for us!

So, you manage to run an Overpass query from within taginfo, and you load the data into JOSM. Your palms are sweaty. You double-click the Building key (the value is listed as "different"), and you change it to building. You upload, and you feel great: Hey, I have just rescued 321 buildings from the dust pile!

After a while, you wonder what other misspellings might be waiting for a knight in shining armour and you're just about to fire up taginfo again when you get a nastygram from this guy who claims to be with the DWG (Data Working Group): "You're violating our Mechanical Edit Policy!" – But... you just wanted to help!

What have you done?

Large changesets

Firstly, you have very likely created a changeset that encompasses more or less the whole planet. If you do this a couple more times, your edit history will look like this:

Large-changesets.png

Anyone who clicks on the History tab in any small place will see your changeset in the local history because it spans the globe. Arguably this is a software problem with OSM but (spatially) large changesets can be a real pain.
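To see why a handful of scattered fixes ends up in everybody's local history, here is a minimal sketch. It assumes that a changeset's extent is simply the smallest box around everything it touched, and that a history view lists every changeset whose box overlaps the area being viewed. The coordinates and the names `changeset_bbox`, `intersects` and `town_view` are made up for illustration.

```python
from typing import Iterable, Tuple

# (min_lon, min_lat, max_lon, max_lat) in degrees
BBox = Tuple[float, float, float, float]


def changeset_bbox(points: Iterable[Tuple[float, float]]) -> BBox:
    """Smallest box containing every edited (lon, lat) point."""
    pts = list(points)
    lons = [lon for lon, _ in pts]
    lats = [lat for _, lat in pts]
    return (min(lons), min(lats), max(lons), max(lats))


def intersects(a: BBox, b: BBox) -> bool:
    """True if two boxes overlap (antimeridian wrap-around is ignored)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


# Hypothetical edit locations: roughly western Canada, Taiwan, San Salvador.
edited = [(-123.1, 49.3), (121.5, 25.0), (-89.2, 13.7)]
bbox = changeset_bbox(edited)

# A hypothetical small town that is nowhere near any of the three edits.
town_view = (2.30, 48.80, 2.40, 48.90)

print(bbox)
print(intersects(bbox, town_view))
```

Three edits are enough to produce a box that covers most of the northern hemisphere, so the town's history shows a changeset that never touched anything nearby.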

Situational unawareness

Secondly, you haven't looked at the 321 objects that you have changed. In one case, because the original building didn't show on the map, someone had simply drawn a second, correctly tagged, building on top and now that you've made the original visible, the map looks like this:

Duplicate-building.png

If you had zoomed to that area in your editor and loaded the data that was already there, of course you would have noticed. But since you have done a "mechanical edit", that is, an edit made without looking at the individual things you changed, you missed that.

Over in Canada, you "rescued" a building that now sits half in the water:

Building-in-water.png

Closer inspection reveals that the building is in the right place, just the coastline needs fixing – something that a person really looking at this building would have noticed. Elsewhere, you have reinstated buildings that sit across roads or rivers.

In another location, all four corner nodes of a building were tagged Building=yes and instead of removing this unnecessary tagging, you have helpfully "fixed" that to building=yes.
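Even a very crude per-object check would have flagged some of these cases for a human to look at. The following is only a sketch: it assumes elements shaped like Overpass JSON output (a `type`, an `id` and a `tags` dictionary), and the function name `review_notes`, the sample IDs and the wording of the notes are invented here. It does not replace opening each object in an editor with the surrounding data loaded.

```python
from typing import Dict, List


def review_notes(element: Dict) -> List[str]:
    """Return reasons why an object with a 'Building' key needs a human look."""
    tags = element.get("tags", {})
    if "Building" not in tags:
        return []  # nothing to rename on this object

    notes: List[str] = []
    if element.get("type") == "node":
        # May be a stray tag on a corner node; renaming would keep the mistake.
        notes.append("tag is on a node: check whether it should be removed")
    if "building" in tags:
        # Renaming the key would collide with the existing lower-case tag.
        notes.append("already has a 'building' tag: decide which value is right")
    if not notes:
        notes.append("load the surrounding data and look before changing")
    return notes


# Hypothetical sample data, not real OSM objects.
elements = [
    {"type": "way", "id": 1, "tags": {"Building": "house"}},
    {"type": "node", "id": 2, "tags": {"Building": "yes"}},
    {"type": "way", "id": 3, "tags": {"Building": "yes", "building": "garage"}},
]

for el in elements:
    for note in review_notes(el):
        print(f"{el['type']} {el['id']}: {note}")
```

Note that every object ends up with at least one note. That is deliberate: the point is to look at each one, not to sort them into "safe to bulk-change" and "unsafe".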

False impression of quality

Thirdly, even though you haven't inspected the objects that you changed, you are now listed as the person last editing that object. There might be a village somewhere that hasn't been edited for three years, and it used to look all faded out on specialist maps that highlight editing activity – now, thanks to your edit, the village looks bright and current on that map, even though mapping-wise it is still collecting the same dust as before.

(There are many bug checkers in OSM and it is quite possible that a neglected area somewhere lights up on the bug checker like a Christmas tree – usually this means that it is high time for a mapper to go there and overhaul the place. If you fix all the cosmetic bugs, then the place won't be any better mapped – but by fixing the most obvious bugs that attract attention, you may be patching over the other problems that lie below the surface.)

Lack of understanding

A few days later, a mapper from Taiwan messages you about something to do with your edit, but you don't understand what they want since it's written in Mandarin.

Potential for bad judgement

In a slightly more complex example, you might even have made a change that seemed obvious to you – e.g. name=Mac Donald to name=McDonald's – but it later turns out that your knowledge was insufficient to make this decision, and there are indeed some things where name=Mac Donald is totally correct.

Conclusion

Summing up, you had the best intentions but your edit did have quite a few negative consequences. Try to refrain from these types of edits; it is better to actually look at the area you're editing in, rather than just making mechanical corrections from afar.

Oh, and did we mention the monthly meeting invitation from the San Salvador community that you're now receiving because you're one of the recent editors in their local area?

What should you do instead?

Here's what you should do when you detect a potential problem:

Understand the cause(s)

Firstly, before fixing anything, try to understand what the cause was. Perhaps an inexperienced mapper edited some existing data and broke something that they didn't understand? You'll need to look at the mappers who have contributed to the problem, their relative experience, and what editors they are using. It is also possible that buggy data comes from an import or a badly-executed organised mapping activity, in which case further research might lead to the data being thrown out altogether.

Get a first-hand look at the problem

If you can, go and actually survey the area. No, really, do actually go there. That way you'll get a full 3D picture in your head of what's there and how it relates to the aerial imagery. It also enables you to recognise features from imagery better, so you can see what sort of surface a path has, and (with water features) tell man-made ones from natural rivers and streams (difficult from imagery, especially when they were made 200 years ago). Maybe the area is inaccessible to everyone, in which case anyone would have to work from imagery and other out-of-copyright sources, but if it is accessible to local mappers then they are the best people to fix any problem because they will be able to do a proper survey.

Communicate

You'll now have a picture of (a) what the original mapper had in mind when they mapped it, (b) what subsequent mappers were trying to do and (c) what you'd have mapped it as, if you had mapped it from scratch.

If these three all agree, and it was just a tagging error (for example, I've seen people add natural=foo instead of name=foo recently), then it makes sense to "just correct the data". However, it's quite likely that these three will disagree, and perhaps you need to explain to an earlier mapper how multipolygons work, or explain to someone who has come along and "corrected" data in the interim that the tag they changed it to is valid in OSM but doesn't actually match what's on the ground in this case.

The best way to communicate with a specific previous mapper is via a changeset discussion comment. The advantages of doing it this way are that the discussion is public, and the context is obvious, as it's visible with the changeset. Other local mappers can add comments there too – perhaps someone else locally has more knowledge about a particular water body. If that doesn't work, or you need to contact all local mappers, you can try adding an OSM note explaining the issue. This might not get picked up immediately, but notes sometimes do get fixed many months after they were added. Another option is to try to contact local users via a country's mailing list, forum or IRC channel. They may know someone who is local, or know someone who knows someone (not necessarily an OSMer) who may be able to answer questions.

You can't be serious!

Please don't think that "changing a tag to one that is valid within OSM" means "making the data correct" – it doesn't. All it means is that it is no longer possible to automatically find potentially problematic areas needing survey, or find mappers who may need help to map better. By analogy, if someone has described a "horse" as a "kow", correcting the spelling to "cow" does not make the description correct.

Please remember – OSM is about geography, not computer science. Many people come to OSM, see a problem, and immediately start writing code to fix it, when they've made few if any edits that are based on actual survey. It is strongly suggested to take a little time out to actually do some real survey-based mapping, and in addition spend a bit of time understanding the human causes of the sorts of problems that one is aiming to detect, and helping those people understand the resulting problems in the data. Don't just say "you did X wrong" – explain to them politely how and offer to help them get it right next time.

Some people argue that in many cases an automatic edit is the best solution to a problem, because it avoids tedious manual edits. In general, this argument is highly disputed in the community. The mandatory discussion before an automatic edit is intended to ensure that automation is used only in those cases where it really is the best solution.

Further reading

The Automated Edits code of conduct gives various steps which should be followed if you do want to make automated edits.