Mechanical Edits/Mateusz Konieczny - bot account/remove objects that are not existing according to source of import that added them


Intro

Page content created as advised on Automated_Edits_code_of_conduct#Document_and_discuss_your_plans.

Thousands of objects were mistakenly imported into OSM from GNIS. The objects proposed for deletion were documented in the GNIS database as not existing at the time of the import, but were imported anyway.

The edit would remove many nonexisting objects that currently mislead users of OSM data and confuse mappers. There are many amenity=post_office, amenity=place_of_worship and other objects mapped in the USA that do not exist in reality. There are also thousands of objects that were retagged to hide them in the standard rendering (for example to abandoned:amenity=post_office), but these entries should also be deleted as unwanted and usually incorrect.

This automated edit intends to revert part of the GNIS import that added them and delete objects that never had any reason to appear in the OSM database in any form, at least according to GNIS data.

Some of them have been present for a decade or more, for example https://www.openstreetmap.org/node/357118918/history [1]

Examples of other objects that would be deleted: [2] [3] [4]

To avoid deleting objects that were not imported from GNIS, the following filters will apply:

  • Only objects created in specific changesets that were importing GNIS
  • Only objects whose name tag contains "(historical)" (this is how GNIS indicates nonexisting objects, see below for details)
  • Only objects whose gnis:feature_id and name tags were not changed between the import and 2019-03-10
  • Only objects that have gnis:feature_id and name tags, with the name tag containing "(historical)", at the time of the edit
  • Nodes that are now parts of ways or relations will be skipped; ways and relations (if any, as it seems that only nodes were imported) that are now parts of relations will also be skipped

All of these must apply, otherwise the item will not be deleted.

Who

I, Mateusz Konieczny, using my bot account.

Contact

Message me via OSM. I will also respond to PMs sent to the bot account, though messaging my main account is preferable, as I will get notifications in OSM editors.

English and Polish are preferable; for other languages I need to use an automatic translator.

What

Filters

To avoid deleting objects that were not imported from GNIS, the following filters will apply:

  • Only objects created in specific changesets that were importing GNIS
  • Only objects whose name tag contains "(historical)" (this is how GNIS indicates nonexisting objects, see below for details)
  • Only objects whose gnis:feature_id and name tags were not changed between the import and 2019-03-10
  • Only objects that have gnis:feature_id and name tags, with the name tag containing "(historical)", at the time of the edit
  • Nodes that are now parts of ways or relations will be skipped; ways and relations (if any, as it seems that only nodes were imported) that are now parts of relations will also be skipped

Deletion would be restricted to items imported in the following changesets with "(historical)" as part of name=*

and with gnis:feature_id=* and name=* not modified from the initial import up to 2019-03-10 (when I started processing OSM data), and that still have a gnis:feature_id=* tag and a name=* containing "(historical)" at the time of the edit.

These restrictions are there to avoid deleting nodes that were changed to represent different features. It is possible that in the future I will propose a separate mechanical edit that will process other groups of misimported objects.
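
The criteria above can be sketched as a single predicate. This is only an illustration and not the code that was run by the bot: the function names, the parameters and the way the historical tag states are passed in are all assumptions made for this sketch, and the changeset ID used in the example is a placeholder.

```python
HISTORICAL_MARKER = "(historical)"


def has_historical_gnis_tags(tags):
    """True if tags carry gnis:feature_id and a name containing "(historical)"."""
    return "gnis:feature_id" in tags and HISTORICAL_MARKER in tags.get("name", "")


def is_eligible_for_deletion(creation_changeset, import_changesets,
                             tags_at_import, tags_at_snapshot, tags_now,
                             is_member_of_other_object):
    """Apply every documented filter; all of them must hold.

    All parameter names are hypothetical. tags_at_snapshot stands for
    the tags as of 2019-03-10, tags_now for the tags at time of edit.
    """
    # Created in one of the changesets that imported GNIS data.
    if creation_changeset not in import_changesets:
        return False
    # Marked as nonexisting by GNIS already at import time.
    if not has_historical_gnis_tags(tags_at_import):
        return False
    # gnis:feature_id and name unchanged between import and 2019-03-10.
    for key in ("gnis:feature_id", "name"):
        if tags_at_import.get(key) != tags_at_snapshot.get(key):
            return False
    # Still carrying both tags, with the marker, at time of edit.
    if not has_historical_gnis_tags(tags_now):
        return False
    # Nodes in ways/relations (or ways/relations in relations) are skipped.
    if is_member_of_other_object:
        return False
    return True


# Placeholder data, not taken from the real import.
example_tags = {
    "gnis:feature_id": "0000001",
    "name": "Example Post Office (historical)",
    "amenity": "post_office",
}
example_result = is_eligible_for_deletion(
    1, {1}, example_tags, example_tags, example_tags, False)
```

Note that retagging alone (for example amenity=post_office changed to abandoned:amenity=post_office) does not make an object ineligible in this sketch, as only gnis:feature_id and name are compared.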

List of candidates

Note that edits made between now and the bot run may remove some elements from the list, so deletion of all of them is not guaranteed. For example, someone removing the name=* tag will cause an element to become ineligible (see the criteria in an earlier section).

The list contains 40 520 elements.

The list of objects for deletion was hosted offsite due to limitations of wiki page length: https://raw.githubusercontent.com/matkoniecz/objects_for_deletion/master/objects_for_deletion.txt (this repository is now deleted, after completing this edit, as was declared from the start)

https://github.com/matkoniecz/objects_for_deletion also had

  • all_objects_can_be_opened_in_JOSM.osm file that can be downloaded and opened in JOSM. It contains all affected nodes.
  • objects_for_deletion_overpass_query.txt - Overpass query to download all nodes
  • Oregon, Washington.osm - modified all_objects_can_be_opened_in_JOSM.osm file with data far away from Seattle deleted
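
For reviewers who prefer a script over JOSM, a file in the .osm XML format can also be inspected with the Python standard library. The following is a minimal sketch, assuming the usual .osm layout of node elements with tag children; the sample data is invented for illustration and does not come from the repository.

```python
import xml.etree.ElementTree as ET

# Invented sample in .osm XML layout; real files hold the actual candidates.
SAMPLE_OSM = """<osm version="0.6">
  <node id="1" lat="47.6" lon="-122.3" version="1">
    <tag k="name" v="Example Post Office (historical)"/>
    <tag k="gnis:feature_id" v="0000001"/>
  </node>
  <node id="2" lat="47.7" lon="-122.4" version="3">
    <tag k="name" v="Example Church"/>
    <tag k="gnis:feature_id" v="0000002"/>
  </node>
</osm>
"""


def historical_gnis_nodes(osm_xml_text):
    """Return (node id, name) pairs for nodes that have gnis:feature_id
    and a name containing "(historical)"."""
    root = ET.fromstring(osm_xml_text)
    found = []
    for node in root.iter("node"):
        # Collect the node's tags into a plain dictionary.
        tags = {tag.get("k"): tag.get("v") for tag in node.findall("tag")}
        if "gnis:feature_id" in tags and "(historical)" in tags.get("name", ""):
            found.append((int(node.get("id")), tags["name"]))
    return found


sample_matches = historical_gnis_nodes(SAMPLE_OSM)
```

For a file on disk, the text can be read with open(...).read() and passed to the same function.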

How do we know that nonexisting objects were imported?

The import dumped an unfiltered GNIS dataset into OSM. Many of the features have a "(historical)" suffix in their name. As GNIS documents, this means that such an object "no longer exists and is no longer visible on the landscape".

This means that, for example, https://www.openstreetmap.org/node/358230694 is a post office that was known to be gone at the time of the import, according to the database that was the source of the import.

What does "(historical)" following a feature name mean? "A feature with "(historical)" following the name no longer exists and is no longer visible on the landscape. Examples: a dried up lake, a destroyed building, a hill leveled by mining. The term makes no reference to the age, size, population, use, or any other aspect of the feature. A ghost town, for example, is not a historical feature if it is still visible. Valid features are never removed from the database, but become historical if they no longer exist. The GNIS is unique in containing such features, and additional data concerning historical features based on authenticated documentation are welcome."

Note that this FAQ is linked in an unusual way: go to https://geonames.usgs.gov/apex/f?p=gnispq and follow the link in "Stop! Do not bookmark or copy/paste this URL before reading FAQs." at the top

See also discussion in one of edits that mass added nonexisting objects - https://www.openstreetmap.org/changeset/747176

Why

Objects known to be gone should be deleted. They also should not have been imported in the first place, but it is too late for that.

Numbers

Up to 40 520 elements.

Source code

This is the source code of the bot itself. Note that it relies on .osm files containing the nodes for deletion.

Code is GNU GPLv3 licensed.

import osm_bot_abstraction_layer.generic_bot_retagging as generic_bot_retagging
import osm_bot_abstraction_layer.overpass_downloader as overpass_downloader
import osm_bot_abstraction_layer.osm_bot_abstraction_layer as osm_bot_abstraction_layer
import osm_bot_abstraction_layer.human_verification_mode as human_verification_mode
from osm_bot_abstraction_layer.split_into_packages import Package
from osm_iterator.osm_iterator import Data
import time
import osmapi

# Module-level accumulator filled by the callback below (reset in main()).
list_of_elements = []

def splitter_generator(is_element_editable_function):
    # Builds a callback that collects every element passing the filter.
    def splitter_generated(element):
        global list_of_elements
        if is_element_editable_function(element.get_tag_dictionary(), element.get_link()):
            list_of_elements.append(element)
    return splitter_generated # returns a callback function

def process_osm_elements_package(package, is_in_manual_mode, changeset_comment, discussion_url, osm_wiki_documentation_page, is_element_editable_function):
    for_deletion = []
    for element in package.list:
        data = get_and_show_object_for_deletion(element.get_link(), is_element_editable_function)
        if data is None:
            continue
        if generic_bot_retagging.is_edit_allowed(is_in_manual_mode):
            for_deletion.append((data,element))
        print()
        print()

    if for_deletion == []:
        return []

    failure_log = []
    api = generic_bot_retagging.build_changeset(is_in_manual_mode, changeset_comment, discussion_url, osm_wiki_documentation_page)
    for delete_target in for_deletion:
        data, element = delete_target
        #print("deleting", data, element)
        try:
            osm_bot_abstraction_layer.delete_element(api, element.element.tag, data)
        except osmapi.ApiError as e:
            print(e)
            failure_log.append(str(e))
    api.ChangesetClose()
    generic_bot_retagging.sleep_after_edit(is_in_manual_mode)
    return failure_log

def get_and_show_object_for_deletion(osm_link_to_object, is_element_editable_function):
    prerequisites = {}
    data = osm_bot_abstraction_layer.get_and_verify_data(osm_link_to_object, prerequisites)
    if data is None:
        return None
    human_verification_mode.smart_print_tag_dictionary(data['tag'])

    if not is_element_editable_function(data['tag'], "test before edit"):
        return None
    return data

def main():
    max_count_of_elements_in_one_changeset = 500
    file_with_nodes_for_deletion = 'objects_for_deletion_in_usa_edit.osm'
    is_in_manual_mode = False
    changeset_comment = "remove objects that are not existing according to source of import that added them"
    discussion_url = "https://lists.openstreetmap.org/pipermail/talk-us/2019-March/019336.html and https://osmus.slack.com/archives/C029HV951/p1553094326606300"
    osm_wiki_documentation_page = "https://wiki.openstreetmap.org/wiki/Mechanical_Edits/Mateusz_Konieczny_-_bot_account/remove_objects_that_are_not_existing_according_to_source_of_import_that_added_them"


    global list_of_elements
    list_of_elements = []

    osm = Data(file_with_nodes_for_deletion)
    osm.iterate_over_data(splitter_generator(is_element_editable))

    packages = Package.split_into_packages(list_of_elements, max_count_of_elements_in_one_changeset)
    if len(list_of_elements) == 0:
        print("no elements found, skipping!")
        return
    print(str(len(list_of_elements)) + " objects split into " + str(len(packages)) + " edits. Continue? [y/n]")
    if human_verification_mode.is_human_confirming() == False:
        return
    index = 0
    failure_log = []
    for package in packages:
        index += 1
        print(index, "of", len(packages))
        failure_log += process_osm_elements_package(package, is_in_manual_mode, changeset_comment, discussion_url, osm_wiki_documentation_page, is_element_editable)
        print()
        print()
        print(failure_log)
        print()

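def is_element_editable(tag_dictionary, identifier):
    # Only the name tag is checked here; the other documented filters
    # are not re-checked by this function.
    if "name" in tag_dictionary:
        return "(historical)" in tag_dictionary["name"]
    else:
        print(identifier, "has no name tag")
        return False

if __name__ == "__main__":
    main()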

How

  • Objects not existing according to the source of the import (filtered as documented above) will be deleted
  • Nodes that are now parts of ways or relations will be skipped
  • Each changeset contains a single element or a group of nearby elements, to avoid edits spanning large areas (this is impossible in cases where the edited object itself spans a very large area)
  • Edits will be made gradually, likely over a long time due to the massive number of elements
  • Edits will not be blocked or interrupted by objects deleted after a local resurvey; the USA community does not need to edit any differently because of this edit
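
One possible way to keep each changeset geographically compact is sketched below. It is not how the Package.split_into_packages helper used by the bot works (its internals are not described on this page). The grid-based grouping, the function name and the cell size are assumptions chosen for illustration; only the limit of 500 elements per changeset is taken from the source code above.

```python
from collections import defaultdict


def group_into_changesets(elements, cell_size_degrees=1.0, max_per_changeset=500):
    """Group (element_id, lat, lon) tuples into lists of nearby element IDs.

    Elements are bucketed into grid cells of cell_size_degrees, and each
    bucket is then cut into chunks of at most max_per_changeset elements.
    """
    cells = defaultdict(list)
    for element_id, lat, lon in elements:
        # Floor division places each element in exactly one grid cell.
        cell = (int(lat // cell_size_degrees), int(lon // cell_size_degrees))
        cells[cell].append(element_id)

    groups = []
    for cell in sorted(cells):
        ids = cells[cell]
        for start in range(0, len(ids), max_per_changeset):
            groups.append(ids[start:start + max_per_changeset])
    return groups


# Invented coordinates: two nodes near each other, one further south.
sample_elements = [(1, 47.6, -122.3), (2, 47.7, -122.4), (3, 45.5, -122.6)]
sample_groups = group_into_changesets(sample_elements)
```

A fixed grid can still separate two close elements that lie on opposite sides of a cell boundary, so this only bounds the area of each changeset and does not find the tightest possible groups.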

Discussion

Posted on talk-us at https://lists.openstreetmap.org/pipermail/talk-us/2019-March/019336.html

Posted in the OSM US Slack at https://osmus.slack.com/archives/C029HV951/p1553094326606300 (as advised at https://github.com/osmlab/osm-community-index/pull/219#issuecomment-467427863 )

Repetition

Not repeated. Potential further removals of misimported objects will be proposed as separate edits.

Opt-out

Please write in the discussion thread on the talk-us mailing list. To verify your account, please also send me a PM via the OSM messaging system.