Mechanical Edits/Mateusz Konieczny - bot account/remove links to temporary files hosted on westnordost.de

From OpenStreetMap Wiki

Page content created as advised on Automated_Edits_code_of_conduct#Document_and_discuss_your_plans.

This edit will remove useless image=*/url=*/website=*/... tags leading to links such as https://westnordost.de/p/31513.jpg

See also https://github.com/openstreetmap/iD/issues/9302 for the unsolved iD issue.

And https://josm.openstreetmap.de/ticket/22397 for the solved JOSM one.

Info for mappers

Note that images hosted on the westnordost.de server are photos that mappers took for OSM notes. These images are not kept permanently, so adding them to, for example, image=* is not useful. Upload your photos to Wikimedia Commons if you want to keep them. Note that uploading photos made by someone else needs a license assignment.

See also https://github.com/openstreetmap/iD/issues/9302 and https://josm.openstreetmap.de/ticket/22397

Note that the dedicated StreetComplete image server was set up after lut.im, which was used to store images, was shut down due to abuse of the service. framapic, which replaced it, was also shut down due to abuse, and one of the recommended framapic replacements (https://wtf.roflcopter.fr/maintenance.html ) has already been shut down for the same reason. We want to avoid dealing with any such issues.

See also

The OSM Wiki alone has thousands of images waiting for content review, replacement or deletion, and things are improving very slowly: at the current rate it will take at least decades to process just the already known problems.

Who

I, Mateusz Konieczny, using my bot account.

Contact

Message me via OSM. I will also respond to PMs sent to the bot account, though messaging my main account is preferable, as I will then get notifications in OSM editors.

English and Polish are preferred; for other languages I need to use an automatic translator.

What

This edit will remove useless image=*/url=*/website=*/... tags leading to links such as https://westnordost.de/p/31513.jpg

Why

StreetComplete allows attaching images to notes, which often makes it possible to confirm and fix these notes remotely.

This is quite useful.

Images are temporarily stored on a server operated by the StreetComplete author.

But some people mistakenly add them as the value of the image tag, unaware that they will be gone soon after the note is closed.

I propose to remove all image=https://westnordost.de/p/* values automatically, with a bot, along with the same links added as values of website, url and similar tags.

The edit would apply to all existing values and to new ones as they arrive.

Every single such link is dead or will be dead, and therefore adds no value.

Why an automatic edit? I have a massive queue (thousands to tens of thousands) of automatically detectable issues that are not reported by mainstream validators and that require fixes, where the fix requires review or complete manual cleanup.

There is no point in manual drudgery here, as the values are completely useless.

These values do NOT require manual review. If these cases turn out to be a useful signal of invalid editing, then I will keep reviewing nearby areas where the bot edited.

Yes, the bot edit WILL cause objects to be edited. Nevertheless, map data quality will improve as a result.

Numbers

Large enough to make it useful to automate it.

See https://overpass-turbo.eu/s/1nc9 for presence in image=*

How

state before a mechanical edit:

    board_type = history
    image = https://westnordost.de/p/31513.jpg
    information = board
    name = Kopiec Niepodległości im. Józefa Piłsudskiego na Sowińcu
    tourism = information

state after a mechanical edit:

    board_type = history
    information = board
    name = Kopiec Niepodległości im. Józefa Piłsudskiego na Sowińcu
    tourism = information
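
The before/after states above can be reproduced with a minimal sketch. It is an illustration only, not the bot's code: the function name `strip_temporary_link` and the constants are hypothetical, it covers only single-value tags, and it returns a copy rather than editing the tags in place.

```python
# Illustrative sketch only; the names below are hypothetical, not the bot's API.
TEMPORARY_PREFIX = "https://westnordost.de/p/"
CHECKED_KEYS = ("image", "url", "website")


def strip_temporary_link(tags):
    """Return a copy of tags without keys whose whole value is one temporary link."""
    cleaned = dict(tags)
    for key in CHECKED_KEYS:
        value = cleaned.get(key, "")
        # Only plain single values are handled; values with a separator are left alone.
        if value.startswith(TEMPORARY_PREFIX) and ";" not in value and " " not in value:
            del cleaned[key]
    return cleaned


before = {
    "board_type": "history",
    "image": "https://westnordost.de/p/31513.jpg",
    "information": "board",
    "name": "Kopiec Niepodległości im. Józefa Piłsudskiego na Sowińcu",
    "tourism": "information",
}
after = strip_temporary_link(before)
```

Here `after` holds the four remaining tags, while `before` is left unchanged.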

The changeset would be described and tagged with tags that mark it as automatic, provide a link to the discussion approving the edit, include a link promoting https://matkoniecz.github.io/OSM-wikipedia-tag-validator-reports/, etc.

Discussion

Discussed on the talk mailing list at https://lists.openstreetmap.org/pipermail/talk/2022-September/087786.html

Repetition

This is a recurring edit and may be made as soon as new matching elements appear. At this moment, triggering a new edit requires human intervention, so the exact schedule is not predictable and the bot may stop running at any moment.

This can change in the future. If the bot is abandoned and does not run, feel free to ping me. If I am unable to run it any more, feel free to use my code. Note that this may require going through the bot approval process again, and that the code is under a specific license.

https://codeberg.org/matkoniecz/OpenStreetMap_cleanup_scripts/src/branch/master/recurrent_bot_edits may have a more up-to-date version of the code than what is listed on this page.

Source code - version not removing still working links

GPL 3.0 licensed

from osm_bot_abstraction_layer.generic_bot_retagging import run_simple_retagging_task
import wikimedia_connection.wikimedia_connection as wikimedia_connection
import osm_handling_config.global_config as osm_handling_config

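def edit_element_simple(tags):
    # Used for the automatic pass: removes the tag only when its whole value
    # is a single westnordost.de link; multi-value cases are left untouched
    # here and handled by edit_element in manual mode.
    for tag in ['image', 'url', 'website']:
        if tag in tags:
            url = tags[tag]
            if " https://westnordost.de/p/" in url:
                # space-separated list of links
                return tags
            if ";" in url:
                # semicolon-separated list of values
                return tags
            if url.startswith("https://westnordost.de/p/"):
                tags.pop(tag, None)
                return tags
            print(url, "has mismatching url, somehow")
            return tags
    return tags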

def edit_element(tags):
    # Used for the manual-mode pass: also handles values listing several
    # links, removing the tag only if every listed link points to
    # westnordost.de.
    for tag in ['image', 'url', 'website']:
        if tag in tags:
            url = tags[tag]
            if " https://westnordost.de/p/" in url:
                # space-separated list of links
                for actual_url in url.split():
                    if not actual_url.startswith("https://westnordost.de/p/"):
                        print(url, "has multiple urls, tricky")
                        return tags
                tags.pop(tag, None)
                return tags
            if ";" in url:
                # semicolon-separated list of values
                for actual_url in url.split(";"):
                    if not actual_url.startswith("https://westnordost.de/p/"):
                        print(url, "has multiple urls, tricky")
                        return tags
                tags.pop(tag, None)
                return tags
            if url.startswith("https://westnordost.de/p/"):
                tags.pop(tag, None)
                return tags
            print(url, "has mismatching url, somehow")
            return tags
    return tags

def main():
    print("https://overpass-turbo.eu/s/1nlJ finds most present links - notify editors?")
    wikimedia_connection.set_cache_location(osm_handling_config.get_wikimedia_connection_cache_location())
    # one pass per tag key that may hold a westnordost.de link
    run_global_edit("image")
    run_global_edit("website")
    run_global_edit("url")


def run_global_edit(main_tag):
    query = """
[out:xml][timeout:2500];
(
  nwr['""" + main_tag + """'~"^https://westnordost.de/p/"];
);
out body;
>;
out skel qt;
"""
    run_edit(query)

def run_edit(query):
    # See also https://github.com/openstreetmap/iD/issues/9302 and https://josm.openstreetmap.de/ticket/22397
    # (impossible to fit into changeset comment)
    run_simple_retagging_task(
        max_count_of_elements_in_one_changeset=500,
        objects_to_consider_query=query,
        cache_folder_filepath='/media/mateusz/OSM_cache/osm_bot_cache',
        is_in_manual_mode=False,
        changeset_comment='remove links to temporary files hosted at westnordost.de (this images will be deleted after note is closed! Upload your photos to Wikimedia Commons if you want to keep them - note that uploading photos made by someone else needs a license assignment).',
        discussion_url='https://lists.openstreetmap.org/pipermail/talk/2022-September/087786.html',
        osm_wiki_documentation_page='https://wiki.openstreetmap.org/wiki/Mechanical_Edits/Mateusz_Konieczny_-_bot_account/remove_links_to_temporary_files_hosted_on_westnordost.de',
        edit_element_function=edit_element_simple,
    )
    run_simple_retagging_task(
        max_count_of_elements_in_one_changeset=500,
        objects_to_consider_query=query,
        cache_folder_filepath='/media/mateusz/OSM_cache/osm_bot_cache',
        is_in_manual_mode=True,
        changeset_comment='remove links to temporary files hosted at westnordost.de (this images will be deleted after note is closed! Upload your photos to Wikimedia Commons if you want to keep them - note that uploading photos made by someone else needs a license assignment).',
        discussion_url='https://lists.openstreetmap.org/pipermail/talk/2022-September/087786.html',
        osm_wiki_documentation_page='https://wiki.openstreetmap.org/wiki/Mechanical_Edits/Mateusz_Konieczny_-_bot_account/remove_links_to_temporary_files_hosted_on_westnordost.de',
        edit_element_function=edit_element,
    )

main()
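
The multi-value rule applied by edit_element above (remove the tag only if every listed link is a westnordost.de link) can be condensed into a standalone check. This is a sketch under assumptions, not the bot's code: the name `is_only_temporary_links` is hypothetical, and unlike the listing it splits on spaces and semicolons together, so it also tolerates whitespace around semicolons.

```python
# Illustrative sketch only: a condensed restatement of the multi-value rule.
# Function and constant names are hypothetical, not taken from the bot.
import re

TEMPORARY_PREFIX = "https://westnordost.de/p/"


def is_only_temporary_links(value):
    """True if every URL in a space- or semicolon-separated value is a temporary link."""
    # Split on any run of semicolons and whitespace, dropping empty pieces.
    parts = [part for part in re.split(r"[;\s]+", value) if part]
    return bool(parts) and all(part.startswith(TEMPORARY_PREFIX) for part in parts)


examples = [
    "https://westnordost.de/p/1.jpg",
    "https://westnordost.de/p/1.jpg;https://westnordost.de/p/2.jpg",
    "https://westnordost.de/p/1.jpg;https://example.com/photo.jpg",
]
results = [is_only_temporary_links(value) for value in examples]
```

A value mixing a temporary link with any other URL is reported as not removable, which matches the "has multiple urls, tricky" branch of the listing.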

Opt-out

Please write in the bot approval thread. Note that in case of an opt-out, exactly the same edit will be made manually for the objects covered by the opt-out.