Mechanical Edits/Mateusz Konieczny - bot account/fixing malformed shop tags

From OpenStreetMap Wiki

Page content created as advised on Automated_Edits_code_of_conduct#Document and discuss your plans.

This edit will retag several malformed shop=* values.

Who

I, Mateusz Konieczny, using my bot account.

Contact

Message me via OSM. I will also respond to PMs sent to the bot account, though messaging my main account is preferable, as I will get notifications in OSM editors.

English and Polish are preferred; for other languages I need to use an automatic translator.

What

First part

Discussed at https://lists.openstreetmap.org/pipermail/talk/2023-February/088076.html

Replace the following kinds of shop values with an automated edit:

various ways of saying that we lack info about shop type

synonyms, or synonyms in this context (sometimes a product is tagged as the shop type)

singular to plural

translations

Second part

Discussed at https://lists.openstreetmap.org/pipermail/talk/2023-April/thread.html#88185

Why

Why is it useful? It helps newcomers avoid confusion and keeps such values from becoming established, without the drudgery that a manual cleanup would require. It also makes it easier to add missing shop=* values.

Why an automatic edit? I have a massive queue (thousands to tens of thousands) of automatically detectable issues that are not reported by mainstream validators and that need fixing, where the fix requires review or a fully manual cleanup.

There is no point in manual drudgery here, as these values are obviously fixable.

These values do NOT require manual review. If these cases turn out to be a useful signal of invalid editing, then I will keep reviewing nearby areas where the bot has edited.

Yes, a bot edit WILL cause objects to be edited. Nevertheless, map data quality will improve as a result.

Numbers

Large enough to make it useful to automate it.

How

State after a mechanical edit:

The changeset will be described and tagged with tags that mark it as automatic, provide a link to the discussion approving the edit, include a link promoting https://matkoniecz.github.io/OSM-wikipedia-tag-validator-reports/ etc.
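As a rough illustration of the changeset metadata described above, the sketch below assembles a tag dictionary for a bot changeset. The function name `build_changeset_tags` and the tag keys `bot`, `mechanical`, `discussion_url` and `osm_wiki_documentation_page` are assumptions chosen for illustration; the keys actually written by the bot may differ.

```python
def build_changeset_tags(comment, discussion_url, wiki_page_url):
    """Return tags for a changeset made by an automated edit.

    The key names are illustrative assumptions, not necessarily
    the ones that the bot really writes.
    """
    if not discussion_url:
        # An automated edit without a linked discussion should not be uploaded.
        raise ValueError("an automated edit needs a link to its discussion")
    return {
        "comment": comment,
        "bot": "yes",         # assumed marker of an automated edit
        "mechanical": "yes",  # assumed marker of a mechanical edit
        "discussion_url": discussion_url,
        "osm_wiki_documentation_page": wiki_page_url,
    }


example_tags = build_changeset_tags(
    "shop=toy to shop=toys",
    "https://lists.openstreetmap.org/pipermail/talk/2023-February/088076.html",
    "https://wiki.openstreetmap.org/wiki/Mechanical_Edits/Mateusz_Konieczny_-_bot_account/fixing_malformed_shop_tags",
)
```

Refusing to build tags without a discussion link keeps the sketch in line with the requirement that every automated edit points back to its approval thread.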

Discussion

Discussed on the talk mailing list at https://lists.openstreetmap.org/pipermail/talk/2023-February/088076.html and https://lists.openstreetmap.org/pipermail/talk/2023-April/thread.html#88185

Additionally, https://www.openstreetmap.org/changeset/136075427 resulted in the bot no longer changing shop=fixme into shop=yes

And also

were stopped. In addition, these changes made by the bot were reverted (at the time of writing the revert is ongoing; by the time you read this it has likely been completed).

Repetition

This is a recurring edit and may be made as soon as new matching elements appear. At the moment, triggering a new edit requires human intervention, so the exact schedule is not predictable and the bot may stop running at any moment.

This may change in the future. If the bot is abandoned and does not run, feel free to ping me. If I am unable to run it any more, feel free to use my code. Note that this may require going through the bot approval process again and that the code is under a specific license.

https://codeberg.org/matkoniecz/OpenStreetMap_cleanup_scripts/src/branch/master/recurrent_bot_edits may have a more up-to-date version of the code than what is listed on this page.

Source code

GPL 3.0 licensed

from osm_bot_abstraction_layer.generic_bot_migrate_values_within_key import fix_bad_values

def key():
    return "shop"

def replacements():
    return {
        # synonyms or synonyms in this context
        # sometimes product tagged as a shop type
        'pawnshop': 'pawnbroker',
        'bread': 'bakery',
        #'laundromat': 'laundry', https://lists.openstreetmap.org/pipermail/talk/2023-February/088086.html (laundromat implies self-service, floated idea to invent new tag)
        'flowers': 'florist',
        'meat': 'butcher',
        'glasses': 'optician',
        'hgv': 'truck',
        'liquor': 'alcohol',
        #'Bag_shop': 'bag', https://lists.openstreetmap.org/pipermail/talk/2023-February/088087.html - "is this a shop that sells bags, or "things by the bagfull"?"
        'empty': 'vacant',
        'travel_agent': 'travel_agency',
        'marijuana': 'cannabis', # https://wiki.openstreetmap.org/wiki/Proposed_features/shop%3Dmarijuana

        # singular to plural
        'toy': 'toys',

        # translations
        'opał': 'fuel',

        'stationary': 'stationery',
        'hardware_store': 'hardware', # Note: there are weird clusters of shop=hardware in some places, but that is a bit different story - I suspect some systematic mistake or bad mapping, unless there are African towns where 1/4 of all shops are really shop=hardware

        # various ways of saying that we lack info about shop type
        'user defined': 'yes',
        'user_defined': 'yes',
        'lack_of_info': 'yes',
        'other': 'yes',
        'unknown': 'yes',
        '*': 'yes',
        'Shop': 'yes',
        'shop': 'yes',
        'stuff': 'yes',
        'store': 'yes',

        'lamps': 'lighting',
        'lamp': 'lighting',
        'Lighting_Shop': 'lighting',

        'knife': 'knives', # also, maybe shop=knives may be better as shop=weapons
        'collectibles': 'collector', # asked https://www.openstreetmap.org/changeset/120116954 (top user)
        'unused': 'vacant',
        'vacancy': 'vacant', # https://www.openstreetmap.org/changeset/131533418

        'egg': 'eggs', # neither is documented but consistency in tagging would be nice
        'gun': 'firearms', # both undocumented but plural is clearly preferable - may change just to guns # See https://osmus.slack.com/archives/C2VJAJCS0/p1678976149914619 and https://wiki.openstreetmap.org/w/index.php?title=Tag:shop%253Dweapons - more research is needed on whether moving both to weapons is a good idea
        'nut': 'nuts', # yes, this has minimal use
        'textile': 'textiles', # both undocumented, but plural form of product is more common in general - maybe migrate both to fabric?

        # add/remove s as needed - can I do this for all shop values where such modification will change them to a searchable shop value present in iD presets, also without going through review like this one?
        'crafts': 'craft',
        'map': 'maps',
        'wig': 'wigs',
        'shoe': 'shoes',
        'tyre': 'tyres',
        'spice': 'spices',
        'sport': 'sports',
        'foods': 'food',
        'paints': 'paint',
        'door': 'doors',
        'health_foods': 'health_food', # this is not an endorsement of shop=health_food, just that this change is useful (the same goes for other changes)
        'locksmiths': 'locksmith',
        'bathroom_furnishings': 'bathroom_furnishing',

        # based on review of other low use values with extra s, these were not reviewed specifically
        'fireplaces': 'fireplace',
        'outdoors': 'outdoor',
        'tickets': 'ticket',
        'window_blinds': 'window_blind',
        'floorings': 'flooring',
        'beds': 'bed',
        'photos': 'photo',
        'curtains': 'curtain',
        'opticians': 'optician',
        'models': 'model',
        'pets': 'pet',
        'bags': 'bag',
        'fabrics': 'fabric',
        'computers': 'computer',

        'convinience': 'convenience', # in many cases I asked some or all mappers, for example here in https://www.openstreetmap.org/changeset/133091374
        'cosmetic': 'cosmetics', # https://www.openstreetmap.org/changeset/123802513 ( NESP_II_businesses_and_facilities_Import )
        'paint shop': 'paint', # https://www.openstreetmap.org/changeset/129733688
        'electronics_store': 'electronics', # https://www.openstreetmap.org/changeset/118986356
        'retail_furniture': 'furniture', # https://www.openstreetmap.org/changeset/96555990
        'convenience_store': 'convenience',
        'electronic': 'electronics',
        'Furniture store': 'furniture',
        'furniture_shop': 'furniture',
        'furniture_store': 'furniture',
        'swimming_pools': 'swimming_pool',
        'collectables': 'collector',
        'beauty_pets': 'pet_grooming',
        'pet_hairdresser': 'pet_grooming',        
        'pet_parlour': 'pet_grooming',
        'pet_beauty': 'pet_grooming',
        'icecream': 'ice_cream',
        'green_grocer': 'greengrocer',
        'General Shop': 'general',
        'food stuff': 'food',
        'car_dealership': 'car',
        'hair_dresser': 'hairdresser',
        'storage-rental': 'storage_rental',
        'repairs': 'repair',
        'telecom': 'telecommunication',
        'sexshop': 'erotic',
        'sex': 'erotic',
        'frames': 'frame',
        'optican': 'optician',
        'gas_shop': 'gas',
        'cars': 'car',
        'rentals': 'rental',
        'Kitchen': 'kitchen',
        'religious': 'religion',
        'pawn': 'pawnbroker',
        'closed': 'vacant',
        'nut_store': 'nuts',
        'estate agent': 'estate_agent',

        # with trailing space at the end - would it be fine to do it also with other known valid values (listed on Wiki or in iD presets as valid) if they appear, without a separate bot proposal?
        'shoes ': 'shoes',
        'fashion_accessories ': 'fashion_accessories',
        'health_food ': 'health_food',

        # would it be fine to do it also with other known valid values (snip '_shop', ' shop', ' store', '_store', '_products', ' products', etc at the end of shop value if it produces valid shop type, found in iD presets, without going through bot approval procedure?)
        'model_store': 'model',
        'farm_shop': 'farm',
        'farm_stand': 'farm',
        'convenience store': 'convenience',
        'mobile_phone_shop': 'mobile_phone',
        'gift_shop': 'gift',
        'fabric store': 'fabric',

        'horse': 'equestrian', # https://matrix.to/#/!CCRKncVOQamqtSzFBm:matrix.org/$1681241585132216XhYkO:matrix.org?via=matrix.org&via=mozilla.org&via=tchncs.de
        'haberdasher': 'haberdashery', # maybe all shop=haberdashery should be moved to shop=sewing?
    }

if __name__ == "__main__":
    fix_bad_values(
        editing_on_key = key(),
        replacement_dictionary = replacements(),
        cache_folder_filepath = '/media/mateusz/OSM_cache/osm_bot_cache',
        is_in_manual_mode=False,
        discussion_url="https://lists.openstreetmap.org/pipermail/talk/2023-February/088076.html and https://lists.openstreetmap.org/pipermail/talk/2023-April/thread.html#88185",
        osm_wiki_documentation_page='https://wiki.openstreetmap.org/wiki/Mechanical_Edits/Mateusz_Konieczny_-_bot_account/fixing_malformed_shop_tags',
        )
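`fix_bad_values` comes from osm_bot_abstraction_layer and its internals are not shown on this page. The sketch below only illustrates the core idea that the dictionary above implies: an exact-match lookup on a single key, leaving every other tag untouched. The names `retag` and `sample_replacements` are hypothetical and exist only for this illustration.

```python
def retag(tags, key, replacements):
    """Return an updated copy of tags, or None when no change is needed.

    Matching is exact: case, spaces and underscores all matter,
    so 'Shop' and 'shop' need separate dictionary entries.
    """
    value = tags.get(key)
    if value not in replacements:
        return None
    updated = dict(tags)  # copy, so the caller's dictionary is not modified
    updated[key] = replacements[value]
    return updated


# A small excerpt of the mapping listed above.
sample_replacements = {"bread": "bakery", "toy": "toys", "shoes ": "shoes"}

print(retag({"shop": "bread", "name": "Example"}, "shop", sample_replacements))
print(retag({"shop": "bakery"}, "shop", sample_replacements))
```

Returning None for untouched objects makes it easy to skip uploads for elements that already carry a valid value.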

Source code of a partial revert

GPL 3.0 licensed

import who_added_this_tag
from osm_bot_abstraction_layer.overpass_downloader import download_overpass_query
import osm_bot_abstraction_layer.osm_bot_abstraction_layer as osm_bot_abstraction_layer
import osm_bot_abstraction_layer.generic_bot_retagging as generic_bot_retagging
import thin_osm_api_wrapper
import os.path

config = {
    'key': 'shop',
    'value': 'snack', # may be None - in such case it looks for all values of key
    'area_identifier_key': None, # may be None, in such case search is worldwide, 'ISO3166-1' is a good identifier
    'area_identifier_value': 'PL', # ignored if area_identifier_key is None
    'list_all_edits': True,
    'list_all_edits_made_by_this_users': [], # ignored with list_all_edits set to True
}
#who_added_this_tag.statistics.process_case(config)

#filepath = 'output_a.osm'
#print(who_added_this_tag.statistics.download_object_list(config, filepath))
#print(who_added_this_tag.statistics.create_object_store_from_downloaded(filepath))

filepath = 'shop_yes_last_edited_by_my_bot.osm'
download_query = """
[out:xml][timeout:600];
// gather results
(
  // query part for: “user:"Mateusz Konieczny - bot account" and shop=yes”
  node(user:"Mateusz Konieczny - bot account")["shop"="yes"];
  way(user:"Mateusz Konieczny - bot account")["shop"="yes"];
  relation(user:"Mateusz Konieczny - bot account")["shop"="yes"];
);
// print results
out skel qt;
"""
if not os.path.isfile(filepath):
    download_overpass_query(download_query, filepath)
osm_object_store = who_added_this_tag.statistics.create_object_store_from_downloaded(filepath)
for entry in osm_object_store:
    print(entry)
    object_type = entry['type']
    object_id = entry['id']
    #object_type = 'node'
    #object_id = '6061734645'
    value_before_yes = None
    previous_shop_value = None
    is_last_shop_yes = False
    for history_revision in thin_osm_api_wrapper.api.history_json(object_type, object_id, user_agent='who_added_this_tag_script'):
        if 'tags' in history_revision:
            print(history_revision)
            if 'shop' not in history_revision['tags']:
                is_last_shop_yes = False
            elif history_revision['tags']['shop'] == "yes" and previous_shop_value != "yes" and history_revision['user'] == 'Mateusz Konieczny - bot account':
                is_last_shop_yes = True
                value_before_yes = previous_shop_value
            else:
                is_last_shop_yes = False
            if 'shop' in history_revision['tags']:
                previous_shop_value = history_revision['tags']['shop']
            else:
                previous_shop_value = None
        else:
            is_last_shop_yes = False
    print(is_last_shop_yes)
    print(value_before_yes)
    if is_last_shop_yes:
        if value_before_yes in ["fixme", "miscelanea", "bazaar", "miscellaneous", "Retail Shop", "true", "Generic shop", "commercial", "misc", "retails", "retailer", "Retail", "retail", "???", "generic", "local_shop", "Retails", "samoobsługowy"]:
            print("SHOULD BE REVERTED")
            is_in_manual_mode = False
            changeset_comment = "shop=yes to shop=" + value_before_yes + " [Reverting own bot, there was no consensus to run retagging to shop=yes, sorry for my misunderstanding] [though in my opinion shop=yes was better in this case, making easier for other mappers to spot places reuiring resurvey]"
            discussion_url = "https://www.openstreetmap.org/changeset/136075427"
            osm_wiki_documentation_page = "https://wiki.openstreetmap.org/wiki/Mechanical_Edits/Mateusz_Konieczny_-_bot_account/fixing_malformed_shop_tags"
            changeset = generic_bot_retagging.build_changeset(is_in_manual_mode, changeset_comment, discussion_url, osm_wiki_documentation_page)
            data = osm_bot_abstraction_layer.get_data(object_id, object_type)
            print(data)
            data["tag"]["shop"] =  value_before_yes
            osm_bot_abstraction_layer.update_element(changeset, object_type, data)
            changeset.ChangesetClose()
        elif value_before_yes in ["*", "shop", "Shop", "stuff", "store", "unknown", "other", "lack_of_info", "user defined", "user_defined", None, "yes"]:
            pass
        else:
            # str() guards against object_id being an integer
            print("https://www.openstreetmap.org/" + object_type + "/" + str(object_id))
            raise Exception(value_before_yes)
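The history scan in the script above can be isolated into a pure function, which makes its behaviour easier to check without network access. This is a sketch under the assumption that revisions arrive oldest first as dictionaries with optional `tags` and `user` entries, as the loop above treats them; the function name `value_replaced_by_bot` is hypothetical.

```python
BOT_USER = "Mateusz Konieczny - bot account"


def value_replaced_by_bot(history, bot_user=BOT_USER):
    """Return (is_last_shop_yes, value_before_yes) for an object's history.

    is_last_shop_yes is True only when the newest revision is the one
    in which the bot changed the shop value to 'yes'.
    """
    value_before_yes = None
    previous_shop_value = None
    is_last_shop_yes = False
    for revision in history:
        tags = revision.get("tags")
        if tags is None:
            # A revision without tags resets the flag and keeps the previous value.
            is_last_shop_yes = False
            continue
        shop = tags.get("shop")
        if shop == "yes" and previous_shop_value != "yes" and revision.get("user") == bot_user:
            is_last_shop_yes = True
            value_before_yes = previous_shop_value
        else:
            is_last_shop_yes = False
        previous_shop_value = shop
    return is_last_shop_yes, value_before_yes


example_history = [
    {"user": "some mapper", "tags": {"shop": "fixme"}},
    {"user": BOT_USER, "tags": {"shop": "yes"}},
]
print(value_replaced_by_bot(example_history))
```

Any later revision, even one that leaves shop=yes in place, clears the flag, so only objects whose newest revision is the bot's retagging are considered for the revert.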

Opt-out

Please write in the bot approval thread. Note that in case of an opt-out, exactly the same edit will be made manually for the objects where the opt-out was used.