Mechanical Edits/ramseraph bot


Who

I, User:RamSeraph, using my bot account.

Contact

I will be watching this page and the corresponding talk page. You can also send a message on the OSM India Telegram group, where I am currently active as Sreeram K.

What

Fixing existing wikidata tags and adding missing wikidata tags on boundary relations for Indian administrative entities. Currently fixing: admin_level=6 boundary relations. Future plans: admin_level=9 boundary relations.

Why

Wikidata currently holds other government identifiers for Indian administrative entities down to the subdistrict level. Linking OSM relations to Wikidata will therefore help validate the data in OSM and make it easier to maintain, in addition to the more obvious advantages of Wikidata linking. Automation was chosen because the number of entities to fix is too large to handle manually.

Numbers

Current fixing plan:

The admin_level=6 count in OSM is 6473. Around 1000 of these already have links, some of which will be modified. Links will be added to most of the remaining ones where a good enough match is found. The official count according to the government is 7129, so the entries identified as missing may also need to be linked as and when they are created.

Future fixing plan:

The admin_level=9 count in OSM is 53306. Most of them are likely unlinked and need to be tagged. The official count is 664517; these will also be linked as and when they are created.

How

Candidate Wikidata entries are identified from their government identifiers, restricted to the expected administrative hierarchy. Names are then matched between Wikidata, government data and OSM using a modified version of the Levenshtein edit distance, tweaked to cope with the transliteration problems of Indian names. Entities that do not get a good match will be checked manually and added to the automatically generated match list. The whole match list will then be used by an automatic updater.
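The exact modification to the edit distance is not described here. As a rough illustration only, the sketch below shows one way such a tweak could look: a weighted Levenshtein distance in which substitutions between letters that are commonly interchanged in romanised Indian names cost less, and inserting or dropping an "h" is cheap. The letter groups, the costs and the function names are assumptions made for this example, not the bot's actual parameters.

```python
# Hypothetical groups of letters often swapped when Indian names are romanised.
SIMILAR_GROUPS = ("aeiou", "vw", "sz", "ckq", "iy")
# Hypothetical letters that are cheap to insert or drop ("Thane" vs "Tane").
CHEAP_INDEL = "h"


def _sub_cost(a, b):
    """Cost of replacing letter a with letter b."""
    if a == b:
        return 0.0
    if any(a in group and b in group for group in SIMILAR_GROUPS):
        return 0.5
    return 1.0


def _indel_cost(ch):
    """Cost of inserting or deleting a single letter."""
    return 0.5 if ch in CHEAP_INDEL else 1.0


def translit_distance(s, t):
    """Weighted Levenshtein distance between two names, case-insensitive."""
    s, t = s.lower(), t.lower()
    # prev[j] = cost of turning the empty prefix of s into t[:j]
    prev = [0.0]
    for ch in t:
        prev.append(prev[-1] + _indel_cost(ch))
    for a in s:
        cur = [prev[0] + _indel_cost(a)]
        for j, b in enumerate(t, start=1):
            cur.append(min(
                prev[j] + _indel_cost(a),      # delete a from s
                cur[j - 1] + _indel_cost(b),   # insert b into s
                prev[j - 1] + _sub_cost(a, b)  # substitute a with b
            ))
        prev = cur
    return prev[-1]


def similarity(s, t):
    """Normalise the distance to a score between 0 and 1 (1 = identical)."""
    longest = max(len(s), len(t))
    if longest == 0:
        return 1.0
    return 1.0 - translit_distance(s, t) / longest
```

With these assumed weights, spelling variants such as "Vardha" and "Wardha" end up closer together than under the plain edit distance, so a threshold on `similarity` could separate automatic matches from those left for manual review.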


Each changeset will be described and tagged so that it is marked as automated and links to the discussion approving the edit. It would include at least:


Changesets for admin_level=6 may be split by state (count: 36) or by district (count: 785), depending on the number of edits, with the intention of keeping each changeset geographically co-located and of a reasonable size.
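The splitting rule above could be sketched as follows. This is only an illustration: the `state` and `district` keys on each edit, the `max_size` threshold and the function name are hypothetical, and the real program may choose batch boundaries differently.

```python
from collections import defaultdict


def split_into_changesets(edits, max_size=500):
    """Group edits per state, falling back to per-district batches
    when a state has more than max_size edits.

    Each edit is a dict with hypothetical keys "state" and "district".
    Returns a list of (state, district_or_None, edits) tuples.
    """
    by_state = defaultdict(list)
    for edit in edits:
        by_state[edit["state"]].append(edit)

    batches = []
    for state in sorted(by_state):
        state_edits = by_state[state]
        if len(state_edits) <= max_size:
            # Small enough: one changeset for the whole state.
            batches.append((state, None, state_edits))
            continue
        # Too many edits: one changeset per district instead.
        by_district = defaultdict(list)
        for edit in state_edits:
            by_district[edit["district"]].append(edit)
        for district in sorted(by_district):
            batches.append((state, district, by_district[district]))
    return batches
```

Each returned batch would then correspond to one changeset, which keeps the edits in a changeset within a single state or district.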

Edits will be generated by an osmapi-based program.
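A minimal sketch of what the update step might look like is given below. It is not the bot's actual code. The `api` argument is assumed to be an `osmapi.OsmApi` instance with a changeset already open; `RelationGet` and `RelationUpdate` are the CamelCase method names from osmapi as I understand them, so check them against the installed version. The helper names and the shape of `matches` (relation ID mapped to a Wikidata QID) are hypothetical.

```python
def plan_wikidata_tag(tags, qid):
    """Decide what to do with a relation's tags for a matched QID.

    Returns (action, new_tags), where action is "added", "modified"
    or "unchanged". The input dict is never mutated.
    """
    current = tags.get("wikidata")
    if current == qid:
        return "unchanged", tags
    new_tags = dict(tags)
    new_tags["wikidata"] = qid
    return ("added" if current is None else "modified"), new_tags


def apply_matches(api, matches):
    """Apply a match list {relation_id: qid} through an osmapi-like object.

    Assumes `api` exposes RelationGet(id) and RelationUpdate(relation)
    and that a changeset is already open on it.
    """
    summary = {"added": 0, "modified": 0, "unchanged": 0}
    for relation_id, qid in matches.items():
        relation = api.RelationGet(relation_id)
        action, new_tags = plan_wikidata_tag(relation["tag"], qid)
        if action != "unchanged":
            relation["tag"] = new_tags
            api.RelationUpdate(relation)  # upload only real changes
        summary[action] += 1
    return summary
```

Keeping the tag decision in a pure function makes it possible to dry-run the whole match list and review the added/modified counts before anything is uploaded.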

Discussion

https://t.me/OSMIndia/27816

Repetition

Current plan: Edits will be run until all existing OSM admin_level=6 relations in India are linked.

Future plan: One more round will be done for admin_level=9 relations. Maintenance of entities added in the future might be automated; a new approval will be sought for this.

Opt-out

Please comment on the OSM India Telegram group or in this page's discussion section.