Talk:Automated edits

From OpenStreetMap Wiki

Move -> Bots

I am proposing moving this to 'Bots' to match wikipedia:Bots terminology. I will do the move in a few days if I don't hear objections. PeterIto 19:46, 23 August 2009 (UTC)

"Bots" isn't exactly the same thing. "Automated Edits" is slightly broader. A bot is an agent which performs automated tasks via an interface primarily designed for humans. The term is widely used in wikipedia, where initially bots were being created which submit changes via the wiki edit form, so basically pretending to the wikipedia server that they are human editors, but performing automated edits. In fact these days wikipedia bots are normally editing via an API (maybe they are actually required to now).
In OpenStreetMap we've always had an API. People have created similar software agents, such as User:Frederik Ramm#Fixbot, and I would call these "bots" if they are intended to be run on a regular basis or over a long period of time, but there are other forms of "automated edits" which don't fit the word "bot" so well. e.g. a script which modifies or imports data as a one-off generation of a changes file to be applied via JOSM. -- Harry Wood 14:15, 19 September 2009 (UTC)
Agreed, the term bots in OSM should be restricted to software that 'regularly makes approved automated changes to the database over a long period of time'. Automated edits is a wider definition which includes changes made once or infrequently. PeterIto (talk) 09:34, 19 May 2015 (UTC)

Wish list discussions

  • A bot that marks all attractions that do not have a wheelchair tag yet with a FIXME=wheelchair tag.
I disagree with this request - missing tags can be identified easily by debug tools, editor software and human mappers, so there is no need to explicitly add FIXME. Btw, this section should be moved to a talk page. --Tordanik 14:33, 4 November 2009 (UTC)

  • Finland is a country with two official languages: Finnish and Swedish. This means that the streets in many municipalities have two names. So the following tag rules (as examples) and changes are needed (note our Scandic letters äåöÄÅÖ): --Kslotte 11:52, 29 January 2010 (UTC)
    • If the names differ (e.g. "name=Rautakatu" and "name:sv=Järngatan") but no name:fi tag exists, then fill "name:fi=Rautakatu"
    • If the names differ (e.g. "name=Bläsnäsvägen" and "name:fi=Bläsnäsintie") but no name:sv tag exists, then fill "name:sv=Bläsnäsvägen"
I oppose this proposal, as it will produce wrong results if the other language was only forgotten and exists. Lulu-Ann
Let's keep this on hold, until consensus is achieved. --Kslotte 11:52, 29 January 2010 (UTC)
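For reference while this is on hold, the two rules above could be sketched roughly as follows. This is only an illustration on a plain dictionary of tags; the function name `fill_missing_language_name` is made up for this example, and the sketch does nothing to address the objection that the other language may simply have been forgotten.

```python
def fill_missing_language_name(tags):
    """Return a copy of tags with name:fi or name:sv filled in from name.

    Hypothetical helper: it only mirrors the two example rules and assumes
    that a name differing from the one tagged language is in the other one.
    """
    result = dict(tags)
    name = result.get("name")
    if name is None:
        return result
    fi = result.get("name:fi")
    sv = result.get("name:sv")
    if sv is not None and fi is None and name != sv:
        # name differs from the Swedish name, so assume it is the Finnish one.
        result["name:fi"] = name
    elif fi is not None and sv is None and name != fi:
        # name differs from the Finnish name, so assume it is the Swedish one.
        result["name:sv"] = name
    return result
```

Elements where name equals the only language-specific name are left untouched, since the rule gives no way to tell which language is missing.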

  • A bot that checks whether the website tag has a correct, up-to-date web address. It could run through the list of all web links attached to places in OSM and see if there are 404 error messages. --Hawkeyes 13:48, 23 May 2010
Well that would be a rule to add to a Quality Assurance tool, rather than anything requiring automated edit. no? -- Harry Wood 09:01, 24 May 2010 (UTC)
What do you want to do with the results? Add a FIXME-tag? Lulu-Ann
Sorry, forgot about this post... yes, adding a FIXME tag and removing the dead link would work. But yes, this is something Quality Assurance tools could do too. I have added it to the Quality Assurance discussion page.
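Whichever tool ends up doing it, the check itself is small. A minimal sketch using only the Python standard library, assuming the elements are already available locally as (id, tags) pairs; the names `http_status` and `find_dead_links` are invented for this example, and it only reports candidates without editing anything:

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response was obtained."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 404.
        return error.code
    except (urllib.error.URLError, ValueError, OSError):
        # DNS failure, timeout, malformed URL, etc.: not a confirmed 404.
        return None


def find_dead_links(elements, fetch_status=http_status):
    """Return (element id, url) pairs whose website tag answers with HTTP 404."""
    dead = []
    for element_id, tags in elements:
        url = tags.get("website")
        if url and fetch_status(url) == 404:
            dead.append((element_id, url))
    return dead
```

Only a definite 404 is reported; timeouts and DNS errors are ignored because they are often temporary. Some servers reject HEAD requests, so a real tool would probably need a GET fallback and rate limiting.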

  • All states in the USA have Interstate Highways, U.S. Highways, and state highways. Creating relations for the first two is pretty close to being done, but the state highways are a daunting task. This could be automated because most (maybe all) of them have "State Highway ###" in their name as a tiger:name_base=*. Given one terminus, it shouldn't be difficult to follow a road and add all continuations that match a certain pattern to a relation. The relations would need to be cleaned up by hand, but this would remove a lot of the tedium. It also has uses for other imports than just TIGER. --BigPeteB 14:46, 13 July 2010 (UTC)
(I'd be willing to write this myself, but would need some guidance on how to access the data, and how to not slam the server. Once I have the data I know what to do ^_^ --BigPeteB 14:46, 13 July 2010 (UTC))
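The "follow a road from one terminus" idea could be sketched as below. This assumes the ways have already been loaded from a local extract (so the server is not hit at all) into a dictionary of the form `{way_id: {"nodes": [...], "tags": {...}}}` with non-empty node lists; that layout and the function name `collect_route_ways` are assumptions made for this example only.

```python
import re
from collections import deque


def collect_route_ways(ways, start_way_id, pattern):
    """Collect ids of ways connected to start_way_id through shared end nodes
    whose tiger:name_base fully matches pattern (hypothetical helper)."""
    matcher = re.compile(pattern)

    def matches(way):
        return bool(matcher.fullmatch(way["tags"].get("tiger:name_base", "")))

    # Index every matching way by its first and last node.
    by_end_node = {}
    for way_id, way in ways.items():
        if matches(way):
            for node in (way["nodes"][0], way["nodes"][-1]):
                by_end_node.setdefault(node, []).append(way_id)

    if start_way_id not in ways or not matches(ways[start_way_id]):
        return []

    # Breadth-first walk from the starting way across shared end nodes.
    found = [start_way_id]
    seen = {start_way_id}
    queue = deque([start_way_id])
    while queue:
        current = ways[queue.popleft()]
        for node in (current["nodes"][0], current["nodes"][-1]):
            for neighbour in by_end_node.get(node, []):
                if neighbour not in seen:
                    seen.add(neighbour)
                    found.append(neighbour)
                    queue.append(neighbour)
    return found
```

The returned list is only a set of candidate members. Gaps caused by differently named segments stop the walk, which is one reason the resulting relation would still need cleaning up by hand.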

  • There's been discussion about a bot that would search for addr:street=* tags, look for a matching street nearby, and add the node or building to a relation, probably removing the addr:street=* tag in the process. This could be done for exact matches, and then expanded to include fuzzy matches (e.g., lots of TIGER data still has abbreviations like "Pky" instead of "Parkway"). --BigPeteB 14:49, 13 July 2010 (UTC)
Oppose There is not yet a consensus in discussions about addr:street vs relation. There are 25k associatedStreet relations and 1.1m ways with addr:street, so it seems people go for the addr:street method. --Gorm 00:41, 26 October 2010 (BST)
Oppose Relations are worse than addr:street, because they are more complex to edit (and that's what really matters, data users can work with both formats). I'd more likely accept a bot that throws out the unnecessary associatedStreet relations than one that adds even more of them. --Tordanik 14:09, 26 October 2010 (BST)

  • To clean up old style tagging, I propose a bot that
foreach element with place_name
   if (!isset(name)) rename 'place_name' to 'name';
   if (place_name==name) delete 'place_name';
   if (place_name!=name and !isset(is_in)) rename 'place_name' to 'is_in';  //Controversial?
   foreach(language code)
      same as above (place_name:en -> name:en etc.).;
   if (!isset(place)) set place=locality;                                   //Even more controversial??

(I would be happy with just the two first tests) --Gorm 00:41, 26 October 2010 (BST)
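Restricted to the first two tests, the proposal could look roughly like this in Python. It is a sketch on a plain dictionary of tags; the function name `migrate_place_name` is made up here, and the two steps marked controversial in the pseudocode are deliberately left out.

```python
def migrate_place_name(tags):
    """Apply only the first two tests of the proposal to one element's tags."""
    result = dict(tags)
    # Covers place_name itself and language variants such as place_name:en.
    old_keys = [k for k in tags if k == "place_name" or k.startswith("place_name:")]
    for old_key in old_keys:
        new_key = "name" + old_key[len("place_name"):]
        if new_key not in result:
            # Test 1: no name yet, so rename place_name to name.
            result[new_key] = result.pop(old_key)
        elif result[new_key] == result[old_key]:
            # Test 2: both carry the same value, so drop place_name.
            del result[old_key]
        # Otherwise the values differ: leave both for a human mapper.
    return result
```

For example, `{"place_name": "A", "name": "B"}` comes back unchanged, so conflicting values are never overwritten.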

Terminology - Automated edits, bots or Mechanical edits?

I notice that we have a muddle of terminology on the wiki. The article is called 'Automated edits' and contains a section with a list of 'bots'. The article is in a category called category:Bots. We have an Automated Edits code of conduct, but the Data Working Group has adopted and evidently enforces a Mechanical Edit Policy. Can I suggest that we standardise on one term and then update the categories, titles and text to match that term? PeterIto 10:07, 21 January 2012 (UTC)

My vote currently would be for 'Mechanical Edits', because not all edits that we will be concerned with will be totally automated, but they do all use software (including standard editors) to speed up the change process with associated increased risks of damage. PeterIto 10:07, 21 January 2012 (UTC)
I think maybe the main thing to do is try to explain what we're talking about with reference to these different terms, but maybe giving some more concrete examples. Try to say what exactly is covered by the code of conduct and to what extent? I think the best example of grey-area is a user finding all instances of a particular tag in their town using XAPI and then select-all changing the tag in JOSM and clicking upload. You wouldn't call that a "bot". It's not an "import". But both the words "automated" and "mechanical" could be applied to some extent. If they do a larger area than a town, then this starts to be a power tool with the negative aspects and the need for caution of any other bot/auto-edit approach, so it's an example of something which probably should be covered by this, but to what extent? Grey area.
I agree the terminology is a bit confusing, and we should standardise wiki page names if possible. However this page and the associated Automated Edits code of conduct have been around for a long time, and are heavily linked to (including from outside the wiki). I'd be reluctant to move these pages. And besides... I like "automated edits". It's one of those areas of technology where different words can apply equally well, but often mean slightly different things. I don't think moving this page will help all that much. When the "Mechanical Edit Policy" popped up, I did wonder why Frederik (and the Data Working Group) didn't call it the "Automated Edits Policy" for consistency. I'd prefer to see that page moved because to me (and I know it can have broader meanings) "mechanical" makes me think of clockworks and wind-up toys. Also that page is less well known and less linked.
-- Harry Wood 13:54, 21 January 2012 (UTC)
Makes sense. I have left a note on Talk:Mechanical Edit Policy requesting a discussion on the subject here. Personally I think we would do better to combine the concepts rather than try to distinguish them. PeterIto 17:18, 21 January 2012 (UTC)
I also think that "Automated edit policy" would be the better name for the policy page, simply because this terminology has been around for longer. And since nobody reacted to the suggestion on the other Talk page, I suggest "being bold" and moving that page. --Tordanik 02:43, 20 May 2012 (BST)
Does anyone object to the move? I have added a final note to Talk:Mechanical Edit Policy before making the move. PeterIto 11:59, 31 May 2012 (BST)

(DONE. Mechanical Edit Policy was renamed and then later merged into Automated Edits code of conduct. March 2015)

IMHO this "cleanup of 'mechanical edits' pages" was done like a bad mechanical edit: A discussion quite old and hidden, no announcements anywhere, no information of the users involved and no useful changeset comment given. -- malenki 20:17, 20 May 2015 (UTC)

Note that there's a bunch of subpages. Special:Prefixindex/Mechanical Edits/, and we still have instructions to create new sub pages under Mechanical Edits: Automated Edits code of conduct#Document thoroughly. Not sure if these need to be moved. -- Harry Wood (talk) 18:32, 26 March 2015 (UTC)

Thanks Harry. Am looking to clean these 'mechanical edits' subpages up now. Also... As you may have noticed, I will be running a session on 'bots, patrols and automated edits' at State of the Map in New York in June 2015 much of which I hope will be a discussion about how OSM should use and manage automated edits.
I do also agree that the term 'bot' should only be used for largely autonomous editing software that operates on the OSM database, making edits on a long-term basis using a bot account. Automated edits include bots and also one-off scripted tasks, imports, etc.
-- PeterIto (talk) 09:13, 19 May 2015 (UTC)
I see most of these Special:Prefixindex/Mechanical Edits/ are redirects now (showing in italics on that list), but some still need moving if we are to remove "Mechanical Edits" completely -- Harry Wood (talk) 02:38, 2 April 2017 (UTC)

Re-import/fix of data from CanVec

My (wife's) parents have a cottage on Mackenzie Lake in Ontario. Looking at this data I noticed a number of minor problems, and started to use the editor to try to fix them. However, I came across the same problem over and over, and when I looked into it a bit, found a more serious issue that I think could be automatically fixed.

Basically the data for the area, if not all of Canada, was automatically imported from CanVec. CanVec is split into grid squares, and the vectors are not continuous across the lines. Looking at my example, Mackenzie Lake ends up split into three parts, as there is a grid corner in the northeast corner, splitting the lake into parts. This same problem affects the entire map, at least in the areas I looked at, so lakes are split up, roads are non-continuous, etc.

I strongly suspect that there is data inside the CanVec database that would allow us to re-connect these vectors back into continuous paths.

That assumes that such a thing is "good". Does OSM want the data to be split up in a similar fashion? Or is it fine with large-scale vectors?

I would like to know how to better explore this, and perhaps start the process of updating the import to fix this, if possible.

Maury Markowitz 19:23, 15 October 2012 (BST)

Help to control updated translations

Is it possible to create a page on the Wiki, which is updated from time to time, with the following columns:

Updated translations

  1. Page name of the key/tag/element, etc. with "EN"
  2. Date of last change
  3. Page name of the key or tag equivalent to "Pl" or other language.
  4. Date of last change
  5. Optionally, a red symbol appears when the date of translated page is later.

This would help to keep translations from English up to date, while also showing missing translations. --Władysław Komorek (talk) 05:52, 15 April 2013 (UTC)

surface values

I propose to reduce the number of values of the surface=* key in a one-off edit. According to taginfo there are currently 2468 different values. I want to fix misspellings and translate local names (especially German) to English. In particular:

  • (p. 154): (ground) -> ground
  • (p. 153): concrete-plate -> concrete:plate
  • (p. 150): find_gravel -> fine_gravel
  • (p. 149): unpaved␣ -> unpaved
  • (p. 149): unpaved- -> unpaved
  • (p. 148): paving␣stone -> paving_stones
  • (p. 145): ␣cobblestone:flattened -> cobblestone:flattened
  • (p. 145): asphat -> asphalt

I can look for further examples if I know this can be done--Nobelium (talk) 13:21, 4 August 2013 (UTC)
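As an illustration only, the corrections listed above fit into a small lookup table plus whitespace trimming. The names `SURFACE_CORRECTIONS` and `normalise_surface` are invented for this sketch, and the table contains nothing beyond the examples from the list.

```python
# Corrections taken from the list above; entries that differ only by
# leading or trailing whitespace are handled by the trimming step.
SURFACE_CORRECTIONS = {
    "(ground)": "ground",
    "concrete-plate": "concrete:plate",
    "find_gravel": "fine_gravel",
    "unpaved-": "unpaved",
    "paving stone": "paving_stones",
    "asphat": "asphalt",
}


def normalise_surface(value):
    """Return the corrected surface value, or the trimmed value if it is unknown."""
    # Strip leading/trailing whitespace and collapse inner runs to one space.
    cleaned = " ".join(value.split())
    return SURFACE_CORRECTIONS.get(cleaned, cleaned)
```

Values that are not in the table are only trimmed, never guessed at, so unknown local names would still need a human decision.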

Removing leading or trailing whitespaces, or compressing multiple whitespaces by a single one may be done automatically (some editors do that implicitly in their tag editor).
Replacing whitespaces in the middle by underscores may be done if the value is enumerated (it should not be done in tags with free-text values such as name=*).
There are a few tags that still use whitespace in their key name. Most of them are specific to a region and were documented as is. Changing them should not be automated before a discussion occurs in the relevant community and the documentation page is updated to reflect the change accepted by the community (sometimes these tags were introduced de facto by a single user, were never objected to, and have long been documented as is: there's little benefit in changing them). When these tags are region-specific, their names need not be restricted to basic ASCII: accents or non-Latin letters may be used, as well as capitalization appropriate for the language in which the tag was defined. These are not generic British English terms, and it is illusory to attempt an imprecise translation if there is no goal of giving the tag a more global meaning usable elsewhere: such an approximate translation, with unattested English terms, will be extremely fuzzy. It's best to keep the local terms, which may also have precise legal definitions with no clear equivalent in other countries. However, if these tags are country-specific and map to legal terms, they should preferably use an uppercase country/territory code prefix from ISO 3166, followed by a colon without whitespace around it.
However we don't need any bot to perform these edits mechanically: QA tools are already reporting such cases, and manual editing can be done locally with much less trouble.
I suggest then to ask this only of well-known QA tool authors. Manual review of their individual reports is still better: these small "errors" are not so frequent that we need to dedicate a bot to making massive edits over very large areas.
Bots should still edit only local areas in their changesets: not much more than a city and surrounding suburbs, i.e. within a ~20km radius in rural areas, ~5km in low-density urbanized areas, and ~1km in dense urban areas. This allows easier reverts. As well, edits per changeset should not exceed about 200 objects. This is a lower limit than for human edits; the limit may be larger only for import tools, which should still work without spreading their changes over a large zone. This allows easier reverts without impacting too many users: human editors can review bot edits and resolve edit conflicts only if the lists of changes do not exceed about 200 elements; above this threshold, they have to give up and abandon their half-finished uploads, leaving many disconnected objects. Large changesets are frequently disruptive to the work currently being done by other users (and resolving edit conflicts is not easy, and very error-prone if they happen too repeatedly). — Verdy_p (talk) 11:33, 2 April 2017 (UTC)
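The "about 200 objects per changeset" suggestion is easy to enforce in a script. A minimal sketch, where the function name `split_into_changesets` is hypothetical and the geographic limits are not handled (that would need the objects grouped by area first):

```python
def split_into_changesets(objects, max_objects=200):
    """Split a list of pending edits into batches of at most max_objects each."""
    if max_objects < 1:
        raise ValueError("max_objects must be at least 1")
    return [
        objects[start:start + max_objects]
        for start in range(0, len(objects), max_objects)
    ]
```

Each batch would then be uploaded as its own changeset, so that a problem in one batch can be reverted without touching the others.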

Automated edits from personal accounts

I have noticed that these automated edits say that they are actioned via personal accounts: MichaelSchoenitze's bot and User:EtienneChove. Not sure if they are still used. Probably worth asking the users directly. PeterIto (talk) 18:46, 1 June 2015 (UTC)

For the second case, these are old (2009-2010) and were discussed at that time on the relevant mailing list. At that time, the OSM data in France was still at a very early stage and most of it has been refined since. Also the bot policy was not as clear as it is now, but even if it had been in effect, these edits would have been compliant with the rules.
Local edits are not necessarily discussed here, but on the relevant local mailing lists. EtienneChove properly listed what he did, and gave hints about whom he was discussing it with. These were not isolated bot edits. Etienne was playing very fairly by documenting this on his page. In particular, he was fair because he added or unified many source tags that were requested by the source (according to its licence), including an important piece of information: the year. This allows all of us to detect old data that may need updating now that the source has fresher data, for example when changing from bitmap cadastral maps to more precise vectorized data, or when the conflation between multiple maps that was needed then is no longer needed because the sources have been synchronized, with better terrain model data for adjusting the orthophotographic sources. — Verdy_p (talk) 11:44, 2 April 2017 (UTC)