User talk:Yurik


Default_language page?

Do I have permission to edit the default_language=* page to match the current proposal about default language formats? This is necessary if we are going to change the key to be default_language, to avoid confusion.

If you don't want the page changed, I can pick a different key like default:language, though it will still lead to some confusion if the proposal is approved. --Jeisenbe (talk) 01:01, 27 September 2018 (UTC)

@Jeisenbe, you never need permission to change any page in the wiki (with the possible exception of user:* pages). That said, the current default_language=* page documents how that tag should be used, and you shouldn't change that until after the proposal is accepted. I would recommend copying the current page into Proposed features/Key:default_language or some other such place, making all the needed adjustments to the current text, and getting the acceptance. Once done, you can simply copy it back over, possibly preserving any minor changes that happened since you initially copied it. --Yurik (talk) 01:37, 27 September 2018 (UTC)
I'm trying to be polite, and I also don't want to waste anyone's time by editing a page which will then be reverted. What I mean is that I am considering using default_language=* instead of language:default=* for the proposal about default language formats, which you commented on: Default Language Format. But I can't use this tag without changing the page to match the language of the proposal. Specifically, I would need to totally change the part about multiple languages. Alternately, we could move this whole page to "Proposed_Features/default_language", because it has not yet been approved. --Jeisenbe (talk) 05:20, 27 September 2018 (UTC)
@Jeisenbe, appreciate it :) We have to consider the goals though. default_language has a very clear goal - to allow data users to know the language of the name tag. Given any OSM object with just a single name tag, default_language lets you convert the object's coordinates to a simple language code, and thus intelligently use that single name tag. If you put two values in there, this goal cannot be reached, unless every name tag in that region always contains two name strings, e.g. separated by a dash. That is very rare, and I have only seen it on city names, not street names (though I'm sure it does exist somewhere). Also, changing the meaning of default_language would break any client that already uses it. I might have to re-read your proposal just to make certain I understood it correctly. --Yurik (talk) 05:29, 27 September 2018 (UTC)

SPARQL question

I see that you created SPARQL examples. I have two questions:

Is it feasible to generate a list of wikidata entries that

  • have interesting (long or featured) articles on Polish or English Wikipedia
  • have coordinates or are otherwise described as mappable object
  • are without a wikipedia/wikidata tag
  • are located in Poland or around some specific location

?

To avoid an XY problem: I want to add wikidata entries, but I am bored by processing a bunch of substubs without any interesting content at https://osm.wikidata.link/. Also, trying to match major articles may help to detect missing OSM data.

Is it possible to get query results from http://88.99.164.208 using an API? https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/Wikidata_Query_Help/Result_Views mentions only manual download.

Mateusz Konieczny (talk) 08:32, 9 September 2017 (UTC)

Hi Mateusz, yes to all of the above :) First, you may want to take a look at the main WDQS (click "Examples" there) - it has a lot of Wikidata-only examples and info, along with some help links. You may use the API directly (look in the browser debugger at the request it sends, and do the same) - it's a simple GET request. Or you can use my python code, with the http://88.99.164.208/bigdata/sparql endpoint. I will post the queries here in a bit. --Yurik (talk) 01:06, 10 September 2017 (UTC)
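For example, a minimal sketch of the GET call (the format=json parameter is an assumption based on standard Blazegraph behavior - verify it against the request your browser actually sends):

# Any query will do; this one just checks that the endpoint responds.
# URL-encode the query and send it as a GET request, e.g.:
#   http://88.99.164.208/bigdata/sparql?query=<url-encoded query>&format=json
# The response is standard SPARQL 1.1 JSON results.
SELECT ?osmId ?loc WHERE {
  ?osmId osmm:loc ?loc .
} LIMIT 1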
P.S. Mateusz, I wrote a query per above and added it to examples. --Yurik (talk) 02:44, 10 September 2017 (UTC)
P.P.S. Wikidata does not store article length, but it does have page view counts, and the number of different wiki languages that have an article on the topic. Both are very good indicators of which objects should be shown first.
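For instance, a sketch of that ranking (untested; it assumes the standard wikibase:sitelinks predicate and would still need a geographic restriction to finish within the timeout):

# Items with coordinates that no OSM object links to yet,
# ranked by how many wikis have an article about them.
SELECT ?item ?itemLabel ?sitelinks WHERE {
  ?item wdt:P625 ?location .
  ?item wikibase:sitelinks ?sitelinks .
  FILTER NOT EXISTS { ?osm osmt:wikidata ?item . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "pl,en" . }
}
ORDER BY DESC(?sitelinks)
LIMIT 100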
Is it possible to exclude events? Note that I am not interested in limiting results to subclasses of Q618123 (that should be easier, but many entries in Wikidata lack "instance of"). I tried this in the Wikidata Query Service:
SELECT ?item ?itemLabel
WHERE
{
  ?item wdt:P625 ?location.
  MINUS { ?location wdt:P31/wdt:P279* wd:Q1190554. }  # excludes events
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 10

but it times out for some reason. I looked at the examples, but I failed to find one that excludes items that are subclasses of something, and once I tried to adapt others I ended up with the query above, which for some reason has performance problems. What is the CPU-safe way of excluding events without excluding objects missing "instance of"? I prefer false positives; I have no problem with adding a missing "instance of".

Modifying the query at SPARQL examples (tinyurl y9ma6e3f or tinyurl 993o5sy - no direct URL, as wikidata for some reason is using a public link shortener typically used to hide spam) resulted in a 504 Gateway Time-out.

Mateusz Konieczny (talk) 06:08, 13 September 2017 (UTC)

You are using the wrong subject - it should be ?item, or in my example ?wd: FILTER NOT EXISTS { ?wd wdt:P31/wdt:P279* wd:Q1190554 . } But yes, it does take too long. It might work faster if you replace the circle service with the coordinate filter, like I did here (last commented portion). --Yurik (talk) 06:45, 13 September 2017 (UTC)
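Put together, the corrected query would look like this (still a sketch - without a geographic restriction it may well keep timing out):

SELECT ?item ?itemLabel
WHERE
{
  ?item wdt:P625 ?location.
  # filter on the item itself, not on its ?location value
  FILTER NOT EXISTS { ?item wdt:P31/wdt:P279* wd:Q1190554. }  # excludes events
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 10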
Thanks! I tried bbox, but it is failing even in the simplest case:
SELECT ?osmId ?wdLabel WHERE {
   ?osmId osmm:loc ?loc .
   BIND( geof:longitude(?loc) as ?longitude )
   BIND( geof:latitude(?loc) as ?latitude )
   FILTER( ?longitude > 19 && ?longitude < 20 && ?latitude > 50 && ?latitude < 51)
} LIMIT 10

Is it a hardware limitation, or is something wrong with my query? And thanks for the featured articles query - I already used it to add some links and read interesting Wikipedia articles. Mateusz Konieczny (talk) 08:30, 13 September 2017 (UTC)

I think it's failing because there are so many points, and that query requires a sequential scan through the whole DB. That's why Wikidata developed the box and circle geo services. Sequential filtering works well once the results are already small enough; the service, on the other hand, seems to use the geospatial index, and it does so first, before the filtering. Need to look at it more. --Yurik (talk) 10:41, 13 September 2017 (UTC)
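The box service version of the query above would be along these lines (a sketch - I am assuming osmm:loc values sit in the geospatial index as WKT literals, so this may need adjusting):

SELECT ?osmId ?loc WHERE {
  # the geospatial index finds points inside the box first,
  # instead of scanning every osmm:loc in the database
  SERVICE wikibase:box {
    ?osmId osmm:loc ?loc .
    bd:serviceParam wikibase:cornerSouthWest "Point(19 50)"^^geo:wktLiteral .
    bd:serviceParam wikibase:cornerNorthEast "Point(20 51)"^^geo:wktLiteral .
  }
} LIMIT 10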

SPARQL question II

Sorry for bothering you, but I have another query that fails for an unknown reason:

I wanted to find human settlements in Poland on Wikidata that have a TERYT code, and exclude ones already linked from OSM, to check whether a wikidata import using TERYT ids for matching may be useful.

SELECT ?item ?itemLabel
 WHERE
 {
   ?item wdt:P31 wd:Q486972.
   FILTER EXISTS  {
 		?item wdt:P4046 ?teryt
   }
   # There must not be an OSM object with this wikidata id
   FILTER NOT EXISTS { ?osm1 osmt:wikidata ?wd . }
 
   # There must not be an OSM object with this wikipedia link
   FILTER NOT EXISTS { ?osm2 osmt:wikipedia ?sitelink . }
   
   SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
 } LIMIT 10


The query returns 0 elements, which is surprising given that there are elements that should match.

For example, https://www.openstreetmap.org/node/1589969137#map=19/53.97112/14.54503 and https://osm.wikidata.link/Q673875 (I checked the lack of matches with http://overpass-turbo.eu/s/rGW).

I used the following query, adapted from one of the examples, to check that both the OSM node representing Grodno and its Wikidata entry are in the database:

SELECT ?marketLoc ?marketName ?osmid WHERE {
   VALUES ?place { "hamlet" }
   ?osmid osmt:place ?place ;
         osmt:name ?marketName ;
         osmm:loc ?marketLoc .
   # Get the location of Grodno from Wikidata
   wd:Q673875 wdt:P625 ?myLoc .
   # Calculate the distance,
   # and filter to just those within 5km
   BIND(geof:distance(?myLoc, ?marketLoc) as ?dist)
   FILTER(?dist < 5)
 }


so it seems that there is a bug in my TERYT query. Is it something obvious? Mateusz Konieczny (talk) 16:35, 13 September 2017 (UTC)

Mateusz, the first query is a bit wrong - you used two different variable names, ?item and ?wd, instead of the same one. Also, you don't need "FILTER EXISTS" - you can simply list both statements. There is no need to filter out sitelinks either, because ?sitelink is not connected to the rest of the query. And lastly, you don't want just "instance of a human settlement" - you want "instance of a human settlement, or of anything that is a sub-sub-sub... class of a human settlement".
SELECT ?item ?teryt ?itemLabel WHERE {
  # a human settlement, or an instance of any of its subclasses
  ?item wdt:P31/wdt:P279* wd:Q486972 .
  ?item wdt:P4046 ?teryt .
  # no OSM object links to this item yet
  FILTER NOT EXISTS { ?osm1 osmt:wikidata ?item . }

  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
} LIMIT 10


--Yurik (talk) 22:00, 13 September 2017 (UTC)

Wikidata fixing - my noSPARQL tool

It seems that you are interested in the topic, and it may give you ideas for more quality checks and data import possibilities: http://www.openstreetmap.org/user/Mateusz%20Konieczny/diary/42385 and https://github.com/matkoniecz/OSM-wikipedia-tag-validator. I created it before I was aware of SPARQL; the main benefit is that it allows a thorough listing of issues in a given location without running multiple SPARQL queries. The main problem is that it does not use a proper database import (it downloads individual wikidata entries), and as a result it is not feasible to run worldwide reports. Mateusz Konieczny (talk) 08:03, 2 October 2017 (UTC)

Hi @Mateusz Konieczny, yes, it seems we do have some functional overlap there :) Please take a look at the Wikipedia Link Improvement Project - I'm putting together all sorts of issues that have been discovered so far. I think your github README should point just to that page, not my old OSM_wiki tag problems or quality control queries. The SPARQL_examples page is mostly used from inside the service itself - it shows up as the "examples" dialog there (try it, it's cool :)). Also, there is a big (and somewhat ... heated) discussion at the @talk mailing list that you might be interested in.
Lastly, let's see if we can create SPARQL queries for all of your validations - allowing people to query directly and to see up-to-date info is fairly important. BTW, if you know ruby, it would be awesome to help improve MapRoulette a little bit, so that we can upload some of these challenges there. Currently MapRoulette is missing one important feature - the ability to store OSM ids. Once it allows that, we can upload objects and let users link to them.

P.S. I know it sucks to duplicate efforts, let's coordinate better :) --Yurik (talk) 18:23, 2 October 2017 (UTC)

  • Wikipedia Link Improvement Project - thanks! (I had found it already; in fact, that is why I am writing this.)
  • "should point just to that page, not my old OSM_wiki tag problems or quality control queries" - fixed!
  • "Lastly, lets see if we can create SPARQL queries for all of your validations" - that would be a good idea Mateusz Konieczny (talk) 05:08, 3 October 2017 (UTC)
@Mateusz Konieczny, LOL: "rely on service that may be hard to replicate once it stops working" - that's a very strange argument. Every service is only good while it works! :) On the other hand, anyone can set up a clone of that service - https://github.com/nyurik/osm2rdf --Yurik (talk) 05:16, 3 October 2017 (UTC)
"that's a very strange argument. Every service is only good while it works!" - yes, but I think you will admit that this service is probably less stable than wikidata API. Thanks for the link! I will try setting this up. Mateusz Konieczny (talk) 05:28, 3 October 2017 (UTC)
On the topic of mailing lists - I looked at it, and I would really advise you to stop pushing for worldwide mechanical edits. It is obvious that people are really unhappy with the idea, and it has the potential to end not merely with no edits done and a massive amount of discussion, but also with backlash ("let's delete all wikidata", "completely ban bots"). Maybe try discussing these ideas with the local community? Mateusz Konieczny (talk) 05:29, 3 October 2017 (UTC)
@Mateusz Konieczny, I agree; at this point I am simply trying to educate people about what it is and what benefits it provides. It seems the loudest voices are the ones with the least understanding of it. Funny how all the more advanced data consumers have already switched to it - Mapbox, Openmaptiles, etc. It has a huge benefit, but because it looks like a "number", and not all tools support it yet, people are afraid of "some number of the devil" being added. I thought that my initial request, followed by a discussion, followed by four days of quiet time, could be considered settled. But apparently some people decided to jump on it after the discussion. Oh well. As for your github, please change the wording about "only works while it works" - the same can be said about OSM itself :) --Yurik (talk) 05:42, 3 October 2017 (UTC)
I removed the tautology from the github readme. "People are afraid of" - I think that most people are afraid of ending up bot-happy like Wikidata or the Cebuano Wikipedia. Which is understandable, given the different style/targets/situation. Mateusz Konieczny (talk) 06:28, 3 October 2017 (UTC)
@Mateusz Konieczny, it's always important to aim for the golden middle. The English Wikipedia has used countless bots and has become the most successful project; OSM is extremely anti-bot; ceb-wiki is a joke. I think it's a mistake to go to either extreme. --Yurik (talk) 06:39, 3 October 2017 (UTC)
I'm sure pushing wikidata in such a large-scale effort will change the overall attitude towards bots, or more generally speaking towards what is considered to be automated edits. NOT. People are not anti-wikidata per se (even though you keep reiterating the same argument over and over). Maybe you should use a more piecemeal approach, so people have a chance to gain more trust and confidence in more automation. This, however, might end up favoring a more local, mapper-community-driven approach, with automation still considered to be fundamentally flawed (e.g. because the data you use for reasoning is already crappy). I really wonder why you take on the burden of adding all this wikidata on your own. It's really the local communities who should take ownership, and you can support them via your toolset (like you already do). There were several bots in the past that really messed up data, such as xybot. I believe people have good reason to be very sceptical, given how easily you can screw up data. Leveraging wikidata for multilanguage labels like Mapbox does looks like the first step; I'm sure there are plenty of other use cases ahead :) (see https://blog.mapbox.com/support-for-arabic-and-portuguese-in-mapbox-streets-5a9690dabff4) Mmd (talk) 06:53, 3 October 2017 (UTC)

Change of parser limits

Hi Yuri,

Thank you for openstreetmap/chef/commit/6694697e2851ac362546d34cf12f146633be3782. I found it by accident; it would have been nice if you had posted about it at Talk:Wiki or at https://forum.openstreetmap.org/viewforum.php?id=52. Please do not get me wrong! I am absolutely happy with that. Reading pages like Category:Pages with too many expensive parser function calls, you get the impression that people have waited for that change for years. Now these pages need to be adjusted.
Thanks again
PS: I am more than happy to help you with something; I just do not know a lot about the technical setup behind OSM (which is why I have never made a change or added a PR). --Tigerfell This user is member of the wiki team of OSM (Let's talk) 18:30, 27 September 2018 (UTC)

@Tigerfell thanks for the heads-up. Sorry, I was not aware that it was impacting anyone else, or that the community was waiting for it - I only saw it in relation to the recent Lua changes. I will make sure to post these things to the forum. BTW, a heads-up: the searchbox change has just been merged, so it should bring back the original functioning of the searchbox dropdown. --Yurik (talk) 20:40, 27 September 2018 (UTC)

Great! Thanks again! --Tigerfell This user is member of the wiki team of OSM (Let's talk) 21:49, 27 September 2018 (UTC)

Translations

Thanks for the tips! I will try to help as much as I can. --Милан Јелисавчић (talk) 21:29, 28 September 2018 (UTC)

Thanks!!! --Yurik (talk) 21:37, 28 September 2018 (UTC)

Wiki extensions for displaying maps

Hi Yuri,

I am currently searching for a well-crafted MediaWiki extension for this wiki that allows displaying multiple maps in an article. We already had some discussion at Talk:Wiki#Map_extensions, tried out the installation of the "Maps" extension, and failed.

One alternative mentioned was the "Kartographer" extension. We have the following requirements:

  1. Dependencies on a single website should be avoided (see the issues with the Simple image MediaWiki Extension).
  2. dropped
  3. No extensions that require an arbitrary amount of maintenance or self-coding (the wiki lacks coding/maintenance capacity).
  4. Ability to display Standard tile layer map rendering.
  5. If the extension allows maps from non-open sources, there must be a setting to disable accessing this source for the whole wiki.

Since you were involved in Kartographer, can you tell me if it can fulfil these requirements? --Tigerfell This user is member of the wiki team of OSM (Let's talk) 11:32, 22 October 2018 (UTC)


Yurikbot

Hi Yuri, I have seen that you are running the bot Yurikbot, which seems to create wikidata-style items in the wiki from taginfo data. Is there documentation on what this bot does specifically? --Dieterdreist (talk) 18:37, 9 November 2018 (UTC)

Hi Dieterdreist, please take a look at OpenStreetMap:Wikibase and search for the word "bot". I plan to write up a much bigger post and upload all the bot code shortly (currently in the process of cleaning it up). Thanks! --Yurik (talk) 19:19, 9 November 2018 (UTC)