Module talk:OsmPageTitleParser

From OpenStreetMap Wiki

Support other types of titles

Could we change this into a more general module that can parse all wiki page titles, not just the key and tag pages? There might be useful code at Module:Element already. --Tigerfell (Let's talk) 14:30, 30 September 2018 (UTC)

@Tigerfell: could you expand the unit tests to demonstrate the types of titles and how you see them parsed? It's OK if they fail at first. --Yurik (talk) 17:09, 30 September 2018 (UTC)
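To make the request concrete, here is a minimal sketch of how a more general title parser and its test cases might look. It is written in Python purely for illustration and is not the module's actual Lua code. The names `parse_title`, `ParsedTitle`, `LANGUAGE_PREFIXES` and `PAGE_TYPES` are hypothetical, the prefix sets are small made-up samples, and defaulting to "en" when no language prefix is present is an assumption.

```python
from typing import NamedTuple, Optional

# Hypothetical sample of language prefixes; a real list would come from a data table.
LANGUAGE_PREFIXES = {"de", "fr", "ja", "pt-br"}
# Pseudo-namespaces named in this discussion.
PAGE_TYPES = {"key", "tag", "relation"}


class ParsedTitle(NamedTuple):
    language: str             # lower-cased language code ("en" assumed as default)
    page_type: Optional[str]  # "key", "tag", "relation", or None for other pages
    name: str                 # remainder of the title


def parse_title(title: str) -> ParsedTitle:
    """Split a page title into language prefix, page type and remaining name."""
    language = "en"
    rest = title.replace("_", " ").strip()

    # Optional leading language prefix, e.g. "DE:" in "DE:Key:highway".
    head, sep, tail = rest.partition(":")
    if sep and head.strip().lower() in LANGUAGE_PREFIXES:
        language = head.strip().lower()
        rest = tail.strip()
        head, sep, tail = rest.partition(":")

    # Optional page type, e.g. "Key:" or "Tag:".
    if sep and head.strip().lower() in PAGE_TYPES:
        return ParsedTitle(language, head.strip().lower(), tail.strip())

    # Anything else is treated as an ordinary page.
    return ParsedTitle(language, None, rest)
```

Test cases in this style could then list one title per expected result, for example `parse_title("DE:Key:highway")` yielding language `de`, type `key` and name `highway`, while a title without a known prefix keeps its full text as the name.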
I would also need a function that returns the English language name for each language code. As this is not primarily about title parsing, I would write that into Module:Element, but I would like to use Module:OsmPageTitleParser/data. How can we document this properly? --Tigerfell (Let's talk) 12:29, 2 October 2018 (UTC)
While you can load data from any page, it would be confusing if you used a subpage of one module in another module. On the other hand, we might as well add related functions in here. We could rename the page if you want. --Yurik (talk) 13:06, 2 October 2018 (UTC)
That was also my thought. I would then suggest renaming the page to Module:DataLanguageCodes and adopting the convention of using the prefix Data for general data tables that can be "globally" included... --Tigerfell (Let's talk) 13:48, 2 October 2018 (UTC)
@Tigerfell: I moved it to Module:OSM Constants - I suspect it would be better to keep multiple things like that together to reduce loading time. Note that each #invoke creates a new Lua parser, so if a page contains multiple templates that use Lua, each one re-parses the Lua code. A data page loaded with mw.loadData() optimizes that, because it is loaded only once per page. --Yurik (talk) 20:04, 2 October 2018 (UTC)
Here's a list of the most used prefixes (count > 10, not counting redirects) that are not resolved by MediaWiki as a language. Note that the language codes I added to the constants are not part of it -- their usage is < 10. (The constants in Module:OSM_Constants were taken from some wiki page you suggested, but for some reason I can no longer find it using search.) --Yurik (talk) 21:17, 2 October 2018 (UTC)
2733 Tag
1319 Key
154 France
85 Switzerland
65 Canada
61 Finland
44 SotM 2014 session
43 Proposed features/Tag
42 Proposed features/Key
37 SotM 2010 session
35 India
27 Relation
19 CanVec
15 Serbia/MappingSerbia
15 Proposed features/Relation
11 POI
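A tally like the one above could be produced roughly as follows. This is only a sketch in Python, not the query that was actually run: the functions `title_prefix` and `count_prefixes` are hypothetical, the input is assumed to be a plain list of page titles, and the filtering of redirects and of prefixes that MediaWiki resolves as a language is left out.

```python
from collections import Counter


def title_prefix(title):
    """Return the text before the first colon, or None if the title has no colon."""
    head, sep, _ = title.partition(":")
    return head if sep else None


def count_prefixes(titles, min_count=1):
    """Count title prefixes and return (prefix, count) pairs, most frequent first."""
    counts = Counter(
        prefix for prefix in map(title_prefix, titles) if prefix is not None
    )
    # Keep only prefixes that occur at least min_count times.
    return [(prefix, n) for prefix, n in counts.most_common() if n >= min_count]
```

Calling `count_prefixes(titles, min_count=11)` on the full list of titles would correspond to the "count > 10" cut-off used for the list.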
The disadvantage of a central OSM constants data table is that the whole table has to be loaded on many pages, and it may get long. Well, I guess we will see whether it does.
The fact that Lua code is re-parsed multiple times for a single page seems unavoidable to me. Is there a mechanism to avoid that apart from data tables?
Thank you for this interesting data. I will have to check what 'CanVec' is... I already had the intuition that these language versions would consist of only a few pages. The reason for bringing this up in the unit test was merely to make you aware that such codes exist. I would not have objected if you had left them out. Sorry, my actions contradicted my intentions.
I recently found a list on Template:Languagename. I guess this is the list you are referring to. There is an 'official' list, but it includes only the MediaWiki codes plus 'Gcf'. --Tigerfell (Let's talk) 23:13, 2 October 2018 (UTC)
The question is: do we actually even want to parse all the non-typical pseudo-namespaces that are not Key:, Tag:, Relation: or a language? I saw the official page too, but I figured that running the list of existing pages through the language parser would be easier than just following that table. BTW, there are a few very strange namespaces that MediaWiki "kinda" thinks are a language: AND-NL (no idea what this is), Ro-md (should be ro, because Moldovan Romanian is not different) and Sr-latn (tricky). --Yurik (talk) 03:28, 3 October 2018 (UTC)
No, not really. I do not need them.
We do have one page with Ro-md: Ro-md:Map Features, and there is also one page in AND-NL. In the long run, we might consider moving these pages? If you know how to query these things and have an overview of the parser load, this is of course preferable to looking at (possibly) outdated tables. You could replace the table with a query link (if reasonable). --Tigerfell (Let's talk) 09:29, 3 October 2018 (UTC)