is_in is pointless. Spatial Indexing
Key:is_in is a pointless waste of database space. It's a poor solution to spatial indexing, a problem which is already solved by... spatially indexed databases.
Can someone link to a mailing list discussion? I know it has been discussed at length. Also this page needs to be modified to express the pointlessness of the tag so that maybe one day people will stop using it and we can delete it from the database.
-- Harry Wood 18:56, 4 February 2010 (UTC)
- http://lists.openstreetmap.org/pipermail/talk/2009-July/039139.html, http://lists.openstreetmap.org/pipermail/talk/2008-December/032039.html, http://lists.openstreetmap.org/pipermail/talk/2008-April/025186.html --Pieren 11:47, 12 February 2010 (UTC)
is_in is needed
Ppl. who say it was pointless (or even delete it!) should provide a working ersatz for navit on my 8 GB android w/o data flatrate. As apk-download-link, not via the market please. And all those addr:postcode, addr:city, addr:province, addr:district, addr:subdistrict, addr:hamlet and addr:country are also redundant and a waste of database space, why not delete them too? That'll free hundreds of megabytes! --Themroc 21:52, 21 May 2011 (BST)
- Provide-a-working-navit-as-a-apk-download-thankyou-very-much is a cheeky suggestion of course. On the whole it's up to developers to figure out sensible solutions which make their software work ...without asking mappers to fill our database with unnecessary cruft!
- is_in tags will only ever be a half solution. Why do developers want a half solution anyway? Perhaps they are hoping that is_in will be filled in everywhere. There is a better way that would achieve FULL worldwide coverage, would entail zero wasted effort by mappers, and zero extra wasteful tags: Do some spatial calculations.
- I appreciate that technically it's not all that easy. Working with administration ways and relations is a real pain at the moment. We definitely need to provide some demonstrations, and reusable code. You wouldn't want to do these calculations on-device. It'll work better as augmented planet downloads. For example a planet download with all is_in tags added by calculation would be feasible. The nominatim Pre-Indexed Data Service is arguably a step in the right direction (spatial indexing smartness, all automatically generated from planet data). In fact I can imagine that nominatim polygons could be used to set is_in tags (for an augmented planet download, not to put into the main database).
- These things will take some time. It'll probably happen quicker if people stop being distracted by half-solutions which waste a lot of the mapping community's time and effort. We're not there yet, and I'm not advocating purging is_in tags and creating a short term problem for developers who are trying to use it, but please help move things in the right direction.
- -- Harry Wood 12:51, 14 August 2011 (BST)
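To make "do some spatial calculations" concrete, here is a minimal sketch of the idea in Python. It assumes each boundary has already been assembled into a single closed ring of (lon, lat) points, which is the hard part with real administrative relations and is not shown. The names `point_in_polygon` and `compute_is_in` and the square boundaries in the usage note are made up for illustration; this is not how Nominatim or any other tool actually does it.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (lon, lat), treated as plain planar coordinates


def point_in_polygon(point: Point, ring: List[Point]) -> bool:
    """Ray-casting test: count the ring edges crossed by a ray going east."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def compute_is_in(point: Point, boundaries: Dict[str, List[Point]]) -> List[str]:
    """Return the names of all boundaries (hypothetical input) containing the point."""
    return [name for name, ring in boundaries.items()
            if point_in_polygon(point, ring)]
```

For example, with two made-up square boundaries, `compute_is_in((5, 5), {"A": [(0, 0), (10, 0), (10, 10), (0, 10)], "B": [(20, 20), (30, 20), (30, 30), (20, 30)]})` returns only the name of the square around the point. A real implementation would also need holes, multipolygons and a spatial index so that not every boundary is tested for every point.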
How we stop people from using this tag
I think that the best way to stop people from using this tag is to make sure that none of the is_in tags is actually being considered in the different search engines, rendering engines, gps-map-generators etc. Please update the list below if you know for a fact whether each tool uses is_in:
* Nominatim - there are some references to it in the code.
* Mapnik - unknown
* Osmarender - unknown
* mapgen.pl - there are some references to it in mapgen.pl
* mkgmap (map for Garmin) - uses is_in tags when mkgmap is run with the --location_autofill switch
* Navit - only works with is_in tags, can't work with boundaries.
- Why don't we provide a script to easily add is_in tags to an OSM file? In that way, tools that use is_in can preprocess the data, and the users would stop adding is_in tags. --Sanderd17 13:46, 18 December 2010 (UTC)
- No, I'm sorry. That would be a very, very, wrong thing to do. If you make up a piece of code that could do that, you should rather add that code to the tools that would otherwise need is_in. Any piece of information that can be calculated should be just that; calculated. The OSM database is not some place one should cache the results of such calculations. --Gorm 23:04, 20 December 2010 (UTC)
- I believe he meant to use the resulting osm file only locally, say extract -> is_in adder -> navit, i.e. without any intention of uploading it back. Would cover the few cases where the otherwise valuable software can not be easily extended to handle boundaries, and where the area being processed does contain valid boundaries - many parts of the world still don't have anything else besides the admin_level=2 boundaries. If all the boundary data is to be gathered from the extract being processed, some tools can not identify where something "is in" within their current pipelines, when they process the extract by iterating through the file in one go; the file would have to be read through once before the normal operation to construct the boundaries. (I don't know enough about Navit to say if it's a good example here).
- If such a utility existed, the coder should investigate and validate ways to malform the data enough to make the api reject it, if someone were to try to (edit and) upload the resulting file; maybe dropping the version attribute would do it, it's not AFAIK used by any data consumers. Boundaries are better, but if they are not available, at early stages of mapping a new place the only option is to identify where each feature near the border belongs to; the mapper doesn't yet know that for the next feature (house, road, ..) so he can't draw a boundary there - not yet. Alv 06:55, 21 December 2010 (UTC)
- I think that, if you say on the wiki that is_in tags can easily be added, no user will upload it. You have to work a while on OSM before you can download a dataset, execute such a script on it and upload it back. I think by that time, a user will know that it's not a good thing to do. But still, it's not easy to switch from boundaries (in all forms: relations, closed ways and some boundaries under construction) to is_in tags. And you need some kind of "is_in" information for performance. If you look at the navit trunk, you see that they worked on boundary support until about 7 months ago. I haven't contacted martin-s, but since he's still active in the project, I guess he stopped the boundary support because it was too difficult. --Sanderd17 08:02, 11 January 2011 (UTC)
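As a rough illustration of the "extract -> is_in adder -> navit" idea discussed above, the following Python sketch adds is_in tags to a local OSM XML extract. Everything here is an assumption made for illustration: the function name `add_is_in_tags`, the `lookup` callback (which stands in for whatever boundary calculation is available), and the choice to tag only nodes that already carry tags. Removing the `version` attribute follows the untested suggestion made earlier in this thread; whether the API really rejects such a file has not been verified here.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List


def add_is_in_tags(osm_xml: str,
                   lookup: Callable[[float, float], List[str]],
                   strip_version: bool = True) -> str:
    """Return a copy of an OSM XML extract with computed is_in tags on nodes.

    `lookup(lon, lat)` is a hypothetical callback returning the names of the
    areas containing that position. The result is meant for local use only.
    """
    root = ET.fromstring(osm_xml)
    for node in root.iter("node"):
        if strip_version:
            # Untested idea from the discussion: make re-upload less likely.
            node.attrib.pop("version", None)
        keys = {tag.get("k") for tag in node.findall("tag")}
        # Skip untagged nodes (plain way geometry) and nodes already tagged.
        if not keys or "is_in" in keys:
            continue
        # Assumes every node carries lat and lon attributes.
        names = lookup(float(node.get("lon")), float(node.get("lat")))
        if names:
            ET.SubElement(node, "tag", {"k": "is_in", "v": ",".join(names)})
    return ET.tostring(root, encoding="unicode")
```

The output would then be fed to the consuming tool instead of the original extract, so nothing computed this way ever needs to go back into the main database.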
- Shouldn't this page be changed (change all "isin" tags with "is_in") and removed from the proposed features set? Seems that it is approved (it's on the map features page but I can't find any final approval). -- Cimm 8 Nov 2006
- Yes, I guess all "isin" tags should be changed to "is_in" - now done. The tag 'is_in' was removed from proposed features and put on map features page on 5 November. Dmgroom 13:26, 8 November 2006 (UTC)
- Secondly the "place=country name=denmark isin=europe,scandanavia" should be removed, it's a bad example (wrong tag, europe should be after scandinavia) and Scandinavia is ambiguous here. As far as I know it's not a political entity but a collection of countries like the Baltic states or the Benelux. Do we add these kinds of classifications as well? -- Cimm 8 Nov 2006
- It's not an ordered list, just a collection of tags - putting one region before another in the list doesn't have any significance. Ojw 13:27, 8 November 2006 (UTC)
- I'd say it's OK to use things which aren't political entities (e.g. means you can browse cities in ireland without having to choose eire or ulster first) but other peoples' views may differ. Ojw 13:27, 8 November 2006 (UTC)
- The name tag is used in the examples... when do we need to use the name tag and when the place_name tag or do we use both? -- Cimm 8 Nov 2006
- I have an issue with the correct spelling for this tag: All the examples given use a colon as a separator between values, but the FAQ says that multiple values for a tag should be separated by semicolons. Which is correct?--Guenter 20:51, 8 November 2008 (UTC)
- I agree with Guenter. All other tags have the multiple values separated by semicolons (;) and the FAQ says "the semicolon is the only accepted character". More than that, editors and tools handling data merge values in the same manner and have been doing so for quite some time. This is easily visible in Potlatch if one merges two ways with different values for the same tag. I have looked over the data in Romania and is_in is the ONLY tag that uses commas as separators while for other tags the semicolon is used consistently. EddyP Fri, 17 Jul 2009 10:32:35 +0000
- Not imposing a certain order, or at least recommending one, makes the information useless. The same goes for the meaning of that information: if there's no agreement on what the tag means, then ALL the information associated with that tag is useless. The first examples say that both "USA, CA, San Francisco" and "San Francisco, CA, USA" are valid. How about we change that so we make sure smaller entities come first and larger ones last? Also the attempt to include Canberra in some abstract group is useless and would probably be better done with another dedicated tag (or a relation?). EddyP Fri, 17 Jul 2009 10:41
- I agree: I think some systematisation of the is_in tags is needed, particularly as they start to become useful for things like mkgmap csdf 15:29, 24 September 2009 (UTC)
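Until a separator and an order are agreed, a data consumer probably has to accept both forms. The Python sketch below shows one way to do that, as an assumption about how a tool might cope rather than as a description of any existing tool. `parse_is_in` and `same_places` are made-up names. The sketch treats the value as an unordered set, which matches the "it's not an ordered list" position above and deliberately throws away any small-to-large ordering.

```python
import re
from typing import List


def parse_is_in(value: str) -> List[str]:
    """Split an is_in value on commas or semicolons and trim whitespace."""
    parts = re.split(r"[;,]", value)
    return [part.strip() for part in parts if part.strip()]


def same_places(a: str, b: str) -> bool:
    """Compare two is_in values, ignoring order, separator and letter case."""
    def as_set(value: str):
        return {part.casefold() for part in parse_is_in(value)}
    return as_set(a) == as_set(b)
```

With this, "USA, CA, San Francisco" and "San Francisco;CA;USA" compare as equal. If a smaller-first order were adopted, `parse_is_in` already preserves the order found in the tag, so only `same_places` would need to change.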
- What to add as a value? Only political entities or more general stuff as well? Do we add provinces, countries, continents, political unions, geographical collections (like Scandinavia or the Benelux)? Is Europe the political Europe or the geographical Europe? -- Cimm 8 Nov 2006
How about some examples - say whether you think each one is a good idea to mark. Add some more examples too.
|Kempston in Bedfordshire||yes Ojw, yes Dmgroom, yes EddyP||Villages in a county|
|London in England||yes Dmgroom, yes EddyP, no Themroc|
|London in UK||yes Themroc|
|England in UK||yes Themroc||Hierarchies. Is there a node for England that we could attach the is_in tag to? Dmgroom 13:57, 8 November 2006 (UTC)|
|France in Europe||yes Ojw, yes Cimm, no EddyP, no Themroc|
|Jersey in The Channel Islands||no Themroc||Regions|
|Isle of Wight in Islands||no EddyP, no Themroc||Marking geographical feature types|
|Isle of Wight in Featured Locations||no EddyP, no Themroc||Using as a way of tagging good locations|
|Oakham in Places > 80% done||no Cimm, no EddyP, no Themroc||Use as a grading system|
|Oakham in Rutland||no Themroc||traditional counties|
|Switzerland in Europe||no EddyP, no Themroc||Europe as geographical region but not a member of the European Union|
Referring to Dmgroom's remark about "the" England node: it's a good question how you can know where to find these kinds of nodes and whether they exist or not. Maybe we need some API to look for all nodes in a bounding box with these tags? Or they could have another node color in JOSM (maybe some plugin or an extension for the mappaint plugin), though this last idea would only work at city level of course. -- Cimm 9 Nov 2006
I'd say: Use only ISO 3166 2-letter-codes for toplevel. If someone really needs to know which continent "cn" belongs to, it can easily be found out per software (there are only 249 of them). --Themroc 21:26, 21 May 2011 (BST)
This tag should be localized, as the place names are localized. For example, for Madrid, in Spain, the tag should be "is_in=Community of Madrid, Spain, Europe", and "is_in:es=Comunidad de Madrid, España, Europa".
I want to bring up a previous suggestion I saw (on osm-talk, I think) to use a namespace, e.g. isin:county=Shropshire, isin:country=United Kingdom. Exactly what namespaces are used is, of course, debatable.
The reason is I want to use a kind of place selection on my website to show stuff in a particular area. Currently is_in defines a set of tags so it's not so easy to derive from this countries, counties, etc, on its own. I've got round this partially in an import script I've written where it finds countries using isin:country then associates places with countries where they have that country name in their is_in tag.
Another method I can use is to simply provide tags and by selecting more tags one can narrow down the number of places. More 'web2.0' than dropdown lists but not quite as easy to program. Still, it doesn't solve the problem if people do want to list counties in England, for example...
BTW, I've got a script ready for importing places into MySQL from the planet dump every week. I'm currently not doing anything with it (haven't programmed anything that needs the data just yet) but I can provide dumps in various formats if anyone wants them (probably not doable weekly but I'll have a go).
Higgy 19:12, 5 November 2007 (UTC)
is_in:country - what value?
I am currently implementing support for "is_in:country" and "is_in" in an algorithm to determine the country a way is in for Traveling Salesman, in order to determine what national traffic regulations to use. (It is a shortcut to avoid loading the border polygon and to make ways explicitly tagged as being in a given country overrule an imprecise border polygon.)
What should I compare the tagged value against? Currently I compare first against the English name of the country (case-insensitive), then against the ISO 3166 code (uppercase, case-sensitive). The latter may give false positives in the is_in tag (e.g. a city name containing a valid ISO country code as a substring). What about localized names? If so, what localizations, and what about countries with more than one language, more than one alphabet, or more than one way to romanize (translate into Latin characters) the name?
--MarcusWolschon 08:30, 26 March 2009 (UTC)
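One way to avoid the substring false positives mentioned above is to split the is_in value into whole entries and compare each entry exactly, keeping the same order of checks (English name case-insensitively first, then the uppercase ISO code). The Python sketch below is only an illustration of that idea and is not the Traveling Salesman code. `COUNTRIES` is a tiny hypothetical excerpt, `country_from_tags` is a made-up name, and localized names are not handled at all.

```python
import re
from typing import Dict, List, Optional

# Hypothetical excerpt: ISO 3166-1 alpha-2 code -> English country name.
COUNTRIES: Dict[str, str] = {"DE": "Germany", "FR": "France", "CH": "Switzerland"}


def country_from_tags(tags: Dict[str, str]) -> Optional[str]:
    """Guess the ISO country code from is_in:country, falling back to is_in."""
    explicit = tags.get("is_in:country")
    if explicit:
        entries: List[str] = [explicit.strip()]
    else:
        # Whole entries only, so "FRANKFURT" never matches the code "FR".
        entries = [e.strip() for e in re.split(r"[;,]", tags.get("is_in", ""))]
    # First pass: English name, case-insensitive.
    for entry in entries:
        for code, name in COUNTRIES.items():
            if entry.casefold() == name.casefold():
                return code
    # Second pass: ISO code, uppercase and case-sensitive.
    for entry in entries:
        if entry in COUNTRIES:
            return entry
    return None
```

Under these assumptions `{"is_in": "Bad Essen, Niedersachsen, Germany"}` yields "DE", while `{"is_in": "FRANKFURT, Hessen"}` yields no match instead of a false "FR". An entry that is exactly a two-letter code but means something else would still be misread.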
I need to find out if a given way is inside a city/town/village. Are there any suggestions on how to do this while avoiding treating "Europe", "Europa", or the English or local name of the country/state/geographic area as if it were a city? --MarcusWolschon 13:29, 27 March 2009 (UTC)