Drafts/Canonical names

From OpenStreetMap Wiki

Places have names, and those names are registered in the OSM database as well as in Wikipedia, Wikidata and DBpedia. Some place names are stable: they have remained unchanged for years or decades. Why, then, do OSM tools have no direct access to the polygons of named places such as administrative areas?

We have boundary relations and Wikidata IDs for all countries and cities, so why does OSM not offer a service such as http://place.openstreetmap.org/{country_name}/{city_name} that returns the polygons?
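
As a rough illustration of the URL template above, the following sketch fills in and parses such an address. The host place.openstreetmap.org is only the hypothetical one named in the text (no such service is assumed to exist), and the function names are made up for this example.

```python
from urllib.parse import quote, unquote, urlsplit

# Hypothetical base URL, taken from the template in the text.
BASE = "http://place.openstreetmap.org"


def place_url(country_name: str, city_name: str) -> str:
    """Fill the {country_name}/{city_name} template, percent-encoding each part."""
    country = quote(country_name, safe="")
    city = quote(city_name, safe="")
    return f"{BASE}/{country}/{city}"


def parse_place_url(url: str) -> tuple:
    """Recover (country_name, city_name) from a URL built by place_url."""
    parts = [unquote(p) for p in urlsplit(url).path.split("/") if p]
    if len(parts) != 2:
        raise ValueError(f"expected /country/city, got {url!r}")
    return parts[0], parts[1]


if __name__ == "__main__":
    url = place_url("Brazil", "João Pessoa")
    print(url)
    print(parse_place_url(url))
```

Percent-encoding keeps non-ASCII names such as "João Pessoa" safe inside the path; a real service would still need a rule for ambiguous or duplicated city names.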

There are many technical rationales for why the OSM database should not be tied to stable names or stable IDs, even within the scope of Nominatim, but there has been no serious discussion of how to solve the problem for the most frequently used names.

Semantic Web of places

Wikidata, DBpedia and Schema.org define what a "place" is in the context of the Semantic Web.

Wikidata, Wikipedia, DBpedia and others (e.g. the CIA World Factbook) have registered thousands of place instances, many of them stable. A "place name" is the association of a stable, frequently used label (e.g. a formal RDF label) with a stable place instance. For example, all countries, cities and hierarchically intermediate administrative areas (typically states) have stable place names, ISO codes and labels that are well known in their local (stable) communities.

All of these ontologies have consensus definitions of country and city: db:country, sc:country, wd:country, db:city, sc:city, wd:city.

A specific example: Brazil, the state of Paraíba in Brazil, and the city of João Pessoa in Paraíba:

Brazil has the ISO 3166-1 alpha-2 code BR, the OSM relation 59470 (polygons), and the Wikidata ID Q155 (also used as the DBpedia entry).
Paraíba has the ISO 3166-2:BR code BR-PB, the OSM relation 301464, and the Wikidata ID Q38088 (DBpedia).
João Pessoa lies within BR-PB, and has the OSM relation 301405 and the Wikidata ID Q167436 (DBpedia).
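
The three records above can be held in a small lookup table. The sketch below is only an illustration: the `Place` class and the helper name are invented here, while the codes and IDs are the ones listed in the text. The relation URL pattern is the one used by the openstreetmap.org website.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Place:
    name: str
    iso_code: Optional[str]  # None where the text gives no code of its own
    osm_relation: int
    wikidata_id: str


# Values copied from the example in the text.
PLACES = [
    Place("Brazil", "BR", 59470, "Q155"),
    Place("Paraíba", "BR-PB", 301464, "Q38088"),
    Place("João Pessoa", None, 301405, "Q167436"),
]

# Index by Wikidata ID so that one stable identifier resolves to the other.
BY_WIKIDATA = {p.wikidata_id: p for p in PLACES}


def osm_relation_url(wikidata_id: str) -> str:
    """Return the openstreetmap.org page of the relation linked to a Wikidata ID."""
    place = BY_WIKIDATA[wikidata_id]
    return f"https://www.openstreetmap.org/relation/{place.osm_relation}"


if __name__ == "__main__":
    print(osm_relation_url("Q167436"))
```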

The hierarchy is generated in Wikidata by property P131, "located in the administrative territorial entity" (which also carries the semantics of "within" in the spatial-relation sense), creating a chain of place instances with countries as roots. As a simplified ASCII string for the place name (e.g. as a hierarchical RDF label) we can use
  br for Brazil,  br:pb for Paraíba, and  br:pb:joao.pessoa for João Pessoa.
In fact, these strings are jurisdictions in the URN LEX convention, and they can serve as canonical place names for official administrative areas in OSM.
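
A minimal sketch of how such strings could be derived by walking a P131-like chain from a place up to its country. The `PARENT` and `CODE` tables are hypothetical in-memory stand-ins for Wikidata and the ISO codes, and the normalization rule (strip diacritics, lowercase, replace spaces with dots) is an assumption inferred from the three examples, not the full URN LEX rule set.

```python
import unicodedata

# Hypothetical stand-in for the P131 chain: child -> parent (None for a country).
PARENT = {"Brazil": None, "Paraíba": "Brazil", "João Pessoa": "Paraíba"}

# Short codes used for the upper levels in the examples (br, pb).
CODE = {"Brazil": "br", "Paraíba": "pb"}


def ascii_label(name: str) -> str:
    """Strip diacritics, lowercase, and join words with dots."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return ".".join(ascii_only.lower().split())


def canonical_name(place: str) -> str:
    """Walk up the parent chain and join the labels from the root down."""
    labels = []
    current = place
    while current is not None:
        labels.append(CODE.get(current) or ascii_label(current))
        current = PARENT[current]
    return ":".join(reversed(labels))


if __name__ == "__main__":
    for p in ("Brazil", "Paraíba", "João Pessoa"):
        print(p, "->", canonical_name(p))
```

With these tables the three places map to br, br:pb and br:pb:joao.pessoa, matching the strings given above.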

Web names of places

There are many reliable free gazetteer services for confirming names, such as GeoNames and Statoids. They are good for fact-checking or improving terminological consistency, but none of them returns reliable polygons and none is a name resolver.

PS: today, among proprietary databases and services, the best reference is Google's Place ID.

The OSM canonical name of the place

Examples of complementary datasets to use as reference in rule decisions:

So, where there is no "official name", we can adopt stable "official extensions" (git datasets maintained by stable communities such as Open Knowledge, or by gazetteers) to resolve ambiguities or fill gaps in the official names. Example: BR-GB is not a valid ISO 3166-2:BR code, but, as a fact checked by the community, BR-GB is a valid place name for objects (e.g. the publication place of historical documents and laws) dated between 1960 and 1975, as shown by the br-state-codes.csv dataset.
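
The BR-GB case suggests that a place code can carry a validity period. The sketch below shows one way such a check might look. The table is a hand-written illustration, not the actual content or column layout of br-state-codes.csv, and treating both boundary years as inclusive is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CodeValidity:
    code: str
    valid_from: Optional[int]  # first valid year, None if unbounded
    valid_to: Optional[int]    # last valid year, None if still valid


# Illustrative rows only; BR-GB's period comes from the text.
CODES = {
    "BR-PB": CodeValidity("BR-PB", None, None),
    "BR-GB": CodeValidity("BR-GB", 1960, 1975),
}


def is_valid_for_year(code: str, year: int) -> bool:
    """True if the code is known and the year falls inside its validity period."""
    entry = CODES.get(code)
    if entry is None:
        return False
    if entry.valid_from is not None and year < entry.valid_from:
        return False
    if entry.valid_to is not None and year > entry.valid_to:
        return False
    return True


if __name__ == "__main__":
    print(is_valid_for_year("BR-GB", 1970))  # a law published in 1970
    print(is_valid_for_year("BR-GB", 2000))
```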

The urn:place proposal

There is a kind of "window of opportunity" for OSM to register itself as a name-resolver organization under RFC 8141, the IETF standard published in 2017 that defines the new assignment process for "Uniform Resource Names" (URNs).

"An organization that will assign URNs within a formal URN namespace SHOULD meet the following criteria:"
  1. Organizational stability and the ability to maintain the URN namespace for a long time;
  2. Competency in URN assignment.
  3. Commitment to not reassigning existing URNs and to allowing old URNs to continue to be valid
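
To make the urn:place idea concrete, here is a minimal sketch that builds and parses such a URN. The namespace identifier "place" is only the one proposed in this draft and is not assumed to be registered. The regular expression is a deliberate simplification of the RFC 8141 syntax: it enforces the 2 to 32 character namespace identifier but accepts only a small subset of the characters allowed in the namespace-specific string.

```python
import re

# Simplified URN pattern: "urn:" NID ":" NSS.
# NID: 2-32 alphanumerics or hyphens, not starting or ending with a hyphen.
# NSS: restricted here to letters, digits, ".", ":" and "-" (an assumption).
URN_RE = re.compile(
    r"^urn:(?P<nid>[a-z0-9][a-z0-9-]{0,30}[a-z0-9]):(?P<nss>[a-z0-9.:-]+)$",
    re.IGNORECASE,
)


def make_place_urn(canonical_name: str) -> str:
    """Prefix a canonical place name with the proposed (hypothetical) namespace."""
    return f"urn:place:{canonical_name}"


def parse_place_urn(urn: str) -> list:
    """Split a urn:place URN into its hierarchy levels, root first."""
    match = URN_RE.match(urn)
    if match is None or match.group("nid").lower() != "place":
        raise ValueError(f"not a urn:place URN: {urn!r}")
    return match.group("nss").split(":")


if __name__ == "__main__":
    urn = make_place_urn("br:pb:joao.pessoa")
    print(urn)
    print(parse_place_urn(urn))
```

The "urn" scheme and the namespace identifier are matched case-insensitively, as RFC 8141 requires; the namespace-specific string is left untouched because its case rules would be defined by the namespace itself.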