Talk:Machine-readable Map Feature list/Archive 1

From OpenStreetMap Wiki

Value descriptions

Note: value-summary was added to the DTD as a result of this discussion.


More on the quick reference stuff, I'm toying with rewriting the following value definitions in my output (again for brevity):

  • Transport-specific access restrictions to refer to 'access' (e.g. Key: motorcar; Values: See: access)
  • "List: monday, mon, tuesday, tue ..." to "List: weekday (e.g. monday, mon)"
  • "grade1, grade2, ..." to "gradeN (1<=N<=5)" (or something as concise but more readable)

Above it's decided that defining a type system for tag values is out of scope, but how about an optional natural language summary of allowed values in a tag definition that would cater for situations like the above, and maybe other values that are not really literal values (non-discrete numbers - speed, height, "User defined", etc)? tracktype might be described as follows:

So far I expected this kind of value summary to be part of the descriptions, but in the meantime I have begun to like your approach. value-summary will be useful to informally describe the "type" of tags like maxweight, name, minspeed and the like, as long as we don't use a more detailed type system. I wouldn't use it in the example below, however. A value summary grade1 to grade5 isn't very useful because each grade is and should be described with an individual value-def. The main advantage of value-summary is for tag-defs which don't include value-defs.
Gubaer 13:31, 21 December 2008 (UTC)
In something like a key-value summary (as a quick reference might be), including full descriptions isn’t always practical, but listing the keys and values is just enough to jog the memory of someone looking for what to include. I dislike the values for tracktype because they are not very descriptive (but that’s another issue), and listing each value separately gains nothing over compacting it into a quick summary.--Sward 19:33, 21 December 2008 (UTC)
It might help to see what I’m working on to gauge my requirements: <http://bleah.co.uk/~simon/stuff/osm/mapfeatures/mapfeatures-1n-tab-sec-2.pdf>.--Sward 19:43, 21 December 2008 (UTC)
<tag-def key="tracktype" onway="yes">
  <displayName>Track Type</displayName>
  <description>Provides a classification of tracks.</description>

  <value-summary>grade1 to grade5</value-summary>
	
  <value-def value="grade1" onway="yes">
    <displayName>Grade 1</displayName>
    <description>paved track or heavily compacted hardcore.</description>		
  </value-def>
  <value-def value="grade2" onway="yes">
    <displayName>Grade 2</displayName>
    <description>unpaved track; surface of gravel or densely packed dirt/sand. </description>		
  </value-def>

  <!-- ... -->
</tag-def>

--Sward 22:32, 20 December 2008 (UTC)
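
A minimal sketch of how a quick-reference generator might consume a tag-def like the one above. It assumes Python's standard xml.etree.ElementTree and a trimmed copy of the example; the helper name summarize_values is hypothetical and not part of any agreed tooling. It prefers value-summary when present and otherwise falls back to listing the value-def values.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the tracktype example from this discussion.
TAG_DEF = """
<tag-def key="tracktype" onway="yes">
  <displayName>Track Type</displayName>
  <description>Provides a classification of tracks.</description>
  <value-summary>grade1 to grade5</value-summary>
  <value-def value="grade1" onway="yes">
    <displayName>Grade 1</displayName>
  </value-def>
  <value-def value="grade2" onway="yes">
    <displayName>Grade 2</displayName>
  </value-def>
</tag-def>
"""


def summarize_values(tag_def):
    """Return a one-line description of the values allowed for a tag-def."""
    summary = tag_def.findtext("value-summary")
    if summary and summary.strip():
        return summary.strip()
    # No summary given: fall back to the literal values (hypothetical policy).
    values = [v.get("value") for v in tag_def.findall("value-def")]
    return ", ".join(values) if values else "(no values listed)"


tag_def = ET.fromstring(TAG_DEF)
print(tag_def.get("key"), "->", summarize_values(tag_def))
```

For tag-defs without value-defs (maxweight, name and the like), the summary is the only value information such a generator would have to print.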

SVN repository

I think we should manage our artifacts in the OSM SVN. Our sub-tree could start at http://svn.openstreetmap.org/applications/utils/map-feature-list.

  • schema - the schema for the list (DTDs)
  • doc - documentation
  • ~data - repository for the generated lists
  • src - source of harvesters, publishers and possibly client-side libraries

Gubaer 19:09, 21 December 2008 (UTC)

Done, see http://svn.openstreetmap.org/applications/utils/map-feature-list Gubaer 19:34, 23 December 2008 (UTC)


Restrict scope to accepted features

Note: 2008-12-28: moved to archive.

Today, there are several approaches for defining features on the Wiki. I described them on the main page. If we want to harvest tag and value definitions from the Wiki we will first have to clean up the source, i.e. the data in the wiki. In order to get something out soon we should restrict the scope of the features we will include in the machine-readable feature list. I suggest that we focus on harvesting accepted features (i.e. those mentioned on the Map Features page) in a first step and deal with proposed, rejected and deprecated features later. Gubaer 14:59, 21 December 2008 (UTC)

There we run into the first problems. Parsing only the MF page won't get us a full list of accepted features. The MF page is far from complete in this respect. We should rather rely on the individual Tag / Key pages (with a small change in the templates used, to get all necessary values) and use the MF page as an additional source. We run into trouble with the MF page when it grows too much (it happened once that the wiki server couldn't handle the page with all its templates any more).
Furthermore, if we build the harvester around the Key/Tag templates in the first place, it is easier to add the proposed tags later. We "simply" need to clean up all proposal pages and add "our" standardized templates that can be evaluated. --Etric Celine 08:44, 22 December 2008 (UTC)
Agree. Yesterday I drafted a proposal along these lines. The idea is to reuse the existing templates Template:KeyDescription and Template:ValueDescription, see main page. Gubaer 08:58, 22 December 2008 (UTC)


XML as data format

Note: 2008-12-28: moved to archive


xml looks good to me, but I don't care that much

I think it could be more human-readable. This is very enterprisey XML --Hubne 08:44, 19 December 2008 (UTC)
What do you mean by "more human-readable"? Other tag names? Shorter ones? Is "tns:..." as a namespace prefix bothering you? Don't worry, it refers to the current namespace and can be left out. Gubaer 09:40, 19 December 2008 (UTC)
OK - that was a rushed comment. I think this is the output of a UML-type program. "tns:" only bothers me in that it's meaningless/random/default (unless you tell me otherwise). I would rather see a namespace prefix that relates to the vocabulary like "osmtag" or "otl" (OSM tag list) or something. That's just a placeholder but it's like how it's good practice to use meaningful variable names. I was going to talk about trimming the overcomplicated element names but actually they are only in your diagram, not the XML. I can read XML Schema but it is very hard to scrutinise compared to RELAX NG. I think the schema is OK. I don't really understand what the labels are for. Can you perhaps pick out 5 well-known tags and populate more osm-tag-definition elements? --Hubne 11:50, 19 December 2008 (UTC)
"I think this is the output of a UML-type program." Actually it isn't; I've been using an XSD editor in Eclipse. The diagram is a dia diagram, not a classical UML tool either (there is a UML stencil for dia, but I didn't use it), and usually not the kind of tool from which one generates XSD schemata ;-)
tns is an abbreviation for this namespace, similar to the keyword this in OO languages. This is just a convention in the community. The namespace prefix in XML is irrelevant. You can always replace it with another one, provided that you declare it in the root element with xmlns.
Why do we have to distinguish between osm-tag-definition and osm-label-definition? Mostly because both the tag highway and the individual labels defined for this tag have a display name and a description. But also because individual rules of applicability apply to the tag and to individual labels. For instance, highway as a tag is applicable to nodes, areas and ways, but highway=residential is only applicable to ways, highway=bus_stop only to nodes and highway=pedestrian only to ways and areas. So, could the applicability of a tag be inferred from the applicability of its labels? No, because there are tags without predefined labels, for instance name, maxspeed, maxheight, etc. We therefore sometimes have to define the applicability on the tag, not on its labels.
Furthermore, our goal is to create a machine-readable list of map features; whether this list is more or less human-readable shouldn't bother us too much.
Gubaer 21:43, 19 December 2008 (UTC)
I know all about XML namespaces, thank you. I'm just saying don't choose the default from your authoring tool when you can take a minute to give it a meaningful prefix and it's so much easier to grok. Have you ever worked with Dreamweaver developers who use "style1" etc?


I think I am getting confused by your use of the word "label". It seems like you mean "value".
--Hubne 01:50, 20 December 2008 (UTC)
I'm using value for an arbitrary value of a tag, i.e. foo, bar, lkauoa%635ssdt. For certain tags there is a set of predefined values with a given meaning (semantics) which I call labels. Users are suggested to use predefined labels as values but in OSM they are and should be free to enter whatever value they want.
I would also like to see the schema a little less confusing. I am a fan of KISS (keep it short and simple). Why do we need something like "osm-tag-definition"? Doesn't <key name="highway" icon="highway.png"> do the same job? If we want to have categories too it should look something like this (--Al Friede 14:02, 19 December 2008 (UTC)):
Sure. Can you create a schema (I don't care whether it is RELAX NG or XML Schema) or a DTD for this example? Probably you will run into problems with the <en>, <de>, etc. tags, because it is difficult to define a schema with elements for all possible ISO language codes. And it is difficult to define the default value for an element if the element is missing (i.e. for the elements <node></node>, <area></area>, etc.), but we will see as soon as your proposal is available. The rest of your proposal will closely resemble the existing proposal, with osm-tag-definition replaced by tag, label replaced by value, and elements used instead of attributes. But the structure is basically the same (and from my point of view neither more nor less KISS, just different). I suggest that you create a proposal for a schema and submit it here. Then we discuss both, vote on them, and finally stick to one.
Language should probably be specified using xml:lang instead of making up elements/attributes for it. See: http://www.w3.org/International/questions/qa-when-xmllang --Sward 16:12, 20 December 2008 (UTC)
Agree. Gubaer 16:24, 20 December 2008 (UTC)
Of course, the question is whether we need a schema. I'd say yes, we need it. I'd like to validate the machine-readable list of map features against a schema, mainly because best practice mandates that applications check the structure of the file anyway. If a schema were available, a standard software component like a validating XML parser could do the job and we would not have to implement it.
Regarding categories: is it OK for everybody, that every tag belongs to exactly one category? (no tags outside of a category, no tags belonging to two or more categories)
Gubaer 21:09, 19 December 2008 (UTC)
I don't have much time now, but I'll see if I can do it this weekend. I'm having trouble just understanding the (reasoning behind the) abstract model proposed here, so I may start from the requirements.
It's crazy to have elements named after all ISO language codes. Make it an attribute value as originally planned.
We should have a schema, but it's never strictly needed. It's a very good idea for the reasons you state.
If we can, why limit our categories to one per tag? I would implement them as repeatable elements within the tag definition (or whatever you call it).
--Hubne 02:04, 20 December 2008 (UTC)
The idea behind the categories is just to group the Tags together and make it possible to split the large list of Tags. Therefore they should only belong to one category to avoid duplication. They have nothing to do with the usage itself. Just a cosmetic help. --Etric Celine 08:17, 20 December 2008 (UTC)
Someone seems to have copied your comment into another section, where I have responded :~) --Hubne 23:27, 20 December 2008 (UTC)
Suggested XML instance document has been provided under Another XML sketch --Hubne 23:27, 20 December 2008 (UTC)

The code below has not been signed. Please sign and remove this notice.

<categorie name="highway">
	<key name="highway" icon="highway.png">
		<displayName>
			<en>Highway</en>
			<de>Wege</de>
		</displayName>
		<description>
			<en>The highway tag is the primary tag used for highways. It is often the only tag. There are conventions for its use in particular countries.</en>
			<de>Der highway-Tag ist die hauptsächlich genutzte Art und Weise um Straßen zu kennzeichnen</de>
		</description>
		<value name="motorway">
	 		<displayName>
				<en>Motorway</en>
				<de>Autobahn</de>
	 		</displayName>
	 		<description>
				<en>A restricted access major divided highway, normally with 2 or more running lanes plus emergency hard shoulder. Equivalent to the Freeway, Autobahn, etc.. </en>
				<de>Straße mit baulich getrennten Fahrtrichtungen (i. A. Grünstreifen) und besonderen Nutzungsbeschränkungen. Typischerweise....</de>
	 		</description>
	 		<node>no</node>
	 		<way>yes</way>
	 		<area>no</area>
		</value>
		.
		.
 	</key>
 	<key name="junction" icon="junction.png">
	.
	.
</categorie>

For languages, xml:lang attributes are certainly better than 2-letters named elements. Marc Mongenet 01:22, 27 December 2008 (UTC)
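
As a rough illustration of why xml:lang is convenient for consumers: a value set on an element applies to its descendants unless a descendant overrides it, so a harvester only needs to carry the current language down the tree. The sketch below assumes Python's standard xml.etree.ElementTree; the root element name tag-list and the helper name texts_with_language are hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical document: the root sets the default language, one child overrides it.
DOC = """
<tag-list xml:lang="en">
  <tag-def key="highway">
    <displayName>Highway</displayName>
    <displayName xml:lang="de">Wege</displayName>
  </tag-def>
</tag-list>
"""


def texts_with_language(root, element_name):
    """Return (language, text) pairs, resolving inherited xml:lang values."""
    def walk(element, inherited):
        # An element's own xml:lang wins; otherwise use the ancestor's value.
        lang = element.get(XML_LANG, inherited)
        if element.tag == element_name:
            yield lang, (element.text or "").strip()
        for child in element:
            yield from walk(child, lang)
    return list(walk(root, None))


root = ET.fromstring(DOC)
for lang, text in texts_with_language(root, "displayName"):
    print(lang, text)
```

With elements named after language codes (<en>, <de>), the same lookup would need a list of every possible code; with xml:lang it is a single attribute read.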

Another XML sketch

The whole thread about how to structure our XML is moving too fast to work on something sanely. I have produced a rough instance document, which doesn't incorporate some of the more recent discussion. (A lot of the new developments I like.) Nevertheless, it's simple enough and readable, which is important for coders who have to work with it. I've called tag definitions "features" to be consistent with the wiki and to represent the concept rather than the implementation. It's all ideas.

I don't think I have time to do a schema. I find it easier to follow these, anyway.

The structure of this proposal is fairly close to what we already have. There are a couple of details I like:
  • xml:lang on the root element could be useful. If present, it will define the default language for the whole file. I'll remove the default value en for xml:lang on description and display-name. xml:lang on these objects shall inherit the default language from xml:lang on the root element (which can't be expressed formally in the DTD).
Inheritance of xml:lang is already defined in the recommendation. No (or empty) xml:lang means the language is undefined. Child elements inherit. You can set defaults for elements. See: http://www.w3.org/TR/REC-xml/#sec-lang-tag --Sward 19:33, 21 December 2008 (UTC)
Ok, thanks! I've added xml:lang to the root element with default value en. Gubaer 19:42, 21 December 2008 (UTC)
  • using the id and ref attribute when we refer from one element to another.
Other aspects of this proposal are less convincing:
  • Why define applicability that way? It is difficult to declare default values if we don't use attributes.
     <applicable><!-- default applicability for the tag -->
      <object>way</object>
      <object>area</object>
    </applicable>
   
  • I don't understand why presets are child elements of features. The whole idea of presets is that they define groups of tag definitions (or features) with a specific meaning (semantics).
  • We try to reach consensus that defining a type system for tag and value definitions is out-of-scope (see above). If we had to declare types I'd support the approach of reusing basic types from XML Schema and xsi:type as attribute, but for the time being I still suggest to leave types aside.
  • I don't think we should rely on the Dublin Core
    I'm not sure what the problem is with the Dublin Core. You don’t rely on the Dublin Core, you use the properties it specifies where they fit. This does, however, only appear to be dc:description in our context.--Sward 13:23, 26 December 2008 (UTC)
    There is no problem. As you said, "you use the properties it specifies where they fit" and I agree with you that description currently is the only element whose semantics would closely match the semantics of an element in the Dublin Core: dc:description. Personally, I'd therefore rather not create the "dependency" to the DC (if reusing elements from another namespace can be called a dependency at all) Gubaer 13:05, 21 December 2008 (UTC)



<?xml version="1.0" encoding="UTF-8" ?>
<mapFeatures
  xmlns="#FIXME"
  xml:lang="en"
  version="0.1"
  >

  <!-- this example doesn't include "suggests" or similar tag properties -->

  <category title="foo" id="cat1">
    <description>foofoo</description> <!-- we could use dc:description here if we wanted to borrow from Dublin Core metadata standard -->
  </category>
  <!-- repeat category element or have them all in another XML resource -->

  <feature key="key1" status="inuse">
    <label>footag</label>
    <label xml:lang="de">fütag</label>
    <description>footag description</description>
    <description xml:lang="de">ist heute noch fütag?</description>
    <applicable> <!-- default applicability for the tag -->
      <object>way</object>
      <object>area</object>
    </applicable>
    <presets type="string"> <!-- could use xsi:type if we want to adopt those definitions -->
      <value content="bar" default="default">
        <label>foovalue</label>
        <label xml:lang="de">füvalue</label>
        <description>foovalue description</description>
        <description xml:lang="de">foovalue description auf Deutsch</description>
        <applicable> <!-- optional local exceptions for this tag value -->
          <object>way</object>
        </applicable>
      </value>
      <!-- repeat value element, only one may have default attribute set -->
    </presets>
    <iconName>icon1</iconName>
    <category ref="cat1" />
    <category ref="cat2" />
  </feature>
  <!-- repeat feature element -->

</mapFeatures>

--Hubne 02:24, 21 December 2008 (UTC)
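
To make the default-plus-exception idea in this sketch concrete, here is a small reading of the applicable elements, assuming Python's standard xml.etree.ElementTree. It works on a trimmed copy of the instance document without the placeholder namespace; the second value baz and the function names are hypothetical. It also assumes that a value's own applicable element replaces the feature default rather than being merged with it, which the sketch does not state.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sketch; "baz" is added to show the fallback case.
SKETCH = """
<mapFeatures xml:lang="en" version="0.1">
  <feature key="key1" status="inuse">
    <applicable>
      <object>way</object>
      <object>area</object>
    </applicable>
    <presets type="string">
      <value content="bar">
        <applicable>
          <object>way</object>
        </applicable>
      </value>
      <value content="baz"/>
    </presets>
  </feature>
</mapFeatures>
"""


def objects_of(element):
    """Return the object types in a direct <applicable> child, or None if absent."""
    applicable = element.find("applicable")
    if applicable is None:
        return None
    return {(obj.text or "").strip() for obj in applicable.findall("object")}


def applicability(feature):
    """Map each preset value to its object types, falling back to the feature default."""
    default = objects_of(feature) or set()
    result = {}
    for value in feature.findall("presets/value"):
        local = objects_of(value)
        # Assumption: a local <applicable> replaces the default entirely.
        result[value.get("content")] = default if local is None else local
    return result


feature = ET.fromstring(SKETCH).find("feature")
print(applicability(feature))
```

The fallback has to live in consumer code here, which is the concern raised above about declaring defaults when applicability is expressed as elements rather than attributes.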

Hierarchical names for tag keys

Note: 2008-12-28: moved to archive

  • Currently, OSM uses a kind of generic tag key. The set of tag keys given by the pattern name:<language>, e.g. name:de, name:en, name:fr, etc., does not fit in the proposed data structures
Leave them aside for now? Sletuffe 21:55, 18 December 2008 (UTC)


Another thing we should take into account is "Namespaces" (or whatever else you like to call them). This is something that never got a good description or proper examples of where we can use them. But they are "secretly" in use, more or less.

Some tags have a prefix attached to identify their "Namespace/group":

  • Namespace:Keyname = tagname

others use a postfix instead

  • Keyname:Namespace = tagname


Some examples: Prefix

  • AND:importance_level = xxx
  • educamadrid:codigo_postal
  • educamadrid:tipo
  • geonames:country_code
  • geonames:feature_class
  • gns:MGRS
  • gns:UFI
  • ideewfs:alturaElipsoidal
  • ideewfs:convergencia
  • openGeoDB:is_in
  • openGeoDB:name
  • tiger:reviewed
  • addr:housenumber
  • addr:city
  • addr:street
  • recycling:glass
  • recycling:bottles
  • building:levels
  • building:type
  • building:use

Postfix

  • <Keyname>:de
  • <Keyname>:fr
  • <Keyname>:en
  • ...
  • <Keyname>:left
  • <Keyname>:right


So in general prefixes are used by nearly all imports to add "internal" metadata without creating a conflict with the OSM data, e.g. an element can have a name=xxx tag, and a source:name=xxx tag tells you the name of this feature in the source database.

Postfixes are used to add specific languages to description/name/todo/note and similar keys, or for the left/right approach.

We should take this into account and specify a clean way to describe prefix and postfix values.

--Etric Celine 21:04, 18 December 2008 (UTC)
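
One possible way to handle both conventions in a harvester is sketched below in Python. A colon alone cannot tell a prefix from a postfix, so the sketch assumes a fixed list of known postfixes (taken from the examples above) and treats any other leading segment as a prefix; the function name split_key and this heuristic are hypothetical, not an agreed rule.

```python
# Postfixes listed in the examples above; a real list would be longer.
KNOWN_POSTFIXES = {"de", "fr", "en", "left", "right"}


def split_key(key, known_postfixes=KNOWN_POSTFIXES):
    """Split a tag key into (prefix, base key, postfix); missing parts are None."""
    parts = key.split(":")
    postfix = None
    # A trailing segment counts as a postfix only if it is on the known list.
    if len(parts) > 1 and parts[-1] in known_postfixes:
        postfix = parts.pop()
    prefix = None
    # Any remaining leading segment is treated as a namespace prefix.
    if len(parts) > 1:
        prefix = parts.pop(0)
    return prefix, ":".join(parts), postfix


for key in ("highway", "name:de", "addr:housenumber", "tiger:reviewed", "source:name"):
    print(key, "->", split_key(key))
```

With this split, a definition for name could cover name:de, name:fr and so on without listing each language as a separate key.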


Ontology for map features revisited

2009-01-11: now included in the main article


I wasn't yet happy with the current proposal for classes and properties at the beginning of this section. I had concerns about the proposed classes and about specific properties (their names, their intended values and their semantics). So I tried to summarize what our ontology and our mapping to SMW could look like. Gubaer 17:24, 31 December 2008 (UTC)

Classes

We have four classes (in OWL terminology). Naming conventions follow those used in OWL, i.e.:

  • camel case for class names, first letter upper case
  • camel case for property name, first letter lower case

Some of the properties have multiplicity 1, others 0..1 and other *. It is not possible to restrict multiplicity in SMW. It seems that multiplicity is always *: just repeat a property declaration with another value in order to assign multiple values to it. For clarity, the intended multiplicity of the properties is given below in [..].


Our classes

  • KeyDef (or Key)
An individual of KeyDef describes a key used in OSM for tagging a map element.
Semantic Properties:
  • key[1] is of type String (we can introduce our own type for key if necessary, for instance, if we want to restrict the allowed characters in the future)
  • displayName[*] is of type String (the semantics of the property should declare that we don't allow HTML or wiki formatting in the value)
  • description[*] is of type String (the semantics of the property should declare that we don't allow HTML or wiki formatting in the value)
  • onNode[0..1] of type boolean - whether this key is applicable to nodes or not. If missing, false is assumed.
  • onArea[0..1] of type boolean - whether this key is applicable to areas or not. If missing, false is assumed.
  • onWay[0..1] of type boolean - whether this key is applicable to ways or not. If missing, false is assumed.
  • onRelation[0..1] of type boolean - whether this key is applicable to relations or not. If missing, false is assumed.
  • state[0..1] of type String - either accepted, rejected, deprecated or proposed. If missing, accepted is assumed.
  • category[*] of type String - holds the name of a category this KeyDef belongs to
  • ValueDef (or Value, or Tag)

An individual of ValueDef is a Name/Value-Pair used in OSM for tagging a map element.

Semantic Properties:
  • key[1] is of type String (we can introduce our own type for key if necessary, for instance, if we want to restrict the allowed characters in the future)
  • value[1] is of type String
  • displayName[*] is of type String (the semantics of the property should declare that we don't allow HTML or wiki formatting in the value)
  • description[*] is of type String (the semantics of the property should declare that we don't allow HTML or wiki formatting in the value)
  • onNode[0..1] of type boolean - whether this value is applicable to nodes or not. If missing, false is assumed.
  • onArea[0..1] of type boolean - whether this value is applicable to areas or not. If missing, false is assumed.
  • onWay[0..1] of type boolean - whether this value is applicable to ways or not. If missing, false is assumed.
  • onRelation[0..1] of type boolean - whether this value is applicable to relations or not. If missing, false is assumed.
  • state of type String - either accepted, rejected, deprecated or proposed
  • category of type String - holds the name of a category this ValueDef belongs to
  • Preset

An individual of Preset describes how an OSM element is tagged with various map features in order to precisely describe a kind of real-world object.

Semantic Properties:
  • name[1] is of type String (we can introduce our own type for name if necessary, for instance, if we want to restrict the allowed characters in the future)
  • displayName[*] is of type String (the semantics of the property should declare that we don't allow HTML or wiki formatting in the value)
  • description[*] is of type String (the semantics of the property should declare that we don't allow HTML or wiki formatting in the value)
  • requires[*] is of type String. The value is either a key or a tag, e.g. name or highway/residential. It denotes a map feature which is required in the context of this preset. Interactive editors will add it automatically to the tags of a map element.
  • implies[*] is of type String. The value is either a key or a tag, e.g. name or highway/residential. It denotes a map feature which is not necessary because it is already implied in the context of this preset. Interactive editors will warn mappers if this map feature is present.
  • suggests[*] is of type String. The value is either a key or a tag, e.g. name or highway/residential. It denotes a map feature which is helpful in describing an element in the context of this preset. Interactive editors will propose it to mappers.


  • FeatureCategory

An individual of FeatureCategory describes a specific category (or group) a map feature belongs to.

Semantic Properties:
  • name[1] is of type String (we can introduce our own type for name if necessary, for instance, if we want to restrict the allowed characters in the future)
  • displayName[*] is of type String (the semantics of the property should declare that we don't allow HTML or wiki formatting in the value)
  • description[*] is of type String (the semantics of the property should declare that we don't allow HTML or wiki formatting in the value)
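
Since SMW cannot enforce multiplicity or defaults, a client reading the harvested data would have to apply them itself. The following Python sketch shows one way to do that for KeyDef, using a standard dataclass. The snake_case field names, the use of a language-keyed dict for displayName and description, and the applies_to helper are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List

STATES = ("accepted", "rejected", "deprecated", "proposed")


@dataclass
class KeyDef:
    key: str                                                    # key[1]
    display_name: Dict[str, str] = field(default_factory=dict)  # displayName[*], by language
    description: Dict[str, str] = field(default_factory=dict)   # description[*], by language
    on_node: bool = False        # onNode[0..1], false if missing
    on_area: bool = False        # onArea[0..1], false if missing
    on_way: bool = False         # onWay[0..1], false if missing
    on_relation: bool = False    # onRelation[0..1], false if missing
    state: str = "accepted"      # state[0..1], accepted if missing
    category: List[str] = field(default_factory=list)           # category[*]

    def __post_init__(self):
        # Reject states outside the four listed above.
        if self.state not in STATES:
            raise ValueError(f"unknown state: {self.state!r}")

    def applies_to(self):
        """Return the element types this key may be used on."""
        flags = [("node", self.on_node), ("area", self.on_area),
                 ("way", self.on_way), ("relation", self.on_relation)]
        return [name for name, enabled in flags if enabled]


highway = KeyDef(key="highway", on_area=True, on_way=True, category=["physical"])
print(highway.state, highway.applies_to())
```

The other three classes would follow the same pattern, with ValueDef adding a value field and Preset carrying requires, implies and suggests lists.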

Relationship between classes and Wiki categories

For each class we create a wiki category, i.e.

  • Category:KeyDef
  • Category:ValueDef
  • Category:FeatureCategory
  • Category:Preset

Relationships between properties and Wiki pages

In SMW, we can describe each property in a respective page in the namespace Property:, for instance Property:name or Property:implies. Creating property pages is not mandatory, but it makes sense, because the page provides semantic information about the property to editors. Editors can learn there what the meaning of a specific property is and what range of values is acceptable for this property.

Following the OSM philosophy, editors are still free to use whatever property they want, assigning whatever value they feel appropriate. However, editors are strongly encouraged to use a canonical set of semantic properties (those defined in the preceding sections) consistently.

A property page includes

  • a declaration of its type using [[Type:aType]], i.e. [[Type:String]]
  • standard wiki text which describes the properties to editors

It does not include, however:

  • a formal declaration of the property's multiplicity. Multiplicity must be described to users in the wiki text

Individuals and how we maintain them on the SMW

A specific key (for instance highway)

  • create a page highway
  • include the following wiki text. It sets the instantiated properties (OWL lingo), i.e. the property values for this individual. You may use a trailing | in [[property::value | ]] if setting the property should be invisible.
 [[key::highway]]
 [[onNode::false]]
 [[onArea::true]]
 [[onWay::true]]
 [[onRelation::false]]
 [[state::accepted]]
 [[category::physical]]
  • make sure it is marked as individual of the class KeyDef by including the following wiki category
 [[Category:KeyDef]]
  • make sure there is at least one language specific page with language specific properties, for instance en:highway. It should include
 [[displayName::Highway]]
 [[description::This is the description for Highway]]
Furthermore, this page will include the rest of the wiki text for this key in English.
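
A harvester working on the raw wiki text of such a page could pick up the annotations with a regular expression. The sketch below uses only Python's standard re module and handles the two forms mentioned here, [[property::value]] and [[property::value | ]], plus the [[Category:...]] marker. The function name parse_page is hypothetical, and the pattern is a simplification that ignores other SMW syntax.

```python
import re
from collections import defaultdict

# Matches [[property::value]] and [[property::value | caption]].
ANNOTATION = re.compile(
    r"\[\[\s*([^\[\]:|]+?)\s*::\s*([^\[\]|]*?)\s*(?:\|[^\[\]]*)?\]\]"
)
# Matches [[Category:Name]].
CATEGORY = re.compile(r"\[\[\s*Category\s*:\s*([^\[\]|]+?)\s*\]\]")

PAGE = """
 [[key::highway]]
 [[onNode::false]]
 [[onArea::true]]
 [[onWay::true]]
 [[onRelation::false]]
 [[state::accepted | ]]
 [[category::physical]]
 [[Category:KeyDef]]
"""


def parse_page(wikitext):
    """Return (properties, categories) found in the wiki text of one page."""
    properties = defaultdict(list)
    for name, value in ANNOTATION.findall(wikitext):
        # Repeated properties accumulate, matching the "*" multiplicity in SMW.
        properties[name].append(value)
    return dict(properties), CATEGORY.findall(wikitext)


properties, categories = parse_page(PAGE)
print(properties["key"], properties["onWay"], categories)
```

Every property comes back as a list because SMW allows any property to be repeated; checking the intended multiplicity is left to the caller.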

A specific value (for instance highway/residential)

  • create a page highway/residential
  • include the following wiki text. It defines the instantiated properties (OWL lingo), i.e. the property values for this individual. You may use a trailing | in [[property::value | ]] if setting the property should be invisible.
 [[key::highway]]
 [[value::residential]]
 [[onNode::false]]
 [[onArea::false]]
 [[onWay::true]]
 [[onRelation::false]]
 [[state::accepted]]
 [[category::physical]]
  • make sure it is marked as individual of the class ValueDef by including the following wiki category in the text
  [[Category:ValueDef]]
  • make sure there is at least one language specific page with language specific properties, for instance en:highway/residential. It should include
 [[displayName::Residential road]]
 [[description::This is the description for a residential road ]]
Furthermore, this page will include the rest of the wiki text for this value in English.

A specific feature category (for instance physical)

  • create a page FeatureCategory:physical
  • include the following wiki text. It sets the instantiated properties (OWL lingo), i.e. the property values for this individual. You may use a trailing | in [[property::value | ]] if setting the property should be invisible.
 [[name::physical]]
  • make sure it is marked as individual of the class FeatureCategory by including the following wiki category
  [[Category:FeatureCategory]]
  • make sure there is at least one language specific page with language specific properties, for instance en:FeatureCategory:physical. It should include
 [[displayName::Physical]]
 [[description::Description for feature category Physical]]
Furthermore, this page will include the rest of the wiki text for this feature category.

A specific preset (for instance residential_road)

  • create a page Preset:residential_road
  • include the following wiki text
 [[name::residential_road]]
 [[requires::highway/residential]]
 [[suggests::name]]
 [[suggests::oneway/yes]]
 [[suggests::maxspeed]]
  • make sure it is marked as individual of the class Preset
  [[Category:Preset]]
  • make sure there is at least one language specific page with language specific properties, for instance en:Preset:residential_road. It should include
 [[displayName::Residential Road]]
 [[description::Description Residential Road ]]
Furthermore, this page will include the rest of the wiki text for this Preset.
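
To illustrate how an interactive editor might act on a preset, here is a small Python sketch that compares an element's tags with the requires, implies and suggests properties described above. The dict layout of the preset and the function names are hypothetical; entries are either a key (name) or a key/value pair written as key/value (highway/residential), as in the property definitions.

```python
def matches(feature, tags):
    """Return True if tags contain the feature, given as 'key' or 'key/value'."""
    key, _, value = feature.partition("/")
    return key in tags and (value == "" or tags[key] == value)


def check_preset(preset, tags):
    """Report what an editor would add, warn about, or propose for an element."""
    return {
        # Required features that are missing: added automatically.
        "add": [f for f in preset.get("requires", []) if not matches(f, tags)],
        # Implied features that are present anyway: the mapper is warned.
        "warn": [f for f in preset.get("implies", []) if matches(f, tags)],
        # Suggested features that are missing: proposed to the mapper.
        "propose": [f for f in preset.get("suggests", []) if not matches(f, tags)],
    }


RESIDENTIAL_ROAD = {
    "name": "residential_road",
    "requires": ["highway/residential"],
    "suggests": ["name", "oneway/yes", "maxspeed"],
}

report = check_preset(RESIDENTIAL_ROAD, {"highway": "residential", "name": "Main Street"})
print(report)
```

The sketch treats a key with a different value (for example oneway=no) as not matching oneway/yes, so it would still be proposed; whether that is the desired behaviour is an open question.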