User:Casey boy/GB housing
See also Housing in the United Kingdom
The term "housing" generally applies to any building used for people to live in and so is not limited specifically to houses. In OSM, the following are all "housing" tags:
(Other tags are also in use - see Accommodation for a list).
In this analysis, we look only at dwellings classed as houses.
OSM housing mapping schemes
Tagging by implication
Tagging by implication is when houses are mapped simply as building=house regardless of the house type. The rationale is that data consumers can infer the house type (e.g. detached, semi-detached, terraced) by the number of adjoining house polygons (or lack thereof).
This type of mapping can be convenient for the mapper but does make it more difficult for data consumers to accurately determine statistics from tag usage alone. Instead, detailed analyses will need to account for adjoining polygons (i.e. with shared nodes) to determine house type.
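As a rough illustration of the extra analysis a data consumer would need, the following Python sketch infers a house type from shared nodes. It is only a sketch under stated assumptions: the function name `infer_house_types`, the input format (a dict of building id to outline node ids) and the `min_shared_nodes` threshold are all hypothetical, and the rule "one building alone is detached, a pair is semi-detached, three or more form a terrace" is a simplification.

```python
from collections import defaultdict
from itertools import combinations


def infer_house_types(buildings, min_shared_nodes=2):
    """Guess the house type of each building=house outline from shared nodes.

    buildings: dict mapping a building id to the node ids of its outline.
    Assumption: two outlines adjoin only if they share at least
    `min_shared_nodes` nodes (a common wall rather than a touching corner).
    """
    # Record which buildings use each node.
    node_users = defaultdict(set)
    for building_id, node_ids in buildings.items():
        for node_id in set(node_ids):
            node_users[node_id].add(building_id)

    # Count the nodes shared by each pair of buildings.
    shared_counts = defaultdict(int)
    for users in node_users.values():
        for a, b in combinations(sorted(users), 2):
            shared_counts[(a, b)] += 1

    neighbours = {building_id: set() for building_id in buildings}
    for (a, b), count in shared_counts.items():
        if count >= min_shared_nodes:
            neighbours[a].add(b)
            neighbours[b].add(a)

    # Walk each group of connected buildings and label it by its size.
    labels = {}
    for start in buildings:
        if start in labels:
            continue
        group, stack = {start}, [start]
        while stack:
            for other in neighbours[stack.pop()]:
                if other not in group:
                    group.add(other)
                    stack.append(other)
        if len(group) == 1:
            label = "detached"
        elif len(group) == 2:
            label = "semi-detached"
        else:
            label = "terraced"
        for building_id in group:
            labels[building_id] = label
    return labels


example = {
    1: [10, 11, 12, 13],                       # stands alone
    2: [20, 21, 22, 23], 3: [22, 23, 24, 25],  # pair sharing a wall
    4: [30, 31, 32, 33], 5: [32, 33, 34, 35], 6: [34, 35, 36, 37],  # row of three
}
labels = infer_house_types(example)
```

Labelling by group size means the houses at either end of a row are still reported as terraced, even though each adjoins only one neighbour.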
The top-level tagging system tags the type of house as a value to the building=* key. Examples include:
This tagging scheme is currently the one offered to mappers by both the iD and JOSM editors and has worldwide usage (leading to de facto status).
In the second-level tagging system, all houses are tagged using the building=house top-level tag. Further refinement on the type of house is then provided by second-level tagging, using the house=* key.
This tagging scheme is not currently offered to mappers by either the iD or JOSM editors. It is also worth noting that this scheme is generally only used in Ireland and the United Kingdom.
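The difference between the schemes can be sketched as a small Python function that looks only at an object's tags. This is an illustrative sketch, not an established tool: the function name `tagging_scheme`, the returned labels and the choice of values in `TOP_LEVEL_VALUES` (taken from tags mentioned on this page) are assumptions.

```python
# Assumption: a non-exhaustive set of top-level house-type values.
TOP_LEVEL_VALUES = {"detached", "semidetached_house", "bungalow"}


def tagging_scheme(tags):
    """Report which housing tagging scheme a dict of OSM tags appears to follow."""
    building = tags.get("building")
    if building in TOP_LEVEL_VALUES:
        # The house type is the value of the building=* key.
        return "top-level"
    if building == "house":
        if "house" in tags:
            # building=house refined by a house=* value.
            return "second-level"
        # Tags alone cannot say more; the type may be implied by geometry.
        return "house, type not tagged"
    return "not tagged as a house"
```

For example, `tagging_scheme({"building": "house", "house": "semi-detached"})` reports the second-level scheme, whereas `tagging_scheme({"building": "semidetached_house"})` reports the top-level scheme.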
Building type vs use
A quick note that mappers should differentiate, if possible, between a building's type and use. This can, therefore, result in more nuanced tagging such as building=industrial, building:use=apartments. It could also mean that building=house may not necessarily be in use as a "house" (i.e. a residential dwelling), though this is much rarer.
Some housing related building:use tags are shown below. The count is the global number of uses.
The mapping of terraced houses, which are the most common type of house in the UK, is slightly more complicated. A terraced house is joined to its neighbours on either side by party walls, forming a chain at least three houses long. The two "end of terrace" houses are still classed as terraced houses, rather than semi-detached houses, even though they adjoin only one neighbour.
If only the outline of the whole terrace is mapped (i.e. the whole chain) then all tagging schemes map the building as building=terrace. However when the terrace is split into individual houses, which must share connected nodes, the schemes diverge.
Tagging by implication
If using the top-level tagging scheme, individual terraced houses are mapped only as building=house. One might assume the houses would be mapped as building=terraced (to match, e.g., building=detached and building=semidetached_house); however, this value has almost no usage globally.
This creates an ambiguity with terraced house mapping. Is building=house an unspecified type of house or is it a terraced house? Data consumers must infer the house type from the number of building=house polygons that a building=house polygon is connected to via shared nodes (as with tagging by implication).
Additional tagging options have been suggested, e.g. a relation or using building:part, but each suffers from certain drawbacks (add mailing list source).
Use of the bungalow value, in both the top-level and second-level tagging schemes, conveys no information about whether the bungalow is detached, semi-detached or terraced. This presents an ambiguity when mapping bungalows, especially those that are attached to another house/bungalow. Currently it seems that we can only tag by implication, i.e. if two building=bungalow or house=bungalow polygons are adjoined, they are semi-detached bungalows.
One potential option to remove any ambiguity (when using tags alone) is to introduce refined tagging for bungalows (e.g. house=semidetached_bungalow, bungalow:type=*, or bungalow=semi-detached) or to add levels=* to house tags to indicate a bungalow. Alternatively, we continue tagging by implication.
Mapping of houses in GB
The following statistics are taken from the taginfo GB site which lists the tags, as well as their stats, used in Great Britain. Data from Northern Ireland are generally included in the island of Ireland taginfo site.
Since taginfo tables (as used above) are not country specific (unless they can be?), the numbers below are manually collected and are accurate as of 17 May 2021.
|Key||Value||Count||% of buildings|
1 This tag is ambiguous and could indicate (i) an unspecified house type, (ii) a terraced house in the top-level tagging scheme, (iii) any type of house in the second-level tagging scheme, or (iv) a house in the tagging by implication scheme. However, with regard to option iii, only 18,732 (0.8%) of objects tagged as building=house also have a house=* tag associated with them, so the vast majority of building=house must come from one of the other options.
|Key||Value||Count||% of buildings|
In the following table, the usage statistics of the four main types of house (detached, semi-detached, bungalow, and terraced) are given. The percentage columns indicate the percentage of houses of each type tagged using the top-level or second-level tagging scheme. No data is provided for terraced houses in the top-level scheme due to the ambiguity of the tagging for these houses (see Top-level tagging for terraced houses).
In the "combined" columns, the combined totals of the top-level and second-level schemes are provided. The mapped % provides the percentage of mapped houses as this type (combined) and actual % provides the actual percentage of these types of houses in the UK. The source for actual percentages includes flats (20.9% of dwellings) but, since these aren't often mapped individually in OSM, they have been removed from the percentages.
|Top-level tagging||Second-level tagging||Combined|
|House type||Key||Value||Count||%||Key||Value||Count||%||Count||mapped %||actual %|
|Total (ignoring static caravan and terraced)||--||--||317,507||97.7||--||--||7,407||2.3||324,914||89.0||64.7|
2 Note that in Scotland, bungalows are reported by their attached status (i.e. detached, semi-detached, terraced) and so the values are slightly skewed.
3 Static caravan not listed as a dwelling type in source.
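The percentages in the "Total" row can be reproduced from its two counts. The short Python sketch below recomputes the combined count and each scheme's share; the function name `scheme_shares` is hypothetical and only the figures from the table are used.

```python
def scheme_shares(top_level_count, second_level_count):
    """Return the combined count and each scheme's share of it, in percent."""
    combined = top_level_count + second_level_count
    top_share = round(100 * top_level_count / combined, 1)
    second_share = round(100 * second_level_count / combined, 1)
    return combined, top_share, second_share


# Counts from the "Total (ignoring static caravan and terraced)" row.
combined, top_share, second_share = scheme_shares(317_507, 7_407)
print(combined, top_share, second_share)  # prints: 324914 97.7 2.3
```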
There are just over 9 million buildings tagged in GB OSM. Of those:
- 27% (~2.5 million) are tagged as an unspecified type of house (building=house)
- 15% (~1.4 million) are tagged as generic residential buildings (building=residential)
- 2% (~200,000) are tagged as a terrace building, i.e. a row of terraced houses (building=terrace)
- 1% (~80,000) are tagged as apartment buildings (building=apartments)
The use of the top-level tagging scheme is much more popular than the second-level tagging scheme. Of the four types of houses looked at, 94.5% are tagged with the top-level tagging scheme (97.7% if house=terraced is omitted). It is difficult to know how popular tagging by implication is as we cannot tell if the mapper has deliberately chosen this scheme from tags alone.
However, the combined count of the top-level house-type tags is only 12.7% of the count of building=house. This fraction is likely an underestimate, since the top-level tagging scheme currently uses building=house for individually mapped terraced houses.
The fraction of building=house that are in fact individually mapped terraced houses, using the top-level tagging scheme, is difficult to identify without analysis of the actual mapped polygons. This has undoubtedly resulted in lower reporting of terraced house mapping than is actually present in OSM. This could be resolved by the adoption of a building=terraced (or similar) tag for top-level tagging, or moving to second-level tagging.
Specifying the type of house, in either tagging scheme, requires more effort than simply mapping a building (i.e. building=yes). Sometimes it may not be possible to identify the type of house without a ground survey (and so building=residential is used), and sometimes the mapper is not interested in adding the specific type (e.g. building=semidetached_house, or building=house and house=semi-detached).
Which scheme is better? There is no right or wrong answer here.
- The top-level scheme is, by far, more widely used than second-level tagging when directly compared. This is especially true globally, as second-level tagging is barely used outside the UK and Ireland.
- However, usage of the top-level scheme in GB is still only relatively minor compared to the more generic housing tags (i.e. building=house and building=residential).
- The second-level tagging scheme ties most housing types together under one top-level tag (building=house), which may appeal to some mappers.
- The top-level tagging scheme does not currently identify terraced houses, using tagging alone, whereas the second-level tagging scheme does. This could, however, be resolved with the introduction of a new value.
- Tagging by implication may provide enough information for some users. However, it does require much more sophisticated analysis for surveying the types of houses being mapped, as the data consumer must infer the house type by the number of adjoining building polygons (if any).
- Tagging of bungalows is not fully handled with either the top-level or second-level schemes but is, in a way, handled by the tagging by implication scheme.