Toronto building import
This page is used for organizing and documenting the completion of the building import in the City of Toronto. The data to be imported and some of the code used for pre-processing are available on GitHub.
Context and Scope
About 300,000 buildings were imported in the City of Toronto as part of the Canada building import, but that import was stopped before completion because of disputes over data quality, lack of documentation, and lack of notice. The east and west sides of the city had buildings imported, but the more densely mapped central areas remained largely untouched. We propose to complete the import of buildings inside the City of Toronto. It is critical that this effort adequately resolve all of the issues that prevented the Canada-wide effort from being completed.
As of October 2019, there are about 350,000 buildings mapped in the city, of which about 247,000 were last touched by one of four import accounts associated with the original import. This import completion effort will bring in approximately 57,000 building features in central Toronto. Most will be simple buildings (building=yes) and a few will need to be manually identified as building parts (building:part=*).
Data and License
Data is also available from the City of Toronto that includes estimated building heights, though the license on that data appears not to be suitable for OSM. The data from Statistics Canada seems to be derived directly from the City of Toronto data, but the building height attributes were dropped for consistency with the data that StatsCan aggregated from other municipalities. Building data is also available from Microsoft under a suitable license, though its quality is not adequate for urban mapping.
Source data quality issues
There were several data quality issues raised during the Canada Building Import which are addressed below in the context of this import.
Most geometries in the Canada building import contained too many nodes. That is, there were many nodes which did not represent real building features but were only artefacts of the data collection/storage process. This introduces undesirable bloat into the database.
Before import, the data have been automatically simplified using the Douglas-Peucker algorithm with a 15 cm tolerance. Topology has been maintained where buildings share nodes, to avoid accidentally disconnecting party walls and/or building:part=* components.
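The simplification step can be illustrated with a minimal sketch. This is not the pre-processing code used for the import; it assumes coordinates already projected to metres, and the function names (douglas_peucker, simplify_ring) are hypothetical. It also simplifies each ring independently, so it does not reproduce the shared-node topology handling described above.

```python
import math

TOLERANCE_M = 0.15  # 15 cm, as stated for this import


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all (x, y) tuples in metres)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Simplify an open polyline, always keeping its two endpoints."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], first, last)
        if d > max_dist:
            max_dist, index = d, i
    if max_dist <= tolerance:
        return [first, last]
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right


def simplify_ring(ring, tolerance=TOLERANCE_M):
    """Simplify a closed ring (ring[0] == ring[-1]).

    The ring is split at the vertex farthest from the start so that each
    half is an open polyline with distinct endpoints.
    """
    if len(ring) <= 4:  # a triangle cannot lose a vertex
        return list(ring)
    sx, sy = ring[0]
    far = max(range(1, len(ring) - 1),
              key=lambda i: math.hypot(ring[i][0] - sx, ring[i][1] - sy))
    first_half = douglas_peucker(ring[: far + 1], tolerance)
    second_half = douglas_peucker(ring[far:], tolerance)
    return first_half[:-1] + second_half
```

For example, a square footprint with one spurious node lying 5 cm off a wall loses that node, while the four real corners are kept.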
Many buildings that should consist entirely of right angles have corners that are a few degrees off. However, many other buildings really do have walls at 45 degrees or other angles, so it is not easy to rectify these defects automatically for some buildings without causing trouble for others. Problematic un-squared buildings will need to be squared manually during the import process using JOSM's "orthogonalize shape" tool (keyboard shortcut 'Q'). Tasks are being kept small enough that it should be possible to manually identify each building that needs squaring.
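The difficulty can be made concrete with a small sketch that measures how far each corner of a footprint is from a multiple of 90 degrees. It is only an illustration and not part of the import workflow, which relies on manual review; the function names and the 0.5 and 5 degree thresholds are assumptions chosen for the example.

```python
import math


def turning_angles(ring):
    """Turning angle in degrees at each vertex of a closed ring (ring[0] == ring[-1])."""
    pts = ring[:-1]
    n = len(pts)
    angles = []
    for i in range(n):
        ax, ay = pts[i - 1]
        bx, by = pts[i]
        cx, cy = pts[(i + 1) % n]
        heading_in = math.atan2(by - ay, bx - ax)
        heading_out = math.atan2(cy - by, cx - bx)
        turn = math.degrees(heading_out - heading_in)
        angles.append((turn + 180.0) % 360.0 - 180.0)  # normalise to [-180, 180)
    return angles


def max_deviation_from_right_angle(ring):
    """Largest distance, in degrees, of any corner from a multiple of 90 degrees."""
    deviations = []
    for turn in turning_angles(ring):
        remainder = abs(turn) % 90.0
        deviations.append(min(remainder, 90.0 - remainder))
    return max(deviations)


def classify(ring, min_skew=0.5, max_skew=5.0):
    """Label a footprint using hypothetical thresholds (degrees).

    'square'    : every corner is within min_skew of a right angle
    'candidate' : slightly off; may need JOSM's orthogonalize tool
    'leave'     : at least one corner is far from 90 degrees, e.g. a 45 degree wall
    """
    deviation = max_deviation_from_right_angle(ring)
    if deviation < min_skew:
        return "square"
    if deviation <= max_skew:
        return "candidate"
    return "leave"
```

A footprint that has both a genuine 45 degree wall and slightly skewed corners would be labelled 'leave' here, which is exactly the kind of case that still needs a human decision.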
Many buildings which share walls and present a contiguous street frontage (e.g. in many urban business districts) are represented as a single large polygon. There is no way to automate a solution to this, so it will be necessary to either leave these buildings as they are or manually split them during import.
3D Building Parts
The data was originally designed, in part, for 3D building massing models, and some larger or newer buildings are represented by several polygons, each originally with a different height attribute. We cannot import the height attributes, because they are not part of the open-licensed dataset, so these separate polygons may be of dubious value.
All buildings and building parts will initially be tagged as building=yes. Individual importers are, however, encouraged to use their judgement in combining these, or in converting them to building:part=* and adding a single outline tagged building=*. If building parts are not used in a given case, the smaller parts will need to be removed, leaving just the building footprint.
This import will not directly attempt to overwrite or alter any data already existing in the OSM database, whether it originally came from an import or not. Only buildings in the source dataset which do not spatially intersect with buildings already in OSM will be imported. This may result in some small gaps and these will need to be handled manually during or after the import.
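The selection rule above can be sketched as follows. This is a minimal pure-Python illustration, not the actual pre-processing code: the function names are hypothetical, footprints are assumed to be simple closed rings without holes, and rings that merely touch are treated as intersecting, which is an assumption since the exact predicate is not specified here.

```python
def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1 left turn, -1 right turn, 0 collinear."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def _on_segment(a, b, p):
    """True if p, already known to be collinear with a-b, lies between a and b."""
    return (min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= p[1] <= max(a[1], b[1]))


def _segments_intersect(p1, p2, q1, q2):
    o1, o2 = _orient(p1, p2, q1), _orient(p1, p2, q2)
    o3, o4 = _orient(q1, q2, p1), _orient(q1, q2, p2)
    if o1 != o2 and o3 != o4:
        return True
    return ((o1 == 0 and _on_segment(p1, p2, q1))
            or (o2 == 0 and _on_segment(p1, p2, q2))
            or (o3 == 0 and _on_segment(q1, q2, p1))
            or (o4 == 0 and _on_segment(q1, q2, p2)))


def _point_in_ring(p, ring):
    """Ray-casting point-in-polygon test for a closed ring."""
    x, y = p
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rings_intersect(a, b):
    """True if two closed rings cross, touch, or one contains the other."""
    for p1, p2 in zip(a, a[1:]):
        for q1, q2 in zip(b, b[1:]):
            if _segments_intersect(p1, p2, q1, q2):
                return True
    return _point_in_ring(a[0], b) or _point_in_ring(b[0], a)


def select_for_import(source_rings, existing_rings):
    """Keep only source footprints that intersect no existing OSM footprint."""
    return [s for s in source_rings
            if not any(rings_intersect(s, e) for e in existing_rings)]
```

The nested loops compare every pair of footprints, which is fine for illustration; a real run over tens of thousands of buildings would normally use a spatial index to limit the comparisons.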
Dividing up the work
One of the initial concerns about the Canada building import was that the tasks in the tasking manager were too large, either to assure proper validation or to fit in a single changeset. A tasking manager project has been set up with about 300 tasks of roughly 200 buildings each. Since the density of buildings varies, the size and shape of the tasks have to vary as well to keep the number of buildings per task manageable.
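As a check on the figures, about 57,000 features spread over about 300 tasks averages 190 per task, consistent with the "roughly 200" above. One way to obtain tasks that shrink where buildings are dense is a quadtree-style subdivision, sketched below. How the actual task boundaries were drawn is not described here, so the function name, the 200-building cap as a hard limit, and the depth limit are all assumptions made for illustration.

```python
def split_into_tasks(centroids, bbox, max_buildings=200, max_depth=20):
    """Recursively quarter bbox until each cell holds at most max_buildings.

    centroids : list of (x, y) building centroids
    bbox      : (min_x, min_y, max_x, max_y); cells are half-open on the max side
    Returns a list of (cell_bbox, building_count) for non-empty cells.
    """
    min_x, min_y, max_x, max_y = bbox
    inside = [(x, y) for x, y in centroids
              if min_x <= x < max_x and min_y <= y < max_y]
    if not inside:
        return []
    # Stop when the cell is small enough, or when the depth limit is hit
    # (guards against endless splitting on coincident points).
    if len(inside) <= max_buildings or max_depth == 0:
        return [(bbox, len(inside))]
    mid_x, mid_y = (min_x + max_x) / 2, (min_y + max_y) / 2
    quadrants = [
        (min_x, min_y, mid_x, mid_y),
        (mid_x, min_y, max_x, mid_y),
        (min_x, mid_y, mid_x, max_y),
        (mid_x, mid_y, max_x, max_y),
    ]
    tasks = []
    for quadrant in quadrants:
        tasks.extend(split_into_tasks(inside, quadrant, max_buildings, max_depth - 1))
    return tasks


# Average task size implied by the figures on this page.
average_per_task = 57_000 / 300  # 190.0
```

Dense central blocks end up in small cells and sparse areas in large ones, so every task stays under the cap regardless of local density.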