Import/Toronto Public Washroom Import

From OpenStreetMap Wiki
Jump to navigation Jump to search

Toronto Public Washroom Import is an import of the Park Washroom Facilities dataset from the City of Toronto Open Data Portal which is of type GeoJSON covering Toronto in Canada). The import is currently as of August 11, 2024 at the planning stage.

Goals and Background

The goal of this project is to import the locations and details of public washrooms in Toronto, Ontario, Canada using open data provided by the City of Toronto.

The City has two open datasets with the locations of public washrooms:

The first dataset, Park Washroom Facilities, includes the location and key details of 418 City-run public washrooms in parks and community centres. This dataset will be imported in this project.

As of 2024-08-10, we expect to import just under 300 data points after filtering for quality, type, and status. Of these, around 100 will be conflated with existing washroom locations and around 200 will be locations that were previously un-mapped. These will be broken up into 25 changesets based on City of Toronto ward.

In addition to the just under 300 data points in the main import sets, there are also approximately 25 washroom locations with service alerts that will be manually reviewed and added or conflated if appropriate. (Service alerts range from total closure of the washroom for one gender to a single broken stall or sink.)

The second dataset, Street Furniture - Public Washroom, includes the locations of four automated public washrooms installed and maintained by Astral Media as part of their street furniture contract with the City. This dataset only includes four features, three of which are already in OpenStreetMap. This dataset will be manually reviewed to add the one missing washroom location. Unlike the other public washrooms run by the City, these washrooms are paid (25 cent fee) and have a 20 minute time limit according to the City's street furniture webpage.

The source code and import files for this project can be found at our GitHub repository here: https://github.com/tallcoleman/toronto-osm-washroom-import

Schedule

  • August 11, 2024: proposal finalized and shared to community forum
  • August 25, 2024: proposed date to start importing changesets
  • September 8, 2024: proposed date to complete imports
  • After main import is completed
    • Double check for any additional changes to City source data
    • Manual addition of locations with service alerts
  • November 2024: re-check City dataset to see if it has been updated to indicate which park washrooms are closed or open in the winter

Import Data

Background

Data source site: https://open.toronto.ca/dataset/washroom-facilities/
Data license: Open Government License - Toronto
Type of license (if applicable): N/A - custom Open Government License
Link to permission (if required): See Toronto#City_of_Toronto_Open_Data and osmfoundation.org - OGL Canada and local variants
OSM attribution (if required): Contributors#Toronto
ODbL Compliance verified: yes

OSM Data Files

The import files for this project can be found at our GitHub repository here: https://github.com/tallcoleman/toronto-osm-washroom-import

The "by_ward" folder includes one folder for each changeset to be imported. The full set of data to be imported is included in the root "to_import" folder for reference only.

Import Type

So far, this is only planned as a one-time import, however we will save a copy of the source dataset in the GitHub repository as of the import date(s) so that it can be used to determine future changes and updates made by the City.

The data will be imported into OSM using JOSM. Filtering and transformation of the source dataset is done using the python script in the code repository, and conflation and import will be done manually in JOSM using the protocol outlined below.

Data Preparation

Data Reduction & Simplification

The source data will be converted into OpenStreetMap tags and any extraneous data in the City dataset will not be imported. Data will be conflated with existing OpenStreetMap data to maintain edit history, as well as surveyed tags and geometry.

The source data is filtered according to the following criteria:

  • "type" is "Washroom Building" (and not "Portable Toilet", per Don't map... temporary features)
  • "status" is 1 (open) and not 0 (closed) or 2 (service alert)

Status 2 (service alert) washrooms will be reviewed manually as described in the Goals and Background section.

Tagging Plans

Source Data Attributes

The source data from the City has the following attributes, as described in the "Data Features" section of the source page. A data profile summary can be found on the GitHub repository README.

Note that in the descriptions below, "asset" means the washroom and "parent asset" means the park or community centre the washroom is located within.

Column Description
_id Unique row identifier for Open Data database
id Parks, Forestry and Recreation Unique Identifier of the parent asset
asset_id Parks, Forestry and Recreation Unique Identifier of the asset
location Name of the parent asset
alternative_name Is comprised of the name of the parent asset and a descriptor of the location
type Ex: Washroom, Portable Toilet
accessible Provides a list of accessible features, if available
hours Hours of Operation, if outside of Park hours
location_details A detailed description of where to find the asset
url A link to the webpage of the parent asset
address Street address of the park, where the drinking water source is located.
PostedDate Date status was posted
AssetName This asset's name
Reason Reason for a particular status
Comments Open field for comments on this asset
Status
  • 0 = closed
  • 1 = open
  • 2 = service alert
geometry (lat/long coordinates for the location)

Tagging Plan for Data to be Imported

The data to be imported will have the following tags:

key value
amenity=* "toilets"
access=* "yes" (One known exception: Jack Layton Ferry Terminal Washroom (asset_id: 58062) where the description indicates it is behind fare paid area, in this case, "access"="customers" is used.)
fee=* "no"
male=* / female=* Only included as "yes" in the rare case that AssetName in the source data includes e.g. "Men's", "Female", etc. See note below about gender values.
toilets:disposal=* "flush"
toilets:handwashing=* "yes"
changing_table=* "yes" if Accessible in the source data includes "Child Change Table"
changing_table:adult=* "yes" if Accessible in the source data includes "Adult Change Table"
wheelchair=*
  • "yes" if Accessible in the source data has "Entrance at Grade" or "Entrance Access Ramp" as well as "Accessible Stall"
  • "no" if Accessible in the source data is "None"
  • left blank otherwise
toilets:wheelchair=* "yes" if Accessible in the source data has "Accessible Stall"
wheelchair:description=* Creates a list of the relevant features from Accessible in the source data, e.g. "Accessible features: entrance at grade, automatic door opener, accessible stall"
operator=* "City of Toronto"
opening_hours=* see note below
description=* location_details from source data, e.g. "Located on the grassy area behind the baseball diamond"
note=* see note below about opening_hours
ref:open.toronto.ca:washroom-facilities:asset_id asset_id from source data

opening_hours

The city's information page about public washrooms notes the following:

A number of the City’s washroom buildings have been winterized and are open year round.

...

Hours of Operation

  • From May to October, washrooms in parks are open from 9 a.m. to 10 p.m.
  • From November to April, washrooms in parks are open from 9 a.m. to 8 p.m.
  • Staff teams open and close many washrooms in a geographic area every day, so individual washrooms may open earlier according to where they land on the route.
  • Check building hours for washrooms located in recreation centres.

Unfortunately, the City dataset does not indicate which washrooms in parks are winterized. As a result, we have conservatively generated the opening_hours=* tag for all washrooms in parks that have the regular "9 a.m. to 10 p.m." hours to indicate that they are only open from May to October.

In these cases, the note tag will prompt surveyors to update the opening_hours tag if the washroom is winterized (usually indicated by a sign on the outside of the washroom building).

The script cross-references the washroom data with the City's dataset on Parks and Recreation Facilities to determine whether the washroom is in a park or a community centre. Washrooms with "9 a.m. to 10 p.m." hours in a community centre are assumed to have the same hours year-round.

For a small number of washrooms, the City Parks and Recreation Facilities dataset includes both a park and a community centre with the same facility ID. In these cases, the note tag will prompt surveyors to record the opening hours, since the context is unclear.

For other numeric opening hours indicated in the City source data (e.g. "9 a.m. to 5 p.m.", "9 a.m. to 7:30 p.m.", "6:30 a.m. to 11:30 p.m."), hours are assumed to apply year-round.

In all other cases, including where the City source data indicates, e.g. "View centre hours" or "View outdoor rink hours", the opening_hours tag is left blank. The City's dataset on Parks and Recreation Facilities does not indicate opening hours; these would have to be obtained from the City's webpages.

As a final note, it is possible that the City may update the source dataset in November to indicate which washrooms remain open during the winter; this will be checked as the final step of the proposed import plan.

Gender Tagging

Unless the AssetName column includes e.g. "Men's" or "Female", the City source data does not include sufficient information to determine the male=*, female=*, unisex=*, or gender_segregated=* tags.

While most washrooms in the dataset are likely male=yes + female=yes, some newer City parks and community centres are designed with unisex washrooms (see, e.g. Bluffer's Park East Washroom Building Improvements). There is not a clear consensus about whether unisex=yes has the same meaning as male=yes + female=yes, so using unisex=yes as a catch-all seems unwise. (See: Talk:Tag:amenity=toilets#Gender_neutral,_gender_segregated_and_unisex=yes)

Changeset Tags

Changesets will be tagged with the following values. The script generates changeset tags to go along with each import set.

Key Value
comment Toronto Public Washroom Import, subset NAME OF SUBSET
created_by JOSM/VERSION (created automatically by JOSM)
import yes
source City of Toronto
source:url https://open.toronto.ca/dataset/washroom-facilities/
source:date YYYY-MM-DD (generated automatically by script)
import:page Import/Toronto Public Washroom Import
source:license Open Government License - Toronto

Data Merge Workflow

Team Approach

This import will be done as a team, with the following contributors:

Additional help may be solicited from Civic Tech Toronto members

References

Workflow and Conflation Steps

There are 25 changesets to be conflated and uploaded in JOSM (one for each City Ward). The largest has has 23 nodes, and the smallest has 5 nodes.

As a precursor, we checked for ways with building=toilets that don't have the amenity=toilets tag (since the conflation steps later on match on the amenity=toilets tag) — changeset 154503418 (achavi, OSMLab)

Update and commit import data

  1. Run generate_imports.py to update the import data and commit the updated source and import data to GitHub.

Since the City source data is updated daily, the best plan is likely to run the script and commit its results once before working through the 25 changesets. Then, after the 25 changesets are done, we can run the script again and check for additional changes (e.g. washrooms that have changed status code in the meantime).

Video for JOSM Steps

YouTube Logo Toronto Washroom import workflow with JOSM

Load data in JOSM

  1. Import the WARD_washrooms.geojson file
  2. Download existing toilets data from OSM
    1. Download data using WARD_toilets_query in "Download from Overpass API" - make sure to "Download as a new layer"
    2. rename layer to "existing_toilets"

Conflate import data with existing toilets

Conflation Tool:

  1. Configure
  2. Select WARD_washrooms.geojson in layers and select all datapoints (ctrl/command + A)
  3. Reference -> Freeze
  4. Select existing_toilets layer and search for datapoints with amenity=toilets (ctrl/command + F)
  5. Subject -> Freeze
  6. Make sure "Replace Geometry" is unchecked
  7. In the Advanced tab, make sure the only option selected is "Centroid distance" and set the distance at 50 metres

Screenshot on GitHub - conflation tool configuration

Review conflation results

Tip: Keep the existing_toilets layer selected the whole time to prevent accidental edits to the reference layer. The selected layer has a green check-mark next to it.

The conflation plugin generates two types of results that need to be reviewed following the steps below:

  • "Reference only" results are new additions to the map (i.e. with no existing toilets found within 50 metres).
  • "Match" results are the closest features from both datasets within the search radius (which in this case is 50 metres). Matches require an additional step of merging and reviewing tags from both datasets.
Check surrounding context
  1. Double click on a result in the list to zoom to it
  2. Download data for the view area to check if there's an existing building relevant to the washroom feature
  3. check amenity=toilets inview (to double confirm that there aren't any nearby buildings already tagged as toilets)
Conflation: "Reference Only" results
  1. Select "conflate" to add the node to the existing_toilets layer
Conflation: "Match" results
  1. Hover over the icons in the "Reference" and "Subject" columns to see the existing tags for each dataset.
    1. The "Tags" column will show if any tags are conflicting. Tags on "No Conflict" matches should still be reviewed (a) in case there are any tagging mistakes or (b) if there is information in the existing tags that is more specific than what is in the reference (import) file. See notes below on both cases.
    2. For matches with tag conflicts, write down some notes about which values should be retained and which should be overwritten. (This is helpful since the tag conflict resolution menu can be ambiguous about which value came from which dataset).
  2. Select "conflate" to merge the tags and resolve any conflicts using the notes you took earlier

Potential tagging mistakes:

  • Tagging that does not confirm to the instructions in Tag:amenity=toilets
  • Gender-segregated washrooms should tagged as one node or way in accordance with One_feature,_one_OSM_element. If the immediately adjacent male and female sections were originally mapped separately, the overall facility should be tagged with male=yes, female=yes, and gender_segregated=yes.
  • Washrooms in this dataset should be access=yes instead of access=permissive
  • Otherwise, if the tagging on the existing feature is different than the tagging in the reference (import) file, assume the existing tagging is correct
Check Geometry (for "Reference Only" results)

For "match" results, defer to the existing geometry unless there is an obvious error (e.g. node is outside of the bounding box of the relevant building).

For "reference only" results, the geometry should be reviewed and edited as follows:

  1. Based on the aerial imagery, the "description" tag on the imported node, and the surrounding context, decide how to map the toilet:
    1. Option 1: leave as a node (e.g. if within a multi-purpose building, e.g. a community centre) — in this case, ensure the node is in a logical space within the building (check the description, and when in doubt, place in the center of the building)
    2. Option 2: copy tags to an existing building (e.g. a standalone washroom building that is not already tagged building=toilets) — in this case, add or edit the building tag to building=toilets, copy the tags from the imported node onto the building, and delete the imported node.
    3. Option 3: draw a building based on aerial imagery (e.g. if a standalone washroom building has not been created yet) — in this case, ensure the aerial imagery aligns with surrounding features, then draw the building, and then add/copy the relevant tags like in Option 2.

If you are especially unsure where to locate a node (option 1), you can add a note, e.g. "Please survey to determine exact location within community centre".

Some general guidelines for when a washroom building is likely to be standalone (and the tags can be copied onto the building along with building=toilets) and when it is likely to be multi-purpose (and the washroom should be kept as a node):

  • Buildings that are also likely to offer change facilities (e.g. adjoining outdoor swimming pools, hockey rinks, or beaches) — map the washroom as a node
  • Community centres — map the washroom as a node
  • Buildings with existing map features indicating e.g. a café or information booth — map the washroom as a node
  • Smaller stand-alone with an area of less than 250-300 square metres — map the washroom as a building (The JOSM Measurement plugin can show you the area of a selected building.)

If very unsure about how to properly map the geometry of a washroom, delete/do not conflate the node, finish the rest of the changeset, and add a note to this wiki requesting a survey of the skipped location (include the washroom's ref ID).

Upload changes

  1. With the existing_toilets layer selected, select the "upload" arrow.
  2. The validator will show any tagging errors detected - the "key with uncommon character error relating to "ref:open.toronto.ca:washroom-facilities:asset_id" can be ignored (hyphens are not included in the wiki documention on discouraged key characters), but any other errors should be fixed before uploading.
  3. Open WARD_changeset_tags.txt, copy the contents. In the "Settings" tab, select the button with three plus signs to paste the tag values for the changeset.
  4. Select "Upload Changes" to publish the changeset

In case a changeset needs to be reverted, we will use the JOSM/Plugins/Reverter plugin.

QA

The script includes the following features for quality assurance:

  • City source data is validated against a schema using pandera — this way, if the assumptions used when writing the transformation functions have changed, the script will throw an informative error
  • The filtered set of City source data is also validated to confirm that there are no service alert–related notes
  • Automatic fixing of known typos and character encoding errors in the City dataset
  • Variable length tag values generated by the script are checked to ensure that they do not exceed the 255 character limit

The protocol outlined above for the conflation and upload steps in JOSM will also help to keep the import thorough and consistent, and will help ensure that the geometry from the City dataset is adjusted as needed to ensure it is correctly positioned relative to surrounding features.

See also

The post to the community forum was sent on 2024-08-11 and can be found here