User:B1tw153/Maintaining Census Designated Places

From OpenStreetMap Wiki

Many Census Designated Place (CDP) boundaries were imported into OSM using the TIGER/Line 2008 data from the US Census Bureau. Many of the CDP boundaries were significantly modified in the 2010 Census. Although the CDP boundaries were more stable in the 2020 Census, there may have been smaller changes to the boundaries in many places.

Why keep CDP boundaries in OSM?

Especially in rural and remote places, the CDP boundaries are the only definition that OSM has for the area of communities. While the exact location of the CDP boundaries is only of interest if you're working directly with Census data, the CDP boundaries are useful in OSM to geocode areas and associate them with place names. In urban areas and incorporated municipalities, there are likely already administrative boundaries in OSM that serve this purpose. In these areas, the CDP boundaries may not be meaningful in OSM. Consult with mappers active in the local area before modifying or deleting CDP boundaries in these areas.

Keeping CDP boundaries up to date

Older imported CDP boundaries are often tagged with source=TIGER/Line® 2008 Place Shapefiles (http://www.census.gov/geo/www/tiger/) or similar tags. Where the CDP boundaries are useful, they can be updated to use the latest TIGER data.

Confirm that you have the right elements

There are three types of elements that are relevant when updating a CDP boundary. All of these elements should have gnis:feature_id=* tags but you may find some elements that do not have this tag.

  • The CDP boundary outline, as a single way or a relation. Look up the Feature ID to confirm that the GNIS record is for a Census Designated Place in the Census class. The name in GNIS should end with "Census Designated Place." If it ends with "Division," then the record is for a Census Division which is a different type of area. If the name in GNIS has no suffix at all (i.e., just the place name), you may be looking at the GNIS record for the label node. Try to find another record for the place that ends with "Census Designated Place." Make sure you can find the correct GNIS record for the CDP.
  • The label node. Look up the Feature ID to confirm that the GNIS record is in the Populated Place class, or in the Census class but without the "Census Designated Place" suffix on the name.
  • An administrative boundary. If the place is incorporated, it may have a separate administrative boundary. The corresponding record in GNIS will be in the Civil class. If you find one of these records in GNIS, consider carefully whether the CDP boundary is necessary in OSM or whether it would be appropriate for OSM to have only the administrative boundary.

If you encounter any GNIS records with "(historical)" as a suffix, these places are likely no longer in existence. Sometimes historical names remain in local use, in which case it is appropriate to retain them in OSM. Otherwise, it may be better to update the related elements in OSM with a current name, or remove them if the name is no longer in use.

You may find that the GNIS Feature IDs associated with these elements have been mixed up or combined in OSM. Sometimes this happens when mappers see two different elements with the same name and think that they represent the same thing. This is a good time to straighten things out. The "Census Designated Place" ID goes on the CDP boundary. The ID for the Populated Place (or Census without "Census Designated Place" in the name) goes on the label node. And the Civil ID goes on the administrative boundary.

The GNIS Feature IDs for the CDP and administrative boundary correspond directly to the Place National Standard Code (PLACENS) in TIGER data. These IDs will help you find the right elements in the TIGER data.

Get the latest TIGER data

Start with the TIGERweb site which provides an interactive view of TIGER data. Turn on the Places and County Subdivisions layer. If you expand this category, you can enable or disable individual layers for the Incorporated Places and Census Designated Places. Zoom in on the area you're interested in until these layers are visible. Use the Identify tool (circle with an "i" at the top right) to click on the relevant CDP (or administrative boundary) then select the boundary from the results on the left. This will pop up a window with all the details of the TIGER record. Double check the name and the Place NS Code, which should match the appropriate GNIS Feature ID. Note the State FIPS Code and Geographic Identifier because these will help you find the right GIS files to download.

With the information from the TIGERweb site, go to the TIGER/Line Shapefiles page. Find the latest release of Census Bureau data with "all legal boundaries and names." Sometimes this may be in a prior year's data set. Click the link to go to the FTP archive, which is really a web page but looks like FTP. Select the "PLACE" folder, then download the Zip file numbered with the State FIPS Code you got from the TIGERweb site.
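
If you prefer to construct the download link directly, the file naming can be sketched in Python. This is a minimal sketch that assumes the archive follows the pattern TIGER<year>/PLACE/tl_<year>_<State FIPS>_place.zip; the function name is hypothetical, and you should confirm the resulting link against the archive listing before relying on it.

```python
def place_zip_url(year: int, state_fips: str) -> str:
    """Build the expected download URL for a state's PLACE Shapefile.

    Assumption: the archive uses the pattern
    TIGER<year>/PLACE/tl_<year>_<State FIPS>_place.zip.
    """
    fips = state_fips.strip().zfill(2)  # State FIPS codes are two digits, e.g. "06"
    if not (fips.isdigit() and len(fips) == 2):
        raise ValueError(f"not a two-digit State FIPS code: {state_fips!r}")
    return (
        f"https://www2.census.gov/geo/tiger/TIGER{year}/PLACE/"
        f"tl_{year}_{fips}_place.zip"
    )


# State FIPS Code as noted from the TIGERweb record (here "6", padded to "06").
print(place_zip_url(2023, "6"))
```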

Transform the TIGER data to WGS84

The TIGER data set (like many other US government data sets) uses the NAD83 coordinate system while OSM uses WGS84. If you import the raw TIGER data into JOSM, the coordinates won't be transformed. For most parts of the US, this results in errors of up to a couple of meters. Administrative boundaries are often drawn to align with land parcels, roads, or natural features. An error of a couple of meters is noticeable when the boundary doesn't match up with a road centerline or a lot boundary. Administrative boundaries are not verifiable on the ground, so these errors often cannot be corrected using other sources.
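
To get a feel for how small these offsets are in coordinate terms, you can measure the distance between the same vertex before and after transformation. The following is a rough sketch using an equirectangular approximation, not a geodetic-grade calculation; the function name and the sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; adequate for offsets of a few meters


def offset_m(lat1, lon1, lat2, lon2):
    """Approximate ground distance in meters between two nearby points.

    Uses an equirectangular approximation, which is reasonable only for
    points that are very close together.
    """
    mean_lat = math.radians((lat1 + lat2) / 2)
    dx = math.radians(lon2 - lon1) * math.cos(mean_lat) * EARTH_RADIUS_M
    dy = math.radians(lat2 - lat1) * EARTH_RADIUS_M
    return math.hypot(dx, dy)


# Hypothetical vertex before and after transformation.
print(round(offset_m(40.0, -105.0, 40.00001, -105.00001), 2))
```

At 40 degrees north, a difference of 0.00001 degrees in both latitude and longitude works out to roughly 1.4 meters, which is the scale of error described above.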

Transforming the TIGER data from NAD83 to WGS84 takes a few steps, and you have to be careful to get everything right. Or for the moment, you can use the 2023 TIGER/Line data that Ray Vanlandingham has kindly transformed into WGS84.

  1. If you don't already have the QGIS software, download and install it.
  2. Go to the PROJ.org Datumgrid CDN, download all the us_noaa_nadcon5_* files, and move them to the /share/proj/ directory in the QGIS installation directory.
  3. In QGIS, use Layer > Add Layer > Add Vector Layer to open the raw TIGER Shapefile.
  4. Then use Vector > Data Management Tools > Reproject Layer to start the process of transforming the coordinates. Be careful to get all the settings right at each step.
    1. Select the correct input layer. For the first pass, start with the raw TIGER Shapefile. For subsequent passes, use the last layer created by the previous step. You can confirm that you have the correct input layer by checking the Authority ID (EPSG:NNNN) against the transformation steps in the tables below.
    2. Select the correct target coordinate reference system (Target CRS). The drop down menu only has recently used CRSs, so click the small globe button to get the full list. You can use the filter to list coordinate reference systems by name, but you may need to manually expand the Geographic (2D) category to see the results.
    3. If it's not already open, flip the toggle to open the Advanced Parameters. This provides a list of coordinate operations for the transformation. Select the correct coordinate operation identified in the tables below.
    4. Verify that the transformation will use a NADCON 5 grid shift. The transformation command is displayed in monospaced font below the Identifiers label in the Advanced Parameters section. One of the command parameters should be +grids=us_noaa_nadcon5_.... If that's not present in the command, double check the settings above.
    5. Run the transformation. Pay attention to any warning messages that may appear at the top of the Map View (behind the Reproject Layer window). If you get a warning that QGIS used a "ballpark" transformation, you may have messed something up. (Sometimes QGIS displays these warnings when it renders a layer in the Map View, which is separate from the layer transformation.) If it doesn't look like the transformation worked, delete the layer you just created and try again.
    6. Click the Change Parameters button or the Parameters tab, then repeat these steps for all the transformations in the tables below until you complete the transformation to WGS84.
  5. After the final transformation, close the Reproject Layer window and click on the small "chip" button on the top layer in the Layers list. This button allows you to save the layer to a file. Save the layer as an ESRI Shapefile.
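
If you copy the transformation command out of the Advanced Parameters section, the NADCON 5 check described in the steps above can be expressed as a small Python helper. This is only a sketch: the function name is hypothetical, and the sample command strings and grid file name are illustrative, not copied from QGIS.

```python
def uses_nadcon5(proj_command: str) -> bool:
    """Return True if a PROJ command string references a NADCON 5 grid file."""
    for token in proj_command.split():
        if token.startswith("+grids=") and "us_noaa_nadcon5_" in token:
            return True
    return False


# Illustrative command strings; the grid file name is a placeholder.
with_grid = "+proj=pipeline +step +proj=hgridshift +grids=us_noaa_nadcon5_example.tif"
without_grid = "+proj=noop"

print(uses_nadcon5(with_grid))     # True
print(uses_nadcon5(without_grid))  # False
```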

You really do need to go through all the individual transformation steps to get accurate results. If you try to transform the data directly to WGS84, the transformation in QGIS does nothing but increase the error bounds on the coordinates.

Transformations for the Conterminous US (Lower 48)

| Sequence | From: Coordinate System | From: Authority ID | To: Coordinate System | To: Authority ID | Transformation | Description |
|---|---|---|---|---|---|---|
| 1 | NAD83(1986) | EPSG:4269 | NAD83(HARN) | EPSG:4152 | NAD83 to NAD83(HARN) (47) - DERIVED_FROM(EPSG):8556 | United States (USA) - CONUS onshore |
| 2 | NAD83(HARN) | EPSG:4152 | NAD83(FBN) | EPSG:8860 | NAD83(HARN) to NAD83(FBN) (3) - DERIVED_FROM(EPSG):8865 | United States (USA) - CONUS onshore |
| 3 | NAD83(FBN) | EPSG:8860 | NAD83(NSRS2007) | EPSG:4759 | NAD83(FBN) to NAD83(NSRS2007) (1) - DERIVED_FROM(EPSG):8862 | United States (USA) - CONUS onshore |
| 4 | NAD83(NSRS2007) | EPSG:4759 | NAD83(2011) | EPSG:6318 | NAD83(NSRS2007) to NAD83(2011) (1) - DERIVED_FROM(EPSG):8559 | United States (USA) - CONUS onshore |
| 5 | NAD83(2011) | EPSG:6318 | WGS 84 (G2139) | EPSG:9755 | (Only one option here - use the default) | |

Transformations for Alaska

| Sequence | From: Coordinate System | From: Authority ID | To: Coordinate System | To: Authority ID | Transformation | Description |
|---|---|---|---|---|---|---|
| 1 | NAD83(1986) | EPSG:4269 | NAD83(HARN) | EPSG:4152 | NAD83 to NAD83(HARN) (48) - DERIVED_FROM(EPSG):8550 | United States (USA) - Alaska including EEZ |
| 2 | NAD83(HARN) | EPSG:4152 | NAD83(NSRS2007) | EPSG:4759 | NAD83(HARN) to NAD83(NSRS2007) (2) - DERIVED_FROM(EPSG):8551 | United States (USA) - Alaska including EEZ |
| 3 | NAD83(NSRS2007) | EPSG:4759 | NAD83(2011) | EPSG:6318 | NAD83(NSRS2007) to NAD83(2011) (2) - DERIVED_FROM(EPSG):8552 | United States (USA) - Alaska |
| 4 | NAD83(2011) | EPSG:6318 | WGS 84 (G2139) | EPSG:9755 | (Only one option here - use the default) | |

Transformations for Puerto Rico and the US Virgin Islands

| Sequence | From: Coordinate System | From: Authority ID | To: Coordinate System | To: Authority ID | Transformation | Description |
|---|---|---|---|---|---|---|
| 1 | NAD83(1986) | EPSG:4269 | NAD83(HARN) | EPSG:4152 | NAD83 to NAD83(HARN) (22) - DERIVED_FROM(EPSG):1495 | Puerto Rico and US Virgin Islands - onshore |
| 2 | NAD83(HARN) | EPSG:4152 | NAD83(HARN corrected) | EPSG:8545 | NAD83(HARN) to NAD83(HARN Corrected) (1) - DERIVED_FROM(EPSG):9181 | Puerto Rico and US Virgin Islands - onshore |
| 3 | NAD83(HARN corrected) | EPSG:8545 | NAD83(FBN) | EPSG:8860 | NAD83(HARN Corrected) to NAD83(FBN) (1) - DERIVED_FROM(EPSG):8867 | Puerto Rico and US Virgin Islands - onshore |
| 4 | NAD83(FBN) | EPSG:8860 | NAD83(NSRS2007) | EPSG:4759 | NAD83(FBN) to NAD83(NSRS2007) (2) - DERIVED_FROM(EPSG):8868 | Puerto Rico and US Virgin Islands - onshore |
| 5 | NAD83(NSRS2007) | EPSG:4759 | NAD83(2011) | EPSG:6318 | NAD83(NSRS2007) to NAD83(2011) (3) - DERIVED_FROM(EPSG):8673 | Puerto Rico and US Virgin Islands - onshore |
| 6 | NAD83(2011) | EPSG:6318 | WGS 84 (G2139) | EPSG:9755 | (Only one option here - use the default) | |

Transformations for Hawaii (Pacific Plate)

| Sequence | From: Coordinate System | From: Authority ID | To: Coordinate System | To: Authority ID | Transformation | Description |
|---|---|---|---|---|---|---|
| 1 | NAD83(1986) | EPSG:4269 | NAD83(HARN) | EPSG:4152 | NAD83 to NAD83(HARN) (49) - DERIVED_FROM(EPSG):8660 | United States (USA) - Hawaii - main islands onshore |
| 2 | NAD83(HARN) | EPSG:4152 | NAD83(PA11) | EPSG:6322 | NAD83(HARN) to NAD83(PA11) (1) - DERIVED_FROM(EPSG):8661 | United States (USA) - Hawaii - main islands onshore |
| 3 | NAD83(PA11) | EPSG:6322 | WGS 84 (G2139) | EPSG:9755 | (Only one option here - use the default) | |
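
Because each pass must start from the CRS that the previous pass produced, it can help to write a sequence down as data and check that it forms an unbroken chain before starting. The sketch below uses the Authority IDs listed for the conterminous US; the function name and the structure are hypothetical and only check the ordering, not the transformation itself.

```python
# Each step is (source Authority ID, target Authority ID), as listed for the Lower 48.
CONUS_STEPS = [
    ("EPSG:4269", "EPSG:4152"),  # NAD83(1986) -> NAD83(HARN)
    ("EPSG:4152", "EPSG:8860"),  # NAD83(HARN) -> NAD83(FBN)
    ("EPSG:8860", "EPSG:4759"),  # NAD83(FBN) -> NAD83(NSRS2007)
    ("EPSG:4759", "EPSG:6318"),  # NAD83(NSRS2007) -> NAD83(2011)
    ("EPSG:6318", "EPSG:9755"),  # NAD83(2011) -> WGS 84 (G2139)
]


def check_chain(steps, start="EPSG:4269", end="EPSG:9755"):
    """Return a list of problems found in a transformation sequence."""
    problems = []
    if steps[0][0] != start:
        problems.append(f"first step starts at {steps[0][0]}, expected {start}")
    for i in range(len(steps) - 1):
        produced, consumed = steps[i][1], steps[i + 1][0]
        if produced != consumed:
            problems.append(
                f"step {i + 1} produces {produced} but step {i + 2} expects {consumed}"
            )
    if steps[-1][1] != end:
        problems.append(f"last step ends at {steps[-1][1]}, expected {end}")
    return problems


print(check_chain(CONUS_STEPS))  # an empty list means the chain is unbroken
```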

Transformations for Guam, American Samoa, and the Northern Mariana Islands

At this time, it is unclear how to transform the TIGER/Line files for these areas to WGS84.

Import the data into OSM

After transforming the TIGER/Line data as described above (don't skip this step!), use JOSM with the opendata plugin to select the transformed Shapefile and open it.

Zoom to the place you're working on and locate the correct boundary in the TIGER/Line data. Confirm that you have the correct boundary by matching the PLACENS field in the TIGER/Line data to the GNIS Feature ID for the Census Designated Place (or Civil boundary if you're working with an incorporated municipality). Sometimes the TIGER/Line data will have several separate ways with the same attributes rather than combining them into a single multipolygon relation. Make sure you get all the relevant data by searching for matching values of the PLACENS field. Merge the boundary way or multipolygon to your working OSM data layer.
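
The matching step can be sketched in Python as a filter over the attribute records. This sketch assumes the records have already been read into plain dictionaries, and that PLACENS is a digit string that may carry leading zeros while the GNIS Feature ID in OSM usually does not; the function name, the way_id key, and all sample values are hypothetical.

```python
def features_for_place(features, gnis_feature_id):
    """Select every feature whose PLACENS matches the given GNIS Feature ID.

    Comparing as integers ignores leading zeros (an assumption about how
    the two sources format the same ID).
    """
    wanted = int(gnis_feature_id)
    return [f for f in features if int(f["PLACENS"]) == wanted]


# Hypothetical attribute records; a place split into two separate ways.
features = [
    {"PLACENS": "02407842", "NAME": "Example", "way_id": -101},
    {"PLACENS": "02407842", "NAME": "Example", "way_id": -102},
    {"PLACENS": "02409999", "NAME": "Elsewhere", "way_id": -103},
]

matches = features_for_place(features, "2407842")
print([f["way_id"] for f in matches])  # [-101, -102]
```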

Updating Simple Geometry

For rural and remote CDPs and municipalities, the boundary data in OSM may be a simple way or a relation with an outer way and label node. If that's the case, updating the geometry is simple.

In your OSM data layer, select both the old boundary way and the new boundary way, and use the Replace Geometry tool (Ctrl/Cmd-Shift-G) to update the geometry of the way. This preserves the history of the original way and the history of many of the existing nodes. It also merges all the tags on the two ways.

If there is no boundary relation for the place, create one. Add the boundary way to the relation with the outer role and the place node with the label role. Move all the tags from the boundary way to the boundary relation.

Updating Complex Geometry

In urban and suburban areas, the boundaries of municipalities and CDPs are typically constructed using multipolygons with individual ways shared with other civil boundaries. In this case, the process of updating the geometry is more complex.

  1. In your OSM data layer, select the first way that is a member of the existing boundary relation. Download data from OSM along this way. Using a small offset of 2 meters around the way can help keep download sizes small in dense urban areas.
  2. If the way has any feature tags on it (e.g., barrier=fence) remap the relevant features so they're no longer conflated with the boundary way.
  3. Unglue this way (G key) from other features and if you're prompted, leave tags on the existing nodes to separate the boundary way from other conflated features. This will also break the relations that the way is a member of, but you'll be reconnecting those relations as you work.
  4. There may be tagged nodes on the way where someone has placed a gate, stop sign, etc. onto the boundary way. Get info on the boundary way (Cmd/Ctrl I) to get its ID then search (Cmd/Ctrl F) for tagged nodes in the way using this query with the ID of the way: (child id:....) and (-tags:0). Remap the relevant features so they're no longer conflated with the boundary way.
  5. Select suitable start and end points on the new boundary way from TIGER and split it (P key) so that it matches the length of the old boundary way as closely as possible.
  6. Decide whether to keep the old boundary way or use the new boundary way. If this boundary segment is a component of something much bigger than a municipality, like a state or international border, you probably want to keep the old boundary way. Also, if the boundary segment follows an indeterminate feature like coastline, it's likely that the TIGER data is not very good and it would be better to keep the old boundary way.
    1. If you're keeping the old boundary way, delete the new segment from TIGER. If appropriate, adjust the endpoints of the old boundary way so that they are connected to the new TIGER segments on either end.
    2. If you're using the new boundary way, select both the old and new boundary ways and use Replace Geometry (Cmd/Ctrl-Shift-G) to update the old way with the geometry from the new way. This preserves as much of the history as possible. The endpoints of the way will likely now be disconnected from the adjacent ways that were part of other boundaries. Move the endpoints of the old adjacent ways to connect them to the ends of the new segment from TIGER.
  7. Select the next way (working either clockwise or counterclockwise) in the boundary relation and repeat this process from step 2 until you've gone through all the ways in the boundary relation.
  8. If everything went well, all the relations that share ways with the boundary you're working on should be reassembled. Check them by selecting the members of the boundary relation, then going through the other relations that they're members of one by one to confirm that all the relations are fully connected and there are no gaps.
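
The gap check in the last step can be illustrated with a short Python sketch. It assumes each member way is represented as a list of node IDs and that the relation consists of simple closed rings, so every endpoint node should be shared by exactly two way ends. The function name and the sample IDs are hypothetical, and the JOSM validator remains the practical tool for this.

```python
from collections import Counter


def find_gaps(ways):
    """Return endpoint node IDs that are not shared by exactly two way ends.

    `ways` maps a way ID to its ordered list of node IDs. An empty result
    means the ways close up into rings with no loose ends.
    """
    ends = Counter()
    for nodes in ways.values():
        ends[nodes[0]] += 1
        ends[nodes[-1]] += 1  # a closed way counts its single end node twice
    return sorted(node for node, count in ends.items() if count != 2)


# Hypothetical boundary made of three ways joined end to end.
ring = {1: [10, 11, 12], 2: [12, 13, 14], 3: [14, 15, 10]}
print(find_gaps(ring))  # []

# Removing one member leaves two loose ends.
broken = {1: [10, 11, 12], 2: [12, 13, 14]}
print(find_gaps(broken))  # [10, 14]
```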

Tag Cleanup

Clean up the tags on the relation by deleting all the tags that came from the TIGER/Line data and any tiger:* tags that came from the boundary in OSM. Keep, correct, or add the admin_level=*, boundary=*, gnis:feature_id=*, name=*, place=*, source=*, wikipedia=*, and wikidata=* tags. Census Designated Places should be tagged with boundary=census and no admin_level tag and Incorporated Places should be tagged with boundary=administrative and admin_level=*. Update the source=* tag on the boundary relation and all the member ways with a reference to the TIGER/Line data you used.
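
The cleanup rules can be summarized in a short Python sketch. It assumes that attributes carried over from the Shapefile arrive as all-uppercase keys (for example PLACENS or NAMELSAD) while OSM keys are lowercase; both function names and the sample tags are hypothetical, and the result still needs a manual review.

```python
def clean_relation_tags(tags: dict) -> dict:
    """Drop tiger:* keys and raw Shapefile attributes from a tag dictionary.

    Assumption: Shapefile attributes are the all-uppercase keys.
    """
    return {
        key: value
        for key, value in tags.items()
        if not key.startswith("tiger:") and not key.isupper()
    }


def boundary_tag_problems(tags: dict) -> list:
    """Check the boundary=* and admin_level=* combination described above."""
    problems = []
    boundary = tags.get("boundary")
    if boundary == "census" and "admin_level" in tags:
        problems.append("boundary=census should not have admin_level")
    if boundary == "administrative" and "admin_level" not in tags:
        problems.append("boundary=administrative needs admin_level")
    return problems


# Hypothetical merged tags after Replace Geometry.
merged = {
    "PLACENS": "02407842",
    "NAMELSAD": "Example CDP",
    "tiger:reviewed": "no",
    "name": "Example",
    "boundary": "census",
    "admin_level": "8",
}
cleaned = clean_relation_tags(merged)
print(sorted(cleaned))                 # ['admin_level', 'boundary', 'name']
print(boundary_tag_problems(cleaned))  # one problem: admin_level on a CDP
```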

While you're working on this, add or update the population=*, population:date=*, and source:population=* tags using data from the Census Bureau Data Table.

Note that the property tags for the CDP (or administrative) boundary do not belong on the way(s) that make up the boundary. The only tag that belongs on these ways is the source=2023 TIGER/Line Shapefiles (https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html) tag so that it's easy to see when the boundary was last updated. Keep these ways clean so that they can be reused in adjacent boundary relations.

Check your work

You should have:

  1. A place node with a GNIS Feature ID that corresponds to a record in the Populated Place class, or a record in the Census class that does not have "Census Designated Place" as a suffix to the name. The place node should have place=*, gnis:feature_id=*, name=*, official_name=*, population=*, population:date=*, source:population=*, website=*, wikipedia=*, and wikidata=* tags.
  2. An updated boundary way with no tags except source=* but with the full history of the previous boundary ways.
  3. A boundary relation with the place node and boundary way as members and with correct admin_level=*, boundary=*, border_type=*, gnis:feature_id=*, name=*, official_name=*, population=*, population:date=*, source:population=*, website=*, wikipedia=*, and wikidata=* tags.
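
The tag lists above can also be turned into a quick review aid. The following Python sketch only reports which of the listed keys are absent; some of them, such as website=* or wikipedia=*, may legitimately not exist for a given place. The names and the sample tags are hypothetical.

```python
# Keys listed above for the place node.
PLACE_NODE_KEYS = [
    "place", "gnis:feature_id", "name", "official_name", "population",
    "population:date", "source:population", "website", "wikipedia", "wikidata",
]


def missing_keys(tags: dict, expected: list) -> list:
    """Return the expected keys that are absent, in checklist order."""
    return [key for key in expected if key not in tags]


# Hypothetical place node tags.
node_tags = {
    "place": "hamlet",
    "name": "Example",
    "gnis:feature_id": "1234567",
    "population": "120",
}
print(missing_keys(node_tags, PLACE_NODE_KEYS))
```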

If it all looks good, upload the changeset. Congratulations! You've just updated the CDP boundary so that it can be used for geocoding!