Import Kokomo address data
The city of Kokomo, Indiana, USA offers GIS data online. Among their data sets is address points for the city in ESRI Shapefile format. This proposal details how this data might be imported into OpenStreetMap to enhance data for Kokomo, IN, especially geocoding results.
Almost all of the data has been imported; the remaining steps are to clean up and fix any problematic data points.
I was provided with a PDF on the city's letterhead stating, "This letter is to confirm that the City of Kokomo has no objections to geodata derived in part from Kokomo GIS data being incorporated into the OpenStreetMap project geodata database and released under free and open license."
Additionally, some members of the OSM community believe that address data in the USA is factual information that cannot be copyrighted.
Current state of Kokomo OSM data
Currently, the data in Kokomo mostly consists of road information from TIGER. There are few building outlines, and only two features in the city have address data. Some major buildings are traced, but many classes of features, ranging from parks to water, are incomplete or missing. There is limited editing activity in Kokomo; most contributors appear to make hit-and-run edits and are not focused on the city.
The data provided to me is in ESRI Shapefile format, with approximately 36,000 points.
Each point carries only the street address as a single text label, for example, "303 E Superior St". These labels will have to be separated into addr:housenumber and addr:street tags, and the abbreviations in the addr:street values will have to be expanded. For example,
TEXT=303 E Superior St
addr:housenumber=303
addr:street=East Superior Street
Most of this can be done automatically. Intersections of two county roads may need some care to parse, but as a happy side effect of the city's data processing, a large portion of the addresses have multiple spaces between the house number and the street name (see example above), which should make the split point easier to detect.
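As a rough illustration of the automatic conversion, the following is a minimal sketch in Python. The function name parse_address and the abbreviation tables are assumptions made for this example, not part of the city's data or of any existing tool, and the tables would need to be extended to cover every abbreviation that actually occurs. Labels whose house number is not purely numeric, as well as county road addresses, would need separate handling.

```python
import re

# Assumed abbreviation tables for illustration; the real data may need more entries.
DIRECTIONS = {"N": "North", "S": "South", "E": "East", "W": "West"}
SUFFIXES = {"St": "Street", "Ave": "Avenue", "Rd": "Road", "Dr": "Drive",
            "Ct": "Court", "Ln": "Lane", "Blvd": "Boulevard"}


def parse_address(text):
    """Split a raw label into OSM address tags.

    Returns None when the label does not start with a numeric house number,
    e.g. for points that are labeled with a street name only.
    """
    # One or more spaces may separate the house number from the street name.
    match = re.match(r"^\s*(\d+)\s+(\S.*?)\s*$", text)
    if match is None:
        return None
    housenumber, street = match.groups()
    words = street.split()
    # Expand a leading directional prefix, but only if a street name follows it.
    if len(words) > 1 and words[0].rstrip(".") in DIRECTIONS:
        words[0] = DIRECTIONS[words[0].rstrip(".")]
    # Expand a trailing street-type suffix.
    if len(words) > 1 and words[-1].rstrip(".") in SUFFIXES:
        words[-1] = SUFFIXES[words[-1].rstrip(".")]
    return {"addr:housenumber": housenumber, "addr:street": " ".join(words)}
```

Called on a label with several spaces after the house number, such as parse_address("303   E Superior St"), the sketch produces the two tags from the example; a label without a leading number yields None and can be dropped.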
The points seem to be on parcel centers, which is not surprising as the city doesn't seem to have building outlines readily available. In some areas, there are points that are only labeled with a street name, without a house number, but these do not seem to correspond to any actual buildings and would be automatically filtered out.
The city also has a KML file online with the house numbers and coordinates (but no street names). This data set does not contain points without house numbers, and does not have the duplicate points that are present in the Shapefile. The plan is to cross-reference all Shapefile points against the KML, and exclude any point that does not have a matching house number in both data sets.
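The cross-reference could look roughly like the sketch below. It assumes both data sets have already been read into lists of (lat, lon, housenumber) tuples, and that a matching KML point lies within a small distance of the Shapefile point. The function name cross_reference and the tolerance value are assumptions for illustration; the distance is compared in raw degrees, which is crude but may be adequate at the scale of a single city.

```python
import math
from collections import defaultdict


def cross_reference(shp_points, kml_points, tolerance_deg=0.0005):
    """Keep Shapefile points that have a nearby KML point with the same house number.

    Both inputs are iterables of (lat, lon, housenumber) tuples.
    tolerance_deg is an assumed threshold in degrees, not a value from the city.
    """
    # Index the KML points by house number so each lookup stays small.
    kml_by_number = defaultdict(list)
    for lat, lon, number in kml_points:
        kml_by_number[number].append((lat, lon))

    kept = []
    for lat, lon, number in shp_points:
        candidates = kml_by_number.get(number, [])
        # Accept the point if any KML point with the same number is close enough.
        if any(math.hypot(lat - klat, lon - klon) <= tolerance_deg
               for klat, klon in candidates):
            kept.append((lat, lon, number))
    return kept
```

Points rejected by this step are worth listing separately for manual review, since a rejection may indicate a coordinate offset between the two files rather than a bad address.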
- Convert shapefile into OSM file (done)
- Filter nodes that are only street names or street-street intersections (i.e. don't have house numbers)
- Filter nodes that are not also present in the KML with a matching house number
- Convert node names to addr:housenumber and addr:street tags
- Identify any street name mismatches between the Kokomo data and OSM, and correct those as needed
- Split data up into multiple rectangular chunks, each spanning 0.01 degrees on both sides
- Manually import each chunk, merging duplicate features and removing erroneous features as needed
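The chunking step can be sketched as follows, assuming each node is a dictionary with "lat" and "lon" keys in decimal degrees. The names chunk_key and split_into_chunks are hypothetical; the 0.01 degree cell size is the one given in the list above.

```python
import math
from collections import defaultdict

CHUNK_DEG = 0.01  # side length of each chunk in degrees, as given in the plan


def chunk_key(lat, lon, size=CHUNK_DEG):
    """Return the integer (row, column) index of the grid cell containing a point."""
    return (math.floor(lat / size), math.floor(lon / size))


def split_into_chunks(nodes, size=CHUNK_DEG):
    """Group nodes into rectangular chunks keyed by grid cell index."""
    chunks = defaultdict(list)
    for node in nodes:
        chunks[chunk_key(node["lat"], node["lon"], size)].append(node)
    return dict(chunks)
```

Each resulting group can then be written to its own OSM file and imported by hand. Points that fall exactly on a cell boundary may land on either side because of floating-point rounding, which does not matter for this purpose as long as every point ends up in exactly one chunk.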