Mapseg is a program for segmenting maps and producing OSM polygons. The current version only operates on OS street view tiles, but it could be adapted to use other tile shapes in future. It enables semi-automatic tracing of shapes and adding them to the OSM database. Manual checking is required to ensure high-quality data. The software has been tested on Linux and Windows but should work on any platform with Python and the appropriate libraries.
This program was created to extract polygons from OS street view tiles. It depends on Python 2.6 and the shapely, numpy and PIL libraries. Please check that they are installed and available to Python before running. Processing a single tile takes anywhere from under 30 minutes to 7 hours (or more), depending on building density, hardware and software, so this implementation might be regarded as quite slow, but it is sufficient for importing buildings in a limited area.
Guidelines for best practice:
- Don't import data without consulting the local mapping community
- Don't delete existing OSM data without confirming no information is lost
- Don't duplicate buildings in OSM
- Don't import the whole UK area without agreement on talk-gb (this is unlikely to happen)
- Check if alternative data sources are available
- Manually check each polygon imported for accuracy and correct errors
- Only use maps that are out of copyright or have appropriate licenses
- Additional guidelines are available
For shapes that have difficulties in conversion, the program adds warning flags to the output OSM data. These may be used to find and fix problems. Examples of these flags are: "mapseg:extra_node=FIXME", "mapseg:fragment=FIXME", "mapseg:inner-polygon=FIXME", "mapseg:tile-edge=FIXME" and "mapseg:not_orthogonal=FIXME".
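As an illustration of how these flags could be found outside an editor, here is a minimal sketch that lists the ways carrying any "mapseg:" tag. It assumes the output file uses the usual OSM XML layout (way elements with tag children that have k and v attributes); the function name and the sample data, including the negative ids, are invented for the example and are not part of mapseg.

```python
import xml.etree.ElementTree as ET

def find_flagged_ways(osm_xml):
    """Return {way id: [flag keys]} for ways that carry mapseg:* tags.

    Assumes standard OSM XML; Element.iter needs Python 2.7 or later.
    """
    root = ET.fromstring(osm_xml)
    flagged = {}
    for way in root.iter("way"):
        keys = [tag.get("k") for tag in way.findall("tag")
                if tag.get("k", "").startswith("mapseg:")]
        if keys:
            flagged[way.get("id")] = sorted(keys)
    return flagged

# Hypothetical sample data, only to show the shape of the input.
SAMPLE = """<osm version="0.6">
  <way id="-1">
    <nd ref="-10"/>
    <tag k="building" v="yes"/>
    <tag k="mapseg:tile-edge" v="FIXME"/>
  </way>
  <way id="-2">
    <tag k="building" v="yes"/>
  </way>
</osm>"""

if __name__ == "__main__":
    for way_id, keys in sorted(find_flagged_ways(SAMPLE).items()):
        print(way_id + ": " + ", ".join(keys))
```

To use it on a real file, read the file contents first, for example find_flagged_ways(open("su95se.osm").read()).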
The source tag "source=auto_os_street_view" is added to each exported way. I strongly suggest manually corrected polygons have their source tags changed, to distinguish them. The tag "source=OS_OpenData_StreetView" is used for manually corrected buildings, if using OS street view.
First, download the appropriate tiles (see Ordnance_Survey_Opendata#Download_sites).
Check & Install Python Libraries
Check that Python 2.6 and the required libraries are installed (shapely, numpy, PIL, libxml2). On Windows, the 32-bit version seems to be easier to install. The command for a Debian-based Linux is:
sudo apt-get install python-shapely python-numpy python-imaging python-libxml2
Windows downloads for the libxml2 Python module (32-bit) are available at http://users.skynet.be/sbi/libxml-python/ (a 64-bit binary installer does not seem to be available). The other modules are available on their official web pages (links above).
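A quick way to confirm that the libraries are visible to Python is to try importing them. The sketch below does that and reports which ones are missing. The import names are assumptions: PIL is tried under both "Image" and "PIL" because which name mapseg itself imports is not stated here, and the helper names are made up for this example.

```python
# Import names to try for each required library (assumed, see note above).
REQUIRED = {
    "shapely": ["shapely"],
    "numpy": ["numpy"],
    "PIL": ["Image", "PIL"],
    "libxml2": ["libxml2"],
}

def can_import(name):
    """Return True if the module can be imported by this interpreter."""
    try:
        __import__(name)
        return True
    except (ImportError, OSError):
        # OSError covers a binding whose native library fails to load.
        return False

def check_libraries(required=REQUIRED):
    """Map each library to True if any of its import names works."""
    return dict((lib, any(can_import(n) for n in names))
                for lib, names in required.items())

if __name__ == "__main__":
    for lib, ok in sorted(check_libraries().items()):
        print(lib + ": " + ("ok" if ok else "MISSING"))
```

Run it with the same interpreter that will run mapseg.py, since libraries installed for one Python version are not necessarily available to another.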
Prepare Source Tiles
This step is not currently possible, as mapseg does not read the world files of GeoTIFFs.
If you have GDAL installed, smaller tiles can be made using a command like this:
gdal_translate -projwin 454000 340000 455000 339000 sk53nw.tif sk5439.tif
This cuts a 1 km square from the larger tile, which in built-up areas reduces both processing time and the volume of OSM data. Of course, buildings on the boundaries may be split. The command uses the projection in the .tab and .tfw files (OSGB National Grid) and takes the top-left corner followed by the bottom-right corner (unlike an OSM bbox, which runs from bottom-left to top-right).
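Because -projwin expects its values in the order ulx uly lrx lry, the northing of the top edge comes before the northing of the bottom edge, which is easy to get wrong. The following sketch computes the window for a 1 km National Grid square from the kilometre coordinates of its south-west corner and assembles the argument list. The function names are made up for this example.

```python
def projwin_for_km_square(easting_km, northing_km):
    """Return (ulx, uly, lrx, lry) in metres for the 1 km square whose
    south-west corner is at (easting_km, northing_km) kilometres."""
    ulx = easting_km * 1000
    lry = northing_km * 1000
    return (ulx, lry + 1000, ulx + 1000, lry)

def gdal_translate_args(easting_km, northing_km, src, dst):
    """Build the gdal_translate argument list for that square."""
    window = [str(v) for v in projwin_for_km_square(easting_km, northing_km)]
    return ["gdal_translate", "-projwin"] + window + [src, dst]

# Square with south-west corner at easting 454 km, northing 339 km.
print(" ".join(gdal_translate_args(454, 339, "sk53nw.tif", "sk5439.tif")))
# prints: gdal_translate -projwin 454000 340000 455000 339000 sk53nw.tif sk5439.tif
```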
Run the program like this:
python mapseg.py ../path/to/tile/su95se.tif
After a long wait, this should output su95se.osm
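Since each tile can take hours, it may be convenient to queue several tiles and leave them running. The sketch below builds the same command line for each tile and runs them one after another. It assumes mapseg.py is in the current directory and that the output is named after the tile as described above; where the .osm file is written is not specified here, and the helper names are invented for this example.

```python
import os
import subprocess

def mapseg_command(tile_path, script="mapseg.py"):
    """Command line for one tile, in the form shown above."""
    return ["python", script, tile_path]

def expected_output_name(tile_path):
    """su95se.tif -> su95se.osm (the directory is not specified)."""
    base = os.path.splitext(os.path.basename(tile_path))[0]
    return base + ".osm"

def run_tiles(tile_paths, runner=subprocess.call):
    """Run mapseg on each tile in turn; return {output name: exit code}."""
    results = {}
    for path in tile_paths:
        results[expected_output_name(path)] = runner(mapseg_command(path))
    return results

if __name__ == "__main__":
    # Placeholder path from the example above; replace with real tiles.
    tiles = ["../path/to/tile/su95se.tif"]
    for name, code in sorted(run_tiles(tiles).items()):
        print(name + ": exit code " + str(code))
```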
Optionally automatically remove duplicate polygons
Polygons should not be imported if they duplicate polygon data already in OSM. They can be filtered out automatically using the RemoveDups.py program. Download the existing data for the region of interest (I use JOSM); I call this the baseline polygon file. Then use a command similar to:
python RemoveDups.py -i su95se.osm -b existing_su95se.osm -o nodup_su95se.osm
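As a rough sanity check on the filtering, the number of ways before and after can be compared. This sketch only counts way elements in two OSM XML documents; it says nothing about how RemoveDups.py decides what is a duplicate, and it assumes both files use the usual OSM XML layout. The function names are made up for this example.

```python
import xml.etree.ElementTree as ET

def count_ways(osm_xml):
    """Count way elements in an OSM XML string (needs Python 2.7+)."""
    return sum(1 for _ in ET.fromstring(osm_xml).iter("way"))

def summarize(before_xml, after_xml):
    """Return (ways before, ways after, ways removed)."""
    before = count_ways(before_xml)
    after = count_ways(after_xml)
    return (before, after, before - after)

if __name__ == "__main__":
    # File names follow the example command above.
    with open("su95se.osm") as f:
        before_xml = f.read()
    with open("nodup_su95se.osm") as f:
        after_xml = f.read()
    print("before, after, removed: %d, %d, %d" % summarize(before_xml, after_xml))
```

A removed count of zero in an area that already has buildings mapped would be a hint to check that the baseline file covers the same region.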
Visually check output
Load the output into JOSM or a similar editor. Check the buildings against the original map tiles. Remove duplicate buildings. Search for "mapseg" to find flagged polygons and correct them.
Upload selected polygons to OSM
Upload to the OSM database only if you are sure you know what you are doing. (I can't be held responsible for any community backlash!)