Cropping OSM with awk
From OpenStreetMap
Software described on this page or in this section is unlikely to be compatible with API version 0.5, deployed on 8 October 2007. If you have fixed the software, or concluded that this notice does not apply, remove it.
Objective
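You want to extract a region of interest from planet.osm while keeping the OSM format. Usage is described in the comments at the top of the script. The bounding box is defined inside the script, at the line marked DEFINE BOUNDING BOX HERE.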
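To define the bounding box it helps to know how the script sees a line. It splits each line on double quotes, so for a node element the id, latitude and longitude end up in fields 2, 4 and 6. The following sketch illustrates this with plain awk wrapped in a shell function. The function name show_fields and the sample node line are made up for illustration and are not part of the script.

```sh
# Split a line on double quotes, as the script does with FS="\"",
# and print the fields that hold the node id, latitude and longitude.
show_fields() {
    awk 'BEGIN { FS = "\"" }
         { printf "id=%s lat=%s lon=%s\n", $2, $4, $6 }'
}

# Made-up sample line, assuming the attribute order id, lat, lon.
printf '%s\n' '  <node id="42" lat="47.26" lon="11.39" timestamp="2007-01-01T00:00:00Z"/>' | show_fields
# prints: id=42 lat=47.26 lon=11.39
```

The odd-numbered fields hold the text between the quoted values, which is why the script compares $1 with the opening of the tag.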
Tools
You need gawk (GNU awk) to run this script, because it uses the asort() function, which is a gawk extension.
The script
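# crop_osm.awk
# Crop OpenStreetMap data to a bounding box.
# Usage (planet.osm must be in the current directory, because the
# BEGIN and END blocks copy its header and footer with head and tail):
#   gawk -f crop_osm.awk planet.osm > crop.osm
# Authors: Alexander Dusleag, Florian Kindl
BEGIN {
    FS = "\"";
    # copy the first two lines: <?xml ...>, <osm ...>
    system("head -n 2 planet.osm");
    mode = "undef";
    b = 10000;
}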
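# The lines of one node, segment or way are collected in the array buff.
# resetBuff() empties buff without printing it.
# flushBuff() prints the contents of buff in insertion order, then resets it.
function resetBuff()
{
    delete buff;
    # Restart at 10000 so that the indices all have five digits and sort the
    # same way whether asort() compares them as strings or as numbers.
    b = 10000;
    mode = "undef";
}
function flushBuff()
{
    # collect the indices of buff and sort them
    j = 1
    for (i in buff)
    {
        ind[j] = i    # index value becomes element value
        j++
    }
    n = asort(ind)    # index values are now sorted
    for (i = 1; i <= n; i++)
    {
        print buff[ind[i]]
    }
    delete ind;
    resetBuff();
}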
{
    # if mode is undef, check whether a node, segment or way starts on this line
    if (mode == "undef")
    {
        # keep nodes within the bbox and remember their id
        # DEFINE BOUNDING BOX HERE ($4 is lat, $6 is lon)
        if (($1 == " <node id=") && ($4 >= 42) && ($4 <= 50) && ($6 >= 4) && ($6 <= 18))
        {
            mode = "node";
            nodes[$2] = 1;
            buff[b++] = $0;
            # element closed on the same line: print it right away
            if ($9 == "/>")
            {
                flushBuff();
            }
        }
        # keep segments completely within the bbox and remember their id
        else if (($1 == " <segment id=") && ($4 in nodes) && ($6 in nodes))
        {
            mode = "seg";
            segs[$2] = 1;
            buff[b++] = $0;
            if ($9 == "/>")
            {
                flushBuff();
            }
        }
        # start buffering a way; it is kept only if 1 or more of its segments are within the bbox
        else if ($1 == " <way id=")
        {
            mode = "way";
            waysegs = 0;
            buff[b++] = $0;
        }
        else
        {
            #print;
        }
    }
    # if mode is not undef, continue here
    # node, seg: append lines to buff and
    # print it after the closing tag
    else if (mode == "node")
    {
        buff[b++] = $0;
        if ($1 == " </node>")
        {
            flushBuff();
        }
    }
    else if (mode == "seg")
    {
        buff[b++] = $0;
        if ($1 == " </segment>")
        {
            flushBuff();
        }
    }
    # way: count segments within the bbox and
    # print only if 1 or more were found
    else if (mode == "way")
    {
        if ($1 == " <seg id=")
        {
            if ($2 in segs)
            {
                buff[b++] = $0;
                waysegs++;
            }
        }
        else if ($1 == " </way>")
        {
            if (waysegs > 0)
            {
                buff[b++] = $0;
                flushBuff();
            }
            else
            {
                resetBuff();
            }
        }
        else
        {
            buff[b++] = $0;
        }
    }
    else
    {
        print "This will never be printed."
    }
    # tell where we are in planet.osm
    if (NR % 10000 == 0) {
        printf "\r%s", NR > "/dev/stderr";
    }
}
END {
    # copy the last line: </osm>
    system("tail -n 1 planet.osm")
}
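Instead of editing the condition in the script, the bounding box could also be passed in from the command line with awk's -v option. The sketch below is not part of the original script: it only prints the ids of nodes inside the box, and it assumes the attribute order id, lat, lon. The names nodes_in_bbox, minlat, maxlat, minlon and maxlon are hypothetical.

```sh
# Print the ids of all nodes inside a bounding box.
# Arguments: minlat maxlat minlon maxlon (hypothetical parameter names).
nodes_in_bbox() {
    awk -v minlat="$1" -v maxlat="$2" -v minlon="$3" -v maxlon="$4" '
        BEGIN { FS = "\"" }
        # $4 is lat, $6 is lon; adding 0 forces a numeric comparison
        $1 ~ /<node id=$/ &&
        $4 + 0 >= minlat + 0 && $4 + 0 <= maxlat + 0 &&
        $6 + 0 >= minlon + 0 && $6 + 0 <= maxlon + 0 { print $2 }
    '
}

# Made-up sample nodes: only the first lies inside the box used in the script.
printf '%s\n' \
    '  <node id="1" lat="47.26" lon="11.39"/>' \
    '  <node id="2" lat="52.50" lon="13.40"/>' \
    '  <node id="3" lat="43.00" lon="2.00"/>' | nodes_in_bbox 42 50 4 18
# prints: 1
```

The same four variables could replace the literal numbers in the line marked DEFINE BOUNDING BOX HERE, so that one copy of the script serves several regions.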
Authors
Alexander Dusleag and Florian Kindl, Tirol Atlas (University of Innsbruck)
For any questions, write to:
- a . dusleag (a) uibk . ac . at
- florian . kindl (a) uibk . ac . at

