OSM Fixer

From OpenStreetMap Wiki
Jump to: navigation, search

OSM Fixer is an attempt to remove duplicate k/v pairs in the tag tables as well as remove referential violations that can only be spotted on extensive search. OSM Fixer runs on MonetDB and uses the normalised schema's of Cherokee/MonetDB Handler OSM.

SQL

Searches for referential violations that would come up as foreign key violations in relations.

select distinct relation_members_node.relation as relation, to_node from
relation_members_node left join nodes_legacy on relation_members_node.to_node =
nodes_legacy.id where nodes_legacy.id is NULL;
select distinct relation_members_way.relation as relation, to_way from
relation_members_way left join ways on relation_members_way.to_way =
ways.id where ways.id is NULL;
select distinct relation_members_relation.relation as relation, to_relation from
relation_members_relation left join relations on relation_members_relation.to_relation =
relations.id where relations.id is NULL;


The above three combined.

select distinct * from (
select distinct relation_members_way.relation as relation from
relation_members_way left join ways on relation_members_way.to_way =
ways.id where ways.id is NULL union select distinct relation_members_relation.relation as relation from
relation_members_relation left join relations on relation_members_relation.to_relation =
relations.id where relations.id is NULL union select distinct relation_members_node.relation as relation from
relation_members_node left join nodes_legacy on relation_members_node.to_node =
nodes_legacy.id where nodes_legacy.id is NULL ) as bigquery order by relation;


Searches for foreign key violations in the way/node table.

select distinct way_nds.way as way, to_node from
way_nds left join nodes_legacy on way_nds.to_node =
nodes_legacy.id where nodes_legacy.id is NULL;


Searches for duplicate key usage in the tags.

select count(node), node, k, v from node_tags group by node, k, v having count(node) > 1;
select count(way), way, k, v from way_tags group by way, k, v having count(way) > 1;
select count(relation), relation, k, v from relation_tags group by relation, k, v having count(relation) > 1;

Searches for duplicate references in relations.

select count(relation), relation, to_node, role from relation_members_node group by relation, to_node, role having count(relation) > 1;
select count(relation), relation, to_way, role from relation_members_way group by relation, to_way, role having count(relation) > 1;
select count(relation), relation, to_relation, role from relation_members_relation group by relation, to_relation, role having count(relation) > 1;

Searches for duplicate references in ways.

select count(way), way, to_node from way_nds group by way, to_node having count(way) > 1;

Bash

The following script removes duplicate tags.

ls -1 | while read i; do
grep -v "<tag" $i | grep -v "</way>" | grep -v "</osm>" > /tmp/toupload/$i
sort -u $i | grep "<tag" | grep -v 'k="created_by"' >> /tmp/toupload/$i
echo '    <tag k="created_by" v="OSM Fixer"/>' >> /tmp/toupload/$i
echo '  </way>' >> /tmp/toupload/$i
echo '</osm>' >> /tmp/toupload/$i
done

Uploading is done using:

ls -1 | while read i; do curl -H "Expect:" -i -u "Stefan de Konink:pass1234" -T $i "http://api.openstreetmap.org/api/0.5/way/$i"; done


Or Deleting if the object is empty anyway:

curl -H "Expect:" -i -u "Stefan de Konink:pass1234" -X DELETE "http://api.openstreetmap.org/api/0.5/relation/105969"


In case out of sequence deletes:

cat /tmp/deletebetter | while read i; do echo '<?xml version="1.0" encoding="UTF-8"?><osm ersion="0.5" generator="OpenStreetMap server"><way id="' > $i; echo $i >> $i; echo '"><nd ref="1"/><tag k="created_by" v="OSM Fixer"/><tag k="comment" v="This undelete was done because an out of order delete that was never propagated."/></way></osm>' >> $i; curl -H "Expect:" -i -u "Stefan de Konink:pass1234" -T $i "http://api.openstreetmap.org/api/0.5/way/$i"; curl -H "Expect:" -i -u "Stefan de Konink:pass1234" -X DELETE "http://api.openstreetmap.org/api/0.5/way/$i"; rm $i; done


Python

There is a shell script querying MonetDB (Planet database) for referential violations and a Python script reading the results. It can be found at [1].

The Python script will correct the violations via API v0.6 with OAuth. By default, the script is writing to api06.dev.openstreetmap.org in order to avoid damage to the production data.