Converting OSM to GML
Why GML?
GML (Geography Markup Language) is an industry-standard XML format for expressing vector geometries and their attributes. GML is readable by a variety of F/OSS GIS packages, most notably any that depend on the OGR library, including QGIS and GRASS.
User:ChristopherSchmidt and User:SchuylerErle have come up with a couple different ways of exporting OSM XML to GML for use elsewhere. Both methods currently support nodes, segments, and tags, but not ways (yet).
Using XSLT
XSLT is a useful tool for transforming one kind of XML into another kind of XML.
I've compiled a library for faster processing and added start scripts for Windows and Unix. They can be downloaded from here. After a successful conversion to GML, other tools such as ogr2ogr can turn the GML into other formats, e.g. Shapefiles (shp). The Linux script in the zip does not work; this is the correct syntax:
java -cp "lib/serializer.jar:lib/xsltc.jar:lib/osm2gml.jar" -Xmx512M org.apache.xalan.xsltc.cmdline.Transform -u $1 osm2gml > `basename "$1" .osm`.gml
The following XSLT stylesheet turns the XML returned by the OSM REST API v0.5 into GML:
Version 0.2
<?xml version="1.0" encoding="UTF-8"?>
<!--
Version 0.2 by Stefan Keller, http://geoconverter.hsr.ch
Original version by Schuyler Erle.
Based on OSM REST API 0.5.
-->
<xsl:stylesheet xmlns="http://osm.maptools.org/"
xmlns:osm="http://www.openstreetmap.org/gml/"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:gml="http://www.opengis.net/gml" version="1.0">
<xsl:output method="xml"/>
<xsl:output indent="yes"/>
<xsl:template match="/">
<xsl:text>
</xsl:text>
<osm:FeatureCollection>
<xsl:for-each select="/osm/way">
<xsl:call-template name="way"/>
</xsl:for-each>
<xsl:text>
</xsl:text>
</osm:FeatureCollection>
</xsl:template>
<xsl:template match="/osm/way" name="way">
<xsl:text>
</xsl:text>
<gml:featureMember>
<osm:way fid="{@id}">
<osm:id><xsl:value-of select="@id"/></osm:id>
<osm:timestamp><xsl:value-of select="@timestamp"/></osm:timestamp>
<osm:user><xsl:value-of select="@user"/></osm:user>
<osm:geometryProperty>
<gml:LineString>
<gml:coordinates><xsl:apply-templates select="nd"/></gml:coordinates>
</gml:LineString>
</osm:geometryProperty>
<xsl:apply-templates select="tag"/>
</osm:way>
</gml:featureMember>
</xsl:template>
<xsl:key name='nodeById' match='/osm/node' use='@id'/>
<xsl:template match="/osm/way/nd">
<xsl:variable name='ref' select="@ref"/>
<xsl:variable name='node' select='key("nodeById",$ref)'/>
<xsl:value-of select="$node/@lon"/>,<xsl:value-of select="$node/@lat"/>
<xsl:text> </xsl:text>
</xsl:template>
<xsl:template match="/osm/way/tag">
<xsl:variable name="osm_element" select="translate(@k, translate(@k, 'aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ_-.0123456789', ''), '_')"/>
<!-- xsl:variable name="osm_element" select="@k"/ -->
<xsl:if test="string($osm_element)">
<xsl:element name="osm:{$osm_element}">
<xsl:value-of select="@v"/>
</xsl:element>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
Version 0.1
The following XSLT stylesheet turns the XML returned by the OSM REST API v0.4 into GML. This appears to be the original version by Schuyler Erle.
Software described on this page or in this section is unlikely to be compatible with API version 0.5, deployed on 8 October 2007. If you have fixed the software, or concluded that this notice does not apply, remove it.
<?xml version="1.0"?>
<xsl:stylesheet xmlns="http://osm.maptools.org/"
xmlns:osm="http://www.openstreetmap.org/gml/"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:gml="http://www.opengis.net/gml" version="1.0">
<xsl:output method="xml"/>
<xsl:key name='nodeById' match='/osm/node' use='@id'/>
<xsl:template match="/osm/segment/tag">
<xsl:element name="{@k}">
<xsl:value-of select="@v"/>
</xsl:element>
</xsl:template>
<xsl:template match="/osm/segment">
<xsl:variable name="from" select="@from"/>
<xsl:variable name="to" select="@to"/>
<xsl:variable name='from_node' select='key("nodeById",$from)'/>
<xsl:variable name='to_node' select='key("nodeById",$to)'/>
<gml:featureMember>
<segments fid="{@id}">
<ID>
<xsl:value-of select="@id"/>
</ID>
<osm:geometryProperty>
<gml:LineString>
<gml:coordinates>
<xsl:value-of select="$from_node/@lon"/>,<xsl:value-of select="$from_node/@lat"/>
<xsl:text> </xsl:text>
<xsl:value-of select="$to_node/@lon"/>,<xsl:value-of select="$to_node/@lat"/>
</gml:coordinates>
</gml:LineString>
</osm:geometryProperty>
<xsl:apply-templates select="tag"/>
</segments>
</gml:featureMember>
</xsl:template>
<xsl:template match="/">
<osm:FeatureCollection>
<xsl:apply-templates/>
</osm:FeatureCollection>
</xsl:template>
</xsl:stylesheet>
You can run this on a UNIX platform with xsltproc like so:
$ xsltproc osm2gml.xsl data.osm > data.gml
(When I simply copied the Version 0.2 stylesheet out of the wiki, it failed to run because the automatic markup dropped an empty string from one of the lines, leaving "9', ), '_')". It may not show up here either, but after the first comma the script requires an empty string, written as two single quotes: "9', ''), '_')".)
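To see what the nested translate() call in the Version 0.2 stylesheet computes, and why the empty string matters, here is a small Python emulation of the XPath 1.0 translate() function. The function names xpath_translate and osm_element are made up for this illustration; only the character list is taken from the stylesheet.

```python
# Character list copied from the Version 0.2 stylesheet.
ALLOWED = "aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ_-.0123456789"


def xpath_translate(text, source, target):
    """Emulate XPath 1.0 translate(): source[i] becomes target[i];
    characters of source with no counterpart in target are removed."""
    mapping = {}
    for i, ch in enumerate(source):
        if ch not in mapping:  # the first occurrence in source wins
            mapping[ch] = target[i] if i < len(target) else ""
    return "".join(mapping.get(ch, ch) for ch in text)


def osm_element(key):
    """Hypothetical helper mirroring the stylesheet's osm_element variable."""
    # Inner call: delete every allowed character, leaving only the unwanted ones.
    unwanted = xpath_translate(key, ALLOWED, "")
    # Outer call: the first unwanted character maps to "_", any others are dropped.
    return xpath_translate(key, unwanted, "_")


print(osm_element("addr:street"))
print(osm_element("highway"))
```

The inner call only works as a "delete these characters" step because its third argument is the empty string, which is why losing the two single quotes breaks the stylesheet. Note also that only the first distinct unwanted character is turned into an underscore; further distinct unwanted characters are removed outright.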
You can then convert the GML to, say, an ESRI Shapefile, using the OGR toolkit, like so:
$ ogr2ogr data.shp data.gml
This method works great for relatively small OSM XML files.
The downside is that XSLT tends to be slow: XSLT processors usually build an entire DOM tree of the source XML document in memory. Worse, the template that links nodes to segments uses XPath, which results in (at best) O(n) search time for each node. Both of these properties make XSLT unsuitable for converting an entire planet.osm data dump to GML.
Adding keys may help; this is what osmarender does. However, keys do not reduce the memory requirements, which for Planet.osm immediately jump into the 1.3 GB range. Without an idle machine with lots of memory to sit and think for a while, I would not recommend the XSLT method for converting something as large as the full OSM dataset, but keys may significantly increase speed on smaller chunks of data.
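As a rough analogy for why keys help, the following Python sketch contrasts a linear scan per node reference with an index built once, which is the idea behind xsl:key. How a given XSLT processor actually implements keys is up to that processor; the sample data and the names find_node_scan and build_node_index are hypothetical.

```python
# Hypothetical sample data: each node is an (id, lon, lat) tuple.
NODES = [("1", "-0.1", "51.5"), ("2", "-0.2", "51.6"), ("3", "-0.3", "51.7")]


def find_node_scan(nodes, ref):
    """Without a key: walk the node list for every reference, O(n) per lookup."""
    for node in nodes:
        if node[0] == ref:
            return node
    return None


def build_node_index(nodes):
    """With a key: one pass to build the index, then cheap lookups by id."""
    return {node[0]: node for node in nodes}


NODE_BY_ID = build_node_index(NODES)

# Resolve the node references of one way into a GML-style coordinate string.
way_refs = ["3", "1"]
coords = " ".join("%s,%s" % NODE_BY_ID[ref][1:] for ref in way_refs)
print(coords)
```

Both approaches still hold every node in memory, which matches the observation above that keys improve speed but not memory use.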
Using Python
Using Python and xml.sax, it is possible to convert from the Planet.osm dump to GML much more quickly. There is a script available from http://www.chzsoft.com.ar/geo/osm2gml.py.txt which shows how to do it. (A script for the old version 0.4 API can be found at http://london.freemap.in/osm2gml_simple.py.txt)
Syntax for running this script:
cat yourfile.osm | ./osm2gml.py > yourfile.gml
A few important things to notice about this code:
- The OSM namespace is the base namespace. This is so that attributes, when they come out, come out as lanes, not osm:lanes, which some tools might not completely understand.
- You can choose via exportAll whether all tags should be exported. If exportAll is set to 0, then exportTags defines the set of tags that are exported. The reason that we don't just export all tags is that when exporting to shapefiles, attributes are fixed width. This means that for every attribute, even if it's null 99.9% of the time, you're increasing your attributes table significantly. Exporting only useful tags lets you select which data are important, and limit the size of your files. This is only a concern if you are converting to shapefiles, however.
- By default only ways are exported. After setting exportNodes to 1 all nodes which have a name tag will be exported as well.
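The points above can be sketched with a small xml.sax handler. This is not the linked osm2gml.py script, only a minimal illustration under several assumptions: the input follows the API 0.5 layout (node, way, nd, tag), all nodes appear before the ways that reference them, and the names OsmToGml, convert, element_name, SAMPLE, export_all and export_tags are invented here, loosely modelled on the exportAll and exportTags options described above. Node export is left out.

```python
import io
import re
import xml.sax
from xml.sax.saxutils import escape, quoteattr


def element_name(key):
    """Turn an OSM tag key into a usable XML element name (illustrative rule)."""
    name = re.sub(r"[^A-Za-z0-9_.-]", "_", key)
    # XML names may not start with a digit, "-" or ".", so prefix those.
    return name if re.match(r"[A-Za-z_]", name) else "_" + name


class OsmToGml(xml.sax.ContentHandler):
    """Stream OSM XML and write one GML LineString feature per way."""

    def __init__(self, out, export_all=False, export_tags=("name", "highway")):
        super().__init__()
        self.out = out
        self.export_all = export_all
        self.export_tags = set(export_tags)
        self.nodes = {}   # node id -> (lon, lat); this still has to fit in memory
        self.way = None   # the way currently being read, or None

    def startDocument(self):
        # The OSM namespace is the default one, so tags come out unprefixed.
        self.out.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        self.out.write('<FeatureCollection xmlns="http://www.openstreetmap.org/gml/" '
                       'xmlns:gml="http://www.opengis.net/gml">\n')

    def startElement(self, name, attrs):
        if name == "node":
            self.nodes[attrs["id"]] = (attrs["lon"], attrs["lat"])
        elif name == "way":
            self.way = {"id": attrs["id"], "refs": [], "tags": {}}
        elif self.way is not None and name == "nd":
            self.way["refs"].append(attrs["ref"])
        elif self.way is not None and name == "tag":
            if self.export_all or attrs["k"] in self.export_tags:
                self.way["tags"][attrs["k"]] = attrs["v"]

    def endElement(self, name):
        if name != "way" or self.way is None:
            return
        way, self.way = self.way, None
        # References to nodes that were never seen are silently skipped.
        coords = " ".join("%s,%s" % self.nodes[ref]
                          for ref in way["refs"] if ref in self.nodes)
        write = self.out.write
        write("<gml:featureMember><way fid=%s>\n" % quoteattr(way["id"]))
        write("<geometryProperty><gml:LineString><gml:coordinates>%s"
              "</gml:coordinates></gml:LineString></geometryProperty>\n" % escape(coords))
        for key, value in way["tags"].items():
            tag = element_name(key)
            write("<%s>%s</%s>\n" % (tag, escape(value), tag))
        write("</way></gml:featureMember>\n")

    def endDocument(self):
        self.out.write("</FeatureCollection>\n")


def convert(osm_bytes, **options):
    """Convert OSM XML given as bytes and return the GML as a string."""
    out = io.StringIO()
    xml.sax.parseString(osm_bytes, OsmToGml(out, **options))
    return out.getvalue()


# Made-up sample input in the API 0.5 layout.
SAMPLE = b"""<osm version="0.5">
  <node id="1" lat="51.5" lon="-0.1"/>
  <node id="2" lat="51.6" lon="-0.2"/>
  <way id="10">
    <nd ref="1"/>
    <nd ref="2"/>
    <tag k="highway" v="residential"/>
    <tag k="lanes" v="2"/>
  </way>
</osm>"""

if __name__ == "__main__":
    print(convert(SAMPLE))
```

Because SAX delivers the document as a stream of events, no DOM tree is built; the only structure that grows with the input is the node dictionary. Calling convert(SAMPLE, export_all=True) would also emit the lanes tag, at the cost of wider attribute tables once the result is converted to a shapefile.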
From here, you can use standard GIS tools to convert the GML to something more useful:
ogr2ogr planet.shp planet.gml
And from there, you can load it into whatever you want: QGIS, for example, will now display it.
Output GML file from the May Planet.osm (produced by the v0.4 API script):
- http://london.freemap.in/output/planet_simple.gml.gz (16MB, gzipped.)

