Pbftoosm

From OpenStreetMap Wiki
This page describes a historic artifact in the history of OpenStreetMap. It does not reflect the current situation, but instead documents the historical concepts, issues, or ideas.
About
Pbftoosm was a tool to convert an .osm.pbf file into an OSM XML file (usually with the .osm filename suffix).
Reason for being historic
It was developed in 2011 and is not maintained any more. Its features are included in Osmconvert nowadays.
Captured time
2011


pbftoosm is a tool that converts .pbf files into .osm files. In the same pass, you can restrict the output to a map region by defining a bounding box or a bounding polygon. The tool is written in C and is rather fast.

This program is not actively maintained at present. Please try using osmconvert instead.

Download

These Downloads are available:

(As usual: There is no warranty, to the extent permitted by law.)

Program Description

There are two major functions: translating a .pbf file into .osm format and restricting the data to a certain geographical region.

Decoding .pbf Files

If you have a .pbf file and want to get a (human-readable) .osm XML file, you can use this program to accomplish the job. The syntax is like this:

./pbftoosm < norway.osm.pbf > norway.osm

You also can compress the .osm output file. For example:

./pbftoosm < norway.osm.pbf | gzip -1 > norway.osm.gz

For most applications the history tags are not needed. If you decide to exclude version, changeset, user and timestamp information, add the command line argument '--drop-history'. For example:

./pbftoosm --drop-history < a.pbf > a.osm
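To illustrate the effect of '--drop-history' on the output, here is a minimal Python sketch that strips the four attributes named above from OSM XML. It is not how pbftoosm works internally (the tool is written in C and operates on the .pbf stream); the function name drop_history and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

# Attributes named in the text above. Whether pbftoosm removes exactly
# this set (and nothing else) is an assumption of this sketch.
HISTORY_ATTRS = ("version", "changeset", "user", "timestamp")


def drop_history(osm_xml: str) -> str:
    """Return OSM XML with history attributes removed from nodes, ways and relations."""
    root = ET.fromstring(osm_xml)
    for elem in root.iter():
        # The root <osm> element also has a "version" attribute (the API
        # version), so only the three object types are touched.
        if elem.tag in ("node", "way", "relation"):
            for attr in HISTORY_ATTRS:
                elem.attrib.pop(attr, None)
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample data, for illustration only.
sample = (
    '<osm version="0.6">'
    '<node id="1" lat="49.45" lon="11.08" version="3" changeset="42" '
    'user="example" timestamp="2011-04-16T12:00:00Z"/>'
    "</osm>"
)
cleaned = drop_history(sample)
print(cleaned)
```

Since every object carries these attributes, removing them shrinks the XML output considerably.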

If you need to delete references to nodes which have been excluded because they lie outside the geographical borders, use this option (mandatory for data imports into OSM Map Composer):

--drop-brokenrefs

Similar to this, you can drop whole sections of the file:

--drop-nodes
--drop-ways
--drop-relations
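The idea behind '--drop-brokenrefs' can be sketched in a few lines of Python. This is only an illustration under a simplified, hypothetical data model (ways as a dictionary of node id lists); it does not reproduce the program's actual implementation.

```python
def drop_broken_refs(ways, kept_node_ids):
    """Remove node references that point to nodes missing from the output.

    ways:          dict mapping a way id to its list of node ids
    kept_node_ids: ids of the nodes that remained inside the borders
    """
    kept = set(kept_node_ids)
    return {
        way_id: [ref for ref in refs if ref in kept]
        for way_id, refs in ways.items()
    }


# Hypothetical example: node 3 lies outside the border and was excluded.
ways = {100: [1, 2, 3], 101: [3]}
print(drop_broken_refs(ways, kept_node_ids=[1, 2]))
```

In this sketch a way may end up with no references at all; whether such a way is still written to the output is left open here.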

Applying Geographical Borders

If you want to limit the output to a geographical region, you can use a bounding box. To do this, enter the southwestern and the northeastern corners of the box, each as longitude,latitude. For example:

./pbftoosm < germany.osm.pbf -b=10.5,49,11.5,50 > nuernberg.osm
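The bounding box test itself is simple. The following Python sketch parses a value like the one passed to -b above and checks whether a coordinate falls inside the box. The coordinate order (west, south, east, north) is inferred from the example, the inclusive treatment of the box edges is an assumption, and the names BBox, parse_bbox and contains are hypothetical.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    min_lon: float  # western edge
    min_lat: float  # southern edge
    max_lon: float  # eastern edge
    max_lat: float  # northern edge


def parse_bbox(arg: str) -> BBox:
    """Parse a string such as '10.5,49,11.5,50' (assumed order: W, S, E, N)."""
    parts = [float(p) for p in arg.split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected four comma-separated numbers, got {arg!r}")
    return BBox(*parts)


def contains(box: BBox, lon: float, lat: float) -> bool:
    """Return True if the point lies inside the box (edges count as inside)."""
    return (box.min_lon <= lon <= box.max_lon
            and box.min_lat <= lat <= box.max_lat)


box = parse_bbox("10.5,49,11.5,50")
print(contains(box, 11.08, 49.45))  # a point in Nuernberg
print(contains(box, 9.99, 53.55))   # a point in Hamburg
```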

Instead of a simple bounding box you can use a border polygon file. This will allow a more accurate limitation to a political border, for example.

./pbftoosm < germany.osm.pbf -B=hamburg.poly > hamburg.osm

The format of a border polygon file is described in the OSM Wiki. You do not need to follow the format description strictly, but you must ensure that every line of coordinates starts with blanks.
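As an illustration of how such a file can be read and applied, the Python sketch below parses a small polygon file and tests points with a standard ray-casting check. It relies on the rule stated above (coordinate lines start with blanks) and assumes each coordinate line holds longitude followed by latitude. Holes (sections whose name starts with "!") are not treated specially, the sample polygon is a made-up rectangle, and none of this claims to match pbftoosm's own algorithm.

```python
def parse_poly(text):
    """Return a list of rings; each ring is a list of (lon, lat) tuples."""
    rings, current = [], []
    for line in text.splitlines()[1:]:      # the first line is the file's name
        if not line.strip():
            continue
        if line[0] in " \t":                # coordinate lines start with blanks
            lon, lat = (float(v) for v in line.split()[:2])
            current.append((lon, lat))
        elif line.strip() == "END":         # closes a ring (or the whole file)
            if current:
                rings.append(current)
            current = []
        # any other line is a section name and is ignored in this sketch
    return rings


def point_in_ring(lon, lat, ring):
    """Ray casting: count how often a ray going east crosses the ring."""
    inside = False
    for i in range(len(ring)):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % len(ring)]
        if (y1 > lat) != (y2 > lat):        # edge spans the point's latitude
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def point_in_poly(lon, lat, rings):
    return any(point_in_ring(lon, lat, ring) for ring in rings)


# Hypothetical rectangle, not a real border.
sample = """example_area
1
   9.7   53.4
   10.3  53.4
   10.3  53.7
   9.7   53.7
   9.7   53.4
END
END
"""
rings = parse_poly(sample)
print(point_in_poly(10.0, 53.55, rings))
print(point_in_poly(11.08, 49.45, rings))
```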

Usually, the output file will contain some empty relations because their nodes do not lie inside the borders you have applied. If you need to prevent this, please use the parameter -i to allow the program random file access. The processing time will increase a little. For example:

./pbftoosm -i=germany.osm.pbf -b=10.5,49,11.5,50 > nuernberg.osm
./pbftoosm -i=germany.osm.pbf -B=hamburg.poly > hamburg.osm

Plausibility Check

The parameter -t disables the standard output. This is useful if you want to check a file and do not need any output data. You will get error and warning messages only.

Additional Help

Please start the program with -h to get the help page.

Resources

The program itself will occupy only about 30 MB. When applying geographical borders (parameters -b and -B), another 400 MB are needed for hash memory. You can change the hash memory size with the parameter -h..., e.g. -h=1000.

Benchmarks

Comparison with Osmosis

Although Osmosis is undoubtedly the best-known and most universal tool for OSM data conversions, pbftoosm proves to be much faster for certain special tasks.

With this benchmark, we compare the transformation of a .pbf file to an .osm file, while applying a border polygon. In detail:

  • .pbf input file: germany.osm.pbf (2011-04-16)
  • border polygon: bayern.poly from Cloudmade (2011-04-16)
  • software: Osmosis-0.39 and pbftoosm 0.2A
  • operating system: Ubuntu 10.04
  • CPU: Intel Atom 330 (2 cores, 1.6 GHz)

Osmosis

$ time ../osmosis/bin/osmosis --read-pbf file="germany.osm.pbf" --bounding-polygon file="bayern.poly" cascadingRelations --write-xml file="bayern_os.osm"
16.04.2011 14:00:10 org.openstreetmap.osmosis.core.Osmosis run
INFO: Osmosis Version 0.39
16.04.2011 14:00:11 org.openstreetmap.osmosis.core.Osmosis run
INFO: Preparing pipeline.
16.04.2011 14:00:12 org.openstreetmap.osmosis.core.Osmosis run
INFO: Launching pipeline execution.
16.04.2011 14:00:12 org.openstreetmap.osmosis.core.Osmosis run
INFO: Pipeline executing, waiting for completion.
16.04.2011 15:07:29 org.openstreetmap.osmosis.core.Osmosis run
INFO: Pipeline complete.
16.04.2011 15:07:29 org.openstreetmap.osmosis.core.Osmosis run
INFO: Total execution time: 4039003 milliseconds.
real	67m20.922s
user	67m6.656s
sys	1m18.453s
  • Result: about 67 minutes

pbftoosm

$ time ./pbftoosm -i=germany.osm.pbf -B=bayern.poly > bayern_po.osm
real	4m29.005s
user	3m40.514s
sys	0m22.029s
  • Result: less than 5 minutes

pbftoosm, discarding History Information

$ time ./pbftoosm -i=germany.osm.pbf -B=bayern.poly --drop-history > bayern_po_h.osm
real	3m10.160s
user	2m38.766s
sys	0m12.269s
  • Result: about 3 minutes

File Sizes

  • bayern.poly - 100.1 KB (102506 Bytes)
  • germany.osm.pbf - 871.1 MB (913428341 Bytes)
  • bayern_os.osm - 3.1 GB (3303447197 Bytes)
  • bayern_po.osm - 3.0 GB (3239098681 Bytes)
  • bayern_po_h.osm - 1.5 GB (1654126217 Bytes)

Osmosis vs. pbf2osm vs. pbftoosm

Comparison of cutting a poly out of three files:

  1. bayern.osm.lzo (584 MB) (ultra fast compression and decompression, may depend on speed of the harddisk)
  2. bayern.osm.bz2 (287 MB) (very slow compression)
  3. bayern.osm.pbf (181 MB) (fast decompression)

$ time osmosis --rx file=bayern.osm.bz2 --bp file=regensburg.poly --wx file=e.osm
real	10m7.058s
$ time lzop -d < bayern.osm.lzo | osmosis --rx file=- --bp file=regensburg.poly --wx file=c.osm
real	2m23.741s
$ time bzcat bayern.osm.bz2 | osmosis --rx file=- --bp file=regensburg.poly --wx file=g.osm
real	2m22.601s
$ time bzcat bayern.osm.bz2 | osmchange -B=regensburg.poly > f.osm
real	2m0.632s
$ time pbf2osm bayern.osm.pbf | osmchange -B=regensburg.poly > d.osm
real	1m45.854s
$ time lzop -d < bayern.osm.lzo | osmchange -B=regensburg.poly > b.osm
real	0m46.488s
$ time osmosis --rb file=bayern.osm.pbf --bp file=regensburg.poly --wx file=a.osm
real	0m40.421s
$ time pbftoosm -B=regensburg.poly -i=bayern.osm.pbf > h.osm
real	0m6.658s

Conclusion

Cutting out a polygon with pbftoosm is about 16 times faster than pbf2osm combined with osmchange (1m45.854s vs. 0m6.658s), and still about 6 times faster than osmosis reading the .pbf file directly (0m40.421s).
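The speed-up factors can be recomputed from the "real" times listed above. The short Python sketch below does the arithmetic; the helper name parse_time and the dictionary labels are chosen here for illustration only.

```python
import re


def parse_time(value: str) -> float:
    """Convert a duration printed by `time`, e.g. '1m45.854s', to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", value)
    if match is None:
        raise ValueError(f"unexpected duration: {value!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# "real" times copied from the benchmark runs above.
real_times = {
    "pbf2osm | osmchange": "1m45.854s",
    "osmosis --rb": "0m40.421s",
    "pbftoosm": "0m6.658s",
}

baseline = parse_time(real_times["pbftoosm"])
speedups = {name: parse_time(t) / baseline for name, t in real_times.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x the run time of pbftoosm")
```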

Further Benchmarks

Please add your benchmark results.