PBF Perl Parser


To parse PBF files in Perl, you'll need two modules from CPAN:

  • Google::ProtocolBuffers [1]
  • Compress::Zlib from the IO::Compress suite [2]

Then you need to obtain the two ProtoBuf definition files used by the script, fileformat.proto and osmformat.proto, from an OSM source.


The tricky part is that Google::ProtocolBuffers does not understand C-style comments ( /* ... */ ), only C++-style comments ( // at the beginning of the line). Use your favourite text editor to alter the proto files accordingly.
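If you would rather not edit the files by hand, the block comments can also be removed with a few lines of Perl. The following is only a minimal sketch: strip_c_comments is a hypothetical helper name (it is not part of Google::ProtocolBuffers), it deletes the comments rather than converting them, and it assumes that "/*" never appears inside a string literal in the proto files.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Google::ProtocolBuffers):
# remove every /* ... */ block comment from a .proto source string.
# Assumption: "/*" never occurs inside a string literal.
sub strip_c_comments {
    my ($text) = @_;
    # /s lets "." match newlines, and .*? stops at the first closing "*/"
    $text =~ s{/\*.*?\*/}{}gs;
    return $text;
}

# Small made-up sample in the style of a proto file
my $proto = <<'END';
/* OSM Binary file format
   (multi-line comment) */
package OSMPBF;
// C++-style comments are left untouched
message Blob {
  optional bytes raw = 1; /* no compression */
}
END

print strip_c_comments($proto);
```

Note that this also removes the licence header from your working copy, so keep the original files around. The cleaned text can be written back to a file and passed to parsefile as in the script on this page.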

Finally, this skeleton script should help you get started:


use strict;
use warnings;
use Google::ProtocolBuffers;
use Compress::Zlib qw(uncompress);

# Generate the OSMPBF::* classes from the (comment-fixed) proto files
Google::ProtocolBuffers->parsefile("fileformat.proto", {});
Google::ProtocolBuffers->parsefile("osmformat.proto", {});

my $data;
my $headerblock = undef;
open(my $fh, "<", "belgium.osm.pbf") or die "Cannot open belgium.osm.pbf: $!";
binmode $fh;
# Each block is: a 4-byte big-endian length, a BlobHeader, then a Blob
while ( read($fh, $data, 4) == 4 ) {
  my $sz = unpack("N", $data);
  (read($fh, $data, $sz) == $sz) or die "Cannot read ".$sz." bytes";
  my $blobheader = OSMPBF::BlobHeader->decode($data);
  $sz = $blobheader->{datasize};
  (read($fh, $data, $sz) == $sz) or die "Cannot read ".$sz." bytes";
  my $blob = OSMPBF::Blob->decode($data);
  if (defined($blob->{raw})) {
    # Payload stored without compression
    $data = $blob->{raw};
  } elsif (defined($blob->{zlib_data})) {
    $data = uncompress($blob->{zlib_data});
    die "Cannot uncompress block" unless (defined($data));
    # raw_size is only meaningful for compressed blobs
    print "Buffer is ".length($data)." bytes - header announced ".$blob->{raw_size}." bytes\n";
  } else {
    die "Unknown compression type";
  }
  if ($blobheader->{type} eq 'OSMData') {
    my $primitive = OSMPBF::PrimitiveBlock->decode($data);
    # Process the nodes, ways and relations of $primitive here
  } elsif ($blobheader->{type} eq 'OSMHeader') {
    $headerblock = OSMPBF::HeaderBlock->decode($data);
  }
}
close $fh;

Important note: this code is a work in progress and may not work as is. Use the 'talk' page for improvements.
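The skeleton stops once a PrimitiveBlock has been decoded. As a possible next step, the sketch below turns the delta-encoded DenseNodes of such a block into plain id/lat/lon records, using the conversion given in the PBF format description (coordinate = 1e-9 * (offset + granularity * value)). The helper name dense_nodes is hypothetical, and the code assumes the decoded message is a plain hash with the field names from osmformat.proto, that repeated fields are array references, and that your Perl has 64-bit integers. It is untested against Google::ProtocolBuffers output, so treat it as a starting point only.

```perl
use strict;
use warnings;

# Hypothetical helper: expand the DenseNodes of a decoded PrimitiveBlock
# (assumed to be a plain hash) into a list of { id, lat, lon } records.
sub dense_nodes {
    my ($block) = @_;
    # osmformat.proto declares defaults for these optional fields; they are
    # applied by hand here in case the decoded hash lacks the keys.
    my $gran    = defined $block->{granularity} ? $block->{granularity} : 100;
    my $lat_off = defined $block->{lat_offset}  ? $block->{lat_offset}  : 0;
    my $lon_off = defined $block->{lon_offset}  ? $block->{lon_offset}  : 0;
    my @nodes;
    for my $group (@{ $block->{primitivegroup} || [] }) {
        my $dense = $group->{dense} or next;   # group without dense nodes
        my ($id, $lat, $lon) = (0, 0, 0);
        for my $i (0 .. $#{ $dense->{id} }) {
            # Each entry stores the difference from the previous node
            $id  += $dense->{id}[$i];
            $lat += $dense->{lat}[$i];
            $lon += $dense->{lon}[$i];
            push @nodes, {
                id  => $id,
                lat => 1e-9 * ($lat_off + $gran * $lat),   # degrees
                lon => 1e-9 * ($lon_off + $gran * $lon),   # degrees
            };
        }
    }
    return @nodes;
}

# Made-up sample data shaped like a decoded block with two dense nodes
my $sample = {
    primitivegroup => [
        { dense => { id  => [10, 2],
                     lat => [505000000, 1000],
                     lon => [45000000, -500] } },
    ],
};
printf "%d %.7f %.7f\n", @{$_}{qw(id lat lon)} for dense_nodes($sample);
```

In the skeleton, the helper would be called as dense_nodes($primitive) inside the 'OSMData' branch. Tags (keys_vals) and non-dense nodes, ways and relations are not handled here.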