OSMbin (file format)

Draft for a binary OSM file format.


Features offered

  • fast, indexed access via object-id or geographic location without loading or uncompressing more than the object to be loaded
  • fast, incremental updates without affecting more than the updated objects (e.g. apply hourly diffs to a binary planet-file)
  • fast, indexed access of "ways of a node", "relations of a way" and "relations of a node"
  • can store all information the OSM-xml-format can except username and userid (these are usually not required for anything).
  • can be used as a native format for:
    • navigation software
    • routing software
    • moving vector-maps
    • editors (not recommended)

Usage and intended use

The OSMbin file-format is intended for the following types of clients:

  • navigators/routers
  • addresses-finders
  • realtime-rendering of graphical maps

It is not intended for:

  • devices with very limited storage-capacity

It is optimized for:

  • fast, indexed data-access
  • incremental updates
  • general usage

This protocol is supported by the following clients:


  • DONE: The on-disk format of version 1.0 is completely specified. It is simple enough to be understood by developers without a geodata background.
  • DONE: A reference implementation of version 1.0 is provided in libOsm (part of Traveling Salesman).
  • DONE: find the optimal number of tag and wayRef/nodeRef slots per record, using a spreadsheet of statistics for Hamburg.
  • DONE: osmosis-tasks for reading, writing and reindexing osmbin-v1.0
  • DONE: implement an fsck-program that scans and repairs broken files/indexes.
  • DONE: add version-information to nodes, ways and relations
  • DONE: back-references from nodes, ways and relations to the relations that reference them
  • DONE: optimized storage of long attribute-values
  • DONE: shorter storage of the element-types of relations

Status: OSMbin version 1.0 is fully specified and a working reference implementation exists.


OSMbin is an on-disk-format that supports:

  • getWaybyID(), getNodebyID(), getRelationbyID()
  • getWaysForNodeID()
  • getAttributeofNodeID(AttribName)
  • getAttributeofWayID(AttribName)
  • getRelationsofWayID()
  • getRelationsofNodeID()
  • and, most importantly, getNodesbyBoundingBox(north,south,east,west)
  • It is uncompressed, so it can be mmapped()
  • It is a mutable format to support updating parts of the map without having to re-generate the complete map-file
  • We keep wayIDs and nodeIDs as well as all nodes that originally belonged to a way from OSM, so osm-xml-diff -files can be applied to update the map.
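
To make the semantics of the bounding-box query concrete, here is a minimal Java sketch. It is a plain linear scan over an in-memory list, not the indexed on-disk lookup that OSMbin provides, and the class `BoundingBoxSketch`, its `Node` type and the method signature are hypothetical names chosen for illustration; they are not the libOsm API. It assumes the box does not cross the 180° meridian.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the bounding-box query; not the libOsm API. */
final class BoundingBoxSketch {

    /** Minimal node: OSM ID plus coordinates in degrees. */
    static final class Node {
        final long id;
        final double lat;
        final double lon;

        Node(long id, double lat, double lon) {
            this.id = id;
            this.lat = lat;
            this.lon = lon;
        }
    }

    /**
     * Returns all nodes inside the box, edges included.
     * Assumes west <= east, i.e. the box does not cross the 180° meridian.
     */
    static List<Node> getNodesByBoundingBox(List<Node> nodes,
            double north, double south, double east, double west) {
        List<Node> hits = new ArrayList<>();
        for (Node n : nodes) {
            boolean latInside = n.lat >= south && n.lat <= north;
            boolean lonInside = n.lon >= west && n.lon <= east;
            if (latInside && lonInside) {
                hits.add(n);
            }
        }
        return hits;
    }
}
```

A real reader of the format would answer the same question through the geographic index instead of visiting every node.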

Version 1.0

Version 1.0 requires API v0.6.

It is the default file format of Traveling Salesman Release 1.0.

Version 0.9

Version 0.9 of this format is the default format of Traveling Salesman Release 0.8.


The format need not consist of a single file. For example, indexes can live in separate files, and ways, nodes, relations and attributes can each have their own file. This can make it easier to grow an index and lets the way, node and relation files contain only fixed-size records. You may also separate the (possibly normalized) data required for routing from the larger data set required for real-time map rendering, with or without duplicating information between the two.
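
The benefit of fixed-size records is that a record's byte offset follows directly from its slot number, so a single record can be read or overwritten without touching its neighbours. The Java sketch below illustrates this with an invented 12-byte layout (ID, latitude, longitude); the class `FixedRecordSketch`, the field layout and the record size are assumptions for illustration and do not reproduce the actual OSMbin record format.

```java
import java.nio.ByteBuffer;

/** Hypothetical fixed-size record store; the layout is NOT the real OSMbin layout. */
final class FixedRecordSketch {

    /** Assumed layout: 4-byte ID, 4-byte latitude, 4-byte longitude (degrees * 1e7). */
    static final int RECORD_SIZE = 12;

    /** Byte offset of a record: slot number times record size. */
    static long offsetOf(int slot) {
        return (long) slot * RECORD_SIZE;
    }

    /** Overwrites one record in place using absolute puts; other slots are untouched. */
    static void write(ByteBuffer buf, int slot, int id, int latE7, int lonE7) {
        int pos = Math.toIntExact(offsetOf(slot));
        buf.putInt(pos, id);
        buf.putInt(pos + 4, latE7);
        buf.putInt(pos + 8, lonE7);
    }

    /** Reads one record as {id, latE7, lonE7}. */
    static int[] read(ByteBuffer buf, int slot) {
        int pos = Math.toIntExact(offsetOf(slot));
        return new int[] { buf.getInt(pos), buf.getInt(pos + 4), buf.getInt(pos + 8) };
    }
}
```

Because every record has the same length, updating slot 2 never moves slots 0 and 1, which is what makes in-place incremental updates cheap.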

  • The file format contains redundant information, but also the rules required to repair a broken file in a defined manner.
  • IDs are stored as 32-bit integers and are assumed to be dense in the planet file. The current distribution is as follows:
    • Nodes: number of used IDs=278150661, max(ID)=311426557, i.e. 89% of the IDs between 0 and 311426557 are in use by objects that have not been deleted
    • Ways: number of used IDs=22702734, max(ID)=28356734
    • Relations: number of used IDs=41545, max(ID)=50910
  • Whitespace at the end or start of tag-values may be lost.
  • The empty key and the empty value MAY be supported.
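
The density figure quoted for nodes is simply the number of used IDs divided by the highest ID. A short Java sketch of that arithmetic follows; the class and method names are hypothetical.

```java
/** Hypothetical helper that reproduces the ID-density arithmetic. */
final class IdDensitySketch {

    /** Fraction of the ID range 0..maxId that is occupied by live objects. */
    static double density(long usedIds, long maxId) {
        return (double) usedIds / maxId;
    }

    /** Same value as a whole-number percentage, rounded to nearest. */
    static long densityPercent(long usedIds, long maxId) {
        return Math.round(100.0 * density(usedIds, maxId));
    }
}
```

Applied to the figures listed above, `densityPercent` gives 89 for nodes, 80 for ways and 82 for relations.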

File size (Hamburg):

  • hamburg.osm.bz2 = 4MB
  • hamburg.osm = 42MB
  • indexed street-names in HSQLDB=21KB
  • nodex.idx = 160MB (Tree of order 8 with no balancing and fixed, implicit depth of 16+1. Each level encodes the next 4 bit of the ID)
  • nodex.obm = 63.5MB (32 chars/Tag-Value, 4 attributes/record, 4 wayRefs/record)
  • ways.idx = 33.7MB
  • ways.obm = 26.6MB (32 chars/Tag-Value, 6 attributes/record, 8 wayRefs/record)
  • attrnames.txt = 3KB (253 tag-names, longest name has 42 characters)
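
The note on nodex.idx says that each tree level encodes the next 4 bits of the ID. The Java sketch below shows only that step: splitting a 32-bit ID into 4-bit child indexes. Starting at the most significant bits is an assumption, the class name is hypothetical, and the on-disk node layout and the stated order and depth of the tree are not reproduced here.

```java
/** Hypothetical illustration of 4-bits-per-level addressing; not the nodex.idx layout. */
final class NibblePathSketch {

    /**
     * Splits a 32-bit ID into eight 4-bit child indexes (values 0..15).
     * Assumption: the most significant 4 bits select the child at the first level.
     */
    static int[] nibblePath(int id) {
        int[] path = new int[8];
        for (int level = 0; level < path.length; level++) {
            int shift = 28 - 4 * level;           // 28, 24, ..., 0
            path[level] = (id >>> shift) & 0xF;   // unsigned shift, keep the low 4 bits
        }
        return path;
    }
}
```

Because the path depends only on the ID, such a tree needs no balancing and no key comparisons; the trade-off is that sparsely used ID ranges still occupy index space.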

File size (Baden-Württemberg):

  • baden-wurttemberg.osm.bz2 = 44MB
  • baden-wuerttemberg.osm = 1.6GB
  • nodex.idx = 2.5GB
  • nodex.obm = 533MB
  • ways.idx = 391MB
  • ways.obm = 451MB
  • attrnames.txt = 22KB

Reference implementation:

  • By default, Java limits the size of memory-mapped files to 64MB. Raise the limit by passing the "-XX:MaxDirectMemorySize=256M" parameter to the JVM.
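
For readers unfamiliar with memory-mapping in Java, the following sketch maps a file read-only with the standard `java.nio` API and reads one 32-bit value from it. The class `MmapSketch` and its method are hypothetical and are not taken from libOsm. The value is read in `ByteBuffer`'s default big-endian byte order, which is an assumption about the data, not a statement about the OSMbin specification.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical example of reading from a memory-mapped file; not libOsm code. */
final class MmapSketch {

    /**
     * Maps the whole file read-only and returns the 32-bit integer at the given
     * byte offset. A single mapping is limited to Integer.MAX_VALUE bytes, so
     * larger files would have to be mapped in several regions.
     */
    static int readIntAt(Path file, int offset) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            return map.getInt(offset);   // absolute read, big-endian by default
        }
    }
}
```

The mapping remains valid after the channel is closed, and only the pages that are actually touched are loaded from disk.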