Compression thingy
From OpenStreetMap Wiki
It has been proposed that this page be deleted or replaced by a redirect.
Binary file format for OSM data
Notes
Data items (tag names, tag values)
- If you don't expect the text to recur (e.g. "80n:389423894.jpg") then store an ID of 0 followed by the text
- If you expect the text to recur (e.g. "highway") then store an ID of 1, followed by the ID that you want to allocate to this text, followed by the text
- If the text has already been stored (e.g. the second highway) then just use its existing ID and don't bother storing the text
It's up to the compressor to decide whether a tag is 'volatile' (many unique values) or has just a few values that get reused many times. The sample implementation hardcodes a list of volatile tag names. Future implementations may decide this by counting how many unique values each tag type has.
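The three cases above can be sketched roughly as follows. This is only an illustration, not the sample implementation: the page does not specify field widths or byte order, so the 16-bit little-endian IDs, the 16-bit length prefix, and the names `TextEncoder` and `TextDecoder` are all assumptions. IDs 0 and 1 are treated as reserved markers, so allocated IDs start at 2.

```python
import struct


class TextEncoder:
    """Encodes tag names/values using the three cases described above."""

    def __init__(self):
        self.ids = {}      # text -> ID already allocated to it
        self.next_id = 2   # assumption: 0 and 1 are reserved as markers

    def encode(self, text, volatile=False):
        raw = text.encode("utf-8")
        if text in self.ids:
            # Already stored: emit only the existing ID.
            return struct.pack("<H", self.ids[text])
        if volatile:
            # Not expected to recur: marker 0, then the text.
            return struct.pack("<HH", 0, len(raw)) + raw
        # Expected to recur: marker 1, the newly allocated ID, then the text.
        new_id = self.next_id
        self.next_id += 1
        self.ids[text] = new_id
        return struct.pack("<HHH", 1, new_id, len(raw)) + raw


class TextDecoder:
    """Mirror of TextEncoder; rebuilds the ID table while reading."""

    def __init__(self):
        self.texts = {}    # ID -> text

    def decode(self, buf, pos=0):
        """Return (text, position just after the decoded item)."""
        (marker,) = struct.unpack_from("<H", buf, pos)
        pos += 2
        if marker == 0:
            (size,) = struct.unpack_from("<H", buf, pos)
            pos += 2
            return buf[pos:pos + size].decode("utf-8"), pos + size
        if marker == 1:
            new_id, size = struct.unpack_from("<HH", buf, pos)
            pos += 4
            text = buf[pos:pos + size].decode("utf-8")
            self.texts[new_id] = text
            return text, pos + size
        # Any other value is an ID that was allocated earlier.
        return self.texts[marker], pos
```

Under these assumptions the first "highway" costs 13 bytes and every later one costs 2, which is where the saving comes from. The `volatile` flag would be set by whatever policy the compressor uses, for example a hardcoded list of tag names.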
Hard limits
- Keys and values are at most 256 bytes each
- At most 256 tags per item
- At most 2^16 nodes per way
- At most 2^32 nodes and ways
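A compressor would probably want to check input against these limits before writing. The sketch below is one way to do that; the function name `check_item`, the use of UTF-8 to measure byte length, and the reading of each limit as an inclusive maximum are assumptions, since the page does not say. The 2^32 limit applies to the whole file rather than to one item, so it is not checked here.

```python
MAX_TEXT_BYTES = 256        # keys and values
MAX_TAGS_PER_ITEM = 256
MAX_NODES_PER_WAY = 2 ** 16


def check_item(tags, node_refs=()):
    """Return a list of human-readable limit violations (empty if none).

    tags      -- dict mapping tag name to tag value
    node_refs -- sequence of node IDs, only meaningful for ways
    """
    problems = []
    if len(tags) > MAX_TAGS_PER_ITEM:
        problems.append(f"too many tags: {len(tags)}")
    for key, value in tags.items():
        for text in (key, value):
            # Assumption: limits are measured on the UTF-8 encoding.
            size = len(text.encode("utf-8"))
            if size > MAX_TEXT_BYTES:
                problems.append(f"text too long ({size} bytes): {text[:20]!r}")
    if len(node_refs) > MAX_NODES_PER_WAY:
        problems.append(f"too many nodes in way: {len(node_refs)}")
    return problems
```

For example, `check_item({"highway": "residential"})` returns an empty list, while a way with 2^16 + 1 node references yields one problem.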
Lat, long
- Each coordinate is encoded as an integer
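The page does not say how the integer is derived. One common approach, shown here purely as an assumption, is fixed-point scaling: multiply degrees by 10^7 and round, which fits in a signed 32-bit integer because 180 x 10^7 is below 2^31. The names `SCALE`, `to_fixed`, `from_fixed` and `pack_coord` are hypothetical.

```python
import struct

SCALE = 10_000_000  # assumption: 1e-7 degree resolution, not stated on this page


def to_fixed(degrees):
    """Convert a latitude or longitude in degrees to a scaled integer."""
    return round(degrees * SCALE)


def from_fixed(value):
    """Convert a scaled integer back to degrees."""
    return value / SCALE


def pack_coord(lat, lon):
    """Pack a coordinate pair as two little-endian signed 32-bit integers."""
    return struct.pack("<ii", to_fixed(lat), to_fixed(lon))
```

With this scaling a coordinate pair takes 8 bytes, and the round trip loses at most half a unit of the last stored digit.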
Size
- Compresses a 110 MB OSM file to 16 MB (roughly a 7:1 ratio)
Speed
With the 110 MB sample (on a 700 MHz PC):
- 6 min 07 s to encode
- 1 min 47 s to decode (without storing the decoded data)
