Improvements to osrm
OSRM is a routing package for OSM data. In this project we are aiming to extend the functionality of OSRM by writing a plugin which takes the route output given by OSRM engine and provides "turn directions" as an output to the user so that he can navigate easily along the route. We also work on improving the efficiency of current engine by adding compression techniques for efficient memory management. This makes index lookups faster and thus results in faster throughputs.
The project is divided into two parts ( written in the order of priority)
Route Descriptor Plugin
As the name suggests , this plugin describes the routes as turn-directions for a route given as output by osrm. This involves processing the output given by the routing engine using the additional data from osm file and creating the turn-directions
Done with the coding part. In this phase I worked on improving the Route Descriptor engine. We have made a few improvements for long distance routes and this results in lesser number of route descriptions by clustering a set of routes as a single instruction and thus less number of turn directions. Patch Submitted . Yet to be committed to the trunk.
Current route descriptor engine contained a "BaseDescriptor" plugin which is the base for all the descriptors like KML,JSON, GPX etc. However every of the above route descriptors use their own route description code for providing instructions. This resulted in redundant code in the trunk as each descriptor contains the same code of route calculation with minor modifications (corresponding to their output formats). So as a part of this phase , 2 things are implemented.
1) Cleaning up the route Descriptor code: I have implemented a separate plugin called "RouteDescriptor" that extends "BaseDescritor" and provides methods that take input as a set of route segments and provides route descriptions in a general format that can be used by any of the KML,JSON, GPX etc. Now each of these descriptors now extends (inheritance in cpp) "RouteDescriptor" instead of "BaseDecriptor". The advantage of doing this modularisation is separating route description and specific formats. Now if we make changes to Route Descriptor engine, we need not worry about specific formats like KML, JSON provided me maintain the output format correctly.
2) Improvements in Descriptor : Once step 1 is done, we can now play with the Route Descriptor, we made some optimizations to the route descriptor engine for longer routes. The main idea behind this optimization is that, for longer routes , the end user doesn't expect each and every detail (turn directions) at every point in the route. Using this , we can cluster some continuous routes (like highway routes) into a single instruction along with combined distance and we presend all the roads at a single go. This results in less number of instructions.
Eg: ( Continue on NH1 for 100km & Continue on NH5 for 200km) can be replaced by Continue on NH1:NH5 by 300km.
The above procedure results in 33% decrease in the number of instructions for longer distances without any loss of data. We don't touch via points because they need to be explicitly mentioned in the route.
Pre-processing using compression
This involves using a compressing library to speed up things during preprocessing of osm data to create hierarchies and indexes. The parser is given as an input, a large number of Street Names to be parsed and this hogs the main memory. We can improve things using a good compression library like zlib or any other library which can provide good compression/decompression ratios.
Status : Working on integrating zlib-compression during extraction phase and writing the compressed file to disk instead of uncompressed binary text. Initial results showed 50% decrease in index size on the disk and also in the main-memory usage and slightly increasing the lookup time due to on the fly de-compression. Almost completed the coding part and patch will be submitted before 05/07/2011 - Patch submitted (yet to be committed to the trunk).
Around 20% of total osm data is street name data. These street names are extensively used by routing applications to describe routes to the user. So, there is a great need to compress them so as to save disk space and also memory consumption in the routing servers. Currently osrm stores the street data as a binary file on disk and parses it during runtime and stores it as a vector in the main memory. If we configure osrm for large datasets like Planet, this blows up the memory usage and also disk usage. In this phase, I integrated zlib with osrm, which compresses the street index before writing to disk and while loading back into the memory, it loads the compressed data as partitions directly without de-compressing it into a vector. When the application tries to lookup for a street name using an index, we de-compress the corresponding partition on the fly and fetch the relevant street name. Since we de-compress during the run time, there is a small penalty in the lookup time but the advantage is that we have significant saving in memory consumption and also disk space. Before applying the zlib compression, we sorted the entire list of street names and applied Difference Encoding to it. This removes most of the redundancy and applying zlib over it produced good results.
I tested it over India dataset:
Diskspace reduced from 210KB to 80KB and memory usage reduced from 42.4KB to 1.2KB .
Lookup time increased slightly by 0.0003 to 0.0004ms [Averaged over 100 lookups].
This needs to be tested on datasets which contain large of streetnames (like European countries).
Project yet to start on May 23rd, 2011. Currently working on the source code of OSRM . Hoping to update this page with with some useful stuff.
15/05/2011 to 23/05/2011 (Community bonding period)
23/05/2011 to 30/05/2011
01/06/2011 to 08/06/2011
09/06/2011 to 15/06/2011
15/06/2011 to 05/07/2011 (2.5 weeks)
07/07/2011 to 21/07/2011 (2 weeks)
23/07/2011 to 14/08/2011 (2 weeks)
14/07/2011 to 21/08/2011 (1 week)
Dennis Luxen- Mentor
Bharath Vissapragada - Student