Global Statistical Speed Matrix

GS^2M is a concept of a database containing data stored in records they can be addressed with integer values taken from geographic coordinates with easy formulas. It is meant to be a layer totally separated from OSM vector data never linked with existing ways and nodes, used only for queries during routing process when searching for the fastest way to get to destination.

In first thoughts each record contains information about quasi-square cell (1 second of width and 1 second of length on the map). Each second of latitude is always about 31 meters on the earth surface. Each second of longitude is about 31 meters only near the equator and the more far from equator we are, the less each second of longitude is to finally meet zero on poles. Because of that reason it is necessary to make few tweaks to make database more memory-efficent without losing any important data necessary in routing process. This will be described later what we need to do in order to have 31 meters as an average cell width.

Requirements

First of all, to make it widely usable, GS^2M should meet few requirements like:

Decentralization

everybody can have her/his own GS^2M database created from her/his own NMEA/GPX tracks
database can be merged easily with, for example, my friends' databases without losing any necessary information
then our databases can be uploaded to a public server in order to make this database Global

Privacy

nobody needs to know where I parked my car 3 days ago from 10pm to 7am ;-), but you may need to know that usually because of traffic from 8am to 11am GMT maximum possible speed is 40 km/h when you travel by M1 from Swords to Dublin (fortunately it is not so bad when you travel in opposite direction ;-) that time)

Simplicity

easy segmentation (for example into continents, countries, etc. whatever is necessary to cope with limited memory devices)
easy updating process by merging main public database with recent NMEA/GPX points processed into small update-databases
easy upgrade to variable cell size addressing

Story begins

Basically Global Statistical Speed Matrix is 2-dimensional matrix containing records stored in 360x60x60 columns and 180x60x60 rows. Each record contains data from NMEA/GPX points localized inside a quasi-square cell with 1 longitude-second of width and 1 latitude-second of length.

One record can contain data such as:

average speed (actually multiplicative inversed speed value which will be our metric, measured in time units, used to find the fastest way to destination). To make it even better, all speed data in each record can be stored hour after hour in a sub-matrix with 24x7 size in order not to loose important information about daily and weekly periods in traffic.
average altitude above a sea level in order to exclude NMEA/GPX points collected when travelling with a plane
number of NMEA/GPX points included in these average values
heading, in order to have separated speed data about going forward and backward when passing that cell. On weekdays there is a huge traffic in the morning on ways they lead from country side to city center, because majority of people go to work that time. But you can easily travel during that time in opposite directions. Analogically there is a huge traffic on those opposite directions in the afternoon when people come back from work, etc.

Cell adressing

There is no cell addressed with [zero,zero].

Rows

Height of each row on the map is about 31 meters.

Rows are addressed with integer numbers:

from -324000 to -1 on southern hemishere
from 1 to 324000 on northern hemisphere

Columns

In brief columns are addressed analogically:

from -648000 to -1 on western hemishere
from 1 to 648000 on eastern hemisphere

But in order to maintain 31 meters as an average width of a column on the map, little bit of complexity is necessary. Starting from North Pole to Equator there will be 13 sections with different number of columns:

rows	cols	max & min latitude values	number of cells
1	4	90°00'00" >= φ > 89°59'59"	4
4	20	89°59'59" >= φ > 89°59'55"	80
22	100	89°59'55" >= φ > 89°59'33"	2 200
105	500	89°59'33" >= φ > 89°57'48"	52 500
213	1500	89°57'48" >= φ > 89°54'15"	319 500
742	4500	89°54'15" >= φ > 89°41'53"	3 339 000
2123	13500	89°41'53" >= φ > 89°06'30"	28 660 500
6475	40500	89°06'30" >= φ > 87°18'35"	262 237 500
6433	81000	87°18'35" >= φ > 85°31'22"	521 073 000
19524	162000	85°31'22" >= φ > 80°05'58"	3 162 888 000
33302	324000	80°05'58" >= φ > 70°50'56"	10 789 848 000
83037	648000	70°50'56" >= φ > 47°46'59"	53 807 976 000
172019	1296000	47°46'59" >= φ > 00°00'00"	222 936 624 000

So there will be 291 513 020 284 cells to cover the northern hemisphere, so in total we need 583 026 040 568 cells to cover the Earth this way.

Data processing

Average values

Only NMEA/GPX points with non-zero speed value are included to process the average values in a cell. New n+1 average value is counted iterationally, so there is no need to know n points they were counted before in that cell.

av_new = av_old * n / (n+1)  +  value_new / (n+1);
n++;

However using only simple arithmetic mean formula can cause one issue, that in fact that will be weighted mean value. Tracks with lower speed values will contribute more to the final average value, because there are more NMEA/GPX points left in the same cell, when you travel slower. Knowing that, final formula should look like:

av_new = av_old * n / (n+speed_new/weight)  +  value_new / (n+speed_new/weight);
n = n+speed_new/weight;

where, for example, weight=130.

Heading

Each cell contains average values and n-values for going forward and backward separately.

24x7

Knowing daily and weekly traffic periods might be very valuable in routing process. Nobody would mind to know that if she/he started a trip one hour earlier, she/he would reach her/his destination 5 hours earlier.

Global Statistical Speed Matrix

Contents

Requirements

Decentralization

Privacy

Simplicity

Story begins

Cell adressing

Rows

Columns

Data processing

Average values

Heading

24x7

Navigation menu

Global Statistical Speed Matrix

Requirements

Decentralization

Privacy

Simplicity

Story begins

Cell adressing

Rows

Columns

Data processing

Average values

Heading

24x7

Navigation menu

Search