Hartmut Holzgraefe <hartmut@(php.net|mariadb.com)>
MySQL Backend for osm2pgsql
- osm2pgsql in theory supports different database backends for output and for its caching layer (aka 'middle')
- MySQL/MariaDB have basic GIS support (with improvements in MariaDB since 5.3 and in MySQL since 5.6)
So why not add an 'output' and 'middle' module to osm2pgsql that talks to MySQL/MariaDB instead of PostgreSQL?
Shouldn't be too hard, right?
Unfortunately the osm2pgsql turned out to be less modular than originally thought, with PostgreSQL specific code not only in the output-pgsql and middle-pgsql module code files but also throughout the main osm2pgsql code files themselves, and not PostgreSQL specific generic code that could be shared by other backend implementations being embedded into the PostgreSQL specific module files
- be able to compare GIS functionality and performance between MySQL, MariaDB and PostgreSQL using large real world data sets
- shared hosters often only provide MySQL out of the box, not PostgreSQL/PostGIS (or only for extra $$$)
- when using an application or framework that relies on (or works best with) MySQL, like e.g. MediaWiki or WordPress it can make sense to have GIS data in the same DBMS instead of having to maintain a second DBMS installation alongside with MySQL
- MySQL on-disk data file formats are architecture-agnostic, so it is e.g. possible to do an import on a fast local box and then to just copy the generated data files over to e.g. a webserver that has a different architecture (with PostgreSQL this already fails when trying to move files generated on a 64bit x86 system to a 32bit x86 system, with MySQL this has been a no-brainer between systems of different word length or endianess for over a decade now)
- the above also applies to native replication setups ...
- refactor all PostgreSQL specific code out of the main code and into the pgsql specific module files only (done)
- refactor generic functionality out of the PostgreSQL specific files into generic files so that it can be shared by different backends (done)
- create MySQL/MariaDB output module (done)
- create MySQL/MariaDB middle module (work in progress)
- find a way to support --hstore (not started yet)
- imports are possible as long as there is enough RAM (no --slim mode yet due to lack of middle layer)
- imports take about 4-5 times as long as with PostGIS (on the same machine)
- query performance on the data imported so far is similar to that of PostGIS (this may change though once larger data sets can be imported so that the working set doesn't fit into RAM easily anymore)
See this GitHub project ...