Bulk Importing
Bulk importing is the process of taking some external data source and converting it for use in OSM. Some examples include:
- AND Data - Data for the whole of the Netherlands
- Almien coastlines - Importing coastlines from PGS
- TIGER - Importing data from the US TIGER database
This page discusses various methods of performing a bulk import.
via JOSM
One of the earliest and most common forms of bulk import is via JOSM. A conversion program takes the import data and writes it out in the JOSM file format. A user can then open JOSM, load the file and hit the upload button. This method is fairly reliable and lets the user view and manipulate the data prior to upload. The disadvantage is that JOSM has to be able to load the data, which becomes impractical once the data reaches several megabytes. There have also been reports that, when uploading lots of data, the process usually gets stuck after a few thousand objects, which makes it of limited use for large-scale uploads.
This method also cannot be run on the server, because it requires user interaction.
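The conversion step described above can be sketched roughly as follows. This is a minimal illustration, not the converter used by any of the imports listed here: the function name `records_to_josm_xml`, the input layout and the `version`/`generator` attribute values are assumptions made for the example. It relies on the JOSM convention that objects with negative IDs are treated as new and not yet uploaded.

```python
import xml.etree.ElementTree as ET


def records_to_josm_xml(ways):
    """Build a JOSM-style OSM XML document from a list of ways.

    Each way is a dict: {"points": [(lat, lon), ...], "tags": {key: value}}.
    Every new object gets its own negative ID (assumed input layout).
    """
    # The version and generator values are assumptions for this sketch.
    root = ET.Element("osm", version="0.5", generator="import-sketch")
    next_id = -1
    way_elements = []
    for way in ways:
        way_el = ET.Element("way", id=str(next_id))
        next_id -= 1
        for lat, lon in way["points"]:
            # Nodes are attached to the root straight away; the way only
            # holds a reference to each node.
            ET.SubElement(root, "node", id=str(next_id),
                          lat=f"{lat:.7f}", lon=f"{lon:.7f}")
            ET.SubElement(way_el, "nd", ref=str(next_id))
            next_id -= 1
        for key, value in sorted(way["tags"].items()):
            ET.SubElement(way_el, "tag", k=key, v=value)
        way_elements.append(way_el)
    # Ways are appended last so that all nodes precede them in the file.
    root.extend(way_elements)
    return ET.tostring(root, encoding="unicode")
```

Note that this sketch does not merge points shared between ways; a real converter would normally need to deduplicate such nodes.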
via direct API manipulation
This method is rather uncommon in general; however, it is used by Almien coastlines. There the shapefile is processed record by record and the objects are created immediately. The main downside of this method is that generally no record is kept of what has been uploaded and created and, since everything is done on the fly, after an interruption there is generally no way of knowing how far the process got. The actual changeset is ambiguous since it is also generated on the fly.
In general this method should be discouraged.
via Osmosis
Osmosis is a program for general manipulation of OSM data. Amongst its many other features, it can apply a "change file" to a database. It understands the osmChange file format. However, it can only apply changes directly to a database, not to the API, and it currently lacks features such as placeholders during the creation of objects and referential integrity of the database objects. For applying changes to a local database where the IDs of the objects are known beforehand, though, it is unbeatable.
This tool shows a lot of long-term promise.
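As an illustration of what a change file with known IDs looks like, the following sketch builds a small osmChange document containing a modify block and a delete block. The function name `build_osm_change`, the input layout and the `version`/`generator` values are assumptions for the example, and attributes that a real tool may require on each object (such as timestamps or versions) are left out.

```python
import xml.etree.ElementTree as ET


def build_osm_change(modified_nodes, deleted_node_ids):
    """Build an osmChange document for nodes whose IDs are already known.

    modified_nodes: list of dicts {"id", "lat", "lon", "tags"} (assumed layout).
    deleted_node_ids: list of existing node IDs to delete.
    """
    # The version and generator values are assumptions for this sketch.
    root = ET.Element("osmChange", version="0.3", generator="import-sketch")
    if modified_nodes:
        modify = ET.SubElement(root, "modify")
        for node in modified_nodes:
            el = ET.SubElement(modify, "node", id=str(node["id"]),
                               lat=str(node["lat"]), lon=str(node["lon"]))
            for key, value in sorted(node.get("tags", {}).items()):
                ET.SubElement(el, "tag", k=key, v=value)
    if deleted_node_ids:
        delete = ET.SubElement(root, "delete")
        for node_id in deleted_node_ids:
            ET.SubElement(delete, "node", id=str(node_id))
    return ET.tostring(root, encoding="unicode")
```

Because every ID is positive and already known, no placeholder resolution is needed, which matches the case where the text above says Osmosis works best.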
via bulk_upload.pl
This method follows the concept of taking a change file and applying it to the OSM database. It understands both the JOSM file format and the osmChange format. It handles placeholders and tracks the changes that have been applied. Thus, if the process is aborted for any reason, it can recover and continue.
This program was used for both the AND and TIGER imports and is regularly used for uploading coastlines. It supports both the v4 and v5 API, defaulting to the latter. For details see bulk_import.pl.
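The idea of handling placeholders while tracking applied changes can be sketched as below. This is not how bulk_upload.pl is implemented; it is a minimal illustration under stated assumptions: `apply_creates`, the object layout, the `upload` callable and the JSON state file are all hypothetical names invented for the example. Objects are assumed to be ordered so that anything referenced comes before whatever refers to it.

```python
import json
import os


def apply_creates(objects, upload, state_path):
    """Upload new objects in order, resolving placeholders and saving progress.

    objects: list of dicts {"id": placeholder_id, "refs": [ids, ...]},
             where negative IDs are placeholders (assumed layout).
    upload: hypothetical callable that takes an object dict and returns
            the ID assigned by the server.
    state_path: JSON file mapping placeholder ID to real ID.
    """
    id_map = {}
    if os.path.exists(state_path):
        with open(state_path) as f:
            # JSON object keys are strings, so convert them back to ints.
            id_map = {int(k): v for k, v in json.load(f).items()}

    for obj in objects:
        if obj["id"] in id_map:
            continue  # already uploaded during an earlier run
        # Replace placeholders that have been resolved so far.
        refs = [id_map.get(r, r) for r in obj.get("refs", [])]
        id_map[obj["id"]] = upload(dict(obj, refs=refs))
        # Save the mapping after every object, via a temporary file so a
        # crash in the middle of writing cannot corrupt the saved state.
        tmp_path = state_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(id_map, f)
        os.replace(tmp_path, state_path)
    return id_map
```

Running the function again with the same state file skips everything already recorded. One gap remains in this sketch: if the process dies after `upload` succeeds but before the state is saved, that single object would be uploaded twice on resume.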

