bulk_upload.py

From OpenStreetMap Wiki


bulk_upload.py is a Python script for performing bulk imports. It uses ElementTree for XML parsing and supports only a subset of the features of Bulk_upload.pl, but it is compatible with API v0.6.


It is currently available in SVN. An untested bulk_upload.php script is also included.

To check the scripts out of Subversion, use:

svn co http://svn.openstreetmap.org/applications/utils/import/bulk_upload_06/

Usage

python bulk_upload.py --help
Usage: bulk_upload.py -i input.osm -u user -p password

Options:

 -h, --help            show this help message and exit
 -i INFILE, --input=INFILE
                       read data from input.osm
 -u USER, --user=USER  username
 -p PASSWORD, --password=PASSWORD
                       password
 -c COMMENT, --comment=COMMENT
                       ChangeSet Comment
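
The help text above has the shape produced by Python's optparse module, and the traceback further down refers to options.infile. As a rough sketch only, assuming optparse and assuming the destination names user, password and comment (inferred from the metavars shown, not taken from the script), the options could be declared like this:

```python
from optparse import OptionParser


def build_parser():
    """Build a parser mirroring the options listed in the help output."""
    parser = OptionParser(usage="%prog -i input.osm -u user -p password")
    # dest="infile" matches options.infile seen in the traceback;
    # the other dest names are assumptions based on the metavars.
    parser.add_option("-i", "--input", dest="infile",
                      help="read data from input.osm")
    parser.add_option("-u", "--user", dest="user", help="username")
    parser.add_option("-p", "--password", dest="password", help="password")
    parser.add_option("-c", "--comment", dest="comment",
                      help="ChangeSet Comment")
    return parser


# Parse an explicit argument list instead of sys.argv, for illustration.
options, args = build_parser().parse_args(
    ["-i", "input.osm", "-u", "user", "-p", "password", "-c", "Test import"])
```

Each value is then available as an attribute, for example options.infile and options.comment.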

Installation on Ubuntu 9.04

aptitude install python2.5 python-httplib2 python-celementtree python-graph

Execution on Windows

The current version may fail on Windows, where os.rename() refuses to overwrite an existing file, unlike on UNIX-like systems. The symptom is that 2000 nodes are uploaded and the program then aborts:

Created changeset: ######
Uploading to changeset ######
Uploading to changeset ######
Traceback (most recent call last):

 File "bulk_upload.py", line 333, in <module>
   importProcessor.parse(options.infile)
 File "bulk_upload.py", line 103, in parse
   self.addToChangeset(elem)
 File "bulk_upload.py", line 120, in addToChangeset
   self.currentChangeset.addChange(action, elem)
 File "bulk_upload.py", line 209, in addChange
   self.currentDiffSet.addChange(action,item)
 File "bulk_upload.py", line 249, in addChange
   self.upload()
 File "bulk_upload.py", line 276, in upload
   self.idMap.save()
 File "bulk_upload.py", line 150, in save
   os.rename(self.filename+".tmp", self.filename)

WindowsError: [Error 183] Cannot create a file when that file already exists


A quick workaround is to add two lines around this step:

       os.rename(self.filename, self.filename+".old") # ensure prior version is renamed to something else
       os.rename(self.filename+".tmp", self.filename)
       os.remove(self.filename+".old") # once actual rename performed delete old file

A check on the value of os.name (which is "nt" on Windows) would make this more graceful, so that the extra steps only run where they are needed.
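
One way to fold the platform check into a helper is sketched below. This is not code from bulk_upload.py: the function name replace_file and the file name idmap.db are made up for illustration. It assumes a current Python interpreter, where os.replace() (available since Python 3.3) overwrites the destination on every platform, and falls back to the rename-aside approach only on Windows with an older interpreter.

```python
import os
import tempfile


def replace_file(tmp_path, final_path):
    """Move tmp_path over final_path, even where rename() will not overwrite."""
    if hasattr(os, "replace"):
        # Python 3.3+: overwrites the destination on POSIX and Windows alike.
        os.replace(tmp_path, final_path)
    elif os.name == "nt" and os.path.exists(final_path):
        # Older Python on Windows: move the previous version aside first.
        old_path = final_path + ".old"
        if os.path.exists(old_path):
            os.remove(old_path)
        os.rename(final_path, old_path)
        os.rename(tmp_path, final_path)
        os.remove(old_path)
    else:
        # POSIX rename() replaces an existing destination by itself.
        os.rename(tmp_path, final_path)


# Usage: write each new version to "<name>.tmp", then swap it into place.
workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "idmap.db")  # hypothetical file name
for version in ("first", "second"):
    with open(target + ".tmp", "w") as handle:
        handle.write(version)
    replace_file(target + ".tmp", target)
```

The existence check in the Windows branch also covers the very first save, when there is no previous file to move aside.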

Alternative version(s)

bulk_upload_sax.py

For huge imports where the XML file is too big for the default script (e.g. the Corine Land Cover France import, a single 1.4 GB file), a special version was created in July 2009. It replaces DOM-style parsing, which loads the whole file into memory, with SAX parsing. In addition, the translation table from old IDs to new IDs is stored in the *.osm.db file using the Python "shelve" persistence library, for performance reasons.
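
The combination of SAX parsing and a shelve-backed ID table can be illustrated with a small sketch. It is not taken from bulk_upload_sax.py: the class name IdMappingHandler, the key format "node/-1" and the counter that stands in for server-assigned IDs are all assumptions made for the example.

```python
import io
import os
import shelve
import tempfile
import xml.sax


class IdMappingHandler(xml.sax.ContentHandler):
    """Handle elements one at a time, so the file is never fully in memory."""

    def __init__(self, id_map):
        xml.sax.ContentHandler.__init__(self)
        self.id_map = id_map
        self.next_id = 1000  # made-up stand-in for IDs returned by the server

    def startElement(self, name, attrs):
        if name in ("node", "way", "relation"):
            # A real import would store the ID assigned by the API here.
            self.id_map[name + "/" + attrs["id"]] = str(self.next_id)
            self.next_id += 1


SAMPLE = b"""<osm version="0.6">
  <node id="-1" lat="48.0" lon="2.0"/>
  <node id="-2" lat="48.1" lon="2.1"/>
  <way id="-3"><nd ref="-1"/><nd ref="-2"/></way>
</osm>"""

# The shelf persists the old-to-new ID table on disk next to the input file.
db_path = os.path.join(tempfile.mkdtemp(), "sample.osm.db")
id_map = shelve.open(db_path)
xml.sax.parse(io.BytesIO(SAMPLE), IdMappingHandler(id_map))
mapped = dict(id_map)
id_map.close()
```

Because the table lives on disk, it does not have to fit in memory and can be reopened later with shelve.open() on the same path.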

A second version of the script adds an automatic retry on HTTP error 500 ("Internal Server Error"), the most common error that stops bulk imports.

The bulk_upload_sax.py script is available in SVN.
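
The automatic retry on HTTP error 500 mentioned above might look roughly like the following. This is a sketch under stated assumptions, not the logic of the actual script: upload_with_retry, its parameters and the send callable (which is assumed to return a status code and a response body) are hypothetical names.

```python
import time


def upload_with_retry(send, payload, max_attempts=5, delay_seconds=0.0):
    """Call send(payload) until it stops answering with HTTP 500.

    send is a hypothetical callable returning (status_code, body).
    Any status other than 500 is handed back to the caller unchanged.
    """
    for attempt in range(1, max_attempts + 1):
        status, body = send(payload)
        if status != 500:
            return status, body, attempt
        if attempt < max_attempts:
            time.sleep(delay_seconds * attempt)  # simple linear back-off
    raise RuntimeError("still HTTP 500 after %d attempts" % max_attempts)


# Usage with a fake server that fails twice before accepting the upload.
responses = iter([(500, ""), (500, ""), (200, "ok")])
result = upload_with_retry(lambda payload: next(responses), "<osmChange/>")
```

Only status 500 is retried here; other errors are returned immediately so that the caller can decide how to handle them.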