Osmosis
Overview
Osmosis is a command-line Java application for processing OSM data. The tool consists of a series of pluggable components that can be chained together to perform a larger operation. For example, it has components for reading from and writing to databases and files, components for deriving change sets and applying them to data sources, and components for sorting data. It is designed so that new features can be added without rewriting common tasks such as file or database handling.
Some examples of the things it can currently do are:
- Generate planet dumps from a database
- Load planet dumps into a database
- Produce change sets using database history tables
- Apply change sets to a local database
- Compare two planet files and produce a change set
- Re-sort the data contained in planet files
- Extract data inside a bounding box or polygon
For more information about the change sets used, see OsmChange.
Current status
While I (User:Brett) have many ideas for expanding the scope of osmosis, it is in a reasonably complete state where all tasks appear to be working correctly. All bug reports welcome.
I've completed basic documentation now and it's all contained on this wiki. I have no plans to add additional documentation, but can be convinced otherwise if people see a need.
Downloading
The latest version is available at: http://gweb.bretth.com/osmosis-latest.zip
A java5 compatible version is available at: http://gweb.bretth.com/osmosis-0.24.1-java5.zip
If you wish to create a local schema, a schema creation script matching production is available at: http://gweb.bretth.com/osm_schema_latest.sql Note: This script is currently out of date. If anybody can supply me with a version 11 schema script, I'll upload it.
The subversion repository is available at: http://svn.openstreetmap.org/applications/utils/osmosis/
Example Usage
(More on the sub-page Osmosis/Examples.)
Import a planet file into a local MySQL database.
osmosis --read-xml file="planet.osm" --write-mysql host="x" database="x" user="x" password="x"
Export a planet file from a local MySQL database.
osmosis --read-mysql host="x" database="x" user="x" password="x" --write-xml file="planet.osm"
Derive a change set between two planet files.
osmosis --read-xml file="planet1.osm" --read-xml file="planet2.osm" --derive-change --write-xml-change file="planetdiff-1-2.osc"
Derive a change set between a planet file and a database.
osmosis --read-xml file="planet1.osm" --read-mysql host="x" database="x" user="x" password="x" --derive-change --write-xml-change file="planetdiff-1-2.osc"
Apply a change set to a planet file.
osmosis --read-xml file="planet1.osm" --read-xml-change file="planetdiff-1-2.osc" --apply-change --write-xml file="planet2.osm"
Sort the contents of a planet file.
osmosis --read-xml file="data.osm" --sort type="TypeThenId" --write-xml file="data-sorted.osm"
The above examples rely on the default pipe connection feature. A simple command line that reads and then writes a planet file can be written in two ways: the first example below uses the default pipe connection, and the second explicitly connects the two components using a pipe named "mypipe". The default pipe connection will always work as long as the tasks are specified in the correct order.
osmosis --read-xml file="planetin.osm" --write-xml file="planetout.osm"
osmosis --read-xml file="planetin.osm" outPipe.0="mypipe" --write-xml file="planetout.osm" inPipe.0="mypipe"
Extract an area based on a polygon as found on maproom.psu.edu:
osmosis --read-xml file="planet-latest.osm" --bounding-polygon file="country2pts.txt" --write-xml file="germany.osm"
Only 0.5 tasks are available from version 0.22 onwards.
Detailed Usage
This section describes the complete set of command line options available.
Global Options
| Option | Description |
|---|---|
| -v | Specifies that increased logging should be enabled. |
| -vx | x is a non-negative integer specifying the amount of increased logging; 0 is equivalent to the -v option alone. |
| -q | Specifies that reduced logging should be enabled. |
| -qx | x is a non-negative integer specifying the amount of reduced logging; 0 is equivalent to the -q option alone. |
Default Arguments
Some tasks accept unnamed or "default" arguments. In a task's description, the name of such an argument is followed by "(default)".
For example, the --read-xml task has a file argument which may be unnamed. The following two command lines are equivalent.
osmosis --read-xml file=myfile.osm --write-null
osmosis --read-xml myfile.osm --write-null
Tasks
Only 0.5 tasks are available as of version 0.22, so all tasks default to the 0.5 version. 0.5 tasks can be explicitly specified by adding a "-0.5" suffix.
--read-mysql (--rm)
Reads the contents of a MySQL database at a specific point in time.
| Pipe | Description |
|---|---|
| outPipe.0 | Produces an entity stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| authFile | The name of the file containing database login credentials (see the Database Login Credentials section for more information). | | N/A |
| host | The database host server. | | osm |
| database | The database instance. | | osm |
| user | The database user name. | | osm |
| password | The database password. | | (blank) |
| validateSchemaVersion | If yes is specified, the task will abort if the database schema version is not supported. | yes, no | yes |
| readAllUsers | If set to yes, the user public edit flag will be ignored and user information will be attached to every entity. | yes, no | no |
| snapshotInstant | Defines the point in time for which to produce a data snapshot. | format is "yyyy-MM-dd_HH:mm:ss" | (now) |
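The snapshotInstant value has to follow the yyyy-MM-dd_HH:mm:ss pattern, which is also the pattern used by intervalBegin and intervalEnd of --read-mysql-change. As a minimal sketch, such a string could be produced as follows. Python is used purely for illustration, the helper name `format_instant` is hypothetical, and this page does not state whether the value is interpreted as local time or UTC.

```python
from datetime import datetime


def format_instant(moment: datetime) -> str:
    """Format a datetime using the yyyy-MM-dd_HH:mm:ss pattern (hypothetical helper)."""
    return moment.strftime("%Y-%m-%d_%H:%M:%S")


# Value that could be passed as snapshotInstant="..." on the command line.
print(format_instant(datetime(2008, 3, 1, 12, 30, 0)))  # prints 2008-03-01_12:30:00
```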
--read-mysql-current (--rmcur)
Reads the current contents of a MySQL database. Note that this task cannot be used as a starting point for replication because it does not produce a consistent snapshot.
| Pipe | Description |
|---|---|
| outPipe.0 | Produces an entity stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| authFile | The name of the file containing database login credentials (see the Database Login Credentials section for more information). | | N/A |
| host | The database host server. | | osm |
| database | The database instance. | | osm |
| user | The database user name. | | osm |
| password | The database password. | | (blank) |
| validateSchemaVersion | If yes is specified, the task will abort if the database schema version is not supported. | yes, no | yes |
| readAllUsers | If set to yes, the user public edit flag will be ignored and user information will be attached to every entity. | yes, no | no |
--write-mysql (--wm)
Populates an empty MySQL database.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes an entity stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| authFile | The name of the file containing database login credentials (see the Database Login Credentials section for more information). | | N/A |
| host | The database host server. | | osm |
| database | The database instance. | | osm |
| user | The database user name. | | osm |
| password | The database password. | | (blank) |
| validateSchemaVersion | If yes is specified, the task will abort if the database schema version is not supported. | yes, no | yes |
| lockTables | If yes is specified, tables will be locked during the import. This provides measurable performance improvements but prevents concurrent queries. | yes, no | yes |
| populateCurrentTables | If yes is specified, the current tables will be populated after the initial history table population. If only history tables are required, this reduces the import time by approximately 80%. | yes, no | yes |
--read-xml (--rx)
Reads the current contents of an OSM XML file.
| Pipe | Description |
|---|---|
| outPipe.0 | Produces an entity stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| file (default) | The name of the OSM file to be read; "-" means STDIN. | | dump.osm |
| enableDateParsing | If set to yes, the dates in the osm xml file will be parsed, otherwise all dates will be set to a single time approximately equal to application startup. Setting this to no results in major performance improvements due to the overhead of XML date parsing. | yes, no | yes |
| compressionMethod | Specifies the compression method that has been used to compress the file. In most cases this isn't required because the compression method will be automatically determined from the file name (*.gz=gzip, *.bz2=bzip2). | none, gzip, bzip2 | none |
--write-xml (--wx)
Writes data to an OSM XML file.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes an entity stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| file (default) | The name of the OSM file to be written; "-" means STDOUT. | | dump.osm |
| compressionMethod | Specifies the compression method to use when writing the file. In most cases this isn't required because the compression method will be automatically determined from the file name (*.gz=gzip, *.bz2=bzip2). | none, gzip, bzip2 | none |
--bounding-box (--bb)
Extracts data within a specific bounding box defined by lat/lon coordinates.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes an entity stream. |
| outPipe.0 | Produces an entity stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| left | The longitude of the left edge of the box. | -180 to 180 | -180 |
| right | The longitude of the right edge of the box. | -180 to 180 | 180 |
| top | The latitude of the top edge of the box. | -90 to 90 | 90 |
| bottom | The latitude of the bottom edge of the box. | -90 to 90 | -90 |
| completeWays | Include all available nodes for ways which have at least one node in the bounding box. | yes, no | no |
| completeRelations | Include all available relations which are members of relations which have at least one member in the bounding box. | yes, no | no |
| idTrackerType | Specifies the memory mechanism for tracking selected ids. BitSet is more efficient for very large bounding boxes (where the node count is greater than 1/32 of the maximum node id); IdList is more efficient for all smaller bounding boxes. Note: when processing JOSM output with negative object IDs, select BitSet, because IdList doesn't work. (This sounds illogical and should be the other way round, but I've tried it.) | BitSet, IdList | IdList |
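The 1/32 threshold quoted for idTrackerType can be checked with some simple arithmetic. The sketch below assumes that a BitSet tracker needs one bit for every possible id up to the maximum node id and that an IdList tracker needs 32 bits for every selected id. These storage costs are inferred from the 1/32 figure, not taken from the Osmosis source, and the maximum node id used is a made-up number.

```python
def bitset_bytes(max_node_id: int) -> int:
    """Memory for one bit per possible id from 0 to max_node_id (assumed layout)."""
    return (max_node_id + 8) // 8  # ceil((max_node_id + 1) / 8)


def idlist_bytes(selected_nodes: int) -> int:
    """Memory for one 32-bit value per selected id (assumed layout)."""
    return selected_nodes * 4


max_id = 320_000_000       # hypothetical highest node id in the input
break_even = max_id // 32  # selected-node count at which both trackers cost about the same

for selected in (break_even // 10, break_even, break_even * 10):
    cheaper = "IdList" if idlist_bytes(selected) < bitset_bytes(max_id) else "BitSet"
    print(selected, "selected nodes:", cheaper, "needs less memory")
```

Under these assumptions the BitSet cost is fixed by the highest id, while the IdList cost grows with the size of the extract, which is why IdList suits small areas.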
--bounding-polygon (--bp)
Extracts data within a polygon defined by a series of lat/lon coordinates loaded from a polygon file.
The format of the polygon file is described at the MapRoom website, with two exceptions:
- A special extension has been added to this task to support negative polygons. These are defined by adding a "!" character before the name of a polygon header within the file.
- The first coordinate pair in the polygon definition is not, as defined on the MapRoom site, the polygon centroid; it is the first polygon point. Osmosis neither requires nor expects the centroid coordinates; if they are present they won't break anything, but they are counted as part of the polygon outline.
An explicit example is provided on the Polygon filter file format page.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes an entity stream. |
| outPipe.0 | Produces an entity stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| file | The file containing the polygon definition. | | polygon.txt |
| completeWays | Include all available nodes for ways which have at least one node in the bounding polygon. | yes, no | no |
| completeRelations | Include all available relations which are members of relations which have at least one member in the bounding polygon. | yes, no | no |
| idTrackerType | Specifies the memory mechanism for tracking selected ids. BitSet is more efficient for very large areas (where the node count is greater than 1/32 of the maximum node id); IdList is more efficient for all smaller areas. Note: when processing JOSM output with negative object IDs, select BitSet, because IdList doesn't work. (This sounds illogical and should be the other way round, but I've tried it.) | BitSet, IdList | IdList |
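To make the polygon file description more concrete, here is a hedged sketch that generates such a file. The overall layout (a name line, a header line per polygon, one "longitude latitude" pair per line, an END line after each polygon and a final END line) is an assumption based on the MapRoom format and should be checked against the Polygon filter file format page. The function name and the sample coordinates are hypothetical. The "!" prefix for negative polygons and the absence of a centroid line follow the two exceptions listed above.

```python
def polygon_file_text(name, polygons):
    """Build the text of a polygon file.

    polygons is a list of (header, points, negative) tuples, where points is a
    list of (lon, lat) pairs and negative marks an area to be excluded.
    """
    lines = [name]
    for header, points, negative in polygons:
        # A leading "!" marks a negative polygon (Osmosis extension).
        lines.append(("!" if negative else "") + header)
        # No centroid line: the first pair is already the first outline point.
        for lon, lat in points:
            lines.append(f"    {lon}    {lat}")
        lines.append("END")
    lines.append("END")
    return "\n".join(lines) + "\n"


outer = [(10.0, 50.0), (12.0, 50.0), (12.0, 52.0), (10.0, 52.0), (10.0, 50.0)]
hole = [(10.5, 50.5), (11.0, 50.5), (11.0, 51.0), (10.5, 51.0), (10.5, 50.5)]
print(polygon_file_text("example_area", [("1", outer, False), ("2", hole, True)]))
```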
--derive-change (--dc)
Compares two data sources and produces a changeset of the differences.
Note that this task requires both input streams to be sorted first by type then by id.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes an entity stream. |
| inPipe.1 | Consumes an entity stream. |
| outPipe.0 | Produces a change stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| no arguments |
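The sorting requirement exists because two sorted streams can be compared in a single pass without holding either in memory. The following is a minimal sketch of that idea, not the Osmosis implementation: entities are modelled as hypothetical `(type, id, version)` tuples, and the action names are illustrative.

```python
TYPE_ORDER = {"node": 0, "way": 1, "relation": 2}


def sort_key(entity):
    """Order by entity type first, then by id."""
    return (TYPE_ORDER[entity[0]], entity[1])


def derive_change(old, new):
    """Yield (action, entity) pairs describing how to turn old into new.

    Both inputs must already be sorted by type, then by id.
    """
    old_it, new_it = iter(old), iter(new)
    o, n = next(old_it, None), next(new_it, None)
    while o is not None or n is not None:
        if n is None or (o is not None and sort_key(o) < sort_key(n)):
            yield ("delete", o)          # present only in the old data
            o = next(old_it, None)
        elif o is None or sort_key(n) < sort_key(o):
            yield ("create", n)          # present only in the new data
            n = next(new_it, None)
        else:
            if o != n:
                yield ("modify", n)      # same type and id, different content
            o, n = next(old_it, None), next(new_it, None)


old = [("node", 1, 1), ("node", 2, 1), ("way", 5, 1)]
new = [("node", 1, 1), ("node", 2, 2), ("node", 3, 1)]
for change in derive_change(old, new):
    print(change)
```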
--apply-change (--ac)
Applies a change stream to a data stream.
Note that this task requires both input streams to be sorted first by type then by id.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes an entity stream. |
| inPipe.1 | Consumes a change stream. |
| outPipe.0 | Produces an entity stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| no arguments |
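Because both inputs are sorted the same way, a change stream can be applied in one pass over the data. The sketch below illustrates the principle only and is not the Osmosis implementation: entities are hypothetical `(type, id, version)` tuples and each change is an `(action, entity)` pair with illustrative action names.

```python
TYPE_ORDER = {"node": 0, "way": 1, "relation": 2}


def sort_key(entity):
    """Order by entity type first, then by id."""
    return (TYPE_ORDER[entity[0]], entity[1])


def apply_change(entities, changes):
    """Yield the entity stream that results from applying changes.

    Both inputs must already be sorted by type, then by id.
    """
    e_it, c_it = iter(entities), iter(changes)
    e, c = next(e_it, None), next(c_it, None)
    while e is not None or c is not None:
        if c is None or (e is not None and sort_key(e) < sort_key(c[1])):
            yield e                      # untouched entity passes through
            e = next(e_it, None)
        else:
            action, changed = c
            if action != "delete":
                yield changed            # created or modified entity
            if e is not None and sort_key(e) == sort_key(changed):
                e = next(e_it, None)     # drop the superseded original
            c = next(c_it, None)


data = [("node", 1, 1), ("node", 2, 1), ("way", 5, 1)]
diff = [("modify", ("node", 2, 2)), ("create", ("node", 3, 1)), ("delete", ("way", 5, 1))]
print(list(apply_change(data, diff)))
```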
--read-xml-change (--rxc)
Reads the contents of an OSM XML change file.
| Pipe | Description |
|---|---|
| outPipe.0 | Produces a change stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| file (default) | The name of the OSM change file to be read; "-" means STDIN. | | change.osc |
| enableDateParsing | If set to yes, the dates in the osm xml file will be parsed, otherwise all dates will be set to a single time approximately equal to application startup. Setting this to no results in major performance improvements due to the overhead of XML date parsing. | yes, no | yes |
| compressionMethod | Specifies the compression method that has been used to compress the file. In most cases this isn't required because the compression method will be automatically determined from the file name (*.gz=gzip, *.bz2=bzip2). | none, gzip, bzip2 | none |
--write-xml-change (--wxc)
Writes changes to an OSM XML change file.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes a change stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| file (default) | The name of the OSM change file to be written; "-" means STDOUT. | | change.osc |
| compressionMethod | Specifies the compression method to use when writing the file. In most cases this isn't required because the compression method will be automatically determined from the file name (*.gz=gzip, *.bz2=bzip2). | none, gzip, bzip2 | none |
--read-mysql-change (--rmc)
Reads the changes made to a MySQL database within a specified time interval.
| Pipe | Description |
|---|---|
| outPipe.0 | Produces a change stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| authFile | The name of the file containing database login credentials (see the Database Login Credentials section for more information). | | N/A |
| host | The database host server. | | osm |
| database | The database instance. | | osm |
| user | The database user name. | | osm |
| password | The database password. | | (blank) |
| validateSchemaVersion | If yes is specified, the task will abort if the database schema version is not supported. | yes, no | yes |
| readAllUsers | If set to yes, the user public edit flag will be ignored and user information will be attached to every entity. | yes, no | no |
| intervalBegin | Defines the beginning of the interval for which to produce a change set. | format is "yyyy-MM-dd_HH:mm:ss" | (1970) |
| intervalEnd | Defines the end of the interval for which to produce a change set. | format is "yyyy-MM-dd_HH:mm:ss" | (now) |
--write-mysql-change (--wmc)
Applies a changeset to an existing populated MySQL database.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes a change stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| authFile | The name of the file containing database login credentials (see the Database Login Credentials section for more information). | | N/A |
| host | The database host server. | | osm |
| database | The database instance. | | osm |
| user | The database user name. | | osm |
| password | The database password. | | (blank) |
| validateSchemaVersion | If yes is specified, the task will abort if the database schema version is not supported. | yes, no | yes |
| populateCurrentTables | If yes is specified, the current tables will be populated after the initial history table population. This is useful if only history tables were populated during import. | yes, no | yes |
--truncate-mysql (--tm)
Truncates all current and history tables in a MySQL database.
| Pipe | Description |
|---|---|
| no pipes |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| authFile | The name of the file containing database login credentials (see the Database Login Credentials section for more information). | | N/A |
| host | The database host server. | | osm |
| database | The database instance. | | osm |
| user | The database user name. | | osm |
| password | The database password. | | (blank) |
| validateSchemaVersion | If yes is specified, the task will abort if the database schema version is not supported. | yes, no | yes |
--write-null (--wn)
Discards all input data. This is useful for osmosis performance testing and for testing the integrity of input files.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes an entity stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| no arguments |
--write-null-change (--wnc)
Discards all input change data. This is useful for osmosis performance testing and for testing the integrity of input files.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes a change stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| no arguments |
--sort (--s)
Sorts all data in an entity stream according to a specified ordering. This uses a file-based merge sort keeping memory usage to a minimum and allowing arbitrarily large data sets to be sorted.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes an entity stream. |
| outPipe.0 | Produces an entity stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| type (default) | The ordering to apply to the data. | | TypeThenId |
--sort-change (--sc)
Sorts all data in a change stream according to a specified ordering. This uses a file-based merge sort keeping memory usage to a minimum and allowing arbitrarily large data sets to be sorted.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes a change stream. |
| outPipe.0 | Produces a change stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| type (default) | The ordering to apply to the data. | | streamable |
--buffer (--b)
Allows the pipeline processing to be split across multiple threads. The thread for the input task will post data into a buffer of fixed capacity and block when the buffer fills. This task creates a new thread that reads from the buffer and blocks if no data is available. This is useful if multiple CPUs are available and multiple tasks consume significant CPU.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes an entity stream. |
| outPipe.0 | Produces an entity stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| bufferCapacity (default) | The size of the storage buffer. This is defined in terms of the number of entity objects to be stored. An entity corresponds to an OSM type such as a node. | | 100 |
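The hand-off described for --buffer is the classic bounded producer/consumer pattern. As a rough illustration only (Osmosis itself is written in Java, and the function and parameter names here are hypothetical), the behaviour can be sketched like this:

```python
import queue
import threading


def run_buffered(entities, consume, buffer_capacity=100):
    """Pass entities to consume() on a second thread through a bounded buffer."""
    buffer = queue.Queue(maxsize=buffer_capacity)
    done = object()  # sentinel marking the end of the stream

    def reader():
        while True:
            item = buffer.get()      # blocks while the buffer is empty
            if item is done:
                break
            consume(item)

    worker = threading.Thread(target=reader)
    worker.start()
    for entity in entities:
        buffer.put(entity)           # blocks while the buffer is full
    buffer.put(done)
    worker.join()


received = []
run_buffered(range(1000), received.append, buffer_capacity=10)
print(len(received), "entities passed through the buffer")
```

A small capacity keeps memory use low but makes the two threads wait on each other more often; a larger one smooths out bursts at the cost of memory.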
--buffer-change (--bc)
As per --buffer but for a change stream.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes a change stream. |
| outPipe.0 | Produces a change stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| bufferCapacity (default) | The size of the storage buffer. This is defined in terms of the number of change objects to be stored. A change object consists of a single entity with an associated action. | | 100 |
--merge (--m)
Merges the contents of two data sources together.
Note that this task requires both input streams to be sorted first by type then by id.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes an entity stream. |
| inPipe.1 | Consumes an entity stream. |
| outPipe.0 | Produces an entity stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| conflictResolutionMethod | The method to use for resolving conflicts between data from the two sources. | | timestamp |
--merge-change (--mc)
Merges the contents of two changesets together.
Note that this task requires both input streams to be sorted first by type then by id.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes a change stream. |
| inPipe.1 | Consumes a change stream. |
| outPipe.0 | Produces a change stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| conflictResolutionMethod | The method to use for resolving conflicts between data from the two sources. | | timestamp |
--read-api (--ra)
Retrieves the contents of a bounding box from the API. This is subject to the bounding box size limitations imposed by the API.
| Pipe | Description |
|---|---|
| outPipe.0 | Produces an entity stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| left | The longitude of the left edge of the box. | -180 to 180 | -180 |
| right | The longitude of the right edge of the box. | -180 to 180 | 180 |
| top | The latitude of the top edge of the box. | -90 to 90 | 90 |
| bottom | The latitude of the bottom edge of the box. | -90 to 90 | -90 |
| url | The URL of the API server. | | http://www.openstreetmap.org/api/0.5 |
--report-entity (--re)
Produces a summary report of each entity type and the users that last modified them.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes an entity stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| file (default) | The file to write the report to. | | entity-report.txt |
--report-integrity (--ri)
Produces a list of the referential integrity issues in the data source.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes an entity stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| file (default) | The file to write the report to. | | integrity-report.txt |
--write-pgsql (--wp)
Populates an empty PostgreSQL database. Note that this task is NOT designed for writing to a PostGIS database; rather, it is aimed at writing to a schema equivalent to that provided by the main MySQL database.
This task is experimental and not complete. It is a testbed for playing with PostgreSQL as a replacement for the main OSM database. If anybody is interested in this, I can send the current schema I've been working with. This is currently a low priority task being worked on in the background.
The major aims of this are:
- Full database transaction support for all operations
- Eliminate misalignment of current and history tables by removing data duplication, may help to reduce space
- Referential integrity checking (ie. foreign key constraints)
Other things that would be nice from an operational point of view are:
- Online index rebuilding
- Consistent snapshots of tables without locking
- Table partitioning
What it won't do:
- It probably won't reduce overall disk space due to a possible increased number of indexes.
- Performance is unlikely to be improved due to additional referential integrity checks.
- Enhance the API. The aim is to remain compatible with the existing API to keep the changes minimal.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes an entity stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| authFile | The name of the file containing database login credentials (see the Database Login Credentials section for more information). | | N/A |
| host | The database host server. | | osm |
| database | The database instance. | | osm |
| user | The database user name. | | osm |
| password | The database password. | | (blank) |
| validateSchemaVersion | If yes is specified, the task will abort if the database schema version is not supported. | yes, no | yes |
| lockTables | If yes is specified, tables will be locked during the import. This provides measurable performance improvements but prevents concurrent queries. | yes, no | yes |
| populateCurrentTables | If yes is specified, the current tables will be populated after the initial history table population. If only history tables are required, this reduces the import time by approximately 80%. | yes, no | yes |
--truncate-pgsql (--tp)
Truncates all data in a PostgreSQL database.
| Pipe | Description |
|---|---|
| no pipes |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| authFile | The name of the file containing database login credentials (see the Database Login Credentials section for more information). | | N/A |
| host | The database host server. | | osm |
| database | The database instance. | | osm |
| user | The database user name. | | osm |
| password | The database password. | | (blank) |
| validateSchemaVersion | If yes is specified, the task will abort if the database schema version is not supported. | yes, no | yes |
--log-progress (--lp)
Logs progress information using JDK logging at info level at regular intervals. This can be inserted into the pipeline to allow the progress of long-running tasks to be tracked.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes an entity stream. |
| outPipe.0 | Produces an entity stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| interval | The time interval between updates in seconds. | | 5 |
--log-change-progress (--lcp)
Logs progress of a change stream using JDK logging at info level at regular intervals. This can be inserted into the pipeline to allow the progress of long-running tasks to be tracked.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes a change stream. |
| outPipe.0 | Produces a change stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| interval | The time interval between updates in seconds. | | 5 |
--tee (--t)
Receives a single stream of data and sends it to multiple destinations. This is useful if you wish to read a single source of data and apply multiple operations on it.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes an entity stream. |
| outPipe.0 | Produces an entity stream. |
| ... | |
| outPipe.n-1 (where n is the number of outputs specified) | Produces an entity stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| outputCount (default) | The number of destinations to write this data to. | | 2 |
--tee-change (--tc)
Receives a single stream of change data and sends it to multiple destinations. This is useful if you wish to read a single source of change data and apply multiple operations on it.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes a change stream. |
| outPipe.0 | Produces a change stream. |
| ... | |
| outPipe.n-1 (where n is the number of outputs specified) | Produces a change stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| outputCount (default) | The number of destinations to write this data to. | | 2 |
--write-pgsql-simple (--wps)
Populates an empty PostGIS database with a "simple" schema. A schema creation script is available in the osmosis script directory.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes an entity stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| authFile | The name of the file containing database login credentials (see the Database Login Credentials section for more information). | | N/A |
| host | The database host server. | | osm |
| database | The database instance. | | osm |
| user | The database user name. | | osm |
| password | The database password. | | (blank) |
| validateSchemaVersion | If yes is specified, the task will abort if the database schema version is not supported. | yes, no | yes |
--write-pgsql-simple-dump (--wpsd)
Writes a set of data files suitable for loading a PostGIS database with a "simple" schema using COPY statements. A schema creation script is available in the osmosis script directory. A load script is also available which will invoke the COPY statements and update all indexes and special index support columns appropriately.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes an entity stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| directory | The name of the directory to write the data files into. | | pgimport |
--truncate-pgsql-simple (--tps)
Truncates all current and history tables in a PostGIS database with a "simple" schema.
| Pipe | Description |
|---|---|
| no pipes |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| authFile | The name of the file containing database login credentials (see the Database Login Credentials section for more information). | | N/A |
| host | The database host server. | | osm |
| database | The database instance. | | osm |
| user | The database user name. | | osm |
| password | The database password. | | (blank) |
| validateSchemaVersion | If yes is specified, the task will abort if the database schema version is not supported. | yes, no | yes |
--read-pgsql-simple (--rm)
Reads the contents of a PostGIS database with a "simple" schema.
| Pipe | Description |
|---|---|
| outPipe.0 | Produces a dataset. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| authFile | The name of the file containing database login credentials (see the Database Login Credentials section for more information). | | N/A |
| host | The database host server. | | osm |
| database | The database instance. | | osm |
| user | The database user name. | | osm |
| password | The database password. | | (blank) |
| validateSchemaVersion | If yes is specified, the task will abort if the database schema version is not supported. | yes, no | yes |
--dataset-bounding-box (--dbb)
Extracts data within a specific bounding box defined by lat/lon coordinates. This differs from the --bounding-box task in that it operates on a dataset instead of an entity stream, in other words it uses the features of the underlying database to perform a spatial query instead of examining all nodes in a complete stream.
This implementation will never clip ways at box boundaries, and depending on the underlying implementation may detect ways crossing a box without having any nodes within that box.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes a dataset. |
| outPipe.0 | Produces an entity stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| left | The longitude of the left edge of the box. | -180 to 180 | -180 |
| right | The longitude of the right edge of the box. | -180 to 180 | 180 |
| top | The latitude of the top edge of the box. | -90 to 90 | 90 |
| bottom | The latitude of the bottom edge of the box. | -90 to 90 | -90 |
| completeWays | Include all nodes for all included ways. | yes, no | no |
--dataset-dump (--dd)
Converts an entire dataset to an entity stream.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes a dataset. |
| outPipe.0 | Produces an entity stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| no arguments |
--way-key-value (--wkv)
Given a list of "key.value" tags, this filter passes on only those ways that have at least one of those tags set.
Note that this filter only operates on ways. All nodes and relations are passed on unmodified.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes an entity stream. |
| outPipe.0 | Produces an entity stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| keyValueList | Comma-separated list of desired key.value combinations. | | highway.motorway,highway.motorway_link,highway.trunk,highway.trunk_link |
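The filtering rule can be expressed in a few lines. This is a sketch of the described behaviour rather than the actual task: entities are modelled as hypothetical dictionaries with `type`, `id` and `tags` keys, and the function name is made up for the example.

```python
def way_key_value(entities, key_value_list):
    """Keep ways carrying at least one listed key.value tag; pass everything else through."""
    wanted = set()
    for pair in key_value_list.split(","):
        key, _, value = pair.partition(".")   # split "highway.motorway" at the first dot
        wanted.add((key, value))
    for entity in entities:
        if entity["type"] != "way" or wanted & set(entity["tags"].items()):
            yield entity


stream = [
    {"type": "node", "id": 1, "tags": {}},
    {"type": "way", "id": 10, "tags": {"highway": "motorway"}},
    {"type": "way", "id": 11, "tags": {"highway": "residential"}},
    {"type": "relation", "id": 20, "tags": {"type": "route"}},
]
kept = list(way_key_value(stream, "highway.motorway,highway.trunk"))
print([(e["type"], e["id"]) for e in kept])
```

Note that nodes belonging only to dropped ways are still passed on by this filter, which is where a task such as --used-node becomes useful.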
--used-node (--un)
Restricts output of nodes to those that are used in ways.
| Pipe | Description |
|---|---|
| inPipe.0 | Consumes an entity stream. |
| outPipe.0 | Produces an entity stream. |
| Option | Description | Valid Values | Default Value |
|---|---|---|---|
| no arguments |
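The effect of this task can be sketched as follows. The example assumes hypothetical dictionary entities in which each way lists its node ids under a `nodes` key, and it buffers the whole stream in memory because ways arrive after nodes in a sorted stream; the real task presumably handles large inputs differently.

```python
def used_nodes(entities):
    """Drop nodes that no way refers to; ways and relations pass through."""
    entities = list(entities)  # buffered, since ways follow nodes in a sorted stream
    referenced = {
        ref
        for entity in entities
        if entity["type"] == "way"
        for ref in entity["nodes"]
    }
    return [e for e in entities if e["type"] != "node" or e["id"] in referenced]


stream = [
    {"type": "node", "id": 1},
    {"type": "node", "id": 2},
    {"type": "node", "id": 3},
    {"type": "way", "id": 10, "nodes": [1, 3]},
]
print([(e["type"], e["id"]) for e in used_nodes(stream)])
```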
Database Login Credentials
All database tasks accept a minimum of five arguments. These are:
- authFile
- host
- database
- user
- password
If no arguments are passed, then the default values for host, database, user and password apply.
If authFile is supplied, it must point to a properties file with name-value pairs specifying host, database, user and password. For example:
host=localhost
database=osm
user=osm
password=mypassword
Note that the properties file doesn't have to contain all four parameters, it may contain only the password leaving other parameters to be specified on the command line separately.
Command line arguments override the authFile parameters, which in turn override the default argument values.
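The precedence rule can be illustrated with a small sketch. It uses a deliberately simplified reader for `name=value` lines (the Java properties format supports more syntax than this), the default values are copied from the task tables on this page, and the function names are hypothetical.

```python
DEFAULTS = {"host": "osm", "database": "osm", "user": "osm", "password": ""}


def parse_auth_file(text):
    """Parse simple name=value lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            name, _, value = line.partition("=")
            settings[name.strip()] = value.strip()
    return settings


def resolve_credentials(auth_file_text, command_line):
    """Defaults first, then authFile values, then command line arguments."""
    resolved = dict(DEFAULTS)
    resolved.update(parse_auth_file(auth_file_text))
    resolved.update(command_line)
    return resolved


auth_file = "user=fileuser\npassword=mypassword\n"
print(resolve_credentials(auth_file, {"user": "cliuser", "host": "localhost"}))
```

In this example the password comes from the file, the user and host come from the command line, and the database falls back to its default.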
Notes
The minimum supported Java version is 1.6. Osmosis uses generics and java.util.concurrent features, which require at least Java 1.5, and it also makes use of some java.awt.geom classes that only appeared in 1.6. If the latter are rewritten to use 1.5 functionality, it may be possible to return to the previous minimum of 1.5.
A number of tasks produce temporary files that contain serialised Java classes to avoid using too much RAM. These files are gzipped to reduce disk space and, in many cases, to improve performance through reduced IO. This will fail on some Java installations where the uncompressed data exceeds 2GB. This is caused by [Bug 5092263], which was fixed in Sun JDK 5.0u8(b01) and Sun JDK 6.0. It affects the IBM JDK 1.5.0 included with IBM Rational tools such as Rational Software Architect. Upgrading to a later JDK will fix this problem. The only other option is to modify the source code to use uncompressed temporary files; contact me if this is required.
The standard Java GZIP implementation cannot support multiple gzip streams concatenated into a single file. This may be required if an osm file has been built from many parts. This problem is documented by [Bug 4691425]. I may attempt to fix this problem by incorporating the workaround included in the comments.
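For readers unfamiliar with concatenated gzip streams, the sketch below builds such a file in memory and reads it back one stream at a time. It is written in Python purely to show the file structure and the general idea of restarting decompression on the leftover bytes; it is not the Java workaround referred to in the bug comments, and the function name is hypothetical.

```python
import gzip
import zlib


def read_concatenated_gzip(data: bytes) -> bytes:
    """Decompress every gzip stream found back to back in data."""
    parts = []
    while data:
        # MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
        stream = zlib.decompressobj(zlib.MAX_WBITS | 16)
        parts.append(stream.decompress(data))
        parts.append(stream.flush())
        data = stream.unused_data  # bytes belonging to the next stream, if any
    return b"".join(parts)


# Two independently compressed parts joined into one file, as might happen
# when an osm file is assembled from several pieces.
combined = gzip.compress(b"<part1/>") + gzip.compress(b"<part2/>")
print(read_concatenated_gzip(combined))
```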
The built-in bzip2 (.bz2) support is much slower than the built-in gzip (.gz) support, because Java's gzip implementation uses native code to improve performance. For larger planet files, it is suggested to use the platform's native bzip2 implementation and have Osmosis read from /dev/stdin and write to /dev/stdout.

