Osmsync
Osmsync is a library that helps keep an external dataset synchronized with OSM. Osmsync loads data from the external source, loads the matching data from OSM, and prepares a "diff", or difference report. It then produces a JOSM-compatible changeset for human review prior to upload.
Osmsync applies to only a few select situations, but it works well in them. Within those situations it is flexible: it can be used for the initial import, for adding records later, for altering or renaming keys, or for feeding community edits back to the external source.
Scripted imports and automated edits should only be carried out by those with experience and understanding of the way the OpenStreetMap community creates maps, and only with careful planning and consultation with the local community. See Import/Guidelines and Automated Edits/Code of Conduct for more information.
Basic Operation
Osmsync loads both datasets and compares them. Certain keys are defined as "master" in the external dataset. In the case of a chain store this might be the hours of operation or the phone number of the store. Other keys are mastered in OSM. In almost all cases OSM is master for the exact coordinates, though an osmsync plugin may flag locations that differ by more than 100 meters. Thus OSM mappers can significantly enhance osmsync data without interfering with the conflation or with data freshness updates.
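The master-key rule described above can be sketched roughly as follows. This is a minimal illustration, not osmsync's actual code: `merge_node`, `haversine_m`, and the `max_offset_m` parameter are hypothetical names, and the node dictionaries are assumed to use the `lat`, `lon`, and `tag` fields seen in the example control file.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    radius = 6371000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))


def merge_node(source_node, osm_node, source_is_master_for, max_offset_m=100.0):
    """Hypothetical merge step: returns (merged_node, changed, moved).

    Tags listed in source_is_master_for are copied from the external record;
    every other tag, and the coordinates, are kept from the OSM object.
    """
    merged_tags = dict(osm_node['tag'])
    for key in source_is_master_for:
        if key in source_node['tag']:
            merged_tags[key] = source_node['tag'][key]

    changed = merged_tags != osm_node['tag']

    # OSM stays master for the position; a large offset is only flagged.
    distance = haversine_m(float(source_node['lat']), float(source_node['lon']),
                           float(osm_node['lat']), float(osm_node['lon']))
    merged = {'id': osm_node['id'], 'lat': osm_node['lat'],
              'lon': osm_node['lon'], 'tag': merged_tags}
    return merged, changed, distance > max_offset_m
```

Under these assumptions, a tag added by a mapper (for example `name`) survives every run, because only the keys listed as source-mastered are ever overwritten.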
Example datasets include:
- Car sharing locations, sourced from the car sharing reservation system, with the number of vehicles and the vehicle types at each location.
- Government stream flow gauges, with a link to real time flow data.
- Current bus stop locations (but see also the gtfs-osm-sync project).
How it works
Osmsync stores a "conflation key" with each OSM object, and uses it to match up data later:
source:pkey=xxxx
source=osmsync:yyyy
The source:pkey tag documents the primary key in the original dataset (if there is no primary key in the original dataset a hash is used instead). If these keys are left alone by future mappers, the import will proceed smoothly. Deletion of these keys can lead to duplicates.
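A rough sketch of how matching on the conflation key could work is shown here. It is an illustration under stated assumptions rather than osmsync's implementation: `conflation_key` and `match_by_pkey` are hypothetical helpers, and the SHA-1 fallback is only one plausible choice, since the actual hash scheme is not described on this page.

```python
import hashlib


def conflation_key(record, pkey_field=None):
    """Return the record's primary key, or a content hash if it has none."""
    if pkey_field and record.get(pkey_field):
        return str(record[pkey_field])
    # Assumption: hash the sorted fields so the key is stable across runs.
    canonical = '|'.join('%s=%s' % (k, record[k]) for k in sorted(record))
    return hashlib.sha1(canonical.encode('utf-8')).hexdigest()


def match_by_pkey(sourcenodes, osm_nodes):
    """Pair source records with OSM objects via the source:pkey tag.

    sourcenodes: dict mapping pkey -> source node
    osm_nodes:   list of OSM node dicts, each with a 'tag' dict
    Returns (matched, to_create, orphaned), all keyed by pkey.
    """
    osm_by_pkey = {}
    for node in osm_nodes:
        pkey = node['tag'].get('source:pkey')
        if pkey is not None:
            osm_by_pkey[pkey] = node

    matched = {}
    to_create = {}
    for pkey, node in sourcenodes.items():
        if pkey in osm_by_pkey:
            matched[pkey] = (node, osm_by_pkey[pkey])
        else:
            # No OSM object carries this key, so a new object is proposed.
            to_create[pkey] = node

    # OSM objects whose key no longer appears in the external dataset.
    orphaned = {pkey: node for pkey, node in osm_by_pkey.items()
                if pkey not in sourcenodes}
    return matched, to_create, orphaned
```

The sketch also shows why deleting the key causes duplicates: an OSM object without `source:pkey` never enters `osm_by_pkey`, so its source record falls into `to_create` and would be proposed as a new object.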
Example Control File
#!/usr/bin/python
##
## osmsync module to import USGS water data (specifically stream gauging
## stations)
## Written for Python 2 (urllib2).
import urllib2
from xml.etree import ElementTree

from osmsync import osmsync

class osmsync_usgs_waterdata(osmsync):
    # Sample waterdata as supplied:
    #   <site>
    #     <agency_cd>USGS</agency_cd>
    #     <site_no>09423350</site_no>
    #     <station_nm>CARUTHERS C NR IVANPAH CA</station_nm>
    #     <site_tp_cd>ST</site_tp_cd>
    #     <dec_lat_va>35.24498915</dec_lat_va><dec_long_va>-115.29887590</dec_long_va>
    #     <coord_acy_cd>F</coord_acy_cd>
    #     <dec_lat_long_datum_cd>NAD83</dec_lat_long_datum_cd>
    #   </site>
    def fetch_source(self, sourcedata):
        # Download the site list and build one node per <site> element,
        # keyed by the USGS site number.
        sourcenodes = {}
        req = urllib2.Request(sourcedata, headers=self.http_headers)
        tree = ElementTree.parse(urllib2.urlopen(req))
        for site in tree.iter('site'):
            pkey = site.find('site_no').text.strip()
            node = {}
            node['tag'] = {}
            node['id'] = pkey
            node['lat'] = site.find('dec_lat_va').text.strip()
            node['lon'] = site.find('dec_long_va').text.strip()
            node['tag']['source:pkey'] = pkey
            node['tag']['man_made'] = 'monitoring_station'
            node['tag']['monitoring:river_level'] = 'yes'
            node['tag']['operator'] = site.find('agency_cd').text.strip()
            node['tag']['description'] = site.find('station_nm').text.strip()
            node['tag']['website'] = 'http://waterdata.usgs.gov/nwis/inventory/?site_no=' + pkey
            sourcenodes[pkey] = node
        # Keys for which the external dataset, not OSM, is the master.
        source_is_master_for = ['operator', 'website', 'description']
        return (sourcenodes, source_is_master_for)
JOSM Extension
In August 2011 the osmsync developer extended JOSM slightly. JOSM will set the default changeset tags based on a loaded file:
<osm version="0.6" generator="osmsync">
  <changeset>
    <tag k="source" v="osmsync:ccs"/>
    <tag k="note" v="Prepared by osmsync: car share reservation system..."/>
    <tag k="conflation_key" v="source:pkey"/>
  </changeset>
  ...
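A file header of this shape could be produced with Python's standard `xml.etree.ElementTree` module, as in the sketch below. `build_osm_header` is a hypothetical helper and the note text is a placeholder; how osmsync itself writes the file is not shown on this page.

```python
from xml.etree import ElementTree


def build_osm_header(changeset_tags):
    """Build an <osm> root element holding a <changeset> block of default tags.

    changeset_tags: list of (key, value) pairs, written in the given order.
    """
    root = ElementTree.Element('osm', {'version': '0.6', 'generator': 'osmsync'})
    changeset = ElementTree.SubElement(root, 'changeset')
    for key, value in changeset_tags:
        ElementTree.SubElement(changeset, 'tag', {'k': key, 'v': value})
    return root


root = build_osm_header([
    ('source', 'osmsync:ccs'),
    ('note', 'Prepared by osmsync (placeholder note)'),  # placeholder text
    ('conflation_key', 'source:pkey'),
])

# Nodes, ways and relations would be appended to root before writing.
xml_text = ElementTree.tostring(root, encoding='unicode')
print(xml_text)
```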
See also
- Potential Datasources - List of datasources which could be imported, with descriptions of licensing status/investigation.
- Foundation/Import Support Working Group - a group formed by the foundation to facilitate investigation and potential import of datasets.