Slovak Libraries Import

Import of Slovak libraries from the directory of the Slovak National Library (SNK).

Goal

Add or complete publicly available libraries from the SNK directory in OSM. The process should be repeated whenever the SNK directory is updated (about twice a year).

License

We obtained email approval from SNK on 2016-11-02.

Input Data

The input data is an XLS file on the SNK website with the following columns:

  • library name
  • type
  • status
  • address
    • The address is a string in a format such as A. Sládkoviča 2 or Horná Štubňa 472. If the string equals the city/village name, we assume that the municipality uses only conscription numbers and no so-called "orientation" numbers, so in that case we do not use the OSM tags addr:street and addr:streetnumber.
  • pobox
  • town/village name
  • subregion
  • region
  • email
  • web page
  • phone
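
The address rule from the column list above can be sketched in a few lines of Python. This is only an illustration, not the import script itself: the function name parse_snk_address, the returned keys, and the regular expression for the trailing number are assumptions, as is the reading that "equals the city/village name" refers to the part of the string before the number.

```python
import re


def parse_snk_address(address: str, town: str) -> dict:
    """Split an SNK address string into address parts (hypothetical helper)."""
    address = address.strip()
    town = town.strip()
    # Assumed pattern: free text followed by a trailing number such as "2" or "472".
    match = re.match(r"^(?P<name>.*?)\s+(?P<number>\d+\S*)$", address)
    if match:
        name, number = match.group("name"), match.group("number")
    else:
        name, number = address, None
    if name.casefold() == town.casefold():
        # The text part is just the town name: assume conscription numbers only,
        # so no addr:street / addr:streetnumber is derived.
        return {"street": None, "streetnumber": None, "housenumber": number}
    return {"street": name, "streetnumber": number, "housenumber": None}


print(parse_snk_address("Horná Štubňa 472", "Horná Štubňa"))
```

The two result shapes correspond to the two address searches used later in the import: street plus street number for towns with street names, and a house number alone for villages without them.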

Identifier

The SNK directory does not contain a simple integer-like identifier. The library name is not unique, and neither is the combination of library name and city/village. To match existing OSM libraries with SNK libraries, and to find the location of an SNK library, we therefore match by address (bbox of the subregion, city, street, street number) and visually check the script's output before inserting the data into OSM.

Import Lifecycle

OSM contains about 500 libraries in Slovakia (2016-11-09).

The SNK directory contains 2349 records (2016-11-09). For the purpose of the import we are interested in publicly available libraries:

  • regional: 37
  • city libraries: 104
  • village libraries: 1601

Initial Import Phase - SNK to OSM sync

Step 1. Search via Overpass API

For each SNK entry we perform the following searches within the bbox of the selected subregion:

  • A: we search among the roughly 500 libraries that currently exist in OSM, i.e. we search for nodes and ways that hold name=snk.name and the tag amenity=library
  • B: we search by address:
    • if the address of the SNK entry contains a street name, we search OSM for nodes and ways that hold the tags addr:city=snk.city, addr:street=snk.street, addr:streetnumber=snk.address_no
    • if the SNK entry does not contain a street name (i.e. it is a village without street names), we search OSM for nodes and ways that hold the tags addr:city=snk.city, addr:housenumber=snk.address_number

We search OSM only for nodes and ways (not relations), because only one existing library in Slovakia is mapped as a relation, and address points in Slovakia are mostly on ways (buildings) and less often on nodes.
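
As an illustration, searches A and B could be turned into Overpass QL roughly as follows. This is a sketch under assumptions, not the query used by the import script: the function names and the record keys (name, city, street, number) are hypothetical, and only the OSM tags come from the description above. The bbox is given in the Overpass order (south, west, north, east).

```python
def _quote(value: str) -> str:
    """Escape a value for use inside a double-quoted Overpass QL string."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def name_search_tags(rec: dict) -> dict:
    """Search A: existing libraries with the same name."""
    return {"amenity": "library", "name": rec["name"]}


def address_search_tags(rec: dict) -> dict:
    """Search B: address objects, with or without a street name."""
    if rec.get("street"):
        return {
            "addr:city": rec["city"],
            "addr:street": rec["street"],
            "addr:streetnumber": rec["number"],
        }
    return {"addr:city": rec["city"], "addr:housenumber": rec["number"]}


def build_query(tags: dict, bbox: tuple) -> str:
    """Build a query for nodes and ways (no relations) inside the bbox."""
    south, west, north, east = bbox
    area = f"({south},{west},{north},{east})"
    filters = "".join(f'["{k}"="{_quote(v)}"]' for k, v in tags.items())
    # "out center" adds a center point to ways, useful for placing a new node.
    return f"[out:json][timeout:60];(node{filters}{area};way{filters}{area};);out center;"


rec = {"name": "Example library", "city": "Example", "street": None, "number": "472"}
print(build_query(address_search_tags(rec), (48.0, 18.0, 49.0, 19.0)))
```

The record values in the usage line are placeholders, not SNK data.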

We pair the OSM search results with SNK records.

Step 2. Prepare the osc file

Rules:

  • if an SNK record already exists in OSM as a node or way with amenity=library, we do not create a new node but enrich the existing object.
  • if, for an SNK library, we find only an address node/way in OSM (not tagged as a library), we do the following:
    • if it is a way (building), we create a new node at its center, which will hold the amenity=library tag.
    • if it is a node, we enrich it.
  • the library name from SNK overrides the library name in OSM
  • existing addr: tags in OSM are not overridden with SNK data. Missing addr: tags are filled in from SNK data.
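
The rules above can be sketched as two small functions. This is an illustrative reading, not the script's code: the names plan_action and merge_tags and the action labels are hypothetical. The rules do not say how non-address tags other than the name are handled; the sketch assumes they are only filled in when missing.

```python
def plan_action(osm_type: str, is_library: bool) -> str:
    """Decide what to do with a matched OSM object ("node" or "way")."""
    if is_library:
        return "modify"  # enrich the existing library object
    if osm_type == "way":
        return "create_node_at_center"  # address-only building: new library node
    return "modify"  # address-only node: enrich it


def merge_tags(osm_tags: dict, snk_tags: dict) -> dict:
    """Merge SNK-derived tags into the tags of an existing OSM object."""
    merged = dict(osm_tags)
    for key, value in snk_tags.items():
        if key == "name":
            merged[key] = value  # the SNK name overrides the OSM name
        else:
            # addr:* tags already in OSM are kept (per the rules above);
            # applying the same to other keys is an assumption of this sketch.
            merged.setdefault(key, value)
    merged["amenity"] = "library"
    return merged


print(merge_tags({"name": "Old name", "addr:city": "A"},
                 {"name": "New name", "addr:city": "B", "addr:street": "S"}))
```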

Note: the Slovak version of this page describes in more detail which particular addr: tags are used and the rules for adopting them from SNK. We have not translated it into English because the rules are specific to Slovak addresses, so non-Slovak readers would not be familiar with them. We use the same address schema as in the kapor import.

The script that implements the rules above is launched with the subregion as a parameter and produces the following two files:

  • <subregion>_matched_libraries_to_create_update_or_delete.osc
    • records from set A produce <modify> osc operations.
    • records from set B produce <create> and <modify> osc operations.
    • each entry also carries the tag fixme=yes, which forces the user to review it manually.
  • <subregion>_nonmatched_amenity_libraries.osm
    • contains those OSM nodes and ways from the given subregion bbox that hold amenity=library and were not found by our search.
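
A minimal sketch of how such an .osc (OsmChange) file with <create> and <modify> blocks and the fixme=yes review tag might be produced with Python's standard library. The function name build_osc, the input dictionaries, and the generator string are assumptions. The sketch handles nodes only; a modified way would also have to carry its node references, which is omitted here.

```python
import xml.etree.ElementTree as ET


def build_osc(creates: list, modifies: list) -> str:
    """Serialize node dictionaries into an OsmChange document (hypothetical helper)."""
    root = ET.Element("osmChange", version="0.6", generator="snk-import-sketch")
    for action, nodes in (("create", creates), ("modify", modifies)):
        if not nodes:
            continue
        block = ET.SubElement(root, action)
        for node in nodes:
            # Every key except "tags" becomes an XML attribute (id, lat, lon, version).
            attrs = {k: str(v) for k, v in node.items() if k != "tags"}
            element = ET.SubElement(block, "node", attrs)
            # fixme=yes marks the entry for manual review.
            for key, value in dict(node["tags"], fixme="yes").items():
                ET.SubElement(element, "tag", k=key, v=str(value))
    return ET.tostring(root, encoding="unicode")


# New objects use negative placeholder ids; modified ones keep id and version.
new_node = {"id": -1, "lat": 48.5, "lon": 18.5,
            "tags": {"amenity": "library", "name": "Example library"}}
existing = {"id": 123, "version": 2, "lat": 48.6, "lon": 18.6,
            "tags": {"amenity": "library", "name": "Other example"}}
print(build_osc([new_node], [existing]))
```

The ids, coordinates, and names in the usage lines are placeholders.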

Step 3. Manual review and manual import

The user logs in to JOSM as SKlibraries_bot and opens both files. The user walks through all nodes and ways in the .osc file and, once the tags and location are visually confirmed, removes the tag fixme=yes. Thanks to the .osm file, the user can see whether a library proposed for creation in the .osc file in fact already exists in OSM (because the search did not discover it). In that case the user merges the two libraries manually.

Continuous Updates

In the initial import phase we expect to find a location or an existing entry for about 50% of SNK records. Most of the associations between SNK and OSM will be made via address. The import of Slovak address points (kapor) is still ongoing; it proceeds subregion by subregion, and about 50% of subregions have been imported so far.

We will repeat the SNK import later, as the kapor import progresses.

The SNK directory is updated once or twice a year. After each update we will run the update process again, which will allow us to track libraries that later fall into disuse (the SNK directory contains information on disused libraries: snk.status = Zrušená || Stagnujúca). The location of an existing library in Slovakia is not expected to change (or changes very rarely), so our workflow does not cover the case of a library changing its address.
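
The status check mentioned above could look roughly like this. The function name split_by_status and the record layout are assumptions; only the two status values come from the SNK directory as described. What happens to a disused library in OSM is not specified here, so the sketch only separates the records.

```python
DISUSED_STATUSES = {"Zrušená", "Stagnujúca"}


def split_by_status(records: list) -> tuple:
    """Separate SNK records into (active, disused) by their status column."""
    active, disused = [], []
    for rec in records:
        status = rec.get("status", "").strip()
        (disused if status in DISUSED_STATUSES else active).append(rec)
    return active, disused


active, disused = split_by_status([
    {"name": "Example A", "status": "Zrušená"},
    {"name": "Example B", "status": ""},
])
print(len(active), len(disused))
```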

Dedicated user

Data is imported under a dedicated OSM account named SKlibraries_bot.

Source code

https://github.com/Infolovec/mapakniznic.sk

Run the script via rake snk-to-osm <okres>, where <okres> (Slovak for district) is the subregion. The script is not fully working yet; it is a work in progress.

Contact

https://www.openstreetmap.org/user/Peter%20Vojtek/

Discussion in Slovak

https://groups.google.com/forum/#!topic/osm_sk/HEOTNJmN40o