WIWOSM

From OpenStreetMap Wiki
Screenshot of the German Wikipedia article on Austria, with the vector data for this article shown in red

WIWOSM (Wikipedia where in OSM) is a project that shows the geometric objects from OpenStreetMap that belong to a Wikipedia article. These objects carry a matching wikipedia=* tag. WIWOSM is used primarily in maps inside Wikipedia (OSM-Gadget and WikiMiniAtlas).

Help us!
fix the tagging errors

Example page

http://toolserver.org/~kolossos/openlayers/kml-on-ol-json3.php?lang=en&title=Austria
is the same as:
http://toolserver.org/~kolossos/openlayers/kml-on-ol-json3.php?lang=de&title=%C3%96sterreich

Both URLs show Austria on the map. In the first URL the title parameter is the article name in the English Wikipedia, although the OSM relation for Austria is tagged for the German Wikipedia: wikipedia=de:Österreich. The existing interwiki language links on the Wikipedia side are indexed, so only a single tag is needed on the OpenStreetMap side, and it can point to the article in any one Wikipedia language.

Advantages

Objects that can be referred to through WIWOSM no longer need to be "hardlinked". Until now, the only way to link to a certain object was to refer to its OSM ID. The problem is that the ID can change, which breaks the link.

Advantages for Wikipedia

The project gives Wikipedia users a detailed view of complex geographical objects such as rivers, streets and districts with one click. It extends Wikipedia's geocoding, where an article gets only a point, with the possibility to describe complex lines and polygons as well. The geometry from OSM also helps us bring the map to the right zoom level.

WIWOSM connects Wikipedia content with the capabilities of a geographic information system. So we can say, for example, how large the object of an article is or which articles are in another...

Advantages for OpenStreetMap

With this project OpenStreetMap can show many people that it is more than a map: it is a database with a lot of use cases. OpenStreetMap can win many new users among Wikipedians, which will help to ensure quality. The connection to Wikipedia will also help to translate many objects into different languages.

Another example: an article about a miniature railway in Germany [1]. It uses points for the railway stations, lines for the railway track and polygons for the rail yard.

Object types

OpenStreetMap has different representations for objects. This means a place can be represented by a node (a single point) or a multipolygon (an area). We want the most complex representation for this project. A single node brings us no additional information on the map compared to the usual geocoding in Wikipedia. On the other hand the place node for a city has a lot of additional data so it would be nice to connect it with Wikipedia.

Other use cases, such as the use of waterway=river or waterway=riverbank, need further discussion in the community. Please use the discussion page for this.

The project idea is simple and works well for linking many OSM objects to one Wikipedia article. The reverse is not possible: one OSM object cannot be tagged with more than one Wikipedia article. We see relations as the solution for that.

Please keep in mind that the scopes of Wikipedia and OpenStreetMap differ in some respects. Do not add things to OSM that are outside its project scope, such as historic objects or the ranges of species. We hope to solve this problem later with other projects, like http://en.wikipedia.org/wiki/Template:Attached_KML.

Please also respect that the system has limits. If you want to tag all post boxes in the world for the article "Post Box", it will not work and you will see protest from the OSM community. Please discuss such ideas beforehand. The system also does not seem suitable for collection articles in Wikipedia such as Streets in Oslo. If you want maps in Wikipedia for these, it is perhaps better to ask the Wikipedia Map workshop.

Technical procedure

Data basis

The project uses the data of the mapnik database on toolserver.org. This sometimes causes a larger replication delay: munin-stat

Filtering by Wikipedia-Tag

OSM has a lot of ways to use the wikipedia=* tag, and we try to support most of them. But we believe it would be best to always use the preferred method: wikipedia=lang:article.

| Key | Value | Support |
| --- | --- | --- |
| wikipedia | lang:article | Preferred tagging scheme. One tag is enough! |
| wikipedia:lang | article | Works, too. If multiple wikipedia tags are found, we use the first one we can get. We get other languages from Wikipedia's interlanguage links. |
| wikipedia | http://lang.wikipedia.org/wiki/Article | Is parsed (also with https). |
| wikipedia:lang | http://lang.wikipedia.org/wiki/Article | Is parsed, too (also with https). |
| wikipedia | article | Is shown in the broken.html log. We need a defined language and don't want to guess one! Please help to fix it! |
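
The tagging variants listed above can be read as a small parsing routine. The following Python sketch is not the WIWOSM code: the function name parse_wikipedia_tag, the language-code pattern and the decoding of URL titles (percent-decoding, underscores to spaces) are assumptions made for illustration.

```python
import re
from typing import Optional, Tuple
from urllib.parse import unquote

# Assumed shape of a language code such as "de" or "zh-min-nan".
LANG = r"[a-z]{2,}(?:-[a-z]+)*"
URL_RE = re.compile(rf"^https?://({LANG})\.wikipedia\.org/wiki/(.+)$")
PREFIXED_RE = re.compile(rf"^({LANG}):(.+)$")
KEY_RE = re.compile(rf"^wikipedia:({LANG})$")


def parse_wikipedia_tag(key: str, value: str) -> Optional[Tuple[str, str]]:
    """Return (lang, article), or None when no language is defined."""
    key_lang = KEY_RE.match(key)
    if key != "wikipedia" and not key_lang:
        return None  # not a wikipedia tag at all
    value = value.strip()
    url = URL_RE.match(value)
    if url:  # URL values are accepted for both key forms
        title = unquote(url.group(2)).replace("_", " ")
        return url.group(1), title
    if key_lang:  # wikipedia:lang=article
        return key_lang.group(1), value
    prefixed = PREFIXED_RE.match(value)
    if prefixed and not prefixed.group(2).startswith("//"):
        return prefixed.group(1), prefixed.group(2)  # wikipedia=lang:article
    return None  # wikipedia=article: language unknown, do not guess
```

A value without a language yields None, which corresponds to the case that ends up in the broken.html log.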


Please make sure you don't use a name which gets redirected in Wikipedia.
Example: http://de.wikipedia.org/wiki/Frasher gets redirected to http://de.wikipedia.org/wiki/Frashër, but Frasher at WIWOSM doesn't work.

WIWOSM supports anchors separated by '#' for objects that have no article of their own but have an entry in a list article. So you can use wikipedia=lang:list-xy#object. The disadvantage is that this is less stable than full article names and does not support interwiki links. Linked Wikipedia articles can jump exactly to the right position: Example (A church in a list of churches).

Wikidata

2014-03-06
WIWOSM now uses Wikidata to link all the articles in the different wikis that describe the same thing.
You may think that this is an obvious use case for Wikidata, but when this project started, Wikidata was in an unusable state.
The procedure now is to look up the Wikidata ID (the one beginning with Q...) in the Wikidata table 'wb_items_per_site' and to get all linked articles by that number.
The advantage is that you can now ask the API for a Wikidata object by its ID, like:

 http://tools.wmflabs.org/wiwosm/osmjson/getGeoJSON.php?lang=wikidata&article=Q1731

or the map like

 http://toolserver.org/~kolossos/openlayers/kml-on-ol-json3.php?lang=wikidata&title=Q1731

This does not mean that we support the tag Proposed_features/Wikidata for now.
That would be another big code change in the WIWOSM processing, and it raises other problems, for example "Which tag should be preferred (wikidata=... or wikipedia=...)?" or "Is it better to trust IDs than human-readable article names?" and so on.
But maybe it could be useful anyway.

Logging

Thanks to many helpful users, the number of objects with wrong wikipedia tags went down from about 44000 to just ~4000. Awesome!
With the new Wikidata linking process, we find some other bugs that were not addressed in the past.
The reasons for them could be:

  • bugs in WIWOSM
  • bugs in Wikidata (or the toolserver clone of it)
  • bugs in the tagging in OSM

You can have a look in the log and test some articles that could not be found. Maybe we can reduce the errors again. ;)

Check out the WIWOSM processing log in different formats here:

| Logtype | Link | Comment |
| --- | --- | --- |
| json | http://tools.wmflabs.org/wiwosm/wiwosmlog/broken.php | Get the newest error log as a gzipped JSON file. |
| html | http://tools.wmflabs.org/wiwosm/wiwosmlog/broken.html | View the log in your browser, with some helpful edit links and filtering. |
| html | http://osm.jjaf.de/wiwosm/log/?stats | Thanks to User:Jjaf.de we have some statistics with edit links. You can use filters, for example http://osm.jjaf.de/wiwosm/log/?t=r&s=DE to get relations in Germany (works with the TSV and ODS spreadsheet, too). |
| tsv | http://osm.jjaf.de/wiwosm/log/ | You can get it as a tab-separated file. |
| ods | http://osm.jjaf.de/wiwosm/log/broken.ods | This is an OpenDocument spreadsheet. |

Have fun with fixing errors!

Simplification

For objects with a lot of points we use the PostGIS function ST_SimplifyPreserveTopology() to reduce the data we send to the browser. We also reduce the number of decimal places. The current simplification strategy is:

ST_AsGeoJSON(
  CASE
    WHEN ST_NPoints(ST_Collect(way))<10000 THEN ST_Collect(way)
    WHEN ST_NPoints(ST_Collect(way)) BETWEEN 10000 AND 20000 THEN ST_SimplifyPreserveTopology(ST_Collect(way),(ST_Perimeter(ST_Collect(way))+ST_Length(ST_Collect(way)))/500000)
    WHEN ST_NPoints(ST_Collect(way)) BETWEEN 20000 AND 40000 THEN ST_SimplifyPreserveTopology(ST_Collect(way),(ST_Perimeter(ST_Collect(way))+ST_Length(ST_Collect(way)))/200000)
    WHEN ST_NPoints(ST_Collect(way)) BETWEEN 40000 AND 60000 THEN ST_SimplifyPreserveTopology(ST_Collect(way),(ST_Perimeter(ST_Collect(way))+ST_Length(ST_Collect(way)))/150000)
    ELSE ST_SimplifyPreserveTopology(ST_Collect(way),(ST_Perimeter(ST_Collect(way))+ST_Length(ST_Collect(way)))/100000)
  END
,9) AS geojson

If you have a good idea to optimize it, let us know, please.
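
To make the thresholds in the CASE expression easier to discuss, here is the same decision logic as a small Python sketch. The function name simplify_tolerance and its parameters are hypothetical; it only mirrors the arithmetic of the query above and does not call PostGIS.

```python
from typing import Optional


def simplify_tolerance(npoints: int, perimeter: float, length: float) -> Optional[float]:
    """Tolerance for ST_SimplifyPreserveTopology, or None if nothing is simplified.

    Mirrors the SQL CASE: branches are tested in order and BETWEEN is
    inclusive, so exactly 20000 points falls into the first BETWEEN branch.
    """
    if npoints < 10000:
        return None  # geometry is sent unsimplified
    size = perimeter + length  # ST_Perimeter(...) + ST_Length(...)
    if npoints <= 20000:
        return size / 500000
    if npoints <= 40000:
        return size / 200000
    if npoints <= 60000:
        return size / 150000
    return size / 100000
```

The divisor shrinks as the point count grows, so for the same perimeter and length a geometry with more points gets a larger tolerance.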

GeoJSON & compression

We convert the geometry to GeoJSON with a PostGIS function and store the files compressed on the server. The projection of the files is Google Mercator.

Internationalisation

We use the interwiki links of Wikipedia to create hard links in the file system.
The file system structure is based on a hashing algorithm called FNV.
The process is as follows:

 1. build a string from the language followed by the article name, with underscores replaced by spaces
 2. compute the FNV hash of this string, for example 1A00A138 in hexadecimal format
 3. build the file path like /1A/1A00/1A00A138_lang_articlename.geojson.gz (with forbidden characters removed from lang and articlename, and cropped at 230 characters)
 4. write the gzip-compressed GeoJSON to that file
 5. for every other language found for the article through interlanguage links, repeat steps 1-3 and create a hard link in the file system to that file

The advantage of this procedure is fast access to the right GeoJSON file when you know the language and article, without database queries. It also allows simpler update strategies, because only one file has to be updated without looking up the interwiki links.
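
Steps 1 to 3 can be sketched in Python as follows. This is an illustration, not the WIWOSM implementation: the text only says "FNV", so the 32-bit FNV-1a variant, the absence of a separator between language and article, the set of forbidden characters and the point at which the name is cropped are all assumptions.

```python
import re

FNV32_OFFSET = 0x811C9DC5
FNV32_PRIME = 0x01000193


def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a hash (assumed variant; the source only names FNV)."""
    h = FNV32_OFFSET
    for byte in data:
        h ^= byte
        h = (h * FNV32_PRIME) & 0xFFFFFFFF
    return h


def geojson_path(lang: str, article: str) -> str:
    """Build a path like /1A/1A00/1A00A138_lang_articlename.geojson.gz."""
    article = article.replace("_", " ")               # step 1
    key = lang + article                              # assumed: no separator
    digest = f"{fnv1a_32(key.encode('utf-8')):08X}"   # step 2
    # Assumed forbidden characters: path separators, NUL and whitespace.
    safe = re.sub(r"[/\\\x00\s]", "", f"{lang}_{article}")
    name = f"{digest}_{safe}"[:230]                   # assumed crop point
    return f"/{digest[:2]}/{digest[:4]}/{name}.geojson.gz"
```

Because the path depends only on language and article, a reader of the files can compute it without a database lookup. Step 5 would then create the additional names, for instance with os.link, one per interlanguage link.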

Clientside

On the client side we load the geometry in OpenLayers, WikiMiniAtlas or Leaflet (optional Vietnamese Wikipedia gadget). Once the geometry is loaded, and if it is not only a point, we zoom to it.
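
The "not only a point" check can be illustrated outside the browser. The Python sketch below is an assumption about how such a check might look, not the gadget code: it walks a GeoJSON geometry object and reports a bounding box only when the geometry has some extent. All function names are hypothetical.

```python
from typing import Iterator, Optional, Tuple

Position = Tuple[float, float]


def positions(geom: dict) -> Iterator[Position]:
    """Yield every (x, y) position of a GeoJSON geometry object."""
    if geom["type"] == "GeometryCollection":
        for member in geom["geometries"]:
            yield from positions(member)
    else:
        yield from _flatten(geom["coordinates"])


def _flatten(coords) -> Iterator[Position]:
    if coords and isinstance(coords[0], (int, float)):
        yield (coords[0], coords[1])  # a single position
    else:
        for item in coords:
            yield from _flatten(item)


def zoom_bounds(geom: dict) -> Optional[Tuple[float, float, float, float]]:
    """Return (min_x, min_y, max_x, max_y), or None for a point-only geometry."""
    pts = list(positions(geom))
    if not pts:
        return None
    xs, ys = zip(*pts)
    box = (min(xs), min(ys), max(xs), max(ys))
    if box[0] == box[2] and box[1] == box[3]:
        return None  # no extent: keep the current zoom
    return box
```

A map client would pass the returned box to its own zoom-to-extent call and leave the view unchanged when None comes back.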

Updating

  • We try to update each night. I optimized the procedure a little, so there is a smart update (>34 min) every day except Wednesday. On Wednesday the whole directory is thrown away and generated completely anew (>84 min). The smart update does not check whether an OSM object or its wikipedia tag was deleted, or whether new languages were added to the interlanguage links in Wikipedia. This is only done in the full update.
  • If you add a new wikipedia tag and cannot wait, you can trigger an update with (URL:+"?...&action=purge"). This procedure is relatively expensive for us and needs 30-60 seconds to finish, because we have to look up the article in the hstore of the mapnik database on toolserver.
    So if you can, please wait until the next day and be careful with this feature (don't send the URL to a mailing list). Also keep in mind that the toolserver database has a replication lag of some minutes, so please wait at least 5 minutes after your edit.

API

You can get the geojson file content of an article directly by calling:

 http://tools.wmflabs.org/wiwosm/osmjson/getGeoJSON.php?lang=de&article=Dresden

Pass a lang and an article parameter. If the object does not exist, you get 404 Not Found.

If you want to explicitly check whether a WIWOSM object exists without fetching its content, you can use the action=check parameter, for example

 http://tools.wmflabs.org/wiwosm/osmjson/getGeoJSON.php?lang=de&article=Dresden&action=check

which gives 1 for existing objects and 0 otherwise.

If you explicitly want to regenerate a single WIWOSM object before the nightly update process, you can use action=purge, but use it with care!
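
As a rough illustration, the request URLs above can be built programmatically. The Python sketch below uses only the endpoint and the lang, article and action parameters named in this section; the helper name wiwosm_url is hypothetical, and no request is sent.

```python
from typing import Optional
from urllib.parse import urlencode

API = "http://tools.wmflabs.org/wiwosm/osmjson/getGeoJSON.php"


def wiwosm_url(lang: str, article: str, action: Optional[str] = None) -> str:
    """Build a getGeoJSON.php URL; action may be "check" or "purge"."""
    params = {"lang": lang, "article": article}
    if action is not None:
        if action not in ("check", "purge"):
            raise ValueError(f"unknown action: {action}")
        params["action"] = action
    return API + "?" + urlencode(params)  # percent-encodes non-ASCII titles
```

A caller would treat a 404 response as "no object" and read the body of an action=check request as the 1 or 0 flag described above.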

18.04.2014: We moved WIWOSM from toolserver to labs, resulting in a new URL for the API.

Old URL:

 http://toolserver.org/~master/...

New URL:

 http://tools.wmflabs.org/wiwosm/...

The old URL should currently be redirected.

Source code

WIWOSM together with POIs from Wikipedia

Lists for testing

Examples by the author

At the following list you can find some links to example maps:

Further examples

| Toolserver | Wikipedia | Wikipedia tag in: | Relation link |
| --- | --- | --- | --- |
| [2] | Strategischer Bahndamm | relation type=collection | |
| [3] | Residenzenweg | relation type=superroute | [4] |
| [5] | Rhein | relation type=waterway | [6] |

FAQ

  • Feature xyz I added to the map isn't in Wikipedia yet

Check the log to see whether the data has been renewed since your edit, and whether there are any problems with your data.

To-Do

At the moment (April 2012) a lot of things are not yet in WIWOSM.
Help to bring more wikipedia tags into OSM.
WIWOSM in Europe (April 2012)

Creators / Contact

Events

  • 10.07.2012 WIWOSM is now running in all Wikipedia versions (not running in Internet Explorer)
  • 21.03.2012 WIWOSM starts its beta phase and goes live in the German Wikipedia
  • after the OSM license change WIWOSM will go live in other Wikipedias

See also

  • Add-tags - web service to assist in adding wikipedia tags to OSM objects with the use of JOSM's RemoteControl feature
  • OSM-Wikipedia place name tool - on Toolserver
  • WikiMiniAtlas - Map widget based on OpenStreetMap data, active on many wikipedia sites; displays WIWOSM data
  • JOSM Wikipedia Plugin - A JOSM plugin which shows a list of Wikipedia articles regarding the area in which the user is mapping (by position or by category). It lets the mapper add the wikipedia tag with a simple double click, and it also indicates the objects already delivered by the WIWOSM server.
  • Osmose analyser - An Osmose analysis of wikipedia tags. Partial world coverage.