WIWOSM

From OpenStreetMap Wiki
Screenshot of German Wikipedia: the article on Austria with the vector data for this article (red)

WIWOSM (Wikipedia where in OSM) is a project that shows the geometric objects from OpenStreetMap that belong to a Wikipedia article. These objects carry a matching wikipedia=* tag. WIWOSM is used primarily in maps inside Wikipedia (OSM-Gadget and WikiMiniAtlas).

Help us!
fix the tagging errors

Example page

http://toolserver.org/~kolossos/openlayers/kml-on-ol-json3.php?lang=en&title=Austria
is the same as:
http://toolserver.org/~kolossos/openlayers/kml-on-ol-json3.php?lang=de&title=%C3%96sterreich

Both URLs show Austria on the map. The title parameter in the first URL is the article name in the English Wikipedia, although the OSM relation for Austria is tagged for the German Wikipedia: wikipedia=de:Österreich. The existing interwiki language links on the Wikipedia side are indexed, so only a single tag is needed on the OpenStreetMap side, and it can point to any one of the Wikipedia language versions of the article.

Advantages

Objects which can be referred to using WIWOSM no longer need to be "hardlinked". Previously, the only possible way to link a certain object was to refer to its OSM-ID. The problem with that is that the ID can change and thus break the link.

Another example: an article about a miniature railway in Germany [1]. It uses points for railway stations, lines for the railway track and polygons for the rail yard.
Advantages for Wikipedia
The project will give users of Wikipedia a one-click, detailed view of complex geographical objects like rivers, streets and districts. It extends Wikipedia's geocoding, which gives an article only a single point, by describing complex lines and polygons. Geometry from OSM helps us to show maps at the right zoom level.
WIWOSM will connect Wikipedia content with the opportunities of a geographic information system. So we can say, for example, how large the object of an article is or which articles lie within the subject of another article.
Advantages for OpenStreetMap
This project shows that OpenStreetMap is more than a map; it's a database with a lot of use cases. Wikipedians will be encouraged to become OpenStreetMap editors and this will help to improve quality. The connection to Wikidata will also help to translate a lot of object labels into different languages.

Object types

OpenStreetMap has different representations for objects. This means a place can be represented by a node (a single point) or a multipolygon (an area). We want the most complex representation for this project. A single node brings us no additional information on the map compared to the usual geocoding in Wikipedia. On the other hand the place node for a city has a lot of additional data so it would be nice to connect it with Wikipedia.

Other use cases, like the usage of waterway=river or waterway=riverbank, need further discussion in the community. Please use the discussion page here.

The project makes it simple to link many OSM objects to one Wikipedia article. The reverse is not possible: one OSM object cannot be tagged with more than one Wikipedia article. We see relations as the solution for that.

Please keep in mind that the scope of Wikipedia and OpenStreetMap differs in some respects. So please do not add things to OSM that are outside the project scope, such as historic features or the ranges of species. We hope to solve this problem later with other projects, like http://en.wikipedia.org/wiki/Template:Attached_KML.

You should also respect that the system has limits. If you want to tag all post boxes in the world for the article "Post Box", it will not work and you will see protest from the OSM community. Please discuss such ideas beforehand. The system also seems unsuited to collection articles in Wikipedia like Streets in Oslo. If you want maps in Wikipedia for these, it is perhaps better to ask the Wikipedia Map workshop.

Technical procedure

Data basis

The project uses the data of the mapnik database on toolserver.org. This sometimes causes a larger replication delay: munin-stat

Filtering by wikipedia-tag

OSM has a lot of ways to use the wikipedia=* tag, and we try to support most of them. But we believe it would be best to always use the preferred method: wikipedia=lang:article.

schema | example | support
wikipedia=lang:article | wikipedia=de:Dresden | Preferred tagging scheme. One tag is enough!
wikipedia:lang=article | wikipedia:de=Dresden | Works, too. If multiple wikipedia tags are found, we use the first one we can get. We get the other languages from Wikipedia's interlanguage links.
wikipedia=http://lang.wikipedia.org/wiki/Article | wikipedia=http://de.wikipedia.org/wiki/Dresden | Is parsed (also with https).
wikipedia:lang=http://lang.wikipedia.org/wiki/Article | wikipedia:de=http://de.wikipedia.org/wiki/Dresden | Is parsed, too (also with https).
wikipedia=lang:list-xy#object | wikipedia=de:Liste der Kirchengebäude in Chemnitz#Schlosskirche | Anchors separated by '#' are for objects without an article of their own but with an entry in a list article. The disadvantage is that this is less stable than full article names and does not support interwiki links. Links from Wikipedia can jump exactly to the right position: Liste der Kirchengebäude in Chemnitz#Schlosskirche. Example (a church in a list of churches).
wikidata=wikidata_entity | wikidata=Q1731 | Since April 2014 we have beta support for the wikidata tag. You can refer to the Q-number of a Wikidata entity.
wikipedia=article | wikipedia=Dresden | Is shown in the broken.html log. We need a defined language and don't want to guess one! Please help to fix it!
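
The schemes in the table above could be normalised roughly as in the following Python sketch. It is not the WIWOSM parser: the function name parse_wikipedia_tag, the language pattern and the conversion of underscores to spaces are assumptions made for illustration only.

```python
import re
from urllib.parse import unquote

# Full article URLs such as http://de.wikipedia.org/wiki/Dresden (http or https).
_URL_RE = re.compile(r"^https?://([a-z\-]+)\.wikipedia\.org/wiki/(.+)$")
# Assumed shape of a language code (lowercase letters and hyphens).
_LANG_RE = re.compile(r"^[a-z][a-z\-]*$")


def parse_wikipedia_tag(key, value):
    """Return (lang, article), or None if no language can be determined."""
    lang = None
    if key.startswith("wikipedia:"):
        # Scheme wikipedia:lang=...: the language is part of the key.
        lang = key.split(":", 1)[1]
    elif key != "wikipedia":
        return None

    url_match = _URL_RE.match(value)
    if url_match:
        # URL values name the language themselves.
        lang, article = url_match.group(1), unquote(url_match.group(2))
    elif lang is None:
        # Scheme wikipedia=lang:article; a bare article name has no language.
        if ":" not in value:
            return None
        lang, article = value.split(":", 1)
    else:
        article = value

    if not _LANG_RE.match(lang):
        return None
    # Assumption: treat underscores (common in URLs) as spaces.
    return lang, article.replace("_", " ")
```

A value without a language, such as wikipedia=Dresden, yields None here, matching the rule that no language is guessed.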

Please make sure you don't use a name which gets redirected in Wikipedia!

Example: Frasher gets redirected to Frashër on Wikipedia, but Frasher doesn't work at WIWOSM.

Wikidata

2014-04-19
Yeah, we now have support for the basic wikidata=* tag!
I know that some people don't like that kind of ID-tagging scheme, because it is difficult to check correctness by looking at the number, but I am convinced that we need that kind of linking to Wikidata somehow. The other way around (linking OSM-IDs from Wikidata) is a really bad idea, because OSM-IDs are more fragile.

The WIWOSM processing now tries to find the Wikidata entity given by the wikidata tag. If this is not successful or there is no wikidata tag, we look at one of the wikipedia tags and try to get the entity by language and article. This is beta! I hope it works as expected and does not break other things. Have fun with it!
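
The lookup order described above might be sketched like this in Python. The two callbacks entity_exists and entity_for_article are hypothetical stand-ins for the database lookups, and only the wikipedia=lang:article scheme is handled, so this is an illustration and not the actual WIWOSM code.

```python
def find_entity(tags, entity_exists, entity_for_article):
    """Return a Wikidata ID for a dict of OSM tags, or None.

    entity_exists(qid) and entity_for_article(lang, article) are
    hypothetical callbacks that stand in for the real lookups.
    """
    # 1. Prefer the wikidata tag if it points to a known entity.
    qid = tags.get("wikidata")
    if qid and entity_exists(qid):
        return qid
    # 2. Otherwise fall back to the wikipedia tag (lang:article only here).
    value = tags.get("wikipedia", "")
    if ":" in value:
        lang, article = value.split(":", 1)
        return entity_for_article(lang, article)
    return None
```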

2014-03-06
WIWOSM now uses Wikidata to link all the different articles in many different wikis that describe the same thing.
You may think that this is an obvious use case of Wikidata, but when this project started Wikidata was in an unusable state.
Now the procedure is to look in the Wikidata table 'wb_items_per_site' for the Wikidata ID (the one beginning with Q...) and to get all linked articles by that number.
The advantage is that you can now ask the API for a Wikidata object by its ID, like:

 http://tools.wmflabs.org/wiwosm/osmjson/getGeoJSON.php?lang=wikidata&article=Q1731

or the map like

 http://toolserver.org/~kolossos/openlayers/kml-on-ol-json3.php?lang=wikidata&title=Q1731

Logging

Thanks to many nice users we reduced the number of wrong wikipedia tags from about 44000 objects to just ~4000. Awesome!
With the new Wikidata linking process, we find some other bugs that were not addressed in the past.
The reasons for that could be:

  • bugs in WIWOSM
  • bugs in Wikidata (or the toolserver clone of it)
  • bugs in the tagging in OSM

You can have a look in the log and test some articles that could not be found. Maybe we can reduce the errors again. ;)

Check out the WIWOSM processing log in different formats here:

Logtype | Link | Comment
json | http://tools.wmflabs.org/wiwosm/wiwosmlog/broken.php | Get the newest error log as a gzipped JSON file.
html | http://tools.wmflabs.org/wiwosm/wiwosmlog/broken.html | View the log in your browser, with some helpful edit links and filtering.
html | http://osm.jjaf.de/wiwosm/log/?stats | Thanks to User:Jjaf.de we have some statistics with edit links. You can use filters, for example http://osm.jjaf.de/wiwosm/log/?t=r&s=DE to get relations in Germany (works with TSV and ODS spreadsheet, too).
tsv | http://osm.jjaf.de/wiwosm/log/ | You can get it as a tab-separated file.
ods | http://osm.jjaf.de/wiwosm/log/broken.ods | This is an OpenDocument spreadsheet.

Have fun with fixing errors!

Simplification

For objects with a lot of points we use the PostGIS function ST_SimplifyPreserveTopology() to reduce the data we send to the browser. We also reduce the number of decimal places. The current simplification strategy is:

ST_AsGeoJSON(
  CASE
    WHEN ST_NPoints(ST_Collect(way))<10000 THEN ST_Collect(way)
    WHEN ST_NPoints(ST_Collect(way)) BETWEEN 10000 AND 20000 THEN ST_SimplifyPreserveTopology(ST_Collect(way),(ST_Perimeter(ST_Collect(way))+ST_Length(ST_Collect(way)))/500000)
    WHEN ST_NPoints(ST_Collect(way)) BETWEEN 20000 AND 40000 THEN ST_SimplifyPreserveTopology(ST_Collect(way),(ST_Perimeter(ST_Collect(way))+ST_Length(ST_Collect(way)))/200000)
    WHEN ST_NPoints(ST_Collect(way)) BETWEEN 40000 AND 60000 THEN ST_SimplifyPreserveTopology(ST_Collect(way),(ST_Perimeter(ST_Collect(way))+ST_Length(ST_Collect(way)))/150000)
    ELSE ST_SimplifyPreserveTopology(ST_Collect(way),(ST_Perimeter(ST_Collect(way))+ST_Length(ST_Collect(way)))/100000)
  END
,9) AS geojson

If you have a good idea to optimize it, let us know, please.
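
To make the thresholds of the CASE expression easier to read, here is a small Python sketch that mirrors them. The function name simplify_tolerance and its parameter names are illustrative assumptions; the real work is done by PostGIS in the query above.

```python
def simplify_tolerance(n_points, perimeter_plus_length):
    """Return the tolerance for ST_SimplifyPreserveTopology, or None.

    n_points stands for ST_NPoints(...), perimeter_plus_length for
    ST_Perimeter(...) + ST_Length(...). None means "do not simplify".
    """
    if n_points < 10000:
        return None
    # BETWEEN is inclusive and CASE takes the first matching branch,
    # so a boundary value such as 20000 falls into the lower band.
    if n_points <= 20000:
        divisor = 500000
    elif n_points <= 40000:
        divisor = 200000
    elif n_points <= 60000:
        divisor = 150000
    else:
        divisor = 100000
    return perimeter_plus_length / divisor
```

The sketch shows that larger objects get a smaller divisor and therefore a coarser tolerance relative to their size.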

GeoJSON & compression

We convert the geometry to GeoJSON with a PostGIS function and store the files compressed on the server. The projection of the files is Google mercator.
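
Because the coordinates are not in longitude/latitude, a client that needs degrees has to convert them. The following Python sketch assumes that "Google mercator" means the common spherical Mercator projection with a sphere radius of 6378137 m; the function name mercator_to_lonlat is illustrative.

```python
import math

# Assumption: sphere radius in metres used by spherical ("Google") Mercator.
EARTH_RADIUS = 6378137.0


def mercator_to_lonlat(x, y):
    """Convert spherical Mercator metres to (longitude, latitude) in degrees."""
    lon = math.degrees(x / EARTH_RADIUS)
    lat = math.degrees(2.0 * math.atan(math.exp(y / EARTH_RADIUS)) - math.pi / 2.0)
    return lon, lat
```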

Internationalisation

We use the InterWikiLinks of Wikipedia to create hardlinks in the file system.
The filesystem structure is based on a hashing algorithm called FNV.
The process is as follows:

 1. build a String with the language followed by article with underscores replaced by spaces
 2. compute the fnv hash of this string for example 1A00A138 in hexadecimal format
 3. build the file path like /1A/1A00/1A00A138_lang_articlename.geojson.gz (with forbidden chars removed from lang and articlename and cropped at 230 chars)
 4. write the gzip compressed GeoJSON to that file
 5. for every other language found for an article by interlanguage links do 1-3 and create a hard link in the filesystem to that file
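
Steps 1 to 3 could look roughly like the following Python sketch. Several details are assumptions, because the text above does not fix them: the 32-bit FNV-1a variant, UTF-8 encoding, direct concatenation of language and article, the set of forbidden characters, and cropping applied to the file name before the extension. The function names are illustrative.

```python
import re

FNV_OFFSET_BASIS = 0x811C9DC5  # 32-bit FNV offset basis
FNV_PRIME = 0x01000193         # 32-bit FNV prime

# Assumption: characters that are not allowed in the file name.
_FORBIDDEN = re.compile(r'[/\\:*?"<>|\x00]')


def fnv1a_32(data):
    """32-bit FNV-1a hash of a bytes object (the variant is an assumption)."""
    h = FNV_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & 0xFFFFFFFF
    return h


def geojson_path(lang, article):
    """Build a path like /1A/1A00/1A00A138_lang_articlename.geojson.gz."""
    # Step 1: language followed by article, underscores replaced by spaces.
    key = lang + article.replace("_", " ")
    # Step 2: hash as eight uppercase hexadecimal digits.
    hex_hash = "{:08X}".format(fnv1a_32(key.encode("utf-8")))
    # Step 3: remove forbidden characters and crop the name at 230 characters.
    name = "{}_{}_{}".format(
        hex_hash, _FORBIDDEN.sub("", lang), _FORBIDDEN.sub("", article)
    )[:230]
    return "/{}/{}/{}.geojson.gz".format(hex_hash[:2], hex_hash[:4], name)
```

Because the directory names are prefixes of the hash, the file for a known language and article can be located without any database query.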

The advantage of this procedure is fast access to the right GeoJSON file if you know the language and article, without any database queries. It also allows simpler update strategies: we can update a single file without looking up the interwiki links.

Clientside

On the client side we load the geometry in OpenLayers, WikiMiniAtlas or Leaflet (optional Vietnamese Wikipedia gadget). Once the geometry is loaded, and if it is more than a single point, we zoom to it.

Updating

  • We try to update each night. I optimized the procedure a little, so there is a smart update (>34 min) every day except Wednesday. On Wednesday the whole directory is thrown away and regenerated from scratch (>84 min). The smart update doesn't check whether an OSM object or its wikipedia tag was deleted, or whether new languages were added to the interlanguage links in Wikipedia. This is only done on the full update.
  • If you add a new wikipedia tag and cannot wait, you can update with (URL:+"?...&action=purge"). This procedure is relatively expensive for us and needs 30-60 sec to finish, because we have to look up the article in the hstore of the mapnik db on toolserver.
    So if you can, please wait until the next day and be careful with this feature (don't send the URL to a mailing list). Also keep in mind that the toolserver database has a replication lag of some minutes, so please wait at least 5 minutes after your edit.

API

By 2014-04-18 the API URL was moved from toolserver to labs, resulting in a new URL.

Old URL
http://toolserver.org/~master/
New URLs
https://tools.wmflabs.org/wiwosm/ STRINT
http://tools.wmflabs.org/wiwosm/

The old URL should be redirected at the moment - but it is not!

GeoJSON

You can get the GeoJSON file content of an article directly by calling the API with a lang and an article parameter:

https://tools.wmflabs.org/wiwosm/osmjson/getGeoJSON.php?lang=de&article=Dresden

Returns GeoJSON or HTTP 404 Not Found if it does not exist.

check existence

If you want to check explicitly whether a WIWOSM object exists without fetching its content, you can use the action=check parameter:

https://tools.wmflabs.org/wiwosm/osmjson/getGeoJSON.php?lang=de&article=Dresden&action=check

Returns 1 on existing objects or 0 otherwise.
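
As a minimal sketch, the request URLs shown above can be built in Python as follows. The helper name geojson_url is an assumption for illustration; it only builds the URL and sends no request.

```python
from urllib.parse import urlencode

API_BASE = "https://tools.wmflabs.org/wiwosm/osmjson/getGeoJSON.php"


def geojson_url(lang, article, action=None):
    """Build a getGeoJSON.php URL; action may be e.g. "check"."""
    params = {"lang": lang, "article": article}
    if action is not None:
        params["action"] = action
    # urlencode percent-encodes non-ASCII article names as UTF-8.
    return API_BASE + "?" + urlencode(params)
```

For example, geojson_url("de", "Dresden", action="check") yields the check URL given above.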

manual regeneration

If you explicitly want to regenerate a single WIWOSM object before the nightly update process, you can use action=purge, but use it with care!

Source code

WIWOSM together with POIs from Wikipedia

Lists for testing

Examples by the author

At the following list you can find some links to example maps:

Further examples

Wikipedia | wikipedia=* in OSM | Result on Toolserver
Strategischer Bahndamm | Relation type=collection, 551400 | [2]
Residenzenweg | Relation type=superroute, 31636 | [3]
Rhein | Relation type=waterway, 123924 | [4]

FAQ

  • Feature xyz I added to the map isn't in Wikipedia yet

Check the log to see whether the data has been renewed since your edit, and whether there are any problems with your data.

To-Do

At the moment (April 2012) a lot of things are not yet in WIWOSM.
Help to bring more Wikipedia-Tags inside OSM.
WIWOSM in Europe (April 2012)

Creators / Contact

Events

  • 2012-07-10 WIWOSM is now running in all Wikipedia versions (not running in Internet Explorer)
  • 2012-03-21 WIWOSM starts its beta phase and goes live in German Wikipedia
  • after the OSM license change WIWOSM will go live in other Wikipedias

See also

  • Add-tags - web service to assist in adding wikipedia tags to OSM objects with the use of JOSM's RemoteControl feature
  • OSM-Wikipedia place name tool - on Toolserver
  • WikiMiniAtlas - Map widget based on OpenStreetMap data, active on many wikipedia sites; displays WIWOSM data
  • JOSM Wikipedia Plugin - A JOSM plugin which shows a list of Wikipedia articles about the area in which the user is mapping (by position or by category). It lets the mapper add the wikipedia tag with a simple double click, and it also indicates the objects already delivered by the WIWOSM server.
  • Osmose analyser - An Osmose analysis of wikipedia tags. Partial world coverage.